The Hidden Cost of “Just Adding Payments”: What Changes After $100M TPV

Why this matters this week

There’s a recurring pattern in fintech and “payments-adjacent” products right now:

  • Teams bolt on card processing, ACH, or payouts to boost revenue.
  • Metrics look good until total payment volume (TPV) crosses into 9–10 figures.
  • Then fraud, chargebacks, disputes, AML flags, and regulatory questions start to dominate roadmaps and margins.

What changed wasn’t your product. What changed was your risk surface and who’s paying attention to you:

  • Fraud rings notice you.
  • Banks and regulators notice you.
  • Card networks and scheme monitors notice you.
  • Your CFO notices your net take rate quietly getting eroded by losses, reserves, and compliance overhead.

If you’re anywhere near:
– $100M+ annual TPV, or
– multiple geographies, or
– “handling money for others” (wallets, marketplaces, platforms)

…you are now in the infrastructure business: risk, fraud, KYC/AML, compliance, and ledger correctness are first-class concerns, not “features”.

This post is about what actually changes under the hood when you cross that threshold, and what you can do about it in the next week.

What’s actually changed (not the press release)

Three things fundamentally change as you scale fintech infrastructure: who you must answer to, how wrong you can afford to be, and what now counts as “material”.

1. Your new stakeholders

At small scale, you “answer to” your users and maybe a processor. Past ~$100M TPV:

  • Banks / sponsor institutions

    • Want to know: your controls, chargeback rates, fraud losses, portfolio risk.
    • They start asking for: policy docs, audit trails, model governance, sanctions screening logs.
  • Card networks / schemes

    • Care about: fraud-to-sales ratio, chargeback thresholds, compliance with rules on MCCs, cross-border, surcharging, etc.
  • Regulators

    • Even if you’re not directly licensed, they look at your bank partners and ask about their fintech portfolio.
    • You become “a thing” in regulatory reviews once volumes and complaint counts hit certain levels.
  • Internal finance & audit

    • Care about: revenue vs. net margin after fraud losses, write-offs, reserves, and regulatory capital consumption.
    • Reconciliation breaks become audit findings, not “known issues”.

2. Latency expectations vs. correctness

In early days, you can favor latency and UX over strict controls:

  • “Approve more, review later.”
  • “We’ll manually fix reconciliation.”
  • “We’ll backfill KYB improvements later.”

At scale, the cost of wrong decisions increases:

  • False negatives (letting bad actors through) now show up as:

    • Chargeback fees, write-offs
    • Network monitoring issues
    • Regulator inquiries (e.g., sanctions misses)
  • False positives (blocking good users) now show up as:

    • Merchant churn, user complaints
    • Internally: “risk team is blocking growth” wars

You can’t keep pretending that a PM plus a spreadsheet is a risk system.

3. Materiality thresholds

Below $10M TPV, a $50k fraud incident is a painful retro.

At $1B TPV:
– $50k is noise.
– A systemic flaw causing 5–10 bps of loss is existential if your expected margin is 30–50 bps.

Fintech infrastructure is a game of basis points:
– A “small” modeling mistake or routing bug can silently cost millions per year.
– A “tiny” KYC/AML gap can trigger regulator action if it’s systemic.
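
To make the basis-point arithmetic concrete, here is a minimal sketch using only the figures quoted above. The helper name `bps_to_dollars` is an assumption for illustration.

```python
def bps_to_dollars(tpv: float, bps: float) -> float:
    """Convert basis points of volume into dollars (1 bp = 1/10,000)."""
    return tpv * bps / 10_000


TPV = 1_000_000_000  # $1B annual TPV, as in the example above

loss_low = bps_to_dollars(TPV, 5)       # systemic loss, low end
loss_high = bps_to_dollars(TPV, 10)     # systemic loss, high end
margin_low = bps_to_dollars(TPV, 30)    # expected margin, low end
margin_high = bps_to_dollars(TPV, 50)   # expected margin, high end

print(f"Systemic loss:   ${loss_low:,.0f} to ${loss_high:,.0f} per year")
print(f"Expected margin: ${margin_low:,.0f} to ${margin_high:,.0f} per year")
# Best case: smallest loss against largest margin. Worst case: the reverse.
print(f"Share of margin lost: {loss_low / margin_high:.0%} to {loss_high / margin_low:.0%}")
```

At $1B TPV the 5–10 bps flaw works out to $500k–$1M a year, which is between a tenth and a third of a 30–50 bps margin. The same $50k incident that hurt at $10M TPV is 0.5 bps here.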

How it works (simple mental model)

Use this mental model to reason about your fintech stack:

4 planes of fintech infrastructure:

  1. Transaction plane – How money moves.

    • Card, ACH, wires, RTP, A2A/open banking payments, wallets.
    • You care about: scheme rules, cutoffs, settlement timing, failure modes.
  2. Identity & counterparty plane – Who you’re dealing with.

    • KYC (individuals), KYB (businesses), sanctions (OFAC, PEPs, watchlists).
    • Documents, verifications, ownership structures, geolocation, device fingerprints.
  3. Risk & decisioning plane – What you let happen.

    • Real-time: authorization, 3DS, step-up auth, velocity limits, device risk.
    • Near-real-time: transaction monitoring, behavioral models, rules engine.
    • Batch/offline: periodic reviews, limit adjustments, model retraining.
  4. Governance & record plane – What you can prove later.

    • Ledger (double-entry, immutable history).
    • Reconciliation across processors, banks, and internal systems.
    • Policy docs, approval workflows, audit logs, model governance.

Those planes intersect on every payment, payout, or account change. Any weak plane becomes your failure mode.

A concrete mental loop for each transaction:

  1. Who is this? (identity plane)
  2. What are they trying to do? (transaction plane)
  3. Should we allow it, under what limits, and with what friction? (risk plane)
  4. Can we prove what happened and why, in 2 years, to an auditor? (governance plane)

If any of these questions is answered informally (“we sort of know”, “logs exist somewhere”), you’re likely below the bar for a serious fintech operation.
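
One way to answer the four questions formally is to make every decision produce a single structured record. The sketch below is illustrative only: the field names, the `decide` function, the step-up threshold, and the `rules-v1` version string are all assumptions, not a prescribed design.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Decision:
    # 1. Who is this? (identity plane)
    party_id: str
    kyc_status: str
    # 2. What are they trying to do? (transaction plane)
    rail: str
    amount_cents: int
    # 3. Should we allow it, and with what friction? (risk plane)
    outcome: str          # "allow", "step_up", or "deny"
    reasons: tuple
    rules_version: str
    # 4. Can we prove it later? (governance plane)
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


STEP_UP_THRESHOLD_CENTS = 500_000  # hypothetical limit, not a recommendation


def decide(party_id, kyc_status, rail, amount_cents, rules_version="rules-v1"):
    """Toy policy: deny unverified parties, add friction to large amounts."""
    if kyc_status != "verified":
        outcome, reasons = "deny", ("kyc_not_verified",)
    elif amount_cents > STEP_UP_THRESHOLD_CENTS:
        outcome, reasons = "step_up", ("amount_over_threshold",)
    else:
        outcome, reasons = "allow", ()
    return Decision(party_id, kyc_status, rail, amount_cents,
                    outcome, reasons, rules_version)


def audit_line(decision):
    """Serialize the full decision so it can be written to an append-only log."""
    return json.dumps(asdict(decision), sort_keys=True)


print(audit_line(decide("user_123", "verified", "card", 750_000)))
```

The point of the record is that the inputs, the outcome, the reasons, and the rules version travel together, so the answer to question 4 does not depend on reconstructing state from scattered logs.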

Where teams get burned (failure modes + anti-patterns)

1. Treating KYC/AML as a checkbox integration

Anti-pattern:
– “We integrated a KYC provider; we’re compliant.”

How it burns you:
– Sanctions/PEP list updates are not applied promptly, so screening runs against stale lists.
– No process to review alerts; they just queue up.
– No documented risk-based approach (e.g., how you treat high-risk countries, industries).

Symptoms:
– Bank asks for your AML program; you send them a vendor’s marketing PDF.
– Audit finds zero SAR/STR filings despite high-risk volumes.

2. Overfitting fraud logic to early, clean data

Common story:
– Early user base is organic, low-risk.
– You ship a simple rules engine: “Allow if device_seen_before and ip_country == user_country”.
– Growth team starts performance marketing; mix shifts to riskier geos and channels.
– Fraud losses spike; rules churn weekly; ops team drowns in queues.

Failure mode: you never built the data foundation.
– No canonical event stream for user actions and payments.
– No feature store shared across rules & models.
– No clear labels for fraud vs disputes vs user error.
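
As a rough sketch of what that foundation could look like at its smallest: one canonical event type, one feature function that both rules and models call, and an explicit label set. Every name here (`PaymentEvent`, `OutcomeLabel`, `features`, `naive_rule`, `fraud_rate`) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OutcomeLabel(Enum):
    """Explicit labels so fraud, disputes, and user error are not lumped together."""
    CONFIRMED_FRAUD = "confirmed_fraud"
    MERCHANT_DISPUTE = "merchant_dispute"
    USER_ERROR = "user_error"
    LEGITIMATE = "legitimate"


@dataclass(frozen=True)
class PaymentEvent:
    """One canonical shape for every payment event, regardless of source."""
    event_id: str
    user_id: str
    device_id: str
    ip_country: str
    user_country: str
    amount_cents: int
    label: Optional[OutcomeLabel] = None  # filled in once the outcome is known


def features(event: PaymentEvent, seen_devices: set) -> dict:
    """Single feature definition shared by rules and models."""
    return {
        "device_seen_before": event.device_id in seen_devices,
        "country_match": event.ip_country == event.user_country,
    }


def naive_rule(f: dict) -> bool:
    """The early rule from the story above, expressed over shared features."""
    return f["device_seen_before"] and f["country_match"]


def fraud_rate(events) -> float:
    """Share of labeled events that are confirmed fraud (disputes excluded)."""
    labeled = [e for e in events if e.label is not None]
    if not labeled:
        return 0.0
    fraud = sum(1 for e in labeled if e.label is OutcomeLabel.CONFIRMED_FRAUD)
    return fraud / len(labeled)
```

With shared features and labels, you can replay the naive rule against labeled history and measure what it misses before the traffic mix shifts, instead of discovering it through losses.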

3. No system of record for money

Anti-pattern:
– The payment processor’s dashboard is the “source of truth”.
– Internal DBs track “available balance” or “credits” with custom logic.
– Reconciliation is a monthly heroic effort in spreadsheets.

How it burns you:
– Users see inconsistent balances vs. actual custodial accounts.
– You over- or under-pay merchants due to settlement vs. authorization confusion.
– A bug in one service silently mis-posts ledger entries for weeks.

What you actually need is a proper ledger:
– Double-entry.
– Immutable, append-only history.
– Explicit states: pending, captured, settled, reversed, refunded, disputed.
– Strict ownership: no service writes ad-hoc balance fields.
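
A minimal in-memory sketch of those ledger properties follows. It is an illustration under stated assumptions, not a production design: the class names, account names, and the sign convention (positive for debit, negative for credit) are all made up for the example, and a real ledger would sit on a durable store.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PENDING = "pending"
    CAPTURED = "captured"
    SETTLED = "settled"
    REVERSED = "reversed"
    REFUNDED = "refunded"
    DISPUTED = "disputed"


@dataclass(frozen=True)
class Posting:
    account: str
    amount_cents: int  # assumed convention: positive = debit, negative = credit


@dataclass(frozen=True)
class Entry:
    entry_id: str
    payment_id: str
    state: State
    postings: tuple


class Ledger:
    """Append-only: entries are never updated or deleted, only added."""

    def __init__(self):
        self._entries = []

    def post(self, entry: Entry) -> None:
        # Double-entry invariant: debits and credits must net to zero.
        if sum(p.amount_cents for p in entry.postings) != 0:
            raise ValueError(f"unbalanced entry {entry.entry_id}")
        # Reject duplicate IDs so retries cannot double-post.
        if any(e.entry_id == entry.entry_id for e in self._entries):
            raise ValueError(f"duplicate entry {entry.entry_id}")
        self._entries.append(entry)

    def balance(self, account: str) -> int:
        # Balances are derived from history, never stored as a mutable field.
        return sum(p.amount_cents for e in self._entries
                   for p in e.postings if p.account == account)


ledger = Ledger()
# Capture a $100.00 payment: $97.00 owed to the merchant, $3.00 in fees.
ledger.post(Entry("e-1", "pay_1", State.CAPTURED, (
    Posting("processor_receivable", 10_000),
    Posting("merchant_payable", -9_700),
    Posting("fee_revenue", -300),
)))
# A refund is a new entry that offsets the old one, not an edit.
ledger.post(Entry("e-2", "pay_1", State.REFUNDED, (
    Posting("processor_receivable", -10_000),
    Posting("merchant_payable", 9_700),
    Posting("fee_revenue", 300),
)))
print(ledger.balance("merchant_payable"))
```

Because `balance` is computed from entries, a mis-posting bug shows up as an entry you can find and offset, rather than as a balance field that silently drifted.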

4. On-call without risk/compliance muscle

Anti-pattern:
– Incident runbooks only cover uptime and latency.
– Risk/compliance incidents (e.g., processor blocking your MID, bank freezing an account) are treated as “business issues”.

Failure mode: nobody gets paged for:
– A fraud rule gone wrong that blocks all new signups in a country.
– A regulator inquiry with tight SLAs for data delivery.
– A scheme monitoring alert (e.g., excessive chargebacks) that needs immediate action.

These are S1 incidents for a fintech system; they must live in the same on-call world as outages.

5. One-size-fits-all geo expansion

Pattern:
– Product-market fit in Country A.
– Decide to launch in Countries B, C, D using same onboarding, flows, contracts, and thresholds.
– Local rules on KYC, data residency, licensing, FX, and consumer disputes differ substantially.

Outcome:
– The “copy first, adapt later” approach leaves you mildly non-compliant in 3–4 places at once.
– Clean-up involves retroactive contract corrections, disclosures, and technical rewrites.

Practical playbook (what to do in the next 7 days)

Assume:
– You’re at or approaching ~$100M TPV/year.
– You handle other people’s money (marketplace, payouts, wallets, B2B payments, or embedded finance).

1. Map your 4 planes in half a day

Create a single-page architecture doc:

  • Transaction plane

    • All payment methods, providers, and bank partners.
    • For each: auth → capture → settlement → refund → chargeback pathways.
  • Identity plane

    • How individuals and businesses are verified.
    • What data is stored where; who can update it; retention periods.
  • Risk plane

    • Where real-time decisions are made; what data they see.
    • Current rules/models; who owns them; deploy/revert process.
  • Governance plane

    • Ledger: what’s the system of record?
    • Reconciliation: who does it, how often, against what?
    • Policy docs: where are KYC, AML, fraud, and complaints-handling procedures?

Make gaps explicit, not hand-waved.

2. Compute your real unit economics (in basis points)

Within a day, have finance and data collaborate to get, per payment method:

  • Gross fees (interchange, MDR, markups).
  • Scheme and processing fees.
  • Fraud and chargeback losses (net of recoveries).
  • Operational cost driver proxies:
    • Manual review hours.
    • KYC/AML check volume and cost.
    • Dispute handling time.

Target: understand how much you actually keep per $100 of volume, by segment (geo, channel, merchant type).

Typical discovery:
– A “high-growth segment” is net-negative after losses and operational overhead.
– A particular risk rule is suppressing revenue far more than it saves in fraud.
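
The calculation itself is small once the inputs exist. The sketch below assumes a hypothetical `SegmentEconomics` record per segment. The two segments and all their dollar figures are invented purely to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class SegmentEconomics:
    name: str
    volume: float                       # payment volume in dollars
    gross_fees: float                   # interchange, MDR, markups
    scheme_and_processing_fees: float
    fraud_and_chargeback_losses: float  # net of recoveries
    ops_cost: float                     # manual review, KYC/AML checks, disputes

    def net(self) -> float:
        return (self.gross_fees
                - self.scheme_and_processing_fees
                - self.fraud_and_chargeback_losses
                - self.ops_cost)

    def net_bps(self) -> float:
        """Net take in basis points of volume."""
        return self.net() * 10_000 / self.volume

    def net_per_100(self) -> float:
        """Dollars kept per $100 of volume."""
        return self.net() * 100 / self.volume


# Invented numbers for illustration only.
segments = [
    SegmentEconomics("domestic_organic", 10_000_000, 290_000, 200_000, 20_000, 30_000),
    SegmentEconomics("new_geo_paid", 5_000_000, 145_000, 100_000, 40_000, 25_000),
]

for s in segments:
    print(f"{s.name}: {s.net_bps():.1f} bps, ${s.net_per_100():.2f} per $100")
```

In this made-up example both segments earn the same gross rate, yet the second is net-negative once losses and operational cost are subtracted, which is the kind of result the exercise tends to surface.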

3. Establish a minimal risk + compliance change process

In the next week:

  • Define owners
    • One DRI for:
      • Fraud/risk operations.
      • KYC/AML & sanctions.
      • Ledger & reconciliation.
  • Change control
    • Any new payment method, country, or product that affects money flows must:
      • Get a quick risk review.
      • Document how each of the 4 planes is impacted.
      • Define “canary” metrics to monitor for first 4 weeks.
  • Incident classification
    • Add these to your incident taxonomy:
      • R
