The Hidden Cost of “Just Using Stripe” for Fintech Infrastructure


Why this matters this week

If you’re running anything that moves money—marketplace, SaaS billing, consumer app, B2B payments—you’ve probably noticed at least one of these this quarter:

  • Card acceptance down for a specific region or issuer.
  • A “sudden” fraud wave via account takeover or card testing.
  • Compliance asking for “one small change” that implies a major refactor.
  • A new bank/PSP demanding “real-time screening” or specific KYC flows.
  • A regulator (or auditor) asking you to prove something you don’t currently log.

The pattern: teams built on generic PSP rails (Stripe/Adyen/Checkout, etc.), then layered ad‑hoc fraud, KYC, and risk rules on top. That worked at $1M–$10M GMV. At $100M+ GMV or when you expand into 2–3 new regions, these decisions start driving:

  • 20–50 bps margin erosion from chargebacks/fees/false positives (if measured against GMV, roughly $200k–$500k a year at $100M).
  • Product delays because “compliance said no” and infra can’t adapt.
  • Hard limits from partners (banks, schemes) who see your portfolio as “too risky”.

This week’s theme: what “fintech infrastructure” should mean if you actually care about reliability, cost, fraud, AML/KYC, and regulatory risk—and why “just use Stripe/whatever” quietly turns into “we’re now running an unintentional mini‑bank” unless you’re deliberate.

What’s actually changed (not the press release)

Three real shifts engineers are feeling, beneath the marketing:

  1. Regulators expect “explainable automation,” not just “we use Vendor X”

    • AML/KYC and transaction monitoring can’t be a black box anymore.
    • You’re expected to:
      • Show why you approved/blocked a transaction or user.
      • Reproduce decisions (model versions, rules, data inputs).
      • Demonstrate ongoing tuning and backtesting.
    • Saying “Vendor X’s AI did it” is not an acceptable control in audits.
  2. Card networks and banks are pushing risk downstream

    • Increased scrutiny on:
      • High‑risk MCCs and business models (crypto, gambling‑adjacent, high‑chargeback verticals).
      • Cross‑border flows and nested relationships (marketplaces, platforms, aggregators).
    • Outcome: your “simple payments integration” is now being asked to:
      • Maintain merchant hierarchies (sub-merchants, sellers).
      • Track ultimate beneficial owners (UBOs).
      • Prove you’re doing sanctions screening and transaction monitoring.
  3. Open banking / A2A payments are finally real in production, with caveats

    • In some regions, A2A (account‑to‑account) rails are:
      • Cheaper than cards.
      • Lower in fraud (push payments, Strong Customer Authentication).
    • But:
      • Dispute models are different (typically no card‑style chargeback process to fall back on).
      • UX is weird vs. cards; more steps, more bank‑side friction.
      • Your current risk engine is likely card‑native; bank account flows behave differently.
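To make "reproduce decisions" from the first shift concrete, here is a minimal sketch of a decision record that pins down the rule set version, model version, and a digest of the exact inputs. Every name in it (`DecisionRecord`, `digest_inputs`, `record_decision`, the field names, the example reason code) is hypothetical and only meant to illustrate the shape, not any vendor's or regulator's required format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One auditable risk/KYC decision. All field names are illustrative."""
    subject_id: str        # user or transaction the decision applies to
    outcome: str           # e.g. "approve", "block", "review"
    reason_codes: tuple    # why, in terms a reviewer can read
    ruleset_version: str   # which rules were live at decision time
    model_version: str     # which model (if any) produced a score
    inputs_digest: str     # hash of the exact inputs, for replay checks
    decided_at: str        # UTC timestamp, ISO 8601


def digest_inputs(inputs: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record_decision(subject_id, outcome, reason_codes, inputs,
                    ruleset_version, model_version) -> DecisionRecord:
    """Build the record; storing it durably is left to the caller."""
    return DecisionRecord(
        subject_id=subject_id,
        outcome=outcome,
        reason_codes=tuple(reason_codes),
        ruleset_version=ruleset_version,
        model_version=model_version,
        inputs_digest=digest_inputs(inputs),
        decided_at=datetime.now(timezone.utc).isoformat(),
    )


if __name__ == "__main__":
    rec = record_decision(
        subject_id="user_123",
        outcome="review",
        reason_codes=["SANCTIONS_FUZZY_MATCH"],
        inputs={"name": "Jane Doe", "country": "DE"},
        ruleset_version="rules-2024-05",
        model_version="none",
    )
    print(json.dumps(asdict(rec), indent=2))
```

The digest only proves which inputs were used; to actually replay a decision you would also need to keep the raw inputs (or a pointer to them) alongside the record.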

The net: infra decisions that felt “back office” 2–3 years ago now directly shape product roadmap, conversion, and margin.

How it works (simple mental model)

Strip away jargon. A modern fintech stack that touches payments, fraud, and compliance is basically four loops stitched together:

  1. Onboarding loop (KYC/KYB)
    Decide who is allowed into your ecosystem.

    • Inputs:
      • User‑provided data (name, address, DOB, docs).
      • External data (KYC providers, corporate registries, sanctions lists, device data).
    • Core decisions:
      • Approve, reject, or refer for manual review.
      • Risk tier (limits, features, monitoring intensity).
    • Key characteristics:
      • Latency-sensitive but not microsecond-level.
      • Strong need for traceability (what data, what rules, what outcome).
  2. Transaction loop (payments risk + routing)
    Decide how money moves and which rail you use.

    • Inputs:
      • Transaction details (amount, currency, merchant, MCC).
      • Behavioral patterns (user history, velocity, device fingerprint).
      • Rail capability (cards vs A2A vs wallets; regional constraints).
    • Core decisions:
      • Approve/decline transaction.
      • Which provider/rail to route to (PSP, acquirer, open banking, local scheme).
    • Key characteristics:
      • Low latency (typically <150 ms budget end‑to‑end for risk + routing).
      • Tight feedback loop from authorisation outcomes and fraud events.
  3. Monitoring loop (fraud, AML, compliance)
    Decide what to investigate and what to report over time.

    • Inputs:
      • Transaction stream (including failed/declined).
      • Derived signals (graph relationships, clusters, anomaly scores).
      • External lists (sanctions, PEP, adverse media).
    • Core decisions:
      • Trigger alerts/cases (suspicious patterns).
      • File regulatory reports (SAR/STR equivalents).
      • Adjust limits or freeze accounts.
    • Key characteristics:
      • Mix of near‑real‑time and batch.
      • Human-in-the-loop workflows are mandatory.
  4. Reconciliation & ledger loop (financial correctness)
    Decide how you prove the money trail is accurate.

    • Inputs:
      • PSP/bank statements.
      • Internal ledger events.
      • Chargebacks, refunds, reversals, fees.
    • Core decisions:
      • Detect breaks/drift between systems.
      • Attribute cost (fraud loss vs fees vs ops errors).
    • Key characteristics:
      • Often neglected until auditors show up.
      • Backbone for understanding true unit economics.
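As an illustration of "detect breaks/drift between systems" in the fourth loop, a reconciliation pass can start as a plain comparison of two keyed amount maps. This is a sketch under simplifying assumptions: one amount per transaction reference, matching currencies, and a hypothetical `find_breaks` helper with made-up break labels. Real statements need handling for partial captures, fees netted by the PSP, and timing differences.

```python
from decimal import Decimal


def find_breaks(ledger: dict, statement: dict) -> list:
    """Compare internal ledger amounts with PSP/bank statement amounts.

    Both arguments map a transaction reference to a Decimal amount.
    Returns (reference, kind, ledger_amount, statement_amount) tuples,
    ordered by reference.
    """
    breaks = []
    for ref in sorted(set(ledger) | set(statement)):
        ours = ledger.get(ref)
        theirs = statement.get(ref)
        if ours is None:
            breaks.append((ref, "missing_in_ledger", None, theirs))
        elif theirs is None:
            breaks.append((ref, "missing_in_statement", ours, None))
        elif ours != theirs:
            breaks.append((ref, "amount_mismatch", ours, theirs))
    return breaks


if __name__ == "__main__":
    ledger = {
        "tx_1": Decimal("10.00"),
        "tx_2": Decimal("25.00"),
        "tx_3": Decimal("7.50"),
    }
    statement = {
        "tx_1": Decimal("10.00"),
        "tx_2": Decimal("24.70"),
        "tx_4": Decimal("3.00"),
    }
    for item in find_breaks(ledger, statement):
        print(item)
```

With the sample data, `tx_2` is reported as an amount mismatch, `tx_3` as missing from the statement, and `tx_4` as missing from the ledger. Using `Decimal` rather than floats keeps equality checks on money exact.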

If you’re above $50–100M annual volume and don’t have these loops explicitly modeled (in code and org structure), you’re probably solving them implicitly through scattered services, dashboards, and “that one analyst’s spreadsheet”.

Where teams get burned (failure modes + anti-patterns)

Some recurring disaster patterns from real teams:

  1. “Gateway as risk engine”

    Pattern:

    • Relying entirely on PSP‑provided risk features and rules as if they were a complete fraud and AML solution.

    Failure modes:

    • You can’t:
      • Express your business-specific risk appetite.
      • Combine signals across multiple PSPs/rails.
      • Adapt when a region becomes a fraud hotspot.
    • Regulators/banks view you as outsourcing core risk management, which doesn’t fly past a certain scale.
  2. Monolithic “compliance service” everyone hates

    Pattern:

    • One central service that does KYC, KYB, sanctions, monitoring, case management, and policy enforcement.
    • Every change request from compliance becomes a multi-team project.

    Failure modes:

    • High coupling: small logic changes require cross‑team coordination and long QA cycles.
    • Divergent needs (onboarding vs. transaction monitoring) fight over the same schema and release cadence.
    • Product experiments get blocked because “the compliance box” can’t be changed safely.
  3. “Just ship it; we’ll add AML monitoring later”

    Pattern:

    • Launch product on a PSP; do minimal checks; assume bank partner or PSP will escalate issues.

    Failure modes:

    • When volume or risk profile crosses a threshold, your bank/PSP:
      • Imposes volume caps.
      • Demands bespoke monitoring and reporting retroactively.
    • You then have to:
      • Reconstruct historical data you didn’t log properly.
      • Implement an alerting and case system under time pressure.
    • This is one of the most common triggers for “emergency infra rebuild”.
  4. KYC/KYB spaghetti

    Pattern:

    • Multiple vendor integrations (docs, liveness, corporate registry) sprinkled directly into product flows.

    Failure modes:

    • Any vendor change becomes a UI/backend rewrite.
    • Hard to answer:
      • “What exactly did we verify for this user and when?”
      • “Which provider did we use, and why did the user pass?”
    • Audit trails and reason codes are scattered across microservices.
  5. “Reporting DB as ledger”

    Pattern:

    • Using whatever the PSP panel/extracts provide as “source of truth”.

    Failure modes:

    • No internal ledger that tracks:
      • Customer balances.
      • Fees, interchange, FX spread.
      • Reversals/chargebacks timing.
    • You can’t:
      • Reconcile inconsistencies.
      • Attribute fraud losses accurately.
      • Prove financial correctness during due diligence or licensing.
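For the ledger point, a rough sketch of the smallest thing that counts as an internal ledger: append-only, double-entry, with every transaction forced to balance. The `Ledger` class, the account names, and the sign convention (positive = debit, negative = credit) are assumptions for illustration, not a recommended chart of accounts.

```python
from collections import defaultdict
from decimal import Decimal


class Ledger:
    """Append-only double-entry ledger. Positive = debit, negative = credit."""

    def __init__(self):
        self.entries = []  # list of (txn_id, account, signed Decimal amount)

    def post(self, txn_id: str, lines: list) -> None:
        """Post one transaction; lines are (account, amount) pairs summing to zero."""
        total = sum((amount for _, amount in lines), Decimal("0"))
        if total != 0:
            raise ValueError(f"unbalanced transaction {txn_id}: off by {total}")
        self.entries.extend((txn_id, account, amount) for account, amount in lines)

    def balances(self) -> dict:
        """Current balance per account, derived only from posted entries."""
        totals = defaultdict(Decimal)
        for _, account, amount in self.entries:
            totals[account] += amount
        return dict(totals)


if __name__ == "__main__":
    D = Decimal
    ledger = Ledger()
    # A 100.00 card payment: PSP owes us 97.10, 2.90 is fees, we owe the seller 100.00
    ledger.post("pay_1", [
        ("psp_receivable", D("97.10")),
        ("psp_fees", D("2.90")),
        ("seller_payable:s_42", D("-100.00")),
    ])
    # Later chargeback: the PSP claws back 100.00 and we book it as a loss
    ledger.post("cb_1", [
        ("chargeback_losses", D("100.00")),
        ("psp_receivable", D("-100.00")),
    ])
    for account, balance in sorted(ledger.balances().items()):
        print(account, balance)
```

Because fees and chargeback losses land in separate accounts, attributing cost (fraud loss vs fees vs ops errors) becomes a query over balances rather than a spreadsheet exercise.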

Practical playbook (what to do in the next 7 days)

This is aimed at teams that already process real money and are feeling scaling pain.

1) Map your four loops in one page

Create a single architecture doc that explicitly answers for each loop:

  • Onboarding:

    • Which systems call external KYC / KYB / sanctions providers?
    • Where is the final “approved/rejected” decision stored?
    • Can you list all data fields and docs used in that decision?
  • Transaction risk + routing:

    • Where are your risk rules and models implemented?
    • What is the latency budget for risk in your payment path?
    • Do you have a routing abstraction or is PSP baked into business logic?
  • Monitoring:

    • What generates alerts (rules vs. ML vs. PSP webhooks)?
    • Where do alerts go (JIRA, custom case system, email, nothing)?
    • Who closes the loop (ops team, compliance, no one)?
  • Ledger:

    • Do you have an internal ledger or rely on PSP exports?
    • How do you reconcile daily with banks/PSPs?
    • Can you reconstruct the lifecycle of one transaction across retries, partial captures, refunds, and chargebacks?

If you can’t answer these clearly, that’s not a documentation issue; it’s a design smell.

2) Decouple “risk brain” from “payment pipes” (at least conceptually)

Even if you don’t build a full in‑house risk engine yet, define a contract like:

  • Input:
    • User ID, merchant/seller ID, payment instrument, IP, device, amount, currency, context (onboarding/payment/payout).
  • Output:
    • Decision: allow / block / step-up / manual review.
    • Reason codes and features used.
    • TTL for the decision (how long it’s valid).

Implement a thin service/facade that all payment flows call into. Internally this might still call PSP features, but you’ve made:

  • PSP replaceable behind that interface.
  • Risk logic observable and testable (you can log inputs/outputs, replay decisions).
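The contract above could be sketched roughly like this. All names (`RiskRequest`, `RiskResponse`, `RiskFacade`, the example rule and its threshold) are hypothetical, the fixed 300-second TTL is a placeholder, and the severity ordering (block > manual review > step-up > allow) is an assumption you would want to agree with compliance.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    STEP_UP = "step_up"
    MANUAL_REVIEW = "manual_review"


@dataclass(frozen=True)
class RiskRequest:
    user_id: str
    merchant_id: str
    instrument: str     # e.g. a card token or bank account reference
    ip: str
    device_id: str
    amount_minor: int   # amount in minor units (e.g. cents)
    currency: str
    context: str        # "onboarding" | "payment" | "payout"


@dataclass(frozen=True)
class RiskResponse:
    decision: Decision
    reason_codes: tuple
    ttl_seconds: int


# A rule returns (Decision, reason_code), or None when it has no opinion.
Rule = Callable[[RiskRequest], Optional[tuple]]

# Assumed ordering, least to most severe.
_SEVERITY = [Decision.ALLOW, Decision.STEP_UP, Decision.MANUAL_REVIEW, Decision.BLOCK]


class RiskFacade:
    """Single entry point for payment flows; rules or PSP calls live behind it."""

    def __init__(self, rules, audit_log=None):
        self.rules = list(rules)
        self.audit_log = audit_log if audit_log is not None else []

    def evaluate(self, request: RiskRequest) -> RiskResponse:
        decision, reasons = Decision.ALLOW, []
        for rule in self.rules:
            hit = rule(request)
            if hit is None:
                continue
            rule_decision, reason = hit
            reasons.append(reason)
            # Keep the most severe decision seen so far
            if _SEVERITY.index(rule_decision) > _SEVERITY.index(decision):
                decision = rule_decision
        response = RiskResponse(decision, tuple(reasons), ttl_seconds=300)
        self.audit_log.append((request, response))  # inputs + outputs, for replay
        return response


def large_payout(req: RiskRequest):
    """Example rule: send payouts above an arbitrary threshold to review."""
    if req.context == "payout" and req.amount_minor > 500_000:
        return (Decision.MANUAL_REVIEW, "PAYOUT_OVER_LIMIT")
    return None
```

A rule that wraps a PSP's own risk signal would plug in the same way as `large_payout`, which is what keeps the PSP replaceable: callers only ever see `RiskRequest` and `RiskResponse`.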

3) Stand up a minimum viable case management workflow

You don’t need a fancy regtech platform to start:

  • Create a basic “case” entity:
    • Related user/merchant/
