Don’t Build a Fintech House of Cards: Rethinking “Just Use Stripe + Plaid” Architectures

Why this matters this week
A pattern keeps repeating in fintech and payments teams right now:
- You start with “just use Stripe + Plaid” (or equivalent) to validate a product.
- Product works, volumes grow, regulators start to care, finance starts asking hard questions.
- You discover:
- Your fraud losses silently exploded during a growth push.
- Your chargeback ratios are flirting with card network thresholds.
- Your KYC/AML vendor is auto-approving bad actors because you whitelisted a flow to hit a launch date.
- Your risk & compliance team is running the system out of spreadsheets.
By the time you notice, you’ve baked your first-wave vendor decisions into everything: event schemas, ID formats, reconciliation logic, and even your customer support workflows.
This week’s post is about that moment: when a “quick integration” fintech stack needs to become actual financial infrastructure with real fraud controls, AML/KYC rigor, and auditable risk/compliance behavior.
Not a vendor comparison. A mechanism-level look at how to avoid painting yourself into a compliance + payments corner.
What’s actually changed (not the press release)
Three concrete shifts are hitting production teams:
- Regulators now expect “show, don’t tell” from day one
- Early-stage ≠ exempt. Banking-as-a-Service and embedded finance have dragged even small products into bank-like scrutiny.
- You’re expected to have:
- A documented control framework (risk, fraud, AML, sanctions).
- Explainable decisions (why was this user approved, why was this transaction blocked).
- Traceable audit trails of changes to rules, models, limits.
- “Our vendor handles that” is no longer accepted without you showing:
- What signals you send them.
- How you react to their outputs.
- How you monitor their miss (false-negative) and false-positive rates.
- Fraud is industrializing faster than vendor rule sets
- Fraud rings:
- Share playbooks across neobanks, wallets, and BNPL providers.
- Exploit timing gaps between KYC/AML checks, payment authorization, and settlement flows.
- Off-the-shelf fraud engines:
- Work well for card-not-present ecommerce baselines.
- Often underperform for P2P, wallets, instant payouts, crypto on/off-ramps, and high-risk verticals.
- The net: you cannot rely on one generic “fraud score” per transaction. You need your own composition of signals.
- Open banking and instant rails changed your blast radius
- Faster Payments, SEPA Instant, RTP, Pix, etc. remove the useful friction of multi-day settlement.
- Once money moves, recall is hard or impossible.
- Open banking APIs:
- Enable sweeping, account verification, and data access.
- Also expose new fraud surfaces: account takeover, synthetic identities linked to real accounts, “clean” mule accounts.
What changed is not “we have APIs now” — it’s that latency to regret (time between decision and irreversible loss / regulatory breach) is collapsing.
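One way to back up the vendor-monitoring expectation above is to join each vendor decision with what you later learned about it, then track two numbers over time. A minimal sketch, assuming a hypothetical `VendorOutcome` record (the field names and sample data are invented for illustration, not taken from any vendor's API):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VendorOutcome:
    """One vendor decision joined with what you later learned (hypothetical schema)."""
    vendor_flagged: bool   # the vendor said "risky"
    confirmed_bad: bool    # later confirmed as fraud, e.g. via chargeback or investigation


def vendor_error_rates(outcomes: list[VendorOutcome]) -> dict[str, float]:
    """Return the miss rate and false-positive rate for a batch of labeled outcomes."""
    bad = [o for o in outcomes if o.confirmed_bad]
    good = [o for o in outcomes if not o.confirmed_bad]
    missed = sum(1 for o in bad if not o.vendor_flagged)
    false_alarms = sum(1 for o in good if o.vendor_flagged)
    return {
        # Share of confirmed-bad cases the vendor let through.
        "miss_rate": missed / len(bad) if bad else 0.0,
        # Share of legitimate cases the vendor flagged anyway.
        "false_positive_rate": false_alarms / len(good) if good else 0.0,
    }


# Made-up sample: 2 confirmed-bad cases (1 missed), 4 legitimate cases (1 flagged).
sample = [
    VendorOutcome(vendor_flagged=True, confirmed_bad=True),
    VendorOutcome(vendor_flagged=False, confirmed_bad=True),
    VendorOutcome(vendor_flagged=True, confirmed_bad=False),
    VendorOutcome(vendor_flagged=False, confirmed_bad=False),
    VendorOutcome(vendor_flagged=False, confirmed_bad=False),
    VendorOutcome(vendor_flagged=False, confirmed_bad=False),
]
rates = vendor_error_rates(sample)
```

The hard part in practice is the join, not the arithmetic: labels such as chargebacks arrive weeks after the decision, so these rates are always measured on a lagging window.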
How it works (simple mental model)
Think of fintech infrastructure as four planes, each with its own contracts and failure modes:
- Money Movement Plane (payments, pay-ins, pay-outs, settlement)
- Card processors, bank transfer rails, wallets, ledgers.
- Key contracts:
- Idempotent payment operations.
- Clear state transitions (authorized → captured → settled → refunded).
- Reconciliation primitives (external IDs, cut-off times, fee visibility).
- Identity & Trust Plane (KYC, KYB, device fingerprinting, behavioral profiles)
- Identity verification, doc checks, sanctions/PEP screening, business verification.
- Key contracts:
- Stable user identifiers across vendors.
- Deterministic KYC state transitions: unverified → pending → verified → restricted → banned.
- Revocability: can you re-assess and reverse verification when new intel arrives?
- Risk & Fraud Plane (rules, models, velocity checks, case management)
- Rule engines, ML scoring, chargeback management, alert queues.
- Key contracts:
- Feature input schema that you own (not vendor-specific).
- Explanation: human-readable rule hits, model reasons.
- Retrospective replay: the ability to re-score(historical_events, new_model).
- Compliance & Oversight Plane (AML, SAR/STR filing, audit, policies)
- Transaction monitoring, AML typologies, escalations, regulatory reporting.
- Key contracts:
- Immutable event log with consistent semantics.
- Case lifecycle: alert → investigation → disposition → (optional) report.
- Evidence bundles: you can reconstruct what you knew when.
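To make the Money Movement contracts concrete, here is a rough sketch of "idempotent operations" plus "clear state transitions" in one place. The transition table mirrors the lifecycle named above; the `PaymentLedger` class, its method names, and the in-memory storage are assumptions made for illustration, not any processor's API:

```python
from __future__ import annotations

# Allowed transitions, taken from the lifecycle named in the text.
ALLOWED = {
    "authorized": {"captured"},
    "captured": {"settled"},
    "settled": {"refunded"},
    "refunded": set(),
}


class PaymentLedger:
    """In-memory stand-in for a payments table (illustration only)."""

    def __init__(self) -> None:
        self.state: dict[str, str] = {}             # payment_id -> current state
        self.seen: dict[str, tuple[str, str]] = {}  # idempotency_key -> (payment_id, new_state)

    def authorize(self, payment_id: str) -> None:
        self.state.setdefault(payment_id, "authorized")

    def transition(self, payment_id: str, new_state: str, idempotency_key: str) -> str:
        # A retry with the same key returns the earlier result instead of acting twice.
        if idempotency_key in self.seen:
            if self.seen[idempotency_key] != (payment_id, new_state):
                raise ValueError("idempotency key reused for a different operation")
            return new_state
        current = self.state[payment_id]
        if new_state not in ALLOWED[current]:
            raise ValueError(f"illegal transition {current} -> {new_state}")
        self.state[payment_id] = new_state
        self.seen[idempotency_key] = (payment_id, new_state)
        return new_state


ledger = PaymentLedger()
ledger.authorize("pay_1")
ledger.transition("pay_1", "captured", idempotency_key="k1")
ledger.transition("pay_1", "captured", idempotency_key="k1")  # network retry: no double capture
```

In a real system the key lookup and the state update would need to happen in one database transaction; the sketch skips that to keep the contract visible.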
Healthy architectures make these four planes loosely coupled but data-consistent:
- A payment event references:
- A user/business identity (KYC plane).
- Risk decisions and features at time of decision.
- Downstream compliance outcomes (e.g., “included in SAR #1234”).
If you only remember “card charge succeeded for customer_id=abc”, you’re dead when auditors, Visa, or your sponsor bank asks, “Show me the timeline of this entity’s risk profile and payment activity.”
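Storing the features as they were at decision time is also what makes retrospective replay possible. A minimal sketch, assuming a hypothetical `DecisionRecord` shape and a made-up candidate rule (the feature names and thresholds are invented for the example):

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

Features = dict[str, float]
Scorer = Callable[[Features], str]  # returns "allow" or "deny"


@dataclass(frozen=True)
class DecisionRecord:
    payment_id: str
    features: Features   # snapshot of inputs as they were at decision time
    decision: str        # what was actually decided back then


def rescore(history: list[DecisionRecord], new_scorer: Scorer) -> list[tuple[str, str, str]]:
    """Return (payment_id, old_decision, new_decision) for each record whose outcome changes."""
    changed = []
    for rec in history:
        new_decision = new_scorer(rec.features)
        if new_decision != rec.decision:
            changed.append((rec.payment_id, rec.decision, new_decision))
    return changed


def stricter_rule(f: Features) -> str:
    # Hypothetical candidate rule: deny large payments from accounts younger than 7 days.
    return "deny" if f["amount"] > 500 and f["account_age_days"] < 7 else "allow"


history = [
    DecisionRecord("pay_1", {"amount": 900.0, "account_age_days": 2.0}, "allow"),
    DecisionRecord("pay_2", {"amount": 900.0, "account_age_days": 400.0}, "allow"),
    DecisionRecord("pay_3", {"amount": 20.0, "account_age_days": 1.0}, "allow"),
]
diff = rescore(history, stricter_rule)  # only pay_1 flips from allow to deny
```

If you only stored the final decision, this replay would have to recompute features from today's data, which answers a different question than "what would the new rule have done back then?"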
Where teams get burned (failure modes + anti-patterns)
1. Vendor-shaped data model
Anti-pattern:
– Your internal entities are essentially:
– stripe_customer_id, provider_transaction_id, vendor_kyc_status.
Impact:
– Migrating or multi-homing providers is excruciating.
– You can’t cross-check vendors (e.g., second-opinion KYC, duplicate-device detection across sources).
– Instrumentation is brittle; you lose signal when a provider changes response schemas.
Fix:
– Materialize your own canonical entities:
– internal_user_id, payment_intent_id, kyc_verification_id, device_id.
– Treat provider IDs as attributes, not primary keys.
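As a sketch of what "provider IDs as attributes" can look like in code: the `User` class, the `usr_` prefix, and every vendor ID below are made up for illustration.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class User:
    # Canonical key: generated by you, never by a vendor.
    internal_user_id: str = field(default_factory=lambda: f"usr_{uuid.uuid4().hex}")
    # Vendor IDs hang off the entity as attributes, keyed by provider name.
    provider_refs: dict[str, str] = field(default_factory=dict)


user = User()
user.provider_refs["stripe"] = "cus_example123"       # made-up ID
user.provider_refs["kyc_vendor_a"] = "applicant-789"  # made-up ID

# Adding a second processor later is one more attribute; the primary key is untouched.
user.provider_refs["second_processor"] = "cust-456"   # made-up ID
```

The design choice is that nothing downstream (events, ledgers, cases) ever joins on a vendor ID, so swapping or multi-homing a provider does not ripple through foreign keys.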
2. Treating fraud as a “block or allow” black box
Anti-pattern:
– You send a payment to a fraud vendor, get back:
– score = 0.82 and decision = allow.
– That decision directly drives your capture() or payout() with no intermediate layer.
Impact:
– When fraud rises, you have no levers except “turn aggression up to 11” or “switch vendor”.
– You can’t A/B test rules or isolate thin segments (e.g., high-risk geos, specific payment methods).
– Compliance can’t map fraud typologies to AML escalations.
Fix:
– Introduce your own Risk Decision Service:
– Inputs: standardized features (user age, KYC level, velocity counts, device fingerprint, BIN data, etc.).
– External vendor outputs are features, not controllers.
– The service produces:
– A decision (allow, challenge, hold, deny).
– A decision graph (which rules/models contributed).
– A risk level that informs limits and monitoring intensity.
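A Risk Decision Service along these lines can start very small. The sketch below is one possible shape, not a reference design: the rule names, thresholds, and feature keys are all hypothetical, and the vendor score enters as just another feature.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RiskDecision:
    decision: str                                       # allow | challenge | hold | deny
    risk_level: str                                     # low | medium | high
    rule_hits: list[str] = field(default_factory=list)  # the "decision graph", flattened


# Severity ordering so the strictest rule outcome wins.
SEVERITY = {"allow": 0, "challenge": 1, "hold": 2, "deny": 3}

# Hypothetical rules: (name, predicate over features, outcome if it fires).
RULES = [
    ("vendor_score_high", lambda f: f["vendor_score"] >= 0.9, "deny"),
    ("vendor_score_elevated", lambda f: f["vendor_score"] >= 0.7, "challenge"),
    ("new_account_large_amount", lambda f: f["account_age_days"] < 7 and f["amount"] > 500, "hold"),
    ("velocity_24h", lambda f: f["payments_last_24h"] > 10, "challenge"),
]


def decide(features: dict[str, float]) -> RiskDecision:
    decision, hits = "allow", []
    for name, predicate, outcome in RULES:
        if predicate(features):
            hits.append(name)
            if SEVERITY[outcome] > SEVERITY[decision]:
                decision = outcome
    risk_level = {0: "low", 1: "medium", 2: "high", 3: "high"}[SEVERITY[decision]]
    return RiskDecision(decision, risk_level, hits)


# The vendor's 0.82 from the anti-pattern above, now combined with your own signals.
result = decide({"vendor_score": 0.82, "account_age_days": 3, "amount": 900, "payments_last_24h": 2})
```

Here the vendor alone would have allowed the payment, while the combined rules produce a hold with two named rule hits that an analyst can read.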
3. Over-reliance on synchronous checks
Anti-pattern:
– All checks happen in a blocking request:
– Payments, KYC, AML, fraud scoring all in the critical path.
– To hit latency SLAs, you start softening checks:
– “Skip full doc verification under $X.”
– “Turn off additional AML ping to avoid timeouts.”
Impact:
– Attackers exploit the window between first action and later batch reviews.
– When regulators show up, logs show:
– “User fully onboarded” before screening results actually returned.
– You degrade UX globally to handle a small risk segment.
Fix:
– Design for staged, async trust-building:
– Stage 1: Light KYC, low limits, extra monitoring, delayed high-value payouts.
– Stage 2: Full KYC/AML, stronger device binding, higher limits.
– Architect flows so high-friction or long-latency checks don’t always block the happy path, but:
– Limit exposure (no instant withdrawals).
– Tag accounts as “provisional” with clear downgrade/lock behavior.
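The staged model can be sketched as a stage field on the account plus a limits table consulted on every payout. The stage names, the limits, and the function names below are assumptions for illustration; real numbers come from your risk policy.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical per-stage limits.
STAGE_LIMITS = {
    "provisional": {"max_payout": 200, "instant_withdrawal": False},
    "full": {"max_payout": 10_000, "instant_withdrawal": True},
    "locked": {"max_payout": 0, "instant_withdrawal": False},
}


@dataclass
class Account:
    account_id: str
    stage: str = "provisional"   # Stage 1: light KYC passed, full checks still running


def on_screening_result(account: Account, passed: bool) -> None:
    """Async callback: full KYC/AML finished after the user was already let in."""
    account.stage = "full" if passed else "locked"


def can_payout(account: Account, amount: int, instant: bool) -> bool:
    limits = STAGE_LIMITS[account.stage]
    if instant and not limits["instant_withdrawal"]:
        return False
    return 0 < amount <= limits["max_payout"]


acct = Account("acc_1")
can_payout(acct, 100, instant=False)   # True: small, delayed payout is fine while provisional
can_payout(acct, 100, instant=True)    # False: no instant withdrawals before full checks
on_screening_result(acct, passed=True)
can_payout(acct, 500, instant=True)    # True: limits lifted once screening returned clean
```

The point of the explicit `locked` stage is that a failed late screening has a defined outcome in code, rather than an account that was "fully onboarded" before results came back.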
4. No stable “compliance view” of the world
Anti-pattern:
– Compliance analysts work in:
– Spreadsheets.
– Ad-hoc SQL queries on production replicas.
– Vendor UIs with inconsistent entity IDs.
Impact:
– You cannot generate:
– Consistent suspicious activity narratives.
– Portfolio-level AML risk views.
– Reproducible decisions (why a case was closed).
– During audits, you scramble to join 5 systems to answer basic questions.
Fix:
– Build a compliance warehouse / data mart:
– Event-level, append-only log of:
– Onboarding events, screenings, KYC statuses.
– Payment events, chargebacks, disputes.
– Risk and fraud decisions.
– Compliance-facing models:
– Party (person/business + linkages).
– Account/Instrument.
– Case and Alert.
– This doesn’t have to be Big Data; it has to be correct, explainable, and queryable.
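"Correct, explainable, and queryable" can start as an append-only log with an as-of query, which is enough to rebuild an evidence bundle. A minimal sketch: the `Event` fields, the `EventLog` class, and the sample dates are all invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    recorded_at: datetime   # when you learned it, not when it happened
    party_id: str
    kind: str               # e.g. "kyc_status", "payment", "screening"
    payload: str


class EventLog:
    """Append-only: there is deliberately no update or delete method."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def evidence_as_of(self, party_id: str, cutoff: datetime) -> list[Event]:
        """Everything known about a party at `cutoff`, in the order it was recorded."""
        known = [e for e in self._events if e.party_id == party_id and e.recorded_at <= cutoff]
        return sorted(known, key=lambda e: e.recorded_at)


def ts(day: int) -> datetime:
    # Made-up timestamps in January 2024, UTC.
    return datetime(2024, 1, day, tzinfo=timezone.utc)


log = EventLog()
log.append(Event(ts(1), "party_1", "kyc_status", "verified"))
log.append(Event(ts(3), "party_1", "payment", "payout 900"))
log.append(Event(ts(9), "party_1", "screening", "sanctions hit"))

# What did we know when the payout was reviewed on day 5? The day-9 hit is excluded.
bundle = log.evidence_as_of("party_1", cutoff=ts(5))
```

The same query answers the auditor's question from the other direction: a case closed on day 5 was closed without the day-9 screening hit, and the log can show that.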
Practical playbook (what to do in the next 7 days)
Assuming you already have a product moving money, here’s a 1-week, engineering-heavy checklist.
Day 1–2: Map your actual flows, not your architecture diagram
- Trace 3–5 representative journeys:
- New user signup → first deposit → first withdrawal.
- High-risk scenario (international, large ticket).
- Card charge + chargeback.
- For each step, write down:
- Which systems are called (internal services + vendors).
- What identifiers represent the user, account, transaction.
- What risk/compliance signals are generated (or not).
Deliverables:
– A simple sequence diagram per journey.
– A list of missing signals and ID mismatches.
Day 3: Introduce (or tighten) canonical IDs
- Pick canonical internal identifiers: user_id, account_id, payment_intent_id, external_instrument_id, device_id.
- Add a mapping layer:
- For each vendor: vendor → internal mapping tables or translation functions.
- Start emitting these IDs in:
- Event logs.
- Analytics.
- Case management tools.
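The mapping layer can begin as a small translation function per entity type. The sketch below keeps the mapping in memory; `IdMapper`, the ID format, and the vendor IDs are hypothetical, and a real version would sit on a database table with a unique constraint on (vendor, vendor_id).

```python
from __future__ import annotations

import itertools


class IdMapper:
    """Translate (vendor, vendor_id) pairs into canonical internal IDs (in-memory sketch)."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._to_internal: dict[tuple[str, str], str] = {}

    def link(self, vendor: str, vendor_id: str, internal_id: str) -> None:
        """Attach another vendor's ID to an internal ID that already exists."""
        self._to_internal[(vendor, vendor_id)] = internal_id

    def resolve(self, vendor: str, vendor_id: str) -> str:
        """Return the internal ID for a vendor ID, minting one the first time it is seen."""
        key = (vendor, vendor_id)
        if key not in self._to_internal:
            self._to_internal[key] = f"{self._prefix}_{next(self._counter)}"
        return self._to_internal[key]


users = IdMapper("user")
uid = users.resolve("stripe", "cus_example123")    # made-up vendor ID
users.link("kyc_vendor_a", "applicant-789", uid)   # same person at a second vendor

# Events carry the canonical ID; the vendor ID rides along as an attribute.
event = {
    "type": "kyc.completed",
    "user_id": users.resolve("kyc_vendor_a", "applicant-789"),
    "vendor": "kyc_vendor_a",
    "vendor_ref": "applicant-789",
}
```

Once webhooks from every vendor pass through `resolve` before anything is logged, event logs, analytics, and case tools all see the same `user_id` for the same person.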
Goal
