← Back to context

Comment by iterance

2 months ago

Interesting. I'd love to learn more about the problem class.

Happy to elaborate. The core problem is bank statement reconciliation - matching raw bank transactions to your accounting records.

Sounds simple until you hit the real-world mess:

1. *Ambiguous descriptions*: "CARD 1234 AMAZON" could be office supplies, inventory, or someone's personal expense on the company card. Same vendor, completely different accounting treatment.

2. *Sequential dependencies*: You need to detect transfers first (money moving between your own accounts), because those shouldn't hit expense/income at all. But transfer detection needs to see ALL transactions across ALL accounts before it can match pairs. Then pattern matching runs, but its suggestions might conflict with the transfer detection. Then VAT calculation runs, but some transactions are VAT-exempt based on what pattern matching decided.

3. *Confidence cascades*: If step 3 says "70% confident this is office supplies," step 7 needs to know that confidence when deciding whether to auto-post or flag for review. But step 5 might have found a historical pattern that bumps it to 95%. Now you're tracking confidence origins alongside confidence scores.

4. *The "almost identical" trap*: "AMAZON PRIME" and "AMAZON MARKETPLACE" need completely different treatment. But "AMZN MKTP" and "AMAZON MARKETPLACE" are the same thing. Fuzzy matching helps, but too fuzzy and you miscategorize; too strict and you miss obvious matches.

5. *Retroactive corrections*: User reviews transaction 47 and says "actually this is inventory, not supplies." Now you need to propagate that learning to similar future transactions, but also potentially re-evaluate transactions 48-200 that already processed.

The conditional branching gets gnarly because each step can short-circuit or redirect later steps based on what it discovered. A clean pipeline assumes linear data flow, but this is more like a decision tree where the branches depend on accumulated state from multiple earlier decisions.