Overview
Bank Statement Parser is the only open-source Python library that parses seven bank statement formats — including PDF via a hybrid LLM pipeline — with a unified API. Single-format libraries (mt-940, ofxparse, pycamt) each handle one format. SaaS tools (Ocrolus, Parseur) offer cloud OCR but require sending data externally and cost $49–$1,000+/month.
Open-Source Alternatives
Single-Format Libraries
Most open-source bank statement parsers handle one format only. If you need multiple formats, you must install and maintain separate libraries with different APIs, output schemas, and update cycles.
| Library | Formats | PDF Support | Output | Balance Verification | Ledger Export |
|---|---|---|---|---|---|
| Bank Statement Parser | 7 formats | Hybrid pipeline | pandas DataFrame | Golden Rule | hledger, beancount |
| mt-940 (WoLpH) | MT940 only | No | Python objects | No | No |
| ofxparse | OFX only | No | Python objects | No | No |
| pycamt | CAMT.053 only | No | Python objects | No | No |
| ofxtools | OFX v1/v2 only | No | Python objects | No | No |
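To make the maintenance cost concrete, here is a minimal sketch of the glue code a project needs when it combines single-format parsers itself. The record shapes, field names, and the `normalise` helper are hypothetical illustrations. They are not the actual output of mt-940, ofxparse, pycamt, or ofxtools, and not the API of Bank Statement Parser.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Callable, Dict


@dataclass(frozen=True)
class Transaction:
    """One common schema that every per-format adapter must produce."""
    booked_on: date
    amount: Decimal  # signed: credits positive, debits negative
    currency: str
    description: str


# Hypothetical record shapes, NOT the real output of any library named above.
def from_swift_like(rec: dict) -> Transaction:
    # This shape carries an unsigned amount plus a debit/credit mark.
    sign = Decimal("-1") if rec["dc_mark"] == "D" else Decimal("1")
    return Transaction(
        booked_on=date.fromisoformat(rec["value_date"]),
        amount=sign * Decimal(rec["amount"]),
        currency=rec["ccy"],
        description=rec["narrative"],
    )


def from_ofx_like(rec: dict) -> Transaction:
    # This shape carries a signed amount and a compact YYYYMMDD[...] timestamp.
    return Transaction(
        booked_on=datetime.strptime(rec["DTPOSTED"][:8], "%Y%m%d").date(),
        amount=Decimal(rec["TRNAMT"]),
        currency=rec["CURDEF"],
        description=rec["MEMO"],
    )


# One adapter per format: each new library adds another entry to maintain.
ADAPTERS: Dict[str, Callable[[dict], Transaction]] = {
    "swift": from_swift_like,
    "ofx": from_ofx_like,
}


def normalise(fmt: str, rec: dict) -> Transaction:
    """Convert a format-specific record into the common schema."""
    return ADAPTERS[fmt](rec)


same_payment = [
    normalise("swift", {"value_date": "2026-01-31", "dc_mark": "D",
                        "amount": "12.50", "ccy": "EUR", "narrative": "Coffee"}),
    normalise("ofx", {"DTPOSTED": "20260131", "TRNAMT": "-12.50",
                      "CURDEF": "EUR", "MEMO": "Coffee"}),
]
```

Every adapter encodes its own sign convention and date format, which is the kind of per-library detail a unified API is meant to hide.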
vs pyiso20022
pyiso20022 generates Python dataclasses from the full ISO 20022 schema catalogue. It is a general-purpose ISO 20022 toolkit for working with PACS, PAIN, CAMT, and ADMI messages.
Bank Statement Parser is purpose-built for parsing bank statements into DataFrames with production features:
| Feature | Bank Statement Parser | pyiso20022 |
|---|---|---|
| Purpose | Statement parsing + extraction + export | ISO 20022 schema toolkit |
| Output | pandas/Polars DataFrames | Python dataclasses |
| Formats | 7 (including PDF, non-ISO) | ISO 20022 only |
| PDF support | Hybrid pipeline (deterministic + LLM + vision) | No |
| Balance verification | Golden Rule + multi-currency | No |
| REST API | Built-in FastAPI | No |
| Enrichment | LLM-powered categorisation | No |
| Ledger export | hledger + beancount | No |
| Streaming | Yes (bounded memory) | No |
| PII redaction | Built-in | No |
| Deduplication | Idempotent transaction hashes | No |
| CLI | Yes | No |
Use pyiso20022 if you need to work with the full ISO 20022 message catalogue. Use Bank Statement Parser if you need to parse bank statements into structured data for analysis, reconciliation, or reporting.
SaaS Alternatives
SaaS tools like Ocrolus, Parseur, and Sensible offer bank statement parsing as a cloud service. They typically use OCR to handle scanned PDFs and support hundreds of bank-specific formats.
| Feature | Bank Statement Parser | SaaS Tools |
|---|---|---|
| Data privacy | 100% local (LLMs via Ollama) | Data sent to cloud |
| Cost | Free (Apache 2.0) | $49–$1,000+/month (as of Q1 2026) |
| Formats | 7 (structured + PDF) | Hundreds (via OCR) |
| PDF support | Yes — hybrid pipeline (deterministic + LLM + vision) | Yes (cloud OCR) |
| Balance verification | Golden Rule (automatic) | Manual / limited |
| Latency | <2 ms (structured), seconds (PDF+LLM) | 1-30 seconds |
| Throughput | 27,000+ tx/second (structured) | API rate-limited |
| REST API | Built-in FastAPI | Proprietary |
| Ledger export | hledger + beancount | No |
| Vendor lock-in | None | Yes |
| Compliance | Local processing, SBOM | Varies by provider |
LLM-Based Parsers
A growing number of tools (Inscribe, Unstract, Mozilla.ai blueprints) use large language models to parse bank statements, including scanned PDFs. When Chase redesigned their consumer statement format in late 2025, template-based parsers broke while LLM parsers adapted automatically.
Bank Statement Parser now includes its own hybrid LLM pipeline (v0.0.5+) that runs entirely locally via Ollama. It combines the best of both approaches:
- Structured formats (XML, CSV, OFX, MT940): Deterministic parsing — 100% accuracy, sub-millisecond latency, zero LLM cost.
- PDF statements: Three-path routing (deterministic table extraction → text-LLM → vision-LLM) with automatic Golden Rule verification to catch extraction errors.
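The three-path routing for PDFs can be pictured as a fallback chain that accepts the first extraction whose balances reconcile. The sketch below is an illustration only, not the library's implementation. It assumes that the Golden Rule means "opening balance plus all movements equals closing balance" and that a failed check triggers the next path; the `Extraction` type, `route` function, and stub extractors are hypothetical names.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Extraction:
    opening: Decimal
    closing: Decimal
    amounts: Tuple[Decimal, ...]  # signed: credits positive, debits negative


# An extractor returns an Extraction, or None if it cannot handle the input.
Extractor = Callable[[bytes], Optional[Extraction]]


def balances_reconcile(result: Extraction) -> bool:
    """Assumed Golden Rule: opening + sum of movements == closing."""
    return result.opening + sum(result.amounts, Decimal("0")) == result.closing


def route(pdf: bytes, paths: Sequence[Tuple[str, Extractor]]) -> Tuple[str, Extraction]:
    """Try each path in order and accept the first result that reconciles."""
    for name, extract in paths:
        result = extract(pdf)
        if result is not None and balances_reconcile(result):
            return name, result
    raise ValueError("no extraction path produced a reconciling result")


# Stand-in extractors; real ones would call a table extractor or a local LLM.
def table_path(pdf: bytes) -> Optional[Extraction]:
    return None  # pretend no table was detected


def text_llm_path(pdf: bytes) -> Optional[Extraction]:
    # Pretend the text model missed one transaction, so balances do not add up.
    return Extraction(Decimal("100.00"), Decimal("90.00"), (Decimal("-15.00"),))


def vision_llm_path(pdf: bytes) -> Optional[Extraction]:
    return Extraction(Decimal("100.00"), Decimal("90.00"),
                      (Decimal("-15.00"), Decimal("5.00")))


chosen, extraction = route(b"%PDF-", [
    ("deterministic", table_path),
    ("text-llm", text_llm_path),
    ("vision-llm", vision_llm_path),
])
```

In this run the deterministic path declines, the text path fails the balance check, and the vision path is accepted. Ordering the paths from cheapest to most expensive keeps LLM calls for the statements that need them.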
Unlike cloud-only LLM parsers, Bank Statement Parser's hybrid pipeline:
- Runs 100% locally (Ollama) — no data leaves your machine.
- Verifies every extraction with balance verification (Golden Rule).
- Supports interactive review mode for flagged discrepancies.
- Produces idempotent transaction hashes for safe incremental ingestion.
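Idempotent hashes make re-ingesting the same statement harmless: a transaction that was already stored produces the same key and is skipped. The following is a minimal sketch of that idea, not the library's hashing scheme. The chosen fields, the canonical form, and the `transaction_hash` and `ingest` names are assumptions made for illustration.

```python
import hashlib
from decimal import Decimal
from typing import Dict, Iterable


def transaction_hash(account: str, booked_on: str, amount: Decimal,
                     currency: str, description: str) -> str:
    """Derive a stable key from the transaction's own content."""
    canonical = "|".join([
        account.strip().upper(),
        booked_on,                              # ISO 8601 date string
        f"{amount:.2f}",                        # fixed precision: 5 and 5.00 hash alike
        currency.upper(),
        " ".join(description.split()).lower(),  # collapse whitespace, ignore case
    ])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(store: Dict[str, dict], rows: Iterable[dict]) -> int:
    """Insert only unseen transactions and return how many were added."""
    added = 0
    for row in rows:
        key = transaction_hash(**row)
        if key not in store:
            store[key] = row
            added += 1
    return added


statement = [
    {"account": "DE001", "booked_on": "2026-01-31", "amount": Decimal("-12.5"),
     "currency": "EUR", "description": "Coffee  Shop"},
    {"account": "DE001", "booked_on": "2026-01-31", "amount": Decimal("2500"),
     "currency": "EUR", "description": "Salary"},
]

store: Dict[str, dict] = {}
first_run = ingest(store, statement)   # both rows are new
second_run = ingest(store, statement)  # same file again: nothing is added
```

A content-only key cannot tell apart two genuinely identical transactions on the same day, so a production scheme would likely need an extra disambiguator such as a bank reference or a sequence number.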
When to choose pure SaaS LLM parsers over Bank Statement Parser: You receive statements from hundreds of banks with wildly different PDF layouts and need out-of-the-box coverage without running local infrastructure.
When to choose Bank Statement Parser: You need local processing for compliance, automatic balance verification, or ledger export, or you want zero ongoing cost.
Benchmark methodology: Performance figures for structured formats were measured on an Apple M2 with Python 3.12, using a 5,000-transaction CAMT.053 file (2.1 MB), and averaged over 100 runs. Reproduce locally: `python -m bankstatementparser.bench`. SaaS latency figures are based on published API documentation as of April 2026.
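For readers who want to time their own files, the averaging approach described above can be approximated with the standard library. This is a generic sketch, not the bundled benchmark module. The `mean_latency_ms` and `throughput_tx_per_s` helpers are hypothetical, and the workload is a stand-in for a real parse call.

```python
import statistics
import time
from typing import Callable


def mean_latency_ms(fn: Callable[[], object], runs: int = 100) -> float:
    """Call fn repeatedly and return the mean wall-clock time in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def throughput_tx_per_s(transactions: int, latency_ms: float) -> float:
    """Convert a per-file latency into transactions processed per second."""
    return transactions / (latency_ms / 1000.0)


# Stand-in workload; replace the lambda with a call that parses your file.
latency = mean_latency_ms(lambda: sum(range(10_000)), runs=100)
```

Reporting the median alongside the mean is a reasonable addition, because a few slow runs caused by other processes can skew an average.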