Bank Statement Parser is an open-source Python library that parses bank statements from six formats (CAMT.053, PAIN.001, CSV, OFX, QFX, MT940) into structured pandas DataFrames. All processing runs locally with zero network calls, output is deterministic, and PII is redacted automatically.
Get Started in Seconds
pip install bankstatementparser
from bankstatementparser import create_parser, detect_statement_format
fmt = detect_statement_format("statement.xml")
parser = create_parser("statement.xml", fmt)
df = parser.parse() # pandas DataFrame, ready to use
One Library, Six Formats
Parse CAMT.053, PAIN.001, CSV, OFX, QFX, and MT940 into structured pandas DataFrames with a single, unified API. No need to install separate packages for each format.
| Feature | Bank Statement Parser | Single-format OSS (mt940, ofxparse) | SaaS (Ocrolus, Parseur) |
|---|---|---|---|
| Formats supported | 6, unified API | 1 each | Many (via OCR) |
| Data privacy | 100% local, zero network calls | 100% local | Data sent externally |
| Cost | Free, Apache 2.0 | Free | $49-$1,000+/mo |
| PII redaction | Built-in, on by default | No | Varies |
| Streaming | Bounded memory | No | N/A |
| ZIP security | Built-in hardening | No | N/A |
| Deduplication | Built-in with confidence scores | No | Some |
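To make the "PII redaction" row concrete, here is a minimal sketch of one common masking style: keep the IBAN country code and last four characters, and keep only the first letter of each name part. This is not the library's implementation; `mask_iban` and `mask_name` are hypothetical helpers written for illustration only.

```python
def mask_iban(iban: str, visible_tail: int = 4) -> str:
    """Mask an IBAN, keeping the country code and the last few characters."""
    compact = "".join(iban.split()).upper()  # drop spaces, normalise case
    if len(compact) <= 2 + visible_tail:
        # Too short to mask meaningfully, so hide everything.
        return "*" * len(compact)
    hidden = len(compact) - 2 - visible_tail
    return compact[:2] + "*" * hidden + compact[-visible_tail:]


def mask_name(name: str) -> str:
    """Keep the first letter of each word and mask the rest."""
    return " ".join(word[0] + "*" * (len(word) - 1) for word in name.split())


print(mask_iban("DE89 3704 0044 0532 0130 00"))  # DE****************3000
print(mask_name("Jane Doe"))                     # J*** D**
```

A real redactor would also need to cover addresses and free-text remittance fields, which this sketch does not attempt.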
Built for the ISO 20022 Migration
SWIFT has set firm deadlines: all financial institutions must receive CAMT.053 by November 2027, and MT940/MT942/MT950 will be fully retired by November 2028. Bank Statement Parser handles both legacy MT940 and modern ISO 20022 formats (CAMT.053, PAIN.001) in a single API, so your parsing pipeline works during the transition and beyond.
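A pipeline that handles legacy and ISO 20022 files side by side could be sketched as below. The helper `parse_mixed_batch` is hypothetical and not part of the library; it takes the detection and parser-factory functions as arguments, on the assumption that they behave like `detect_statement_format` and `create_parser` in the quick start above.

```python
from pathlib import Path
from typing import Any, Callable, Iterable


def parse_mixed_batch(
    paths: Iterable[str],
    detect: Callable[[str], Any],
    make_parser: Callable[[str, Any], Any],
) -> list[tuple[str, Any, Any]]:
    """Parse files of any supported format and return (file name, format, result)."""
    results = []
    # Sort the paths so the processing order does not depend on the caller.
    for path in sorted(str(p) for p in paths):
        fmt = detect(path)                      # e.g. MT940 or CAMT.053
        frame = make_parser(path, fmt).parse()  # same call for every format
        results.append((Path(path).name, fmt, frame))
    return results
```

With the library installed, the call would presumably look like `parse_mixed_batch(files, detect_statement_format, create_parser)`. Passing the two functions in keeps the sketch independent of any particular library version.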
Performance
- 27,000+ transactions/second for CAMT.053 parsing
- 52,000+ transactions/second for PAIN.001 parsing
- < 2 ms time to first result
- Constant memory from 1K to 50K+ transactions via streaming
- 467 tests with 100% branch coverage across Python 3.9 to 3.14
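Near-constant memory over large XML statements is usually achieved with incremental parsing. The sketch below shows the general technique using the standard library's `xml.etree.ElementTree.iterparse`; it is not the library's own code. It assumes CAMT.053 entries appear as `Ntry` elements with `Amt` (carrying a `Ccy` attribute) and `CdtDbtInd` as direct children, and it matches tags by local name to sidestep namespace versions.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator


def _local(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_entries(source: BinaryIO) -> Iterator[dict]:
    """Yield one dict per <Ntry> element as soon as it has been read."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if _local(elem.tag) != "Ntry":
            continue
        record = {}
        for child in elem:
            name = _local(child.tag)
            if name == "Amt":
                record["amount"] = child.text
                record["currency"] = child.get("Ccy")
            elif name == "CdtDbtInd":
                record["direction"] = child.text
        yield record
        # Release the entry's children and text; the parent keeps only
        # an emptied <Ntry> element, so memory grows very slowly.
        elem.clear()


sample = b"""<Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02">
  <BkToCstmrStmt><Stmt>
    <Ntry><Amt Ccy="EUR">100.00</Amt><CdtDbtInd>CRDT</CdtDbtInd></Ntry>
    <Ntry><Amt Ccy="EUR">25.50</Amt><CdtDbtInd>DBIT</CdtDbtInd></Ntry>
  </Stmt></BkToCstmrStmt>
</Document>"""

entries = list(iter_entries(io.BytesIO(sample)))
```

Because `iter_entries` is a generator, a caller can process or write out each record before the next one is read, which is what keeps peak memory roughly independent of the number of transactions.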
Why Bank Statement Parser?
- Format Auto-Detection: `detect_statement_format()` identifies files automatically and `create_parser()` returns the right parser.
- Privacy First: PII redaction is on by default. Sensitive fields (names, IBANs, addresses) are masked in CLI output. Opt in to unmasked output with `--show-pii` when needed.
- Production Ready: Secure ZIP ingestion (bomb protection, encrypted entry rejection), input validation, and path traversal prevention.
- Flexible Output: Export to CSV, JSON, Excel, or convert to Polars DataFrames.
- Parallel Processing: Parse multiple files concurrently with `parse_files_parallel()`.
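The ZIP protections listed above (bomb protection, encrypted entry rejection, path traversal prevention) can be illustrated with the standard library's `zipfile` module. This is a sketch of the general checks, not the library's actual hardening code; `check_zip` and both limits are hypothetical values chosen for illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath

MAX_TOTAL_BYTES = 100 * 1024 * 1024  # hypothetical cap on total uncompressed size
MAX_RATIO = 100                      # hypothetical cap on per-entry compression ratio


def check_zip(data: bytes) -> list[str]:
    """Return member names if the archive passes all checks, else raise ValueError."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        total = 0
        names = []
        for info in archive.infolist():
            # Bit 0 of the general-purpose flags marks an encrypted entry.
            if info.flag_bits & 0x1:
                raise ValueError(f"encrypted entry rejected: {info.filename}")
            # Reject absolute paths and any '..' component (path traversal).
            path = PurePosixPath(info.filename)
            if path.is_absolute() or ".." in path.parts:
                raise ValueError(f"unsafe path rejected: {info.filename}")
            # Sizes come from the archive headers, so nothing is extracted yet.
            total += info.file_size
            if total > MAX_TOTAL_BYTES:
                raise ValueError("archive expands beyond the size limit")
            if info.compress_size and info.file_size / info.compress_size > MAX_RATIO:
                raise ValueError(f"suspicious compression ratio: {info.filename}")
            names.append(info.filename)
        return names
```

Header sizes can be forged, so a stricter implementation would also count bytes while extracting and stop once the limit is reached.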
Built for Production
Bank Statement Parser is designed for treasury teams, fintech developers, and compliance officers processing sensitive financial data. The library is used in MT940-to-CAMT migration pipelines, automated reconciliation systems, and regulatory audit workflows across financial institutions.
- SHA-256 hash-locked dependencies with CycloneDX SBOM for every release
- Deterministic output — identical input produces byte-identical results, every run
- Apache 2.0 licensed — use freely in commercial and internal systems
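The determinism claim is straightforward to check in your own environment by hashing the serialized output of repeated runs. The sketch below is an assumption-laden illustration: `output_digest` and `is_deterministic` are hypothetical helpers, and `produce` stands in for whatever callable turns a statement into bytes.

```python
import hashlib
from typing import Callable


def output_digest(produce: Callable[[], bytes]) -> str:
    """Run the producer once and return the SHA-256 hex digest of its output."""
    return hashlib.sha256(produce()).hexdigest()


def is_deterministic(produce: Callable[[], bytes], runs: int = 3) -> bool:
    """True if every run yields byte-identical output."""
    digests = {output_digest(produce) for _ in range(runs)}
    return len(digests) == 1
```

Since the parser returns a pandas DataFrame, `produce` could plausibly be something like `lambda: create_parser(path, fmt).parse().to_csv(index=False).encode()`, where `path` and `fmt` are obtained as in the quick start.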