Bank Statement Parser

Parse CAMT.053, PAIN.001, CSV, OFX, QFX, MT940, and PDF into pandas DataFrames. 27K+ tx/s, hybrid PDF pipeline, REST API, zero network calls.

pip install bankstatementparser

Bank Statement Parser is an open-source Python library that parses bank statements from seven formats (CAMT.053, PAIN.001, CSV, OFX, QFX, MT940, and PDF) into structured pandas DataFrames. All processing runs locally, with deterministic output, automatic PII redaction, and an optional hybrid PDF pipeline that routes through local LLMs when needed.

Get Started in Seconds

pip install bankstatementparser

from bankstatementparser import create_parser, detect_statement_format

fmt = detect_statement_format("statement.xml")
parser = create_parser("statement.xml", fmt)
df = parser.parse()  # pandas DataFrame, ready to use

# Parse PDFs with the hybrid pipeline (v0.0.5+)
from bankstatementparser.hybrid import smart_ingest

result = smart_ingest("statement.pdf")
print(result.source_method)         # "deterministic" | "llm" | "vision"
print(result.verification.status)   # VERIFIED | DISCREPANCY | FAILED

One Library, Seven Formats

Parse CAMT.053, PAIN.001, CSV, OFX, QFX, MT940, and PDF into structured pandas DataFrames with a single, unified API. No need to install separate packages for each format.

| Feature | Bank Statement Parser | Single-format OSS (mt940, ofxparse) | SaaS (Ocrolus, Parseur) |
|---|---|---|---|
| Formats supported | 7, unified API | 1 each | Many (via OCR) |
| PDF support | Hybrid pipeline (deterministic + LLM + vision) | No | Yes (cloud OCR) |
| Data privacy | 100% local (LLMs run locally via Ollama) | 100% local | Data sent externally |
| Cost | Free, Apache 2.0 | Free | $49-$1,000+/mo |
| Balance verification | Golden Rule (opening + credits − debits = closing) | No | Varies |
| PII redaction | Built-in, on by default | No | Varies |
| Streaming | Bounded memory | No | N/A |
| REST API | Built-in FastAPI microservice | No | Yes |
| Deduplication | Idempotent transaction hashes | No | Some |
| Ledger export | hledger + beancount | No | No |
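
To illustrate the deduplication row, here is a minimal sketch of how an idempotent transaction hash could work. It is not the library's actual hashing scheme: the function names `transaction_hash` and `deduplicate`, the choice of fields, and the normalization rules are all assumptions made for illustration.

```python
import hashlib
from decimal import Decimal


def transaction_hash(account, booking_date, amount, currency, description):
    """Build a stable SHA-256 key from normalized transaction fields.

    Hypothetical helper: field choice and normalization are assumptions.
    """
    parts = [
        account.strip().upper(),
        booking_date,  # ISO date string such as "2024-03-01"
        str(Decimal(amount).quantize(Decimal("0.01"))),  # "12.5" -> "12.50"
        currency.strip().upper(),
        " ".join(description.split()).lower(),  # collapse whitespace, ignore case
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def deduplicate(transactions):
    """Keep the first occurrence of each transaction, preserving order."""
    seen = set()
    unique = []
    for tx in transactions:
        key = transaction_hash(**tx)
        if key not in seen:
            seen.add(key)
            unique.append(tx)
    return unique


rows = [
    {"account": "ACC-001", "booking_date": "2024-03-01", "amount": "12.50",
     "currency": "EUR", "description": "Coffee shop"},
    # Same transaction re-imported with cosmetic differences
    {"account": "acc-001 ", "booking_date": "2024-03-01", "amount": "12.5",
     "currency": "eur", "description": "coffee  shop"},
    {"account": "ACC-001", "booking_date": "2024-03-02", "amount": "40.00",
     "currency": "EUR", "description": "Groceries"},
]
unique_rows = deduplicate(rows)
```

Because the key depends only on normalized content, importing the same statement twice yields the same hashes, so repeated ingestion does not create duplicate rows.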

Hybrid PDF Pipeline

Bank Statement Parser v0.0.5+ includes a three-path hybrid pipeline for PDF bank statements: deterministic extraction, a local LLM path, and a vision path.

Every extraction is verified with the Golden Rule: opening balance + credits − debits == closing balance.
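
The Golden Rule check itself is simple arithmetic and can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the function name `golden_rule_holds` and its `tolerance` parameter are assumptions, and `Decimal` is used here to avoid floating-point rounding on currency amounts.

```python
from decimal import Decimal


def golden_rule_holds(opening, credits, debits, closing, tolerance=Decimal("0.00")):
    """Return True when opening + sum(credits) - sum(debits) matches closing.

    Hypothetical helper: name and signature are illustrative only.
    """
    expected = opening + sum(credits, Decimal("0")) - sum(debits, Decimal("0"))
    return abs(expected - closing) <= tolerance


opening = Decimal("1000.00")
credits = [Decimal("250.00"), Decimal("49.99")]
debits = [Decimal("120.50")]
closing = Decimal("1179.49")

is_verified = golden_rule_holds(opening, credits, debits, closing)
```

A statement whose extracted transactions do not reconcile with its stated balances fails this check, which signals that a row was likely missed or misread during extraction.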

Built for the ISO 20022 Migration

SWIFT has set firm deadlines: all financial institutions must receive CAMT.053 by November 2027, and MT940/MT942/MT950 will be fully retired by November 2028. Bank Statement Parser handles both legacy MT940 and modern ISO 20022 formats (CAMT.053, PAIN.001) in a single API, so your parsing pipeline works during the transition and beyond.

Performance

Why Bank Statement Parser?

Built for Production

Bank Statement Parser is designed for treasury teams, fintech developers, and compliance officers processing sensitive financial data. The library is used in MT940-to-CAMT migration pipelines, automated reconciliation systems, PDF statement ingestion, and regulatory audit workflows across financial institutions.
