Getting Started

Start Building Secure Applications with Bank Statement Parser

Requirements

Install

pip install bankstatementparser

For Polars DataFrame support:

pip install bankstatementparser[polars]

Quick Start

Auto-Detect and Parse Any Format

from bankstatementparser import create_parser, detect_statement_format

fmt = detect_statement_format("transactions.ofx")
parser = create_parser("transactions.ofx", fmt)
df = parser.parse()  # pandas DataFrame
print(df.head())

This works with .xml (CAMT/PAIN.001), .csv, .ofx, .qfx, .mt940, and .sta files.
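
When a directory mixes statement files with other documents, it can help to filter on extension before calling `detect_statement_format`. The sketch below uses only the standard library; the `SUPPORTED_EXTENSIONS` set and the `has_supported_extension` helper are hypothetical names built from the list above, not part of the library's API, and the library's own detection may inspect file contents as well.

```python
from pathlib import Path

# Extensions named in this guide. This set is illustrative and is not
# the library's internal detection table.
SUPPORTED_EXTENSIONS = {".xml", ".csv", ".ofx", ".qfx", ".mt940", ".sta"}


def has_supported_extension(path: str) -> bool:
    """Return True when the file extension is one listed in this guide."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


# Keep only the files worth handing to the parser factory.
candidates = ["jan.xml", "notes.txt", "feb.MT940", "scan.pdf"]
to_parse = [p for p in candidates if has_supported_extension(p)]
print(to_parse)
```

The comparison is case-insensitive, so `feb.MT940` passes the filter.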

Parse CAMT.053

from bankstatementparser import CamtParser

parser = CamtParser("statement.xml")
transactions = parser.parse()

Parse PAIN.001

from bankstatementparser import Pain001Parser

parser = Pain001Parser("payment.xml")
payments = parser.parse()

Streaming Large Files

For files with thousands of transactions, use streaming to keep memory bounded:

from bankstatementparser import CamtParser

parser = CamtParser("large_statement.xml")
for transaction in parser.parse_streaming(redact_pii=True):
    process(transaction)  # process() stands in for your own handler; memory stays bounded
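
To illustrate why streaming keeps memory bounded, here is a minimal sketch of a handler that keeps only running totals while it consumes an iterator. The `fake_stream` generator stands in for `parser.parse_streaming()`, and the `amount` and `currency` field names are assumptions made for this example; check the records your parser actually yields before relying on them.

```python
from decimal import Decimal
from typing import Iterable, Iterator, Mapping


def fake_stream() -> Iterator[Mapping[str, str]]:
    """Stand-in for parser.parse_streaming(); field names are hypothetical."""
    yield {"amount": "100.00", "currency": "EUR"}
    yield {"amount": "-25.50", "currency": "EUR"}
    yield {"amount": "10.00", "currency": "USD"}


def totals_by_currency(transactions: Iterable[Mapping[str, str]]) -> dict:
    """Sum amounts per currency, holding only the running totals in memory."""
    totals: dict = {}
    for tx in transactions:
        currency = tx["currency"]
        # Decimal avoids binary floating-point rounding on monetary values.
        amount = Decimal(str(tx["amount"]))
        totals[currency] = totals.get(currency, Decimal("0")) + amount
    return totals


totals = totals_by_currency(fake_stream())
print(totals)
```

Because the function never collects the transactions into a list, its memory use depends on the number of currencies, not on the number of rows.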

In-Memory Parsing

Parse from bytes without disk I/O -- useful for SFTP or API workflows:

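from bankstatementparser import CamtParser

# download_from_sftp() is a placeholder for your own transport code
xml_bytes = download_from_sftp()
parser = CamtParser.from_bytes(xml_bytes, source_name="daily.xml")
transactions = parser.parse()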

Parallel File Processing

Parse multiple files concurrently:

from bankstatementparser import parse_files_parallel

results = parse_files_parallel([
    "statements/jan.xml",
    "statements/feb.xml",
    "statements/mar.xml",
])
for r in results:
    print(r.path, r.status, len(r.transactions), "rows")
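
The general pattern behind this kind of call, one result object per file with failures captured instead of raised, can be sketched with the standard library. This is not the implementation of `parse_files_parallel`: the `SketchResult` class, the `"ok"` and `"error"` status values, and the `fake_parse` function are all hypothetical, and the library may use processes where this sketch uses threads.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    path: str
    status: str  # "ok" or "error" (hypothetical values)
    rows: list = field(default_factory=list)
    error: str = ""


def parse_one(path, parse):
    """Parse a single file and turn any exception into an error result."""
    try:
        return SketchResult(path, "ok", rows=parse(path))
    except Exception as exc:
        return SketchResult(path, "error", error=str(exc))


def parse_many(paths, parse, max_workers=4):
    """Run parse_one over all paths; pool.map preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: parse_one(p, parse), paths))


def fake_parse(path):
    """Stand-in parser that fails on one file to show error capture."""
    if path.endswith("bad.xml"):
        raise ValueError("malformed XML")
    return [{"source": path}]


results = parse_many(["jan.xml", "bad.xml", "feb.xml"], fake_parse)
for r in results:
    print(r.path, r.status, len(r.rows), r.error)
```

Capturing the exception per file means one malformed statement does not abort the whole batch.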

Deduplication

Detect exact duplicates and suspected matches with confidence scores:

from bankstatementparser import CamtParser, Deduplicator

parser = CamtParser("statement.xml")
dedup = Deduplicator()
result = dedup.deduplicate(dedup.from_dataframe(parser.parse()))

print(f"Unique: {len(result.unique_transactions)}")
print(f"Exact duplicates: {len(result.exact_duplicates)}")
print(f"Suspected matches: {len(result.suspected_matches)}")
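
The difference between an exact duplicate and a suspected match can be illustrated with a small standalone sketch. It is not the algorithm `Deduplicator` uses: the field names, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as a confidence score are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def find_duplicates(transactions, threshold=0.8):
    """Split rows into unique rows, exact duplicates, and suspected matches."""
    unique, exact, suspected = [], [], []
    seen = set()
    for tx in transactions:
        key = (tx["date"], tx["amount"], tx["description"])
        if key in seen:
            exact.append(tx)  # identical on every compared field
            continue
        for kept in unique:
            same_day_and_amount = (
                kept["date"] == tx["date"] and kept["amount"] == tx["amount"]
            )
            if same_day_and_amount:
                score = SequenceMatcher(
                    None, kept["description"], tx["description"]
                ).ratio()
                if score >= threshold:
                    # Flag for review; the row is still kept as unique.
                    suspected.append((kept, tx, round(score, 2)))
                    break
        seen.add(key)
        unique.append(tx)
    return unique, exact, suspected


rows = [
    {"date": "2024-01-05", "amount": "-42.00", "description": "ACME SUPPLIES LTD"},
    {"date": "2024-01-05", "amount": "-42.00", "description": "ACME SUPPLIES LTD"},
    {"date": "2024-01-05", "amount": "-42.00", "description": "ACME SUPPLIES LTD."},
    {"date": "2024-01-06", "amount": "10.00", "description": "REFUND"},
]
unique, exact, suspected = find_duplicates(rows)
print(len(unique), len(exact), len(suspected))
```

In this sketch a suspected match is only reported, never dropped, so a reviewer decides whether the two rows describe the same payment.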

Secure ZIP Processing

Process zipped XML files with built-in security checks (zip-bomb protection, rejection of encrypted entries):

from bankstatementparser import iter_secure_xml_entries, CamtParser

for entry in iter_secure_xml_entries("statements.zip"):
    parser = CamtParser.from_bytes(entry.xml_bytes, source_name=entry.source_name)
    print(f"{entry.source_name}: {len(parser.parse())} transactions")
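
For readers curious what such checks involve, the following sketch shows one way to screen ZIP entries with the standard `zipfile` module. It is an illustration only, not the code behind `iter_secure_xml_entries`: the limits, the `UnsafeZipError` class, and the `iter_checked_xml` function are hypothetical.

```python
import io
import zipfile

MAX_ENTRY_BYTES = 50 * 1024 * 1024  # hypothetical per-entry size cap
MAX_RATIO = 100  # hypothetical compression-ratio cap


class UnsafeZipError(Exception):
    """Raised by this sketch when an entry fails a check."""


def iter_checked_xml(zip_source):
    """Yield (name, bytes) for each .xml entry that passes the checks."""
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(".xml"):
                continue
            if info.flag_bits & 0x1:  # bit 0 of the flags marks encryption
                raise UnsafeZipError(f"{info.filename}: encrypted entry")
            if info.file_size > MAX_ENTRY_BYTES:
                raise UnsafeZipError(f"{info.filename}: entry too large")
            if info.compress_size and info.file_size / info.compress_size > MAX_RATIO:
                raise UnsafeZipError(f"{info.filename}: suspicious ratio")
            # Declared sizes can be forged, so cap the actual read as well.
            with archive.open(info) as handle:
                data = handle.read(MAX_ENTRY_BYTES + 1)
            if len(data) > MAX_ENTRY_BYTES:
                raise UnsafeZipError(f"{info.filename}: entry too large")
            yield info.filename, data


# Build a small archive in memory to exercise the checks.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("a.xml", "<Document/>")
    zf.writestr("readme.txt", "ignore me")
buffer.seek(0)
entries = list(iter_checked_xml(buffer))
print(entries)
```

Checking the header fields first rejects obviously hostile archives cheaply, while the capped read guards against headers that understate the real size.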

Export

from bankstatementparser import CamtParser

parser = CamtParser("statement.xml")
parser.export_csv("output.csv")
parser.export_json("output.json")

# Polars (requires bankstatementparser[polars])
polars_df = parser.to_polars()

CLI Usage

# Parse and display
python -m bankstatementparser.cli --type camt --input statement.xml

# Export to CSV
python -m bankstatementparser.cli --type camt --input statement.xml --output transactions.csv

# Stream with PII visible
python -m bankstatementparser.cli --type camt --input statement.xml --streaming --show-pii

CLI options:

Local Development Setup

git clone https://github.com/sebastienrousseau/bankstatementparser.git
cd bankstatementparser
python3 -m venv .venv && source .venv/bin/activate
pip install poetry && poetry install --with dev

Run the test suite:

pytest

API Reference

Parser Classes

| Class | Format | Import |
| --- | --- | --- |
| CamtParser | CAMT.053 (ISO 20022) | from bankstatementparser import CamtParser |
| Pain001Parser | PAIN.001 (ISO 20022) | from bankstatementparser import Pain001Parser |
| CsvStatementParser | CSV | from bankstatementparser import CsvStatementParser |
| OfxParser | OFX | from bankstatementparser import OfxParser |
| QfxParser | QFX | from bankstatementparser import QfxParser |
| Mt940Parser | MT940 | from bankstatementparser import Mt940Parser |

Utility Functions

| Function | Purpose |
| --- | --- |
| detect_statement_format(path) | Auto-detect file format |
| create_parser(path, fmt) | Create the appropriate parser |
| parse_files_parallel(paths) | Parse multiple files concurrently |
| iter_secure_xml_entries(zip_path) | Iterate ZIP entries securely |

Data Classes

| Class | Purpose |
| --- | --- |
| Deduplicator | Detect duplicate transactions |
| DeduplicationResult | Result with unique, exact, and suspected matches |
| InputValidator | Validate file paths and formats |
| Transaction | Normalised transaction record |
| FileResult | Result from parallel parsing |
| ZipXMLSource | ZIP member wrapper |

Exceptions

| Exception | When Raised |
| --- | --- |
| ParserError | Parsing failures |
| ExportError | Export failures (CSV/JSON/Excel) |
| ValidationError | Input validation failures |
| ZipSecurityError | ZIP security check failures |