Security

How We Protect Your Financial Data

TL;DR: Bank Statement Parser processes all data locally, redacts PII by default, hardens XML parsing against XXE attacks, runs LLMs locally via Ollama, and ships with SHA-256 hash-locked dependencies and a CycloneDX SBOM.

Security by Design

Bank Statement Parser is built for processing sensitive financial data. Every design decision prioritizes security, privacy, and auditability.

Zero Cloud Dependencies

All processing happens locally within your runtime. The deterministic parsers never make network calls. The hybrid PDF pipeline uses Ollama for local LLM inference, so no data is sent to cloud APIs. XML parsers are explicitly configured with no_network=True, resolve_entities=False, and load_dtd=False to prevent any outbound access.

PII Redaction

Personally identifiable information (names, IBANs, postal addresses) is automatically redacted in CLI output and in streaming mode. This is on by default.
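To illustrate the idea, here is a minimal sketch of IBAN masking. It is not the project's actual redaction code: the pattern, the function name redact_ibans, and the choice to keep the last four characters are all assumptions made for this example.

```python
import re

# Hypothetical pattern (an assumption, not the project's own rule):
# two-letter country code, two check digits, then 11-30 alphanumerics.
IBAN_PATTERN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def redact_ibans(text: str) -> str:
    """Replace each IBAN-like token with a mask that keeps its last four characters."""
    return IBAN_PATTERN.sub(lambda match: "****" + match.group(0)[-4:], text)


redacted = redact_ibans("Refund to DE89370400440532013000 today")
print(redacted)
```

Keeping a short suffix is one common compromise: the value can still be matched by a human reviewer, but the full account number never reaches logs or terminals.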

XML Security (XXE Protection)

All XML parsing uses lxml with hardened settings:
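The three parser options named earlier on this page (no_network, resolve_entities, load_dtd) are real lxml.etree.XMLParser keyword arguments. The sketch below shows how they might be combined. The helper name make_hardened_parser and the sample document are assumptions for illustration, and the project may set further options not shown here.

```python
from lxml import etree


def make_hardened_parser() -> etree.XMLParser:
    """Build a parser that refuses network access, entity expansion, and DTD loading."""
    return etree.XMLParser(
        no_network=True,         # never fetch remote resources
        resolve_entities=False,  # leave entity references unexpanded
        load_dtd=False,          # do not load external DTDs
    )


# Hypothetical input that declares an entity and then references it.
SAMPLE = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE data [<!ENTITY xxe "EXPANDED">]>'
    b"<data>&xxe;</data>"
)

root = etree.fromstring(SAMPLE, parser=make_hardened_parser())
# With resolve_entities=False the entity value is not substituted into the text.
print(repr(root.text))
```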

ZIP Archive Security

iter_secure_xml_entries() validates every ZIP member before extraction:
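The individual checks are not reproduced on this page, so the following is only a sketch of the kind of validation such a function could perform. The function name iter_checked_xml_members, the size cap, and the compression-ratio limit are hypothetical values chosen for this example and are not taken from the project.

```python
import io
import zipfile
from typing import Iterator, Tuple

# Hypothetical limits (assumptions for this sketch).
MAX_MEMBER_BYTES = 50 * 1024 * 1024
MAX_COMPRESSION_RATIO = 100


def iter_checked_xml_members(archive: zipfile.ZipFile) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, content) for .xml members that pass basic safety checks."""
    for info in archive.infolist():
        name = info.filename
        if info.is_dir() or not name.lower().endswith(".xml"):
            continue  # skip directories and non-XML members
        parts = name.replace("\\", "/").split("/")
        if name.startswith(("/", "\\")) or ".." in parts:
            raise ValueError(f"unsafe member path: {name!r}")
        if info.file_size > MAX_MEMBER_BYTES:
            raise ValueError(f"member too large: {name!r}")
        if info.compress_size and info.file_size / info.compress_size > MAX_COMPRESSION_RATIO:
            raise ValueError(f"suspicious compression ratio: {name!r}")
        yield name, archive.read(info)


# Build a small archive in memory and read it back through the checks.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("statement.xml", "<Document/>")
    zf.writestr("readme.txt", "ignored")

with zipfile.ZipFile(buffer) as zf:
    entries = list(iter_checked_xml_members(zf))
print([name for name, _ in entries])
```

Checking the declared sizes before reading any bytes means an oversized or highly compressed member is rejected without being decompressed.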

Path Traversal Prevention

Input validation blocks dangerous file paths:
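The specific rules are not listed on this page. A common way to implement this kind of check, shown here as a sketch with a hypothetical helper named resolve_inside, is to resolve the requested path and confirm that it still lies under an allowed base directory.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the resolved path, or raise ValueError if it escapes base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes {base}: {user_path!r}") from None
    return candidate


# "statements_root" is a hypothetical directory name used only for this demo.
safe = resolve_inside("statements_root", "2024/january.xml")
print(safe.name)
```

Resolving before comparing matters: a plain string prefix check would accept inputs such as "2024/../../secret.xml", whereas the resolved path makes the escape visible.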

Balance Verification (Golden Rule)

Every PDF extraction is verified against the equation: opening balance + credits − debits == closing balance. Results are marked VERIFIED, DISCREPANCY, or FAILED. Discrepancies can be reviewed interactively with --type review.
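The equation can be sketched as follows. The three status strings come from the text above, but the function name check_balance and the rule for when a result is FAILED rather than DISCREPANCY are assumptions made for this example. Here FAILED means a balance could not be extracted at all.

```python
from decimal import Decimal
from typing import Iterable, Optional


def check_balance(
    opening: Optional[Decimal],
    credits: Iterable[Decimal],
    debits: Iterable[Decimal],
    closing: Optional[Decimal],
) -> str:
    """Apply opening + credits - debits == closing and return a status string."""
    if opening is None or closing is None:
        return "FAILED"  # assumption: a missing balance cannot be verified
    expected = opening + sum(credits, Decimal("0")) - sum(debits, Decimal("0"))
    return "VERIFIED" if expected == closing else "DISCREPANCY"


status = check_balance(
    opening=Decimal("1000.00"),
    credits=[Decimal("250.00"), Decimal("50.00")],
    debits=[Decimal("120.50")],
    closing=Decimal("1179.50"),
)
print(status)
```

Decimal is used instead of float so that amounts such as 0.10 compare exactly, which an equality check on currency values requires.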

Deterministic Output

For structured formats (CAMT, PAIN.001, CSV, OFX, QFX, MT940), the parser produces byte-identical output on every run when given the same input file. There is no randomness, no model inference, and no heuristic sampling. This matters for:

Supply Chain Security

Verify Locally

python -m pytest                          # 718 tests, 100% branch coverage
python scripts/verify_locked_hashes.py    # SHA-256 hash verification
git log --show-signature -1               # Verify commit signature