Posted by cd_mkdir 13 hours ago
The Hard Part was OCR. I benchmarked Tesseract vs. Mistral OCR-3 on messy bank PDFs; Mistral won because it understands table layouts.
Built with Django/Celery to handle 10k+ transaction ledgers.
Happy to answer questions about the OCR pipeline or the graph theory implementation.