Mistral OCR 4 - Hacker News

Posted by meetpateltech 5 hours ago

295 points | 77 comments

Insanity 4 hours ago|

Recently I tried OCR with Opus 4.8. (I know, not technically the right tool for the job.) All I needed to do was extract dates from receipts. It got about 20% of the dates wrong yet rated all as “high confidence”.

Should have probably tried a more OCR-specific model.

staticman2 37 minutes ago||

I don't know about Opus, but I can tell you that with Gemini, the subscription product's OCR is apparently not done by the model. It uses a separate old-fashioned OCR tool and gives bad results in my tests.

But with the Gemini API, the model does do the OCR, resulting in much better accuracy.

rsynnott 3 hours ago|||

> All I needed to do was extract dates from receipts

Was this... not basically a solved problem like 30 years ago? I'm pretty sure the shareware OCR tool that came with a black and white scanner I had at one point would do better than 20% wrong.

nik736 4 hours ago|||

Opus is very good at OCR. Way better than the small 1-4B VLMs. If Opus failed, most likely those smaller models will fail as well.

MostlyStable 4 hours ago||

How long have you been testing this? Have you noted a large improvement? I tested Opus for this quite a while ago (maybe 4.5? Whatever was out about a year ago), and it performed quite poorly on my use case.

nik736 4 hours ago||

I have put together an internal benchmark on 1000s of business documents with weird tables, structure, etc. that I run on every relevant model release. Opus 4.8 performs very very well. But it is obviously overkill for the task (and expensive at doing so). I just wanted to respond to the OP.

Insanity 4 hours ago|||

I'm assuming that the reason I didn't have good success rate is because it was not scanned documents, but photographs, and lighting conditions weren't always ideal. I think scanned business documents are a happy-case scenario in a way. (obv, you seem to run it against some complex documents, so that's impressive)

apawloski 2 hours ago|||

I’m curious what your findings are for the best model for your use case

bpodgursky 4 hours ago||

I do not believe this story.

Opus 4.8 scanned hundreds of PDFs for me recently with the worst handwriting imaginable. 100% successful, other than one record where even I could not figure out what was written.

Insanity 4 hours ago|||

I do not believe this story, because of the message I just posted above.

That's not really productive lol. I'm glad it worked for you, but these models are non-deterministic and 'YMMV' very much applies everywhere. I had it parse receipts (in fairness, in variable lighting), all taken with iPhone cameras in the past year. And yeah, not a great job: about 20% failed to get the date correct. (Not outrageously wrong, e.g. 05/20/2026 becomes 05/23/2026.)

YMMV, glad it worked for you.
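For anyone wanting to put a number on this kind of failure, here is a minimal sketch of scoring extracted dates against hand-checked ground truth. The function names and the MM/DD/YYYY format are assumptions for illustration, not anything used in the thread. It reports the error rate and, for each miss, how many days off the prediction was, which makes near misses like 05/20 vs 05/23 easy to spot.

```python
from datetime import date, datetime


def parse_us_date(text: str) -> date:
    """Parse an MM/DD/YYYY string (assumed format) into a date."""
    return datetime.strptime(text.strip(), "%m/%d/%Y").date()


def score_dates(predicted: list[str], expected: list[str]):
    """Compare model output with ground truth.

    Returns (error_rate, misses), where each miss is a tuple
    (index, predicted, expected, day_delta). day_delta is None when the
    predicted string could not be parsed as a date at all.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one date to score")

    misses = []
    for i, (pred, exp) in enumerate(zip(predicted, expected)):
        exp_date = parse_us_date(exp)
        try:
            pred_date = parse_us_date(pred)
        except ValueError:
            # Unparseable output counts as an error with no day offset.
            misses.append((i, pred, exp, None))
            continue
        if pred_date != exp_date:
            misses.append((i, pred, exp, (pred_date - exp_date).days))
    return len(misses) / len(expected), misses
```

Comparing the error rate with the share of answers the model labelled "high confidence" gives a quick check on whether that confidence label carries any information.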

bpodgursky 4 hours ago|||

Are you sure you weren't using Sonnet or a low-effort reasoning mode?

Insanity 4 hours ago||

Yes, lol

9cb14c1ec0 4 hours ago|||

I believe it. Makes me curious what your prompt was that got such a good result out of Opus.

Ducki 5 hours ago||

I was processing 55 year old paper files, most of them severely degraded, with its predecessor model. I was very impressed! I also tried Abbyy Finereader but it didn't even come close in my experience.

philipkglass 4 hours ago|

I used Abbyy Finereader for several years. I loved it. I completed some large projects with it. Modern VLMs put classic FineReader to shame for processing low-resolution/degraded/non-standard text.

I'm personally using the small Qwen 3.5 models. If you have an OCR problem, Mistral OCR 4 is probably great. Open weights models that you can run on a laptop may also work great.

MostlyStable 4 hours ago||

Does anyone know of OCR benchmarks that include hand-written documents? I'm currently using Gemini pro 3 for this, and error rates are quite good, but it's a little bit pricey, and I'd be interested in a cheaper model that could perform as well, but almost all the OCR benchmarks I'm aware of (and I believe all the ones included in this announcement) are about printed/typeset text.

jimmypk 4 hours ago|

[flagged]

JGB100 2 hours ago||

Not well tested. It switched all U.S. (") double quotation marks to UK-style (') single quotation marks, ignoring the source document. Useless in the US.

pmxi 4 hours ago||

This has been a niche where Mistral has actually been successful. Btw, Hindi and Japanese are bucketed in "Rare Languages," which is odd.

ZiiS 4 hours ago|

I read that as "languages under-represented in the training set".

stri8ted 4 hours ago||

Way too expensive. Google Vision OCR (which they failed to compare against) is $1.50 per 1k pages, vs. $4 from Mistral.

cvdub 16 minutes ago||

It’s not the same service. Google’s vision OCR is pure text extraction, not layout. Pretty sure Google’s doc AI services that can identify headers vs. body text are $10 per 1k pages.

kojoru 3 hours ago||

Interesting: an equivalent Azure Document Intelligence service (scanning with layout) is $10/1k.
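To make the price gap concrete, a small sketch that turns the per-1k-page figures quoted in this thread into a total for a given volume. The numbers are taken from the comments above and are not verified against any vendor's pricing page; the dictionary keys and function names are made up for illustration.

```python
# Prices in USD per 1,000 pages, as quoted by commenters in this thread.
# These are unverified and may not reflect current vendor pricing.
PRICE_PER_1K_USD = {
    "Google Vision OCR (text only)": 1.50,
    "Mistral OCR 4": 4.00,
    "Google Document AI (layout)": 10.00,
    "Azure Document Intelligence (layout)": 10.00,
}


def cost_usd(pages: int, price_per_1k: float) -> float:
    """Total cost for a page count at a flat per-1k-pages price."""
    return pages / 1000 * price_per_1k


def compare(pages: int) -> dict[str, float]:
    """Cost of processing `pages` pages with each quoted service."""
    return {
        name: round(cost_usd(pages, price), 2)
        for name, price in PRICE_PER_1K_USD.items()
    }


if __name__ == "__main__":
    for name, total in compare(250_000).items():
        print(f"{name}: ${total:,.2f}")
```

Note that this compares sticker prices only; as pointed out above, the text-only and layout-aware services are not doing the same job.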

coulix 3 hours ago||

I wonder how it compares to reducto, pulse, extendai.

Ninjinka 3 hours ago||

Is there a complete list of the languages they support, and benchmarks by language, instead of just "Rare Languages"?

mrkn1 3 hours ago||

This runs for free on CPU https://github.com/kouhxp/textsnap

tdubey 4 hours ago|

Are there benchmarks for how this performs on charts, or maybe more accurately, plots? I've yet to find a model that can digitize a plot into X,Y points with some accuracy in my use case of digitizing old datasheets.
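For what it's worth, the part of plot digitization that does not need a model is the axis calibration: once you have pixel coordinates for curve points (from a model, or from clicking by hand), two known ticks per axis are enough to map them to data values. Below is a minimal sketch of that mapping, assuming straight, axis-aligned axes; the function names are hypothetical, and it does not address finding the curve points in the first place.

```python
import math


def make_axis_map(pix_a: float, val_a: float, pix_b: float, val_b: float, log: bool = False):
    """Build a pixel -> data-value function from two calibration ticks.

    (pix_a, val_a) and (pix_b, val_b) are two ticks on the same axis.
    With log=True the axis is treated as log10-scaled.
    """
    if pix_a == pix_b:
        raise ValueError("calibration ticks must be at different pixel positions")
    if log:
        if val_a <= 0 or val_b <= 0:
            raise ValueError("log axes need positive calibration values")
        val_a, val_b = math.log10(val_a), math.log10(val_b)
    scale = (val_b - val_a) / (pix_b - pix_a)

    def to_value(pix: float) -> float:
        # Linear interpolation in (possibly log-transformed) value space.
        v = val_a + (pix - pix_a) * scale
        return 10 ** v if log else v

    return to_value


def digitize(points_px, x_map, y_map):
    """Convert (pixel_x, pixel_y) pairs to (x, y) data points."""
    return [(x_map(px), y_map(py)) for px, py in points_px]
```

Because image y coordinates usually grow downward, the y calibration ticks simply have a larger pixel value for the smaller data value; the same formula handles that without a special case.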
