Posted by meetpateltech 5 hours ago

Mistral OCR 4(mistral.ai)
295 points | 77 comments
Insanity 4 hours ago|
Recently I tried OCR with Opus 4.8. (I know, not technically the right tool for the job.) All I needed to do was extract dates from receipts. It got about 20% of the dates wrong yet rated all as “high confidence”.

Should have probably tried a more OCR-specific model.

staticman2 37 minutes ago||
I don't know about Opus, but I can tell you that with Gemini, the OCR in the subscription product is apparently not done by the model. It uses a separate old-fashioned OCR tool and gives bad results in my tests.

But with the Gemini API, the model itself does the OCR, resulting in much better accuracy.

rsynnott 3 hours ago|||
> All I needed to do was extract dates from receipts

Was this... not basically a solved problem like 30 years ago? I'm pretty sure the shareware OCR tool that came with a black and white scanner I had at one point would do better than 20% wrong.

nik736 4 hours ago|||
Opus is very good at OCR. Way better than the small 1-4B VLMs. If Opus failed, most likely those smaller models will fail as well.
MostlyStable 4 hours ago||
How long have you been testing this? Have you noted a large improvement? I tested Opus for this quite a while ago (maybe 4.5? Whatever was out about a year ago), and it performed quite poorly on my use case.
nik736 4 hours ago||
I have put together an internal benchmark on 1000s of business documents with weird tables, structure, etc. that I run on every relevant model release. Opus 4.8 performs very very well. But it is obviously overkill for the task (and expensive at doing so). I just wanted to respond to the OP.
Insanity 4 hours ago|||
I'm assuming that the reason I didn't have good success rate is because it was not scanned documents, but photographs, and lighting conditions weren't always ideal. I think scanned business documents are a happy-case scenario in a way. (obv, you seem to run it against some complex documents, so that's impressive)
apawloski 2 hours ago|||
I’m curious what your findings are for the best model for your use case
bpodgursky 4 hours ago||
I do not believe this story.

Opus 4.8 scanned hundreds of PDFs for me recently with the worst handwriting imaginable. 100% successful, other than one record where even I could not figure out what was written.

Insanity 4 hours ago|||
I do not believe this story, because of the message I just posted above.

That's not really productive lol. I'm glad it worked for you, but these models are non-deterministic and 'YMMV' very much applies everywhere. I had it parse receipts (in fairness, in variable lighting), all taken with iPhone cameras in the past year. And yeah, not a great job: about 20% failed to get the date correct. (Not outrageously wrong, e.g. 05/20/2026 becomes 05/23/2026.)

YMMV, glad it worked for you.
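
For anyone who wants to put a number on "about 20% wrong" for their own receipts, here is a minimal sketch of how such a check could look. It assumes you have hand-verified dates to compare against and that both sides use MM/DD/YYYY. The function names `parse_us_date` and `score_dates` are made up for illustration, and the sample lists are invented apart from the 05/20/2026 vs 05/23/2026 pair mentioned above.

```python
from datetime import datetime

def parse_us_date(text):
    """Parse MM/DD/YYYY; return None if the string is not a valid date."""
    try:
        return datetime.strptime(text.strip(), "%m/%d/%Y").date()
    except ValueError:
        return None

def score_dates(predicted, expected):
    """Compare predicted date strings with hand-verified ones, pair by pair.

    Returns (error_rate, wrong), where wrong holds one
    (expected, predicted, offset_in_days) tuple per mismatch.
    offset_in_days is None when either side failed to parse.
    """
    wrong = []
    for pred, truth in zip(predicted, expected):
        p, t = parse_us_date(pred), parse_us_date(truth)
        if p != t:
            offset = (p - t).days if p and t else None
            wrong.append((truth, pred, offset))
    error_rate = len(wrong) / len(expected) if expected else 0.0
    return error_rate, wrong

# Illustrative data only: five receipts, one misread by three days.
model_output = ["05/23/2026", "01/02/2026", "03/15/2026", "12/31/2025", "07/04/2026"]
ground_truth = ["05/20/2026", "01/02/2026", "03/15/2026", "12/31/2025", "07/04/2026"]

rate, wrong = score_dates(model_output, ground_truth)
print(f"error rate: {rate:.0%}")
for truth, pred, offset in wrong:
    print(f"expected {truth}, got {pred} (off by {offset} days)")
```

Tracking the day offset alongside the error rate helps separate near misses (a misread digit) from dates that are wildly off, which a self-reported "high confidence" label will not tell you.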

bpodgursky 4 hours ago|||
Are you sure you weren't using Sonnet or a low-effort reasoning mode?
Insanity 4 hours ago||
Yes, lol
9cb14c1ec0 4 hours ago|||
I believe it. Makes me curious what your prompt was that got such a good result out of Opus.
Ducki 5 hours ago||
I was processing 55 year old paper files, most of them severely degraded, with its predecessor model. I was very impressed! I also tried Abbyy Finereader but it didn't even come close in my experience.
philipkglass 4 hours ago|
I used Abbyy Finereader for several years. I loved it. I completed some large projects with it. Modern VLMs put classic FineReader to shame for processing low-resolution/degraded/non-standard text.

I'm personally using the small Qwen 3.5 models. If you have an OCR problem, Mistral OCR 4 is probably great. Open weights models that you can run on a laptop may also work great.

MostlyStable 4 hours ago||
Does anyone know of OCR benchmarks that include hand-written documents? I'm currently using Gemini pro 3 for this, and error rates are quite good, but it's a little bit pricey, and I'd be interested in a cheaper model that could perform as well, but almost all the OCR benchmarks I'm aware of (and I believe all the ones included in this announcement) are about printed/typeset text.
jimmypk 4 hours ago|
[flagged]
JGB100 2 hours ago||
Not well tested. It switched all U.S. (") double quotation marks to UK-style (') single quotation marks, ignoring the source document. Useless in the US.
pmxi 4 hours ago||
This has been a niche where Mistral has actually been successful. Btw, Hindi and Japanese are bucketed in "Rare Languages," which is odd.
ZiiS 4 hours ago|
I read that as "languages under-represented in the training set".
stri8ted 4 hours ago||
Way too expensive. Google Vision OCR (which they failed to compare against) is $1.50 per 1k pages, vs. $4 from Mistral.
cvdub 16 minutes ago||
It’s not the same service. Google’s Vision OCR is pure text extraction, not layout. Pretty sure Google’s doc AI services that can identify headers vs. body text are $10 per 1k pages.
kojoru 3 hours ago||
Interesting: an equivalent Azure Document Intelligence service (scanning with layout) is $10/1k.
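
To make the price comparison in this sub-thread concrete, here is a rough sketch that turns the per-1k-page figures quoted above into a cost for a given volume. The prices are the ones commenters stated (the Google doc AI figure was given as "pretty sure", so treat it as unverified), and the 50,000-page volume, the dictionary labels, and the `cost_for_pages` helper are assumptions for illustration.

```python
# Per-1,000-page prices in USD, as quoted by commenters in this thread.
# These are not checked against any vendor's current price list.
PRICE_PER_1K_PAGES = {
    "Google Vision OCR (text only)": 1.50,
    "Mistral OCR": 4.00,
    "Google doc AI (layout)": 10.00,
    "Azure Document Intelligence (layout)": 10.00,
}

def cost_for_pages(pages, price_per_1k):
    """Cost in USD for a number of pages at a flat per-1k-page price."""
    return pages / 1000 * price_per_1k

pages = 50_000  # hypothetical monthly volume
for service, price in sorted(PRICE_PER_1K_PAGES.items(), key=lambda kv: kv[1]):
    print(f"{service:<40} ${cost_for_pages(pages, price):>8.2f}")
```

This ignores volume discounts and free tiers, and it compares text-only extraction with layout-aware services, which is the distinction being argued about here.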
coulix 3 hours ago||
I wonder how it compares to reducto, pulse, extendai.
Ninjinka 3 hours ago||
Is there a complete list of the languages they support, and benchmarks by language, instead of just "Rare Languages"?
mrkn1 3 hours ago||
This runs for free on CPU https://github.com/kouhxp/textsnap
tdubey 4 hours ago|
Are there benchmarks for how this performs on charts, or maybe more accurately, plots? I've yet to find a model that can digitize a plot into X,Y points with some accuracy in my use case of digitizing old datasheets.
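
For context on what "digitizing a plot into X,Y points" involves once a model (or a person) has located points in pixel space: the remaining step is an axis calibration from pixels to data values. Below is a minimal sketch of that step only, assuming you know two reference ticks per axis. The `make_axis_map` helper and all pixel and tick values are hypothetical, and it does nothing to find the curve in the image.

```python
import math

def make_axis_map(pix_a, val_a, pix_b, val_b, log=False):
    """Build a pixel -> data-value converter from two known axis ticks.

    pix_a, pix_b: pixel coordinates of two reference ticks on one axis.
    val_a, val_b: the data values printed at those ticks.
    log: set True for a logarithmic axis (values must be positive).
    """
    if pix_a == pix_b:
        raise ValueError("reference ticks must be at different pixels")
    if log:
        val_a, val_b = math.log10(val_a), math.log10(val_b)
    scale = (val_b - val_a) / (pix_b - pix_a)

    def to_value(pix):
        v = val_a + (pix - pix_a) * scale
        return 10 ** v if log else v

    return to_value

# Hypothetical calibration: x ticks 0 and 10 at pixels 100 and 500;
# y ticks 0 and 7 at pixels 400 and 50 (image y grows downward).
x_map = make_axis_map(100, 0.0, 500, 10.0)
y_map = make_axis_map(400, 0.0, 50, 7.0)

pixel_points = [(300, 225), (500, 50)]
data_points = [(x_map(px), y_map(py)) for px, py in pixel_points]
print(data_points)
```

Because image y coordinates grow downward, the y-axis scale comes out negative, which the two-tick form handles without a special case. Many old datasheets use log axes, hence the `log` flag.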