
Posted by lukeinator42 11/19/2025

Meta Segment Anything Model 3 (ai.meta.com)
692 points | 134 comments | page 4
rocauc 11/19/2025|
A brief history:
SAM 1 - Visual prompt to create pixel-perfect masks in an image. No video. No class names. No open vocabulary.
SAM 2 - Visual prompting for tracking on images and video. No open vocab.
SAM 3 - Open vocab concept segmentation on images and video.

Roboflow has been long on zero / few shot concept segmentation. We've opened up a research preview exploring a SAM 3 native direction for creating your own model: https://rapid.roboflow.com/

exe34 11/19/2025||
can anyone confirm if this fits in a 3090? the files look about 3.5GB, but I can't work out what the memory needs will be overall.
yeldarb 11/20/2025|
Yes, it should.
exe34 11/20/2025||
thanks!
retinaros 11/20/2025||
A quick question: is it possible in a single prompt to identify multiple types of objects, or do you need to send multiple queries? For example, if I have a prompt "donkey, dogs", will SAM 3 return boxes labeled with their class in one shot, or do I need to send two queries?
aDyslecticCrow 11/20/2025|
Seems like every mask is one pass. So even multiple penguins are multiple inferences. But if it continues from SAM 2, the heavy compute is the original image encoding, which is cached and reused for every inference.

No idea what they will do for their API, but from a compute perspective the prompt is free once the image is processed.
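
To make the encode-once, prompt-many idea concrete, here is a minimal toy sketch. It is not the SAM 3 API: `CachedSegmenter`, `segment_many`, and the fake encoder/decoder bodies are all hypothetical stand-ins, and it assumes SAM 3 keeps the SAM 2 pattern of a heavy image encoder feeding a light prompt-conditioned decoder.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CachedSegmenter:
    """Toy stand-in for an encode-once, prompt-many segmenter (hypothetical)."""

    encoder_calls: int = 0
    _cache: Dict[str, List[float]] = field(default_factory=dict)

    def _encode_image(self, image_id: str) -> List[float]:
        # Placeholder for the expensive image-encoder pass.
        self.encoder_calls += 1
        return [float(ord(c)) for c in image_id]

    def embedding(self, image_id: str) -> List[float]:
        # Run the encoder only on a cache miss.
        if image_id not in self._cache:
            self._cache[image_id] = self._encode_image(image_id)
        return self._cache[image_id]

    def segment(self, image_id: str, prompt: str) -> dict:
        # Placeholder for the cheap prompt-conditioned decoder pass.
        emb = self.embedding(image_id)
        return {"prompt": prompt, "embedding_dim": len(emb)}


def segment_many(model: CachedSegmenter, image_id: str, prompts: List[str]) -> List[dict]:
    # One decoder call per prompt, all sharing the same cached embedding.
    return [model.segment(image_id, p) for p in prompts]


model = CachedSegmenter()
results = segment_many(model, "farm.jpg", ["donkey", "dogs"])
print(model.encoder_calls)  # 1: the image was encoded once for both prompts
```

Under that assumption, "donkey, dogs" would be two decoder calls but only one encoder pass, so the second query adds little cost.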

nowittyusername 11/19/2025||
This thing rocks. I can imagine so many uses for it. I really like the 3D pose estimation especially.
iandanforth 11/20/2025||
I wonder if we'll get an updated DeepSeek-OCR that incorporates this. Would be very cool!
aDyslecticCrow 11/20/2025||
I don't quite see how this would help OCR at all? or am I misunderstanding what kind of OCR you're thinking of?
iandanforth 11/20/2025||
DeepSeek-OCR uses SAM 1 as a component in its pipeline already. It also does layout detection.
aDyslecticCrow 11/21/2025||
That sounds like ludicrous overkill to me.
netdur 11/20/2025||
For document layout! Have you had success understanding document layout using SAM?
maelito 11/19/2025||
Can it detect the speed of a vehicle in any video, unsupervised?
xnx 11/20/2025||
Reminder that Nano Banana is also capable of image segmentation: https://x.com/phillip_lippe/status/1991555954908025123
foota 11/19/2025||
Obligatory xkcd: https://xkcd.com/1425/
hdjrudni 11/20/2025|
That comic doesn't appear to be dated, but I'm sure it's been at least 5 years, so that checks out.
esprehn 11/20/2025||
It's from 2014, over a decade old.

Relevant to that comic specifically: https://www.reddit.com/r/xkcd/comments/mi725t/yeardate_a_com...
