Posted by lukeinator42 11/19/2025
Meta took the open path because their initial foray into AI was compromised, so they have been doing their best to kneecap everyone else since then.
I like the result, but let's not pretend the intent is gracious.
The retort was essentially "Can't you just be nice?", but people have the right to ask questions; sometimes those questions reveal real corruption that actually does go on.
Yes, the 99% did NOT go straight to non-profits; it was instead funneled into his foundation, which has donated millions to actual charitable organizations, but that's arguably millions that wouldn't otherwise have gone to those orgs.
Is it a bit disingenuous to say he's donating 99% of his wealth when his foundation has only donated a few hundred million (or a few billion?), which is a single percent of his wealth? Yeah, probably. But a few billion is more than zero, and is undeniably helpful to those organizations.
Don't basically all the "top labs" except Anthropic now have open-weight models? And Zuckerberg said they were now going to be "careful about what we choose to open source" in the future, which is a shift from their previous rhetoric that "Open Source AI is the Path Forward".
Facebook is a deeply scummy company[2] and their stranglehold on online advertising spend (along with Google) allows them to pour enormous funds into side bets like this.
I prefer to say thank you when someone is doing something good.
These projects come to mind:
SAM (Segment Anything)
PyTorch
Llama
...
Open-source datacenter and server blueprints.
The following, instead, comes from grok.com:
Meta's open-source hall of fame (Nov 2025):

- Llama family (2 → 3.3), 2023–2025: >500k total stars, powers ~80% of models on Hugging Face. Single-handedly killed the closed frontier model monopoly.
- PyTorch, 2017: 85k+ stars, the #1 ML framework in research. TensorFlow is basically dead in academia now.
- React + React Native, 2013/2015: 230k + 120k stars. Still the de facto UI standard for web & mobile.
- FAISS, 2017: 32k stars, used literally everywhere (even inside OpenAI). The vector similarity search library.
- Segment Anything (SAM 1 & 2), 2023–2024: 55k stars. Revolutionized image segmentation overnight.
- Open Compute Project, 2011: entire open-source datacenter designs (servers, racks, networking, power). Google, Microsoft, Apple, and basically the whole hyperscaler industry build on OCP blueprints.
- Zstandard (zstd), 2016: faster than gzip, now in the Linux kernel, NVIDIA drivers, Cloudflare, etc. The new compression king.
- Buck2, 2023: Rust build system, 3–5× faster than Buck1. Handles Meta's insane monorepo without dying.
- Prophet, 2017: 20k stars. The go-to time-series forecasting library for business.
- Hydra, 2020: 9k stars. Config management that saved the sanity of ML researchers.
- Docusaurus, 2017: 55k stars. Powers the docs for React, Jest, Babel, etc.
- Velox, 2022: C++ query engine, the backbone of next-gen Presto/Trino.
- Sapling, 2023: a Git replacement that actually works at 10M+ file scale.

Meta's GitHub org is now >3 million stars total, more than Google + Microsoft + Amazon combined.

Bottom line: if you're using modern AI in 2025, there's a ~90% chance you're running on something Meta open-sourced for free.
[1] I didn't take them up on the offer to interview in the wake of that, so it will forever be known as "I've made a huge mistake."
I put together a YOLO fine-tune for climbing hold detection a while back (trained on 10k labels), and this is 90% as good out of the box: it just misses some foot chips and low-contrast wood holds, and can't handle as many instances. It would've saved me a huge amount of manual annotation, though.
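To make that concrete, here is a minimal sketch of the auto-labeling shortcut I mean: take boxes from a promptable model like SAM3 and write them straight out in YOLO's txt label format instead of drawing them by hand. The SAM3 call itself is omitted (I'm not assuming its exact API); only the standard YOLO conversion is shown.

    # Convert absolute [x1, y1, x2, y2] boxes (e.g. from SAM3 detections) into
    # YOLO label lines: "class cx cy w h", all normalized to image size.
    def boxes_to_yolo(boxes_xyxy, img_w, img_h, class_id=0):
        lines = []
        for x1, y1, x2, y2 in boxes_xyxy:
            cx = (x1 + x2) / 2 / img_w   # normalized box center x
            cy = (y1 + y2) / 2 / img_h   # normalized box center y
            w = (x2 - x1) / img_w        # normalized width
            h = (y2 - y1) / img_h        # normalized height
            lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
        return "\n".join(lines)

    # Example: two detected holds in a 1920x1080 frame -> contents of frame_0001.txt
    print(boxes_to_yolo([(100, 200, 180, 260), (900, 540, 980, 610)], 1920, 1080))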
I actually found the easiest way was to run it for free to see if it works for my use case of person de-identification: https://chat.vlm.run/chat/63953adb-a89a-4c85-ae8f-2d501d30a4...
[1]: https://github.com/facebookresearch/dinov3
[2]: https://imgeditor.co/
I hope this makes sense; I'm using terms loosely. It's an amazing model, but it doesn't work for my use case, that's all!
Edit: answered the question
Deep learning-based methods will absolutely have a place in this in the future, but today's machines usually use classic methods. The advantages are that the hardware is much cheaper and requires less electrical and thermal management. That is changing now with cheaper NPUs, but with machine lifetimes measured in decades, it will take a while.
SAM3 seems to trace the images less precisely: it'll discard bits where kids draw outside the lines, which is okay, but it also seems to struggle around sharp corners and includes a bit of the white page that I'd like cut out.
Of course, SAM3 is significantly more powerful in that it does much more than simply cut out images. It seems to be able to identify what these kids' drawings represent. That's very impressive: AI models are typically trained on photos and adult illustrations, and they struggle with children's drawings. So I could perhaps still use it for identifying content, giving kids more freedom to draw what they like, and then, unprompted, attach appropriate behavior to their drawings in-game.
BiRefNet 2 seems to do a much better job of correctly removing background regions enclosed within the content's outline. So, like hands on hips: that region is fully enclosed, but you want it removed. It's not just that, though; some other models will remove this too, but they'll be overly aggressive and also remove white areas where kids haven't coloured in perfectly, or the intentionally blank whites of eyes, for example.
I'm putting these images in a game world once they're cut out, so if things are too transparent, they look very odd.
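For anyone hitting the same issue, here is a minimal sketch of the workaround I use, assuming the cutout is already saved as an RGBA PNG (the filename and the 0.5 threshold are just placeholders): binarize the soft alpha so nothing ends up half-transparent in the game world.

    import numpy as np
    from PIL import Image

    # Load a cutout produced by whatever matting/segmentation model was used.
    cutout = Image.open("drawing_cutout.png").convert("RGBA")  # placeholder filename
    rgba = np.array(cutout)

    # Harden the alpha channel: every pixel is either fully kept or fully dropped,
    # so enclosed background still disappears but nothing renders semi-transparent.
    alpha = rgba[..., 3].astype(np.float32) / 255.0
    rgba[..., 3] = np.where(alpha >= 0.5, 255, 0).astype(np.uint8)

    Image.fromarray(rgba).save("drawing_cutout_hard.png")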
[Update: I should have mentioned I got the 4-second figure from the roboflow.com links in this thread]
> This excellent performance comes with fast inference — SAM 3 runs in 30 milliseconds for a single image with more than 100 detected objects on an H200 GPU.
I don't even care about the numbers; a vision transformer encoder whose output is too heavy for many edge-compute CNNs to use as input isn't gonna cut it.
You can get an easy-to-use API endpoint by creating a workflow in Roboflow with just the SAM3 block in it (and hooking up an input parameter to forward the prompt to the model); the workflow is then available as an HTTP endpoint. You can use the SAM3 template and remove the visualization block if you just need the JSON response, for slightly lower latency and a smaller payload.
Internally we're seeing roughly a ~200ms HTTP round trip, but our user-facing API currently has some additional latency because we have to proxy to a different cluster where we have more GPU capacity allocated for this model than we can currently get on GCP.
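Roughly, calling such a workflow looks like the sketch below; the endpoint URL, field names, and payload shape are placeholders rather than the definitive API (the workflow page in the app shows the exact schema and a copy-pasteable snippet for your workspace).

    import requests

    # Placeholders: copy the real URL, API key, and input names from your workflow's API tab.
    WORKFLOW_URL = "https://serverless.roboflow.com/infer/workflows/<workspace>/<workflow-id>"
    payload = {
        "api_key": "YOUR_API_KEY",
        "inputs": {
            "image": {"type": "url", "value": "https://example.com/frame.jpg"},
            "prompt": "person",  # the forwarded text prompt mentioned above (assumed parameter name)
        },
    }

    resp = requests.post(WORKFLOW_URL, json=payload, timeout=30)
    resp.raise_for_status()
    print(resp.json())  # JSON-only response if the visualization block was removed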
Two years ago we released autodistill[1], an open-source framework that uses large foundation models to create training data for small realtime models. I'm convinced the idea was right, but too early; there wasn't a big model good enough to be worth distilling from back then. SAM3 is finally that model (and will be available in Autodistill today).
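For anyone who hasn't seen the pattern, it looks roughly like this; the SAM3 import below is an assumption (the integration lands today), so check the Autodistill docs for the actual package name and argument spellings.

    from autodistill.detection import CaptionOntology
    from autodistill_sam3 import SAM3      # assumed package/class name for the new integration
    from autodistill_yolov8 import YOLOv8  # small realtime target model

    # Map text prompts for the big model to the class names the small model should learn.
    ontology = CaptionOntology({"climbing hold": "hold"})

    base_model = SAM3(ontology=ontology)
    base_model.label(input_folder="./frames", output_folder="./dataset")  # auto-label raw images

    target_model = YOLOv8("yolov8n.pt")
    target_model.train("./dataset/data.yaml", epochs=100)  # distill into a deployable model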
We are also taking a big bet on SAM3 and have built it into Roboflow as an integral part of the entire build and deploy pipeline[2], including a brand new product called Rapid[3], which reimagines the computer vision pipeline in a SAM3 world. It feels really magical to go from an unlabeled video to a fine-tuned realtime segmentation model with minimal human intervention in just a few minutes (and we rushed the release of our new SOTA realtime segmentation model[4] last week because it's the perfect lightweight complement to the large & powerful SAM3).
We also have a playground[5] up where you can play with the model and compare it to other VLMs.
[1] https://github.com/autodistill/autodistill
[2] https://blog.roboflow.com/sam3/
[3] https://rapid.roboflow.com
I'm not sure if the work they did with DINOv3 went into SAM3. I don't see any mention of it in the paper, though I just skimmed it.
It makes a great target to distill SAM3 to.
Could you expand on that? Do you mean you're starting with the pretrained DINO model and then using SAM3 to generate training data to make DINO into a segmentation model? Do you freeze the DINO weights and add a small adapter at the end to turn its output into segmentations?
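To make the question concrete, here is roughly what I'm imagining (a minimal sketch, not anyone's actual setup; the backbone output shape, patch size, and feature dim are assumptions for a ViT-S-style DINO, and the SAM3 pseudo-label generation step is left out):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DinoSegAdapter(nn.Module):
        def __init__(self, backbone, feat_dim=384, patch=14, num_classes=1):
            super().__init__()
            self.backbone = backbone
            self.patch = patch
            for p in self.backbone.parameters():  # DINO stays frozen
                p.requires_grad = False
            self.head = nn.Sequential(            # small trainable adapter on top
                nn.Conv2d(feat_dim, 256, 3, padding=1), nn.GELU(),
                nn.Conv2d(256, num_classes, 1),
            )

        def forward(self, x):
            B, _, H, W = x.shape
            with torch.no_grad():
                tokens = self.backbone(x)          # assumed shape: (B, N, C) patch tokens
            h, w = H // self.patch, W // self.patch
            feats = tokens.transpose(1, 2).reshape(B, -1, h, w)
            logits = self.head(feats)
            return F.interpolate(logits, size=(H, W), mode="bilinear", align_corners=False)

    # Training idea: loss = nn.BCEWithLogitsLoss()(model(images), sam3_masks),
    # where sam3_masks are binary masks generated offline by SAM3 on unlabeled images.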
But I'm impressed by the ability of this model to create an image encoding that is independent of the prompt. I feel like there may be lessons in the training approach that could be carried over to a U-Net for a more valuable encoding.