The promised mega-data center deals are meant to boost valuations today, not serve tons of customers three years from now.
Seriously. I have never before seen so many people so willingly drink the marketing kool-aid from the very companies selling the product. It's scarier to me than any threat of AI actually disrupting society (because it is so far from being capable of doing that).
Basically small and medium models that are crazy well trained for their sizes.
Then we have a lot of speculative decoding stuff like MTP and others coming to speed up responses, and finally better quantisation to use less memory.
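For anyone curious what that speculative decoding loop actually looks like, here's a toy sketch of the greedy-acceptance variant. Both "models" below are trivial stand-ins, not real LLMs, and the function names are made up for illustration: a cheap draft model proposes a few tokens, the expensive target model verifies them in what would be one batched pass, and you keep the longest agreeing prefix plus one guaranteed token from the target, so the output matches running the target alone.

    # Toy sketch of speculative decoding (greedy-acceptance variant).
    # draft_next / target_next are trivial stand-ins for real models.

    def draft_next(tokens):          # cheap, slightly wrong "draft" model
        return (tokens[-1] + 1) % 97

    def target_next(tokens):         # expensive "target" model we want to match
        return (tokens[-1] + 1) % 101

    def speculative_generate(prompt, max_new=32, k=4):
        tokens = list(prompt)
        while len(tokens) < len(prompt) + max_new:
            # 1. Draft k tokens autoregressively with the cheap model.
            draft = []
            for _ in range(k):
                draft.append(draft_next(tokens + draft))
            # 2. Target model scores every prefix; in a real system this is a
            #    single batched forward pass, which is where the speedup is.
            verified = [target_next(tokens + draft[:i]) for i in range(k + 1)]
            # 3. Accept draft tokens only while they agree with the target.
            accepted = 0
            while accepted < k and draft[accepted] == verified[accepted]:
                accepted += 1
            # 4. Keep the accepted prefix plus one token from the target itself,
            #    so the result is identical to decoding with the target alone.
            tokens += draft[:accepted] + [verified[accepted]]
        return tokens[:len(prompt) + max_new]

    print(speculative_generate([1, 2, 3]))

The whole trick is that verification is parallel while drafting is cheap, so when the draft model guesses well you get several tokens for roughly the cost of one target pass.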
Local LLMs are the future, and the larger labs know that the open models will eat their lunch once people realise that the gap is only a few months. If the LLMs of a couple of months ago were good enough for us, the open models are good enough now.
That's irrelevant to my decision to use local or not.
I didn't read "and how were those models trained" as "Are we there yet?"
Just totally forgetting that the frontier models themselves stole an insane amount to get to where they are.
It's theft all the way across the board, and when someone tries to argue that theft by the open models is bad but Altman's or Amodei's theft is good... they're revealing a lot about themselves.
I have to assume current architectures aren't optimal though, the idea that we stumbled into the one and only optimal solution seems almost impossible.
If you project out that hardware just a couple of years, and the trained models out a couple of years, you end up in a place where it makes so much more sense to run them locally, for all sorts of latency, privacy, efficacy, and domain-specific reasons.
Not all that different from the old terminal & mainframe->pc shifts.
Finally - hardware has seemingly gotten out ahead of software that most folks use - watching YouTube, listening to music, playing a game or two. There was a time when playing an mp3 or watching a 4k video really taxed all but the nicest systems. Hardware fixed that problem, like it very well could this one.
Definitely not the high end local LLMs. The small ones, yes, absolutely.
> If you project out that hardware just a couple of years
One of the biggest bottlenecks for LLMs is memory capacity and bandwidth. With the current crunch in the memory market, it's unlikely we'll see big advancements in the average memory available, or its bandwidth, on regular (not super high end) devices in the coming years.
Alternatively, it's possible we get dedicated SLMs for e.g. phone-specific use cases that are optimised and run well.
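To put rough numbers on the bandwidth point: a dense model has to stream essentially all of its weights from memory for every generated token, so a back-of-the-envelope ceiling on decode speed is bandwidth divided by model size. The sizes and bandwidth figures below are illustrative assumptions, not benchmarks.

    # Back-of-the-envelope decode speed: tokens/s <= bandwidth / bytes per token.
    # For a dense model read in full per token, bytes per token ~= model size.
    # All figures are rough, illustrative assumptions.

    def max_tokens_per_sec(model_gb, bandwidth_gb_s):
        return bandwidth_gb_s / model_gb

    configs = [
        ("7B @ 4-bit (~4 GB), dual-channel DDR5 (~80 GB/s)",   4, 80),
        ("7B @ 4-bit (~4 GB), Apple M-series class (~400 GB/s)", 4, 400),
        ("70B @ 4-bit (~40 GB), dual-channel DDR5 (~80 GB/s)", 40, 80),
    ]

    for name, size, bw in configs:
        print(f"{name}: <= {max_tokens_per_sec(size, bw):.0f} tokens/s")

Which is why capacity decides whether the model loads at all, but bandwidth caps how fast it talks back.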
I consider it very careless to entrust your emails, your chats, your calendar, your notes, your calls, your pictures, your contacts, your location history, your waking hours, your files, your TODO list, i.e. everything up to and including your health data, to for-profit AI companies. The temptation to earn money with your data is just too great, plus there's the risk of the data being stolen and sold illegally.
Local AI should be the default. For everyone who can't do local AI, we need confidential compute. Yes, it has been hacked before. But it makes attacks a lot harder.
Still, we all do it with Google. (I don't do it anymore, but I did it for almost two decades, so I include myself.)
The obvious optimization for the case presented would be to generate all the summaries on a server instead of in the client. Then the total compute used would scale with the number of articles instead of the number of users.
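A minimal sketch of that optimization, with made-up names (summarize_with_llm stands in for whatever model call the service would actually use): summarize each article once on the server and cache the result by article ID, so every additional reader is a cache hit rather than another inference.

    # Illustrative sketch: summarize each article once on the server and cache it,
    # so total LLM compute scales with the number of articles, not readers.

    summary_cache = {}  # article_id -> summary

    def summarize_with_llm(text):
        # Stand-in for a real model call.
        return text[:100] + "..."

    def get_summary(article_id, article_text):
        if article_id not in summary_cache:           # first reader pays the cost
            summary_cache[article_id] = summarize_with_llm(article_text)
        return summary_cache[article_id]              # everyone else gets a cache hit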
Damned if they do, damned if they don't.
You can also…turn it off.
Chrome silently opted people into it _and_ downloaded the model without asking, because they decided that's something they (Chrome) fancied doing.
The difference should be pretty obvious.
This comment is quite dishonest about the nature of the discussion.
Also, why doesn't their task manager show that it's actually the one downloading? Why does it go out of its way to hide this activity?
Since I have conky on my desktop, I was able to catch this immediately and take the action I preferred with my own computer, which was to _immediately_ disable it.
https://developer.chrome.com/blog/new-in-chrome-148#prompt-a...
https://www.google.com/chrome/ai-innovations/
They have absolutely not been shy about any of this.
Please show me where in either of those documents it explains it's going to download a 4GB model.
It's a totally separate tab that opens. It's got nothing to do with what you use as your homepage.
I'm on Gentoo. I have to update Chrome manually. I updated it. On update I _never_ get a "what's new" page. I've had this profile for more than a decade, so I have no actual idea why, but I can absolutely tell you I do *not* get one. After the update it started consuming all my bandwidth. This usage did not show in its task manager. I have a metered connection. This is a problem for me. I worried it was a compromised plugin. I had to spend 10 minutes in Firefox discovering why Chrome was doing this, then going to the configuration and disabling it.
This was a disappointing experience. I'm sorry you feel differently; other than stating the obvious, I seriously have no idea what you and the other corporate defense squad members are trying to achieve with this gaslighting nonsense.
Note that this package and its updates are actually not maintained by Google at all; they're handled by Gentoo: https://wiki.gentoo.org/wiki/Project:Chromium/How_to_bump_Ch...
I hate to be an apologist for anything but I think you are pointing fingers in the wrong place. The Google-official releases use the built-in automatic updater and do show What's New. This is a Gentoo release and they chose to do their own thing for updates.
Not to mention that the LLM I choose to run requires a monster machine and is infinitely more capable than whatever Google chose to put in their browser?
I mean, none of this affects me because I don't use chrome, obviously, but you don't see the difference? Bewildering.
I may personally be of modest intelligence, but to acquire the intelligence that I do have, I did not need to train on every book ever written, every Wikipedia article ever written, every blog post ever written, every reference manual ever written, every line of code ever written, and so on. In fact, I didn't train on even 1% of those materials, or even 0.00000000001% of those. The texts themselves were demonstrably not a prerequisite for intelligence.
At minimum, given that it only took me about 20 years of casual observation of my surroundings to approximate intelligence, this is proof positive that the only "dataset" you need is a bunch of sensors and the world around you.
And yes, of course, the human brain does not start from zero; it had a few million years of evolution to produce a fertile plot for intelligence to take root. But that fundamental architecture is fairly generic, and does not at all seem predicated on any sort of specific training set. You could feasibly evolve it artificially.
A universal translator with image and voice recognition and a decent breadth of encyclopedic knowledge, fitting in only a small fraction of the size of an English Wikipedia dump (6 GB vs. 20+ GB), is not "huge".
It is probably closer to the theoretical limit than anyone could have expected.
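For a sense of scale, here's the trivial arithmetic on what fits in roughly 6 GB of weights at common precisions; the specific model being referenced isn't named, so these are just generic numbers, ignoring GB-vs-GiB rounding.

    # Rough illustration of how many parameters fit in ~6 GB of weights.
    budget_gb = 6
    for name, bytes_per_param in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
        params_b = budget_gb / bytes_per_param  # billions of parameters
        print(f"{name}: ~{params_b:.0f}B parameters in {budget_gb} GB")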
In the future, when regular home computers have the capabilities of modern servers, we'll be able to train the entire LLM at home.
Great observation! Often the excitement of novelty makes us lose sight of the real goal.