
Posted by root-parent 4 hours ago

Google limits Meta's use of its Gemini AI models (www.cnbc.com)
119 points | 56 comments
HarHarVeryFunny 4 hours ago|
This seems to be a bit of a misleading headline.

In the current climate, limiting someone's use of AI might be expected to mean restricting access or restricting what they can do with it, but the story here seems to be about capacity constraints, not any limitation on which models or capabilities Google is giving Meta access to.

dwroberts 1 hour ago||
Given Meta’s current AI situation though, I wouldn’t be surprised if they were trying to do distillation and the capacity story is a cover
londons_explore 4 hours ago||
These kinds of limits happen all the time for big clients.

Cloud services like to present the illusion of an infinite amount of compute available at a fixed price per unit, but the reality is if you try to use too much of any service you'll find you have a quota and requests to increase it will fall on deaf ears if the provider doesn't have more of that resource.

Too much of my working life has been spent shoehorning services into less space/compute/ram/spindles or migrations to other data centers to solve such issues.

gchamonlive 3 hours ago||
If you allow me a bit of pedantry, it's infinite "for all intents and purposes". It doesn't mean you can request civilizational levels of compute, but for a blog, a CRUD app, an ETL job and such, that is, regular use cases at a sensible scale, it can absorb any elastic demand.

Having said that, I agree with you. You have to request limit increases often and can't scale even in those instances if you don't plan ahead.

microgpt 2 hours ago|||
Yeah but you don't need cloud for a blog. Cloud was sold as effectively infinite resources - capacity isn't infinite, or effectively infinite, it's 20% more than you are currently using and you pay 300% more for that.

There has to be a name for this deceptive marketing tactic where you say something is unlimited and then it is only unlimited as long as you don't use very much.

It would be one thing if you occasionally got a "no more capacity" error when requesting large amounts of resources but it doesn't work that way. They confine you to a relatively small amount of resources the entire time you have an account. If you want more you have to request it.

hirako2000 1 hour ago|||
It was sold as flexible, near instant provisioning of DC level resources. I don't recall having seen infinite anywhere.
microgpt 1 hour ago||
It's not flexible if it only flexes 20% above your current usage
sebzim4500 1 hour ago||
For 99+% of users it will flex 10000% above their current usage
microgpt 8 minutes ago||
For 99% of users they could save three quarters of their costs by switching to a traditional VPS provider.
gchamonlive 2 hours ago|||
A blog for your product, if your product is already on the cloud, is a very sensible use case for the cloud. A static one deployed to a bucket behind a CDN is fast, cached at the edge, and highly available.

The tiny blog sure isn't for the cloud, but also it's not the main client of the cloud.

> it's 20% more than you are currently using and you pay 300% more for that.

I'm assuming you are comparing to self hosting. Then you need to account for things that are difficult to put a price on, like your time maintaining physical infrastructure and the lessons you will learn from it.

It sounds like I'm defending big cloud, but there are valid uses that get dismissed because it's trendy to hate on the cloud.

> They confine you to a relatively small amount of resources the entire time you have an account. If you want more you have to request it.

It's a form of KYC, nothing wrong with that.

microgpt 2 hours ago|||
I compare cloud to non-cloud VPSes. If you compare them to self-hosting the price is even more biased against cloud, even with current RAM prices. Did you know you can get 40G or 100G dedicated internet to your colo rack for something like $2000 a month (prices vary greatly, YMMV)? Colo only makes sense if you need a fairly large quantity of compute resources, but the per-unit cost can be very good. Every other style of hosting is building on top of it with a profit margin, after all.
tough 2 hours ago|||
If I'm going to have to ask for capacity, why don't I just get my own bare metal servers then?
hirako2000 1 hour ago||
Because you don't have to wait for weeks just for delivery. And you pay for elastic usage.
vidarh 1 hour ago||
You can order bare metal servers with delivery times of minutes from any number of hosting providers, and the cost difference is so huge you can afford to keep excess capacity and still come out ahead.
tough 49 minutes ago||
I run the CI infra for our company, and our bare metal costs (sans my salary baked in) are an order of magnitude lower than with any CI SaaS provider like GitHub or others.

Like, it's literally 10x more expensive to run CI jobs that way...

I don't want to imagine the margin AWS has in general, because it could easily be 90% too

microgpt 6 minutes ago||
Right? It's actually crazy how much they don't cost. Are you using it more than 10%? If so, you're saving money.

I assume you're using your own server and not a provider like Hetzner? So you did have a substantial delivery time. Although in my city there is a recycler that resells used servers, and I could show up there with a truck and get a server within hours if I'm not too picky. Or use some random desktop or laptop off the pile, short-term.
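
The 10% figure above follows from the roughly 10x price gap mentioned upthread. A minimal sketch of that break-even arithmetic, assuming the owned server costs the same whether idle or busy and the hosted service bills only for time actually used (the function name and the 10x ratio are illustrative, not measured figures):

```python
def break_even_utilization(price_ratio: float) -> float:
    """Fraction of time a fixed-cost server must be busy to match pay-per-use.

    price_ratio: pay-per-use price divided by the fixed server's price for
    the same amount of compute time (assumed to be about 10, per the thread).
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Pay-per-use costs u * price_ratio * F at utilization u, versus a flat F
    # for the owned server, so the two are equal when u = 1 / price_ratio.
    return 1.0 / price_ratio


# With a 10x gap, the owned server wins once it is busy more than 10% of the time.
print(f"{break_even_utilization(10):.0%}")
```

This ignores staff time, power, and hardware refresh, which the thread notes are hard to price.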

vidarh 1 hour ago||||
Even as a small customer it's easy to hit quotas or hit availability constraints of more unusual instance types.
hapless 2 hours ago|||
definitionally that's "for some intents and purposes" my man
gchamonlive 1 hour ago||
For all intents and purposes is a figure of speech, meaning in every practical sense.
kouunji 2 hours ago||
Google makes claims here about high demand for Gemini - does anyone here have insight into how much of the load on Google is paid use vs the load from putting AI summaries into every web search?
vineyardmike 3 minutes ago||
My curiosity is not the free AI summaries (which they can opaquely tune as necessary), but instead the renting of TPUs to Anthropic and OpenAI. Many of these contracts were announced last minute and seemed to involve a very desperate Anthropic. Based on the Anthropic/xAI data center contract, they’re willing to pay crazy markup to get immediate access to compute.

I want to know how impacted Gemini has been by that, because that will reveal a lot about their margins and revenue generating first party demand. Each MSFT earnings report they discuss the balance they’re dealing with between supplying GPUs to Azure customers and first party demand.

My pet theory is that Gemini is “losing” the LLM race because they’re preferentially selling the TPUs to competitors, while keeping just enough for themselves to stay competitive and build their own products.

kzrdude 10 minutes ago|||
Don't know, but Gemini 3.1 flash lite is available for free under relatively generous limits, and it had lots of random interruptions when I was testing it. (Intermittently responding with errors due to high load.)
cherryteastain 9 minutes ago|||
It's worse than OpenAI or Anthropic. However their lower tier consumer offerings can sometimes be had for <$10/mo on offer and come bundled with other Google services like cloud storage.
singingtoday 1 hour ago|||
We use Gemini for some specific tasks. It is often unavailable due to capacity limits or other downtime.

It's probably the best multimodal model I've worked with (if somebody knows a better one for audio analysis, please let me know!)

nicce 1 hour ago||
I don't know the numbers, but their APIs have bad uptime in my experience for some models. Requests fail too often because of "traffic too high".
sunaookami 1 hour ago||
Yeah I had a trial for AI Pro or whatever it's called and could never use Gemini CLI (when it still existed) because it was constantly "overloaded". Using the API directly (without a subscription) sometimes works but the models are so buggy and the endpoints spew errors so constantly that it's not usable. See this forum thread for example: https://discuss.ai.google.dev/t/frequent-503-errors-service-... it started with 503 errors since JANUARY and it's still not fixed. These are "stable" GA models!

I HIGHLY doubt that Gemini is overloaded, Google has been bullshitting with their crap models since release. Waste of everyone's time.
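
For the intermittent 503 / "traffic too high" failures described above, the usual client-side mitigation is to retry with exponential backoff. A minimal sketch, assuming a hypothetical `ServiceUnavailable` exception and a caller-supplied `request` function; neither is part of any Google SDK, and this does not help if the service stays unavailable for long stretches:

```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical error standing in for an HTTP 503 / overloaded response."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      jitter=True, sleep=time.sleep):
    """Call request() until it succeeds or max_attempts tries have failed."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ... by default
            if jitter:
                # Randomize the wait so many clients do not retry in lockstep.
                delay = random.uniform(0, delay)
            sleep(delay)
```

The `sleep` parameter exists only so the waiting can be stubbed out when testing; in normal use the default `time.sleep` is enough.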

symisc_devel 4 hours ago||
I do believe this will be the norm from now on for access to top frontier models. Compute capacity limits, state restrictions, and KYC will be imposed on organisations that want access, and individuals will be served last in the queue with degraded performance. Once the Chinese models catch up, nobody (at least no individuals) will turn back to the frontier labs.
mden 3 hours ago||
This seems less about frontier models and restriction and more just lack of compute capacity to meet demand. This has always been an issue for large clients running on cloud, though not to this extent.
fancyfredbot 3 hours ago||
[dead]
HarHarVeryFunny 4 hours ago||
It's interesting that Meta is heavily using Google's models (as opposed to Anthropic or OpenAI) given that they are not SOTA for coding. I wonder if this is for some strategic/competitive reason, or maybe for cost saving?
dofm 3 hours ago||
I would imagine there are many situations within Meta's applications where relatively small models can do a good job — sentiment analysis, abusive language detection, characterising users based on their posts, summarising a user's complaint so it can be ignored more efficiently, assessing whether ads are likely to be fraudulent so they can be run more often, etc.
sarjann 3 hours ago|||
Google tends to be very good at vision and smaller/edge models
HarHarVeryFunny 3 hours ago|||
Hmm ... I was assuming they were using these models for development, but I wonder if any of it might be for production instead - perhaps using vision models to analyze posted content? That would certainly be massive scale, but I'd have thought that scale would require them to be running in their own datacenters.

OTOH, if they are stressing Google's capacity then it seems it has to be for production use, which would reflect a massive failure on Meta's side given their investment in datacenters and AI. If they can't utilize their own models and datacenters, then maybe they should just rent the excess capacity to Google! :)

fer 2 hours ago||||
I double-check anything ML/AI-related with Gemini. It's anecdotal, but I feel like it's much more solid at explaining things and pointing out pitfalls.
Chu4eeno 36 minutes ago|||
Not really. Recent Gemini models in particular tend to hallucinate an unbelievable amount, especially with visual input.

And their safety tuning is neither effective nor precise on edge models.

re-thc 2 hours ago|||
> It's interesting that Meta is heavily using Google's models (as opposed to Anthropic or OpenAI)

Who says they aren't? Could be using all of them for "research".

ZappoMan 3 hours ago||
[flagged]
netdur 3 hours ago||
Google is the only frontier LLM lab that can supply AI at huge enterprise scale, yet it still struggles. The other one is SpaceX, but their LLM is Grok
microgpt 3 hours ago|
also the only cloud platform, the only workspace, the only cloud drive... it's just standard Google fare
snake_doc 2 hours ago||
Isn't image/video understanding still quite cost-effective with the Gemini Flash series models?

I’d imagine the image generation and Veo models are quite effective for creators; new Instagram accounts with AI content that garner millions of followers in a span of weeks are quite common now

Zambyte 4 hours ago||
Facebook does seem to be falling behind. Does anyone here use Llama over more recent options for any technical reasons?
moshegramovsky 2 hours ago||
Facebook is ethically challenged and that's putting it very very very mildly. Yes, they have unlimited money, but at a certain point, it comes across like a rich dude at a bar telling a beautiful woman that he'll buy her a diamond bracelet if she will just come over to his place right now. They make my skin crawl.
khurs 4 hours ago|||
if you use this as a rough gauge: https://openrouter.ai/models?order=top-weekly

Llama Meta 70b is 50th or so down the list of popular models.

It has 24.1b tokens used in 7 days vs the top models that have trillions or hundreds of billions of tokens.

So practically dead!

therealdrag0 23 minutes ago||
Is that biased towards code generation? As opposed to application features using LLMs, which I think is more what we’re talking about.
dataminded 3 hours ago||
Meta's latest model is Muse Spark, and it is not available outside of Meta's own products.

https://ai.meta.com/blog/introducing-muse-spark-msl/

htrp 1 hour ago||
still waiting on that API launch which was supposed to happen very very soon
gcanyon 2 hours ago||
Meta builds its own models. How similar is this to a story with the headline “OpenAI limits Anthropic’s use of its ChatGPT AI models.”?
notatoad 2 hours ago|
Not similar at all, as explained in the article below the headline.
sidcool 2 hours ago|
How do they figure out it's being used by Meta?
jsnell 2 hours ago|
... Because Meta have a contract with Google, are paying for the requests, and are supplying their API key with every request.