Posted by baylearn 20 hours ago
Maybe it is the reverse? It is not them offering a product; it is the users offering their interaction data. Data which might be harvested for further training of the real deal, which is not the product. Think about it: they (companies like OpenAI) have created a broad and diverse user base which, without a second thought, feeds them up-to-date info about everything happening in the world, down to the individual life and even their inner thoughts. No one in the history of mankind has ever had such a holistic view, almost a god's-eye view. That is certainly something a superintelligence would be interested in. They may have achieved it already and we are seeing one of its strategies playing out. Not saying they have, but this observation is not incompatible with it, nor does it indicate they haven't.
People bring problems to the LLM, the LLM produces some text, people use it and later return to iterate. This iteration functions as feedback for the LLM's earlier responses. If you judge an AI response by the next 20 or more rounds of interaction, you can gauge whether it was useful. They can create RLHF data this way, using hindsight or extra context from other related conversations of the same user on the same topic. That works because users try the LLM's ideas in reality and bring the outcomes back to the model, or they simply recall from personal experience whether that approach would work. The system isn't just built to be right; it's built to be correctable by the user base, at scale.
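Roughly, the idea in code - a minimal, hypothetical sketch of mining "hindsight" labels from multi-turn logs. The Turn class, keyword lists, and scoring rule here are illustrative assumptions of mine, not anything the labs have disclosed:

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str   # "user" or "assistant"
    text: str

# Toy signals of success/failure in later user turns (purely illustrative).
POSITIVE = ("that worked", "thanks, that fixed it", "perfect")
NEGATIVE = ("that didn't work", "still broken", "doesn't exist", "wrong")

def hindsight_reward(turns: list[Turn], assistant_idx: int, window: int = 20) -> float:
    """Score turns[assistant_idx] using up to `window` later turns as evidence."""
    score = 0.0
    for turn in turns[assistant_idx + 1 : assistant_idx + 1 + window]:
        if turn.role != "user":
            continue
        lowered = turn.text.lower()
        if any(p in lowered for p in POSITIVE):
            score += 1.0
        if any(n in lowered for n in NEGATIVE):
            score -= 1.0
    return score

def to_training_records(turns: list[Turn]):
    """Emit (prompt, response, reward) triples usable for reward-model training."""
    for i, t in enumerate(turns):
        if t.role == "assistant":
            prompt = " ".join(x.text for x in turns[:i] if x.role == "user")
            yield {"prompt": prompt, "response": t.text,
                   "reward": hindsight_reward(turns, i)}
```

The point is only that the label for an early response can come from what the user reports many turns later, not that this particular scoring heuristic is what anyone actually runs.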
OpenAI has 500M users; if each generates 1,000 tokens/user/day, that is 0.5T interactive tokens/day. The chat logs dwarf the original training set in size and are very diverse, targeted to our interests, and mixed with feedback. They are also "on policy" for the LLM, meaning they contain corrections to mistakes the LLM actually made, not generic information like a web scrape.
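Back-of-the-envelope check of that scale claim, using only the figures above:

```python
# Figures from the comment: ~500M users at ~1,000 tokens/user/day.
users = 500_000_000
tokens_per_user_per_day = 1_000
daily_tokens = users * tokens_per_user_per_day
print(f"{daily_tokens:.1e} interactive tokens/day")  # 5.0e+11, i.e. ~0.5T/day
```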
You're right that LLMs eventually might not even need to crawl the web; they have the whole of society dumping data into their open mouths. That never happened with web search engines; only social networks managed it in the past. But social networks are filled with our culture wars and self-conscious posing, while the chat room is an environment where we don't need to signal our group alignment.
Web scraping gives you humanity's external productions - what we chose to publish. But conversational logs capture our thinking process, our mistakes, our iterative refinements. Google learned what we wanted to find, but LLMs learn how we think through problems.
I distinctly remember search engines 30 years ago having a "live searches" page (with optional "include adult searches" mode)
Or the best models should be free to use. If it's free to use, then I think I can live with it.
More like usher in climate catastrophe way ahead of schedule. AI-driven data center build-outs are a major source of new energy use, and the trend is only intensifying. Dangerously irresponsible marketing cloaks the impact of these companies on our future.
This is the main point that proves to me that these companies are mostly selling us snake oil. Yes, there is a great deal of utility from even the current technology. It can detect patterns in data that no human could; that alone can be revolutionary in some fields. It can generate data that mimics anything humans have produced, and certain permutations of that can be insightful. It can produce fascinating images, audio, and video. Some of these capabilities raise safety concerns, particularly in the wrong hands, and important questions that society needs to address. These hurdles are surmountable, but they require focusing on the reality of what these tools can do, instead of on whatever a group of serial tech entrepreneurs looking for the next cashout opportunity tell us they can do.
The constant anthropomorphization of this technology is dishonest at best, and harmful and dangerous at worst.
No, it can generate data that mimics anything humans have put on the WWW
But it is far from snake oil, as it actually is useful and does a lot of stuff, really.
Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.
As far as I can tell smart engineers are using AI tools, particularly people doing coding, but even non-coding roles.
The criticism feels about three years out of date.
The other reason is that the primary focus of the last three years has been scaling up the data and hardware, with a bunch of (much-needed) engineering around it. This has produced better results, but it can't sustain the AGI promises much longer. The industry can only survive on shiny value-added services and smoke and mirrors for so long.
Even just in industry, I think data functions at companies will have a dicey future.
I haven't seen many places where there's scientific peer review - or even software-engineering-level code-review - of findings from data science teams. If the data scientist team says "we should go after this demographic" and it sounds plausible, it usually gets implemented.
So if the ability to validate was already missing even pre-LLM, what hope is there for validating the LLM-powered replacement? And what hope does the person doing the non-LLM version have of keeping their job (at least until several quarters later, when the strategy either proves itself out or doesn't)?
How many other departments are there where the same lack of rigor already exists? Marketing, sales, HR... yeesh.
Last week I had Claude and ChatGPT both tell me different non-existent options to migrate a virtual machine from VMware to Hyper-V.
The week before that, one of them (I don't remember which, honestly) gave me non-existent options for fio.
Both of these are things that the first-party documentation or man page has correct, but I was being lazy and trying to save time or be more efficient, like these things are supposed to let us do. Not so much.
Hallucinations are still a problem.
A few days ago, I asked free ChatGPT to tell me the head brewer of a small brewery in Corpus Christi. It told me that the brewery didn't exist - which it did, since we were headed there a few minutes later - and after re-prompting it, it gave me some phone number it found in a business filing. (ChatGPT has been using web search for RAG for some time now.)
Hallucinations are still a massive problem IMO.
Nonsense, there is a TON of discussion around how the standard workflow is "have Cursor-or-whatever check the linter and try to run the tests and keep iterating until it gets it right," which is nothing but "working around hallucinations." Functions that don't exist. Lines that don't do what the code would require them to do. Etc. And yet I still hit cases at least weekly, when trying to use these "agents" to do more complex things, where one talks itself into a circle and can't figure it out.
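For reference, that loop is roughly the following. `generate_patch`, `run_linter`, and `run_tests` are hypothetical stand-ins (not a real Cursor API), shown only to make the "iterate until the linter and tests pass" mechanism explicit:

```python
# Hypothetical stand-ins for the tools an agent would wrap; not a real API.
def generate_patch(task: str, feedback: str) -> str:
    """Placeholder for the LLM call that proposes code."""
    return f"# proposed patch for: {task}\n"

def run_linter(patch: str) -> list[str]:
    """Placeholder: return lint errors (e.g. from a flake8/eslint wrapper)."""
    return []

def run_tests(patch: str) -> list[str]:
    """Placeholder: return failing test descriptions (e.g. from a pytest wrapper)."""
    return []

def fix_with_feedback(task: str, max_rounds: int = 5):
    """Regenerate until the linter and tests stop complaining, or give up."""
    feedback = ""
    for _ in range(max_rounds):
        patch = generate_patch(task, feedback)
        errors = run_linter(patch) + run_tests(patch)
        if not errors:
            return patch          # lints clean and tests pass
        # Hallucinated functions, wrong lines, etc. surface here as errors
        # and get fed back for another round - the "work around hallucinations" loop.
        feedback = "\n".join(errors)
    return None                   # talked itself into a circle; still broken
```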
What are you trying to get these things to do, and how are you validating that there are no hallucinations? You hardly ever "hear about it" but ... do you see it? How deeply are you checking for it?
(It's also just old news - a new hallucination is less newsworthy now, we are all so used to it.)
Of course, the internet is full of people claiming that they are using the same tools I am but with multiple factors higher output. Yet I wonder... if this is the case, where is the acceleration in improvement in quality in any of the open source software I use daily? Or where are the new 10x-AI-agent-produced replacements? (Or the closed-source products, for that matter - but there it's harder to track the actual code.) Or is everyone who's doing less-technical, less-intricate work just getting themselves hyped into a tizzy about getting faster generation of basic boilerplate for languages they hadn't personally mastered before?
As such, even if there is a lot of money AI will make, it can still be the right decision to sell tools to others who will figure out how to use it. And of course, if it turns out to be another pointless fad with no real value, you still make money. (I'd predict the answer is in between - we are not going to get some AGI that takes over the world, but there will be niches where it is a big help, and those niches will be worth selling tools into.)
Now they are cancelling those plans. For them "AGI" was cancelled.
OpenAI claims to be closer and closer to "AGI" even as more top scientists leave or get poached by other labs that are behind.
So why would you leave if the promise of achieving "AGI" was going to produce "$100B of profits," as per OpenAI's and Microsoft's definition in their deal?
Their actions tell more than any of their statements or claims.
Seems to be about this:
> As per the current terms, when OpenAI creates AGI - defined as a "highly autonomous system that outperforms humans at most economically valuable work" - Microsoft's access to such a technology would be void.
https://www.reuters.com/technology/openai-seeks-unlock-inves...
Microsoft itself hasn't said they're doing this because of oversupply in infrastructure for its AI offerings, but they very likely wouldn't say that publicly even if that were the reason.
They are leaving for more money, more seniority, or because they don't like their boss. Nothing to do with AGI.
Another way to say it is that people think it's much more likely for each decent LLM startup to grow really strongly for its first several years and then plateau than for their current, established employer to hit hypergrowth because of AGI.
The 'no one jumps ship if AGI is close' assumption is really weak, and seemingly completely unsupported in TFA...
Then they leave for more money.
Of course, but that's part of my whole point.
Such statements and targets about how close we are to "AGI" have become nothing but false promises, with AGI used as the prime excuse to keep raising more money.
At Microsoft, "AI" is spelled "H1-B".
To fund yourself while building AGI? To hedge risk that AGI takes longer? Not saying you're wrong, just saying that even if they did believe it, this behavior could be justified.
Observation of reality is more consistent with company FOMO than with actual usefulness.
Personally I think AGI is ill-defined and won't happen as a new model release. Instead the thing to look for is how LLMs are being used in AI research and there are some advances happening there.
What if chatbots and user interactions ARE the path to AGI? Two reasons they could be: (1) Reinforcement learning in AI has proven to be very powerful. Humans get to GI through learning too - they aren’t born with much intelligence. Interactions between AI and humans may be the fastest way to get to AGI. (2) The classic Silicon Valley startup model is to push to customers as soon as possible (MVP). You don’t develop the perfect solution in isolation, and then deploy it once it is polished. You get users to try it and give feedback as soon as you have something they can try.
I don't have any special insight into AI or AGI, but I don't think OpenAI selling useful and profitable products is proof that there won't be AGI.
The central claim here is illogical.
The way I see it, if you believe that AGI is imminent, and if your personal efforts are not entirely crucial to bringing AGI about (just about all engineers are in this category), and if you believe that AGI will obviate most forms of computer-related work, your best move is to do whatever is most profitable in the near-term.
If you make $500k/year, and Meta is offering you $10M/year, then you ought to take the new job. Hoard money, true believer. Then, when AGI hits, you'll be in a better personal position.
Essentially, the author's core assumption is that working for a lower salary at a company that may develop AGI is preferable to working for a much higher salary at a company that may develop AGI. I don't see how that makes any sense.
Also, $10M would be a drop in the bucket compared to being a shareholder of a company that has achieved AGI; you could also imagine the influence and fame that come with it.
It'll be Vaswani and the others for the transformer, then maybe Zelikman and those on that paper for thought tokens, then maybe some of the RNN people and word embedding people will be cited as pioneers. Sutskever will definitely be remembered for GPT-1 though, being first to really scale up transformers. But it'll actually be like with flight and a whole mass of people will be remembered, just as we now remember everyone from the Wrights to Bleriot and to Busemann, Prandtl, even Whitcomb.
I'm not an aerodynamicist, and I know about those guys, so they can't be infinitely obscure. I imagine every French person knows about Bleriot at least.
Bleriot was a French aviation pioneer, not a physicist. He built the first monoplane. Busemann was an aerodynamicist who invented wing sweep and also did important work on supersonic flight. Prandtl is known for research on lift distribution over wings, wingtip vortices, and induced drag; he basically invented much of the theory of wings. Whitcomb gave his name to the Whitcomb area rule, although Otto Frenzl had come up with it earlier during WWII.
It means you don't have much faith that the company you're working at will be the ones to pull it off.
And if they don't like their boss and the other job sounds better, well...
Uh, sure. How many rocket engineers who worked on the moon landing could you name?
Unless you’re a significant shareholder, that’s almost always the best move, anyway. Companies have no loyalty to you and you need to watch out for yourself and why you’re living.
I know people who've taken quite good comp from startups to do things that would require fundamental laws of physics to be invalidated; they took the money and devised experiments that would show the law to be wrong.
I know that sounds broad or obvious, but people seem to easily and unknowingly wander into "Human intelligence is magically transcendent".
I don't know if you're making it, but the simplest mistake would be to think that you can prove that a computer can evaluate any mathematical function. If that were the case, then "it's got to be doable with algorithms" would have a fairly strong basis: anything the mind does that an algorithm can't would have to be so "magically transcendent" that it's beyond the scope of the mathematical concept of a "function". However, this isn't the case. There are many mathematical functions that are proven to be impossible for any algorithm to implement. Look up uncomputable functions if you're unfamiliar with this.
The second mistake would be to think that we have some proof that all physically realisable functions are computable by an algorithm. That's the Physical Church-Turing Thesis mentioned above, and as the name indicates it's a thesis, not a theorem. It is a statement about physical reality, so it could only ever be empirically supported, not some absolute mathematical truth.
It's a fascinating rabbit hole if you're interested - what we actually do and do not know for sure about the generality of algorithms.
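If a concrete example of the first point helps: here is a minimal sketch of the classic halting-problem diagonalization, in Python purely for illustration (the function names are mine; the argument is the standard one).

```python
# The halting function H(p, x) = "does program p halt on input x?" is a
# perfectly well-defined mathematical function, yet no algorithm computes it.
# Sketch: given any candidate `halts`, build a program that defeats it.

def make_paradox(halts):
    """Return a program that does the opposite of whatever `halts` predicts."""
    def paradox(program_source: str) -> None:
        if halts(program_source, program_source):
            while True:       # `halts` said we stop, so loop forever
                pass
        return                # `halts` said we loop forever, so stop immediately
    return paradox

# Feeding the resulting `paradox` (a description of) itself contradicts
# whatever `halts` answers, so no correct `halts` can exist.
```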
But the poster you responded to didn't say it's magically transcendent; they just pointed out that there are many significantly hard problems that we don't have solutions for yet.
If an AI or AGI can look at a picture and see an apple, or (say) with an artificial nose smell an apple, or likewise feel or taste or hear* an apple, and at the same time identify that it is an apple and maybe even suggest baking an apple pie, then what else is there to be comprehended?
Maybe humans are just the same - far far ahead of the state of the tech, but still just the same really.
*when someone bites into it :-)
For me, what AI is missing is genuine out-of-the-box revolutionary thinking. They're trained on existing material, so perhaps it's fundamentally impossible for AIs to think up a breakthrough in any field - barring circumstances where all the component parts of a breakthrough already exist and the AI is the first to connect the dots ("standing on the shoulders of giants" etc).
It will confidently analyze and describe a chess position using advanced-sounding book techniques, but it's all fundamentally flawed, often missing things that are extremely obvious (like an undefended queen free to take) while trying to sound like a seasoned expert - that is, if it doesn't completely hallucinate moves that are not allowed by the rules of the game.
This is how it works in other fields I am able to analyse. It's very good at sounding like it knows what it's doing, speaking at the level of a master's student or higher, but its actual appraisal of problems is often wrong in a way very different from how humans make mistakes. Another great example is getting it to solve cryptic crosswords from back in the day. It often already knows the answer from its training set, but it hasn't seen anyone write out the reasoning for that answer, so if you ask it to explain, it makes nonsensical leaps (claiming birch rhymes with tyre, that level of nonsense).
Hanging a queen is not evidence of a lack of intelligence - even the very best human grandmasters will occasionally do that. But in pretty much every single video, the LLM loses the plot entirely after barely a couple dozen moves and starts to resurrect already-captured pieces, move pieces to squares they can't get to, etc - all while keeping the same confident "expert" tone.
At that point, the question of whether the model really does understand is pointless. We might as well argue about whether humans understand.
What I’m hearing here is that you are willing to get your surgery done by him and not by one of the real doctors - if he is capable of pronouncing enough doctor-sounding phrases.
This is just a thing to say that has no substantial meaning.
- What is "sufficiently" mean?
- What is functionally equivalent?
- and what is even understanding?
All just vague hand wavingWe're not philosophizing here, we're talking about practical results and clearly, in the current context, it does not deliver in that area.
> At that point, the question of whether the model really does understand is pointless.
You're right, it is pointless, because you are suggesting something that doesn't exist. And the current models cannot understand.
Except it clearly does, in a lot of areas. You can't take a 'practical results trump all' stance and come out of it saying LLMs understand nothing. They understand a lot of things just fine.
I was almost going to explicitly mention your point but deleted it because I thought people would be able to understand.
This is not a philosophy/theology debate, handwringing about "oh, but would a sufficiently powerful LLM be able to dance on the head of a pin". We're talking about a thing that actually exists, that you can actually test. In a whole lot of real-world scenarios you throw at it, it fails in strange and unpredictable ways. Ways that it will swear up and down it did not do. It will lie to your face. It's convincing. But then it will lose in chess, it will fuck up running a vending machine business, it will get lost coding and reinvent the same functions over and over, it will make completely nonsensical answers to crossword puzzles.
This is not an intelligence that is unlimited; it is a deeply flawed two-year-old that just so happens to have read the entire output of human writing. It's a fundamentally different mind to ours, and it makes different mistakes. It sounds convincing and yet fails, constantly. It will tell you a four-step explanation of how it's going to do something, then fail to execute four simple steps.
~2028
Not sure we need it. The counter example is the LLM itself. We had absolutely zero idea that the attention heads would bring such benefits down the road.
However, I'm flabbergasted by the lack of attention to so-called "hallucinations" (which is a misleading - I mean, marketing - term; we should be talking about errors or inaccuracies).
The problem is that we don't really know why LLMs work. I mean, you can run the inference and apply the formula and get output from the given input, but you can't "explain" why the LLM produced phrase A as output instead of B, C, or N. There are just too many parameters and computations to go through, and the very concept of "explaining" or "understanding" might not even apply here.
And if we can't understand how this thing works, we can't understand why it doesn't work properly (produces wrong output) and also don't know how to fix it.
And instead of talking about it and trying to find a solution, everybody has moved on to agents, which are basically LLMs empowered to perform complex actions IRL.
How does this make any sense to anybody? I feel like I'm crazy or missing something important.
I get it, a lot of people are making a lot of money and a lot of promises are being made. But this is an absolutely fundamental issue that is not that difficult to understand for anybody with a working brain, and yet I am really not seeing any attention paid to it whatsoever.
You can get use out of an LLM without understanding how every node works.
Imagine that occasionally, on contact with the nail, it shatters to bits, or goes through the nail as if it were liquid, or blows up, or does something else completely unexpected. Wouldn't you want to fix it? And sure, that might require a deep understanding of the nature of the materials and forces involved.
That's what I'd do.
We're also pretty good at working around human 'hallucinations' and other inaccuracies. Whether it be someone having a bad day, a brain fart, or individual clumsiness. eg in a (bad) organisation, sometimes we do it with layers of reviews and committees, much like layers of LLMs judging each other.
I think too much is attached to the notion of "we don't understand how the LLM works". We don't understand how any complicated intelligence works, and potentially won't for the foreseeable future.
More generally, a lot of society is built up from empirical understanding of black box systems. I'd claim the field of physics is a prime example. And we've built reliable systems from unreliable components (see the field of distributed systems).
You can damage a company by using a spreadsheet and not understanding how it works.
In your personal opinion, what are the things you should know before using an LLM?
LLMs generate text based on weights in a model, and some of it happens to be correct statements about the world. Doesn't mean the rest is generated incorrectly.
You're describing a lack of errors in verification (working as designed/built, equations correct).
GP is describing an error in validation (not doing what we want / require / expect).
They are a magical black box, a magic 8-ball that more likely than not gives you the right answer. Maybe people can explain the black box and make the magic 8-ball more accurate.
But at the end of the day, a very complex system will always be, at some level, a black-box, unreliable magic 8-ball.
So the question then is how you build a reliable system from unreliable components, because LLMs used directly are unreliable.
The answer to this is agents, i.e. feedback loops between multiple LLM calls, which in isolation are unreliable but in aggregate approach reliability.
At the end of the day, the bet on agents is a bet that the model companies will not get a model that is magically 100% correct on the first try.
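A minimal sketch of what "unreliable in isolation, reliable in aggregate" can look like. `ask_llm` is a simulated stand-in for a model call, and majority voting is just one aggregation strategy among many (verifiers, critic agents, retries against tests, ...):

```python
import random
from collections import Counter

def ask_llm(question: str) -> str:
    """Simulated unreliable component: right ~80% of the time."""
    return "right answer" if random.random() < 0.8 else "wrong answer"

def reliable_answer(question: str, samples: int = 5, min_agreement: int = 4):
    """Sample several independent attempts; accept only on strong agreement."""
    votes = Counter(ask_llm(question) for _ in range(samples))
    answer, count = votes.most_common(1)[0]
    return answer if count >= min_agreement else None  # abstain instead of guessing

# With an 80%-correct component, requiring 4-of-5 agreement pushes the error
# rate on accepted answers well below the single-call rate, at the cost of
# extra calls and occasional abstentions.
```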
When you have a complex system that does not always work correctly, you start disassembling it into simpler and simpler components until you find the one - or maybe several - that are not working as designed; you fix whatever you found wrong with them, put the complex system back together, test it to make sure your fix worked, and you're done. That's how I debug complex cloud-based/microservices-infected software systems, and that's how they test the software/hardware systems found in aircraft, rockets, and whatever else. That's such a fundamental principle to me.
If an LLM is a black box by definition and there's no way to make it consistently work correctly, what is it good for?
Many things are unpredictable in the real world. Most of the machines we make are built upon layers of redundancy to make imperfect systems stable and predictable. This is no different.
Right. It worked for social media monetization.
"... hallucinations ..."
The elephant in the room. Until that problem is solved, AI systems can't be trusted to do anything on their own. The solution the AI industry has settled on is to make hallucinations an externality, like pollution: they're fine as long as someone else pays for the mistakes.
LLMs have a similar problem to Level 2-3 self-driving cars. They sort of do the right thing, but a human has to be poised to quickly take over at all times. It took Waymo a decade to get over that hump and reach level 4, but they did it.
AI systems can be trusted to do most things on their own. You can't trust them for actions with irreversible consequences, but everything else is OK.
I can use them to write documents and code, create diagrams, designs, etc. I just need to verify the result, but that's 10% of the actual work. I would say that 90% of modern-day office work can be done with the help of AI.
However, you don't need either of these to completely decimate the job markets and by extension our societies.
Historically speaking, "good enough" and cheaper has always won over "better, but more expensive". I suspect LLMs will raise this question endlessly until significant portions of society are struggling - and who knows what will happen then.
Before LLMs started going anywhere, I thought that was going to be an issue for later generations, but at this point I suspect we'll witness it within the next 10 years.
I see a lot of talk that the first company to achieve AGI will also achieve market dominance, and all other players will crumble. But surely when someone achieves AGI, their competitors will in all likelihood be following closely behind. And once those achieve AGI, academia will follow.
Point is, at some point AGI itself will become available to everyone. The only things that will be out of reach for most are compute - and probably other expensive things on the infrastructure side.
Current AI funding seems to revolve around some sort of winner-take-all scenario. Just keep throwing incredible amounts of money at it, and hope that you've picked the winner. I'm just wondering what the outcome will be if this thesis turns out wrong.
That is the moat. That, and training data.
Even today, compute and data are the only things that matter. There is hardly any secret software sauce. This means that only large corporations with a practically infinite amount of resources to throw at the problem could potentially achieve AGI. Other corporations would soon follow, of course, but the landscape would be similar to what it is today.
This is all assuming that the current approaches can take us there, of which I'm highly skeptical. But if there's a breakthrough at some point, we would still see AI tightly controlled by large corporations that offer it as a (very expensive) service. Open source/weight alternatives would not be able to compete, just like they don't today. Inference would still require large amounts of compute only accessible to companies, at least for a few years. The technology would be truly accessible to everyone only once the required compute becomes a commodity, and we're far away from that.
If none of this comes to pass, I suspect there will be an industry-wide crash, and after a few years in the Trough of Disillusionment, the technology would re-emerge with practical applications that will benefit us in much more concrete and subtle ways. Oh, but it will ruin all our media and communication channels regardless, directly causing social unrest and political regression, that much is certain. (:
In a way, all the hype can only indicate that AGI is still a distant illusion. If it were really around the corner we'd be hearing different stories.
It's not going to be fun or easy, but as far as the financials go, we were there in 2001.
The question is, assuming we do get AGI, what the ramifications will be. Instead of hiring employees, a business could spin employees up (and down) the way a tech company spins up EC2 instances. Great for employers, terrible for employees.
That's a big "if" though.
This makes no sense to me at all. Is it a war metaphor? A race? Why is there no reason to jump ship? Doesn't it make sense to try to get on the fastest ship? Doesn't it make sense to diversify your stock portfolio if you have doubts?
Feel free to challenge these numbers, but it's a starting place. What's not accounted for is the cost of training (compute time, but also employees and everything else), which needs to be amortized over the length of time a model is used, so ChatGPT's costs rise significantly - but they do have the advantage that hardware is shared across multiple users.
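For what it's worth, a minimal sketch of that amortization; every number below is a made-up placeholder (not a real OpenAI figure), just to show how a one-off training cost folds into a per-user daily cost:

```python
# Hypothetical placeholders only - none of these are real figures.
training_cost_usd = 100_000_000          # one-off cost of training the model
model_lifetime_days = 365                # how long the model serves traffic
daily_active_users = 500_000_000
inference_cost_per_user_per_day = 0.01   # compute to serve one user's daily traffic

amortized_training = training_cost_usd / (model_lifetime_days * daily_active_users)
total_per_user_per_day = inference_cost_per_user_per_day + amortized_training

print(f"amortized training: ${amortized_training:.6f}/user/day")   # ~$0.000548
print(f"total:              ${total_per_user_per_day:.6f}/user/day")
```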
The numbers would bankrupt them within weeks.
It's surprising to me the number of people I consider smart and deep original thinkers who are now parroting lines and ideas (almost word-for-word) from folks like Andrej Karpathy and Sam Altman, etc.
But, of course, "Show me the incentive and I will show you the outcome" never stops being relevant.