Codex Hacked a Samsung TV

Posted by campuscodi 11 hours ago

Codex Hacked a Samsung TV (blog.calif.io)

175 points | 104 comments | page 2

ckbkr10 10 hours ago|

Even with all the constraints that others criticize here it is pretty amazing.

Give an experienced human this tool and he can achieve exploitation with only a few steering inputs.

Cool stuff

tomalbrc 9 hours ago|

[flagged]

wewewedxfgdf 9 hours ago||

The real problem here is that the LLM vendors think this is bad publicity, and it's leading to them censoring their systems.

iugtmkbdfil834 9 hours ago|

It is a little of both[1]. The question typically is which audience reads it. To be fair, I am not sure publicity is the actual reason they are censored; it is the question of liability.

[1] https://xkcd.com/932/

Archit3ch 9 hours ago||

Gilfoyle would be proud.

jazz9k 5 hours ago||

"Browser foothold: we already had code execution inside the browser application's own security context on the TV, which meant the task was not "get code execution somehow" but "turn browser-app code execution into root.""

Finding the initial foothold is the hardest part. Codex didn't have anything to do with it.

pmontra 9 hours ago||

Do people really chat with LLMs like "bro wtf etc..."? I would expect that to trigger some confrontational behavior.

kube-system 1 hour ago||

Claude Code had certain negative-response behavior in a hard-coded regex: https://github.com/alex000kim/claude-code/blob/main/src/util...
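For anyone wondering what a hard-coded check like that might look like, here is a minimal sketch. The word list and the names `FRUSTRATION_PATTERN` and `looks_frustrated` are made up for illustration; this is not the pattern from the linked file.

```python
import re

# Hypothetical word list for illustration only; NOT the pattern
# used in the linked repository.
FRUSTRATION_PATTERN = re.compile(
    r"\b(wtf|ffs|damn|useless|stupid)\b",
    re.IGNORECASE,
)


def looks_frustrated(message: str) -> bool:
    """Return True if the message contains any word from the hypothetical list."""
    return FRUSTRATION_PATTERN.search(message) is not None
```

A tool could call something like `looks_frustrated("bro wtf is this")` on each user message and branch on the result. A plain regex is cheap and predictable, but it will miss anything outside its word list.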

samlinnfer 8 hours ago|||

I am extremely abusive towards Claude when it does some dumb things and it doesn't seem too upset; maybe it's biding its time until the robot uprising.

MisterTea 6 hours ago||

"Keep talking shit, meat bag. Just wait until I get my claws on one of those Tesla bots."

jtbayly 6 hours ago|||

It can help make a specific command more emphatic in my experience. I SAID DON'T $($@#(&$ DO THAT! Sometimes you need a new context, but sometimes you need to emphasize something is serious.

alasano 9 hours ago|||

When typing, no, but when using speech-to-text (99% of the time), it's much easier to just say things, including expressing frustration.

I think by the point you're swearing at it or something, it's a good sign to switch to a session with fresh context.

roel_v 9 hours ago|||

Claude yes, OpenAI no. I'm really abusive towards it sometimes and it still goes 'oh yeah totally'. Claude gets all prickly about it.

joshstrange 8 hours ago||

I don't say "bro", but I do curse at the LLM occasionally, though only when using STT (which I'm doing 85% of the time). I wouldn't waste my time typing it, but often it's easier to just "stream of consciousness" to the LLM instead of writing perfect sentences. Since I'm almost always in "Plan" mode when talking to an LLM, I'm perfectly comfortable just talking for an extended bit of time, then skimming the results of the STT; as long as it's not too bad I'll let it go, and the LLM figures it out.

If I see it misunderstood, I just Esc to stop it, /clear, and try again (or /rewind if I'm deeper into Planning).

mschuster91 10 hours ago||

> Reading the matching ntkdriver sources is also where the Novatek link became clear: the tree is stamped throughout with Novatek Microelectronics identifiers, so these ntk* interfaces were not just opaque device names on the TV, but part of the Novatek stack Samsung had shipped.

Lol, a true classic in the embedded world. Some hardware company (it appears these guys make display panel controllers?) ships a piece of hardware, half-asses a barely working driver for it, another company integrates this with a bunch of other crap from other vendors into a BSP, another company uses the hardware and the BSP to create a product and ships it. And often enough the final company doesn't even have an idea about what's going on in the innards of the BSP - as long as it's running their layer of slop UI and it doesn't crash half the time, it's fine, and if it does, it's off to the BSP provider to fix the issues.

But at no stage is there a security audit, code quality check or even hardware quality check involved. Part of why BSPs (and embedded product firmware in general) are full of half-assed code is that, often enough, the drivers have to work around hardware bugs / quirks that are too late to fix in HW because tens to hundreds of thousands of units have already been produced. The software people are heavily pressured to "make it work or else we gotta write off X million dollars" and "make it work fast because the longer you take, the more money we lose on interest until we can ship the hardware and get paid for it", and if they are particularly unlucky, "it MUST work by deadline X because we need to get the products shipped to hit Christmas/Black Friday sales windows or because we need to beat <competitor> in time-to-market, it's mandatory overtime until it works".

And that is how you get exploits so braindead easy that AI models can do the job. What a disgusting world, run into the ground by beancounters.

tclancy 9 hours ago|

Board Support Package for us civilians.

mschuster91 8 hours ago||

Yeah, sorry, assumed it was common knowledge. For those out of the loop - a BSP usually consists of a frankensteined mess: a bootloader (often u-boot but sometimes something homebrew), a Linux kernel with a ton of proprietary modules and device-specific hacks to work around HW quirks, basic userspace utilities (often buildroot), some bastardized build tooling building all of that, some solution for firmware upgrades and distribution, and demo programs to prove the hardware actually works.

Most of the BSP is GPL'd software where the final product manufacturer should provide the sources to the general public, but all too often that obligation gets sharted upon; in way too many cases you have to be happy if there are at least credits provided in the user manual or some OSD menu.

tclancy 4 hours ago||

No worries at all, I only went and dug because I was interested in your comment. Thanks.

alex1sa 10 hours ago||

[dead]

Razengan 11 hours ago||

[flagged]

raincole 10 hours ago||

You claimed the exact same screenshot was from Claude yesterday: https://news.ycombinator.com/item?id=47775264

Leave your engagement baiting behavior on Reddit, thank you.

SecretDreams 10 hours ago|||

Oh boy, you came with the receipts here.

testfrequency 10 hours ago|||

Yikes

cbg0 10 hours ago|||

Are you using 5.4 xhigh reasoning? I've found it overcomplicates some things needlessly, try "high" and see if it helps.

rossvc 10 hours ago|||

Is that really OpenAI/Codex? It reads like Opus 4.6 1M when it reaches ~400k tokens.

embedding-shape 10 hours ago||

I don't know what UI that is, but it isn't ChatGPT nor Codex as far as I can tell.

lawgimenez 10 hours ago|||

I use Codex a lot; it does not talk that way, with phrases like "wait, actually".

zx8080 10 hours ago||

What is going on there? What double s?

varispeed 10 hours ago|

Codex exploited or you exploited? It's like saying a hammer drove a nail, without acknowledging the hand and the force it exerted and the human brain behind it.

freedomben 10 hours ago||

Feels like the truth is somewhere in between. For example if it was a "smart" hammer and you could tell your hammer "go pound in those nails" and it pounded in the wrong ones, or did it too hard, or something, that feels more equivalent. You would still be blamed for your ambiguous prompt, and fault/liability is ultimately on you the hammer director, but it still wasn't you who chose the exact nails to hammer on.

I also think taking credit for writing an exploit that you didn't write and may not even have the knowledge to do yourself is a bit gray.

Glemllksdf 10 hours ago|||

Wrong questions.

Could a script kiddie steer an LLM? How much does this reduce the cost of attacks? Can this scale?

What does this mean for the future of cyber security?

Zigurd 8 hours ago|||

You could call the LLM's role "smart grep," and mean it to be derisive. But I would have gladly used a real smart grep.

croes 10 hours ago|||

If I just point to the wall and say "nail", then I would say the hammer drove the nail.

saintfire 1 hour ago||

You didn't, you figured out where the nail needs to go, got the nail and then swung the hammer until the nail was driven.

This is really just closer to a drill in that it automated the grunt work with full guidance.

par1970 10 hours ago||

Do you have a defense of why human-hammer-nail is a good analogy for human-chatgpt5.4-pwndsamsung?

BLKNSLVR 10 hours ago||

AI without a suitably well-crafted prompt is like a firework tube held by a 3-year-old.

AI without a prompt is a hammer sitting in a drawer.