Posted by dot_treo 15 hours ago
I was just setting up a new project, and things behaved weirdly. My laptop ran out of RAM, it looked like a forkbomb was running.
I've investigated, and found that a base64 encoded blob has been added to proxy_server.py.
It writes and decodes another file which it then runs.
I'm in the process of reporting this upstream, but wanted to give everyone here a headsup.
It is also reported in this issue: https://github.com/BerriAI/litellm/issues/24512
This threat actor seems to be very quickly capitalising on stolen credentials, wouldn’t be surprised if they’re leveraging LLMs to do the bulk of the work.
Do the labs label code versions with an associated CVE to label them as compromised (telling the model what NOT to do)? Do they do adversarial RL environments to teach what's good/bad? I'm very curious since it's inevitable some pwned code ends up as training data no matter what.
I assume most labs don't do anything to deal with this, and just hope that it gets trained out because better code should be better rewarded in theory?
Run all your new dependencies through static analysis and don't install the latest versions.
I implemented static analysis for Python that detects close to 90% of such injections.
1. pin dependencies with sha signatures 2. mirror your dependencies 3. only update when truly necessary 4. at first, run everything in a sandbox.
Since they all seem positive, it doesn't seem like an attack but I thought the general etiquette for github issues was to use the emoji reactions to show support so the comment thread only contains substantive comments.
> It also seems that attacker is trying to stifle the discussion by spamming this with hundreds of comments. I recommend talking on hackernews if that might be the case.
Configure the CI to make a release with the artefacts attached. Then have an entirely private repo that can't be triggered automatically as the publisher. The publisher repo fetches the artefacts and does the pypi/npm/whatever release.
https://docs.npmjs.com/generating-provenance-statements
https://packaging.python.org/en/latest/specifications/index-...
some will even audit each package in there (kind crap job but it works fairly well as mitigation)
LiteLLM wouldn't be my top choice, because it installs a lot of extra stuff. https://news.ycombinator.com/item?id=43646438 But it's quite popular.
or pyproject.toml (not possible to filter based on absence of a uv.lock, but at a glance it's missing from many of these): https://github.com/search?q=path%3A*%2Fpyproject.toml+%22%5C...
or setup.py: https://github.com/search?q=path%3A*%2Fsetup.py+%22%5C%22lit...
> ### Software Supply Chain is a Pain in the A*
> On top of that, the room for vulnerabilities and supply chain attacks has increased dramatically
AI Is not about fancy models, is about plain old Software Engineering. I strongly advised our team of "not-so-senior" devs to not use LiteLLM or LangChain or anything like that and just stick to `requests.post('...')".
[0] https://sb.thoughts.ar/posts/2025/12/03/ai-is-all-about-soft...