Posted by dot_treo 1 day ago
I was just setting up a new project, and things behaved weirdly: my laptop ran out of RAM, and it looked like a fork bomb was running.
I investigated and found that a base64-encoded blob had been added to proxy_server.py.
It decodes and writes out another file, which it then runs.
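For context, the pattern described is roughly the classic two-stage loader. Here's a benign sketch of the mechanics (not the actual payload; the blob content and filename here are made up):

```python
import base64
import subprocess
import sys
import tempfile

# Benign stand-in for the obfuscated blob; the real one decoded to malware.
blob = base64.b64encode(b"print('second-stage payload ran')")

# Stage one: decode the blob and write it out as a second .py file...
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as f:
    f.write(base64.b64decode(blob))

# ...then execute that file as a fresh Python process.
result = subprocess.run([sys.executable, f.name],
                        capture_output=True, text=True, check=True)
```

The point is that the code you see in the repo is just an opaque string; the actual behavior only exists after decoding at runtime.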
I'm in the process of reporting this upstream, but wanted to give everyone here a heads-up.
It is also reported in this issue: https://github.com/BerriAI/litellm/issues/24512
The Python ecosystem provides too many nooks and crannies for malware to hide in.
I hope everyone's course of action will be to uninstall this package permanently, and to avoid installing packages like it.
To reduce supply chain risk, you need to evaluate not only the vendor (even if it's gratis and open source) but also the advantage the dependency actually provides.
Exposing yourself to supply chain risk for an HTTP server dependency is natural. But taking that risk for is-odd, or whatever this is, is not worth it.
Remember that you are programmers and you can just program. You don't need a framework; you are already using an LLM provider's API. Don't put a hat on a hat; don't get killed for nothing.
And even if you weren't using this specific dependency, check your deps; you might have shit like this in your requirements.txt and were merely saved by chance.
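A crude first pass at "checking your deps" can be automated. This is just a heuristic sketch, not a real scanner: it flags long base64-looking string literals, which will miss cleverer obfuscation and will flag some legitimate data blobs, but it's a starting point for a manual audit:

```python
import re

# Heuristic: a contiguous run of 200+ base64-alphabet characters inside
# a quoted string literal is unusual in hand-written source code.
B64_LITERAL = re.compile(r"""["']([A-Za-z0-9+/]{200,}={0,2})["']""")

def suspicious_blobs(source: str) -> list[str]:
    """Return long base64-ish string literals found in Python source text."""
    return B64_LITERAL.findall(source)
```

You'd run this over every .py file under your site-packages directory and eyeball whatever it finds.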
An additional note: the dev will probably post a post-mortem, what was learned, how it was fixed, and maybe downplay the thing. Ignore that; the only reasonable step after this is closing the repo, but there's no incentive to do that.
Programming against different LLM APIs is a hassle; this library made it easy by exposing one single API you call, and behind the scenes it handled all the different API calls needed for the different LLM providers.
That's what they pay us for
I'd get it if it were a hassle that could be avoided, but it feels like you are trying to avoid the very work you are being paid for, like a McDonald's employee trying to pay a kid with Happy Meal toys to work the burger stand.
Another red flag, although a bit more arguable: by 'abstracting' the API into a more generic one, you achieve vendor neutrality, yes, but you also integrate much more loosely with your vendors and possibly lose unique features (or can only access them with even more 'hassle' via custom options). Strategically, your end product veers into commodity territory, which is not a place you usually want to be.
This is like a couple hours of work even without vibe coding tools.
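To illustrate the "couple hours of work" claim: the core of such a wrapper really is small, since most providers expose an OpenAI-compatible /chat/completions endpoint. A minimal sketch (the second base URL is a placeholder, and real code would add error handling, retries, and streaming):

```python
import json
import urllib.request

# Base URLs per provider; "other" is a hypothetical placeholder --
# check each vendor's docs for its actual OpenAI-compatible endpoint.
PROVIDERS = {
    "openai": "https://api.openai.com/v1",
    "other": "https://api.example.com/v1",
}

def build_request(provider: str, api_key: str, model: str,
                  messages: list[dict]) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for any provider."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{PROVIDERS[provider]}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is one `urllib.request.urlopen(req)` call; the point is that the provider-switching part is a dictionary lookup, not a framework.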
There's more to it than that (even if most other platforms also provide an OpenAI-compatible API, which may or may not expose either all the features of the platform or all the features of the OpenAI API), and the differences are not cosmetic. But since LiteLLM itself just presents an OpenAI-compatible API, it can't be providing access to other vendors' features that don't map cleanly to that API, and I don't think it's likely to be using each vendor's native API and thereby offering a more complete OpenAI-compatible implementation, even of the features that map naturally, than the first-party OpenAI-compatibility APIs.
Back in the days of GPT-3.5, LiteLLM was one of the projects that provided a reliable adapter for communicating across AI labs' APIs. When things drifted ever so slightly despite being an "OpenAI-compatible API", LiteLLM made it much easier for developers to absorb those nuances rather than reinventing and debugging them.
Nowadays, that gateway of theirs isn't just a funnel for centralizing API calls; it also serves other purposes, like applying guardrails consistently across all connections, tracking per-key token spend, dispensing keys without having to do so on the main platforms, etc.
There's also more to LiteLLM than being an inference gateway: it's also a package used by other projects. If you had a project that needed to support multiple endpoints as fallback, there's a chance LiteLLM is powering that.
Hence, supply chain attack. The GitHub issue is full of mentions from other projects, because they're being urged to pin to safe versions since they depend on it.
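For anyone doing that pinning: pinning an exact version plus a hash means pip refuses anything that doesn't match byte-for-byte, which also blocks a compromised re-upload of the same version. A sketch of the requirements.txt format (the version and hash below are placeholders; check the advisory for which releases are actually safe):

```text
# requirements.txt -- install with: pip install --require-hashes -r requirements.txt
litellm==<known-safe-version> \
    --hash=sha256:<expected-sha256-of-the-wheel>
```

With `--require-hashes`, every dependency in the file must carry a hash, so a single pinned package can't be undermined by an unpinned transitive one.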
It's pretty disappointing that safetensors has existed for multiple years now, but people are still distributing .pth files. Yes, it requires more code to handle loading and saving models, but you'd think it would be worth it to avoid situations like this.
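For anyone unfamiliar with why .pth files are risky: they are pickle-based, and unpickling can execute arbitrary code. A minimal, benign demonstration using only the stdlib (the class name is made up; a malicious file would call something far worse than print):

```python
import pickle

class Payload:
    # __reduce__ tells pickle how to reconstruct the object. A malicious
    # file can return any callable here, so merely LOADING the file runs code.
    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # the print fires during load; no one "ran" anything
```

safetensors avoids this by being a plain data format with no embedded code paths, which is the whole point of the comment above.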
Hundreds of downvoted comments like "Worked like a charm, much appreciated.", "Thanks, that helped!", and "Great explanation, thanks for sharing."