Posted by jrandolf 20 hours ago
The LLMs are completely private (we don't log any traffic).
The API is OpenAI-compatible (we run vLLM), so you just swap the base URL. Currently offering a few models.
I personally would like something like this but with "regular" GPU access. Some people still use them for something other than LLMs ^^.
I recall hearing about them years ago.
Good to see they're thriving!
I can sign up for a cohort today, but there's not even a hint of how long it will take the cohort to fill up. The most subscribed cohort is only at 42% (and dropping), so maybe days to weeks? That's a long time to wait if you have a use case to satisfy.
And then the cohort expires, and I have to sign up for another one and play the waiting game again? Nobody wants that level of unreliability.
Also, don't say "15-25 tok/s". That is a min-max figure, but your FAQ says that this is actually a maximum. It makes no sense to measure a maximum as a range, and you state no minimum so I can only assume that it is 0 tok/s. If all users in the cohort use it simultaneously, the best they're getting is something like 1.5 tok/s (probably less), which is abyssmal.
You mention "optimization", but I have no idea what that means. It certainly doesn't mean imposing token limits, because your FAQ says that won't happen. If more than 25 users are using the cohort simultaneously, it is a physical impossibility to improve performance to the levels you advertise without sacrificing something else, like switching to a smaller model, which would essentially be fraud, or adding more GPUs which will bankrupt you at these margins. With 465 users per cohort, a large chunk of whom will be using tools like OpenClaw, nobody will ever see the performance you are offering.
The issue here is you are trying to offer affordable AI GPU nodes without operating at a loss. The entire AI industry is operating at a loss right now because of how expensive this all is. This strategy literally won't work right now unless you start courting VCs to invest tens to hundreds of millions of dollars so you can get this off the ground by operating at a loss until hopefully you turn a profit at some point in the future, but at that point developers will probably be able to run these models at home without your help.
For filling up the cohorts, I agree and we're launching for a week to gather feedback.
Split a "it needs to run in a datacenter because its hardware requirements are so large" AI/LLM across multiple people who each want shared access to that particular model.
Sort of like the Real Estate equivalent of subletting, or splitting a larger space into smaller spaces and subletting each one...
Or, like the Web Host equivalent of splitting a single server into multiple virtual machines for shared hosting by multiple other parties, or what-have-you...
I could definitely see marketplaces similar to this, popping up in the future!
It seems like it should make AI cheaper for everyone... that is, "democratize AI"... in a "more/better/faster/cheaper" way than AI has been democratized to date...
Anyway, it's a brilliant idea!
Wishing you a lot of luck with this endeavor!