Posted by Torq_boi 2 days ago
I find this to be applicable on a smaller scale too! I'd rather set up and debug a beefy Linux VPS via SSH than fiddle with various proprietary cloud APIs/interfaces. It doesn't go as low-level as Watts, bits and FLOPs, but I still consider knowledge about Linux more valuable than knowing which Azure knobs to turn.
If your business relies on compute, and you run that compute in the cloud, you are putting a lot of trust in your cloud provider. Cloud companies generally make onboarding very easy, and offboarding very difficult. If you are not vigilant you will sleepwalk into a situation of high cloud costs and no way out. If you want to control your own destiny, you must run your own compute.
This is not a valid reason for running your own datacenter, or running your own server.
Self-reliance is great, but there are other benefits to running your own compute. It inspires good engineering. Maintaining a data center is much more about solving real-world challenges. The cloud requires expertise in company-specific APIs and billing systems. A data center requires knowledge of Watts, bits, and FLOPs. I know which one I rather think about.
This is not a valid reason for running your own datacenter, or running your own server.
Avoiding the cloud for ML also creates better incentives for engineers. Engineers generally want to improve things. In ML many problems go away by just using more compute. In the cloud that means improvements are just a budget increase away. This locks you into inefficient and expensive solutions. Instead, when all you have available is your current compute, the quickest improvements are usually speeding up your code, or fixing fundamental issues.
This is not a valid reason for owning a datacenter, or running your own server.
Finally there’s cost, owning a data center can be far cheaper than renting in the cloud. Especially if your compute or storage needs are fairly consistent, which tends to be true if you are in the business of training or running models. In comma’s case I estimate we’ve spent ~5M on our data center, and we would have spent 25M+ had we done the same things in the cloud.
This is one of only two valid reasons for owning a datacenter, and one of several valid reasons for running your own server.
The only two valid reasons to build/operate a datacenter: 1) what you're doing is so costly that building your own factory is the only profitable way for your business to produce its widgets, 2) you can't find a datacenter with the location or capacity you need and there is no other way to serve your business needs.
There are many valid reasons to run your own servers (colo), although most people will not run into them in a business setting.
For most businesses, it’s a false economy. Hardware is cheap, but having proper redundancy and multiple sites isn’t. Having a 24/7 team available to respond to issues isn’t.
What happens if their data centre loses power? What if it burns down?
This seems to imply $40/month for 2 vCPUs, which seems very high?
Or maybe they mean "used" CPU versus idle?
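A rough sanity check of that figure, as a sketch only: the 730 hours per month is an assumed average (8760 h / 12), `per_vcpu_hour` is a hypothetical helper, and the 50% utilization is an invented illustration of the "used versus idle" reading, not a number from the article.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8760 h / 12


def per_vcpu_hour(monthly_usd: float, vcpus: int, utilization: float = 1.0) -> float:
    """Effective USD per vCPU-hour that is actually used.

    utilization is the fraction of provisioned vCPU time doing real work;
    1.0 means the price is spread over every provisioned hour.
    """
    if vcpus <= 0 or not 0 < utilization <= 1:
        raise ValueError("vcpus must be positive and utilization in (0, 1]")
    return monthly_usd / (vcpus * HOURS_PER_MONTH * utilization)


# $40/month for 2 vCPUs, counting every provisioned hour.
provisioned = per_vcpu_hour(40, 2)

# Same bill, but counting only hours in use (hypothetical 50% utilization).
used_only = per_vcpu_hour(40, 2, utilization=0.5)

print(f"per provisioned vCPU-hour: ${provisioned:.4f}")
print(f"per used vCPU-hour at 50%: ${used_only:.4f}")
```

If the figure is per "used" vCPU, the effective price per provisioned vCPU drops in proportion to utilization, which would make the number look less extreme.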
It would probably even make sense for some companies to still use the cloud for their API but do the training on-prem, as that may be the expensive part.