Posted by hajtom 10 hours ago
For example, we were running a 20-node k8s cluster for our Cortex (distributed Prometheus) install, monitoring about 30k servers around the world, and it was generating a bit over a TB of data a day. It was a lot more cost-effective and performant to create a MinIO cluster for that data than to use S3.
Also, you can get durability with MinIO through multi-cluster replication.
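To put the figures above in perspective, here is a rough back-of-the-envelope calculation. It assumes exactly 1 TB/day (decimal) and exactly 30,000 servers. The comment says "a bit over" and "about", so treat the results as lower bounds rather than measured values.

```shell
# Assumptions (not from the original comment): 1 TB = 10^12 bytes, exactly 30,000 servers.
bytes_per_day=1000000000000
servers=30000
seconds_per_day=86400

# Integer division, so results are rounded down.
per_server_mb=$(( bytes_per_day / servers / 1000000 ))        # MB per server per day
ingest_mb_s=$(( bytes_per_day / seconds_per_day / 1000000 ))  # average MB/s across the cluster

echo "per server: ~${per_server_mb} MB/day"
echo "cluster-wide average ingest: ~${ingest_mb_s} MB/s"
```

So the average write rate is modest. At that scale the cost pressure presumably comes more from accumulated volume and request counts than from raw throughput.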
Probably yes.
Need to start reconsidering the approach now and looking for alternatives.
Anyone have any suggestions?
https://garagehq.deuxfleurs.fr/
Edit: jeez, three of us all at once...
rclone serve s3 path/to/buckets --addr :9000 --auth-key <key-id>,<secret>
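Once that server is up, any S3 client can be pointed at it. A minimal sketch using the AWS CLI, assuming it is installed and the server runs on the same machine. The key values and the `s3local` wrapper name are placeholders of mine, and the region is arbitrary (it is set only so the client can sign requests).

```shell
# Hypothetical values: use the same <key-id>,<secret> pair passed to --auth-key.
export AWS_ACCESS_KEY_ID="my-key-id"
export AWS_SECRET_ACCESS_KEY="my-secret"
export AWS_DEFAULT_REGION="us-east-1"   # arbitrary; only used for request signing
S3_ENDPOINT="http://localhost:9000"     # matches --addr :9000 on the same host

# Thin wrapper so every call goes to the local endpoint instead of AWS.
s3local() {
    aws --endpoint-url "$S3_ENDPOINT" s3 "$@"
}

# Usage (needs the rclone server above to be running):
#   s3local ls                                   # list buckets
#   s3local cp ./file.txt s3://mybucket/file.txt # upload a file
```

As far as I know, buckets show up as the top-level directories under path/to/buckets, so `mybucket` here would have to exist as a directory there.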
`Be wary that an OSD, whether based on a physical device or a file, is resource intensive.`
Can anyone quantify "resource intensive" here? Is it "takes an entire Raspberry Pi to run the minimum set" or is it "takes 4 cores per OSD"?
Edit: This is the specific doc page https://canonical-microceph.readthedocs-hosted.com/stable/ho...
MinIO was also well suited to some smaller use cases (e.g. running a partially S3-compatible store for integration tests). Ceph isn't really good for that.
But if you ran large MinIO clusters in production, Ceph might be a very good alternative.
I haven't tried it though. Seems simple enough to run.
I'm forced to use MinIO for certain products for now, but will eventually move to something better. Garage is high on my list of alternatives.
SeaweedFS is more mature and has many interfaces (S3, WebDAV, SFTP, REST, FUSE mount). It's most appropriate for storing lots of small files.
I prefer the command line interface and data/synchronization model of Garage, though. It's easier to manage, probably because the developers aren't biting off more than they can chew.
Like in the old MinIO days, an S3 object is a file on the filesystem, not some replicated blocks. You could always rebuild the full object store content with a few rsync runs. I appreciate the simplicity.
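A toy illustration of why that model is easy to reason about. The layout here (`<root>/<bucket>/<key>`, one plain file per object) is an assumption for the sketch, not the documented on-disk format of any particular server. It uses `cp -a` between two temp directories so it runs anywhere. Between real hosts, rsync would do the same job.

```shell
# Assumed layout (hypothetical): <root>/<bucket>/<key>, one plain file per object.
src_root=$(mktemp -d)
dst_root=$(mktemp -d)

# "Upload" an object by writing a file; key prefixes become directories.
mkdir -p "$src_root/photos/2024"
printf 'hello\n' > "$src_root/photos/2024/cat.txt"

# Rebuild the store elsewhere with a plain recursive copy.
# Between machines this would be something like:
#   rsync -a "$src_root/" otherhost:/new/root/
cp -a "$src_root/." "$dst_root/"

restored=$(cat "$dst_root/photos/2024/cat.txt")
echo "restored object body: $restored"
```

No index or block map needs to be reconstructed: the directory tree is the store.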
My main concern was that you couldn't configure it easily through files; you had to use the CLI, which wasn't very convenient. I hope this has changed.
Configuration is still through the CLI, though it's fairly simple. If your use case is similar to the way that the Deuxfleurs organization uses it -- several heterogeneous, geographically distributed nodes that are more or less set-it-and-forget-it -- then it's probably a good fit.
My use case is relatively common: I want small S3-compatible object stores that can be deployed in Kubernetes without manual intervention. The CLI part was a bit in the way last time; it could have been automated, but it wasn't straightforward.