Comment by PhilippGille

11 days ago

They point to AIStor as an alternative.

Other alternatives:

https://github.com/deuxfleurs-org/garage

https://github.com/rustfs/rustfs

https://github.com/seaweedfs/seaweedfs

https://github.com/supabase/storage

https://github.com/scality/cloudserver

https://github.com/ceph/ceph

Among others

I'm the author of another option (https://github.com/mickael-kerjean/filestash) which has an S3 gateway that exposes itself as an S3 server but is just a proxy that forwards your S3 calls onto anything else, like SFTP, local FS, FTP, NFS, SMB, IPFS, SharePoint, Azure, a git repo, Dropbox, Google Drive, another S3, ... It's entirely stateless and acts as a proxy translating S3 calls onto whatever you have connected on the other end

  • Is this some dark pattern or what?

    https://imgur.com/a/WN2Mr1z (UK: https://files.catbox.moe/m0lxbr.png)

    I clicked settings, this appeared; clicking away hid it, but now I can't see any setting for it.

    The nasty way of reading that popup, and my first way of reading it, was that Filestash always sends crash reports and usage data, and that I only have the option of not sharing them with third parties: the data is always sent, and it defaults to being shared with third parties. Clicking OK always consents to sending crash reports and usage data.

    I'm not sure if it's actually operating that way, but if it's not, the language should probably be something like

        Help make this software better by sending crash reports and anonymous usage statistics.
    
        Your data is never shared with a third party.
    
        [ ] Send crash reports & anonymous usage data.
        
    
        [ OK ]

  • Hi, I really like how the software is set up, but there is almost no documentation, which makes it hard to set up. I want to run it with the local storage plugin, but it keeps using / as the storage location, and I want it to use /data, which I volume-mount. Maybe think about improving the documentation :)

  • I was looking at running [versitygw](https://github.com/versity/versitygw) but filestash looks pretty sweet! Any chance you're familiar with Versity and how the S3 proxy may differ?

    • I did a project with Monash University, who were using Versity to handle multi-tier storage on their 12PB cluster, with glacier-like capabilities on tape storage (a robot picking data off their tape backup) and a hot storage tier for better access performance, lifecycle rules to move data from hot storage to cold, etc. The underlying storage was all Versity, and they had Filestash working on top; we did some custom plugins so data could be recalled from their self-hosted glacier while being used through the frontend, giving their users a Dropbox-like experience. Depending on what you want to do, they can be very much complementary.

      1 reply →

  • Didn't know about filestash yet. Kudos, this framework seems to be really well implemented; I really like the plugin- and interface-based architecture.

From my experience:

rustfs has promise and supports a lot of features; it even lets you bring your own secret/access keys (if you want to migrate without changing creds on clients). But it's very much still in development, and they have already prepared for a bait-and-switch in the code ( https://github.com/rustfs/rustfs/blob/main/rustfs/src/licens... )

Ceph is the closest to the actual S3 feature set, but it's a lot to set up. It pretty much wants a few local servers; you can replicate to another site, but each site on its own is pretty latency-sensitive between storage servers. It also offers many other features besides, as S3 is just built on top of their object store, which can also be used for VM storage or even a FUSE-compatible FS.

Garage is great, but it is very much "just to store stuff": it lacks features on both the S3 side (S3 has a bunch of advanced ACLs that many of the alternatives don't support, plus stuff for HTTP headers) and the management side (stuff like "allow this access key to access only a certain path in the bucket" is impossible, for example). On the other hand, the clustering feature is very WAN-aware, unlike Ceph, where you pretty much have to have all your storage servers in the same rack if you want a single site to have replication.

  • Not sure what you mean about Ceph wanting to be in a single rack.

    I run Ceph at work. We have some clusters spanning 20 racks in a network fabric that has over 100 racks.

    In a typical Leaf-Spine network architecture you can easily have sub-100-microsecond network latency, which translates to sub-millisecond Ceph latencies.

    We have one site that is Leaf-Spine-SuperSpine, and the difference in network latency is barely measurable between machines in the same network pod and between different network pods.

  • I don't think this is a problem. The CLA is there to avoid future legal disputes. It prevents contributors from initiating IP lawsuits later on, which could cause significantly more trouble for the project.

    • Hypothetically, isn't one of the "legal disputes" that's avoided the case where the project relicenses to a closed-source model without compensating contributors, and the contributors can't sue because the copyright of their contributions no longer belongs to them?

      1 reply →

    • Contributors cannot initiate IP lawsuits if they've contributed under Apache 2.0. CLAs are to avoid legal disputes when changing the license to commercial closed source.

Would be cool to understand the tradeoffs of the various object storage implementations.

I'm using seaweedfs for single-machine S3-compatible storage, and it works great, though I'm missing out on a lot of administrative nice-to-haves (like easy access controls and a good understanding of capacity vs. usage, error rates, and so on... this could be a PEBKAC issue though).

Ceph I have also used, and it seems to care a lot more about being distributed: if you have fewer than 4 hosts for storage, it feels like it scoffs at you during setup. I was also unable to get it to perform amazingly, though to be fair I was doing it via K8s/Rook atop the Flannel CNI, which is an easy-to-use CNI for toy deployments, not performance-critical systems, so that could be my bad. I would trust a Ceph deployment with data integrity though; it just gives me that feeling of "whoever worked on this really understood distributed systems"... but I can't put that feeling into any concrete data.

Apart from Minio, we tried Garage and Ceph. I think there's definitely a need for something that interfaces using S3 API but is just a simple file system underneath, for local, testing and small scale deployments. Not sure that exists? Of course a lot of stuff is being bolted onto S3 and it's not as simple as it initially claimed to be.

  • Minio started like that, but they migrated away from it. It's just hard to keep that up once you start implementing advanced S3 features (versioning/legal hold, metadata, etc.) and storage features (replication/erasure coding)

    • There's plenty of middle ground between "just expose underlying FS as objects, can't support many S3 operations well" and "it's a single node not a distributed storage system".

      For my personal use, asynchronous remote backup is plenty for durability, a Bee-Link ME Mini with 6x4TB NVMe is plenty of space for any bucket I care to have, and I'd love to have an S3 server that doesn't even attempt to be distributed.

  • > for local, testing and small scale deployments

    Yes I'm looking for exactly that and unfortunately haven't found a solution.

    Tried garage, but they require running a proxy for CORS, which makes signed browser uploads a practical impossibility on a development machine. I had no idea that such a simple, popular scenario is in fact too exotic.

  • The problem with that approach is that S3 object names are not compatible with POSIX file names: they can contain characters that are not valid on a filesystem, or that have special meaning (like "/").

    A simple litmus test I like to run against S3 storages is to create two objects, one called "foo" and one called "foo/bar". If the S3 server uses a filesystem as its backend, only the first of those can be created.
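    A minimal sketch of why that second object fails on a filesystem backend (plain Python on a POSIX system, no S3 server involved; the object names are just for illustration):

```python
import os
import tempfile

# "foo" and "foo/bar" are both perfectly valid S3 object keys, but as
# filesystem paths they conflict: "foo" would need to be a regular file
# (first object) and a directory (parent of the second) at the same time.
root = tempfile.mkdtemp()

# Object "foo" maps to a regular file -- this works.
with open(os.path.join(root, "foo"), "w") as f:
    f.write("first object")

# Object "foo/bar" would need "foo" to be a directory -- it isn't.
try:
    with open(os.path.join(root, "foo", "bar"), "w") as f:
        f.write("second object")
except OSError as e:  # NotADirectoryError on POSIX
    print("cannot create foo/bar:", e)
```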

  • Try versitygw.

    I’m considering it for a production deployment, too. There’s much to be said for a system that doesn’t lock your data into its custom storage format.

That's a great list. I've just opened a pull request on the minio repository to add these to the list of alternatives.

https://github.com/minio/minio/pull/21746

  • I believe the Minio developers are aware of the alternatives; listing only their own commercial solution as an alternative might be a deliberate decision. But you can try getting the PR merged, there's nothing wrong with it.

  • While I do approve of that PR, doing it is ironic considering the topic was "MinIO repository is no longer maintained"

    Let's hope the editor has second thoughts on some parts

  • The mentioned AIStor "alternative" is on the min.io website. It seems like a re-brand. I doubt they will link to competing products.

Had a great experience with garage as an easy-to-set-up distributed S3 cluster for home lab use (connecting a bunch of labs run by friends into a shared cluster via tailscale/headscale). They offer an "eventual consistency" mode (the setting is consistency_mode = "dangerous", so perhaps don't use it for your seven-nines SaaS offering) where your local S3 node will happily accept (and quickly process) requests and then replicate them to the other servers later.
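For reference, that mode is a one-line switch in garage.toml. A minimal sketch (the replication factor, region, and port are just example values; consistency_mode is a Garage v1.0+ option):

```toml
# garage.toml (fragment)
replication_factor = 3
consistency_mode = "dangerous"  # ack writes locally, replicate to peers later

[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
```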

Overall a great philosophy (targeted at self-hosting / independence), with clear and easy maintenance: nothing fancy, an easy-to-understand architecture, and clear design / operation instructions.

Wrote a bit about the differences between rustfs and garage here: https://buttondown.com/justincormack/archive/ignore-previous... Since then, rustfs has fixed the issue I found. They are aimed at very different use cases; rustfs really is close to a minio rewrite.

From my experience, Garage is the best replacement for MinIO *in a dev environment*. It provides a pretty good CLI that makes automated setup easier than with MinIO. In a production environment, however, I guess Ceph is still the best because of how prominent it is.

  • Garage doesn't support CORS which makes it impossible to use for development for scenarios where web site visitors PUT files to pre-signed URLs.

    • Yep, I know; I had to build a proxy for S3 that supports custom pre-signed URLs. In my case it was worth it because my team needs to verify uploaded content for security reasons, but for most cases I guess you can't really be bothered to deploy a proxy just for CORS.

      https://github.com/beep-industries/content

    • Garage does actually support CORS (via PutBucketCORS). If there is anything specific missing, would you be willing to open an issue on our issue tracker so that we can track it as a feature request?
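      For anyone who hit this: a CORS rule set can be pushed to a bucket through the standard S3 API, e.g. `aws s3api put-bucket-cors --bucket my-bucket --cors-configuration file://cors.json` pointed at your Garage endpoint. A sketch of a cors.json allowing browser PUTs to pre-signed URLs from a dev origin (the origin and headers are assumptions, adjust to your app):

```json
{
  "CORSRules": [
    {
      "AllowedOrigins": ["http://localhost:5173"],
      "AllowedMethods": ["GET", "PUT"],
      "AllowedHeaders": ["*"],
      "ExposeHeaders": ["ETag"],
      "MaxAgeSeconds": 3600
    }
  ]
}
```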

      1 reply →

We are using rustfs for our simple use cases as a replacement for minio. Very slim footprint and very fast.

Both rustfs and seaweedfs are pretty good based on my light testing.