Comment by Yokohiii

18 hours ago

Production-grade multi-tenant databases want to run *solely* in RAM.

> why would you not want to index?

Because if you don't need an index it wastes RAM, as you've learned. Maintaining indices also has a cost. Index only what you need.

In the sense of the blog post: A senior with decent DB experience would have told you. ;)

You mean NoSQL, which is slightly different and more nuanced. This was a shop that was mostly SQL, with the exception of me, the one junior developer using MongoDB and Elastic. Mind you, we got a lot of things done, and I learned a lot more about Mongo than I would have liked.

In all fairness, this was my first job as a developer a few years ago. I dove deep into MongoDB, but I was also one of the only devs using it at this place.

My previous experience with MongoDB had been in college and more limited.

Everything "wants to" run solely in RAM, but we don't have infinite RAM, so a "production grade" database should also be able to fetch data from disk unless this is an explicit tradeoff. MariaDB and PostgreSQL do not require all indices to be stored in RAM. Obviously they can be accessed more quickly if they are in RAM but they are designed under the assumption they will often be stored on disk. It sounds like MongoDB is not, and given the reputation of MongoDB, this is as likely to be incompetence as it is to be a willing tradeoff.

  • Every serious database that is designed to handle moderate to high traffic will expect you to have enough RAM to fit all data and indices. Relational DBs do a solid job if that's not the case, but that also sabotages the efficiency you could get from them. It will work for some time. If it's enough for you, that's fine.

    I am not experienced with MongoDB, and I don't know if the problems reported in previous comments were the user's fault or MongoDB's. But one thing is clear to me: complaining that it uses too much RAM without knowing the reasons for it is a user problem. A common mistake is to set up a DB and expect it to just magically work. DBs are complicated beasts; you have to know how to deal with them.

    • Potentially a mix of both, though MongoDB was still very young when we were using it. Places like Google were championing it, or rather places that can afford to burn a ton of RAM.

    • You certainly don't need to hold all data in RAM to serve "moderate" traffic. A modern hard drive can seek about 80 times per second, an optimized RAID array even more, and an SSD tens of thousands of times. To me, a light load means up to about one request per second, a moderate load means maybe 20 requests per second, and a heavy load means hundreds or thousands of requests per second. Pessimistically, each (read) request takes 5-10 random reads to service, and almost every system is read-mostly.

      I think these are realistic expectations for most apps. Obviously the likes of Netflix and Uber get orders of magnitude more, but 99.9% of apps aren't a Netflix or an Uber, and you don't have to optimize for scaling until your app is on a trajectory to become one. Putting your database on an SSD already lets you handle several thousand concurrent users with ease.
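To make the arithmetic above concrete, here is a rough back-of-envelope sketch in Python. It assumes the worst case where every read misses the cache and goes to disk, uses the 80 seeks per second quoted for a hard drive and the pessimistic 10 seeks per request, and picks 20,000 seeks per second as an assumed stand-in for "tens of thousands" on an SSD. The function and variable names are made up for illustration.

```python
# Back-of-envelope capacity estimate using the figures quoted in the comment.
SEEKS_PER_REQUEST = 10  # pessimistic number of random reads per request

# Random seeks per second. The HDD figure comes from the comment; the SSD
# figure is an assumption standing in for "tens of thousands".
DEVICE_SEEKS_PER_SEC = {
    "hdd": 80,
    "ssd": 20_000,
}


def max_requests_per_sec(seeks_per_sec, seeks_per_request=SEEKS_PER_REQUEST):
    """Requests per second a device sustains if every read hits the disk."""
    return seeks_per_sec / seeks_per_request


for device, seeks in DEVICE_SEEKS_PER_SEC.items():
    print(f"{device}: ~{max_requests_per_sec(seeks):.0f} requests/s")
```

Under these assumptions a single uncached hard drive lands at roughly 8 requests per second and the SSD at roughly 2,000. Any RAM available for caching hot pages only pushes those numbers up, since fewer reads reach the disk at all.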
