How rqlite is tested

4 days ago (philipotoole.com)

A couple of extra points I've found useful:

The first test of a class of tests is the hardest, but it's almost always worth adding. Second and subsequent tests are much easier, especially when using:

Parametrised tests let you test more things without copying boilerplate, but don't throw in more variants just to get the count up. Having said that:

Exhaustive validation of constraints, when it's feasible. We have ~100k tests of our translations for one project, validating that every string/locale pair can be resolved. Three lines of code, two seconds of wall-clock time, and we know that everything works. If there are too many variants to run them all, then:

Prop tests, if you can get into them. Again, validate consistency and invariants.
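That exhaustive string/locale check can be sketched in a few lines of Python; the catalogue and keys below are invented stand-ins for real resource files:

```python
import itertools

# Hypothetical translation catalogue; in a real project this would be
# loaded from resource files on disk.
CATALOGUE = {
    "en": {"greeting": "Hello", "farewell": "Goodbye"},
    "de": {"greeting": "Hallo", "farewell": "Tschüss"},
}
# Every key that appears in any locale must resolve in every locale.
ALL_KEYS = {k for strings in CATALOGUE.values() for k in strings}

def test_every_string_resolves():
    # The whole string/locale cross product -- one assertion per pair.
    for locale, key in itertools.product(CATALOGUE, ALL_KEYS):
        assert key in CATALOGUE[locale], f"{key!r} missing in {locale!r}"

test_every_string_resolves()
```

With pytest you could equally feed `itertools.product(CATALOGUE, ALL_KEYS)` to `@pytest.mark.parametrize` and get one reported test per pair.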

And make sure that you're actually testing what you think you're testing by using mutation testing. It's great for gaining some confidence that your tests actually catch failures.
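A hand-rolled sketch of what mutation-testing tools (mutmut, PIT, Stryker and friends) automate: introduce a deliberate bug (a "mutant") and check that the suite fails. The `clamp` function and its tests are invented for illustration:

```python
def clamp(x, lo, hi):
    return max(lo, min(x, hi))

def clamp_mutant(x, lo, hi):
    # Mutation: `min` swapped for `max` -- the upper bound is no longer enforced.
    return max(lo, max(x, hi))

def run_suite(fn):
    """Return True if every test passes against `fn`."""
    try:
        assert fn(5, 0, 10) == 5    # in range
        assert fn(-3, 0, 10) == 0   # below lower bound
        assert fn(42, 0, 10) == 10  # above upper bound -- this kills the mutant
        return True
    except AssertionError:
        return False

assert run_suite(clamp) is True         # original passes
assert run_suite(clamp_mutant) is False # mutant is caught ("killed")
```

A mutant that survives the suite is the interesting result: it means no test exercised that behaviour.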

  • That's why I prefer test frameworks that make parametrised tests so easy that they become the default.

    Kotest and Spock are such frameworks.

    • Great :). We use JUnit5, and a mixture of @EnumSource, @MethodSource, and dynamic tests.

      Or Pytest, or a short shell script, or Busted, or Jest, or Node's built-in test runner, or a "--selftest" flag on a variant of the native executable.

  • >The first test of a class of tests is the hardest, but it's almost always worth adding.

    Agreed, this has been my experience too.

The testing pyramid makes sense. The problem for (perhaps) a lot of us is that we're working on things where some or even all of the levels are missing, and we have to try to bring them to a sensible state as fast as possible... but we have limited ability to do so. It's managing imperfection.

We're also possibly working with multiple teams on products that interact and it ends up being "nobody's job" to fill in the e2e layer for example.

Then when someone bites the bullet to get on with it... the whole thing isn't designed to be tested. E.g. how does anyone do testing with Auth0 as their auth mechanism? How do you even get a token to run an E2E-type test? I have to screen-scrape it, which is awful.
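For what it's worth, Auth0's token endpoint can usually be called directly instead of screen scraping, e.g. with the Resource Owner Password grant (it has to be enabled for the application in the Auth0 dashboard; the domain, credentials and audience below are placeholders):

```python
import json
import urllib.request

def build_token_request(domain, client_id, client_secret,
                        username, password, audience):
    # POST https://<tenant>/oauth/token with a password-grant payload.
    payload = {
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
        "audience": audience,
    }
    return urllib.request.Request(
        f"https://{domain}/oauth/token",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# req = build_token_request("example.eu.auth0.com", "CLIENT_ID", "CLIENT_SECRET",
#                           "e2e-user@example.com", "s3cret", "https://api.example.com/")
# token = json.load(urllib.request.urlopen(req))["access_token"]
```

For pure machine-to-machine E2E tests, the client-credentials grant (no username/password) is the usual alternative.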

Without those E2E tests - even just a test that you can log in - the system can break, and even when it's a test environment, that makes the environment useless for other developers and gets in everyone's way. It becomes the victim's job to debug which change broke login and push the perpetrator to fix it. With automated E2E tests, the deployment that broke something is easy to see and roll back before it does any damage.

I suppose I'm challenging the focus in a sense - I care about E2E more because some of those issues block teams from working. If you can't work because of some stupid E2E failure, you can't get fixes out for issues you found in the unit/integration tests.

  • The pyramid makes sense for certain types of application - very logic heavy with light integration touch points with everything cleanly dependency injected. rqlite fits this pattern.

    If you're building a logic-light application with a lot of integration touch points (I find most commercial code actually fits this pattern), then it makes zero sense - an integration-test-heavy suite is what you want - an upside-down pyramid, if you will.

    If you have a ball of mud on your hands, then it doesn't matter what kind of app it is: E2E tests are the only thing that makes sense, and you need to build very sophisticated fakes (requiring a lot of engineering skill) to avoid things like flakiness and the token-scraping problem you referred to.

    If you're writing a parser, you probably want 100% unit tests and property tests for all those millions of parser edge cases. No pyramid of any shape is required, just a slab.
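As a sketch of the parser case: the core property is usually a round trip, which you can check even without a property-testing library. The toy renderer/parser below is invented for illustration; something like Hypothesis would automate the generation and shrinking:

```python
import random

def render(values):
    # Serialise a list of ints to a comma-separated string.
    return ",".join(str(v) for v in values)

def parse(text):
    # Inverse of render(); the empty string is the empty list.
    return [int(tok) for tok in text.split(",")] if text else []

# Property: parse(render(x)) == x for arbitrary inputs.
rng = random.Random(0)  # seeded, so any failure is reproducible
for _ in range(1000):
    values = [rng.randint(-10**6, 10**6) for _ in range(rng.randint(0, 20))]
    assert parse(render(values)) == values
```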

    The testing pyramid reminds me of microservices: the people who came up with Their Grand Idea had no clue what it was about their specific circumstances that made their approach work for them but still managed to market it as The Way.

    Realistically, the ultimate shape of your test suite should be an emergent property based upon a series of smart, local decisions - ideally TDD'ed tests which match the kind of code you are writing right now. Making a particular shape a goal is asinine.

    • It's a very common mistake in SWE to generalize and extrapolate from one's own situation.

      The way to do things is very dependent on the project. The way one maintains, debugs, and tests e.g. a math library vs. embedded software vs. a video game vs. a database vs. an enterprise app vs. a service in a distributed system is different in each case.

      Almost all advice should be qualified with "in this domain, on this type of project ...".

I love how dedicated you are with this project, Philip. Been watching it for many many years.

There seems to be some copy-paste in the FAQ entry about whether any node can be contacted for writes or reads: the paragraph on reads mentions writes.

  • Thanks for flagging -- fixed.

    • My feedback regarding the presentation: I think there should be slightly more focus on why one would choose rqlite over, say, SQLite. That probably means more info on the distributed part.

      I happen to know raft and the kind of problem it solves, but the average reader might not. A practical demonstration of the problems it solves might be in order.

      That also means describing which applications rqlite is better suited to (and which ones it might be worse for) than other databases.

      1 reply →

Have been following rqlite for a while and just started using it in one project. So far it delivers on the promise of simplicity. Thanks!

What do you think about deterministic simulation testing, which is currently trending?
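(For readers unfamiliar with it, the core idea can be sketched in a few lines: route every source of nondeterminism through one seeded PRNG, so any failing run replays exactly from its seed. The toy "network" below is invented for illustration.)

```python
import random

def simulate(seed, messages):
    # All nondeterminism flows through this single seeded PRNG.
    rng = random.Random(seed)
    pending = list(messages)
    delivered = []
    while pending:
        # The simulated "network" picks the next message to deliver.
        delivered.append(pending.pop(rng.randrange(len(pending))))
    return delivered

msgs = ["a", "b", "c", "d"]
# Same seed, same schedule: a failure under seed 42 replays exactly.
assert simulate(42, msgs) == simulate(42, msgs)
# Different seeds explore different delivery orders, each reproducible.
schedules = {tuple(simulate(s, msgs)) for s in range(100)}
assert len(schedules) > 1
```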

Anyone using rqlite in prod?

Thanks for the post... I did not know about this.

Just went through the website and read through some documents. It is quite easy to read and understand.

One part I couldn't follow was security - other than items listed (file system, HTTPS, IP based filter), is it correct to say that if you know & have access to the URL API endpoints, any query can be run directly off the db with curl or such tools? How is this aspect managed in production? Sorry if this question is inappropriate or too dumb.

  • rqlite supports a few levels of security, which you can use separately or together.

    You can enable BasicAuth on the HTTP endpoints. You can also require that any client presenting BasicAuth credentials has the right permission for the operation it is trying to perform. Finally, you can enable Mutual TLS, requiring that any client that connects first present a cert signed by an acceptable CA.

    For full details on security, check out https://rqlite.io/docs/guides/security/
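A minimal sketch of what a BasicAuth client call to the query endpoint looks like once auth is enabled (host, credentials and SQL below are placeholders):

```python
import base64
import urllib.parse
import urllib.request

def build_query_request(host, user, password, sql):
    # rqlite serves queries at /db/query; the SQL goes in the `q` parameter.
    url = f"http://{host}/db/query?q={urllib.parse.quote(sql)}"
    creds = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url, headers={"Authorization": f"Basic {creds}"}
    )

# req = build_query_request("localhost:4001", "bob", "secret1", "SELECT * FROM foo")
# rows = urllib.request.urlopen(req).read()
```

Without valid credentials (or a client cert, if Mutual TLS is on), the node rejects the request, which answers the "anyone with the URL can query" concern.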

In case you missed it, OP is also the author of rqlite

(I for one didn't notice at first)

  • Yes, that is right. So here's my disclaimer: I'm the author of rqlite.

    I really admire the testing that the SQLite team does[1], and the way it allows them to stand behind the statements they make about quality. It's inspiring.

    IMO there have only been two really big improvements to software development relative to when I started programming professionally 25 years ago: 1) code reviews becoming mainstream, and 2) unit testing. (Perhaps Gen AI will be the third). I believe extensive testing is the only reason that rqlite continues to be developed to this day. It's not just that it helps keep the quality high, it's a key design guide. If a new module cannot be unit tested during development, in a straightforward manner, it's a strong sign one's decomposition of the problem is wrong.

    [1] https://www.sqlite.org/testing.html

Now that there is testing like https://turso.tech/blog/introducing-limbo-a-complete-rewrite... going on, tbh the testing described in the linked piece doesn't seem so good.

1. Is this a one man project? Why? What if the author dies?

2. Why Go? Go is garbage collected, how is this even a good idea for a database engine in the first place?

  • > Why Go? Go is garbage collected, how is this even a good idea for a database engine in the first place?

    First, the database engine itself is SQLite, which is written in C and is arguably the most widely deployed database engine in the galaxy.

    Second, Go is popular for network services because it's easy to write (relative to languages like C++ or Rust), it's memory-safe, and it's "fast enough" for the network or disk to be your bottleneck; its concurrency features also make it easy to use hardware efficiently without having to worry about scheduling in your application, among other reasons.

    • > the most widely deployed database engine in the galaxy.

      The planet, perhaps. We don't really know at the galaxy level. At best we're at least 200k years away from knowing for sure :)

      /nitpick

      2 replies →