← Back to context

Comment by ryanjshaw

1 day ago

IMO They don’t have a high concurrency DB writing system, they just think they do.

Recordings can and should be streamed to an object store. Parallel processes can do transcription on those objects; bonus: when they inevitably have a bug in transcription, retranscribing meetings is easy.

The output of transcription can be a single file also stored in the object store with a single completion message notification, or if they really insist on “near real-time”, a message on a queue for every N seconds. Much easier to scale your queue than your DB, eg Kafka partitions.

A handful of consumers can read those messages and insert into the DB. Benefit is you have a fixed and controllable write load into the database, and your client workload never overloads the DB because you’re buffering that with the much more distributed object store (which is way simpler than running another database engine).