Comment by joshmn

1 day ago

I have a moderately sized 2TB production database I have enjoyed using pgBackRest on, and was—this week—going to set it up on another 8TB database we have.

What's the next-closest thing? wal-g? barman? databasus? I only get to cosplay as a DBA.

I've used Barman on fairly large DBs (30+ TB) and had no complaints with it. I am a DBRE, if that carries any weight.

  • We recently moved from Barman to pgBackRest. Our main complaint with Barman was that its incremental backups relied on hardlinks. Locally that was great: we could back up our 7TB database and the next day's backup would add only ~20GB of changes. But cloud storage has no concept of hardlinks, so replicating those backups meant pushing the full 14TB. Also, at least when we last looked a while back, file compression applied only to the WAL files, unless you used the newer barman-cloud-backup tool, which we did not.

    Also, pgBackRest lets you take the majority of the backup from a physical standby, which is VERY nice for keeping load off production.

    None of these seemed like issues until we looked at pgBackRest and suddenly realized how nice that would be.

    • We just piped the backups through pigz for compression; rapidgzip also exists for parallelized decompression (or any other compression algorithm you’d like to use, of course).

  • barman seems to cover "Natural disaster" in their docs. Seems good.

    I'll take a look. Thanks!
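The hardlink ballooning described in the Barman comment above is easy to reproduce in miniature: a second backup directory that hardlinks an unchanged file costs almost nothing on disk, but a plain copy (which is effectively what an object-store upload is) materializes every link as a full file. A minimal sketch, using scratch paths:

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/full" "$work/incr"

# "Full" backup: one 10 MiB data file.
dd if=/dev/urandom of="$work/full/base.dat" bs=1M count=10 2>/dev/null

# "Incremental" backup: the unchanged file is just a hardlink -- no new blocks.
ln "$work/full/base.dat" "$work/incr/base.dat"
du -sh "$work"          # ~10M on disk across both backups

# A plain recursive copy cannot carry hardlinks, so the same tree
# costs roughly double at the destination.
cp -r "$work" "$work.copy"
du -sh "$work.copy"     # ~20M
```

This is the 7TB-backed-up-but-14TB-pushed effect in small: `du` counts the shared inode once locally, while the copy duplicates it.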
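The pigz route mentioned above is just standard streaming tools around `pg_basebackup`. A sketch, with the target paths as assumptions rather than a recommendation:

```shell
# Backup: stream a tar of the cluster to stdout and compress across all
# cores with pigz. -X fetch collects the required WAL into the tar
# (WAL streaming can't be combined with tar-to-stdout).
pg_basebackup -D - -Ft -X fetch | pigz -p "$(nproc)" > /backup/base.tar.gz

# Restore: rapidgzip decompresses the gzip stream in parallel
# (plain gunzip works too, single-threaded).
rapidgzip -d -c /backup/base.tar.gz | tar -xf - -C /var/lib/postgresql/data
```

The nice property is that gzip output stays compatible with every tool downstream; pigz and rapidgzip only parallelize the compression and decompression ends of the pipe.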

Backing up multi-terabyte production Postgres databases is not merely cosplaying, ha ha

I can beat you on the timing: I'd never used pgBackRest before, but started setting it up on a project about 2 hours ago; by the time I'd finished, the README had been updated.

Anyone put the standby on ZFS or other filesystems that can take snapshots for backup?

  • Not for PostgreSQL, but for MariaDB we run replicas in FreeBSD jails on a server with lots of ZFS space. The jailed Maria instances just stop every hour (so the DB flushes everything to disk), the host snapshots all of their data volumes, and then starts the jails back up. Within a minute or so they're fully caught up to the primaries again. Gives us months and months of recovery checkpoints.

    It's great because it's a completely clean save from a shutdown state, so when we need a scratch copy of a database it only takes as long as cloning whichever snapshot we want (depending on how far back we need to go), then starting a scratch jail that runs from those cloned filesystems. When finished, just shut down the scratch jail and delete the clones; it's like it never happened.

  • A previous company I was at did this on the primary. It always seemed to work, but no one was really comfortable with it, largely because there wasn't much ZFS experience on the team at the time, and also because the process did not quiesce the database before snapshotting. I think it's still a valid strategy, but not one I have had time to verify thoroughly.
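The stop → snapshot → start cycle described above maps onto only a handful of commands. A sketch with hypothetical jail and dataset names (this mirrors the MariaDB/FreeBSD jail setup, not Postgres):

```shell
# Hourly cycle: clean shutdown so everything is flushed to disk,
# snapshot the jail's dataset, start it again; the replica then
# catches back up from its primary within a minute or so.
jexec mariadb1 service mysql-server stop
zfs snapshot tank/jails/mariadb1/data@"$(date +%Y-%m-%d_%H)"
jexec mariadb1 service mysql-server start

# Scratch copy later: clone any snapshot (instant, copy-on-write)
# and run a throwaway jail on top of it.
zfs clone tank/jails/mariadb1/data@2024-06-01_12 tank/scratch/data
# ...use the scratch jail, then:
zfs destroy tank/scratch/data    # like it never happened
```

The clone is what makes the scratch-copy trick cheap: it shares blocks with the snapshot, so "cloning" a multi-TB database allocates almost nothing until you write to it.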

databasus does not do PITR.

  • Is that info up to date? Their README states:

      **Backup types**
      
      - **Logical** — Native dump of the database in its engine-specific binary format. Compressed and streamed directly to storage with no intermediate files
      - **Physical** — File-level copy of the entire database cluster. Faster backup and restore for large datasets compared to logical dumps
      - **Incremental** — Physical base backup combined with continuous WAL segment archiving. **Enables Point-in-time recovery (PITR)** — restore to any second between backups. Designed for disaster recovery and near-zero data loss requirements
    

    EDIT: It seems PITR was added this March (for PostgreSQL):

    https://github.com/databasus/databasus/issues/411
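For reference, the "Incremental" mode quoted from the README is the standard Postgres PITR recipe: a physical base backup plus continuous WAL archiving. The generic server-side mechanism (not necessarily how databasus drives it) boils down to a couple of settings; the archive destination here is a hypothetical path:

```
# postgresql.conf fragment
wal_level = replica
archive_mode = on
# Copy each finished WAL segment somewhere durable; fail (and retry)
# rather than overwrite if the file already exists.
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
```

Restore-to-any-second then works by restoring the base backup and replaying archived WAL up to a `recovery_target_time`.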