Comment by gavinray

1 day ago

You can use psql to export subsets of data from tables and import them again later.

Something like:

  psql <db_url> -c "\copy (SELECT * FROM event_data ORDER BY created_at DESC LIMIT 100) TO 'event-data-sample.csv' WITH CSV HEADER"

https://www.postgresql.org/docs/current/sql-copy.html
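For the import step, a rough sketch of the round trip might look like the following. The function names and the idea of passing the connection URL and file path as arguments are my own illustration, and it assumes event_data already exists in the target database with the same columns. Since \copy runs on the client side, the CSV path is relative to wherever you run psql, not to the database server.

```shell
# Sketch only: export a sample of event_data to CSV, then load it elsewhere.
# Hypothetical helpers; $1 is a connection URL, $2 is a local CSV path.

# Write the 100 most recent rows to a CSV file with a header row.
export_sample() {
  psql "$1" -c "\copy (SELECT * FROM event_data ORDER BY created_at DESC LIMIT 100) TO '$2' WITH CSV HEADER"
}

# Load the CSV into event_data; the table must already exist in the target.
import_sample() {
  psql "$1" -c "\copy event_data FROM '$2' WITH CSV HEADER"
}

# Usage (placeholders, fill in your own URLs):
#   export_sample "$SOURCE_URL" event-data-sample.csv
#   import_sample "$TARGET_URL" event-data-sample.csv
```

The import appends rows, so truncate the target table first if you want an exact copy of the sample.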

It'd be really nice if pg_dump had a "data sample"/"data subset" option, but as far as I know nothing like that is built in.

pg_dump has a few annoyances when it comes to doing stuff like this: it's tricky to select exactly the data/columns you want, and the dumped format is not always stable. My migration tool pgmigrate has an experimental `pgmigrate dump` subcommand for doing things like this. It might be useful to you or OP, maybe even just as a reference. The docs are incomplete since this feature is still experimental, so file an issue if you have any questions or trouble.

https://github.com/peterldowns/pgmigrate

Indeed, but is there a way to do it at a single "point in time", e.g. take a "virtual checkpoint" at a timestamp and run all the copy operations against that checkpoint, so the exported files are coherent with each other?
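One possible approach, as a sketch rather than a definitive answer: run all the exports in a single psql session inside one REPEATABLE READ transaction. In PostgreSQL, every statement in such a transaction sees the snapshot taken at the transaction's first query, so the files come out mutually consistent. Note that the snapshot is "now" (when the first query runs), not an arbitrary past timestamp. The function name, the "users" table, and the output file names below are hypothetical placeholders.

```shell
# Sketch only: export several tables from one consistent snapshot.
# Hypothetical: export_consistent, the "users" table, and the file names.
# $1 is a connection URL.
export_consistent() {
  # ON_ERROR_STOP aborts the script if any statement or \copy fails.
  psql "$1" -v ON_ERROR_STOP=1 <<'SQL'
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY;
\copy (SELECT * FROM event_data ORDER BY created_at DESC LIMIT 100) TO 'event-data-sample.csv' WITH CSV HEADER
\copy (SELECT * FROM users) TO 'users-sample.csv' WITH CSV HEADER
COMMIT;
SQL
}

# Usage (placeholder URL):
#   export_consistent "$SOURCE_URL"
```

Each \copy has to stay on one line, since psql backslash commands end at the newline.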