So I took the discogs-xml2db tool and ran it against the
Discogs Data, May 2022 release. I got back 8.1 GB 😱 of CSV data to
ingest into PostgreSQL. I’ve done this for previous months and it
ingested just fine, but there’s some interesting exploration that can
be done with the CSV data, both before and after ingest. I’m gonna
need a few tools for that:
- VisiData, “VisiData is an interactive multitool for tabular data.”
- xsv, “xsv is a command line program for indexing, slicing, analyzing, splitting and joining CSV files.”
- Data Fluent for PostgreSQL, “Build a better understanding of your data in PostgreSQL.”