Comment by alexanderameye
1 month ago
Interesting! I’ve been working on a tracker for the Belgian parliament [0]. Similar story: a very old site, and the data is mostly published as a weekly PDF report (including votes and discussions).
I have:
1. A scraper/parser that ingests the data daily and transforms it into .parquet files (rough sketch below).
2. An LLM summarizer that condenses the longer discussions.
3. A static site that is automatically regenerated from the .parquet files to make the data browsable.
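To give an idea of the shape of step 1, here is a minimal sketch in Python. The report URL, the vote-line regex, and the parquet schema are all illustrative placeholders, not the real ones; the actual parser has to deal with far messier PDF layout.

```python
# Minimal sketch of the daily scrape -> parse -> parquet step.
# REPORT_URL, the regex, and the schema are illustrative placeholders.
import datetime
import io
import re

import pandas as pd
import pdfplumber
import requests

REPORT_URL = "https://example.org/plenary/latest.pdf"  # placeholder, not the real endpoint
VOTE_LINE = re.compile(r"^(?P<member>.+?)\s+(?P<vote>yes|no|abstain)$", re.IGNORECASE)


def fetch_report(url: str = REPORT_URL) -> bytes:
    """Download the weekly PDF report (real code would add retries and caching)."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.content


def extract_text(pdf_bytes: bytes) -> str:
    """Pull plain text out of the PDF, page by page."""
    with pdfplumber.open(io.BytesIO(pdf_bytes)) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def parse_votes(text: str) -> pd.DataFrame:
    """Turn report text into one row per (member, vote); real layouts are messier."""
    rows = [
        {"member": m.group("member"), "vote": m.group("vote").lower()}
        for line in text.splitlines()
        if (m := VOTE_LINE.match(line.strip()))
    ]
    return pd.DataFrame(rows, columns=["member", "vote"])


if __name__ == "__main__":
    votes = parse_votes(extract_text(fetch_report()))
    votes["scraped_at"] = datetime.date.today().isoformat()
    # Writing parquet via pandas requires pyarrow (or fastparquet).
    votes.to_parquet(f"votes_{datetime.date.today():%Y%m%d}.parquet", index=False)
```

Parquet works nicely here because the static-site generator can read the same files straight back without a database in between.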
What makes your tracker ‘real-time’? Does it ingest the livestream of a parliamentary session while it is happening?
Similar setup here with Zürich's city council (https://github.com/SiiiTschiii/zuerichratsinfo): we scrape daily, parse votes, and auto-generate social media posts. Your Belgian tracker looks great; we should definitely exchange notes on the LLM summarization approach!
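The post generation itself is simple once the votes are in parquet. A rough sketch (the column names and wording below are placeholders, not our actual schema):

```python
# Illustrative only: draft a short post from a parquet file of parsed votes.
# The column names ("motion", "vote") are assumed, not the actual schema.
import pandas as pd


def draft_post(parquet_path: str) -> str:
    votes = pd.read_parquet(parquet_path)
    counts = votes["vote"].value_counts()
    yes, no = int(counts.get("yes", 0)), int(counts.get("no", 0))
    motion = votes["motion"].iloc[0] if "motion" in votes.columns else "today's motion"
    return f"Council vote on {motion}: {yes} in favour, {no} against. Full breakdown on the tracker."


if __name__ == "__main__":
    print(draft_post("votes_20240101.parquet"))  # path is a placeholder
```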