Comment by crazygringo

3 years ago

Oh thanks for clarifying. Turns out the link to the folder for all the states is here:

https://drive.google.com/drive/folders/173oHgms6wYy5WKz_i3Lh...

But there doesn't appear to be any file that calculates the nationwide totals.

It seems like such a strange omission, but I'm on mobile and can't add up the numbers across a ton of different files myself.

I downloaded all the .CSV files from that site and quickly loaded them into a table. It only took a couple of minutes, but I didn't stop to verify that there were no duplicate rows across the various files.

When I added up the totals, I got: APC - 7,225,399; LP - 5,286,181; PDP - 5,285,900; NNPP - 1,529,575

Note: I was using a beta version of a new database tool I created to do this.

  • This should be quick to whip up in a few minutes in pandas, I'd think, assuming the column headers are identical and in the same order across files. It would translate into a bunch of pandas concat calls, followed by a value_counts on the merged table's vote column.

    • I have no doubt that a pandas expert (or a Postgres expert, or a MySQL expert, or ...) can whip up something fairly quickly to load in the data and find the totals.

      My tool is designed for people who are not experts but just have a basic understanding of relational tables (e.g. someone comfortable with a spreadsheet) to be able to load a data set like this and analyze it with just a few clicks of a mouse. Using it, I was able to do the whole thing in about 2 minutes.

      BTW: When my numbers did not match those of another HN commenter on this thread, I investigated and found a bug in my code. Once it was fixed, the numbers were correct. (I guess that's why it is still in beta!)
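
For anyone who wants to try the merge-and-total idea from this thread without extra tooling, here is a minimal sketch using only the Python standard library instead of pandas. It reads every CSV, skips rows that are exact duplicates of a row already seen (the check mentioned above as skipped), and sums votes per party. The column names `party` and `votes` and the function name `total_votes` are hypothetical; the actual files may be laid out differently, so treat this as an illustration under those assumptions rather than the method anyone in the thread used.

```python
import csv
from collections import Counter


def total_votes(paths, party_col="party", votes_col="votes"):
    """Sum votes per party across several CSV files.

    Assumes (hypothetically) that every file has a header row containing
    `party_col` and `votes_col`, and that each row also carries some unique
    identifier (e.g. a polling unit), so an exact repeat is a true duplicate.
    """
    totals = Counter()
    seen = set()  # rows already counted, to guard against duplicates across files
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                # Sort the items so column order differences between files
                # do not hide a duplicate row.
                key = tuple(sorted(row.items()))
                if key in seen:
                    continue
                seen.add(key)
                # Strip thousands separators such as "7,225,399" before converting.
                totals[row[party_col]] += int(row[votes_col].replace(",", ""))
    return totals
```

Calling `total_votes(["state_a.csv", "state_b.csv"])` (file names made up here) returns a `Counter` keyed by party. If the files have no per-row identifier, the duplicate check could wrongly drop legitimate rows that happen to be identical, so it would need to be removed or adapted.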