Comment by brchr
5 months ago
It is possible to reproduce one of the key claims in this post -- the "Russian tail" in the early voting tallies -- straight from the raw data hosted on the Clark County, NV website. This code can be run in a Colab notebook:
# Download and extract zip file
import requests
import zipfile
import io
# Get raw data from Clark County website
zip_url = "https://elections.clarkcountynv.gov/electionresultsTV/cvr/24G/24G_CVRExport_NOV_Final_Confidential.zip"
# Download the zip file and stop early on an HTTP error
response = requests.get(zip_url)
response.raise_for_status()
# Extract to the current working directory; the with-block closes the archive
with zipfile.ZipFile(io.BytesIO(response.content)) as zip_file:
    zip_file.extractall()
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Read the actual data, skipping the first three header rows and excluding downballot races
df = pd.read_csv('/content/24G_CVRExport_NOV_Final_Confidential.csv', skiprows=3, usecols=range(21), low_memory=False)
# Find the Trump and Harris columns
trump_col = "REP"
harris_col = "DEM"
# Convert to numeric
df[trump_col] = pd.to_numeric(df[trump_col], errors='coerce')
df[harris_col] = pd.to_numeric(df[harris_col], errors='coerce')
# Filter for early voting
early_voting = df[df['CountingGroup'] == 'Early Voting']
# Group by tabulator and calculate percentages
tabulator_stats = early_voting.groupby('TabulatorNum').agg({
    harris_col: 'sum',
    trump_col: 'sum'
}).reset_index()
# Calculate total votes and percentages
tabulator_stats['total_votes'] = tabulator_stats[harris_col] + tabulator_stats[trump_col]
tabulator_stats['harris_pct'] = tabulator_stats[harris_col] / tabulator_stats['total_votes'] * 100
tabulator_stats['trump_pct'] = tabulator_stats[trump_col] / tabulator_stats['total_votes'] * 100
# Create subplots
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))
# Plot Harris histogram
ax1.hist(tabulator_stats['harris_pct'], bins=50, edgecolor='black', color='blue', alpha=0.7)
ax1.set_title('Distribution of Harris Votes by Tabulator (Early Voting Only)')
ax1.set_xlabel('Percentage of Votes for Harris')
ax1.set_ylabel('Number of Tabulators')
# Plot Trump histogram
ax2.hist(tabulator_stats['trump_pct'], bins=50, edgecolor='black', color='red', alpha=0.7)
ax2.set_title('Distribution of Trump Votes by Tabulator (Early Voting Only)')
ax2.set_xlabel('Percentage of Votes for Trump')
ax2.set_ylabel('Number of Tabulators')
plt.tight_layout()
plt.show()
This produces a figure identical (up to histogram bucketing) to the one at the end of the linked article.
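For anyone who wants to sanity-check the aggregation step without downloading the full export, here is a minimal sketch on a handful of made-up rows. The column names mirror the ones used above; the helper name `tabulator_percentages` and all of the numbers are invented purely for illustration and are not taken from the Clark County data.

```python
import pandas as pd

def tabulator_percentages(df, dem_col="DEM", rep_col="REP"):
    """Sum votes per tabulator and compute each candidate's share
    of the two-candidate total (hypothetical helper, same logic as above)."""
    stats = df.groupby("TabulatorNum")[[dem_col, rep_col]].sum().reset_index()
    stats["total_votes"] = stats[dem_col] + stats[rep_col]
    stats["harris_pct"] = stats[dem_col] / stats["total_votes"] * 100
    stats["trump_pct"] = stats[rep_col] / stats["total_votes"] * 100
    return stats

# Invented toy ballots: one row per ballot, 1 = vote for that candidate
toy = pd.DataFrame({
    "CountingGroup": ["Early Voting"] * 5 + ["Mail"],
    "TabulatorNum": [1, 1, 1, 2, 2, 2],
    "DEM": [1, 0, 1, 0, 0, 1],
    "REP": [0, 1, 0, 1, 1, 0],
})

# Same filter as above: keep early-voting ballots only
early = toy[toy["CountingGroup"] == "Early Voting"]
stats = tabulator_percentages(early)
print(stats)
```

Note that the percentages are shares of the Harris plus Trump total for each tabulator, so ballots cast for neither candidate do not enter the denominator.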
Thank you! I've run the notebook and reproduced the histograms of the early votes. I'm grateful to you for sharing your work. To the other commenters who have dismissed the analysis without providing details: I recommend running this notebook and digging in.
For others who don't know what the "Russian tail" is, here is the link within the PDF: https://www.rferl.org/a/georgia-election-manipulation-russia...
It seemed more useful to post my response to this as a comment on the original post: https://news.ycombinator.com/item?id=43000312