Comment by brchr

5 months ago

It is possible to reproduce one of the key claims in this post -- the "Russian tail" in the early voting tallies -- straight from the raw data hosted on the Clark County, NV website. This code can be run in a Colab notebook:

  # Download and extract zip file
  import requests
  import zipfile
  import io

  # Get raw data from Clark County website
  zip_url = "https://elections.clarkcountynv.gov/electionresultsTV/cvr/24G/24G_CVRExport_NOV_Final_Confidential.zip"

  # Download the zip file, failing loudly on an HTTP error
  response = requests.get(zip_url)
  response.raise_for_status()

  # Extract to the current working directory; the context manager closes the archive
  with zipfile.ZipFile(io.BytesIO(response.content)) as zip_file:
      zip_file.extractall()

  import pandas as pd
  import matplotlib.pyplot as plt
  import numpy as np

  # Read the actual data, skipping the first three header rows and excluding downballot races.
  # The path is relative to the working directory used for extraction (/content in Colab).
  df = pd.read_csv('24G_CVRExport_NOV_Final_Confidential.csv', skiprows=3, usecols=range(21), low_memory=False)

  # Names of the Trump and Harris columns
  trump_col = "REP"
  harris_col = "DEM"

  # Convert to numeric
  df[trump_col] = pd.to_numeric(df[trump_col], errors='coerce')
  df[harris_col] = pd.to_numeric(df[harris_col], errors='coerce')

  # Filter for early voting
  early_voting = df[df['CountingGroup'] == 'Early Voting']

  # Group by tabulator and calculate percentages
  tabulator_stats = early_voting.groupby('TabulatorNum').agg({
      harris_col: 'sum',
      trump_col: 'sum'
  }).reset_index()

  # Calculate total votes and percentages
  tabulator_stats['total_votes'] = tabulator_stats[harris_col] + tabulator_stats[trump_col]
  tabulator_stats['harris_pct'] = tabulator_stats[harris_col] / tabulator_stats['total_votes'] * 100
  tabulator_stats['trump_pct'] = tabulator_stats[trump_col] / tabulator_stats['total_votes'] * 100

  # Create subplots
  fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))

  # Plot Harris histogram
  ax1.hist(tabulator_stats['harris_pct'], bins=50, edgecolor='black', color='blue', alpha=0.7)
  ax1.set_title('Distribution of Harris Votes by Tabulator (Early Voting Only)')
  ax1.set_xlabel('Percentage of Votes for Harris')
  ax1.set_ylabel('Number of Tabulators')

  # Plot Trump histogram
  ax2.hist(tabulator_stats['trump_pct'], bins=50, edgecolor='black', color='red', alpha=0.7)
  ax2.set_title('Distribution of Trump Votes by Tabulator (Early Voting Only)')
  ax2.set_xlabel('Percentage of Votes for Trump')
  ax2.set_ylabel('Number of Tabulators')

  plt.tight_layout()
  plt.show()

This produces a figure identical (up to histogram bucketing) to the one at the end of the linked article.
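For anyone who wants to check the aggregation step without downloading the full export, here is a minimal sketch of the same group-and-percentage logic run on a tiny made-up table. The function name `tabulator_shares` and the `min_votes` parameter are my own additions for illustration, not part of the notebook above, and the toy rows are invented rather than taken from the Clark County data. The column names ("CountingGroup", "TabulatorNum", "DEM", "REP") are assumed to match the ones used above.

```python
import pandas as pd

def tabulator_shares(df, harris_col="DEM", trump_col="REP", min_votes=0):
    """Per-tabulator two-candidate vote shares for early-voting rows.

    min_votes is a hypothetical knob: tabulators with fewer than this many
    Harris+Trump votes are dropped before percentages are computed.
    """
    early = df[df["CountingGroup"] == "Early Voting"]
    stats = early.groupby("TabulatorNum")[[harris_col, trump_col]].sum().reset_index()
    stats["total_votes"] = stats[harris_col] + stats[trump_col]
    # Drop empty tabulators (avoids division by zero) and those below the threshold
    keep = (stats["total_votes"] > 0) & (stats["total_votes"] >= min_votes)
    stats = stats[keep].copy()
    stats["harris_pct"] = stats[harris_col] / stats["total_votes"] * 100
    stats["trump_pct"] = stats[trump_col] / stats["total_votes"] * 100
    return stats.reset_index(drop=True)

# Invented toy ballots: one row per ballot, 1 = vote for that candidate
toy = pd.DataFrame({
    "CountingGroup": ["Early Voting", "Early Voting", "Early Voting", "Mail", "Early Voting"],
    "TabulatorNum": [1, 1, 2, 2, 2],
    "DEM": [1, 0, 1, 1, 1],
    "REP": [0, 1, 0, 0, 1],
})
shares = tabulator_shares(toy)
print(shares)
```

Note that the histograms count each tabulator once regardless of how many ballots it processed, so a threshold like `min_votes` is one way to see how much of the shape comes from low-volume tabulators.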

2 comments

beedeebeedee 5 months ago

Thank you! I've run the notebook and reproduced the histograms of the early votes. I'm grateful to you for sharing your work. To the other commenters who have dismissed the analysis without providing details: I would recommend that you run this notebook yourselves and dig in.

For others who don't know what the "Russian tail" is, here is the link within the PDF: https://www.rferl.org/a/georgia-election-manipulation-russia...

derangedHorse 5 months ago

It seemed more useful to post my response to this as a comment on the original post: https://news.ycombinator.com/item?id=43000312