Comment by sebg
2 days ago
Not OP.
But I'm guessing it's something like this:
import pandas as pd
def calculate_monthly_business_sum(df, date_column, value_column):
"""
Calculate monthly sums anchored to the last business day of each month
Parameters:
df: DataFrame with dates and values
date_column: name of date column
value_column: name of value column to sum
Returns:
DataFrame with sums anchored to last business day
"""
# Ensure date column is datetime
df[date_column] = pd.to_datetime(df[date_column])
# Group by end of business month and sum
monthly_sum = df.groupby(pd.Grouper(
key=date_column,
freq='BME' # Business Month End frequency
))[value_column].sum().reset_index()
return monthly_sum
# Example usage:
df = pd.DataFrame({ 'date': ['2024-01-01', '2024-01-31', '2024-02-29'], 'amount': [100, 200, 300] })
result = calculate_monthly_business_sum(df, 'date', 'amount')
print(result)
Which you can run here => https://python-fiddle.com/examples/pandas?checkpoint=1732114...
It's actually much simpler than that. Assuming the index of the dataframe DF is composed of timestamps (which is normal for timeseries):
df.resample("BME").sum()
Done. One line of code and it is quite obvious what it is doing - with perhaps the small exception of BME, but if you want max readability you could do:
df.resample(pd.offsets.BusinessMonthEnd()).sum()
This is why people use pandas.