●WWDC — WWDC 2026 confirms Siri runs on Google Gemini; third-party handoff to ChatGPT is dropped, and Siri AI won't ship in the EU under the DMA at iOS 27●BILLING — 6 days until the Jun 15 change: Agent SDK, headless Claude Code, GitHub Actions, and third-party agents move to API-rate monthly credit●OUTAGE — claude.ai, Claude Code, and Cowork saw an outage (Jun). Scheduled runs are safest when built around fallbackModel and retries●DYNAMIC-WORKFLOWS — Dynamic workflows are on by default on Max/Team and the API, for codebase-wide bug hunts and independent verification●ULTRACODE — Claude Code's new ultracode setting sits in the effort menu, fixing effort to xhigh while Claude decides when to run a workflow●OPUS4.8 — Claude Opus 4.8 is settled in as the default across major plans, with stronger coding, agentic, and reasoning skills●WWDC — WWDC 2026 confirms Siri runs on Google Gemini; third-party handoff to ChatGPT is dropped, and Siri AI won't ship in the EU under the DMA at iOS 27●BILLING — 6 days until the Jun 15 change: Agent SDK, headless Claude Code, GitHub Actions, and third-party agents move to API-rate monthly credit●OUTAGE — claude.ai, Claude Code, and Cowork saw an outage (Jun). Scheduled runs are safest when built around fallbackModel and retries●DYNAMIC-WORKFLOWS — Dynamic workflows are on by default on Max/Team and the API, for codebase-wide bug hunts and independent verification●ULTRACODE — Claude Code's new ultracode setting sits in the effort menu, fixing effort to xhigh while Claude decides when to run a workflow●OPUS4.8 — Claude Opus 4.8 is settled in as the default across major plans, with stronger coding, agentic, and reasoning skills
Cowork × Python Data Analysis Automation: Build a Weekly Report Pipeline with pandas and Matplotlib
Learn how to combine Cowork's scheduled tasks with Python to fully automate your data analysis reports. From pandas aggregation to Matplotlib charts and Google Sheets integration — a complete, production-ready implementation guide.
Every Monday morning, the same ritual: open the CSV, build a pivot table, copy the chart into a slide deck, send it to the team. Sound familiar?
Nearly all of that work can be automated with Cowork and Python. Imagine opening Slack on Monday morning to find last week's KPI report already waiting — generated while you slept. That's what this guide builds.
This is not a Python basics tutorial. It's a practical design guide for building a production-grade data pipeline by combining Cowork's scheduled tasks, bash environment, and Claude AI's analytical capabilities.
Why Automate Data Analysis in Cowork?
"If you know Python, just set up a cron job on a server" — that's a fair point. But Cowork offers something cron doesn't: native Claude AI integration.
For pure aggregation and charting, cron works fine. But with Cowork, you can pass your aggregated results directly to Claude and ask it to explain "what changed compared to last week" and "possible causes for anomalies" in plain language — then inject that analysis into the top of your report. No extra infrastructure required.
The other advantage is access to your local Mac environment. Since Cowork runs as a desktop app, it can read local CSV files, access files on network drives, and interact with any tools installed on your machine — all without special configuration.
Here's the pipeline we'll build:
Fetch data from local CSVs or Google Sheets via Python
Aggregate and preprocess with pandas
Generate charts with Matplotlib (PNG output)
Let Claude AI analyze week-over-week changes and write a commentary
Save as a Markdown report and send a Slack notification
Run automatically every Monday morning via Cowork's scheduled tasks
Step 1: Set Up the Python Environment in Cowork
Cowork has direct access to your Mac's shell environment. Start by checking what's available and installing any missing libraries.
Open a new Cowork conversation and prompt:
Check my bash environment. Show me python3 --version, pip3 --version,
and whether pandas, matplotlib, and openpyxl are installed.
What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.
WHAT YOU'LL LEARN
✦If you've been manually summarizing spreadsheets, you'll learn how to set up Cowork's scheduled tasks to auto-generate weekly reports every Monday morning
✦Get complete, copy-paste-ready pipeline code combining pandas, Matplotlib, and Google Sheets API — no stub code, everything runs
✦Learn production-ready patterns including error handling, logging, and Slack notifications so your automation keeps running reliably
Secure payment via Stripe · Cancel anytime
Step 2: Designing the Data Ingestion Layer
The structure of your ingestion script depends on your data source. Here are implementations for the two most common cases: local CSVs and Google Sheets.
Pattern A: Loading Local CSV Files
For data exported as CSV files from sales tools or analytics platforms:
# data_loader.pyimport pandas as pdfrom pathlib import Pathfrom datetime import datetime, timedeltaimport logginglogging.basicConfig( level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')logger = logging.getLogger(__name__)def load_weekly_data(data_dir: str, weeks_back: int = 2) -> pd.DataFrame: """ Load and combine CSVs from the past N weeks in the specified directory. Expected filename format: sales_YYYY-MM-DD.csv """ data_path = Path(data_dir) today = datetime.now() dfs = [] for i in range(weeks_back * 7): target_date = today - timedelta(days=i) filename = f"sales_{target_date.strftime('%Y-%m-%d')}.csv" filepath = data_path / filename if filepath.exists(): try: df = pd.read_csv(filepath, encoding='utf-8-sig') df['source_file'] = filename df['record_date'] = target_date.strftime('%Y-%m-%d') dfs.append(df) logger.info(f"Loaded: {filename} ({len(df)} rows)") except Exception as e: logger.warning(f"Failed to load {filename}: {e}") if not dfs: raise FileNotFoundError(f"No CSV files found in {data_path} for past {weeks_back} weeks") combined = pd.concat(dfs, ignore_index=True) logger.info(f"Total rows loaded: {len(combined)}") return combineddef validate_dataframe(df: pd.DataFrame, required_columns: list) -> bool: """ Verify required columns exist and the DataFrame isn't empty. Always validate before processing — don't assume the data is clean. """ missing = [col for col in required_columns if col not in df.columns] if missing: logger.error(f"Missing required columns: {missing}") return False if df.empty: logger.error("DataFrame is empty") return False null_counts = df[required_columns].isnull().sum() if null_counts.any(): logger.warning(f"Null values found:\n{null_counts[null_counts > 0]}") return True
Pattern B: Loading from Google Sheets
For teams that maintain their source data in shared spreadsheets:
# sheets_loader.pyimport gspreadfrom google.oauth2.service_account import Credentialsimport pandas as pdimport loggingfrom pathlib import Pathlogger = logging.getLogger(__name__)SCOPES = [ 'https://www.googleapis.com/auth/spreadsheets.readonly', 'https://www.googleapis.com/auth/drive.readonly']def get_sheets_client(credentials_path: str) -> gspread.Client: """ Build a Google Sheets client from a service account credentials file. Store credentials.json at ~/data-analysis-automation/credentials/ """ creds = Credentials.from_service_account_file( credentials_path, scopes=SCOPES ) return gspread.authorize(creds)def load_sheet_as_dataframe( client: gspread.Client, spreadsheet_id: str, sheet_name: str, header_row: int = 1) -> pd.DataFrame: """ Load a Google Sheets worksheet as a DataFrame. Args: spreadsheet_id: The ID from the spreadsheet URL (between /d/ and /edit) sheet_name: Name of the tab to load header_row: Row number of the header (1-indexed) """ try: spreadsheet = client.open_by_key(spreadsheet_id) worksheet = spreadsheet.worksheet(sheet_name) data = worksheet.get_all_records(head=header_row) df = pd.DataFrame(data) logger.info(f"Loaded from Sheets: {sheet_name} ({len(df)} rows)") return df except gspread.exceptions.APIError as e: logger.error(f"Sheets API error: {e}") raise except gspread.exceptions.WorksheetNotFound: logger.error(f"Worksheet '{sheet_name}' not found") raise
Step 3: Weekly Aggregation with pandas
Once your data is loaded, it's time to aggregate. Here are the patterns that cover most real-world use cases:
# aggregator.pyimport pandas as pdimport numpy as npfrom datetime import datetime, timedeltaimport logginglogger = logging.getLogger(__name__)def compute_weekly_summary(df: pd.DataFrame, date_col: str, value_col: str) -> dict: """ Calculate this week's and last week's totals, along with the change rate. Returns: { 'this_week': {'total': ..., 'daily_avg': ..., 'max': ...}, 'last_week': {'total': ..., 'daily_avg': ..., 'max': ...}, 'change_rate': 0.12 # +12% } """ df[date_col] = pd.to_datetime(df[date_col]) today = datetime.now().date() # Set week start to Monday this_week_start = today - timedelta(days=today.weekday() + 7) this_week_end = this_week_start + timedelta(days=6) last_week_start = this_week_start - timedelta(days=7) last_week_end = this_week_start - timedelta(days=1) this_week_df = df[ (df[date_col].dt.date >= this_week_start) & (df[date_col].dt.date <= this_week_end) ] last_week_df = df[ (df[date_col].dt.date >= last_week_start) & (df[date_col].dt.date <= last_week_end) ] def summarize(week_df: pd.DataFrame) -> dict: if week_df.empty: return {'total': 0, 'daily_avg': 0, 'max': 0, 'min': 0} values = week_df[value_col].dropna() return { 'total': float(values.sum()), 'daily_avg': float(values.mean()), 'max': float(values.max()), 'min': float(values.min()), 'count': len(values) } this_week_stats = summarize(this_week_df) last_week_stats = summarize(last_week_df) # Guard against zero division if last_week_stats['total'] \!= 0: change_rate = (this_week_stats['total'] - last_week_stats['total']) / last_week_stats['total'] else: change_rate = None return { 'this_week': this_week_stats, 'last_week': last_week_stats, 'change_rate': change_rate, 'period': { 'this_week': f"{this_week_start} to {this_week_end}", 'last_week': f"{last_week_start} to {last_week_end}" } }def detect_anomalies(df: pd.DataFrame, value_col: str, threshold: float = 2.0) -> pd.DataFrame: """ Flag values outside mean ± N standard deviations as anomalies. threshold=2.0 covers roughly 95% of a normal distribution. This saves the manual "did something weird happen last Tuesday?" check. """ mean = df[value_col].mean() std = df[value_col].std() lower = mean - threshold * std upper = mean + threshold * std anomalies = df[(df[value_col] < lower) | (df[value_col] > upper)].copy() anomalies['anomaly_type'] = np.where( anomalies[value_col] > upper, 'HIGH', 'LOW' ) if not anomalies.empty: logger.warning(f"Detected {len(anomalies)} anomalies") return anomalies
Step 4: Auto-Generating Charts with Matplotlib
This section shows how to generate clean, production-ready charts. Font configuration is a common early stumbling block, so the setup below handles it correctly from the start.
# chart_generator.pyimport matplotlibmatplotlib.use('Agg') # Must be set before any other matplotlib imports for headless environmentsimport matplotlib.pyplot as pltimport matplotlib.dates as mdatesfrom matplotlib import rcParamsimport pandas as pdfrom pathlib import Pathimport loggingfrom datetime import datetimelogger = logging.getLogger(__name__)# Font configuration for macOS (Cowork runs on Mac)rcParams['font.family'] = ['Hiragino Sans', 'Helvetica Neue', 'sans-serif']rcParams['axes.unicode_minus'] = False # Prevents minus sign rendering issuesdef generate_weekly_trend_chart( df: pd.DataFrame, date_col: str, value_col: str, title: str, output_dir: str, filename: str = "weekly_trend.png") -> str: """ Generate a daily line chart and save it as PNG. Returns: Path to the saved file """ output_path = Path(output_dir) output_path.mkdir(parents=True, exist_ok=True) fig, ax = plt.subplots(figsize=(12, 5)) df = df.copy() df[date_col] = pd.to_datetime(df[date_col]) df_sorted = df.sort_values(date_col) daily = df_sorted.groupby(df_sorted[date_col].dt.date)[value_col].sum().reset_index() daily.columns = ['date', 'value'] daily['date'] = pd.to_datetime(daily['date']) ax.plot(daily['date'], daily['value'], marker='o', linewidth=2, color='#2563eb', markersize=6) ax.fill_between(daily['date'], daily['value'], alpha=0.1, color='#2563eb') ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d')) ax.xaxis.set_major_locator(mdates.DayLocator(interval=1)) plt.setp(ax.xaxis.get_majorticklabels(), rotation=45, ha='right') ax.set_title(title, fontsize=14, fontweight='bold', pad=15) ax.set_xlabel('Date', fontsize=11) ax.set_ylabel(value_col, fontsize=11) ax.grid(True, alpha=0.3) # Annotate max and min max_idx = daily['value'].idxmax() ax.annotate( f"Peak: {daily.loc[max_idx, 'value']:,.0f}", xy=(daily.loc[max_idx, 'date'], daily.loc[max_idx, 'value']), xytext=(10, 10), textcoords='offset points', fontsize=9, color='#dc2626', arrowprops=dict(arrowstyle='->', color='#dc2626') ) plt.tight_layout() save_path = output_path / filename fig.savefig(save_path, dpi=150, bbox_inches='tight') plt.close(fig) # Always close to prevent memory leaks in long-running tasks logger.info(f"Chart saved: {save_path}") return str(save_path)def generate_comparison_bar_chart( summary: dict, metric_name: str, output_dir: str, filename: str = "comparison.png") -> str: """ Generate a this-week vs last-week comparison bar chart. """ output_path = Path(output_dir) output_path.mkdir(parents=True, exist_ok=True) fig, ax = plt.subplots(figsize=(8, 5)) categories = ['Last Week', 'This Week'] values = [summary['last_week']['total'], summary['this_week']['total']] colors = ['#94a3b8', '#2563eb'] bars = ax.bar(categories, values, color=colors, width=0.5, edgecolor='white', linewidth=1.5) if summary['change_rate'] is not None: rate = summary['change_rate'] * 100 arrow = "↑" if rate > 0 else "↓" color = "#16a34a" if rate > 0 else "#dc2626" ax.text( 1, values[1] + max(values) * 0.02, f"{arrow} {abs(rate):.1f}%", ha='center', va='bottom', fontsize=14, fontweight='bold', color=color ) for bar, val in zip(bars, values): ax.text( bar.get_x() + bar.get_width() / 2, bar.get_height() + max(values) * 0.01, f"{val:,.0f}", ha='center', va='bottom', fontsize=11, fontweight='bold' ) ax.set_title(f"{metric_name} — Weekly Comparison", fontsize=14, fontweight='bold', pad=15) ax.set_ylabel(metric_name, fontsize=11) ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{x:,.0f}')) ax.grid(True, axis='y', alpha=0.3) ax.spines['top'].set_visible(False) ax.spines['right'].set_visible(False) plt.tight_layout() save_path = output_path / filename fig.savefig(save_path, dpi=150, bbox_inches='tight') plt.close(fig) logger.info(f"Comparison chart saved: {save_path}") return str(save_path)
Step 5: Letting Claude AI Analyze the Data
This is where Cowork pulls ahead of a plain cron job. You can pass aggregated results to Claude and get a human-readable business analysis in return.
In your scheduled task prompt, include something like this:
Read the following data analysis results and explain their business significance
in 3–5 sentences. Avoid jargon — write so that a non-technical manager can
immediately understand what happened and why it matters.
[Summary Period]
This week: {this_week_period}
Last week: {last_week_period}
[Key KPIs]
- This week total: {this_week_total}
- Last week total: {last_week_total}
- Week-over-week change: {change_rate}%
- This week daily average: {this_week_avg}
[Anomaly Information]
{anomaly_info}
Format your response as a Markdown blockquote (> ...).
Claude's response goes directly into the report header — no copy-pasting required.
Step 6: Report Generation and Slack Notification
Bring everything together into a Markdown report and push a summary to Slack:
# reporter.pyimport jsonimport loggingimport requestsfrom datetime import datetimefrom pathlib import Pathlogger = logging.getLogger(__name__)def generate_markdown_report( summary: dict, ai_comment: str, chart_paths: list, output_dir: str) -> str: """ Assemble the full Markdown report and save it to disk. """ today = datetime.now().strftime('%Y-%m-%d') report_path = Path(output_dir) / f"weekly_report_{today}.md" change_rate = summary.get('change_rate') change_emoji = "📈" if (change_rate or 0) > 0 else "📉" change_text = "increase" if (change_rate or 0) > 0 else "decrease" change_rate_str = f"{abs((change_rate or 0) * 100):.1f}%" content = f"""# Weekly Report ({today})## AI Analysis Summary{ai_comment}---## KPI Summary- **This week total**: {summary['this_week']['total']:,.0f}- **Last week total**: {summary['last_week']['total']:,.0f}- **Week-over-week**: {change_emoji} {change_rate_str} {change_text}- **This week daily average**: {summary['this_week']['daily_avg']:,.1f}- **This week peak**: {summary['this_week']['max']:,.0f}**Periods**- This week: {summary['period']['this_week']}- Last week: {summary['period']['last_week']}---## Generated Charts""" for path in chart_paths: content += f"- `{path}`\n" content += f"""---*This report was auto-generated by the Cowork + Python data pipeline.**Generated at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}*""" report_path.write_text(content, encoding='utf-8') logger.info(f"Report saved: {report_path}") return str(report_path)def notify_slack( webhook_url: str, summary: dict, ai_comment: str, report_path: str) -> bool: """ Send a concise summary to Slack via Incoming Webhook. Send just the highlights — let the full report be the source of truth. """ change_rate = summary.get('change_rate') if change_rate is not None: change_text = f"{'📈' if change_rate > 0 else '📉'} {abs(change_rate * 100):.1f}% {'up' if change_rate > 0 else 'down'}" else: change_text = "No comparison data" payload = { "blocks": [ { "type": "header", "text": { "type": "plain_text", "text": f"📊 Weekly Report — {datetime.now().strftime('%Y/%m/%d')}" } }, { "type": "section", "fields": [ { "type": "mrkdwn", "text": f"*This Week Total*\n{summary['this_week']['total']:,.0f}" }, { "type": "mrkdwn", "text": f"*Week-over-Week*\n{change_text}" } ] }, { "type": "section", "text": { "type": "mrkdwn", "text": f"*AI Analysis:*\n{ai_comment[:300]}..." } }, { "type": "context", "elements": [ { "type": "mrkdwn", "text": f"Full report: `{report_path}`" } ] } ] } try: response = requests.post(webhook_url, json=payload, timeout=10) response.raise_for_status() logger.info("Slack notification sent") return True except requests.exceptions.RequestException as e: logger.error(f"Slack notification failed: {e}") return False # Notification failure shouldn't fail the whole pipeline
Step 7: Registering the Scheduled Task in Cowork
Wire everything into a single entry point, then register it as a Cowork scheduled task:
In Cowork's Scheduler, create a new task with this prompt:
Every Monday at 7:00 AM, run the weekly data analysis report pipeline.
bash: cd ~/data-analysis-automation && .venv/bin/python main_pipeline.py
After execution, check the tail of the log file at
~/data-analysis-automation/logs/pipeline_YYYYMMDD.log
and report whether it succeeded or failed. If it failed, summarize the error.
Here are the three issues that most reliably cause headaches in production.
1. Matplotlib crashes with NSException in scheduled tasks
When Matplotlib runs in a headless environment (no display), it tries to initialize a GUI backend and crashes on macOS. The fix is to call matplotlib.use('Agg') at the very top of your script, before any other matplotlib imports. If you forget this, you'll hit a maddening scenario where the script works fine when you run it manually but fails every time in the scheduled task.
2. SettingWithCopyWarning leads to silent data corruption
# ❌ This triggers SettingWithCopyWarning and may not modify the originaldf_filtered = df[df['status'] == 'active']df_filtered['new_col'] = 'value'# ✅ Always use .copy() after filteringdf_filtered = df[df['status'] == 'active'].copy()df_filtered['new_col'] = 'value'
This warning often seems harmless but can silently fail to write the value you expect. Make it a habit: any time you filter a DataFrame and then add a column, use .copy() first.
3. Silent failures in Slack notification hide pipeline errors
A common mistake is letting the pipeline fail silently when Slack notifications don't go through. The notify_slack function above returns a boolean rather than raising an exception — this is intentional. A Slack API hiccup shouldn't cause your entire Monday morning report to not exist. Structure your error handling so that communication failures are logged but don't cascade into pipeline failures.
Writing Back to Google Sheets
For teams that prefer tracking KPIs in a shared spreadsheet rather than Markdown files, add a write-back step:
def write_summary_to_sheet(client, spreadsheet_id: str, sheet_name: str, summary: dict): """ Append this week's summary row to the specified Google Sheet. Existing data is never overwritten — new rows are always appended. """ worksheet = client.open_by_key(spreadsheet_id).worksheet(sheet_name) today = datetime.now().strftime('%Y-%m-%d') change_rate = summary.get('change_rate') row = [ today, summary['this_week']['total'], summary['last_week']['total'], f"{change_rate * 100:.2f}%" if change_rate is not None else "N/A", summary['this_week']['daily_avg'], summary['this_week']['max'], ] worksheet.append_row(row, value_input_option='USER_ENTERED') logger.info(f"Appended row to Google Sheets: {today}")
What to Do with the Time You Get Back
Once this pipeline is running, you'll reclaim 30–60 minutes every week that used to go to manual report assembly.
The question worth sitting with is: what do you do with that time?
The part of the process that still requires a human is the judgment — noticing that a number looks wrong for a reason the data doesn't capture, deciding what the trend means for next month's plan, choosing which anomaly is worth investigating and which to ignore. Claude AI can flag the anomalies and summarize the trends, but it can't carry the context of your business the way you do.
Automation clears the mechanical work so you can focus on that judgment. That's the actual payoff here — not just the saved hours, but the mental space to use the data rather than just produce it.
Share
Thank You for Reading
Claude Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.