Governed Sports Analytics, Exposed to AI Agents

A Python ETL pipeline builds a documented, governed data model from raw ticketing, attendance and fan engagement data. A Model Context Protocol (MCP) server then exposes that model as structured tools, so any AI agent or assistant can query it safely without touching the raw source data.

Pipeline Architecture
Extract (Raw Sources): Ticketing, attendance & fan engagement extracts: games.csv, ticket_sales.csv, fan_engagement.csv
Transform (Python ETL): Joins the source extracts, aggregates seat-tier sales to game grain, computes derived metrics, and emits a data dictionary
Govern (Curated Model): game_performance and seat_tier_sales tables, each with documented grain, keys & column definitions
Serve (MCP Server): Four typed tools exposed over stdio, so any MCP client (Claude, agents, BI copilots) can query the model safely
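The Transform step's roll-up from seat-tier rows to game grain could look roughly like the sketch below. It is a minimal illustration using only the standard library, not the pipeline's actual code: the column names (seats_available, tickets_sold, ticket_revenue), the sample rows, and the definition of sellthrough as tickets sold divided by seats available are all assumptions.

```python
from collections import defaultdict

# Hypothetical seat-tier rows; column names and values are assumptions,
# not the real seat_tier_sales schema.
seat_tier_sales = [
    {"game_id": "G001", "seat_tier": "lower_bowl", "seats_available": 100,
     "tickets_sold": 90, "ticket_revenue": 9000.0},
    {"game_id": "G001", "seat_tier": "upper_bowl", "seats_available": 200,
     "tickets_sold": 150, "ticket_revenue": 6000.0},
    {"game_id": "G002", "seat_tier": "lower_bowl", "seats_available": 100,
     "tickets_sold": 100, "ticket_revenue": 11000.0},
]


def aggregate_to_game_grain(rows):
    """Roll seat-tier rows up to one row per game and derive sellthrough."""
    totals = defaultdict(
        lambda: {"seats_available": 0, "tickets_sold": 0, "ticket_revenue": 0.0}
    )
    for row in rows:
        game = totals[row["game_id"]]
        game["seats_available"] += row["seats_available"]
        game["tickets_sold"] += row["tickets_sold"]
        game["ticket_revenue"] += row["ticket_revenue"]

    result = []
    for game_id in sorted(totals):
        game = totals[game_id]
        # Assumed definition: sellthrough = tickets sold / seats available.
        sellthrough = (
            round(game["tickets_sold"] / game["seats_available"], 3)
            if game["seats_available"]
            else None
        )
        result.append({"game_id": game_id, **game, "sellthrough": sellthrough})
    return result


game_performance = aggregate_to_game_grain(seat_tier_sales)
```

Summing the numerator and denominator before dividing keeps the game-level sellthrough weighted by tier size, which a simple average of per-tier rates would not.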
MCP Tools Exposed
list_games(opponent_tier?, min_sellthrough?, promo_night_only?, limit)
Browse games with optional filters — returns sellthrough, attendance, revenue per game.
get_game_detail(game_id)
Full fact + seat-tier breakdown for one game, including pricing and channel mix per tier.
query_revenue_summary(opponent_tier?, promo_night_only?)
Aggregate revenue, sellthrough, attendance & satisfaction — answers comparison questions without raw rows.
get_data_dictionary()
Self-describing schema — grain, primary keys, and column definitions an agent can introspect first.
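To make the optional filters concrete, here is a hedged Python sketch of the logic that list_games and query_revenue_summary might share. It is not the server's implementation (the server itself is a JavaScript file): the in-memory rows, the field names, the "standard" tier label, and the default limit of 10 are all hypothetical.

```python
from typing import Optional

# Hypothetical game-grain rows; field names and values are illustrative only.
GAMES = [
    {"game_id": "G001", "opponent_tier": "marquee", "promo_night": True,
     "sellthrough": 0.98, "attendance": 18000, "revenue": 3_000_000},
    {"game_id": "G002", "opponent_tier": "standard", "promo_night": False,
     "sellthrough": 0.80, "attendance": 15000, "revenue": 2_000_000},
    {"game_id": "G003", "opponent_tier": "marquee", "promo_night": False,
     "sellthrough": 0.96, "attendance": 17800, "revenue": 3_200_000},
]


def _matches(game: dict, opponent_tier: Optional[str],
             min_sellthrough: Optional[float], promo_night_only: bool) -> bool:
    """Apply the optional filters; a filter left unset matches every game."""
    if opponent_tier is not None and game["opponent_tier"] != opponent_tier:
        return False
    if min_sellthrough is not None and game["sellthrough"] < min_sellthrough:
        return False
    if promo_night_only and not game["promo_night"]:
        return False
    return True


def list_games(opponent_tier: Optional[str] = None,
               min_sellthrough: Optional[float] = None,
               promo_night_only: bool = False,
               limit: int = 10) -> list:
    """Return at most `limit` games that pass the filters (default is assumed)."""
    matched = [g for g in GAMES
               if _matches(g, opponent_tier, min_sellthrough, promo_night_only)]
    return matched[:limit]


def query_revenue_summary(opponent_tier: Optional[str] = None,
                          promo_night_only: bool = False) -> dict:
    """Aggregate the matching games so the caller never sees raw rows."""
    matched = [g for g in GAMES
               if _matches(g, opponent_tier, None, promo_night_only)]
    if not matched:
        return {"games_matched": 0}
    n = len(matched)
    total_revenue = sum(g["revenue"] for g in matched)
    return {
        "games_matched": n,
        "total_revenue": total_revenue,
        "avg_revenue_per_game": round(total_revenue / n),
        "avg_sellthrough": round(sum(g["sellthrough"] for g in matched) / n, 3),
        "avg_attendance": round(sum(g["attendance"] for g in matched) / n),
    }


marquee_summary = query_revenue_summary(opponent_tier="marquee")
```

Treating an unset filter as "match everything" lets an agent start broad and narrow down, while the summary tool returns only aggregates.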
Sample Agent Query (Live Test Output)
mcp_server/server.js — stdio transport
// agent calls query_revenue_summary({ opponent_tier: "marquee" })
{
  "games_matched": 9,
  "total_revenue": 30607244,
  "avg_revenue_per_game": 3400805,
  "avg_sellthrough": 0.986,
  "avg_attendance": 18280,
  "avg_fan_satisfaction": 8.06
}
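On the wire, an MCP client sends such a call as a JSON-RPC 2.0 request with the method tools/call, and the stdio transport carries each message as a single newline-delimited line of JSON. The Python sketch below shows what that framing might look like from the client side. The helper names build_tool_call and parse_tool_result are hypothetical, and it assumes the server returns its payload as JSON inside a text content item, which the source does not confirm.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # Compact separators keep the message on one line, as stdio framing expects.
    return json.dumps(message, separators=(",", ":")) + "\n"


def parse_tool_result(line: str) -> dict:
    """Extract the JSON payload from the first text content item of a response.

    Assumes the server serialises its result as JSON text; a real server may
    return a different content layout.
    """
    response = json.loads(line)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "tool call failed"))
    first_item = response["result"]["content"][0]
    return json.loads(first_item["text"])


request_line = build_tool_call(
    1, "query_revenue_summary", {"opponent_tier": "marquee"}
)

# A response of the assumed shape, carrying two fields from the sample output.
response_line = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text",
             "text": json.dumps({"games_matched": 9, "total_revenue": 30607244})}
        ]
    },
})
summary = parse_tool_result(response_line)
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for reading transport logs or debugging a server by hand.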
Season Snapshot (Underlying Data Model)
Home Games: 41 (2025–26 season)
Total Revenue: $114.5M (tickets + concessions + merch)
Avg Sellthrough: 85.8% (across all seat tiers)
Avg Attendance: 16,443 (of 18,418 capacity)
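Two ratios follow directly from these snapshot figures, worked through below as a quick check. The readings are assumptions: fill rate is taken here as average attendance divided by capacity, which is a different metric from the 85.8% sellthrough, and the per-game revenue is approximate because $114.5M is itself a rounded total.

```python
# Figures copied from the season snapshot.
home_games = 41
total_revenue = 114.5e6  # rounded total, so derived values are approximate
avg_attendance = 16_443
capacity = 18_418

# Assumed definition: share of seats filled on an average night.
fill_rate = avg_attendance / capacity

# Approximate revenue per home game across all opponent tiers.
avg_revenue_per_game = total_revenue / home_games

print(f"Fill rate: {fill_rate:.1%}")                                  # 89.3%
print(f"Avg revenue per game: ${avg_revenue_per_game / 1e6:.2f}M")    # $2.79M
```

Against that season-wide average of roughly $2.79M per game, the marquee tier's $3.40M average in the sample query output stands out.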
[Chart: Avg Revenue by Opponent Tier]
[Chart: Sellthrough Rate by Opponent Tier]
Game Performance (Sample Rows from Governed Model)
Game ID | Date | Opponent | Tier | Sellthrough | Attendance | Revenue | Satisfaction