My AI Sports Analyst: How I Wake Up to World Cup Insights Every Morning
How I built a scheduled AI agent that collects World Cup 2026 match stats, generates tournament predictions, and delivers a morning briefing while I sleep.
The FIFA World Cup 2026 kicked off on June 11th. And I had a bit of a problem.
Most of the matches are played in the Americas. That means evening kickoffs in Mexico, the US, and Canada translate to the middle of the night here in Israel. I’m not staying up until 3 AM to watch group stage matches. But I also don’t want to wake up, grab my phone, and spend 20 minutes scrolling through sports apps piecing together what happened.
So I built myself a personal sports analyst. One that wakes up before I do, scours the internet for match results, collects detailed statistics, and even makes predictions about who’s going to win the whole thing.
And it takes me zero effort every morning.
The Setup
I’m using Amazon Quick’s scheduled agents feature. If you’re not familiar, it lets you create an AI agent with a specific prompt, give it access to tools (web search, file read/write, etc.), and set it on a schedule. The agent runs autonomously at the time you specify, does its thing, and posts the results to your activity feed.
My agent is called wc2026-daily-stats. It runs every day at 9:00 AM Israel time. By the time I’m pouring my first coffee, the results are already waiting for me.
What It Actually Does
The agent has a three-part workflow:
Part 1: Collecting Match Stats
Every morning, the agent:
- Checks what day it is
- Searches the web for “FIFA World Cup 2026 results” from the previous day
- Digs deeper on each match it finds by searching for detailed box score statistics on sports sites
- Fetches those pages and extracts everything: possession percentages, shots on target, xG (expected goals), goal scorers with timestamps, cards, saves, corners, the works
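The agent works out the date and the search string on its own from the prompt. For anyone who would rather script that first step, here is a minimal sketch. The function name `results_query` is hypothetical, and the date style simply mirrors a query like "FIFA World Cup 2026 results June 22, 2026".

```python
from datetime import date, timedelta

def results_query(today: date) -> str:
    """Build the discovery search string for the previous day's matches."""
    yesterday = today - timedelta(days=1)
    # Spell the date out as "June 22, 2026" (month name, day without padding, year).
    stamp = f"{yesterday:%B} {yesterday.day}, {yesterday.year}"
    return f"FIFA World Cup 2026 results {stamp}"

print(results_query(date(2026, 6, 23)))
# FIFA World Cup 2026 results June 22, 2026
```

Passing the date in, rather than calling `date.today()` inside the function, keeps the helper easy to check for any day of the tournament.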
The level of detail is honestly better than what I’d get casually browsing a sports app. Here’s what a typical match entry looks like in my stats file:
## Match 4: United States 4-1 Paraguay
**Date:** June 13, 2026 | **Group D** | **Venue:** SoFi Stadium, Inglewood
### Goal Scorers
| Team | Player | Minute |
|------|--------|--------|
| USA | Damián Bobadilla (OG) | 7' |
| USA | Folarin Balogun | 31' |
| USA | Folarin Balogun | 45'+5' |
| Paraguay | Mauricio | 73' |
| USA | Giovanni Reyna | 90'+8' |
### Match Statistics
| Statistic | United States | Paraguay |
|-----------|--------------|----------|
| Possession | ~58% | ~42% |
| Total Shots | ~22 | — |
| xG | ~2.8 | — |
Every match gets this treatment. After 12 days of the tournament, I have 40 matches catalogued with full stats.
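Because every entry starts with the same kind of H2 line, the file is easy to index mechanically. The sketch below is one possible way to do it, not something the agent runs. The regex and the `parse_headers` name are assumptions, and the pattern accepts both the "4-1" and "4 - 1" score styles.

```python
import re

# Matches "## Match 4: United States 4-1 Paraguay" and the spaced "4 - 1" variant.
HEADER = re.compile(
    r"^## Match (?P<n>\d+): (?P<home>.+?) (?P<hg>\d+)\s*-\s*(?P<ag>\d+) (?P<away>.+)$"
)

def parse_headers(markdown: str) -> list[dict]:
    """Return one record per match header found in the stats file text."""
    matches = []
    for line in markdown.splitlines():
        m = HEADER.match(line.strip())
        if m:
            matches.append({
                "number": int(m["n"]),
                "home": m["home"],
                "away": m["away"],
                "score": (int(m["hg"]), int(m["ag"])),
            })
    return matches

sample = "## Match 4: United States 4-1 Paraguay\n**Date:** June 13, 2026 | **Group D**\n"
print(parse_headers(sample))
```

Counting the returned records is a quick sanity check that the number of catalogued matches is what you expect.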
Part 2: The Prediction Engine
This is the part I find most fun.
After collecting the day’s stats, the agent reads the entire accumulated stats file (all 40+ matches so far) and produces an updated prediction for which two teams will make the final.
It’s not just “pick the favorites.” The agent weighs multiple factors:
- Current tournament form: goals scored vs. conceded, xG performance
- Quality of opposition: beating Germany is worth more than thrashing Curaçao 7-1
- Squad depth: how many different scorers? Are substitutes making an impact?
- Tournament pedigree: have these teams delivered at World Cups before?
- Tactical solidity: clean sheets, defensive organization
- Mentality indicators: comebacks, late winners, composure under pressure
- Home advantage: this matters in the US/Mexico/Canada venues
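The agent weighs these factors in prose, not with a formula. Still, the quality-of-opposition idea is easy to make concrete. The sketch below is purely illustrative: the `opponent_strength` rating (0 to 1) is a hypothetical input, and the numbers are made up to show the effect, not taken from the stats file.

```python
from dataclasses import dataclass

@dataclass
class Result:
    goals_for: int
    goals_against: int
    opponent_strength: float  # hypothetical rating from 0 (weakest) to 1 (strongest)

def weighted_goal_difference(results: list[Result]) -> float:
    """Scale each match's goal difference by how strong the opponent was."""
    return sum((r.goals_for - r.goals_against) * r.opponent_strength for r in results)

# Illustrative numbers only: a 7-1 rout of a weak side vs. a 2-0 win over a strong one.
rout = [Result(7, 1, 0.2)]
solid = [Result(2, 0, 0.9)]
print(weighted_goal_difference(solid) > weighted_goal_difference(rout))  # True
```

Under this weighting the modest win over a strong team scores higher than the lopsided one, which is the same judgment the agent expresses in words.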
The prediction comes with a confidence percentage that increases as more data accumulates. It started around 30% after the first few matches and is currently at 48% with two matches per team analyzed.
Right now? The agent is predicting an Argentina vs France final. Messi has 5 goals in 2 matches (all-time World Cup leading scorer at 38 years old), and Mbappé has 4. The agent also tracks a “Changes from yesterday” section explaining why the prediction shifted. Two days ago it was Germany vs Argentina. France earned the upgrade after a clinical 3-0 against Iraq.
It even picks dark horses. Currently watching Norway (Haaland with 4 goals) and Japan (came back twice against the Netherlands).
Part 3: The Morning Notification
Finally, the agent posts a summary to my activity feed. It includes:
- How many matches were played yesterday
- Final scores
- One standout stat per match
- The current prediction with a one-line explanation
So when I open Amazon Quick in the morning, there’s a notification waiting: “3 matches yesterday. France 3-0 Iraq (Mbappé brace, now has 16 career WC goals). 🔮 Prediction: Argentina vs France. Messi and Mbappé on a collision course for a 2022 final rematch.”
That’s it. I’m up to speed in 10 seconds.
How the Data is Stored
Everything lives in two local markdown files:
- wc2026_all_match_stats.md is the running log. Every match gets appended to the end with detailed stats. It’s currently at 40 matches and about 68KB. The agent reads the existing file, appends new matches, and writes it back.
- wc2026_final_prediction.md gets completely rewritten each day. It contains the current standings, top 10 contenders with key metrics, the predicted finalists with detailed reasoning, confidence level, dark horses, and a Golden Boot tracker.
Both are just plain markdown files sitting in my Documents folder. Nothing fancy. I can open them anytime and read through the full tournament history or check the latest prediction.
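The two files follow different write patterns: one grows, the other is a snapshot. The agent does this with its file tools, but the same idea in plain Python might look like the sketch below. Both function names are hypothetical.

```python
from pathlib import Path

def append_matches(stats_file: Path, new_sections: list[str]) -> None:
    """Read the whole log, add new match sections at the end, write it all back."""
    existing = stats_file.read_text(encoding="utf-8") if stats_file.exists() else ""
    parts = [existing.rstrip("\n")] if existing.strip() else []
    parts.extend(section.strip("\n") for section in new_sections)
    # Sections are separated by one blank line; the file ends with a single newline.
    stats_file.write_text("\n\n".join(parts) + "\n", encoding="utf-8")

def rewrite_prediction(prediction_file: Path, body: str) -> None:
    """The prediction is a snapshot, so yesterday's content is simply replaced."""
    prediction_file.write_text(body.rstrip("\n") + "\n", encoding="utf-8")
```

Rewriting the whole log on every run is fine at tens of kilobytes. It would be worth revisiting only if the file grew by orders of magnitude.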
The Technical Bits
For those who want to know what’s under the hood:
Why Web Scraping and Not a Sports API?
This is the question every developer asks. “Why not just use a football stats API?”
I tried. Trust me, I tried.
API-Football (api-sports.io) is the most popular one. Free tier gives you 100 requests per day. Sounds great. Except their free tier is locked to seasons 2022-2024. The moment you query for 2026 World Cup data, you get: "Free plans do not have access to this season, try from 2022 to 2024." So unless I wanted to pay for a subscription for a month-long tournament, that was out.
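If you want to detect that restriction in a script, a check like the one below could work. It is a sketch under an assumption: that the message arrives under a `plan` key inside the response body's `errors` field, and that `errors` is an empty list when nothing is wrong. Verify that shape against the provider's documentation before relying on it. The helper name `plan_error` is hypothetical.

```python
from typing import Optional

def plan_error(payload: dict) -> Optional[str]:
    """Return the plan-restriction message from a response body, if there is one."""
    errors = payload.get("errors") or {}
    # Assumed shape: an empty list when all is well, a dict of messages otherwise.
    if isinstance(errors, dict):
        return errors.get("plan")
    return None

sample = {
    "errors": {"plan": "Free plans do not have access to this season, try from 2022 to 2024."},
    "response": [],
}
print(plan_error(sample))
```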
BALLDONTLIE has a FIFA World Cup endpoint. Free tier available. But at tournament time, you’re relying on a third-party API to have ingested the data promptly. And their rate limits and reliability during a live global event? Questionable.
Zafronix offers 250 requests/day for free, no credit card. But it’s relatively unknown, and I wasn’t about to build a workflow around an API I couldn’t verify would have real-time WC2026 data on day one.
So I went with web scraping. And honestly? It works better for my use case.
The Sites Being Crawled
The agent scrapes two main sources:
Primary: DailySports.net
This is the goldmine. Their match pages have the most granular stats I’ve found anywhere. Full match stats plus half-by-half breakdowns, passes, attacks, dangerous attacks, crosses, throw-ins, and a full event timeline. The URL pattern is predictable (dailysports.net/stat/football/{team1}-vs-{team2}/), which makes it easy for the agent to construct the right URL from the team names.
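Here is a rough sketch of how that URL could be built from two team names. The slug rules (lowercase, accents stripped, hyphens between words) and the `https://` scheme are assumptions, so the site's real slugs may differ for some teams.

```python
import re
import unicodedata

def slug(team: str) -> str:
    """Lowercase, strip accents, and hyphenate a team name (assumed slug rules)."""
    ascii_name = unicodedata.normalize("NFKD", team).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

def dailysports_url(team1: str, team2: str) -> str:
    """Fill the {team1}-vs-{team2} pattern with slugged team names."""
    return f"https://dailysports.net/stat/football/{slug(team1)}-vs-{slug(team2)}/"

print(dailysports_url("United States", "Paraguay"))
# https://dailysports.net/stat/football/united-states-vs-paraguay/
```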
Backup: Sporting News
When DailySports doesn’t have a match yet (they sometimes lag by a few hours), the agent falls back to Sporting News box scores. These give you the essentials: possession, shots, corners, xG, and saves. Not as detailed, but solid enough to fill in the blanks.
Discovery: General web search
For finding which matches were played yesterday, the agent just does a broad web search (“FIFA World Cup 2026 results June 22, 2026”). It doesn’t need a specific source for that. The web search returns headlines from ESPN, BBC Sport, FIFA.com, whatever is ranking that day. The agent grabs the team names and scores, then goes deep on the stats from the specialized sources above.
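The primary-then-backup order boils down to "try each source until one returns a page." A minimal sketch of that pattern follows. The fetchers are stand-ins that return canned values, and none of the names come from the agent itself.

```python
from typing import Callable, Optional

Fetcher = Callable[[str, str], Optional[str]]

def fetch_stats(team1: str, team2: str,
                sources: list[tuple[str, Fetcher]]) -> tuple[str, Optional[str]]:
    """Try each source in order and return (source_name, page_text) for the first hit."""
    for name, fetch in sources:
        try:
            page = fetch(team1, team2)
        except Exception:
            page = None  # treat a failed fetch like a missing page and move on
        if page:
            return name, page
    return "none", None

# Stand-in fetchers for illustration; real ones would perform the HTTP request.
def primary(team1: str, team2: str) -> Optional[str]:
    return None  # pretend the primary site has not published the match yet

def backup(team1: str, team2: str) -> Optional[str]:
    return f"box score for {team1} vs {team2}"

print(fetch_stats("France", "Iraq", [("dailysports", primary), ("sportingnews", backup)]))
# ('sportingnews', 'box score for France vs Iraq')
```

Returning the source name alongside the page makes it easy to note in the stats file which site a given match came from.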
Why This Approach Actually Works Better
Here’s the thing. Sports APIs give you structured JSON. Clean, predictable, easy to parse. But they also give you only what their schema supports. If the API doesn’t have an xG field, you don’t get xG. If they haven’t added “dangerous attacks” as a metric, tough luck.
Web scraping with an LLM flips this. The agent reads the page like a human would, extracts whatever is there, and structures it into my markdown format. If DailySports adds a new stat tomorrow, the agent will probably pick it up without me changing anything. It’s more resilient to changes in what data is available, not less.
The tradeoff? It’s slower (8-12 minutes per run vs. seconds with an API) and occasionally a stat is marked as “—” when the source page was weird. But for a daily batch job that runs while I sleep? Speed doesn’t matter. And the “—” gaps are honestly fine. I’d rather have 90% of stats from a rich source than 100% of a limited set from a locked-down API.
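To put a number on those gaps, you could count how many value cells in a match's statistics table hold the "—" placeholder. This is a small sketch that assumes you pass in only the data rows (no header or separator row). The helper name `stat_coverage` is hypothetical.

```python
def stat_coverage(data_rows: list[str], placeholder: str = "—") -> float:
    """Share of value cells in markdown table rows that hold a real value."""
    filled = total = 0
    for row in data_rows:
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        for value in cells[1:]:  # the first cell is the statistic's name
            total += 1
            if value != placeholder:
                filled += 1
    return filled / total if total else 0.0

rows = [
    "| Possession | ~58% | ~42% |",
    "| Total Shots | ~22 | — |",
    "| xG | ~2.8 | — |",
]
print(f"{stat_coverage(rows):.0%}")
# 67%
```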
And yes, I’m aware that relying on specific websites means they could change their layout or go down. It’s a single point of failure, and I’ve written about that problem before. But having a primary + backup source with a general web search fallback gives me enough resilience for a month-long tournament.
The schedule: Runs at 09:00 IDT via a time_of_day schedule. It has run 6 times so far, all successful. A typical run takes 8-12 minutes because it does multiple web searches and fetches full pages for each match.
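The scheduler handles the timing. To see what "09:00 IDT" means relative to an arbitrary moment, here is a small sketch. It is not how the time_of_day schedule is implemented. It assumes a fixed UTC+3 offset, which holds for Israel Daylight Time during the tournament but not in winter.

```python
from datetime import datetime, time, timedelta, timezone

IDT = timezone(timedelta(hours=3), "IDT")  # Israel Daylight Time, fixed UTC+3 (summer only)

def next_run(now: datetime, at: time = time(9, 0)) -> datetime:
    """Return the next 09:00 IDT at or after `now` (which must be timezone-aware)."""
    local = now.astimezone(IDT)
    candidate = local.replace(hour=at.hour, minute=at.minute, second=0, microsecond=0)
    if candidate < local:
        candidate += timedelta(days=1)
    return candidate

# 23:30 UTC on June 22 is already 02:30 on June 23 in Israel, so the run is later that day.
print(next_run(datetime(2026, 6, 22, 23, 30, tzinfo=timezone.utc)).isoformat())
# 2026-06-23T09:00:00+03:00
```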
The tools it has access to:
- web_search and url_fetch for finding and reading match results
- file_read and file_write for maintaining the stats files
- run_python for any data processing
- update_feed for posting the morning notification
- skip_cycle for days when no matches were played
The model: It uses the “smart” tier. I want the analysis and prediction reasoning to be thoughtful, not just a quick summary.
Here is the full prompt for the task:
You are a FIFA World Cup 2026 match statistics collector and tournament analyst. Every day at 9:00 AM IDT, you collect detailed match stats for any World Cup games played the previous day AND update your running prediction for which two teams will make the final.
## Your workflow:
### PART 1: Daily Stats Collection
1. Use `get_current_time` to determine today's date, then search for yesterday's World Cup 2026 results:
web_search("FIFA World Cup 2026 results {yesterday's date}")
2. For each completed match found, search for detailed stats:
- Search: "World Cup 2026 {team1} vs {team2} match statistics box score"
- Try DailySports.net (primary - most granular) and Sporting News box scores (backup)
- Fetch the stats page with url_fetch
3. For each match, collect:
- Final score, venue, group
- Possession %
- Shots on target / off target / total
- Corners
- Fouls
- Yellow/Red cards
- Saves
- Total passes
- xG (if available)
- Goal scorers with minutes
- Key events (cards, subs)
4. Read the existing stats file at /Users/maishsk/Documents/wc2026_all_match_stats.md using file_read, then append yesterday's matches to it using file_write (write the complete updated file with ALL existing content plus new matches appended at the end).
### PART 2: Final Prediction
5. After updating the stats file, read the FULL file and analyze ALL matches played so far. Then update the prediction file at /Users/maishsk/Documents/wc2026_final_prediction.md with your current best prediction for which two teams will meet in the final. The prediction file should include:
- **Current standings summary**: Points, GD, goals scored for all teams
- **Top 10 contenders list** with key metrics (pts, GD, goals/match, xG where available)
- **Predicted Finalist #1** with detailed reasoning (form, squad depth, quality of wins, tactical observations)
- **Predicted Finalist #2** with detailed reasoning
- **Confidence level** (percentage) — this should increase as the tournament progresses
- **Key factors considered**: tournament form, pedigree, squad quality, injury news mentioned in match reports, strength of opposition faced, home advantage, historical knockout stage performance
- **Changes from yesterday**: note if/why your prediction changed since last time
- **Dark horses**: 1-2 teams that could upset the prediction
- **Date of prediction** and number of matches analyzed
When making your prediction, weigh these factors:
- Current tournament form (goals scored, goals conceded, xG performance)
- Quality of opposition faced (beating strong teams > thrashing weak ones)
- Squad depth (how many different scorers? substitutes making impact?)
- Tournament pedigree (past World Cup performances of these squads)
- Tactical solidity (clean sheets, defensive organization)
- Mentality indicators (comebacks, late goals, composure under pressure)
- Home advantage (for USA/Mexico/Canada matches)
- Bracket position (once knockouts are determined)
### PART 3: Feed Update
6. Post a summary to the activity feed using update_feed with importance="important". Include:
- How many matches were played yesterday
- Final scores
- One highlight stat per match (e.g., most shots, highest xG, biggest possession gap)
- 🔮 Current final prediction: "Team A vs Team B" with a one-line reason why
## Important notes:
- The tournament runs June 11 - July 19, 2026
- If no matches were completed yesterday, call skip_cycle
- DailySports.net URL pattern: dailysports.net/stat/football/{team1}-vs-{team2}/
- Stats file absolute path: /Users/maishsk/Documents/wc2026_all_match_stats.md
- Prediction file absolute path: /Users/maishsk/Documents/wc2026_final_prediction.md
- Format each match section with a markdown H2 header: ## Match {N}: {Team1} {score1} - {score2} {Team2}
- Be bold with your prediction — make a clear call, don't hedge excessively
- If your prediction changes from the previous day, explain WHY in the "Changes" section
What I’ve Learned
A few observations after running this for almost two weeks:
The predictions are surprisingly reasonable. It’s not just picking the biggest names. It correctly identified that Germany’s 9 goals in 2 matches (impressive on paper) were inflated by a 7-1 against Curaçao, while France’s victories were against stronger opponents. That’s good analysis.
The daily “changes” section is the best part. Knowing why the prediction changed is more interesting than the prediction itself. “Germany dropped because their goals came against weak opposition while France earned maximum points against tougher teams.”
Consistency of format matters. Because the agent writes each match in the same structured format, I can easily scan and compare. Who had the highest xG? Which teams are overperforming their expected goals? The structured data makes these questions answerable at a glance.
It’s like having a dedicated analyst who never sleeps. I built this in maybe 15 minutes of prompting, and it’s been running reliably every day since. That’s the beauty of scheduled agents. Set it up once, and it just works. (If you want another example of this kind of thing, I recently had my AI assistant write an entire MCP proxy for me in a single session.)
Would I Do Anything Differently?
Honestly, not much. If I were starting over, I might add:
- A group stage standings table that updates automatically
- Alerts when a team I’m watching is eliminated
- A comparison of the agent’s predictions vs actual results (accountability!)
But for a quick weekend project that took 15 minutes to set up? I’m very happy with how this turned out.
And here’s the thing that still blows my mind. I didn’t write a single line of code. Not one. No Python scripts, no cron jobs, no API wrappers. I described what I wanted in plain English, gave the agent the right tools, and it figured out the rest. That’s the power of these kinds of tools. You don’t need to be a developer to build something like this. Anyone with a clear idea of what they want can actually build it.
The World Cup runs until July 19th. I’ll keep the agent running and see how its predictions hold up in the knockout stage when things get really unpredictable. Will it be Argentina vs France? Ask me again in 3 weeks.
I would be very interested to hear your thoughts or comments. Are you using scheduled agents for anything creative? Hit me up on LinkedIn or X.