⚠️
Prerequisite
Complete "Data Exploration with Python" First
This course builds directly on pandas, Jupyter, and matplotlib. If you haven't done the Data Exploration course yet, start there — especially the vocabulary, the Anaconda setup, and Days 1–3 of the week program. You'll struggle here without that foundation.

Install One New Library

Anaconda already gave you pandas and matplotlib. You need one more thing: scikit-learn, the standard Python library for machine learning.

1. Install scikit-learn and confirm it works
Open Jupyter, create a new notebook called "ml-week.ipynb", and run the following in the first cell:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
import sklearn

print("scikit-learn version:", sklearn.__version__)
print("All imports successful — ready to go!")
If you get an ImportError: open a new cell and run !pip install scikit-learn, then restart the kernel (Kernel → Restart) and run the imports again.

Warm-Up: Pandas & Matplotlib Review

Before you train any models, sharpen the tools you already have. These exercises use any dataset you have handy — the cricket data from last course works perfectly.
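As a quick refresher, here is a minimal warm-up sketch. The table is made up for illustration (the column names team, runs, and won are assumptions, not the cricket data from the last course), so swap in your own CSV when you try it.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up example data. With a real file you would use
# pd.read_csv("your_file.csv") instead.
matches = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C", "C"],
    "runs": [250, 310, 198, 275, 330, 289],
    "won": [0, 1, 0, 1, 1, 0],
})

# Review 1: shape and summary statistics
print(matches.shape)
print(matches["runs"].describe())

# Review 2: group rows and aggregate a column
avg_runs = matches.groupby("team")["runs"].mean()
print(avg_runs)

# Review 3: a labelled bar chart (Jupyter shows it below the cell)
fig, ax = plt.subplots()
avg_runs.plot(kind="bar", ax=ax)
ax.set_xlabel("Team")
ax.set_ylabel("Average runs")
ax.set_title("Average runs per team (made-up data)")
fig.tight_layout()
```

If each of the three steps feels familiar, you are ready to move on.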

Pick Your Dataset

Your dataset drives everything — your model, your question, your presentation. Pick something you're genuinely curious about. A great question with messy data beats a boring question with clean data every time.

Bonus Challenge

🎮 Connect Your Model to Your Online Game

You built a game earlier this summer. Can your ML model power something inside it? This is exactly how real game studios work — they train models on player data to make their games smarter, fairer, and more fun. Here are some ideas:

🏏
Cricket Win Probability
Train a model on historical ODI data: runs scored, wickets left, overs remaining → probability of winning. Wire the output into your cricket game as a live "win %" display.
🏆
Score Predictor
Use player stats from your game's leaderboard (or a sports dataset) to predict where a player's score will land. Show it as a progress bar toward their predicted final rank.
🎯
Adaptive Difficulty
Train a model on game session data (score, time, mistakes) to classify a player as "struggling", "on track", or "breezing through" — and use that to adjust difficulty dynamically.
Match Outcome Classifier
For any sports game: train a classifier on historical match stats (possession, shots, home/away) to predict win, loss, or draw. Display it as a pre-match prediction panel.
🃏
Card or Character Strength Ranker
For a card or RPG-style game: collect stats on characters/cards and train a regression model to predict their "power score." Use it to auto-balance new cards you add later.
🚗
Racing Time Predictor
For a racing game: train on lap data (car stats, track, weather conditions) to predict lap time. Display "predicted fastest lap" before the race starts.
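To make one of these ideas concrete, here is a rough sketch of the Cricket Win Probability card. It does not use real ODI data: the match situations are randomly generated and the rule that decides who "won" is invented purely so the example runs. The helper name win_probability is also hypothetical. With a real dataset you would load a CSV instead of generating rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000

# Synthetic match situations (NOT real ODI data)
situations = pd.DataFrame({
    "runs_scored": rng.integers(0, 301, size=n),
    "wickets_left": rng.integers(0, 11, size=n),
    "overs_remaining": rng.integers(0, 31, size=n),
})

# Invented rule for the labels: more runs, wickets, and overs
# make a win more likely, with some randomness mixed in.
logit = (0.02 * (situations["runs_scored"] - 150)
         + 0.3 * (situations["wickets_left"] - 5)
         + 0.05 * (situations["overs_remaining"] - 15))
win_chance = 1 / (1 + np.exp(-logit))
situations["won"] = (rng.random(n) < win_chance).astype(int)

features = ["runs_scored", "wickets_left", "overs_remaining"]
X_train, X_test, y_train, y_test = train_test_split(
    situations[features], situations["won"],
    test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))

def win_probability(runs_scored, wickets_left, overs_remaining):
    """Return the model's estimated chance of winning (0.0 to 1.0)."""
    row = pd.DataFrame(
        [[runs_scored, wickets_left, overs_remaining]], columns=features)
    # Column 1 of predict_proba is the probability of class 1 (a win)
    return float(model.predict_proba(row)[0, 1])

print(f"Win %: {100 * win_probability(220, 7, 12):.0f}")
```

Your game would call something like win_probability each time the score changes and show the result as the live "win %" display. The same pattern (features in, probability out) carries over to the Match Outcome Classifier idea.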

Choose Your Direction

Pick one of these — or bring your own dataset from Kaggle. Click any card for the full brief and model suggestions.

Find your own dataset on Kaggle: kaggle.com/datasets has thousands of free CSVs. Look for at least 500 rows, a numeric or yes/no column you want to predict, and a topic you actually care about. Sports, music, movies, weather, and gaming all have great datasets.
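If you want a quick way to test a candidate CSV against that checklist, something like the sketch below could help. The function name check_dataset and the tiny example table are made up for illustration. "Yes/no column" is interpreted here as "a column with exactly two distinct values", which is an assumption.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

def check_dataset(df, target, min_rows=500):
    """Return a list of problems; an empty list means the checks pass."""
    problems = []
    if len(df) < min_rows:
        problems.append(f"only {len(df)} rows (want at least {min_rows})")
    if target not in df.columns:
        problems.append(f"no column named {target!r}")
        return problems
    values = df[target].dropna()
    is_yes_no = values.nunique() == 2  # e.g. yes/no, win/loss, 0/1
    if not (is_numeric_dtype(values) or is_yes_no):
        problems.append(f"{target!r} is neither numeric nor yes/no")
    return problems

# Tiny made-up table: the target is fine, but there are too few rows
tiny = pd.DataFrame({"runs": [250, 198, 310], "won": ["yes", "no", "yes"]})
print(check_dataset(tiny, "won"))
```

With a real file you would call it as check_dataset(pd.read_csv("your_file.csv"), "your_target_column").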

How Machine Learning Works

Read these before Day 3. You don't need to memorise them — just build enough mental model that the code makes sense when you write it.
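To give that mental model something to hang on, here is a minimal sketch of the loop almost every model in this course follows: split the data, fit on the training part, then check predictions on the part the model has never seen. The data is randomly generated and the column name hours_practised is invented for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Made-up data: score rises by about 5 points per hour practised,
# plus some random noise.
hours = rng.uniform(0, 10, size=200)
data = pd.DataFrame({
    "hours_practised": hours,
    "score": 5 * hours + 20 + rng.normal(0, 3, size=200),
})

X = data[["hours_practised"]]  # features: what the model sees
y = data["score"]              # target: what it tries to predict

# 1. Hold back 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 2. Learn from the training rows only
model = LinearRegression()
model.fit(X_train, y_train)

# 3. Predict on rows the model has never seen
predictions = model.predict(X_test)

# 4. Measure how good those predictions are (R^2, where 1.0 is perfect)
r2 = model.score(X_test, y_test)
print("Learned points per hour:", model.coef_[0])
print("R^2 on test data:", r2)
```

Classifiers such as LogisticRegression and DecisionTreeClassifier use the same fit and predict steps. Only the kind of target and the way you measure success change.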

Speak the Language

The words data scientists use. Drop these at the meetup.

The 10-Day ML Journey

Two weeks, ten days of work, two Saturday meetups. Week 1 is about data and understanding. Week 2 is about models and results. Expand each day to see tasks and code.

Discussion Questions

Questions to think about during the two weeks and to fuel both Saturday conversations. The best ones don't have clean answers.

Two Meetups, Two Milestones

Week 1 meetup is a check-in: you show your dataset, your question, and your first graphs — no model required yet. Week 2 meetup is the full presentation: your trained model, your results, and what you learned. Pick a presentation format below.