Training Projects
Structured training exercises on real datasets. Each project lists a dataset, a clear goal, the techniques to practice, and the path to its code.
Beginner — Classical ML
1. House Price Prediction
- Dataset: California Housing (sklearn built-in) or Ames Housing
- Goal: predict house prices
- Techniques: linear regression, feature engineering, Ridge/Lasso
- Code:
../projects/01_house_prices/
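To make the Ridge idea concrete, here is a minimal, dependency-free sketch of ridge regression with a single feature. It is not the project's code (which would more likely use scikit-learn); the function name `fit_ridge_1d` and the toy numbers are made up for illustration. It shows how the L2 penalty `alpha` shrinks the slope relative to ordinary least squares.

```python
def fit_ridge_1d(xs, ys, alpha=1.0):
    """Fit y ~ w * x + b with an L2 penalty alpha * w**2 on the slope only."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centering the data lets the intercept stay unpenalized.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w = sxy / (sxx + alpha)
    b = y_mean - w * x_mean
    return w, b


# Hypothetical toy data (not from California Housing): exactly y = x + 0.5.
incomes = [1.0, 2.0, 3.0, 4.0, 5.0]
values = [1.5, 2.5, 3.5, 4.5, 5.5]

w_ols, b_ols = fit_ridge_1d(incomes, values, alpha=0.0)      # plain least squares
w_ridge, b_ridge = fit_ridge_1d(incomes, values, alpha=10.0)  # slope is shrunk
```

With `alpha=0.0` the fit recovers the true slope; raising `alpha` pulls the slope toward zero while the intercept adjusts to keep the line through the data means.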
2. Spam Classifier
- Dataset: SMS Spam Collection (UCI)
- Goal: classify messages as spam/ham
- Techniques: TF-IDF, Naive Bayes, logistic regression
- Code:
../projects/02_spam_classifier/
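As a rough illustration of the Naive Bayes step, the sketch below implements multinomial Naive Bayes with add-one (Laplace) smoothing in plain Python. It works on raw word counts rather than TF-IDF, and the messages are invented for the example, not taken from the SMS Spam Collection. The helper names `train_naive_bayes` and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(messages, labels):
    """Collect per-class word counts and class frequencies."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the class with the highest log posterior."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training messages for illustration only.
train_texts = [
    "win a free prize now",
    "free cash claim now",
    "are we meeting for lunch",
    "see you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(train_texts, train_labels)
```

Calling `predict("claim your free prize", model)` scores both classes and returns the more likely label; working in log space avoids underflow on longer messages.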
3. Customer Segmentation
- Dataset: Mall Customers (Kaggle)
- Goal: find customer groups
- Techniques: K-means, PCA, visualization
- Code:
../projects/03_customer_segmentation/
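The K-means technique listed above alternates between two steps: assign points to the nearest centroid, then move each centroid to the mean of its points. Below is a small pure-Python sketch of that loop (Lloyd's algorithm) on 2-D points. The customer values are invented, and the initial centroids are passed in explicitly to keep the result deterministic; a real run would typically use a library implementation with smarter initialization.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                ))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


# Hypothetical (annual income, spending score) pairs, not the Kaggle data.
customers = [(15, 80), (16, 78), (17, 82), (80, 20), (82, 18), (84, 22)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
```

On this toy data the two groups are well separated, so the loop converges after the second pass.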
Intermediate — Deep Learning
4. MNIST Digit Classifier
- Dataset: MNIST (torchvision built-in)
- Goal: classify handwritten digits with >99% test accuracy
- Techniques: feedforward net → CNN, training loop, evaluation
- Code:
../projects/04_mnist/
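When moving from a feedforward net to a CNN, a common stumbling block is working out how many features reach the first fully connected layer. The sketch below computes spatial sizes with the standard formula `(size + 2 * padding - kernel) // stride + 1`. The architecture (two 3x3 convolutions, each followed by 2x2 max pooling, 64 output channels) is an assumption chosen for illustration, not the project's actual model.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical architecture: conv3x3 -> pool2x2 -> conv3x3 -> pool2x2.
size = 28                                   # MNIST images are 28x28
size = conv_out(size, kernel=3, padding=1)  # conv1 keeps 28
size = conv_out(size, kernel=2, stride=2)   # pool1 halves to 14
size = conv_out(size, kernel=3, padding=1)  # conv2 keeps 14
size = conv_out(size, kernel=2, stride=2)   # pool2 halves to 7

channels = 64                               # assumed number of conv2 output channels
flattened_features = channels * size * size  # input width of the first linear layer
```

The resulting `flattened_features` value is what the first linear layer after the convolutional stack would need as its input dimension under these assumptions.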
5. CIFAR-10 with Transfer Learning
- Dataset: CIFAR-10 (torchvision built-in)
- Goal: classify 10 object categories
- Techniques: pretrained ResNet, fine-tuning, data augmentation
- Code:
../projects/05_cifar10_transfer/
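Data augmentation for CIFAR-10 often combines a padded random crop with a random horizontal flip. The following dependency-free sketch applies both to a single-channel image stored as a list of rows. The function name `random_crop_flip` and the tiny 4x4 image are illustrative stand-ins; in practice the same idea is usually applied through a library transform pipeline.

```python
import random


def random_crop_flip(image, padding, rng):
    """Zero-pad a square image, take a random crop of the original size, then maybe mirror it."""
    size = len(image)
    padded_width = size + 2 * padding
    blank_row = [0] * padded_width
    padded = (
        [blank_row[:] for _ in range(padding)]
        + [[0] * padding + list(row) + [0] * padding for row in image]
        + [blank_row[:] for _ in range(padding)]
    )
    top = rng.randint(0, 2 * padding)   # randint bounds are inclusive
    left = rng.randint(0, 2 * padding)
    crop = [row[left:left + size] for row in padded[top:top + size]]
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]  # horizontal flip
    return crop


# A tiny single-channel stand-in for a 32x32 CIFAR-10 image.
image = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]
augmented = random_crop_flip(image, padding=1, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the augmentation reproducible, which helps when debugging a training run.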
6. Sentiment Analysis Pipeline
- Dataset: IMDB Reviews (Hugging Face datasets)
- Goal: positive/negative review classification
- Techniques: TF-IDF baseline → fine-tuned DistilBERT
- Code:
../projects/06_sentiment/
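For the TF-IDF baseline, it can help to see the weighting computed by hand once. This sketch uses the smoothed form `idf = ln((1 + N) / (1 + df)) + 1`, which matches the default smoothing in scikit-learn's `TfidfVectorizer`, but it skips the L2 normalization that class applies by default. The reviews are invented and the function name `tfidf` is hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so a term present in every document gets weight 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: c * idf[t] for t, c in counts.items()})
    return vectors, idf


# Made-up reviews, not IMDB data.
reviews = [
    "the movie was great",
    "the movie was terrible",
    "the plot was great fun",
]
vectors, idf = tfidf(reviews)
```

Words that occur in every review ("the", "was") get the minimum weight, while a word seen in only one review ("terrible") is weighted highest, which is what lets a linear classifier focus on discriminative terms.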
7. Time Series Forecasting
- Dataset: Jena Climate (the CSV used in the Keras weather forecasting example) or Air Passengers
- Goal: predict temperature/passenger count
- Techniques: feature engineering, LSTM, transformer
- Code:
../projects/07_time_series/
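Both the LSTM and the transformer need the series recast as supervised samples: a window of past values as input and a future value as the target. A minimal sketch of that feature-engineering step follows. The function name `make_windows` and the numbers are illustrative, not real Air Passengers values.

```python
def make_windows(series, window, horizon=1):
    """Turn a 1-D series into (inputs, target) pairs for supervised learning."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        # The target sits `horizon` steps after the last input value.
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Made-up monthly counts for illustration.
series = [10, 12, 15, 14, 18, 21]
samples = make_windows(series, window=3)
two_step = make_windows(series, window=3, horizon=2)
```

When splitting such samples into training and validation sets, split by time rather than at random so that no validation target leaks into a training window.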
Advanced — Modern AI
8. Build a RAG System
- Dataset: Wikipedia subset or your own documents
- Goal: question-answering over custom knowledge base
- Techniques: embeddings, FAISS, retrieval + LLM generation
- Code:
../projects/08_rag_system/
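The retrieval half of a RAG system can be sketched without any external libraries: embed the query and the documents, rank by cosine similarity, and paste the top results into the prompt. In the sketch below, `embed` is a toy bag-of-words stand-in for a real embedding model, the linear scan stands in for a FAISS index, and the prompt wording in `build_prompt` is an assumption. No LLM call is made.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query (brute-force scan)."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble a prompt that would be sent to an LLM for generation."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


documents = [
    "paris is the capital of france",
    "the mitochondria is the powerhouse of the cell",
    "python is a programming language",
]
top = retrieve("what is the capital of france", documents)
prompt = build_prompt("what is the capital of france", documents)
```

Swapping `embed` for a dense embedding model and the sorted scan for a vector index changes the quality and speed of retrieval, but the overall flow stays the same.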
9. Fine-Tune a Small LLM
- Dataset: Alpaca or custom instruction pairs
- Goal: instruction-following model from a base model
- Techniques: QLoRA, training loop, evaluation
- Code:
../projects/09_finetune_llm/
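QLoRA keeps the quantized base weights frozen and trains only small low-rank adapter matrices, which is why it fits on modest hardware. The arithmetic below estimates how few parameters one adapter adds to a single weight matrix. The 4096 x 4096 projection and rank 8 are assumed values for illustration, and `lora_param_count` is a hypothetical helper, not a library function.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns an update B @ A with A of shape (rank, d_in)
    and B of shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


# Assumed sizes for illustration only.
d_model, rank = 4096, 8
full = d_model * d_model                            # parameters in the frozen matrix
adapter = lora_param_count(d_model, d_model, rank)  # parameters actually trained
fraction = adapter / full
```

Under these assumptions the adapter trains well under one percent of the parameters in the matrix it modifies; doubling the rank doubles that count.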
10. Object Detection
- Dataset: COCO subset or Pascal VOC
- Goal: detect and localize objects in images
- Techniques: YOLOv8, fine-tuning, mAP evaluation
- Code:
../projects/10_object_detection/
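mAP evaluation is built on intersection over union (IoU): a predicted box counts as a match only if its IoU with a ground-truth box reaches a threshold. Here is a small sketch of the IoU computation for boxes in `(x1, y1, x2, y2)` format. The box coordinates are invented, and 0.5 is used as a commonly chosen threshold rather than a value taken from the project.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


prediction = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
score = iou(prediction, ground_truth)  # intersection area 1, union area 7
is_match = score >= 0.5                # assumed match threshold
```

Average precision is then computed per class from the matches and misses ranked by confidence, and mAP averages that over classes.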
11. RL Agent
- Dataset: Gymnasium environments (CartPole → LunarLander)
- Goal: train an agent to solve progressively harder environments
- Techniques: Q-learning → DQN → PPO
- Code:
../projects/11_rl_agent/
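Before moving on to DQN and PPO, it helps to see the tabular Q-learning update in isolation. The sketch below trains on a hypothetical five-state corridor defined inline instead of a Gymnasium environment, so it runs with the standard library only. The hyperparameters are arbitrary illustrative choices.

```python
import random

N_STATES, GOAL = 5, 4
LEFT, RIGHT = 0, 1


def step(state, action):
    """Hypothetical corridor environment: reward 1 only on reaching the goal."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes, otherwise act greedily
            # (ties are broken at random).
            if rng.random() < epsilon or q[state][LEFT] == q[state][RIGHT]:
                action = rng.choice([LEFT, RIGHT])
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
greedy_policy = [LEFT if row[LEFT] > row[RIGHT] else RIGHT for row in q_table[:GOAL]]
```

DQN replaces the table `q` with a neural network and adds a replay buffer and target network, but the bootstrapped target in the update line keeps the same form.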