Data Scientist Roadmap

Become a data scientist who can extract insights, build ML models, and communicate findings that drive business decisions. This roadmap focuses on the skills Indian companies actually hire for.

6-9 months • 5-10 LPA → 30-60 LPA expected • 10 steps • 34 free resources

1

Python for Data Science

3-4 weeks

Python is the most widely used language in data science. Master the basics, then focus on NumPy, Pandas, and Matplotlib: the three libraries you'll use every single day.

By the end, you'll be able to

  • Write Python scripts for data manipulation and analysis
  • Use NumPy for numerical operations and Pandas for dataframes
  • Create publication-quality charts with Matplotlib and Seaborn

Mini-project

Analyze the IPL cricket dataset: batting averages, team performance trends, and player comparisons. Present findings in a Jupyter notebook with visualizations.
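
To get started on the mini-project, here is a minimal sketch of computing batting averages with Pandas. The column names (`batter`, `runs`, `dismissed`) and the rows are made up for illustration; a real IPL dataset will have its own schema, so adapt the names accordingly.

```python
import pandas as pd

# Hypothetical innings-level records (not real IPL data).
innings = pd.DataFrame({
    "batter": ["A", "A", "A", "B", "B", "B"],
    "runs": [45, 30, 75, 10, 60, 20],
    "dismissed": [True, True, False, True, True, True],
})

# Aggregate runs and dismissals per batter.
summary = innings.groupby("batter").agg(
    total_runs=("runs", "sum"),
    dismissals=("dismissed", "sum"),  # True counts as 1
)

# Batting average = total runs / number of dismissals.
# A batter who was never dismissed would divide by zero here (result: inf).
summary["batting_average"] = summary["total_runs"] / summary["dismissals"]

print(summary)
```

Not-out innings add runs without adding a dismissal, which is why the average divides by dismissals rather than by innings played.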

2

Statistics & Probability

3-4 weeks

Data science without statistics is just guessing. Learn descriptive stats, probability distributions, hypothesis testing, confidence intervals, and A/B testing.

By the end, you'll be able to

  • Calculate and interpret mean, median, variance, and standard deviation
  • Run hypothesis tests and explain p-values to non-technical stakeholders
  • Design and analyze A/B tests correctly

Mini-project

Analyze an e-commerce A/B test dataset: determine if a new checkout flow increases conversion rate. Calculate statistical significance and present your recommendation.
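
One common way to check significance for a conversion-rate A/B test is a two-proportion z-test. The sketch below uses only the Python standard library. The function name `two_proportion_z_test` and the visitor counts are hypothetical, and the test assumes samples large enough for the normal approximation to hold.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B vs A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up numbers: old checkout (A) vs new checkout (B).
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone, but your recommendation should also weigh the size of the lift and how long the test ran.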

3

SQL & Data Querying

2-3 weeks

Most data lives in databases. Master SQL: complex joins, window functions, CTEs, and subqueries. SQL is among the most frequently tested skills in data science interviews.

By the end, you'll be able to

  • Write complex SQL with window functions, CTEs, and subqueries
  • Optimize queries for large datasets
  • Extract business insights directly from a database

Mini-project

Solve 30 SQL problems on LeetCode/HackerRank. Then write 10 business queries on a sample e-commerce database.
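
As a taste of the kind of business query this step is about, the sketch below combines a CTE with window functions to find each customer's largest order alongside their total spend. It runs the SQL through Python's built-in `sqlite3` module, assuming a bundled SQLite of version 3.25 or later (needed for window functions). The `orders` table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL);
INSERT INTO orders VALUES
  ('asha', '2024-01-05', 500),
  ('asha', '2024-02-10', 1500),
  ('ravi', '2024-01-20', 800),
  ('ravi', '2024-03-02', 200),
  ('ravi', '2024-03-15', 1000);
""")

query = """
WITH ranked AS (
    SELECT customer, order_date, amount,
           -- Rank each customer's orders from largest to smallest.
           ROW_NUMBER() OVER (PARTITION BY customer ORDER BY amount DESC) AS rn,
           -- Total spend per customer, repeated on every row.
           SUM(amount) OVER (PARTITION BY customer) AS customer_total
    FROM orders
)
SELECT customer, order_date, amount, customer_total
FROM ranked
WHERE rn = 1
ORDER BY customer;
"""

top_orders = conn.execute(query).fetchall()
for row in top_orders:
    print(row)
```

The same pattern (rank inside a CTE, then filter on the rank) covers many "top N per group" interview questions.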

4

Exploratory Data Analysis (EDA)

2-3 weeks

Before any model, you explore. Learn how to clean messy data, handle missing values, detect outliers, engineer features, and tell stories with data.

By the end, you'll be able to

  • Clean messy real-world datasets systematically
  • Detect and handle outliers, missing values, and data quality issues
  • Create insightful visualizations that reveal patterns

Mini-project

Do a full EDA on the Titanic or House Prices dataset from Kaggle. Document every step in a clean Jupyter notebook.
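
Two of the checks mentioned in this step, missing values and outliers, can be sketched in a few lines of Pandas. The `price` column and its values are made up, and both median imputation and the 1.5 × IQR rule are just common defaults, not the only reasonable choices.

```python
import pandas as pd

# Hypothetical column with one missing value and one suspicious entry.
df = pd.DataFrame({"price": [100.0, 120.0, None, 110.0, 105.0, 5000.0, 115.0]})

# 1. Count missing values, then fill them with the median of the observed ones.
missing_count = int(df["price"].isna().sum())
median_price = df["price"].median()
df["price_filled"] = df["price"].fillna(median_price)

# 2. Flag outliers with the interquartile range (IQR) rule.
q1 = df["price_filled"].quantile(0.25)
q3 = df["price_filled"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = ~df["price_filled"].between(lower, upper)

print(f"Missing values: {missing_count}")
print(df[df["is_outlier"]])
```

Flagging rather than dropping outliers keeps the decision visible in your notebook, so you can document why each one was kept or removed.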

5

Machine Learning Fundamentals

4-5 weeks

Learn the core ML algorithms: linear regression, logistic regression, decision trees, random forests, SVMs, and k-means. Understand the bias-variance tradeoff and cross-validation.

By the end, you'll be able to

  • Implement and evaluate classification and regression models
  • Understand the bias-variance tradeoff and prevent overfitting
  • Use scikit-learn for the full ML pipeline: preprocess → train → evaluate

Mini-project

Build a loan default prediction model using a real bank dataset from Kaggle. Compare 5 models, tune hyperparameters, and write a report.
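
The preprocess → train → evaluate flow from this step can be sketched with a scikit-learn pipeline and 5-fold cross-validation. The data here is synthetic (generated with `make_classification`) rather than a real loan dataset, and logistic regression stands in for whichever models you end up comparing.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data standing in for a real dataset.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Scaling lives inside the pipeline, so it is re-fit on each training fold
# and never sees the held-out fold (this avoids data leakage).
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation gives a more stable estimate than one train/test split.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"Accuracy per fold: {scores.round(3)}")
print(f"Mean accuracy: {scores.mean():.3f}")
```

To compare several models, swap the final estimator and keep the cross-validation setup identical so the scores are comparable.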

6

Feature Engineering & Model Selection

2-3 weeks

The difference between a good model and a great model is features. Learn encoding, scaling, feature creation, and how to systematically select the best model.

By the end, you'll be able to

  • Engineer meaningful features from raw data
  • Handle categorical variables, text features, and time series
  • Use cross-validation and grid search for model selection

Mini-project

Compete in a Kaggle competition. Focus on feature engineering to move up the leaderboard rather than trying exotic models.
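
Encoding, scaling, and grid search fit together naturally in scikit-learn. The sketch below one-hot encodes a categorical column, scales a numeric one, and searches over the regularization strength `C`. The `city`, `income`, and `defaulted` columns are synthetic and purely illustrative, as is the choice of logistic regression and the parameter grid.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Build a small synthetic dataset (not real data).
cities = ["Mumbai", "Delhi", "Pune"]
rows = []
for i in range(60):
    income = 20 + (i * 7) % 50  # synthetic income, in thousands
    rows.append({
        "city": cities[i % 3],
        "income": income,
        "defaulted": int(income < 40),  # toy rule so there is signal to learn
    })
data = pd.DataFrame(rows)

X = data[["city", "income"]]
y = data["defaulted"]

# Encode the categorical column and scale the numeric one.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ("num", StandardScaler(), ["income"]),
])

pipeline = Pipeline([
    ("prep", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])

# Try several regularization strengths with 3-fold cross-validation.
param_grid = {"model__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print("Best params:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")
```

Because the preprocessing is part of the pipeline, every candidate in the grid is evaluated with encoders and scalers fit only on its training folds.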

7

Deep Learning Basics

3-4 weeks

Learn neural networks: perceptrons, backpropagation, CNNs for images, and RNNs/LSTMs for sequences. Use TensorFlow or PyTorch.

By the end, you'll be able to

  • Build and train neural networks with TensorFlow/PyTorch
  • Understand backpropagation and gradient descent intuitively
  • Apply CNNs for image classification and RNNs for text

Mini-project

Build an image classifier that detects whether food is Indian or Western cuisine. Train it on a custom dataset and deploy it as a simple web app.
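
Before reaching for TensorFlow or PyTorch, it can help to see gradient descent on a single neuron. The sketch below trains one sigmoid neuron to learn the logical OR function using only the standard library. The learning rate and epoch count are arbitrary choices, and a real network repeats this same update across many layers via backpropagation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Truth table for logical OR.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]

w1 = w2 = b = 0.0
learning_rate = 1.0

for _ in range(5000):
    grad_w1 = grad_w2 = grad_b = 0.0
    for (x1, x2), y in zip(inputs, targets):
        p = sigmoid(w1 * x1 + w2 * x2 + b)
        # For cross-entropy loss with a sigmoid output, the gradient
        # with respect to the pre-activation is simply (p - y).
        error = p - y
        grad_w1 += error * x1
        grad_w2 += error * x2
        grad_b += error
    n = len(inputs)
    # Step against the average gradient.
    w1 -= learning_rate * grad_w1 / n
    w2 -= learning_rate * grad_w2 / n
    b -= learning_rate * grad_b / n

predictions = [int(sigmoid(w1 * x1 + w2 * x2 + b) >= 0.5) for x1, x2 in inputs]
print(predictions)
```

Each update nudges the weights in the direction that reduces the loss, which is the whole idea behind training deeper networks as well.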

8

Data Storytelling & Communication

1-2 weeks

The best data scientists are great communicators. Learn to build dashboards, write clear reports, and present findings to non-technical stakeholders.

By the end, you'll be able to

  • Build interactive dashboards with Tableau or Power BI
  • Present technical findings in business language
  • Write data reports that drive decisions

Mini-project

Create a dashboard analyzing Zomato restaurant data across Indian cities: ratings, cuisine trends, pricing. Present it to a friend as if they were a business stakeholder.

9

End-to-End Projects

4-6 weeks

Build 2-3 complete data science projects: problem definition → data collection → EDA → modeling → deployment → presentation. These are your interview tickets.

By the end, you'll be able to

  • Complete end-to-end ML projects from problem to deployment
  • Deploy models as APIs or web apps
  • Present project results with clear business impact

Mini-project

Build a movie recommendation system, deploy it as a Streamlit app, and write a Medium article explaining your approach.

10

Interview Prep

3-4 weeks

Data science interviews typically cover Python/SQL coding, statistics, ML theory, and case studies, often along with a take-home assignment. Prepare across all these areas.

By the end, you'll be able to

  • Solve SQL and Python coding questions in interviews
  • Explain ML algorithms, their assumptions, and when to use each
  • Crack case study rounds with structured analytical thinking

Mini-project

Do 3 mock interviews, solve 30 SQL + 30 Python problems, and practice explaining your projects in 3 minutes.
