Data Science Program

Data Science — Comprehensive Course & Syllabus

A complete Data Science path covering Foundation (SQL, Excel), Core Python & Libraries, EDA, Statistics, Machine Learning, Time Series, Dimensionality Reduction, Clustering, Ensemble Methods, Visualization (Tableau/Power BI), Cloud Deployment and capstone projects.

The curriculum is practical and project-driven, built around the skills employers look for. Each module includes hands-on labs, assignments, and projects aligned to industry needs.

Foundation — SQL & Excel (Basics → Advanced)
  • SQL: Intro to SQL, DDL/DML/DQL, Aggregate functions, Date functions, Sub-queries, Joins, Views, Indexes, Union/Intersect, Stored procedures, Advanced SQL practice.
  • Excel: Importing data, formatting, formulas, lookup & reference functions, pivot tables, charts, what-if analysis, macro basics, reporting in Excel.
  • Deliverables: SQL exercises, Excel reports & dashboards.
Core Track — Python, NumPy, Pandas, EDA
  • Python fundamentals: variables, control flow, functions, collections (list, tuple, dict), list comprehensions, lambda functions.
  • NumPy: arrays, indexing, slicing, array operations.
  • Pandas: Series & DataFrame creation, reading from files, indexing, sorting, concatenation, joins, merging, reshaping, pivot tables, groupby, missing-value handling, duplicate treatment.
  • Exploratory Data Analysis (EDA): summary statistics, handling missing values, variable distributions, correlation and covariance, advanced data exploration techniques.
  • Deliverables: EDA notebook + data cleaning pipeline.
Visualization & Statistics
  • Matplotlib & Seaborn: line plots, histograms, boxplots, scatter plots, heatmaps, pair plots, violin plots, joint plots, count plots.
  • Summary statistics, central tendency, dispersion, skewness, kurtosis.
  • Probability basics, discrete & continuous distributions (Bernoulli, Binomial, Poisson, Uniform, Normal).
  • Hypothesis testing: t-tests, chi-square, ANOVA, post-hoc tests; assumptions, normality tests.
Machine Learning — Supervised & Unsupervised
  • Supervised: Linear Regression (OLS), Logistic Regression (MLE), Model evaluation metrics, Regularization (L1/L2), Feature scaling, Feature selection.
  • Tree-based: Decision Trees, Random Forests — feature importance, pruning, tuning.
  • Ensembling: Bagging, Boosting (XGBoost/AdaBoost), stacking concepts.
  • Unsupervised: K-means, Hierarchical clustering, PCA (dimensionality reduction).
  • Model tuning: cross-validation, hyperparameter tuning, bias-variance tradeoff, overfitting/underfitting.
  • Deliverables: Regression & Classification projects (Property price prediction, Vaccine usage prediction, Heart disease prediction).
Time Series & Forecasting
  • Time series components, trend & seasonality, visualizing time series.
  • Exponential smoothing: Holt, Holt-Winters; ARIMA modelling; forecast evaluation.
  • Deliverables: Forecast project (sales or mortgage analysis).
Advanced Analytics, Cloud & Deployment
  • Advanced topics: Association rule mining (Apriori), Market Basket Analysis, Ensemble techniques, XGBoost.
  • Cloud basics & deployment: AWS fundamentals, EC2, SageMaker overview, deployment steps, AWS Well-Architected Framework basics.
  • Web integration: Flask basics, connecting ML models via APIs.
  • Deliverables: Cloud-deployable model demo, Flask API wrapper for model inference.
BI & Visualization Tools (Tableau & Power BI)
  • Power BI: report building, data transformation, dashboards.
  • Tableau: interface, data connections, calculations, dashboards & stories, mapping & visual analytics.
  • Deliverables: Dashboard project (interactive sales dashboard).
Projects & Capstone
  • Hands-on projects throughout the course: E-commerce customer segmentation, Taxi fare prediction, Heart disease prediction, Property price prediction, Stock analysis, Forecasting sales.
  • Capstone Project: a comprehensive real-world project in which students build an end-to-end pipeline, from data ingestion to deployment.
  • Presentation: Project demo, code repo, documentation & report submission.
Prerequisites & Who Should Apply
  • Basic programming knowledge: recommended but not mandatory.
  • Motivation to work on datasets and complete assignments.
  • Comfort with mathematics at high-school level; statistics basics helpful.
Assessment, Placement, & Certification
  • Assessments: module-level quizzes, practical assignments, class assessments and project evaluations.
  • Certification: JCRM certificate awarded on successful completion of the course and submission of the capstone project.
  • Placement assistance: interview prep, resume review, and placement support network.