Data Science

Data Science Training Course

Data Science Course in Pune

 

About the Course

UPDATED SYLLABUS - EFFECTIVE FOR BATCHES STARTING FROM 1ST FEB 2018

Our Data Science course is divided into three levels: Associate, Developer, and Professional.

Data Science Associate - Includes R Programming + Statistics + Machine Learning with R + Tableau for Data Visualization (Duration - 70 Hours)

Data Science Developer - Includes R Programming + Statistics + Machine Learning with R + Tableau for Data Visualization + Python Programming (Duration - 120 Hours)

Data Science Professional - Includes R Programming + Statistics + Machine Learning with R + Tableau for Data Visualization + Machine Learning with Python + Deep Learning with Python + Neural Network with Python + NLP with Python + TensorFlow (Duration - 120 Hours)

Prerequisites: Basic computer knowledge. Any data-related experience will be an advantage.

After the classes:

Ethans Data Science modules are designed around industry requirements and prepare students for a wide range of job openings in the field of data analytics. Students will be ready for interviews for business analytics, data visualization (Tableau), and Python-related positions. Any interview for an entry-level data analyst position should be a cakewalk for the candidates.

Who should take this training?

  • Graduates, postgraduates, or students in the final stages of graduation
  • People looking to align their careers with analytics
  • Team leaders who work with data and often need basic data analysis
  • Engineers looking for career opportunities in IT/ITES industry
  • Management students looking for strategic positions
  • People already working with huge datasets
  • Hadoop Professionals
  • CA, CS, CFA

 

Syllabus

Data Science Associate (Detailed Syllabus)

Introduction to Data Science

  • Introduction to Data Analytics
  • Introduction to Business Analytics
  • Understanding Business Applications
  • Data Types and Data Models
  • Types of Business Analytics
  • Evolution of Analytics
  • Data Science Components
  • Data Scientist Skillset
  • Univariate Data Analysis
  • Introduction to Sampling

Basic Operations in R Programming

  • Introduction to R programming
  • Types of Objects in R
  • Naming standards in R
  • Creating Objects in R
  • Data Structure in R
  • Matrix, Data Frame, String, Vectors
  • Understanding Vectors & Data input in R
  • Lists, Data Elements
  • Creating Data Files using R

Data Handling in R Programming

  • Basic Operations in R – Expressions, Constant Values, Arithmetic, Function Calls, Symbols
  • Sub-setting Data
  • Selecting (Keeping) Variables
  • Excluding (Dropping) Variables
  • Selecting Observations and Selection using Subset Function
  • Merging Data
  • Sorting Data
  • Adding Rows
  • Visualization using R
  • Data Type Conversion
  • Built-In Numeric Functions
  • Built-In Character Functions
  • User Built Functions
  • Control Structures
  • Loop Functions

Introduction to Statistics

  • Basic Statistics
  • Measure of central tendency
  • Types of Distributions
  • ANOVA
  • F-Test
  • Central Limit Theorem & applications
  • Types of variables
  • Relationships between variables
  • Central Tendency
  • Measures of Central Tendency
  • Kurtosis
  • Skewness
  • Arithmetic Mean / Average
  • Merits & Demerits of Arithmetic Mean
  • Mode, Merits & Demerits of Mode
  • Median, Merits & Demerits of Median
  • Range
  • Concept of Quantiles, Quartiles, percentile
  • Standard Deviation
  • Variance
  • Calculate Variance
  • Covariance
  • Correlation
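
The measures listed above can be tried out with a few lines of Python. This is a minimal sketch using only the standard library; the two data lists and the helper names `sample_covariance` and `pearson_correlation` are made up for illustration and are not part of the course material.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
hours_studied = [2, 4, 4, 5, 7, 8]
exam_score = [50, 60, 62, 65, 80, 85]

mean_hours = statistics.mean(hours_studied)           # arithmetic mean
median_hours = statistics.median(hours_studied)       # middle value
mode_hours = statistics.mode(hours_studied)           # most frequent value
value_range = max(hours_studied) - min(hours_studied)
variance_hours = statistics.variance(hours_studied)   # sample variance (divides by n - 1)
std_hours = statistics.stdev(hours_studied)           # sample standard deviation


def sample_covariance(xs, ys):
    """Sample covariance: summed co-deviations from the means, divided by n - 1."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson_correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


correlation = pearson_correlation(hours_studied, exam_score)
```

Covariance depends on the units of both variables, while correlation is rescaled to lie between -1 and 1, which is why the two are usually reported together.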

Introduction to Statistics - 2

  • Hypothesis Testing
  • Multiple Linear Regression
  • Logistic Regression
  • Market Basket Analysis
  • Clustering (Hierarchical Clustering & K-means Clustering)
  • Classification (Decision Trees)
  • Time Series Analysis (Simple Moving Average, Exponential smoothing, ARIMA+)

 Introduction to Probability

  • Standard Normal Distribution
  • Normal Distribution
  • Geometric Distribution
  • Poisson Distribution
  • Binomial Distribution
  • Parameters vs. Statistics
  • Probability Mass Function
  • Random Variable
  • Conditional Probability and Independence
  • Unions and Intersections
  • Finding Probability of dataset
  • Probability Terminology
  • Probability Distributions
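
As a rough illustration of the discrete distributions and conditional probability named above, the following sketch writes out their textbook formulas in plain Python. The function names are illustrative choices, not an API used in the course.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events) when events occur at an average rate lam per interval."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def geometric_pmf(k, p):
    """P(first success happens on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); requires P(B) > 0."""
    return p_a_and_b / p_b


# Example: probability of exactly 2 heads in 4 fair coin tosses.
two_heads_in_four = binomial_pmf(2, 4, 0.5)
```

A probability mass function must sum to 1 over all possible outcomes, which is a quick sanity check for any of these formulas.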

Data Visualization Techniques

  • Bubble Chart
  • Sparklines
  • Waterfall chart
  • Box Plot
  • Line Charts
  • Frequency Chart
  • Bimodal & Multimodal Histograms
  • Histograms
  • Scatter Plot
  • Pie Chart
  • Bar Graph
  • Line Graph

Introduction to Machine Learning

  • Overview & Terminologies
  • What is Machine Learning?
  • Why Learn?
  • When is Learning required?
  • Data Mining
  • Application Areas and Roles
  • Types of Machine Learning
  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement learning

Machine Learning Concepts & Terminologies

  • Steps in developing a Machine Learning application
  • Key tasks of Machine Learning
  • Modelling Terminologies
  • Learning a Class from Examples
  • Probability and Inference
  • PAC (Probably Approximately Correct) Learning
  • Noise
  • Noise and Model Complexity
  • Triple Trade-Off
  • Association Rules
  • Association Measures

Regression Techniques

  • Concept of Regression
  • Best Fitting line
  • Simple Linear Regression
  • Building regression models using Excel
  • Coefficient of determination (R-Squared)
  • Multiple Linear Regression
  • Assumptions of Linear Regression
  • Variable transformation
  • Reading coefficients in MLR
  • Multicollinearity
  • VIF
  • Methods of building Linear regression model in R
  • Model validation techniques
  • Cook's Distance
  • Q-Q Plot
  • Durbin-Watson Test
  • Kolmogorov-Smirnov Test
  • Homoskedasticity of error terms
  • Logistic Regression
  • Applications of logistic regression
  • Concept of odds
  • Concept of Odds Ratio
  • Derivation of logistic regression equation
  • Interpretation of logistic regression output
  • Model building for logistic regression
  • Model validations
  • Confusion Matrix
  • Concept of ROC/AUC Curve
  • KS Test
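
To make the best-fitting line and the coefficient of determination from the list above concrete, here is a small sketch of ordinary least squares for one predictor. It is written in plain Python rather than R or Excel, and the function names and sample data are assumptions for illustration only.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical data: advertising spend vs. sales, with some noise.
spend = [1, 2, 3, 4, 5]
sales = [2.1, 4.2, 5.8, 8.1, 9.9]
slope, intercept = fit_simple_linear_regression(spend, sales)
fit_quality = r_squared(spend, sales, slope, intercept)
```

An R-squared close to 1 means the line explains most of the variation; it says nothing about whether the regression assumptions listed above actually hold.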

Market Basket Analysis

  • Applications of Market Basket Analysis
  • What are Association Rules?
  • Overview of Apriori algorithm
  • Key terminologies in MBA
  • Support
  • Confidence
  • Lift
  • Model building for MBA
  • Transforming sales data to suit MBA
  • MBA Rule selection
  • Ensemble modelling applications using MBA
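
Support, confidence, and lift can be computed directly from a list of transactions. The sketch below assumes each transaction is a Python set of item names; the basket contents are invented for illustration, and it does not implement the full Apriori search.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


rule_lift = lift({"bread"}, {"milk"}, transactions)
```

A lift above 1 suggests the items appear together more often than independence would predict; a lift below 1, as in this toy data, suggests the opposite.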

Time Series Analysis (Forecasting)

  • Model building using ARIMA, ARIMAX, SARIMAX
  • Data De-trending & data differencing
  • KPSS Test
  • Dickey Fuller Test
  • Concept of stationarity
  • Model building using exponential smoothing
  • Model building using simple moving average
  • Time series analysis techniques
  • Components of time series
  • Prerequisites for time series analysis
  • Concept of Time series data
  • Applications of Forecasting

Decision Trees using R

  • Understanding the Concept
  • Internal decision nodes
  • Terminal leaves
  • Tree induction: Construction of the tree
  • Classification Trees
  • Entropy
  • Selecting Attribute
  • Information Gain
  • Partially learned tree
  • Overfitting
  • Causes of Overfitting
  • Overfitting Prevention (Pruning) Methods
  • Reduced Error Pruning
  • Decision trees - Advantages & Drawbacks
  • Ensemble Models

K Means Clustering

  • Parametric Methods Recap
  • Clustering
  • Direct Clustering Method
  • Mixture densities
  • Classes v/s Clusters
  • Hierarchical Clustering
  • Dendrogram interpretation
  • Non-Hierarchical Clustering
  • K-Means
  • Distance Metrics
  • K-Means Algorithm
  • K-Means Objective
  • Color Quantization
  • Vector Quantization

Tableau Analytics

  • Tableau Introduction
  • Data connection to Tableau
  • Calculated fields, hierarchy, parameters, sets, groups in Tableau
  • Various Visualization Techniques in Tableau
  • Map based visualization using Tableau
  • Reference Lines
  • Adding Totals, Subtotals, Captions
  • Advanced Formatting Options
  • Using Combined Field
  • Show Filter & Use various filter options
  • Data Sorting
  • Create Combined Field
  • Table Calculations
  • Creating Tableau Dashboard
  • Action Filters
  • Creating Story using Tableau

Analytics using Tableau

  • Clustering using Tableau
  • Time series analysis using Tableau
  • Simple Linear Regression using Tableau

R integration in Tableau

  • Integrating R code with Tableau
  • Creating statistical model with dynamic inputs
  • Visualizing R output in Tableau
  • Case Study 1 - Real-time project with Twitter Data Analytics
  • Case Study 2 - Real-time project with Google Finance
  • Case Study 3 - Real-time project with IMDB Website

Data Science Professional (Detailed Syllabus)

Data Analytics Using Python (Self Paced Module)

  • Introduction to Anaconda Python
  • Introduction to NumPy Module
  • Machine Learning with Python
  • Plotting with Matplotlib
  • Data Analysis with Pandas

Data Preprocessing with Python

  • Missing Data
  • Categorical Data
  • Splitting the Dataset into the Training set and Test set
  • Feature Scaling
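
The four preprocessing steps above can be sketched with the standard library alone. In practice a library such as pandas or scikit-learn would likely be used; the helper names below are hypothetical and exist only to show what each step does.

```python
import random
import statistics


def impute_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean_value = statistics.mean(observed)
    return [mean_value if v is None else v for v in values]


def one_hot_encode(categories):
    """Turn each category into a 0/1 vector, one column per distinct level (sorted)."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (training rows, test rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def standardize(train_values, values):
    """Scale values to zero mean and unit variance using training-set statistics."""
    mean_value = statistics.mean(train_values)
    std_value = statistics.pstdev(train_values)
    return [(v - mean_value) / std_value for v in values]


# Hypothetical column with one missing age.
ages = impute_missing_with_mean([20, None, 40])
```

The scaling parameters come from the training set only and are then reused on the test set, so that no information from the test data leaks into the model.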

Using Git and GitHub

  • Setting up Your GitHub Account
  • Configuring Your First Git Repository
  • Making Your First Git Commit
  • Pushing Your First Commit to GitHub
  • Git and GitHub Workflow Step-by-Step

Regression Models with Python

Simple Linear Regression

  • Dataset + Business Problem Description
  • Intuition
  • Simple Linear Regression

Multiple Linear Regression

  • Dataset + Business Problem Description
  • Intuition
  • Multiple Linear Regression

Polynomial Regression

  • Intuition
  • Python Regression Template
  • Polynomial Regression

Support Vector Regression (SVR)

  • Intuition
  • SV Regression

Decision Tree Regression

  • Intuition
  • Decision Tree Regression

Random Forest Regression

  • Intuition
  • Random Forest Regression
  • Evaluating Regression Models Performance
  • R-Squared Intuition
  • Adjusted R-Squared Intuition
  • Evaluating Regression Models Performance
  • Interpreting Linear Regression Coefficients

Classification Models

Logistic Regression

  • Intuition
  • Logistic Regression
  • Python Classification Template

K-Nearest Neighbors (K-NN)

  • Intuition
  • K-NN creation

Support Vector Machine (SVM)

  • Intuition
  • SVM

Decision Tree Classification

  • Intuition
  • Decision Tree Classification

Random Forest Classification

  • Intuition
  • Random Forest Classification

Evaluating Classification Models Performance

  • False Positives & False Negatives
  • Confusion Matrix
  • Accuracy Paradox
  • CAP Curve
  • CAP Curve Analysis
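
The accuracy paradox is easiest to see with numbers. The sketch below builds confusion-matrix counts for a binary problem and evaluates a model that always predicts the majority class; the data and function names are assumptions made for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of all predictions that were correct."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def recall(counts):
    """Share of actual positives that the model found."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


# Hypothetical imbalanced data: 5 positive cases out of 100.
actual = [1] * 5 + [0] * 95
always_negative = [0] * 100  # a "model" that never predicts the positive class
counts = confusion_counts(actual, always_negative)
```

Here the do-nothing model scores 95% accuracy while finding none of the positive cases, which is why false positives and false negatives need to be examined separately.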

Clustering Models

K-Means Clustering

  • K-Means Clustering
  • Intuition
  • Random Initialization Trap
  • Selecting The Number Of Clusters
  • K-Means Clustering

Hierarchical Clustering

  • Intuition
  • Hierarchical Clustering: How Dendrograms Work
  • Hierarchical Clustering Using Dendrograms
  • HC

Natural Language Processing

  • Deep Learning
  • Artificial Neural Networks
  • ANN Intuition
  • Building an ANN
  • Business Problem Description
  • Building an ANN
  • [Exercise] Should we say goodbye to that customer?
  • Evaluating, Improving and Tuning the ANN
  • Evaluating the ANN
  • Improving the ANN
  • Tuning the ANN
  • [Exercise] - Put me one step down on the podium
  • Convolutional Neural Networks
  • CNN Intuition
  • Building a CNN
  • Building a CNN
  • [Exercise] - What's that pet?
  • Evaluating, Improving and Tuning the CNN
  • [Exercise] - Get the gold medal
  • Dimensionality Reduction
  • Principal Component Analysis (PCA)
  • Linear Discriminant Analysis (LDA)
  • Model Selection & Boosting
  • Model Selection
  • k-Fold Cross Validation
  • Grid Search
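
k-fold cross-validation, listed above, splits the data into k parts and uses each part once as the test set. This is a minimal sketch of the index bookkeeping only, without shuffling or model fitting; the function name is an illustrative choice.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds = []
    start = 0
    for size in fold_sizes:
        test_indices = indices[start : start + size]
        train_indices = indices[:start] + indices[start + size :]
        folds.append((train_indices, test_indices))
        start += size
    return folds


# Example: 5 samples split into 2 folds.
folds = k_fold_indices(5, 2)
```

A model would be trained and scored once per fold, and the k scores averaged; grid search repeats that whole procedure for every candidate combination of hyperparameters.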

TensorFlow and Machine Learning

  • Introducing TF
  • Lab: Simple Math Operations
  • Computation Graph
  • Tensors
  • Lab: Tensors
  • Image Processing
  • Images As Tensors
  • Lab: Reading and Working with Images
  • Lab: Image Transformations
  • Introducing MNIST
  • Lab: K-Nearest-Neighbors
  • Individual Neuron
  • Learning Regression
  • Learning XOR
  • XOR Trained
  • Regression in TensorFlow
  • Lab: Access Data from Yahoo Finance
  • Non TensorFlow Regression
  • Lab: Linear Regression - Setting Up a Baseline
  • Gradient Descent
  • Lab: Linear Regression
  • Lab: Multiple Regression in TensorFlow
  • Logistic Regression Introduced
  • Linear Classification
  • Lab: Logistic Regression - Setting Up a Baseline
  • Logit
  • Softmax
  • Argmax
  • Lab: Logistic Regression
  • Estimators
  • Lab: Linear Regression using Estimators
  • Lab: Logistic Regression using Estimators
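
The logit, softmax, and argmax items above are small functions at heart. The sketch below writes them in plain Python rather than TensorFlow, purely to show the arithmetic; the function names are illustrative and the class scores are made up.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))


def logit(p):
    """Log-odds of a probability; the inverse of the sigmoid."""
    return math.log(p / (1 - p))


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical scores for three classes.
probabilities = softmax([1.0, 3.0, 2.0])
predicted_class = argmax(probabilities)
```

In a classifier, the network produces the raw scores, softmax converts them to class probabilities, and argmax picks the most likely class.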