Overview
Data science is the study and evaluation of data using mathematics, statistics, and computer science. It is used to extract key information that helps top management with strategic business decisions, product development, forecasting, and trend analysis. Practitioners organize and analyze large amounts of data using specially designed software tools. Python and R are widely used in data science: Python is a general-purpose language, and R is designed for statistical computing.
Training Objective (What you will learn)
Develop an in-depth understanding of data science and business analytics. The course covers machine learning, data mining, predictive modeling, visualization techniques, and statistics. It helps you apply quantitative modeling and data analysis techniques to large volumes of transactional data, and builds knowledge of statistical data analysis techniques for extracting information that supports business decision-making. It also provides the required expertise in the Python and R languages.
Prerequisites
Data science is a vast field that draws on multiple disciplines. Participants should have exposure to statistics, mathematics, and programming. Experience in data analytics or hands-on work with any ETL tool is an added advantage. The basics of machine learning and deep learning will also help, as they are an integral part of data science.
Market Demand
Demand for data scientists in the current market is still in its early stages. With the rise of AI and big data, it is growing very rapidly.
Curriculum
Part A : Python Basic Concepts
- Introduction to Python and its involvement with Data Science
- Understanding Object-Oriented Programming
- Installation: Python 3.6 or later, pip, IPython, Sublime Text Editor, Anaconda (Jupyter and Spyder)
- Python Identifiers, Naming Conventions, Variables and Types
- Defining Functions, Classes and Methods
- Understanding Indentation
- Executing sample programs in all Editors
- Difference Between Functions and Methods
- How to use Python Functions and Methods
- Decision making through conditions and Loops
- Declaring instances and working out their accessibility
- Understanding global and local variables in python
- Instantiating Classes and flow of execution
- Accessing Methods, Variables, Global variables and Functions
- Working with self and super keywords
- Object String representation through __str__ and __repr__
- Constructors; Initialization; object: a base class
- Inheritance Concept; Overriding and Overloading concept
- Constructors with respect to inheritance
- Understanding __name__ == '__main__'
- Exceptions:
- Overview of exception
- Common causes of raised exceptions
- Exception Hierarchy
- Raising exception at calling method
- Handling exceptions through try, except, else and finally
- Exception propagation
- Customized Exceptions
- List: Creating, Accessing, Slicing, Manipulating lists, Built-in Functions & Methods in list, Iterating & Enumerating list data and Working with Nested lists.
- Tuple, Set and Dictionary (all of the same operations as above)
- Handling conversions of sample data with Data Structures
- Patterns, searching, Modifiers, flags
- Working with examples to find specific strings, phone numbers, email addresses and filtering html data with regular expressions
- Working with text files and .csv
- Reading and Writing data to the files
- Importing required packages to work with .csv
- Statistical thinking in Python and approach of Data Analysis
- Fundamental statistics terms and its definitions
- Applying basic statistics in Python with NumPy
- Cumulative Distribution functions
- Modelling Distributions
- Graphical exploratory data analysis with Python
- Probability theories:
- Ranges, Mean, Variance, Standard Deviation and various distributions
- Mass and Density functions
- Kernel density estimation
- Understanding Bayes' theorem and predictions
- Estimation
- Sampling distributions, bias and Exponential distributions
- Hypothesis testing
- Hypothesis Test
- Testing Correlation and Proportions
- Chi-Squared Tests
- Errors, Power and Replication
- NumPy: N-dimensional array operations
- Array creation, conversion, understanding dimensions, shaping and reshaping, generating large sample datasets, linear algebra functionality, and numerical operations
- SciPy: High-level Scientific Computing
- Linear Algebra operations
- Interpolation
- Optimization and fit
- Statistics and random numbers
- Numerical Integration
- Fast Fourier transforms
- Signal processing and image manipulation
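As a taste of the object-representation and exception-handling topics listed above, here is a minimal sketch. The `Account` class, the `InsufficientFundsError` exception and the `safe_withdraw` helper are hypothetical names invented for this illustration; they are not taken from the course material.

```python
class InsufficientFundsError(Exception):
    """Hypothetical customized exception, used only for illustration."""

    def __init__(self, balance, amount):
        super().__init__(f"cannot withdraw {amount}; balance is {balance}")
        self.balance = balance
        self.amount = amount


class Account:
    """Hypothetical class showing __init__, __repr__ and __str__."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Account(owner={self.owner!r}, balance={self.balance!r})"

    def __str__(self):
        # Readable form, used by print().
        return f"{self.owner}: {self.balance}"

    def withdraw(self, amount):
        if amount > self.balance:
            raise InsufficientFundsError(self.balance, amount)
        self.balance -= amount
        return self.balance


def safe_withdraw(account, amount):
    """Record which of the try/except/else/finally branches ran."""
    steps = []
    try:
        account.withdraw(amount)
    except InsufficientFundsError as exc:
        steps.append(f"except: {exc}")
    else:
        steps.append("else: withdrawal succeeded")
    finally:
        steps.append("finally: always runs")
    return steps


if __name__ == "__main__":
    acct = Account("demo", 100)
    print(repr(acct))
    print(safe_withdraw(acct, 30))
    print(safe_withdraw(acct, 500))
    print(acct)
```

The `else` branch runs only when no exception was raised, while `finally` runs in both cases, which is why both calls to `safe_withdraw` end with the same step.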
Part B : Pandas and NumPy Functionalities
- Introduction
- Pandas DataFrame basics
- Understanding data, looking at columns, rows and cells
- Subsetting Columns, Rows with methods
- Grouped and Aggregated Calculations
- Frequency Means and Counts
- Basic plot
- Pandas Data Structures
- Creating your own data (Series and DataFrame)
- Series (also called a Vector) object operations
- Broadcasting and Scalar operations
- DataFrame Broadcasting (Vectorized)
- Making changes to Series and DataFrame
- Adding additional Columns
- Dropping values
- Exporting and Importing Data
- Introduction
- Matplotlib
- Statistical Graphics using matplotlib
- Univariate
- Bivariate
- Multivariate Data
- Seaborn Library Plotting methodology
- Univariate, Bivariate and Multivariate
- Pandas Objects Plotting
- Histogram, Density Plot, Scatterplot, Hexbin Plot and Boxplot
- Seaborn Themes and Styles
- Data Assembly
- Concatenations and Merging Multiple datasets
- Missing Data:
- Introduction
- What is a NaN Value
- Working with merged data, user input values and Re-indexing
- Working with missing data
- Finding and Counting missing data
- Cleansing missing data
- Calculations with missing data
- Conclusion
- Understanding Multiple Observations (Normalization)
- Understanding Data Types
- Converting types
- Categorical Data
- Convert to Category
- Manipulating Categorical Data
- Strings and Text Data
- String Subsetting
- String Methods
- String Formatting
- Apply and Groupby Operations:
- Introduction
- Functions
- Apply over a Series and DataFrame
- Apply: Column-wise and Row-wise operations
- Groupby Operation:
- Aggregate Methods and Functions
- The datetime Data Type:
- Python’s datetime Object
- Loading, Converting, Extracting Date components
- Date Calculations
- Datetime Methods
- Subsetting datetime, Date Ranges, Shifting Values, Time Zones
- Linear Models
- Linear and Multiple Regressions using statsmodels and sklearn
- Generalized Linear Models
- Logistic and Poisson Regressions using statsmodels and sklearn
- Survival Analysis
- Model diagnostics
- Residuals
- Comparing Multiple Models
- k-Fold Cross-Validation
- Regularization
- Clustering
- k-Means, Dimension Reduction with PCA (Principal Component Analysis)
- Hierarchical Clustering
- Conclusions
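To give a flavour of the missing-data and groupby topics in the list above, here is a minimal sketch using pandas and NumPy. The data is made up for illustration, and the column names `region`, `units` and `units_filled` are assumptions, not part of the course material.

```python
import numpy as np
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "units": [10.0, np.nan, 4.0, 6.0, np.nan],
    }
)

# Finding and counting missing data: one count per column.
missing_per_column = df.isna().sum()

# Calculations with missing data: mean() skips NaN values by default.
mean_units = df["units"].mean()

# Cleansing missing data: fill each gap with the mean of its own region.
region_mean = df.groupby("region")["units"].transform("mean")
df["units_filled"] = df["units"].fillna(region_mean)

# Grouped and aggregated calculations on the cleaned column.
summary = df.groupby("region")["units_filled"].agg(["count", "mean"])

if __name__ == "__main__":
    print(missing_per_column)
    print(mean_units)
    print(summary)
```

Filling with a per-group mean is only one possible cleansing strategy; dropping the rows with `dropna()` or filling with a constant are equally valid choices depending on the data.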
Mode of Training
Online
Total duration of the course
5 to 7 weeks
Training duration per day
50 mins - 90 mins
Communication Mode
GoToMeeting, WebEx
Software access:
Software will be installed or server access will be provided, whichever is possible.
Material
Soft copy of the material will be provided during the training.
Training
Both weekdays and weekends
Training Fee
$300