Course content

Overview
Data science is the study and evaluation of data using mathematics, statistics and computer science. It is used to extract key information for top management, supporting strategic business decision-making, product development, forecasting and trend analysis. Practitioners organize and analyze large amounts of data using specially designed software tools. Python and R are widely used in data science: Python is a general-purpose language, and R is designed for statistical computing.
Training Objective (What you will learn)
Develop an in-depth understanding of data science and business analytics. The course covers machine learning, data mining, predictive modeling, visualization techniques and statistics. It helps you apply quantitative modelling and data analysis techniques to large volumes of transactional data, and builds knowledge of the statistical data analysis techniques that bring out information to support business decision-making. It also provides the required expertise in the Python and R languages.
Prerequisites
Data science is a vast field that draws on multiple disciplines. You should have basic exposure to statistics, mathematics and programming. Experience in data analytics or hands-on work with any ETL tool is an added advantage. The basics of machine learning and deep learning will help, as they are an integral part of data science.
Market Demand
Demand for data scientists in the current market is only beginning. With AI and big data, demand for data science skills is growing very rapidly.
Part A: Python Basic Concepts
  1. Introduction to Python and its involvement with Data Science
  2. Understanding Object-Oriented Programming
  3. Installation: Python 3.6 or later version, pip, IPython, Sublime Text Editor, Anaconda (Jupyter and Spyder)
  4. Python Identifiers, Naming Conventions, Variables and Types
  5. Defining Functions, Classes and Methods
  6. Understanding Indentation
  7. Executing sample programs in all Editors
  8. Difference Between Functions and Methods
  9. How to use Python Functions and Methods
  10. Decision making through conditions and Loops
  11. Declaring instances and working out their accessibility
  12. Understanding global and local variables in python
  13. Instantiating Classes and flow of execution
  14. Accessing Methods, Variables, Global variables and Functions
  15. Working with self and super()
  16. Object String representation through __str__ and __repr__
  17. Constructors; Initialization; object: a base class
  18. Inheritance Concept; Overriding and Overloading concept
  19. Constructors with respect to inheritance
  20. Understanding __name__ == '__main__'
  21. Exceptions:
    1. Overview of exception
    2. Raising commonly occurring exceptions
    3. Exception Hierarchy
    4. Raising an exception to the calling method
    5. Handling exceptions through try, except, else and finally
    6. Exception propagation
    7. Customized Exceptions
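The exception topics above (custom exceptions, propagation, and the try / except / else / finally clauses) can be sketched in a few lines. This is only an illustration: the names `InsufficientDataError`, `mean` and `safe_mean` are hypothetical and not part of the course material.

```python
class InsufficientDataError(Exception):
    """Hypothetical custom exception: raised when a sample is too small."""

    def __init__(self, needed, got):
        # Pass a readable message up to the built-in Exception class.
        super().__init__(f"need at least {needed} values, got {got}")
        self.needed = needed
        self.got = got


def mean(values):
    # Raise at the point where the problem is detected.
    if len(values) < 1:
        raise InsufficientDataError(needed=1, got=len(values))
    return sum(values) / len(values)


def safe_mean(values):
    """Return (result, log) so the order in which clauses run is visible."""
    log = []
    try:
        result = mean(values)  # an exception raised in mean() propagates to here
    except InsufficientDataError as exc:
        log.append(f"except: {exc}")
        result = None
    else:
        log.append("else: no exception")
    finally:
        log.append("finally: always runs")
    return result, log
```

Calling `safe_mean([1, 2, 3])` takes the `else` path, while `safe_mean([])` takes the `except` path; `finally` runs in both cases.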
Part B: Data Structures
  1. List: Creating, Accessing, Slicing, Manipulating lists, Built-in Functions & Methods in list, Iterating & Enumerating list data and Working with Nested lists.
  2. Tuples, Sets and Dictionaries (all of the same operations as above)
  3. Handling conversions of sample data with Data Structures
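As a small, hedged illustration of converting sample data between the structures listed above, the sketch below uses a made-up list of `(day, temperature)` tuples; the variable names and values are assumptions chosen for the example.

```python
# Made-up sample data: a list of (day, temperature) tuples with one duplicate.
readings = [("mon", 21.5), ("tue", 19.0), ("wed", 23.1), ("tue", 19.0)]

unique = sorted(set(readings))  # a set drops the duplicate tuple; sorted() returns a list
by_day = dict(unique)           # a list of pairs converts directly to a dictionary
first_two = unique[:2]          # slicing works on lists and tuples alike

# enumerate() yields (index, item) pairs while iterating.
labelled = [f"{i}: {day}" for i, (day, _) in enumerate(unique)]
```

Here `by_day["tue"]` looks up a value by key, which is the main reason to convert a list of pairs into a dictionary.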
Part C: Regular Expressions in Python
  1. Patterns, searching, Modifiers, flags
  2. Working with examples to find specific strings, phone numbers, email addresses and filtering html data with regular expressions
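The kinds of searches named above can be sketched with the standard `re` module. The patterns below are deliberately simplified assumptions (a `ddd-ddd-dddd` phone format and a loose email shape), and the sample text is invented; real-world data usually needs stricter patterns, and serious HTML processing is better done with a parser.

```python
import re

# Invented sample text for illustration only.
text = "Contact: alice@example.com, 555-123-4567; Bob <bob.smith@example.org> 555-987-6543"

# Simplified patterns: assumptions for this sketch, not complete validators.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")
TAG = re.compile(r"<[^>]+>")

emails = EMAIL.findall(text)
phones = PHONE.findall(text)

# Filtering HTML: remove anything that looks like a tag.
plain = TAG.sub("", "<p>Hello <b>world</b></p>")

# Flags modify matching; IGNORECASE makes the search case-insensitive.
hits = re.findall(r"python", "Python and PYTHON", flags=re.IGNORECASE)
```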
File I/O
  1. Working with text files and .csv
  2. Reading and Writing data to the files
  3. Importing required packages to work with .csv
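A minimal sketch of writing and reading a .csv file with the standard `csv` package follows. The file name `scores.csv`, the column names and the rows are hypothetical, and the file is written to a temporary directory so the example does not touch existing data.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; CSV stores everything as text.
rows = [{"name": "Asha", "score": "82"}, {"name": "Ben", "score": "91"}]

path = os.path.join(tempfile.mkdtemp(), "scores.csv")

# newline="" lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)

with open(path, newline="") as f:
    loaded = list(csv.DictReader(f))  # one mapping per row, keyed by the header

# Values come back as strings, so convert before doing arithmetic.
average = sum(int(r["score"]) for r in loaded) / len(loaded)
```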
  1. Statistical thinking in Python and the approach to data analysis
  2. Fundamental statistics terms and their definitions
  3. Applying basic statistics in Python with NumPy
    1. Cumulative Distribution functions
    2. Modelling Distributions
  4. Graphical exploratory data analysis with Python
  5. Probability theories:
    1. Ranges, Mean, Variance, Standard Deviation and various distributions
    2. Mass and Density functions
    3. Kernel density estimation
    4. Understanding Bayes theorem and predictions
  6. Estimation
    1. Sampling distributions, bias and Exponential distributions
  7. Hypothesis testing
    1. Hypothesis Test
    2. Testing Correlation and Proportions
    3. Chi-Squared Tests
    4. Errors, Power and Replication
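To make the basic statistics items above concrete, here is a short sketch with NumPy: mean, variance, standard deviation and an empirical cumulative distribution function. The data values and the helper name `ecdf` are assumptions made for the example.

```python
import numpy as np

# Small made-up sample.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = data.mean()
var = data.var()               # population variance (ddof=0 is NumPy's default)
std = data.std()               # population standard deviation
sample_var = data.var(ddof=1)  # sample variance divides by n - 1


def ecdf(values):
    """Empirical CDF: sorted values and the fraction of points <= each one."""
    x = np.sort(values)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


x, y = ecdf(data)
```

Plotting `y` against `x` gives the step-like curve commonly used in exploratory analysis to compare a sample with a modelled distribution.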
  1. NumPy: N-dimensional array operations
    1. Array creation, conversion, understanding dimensions, shaping and reshaping, generating large sample datasets, linear algebra functionality, numerical operations, etc.
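The array operations listed above can be sketched as follows. All values are invented for illustration, and `np.random.default_rng` assumes NumPy 1.17 or later.

```python
import numpy as np

# Creation and reshaping: the integers 0..11 as 3 rows by 4 columns.
a = np.arange(12).reshape(3, 4)
col_means = a.mean(axis=0)  # mean of each column
flat = a.ravel()            # back to one dimension

# Conversion from a Python list, with an explicit dtype.
m = np.array([[2, 1], [1, 3]], dtype=float)
b = np.array([3.0, 5.0])

# Linear algebra: solve the system m @ x = b.
x = np.linalg.solve(m, b)

# Generating a large sample dataset with a seeded generator.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=10_000)
```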
  1. SciPy: High-level Scientific Computing
    1. Linear Algebra operations
    2. Interpolation
    3. Optimization and fit
    4. Statistics and random numbers
    5. Numerical Integration
    6. Fast Fourier transforms
    7. Signal processing and image manipulation
Part A: Pandas and NumPy Functionalities
  1. Introduction
  2. Pandas DataFrame basics
    1. Understanding data, looking at columns, rows and cells
    2. Subsetting Columns, Rows with methods
    3. Grouped and Aggregated Calculations
      1. Frequency Means and Counts
    4. Basic plot
  3. Pandas Data Structures
    1. Creating your own data (Series and DataFrame)
    2. Series (also called a Vector) Object operations
      1. Broadcasting and Scalar operations
    3. DataFrame Broadcasting (Vectorized)
    4. Making changes to Series and DataFrame
      1. Adding additional Columns
      2. Dropping values
    5. Exporting and Importing Data
Part B: Introduction to Plotting
  1. Introduction
  2. Matplotlib
  3. Statistical Graphics using matplotlib
    1. Univariate
    2. Bivariate
    3. Multivariate Data
  4. Seaborn Library Plotting methodology
    1. Univariate, Bivariate and Multivariate
  5. Pandas Objects Plotting
    1. Histogram, Density Plot, Scatterplot, Hexbin Plot and Boxplot
  6. Seaborn Themes and Styles
Part C: Data Manipulation
  1. Data Assembly
    1. Concatenations and Merging Multiple datasets
  2. Missing Data:
    1. Introduction
    2. What is a NaN Value
    3. Working with merged data, user input values and Re-indexing
    4. Working with missing data
      1. Finding and Counting missing data
      2. Cleansing missing data
      3. Calculations with missing data
      4. Conclusion
  3. Understanding Multiple Observations (Normalization)
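The missing-data steps above (finding, counting, cleansing and calculating) can be sketched with pandas. The DataFrame contents and column names are made up for the example.

```python
import numpy as np
import pandas as pd

# Made-up data with gaps marked as NaN.
df = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "temp": [21.0, np.nan, 19.0, np.nan],
    "rain": [0.0, 1.2, np.nan, 0.4],
})

# Finding and counting: isna() flags each missing cell, sum() counts per column.
missing_per_column = df.isna().sum()

# Cleansing, option 1: drop every row that has any missing value.
dropped = df.dropna()

# Cleansing, option 2: fill each column with a chosen replacement.
filled = df.fillna({"temp": df["temp"].mean(), "rain": 0.0})

# Calculations: pandas skips NaN by default, unless told otherwise.
mean_skipping_nan = df["temp"].mean()
sum_with_nan = df["temp"].sum(skipna=False)
```

Whether to drop or fill depends on the data; filling with the column mean is only one of several common choices.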
Part D: Data Munging
  1. Understanding Data Types
  2. Converting types
  3. Categorical Data
    1. Convert to Category
    2. Manipulating Categorical Data
  4. Strings and Text Data
    1. String Subsetting
    2. String Methods
    3. String Formatting
  5. Apply and Groupby Operations:
    1. Introduction
    2. Functions
    3. Apply over a Series and DataFrame
    4. Apply: Column-wise and Row-wise operations
    5. Groupby Operation:
    6. Aggregate Methods and Functions
  6. The datetime Data Type:
    1. Python’s datetime Object
    2. Loading, Converting, Extracting Date components
    3. Date Calculations
    4. Datetime Methods
    5. Subsetting datetime, Date Ranges, Shifting Values, TimeZones
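Several of the munging topics above (type conversion, categorical data, apply, groupby and datetime handling) fit into one short pandas sketch. The `sales` table, its column names and its values are invented for illustration.

```python
import pandas as pd

# Invented sales records; dates start out as plain strings.
sales = pd.DataFrame({
    "date": ["2023-01-15", "2023-01-20", "2023-02-03", "2023-02-28"],
    "region": ["north", "south", "north", "south"],
    "amount": [100.0, 150.0, 200.0, 50.0],
})

# Converting types: strings to datetime, strings to category.
sales["date"] = pd.to_datetime(sales["date"])
sales["region"] = sales["region"].astype("category")

# Extracting a date component through the .dt accessor.
sales["month"] = sales["date"].dt.month

# Apply over a Series: run a function on every value.
sales["amount_hundreds"] = sales["amount"].apply(lambda v: v / 100)

# Groupby with aggregate functions.
by_month = sales.groupby("month")["amount"].agg(["sum", "mean"])

# Date calculation: subtracting datetimes gives a Timedelta.
span_days = (sales["date"].max() - sales["date"].min()).days

# Subsetting by date with a boolean mask.
february = sales[sales["date"] >= "2023-02-01"]
```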
  1. Linear Models
    1. Linear and Multiple Regressions using statsmodels and sklearn
  2. Generalized Linear Models
    1. Logistic and Poisson Regressions using statsmodels and sklearn
    2. Survival Analysis
  3. Model diagnostics
    1. Residuals
    2. Comparing Multiple Models
    3. k-Fold Cross-Validation
  4. Regularization
  5. Clustering
    1. k-Means, Dimension Reduction with PCA (Principal Component Analysis)
    2. Hierarchical Clustering
    3. Conclusions
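To show what a linear regression fit and k-fold cross-validation involve, the sketch below uses plain NumPy on synthetic data. It is a swapped-in, hand-rolled illustration, not the statsmodels or sklearn workflow named in the outline; the helper names `fit_ols`, `predict` and `k_fold_mse` and the synthetic coefficients are assumptions.

```python
import numpy as np

# Synthetic data: y = 3x + 2 plus noise (coefficients chosen for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=100)


def fit_ols(x, y):
    """Ordinary least squares for one predictor; returns [intercept, slope]."""
    X = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, x):
    return coef[0] + coef[1] * x


def k_fold_mse(x, y, k=5):
    """Hold out each fold in turn, fit on the rest, score on the held-out fold."""
    idx = np.arange(len(x))
    scores = []
    for test_idx in np.array_split(idx, k):
        train_idx = np.setdiff1d(idx, test_idx)
        coef = fit_ols(x[train_idx], y[train_idx])
        residuals = y[test_idx] - predict(coef, x[test_idx])
        scores.append(float(np.mean(residuals ** 2)))
    return scores


coef = fit_ols(x, y)
scores = k_fold_mse(x, y, k=5)
```

Comparing the average of `scores` across candidate models is one simple way to compare multiple models on held-out data.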

Mode of Training

Online

Total duration of the course

5 to 7 weeks

Training duration per day

50 mins - 90 mins

Communication Mode

GoToMeeting, WebEx

Software access

Software will be installed or server access will be provided, whichever is possible.

Material

Soft copy of the material will be provided during the training.

Training

Both weekdays and weekends

Training Fee

$300