Courses: Software Training | Locality: Aundh
Data Science, Big Data Analytics and R Programming
About the Course:
In this course, you will get an introduction to the main tools and ideas required of a Data Scientist, Business Analyst or Data Analyst. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools used in the program, such as R Programming, SAS, Minitab and Excel.
Course features:
140+ hours of detailed teaching
Exclusive doubt clarification sessions (one-on-one or many-to-one)
Real-time, example-driven approach
Live Project
Placement assistance
Prerequisite:
Any graduate. No programming or statistics knowledge or skills required.
Duration of the course:
Weekday batch (3 months: Mon to Fri, 2 hours per session)
Weekend batch (4 months: Sat and Sun, 2 hours per session)
Customized batch: contents tailored as needed
o Quick course on Machine Learning (1 month, 2 hours per session, Sat/Sun batch for working executives)
o Introduction to R Programming (1 month, 2 hours per session, Sat/Sun batch for working executives/students)
Module 1:
Big Data: Definition, Characteristics, Architecture, Technologies, Applications & Sampling Big Data
Data Science: Overview, Data scientist, History, Criticism, Software
Machine Learning: Overview, Types of problems and tasks (supervised and unsupervised learning), Relation to statistics, Theory, Approaches (Decision tree learning, Association rule learning, etc.), Applications, Software (Open-source software, Commercial software with open-source editions, Commercial software)
Statistics: Random Data, Terms and definitions (Mean, Standard deviation, Probability, Normal distribution, etc.), Exercises on calculating various parameters, Data collection, sampling and observational studies
Probability Distributions: Discrete Random Variables, Mean, Expected Value, Binomial Random Variable, Poisson Random Variable, Continuous Random Variable, Normal distribution
Statistical tests and procedures:
o Hypothesis testing
o Analysis of variance (ANOVA)
o Chi-squared test
o Correlation
o Factor analysis
o Mann-Whitney U test
o Mean square weighted deviation (MSWD)
o Regression analysis
o Student's t-test
o Time series analysis
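To give a flavour of the "calculating various parameters" exercises and of Student's t-test listed above, here is a minimal sketch that computes a mean, a sample standard deviation, and a two-sample t statistic with equal variances assumed. It is written in Python with only the standard library, purely for illustration; the course itself works in R. The sample values and the function name `pooled_t_statistic` are made up for this example and are not course material.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic, assuming equal variances."""
    n_a, n_b = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


# Hypothetical measurements for two groups (illustrative values only).
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [10.0, 9.0, 12.0, 11.0, 8.0]

mean_a = statistics.mean(group_a)
sd_a = statistics.stdev(group_a)  # sample standard deviation
t_stat = pooled_t_statistic(group_a, group_b)

print("mean of group A:", mean_a)
print("standard deviation of group A:", round(sd_a, 4))
print("t statistic:", round(t_stat, 4))
```

The t statistic would then be compared against a t distribution with n_a + n_b - 2 degrees of freedom to obtain a p-value, which is the hypothesis-testing step the module covers.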
Module 2: R Programming
Introduction: Overview of R, environment setup, basic syntax, console and scripts, comments in R
Data Types: Vectors, Matrices, Arrays, Lists, Factors and Data frames.
Variables and Operators: Declaration, assignment, data type of a variable, arithmetic, relational, logical, assignment and miscellaneous operators.
Decision making and loops: if-else statement, while loop, for loop, repeat loop, and controlling loops with break and next statements
Functions: user defined and built-in functions, defining and calling with and without parameters
Detailed study of Vectors, Lists, Data frames and Matrices
R Packages
Data access from various formats: CSV, XLS, XML, JSON and other formats, Data manipulation
Data Presentation and Charts: Graphs, Line graph, Bar chart, Pie chart, Box plot, Histograms, Scatter plots, etc.
Regression: Linear, multiple, logistic and Poisson regression, Analysis of covariance, Time series analysis, Mean squared error
Decision Tree: Random forest, survival analysis, Chi-square test
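As an illustration of the linear regression and mean squared error topics in this module, the sketch below fits a straight line by ordinary least squares and measures its error. It uses plain Python rather than R, only to keep the example self-contained; the data and the helper names `fit_line` and `mean_squared_error` are hypothetical and not taken from the course.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between observed and predicted y."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals) / len(residuals)


# Hypothetical data: hours studied versus test score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 61.0, 63.0, 70.0]

slope, intercept = fit_line(hours, scores)
mse = mean_squared_error(hours, scores, slope, intercept)

print("slope:", slope)
print("intercept:", intercept)
print("mean squared error:", mse)
```

In R, the same fit is normally obtained with the built-in `lm()` function; writing the formulas out by hand here is only meant to show what the fit computes.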
Module 3: Prediction and Machine Learning Project
Titanic survival prediction
Expedia booking prediction
Prediction of the operating condition of water points in a data set from Africa