Data
The Data Science Course: Complete Data Science Bootcamp
- 4.5/5.0
- 12k Enrolled
- All levels
- Last updated 09/2021
- English

0 coins
- A Practical Example: What You Will Learn in This Course
- What Does the Course Cover
- Data Science and Business Buzzwords: Why are there so Many?
- What is the difference between Analysis and Analytics
- Business Analytics, Data Analytics, and Data Science: An Introduction
- Continuing with BI, ML, and AI
- Traditional AI vs. Generative AI
- More Examples of Generative AI
- A Breakdown of our Data Science Infographic
- Applying Traditional Data, Big Data, BI, Traditional Data Science and ML
- The Reason Behind These Disciplines
- Techniques for Working with Traditional Data
- Real Life Examples of Traditional Data
- Techniques for Working with Big Data
- Real Life Examples of Big Data
- Business Intelligence (BI) Techniques
- Real Life Examples of Business Intelligence (BI)
- Techniques for Working with Traditional Methods
- Real Life Examples of Traditional Methods
- Machine Learning (ML) Techniques
- Types of Machine Learning
- Evolution and Latest Trends of Machine Learning (ML)
- Real Life Examples of Machine Learning (ML)
- Necessary Programming Languages and Software Used in Data Science
- Finding the Job - What to Expect and What to Look for
- Debunking Common Misconceptions
- The Basic Probability Formula
- Computing Expected Values
- Frequency
- Events and Their Complements
- Fundamentals of Combinatorics
- Permutations and How to Use Them
- Simple Operations with Factorials
- Solving Variations with Repetition
- Solving Variations without Repetition
- Solving Combinations
- Symmetry of Combinations
- Solving Combinations with Separate Sample Spaces
- Combinatorics in Real-Life: The Lottery
- A Recap of Combinatorics
- A Practical Example of Combinatorics
- Sets and Events
- Ways Sets Can Interact
- Intersection of Sets
- Union of Sets
- Mutually Exclusive Sets
- Dependence and Independence of Sets
- The Conditional Probability Formula
- The Law of Total Probability
- The Additive Rule
- The Multiplication Law
- Bayes' Law
- A Practical Example of Bayesian Inference
- Fundamentals of Probability Distributions
- Types of Probability Distributions
- Characteristics of Discrete Distributions
- Discrete Distributions: The Uniform Distribution
- Discrete Distributions: The Bernoulli Distribution
- Discrete Distributions: The Binomial Distribution
- Discrete Distributions: The Poisson Distribution
- Characteristics of Continuous Distributions
- Continuous Distributions: The Normal Distribution
- Continuous Distributions: The Standard Normal Distribution
- Continuous Distributions: The Student's T Distribution
- Continuous Distributions: The Chi-Squared Distribution
- Continuous Distributions: The Exponential Distribution
- Continuous Distributions: The Logistic Distribution
- A Practical Example of Probability Distributions
- Probability in Finance
- Probability in Statistics
- Probability in Data Science
- Population and Sample
- Types of Data
- Levels of Measurement
- Categorical Variables - Visualization Techniques
- Numerical Variables - Frequency Distribution Table
- The Histogram
- Cross Tables and Scatter Plots
- Mean, median and mode
- Skewness
- Variance
- Standard Deviation and Coefficient of Variation
- Covariance
- Correlation Coefficient
- Practical Example: Descriptive Statistics
- Introduction
- What is a Distribution
- The Normal Distribution
- The Standard Normal Distribution
- Central Limit Theorem
- Standard error
- Estimators and Estimates
- What are Confidence Intervals?
- Confidence Intervals; Population Variance Known; Z-score
- Confidence Interval Clarifications
- Student's T Distribution
- Confidence Intervals; Population Variance Unknown; T-score
- Margin of Error
- Confidence intervals. Two means. Dependent samples
- Confidence intervals. Two means. Independent Samples (Part 1)
- Confidence intervals. Two means. Independent Samples (Part 2)
- Confidence intervals. Two means. Independent Samples (Part 3)
- Practical Example: Inferential Statistics
- Null vs Alternative Hypothesis
- Rejection Region and Significance Level
- Type I Error and Type II Error
- Test for the Mean. Population Variance Known
- p-value
- Test for the Mean. Population Variance Unknown
- Test for the Mean. Dependent Samples
- Test for the mean. Independent Samples (Part 1)
- Test for the mean. Independent Samples (Part 2)
- Practical Example: Hypothesis Testing
- Introduction to Programming
- Why Python?
- Why Jupyter?
- Installing Python and Jupyter
- Understanding Jupyter's Interface - the Notebook Dashboard
- Prerequisites for Coding in the Jupyter Notebooks
- Variables
- Numbers and Boolean Values in Python
- Python Strings
- Using Arithmetic Operators in Python
- The Double Equality Sign
- How to Reassign Values
- Add Comments
- Understanding Line Continuation
- Indexing Elements
- Structuring with Indentation
- Comparison Operators
- Logical and Identity Operators
- The IF Statement
- The ELSE Statement
- The ELIF Statement
- A Note on Boolean Values
- Defining a Function in Python
- How to Create a Function with a Parameter
- Defining a Function in Python - Part II
- How to Use a Function within a Function
- Conditional Statements and Functions
- Functions Containing a Few Arguments
- Built-in Functions in Python
- Lists
- Using Methods
- List Slicing
- Tuples
- Dictionaries
- For Loops
- While Loops and Incrementing
- Lists with the range() Function
- Conditional Statements and Loops
- Conditional Statements, Functions, and Loops
- How to Iterate over Dictionaries
- Object Oriented Programming
- Modules and Packages
- What is the Standard Library?
- Importing Modules in Python
- Introduction to Regression Analysis
- The Linear Regression Model
- Correlation vs Regression
- Geometrical Representation of the Linear Regression Model
- Python Packages Installation
- First Regression in Python
- Using Seaborn for Graphs
- How to Interpret the Regression Table
- Decomposition of Variability
- What is the OLS?
- R-Squared
- Multiple Linear Regression
- Adjusted R-Squared
- Test for Significance of the Model (F-Test)
- OLS Assumptions
- A1: Linearity
- A2: No Endogeneity
- A3: Normality and Homoscedasticity
- A4: No Autocorrelation
- A5: No Multicollinearity
- Dealing with Categorical Data - Dummy Variables
- Making Predictions with the Linear Regression
- What is sklearn and How is it Different from Other Packages
- How are we Going to Approach this Section?
- Simple Linear Regression with sklearn
- Simple Linear Regression with sklearn - A StatsModels-like Summary Table
- Multiple Linear Regression with sklearn
- Calculating the Adjusted R-Squared in sklearn
- Feature Selection (F-regression)
- Creating a Summary Table with P-values
- Feature Scaling (Standardization)
- Feature Selection through Standardization of Weights
- Predicting with the Standardized Coefficients
- Underfitting and Overfitting
- Train - Test Split Explained
- Practical Example: Linear Regression (Part 1)
- Practical Example: Linear Regression (Part 2)
- Practical Example: Linear Regression (Part 3)
- Practical Example: Linear Regression (Part 4)
- Practical Example: Linear Regression (Part 5)
- Introduction to Logistic Regression
- A Simple Example in Python
- Logistic vs Logit Function
- Building a Logistic Regression
- An Invaluable Coding Tip
- Understanding Logistic Regression Tables
- What do the Odds Actually Mean
- Binary Predictors in a Logistic Regression
- Calculating the Accuracy of the Model
- Underfitting and Overfitting
- Testing the Model
- Introduction to Cluster Analysis
- Some Examples of Clusters
- Difference between Classification and Clustering
- Math Prerequisites
- K-Means Clustering
- A Simple Example of Clustering
- Clustering Categorical Data
- How to Choose the Number of Clusters
- Pros and Cons of K-Means Clustering
- To Standardize or not to Standardize
- Relationship between Clustering and Regression
- Market Segmentation with Cluster Analysis (Part 1)
- Market Segmentation with Cluster Analysis (Part 2)
- How is Clustering Useful?
- Types of Clustering
- Dendrogram
- Heatmaps
- Traditional data science methods and the role of ChatGPT
- How to install ChatGPT
- How ChatGPT can boost your productivity
- Data Preprocessing with ChatGPT
- First attempt at machine learning with ChatGPT
- Analyzing a client database with ChatGPT in Python
- Analyzing a client database with ChatGPT in Python – analyzing top products
- Analyzing a client database with ChatGPT in Python – analyzing top clients, RFM
- Exploratory data analysis (EDA) with ChatGPT - histogram and scatter plot
- Exploratory data analysis (EDA) with ChatGPT - correlation matrix, outlier detec
- Hypothesis testing with ChatGPT
- Marvels comic book database: Intro to Regular Expressions (RegEx)
- Decoding comic book data: Python Regular Expressions and ChatGPT
- Algorithm recommendation: Movie Database Analysis with ChatGPT
- Algorithm recommendation: recommendation engine for movies with ChatGPT
- Ethical principles in data and AI utilization
- Using ChatGPT for ethical considerations
- Intro to the Case Study
- The Naive Bayes Algorithm
- Tokenization and Vectorization
- Imbalanced Data Sets
- Overcome Imbalanced Data in Machine Learning
- Loading the Dataset and Preprocessing
- Optimizing User Reviews: Data Preprocessing & EDA
- RegEx for Analyzing Text Review Data
- Understanding Differences between Multinomial and Bernoulli Naive Bayes
- Machine Learning with Naïve Bayes (First Attempt)
- Machine Learning with Naïve Bayes – converting the problem to a binary one
- Testing the Model on New Data
- What is a Matrix?
- Scalars and Vectors
- Linear Algebra and Geometry
- Arrays in Python - A Convenient Way To Represent Matrices
- What is a Tensor?
- Addition and Subtraction of Matrices
- Errors when Adding Matrices
- Transpose of a Matrix
- Dot Product
- Dot Product of Matrices
- Why is Linear Algebra Useful?
- What to Expect from this Part?
- Introduction to Neural Networks
- Training the Model
- Types of Machine Learning
- The Linear Model (Linear Algebraic Version)
- The Linear Model with Multiple Inputs
- The Linear Model with Multiple Inputs and Multiple Outputs
- Graphical Representation of Simple Neural Networks
- What is the Objective Function?
- Common Objective Functions: L2-norm Loss
- Common Objective Functions: Cross-Entropy Loss
- Optimization Algorithm: 1-Parameter Gradient Descent
- Optimization Algorithm: n-Parameter Gradient Descent
- Basic NN Example (Part 1)
- Basic NN Example (Part 2)
- Basic NN Example (Part 3)
- Basic NN Example (Part 4)
- How to Install TensorFlow 2.0
- TensorFlow Outline and Comparison with Other Libraries
- TensorFlow 1 vs TensorFlow 2
- A Note on TensorFlow 2 Syntax
- Types of File Formats Supporting TensorFlow
- Outlining the Model with TensorFlow 2
- Interpreting the Result and Extracting the Weights and Bias
- Customizing a TensorFlow 2 Model
- What is a Layer?
- What is a Deep Net?
- Digging into a Deep Net
- Non-Linearities and their Purpose
- Activation Functions
- Activation Functions: Softmax Activation
- Backpropagation
- Backpropagation Picture
- What is Overfitting?
- Underfitting and Overfitting for Classification
- What is Validation?
- Training, Validation, and Test Datasets
- N-Fold Cross Validation
- Early Stopping or When to Stop Training
- What is Initialization?
- Types of Simple Initializations
- State-of-the-Art Method - (Xavier) Glorot Initialization
- Stochastic Gradient Descent
- Problems with Gradient Descent
- Momentum
- Learning Rate Schedules, or How to Choose the Optimal Learning Rate
- Learning Rate Schedules Visualized
- Adaptive Learning Rate Schedules (AdaGrad and RMSprop)
- Adam (Adaptive Moment Estimation)
- Preprocessing Introduction
- Types of Basic Preprocessing
- Standardization
- Preprocessing Categorical Data
- Binary and One-Hot Encoding
- MNIST: The Dataset
- MNIST: How to Tackle the MNIST
- MNIST: Importing the Relevant Packages and Loading the Data
- MNIST: Preprocess the Data - Create a Validation Set and Scale It
- MNIST: Preprocess the Data - Shuffle and Batch
- MNIST: Outline the Model
- MNIST: Select the Loss and the Optimizer
- MNIST: Learning
- MNIST: Testing the Model
- Business Case: Exploring the Dataset and Identifying Predictors
- Business Case: Outlining the Solution
- Business Case: Balancing the Dataset
- Business Case: Preprocessing the Data
- Business Case: Load the Preprocessed Data
- Business Case: Learning and Interpreting the Result
- Business Case: Setting an Early Stopping Mechanism
- Business Case: Testing the Model
- Summary on What You've Learned
- What's Further out there in terms of Machine Learning
- An Overview of CNNs
- An Overview of RNNs
- An Overview of non-NN Approaches
- How to Install TensorFlow 1
- TensorFlow Intro
- Actual Introduction to TensorFlow
- Types of File Formats Supporting Tensors
- Basic NN Example with TF: Inputs, Outputs, Targets, Weights, Biases
- Basic NN Example with TF: Loss Function and Gradient Descent
- Basic NN Example with TF: Model Output
- MNIST: What is the MNIST Dataset?
- MNIST: How to Tackle the MNIST
- MNIST: Relevant Packages
- MNIST: Model Outline
- MNIST: Loss and Optimization Algorithm
- Calculating the Accuracy of the Model
- MNIST: Batching and Early Stopping
- MNIST: Learning
- MNIST: Results and Testing
- Business Case: Getting Acquainted with the Dataset
- Business Case: Outlining the Solution
- The Importance of Working with a Balanced Dataset
- Business Case: Preprocessing
- Creating a Data Provider
- Business Case: Model Outline
- Business Case: Optimization
- Business Case: Interpretation
- Business Case: Testing the Model
- Business Case: A Comment on the Homework
- What are Data, Servers, Clients, Requests, and Responses
- What are Data Connectivity, APIs, and Endpoints?
- Taking a Closer Look at APIs
- Communication between Software Products through Text Files
- Software Integration - Explained
- Game Plan for this Python, SQL, and Tableau Business Exercise
- The Business Task
- Introducing the Data Set
- Importing the Absenteeism Data in Python
- Checking the Content of the Data Set
- Introduction to Terms with Multiple Meanings
- Using a Statistical Approach towards the Solution to the Exercise
- Dropping a Column from a DataFrame in Python
- Analyzing the Reasons for Absence
- Obtaining Dummies from a Single Feature
- More on Dummy Variables: A Statistical Perspective
- Classifying the Various Reasons for Absence
- Using .concat() in Python
- Reordering Columns in a Pandas DataFrame in Python
- Creating Checkpoints while Coding in Jupyter
- Analyzing the Dates from the Initial Data Set
- Extracting the Month Value from the "Date" Column
- Extracting the Day of the Week from the "Date" Column
- Analyzing Several "Straightforward" Columns for this Exercise
- Working on "Education", "Children", and "Pets"
- Final Remarks of this Section
- Exploring the Problem with a Machine Learning Mindset
- Creating the Targets for the Logistic Regression
- Selecting the Inputs for the Logistic Regression
- Standardizing the Data
- Splitting the Data for Training and Testing
- Fitting the Model and Assessing its Accuracy
- Creating a Summary Table with the Coefficients and Intercept
- Interpreting the Coefficients for Our Problem
- Standardizing only the Numerical Variables (Creating a Custom Scaler)
- Interpreting the Coefficients of the Logistic Regression
- Backward Elimination or How to Simplify Your Model
- Testing the Model We Created
- Saving the Model and Preparing it for Deployment
- Preparing the Deployment of the Model through a Module
- Deploying the 'absenteeism_module' - Part I
- Deploying the 'absenteeism_module' - Part II
- Analyzing Age vs Probability in Tableau
- Analyzing Reasons vs Probability in Tableau
- Analyzing Transportation Expense vs Probability in Tableau
- Using the .format() Method
- Iterating Over Range Objects
- Introduction to Nested For Loops
- Triple Nested For Loops
- List Comprehensions
- Anonymous (Lambda) Functions
- Introduction to pandas Series
- Working with Methods in Python - Part I
- Working with Methods in Python - Part II
- Parameters and Arguments in pandas
- Using .unique() and .nunique()
- Using .sort_values()
- Introduction to pandas DataFrames - Part I
- Introduction to pandas DataFrames - Part II
- pandas DataFrames - Common Attributes
- Data Selection in pandas DataFrames
- pandas DataFrames - Indexing with .iloc[]
- pandas DataFrames - Indexing with .loc[]