Course Curriculum
- 15 sections
- 190 lectures
- 15 hours, 50 minutes total length
- Modeling an epidemic (00:08:00)
- The machine learning recipe (00:06:00)
- The components of a machine learning model (00:02:00)
- Why model? (00:03:00)
- On assumptions and can we get rid of them? (00:09:00)
- The case of AlphaZero (00:11:00)
- Overfitting/underfitting/bias/variance (00:11:00)
- Why use machine learning (00:05:00)
- The InsureMe challenge (00:06:00)
- Supervised learning (00:05:00)
- Linear assumption (00:03:00)
- Linear regression template (00:07:00)
- Non-linear vs proportional vs linear (00:05:00)
- Linear regression template revisited (00:04:00)
- Loss function (00:03:00)
- Training algorithm (00:08:00)
- Code time (00:15:00)
- R squared (00:06:00)
- Why use a linear model? (00:04:00)
- Introduction to scaling (00:06:00)
- Min-max scaling (00:03:00)
- Code time (min-max scaling) (00:09:00)
- The problem with min-max scaling (00:03:00)
- What’s your IQ? (00:11:00)
- Standard scaling (00:04:00)
- Code time (standard scaling) (00:02:00)
- Model before and after scaling (00:05:00)
- Inference time (00:07:00)
- Pipelines (00:03:00)
- Code time (pipelines) (00:05:00)
- Spurious correlations (00:04:00)
- L2 regularization (00:10:00)
- Code time (L2 regularization) (00:05:00)
- L2 results (00:02:00)
- L1 regularization (00:06:00)
- Code time (L1 regularization) (00:04:00)
- L1 results (00:02:00)
- Why does L1 encourage zeros? (00:09:00)
- L1 vs L2: Which one is best? (00:01:00)
- Introduction to validation (00:02:00)
- Why not evaluate model on training data (00:06:00)
- The validation set (00:05:00)
- Code time (validation set) (00:08:00)
- Error curves (00:06:00)
- Model selection (00:06:00)
- The problem with model selection (00:06:00)
- Tainted validation set (00:05:00)
- Monkeys with typewriters (00:03:00)
- My own validation epic fail (00:07:00)
- The test set (00:06:00)
- What if the model doesn’t pass the test? (00:05:00)
- How not to be fooled by randomness (00:02:00)
- Cross-validation (00:04:00)
- Code time (cross validation) (00:07:00)
- Cross-validation results summary (00:02:00)
- AutoML (00:05:00)
- Is AutoML a good idea? (00:05:00)
- Red flags: Don’t do this! (00:07:00)
- Red flags summary and what to do instead (00:05:00)
- Your job as a data scientist (00:03:00)
- Intro and recap (00:02:00)
- Mistake #1: Data leakage (00:05:00)
- The golden rule (00:04:00)
- Helpful trick (feature importance) (00:02:00)
- Real example of data leakage (part 1) (00:05:00)
- Real example of data leakage (part 2) (00:05:00)
- Another (funny) example of data leakage (00:02:00)
- Mistake #2: Random split of dependent data (00:05:00)
- Another example (insurance data) (00:05:00)
- Mistake #3: Look-Ahead Bias (00:06:00)
- Example solutions to Look-Ahead Bias (00:02:00)
- Consequences of Look-Ahead Bias (00:02:00)
- How to split data to avoid Look-Ahead Bias (00:03:00)
- Cross-validation with temporally related data (00:03:00)
- Mistake #4: Building model for one thing, using it for something else (00:04:00)
- Sketchy rationale (00:06:00)
- Why this matters for your career and job search (00:04:00)
- Classifying images of handwritten digits (00:07:00)
- Why the usual regression doesn’t work (00:04:00)
- Machine learning recipe recap (00:02:00)
- Logistic model template (binary) (00:13:00)
- Decision function and boundary (binary) (00:05:00)
- Logistic model template (multiclass) (00:14:00)
- Decision function and boundary (multi-class) (00:01:00)
- Summary: binary vs multiclass (00:01:00)
- Code time! (00:20:00)
- Why the logistic model is often called logistic regression (00:05:00)
- One vs Rest, One vs One (00:05:00)
- Where we’re at (00:02:00)
- Brier score and why it doesn’t work (00:06:00)
- The likelihood function (00:11:00)
- Optimization task and numerical stability (00:03:00)
- Let’s improve the loss function (00:09:00)
- Loss value examples (00:05:00)
- Adding regularization (00:02:00)
- Binary cross-entropy loss (00:03:00)
- Recap (00:03:00)
- No closed-form solution (00:02:00)
- Naive algorithm (00:04:00)
- Fog analogy (00:05:00)
- Gradient descent overview (00:03:00)
- The gradient (00:06:00)
- Numerical calculation (00:02:00)
- Parameter update (00:04:00)
- Convergence (00:02:00)
- Analytical solution (00:02:00)
- [Optional] Interpreting analytical solution (00:05:00)
- Gradient descent conditions (00:03:00)
- Beyond vanilla gradient descent (00:03:00)
- Code time (00:07:00)
- Reading the documentation (00:11:00)
- Binary classification and class imbalance (00:06:00)
- Assessing performance (00:04:00)
- Accuracy (00:07:00)
- Accuracy with different class importance (00:04:00)
- Precision and Recall (00:07:00)
- Sensitivity and Specificity (00:03:00)
- F-measure and other combined metrics (00:05:00)
- ROC curve (00:07:00)
- Area under the ROC curve (00:06:00)
- Custom metric (important stuff!) (00:06:00)
- Other custom metrics (00:03:00)
- Bad data science process (00:04:00)
- Data rebalancing (avoid doing this!) (00:06:00)
- Stratified split (00:03:00)
- The inverted MNIST dataset (00:04:00)
- The problem with linear models (00:05:00)
- Neurons (00:03:00)
- Multi-layer perceptron (MLP) for binary classification (00:05:00)
- MLP for regression (00:02:00)
- MLP for multi-class classification (00:01:00)
- Hidden layers (00:01:00)
- Activation functions (00:03:00)
- Decision boundary (00:02:00)
- Loss function (00:03:00)
- Intro to neural network training (00:03:00)
- Parameter initialization (00:03:00)
- Saturation (00:05:00)
- Non-convexity (00:04:00)
- Stochastic gradient descent (SGD) (00:05:00)
- More on SGD (00:07:00)
- Code time! (00:13:00)
- Backpropagation (00:11:00)
- The problem with MLPs (00:04:00)
- Deep learning (00:09:00)
- Decision trees (00:04:00)
- Building decision trees (00:09:00)
- Stopping tree growth (00:03:00)
- Pros and cons of decision trees (00:08:00)
- Decision trees for classification (00:07:00)
- Decision boundary (00:01:00)
- Bagging (00:04:00)
- Random forests (00:06:00)
- Gradient-boosted trees for regression (00:07:00)
- Gradient-boosted trees for classification [optional] (00:04:00)
- How to use gradient-boosted trees (00:03:00)
- Nearest neighbor classification (00:03:00)
- K nearest neighbors (00:03:00)
- Disadvantages of k-NN (00:04:00)
- Recommendation systems (collaborative filtering) (00:03:00)
- Introduction to Support Vector Machines (SVMs) (00:05:00)
- Maximum margin (00:02:00)
- Soft margin (00:02:00)
- SVM vs Logistic Model (support vectors) (00:03:00)
- Alternative SVM formulation (00:06:00)
- Dot product (00:02:00)
- Non-linearly separable data (00:03:00)
- Kernel trick (polynomial) (00:10:00)
- RBF kernel (00:02:00)
- SVM remarks (00:06:00)
- Intro to unsupervised learning (00:01:00)
- Clustering (00:03:00)
- K-means clustering (00:10:00)
- K-means application example (00:03:00)
- Elbow method (00:02:00)
- Clustering remarks (00:07:00)
- Intro to dimensionality reduction (00:05:00)
- PCA (principal component analysis) (00:08:00)
- PCA remarks (00:03:00)
- Code time (PCA) (00:13:00)
- Missing data (00:02:00)
- Imputation (00:04:00)
- Imputer within pipeline (00:04:00)
- One-Hot encoding (00:05:00)
- Ordinal encoding (00:03:00)
- How to combine pipelines (00:04:00)
- Code sample (00:08:00)
- Feature Engineering (00:07:00)
- Features for Natural Language Processing (NLP) (00:11:00)
- Anatomy of a Data Science Project (00:01:00)