ML Terminology
Machine Learning Subcategories
- Supervised Learning
- Unsupervised Learning
Supervised Machine Learning uses a set of input variables to predict the value of an output variable.
Unsupervised Machine Learning works on unlabeled data and tries to find patterns (or groupings) in it.
Machine Learning Phases
Machine learning has two main phases:
1. Training:
Input data are used to calculate the parameters of the model.
2. Inference:
The "trained" model outputs predictions for new input data.
Machine Learning Models
A Model defines the relationship between the label (y) and the features (x).
There are three phases in the life of a model:
- Data Collection
- Training
- Inference
Machine Learning Training
The goal of training is to create a model that can answer a question, such as "What is the expected price for a house?"
Machine Learning Inference
Inference is when the trained model is used to infer (predict) values from live data, for example after the model has been put into production.
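The two phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the house sizes and prices are made-up numbers, the function names train and infer are hypothetical, and ordinary least squares is just one simple way to calculate the parameters of a straight-line model.

```python
def train(xs, ys):
    """Training: estimate the weight w and intercept b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def infer(params, x):
    """Inference: use the trained parameters to predict y for a new x."""
    w, b = params
    return b + w * x


# Made-up training data: house size (square meters) and price (thousands).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

params = train(sizes, prices)   # training phase
predicted = infer(params, 90)   # inference phase on a size not seen in training
print(params, predicted)
```

Training runs once over the known data; inference can then be repeated for any number of new inputs without touching the training data again.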
Supervised Learning
Supervised learning uses labeled data (data with known answers) to train algorithms to:
- Classify Data
- Predict Outcomes
Supervised learning can classify data, for example deciding whether an e-mail is spam, based on known spam examples.
Supervised learning can predict outcomes, for example what kind of video you will like, based on the videos you have played.
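As a rough sketch of classification from labeled data, the example below trains a nearest-centroid classifier. The two features (number of links and number of exclamation marks per e-mail), the tiny data set, and the function names are all assumptions made for illustration; real spam filters use far richer features and models.

```python
def train_centroids(examples):
    """Training: average the feature vectors of each label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(centroids, features):
    """Inference: return the label whose centroid is closest to the input."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


# Made-up labeled data: ([links, exclamation marks], known answer).
labeled = [
    ([8, 5], "spam"), ([6, 7], "spam"),
    ([0, 1], "ham"), ([1, 0], "ham"),
]

centroids = train_centroids(labeled)
result = classify(centroids, [7, 4])
print(result)
```

The known answers ("spam" and "ham") are what make this supervised: the algorithm never has to guess what the groups mean.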
Unsupervised Learning
Unsupervised learning is used to discover relationships that have not been defined in advance, such as meaningful patterns in data.
The aim is to create computer algorithms that can improve themselves.
Some expect machine learning to shift toward unsupervised learning, so that programmers can solve problems without creating models by hand.
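Finding groupings in unlabeled data can be sketched with a very small one-dimensional k-means clustering. The function name, the data values, and the choice of starting centroids are assumptions for illustration, and k-means is only one of many clustering techniques.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the given number of centroids (simple k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled data: nobody has said which values belong together.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

# Assumed starting guess: the smallest and largest value.
final_centroids, groups = k_means_1d(values, [1.0, 12.0])
print(final_centroids, groups)
```

No labels are supplied at any point; the two groups emerge only from how close the values are to each other.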
Reinforcement Learning
Reinforcement learning resembles unsupervised learning in that it does not need labeled data, but it receives feedback (from a user or from its environment) on whether each decision was good or bad. The feedback is used to improve the model.
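The feedback loop can be sketched as follows. This is a deliberately simplified, purely greedy illustration: the action names, the +1 (good) and -1 (bad) feedback values, the step size, and the function names are all assumptions, and practical reinforcement learning also explores actions that currently look worse.

```python
def run_feedback_loop(actions, feedback, rounds=20, step_size=0.5):
    """Repeatedly pick the best-scoring action and adjust its score from feedback."""
    scores = {action: 0.0 for action in actions}
    for _ in range(rounds):
        # Decide: choose the action with the highest current score.
        action = max(actions, key=lambda a: scores[a])
        # Receive feedback: +1 means good, -1 means bad (assumed convention).
        reward = feedback(action)
        # Improve: move the score part of the way toward the feedback.
        scores[action] += step_size * (reward - scores[action])
    return scores


def example_feedback(action):
    """Hypothetical feedback source that likes "right" and dislikes "left"."""
    return 1.0 if action == "right" else -1.0


scores = run_feedback_loop(["left", "right"], example_feedback)
best_action = max(scores, key=scores.get)
print(scores, best_action)
```

After one bad experience with "left", the loop keeps choosing "right", and its score moves closer to 1 with every round of positive feedback.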
Self-Supervised Learning
Self-supervised learning is similar to unsupervised learning because it works with data without human-added labels.
The difference is that unsupervised learning focuses on clustering, grouping, and dimensionality reduction, while self-supervised learning derives its own labels from the data and uses them for regression and classification tasks.
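One common way to get labels without human input is to let part of the data act as the label for another part. The sketch below turns an unlabeled sequence into (features, label) pairs by using each value as the label for the values just before it. The function name, window size, and readings are assumptions for illustration.

```python
def make_next_value_pairs(sequence, window=2):
    """Build (features, label) pairs where the label is the next value."""
    pairs = []
    for i in range(len(sequence) - window):
        features = sequence[i:i + window]   # the values seen so far
        label = sequence[i + window]        # the value that follows them
        pairs.append((features, label))
    return pairs


# Unlabeled data: a plain sequence of readings with no human-added labels.
readings = [1, 2, 3, 4, 5]

pairs = make_next_value_pairs(readings)
print(pairs)   # [([1, 2], 3), ([2, 3], 4), ([3, 4], 5)]
```

The resulting pairs can then be fed to any supervised training procedure, even though no person ever labeled the data.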
Key Machine Learning terms are:
- Relationships
- Labels
- Features
- Models
- Training
- Inference
Relationships
Machine learning systems use relationships between inputs to produce predictions.
In algebra, a relationship is often written as y = ax + b:
- y is the label we want to predict
- a is the slope of the line
- x are the input values
- b is the intercept
With ML, a relationship is written as y = b + wx:
- y is the label we want to predict
- w is the weight (the slope)
- x are the features (input values)
- b is the intercept
Machine Learning Labels
In Machine Learning terminology, the label is the thing we want to predict.
It is like the y in a linear graph:
Algebra | Machine Learning
y = ax + b | y = b + wx
Machine Learning Features
In Machine Learning terminology, the features are the input.
They are like the x values in a linear graph:
Algebra | Machine Learning
y = ax + b | y = b + wx
Sometimes there can be many features (input values) with different weights:
y = b + w1x1 + w2x2 + w3x3 + w4x4
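The formula with several features can be written as a short Python function: each feature is multiplied by its own weight, and the results are added to the intercept. The function name and the numbers are made up for illustration; in practice the weights and intercept would come from training.

```python
def predict(features, weights, intercept):
    """Compute y = b + w1*x1 + w2*x2 + ... for one set of feature values."""
    if len(features) != len(weights):
        raise ValueError("each feature needs exactly one weight")
    return intercept + sum(w * x for w, x in zip(weights, features))


# Made-up values for b, w1..w4 and x1..x4.
intercept = 10.0
weights = [0.5, 2.0, -1.0, 3.0]
features = [4.0, 1.0, 2.0, 0.5]

y = predict(features, weights, intercept)
print(y)   # 10 + 2 + 2 - 2 + 1.5 = 13.5
```

A negative weight, like the third one here, means that a larger feature value lowers the prediction.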