ML Terminology


Machine Learning Subcategories

  • Supervised Learning
  • Unsupervised Learning

Supervised Machine Learning uses a set of input variables to predict the value of an output variable.

Unsupervised Machine Learning looks for patterns (or groupings) in an unlabeled dataset.


Machine Learning Phases

Machine learning has two main phases:

1. Training:
Input data are used to calculate the parameters of the model.

2. Inference:
The trained model produces predictions from new input data.


Machine Learning Models

A Model defines the relationship between the label (y) and the features (x).

There are three phases in the life of a model:

  • Data Collection
  • Training
  • Inference

Machine Learning Training

The goal of training is to create a model that can answer a question, such as: What is the expected price for a house?


Machine Learning Inference

Inference is when the trained model is used to infer (predict) values from live data, as when the model is put into production.
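
Training and inference can be sketched with the house price question. This is only a minimal illustration: the sizes and prices are made-up numbers, the function names train and infer are hypothetical, and ordinary least squares on a single feature is just one of many ways to calculate the parameters.

```python
# Hypothetical toy data: house size (square meters) and price (thousands).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]


def train(xs, ys):
    """Training: calculate the parameters (w, b) of y = b + w * x
    with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b


def infer(w, b, x):
    """Inference: apply the trained parameters to a new input."""
    return b + w * x


w, b = train(sizes, prices)          # training phase
predicted = infer(w, b, 90.0)        # inference phase, new house of 90 m^2
print(w, b, predicted)
```

The parameters are calculated once from the known data, and the same parameters are then reused for every new input.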


Supervised Learning

Supervised learning uses labeled data (data with known answers) to train algorithms to:

  • Classify Data
  • Predict Outcomes

Supervised learning can classify data, for example deciding whether an e-mail is spam, based on known spam examples.

Supervised learning can predict outcomes, for example what kind of video you will like, based on the videos you have played.
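
The spam example can be sketched as a very small classifier trained on labeled messages. The messages, the labels "spam" and "ham", and the word-counting approach are all assumptions made for illustration. Real spam filters use far more data and more careful scoring.

```python
from collections import Counter

# Hypothetical labeled examples (data with known answers).
labeled_messages = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def classify(counts, text):
    """Pick the label whose training words overlap most with the text.
    On a tie, the first label in the dictionary ("spam") is returned."""
    scores = {
        label: sum(word_counts[word] for word in text.split())
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


model = train(labeled_messages)
print(classify(model, "free prize inside"))
print(classify(model, "team meeting on monday"))
```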


Unsupervised Learning

Unsupervised learning is used to discover undefined relationships, such as meaningful patterns in data.

It is about creating computer algorithms that can improve themselves.

It is expected that machine learning will shift to unsupervised learning to allow programmers to solve problems without creating models.
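
Finding groupings in unlabeled data can be sketched with a basic two-group k-means loop. The values and the function name two_means are hypothetical, and the sketch assumes one-dimensional data with at least two distinct values.

```python
# Hypothetical unlabeled data: no known answers are provided.
values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]


def two_means(data, steps=10):
    """Split 1-D data into two groups with a basic k-means loop (k = 2).
    Assumes the data contains at least two distinct values."""
    low, high = min(data), max(data)  # initial group centers
    for _ in range(steps):
        # Assign every value to the nearest center.
        group_low = [v for v in data if abs(v - low) <= abs(v - high)]
        group_high = [v for v in data if abs(v - low) > abs(v - high)]
        # Move each center to the mean of its group.
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


small, large = two_means(values)
print(small, large)
```

No labels were given. The two groups come only from how close the values are to each other.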


Reinforcement Learning

Reinforcement learning, like unsupervised learning, does not use labeled data, but it receives feedback on whether each decision is good or bad. The feedback contributes to improving the model.
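
How feedback can improve later decisions is sketched below. This is not a full reinforcement learning algorithm (there are no states and no exploration). It only averages the feedback per decision, and the decisions "a" and "b" and the feedback log are invented for illustration.

```python
# Hypothetical feedback log: (decision, feedback), +1 = good, -1 = bad.
feedback_log = [("a", 1), ("b", -1), ("a", 1), ("b", 1), ("a", -1)]


def learn_from_feedback(log):
    """Return the average feedback received for each decision."""
    totals = {}
    counts = {}
    for decision, feedback in log:
        totals[decision] = totals.get(decision, 0) + feedback
        counts[decision] = counts.get(decision, 0) + 1
    return {decision: totals[decision] / counts[decision] for decision in totals}


scores = learn_from_feedback(feedback_log)
best = max(scores, key=scores.get)  # decision with the best average feedback
print(scores, best)
```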


Self-Supervised Learning

Self-supervised learning is similar to unsupervised learning because it works with data without human-added labels.

The difference is that unsupervised learning is used for clustering, grouping, and dimensionality reduction, while self-supervised learning generates its own labels from the data and uses them for regression and classification tasks.
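
One common way to work without human-added labels is to derive the labels from the data itself, for example by using each value in a sequence as the label for the value before it. The readings and the function name make_training_pairs below are hypothetical.

```python
# Hypothetical unlabeled sequence, e.g. readings from a sensor.
readings = [3, 5, 7, 9, 11]


def make_training_pairs(sequence):
    """Turn a raw sequence into (feature, label) pairs.
    The label for each value is simply the next value in the sequence."""
    return [(sequence[i], sequence[i + 1]) for i in range(len(sequence) - 1)]


pairs = make_training_pairs(readings)
print(pairs)
```

The resulting pairs can then be used like any labeled dataset, here for a regression task.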


Key Machine Learning terms are:

  • Relationships
  • Labels
  • Features
  • Models
  • Training
  • Inference

Relationships

Machine learning systems use Relationships between Inputs to produce Predictions.

In algebra, a relationship is often written as y = ax + b:

  • y is the label we want to predict
  • a is the slope of the line
  • x is the input value
  • b is the intercept

With ML, a relationship is written as y = b + wx:

  • y is the label we want to predict
  • w is the weight (the slope)
  • x is the feature (input value)
  • b is the intercept
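
The relationship y = b + wx can be written as a small function. The name predict and the parameter values are assumptions chosen only for illustration.

```python
def predict(x, w, b):
    """Return the predicted label y = b + w * x for one feature value x."""
    return b + w * x


# Hypothetical parameters: weight (slope) 2.0 and intercept 1.0.
y = predict(3.0, w=2.0, b=1.0)
print(y)  # 7.0
```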

Machine Learning Labels

In Machine Learning terminology, the label is the thing we want to predict.

It is like the y in a linear graph:

Algebra          Machine Learning
y = ax + b       y = b + wx

Machine Learning Features

In Machine Learning terminology, the features are the input.

They are like the x values in a linear graph:

Algebra          Machine Learning
y = ax + b       y = b + wx

Sometimes there can be many features (input values) with different weights:

y = b + w₁x₁ + w₂x₂ + w₃x₃ + w₄x₄
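
With many features, the prediction is the intercept plus the sum of each feature multiplied by its weight. A minimal sketch follows. The function name predict and all numbers are hypothetical.

```python
def predict(features, weights, b):
    """Return y = b + w1*x1 + w2*x2 + ... for matching features and weights."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return b + sum(w * x for w, x in zip(weights, features))


# Hypothetical values for four features and their weights.
features = [2.0, 3.0, 1.0, 0.5]
weights = [0.5, 1.0, 2.0, 4.0]
y = predict(features, weights, b=1.0)
print(y)  # 9.0
```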


Copyright 1999-2023 by Refsnes Data. All Rights Reserved.