MLJ.jl

Looking for a sequential progression of tutorials? See DataScienceTutorials.jl

Data Processing

Loading and Accessing Data

Intended Learning Outcomes:

Understand how to load and access various R datasets using RDatasets.jl
Learn how to save and load a local dataset in CSV format using CSV.jl

Manipulating Data Frames with DataFrames.jl

Intended Learning Outcomes:

Learn how to inspect, describe, and convert datasets into the form of Data Frames
Learn how to modify a Data Frame by adding columns and imputing missing values
Familiarize yourself with the groupby and combine operations on Data Frames

Working with Categorical Data

Intended Learning Outcomes:

Understand the different types of categorical data (e.g., nominal and ordinal data) via CategoricalArrays.jl
Learn how to work with and utilize such categorical arrays

Understanding Scientific Types

Intended Learning Outcomes:

Understand the rationale behind scientific types and their different categories
Learn how to inspect and modify the scientific types in your data using ScientificTypes.jl
Learn about practical tips and tricks related to scientific types

Data Processing and Visualization

Intended Learning Outcomes:

Learn how to apply common data processing techniques on a real-world dataset
Learn how to create various plots (e.g., bar charts and histograms) to analyze your data

Vectors, Matrices and Data Loading in Julia

Intended Learning Outcomes:

Understand how to work with vectors and matrices in Julia
Learn about loading and plotting datasets in Julia

MLJ for Data Scientists in Two Hours

Intended Learning Outcomes:

Get a grasp on using MLJ as a data scientist new to MLJ or Julia
Refresh your skills on building simple models
Learn how to prepare example real-life data by loading, coercing, partitioning and unpacking data
Learn how to build pipelines in MLJ
Learn how to manually and automatically evaluate models in MLJ
Understand how to perform feature selection in MLJ
Learn how to wrap models in iterative strategies in MLJ
Learn how to tune hyperparameters in MLJ
Familiarize yourself with confusion matrices, ROC curves and stratified cross-validation
Learn how to save and perform final evaluations on your models in MLJ
Understand the different types and methods introduced by MLJ

Linear Regression on Temporal Power Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how exploratory data analysis helps you understand the data before developing your model
Train and analyze linear regression models on temporal data with MLJ

MLJTutorial Part 1: Data Representation

Intended Learning Outcomes:

Learn about "scientific types" and how to ensure MLJ interprets your data correctly

MLJTutorial Part 3: Transformers and Pipelines

Intended Learning Outcomes:

Learn how to combine data pre-processing and supervised learning into a single pipeline
Learn how to use a model wrapper to perform target transformations

Exploratory Data Analysis

The Glass Dataset Part I: Exploratory Data Analysis

Intended Learning Outcomes:

Learn to use ScientificTypes.jl, DataFrames.jl, StatsBase.jl, and StatsPlots.jl to carry out an exploratory data analysis in Julia

Predicting a Successful Mt Everest Climb

Intended Learning Outcomes:

Dive into a dataset on Himalayan climbing expeditions to learn how to train, evaluate and compare several supervised classifiers

Classification

Preparing data and model with Iris

Intended Learning Outcomes:

Understand why and how to coerce the data types of different variables in your dataset
Learn how to separate features and targets for training
Be able to find and load the models suitable for your data

Supervised and Unsupervised Workflows in MLJ

Intended Learning Outcomes:

Learn how to implement a supervised learning workflow with MLJ
Learn how to implement an unsupervised learning workflow with MLJ
Familiarize yourself with using MLJ's classification and transformation models

Hyperparameter Tuning for Single and Composite Models

Intended Learning Outcomes:

Learn how to optimize a single hyperparameter of your model
Learn how to tune multiple, possibly nested, hyperparameters and visualize the results

Logistic Regression & Friends on Stock Market Data

Intended Learning Outcomes:

Understand how to load and preprocess example datasets from RDatasets.jl
Explore how to train and analyze logistic regression on stock market data
Explore classification-related metrics such as cross-entropy loss, confusion matrix, and area under the ROC curve
Compare logistic regression to various other classifiers such as LDA, QDA, and KNN
Analyze training classification models on imbalanced datasets

Exploring Tree-based Models

Intended Learning Outcomes:

Explore various tree-based models for classification and regression including ordinary decision trees, random forests, and XGBoost
Refresh your skills on hyperparameter tuning and building MLJ pipelines

Building and Tuning a Support Vector Machine

Intended Learning Outcomes:

Familiarize yourself with generating and visualizing custom classification data
Learn how to build and tune support vector machine (SVM) models with MLJ

MLJ for Data Scientists in Two Hours

Intended Learning Outcomes:

Get a grasp on using MLJ as a data scientist new to MLJ or Julia
Refresh your skills on building simple models
Learn how to prepare example real-life data by loading, coercing, partitioning and unpacking data
Learn how to build pipelines in MLJ
Learn how to manually and automatically evaluate models in MLJ
Understand how to perform feature selection in MLJ
Learn how to wrap models in iterative strategies in MLJ
Learn how to tune hyperparameters in MLJ
Familiarize yourself with confusion matrices, ROC curves and stratified cross-validation
Learn how to save and perform final evaluations on your models in MLJ
Understand the different types and methods introduced by MLJ

KNN, Logistic Regression and PCA on Wine Dataset

Intended Learning Outcomes:

Familiarize yourself with the common data preprocessing steps in MLJ
Refresh your skills on building pipelines and comparing classification models with MLJ
Learn how to reduce the dimensionality of high-dimensional data using dimensionality reduction techniques such as PCA

XGBoost on Crabs Dataset

Intended Learning Outcomes:

Learn how to build XGBoost models in MLJ
Familiarize yourself with various XGBoost hyperparameters and their effects
Refresh your skills on using learning curves and hyperparameter tuning in MLJ

EvoTree Classifier on Horse Colic Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing techniques in Julia
Get familiar with building baseline models for your learning task in MLJ
Refresh your understanding of using pipelines, evaluation and hyperparameter tuning in MLJ

Exploring Generalized Linear Models

Intended Learning Outcomes:

Understand how to use generalized linear models from GLM.jl in MLJ
Practice examples of using linear regression and logistic regression models in MLJ
Understand how to interpret the outputs from linear and logistic regression models

Credit Fraud Detection with Classical and Deep Models

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Refresh your understanding of classification metrics such as the confusion matrix and ROC curves
Build and hyperparameter tune logistic regression and SVM models
Learn how to build basic neural networks with MLJFlux.jl
Learn how to correct for class imbalance using the Imbalance.jl package

Benchmarking Classification Models on Breast Cancer Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how MLJ can be used to benchmark a large set of models against some dataset

BMI Classification with Decision Trees

Intended Learning Outcomes:

Learn how to load tabular data, set up its scientific types and study any existing imbalance
Observe how basic random oversampling can significantly improve decision tree performance on imbalanced data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

Effect of Ratios Oversampling Hyperparameter

Intended Learning Outcomes:

Learn how to study the imbalance of an existing dataset
Get a stronger grasp of how the ratios hyperparameter, which reflects the amount of oversampling, can affect the classification decision boundaries

From RandomOversampling to ROSE

Intended Learning Outcomes:

Understand the relationship between pure random oversampling and the ROSE algorithm
Understand the effect of increasing the `s` hyperparameter for ROSE

SMOTE on Customer Churn Data

Intended Learning Outcomes:

Observe how SMOTE can be used to address class imbalances on a real dataset with logistic regression as the classifier
Familiarize yourself with common MLJ workflows related to loading and processing data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

SMOTEN on Mushroom Data

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Use SMOTEN to address class imbalances on a real dataset with over 20 categorical columns
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

SMOTENC on Customer Churn Data

Intended Learning Outcomes:

Observe how SMOTENC can be used to address class imbalances on a real dataset with categorical and continuous columns
Familiarize yourself with common MLJ workflows related to loading and processing data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

Effect of ENN Hyperparameters

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Explore the effects of the ENN algorithm's hyperparameters and how the algorithm can be useful for data cleaning

SMOTE-Tomek for Ethereum Fraud Detection

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Understand how hybrid resampling algorithms such as SMOTE-Tomek can be defined with the `BalancedModel` construct

Balanced Bagging for Cerebral Stroke Prediction

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Understand how balanced bagging can significantly improve classification performance on imbalanced data

Incremental Training of Neural Networks

Intended Learning Outcomes:

Explore incremental training with MLJ

Hyperparameter Tuning of Neural Networks

Intended Learning Outcomes:

Learn how to tune different hyperparameters of MLJFlux models, with an emphasis on training hyperparameters

MNIST Classification with Neural Networks

Intended Learning Outcomes:

Learn how to build and train neural networks for image classification

Spam Detection with RNNs

Intended Learning Outcomes:

Learn how to train a neural network for spam classification over SMS messages

Julia Boards the Titanic

Intended Learning Outcomes:

Learn how to train a decision tree to predict survival for passengers on the Titanic (aimed at new Julia users)

MLJTutorial Part 2: Selecting, Training and Evaluating Models

Intended Learning Outcomes:

Learn how to match models to data and do basic training and model evaluation

The Glass Dataset Part II: Training a Decision Tree

Intended Learning Outcomes:

Learn how to choose a classification model, and train it for predicting and evaluating

Machine Learning Property Loans for Fun and Profit

Intended Learning Outcomes:

Use data in the public domain to train, tune, and compare multiple models to predict the probability of a loan default

Predicting a Successful Mt Everest Climb

Intended Learning Outcomes:

Dive into a dataset on Himalayan climbing expeditions to learn how to train, evaluate and compare several supervised classifiers

Regression

Preparing data and model with Iris

Intended Learning Outcomes:

Understand why and how to coerce the data types of different variables in your dataset
Learn how to separate features and targets for training
Be able to find and load the models suitable for your data

Building and Tuning Bagging Ensemble Models

Intended Learning Outcomes:

Understand how to implement bagging ensemble models in MLJ and compare them to atomic models
Learn how to optimize the parameters of bagging ensemble models and visualize the results

Building Random Forests with Bagging Ensembles

Intended Learning Outcomes:

Familiarize yourself with dealing with real-world datasets such as the Boston Housing dataset
Understand how to implement Random Forests using bagging over Decision Trees
Learn how to analyze the effect of a specific hyperparameter using MLJ's learning curve
Learn how to tune the parameters of Random Forests

Composing Models and Target Transformations

Intended Learning Outcomes:

Learn how to transform the target of your regression data using MLJ
Understand how to combine models and transformation algorithms in MLJ
Gain an understanding of the benefits of using MLJ pipelines

Multivariate Linear Regression & Interactions

Intended Learning Outcomes:

Understand how to build single and multivariable linear regression models with MLJ
Learn how to add interaction terms to model nonlinear trends in your data
Learn how to plot regression fits and their residuals

Building Polynomial Regression Models and Tuning Them

Intended Learning Outcomes:

Understand how to build a polynomial regression model with MLJ
Learn how to use feature selectors and models in an MLJ pipeline
Analyze and hyperparameter tune polynomial regression models

Ridge & Lasso Regression on Hitters Dataset

Intended Learning Outcomes:

Strengthen your data preparation, plotting, and analysis skills
Compare different types of linear regression such as Lasso and Ridge regression
Refresh your skills on hyperparameter tuning and model composition with MLJ

Exploring Tree-based Models

Intended Learning Outcomes:

Explore various tree-based models for classification and regression including ordinary decision trees, random forests, and XGBoost
Refresh your skills on hyperparameter tuning and building MLJ pipelines

Tree-based models on King County Houses Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Explore and compare different tree-based models such as decision trees, random forests and gradient boosters

Tree-based models on Airfoil Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Explore and compare different tree-based models such as decision trees and random forests
Refresh your understanding of tuning hyperparameters with MLJ and analyzing tuning results

LightGBM on Boston Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Build and analyze LightGBM models in MLJ by utilizing learning curves and hyperparameter tuning

Exploring Generalized Linear Models

Intended Learning Outcomes:

Understand how to use generalized linear models from GLM.jl in MLJ
Practice examples of using linear regression and logistic regression models in MLJ
Understand how to interpret the outputs from linear and logistic regression models

Linear Regression on Temporal Power Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how exploratory data analysis helps you understand the data before developing your model
Train and analyze linear regression models on temporal data with MLJ

Custom Neural Networks on Boston Data

Intended Learning Outcomes:

Learn how to build and train arbitrary feedforward neural networks via MLJFlux.jl
Understand how deep learning MLJFlux models can be hyperparameter tuned with MLJ

KNN & Ridge Regression Learning Network on AMES Pricing Data

Intended Learning Outcomes:

Get familiar with building baseline models for your machine learning task
Learn how to build simple learning networks (advanced model composition) in MLJ
Learn how to tune and analyze the evaluation results from learning networks

Build Basic Learning Networks with MLJ

Intended Learning Outcomes:

Have a clear understanding of how learning networks function in MLJ
Be able to construct basic learning networks with MLJ
Understand how to evaluate and tune learning networks

Lightning Tour of MLJ Meta-algorithms

Intended Learning Outcomes:

Get a rapid overview of pipelines and model wrappers for preprocessing, iteration control, and hyperparameter tuning

Clustering

Unsupervised Learning with PCA and Clustering

Intended Learning Outcomes:

Learn how to build unsupervised models such as KMeans and PCA in MLJ
Learn how to analyze and visualize results from unsupervised models such as KMeans and PCA

Dimensionality Reduction

Unsupervised Learning with PCA and Clustering

Intended Learning Outcomes:

Learn how to build unsupervised models such as KMeans and PCA in MLJ
Learn how to analyze and visualize results from unsupervised models such as KMeans and PCA

KNN, Logistic Regression and PCA on Wine Dataset

Intended Learning Outcomes:

Familiarize yourself with the common data preprocessing steps in MLJ
Refresh your skills on building pipelines and comparing classification models with MLJ
Learn how to reduce the dimensionality of high-dimensional data using dimensionality reduction techniques such as PCA

Neural Networks

Custom Neural Networks on Boston Data

Intended Learning Outcomes:

Learn how to build and train arbitrary feedforward neural networks via MLJFlux.jl
Understand how deep learning MLJFlux models can be hyperparameter tuned with MLJ

Credit Fraud Detection with Classical and Deep Models

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Refresh your understanding of classification metrics such as the confusion matrix and ROC curves
Build and hyperparameter tune logistic regression and SVM models
Learn how to build basic neural networks with MLJFlux.jl
Learn how to correct for class imbalance using the Imbalance.jl package

Benchmarking Classification Models on Breast Cancer Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how MLJ can be used to benchmark a large set of models against some dataset

Incremental Training of Neural Networks

Intended Learning Outcomes:

Explore incremental training with MLJ

Hyperparameter Tuning of Neural Networks

Intended Learning Outcomes:

Learn how to tune different hyperparameters of MLJFlux models, with an emphasis on training hyperparameters

Model Composition of Neural Networks

Intended Learning Outcomes:

Learn how to compose neural networks with other MLJ components

Comparing Neural Networks and Other Models

Intended Learning Outcomes:

Learn how to compare neural networks with other models

Early Stopping of Neural Networks

Intended Learning Outcomes:

Learn how early stopping can be applied to neural networks

Live Training of Neural Networks

Intended Learning Outcomes:

Train neural networks and see learning plots in real time

Basic Neural Architectural Search

Intended Learning Outcomes:

Learn how to naively search and compare different neural network architectures

MNIST Classification with Neural Networks

Intended Learning Outcomes:

Learn how to build and train neural networks for image classification

Spam Detection with RNNs

Intended Learning Outcomes:

Learn how to train a neural network for spam classification over SMS messages

Class Imbalance

Credit Fraud Detection with Classical and Deep Models

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Refresh your understanding of classification metrics such as the confusion matrix and ROC curves
Build and hyperparameter tune logistic regression and SVM models
Learn how to build basic neural networks with MLJFlux.jl
Learn how to correct for class imbalance using the Imbalance.jl package

BMI Classification with Decision Trees

Intended Learning Outcomes:

Learn how to load tabular data, set up its scientific types and study any existing imbalance
Observe how basic random oversampling can significantly improve decision tree performance on imbalanced data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

Effect of Ratios Oversampling Hyperparameter

Intended Learning Outcomes:

Learn how to study the imbalance of an existing dataset
Get a stronger grasp of how the ratios hyperparameter, which reflects the amount of oversampling, can affect the classification decision boundaries

From RandomOversampling to ROSE

Intended Learning Outcomes:

Understand the relationship between pure random oversampling and the ROSE algorithm
Understand the effect of increasing the `s` hyperparameter for ROSE

SMOTE on Customer Churn Data

Intended Learning Outcomes:

Observe how SMOTE can be used to address class imbalances on a real dataset with logistic regression as the classifier
Familiarize yourself with common MLJ workflows related to loading and processing data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

SMOTEN on Mushroom Data

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Use SMOTEN to address class imbalances on a real dataset with over 20 categorical columns
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

SMOTENC on Customer Churn Data

Intended Learning Outcomes:

Observe how SMOTENC can be used to address class imbalances on a real dataset with categorical and continuous columns
Familiarize yourself with common MLJ workflows related to loading and processing data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

Effect of ENN Hyperparameters

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Explore the effects of the ENN algorithm's hyperparameters and how the algorithm can be useful for data cleaning

SMOTE-Tomek for Ethereum Fraud Detection

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Understand how hybrid resampling algorithms such as SMOTE-Tomek can be defined with the `BalancedModel` construct

Balanced Bagging for Cerebral Stroke Prediction

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Understand how balanced bagging can significantly improve classification performance on imbalanced data

Missing Value Imputation

EvoTree Classifier on Horse Colic Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing techniques in Julia
Get familiar with building baseline models for your learning task in MLJ
Refresh your understanding of using pipelines, evaluation and hyperparameter tuning in MLJ

Encoders

Supervised and Unsupervised Workflows in MLJ

Intended Learning Outcomes:

Learn how to implement a supervised learning workflow with MLJ
Learn how to implement an unsupervised learning workflow with MLJ
Familiarize yourself with using MLJ's classification and transformation models

Composing Models and Target Transformations

Intended Learning Outcomes:

Learn how to transform the target of your regression data using MLJ
Understand how to combine models and transformation algorithms in MLJ
Gain an understanding of the benefits of using MLJ pipelines

Ridge & Lasso Regression on Hitters Dataset

Intended Learning Outcomes:

Strengthen your data preparation, plotting, and analysis skills
Compare different types of linear regression such as Lasso and Ridge regression
Refresh your skills on hyperparameter tuning and model composition with MLJ

KNN, Logistic Regression and PCA on Wine Dataset

Intended Learning Outcomes:

Familiarize yourself with the common data preprocessing steps in MLJ
Refresh your skills on building pipelines and comparing classification models with MLJ
Learn how to reduce the dimensionality of high-dimensional data using dimensionality reduction techniques such as PCA

Tree-based models on Airfoil Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Explore and compare different tree-based models such as decision trees and random forests
Refresh your understanding of tuning hyperparameters with MLJ and analyzing tuning results

Exploring Generalized Linear Models

Intended Learning Outcomes:

Understand how to use generalized linear models from GLM.jl in MLJ
Practice examples of using linear regression and logistic regression models in MLJ
Understand how to interpret the outputs from linear and logistic regression models

Credit Fraud Detection with Classical and Deep Models

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Refresh your understanding of classification metrics such as the confusion matrix and ROC curves
Build and hyperparameter tune logistic regression and SVM models
Learn how to build basic neural networks with MLJFlux.jl
Learn how to correct for class imbalance using the Imbalance.jl package

Benchmarking Classification Models on Breast Cancer Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how MLJ can be used to benchmark a large set of models against some dataset

Feature Engineering

Building Polynomial Regression Models and Tuning Them

Intended Learning Outcomes:

Understand how to build a polynomial regression model with MLJ
Learn how to use feature selectors and models in an MLJ pipeline
Analyze and hyperparameter tune polynomial regression models

MLJ for Data Scientists in Two Hours

Intended Learning Outcomes:

Get a grasp on using MLJ as a data scientist new to MLJ or Julia
Refresh your skills on building simple models
Learn how to prepare example real-life data by loading, coercing, partitioning and unpacking data
Learn how to build pipelines in MLJ
Learn how to manually and automatically evaluate models in MLJ
Understand how to perform feature selection in MLJ
Learn how to wrap models in iterative strategies in MLJ
Learn how to tune hyperparameters in MLJ
Familiarize yourself with confusion matrices, ROC curves and stratified cross-validation
Learn how to save and perform final evaluations on your models in MLJ
Understand the different types and methods introduced by MLJ

Hyperparameter Tuning

Hyperparameter Tuning for Single and Composite Models

Intended Learning Outcomes:

Learn how to optimize a single hyperparameter of your model
Learn how to tune multiple, possibly nested, hyperparameters and visualize the results

Building and Tuning Bagging Ensemble Models

Intended Learning Outcomes:

Understand how to implement bagging ensemble models in MLJ and compare them to atomic models
Learn how to optimize the parameters of bagging ensemble models and visualize the results

Building Random Forests with Bagging Ensembles

Intended Learning Outcomes:

Familiarize yourself with dealing with real-world datasets such as the Boston Housing dataset
Understand how to implement Random Forests using bagging over Decision Trees
Learn how to analyze the effect of a specific hyperparameter using MLJ's learning curve
Learn how to tune the parameters of Random Forests

Building Polynomial Regression Models and Tuning Them

Intended Learning Outcomes:

Understand how to build a polynomial regression model with MLJ
Learn how to use feature selectors and models in an MLJ pipeline
Analyze and hyperparameter tune polynomial regression models

Ridge & Lasso Regression on Hitters Dataset

Intended Learning Outcomes:

Strengthen your data preparation, plotting, and analysis skills
Compare different types of linear regression such as Lasso and Ridge regression
Refresh your skills on hyperparameter tuning and model composition with MLJ

Exploring Tree-based Models

Intended Learning Outcomes:

Explore various tree-based models for classification and regression including ordinary decision trees, random forests, and XGBoost
Refresh your skills on hyperparameter tuning and building MLJ pipelines

Building and Tuning a Support Vector Machine

Intended Learning Outcomes:

Familiarize yourself with generating and visualizing custom classification data
Learn how to build and tune support vector machine (SVM) models with MLJ

XGBoost on Crabs Dataset

Intended Learning Outcomes:

Learn how to build XGBoost models in MLJ
Familiarize yourself with various XGBoost hyperparameters and their effects
Refresh your skills on using learning curves and hyperparameter tuning in MLJ

EvoTree Classifier on Horse Colic Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing techniques in Julia
Get familiar with building baseline models for your learning task in MLJ
Refresh your understanding of using pipelines, evaluation and hyperparameter tuning in MLJ

Tree-based models on Airfoil Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Explore and compare different tree-based models such as decision trees and random forests
Refresh your understanding of tuning hyperparameters with MLJ and analyzing tuning results

LightGBM on Boston Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Build and analyze LightGBM models in MLJ by utilizing learning curves and hyperparameter tuning

Custom Neural Networks on Boston Data

Intended Learning Outcomes:

Learn how to build and train arbitrary feedforward neural networks via MLJFlux.jl
Understand how deep learning MLJFlux models can be hyperparameter tuned with MLJ

KNN & Ridge Regression Learning Network on AMES Pricing Data

Intended Learning Outcomes:

Get familiar with building baseline models for your machine learning task
Learn how to build simple learning networks (advanced model composition) in MLJ
Learn how to tune and analyze the evaluation results from learning networks

Stacking with Learning Networks

Intended Learning Outcomes:

Have a grasp of how to build and analyze complex learning networks (e.g., stacking)
Be able to evaluate and tune learning networks

Lightning Tour of MLJ Meta-algorithms

Intended Learning Outcomes:

Get a rapid overview of pipelines and model wrappers for preprocessing, iteration control, and hyperparameter tuning

Comparing Neural Networks and Other Models

Intended Learning Outcomes:

Learn how to compare neural networks with other models

Basic Neural Architectural Search

Intended Learning Outcomes:

Learn how to naively search and compare different neural network architectures

MLJTutorial Part 4: Tuning Hyperparameters

Intended Learning Outcomes:

Learn how to use learning curves to tune a single hyperparameter
Learn how to use a model wrapper to tune one or more hyperparameters using a random search

Machine Learning Property Loans for Fun and Profit

Intended Learning Outcomes:

Use data in the public domain to train, tune, and compare multiple models to predict the probability of a loan default

Pipelines

Composing Models and Target Transformations

Intended Learning Outcomes:

Learn how to transform the target of your regression data using MLJ
Understand how to combine models and transformation algorithms in MLJ
Gain an understanding of the benefits of using MLJ pipelines

Unsupervised Learning with PCA and Clustering

Intended Learning Outcomes:

Learn how to build unsupervised models such as KMeans and PCA in MLJ
Learn how to analyze and visualize results from unsupervised models such as KMeans and PCA

MLJ for Data Scientists in Two Hours

Intended Learning Outcomes:

Get a grasp on using MLJ as a data scientist new to MLJ or Julia
Refresh your skills on building simple models
Learn how to prepare example real-life data by loading, coercing, partitioning and unpacking data
Learn how to build pipelines in MLJ
Learn how to manually and automatically evaluate models in MLJ
Understand how to perform feature selection in MLJ
Learn how to wrap models in iterative strategies in MLJ
Learn how to tune hyperparameters in MLJ
Familiarize yourself with confusion matrices, ROC curves and stratified cross-validation
Learn how to save and perform final evaluations on your models in MLJ
Understand the different types and methods introduced by MLJ

KNN, Logistic Regression and PCA on Wine Dataset

Intended Learning Outcomes:

Familiarize yourself with the common data preprocessing steps in MLJ
Refresh your skills on building pipelines and comparing classification models with MLJ
Learn how to reduce the dimensionality of high-dimensional data using dimensionality reduction techniques such as PCA

EvoTree Classifier on Horse Colic Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing techniques in Julia
Get familiar with building baseline models for your learning task in MLJ
Refresh your understanding of using pipelines, evaluation and hyperparameter tuning in MLJ

Exploring Generalized Linear Models

Intended Learning Outcomes:

Understand how to use generalized linear models from GLM.jl in MLJ
Practice examples of using linear regression and logistic regression models in MLJ
Understand how to interpret the outputs from linear and logistic regression models

Credit Fraud Detection with Classical and Deep Models

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Refresh your understanding of classification metrics such as the confusion matrix and ROC curves
Build and hyperparameter tune logistic regression and SVM models
Learn how to build basic neural networks with MLJFlux.jl
Learn how to correct for class imbalance using the Imbalance.jl package

Lightning Tour of MLJ Meta-algorithms

Intended Learning Outcomes:

Get a rapid overview of pipelines and model wrappers for preprocessing, iteration control, and hyperparameter tuning

SMOTE-Tomek for Ethereum Fraud Detection

Intended Learning Outcomes:

Familiarize yourself with common MLJ workflows related to loading and processing data
Understand how hybrid resampling algorithms such as SMOTE-Tomek can be defined with the `BalancedModel` construct

Model Composition of Neural Networks

Intended Learning Outcomes:

Learn how to compose neural networks with other MLJ components

MLJTutorial Part 3: Transformers and Pipelines

Intended Learning Outcomes:

Learn how to combine data pre-processing and supervised learning into a single pipeline
Learn how to use a model wrapper to perform target transformations

Iterative Models

Exploring Tree-based Models

Intended Learning Outcomes:

Explore various tree-based models for classification and regression including ordinary decision trees, random forests, and XGBoost
Refresh your skills on hyperparameter tuning and building MLJ pipelines

MLJ for Data Scientists in Two Hours

Intended Learning Outcomes:

Get a grasp on using MLJ as a data scientist new to MLJ or Julia
Refresh your skills on building simple models
Learn how to prepare example real-life data by loading, coercing, partitioning and unpacking data
Learn how to build pipelines in MLJ
Learn how to manually and automatically evaluate models in MLJ
Understand how to perform feature selection in MLJ
Learn how to wrap models in iterative strategies in MLJ
Learn how to tune hyperparameters in MLJ
Familiarize yourself with confusion matrices, ROC curves and stratified cross-validation
Learn how to save and perform final evaluations on your models in MLJ
Understand the different types and methods introduced by MLJ

XGBoost on Crabs Dataset

Intended Learning Outcomes:

Learn how to build XGBoost models in MLJ
Familiarize yourself with various XGBoost hyperparameters and their effects
Refresh your skills on using learning curves and hyperparameter tuning in MLJ

EvoTree Classifier on Horse Colic Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing techniques in Julia
Get familiar with building baseline models for your learning task in MLJ
Refresh your understanding of using pipelines, evaluation and hyperparameter tuning in MLJ

Tree-based models on King County Houses Dataset

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Explore different tree-based models such as decision trees, random forests and gradient boosters, and compare them

LightGBM on Boston Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization techniques in Julia
Build and analyze LightGBM models in MLJ by utilizing learning curves and hyperparameter tuning

Custom Neural Networks on Boston Data

Intended Learning Outcomes:

Learn how to build and train arbitrary feedforward neural networks via MLJFlux.jl
Understand how deep learning MLJFlux models can be hyperparameter tuned with MLJ

Benchmarking Classification Models on Breast Cancer Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how MLJ can be used to benchmark a large set of models against some dataset

Lightning Tour of MLJ Meta-algorithms

Intended Learning Outcomes:

Get a rapid overview of pipelines and model wrappers for preprocessing, iteration control, and hyperparameter tuning

BMI Classification with Decision Trees

Intended Learning Outcomes:

Learn how to load tabular data, set up its scientific types and study any existing imbalance
Observe how basic random oversampling can significantly improve decision tree performance on imbalanced data
Practice MLJ workflows related to evaluation such as cross-validation and new metrics

Ensemble Models

Building and Tuning Bagging Ensemble Models

Intended Learning Outcomes:

Understand how to implement bagging ensemble models in MLJ and compare them to atomic models
Learn how to optimize the parameters of bagging ensemble models and visualize the results

Building Random Forests with Bagging Ensembles

Intended Learning Outcomes:

Familiarize yourself with dealing with real-world datasets such as the Boston Housing dataset
Understand how to implement Random Forests using bagging over Decision Trees
Learn how to analyze the effect of a specific hyperparameter using MLJ's learning curve
Learn how to tune the parameters of Random Forests

Stacking with Learning Networks

Intended Learning Outcomes:

Have a grasp of how to build and analyze complex learning networks (e.g., stacking)
Be able to evaluate and tune learning networks

Bayesian Models

Logistic Regression & Friends on Stock Market Data

Intended Learning Outcomes:

Understand how to load and preprocess example datasets from RDatasets.jl
Explore how to train and analyze logistic regression on stock market data
Explore classification-related metrics such as cross-entropy loss, confusion matrix, and area under the ROC curve
Compare logistic regression to various other classifiers such as LDA, QDA, and KNN
Analyze the training of classification models on imbalanced datasets

Benchmarking Classification Models on Breast Cancer Data

Intended Learning Outcomes:

Familiarize yourself with common data preprocessing and visualization workflows
Learn how MLJ can be used to benchmark a large set of models against some dataset

Model Composition

Learning Networks

Intended Learning Outcomes:

Learn about advanced model composition, beyond simple pipelines