Supervised DDVFA Example


DDVFA is an unsupervised clustering algorithm by definition, but it can be adapted for supervised learning by mapping the module's internal categories to true labels. ART modules such as DDVFA support a simple supervised mode in which provided labels are used in place of the internally generated incremental cluster labels, which also provides a way to assess clustering performance when labels are available.
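Conceptually, simple supervised mode amounts to keeping a label alongside each internal category: when a label is supplied during training, the winning category records it, and classification returns the recorded label of the best-matching category. The sketch below illustrates that idea only; the names and structure are illustrative, not the library's actual internals.

```julia
# Illustrative sketch of simple supervised ART (not AdaptiveResonance's internals):
# each internal category stores the label it was created under.
mutable struct ToyLabeledART
    category_labels::Vector{Int}  # label attached to each internal category
end

# Training step: a new category created for a sample with label y records y.
learn!(model::ToyLabeledART, y::Int) = push!(model.category_labels, y)

# Classification: the best-matching category j yields its stored label.
predict(model::ToyLabeledART, j::Int) = model.category_labels[j]
```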

We begin by importing AdaptiveResonance for the ART modules, MLDatasets for the Iris dataset, and a handful of utilities for data handling and printing.

using AdaptiveResonance # ART
using MLDatasets        # Iris dataset
using DataFrames        # DataFrames, necessary for MLDatasets.Iris()
using MLDataUtils       # Shuffling and splitting
using Printf            # Formatted number printing

We will download the Iris dataset for its small size and its common use as a benchmark for clustering algorithms.

# Get the iris dataset
iris = Iris(as_df=false)
# Manipulate the features and labels into a matrix of features and a vector of labels
features, labels = iris.features, iris.targets
(4×150 Matrix{Float64} of features, 1×150 Matrix{InlineStrings.String15} of class labels: "Iris-setosa", "Iris-versicolor", "Iris-virginica")

Because the MLDatasets package gives us the Iris labels as strings, we will use the MLDataUtils.convertlabel method with the MLLabelUtils.LabelEnc.Indices type to get a vector of integers representing each class:

labels = convertlabel(LabelEnc.Indices{Int}, vec(labels))
unique(labels)
3-element Vector{Int64}:
 1
 2
 3

Next, we will create a train/test split with the MLDataUtils.stratifiedobs utility:

(X_train, y_train), (X_test, y_test) = stratifiedobs((features, labels))
((4×105 Matrix{Float64} of training features, 105-element Vector{Int64} of training labels), (4×45 Matrix{Float64} of testing features, 45-element Vector{Int64} of testing labels))
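By default, stratifiedobs holds out a fraction of the observations for testing (a 70/30 split in MLDataUtils, if the default proportion is unchanged) while preserving the class proportions in each partition. We can verify the resulting sizes:

```julia
# Verify the stratified split: expect 105 training and 45 testing samples
println("Training: ", size(X_train), " ", size(y_train))
println("Testing:  ", size(X_test), " ", size(y_test))
```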

Now, we can create our DDVFA module. We'll do so with the default constructor, though the module itself has many options that you can alter during instantiation.

art = DDVFA()
DDVFA(opts_DDVFA
  rho_lb: Float64 0.7
  rho_ub: Float64 0.85
  alpha: Float64 0.001
  beta: Float64 1.0
  gamma: Float64 3.0
  gamma_ref: Float64 1.0
  similarity: Symbol single
  max_epoch: Int64 1
  display: Bool false
  gamma_normalization: Bool true
  uncommitted: Bool false
  activation: Symbol gamma_activation
  match: Symbol gamma_match
  update: Symbol basic_update
  sort: Bool false
, opts_FuzzyART
  rho: Float64 0.85
  alpha: Float64 0.001
  beta: Float64 1.0
  gamma: Float64 3.0
  gamma_ref: Float64 1.0
  max_epoch: Int64 1
  display: Bool false
  gamma_normalization: Bool true
  uncommitted: Bool false
  activation: Symbol gamma_activation
  match: Symbol gamma_match
  update: Symbol basic_update
  sort: Bool false
, DataConfig(false, Float64[], Float64[], 0, 0), 0.0, FuzzyART[], Int64[], 0, 0, Float64[], Float64[], Dict{String, Any}("bmu" => 0, "mismatch" => false, "M" => 0.0, "T" => 0.0))
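The options printed above can also be overridden with keyword arguments at instantiation. For example, the vigilance bounds could be changed as follows (the values here are illustrative, not recommendations):

```julia
# Instantiate DDVFA with custom lower and upper vigilance bounds
art_custom = DDVFA(rho_lb=0.6, rho_ub=0.75)
```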

We can now train the model on the data in simple supervised mode by passing the integer vector of labels to the training method with the keyword argument y. Just as in unsupervised training, we can extract the module's prescribed labels from the training method; in supervised mode, these should match the training labels, as we will verify later.

# Train in simple supervised mode by passing the labels as a keyword argument.
y_hat_train = train!(art, X_train, y=y_train)
println("Training labels: ",  size(y_hat_train), " ", typeof(y_hat_train))
Training labels: (105,) Vector{Int64}

We can classify the testing data to see how well the model generalizes. At the same time, we can see the effect of retrieving the best-matching unit in the case of complete mismatch (see the documentation on Mismatch vs. BMU).

# Classify both ways
y_hat = AdaptiveResonance.classify(art, X_test)
y_hat_bmu = AdaptiveResonance.classify(art, X_test, get_bmu=true)

# Check the shape and type of the output labels
println("Testing labels: ",  size(y_hat), " ", typeof(y_hat))
println("Testing labels with bmu: ",  size(y_hat_bmu), " ", typeof(y_hat_bmu))
Testing labels: (45,) Vector{Int64}
Testing labels with bmu: (45,) Vector{Int64}

Finally, we can calculate the performance (the fraction of correct classifications) of the model in all three regimes:

  1. Training data
  2. Testing data
  3. Testing data with get_bmu=true
# Calculate performance on training data, testing data, and with get_bmu
perf_train = performance(y_hat_train, y_train)
perf_test = performance(y_hat, y_test)
perf_test_bmu = performance(y_hat_bmu, y_test)

# Format each performance number for comparison
@printf "Training performance: %.4f\n" perf_train
@printf "Testing performance: %.4f\n" perf_test
@printf "Best-matching unit testing performance: %.4f\n" perf_test_bmu
Training performance: 1.0000
Testing performance: 1.0000
Best-matching unit testing performance: 1.0000
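The performance function here is simple classification accuracy. Assuming that definition, a minimal equivalent could be written as:

```julia
# Accuracy: fraction of predicted labels that match the true labels
accuracy(y_hat, y) = sum(y_hat .== y) / length(y)
```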

This page was generated using DemoCards.jl and Literate.jl.