Unsupervised DDVFA Example


DDVFA is an unsupervised clustering algorithm by definition, so it can be used to cluster a set of samples all at once in batch mode.

We begin by importing AdaptiveResonance for the ART modules and MLDatasets for loading the data, along with DataFrames (required by MLDatasets.Iris()) and MLDataUtils for common data utilities.

using AdaptiveResonance # ART
using MLDatasets        # Iris dataset
using DataFrames        # DataFrames, necessary for MLDatasets.Iris()
using MLDataUtils       # Shuffling and splitting

We will download the Iris dataset because it is small and commonly used as a benchmark for clustering algorithms.

# Get the iris dataset
iris = Iris(as_df=false)
# Extract the features into a local variable
features = iris.features
4×150 Matrix{Float64}:
 5.1  4.9  4.7  4.6  5.0  5.4  4.6  5.0  4.4  4.9  5.4  4.8  4.8  4.3  5.8  5.7  5.4  5.1  5.7  5.1  5.4  5.1  4.6  5.1  4.8  5.0  5.0  5.2  5.2  4.7  4.8  5.4  5.2  5.5  4.9  5.0  5.5  4.9  4.4  5.1  5.0  4.5  4.4  5.0  5.1  4.8  5.1  4.6  5.3  5.0  7.0  6.4  6.9  5.5  6.5  5.7  6.3  4.9  6.6  5.2  5.0  5.9  6.0  6.1  5.6  6.7  5.6  5.8  6.2  5.6  5.9  6.1  6.3  6.1  6.4  6.6  6.8  6.7  6.0  5.7  5.5  5.5  5.8  6.0  5.4  6.0  6.7  6.3  5.6  5.5  5.5  6.1  5.8  5.0  5.6  5.7  5.7  6.2  5.1  5.7  6.3  5.8  7.1  6.3  6.5  7.6  4.9  7.3  6.7  7.2  6.5  6.4  6.8  5.7  5.8  6.4  6.5  7.7  7.7  6.0  6.9  5.6  7.7  6.3  6.7  7.2  6.2  6.1  6.4  7.2  7.4  7.9  6.4  6.3  6.1  7.7  6.3  6.4  6.0  6.9  6.7  6.9  5.8  6.8  6.7  6.7  6.3  6.5  6.2  5.9
 3.5  3.0  3.2  3.1  3.6  3.9  3.4  3.4  2.9  3.1  3.7  3.4  3.0  3.0  4.0  4.4  3.9  3.5  3.8  3.8  3.4  3.7  3.6  3.3  3.4  3.0  3.4  3.5  3.4  3.2  3.1  3.4  4.1  4.2  3.1  3.2  3.5  3.1  3.0  3.4  3.5  2.3  3.2  3.5  3.8  3.0  3.8  3.2  3.7  3.3  3.2  3.2  3.1  2.3  2.8  2.8  3.3  2.4  2.9  2.7  2.0  3.0  2.2  2.9  2.9  3.1  3.0  2.7  2.2  2.5  3.2  2.8  2.5  2.8  2.9  3.0  2.8  3.0  2.9  2.6  2.4  2.4  2.7  2.7  3.0  3.4  3.1  2.3  3.0  2.5  2.6  3.0  2.6  2.3  2.7  3.0  2.9  2.9  2.5  2.8  3.3  2.7  3.0  2.9  3.0  3.0  2.5  2.9  2.5  3.6  3.2  2.7  3.0  2.5  2.8  3.2  3.0  3.8  2.6  2.2  3.2  2.8  2.8  2.7  3.3  3.2  2.8  3.0  2.8  3.0  2.8  3.8  2.8  2.8  2.6  3.0  3.4  3.1  3.0  3.1  3.1  3.1  2.7  3.2  3.3  3.0  2.5  3.0  3.4  3.0
 1.4  1.4  1.3  1.5  1.4  1.7  1.4  1.5  1.4  1.5  1.5  1.6  1.4  1.1  1.2  1.5  1.3  1.4  1.7  1.5  1.7  1.5  1.0  1.7  1.9  1.6  1.6  1.5  1.4  1.6  1.6  1.5  1.5  1.4  1.5  1.2  1.3  1.5  1.3  1.5  1.3  1.3  1.3  1.6  1.9  1.4  1.6  1.4  1.5  1.4  4.7  4.5  4.9  4.0  4.6  4.5  4.7  3.3  4.6  3.9  3.5  4.2  4.0  4.7  3.6  4.4  4.5  4.1  4.5  3.9  4.8  4.0  4.9  4.7  4.3  4.4  4.8  5.0  4.5  3.5  3.8  3.7  3.9  5.1  4.5  4.5  4.7  4.4  4.1  4.0  4.4  4.6  4.0  3.3  4.2  4.2  4.2  4.3  3.0  4.1  6.0  5.1  5.9  5.6  5.8  6.6  4.5  6.3  5.8  6.1  5.1  5.3  5.5  5.0  5.1  5.3  5.5  6.7  6.9  5.0  5.7  4.9  6.7  4.9  5.7  6.0  4.8  4.9  5.6  5.8  6.1  6.4  5.6  5.1  5.6  6.1  5.6  5.5  4.8  5.4  5.6  5.1  5.1  5.9  5.7  5.2  5.0  5.2  5.4  5.1
 0.2  0.2  0.2  0.2  0.2  0.4  0.3  0.2  0.2  0.1  0.2  0.2  0.1  0.1  0.2  0.4  0.4  0.3  0.3  0.3  0.2  0.4  0.2  0.5  0.2  0.2  0.4  0.2  0.2  0.2  0.2  0.4  0.1  0.2  0.1  0.2  0.2  0.1  0.2  0.2  0.3  0.3  0.2  0.6  0.4  0.3  0.2  0.2  0.2  0.2  1.4  1.5  1.5  1.3  1.5  1.3  1.6  1.0  1.3  1.4  1.0  1.5  1.0  1.4  1.3  1.4  1.5  1.0  1.5  1.1  1.8  1.3  1.5  1.2  1.3  1.4  1.4  1.7  1.5  1.0  1.1  1.0  1.2  1.6  1.5  1.6  1.5  1.3  1.3  1.3  1.2  1.4  1.2  1.0  1.3  1.2  1.3  1.3  1.1  1.3  2.5  1.9  2.1  1.8  2.2  2.1  1.7  1.8  1.8  2.5  2.0  1.9  2.1  2.0  2.4  2.3  1.8  2.2  2.3  1.5  2.3  2.0  2.0  1.8  2.1  1.8  1.8  1.8  2.1  1.6  1.9  2.0  2.2  1.5  1.4  2.3  2.4  1.8  1.8  2.1  2.4  2.3  1.9  2.3  2.5  2.3  1.9  2.0  2.3  1.8
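
Note that the features are arranged with one sample per column (a 4x150 matrix: four features for 150 samples), which is the orientation that the ART modules expect here. As a quick sanity check, we could unpack the dimensions directly:

# Sketch: confirm the feature dimension and sample count (should be 4 and 150)
dim, n_samples = size(features)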

Next, we will instantiate a DDVFA module. We could create an options struct and reuse it via opts = opts_DDVFA(...), but for now we will use the direct keyword-argument approach; a sketch of the options-struct alternative follows the output below.

art = DDVFA(rho_lb=0.6, rho_ub=0.75)
DDVFA(opts_DDVFA
  rho_lb: Float64 0.6
  rho_ub: Float64 0.75
  alpha: Float64 0.001
  beta: Float64 1.0
  gamma: Float64 3.0
  gamma_ref: Float64 1.0
  similarity: Symbol single
  max_epoch: Int64 1
  display: Bool false
  gamma_normalization: Bool true
  uncommitted: Bool false
  activation: Symbol gamma_activation
  match: Symbol gamma_match
  update: Symbol basic_update
, opts_FuzzyART
  rho: Float64 0.75
  alpha: Float64 0.001
  beta: Float64 1.0
  gamma: Float64 3.0
  gamma_ref: Float64 1.0
  max_epoch: Int64 1
  display: Bool false
  gamma_normalization: Bool true
  uncommitted: Bool false
  activation: Symbol gamma_activation
  match: Symbol gamma_match
  update: Symbol basic_update
, DataConfig(false, Float64[], Float64[], 0, 0), 0.0, FuzzyART[], Int64[], 0, 0, Float64[], Float64[], Dict{String, Any}("bmu" => 0, "mismatch" => false, "M" => 0.0, "T" => 0.0))
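
For reference, the options-struct alternative mentioned above might look like the following sketch, where the same hyperparameters are collected into a reusable opts_DDVFA object and passed to the constructor:

# Sketch of the options-struct approach (same hyperparameters as the call above)
opts = opts_DDVFA(rho_lb=0.6, rho_ub=0.75)
art_alt = DDVFA(opts)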

To train the module on the training data, we use train!. In unsupervised mode, the method returns the prescribed cluster labels: the indices of whatever clusters the algorithm itself believes to be unique/separate. This is because we are doing unsupervised learning rather than supervised learning with known labels.

y_hat_train = train!(art, features)
150-element Vector{Int64}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 2
 1
 1
 1
 1
 1
 1
 1
 1
 3
 3
 3
 4
 3
 3
 3
 4
 3
 4
 4
 3
 4
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 3
 4
 3
 3
 3
 4
 3
 3
 3
 3
 4
 3
 5
 3
 5
 3
 5
 5
 3
 5
 5
 5
 3
 3
 5
 3
 3
 3
 5
 5
 5
 3
 5
 3
 5
 3
 5
 5
 3
 3
 5
 5
 5
 5
 5
 3
 3
 5
 5
 3
 3
 5
 5
 5
 3
 5
 5
 5
 3
 3
 5
 3

Though we could inspect the unique entries in the list above, we can see the number of categories directly from the art module.

art.n_categories
5
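
For comparison, counting the unique labels returned by train! should give the same number; this is only a sanity check rather than a separate step of the algorithm:

# Sketch: the number of unique prescribed labels should match art.n_categories
length(unique(y_hat_train))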

Because DDVFA uses FuzzyART modules as its F2 nodes, each DDVFA category contains its own set of category prototypes. We can see the total number of weights in the DDVFA module by summing n_categories across all F2 nodes.

total_vec = [art.F2[i].n_categories for i = 1:art.n_categories]
total_cat = sum(total_vec)
21
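
If we wanted to inspect the prototypes themselves, each F2 node's weights could be examined individually. The sketch below assumes that each FuzzyART module stores its weight matrix in a W field with one column per category:

# Sketch: per-node weight matrix sizes (assumes a W field on each FuzzyART module)
[size(art.F2[i].W) for i = 1:art.n_categories]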

This page was generated using DemoCards.jl and Literate.jl.