
Tuesday, May 14, 2019

Monitoring Neural Network convergence with Multidimensional Scaling

I want to better understand what happens when a neural network is trained. One way is to analyze how the weights change. Since there are typically over 100 weights, we need to look at a low-dimensional summary if we want a comprehensive view. I want to trace the search through weight space as a path, so we need to reduce to 2 or 3 dimensions.

An interesting approach to this extreme dimensionality reduction is Multidimensional Scaling (MDS). MDS uses a matrix of pairwise distances between the elements x_i to find low-dimensional, real-valued z_i such that the stress

\sum_{i<j} \left( d_{ij} - \lVert z_i - z_j \rVert \right)^2

is minimized, where d_ij is the distance between elements x_i and x_j under some given metric. I used Euclidean distance for the metric.
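A minimal sketch of this with scikit-learn's MDS (its SMACOF solver minimizes exactly this kind of stress); the 50x121 placeholder array here just stands in for whatever points are being embedded:

    import numpy as np
    from scipy.spatial.distance import pdist, squareform
    from sklearn.manifold import MDS

    x = np.random.default_rng(0).normal(size=(50, 121))  # placeholder points x_i
    d = squareform(pdist(x, metric="euclidean"))         # pairwise distances d_ij

    # MDS searches for 2-D points z_i that preserve the distances in d.
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    z = mds.fit_transform(d)  # shape (50, 2)
    print(mds.stress_)        # residual stress of the embedding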

The problem

I apply this to a simple classification task: given a 2-dimensional input, classify whether the point lies inside the unit circle. I use a neural network with one hidden layer of 30 nodes. There are therefore (2+1)*30 weights in the first layer and (30+1)*1 weights in the output layer (the +1 terms are the biases), which comes out to 121 weights. So we are reducing from 121 dimensions to 2, while trying to preserve the pairwise distances as well as possible.
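A sketch of a network with this shape, written with Keras (the activations, optimizer, and data generation are illustrative assumptions, not necessarily my exact setup):

    import numpy as np
    from tensorflow import keras

    # Toy data: random points, labeled 1 if inside the unit circle.
    X = np.random.uniform(-1.5, 1.5, size=(1000, 2)).astype("float32")
    y = (np.linalg.norm(X, axis=1) < 1.0).astype("float32")

    model = keras.Sequential([
        keras.Input(shape=(2,)),
        keras.layers.Dense(30, activation="tanh"),    # (2+1)*30 = 90 weights
        keras.layers.Dense(1, activation="sigmoid"),  # (30+1)*1 = 31 weights
    ])
    model.compile(optimizer="sgd", loss="binary_crossentropy")
    print(model.count_params())  # 121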

The search paths

Unsurprising: the search takes smaller and smaller steps as the training reaches a better state.
Surprising: the optimization appears to be just a straight march towards the goal. 

Looking at just the end of the search path: as the training converges, the search takes smaller and more erratic steps. But it does not yet walk in circles at this point.
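The path itself is cheap to trace: snapshot the flattened weight vector after every epoch, then embed all snapshots together with MDS so distances are comparable across epochs. A sketch, building on the toy model above (the callback and plotting details are illustrative assumptions):

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.manifold import MDS
    from tensorflow import keras

    snapshots = []

    class SnapshotWeights(keras.callbacks.Callback):
        # Record the flattened weight vector after every epoch.
        def on_epoch_end(self, epoch, logs=None):
            snapshots.append(np.concatenate(
                [w.flatten() for w in self.model.get_weights()]))

    model.fit(X, y, epochs=100, verbose=0, callbacks=[SnapshotWeights()])

    # Embed all snapshots in one MDS fit (default metric is Euclidean).
    z = MDS(n_components=2, random_state=0).fit_transform(np.array(snapshots))

    plt.plot(z[:, 0], z[:, 1], "-o", markersize=3)  # the search path, in order
    plt.annotate("start", z[0])
    plt.annotate("end", z[-1])
    plt.show()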

How can this be improved?
- Look at each layer by itself
- Show the MDS fit score (the stress); perhaps color-code or size-code the plot by uncertainty
