Spring 2024
Understanding Deep Learning by Simon J.D. Prince. Published by MIT Press, 2023.
https://udlbook.github.io/udlbook
Inference: \(y = f[x, \Phi]\)
\(y\): prediction
\(x\): input
\(\Phi\): model parameters
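For concreteness, a minimal sketch of inference, assuming (not from the slides) a simple 1D linear model \(f[x, \Phi] = \Phi_0 + \Phi_1 x\); the parameter values are made up:

```python
import numpy as np

def f(x, phi):
    """Inference: map input x to prediction y using parameters phi.
    Assumed model: 1D linear, y = phi[0] + phi[1] * x."""
    return phi[0] + phi[1] * x

phi = np.array([0.5, -1.2])  # hypothetical parameter values
x = 2.0                      # input
y = f(x, phi)                # prediction
print(y)                     # -1.9
```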
We “learn” the parameters from “training data” pairs \(\{x_i, y_i\}\)
We quantify the accuracy by using a (scalar) loss function \(L[\Phi]\). The smaller the loss, the better our model “fits” the data.
To check how well the model “generalizes”, we evaluate it on “test data” kept separate from the training data.
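Putting these three ideas together, a sketch of learning, loss, and generalization. Assumptions (mine, not the slides'): the same 1D linear model, a least-squares loss, plain gradient descent, and synthetic data from a noisy line:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training and test pairs {x_i, y_i} from a noisy line.
x_train = rng.uniform(0, 1, 50)
y_train = 1.0 + 2.0 * x_train + 0.1 * rng.normal(size=50)
x_test = rng.uniform(0, 1, 20)
y_test = 1.0 + 2.0 * x_test + 0.1 * rng.normal(size=20)

def f(x, phi):        # inference: y = f[x, phi]
    return phi[0] + phi[1] * x

def loss(phi, x, y):  # scalar least-squares loss L[phi]
    return np.mean((f(x, phi) - y) ** 2)

# "Learn" phi by gradient descent on the training loss.
phi = np.zeros(2)
lr = 0.1
for _ in range(1000):
    err = f(x_train, phi) - y_train
    grad = np.array([2 * err.mean(), 2 * (err * x_train).mean()])
    phi -= lr * grad

print("train loss:", loss(phi, x_train, y_train))
print("test loss: ", loss(phi, x_test, y_test))  # generalization check
```

A small gap between train and test loss suggests the model generalizes; a large gap suggests it has fit noise in the training data.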
\(y = f[x, \Phi] = \Phi_0 + \Phi_1 a[\Theta_{10} + \Theta_{11} x] + \Phi_2 a[\Theta_{20} + \Theta_{21} x] + \Phi_3 a[\Theta_{30} + \Theta_{31} x]\)
We now have 10 parameters: four \(\Phi\)s and six \(\Theta\)s
And also an “activation” function \(a[\cdot]\)
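A sketch of this shallow network with its three hidden units, assuming (the slides don't specify) ReLU as the activation \(a[\cdot]\); the parameter values are again made up:

```python
import numpy as np

def a(z):
    """Assumed activation function: ReLU, a[z] = max(0, z)."""
    return np.maximum(0, z)

def shallow_net(x, phi, theta):
    """y = Phi_0 + sum_k Phi_k * a[Theta_k0 + Theta_k1 * x].
    phi: 4 output parameters; theta: 3x2 hidden parameters -> 10 total."""
    hidden = a(theta[:, 0] + theta[:, 1] * x)  # three hidden activations
    return phi[0] + phi[1:] @ hidden

phi = np.array([-0.2, 0.5, -1.0, 0.7])         # hypothetical Phi_0..Phi_3
theta = np.array([[0.3, -1.0],                 # hypothetical Theta_k0, Theta_k1
                  [-1.0, 2.0],
                  [0.5, 0.6]])
print(shallow_net(1.5, phi, theta))
```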
https://distill.pub/2020/grand-tour/
http://projector.tensorflow.org/
https://ml4a.github.io/ml4a/looking_inside_neural_nets/