Lab 7: Clustering and Dimensionality Reduction
Slides
The slides I showed this week can be found here.
Miscellaneous Notes
- Homework 3 has been posted and is due next Friday (3/15) at 11:59pm
- Homework 2 is being graded; grades will likely be posted next week
Topics Covered
- We discussed various dimensionality reduction techniques, which project high-dimensional data into a low-dimensional space while trying to preserve as much of the original structure, such as clusters, as possible. These included:
  - Principal Component Analysis (PCA)
  - Multidimensional Scaling (MDS)
  - Sparse Random Projection
  - Locally Linear Embedding
  - t-Distributed Stochastic Neighbor Embedding (t-SNE)
  - Uniform Manifold Approximation and Projection (UMAP)
- We also applied each of these methods to the MNIST dataset of handwritten digits, projecting the 784-dimensional MNIST vectors into both 2 and 3 dimensions and visualizing the results. The code we used to create these visualizations can be found here.
- We discussed common pitfalls that can lead to misreadings of t-SNE plots
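As a rough illustration of the projection step described above, here is a minimal sketch of PCA that reduces 784-dimensional vectors to 2 and 3 dimensions. This is not the code from the lab: the `pca_project` helper is a hypothetical name, it assumes NumPy is installed, and the random array is only a stand-in for the real MNIST vectors.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto their top principal components."""
    # PCA works on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


# Synthetic stand-in for MNIST (an assumption for illustration only):
# 200 vectors of 784 "pixel" values in [0, 1).
rng = np.random.default_rng(0)
X = rng.random((200, 784))

X_2d = pca_project(X, 2)  # coordinates for a 2D scatter plot
X_3d = pca_project(X, 3)  # coordinates for a 3D scatter plot
print(X_2d.shape, X_3d.shape)
```

With real MNIST data, each row of `X_2d` or `X_3d` would be plotted as a point colored by its digit label. PCA is linear and deterministic, so unlike t-SNE or UMAP it has no random initialization or neighborhood parameters to tune, but it also tends to separate the digit clusters less cleanly.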
Further Reading
- This article contains an in-depth explanation of the interactive “MNIST Cube” visualization we discussed, as well as some animations of other clustering techniques
- This article takes a more detailed look at the examples of common t-SNE plot pitfalls we discussed.
- This article compares t-SNE and UMAP with interactive visualizations.