principal component analysis
scroll ↓ to Resources
Note
- new features are linear combinations of the existing ones, each with its own weights
- the weights are chosen so that the variance of the new feature is maximized
- the weight vector is constrained to unit length (the squared weights sum to 1); otherwise we could always increase the variance by scaling all weights up proportionally
- if we want m new features, we pick them one at a time: each maximizes the remaining variance while staying orthogonal to the ones already chosen
- so each of the m new features has its own weight vector, and these vectors are mutually orthogonal
- prior to PCA the dataset needs to be centered per feature: subtract the mean
- this lets us write the optimization objective without subtracting the mean explicitly
- geometrical interpretation: we project the dataset onto a new lower-dimensional hyperplane (a linear subspace)
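The points above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `pca`, the toy data, and the choice of an eigendecomposition of the sample covariance matrix are all assumptions made for the example (an SVD of the centered data would give the same components).

```python
import numpy as np

def pca(X, m):
    """Return (W, Z, variances) for the top-m principal components of X.

    X: array of shape (n_samples, n_features); m: number of new features.
    """
    Xc = X - X.mean(axis=0)                  # center each feature (subtract the mean)
    cov = Xc.T @ Xc / (len(X) - 1)           # sample covariance of the centered data
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]    # indices of the m largest variances
    W = eigvecs[:, order]                    # weight vectors: unit length, mutually orthogonal
    Z = Xc @ W                               # new features = linear combinations of the old ones
    return W, Z, eigvals[order]

# Hypothetical toy data: 200 samples, 3 correlated features.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing

W, Z, var = pca(X, 2)
print((W ** 2).sum(axis=0))   # squared weights of each component
print(W.T @ W)                # Gram matrix of the weight vectors
print(var)                    # variance captured by each new feature
```

The squared weights of each column of `W` sum to 1, the columns are orthogonal to each other, and the variance of each column of `Z` equals the corresponding eigenvalue, which is the quantity being maximized.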
Cheat sheet
Robust PCA
- There is also a very cool extension, the “robust PCA” algorithm, which can split data into signal (a low-rank part) and noise (a sparse part), though it is still unclear from a theoretical perspective why it works so well.
- numerical-linear-algebra/nbs/3. Background Removal with Robust PCA.ipynb at master · fastai/numerical-linear-algebra · GitHub
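To make the signal / noise split concrete, here is a rough sketch of one common formulation of robust PCA, Principal Component Pursuit solved with ADMM. The formulation, the function names `shrink` and `robust_pca`, the default values for `lam` and `mu`, and the synthetic data are all assumptions made for illustration; the linked notebook may implement it differently.

```python
import numpy as np

def shrink(A, tau):
    """Soft-thresholding: move every entry toward zero by tau."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def robust_pca(M, max_iter=1000, tol=1e-7):
    """Split a float matrix M into low-rank L plus sparse S (assumed PCP/ADMM variant)."""
    n1, n2 = M.shape
    lam = 1.0 / np.sqrt(max(n1, n2))        # commonly used weight on the sparse term
    mu = n1 * n2 / (4.0 * np.abs(M).sum())  # commonly used heuristic penalty parameter
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                    # Lagrange multipliers for M = L + S
    for _ in range(max_iter):
        # Low-rank update: soft-threshold the singular values.
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * shrink(s, 1.0 / mu)) @ Vt
        # Sparse update: soft-threshold the individual entries.
        S = shrink(M - L + Y / mu, lam / mu)
        R = M - L - S                       # how far we are from M = L + S
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(M):
            break
    return L, S

# Hypothetical synthetic data: a rank-2 "signal" plus sparse large-magnitude "noise".
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 50))
sparse = np.zeros((50, 50))
mask = rng.random((50, 50)) < 0.05
sparse[mask] = rng.uniform(-10, 10, size=mask.sum())

L, S = robust_pca(low_rank + sparse)
print(np.linalg.norm(L - low_rank) / np.linalg.norm(low_rank))  # relative error of the recovered signal
```

In a background-removal setting, each column of `M` would be a flattened video frame, `L` the static background, and `S` the moving foreground.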
Resources
Transclude of base---related.base
