# Dimensionality reduction
Dimensionality reduction is an unsupervised machine learning (ML) technique that reduces the number of features in a dataset while preserving its meaningful structure and relationships.
## PCA
Principal Component Analysis (PCA) captures the most significant patterns in the data by transforming it into a new coordinate system to maximize variance along orthogonal axes.
- Open a table.
- Run **Top Menu > ML > Dimensionality reduction > PCA...**
- Select the source table and **Feature** columns.
- Set the number of principal **Components**.
- Set the **Center** and/or **Scale** data pre-processing options.
- Press **OK**.
Datagrok ensures blazingly fast computations.
## UMAP
Uniform Manifold Approximation and Projection (UMAP) is a nonlinear method that maps high-dimensional data to a lower-dimensional space while preserving both its global and local structure.
- Open a table.
- Run **Top Menu > ML > Dimensionality reduction > UMAP...**
- Select the source table and **Feature** columns.
- Set **Hyperparameters** and press **OK**.
Use a scatter plot and/or 3D scatter plot to visualize the results.
## t-SNE
t-distributed stochastic neighbor embedding (t-SNE) reveals the underlying complex data structure by representing its similar points as nearby neighbors in a lower-dimensional space.
- Open a table.
- Run **Top Menu > ML > Dimensionality reduction > t-SNE...**
- Select the source table and **Feature** columns.
- Set **Hyperparameters** and press **OK**.
## SPE
Stochastic proximity embedding (SPE) is a self-organizing method that produces meaningful underlying dimensions from proximity data.
- Open a table.
- Run **Top Menu > ML > Dimensionality reduction > SPE...**
- Select the source table and **Feature** columns.
- Set **Hyperparameters** and press **OK**.
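The core SPE idea is simple enough to sketch from scratch (a toy NumPy re-implementation for illustration, not Datagrok's code): repeatedly pick a random pair of points and nudge their low-dimensional images so the embedded distance moves toward the original proximity, annealing the step size as the run progresses:

```python
import numpy as np

def spe(X, n_dims=2, n_steps=30000, lr0=1.0, rng=None):
    """Toy stochastic proximity embedding: pairwise
    self-organizing updates with an annealed learning rate."""
    rng = rng or np.random.default_rng(0)
    n = len(X)
    Y = rng.uniform(size=(n, n_dims))  # random initial embedding
    eps = 1e-9
    for step in range(n_steps):
        i, j = rng.integers(0, n, size=2)
        if i == j:
            continue
        lr = lr0 * (1 - step / n_steps) + 0.01  # anneal step size
        r = np.linalg.norm(X[i] - X[j])         # target proximity
        d = np.linalg.norm(Y[i] - Y[j]) + eps   # embedded distance
        g = lr * 0.5 * (r - d) / d * (Y[i] - Y[j])
        Y[i] += g                               # pull/push the pair
        Y[j] -= g
    return Y

X = np.random.default_rng(1).normal(size=(60, 5))
Y = spe(X)
print(Y.shape)  # (60, 2)
```

Each update costs O(1), which is what makes SPE attractive for large datasets compared to methods that touch the full distance matrix.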