PCA based sorting

Spike signals comprise a large number of variables, which makes it very likely that subsets of these variables are highly correlated with one another. The accuracy and reliability of a classification of such data suffer if highly correlated variables, or variables unrelated to the outcome, are included in the analysis.

Reducing the dimensionality of the spike data without sacrificing accuracy is therefore a key step in spike sorting. Principal Component Analysis (PCA) transforms a number of (possibly) correlated variables into a smaller number of uncorrelated variables, the principal components. As a result, the dimensionality of the data set is reduced while most of the original variability is retained. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
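The reduction described above can be illustrated with a minimal sketch in Python with NumPy. This is not NEV2lkit code: the function name pca_reduce, the synthetic waveforms, and the use of a singular value decomposition are assumptions made purely for illustration.

```python
import numpy as np

def pca_reduce(waveforms, n_components=3):
    """Project spikes (rows of `waveforms`) onto the leading principal components.

    Hypothetical helper for illustration; not part of NEV2lkit.
    """
    # Center every sample point (column) across all spikes.
    centered = waveforms - waveforms.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    # Fraction of the total variance carried by each component.
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]

# Synthetic stand-in for recorded spikes: two waveform shapes plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 48)
shape_a = np.exp(-((t - 0.3) / 0.05) ** 2)
shape_b = -0.8 * np.exp(-((t - 0.4) / 0.10) ** 2)
waveforms = np.vstack([
    shape_a + 0.05 * rng.standard_normal((100, t.size)),
    shape_b + 0.05 * rng.standard_normal((100, t.size)),
])

scores, explained = pca_reduce(waveforms)
print(scores.shape)  # (200, 3): 48 samples per spike reduced to 3 values
print(explained)     # variance fractions, largest first
```

Each spike is reduced from 48 sample values to 3 scores. The scores are mutually uncorrelated, and the variance fractions are in decreasing order, as stated for the principal components above.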

NEV2lkit performs Principal Component Analysis on the individual recorded spike signals and calculates the first 3 principal components, which are then used in the subsequent cluster analysis based on KlustaKwik. Using a pop-up menu in the GUI, it is possible to specify whether the eigenvector calculation is based on the correlations, the variances/covariances, or the sums of squares and cross-products of the spike signal matrix.
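The three choices for the eigenvector calculation can be sketched as follows. This is an illustrative approximation, not the NEV2lkit implementation: the function name first_three_components and the basis keywords are hypothetical, and the sums-of-squares-and-cross-products option is assumed here to use the raw (uncentered) spike matrix.

```python
import numpy as np

def first_three_components(waveforms, basis="covariance"):
    """Return the first 3 component scores and their eigenvalues.

    Hypothetical helper; `basis` selects the matrix that is decomposed.
    """
    x = np.asarray(waveforms, dtype=float)
    n = x.shape[0]
    if basis == "covariance":
        z = x - x.mean(axis=0)                  # centered data
        m = z.T @ z / (n - 1)                   # variance/covariance matrix
    elif basis == "correlation":
        z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)  # standardized data
        m = z.T @ z / (n - 1)                   # correlation matrix
    elif basis == "sscp":
        z = x                                   # assumption: raw, uncentered data
        m = z.T @ z                             # sums of squares and cross-products
    else:
        raise ValueError(f"unknown basis: {basis!r}")
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(m)
    order = np.argsort(eigvals)[::-1][:3]       # indices of the 3 largest
    return z @ eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(1)
spikes = rng.standard_normal((50, 20)) * np.linspace(1.0, 3.0, 20)

for basis in ("covariance", "correlation", "sscp"):
    scores, eigvals = first_three_components(spikes, basis)
    print(basis, scores.shape, eigvals)
```

The correlation basis gives every sample point equal weight regardless of its amplitude, whereas the covariance basis lets high-variance sample points dominate. Under the assumption made here, the uncentered variant additionally lets the mean waveform influence the components.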

Depending on the structure of the data set, however, it may not be possible to maximize the variance of the linear combination. If this happens while sorting data with this method, NEV2lkit omits the PCA-based calculation step and clusters the data set using the original, high-dimensional measured signal waveforms.
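The fallback behaviour can be sketched as a simple control flow. The criterion NEV2lkit actually uses to detect a failed PCA is not documented here, so the conditions below (a decomposition that does not converge, non-finite singular values, or a data set without any variance) are assumptions, as is the function name features_for_clustering.

```python
import numpy as np

def features_for_clustering(waveforms, n_components=3):
    """Return (features, used_pca) for the clustering step.

    Hypothetical helper: falls back to the raw waveforms if PCA fails.
    """
    x = np.asarray(waveforms, dtype=float)
    centered = x - x.mean(axis=0)
    try:
        _, s, vt = np.linalg.svd(centered, full_matrices=False)
    except np.linalg.LinAlgError:
        # Decomposition did not converge: cluster the original waveforms.
        return x, False
    # Assumed failure condition: non-finite result or no variance to maximize.
    if not np.all(np.isfinite(s)) or s[0] <= 0.0:
        return x, False
    return centered @ vt[:n_components].T, True

rng = np.random.default_rng(2)
good = rng.standard_normal((40, 16))
flat = np.full((10, 8), 2.0)   # identical spikes: zero variance

print(features_for_clustering(good)[0].shape)  # (40, 3), PCA scores
print(features_for_clustering(flat)[0].shape)  # (10, 8), original waveforms
```

Returning a flag alongside the features lets the caller record whether the clustering ran on the 3 principal components or on the full waveforms.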