Submitted by eiliya_20 t3_103641x in MachineLearning
Hi guys, I have a set of feature values x = {f_1, f_2, ..., f_n} (none of them is zero), and the goal is to measure the similarity between these features using the Mahalanobis distance. To do that, x is converted to a diagonal matrix X_i whose diagonal elements are f_1, f_2, ..., f_n, and the distance is measured between the columns of X_i.
Then I calculate the covariance matrix of X_i, which is positive semi-definite (PSD), but the inverse of the covariance matrix is not PSD, so the Mahalanobis distance is not valid (it comes out negative).
Any ideas or suggestions?
Thanks.
HotAd9055 t1_j2xd5gd wrote
Add an identity matrix to the covariance before taking the inverse. It works. Otherwise, if you are looking for outliers, consider using a shrinkage procedure as in http://www.ledoit.net/honey.pdf
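For anyone who wants to try this, here is a minimal NumPy sketch of the "add an identity matrix" suggestion. It is only an illustration under some assumptions: the feature values in `f`, the helper name `regularized_mahalanobis`, and the `eps` scaling parameter are made up for the example and are not from the original post. One thing worth noting is that the n columns of an n x n matrix give a sample covariance of rank at most n - 1, so the covariance is singular and its numerical inverse is unreliable, which would explain the negative values.

```python
import numpy as np


def regularized_mahalanobis(u, v, cov, eps=1.0):
    """Mahalanobis distance between u and v using (cov + eps * I).

    eps=1.0 adds a plain identity matrix, as suggested in the reply;
    the eps parameter itself is an assumption made for this sketch.
    """
    cov_reg = cov + eps * np.eye(cov.shape[0])
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Solve the linear system instead of forming the explicit inverse.
    d2 = diff @ np.linalg.solve(cov_reg, diff)
    return float(np.sqrt(d2))


# Hypothetical non-zero feature values (not from the original post).
f = np.array([2.0, 5.0, 3.0, 7.0])
X = np.diag(f)

# np.cov treats each row as a variable and each column as an observation,
# so the columns of X are the points being compared.
cov = np.cov(X)

# The smallest eigenvalue is (numerically) zero: cov is singular.
smallest_eig = np.linalg.eigvalsh(cov).min()

# Distance between the first two columns of X after regularization.
d = regularized_mahalanobis(X[:, 0], X[:, 1], cov)
print(smallest_eig, d)
```

After adding the identity, every eigenvalue of the regularized matrix is at least 1, so the squared distance can no longer be negative. If adding a full identity distorts the scale too much, a smaller `eps` or a shrinkage estimator such as scikit-learn's `sklearn.covariance.LedoitWolf` might be worth comparing.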