
vikigenius t1_iy4tb05 wrote

That's not PCA's fault. No matter what technique you use, if you try to represent a 768-d vector in a 2-d plane via dimensionality reduction, you will lose a lot of explainability.

The real question is which properties you care to preserve when doing dimensionality reduction, and then validating that your technique preserves as much of them as possible.
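A minimal sketch of that kind of validation, assuming scikit-learn and using random vectors as a stand-in for real 768-d embeddings; it checks one global property (variance retained) and one local property (neighborhood preservation):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

# Placeholder data: swap in your actual 768-d embeddings here.
X = np.random.randn(1000, 768)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Global property: how much variance the 2-d projection retains.
print("explained variance:", pca.explained_variance_ratio_.sum())

# Local property: how well nearest-neighbor structure survives the projection.
print("trustworthiness:", trustworthiness(X, X_2d, n_neighbors=10))
```

Which of these scores matters depends entirely on what you want the 2-d plot to show.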

7

olmec-akeru OP t1_iy7a8fe wrote

So this may not be true: the surface of a Riemannian manifold is infinite, so you can encode infinite knowledge onto its surface. From there, the diffeomorphic property allows one to traverse the surface and generate explainable, differentiable vectors.

2

vikigenius t1_iy7kxuu wrote

Huh? Diffeomorphisms are dimensionality preserving. You can't have a diffeomorphism from R^n to R^2 unless n = 2; that's the only way your differentiable mapping is bijective.

So I am not sure how diffeomorphisms guarantee that you can have lossless dimensionality reduction.

What can happen is that your data inherently lies on a lower-dimensional manifold. For instance, if you have a subset of R^n with an inherent dimensionality of just 2, then you can trivially represent it in 2 dimensions. For example, if you have a 3-d space where the 3rd dimension is an exact linear combination of the 1st and 2nd, then its inherent dimensionality is 2 and you can obviously losslessly reduce it to 2d.
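Here's a quick sketch of that exact-linear-combination case, assuming scikit-learn; the coefficients and seed are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
xy = rng.standard_normal((500, 2))
z = 2.0 * xy[:, 0] - 3.0 * xy[:, 1]      # 3rd dim is an exact linear combination
X = np.column_stack([xy, z])             # 3-d data with inherent dimensionality 2

pca = PCA(n_components=2).fit(X)
X_rec = pca.inverse_transform(pca.transform(X))

print(pca.explained_variance_ratio_.sum())  # ~1.0: nothing is lost
print(np.max(np.abs(X - X_rec)))            # ~0: reconstruction is exact
```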

But most definitely not all datasets have an inherent dimensionality of 2.

1

gooblywooblygoobly t1_iydf4z2 wrote

A super trivial example is that a (hyper)plane is a Riemannian manifold. Since we know that PCA is lossy, and PCA projects onto a (hyper)plane, it can't be that projecting onto a manifold is enough to perfectly preserve information.
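The lossy case is easy to see on generic full-rank data; a small sketch, again assuming scikit-learn and an arbitrary random seed:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))        # no exact low-dimensional structure

pca = PCA(n_components=2).fit(X)
X_rec = pca.inverse_transform(pca.transform(X))

print(pca.explained_variance_ratio_.sum())  # strictly less than 1
print(np.mean((X - X_rec) ** 2))            # nonzero reconstruction error
```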

1