I would like to measure the smoothness of an NLP-autoencoder's latent space. The idea is to sample two Gaussian vectors v1 and v2 in the latent space of the AE, and generate N-1 points between them like so:

vi = v1 + (i / N) * (v2 - v1), for i = 1, ..., N-1

My idea is to then decode these vectors and measure the BLEU score between d(vi) and d(vi+1) for all N-2 comparisons.

Is this idea reasonable? Do you have a better one? Is there a technique from AEs with images that can be useful here?
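
A minimal sketch of the procedure described in the post, assuming a hypothetical `decode` function that maps a latent vector to a sentence string (the thread does not specify one). `simple_bleu` is a simplified, add-one-smoothed stand-in for BLEU on whitespace tokens, not a reference implementation; a library implementation would be the safer choice for reported numbers.

```python
import math
from collections import Counter


def interpolate(v1, v2, n):
    """Return the n-1 points strictly between v1 and v2 on a straight line."""
    return [[a + (i / n) * (b - a) for a, b in zip(v1, v2)] for i in range(1, n)]


def ngram_counts(tokens, k):
    """Count the k-grams of a token list."""
    return Counter(tuple(tokens[j:j + k]) for j in range(len(tokens) - k + 1))


def simple_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU with add-one smoothing (an assumption,
    chosen so that short sentences do not collapse to a score of zero)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref or not hyp:
        return 0.0
    log_precision = 0.0
    for k in range(1, max_n + 1):
        hyp_counts, ref_counts = ngram_counts(hyp, k), ngram_counts(ref, k)
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(log_precision)


def adjacent_bleu(decode, v1, v2, n):
    """Decode the n-1 interpolated points and score each adjacent pair,
    giving n-2 scores. `decode` is a hypothetical latent-to-text function."""
    sentences = [decode(v) for v in interpolate(v1, v2, n)]
    return [simple_bleu(a, b) for a, b in zip(sentences, sentences[1:])]
```

Averaging the returned scores over many sampled (v1, v2) pairs would give one number per autoencoder; a low minimum along a single path points to an abrupt jump between neighbouring decodes.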

Comments

jackilion t1_j5zk1sb wrote on January 26, 2023 at 5:39 PM

I'm not working on NLP but I have seen your idea in papers on diffusion models. You are basically linearly interpolating your latent space. There are other interpolation techniques you could try, but your idea will definitely give you some insight into your latent space.

Another possibility would be some kind of grid search through the latent space, though depending on your dimensionality it could be too expensive.

Lastly, you could visualize the latent space by projecting it into 2 or 3 dimensions via t-SNE or something similar.
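
As one example of the "other interpolation techniques" mentioned above, here is a sketch of spherical linear interpolation (slerp). The comment does not name a specific technique, so this choice is an assumption. It is sometimes preferred for Gaussian latents because high-dimensional Gaussian samples tend to have similar norms, and a straight line between two of them passes through points with smaller norm.

```python
import math


def slerp(v1, v2, t):
    """Spherical linear interpolation between v1 and v2 at fraction t in [0, 1].

    Assumes both vectors are non-zero and not pointing in exactly opposite
    directions (the formula is undefined there).
    """
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_omega = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    omega = math.acos(cos_omega)
    if omega < 1e-8:
        # Nearly parallel vectors: fall back to ordinary linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v1, v2)]
    s = math.sin(omega)
    w1 = math.sin((1 - t) * omega) / s
    w2 = math.sin(t * omega) / s
    return [w1 * a + w2 * b for a, b in zip(v1, v2)]
```

Calling `slerp(v1, v2, i / N)` for i = 1, ..., N-1 gives a drop-in alternative to the linear path, so both can be scored with the same decode-and-compare step.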

Blutorangensaft OP t1_j5zu9ti wrote on January 26, 2023 at 6:42 PM

Thank you for your answer. If a paper on diffusion models pops into your mind that uses this method, feel free to post it.

How would you derive a quantitative evaluation from t-SNE? I thought it's mostly used for visualisation. I'm looking to compute some kind of score from the interpolation.

jackilion t1_j634fkx wrote on January 27, 2023 at 11:10 AM

What's the point of this score?

Blutorangensaft OP t1_j6356ho wrote on January 27, 2023 at 11:19 AM

Compare different autoencoders in their ability to create valid language in a continuous space. Later, I want to generate sentences in its latent space by using another neural network, and have them decoded to real sentences by the autoencoder. I want the space to be smooth because the second neural net will naturally be using gradient descent, which involves infinitesimal changes. I believe this network will perform better if the changes that happen actually represent meaningful distances between real sentences.

jackilion t1_j63e6ah wrote on January 27, 2023 at 12:55 PM

There is no reason to assume your latent space will be smooth by itself. I remember a paper for image generation that had techniques for smoothing out the latent space that can be applied during training:

https://arxiv.org/abs/2106.09016

It's about GANs, not autoencoders, but maybe you can find some ideas in there.

Blutorangensaft OP t1_j654qyd wrote on January 27, 2023 at 7:54 PM

Thank you for the reference, it looks very promising. I've heard of ways to smooth the latent space through Lipschitz regularisation, but then got disappointed again when I read "ah well it's just layer normalisation". So many things in ML come in a different appearance and actually mean the same thing once you implement them.

crt09 t1_j6317t4 wrote on January 27, 2023 at 10:28 AM

Just speaking from gut here, but you could go the other way around: get sentences with varying BLEU differences, encode them all, and see how distant their latent representations are. This way you wouldn't have to worry about the effect of the validity of the generated sentences, which might be a problem with the other way around (I think).

Blutorangensaft OP t1_j632b2s wrote on January 27, 2023 at 10:42 AM

Using slightly different sentences to be decoded to the same sentence exists as an idea in the form of denoising autoencoders, yes. I plan to use this down the road, but for now I am interested in thinking about measuring performance.

crt09 t1_j633u7c wrote on January 27, 2023 at 11:03 AM

I think there's miscommunication, it sounds like you think I'm proposing a training method but I'm suggesting how to measure smoothness.

If you have the BLEU distances between input sentences and the distances between their latents, you can measure how the distances change between the two, which I *think* would indicate smoothness. Or you could do some other measurements on the latents to see how smoothly(?) they are distributed? tbh I'm not entirely sure what you mean by smooth, sorry.
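
One way to turn the comparison above into a single number is a rank correlation between the two sets of pairwise distances. This is a sketch under assumptions: the text distances (for example 1 - BLEU for each sentence pair) are computed elsewhere and passed in, in the same pair order as `itertools.combinations`, and Spearman correlation is my choice of statistic, not something the comment specifies.

```python
import math
from itertools import combinations


def pairwise_euclidean(vectors):
    """Euclidean distance for every unordered pair, in combinations() order."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def distance_agreement(text_distances, latents):
    """Correlate text distances with latent distances over the same pairs."""
    return spearman(text_distances, pairwise_euclidean(latents))
```

A value near 1 would mean sentences that are far apart by BLEU are also far apart in the latent space; it says nothing about whether points between them decode to valid text.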

If you're looking to measure performance, wouldn't the loss for the training method you mentioned be useful?

Or are you looking to measure performance on the decoding side?

Blutorangensaft OP t1_j6344jf wrote on January 27, 2023 at 11:06 AM

Ahh, I get you now, my apologies. I'm more interested in the performance on the decoding side indeed, because I want to later generate sentences in that latent space with another neural net and have them decoded to normal tokens.

iidealized t1_j655guq wrote on January 27, 2023 at 7:58 PM

Paper that seems relevant:

https://arxiv.org/abs/1905.12777