Submitted by uwashingtongold t3_100bf73 in MachineLearning

Say we have two distributions of segmentation annotations (i.e., a bunch of segmentation maps) that I want to establish are 'similar'. To be more specific, I'm working on a research project where we have in-house annotations for images from a public dataset, and I want to quantitatively establish that our annotations and the dataset's annotations differ in similar ways, or that our annotations 'fall into the distribution of the current dataset'.

(I'm aware that I can measure similarity between distributions with a measure like KL-divergence, but what I'm not sure about is how I would establish what level is 'similar enough'.)
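For concreteness, here is a minimal sketch of that kind of comparison: pool per-class pixel frequencies over each annotation set and compute the KL divergence between the two distributions. `NUM_CLASSES`, the array shapes, and the random label maps are illustrative assumptions, not from the post.

```python
# Minimal sketch: KL divergence between the per-class pixel-frequency
# distributions of two annotation sets. NUM_CLASSES and the random
# label maps are illustrative assumptions.
import numpy as np
from scipy.stats import entropy

NUM_CLASSES = 4  # hypothetical number of label classes

def class_distribution(maps, num_classes=NUM_CLASSES, eps=1e-8):
    """Fraction of pixels per class, pooled over a list of label maps."""
    counts = np.zeros(num_classes)
    for m in maps:
        counts += np.bincount(m.ravel(), minlength=num_classes)
    counts += eps  # smooth so KL stays finite when a class is absent
    return counts / counts.sum()

# Stand-ins for the two annotation sets (lists of (H, W) integer maps).
ours = [np.random.randint(0, NUM_CLASSES, (64, 64)) for _ in range(10)]
theirs = [np.random.randint(0, NUM_CLASSES, (64, 64)) for _ in range(10)]

p, q = class_distribution(ours), class_distribution(theirs)
print(f"KL(ours || theirs) = {entropy(p, q):.4f}")  # in nats
```

As for 'similar enough': one standard recipe is a permutation test — pool the two annotation sets, split them randomly many times, recompute the divergence for each random split, and check where the real value falls in that null distribution.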


Comments


yeolj0o t1_j2hkerk wrote

I was having the exact same thought about comparing segmentation labels. The best "deep learning style" approach I've come up with (though it's not fully satisfying) is running a semantic image synthesis model (e.g., SPADE, OASIS, PITI) on each set of annotations and comparing FIDs. A better approach for my case (I am working with Cityscapes), where the scene outline is mostly fixed, is to utilize height priors and compare KL divergence or FSD according to the height (bottom, mid, top).
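(For the FID-comparison step only, a hedged sketch using `torchmetrics`; the tensor names and toy random data below are placeholders, and real use needs many images for a stable estimate:)

```python
# Hedged sketch of the FID-comparison step only (the synthesis model,
# e.g. SPADE/OASIS/PITI, is assumed to have already produced the images).
# Tensor names and toy random data are placeholders.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

real_images = torch.randint(0, 256, (32, 3, 299, 299), dtype=torch.uint8)
synth_images = torch.randint(0, 256, (32, 3, 299, 299), dtype=torch.uint8)

fid.update(real_images, real=True)    # images rendered from their labels
fid.update(synth_images, real=False)  # images rendered from your labels
print(f"FID: {fid.compute():.2f}")
```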


Just_CurioussSss t1_j2hyuxm wrote

The best approach will depend on the nature of the data and the specific research question you are trying to answer. It may be helpful to try a few different approaches and see which one works best for your specific case.


uwashingtongold OP t1_j2idybf wrote

Super interesting! Thanks for the reply. Can you talk more about the latter approach? My dataset is a medical segmentation dataset that has similarly fixed scene structure to Cityscapes. What do you mean by height priors, and how are you comparing divergence metrics to establish similarity?

For reference, though: our paper studies ambiguity in segmentation, so the annotations all have high variance and there are multiple annotations per image.


yeolj0o t1_j2iizj4 wrote

This Cityscapes segmentation paper provides intuition on the height prior, which is basically categorizing an image into three parts according to the height (row) of the pixel coordinates and then measuring pixel-wise class distributions within each part. As you mentioned in your original post, you can use KL divergence to measure the similarity of the class distributions between two images, as in the sketch below.
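A minimal sketch of that height-prior comparison; the band boundaries, `NUM_CLASSES`, and the random label maps are illustrative assumptions:

```python
# Sketch of the height prior: split each label map into top/mid/bottom
# bands and compare per-band class distributions with KL divergence.
# Band boundaries, NUM_CLASSES, and the random maps are assumptions.
import numpy as np
from scipy.stats import entropy

NUM_CLASSES = 4

def band_distributions(label_map, num_classes=NUM_CLASSES, eps=1e-8):
    h = label_map.shape[0]
    bands = [label_map[:h // 3],
             label_map[h // 3:2 * h // 3],
             label_map[2 * h // 3:]]
    dists = []
    for band in bands:
        counts = np.bincount(band.ravel(), minlength=num_classes) + eps
        dists.append(counts / counts.sum())
    return dists  # class distributions for [top, mid, bottom]

a = np.random.randint(0, NUM_CLASSES, (96, 128))
b = np.random.randint(0, NUM_CLASSES, (96, 128))

for name, p, q in zip(["top", "mid", "bottom"],
                      band_distributions(a), band_distributions(b)):
    print(f"{name}: KL = {entropy(p, q):.4f}")
```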

For your case (measuring ambiguity), measuring the class distribution of a whole image seems like a bad idea, since local differences may be exactly what you want to observe. Instead, I think measuring mIoU between two or more annotations of the same image can be a good measure: ambiguous annotations will have smaller overlapping regions and therefore a lower mIoU (see the sketch below).
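A sketch of that mIoU idea, treating mean pairwise mIoU across annotators as an inverse ambiguity score; shapes and `NUM_CLASSES` are again assumptions:

```python
# Sketch: mean pairwise mIoU over multiple annotations of one image,
# used as an inverse ambiguity score (lower mIoU = more ambiguous).
# Shapes and NUM_CLASSES are illustrative assumptions.
import itertools
import numpy as np

NUM_CLASSES = 4

def miou(a, b, num_classes=NUM_CLASSES):
    """Mean IoU over classes present in either label map."""
    ious = []
    for c in range(num_classes):
        union = np.logical_or(a == c, b == c).sum()
        if union > 0:
            inter = np.logical_and(a == c, b == c).sum()
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 1.0

# Stand-ins for several annotators labeling the same image.
annotations = [np.random.randint(0, NUM_CLASSES, (64, 64)) for _ in range(4)]
scores = [miou(a, b) for a, b in itertools.combinations(annotations, 2)]
print(f"mean pairwise mIoU: {np.mean(scores):.3f}")
```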


neriticzone t1_j2iln01 wrote

Not sure if I understand this question, but isn't this what the Dice coefficient is used for?
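(For reference, a minimal sketch of the Dice coefficient on binary masks, with toy data:)

```python
# Minimal sketch of the Dice coefficient for binary masks (toy data).
import numpy as np

def dice(a, b, eps=1e-8):
    """Dice = 2|A ∩ B| / (|A| + |B|) for boolean masks."""
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

a = np.random.rand(64, 64) > 0.5
b = np.random.rand(64, 64) > 0.5
print(f"Dice: {dice(a, b):.3f}")
```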


Mental-Swordfish7129 t1_j2llbwf wrote

What if you encode the data with high-dimensional binary vectors and utilize a sparse distributed memory? I've used this approach many times in models I've built; you can measure semantic (Hamming) distance between data points, and you get a latent space describing what similar data would have to look like. It's similar to a self-organizing map approach.
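A rough sketch of the distance half of this idea, using flattened binary masks as a stand-in encoding (an actual sparse distributed memory would use a random or learned high-dimensional projection, which is not shown here):

```python
# Rough sketch: flatten masks into binary vectors and measure
# normalized Hamming distance. The raw flattened mask is a stand-in
# encoding; a real sparse distributed memory would project into a
# high-dimensional sparse binary space first.
import numpy as np

def hamming(u, v):
    """Normalized Hamming distance between equal-length binary vectors."""
    return np.count_nonzero(u != v) / u.size

a = (np.random.rand(64, 64) > 0.5).ravel()
b = (np.random.rand(64, 64) > 0.5).ravel()
print(f"Hamming distance: {hamming(a, b):.3f}")
```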
