
BlazeObsidian t1_j4v495i wrote

Autoencoders like VAEs should work better than any other model for image-to-image translation. Maybe you can try different VAE models and compare their performance.

I was wrong.

1

kingdroopa OP t1_j4v5t9a wrote

Hmm, interesting! Do you have any papers/articles/sources supporting this claim?

2

BlazeObsidian t1_j4var74 wrote

Sorry, I was wrong. Modern deep VAEs can match SOTA GAN performance for image super-resolution (https://arxiv.org/abs/2203.09445), but I don't have evidence for recoloring.

But diffusion models have been shown to outperform GANs on multiple image-to-image translation tasks, e.g. https://deepai.org/publication/palette-image-to-image-diffusion-models

You could probably reframe your problem as an image colorization task (https://paperswithcode.com/task/colorization), and the SOTA there is still Palette, linked above.
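If it helps, here's a minimal sketch of what "reframing as colorization" means in practice: collapse each RGB target into a single-channel conditioning image, then train the model to predict the color image from it. The BT.601 luma weights below are just a common convention, not anything Palette-specific.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array in [0, 1] to a single-channel
    (H, W) luma image using ITU-R BT.601 weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

# toy 2x2 RGB image: red, green / blue, white
img = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]])

gray = to_grayscale(img)
# model input: gray (H, W); training target: img (H, W, 3)
```

The point is that the conditioning input and the target are derived from the *same* image, which sidesteps alignment issues entirely if your setup allows it.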

1

kingdroopa OP t1_j4vbaxk wrote

Thanks :) I noticed Palette uses paired images, whilst mine are a bit misaligned. Would you consider it a paired image set, or unpaired? They look closely similar, but don't share pixel information at the top/bottom of the images.

1

BlazeObsidian t1_j4vc61q wrote

That depends on the extent of the pixel misalignment, I think. If cropping your images is not a solution and a large portion of your images have this issue, the model won't be able to generate the right pixel information for the misaligned sections. But it's worth giving Palette a try if the misalignment is not significant.
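One way to decide whether cropping is viable is to actually measure the offset between each pair. This isn't something Palette requires; it's just a quick diagnostic, and it assumes the misalignment is a pure translation. FFT phase correlation gives you the shift directly:

```python
import numpy as np

def estimate_shift(a, b):
    """Estimate the (dy, dx) translation of grayscale image b relative
    to a, via FFT phase correlation. Assumes translation-only offset."""
    f = np.conj(np.fft.fft2(a)) * np.fft.fft2(b)
    corr = np.fft.ifft2(f / (np.abs(f) + 1e-12)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # map wrap-around indices back to signed shifts
    h, w = a.shape
    return (dy - h if dy > h // 2 else dy,
            dx - w if dx > w // 2 else dx)

# toy example: b is a copy of a shifted 3 px down, 5 px left
rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = np.roll(a, shift=(3, -5), axis=(0, 1))
print(estimate_shift(a, b))  # → (3, -5)
```

If the estimated shifts are small and consistent, you can crop both images to their overlapping region and treat the set as paired; if they vary wildly, an unpaired method may be safer.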

2