Submitted by midasp t3_y4elh1 in MachineLearning

In computer science, it is known that we are very close to the limits of compressing all the information found in an image; there is no way to losslessly compress images much further. So we've resorted to lossy compression, where some of the image's information is thrown away.

Instead of throwing away information, maybe there is another approach to getting smaller image files. What if a significant percentage of that information resides somewhere else?

Suppose we train an ML model (ResNet, diffusion models, or whatever) on a wide and comprehensive set of images with two tasks. Task #1: the model takes an image, I, as input and outputs a much smaller encoding, E. Task #2: the same model takes the encoding E as input and gives us back the same image I as its output.

In this way, the ML model acts as a large external repository of image information that maps between I and E. Instead of transmitting I, we now just need to encode I into E and transmit the much smaller E. As long as both transmitter and receiver have the same ML model, the receiver applies the reverse mapping and uses E to decode and get back the original image, I.
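For concreteness, here is a minimal sketch of that two-task setup as a convolutional autoencoder in PyTorch. The architecture, image size, and code dimension are illustrative assumptions, not a specific proposal:

```python
# Minimal sketch: one model that maps image -> code (task #1) and
# code -> image (task #2). Sizes and layers are illustrative only.
import torch
import torch.nn as nn

class ImageAutoencoder(nn.Module):
    def __init__(self, code_dim=128):
        super().__init__()
        # Task #1: image I -> small encoding E
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, code_dim),
        )
        # Task #2: encoding E -> reconstructed image I
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 64 * 16 * 16),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # 32x32 -> 64x64
            nn.Sigmoid(),
        )

    def forward(self, image):
        code = self.encoder(image)
        return self.decoder(code), code

# Training objective: make the reconstruction match the input, so the
# "external repository" of image statistics ends up in the weights.
model = ImageAutoencoder()
image = torch.rand(1, 3, 64, 64)          # stand-in for a real training image
reconstruction, code = model(image)
loss = nn.functional.mse_loss(reconstruction, image)
```

After training on the shared corpus, both transmitter and receiver hold the same weights, and only the small code E needs to cross the channel.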

6

Comments


Marvsdd01 t1_isdr7im wrote

take a look at auto-encoders for data compression ;)

37

prehumast t1_isfn0ne wrote

First thing I thought of.

Additionally, one of the "issues" with ML compression schemes is that the models are inherently biased towards their training data. So we can see modest compression gains on subclasses, but generalizing to arbitrary images is not as straightforward... at least, the guarantees aren't there.

3

seiqooq t1_isgqd96 wrote

In fairness, this is not unique to ML. Compression algorithms (among others) can be loaded with parameters that were optimized against sets of data as well.

2

prehumast t1_isgwuxq wrote

True... I humbly accept the rebuke. 🙃

1

ThrowThisShitAway10 t1_isdsqd5 wrote

Yes of course. A lot of compression is moving towards AI-based methods because they can be a lot better.

There is actually an explicit connection between AI and compression. It is believed that advanced methods to compress text are equivalent to the AGI problem. There's even a €500,000 prize for anyone who can make progress in this domain: https://en.wikipedia.org/wiki/Hutter_Prize

11

WikiSummarizerBot t1_isdsrcg wrote

Hutter Prize

>The Hutter Prize is a cash prize funded by Marcus Hutter which rewards data compression improvements on a specific 1 GB English text file, with the goal of encouraging research in artificial intelligence (AI). Launched in 2006, the prize awards 5000 euros for each one percent improvement (with 500,000 euros total funding) in the compressed size of the file enwik9, which is the larger of two files used in the Large Text Compression Benchmark; enwik9 consists of the first 1,000,000,000 characters of a specific version of English Wikipedia. The ongoing competition is organized by Hutter, Matt Mahoney, and Jim Bowery.


9

Crazy-Space5384 t1_iseeqex wrote

But they limit the size of the decompressor executable so that it cannot contain a priori knowledge about the text corpus. Meaning you can't include a pretrained network…

2

MTGTraner t1_isem4w5 wrote

That seems fair, no? Otherwise, you could just deploy an overfitted model!

7

BrotherAmazing t1_isfrx3g wrote

And you might need 'N' decompressors on your PC for 'N' files, and the size of those decompressors might be so large that it starts to outweigh the savings in compression. I mean, a decompressor that knows what the text is can magically "decompress" a file of size 0 with nothing in it back to the original text. lol
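As a toy illustration of that degenerate case (the text and names here are made up):

```python
# A "decompressor" with the entire corpus baked into the program itself:
# it expands a zero-byte archive back into the original text, so all of the
# information lives in the decompressor and none of it in the compressed file.
KNOWN_TEXT = "the whole corpus, memorized verbatim"  # placeholder string

def decompress(archive: bytes) -> str:
    assert archive == b""  # the "compressed file" is empty
    return KNOWN_TEXT
```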

1

midasp OP t1_isf183v wrote

It doesn't really matter. With just a language model trained on general-use English (or whatever human language is in the corpus), it should still be able to transform each sentence or paragraph into a short encoding.

1

Crazy-Space5384 t1_isf1r74 wrote

But so does traditional data compression. So it remains to be proven that an ML model gets closer to the entropy limit, given that the model must be transferred alongside the encoded text because of the size restriction on the decompressor binary.

1

midasp OP t1_isf72lo wrote

I'm sorry, I should have clarified that I have no interest in the Hutter Prize or its rules, nor is this about getting close to the entropy limit.

My idea is more about the transmitter and receiver already having mutually shared information (stored within the ML model). In such a situation, the transmitter can reduce the amount of information that needs to be transmitted because it does not have to transmit that mutually shared information. The receiver will be able to combine the transmitted information with its shared information to rebuild the original message.

I should not have used the term "image compression"; that is an error on my part, and I apologize if it led to any confusion. It is only "compression" in the sense that we are transmitting less information, rather than transmitting the message in its entirety and pushing the limits of information compression.

0

mew_bot t1_isduxun wrote

I saw a similar project with stable diffusion

9

lucellent t1_iselx8p wrote

First thing that came to my mind too. Someone on Reddit made an image compressor from SD (although the output is no longer the exact same image; it looks identical from a distance, but in reality the pixels differ).

3

Professional-Ebb4970 t1_isgm99j wrote

I strongly disagree with your first paragraph. There is still a lot of work to be done on lossless compression, and I don't believe we are as close to the Shannon Bound as you seem to imply.

For instance, there are recent methods that do lossless compression with neural networks, using a combination of ANS, bits-back coding, and VAEs, and they can often achieve much better compression rates than traditional methods. For an example, check this paper: https://arxiv.org/abs/1901.04866
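As a rough illustration of why better probabilistic models translate into better lossless compression (independent of the specific method in that paper): an entropy coder spends about −log2 p(x) bits on data the model assigns probability p(x), so higher likelihood means a shorter code. The probabilities below are made-up numbers:

```python
import math

def ideal_code_length_bits(token_probs):
    """Approximate bits an entropy coder needs for a sequence whose symbols
    the model assigned these probabilities (ignoring coder overhead)."""
    return -sum(math.log2(p) for p in token_probs)

weaker_model = [0.10, 0.05, 0.20]    # lower likelihood -> longer code
stronger_model = [0.60, 0.40, 0.70]  # higher likelihood -> shorter code
print(ideal_code_length_bits(weaker_model))    # ~9.97 bits
print(ideal_code_length_bits(stronger_model))  # ~2.57 bits
```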

3

bobwmcgrath t1_isg4h7w wrote

It would be a lossy compression, but that's basically what super resolution does.
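A minimal sketch of that idea, with plain bicubic upsampling standing in for a trained super-resolution network (the factor and sizes are arbitrary assumptions):

```python
import torch
import torch.nn.functional as F

def sender(image: torch.Tensor, factor: int = 4) -> torch.Tensor:
    # Transmit only a heavily downsampled version: the lossy "compressed" payload.
    return F.interpolate(image, scale_factor=1 / factor, mode="bicubic",
                         align_corners=False)

def receiver(low_res: torch.Tensor, factor: int = 4) -> torch.Tensor:
    # A learned super-resolution model would go here; bicubic is a stand-in.
    return F.interpolate(low_res, scale_factor=factor, mode="bicubic",
                         align_corners=False)

image = torch.rand(1, 3, 256, 256)   # stand-in for a real image batch
restored = receiver(sender(image))   # same shape as the input, but detail is lost
```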

1