
saw79 t1_iz0158r wrote

I don't think it makes sense these days to implement a CNN architecture from scratch for a standard problem (e.g., classification), except as a learning exercise. A common set of classification networks I use as a go-to are the EfficientNet architectures. Usually I use the timm library (for PyTorch), and instantiating the model is just 1 line of code (see its docs). You can either load it pretrained (on ImageNet) or randomly initialized, and then fine-tune it yourself. EfficientNet has versions B0-B7 that give increasing performance at the cost of computation/model size. If you're in TensorFlow-land I'm sure there's something analogous. Both TF and PyTorch have model zoos in official packages too, like torchvision.models or whatever.

8

saw79 t1_ixiusbb wrote

Your model should output 3 logits, one for class_a, one for class_b, and one for class_c.

When you use data from the 1st dataset,

  • penalize class_a outputs for samples with class_b or anything_but_a_b labels
  • penalize class_b outputs for samples with class_a or anything_but_a_b labels
  • penalize class_c outputs for samples with class_a or class_b labels

When you use data from the 2nd dataset,

  • penalize class_a outputs for samples with class_c labels
  • penalize class_b outputs for samples with class_c labels
  • penalize class_c outputs for samples with not_class_c labels
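A sketch of one way to implement this scheme, using sigmoid outputs with a per-logit loss mask (binary cross-entropy here; I'm also filling in the implied positive target for the labeled class itself, which the bullets above leave unstated):

```python
import torch
import torch.nn.functional as F

A, B, C = 0, 1, 2  # logit indices for class_a, class_b, class_c

# (target, mask) per label; mask[i] = 1 means logit i gets a loss term.
# Unmasked logits are the ones the label tells us nothing about.
LABEL_SPEC = {
    # 1st dataset
    "class_a":          (torch.tensor([1., 0., 0.]), torch.tensor([1., 1., 1.])),
    "class_b":          (torch.tensor([0., 1., 0.]), torch.tensor([1., 1., 1.])),
    "anything_but_a_b": (torch.tensor([0., 0., 0.]), torch.tensor([1., 1., 0.])),
    # 2nd dataset
    "class_c":          (torch.tensor([0., 0., 1.]), torch.tensor([1., 1., 1.])),
    "not_class_c":      (torch.tensor([0., 0., 0.]), torch.tensor([0., 0., 1.])),
}

def masked_bce(logits, labels):
    """BCE per logit, zeroed out wherever the label is uninformative."""
    targets = torch.stack([LABEL_SPEC[l][0] for l in labels])
    mask = torch.stack([LABEL_SPEC[l][1] for l in labels])
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```

So e.g. an `anything_but_a_b` sample pushes the class_a and class_b logits down but contributes nothing to the class_c logit, since it may or may not be class_c.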

1

saw79 t1_ir23cl5 wrote

All I meant by "nebulous" was that he didn't have a concrete idea of what to actually use as visual quality, and you've nicely described how it's actually a very deep inference that we as humans make with our relatively advanced brains.

I did not mean that it's conceptually something that can't exist. I think we're very much in agreement.

3

saw79 t1_ir13tsx wrote

In addition to other commenter's [good] point about your nebulous "visual quality" idea, a couple other comments on what you're seeing:

  1. Frankly, your generative model doesn't seem very good. If your generated samples don't look anything like CIFAR images, I would stop here. Your model's p(x) is clearly very different from CIFAR's p(x).

  2. Why are "standard"/discriminative models' confidence scores high? This is a hugely important drawback of discriminative models and one reason why generative models are interesting in the first place. Discriminative models model p(y|x) (class given data), but don't know anything about p(x). Generative models model p(x, y) = p(y|x) p(x); i.e., they generally have access to the prior p(x) and can assess whether an image x can even be understood by the model in the first place. These types of models would (hopefully, if done correctly) give low confidence on "crappy" images.
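A toy numerical sketch of that factorization (all the log-likelihood numbers below are made up purely for illustration): given class-conditional densities p(x|y) and a prior p(y), you get both p(y|x) and the marginal p(x), and p(y|x) can be near-certain even when p(x) is vanishingly small, which is exactly the discriminative failure mode.

```python
import torch

# Hypothetical per-class log p(x|y) for one input x: the model fits
# class 0 far better than the others, but ALL fits are poor if these
# numbers are very negative.
log_p_x_given_y = torch.tensor([-3.0, -50.0, -52.0])
log_p_y = torch.log(torch.tensor([1 / 3, 1 / 3, 1 / 3]))  # uniform prior

log_p_xy = log_p_x_given_y + log_p_y          # log p(x, y)
log_p_x = torch.logsumexp(log_p_xy, dim=0)    # log p(x) = log sum_y p(x, y)
p_y_given_x = torch.softmax(log_p_xy, dim=0)  # p(y|x) via Bayes' rule

# p(y|x) is nearly a one-hot on class 0, i.e. "confident" -- but a
# generative model can also look at log_p_x and refuse to trust that
# confidence when the marginal likelihood of x is tiny.
```

A discriminative model only ever gives you the `p_y_given_x` part of this picture.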

7