Submitted by johnGettings t3_10vc9v2 in deeplearning
Comments
jimtoberfest t1_j7huuug wrote
How did the tiling approach work out, or did you not try it?
I had something similar in trying to identify spin on a baseball in successive frames.
johnGettings OP t1_j7hwa3w wrote
I didn't try it. I decided to just bust this out and move on to the next project. It was fun though.
just_phone_user t1_j7jytqc wrote
How does your extended ResNet compare to a standard one with Global Average Pooling (as is done in timm for PyTorch)?
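For context on the question above, here is a minimal pure-Python sketch of global average pooling. It only illustrates why pooling lets a standard ResNet accept larger inputs: the pooled vector has one value per channel no matter how large the spatial grid is. The helper names and the toy feature-map sizes are assumptions made for illustration; this is not code from timm or from Hi-ResNet.

```python
def global_average_pool(feature_map):
    """Average every channel over all spatial positions.

    feature_map is a nested list shaped [height][width][channels].
    Returns a flat list with one averaged value per channel.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for position in row:
            for c, value in enumerate(position):
                totals[c] += value
    return [t / (height * width) for t in totals]


def make_feature_map(height, width, channels, value=1.0):
    """Hypothetical helper: build a constant toy feature map."""
    return [[[value] * channels for _ in range(width)] for _ in range(height)]


# A 7x7 grid is what ResNet50 produces for a 224x224 input (total stride 32);
# the 32x32 grid stands in for a much larger input image.
small = global_average_pool(make_feature_map(7, 7, 4))
large = global_average_pool(make_feature_map(32, 32, 4))
print(len(small), len(large))
```

Both calls return a vector of length 4, so the same classification head could follow either one. What pooling does not change is the receptive field of the earlier layers, which is presumably where an extended architecture would differ.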
thelibrarian101 t1_j7l3y43 wrote
What tool did you use to create these images? https://raw.githubusercontent.com/johnGettings/Hi-ResNet/main/images/HiResNet.png
They look really good
johnGettings OP t1_j7l45fw wrote
Excel lol
thelibrarian101 t1_j7l4oi8 wrote
Haha ok, I was hoping for a neat preset or sth ^^
GufyTheLire t1_j7ljbmj wrote
Do you expect the model to learn subtle details useful for classification from a relatively small training dataset? Wouldn't it be a better approach to train a defect detector for the model to know what is important on your images and then classify found features? Maybe this is the reason why large classification models are not widely used?
johnGettings OP t1_j7lqkj2 wrote
Yes, definitely agree. The project started as one thing, then turned into another, then another. I was only doing the coin grading for fun and wasn't planning on actually implementing it anywhere. So I switched gears and just focused on building a high resolution ResNet, regardless of what would be best for the actual coin grading.
There are probably better solutions, especially for this size of a dataset, and maybe a sliding window is necessary to achieve very high accuracy.
But I think this model can still be useful and preferable for some datasets of large images with fine patterns. Or at the very least preferred for simplicity's sake.
johnGettings OP t1_j7gpg3x wrote
A 224x224 image is sufficient for most classification tasks, but there are instances where the fine details of a large image need to be analyzed. Hi-ResNet is the ResNet50 architecture expanded (following the same rules as the paper) to allow for higher resolution images.
I was working on a coin grading project and found that accuracy could not surpass 30% because, at that image size, the necessary details of the coin were completely obscured. One option is to tile the image, run each tile through a classifier, and combine the outputs. Another is to just try a classifier with a higher resolution input, which is actually kind of difficult to find. Maybe I did not look hard enough, but I figured it would be a good exercise to build this out regardless.
It may come in handy for you later. It's a very simple function with 3 arguments that returns a Hi-ResNet TensorFlow model.
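For comparison, the tiling option mentioned above (which was not tried in this project) could look roughly like the sketch below. Everything in it is an assumption for illustration: the function names, the 224-pixel tile size, the choice to average the per-tile probabilities, and the `classify_tile` callable, which stands in for whatever 224x224 classifier you already have.

```python
def tile_boxes(height, width, tile, stride):
    """Return (top, left, bottom, right) boxes that cover the whole image."""
    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            # Add a final tile flush with the edge so no pixels are skipped.
            positions.append(size - tile)
        return positions

    return [(top, left, min(top + tile, height), min(left + tile, width))
            for top in starts(height) for left in starts(width)]


def average_predictions(per_tile_probs):
    """Combine per-tile class probabilities by simple averaging."""
    count = len(per_tile_probs)
    num_classes = len(per_tile_probs[0])
    return [sum(p[c] for p in per_tile_probs) / count
            for c in range(num_classes)]


def classify_by_tiles(image, classify_tile, tile=224, stride=224):
    """image is a list of pixel rows; classify_tile maps a crop to probabilities."""
    height, width = len(image), len(image[0])
    probs = []
    for top, left, bottom, right in tile_boxes(height, width, tile, stride):
        crop = [row[left:right] for row in image[top:bottom]]
        probs.append(classify_tile(crop))
    return average_predictions(probs)


# Toy usage: a blank 448x448 "image" and a dummy classifier (both hypothetical).
blank_image = [[0] * 448 for _ in range(448)]
result = classify_by_tiles(blank_image, lambda crop: [0.25, 0.75])
print(result)
```

Averaging is only one way to combine the outputs. For something like coin grading, where a single worn region may decide the grade, taking the maximum or feeding the per-tile features to a second model might work better, and a stride smaller than the tile size gives overlapping, sliding-window style coverage.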