Submitted by johnGettings t3_10vc9v2 in deeplearning
johnGettings OP t1_j7gpg3x wrote
A 224x224 image is sufficient for most classification tasks, but there are instances where the fine details of a large image need to be analyzed. Hi-ResNet is the ResNet50 architecture expanded (with the same rules from the paper) to allow for higher resolution images.
I was working on a coin grading project and found that accuracy could not surpass 30% because the small image size completely obscured the necessary details of the coin. One option is to tile the image, run each tile through a classifier, and combine the outputs. Another is to just try a classifier with a higher resolution input, which is actually kind of difficult to find. Maybe I did not look hard enough, but I figured it would be a good exercise to build this out regardless.
It may come in handy for you later. It's a very simple function with 3 arguments that returns a Hi-ResNet TensorFlow model.
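For anyone curious about the tiling option mentioned above, here is a rough sketch of the bookkeeping involved. It is not part of Hi-ResNet and was not tried in this project. The names `tile_boxes` and `average_probabilities` are made up for illustration, the 224 tile size is just the usual ResNet input, and plain averaging is only one of several ways to combine tile outputs.

```python
from typing import List, Sequence, Tuple

# (top, left, bottom, right); bottom and right are exclusive
Box = Tuple[int, int, int, int]


def tile_boxes(height: int, width: int, tile: int = 224, stride: int = 224) -> List[Box]:
    """Return boxes of size tile x tile that cover a height x width image.

    If the stride does not land exactly on the image edge, one extra tile is
    added flush with the edge, so the last tiles may overlap their neighbours.
    """
    if height < tile or width < tile:
        raise ValueError("image must be at least one tile in each dimension")

    def starts(size: int) -> List[int]:
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            positions.append(size - tile)  # cover the remaining edge
        return positions

    return [(t, l, t + tile, l + tile) for t in starts(height) for l in starts(width)]


def average_probabilities(per_tile: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-tile class probabilities by simple averaging (an assumption)."""
    n_tiles = len(per_tile)
    n_classes = len(per_tile[0])
    return [sum(p[c] for p in per_tile) / n_tiles for c in range(n_classes)]
```

Each box would be cropped from the image and passed to an ordinary 224x224 classifier, and the resulting probability vectors handed to `average_probabilities`. Averaging treats every tile as equally informative, which may not hold if the relevant detail sits in only a few tiles.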
jimtoberfest t1_j7huuug wrote
How did the tiling approach work out, or did you not try it?
I ran into something similar when trying to identify the spin on a baseball across successive frames.
johnGettings OP t1_j7hwa3w wrote
I didn't try it. I decided to just bust this out and move on to the next project. It was fun though.
just_phone_user t1_j7jytqc wrote
How does your extended ResNet compare to a standard one with Global Average Pooling (as is done in timm in PyTorch)?
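To make the question concrete, here is a minimal sketch of what global average pooling does, in plain Python rather than timm's or Keras's actual implementation. The function name `global_average_pool` and the `[row][col][channel]` layout are assumptions for illustration. The point is that the output length depends only on the channel count, so the same classifier head can follow feature maps of any spatial size.

```python
from typing import List

# Feature map indexed as [row][col][channel] (layout chosen for illustration)
FeatureMap = List[List[List[float]]]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Average each channel over all spatial positions."""
    height = len(fmap)
    width = len(fmap[0])
    channels = len(fmap[0][0])
    totals = [0.0] * channels
    for row in fmap:
        for pixel in row:
            for k, value in enumerate(pixel):
                totals[k] += value
    return [t / (height * width) for t in totals]


# A small and a large feature map with the same 3 channels
small = [[[1.0, 2.0, 3.0] for _ in range(7)] for _ in range(7)]
large = [[[1.0, 2.0, 3.0] for _ in range(32)] for _ in range(32)]

pooled_small = global_average_pool(small)
pooled_large = global_average_pool(large)
```

Both `pooled_small` and `pooled_large` have three entries, one per channel. Whether averaging over a much larger map keeps enough fine detail for a task like coin grading is exactly the open question here.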