Comments


johnGettings OP t1_j7gpg3x wrote

A 224x224 image is sufficient for most classification tasks, but there are instances where the fine details of a large image need to be analyzed. Hi-ResNet is the ResNet50 architecture expanded (following the same rules as the paper) to allow for higher-resolution images.

I was working on a coin grading project and found that accuracy could not surpass 30% because the small image size completely obscured the necessary details of the coin. One option is to tile the image, run each tile through a classifier, and combine the outputs. Another is to just try a classifier with a higher-resolution input, which is actually kind of difficult to find. Maybe I did not look hard enough, but I figured it would be a good exercise to build this out regardless.
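For readers wondering what the tiling option might look like, here is a minimal sketch in NumPy. It is not the code from this project (the tiling approach was not tried here). The names `tile_image`, `classify_by_tiles`, and `dummy_classifier` are hypothetical, and averaging the per-tile probabilities is just one assumed way to combine the outputs.

```python
import numpy as np

def tile_image(image, tile_size):
    """Split an H x W x C array into non-overlapping square tiles.

    Edge pixels that do not fill a whole tile are dropped in this sketch.
    """
    height, width = image.shape[:2]
    if height < tile_size or width < tile_size:
        raise ValueError("image is smaller than one tile")
    tiles = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tiles.append(image[top:top + tile_size, left:left + tile_size])
    return np.stack(tiles)

def classify_by_tiles(image, classify_tile, tile_size=224):
    """Average per-tile class probabilities into one prediction.

    `classify_tile` is a hypothetical callable: it takes a batch of tiles
    (N x tile_size x tile_size x C) and returns an N x num_classes array.
    """
    tiles = tile_image(image, tile_size)
    probs = classify_tile(tiles)
    mean_probs = probs.mean(axis=0)  # assumed combination rule: simple average
    return int(np.argmax(mean_probs)), mean_probs

def dummy_classifier(tiles):
    """Stand-in for a trained model: scores 'dark' vs 'bright' by mean pixel value."""
    brightness = tiles.reshape(len(tiles), -1).mean(axis=1) / 255.0
    return np.stack([1.0 - brightness, brightness], axis=1)

# Demo: a 448x448 image whose right three quarters are white.
image = np.zeros((448, 448, 3), dtype=np.uint8)
image[:, 112:] = 255
label, probs = classify_by_tiles(image, dummy_classifier)
```

Averaging treats every tile as equally informative. For something like coin grading, where a single small defect may decide the class, a max or a learned combination over tiles might be a better fit, but that is a design choice this sketch does not settle.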

It may come in handy for you later. It's a very simple function with 3 arguments that returns a Hi-ResNet TensorFlow model.

3

jimtoberfest t1_j7huuug wrote

How did the tiling approach work out, or did you not try it?

I had something similar in trying to identify spin on a baseball in successive frames.

2

johnGettings OP t1_j7hwa3w wrote

I didn't try it. I decided to just bust this out and move on to the next project. It was fun though.

3

just_phone_user t1_j7jytqc wrote

How does your extended ResNet compare to a standard one with Global Average Pooling (as is done in timm in PyTorch)?
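As background for this question, the sketch below illustrates why global average pooling lets an unmodified ResNet50 accept larger inputs. It says nothing about how Hi-ResNet itself is built. It assumes the standard ResNet50 figures (overall downsampling by 32, 2048 final channels) and input sizes that are multiples of 32. The helper names are hypothetical, and random arrays stand in for real feature maps.

```python
import numpy as np

def feature_map_shape(height, width, total_stride=32, channels=2048):
    """Shape of the final ResNet50 feature map for a given input size.

    Assumes the standard overall stride of 32 and 2048 output channels,
    and that height and width are multiples of 32.
    """
    return (height // total_stride, width // total_stride, channels)

def global_average_pool(features):
    """Average over the two spatial axes, leaving one value per channel."""
    return features.mean(axis=(0, 1))

pooled_shapes = {}
for size in (224, 448, 896):
    # Random values stand in for the activations a real network would produce.
    features = np.random.rand(*feature_map_shape(size, size))
    pooled = global_average_pool(features)
    pooled_shapes[size] = pooled.shape
    print(size, features.shape, pooled.shape)
```

The pooled vector has the same length at every resolution, so the same classification head fits. What changes is how much of the image each final feature covers relative to the whole, which is presumably where an extended architecture and a plain ResNet with pooling would differ.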

1

GufyTheLire t1_j7ljbmj wrote

Do you expect the model to learn subtle details useful for classification from a relatively small training dataset? Wouldn't it be a better approach to train a defect detector for the model to know what is important on your images and then classify found features? Maybe this is the reason why large classification models are not widely used?

2

johnGettings OP t1_j7lqkj2 wrote

Yes, definitely agree. The project started as one thing, then turned into another, then another. I was only doing the coin grading for fun and wasn't planning on actually implementing it anywhere. So I switched gears and just focused on building a high-resolution ResNet, regardless of what would be best for the actual coin grading.

There are probably better solutions, especially for a dataset of this size, and maybe a sliding window is necessary to achieve very high accuracy.

But I think this model can still be useful and preferable for some datasets of large images with fine patterns. Or at the very least preferred for simplicity's sake.

1