SnowTime11 t1_j2r5jk0 wrote on January 3, 2023 at 10:43 AM

I have been classifying series of data that are cyclic, as in not exactly periodic but repeating. As a form of data augmentation, I've been trying to classify the single cycles separately rather than the whole series. To get the final class score, I average the scores (before softmax) of the cycles belonging to the same series. This approach seems to yield very good results, I believe for a few reasons:

Smaller input data leads to a smaller model, and segmenting the input increases the available data

Focusing on a single period seems to make the classifier pick up on better features, judging from the saliency maps

Combining the outputs of the classifier can be beneficial, as in if one cycle is corrupted and wrongly classified, the others may compensate for it. This probably happens even when classifying the whole time series, but with segmentation it is more explicit.

Has this been done in any other work? Am I falling into some kind of fallacy by applying this segmentation?
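
For concreteness, here is a minimal sketch of the averaging step described above, assuming the classifier has already produced one row of pre-softmax scores per cycle. The function name `aggregate_cycle_logits`, the series labels, and the toy numbers are all made up for illustration, not taken from my actual pipeline:

```python
import numpy as np

def aggregate_cycle_logits(cycle_logits, series_ids):
    """Average pre-softmax scores of cycles sharing a series id, then softmax."""
    cycle_logits = np.asarray(cycle_logits, dtype=float)
    series_ids = np.asarray(series_ids)
    # Map each cycle to the index of the series it belongs to.
    unique_ids, inverse = np.unique(series_ids, return_inverse=True)
    sums = np.zeros((len(unique_ids), cycle_logits.shape[1]))
    np.add.at(sums, inverse, cycle_logits)  # per-series sum of logits
    counts = np.bincount(inverse, minlength=len(unique_ids))[:, None]
    mean_logits = sums / counts
    # Numerically stable softmax over the averaged scores.
    shifted = mean_logits - mean_logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return unique_ids, exp / exp.sum(axis=1, keepdims=True)

# Toy data: series "a" has three cycles, the third one "corrupted"
# (it votes for class 1 while the other two vote for class 0).
toy_logits = [[2.0, 0.0], [1.5, 0.5], [-1.0, 1.0], [0.0, 2.0]]
toy_ids = ["a", "a", "a", "b"]
ids_out, probs = aggregate_cycle_logits(toy_logits, toy_ids)
predicted = probs.argmax(axis=1)
```

In this toy case the two clean cycles of series "a" outweigh the corrupted one, which is the compensation effect I mean. It only holds while the corrupted cycle's scores are not extreme enough to dominate the mean.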

comradeswitch t1_j341em8 wrote on January 5, 2023 at 10:13 PM

This is in essence how convolutional neural networks work: most often, they look at small patches of an image through many overlapping windows, with the same core model applied to each. Then the same can be done on the outputs for the very small patches to get a summary of slightly larger patches of the image, and so on. At the end, the output comes from many different analyses of different, overlapping segments of the data considered together.
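
To make the "same core model on every overlapping window" idea concrete, here is a small NumPy sketch for the 1-D case. The "model" is just a single linear filter, and the signal and filter values are arbitrary choices for illustration. It checks that applying the filter to each window by hand gives the same result as a convolution-style sliding dot product (what deep learning libraries call convolution is cross-correlation, hence `np.correlate`):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Arbitrary toy signal and a 3-tap linear "model" shared by all windows.
x = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0])
kernel = np.array([0.25, 0.5, 0.25])

# Every overlapping window of length 3, one window per row.
windows = sliding_window_view(x, window_shape=len(kernel))

# Apply the same model to each window independently.
per_window = windows @ kernel

# The same computation expressed as a sliding dot product.
conv = np.correlate(x, kernel, mode="valid")

same = np.allclose(per_window, conv)
```

A real convolutional layer adds multiple filters, nonlinearities, and stacking, but the weight sharing across windows is the part that mirrors classifying segments one at a time.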

I'd be wary of creating explicit synthetic examples that contain e.g. exactly one cycle of interest or whatever unless you know for a fact that it's how the model will be evaluated. You can imagine how snipping out a cycle from beginning to end could give an easier problem than taking segments of the same length but with random phase, for example. It may be simpler and more robust to do this in the model directly with convolution and feed in the whole series at once.
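
If you do keep the explicit segmentation, one way to test for the phase issue is to also cut windows of the same length at random offsets and see whether accuracy holds up. A rough sketch, assuming a 1-D series stored as an array; the helper name `random_phase_windows`, the sine wave, and the period of 25 samples are placeholders, not anything from your setup:

```python
import numpy as np

def random_phase_windows(series, window_len, n_windows, rng):
    """Cut fixed-length windows whose start positions are drawn at random."""
    series = np.asarray(series)
    if window_len > len(series):
        raise ValueError("window_len must not exceed the series length")
    # The upper bound is exclusive, so every window fits inside the series.
    starts = rng.integers(0, len(series) - window_len + 1, size=n_windows)
    windows = np.stack([series[s:s + window_len] for s in starts])
    return starts, windows

rng = np.random.default_rng(0)
t = np.arange(200)
series = np.sin(2 * np.pi * t / 25)  # stand-in for a cyclic signal
starts, windows = random_phase_windows(series, window_len=25, n_windows=8, rng=rng)
```

If the classifier does much worse on these than on cycles snipped from beginning to end, that suggests it is relying on the alignment rather than on the shape of the cycle itself.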