marcus_hk t1_iw9gdpi wrote

I designed a custom architecture to model an analog signal processor with lots of different settings combinations. It was a custom MGU (minimal gated unit) that modulates HiPPO memory according to settings embeddings. It can train in parallel, so it's much faster than, say, a PyTorch GRU.
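For anyone who hasn't run into the MGU: below is a minimal sketch of one step of the standard, sequential MGU cell (a single forget gate, as in Zhou et al., 2016) in plain Python. This is not the custom parallel-trainable variant described above, and it has no HiPPO memory. The helper names (`sigmoid`, `matvec`, `mgu_step`) and the idea of concatenating a settings embedding onto the input are assumptions made only for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def mgu_step(x, h, Wf, bf, Wh, bh):
    """One step of a standard MGU cell.

    f  = sigmoid(Wf [h, x] + bf)        forget gate
    c  = tanh(Wh [f * h, x] + bh)       candidate state
    h' = (1 - f) * h + f * c            new hidden state
    """
    f = [sigmoid(a + b) for a, b in zip(matvec(Wf, h + x), bf)]
    gated = [fi * hi for fi, hi in zip(f, h)]
    c = [math.tanh(a + b) for a, b in zip(matvec(Wh, gated + x), bh)]
    return [(1 - fi) * hi + fi * ci for fi, hi, ci in zip(f, h, c)]

# Hypothetical toy setup: 2 hidden units, 1 signal sample, 2-dim settings embedding.
hidden, n_in = 2, 3
signal = [0.3]
settings_embedding = [1.0, 0.0]      # assumed: settings enter by concatenation
x = signal + settings_embedding
h = [1.0, -2.0]

# Zero weights make the result easy to check by hand: f = 0.5, c = 0.
Wf = [[0.0] * (hidden + n_in) for _ in range(hidden)]
Wh = [[0.0] * (hidden + n_in) for _ in range(hidden)]
bf = [0.0] * hidden
bh = [0.0] * hidden

print(mgu_step(x, h, Wf, bf, Wh, bh))  # [0.5, -1.0]
```

Each step here depends on the previous hidden state, so this loop form has to run sequentially over time.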

Another recent design combines convolution and transformers to model spinal CT scans, which is challenging because a single scan can have a shape like (512, 1, 1024, 1024), which is too large to train on directly for dense tasks like segmentation. If you simply resize to a constant shape, then you lose or distort the physical information embedded in the scans. You don't want a scan of the neck to be the same size as a scan of the whole spine, for instance. So you've got to be more clever than that, and something this specialized doesn't come ready to go out of the box.
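To put rough numbers on both problems, here is a small back-of-the-envelope sketch. The scan shape is the one quoted above; the float32 storage, the physical extents for a neck scan and a whole-spine scan, and the target of 128 slices are all made-up assumptions for illustration, and the function names are hypothetical.

```python
from math import prod

def tensor_gib(shape, bytes_per_element=4):
    # Memory for one dense tensor of this shape (float32 assumed by default).
    return prod(shape) * bytes_per_element / 2**30

def spacing_after_resize(extent_mm, target_slices):
    # Millimetres covered by each slice once the axis is forced to target_slices.
    return extent_mm / target_slices

scan_shape = (512, 1, 1024, 1024)
print(tensor_gib(scan_shape))  # 2.0 (GiB for the input alone, before any activations)

# Hypothetical extents along the spine axis, in millimetres.
neck_mm = 150.0
whole_spine_mm = 700.0
target_slices = 128

print(spacing_after_resize(neck_mm, target_slices))         # 1.171875
print(spacing_after_resize(whole_spine_mm, target_slices))  # 5.46875
```

With these assumed numbers, the same voxel index would span about 1.2 mm in one scan and about 5.5 mm in the other, so a model seeing only the resized arrays can't tell how big anything physically is.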