agaz1985 t1_jc9pebz wrote on March 15, 2023 at 8:27 AM

Reply to Image reconstruction by grid_world

18 is your Z dimension. If you move it to the third dimension, so the input is Bx3x18x90x90, you can apply several 3DConv layers until you reach a 2D representation, and after that you apply 2DConv. For example, let's say we apply 3DConv->3DMaxPooling twice, with kernel (3,1,1) for the conv and (2,1,1) for the pooling. Assuming no padding along Z, the Z size goes 18 -> 16 -> 8 -> 6 -> 3, so you'll end up with an output of BxCx3x90x90. If you then apply a single 3DConv with kernel (3,1,1), you'll have an output of BxCx1x90x90, or simply BxCx90x90 once you squeeze the Z axis, which can then be passed to 2DConv layers. So basically you ask the model to compress the info in your Z dimension before moving on to the spatial dimensions. You can also do the two things together by playing with the kernel size of the conv layers. That said, integrating this into a UNet is a bit more work than just using a predefined UNet, but it is doable; look for "3D+2D UNet", for example.
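
Here is a rough PyTorch sketch of the shape flow described above, not a tuned architecture. A few things in it are my assumptions rather than anything from your post: the starting layout Bx18x3x90x90, the channel count C=16, the ReLU activations, the single 2DConv at the end, and the class name `DepthCollapse3Dto2D`, which is just made up for illustration.

```python
import torch
import torch.nn as nn


class DepthCollapse3Dto2D(nn.Module):
    """Compress the Z axis with 3D convs, then continue with 2D convs."""

    def __init__(self, in_channels=3, channels=16):  # channels=16 is an arbitrary choice
        super().__init__()
        # Kernels only span Z, so the 90x90 spatial size is left untouched.
        self.collapse_z = nn.Sequential(
            nn.Conv3d(in_channels, channels, kernel_size=(3, 1, 1)),  # Z: 18 -> 16
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2, 1, 1)),                      # Z: 16 -> 8
            nn.Conv3d(channels, channels, kernel_size=(3, 1, 1)),     # Z: 8 -> 6
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2, 1, 1)),                      # Z: 6 -> 3
            nn.Conv3d(channels, channels, kernel_size=(3, 1, 1)),     # Z: 3 -> 1
            nn.ReLU(),
        )
        # Placeholder for whatever 2D network (e.g. a 2D UNet) comes next.
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.collapse_z(x)  # (B, C, 1, 90, 90)
        x = x.squeeze(2)        # (B, C, 90, 90)
        return self.spatial(x)


# Assumed starting layout: (B, Z=18, channels=3, H=90, W=90).
raw = torch.randn(2, 18, 3, 90, 90)
x = raw.permute(0, 2, 1, 3, 4)  # move Z to the third dimension: (B, 3, 18, 90, 90)

model = DepthCollapse3Dto2D()
out = model(x)
print(out.shape)  # torch.Size([2, 16, 90, 90])
```

The Z sizes in the comments only hold for an input depth of 18 with no padding along Z; with a different depth you would need to adjust the number of conv/pool stages or the kernel sizes so that Z still ends at 1 before the squeeze.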