RaeudigerRaffi t1_jahpbod wrote on March 1, 2023 at 2:55 PM

Reply to comment by RaeudigerRaffi in [D] backprop through beam sampling ? by SaltyStackSmasher

To add to this: I thought about it a bit more, and technically this should be possible in PyTorch with some trickery using custom autograd functions. You can probably sample with Gumbel-Softmax and return the argmax in the forward pass. In the custom backward, you skip the argmax and backprop as if the Gumbel-Softmax output itself had been returned rather than its argmax (essentially a straight-through estimator).
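Here is a rough sketch of what that could look like. It is only an illustration, not tested on an actual beam sampling setup. The class name `GumbelArgmaxST`, the temperature argument `tau`, and the choice to return a one-hot vector (rather than an integer index, which cannot carry a gradient) are all assumptions of mine.

```python
import torch
import torch.nn.functional as F


class GumbelArgmaxST(torch.autograd.Function):
    """Hypothetical helper: the forward pass returns the one-hot argmax of a
    Gumbel-Softmax sample, the backward pass pretends the soft sample was
    returned instead."""

    @staticmethod
    def forward(ctx, logits, tau):
        # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1).
        # The clamp avoids log(0) when U happens to be exactly 0.
        u = torch.rand_like(logits).clamp_min(1e-10)
        gumbel = -torch.log(-torch.log(u))

        # Soft (relaxed) sample over the last dimension.
        soft = F.softmax((logits + gumbel) / tau, dim=-1)
        ctx.save_for_backward(soft)
        ctx.tau = tau

        # Hard sample: one-hot encoding of the argmax of the soft sample.
        index = soft.argmax(dim=-1)
        return F.one_hot(index, num_classes=logits.shape[-1]).to(logits.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        (soft,) = ctx.saved_tensors
        # Skip the argmax: treat grad_output as the gradient w.r.t. `soft`
        # and apply the softmax vector-Jacobian product by hand.
        inner = (grad_output * soft).sum(dim=-1, keepdim=True)
        grad_logits = soft * (grad_output - inner) / ctx.tau
        # No gradient for tau, which is a plain Python float here.
        return grad_logits, None


# Usage: a discrete sample in the forward pass, gradients still reach `logits`.
logits = torch.randn(2, 5, requires_grad=True)
sample = GumbelArgmaxST.apply(logits, 1.0)
loss = (sample * torch.arange(5.0)).sum()
loss.backward()
```

As far as I know, `torch.nn.functional.gumbel_softmax(logits, tau=1.0, hard=True)` already gives you the same behaviour out of the box, so the custom function is mostly useful if you want to change what happens in the backward pass.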