Submitted by AutoModerator t3_11pgj86 in MachineLearning
disastorm t1_jd66swu wrote
Reply to comment by trnka in [D] Simple Questions Thread by AutoModerator
I see, thanks. Is that basically the equivalent of having "top_k" = 1?
Can you explain what these mean? From what I understand, top_k means it considers the top K possible words at each step.
I can't exactly understand what top_p means. Can they be used together?
trnka t1_jd82eo1 wrote
If you're using some API, it's probably best to look at the API docs.
If I had to guess, I'd say that top_k is about the beam width in beam search. And top_p is dynamically adjusting the beam width to cover the amount of the probability distribution you specify.
top_k=1 is probably what we'd call a greedy search. It's going left to right and picking the most probable token. The sequence of tokens selected in this way might not be the most probable sequence though.
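To make the two filters concrete, here's a minimal sketch (not tied to any particular API) of top-k and top-p filtering over a toy next-token distribution; the token probabilities are made-up values for illustration:

```python
# Toy next-token probabilities (hypothetical values for illustration)
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, renormalized."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cum = [], 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, pr))
        cum += pr
        if cum >= p:
            break
    total = sum(pr for _, pr in kept)
    return {tok: pr / total for tok, pr in kept}

# top_k=1 keeps only the single most probable token (the greedy choice)
print(top_k_filter(probs, 1))    # {'the': 1.0}
# top_p=0.8 keeps "the" and "a" (0.5 + 0.3 = 0.8), then renormalizes
print(top_p_filter(probs, 0.8))  # {'the': 0.625, 'a': 0.375}
```

Note that the two filters can be applied one after the other, which is why many APIs let you set both at once.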
Again, check the API docs to be sure.
All that said, these are just settings for discovering the most probable sequence in a computationally efficient way. It's still deterministic and still attempting to find the most probable sequence. What I was describing in the previous response was adding some randomness so that it's not deterministic.
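A sketch of that distinction, using the same toy distribution as an assumption: greedy selection is deterministic, while sampling draws a token proportionally to its probability and so varies from run to run.

```python
import random

# Toy next-token probabilities (hypothetical values for illustration)
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}

def greedy_pick(probs):
    # Deterministic: always returns the single most probable token
    return max(probs, key=probs.get)

def sample_pick(probs, rng=random):
    # Non-deterministic: draws a token weighted by its probability
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

print(greedy_pick(probs))  # always "the"
print(sample_pick(probs))  # varies run to run
```

In practice the filtered distribution from top_k/top_p is what gets sampled from, so the two ideas compose.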
disastorm t1_jd92s7i wrote
Thanks, I found some articles talking about these variables.