zoontechnicon t1_j5oiraa wrote

I'm trying to use this model to summarize text: https://huggingface.co/bigscience/mt0-large. However, text generation seems to stop as soon as the model emits the special end token </s>. I wonder how I could coax it into generating longer texts. Any ideas?

zoontechnicon t1_j69b6g5 wrote

The solution, as evidenced by the code in huggingface/transformers, is to force the score of the end token to -Inf (so its probability becomes zero) until the output is long enough. What a hack...
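For anyone who wants to see the trick in isolation, here is a minimal sketch in plain Python. It is not the transformers code itself: the function names, the toy scores and the `EOS_ID` value are made up for illustration, and it assumes greedy selection over a list of per-token scores.

```python
import math

EOS_ID = 1  # hypothetical id of the end token </s> in a toy 3-token vocabulary


def suppress_eos(scores, eos_token_id, cur_len, min_length):
    """Return a copy of `scores` with the end token set to -inf
    while the sequence is still shorter than `min_length`."""
    masked = list(scores)
    if cur_len < min_length:
        masked[eos_token_id] = -math.inf
    return masked


def softmax(scores):
    """Turn scores into probabilities; exp(-inf) evaluates to 0.0."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(scores):
    """Index of the highest-scoring token."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy scores where the model would prefer to stop right away.
scores = [0.1, 2.5, 0.3]

too_short = suppress_eos(scores, EOS_ID, cur_len=3, min_length=10)
long_enough = suppress_eos(scores, EOS_ID, cur_len=10, min_length=10)

print(greedy_pick(too_short))    # a non-EOS token is chosen
print(greedy_pick(long_enough))  # EOS is allowed again
```

In transformers, as far as I can tell, passing `min_length` to `generate()` applies this kind of masking through `MinLengthLogitsProcessor`, so you should not need to patch the scores yourself.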
