Submitted by pvp239 t3_1035jt4 in MachineLearning

Speechbox is built on the premise that Whisper is good enough to transcribe pretty much any English speech. Furthermore, Whisper was trained to predict punctuated, orthographic text.


Speechbox leverages Whisper's quality to "unnormalize" audio transcriptions (see the example below), making them more useful for downstream applications while guaranteeing that the exact same words are used.

"we are going to the san francisco beach" can have multiple meanings:

=>

  1. We are going to the San Francisco beach!
  2. We are going to the San Francisco beach?
  3. We are going to the San Francisco beach.


Speechbox will pick the correct one for you 😉
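
A rough usage sketch (this assumes the `PunctuationRestorer` class and argument names from the repo's README; check the GitHub link below for the exact API):

```python
from datasets import load_dataset
from speechbox import PunctuationRestorer

# Grab one audio sample with a normalized (unpunctuated, uncased) transcript.
dataset = load_dataset("librispeech_asr", "clean", split="validation", streaming=True)
sample = next(iter(dataset))
print(sample["text"])  # normalized transcript, no casing or punctuation

# Load a Whisper checkpoint wrapped for punctuation and casing restoration.
restorer = PunctuationRestorer.from_pretrained("openai/whisper-tiny.en")

# Restore casing and punctuation without changing any of the words.
restored_text, log_probs = restorer(
    sample["audio"]["array"],
    sample["text"],
    sampling_rate=sample["audio"]["sampling_rate"],
    num_beams=1,
)
print(restored_text)
```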


👉 GitHub: https://github.com/huggingface/speechbox

🤗 Demo: https://huggingface.co/spaces/speechbox/whisper-restore-punctuation

39

Comments


sloganking t1_j2xnk3k wrote

Have Whisper's hallucinations been improved yet? I know that before, it could sometimes derail and repeat itself nonsensically.

Its highs seem the highest, but its lows are, well... nonsensical.

7

pvp239 OP t1_j2xoukt wrote

The way it's implemented, Whisper cannot hallucinate: at each step it can only predict characters from the original normalized transcript or punctuation marks. The algorithm in speechbox therefore guarantees that the exact same words are kept (you can think of it as a very restricted beam search).
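
A greedy toy sketch of that idea (not the actual speechbox code; `score(prefix, continuation)` stands in for the Whisper decoder's log-probability of a continuation given the prefix):

```python
PUNCTUATION = ["", ".", ",", "?", "!"]  # "" means no punctuation inserted

def restore(normalized: str, score) -> str:
    restored = ""
    for ch in normalized:
        # Only the original character is allowed, optionally upper-cased
        # and/or preceded by a punctuation mark, so the words never change.
        candidates = [p + c for p in PUNCTUATION for c in (ch, ch.upper())]
        restored += max(candidates, key=lambda c: score(restored, c))
    # Finish with whichever sentence-final punctuation scores highest.
    return restored + max(".?!", key=lambda p: score(restored, p))
```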

1

Franck_Dernoncourt t1_j2y328o wrote

Thanks! How does Speechbox's punctuation restoration compare to other existing models/codebases for punctuation restoration?

2