Submitted by CeFurkan t3_10xgvhj in MachineLearning

I have some old lecture recordings

I want to improve their sound quality

I have tested Adobe's AI noise removal, but the results were not very good

I also tested Descript's Studio Sound, and it was not very good either

I wonder if there is any public model, GitHub repo or project, or Hugging Face repo that I can use to remove noise and improve the sound quality of existing audio recordings?

Thank you so much for replies

Recordings are in English

Here is an example recording that needs to be cleaned (5 min audio): https://sndup.net/stjs/

Full lecture: https://youtu.be/2zY1dQDGl3o

31

Comments


Dry-Feature113 t1_j7t04s6 wrote

Can you upload a sample? Is it a bandwidth issue, a booming mic, cracks and pops, etc.?

3

Locomule t1_j7uintq wrote

Greets again, you've been helping me with SD. I followed your account last night and noticed this post. I'm a recording musician and thought I might be able to help you out with this issue.

Here is my result after using the audio editor Reaper to apply EQ and Compression.

2

vivehelpme t1_j7vcbx5 wrote

Transcribe them and put the transcripts in TTS

3

vivehelpme t1_j7vfol7 wrote

Instead of trying to salvage the original recording why not recreate it by putting the text transcript into a text-to-speech model?

As you have it transcribed you don't even need to do any advanced speech recognition that filters the noise, just paste the text into something a bit more advanced than Microsoft Sam

2

CeFurkan OP t1_j7volvh wrote

But what about syncing? How do I solve the syncing problem?

I haven't found any way to re-voice with proper synchronization

I can prepare a perfect .vtt file, but how do I sync it with the video?

1

CeFurkan OP t1_j7whf7d wrote

I have a .vtt file, you know, the subtitles we use for movies

But I haven't found any text-to-speech that can generate speech with that timing

Do you know any?

About your suggested approach, is there any way to do it automatically? I mean, we generate the speech and then we sync it, but how?

1

express_mode_420 t1_j7wizoa wrote

I'm not sure how I'd go about syncing it, but would this be an adequate workaround:

  • break your script apart into small chunks by timestamp
  • generate a separate TTS recording for each timestamped chunk
  • generate an audio file that inserts each of the produced recordings at its respective timestamped location
  • replace the audio of the original video with your newly produced recording
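
A rough sketch of the steps above, in case it helps. It parses cue timings from a .vtt file and drops each synthesized clip onto a silent track at the cue's start time. The `synthesize` argument is a hypothetical stand-in for whatever TTS model you end up using (assumed here to return a list of float samples at `sample_rate`). Cutting off clips that run longer than their cue is just one possible choice, not the only way to handle overruns.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.500".
# The hour field is optional in WebVTT, so it is optional here too.
CUE_RE = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert captured timestamp fields to seconds."""
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_vtt(text):
    """Return a list of (start_seconds, end_seconds, cue_text) tuples."""
    cues = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        match = CUE_RE.search(lines[i])
        if not match:
            i += 1
            continue
        g = match.groups()
        start, end = _seconds(*g[:4]), _seconds(*g[4:])
        i += 1
        body = []
        # The cue text runs until the next blank line.
        while i < len(lines) and lines[i].strip():
            body.append(lines[i].strip())
            i += 1
        cues.append((start, end, " ".join(body)))
    return cues


def assemble(cues, synthesize, sample_rate=16000):
    """Place each synthesized clip at its cue's start time on a silent track.

    `synthesize` is a hypothetical callable: text -> list of float samples.
    Clips longer than their cue are truncated; shorter ones leave silence.
    """
    if not cues:
        return []
    total = round(max(end for _, end, _ in cues) * sample_rate)
    track = [0.0] * total
    for start, end, text in cues:
        offset = round(start * sample_rate)
        slot = max(0, round(end * sample_rate) - offset)
        clip = list(synthesize(text))[:slot]
        track[offset:offset + len(clip)] = clip
    return track
```

The resulting track would still need to be written to a file and swapped in for the video's audio with whatever editor or tool you prefer.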
2

evanthebouncy t1_j7wjt36 wrote

Don't put your email in public like this. DM the guy. Remove the email while you still can.

EQ and compression are good techniques to try, and Reaper is free. I'm sure your friend can show you.
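
For anyone unsure what those two techniques do, here is a toy sketch, not what Reaper or any real plugin implements. It assumes audio as a list of float samples in [-1, 1]. The high-pass filter stands in for the simplest kind of EQ cut (removing low rumble), and the compressor is a static one that works on linear amplitude, whereas real compressors work in decibels with attack and release times. The function names and default values are made up for illustration.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """First-order RC high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def compress(samples, threshold=0.5, ratio=4.0):
    """Static compressor: shrink the part of each sample's magnitude
    that exceeds `threshold` by `ratio` (linear amplitude, no attack/release)."""
    out = []
    for x in samples:
        mag = abs(x)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(math.copysign(mag, x))
    return out
```

In practice you would cut the rumble first, then compress so the quiet and loud parts of the speech end up closer in level, and finally raise the overall gain.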

2

Locomule t1_j7x4aea wrote

Hey, seriously, if you are ever interested I can write something up. I need to anyway for future reference. The mechanics of sound are what musicians are all about, yet shockingly few actually trace their craft back to the root: the simple physical properties of the medium.

2

Locomule t1_j7x4y84 wrote

Oh wow, I can't wait to check that out! I was just telling my son about the old days of Telnet gaming, which I dabbled in. I was a member of an old school (post Telnet) early graphical MMORPG called DragonSpires, which itself spawned Furcadia, now the longest continuously running MMORPG online, last time I checked. Or something like that. Then I went on to help run a Player Worlds based MMORPG called Delrith Online. Seems like soooo long ago now...

2

No_Network_3714 t1_j89zlcu wrote

Thought I had previously replied. I am also interested in letting you try to clean up my two audio files, or in knowing when it goes public. They are both over 40 minutes long and were recorded in a car with the microphone held too close.

1

No_Network_3714 t1_j92iku0 wrote

Thank you. I have uploaded the two audio files in WAV format to Google Docs but will need an email address in order to share them with you. How do you suggest you get that information to me?

1