Submitted by CeFurkan t3_10xgvhj in MachineLearning

I have got old lecture recordings

I want to improve their sound quality

I have tested Adobe AI noise removal, but it is not very good

I also tested Descript Studio Sound; it is not very good either

Is there any public model, GitHub repo or project, or Hugging Face repo that I can use to remove noise and improve the sound quality of existing audio recordings?

Thank you so much for any replies

Recordings are in English

Here is an example recording that needs to be cleaned (5 min of audio): https://sndup.net/stjs/

Full lecture: https://youtu.be/2zY1dQDGl3o

31

Comments

logsinh t1_j7t1cna wrote

If the recordings are not confidential, I can process them for you (we are not ready to publish our model yet). If you prefer a public model, this one is pretty good: https://huggingface.co/spaces/hshr/DeepFilterNet2

16

CeFurkan OP t1_j7tob1u wrote

>Nvidia RTX voice

Example link from which you can download and extract the audio quickly if you wish: https://youtu.be/2zY1dQDGl3o

Also, here is a 5 min example speech: https://sndup.net/stjs/

2

logsinh t1_j7tqjmm wrote

The audio is a bit distorted, possibly due to noise gating. I don't see too much noise, so maybe noise reduction is not what you need. The audio has 8 kHz bandwidth (16 kHz sample rate), so you could try an audio super-resolution network such as https://github.com/mindslab-ai/nuwave2 to increase the audio bandwidth.
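
The bandwidth figure above follows from the sample rate: a recording cannot carry frequencies above half its sample rate, so a 16 kHz file tops out at 8 kHz. A minimal sketch for checking this on a WAV file, using only Python's standard `wave` module. The helper names and the throwaway demo file are illustrative assumptions, not something from the thread.

```python
import os
import tempfile
import wave


def max_bandwidth_hz(path):
    """Return (sample_rate, Nyquist limit in Hz) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
    # Nothing above half the sample rate can be represented in the file.
    return sample_rate, sample_rate / 2


def write_silent_wav(path, sample_rate, seconds=1):
    """Write a mono 16-bit silent WAV so the sketch runs without real audio."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")  # placeholder file
    write_silent_wav(demo_path, 16000)
    print(max_bandwidth_hz(demo_path))
```

Pointing `max_bandwidth_hz` at the extracted lecture audio instead of the demo file shows whether bandwidth, rather than noise, is the limiting factor.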

3

CeFurkan OP t1_j7tr5at wrote

Yes, I tried some options in OBS back then. It was probably the noise gate. I had even forgotten about it.

Thank you so much for the reply, going to test that repo now

1

CeFurkan OP t1_j7trlei wrote

Their example shows a really good improvement, but do I need to train anything for that?

I opened an issue thread, but without much hope: https://github.com/mindslab-ai/nuwave2/issues/11

1

logsinh t1_j7tu0x1 wrote

Just download the checkpoint and use the command in the Inference section. sr should be 16000

2

CeFurkan OP t1_j7tvbny wrote

Thanks, I made it work

However, I got an out-of-memory error on an RTX 3060 with 12 GB of VRAM

It is like a joke :/

https://i.imgur.com/KslqNBg.png
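
An out-of-memory error on a long file is often driven by input length rather than model size. One common workaround, which the thread does not confirm for nuwave2, is to run inference on short chunks and join the results. A minimal sketch in plain Python: `enhance_chunk` is a hypothetical stand-in for whatever inference call is used, and the 10-second chunk length is an arbitrary assumption.

```python
def split_into_chunks(samples, chunk_size):
    """Yield consecutive slices of at most chunk_size samples."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def process_in_chunks(samples, enhance_chunk, sample_rate=16000, chunk_seconds=10):
    """Run enhance_chunk (hypothetical callable) on each piece and join the outputs."""
    chunk_size = sample_rate * chunk_seconds
    output = []
    for chunk in split_into_chunks(samples, chunk_size):
        # Only one chunk is handed to the model at a time.
        output.extend(enhance_chunk(chunk))
    return output
```

Hard chunk boundaries can produce audible clicks; overlapping the chunks and crossfading is the usual refinement.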

1

logsinh t1_j7tsvku wrote

Anyway, here is the denoised audio of your example speech: https://www.sndup.net/pbxf/. There is no improvement, so your best bet is audio super-resolution.

Input: Speech MOS: 4.259, Noise MOS: 4.369, Overall MOS: 3.927

Output: Speech MOS: 4.263, Noise MOS: 4.403, Overall MOS: 3.947
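
To put the "no improvement" reading in numbers, the per-metric change can be computed directly from the scores above. A small sketch: only the six MOS values come from the comment, and the variable names are illustrative.

```python
# MOS values copied from the comment above.
input_mos = {"speech": 4.259, "noise": 4.369, "overall": 3.927}
output_mos = {"speech": 4.263, "noise": 4.403, "overall": 3.947}

# Change per metric, rounded to the three decimals the scores are reported in.
deltas = {name: round(output_mos[name] - input_mos[name], 3) for name in input_mos}
print(deltas)
```

Every change is a few hundredths of a point on a scale that runs from 1 to 5.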

2

CeFurkan OP t1_j7ttvbm wrote

>audio super-resolution

Thank you so much for the answers and the testing

Any idea how to get super-resolution? Or is my only option mindslab-ai/nuwave2?

1

No_Network_3714 t1_j87lp29 wrote

I am also interested in having you process two recordings. They are both a little over 40 minutes in length. If you feel you can do this, please contact me at (email address removed). Thanks.

1

No_Network_3714 t1_j89zlcu wrote

Thought I had previously replied. I am also interested in letting you try to clean up my two audio files, or in knowing when the model goes public. They are both over 40 minutes long and were recorded in a car with the microphone held too close.

1

logsinh t1_j8cnqzr wrote

Please upload them somewhere, preferably in WAV format. I will do it when I have time.

1

No_Network_3714 t1_j92iku0 wrote

Thank you. I have uploaded the two audio files in WAV format to Google Docs but will need an email address in order to share them with you. How do you suggest you get that information to me?

1

starstruckmon t1_j7telrz wrote

Nvidia RTX voice

3

CeFurkan OP t1_j7to9qw wrote

These are already recorded. How can I use it to process these recordings quickly?

1

vivehelpme t1_j7vcbx5 wrote

Transcribe them and put the transcripts in TTS

3

CeFurkan OP t1_j7vf77g wrote

What do you mean by that? I have transcripts, but then what do I do? Thank you

1

vivehelpme t1_j7vfol7 wrote

Instead of trying to salvage the original recording why not recreate it by putting the text transcript into a text-to-speech model?

As you have it transcribed you don't even need to do any advanced speech recognition that filters the noise, just paste the text into something a bit more advanced than Microsoft Sam

2

CeFurkan OP t1_j7volvh wrote

But what about syncing? How do I solve the syncing problem?

I haven't found any way to re-voice with proper synchronization

I can prepare a perfect .vtt file, but how do I sync it with the video?

1

express_mode_420 t1_j7w3mrm wrote

Could you speech-to-text your lecture, collecting timestamps, do the same with TTS and automagically sync that way?

2

CeFurkan OP t1_j7whf7d wrote

I have a VTT file, you know, the subtitles we use for movies

But I haven't found any text-to-speech that can generate speech with that timing

Do you know any?

About your suggested approach, is there any way to do it automatically? I mean, we generate the speech and then we sync, but how?

1

express_mode_420 t1_j7wizoa wrote

I'm not sure how I'd go about syncing it, but would this be an adequate workaround:

  • break your script apart into small chunks by timestamp
  • generate a separate TTS recording for each timestamped chunk
  • generate an audio file that inserts each of the produced recordings at its timestamped location
  • replace the audio of the recording with your newly produced recording
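
The steps above can be sketched in Python. This is an illustration under stated assumptions, not a tested pipeline: `synthesize` is a hypothetical callable standing in for any TTS engine that returns a list of samples, the parser only handles `hh:mm:ss.mmm` cue timings separated by blank lines, and audio is represented as a plain list of floats.

```python
import re

CUE_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(\d+):(\d{2}):(\d{2})\.(\d{3})"
)


def parse_vtt_cues(vtt_text):
    """Return a list of (start_seconds, end_seconds, text) from WebVTT text."""
    cues = []
    for block in vtt_text.strip().split("\n\n"):
        lines = block.strip().splitlines()
        for index, line in enumerate(lines):
            match = CUE_TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the cue text.
                cues.append((start, end, " ".join(lines[index + 1:])))
                break
    return cues


def assemble_timeline(cues, synthesize, sample_rate, total_seconds):
    """Place one synthesized clip per cue at the cue's start time."""
    timeline = [0.0] * int(total_seconds * sample_rate)
    for start, _end, text in cues:
        clip = synthesize(text)  # hypothetical TTS call returning samples
        offset = round(start * sample_rate)
        # Drop whatever would run past the end of the timeline.
        clip = clip[:max(0, len(timeline) - offset)]
        timeline[offset:offset + len(clip)] = clip
    return timeline
```

A clip that runs longer than its cue window is overwritten by the next cue here; a real tool would need to speed up, trim, or re-time such clips.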

2

CeFurkan OP t1_j7wsy5f wrote

So it is a logical layout

Is there any software that can do it?

1

express_mode_420 t1_j7wya6a wrote

I think this is more likely a task for Python. I haven't done anything like this myself, it's just the approach I would start with.

2

CeFurkan OP t1_j7yjgw6 wrote

If only I were a Python programmer and not a C# programmer :/

1

express_mode_420 t1_j7z394g wrote

Check out murf.ai; that service works similarly to what I described

2

CeFurkan OP t1_j81neng wrote

Tested it, looks awesome, but I would have to purchase the yearly plan, which is $3500 lol :D

1

Fit_Schedule5951 t1_j7tk2db wrote

Try the denoiser from Facebook

2

CeFurkan OP t1_j7todbc wrote

>denoiser

I need a post-processor for existing recordings. Would it work for that? Could you give me a link?

1

Fit_Schedule5951 t1_j7twyx9 wrote

https://github.com/facebookresearch/denoiser

Use the pretrained model on your recordings

2

CeFurkan OP t1_j7txh8r wrote

Thank you so much

I tested with model = pretrained.dns64().cuda()

Is this their best pretrained model?

2

jeanfeydy t1_j7tnqu0 wrote

I used https://audo.ai/noise-removal for my own lectures: it’s more than good enough to make up for a poor microphone and background noise. You can try for free on your own audio samples and see for yourself!

2

Locomule t1_j7uintq wrote

Greets again, you've been helping me with SD. I followed your account last night and noticed this post. I'm a recording musician and thought I might be able to help you out with this issue.

Here is my result after using the audio editor Reaper to apply EQ and Compression.
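
For readers unfamiliar with compression, the core idea is to reduce the loudest parts of a signal so that the overall level can be raised. A minimal sketch of a static compressor in plain Python. It is a simplification and not the settings used in Reaper here, which the comment does not give: real compressors usually work in decibels with attack and release times, and the `threshold`, `ratio`, and `makeup_gain` defaults below are arbitrary assumptions.

```python
def compress(samples, threshold=0.5, ratio=4.0, makeup_gain=1.0):
    """Apply a static compressor to samples in the range [-1.0, 1.0].

    Levels above threshold rise only 1/ratio as fast as the input;
    makeup_gain then scales the whole signal.
    """
    compressed = []
    for sample in samples:
        level = abs(sample)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        sign = -1.0 if sample < 0 else 1.0
        compressed.append(sign * level * makeup_gain)
    return compressed
```

With the defaults, a full-scale peak of 1.0 comes out at 0.625 while a quiet sample of 0.25 is left unchanged, which narrows the gap between loud and quiet passages.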

2

CeFurkan OP t1_j7wi9v3 wrote

> audio editor Reaper to apply EQ and Compression

If you make a video showing how to do it, I would watch it :D

2

Locomule t1_j7x4aea wrote

Hey, seriously, if you are ever interested I can write something up. I need to anyway for future reference. The mechanics of sound are what musicians are all about, yet shockingly few actually trace their craft back to the root: the simple physical properties of the medium.

2

CeFurkan OP t1_j7whknq wrote

Wow, this is amazing

How can I contact you?

my email : monstermmorpg@gmail.com

my discord : MonsterMMORPG#2198

1

evanthebouncy t1_j7wjt36 wrote

Don't put your email in public like this. DM the guy. Remove the email while you still can.

EQ and compression are good techniques to try, and Reaper can be evaluated for free. I'm sure your friend can show you.

2

CeFurkan OP t1_j7wleql wrote

It is fine, this is my public email

Thank you for the warning

1

Locomule t1_j7wsobl wrote

MonsterMMORPG eh? Very interesting!! :D

2

CeFurkan OP t1_j7wt0as wrote

Yes, that is the game I develop: https://www.monstermmorpg.com

2

Locomule t1_j7x4y84 wrote

Oh wow, I can't wait to check that out! I was just telling my son about the old days of Telnet gaming, which I dabbled in. I was a member of an old school (post Telnet) early graphical MMORPG called DragonSpires, which itself spawned Furcadia, now the longest continuously running MMORPG online last time I checked? Or something like that. Then I went on to help run a Player Worlds based MMORPG called Delrith Online. Seems like soooo long ago now...

2

CeFurkan OP t1_j7yjf7j wrote

This is also very old school

Text and image based, but with extremely in-depth game mechanics

2

Locomule t1_j7z6bxv wrote

I have it bookmarked, I will definitely fire it up and take a spin. I will probably ~~steal~~ borrow some ideas for a Scratch project :)

2