audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It supports dozens of time-frequency analysis transforms and hundreds of corresponding time-domain and frequency-domain feature combinations. The extracted features can be fed to deep learning networks for training, and the library is used to study tasks in the audio field such as classification, separation, Music Information Retrieval (MIR), and ASR.

Source Code: https://github.com/libAudioFlux/audioFlux

Comments

CheekProfessional146 t1_jd2faom wrote on March 21, 2023 at 10:49 AM

Very good, but what is the difference between it and librosa?

Leo_D517 OP t1_jd2g6pg wrote on March 21, 2023 at 11:00 AM

First, librosa is a very good audio feature library.

audioFlux differs from librosa in three ways:

- Systematic, multi-dimensional feature extraction and combination that can be used flexibly for research and analysis across tasks.
- High performance: the core is implemented in C, with platform-specific FFT hardware acceleration, which makes large-scale feature extraction practical.
- Mobile support: it can compute features in real time on an audio stream on mobile devices.

Our team wants to build audio MIR products for mobile, so all feature extraction has to be fast and cross-platform, including on mobile.

For training, we were using librosa to extract CQT-related features at the time. It took about 3 hours for 10000 samples, which was really slow.

Here is a simple performance comparison

Server hardware:

- CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
- Memory: 128GB

Each sample is 128 ms long (sampling rate: 32000 Hz, data length: 4096 samples).

Total time to extract each feature from 1000 samples:

Feature	audioFlux	librosa	pyAudioAnalysis	python_speech_features
Mel	0.777s	2.967s	--	--
MFCC	0.797s	2.963s	0.805s	2.150s
CQT	5.743s	21.477s	--	--
Chroma	0.155s	2.174s	1.287s	--
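
To make the comparison easier to read, the timings above can be turned into speedup ratios. This is a small sketch that only reuses the numbers from the table; the `timings` dictionary and the `speedup_over` helper are hypothetical names introduced here and are not part of any of the libraries.

```python
# Timings in seconds, copied from the table above (1000 samples per run).
# Entries marked "--" in the table are simply left out.
timings = {
    "Mel": {"audioFlux": 0.777, "librosa": 2.967},
    "MFCC": {
        "audioFlux": 0.797,
        "librosa": 2.963,
        "pyAudioAnalysis": 0.805,
        "python_speech_features": 2.150,
    },
    "CQT": {"audioFlux": 5.743, "librosa": 21.477},
    "Chroma": {"audioFlux": 0.155, "librosa": 2.174, "pyAudioAnalysis": 1.287},
}

# Duration of one sample: 4096 samples at 32000 Hz.
sample_duration_ms = 4096 * 1000 / 32000  # 128.0


def speedup_over(package, feature, reference="audioFlux"):
    """Return how many times slower `package` is than `reference`.

    Returns None when the table has no timing for that package and feature.
    """
    row = timings[feature]
    if package not in row or reference not in row:
        return None
    return row[package] / row[reference]


for feature in timings:
    ratio = speedup_over("librosa", feature)
    print(f"{feature}: librosa takes {ratio:.2f}x as long as audioFlux")
```

On these numbers librosa takes roughly 3.7 to 3.8 times as long for Mel, MFCC, and CQT, and about 14 times as long for Chroma, while pyAudioAnalysis is nearly level with audioFlux on MFCC.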

Finally, audioFlux has been in development for about half a year and has been open source for only a little over two months. There are bound to be shortcomings and room for improvement. The team will keep working hard and listening to community opinions and feedback.

Thank you for your participation and support. We hope the project keeps getting better.

is_it_fun t1_jd3syu7 wrote on March 21, 2023 at 5:10 PM

Thank you for the very detailed response!

waffles2go2 t1_jd5su7l wrote on March 22, 2023 at 1:02 AM

> FFT hardware acceleration based on different platforms

???? I love me some FFTs but "hardware acceleration"?

Nowado t1_jd62sxt wrote on March 22, 2023 at 2:16 AM

https://www.reddit.com/r/Python/comments/11xfa47/deep_learning_for_audio_a_library_for_audio_and/jd2rd55?utm_medium=android_app&utm_source=share&context=3

I'd recommend mixing wording a bit, maybe some older accounts?

rising_pho3nix t1_jd2jjsz wrote on March 21, 2023 at 11:37 AM

This is nice.. I'm doing MIR as part of my Thesis work. Will definitely use this.

Leo_D517 OP t1_jd2k6u1 wrote on March 21, 2023 at 11:44 AM

Thank you for your support. If you are interested, you can join our project. Suggestions and feedback are welcome.

rising_pho3nix t1_jd2ka1c wrote on March 21, 2023 at 11:45 AM

Yes definitely. Currently exploring topics, once I start data processing will contact you.

xbcslzy t1_jd2eyo7 wrote on March 21, 2023 at 10:45 AM

Nice, hope it helps me in my work

fanjink t1_jd2ghpk wrote on March 21, 2023 at 11:03 AM

This library looks great, but I get this:
OSError: dlopen(/Users/***/opt/anaconda3/envs/audio/lib/python3.9/site-packages/audioflux/lib/libaudioflux.dylib, 0x0006): tried: '/Users/***/opt/anaconda3/envs/audio/lib/python3.9/site-packages/audioflux/lib/libaudioflux.dylib' (mach-o file, but is an incompatible architecture (have (x86_64), need (arm64e)))

Leo_D517 OP t1_jd2hhov wrote on March 21, 2023 at 11:15 AM

We have noticed this issue and it will be resolved in the next release. For now, you can install by compiling from source.

Please follow the steps in the documentation to compile the source code. The steps are as follows:

Installing dependencies on macOS
Install Command Line Tools for Xcode. Even if you installed Xcode from the App Store, you must configure command-line compilation by running:
$ xcode-select --install
Python setup:
$ python setup.py build
$ python setup.py install
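
Before rebuilding, it may help to confirm the mismatch reported in the OSError (library built for x86_64, process needing arm64e). The sketch below uses only the standard-library `platform` module; the `arch_family` and `same_family` helpers are hypothetical, and grouping arm64 with arm64e is a simplification made for this rough check.

```python
import platform

# Rough grouping of architecture names into families.
# Assumption: treating arm64 and arm64e as one family is good enough
# for spotting an Intel-vs-Apple-Silicon mismatch.
_FAMILIES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "arm64": "arm64",
    "arm64e": "arm64",
    "aarch64": "arm64",
}


def arch_family(name):
    """Map an architecture name to a coarse family, or return it unchanged."""
    return _FAMILIES.get(name.lower(), name.lower())


def same_family(library_arch, process_arch):
    """True when a library and a process belong to the same family."""
    return arch_family(library_arch) == arch_family(process_arch)


# Architecture the running Python interpreter reports.
interpreter_arch = platform.machine()
print("Python interpreter architecture:", interpreter_arch)

# Values taken from the error message: have (x86_64), need (arm64e).
print("Prebuilt library usable here:", same_family("x86_64", "arm64e"))
```

If the interpreter reports `arm64`, an x86_64-only build of the library cannot be loaded by it, which matches the error message.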

fanjink t1_jd2ho6o wrote on March 21, 2023 at 11:17 AM

Thank you, I’ll try it later

JJtheSucculent t1_jd3axwm wrote on March 21, 2023 at 3:14 PM

This is cool. I’m curious to try it out for an audio side project.

r4and0muser9482 t1_jd4e8qn wrote on March 21, 2023 at 7:24 PM

Looks neat. How does it compare to OpenSMILE? The license sure makes it an attractive alternative.

Leo_D517 OP t1_jd7sq7l wrote on March 22, 2023 at 1:36 PM

OpenSMILE is mainly used for emotion analysis and classification of audio, while audioFlux focuses on general audio feature extraction and is used to study tasks in the audio field such as classification, separation, Music Information Retrieval (MIR), and ASR.

Oswald_Hydrabot t1_jd5d0h0 wrote on March 21, 2023 at 11:08 PM

Very cool! I have been looking for a better toolkit for audio analysis, this looks great!

Oceanboi t1_jd6g49h wrote on March 22, 2023 at 4:12 AM

How do these handmade features compare to features identified by CNNs? Only reason I ask is that I'm finishing up some thesis work on sound event detection using different spectral representations as inputs to CNNs (Cochleagram, Linear Gammachirp, Logarithmic Gammachirp, Approximate Gammatone filters, etc). Wondering how these features perform in comparison on similar tasks (UrbanSound8K) and where it fits in the larger scheme of things.

SkullHero t1_jd3pkgh wrote on March 21, 2023 at 4:48 PM

Can't wait to try this out 😁

waffles2go2 t1_jd5t24q wrote on March 22, 2023 at 1:04 AM

>audioflux
>
> is a deep learning tool library for audio and music analysis, feature extraction.

itsnotlupus t1_jd5td54 wrote on March 22, 2023 at 1:06 AM

It's not a user-facing product, it's a building block that would be useful for training music-oriented neural networks, be they diffusers or other types of models.

It's probably going to take a little while before we see new models that leverage this library.

If you're looking for "stable diffusion but for music" right now, you could look at Riffusion (https://huggingface.co/riffusion/riffusion-model-v1)

ShowerVagina t1_jd5dnri wrote on March 21, 2023 at 11:13 PM

Yes. We have image AI, NLP text AI, video is on the way, probably later this year. I've been waiting for music AI. Jukebox was pretty meh. I know it can be way better.

nonetheless156 t1_jd5u6pr wrote on March 22, 2023 at 1:12 AM

Yeah! I can sing and be creative in that sense. But no patience in learning guitar or piano. But when we can make stuff like that, better than jukebox ai, I can start on that hobby

gootecks t1_jd6o1uo wrote on March 22, 2023 at 5:42 AM

Really interesting project! Do you think it could be used to detect sound effects in games? For example, you press a button in the game which triggers an attack that makes a sound when it connects.

Leo_D517 OP t1_jd7vszp wrote on March 22, 2023 at 1:58 PM

Of course. You can use audioFlux to extract features from recordings of the sound effects you want to detect, then build and train a model on them.

At run time, you extract the same features in real time from the audio stream captured by the microphone and feed them to the trained model for prediction.
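
The run-time half of that pipeline could be sketched roughly as follows. This is not audioFlux code: RMS energy stands in for the real features, a fixed threshold stands in for the trained model, and a synthetic signal stands in for the microphone. All function names, the 0.1 threshold, and the 1000-sample chunk size are assumptions made for illustration.

```python
import math
from typing import Callable, Iterable, Iterator, List

SAMPLE_RATE = 32000  # Hz, same rate as the benchmark earlier in the thread
FRAME_LENGTH = 4096  # samples per frame (128 ms at 32 kHz)


def frames_from_stream(chunks: Iterable[List[float]],
                       frame_length: int = FRAME_LENGTH) -> Iterator[List[float]]:
    """Regroup arbitrarily sized audio chunks into fixed-length frames."""
    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= frame_length:
            yield buffer[:frame_length]
            del buffer[:frame_length]


def rms_energy(frame: List[float]) -> float:
    """Stand-in feature: root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def detect_events(chunks: Iterable[List[float]],
                  extract: Callable[[List[float]], float],
                  predict: Callable[[float], bool],
                  frame_length: int = FRAME_LENGTH) -> List[bool]:
    """Extract a feature from every frame and ask the model for a decision."""
    return [predict(extract(frame))
            for frame in frames_from_stream(chunks, frame_length)]


# Simulated microphone input: two frames of silence, then two frames of a
# 440 Hz tone, delivered in 1000-sample chunks that ignore frame boundaries.
silence = [0.0] * (2 * FRAME_LENGTH)
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(2 * FRAME_LENGTH)]
signal = silence + tone
chunks = [signal[i:i + 1000] for i in range(0, len(signal), 1000)]

# Stand-in "model": flag a frame when its energy exceeds a fixed threshold.
decisions = detect_events(chunks, rms_energy, lambda energy: energy > 0.1)
print(decisions)  # [False, False, True, True]
```

In a real setup, `extract` would be replaced by the feature extractor used at training time and `predict` by the trained model, so that training and inference see the same features.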
