Submitted by AutoModerator t3_yntyhz in MachineLearning
Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead!
Thread will stay alive until next one so keep posting after the date in the title.
Thanks to everyone for answering questions in the previous thread!
natfabulous t1_ivbpmj3 wrote
Hi! I'm looking for a neural network that takes in speech and outputs phonemes. I basically want the first part of a speech-to-text network. I'd like to do this operation in real time. I've had no luck finding a network like this so I'd appreciate any input :)
Input: array of numbers representing the last N seconds of speech
Output: array of IPA-like values, one for each T-millisecond chunk of the input
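To make the input/output shape above concrete, here is a minimal sketch of the interface only, not a working recognizer. It assumes mono audio as a flat sequence of floats, non-overlapping chunks, and a caller-supplied `classify_chunk` function. The names `split_into_chunks`, `phonemes_for_buffer`, and `dummy_classifier` are hypothetical, and the dummy classifier is an energy threshold standing in for whatever trained model ends up being used.

```python
from typing import Callable, List, Sequence


def split_into_chunks(
    samples: Sequence[float], sample_rate: int, chunk_ms: int
) -> List[Sequence[float]]:
    """Split a buffer into consecutive, non-overlapping chunks of chunk_ms.

    A trailing partial chunk is dropped (an assumption made for simplicity).
    """
    chunk_len = sample_rate * chunk_ms // 1000  # samples per chunk
    if chunk_len <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


def phonemes_for_buffer(
    samples: Sequence[float],
    sample_rate: int,
    chunk_ms: int,
    classify_chunk: Callable[[Sequence[float]], str],
) -> List[str]:
    """Return one label per chunk_ms chunk of the buffer."""
    chunks = split_into_chunks(samples, sample_rate, chunk_ms)
    return [classify_chunk(chunk) for chunk in chunks]


def dummy_classifier(chunk: Sequence[float]) -> str:
    """Placeholder, NOT a real model: labels a chunk by mean absolute amplitude."""
    energy = sum(abs(x) for x in chunk) / len(chunk)
    return "a" if energy > 0.1 else "_"


if __name__ == "__main__":
    sample_rate = 16000  # assumed sample rate
    # One second of silence followed by one second of constant signal.
    buffer = [0.0] * sample_rate + [0.5] * sample_rate
    labels = phonemes_for_buffer(buffer, sample_rate, 20, dummy_classifier)
    print(len(labels), labels[0], labels[-1])
```

For real-time use, the same function could be called repeatedly on a sliding buffer holding the last N seconds, with `dummy_classifier` swapped for an actual per-frame phoneme model.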