gdahl t1_j6upct4 wrote

I would say the turning point was when we published the first successful large-vocabulary results with deep acoustic models in April 2011, based on work conducted over the summer of 2010. By the time we published the paper you mention, the point was to acknowledge that these techniques had become the new standard in top speech recognition groups.

Regardless, there were deep learning roles in tech companies in 2012, just not very many of them compared to today.

gdahl t1_j6uc1bh wrote

Deep learning existed as a field in 2012. The speech recognition community had already adopted deep learning by that point. The Brain team at Google already existed. Microsoft, IBM, and Google were all using deep learning. As an academic subfield, researchers started to coalesce around "deep learning" as a brand in 2006, though it was certainly still very niche back then.

gdahl t1_j6ubet7 wrote

Deep learning roles 10 years ago (in 2013) looked pretty similar to what they look like now; the main difference is that there are far more of them today. I'm sure there will be some changes and a proliferation of more entry-level roles and "neural network technician" roles, but it isn't going to be that different.

gdahl t1_iqpg125 wrote

People use lots of other things too: probit regression, Poisson likelihoods, all sorts of stuff. As you said, it is best to fit what you are doing to the problem.

Logistic-regression-style output layers are very popular in deep learning, perhaps even more so than in other parts of the ML community. Gaussian process classification, by contrast, is often done with probit models (see http://gaussianprocess.org/gpml/chapters/RW3.pdf ). When necessary, though, people will design neural network output activation functions and losses to fit the problem they are solving.
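For concreteness, here is a minimal PyTorch sketch of that idea for count-valued targets (illustrative only; the network, shapes, and data are all made up): the final layer predicts the log of a Poisson rate, and training uses the Poisson negative log-likelihood instead of a cross-entropy loss.

```python
import torch
import torch.nn as nn

# Match the output layer and loss to the problem: for count data, let the
# last Linear layer predict log(rate) of a Poisson distribution and train
# against the Poisson negative log-likelihood.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # output interpreted as log(rate), so rate = exp(output) > 0
)
loss_fn = nn.PoissonNLLLoss(log_input=True)  # expects log(rate) as input

x = torch.randn(8, 16)                      # made-up batch of 8 feature vectors
y = torch.poisson(torch.full((8, 1), 3.0))  # made-up count targets
loss = loss_fn(model(x), y)
loss.backward()
```

The same pattern applies to a probit output: swap in a standard normal CDF as the output nonlinearity and use the corresponding Bernoulli negative log-likelihood as the loss.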

That said, a lot of people doing deep learning joined the field in the past two years and just use whatever they see other people using, without giving it much thought. That is how we end up with these extremely popular cross-entropy losses.
