pandasiloc t1_j69uj8v wrote

The human brain doesn’t work like this. It’s not a question of “being smart” or simply having learned something previously. In order to implement this on the spot in a stressful situation, the relevant theory needs to be very fresh in your memory.

I highly doubt you would be able to reproduce a proof of the Fundamental Theorem of Algebra on the spot, even though it’s a simple concept that many people learn in middle school.

I would probably fail this question because I haven’t worked with deep learning much since I graduated 4 years ago. I majored in math at an Ivy League school and graduated with a pretty good GPA, so I don’t think my math is ‘weak’, either.

This kind of question does not make sense to ask on a live call unless someone claims to be working with deep learning architectures as part of their daily work.

3

Featureless_Bug t1_j69xpo0 wrote

Oh, a fellow mathematician. Look, I graduated from Cambridge 6 years ago, but I could still prove the fundamental theorem of algebra analytically or with Galois theory (I think I still remember the general ideas of both proofs), so I guess it depends on the person. But FTA is also a much more complicated thing to prove than the chain rule, and you don't even need to prove the chain rule to know how to use it. And sorry, if you don't remember how to differentiate multivariable functions, then you are an extraordinarily lousy mathematician. And if you know how to differentiate multivariable functions and if you are smart, you should be able to quickly come up with an implementation for backprop even if you don't remember anything else.

0

pandasiloc t1_j6a4a2n wrote

I never said I didn’t remember how to differentiate multivariate functions - my point was that equating conceptual mathematical knowledge and the ability to implement a specific application of such concepts in a time-constrained and stressful situation is inappropriate.

A lot of things need to come together in answering a question like this - remembering that the chain rule is the key concept in backprop in the first place, knowledge of how to implement matrix algebra in code, knowing the commonly used loss functions, how to compute their derivatives, and how to represent the differentiation in code, etc. None of these things is complicated on its own; the difficulty arises in bringing everything together in a small amount of time. It’s fair to expect people in the field to intuitively remember what is going on, but an on-the-spot implementation in under 30 minutes requires a level of rigor that is unrealistic even for a competent person who does not have the theory fresh in their memory.
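
To make the list of moving parts concrete, here is a minimal sketch of what such an answer might look like. Everything in it is an assumption chosen for illustration, not something specified in the interview question: a single hidden tanh layer, a scalar linear output, a squared-error loss, plain Python lists instead of a matrix library, and hypothetical helper names (`forward`, `loss`, `backward`, `numeric_grad_W1`). The finite-difference check at the end is the kind of sanity test that is easy to forget under time pressure.

```python
import math
import random

def forward(params, x):
    """Hidden tanh layer followed by a linear scalar output."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y

def loss(params, x, t):
    """Squared-error loss 0.5 * (y - t)^2 for one example."""
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2

def backward(params, x, t):
    """Gradients of the loss w.r.t. every parameter, via the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    dy = y - t                                  # dL/dy
    dW2 = [dy * hi for hi in h]                 # dL/dW2 = dL/dy * h
    db2 = dy
    dh = [dy * w for w in W2]                   # push the gradient back to h
    dz = [dhi * (1.0 - hi * hi)                 # tanh'(z) = 1 - tanh(z)^2
          for dhi, hi in zip(dh, h)]
    dW1 = [[dzi * xj for xj in x] for dzi in dz]
    db1 = dz
    return dW1, db1, dW2, db2

def numeric_grad_W1(params, x, t, i, j, eps=1e-6):
    """Central-difference estimate of dL/dW1[i][j], used only as a check."""
    W1 = params[0]
    orig = W1[i][j]
    W1[i][j] = orig + eps
    up = loss(params, x, t)
    W1[i][j] = orig - eps
    down = loss(params, x, t)
    W1[i][j] = orig                             # restore the weight
    return (up - down) / (2 * eps)

random.seed(0)
n_in, n_hidden = 3, 4
params = (
    [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    [0.0] * n_hidden,
    [random.uniform(-1, 1) for _ in range(n_hidden)],
    0.0,
)
x, t = [0.5, -1.0, 2.0], 1.0

dW1, db1, dW2, db2 = backward(params, x, t)
max_err = max(abs(dW1[i][j] - numeric_grad_W1(params, x, t, i, j))
              for i in range(n_hidden) for j in range(n_in))
print("max |analytic - numeric| for W1:", max_err)
```

Even at this size the sketch needs the forward pass, the loss, its derivative, the activation derivative, and the bookkeeping for each parameter to line up; a batched, matrix-based version adds shape and transpose handling on top of that.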

You keep using the term ‘smart’ and I don’t know what you mean by this. Your last statement is just an assertion without argument, one that you’ve repeated throughout your comments but that I see no reason to believe, given the above.

2