Submitted by **Ashb0rn3_** t3_1182mkd
in **MachineLearning**

[removed]

I went through this about five years ago.

For me, the main job was learning all of the terminology and getting a feel for which techniques are used to solve which kinds of problems. Back then, I spent many hours listening to podcasts. Just hearing people talk about the stuff helped me build a map of the territory and decide where to dive deeper.

Then as soon as I had even the slightest grasp of a possible solution to a problem in my domain, I would go try to attack it. In this early era I made hundreds of Jupyter notebooks. Each one was me spending a few hours trying out a technique on some data from my business. Some worked, some didn't, but I got a lot of experience in a short time.

I had a strong math and SWE background to begin with. If you don't, you may have some extra catching up to do. As far as math goes, Linear Algebra is the most important. Probability Theory and Differential Equations are also very applicable. Most SWE work tied to Machine Learning is pretty basic. Lots of Python, but it helps to understand how computers work because you do get into data at scale pretty often.

At this point, I've deployed many ML systems to production that serve hundreds of thousands of users daily, and I can keep up with experts in conversation, design discussions, and so on.

> Differential Equations

I have a (somewhat) strong math background (I took many courses through the math departments of the universities I attended) and a strong SW background (web, then MLE for a few years). However, I have never used or studied Differential Equations (god knows why). I understand quite deeply how calculus and linear algebra relate to neural networks, and probability runs through the field everywhere by definition - but could you explain when you actually need knowledge of Differential Equations? I ask out of ignorance, again - I have never studied the subject. Could you link it to ML concepts I probably don't understand well because of that gap? Also, I would add optimization to the answer :)

Edit 2: also, how deeply would you suggest learning it? https://www.youtube.com/watch?v=9fQkLQZe3u8 - what do you think of this one?

I guarantee that you have used stochastic gradient descent before if you’ve done any significant amount of ML work. This technique and other optimization methods like it are rooted in differential equations.
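One concrete version of that link (a minimal sketch of my own, not from the thread): plain gradient descent is exactly forward-Euler integration of the gradient-flow ODE dx/dt = -f'(x). For f(x) = x², the ODE has the closed-form solution x(t) = x₀·e^(-2t), so you can check the discrete iterates against the continuous solution:

```python
import math

# Gradient descent on f(x) = x**2, viewed as forward-Euler integration
# of the gradient-flow ODE  dx/dt = -f'(x) = -2x,
# whose exact solution is   x(t) = x0 * exp(-2t).

def grad(x):
    return 2 * x  # f'(x) for f(x) = x**2

x0 = 1.0   # initial point
lr = 0.01  # learning rate = Euler step size dt
steps = 500

x = x0
for _ in range(steps):
    x -= lr * grad(x)  # Euler step: x_{k+1} = x_k - dt * f'(x_k)

t = steps * lr               # total "time" integrated
exact = x0 * math.exp(-2 * t)  # continuous gradient-flow solution

# the discrete iterate closely tracks the continuous solution
print(x, exact)
```

The learning rate plays the role of the integration step size, which is one reason step-size intuition from numerical ODEs transfers to tuning learning rates.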

I have heard this point before, but I was hoping for non-trivial cases from everyday work. I feel I understand SGD perfectly well without learning to solve complicated DEs, but perhaps that's limiting me on other tasks, or in my ability to analyze ML algorithms. Are you sure it's the right hierarchy to say that SGD is rooted in differential equations? I agree that it is a differential equation, but are the methods you learn in a differential equations course actually useful for ML?

I found a nice article about the link to SGD: https://tivadardanka.com/blog/why-does-gradient-descent-work - but I'm not sure I'm convinced (again, I'm still ignorant here, so I shouldn't have strong opinions about links to differential equations lol - but to me, fitting SGD into the framework of differential equations goes against the KISS principle). Sorry if I'm going too deep; I'm just trying to figure out how much effort to put into this, since time is limited (I could happily study it all day for fun, but there's work and so on) :)

Thanks for the answer! I'm now convinced (by your message, and by my own reading today) that it's terrible I don't know it and that I should learn it ASAP.

As a start, I suggest learning the following:

Statistics:

- probability (distributions, basic manipulations)

- statistical summaries (univariate and bivariate)

- hypothesis testing / confidence intervals

- linear regression

Linear Algebra:

- basic understanding of arranging data in vectors and matrices

- operators (matrix multiplication, ...)

Calculus:

- limits

- basic differentiation and integration (at least of polynomials)

Information Theory (Discrete):

- entropy, joint entropy, conditional entropy, mutual information
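To make the discrete information-theory items concrete, here's a minimal sketch (with made-up sample data, purely for illustration) that computes entropy, joint entropy, and mutual information from empirical counts:

```python
import math
from collections import Counter

# Toy joint sample of two discrete variables (weather, ground state).
# The data is invented for illustration only.
pairs = [("rain", "wet"), ("rain", "wet"), ("sun", "dry"),
         ("sun", "dry"), ("sun", "wet"), ("rain", "dry")]

def entropy(counts):
    # H = -sum p * log2(p) over the empirical distribution
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

H_xy = entropy(Counter(pairs))               # joint entropy H(X, Y)
H_x = entropy(Counter(x for x, _ in pairs))  # marginal entropy H(X)
H_y = entropy(Counter(y for _, y in pairs))  # marginal entropy H(Y)
I_xy = H_x + H_y - H_xy                      # mutual information I(X; Y)

print(H_x, H_y, H_xy, I_xy)
```

With both marginals uniform over two values, H(X) = H(Y) = 1 bit, and I(X; Y) falls between 0 (independent) and min(H(X), H(Y)) (fully determined).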

**average_bme_studentt1_j9f83gt** wrote: I've been loosely following this: https://i.am.ai/roadmap/#machine-learning-roadmap - if anyone with more experience has any comments about that specific map, I'd be happy to hear them.