Immarhinocerous t1_jcsdckk wrote

As someone who's mostly self-taught (I have a BSc, but it's in health sciences), I followed a similar route to what they recommended. It gives you:

  • income,

  • experience working in software development - you will hopefully learn a lot from this,

  • exposure to co-workers who you may be able to learn from, especially if your backend role is at a place doing ML.

With income, you can also afford to take more courses. Even if it's only the occasional weekend course, or something you work on a few nights a week, it can help you expand your skillset while gaining other practical skills (backend work with APIs, DBs, cloud infrastructure, etc are all useful).

After doing that for a while, you may be able to land a more focused ML role, or do a master's program (which, combined with your SWE experience, will give you a leg up on landing the role you want). If you want to go straight into an ML role after SWE, you will definitely need project experience. But you can build that while working, if you're up for it.

One of the best ML people I know has a maths background, works in risk/finance, and is basically entirely self-taught. But the guy is brilliant and insanely passionate about what he does. I just mention him to show that you don't absolutely need to go the master's route. But it could be worthwhile when you can afford it, especially if you're lacking in maths.

3

Immarhinocerous t1_j3gkq83 wrote

Google is your friend here. ChatGPT may even give a decent response.

Start by learning bagging, then learn boosting.
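To make the distinction concrete, here is a minimal pure-Python sketch of bagging (bootstrap aggregating): train several weak learners, each on a bootstrap resample of the data, then average their predictions. The dataset and the slope-only learner are made up for illustration; real work would use something like scikit-learn's BaggingRegressor.

```python
import random

random.seed(0)

# Hypothetical 1D dataset: y is roughly 2*x plus noise.
data = [(x, 2 * x + random.gauss(0, 1)) for x in range(1, 21)]

def fit_slope(sample):
    """Weak learner: least-squares slope through the origin."""
    num = sum(x * y for x, y in sample)
    den = sum(x * x for x, _ in sample)
    return num / den

def bagged_slopes(data, n_models=25):
    """Fit one weak learner per bootstrap resample (sampling with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [random.choice(data) for _ in data]
        models.append(fit_slope(sample))
    return models

models = bagged_slopes(data)
# The ensemble prediction is the average of the individual models' outputs.
bagged = sum(models) / len(models)
print(round(bagged, 2))  # close to the true slope of 2
```

Boosting differs in that the learners are trained sequentially, each one correcting the errors of the ensemble so far, rather than independently in parallel like this.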

I find the following site fairly good: https://machinelearningmastery.com/essence-of-boosting-ensembles-for-machine-learning .

The explanations are usually approachable, and when an article goes deep he often has another one on the same topic at a simpler level of detail. He includes good code samples and keeps each article at a consistent depth, so he caters to a broad range of audiences. I've even bought some of his paid materials and they were fairly good, but his free articles are plenty.

There are lots of other sites you can find that will teach you. Read a few different sources. They're worth understanding well. Since you stated you've read papers on gradient descent, you might find some helpful papers by searching scholar.google.com.

This is also a good place to start: https://www.ibm.com/topics/bagging

1

Immarhinocerous t1_j3g6gf5 wrote

That's really interesting, thanks for the share. Though I wonder whether most decision trees, even if they're capable of representing the same solutions as a neural network, actually converge on those solutions during training. If trees don't converge on the same solutions, and NNs outperform trees, then NNs would still be needed for training, and the model could then be optimized for run time by distilling a tree from the NN.

2

Immarhinocerous t1_j3g5qgb wrote

What do you want to do with it?

For tabular data of a few million rows or less, you're often much better off using XGBoost or one of the other boosting libraries. Read up on boosting: it's an alternative approach to deep learning. Technically boosting can also be applied to neural networks, including deep ones, but in practice it rarely is, because boosting relies on ensembles of many weak learners, whereas deep learning spends one long training run producing a single strong learner.

XGBoost and CatBoost have won many, many Kaggle competitions. My former employer, which modeled people's credit scores, trained all of its production models with XGBoost. There are many reasons to use it, including training speed (much faster to train than deep neural networks) and interpretability (the model's decision-making is easier to inspect, because under the hood it's just decision trees).

I mostly use XGBoost, and sometimes fairly simple LSTMs. I use them primarily for financial modeling. XGBoost works well and the fast training times let me do optimization across a wide range of model parameters, without spending a bunch of money on GPUs.

If you want to do image analysis though, you do need deep learning for state-of-the-art results. Ditto reinforcement learning. Ditto several other types of problems.

So, it depends.

1