onzanzo t1_j2hr1kb wrote on January 1, 2023 at 11:34 AM

i'll give you a more general answer/opinion. [i reread my answer after writing it and it looks like an unconnected mess of ideas, so excuse the lack of consistency and flow]

i've done ms+phd. it is true that in industry you most likely won't get to explore novel ideas.

instead, you will focus mostly on optimizing and deploying models. yes, you won't have as much freedom. but that doesn't mean you will have none.

furthermore, you are free to choose a job that gives you more freedom (even if it may -or may not!- come with a pay cut). i'm in one of these jobs (by luck, not by design).

i use my free time to continually learn about the intricacies of deployment and training on low resource systems. there are many useful techniques that get ignored because they aren't sexy to the ML community, and my off time is when i get to learn them.

within industry, you can learn all the math you would in your phd, and then some. learning isn't as hard as bringing together many concepts to make something that is both novel and actually works. i'd say learning is 10-15% of the time in a phd; the rest is writing the paper, doing experiments, and all the people and paperwork. you don't need to be underpaid or cater to a few advisors to expand your mind. you need grit and willpower.

i'm not saying don't get a phd. despite it having a negligible impact on my skillset (i.e., i would've known what i know now anyway), it did get me a high paying job. sadly, among executives there is a glass ceiling, or at least a harder-to-go-up mentality, for non-phds.

but looking back, if i had just worked after undergrad, and kept an open mind about following research and learning something new every week or so, i would've made much more money, and would likely have more real world skills.

an example is deploying models. i alluded to this above, but it's amazing how little we think in research about fitting models into smartphones or about latency issues. actually i take it back, it's not amazing, it's by design. i think trying to do everything in research would be ill-advised. but the fact of the matter is, if you want your product to have a reasonable cost + be used, you need to think about these things. yes, quantize your models to 8 bits. oh no, you dropped 15% in accuracy, what now? you have a great transcription network (e.g., whisper), but it's too laggy or doesn't fit in your edge device's memory. solution? knowledge distillation? that's great, the only problem is that the memory requirement for distillation exceeds your hardware capabilities. will you opt for a smaller model now, fine-tune for longer, prune your models??
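
to make the 8-bit quantization point concrete, here's a minimal sketch of symmetric per-tensor int8 quantization in plain python. the function names and the example weights are made up for illustration, and real toolkits do a lot more (per-channel scales, calibration data, quantization-aware training), so treat it as a toy under those assumptions, not how any particular framework does it.

```python
# toy sketch of symmetric per-tensor int8 quantization (pure python).
# all names and values here are hypothetical, chosen only for illustration.

def quantize_int8(weights):
    """map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # one scale for the whole tensor; fall back to 1.0 for an all-zero tensor
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """recover approximate floats from the quantized integers."""
    return [v * scale for v in q]

def max_abs_error(original, restored):
    """largest per-weight error introduced by the round trip."""
    return max(abs(x - y) for x, y in zip(original, restored))

weights = [0.02, -0.5, 0.31, 1.27, -0.003]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
err = max_abs_error(weights, restored)

# float32 stores 4 bytes per weight, int8 stores 1 (plus one float for the scale)
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

print("quantized:", q)
print("scale:", scale)
print("max round-trip error:", err)
print("bytes fp32 vs int8:", fp32_bytes, int8_bytes)
```

the round-trip error per weight is at most half the scale, and tiny weights (like the last one here) can collapse to zero. whether those small per-weight errors add up to a big accuracy drop depends on the model, which is why you have to measure it rather than assume.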

none of these questions are rocket science. but in a stressful environment where every day you are burning a few thousand dollars on salaries + infrastructure alone (never mind your competitors moving past you in the market), you need to have a quick way of judging and taking action.

on the other hand, if phds weren't pushing the field with their blood, sweat, and tears, none of this would be possible. my incoherent and long overdue point is: you need to decide which side of this you want to be on. are you a trailblazer, or someone who executes and packages these great ideas?

i personally hated (may be too strong a word, let's say i was dissatisfied with) doing novel research. i wasn't smart enough to contribute great ideas to the community, nor did i have the patience. but i am really good at implementing these ideas. not the best, but good enough that i don't worry about job security. i actually find that many people in the industry lack these implementation skills. so if you think you can apply these ideas meticulously, you may want to consider industry.

a masters is fine, but honestly it's not necessary. in the 2 years you would spend there, you can expand your skillset tenfold.

good luck