Featureless_Bug t1_j9kuu22 wrote

Large scale is somewhere north of 1-2 TB of data. And even if you had that much, in most cases tabular data has such a simple structure that you wouldn't need anywhere near that much to achieve the same performance - so frankly, I wouldn't call any kind of tabular data large scale


Featureless_Bug t1_j69xpo0 wrote

Oh, a fellow mathematician. Look, I graduated from Cambridge 6 years ago, and I could still prove the fundamental theorem of algebra analytically or with Galois theory (I still remember the general ideas of both proofs, I think), so I guess it depends on the person. But the FTA is also a much harder thing to prove than the chain rule, and you don't even need to prove it to know how to use it. And sorry, if you don't remember how to differentiate multivariable functions, you are an extraordinarily lousy mathematician. And if you do know how to differentiate multivariable functions and you are smart, you should be able to quickly come up with an implementation of backprop even if you don't remember anything else
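For what it's worth, the kind of thing I mean - backprop for a tiny two-layer net, derived from nothing but the chain rule - fits in a few lines of numpy (a minimal sketch; the shapes, ReLU, and squared loss are my own arbitrary choices):

```python
import numpy as np

# Tiny 2-layer net: x -> W1 -> ReLU -> W2 -> prediction, squared loss.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))          # batch of 4, 3 features
y = rng.normal(size=(4, 1))
W1 = rng.normal(size=(3, 5)) * 0.1
W2 = rng.normal(size=(5, 1)) * 0.1

# Forward pass
h = x @ W1                  # (4, 5)
a = np.maximum(h, 0)        # ReLU
pred = a @ W2               # (4, 1)
loss = ((pred - y) ** 2).mean()

# Backward pass: just the chain rule, layer by layer
dpred = 2 * (pred - y) / y.size    # dL/dpred
dW2 = a.T @ dpred                  # dL/dW2
da = dpred @ W2.T                  # dL/da
dh = da * (h > 0)                  # ReLU gradient
dW1 = x.T @ dh                     # dL/dW1

# Sanity check one entry of dW1 against a finite difference
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
lp = (((np.maximum(x @ W1p, 0) @ W2) - y) ** 2).mean()
assert abs((lp - loss) / eps - dW1[0, 0]) < 1e-4
```

That's the whole trick: each `d...` line is one application of the chain rule, and a finite-difference check tells you immediately whether you got it right.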


Featureless_Bug t1_j69nsjq wrote

>It's definitely an easy question if it was a common question and hence featured on leetcode, where candidates would practice it before the interview.

I mean, if it were on leetcode, it wouldn't make sense to ask it in an interview, because then you would just get prepared answers.

>Someone with 2 years of experience don't remember the knitty gritty maths to implement NN from scratch

If you cannot apply the chain rule, your math is very weak. If your math is very weak, you probably won't be a great ML engineer. It's not that you need a lot of math, but quite often you do need a broad general understanding of what can work and what can't.
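And "apply the chain rule" really is the whole ask. A made-up one-liner to show the level of difficulty (my own example functions, verified against a finite difference):

```python
import math

# Chain rule on f(g(x)) with f = sin, g = x**2:
# d/dx sin(x^2) = cos(x^2) * 2x
def grad(x):
    return math.cos(x * x) * 2 * x

# Check against a central finite difference
x, eps = 0.7, 1e-6
fd = (math.sin((x + eps) ** 2) - math.sin((x - eps) ** 2)) / (2 * eps)
assert abs(fd - grad(x)) < 1e-8
```

Backprop is nothing more than this, applied once per layer.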


Featureless_Bug t1_j69mojw wrote

I mean, it is a pretty basic question, and it takes 15 minutes at most if you understand what you are doing. It is similar to leetcode-style questions for SWEs: not something you will do on the job, but if you are smart you will pass easily, and if you are not you will struggle - which makes it a great interview task


Featureless_Bug t1_j1veefz wrote

>I think if this is going to be implemented, it has to be at model level, not as an extra layer on top. Just thinking outloud with my not so great ML knowledge, if we mark every image in training data with some special and static "noise" which is unnoticable to human eyes, all the images generated will be marked with the same "noise".

This is already wrong - it might work, or it might not.
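To spell out what the quoted scheme amounts to - a toy sketch of my own (fixed pattern, correlation detector; nothing here says such a mark would actually survive training, which is exactly the open question):

```python
import numpy as np

# Toy version of the quoted idea: add a fixed low-amplitude noise
# pattern to an image, then detect it by correlating with the known
# pattern. Whether a model trained on marked images would reproduce
# the mark is a separate, unproven claim.
rng = np.random.default_rng(42)
mark = rng.normal(size=(64, 64))           # the fixed "watermark" pattern
mark /= np.linalg.norm(mark)               # unit norm

image = rng.uniform(0, 1, size=(64, 64))   # stand-in for a real image
marked = image + 0.01 * mark               # amplitude far below visibility

def score(img):
    # Correlation with the known pattern; higher score -> mark present
    return float((img * mark).sum())

assert score(marked) > score(image)
```

Detection on the image you marked yourself is trivial, as shown; the hand-wavy part is the claim that the generator would learn to emit the same pattern.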

>So this is for running open source alternatives on your own cluster.

Well, of course the open-source models will be trained on data without any added noise - people are not stupid

>When it comes to "why would OpenAI do it", it would be nice for them to be able to track where does their generated pictures/content end up to for investors etc. This can also help them "license" the images generated with their models instead of charging per run.

Well, OpenAI won't do it, because no one wants watermarked images. If they tried to watermark their outputs, people would be even more likely to switch to open-source alternatives.


Featureless_Bug t1_j10vwsx wrote

What the hell do you want, mate? Everyone uses Python because it is easier to use Python as a front end in ML. And if you ever need to customize something heavy, you just write it in C++ or Rust and call it from Python.
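The "call it from Python" part is genuinely that easy. Here is a minimal ctypes sketch using libm's `sqrt` as a stand-in for your own heavy C/C++/Rust routine exposed with a C ABI (the library name lookup is platform-dependent; this assumes a Unix-like system):

```python
import ctypes
import ctypes.util

# Load the C math library; in practice you'd load your own .so/.dylib
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the C signature: double sqrt(double)
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

assert libm.sqrt(9.0) == 3.0
```

Swap in your compiled library and function names, and the hot loop lives in the compiled language while the experimenting stays in Python.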

If you don't think that is easier than writing everything in C++ or Rust (which is braindead, btw - any compiled language is a terrible choice for ML experimentation), then go ahead and do it - no one is stopping you.


Featureless_Bug t1_j0guhqs wrote

This is a joke, not a paper, tbh. "Therefore, for continuous activations, the neural network equivalent tree immediately becomes infinite width even for a single filter" - the person who wrote this has no idea what infinity actually means, or that a decision tree of infinite width is by definition not a decision tree anymore. And they try to sell this as something that would increase the explainability of neural networks - just wow. Is there a way to request removal of a "paper" from arXiv?