pornthrowaway42069l

pornthrowaway42069l t1_jdn6noe wrote

Not going to deny that GPT-4 looks impressive, but they could set up 10 bajillion-quadrillion parameters; the question is, do they have the data to effectively utilize all of them? Maybe it's time to start looking into decreasing the number of parameters and making more efficient use of the data.

4

pornthrowaway42069l t1_iy8srkr wrote

Ah, I see. During training, the loss and metrics you see are actually moving averages, not the exact losses/metrics for that epoch. I can't find the documentation rn, but I know I've seen it before. What this means is that losses/metrics during training aren't a "good" gauge to compare with, since they include information from previous epochs.
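
To make the point above concrete, here is a minimal sketch in plain Python (not any framework's actual code). The per-batch losses are made up, and the averaging scheme is an assumption: a simple running mean over the steps seen so far. Whether and when a real framework resets that average is something to check in its docs.

```python
# Hypothetical per-batch losses; the model is assumed to improve as it trains.
batch_losses = [1.0, 0.8, 0.6, 0.4, 0.2]

def running_means(values):
    """Return the mean of all values seen so far, after each value."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means

reported = running_means(batch_losses)

# The number a progress bar would show is the running mean, which lags
# behind the loss the model actually achieves on the most recent batch.
print(f"reported (running mean): {reported[-1]:.2f}")
print(f"last batch loss:         {batch_losses[-1]:.2f}")
```

The reported value stays above the latest batch loss because it still carries the early, worse steps, which is why it is a poor number to compare against a validation loss computed in one pass.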

1

pornthrowaway42069l t1_ixm7q3s wrote

You can specify several losses, or have multi-output with a single loss - in both cases Keras will combine them into a single value (I think it's an unweighted sum by default, and you can specify the weights, but I don't remember 100%).

You can't really have 3 different loss values for a single network - otherwise it won't know which one to backpropagate. The best you can do is write a custom loss function and mix them in a way that makes sense for your problem (you will still need to provide a single value at the end), or provide the weights (you'd need to look up the API docs for that).
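
A rough sketch of the "mix them into one value" idea, written in plain Python rather than Keras so it runs anywhere. The function names, the choice of MSE and MAE as the terms, and the 0.7/0.3 weights are all hypothetical; the only point is that several loss terms get reduced to one scalar.

```python
def mse(y_true, y_pred):
    """Mean squared error over paired values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error over paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def combined_loss(y_true, y_pred, weights=(0.7, 0.3)):
    """Mix two loss terms into the single scalar an optimizer can minimize."""
    w_mse, w_mae = weights  # illustrative weights, tune for your problem
    return w_mse * mse(y_true, y_pred) + w_mae * mae(y_true, y_pred)

y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(combined_loss(y_true, y_pred))
```

In Keras itself, the equivalent knob for built-in losses is the `loss_weights` argument to `Model.compile`; a custom loss would do the same mixing with tensor ops instead of Python floats.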

1

pornthrowaway42069l t1_itsbufj wrote

I'd try some baseline/simpler models on the same data and see how they perform. Maybe the model just can't do any better; that's always a good one to check before panicking.
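
As an example of about the simplest baseline there is, here is a sketch that always predicts the most common training label. It assumes a classification problem, and the labels are made up for illustration.

```python
from collections import Counter

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == majority_label)
    return hits / len(test_labels)

# Hypothetical binary labels, imbalanced toward class 0.
train_labels = [0, 0, 0, 1, 0, 1, 0, 0]
test_labels = [0, 1, 0, 0]

baseline = majority_baseline_accuracy(train_labels, test_labels)
print(f"baseline accuracy: {baseline:.2f}")
```

If the network's accuracy is not clearly above this number, the problem is more likely in the data or the setup than in the fine details of the architecture.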

You can also try to use K-means or DBSCAN or something like that, and try to get 2 clusters of results - see if those algos can segment your data better than your network. If so, maybe the network is set up incorrectly somehow; if not, maybe something funky is happening to your data in the pipeline.
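
A small sketch of the clustering check, using a hand-rolled 2-means in plain Python so it has no dependencies. In practice you would more likely reach for `KMeans` or `DBSCAN` from scikit-learn. The points, the labels, and the first/last-point initialization are all assumptions for illustration.

```python
def two_means(points, iterations=20):
    """Tiny k-means with k=2 on 2-D points; returns a cluster id per point."""
    # Deterministic start: use the first and last points as initial centers.
    centers = [points[0], points[-1]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        assignment = [
            min((0, 1), key=lambda c: (p[0] - centers[c][0]) ** 2
                                      + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return assignment

def agreement(clusters, labels):
    """Fraction of points where cluster id matches label, up to swapping ids."""
    same = sum(1 for c, y in zip(clusters, labels) if c == y) / len(labels)
    return max(same, 1.0 - same)

# Hypothetical, well-separated data with known binary labels.
points = [(0.1, 0.2), (0.0, 0.4), (0.3, 0.1), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
labels = [0, 0, 0, 1, 1, 1]

clusters = two_means(points)
print(f"cluster/label agreement: {agreement(clusters, labels):.2f}")
```

Comparing this agreement score with the network's accuracy on the same data gives a quick read on whether an unsupervised split already separates the classes better than the trained model does.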

2

pornthrowaway42069l t1_issyo30 wrote

I had a similar experience at some big companies.

Bombed the leetcode, but found an opportunity to showcase my (fairly cool) project code during the technical interview. When I asked the guy questions, he confused feature importance with feature selection, couldn't answer about a baseline model (they had a black box without one), and a bunch of other things. When I said "I kind of prepared for pandas + SQL more", he said "We expect you to know those things". I guess they expect me to know how to use pandas and SQL, but not Python for crappy leetcode questions.

The truth is, most companies/ML departments have no idea what they want or should be doing. Good luck to that head of the ML team, because I was glad I wasn't selected; with such great interview and ML skills, it's a bullet dodged.

89

pornthrowaway42069l t1_irh69qh wrote

It's the difference between "Presented Paper X at Y event" and "Published highly reviewed paper".

It a) gives you networking opportunities, b) shows you take your stuff seriously, and c) shows that you have oral skills, which means you can, like, use words in a not totally idiotic manner to convey your thoughts, which is always a plus in a technical field.

1

pornthrowaway42069l t1_irgt36a wrote

On one hand, a big presentation like this will do wonders for the resume.

On the other hand, you will spend money and potentially do worse on a final.

IMO, if money is no concern, give the speech; I personally feel it would be worth more than a course later in life. If money is tight, or you are not sure, then it's alright to stay home - you already have a flex on you (highly rated paper), so it's not the end of the world if you don't do it.

9