Submitted by hollow_sets t3_101a9gd in MachineLearning
I was wondering what each of you does while you wait for the model to train, because that's what I am waiting for right now (ETA: 5 days).
Really, though, I think being really disciplined about this habit is important, because it's so easy to get sucked into. It's like a little shot of dopamine every time that little number on the screen ticks upward. Makes me feel like I'm on Wall Street. Haha
Yes, exactly.
Accurate.
I train whenever the machine trains. Whenever I put on a run I do a set of squats or pushups. I see it as a regularisation method. Forces me to think a bit more before turning on a big run
Doing this for a model that has an ETA of 5 days sounds a bit too much for a workout ;-;
I mean just do a set whenever you turn on the run. If you're anything like me you turn on tons of runs where you quickly spot a mistake.
Besides that, I would also recommend exercising every day. Not 120 hours long of course, just 30 minutes to an hour. I used to have very bad slouching posture, like a lot of my colleagues. ML researchers spend a lot of time sitting behind a desk. Bouldering and going to the gym have done wonders for my posture.
Hahaha yeah I do spot mistakes as soon as I start the run
That sounds like a good idea. A set, or maybe just read a paper while it trains.
Edit: Also, I'll be going back to my university campus (still a bachelor's student) this week, so physical activity is going to go off the charts.
I am starting bouldering soon. What's your weekly routine, given that you do the gym as well?
5 days is pretty good! Some of the big models are many, many months.
Maybe you'd enjoy reading Meta's OPT-175B logbook while you're waiting...
https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf
Oh wow, that's actually pretty good.
I'll read it and also start maintaining my own logbook.
Since I am working on two research projects, this will be fun.
Play Elden Ring
And quadruple check everything
Well, well, guess what: I just got a signal kill at evaluation, hohoho. (I am evaluating the model every 1000 steps, and it takes approximately 5 hours to get through each 1000 steps.) This was the first eval check, so... fuck.
Use `%` (modulo) to run an eval check before you start training (i.e., at the 0th step). It saves a ton of debugging time, because something always goes wrong.
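Something like this, as a minimal sketch; `train_step` and `evaluate` are just placeholder stubs, not anything from the actual project:

```python
# Sketch of the "eval at step 0" pattern: evaluation fires before the first
# training step, so config/data/metric bugs surface in minutes, not hours.
EVAL_EVERY = 1000
TOTAL_STEPS = 5000


def train_step(step: int) -> None:
    pass  # one optimizer update would happen here in a real run


def evaluate(step: int) -> float:
    return 0.0  # would return validation loss/accuracy in a real run


for step in range(TOTAL_STEPS):
    if step % EVAL_EVERY == 0:  # true at step 0, before any training has happened
        print(f"step {step}: val metric = {evaluate(step)}")
    train_step(step)
```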
Yeah, thanks for the advice :D (I was going to wait like an idiot.) Fixed it now, and it seems like it is running properly.
Personally I also like to eval way more often than every 5 hours. Perhaps use a smaller eval subset every hour?
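If it helps, here's one way that could look with a PyTorch-style dataset; the dataset below is a dummy stand-in, not anything from the real project:

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Dummy stand-in for the real validation set; swap in your own Dataset.
val_dataset = TensorDataset(torch.randn(10_000, 16), torch.randint(0, 2, (10_000,)))

# Cheap eval: a fixed random ~10% slice for the frequent checks,
# keeping the full set for the occasional "real" evaluation.
gen = torch.Generator().manual_seed(0)
subset_idx = torch.randperm(len(val_dataset), generator=gen)[: len(val_dataset) // 10]
quick_val_loader = DataLoader(Subset(val_dataset, subset_idx.tolist()), batch_size=64)
full_val_loader = DataLoader(val_dataset, batch_size=64)
```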
Sounds fair enough. Current evaluation time is like 1.5 hours, so I didn't go ahead with an hourly evaluation plan.
100 crunches every time your validation error increases. You'll either have perfect abs or great models.
Holy moly, guess what, I am doing this now.
I'm going to participate in Kaggle competitions more often (I have a fear of competitions, so I never participated before), and every time I fuck up,
I increase the number of crunches by 10.
Are you in academia or industry? In industry, I do other work while I wait for training to complete: code cleanup, refactoring and simplifying my modules so they'll be easier to maintain, starting to build out modules for post-processing / integrating the model for the end use case. If all of that is already completed, I start working on another project in my team's backlog. There's always other work to do, no reason to sit around waiting for a model to train.
Academia for now
Since I'm a bachelor's student and no one wants someone with just a bachelor's, I can't really enter the industry properly even if I want to.
As a student, you should take the time to work on code cleanup. Usually I see students use one big training script that has a lot going on. For my projects I typically build out a pip-installable module with submodules for preprocessing/structuring raw data, model building with lots of kwargs so it can be customized, dataset objects with transformations/randomness for efficient batch loading, etc. My actual training scripts are only a few lines of code: hyperparams in all caps at the top, import functions from my module, and call the functions. And my modules are written in a way that employees of various skill levels can contribute to the project. Myself and another colleague do all of the more advanced AI work, but any member of the team can be a USER of the module, and we have more general data scientists who can contribute to preprocessing code, containerization, post-processing tools, etc.
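As a rough illustration of what that split can look like (the `myproject` package and every function below are made up for the example, not a real library):

```python
# Hypothetical thin training script: all the heavy lifting lives in an
# installable package ("myproject" is an invented name); this file only
# wires the pieces together.
from myproject.data import build_dataloaders  # made-up module/function names
from myproject.models import build_model
from myproject.training import train

# Hyperparameters in all caps at the top, easy to spot and tweak.
LEARNING_RATE = 3e-4
BATCH_SIZE = 32
NUM_EPOCHS = 10

train_loader, val_loader = build_dataloaders(batch_size=BATCH_SIZE)
model = build_model(num_classes=10)
train(model, train_loader, val_loader, lr=LEARNING_RATE, epochs=NUM_EPOCHS)
```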
Even if you don't do a full module, make a utils.py file to pull out any long pieces of code and write them as importable functions. Use docstrings for every function, following Google's docstring style guide (or use the autoDocstring extension in VS Code, it's great). Use a linter like flake8 and a formatter like black to make sure your code looks clean and professional. This all seems like minor, tedious stuff, but if you have to go back and edit/maintain code you wrote a year ago, it's a lifesaver. And it also means that in an industry environment, another coworker can step in and easily understand and edit your code. It might not make a functional difference to you right now, but good, clean, professional code is great on a resume.
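For instance, a hypothetical utils.py helper documented in the Google style might look like this:

```python
# utils.py -- example of pulling a reusable helper out of a training script
# and documenting it with a Google-style docstring (the function is illustrative).
from typing import List


def moving_average(values: List[float], window: int = 10) -> List[float]:
    """Smooth a metric curve with a simple trailing moving average.

    Args:
        values: Raw per-step metric values, e.g. training loss.
        window: Number of trailing steps to average over.

    Returns:
        A list the same length as ``values`` containing the smoothed values.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(sum(values[start : i + 1]) / (i - start + 1))
    return smoothed
```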
This sounds like a good plan to do while I wait for the model to train.
I'll start tomorrow (since it's 10:30 pm and I feel like I've burnt myself out for the day fixing the errors). Hope no more errors pop up while I sleep.
I do the dishes, change batteries in things that need them and waste time on Reddit.
I start writing the paper...
I would have as well, but currently I have no clue what to do.
Right now I am training the model (efficient-video-recognition) just to see whether it's resource-friendly enough for our servers or not,
so there's no clarity on which direction I have to move in.
You can always use a smaller dataset and scale down the model to make sure everything works, and then train the whole thing; at least that's what I do... Generally, waiting a week just to see if the model works is very time-consuming...
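A rough sketch of that kind of scaled-down smoke test, with a tiny dummy model and a few hundred random samples (nothing here is the actual video model, it's purely illustrative):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Smoke test: tiny model + a small random dataset, just to confirm the
# pipeline runs end to end and the loss moves before committing to 5 days.
data = TensorDataset(torch.randn(256, 32), torch.randint(0, 4, (256,)))
loader = DataLoader(data, batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):  # a couple of epochs is enough to catch most bugs
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.3f}")
```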
I continue working on that long backlog of things I'd like to implement:
I try to get to sleep, but I can't,
'cause each time I wake up, the loss has gone to NaN.
I try to hang out with friends, but
every 5 minutes I keep refreshing Weights & Biases, and my friends never hang out with me anymore.
I try to play online games, but
each time training goes OOM, I just close the game and go back to tuning my hyperparameters.
Now I just pray anxiously while scrolling online stores for a second-hand 3090. And I look more or less like Gollum.
Gah, never had friends in the first place for them to be able to leave (joking).
But yeah, gotta tune it I guess; the loss went to NaN and I just woke up ;-;
Update Jira tickets, documentation, YouTube.
I go running, if I manage not to look at the error charts in TensorBoard.
Hahahaha, I use Weights & Biases for this. It just shows me the error rates and accuracy in a graph, and whether the program is running or not.
Other than that, I just stay in the illusion that everything is fine.
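For anyone curious, the Weights & Biases logging behind those graphs is only a couple of calls; here's a tiny sketch with a made-up project name and fake metric values:

```python
import random

import wandb

# Minimal Weights & Biases sketch: one run, metrics logged per step,
# which is what draws the loss/accuracy curves in the dashboard.
wandb.init(project="video-recognition-demo")  # example project name

for step in range(100):
    wandb.log({"train/loss": 1.0 / (step + 1), "train/acc": random.random()})

wandb.finish()
```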
Piña colada & a book; it goes great with the tropical temperature in the office during training. It makes a Canadian January feel like the Bahamas in August.
Read training data
This is why having multiple projects is good. Just work on other coding or writing up while you wait.
I go eat food.
Touch grass.
Can you parallelize runs?
Parallelize as in? Multiple models running together?
Multiple trainings of different models, or the same model with different training parameters/hyperparameters.
I.e., if you have a cloud environment with a number of processing units available on demand.
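A minimal sketch of the simplest version of that, launching one process per hyperparameter setting; `train.py` and its flags are hypothetical stand-ins for whatever script you already run:

```python
import subprocess

# Launch the same (hypothetical) training script several times in parallel,
# once per learning rate. In a real setup each process would also be pinned
# to its own GPU or cloud instance.
learning_rates = [1e-3, 3e-4, 1e-4]

procs = [
    subprocess.Popen(["python", "train.py", "--lr", str(lr), "--run-name", f"lr_{lr}"])
    for lr in learning_rates
]
for p in procs:
    p.wait()  # block until every run has finished
```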
Yep, I can.
You mean computing-wise? Well, I guess I work on my procedural generation engine, listen to music, hunt for software vulns with SMT solvers, and such. Or go do an IRL activity/hobby.
Ohh, that sounds fun.
I'll try working on some full-stack development as well, now that I have free time.
Yeah, it's mostly fun, even more so lately with ChatGPT as a coding assistant on the side. What will you be working on, more specifically? :)
I'm working on video recognition. Just started the project though, so no direction as of now.
Netflix and chill
/s
I always retrained my models on Fridays so that I'd lose less time. Then I found that I always had enough administrative-type tasks, meetings I'd put off, or training I'd like to do (training that I totally wouldn't take advantage of if it weren't for the downtime) to get me to the end of the model training.
Just another thought: have you considered creating content for Reddit or LinkedIn? Obviously you created this post, but I mean educating others about what you're working on. My network is a huge piece of my career now; if I'm looking for a job, I don't have to apply anywhere because I get a lot of inbound opportunities. It's tangential to work and will help you in your career (as long as you're not in finance or some other industry where people don't talk about their work).
I do post a lot of random stuff on Twitter, and sometimes LinkedIn as well.
I am considering making content on what I am working on (at least on streams I do show what I'm doing to the few viewers I get).
I might consider posting on Reddit too, but not explaining what I'm doing on LinkedIn (since there are some real assholes at my university and I don't think I want them to know what I'm working on).
(Also, would it be possible for you to send some opportunities to apply to? My career path is definitely going to be in the ML development industry, and later on academia.)
Sure. If I see another MLE role I can share it... they always ask if I know anyone.
Aye, that's going to help me out a lot.
Also, which subreddits would you recommend for posting about what I am doing? I guess writing blogs on Medium is one way. (Also, I started maintaining a logbook for my ongoing project, though it's still local.) On LinkedIn it's more like just normal text posts, which I don't think is possible to do on Reddit.
Everyone is being sarcastic here. In reality, we all pray while checking the loss every 20 seconds.