Tgs91 t1_j2n5aih wrote

Are you in academia or industry? In industry, I do other work while I wait for training to complete: code cleanup, refactoring and simplifying my modules so they'll be easier to maintain, and building out modules for post-processing and for integrating the model into the end use case. If all of that is already done, I start working on another project in my team's backlog. There's always other work to do, so there's no reason to sit around waiting for a model to train.

15

hollow_sets OP t1_j2n6nmg wrote

Academia for now.
I'm a student (bachelor's), and no one wants someone with just a bachelor's, so I can't really enter the industry properly even if I want to.

1

Tgs91 t1_j2na3mm wrote

As a student, you should take the time to work on code cleanup. Usually I see students use one big training script that has a lot going on. For my projects I typically build out a pip-installable module with submodules for preprocessing and structuring raw data, model building with lots of kwargs so it can be customized, dataset objects that apply transformations or randomness and load batches efficiently, and so on. My actual training scripts are only a few lines of code: hyperparameters in all caps at the top, imports of functions from my module, and calls to those functions. My modules are also written so that employees of various skill levels can contribute to the project. Another colleague and I do all of the more advanced AI work, but any member of the team can be a USER of the module, and we have more general data scientists who can contribute to preprocessing code, containerization, post-processing tools, etc.
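To make the "few lines of code" idea concrete, here is a minimal sketch of that layout, not my actual code. The package name `myproject` and the functions `make_dataset`, `build_model`, and `train` are hypothetical placeholders, and they are defined inline only so the example runs on its own; in a real project they would live in the installable module and the script would just import them. The toy linear model is also an assumption made to keep the sketch dependency-free.

```python
"""train.py: a thin training script; the real work lives in an importable module."""
import random

# Hyperparameters in all caps at the top of the script.
SEED = 0
N_SAMPLES = 200
LEARNING_RATE = 0.1
EPOCHS = 500

# In a real project these three functions would be imported from a
# pip-installable package, e.g. `from myproject.data import make_dataset`.
# `myproject` and every function name below are hypothetical placeholders.


def make_dataset(n_samples, seed):
    """Return (x, y) pairs drawn from the line y = 2x + 1 with small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)) for x in xs]


def build_model(**kwargs):
    """Return model parameters; kwargs let callers override the defaults."""
    params = {"weight": 0.0, "bias": 0.0}
    params.update(kwargs)
    return params


def train(model, data, learning_rate, epochs):
    """Fit the linear model with full-batch gradient descent on squared error."""
    n = len(data)
    for _ in range(epochs):
        # Residuals of the current prediction for every sample.
        errors = [(model["weight"] * x + model["bias"] - y, x) for x, y in data]
        grad_w = sum(2.0 * err * x for err, x in errors) / n
        grad_b = sum(2.0 * err for err, _ in errors) / n
        model["weight"] -= learning_rate * grad_w
        model["bias"] -= learning_rate * grad_b
    return model


def main():
    """The whole 'training script': build the pieces and call them."""
    data = make_dataset(N_SAMPLES, SEED)
    model = build_model()
    return train(model, data, LEARNING_RATE, EPOCHS)


if __name__ == "__main__":
    print(main())
```

Once the three functions move into a package, everything above `main()` shrinks to the constants plus a couple of import lines, which is what keeps the script short and lets teammates use the module without reading its internals.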

Even if you don't do a full module, make a utils.py file, pull out any long pieces of code, and write them as importable functions. Use docstrings for every function, following Google's docstring style guide (or use the autoDocstring extension for VS Code, it's great). Use a linter like flake8 and a formatter like black to make sure your code looks clean and professional. This all seems like minor, tedious stuff, but if you have to go back and edit or maintain code you wrote a year ago, it's a lifesaver. It also means that in an industry environment, a coworker can step in and easily understand and edit your code. It might not make a functional difference to you right now, but good, clean, professional code is great on a resume.
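As a rough illustration of the utils.py suggestion, here is one small helper written with a Google-style docstring. The function `batch_iter` is a made-up example, not something from my codebase; the point is the shape (type hints, `Args`/`Yields`/`Raises` sections), not this particular helper.

```python
"""utils.py: small helpers pulled out of a long training script."""
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batch_iter(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches from a sequence.

    Args:
        items: The sequence to split into batches.
        batch_size: Maximum number of items per batch. Must be positive.

    Yields:
        Lists of at most ``batch_size`` items, in the original order. The
        final batch may be shorter than the others.

    Raises:
        ValueError: If ``batch_size`` is not positive.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Step through the sequence in strides of batch_size.
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

The training script can then do `from utils import batch_iter`, and running `flake8 utils.py` and `black utils.py` from the command line checks and formats the file.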

9

hollow_sets OP t1_j2ndrb9 wrote

This sounds like a good plan to do while I wait for the model to train.

I'll start tomorrow (since it's 10:30 pm and I feel like I've burnt myself out for the day fixing errors). Hope no more errors pop up while I sleep.

2