Submitted by coinfelix t3_xszo17 in MachineLearning

I was reading this blog post on "Effective ML Teamwork at an Early Stage Startup". There is more in the post, but in short it says (quoting from the post):

  • Don't create APIs for ML, just copy&paste.
  • Test every line.
  • If you aren't sure about the design, test first.
  • Always keep your experiments reproducible (lineage, data, code, baseline).
  • Document everything. Be clear, and avoid abbreviations.
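
For the reproducibility bullet, here is a minimal sketch of what recording "lineage, data, code, baseline" could look like in practice. It is not from the blog post; the function names (`file_sha256`, `current_git_commit`, `write_manifest`) and the manifest fields are hypothetical, and it assumes the code lives in a git repository and the data is a single file.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def file_sha256(path):
    """Hash a data file so the exact dataset version can be verified later."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def current_git_commit():
    """Return the current commit hash, or "unknown" outside a git checkout."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def write_manifest(out_path, data_path, seed, baseline_metric, notes):
    """Save what is needed to rerun this experiment and compare it to the baseline."""
    manifest = {
        "code_commit": current_git_commit(),     # code
        "data_file": str(data_path),             # lineage
        "data_sha256": file_sha256(data_path),   # data
        "seed": seed,
        "baseline_metric": baseline_metric,      # baseline
        "notes": notes,                          # why the experiment was run
    }
    Path(out_path).write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling `write_manifest("run_001.json", "train.csv", seed=42, baseline_metric=0.81, notes="first try with larger model")` next to each experiment (the file names and values here are made up) leaves a small JSON file that answers "which code, which data, compared against what" months later.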

The post, especially the part about APIs, doesn't make sense to me, but I wonder what other ML professionals think about ML teamwork practices and what they do at their companies, if you don't mind discussing them here.

9

Comments

bernhard-lehner t1_iqpxkj7 wrote

"Document, document, document,...". Lol, and who the hell is going to find time to ever go back and read the documentation? I'm not sure if this post was written by somebody who actually works on the level of research and coding...

5

melgor89 t1_iqrxupz wrote

I would say that lack of documentation is one of the key issues in startups. For example, nobody knows why something was built a certain way, what the scores of the previous version were, or what the main issue with the last model was. Everything is in somebody's head, and when that researcher leaves the company, recreating the whole pipeline takes a lot of time.

So I would say proper documentation plus simple code, without unnecessary abstraction, is key to moving a startup forward.

People say that there is no time for writing documentation because there are more important tasks. From my perspective, that is short-term thinking, because you then spend 3x more time figuring out why something was done a certain way. These are just my thoughts, based on 5 years in startups and 4 in corporations.

2

rlagent32 t1_iqtw03a wrote

I don't work in a startup, but the culture at my place is very startup-like: lots of prototyping and fast iteration. In such a fast-moving environment, code is constantly evolving, so without the right documentation it can get extremely hard to understand what the code is even trying to do. Of course there's a tradeoff between actually getting work done and documenting what you're doing. But having the right level of documentation is a great investment, since it allows people to actually focus on moving forward with development.

2

trnka t1_ir7g3c0 wrote

On the API topic, my read is that the post cautions against writing too many wrappers around standard ML libraries. My experience is that folks tend to write wrappers too soon, and then they can make coding harder in the future. My rule of thumb is to not write a wrapper until you have 3 distinct production implementations of something, so that you have real information on the appropriate level of abstraction needed.

On the other topics, it depends also on your stage of startup. If you're pre-product-market-fit, you're faced with the dilemma between spending time for the long term (if your company survives) and iterating faster to ensure that your company survives. So it's a balancing act depending on your level of confidence in profitability, next round of investment, etc.

Early on, I'd expect 80% or more of research experiments to fail and be thrown away. In those cases it's mainly important to share the findings of your research with the rest of the company. Writing is ideal but tech talks work too.

For the projects that make a difference to the company, it's important to identify when they've met the bar and then dedicate some time to making the project easier to maintain and extend (whether improving the code or docs).

1