TheTwigMaster t1_j5097m6 wrote

Using open source models might be good for quickly experimenting and getting a sense of an approach's value for a particular problem. But at a company (especially a big tech company), there are many more things to consider:

  • How do I scale this to my particular dataset? It’s a bigger pain to change my data to fit a given model than to change the model to fit my data.
  • How do I integrate my company’s infrastructure/tooling/monitoring with this? Often it ends up being simpler to reimplement the model from scratch (see the sketch after this list).
  • How easy is it to experiment with adjustments to this? We rarely want to commit to a single architecture forever, so we need to be able to adjust and modify easily, and open source models don’t always accommodate that.
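
To make the monitoring point concrete, here’s a minimal PyTorch sketch. The layer sizes are arbitrary, and `emit_metric` is a stand-in for whatever metrics client your company actually runs, not a real API:

```python
import torch
import torch.nn as nn

def make_logging_hook(layer_name, emit_metric):
    """Forward hook that reports simple activation stats to company monitoring."""
    def hook(module, inputs, output):
        # emit_metric is a placeholder for the company's real metrics client
        emit_metric(f"{layer_name}/activation_mean", output.detach().mean().item())
        emit_metric(f"{layer_name}/activation_std", output.detach().std().item())
    return hook

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

# Attaching hooks like this is trivial when you own the implementation,
# and much fiddlier when the layers are buried in someone else's codebase.
for name, layer in model.named_children():
    layer.register_forward_hook(make_logging_hook(name, emit_metric=print))

_ = model(torch.randn(8, 64))  # stats are emitted on every forward pass
```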

At the risk of being flippant/dismissive: coding up a model/architecture is one of the easiest and fastest parts of the problem. So if writing the model implementation from scratch makes everything else easier, it makes sense to just do that.
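
For example, a small classifier is only a few lines of PyTorch (a minimal sketch; the dimensions are placeholders), and because you own every line, changing depth or width is a one-argument change:

```python
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    """A from-scratch model: tweaking width/depth is a one-line change."""
    def __init__(self, in_dim=64, hidden=128, n_classes=10, n_layers=2):
        super().__init__()
        layers, dim = [], in_dim
        for _ in range(n_layers):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, n_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

logits = SmallClassifier(n_layers=4)(torch.randn(8, 64))  # deeper variant, one argument
```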

12

tennismlandguitar OP t1_j51r1ao wrote

Wow, thanks for the response, that was really enlightening. I never thought about the monitoring needed to support these models.
Have you found any one of these problems to be the biggest issue in industry?
With regard to your last point, that definitely makes sense for a simple CNN or deep network, but some of the more complicated RL algorithms or transformers are difficult and time-intensive to implement. In those cases, I would suspect it's easier to use something open-sourced?
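
For instance, pulling a pretrained transformer from the open-source Hugging Face transformers library takes just a few lines (a minimal sketch; the model name is only an illustrative choice):

```python
from transformers import AutoModel, AutoTokenizer

# A few lines off the shelf, versus weeks of careful implementation
# and debugging from scratch.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("an example sentence", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```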

2