Comments


junetwentyfirst2020 t1_j4z0oyt wrote

🤫 they do. But there tend to be licensing issues, so they often don’t.

51

tennismlandguitar OP t1_j51prk0 wrote

I suppose it's just the teams I've been on, then!

Do you see mostly research teams use these? Or have you also seen software teams use some ML engineers to integrate these open-source models into their products? (Where licensing is not an issue)

3

junetwentyfirst2020 t1_j51r600 wrote

I refuse to answer on the grounds that I may perjure myself.

3

tennismlandguitar OP t1_j51sbtt wrote

HAHA no worries, sent you a DM about this stuff in general, answer with what you're comfortable with!

1

LcuBeatsWorking t1_j4znni1 wrote

One problem with many open-source models is that they are badly documented (there was a big discussion last year about how many models used in scientific papers couldn't be replicated).

So reverse-engineering them is often harder than building your own from scratch.

15

tennismlandguitar OP t1_j51q970 wrote

Totally agree! I've found this problem to be a big issue myself, so I assumed that was the main issue. Sent you a DM to talk a bit more about this :)

−1

TheTwigMaster t1_j5097m6 wrote

Using open-source models might be good for quickly experimenting and getting a sense of the value of an approach for a particular problem. But at a company (especially a big tech company), there are many more things to consider:

  • How do I scale this to my particular dataset? It’s a bigger pain to change my data to fit a given model than to change the model to fit my data
  • How can I integrate my company’s infrastructure/tooling/monitoring with this? Often it ends up being simpler to revisit the implementation from scratch
  • How easy is it to experiment with adjustments to this? Often we don’t want to pick a single architecture forever, so we want to be able to adjust and modify easily. Open source models may not always accommodate this.

At the risk of being flippant/dismissive: coding up a model/architecture is one of the easiest and fastest parts of the problem. So if you can make other things easier by implementing the model from scratch, it makes sense to just do that.
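
The "easy to adjust and modify" point above can be sketched in a few lines: describe the architecture as plain data, so experiments only touch a config rather than someone else's model code. This is a minimal illustrative sketch, not any particular framework's API; all names here are hypothetical.

```python
# Minimal sketch: the architecture is a config dict, so swapping
# model variants is a one-line change instead of forking a codebase.
# (All names and structures here are illustrative.)

def build_model(config):
    """Return a list of (layer_name, (in_width, out_width)) pairs."""
    layers = []
    width = config["input_dim"]
    for i, hidden in enumerate(config["hidden_dims"]):
        layers.append((f"dense_{i}", (width, hidden)))
        width = hidden
    layers.append(("output", (width, config["output_dim"])))
    return layers

# Trying a different architecture is just a different config:
small = {"input_dim": 16, "hidden_dims": [32], "output_dim": 2}
deep = {"input_dim": 16, "hidden_dims": [64, 64, 64], "output_dim": 2}
```

Here `build_model(small)` yields 2 layers and `build_model(deep)` yields 4, without touching the builder itself.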

12

tennismlandguitar OP t1_j51r1ao wrote

Wow, thanks for the response, that was really enlightening; I never thought about the monitoring needed to support these models.
Have you noticed one of these problems being the biggest issue in industry?
With regard to your last point, that definitely makes sense in the case of a simple CNN or deep network, but sometimes there are more complicated RL algorithms or transformers that are difficult and time-intensive to implement. In those cases, I would suspect it would be easier to use something open-sourced?

2

[deleted] t1_j50fd7o wrote

They do, I'm not too sure what you're talking about.

And on the flipside, businesses do need to distinguish themselves or create value. So unless you are using the open-source model in the context of an application, what good is it if anyone can run it?

9

tennismlandguitar OP t1_j51rizd wrote

Haha I guess I just haven't seen it in my experience.

I think for research scientists, it becomes far easier to implement improvements to the existing SOTA models if they don't have to try implementing them from scratch.

For MLEs, it definitely makes sense that it needs to be in the context of an application. In your experience, is that hurdle enough to drive people away from trying?

1

Omnes_mundum_facimus t1_j52i6fr wrote

  1) Because lawyers, and 2) because performance on academic data sets doesn't translate into good performance on whatever domain-specific problem we might be having.

6

tennismlandguitar OP t1_j52zysi wrote

What about fine-tuning those models to make sure the performance is satisfactory?

1

Omnes_mundum_facimus t1_j555aa2 wrote

The short but mostly true version of the conversation we had with legal:

  • engineer: so this model was actually developed by our biggest competitor
  • lawyer: wtf?????
  • engineer: And we used a pretrained checkpoint, again from even bigger competitor
  • lawyer: wtf??
  • engineer: All cool, it was trained on this ImageNet thing
  • lawyer: And who owns this ImageNet thing?
  • engineer: ????
  • lawyer: And did everybody in this ImageNet thing consent to his or her picture being used?
  • engineer: ????
  • lawyer: what the actual f?????
  • engineer: So I guess we are using our own model trained from our own data then.

8

terath t1_j50rz6q wrote

They probably do use open-source architectures and maybe code, but they often train their own model on their own data. This is both because the research training sets don't match whatever domain a company needs to cover, and because many research data set licenses forbid commercial use.
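
That "reuse the public architecture, retrain on your own data" pattern can be sketched with a toy example: the model form (here, a from-scratch logistic regression) is textbook knowledge, but the weights come entirely from your own data, so no pretrained-checkpoint or training-set licensing questions arise. The data below is synthetic and purely illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=200, lr=0.1):
    """Plain SGD on log loss. data: list of (x, label) with scalar x."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(w * x + b)
            grad = p - y  # dLoss/dlogit for log loss
            w -= lr * grad * x
            b -= lr * grad
    return w, b

random.seed(0)
# "Your own data": positives cluster above zero, negatives below.
data = [(random.uniform(0.5, 2.0), 1) for _ in range(50)]
data += [(random.uniform(-2.0, -0.5), 0) for _ in range(50)]
w, b = train(data)
accuracy = sum((sigmoid(w * x + b) > 0.5) == y for x, y in data) / len(data)
```

Same well-known architecture, but everything the lawyers care about (the data and the learned weights) is in-house.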

4

z_fi t1_j4zy4dq wrote

Restrictive licensing and limited usefulness to industry problems.

3

tennismlandguitar OP t1_j51r6gb wrote

Definitely agree with the first point here. Could you expand a bit more on the second? Why is the usefulness still limited, given transfer learning and fine-tuning today?

1

Direct_Ad_7772 t1_j59xhgh wrote

Dear academics, please add an MIT license to your GitHub code. If it has no license, or some non-commercial license, we company folks can't use your code.

3

PredictorX1 t1_j4zmy54 wrote

Can you give some examples of problems that an organization would solve with open source models?

1

tennismlandguitar OP t1_j51rtwb wrote

Definitely! An example could be the use of https://github.com/AI4Finance-Foundation/FinRL in quant-firms and fintech.

1

AmalgamDragon t1_j56uj5c wrote

I recently started using RL in my personal work on automated futures trading. After reviewing the libraries available in the RL space, I did try the one you linked to. Some of the samples were broken. While I did tweak the code to get the samples working, I found it more straightforward to get up and running with PPO from stable-baselines3.

2