Submitted by spruce5637 t3_yyglj9 in MachineLearning

I was introduced to the AllenNLP framework a couple of months before the developers announced that they would no longer update or maintain it. I have used it for some of my projects, one of which is still in development, but now that AllenNLP is becoming obsolete, it feels like a bad idea to keep using the framework.

(Not only because there will be loads of new stuff that the framework won't support but also because publishing code using an outdated framework will only lower the chance of other people using my code in the future.)

My hope is to move my current project out of AllenNLP, but it's such a huge pain in the back to migrate from one framework to another. I'm wondering if anyone has done something similar before: how to keep producing new results for your project during the migration, how to ensure that everything is reproduced, and how to not want to pull out all your hair in frustration while doing it?

Also, any recommended frameworks that are less likely to die within a few months?

13

Comments

Randomramman t1_iwuj1te wrote

Huggingface transformers library is awesome if you want access to TONS of pre-trained, open-source LLMs for fine-tuning. Not sure if it would suit your needs though.

11

spruce5637 OP t1_iwuwi53 wrote

Ah of course, I've been using Huggingface here and there; I think AllenNLP uses some of it in their code too. I guess I should have stuck to using 🤗 and nothing else to begin with lol

The main problem really is how to disentangle my model from AllenNLP without breaking it.

3

new_name_who_dis_ t1_ix1607r wrote

Disentangling your model will involve reading AllenNLP’s source code and taking from it what you need. I don’t think there’s an easier way of doing it

2

ZestyData t1_iwvmkg6 wrote

This is an infamous Software Engineering smell/tech-debt issue. Unfortunately, you hitched your wagon to the wrong horse - and this is why framework choice is a hugely important matter in all Engineering teams. Doing your research before committing to structuring entire projects around a framework may seem like a waste of time, but it can prevent major headaches like this.

In reality, there is no shortcut to rewriting / disentangling your project from AllenNLP and moving it to more conventional NLP libraries (Huggingface, Spacy), which are all but guaranteed to be safe for long term use.

It's now a matter of convincing Product / the business that you need to commit a lot of time to an Epic that won't deliver direct value.

>how to ensure that everything is reproduced

Unit tests, integration tests, logging & monitoring.

And of course comparing ML metrics for each model that was produced in AllenNLP versus the replacement models produced with Spacy/Huggingface/native-pytorch.
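
A minimal sketch of that metric comparison, assuming each training run leaves you with a plain dict of metric names to floats. The function name `compare_metrics`, the 1% relative tolerance, and the numbers below are all made up for illustration; pick a tolerance that matches the run-to-run noise you see with different seeds.

```python
import math

def compare_metrics(old, new, rel_tol=0.01):
    """Return {metric: (old_value, new_value)} for metrics that disagree.

    A metric missing from one side is reported with None for that side.
    """
    mismatches = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a is None or b is None or not math.isclose(a, b, rel_tol=rel_tol):
            mismatches[name] = (a, b)
    return mismatches

# Illustrative values only, not results from any real model.
allennlp_run = {"accuracy": 0.912, "f1": 0.874}
migrated_run = {"accuracy": 0.910, "f1": 0.801}

print(compare_metrics(allennlp_run, migrated_run))
```

Here accuracy falls inside the tolerance and only `f1` is flagged. Wrapping a call like this in a unit test gives you a cheap regression check to rerun after every migrated component.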

> and how to not want to pull out all your hair in frustration while doing it

Sorry mate, but there's no avoiding this one. Good luck.

7

spruce5637 OP t1_iwyi7hv wrote

Thanks a lot for your comment! I only had a vague idea and little experience on how to do all this, so your advice really helped me lay things out and start making a concrete plan. Will be doing tests and comparing metrics for sure, and let's hope my supervisor will accept this as a legit way of spending my time...

1

killver t1_iwuucud wrote

What exactly do you want to migrate? If you have models in production I am sure you can keep them in production. And for training you can switch fresh to new frameworks like huggingface.

1

spruce5637 OP t1_iwuzdlt wrote

I have a project in development that's using AllenNLP and I hope to move it out of the framework. My main concern is ensuring everything works like before when I switch over (e.g. the tokenizer, the encoder, the whole data "pipeline")

(Edit: I'm also not sure if I should dig into their source code and compare it with Huggingface to ensure everything works as before under the hood, since reproducibility is really important and all)
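
One way to check this without reading every line of source is to run both tokenizers over the same sample texts and look only at the cases where they disagree. The sketch below assumes you can wrap each tokenizer as a plain function from a string to a list of token strings; `find_token_mismatches` and the two stand-in tokenizers are hypothetical and would be replaced by thin wrappers around the real AllenNLP and Huggingface objects.

```python
from typing import Callable, Iterable, List, Tuple

Tokenizer = Callable[[str], List[str]]

def find_token_mismatches(
    old_tokenize: Tokenizer,
    new_tokenize: Tokenizer,
    texts: Iterable[str],
) -> List[Tuple[str, List[str], List[str]]]:
    """Return (text, old_tokens, new_tokens) for every text where the two differ."""
    mismatches = []
    for text in texts:
        old_tokens = old_tokenize(text)
        new_tokens = new_tokenize(text)
        if old_tokens != new_tokens:
            mismatches.append((text, old_tokens, new_tokens))
    return mismatches

# Stand-in tokenizers for illustration; swap in wrappers around the real ones.
def old_tokenize(text: str) -> List[str]:
    return text.split()

def new_tokenize(text: str) -> List[str]:
    return text.lower().split()

samples = ["hello world", "Hello World"]
for text, old, new in find_token_mismatches(old_tokenize, new_tokenize, samples):
    print(f"{text!r}: {old} != {new}")
```

The same pattern works for the encoder if you compare token IDs or tensors instead of strings. An empty result on a large, varied sample is decent evidence that the two behave the same; a non-empty one tells you exactly which part of the source to read.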

2

killver t1_iwv0pg0 wrote

You didn't really answer my question what parts of your pipeline you want to try to move. But in general AllenNLP is for quite some time now already irrelevant in the space, I'd suggest to move to Huggingface asap.

2

spruce5637 OP t1_iwyh6fe wrote

>You didn't really answer my question what parts of your pipeline you want to try to move.

...almost the whole pipeline? Reading in examples, batching them, tokenization, encoding them into tensors, training, saving, loading for prediction are all built under the framework.
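
With that many stages, one option is to freeze the old pipeline's intermediate outputs on a small fixed set of examples, then migrate one stage at a time and check against the frozen file. A rough sketch, assuming each stage's output can be turned into JSON-serializable lists (tensors would need something like `.tolist()` first); the stage names, file name, and helper functions here are hypothetical.

```python
import json
import tempfile
from pathlib import Path

_MISSING = object()

def save_snapshot(path, stage_outputs):
    """Write the old pipeline's per-stage outputs to a JSON file."""
    Path(path).write_text(json.dumps(stage_outputs, indent=2, sort_keys=True))

def changed_stages(path, stage_outputs):
    """Return the sorted names of stages whose output differs from the snapshot."""
    expected = json.loads(Path(path).read_text())
    # Round-trip through JSON so tuples vs. lists do not cause false alarms.
    actual = json.loads(json.dumps(stage_outputs))
    names = sorted(set(expected) | set(actual))
    return [n for n in names if expected.get(n, _MISSING) != actual.get(n, _MISSING)]

# Toy outputs for illustration: the migrated pipeline gets one token ID wrong.
old_outputs = {"tokens": [["a", "b"]], "token_ids": [[1, 2]]}
new_outputs = {"tokens": [("a", "b")], "token_ids": [[1, 3]]}

with tempfile.TemporaryDirectory() as tmp:
    snapshot_path = Path(tmp) / "old_pipeline.json"
    save_snapshot(snapshot_path, old_outputs)
    print(changed_stages(snapshot_path, new_outputs))
```

Only `token_ids` is reported, so the tokenization stage can be considered done and the encoding stage is where to look next. Exact equality suits tokens and IDs; for floating-point outputs such as logits, a tolerance-based comparison would be the safer choice.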

>But in general AllenNLP is for quite some time now already irrelevant in the space, I'd suggest to move to Huggingface asap.

Yeah that's the vibe I'm getting, hence the post. Thanks for your suggestions though!

1