Submitted by kayhai t3_zf01qj in MachineLearning

I work in an oil refinery. Beyond my regular role, I have been working on Python-based analysis at my workplace, including machine learning. Many colleagues have sent their data to me for analysis or to create ML models, but I do not have time to process all the requests (though I’d love to).

I’m hoping to look for a no-code and low-cost method that empowers chemical/mechanical/electrical engineers (who have no Python or ML knowledge) to attempt ML studies on their data, before passing it to me for further work or to put into production.

We happen to be using Power BI for dashboarding. Is asking the engineers to use Power BI Premium Per User AutoML a good idea? Or are there better, cheaper, or easier-to-use platforms? Thanks for your advice.

Additional question: would anyone know the full list of models that are considered by Power BI’s automl? Googling doesn’t seem to give me such info.

11

Comments

rshah4 t1_izb32qr wrote

This is tough. I used to work for a large AutoML company that worked with oil and gas companies. It's difficult and often frustrating for non-ML people to use AutoML tools. To use ML you need to know how to set up your problem: what the target is, how to partition the data, and so on. It takes an understanding of ML to do this. Otherwise you will end up with people who have 20 rows of data wanting to make a prediction, trying to use ML for something a simple rule would do, or building a multilabel model where a binary model would have been better.

My suggestion is to keep them in the descriptive world, and if they want to move to ML, someone needs to introduce ML concepts to them before they start using the tools.

16

kayhai OP t1_izbxsit wrote

Makes sense, I’m also afraid we might end up with a rubbish-in-rubbish-out situation.

1

rshah4 t1_izbymy6 wrote

If you have repeatable use cases, you can build a simple app, for example with Streamlit, that applies the ML. That way you can set some boundaries on how they are using ML. Glad you get it.
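A minimal sketch of what such boundaries could look like: a pre-check that an app runs before any model is fitted. Everything here is an assumption for illustration, including the `precheck` name, the `MIN_ROWS` threshold, and the list-of-dicts input format. It uses only the standard library and is not tied to Streamlit or any particular AutoML tool.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the right number depends on the use case.
MIN_ROWS = 200


@dataclass
class PrecheckResult:
    ok: bool
    problems: list = field(default_factory=list)


def precheck(rows, target, min_rows=MIN_ROWS):
    """Reject datasets that are too small or have no usable target.

    rows:   list of dicts, one per record (assumed input format)
    target: name of the column the user wants to predict
    """
    problems = []

    # Guard against the "20 rows of data" case.
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows; need at least {min_rows}")

    # The target must exist and must actually vary.
    present = [r.get(target) for r in rows if r.get(target) is not None]
    if not present:
        problems.append(f"target column '{target}' is missing or empty")
    elif len(set(present)) < 2:
        problems.append(f"target column '{target}' has a single value")

    return PrecheckResult(ok=not problems, problems=problems)
```

An app could show `result.problems` to the engineer and only enable the "train" step when `result.ok` is true.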

2

SmorgasConfigurator t1_iz9ddkw wrote

This will always depend on how fancy you need your ML to be. But I had a similar problem some time ago where I wanted to give persons who knew very little programming something more advanced than dumb Excel charts to filter, analyze and explore their data.

I ended up with KNIME: https://www.knime.com

You can create pretty advanced data selections and analyses by "programming" logic units through an interactive interface, mostly with simple configuration. They have free versions that are pretty OK and run on your desktop, and these include basic to medium ML methods. It worked in my case, maybe it can work for you too.

4

kayhai OP t1_iz9g4t7 wrote

Thanks. I looked at their site and it looks interesting. I’d need to convince my IT to let me install it. May I ask what’s their pricing like (their website just says “contact us”)? Or is it generally alright to use the free version?

1

SmorgasConfigurator t1_iz9gu64 wrote

In the past I used the free desktop version and it was sufficient. They used to have specialized addons, which you then paid for (there are specialized chemistry nodes I know). So at first you can download the free version and see where that takes you. I honestly doubt it’s all that expensive if you want their corporate support and special addons. It’s a nice small company.

2

kayhai OP t1_iz9h3o4 wrote

Sure. I just wanted to be sure they are alright for it to be used free-of-charge, without limitations, in a commercial setting. And if we eventually need corporate-level support, I’m sure we won’t mind paying a reasonable fee.

1

space-ish t1_iz9x7qr wrote

Depends on your use case really. As a brief example, you can choose MS AI Builder in a flow and pass those results to the Power BI data model for visualization.

2

RealGrande t1_izab5f8 wrote

If you're already using Microsoft then Microsoft Azure is the answer, specifically azureml. It has decent no-code tools.

2

kayhai OP t1_izd9e4a wrote

Yes, and “management” really loves Microsoft 🫠

2

spqr54 t1_izbo3yk wrote

I work in the chemical industry as a data analyst and I use JMP every day. It is an extremely complete and efficient tool with a very active user community. The learning curve is exponential. The time saving is immeasurable compared to coding in Python. The downside is that you can't be as flexible as with code. But for industrial problems, I would say that it meets 95% of the needs.

2

Exciting-Engineer646 t1_izd6iej wrote

I hate to ask this, but how well do they understand their data? Bad data, distributions that break assumptions (heavy tails, autocorrelation, etc), missing data, and all of the rest will cause model failures even if they are able to code. If it is worthwhile for your company to use that data then they need to properly resource it.
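A rough sketch of how the three issues named above (missing data, autocorrelation, heavy tails) could be screened for a single time-ordered measurement. The function name `screen_series` and the input format are assumptions, and interpreting the numbers still takes judgment. It uses only the standard library.

```python
from statistics import fmean


def screen_series(values):
    """Basic screening statistics for one time-ordered series.

    values: list of floats, with None marking a missing reading.
    Missing readings are dropped before the other statistics are
    computed, so neighbours across a gap are treated as adjacent.
    """
    present = [v for v in values if v is not None]
    n = len(present)
    result = {
        "missing_fraction": 1 - n / len(values) if values else 1.0,
        "lag1_autocorr": None,
        "excess_kurtosis": None,
    }
    if n < 3:
        return result

    mean = fmean(present)
    dev = [v - mean for v in present]
    m2 = sum(d * d for d in dev) / n  # population variance
    if m2 == 0:
        return result  # constant series: statistics undefined

    m4 = sum(d ** 4 for d in dev) / n
    # Lag-1 autocorrelation: near 0 for independent readings,
    # near 1 for slowly drifting process data.
    result["lag1_autocorr"] = sum(a * b for a, b in zip(dev, dev[1:])) / (n * m2)
    # Excess kurtosis: about 0 for normal data, large positive for heavy tails.
    result["excess_kurtosis"] = m4 / (m2 * m2) - 3
    return result
```

A high lag-1 autocorrelation is a hint that a random train/test split will leak information between neighbouring rows, and a large excess kurtosis suggests outliers worth inspecting before any model is fitted.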

2

kayhai OP t1_izd6ldt wrote

Yes, I have sufficient domain knowledge to have meaningful conversations with the requesters.

1

Exciting-Engineer646 t1_izd70cq wrote

Not your knowledge, but theirs! If you want this to be fully code free, they probably need to own the data end as well.

I trust very few people on knowing which model their data can support. So even if you can find a code free solution, you still need to find a scalable solution for data curation and model selection.

2

kayhai OP t1_izd8t7g wrote

Oh, I get what you mean: even if we find a code-free solution, we’d still need them to at least understand the requirements on data quality. Unfortunately, not everyone does, and I am also hoping the code-free software can help with that.

Just for example, Power BI does a VERY basic screening on the data and indicates whether it is good for regression or classification and whether certain features should be excluded from the study (due to low relation).

1

GotScooped t1_iz9gkef wrote

Edge Impulse is a great no-code ML platform that has a number of tools for exploring your data and developing models. It’s specifically designed around deploying embedded ML, which might be relevant to your coworkers’ use cases. Deploying predictive models locally, close to the data source, can have some serious efficiency benefits.

1

kayhai OP t1_iz9gpql wrote

Yes, “serious efficiency benefits” summarises what I’m looking for! Trying not to end up with the Python-guy having to do a lot of heavy lifting!

2