Comments

guava-bandit t1_j62mxs8 wrote

For the separate columns question: depending on how much each action, taken in isolation, matters for whether a customer buys a product or not, you might want one feature per action, each holding a flag for whether the user did it or not. This is something you'll have to think about and test out. If you do end up with a feature per action, you might want to look at some regularisation for your logistic regression parameters, as some of the actions may not be that useful in predicting a good outcome.

For the training bit (.fit()), you need to pass the fit function your prepared training dataset X in 2D format, and for the y argument you need to pass your class target data. I must say that the error you get confuses me a bit though.
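As a rough sketch of the two points above (one flag feature per action, then a regularised logistic regression fitted on a 2D X and a 1D y), assuming scikit-learn is what is being used. The action names and the data are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical action names: one binary flag column per action.
ACTIONS = ["email_opened", "sms_clicked", "bought_x"]

# Made-up data: the set of actions each customer performed.
customer_actions = [
    {"email_opened", "bought_x"},
    {"sms_clicked"},
    set(),
    {"email_opened", "sms_clicked", "bought_x"},
    {"bought_x"},
    {"email_opened"},
]
converted = [1, 0, 0, 1, 1, 0]  # made-up class targets

# X is 2D with shape (n_customers, n_actions); y is 1D with shape (n_customers,).
X = np.array([[int(a in acts) for a in ACTIONS] for acts in customer_actions])
y = np.array(converted)

# The default penalty is L2; a smaller C means stronger regularisation.
model = LogisticRegression(C=0.5)
model.fit(X, y)

# One learned weight per action flag.
print(dict(zip(ACTIONS, model.coef_[0])))
```

If .fit() still raises an error with your own data, printing X.shape and y.shape right before the call is usually the quickest way to see whether the inputs have the layout described above.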

I hope this is giving you some pointers though, and opening up the discussion to more useful input :)

3

Thanos_nap OP t1_j62yrcn wrote

Yes, this is helpful. Thank you.

So to give you an idea of the actions: it has actions from our end and customer actions (for marketing), i.e. email / SMS / etc. communication from our end, and email opened / SMS clicked by the customer.

Transaction data actions: Bought x on date and time, bought y on date and time, etc.

All of it is arranged as per the timestamp of that action.

For the .fit() part, I'm passing the data in the same manner as you mentioned, but I'm not sure why the error is still there. Will check the tutorial someone else has posted!

1

vwings t1_j63c9z7 wrote

Yes, good point. I would recommend using Keras for this modeling task. As soon as you have the data in the right data structure, you can solve this with maybe 25 lines of code ...

1

teenaxta t1_j62mz4o wrote

Customer ID is useless, so obviously it will be dropped. Now the actions he did are a bit tricky.

If the actions are discrete classes, then I think you should break up the column into subclasses and then one-hot encode the actions.
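A minimal sketch of one-hot encoding a single action column, using plain NumPy and made-up action labels (library encoders do the same job):

```python
import numpy as np

# Made-up action column: one discrete label per row.
actions = ["email_sent", "email_opened", "purchase", "email_opened"]

# Fix a column order by sorting the distinct labels.
categories = sorted(set(actions))
index = {name: i for i, name in enumerate(categories)}

# One row per observation, one column per category, a single 1 per row.
one_hot = np.zeros((len(actions), len(categories)), dtype=int)
for row, name in enumerate(actions):
    one_hot[row, index[name]] = 1

print(categories)
print(one_hot)
```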

I can't really understand why you need an LSTM here. Do you have sequence data or any sort of temporal component? If you have to use an LSTM, you can just set your sequence length to 1 and essentially use it as a plain NN. But that makes no sense, honestly. It would be much better to use something like XGBoost.

3

Thanos_nap OP t1_j62nsnj wrote

Oh yes, customer ID will be dropped; that was just for identification. As for why we need LSTM: that's because they just want it done with LSTM, because LSTM is the "new" thing here. That's all. I have explained to them that it's not really needed, but obviously top management knows better.

3

vwings t1_j63c23v wrote

Lol, LSTM for the sake of it. If there is no temporal component, then it's just the wrong model. Can you tell them that Transformers are the "new" LSTMs? Transformers handle sets (instead of sequences), so they would make a lot of sense in your application..

2

Thanos_nap OP t1_j63dyc3 wrote

There is a temporal component. These customer actions are recorded week by week. So the data is: customer ID, week number, action, converted (yes or no).

I can get this into the 3D shape with time step = week and features = actions. But I'm confused about what the batch would be here.

But yes, I agree with you that this is not the best method for my use case!

2

vwings t1_j64itph wrote

The batch dimension is the different customers. You have N customers, across T weeks and K possible actions. This should give you a sparse tensor of dimensions [N,T,K] that you can easily plug into any LSTM....
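To make the [N,T,K] layout concrete, here is a small sketch that builds such a tensor from long-format rows of (customer ID, week, action). The records, labels, and action names are hypothetical, and it only uses NumPy:

```python
import numpy as np

# Hypothetical long-format records: (customer_id, week, action).
records = [
    ("c1", 1, "email_sent"),
    ("c1", 1, "email_opened"),
    ("c1", 3, "site_visit"),
    ("c2", 2, "email_sent"),
    ("c3", 1, "sms_sent"),
    ("c3", 2, "sms_clicked"),
    ("c3", 3, "site_visit"),
]
# Hypothetical target: converted yes (1) or no (0), one value per customer.
labels = {"c1": 1, "c2": 0, "c3": 1}

customers = sorted({c for c, _, _ in records})
weeks = sorted({w for _, w, _ in records})
actions = sorted({a for _, _, a in records})

c_idx = {c: i for i, c in enumerate(customers)}
w_idx = {w: i for i, w in enumerate(weeks)}
a_idx = {a: i for i, a in enumerate(actions)}

N, T, K = len(customers), len(weeks), len(actions)

# X[n, t, k] = 1 if customer n performed action k in week t, otherwise 0.
X = np.zeros((N, T, K), dtype=np.float32)
for c, w, a in records:
    X[c_idx[c], w_idx[w], a_idx[a]] = 1.0

# One label per customer, in the same order as the first axis of X.
y = np.array([labels[c] for c in customers], dtype=np.float32)

print(X.shape, y.shape)
```

An array like this follows the (batch, timesteps, features) layout that Keras recurrent layers take as input, so the first axis (customers) is what gets split into batches during training.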

2

vwings t1_j63c3ss wrote

How do you know that the customer is male?

0

geldersekifuzuli t1_j62obhr wrote

This is a great tutorial here. https://youtu.be/ZrgVlfNduj8

He shared the code as well. This can be replicated for your own dataset pretty easily. Two days is more than enough if you just replicate this work for your own dataset.

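Reminder: GloVe itself is a word-embedding method rather than an LSTM-based model; it is typically used to provide the input embeddings for a model such as an LSTM.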

2