
mysakbm t1_it661l7 wrote

I think that 2. is a data leak. That would explain the bump.

However, if you created the averaged vector from the train set and then a new one for the test set, then I'm wrong.


Integral_humanist OP t1_it6m96u wrote

Nope, I used the same one. I created the vectors from the users in the training data, and then reused them for the test data.

Since I'm predicting the future behavior of the same users, this isn't a problem, right?
I'm essentially using past user behavior, via value, to predict the future behavior of the same users with different content categories.
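
For reference, here is a minimal sketch of the setup described above, assuming a time-based split and per-user vectors that are plain averages of item vectors. All names (`build_user_vectors`, `lookup`, the toy events) are hypothetical and not from the actual project. The vectors are computed from training-period events only, and test rows just look them up, so nothing from the test period feeds into the features.

```python
import numpy as np

def build_user_vectors(train_events):
    """Average item vectors per user, using training-period events only."""
    sums, counts = {}, {}
    for user_id, item_vec in train_events:
        vec = np.asarray(item_vec, dtype=float)
        sums[user_id] = sums.get(user_id, 0.0) + vec
        counts[user_id] = counts.get(user_id, 0) + 1
    return {u: sums[u] / counts[u] for u in sums}

def lookup(user_vectors, user_id, dim):
    """Reuse the train-time vector; users unseen in training get zeros."""
    return user_vectors.get(user_id, np.zeros(dim))

# Hypothetical toy data: (user_id, item_vector) pairs, split by time.
train_events = [("u1", [1.0, 0.0]), ("u1", [0.0, 1.0]), ("u2", [2.0, 2.0])]
test_events = [("u1", [9.0, 9.0]), ("u3", [5.0, 5.0])]

user_vectors = build_user_vectors(train_events)

# Test features come from the lookup only; test item vectors are never averaged in.
test_features = [lookup(user_vectors, u, dim=2) for u, _ in test_events]
```

If the vectors were instead rebuilt from train and test events together, the test rows would see their own future behavior, which is the leak the parent comment is worried about.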
