Comments


No_Dust_9578 t1_j8vv85i wrote

A few things. Don't make a model from scratch, use a pre-trained one. There are plenty on Hugging Face. Another thing: later on, if you have your own data, you can use it to fine-tune those models to better suit your task. This is a general approach to ML applications where data isn't available or there isn't enough of it. Side note, speaking from experience: those large sentiment models that are out there do have great performance, but some of them have been trained on large sentiment datasets that have inconsistencies. For instance, I once had to manually validate the performance on my data and noticed that the pre-trained models predicted the following sentence as POSITIVE sentiment, but to a human this is not positive: "oh yay, I love cold food...". So be careful and set up some sanity checks. Don't assume the predictions are accurate.
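A minimal sketch of the kind of sanity check described above: compare a model's predictions with a small hand-labeled sample and list the disagreements. The `naive_keyword_model` function is a hypothetical stand-in for a real pre-trained model, and the second and third example sentences and their labels are made up for illustration.

```python
def sanity_check(predict, labeled_examples):
    """Return (accuracy, disagreements) of `predict` against human labels."""
    disagreements = []
    for text, human_label in labeled_examples:
        predicted = predict(text)
        if predicted != human_label:
            disagreements.append((text, human_label, predicted))
    accuracy = 1 - len(disagreements) / len(labeled_examples)
    return accuracy, disagreements


def naive_keyword_model(text):
    # Hypothetical stand-in for a pre-trained sentiment model.
    # It only looks for positive words, so it cannot detect sarcasm.
    positive_words = {"love", "yay", "great"}
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    return "POSITIVE" if words & positive_words else "NEGATIVE"


# Hand-labeled sample; in practice, label a few hundred rows of your own data.
hand_labeled = [
    ("oh yay, I love cold food...", "NEGATIVE"),     # sarcasm
    ("The earnings report was great.", "POSITIVE"),  # illustrative
    ("Shares fell after the recall.", "NEGATIVE"),   # illustrative
]

accuracy, disagreements = sanity_check(naive_keyword_model, hand_labeled)
for text, human_label, predicted in disagreements:
    print(f"human={human_label} model={predicted}: {text}")
```

To check a real model, pass any function that maps a string to a label in place of `naive_keyword_model`.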

5

justundertheblack OP t1_j8vvwgw wrote

Btw, this is a school project, so we have to train our own model, and we have the dataset for it too. Do you know any good ones?

1

bubudumbdumb t1_j8wpmih wrote

Do you know how to validate a pricing signal, and how to do backtesting and portfolio optimization? The NLP/ML part might be the easy one.

1

justundertheblack OP t1_j8wpq8p wrote

Naah, I don't. Can you point me towards some resources?

1

bubudumbdumb t1_j8wtq42 wrote

https://www.investopedia.com/terms/b/backtesting.asp

https://en.m.wikipedia.org/wiki/Modern_portfolio_theory

In extreme summary:

Markets are not stationary environments, so you have to expect and mitigate drift. This has implications for the evaluation methodology and for the choice of time series models, favoring ones that can be calibrated with fewer data points.
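One common way to account for drift in evaluation is a walk-forward (rolling window) split, where the model is always fitted on a recent window and tested on the period right after it. This is only a sketch: the function name and the window sizes are assumptions for illustration, not something prescribed above.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each test window starts right after its training window, so the model
    is never evaluated on data that precedes what it was fitted on.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # slide the window forward by one test period


# Example: 10 time-ordered observations, fit on 4, test on the next 2.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train:", train, "test:", test)
```

A short, fixed training window lets the model forget old regimes, which is also why models that can be calibrated on few data points are convenient here.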

A strategy to make money in the markets allocates capital across multiple financial instruments using multiple signals. The value of a signal is therefore the predictive advantage it provides when stacked on top of other commonly used signals. If the predictive capability of the news sentiment is easily replicated by a linear combination of cheaply available signals, then it's not worth much.
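A rough way to measure that predictive advantage is to fit a linear model on the cheap signals alone, fit it again with the sentiment signal stacked on top, and compare out-of-sample R^2. The sketch below uses plain least squares and synthetic data; the "momentum" baseline, the coefficients, and the 200/100 chronological split are all assumptions made up for illustration.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("signals are collinear; drop one and retry")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_ols(features, target):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(row) for row in features]
    width = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(width)]
           for i in range(width)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(width)]
    return solve_linear_system(xtx, xty)


def r_squared(coefs, features, target):
    predicted = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))
                 for row in features]
    mean = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, predicted))
    ss_tot = sum((y - mean) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


def incremental_r2(baseline, signal, returns, n_train):
    """Out-of-sample R^2 gained by stacking `signal` on the baseline signals."""
    stacked = [list(row) + [s] for row, s in zip(baseline, signal)]
    scores = []
    for features in (baseline, stacked):
        # Fit on the first n_train rows, score on the later rows only.
        coefs = fit_ols(features[:n_train], returns[:n_train])
        scores.append(r_squared(coefs, features[n_train:], returns[n_train:]))
    return scores[1] - scores[0]


# Synthetic, time-ordered data, made up purely for illustration.
rng = random.Random(0)
n = 300
momentum = [rng.gauss(0, 1) for _ in range(n)]
baseline = [[m] for m in momentum]
noise = [rng.gauss(0, 0.1) for _ in range(n)]

# Case 1: sentiment carries information that momentum does not.
independent_sentiment = [rng.gauss(0, 1) for _ in range(n)]
returns_1 = [0.5 * m + 0.8 * s + e
             for m, s, e in zip(momentum, independent_sentiment, noise)]
gain_useful = incremental_r2(baseline, independent_sentiment, returns_1, 200)

# Case 2: sentiment is mostly a noisy copy of momentum; returns ignore it.
redundant_sentiment = [m + rng.gauss(0, 0.3) for m in momentum]
returns_2 = [0.5 * m + e for m, e in zip(momentum, noise)]
gain_redundant = incremental_r2(baseline, redundant_sentiment, returns_2, 200)

print(f"gain when sentiment is new information: {gain_useful:.3f}")
print(f"gain when sentiment is redundant:       {gain_redundant:.3f}")
```

A gain near zero suggests the signal adds little beyond what the baseline already captures. On real data the baseline would hold several commonly used signals rather than a single column.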

1

justundertheblack OP t1_j8wuh36 wrote

Trueee, I've heard that such models need to be tuned regularly. I'll definitely look into it.

1

Hot_Initial7865 t1_j8wrxtj wrote

It sounds like a school assignment

1

justundertheblack OP t1_j8ws5u6 wrote

Naah it's a college project 😂

1

emotionalfool123 t1_j8wwxtu wrote

The best way to find any topic and related code is to search for it on Google Scholar.

E.g. https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=nlp+stock+market+trading+code+github&btnG=

https://finbert.ai/

https://github.com/jinanzou/astock

These are the two results I found. Happy learning.

1