itsyourboiirow
itsyourboiirow t1_jecqjqd wrote
Reply to comment by Evening_Ad6637 in [D] Training a 65b LLaMA model by Business-Lead2679
Training requires significantly more memory, as it has to keep track of the gradient for every parameter. I would check to see how much memory it takes up on your computer.
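As a rough back-of-the-envelope sketch (my own assumptions, not from the thread: fp16 weights and gradients, Adam optimizer with two fp32 states per parameter, activations ignored):

```python
# Rough memory estimate for full fine-tuning of a 65B-parameter model.
# Assumptions: fp16 weights/grads (2 bytes each), Adam with two fp32
# states (m, v) per parameter; activation memory is NOT included.
params = 65e9

weights_gb   = params * 2 / 1e9          # fp16 weights
grads_gb     = params * 2 / 1e9          # fp16 gradients
optimizer_gb = params * 2 * 4 / 1e9      # two fp32 Adam states

total_gb = weights_gb + grads_gb + optimizer_gb
print(f"~{total_gb:.0f} GB before activations")  # ~780 GB
```

Compare that with inference, which only needs the weights (~130 GB under the same fp16 assumption), and it's clear why training is the hard part.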
itsyourboiirow t1_jecqc1d wrote
Reply to comment by Nhabls in [D] Training a 65b LLaMA model by Business-Lead2679
This is the only downside I've found. Sometimes it's too darn hard to find an instance.
itsyourboiirow t1_je7n7p8 wrote
Others have mentioned it, but do data augmentation (crop, resize, rotate, etc.) and you'll be able to increase the effective size of your dataset and improve results.
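A minimal sketch of what those augmentations look like in plain numpy (image and crop size are made up for illustration; in practice you'd use something like torchvision transforms):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # fake HxWxC image

flipped = img[:, ::-1]            # horizontal flip
rotated = np.rot90(img)           # 90-degree rotation

def random_crop(image, size):
    # pick a random top-left corner and slice out a size x size patch
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

cropped = random_crop(img, 24)
print(flipped.shape, rotated.shape, cropped.shape)
```

Each transformed copy is a "new" training example from the model's point of view, which is where the dataset-size boost comes from.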
itsyourboiirow OP t1_iy9809b wrote
Reply to comment by DinosParkour in [D] Difference between sparse and dense information retrieval by itsyourboiirow
Thanks for the in-depth response!
itsyourboiirow t1_iy5aa1i wrote
Reply to comment by radarsat1 in [D] What method is state of the art dimensionality reduction by olmec-akeru
Correct. But you don't necessarily have to discard the extra dimensions to do PCA.
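To make that concrete: PCA with all components kept is just an orthogonal rotation onto the principal axes, so nothing is lost. A small numpy sketch (data and sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

Xc = X - X.mean(axis=0)                     # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T                          # project onto ALL principal axes

# Same dimensionality, just rotated; total variance is preserved.
print(scores.shape)  # (100, 5)
```

Dimensionality reduction only happens if you then choose to keep the first k columns of `scores`.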
Submitted by itsyourboiirow t3_z76uel in MachineLearning
itsyourboiirow t1_iskrzq9 wrote
Reply to comment by Mmm36sa in [D] Simple Questions Thread by AutoModerator
You could try PCA and a random forest or a k-nearest neighbors classifier.
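For a flavor of the k-NN half of that suggestion, here's a toy implementation in plain numpy (the data and function name are mine; in practice you'd reach for scikit-learn, possibly with PCA applied first):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    # Euclidean distance from every test point to every training point
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    nearest = np.argsort(d, axis=1)[:, :k]        # indices of the k closest
    votes = y_train[nearest]                      # their class labels
    # majority vote per test point
    return np.array([np.bincount(v).argmax() for v in votes])

# Two well-separated clusters as toy training data
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])

preds = knn_predict(X, y, np.array([[0.05, 0.05], [5.05, 5.05]]))
print(preds)  # [0 1]
```

PCA before k-NN helps because distances in a lower-dimensional space are cheaper to compute and less noisy.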
itsyourboiirow t1_iskrfyz wrote
Reply to comment by liljontz in [D] Simple Questions Thread by AutoModerator
If you are doing it to learn and for fun, I would look into a recurrent neural network (RNN) or a long short-term memory (LSTM) model for generation. They're really good at picking up patterns in text. I'm sure it would be able to do it well with enough training data.
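To give a flavor of what an LSTM does under the hood, here is a single forward step in plain numpy (shapes and names are my own, not from any library; real projects would use `torch.nn.LSTM` or similar):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. x: input vector, (h, c): previous hidden/cell state."""
    H = h.shape[0]
    z = W @ x + U @ h + b            # all four gates computed at once, shape (4H,)
    i = sigmoid(z[:H])               # input gate
    f = sigmoid(z[H:2*H])            # forget gate
    g = np.tanh(z[2*H:3*H])          # candidate cell update
    o = sigmoid(z[3*H:])             # output gate
    c = f * c + i * g                # new cell state
    h = o * np.tanh(c)               # new hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 8, 4                          # input and hidden sizes
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for t in range(5):                   # run the cell over a short sequence
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
print(h.shape)  # (4,)
```

The forget gate `f` is what lets the cell state carry information across many characters, which is why LSTMs pick up longer-range text patterns than plain RNNs.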
itsyourboiirow t1_iskqzyh wrote
Reply to comment by whydontigetbetter01 in [D] Simple Questions Thread by AutoModerator
I don't know what Flutter is. But PyTorch has methods that will optimize a model for mobile devices and make it GPU compatible for both iOS and Android.
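A minimal sketch of that PyTorch mobile workflow (the tiny model and filename are made up; the API calls are PyTorch's TorchScript/mobile tooling):

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

class TinyNet(torch.nn.Module):
    """Placeholder model standing in for whatever you actually trained."""
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet().eval()
scripted = torch.jit.script(model)             # convert to TorchScript
mobile = optimize_for_mobile(scripted)         # fuse/fold ops for mobile runtimes
mobile._save_for_lite_interpreter("tiny.ptl")  # artifact loadable on iOS/Android
```

The saved `.ptl` file is what the mobile runtime loads on-device; the Python side is only needed for export.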
itsyourboiirow t1_iskqchc wrote
Reply to comment by ABCDofDataScience in [D] Simple Questions Thread by AutoModerator
Yeah, I'm not sure about the details. But I would guess it's so you can use backpropagation and loss functions on your NN.
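Concretely, the point of differentiable operations is that you can compute a loss and push its gradient back to the weights. A minimal numpy sketch of gradient descent on a mean-squared-error loss (all data and names are mine, for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                               # targets from a known linear model

w = np.zeros(3)
lr = 0.1
for _ in range(200):
    pred = X @ w
    loss = np.mean((pred - y) ** 2)          # MSE loss
    grad = 2 * X.T @ (pred - y) / len(y)     # gradient of the loss w.r.t. w
    w -= lr * grad                           # gradient descent step

print(np.round(w, 2))  # recovers roughly [ 1.  -2.   0.5]
```

Autograd frameworks do exactly this, except the `grad` line is derived automatically for arbitrary network architectures.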
itsyourboiirow t1_jefs1oh wrote
Reply to [D] Simple Questions Thread by AutoModerator
Any people/organizations to follow on Twitter for all things machine learning (traditional ML, deep neural networks, LLMs, etc.)?