BenoitParis
BenoitParis t1_j4qbih9 wrote
Reply to [D] Is it possible to update random forest parameters with new data instead of retraining on all data? by monkeysingmonkeynew
Hoeffding Trees come to mind. The keyword you are looking for is 'online learning'. Apparently there's a Python package dedicated to that:
https://scikit-multiflow.readthedocs.io/en/stable/api/api.html
But 250,000 rows is not that many. Since your time requirement is daily, I'd first look at other algorithms, or at implementations in other languages, before going the online-learning route.
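For context on why Hoeffding Trees suit streaming data: they grow a split only once enough examples have arrived that the best candidate split is ahead of the runner-up by more than the Hoeffding bound. Below is a rough sketch of that check in plain Python. The function names and the example numbers are made up for illustration, and this is not the scikit-multiflow API.

```python
import math

def hoeffding_bound(value_range, delta, n):
    # With probability 1 - delta, the mean observed over n samples is within
    # this epsilon of the true mean, for a variable spanning value_range.
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, n_classes, delta, n):
    # Hypothetical helper: information gain with n_classes classes
    # lies in [0, log2(n_classes)], which gives the range for the bound.
    eps = hoeffding_bound(math.log2(n_classes), delta, n)
    return (best_gain - second_gain) > eps

# The same gain gap is not trusted after 100 examples but is after 1000.
early = should_split(0.30, 0.10, n_classes=2, delta=1e-7, n=100)
later = should_split(0.30, 0.10, n_classes=2, delta=1e-7, n=1000)
```

The bound shrinks as n grows, so the tree can commit to splits incrementally without ever revisiting old rows.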
BenoitParis t1_j7v9wue wrote
Reply to [D] Similarity b/w two vectors by TKMater
Lots to choose from:
https://docs.scipy.org/doc/scipy/reference/spatial.distance.html
What do your vectors look like? What do you intend to do with them? Will you be clustering them? Indexing them? How many are there? How did you obtain them? What do they represent? What is their type?
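To show why those questions matter, here is a small sketch in plain Python of two of the measures on that page. The function names and sample vectors are my own; `scipy.spatial.distance.euclidean` and `scipy.spatial.distance.cosine` compute the same quantities.

```python
import math

def euclidean(u, v):
    # Straight-line distance: sensitive to the magnitude of the vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    # 1 - cos(angle between u and v): ignores magnitude, compares direction.
    # Undefined for an all-zero vector (division by zero).
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Illustrative vectors: same direction, very different lengths.
u = [1.0, 0.0]
v = [10.0, 0.0]

far_apart = euclidean(u, v)          # large: the lengths differ
same_direction = cosine_distance(u, v)  # zero: the directions match
```

Which of the two answers is the "right" one depends on whether the magnitude of your vectors carries meaning.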