
Zealousideal-Copy463 OP t1_j5tjusi wrote

Thanks for your comment! I have tried using EC2 and keeping the data in EBS, but I'm not sure it's the best solution. What is your workflow there?

I'm mostly playing around with NLP and image models. Right now I'm trying to process videos, about 200GB, for a retrieval problem. What I do is: extract frames, get feature vectors from pretrained ResNet and ResNeXt models (this takes a lot of time), and then train a siamese network on all of those vectors. As I said, I have tried S3 and SageMaker, but I have to move the data into SageMaker notebooks and I waste a lot of time there. I also tried processing everything on EC2, but setting the whole thing up took me a while (downloading data, installing libraries, writing shell scripts to process the videos, etc.).
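For illustration only, the bookkeeping around that pipeline (which frames to keep, how to pair the feature vectors, what loss the siamese network minimizes) might look roughly like the sketch below. This is not the actual code from the post. The function names, the one-frame-per-second sampling, and the "same video = positive pair" labeling rule are all assumptions, and the feature extraction itself (ResNet/ResNeXt) is left out so the example stays dependency-free.

```python
import itertools
import math
import random


def sample_frame_indices(total_frames, fps, every_seconds=1.0):
    """Indices of the frames to keep: one frame every `every_seconds` (assumed rate)."""
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def make_pairs(vectors_by_video, negatives_per_positive=1, seed=0):
    """Build (vec_a, vec_b, label) tuples for siamese training.

    Assumption: two vectors from the same video form a positive pair (label 1),
    vectors from different videos form a negative pair (label 0).
    """
    rng = random.Random(seed)
    video_ids = list(vectors_by_video)
    pairs = []
    for vid in video_ids:
        others = [v for v in video_ids if v != vid]
        for a, b in itertools.combinations(vectors_by_video[vid], 2):
            pairs.append((a, b, 1))
            if not others:
                continue  # no other video to draw negatives from
            for _ in range(negatives_per_positive):
                other = rng.choice(others)
                pairs.append((a, rng.choice(vectors_by_video[other]), 0))
    return pairs


def contrastive_loss(vec_a, vec_b, label, margin=1.0):
    """One common form of the contrastive loss for a single pair."""
    dist = math.dist(vec_a, vec_b)
    if label == 1:
        return dist ** 2                    # pull positives together
    return max(0.0, margin - dist) ** 2     # push negatives past the margin
```

In a real run the tuples in `vectors_by_video` would be the precomputed feature vectors, so the expensive extraction step only has to happen once and the pairs can be rebuilt cheaply for every experiment.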

v2thegreat t1_j60fvmd wrote

Well, there are EC2 instances that come already set up. How often do you do this sort of thing? It might be justified to build your own home setup, but as someone who does that myself, I can tell you it's kinda tedious and you end up being your own IT guy.