FuB4R32 t1_j5tdt1c wrote

We use Google Cloud Storage buckets + TensorFlow. It works well since you can always point a VM at a cloud bucket (e.g. of TFRecords) and it just has access to the data. I know you can do something similar in JAX; I haven't tried PyTorch. It's the same in a Colab notebook. Not sure if you can point to a cloud location from a local machine though, but as others are saying, the 4090 might not be the best use case (e.g. you can use a TPU in a Colab notebook to get similar performance).
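For anyone curious, here's a rough sketch of what that looks like: tf.data reads TFRecords straight from a gs:// path, so nothing has to be downloaded to the VM first. The bucket path and feature spec below are placeholders; adjust them to your data.

```python
import tensorflow as tf

# tf.data reads gs:// paths directly; no copy to local disk needed.
# "gs://my-bucket/..." and the feature spec are hypothetical placeholders.
files = tf.io.gfile.glob("gs://my-bucket/tfrecords/train-*.tfrecord")

feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse(example_proto):
    # Decode one serialized tf.train.Example into a dict of tensors.
    return tf.io.parse_single_example(example_proto, feature_spec)

dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```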

1

Zealousideal-Copy463 OP t1_j5tk24z wrote

Ohh, I didn't know that about GCP. So you can point a VM at a bucket and it just "reads" the data? You don't have to "upload" the data into the VM?

As I said in a previous comment, my problem with AWS (S3 and SageMaker) is that the data is in a different network, and even though it's still an AWS network, you have to move data around and that takes a while (when it's 200 GB of data).

1

FuB4R32 t1_j5v9mlr wrote

Yeah, as long as your VM is in the same region as the bucket it should be fine. Even if you have 200 GB, it doesn't take that long to move between regions either.
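If you do end up needing to move data between buckets, a minimal sketch from Python is a server-side copy with the google-cloud-storage client (bucket names here are placeholders; gsutil -m cp works for the same thing from the command line):

```python
from google.cloud import storage

client = storage.Client()
src = client.bucket("source-bucket")        # placeholder bucket names
dst = client.bucket("destination-bucket")

# copy_blob is a server-side copy, so the data never touches your machine.
for blob in client.list_blobs(src):
    src.copy_blob(blob, dst, blob.name)
```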

2