incrediblediy

incrediblediy t1_jb5dzqa wrote

This is when they were running individually on a full x16 PCIe 4.0 slot; roughly the scaling you would expect from the TFLOPS ratio (~3x) as well. (I.e. I compared times with only the 3060 and then the 3090 in the same slot, running the model on a single GPU each time.)

I don't do much training on the 3060 now; it is mostly just driving my monitors etc.

I have changed the batch sizes to suit 24 GB anyway, as I am working with CV data. It could be a bit different with other types of models.

3060 = FP32 (float) 12.74 TFLOPS (https://www.techpowerup.com/gpu-specs/geforce-rtx-3060.c3682)
3090 = FP32 (float) 35.58 TFLOPS (https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622)
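
For scale, those two FP32 figures imply roughly a 2.8x theoretical gap (a quick sketch, using only the TechPowerUp numbers above):

```python
# FP32 throughput of each card, in TFLOPS (TechPowerUp figures quoted above)
tflops_3060 = 12.74
tflops_3090 = 35.58

# Theoretical best-case speedup from raw FP32 alone; real training
# speedups also depend on Tensor cores, VRAM, and data loading
speedup = tflops_3090 / tflops_3060
print(f"{speedup:.2f}x")  # ~2.79x
```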

I must say the 3060 is a wonderful card, and it helped me a lot until I found this ex-mining 3090. Really worth it for the price, with 12 GB of VRAM.

1

incrediblediy t1_jacum67 wrote

I am in a similar position to you; I have just passed the first year of my PhD. I use Win10 at home (RTX 3090 + RTX 3060) and Linux GPU servers at uni (command line only). At the end of the day it really doesn't matter, as I am using Python and other libraries that are cross-platform. I do keep the conda environments on both systems similar, though.

2

incrediblediy t1_izm4ey7 wrote

Looks like GTX 980 4 GB = 165 W and RTX 2060 6 GB = 160 W, which would be 325 W together. I haven't used Intel K CPUs, so I am not that familiar with their power usage, but I think 850 W would be more than enough if it is a proper 850 W PSU, even considering the power drawn by other components like the motherboard, RAM, SSD etc.

You can use this to calculate power requirement https://outervision.com/power-supply-calculator

My maximum power usage is AMD Ryzen 5600X (75 W) + RTX 3060 (170 W) + RTX 3090 (350 W) = 595 W; I think the total with other components was about 750 W (system power budget: https://outervision.com/b/8XoZwf ).
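
As a sketch of that budget (the 155 W figure for the remaining components is my assumption to match the ~750 W OuterVision total, not a number from the post):

```python
# Headline board powers from the post, in watts
cpu_w = 75       # Ryzen 5 5600X
rtx3060_w = 170
rtx3090_w = 350
other_w = 155    # motherboard, RAM, SSD, fans (assumed remainder)

cpu_gpu_total = cpu_w + rtx3060_w + rtx3090_w
system_total = cpu_gpu_total + other_w
headroom = 850 - system_total  # margin left on an 850 W PSU
print(cpu_gpu_total, system_total, headroom)  # 595 750 100
```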

I have an 850 W Tier A Deepcool PQ850M, which is a Seasonic-based 80+ Gold unit. I have power-stress-tested it with OCCT and it was fine.

1

incrediblediy t1_izi6n58 wrote

>Now if I connect my 2060 along with the gtx 980, and connect my display to the 980 , will pytorch be use the whole vram of 2060 ?

Yes, I have a similar setup: RTX 3090 with no display attached (full VRAM free for training), RTX 3060 driving 2 monitors.

When I play games, I connect one monitor to the RTX 3090 and play on that, with the other monitor on the RTX 3060.
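
If you want to be certain training never touches the display card, one common trick (a sketch; the device index is an assumption and depends on how the driver enumerates your boards, so check nvidia-smi first) is to hide the display GPU before the framework initialises CUDA:

```python
import os

# Expose only the training GPU to this process. Must be set before
# torch/TF initialise CUDA. Index "1" is an assumption -- check nvidia-smi.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# Inside the training script nothing else changes:
#   import torch
#   device = torch.device("cuda:0")  # "cuda:0" now maps to the only visible GPU
print(os.environ["CUDA_VISIBLE_DEVICES"])
```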

2

incrediblediy t1_iyzkww4 wrote

Reply to comment by [deleted] in 4080 vs 3090 by simorgh12

> K80

Yes, I meant that I got a K80, and I was doing some CNN/BERT work etc. Just checked: a K80 (single unit) has similar TFLOPS to a GTX 1060 3 GB, so with the other overheads in the cloud (slow CPU, drive storage etc.), Colab could be slower anyway.

Now I have a PC with dual GPU setup (RTX3090 + RTX3060) and have access to GPU servers at Uni, so no more Colab :)

> have a 1650 which is no slouch but colab trained in 5s what took my GPU 10 minutes.

Is that a laptop GPU?

1

incrediblediy t1_iyy4mua wrote

Reply to comment by [deleted] in 4080 vs 3090 by simorgh12

I am not sure about this; even my GTX 1060 3 GB was somewhat faster than the K80 on Google Colab. Also think about storage size/speed, internet upload speed, security/restrictions on your data, the 12-hour session limit, etc.

3

incrediblediy t1_iyy49kh wrote

Reply to 4080 vs 3090 by simorgh12

The 4080 16 GB should really have been the 4070 Ti.
The 3090 24 GB would be a better choice, especially for the VRAM; you can also get a used card, which would be much cheaper.

3

incrediblediy t1_iyjrk9c wrote

Reply to comment by democracyab in RTX 2060 or RTX 3050 by democracyab

Get a used card (maybe $200?), not a brand-new one. PSUs are not that expensive now (maybe $50?). I used to have a 450 W Silverstone 80+ Bronze with a Ryzen 5600X + RTX 3060, and it worked well.

Anyway, you will need a PSU for the RTX 2060 as well; the RTX 2060 and RTX 3060 have almost the same power requirement.

2

incrediblediy t1_iycjg6d wrote

You can use your own preprocessing on top of the Keras preprocessing and data loader, or use custom code for the whole pipeline.
According to https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator ,

> Deprecated: tf.keras.preprocessing.image.ImageDataGenerator is not recommended for new code. Prefer loading images with tf.keras.utils.image_dataset_from_directory and transforming the output tf.data.Dataset with preprocessing layers.

You can do mini-batch training depending on available VRAM, even with a batch size of 1. I assume by RAM you mean VRAM, as we hardly do deep learning on the CPU for image datasets.

Example: you can use a data_augmentation pipeline step to get control over preprocessing, like below. (I used this code with an older TF version (2.4.0 or 2.9.0.dev, maybe), so function locations might need to change for newer versions, as the deprecation note says.)

import tensorflow

train_ds = tensorflow.keras.preprocessing.image_dataset_from_directory(
    image_directory,
    labels='inferred',
    label_mode='int',
    class_names=classify_names,
    validation_split=0.3,
    subset="training",
    shuffle=shuffle_value,
    seed=seed_value,
    image_size=image_size,
    batch_size=batch_size,
)

data_augmentation = tensorflow.keras.Sequential(
    [
        tensorflow.keras.layers.experimental.preprocessing.RandomFlip("horizontal"),
        tensorflow.keras.layers.experimental.preprocessing.RandomRotation(0.1),
    ]
)

augmented_train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))
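
For newer TF versions, the deprecation note above points at roughly this shape instead (a sketch only, untested here, reusing the same variables as the snippet above; the loader lives under tf.keras.utils and the augmentation layers lose the .experimental.preprocessing prefix from around TF 2.9):

```python
import tensorflow as tf

# Same pipeline via the non-deprecated entry points
train_ds = tf.keras.utils.image_dataset_from_directory(
    image_directory,
    labels='inferred',
    label_mode='int',
    validation_split=0.3,
    subset="training",
    seed=seed_value,
    image_size=image_size,
    batch_size=batch_size,
)

data_augmentation = tf.keras.Sequential(
    [
        tf.keras.layers.RandomFlip("horizontal"),
        tf.keras.layers.RandomRotation(0.1),
    ]
)

augmented_train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))
```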

2

incrediblediy t1_ixm8sku wrote

I think you might find more info on such a rig if you search for how to build a "mining rig"; it is quite similar. I have seen them use multiple 1200 W server PSUs connected together with an interface board.

> power spikes can cause each of them to go up to 1000W

:O that's quite a lot. Is this for a commercial application or a home project? Otherwise you might be able to find 4 used 3090s with a better ROI.

2

incrediblediy t1_ix9xbce wrote

Reply to comment by Nerveregenerator in GPU QUESTION by Nerveregenerator

If your CPU/motherboard supports a PCIe 4.0 x16 slot, that is all an RTX 3090 needs. I have a 5600X with a cheap B550M-DS3H motherboard running an RTX 3090 + RTX 3060. I also got a used RTX 3090 from eBay after the decline of mining. Just make sure your PSU can support it; it draws 370 W at max.

2

incrediblediy t1_ix7czdr wrote

Reply to comment by Nerveregenerator in GPU QUESTION by Nerveregenerator

> 4 1080s combined will get me 1.5x throughput as a 3090 with FP32 training. FP16 seems to yield a 1.5x speed up for the 3090 for training.

I think that's only when comparing CUDA cores without Tensor cores. Anyway, you can't merge the VRAM of separate cards for large models.
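
As a rough FP32-only illustration (a sketch; the other comment in this thread mentions 1080 Tis, and the ~11.34 TFLOPS figure for the 1080 Ti is TechPowerUp's number, not from this thread):

```python
# Raw FP32 TFLOPS from TechPowerUp: GTX 1080 Ti ~11.34, RTX 3090 ~35.58
gtx1080ti_tflops = 11.34
rtx3090_tflops = 35.58

# Four 1080 Tis vs one 3090 on FP32 alone -- before Tensor cores,
# multi-GPU communication overhead, or the unmergeable-VRAM problem
ratio = (4 * gtx1080ti_tflops) / rtx3090_tflops
print(f"{ratio:.2f}x")  # ~1.27x
```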

3

incrediblediy t1_ix7cqrg wrote

> 4 1080Ti's or 1 3090 > ebay for 200 bucks

You can also get a used 3090 for the same price as 4 × $200, and then you have 24 GB of VRAM for training larger models.

5