one_eyed_sphinx OP t1_j7yrvl2 wrote on February 10, 2023 at 10:53 AM

Reply to comment by allanmeter in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx

What's your recommendation for VRAM:RAM:NVMe ratios?

one_eyed_sphinx OP t1_j7yr8st wrote on February 10, 2023 at 10:45 AM

Reply to comment by ThomasBudd93 in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx

Some people seem to attribute it to AMD processors and motherboards. Do you think that's the reason?
Nvidia is known to downgrade their gaming GPUs so that people will buy the professional ones.

one_eyed_sphinx OP t1_j7yqh5v wrote on February 10, 2023 at 10:34 AM

Reply to comment by suflaj in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx

>NVME

Yeah, GPU memory is a horrible bottleneck. I am trying to find ways around it, but there don't seem to be many best practices for it. Is there a way to use pinned memory for faster model data transfer?

one_eyed_sphinx OP t1_j7ypyu3 wrote on February 10, 2023 at 10:27 AM

Reply to comment by allanmeter in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx

A Threadripper at minimum? Are you saying this because of the number of lanes or the number of cores?
Can you elaborate on "Assuming you have a handle on data vs model distribution strategy"?

one_eyed_sphinx OP t1_j7tzqmj wrote on February 9, 2023 at 11:52 AM

Reply to comment by ThomasBudd93 in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx

Can you point me to a reference?

one_eyed_sphinx OP t1_j7tzoiq wrote on February 9, 2023 at 11:51 AM

Reply to comment by suflaj in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx

>eco

So this is the fine point I want to understand. What I am trying to optimize with this build is data transfer time: how long it takes to load a model from RAM to VRAM. If I have 10 models that each need 16 GB of VRAM to run, they need to share resources. So I want to "memory hot swap" the models on an incoming request (I don't know if there is a proper term for it; I found "bin packing"). Data transfer is therefore fairly critical from my point of view, and as I understand it, only the PCIe speed is the bottleneck here. Correct me if I'm wrong.
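
To make the scenario concrete, here is a rough sketch of the bookkeeping I have in mind. It is only a back-of-the-envelope model, not a real loader: it assumes a theoretical PCIe 4.0 x16 peak of about 31.5 GB/s per direction (real host-to-device copies are usually slower), a 24 GB VRAM budget per RTX 4090, and a least-recently-used eviction rule that I picked for illustration. The names `transfer_seconds` and `VramModelCache` are hypothetical.

```python
from collections import OrderedDict

# Assumption: theoretical PCIe 4.0 x16 peak per direction, in GB/s.
# Sustained host-to-device throughput is typically lower than this.
PCIE4_X16_GBPS = 31.5


def transfer_seconds(model_gb, bandwidth_gbps=PCIE4_X16_GBPS):
    """Best-case time to copy a model of `model_gb` GB from RAM to VRAM."""
    return model_gb / bandwidth_gbps


class VramModelCache:
    """Tracks which models are resident in VRAM, evicting the least recently used.

    This only simulates the accounting; it does not move any data.
    """

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.resident = OrderedDict()  # model name -> size in GB, oldest first

    def request(self, name, size_gb):
        """Return the estimated load time for this request (0.0 if already resident)."""
        if name in self.resident:
            self.resident.move_to_end(name)  # mark as most recently used
            return 0.0
        if size_gb > self.capacity_gb:
            raise ValueError(f"{name} ({size_gb} GB) does not fit in {self.capacity_gb} GB")
        # Evict least recently used models until the new one fits.
        while sum(self.resident.values()) + size_gb > self.capacity_gb:
            self.resident.popitem(last=False)
        self.resident[name] = size_gb
        return transfer_seconds(size_gb)


# One 24 GB GPU can hold only one 16 GB model at a time,
# so alternating between two models pays the transfer cost on every switch.
cache = VramModelCache(capacity_gb=24)
total = sum(cache.request(name, 16) for name in ["model_0", "model_1", "model_0"])
print(f"estimated time spent on transfers: {total:.2f} s")
```

Under these assumptions each 16 GB swap costs about half a second in the best case, and that is before any time spent reading from disk or initializing the model, which this sketch ignores.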