
head_robotics OP t1_j99tts4 wrote

Did you use something like bitsandbytes for the 8-bit inference?

How did you implement it?

https://github.com/TimDettmers/bitsandbytes


Disastrous_Elk_6375 t1_j99ujv1 wrote

Add these arguments to your .from_pretrained() call: .from_pretrained("model", device_map="auto", load_in_8bit=True)

Transformers does the rest.
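For anyone following along, here is a minimal sketch of what that call might look like in full. It assumes transformers, accelerate, and bitsandbytes are installed and a CUDA GPU is available. The helper names (eight_bit_kwargs, load_model_8bit) and the choice of AutoModelForCausalLM are illustrative assumptions, not something from the comment above. Newer Transformers releases prefer passing quantization_config=BitsAndBytesConfig(load_in_8bit=True) instead of the bare load_in_8bit flag, so check which form your installed version expects.

```python
def eight_bit_kwargs():
    """Return the keyword arguments suggested in the comment above."""
    # device_map="auto" lets accelerate place the weights on available devices;
    # load_in_8bit=True asks Transformers to quantize via bitsandbytes.
    return {"device_map": "auto", "load_in_8bit": True}


def load_model_8bit(model_name):
    """Load a tokenizer and an 8-bit causal LM (hypothetical helper)."""
    # Imported lazily so that defining this function does not itself
    # require transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, **eight_bit_kwargs())
    return tokenizer, model


# Example usage (downloads weights and needs a GPU, so it is left commented out;
# "model" is the same placeholder name used in the comment above):
# tokenizer, model = load_model_8bit("model")
# inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
# print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

Keeping the arguments in one small function makes it easy to switch to the quantization_config form later without touching the rest of the loading code.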
