
satireplusplus t1_jaiwxlo wrote

Wow, nice, I will try it out!

Btw: if you want to format code in your post, you need to add 4 spaces in front of every line of the code block; otherwise all newlines are lost.

Lines starting with four spaces are treated as code:

if 1 * 2 < 3:
    print("hello, world!")

bo_peng OP t1_jaixxp5 wrote

Thank you :) I was using the markdown mode instead because I didn't know this.

KerfuffleV2 t1_jaiz1k8 wrote

Unfortunately, that doesn't work on the old reddit layout. We just see a garbled mess.

Here's a fixed version of the code/examples (not my content):

Example:

'cuda:0 fp16 *10 -> cuda:1 fp16 *8 -> cpu fp32' = first 10 layers on cuda:0 in fp16, then 8 layers on cuda:1 in fp16, then the rest on cpu in fp32

'cuda fp16 *20+' = first 20 layers on cuda in fp16, then stream the remaining layers on it (loaded to the GPU on demand)
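
To make the strategy strings concrete, here is a minimal sketch of passing one to the RWKV constructor. The two-GPU split and the model path are assumptions for illustration only; the split you actually want depends on the model size and your VRAM:

from rwkv.model import RWKV

# Hypothetical example: first 10 layers on cuda:0 (fp16), the next 8 on
# cuda:1 (fp16), and whatever remains on the CPU in fp32.
model = RWKV(model='/path/to/some-rwkv-4-pile-checkpoint',  # hypothetical path
             strategy='cuda:0 fp16 *10 -> cuda:1 fp16 *8 -> cpu fp32')

# Alternative: keep the first 20 layers resident on the GPU and stream the rest on demand.
# model = RWKV(model='/path/to/some-rwkv-4-pile-checkpoint', strategy='cuda fp16 *20+')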


import os

os.environ['RWKV_JIT_ON'] = '1'
os.environ["RWKV_CUDA_ON"] = '0' # if '1' then compile CUDA kernel for seq mode (much faster)

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# download models: https://huggingface.co/BlinkDL
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-169m/RWKV-4-Pile-169M-20220807-8023', strategy='cpu fp32')
pipeline = PIPELINE(model, "20B_tokenizer.json") # find it in https://github.com/BlinkDL/ChatRWKV

ctx = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."
print(ctx, end='')

def my_print(s):
    print(s, end='', flush=True)

# For alpha_frequency and alpha_presence, see "Frequency and presence penalties":
# https://platform.openai.com/docs/api-reference/parameter-details
args = PIPELINE_ARGS(temperature = 1.0, top_p = 0.7,
                     alpha_frequency = 0.25, # penalize tokens in proportion to how often they have appeared so far
                     alpha_presence = 0.25,  # flat penalty for tokens that have appeared at least once
                     token_ban = [0],        # ban the generation of some tokens
                     token_stop = [])        # stop generation whenever you see any token here

pipeline.generate(ctx, token_count=512, args=args, callback=my_print)
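
If you're wondering what alpha_frequency and alpha_presence actually do to the sampling, here's a rough sketch of the usual frequency/presence penalty described in the linked OpenAI docs, written against a plain logits array rather than the rwkv package internals (the function and variable names are illustrative, and the library's exact implementation may differ):

import numpy as np

def apply_penalties(logits, generated_tokens, alpha_frequency=0.25, alpha_presence=0.25):
    # Count how many times each token id has been generated so far.
    counts = {}
    for t in generated_tokens:
        counts[t] = counts.get(t, 0) + 1
    # Frequency penalty grows with the count; presence penalty is a flat,
    # one-time cost for any token that has appeared at least once.
    for t, n in counts.items():
        logits[t] -= alpha_frequency * n + alpha_presence
    return logits

# Toy usage: 5-token vocabulary, token 2 generated twice and token 4 once so far.
logits = np.zeros(5)
print(apply_penalties(logits, [2, 2, 4]))  # tokens 2 and 4 get pushed down

Raising either alpha makes repeated tokens progressively less likely, which helps keep the generated story from looping.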

I kind of want to know what happens in the story...

bo_peng OP t1_jaj2pr2 wrote

Strange. All spaces are lost even when I add 4 spaces in front of all the code lines.

UPDATE: it works in the markdown editor :)
