Submitted by bo_peng t3_1135aew in MachineLearning
jamesvoltage t1_j8yjrqo wrote
Reply to comment by MysteryInc152 in [R] RWKV-4 14B release (and ChatRWKV) - a surprisingly strong RNN Language Model by bo_peng
State space models (S4, H3, etc.) are also competitive with 2B-parameter transformer language models and have an effectively infinite context window: https://hazyresearch.stanford.edu/blog/2023-01-20-h3