What is the size of the vocabulary, and how is the tokenizer trained? BPE or WordPiece?
65536. It's a greedy tokenizer. See https://github.com/BlinkDL/ChatRWKV/tree/main/tokenizer
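For anyone wondering what "greedy" could mean here, this is a minimal sketch, assuming it refers to longest-match tokenization against a fixed vocabulary: at each position, take the longest vocabulary entry that matches and move on. The function name `greedy_tokenize` and the toy vocabulary are made up for illustration; this is not the code from the linked repository, which may differ in data structures and details.

```python
def greedy_tokenize(text: bytes, vocab: dict[bytes, int]) -> list[int]:
    """Greedy longest-match tokenization over a fixed vocabulary (illustrative only)."""
    max_len = max(len(tok) for tok in vocab)
    ids = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shrink until one matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                ids.append(vocab[piece])
                i += length
                break
        else:
            # No vocabulary entry matches at this position.
            raise ValueError(f"no token matches at byte offset {i}")
    return ids


# Hypothetical toy vocabulary; the real one has 65536 entries.
toy_vocab = {b"a": 0, b"b": 1, b"c": 2, b"ab": 3, b"abc": 4, b"bc": 5}
ids = greedy_tokenize(b"abcab", toy_vocab)
```

Unlike BPE, a scheme like this applies no merge rules at encoding time: the vocabulary is fixed and the encoder only picks the longest match at each step. A trie would avoid the repeated slicing, but the loop above is enough to show the idea.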