Mason Wang

Byte Pair Encoding

start with a vocabulary of the 256 possible byte values

repeatedly merge the most frequent adjacent pair of tokens and add the merged token to the vocabulary
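
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `train_bpe` and `merge_pair` are hypothetical, the pair to merge is assumed to be the most frequent adjacent pair, and new token ids are assumed to start at 256.

```python
from collections import Counter


def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges on the UTF-8 bytes of `text`."""
    # Step 1: start with a vocabulary of all 256 byte values.
    vocab = {i: bytes([i]) for i in range(256)}
    ids = list(text.encode("utf-8"))
    merges = {}

    for _ in range(num_merges):
        # Count every adjacent pair of tokens in the current sequence.
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break  # fewer than two tokens left, nothing to merge

        # Step 2: merge the most frequent pair (assumed selection rule)
        # and add the merged token to the vocabulary.
        pair = max(pairs, key=pairs.get)
        new_id = 256 + len(merges)
        merges[pair] = new_id
        vocab[new_id] = vocab[pair[0]] + vocab[pair[1]]
        ids = merge_pair(ids, pair, new_id)

    return vocab, merges, ids


vocab, merges, ids = train_bpe("aaabdaaabac", 3)
# Decoding is just concatenating the byte strings of the tokens.
decoded = b"".join(vocab[i] for i in ids).decode("utf-8")
```

Because every token is a byte string built from the base bytes, any input can be encoded and decoded without an unknown-token fallback.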

byte-level BPE

Last Reviewed: 10/28/2025