Train a tokenizer
GPU-native Byte Pair Encoding on WebGPU compute shaders. Drop your text files, pick a vocabulary size, and train — everything runs on your GPU.
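For readers who want to see what the training step computes, here is a minimal CPU sketch of byte-level BPE training. It is an illustration only, not the tool's WebGPU implementation: the function names `train_bpe` and `merge`, the byte-level starting alphabet of 256 tokens, and the rule of stopping when no pair repeats are all assumptions.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`, left to right."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, vocab_size):
    """Learn merges until the vocabulary reaches `vocab_size` or no pair repeats."""
    ids = list(text.encode("utf-8"))  # assumed base vocabulary: the 256 byte values
    merges = {}                       # (left_id, right_id) -> new token id
    next_id = 256
    while next_id < vocab_size:
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent token pairs
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:                 # nothing left worth merging
            break
        merges[pair] = next_id
        ids = merge(ids, pair, next_id)
        next_id += 1
    return merges
```

A larger vocabulary size simply allows more merge rounds, so frequent byte sequences collapse into single tokens.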
Tokenize text
Load a trained vocabulary or train one first, then encode text into BPE tokens to see compression in action.
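As a rough sketch of what encoding does with a trained vocabulary, the example below applies learned merges to new text and reports bytes per token as a simple compression measure. The merge table, the `encode` and `bytes_per_token` names, and the "earliest-learned merge first" rule are assumptions made for illustration; the tool's vocabulary format is not described here.

```python
def encode(text, merges):
    """Encode text to token ids by applying merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    while len(ids) >= 2:
        # Choose the adjacent pair with the lowest merge id (learned earliest).
        pair = min(zip(ids, ids[1:]), key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break  # no known merge applies any more
        new_id = merges[pair]
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(new_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
    return ids


def bytes_per_token(text, merges):
    """Compression measure: UTF-8 bytes divided by number of tokens."""
    tokens = encode(text, merges)
    return len(text.encode("utf-8")) / len(tokens)


# Hypothetical two-entry merge table: "ab" -> 256, then "abab" -> 257.
example_merges = {(97, 98): 256, (256, 256): 257}
```

With this table, `encode("ababab", example_merges)` yields two tokens for six bytes of input.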
Pre-tokenize
Tokenize text files with your BPE vocabulary and export them as .bin for transformer training. Uses the vocabulary from the Train tab or a loaded .json.
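The layout of the exported .bin is not specified here. One common convention for pre-tokenized training data is a flat array of unsigned 16-bit little-endian token ids, and the sketch below assumes exactly that. The function names are hypothetical, and the 16-bit width only works for vocabularies of at most 65,536 tokens.

```python
import struct


def pack_tokens(token_ids):
    """Serialize token ids as a flat little-endian uint16 array (assumed layout)."""
    return struct.pack(f"<{len(token_ids)}H", *token_ids)


def unpack_tokens(data):
    """Inverse of pack_tokens: read two bytes per token id."""
    return list(struct.unpack(f"<{len(data) // 2}H", data))
```

Writing `pack_tokens(ids)` to a file opened in `"wb"` mode would produce such a file; check the tool's actual output before relying on this layout.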
Train a transformer
Tiny LLaMA-style transformer on WebGPU. Drop a tokenized .bin from the Pre-tokenize tab, pick a model size, and train. Gradients and AdamW updates all run on your GPU.
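For reference, the AdamW update for a single scalar parameter can be sketched as follows. This is the standard decoupled-weight-decay formulation written in plain Python, not the tool's shader code; the function name and the default hyperparameters are assumptions, and the values the tool actually uses are not stated here.

```python
import math


def adamw_step(param, grad, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter; t is the 1-based step count."""
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    param = param - lr * weight_decay * param
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

On a GPU the same arithmetic would run once per parameter element, with `m` and `v` kept in buffers the same shape as the weights.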