dead presidents
A character-level GPT trained on the speeches of US presidents. The whole model — 464,928 parameters as a 1.8 MB float32 binary — streams to your browser once and runs entirely in a Web Worker. There is no server-side inference and no API call.
It is not instruction-tuned and its context is only 192 characters, so it does not answer questions. It continues a short text seed in the cadence of the corpus. dead presidents has no database and no JSON API; it is a static page, so it sits apart from the content services.
How it works
The model ships as two static files served alongside the page:
- model.json: a manifest holding the architecture config, the 34-entry vocabulary (33 characters plus a BOS marker), and the byte offset of each tensor.
- model.bin: the float32 weights, fetched once and read into typed arrays.
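As a rough illustration of how a manifest of byte offsets can be used, the sketch below carves per-tensor views out of a single buffer. The manifest field names (`tensors`, `offset`, `shape`) and the helpers `sliceTensors` and `loadModel` are assumptions made for this example; the real model.json layout may differ.

```javascript
// Hypothetical manifest shape (an assumption, not the project's confirmed format):
// { tensors: { "<name>": { offset: <byte offset>, shape: [dims...] } } }

// Build one Float32Array view per tensor over a shared ArrayBuffer.
// Views are zero-copy: they point into the fetched buffer.
function sliceTensors(manifest, buffer) {
  const tensors = {};
  for (const [name, { offset, shape }] of Object.entries(manifest.tensors)) {
    const length = shape.reduce((a, b) => a * b, 1);
    // Float32Array views need a byte offset that is a multiple of 4.
    if (offset % 4 !== 0) {
      throw new RangeError(`${name}: unaligned offset ${offset}`);
    }
    if (offset + length * 4 > buffer.byteLength) {
      throw new RangeError(`${name}: runs past the end of the buffer`);
    }
    tensors[name] = { shape, data: new Float32Array(buffer, offset, length) };
  }
  return tensors;
}

// Fetch both static files and slice the weights.
// Assumes the files sit next to the page and the binary is little-endian.
async function loadModel(baseUrl = ".") {
  const manifest = await (await fetch(`${baseUrl}/model.json`)).json();
  const buffer = await (await fetch(`${baseUrl}/model.bin`)).arrayBuffer();
  return { manifest, tensors: sliceTensors(manifest, buffer) };
}
```

Because the views share the fetched buffer, no weights are copied after the download completes.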
A classic Web Worker loads a dependency-free inference engine, reconstructs the GPT-2-style transformer (RMSNorm, causal attention, tied output head), and runs KV-cached generation. The JS engine matches the source-of-truth Python model to about 1e-6, so the browser output is the real model, not an approximation.
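To make two of those pieces concrete, here is a minimal sketch of RMSNorm and of one KV-cached attention step for a single head. It is not the project's engine: the function names, the `cache` object shape, and the `eps` default of 1e-5 are assumptions for illustration.

```javascript
// RMSNorm: divide x by its root-mean-square, then apply a learned per-channel gain.
// The eps default is an assumed value, not taken from the project.
function rmsNorm(x, weight, eps = 1e-5) {
  let sumSq = 0;
  for (let i = 0; i < x.length; i++) sumSq += x[i] * x[i];
  const inv = 1 / Math.sqrt(sumSq / x.length + eps);
  const out = new Float32Array(x.length);
  for (let i = 0; i < x.length; i++) out[i] = x[i] * inv * weight[i];
  return out;
}

// One decoding step of single-head attention with a KV cache.
// cache = { keys: [], values: [] } holds one vector per position seen so far.
// Causality is implicit: the cache never contains future positions.
function attendStep(q, k, v, cache) {
  cache.keys.push(k);
  cache.values.push(v);
  const scale = 1 / Math.sqrt(q.length);
  const scores = cache.keys.map((key) => {
    let dot = 0;
    for (let i = 0; i < q.length; i++) dot += q[i] * key[i];
    return dot * scale;
  });
  // Numerically stable softmax over all cached positions.
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  const out = new Float32Array(v.length);
  cache.values.forEach((val, t) => {
    const w = exps[t] / sum;
    for (let i = 0; i < out.length; i++) out[i] += w * val[i];
  });
  return out;
}
```

With the cache in place, each new character costs one query against the stored keys and values rather than a full recomputation of the 192-character context.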
Model
Char-level, GPT-2-style: n_embd 96, n_layer 4, n_head 4, block_size 192, vocabulary 34 (33 characters + a BOS marker), validation loss 1.475 bits/char. Weights and training code live at github.com/iammatthias/dp.
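The parameter count can be checked against these figures. The breakdown below is one that reproduces the stated total of 464,928, under assumptions not confirmed by the text: learned position embeddings, no bias terms, a 4x-wide MLP, and two RMSNorm gain vectors per block plus a final one. The helper `countParams` is hypothetical.

```javascript
// Config values as stated in the text.
const config = { nEmbd: 96, nLayer: 4, nHead: 4, blockSize: 192, vocabSize: 34 };

// Count parameters under the assumed layout (not a confirmed architecture description).
function countParams({ nEmbd, nLayer, blockSize, vocabSize }) {
  const tokenEmbedding = vocabSize * nEmbd;            // reused by the tied output head
  const positionEmbedding = blockSize * nEmbd;         // assumed learned positions
  const attention = nEmbd * 3 * nEmbd + nEmbd * nEmbd; // QKV + output projection, no bias
  const mlp = 2 * nEmbd * 4 * nEmbd;                   // assumed 4x hidden width, no bias
  const norms = 2 * nEmbd;                             // two RMSNorm gains per block
  const finalNorm = nEmbd;
  return tokenEmbedding + positionEmbedding + nLayer * (attention + mlp + norms) + finalNorm;
}

const total = countParams(config);
const bytes = total * 4; // float32 is 4 bytes per parameter
console.log(total, bytes, (bytes / 2 ** 20).toFixed(2) + " MiB");
```

Under these assumptions the count is 464,928 parameters, or 1,859,712 bytes (about 1.77 MiB), which is consistent with the quoted 1.8 MB binary.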
Inference
Generation runs in the Worker so the page stays responsive, behind a small OpenAI-shaped client that streams chat.completion.chunk objects in which each "token" is one character. The seed is the text entered in the controls; temperature and a character budget steer the output. Everything happens locally: no wallet, no server, and no network traffic after the weights load.
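A minimal sketch of the two ideas in this section: temperature sampling over character logits, and wrapping each character in a chunk shaped like OpenAI's streaming format. The helper names, the `id` and `model` strings, and the choice to treat temperature 0 as greedy decoding are assumptions for illustration, not the project's actual client.

```javascript
// Sample an index from logits at a given temperature.
// rng is injectable so that runs can be made reproducible.
function sampleWithTemperature(logits, temperature, rng = Math.random) {
  if (temperature <= 0) {
    // Assumed convention: zero temperature means greedy decoding (argmax).
    let best = 0;
    for (let i = 1; i < logits.length; i++) if (logits[i] > logits[best]) best = i;
    return best;
  }
  let max = -Infinity;
  for (const l of logits) max = Math.max(max, l / temperature);
  // Softmax numerators; subtracting max keeps Math.exp from overflowing.
  const exps = Array.from(logits, (l) => Math.exp(l / temperature - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  let r = rng() * sum;
  for (let i = 0; i < exps.length; i++) {
    r -= exps[i];
    if (r < 0) return i;
  }
  return exps.length - 1; // guard against floating-point rounding
}

// Wrap one generated character in an OpenAI-style streaming chunk.
// The id and model values are placeholders.
function makeChunk(char, { id = "local-0", model = "dead-presidents", finishReason = null } = {}) {
  return {
    id,
    object: "chat.completion.chunk",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      {
        index: 0,
        delta: finishReason === null ? { content: char } : {},
        finish_reason: finishReason,
      },
    ],
  };
}
```

Lower temperatures sharpen the distribution toward the most likely character; a final chunk with an empty delta and a finish reason such as "length" could signal that the character budget ran out.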