and that is exactly how a predictive text algorithm works.
Some tokens go in.
They are processed by a deterministic, static statistical model, and a set of probabilities comes out (always the same for the same input; deterministic, remember?).
Pick the token with the highest probability, append it to your initial string, and start over.
If you want variety, add some randomness and don’t always pick the most probable next token.
Coincidentally, this is exactly how LLMs work. It’s a big Markov chain, but with a novel lossy compression algorithm on its state transition table. That lossy compression is also the reason why, if anyone says they can fix LLM hallucinations, they’re lying.
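
For anyone who wants to see the loop described above spelled out, here is a minimal sketch in Python. It is a toy, not how any real model is implemented: the corpus, the function names (`build_table`, `next_token_probs`, `generate`) and the choice to condition only on the single previous token are all assumptions made for illustration. A real LLM's state would be the whole context window, and its "table" is a neural network rather than a literal dictionary of counts.

```python
import random
from collections import Counter, defaultdict


def build_table(corpus):
    """Count which token follows which: a tiny, uncompressed transition table."""
    table = defaultdict(Counter)
    tokens = corpus.split()
    for current, following in zip(tokens, tokens[1:]):
        table[current][following] += 1
    return table


def next_token_probs(table, token):
    """Deterministic step: the same input token always gives the same probabilities."""
    counts = table.get(token, Counter())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def generate(table, start, steps, rng=None):
    """Repeatedly pick a next token and append it.

    rng=None      -> always pick the most probable token (greedy).
    rng=Random(s) -> sample according to the probabilities (adds variety).
    """
    out = [start]
    for _ in range(steps):
        probs = next_token_probs(table, out[-1])
        if not probs:  # the last token was never followed by anything
            break
        if rng is None:
            choice = max(probs, key=probs.get)
        else:
            choice = rng.choices(list(probs), weights=list(probs.values()))[0]
        out.append(choice)
    return " ".join(out)


# Hypothetical toy corpus, only here to make the example runnable.
table = build_table("the cat sat on the mat and the cat ate the fish")
greedy = generate(table, "the", 3)
sampled = generate(table, "the", 3, rng=random.Random(0))
```

Calling `generate` without an `rng` gives the same string every time, which is the "pick the highest probability" step. Passing a `random.Random` instance is the "add some randomness" step: the probabilities coming out of the model are unchanged, only the selection differs.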