Description
Markov chain is a procedural algorithm used to make coherent chains of values, instead of just random noise. It can be used from realistic temperatures to random names to matching colors or even notes of music. Markov chain is a probability table constructed from a sample, for example a name markov chain generator usually uses a database of hundreds names or words. From that data the algorithm creates a table wherein it stores the probability for some value to follow the other. i.e. the name "Karla" would be translated to 'K' -> 'a', 'a' -> 'r', 'r' -> 'l', 'l' -> 'a'. This is order 1 Markov chain generation. Order two would be (from the same example): 'Ka' -> 'r', 'ar' -> 'l', 'rl' -> 'a' (With probabilities for a given letter following two others instead of just one.) The order can increase almost infinitely, although procession time and the number of analyzed data often poses a limit. From this table of probabilities, the algorithm then chooses one variable, and then based on the probabilities, the one after it. (i.e. "a", which in our very limited example has a probability of 50 % to follow with 'r' and a 50 % probability of being followed with an empty space)
Code Example
Python example:
import random
table = {}
order = 2
print 'order:', order
def load(s):
for i in range(len(s) - order):
try:
table[s[i:i + order]]
except KeyError:
table[s[i:i + order]] = []
table[s[i:i + order]] += s[i + order]
load('bannanax')
print table
def generate(start = None, max_length = 20):
if start == None:
s = random.choice(table.keys())
else:
s = start
try:
while len(s) < max_length:
s += random.choice(table[s[-order:]])
except KeyError:
pass
return s
for i in range(5):
print generate()
Example outputs:
order: 0
{'': ['b', 'a', 'n', 'n', 'a', 'n', 'a', 'x']}
xaaxnnnbnnxananxxnnn
abnnnnanxbannaaaaana
nxnnxanbxanaaaanxnxa
anxbaaxnabnnbanaxaaa
nananaananbaanaanann
order: 1
{'a': ['n', 'n', 'x'], 'b': ['a'], 'n': ['n', 'a', 'a']}
nnnnnnanax
nnax
nanax
nnannax
ananananannanax
order: 3
{'ban': ['n'], 'ann': ['a'], 'nan': ['a'], 'ana': ['x'], 'nna': ['n']}
bannanax
annanax
nanax
anax
nanax
With the input string 'bannanax' any order of 3 or higher will yield only substrings of the original, since each input tuple has only one possible output.
External Links
Markov Chain - Wikipedia article on Markov Chains.
Poetry Links - Markov Generator - A Markov poetry generator based on words
Fun with Markov Chains - An example of using Markov chains to combine two texts, with code.
Generating Text - from the book Programming Pearls, uses suffix arrays to generate word or letter level markov text.
Garkov = Garfield + Markov chains.
Markov Chains explained visually - Several interactive demos of Markov Chains.
Markov Namegen - A configurable name generator based on Markov chains.