The basic idea of Perlin noise is to assign pseudo-random gradient vectors to the vertices of a regular cubic grid and to blend smoothly between the contributions of the vertices surrounding each sample point.
Multiple layers (octaves) can be summed, as in fractal Brownian motion, to produce a fractal appearance. Higher octaves add finer detail because they have a higher frequency (more grid points per unit of space), but they are usually given lower weights.
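The octave summation described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `fbm` and the parameters `lacunarity` (frequency multiplier per octave) and `gain` (weight multiplier per octave) are assumptions chosen for this example, and `stand_in_noise` is only a placeholder for a real noise function.

```python
import math


def fbm(noise, x, octaves=4, lacunarity=2.0, gain=0.5):
    """Sum several octaves of a 1D noise function (illustrative sketch).

    noise      -- any callable mapping a float to a float
    lacunarity -- assumed name: how much the frequency grows per octave
    gain       -- assumed name: how much the weight shrinks per octave
    """
    total = 0.0
    weight_sum = 0.0
    amplitude = 1.0
    frequency = 1.0
    for _ in range(octaves):
        # Each octave samples the same noise at a higher frequency
        # and contributes with a lower weight.
        total += amplitude * noise(x * frequency)
        weight_sum += amplitude
        amplitude *= gain
        frequency *= lacunarity
    # Divide by the summed weights so the result keeps the range of `noise`.
    return total / weight_sum


def stand_in_noise(x):
    """Placeholder only: a sine wave, NOT Perlin noise."""
    return math.sin(x)


if __name__ == "__main__":
    for i in range(5):
        x = i * 0.37
        print(x, fbm(stand_in_noise, x, octaves=5))
```

With `gain` below 1 the higher octaves contribute progressively less, which matches the usual weighting; passing a Perlin noise function instead of the placeholder gives the familiar fractal look.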
Ken Perlin designed simplex noise as a less visibly grid-aligned replacement for Perlin noise. It also scales better when generalized to higher dimensions.
Code Example
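The basic idea (pseudo-random gradients at grid vertices, blended smoothly) can be sketched in two dimensions as follows. This is a minimal sketch under stated assumptions rather than Ken Perlin's reference code: the helper names (`make_gradients`, `fade`, `lerp`, `perlin2d`) are hypothetical, the gradients are random unit vectors looked up through a shuffled permutation table, and the quintic fade curve is the one from Perlin's improved noise.

```python
import math
import random


def make_gradients(size=256, seed=0):
    """Hypothetical helper: a shuffled index table plus random unit gradients."""
    rng = random.Random(seed)
    perm = list(range(size))
    rng.shuffle(perm)
    grads = []
    for _ in range(size):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        grads.append((math.cos(angle), math.sin(angle)))
    return perm, grads


def fade(t):
    # Quintic smoothing curve 6t^5 - 15t^4 + 10t^3 (zero slope at t=0 and t=1).
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)


def lerp(a, b, t):
    return a + t * (b - a)


def perlin2d(x, y, perm, grads):
    """Gradient noise at (x, y); the value is 0 at every grid vertex."""
    n = len(perm)
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0  # position inside the grid cell, in [0, 1)

    def corner(ix, iy, dx, dy):
        # Pick a pseudo-random gradient for vertex (ix, iy) and take its
        # dot product with the offset from that vertex to the sample point.
        h = perm[(perm[ix % n] + iy) % n]
        gx, gy = grads[h]
        return gx * dx + gy * dy

    n00 = corner(x0, y0, fx, fy)
    n10 = corner(x0 + 1, y0, fx - 1.0, fy)
    n01 = corner(x0, y0 + 1, fx, fy - 1.0)
    n11 = corner(x0 + 1, y0 + 1, fx - 1.0, fy - 1.0)

    # Blend the four corner contributions with the smoothed cell coordinates.
    u, v = fade(fx), fade(fy)
    return lerp(lerp(n00, n10, u), lerp(n01, n11, u), v)


if __name__ == "__main__":
    perm, grads = make_gradients(seed=42)
    for j in range(4):
        print([round(perlin2d(i * 0.3, j * 0.3, perm, grads), 3) for i in range(4)])
```

Because the table has a fixed size, the pattern repeats every `size` grid cells in each direction; a larger table or a hash function would avoid that, at some cost in simplicity.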
External Links
Perlin noise - Wikipedia article on Perlin noise.
Making Noise - Ken Perlin's discussion of Perlin noise.
Procedural 3D Content Generation - Dean Macri and Kim Pallister: an Intel publication on using Perlin noise, notably for terrain mapping. (Only the first page is still available.)