Generating a pseudo-random 2d noise texture
Let's imagine an infinite 2d grid (or more realistically, a very large grid, larger than what I can reasonnably keep in memory), and to each node of that grid, we associate an integer value. Every time we look at one particular node of the grid, we want the same value associated to that node. The values of the grid's nodes should follow a random distribution. At least, it should look random to the naked eye : pseudo-random. Let's call this function GridNoise.
When working with procedural content generation, say, procedural textures, GridNoise is an essential building block. The integer associated to one grid node can be used to generate information. Then, between the nodes of the grid, one can interpolate those informations. This is how value noise and Perlin noise works, for instance. Indeed, pretty much all procedural textures uses GridNoise at their core. However, I never seen a clear, consistent method to build something like GridNoise. Since GridNoise is a low-level, essential building block, one might want something without obvious defects. I introduce here my recipe to build such a GridNoise function.
The idea is use the coordinates of a node to build a key, and to hash that key with a hash function to get our integer. A good hash function will give us the pseudo-random distribution for the integer values. Or will it ? Let's take a look. I use Pearson hashing : it's a very simple hash function, fast to compute, it have good properties, and it's dead simple to code. Pearson hashing works on array of 8 bits integers. Here it is my implementation of Pearson hashing in Python, as a functor.
import random
class PearsonHash:
def __init__(self, rnd=random.Random()):
self.permut = range(256)
rnd.shuffle(self.permut)
self.seed = rnd.choice(self.permut)
def __call__(self, array):
h = self.seed
for value in array:
h = self.permut[h ^ value]
return h
As you can see, because Pearson hash relies on a permutation table, it's not one single function, but a whole familly of functions ! So you can define plenty variant of it, you are not stuck with one single hash function. It's nice, we will be able to define different varieties of GridNoise.
Now, how to build a key from 2d integer coordinates ? Let's assume 16 bits coordinates (thus defining a 65536x65536 noise texture) for this example, it would be same with 32 bits. We can take the 16 bits of the X coordinate, the 16 bits of the Y coordinates, put them one after each other into a 32 bits value. 32 bits value, same as an array of 4x8 bytes values. If we have X=1011 and Y=0100, then the resulting 1d coordinate would be Z=10110100.
Let's try this ! Here it is an implementation in Python, assuming you have a Picture object
import array
phash = PearsonHash()
pic = Picture(512, 512)
tmp = array.array('B', [0] * 4)
for i in xrange(pic.h):
for k in xrange(2):
tmp[k+2] = ((i >> (8 * k)) & 255)
for j in xrange(pic.w):
for k in xrange(2):
tmp[k] = (j >> (8 * k)) & 255
pic.pixel(j, i) = phash(tmp)
Hum... It looks like white noise, but we can see some really nasty artifacts. You don't see it ? Let me show you that !
We can see vertical bands, 256 pixels wide. Each band looks noisy, but not so random, the bands have a distinct character. Not so random indeed... We can confirm what is just a visual test by something a bit more rigorous. If we try to compress this picture, throwing lot's of CPU time at this, we find out if there is some obvious repeating pattern. I used optipng for this : it reduced the picture data size by 90%... So our random data are not very random. Random data are data which most compact description are themselves. We did something wrong, but what ? Pearson hashing is innocent, it's a know, well-test hashing function. So the problem is with the way we built our key from node's coordinate. If we reverse X and Y in the 4 bytes array, it just turns the vertical bands into horizontal bands. It does not compress very well... Rotate this by 90 degrees, and suddenly, it become very easily compressible, also around 90% size reduction with optipng.
So we have to built our key completly differently. Struck by an Eureka instant, I got this intuition : use the Z curve ! The Z what ?!
The Z-curve is a space-filling curve, defining a map between 2d coordinates and 1d coordinates. The 2d coordinates are n n bits numbers. The 1d coordinate is a 2n 2n bits number, built by interleaving the bits of the 2d coordinates. If we have X=1011 and Y=0100, then the resulting 1d coordinate would be Z=10011010. The curve shown above is built linking the nodes by increasing Z. Here it is a naive implementation of the 2d to 1d conversion for the Z curve, assuming 16 bits per coordinates.
def mix(x, y):
i = 0
for j in xrange(16):
i |= (x & 1) << (2 * j)
x >>= 1
i |= (y & 1) << (2 * j + 1)
y >>= 1
return i
Because of its shape, I believed that the Z curve would scramble those nasty patterns we got. It would give something that looks and feel much more like actual noise. The code to generate the picture becomes
import array
phash = PearsonHash()
pic = Picture(512, 512)
tmp = array.array('B', [0] * 4)
for i in xrange(pic.h):
for j in xrange(pic.w):
z = mix(j, i)
for k in xrange(4):
tmp[k] = (z >> (8 * k)) & 255
pic.pixel(j, i) = phash(tmp)
And here it is a sample of what we get from this
No visible artifacts ! Using a clever & expensive compresser does not reduce the size at all ! We did it, we have a very good GridNoise function. For demonstration purpose, I used 16 bits coordinates for the grid nodes, thus GridNoise is a procedure 65536x65536 noise texture. If we use 32 bits, or even 64 bits coordinates, then we got a Universe-scale noise texture to work with. I might have overlooked some awfull flaw, I would be glad to know it if anybody notice that.
The approach generalize well to N dimensions. Z-order is defined in N dimensions : just interleave bits as we did for 2 dimensions. You might have a look here to see how to interleave bits efficiently. The Z curve can be replaced by any decent space-filling curve. However, the Z curve is maybe the cheapest to compute I know, while being close to the best ones I know, the Hilbert curve and the Peano curve.