Flat files are fine for smaller crawls, however the structure begins to
break down for long crawls. Directories with millions of small files are
slow to interact with, especially on traditional hard disks. LevelDB
requires a few extra dependencies.
A third (unrelated) dependency is added to fix the install of Pillow.
The 32 bit hash is likely to have at least a few collisions
over 1 million sites. To avoid this we use the fast murmur3_x64_128
hash from pyhash and mask it to 64 bits. This requires a few
additional dependencies, which are included in the install script.