You can think of a hash table as a dynamic array, with a really strange policy for inserting items.
Hash Functions
Many hash functions follow the same pattern. You start with some initial hash value, usually a constant with certain carefully chosen mathematical properties. Then you walk the data to be hashed. For each byte, you mix the bits into the hash value somehow, and then scramble the resulting bits around some.
FNV-1a
Techniques
See: double hashing, cuckoo hashing, Robin Hood hashing, linear probing
Glossary
Collision
:Occurs when the calculated index (i.e. hash) of two elements in a hashtable are the same.
Load Factor
:The ratio between the size of a hash table, and the number of elements it contains
Separate Chaining
:A strategy for resolving collisions between elements wherein each bucket of a hash table points to a linked-list of elements, rather than a single position.
It's simple to implement, but tends to be memory inefficient.
-
Open Addressing
Also known as "closed hashing"
A hashing technique wherein each element is stored directly in the table
Upon collision, the table probes for the next available bucket.
The simplest form of probing is linear probing, in which the table walks its internal array looking for the first open bucket.
-
Clustering
The presence of multiple entries in one section of the table
High degrees of clustering in a table suggest a different hashing algorithm or probing algorithm may be appropriate.