How To: Constructing a Pretty Quick Hash Table

May 21st, 2024

Introduction

This is more of a code-drop, than a formal explanation of the concept. I’ve recently got a chance to work on this and thought I’d share with the world, while also documenting it for future reference.
As the title mentions, the idea is to construct a quick hash table that prioritizes speed, rather than being a general purpose hash-table. If you are familiar with workings of the hash-table, you would probably get most out of this.
The code provided should work right off the shelf. You just gotta add it to your project and place both the files in same folder.

Implementation Details

I’ve supported following operations in my implementation:

Insert
Delete
Search
Read-Only Access using overloaded indexing operator

The hash-table has following the features:

Table prioritizes fast search/retrieval operations.
Re-Hashing occurs only during the insert operation
Open-Addressing is used to resolve hash collisions.
Single data blob is used to store the Key-Value data and is maintained by Hash-Table itself.
Tries to Rehash if there’s too much probing or exceeds the load factor defined.
Load factor should always be less than 1.
If table is optimized for minimal probing, it’ll be faster than collision resolution by chaining.
To minimize number of times table is probed, each key is split into 3 partitions. So, key is basically a logical construct that’s a combination of multiple partitions. The partition is where the Key-Value data resides in.

There’s a small performance analysis comparing the implementation to STL’s unordered map.

Code

HashTable.h

// HashTable.h
#ifndef _H_HASHTABLE
#define _H_HASHTABLE

#include <assert.h>
#include <math.h>
#include <utility>

namespace pd
{
	/*
	* Supports Insert, Delete, Search.
	* Delete assumes there's a valid entry present. Use search to query before deleting if not sure.
	* Cannot modify entries once inserted
	* Supports insertion of multiple similar valued keys, but no way to iterate over them.
	* Entries are unordered during iteration
	* Supports Range-Based for loop through a simple Iterator implementation.
	* 
	* Uses Open-Addressing with double hashing.
	* Tries to Rehash if there's too much probing or if load factor goes beyond the max allowed value.
	* Load factor should always be less than 1.
	* With minimal probing, should be faster than collision resoluton by chaining.
	*/
	template<typename _KEY
		, typename _VALUE
		, typename _HASH = std::hash<_KEY>
	>
	class HashTable
	{
		struct Data
		{
			enum State : uint8_t
			{
				Invalid = 0,
				Occupied = 1,
				Deleted = 2
			};
			struct DataPair
			{
				DataPair(_KEY pKey, _VALUE pValue)
					: key(pKey)
					, value(pValue)
				{}
				_VALUE		value;
				_KEY		key;
			};

			Data(){}
			Data(_KEY pKey, _VALUE pValue)
			: state(Invalid)
			, pair(DataPair(pKey, pValue))
			{}
			Data(const Data&) = default;
			DataPair	pair;
			State		state;
		};

		using PData = Data*;
		using C_PData = const Data*;
		using DataRef = Data&;
		using C_DataRef = const Data&;
		using KeyRef = _KEY&;
		using C_KeyRef = const _KEY&;
		using ValueRef = _VALUE&;
		using C_ValueRef = const _VALUE&;
		
		using PairRef = typename Data::DataPair&;
		using C_PairRef = const typename Data::DataPair&;

#define AS_DATAPTR(buffer) ((PData)buffer)

	public:
		class Iterator
		{
		public:
			Iterator(const Data* const buffer, uint32_t size, uint32_t index);
			
			/*Operator Overload ops*/
			Iterator& operator++();
			Iterator& operator--();
			C_PairRef operator*() const;
			bool operator==(const Iterator& other) const;
			bool operator!=(const Iterator& other) const;
		private:
			uint32_t				m_index = 0;
			uint32_t				m_size = 0;
			const Data* const		m_buffer = nullptr;
		};

	public:
		HashTable();
		HashTable(uint32_t requiredCapacity);
		HashTable(const HashTable<_KEY, _VALUE, _HASH>& other);
		HashTable operator=(const HashTable<_KEY, _VALUE, _HASH>& other);
		~HashTable();

		void Insert(C_KeyRef key, C_ValueRef value);
		void Remove(C_KeyRef key);
		const Iterator Search(C_KeyRef key) const;

		const Iterator begin() const;
		const Iterator end() const;
		C_PairRef operator[](C_KeyRef key);
	private:

		/* 
		* Inserts into the hash table. Rehashes if necessary
		* Asserts if insertion fails
		*/
		void Insert_Internal(DataRef data);
		void Remove_Internal(C_KeyRef key);
		const Iterator Search_Internal(C_KeyRef key) const;

		/* If conditions specified here are met, table will be rehased with old buffer invalidated*/
		void RehashIfRequired();
		/* Requests new memory from OS and rehashes all entries in old buffer to new one*/
		void Rehash();
		bool MaxLoadExceeded() const;
		/* capacity request here is to increase number of keys. Actual capacity will be calculated from it
		Capacity requested must be greater than currently available. Increases in powers of 2*/
		void RequestCapacity(uint32_t capacity);
		void* RequestMemory() const;

	public:
		//Debug functions
		void Print();
		void PrintTableVitals() const;
		void ProbeWholeTable() const;

	private:
		float					m_currentLoad		= 0;
		uint32_t				m_numKeys			= 0;
		uint32_t				m_size				= 0;
		uint32_t				m_capacity			= 0;
		uint32_t				m_maxProbes			= 0;
		void*					m_buffer			= nullptr;

		float					m_maxLoadAllowed	= 0.75;
		const _HASH				c_HashFunObj;
		const uint8_t			c_NumPartitions		= 3;
	};


	/*
	* m should be a power of 2
	* Result of this should be a relatively prime to m. Only then the whole table could be traversed reliably
	* Should return value in range [1 m-1] that's relatively prime to m
	*/
	inline uint32_t secondaryHash(uint32_t m, uint32_t key)
	{
		const uint32_t p = 429484801; // Random prime number
		uint32_t b = (key * p) % (m - 1); // b must be less than m. Doesnt matter even if multiplication overflows
		const uint32_t res = b | 1; // res should be odd. Powers of 2 are relatively prime to all odd numbers
		return res;
	}
	inline uint32_t primaryHash(uint32_t m, uint32_t key)
	{
		return key % m;
	}

	inline uint32_t hash(uint32_t i, uint32_t m, uint32_t key)
	{
		return (primaryHash(m, key) + i * secondaryHash(m, key)) % m;
	}

#include "HashTable.inl"
}

#endif

HashTable.inl

// HashTable.inl
#include "HashTable.h"

#define HASHTABLE_TEMPLATE_DECL template<typename _KEY, typename _VALUE, typename _HASH>
#define HASHTABLE_TEMPLATE_TYPES _KEY, _VALUE, _HASH
#define HASHTABLE_CLASS HashTable<HASHTABLE_TEMPLATE_TYPES>

#define HASHTABLE_ITERATOR_TYPE typename HashTable<HASHTABLE_TEMPLATE_TYPES>::Iterator
#define HASHTABLE_ITERATOR_CLASS HashTable<HASHTABLE_TEMPLATE_TYPES>::Iterator
#define HASHTABLE_DATA_TYPE typename HashTable<HASHTABLE_TEMPLATE_TYPES>::Data
#define HASHTABLE_DATA_CLASS HashTable<HASHTABLE_TEMPLATE_TYPES>::Data
#define HASHTABLE_DATA_PAIR_TYPE typename HashTable<HASHTABLE_TEMPLATE_TYPES>::Data::DataPair
#define HASHTABLE_DATA_PAIR_CLASS HashTable<HASHTABLE_TEMPLATE_TYPES>::Data::DataPair

// Hash Table ===========================================================================
HASHTABLE_TEMPLATE_DECL
HASHTABLE_CLASS::HashTable()
{
	RequestCapacity(10);
}

HASHTABLE_TEMPLATE_DECL
HASHTABLE_CLASS::HashTable(uint32_t requiredCapacity)
{
	RequestCapacity(requiredCapacity);
}

HASHTABLE_TEMPLATE_DECL
HASHTABLE_CLASS::HashTable(const HASHTABLE_CLASS& other)
	: m_currentLoad(other.m_currentLoad)
	, m_numKeys(other.m_numKeys)
	, m_size(other.m_size)
	, m_capacity(other.m_capacity)
	, m_maxProbes(other.m_maxProbes)
	, m_maxLoadAllowed(other.m_maxLoadAllowed)
	, c_HashFunObj(other.c_HashFunObj)
	, c_NumPartitions(other.c_NumPartitions)
{
	RequestCapacity(m_numKeys);
	if (m_buffer != nullptr)
	{
		memcpy(m_buffer, other.m_buffer, m_capacity * sizeof(Data));
	}
}

HASHTABLE_TEMPLATE_DECL
HASHTABLE_CLASS HASHTABLE_CLASS::operator=(const HASHTABLE_CLASS& other)
{
	if (m_buffer != nullptr)
	{
		free(m_buffer);
	}
	RequestCapacity(other.m_numKeys);
	for (auto itr : other)
	{
		Data data(itr.key, itr.value);
		Insert_Internal(data);
	}
	return HASHTABLE_CLASS(other);
}

HASHTABLE_TEMPLATE_DECL
HASHTABLE_CLASS::~HashTable()
{
	if (m_buffer != nullptr)
	{
		delete m_buffer;
	}
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::Insert(C_KeyRef key, C_ValueRef value)
{
	Data data(key, value);
	Insert_Internal(data);
	RehashIfRequired();
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::Remove(C_KeyRef key)
{
	Remove_Internal(key);
}

HASHTABLE_TEMPLATE_DECL
const HASHTABLE_ITERATOR_TYPE HASHTABLE_CLASS::Search(C_KeyRef key) const
{
	return Search_Internal(key);
}

HASHTABLE_TEMPLATE_DECL
const HASHTABLE_ITERATOR_TYPE HASHTABLE_CLASS::begin() const
{
	return Iterator(AS_DATAPTR(m_buffer), m_capacity, 0);
}

HASHTABLE_TEMPLATE_DECL
const HASHTABLE_ITERATOR_TYPE HASHTABLE_CLASS::end() const
{
	return Iterator(AS_DATAPTR(m_buffer), m_capacity, m_capacity);
}

HASHTABLE_TEMPLATE_DECL
const HASHTABLE_DATA_PAIR_TYPE& HASHTABLE_CLASS::operator[](C_KeyRef key)
{
	if (Search_Internal(key) == end())
	{
		Data data(key, _VALUE());
		Insert_Internal(data);
	}
	return *Search(key);
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::Insert_Internal(DataRef data)
{
	uint32_t k = 0;
	bool found = false;

	while (k < m_numKeys && !found)
	{
		// [0 .. m_Keys-1]
		uint32_t keyHash = (uint32_t)c_HashFunObj(data.pair.key);
		uint32_t tableKey = hash(k, m_numKeys, keyHash);
		uint32_t partitionKey = tableKey * c_NumPartitions;
		for (uint32_t i = 0; i < c_NumPartitions; ++i)
		{
			uint32_t dataIndex = partitionKey + i;
			//Search for empty slot
			if (AS_DATAPTR(m_buffer)[dataIndex].state != Data::Occupied)
			{
				Data* newData = new (AS_DATAPTR(m_buffer) + dataIndex) Data(data);
				newData->state = Data::Occupied;
				found = true;
				break;
			}
		}
		++k;
	}
	assert(found && "Hash Table corrupted. Could not insert the provided data");

	++m_size;
	m_currentLoad = (float)m_size / (float)m_capacity;

	if (k > m_maxProbes)
	{
		m_maxProbes = k;
	}
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::Remove_Internal(C_KeyRef key)
{
	uint32_t k = 0;

	bool deleted = false;
	while (k < m_numKeys && !deleted)
	{
		// Gets expensive if load factor inches close to 1
		uint32_t keyHash = (uint32_t)c_HashFunObj(key);
		uint32_t tableKey = hash(k, m_numKeys, keyHash);
		uint32_t partitionKey = tableKey * c_NumPartitions;
		for (uint32_t i = 0; i < c_NumPartitions; ++i)
		{
			uint32_t dataIndex = partitionKey + i;
			if (AS_DATAPTR(m_buffer)[dataIndex].state == Data::Occupied
				&& AS_DATAPTR(m_buffer)[dataIndex].pair.key == key)
			{
				AS_DATAPTR(m_buffer)[dataIndex].state = Data::Deleted;
				deleted = true;
				break;
			}
		}

		if (deleted) break;

		++k;
		assert(k < m_maxProbes && "Table corrupt. Invalid times probed while deleting");
	}

	assert(deleted && "Trying to delete a non-existent key from HashTable.");
	--m_size;
	m_currentLoad = (float)m_size / (float)m_capacity;
}

HASHTABLE_TEMPLATE_DECL
const HASHTABLE_ITERATOR_TYPE HASHTABLE_CLASS::Search_Internal(C_KeyRef key) const
{
	uint32_t k = 0;

	while (k < m_numKeys)
	{
		// Gets expensive if load factor inches close to 1
		uint32_t keyHash = (uint32_t)c_HashFunObj(key);
		uint32_t tableKey = hash(k, m_numKeys, keyHash);
		uint32_t partitionKey = tableKey * c_NumPartitions;
		for (uint32_t i = 0; i < c_NumPartitions; ++i)
		{
			uint32_t dataIndex = partitionKey + i;
			if (AS_DATAPTR(m_buffer)[dataIndex].state == Data::Occupied
				&& AS_DATAPTR(m_buffer)[dataIndex].pair.key == key)
			{
				return Iterator(AS_DATAPTR(m_buffer), m_capacity, dataIndex);
			}
		}

		++k;
		// We've exceeded the max number of probes registered. No way the required key exists
		if (k > m_maxProbes)
		{
			return end();
		}
	}

	return end();
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::RehashIfRequired()
{
	// If we are probing alot, it's time to Re-Hash
	if ((m_maxProbes > m_numKeys / 2) && (m_size > m_capacity / 4))
	{
		std::cout << "Rehashing. Reasion: Too many probes into table\n";
		PrintTableVitals();
		Rehash();
		return;
	}
	//If max load goes beyond the threshold, it's also time to rehash
	if (MaxLoadExceeded())
	{
		std::cout << "Rehashing. Reason: Max Load Exceeded\n";
		PrintTableVitals();
		Rehash();
		return;
	}
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::Rehash()
{
	//This is fater compared to when chaining is used
	uint32_t prevCapacity = m_capacity;
	uint32_t prevNumKeys = m_numKeys;
	void* oldBuffer = m_buffer;
	RequestCapacity(m_numKeys << 1);

	m_maxProbes = 0;
	m_size = 0;
	m_currentLoad = 0;
	assert(oldBuffer != nullptr && "It's not possible that old buffer is null.");

	//Insert all valid elements from old buffer into this
	for (uint32_t i = 0; i < prevCapacity; ++i)
	{
		DataRef cData = AS_DATAPTR(oldBuffer)[i];
		if (cData.state == Data::Occupied)
		{
			Insert_Internal(cData);
		}
	}
	free(oldBuffer);
}

HASHTABLE_TEMPLATE_DECL
bool HASHTABLE_CLASS::MaxLoadExceeded() const
{
	return m_currentLoad >= m_maxLoadAllowed;
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::RequestCapacity(uint32_t capacity)
{
	//TODO: Add a upper bound check
	assert(capacity < (1ul << (31ul - c_NumPartitions)) && "Cannot allocated requested capacity. Table will overflow");
	m_numKeys = 1 << (uint32_t)std::ceil(std::log2f((float)capacity));
	m_capacity = m_numKeys * c_NumPartitions;
	m_buffer = RequestMemory();
}

HASHTABLE_TEMPLATE_DECL
void* HASHTABLE_CLASS::RequestMemory() const
{
	assert(c_NumPartitions > 0 && c_NumPartitions <= 5 && "Unsupported number of partitions given");

	uint32_t size = (m_numKeys * c_NumPartitions) * sizeof(Data);
	assert(size % alignof(Data) == 0 && "Size should be a multiple of alignment");
	void* data = malloc(size);
	assert(data != nullptr && "OS couldnt grant us required memory");
	if (data != nullptr)
	{
		memset(data, 0, size);
	}
	return data;
}

//=======================================================================================
// Iterator =============================================================================

HASHTABLE_TEMPLATE_DECL
HASHTABLE_ITERATOR_CLASS::Iterator(const Data* const buffer, uint32_t size, uint32_t index)
	: m_buffer(buffer)
	, m_size(size)
	, m_index(index)
{
	//Get to the first valid element
	while (m_buffer[m_index].state != Data::Occupied && m_index < m_size)
	{
		++m_index;
	}
}

HASHTABLE_TEMPLATE_DECL
HASHTABLE_ITERATOR_TYPE& HASHTABLE_ITERATOR_CLASS::operator++()
{
	++m_index;
	while (m_buffer[m_index].state != Data::Occupied && m_index < m_size)
	{
		++m_index;
	}
	return *this;
}

HASHTABLE_TEMPLATE_DECL
HASHTABLE_ITERATOR_TYPE& HASHTABLE_ITERATOR_CLASS::operator--()
{
	--m_index;
	while (m_buffer[m_index].state != Data::Occupied && m_index >= 0)
	{
		--m_index;
	}
	return *this;
}

HASHTABLE_TEMPLATE_DECL
const HASHTABLE_DATA_PAIR_TYPE& HASHTABLE_ITERATOR_CLASS::operator*() const
{
	return (PairRef)m_buffer[m_index].pair;
}

HASHTABLE_TEMPLATE_DECL
bool HASHTABLE_ITERATOR_CLASS::operator==(const Iterator& other) const
{
	return m_index == other.m_index;
}

HASHTABLE_TEMPLATE_DECL
bool HASHTABLE_ITERATOR_CLASS::operator!=(const Iterator& other) const
{
	return !(*this == other);
}

//=======================================================================================
// Debug Functions ======================================================================

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::Print()
{
	PrintTableVitals();

	for (auto itr : (*this))
	{
		std::cout << "Key: " << itr.key << "\tValue: " << itr.value << "\n";
	}
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::PrintTableVitals() const
{
	std::cout << "=================== Printing Vitals ======================\n";
	std::cout << "Num Keys: " << m_numKeys << "\n";
	std::cout << "Num Partitions: " << (uint32_t)c_NumPartitions << "\n";
	std::cout << "Max capacity: " << m_capacity << "\n";
	std::cout << "Size: " << m_size << "\n";
	std::cout << "Max Probes: " << m_maxProbes << "\n";
	std::cout << "Max Load allowed: " << m_maxLoadAllowed << "\n";
	std::cout << "Current Load: " << m_currentLoad << "\n";
	std::cout << "==========================================================\n";
}

HASHTABLE_TEMPLATE_DECL
void HASHTABLE_CLASS::ProbeWholeTable() const
{
	srand((uint32_t)time(0) % 7);
	uint32_t key = rand();

	std::cout << "Started Probing with Key: " << key << "\n";
	uint32_t k = 0;
	while (k < m_numKeys)
	{
		uint32_t hashKey = hash(k, m_numKeys, key);
		uint32_t partitionKey = hashKey * c_NumPartitions;
		std::cout << "Probed Slot: " << hashKey << "\n";
		for (int i = 0; i < c_NumPartitions; ++i)
		{
			uint32_t dataIndex = partitionKey + i;
			AS_DATAPTR(m_buffer)[dataIndex] = { 0, 0, Data::Occupied };
		}
		++k;
	}

	bool wholeTableProbed = true;
	for (uint32_t i = 0; i < m_capacity; ++i)
	{
		if (AS_DATAPTR(m_buffer)[i].state != Data::Occupied)
		{
			wholeTableProbed = false;
			break;
		}
	}
	if (wholeTableProbed)
	{
		std::cout << "Finished probing whole table\n";
	}
	else
	{
		std::cout << "Could not probe whole table. Hash function would lead to bubbles in table\n";
	}
}

//=======================================================================================

Performance Analysis

I’ve done some performance analysis using the following code below:

    HashTable<int, int> hashTable(10);
    std::unordered_map<int, int> stlMap(10);

    // Insert
    auto start = std::chrono::system_clock::now();
    for (int i = 0; i < 10000; ++i)
    {
        hashTable.Insert(i, i);
    }
    auto end = std::chrono::system_clock::now();
    std::cout << "Insert Time for custom HashTable: " << (end - start).count() << "\n";

    start = std::chrono::system_clock::now();
    for (int i = 0; i < 10000; ++i)
    {
        stlMap.insert({ i, i });
    }
    end = std::chrono::system_clock::now();
    std::cout << "Insert Time for STL HashTable: " << (end - start).count() << "\n";

    //Delete
    start = std::chrono::system_clock::now();
    for (int i = 0; i < 5000; ++i)
        hashTable.Remove(i);
    }
    end = std::chrono::system_clock::now();
    std::cout << "Deletion Time for custom HashTable: " << (end - start).count() << "\n";

    start = std::chrono::system_clock::now();
    for (int i = 0; i < 5000; ++i)
    {
        stlMap.erase(stlMap.find(i));
    }
    end = std::chrono::system_clock::now();
    std::cout << "Deletion Time for STL HashTable: " << (end - start).count() << "\n";

For initial size of 10, following is the console output:
– Insert Time for custom HashTable: 396716
– Insert Time for STL HashTable: 502560
– Deletion Time for custom HashTable: 26462
– Deletion Time for STL HashTable: 107265

Upon increasing the initial size of our hash table to 4096, there’s minimal re-hashing. The insert time also improved considerably!!
Following is the output with initial size set to 4096 on both tables:
– Insert Time for custom HashTable: 128295
– Insert Time for STL HashTable: 488191
– Deletion Time for custom HashTable: 18199
– Deletion Time for STL HashTable: 106090

The time taken for deletion remains consistent in both the cases and is quite faster than the STL hash table.
It’s definitely not a general purpose table, but will be quite useful on performance critical portions of code.

May 21, 2024

Data Structures

Adroit Things

How To: Constructing a Pretty Quick Hash Table

Introduction

Implementation Details

Code

Performance Analysis

Leave a Reply Cancel reply