Consistent Hashing in JavaScript (Node.js) - AI Infrastructure & Distributed Systems
Consistent hashing is a crucial technique widely used in AI infrastructure and distributed systems to achieve load balancing and maintain data distribution across nodes efficiently. This is particularly important for AI microservices architecture, LLM deployment, and scalable AI applications. In this blog post, we’ll explore a comprehensive implementation of consistent hashing in Node.js, specifically tailored for AI infrastructure needs.
Why Consistent Hashing Matters for AI Applications
In the context of AI development and machine learning systems, consistent hashing becomes essential for:
- AI Model Distribution: Distributing AI models across multiple servers
- Vector Database Sharding: Managing large-scale vector databases for AI applications
- LLM Load Balancing: Ensuring optimal performance for Large Language Model deployments
- AI Microservices: Managing AI service instances in microservices architecture
- Real-time AI Processing: Maintaining consistent performance for AI inference workloads
Background
Consistent hashing addresses the challenge of redistributing data when the number of nodes in a distributed system changes. This is particularly critical in AI infrastructure, where model weights, embeddings, and vector data need to be distributed efficiently. With traditional modulo hashing, where a key is placed on node hash(key) % N, changing N remaps most keys, so adding or removing a node triggers large-scale data migration. That is especially problematic for AI applications that require consistent performance and minimal downtime. Consistent hashing instead moves only the keys that belonged to, or now belong to, the node that changed.
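To make the migration cost of modulo hashing concrete, here is a minimal sketch that counts how many keys change nodes when a cluster grows from 4 to 5 nodes. The helper names (hash32, moduloNode, fractionMoved) and the sample keys are made up for illustration. For a uniform hash, about 80% of keys are expected to move in this case.

```javascript
const crypto = require("crypto");

// Hypothetical helper: map a string key to a non-negative integer
// using the first 8 hex digits (32 bits) of its SHA-256 digest.
function hash32(key) {
  const hex = crypto.createHash("sha256").update(key, "utf-8").digest("hex");
  return parseInt(hex.slice(0, 8), 16);
}

// Naive placement: the node index is the hash modulo the node count.
function moduloNode(key, nodeCount) {
  return hash32(key) % nodeCount;
}

// Fraction of keys whose node index changes when the cluster is resized.
function fractionMoved(keys, before, after) {
  const changed = keys.filter(
    (k) => moduloNode(k, before) !== moduloNode(k, after)
  );
  return changed.length / keys.length;
}

// Made-up keys, for illustration only.
const sampleKeys = Array.from({ length: 10000 }, (_, i) => `embedding-${i}`);
const moved = fractionMoved(sampleKeys, 4, 5);
console.log(
  `Keys remapped going from 4 to 5 nodes: ${(moved * 100).toFixed(1)}%`
);
```

A key keeps its node only when hash % 4 equals hash % 5, which is why so few keys stay put.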
The Implementation
Let’s delve into a simple implementation of consistent hashing using Node.js. The code includes classes for StorageNode and ConsistentHashing, along with functions for adding and removing nodes, as well as assigning items to nodes.
const crypto = require("crypto");

// A storage node identified by a display name and a host address.
class StorageNode {
  constructor(name, host) {
    this.name = name;
    this.host = host;
  }

  putFile(path) {
    // Placeholder: a real node would write the file to its own storage.
    console.log(`File ${path} stored on node ${this.name}`);
  }

  fetchFile(path) {
    // Placeholder: a real node would read the file from its own storage.
    console.log(`File ${path} fetched from node ${this.name}`);
  }
}

class ConsistentHashing {
  constructor(totalSlots) {
    this.totalSlots = totalSlots;
    this.nodes = []; // nodes, ordered by their position on the ring
    this.keys = []; // ring positions, sorted ascending, parallel to this.nodes
  }

  hashFn(key) {
    const hash = crypto.createHash("sha256").update(key, "utf-8").digest("hex");
    // A SHA-256 digest is 256 bits, far beyond the 53 bits a Number can hold
    // exactly, so reduce it with BigInt before converting back to a Number.
    return Number(BigInt("0x" + hash) % BigInt(this.totalSlots));
  }

  addNode(node) {
    if (this.keys.length === this.totalSlots) {
      throw new Error("Hash space is full");
    }
    const key = this.hashFn(node.host);
    if (this.keys.includes(key)) {
      throw new Error("Collision occurred");
    }
    // Insert before the first larger key; append when no key is larger.
    let index = this.keys.findIndex((k) => k > key);
    if (index === -1) {
      index = this.keys.length;
    }
    this.nodes.splice(index, 0, node);
    this.keys.splice(index, 0, key);
    return key;
  }

  removeNode(node) {
    if (this.keys.length === 0) {
      throw new Error("Hash space is empty");
    }
    const key = this.hashFn(node.host);
    const index = this.keys.findIndex((k) => k === key);
    if (index === -1) {
      throw new Error("Node does not exist");
    }
    this.keys.splice(index, 1);
    this.nodes.splice(index, 1);
    return key;
  }

  assign(item) {
    if (this.keys.length === 0) {
      throw new Error("Hash space is empty");
    }
    const key = this.hashFn(item);
    // Pick the first node past the item's position; wrap around to the
    // first node when the item hashes beyond the last node on the ring.
    const index = this.keys.findIndex((k) => k > key);
    return this.nodes[index === -1 ? 0 : index];
  }
}

// Example usage
// A large hash space makes it very unlikely that two nodes hash to the same
// slot. With only 5 slots, five nodes would almost certainly collide and
// addNode would throw.
const consistentHashing = new ConsistentHashing(2 ** 32);
const nodes = [
  new StorageNode("A", "10.131.213.12"),
  new StorageNode("B", "10.131.217.11"),
  new StorageNode("C", "10.131.142.46"),
  new StorageNode("D", "10.131.114.17"),
  new StorageNode("E", "10.131.189.18"),
];
nodes.forEach((node) => consistentHashing.addNode(node));

const fileToUpload = "example.txt";
const assignedNode = consistentHashing.assign(fileToUpload);
assignedNode.putFile(fileToUpload);
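The key property of this scheme is that removing a node only relocates the items that node owned. The following standalone sketch checks that property. It does not reuse the classes above: ringPosition, buildRing, and ownerOf are hypothetical helpers, it uses only the first 32 bits of the digest as the ring position, and the file names are invented.

```javascript
const crypto = require("crypto");

// Assumed ring position: the first 32 bits of the SHA-256 digest.
function ringPosition(key) {
  const hex = crypto.createHash("sha256").update(key, "utf-8").digest("hex");
  return parseInt(hex.slice(0, 8), 16);
}

// Build a ring: an array of { position, host } sorted by position.
function buildRing(hosts) {
  return hosts
    .map((host) => ({ position: ringPosition(host), host }))
    .sort((a, b) => a.position - b.position);
}

// Owner of a key: first node with a larger position, wrapping to the start.
function ownerOf(ring, key) {
  const position = ringPosition(key);
  const entry = ring.find((e) => e.position > position);
  return (entry || ring[0]).host;
}

const hosts = [
  "10.131.213.12",
  "10.131.217.11",
  "10.131.142.46",
  "10.131.114.17",
  "10.131.189.18",
];
const removedHost = "10.131.142.46";
const fullRing = buildRing(hosts);
const smallerRing = buildRing(hosts.filter((h) => h !== removedHost));

// Invented file names, for illustration only.
const files = Array.from({ length: 5000 }, (_, i) => `file-${i}.txt`);
const movedFiles = files.filter(
  (f) => ownerOf(fullRing, f) !== ownerOf(smallerRing, f)
);
const allFromRemoved = movedFiles.every(
  (f) => ownerOf(fullRing, f) === removedHost
);
console.log(
  `${movedFiles.length} of ${files.length} files moved; ` +
    `all came from the removed node: ${allFromRemoved}`
);
```

Every other file keeps its owner because the first node past its position is still on the ring after the removal.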
Usage Example
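The example uses five storage nodes, A through E. Each file is hashed onto the ring and stored on the first node whose position is greater than the file's hash, wrapping around to the first node when there is none. How evenly files spread depends on where the node hashes happen to land, so with only a few nodes the distribution can be uneven.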
// Upload a second file to its assigned node
// (new variable names, since fileToUpload and assignedNode are already declared)
const secondFile = "example2.txt";
const secondNode = consistentHashing.assign(secondFile);
secondNode.putFile(secondFile);
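To see how files actually spread across nodes, one option is to count assignments over many keys. The sketch below is standalone and makes its own assumptions: ringPosition and countPerHost are hypothetical helpers, the ring position is the first 32 bits of the SHA-256 digest, and the file names are invented. The counts depend on where the host hashes fall, so no particular split is guaranteed.

```javascript
const crypto = require("crypto");

// Assumed ring position: the first 32 bits of the SHA-256 digest.
function ringPosition(key) {
  const hex = crypto.createHash("sha256").update(key, "utf-8").digest("hex");
  return parseInt(hex.slice(0, 8), 16);
}

// Count how many keys land on each host under the
// "first node with a larger position, else wrap around" rule.
function countPerHost(hosts, keys) {
  const ring = hosts
    .map((host) => ({ position: ringPosition(host), host }))
    .sort((a, b) => a.position - b.position);
  const counts = new Map(hosts.map((host) => [host, 0]));
  for (const key of keys) {
    const position = ringPosition(key);
    const entry = ring.find((e) => e.position > position) || ring[0];
    counts.set(entry.host, counts.get(entry.host) + 1);
  }
  return counts;
}

const clusterHosts = [
  "10.131.213.12",
  "10.131.217.11",
  "10.131.142.46",
  "10.131.114.17",
  "10.131.189.18",
];
// Invented file names, for illustration only.
const sampleFiles = Array.from({ length: 10000 }, (_, i) => `example${i}.txt`);
const distribution = countPerHost(clusterHosts, sampleFiles);
for (const [host, count] of distribution) {
  console.log(`${host}: ${count} files`);
}
```

If the counts come out lopsided, that reflects uneven gaps between node positions on the ring rather than a bug in the lookup.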
Conclusion
Consistent hashing is a crucial concept in AI infrastructure and distributed systems, offering an elegant solution to the challenges of data distribution and load balancing. For AI developers and machine learning engineers, this technique is essential for building scalable AI applications, managing vector databases, and deploying LLM services efficiently.
The provided Node.js implementation serves as a foundation for understanding and integrating consistent hashing into your AI projects, whether you’re building AI microservices, managing AI model deployments, or creating distributed AI systems. By limiting how much data moves when nodes are added or removed, this approach helps such systems scale without large-scale reshuffling.