Load balancing in a cluster is a complex task: finding good solutions and overcoming bottlenecks at different levels requires knowledge from several disciplines, including hardware architectures, networking principles, distributed heterogeneous systems, and databases.

In this project we aim to determine the optimal number of shards for a semantic database. The shard count defines how the system scales and how many nodes the cluster requires, and it should yield efficient hardware usage and minimal response time. Besides the static definition of the cluster topology, dynamic routing of queries within the resulting topology has to be implemented.
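
The two parts of the problem, static sizing and dynamic routing, can be illustrated with a minimal Python sketch. It is not the project's method. The sizing rule (dataset size divided by an assumed per-shard capacity) and the routing policy (send each query to the replica with the fewest in-flight queries) are simplifying assumptions, and the names `estimate_shard_count`, `Router`, `route`, and `done` are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


def estimate_shard_count(dataset_gb: float, shard_capacity_gb: float) -> int:
    """Smallest shard count such that each shard fits the assumed capacity.

    Assumption: capacity per shard is the only constraint; a real sizing
    would also account for query load, memory, and network limits.
    """
    if dataset_gb <= 0 or shard_capacity_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(dataset_gb / shard_capacity_gb)


@dataclass
class Router:
    # Static topology: shard id -> names of the nodes holding a replica.
    replicas: Dict[int, List[str]]
    # Dynamic state: node name -> number of queries currently running there.
    in_flight: Dict[str, int] = field(default_factory=dict)

    def route(self, shard_id: int) -> str:
        """Pick the replica of the shard with the fewest in-flight queries."""
        nodes = self.replicas[shard_id]
        node = min(nodes, key=lambda n: self.in_flight.get(n, 0))
        self.in_flight[node] = self.in_flight.get(node, 0) + 1
        return node

    def done(self, node: str) -> None:
        """Record that one query on the node has finished."""
        self.in_flight[node] = max(0, self.in_flight.get(node, 0) - 1)
```

In this sketch the topology (`replicas`) is fixed once the shard count is chosen, while `in_flight` changes with every query, which mirrors the split between static topology definition and dynamic routing described above.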