In the DICE group, we develop Tentris (GitHub, paper), one of the fastest triple stores currently available.
Tentris is a triple store that is conceptually based on tensors and tensor algebra. Tensors are implemented by a condensed, monolithic indexing data structure called the hypertrie. SPARQL queries are evaluated as Einstein summations (einsum). The einsum implementation of Tentris is based on a state-of-the-art worst-case optimal join (WCOJ) algorithm.
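To make the einsum idea concrete, here is a minimal sketch using dense NumPy arrays. It is not how Tentris is implemented (Tentris uses the hypertrie and a WCOJ-based einsum); the toy graph, the term names, and the helper `solutions` are all made up for illustration. The sketch assumes an RDF graph is encoded as an order-3 tensor with one dimension each for subject, predicate, and object.

```python
import numpy as np

# Hypothetical toy graph; all terms are invented for this example.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "knows", "dave"),
    ("alice", "likes", "carol"),
]

# Map every RDF term to an integer index.
terms = sorted({term for triple in triples for term in triple})
idx = {term: i for i, term in enumerate(terms)}
n = len(terms)

# Order-3 tensor: graph[s, p, o] == 1 iff the triple (s, p, o) is in the graph.
graph = np.zeros((n, n, n), dtype=np.int64)
for s, p, o in triples:
    graph[idx[s], idx[p], idx[o]] = 1

# Triple pattern "?x knows ?y": fixing the predicate slices out a matrix.
knows = graph[:, idx["knows"], :]

# SELECT ?x ?z WHERE { ?x knows ?y . ?y knows ?z }
# The shared variable ?y is summed over; ?x and ?z are kept as result dimensions.
result = np.einsum("xy,yz->xz", knows, knows)


def solutions(matrix):
    """Translate the non-zero entries of a result matrix back into term pairs."""
    return sorted((terms[x], terms[z]) for x, z in zip(*np.nonzero(matrix)))


bindings = solutions(result)
print(bindings)
```

Each non-zero entry of `result` corresponds to one binding of `?x` and `?z`, and its value counts how many bindings of `?y` produce it. A dense tensor like this grows cubically with the number of terms, which is why a real system needs a compact sparse representation instead.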
So much, so awesome. But there are still things that need work. Currently, Tentris can only run on a single machine. To scale out to multiple machines, whether to process even larger datasets or to serve more parallel requests, Tentris needs to be distributed. So your master's thesis task will be to sketch and implement Tentris Cluster.
The master's thesis includes:
Required skills:
Links: