Knowledge graph embedding methods learn continuous vector representations for entities and relations in knowledge graphs and have been successfully employed for many applications, including link prediction. "Finding the best ratio between expressiveness and parameter space size is the keystone of embedding models." However, performing extensive hyperparameter optimization necessitates state-of-the-art hardware. For instance, the RotatE model requires 9 hours of computation to reach its peak performance on the FB15K benchmark dataset with a GeForce GTX 1080 Ti GPU. The total elapsed runtime of the RotatE model during the hyperparameter optimization phase on FB15K amounts to 1512 hours. The availability of state-of-the-art hardware has often determined which research ideas succeed (and fail).
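The gap between the two runtime figures above comes down to simple arithmetic: the total tuning cost scales linearly with the number of configurations tried. A minimal sketch, assuming each full training run takes the stated 9 hours (the resulting trial count is inferred here, not stated in the text):

```python
# Back-of-the-envelope cost of hyperparameter optimization:
# total tuning time = hours per run x number of configurations tried.
hours_per_run = 9       # one full RotatE training run on FB15K (stated above)
total_hours = 1512      # reported total elapsed tuning time (stated above)

# Inferred number of configurations evaluated during tuning (assumption:
# every trial runs to completion, with no early stopping).
n_configurations = total_hours // hours_per_run
print(n_configurations)  # 168
```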
Meanwhile, Nakkiran et al. [5,6] from OpenAI show that the double descent phenomenon occurs in CNNs, ResNets, and transformers: "performance first improves, then gets worse, and then improves again with increasing model size, data size, or training time".
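The model-size flavor of this phenomenon can be reproduced in miniature. The sketch below is illustrative only and is not taken from the cited papers: minimum-norm least squares on random ReLU features, where test error typically spikes near the interpolation threshold (model width close to the number of training points) and falls again as the model grows further.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: linear ground truth plus label noise.
n_train, n_test, d = 40, 200, 10
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
w_true = rng.normal(size=d)
y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)
y_test = X_test @ w_true

def relu_features(X, W):
    # Random nonlinear feature map; width of W controls model size.
    return np.maximum(X @ W, 0.0)

widths = [5, 20, 40, 80, 320]   # 40 = interpolation threshold (n_train)
errors = []
for p in widths:
    W = rng.normal(size=(d, p))
    Phi_tr = relu_features(X_train, W)
    Phi_te = relu_features(X_test, W)
    # lstsq returns the minimum-norm solution once the system is underdetermined.
    coef, *_ = np.linalg.lstsq(Phi_tr, y_train, rcond=None)
    errors.append(float(np.mean((Phi_te @ coef - y_test) ** 2)))
```

Plotting `errors` against `widths` typically shows the characteristic peak near width 40 rather than the classical monotone U-shape.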
In this thesis, the student is asked to answer the following questions:
 Convolutional Complex Knowledge Graph Embeddings (https://arxiv.org/abs/2008.03130)
 Complex Embeddings for Simple Link Prediction (https://arxiv.org/abs/1606.06357)
 RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space (https://arxiv.org/abs/1902.10197)
 The Hardware Lottery (https://arxiv.org/abs/2009.06489)
 Deep Double Descent: Where Bigger Models and More Data Hurt (https://arxiv.org/abs/1912.02292)
 Deep Double Descent blog post (https://openai.com/blog/deep-double-descent/)