Project: Translating Embeddings for Modeling Multi-relational Data

This page provides material (PDF, code and data) related to the paper “Translating Embeddings for Modeling Multi-relational Data” by A. Bordes et al., published in the Proceedings of NIPS 2013 [1].

Abstract

We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces. Our objective is to propose a canonical model which is easy to train, contains a reduced number of parameters and can scale up to very large databases. Hence, we propose TransE, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities. Despite its simplicity, this assumption proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases. Besides, it can be successfully trained on a large-scale data set with 1M entities, 25k relationships and more than 17M training samples.
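
The translation idea can be illustrated with a small sketch. It assumes the formulation described in [1]: a triple (head, relation, tail) is plausible when head + relation lies close to tail under the L1 or L2 norm, and training uses a margin-based ranking criterion that compares a true triple with a corrupted one. The function names and the toy vectors below are hypothetical and are not taken from the released code.

```python
import math


def transe_energy(head, relation, tail, norm=2):
    """Return d(head + relation, tail) under the L1 or L2 norm.

    A low energy means the tail lies close to the translated head.
    """
    diffs = [h + r - t for h, r, t in zip(head, relation, tail)]
    if norm == 1:
        return sum(abs(d) for d in diffs)
    return math.sqrt(sum(d * d for d in diffs))


def margin_ranking_loss(positive_energy, negative_energy, margin=1.0):
    """Hinge loss: zero once the corrupted triple is worse by at least `margin`."""
    return max(0.0, margin + positive_energy - negative_energy)


# Toy 2-dimensional embeddings (hypothetical values, not learned from data).
paris = [1.0, 2.0]
capital_of = [0.5, -1.0]
france = [1.5, 1.0]
germany = [4.5, 5.0]

good = transe_energy(paris, capital_of, france)   # translation lands on the tail
bad = transe_energy(paris, capital_of, germany)   # corrupted tail is far away
loss = margin_ranking_loss(good, bad)             # no penalty: already separated
```

In this toy setting the true triple has zero energy and the corrupted one is well beyond the margin, so the loss vanishes. A real model would learn the vectors by stochastic gradient descent over many such pairs.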

Papers

  • Conference version (NIPS 13): (pdf) (nips) (poster)
  • Related paper on using this method to improve relation extraction (EMNLP 13) [2]: (pdf)

Code

The Python code used to run the experiments in [1] is now available on GitHub as part of the SME library [3]: (code). It reproduces the main experiments from the paper (Table 3) using the Raw metric. Results can differ slightly because of the stochastic gradient training; for FB15k, we can reach even better results. See the included README and the comments in the code for details on how to use it. The code requires the Theano library.
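
For readers unfamiliar with the Raw metric, here is a minimal sketch of how such a ranking could be computed. It assumes the setting described in [1]: every candidate entity is scored as a replacement for the head or tail, the correct entity is ranked among all candidates, and no corrupted triples are filtered out even if they are themselves valid. The helper names are hypothetical and do not correspond to functions in the SME library.

```python
def raw_rank(energies, true_index):
    """1-based rank of the true entity when candidates are sorted by energy.

    `energies[i]` is the model energy obtained with candidate entity i.
    Ties are resolved optimistically (equal energies do not worsen the rank).
    """
    true_energy = energies[true_index]
    return 1 + sum(1 for e in energies if e < true_energy)


def mean_rank(ranks):
    """Average rank over all test triples (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of test triples whose true entity ranks in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical energies for four candidate entities; the true one is index 0.
example_rank = raw_rank([0.3, 0.1, 0.7, 0.2], true_index=0)

# Hypothetical ranks collected over three test triples.
ranks = [1, 3, 20]
summary = (mean_rank(ranks), hits_at_k(ranks, k=10))
```

Because nothing is filtered, a valid triple that outranks the test triple still counts against the model, which is why raw scores tend to look worse than filtered ones.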

Data

  • Freebase (FB15k). ASCII format: (data). See [1] or the included README for more details. FB1M, also used in [1], will not be released.
  • WordNet. ASCII format: (data). See [3] or the included README for more details.
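
A loader for these files could look like the sketch below. It rests on an assumption that should be checked against the included README: each line holds one triple as three tab-separated fields (head, relation, tail). The function names and the sample identifiers are hypothetical.

```python
def parse_triples(lines):
    """Parse an iterable of text lines into (head, relation, tail) tuples.

    Assumes three tab-separated fields per line; blank lines are skipped.
    """
    triples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        head, relation, tail = line.split("\t")
        triples.append((head, relation, tail))
    return triples


def build_index(triples):
    """Assign consecutive integer ids to entities and relations."""
    entities = {}
    relations = {}
    for head, relation, tail in triples:
        for entity in (head, tail):
            entities.setdefault(entity, len(entities))
        relations.setdefault(relation, len(relations))
    return entities, relations


# Hypothetical sample lines, standing in for the contents of a data file.
sample = [
    "entity_a\trelation_x\tentity_b\n",
    "\n",
    "entity_b\trelation_y\tentity_c\n",
    "entity_a\trelation_x\tentity_c\n",
]
triples = parse_triples(sample)
entities, relations = build_index(triples)
```

To read a real file, pass an open file object instead of the list, for example `parse_triples(open(path, encoding="utf-8"))`, where `path` points at one of the downloaded files.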

Contacts

Antoine Bordes: Heudiasyc, UMR CNRS 7253, Université de Technologie de Compiègne, France.
Nicolas Usunier: Heudiasyc, UMR CNRS 7253, Université de Technologie de Compiègne, France.
Jason Weston: Google, New York, USA.

References

[1] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston and O. Yakhnenko. Translating Embeddings for Modeling Multi-relational Data. In Advances in Neural Information Processing Systems (NIPS). 2013.
[2] J. Weston, A. Bordes, O. Yakhnenko and N. Usunier. Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Seattle, USA. 2013.
[3] A. Bordes, X. Glorot, J. Weston and Y. Bengio. A Semantic Matching Energy Function for Learning with Multi-relational Data. Machine Learning Journal - Special Issue on Learning Semantics. 2012.