EVEREST: Learning high-level representations of sparse tensors.

Funded by an ANR grant - Young Researcher program (JCJC)
Co-funded by a Google Faculty Research Award

PIs: Antoine Bordes & Sébastien Destercke (CNRS).

Heudiasyc Laboratory (CNRS - UTC)
Centre de Recherches de Royallieu,
Université de Technologie de Compiègne,
BP 20529, 60205 Compiègne cedex, France.

Dates: Jan. 1st, 2013 - Dec. 31st, 2016.

Summary on ANR website (in French)
Video introducing EVEREST on UTC website (in French)




Overview:
Huge amounts of structured and relational data are available in many domains of engineering, industry and research, ranging from the Semantic Web and bioinformatics to recommender systems. As a result, knowledge bases (KBs) such as Freebase, WordNet or GeneOntology have become essential tools for storing, manipulating and accessing information. However, they are also incomplete, imprecise and far too large to be used as efficiently and broadly as they could be. Hence, there is a need for methods able to summarize, complete or merge these large databases. This is our main motivation. KBs can be represented as 3-dimensional tensors, and we will rely on tensor factorization methods to learn compact representations of them. The overall objective of the EVEREST project is to achieve a leap forward in the factorization of large sparse tensors in order to improve the accessibility, completeness and reliability of real-world KBs.

This line of research could have a major impact in industry (Semantic Web, biomedical applications, etc.). For that reason, Xerox Research Center Europe is supporting this project and will supply data, provide expertise and ease industrial transfer. This proposal is also consistent with the long-term research direction of its principal partner, Heudiasyc, since it contributes to several aspects of the 10-year LabEx program on “Technological Systems of Systems”, started in 2011.
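
To make the tensor view of a KB mentioned in the overview more concrete, here is a minimal Python sketch. It encodes a handful of (subject, relation, object) facts as a binary 3-dimensional tensor and fits a low-rank bilinear factorization by plain gradient descent. Everything in it is an assumption made for illustration: the toy triples, the function names (build_tensor, reconstruct, factorize), the rank, the learning rate and the choice of a RESCAL-style bilinear model. The text above does not say which factorization model the project uses, and a dense NumPy array would not scale to real KBs, which call for sparse storage.

```python
import numpy as np

# Hypothetical toy knowledge base: (subject, relation, object) triples.
triples = [
    ("cat", "is_a", "mammal"),
    ("dog", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("cat", "chases", "mouse"),
]

# Map entity and relation names to integer indices.
entities = sorted({e for s, _, o in triples for e in (s, o)})
relations = sorted({r for _, r, _ in triples})
ent_id = {e: i for i, e in enumerate(entities)}
rel_id = {r: k for k, r in enumerate(relations)}


def build_tensor(facts):
    """Binary tensor with X[k, i, j] = 1 iff (entity i, relation k, entity j) is a known fact."""
    X = np.zeros((len(relations), len(entities), len(entities)))
    for s, r, o in facts:
        X[rel_id[r], ent_id[s], ent_id[o]] = 1.0
    return X


def reconstruct(A, R):
    """Bilinear reconstruction: X_hat[k] = A @ R[k] @ A.T for every relation k."""
    return np.einsum("ia,kab,jb->kij", A, R, A)


def factorize(X, rank=3, steps=2000, lr=0.02, seed=0):
    """Fit entity embeddings A and relation matrices R by gradient descent
    on the squared reconstruction error (illustrative settings, not tuned)."""
    rng = np.random.default_rng(seed)
    n_rel, n_ent, _ = X.shape
    A = 0.3 * rng.standard_normal((n_ent, rank))
    R = 0.3 * rng.standard_normal((n_rel, rank, rank))
    losses = []
    for _ in range(steps):
        E = reconstruct(A, R) - X  # residual tensor
        losses.append(float((E ** 2).sum()))
        # Gradients of 0.5 * ||E||^2; A appears on both sides of the product.
        grad_A = (np.einsum("kij,jb,kab->ia", E, A, R)
                  + np.einsum("kij,ia,kab->jb", E, A, R))
        grad_R = np.einsum("kij,ia,jb->kab", E, A, A)
        A -= lr * grad_A
        R -= lr * grad_R
    return A, R, losses


X = build_tensor(triples)
A, R, losses = factorize(X)

# Score an unobserved candidate fact with the learned compact representation.
s, r, o = ent_id["dog"], rel_id["is_a"], ent_id["animal"]
candidate_score = float(A[s] @ R[r] @ A[o])
print("initial loss:", losses[0], "final loss:", losses[-1])
print("score for (dog, is_a, animal):", candidate_score)
```

In this sketch each entity gets one row of A and each relation one small matrix in R, so the model stores far fewer numbers than the full tensor once the number of entities is large. Scores of unobserved cells are what a completion method would rank to propose missing facts; whether they are meaningful on such a tiny example is not something this sketch claims.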