Italy

Researcher (scientific/technical/engineering)

Date of the expedition

From 01/04/2024 to 31/07/2024

Selected Track

Paired Teams

Project title

Neuro-Symbolic Recommender Systems

Host Organization

The WISE Lab, Computer Science Department, Rutgers University

Biography

I am Tommaso Carraro, a PhD student at the University of Padova, Padova, Italy, and Fondazione Bruno Kessler, Trento, Italy. I previously obtained my bachelor’s and master’s degrees in Computer Science at the University of Padova. I conduct research in Artificial Intelligence, specifically on Recommender Systems, under the supervision of Prof. Fabio Aiolli (University of Padova) and Prof. Luciano Serafini (Fondazione Bruno Kessler). My PhD focuses on the application of Neuro-Symbolic AI to Recommender Systems. I am currently in the USA for an NGI Enrichers project, collaborating with Prof. Yongfeng Zhang at Rutgers University, The State University of New Jersey.

Project Summary

My Paired Teams project with NGI Enrichers involves the development of a Neuro-Symbolic Recommendation System that can mitigate three major limitations affecting the performance of modern approaches in real-world scenarios: data sparsity (learning from only a few user ratings), cold-start (learning in the total absence of ratings), and explainability (how to effectively provide reasons for recommended items). Neuro-symbolic computing is the field of AI that studies the integration of symbolic AI (e.g., logical reasoning) and sub-symbolic AI (e.g., neural networks, deep learning). Current state-of-the-art Recommendation Systems rely on Deep Learning, which does not cope well with sparse data (data sparsity), requires many samples to learn (so it does not work in a cold-start setting), and is not explainable by design (explainability). Logical reasoning, on the other hand, can perform few- and zero-shot learning and is fully interpretable thanks to its use of logical formulas. Integrating these two paradigms could therefore help mitigate the aforementioned limitations. However, this research area is still in its early stages, and considerable effort is still needed to understand how an effective integration can be obtained.

Key Result

We are currently working on releasing a novel public dataset for cross-domain recommendation based on the Amazon Review dataset. Cross-domain recommendation is a task where knowledge acquired from a source domain (e.g., books) is used to compensate for missing ratings in a target domain (e.g., movies). The Amazon Review dataset is subdivided into multiple datasets, which makes it useful for experimenting with cross-domain recommendation. In particular, we selected three datasets: books, movies, and songs. This novel dataset will be used to train a neuro-symbolic recommender that leverages common-sense knowledge encoding connections between source and target domains to transfer information across domains and effectively mitigate sparsity and cold-start in the target domain. An example of common-sense knowledge is that if a user liked a Harry Potter book (source domain), she could also like one of the movies (target domain) in this series. This reasoning can be encoded in a logical formula that our Neuro-Symbolic approach can use to perform recommendations. Our dataset will include connections between source and target items, and we aim to find these connections using semantic paths in a knowledge graph, for example, “Harry Potter (book) -> based on -> Harry Potter (movie).” If a path exists between two items, it can potentially be used for both recommendation and model-intrinsic explanation.
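As a rough illustration of how the Harry Potter example could be written as a logical formula, one possible first-order encoding is given below. The predicate names Likes and BasedOn are hypothetical and chosen only for this sketch; the formula is an assumption about the general shape of such a rule, not the project's actual formalization.

```latex
\forall u \; \forall b \; \forall m : \;
  \mathrm{Likes}(u, b) \land \mathrm{BasedOn}(m, b) \rightarrow \mathrm{Likes}(u, m)
```

Here u ranges over users, b over books (source domain), and m over movies (target domain); BasedOn(m, b) is read as "movie m is based on book b".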

The steps of our project are the following:

  1. Curate the original dataset by imputing missing metadata: titles, directors (for movies), authors (for books), artists (for songs), and release dates of the items (DONE);
  2. Use the information retrieved in the first step to find matching entities on Wikidata (i.e., the knowledge graph) (DONE);
  3. Use the Neo4j graph database to find paths between each pair of entities in different domains;
  4. Release the dataset in this GitHub repository;
  5. Implement our Neuro-Symbolic recommender system on the created dataset;
  6. Write a research paper on the entire project.
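
The path-finding idea in step 3 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: it uses a breadth-first search over a tiny in-memory graph instead of Neo4j, and the triples, the function names (build_adjacency, find_path, format_path), and the max_hops parameter are all hypothetical. Only the "based on" edge comes from the example above; the other edges are made up for the sketch.

```python
from collections import deque

# Toy knowledge graph as (head, relation, tail) triples.
# Illustrative only: these triples are not taken from Wikidata.
TRIPLES = [
    ("Harry Potter (book)", "based on", "Harry Potter (movie)"),
    ("Harry Potter (book)", "author", "J. K. Rowling"),
    ("Harry Potter (movie)", "has soundtrack", "Harry Potter (soundtrack)"),
]


def build_adjacency(triples):
    """Map each entity to its outgoing (relation, neighbor) edges."""
    adjacency = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
        adjacency.setdefault(tail, [])
    return adjacency


def find_path(triples, source, target, max_hops=3):
    """Breadth-first search for a shortest directed path with at most max_hops edges.

    Returns [entity, relation, entity, ...] or None if no such path exists.
    """
    adjacency = build_adjacency(triples)
    if source not in adjacency or target not in adjacency:
        return None
    queue = deque([(source, [source])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        hops = (len(path) - 1) // 2  # path alternates entity, relation, entity, ...
        if hops >= max_hops:
            continue
        for relation, neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [relation, neighbor]))
    return None


def format_path(path):
    """Render a path in the 'A -> relation -> B' style used in the text."""
    return " -> ".join(path)


if __name__ == "__main__":
    path = find_path(TRIPLES, "Harry Potter (book)", "Harry Potter (movie)")
    print(format_path(path))
```

A path returned this way could serve both as a cross-domain link for recommendation and as a human-readable explanation. On a graph the size of Wikidata, a dedicated graph database would be the practical choice, which is presumably why Neo4j is used in step 3.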

Impact of the Fellowship

The expected outcomes of the fellowship include:

  • The release of a novel dataset for the cross-domain recommendation task that could become a reference in the community;
  • A research paper about Neuro-Symbolic recommendation based on semantic knowledge;
  • The release of all the code for this project in a public repository, giving researchers in the field the possibility to experiment with our new technology.