Integrating VIVO and eagle-i to develop a Resource Recommender System

2015-08-19T14:09:26Z (GMT) by Suzanne Thompson Amarendra Das
Suzanne Thompson, MS, PhD, Rodney Jacobson, BS, Michael Li, Sukdith Punjasthitkul, MS, 
Steven B. Andrews, PhD and Amar K. Das, MD, PhD 
Informatics Collaboratory for Design, Development, and Dissemination (ic3d) 
Dartmouth SYNERGY Clinical and Translational Science Institute 
Geisel School of Medicine at Dartmouth, Hanover NH 03755 

Keywords: Researcher profiles, translational research resources, recommender system, research management 

Abstract: Over thirty academic institutions, including Dartmouth, participate in the eagle-i research resource network, which provides information to researchers on both local and nationally networked resources, such as cores, labs, specimens, instruments, reagents, and software. Discovering these eagle-i resources requires users to actively search the semantically structured information through the eagle-i web interface. In this poster presentation, we discuss the design and implementation of a recommender system for eagle-i resources that is part of Inspire, the research management tool we have built for the Dartmouth SYNERGY Clinical and Translational Science Institute. The recommender system automatically matches relevant resources to investigators based on information collected in their VIVO profiles; we are currently launching VIVO institution-wide. 

Background: Since 2009, Dartmouth has participated in the eagle-i network and continues to update eagle-i with curated clinical and translational resources. In 2013, the Dartmouth SYNERGY Clinical and Translational Science Institute was funded through an NIH Clinical and Translational Science Award (CTSA), becoming the newest hub in the CTSA network. As part of these efforts, the SYNERGY-supported Informatics Collaboratory for Design, Development, and Dissemination (ic3d) developed and, in 2014, released Inspire, a web-based, mobile-friendly, open-source research management system that allows investigators to access and manage a range of CTSA-supported resources, which are encoded in eagle-i. The ic3d team is also implementing the VIVO system to automatically profile researchers by extracting publication and grant information and listing investigator-provided data on research interests. Our team is currently building an integrated platform that connects data on eagle-i resources, VIVO profiles, and Inspire project information to recommend, for each researcher, the eagle-i resources that are relevant. 

Objective: To design and build a recommender system (called Inspiration) that uses investigator profiles in VIVO and project activity information in Inspire to determine the relevance of eagle-i resources to a researcher. 

Methods: We chose to design a recommender system based on a vector space model that represents eagle-i resources and faculty profiles as weighted term vectors. The vector space model is widely used in web searching as a highly scalable information retrieval method. In our approach, the proximity of the vector representing an eagle-i resource to the vector representing an investigator’s research portfolio indicates the relevance of that resource to the investigator. The vector for the investigator is based on concepts associated with their publication titles, grant titles, provided research interests, and research project descriptions. 
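
As an illustration of the matching step described in the Methods, the following minimal sketch ranks resources by their proximity to an investigator vector. It rests on several assumptions that the abstract does not state: TF-IDF is used as the term weighting, cosine similarity is used as the proximity measure, and the resource names, terms, and the helper functions `tfidf_vectors`, `cosine`, and `rank_resources` are hypothetical. It uses only the Python standard library so that it is self-contained, and it should not be read as the Inspiration implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn {name: [terms]} into {name: {term: weight}} using TF-IDF weights."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        # Smoothed IDF keeps weights positive even for terms in every document.
        vectors[name] = {
            t: count * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_resources(investigator, resources):
    """Return (resource, score) pairs sorted by decreasing similarity."""
    docs = dict(resources)
    docs["__investigator__"] = investigator
    vectors = tfidf_vectors(docs)
    query = vectors.pop("__investigator__")
    scores = {name: cosine(query, vec) for name, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up example data; terms are assumed to be standardized already.
resources = {
    "flow_cytometry_core": ["flow cytometry", "cell sorting", "immunology"],
    "genomics_core": ["sequencing", "genomics", "rna"],
    "biostatistics_core": ["statistics", "study design"],
}
investigator = ["immunology", "flow cytometry", "t cell", "immunology"]

ranking = rank_resources(investigator, resources)
```

In this toy data only the flow cytometry core shares terms with the investigator, so it is ranked first and the other two resources score zero. Cosine similarity depends on the angle between vectors rather than their length, so an investigator with many publications is not favored over one with few.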

Results: To create the vector space model, we used the APIs of the VIVO, eagle-i, and Inspire systems to ingest the data. Although VIVO and eagle-i store information using the semantic Resource Description Framework (RDF), and their ontologies have been merged into the Integrated Semantic Framework, the descriptions of the research resources and the publication and grant titles are free text. Similarly, the project description fields that Inspire uses for request management are unstructured text. We used a dictionary-based approach to map the words in these texts to standardized terms, and then created term vectors for each resource and investigator. We implemented the vector space model in Python using the gensim library. We are currently working with faculty investigators to evaluate the relevance of the matched eagle-i resources in Inspiration, and will present the results of this study in the poster presentation. 
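
The dictionary-based mapping mentioned in the Results could look roughly like the sketch below. The abstract does not say which dictionary or matching strategy was used, so everything here is an assumption for illustration: the entries in `TERM_DICTIONARY`, the greedy longest-match lookup, and the function names `tokenize`, `map_to_terms`, and `term_vector` are hypothetical. The sketch uses the standard library only; in a gensim-based pipeline, the resulting term lists would be the kind of input from which a corpus is built.

```python
import re
from collections import Counter

# Hypothetical dictionary: surface phrase -> standardized term.
TERM_DICTIONARY = {
    "flow cytometry": "flow_cytometry",
    "facs": "flow_cytometry",
    "rna-seq": "rna_sequencing",
    "rna sequencing": "rna_sequencing",
    "mass spectrometry": "mass_spectrometry",
    "mass spec": "mass_spectrometry",
}

# Longest phrase length in words, used to bound the lookup window.
MAX_PHRASE_WORDS = max(len(phrase.split()) for phrase in TERM_DICTIONARY)


def tokenize(text):
    """Lowercase the text and split it into words, keeping internal hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def map_to_terms(text):
    """Map free text to standardized terms by greedy longest-match lookup."""
    tokens = tokenize(text)
    terms = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for size in range(min(MAX_PHRASE_WORDS, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + size])
            if phrase in TERM_DICTIONARY:
                terms.append(TERM_DICTIONARY[phrase])
                i += size
                break
        else:
            i += 1  # No dictionary phrase starts at this word; skip it.
    return terms


def term_vector(text):
    """Count standardized terms to form a sparse term vector."""
    return dict(Counter(map_to_terms(text)))


vector = term_vector(
    "FACS and flow cytometry services; RNA-seq library prep; mass spec."
)
```

Here "FACS" and "flow cytometry" collapse into the same standardized term, which is the point of the mapping: synonyms in free-text descriptions and titles contribute to the same vector dimension. Words without a dictionary entry are simply dropped in this sketch.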

Research reported in this poster was supported by the Dartmouth SYNERGY Clinical and Translational Science Institute, under award number UL1TR001086 from the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health (NIH). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.