Information Discovery on Domain Data Graphs

An increasing amount of data is stored in an interconnected manner. Such data range from the Web (hyperlinked pages) to bibliographic data (citation graphs), biological data (associations between proteins, genes and publications) and clinical data (associations between patients, hospitalizations, exams and diagnoses).

Leveraging the available data critically depends on information discovery: given a question (query), find pieces of data, or associations between them, in the data graph that are “good” (relevant, authoritative and specific) for the query, and rank them according to their “goodness”. Submitting such queries should not require knowledge of a complex query language (e.g., SQL) or of the details of the data (e.g., the schema). Unfortunately, little has been done to provide high-quality information discovery on data graphs in domains other than the Web, where search engines have been successful.

This project will facilitate effective information discovery on domain data (biological, clinical, patent, e-commerce, spatial), which can lead to cost savings and increased research productivity in these domains.

Acknowledgements

This project is sponsored by NSF award IIS 1216032 (formerly IIS 0811922), 2008-2013.

People

Alumni

Publications

Book

  • Vagelis Hristidis. Information Discovery on Electronic Health Records. Published by CRC - Taylor & Francis, 2009
    [Link]

Conferences/Workshops/Journals

  • Shiwen Cheng, Arash Termehchy, Vagelis Hristidis. Predicting the Effectiveness of Keyword Queries on Databases. ACM Conference on Information and Knowledge Management (CIKM) 2012
  • Abhijith Kashyap, Vagelis Hristidis. Comprehension-Based Result Snippets. ACM Conference on Information and Knowledge Management (CIKM) 2012
  • Abhijith Kashyap, Reza Amini, Vagelis Hristidis. SonetRank: Leveraging Social Networks to Personalize Search. ACM Conference on Information and Knowledge Management (CIKM) 2012
  • Abhijith Kashyap, Vagelis Hristidis. LogRank: Summarizing Social Activity Logs. 15th SIGMOD International Workshop on the Web and Databases (WebDB) 2012
  • Eduardo J. Ruiz, Vagelis Hristidis, Carlos Castillo, Aristides Gionis, Alejandro Jaimes. Correlating Financial Time Series with Micro-Blogging Activity. ACM International Conference on Web Search and Data Mining (WSDM) 2012
  • Mahashweta Das, Gautam Das, Vagelis Hristidis. Leveraging Collaborative Tagging for Web Item Design. ACM SIGKDD (International Conference on Knowledge Discovery and Data Mining) 2011
  • Vagelis Hristidis, Louiqa Raschid and Yao Wu. Scalable Link-based Personalization for Ranking in Entity-Relationship Graphs, 14th SIGMOD International Workshop on the Web and Databases (WebDB) 2011
    [Download]
  • Abhijith Kashyap, Vagelis Hristidis, Michalis Petropoulos, and Sotiria Tavoulari. Effective Navigation of Query Results Based on Concept Hierarchies, IEEE Transactions on Knowledge and Data Engineering (TKDE) 2011
    [Download]
  • Abhijith Kashyap, Vagelis Hristidis and Michalis Petropoulos. FACeTOR: Cost-Driven Exploration of Faceted Query Results, ACM Conference on Information and Knowledge Management (CIKM) 2010. *Best Interdisciplinary Paper Award*
    [Download]
  • Vagelis Hristidis, Eduardo Ruiz, Alejandro Hernández, Fernando Farfán, Ramakrishna Varadarajan. PatentsSearcher: A Novel Portal to Search and Explore Patents, 3rd International Workshop on Patent Information Retrieval, at ACM CIKM 2010
    [Download]
  • Louiqa Raschid, Vagelis Hristidis, John Edmond and Hassan Sayyadi. Challenges in Personalized Authority Flow Based Ranking of Social Media, ACM Conference on Information and Knowledge Management (CIKM) 2010
    [Download]
  • Vagelis Hristidis, Ramakrishna Varadarajan, Paul Biondich and Michael Weiner. Information Discovery on Electronic Medical Records Using Authority-Flow Techniques, BMC Medical Informatics and Decision Making, 2010
    [Link]
  • Vagelis Hristidis, Yuheng Hu and Panagiotis G. Ipeirotis. Relevance-based Retrieval on Hidden-Web Text Databases without Ranking Support, IEEE Transactions on Knowledge and Data Engineering (TKDE) 2010
    [Download]
  • Benjamin Arai, Gautam Das, Dimitrios Gunopulos, Vagelis Hristidis and Nick Koudas. An Access Cost-Aware Approach for Object Retrieval over Multiple Sources. Proceedings of the VLDB Endowment (PVLDB) and VLDB Conference 2010.
    [Download]
  • Vagelis Hristidis, Yuheng Hu and Panagiotis G. Ipeirotis. Ranked Queries over Sources with Boolean Query Interfaces without Ranking Support. IEEE ICDE 2010, short paper (acceptance rate 20%)
    [Download]
  • Abhijith Kashyap, Vagelis Hristidis, Michalis Petropoulos and Sotiria Tavoulari. Exploring Biomedical Databases with BioNav. Demo Paper, ACM SIGMOD Conference 2009 (acceptance rate 37%)
    [Download]
  • Ramakrishna Varadarajan, Vagelis Hristidis, Louiqa Raschid, Maria-Esther Vidal, Luis Ibanez and Hector Rodriguez-Drumond. Flexible and Efficient Querying and Ranking on Hyperlinked Data Sources, EDBT 2009 (full paper, acceptance rate 33%)
    [Download]
  • Vagelis Hristidis, Yannis Papakonstantinou, Ramakrishna Varadarajan. Using Proximity Search to Estimate Authority Flow. IEEE TKDE 2009
    [Download]
  • Fernando Farfán, Vagelis Hristidis, Anand Ranganathan, and Michael Weiner. XOntoRank: Ontology-Aware Search of Electronic Medical Records. IEEE International Conference on Data Engineering (ICDE) 2009 (long paper, acceptance rate 17%)
    [Download]
  • Abhijith Kashyap, Vagelis Hristidis, Michalis Petropoulos, and Sotiria Tavoulari. BioNav: Effective Navigation on Query Results of Biomedical Databases. IEEE International Conference on Data Engineering, ICDE 2009 (short paper, acceptance rate 27%)
    [Download]
  • Vagelis Hristidis, Oscar Valdivia, Michail Vlachos, Philip S Yu. Information Discovery across Multiple Streams. Elsevier Information Sciences, 2009
    [Download]
  • Ian De Felipe, Vagelis Hristidis, Naphtali Rishe. Keyword Search on Spatial Databases. IEEE ICDE 2008 (full paper, long presentation, acceptance rate 12%)
    [Download]
  • Ramakrishna Varadarajan, Vagelis Hristidis, Louiqa Raschid. Explaining and Reformulating Authority Flow Queries. IEEE ICDE 2008 (full paper, short presentation, acceptance rate 19%)
    [Download]
  • Fernando Farfán, Vagelis Hristidis, Anand Ranganathan, Redmond P. Burke. Ontology-Aware Search on XML-based Electronic Medical Records. Poster Paper, IEEE ICDE 2008 (acceptance rate 31%)
    [Download]