-
What can knowledge graph alignment gain with Neuro-Symbolic learning approaches?
Authors:
Pedro Giesteira Cotovio,
Ernesto Jimenez-Ruiz,
Catia Pesquita
Abstract:
Knowledge Graphs (KG) are the backbone of many data-intensive applications since they can represent data coupled with its meaning and context. Aligning KGs across different domains and providers is necessary to afford a fuller and integrated representation. A severe limitation of current KG alignment (KGA) algorithms is that they fail to articulate logical thinking and reasoning with lexical, stru…
▽ More
Knowledge Graphs (KG) are the backbone of many data-intensive applications since they can represent data coupled with its meaning and context. Aligning KGs across different domains and providers is necessary to afford a fuller and integrated representation. A severe limitation of current KG alignment (KGA) algorithms is that they fail to articulate logical thinking and reasoning with lexical, structural, and semantic data learning. Deep learning models are increasingly popular for KGA inspired by their good performance in other tasks, but they suffer from limitations in explainability, reasoning, and data efficiency. Hybrid neurosymbolic learning models hold the promise of integrating logical and data perspectives to produce high-quality alignments that are explainable and support validation through human-centric approaches. This paper examines the current state of the art in KGA and explores the potential for neurosymbolic integration, highlighting promising research directions for combining these fields.
△ Less
Submitted 11 October, 2023;
originally announced October 2023.
-
Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities
Authors:
Jiaoyan Chen,
Hang Dong,
Janna Hastings,
Ernesto Jiménez-Ruiz,
Vanessa López,
Pierre Monnin,
Catia Pesquita,
Petr Škoda,
Valentina Tamma
Abstract:
The term life sciences refers to the disciplines that study living organisms and life processes, and include chemistry, biology, medicine, and a range of other related disciplines. Research efforts in life sciences are heavily data-driven, as they produce and consume vast amounts of scientific data, much of which is intrinsically relational and graph-structured.
The volume of data and the comple…
▽ More
The term life sciences refers to the disciplines that study living organisms and life processes, and include chemistry, biology, medicine, and a range of other related disciplines. Research efforts in life sciences are heavily data-driven, as they produce and consume vast amounts of scientific data, much of which is intrinsically relational and graph-structured.
The volume of data and the complexity of scientific concepts and relations referred to therein promote the application of advanced knowledge-driven technologies for managing and interpreting data, with the ultimate aim to advance scientific discovery.
In this survey and position paper, we discuss recent developments and advances in the use of graph-based technologies in life sciences and set out a vision for how these technologies will impact these fields into the future. We focus on three broad topics: the construction and management of Knowledge Graphs (KGs), the use of KGs and associated technologies in the discovery of new knowledge, and the use of KGs in artificial intelligence applications to support explanations (explainable AI). We select a few exemplary use cases for each topic, discuss the challenges and open research questions within these topics, and conclude with a perspective and outlook that summarizes the overarching challenges and their potential solutions as a guide for future research.
△ Less
Submitted 20 December, 2023; v1 submitted 29 September, 2023;
originally announced September 2023.
-
NeSy4VRD: A Multifaceted Resource for Neurosymbolic AI Research using Knowledge Graphs in Visual Relationship Detection
Authors:
David Herron,
Ernesto Jiménez-Ruiz,
Giacomo Tarroni,
Tillman Weyde
Abstract:
NeSy4VRD is a multifaceted resource designed to support the development of neurosymbolic AI (NeSy) research. NeSy4VRD re-establishes public access to the images of the VRD dataset and couples them with an extensively revised, quality-improved version of the VRD visual relationship annotations. Crucially, NeSy4VRD provides a well-aligned, companion OWL ontology that describes the dataset domain.It…
▽ More
NeSy4VRD is a multifaceted resource designed to support the development of neurosymbolic AI (NeSy) research. NeSy4VRD re-establishes public access to the images of the VRD dataset and couples them with an extensively revised, quality-improved version of the VRD visual relationship annotations. Crucially, NeSy4VRD provides a well-aligned, companion OWL ontology that describes the dataset domain.It comes with open source infrastructure that provides comprehensive support for extensibility of the annotations (which, in turn, facilitates extensibility of the ontology), and open source code for loading the annotations to/from a knowledge graph. We are contributing NeSy4VRD to the computer vision, NeSy and Semantic Web communities to help foster more NeSy research using OWL-based knowledge graphs.
△ Less
Submitted 22 May, 2023;
originally announced May 2023.
-
Language Model Analysis for Ontology Subsumption Inference
Authors:
Yuan He,
Jiaoyan Chen,
Ernesto Jiménez-Ruiz,
Hang Dong,
Ian Horrocks
Abstract:
Investigating whether pre-trained language models (LMs) can function as knowledge bases (KBs) has raised wide research interests recently. However, existing works focus on simple, triple-based, relational KBs, but omit more sophisticated, logic-based, conceptualised KBs such as OWL ontologies. To investigate an LM's knowledge of ontologies, we propose OntoLAMA, a set of inference-based probing tas…
▽ More
Investigating whether pre-trained language models (LMs) can function as knowledge bases (KBs) has raised wide research interests recently. However, existing works focus on simple, triple-based, relational KBs, but omit more sophisticated, logic-based, conceptualised KBs such as OWL ontologies. To investigate an LM's knowledge of ontologies, we propose OntoLAMA, a set of inference-based probing tasks and datasets from ontology subsumption axioms involving both atomic and complex concepts. We conduct extensive experiments on ontologies of different domains and scales, and our results demonstrate that LMs encode relatively less background knowledge of Subsumption Inference (SI) than traditional Natural Language Inference (NLI) but can improve on SI significantly when a small number of samples are given. We will open-source our code and datasets.
△ Less
Submitted 8 May, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
AI Assistants: A Framework for Semi-Automated Data Wrangling
Authors:
Tomas Petricek,
Gerrit J. J. van den Burg,
Alfredo Nazábal,
Taha Ceritli,
Ernesto Jiménez-Ruiz,
Christopher K. I. Williams
Abstract:
Data wrangling tasks such as obtaining and linking data from various sources, transforming data formats, and correcting erroneous records, can constitute up to 80% of typical data engineering work. Despite the rise of machine learning and artificial intelligence, data wrangling remains a tedious and manual task. We introduce AI assistants, a class of semi-automatic interactive tools to streamline…
▽ More
Data wrangling tasks such as obtaining and linking data from various sources, transforming data formats, and correcting erroneous records, can constitute up to 80% of typical data engineering work. Despite the rise of machine learning and artificial intelligence, data wrangling remains a tedious and manual task. We introduce AI assistants, a class of semi-automatic interactive tools to streamline data wrangling. An AI assistant guides the analyst through a specific data wrangling task by recommending a suitable data transformation that respects the constraints obtained through interaction with the analyst.
We formally define the structure of AI assistants and describe how existing tools that treat data cleaning as an optimization problem fit the definition. We implement AI assistants for four common data wrangling tasks and make AI assistants easily accessible to data analysts in an open-source notebook environment for data science, by leveraging the common structure they follow. We evaluate our AI assistants both quantitatively and qualitatively through three example scenarios. We show that the unified and interactive design makes it easy to perform tasks that would be difficult to do manually or with a fully automatic tool.
△ Less
Submitted 31 October, 2022;
originally announced November 2022.
-
Understanding Adverse Biological Effect Predictions Using Knowledge Graphs
Authors:
Erik Bryhn Myklebust,
Ernesto Jimenez-Ruiz,
Jiaoyan Chen,
Raoul Wolf,
Knut Erik Tollefsen
Abstract:
Extrapolation of adverse biological (toxic) effects of chemicals is an important contribution to expand available hazard data in (eco)toxicology without the use of animals in laboratory experiments. In this work, we extrapolate effects based on a knowledge graph (KG) consisting of the most relevant effect data as domain-specific background knowledge. An effect prediction model, with and without ba…
▽ More
Extrapolation of adverse biological (toxic) effects of chemicals is an important contribution to expand available hazard data in (eco)toxicology without the use of animals in laboratory experiments. In this work, we extrapolate effects based on a knowledge graph (KG) consisting of the most relevant effect data as domain-specific background knowledge. An effect prediction model, with and without background knowledge, was used to predict mean adverse biological effect concentration of chemicals as a prototypical type of stressors. The background knowledge improves the model prediction performance by up to 40\% in terms of $R^2$ (\ie coefficient of determination). We use the KG and KG embeddings to provide quantitative and qualitative insights into the predictions. These insights are expected to improve the confidence in effect prediction. Larger scale implementation of such extrapolation models should be expected to support hazard and risk assessment, by simplifying and reducing testing needs.
△ Less
Submitted 28 October, 2022;
originally announced October 2022.
-
Query-based Industrial Analytics over Knowledge Graphs with Ontology Reshaping
Authors:
Zhuoxun Zheng,
Baifan Zhou,
Dongzhuoran Zhou,
Gong Cheng,
Ernesto Jiménez-Ruiz,
Ahmet Soylu,
Evgeny Kharlamo
Abstract:
Industrial analytics that includes among others equipment diagnosis and anomaly detection heavily relies on integration of heterogeneous production data. Knowledge Graphs (KGs) as the data format and ontologies as the unified data schemata are a prominent solution that offers high quality data integration and a convenient and standardised way to exchange data and to layer analytical applications o…
▽ More
Industrial analytics that includes among others equipment diagnosis and anomaly detection heavily relies on integration of heterogeneous production data. Knowledge Graphs (KGs) as the data format and ontologies as the unified data schemata are a prominent solution that offers high quality data integration and a convenient and standardised way to exchange data and to layer analytical applications over it. However, poor design of ontologies of high degree of mismatch between them and industrial data naturally lead to KGs of low quality that impede the adoption and scalability of industrial analytics. Indeed, such KGs substantially increase the training time of writing queries for users, consume high volume of storage for redundant information, and are hard to maintain and update. To address this problem we propose an ontology reshaping approach to transform ontologies into KG schemata that better reflect the underlying data and thus help to construct better KGs. In this poster we present a preliminary discussion of our on-going research, evaluate our approach with a rich set of SPARQL queries on real-world industry data at Bosch and discuss our findings.
△ Less
Submitted 22 September, 2022;
originally announced September 2022.
-
Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching
Authors:
Yuan He,
Jiaoyan Chen,
Hang Dong,
Ernesto Jiménez-Ruiz,
Ali Hadian,
Ian Horrocks
Abstract:
Ontology Matching (OM) plays an important role in many domains such as bioinformatics and the Semantic Web, and its research is becoming increasingly popular, especially with the application of machine learning (ML) techniques. Although the Ontology Alignment Evaluation Initiative (OAEI) represents an impressive effort for the systematic evaluation of OM systems, it still suffers from several limi…
▽ More
Ontology Matching (OM) plays an important role in many domains such as bioinformatics and the Semantic Web, and its research is becoming increasingly popular, especially with the application of machine learning (ML) techniques. Although the Ontology Alignment Evaluation Initiative (OAEI) represents an impressive effort for the systematic evaluation of OM systems, it still suffers from several limitations including limited evaluation of subsumption mappings, suboptimal reference mappings, and limited support for the evaluation of ML-based systems. To tackle these limitations, we introduce five new biomedical OM tasks involving ontologies extracted from Mondo and UMLS. Each task includes both equivalence and subsumption matching; the quality of reference mappings is ensured by human curation, ontology pruning, etc.; and a comprehensive evaluation framework is proposed to measure OM performance from various perspectives for both ML-based and non-ML-based OM systems. We report evaluation results for OM systems of different types to demonstrate the usage of these resources, all of which are publicly available as part of the new BioML track at OAEI 2022.
△ Less
Submitted 22 July, 2023; v1 submitted 6 May, 2022;
originally announced May 2022.
-
Contextual Semantic Embeddings for Ontology Subsumption Prediction
Authors:
Jiaoyan Chen,
Yuan He,
Yuxia Geng,
Ernesto Jimenez-Ruiz,
Hang Dong,
Ian Horrocks
Abstract:
Automating ontology construction and curation is an important but challenging task in knowledge engineering and artificial intelligence. Prediction by machine learning techniques such as contextual semantic embedding is a promising direction, but the relevant research is still preliminary especially for expressive ontologies in Web Ontology Language (OWL). In this paper, we present a new subsumpti…
▽ More
Automating ontology construction and curation is an important but challenging task in knowledge engineering and artificial intelligence. Prediction by machine learning techniques such as contextual semantic embedding is a promising direction, but the relevant research is still preliminary especially for expressive ontologies in Web Ontology Language (OWL). In this paper, we present a new subsumption prediction method named BERTSubs for classes of OWL ontology. It exploits the pre-trained language model BERT to compute contextual embeddings of a class, where customized templates are proposed to incorporate the class context (e.g., neighbouring classes) and the logical existential restriction. BERTSubs is able to predict multiple kinds of subsumers including named classes from the same ontology or another ontology, and existential restrictions from the same ontology. Extensive evaluation on five real-world ontologies for three different subsumption tasks has shown the effectiveness of the templates and that BERTSubs can dramatically outperform the baselines that use (literal-aware) knowledge graph embeddings, non-contextual word embeddings and the state-of-the-art OWL ontology embeddings.
△ Less
Submitted 18 March, 2023; v1 submitted 20 February, 2022;
originally announced February 2022.
-
A Simple Standard for Sharing Ontological Mappings (SSSOM)
Authors:
Nicolas Matentzoglu,
James P. Balhoff,
Susan M. Bello,
Chris Bizon,
Matthew Brush,
Tiffany J. Callahan,
Christopher G Chute,
William D. Duncan,
Chris T. Evelo,
Davera Gabriel,
John Graybeal,
Alasdair Gray,
Benjamin M. Gyori,
Melissa Haendel,
Henriette Harmse,
Nomi L. Harris,
Ian Harrow,
Harshad Hegde,
Amelia L. Hoyt,
Charles T. Hoyt,
Dazhi Jiao,
Ernesto Jiménez-Ruiz,
Simon Jupp,
Hyeongsik Kim,
Sebastian Koehler
, et al. (19 additional authors not shown)
Abstract:
Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, ar…
▽ More
Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Are they associated in some other way? Such relationships between the mapped terms are often not documented, leading to incorrect assumptions and making them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Also, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones.
The Simple Standard for Sharing Ontological Mappings (SSSOM) addresses these problems by: 1. Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. 2. Defining an easy to use table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data standards. 3. Implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices. 4. Providing reference tools and software libraries for working with the standard.
In this paper, we present the SSSOM standard, describe several use cases, and survey some existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification is at http://w3id.org/sssom/spec.
△ Less
Submitted 13 December, 2021;
originally announced December 2021.
-
Prediction of Adverse Biological Effects of Chemicals Using Knowledge Graph Embeddings
Authors:
Erik B. Myklebust,
Ernesto Jiménez-Ruiz,
Jiaoyan Chen,
Raoul Wolf,
Knut Erik Tollefsen
Abstract:
We have created a knowledge graph based on major data sources used in ecotoxicological risk assessment. We have applied this knowledge graph to an important task in risk assessment, namely chemical effect prediction. We have evaluated nine knowledge graph embedding models from a selection of geometric, decomposition, and convolutional models on this prediction task. We show that using knowledge gr…
▽ More
We have created a knowledge graph based on major data sources used in ecotoxicological risk assessment. We have applied this knowledge graph to an important task in risk assessment, namely chemical effect prediction. We have evaluated nine knowledge graph embedding models from a selection of geometric, decomposition, and convolutional models on this prediction task. We show that using knowledge graph embeddings can increase the accuracy of effect prediction with neural networks. Furthermore, we have implemented a fine-tuning architecture that adapts the knowledge graph embeddings to the effect prediction task and leads to better performance. Finally, we evaluate certain characteristics of the knowledge graph embedding models to shed light on the individual model performance.
△ Less
Submitted 30 March, 2022; v1 submitted 8 December, 2021;
originally announced December 2021.
-
OWL2Vec*: Embedding of OWL Ontologies
Authors:
Jiaoyan Chen,
Pan Hu,
Ernesto Jimenez-Ruiz,
Ole Magnus Holter,
Denvar Antonyrajah,
Ian Horrocks
Abstract:
Semantic embedding of knowledge graphs has been widely studied and used for prediction and statistical analysis tasks across various domains such as Natural Language Processing and the Semantic Web. However, less attention has been paid to developing robust methods for embedding OWL (Web Ontology Language) ontologies which can express a much wider range of semantics than knowledge graphs and have…
▽ More
Semantic embedding of knowledge graphs has been widely studied and used for prediction and statistical analysis tasks across various domains such as Natural Language Processing and the Semantic Web. However, less attention has been paid to developing robust methods for embedding OWL (Web Ontology Language) ontologies which can express a much wider range of semantics than knowledge graphs and have been widely adopted in domains such as bioinformatics. In this paper, we propose a random walk and word embedding based ontology embedding method named OWL2Vec*, which encodes the semantics of an OWL ontology by taking into account its graph structure, lexical information and logical constructors. Our empirical evaluation with three real world datasets suggests that OWL2Vec* benefits from these three different aspects of an ontology in class membership prediction and class subsumption prediction tasks. Furthermore, OWL2Vec* often significantly outperforms the state-of-the-art methods in our experiments.
△ Less
Submitted 25 January, 2021; v1 submitted 30 September, 2020;
originally announced September 2020.
-
Dividing the Ontology Alignment Task with Semantic Embeddings and Logic-based Modules
Authors:
Ernesto Jiménez-Ruiz,
Asan Agibetov,
Jiaoyan Chen,
Matthias Samwald,
Valerie Cross
Abstract:
Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In this paper we present an approach that combines a neural embedding model and logic-based modules to accurately divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluatio…
▽ More
Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In this paper we present an approach that combines a neural embedding model and logic-based modules to accurately divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluation Initiative. The results are encouraging and suggest that the proposed method is adequate in practice and can be integrated within the workflow of systems unable to cope with very large ontologies.
△ Less
Submitted 25 February, 2020;
originally announced March 2020.
-
Correcting Knowledge Base Assertions
Authors:
Jiaoyan Chen,
Xi Chen,
Ian Horrocks,
Ernesto Jimenez-Ruiz,
Erik B. Myklebus
Abstract:
The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking.…
▽ More
The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.
△ Less
Submitted 19 January, 2020;
originally announced January 2020.
-
TERA: the Toxicological Effect and Risk Assessment Knowledge Graph
Authors:
Erik Bryhn Myklebust,
Ernesto Jimenez-Ruiz,
Jiaoyan Chen,
Raoul Wolf,
Knut Erik Tollefsen
Abstract:
Ecological risk assessment requires large amounts of chemical effect data from laboratory experiments. Due to experimental effort and animal welfare concerns it is desired to extrapolate data from existing sources. To cover the required chemical effect data several data sources need to be integrated to enable their interoperability. In this paper we introduce the Toxicological Effect and Risk Asse…
▽ More
Ecological risk assessment requires large amounts of chemical effect data from laboratory experiments. Due to experimental effort and animal welfare concerns it is desired to extrapolate data from existing sources. To cover the required chemical effect data several data sources need to be integrated to enable their interoperability. In this paper we introduce the Toxicological Effect and Risk Assessment (TERA) knowledge graph, which aims at providing such integrated view, and the data preparation and steps followed to construct this knowledge graph.
We also present the applications of TERA for chemical effect prediction and the potential applications within the Semantic Web community.
△ Less
Submitted 12 December, 2019; v1 submitted 27 August, 2019;
originally announced August 2019.
-
Knowledge Graph Embedding for Ecotoxicological Effect Prediction
Authors:
Erik Bryhn Myklebust,
Ernesto Jimenez-Ruiz,
Jiaoyan Chen,
Raoul Wolf,
Knut Erik Tollefsen
Abstract:
Exploring the effects a chemical compound has on a species takes a considerable experimental effort. Appropriate methods for estimating and suggesting new effects can dramatically reduce the work needed to be done by a laboratory. In this paper we explore the suitability of using a knowledge graph embedding approach for ecotoxicological effect prediction. A knowledge graph has been constructed fro…
▽ More
Exploring the effects a chemical compound has on a species takes a considerable experimental effort. Appropriate methods for estimating and suggesting new effects can dramatically reduce the work needed to be done by a laboratory. In this paper we explore the suitability of using a knowledge graph embedding approach for ecotoxicological effect prediction. A knowledge graph has been constructed from publicly available data sets, including a species taxonomy and chemical classification and similarity. The publicly available effect data is integrated to the knowledge graph using ontology alignment techniques. Our experimental results show that the knowledge graph based approach improves the selected baselines.
△ Less
Submitted 11 November, 2019; v1 submitted 2 July, 2019;
originally announced July 2019.
-
Canonicalizing Knowledge Base Literals
Authors:
Jiaoyan Chen,
Ernesto Jimenez-Ruiz,
Ian Horrocks
Abstract:
Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability is limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing the literal with an existing entity from the KB or with a new entity that i…
▽ More
Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability is limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing the literal with an existing entity from the KB or with a new entity that is typed using classes from the KB. We propose a framework that combines both reasoning and machine learning in order to predict the relevant entities and types, and we evaluate this framework against state-of-the-art baselines for both semantic typing and entity matching.
△ Less
Submitted 26 June, 2019;
originally announced June 2019.
-
Learning Semantic Annotations for Tabular Data
Authors:
Jiaoyan Chen,
Ernesto Jimenez-Ruiz,
Ian Horrocks,
Charles Sutton
Abstract:
The usefulness of tabular data such as web tables critically depends on understanding their semantics. This study focuses on column type prediction for tables without any meta data. Unlike traditional lexical matching-based methods, we propose a deep prediction model that can fully exploit a table's contextual semantics, including table locality features learned by a Hybrid Neural Network (HNN), a…
▽ More
The usefulness of tabular data such as web tables critically depends on understanding their semantics. This study focuses on column type prediction for tables without any meta data. Unlike traditional lexical matching-based methods, we propose a deep prediction model that can fully exploit a table's contextual semantics, including table locality features learned by a Hybrid Neural Network (HNN), and inter-column semantics features learned by a knowledge base (KB) lookup and query answering algorithm.It exhibits good performance not only on individual table sets, but also when transferring from one table set to another.
△ Less
Submitted 30 May, 2019;
originally announced June 2019.
-
Human-centric Transfer Learning Explanation via Knowledge Graph [Extended Abstract]
Authors:
Yuxia Geng,
Jiaoyan Chen,
Ernesto Jimenez-Ruiz,
Huajun Chen
Abstract:
Transfer learning which aims at utilizing knowledge learned from one problem (source domain) to solve another different but related problem (target domain) has attracted wide research attentions. However, the current transfer learning methods are mostly uninterpretable, especially to people without ML expertise. In this extended abstract, we brief introduce two knowledge graph (KG) based framework…
▽ More
Transfer learning which aims at utilizing knowledge learned from one problem (source domain) to solve another different but related problem (target domain) has attracted wide research attentions. However, the current transfer learning methods are mostly uninterpretable, especially to people without ML expertise. In this extended abstract, we brief introduce two knowledge graph (KG) based frameworks towards human understandable transfer learning explanation. The first one explains the transferability of features learned by Convolutional Neural Network (CNN) from one domain to another through pre-training and fine-tuning, while the second justifies the model of a target domain predicted by models from multiple source domains in zero-shot learning (ZSL). Both methods utilize KG and its reasoning capability to provide rich and human understandable explanations to the transfer procedure.
△ Less
Submitted 20 January, 2019;
originally announced January 2019.
-
ColNet: Embedding the Semantics of Web Tables for Column Type Prediction
Authors:
Jiaoyan Chen,
Ernesto Jimenez-Ruiz,
Ian Horrocks,
Charles Sutton
Abstract:
Automatically annotating column types with knowledge base (KB) concepts is a critical task to gain a basic understanding of web tables. Current methods rely on either table metadata like column name or entity correspondences of cells in the KB, and may fail to deal with growing web tables with incomplete meta information. In this paper we propose a neural network based column type annotation frame…
▽ More
Automatically annotating column types with knowledge base (KB) concepts is a critical task to gain a basic understanding of web tables. Current methods rely on either table metadata like column name or entity correspondences of cells in the KB, and may fail to deal with growing web tables with incomplete meta information. In this paper we propose a neural network based column type annotation framework named ColNet which is able to integrate KB reasoning and lookup with machine learning and can automatically train Convolutional Neural Networks for prediction. The prediction model not only considers the contextual semantics within a cell using word representation, but also embeds the semantics of a column by learning locality features from multiple cells. The method is evaluated with DBPedia and two different web table datasets, T2Dv2 from the general Web and Limaye from Wikipedia pages, and achieves higher performance than the state-of-the-art approaches.
△ Less
Submitted 14 November, 2018; v1 submitted 3 November, 2018;
originally announced November 2018.
-
Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings
Authors:
Ernesto Jimenez-Ruiz,
Asan Agibetov,
Matthias Samwald,
Valerie Cross
Abstract:
Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In the paper we present an approach that combines a lexical index, a neural embedding model and locality modules to effectively divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Align…
▽ More
Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In the paper we present an approach that combines a lexical index, a neural embedding model and locality modules to effectively divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluation Initiative. The results are encouraging and suggest that the proposed methods are adequate in practice and can be integrated within the workflow of state-of-the-art systems.
△ Less
Submitted 31 May, 2018;
originally announced May 2018.
-
Evaluating Ontology Matching Systems on Large, Multilingual and Real-world Test Cases
Authors:
Christian Meilicke,
Ondrej Sváb-Zamazal,
Cássia Trojahn,
Ernesto Jiménez-Ruiz,
José-Luis Aguirre,
Heiner Stuckenschmidt,
Bernardo Cuenca Grau
Abstract:
In the field of ontology matching, the most systematic evaluation of matching systems is established by the Ontology Alignment Evaluation Initiative (OAEI), which is an annual campaign for evaluating ontology matching systems organized by different groups of researchers. In this paper, we report on the results of an intermediary OAEI campaign called OAEI 2011.5. The evaluations of this campaign ar…
▽ More
In the field of ontology matching, the most systematic evaluation of matching systems is established by the Ontology Alignment Evaluation Initiative (OAEI), which is an annual campaign for evaluating ontology matching systems organized by different groups of researchers. In this paper, we report on the results of an intermediary OAEI campaign called OAEI 2011.5. The evaluations of this campaign are divided in five tracks. Three of these tracks are new or have been improved compared to previous OAEI campaigns. Overall, we evaluated 18 matching systems. We discuss lessons learned, in terms of scalability, multilingual issues and the ability do deal with real world cases from different domains.
△ Less
Submitted 15 August, 2012;
originally announced August 2012.
-
First steps in the logic-based assessment of post-composed phenotypic descriptions
Authors:
Ernesto Jimenez-Ruiz,
Bernardo Cuenca Grau,
Rafael Berlanga,
Dietrich Rebholz-Schuhmann
Abstract:
In this paper we present a preliminary logic-based evaluation of the integration of post-composed phenotypic descriptions with domain ontologies. The evaluation has been performed using a description logic reasoner together with scalable techniques: ontology modularization and approximations of the logical difference between ontologies.
In this paper we present a preliminary logic-based evaluation of the integration of post-composed phenotypic descriptions with domain ontologies. The evaluation has been performed using a description logic reasoner together with scalable techniques: ontology modularization and approximations of the logical difference between ontologies.
△ Less
Submitted 7 December, 2010;
originally announced December 2010.
-
Building conceptual spaces for exploring and linking biomedical resources
Authors:
R. Berlanga,
E. Jimenez-Ruiz,
V. Nebot
Abstract:
The establishment of links between data (e.g., patient records) and Web resources (e.g., literature) and the proper visualization of such discovered knowledge is still a challenge in most Life Science domains (e.g., biomedicine). In this paper we present our contribution to the community in the form of an infrastructure to annotate information resources, to discover relationships among them, and t…
▽ More
The establishment of links between data (e.g., patient records) and Web resources (e.g., literature) and the proper visualization of such discovered knowledge is still a challenge in most Life Science domains (e.g., biomedicine). In this paper we present our contribution to the community in the form of an infrastructure to annotate information resources, to discover relationships among them, and to represent and visualize the new discovered knowledge. Furthermore, we have also implemented a Web-based prototype tool which integrates the proposed infrastructure.
△ Less
Submitted 7 December, 2010;
originally announced December 2010.
-
The Management and Integration of Biomedical Knowledge: Application in the Health-e-Child Project (Position Paper)
Authors:
E. Jimenez-Ruiz,
R. Berlanga,
I. Sanz,
R. McClatchey,
R. Danger,
D. Manset,
J. Paraire,
A. Rios
Abstract:
The Health-e-Child project aims to develop an integrated healthcare platform for European paediatrics. In order to achieve a comprehensive view of childrens health, a complex integration of biomedical data, information, and knowledge is necessary. Ontologies will be used to formally define this domain knowledge and will form the basis for the medical knowledge management system. This paper intro…
▽ More
The Health-e-Child project aims to develop an integrated healthcare platform for European paediatrics. In order to achieve a comprehensive view of childrens health, a complex integration of biomedical data, information, and knowledge is necessary. Ontologies will be used to formally define this domain knowledge and will form the basis for the medical knowledge management system. This paper introduces an innovative methodology for the vertical integration of biomedical knowledge. This approach will be largely clinician-centered and will enable the definition of ontology fragments, connections between them (semantic bridges) and enriched ontology fragments (views). The strategy for the specification and capture of fragments, bridges and views is outlined with preliminary examples demonstrated in the collection of biomedical information from hospital databases, biomedical ontologies, and biomedical public databases.
△ Less
Submitted 26 September, 2006;
originally announced September 2006.