The construction principles, approaches, design considerations, and representation challenges of an ontology-based knowledge base prototype: OntoKBCF

Xia Jing, Nicholas Hardiker, Stephen Kay, Tom Marley, Yongsheng Gao

Research output: Contribution to journal › Article

Abstract

Background: Ontologies are key enabling technologies for the Semantic Web. The Web Ontology Language (OWL) is a semantic markup language for publishing and sharing ontologies.
Objectives: The supply of customizable, computable, and formally represented molecular genetics information and health information, via electronic health record (EHR) interfaces, can play a critical role in achieving precision medicine. In this study, we used cystic fibrosis as an example to build OntoKBCF, an ontology-based knowledge base prototype, to supply such information via an EHR prototype. In this paper, we elaborate on the construction and representation principles, approaches, applications, and representation challenges that we faced in the construction of OntoKBCF. The principles and approaches can be referenced and applied in constructing other ontology-based domain knowledge bases.
Methods: We defined the scope of OntoKBCF first according to possible clinical information needs about cystic fibrosis on both a molecular level and a clinical phenotype level. We then selected the knowledge sources to be represented in OntoKBCF. We utilized top-to-bottom content analysis and bottom-up construction to build OntoKBCF. Protégé-OWL was used to construct OntoKBCF. The construction principles included: to use existing basic terms as much as possible; to use intersection and combination in representations; to represent as many different types of facts as possible; and to provide two to five examples for each type. HermiT 1.3.8.413 within Protégé-5.1.0 was used to check the consistency of OntoKBCF.
Results: OntoKBCF was constructed successfully, with the inclusion of 408 classes, 35 properties, and 113 equivalent classes. OntoKBCF includes both atomic concepts, such as amino acid, and complex concepts, such as adolescent female cystic fibrosis patients, and their descriptions. We demonstrated that OntoKBCF could make customizable molecular and health information available automatically and usable via an EHR prototype. The main challenges include providing a more comprehensive account of different patient groups and representing uncertain knowledge, ambiguous concepts, negative statements, and more complicated and detailed molecular mechanisms or pathway information about cystic fibrosis.
Conclusions: Although cystic fibrosis is just one example, based on the current structure of OntoKBCF, it should be relatively straightforward to extend it to cover different topic areas. Moreover, the principles underpinning its development could be reused for building knowledge bases for other human monogenetic diseases.
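
The idea of building complex concepts from atomic ones by intersection, as the Methods and Results describe, can be illustrated with a minimal Python sketch. This is not OWL, not Protégé output, and not the authors' implementation: the class `ComplexConcept`, the function `classify`, and every concept name below are hypothetical and chosen only for illustration. Each complex concept is treated as a set of atomic concepts that must all hold, which loosely mirrors how an equivalent class defined as an intersection lets a reasoner classify an individual.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ComplexConcept:
    """A named concept defined as the intersection of atomic concepts (hypothetical model)."""

    name: str
    conjuncts: frozenset[str]

    def matches(self, asserted: set[str]) -> bool:
        # Membership in an intersection requires every conjunct to be asserted.
        return self.conjuncts <= asserted


def classify(asserted: set[str], concepts: list[ComplexConcept]) -> list[str]:
    """Return the sorted names of all complex concepts satisfied by the asserted atomic types."""
    return sorted(c.name for c in concepts if c.matches(asserted))


# Hypothetical atomic concept names; OntoKBCF's actual class names may differ.
CONCEPTS = [
    ComplexConcept(
        "CysticFibrosisPatient",
        frozenset({"Patient", "HasCysticFibrosis"}),
    ),
    ComplexConcept(
        "AdolescentFemaleCysticFibrosisPatient",
        frozenset({"Patient", "HasCysticFibrosis", "Adolescent", "Female"}),
    ),
]

patient = {"Patient", "HasCysticFibrosis", "Adolescent", "Female"}
print(classify(patient, CONCEPTS))
```

Reusing the same atomic terms across several definitions is what keeps the complex concepts comparable: the more specific concept here simply adds conjuncts to the more general one, so any individual matching it also matches the general concept.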
Language: English
Number of pages: 15
Journal: JMIR Medical Informatics
Volume: 6
Issue number: 4
DOI: 10.2196/medinform.9979
Publication status: Published - 21 Dec 2018

Fingerprint

Knowledge Bases
Cystic Fibrosis
Electronic Health Records
Semantics
Language
Precision Medicine
Health
Molecular Biology
Technology
Phenotype
Amino Acids

Cite this

@article{23028416336b406ea3461cdc43b48992,
title = "The construction principles, approaches, design considerations, and representation challenges of an ontology-based knowledge base prototype: OntoKBCF",
keywords = "OntoKBCF, Knowledge representation, Knowledge base, Ontology, Cystic fibrosis, Molecular genetics information, Phenotypes",
author = "Xia Jing and Nicholas Hardiker and Stephen Kay and Tom Marley and Yongsheng Gao",
year = "2018",
month = "12",
day = "21",
doi = "10.2196/medinform.9979",
language = "English",
volume = "6",
journal = "JMIR Medical Informatics",
number = "4",

}

The construction principles, approaches, design considerations, and representation challenges of an ontology-based knowledge base prototype: OntoKBCF. / Jing, Xia; Hardiker, Nicholas; Kay, Stephen; Marley, Tom; Gao, Yongsheng.

In: JMIR Medical Informatics, Vol. 6, No. 4, 21.12.2018.


TY - JOUR

T1 - The construction principles, approaches, design considerations, and representation challenges of an ontology-based knowledge base prototype: OntoKBCF

AU - Jing, Xia

AU - Hardiker, Nicholas

AU - Kay, Stephen

AU - Marley, Tom

AU - Gao, Yongsheng

PY - 2018/12/21

Y1 - 2018/12/21

KW - OntoKBCF

KW - Knowledge representation

KW - Knowledge base

KW - Ontology

KW - Cystic fibrosis

KW - Molecular genetics information

KW - Phenotypes

U2 - 10.2196/medinform.9979

DO - 10.2196/medinform.9979

M3 - Article

VL - 6

JO - JMIR Medical Informatics

T2 - JMIR Medical Informatics

JF - JMIR Medical Informatics

IS - 4

ER -