Ontologies and Semantic Web technologies play an important role in the life sciences by helping to make data more interoperable and reusable. There are now many publicly available ontologies that enable biologists to describe everything from gene function through to animal physiology and disease.
Various efforts such as the Open Biomedical Ontologies (OBO) foundry provide central registries for biomedical ontologies and ensure they remain interoperable through a set of common shared development principles.
At EMBL-EBI we contribute to the development of biomedical ontologies and make extensive use of them in the annotation of public datasets. Biological data typically comes with rich and often complex metadata, so the ontologies provide a standard way to capture “what the data is about” and give us hooks to connect to more data about similar things.
These ontology annotations have been put to good use in a number of large-scale data integration efforts and there’s an increasing recognition of the need for ontologies in making data FAIR (Findable, Accessible, Interoperable and Reusable).
EMBL-EBI build a number of integrative data platforms where ontologies are at the core of our domain models. One example is the Open Targets platform, where data about disease from 18 different databases can be aggregated and grouped based on therapeutic areas in the ontology and used to identify potential drug targets.
The ontologies team at EMBL-EBI provide a suite of services that are aimed at making ontologies more accessible for both humans and machines. We work with scientific data curators and software developers to integrate ontologies and semantics into both the data generation and data presentation workflows. We provide:
– An ontology lookup service (OLS) that provides search and visualisation services for over 200 ontologies
– Services for automating the annotation of metadata and learning from previous annotations (Zooma)
– An ontology mapping and alignment service (OXO)
– Tools for working with metadata and ontologies in spreadsheets (Webulous)
– Software for enriching documents in search engines to support “semantic” query expansion
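As a rough illustration of what “semantic” query expansion means in the last item, the sketch below expands a search term to include all of its descendants in a small is-a hierarchy before matching documents. It is a minimal sketch under stated assumptions, not a description of our software: the hierarchy, the document titles and the function names (expand_query, search) are all hypothetical and chosen only for the example.

```python
from collections import deque

# Toy is-a hierarchy (parent -> direct children).
# Hypothetical and simplified; not taken from any real ontology.
TOY_CHILDREN = {
    "cancer": ["carcinoma", "leukemia", "sarcoma"],
    "leukemia": ["acute myeloid leukemia"],
}


def expand_query(term, children=TOY_CHILDREN):
    """Return the term followed by all of its descendants, breadth-first."""
    expanded = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in expanded:
                expanded.append(child)
                queue.append(child)
    return expanded


def search(documents, term):
    """Return documents whose text mentions the term or any descendant."""
    terms = expand_query(term)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


# Hypothetical document titles, used only to exercise the functions.
DOCS = [
    "Expression profiling in sarcoma samples",
    "Leukemia cell line RNA-seq",
    "Mouse liver development time course",
]

if __name__ == "__main__":
    print(expand_query("cancer"))
    print(search(DOCS, "cancer"))
```

A plain keyword search for "cancer" would match none of these titles; with expansion, the sarcoma and leukemia documents are returned because both terms sit below "cancer" in the toy hierarchy.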
I’ll present how we are using these services at EMBL-EBI to scale up the semantic annotation of metadata. I’ll talk about our open source technology stack and describe how we utilise a polyglot persistence approach (graph databases, triple stores, document stores, etc.) to optimise how we deliver ontologies and semantics to our users.
At EMBL-EBI, we make the world’s public biological data freely available to the scientific community via a range of services and tools, perform basic research and provide professional training in bioinformatics.
We are situated on the Wellcome Genome Campus in Hinxton, Cambridge, UK, one of the world’s largest concentrations of scientific and technical expertise in genomics.