SKOS in CENtree: Further Support in our Latest 2.1 Release

At SciBite, terminologies underpin all that we do. There are many ways to represent and build a standardised terminology, each with a different level of complexity. At one end of the spectrum are simple, informal, lightweight terminologies (e.g., glossaries, dictionaries, and thesauri), where the meaning (semantics) of terms is captured using natural language.


Terminologies become more informative when we encode structure and semantic relationships into them, as in taxonomies or controlled vocabularies. At the other end of the spectrum, we have full-blown formal ontology languages like OWL, which allow you to build terminologies with strict and precise semantics.

 


Figure 1. Terminologies overview. A terminology may be represented with varying levels of expressivity and formality, depending on the use case it is designed to serve and on its level of maturity.

At SciBite we recognise that there is a mixture of formats for capturing terminologies in the wild and that each serves different use cases.

Standards such as Medical Subject Headings (MeSH) are designed for document indexing and categorisation. Concepts in MeSH are organised into a hierarchy using generic broader/narrower relationships that are useful in supporting document retrieval and navigation. For example, in MeSH the Anatomy branch organises Body Regions into a hierarchy where concepts such as Eye, Mouth, Nose and Chin are all narrower terms under Face, which is itself a narrower term of Head. In contrast, other standards such as Uberon represent body regions using an OWL ontology, where stricter relationships such as subclass and part-of are used to organise the hierarchy and provide a more meaningful description of these concepts.


Figure 2. Strict semantics in OWL vs weaker semantics in MeSH. Above we can see a class from Uberon, an OWL-based ontology, in CENtree and the equivalent class in MeSH. The graph view shows the type of relationships for each class: strict part_of relations in Uberon and less specific broader/narrower relations in MeSH.

Where weaker semantics are sufficient to organise concepts into hierarchies, the Simple Knowledge Organization System (SKOS) provides a convenient alternative to more formal ontology modelling languages like OWL, and helps improve the interoperability of controlled vocabularies and terminologies.

What is SKOS?

SKOS was developed as a standard by the W3C in the late 2000s for representing controlled vocabularies and thesauri. SKOS is often a good starting point when building new vocabularies that may later become ontologies; it provides a standard way to describe the common features of a controlled terminology, such as label predicates (preferred label, alternative label, etc.) and taxonomic information (broader/narrower relationships).

SKOS is predominantly used to support search and navigation use cases. In such settings the alternative label predicate enables synonyms to be captured, while the broader and narrower predicates allow users to browse for search terms and allow information retrieval applications to use this structure to expand queries automatically.
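As a rough illustration of query expansion over broader/narrower relationships, the sketch below uses plain Python dictionaries in place of a real SKOS vocabulary. The hierarchy reuses the MeSH body-region example from earlier; the synonym values and the `expand_query` helper are assumptions made for this example and are not part of SKOS, MeSH, or any SciBite product.

```python
from collections import deque

# Toy vocabulary based on the MeSH body-region example.
# NARROWER maps a concept to its direct narrower concepts
# (the inverse direction of the SKOS "broader" relationship).
NARROWER = {
    "Head": ["Face"],
    "Face": ["Eye", "Mouth", "Nose", "Chin"],
}

# Synonyms, playing the role of SKOS alternative labels.
# These values are illustrative only.
ALT_LABELS = {
    "Eye": ["Eyes"],
    "Mouth": ["Oral Cavity"],
}


def expand_query(concept):
    """Return the concept, its transitive narrower concepts, and their synonyms."""
    terms = []
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # guard against cycles or repeated concepts
        seen.add(current)
        terms.append(current)
        terms.extend(ALT_LABELS.get(current, []))
        queue.extend(NARROWER.get(current, []))
    return terms


print(expand_query("Face"))
# ['Face', 'Eye', 'Eyes', 'Mouth', 'Oral Cavity', 'Nose', 'Chin']
```

A search for "Face" would then also match documents that mention only "Eyes" or "Chin", which is the kind of automatic expansion the hierarchy makes possible.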

Furthermore, SKOS-XL, an extension of SKOS, allows literal entities (e.g. a label or synonym) to be represented as resources in their own right. This allows vocabulary editors to give textual labels a unique identity and to define relationships between these entities. At SciBite we can take advantage of the SKOS-XL representation to attach additional information to synonyms to aid named entity recognition (NER) in our TERMite system.

This makes SKOS-XL the perfect means for representing and sharing vocabularies within the SciBite stack. SKOS-XL allows for NER ‘rules’ to be captured in the vocabulary before it is passed on to TERMite to be used for marking up text. The same SKOS-XL representation can also be used by our search solution, SciBite Search, for encoding the associated taxonomy of the vocabulary.
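To make the idea of a label as a resource more concrete, here is a minimal Python sketch. It only mirrors the shape of the SKOS-XL model (a label with its own identifier and literal form); the class names, the example IRIs, and the annotation keys `source` and `case_sensitive` are all hypothetical and do not reflect how CENtree or TERMite actually store or name this information.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    """A label modelled as a resource in its own right, in the spirit of SKOS-XL."""
    iri: str            # the label's own identifier
    literal_form: str   # the text of the label
    # Hypothetical extra statements about the label (provenance, NER hints, ...).
    annotations: dict = field(default_factory=dict)


@dataclass
class Concept:
    iri: str
    pref_label: Label
    alt_labels: list = field(default_factory=list)


# Example IRIs and annotation keys are made up for illustration.
eye = Concept(
    iri="http://example.org/vocab/eye",
    pref_label=Label("http://example.org/label/eye-1", "Eye"),
    alt_labels=[
        Label(
            "http://example.org/label/eye-2",
            "Eyes",
            annotations={"source": "curator", "case_sensitive": False},
        ),
    ],
)


def labels_for_ner(concept):
    """Yield (text, annotations) pairs so a matcher could apply per-label rules."""
    for label in [concept.pref_label, *concept.alt_labels]:
        yield label.literal_form, label.annotations


for text, notes in labels_for_ner(eye):
    print(text, notes)
```

Because each label has its own identifier, statements such as provenance can be attached to one synonym without affecting the concept or its other labels, which a plain string label cannot support.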

SKOS in CENtree

Until recently, CENtree primarily supported OWL, hiding much of OWL's complexity behind its internal representation model. This internal representation is also used for controlled vocabularies, and it aligns well with SKOS. We are very pleased to announce that CENtree 2.1 includes additional features that build upon CENtree’s ability to support the ingestion, manipulation, and export of SKOS-based terminologies:

  • We support ingest and export of both SKOS and OWL, supporting organisations that work with mixed representations
  • We use SKOS as a higher-level exchange format for vocabularies, where the focus is mostly on lexical information (labels and synonyms) and some taxonomy
  • We support SKOS-XL where we want to say additional things about lexical entities (synonyms), such as capturing provenance or adding TERMite switch information

Conclusion

Although CENtree has been designed to handle a wide variety of standards seamlessly, we have extended its SKOS support in the latest release. This additional support not only provides smooth integration from CENtree to TERMite, but also helps users who are at the start of their ontology journey, and are therefore ingesting less complex terminologies into CENtree, as well as those who use the SKOS format to represent terminologies across the business.

To learn more about CENtree or find out more about how we can help you get more from your data, contact the SciBite team.
