What’s the most useful way to visualise an ontology? It’s a question I’ve returned to many times over the last decade of building tools which employ ontologies in some way. And when a friend recently asked me about useful mechanisms for visualising ontologies, I thought it was about time I wrote up some thoughts.
Before considering the various visualisations, it’s worth thinking about the components that constitute an ontology to offer some insight into what we’re playing with. An ontology is a graph – a directed graph – which means it has nodes (blobs) and edges (lines), with the edge having a directionality to tell you which way the relationship applies. The most basic of these is the subclass of relationship which tells you that one node is a subtype of another, such as ‘mouse is a subclass of mammal’ (i.e. all mice are also mammals).
There are other common types of relationships used in bio-ontologies which we might wish to see. The most common is partonomy, which indicates that a node is a part of another node; for instance, a mouse tail is part of a mouse.
There are others, such as develops from, which is commonly used to represent developmental biology. Has role is also used to show the different uses of a particular node in a given context, such as the role of a virus as a vector in an experiment.
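To make the idea of a directed graph with typed edges concrete, here is a minimal sketch in Python. The triples, the `build_graph` and `related` helpers and the toy data are all illustrative assumptions rather than any particular ontology library's API; real ontologies would normally be loaded from OWL or OBO files.

```python
from collections import defaultdict

# Hypothetical toy ontology: (subject, relationship, object) triples.
# Each triple is one directed edge, read as "subject <relationship> object".
EDGES = [
    ("mouse", "subclass of", "mammal"),
    ("human", "subclass of", "mammal"),
    ("mouse tail", "part of", "mouse"),
    ("virus", "has role", "vector"),
]


def build_graph(edges):
    """Index directed, typed edges by their source node."""
    graph = defaultdict(list)
    for subject, relationship, obj in edges:
        graph[subject].append((relationship, obj))
    return graph


def related(graph, node, relationship):
    """Return the nodes that `node` points to via `relationship`."""
    return [target for rel, target in graph.get(node, []) if rel == relationship]


graph = build_graph(EDGES)
print(related(graph, "mouse", "subclass of"))   # ['mammal']
print(related(graph, "mouse tail", "part of"))  # ['mouse']
print(related(graph, "mammal", "subclass of"))  # [] because edges are directional
```

The last call returns nothing because the edge runs from mouse to mammal and not the other way round, which is exactly what the directionality of the graph is meant to capture.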
This network of nodes and edges is the most natural form in which to think of an ontology and is frequently available in ontology editors and browsers, though it is one I consider to be only rarely useful. It’s easy to see why when considering the examples below.
As a faithful visual representation, this view reveals the ontology for what it is: complex. The high cognitive load placed on the viewer produces information overload and makes consuming anything beyond a few nodes and edges difficult. Graph views should really be left to experts who require a very specific view of an ontology, and as such they are rarely used in applications to show data. Although this view is overkill for most users, nodes and edges remain the building blocks upon which most other visualisations are built, the most common of which is the tree.
Every ontology browser and editor has some version of a hierarchical tree for displaying nodes and edges. Going up the tree conveys something of broadness (confusingly, the language used means that travelling up the tree takes you towards the ‘root’ nodes), and going down towards the ‘leaf’ nodes conveys something of narrowness. The default tree view typically shows the subclass/superclass relationships as branch-to-leaf relationships. Other hierarchy-like relationships, such as part of, can be shown in a similar manner, as in the examples below from the EBI’s OLS.
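As a rough illustration of how a tree view can be derived from subclass edges, the sketch below turns a list of (child, parent) pairs into an indented outline. The data and the helper names (`children_index`, `find_roots`, `render_tree`) are hypothetical, and the sketch assumes the hierarchy has no cycles. It is not how OLS or any specific browser is implemented.

```python
from collections import defaultdict

# Hypothetical 'subclass of' edges as (child, parent) pairs.
SUBCLASS_OF = [
    ("mammal", "animal"),
    ("mouse", "mammal"),
    ("human", "mammal"),
    ("bird", "animal"),
]


def children_index(pairs):
    """Map each parent to the list of its direct subclasses."""
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)
    return children


def find_roots(pairs):
    """Roots are nodes that appear as a parent but never as a child."""
    parents = {parent for _, parent in pairs}
    kids = {child for child, _ in pairs}
    return sorted(parents - kids)


def render_tree(children, node, depth=0):
    """Return the subtree under `node` as indented lines, root first."""
    lines = ["  " * depth + node]
    for child in sorted(children.get(node, [])):
        lines.extend(render_tree(children, child, depth + 1))
    return lines


children = children_index(SUBCLASS_OF)
for root in find_roots(SUBCLASS_OF):
    print("\n".join(render_tree(children, root)))
```

Because an ontology is a graph and not strictly a tree, a node with more than one parent would simply appear once under each parent in this outline, which is also what many tree browsers do.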
Trees are common because they’re intuitive and familiar, and they are used almost universally in websites to convey information where navigation by ‘topics’ is useful. Consider the image below taken from popular online grocery store Ocado. It’s pretty clear a ‘Cake’ is not a subclass of Bakery, but rather something that is ‘found in’ a bakery, and yet we can read that pretty easily and understand the grouping. A ‘muffin’ is a subclass of ‘small cakes’ though and we can also understand that pretty easily. The familiarity trees offer helps to lower the barriers to understanding what can be a complex picture. They don’t display everything, but what they do display is easy to understand and easy to navigate around; by simply clicking on a node we jump to that node in a ‘follow your nose’ type manner – the essence of the web.
Tree browsers allow a focus to be placed on a part of the ontology – a sub-branch – and can show some of the detail of the nodes in that particular branch. There are also techniques I broadly consider to be summarisation focused. Here, the aim is to show a broader picture – a taking-a-step-back type view – by showing less detail, or rather by aggregating the detail. I often call these satellite views.
One such visualisation uses treemaps in which the hierarchy is nested into rectangles and the area is used to describe a particular property of the tree. In the case of SATORI (shown below) the visualisation neatly describes two properties. Firstly, the size of the rectangle illustrates the number of subclasses for a particular ontology node. For instance, stem cell has quite a lot of subclasses compared to neural cell which has far fewer. Secondly, the colour shading of the rectangle indicates the maximum depth of the subclass hierarchy from that node, darker indicating a larger depth.
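The two properties described above, the number of subclasses and the maximum depth beneath a node, can be computed directly from the subclass edges. The following is a minimal sketch under assumed toy data; the cell hierarchy and the function names are illustrative and are not taken from SATORI.

```python
from collections import defaultdict

# Hypothetical 'subclass of' edges as (child, parent) pairs.
SUBCLASS_OF = [
    ("stem cell", "cell"),
    ("neural cell", "cell"),
    ("embryonic stem cell", "stem cell"),
    ("adult stem cell", "stem cell"),
    ("haematopoietic stem cell", "adult stem cell"),
    ("neuron", "neural cell"),
]


def children_index(pairs):
    """Map each parent to the list of its direct subclasses."""
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)
    return children


def descendants(children, node):
    """Collect every distinct subclass beneath `node` (direct or indirect)."""
    found = set()
    stack = list(children.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


def max_depth(children, node):
    """Length of the longest subclass chain below `node` (0 for a leaf)."""
    kids = children.get(node, [])
    if not kids:
        return 0
    return 1 + max(max_depth(children, kid) for kid in kids)


def treemap_metrics(children, node):
    """Size (subclass count) and shade (max depth) for each child of `node`."""
    return {
        kid: {"size": len(descendants(children, kid)),
              "depth": max_depth(children, kid)}
        for kid in children.get(node, [])
    }


children = children_index(SUBCLASS_OF)
print(treemap_metrics(children, "cell"))
# {'stem cell': {'size': 3, 'depth': 2}, 'neural cell': {'size': 1, 'depth': 1}}
```

In a treemap, `size` would drive the area of each rectangle and `depth` the colour shading. A plotting library would be needed to draw the rectangles themselves, which is outside the scope of this sketch.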
The advantage of this treemap summary is in condensing a lot of information into one view and making it comparable at a broader, more global level, in this case by the number of subclasses. This would not be straightforward to do in a basic tree view (e.g. showing all of the subclass tree down to leaf nodes, showing a count of leaf nodes, etc). The disadvantage is that it collapses a lot of detail; each aggregate represents a tree with its own hierarchy. There is, of course, always a trade-off when summarising complex data, as we have learnt with the graph view.
Another summarisation approach is to exploit the set-like nature of ontologies. A common method I’ve used in teaching about ontology relationships is to use a Venn diagram such as the one shown below. This can very effectively illustrate subclass relationships, for example that all mice are types of mammals, as are humans, and that humans and mice do not intersect (yet).
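The set-like reading of subclass relationships can be sketched with ordinary Python sets, treating each class as the set of its members. The individuals named here are made up purely for illustration, and `is_subclass` and `are_disjoint` are hypothetical helper names.

```python
# Hypothetical individuals belonging to each class.
MAMMALS = {"mickey", "jerry", "ada", "alan", "dumbo"}
MICE = {"mickey", "jerry"}
HUMANS = {"ada", "alan"}


def is_subclass(narrower, broader):
    """A is a subclass of B when every member of A is also a member of B."""
    return narrower <= broader


def are_disjoint(a, b):
    """Two classes are disjoint when they share no members."""
    return a.isdisjoint(b)


print(is_subclass(MICE, MAMMALS))    # True: all mice are mammals
print(is_subclass(MAMMALS, MICE))    # False: not all mammals are mice
print(are_disjoint(MICE, HUMANS))    # True: the circles do not overlap
```

In Venn diagram terms, the subset test corresponds to one circle sitting entirely inside another, and the disjointness test to two circles that never intersect.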
The same method can be applied to summarising ontologies using bubble diagrams. The example below is from the Open Targets project, illustrating evidence of links to certain diseases for a given target. In this visualisation, hierarchies are being shown as Venn diagrams, for instance Crohn’s disease is a subclass of digestive system disease. There is some commonality in what is being shown here with the treemap approach; colour shading and size both indicate the ‘score’ of the evidence link, though the key difference is that size does not indicate anything about the ontology structure, only about the target data it is describing.
Interestingly, the same view in the Open Targets Platform is available as a dendrogram tree (below). Each node is equally sized so the ‘bigger means more’ part of the Venn diagram is lost, though the colouring aspect remains. However, the nodes are easier to read as the text isn’t chopped by node boundaries as it is in the Venn.
The emphasis in the Open Targets example is on the data that has been annotated with an ontology, rather than the ontology itself. Here, the ontology visuals become a navigational aid intended to convey information about this data, rather than just a mechanism for exploring the ontology. This is similar in method to the faceted search aid, much like the online shopping example earlier. Here, the ontology is overlaid to show where an ontology, or parts of an ontology, appear in the data.
The example below shows another visual taking a similar approach in SciBite’s DOCstore. Here, colour cues are used to indicate where highlighted words correspond to hits within a given ontology; for instance, the search term ‘prostate cancer’ is highlighted in green in the text. Other hits which are not part of the original search are shown in light orange highlights, with a vertical colour bar offering a cue as to which ontology a hit belongs to, as shown in the key on the left-hand side of the page (e.g. blue for genes/proteins, such as the CNNM1 hit).
Much like tree visualisations, these approaches work well because they’re intuitive and familiar. Anyone who has ever pressed Ctrl+F on a web page or Word document will be accustomed to seeing search hits highlighted.
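The general idea of tagging text hits with the ontology they come from can be sketched with a simple dictionary lookup. This is only an assumed, naive illustration: the `TERMS` dictionary, the ontology labels and the `find_hits` function are hypothetical, and this is not a description of how DOCstore itself identifies or highlights terms.

```python
import re

# Hypothetical term dictionary: surface form -> ontology it belongs to.
TERMS = {
    "prostate cancer": "disease",
    "CNNM1": "gene/protein",
}


def find_hits(text, terms):
    """Return (start, end, matched text, ontology) for each whole-word match."""
    hits = []
    for term, ontology in terms.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0), ontology))
    return sorted(hits)  # ordered by position in the text


text = "CNNM1 expression was measured in prostate cancer samples."
for start, end, matched, ontology in find_hits(text, TERMS):
    print(f"{matched!r} -> {ontology}")
```

A front end could then map each ontology label to a colour (for example blue for genes and proteins) and wrap the character ranges in highlighted spans. Exact string matching like this misses synonyms and ambiguous terms, so it should be read as a sketch of the visual idea only.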
The biology modelled within an ontology can also be exploited in a more literal manner. The Expression Atlas at EBI uses an anatomical figure which has been annotated with parts of the ontology to visually highlight where data is found. In the example below, the parts of anatomy being studied within a given experiment (in this case the 19 NIH Epigenomics Roadmap has been selected) are shown in red on the anatomical figure. The heatmap to the right renders the ontology in a more conventional style, while colour is used to illustrate expression level (darker is higher).
As with most visualisation techniques, there is no single ‘best’ way that fits all users and all applications. An understanding of the user community and how they intend to consume the data is critical, as is frequently reviewing their ability to continue to use the visualisations that have been developed as data and ontologies evolve. Often a mix of user types requires a mixture of visualisations (for instance, Open Targets has at least three to describe the same data in different ways).
Trees are intuitive and familiar, maps and set diagrams can summarise well, and more creative visualisations (such as human forms) can also exploit the contents of the ontology more literally. On the more niche side, graph views are very rich but very complex and should be reserved for only the most ardent ontology user. The one thing I can be sure of is that word clouds are almost never used because it’s 2019…
© SciBite Limited / Registered in England & Wales No. 07778456