Computational approaches help to sift through multiple sources and identify relevant material, yet without the support of an ontology or controlled vocabulary they struggle to deal with the ambiguity of scientific literature. Multiple terms can describe the same topic, which makes any keyword search difficult.
Accurate detection of important topics within biomedical text relies on highly tuned vocabularies (thesauri) that contain all of the known terms for the same real-world “thing”.
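To make the synonym problem concrete, here is a minimal Python sketch of a thesaurus lookup. It is an illustration only, not how TERMite works: the `THESAURUS` contents and the `find_topics` function are hypothetical names invented for this example, and a real vocabulary would hold far more terms and handle ambiguity much more carefully.

```python
import re

# Hypothetical mini-thesaurus: each preferred term maps to its known synonyms.
THESAURUS = {
    "acetylsalicylic acid": ["acetylsalicylic acid", "aspirin", "ASA"],
    "paracetamol": ["paracetamol", "acetaminophen", "APAP"],
}

# Invert the thesaurus: lowercase synonym -> preferred term.
SYNONYM_TO_PREFERRED = {
    synonym.lower(): preferred
    for preferred, synonyms in THESAURUS.items()
    for synonym in synonyms
}

# One pattern for all synonyms, longest first so multi-word terms win.
_PATTERN = re.compile(
    r"\b("
    + "|".join(
        re.escape(s) for s in sorted(SYNONYM_TO_PREFERRED, key=len, reverse=True)
    )
    + r")\b",
    re.IGNORECASE,
)


def find_topics(text):
    """Return the set of preferred terms whose synonyms appear in the text."""
    return {
        SYNONYM_TO_PREFERRED[match.group(1).lower()]
        for match in _PATTERN.finditer(text)
    }
```

With this lookup, a sentence that mentions "acetaminophen" and one that mentions "paracetamol" resolve to the same topic, which a plain keyword search for either word alone would miss.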
These vocabularies may be simple flat lists (e.g. a list of all known drugs), or they may be organised into a hierarchy, often as an ontology. An ontology structures topics into scientifically related groups, such as ‘all anti-inflammatories’ or ‘all DNA replication proteins’.
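The difference between a flat list and a hierarchy can be sketched in a few lines of Python. The tiny `PARENT` table and the `ancestors` and `members` helpers below are assumptions made up for illustration; they do not reflect the structure of any SciBite vocabulary or public ontology.

```python
# Hypothetical toy hierarchy, stored as child -> parent links.
PARENT = {
    "ibuprofen": "NSAID",
    "naproxen": "NSAID",
    "NSAID": "anti-inflammatory",
    "prednisolone": "corticosteroid",
    "corticosteroid": "anti-inflammatory",
}


def ancestors(term):
    """Walk up the hierarchy and return every group the term belongs to."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def members(group):
    """Return the leaf terms (here, individual drugs) that fall under a group."""
    groups = set(PARENT.values())
    return sorted(
        term for term in PARENT
        if term not in groups and group in ancestors(term)
    )
```

A flat list could only answer "is ibuprofen a known drug?". With the hierarchy, `members("anti-inflammatory")` gathers every drug beneath that group, so a query for ‘all anti-inflammatories’ does not need to name each drug individually.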
The availability of high-quality vocabularies and ontologies is a critical foundation for any text analysis methodology.
We use a variety of public and in-house ontologies and vocabularies as reference tools for the TERMite engine. Each vocabulary is enhanced by a combination of our experienced in-house manual curation team and our proprietary ontology enrichment software.
Our VOCabs cover many more topics in far greater depth than any publicly available ontologies such as MeSH, UniProt and MedDRA. Put simply, if you’re not using SciBite VOCabs in your text analytics, you’re not going to capture the information your users need.
Get in touch with us to find out how we can transform your data.
© SciBite Limited / Registered in England & Wales No. 07778456