TERMite 6.1 Release
13th April 2017

Author: SciBite Team

SciBite is pleased to announce the release of the latest update to our TERMite text analysis engine, part of our Semantic Services Platform. Accessible through a fast, simple RESTful API, it provides a comprehensive set of capabilities including:

  • Highly curated scientific ontologies, built upon open standards
  • Formal based Named Entity Recognition
  • Relationship mapping and extraction, identifying patterns
  • Elastic Search of semantically rich data
  • Live enrichment of browser-based content
  • Seamless integration with third-party applications, providing search and connectivity

SciBite’s pluggable technology allows customers to integrate semantic enrichment services across their organisations, supporting multiple use cases that require a deep understanding of scientific content. As well as a fresh new look, the new TERMite release includes many new features which we’re sure will enhance your experience. Let’s take a quick look at each of them:

The new look TERMite

Fuzzy matching

This feature unlocks even more data and is particularly helpful if you work with text that is not proofed to the same standard as Medline, such as patents or internal documents. Fuzzy matching picks out misspelled words and aligns them to known dictionary terms.
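To make the idea concrete, here is a minimal Python sketch of aligning misspelled words to dictionary terms. It is an illustration only and not how TERMite implements fuzzy matching: it uses the standard library's `difflib`, and the tiny `DICTIONARY` list, the `align_to_dictionary` helper and the 0.8 similarity cutoff are all assumptions made for this example.

```python
import difflib

# Hypothetical mini-dictionary for illustration; real vocabularies are far larger.
DICTIONARY = ["paracetamol", "ibuprofen", "metformin", "atorvastatin"]


def align_to_dictionary(token, terms=DICTIONARY, cutoff=0.8):
    """Return the closest dictionary term for a token, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(token.lower(), terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Text with two misspelled drug names, as might appear in a patent or internal document.
text = "Patients received paracetmol and ibuprofin daily"

# Keep only the words that could be aligned to a dictionary term.
aligned = {}
for word in text.split():
    term = align_to_dictionary(word)
    if term is not None:
        aligned[word] = term

print(aligned)
```

The cutoff controls the trade-off: a lower value recovers more misspellings but risks aligning unrelated words to dictionary terms.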

If you’d like to know more about this feature, read our fuzzy-matching blog post.

Increased connectivity to databases

We’ve enhanced TERMite’s database connectivity further: as well as pulling data out of your database to annotate and structure it, TERMite can now write the results back into your database in the format you need.

Online processing services

These services cover image-to-text and voice-to-text conversion, as well as language translation.

So, for example, you could be listening to a keynote speaker at a conference and would like a deeper understanding of their session.  You can now convert their speech into text and then ‘TERMite’ the text, to highlight the main scientific concepts.  

Converting images to text means you can finally analyse those PDFs or scanned documents you have stored away, by running them through TERMite too.

TERMite now offers a plug-in to Google Translate, so text in other languages can be translated into English and then analysed, giving you access to even more structured data.

Deeper integration with Python scripts

We’ve developed a TERMite module that you can use within Python scripts, making it even more accessible to data scientists and allowing you to embed your own code within the TERMite engine.

A new Institutions module

Academic institutes can end up being named or cited in a number of ways, all different to their official name.  So if you wanted to calculate publication statistics for academic institutions or analyse grants and funding, you would either have to manually curate the data, or spend a significant amount on an external company to curate it for you.

This new module allows you to automatically normalise the different names for the same institution to a single unique identifier, saving you time, money and your sanity.
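As a rough illustration of what normalising name variants to one identifier means, the Python sketch below maps a few spellings of the same institution to a shared ID. It does not use the Institutions module itself: the identifiers, the variant lists and the `normalise_institution` helper are all hypothetical and exist only for this example.

```python
import re

# Hypothetical identifiers and name variants, for illustration only.
INSTITUTION_VARIANTS = {
    "INST:0001": ["University of Cambridge", "Cambridge University", "Univ. of Cambridge"],
    "INST:0002": ["Massachusetts Institute of Technology", "MIT", "M.I.T."],
}


def _key(name):
    """Lower-case a name and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Reverse lookup from each cleaned-up variant to its identifier.
LOOKUP = {
    _key(variant): inst_id
    for inst_id, variants in INSTITUTION_VARIANTS.items()
    for variant in variants
}


def normalise_institution(name):
    """Return the identifier for a known name variant, or None if it is not recognised."""
    return LOOKUP.get(_key(name))


for cited_name in ["cambridge university", "Univ. of Cambridge", "M.I.T"]:
    print(cited_name, "->", normalise_institution(cited_name))
```

Once every citation resolves to a single identifier, statistics such as publications per institution reduce to a simple count over those identifiers.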

A raft of new dictionary (VOCab) modules

One example is the Country VOCab. Couple this with the Institutions module mentioned earlier and you have a very powerful tool for tracking who’s publishing what and where in the world. Others, such as the Bioassay Ontology, provide a great way to search and organise pre-clinical data.

There are lots of other features too, so if you’d like to know more about them or any of those listed in this post, please get in touch with us at info@scibite.com and sign up to our newsletter to stay up to date with developments.