The benefits of Semantically Enriching document mining for Chemists

In this blog post, learn how our partner ChemAxon have integrated TERMite, SciBite’s ultrafast named entity recognition (NER) and extraction engine, into their leading cheminformatics platform, and how this can benefit your organisation’s informatics architecture.

At SciBite, we design our applications to make it as easy as possible to incorporate them into our clients’ existing informatics architecture. Interoperability is key, so we love working with like-minded companies to ensure we deliver the most value to our mutual clients and enable them to better exploit the scientific data available to them.

At last year’s SciBite User Group Meeting in Boston, it was great to hear from Jozsef David, who gave an update on how ChemAxon have incorporated our ultrafast named entity recognition (NER) and extraction engine, TERMite, into their leading cheminformatics platform.

Every year, more than 20,000 new compounds are published in medicinal and biological chemistry journals [1], so chemists spend countless hours searching for relevant new information. To address this challenge, ChemAxon have developed ChemLocator, a web-based search tool for finding chemistry information within unstructured data. ChemLocator leverages ChemAxon’s chemical recognition capabilities to quickly and accurately extract structures from images and chemical names within scientific documents found in both internal repositories and web sources.

Using ChemLocator to extract structured chemical data from unstructured documents [2]

According to Jozsef, “SciBite brings biology intelligence to the ChemAxon platform, enabling scientists to search chemical structure and biology knowledge via a single user interface and a consistent application programming interface.”

Extending ChemLocator’s chemical named entity recognition with SciBite’s biological named entity recognition [3]

Jozsef gave a great demo to illustrate how augmenting ChemLocator with TERMite enables users to quickly answer questions that span both biological and chemical information, such as identifying what research has been conducted on a given disease and/or target in relation to a specific chemical substructure.

Given that the partnership is still new, the progress to date has been impressive. Jozsef also described some of the new capabilities in the pipeline. We’re excited about how this partnership will develop and about the possibilities that making decisions based on all the relevant chemistry and biology evidence will bring for our customers.

Read the TERMite datasheet to learn more, or contact the SciBite team to find out what we can do for your organisation’s informatics architecture.

[1] Krallinger, M. et al. (2017) Information Retrieval and Text Mining Technologies for Chemistry. Chem. Rev. 117, 7673–7761.
[2] Taken from Jozsef’s presentation ‘ChemLocator: Document Mining for Chemists’, presented at SciBite’s 2018 UGM in Boston.
[3] Taken from Jozsef’s presentation ‘ChemLocator: Document Mining for Chemists’, presented at SciBite’s 2018 UGM in Boston.
