Recently I’ve been asked various questions about BERT, or more specifically BioBERT, a deep-learning based system for analysis of biomedical text. For those of you who aren’t familiar, BERT (Bidirectional Encoder Representations from Transformers) is a deep-learning based model of natural language, released as open source by Google in late 2018.
BERT is the result of a mammoth compute exercise that generates a general model, which is then further tuned for specific domains or questions; BioBERT, for example, is tuned for the biomedical domain. Competing approaches have raised a few eyebrows recently, with some, such as XLNet, claimed to make BERT-based models redundant, and others, such as the GPT-2 model from OpenAI, described as being “so good they are considered dangerous”!
You can read the original papers and blog posts for a more technical dive into how systems like BERT and XLNet differ from other methods, but the general consensus (with which we very much agree) is that these models represent strong steps forward in our ability to understand textual data. Indeed, SciBite’s award-winning semantic platform leverages these models alongside semantic technologies to address significant customer challenges, from pharmacovigilance to managing corporate acquisitions.
In biomedicine, deep learning models tend to focus on three key tasks: Named Entity Recognition, Relationship Extraction and Semantic Question/Answering – all topics very close to our heart (and software!). At SciBite we have extensive experience of using these types of deep learning approaches within our products and in customer projects, and we’re often asked to give our opinion on how they fit into a wider enterprise-ready ecosystem for text analysis and data management. Common questions include:
Here at SciBite we’ve deployed synergistic combinations of deep learning and semantic technologies for many different use cases, and we understand that the current landscape is very confusing! No one technology provides all of the answers, and while deep learning approaches are all the rage, there are many pitfalls to be aware of. Deploying deep learning and semantics through the SciBite platform insulates our customers from many of the common issues seen when deep learning models are deployed in production.
To this end, we’ve put together a technical briefing document to answer common questions for our customers and guide them through this exciting and rapidly developing area of technology. If you’d like a copy, or want to discuss which approach is right for you, please get in touch with the team.
© SciBite Limited / Registered in England & Wales No. 07778456