LEWES, Del., Oct. 24, 2022 (GLOBE NEWSWIRE) -- John Snow Labs, the Healthcare AI and NLP company and developer of the Spark NLP library, today announced enhancements to its automated de-identification solution. The company recently established a new state-of-the-art record on the n2b2 standard de-identification benchmark, achieving an F1 score of 96.1% and reducing its error rate by 33%. By enabling organizations to automatically de-identify large datasets, John Snow Labs empowers product innovation and cost savings for healthcare organizations worldwide.
Providing the customized de-identification required for the monetization of data, John Snow Labs’ automated de-identification solution is already proving useful for users. The service is based on the company’s Spark NLP for Healthcare library, built on top of the Spark big data framework, enabling the processing of tens of millions of records on large Spark or Databricks clusters. The de-identification solution can be delivered as an end-to-end system or as a software library with optional professional services.
“We are using John Snow Labs to de-identify patient notes on a massive scale and the results from the out-of-the-box de-identification models have been remarkable,” said Nadaa Taiyab, Senior Data Scientist, Tegria. “It has been simple to fine-tune models with our own annotated data and improve pipeline results by adding regular expressions and text matching where needed. Overall, the code is very modular and easy to use, making the challenges and complexities of such a large-scale project much easier to navigate.”
Healthcare providers possess vast amounts of unstructured patient-level data. This data has great value, but often remains untapped due to legal and regulatory requirements. Once protected health information (PHI) is removed, the data becomes usable and has the potential to create new revenue streams and spark healthcare innovation. Removing it can be challenging, however, as stricter de-identification rules lower the risk of re-identification but also decrease the usability of the data.
While manual removal of PHI is possible, it is often rife with human error and requires multiple reviews. Additionally, the larger the data set, the more labor- and cost-intensive the project. Academic literature shows that a team with an average cost of $83 per hour in total compensation, processing 135 notes per hour with an average length of 130 words, costs $0.61 per note. For large data sets consisting of tens of millions of records, this is simply not feasible.
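The per-note figure above follows directly from dividing the hourly cost by the hourly throughput. As a rough illustration only, the short Python sketch below reproduces that arithmetic and scales it to a larger data set. The function names and the one-million-note example are hypothetical and are not taken from the cited literature or from John Snow Labs.

```python
# Back-of-the-envelope estimate of manual de-identification cost.
# Inputs come from the figures quoted above; the function names and
# the example data set size are illustrative assumptions.

def manual_cost_per_note(hourly_cost_usd: float, notes_per_hour: float) -> float:
    """Cost in USD to manually de-identify a single note."""
    return hourly_cost_usd / notes_per_hour


def manual_cost_for_dataset(
    num_notes: int,
    hourly_cost_usd: float = 83.0,
    notes_per_hour: float = 135.0,
) -> float:
    """Total cost in USD for num_notes, assuming a single review pass."""
    return num_notes * manual_cost_per_note(hourly_cost_usd, notes_per_hour)


if __name__ == "__main__":
    per_note = manual_cost_per_note(83.0, 135.0)
    print(f"Cost per note: ${per_note:.2f}")  # Cost per note: $0.61

    # Hypothetical data set of one million notes.
    total = manual_cost_for_dataset(1_000_000)
    print(f"Cost for 1,000,000 notes: ${total:,.0f}")
```

Because the estimate assumes a single review pass, the multiple reviews that manual de-identification typically requires would multiply the total accordingly.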
“Natural language processing has made it possible to automatically de-identify valuable, but otherwise unusable, unstructured patient-level data, like clinical notes, images, and scanned documents,” said David Talby, CTO, John Snow Labs. “Once de-identified, the datasets can be shared more safely and easily with researchers and builders, ushering in a new generation of accurate and innovative healthcare solutions. Without large-scale automatic data de-identification, this would not be possible at scale.”
Follow @JohnSnowLabs on Twitter for the latest news and updates. To learn more about Spark NLP or to start your free trial, visit: https://www.johnsnowlabs.com/spark-nlp/.
About John Snow Labs
John Snow Labs, the AI and NLP for healthcare company, provides state-of-the-art software, models, and data to help healthcare and life science organizations put AI to good use. Developer of Spark NLP, the world’s most widely used NLP library in the enterprise, John Snow Labs’ award-winning medical NLP software powers leading healthcare and pharmaceutical companies including Kaiser Permanente, McKesson, Merck, and Roche. The company is the creator and host of The NLP Summit, further educating and advancing the NLP community.
Contact
For media inquiries:
Gina Devine
John Snow Labs
+1 339-236-9206
[email protected]