Non-toxic from the start through AI analysis of patent data

Could a pilot study on European patent data be the first step to preventing a future chemical disaster? Researchers at RISE have built a machine learning model that identifies bisphenols and, potentially, other hazardous chemical substances. With further training and development, the social benefits can be enormous.

The world’s chemical inspectorates are essentially always one step behind. Bisphenols are a telling example. First comes the suspicion: is the substance an endocrine disruptor? Then the evidence: yes, it is. And then the regulation: EU-wide ban on bisphenol A in baby bottles in 2011, and then in thermal paper (used for receipts) in 2020.

The problem? Regulations are introduced after the damage has already been done, while new, not-yet-regulated variants frequently appear, forcing authorities to constantly play catch-up.

There is undoubtedly a great need for a powerful and proactive way of working. In collaboration with the Swedish Chemicals Agency and the Swedish Patent and Registration Office, researchers at RISE have used data from the European Patent Office to evaluate a number of AI methods.

“Patent data is unusually well-organised,” says Olof Görnerup, researcher in machine learning and large-scale data analysis at RISE. “It is structured to be machine-readable and easy to process.”

Furthermore, researchers state that patent data is a good way to understand what is happening in technology development. Granted patents normally precede broad implementation by several years and contain information about the chemical substances in a product. Authorities can therefore learn early on whether problematic chemicals will be introduced to the market.

Language technology presents new possibilities

In the past, this type of mapping has been carried out over months of manual keyword searches. Such work is also difficult to verify: how does one know that all relevant documents have been included?

The rapid development of language technology presents brand-new possibilities.

“Instead, we used semantic search through an AI model that ‘understands’ more of the text,” explains Görnerup. “That certain words and phrases relate to a subject. It involves finding words and phrases that are semantically similar to each other.”
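As a rough illustration of what "semantically similar" can mean in practice, the sketch below ranks documents by cosine similarity between vectors. It is not the RISE system: the three-dimensional vectors, the document titles and the `semantic_search` helper are all hypothetical, and in a real setup a pre-trained language model would produce the vectors from the patent text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings". A real model would derive vectors with
# hundreds of dimensions from the text of each patent.
documents = {
    "thermal paper with phenolic colour developer": [0.9, 0.8, 0.1],
    "epoxy resin coating for food cans": [0.7, 0.9, 0.2],
    "gearbox housing for wind turbines": [0.1, 0.0, 0.9],
}

# Stands in for the embedded form of a query such as "bisphenol".
query_vector = [0.8, 0.9, 0.1]


def semantic_search(query, docs, top_k=2):
    """Return the top_k (score, title) pairs, most similar first."""
    scored = [(cosine_similarity(query, vec), title) for title, vec in docs.items()]
    scored.sort(reverse=True)
    return scored[:top_k]


results = semantic_search(query_vector, documents)
for score, title in results:
    print(f"{score:.3f}  {title}")
```

Unlike a keyword search, nothing here requires the query word to appear in the document: the ranking depends only on how close the vectors are.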

It’s important to understand where chemicals are used

To determine which documents are relevant, the AI system was first trained on reference data from a previous mapping of bisphenol-related patents and the PubChem database. A small selection of documents with known classifications was shown to an AI model, which then used this information to find other relevant documents.
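One simple way to picture learning from a small labelled selection is a nearest-centroid classifier: average the vectors of the known examples in each class, then assign each new document to the closest average. This is only a sketch under that assumption. The article does not say which classifier the researchers used, and the two-dimensional vectors and patent names below are made up.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical embeddings of a few documents with known classifications.
labelled = {
    "relevant": [[0.9, 0.8], [0.8, 0.9]],
    "not relevant": [[0.1, 0.2], [0.2, 0.1]],
}
centroids = {label: centroid(vectors) for label, vectors in labelled.items()}


def classify(vector):
    """Assign the label whose centroid lies closest to the vector."""
    return min(centroids, key=lambda label: squared_distance(vector, centroids[label]))


# Hypothetical unlabelled documents to be sorted by the model.
unlabelled = {"patent A": [0.85, 0.7], "patent B": [0.15, 0.3]}
predictions = {name: classify(vector) for name, vector in unlabelled.items()}
print(predictions)
```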

High precision with AI

The results showed that the best-performing AI method was able to identify 96 percent of all relevant patents (recall), while 91 percent of the patents it identified as relevant were in fact relevant (precision).
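For readers unfamiliar with the two measures, the standard definitions are: recall is the share of all truly relevant documents that the method finds, and precision is the share of the documents it flags that are truly relevant. The sketch below computes both on made-up patent identifiers. The data and the `precision_recall` helper are illustrative only and are not taken from the study.

```python
def precision_recall(relevant, retrieved):
    """Return (precision, recall) for a set of retrieved documents."""
    relevant, retrieved = set(relevant), set(retrieved)
    true_positives = len(relevant & retrieved)
    # Precision: how many of the flagged documents are truly relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: how many of the truly relevant documents were flagged.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers: four patents are truly relevant, and the
# method flags five, three of which are correct.
relevant = {"P1", "P2", "P3", "P4"}
retrieved = {"P1", "P2", "P3", "P9", "P10"}

precision, recall = precision_recall(relevant, retrieved)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The two numbers pull in different directions: flagging more documents tends to raise recall but lower precision, which is why both are reported.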

“We immediately saw that there is massive potential to use AI in this area,” says Görnerup.

The researchers stress that more development is needed and that the models need to be fine-tuned. Among other things, there is a lot of image data in the form of line drawings and technical diagrams that the pre-trained models (including CLIP) had difficulty processing. Much of the chemical information in patents is in image form, so there is great potential for better searching using computer vision.

According to the researchers, something within close reach is the creation of a tool for the Swedish Chemicals Agency to automatically scan for hazardous chemicals without requiring extensive insight into various technology fields:

“It’s important to understand where chemicals are used. A hazardous chemical that is only used in controlled processes does not pose as much of a risk as a chemical that may be less hazardous but which is at risk of spreading widely.

“Are there niche fields of technology where such use slips under the radar? For example, thermal paper was discovered quite late. It is used everywhere; many people handle it daily in receipt printing.

“Imagine the benefits that could be achieved if another PFAS disaster could be avoided.”

Olof Görnerup

Contact person

Olof Görnerup

Senior Researcher

+46 70 252 10 62
