Using AI for Good
There is a global need to identify chemicals of emerging concern (CECs) to reduce chemical harm and prevent future environmental damage. We have developed a methodology that uses Natural Language Processing to search the up-to-date scientific literature for indicators of emerging concern. This is the first time AI has been used to alert stakeholders that a product may be more harmful than previously thought. Candidate databases were evaluated on criteria including the relevance and comprehensiveness of the research covered, practical considerations for data extraction (e.g., API availability), and licensing considerations. Metadata are downloaded from the selected database and used to locate publishers’ landing pages, from which abstract texts are extracted. To identify emerging concern, texts are evaluated using AI to determine whether, for each chemical discussed, they satisfy two principal criteria:
1. Does the text evidence previously unknown effects on human health or the environment?
2. Does the text evidence contamination ability, including indicators of persistence or bioaccumulation?
The proportions of records satisfying each of the criteria are used to calculate ‘emerging concern scores’, which are monitored over time. Our results show that the very large number of publications released each day can be rapidly evaluated, and we have integrated the emerging concern scores into an existing chemical management framework. Initial validation confirms that the calculated scores reflect the current level of concern regarding a number of chemicals. This new methodology removes the overwhelming, if not impossible, task of manually vetting an ever-increasing influx of new literature.
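
As an illustration of the scoring step, the following minimal Python sketch computes, for each chemical and time period, the proportion of records satisfying each of the two criteria. It is not the implementation used in this work: the record structure, the field names, the period labels and the function name criterion_proportions are all hypothetical, and the way the two proportions are combined into a single emerging concern score is not specified here, so the sketch returns them separately.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One evaluated abstract for one chemical (hypothetical structure)."""
    chemical: str
    period: str          # e.g. a year or quarter label; granularity is an assumption
    new_effects: bool    # criterion 1: previously unknown effects evidenced
    contamination: bool  # criterion 2: persistence or bioaccumulation evidenced


def criterion_proportions(records):
    """Return {(chemical, period): (proportion_c1, proportion_c2)}."""
    counts = defaultdict(lambda: [0, 0, 0])  # [total, criterion 1 hits, criterion 2 hits]
    for r in records:
        c = counts[(r.chemical, r.period)]
        c[0] += 1
        c[1] += int(r.new_effects)
        c[2] += int(r.contamination)
    return {key: (c1 / total, c2 / total)
            for key, (total, c1, c2) in counts.items()}


# Made-up records for illustration only; they are not results from this work.
sample = [
    Record("chem-A", "2023", True, True),
    Record("chem-A", "2023", False, True),
    Record("chem-A", "2024", True, False),
    Record("chem-B", "2023", False, False),
]
proportions = criterion_proportions(sample)
```

Tracking the values for one chemical across successive periods gives the kind of time series that could be monitored for a rise in concern.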