ChemAxon, a provider of chemistry software solutions and consulting services for life science research, and Keymodule, a developer of chemical-structure-based data mining solutions, have announced the integration of the CLiDE optical chemical structure recognition (OCSR) software within ChemAxon’s technology, along with rights for ChemAxon to sell CLiDE.
ChemAxon has extensive capabilities for extracting and working with structures from English, Chinese, and Japanese text documents. These text mining capabilities are implemented throughout ChemAxon’s products – such as JChem for Office, Instant JChem and Marvin – as well as in its text mining toolkits, such as Document to Database, which automatically extracts chemistry from document archives.
However, a significant amount of the chemistry in documents is presented as structure images. The addition of CLiDE for recognising structures in these images completes ChemAxon’s capabilities and enables users to retrieve valuable chemistry-related data from any document or archive.
CLiDE provides a fast and accurate (98 per cent in some test cases) technology for the extraction of chemical structures from PDF, MS Office and image files, including journal publications, internal documents and patents. A key feature is the ability to flag potential errors at the individual structure level to facilitate manual correction.
By integrating the two technologies, robust automated systems can be created both for the routine mining of chemistry in repositories and information systems and for research chemists. This will provide a simple way to extract and use chemistry in day-to-day documents through ChemAxon’s research tools.