Project to enhance language processing of Big Data announced

Linguamatics, Brandwatch and the University of Sussex, UK, have announced a joint project funded by the UK's Technology Strategy Board to address challenges faced by automated language processing software in harnessing diverse data sources. The project forms part of a broader Technology Strategy Board initiative focusing on enabling technologies to harness Big Data for economic growth.

The development will improve automatic extraction of information from scientific papers, news or social media for applications in research and development, marketing and competitive intelligence. The current generation of language processing software has had considerable success in extracting useful information from unstructured text, whether research literature or social media. However, adapting it to a new domain is often a laborious process, with respect both to the type of data (e.g. newswire vs. patent literature) and to the terminology used in a given domain (e.g. in medical practice vs. pharmaceutical research).

Humans can perform these tasks on small data sets, but struggle to keep pace with massively increasing amounts of electronic text. The EVOKES project, which stands for Exploitation of Diverse Data via Automatic Adaptation of Knowledge Extraction Software, will exploit distributional similarity techniques developed by the University of Sussex. The project will run for 18 months.
