HPC platform used to analyse ancient language

Scientists at the University of Reading have discovered that 'I', 'we', 'who' and the numbers '1', '2' and '3' are among the oldest words, not only in English, but across all Indo-European languages. What's more, words like 'squeeze', 'guts', 'stick', 'throw' and 'dirty' look like they are heading for history's dustbin – along with a host of others.

Evolutionary language scientists from the University of Reading have been investigating how languages evolve, and whether that evolution follows any rules. Until recently they believed they would not be able to track words back in time for more than 5,000 years; however, their new IBM supercomputer has enabled them to go back almost 30,000 years and finally provide the answers.

The scientists have been able to analyse the family of Indo-European languages, of which English is a modern-day example, and have reconstructed the rate at which words evolve in order to predict future changes to our vocabulary. The oldest words we use today have been in existence for at least 10,000 years.

Looking to the future, the less frequently certain words are used, the more likely they are to be replaced. Other simple rules have been uncovered: numerals evolve the slowest, then nouns, then verbs, then adjectives. Conjunctions and prepositions such as 'and', 'or', 'but', 'on', 'over' and 'against' evolve the most quickly, some as much as 100 times faster than numerals. 'Throw', which is expected to evolve quickly, has a half-life of 900 years, and there are 42 unrelated sounds for it across all the languages. In 10,000 years' time, it will likely have been replaced in 10 of them, possibly including English, unless of course we all do our part to keep the word in circulation.
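To give a feel for what a half-life means here, the short sketch below works through the arithmetic. It assumes simple exponential decay, which is the usual reading of a half-life but is not necessarily the model the Reading team used; the function name and the chosen time spans are illustrative only.

```python
def surviving_fraction(years: float, half_life: float) -> float:
    """Chance that a word is still in use after `years`,
    assuming simple exponential decay (an assumption, not the study's model)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (years / half_life)


# Half-life quoted in the article for 'throw', in years.
THROW_HALF_LIFE = 900.0

# After each further half-life, the chance of survival halves again.
for years in (900, 1800, 2700):
    print(years, surviving_fraction(years, THROW_HALF_LIFE))
```

Under this assumption, a word with a 900-year half-life has an even chance of surviving 900 years, and a one-in-four chance of surviving 1,800 years.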

'50 per cent of the words we use today would be unrecognisable to our ancestors living 2,500 years ago. If a time-traveller came to us, and told us he wanted to go back to that period, we could arm him with the appropriate phrase book, and hopefully keep him out of trouble,' explained Mark Pagel, Professor of Evolutionary Biology at the University of Reading.

The IBM supercomputer at the University of Reading, known as ThamesBlue, is now one year old. Before it arrived, it took an average of six weeks to perform a computational task such as comparing two sets of words in different languages; now these same tasks can be executed in a few hours.

Professor Vassil Alexandrov, the University's leading expert on computational science and director of the University's ACET Centre, said: 'The new IBM supercomputer has allowed the University of Reading to push to the forefront of the research community. It underpins other important research at the university, including the development of accurate predictive models for environmental use. Based on weather patterns and the amounts of pollutant in the atmosphere, our scientists have been able to pinpoint likely country-by-country environmental impacts, such as the effect airborne chemicals will have on future crop yields and cross-border pollution.'

Caroline Isaac, Deep Computing Executive at IBM, said: 'Supercomputers are enabling the world to become increasingly interconnected, instrumented and intelligent. We have now reached a tipping point in price/performance that's allowing breakthroughs in university research that were previously unimaginable.'
