Around 4,000 years ago, one of the world’s oldest civilizations arose: the Indus Valley Civilization, which flourished in what is now Pakistan, western India, eastern Iran and parts of Afghanistan. In addition to building large cities, its people created a written script consisting of hundreds of characters that remain undeciphered.
The characters, sometimes called Harappan script, vary widely. Some look like a diamond with a square in the corner, a U with three “fingers” at each end, or an oval with a star-like shape inside.
The undeciphered script
The Indus Valley Civilization flourished between about 2600 and 1900 BC. Thousands of artifacts bearing the script survive to this day, Michael Philip Oakes, a researcher in computational linguistics at the University of Wolverhampton in the UK, wrote in an article published in the Journal of Quantitative Linguistics.
The surviving texts tend to be very short, with an average of five characters per text, Oakes noted. No bilingual text containing the Indus Valley script is known, so nothing exists to aid decipherment; in other words, the script has no Rosetta stone of its own. It is also uncertain what language the script encodes. Some scholars have argued that it may not encode a language at all, and that the signs may instead function as emblems denoting a person or entity.

Exactly how many characters the script contains is a matter of debate, but they number in the hundreds, Oakes said.
Experts have mixed ideas about whether the script will ever be deciphered. Even if decoded, the texts’ short lengths and scholars’ wide differences of opinion may make it difficult for any decipherment to be widely accepted.
While some experts believe that AI can help decipher the script, researchers will likely need to guide the AI to achieve a complete decipherment, the experts said.
Is it partially deciphered already?
Steve Bonta, an independent researcher who has a doctorate in linguistics and has studied the script extensively, said that some of the work may already be done.
“I believe that the Indus Valley script has already been partially deciphered, but that recognition of that fact is seriously lagging,” Bonta told LiveScience in an email. Bonta said he showed “back in the 90s that certain characters and canonical character fields must be indicative of notations of assets, expressed in different weights.” However, many scholars do not acknowledge that the decipherments are accurate.

Bonta said he is far from alone in claiming to have partially deciphered the script. Before the mid-’90s, “claims of decipherment were published quite regularly,” Bonta said. None of these claims has gained widespread acceptance; one problem is that the brevity of the surviving texts makes it difficult to prove the accuracy of any decipherment.
“Most of the Indus inscriptions are short and highly repetitive, making the task of reproducible decipherment very difficult,” Bonta said.
Over to AI
AI is useful for decipherment attempts and can help researchers generate lists of possible character values. But ultimately, human scientists will still have to take the lead. AI “is an extension of human intellect and intuition, albeit an extraordinarily powerful one,” Bonta said.
Peter Revesz, a professor of computing at the University of Nebraska-Lincoln who is an expert in computational linguistics and has studied the Indus Valley script extensively, believes the script will be deciphered and that AI can play a significant role. Revesz’s team has used data mining and statistical analysis to help determine which Indus Valley script characters probably have similar meanings.
“The Indus Valley Script will surely be solved in some way, and AI can help, but it must be guided by a good research design,” Revesz said in an email.
Rajesh Rao, a professor of computer science at the University of Washington in Seattle who has written several papers on the Indus script, is less optimistic about a complete decipherment, but said AI will be useful. Back in the 2000s, using the more primitive AI available at the time, his team found that the script has a statistical pattern suggesting it encodes a language.
But even with AI, a complete decipherment seems unlikely with the existing texts, according to Rao. The chances “are not very high,” Rao told LiveScience, noting that a partial decipherment might be possible. “We may be able to reconstruct their number system,” Rao said.
Rao said the number system is already partially understood because some inscriptions have numeral marks (vertical lines) believed to represent numbers. These are placed next to symbols that probably represent objects. Additionally, archaeological data suggests that people used a system of standardized weights that involved ratios of 1, 2, 4, 8, 16, 32, and 64. By using the counting marks and the weight system, it may be possible to determine which numbers are recorded on the inscriptions.
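To make that reasoning concrete, here is a minimal Python sketch. It is purely illustrative and is not a method used by Rao or any other researcher quoted here. It assumes a hypothetical transliteration in which each counting stroke is written as `|` and object signs are given placeholder names such as `JAR`. It counts the strokes in an inscription, then shows how any whole-number quantity up to 127 base units can be made from at most one weight of each ratio in the 1, 2, 4, 8, 16, 32, 64 series.

```python
from typing import List

# Ratios of the standardized weights, as listed in the article.
WEIGHT_RATIOS = [1, 2, 4, 8, 16, 32, 64]


def count_strokes(inscription: str, stroke: str = "|") -> int:
    """Count tally strokes in a transliterated inscription.

    The "|" transliteration is a hypothetical convention for this sketch.
    """
    return inscription.count(stroke)


def weights_for(total: int) -> List[int]:
    """Pick distinct weights (largest first) that add up to `total` base units."""
    if not 0 <= total <= sum(WEIGHT_RATIOS):
        raise ValueError("total must be between 0 and 127 base units")
    chosen = []
    remaining = total
    for weight in sorted(WEIGHT_RATIOS, reverse=True):
        if weight <= remaining:
            chosen.append(weight)
            remaining -= weight
    return chosen


if __name__ == "__main__":
    # "JAR" is a made-up placeholder for an object sign, not a real reading.
    print(count_strokes("||| JAR"))  # 3
    print(weights_for(7))            # [4, 2, 1]
    print(weights_for(100))          # [64, 32, 4]
```

Because each ratio is double the previous one, taking the largest weight that still fits always yields an exact match, which is one reason a doubling series is convenient for weighing goods.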
To decipher the entire script, Rao believes archaeologists will have to unearth more texts. Many Indus Valley Civilization sites remain largely unexcavated, and he hopes future excavations may yield longer texts or texts that pair the Indus Valley script with a known language.






