Classics Converter

What is this?

This is not a translator from Latin or Sanskrit to other languages. Instead, this is a predictive model: it takes words in Latin or Sanskrit (what I'll call “classical” languages) and predicts their descendants, through the natural evolution of language, in modern languages. (Plus Pali, which we'll get to later.)

Over time, languages change. These changes are sometimes unpredictable, but more often than not they follow well-defined rules. In theory, these rules can be programmed into a computer, which is what you're seeing here. For example, a stressed short o in Latin tends to become ue in Spanish, as when iocum becomes juego. There are many more such rules, often much more complicated.
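To give a feel for what "programming the rules into a computer" can look like, here is a minimal sketch in Python. It is not the converter's actual code: the rule list, the function name apply_rules, and the convention of writing the stressed short o as a capital O are all assumptions made for illustration, and the four rules are drastically simplified versions of real Latin-to-Spanish changes.

```python
import re

# Hypothetical, heavily simplified Latin-to-Spanish rules, applied in order.
# Assumed input convention: the stressed short o is written as capital "O".
VOWEL = "[AEIOUaeiou]"
RULES = [
    (r"um$", "o"),                            # final -um > -o
    (rf"^i(?={VOWEL})", "j"),                 # word-initial i before a vowel > j
    (rf"(?<={VOWEL})c(?={VOWEL})", "g"),      # c between vowels > g
    (r"O", "ue"),                             # stressed short o > ue
]

def apply_rules(word, rules=RULES):
    """Apply each (pattern, replacement) rule once, in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word.lower()

for latin in ["iOcum", "fOcum", "bOnum"]:
    print(latin.lower(), "->", apply_rules(latin))
```

With these assumptions, iocum, focum, and bonum come out as juego, fuego, and bueno. The order of the rules matters: each rule sees the output of the previous one, which is one reason a real rule set gets complicated quickly.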

There are at least five ways this tool could fail to predict a translation:

And of course, there is the fact that this tool is not perfect. I know I've made many mistakes with this tool, and there are many more mistakes yet to be discovered. If you'd like to contribute to the Classics Converter, or just check out the source code, the GitHub repository is at crackalamoo/classics-converter.

Historical Background

Sanskrit dates back to around 1500 BCE, the time of the Rigveda. While it was not written at this time, it was preserved orally so that we know what this early stage of the language was like. Sanskrit was standardized much later, around 500 BCE (give or take a few hundred years), by Pāṇini in his Aṣṭādhyāyī. This standardized form is known as Classical Sanskrit.

While Sanskrit is a dead language, its spoken dialects evolved into various Prakrits, from which the modern Indo-Aryan languages descend. These include languages such as Hindi/Urdu, Punjabi, and Marathi, mostly found in North India and nearby countries like Pakistan and Bangladesh. There's also Pali, another dead language, which continues to be used as the liturgical language of Theravada Buddhism. Pali dates to around 200 BCE, the time of the Prakrits.

The situation for Latin is similar. Old Latin dates to around 753 BCE (the traditional founding date of Rome), while Classical Latin dates to around 75 BCE. Classical Latin was not the creation of one person like Classical Sanskrit, but was developed by various writers who wrote the classics of Latin literature. In contrast to Classical Latin, the Latin that was spoken by common people is known as Vulgar Latin. Unfortunately, this form of Latin was rarely written, so we have limited knowledge of what it was like.

Like Sanskrit, Latin is now a dead language. However, over the centuries, the various dialects of Vulgar Latin evolved differently and formed modern languages such as Spanish, French, Portuguese, and Italian. The languages descended from Vulgar Latin are known as the Romance languages.

Indo-Aryan languages in South Asia. Image source: Mikeanand, Wikimedia (based on Uwe Dedering), CC BY-SA 3.0.
Romance languages in Europe. Image source: Yuri B. Koryakov, Atlas of Romance languages, CC BY-SA 4.0.

Thus we can make a rough analogy:

Latin                Sanskrit
Old Latin            Vedic Sanskrit
Classical Latin      Classical Sanskrit
Vulgar Latin         Prakrits
N/A                  Pali
Romance languages    Modern Indo-Aryan languages

This is a little complicated, and I'm glossing over a lot of important differences between the two, but the takeaway is this: in general, you can enter words in Classical Latin or Classical Sanskrit and get a rough idea of what they might look like in modern languages. In a few cases, you may have to adjust the input to match an unattested Vulgar Latin or Old Indo-Aryan dialect to get the best results.

An interesting fact: Latin and Sanskrit are related, as both descend from Proto-Indo-European. An example cognate pair is पाद pāda in Sanskrit and pedem in Latin, both meaning foot. This makes French pied and Urdu پاؤں pāõ distantly related, along with English foot, which underwent a p- to f- change.

Why Latin and Sanskrit?

So why choose Latin and Sanskrit as the source languages for a “classics converter”? They have a number of attractive qualities that make such a tool work well.

As far as I can tell, these are among the only languages that fit these criteria.

Maybe Tamil, Ge'ez, and Tibetan would be good candidates to add, but they may suffer from a lack of resources: Latin and Sanskrit, and their relations to modern languages, have been studied extensively. Still, it would be cool to see more languages in this tool, both more source languages besides these two and more modern languages to convert into.

References

The rules applied in the converter code, while not necessarily referenced in the main article, mostly came from:

  1. English Wiktionary
  2. Sanskrit (Wikipedia)
  3. Indo-Aryan languages (Wikipedia)
  4. Pali (Wikipedia)
  5. Prakrit (Wikipedia)
  6. Latin (Wikipedia)
  7. The Linguistics of Spanish (Ian Mackenzie, Newcastle University, 1999–2022)
  8. The Indo-Aryan Languages (Colin P. Masica, University of Chicago, 1991)
  9. Phonological history of French (Wikipedia)
  10. Comparison of Portuguese and Spanish (Wikipedia)
  11. How Latin Became Italian (Damyan Lissitchkov, 2013)
  12. Phonological changes from Classical Latin to Proto-Romance (Wikipedia)
  13. History of the Spanish language (Wikipedia)

Footnotes