Welcome to TIGRIS Virtual Lab Project Area
e-ŠNUNNA
TEXT ENCODING
The "e-ŠNUNNA" project uses Text Mining tools, in particular Clustering, to discover homogeneous groups and hidden relations in a corpus of letters (19th-17th centuries BC) found in the small Kingdom of EŠNUNNA.
Software for Text Mining and Linguistic Quantitative Analysis requires graphic substitutions in order to process Mesopotamian e-texts written in UNICODE fonts. The following characters are therefore replaced:
š  --> $
ṭ  --> v
ṣ  --> c
ḫ  --> h
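
The substitutions listed above can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: the names SUBSTITUTIONS and encode are hypothetical, the Unicode NFC normalization step is an added assumption (so that a letter typed as base character plus combining mark is also caught), and only the four lower case characters given in the table are handled.

```python
import unicodedata

# The four substitutions from the table above, written as code points
# so the mapping does not depend on how this file itself is encoded.
SUBSTITUTIONS = str.maketrans({
    "\u0161": "$",  # š --> $
    "\u1e6d": "v",  # ṭ --> v
    "\u1e63": "c",  # ṣ --> c
    "\u1e2b": "h",  # ḫ --> h
})


def encode(text):
    """Return text with the four special characters replaced.

    Assumption: the input is first normalized to NFC, so that
    decomposed sequences (letter + combining mark) are composed
    into single characters before the table is applied.
    """
    return unicodedata.normalize("NFC", text).translate(SUBSTITUTIONS)
```

For example, encode("šarrum") gives "$arrum". All other characters, including long vowels such as ā, pass through unchanged.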
Furthermore, Text Mining software for Linguistic Quantitative Analysis is oriented to modern languages and requires adaptation in order to deal with problems specific to Mesopotamian texts:
- graphic notation;
- graphemic ambiguities and inconsistencies, including a special use of caps and lower case letters;
- use of more than one language in the same text;
- etc.
Other graphic conventions:
=_=      "erasure without text"
=abc=  "text over erasure"
/abc/    "text omitted by the scribe and added by scholars"
//abc//  "text mistakenly added by the scribe"
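
As an illustration of how these four conventions could be recognized automatically, here is a minimal Python sketch based on a regular expression. The labels (erased, over, extra, omitted) and the function name find_markup are hypothetical and are not taken from the project. The sketch also assumes that marked text never contains "=" or "/" itself.

```python
import re

# One alternative per convention. Order matters: "=_=" is tried before
# "=abc=", and "//abc//" before "/abc/", so the longer or more specific
# form wins.
MARKUP = re.compile(
    r"(?P<erased>=_=)"          # =_=      erasure without text
    r"|=(?P<over>[^=]+)="       # =abc=    text over erasure
    r"|//(?P<extra>[^/]+)//"    # //abc//  text mistakenly added by the scribe
    r"|/(?P<omitted>[^/]+)/"    # /abc/    text omitted by the scribe
)


def find_markup(line):
    """Return a list of (label, content) pairs found in one line."""
    found = []
    for match in MARKUP.finditer(line):
        kind = match.lastgroup
        # An erasure without text carries no content.
        content = "" if kind == "erased" else match.group(kind)
        found.append((kind, content))
    return found
```

Calling find_markup("a-na =_= //be-li// /um-ma/") returns [("erased", ""), ("extra", "be-li"), ("omitted", "um-ma")].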
As usual for texts in Akkadian, Akkadian words are written in lower case and Sumerian logograms in CAPS.
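
The case convention makes a rough language tag per sign possible. The Python sketch below is an assumption-laden illustration, not the project's method: the function names are hypothetical, signs are assumed to be separated by "-" or ".", and the test is applied to the transliteration before any character substitution. Signs with no cased letters (for example numerals) or with mixed case are labelled "other".

```python
import re


def classify_sign(sign):
    """Label one sign by its letter case."""
    if sign.isupper():
        return "sumerian"   # CAPS: Sumerian logogram
    if sign.islower():
        return "akkadian"   # lower case: Akkadian
    return "other"          # no cased letters, or mixed case


def classify_word(word):
    """Split a word on '-' and '.' and label each sign."""
    signs = [s for s in re.split(r"[-.]", word) if s]
    return [(s, classify_sign(s)) for s in signs]
```

For instance, classify_word("DUMU-šu") returns [("DUMU", "sumerian"), ("šu", "akkadian")].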
Sumerian texts will be processed adopting suitable conventions for Sumerian.