No sir, I can’t abugida

The peculiar and rebarbative romanization Google Translate uses for Thai is ISO 11940, a copy of which is 86 Swiss francs to you at the time of writing.

I’m trying to work out something between that and the rather lossy RTGS, but my first attempt looks like this:

m_2aa (c_uee1qs^iT_y_aas^aas^tr_ Equus caballus h^r_ueeq Equus ferus caballus) peaen_s^ats^l_eii2y_g_l_uukd2s^y_n_m_sue1g_m_iiK_s^aam_h^l_aakh^l_aay_T_aag_s^aay_P_an_T_uT_ii1T^uukm_n_us^y_n_amm_aal_eii2y_g_l_aeaT^uukc_ai2n_aikijkr_r_m_kaar_dein_T_aag_K^n_s^1g_ kaar_T_h^aar_ kiil_aa s^an_T_n_aakaar_l_aeaqaajjac_ai2peaen_qaah^aar_K^qg_m_n_us^y_n_aibaag_s^aT_n_T_r_r_m_m_aan_aan_n_abP_an_piil_ae2s^ pajjuban_bT_baaT_K^qg_m_2aaT^uukT_aen_T_ii1d2s^y_y_aan_P_aah^n_abaebh^aim_1jn_ bT_baaT_l_dl_g_paih^el_ueeqP_eiiy_g_T_aag_kiil_aal_aeas^an_T_n_aakaar_doy_s^1s^n_h^aiy_1 ta2g_tae1n_aiqdiitjn_T^ueg_pajjuban_r_eaajah^eaen_m_2aapeaen_s^ay_l_aks^n_T_ii1K_s^bK_uu1kabK_aas^bqy_

It’s not meant to look like Klingon. The idea is that high-class consonants are marked with ^, low-class consonants with _, and mid-class consonants with nothing at all. Tone marks ek, tho, tri and chattawa are marked 1, 2, 3 and 4, as their names suggest. Further post-processing should combine ^, _ and a digit into some sort of sensible tone-marking. Maybe IPA.
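As a rough sketch of that post-processing step, the standard Thai tone rules can be written as a lookup on the class marker and the tone digit. The function name thai_tone and the live and long_vowel flags are hypothetical: the notation above doesn't record whether a syllable is live or dead, so that information would have to come from whatever ends up splitting the syllables.

```python
def thai_tone(cls, mark=0, live=True, long_vowel=True):
    """Return the tone name for one syllable (illustrative sketch).

    cls:        "^" (high class), "" (mid class) or "_" (low class)
    mark:       0 for no tone mark, 1-4 for ek, tho, tri, chattawa
    live:       True if the syllable ends in a long vowel or a sonorant
    long_vowel: only consulted for unmarked dead low-class syllables
    """
    if cls not in ("^", "", "_"):
        raise ValueError(f"unknown class marker: {cls!r}")
    if mark not in (0, 1, 2, 3, 4):
        raise ValueError(f"unknown tone mark: {mark!r}")

    # Syllables carrying a tone mark: the mark and the class decide.
    if mark == 1:
        return "falling" if cls == "_" else "low"
    if mark == 2:
        return "high" if cls == "_" else "falling"
    if mark == 3:
        return "high"      # normally only seen on mid-class consonants
    if mark == 4:
        return "rising"    # normally only seen on mid-class consonants

    # Unmarked syllables: live/dead and vowel length decide.
    if live:
        return "rising" if cls == "^" else "mid"
    if cls == "_":
        return "falling" if long_vowel else "high"
    return "low"


print(thai_tone("_", 2))  # m_2aa: low class + tho -> high
```

Turning the tone name into an IPA diacritic or tone letter would then be one more small table.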

With post-processing in mind, I wrote aspirated consonants as capitals, ng as g, [tɕ] as j and [tɕʰ] as c. Maybe that’s a bit mad.
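To make the notation concrete, here is a minimal sketch of the character-by-character pass for a handful of letters. The table names and the transliterate function are hypothetical, the table is nowhere near complete, and it makes no attempt at leading vowels, implied vowels or syllable boundaries.

```python
# (romanization, class marker) for a few consonants; "" means mid class.
CONSONANTS = {
    "\u0e01": ("k", ""),   # ko kai, mid class
    "\u0e02": ("K", "^"),  # kho khai, high class, aspirated so capital
    "\u0e07": ("g", "_"),  # ngo ngu, low class, ng written as g
    "\u0e21": ("m", "_"),  # mo ma, low class
    "\u0e23": ("r", "_"),  # ro ruea, low class
    "\u0e2d": ("q", ""),   # o ang, mid class
}

VOWELS = {
    "\u0e32": "aa",  # sara aa
    "\u0e35": "ii",  # sara ii
}

TONE_MARKS = {
    "\u0e48": "1",  # mai ek
    "\u0e49": "2",  # mai tho
    "\u0e4a": "3",  # mai tri
    "\u0e4b": "4",  # mai chattawa
}


def transliterate(text):
    """Map each known Thai character to its marked-up romanization.

    Characters missing from the tables are passed through unchanged.
    """
    out = []
    for ch in text:
        if ch in CONSONANTS:
            roman, cls = CONSONANTS[ch]
            out.append(roman + cls)
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in TONE_MARKS:
            out.append(TONE_MARKS[ch])
        else:
            out.append(ch)
    return "".join(out)
```

With this table, ม้า comes out as m_2aa and ของ as K^qg_, matching the sample above.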

The big problem, though, is determining syllable boundaries. There are both open (consonant–vowel) and closed (consonant–vowel–consonant) syllables in Thai, yet the script, being an abugida, implies a vowel after consonants that don’t already have a vowel attached. This may involve “cheating” and looking up someone else’s work…
