REVISION: If languages are ergative, like Basque and others I can’t remember right now, then they mark the subjects of transitive verbs, but not the subjects of intransitive verbs. So “John opened the door and ran away” would require two “Johns”, one in the ergative form for his opening the door, and the other in the unmarked (absolutive) form for his running away.
(Unless the language’s verbs of motion took the distance run as an obligatory direct object: “John runs two miles”. You never know.) Now for an ergativity test.
My test set looks like this:
I spoke.
*I saw.
I saw a ghost.
I walked.
I walked to the station.
The monk spoke.
*The monk saw.
The monk saw a ghost.
The monk walked.
The monk walked to the station.
Thaksin spoke.
*Thaksin saw.
Thaksin saw a ghost.
Thaksin walked.
Thaksin walked to the station.
I’ve asterisked (an asterisk is a bit strong) the sentences which need more context to license them. I know “I saw” is the correct answer to “You, sir, work as a sawyer. What do you do all day?”, but I bet that’s not in the corpus.
Now look at the results:
C̄hạn phūd
* P̄hm h̄ĕn
P̄hm h̄ĕn p̄hī
C̄hạn dein
C̄hạn dein pị thī̀ s̄t̄hānī
Phra phūd
* Phra h̄ĕn
Phra h̄ĕn p̄hī
Phra p̣hiks̄ʹu s̄ngḳh̒ dein
Phra s̄ngḳh̒ k̆ dein pị thī̀ s̄t̄hānī
Thạks̄ʹiṇ phūd
Thạks̄ʹiṇ* h̄ĕn
Thạks̄ʹiṇ h̄ĕn p̄hī
Thạks̄ʹiṇ dein
Thạks̄ʹiṇ k̆ dein pị thī̀ s̄t̄hānī
The machine translates “I” with one word for the transitive “saw” and a different word for the intransitive “spoke” and “walked”. The story is a bit more complicated for “the monk” and “Thaksin”. I need to do this again with more data, but you get the idea.
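For the rerun with more data, the comparison could be automated along these lines. This is only a rough sketch, not a real ergativity diagnostic: it takes the five “I” sentences from the results above, treats the first word of each output as the subject form, and checks whether the transitive and intransitive frames share any form. The names RESULTS, subject_forms and looks_split are made up for illustration, the romanization is simplified to plain ASCII (diacritics dropped), and “I saw” is counted as transitive because of its verb even though the object is missing.

```python
from collections import defaultdict

# (English frame, transitivity of the verb, machine output).
# Outputs are the "I" rows from the results, with diacritics dropped
# for readability; that simplification is an assumption of this sketch.
RESULTS = [
    ("I spoke", "intransitive", "Chan phud"),
    ("I saw", "transitive", "Phom hen"),
    ("I saw a ghost", "transitive", "Phom hen phi"),
    ("I walked", "intransitive", "Chan dein"),
    ("I walked to the station", "intransitive", "Chan dein pi thi sthani"),
]


def subject_forms(results):
    """Collect the first word of each output, grouped by transitivity."""
    forms = defaultdict(set)
    for _english, transitivity, output in results:
        forms[transitivity].add(output.split()[0])
    return dict(forms)


def looks_split(forms):
    """True when transitive and intransitive frames share no subject form."""
    transitive = forms.get("transitive", set())
    intransitive = forms.get("intransitive", set())
    return bool(transitive) and bool(intransitive) and transitive.isdisjoint(intransitive)


forms = subject_forms(RESULTS)
print(forms)
print(looks_split(forms))
```

A split here only says the subject word differs between the two groups of frames. It says nothing about why, so the result is a prompt for collecting more sentences rather than a conclusion.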