I’ve been pasting Classical Chinese poetry into Google Translate to get the Pinyin out. The translation is sometimes OK in parts, particularly where the text is a chengyu, which is to say a four-character classical tag, but often it’s gibberish. 伊, which is a perfectly ordinary classical third-person pronoun, comes out as “Iraq”, I think because 伊拉克 (Yīlākè) does mean “Iraq”, and maybe if you were talking about Iraqi–US relations, say, you’d write 伊美. But my Chinese isn’t good enough to verify that.
I digress. 士与女, which even I can tell means “men and women”, or “ladies and gentlemen”, consisting as it does of the characters for “knight”, classical “and” and “woman”, comes out as “Disabilities and Women”. Is this something like how “cretin” came to mean cretin, or is it merely the training data and the alignment that are to blame?