Unicode collation is hard
The principle of "garbage in, garbage out" applies to Unicode collation. If you hand it a meaningless string and ask it to compare that string to another meaningless string, you get meaningless results. I am not a Unicode expert; I just play one on the web. A real Unicode expert is Michael Kaplan, whose explanation of how comparing invalid Unicode ...
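
To make "meaningless string" concrete: one common kind of garbage is an unpaired surrogate, which is not well-formed in any Unicode encoding form. Here is a minimal Python sketch, assuming you want to reject such input before comparing rather than trust whatever answer comes back. The names `is_well_formed` and `checked_compare` are made up for illustration, and the comparison itself is plain code point order, standing in for whatever real collation routine you would actually call.

```python
def is_well_formed(text: str) -> bool:
    """Return True if text contains no unpaired surrogates."""
    try:
        # Strict UTF-8 encoding refuses lone surrogates such as U+D800.
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


def checked_compare(a: str, b: str) -> int:
    """Hypothetical helper: refuse garbage instead of ordering it.

    Returns -1, 0, or 1 using code point order. This is NOT linguistic
    collation; it is only a placeholder for a real collation call.
    """
    for s in (a, b):
        if not is_well_formed(s):
            raise ValueError(f"ill-formed Unicode input: {s!r}")
    return (a > b) - (a < b)
```

With this in place, `checked_compare("a", "b")` gives an ordinary answer, while passing a string such as `"\ud800"` raises `ValueError` instead of producing a result nobody can interpret.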