What’s the difference between the zero width non-joiner and the zero width space?

Raymond Chen

Raymond

In Break it up, you two!: The zero width non-joiner I discussed the purpose of the zero width non-joiner, which is to request that two adjacent characters be rendered without a ligature. Conversely, the zero width joiner requests that they be rendered with a ligature. Of course, it is up to the rendering engine to make the final decision as to how the rendering is done, but at least you can make your intentions clear.

Another character that seems very similar to the zero width non-joiner is the zero width space. Both of them have no width, and both of them break up ligatures. So what’s the difference?

Well, one of them is a space, and the other one isn’t.

The zero width space is used to indicate where one word ends and another word begins, even though there should not be any space rendered at the word boundary. This is significant for languages which have the concept of words, but not of spaces. For example, the Thai and Korean languages use multiple characters to represent words, but traditionally do not insert spaces between words. The words just run together, and readers are expected to use their experience with the language to know where one word ends and the next begins.

(This sounds kind of unfair to people who are still learning the language, but learning a language is already unfair. After all, in normal speech, people typically do not make discernable pauses between words. All the words run together, and you are expected to use your experience with the language to know where one word ends and the next begins.)

The Unicode Standard calls this out:

Zero-Width Spaces and Joiner Characters. The zero-width spaces are not to be confused with the zero-width joiner characters. U+200C zero width non-joiner and U+200D zero width joiner have no effect on word or line break boundaries, and zero width no-break space and zero width space have no effect on joining or linking behavior. The zero-width joiner characters should be ignored when determining word or line break boundaries.

In English, these special characters don’t have much use. You could use the zero width non-joiner to break up a ligature, say to break up the “fl” ligature in the word “wolf‌like”, but there’s usually not much call for it in English.

Note that if you had used a zero width space instead of a zero width non-joiner, then you are telling the layout engine that “wolf” and “like” are two separate words, that happen to run together without a space. This means that it is possible for a line break to be inserted between them.

Raymond Chen
Raymond Chen

Follow Raymond   

0 comments

Comments are closed.