Oddities

In most cases where normalisation changes the text, the differences fall between the composed forms (NFC and NFKC) and the decomposed forms (NFD and NFKD). For some Unicode characters, though, the differences fall instead between the canonical forms (NFC and NFD) and the compatibility forms (NFKC and NFKD):

The word entered here uses a ligature and the horizontal ellipsis character. Neither is altered by NFC or NFD normalisation, but both are broken down into their constituent characters by NFKC and NFKD. This is reflected in macOS, which allows two files or folders to have names which appear identical, one using a ligature and the other not. Avoid doing this, as it can prove very confusing, when searching for example. If you want to try those words, they are here: Affinity… and Affinity...
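If you would like to check this behaviour outside the app, the following is a minimal sketch using Python's standard `unicodedata` module. It assumes that the ligature is the single character U+FB03 (LATIN SMALL LIGATURE FFI); the helper name `normalise_all` is made up for this illustration.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normalise_all(text):
    """Return a dict mapping each normalisation form to the normalised text."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

# Assumption: the ligature is U+FB03 (ffi); U+2026 is the horizontal ellipsis.
word = "A\uFB03nity\u2026"

results = normalise_all(word)
for form, value in results.items():
    # Show the length and code points so invisible differences become visible.
    print(form, len(value), " ".join(f"U+{ord(c):04X}" for c in value))
```

The canonical forms (NFC and NFD) should leave the seven characters untouched, while the compatibility forms (NFKC and NFKD) should expand them to the eleven plain characters of `Affinity...`.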

Look closely and you can see that the two folders which appear to have the same name are in fact different.

This gives you an idea as to the sort of problem which can occur when there are mismatches in normalisation – the sort of issue which can arise between HFS+ and Linux, for example.
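As a rough sketch of that kind of mismatch, suppose one system hands you a file name in decomposed form and another hands you the same name in composed form. The file name and the helper `same_name` below are hypothetical, and this only illustrates comparing strings after normalising both to one form; it is not how any particular file system does it.

```python
import unicodedata

def same_name(a, b, form="NFC"):
    """Compare two names after normalising both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

composed = "caf\u00E9.txt"      # é as a single code point
decomposed = "cafe\u0301.txt"   # e followed by a combining acute accent

# The two names look identical, but differ as strings and as UTF-8 bytes.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))

# Normalising both sides to the same form makes them compare equal.
print(same_name(composed, decomposed))
```

Which form you normalise to matters less than using the same one on both sides of the comparison.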

Normalisation issues can arise in the most unexpected places too. For example, Åland and Åland are quite different strings which normalise to the latter. One reason for this is that the Å character has two separate Unicode encodings, one to represent the Nordic letter, the other for the Ångström unit, which actually uses the Nordic letter. And Ångström is different again…
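To see what is going on with Å, here is a small sketch using Python's `unicodedata` module. It compares the letter (U+00C5) and the Ångström sign (U+212B), and adds a third spelling, a plain A followed by a combining ring above (U+030A); including that third spelling is an assumption made for this example.

```python
import unicodedata

letter = "\u00C5"      # LATIN CAPITAL LETTER A WITH RING ABOVE
angstrom = "\u212B"    # ANGSTROM SIGN
combined = "A\u030A"   # A followed by COMBINING RING ABOVE (assumed third spelling)

for label, text in (("letter", letter), ("angstrom", angstrom), ("combined", combined)):
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    print(
        label,
        "NFC:", " ".join(f"U+{ord(c):04X}" for c in nfc),
        "NFD:", " ".join(f"U+{ord(c):04X}" for c in nfd),
    )
```

All three spellings should come out as U+00C5 under NFC, and as U+0041 U+030A under NFD, which is why the sign and the letter are different strings before normalisation but the same afterwards.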




The Eclectic Light Company – https://eclecticlight.co