Using Base 22 for Human-Proof Unique Identifiers
================================================

Users must sometimes manually transcribe the unique identifiers we generate. Certain digits and letters look alike (depending on the font), and the identifiers are random sequences with no inherent meaning to humans, so transcription errors are common.

We can reduce the mistranscription rate by putting similar symbols into equivalence classes. The consuming software would treat all entries in a given equivalence class as identical, so that an identifier still works even when the user mistranscribes it. Each column in the diagram below is an equivalence class, giving a "base 22" alphabet from which to generate safe identifiers:

    8 7 4 6 9 1 0 5 3 2 a b c d e f g h i k m n o p r s t u w x z A B C D E F G H I K M N O P R S T U W X Z Q l v q L V j y J Y

For example, if the original identifier were "9sxA3y", the software receiving it would accept "qsx43y" too. For that matter, it would even accept the much further off "G5X4zu", though one would have to wonder about the user in that case.

Base 22 is a high enough base that identifiers remain short for most uses: at 10 places, we have 22^10 = 26,559,922,791,424 unique values.
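
As a rough sketch of how consuming software might treat look-alike symbols as identical, the Python below maps every character to one representative of its class before comparing. The names `LOOKALIKE_GROUPS`, `canonicalize`, `same_identifier`, and `keyspace` are hypothetical. The table is deliberately partial: it assumes upper and lower case of a letter share a class, and adds only the groupings implied by the worked example (4/A, 9/G/q, 5/s, 3/z, u/y). A real implementation would need all 22 classes from the diagram.

```python
# Partial, illustrative class table (an assumption, not the full diagram).
# Each string is one equivalence class; its first character is the
# representative. Letters are lower-cased before lookup, so "4a" also
# covers "A", "9gq" also covers "G" and "Q" is NOT covered here, etc.
LOOKALIKE_GROUPS = ["4a", "9gq", "5s", "3z", "uy"]


def build_canon_map(groups):
    """Map every symbol in each group to that group's representative."""
    canon = {}
    for group in groups:
        representative = group[0]
        for symbol in group:
            canon[symbol] = representative
    return canon


CANON = build_canon_map(LOOKALIKE_GROUPS)


def canonicalize(identifier):
    """Replace each character with the representative of its class."""
    result = []
    for ch in identifier:
        folded = ch.lower()  # assumes case variants share a class
        result.append(CANON.get(folded, folded))
    return "".join(result)


def same_identifier(a, b):
    """True if two transcriptions denote the same identifier."""
    return canonicalize(a) == canonicalize(b)


def keyspace(places, base=22):
    """Number of distinct identifiers of the given length."""
    return base ** places


if __name__ == "__main__":
    print(same_identifier("9sxA3y", "qsx43y"))
    print(same_identifier("9sxA3y", "G5X4zu"))
    print(keyspace(10))
```

Storing and indexing the canonical form, rather than what the user typed, means a lookup needs only one normalization pass and an exact match.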