Date: 2020nov25
OS: Linux
Q. Linux: tr command - all characters which are equivalent to CHAR
A. The man page for `tr` says you can use [=CHAR=] in a set
to stand for "all characters which are equivalent to CHAR".
So what does that mean?
GNU tr
More info for GNU tr can be found at
https://www.gnu.org/software/coreutils/manual/html_node/Character-sets.html#Character-sets
The syntax ‘[=c=]’ expands to all of the characters that are equivalent to
c, in no particular order. Equivalence classes are a relatively recent
invention intended to support non-English alphabets. But there seems to be
no standard way to define them or determine their contents. Therefore, they
are not fully implemented in GNU tr; each character’s equivalence class
consists only of that character, which is of no particular use.
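A quick way to see this on a given system is the check below. It is only a sketch: the accented letter is written as its UTF-8 bytes in octal so the command stays pure ASCII, and the result described in the comment is what I would expect from GNU tr, not necessarily from other implementations.

```shell
# Feed a plain "e" and a UTF-8 "é" (bytes \303\251) through tr.
# With GNU tr, [=e=] contains only "e" itself, so the é should pass through unchanged.
printf 'e \303\251\n' | tr '[=e=]' 'E'
```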
macOS tr
The man page for the macOS tr says
https://ss64.com/osx/tr.html
[=equiv=] Represents all characters belonging to the same equivalence
class as equiv, ordered by their encoded values.
And provides this example
Remove diacritical marks from all accented variants of the letter 'e':
$ tr "[=e=]" "e"
It appears to respect the locale set via the LANG, LC_ALL, LC_CTYPE and LC_COLLATE environment variables.
Workaround
I wrote a tr command that assumes Unicode, so [=e=] expands to the code points 0xE8 through 0xEB (and similarly for other letters), according to
https://en.wikipedia.org/wiki/List_of_Unicode_characters#Latin-1_Supplement
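Since tr supports octal escapes in sets and GNU tr processes its input byte by byte,
this works when the input is in a single-byte encoding such as ISO-8859-1 (Latin-1),
where 0xE8 to 0xEB are each one byte. (In UTF-8 these letters take two bytes each, so
the set below does not match them.) In that case you can remove all accents on
lowercase e's with
tr '\350\351\352\353' 'e'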
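For UTF-8 input, one possible approach is to convert to Latin-1, run the same tr, and convert back. The sketch below assumes iconv is installed and that every character in the input exists in ISO-8859-1 (iconv reports an error otherwise); strip_e_accents is a hypothetical helper name, not a standard command.

```shell
# Hypothetical helper: remove accents from lowercase e in UTF-8 text on stdin.
# Assumes iconv is available and the input contains only Latin-1 characters.
strip_e_accents() {
    iconv -f UTF-8 -t ISO-8859-1 |      # make each accented letter a single byte
        tr '\350\351\352\353' 'e' |     # map the four accented e bytes to plain e
        iconv -f ISO-8859-1 -t UTF-8    # restore UTF-8 for everything else
}

# Example: "café" given as UTF-8 bytes (é is \303\251)
printf 'caf\303\251\n' | strip_e_accents
```

Other accented letters in the Latin-1 range pass through unchanged, because only the four e bytes are in the tr set.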