Page created 2022-09-21.
An encoded subject field is written as =?charset?encoding?encoded-text?=, where encoding is B for Base64 (or Q for a quoted-printable variant).
Second example: =?UTF-8?B?0JXR
$ echo "0JXRhdCw
0LzRgNOPyZsg0ZXPhdCy0ZjOtdGB0YIK" | base64 -d ; echo
Ехамрӏɛ ѕυвјεст
echo "base64text" | base64 -d ; echo
For ISO-8859-1-encoded text, use
echo "base64text" | base64 -d | iconv -f Windows-1252 -t UTF-8
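The page does not say why the command names Windows-1252 rather than ISO-8859-1. A likely reason, stated here as an assumption, is that the two encodings agree on bytes A0 to FF, while Windows-1252 additionally assigns printable characters to bytes 80 to 9F, which ISO-8859-1 reserves for control codes; text labelled ISO-8859-1 sometimes contains such bytes. A minimal sketch of the difference, feeding single bytes to iconv with octal printf escapes:

```shell
# Byte E9 (octal 351) is "é" in both encodings.
printf '\351' | iconv -f ISO-8859-1 -t UTF-8 ; echo     # é
printf '\351' | iconv -f Windows-1252 -t UTF-8 ; echo   # é

# Byte 80 (octal 200) is the euro sign in Windows-1252 only;
# ISO-8859-1 treats it as a control code.
printf '\200' | iconv -f Windows-1252 -t UTF-8 ; echo   # €
```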
If one wants to manually decode UTF-8 (with a table, perhaps), pipe the result of the base64 decoding through a hex dump tool such as xxd to see the individual bytes.
GB2312 is a National Standard of the People's Republic of China (hence the "GB") and an alias for the EUC-CN encoding – the Extended Unix Code is a multibyte character encoding with different variants for various CJK languages, and EUC-CN encodes Simplified Chinese.
One can use a very similar command line to decode Chinese, using iconv to decode the resulting binary data. (One can specify the encoding as either "GB2312" (no hyphen allowed) or "EUC-CN".)
Example input: =?gb2312?B?yr7A
$ echo "yr7A/db3
zOLX1rbO" | base64 -d | iconv -f GB2312 -t UTF-8 ; echo
示例主题字段
If one wants to manually try decoding this, use xxd for printing out the bytes:
$ echo "yr7A/db3
zOLX1rbO" | base64 -d | xxd
00000000: cabe c0fd d6f7 cce2 d7d6 b6ce            ............
Wikipedia has pretty okay code charts for decoding.
Decoding is a bit of a hassle, since there's a bunch of lookups. Examples:
CA BE: lead byte CA, decimal 202; subtract 160 to get code table row 42. Trailing byte BE, decimal 190; subtract 160 from this to get 30: row 42, column 30, "示". Columns start at 1, so this is the 30th (not 31st) character in the row.
C0 FD: lead byte C0, decimal 192; subtract 160 to get code table row 32. Trailing byte FD, decimal 253; subtract 160 to get 93: the 93rd character of row 32, or the second-to-last (there are 94 characters per row): "例".
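The arithmetic in these lookups can be left to the shell. The following is a minimal sketch; gb_position is a hypothetical helper name, not something from the original page. It takes the two bytes as hex and subtracts 160 (hex A0) from each to get the row and column in the code chart. The character itself still has to be looked up in the chart.

```shell
# Hypothetical helper: map a two-byte GB2312/EUC-CN sequence, given as
# two hex bytes, to its code table row and column (both 1-based).
gb_position() {
    lead=$(( 0x$1 ))    # lead byte as a number, e.g. CA -> 202
    trail=$(( 0x$2 ))   # trailing byte as a number, e.g. BE -> 190
    echo "row $(( lead - 160 )), column $(( trail - 160 ))"
}

gb_position CA BE   # row 42, column 30
gb_position C0 FD   # row 32, column 93
```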
echo "base64text" | base64 -d | iconv -f GB2312 -t UTF-8 ; echo
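The splitting and decoding steps can be combined into one small function. This is only a sketch under stated assumptions: decode_word is a hypothetical name, it handles a single Base64 ("B") encoded word and nothing else, and the sample input is reassembled here from the Base64 text of the GB2312 example, since the header line on the page is truncated.

```shell
# Hypothetical helper: decode one encoded word of the form
# =?charset?B?base64text?= and print it as UTF-8.
decode_word() {
    word=${1#"=?"}            # strip the leading "=?"
    word=${word%"?="}         # strip the trailing "?="
    charset=${word%%"?"*}     # text before the first "?"
    rest=${word#*"?"}         # everything after the first "?"
    encoding=${rest%%"?"*}    # "B" for Base64
    text=${rest#*"?"}         # the Base64 payload
    case $encoding in
        B|b) printf '%s' "$text" | base64 -d | iconv -f "$charset" -t UTF-8 ;;
        *)   echo "unsupported encoding: $encoding" >&2; return 1 ;;
    esac
}

# Assumed input, rebuilt from the Base64 text shown earlier.
decode_word '=?gb2312?B?yr7A/db3zOLX1rbO?=' ; echo   # 示例主题字段
```

The charset is passed to iconv exactly as it appears in the header, so this relies on the local iconv recognising that name.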