2005-02-16 06:21:35 UTC
A recent link to a Japanese version of Blat, based on version 1.8.2, prompted
me to investigate Unicode files and what it would take to support this
format in the message body file, as opposed to alternate text or other forms
of text input. Tonight I looked in previous messages for any mention of
UTF-8 or Unicode, to gauge how much need there might be for Unicode support.
There is a way to send these files already, by using the -attach option.
However, sometimes this is not very convenient, and has led to some creative
actions to "convert" Unicode to plain text. If Blat could send Unicode
files automagically, how many folks would actually benefit from this?
It appears that all, or nearly all, Latin-based languages do not need
Unicode text formatting, but that Microsoft prefers to store double-byte
character sets as Unicode. If the message body file, that elusive first
argument, has a Byte Order Mark in the first two or four bytes, Blat could
use this to identify the file as Unicode and convert it to UTF-8 for
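To make the detection idea above concrete, here is a minimal sketch in Python. It is not Blat's code; the function name body_to_utf8 and the table _BOMS are hypothetical, and it assumes only the two-byte (UTF-16) and four-byte (UTF-32) marks need handling. The three-byte UTF-8 mark is deliberately left out.

```python
import codecs

# Hypothetical lookup table. Order matters: the UTF-32 LE mark starts with
# the same two bytes as the UTF-16 LE mark, so the four-byte marks go first.
# (A UTF-16 LE file whose first character is U+0000 would be misread as
# UTF-32 LE; this sketch ignores that rare case.)
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def body_to_utf8(raw: bytes):
    """Return (body_bytes, detected_encoding).

    If the data starts with a known Byte Order Mark, strip the mark and
    re-encode the text as UTF-8. Otherwise return the bytes unchanged
    with None as the encoding.
    """
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            text = raw[len(bom):].decode(encoding)
            return text.encode("utf-8"), encoding
    return raw, None
```

A caller would read the body file in binary mode, pass the bytes through body_to_utf8, and only add a UTF-8 charset to the headers when the second return value is not None.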
I have a personal distaste for sending message bodies encoded with base64,
since this is often used to hide spam content from the casual viewer. Blat
is coded in such a way that message bodies do not use base64; this is
reserved for attachments. Due to this limitation, UTF-8 bytes will be sent
as quoted-printable, which unfortunately takes more bytes than base64 when
most of the text is non-ASCII. I
could change the code to create an override condition whereby UTF-8 data
would be sent encoded with base64 so the message is smaller.
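The size tradeoff between the two encodings can be checked with a short Python sketch using the standard quopri and base64 modules. The helper name encoded_sizes and the sample strings are made up for illustration, and the exact counts depend on how each encoder wraps lines, so no specific numbers are claimed here.

```python
import base64
import quopri


def encoded_sizes(utf8_body: bytes):
    """Return (quoted_printable_length, base64_length) for the given bytes."""
    qp = quopri.encodestring(utf8_body)   # each non-ASCII byte becomes "=XX"
    b64 = base64.encodebytes(utf8_body)   # every 3 bytes become 4 characters
    return len(qp), len(b64)


# Hypothetical sample bodies: one entirely non-ASCII, one plain ASCII.
japanese = ("日本語のメッセージ本文です。" * 20).encode("utf-8")
english = ("hello world" * 20).encode("utf-8")

jp_qp, jp_b64 = encoded_sizes(japanese)
en_qp, en_b64 = encoded_sizes(english)

print("Japanese body: quoted-printable", jp_qp, "bytes, base64", jp_b64, "bytes")
print("English body:  quoted-printable", en_qp, "bytes, base64", en_b64, "bytes")
```

For the Japanese sample the quoted-printable form comes out larger, while for plain ASCII it is the base64 form that is larger, which is why an override limited to UTF-8 data is the interesting case.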