Preventing mojibake with HTML emails

  • 4 replies
  • 1 has this problem
  • 1 view
  • Last reply by chruss2

I noticed that many of my recipients are seeing junk characters in my emails. I have taken a look at the raw bytes and I think I know what is going on.

MIME headers and HTML head elements being sent:

Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: 8bit

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
</head>

The body is encoded correctly in windows-1252. Everything looks correct, so what's the problem?

Here's my theory: The recipient's software sees the MIME header. Because the message is declared as windows-1252, it converts the HTML content to UTF-8, but of course the meta tag will still *say* it is windows-1252. Basically, having two headers at different layers specifying windows-1252 encoding is causing the recipient to decode the HTML body twice.

I think what we need is some way to make the HTML 7-bit friendly: convert all non-ASCII characters to numeric entities like &#160;. That way charset conversions can be done at the MIME layer without corrupting the HTML.

Is there an option for this in Thunderbird? I poked around in Settings but couldn't find anything.
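The idea in the question, replacing every non-ASCII character with a numeric character reference so the HTML becomes 7-bit clean, can be sketched in Python. This only illustrates the proposed transformation; it is not something Thunderbird is known to do, and the helper name `to_ascii_html` and the sample string are made up for the example.

```python
def to_ascii_html(html: str) -> bytes:
    """Return html as pure-ASCII bytes, using numeric character references.

    Hypothetical helper for illustration; not part of Thunderbird.
    """
    # The 'xmlcharrefreplace' error handler turns each character that
    # ASCII cannot represent into a decimal reference such as &#233;
    return html.encode("ascii", errors="xmlcharrefreplace")


sample = "<p>caf\u00e9\u00a0au lait</p>"  # contains e-acute and a no-break space
ascii_bytes = to_ascii_html(sample)
print(ascii_bytes)

# Every byte is below 0x80, so decoding as windows-1252 or as UTF-8 gives
# identical markup; a charset mix-up along the way cannot corrupt it.
assert ascii_bytes.decode("cp1252") == ascii_bytes.decode("utf-8")
```

The trade-off is that the raw source becomes harder to read, which is probably why mail clients tend to prefer a transfer encoding instead.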

All Replies (4)

Folks using Yahoo have issues because for about a year they have been enforcing 7-bit encoding. You would think they were living in a US-centric version of the world.

Basically, your mail should be encoded as UTF-8 and those Windows encodings forgotten, as they are just obsolete and their decoding varies from product to product. They are a product of the days in the early 1990s when Microsoft was of the view that the internet was a fad they could safely ignore.

How are you managing to get a Windows encoding in there? Pasting from Word as anything but plain text is a good way, or always has been.

I was replying to an email that used a Windows encoding. I agree about pushing towards "Unicode everywhere", but I can't control what encoding others use, and it appears that Thunderbird keeps the same encoding when replying (but not when forwarding).

Is there an option to force use of UTF-8 when replying?

Chosen solution

Under Tools/Options/Display/Formatting, click the Advanced button, where you can set the default encoding for outgoing mail and the choice of encoding for replies (see picture).

If you're sending through a Yahoo-type server (ATT, Rogers, Verizon, etc.) and your recipients see unwanted characters, set mail.strictly_mime to true in the Config Editor (Tools/Options/Advanced/General) to enforce quoted-printable encoding (the Outlook default).
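To see why quoted-printable sidesteps a 7-bit-only relay, here is a small Python sketch using the standard `quopri` module. It only shows what the transfer encoding does to the bytes; it makes no claim about how Thunderbird implements mail.strictly_mime internally, and the sample text is made up.

```python
import quopri

# "café" in windows-1252: the last character is the single 8-bit byte 0xE9.
body = "caf\u00e9".encode("cp1252")

# Quoted-printable rewrites each 8-bit byte as "=" plus two hex digits,
# so the transmitted message contains only 7-bit ASCII.
encoded = quopri.encodestring(body)
print(encoded)

assert encoded.isascii()

# The recipient reverses the transfer encoding and gets the original bytes
# back, still in whatever charset the MIME header declares.
assert quopri.decodestring(encoded) == body
```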

Thanks a lot for your help. I tested each of those options separately, and each of them solved the problem.

I am using Yahoo SMTP servers. @Matt, you also mentioned Yahoo, but I thought you meant Yahoo Mail (a web client). Thanks @sfhowes for being more specific. It did not occur to me that my ISP could suck that much!

I tested their servers myself, and they do allow 8-bit content, but only if it is valid UTF-8. Any byte sequence that is not valid UTF-8 gets replaced with U+FFFD (the replacement character), encoded as UTF-8. Therefore either option works: forcing replies to UTF-8, or forcing quoted-printable (7-bit).
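The behaviour observed above can be modelled in a few lines of Python. This is only a guess at what the relay does, based on the symptoms described in this thread; `relay_like_yahoo` is a hypothetical name, not real server code.

```python
def relay_like_yahoo(raw: bytes) -> bytes:
    """Hypothetical model: pass valid UTF-8 through, replace the rest."""
    # errors="replace" substitutes U+FFFD for each undecodable sequence.
    return raw.decode("utf-8", errors="replace").encode("utf-8")


text = "caf\u00e9"
as_utf8 = text.encode("utf-8")      # b'caf\xc3\xa9'
as_cp1252 = text.encode("cp1252")   # b'caf\xe9', not valid UTF-8

# A UTF-8 body survives the relay untouched.
assert relay_like_yahoo(as_utf8) == as_utf8

# A windows-1252 body loses its accented character for good.
damaged = relay_like_yahoo(as_cp1252)
print(damaged.decode("utf-8"))
```

Once the original byte has been swapped for U+FFFD, no charset setting on the recipient's side can recover it, which matches the junk characters reported by recipients.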

Thanks for pointing me to the options. I would never have found the Reply encoding option under Display (Fonts and Colors). What an illogical place for that option!