Mosh-Production's Blog

May 11, 2009

Windows-1255 Encoding

Filed under: Coding Horror — Maor Tzivony @ 1:47 pm

Introduction

This article shows how to decode Hebrew mail headers that arrive in the windows-1255 encoding.

Background

Usually when you get Hebrew emails, the subject line is encoded in a strange way. After reading
a lot of articles on the subject, I decided to roll out a decoder of my own,
which handles Base64 and plain encoding of Hebrew messages.

Since we didn’t find any ready-made resource, we worked from the specifications for MIME encoding (encoded words in headers are defined in RFC 2047).

The problem

Sometimes when you read MIME messages, you might encounter weird characters arranged like this:

=?WINDOWS-1255?B?QWVQQWDWDWEDWF=?=

We first encountered it while trying to read mail messages from an Exchange server. Later we also encountered a UTF-8 variation of this weird encoding. What we basically did was look for these patterns in the text with a Regex:

Regex encoding = new Regex(@"=\?WINDOWS-1255\?[BQ]\?[^?]*\?=");

Then we built the replacement string by stripping all the characters up to the second question mark and everything from the last one onward. What remained was decoded with Base64 (if a B was present) or with plain decoding (if a Q was present; Q is a quoted-printable variant).
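The steps above can be sketched roughly as follows. This is not the original class, only a minimal illustration: the names EncodedWordSketch, DecodeAll and DecodeQ are made up for this example, and the pattern assumes the standard =?charset?encoding?payload?= layout. On .NET Core and .NET 5+ the 1255 code page has to be registered first; on the classic .NET Framework that line can be removed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not the author's Windows1255Helpers class.
static class EncodedWordSketch
{
    // Group 1 is the encoding flag (B or Q), group 2 is the payload.
    static readonly Regex Pattern = new Regex(
        @"=\?WINDOWS-1255\?([BQ])\?([^?]*)\?=", RegexOptions.IgnoreCase);

    public static string DecodeAll(string input)
    {
        // Assumption: modern .NET, where code page 1255 is not built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding hebrew = Encoding.GetEncoding("windows-1255");

        return Pattern.Replace(input, m =>
        {
            string payload = m.Groups[2].Value;
            byte[] bytes = m.Groups[1].Value.ToUpperInvariant() == "B"
                ? Convert.FromBase64String(payload)
                : DecodeQ(payload);
            return hebrew.GetString(bytes);
        });
    }

    // Q encoding: '_' stands for a space, "=XX" is one byte in hex,
    // anything else is taken as a literal ASCII character.
    static byte[] DecodeQ(string text)
    {
        var bytes = new List<byte>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '_')
            {
                bytes.Add((byte)' ');
            }
            else if (c == '=' && i + 2 < text.Length)
            {
                bytes.Add(Convert.ToByte(text.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                bytes.Add((byte)c);
            }
        }
        return bytes.ToArray();
    }
}
```

For example, EncodedWordSketch.DecodeAll("=?WINDOWS-1255?Q?=F9=EC=E5=ED?=") and EncodedWordSketch.DecodeAll("=?WINDOWS-1255?B?+ezl7Q==?=") should both give the word שלום. Malformed Base64 or hex digits will throw a FormatException, which this sketch does not handle.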

We did the same with the UTF-8 encoding, except that the Regex was different.
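An alternative to keeping a separate Regex per charset is to capture the charset name and let the framework look it up. The sketch below is one possible way to do that, not what the original code does; CharsetAwareSketch and DecodeBase64Words are made-up names, and only B (Base64) words are decoded here to keep it short.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: one pattern for any charset instead of one per charset.
static class CharsetAwareSketch
{
    // Group 1 = charset name, group 2 = encoding flag, group 3 = payload.
    static readonly Regex AnyEncodedWord = new Regex(
        @"=\?([^?]+)\?([BQ])\?([^?]*)\?=", RegexOptions.IgnoreCase);

    public static string DecodeBase64Words(string input)
    {
        return AnyEncodedWord.Replace(input, m =>
        {
            // Q words are left untouched in this sketch.
            if (!m.Groups[2].Value.Equals("B", StringComparison.OrdinalIgnoreCase))
                return m.Value;

            // Throws ArgumentException for charset names the runtime does not know.
            // Assumption: for windows-1255 on modern .NET, the code pages
            // provider has already been registered elsewhere.
            Encoding charset = Encoding.GetEncoding(m.Groups[1].Value);
            return charset.GetString(Convert.FromBase64String(m.Groups[3].Value));
        });
    }
}
```

With this, CharsetAwareSketch.DecodeBase64Words("Re: =?UTF-8?B?16nXnNeV150=?=") should return "Re: שלום".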

Using the Code

Instantiate the class with the text to decode and call the Decode method. The static method IsWindows1255 tells you whether the text is actually encoded, so you can avoid the exception the constructor throws when it is given non-encoded text.

Like this:

if (Windows1255Helpers.IsWindows1255(ret))
{
    Windows1255Helpers helper = new Windows1255Helpers(ret);
    ret = helper.Decode();
}

If you have further questions, don’t hesitate to send a mail.

Points of Interest

I read some of the specifications on encoding and decoding MIME messages and subjects;
if I missed something, please feel free to comment.

And please excuse any mistakes in my English. I’m from Israel, and it’s not my primary language. 🙂

History

First Version – 11.5.2009
