
Battle.net UTF-8 Information

Started by Skywing, May 03, 2003, 12:06 AM


Skywing

According to research done by Kp and backed by experiments the two of us carried out last night, Starcraft 1.10 introduced UTF-8 encoding for all text transmitted to/from Battle.net.  Presently the server does not properly encode/decode this text when relaying to/from legacy clients, so you'll be stuck with using US English when chatting to them for now.

Background for those who don't know about it:
UTF-8 is a method for encoding Unicode characters as 8-bit sequences.  You can use the Win32 WideCharToMultiByte and MultiByteToWideChar functions to translate things to and from UTF-8.  7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.
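
For anyone not on Win32, here is a rough portable sketch of the encoding direction in plain C++. The names EncodeUtf8 and CheckEncodeUtf8 are made up for illustration and are not taken from any Battle.net or Win32 code; treat it as a starting point under those assumptions rather than a finished routine.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Returns an empty string for values that cannot be encoded
// (surrogates U+D800..U+DFFF, or anything above U+10FFFF).
std::string EncodeUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp <= 0x7F) {
        // 7-bit ASCII passes through unchanged as a single byte.
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return out; // surrogates are not valid on their own
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Usage example: a few sanity checks you could call from main().
void CheckEncodeUtf8()
{
    assert(EncodeUtf8(0x41) == "A");                  // plain ASCII
    assert(EncodeUtf8(0xE9) == "\xC3\xA9");           // e with acute accent
    assert(EncodeUtf8(0x20AC) == "\xE2\x82\xAC");     // euro sign
    assert(EncodeUtf8(0x1F600) == "\xF0\x9F\x98\x80"); // outside the BMP
    assert(EncodeUtf8(0xD800).empty());               // lone surrogate
}
```

On Windows you would normally let WideCharToMultiByte with CP_UTF8 do this work instead of rolling your own.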

C++ bot developers may find the UTF-8 conversion routines which I originally wrote for my MSN Messenger client useful to handle this.  It would be advisable to deal natively in Unicode and remove the to-ANSI translation step, as this represents a potential loss in information.

This may pose a problem for many of the chat encryption/obfuscation schemes in use, as extended ASCII characters must be checked for after UTF-8 processing, and not before.
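
To illustrate the ordering issue, here is a minimal sketch that decodes UTF-8 first and only then looks for extended characters (128 through 255). DecodeUtf8, HasExtendedChars and ContainsExtendedAfterDecode are hypothetical names, and treating malformed input as "no extended characters" is just an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-8 into code points. Returns false on
// malformed input (bad lead byte, truncated sequence, bad continuation
// byte, overlong form, surrogate, or value above U+10FFFF).
bool DecodeUtf8(const std::string& in, std::vector<std::uint32_t>& out)
{
    out.clear();
    for (std::size_t i = 0; i < in.size(); ) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;      // continuation bytes expected
        std::uint32_t lowest = 0;   // smallest value legal at this length
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; lowest = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; lowest = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; lowest = 0x10000; }
        else return false;          // stray continuation or invalid lead byte
        if (in.size() - i - 1 < extra)
            return false;           // sequence runs off the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;       // not a 10xxxxxx continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < lowest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}

// True if any decoded character falls in the extended range 128..255.
bool HasExtendedChars(const std::vector<std::uint32_t>& cps)
{
    for (std::uint32_t cp : cps)
        if (cp >= 0x80 && cp <= 0xFF)
            return true;
    return false;
}

// Decode first, then check. Malformed input is reported as "false" here
// purely to keep the sketch small.
bool ContainsExtendedAfterDecode(const std::string& utf8)
{
    std::vector<std::uint32_t> cps;
    return DecodeUtf8(utf8, cps) && HasExtendedChars(cps);
}

// Usage example: a few sanity checks you could call from main().
void CheckExtendedDetection()
{
    assert(!ContainsExtendedAfterDecode("hello"));
    assert(ContainsExtendedAfterDecode("caf\xC3\xA9"));    // decodes to U+00E9
    assert(!ContainsExtendedAfterDecode("\xE2\x82\xAC"));  // U+20AC, not 128..255
}
```

Note the last check: every raw byte of the euro sign is above 127, so a scheme that scans the bytes before decoding would flag it even though the character itself is not in the extended ASCII range.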

tA-Kane

Quote from: Skywing on May 03, 2003, 12:06 AM
7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.
What do you do with a character whose value is 127 then?
Macintosh programmer and enthusiast.
Battle.net Bot Programming: http://www.bash.org/?240059
I can write programs. Can you right them?

http://www.clan-mac.com
http://www.eve-online.com

Skywing

Quote from: tA-Kane on May 03, 2003, 02:06 PM
Quote from: Skywing on May 03, 2003, 12:06 AM
7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.
What do you do with a character whose value is 127 then?
http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=UTF-8
http://www.cl.cam.ac.uk/~mgk25/unicode.html
Google is an excellent resource.

Camel

Quote from: tA-Kane on May 03, 2003, 02:06 PM
Quote from: Skywing on May 03, 2003, 12:06 AM
7-bit ASCII characters (<127) aren't specially encoded; however, anything above 127 must be encoded.
What do you do with a character whose value is 127 then?
i would go ahead and assume that it's the first 2^7 chars that don't need to be encoded; that would put 127 in the category of non-encoded

tA-Kane

Quote from: Camel on May 03, 2003, 09:33 PM
i would go ahead and assume that it's the first 2^7 chars that don't need to be encoded
I had assumed as much, since that would make more sense. By the way, it's (2^7)-1, otherwise 128 would be included.

Camel

no it wouldn't
i said the first 128 characters
0 is the first, 127 is the 128th
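
A quick way to settle the counting is to ask how many values need only one UTF-8 byte. The sketch below does that in C++; Utf8Length and CountSingleByteValues are made-up names for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bytes UTF-8 needs for a code point,
// or 0 if the value cannot be encoded (surrogate or above U+10FFFF).
int Utf8Length(std::uint32_t cp)
{
    if (cp <= 0x7F)   return 1;                       // 0..127
    if (cp <= 0x7FF)  return 2;                       // 128..2047
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;       // surrogates
    if (cp <= 0xFFFF) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

// Count how many of the values 0..255 fit in a single UTF-8 byte.
int CountSingleByteValues()
{
    int count = 0;
    for (std::uint32_t cp = 0; cp <= 0xFF; ++cp)
        if (Utf8Length(cp) == 1)
            ++count;
    return count;
}

// Usage example: sanity checks you could call from main().
void CheckBoundary()
{
    assert(Utf8Length(127) == 1);           // 127 is still one byte
    assert(Utf8Length(128) == 2);           // 128 is the first two-byte value
    assert(CountSingleByteValues() == 128); // values 0 through 127
}
```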