UnCensor() Method Code

Started by Joe[x86], March 20, 2005, 08:46 PM

Joe[x86]

I just ported this to Java from LoRd[nK]'s Unknown Bot source code. It's untested, and I don't know if I used replace correctly, but I believe I did.

EDIT 3/20/05 10:02PM: Alphabetized list, removed duplicates, added more words I found from searching forums.

    //author JoeTheOdd
    //removes censoring from messages in public Bnet channels
    //@arg message - message received from Bnet
    //@return - uncensored message
    public static String uncensor(String message)
    {
        message = message.replaceAll("@$%!@%&",   "asshole");
        message = message.replaceAll("@$%!@!&",   "asswipe");
        message = message.replaceAll("#@%$!",     "bitch");
        message = message.replaceAll("$!@!$",     "chink");
        message = message.replaceAll("$%@%",      "clit");
        message = message.replaceAll("$!$%",      "cock");
        message = message.replaceAll("$&!%",      "cunt");
        message = message.replaceAll("%@$%",      "dick");
        message = message.replaceAll("%@%&!",     "dildo");
        message = message.replaceAll("&#&$%",     "erect");
        message = message.replaceAll("!@!@!%",    "faggot");
        message = message.replaceAll("!&$%",      "fuck");
        message = message.replaceAll("!@!$",      "gook");
        message = message.replaceAll("$@$&",      "kike");
        message = message.replaceAll("$%$",       "kkk");
        message = message.replaceAll("$%&!",      "klux");
        message = message.replaceAll("%&$#@#!",   "lesbian");
        message = message.replaceAll("&@$%&#$@%", "masturbat");
        message = message.replaceAll("!@!@#",     "nigga");
        message = message.replaceAll("!@!@&#",    "nigger");
        message = message.replaceAll("!@!@%&",    "nipple");
        message = message.replaceAll("!#!@$&",    "orgasm");
        message = message.replaceAll("!&!@$",     "penis");
        message = message.replaceAll("!&$%@",     "pussy");
        message = message.replaceAll("$!@%",      "shit");
        message = message.replaceAll("$%&%",      "slut");
        message = message.replaceAll("!@!@!@",    "vagina");
        message = message.replaceAll("!@!#&",     "whore");
        return message;
    }
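
On the question of whether replace was used right: in Java, String.replaceAll treats its first argument as a regular expression, and `$` is the end-of-input anchor there, so a pattern such as "$!@%" is not matched as literal text. Below is a minimal sketch of two literal alternatives, assuming Java 5 or later. The class and method names are made up for illustration and are not from Unknown Bot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceDemo {
    // The unquoted call style from the post, kept for comparison.
    // "$" is parsed as an anchor, so the censored text is not found.
    static String viaRawRegex(String message, String censored, String word) {
        return message.replaceAll(censored, word);
    }

    // String.replace(CharSequence, CharSequence) matches its argument literally.
    static String viaReplace(String message, String censored, String word) {
        return message.replace(censored, word);
    }

    // replaceAll also works once pattern and replacement are both quoted.
    static String viaQuotedRegex(String message, String censored, String word) {
        return message.replaceAll(Pattern.quote(censored),
                                  Matcher.quoteReplacement(word));
    }

    public static void main(String[] args) {
        String message = "oh $!@% really";
        System.out.println(viaRawRegex(message, "$!@%", "shit"));    // unchanged
        System.out.println(viaReplace(message, "$!@%", "shit"));     // replaced
        System.out.println(viaQuotedRegex(message, "$!@%", "shit")); // replaced
    }
}
```

Swapping replaceAll for replace in the function above should be enough, since none of the replacement words need regex features.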
Quote from: brew on April 25, 2007, 07:33 PM
that made me feel like a total idiot. this entire thing was useless.

dxoigmn

One thing you'll want to be careful of is the ordering of the replaces.  For example, if you ran "!&$%@ !&$%" through this, it would output "fuck@ fuck" when it should output "pussy fuck".  This happens because "!&$%" is replaced before "!&$%@". Another thing to pay attention to is replacing only whole words, although I am not quite sure whether battle.net censors parts of words (e.g. "fuckface" => "!&$%face"); if it does, this wouldn't be a problem.
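
One way to avoid hand-ordering the list is to keep the pairs in a table and apply the longest censored strings first, so a string is never consumed by a shorter one that it starts with. This is only a sketch: OrderedUncensor and its table argument are hypothetical names, and it uses the two entries from the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderedUncensor {
    // table maps censored string -> original word.
    static String uncensor(String message, Map<String, String> table) {
        List<Map.Entry<String, String>> entries = new ArrayList<>(table.entrySet());
        // Longest censored string first, so "!&$%@" is handled before "!&$%".
        entries.sort((a, b) -> Integer.compare(b.getKey().length(), a.getKey().length()));
        for (Map.Entry<String, String> e : entries) {
            // Literal replacement; no regex interpretation of '$'.
            message = message.replace(e.getKey(), e.getValue());
        }
        return message;
    }

    public static void main(String[] args) {
        Map<String, String> table = new LinkedHashMap<>();
        table.put("!&$%", "fuck");   // deliberately inserted in the "wrong" order
        table.put("!&$%@", "pussy");
        System.out.println(uncensor("!&$%@ !&$%", table)); // pussy fuck
    }
}
```

Longest-first only covers the case where one censored string contains another. Two censored words typed back to back can still be ambiguous.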

Joe[x86]

Whoops, guess alphabetizing it wasn't the best decision then, eh? I'll have to fix that after school.

iago

Another thing I'd do is:

if(message.indexOf('$') >= 0 || message.indexOf('!') >= 0)
{
   ... do everything
}


That way it won't run all those replaces EVERY time.  And yes, I checked, every one of them has a $ or ! in it.
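
In Java the check might look like the sketch below. GuardedUncensor is a made-up name, and the two replace calls stand in for the full list.

```java
class GuardedUncensor {
    // Every censored string in the list contains '$' or '!', so a message
    // with neither character cannot contain a censored word.
    static boolean mightBeCensored(String message) {
        return message.indexOf('$') >= 0 || message.indexOf('!') >= 0;
    }

    static String uncensor(String message) {
        if (!mightBeCensored(message)) {
            return message; // skip all the replaces for ordinary chat
        }
        // Stand-in for the full list of replacements.
        message = message.replace("!&$%", "fuck");
        message = message.replace("$!@%", "shit");
        return message;
    }

    public static void main(String[] args) {
        System.out.println(uncensor("hello there")); // returned untouched
        System.out.println(uncensor("well $!@%"));   // well shit
    }
}
```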
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Joe[x86]

Hmph. iago just loves optimization. VB programmers don't like that.

Fixed ordering and ported to C++ and VB.

Java:
//removes censoring from messages in public Bnet channels
//@arg message - message received from Bnet
//@return - uncensored message
public static String uncensor(String message)
{
    message = message.replaceAll("@$%!@%&",   "asshole");
    message = message.replaceAll("@$%!@!&",   "asswipe");
    message = message.replaceAll("#@%$!",     "bitch");
    message = message.replaceAll("$!@!$",     "chink");
    message = message.replaceAll("$%@%",      "clit");
    message = message.replaceAll("$!$%",      "cock");
    message = message.replaceAll("$&!%",      "cunt");
    message = message.replaceAll("%@$%",      "dick");
    message = message.replaceAll("%@%&!",     "dildo");
    message = message.replaceAll("&#&$%",     "erect");
    message = message.replaceAll("!@!@!%",    "faggot");
    message = message.replaceAll("!&$%@",     "pussy");
    message = message.replaceAll("!&$%",      "fuck");
    message = message.replaceAll("!@!$",      "gook");
    message = message.replaceAll("$@$&",      "kike");
    message = message.replaceAll("$%$",       "kkk");
    message = message.replaceAll("$%&!",      "klux");
    message = message.replaceAll("%&$#@#!",   "lesbian");
    message = message.replaceAll("&@$%&#$@%", "masturbat");
    message = message.replaceAll("!@!@#",     "nigga");
    message = message.replaceAll("!@!@&#",    "nigger");
    message = message.replaceAll("!@!@%&",    "nipple");
    message = message.replaceAll("!#!@$&",    "orgasm");
    message = message.replaceAll("!&!@$",     "penis");
    message = message.replaceAll("$!@%",      "shit");
    message = message.replaceAll("$%&%",      "slut");
    message = message.replaceAll("!@!@!@",    "vagina");
    message = message.replaceAll("!@!#&",     "whore");
    return message;
}


C++:
//removes censoring from messages in public Bnet channels
//@arg *txt - message received from Bnet
//@return - uncensored message
//
//I don't know C++ very well, but this doesn't
//appear to return anything, from my knowledge of the language.
//Thanks to Grok[vL] for help with this (indirectly, but still..)
void decensor(char *txt)
{
    replace(txt, "@$%!@%&",   "asshole");
    replace(txt, "@$%!@!&",   "asswipe");
    replace(txt, "#@%$!",     "bitch");
    replace(txt, "$!@!$",     "chink");
    replace(txt, "$%@%",      "clit");
    replace(txt, "$!$%",      "cock");
    replace(txt, "$&!%",      "cunt");
    replace(txt, "%@$%",      "dick");
    replace(txt, "%@%&!",     "dildo");
    replace(txt, "&#&$%",     "erect");
    replace(txt, "!@!@!%",    "faggot");
    replace(txt, "!&$%@",     "pussy");
    replace(txt, "!&$%",      "fuck");
    replace(txt, "!@!$",      "gook");
    replace(txt, "$@$&",      "kike");
    replace(txt, "$%$",       "kkk");
    replace(txt, "$%&!",      "klux");
    replace(txt, "%&$#@#!",   "lesbian");
    replace(txt, "&@$%&#$@%", "masturbat");
    replace(txt, "!@!@#",     "nigga");
    replace(txt, "!@!@&#",    "nigger");
    replace(txt, "!@!@%&",    "nipple");
    replace(txt, "!#!@$&",    "orgasm");
    replace(txt, "!&!@$",     "penis");
    replace(txt, "$!@%",      "shit");
    replace(txt, "$%&%",      "slut");
    replace(txt, "!@!@!@",    "vagina");
    replace(txt, "!@!#&",     "whore");
}


VB:
'//removes censoring from messages in public Bnet channels
'//@arg message - message received from Bnet
'//@return - uncensored message
Public Function uncensor(message As String) As String
    message = Replace(message, "@$%!@%&",   "asshole")
    message = Replace(message, "@$%!@!&",   "asswipe")
    message = Replace(message, "#@%$!",     "bitch")
    message = Replace(message, "$!@!$",     "chink")
    message = Replace(message, "$%@%",      "clit")
    message = Replace(message, "$!$%",      "cock")
    message = Replace(message, "$&!%",      "cunt")
    message = Replace(message, "%@$%",      "dick")
    message = Replace(message, "%@%&!",     "dildo")
    message = Replace(message, "&#&$%",     "erect")
    message = Replace(message, "!@!@!%",    "faggot")
    message = Replace(message, "!&$%@",     "pussy")
    message = Replace(message, "!&$%",      "fuck")
    message = Replace(message, "!@!$",      "gook")
    message = Replace(message, "$@$&",      "kike")
    message = Replace(message, "$%$",       "kkk")
    message = Replace(message, "$%&!",      "klux")
    message = Replace(message, "%&$#@#!",   "lesbian")
    message = Replace(message, "&@$%&#$@%", "masturbat")
    message = Replace(message, "!@!@#",     "nigga")
    message = Replace(message, "!@!@&#",    "nigger")
    message = Replace(message, "!@!@%&",    "nipple")
    message = Replace(message, "!#!@$&",    "orgasm")
    message = Replace(message, "!&!@$",     "penis")
    message = Replace(message, "$!@%",      "shit")
    message = Replace(message, "$%&%",      "slut")
    message = Replace(message, "!@!@!@",    "vagina")
    message = Replace(message, "!@!#&",     "whore")
    uncensor = message
End Function

Eric

Quote from: Joey on March 20, 2005, 08:46 PM
    //author JoeTheOdd


Note that you're not the author of a piece of code if all you did was port it.

Newby

If I remember correctly, that function came from a bot made way before Unknown Bot.

Or one similar to it did, anyhow.
- Newby

Quote[17:32:45] * xar sets mode: -oooooooooo algorithm ban chris cipher newby stdio TehUser tnarongi|away vursed warz
[17:32:54] * xar sets mode: +o newby
[17:32:58] <xar> new rule
[17:33:02] <xar> me and newby rule all

Quote<TehUser> Man, I can't get Xorg to work properly.  This sucks.
<torque> you should probably kill yourself
<TehUser> I think I will.  Thanks, torque.

MyndFyre

Quote from: Joey on March 21, 2005, 04:27 PM
Hmph. iago just loves optimization. VB programmers don't like that.
...unless you're actually a GOOD programmer trying to write GOOD code, in which case you DO like that.  You know, like professionals would?

Instead of making a scornful remark, why don't you attempt to LEARN from someone who is knowledgeable and experienced?
QuoteEvery generation of humans believed it had all the answers it needed, except for a few mysteries they assumed would be solved at any moment. And they all believed their ancestors were simplistic and deluded. What are the odds that you are the first generation of humans who will understand reality?

After 3 years, it's on the horizon.  The new JinxBot, and BN#, the managed Battle.net Client library.

Quote from: chyea on January 16, 2009, 05:05 PM
You've just located global warming.

iago

Quote from: MyndFyre on March 21, 2005, 06:29 PM
Instead of making a scornful remark

I'd call it a "joke".  He's said the same thing to me before :)


MyndFyre

Quote from: iago on March 21, 2005, 06:33 PM
Quote from: MyndFyre on March 21, 2005, 06:29 PM
Instead of making a scornful remark

I'd call it a "joke".  He's said the same thing to me before :)

I would generally agree, except that he didn't actually incorporate your suggested change, which I think would have been, mm, 6 lines?  :P

Lenny

I find it rather strange that battle.net uses a different set of censor characters for different words.  It almost makes me think some sort of formula was followed.

There appear to be some patterns, but nothing consistent at first glance.  But notice that the length of each replacement does match its word...
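
Both observations can be checked mechanically. The sketch below tests whether every censored string has the same length as its word, and whether the list is consistent with a simple per-letter substitution (each letter always becoming the same symbol). CensorPatternCheck is a hypothetical helper, and only a handful of pairs from the list are used.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class CensorPatternCheck {
    // table maps censored string -> original word.
    static boolean lengthsMatch(Map<String, String> table) {
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (e.getKey().length() != e.getValue().length()) {
                return false;
            }
        }
        return true;
    }

    // True if every letter maps to one symbol throughout the table.
    // Several letters sharing a symbol is allowed.
    static boolean isLetterSubstitution(Map<String, String> table) {
        Map<Character, Character> seen = new HashMap<>();
        for (Map.Entry<String, String> e : table.entrySet()) {
            String censored = e.getKey();
            String word = e.getValue();
            if (censored.length() != word.length()) {
                return false;
            }
            for (int i = 0; i < word.length(); i++) {
                char symbol = censored.charAt(i);
                Character previous = seen.putIfAbsent(word.charAt(i), symbol);
                if (previous != null && previous != symbol) {
                    return false; // same letter, two different symbols
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> table = new LinkedHashMap<>();
        table.put("!&$%", "fuck");
        table.put("%@$%", "dick");
        table.put("$!$%", "cock");
        table.put("$!@%", "shit");
        System.out.println(lengthsMatch(table));         // true
        System.out.println(isLetterSubstitution(table)); // true
        table.put("@$%!@%&", "asshole");
        System.out.println(isLetterSubstitution(table)); // false: 's' maps to '$' and '%'
    }
}
```

A failed check does not rule out a formula. The list was collected from forum posts, so a conflicting entry may simply be wrong.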



The Bovine Revolution
Something unimportant

Live Battle.net:

WARNING: The preceding message may have contained content unsuitable for young children.

Ban

Or perhaps they just bashed random shift-number keys until they came up with a string of similar length. If they had changed the length of the string, the point of the original message might never have gotten across; some things are needed to properly understand some sentences/messages.

Adron

Quote from: Ban on March 22, 2005, 09:38 AM
Or perhaps they just bashed random shift-number keys until they came up with a string of similar length. If they had changed the length of the string, the point of the original message might never have gotten across; some things are needed to properly understand some sentences/messages.

Or if they had changed the length of the message, they would have had to copy the resultant string to a new position in memory, incurring a performance loss. By ensuring that the replacement strings are the same length as the original strings, they can just overwrite the censored words in place.
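
A rough illustration of that idea, written in Java for consistency with the code earlier in the thread. This is a guess at the general technique, not Battle.net's actual implementation, and InPlaceCensor is an invented name.

```java
class InPlaceCensor {
    // Overwrites every occurrence of word in buffer with mask and returns
    // the number of hits. No new buffer is allocated because the mask is
    // required to be exactly as long as the word it replaces.
    static int censorInPlace(char[] buffer, String word, String mask) {
        if (word.isEmpty() || word.length() != mask.length()) {
            throw new IllegalArgumentException("mask must be as long as the word");
        }
        int hits = 0;
        int i = 0;
        while (i + word.length() <= buffer.length) {
            if (matchesAt(buffer, i, word)) {
                mask.getChars(0, mask.length(), buffer, i); // overwrite in place
                i += word.length();
                hits++;
            } else {
                i++;
            }
        }
        return hits;
    }

    private static boolean matchesAt(char[] buffer, int offset, String word) {
        for (int j = 0; j < word.length(); j++) {
            if (buffer[offset + j] != word.charAt(j)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] buffer = "oh shit, shit".toCharArray();
        int hits = censorInPlace(buffer, "shit", "$!@%");
        System.out.println(hits + " " + new String(buffer)); // 2 oh $!@%, $!@%
    }
}
```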

iago

If they really wanted to hide it, they could have done "penis"->"*****", etc.  But they didn't, so they must have had some intention of letting people decode them :-/


Lenny

This is one of the reasons I believe they could have gone through a formula to censor the text.  Perhaps the obscenities are categorized; it just seems as if a pattern exists.  But it might only appear so because of the small character range used.

As far as performance goes, it would probably be best if the text wasn't censored at all, eliminating the need to filter every message passed to the server.  But since it is, optimizing it as much as possible would seem logical.
