• Welcome to Valhalla Legends Archive.
 

What can go wrong with this code?

Started by Skywing, November 30, 2003, 11:33 PM

Previous topic - Next topic

DarkMinion


DarkMinion

Ok, here is A problem that I have found...

I ran this code (modified a little to show the value of ch):


#include <stdio.h>
#include <ctype.h>

int main(int ac, char **av)
{
   char ch = getc(stdin);
   printf("isspace(ch(%u)) returns %u.\n", ch, isspace(ch));
   return 0;
}


If you input the character alt+160, you get these results:



What happens is 0xffffff gets stored in ch, and that screws up the compilation of isspace()

Declaring ch as int fixes this problem...

iago

Another potential problem is that the user can enter A LOT of crap before pressing enter.  This might overflow some internal buffer and break the code, depending on the OS and the implementation of the stdin buffer.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Skywing

iago is totally off.  This isn't a problem with input buffering at all.  The character in question could have come from anywhere.

iago

This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


DarkMinion


Skywing

Quote from: DarkMinion on December 02, 2003, 01:51 PM
So...am I right or wrong?
You found something, but I'm not sure if you recognized what it would cause.

iago

Well, the problem that DM's thing shows is caused by sign extention, it seems.  So any character >=0x80 would be sign extended.  But I don't see that causing a problem?
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Adron

I stand by my original answer. If the purpose of the code is to work the way the code is written, then there can be no error as long as it compiles. If you want us to find an error, you'll have to define what the code is supposed to do. Well, and the pits I pointed out.  

Skywing

Quote from: Adron on December 02, 2003, 04:21 PM
I stand by my original answer. If the purpose of the code is to work the way the code is written, then there can be no error as long as it compiles. If you want us to find an error, you'll have to define what the code is supposed to do. Well, and the pits I pointed out.  
The code is supposed to determine if a character is whitespace or not.

DarkMinion

Yeah, and alt+0160 is whitespace, but using your code, isspace says it's not!  Problem!!!  I win!!! :P

iago

It's because the parameter to isspace is an int, and ch is a char, so ch is being sign-extended :)
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Adron

There's no risk of spaces, tabs, formfeeds, linefeeds or carriage returns ever getting sign extended though? The manual says:

Quote
Return Value

isspace returns a non-zero value if c is a white-space character (0x09 – 0x0D or 0x20). iswspace returns a non-zero value if c is a wide character that corresponds to a standard white-space character or is one of an implementation-defined set of wide characters for which iswalnum is false. Each of these routines returns 0 if c does not satisfy the test condition.


Moonshine

There may be potential problems if one doesn't do bounds checking on 'ch', if you link with the debug CRT library.

Quote
When used with a debug CRT library, isspace will display a CRT assert if passed a parameter that isn't EOF or in the range of 0 through 0xFF. When used with a debug CRT library, isspace will use the parameter as an index into an array, with undefined results if the parameter isn't EOF or in the range of 0 through 0xFF.


Skywing

#44
The problem is that if char is signed, and the character ch is a negative number, it will be sign extended to a negative integer (which is probably not going to be equal to EOF) when passed to isspace.  This is out of range input for isspace and causes undefined behavior.  One fix is to do the following:

isspace((unsigned char)ch)

The problem is perhaps more insidious if a char array had been read in via gets or fgets and you had a loop that went through each character checking to see if that character is whitespace or not.  In that case, you wouldn't even have the possible warning that something could be up due to getc() returning an int and possibly returning EOF.

|