• Welcome to Valhalla Legends Archive.
 

Jump table (what is it?)?

Started by tA-Kane, April 09, 2003, 05:46 AM

Previous topic - Next topic

tA-Kane

If you've a set of mutually exclusive integer values which select what code to run, is it always better to use a jump table? If not, when would it be better not to?

Also, I've not read anything about jump tables for PPC ASM... how do you create one on x86 (perhaps I can take that knowledge and experiment with PPC)?

Or, is a jump table just (pseudocode) "if value=this then jump to here, if value = this2 then jump to here2, etc" ?
Macintosh programmer and enthusiast.
Battle.net Bot Programming: http://www.bash.org/?240059
I can write programs. Can you right them?

http://www.clan-mac.com
http://www.eve-online.com

Skywing

#1
Quote from: tA-Kane on April 09, 2003, 05:46 AM
If you've a set of mutually exclusive integer values which select what code to run, is it always better to use a jump table? If not, when would it be better not to?

Also, I've not read anything about jump tables for PPC ASM... how do you create one on x86 (perhaps I can take that knowledge and experiment with PPC)?

Or, is a jump table just (pseudocode) "if value=this then jump to here, if value = this2 then jump to here2, etc" ?
Jump tables are typically used when a large number of values are defined within a similar range.  Usually, you have two tables - one mapping an input value to an index into the second table, which has the address of a handler routine.

A simple x86 example, assuming the value to be tested is in eax and has already been range checked/adjusted:
movzx al, byte ptr [indextable+eax]
jmp dword ptr [handlertable+eax*4]

tA-Kane

Thanks.

So, this pseudocode would be a jump table?


jump to (beginning of table plus (value*4))
(beginning of table)
jump to myhandler1
jump to myhandler2
jump to myhandler3
etc

Also, I assume *4 would be *8 on a 64 bit processor?


If that's a jump table, then wouldn't you need to verify that value isn't an invalid number (which would result in jumping to bad code)? If that's the case, then you'd need to check to verify it's a good value. If you do that, then why not jump after verifying instead of using a jump table?
Macintosh programmer and enthusiast.
Battle.net Bot Programming: http://www.bash.org/?240059
I can write programs. Can you right them?

http://www.clan-mac.com
http://www.eve-online.com

Yoni

#3
Quote from: tA-Kane on April 09, 2003, 07:52 AMSo, this pseudocode would be a jump table?


jump to (beginning of table plus (value*4))
(beginning of table)
jump to myhandler1
jump to myhandler2
jump to myhandler3
etc

No, it's more like this:


jump to *(beginning of table plus (value*4))
(beginning of table)
myhandler1
myhandler2
myhandler3
etc


i.e., the jump table contains addresses, not jump instructions.

tA-Kane

#4
Thanks, Skywing and Yoni, you've answered my basic question.

But, now I have more... :P


*4, that's assuming the machine uses 32 bit memory addressing and/or runs on a 32 bit processor?

Wouldn't you need to verify that value isn't an invalid number (which would result in jumping to bad code)?

If the value isn't a contiguous integer (eg, first value being 1 jumps to the first pointer, value 2 jumps to the second, but value 3 is not expected, though value 4 jumps to a third pointer, and etc similar), then would it be better to do a check and jump if the check is true, or better to use a jump table (with "padding addresses" pointing to a "bad_value_provided" function)?
Macintosh programmer and enthusiast.
Battle.net Bot Programming: http://www.bash.org/?240059
I can write programs. Can you right them?

http://www.clan-mac.com
http://www.eve-online.com

Yoni

#5
Quote from: tA-Kane on April 09, 2003, 11:08 AM
*4, that's assuming the machine uses 32 bit memory addressing and/or runs on a 32 bit processor?
Yes. For code compiled for a 64-bit processor, the formula will use 8 instead of 4.
The general "formula" can be written as something like: jumptable + value * sizeof(void*)

Quote from: tA-Kane on April 09, 2003, 11:08 AMWouldn't you need to verify that value isn't an invalid number (which would result in jumping to bad code)?
Yes. Here's an example (this uses C/C++ code, which can cause a jump table to be generated by the compiler/optimizer):
Code that looks like this:
switch(x) {
case 1: stuff1; break;
case 2: stuff2; break;
case 3: stuff3; break;
}
stuff4;

Might be translated to something like (pseudocode):
jumptable[3] = { &stuff1, &stuff2, &stuff3 }
if(x >= 1 and x <= 3) goto jumptable[x-1] else goto &stuff4


Note that if you use Visual C++ (6.0 and above), there is a Microsoft-specific extension you can use: __assume(0); which means "this code should never be reached, *pokes optimizer*".
Then code that looks like this:
switch(x) {
case 1: stuff1; break;
case 2: stuff2; break;
case 3: stuff3; break;
default: __assume(0);
}
stuff4;

Might be translated to:
jumptable[3] = { &stuff1, &stuff2, &stuff3 }
goto jumptable[x-1]

(The optimizer assumes x will never be anything other than 1, 2, or 3, so it drops the checking.)

Quote from: tA-Kane on April 09, 2003, 11:08 AMIf the value isn't a contiguous integer (eg, first value being 1 jumps to the first pointer, value 2 jumps to the second, but value 3 is not expected, though value 4 jumps to a third pointer, and etc similar), then would it be better to do a check and jump if the check is true, or better to use a jump table (with "padding addresses" pointing to a "bad_value_provided" function)?
If there are big gaps between values, maybe a jump table is not such a good idea. Whether a jump table or a bunch of cmp and jz instructions (or their equivalent in your processor architecture) are used is up to you (read: up to the optimizer).

iago

On our homework (in c++) we used switches instead of if's, and he called it inefficient.  But a single compare/jump is much more efficient than many jumps, so obviously our marker knows nothing (duh)
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


FreakingDude

I don't understand why you are multiplying it by 4? please explain why you would multiply it by 4 on 32 bit and 8 on 64 bit and what would you multiply it by on 16 bit??? please help ^^

Kp

Quote from: FreakingDude on April 11, 2003, 03:35 PM
I don't understand why you are multiplying it by 4? please explain why you would multiply it by 4 on 32 bit and 8 on 64 bit and what would you multiply it by on 16 bit??? please help ^^

It's multiplied by the number of bytes in a single address.  For 32 bit mode, this is 32/8 = 4.  For 64bit mode, 64/8 = 8.  For 16bit mode, 16/8 = 2.
[19:20:23] (BotNet) <[vL]Kp> Any idiot can make a bot with CSB, and many do!

FreakingDude

but why? I don't understnad why you need to multiply it by the number of bytes, why does the computer need you to multiply it by the number of bytes? why can't you just use the address thats there?

Skywing

Quote from: FreakingDude on April 11, 2003, 04:05 PM
but why? I don't understnad why you need to multiply it by the number of bytes, why does the computer need you to multiply it by the number of bytes? why can't you just use the address thats there?
Because the address size depends on what addressing mode you're using (surprise!), and thus you need to adjust the table lookups based on the size of a pointer.

Adron

Note that you have to do this for any array indexing when you are programming in assembly language. If items in your array aren't in sequential addresses (i.e. mostly byte-size) then you need to multiply your index by the size of an item. A high-level programming language (like C, VB, whatever) takes care of the multiplication for you. Although in C you have to be aware of it sometimes.