Title: Is there a better way to do this?
Post by: Paul on November 16, 2003, 02:03 AM

Where [esi+3C] = 12345678 AND [esi+3C] is constantly updating with a new DWORD value...

My goal is to throw 12345678 into my own static pointer, but in reverse order 87654321.

This is what I've been bouncing around in my head...

push eax
mov al, byte ptr [esi+3F]
mov byte ptr [Pointer part 1], al
mov al, byte ptr [esi+3E]
mov byte ptr [Pointer part 2], al
mov al, byte ptr [esi+3D]
mov byte ptr [Pointer part 3], al
mov al, byte ptr [esi+3C]
mov byte ptr [Pointer part 4], al
pop eax
ret

Where the result would = DWORD of 87654321 in [Pointer], plus account for the auto-updating of the DWORD value in [esi+3C]. So if 12345678 changes to 23456789, it will become 98765432 in my new static pointer.

If I'm rambling I apologize...

Anyway, my question is: Is there a better way to do this?

Title: Re:Is there a better way to do this?
Post by: Adron on November 16, 2003, 07:05 AM

This is how I picture it:

mov ecx, 8
xor edx, edx
mov eax, [esi+3ch]
shifting:
shl edx, 4
mov ebx, eax
shr eax, 4
and ebx, 15
or edx, ebx
loop shifting
mov pointer, edx

Title: Out of curiosity
Post by: Kp on November 16, 2003, 10:37 AM

Why are you seeking to reverse the nibbles as well? Usually people just want to reverse the ordering at the byte level. :)
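To make the distinction concrete, here is a minimal C sketch (the function name reverse_groups is made up for illustration, not something from the thread). It treats both operations as the same loop with a different group width: a width of 8 reverses bytes, which is what bswap does, while a width of 4 reverses nibbles, which is what the original post's 12345678 -> 87654321 example describes.

```c
#include <stdint.h>

/* Reverse the order of fixed-width bit groups in a 32-bit value.
   Assumes width divides 32 and is smaller than 32
   (8 = bytes, 4 = nibbles). */
uint32_t reverse_groups(uint32_t v, unsigned width)
{
    uint32_t mask = (1u << width) - 1u;
    uint32_t r = 0;
    for (unsigned i = 0; i < 32u / width; i++) {
        r = (r << width) | (v & mask);  /* push the lowest group onto the result */
        v >>= width;                    /* drop that group from the input */
    }
    return r;
}
```

With the value from the original post, reverse_groups(0x12345678, 8) gives 0x78563412 and reverse_groups(0x12345678, 4) gives 0x87654321.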

Title: There is a dedicated instruction for this...
Post by: Skywing on November 16, 2003, 10:58 AM

Use the bswap instruction - available on i486 and higher.

Title: Actually, Sky...
Post by: Kp on November 16, 2003, 11:02 AM

Quote from: Skywing on November 16, 2003, 10:58 AM
Use the bswap instruction - available on i486 and higher.

I started to post the same thing, then noticed that he wants swapping on a per-nibble basis, not per-byte like bswap does. Hence my query to him. Of course, the code he provided us as an example is wrong if he really does want a per-nibble swap. :)

Title: Re:Is there a better way to do this?
Post by: Adron on November 16, 2003, 11:04 AM

Well, I was assuming that his explanation was correct and the code possibly wrong since he wanted help with the code...

I was looking for the bswap, but it wasn't listed in 386intel.txt.....

edit:

mov eax, [esi+3ch]
bswap eax
mov ebx, eax
and eax, 0f0f0f0fh
and ebx, 0f0f0f0f0h
shl eax, 4
shr ebx, 4
or eax, ebx
mov pointer, eax
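The mask-and-shift pattern in the code above also works for the byte swap itself, so the whole nibble reversal can be written in C without a loop or inline assembly. The following is only a sketch under that assumption; reverse_nibbles_masked is a hypothetical name, and the first two steps stand in for bswap.

```c
#include <stdint.h>

/* Reverse all eight nibbles of a 32-bit value without a loop. */
uint32_t reverse_nibbles_masked(uint32_t v)
{
    /* Swap the two 16-bit halves. */
    v = (v << 16) | (v >> 16);
    /* Swap the bytes inside each half; together with the previous
       step this has the same effect as bswap. */
    v = ((v & 0x00FF00FFu) << 8) | ((v >> 8) & 0x00FF00FFu);
    /* Swap the nibbles inside each byte, like the and/shl/shr/or
       sequence in the assembly version. */
    v = ((v & 0x0F0F0F0Fu) << 4) | ((v >> 4) & 0x0F0F0F0Fu);
    return v;
}
```

For the values in the original post, 0x12345678 becomes 0x87654321 and 0x23456789 becomes 0x98765432.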

Title: He can do it either way now...
Post by: Skywing on November 16, 2003, 11:05 AM

Quote from: Adron on November 16, 2003, 11:04 AM
Well, I was assuming that his explanation was correct and the code possibly wrong since he wanted help with the code...

Now he knows how to accomplish either of his listed goals. So, we should have everything covered? :p

Title: Re:Is there a better way to do this?
Post by: Paul on November 16, 2003, 07:09 PM

Excellent, when I get off work I'll be able to test/compile this code into my project. Thanks much!

Title: Re:Is there a better way to do this?
Post by: Skywing on November 16, 2003, 07:11 PM

Quote from: Paul on November 16, 2003, 07:09 PM
Excellent, when I get off work I'll be able to test/compile this code into my project. Thanks much!

So, which did you want to do, anyway?

Title: Re:Is there a better way to do this?
Post by: Paul on November 16, 2003, 07:12 PM

Byte swapping!

Title: Re:Is there a better way to do this?
Post by: CupHead on November 26, 2003, 11:25 AM

Along similar lines, I was talking to Sky this morning about byte swapping for other-endian protocols... So he gave me the function:

unsigned short bswap(unsigned short u) { return ((u & 0xff) << 8) | (u >> 8); }

This is all well and good, but I decided to write it in ASM for the hell of it and got:

WORD ByteSwapWORD( WORD x )
{
   __asm
   {
      mov      ax, x
      and      ax, 0xff
      shl      ax, 8
      shr      x, 8
      or      x, ax
   }

   return x;
}

Which is all well and good until you get to DWORDs. Damned if I could figure out how to do the swapping in C, so I went with ASM again, this time coming up with:

DWORD ByteSwapDWORD( DWORD x )
{
   __asm
   {
      push   edx
      push   ebx

      mov      edx, x      // 01 02 03 04
      
      mov      ax, dx      // dx = 03 04
      and      ax, 0xff   // ax = 03 04 -> 0000 0011 0000 0100 -> 0000 0000 0000 0100
      shl      ax, 8      // ax = 0000 0100 0000 0000
      shr      dx, 8      // dx = 0000 0000 0000 0011
      or      dx, ax      // dx = 0000 0100 0000 0011 -> 04 03

      shl      edx, 16      // edx = 0000 0100 0000 0011 0000 0000 0000 0000 -> 04 03 00 00

      xor      ebx, ebx
      mov      eax, x      // eax = 01 02 03 04
      shr      eax, 16      // eax = 00 00 01 02 -> 0000 0000 0000 0000 0000 0001 0000 0010
      mov      bx, ax
      and      ax, 0xff   // ax = 0000 0000 0000 0010
      shl      ax, 8      // ax = 0000 0010 0000 0000
      shr      bx, 8      // bx = 0000 0000 0000 0001
      or      bx, ax      // bx = 0000 0010 0000 0001 -> 02 01
   
      or      edx, ebx   // edx = 0000 0100 0000 0011 0000 0010 0000 0001 -> 04 03 02 01
      mov      x, edx

      pop      ebx
      pop      edx
   }

   return x;
}

As you can see from the comments, I was having lots of fun working out the binary for the instructions and stuff. Anyway, after finishing this behemoth of a function, I seemed to remember something like this on the forums and whipped out my handy (and free) IA-32 Architecture Software Developer's Manual Volume 2: Instruction Set Reference. I looked up bswap and to my amazement, it did the whole DWORD thing in just one instruction. Well, damn, that was a lot of wasted effort. Then it said see xchg for 16-bit numbers and there was a single instruction that did it for words. *sigh* Final code looks like:

WORD ByteSwapWORD( WORD x )
{
   __asm
   {
      mov      ax, x
      xchg   ah, al
      mov      x, ax
   }

   return x;
}

DWORD ByteSwapDWORD( DWORD x )
{
   __asm
   {
      mov      eax, x
      bswap   eax
      mov      x, eax
   }

   return x;
}

Anyway, just thought I'd vent some frustration. :P
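For the part about not finding a way to do the DWORD swap in C: it can be done with the same shifts and masks as the 16-bit version. This is a sketch rather than anything posted in the thread, and the names byte_swap_16 and byte_swap_32 are made up. It uses the fixed-width types from stdint.h instead of WORD and DWORD so it builds without the Windows headers.

```c
#include <stdint.h>

/* 16-bit swap: same idea as the one-line C function quoted above. */
uint16_t byte_swap_16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* 32-bit swap: move each byte to its mirrored position.
   01 02 03 04 -> 04 03 02 01 */
uint32_t byte_swap_32(uint32_t x)
{
    return  (x >> 24)                   /* byte 3 -> byte 0 */
         | ((x >>  8) & 0x0000FF00u)    /* byte 2 -> byte 1 */
         | ((x <<  8) & 0x00FF0000u)    /* byte 1 -> byte 2 */
         |  (x << 24);                  /* byte 0 -> byte 3 */
}
```

Whether a given compiler turns byte_swap_32 into a single bswap depends on the compiler and its optimization settings, so check the generated code if that matters.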

Title: Re:Is there a better way to do this?
Post by: Skywing on November 26, 2003, 11:34 AM

You could improve those by making them naked and fastcall.

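__declspec(naked) unsigned short __fastcall ByteSwapWORD(unsigned short)
{
 __asm {
  xchg cl, ch   // __fastcall passes the argument in cx
  mov ax, cx    // return value goes in ax
  ret           // naked functions get no compiler-generated epilogue
 }
}

__declspec(naked) unsigned long __fastcall ByteSwapDWORD(unsigned long)
{
 __asm {
  bswap ecx     // __fastcall passes the argument in ecx
  mov eax, ecx  // return value goes in eax
  ret           // no stack arguments, so a plain ret is enough
 }
}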

Title: Re:Is there a better way to do this?
Post by: Kp on November 26, 2003, 04:23 PM

Quote from: Skywing on November 26, 2003, 11:34 AM
You could improve those by making them naked and fastcall.

Even better, make them __attribute__ ((regparm (1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;) Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually. Same with bswap -- just because it's one instruction, it might not be fast. Finally, I'm certain CupHead's dword swapper is bloated. I've inlined something that has that effect several times and it's never been that long. :)

Title: Re:Is there a better way to do this?
Post by: CupHead on November 26, 2003, 04:45 PM

I'm sure it's bloated too, probably because I used an instruction for each step (and you can see the progression). Obviously there would be faster ways like just swapping the inner bytes and then the outer bytes, but what you see is what I've got.

Title: Re:Is there a better way to do this?
Post by: Etheran on November 26, 2003, 06:24 PM

Quote from: Kp on November 26, 2003, 04:23 PM
Quote from: Skywing on November 26, 2003, 11:34 AM
You could improve those by making them naked and fastcall.

Even better, make them __attribute__ ((regparm (1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;) Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually. Same with bswap -- just because it's one instruction, it might not be fast. Finally, I'm certain CupHead's dword swapper is bloated. I've inlined something that has that effect several times and it's never been that long. :)

How would you do that in msvc++ ? I can't find regparm or __attribute__ on msdn.

__attribute__((regparm(1))) ... ?

Title: Re:Is there a better way to do this?
Post by: Skywing on November 26, 2003, 06:44 PM

Quote from: Etheran on November 26, 2003, 06:24 PM
Quote from: Kp on November 26, 2003, 04:23 PM
Quote from: Skywing on November 26, 2003, 11:34 AM
You could improve those by making them naked and fastcall.

Even better, make them __attribute__ ((regparm (1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;) Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually. Same with bswap -- just because it's one instruction, it might not be fast. Finally, I'm certain CupHead's dword swapper is bloated. I've inlined something that has that effect several times and it's never been that long. :)
How would you do that in msvc++ ? I can't find regparm or __attribute__ on msdn.

__attribute__((regparm(1))) ... ?

Those are GCC extensions and are incompatible with VC.

Title: Re:Is there a better way to do this?
Post by: Etheran on November 26, 2003, 06:48 PM

that's what I thought, but is there any way to do this in vc? I'm thinking no and the only way to do it would be to put the value in eax before you make the function call.

int __declspec(naked) myFunction(void);
int ret;

__asm { mov eax, theVal }

ret = myFunction();

or perhaps this generates a compiler error..

myFunction();
__asm { mov ret, eax }

Title: Re:Is there a better way to do this?
Post by: Skywing on November 26, 2003, 06:52 PM

Quote from: Etheran on November 26, 2003, 06:48 PM
that's what I thought, but is there any way to do this in vc?

No.

Title: Re:Is there a better way to do this?
Post by: Kp on November 27, 2003, 12:53 AM

Quote from: Skywing on November 26, 2003, 06:52 PM
Quote from: Etheran on November 26, 2003, 06:48 PM
that's what I thought, but is there any way to do this in vc?
No.

Which is truly unfortunate, because I can see no reason not to use all three call-clobbered registers for parameter passing, if you're going to pass values in registers at all. (There are some circumstances, typically when the parameters are ignored for a while, where it's better not to pass them in registers.)

As an interesting quirk, GCC supports MSVC's _fastcall correctly by creating a two-register pass using ecx,edx; too bad VC can't do the reverse and support GCC's ability to do three-register using eax,edx,ecx. :)
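If the same source has to build with both compilers, one option is to hide the calling convention behind a macro. This is a rough sketch only: REGCALL and byte_swap_word are made-up names, and it assumes a 32-bit x86 target for the register conventions to apply. On any other target the macro expands to nothing and the platform's default convention is used.

```c
#include <stdint.h>

/* REGCALL is a hypothetical macro name for this sketch. */
#if defined(__GNUC__) && defined(__i386__)
#  define REGCALL __attribute__((regparm(3)))   /* GCC: eax, edx, ecx */
#elif defined(_MSC_VER) && defined(_M_IX86)
#  define REGCALL __fastcall                    /* VC: ecx, edx */
#else
#  define REGCALL                               /* default convention */
#endif

/* The argument arrives in a register wherever REGCALL is non-empty. */
uint16_t REGCALL byte_swap_word(uint16_t x)
{
    return (uint16_t)(((x & 0xFFu) << 8) | (x >> 8));
}
```

Callers need to see the same declaration, REGCALL included, so the macro belongs in the shared header alongside the prototype.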

Valhalla Legends Archive

Programming => General Programming => Assembly Language (any cpu) => Topic started by: Paul on November 16, 2003, 02:03 AM