• Welcome to Valhalla Legends Archive.
 

Is there a better way to do this?

Started by Paul, November 16, 2003, 02:03 AM


Paul

Say [esi+3C] = 12345678, and [esi+3C] is constantly being updated with a new DWORD value...

My goal is to throw 12345678 into my own static pointer, but in reverse order: 87654321.

This is what I've been bouncing around in my head...

             push eax
             mov al, byte ptr [esi+3Fh]
             mov byte ptr [Pointer], al      ; "part 1", assumed to be the lowest byte
             mov al, byte ptr [esi+3Eh]
             mov byte ptr [Pointer+1], al
             mov al, byte ptr [esi+3Dh]
             mov byte ptr [Pointer+2], al
             mov al, byte ptr [esi+3Ch]
             mov byte ptr [Pointer+3], al
             pop eax
             ret

The result would be the DWORD 87654321 in [Pointer], and it has to keep up with the auto-updating DWORD value in [esi+3C]. So if 12345678 changes to 23456789, my new static pointer becomes 98765432.

If I'm rambling I apologize...

Anyway, my question is: Is there a better way to do this?

Adron

This is how I picture it:


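mov ecx, 8              ; eight nibbles in a DWORD
xor edx, edx            ; edx accumulates the reversed value
mov eax, [esi+3ch]      ; load the source DWORD
shifting:
shl edx, 4              ; make room for the next nibble
mov ebx, eax
shr eax, 4              ; drop the nibble just copied from the source
and ebx, 15             ; isolate that low nibble (ebx is clobbered)
or edx, ebx             ; append it to the result
loop shifting           ; dec ecx, repeat until it reaches zero
mov pointer, edx        ; store the nibble-reversed DWORD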
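
For anyone who reads C more easily than assembly, the loop above can be sketched roughly like this. The function name is made up for illustration, and the register names in the comments only show which variable plays which role; it is not meant as a drop-in replacement for the asm.

```c
#include <stdint.h>

/* Hypothetical C rendering of the nibble-reversing loop. */
static uint32_t reverse_nibbles_loop(uint32_t value)   /* value plays eax */
{
    uint32_t result = 0;                               /* result plays edx */
    int i;

    for (i = 0; i < 8; ++i) {                          /* ecx counts 8 nibbles */
        result <<= 4;                                  /* make room */
        result |= value & 15u;                         /* take the low nibble */
        value >>= 4;                                   /* move on to the next one */
    }
    return result;
}
```

With the numbers from the first post, reverse_nibbles_loop(0x12345678) gives 0x87654321.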


Kp

Why are you seeking to reverse the nibbles as well?  Usually people just want to reverse the ordering at the byte level. :)
[19:20:23] (BotNet) <[vL]Kp> Any idiot can make a bot with CSB, and many do!
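
To make the byte-versus-nibble distinction concrete, here is a minimal C sketch of what a byte-by-byte copy in reverse order produces. The helper name is hypothetical and the code only mimics the idea of the mov sequence in the first post; it is not a translation of anyone's actual project code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the four bytes of value into a new DWORD
   in reverse order, one byte at a time. */
static uint32_t copy_bytes_reversed(uint32_t value)
{
    unsigned char in[4], out[4];
    uint32_t result;

    memcpy(in, &value, sizeof in);
    out[0] = in[3];
    out[1] = in[2];
    out[2] = in[1];
    out[3] = in[0];
    memcpy(&result, out, sizeof result);
    return result;
}
```

For 0x12345678 this yields 0x78563412, not 0x87654321: whole bytes change places, but the two hex digits inside each byte keep their order.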

Skywing

Quote from: Paul on November 16, 2003, 02:03 AM
Anyway, my question is: Is there a better way to do this?

Use the bswap instruction - available on i486 and higher.

Kp

Quote from: Skywing on November 16, 2003, 10:58 AM
Use the bswap instruction - available on i486 and higher.
I started to post the same thing, then noticed that he wants swapping on a per-nibble basis, not per-byte like bswap does.  Hence my query to him.  Of course, the code he provided us as an example is wrong if he really does want a per-nibble swap. :)

Adron

Well, I was assuming that his explanation was correct and the code possibly wrong since he wanted help with the code...

I was looking for the bswap, but it wasn't listed in 386intel.txt.....


edit:

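mov eax, [esi+3ch]      ; load the source DWORD
bswap eax               ; reverse the byte order (486 and up)
mov ebx, eax
and eax, 0f0f0f0fh      ; eax = low nibble of each byte
and ebx, 0f0f0f0f0h     ; ebx = high nibble of each byte
shl eax, 4              ; low nibbles move up
shr ebx, 4              ; high nibbles move down
or eax, ebx             ; recombine: nibbles swapped inside every byte
mov pointer, eax        ; store the nibble-reversed DWORD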
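
The same two-step idea (swap the bytes, then swap the nibbles inside each byte with masks) can be sketched in C as below. Both function names are invented for this example, and the shift-based byte swap merely stands in for the bswap instruction.

```c
#include <stdint.h>

/* Stand-in for bswap: reverse the four bytes of v. */
static uint32_t swap_bytes32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical helper: byte swap followed by a per-byte nibble swap. */
static uint32_t reverse_nibbles_masked(uint32_t value)
{
    uint32_t v    = swap_bytes32(value);
    uint32_t low  = (v & 0x0F0F0F0Fu) << 4;   /* low nibbles move up */
    uint32_t high = (v & 0xF0F0F0F0u) >> 4;   /* high nibbles move down */

    return low | high;
}
```

Compared with the eight-iteration loop, this version has no branch and a fixed number of operations, which is presumably why the edit replaced the loop.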

Skywing

Quote from: Adron on November 16, 2003, 11:04 AM
Well, I was assuming that his explanation was correct and the code possibly wrong since he wanted help with the code...
Now he knows how to accomplish either of his listed goals.  So, we should have everything covered? :p

Paul

Excellent, when I get off work I'll be able to test/compile this code into my project. Thanks much!

Skywing

Quote from: Paul on November 16, 2003, 07:09 PM
Excellent, when I get off work I'll be able to test/compile this code into my project. Thanks much!
So, which did you want to do, anyway?


CupHead

Along similar lines, I was talking to Sky this morning about byte swapping for other-endian protocols...  So he gave me the function:


unsigned short bswap(unsigned short u) { return ((u & 0xff) << 8) | (u >> 8); }


This is all well and good, but I decided to write it in ASM for the hell of it and got:


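WORD ByteSwapWORD( WORD x )
{
   __asm
   {
      mov      ax, x      // ax = copy of x
      and      ax, 0xff   // keep only the low byte
      shl      ax, 8      // move it up into the high byte
      shr      x, 8       // move x's high byte down into the low byte
      or       x, ax      // combine the two halves
   }

   return x;
}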


Which is all well and good until you get to DWORDs.  Damned if I could figure out how to do the swapping in C, so I went with ASM again, this time coming up with:


DWORD ByteSwapDWORD( DWORD x )
{
   __asm
   {
      push   edx
      push   ebx

      mov      edx, x      // 01 02 03 04
      
      mov      ax, dx      // dx = 03 04
      and      ax, 0xff   // ax = 03 04 -> 0000 0011 0000 0100 -> 0000 0000 0000 0100
      shl      ax, 8      // ax = 0000 0100 0000 0000
      shr      dx, 8      // dx = 0000 0000 0000 0011
      or      dx, ax      // dx = 0000 0100 0000 0011 -> 04 03

      shl      edx, 16      // edx = 0000 0100 0000 0011 0000 0000 0000 0000 -> 04 03 00 00

      xor      ebx, ebx
      mov      eax, x      // eax = 01 02 03 04
      shr      eax, 16      // eax = 00 00 01 02 -> 0000 0000 0000 0000 0000 0001 0000 0010
      mov      bx, ax
      and      ax, 0xff   // ax = 0000 0000 0000 0010
      shl      ax, 8      // ax = 0000 0010 0000 0000
      shr      bx, 8      // bx = 0000 0000 0000 0001
      or      bx, ax      // bx = 0000 0010 0000 0001 -> 02 01
   
      or      edx, ebx   // edx = 0000 0100 0000 0011 0000 0010 0000 0001 -> 04 03 02 01
      mov      x, edx

      pop      ebx
      pop      edx
   }

   return x;
}
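
For what it's worth, the DWORD swap can also be written in plain C by reusing the 16-bit swap on each half and then exchanging the halves, which is the same progression the asm comments walk through. This is only a sketch; the function names and the use of the stdint.h types are choices made for this example, not part of the original code.

```c
#include <stdint.h>

/* 16-bit swap, same expression as the bswap() function quoted earlier. */
static uint16_t byte_swap16(uint16_t u)
{
    return (uint16_t)(((u & 0xFFu) << 8) | (u >> 8));
}

/* Hypothetical 32-bit swap built from two 16-bit swaps. */
static uint32_t byte_swap32(uint32_t x)
{
    uint32_t low  = byte_swap16((uint16_t)(x & 0xFFFFu)); /* 03 04 -> 04 03 */
    uint32_t high = byte_swap16((uint16_t)(x >> 16));     /* 01 02 -> 02 01 */

    return (low << 16) | high;                            /* 04 03 02 01 */
}
```

So byte_swap32(0x01020304) comes out as 0x04030201, matching the last comment in the asm version.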


As you can see from the comments, I was having lots of fun working out the binary for the instructions and stuff.  Anyway, after finishing this behemoth of a function, I seemed to remember something like this on the forums and whipped out my handy (and free) IA-32 Architecture Software Developer's Manual Volume 2: Instruction Set Reference.  I looked up bswap and to my amazement, it did the whole DWORD thing in just one instruction.  Well, damn, that was a lot of wasted effort.  Then it said see xchg for 16-bit numbers and there was a single instruction that did it for words.  *sigh*  Final code looks like:


WORD ByteSwapWORD( WORD x )
{
   __asm
   {
      mov      ax, x
      xchg   ah, al
      mov      x, ax
   }

   return x;
}

DWORD ByteSwapDWORD( DWORD x )
{
   __asm
   {
      mov      eax, x
      bswap   eax
      mov      x, eax
   }

   return x;
}


Anyway, just thought I'd vent some frustration.  :P

Skywing

You could improve those by making them naked and fastcall.

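__declspec(naked) unsigned short __fastcall ByteSwapWORD(unsigned short)
{
__asm {
 xchg cl, ch
 mov ax, cx
 ret        ; naked functions get no epilogue, so return explicitly
}
}

__declspec(naked) unsigned long __fastcall ByteSwapDWORD(unsigned long)
{
__asm {
 bswap ecx
 mov eax, ecx
 ret        ; the single fastcall argument is in ecx, nothing to pop
}
}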

Kp

Quote from: Skywing on November 26, 2003, 11:34 AM
You could improve those by making them naked and fastcall.

Even better, make them __attribute__((regparm(1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;)  Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually.  Same with bswap -- just because it's one instruction, it might not be fast.  Finally, I'm certain CupHead's dword swapper is bloated.  I've inlined something that has that effect several times and it's never been that long. :)
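
A rough sketch of what that could look like with GCC follows. regparm is a GCC extension for 32-bit x86, so the REGPARM1 macro (a name invented here) expands to nothing on other compilers and targets; the function name is likewise hypothetical, and the body is plain C rather than inline asm so the example stays portable.

```c
#include <stdint.h>

/* Assumption: only GCC-compatible compilers targeting 32-bit x86
   understand regparm; everywhere else the macro is empty. */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM1 __attribute__((regparm(1)))
#else
#define REGPARM1
#endif

/* With regparm(1) the argument arrives in eax instead of on the stack. */
REGPARM1 uint32_t byte_swap_regparm(uint32_t x);

REGPARM1 uint32_t byte_swap_regparm(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}
```

Whether this beats the fastcall version in practice would have to be measured, as the post says.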

CupHead

I'm sure it's bloated too, probably because I used an instruction for each step (and you can see the progression).  Obviously there would be faster ways like just swapping the inner bytes and then the outer bytes, but what you see is what I've got.

Etheran

Quote from: Kp on November 26, 2003, 04:23 PM
Quote from: Skywing on November 26, 2003, 11:34 AM
You could improve those by making them naked and fastcall.

Even better, make them __attribute__((regparm(1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;)  Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually.  Same with bswap -- just because it's one instruction, it might not be fast.  Finally, I'm certain CupHead's dword swapper is bloated.  I've inlined something that has that effect several times and it's never been that long. :)
How would you do that in MSVC++?  I can't find regparm or __attribute__ on MSDN.

__attribute__((regparm(1))) ... ?
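
As far as I know, __attribute__ and regparm are GCC extensions that MSVC does not accept, so __fastcall is the nearest thing there. A different route, sketched below, is to skip the calling-convention question and use the compilers' byte-swap intrinsics. The wrapper name is hypothetical; _byteswap_ulong (MSVC, declared in stdlib.h) and __builtin_bswap32 (GCC) are the intrinsics assumed to be available, and older compiler versions may lack them.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper that picks a byte-swap intrinsic when one is known. */
static uint32_t byte_swap32_portable(uint32_t x)
{
#if defined(_MSC_VER)
    return _byteswap_ulong(x);          /* MSVC intrinsic */
#elif defined(__GNUC__)
    return __builtin_bswap32(x);        /* GCC builtin */
#else
    /* Fallback: plain shifts and masks. */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}
```

Because the compiler sees through the intrinsic, it can inline the swap at the call site, which avoids the argument-passing overhead altogether.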