Odd code

Started by iago, February 15, 2004, 03:34 AM


iago

I commented this as best I could figure it out.  Please feel free to point out if I'm wrong.

.text:6FAB0B73                 xor     ecx, ecx ; clear ecx
.text:6FAB0B75                 test    eax, eax ; check the value of eax
.text:6FAB0B77                 setz    cl ; if eax was 0, set the lowest byte of ecx
.text:6FAB0B7A                 mov     eax, ecx ; move the set value back into eax
.text:6FAB0B7C                 pop     esi ; ..
.text:6FAB0B7D                 retn ; return


It seems to me that this entire set of code is the same as just pop esi and ret, since if eax is true, eax becomes true, and if eax is false, eax becomes false.  Am I missing something, or is this just a weird piece of code?
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Adron

Quote from: iago on February 15, 2004, 03:34 AM
... if eax was 0, set the lowest byte of ecx ...

It seems to be
return (bool)!eax;
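
A minimal C sketch of the sequence, for anyone who wants to check Adron's reading. The function name setz_sequence and the parameter eax_in are made up for illustration; the comments map each line back to the disassembly above.

```c
#include <stdint.h>

/* Hypothetical model of the xor/test/setz/mov sequence.
   eax_in stands for whatever EAX held on entry. */
uint32_t setz_sequence(uint32_t eax_in)
{
    uint32_t ecx = 0;                     /* xor ecx, ecx */
    int zf = (eax_in == 0);               /* test eax, eax: ZF is set when eax is 0 */
    ecx = (ecx & ~0xFFu) | (uint32_t)zf;  /* setz cl: low byte becomes 1 or 0 */
    return ecx;                           /* mov eax, ecx */
}
```

Any nonzero input comes back as 0 and zero comes back as 1, so the result is the logical NOT of the input, not a copy of it.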

iago

Quote from: Adron on February 15, 2004, 04:07 AM
Quote from: iago on February 15, 2004, 03:34 AM
... if eax was 0, set the lowest byte of ecx ...

It seems to be
return (bool)!eax;


ah yes, I got it backwards.  Is all that code necessary for that?  I suppose I can't think of a better way to get it down to a single bit that was the opposite..


Adron

Quote from: iago on February 15, 2004, 11:12 AM
ah yes, I got it backwards.  Is all that code necessary for that?  I suppose I can't think of a better way to get it down to a single bit that was the opposite..

Can you think of a small way of coding this in assembler?

int a = <some value>
return (bool)a;


Skywing

  • xor dst, dst | cmp dst, src | adc dst, dst if you want the low bit (bit 0) to be one when src is nonzero
  • xor dst, dst | cmp dst, src | sbb dst, -1 if you want the low bit (bit 0) to be zero when src is nonzero
See example program:
#include <stdio.h>
bool boolize(int n)
{
   bool rv;

   __asm {
      mov      ecx, n
      xor      eax, eax
      cmp      eax, ecx
      adc      eax, eax
      mov      rv, al
   }

   return rv;
}

bool notboolize(int n)
{
   bool rv;

   __asm {
      mov      ecx, n
      xor      eax, eax
      cmp      eax, ecx
      sbb      eax, -1
      mov      rv, al
   }

   return rv;
}

int __cdecl main(int ac, char **av)
{
   printf("%08x %08x %08x %08x\n", boolize(0xf0f0f0f0), boolize(0xf1f1f1f1), boolize(1), boolize(0));
   printf("%08x %08x %08x %08x\n", notboolize(0xf0f0f0f0), notboolize(0xf1f1f1f1), notboolize(1), notboolize(0));
   getc(stdin);
   return 0;
}


Output:
00000001 00000001 00000001 00000000
00000000 00000000 00000000 00000001
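
For readers without a compiler that accepts __asm blocks, here is a rough portable C model of the same two sequences. It assumes only the documented flag behaviour: cmp with dst = 0 computes 0 - src, so the carry flag ends up set exactly when src is nonzero. The names carry_after_cmp_zero, boolize_model and notboolize_model are invented for this sketch.

```c
#include <stdint.h>

/* cmp dst, src with dst == 0 computes 0 - src; CF is the unsigned borrow. */
static uint32_t carry_after_cmp_zero(uint32_t src)
{
    return src != 0;  /* a borrow occurs whenever src is greater than 0 */
}

uint32_t boolize_model(uint32_t src)
{
    uint32_t dst = 0;                         /* xor dst, dst */
    uint32_t cf = carry_after_cmp_zero(src);  /* cmp dst, src */
    return dst + dst + cf;                    /* adc dst, dst: 0 + 0 + CF */
}

uint32_t notboolize_model(uint32_t src)
{
    uint32_t dst = 0;                         /* xor dst, dst */
    uint32_t cf = carry_after_cmp_zero(src);  /* cmp dst, src */
    return dst - 0xFFFFFFFFu - cf;            /* sbb dst, -1: 0 - (-1) - CF = 1 - CF */
}
```

The only difference between the two is the last step: adc yields CF itself, while sbb with -1 yields 1 - CF.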

iago

hmm, I've never seen adc before.  Luckily, I keep an x86 reference book under my laptop!
...reads...
ah, add with carry.  

In your first two lines, what is the initial value of dst?
-- dumb question, you're clearing it.  Stupid me :)

I don't quite understand the adc and sbb lines.. they seem to do pretty much the same thing, but other way around :/


Skywing

Try switching them and seeing what happens - that's probably the easiest way to understand.

Adron

Also try

neg eax; sbb eax, eax; inc eax

and

neg eax; sbb eax, eax; neg eax
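
A small C sketch of what those two sequences work out to, assuming the usual flag rule that neg sets CF when its operand was nonzero. The helper names neg_sbb, neg_sbb_inc and neg_sbb_neg are hypothetical.

```c
#include <stdint.h>

/* neg eax: eax = 0 - eax, CF = (original eax != 0).
   sbb eax, eax: eax = eax - eax - CF, which is 0 or 0xFFFFFFFF. */
static uint32_t neg_sbb(uint32_t eax)
{
    uint32_t cf = (eax != 0);
    eax = 0u - eax;        /* neg eax */
    eax = eax - eax - cf;  /* sbb eax, eax */
    return eax;
}

/* neg; sbb; inc: all-ones + 1 wraps to 0, and 0 + 1 is 1, so this is !eax */
uint32_t neg_sbb_inc(uint32_t eax) { return neg_sbb(eax) + 1u; }

/* neg; sbb; neg: negating all-ones gives 1, and 0 stays 0, so this is (bool)eax */
uint32_t neg_sbb_neg(uint32_t eax) { return 0u - neg_sbb(eax); }
```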

Yoni

I put together an assembly.txt with a few common tricks and things I saw while reverse-engineering. It's not big and there's nothing really special in it, but this is the first thing there:

neg eax
sbb eax, eax
// if(eax != 0) eax = -1;


This is very commonly generated by the MS compiler when you use things like the ?: operator. Example:

res = a ? 17 : 42;

Should do something like:

mov eax, [a]
neg eax
sbb eax, eax

// if(a) eax = -1; else eax = 0;
// remember, if eax is -1 it should soon become 17, and if eax is 0 it should soon become 42

and eax, -25
// if eax is -1, "and eax, anything" sets eax to "anything". (-1 has all bits set)
// if eax is 0, "and eax, anything" sets eax to 0.
// so far: if(a) eax = -25; else eax = 0;

add eax, 42
// the punchline:
// if(a) eax = -25 + 42; else eax = 42;
// where -25 + 42 = 17.

mov [res], eax // obvious

Hey, that came out pretty good :). I'll add it to assembly.txt and for your viewing pleasure I'll upload it to vL.com.

http://yoni.valhallalegends.com/stuff/assembly.txt

Edit: Wrote "or" instead of "and" in the comment.
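
The ?: walkthrough above can be checked in portable C. This is only a sketch of the arithmetic, not the compiler's actual output; select_17_or_42 is an invented name, and unsigned math is used so the wraparound is well defined.

```c
#include <stdint.h>

int32_t select_17_or_42(int32_t a)
{
    /* neg/sbb step: all bits set if a is nonzero, otherwise 0 */
    uint32_t mask = 0u - (uint32_t)(a != 0);
    /* and eax, -25: keeps -25 under the all-ones mask, or leaves 0 */
    uint32_t r = mask & (uint32_t)(17 - 42);
    /* add eax, 42: -25 + 42 = 17, or 0 + 42 = 42 */
    r += 42u;
    return (int32_t)r;
}
```

The constant in the and is always (true value - false value), and the constant in the add is the false value, so the same pattern works for any pair of constants.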

iago

Very nice, Yoni.  I just started something somewhat similar to that, but it's nice to see that.  I agree - those really are the most common things I see in asm that confuse me.

<edit> one thing that I would add is the rep movsd and rep stosd and similar functions.  I always forget what goes where between ecx/edi/esi :(


Adron

Quote from: iago on February 18, 2004, 05:57 PM
<edit> one thing that I would add is the rep movsd and rep stosd and similar functions.  I always forget what goes where between ecx/edi/esi :(

Then how do you remember the names of the registers?

eAx Ackumulator
eBx Base
eCx Count
eDx Data
eSI Source Index
eDI Destination Index
eBP Base Pointer
eSP Stack Pointer
eIP Instruction Pointer
CS Code Segment
DS Data Segment
ES Extra Segment
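
Applied to the rep movsd / rep stosd question, the names line up like this. The C below is a rough model, assuming the direction flag is clear (pointers move upward) and a flat address space; rep_movsd_model and rep_stosd_model are hypothetical names, and the parameters are named after the registers they stand for.

```c
#include <stddef.h>
#include <stdint.h>

/* rep movsd: ECX = dword count, ESI = source, EDI = destination. */
void rep_movsd_model(uint32_t *edi, const uint32_t *esi, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = *esi++;  /* movsd copies one dword and advances both pointers */
        ecx--;            /* rep decrements the count after each iteration */
    }
}

/* rep stosd: ECX = dword count, EAX = value to store, EDI = destination. */
void rep_stosd_model(uint32_t *edi, uint32_t eax, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = eax;     /* stosd stores EAX and advances the destination */
        ecx--;
    }
}
```

So rep movsd behaves like a dword-wise memcpy and rep stosd like a dword-wise memset, with Count, Source Index and Destination Index doing what their names say.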

iago


Rote learning, of course.  *closes eyes* eax ebx ecx edx esi edi esp ebp.  Oh, and cs ds and es, but nobody likes those ones.
"Ackumulator" looks Swedish - you mean "Accumulator".

Besides EIP and ESP and EBP, and besides "REP" instructions, do the names for the registers really mean anything?  I've never seen EAX used as an accumulator, or EBX as a Base or EDX as Data, etc.  Or are the names more arbitrary?


Skywing

Lots more instructions than just those have special registers that they operate on.  For instance, mul, div, cpuid, ...

Adron

Quote from: iago on February 18, 2004, 08:03 PM
Besides EIP and ESP and EBP, and besides "REP" instructions, do the names for the registers really mean anything?  I've never seen EAX used as an accumulator, or EBX as a Base or EDX as Data, etc.  Or are the names more arbitrary?

They used to mean much more.

A snip from a reference:

----------------------
Memory Operands

A memory-operand address consists of two 16-bit components: a segment value and an
offset. The segment value is supplied by a 16-bit segment register either implicitly chosen
by the addressing mode (described below) or explicitly chosen by a segment override prefix
(see "Segment Override Prefix" on page 2-2). The offset, also called the effective address,
is calculated by summing any combination of the following three address elements:

* Displacement—an 8-bit or 16-bit immediate value contained in the instruction
* Base—contents of either the BX or BP base registers
* Index—contents of either the SI or DI index registers

Any carry from the 16-bit addition is ignored. Eight-bit displacements are sign-extended to
16-bit values.

Combinations of the above three address elements define the following six memory
addressing modes (see Table 1-2 for examples).

1. Direct Mode—The operand offset is contained in the instruction as an 8- or 16-bit
displacement element.
2. Register Indirect Mode—The operand offset is in one of the BP, BX, DI, or SI registers.
3. Based Mode—The operand offset is the sum of an 8- or 16-bit displacement and the contents
of a base register (BP or BX).
4. Indexed Mode—The operand offset is the sum of an 8- or 16-bit displacement and the
contents of an index register (DI or SI).
5. Based Indexed Mode—The operand offset is the sum of the contents of a base register
(BP or BX) and an index register (DI or SI).
6. Based Indexed Mode with Displacement—The operand offset is the sum of a base
register's contents, an index register's contents, and an 8-bit or 16-bit displacement.
----------------------

This means that you cannot use [AX] or [DX] to address memory. Since BP is generally reserved for pointing to the stack frame, that leaves you with only BX if you want a base register. And you cannot address using [SI + DI]; if you want to combine registers, you have to use one base register and one index register.
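
To make the offset arithmetic from the excerpt concrete, here is a small C sketch of the 16-bit effective address calculation with an 8-bit displacement. The function name effective_address is invented; base stands for BX or BP and index for SI or DI, as in the excerpt.

```c
#include <stdint.h>

/* 16-bit offset = base + index + displacement.
   base is the content of BX or BP, index the content of SI or DI. */
uint16_t effective_address(uint16_t base, uint16_t index, int8_t disp8)
{
    /* The 8-bit displacement is sign-extended before the addition. */
    uint32_t sum = (uint32_t)base + (uint32_t)index + (uint32_t)(int32_t)disp8;
    /* Any carry out of the 16-bit addition is ignored. */
    return (uint16_t)sum;
}
```

For example, a base of 0xFFFF plus an index of 2 wraps around to offset 0x0001 rather than overflowing into the segment.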




Adron

Some 80x86 instructions that use special registers other than EIP and ESP:

LOOP - CX is the counter
XLAT - AL is translated by the table whose base BX points to
SCAS/STOS - search for or store AL at DI
MUL/DIV - use DX:AX for the 32-bit value
IN/OUT - use DX or an immediate for the port number
INS/OUTS - use DX for the port number and SI/DI for source/destination
ENTER/LEAVE - use BP for the stack frame
ROL/ROR/RCL/RCR/SHL/SHR/SAL/SAR - use CL for the bit shift count

In addition to this, instructions have different lengths - operating on AX or other particular registers might generate a shorter instruction than operating on others.


Note that you don't have to use LODS, MOVS, SCAS, STOS with the REP prefix. They work fine by themselves.
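
As an illustration of the MUL/DIV entry in the list above, here is a hedged C model of the 16-bit forms, where DX:AX holds the 32-bit value. The names mul16_model and div16_model are made up, and the return code stands in for the divide error a real CPU would raise.

```c
#include <stdint.h>

/* 16-bit MUL: DX:AX = AX * operand (unsigned). */
void mul16_model(uint16_t *ax, uint16_t *dx, uint16_t operand)
{
    uint32_t product = (uint32_t)*ax * operand;
    *ax = (uint16_t)(product & 0xFFFFu);  /* low word goes to AX */
    *dx = (uint16_t)(product >> 16);      /* high word goes to DX */
}

/* 16-bit DIV: AX = DX:AX / operand, DX = DX:AX % operand (unsigned).
   Returns 0 where real hardware would raise a divide error, 1 otherwise. */
int div16_model(uint16_t *ax, uint16_t *dx, uint16_t operand)
{
    uint32_t dividend = ((uint32_t)*dx << 16) | *ax;
    if (operand == 0 || dividend / operand > 0xFFFFu)
        return 0;  /* division by zero, or quotient does not fit in AX */
    *ax = (uint16_t)(dividend / operand);
    *dx = (uint16_t)(dividend % operand);
    return 1;
}
```

Neither instruction lets you pick the destination, which is why compilers shuffle values into and out of AX and DX around them.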