
>> argv[1] is "gotten" by reading from esi. You then proceed to poke eax and ecx. TIL that SI, AX and CX were implicitly linked. What specifically is going on here?

> Where do you see this?

Here:

  mov esi, [esp+8]  ; get argv[1]                value written to esi; okay
  xor eax, eax      ; clear the upper bits       \ how is writing to eax
  xor ecx, ecx      ; will hold converted value  / and ecx now relevant?!

> lea just does math, it doesn't dereference anything.

Right; I said "LEA usage"; the instruction itself I understand, but "[ecx+eax-'0']" is throwing me for a loop.



LEA does math, but with a particular structure to the inputs: a pointer plus an index plus a constant. That's meant to be used as a C compiler would, where the pointer would be to a struct array, the index would be the position within the array, and the constant would be the byte offset within the struct.

Calculating a memory address in that way is a common operation, so x86 processors give it special support with hardware that can calculate that sum all in one step.
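To make the pointer + index + constant idea concrete, here is a rough C sketch of the same sum done by hand. The `struct point` type and the `field_address` helper are made up for illustration; the struct is 8 bytes on typical ABIs so that the element size lines up with a scale factor x86 addressing can encode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte struct, used only for this illustration. */
struct point {
    int32_t x;  /* byte offset 0 within the struct */
    int32_t y;  /* byte offset 4 within the struct */
};

/* Address of arr[i].y computed by hand:
   base pointer + index * element size + byte offset of the field. */
static uintptr_t field_address(const struct point *arr, size_t i)
{
    return (uintptr_t)arr
         + i * sizeof(struct point)
         + offsetof(struct point, y);
}
```

The result of `field_address(pts, i)` should equal `(uintptr_t)&pts[i].y`, which is the sum a compiler would normally fold into a single addressing expression.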

Most often the computed address is used in a MOV instruction. LEA provides access to the address-computation hardware without doing the actual data move. That's useful because some math operations can be done faster through that hardware than with general-purpose math instructions. LEA requires that pattern of register + register*scale + constant, where the scale is 1, 2, 4, or 8 (we aren't using the scale here).

    imul ecx, ecx, 10
    lea ecx, [ecx+eax-'0']

Those two instructions multiply ecx by 10 and then add the next digit (stored as ASCII in eax) to it. Subtracting '0' converts that digit from ASCII to its numerical value. It just so happens that LEA can execute the add and conversion fast and together in one step, because that combined operation matches the pattern of computing an address.
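If it helps, the same multiply-then-add step can be written as a small C loop. This is only a sketch: `parse_digits` is a hypothetical name, and it assumes the input is a NUL-terminated string of nothing but '0'..'9' that fits in 32 bits, which may not match how the original loop ends or handles errors.

```c
#include <stdint.h>

/* Convert a string of ASCII decimal digits to an unsigned value.
   Assumes only '0'..'9' in the input; no sign, overflow, or error handling. */
static uint32_t parse_digits(const char *s)
{
    uint32_t value = 0;                     /* plays the role of ecx */
    while (*s != '\0') {
        uint32_t ch = (unsigned char)*s++;  /* plays the role of eax after lodsb */
        value = value * 10;                 /* imul ecx, ecx, 10 */
        value = value + ch - '0';           /* lea ecx, [ecx+eax-'0'] */
    }
    return value;
}
```

For example, `parse_digits("1234")` walks through 1, 12, 123, 1234.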


I didn't know you could do that with lea. That's pretty cool!


Sorry, I'm actively editing my answer as I get a better sense of your questions and I can answer more of them. Please see my edits; also, the specific lines you pointed out are not related to reading in argv[1]. They're just zeroing out eax and ecx: eax because we will only be writing to the lower bits with the lodsb and don't want our arithmetic to be fouled by anything left over in the higher bits, and ecx because we are continuously adding to it in atoi, so it needs to start off at zero.
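A rough C model of why the upper bits matter, in case it's useful. lodsb only writes al, the low 8 bits of eax, so whatever was in the other 24 bits stays there. `write_low_byte` is a hypothetical helper that imitates that partial write on a plain 32-bit integer.

```c
#include <stdint.h>

/* Replace only the low 8 bits of a 32-bit value, the way a write to al
   changes eax while leaving the upper 24 bits untouched. */
static uint32_t write_low_byte(uint32_t reg, uint8_t byte)
{
    return (reg & 0xFFFFFF00u) | byte;
}
```

With the register cleared first, `write_low_byte(0, '7') - '0'` is 7. With leftovers in the upper bits, as in `write_low_byte(0xDEADBEEFu, '7')`, the value is 0xDEADBE37 and the same subtraction no longer gives the digit.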



