Govind Parmar Govind Parmar - 1 month ago 11
C Question

Palindrome function always reporting offset #1 for error

I'm writing an x86 assembly function that determines whether a string is a palindrome or not (except for the null-terminator).

This function is meant to return 0 if the strings are palindromes, and if the strings are not palindromes, it will return the comparison that failed (i.e. the index of the character on the left half of the string that didn't match).

While it is successfully detecting which strings are and are not palindromes, it does always reports

1
as the index for a failed palindrome test, regardless of where it actually failed.

Assembly code:

.386
.MODEL FLAT, C
.CODE
; Determines whether or not a given string is a palindrome
; Uses:
; ECX - pointer to start of string (incremented till halfway)
; EDX - pointer to end of string (decremented till halfway)
; AL - dereference character from ECX for comparison
; BL - dereference character from EDX for comparison
; ESI - index where comparison failed in case strings are not palindromes
; Arguments:
; [ESP+4] - pointer to string to test
; [ESP+8] - length of string
; Returns:
; 0 = string is a palindrome
; > 0 = string is not a palindrome; return value is the # comparison that failed (e.g. AABAAA would return 3)
; C prototype: int __cdecl palin(char *str, int len);
palin PROC
push ebx
push esi
; Load ECX with a pointer to the first character in the string
mov ecx, dword ptr [esp+12]
; Copy the pointer into EDX then add the length so EDX points to the end of the string
mov edx, ecx
add edx, dword ptr [esp+16]
xor esi, esi
loop0:
; Begin loop with decrement of EDX to skip the null terminator
dec edx
inc esi
mov al, byte ptr [ecx]
mov bl, byte ptr [edx]
cmp al, bl
; Comparison fail = strings cannot be palindromes
jnz not_palindrome
inc ecx
; If start ptr >= end ptr we are done, else keep looping
cmp ecx, edx
jl loop0
; Return 0 = success; string is a palindrome
xor eax, eax
jmp end_palin
not_palindrome:
; Return > 0 = fail; string is not a palindrome
mov eax, esi
end_palin:
pop esi
pop ebx
ret
palin ENDP
END


C driver for assembly function:

#include <stdio.h>
#include <string.h>

int __cdecl palin(char *str, int len);

int __cdecl main(int argc, char *argv[])
{
int ret;
if(argc<2)
{
printf("Usage: pal word");
return 0;
}
if(ret = (palin(argv[1], strlen(argv[1])) > 0))
{
printf("%s is not a palindrome; first comparison that failed was #%d\n", argv[1], ret);
}
else
{
printf("%s is a palindrome\n", argv[1]);
}
return 0;
}


Sample output:

C:\Temp>pal ABCDEEDCBA
ABCDEEDCBA is a palindrome

C:\Temp>pal ABCDEDCBA
ABCDEDCBA is a palindrome

C:\Temp>pal AABAAA
AABAAA is not a palindrome; first comparison that failed was #1


The last line should return 3 instead of 1 - does anyone know what is going on here?

Answer

There are few bugs in your code... The one you are looking for is here:

if(ret = (palin(argv[1], strlen(argv[1])) > 0))

This should emit warning in good C/C++ compiler, I think, what are you using? Do you use -Wall -Wextra (these are for gcc or clang, for other compiler you should check it's documentation).

It's doing ret = (res > 0), and (res > 0) is boolean expression, so it's 0 or 1.

You probably wanted if ((ret = palin(argv[1], strlen(argv[1]))) > 0), and this shows why it's sometimes better to KISS and split these things into two lines.


Other bug:

jl loop0: should be jb. ecx and edx are memory pointers, thus unsigned. If your data would be allocated on 0x80000000 boundary, then jl would fail at first cmp.

And you can simplify the exit logic:

    ; Return 0 = success; string is a palindrome
        xor esi, esi    ; fake "esi" index = 0, reusing "not palindrome" exit code fully
    not_palindrome: 
    ; Return > 0 = fail; string is not a palindrome
        mov eax, esi
        pop esi
        pop ebx
        ret

And final style nitpick: jnz not_palindrome => I would use jne alias for this one, as you are comparing two chars for equality, not for "zero" (it's the same instruction, just different alias, I tend to use both, trying to use the more appropriate to follow my "human" description of functionality).

Also you can do cmp al,[edx] without loading the second char into bl (saving 1 more instruction and not clobbering ebx, so you don't need to push/pop ebx then, saving 2 more).

If you insist on loading the second character into register just for the "easy to read" code, you can still use ah for second char, removing that ebx completely from the code.