-
-
Notifications
You must be signed in to change notification settings - Fork 359
Bubble sort in X86-64 Assembly #368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all I am happy to see that people still write assembly as it really is a forgotten art.
I see a problem with the code though. The way you load an element out of the array technically works, but only in asm and it defies every rule of scoping.
If you were to turn this asm code into equivalent C code you get into trouble.
Especially this part:
mov eax, DWORD [ebp - 20]
lea edx, [0 + eax * 4]
mov eax, DWORD [ebp + 8]
add eax, edx
mov edx, DWORD [eax]
What you basically do here is you grab the pointer which you build previously from the stack. This would be fine as is, but the pointer currently lies within the stackframe of the main
function and thus is technically outside of bubblesort
's scope. This in turns means that bubblesort should not have a reference to the array as it never was given one.
More precisely the code would turn into:
void bubblesort() {
double first_elem = arr[0];
// rest of the code
}
void main() {
double *arr = {1. ,2. ,3.};
bubblesort();
// rest of the code
}
If you want to test this, in the main
function you can i.e. push another value onto the stack before calling bubblesort
. This is bound to break your algorithm, since you assume the pointer lays on a specific location.
Yes it works but only in asm and it's bad coding style in my opinion. (As an analogy you can use the s registers in mips as arguments for a function but you shouldn't)
I would probably put the array in the data section and make them global that way. That is what you usually find in asm lessons.
Alternatively you could also not push the pointer to the first element of the array onto the stack but save it in esi which is arg[0] for a function.
This would translate to the following C versions:
double *arr = {1. ,2. ,3.};
void bubblesort() {
double first_elem = arr[0];
// rest of the code
}
void main() {
bubblesort();
// rest of the code
}
or
void bubblesort(double *arr) {
double first_elem = arr[0];
// rest of the code
}
void main() {
double *arr = {1. ,2. ,3.};
bubblesort(arr);
// rest of the code
}
respectively. And that would be valid C again
(EDIT)
Could it be that this is x86 calling convention? If so I just lost some of my faith in x86. Because I still don't like breaking the scope even if it is possible.
Also for readability reasons I still would change it to one of the ways I mentioned.
jmp .for_check | ||
.for_loop: | ||
mov eax, DWORD [ebp - 16] | ||
lea edx, [0 + eax * 4] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the 0 +
since it is not really needed. This goes for all the following 0 +
as well.
pop esi | ||
leave | ||
ret | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a style thing i would remove one of the newlines here.
This seems to be doing quite a bit of work for a simple bubblesort. This is mainly due to:
I also ended up implementing my own version. It is very far from being optimized but it's probably quite a bit faster than the version in this PR. Even without the calling convention changes, I think the algorithm became a lot more straightforward as well: .intel_syntax noprefix
.section .rodata
array:
.align 16
.int 1, 45, 756, 4569, 56, 3, 8, 5, -10, -4
.equ array_len, (.-array) / 4
array_fmt: .string "%d "
lf: .string "\n"
unsorted: .string "Unsorted array: "
sorted: .string "Sorted array: "
.section .text
.global main
.extern printf
print_array:
push r12
push r13
mov r12, rdi # Loop variable
lea r13, [rdi + 4*rsi] # Pointer after the last element
print_array_loop:
cmp r12, r13 # If we're done iterating over the array then bail
jge print_array_done
mov rdi, OFFSET array_fmt # Otherwise print the current value
mov esi, DWORD PTR [r12]
xor rax, rax
call printf
lea r12, [r12 + 4] # And increment the loop variable pointer
jmp print_array_loop
print_array_done:
mov rdi, OFFSET lf # Print a closing newline
xor rax, rax
call printf
pop r13
pop r12
ret
bubble_sort:
xor rcx, rcx # The outer loop counter
lea rdx, [rdi + 4*rsi - 4] # The end address for the inner loop
outer_loop:
cmp rcx, rsi # We first check if the outer loop is done
jge bubble_sort_done # And if it is, return
mov rax, rdi # Otherwise we initialize the loop variable of the inner loop
inner_loop:
mov r8d, DWORD PTR [rax] # Load array[j] and array[j+1] through a pointer
mov r9d, DWORD PTR [rax + 4]
cmp r8d, r9d # If array[j] <= array[j+1]
jle loop_counters # Then we can skip this iteration
mov DWORD PTR [rax], r9d # Otherwise swap the values
mov DWORD PTR [rax + 4], r8d
loop_counters:
lea rax, [rax + 4] # First, advance the inner loop
cmp rax, rdx
jl inner_loop # And in case it's not done, repeat
inc rcx # If it is done, go back to doing the outer loop
jmp outer_loop
bubble_sort_done:
ret
main:
# Set up our stack
sub rsp, 40
# We load the array in chunks onto the stack
movaps xmm0, XMMWORD PTR [array]
movaps XMMWORD PTR [rsp], xmm0
movaps xmm0, XMMWORD PTR [array + 16]
movaps XMMWORD PTR [rsp + 16], xmm0
mov rax, QWORD PTR [array + 32]
mov QWORD PTR [rsp + 32], rax
# Print the unsorted array
mov rdi, OFFSET unsorted
xor rax, rax
call printf
mov rdi, rsp
mov rsi, array_len
call print_array
# Sort
mov rdi, rsp
mov rsi, array_len
call bubble_sort
# Print the sorted array
mov rdi, OFFSET sorted
xor rax, rax
call printf
mov rdi, rsp
mov rsi, array_len
call print_array
# Restore the stack pointer, set return value to 0
add rsp, 40
xor rax, rax
ret You'll need |
I probably prefer @Gustorn solution a bit more since it is cleaner and is x86-64 and thus has a sane calling convention. Also the bubblesort is indeed more straight forward. Only thing I would probably do is not have the array in rodata and load it on the stack but rather have it in rwdata and modify it there. Saves a bit Another thing I just realized while reading Gastorns solution that comments would've been really nice. Especially because not all people can read asm that easily. Oh and I finally gave in. The |
I prefer the |
I totally agree to that the |
Should we be doing X86-64 assembly instead x86 assembly? |
My vote would be yes, that's what most people would see nowadays anyway (during debugging, etc.) |
I would also say that we should use x86-64 over x86 for basically the same reasons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I'm happy with the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's good enough for Liikt, it's good enough for me.
Thanks for the review!
* Adding x86 assembly to book.json * Adding bubble_sort.asm * Updaing bubble_sort.md * changing x86 to x86-64 * Adding x86-64 bubble sort * Adding bubble_sort.s
No description provided.