Skip to content

Bubble sort in X86-64 Assembly #368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Oct 5, 2018

Conversation

Gathros
Copy link
Contributor

@Gathros Gathros commented Sep 6, 2018

No description provided.

@Gathros Gathros added the Implementation This provides an implementation for an algorithm. (Code and maybe md files are edited.) label Sep 6, 2018
Copy link

@Liikt Liikt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all I am happy to see that people still write assembly as it really is a forgotten art.

I see a problem with the code though. The way you load an element out of the array technically works, but only in asm and it defies every rule of scoping.

If you were to turn this asm code into equivalent C code you get into trouble.

Especially this part:

mov eax, DWORD [ebp - 20]
lea edx, [0 + eax * 4]
mov eax, DWORD [ebp + 8]
add eax, edx
mov edx, DWORD [eax]

What you basically do here is you grab the pointer which you build previously from the stack. This would be fine as is, but the pointer currently lies within the stackframe of the main function and thus is technically outside of bubblesort's scope. This in turns means that bubblesort should not have a reference to the array as it never was given one.

More precisely the code would turn into:

void bubblesort() {
    double first_elem = arr[0];
    // rest of the code
}

void main() {
    double *arr = {1. ,2. ,3.};
    bubblesort();
    // rest of the code
}

If you want to test this, in the main function you can i.e. push another value onto the stack before calling bubblesort. This is bound to break your algorithm, since you assume the pointer lays on a specific location.

Yes it works but only in asm and it's bad coding style in my opinion. (As an analogy you can use the s registers in mips as arguments for a function but you shouldn't)

I would probably put the array in the data section and make them global that way. That is what you usually find in asm lessons.
Alternatively you could also not push the pointer to the first element of the array onto the stack but save it in esi which is arg[0] for a function.

This would translate to the following C versions:

double *arr = {1. ,2. ,3.};

void bubblesort() {
    double first_elem = arr[0];
    // rest of the code
}

void main() {
    bubblesort();
    // rest of the code
}

or

void bubblesort(double *arr) {
    double first_elem = arr[0];
    // rest of the code
}

void main() {
    double *arr = {1. ,2. ,3.};
    bubblesort(arr);
    // rest of the code
}

respectively. And that would be valid C again

(EDIT)
Could it be that this is x86 calling convention? If so I just lost some of my faith in x86. Because I still don't like breaking the scope even if it is possible.

Also for readability reasons I still would change it to one of the ways I mentioned.

jmp .for_check
.for_loop:
mov eax, DWORD [ebp - 16]
lea edx, [0 + eax * 4]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the 0 + since it is not really needed. This goes for all the following 0 + as well.

pop esi
leave
ret

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a style thing i would remove one of the newlines here.

@zsparal
Copy link
Contributor

zsparal commented Sep 28, 2018

This seems to be doing quite a bit of work for a simple bubblesort. This is mainly due to:

  1. cdecl bookkeeping: this calling convention requires a lot of fiddling to store/restore registers and get parameters on the stack. I would just suggest going with the System V ABI, it usually results in much less noise.
  2. The algorithm itself: the implementation seems overly complicated, I found it kind of hard to read. I think it's way too much code for a simple bubble sort

I also ended up implementing my own version. It is very far from being optimized but it's probably quite a bit faster than the version in this PR. Even without the calling convention changes, I think the algorithm became a lot more straightforward as well:

.intel_syntax noprefix

.section .rodata
  array:
    .align  16
    .int    1, 45, 756, 4569, 56, 3, 8, 5, -10, -4 
    .equ    array_len, (.-array) / 4
  array_fmt: .string "%d "
  lf:        .string "\n"
  unsorted:  .string "Unsorted array: "
  sorted:    .string "Sorted array: "

.section .text
  .global main
  .extern printf

print_array:
  push   r12
  push   r13

  mov    r12, rdi                 # Loop variable
  lea    r13, [rdi + 4*rsi]       # Pointer after the last element

print_array_loop:
  cmp    r12, r13                 # If we're done iterating over the array then bail
  jge    print_array_done
  mov    rdi, OFFSET array_fmt    # Otherwise print the current value
  mov    esi, DWORD PTR [r12]
  xor    rax, rax
  call   printf
  lea    r12, [r12 + 4]           # And increment the loop variable pointer
  jmp    print_array_loop

print_array_done:
  mov    rdi, OFFSET lf           # Print a closing newline
  xor    rax, rax
  call   printf

  pop    r13
  pop    r12
  ret

bubble_sort:
  xor    rcx, rcx                 # The outer loop counter
  lea    rdx, [rdi + 4*rsi - 4]   # The end address for the inner loop

outer_loop:
  cmp    rcx, rsi                 # We first check if the outer loop is done
  jge    bubble_sort_done         # And if it is, return
  mov    rax, rdi                 # Otherwise we initialize the loop variable of the inner loop
inner_loop:
  mov    r8d, DWORD PTR [rax]     # Load array[j] and array[j+1] through a pointer
  mov    r9d, DWORD PTR [rax + 4]
  cmp    r8d, r9d                 # If array[j] <= array[j+1]
  jle    loop_counters            # Then we can skip this iteration
  mov    DWORD PTR [rax], r9d     # Otherwise swap the values
  mov    DWORD PTR [rax + 4], r8d
loop_counters:
  lea    rax, [rax + 4]           # First, advance the inner loop
  cmp    rax, rdx
  jl     inner_loop               # And in case it's not done, repeat
  inc    rcx                      # If it is done, go back to doing the outer loop
  jmp    outer_loop

bubble_sort_done:
  ret

main:
  # Set up our stack
  sub    rsp, 40

  # We load the array in chunks onto the stack
  movaps xmm0, XMMWORD PTR [array]
  movaps XMMWORD PTR [rsp], xmm0
  movaps xmm0, XMMWORD PTR [array + 16]
  movaps XMMWORD PTR [rsp + 16], xmm0
  mov    rax, QWORD PTR [array + 32]
  mov    QWORD PTR [rsp + 32], rax

  # Print the unsorted array
  mov    rdi, OFFSET unsorted
  xor    rax, rax
  call   printf

  mov    rdi, rsp
  mov    rsi, array_len
  call   print_array

  # Sort
  mov    rdi, rsp
  mov    rsi, array_len
  call   bubble_sort

  # Print the sorted array
  mov    rdi, OFFSET sorted
  xor    rax, rax
  call   printf

  mov    rdi, rsp
  mov    rsi, array_len
  call   print_array

  # Restore the stack pointer, set return value to 0
  add    rsp, 40
  xor    rax, rax
  ret

You'll need gcc for this one, assuming you called the file main.s: gcc -no-pie main.s -o main && ./main

@Liikt
Copy link

Liikt commented Sep 29, 2018

I probably prefer @Gustorn solution a bit more since it is cleaner and is x86-64 and thus has a sane calling convention. Also the bubblesort is indeed more straight forward.

Only thing I would probably do is not have the array in rodata and load it on the stack but rather have it in rwdata and modify it there. Saves a bit xmm register foo.

Another thing I just realized while reading Gastorns solution that comments would've been really nice. Especially because not all people can read asm that easily.

Oh and I finally gave in. The [ebp+8] problem is really x86 calling convention. Which is rather stupid. And I can finally say x86 <<< x86-64

@zsparal
Copy link
Contributor

zsparal commented Sep 29, 2018

I prefer the .rodata solution because it's conceptually more similar to things you're used to from other programming languages (it also mirrors the C solution). But I can see the argument for either version.

@Liikt
Copy link

Liikt commented Sep 29, 2018

I totally agree to that the .rodata solution mirrors C better. You just tend to see the .rwdata in classes that teach assembly as it is generally easier to understand. Though that is more a minute thing and I'm happy with either way.

@Gathros
Copy link
Contributor Author

Gathros commented Sep 30, 2018

Should we be doing X86-64 assembly instead x86 assembly?

@zsparal
Copy link
Contributor

zsparal commented Sep 30, 2018

My vote would be yes, that's what most people would see nowadays anyway (during debugging, etc.)

@Liikt
Copy link

Liikt commented Oct 1, 2018

I would also say that we should use x86-64 over x86 for basically the same reasons.

@Gathros Gathros changed the title Bubble sort in X86 Assembly Bubble sort in X86-64 Assembly Oct 1, 2018
Copy link

@Liikt Liikt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I'm happy with the changes.

Copy link
Member

@leios leios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's good enough for Liikt, it's good enough for me.

Thanks for the review!

@leios leios merged commit da95ae7 into algorithm-archivists:master Oct 5, 2018
Gathros added a commit that referenced this pull request Oct 5, 2018
@Gathros Gathros deleted the bubble_sort_asm branch October 5, 2018 10:23
kenpower pushed a commit to kenpower/algorithm-archive that referenced this pull request Oct 22, 2018
* Adding x86 assembly to book.json

* Adding bubble_sort.asm

* Updaing bubble_sort.md

* changing x86 to x86-64

* Adding x86-64 bubble sort

* Adding bubble_sort.s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Implementation This provides an implementation for an algorithm. (Code and maybe md files are edited.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants