Skip to content

Add Python Implementation of Huffman Encoding #98

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 1, 2018
Merged

Add Python Implementation of Huffman Encoding #98

merged 3 commits into from
May 1, 2018

Conversation

foldsters
Copy link
Contributor

No description provided.

@june128 june128 added the Implementation This provides an implementation for an algorithm. (Code and maybe md files are edited.) label Apr 30, 2018
Copy link
Contributor

@Butt4cak3 Butt4cak3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @foldsters! Thank you for contributing!

The code works fine in Python 2 and 3, but there are a few things about it that I think should be changed. Most of them are related to code style.

The code also doesn't show up in the chapter automatically. It has to be imported manually. You can either leave that out and let someone else do it for you, or you can go into "huffman.md", scroll to the bottom and add this before the {% endmethod %} line:

{% sample lang="py" %}
### Python
[import, lang:"python"](code/python/huffman.py)

trees.append((new_tree,new_weight))

# sort the trees list by weight
trees = sorted(trees, key=lambda n: n[1], reverse=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to sort the entire trees list after each iteration. I know that it will always be pretty small, but I think it would be nicer here to find the right place in the list and use list.insert() instead of list.append() and list.sort() here.

# Find the first tree that has a weight smaller than new_weight and returns its index in the list
# If no such tree can be found, use len(trees) instead to append
index = next((i for i, tree in enumerate(trees) if tree[1] < new_weight), len(trees))

# Insert the new tree there
trees.insert(index, (new_tree, new_weight))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of doing an insert, but I thought it would be a little harder to explain and detract from the point of the code. I'll replace it with your code (thanks!) and do the more efficient option from now on.

# encodes the message
def encode(mapping,message):

encoding = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use double quotes here and in a few other places as well, while you used single quotes in others. You should stick to one or the other and since single quotes are more common in Python and because other code examples in the AAA already use them, I recommend you change all your double quotes to single quotes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, I didn't even notice that I did that! Fixing that up now.

return tree

# constructs the mapping with recursion
def build_mapping(tree,code=''):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to not like spaces between comma-separated identifiers. To stay consistent with other code examples and code outside the AAA you should probably put spaces between function parameters, list items, etc.

Copy link
Contributor Author

@foldsters foldsters May 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is personal preference, because I usually use white space to indicate order of operations, so ((v,k) for k,v in mapping) tells me that k and v are on the same step, while (((v, k) for k, v in mapping) looks like (v, k) for k, is one step and v in mapping is another, but I see where you're coming from and I'll use sentence syntax from now on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to mention that this goes for pretty much all operators, too (a + b instead of a+b).

I can see how it makes sense in your example and maybe it's okay to omit the space in some cases if it really improves readability. But we generally like spaces here. :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've never programmed in an environment that other people needed to look at my code, so I appreciate the pointers!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good. Code review is weird because I'm always afraid of sounding like "YOU'RE DOING IT WRONG! YOU SHOULD DO IT LIKE ME AND YOUR CODE IS BAD!" but we seem to be on the same page!

# encodes the message
def encode(mapping,message):

encoding = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name of this confused me for a second. Maybe code or encoded are better names here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'll make the variables more descriptive

# constructs the tree
def build_tree(message):

# get sorted list of character,frequency pairs
Copy link
Contributor

@Butt4cak3 Butt4cak3 May 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm super nitpicky here because this is a comment, but commas are usually followed by spaces.

I mentioned the spaces after commas in another comment already. Oops!

@foldsters
Copy link
Contributor Author

Okay so I've made the edits, and I think it has updated. Still finding my way around github.

@Butt4cak3
Copy link
Contributor

Looks pretty good. I'd merge it, but I can't. Guess there's still something to figure out with the permissions.

@Butt4cak3 Butt4cak3 merged commit 091b3c4 into algorithm-archivists:master May 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Implementation This provides an implementation for an algorithm. (Code and maybe md files are edited.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants