-
Notifications
You must be signed in to change notification settings - Fork 229
[MRG] Added random_states #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Added random_states #35
Conversation
ping @perimosocordiae. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the docs, the object type should be a numpy RandomState.
""" | ||
the random state object to be passed must be a random.seed() object and not a numpy random seed | ||
""" | ||
state = random_state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The random_state
object isn't used in this function, so I'd drop it entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it used because random is used in random.randint
and random.sample
?
I did not mention numpy
on purpose, because while numpy
randomstate isn't being used, a random seed still is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. We should convert those calls to use the numpy random module. Then, the same technique used in the other methods will apply.
""" | ||
the random state object to be passed must be a numpy random seed | ||
""" | ||
state = random_state | ||
n = len(all_labels) | ||
num_ignored = max(0, n - num_preserved) | ||
idx = np.random.randint(n, size=num_ignored) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of passing in a random state is to generate the random numbers we need from it. So in this case, it would look like:
def random_subset(all_labels, num_preserved=np.inf, random_state=np.random):
...
idx = random_state.randint(n, size=num_ignored)
...
We can use the np.random
module as the default value, which means it'll call np.random.randint
by default.
This wouldn't support passing in a raw numeric seed, but that's ok by me: users can just pass in np.random.RandomState(1234)
if they want to use seed 1234
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, cool. I'll add the np.random
as the default in all the other classes as well.
@perimosocordiae could you let me know how to handle random seeds in the Could you also have a look at the other classes to see if it is fine? |
@perimosocordiae , I addressed the comments and added |
chunks = -np.ones_like(self.known_label_idx, dtype=int) | ||
uniq, lookup = np.unique(self.known_labels, return_inverse=True) | ||
all_inds = [set(np.where(lookup==c)[0]) for c in xrange(len(uniq))] | ||
idx = 0 | ||
while idx < num_chunks and all_inds: | ||
c = random.randint(0, len(all_inds)-1) | ||
c = np.random.randint(0, high=len(all_inds)-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good start, but we need to be using random_state
here:
c = random_state.randint(0, high=len(all_inds)-1)
(Same comment for all the other places where we're directly calling functions on np.random
in this module.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course - I feel rather silly now. Pushing the right commit!
@perimosocordiae, again, incorporated suggestions and changes (silly, stupid oversights). |
Looking much better, thanks! The only remaining issue is the |
Ah, I hadn't noticed it. Made the changes. |
Added options to pass a
random_seed
for all the algorithms which use it.Addresses #33, and will help the tests for #26.