ENH: Add regex=True flag to str_contains #5879
can you add a vbench for both of these (with and w/o a regex)? there are no vbenches for strings at ALL! bonus points for adding benches for more string ops (could be another PR if you want)!
also, pls add a release note referring to this as an API change (you can put it in the 0.13.1 bucket)
Would |
@cancan101 no
I added some Benchmarks. Note that I modified |
can you post the run of these benchmarks?
Running
yields
@unutbu grt! hmm.....maybe have a look at the last 3 to see if anything obvious can be done? (esp extract)
Where's the perf difference? all those benchmarks look like 1.0x to me, or is that 4% it?
@y-p: There should be no performance gain shown in the Benchmarks (because the default behavior, |
@unutbu I think that you show these benchmarks, but they are not appearing in the above list; this is a case where you compare across benchmarks
No matter, I tried this here and it does indeed give a nice boost. +1
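For anyone who wants to check the boost without vbench, here is a rough pure-Python sketch of the comparison. It is not the pandas implementation and not one of the benchmarks added in this PR: `regex_search`, `literal_search`, the sample data, and the needle are all made up for illustration. It assumes the non-regex path amounts to a plain substring test.

```python
import re
import timeit

# Hypothetical data: 10,000 short strings and one literal needle.
values = ["abc%d xyz" % i for i in range(10000)]
needle = "99 x"


def regex_search(values, pat):
    """Search each string with a compiled regular expression."""
    compiled = re.compile(pat)
    return [bool(compiled.search(v)) for v in values]


def literal_search(values, pat):
    """Search each string with a plain substring test."""
    return [pat in v for v in values]


# Escape the needle so the regex path looks for the same literal text.
regex_hits = regex_search(values, re.escape(needle))
literal_hits = literal_search(values, needle)

# Timings depend on the machine and the data, so they are only printed.
regex_time = timeit.timeit(
    lambda: regex_search(values, re.escape(needle)), number=5)
literal_time = timeit.timeit(
    lambda: literal_search(values, needle), number=5)
print("regex:   %.4f s" % regex_time)
print("literal: %.4f s" % literal_time)
```

Both paths return the same hits for a literal needle; only the cost differs, and how much depends on the data and the pattern.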
@jreback: I'd be glad to but I don't know how. When I run those Benchmarks appear to be missing. And some others (with no I still don't understand how to use vbench very well. Is there any documentation? |
@unutbu that looks right.... try I just meant to do |
The filtering is on the benchmark name (first arg of |
Using regex=False can be faster when full regex searching is not needed.
See http://stackoverflow.com/q/20951840/190597
TST: Add a test for str_contains with regex=False
BUG: Not all strings.str_* functions return an object with an ndim attribute.
PERF: add benchmarks for every str method
if regex is False but the pattern is actually a regex, should this raise? (not sure if it's easy to determine if it's an actual regex)
any regex is also a valid string, so you can't disambiguate.
Someone might have a list of regex strings and wish to find those that contain a literal string like @jreback: I tried |
a real world example: 'Mr. Smith'. Do I want to match Mr./Mrs smith or an exact match?
yep...ok
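To make the 'Mr. Smith' case concrete, here is a small pure-Python sketch of the two readings. The `contains` helper and the sample names are hypothetical stand-ins, not the pandas code; the helper only mimics the regex / literal switch discussed here, assuming the literal mode is a plain substring test.

```python
import re


def contains(values, pat, regex=True):
    """Hypothetical stand-in for the regex / literal modes discussed here."""
    if regex:
        compiled = re.compile(pat)
        return [bool(compiled.search(v)) for v in values]
    # Literal mode: the pattern is treated as plain text.
    return [pat in v for v in values]


names = ["Mr. Smith", "Mrs Smith", "Mr Smith"]

# As a regex, "." matches any single character, so "Mrs Smith" matches too.
as_regex = contains(names, "Mr. Smith", regex=True)

# As a literal, only the exact text "Mr. Smith" matches.
as_literal = contains(names, "Mr. Smith", regex=False)

print(as_regex)    # [True, True, False]
print(as_literal)  # [True, False, False]
```

The same pattern string is valid in both modes, which is why the caller has to say which one is meant rather than having it guessed.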
@y-p: I think filtering is being done in test_perf.py with
and
Moreover,
prints But I must be missing something because I don't know why some Benchmarks are not being run. Maybe this is a clue: |
That explains why I got different results from you, erase the db and they'll probably vanish. I had no idea wes used the Update: I'm trying to do too many things at once. Of course the first arg is not the name.
ENH: Add regex=True flag to str_contains
@unutbu thanks for the PR....nice work!
Using regex=False can be faster when full regex searching is not needed.
See http://stackoverflow.com/q/20951840/190597
Example use case: