Skip to content

ext/pgsql: convert regex match, using pcre jit if available and enabled. #15039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

devnexen
Copy link
Member

No description provided.

@nielsdos
Copy link
Member

Technically this does seem right, but I wonder about the performance: is the extra effort spent JIT compiling worth it for performing a single match? It seems that the regexes can be quite complex, so it's a bit of a shame that they have to be recompiled each time (already true today, but even more true when JITting).
I think it would be good to cache the compiled result (ext/pcre does caching to avoid recompilation).

@devnexen
Copy link
Member Author

ok will give it a try

@devnexen devnexen force-pushed the pgsql_match_usejit branch from 9fa7735 to e45f407 Compare July 20, 2024 19:51
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't yet check whether the options are translated correctly to the regex strings. But here's a first round of review.

zend_hash_destroy(&PGG(connections));

for (i = 0; i < 11; i ++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can merge the assignment and declaration of i, we're in C99.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also use a sizeof expression instead of the hardcoded 11? Or alternatively, use a define instead of 11 for both the declaration of the array and this loop?

php_error_docref(NULL, E_WARNING, "Cannot allocate match data");
return FAILURE;
}
res = pcre2_match(re, (PCRE2_SPTR)ZSTR_VAL(str), ZSTR_LEN(str), 0, 0, match_data, php_pcre_mctx());
centry->refcount ++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use php_pcre_pce_incref.

php_pcre_free_match_data(match_data);
pcre2_code_free(re);
centry->refcount --;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use php_pcre_pce_decref.

php_error_docref(NULL, E_WARNING, "Cannot allocate match data");
return FAILURE;
}
res = pcre2_match(re, (PCRE2_SPTR)ZSTR_VAL(str), ZSTR_LEN(str), 0, 0, match_data, php_pcre_mctx());
centry->refcount ++;
res = pcre2_match(centry->re, (PCRE2_SPTR)ZSTR_VAL(str), ZSTR_LEN(str), 0, 0, match_data, php_pcre_mctx());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use php_pcre_pce_re to obtain re.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool I did not notice them.

return FAILURE;
}

match_data = php_pcre_create_match_data(0, re);
match_data = php_pcre_create_match_data(0, centry->re);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use php_pcre_pce_re to obtain re.

@@ -147,6 +147,17 @@ ZEND_TSRMLS_CACHE_DEFINE()
ZEND_GET_MODULE(pgsql)
#endif

struct _pcre_cache_entry {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't import internals from another extension into this extension. I added some suggestions on how to avoid this.

memset(pgsql_globals, 0, sizeof(zend_pgsql_globals));
zend_hash_init(&pgsql_globals->connections, 0, NULL, NULL, 1);

#define ADD_REGEX(reg) pgsql_globals->regexes[i ++] = zend_string_init(reg, strlen(reg), true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a bounds check assertion please?

@devnexen devnexen force-pushed the pgsql_match_usejit branch from e45f407 to 985b227 Compare July 20, 2024 21:04
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is right, thanks!

@devnexen devnexen closed this in ba54ceb Jul 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants