Skip to content

Optimize observers #13649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Jun 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 10 additions & 2 deletions Zend/zend_builtin_functions.c
Original file line number Diff line number Diff line change
Expand Up @@ -1776,7 +1776,7 @@ ZEND_FUNCTION(debug_print_backtrace)

ZEND_API void zend_fetch_debug_backtrace(zval *return_value, int skip_last, int options, int limit) /* {{{ */
{
zend_execute_data *call;
zend_execute_data *call, *last_call = NULL;
zend_object *object;
bool fake_frame = 0;
int lineno, frameno = 0;
Expand Down Expand Up @@ -1821,6 +1821,7 @@ ZEND_API void zend_fetch_debug_backtrace(zval *return_value, int skip_last, int

if (skip_last) {
/* skip debug_backtrace() */
last_call = call;
call = call->prev_execute_data;
}

Expand Down Expand Up @@ -1857,9 +1858,15 @@ ZEND_API void zend_fetch_debug_backtrace(zval *return_value, int skip_last, int
zval *arg = zend_get_zval_ptr(op_data, op_data->op1_type, &op_data->op1, call);
if (Z_TYPE_P(arg) == IS_UNDEF) goto not_frameless_call;
}
zend_function *func = ZEND_FLF_FUNC(opline);
/* Assume frameless functions are not recursive with themselves.
* This condition may be true when observers are enabled:
* Observers will put a call frame on top of the frameless opcode. */
if (last_call && last_call->func == func) {
goto not_frameless_call;
}
stack_frame = zend_new_array(8);
zend_hash_real_init_mixed(stack_frame);
zend_function *func = ZEND_FLF_FUNC(opline);
zend_string *name = func->common.function_name;
ZVAL_STRINGL(&tmp, ZSTR_VAL(name), ZSTR_LEN(name));
_zend_hash_append_ex(stack_frame, ZSTR_KNOWN(ZEND_STR_FUNCTION), &tmp, 1);
Expand Down Expand Up @@ -2068,6 +2075,7 @@ ZEND_API void zend_fetch_debug_backtrace(zval *return_value, int skip_last, int
} else {
fake_frame = 0;
include_filename = filename;
last_call = call;
call = prev;
}
}
Expand Down
2 changes: 1 addition & 1 deletion Zend/zend_compile.c
Original file line number Diff line number Diff line change
Expand Up @@ -4575,7 +4575,7 @@ static uint32_t find_frameless_function_offset(uint32_t arity, void *handler)

static const zend_frameless_function_info *find_frameless_function_info(zend_ast_list *args, zend_function *fbc, uint32_t type)
{
if (ZEND_OBSERVER_ENABLED || zend_execute_internal) {
if (zend_execute_internal) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had agreed to fallback using ICALL instead of FRAMELESS in case of OBSERVER.

The actual speed-up of FRAMELESS calls wasn't very significant and the increase of the complexity already made questions. I expect, any observer makes much more significant slowdown and I don't see reasons in attempt
to win 0.1% of performance when we already losing 10%.

This patch looks much more complex...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done the proper benchmarking homework now - frameless makes about 1% in wall time for wordpress, with observers.
And yes, we were losing 5% (wall time). Now we're losing 2.5%.
I would agree on 0.1% speedup not making sense for 10% loss. But that's not the case here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I remember frameless calls gave ~0.8% win on symfony demo amd wordpress (let it be 1%).
How the observer might lose 5% in the back direction? It should be the same 1%.

Copy link
Member Author

@bwoebi bwoebi Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you misunderstand. When enabling a single observer at all have a 5% overall perf loss on the wordpress bench on current master. With this patch, without frameless support, we have a ~3.5% loss. With this patch and frameless support we have ~2.5% loss.
I.e. we do actually gain the same 1% from frameless.

I was just saying that:

I don't see reasons in attempt to win 0.1% of performance when we already losing 10%.

is exaggerated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you misunderstand. When enabling a single observer at all have a 5% overall perf loss on the wordpress bench on current master. With this patch, without frameless support, we have a ~3.5% loss. With this patch and frameless support we have ~2.5% loss.
I.e. we do actually gain the same 1% from frameless.

OK. This sounds more fair. Framiteless functions give 1% speedup, but only without observer. This proposal attempts to give the same 1% speedup when observed.

I don't think it's possible to give the same speedup, because your handler add overhead checking for observability (ZEND_OBSERVER_ENABLED + zend_observer_handler_is_unobserved).
Does this patch makes real improvement without JIT? (Your valgrind numbers show 5% on wordpress, and I can't explain this, because frameless function themselves gave just 1%).

I'm not completely sure what you are doing with JIT. Do you generate code for both frameless and ICALL? Or do you generate the code only for one of them depending on observability at the moment of compilation?

I'll try to analyse your benchmark results more careful...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your valgrind numbers show 5% on wordpress, and I can't explain this, because frameless function themselves gave just 1%.

The instr difference on this patch (with frameless) vs current master is 6-7%. The perf difference is 2.5%.
The instr difference on this patch with frameless vs this patch without frameless is 2%. The perf difference is 1%.

For both JIT and non-JIT.

I'm not completely sure what you are doing with JIT. Do you generate code for both frameless and ICALL? Or do you generate the code only for one of them depending on observability at the moment of compilation?

The generated code is essentially if (needs observing) { push frame (without arg copy); call FLF handler; pop frame } else { call FLF handler; }. So it's not quite falling back to ICALL in JIT.

I don't think it's possible to give the same speedup, because your handler add overhead checking for observability (ZEND_OBSERVER_ENABLED + zend_observer_handler_is_unobserved).

The ZEND_OBSERVER_ENABLED part is specialized away. Only zend_observer_handler_is_unobserved is checked, and that one is inlined - i.e. the full check for FLF functions is just if (*ZEND_OBSERVER_DATA(ZEND_FLF_FUNC(opline)) == ZEND_OBSERVER_NONE_OBSERVED).

which is a very low-cost check, comparatively to ICALL. We can assume the run time cache for FLF functions being set and the func not be a generator or trampoline, saving a fetch and two comparisons too.

Yes, if we had the same handling than we had for FCALL, then the speedup would not quite be the same.

return NULL;
}

Expand Down
48 changes: 48 additions & 0 deletions Zend/zend_execute.c
Original file line number Diff line number Diff line change
Expand Up @@ -1540,6 +1540,54 @@ static zend_never_inline void zend_assign_to_object_dim(zend_object *obj, zval *
}
}

static void frameless_observed_call_copy(zend_execute_data *call, uint32_t arg, zval *zv)
{
if (Z_ISUNDEF_P(zv)) {
ZVAL_NULL(ZEND_CALL_VAR_NUM(call, arg));
} else {
ZVAL_COPY_DEREF(ZEND_CALL_VAR_NUM(call, arg), zv);
}
}

ZEND_API void zend_frameless_observed_call(zend_execute_data *execute_data)
{
const zend_op *opline = EX(opline);
uint8_t num_args = ZEND_FLF_NUM_ARGS(opline->opcode);
zend_function *fbc = ZEND_FLF_FUNC(opline);
zval *result = EX_VAR(opline->result.var);

zend_execute_data *call = zend_vm_stack_push_call_frame_ex(zend_vm_calc_used_stack(num_args, fbc), ZEND_CALL_NESTED_FUNCTION, fbc, num_args, NULL);
call->prev_execute_data = execute_data;

switch (num_args) {
case 3: frameless_observed_call_copy(call, 2, zend_get_zval_ptr(opline+1, (opline+1)->op1_type, &(opline+1)->op1, execute_data)); ZEND_FALLTHROUGH;
case 2: frameless_observed_call_copy(call, 1, zend_get_zval_ptr(opline, opline->op2_type, &opline->op2, execute_data)); ZEND_FALLTHROUGH;
case 1: frameless_observed_call_copy(call, 0, zend_get_zval_ptr(opline, opline->op1_type, &opline->op1, execute_data));
}

EG(current_execute_data) = call;

zend_observer_fcall_begin_prechecked(call, ZEND_OBSERVER_DATA(fbc));
fbc->internal_function.handler(call, result);
zend_observer_fcall_end(call, result);

EG(current_execute_data) = execute_data;

if (UNEXPECTED(EG(exception) != NULL)) {
zend_rethrow_exception(execute_data);
}

zend_vm_stack_free_args(call);

uint32_t call_info = ZEND_CALL_INFO(call);
if (UNEXPECTED(call_info & ZEND_CALL_ALLOCATED)) {
zend_vm_stack_free_call_frame_ex(call_info, call);
} else {
EG(vm_stack_top) = (zval*)call;
}
}


static zend_always_inline int zend_binary_op(zval *ret, zval *op1, zval *op2 OPLINE_DC)
{
static const binary_op_type zend_binary_ops[] = {
Expand Down
2 changes: 2 additions & 0 deletions Zend/zend_execute.h
Original file line number Diff line number Diff line change
Expand Up @@ -430,6 +430,8 @@ ZEND_API void zend_cleanup_unfinished_execution(zend_execute_data *execute_data,
ZEND_API ZEND_ATTRIBUTE_DEPRECATED HashTable *zend_unfinished_execution_gc(zend_execute_data *execute_data, zend_execute_data *call, zend_get_gc_buffer *gc_buffer);
ZEND_API HashTable *zend_unfinished_execution_gc_ex(zend_execute_data *execute_data, zend_execute_data *call, zend_get_gc_buffer *gc_buffer, bool suspended_by_yield);

ZEND_API void zend_frameless_observed_call(zend_execute_data *execute_data);

zval * ZEND_FASTCALL zend_handle_named_arg(
zend_execute_data **call_ptr, zend_string *arg_name,
uint32_t *arg_num_ptr, void **cache_slot);
Expand Down
2 changes: 1 addition & 1 deletion Zend/zend_frameless_function.h
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,7 @@ typedef void (*zend_frameless_function_3)(zval *return_value, zval *op1, zval *o
extern size_t zend_flf_count;
extern size_t zend_flf_capacity;
ZEND_API extern void **zend_flf_handlers;
extern zend_function **zend_flf_functions;
ZEND_API extern zend_function **zend_flf_functions;

typedef struct {
void *handler;
Expand Down
2 changes: 2 additions & 0 deletions Zend/zend_globals.h
Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,8 @@ struct _zend_executor_globals {

uint32_t jit_trace_num; /* Used by tracing JIT to reference the currently running trace */

zend_execute_data *current_observed_frame;

int ticks_count;

zend_long precision;
Expand Down
105 changes: 62 additions & 43 deletions Zend/zend_observer.c
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
+----------------------------------------------------------------------+
| Authors: Levi Morrison <levim@php.net> |
| Sammy Kaye Powers <sammyk@php.net> |
| Bob Weinand <bobwei9@hotmail.com> |
+----------------------------------------------------------------------+
*/

Expand All @@ -23,15 +24,8 @@
#include "zend_llist.h"
#include "zend_vm.h"

#define ZEND_OBSERVER_DATA(function) \
ZEND_OP_ARRAY_EXTENSION((&(function)->common), ZEND_USER_CODE((function)->type) \
? zend_observer_fcall_op_array_extension : zend_observer_fcall_internal_function_extension)

#define ZEND_OBSERVER_NOT_OBSERVED ((void *) 2)

#define ZEND_OBSERVABLE_FN(function) \
(ZEND_MAP_PTR(function->common.run_time_cache) && !(function->common.fn_flags & ZEND_ACC_CALL_VIA_TRAMPOLINE))
Comment on lines -32 to -33
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this check effectively move to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to zend_observer_fcall_has_no_observers in zend_observer.h


static zend_llist zend_observers_fcall_list;
static zend_llist zend_observer_function_declared_callbacks;
static zend_llist zend_observer_class_linked_callbacks;
Expand All @@ -46,8 +40,6 @@ bool zend_observer_errors_observed;
bool zend_observer_function_declared_observed;
bool zend_observer_class_linked_observed;

ZEND_TLS zend_execute_data *current_observed_frame;

// Call during minit/startup ONLY
ZEND_API void zend_observer_fcall_register(zend_observer_fcall_init init)
{
Expand Down Expand Up @@ -107,7 +99,7 @@ ZEND_API void zend_observer_post_startup(void)

ZEND_API void zend_observer_activate(void)
{
current_observed_frame = NULL;
EG(current_observed_frame) = NULL;
}

ZEND_API void zend_observer_shutdown(void)
Expand All @@ -127,21 +119,24 @@ static void zend_observer_fcall_install(zend_execute_data *execute_data)
zend_function *function = execute_data->func;

ZEND_ASSERT(RUN_TIME_CACHE(&function->common));
zend_observer_fcall_begin_handler *begin_handlers = (zend_observer_fcall_begin_handler *)&ZEND_OBSERVER_DATA(function);
zend_observer_fcall_begin_handler *begin_handlers = ZEND_OBSERVER_DATA(function), *begin_handlers_start = begin_handlers;
zend_observer_fcall_end_handler *end_handlers = (zend_observer_fcall_end_handler *)begin_handlers + list->count, *end_handlers_start = end_handlers;

*begin_handlers = ZEND_OBSERVER_NOT_OBSERVED;
*end_handlers = ZEND_OBSERVER_NOT_OBSERVED;
bool has_handlers = false;

for (zend_llist_element *element = list->head; element; element = element->next) {
zend_observer_fcall_init init;
memcpy(&init, element->data, sizeof init);
zend_observer_fcall_handlers handlers = init(execute_data);
if (handlers.begin) {
*(begin_handlers++) = handlers.begin;
has_handlers = true;
}
if (handlers.end) {
*(end_handlers++) = handlers.end;
has_handlers = true;
}
}

Expand All @@ -151,6 +146,10 @@ static void zend_observer_fcall_install(zend_execute_data *execute_data)
*end_handlers = *end_handlers_start;
*end_handlers_start = tmp;
}

if (!has_handlers) {
*begin_handlers_start = ZEND_OBSERVER_NONE_OBSERVED;
}
}

/* We need to provide the ability to retrieve the handler which will move onto the position the current handler was.
Expand Down Expand Up @@ -182,8 +181,8 @@ static bool zend_observer_remove_handler(void **first_handler, void *old_handler

ZEND_API void zend_observer_add_begin_handler(zend_function *function, zend_observer_fcall_begin_handler begin) {
size_t registered_observers = zend_observers_fcall_list.count;
zend_observer_fcall_begin_handler *first_handler = (void *)&ZEND_OBSERVER_DATA(function), *last_handler = first_handler + registered_observers - 1;
if (*first_handler == ZEND_OBSERVER_NOT_OBSERVED) {
zend_observer_fcall_begin_handler *first_handler = ZEND_OBSERVER_DATA(function), *last_handler = first_handler + registered_observers - 1;
if (*first_handler == ZEND_OBSERVER_NOT_OBSERVED || *first_handler == ZEND_OBSERVER_NONE_OBSERVED) {
*first_handler = begin;
} else {
for (zend_observer_fcall_begin_handler *cur_handler = first_handler + 1; cur_handler <= last_handler; ++cur_handler) {
Expand All @@ -198,24 +197,47 @@ ZEND_API void zend_observer_add_begin_handler(zend_function *function, zend_obse
}

ZEND_API bool zend_observer_remove_begin_handler(zend_function *function, zend_observer_fcall_begin_handler begin, zend_observer_fcall_begin_handler *next) {
return zend_observer_remove_handler((void **)&ZEND_OBSERVER_DATA(function), begin, (void **)next);
void **begin_handlers = (void **)ZEND_OBSERVER_DATA(function);
if (zend_observer_remove_handler(begin_handlers, begin, (void**)next)) {
// Ensure invariant: ZEND_OBSERVER_NONE_OBSERVED in begin_handlers if both are not observed
if (*begin_handlers == ZEND_OBSERVER_NOT_OBSERVED) {
size_t registered_observers = zend_observers_fcall_list.count;
if (begin_handlers[registered_observers] /* first end handler */ == ZEND_OBSERVER_NOT_OBSERVED) {
*begin_handlers = ZEND_OBSERVER_NONE_OBSERVED;
}
}
return true;
}
return false;
}

ZEND_API void zend_observer_add_end_handler(zend_function *function, zend_observer_fcall_end_handler end) {
size_t registered_observers = zend_observers_fcall_list.count;
zend_observer_fcall_end_handler *end_handler = (zend_observer_fcall_end_handler *)&ZEND_OBSERVER_DATA(function) + registered_observers;
void **begin_handler = (void **)ZEND_OBSERVER_DATA(function);
zend_observer_fcall_end_handler *end_handler = (zend_observer_fcall_end_handler *)begin_handler + registered_observers;
// to allow to preserve the invariant that end handlers are in reverse order of begin handlers, push the new end handler in front
if (*end_handler != ZEND_OBSERVER_NOT_OBSERVED) {
// there's no space for new handlers, then it's forbidden to call this function
ZEND_ASSERT(end_handler[registered_observers - 1] == NULL);
memmove(end_handler + 1, end_handler, sizeof(end_handler) * (registered_observers - 1));
} else if (*begin_handler == ZEND_OBSERVER_NONE_OBSERVED) {
*begin_handler = ZEND_OBSERVER_NOT_OBSERVED;
}
*end_handler = end;
}

ZEND_API bool zend_observer_remove_end_handler(zend_function *function, zend_observer_fcall_end_handler end, zend_observer_fcall_end_handler *next) {
size_t registered_observers = zend_observers_fcall_list.count;
return zend_observer_remove_handler((void **)&ZEND_OBSERVER_DATA(function) + registered_observers, end, (void **)next);
void **begin_handlers = (void **)ZEND_OBSERVER_DATA(function);
void **end_handlers = begin_handlers + registered_observers;
if (zend_observer_remove_handler(end_handlers, end, (void**)next)) {
// Ensure invariant: ZEND_OBSERVER_NONE_OBSERVED in begin_handlers if both are not observed
if (*begin_handlers == ZEND_OBSERVER_NOT_OBSERVED && *end_handlers == ZEND_OBSERVER_NOT_OBSERVED) {
*begin_handlers = ZEND_OBSERVER_NONE_OBSERVED;
}
return true;
}
return false;
}

static inline zend_execute_data **prev_observed_frame(zend_execute_data *execute_data) {
Expand All @@ -224,33 +246,33 @@ static inline zend_execute_data **prev_observed_frame(zend_execute_data *execute
return (zend_execute_data **)&Z_PTR_P(EX_VAR_NUM((ZEND_USER_CODE(func->type) ? func->op_array.last_var : ZEND_CALL_NUM_ARGS(execute_data)) + func->common.T - 1));
}

static void ZEND_FASTCALL _zend_observe_fcall_begin(zend_execute_data *execute_data)
{
static void ZEND_FASTCALL _zend_observe_fcall_begin(zend_execute_data *execute_data) {
if (!ZEND_OBSERVER_ENABLED) {
return;
}

zend_function *function = execute_data->func;
zend_observer_fcall_begin_specialized(execute_data, true);
}

if (!ZEND_OBSERVABLE_FN(function)) {
return;
}
ZEND_API void ZEND_FASTCALL zend_observer_fcall_begin_prechecked(zend_execute_data *execute_data, zend_observer_fcall_begin_handler *handler)
{
zend_observer_fcall_begin_handler *possible_handlers_end = handler + zend_observers_fcall_list.count;

zend_observer_fcall_begin_handler *handler = (zend_observer_fcall_begin_handler *)&ZEND_OBSERVER_DATA(function);
if (!*handler) {
zend_observer_fcall_install(execute_data);
if (zend_observer_handler_is_unobserved(handler)) {
return;
}
}

zend_observer_fcall_begin_handler *possible_handlers_end = handler + zend_observers_fcall_list.count;

zend_observer_fcall_end_handler *end_handler = (zend_observer_fcall_end_handler *)possible_handlers_end;
if (*end_handler != ZEND_OBSERVER_NOT_OBSERVED) {
*prev_observed_frame(execute_data) = current_observed_frame;
current_observed_frame = execute_data;
}
*prev_observed_frame(execute_data) = EG(current_observed_frame);
EG(current_observed_frame) = execute_data;

if (*handler == ZEND_OBSERVER_NOT_OBSERVED) {
return;
if (*handler == ZEND_OBSERVER_NOT_OBSERVED) { // this function must not be called if ZEND_OBSERVER_NONE_OBSERVED, hence sufficient to check
return;
}
}

do {
Expand All @@ -265,17 +287,17 @@ ZEND_API void ZEND_FASTCALL zend_observer_generator_resume(zend_execute_data *ex

ZEND_API void ZEND_FASTCALL zend_observer_fcall_begin(zend_execute_data *execute_data)
{
ZEND_ASSUME(execute_data->func);
if (!(execute_data->func->common.fn_flags & ZEND_ACC_GENERATOR)) {
ZEND_ASSUME(EX(func));
if (!(EX(func)->common.fn_flags & ZEND_ACC_GENERATOR)) {
_zend_observe_fcall_begin(execute_data);
}
}

static inline void call_end_observers(zend_execute_data *execute_data, zval *return_value) {
zend_function *func = execute_data->func;
zend_function *func = EX(func);
ZEND_ASSERT(func);

zend_observer_fcall_end_handler *handler = (zend_observer_fcall_end_handler *)&ZEND_OBSERVER_DATA(func) + zend_observers_fcall_list.count;
zend_observer_fcall_end_handler *handler = (zend_observer_fcall_end_handler *)ZEND_OBSERVER_DATA(func) + zend_observers_fcall_list.count;
// TODO: Fix exceptions from generators
// ZEND_ASSERT(fcall_data);
if (!*handler || *handler == ZEND_OBSERVER_NOT_OBSERVED) {
Expand All @@ -288,19 +310,16 @@ static inline void call_end_observers(zend_execute_data *execute_data, zval *ret
} while (++handler != possible_handlers_end && *handler != NULL);
}

ZEND_API void ZEND_FASTCALL zend_observer_fcall_end(zend_execute_data *execute_data, zval *return_value)
ZEND_API void ZEND_FASTCALL zend_observer_fcall_end_prechecked(zend_execute_data *execute_data, zval *return_value)
{
if (execute_data != current_observed_frame) {
return;
}
call_end_observers(execute_data, return_value);
current_observed_frame = *prev_observed_frame(execute_data);
EG(current_observed_frame) = *prev_observed_frame(execute_data);
}

ZEND_API void zend_observer_fcall_end_all(void)
{
zend_execute_data *execute_data = current_observed_frame, *original_execute_data = EG(current_execute_data);
current_observed_frame = NULL;
zend_execute_data *execute_data = EG(current_observed_frame), *original_execute_data = EG(current_execute_data);
EG(current_observed_frame) = NULL;
while (execute_data) {
EG(current_execute_data) = execute_data;
call_end_observers(execute_data, NULL);
Expand Down Expand Up @@ -401,8 +420,8 @@ ZEND_API void ZEND_FASTCALL zend_observer_fiber_switch_notify(zend_fiber_context
callback(from, to);
}

from->top_observed_frame = current_observed_frame;
current_observed_frame = to->top_observed_frame;
from->top_observed_frame = EG(current_observed_frame);
EG(current_observed_frame) = to->top_observed_frame;
}

ZEND_API void ZEND_FASTCALL zend_observer_fiber_destroy_notify(zend_fiber_context *destroying)
Expand Down
Loading
Loading