Better uop coverage in the JIT optimizer #131798

Open
@brandtbucher

Of the 263 total uops, 155 are ignored by the tier two optimizer. Together, these represent over half of all uops by dynamic execution count.

This issue will serve as a checklist for auditing these missing uops and adding them where they make sense. At first glance, there's quite a bit of potential here, especially around the ability to narrow known output types (like _CONTAINS_OP_SET) and the ability to narrow and remove guards on input types (like _BINARY_OP_SUBSCR_LIST_INT). As I go through, I'll cross out anything that doesn't seem to make sense to add.
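To make the two ideas above concrete, here is a minimal toy sketch in Python. It is not CPython's actual optimizer: the `optimize` function, the `GUARDS` and `EFFECTS` tables, and the simplified stack effects are all assumptions made for illustration. It only shows the general shape of the technique, which is to track what is known about each stack slot, record the output type of a uop where it is fixed, and drop any guard whose type has already been established.

```python
UNKNOWN = object()  # placeholder: nothing is known about this stack slot

# Guard uops: name -> (stack index checked, type proven when the guard passes).
GUARDS = {
    "_GUARD_TOS_INT": (-1, int),
    "_GUARD_NOS_INT": (-2, int),
    "_GUARD_TOS_FLOAT": (-1, float),
    "_GUARD_NOS_FLOAT": (-2, float),
}

# Other uops: name -> (number of values popped, types of values pushed).
# These stack effects are simplified assumptions for this toy model.
EFFECTS = {
    "_CONTAINS_OP_SET": (2, [bool]),               # `x in some_set` yields a bool
    "_BINARY_OP_SUBSCR_LIST_INT": (2, [UNKNOWN]),  # element type is not known
    "_POP_TOP": (1, []),
}


def optimize(trace, stack):
    """Drop guards whose checked type is already known.

    `stack` lists the known type (or UNKNOWN) of each value on the abstract
    stack at trace entry, with top-of-stack last.
    Returns (optimized_trace, final_abstract_stack).
    """
    stack = list(stack)
    optimized = []
    for uop in trace:
        if uop in GUARDS:
            index, proven = GUARDS[uop]
            if stack[index] is proven:
                continue  # redundant: the type was established earlier
            stack[index] = proven  # a passing guard narrows the slot
        else:
            popped, pushed = EFFECTS[uop]
            if popped:
                del stack[-popped:]
            stack.extend(pushed)
        optimized.append(uop)
    return optimized, stack


# Two of the five uops are redundant guards and get removed.
trace = ["_GUARD_NOS_INT", "_GUARD_TOS_INT", "_GUARD_NOS_INT", "_POP_TOP", "_GUARD_TOS_INT"]
print(optimize(trace, [UNKNOWN, UNKNOWN])[0])
# ['_GUARD_NOS_INT', '_GUARD_TOS_INT', '_POP_TOP']

# The output of _CONTAINS_OP_SET is narrowed to bool.
print(optimize(["_CONTAINS_OP_SET"], [UNKNOWN, UNKNOWN])[1])
# [<class 'bool'>]
```

In this model, a uop that the optimizer ignores would have to be treated as producing UNKNOWN outputs, which is why adding coverage for more uops can let later guards be removed.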

First, here are the 53 missing uops that each represent at least 0.1% of all uops executed:

  • _SET_IP (12.1%)
  • _CHECK_VALIDITY (10.1%)
  • _CHECK_VALIDITY_AND_SET_IP (6.5%)
  • _CHECK_PERIODIC (3.1%)
  • _MAKE_WARM (2.8%)
  • _START_EXECUTOR (1.7%)
  • _GUARD_NOS_INT (1.5%)
  • _BINARY_OP_SUBSCR_LIST_INT (1.0%)
  • _CHECK_FUNCTION (1.0%)
  • _CHECK_MANAGED_OBJECT_HAS_VALUES (0.7%)
  • _ITER_CHECK_LIST (0.7%)
  • _CONTAINS_OP_SET (0.6%)
  • _FOR_ITER_TIER_TWO (0.6%)
  • _GUARD_NOT_EXHAUSTED_LIST (0.6%)
  • _ITER_NEXT_LIST_TIER_TWO (0.6%)
  • _SAVE_RETURN_OFFSET (0.6%)
  • _CALL_LEN (0.5%)
  • _CALL_LIST_APPEND (0.5%)
  • _POP_TOP (0.5%)
  • _RESUME_CHECK (0.5%)
  • _BINARY_OP_SUBSCR_STR_INT (0.4%)
  • _GUARD_DORV_VALUES_INST_ATTR_FROM_DICT (0.4%)
  • _GUARD_KEYS_VERSION (0.4%)
  • _BINARY_OP_SUBSCR_DICT (0.3%)
  • _CALL_BUILTIN_FAST (0.3%)
  • _CHECK_STACK_SPACE_OPERAND (0.3%)
  • _GET_ITER (0.3%)
  • _STORE_SUBSCR (0.3%)
  • _GUARD_NOT_EXHAUSTED_RANGE (0.2%)
  • _BINARY_SLICE (0.2%)
  • _BUILD_LIST (0.2%)
  • _CALL_BUILTIN_O (0.2%)
  • _CALL_NON_PY_GENERAL (0.2%)
  • _CHECK_IS_NOT_PY_CALLABLE (0.2%)
  • _GUARD_NOS_FLOAT (0.2%)
  • _ITER_CHECK_RANGE (0.2%)
  • _ITER_CHECK_TUPLE (0.2%)
  • _LOAD_DEREF (0.2%)
  • _STORE_SUBSCR_LIST_INT (0.2%)
  • _BINARY_OP_EXTEND (0.1%)
  • _CALL_ISINSTANCE (0.1%)
  • _CALL_METHOD_DESCRIPTOR_FAST (0.1%)
  • _CALL_METHOD_DESCRIPTOR_FAST_WITH_KEYWORDS (0.1%)
  • _CALL_METHOD_DESCRIPTOR_NOARGS (0.1%)
  • _CALL_TYPE_1 (0.1%)
  • _CHECK_ATTR_CLASS (0.1%)
  • _CONTAINS_OP_DICT (0.1%)
  • _GUARD_BINARY_OP_EXTEND (0.1%)
  • _GUARD_NOT_EXHAUSTED_TUPLE (0.1%)
  • _ITER_NEXT_TUPLE (0.1%)
  • _LIST_APPEND (0.1%)
  • _STORE_ATTR_SLOT (0.1%)
  • _STORE_SUBSCR_DICT (0.1%)

And here are the 102 missing uops that each represent less than 0.1% of all uops executed. These are less important, but may still net us some wins on individual benchmarks:

  • _BINARY_OP_SUBSCR_CHECK_FUNC
  • _BINARY_OP_SUBSCR_TUPLE_INT
  • _BUILD_MAP
  • _BUILD_SET
  • _BUILD_SLICE
  • _BUILD_STRING
  • _CALL_BUILTIN_CLASS
  • _CALL_BUILTIN_FAST_WITH_KEYWORDS
  • _CALL_INTRINSIC_1
  • _CALL_INTRINSIC_2
  • _CALL_KW_NON_PY
  • _CALL_METHOD_DESCRIPTOR_O
  • _CALL_STR_1
  • _CALL_TUPLE_1
  • _CHECK_ATTR_METHOD_LAZY_DICT
  • _CHECK_EG_MATCH
  • _CHECK_EXC_MATCH
  • _CHECK_FUNCTION_VERSION_INLINE
  • _CHECK_FUNCTION_VERSION_KW
  • _CHECK_IS_NOT_PY_CALLABLE_KW
  • _CHECK_METHOD_VERSION
  • _CHECK_METHOD_VERSION_KW
  • _CHECK_PERIODIC_IF_NOT_YIELD_FROM
  • _CONVERT_VALUE
  • _COPY_FREE_VARS
  • _DELETE_ATTR
  • _DELETE_DEREF
  • _DELETE_FAST
  • _DELETE_GLOBAL
  • _DELETE_NAME
  • _DELETE_SUBSCR
  • _DEOPT
  • _DICT_MERGE
  • _DICT_UPDATE
  • _END_FOR
  • _END_SEND
  • _ERROR_POP_N
  • _EXIT_INIT_CHECK
  • _EXPAND_METHOD
  • _EXPAND_METHOD_KW
  • _FATAL_ERROR
  • _FORMAT_SIMPLE
  • _FORMAT_WITH_SPEC
  • _GET_AITER
  • _GET_ANEXT
  • _GET_AWAITABLE
  • _GET_LEN
  • _GET_YIELD_FROM_ITER
  • _GUARD_DORV_NO_DICT
  • _GUARD_GLOBALS_VERSION
  • _GUARD_TOS_FLOAT
  • _GUARD_TOS_INT
  • _GUARD_TYPE_VERSION_AND_LOCK
  • _IMPORT_FROM
  • _IMPORT_NAME
  • _IS_NONE
  • _LIST_EXTEND
  • _LOAD_ATTR_NONDESCRIPTOR_NO_DICT
  • _LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES
  • _LOAD_BUILD_CLASS
  • _LOAD_COMMON_CONSTANT
  • _LOAD_FAST_LOAD_FAST
  • _LOAD_FROM_DICT_OR_DEREF
  • _LOAD_GLOBAL
  • _LOAD_GLOBAL_BUILTINS
  • _LOAD_GLOBAL_MODULE
  • _LOAD_LOCALS
  • _LOAD_NAME
  • _LOAD_SUPER_ATTR_ATTR
  • _LOAD_SUPER_ATTR_METHOD
  • _MAKE_CALLARGS_A_TUPLE
  • _MAKE_CELL
  • _MAKE_FUNCTION
  • _MAP_ADD
  • _MATCH_CLASS
  • _MATCH_KEYS
  • _MATCH_MAPPING
  • _MATCH_SEQUENCE
  • _MAYBE_EXPAND_METHOD_KW
  • _NOP
  • _POP_EXCEPT
  • _POP_TWO_LOAD_CONST_INLINE_BORROW
  • _PUSH_EXC_INFO
  • _PUSH_NULL_CONDITIONAL
  • _SETUP_ANNOTATIONS
  • _SET_ADD
  • _SET_FUNCTION_ATTRIBUTE
  • _SET_UPDATE
  • _STORE_ATTR
  • _STORE_ATTR_INSTANCE_VALUE
  • _STORE_ATTR_WITH_HINT
  • _STORE_DEREF
  • _STORE_FAST_LOAD_FAST
  • _STORE_FAST_STORE_FAST
  • _STORE_GLOBAL
  • _STORE_NAME
  • _STORE_SLICE
  • _TIER2_RESUME_CHECK
  • _UNARY_INVERT
  • _UNARY_NEGATIVE
  • _UNPACK_SEQUENCE_LIST
  • _WITH_EXCEPT_START

Labels

  • interpreter-core: (Objects, Python, Grammar, and Parser dirs)
  • performance: Performance or resource usage
  • topic-JIT
  • type-feature: A feature request or enhancement
