|
2 | 2 | zval
|
3 | 3 | ######
|
4 | 4 |
|
5 |
| -PHP is a dynamic language. As such, a variable can typically contain a value of any type, and the |
6 |
| -type of the variable may even change during the execution of the program. Under the hood, this is |
7 |
| -implemented through the ``zval`` struct. It is one of the most important data structures in php-src. |
8 |
| -It is essentially a "tagged union", meaning it consists of an integer tag, representing the type of |
9 |
| -the variable, and a union for the value itself. Let's look at the value first. |
| 5 | +PHP is a dynamic language. A variable can typically contain a value of any type, and the type of the |
| 6 | +variable may even change during the execution of the program. Under the hood, this is implemented |
| 7 | +through the ``zval`` struct. It is one of the most important data structures in php-src. It is |
| 8 | +implemented as a "tagged union", meaning it stores what type of value it contains, and the value |
| 9 | +itself. Let's look at the type first. |
| 10 | + |
| 11 | +************ |
| 12 | + zval types |
| 13 | +************ |
| 14 | + |
| 15 | +.. code:: c |
| 16 | +
|
| 17 | + #define IS_UNDEF 0 /* A variable that was never written to. */ |
| 18 | + #define IS_NULL 1 |
| 19 | + #define IS_FALSE 2 |
| 20 | + #define IS_TRUE 3 |
| 21 | + #define IS_LONG 4 /* An integer value. */ |
| 22 | + #define IS_DOUBLE 5 /* A floating point value. */ |
| 23 | + #define IS_STRING 6 |
| 24 | + #define IS_ARRAY 7 |
| 25 | + #define IS_OBJECT 8 |
| 26 | + #define IS_RESOURCE 9 |
| 27 | + #define IS_REFERENCE 10 |
| 28 | +
|
| 29 | +These simple integer constants determine what value is currently stored in a variable. If you are a |
| 30 | +PHP developer, these types should sound fairly familiar. They are pretty much an exact reflection of |
| 31 | +the types you may use in regular PHP code. One small oddity is that ``IS_FALSE`` and ``IS_TRUE`` are |
| 32 | +implemented as separate types, instead of as a single ``IS_BOOL`` type. |
| 33 | + |
| 34 | +Some of these types are self-contained: they don't store any auxiliary data. This includes |
| 35 | +``IS_UNDEF``, ``IS_NULL``, ``IS_FALSE`` and ``IS_TRUE``. For the rest of the types, we are going to |
| 36 | +require some additional memory to store the actual value of the variable. |
10 | 37 |
|
11 | 38 | ************
|
12 | 39 | zend_value
|
@@ -35,43 +62,24 @@ the variable, and a union for the value itself. Let's look at the value first.
|
35 | 62 | } ww;
|
36 | 63 | } zend_value;
|
37 | 64 |
|
38 |
| -A C union is a data type that is big enough to hold the biggest of its members. As such, it can hold |
39 |
| -exactly one of its members at a time. For example, ``zend_value`` may store the ``lval`` member, or |
40 |
| -the ``dval`` member, but never both at the same time. Remembering exactly *which* member is stored |
41 |
| -is our job. That's what the ``zval`` types are for. |
| 65 | +A C union is a data type that may store any one of its members at a time, by being (at least) as big |
| 66 | +as its biggest member. For example, ``zend_value`` may store the ``lval`` member, or the ``dval`` |
| 67 | +member, but never both at the same time. However, it doesn't know which member is being stored. |
| 68 | +Remembering this is our job, and that's exactly what the ``IS_*`` constants are for. |
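The division of labour between the union and the ``IS_*`` constants can be illustrated with a minimal sketch. All names below (``sketch_type``, ``sketch_value``, ``sketch_zval`` and the helper functions) are hypothetical and only loosely modelled on php-src; this is not the actual ``zval`` definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags, loosely modelled on the IS_* constants. */
enum sketch_type { SKETCH_NULL, SKETCH_LONG, SKETCH_DOUBLE };

/* The union only reserves storage. It does not record which member is live. */
union sketch_value {
    int64_t lval;
    double  dval;
};

/* Tag plus union: the tag remembers which member was written last. */
struct sketch_zval {
    union sketch_value value;
    enum sketch_type   type;
};

static struct sketch_zval sketch_from_long(int64_t l) {
    struct sketch_zval z;
    z.value.lval = l;
    z.type = SKETCH_LONG;
    return z;
}

static struct sketch_zval sketch_from_double(double d) {
    struct sketch_zval z;
    z.value.dval = d;
    z.type = SKETCH_DOUBLE;
    return z;
}

/* A self-contained type: the tag alone carries all the information. */
static struct sketch_zval sketch_null(void) {
    struct sketch_zval z;
    z.value.lval = 0; /* unused, initialized only for tidiness */
    z.type = SKETCH_NULL;
    return z;
}

/* Reading a member is only valid if the tag says that member is stored. */
static int64_t sketch_get_long(const struct sketch_zval *z) {
    assert(z->type == SKETCH_LONG);
    return z->value.lval;
}
```

Checking the tag before touching the union is the part the compiler will not do for us, which is why the sketch funnels reads through an accessor that asserts the expected type.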
42 | 69 |
|
43 |
| -If you are a PHP developer, the top members should sound pretty familiar, with the exception of |
44 |
| -``counted``. ``counted`` refers to any of the values that use `reference counting <todo>`__ to |
45 |
| -determine the lifetime of a value. This includes strings, arrays, objects, resources and references. |
46 |
| -All of these will be discussed in their own chapters. You may be thinking that some values are |
47 |
| -missing, most notably ``null`` and ``bool``. These values don't hold any auxiliary data, but consist |
48 |
| -solely of the ``zval`` type. |
| 70 | +The top members of ``zend_value`` mostly mirror the ``IS_*`` constants, with the exception of |
| 71 | +``counted``. ``counted`` polymorphically refers to any `reference counted <todo>`__ value, including |
| 72 | +strings, arrays, objects, resources and references. ``null`` and ``bool`` are missing from |
| 73 | +``zend_value`` because their types are self-contained. |
49 | 74 |
|
50 |
| -************ |
51 |
| - zval types |
52 |
| -************ |
53 |
| - |
54 |
| -.. code:: c |
55 |
| -
|
56 |
| - #define IS_UNDEF 0 /* A variable that was never written to. */ |
57 |
| - #define IS_NULL 1 |
58 |
| - #define IS_FALSE 2 |
59 |
| - #define IS_TRUE 3 |
60 |
| - #define IS_LONG 4 /* An integer value. */ |
61 |
| - #define IS_DOUBLE 5 /* A floating point value. */ |
62 |
| - #define IS_STRING 6 |
63 |
| - #define IS_ARRAY 7 |
64 |
| - #define IS_OBJECT 8 |
65 |
| - #define IS_RESOURCE 9 |
66 |
| - #define IS_REFERENCE 10 |
| 75 | +The rest of the fields aren't important for now. |
67 | 76 |
|
68 |
| -These simple integers determine what value is currently stored in ``zend_value``. Together, the |
69 |
| -value and the tag make up the ``zval``, along with some other fields. Note how ``IS_NULL``, |
70 |
| -``IS_FALSE`` and ``IS_TRUE`` are actually ``zval`` types. This explains why they are absent from |
71 |
| -``zend_value``. |
| 77 | +****** |
| 78 | + zval |
| 79 | +****** |
72 | 80 |
|
73 |
| -Finally, here's what the ``zval`` struct actually looks like. This may look intimidating at first. |
74 |
| -Don't worry, we'll go over it step by step. |
| 81 | +Together, the value and the tag make up the ``zval``, along with some other fields. It may look |
| 82 | +intimidating at first. We'll go over it step by step. |
75 | 83 |
|
76 | 84 | .. code:: c
|
77 | 85 |
|
@@ -104,27 +112,16 @@ Don't worry, we'll go over it step by step.
|
104 | 112 | } u2;
|
105 | 113 | };
|
106 | 114 |
|
107 |
| -``zval.value`` reserves space for the actual variable data, if the type requires any. |
108 |
| - |
109 |
| -``zval.u1`` stores the type of the variable. This refers to the ``IS_*`` constants above. You may be |
110 |
| -wondering why this is a ``union``. In short, this field is used not only for the ``IS_*`` constants, |
111 |
| -but also some other flags. The entire ``type_info`` consists of 4 bytes. ``zval.u1.v.type``, the |
112 |
| -lowest byte, is used for the ``IS_*`` constants. ``zval.u1.v.type_flags`` is used for the |
113 |
| -``IS_TYPE_REFCOUNTED`` and ``IS_TYPE_COLLECTABLE`` flags. They will be discussed within the |
114 |
| -`reference counting <todo>`__ chapter. ``zval.u1.v.u.extra`` (containing the useless ``u`` union) is |
115 |
| -currently only used for the ``IS_STATIC_VAR_UNINITIALIZED`` flag, which is somewhat of a fringe-case |
116 |
| -we won't get into here. So, ``zval.u1.type_info`` and ``zval.u1.v`` are essentially two ways to |
117 |
| -access the same data. The ``ZEND_ENDIAN_LOHI_3`` macro is used to guarantee ordering of bytes across |
118 |
| -big- and little-endian architectures. |
119 |
| - |
120 |
| -If you're familiar with C, you'll know that the compiler likes to add padding to structures with |
121 |
| -"odd" sizes. It does that because the CPU can work with some offsets more efficiently that others. |
122 |
| -Ignoring the ``zval.u2`` field for a second, our struct would be 12 bytes in total, 8 coming from |
123 |
| -``zval.value`` and 4 from ``zval.u1``. A compiler on a 64-bit architecture will generally bump this |
124 |
| -to 16 bytes by adding 4 bytes of useless padding. If this padding is added anyway, we might as well |
125 |
| -make use of it. ``zval.u2`` is often unoccupied, but provides 4 additional bytes to be used in |
126 |
| -various contexts. How exactly the value is used depends on the use case, but it's important to |
127 |
| -remember that it may only be used for one of them at a time. |
| 115 | +``zval.value`` reserves space for the actual variable data, as discussed above. |
| 116 | + |
| 117 | +``zval.u1`` stores the variable type, the given ``IS_*`` constant, along with some other flags. Its |
| 118 | +definition looks a bit complicated. You can think of the entire field as a 4 byte integer, split into |
| 119 | +3 parts. ``v.type`` stores the actual variable type, ``v.type_flags`` is used for some `reference |
| 120 | +counting <todo>`__ flags, and ``v.u.extra`` is pretty much unused. |
| 121 | + |
| 122 | +``zval.u2`` provides 4 additional bytes of storage that are used in various contexts, but are often |
| 123 | +unoccupied. It's there because the memory would otherwise be wasted due to padding, so we may as well |
| 124 | +make use of it. We'll go over the relevant uses in their corresponding chapters. |
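To make the three parts of ``zval.u1`` more concrete, here is a small sketch that packs and unpacks a 32-bit integer by hand. The macro and function names are hypothetical, and the layout is an assumption for illustration: the type in the lowest byte, the flags in the next byte, and the extra data in the upper two bytes. php-src itself reaches the same parts through the ``v`` struct rather than through helpers like these.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: byte 0 = type, byte 1 = type flags, bytes 2-3 = extra. */
#define SKETCH_TYPE_MASK   0xffu
#define SKETCH_FLAGS_SHIFT 8
#define SKETCH_EXTRA_SHIFT 16

/* Extract the IS_*-style type tag from the lowest byte. */
static uint8_t sketch_type(uint32_t info) {
    return (uint8_t)(info & SKETCH_TYPE_MASK);
}

/* Extract the flags stored in the second byte. */
static uint8_t sketch_type_flags(uint32_t info) {
    return (uint8_t)((info >> SKETCH_FLAGS_SHIFT) & 0xffu);
}

/* Extract the 16 bits of extra data stored in the upper half. */
static uint16_t sketch_extra(uint32_t info) {
    return (uint16_t)(info >> SKETCH_EXTRA_SHIFT);
}

/* Combine the three parts into a single 4 byte integer. */
static uint32_t sketch_pack(uint8_t type, uint8_t flags, uint16_t extra) {
    uint32_t info = (uint32_t)type
                  | ((uint32_t)flags << SKETCH_FLAGS_SHIFT)
                  | ((uint32_t)extra << SKETCH_EXTRA_SHIFT);
    /* Sanity check: unpacking must give back what was packed. */
    assert(sketch_type(info) == type);
    assert(sketch_type_flags(info) == flags);
    assert(sketch_extra(info) == extra);
    return info;
}
```

Viewing the field as one integer makes it possible to read or compare the type and its flags in a single operation, while the individual parts stay accessible when only one of them is needed.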
128 | 125 |
|
129 | 126 | ********
|
130 | 127 | Macros
|
|