Commit aa059ed

committed
Simplify zval chapter
1 parent e9ed474 commit aa059ed

1 file changed

+56
-59
lines changed

docs/source/core/data-structures/zval.rst

Lines changed: 56 additions & 59 deletions
@@ -2,11 +2,38 @@
22
zval
33
######
44

5-
PHP is a dynamic language. As such, a variable can typically contain a value of any type, and the
6-
type of the variable may even change during the execution of the program. Under the hood, this is
7-
implemented through the ``zval`` struct. It is one of the most important data structures in php-src.
8-
It is essentially a "tagged union", meaning it consists of an integer tag, representing the type of
9-
the variable, and a union for the value itself. Let's look at the value first.
5+
PHP is a dynamic language. A variable can typically contain a value of any type, and the type of the
6+
variable may even change during the execution of the program. Under the hood, this is implemented
7+
through the ``zval`` struct. It is one of the most important data structures in php-src. It is
8+
implemented as a "tagged union", meaning it stores what type of value it contains, and the value
9+
itself. Let's look at the type first.
10+
11+
************
12+
zval types
13+
************
14+
15+
.. code:: c
16+
17+
#define IS_UNDEF 0 /* A variable that was never written to. */
18+
#define IS_NULL 1
19+
#define IS_FALSE 2
20+
#define IS_TRUE 3
21+
#define IS_LONG 4 /* An integer value. */
22+
#define IS_DOUBLE 5 /* A floating point value. */
23+
#define IS_STRING 6
24+
#define IS_ARRAY 7
25+
#define IS_OBJECT 8
26+
#define IS_RESOURCE 9
27+
#define IS_REFERENCE 10
28+
29+
These simple integer constants determine what value is currently stored in a variable. If you are a
30+
PHP developer, these types should sound fairly familiar. They are pretty much an exact reflection of
31+
the types you may use in regular PHP code. One small oddity is that ``IS_FALSE`` and ``IS_TRUE`` are
32+
implemented as separate types, instead of as a single ``IS_BOOL`` type.
33+
34+
Some of these types are self-contained: they don't store any auxiliary data. This includes
35+
``IS_UNDEF``, ``IS_NULL``, ``IS_FALSE`` and ``IS_TRUE``. For the rest of the types, we are going to
36+
require some additional memory to store the actual value of the variable.
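
As a rough illustration of the split described above, the following sketch classifies type tags by whether they need extra storage. The helper name ``type_needs_payload`` is hypothetical and not part of php-src; only the ``IS_*`` values are taken from the listing above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Type tags copied from the listing above. */
#define IS_UNDEF     0
#define IS_NULL      1
#define IS_FALSE     2
#define IS_TRUE      3
#define IS_LONG      4
#define IS_DOUBLE    5
#define IS_STRING    6
#define IS_ARRAY     7
#define IS_OBJECT    8
#define IS_RESOURCE  9
#define IS_REFERENCE 10

/* Hypothetical helper (not part of php-src): returns true when a type
 * needs additional memory to hold its value. The self-contained types
 * occupy the lowest tag values, so one comparison is enough here. */
static bool type_needs_payload(uint8_t type)
{
    assert(type <= IS_REFERENCE); /* this sketch only covers the tags listed above */
    return type > IS_TRUE;
}
```

With these definitions, ``type_needs_payload(IS_NULL)`` is false while ``type_needs_payload(IS_STRING)`` is true.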
1037

1138
************
1239
zend_value
@@ -35,43 +62,24 @@ the variable, and a union for the value itself. Let's look at the value first.
3562
} ww;
3663
} zend_value;
3764
38-
A C union is a data type that is big enough to hold the biggest of its members. As such, it can hold
39-
exactly one of its members at a time. For example, ``zend_value`` may store the ``lval`` member, or
40-
the ``dval`` member, but never both at the same time. Remembering exactly *which* member is stored
41-
is our job. That's what the ``zval`` types are for.
65+
A C union is a data type that may store any one of its members at a time, by being (at least) as big
66+
as its biggest member. For example, ``zend_value`` may store the ``lval`` member, or the ``dval``
67+
member, but never both at the same time. However, it doesn't know which member is being stored.
68+
Remembering this is our job, and that's exactly what the ``IS_*`` constants are for.
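
To make the role of the tag concrete, here is a minimal sketch of a union paired with a type tag. It is a simplification, not the php-src definition: ``sketch_value``, ``sketch_tagged`` and the helper functions are hypothetical names, and only two members are modelled.

```c
#include <assert.h>
#include <stdint.h>

#define IS_LONG   4
#define IS_DOUBLE 5

/* Simplified stand-in for zend_value with only two members. */
typedef union {
    int64_t lval;
    double  dval;
} sketch_value;

/* Hypothetical pairing of a value with the tag that says which
 * union member is currently valid. */
typedef struct {
    sketch_value value;
    uint8_t      type;
} sketch_tagged;

static sketch_tagged make_long(int64_t n)
{
    sketch_tagged t;
    t.value.lval = n;
    t.type = IS_LONG;
    return t;
}

static sketch_tagged make_double(double d)
{
    sketch_tagged t;
    t.value.dval = d;
    t.type = IS_DOUBLE;
    return t;
}

/* Converts the stored value to a double, consulting the tag to decide
 * which union member to read. */
static double to_double(const sketch_tagged *t)
{
    assert(t->type == IS_LONG || t->type == IS_DOUBLE);
    return t->type == IS_LONG ? (double)t->value.lval : t->value.dval;
}
```

Reading ``dval`` from a value that was stored through ``lval`` would reinterpret the integer's bits as a floating point number, which is why every access goes through the tag first.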
4269

43-
If you are a PHP developer, the top members should sound pretty familiar, with the exception of
44-
``counted``. ``counted`` refers to any of the values that use `reference counting <todo>`__ to
45-
determine the lifetime of a value. This includes strings, arrays, objects, resources and references.
46-
All of these will be discussed in their own chapters. You may be thinking that some values are
47-
missing, most notably ``null`` and ``bool``. These values don't hold any auxiliary data, but consist
48-
solely of the ``zval`` type.
70+
The top members of ``zend_value`` mostly mirror the ``IS_*`` constants, with the exception of
71+
``counted``. ``counted`` polymorphically refers to any `reference counted <todo>`__ value, including
72+
strings, arrays, objects, resources and references. ``null`` and ``bool`` are missing from
73+
``zend_value`` because their types are self-contained.
4974

50-
************
51-
zval types
52-
************
53-
54-
.. code:: c
55-
56-
#define IS_UNDEF 0 /* A variable that was never written to. */
57-
#define IS_NULL 1
58-
#define IS_FALSE 2
59-
#define IS_TRUE 3
60-
#define IS_LONG 4 /* An integer value. */
61-
#define IS_DOUBLE 5 /* A floating point value. */
62-
#define IS_STRING 6
63-
#define IS_ARRAY 7
64-
#define IS_OBJECT 8
65-
#define IS_RESOURCE 9
66-
#define IS_REFERENCE 10
75+
The rest of the fields aren't important for now.
6776

68-
These simple integers determine what value is currently stored in ``zend_value``. Together, the
69-
value and the tag make up the ``zval``, along with some other fields. Note how ``IS_NULL``,
70-
``IS_FALSE`` and ``IS_TRUE`` are actually ``zval`` types. This explains why they are absent from
71-
``zend_value``.
77+
******
78+
zval
79+
******
7280

73-
Finally, here's what the ``zval`` struct actually looks like. This may look intimidating at first.
74-
Don't worry, we'll go over it step by step.
81+
Together, the value and the tag make up the ``zval``, along with some other fields. It may look
82+
intimidating at first. We'll go over it step by step.
7583

7684
.. code:: c
7785
@@ -104,27 +112,16 @@ Don't worry, we'll go over it step by step.
104112
} u2;
105113
};
106114
107-
``zval.value`` reserves space for the actual variable data, if the type requires any.
108-
109-
``zval.u1`` stores the type of the variable. This refers to the ``IS_*`` constants above. You may be
110-
wondering why this is a ``union``. In short, this field is used not only for the ``IS_*`` constants,
111-
but also some other flags. The entire ``type_info`` consists of 4 bytes. ``zval.u1.v.type``, the
112-
lowest byte, is used for the ``IS_*`` constants. ``zval.u1.v.type_flags`` is used for the
113-
``IS_TYPE_REFCOUNTED`` and ``IS_TYPE_COLLECTABLE`` flags. They will be discussed within the
114-
`reference counting <todo>`__ chapter. ``zval.u1.v.u.extra`` (containing the useless ``u`` union) is
115-
currently only used for the ``IS_STATIC_VAR_UNINITIALIZED`` flag, which is somewhat of a fringe-case
116-
we won't get into here. So, ``zval.u1.type_info`` and ``zval.u1.v`` are essentially two ways to
117-
access the same data. The ``ZEND_ENDIAN_LOHI_3`` macro is used to guarantee ordering of bytes across
118-
big- and little-endian architectures.
119-
120-
If you're familiar with C, you'll know that the compiler likes to add padding to structures with
121-
"odd" sizes. It does that because the CPU can work with some offsets more efficiently that others.
122-
Ignoring the ``zval.u2`` field for a second, our struct would be 12 bytes in total, 8 coming from
123-
``zval.value`` and 4 from ``zval.u1``. A compiler on a 64-bit architecture will generally bump this
124-
to 16 bytes by adding 4 bytes of useless padding. If this padding is added anyway, we might as well
125-
make use of it. ``zval.u2`` is often unoccupied, but provides 4 additional bytes to be used in
126-
various contexts. How exactly the value is used depends on the use case, but it's important to
127-
remember that it may only be used for one of them at a time.
115+
``zval.value`` reserves space for the actual variable data, as discussed above.
116+
117+
``zval.u1`` stores the variable type, the given ``IS_*`` constant, along with some other flags. Its
118+
definition looks a bit complicated. You can think of the entire field as a 4 byte integer, split into
119+
3 parts. ``v.type`` stores the actual variable type, ``v.type_flags`` is used for some `reference
120+
counting <todo>`__ flags, and ``v.u.extra`` is pretty much unused.
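
One way to picture the three parts is with plain shifts and masks on a 32-bit integer. This is a sketch under assumptions: the ``pack_type_info`` and ``unpack_*`` helpers are hypothetical names, and the layout used here (type in the lowest byte, flags in the next byte, extra in the upper two bytes) is meant to illustrate the idea rather than reproduce the php-src definition, which uses a union and a struct instead.

```c
#include <assert.h> /* not used by the helpers; handy for checks like the ones in the note below */
#include <stdint.h>

/* Combines the three parts into one 32-bit integer.
 * Assumed layout: bits 0-7 type, bits 8-15 type_flags, bits 16-31 extra. */
static uint32_t pack_type_info(uint8_t type, uint8_t type_flags, uint16_t extra)
{
    return (uint32_t)type
         | ((uint32_t)type_flags << 8)
         | ((uint32_t)extra << 16);
}

/* Extracts the IS_* constant from the lowest byte. */
static uint8_t unpack_type(uint32_t type_info)
{
    return (uint8_t)(type_info & 0xff);
}

/* Extracts the flags from the second byte. */
static uint8_t unpack_type_flags(uint32_t type_info)
{
    return (uint8_t)((type_info >> 8) & 0xff);
}

/* Extracts the remaining upper 16 bits. */
static uint16_t unpack_extra(uint32_t type_info)
{
    return (uint16_t)(type_info >> 16);
}
```

For example, ``unpack_type(pack_type_info(6, 1, 0))`` yields ``6``, the value of ``IS_STRING``. Working on the integer with shifts gives the same result on big- and little-endian machines, whereas accessing the parts through struct members depends on the order of bytes in memory.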
121+
122+
``zval.u2`` defines some more storage for various contexts that is often unoccupied. It's there
123+
because the memory would otherwise be wasted due to padding, so we may as well make use of it. We'll
124+
go over the relevant ones in their corresponding chapters.
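
The padding argument can be checked with ``sizeof``. The structs below are hypothetical simplifications, not the php-src definitions, and the exact sizes depend on the platform's alignment rules. On a typical 64-bit target both structs come out at 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified 8-byte stand-in for zend_value. */
typedef union {
    int64_t lval;
    double  dval;
} sketch_value;

/* 8 bytes of value plus 4 bytes of type info: 12 bytes of actual data. */
struct without_u2 {
    sketch_value value;
    uint32_t     type_info;
};

/* The same fields plus 4 more bytes, placed where padding would otherwise go. */
struct with_u2 {
    sketch_value value;
    uint32_t     type_info;
    uint32_t     extra;
};

/* Number of bytes the compiler adds to the smaller struct as padding. */
static size_t padding_bytes_without_u2(void)
{
    return sizeof(struct without_u2) - (sizeof(sketch_value) + sizeof(uint32_t));
}
```

Where ``sketch_value`` is aligned to 8 bytes, ``padding_bytes_without_u2()`` returns 4, so adding the extra field does not make the struct any larger.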
128125

129126
********
130127
Macros
