PHP Anonymous Functions: A Black Magic Séance, Exposed





We should probably start with the fact that an anonymous function (closure) in PHP is not a function but an object of the Closure class. The article could really end right here, but if you are interested in the details, read on.





To back that claim up:

 $func = function (){};
 var_dump($func);
 ---------
 object(Closure)#1 (0) {
 }
      
      





Looking ahead, I’ll say that this is not really an ordinary object. Let's figure it out.



For example, this code

 $func = function (){
     echo 'Hello world!';
 };
 $func();
      
      





compiles into the following set of opcodes:

 line     #* E I O op                         fetch  ext  return  operands
 --------------------------------------------------------------------------
    8     0  E >   DECLARE_LAMBDA_FUNCTION                         '%00%7Bclosure%7D%2Fin%2FcrvX50x7fabda9ed09e'
   10     1        ASSIGN                                          !0, ~1
   11     2        INIT_DYNAMIC_CALL                               !0
          3        DO_FCALL                          0
   11     2      > RETURN                                          1

 Function %00%7Bclosure%7D%2Fin%2FcrvX50x7fabda9ed09e:
 function name:  {closure}
 line     #* E I O op                         fetch  ext  return  operands
 --------------------------------------------------------------------------
    9     0  E >   ECHO                                            'Hello+world%21'
   10     1      > RETURN                                          null
      
      





The block describing the function body is not particularly interesting to us, but the first block contains two interesting opcodes: DECLARE_LAMBDA_FUNCTION and INIT_DYNAMIC_CALL. Let's start with the second one.



INIT_DYNAMIC_CALL



This opcode is used when the compiler sees a function call on a variable or on an array. That is:

 $variable();
 ['ClassName', 'staticMethod']();
      
      





This opcode is not unique to closures. The same syntax also works for objects, by calling the __invoke() method, for string variables containing a function name ($a = 'funcName'; $a();), and for arrays containing a class name and the name of one of its static methods.
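As a minimal sketch, the four call forms listed above can be put side by side. The names shout, Util::twice and Greeter are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical helpers, defined only for this illustration.
function shout(string $s): string { return strtoupper($s); }

class Util {
    public static function twice(int $n): int { return $n * 2; }
}

class Greeter {
    // Makes instances callable like functions.
    public function __invoke(string $name): string { return "Hi, $name"; }
}

$closure   = function (int $n): int { return $n + 1; };
$invokable = new Greeter();
$string    = 'shout';
$array     = ['Util', 'twice'];

// From the compiler's point of view, each of these is a call on a variable.
$results = [
    $closure(1),        // Closure object
    $invokable('Bob'),  // object with __invoke()
    $string('abc'),     // string holding a function name
    $array(21),         // [class name, static method name]
];

var_dump($results);
```

All four calls use the same syntax, even though what actually gets executed is found in a different way for each kind of value.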



In the case of a closure, we are interested in calling a variable that holds an object, which is logical.

Digging deeper into the VM code that handles this opcode, we get to the zend_init_dynamic_call_object function, where we see the following (abridged):

 zend_execute_data *zend_init_dynamic_call_object(zend_object *function, uint32_t num_args)
 {
     zend_function *fbc;
     zend_class_entry *called_scope;
     zend_object *object;
     ...
     if (EXPECTED(function->handlers->get_closure) &&
         EXPECTED(function->handlers->get_closure(function, &called_scope, &fbc, &object) == SUCCESS)) {
         ...
     } else {
         zend_throw_error(NULL, "Function name must be a string");
         return NULL;
     }
     ...
 }
      
      





Amusingly, from the VM's point of view the familiar __invoke method call is an attempt to obtain a closure through get_closure.



In fact, this is where the handling of an anonymous function call and of a regular object's __invoke method starts to differ.

In PHP, every object has a set of handlers that define its utility and magic methods.

The standard set looks like this:

 ZEND_API const zend_object_handlers std_object_handlers = {
     0,                                  /* offset */

     zend_object_std_dtor,               /* free_obj */
     zend_objects_destroy_object,        /* dtor_obj */
     zend_objects_clone_obj,             /* clone_obj */

     zend_std_read_property,             /* read_property */
     zend_std_write_property,            /* write_property */
     zend_std_read_dimension,            /* read_dimension */
     zend_std_write_dimension,           /* write_dimension */
     zend_std_get_property_ptr_ptr,      /* get_property_ptr_ptr */
     NULL,                               /* get */
     NULL,                               /* set */
     zend_std_has_property,              /* has_property */
     zend_std_unset_property,            /* unset_property */
     zend_std_has_dimension,             /* has_dimension */
     zend_std_unset_dimension,           /* unset_dimension */
     zend_std_get_properties,            /* get_properties */
     zend_std_get_method,                /* get_method */
     zend_std_get_constructor,           /* get_constructor */
     zend_std_get_class_name,            /* get_class_name */
     zend_std_compare_objects,           /* compare_objects */
     zend_std_cast_object_tostring,      /* cast_object */
     NULL,                               /* count_elements */
     zend_std_get_debug_info,            /* get_debug_info */
     /* ------- */
     zend_std_get_closure,               /* get_closure */
     /* ------- */
     zend_std_get_gc,                    /* get_gc */
     NULL,                               /* do_operation */
     NULL,                               /* compare */
     NULL,                               /* get_properties_for */
 };
      
      







Now we are interested in the get_closure handler. For a regular object, it points to the zend_std_get_closure function, which checks that the __invoke method is defined for the object and returns either a pointer to it or an error. For the Closure class that implements anonymous functions, however, almost all utility functions in this array of handlers are redefined, including those that control the life cycle. That is, although it looks like an ordinary object to the user, it is in fact a mutant with superpowers :)

Registering the handlers for objects of the Closure class:

 void zend_register_closure_ce(void) /* {{{ */
 {
     zend_class_entry ce;

     INIT_CLASS_ENTRY(ce, "Closure", closure_functions);
     zend_ce_closure = zend_register_internal_class(&ce);
     zend_ce_closure->ce_flags |= ZEND_ACC_FINAL;
     zend_ce_closure->create_object = zend_closure_new;
     zend_ce_closure->serialize = zend_class_serialize_deny;
     zend_ce_closure->unserialize = zend_class_unserialize_deny;

     memcpy(&closure_handlers, &std_object_handlers, sizeof(zend_object_handlers));
     closure_handlers.free_obj = zend_closure_free_storage;
     closure_handlers.get_constructor = zend_closure_get_constructor;
     closure_handlers.get_method = zend_closure_get_method;
     closure_handlers.write_property = zend_closure_write_property;
     closure_handlers.read_property = zend_closure_read_property;
     closure_handlers.get_property_ptr_ptr = zend_closure_get_property_ptr_ptr;
     closure_handlers.has_property = zend_closure_has_property;
     closure_handlers.unset_property = zend_closure_unset_property;
     closure_handlers.compare_objects = zend_closure_compare_objects;
     closure_handlers.clone_obj = zend_closure_clone;
     closure_handlers.get_debug_info = zend_closure_get_debug_info;
     /* ------- */
     closure_handlers.get_closure = zend_closure_get_closure;
     /* ------- */
     closure_handlers.get_gc = zend_closure_get_gc;
 }
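Some of these overridden handlers can be observed from userland. The sketch below assumes PHP 7 or later, where these failures surface as catchable exceptions; it only records the class of each one rather than relying on the exact message text.

```php
<?php
$func = function () {};
$errors = [];

// write_property is overridden: a closure refuses to hold properties.
try {
    $func->answer = 42;
} catch (Throwable $e) {
    $errors['write'] = get_class($e);
}

// get_constructor is overridden: direct instantiation is rejected.
try {
    $obj = new Closure();
} catch (Throwable $e) {
    $errors['new'] = get_class($e);
}

// serialize is set to zend_class_serialize_deny.
try {
    serialize($func);
} catch (Throwable $e) {
    $errors['serialize'] = get_class($e);
}

print_r($errors);
```

An ordinary user-defined object would accept all three operations, which is one visible sign that Closure is not an ordinary object.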
      
      







The manual says:

In addition to the methods described here, this class also has an __invoke method. This method is necessary only for compatibility with other classes that implement magic call, since this method is not used when calling a function.


And this is true. For a closure, the get_closure handler does not return __invoke but the very function from which the closure was created.
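From userland both forms are available and give the same result. This small sketch cannot show which path the engine takes internally; it only illustrates that __invoke exists on a closure for consistency, as the manual says.

```php
<?php
$square = function (int $n): int { return $n * $n; };

$direct  = $square(4);            // ordinary call on the variable
$viaName = $square->__invoke(4);  // explicit call of the method named in the manual

var_dump($direct, $viaName);
```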



You can study the sources in more detail yourself, in the file Zend/zend_closures.c, and we will move on to the next opcode.



DECLARE_LAMBDA_FUNCTION



This opcode, by contrast, exists exclusively for closures and is used for nothing else. Under the hood, its handler performs three main operations:

  1. A pointer to the compiled function that will form the body of the closure is looked up.
  2. The context of the closure is determined (in other words, $this).
  3. Based on the first two points, an object of the Closure class is created.
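The second step above can be observed through reflection. This is a minimal sketch; the Counter class and its methods are hypothetical and serve only as an illustration.

```php
<?php
class Counter {
    private $count = 0;

    public function incrementer(): Closure {
        // Created inside an instance method: $this is captured automatically.
        return function (): int { return ++$this->count; };
    }

    public function labeler(): Closure {
        // A static closure keeps the class scope but carries no $this.
        return static function (): string { return self::class; };
    }
}

$counter = new Counter();
$inc     = $counter->incrementer();
$label   = $counter->labeler();

$inc();
$second = $inc(); // the closure mutates the object it was created in

$incInfo   = new ReflectionFunction($inc);
$labelInfo = new ReflectionFunction($label);

var_dump($second);
var_dump($incInfo->getClosureThis() === $counter);
var_dump($labelInfo->getClosureThis() === null);
var_dump($labelInfo->getClosureScopeClass()->getName());
```

Declaring a closure as static is the usual way to say that no object context is needed.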




And this is where the not-so-pleasant news begins.



So what's wrong with anonymous functions?



Creating a closure is a more expensive operation than creating an ordinary object. Not only is the standard object creation mechanism invoked, but a certain amount of extra logic is added on top of it, the most unpleasant part of which is copying the entire array of opcodes of your function into the body of the closure. This in itself is not so scary, but only until you start using closures “incorrectly”.



To understand exactly where the problems lie in wait, let's go through the cases in which a closure is created.

A closure is created anew:

a) each time the DECLARE_LAMBDA_FUNCTION opcode is processed.

This is exactly the case where a closure looks natural, yet a new closure is created on every iteration of the loop.

 foreach($values as $value){
     doSomeStuff($value, function($args) { /* closure body */ });
 }
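A minimal sketch of the difference, using object identity to tell closures apart. The variable names are made up for illustration, and the closures are kept in arrays so that all of them stay alive for the comparison.

```php
<?php
$values = [1, 2, 3];

// Closure literal inside the loop: one new Closure object per iteration.
$perIteration = [];
foreach ($values as $value) {
    $perIteration[] = function ($x) { return $x * 2; };
}

// Closure created once before the loop: every iteration reuses the same object.
$double  = function ($x) { return $x * 2; };
$hoisted = [];
foreach ($values as $value) {
    $hoisted[] = $double;
}

var_dump($perIteration[0] === $perIteration[1]); // distinct objects
var_dump($hoisted[0] === $hoisted[1]);           // one shared object
```

When the closure does not depend on anything that changes between iterations, moving it out of the loop avoids the repeated creation.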
      
      





b) each time the bind and bindTo methods are called:

Here the closure is created again on every iteration.

 $closure = function($args) { /* closure body */ };
 foreach($objects as $object){
     // bindTo() returns a new closure and leaves the original unchanged
     $bound = $closure->bindTo($object);
     $object->doSomeStuff($bound);
 }
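The following sketch shows that bindTo returns a separate Closure object instead of modifying the one it is called on. The User class is hypothetical.

```php
<?php
class User {
    private $name;

    public function __construct(string $name) {
        $this->name = $name;
    }
}

// Defined outside any class, so it has no $this until it is bound.
$getName = function (): string { return $this->name; };

$alice = new User('Alice');

// The second argument sets the class scope, which grants access to private members.
$bound = $getName->bindTo($alice, User::class);

var_dump($bound === $getName); // a different object
var_dump($bound());
```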
      
      





c) each time the call method is called, if the closure's function is a generator. If it is an ordinary function rather than a generator, only the part that copies the array of opcodes is executed. So it goes.
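For reference, a minimal sketch of what Closure::call does from the user's side (available since PHP 7.0): it binds the closure to the given object for a single invocation and runs it immediately. The Account class is hypothetical.

```php
<?php
class Account {
    private $balance = 100;
}

$withBonus = function (int $bonus): int { return $this->balance + $bonus; };

// Temporarily binds $this (and the class scope) to the Account instance, then calls.
$total = $withBonus->call(new Account(), 5);

// The original closure stays unbound after the call.
$stillUnbound = (new ReflectionFunction($withBonus))->getClosureThis() === null;

var_dump($total, $stillUnbound);
```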



Conclusions



If performance at any cost is not your priority, anonymous functions are convenient and pleasant to use. If it is, they are probably not worth it.



In any case, now you know that closures and loops, if not prepared correctly, make for quite a combination.



Thanks for your attention!