How does PHP load its extensions?

You probably already know that native PHP extensions are compiled into *.so files on Unix-like systems and into *.dll files on Windows, and that the global php.ini file lists the extensions that PHP loads. This means that if you're building your own extension, you are also going to create such a *.so or *.dll file, and you have to update the PHP configuration file so that PHP loads your extension.

Where to find your PHP configuration files?

If for one reason or another you cannot find the PHP configuration file(s) on your system, you can run the following command from the command line:

        php --ini

This will output a list of all configuration files that are loaded by PHP. Extensions are enabled by adding "extension=name.so" lines to the configuration file - where 'name' should of course be replaced by the name of your extension. A default PHP installation already comes with many default extensions, so in the configuration file(s) on your system you will certainly find a number of these "extension=name.so" lines.

The extension lines either take an absolute path ("extension=/path/to/extension.so") or a relative path ("extension=extension.so"). If you'd like to use relative paths, you must make sure that you've copied your extension *.so file to the default extension directory, so that PHP can find it. To find out this default extension directory, use the following command line instruction:

        php -i|grep extension_dir

The extension dir often has the form /usr/lib/php5/20121212 - or a different date string depending on the PHP version you use.

The get_module() startup function

Before we explain how you can create your own extension, we first explain what PHP does to load an extension. When PHP starts, it loads the *.ini configuration file(s) that we just described and for each "extension=name.so" line in these files, it opens the appropriate library, and calls the "get_module()" function from it. Each extension library (your extension too) must therefore define and implement this "get_module()" C function. The function is called by PHP right after the library is loaded (and thus way before pageviews are handled), and it should return a memory address that points to a structure that holds information about all functions, classes, variables and constants that are made available by the extension.
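To make this loading step more tangible, the sketch below shows roughly what a host program does on a POSIX system: it opens a shared library and looks up a function named "get_module" in it. This is a simplified illustration and not the actual PHP source code. The function name load_extension() is made up for this example, and the sketch assumes a system that offers dlopen() and dlsym() (on some systems you have to link with -ldl).

```cpp
#include <dlfcn.h>   // POSIX dynamic loading: dlopen, dlsym, dlclose

// Signature of the entry point that every extension library exports.
using get_module_fn = void *(*)();

// Hypothetical helper: open a shared library and call its get_module().
// Returns the address reported by the library, or nullptr when the library
// cannot be opened or does not export a get_module symbol.
void *load_extension(const char *path)
{
    // Load the library into the current process.
    void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == nullptr) return nullptr;

    // Look up the entry point by its plain, unmangled name.
    void *symbol = dlsym(handle, "get_module");
    if (symbol == nullptr)
    {
        dlclose(handle);
        return nullptr;
    }

    // The library is deliberately left open: the structure that the
    // entry point returns lives inside the loaded library.
    return reinterpret_cast<get_module_fn>(symbol)();
}
```

Because the lookup uses a fixed string, the library must export a symbol with exactly that name, which is why the entry point has to be a plain C function.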

The structure that get_module() returns is defined in the header files of the Zend engine, but it is a pretty complicated structure without good documentation. Luckily, the PHP-CPP library makes life easier for you, and offers an Extension class that can be used instead:

        #include <phpcpp.h>
        
        /**
         *  tell the compiler that the get_module is a pure C function
         */
        extern "C" {
            
            /**
             *  Function that is called by PHP right after the PHP process
             *  has started, and that returns an address of an internal PHP
             *  structure with all the details and features of your extension
             *
             *  @return void*   a pointer to an address that is understood by PHP
             */
            PHPCPP_EXPORT void *get_module() 
            {
                // static(!) Php::Extension object that should stay in memory
                // for the entire duration of the process (that's why it's static)
                static Php::Extension myExtension("my_extension", "1.0");
                
                // @todo    add your own functions, classes, namespaces to the extension
                
                // return the extension
                return myExtension;
            }
        }

In the example above you see a very straightforward implementation of the get_module() function. Every PHP extension that uses the PHP-CPP library implements this function in a more or less similar way, and it is the starting point of each extension. A number of elements require special attention. For a start, the only header file that you see is the phpcpp.h header file. If you're using the PHP-CPP library to build your own extensions, you do not have to include the complicated, unstructured, and mostly undocumented header files of the Zend engine - all you need is this single phpcpp.h header file of the PHP-CPP library. If you insist, you are of course free to also include the header files of the core PHP engine - but you do not have to. PHP-CPP takes care of dealing with the internals of the PHP engine, and offers you a simple to use API.

The next thing that you'll notice is that we placed the get_module() function inside an 'extern "C"' code block. As the name of the library already gives away, PHP-CPP is a C++ library. However, PHP expects your library, and especially your get_module() function, to be implemented in C and not in C++. That's why we've wrapped the get_module() function in an 'extern "C"' block. This instructs the C++ compiler that get_module() is a regular C function, and that it should not apply any C++ name mangling to it.
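The difference that 'extern "C"' makes can be seen in a small stand-alone sketch. The two function names below are made up for this illustration and have nothing to do with PHP-CPP. The mangled name in the comment is what GCC and Clang typically produce on Linux; other compilers use other schemes.

```cpp
// Regular C++ function: the exported symbol name encodes the signature.
// With GCC or Clang on Linux it typically looks like "_Z14mangled_answerv".
int mangled_answer() { return 42; }

// Function with C linkage: the exported symbol name is exactly
// "plain_answer", so it can be looked up by a fixed string.
extern "C" int plain_answer() { return 42; }
```

Both functions behave identically when called from C++. Only the name under which they appear in the compiled library differs, which you can inspect with a tool like nm.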

The PHP-CPP library defines a "PHPCPP_EXPORT" macro that should be placed in front of the get_module() function. This macro makes sure that the get_module() function is publicly exported, and thus callable by PHP. The macro has a different implementation based on the compiler and operating system.

This, by the way, is also the only macro that PHP-CPP offers. PHP-CPP intends to be a very straightforward C++ library, without using magic or tricks from pre-processors. What you see is what you get: If something looks like a function, you can be sure that it actually IS a function, and when something looks like a variable, you can be sure that it also IS a variable.
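To give an idea of what an export macro does, here is a sketch of how such a macro is commonly written. It is not a copy of the PHP-CPP definition: the name MY_EXPORT and the function my_entry_point() are hypothetical, and the exact conditions inside PHPCPP_EXPORT may differ.

```cpp
// Hypothetical macro for illustration; PHP-CPP's own macro is PHPCPP_EXPORT.
#if defined(_WIN32) || defined(__CYGWIN__)
    // Windows: mark the symbol for export from the DLL.
    #define MY_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
    // GCC and Clang: keep the symbol visible, even when the library
    // is compiled with -fvisibility=hidden.
    #define MY_EXPORT __attribute__((visibility("default")))
#else
    // Unknown compiler: fall back to the default behavior.
    #define MY_EXPORT
#endif

// An exported entry point with C linkage.
extern "C" MY_EXPORT void *my_entry_point()
{
    static int placeholder = 0;   // stand-in for real extension data
    return &placeholder;
}
```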

Let's move on. Inside the get_module() function the Php::Extension object is instantiated, and it is returned. It is crucial that you make a static instance of this Php::Extension class, because the object must exist for the entire lifetime of the PHP process, and not only for the duration of the get_module() call. The constructor takes two arguments: the name of your extension and its version number.
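Why the static keyword matters can be shown without PHP-CPP at all. The sketch below uses a made-up FakeExtension struct as a stand-in for the real extension object: a function-local static is constructed on the first call and is only destroyed when the process exits, so its address stays valid after the function has returned.

```cpp
#include <string>

// Stand-in for an extension object; this is not the PHP-CPP class.
struct FakeExtension
{
    std::string name;
    std::string version;
};

// The static object is created once, on the first call, and lives
// until the process ends. Every call returns the same address.
FakeExtension *entry_point()
{
    static FakeExtension extension{"my_extension", "1.0"};
    return &extension;
}
```

Without the static keyword the object would be destroyed as soon as the function returns, and the caller would be left with a pointer to memory that is no longer valid.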

The final step in the get_module() function is that the extension object is returned. This may seem strange at first, because the get_module() function is supposed to return a pointer-to-void, and not a full Php::Extension object. Why does the compiler not complain about this? Well, the Php::Extension class has a cast-to-void-pointer-operator. So although it seems that you're returning the full extension object, in reality you only return a memory address that points to a data structure that is understood by the core PHP engine and that holds all the details of your extension.
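The mechanism behind this is a regular C++ conversion operator. The sketch below demonstrates it with two made-up types: ModuleEntry plays the role of the structure that the engine understands, and Wrapper plays the role of the extension object. Neither is taken from PHP-CPP; they only show how returning an object from a function that returns void* can compile.

```cpp
// Stand-in for the data structure that the host program understands.
struct ModuleEntry
{
    const char *name;
    const char *version;
};

// Stand-in for the extension object.
class Wrapper
{
public:
    Wrapper(const char *name, const char *version) : _entry{name, version} {}

    // Conversion operator: allows a Wrapper to be used where a void*
    // is expected, and yields the address of the wrapped structure.
    operator void *() { return &_entry; }

private:
    ModuleEntry _entry;
};

// The return statement names the object, but the caller receives
// the address produced by the conversion operator.
void *entry_point_via_cast()
{
    static Wrapper wrapper("my_extension", "1.0");
    return wrapper;
}
```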

Note that the get_module() implementation shown so far does not yet export any native functions or native classes to PHP - it only creates the extension. Exporting them is the next step.

Exporting native functions

An extension can of course only be useful if you define functions and/or classes that can be accessed from PHP scripts. For functions you can do this by adding your native function implementations to the Extension object:

        #include <phpcpp.h>
        
        extern void example1();
        extern void example2(Php::Parameters &params);
        extern Php::Value example3();
        extern Php::Value example4(Php::Parameters &params);
        
        extern "C" {
            PHPCPP_EXPORT void *get_module() {
                static Php::Extension myExtension("my_extension", "1.0");
                myExtension.add("native1", example1);
                myExtension.add("native2", example2);
                myExtension.add("native3", example3);
                myExtension.add("native4", example4);
                return myExtension.module();
            }
        }

What do we see here? We've added four function declarations ("example1", "example2", "example3" and "example4") to the source code of our extension. The reason why we've only declared the functions, and not fully implemented them is to keep the example code relatively small. We assume that the four example functions are implemented in a different file. In a real world example you could just as well remove the "extern" keyword and implement the four functions in the same source file as the get_module() call.

The four functions all have a different signature: Some return a value, while others do not return anything. And some take parameters, while others do not. Despite the different signature of the functions, they can all be made available in PHP by adding them to the extension object, by simply calling the myExtension.add() method. This method takes two parameters: the name by which the function should be accessible in PHP, and the actual native function.

In the example above we've used different names for the native functions ("example1" up to "example4") than for the PHP functions ("native1" to "native4"). This is legal - you do not have to use the same names for your native functions as for your PHP functions. The following PHP script can be used to call the four native functions:

        <?php
        native1();
        native2("a","b");
        $x = native3();
        $y = native4(1,2);
        ?>

It is not possible to export every conceivable C/C++ function to PHP. Only functions that have one of the four supported signatures can be added to the extension object: functions that return either void or a Php::Value object, and that accept either a Php::Parameters object or no parameters at all.
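How a single add() method can accept four different signatures, and reject all others, can be sketched with plain C++ overloading. The code below is an illustration under assumptions and not the PHP-CPP implementation: Registry, Value and Parameters are simplified stand-ins (Value is just an int here), and the way the library stores its callbacks internally may be entirely different.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins; these are not the PHP-CPP types.
using Value = int;
using Parameters = std::vector<Value>;

class Registry
{
public:
    // One overload per supported signature. Each overload wraps the
    // function into one common form, so that all can be stored together.
    void add(const std::string &name, void (*fn)())
    {
        _functions[name] = [fn](Parameters &) { fn(); return Value{}; };
    }
    void add(const std::string &name, void (*fn)(Parameters &))
    {
        _functions[name] = [fn](Parameters &params) { fn(params); return Value{}; };
    }
    void add(const std::string &name, Value (*fn)())
    {
        _functions[name] = [fn](Parameters &) { return fn(); };
    }
    void add(const std::string &name, Value (*fn)(Parameters &))
    {
        _functions[name] = fn;
    }

    // Call a registered function by its exported name.
    Value call(const std::string &name, Parameters params)
    {
        return _functions.at(name)(params);
    }

private:
    std::map<std::string, std::function<Value(Parameters &)>> _functions;
};

// Two sample functions with different signatures.
Value forty_two() { return 42; }

Value sum(Parameters &params)
{
    Value total = 0;
    for (Value value : params) total += value;
    return total;
}
```

In this sketch, a function with any other signature, for example one that takes two integers, matches none of the overloads, and the compiler refuses the add() call.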

Parameter types

PHP has a mechanism to enforce function parameter types, and to accept parameters either by reference or by value. In the examples above, we have not used that mechanism yet: it is up to the function implementations themselves to inspect the 'Parameters' object, and check if the variables are of the right type.

However, the 'Extension::add()' method takes a third optional parameter that you can use to specify the number of parameters that are supported, whether the parameters are passed by reference or by value, and what the type of the parameters is:

        #include <phpcpp.h>
        
        extern void example(Php::Parameters &params);
        
        extern "C" {
            PHPCPP_EXPORT void *get_module() {
                static Php::Extension myExtension("my_extension", "1.0");
                myExtension.add("example", example, {
                    Php::ByVal("a", Php::Type::Numeric),
                    Php::ByVal("b", "ExampleClass"),
                    Php::ByRef("c", "OtherClass")
                });
                return myExtension.module();
            }
        }

Above you see that we passed in additional information when we registered the "example" function. We tell our extension that our function accepts three parameters: the first parameter must be a regular number, while the other two are object instances of type "ExampleClass" and "OtherClass". In the end, your native C++ "example" function will still be called with a Php::Parameters instance, but the moment it gets called, you can be sure that the Php::Parameters object is filled with three members, that the second and third members are objects of the appropriate class, and that the third one is passed by reference.

Working with variables

Variables in PHP are non-typed. A variable can thus hold a value of any type: an integer, a string, a floating point number, and even an object or an array. C++ on the other hand is a typed language. In C++ an integer variable always has a numeric value, and a string variable always holds a string value.

When you mix native code and PHP code, you will need to convert the non-typed PHP variables into native variables, and the other way round: convert native variables back into non-typed PHP variables. The PHP-CPP library offers the "Value" class that makes this a very simple task.
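The general idea behind such a class can be sketched in a few lines of plain C++. The DynamicValue class below is a hypothetical, heavily simplified stand-in that only supports integers and strings. It is not the PHP-CPP "Value" class, and its method names are made up for this example. It merely shows how one object can remember at runtime which type it holds, and convert to a native type on request.

```cpp
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical stand-in for a non-typed variable; not the PHP-CPP class.
class DynamicValue
{
public:
    enum class Type { Integer, String };

    // The constructor that is used decides which type the value holds.
    DynamicValue(long number) : _type(Type::Integer), _number(number) {}
    DynamicValue(std::string text) : _type(Type::String), _text(std::move(text)) {}

    // The type that is currently stored.
    Type type() const { return _type; }

    // Convert to a native integer. A string is parsed as a decimal
    // number; text without leading digits yields 0.
    long toInteger() const
    {
        if (_type == Type::Integer) return _number;
        return std::strtol(_text.c_str(), nullptr, 10);
    }

    // Convert to a native string.
    std::string toString() const
    {
        if (_type == Type::String) return _text;
        return std::to_string(_number);
    }

private:
    Type _type;
    long _number = 0;
    std::string _text;
};
```

With this sketch, a value constructed from the number 42 can be read back as the string "42", and a value constructed from the string "123" can be read back as the integer 123.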