Writing a Clarens server-side module runs the gamut from trivial to complex. Complexity mainly stems from the multi-process architecture of the Apache server, dealing with persistent database connections, and of course security considerations.
No specific Python programming experience is required; following and modifying
the example should be sufficient for most purposes. For more information on
interacting with Apache in this environment, see the
mod_python[6] documentation.
This module has only one method, echo.echo, which simply returns its arguments.
The module is a standard Python module script, stored in
$clarens-toplevel/echo/__init__.py
where it will be picked up by the Clarens server automatically when any method in the
module is invoked.
| PYTHON | |
|
This imports the system and clarens_util modules, as well as the Apache part of mod_python.
| PYTHON | |
|
This defines the echo function, with its arguments:
Next, all methods should be documented so that they can be discovered by remote clients:
| PYTHON | |
|
For our simple method, we construct a response and write the response to the client:
| PYTHON | |
|
Note the Python language structure of indenting the contents of the function. Finally, return from the function:
| PYTHON | |
|
This lets the Apache server continue with its processing chain (including possibly compressing the output).
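The steps above (define the function, document it, write a response, return) can be tried out without a running Apache server. The following is only a minimal sketch under stated assumptions: FakeRequest is a hypothetical stand-in for the mod_python request object, OK stands in for apache.OK, the echo(req, method_name, args) signature is assumed for illustration, and the response is encoded with xmlrpc.client (the Python 3 name of xmlrpclib) rather than with any Clarens helper.

```python
import xmlrpc.client  # Python 3 name of the xmlrpclib module

OK = 0  # stand-in for mod_python's apache.OK


class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.body = []

    def write(self, data):
        # Collect everything that would be sent to the client.
        self.body.append(data)


def echo(req, method_name, args):
    """Return the arguments unchanged to the caller."""
    # Encode the arguments as an XML-RPC method response.
    response = xmlrpc.client.dumps((args,), methodresponse=True)
    req.write(response)
    # Hand control back to the server's processing chain.
    return OK


req = FakeRequest()
status = echo(req, "echo.echo", ["hello", 42])
returned, _ = xmlrpc.client.loads("".join(req.body))
print(status, returned[0])
```

Decoding the captured body with xmlrpc.client.loads is a quick way to check that the method wrote a well-formed response before deploying it.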
| PYTHON | |
|
The methods_list variable is a dictionary whose key is a string, 'echo', identifying the method, and whose value is the callable method object that we defined earlier.
The methods_sig variable is another dictionary that describes the echo method signature, with its data being a list of possible combinations of return value and arguments. Each list element is a comma-separated string, with the first value being the type of the return value, and the following values being the types of the arguments. E.g.:
['string,string']
is a list with one element for a method that takes a string as argument, and returns a string.
The echo method is polymorphic: each argument type is the same as the return type.
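The two dictionaries described above can be illustrated as follows. This is a sketch only: the echo function here is a simplified stand-in, the second signature string 'int,int' is an illustrative guess at how a polymorphic method might be listed, and split_signature is a hypothetical helper that is not part of Clarens.

```python
def echo(*args):
    """Simplified stand-in for the echo method defined earlier."""
    return args


# Method name -> callable, as described for methods_list.
methods_list = {"echo": echo}

# Method name -> list of signature strings, as described for methods_sig.
# The second entry is illustrative only.
methods_sig = {"echo": ["string,string", "int,int"]}


def split_signature(signature):
    """Split 'ret,arg1,arg2' into (return_type, [argument types])."""
    parts = [p.strip() for p in signature.split(",")]
    return parts[0], parts[1:]


for sig in methods_sig["echo"]:
    return_type, arg_types = split_signature(sig)
    print(return_type, arg_types)
```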
| PYTHON | |
|
Then start the Apache server with only one process:
$opkg_root/sbin/httpd -X -f /opt/openpkg/etc/apache2/httpd.conf
Any request to the mod_python handler will cause the Python debugger
prompt to appear on the terminal from which the server was started.
Remember to remove the PythonEnablePdb On directive again when debugging
is finished, otherwise an error will be reported by the server as
follows:
Handler 'clarens_server' returned invalid return code.
As of version 0.6.9 of the clarens-server package, any exceptions generated by the server during the execution of the server-side module code above will be reported to the client. The amount of information returned depends on the value of the PythonDebug directive in the server configuration. The value of this directive can be set using the clarens-server-config utility described in section 2.3.2.
If debugging is turned on, a traceback of the code, along with the caller's identity and the client machine's IP address, is sent back. Consider the code snippet below, which uses the standard Python xmlrpclib module:
| PYTHON | |
|
The code tries to call the newservice.amethod server-side method. Imagine that the code that implements the module generates a division by zero error. In that case the above code could print something like the following:
| output | |
|
This would give the service developer enough information to know that there is a problem in line 33 of the code for the newservice module.
The raw XML message looks like this:
| XML | |
|
If the PythonDebug directive is turned off, the output will look like this:
| output | |
|
This is obviously not very helpful for the developer, but does hide some information from any potential attackers.
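Whichever form the server sends, a client sees the error as an XML-RPC fault. The sketch below shows how a client might catch one without contacting a server: it encodes a fault locally and decodes it again. It uses xmlrpc.client, the Python 3 name of xmlrpclib, and the fault code and message are invented for illustration; the real values come from the server.

```python
import xmlrpc.client  # Python 3 name of the xmlrpclib module

# Hypothetical fault payload; the real code and text come from the server.
wire = xmlrpc.client.dumps(
    xmlrpc.client.Fault(1, "integer division or modulo by zero")
)

try:
    # loads() raises Fault for a fault response, just as a proxy call would.
    xmlrpc.client.loads(wire)
    outcome = None
except xmlrpc.client.Fault as fault:
    outcome = (fault.faultCode, fault.faultString)

print(outcome)
```

A real client would wrap the remote method call itself in the same try/except block.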
| PYTHON | |
|
The signature of the method is build_fault(req,method_name,error_code,error_string). Always return apache.OK from the function, otherwise Apache will generate its own error message in text/html format, which may not be handled elegantly by all clients. If you do not want to supply your own error handling code, the Clarens server will also catch exceptions, and send a generic error message to the client.
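The pattern of reporting a fault and still returning OK can be sketched as follows. Everything here is a stand-in: build_fault below is a hypothetical reimplementation that only mirrors the argument order given above (the real function may format its output differently), FakeRequest replaces the mod_python request object, OK replaces apache.OK, and the divide method and error code 1 are invented for illustration.

```python
import xmlrpc.client  # Python 3 name of the xmlrpclib module

OK = 0  # stand-in for mod_python's apache.OK


class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.body = []

    def write(self, data):
        self.body.append(data)


def build_fault(req, method_name, error_code, error_string):
    """Hypothetical stand-in using the argument order given above."""
    message = "%s: %s" % (method_name, error_string)
    req.write(xmlrpc.client.dumps(xmlrpc.client.Fault(error_code, message)))


def divide(req, method_name, args):
    """Divide the first argument by the second."""
    try:
        result = args[0] / args[1]
    except ZeroDivisionError as exc:
        # Report the problem as an XML-RPC fault, then still return OK
        # so that the web server does not substitute its own error page.
        build_fault(req, method_name, 1, str(exc))
        return OK
    req.write(xmlrpc.client.dumps((result,), methodresponse=True))
    return OK


req = FakeRequest()
status = divide(req, "newservice.amethod", [1, 0])
print(status, "<fault>" in "".join(req.body))
```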
The Clarens server augments this behaviour by also importing the service modules upon server startup, to allow some initialization to be done if needed. Examples of this include populating a database of known services and methods, starting a process that advertises the services offered using a discovery service, and making sure the file and method access control lists are loaded in the database for quicker access.
The logical way for services to initialize global variables, database connections etc. would be to put the initialization code in the module's global name space, so that it can be executed when the module is imported. The reality is that each module may be imported multiple times, e.g. by the main Clarens server as well as by other server modules.
To handle this in a more coordinated fashion, the main Clarens server will try to import all the modules it knows about once per process. It will then call a method named _startup_init in each module with three arguments:
| PYTHON | |
|
See section 5.6 below for an explanation of what the above code achieves.
The above initialization code will be called once per process, which is quite useful for database connections that cannot be shared between processes. In many cases, however, the initialization code needs to be called only once when the server starts up. The ACL database update is a good example of that.
In that case it is prudent to protect the initialization code with a global lock that precludes concurrent access by multiple processes. This can be achieved in several ways, one of which is to lock files. This method is fairly portable and has the desirable property that such locks are released when a process terminates, reducing the probability of a deadlock occurring.
The code in the example above may be protected with such a lock as follows:
| PYTHON | |
|
The ACL initialization code will only be called once per server startup. Of course a real implementation would also add some warnings to the exception handler code to warn of other exceptions that may occur when an attempt is made to open the lockfile.
Note that in the above code we do not close the lockfile object so that the lock is held for the entire time while the process is running.
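The file-locking approach can be sketched in a self-contained form as follows. This is an illustration under stated assumptions rather than the Clarens code: it uses fcntl.flock (available on Unix-like systems only), the lock path and the run_once helper are hypothetical, and the second caller is simulated within one process, whereas in a real server it would be another Apache process.

```python
import fcntl
import os
import tempfile

# Hypothetical lock file location; a real service would use a fixed path.
LOCK_PATH = os.path.join(tempfile.mkdtemp(), "init.lock")

_held_locks = []  # keep lock files open so the locks are not released


def run_once(init_func, lock_path=LOCK_PATH):
    """Run init_func only if this caller obtains the exclusive lock."""
    lockfile = open(lock_path, "w")
    try:
        # Non-blocking: fail immediately if the lock is already held.
        fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        lockfile.close()
        return False
    _held_locks.append(lockfile)  # deliberately never closed
    init_func()
    return True


calls = []
first = run_once(lambda: calls.append("acl update"))
second = run_once(lambda: calls.append("acl update"))
print(first, second, calls)
```

Because flock locks belong to the open file rather than to the process, the second open of the same path is refused even inside one process, which makes the behaviour easy to check in isolation.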
It is often useful for services to store data in an organized way that may not quite need the full power of an RDBMS, but would be tedious to implement using flat files.
Clarens provides a high-performance key-value datastore for this purpose, which is also used to store most of the server's internal data structures. The tdb database stores key-value mappings per file, with only one open connection per file possible for a single process. For this reason opening and closing database handles needs to be done in a coordinated fashion to prevent deadlocks and server process crashes.
The clarens_util module provides a method for opening and maintaining a registry of tdb database instances. The method register_dbs should be called with three arguments:
| PYTHON | |
|
This example starts by importing two supporting modules that contain the utility methods and main configuration respectively.
| PYTHON | |
|
The data can be deserialized as follows:
| PYTHON | |
|
The user_nonce value is the authentication username, obtainable via
| PYTHON | |
|
clarens_util.err_msg("myvalue=%s\n"%myvalue)
Exceptions are usually handled by the modules themselves, which causes Python to log error messages to the log file. If a module fails to load in the first place, Clarens handles the resulting exception, and only reports that the module failed to load.
To log the exception in full in the logs, the exception must be re-raised in the file
system/__init__.py, at around line 998, after the message Failed to load the module %s is printed. Just add the statement
raise
to the exception handler.
In future a configuration switch will be provided to do this automatically.
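The effect of adding a bare raise to an exception handler can be seen in isolation in the sketch below. The load_module function is a hypothetical stand-in and not the code in system/__init__.py; it only reuses the log message quoted above.

```python
import sys


def load_module(name, importer=__import__):
    """Try to import a module; log a short message and re-raise on failure."""
    try:
        return importer(name)
    except Exception:
        sys.stderr.write("Failed to load the module %s\n" % name)
        # A bare raise re-raises the active exception with its traceback,
        # so the full error reaches whatever logs uncaught exceptions.
        raise


try:
    load_module("no_such_module_xyz")
    caught = None
except ImportError as exc:
    caught = exc

print(type(caught).__name__)
```

Without the raise statement the handler would swallow the exception, and only the short message would appear in the log.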