If you are building multiple related software projects with a continuous integration server, one important aspect is being notified when changes in an upstream job break the build or the tests of a downstream job. This requires knowing exactly which build numbers of the upstream and the downstream job are involved.
The Jenkins continuous integration server uses the notion of file fingerprints for this purpose. The upstream job is built by Jenkins and produces one or several so-called artifacts, the results of the build process. Jenkins archives the artifacts and creates a fingerprint (hash sum) for each of them, which is stored along with the build number of the job. When the downstream job starts to build, it downloads the (most recent) artifacts from the upstream job and uses them for its purposes, i.e. building and running its own source code. By comparing the fingerprints of the downloaded artifacts with the stored fingerprints, Jenkins knows which version of each upstream job was involved in a build and can track which upstream build number broke the downstream job. Jenkins will only issue notifications if this fingerprinting mechanism is properly configured; triggering one build after another is not sufficient to receive these notifications. Moreover, the Blame Upstream Committers plugin needs to be used and enabled for each downstream job, or the global property hudson.upstreamCulprits (will this ever be renamed?) needs to be set.
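The matching step can be illustrated with a small Python sketch. This is not Jenkins code: the function names, the in-memory dictionary and the build numbers are made up for illustration, and the only assumption about Jenkins is that a fingerprint is an MD5 checksum of the artifact's bytes.

```python
import hashlib

# Hypothetical stand-in for the fingerprint records Jenkins keeps:
# fingerprint -> (job name, build number) that first produced the file.
fingerprint_db = {}


def fingerprint(data: bytes) -> str:
    """Return the MD5 hex digest of an artifact's contents."""
    return hashlib.md5(data).hexdigest()


def archive_artifact(job: str, build_number: int, data: bytes) -> None:
    """Record the fingerprint when the upstream job archives an artifact."""
    fingerprint_db.setdefault(fingerprint(data), (job, build_number))


def upstream_build_of(data: bytes):
    """Look up which upstream build produced a downloaded artifact."""
    return fingerprint_db.get(fingerprint(data))


# Two builds of upstream job A archive different artifacts.
archive_artifact("A", 41, b"contents of A.tar.gz from build 41")
archive_artifact("A", 42, b"contents of A.tar.gz from build 42")

# The downstream job fingerprints the archive it downloaded.
downloaded = b"contents of A.tar.gz from build 42"
print(upstream_build_of(downloaded))  # ('A', 42)
```

The lookup only works if the downstream job hashes exactly the same bytes that the upstream job archived, which is why the rest of this post insists on fingerprinting the archive file itself.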
The rationale behind this rather complex mechanism is that it enables a high degree of parallelism when building jobs. While the downstream job builds, the upstream job can already run again without affecting the downstream job. That would not be possible if, e.g., a central installation location were shared between both jobs: if the upstream job installs new files while the downstream job is still building, this will almost certainly result in hard-to-debug errors. Moreover, the mechanism also allows running the downstream job on a different build slave (assuming similar systems), which would not be possible either with a central installation location in a file system.
For Java projects (where Jenkins comes from) the mechanism explained above usually works well. The upstream job produces one or several jar files containing all resources of the project, like images, and fingerprints them. No preprocessor is involved that configures the Java code according to the installation setup, and no source code is generated based on this setup. For C++ projects this is usually different, because the language already includes a preprocessor and it is common practice to set certain code lines according to the installation location, e.g. to find additional files like images, because there is no jar file to package them in. Also, C++ projects usually consist of many more files than Java projects once all headers are counted. This provides more chances to mix something up.
So assume an upstream C++ job A that is built in Jenkins (using CMake; other build solutions are not covered in this post, but the techniques can be applied there, too). It will usually be configured with an installation location, e.g. inside the job’s workspace, like /jenkins/workspace/A/install.
- Often, CMake will use this location, e.g. to generate a config.h which states that images are found at /jenkins/workspace/A/install/share/A/images etc.
- To use the CMake dependency mechanism, it will generate an AConfig.cmake file and also install it to the share folder (cf. the CMake documentation for find_package). The file might look like:
SET(A_LIBRARIES "/jenkins/workspace/A/install/lib/libA.so")
SET(A_INCLUDE_DIRS "/jenkins/workspace/A/install/include/A")
After building the project, the job will e.g. use a compression tool to pack all contents of /jenkins/workspace/A/install/ into a single archive, archive this artifact and generate the fingerprint for it.
Both issues mentioned above will prevent the dependency tracking of Jenkins from functioning properly, because the downstream job will download the artifacts to its own workspace, e.g. to /jenkins/workspace/B/upstream/A, and unpack them there. In detail:
- The upstream project A will not find external files at all or will use wrong versions, because in the meantime a new build of job A might have started and hence the workspace of this job is currently changing.
- The downstream job B will not build at all or might use a wrong version of A because the contents of AConfig.cmake point to A’s workspace and not the downloaded artifacts.
To enable reliable dependency tracking in Jenkins, the solutions are:
- Do not use this technique (hard-coding the installation location, e.g. in config.h) at all. The software is generally more flexible if no hard-coded locations are assumed, and more situations are covered without recompiling.
- Make all paths given in the config file (AConfig.cmake) relative to its current location on disk. This will look like this:
GET_FILENAME_COMPONENT(CONFIG_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
SET(A_LIBRARIES "${CONFIG_DIR}/../../lib/libA.so")
SET(A_INCLUDE_DIRS "${CONFIG_DIR}/../../include/A")
Now the CMake script of B will use the correct downloaded headers, libraries etc. for A from its own workspace.
These two measures make it possible to use fingerprinting in Jenkins for dependency tracking with notifications for upstream committers. The first one in particular requires care while designing the project, but there is no other solution I can think of.
Please note that for executing any tests in downstream job B you also have to set LD_LIBRARY_PATH so that the right upstream libraries are found.
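If the tests are started from a script, the environment can be prepared as in the following Python sketch. The helper name make_test_env and the upstream/A/lib layout are assumptions for illustration, derived from the example paths used in this post; adapt them to wherever the upstream archive is actually unpacked.

```python
import os


def make_test_env(workspace, base_env=None):
    """Return a copy of the environment with the upstream lib dir first.

    Assumes (hypothetically) that the artifact of job A was unpacked to
    <workspace>/upstream/A, so its libraries are in <workspace>/upstream/A/lib.
    """
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(workspace, "upstream", "A", "lib")
    previous = env.get("LD_LIBRARY_PATH")
    # Prepend so the downloaded libraries win over any installed copy.
    env["LD_LIBRARY_PATH"] = lib_dir if not previous else lib_dir + os.pathsep + previous
    return env


env = make_test_env("/jenkins/workspace/B", {"LD_LIBRARY_PATH": "/opt/lib"})
print(env["LD_LIBRARY_PATH"])
# On Linux this prints: /jenkins/workspace/B/upstream/A/lib:/opt/lib

# The result would then be handed to the test runner, for example:
# subprocess.run(["ctest"], env=make_test_env(os.environ["WORKSPACE"]), check=True)
```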
Random Comments
Some more care needs to be taken to not mix up the dependency tracking again:
- The downstream job needs to make sure that the latest downloaded artifact is really used to build its own source code. So it is a good idea to simply remove the upstream directory as the first step of the build.
- The downloaded artifacts (as explained above the generated archive files) need to be kept after extracting them, because the downstream job also has to generate fingerprints for them (and not for the extracted files) to create a match with the fingerprints stored for the upstream job.
- In order to enable the downstream CMake project to find the upstream project, use the _DIR variable for the CMake call as defined in the CMake documentation, e.g.
-DA_DIR="${WORKSPACE}/upstream/A/share/A"
- If your upstream project contains a version or revision number in the extracted folder (e.g. ${WORKSPACE}/upstream/A-0.35/) and you want your downstream job to be resilient against version changes in the upstream project, you can use some find-magic on UNIX for automatically finding the folder:
A=`find "${WORKSPACE}/upstream" -maxdepth 1 -type d -name "A-*";`
- If you are using pkg-config instead of or in addition to the CMake config file mechanism, you can use the --define-variable command line argument to achieve similar flexibility, assuming that all your absolute paths depend on a single prefix variable in the pc file.
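The points about wiping the upstream directory, keeping the archive and locating a versioned folder can be combined in one preparation step. The following Python sketch shows one way to do it; the function name prepare_upstream and the directory layout (archive stored next to, not inside, the upstream directory) are assumptions for illustration, not something Jenkins prescribes.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_upstream(workspace, archive, prefix="A-"):
    """Unpack a fresh copy of the upstream artifact and return its folder.

    Assumes (hypothetically) that the archive lives outside the 'upstream'
    directory, so wiping that directory keeps the archive for fingerprinting.
    """
    upstream = Path(workspace) / "upstream"
    shutil.rmtree(upstream, ignore_errors=True)  # drop stale extracted files
    upstream.mkdir(parents=True)
    shutil.unpack_archive(str(archive), str(upstream))  # archive file is kept
    # Rough equivalent of: find upstream -maxdepth 1 -type d -name "A-*"
    matches = sorted(p for p in upstream.iterdir()
                     if p.is_dir() and p.name.startswith(prefix))
    if len(matches) != 1:
        raise RuntimeError(f"expected one {prefix}* folder, found {len(matches)}")
    return matches[0]


# Demo with a throwaway workspace and a fake upstream archive.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "build" / "A-0.35" / "share" / "A").mkdir(parents=True)
    archive = shutil.make_archive(str(tmp / "A"), "gztar", root_dir=str(tmp / "build"))
    a_dir = prepare_upstream(tmp / "workspace", archive)
    print(a_dir.name)              # A-0.35
    print(Path(archive).exists())  # True
```

The returned folder is what would be passed on to CMake, e.g. as the base for the -DA_DIR argument.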
Gcov Coverage Reports in Jenkins
I am currently evaluating the applicability and limitations of the Jenkins continuous integration server for C++ development. Besides several limitations which are mainly caused by the complexity of C++, Jenkins provides a solid basis for continuous integration of C++ projects.
One thing which I was not happy with so far was the missing integration of open-source coverage tools for Linux. Here, Gcov can be used to generate more or less precise coverage reports for projects compiled with GCC. Unfortunately, Gcov itself does not provide tools to export the results in any common or even nicely readable format. Until now, the only working solution I found was to use the Gcov front-end LCOV to generate an HTML report. This report is nice to read, but Jenkins cannot track it, so no trend report for the code coverage can be generated. Nevertheless, I’ve wrapped the creation of such an HTML report in a CMake function and worked with it so far.
Today, I searched again for cheap solutions to overcome this drawback (this means without writing a custom Jenkins plugin for Gcov coverage files). While searching the net, I found the gcovr script, which parses Gcov result files and is able to convert them into XML files that satisfy the format generated by Cobertura, which is a coverage tool for Java with an existing plugin for Jenkins.
As far as I tested it, this script works well with the Jenkins plugin, so I integrated the execution of this script in my existing coverage function for CMake, which is available in the RSC library. This library also contains additional CMake wrappers for tools that can be used to generate trend reports in Jenkins, like cppcheck. Now our Jenkins can also generate coverage trend reports for C++ projects.
Evaluation of Default Arguments in Python
Today I stumbled upon a very subtle problem with default arguments in Python. I noticed that loading a Python module already instantiated one of my classes even though I could not find an instantiation of this class. In the end it turned out to be a default argument for a function:
def foo(arg=MyClass()):
    pass
As I am currently programming a lot in C++, this did not look suspicious to me. But from Python’s point of view it absolutely makes sense that the default value for arg is already constructed at module load time and not only on a call to that function. Moreover, it is important to remember that this default argument is constructed only once for all calls to the function, as the Python documentation states.
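Both effects can be observed directly. In the sketch below, the instance counter on MyClass and the function bar are made up for illustration; bar shows the usual workaround of using None as the default and constructing the object inside the function.

```python
class MyClass:
    instances = 0  # counts how many objects have been constructed

    def __init__(self):
        MyClass.instances += 1


def foo(arg=MyClass()):  # default is evaluated once, when `def` is executed
    return arg


print(MyClass.instances)  # 1  (constructed before foo was ever called)
first = foo()
second = foo()
print(first is second)    # True  (every call shares the same default object)
print(MyClass.instances)  # 1


def bar(arg=None):
    # Construct the default lazily, once per call that needs it.
    if arg is None:
        arg = MyClass()
    return arg


print(bar() is bar())     # False  (a new object per call)
print(MyClass.instances)  # 3
```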
Boost bind and smart pointers
I’ve seen this cause trouble several times:
boost::shared_ptr<Foo> p(new Foo);
boost::thread t(boost::bind(&Foo::method, p.get()));
This prevents the lifecycle management of shared_ptr or any other smart pointer from being effective in the thread. Hence, the object owned by p may get destructed even though the thread is still active.
Boost bind can handle smart pointers, so instead use the smart pointer itself as the instance argument for bind:
boost::thread t(boost::bind(&Foo::method, p));
After switching my computers to Arch Linux, reusing the existing home directories, I had the problem that bash did not replace the user’s home directory with the tilde character (~) in the prompt. The solution to this problem is easy: somewhere in the bash internals a check tests whether the output of pwd matches the contents of $HOME. $HOME, in turn, is filled from the user definition in /etc/passwd. pwd always returns the current working directory without a trailing slash, but the entry for my user in /etc/passwd contained a trailing slash. Removing this slash from the file solves the problem and the prompt uses ~ again for the home directory.
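Why a single trailing slash is enough to break the abbreviation can be shown with a simplified model of such a check. The following Python sketch is not bash's actual implementation; the function prompt_dir and the user alice are hypothetical and only mimic the comparison described above.

```python
def prompt_dir(cwd, home):
    """Abbreviate cwd with '~' if it is the home directory or lies below it."""
    if cwd == home:
        return "~"
    if cwd.startswith(home + "/"):
        return "~" + cwd[len(home):]
    return cwd


# Home directory as it should be stored: no trailing slash.
print(prompt_dir("/home/alice", "/home/alice"))       # ~
print(prompt_dir("/home/alice/src", "/home/alice"))   # ~/src

# Home directory with a trailing slash: nothing matches any more.
print(prompt_dir("/home/alice", "/home/alice/"))      # /home/alice
print(prompt_dir("/home/alice/src", "/home/alice/"))  # /home/alice/src
```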
Ubuntu Default Browser in Thunderbird
Since the update to Ubuntu/Kubuntu Maverick Meerkat I was a bit puzzled why Thunderbird refused to open links in mails with my default browser Firefox even though it was set in the KDE settings and registered as a protocol-handler for http and https in the Thunderbird settings. It seems that there must be another setting in Thunderbird to use the default browser selected for the freedesktop environment which is now used. So updating the alternatives helped in Ubuntu:
sudo update-alternatives --config x-www-browser
Titanic Menu
Yesterday's menu in the cafeteria:
Spring rolls, chili-pepper dip, rice, iceberg lettuce with carrots, coconut quark
Ignoring Warnings from System Headers
Compiling C and C++ code with the highest warning levels is good practice and helps to spot potential errors. For GCC the flags
-Wall -Wextra
will generate a lot of useful warning messages about unused parameters etc.
Unfortunately, this is not common practice, and often your own warning-level settings result in dozens of warnings from system headers on which your code relies, making it impossible to spot warnings from your own code in the endless mass of console output.
Fortunately, GCC has a way to ignore warnings from foreign headers. Instead of using -I to specify an include path, -isystem tells the compiler to treat the includes from the given path as system headers where no warnings should be reported.
If CMake is used to create the Makefile, a special argument to the INCLUDE_DIRECTORIES function generates these compiler flags:
INCLUDE_DIRECTORIES(SYSTEM /usr/include /and/other /system/paths)
Unused Parameters in C++
Just some quick thoughts on how to handle compiler warnings about unused parameters in C++ code. While these warnings are helpful most of the time, sometimes you simply have to ignore a parameter that is required by an interface definition. Then you have three options to get rid of the warning:
- Change the compiler flags. The worst solution. In other cases these warnings are really helpful. Generally I really like to compile with the highest warning level. Most of the warnings (at least for GCC) really have something to tell.
- Use a macro to flag a variable as being unused, e.g. from Qt:
void foo(int aParameter) {
    Q_UNUSED(aParameter)
}
Even though this looks like a reasonable solution, it has the potential for confusion. Maybe sometime later you need the parameter but forget to remove the Q_UNUSED macro, ending up with something like this:
void foo(int aParameter) {
    Q_UNUSED(aParameter)
    // 30 lines of other method code
    int anotherVariable = 30 * aParameter;
}
Suddenly your code documents that a variable isn’t used although it actually is, and the compiler can’t detect this error.
- Simply comment out the parameter name:
void foo(int /*aParameter*/) {
    // 30 lines of other method code
    int anotherVariable = 30 * aParameter;
}
Now the warning is gone, too, and the compiler can detect either the illegal use of the variable in line three or the illegal statement that this variable is unused on purpose.
