# Structuring a VCS Repository

My main programming hobby project will soon celebrate its tenth birthday, so I thought a few notes on how I structure my VCS repository might be of interest. The VCS I have been using since the beginning is Subversion. (When I started, Git had already been released, but was really not that popular yet.) So while some details of this article will be specific to Subversion, the general concepts should be applicable elsewhere as well.

When it comes to the structure of a repository, Subversion does not impose anything from a technical point of view. All it sees is a kind-of file system, with all the pros and cons that come with the simplicity of this approach. It makes it easy for people to start using it, which is really good. But it also does not offer help for more advanced use-cases, so that people need to find a way how to map certain requirements onto that file system concept.

As a result, a convention has emerged and been there for many years now. It says that at the top-level of the project there should be only the following folders:

• trunk: Home of the latest version (sometimes called the HEAD revision)
• branches: Development sidelines where work happens in isolation from trunk
• tags: Snapshots that give meaningful names to a certain revision

You will find plenty of additional information on the subject when searching the Internet. I can also recommend the book “Pragmatic Version Control: Using Subversion“, although it seems to be out of print now.

With these general points out of the way, let me start with how I work on my project. There are only a few core rules and despite their simplicity I can handle all situations.

• The most important aspect for the structure of my SVN repository is that all active development on the coming version happens at trunk. See this article for all the important details, why you really want to follow that approach in almost all cases.
• Once a new version is about to be released, I need a place where bug-fixes can be developed. So I create a release branch (e.g. /branches/releases/v1.3) with major and minor version number but not the patch version (I use semantic versioning). From this release branch I then cut the release (v1.3.0 in this case) by pointing the release job of my CI server to the release branch.
• Once the release is done, I create a tag that also includes the patch version. In this example the tag will be from /branches/releases/v1.3 to /tags/releases/v1.3.0 .
• Now I return to working on the next release (v1.4) by switching back to trunk.
• Bug fixing happens primarily on trunk with fixes being back-ported to released versions. There are cases when this is not practical, of course. The two main reasons are that significant structural changes were done on trunk (you do refactor, don’t you?) or another change has implicitly removed the bug there already. But that is the exception.
• When a bug-fix is needed on a released version, I temporarily switch my working copy to the release branch and do the respective work there. Unless the bug is critical I do not release a new version immediately after that, though. So this may repeat a few times, before the maintenance release.
• The maintenance release is then cut, again, from /branches/releases/v1.3 . And after that a new tag is created to /tags/releases/v1.3.1 .

Those rules have proven to be working perfectly and I hope they will continue to do so for the next ten years. I have been quite lucky in that, although for me this is still a hobby project, the result is used by many global companies and organizations in a business-critical context. There are at least 11.000 installations in production that I am aware about, so I cannot be casual about reliability of the delivery process.

# Configuration Management – Part 6: The Secrets

Every non-trivial application needs to deal with configuration data that require special protection. In most cases they are password or something similar. Putting those items into configuration files in clear is a pretty bad idea. Especially so, because these configuration files are almost always stored in a VCS (version control system), that many people have access to.

Other systems replace clear-text passwords found in files automatically with an encrypted version. (You may have seen cryptic values starting with something like {AES} in the past.) But apart from the conceptual issue that parts of the files are changed outside the developer’s control, this also not exactly an easy thing to implement. How do you tell the system, which values to encrypt? What about those time periods that passwords exist in clear text on disk, especially on production systems?

My approach was to leverage the built-in password manager facility of webMethods Integration Server instead. This is an encrypted data store that can be secured on multiple levels, up to HSMs (hardware security module). You can look at it as an associative array (in Java usually referred to as map) where a handle is used to retrieve the actual secret value. With a special syntax (e.g. secretValue=[[encrypted:handleToSecretValue]]) you declare the encrypted value. Once you have done that, this “pointer”will of course return no value, because you still need to actually define it in the password manager. This can be done via web UI, a service, or by importing a flat file. The flat file import, by the way, works really well with general purpose configuration management systems like Chef, Puppet, Ansible etc.

A nice side-effect of storing the actual value outside the regular configuration file is that within your configuration files you do not need to bother with the various environments (add that aspect to the complexity when looking at in-file encryption from the second paragraph). Because the part that is environment-specific is the actual value; the handle can, and in fact should, be the same across all environments. And since you define the specific value directly within in each system, you are already done.

# Configuration Management – Part 5: The External World

A standard requirement in configuration management is using values that already exist somewhere else. The most obvious places are environment variables (defined by the operating system) and Java system properties. Other are existing (!) databases, e.g. from an ERP system, or files.

The reason for referencing values directly from their original sources instead of duplicating them in a copy-paste fashion is the Don’t-Repeat-Yourself (DRY) principle. Although the latter is typically discussed in the context of code, it applies to configuration at least as well. We all know those “great” applications, which require manual updates of the same value in different places. And if you miss only one, all hell breaks loose.

For re-use of values within a file, the standard approach is variable interpolation. A well-known syntax for that comes from the build tool Apache Ant, where ${variableName} can be placed anywhere into a value assignment. Apache Commons Configuration, and therefore also WxConfig, support this syntax. In WxConfig you can even reference values from other files that belong to the same Integration Server package. But what to do for other sources? Well, for the aforementioned environment variables and Java system properties, the respective interpolators from Apache Commons Configuration can also be used in WxConfig. But the latter also defines several interpolators of its own. • Cross package: Assuming you have an Integration Server package that holds some general values (often referred to as “global”), those can be referenced. (There are multiple ways to deal with truly global values and the specifics really determine which one is used best.) • Current date/time: Gets the current date and/or time, which is admittedly a bit of an edge-case. One could argue that this is not really a configuration value, and that is correct. However, there are scenarios when the ability to quite easily have such a value comes in very handy. Think about files that get created during processing of data. Instead of manually concatenating the path, the base filename, the date/time stamp and the file suffix in the code, you could just have something like this in your configuration file: workerFile=${tmpDir}/appFile_\${date:yyyyMMdd-HHmmssSSS}.dat Doesn’t that make things a lot cleaner to read?
• File content: Similar to date/time this was added primarily for convenience and clarity reasons. The typical use-case are templates, where a file (possibly maintained by another application) is just read directly into a configuration value.
• Code invocation: When all else fails, you can have an arbitrary piece of logic be executed and the result be placed into the configuration value.

It is worth noting that all interpolators resolve at invocation. So you always get current results. In the case of the file content interpolator, this has the downside of file I/O; but if that really becomes a bottleneck, you are still not worse off than when having the read somewhere else in the code. And the file system caching is also still there …

With this set of options, pretty much all requirements for a redundancy-free configuration management can be met.

# Configuration Management – Part 3: The Journey Begins

Today I want to tell you about a configuration management solution I have developed. It is built for Software AG‘s ESB (the webMethods Integration Server). The latter did, for a very long time, not have built-in capabilities for dealing with configuration values. So what many projects did, was to hack together a quick-and-dirty “solution” that was basically a wrapper around Java properties.

At the time (early 2008) I was working in a “SWAT” team in presales, where we dealt with large prospects that typically looked for a strategic solution to their integration requirements. Of course configuration management, although not as prevalent as today, came up more and more often. So instead of trying to dodge the bullet, I started working on a solution. And since webMethods Integration Server is built on an open architecture, you can easily add things.

So let’s dive into some of the details of my configuration management solution, called
WxConfig. Its overall design goal was to have packages (roughly the platform’s equivalent to a web application on servlet containers) that could be moved around without any additional manual setup or configuration. The latter is a requirement stemming from the fact that we are in an ESB/integration context. By definition most packages therefore have some “relationship” to the outside world, and are not an application running more or less in isolation. Typical example would be a connection to a database, or an ERP system.

Traditionally there were two approaches for dealing with these external environment dependencies. On development (DEV) environments, there was some kind of document, telling the developer what setup/configuration they would have to apply to their DEV environment. For all “higher” stages like test (TEST) or production (PROD) this was hopefully handled by some kind automation; although back in 2008 that was rarely case and more often than not also there a manual checklist was used.

From a presales perspective (that is where it all started) the DEV part was the relevant one. My idea was to come up with something that could transform the document with the setup instructions into some kind of codification. So I basically devised an XML format (or rather a convention) that did just that. It started out quite small, with only schedulers being supported. (In that specific case we needed a periodic execution of logic for a demo.)

Since then a lot of other things were added. The approach was so successful that  many customers have moved their entire environment setup (well, at least those parts that are supported) to the WxConfig solution. And there are, interestingly enough, quite a few customers who use WxConfig solely for environment configuration but not for configuring their actual applications.

You may now ask: Hang on, what about the other environments, like TEST and PROD? Very good question! This is one of the core strengths of the WxConfig solution and therefore warrants a separate post. Stay tuned …

# LaTeX: Hyphenation of \texttt

I have been using  since I was 17 and still love it. But one thing has been bugging me for a long time. For a while I had been manually tweaking documents to get proper linebreaks when using the \texttt{} macro. Now I had finally come to the point where this did not work any longer. A search gave me this snippet:

\renewcommand{\texttt}[1]{%
\begingroup
\ttfamily
\begingroup\lccode~=/\lowercase{\endgroup\def~}{/\discretionary{}{}{}}%
\begingroup\lccode~=[\lowercase{\endgroup\def~}{[\discretionary{}{}{}}%
\begingroup\lccode~=.\lowercase{\endgroup\def~}{.\discretionary{}{}{}}%
\catcode/=\active\catcode[=\active\catcode.=\active
\scantokens{#1\noexpand}%
\endgroup
}`

It works great!

# VMware Fusion 4 on OS X Mavericks

Having upgraded to OS X Mavericks just recently, one of the main questions for me was, whether I would need to upgrade from VMware Fusion 4 to the current version (which is Fusion 7). Initially the thought had not really occurred to me, but after the second crash of my MacBook I started to look around. The majority of content I found was about the support status and some people had gotten really agitated during the discussion. But little was said about the technical side of things, i.e. whether the stuff was working or not.

What I can say today is that for me VMware Fusion 4.1.4 works nicely and without issues so far. I use it almost exclusively to run a company-provided Java development environment on Windows 7 64 bit. So no 3D graphics, USB drives, or other “fancy” things.

The crashes were most like caused by an old version of the keyboard remapping tool Karabiner (formerly KeyRemap4MacBook). Once I had upgraded to the appropriate version, the crashes were gone.

# Keyboard Tools

Just a quick post to share two awesome utilities that I have been using for a while now. I like working with the keyboard over the mouse in many cases, simply because I am faster with that, and was recently pointed to a a tool that will make things even better:

Auto Hotkey: This allows you to “play” an arbitrary set of key strokes and commands either by replacing text you just typed or by re-assigning a key (combination). Very handy for emails where you can now simply type “brc” and get “Best regards,<NEW_LINE>Christoph”

Another tool worth mentioning is Key Tweak, which provides a GUI over the registry that allows to remap the keyboard. My use-case was a very old IBM Model M keyboard that does not have a Windows key (I was made aware of the tool by this post)

# JMeter 2.5 Release

I am quite happy that my first contribution to a pretty well known open source program has made it into the latest release of JMeter, the famous load-testing tool from the Apache folks. At the beginning of this year I needed to do some load-testing on JMS topics with durable subscriptions. At the time those weren’t available in JMeter so I dug into the code and added the functionality, which was less difficult than anticipated.

# Custom target directory for unattended Java installation on Windows

There is a bunch of information available that explain how to perform an unattended (aka silent) installation of Java on Windows  (examples for JRE and JDK).

What is usually not mentioned is the fact that when you specify a non-standard target directory, it must not contain any spaces. The easiest way to achieve this is the use of the 8.3 format. So instead of “C:\Program Files” you can use “C:\Progra~1”.

# WebDAV Client

If you are looking for a small cross-platform WebDAV client, you should probably check on DAV Explorer. It looks a bit old-fashioned but is small (less than 600 KB), does not require installation, and simply worked for me. The only requirement is a JDK and its “bin” sub-directory in the path, but you have that anyway, don’t you?

It is probably not the best choice if you are a non-technical person, want to work with it every day, and look for a nice integration into your system’s GUI. But for quick tests or the occasional development-centric work it’s absolutely sufficient.