Tag Archives: Chef

Getting Started with Chef Infra Server

A while ago Chef Software announced that they would move all source code to the Apache 2.0 license (see announcement for details), which is something I welcome. Not so much welcomed by many was the fact that they also announced to stop “free binary distributions”. In the past you could freely download and use the core parts of their offering, if that was sufficient for your needs. What upset many people was that the heads-up period for this change was rather short and many answers were left open. It also did not help that naturally their web site held many references to the old model, so people were confused.

In the meantime it seems that Chef has loosened their position on binary distributions a bit. There is now a number of binaries that are available under the Apache 2.0 license and they can be found here. This means that you can use Chef freely, if you are willing to compromise on some features. Thanks a lot for this!

This post will describe what I did to set up a fresh Chef environment with only freely available parts. You need just two things to get started with Chef: the server and the administration & development kit. The latter goes by the name of ChefDK and can be installed on all machines on which development and administration work happens. It comes with various command line tools that allow you to perform the tasks needed.

Interestingly, you will find almost no references to ChefDK on the official web pages. Instead its successor “Chef Workstation” will be positioned as the tool to use. There is only one slight problem here: The latest free version is pretty old (v0.4.2) and did not work for me, as well as various other people. That was when I decided to download the latest free version of ChefDK and give it a try. It worked immediately and since I had not needed any of the additional features that come with Chef Workstation, I never looked back.

No GUI is part of those free components. Of course Chef offer such a GUI (web-based) which is named Chef Management Console. It is basically a wrapper over the server’s REST API. Unfortunately the Management Console is “free” only up to 25 nodes. For that reason, but also because its functionality is somewhat limited compared to the command line tools, I decided to not cover it here.

Please check the licenses by yourself, when you follow the instructions below. It is solely your own responsibility to ensure compliance.

Below you will find a description of what I did to get things up and running. If you have a different environment (e.g. use Ubuntu instead of CentOS) you will need to check the details for your needs. But overall the approach should stay the same.

Environment

The environment I will use looks like this

  • Chef server: Linux VM with CentOS 7 64 bit (minimal selection of programs)
  • Chef client 1: Linux VM like for Chef server
  • Development and administration: Windows 10 Pro 64bit (v1909)

I am not sure yet whether I will expand this in the future. If you are interested, please drop a comment below.

Please check that your system meets the prerequisites for running Chef server.

Component Versions

The download is a bit tricky, since we don’t want to end up with something that falls under a commercial license. As of this writing (April 2020) the following component binaries are the latest that come under an Apache 2.0 license. I verified the latter by clicking at “License Information” underneath each of the binaries that I plan to use.

  • Chef Infra Server: v12.19.31 (go here to check for changes)
  • Chef DK: 3.13.1 (go here to check for changes)

As to the download method Chef offer various methods. Typically I would recommend to use the package manager of your Linux distribution, but this will likely cause issues from a license perspective sooner or later.

Server Installation and Initial Setup

So what we will do instead is perform a manual download by executing the following steps (they are a sub-set of the official steps and all I needed to do on my system):

  • All steps below assume that you are logged in as root on your designated Chef server. If you use sudo, please adjust accordingly.
  • Ensure required programs are installed
    yum install -y curl wget
  • Open ports 80 and 443 in the firwall
    firewall-cmd --permanent --zone public --add-service http && firewall-cmd --permanent --zone public --add-service https && firewall-cmd --reload
  • Disable SELinux
    setenforce Permissive
  • Download install script from Chef (more information here)
    curl -L https://omnitruck.chef.io/install.sh > chef-install.sh
  • Make install script executable
    chmod 755 chef-install.sh
  • Download and install Chef server binary package: The RPM will end up somewhere in /tmp and be installed automatically for you. This will take a while (the download size is around 243 MB), depending on your Internet connection’s bandwidth.
    ./chef-install.sh -P chef-server -v "12.19.31"
  • Perform initial setup and start all necessary components, this will take quite a while
    chef-server-ctl reconfigure
  • Create admin user
    chef-server-ctl user-create USERNAME FIRSTNAME LASTNAME EMAIL 'PASSWORD' --filename USERNAME.pem
  • Create organization
    chef-server-ctl org-create ORG_SHORT_NAME 'Org Full Name' --association-user USERNAME --filename ORG_SHORT_NAME-validator.pem
  • Copy both certificates (USERNAME.pem and ORG_SHORT_NAME-validator.pem) to your Windows machine. I use FileZilla (installers without bloatware can be found here) for such cases.
ChefDK Installation and Initial Setup

What I describe below is a condensed version of what worked for me. More details can be found on the official web pages.

  • I use $HOME in the context below to refer to the user’s home directory on the Windows machine.  You must manually translate it to the correct value (e.g. C:\Users\chris in my case).
  • Download the latest free version of ChefDK for Windows 10 from here and install it
  • Check success of installation by running the following command from a command prompt:
    chef -v
  • Create directory and base version of configuration file for connectivity by running
    knife configure (it may look like it hangs, just give it some time)
  • Copy USERNAME.pem and SHORTNAME-validator.pem to $HOME/.chef
  • Add your server’s certificate (self-signed!) to the list of trusted certificates with
    knife ssl fetch
  • Verify that things work by executing knife environment list, it should return _default as the only existing environment
  • The generated configuration file was named $HOME/.chef/credentials in my case and I decided to rename it config.rb (which is the new name in the official documentation) and also update the contents:
    • Remove the line with [default] at the beginning which seemed to cause issues
    • Add knife[:editor] = '"C:\Program Files\Notepad++\notepad++.exe" -nosession -multiInst' as the Windows equivalent of setting the EDITOR environment variable on Linux.
Initial Project

We will  create a very simple project here

  • Go into the directory where you want all your Chef development work to reside (I use $HOME/src; the comment regarding the use of $HOME from above still applies) and open a command prompt
  • Create a new Chef repo (where all development files live)
    chef generate repo chef-repo (chef-repo is the name, you can of course change that)
  • You will see that  a new directory ($HOME/src/chef-repo) has been created with a number of files in it. Among them is  ./cookbooks/example , which we will upload as a first test. Cookbooks are where instructions are stored in Chef.
  • To be able to upload it, the cookbook path must be configured, so you need to add to $HOME/.chef/config.rb the following line:
           cookbook_path   ["$HOME/src/chef-repo/cookbooks"]
    (example:  cookbook_path ["c:/Users/chris/src/chef-repo/cookbooks"])
  • You can now upload the cookbook via knife cookbook upload example
Client Setup

In order to have the cookbook executed you must now add it to the recipe list (they take the cooking theme seriously at Chef) of the machines, where you want it to run. But first you must bootstrap this machine for Chef.

  • The bootstrap happens with the following command (I recommend to check all possible options by via knife bootstrap --help) executed on your Windows machine :
    knife bootstrap MACHINE_FQDN --node-name MACHINE_NAME_IN_CHEF --ssh-user root --ssh-password ROOT_PASSWORD
  • You can now add the recipe to the client’s run-list for execution:
       knife node run_list add MACHINE_NAME_IN_CHEF example
    and should get a message similar to
      MACHINE_NAME_IN_CHEF :
        run_list:
          recipe[example]
  • You can now check the execution by logging into your client and execute chef-client as root.  It will also be executed about every 30 minutes or so. But checking the result directly is always a good idea after you changed something.

Congratulation, you can now maintain your machines in a fully automated fashion!

webMethods Integration Server: Continuous Deployment

For more than nine years I have been working on a package for webMethods Integration Server. With the experience gained there, I want to discuss a number of aspects about Continuous Deployment.

Versioning

I recommend the use of semantic versioning, which at its core is about the following (for a lot of additional details, just follow the link):

  • The version number consists of three parts: Major, minor, and patch (example v1.4.2).
  • An increase in the major version indicates a non-backwards-compatible change.
  • An increase in the minor version indicates a backwards-compatible change.
  • A increase in the patch version indicates bug fixes only, no functional changes at all.

It is a well-known approach and makes it very easy for everyone to derive the relevant aspects from just looking at the version number. If the release in question contains bug fixes for something you have in use, it is probably a good idea to have a close look and check if a bug relevant to you was fixed. If it is a minor update and thus contains improvements while being backwards compatible, you may want to start thinking about a good time to make the switch. And if it is major update that (potentially) breaks things, a deeper look is needed.

Each Integration Server package has two attributes for holding information about its “version”. One is indeed the version itself and the other is the build. The latter is by default populated with in auto-increase number, which I find not very helpful. Yes, it gives me a unique identifier, but one that does not hold any context. What I put into this field instead is a combination of a date-time-stamp and the change set identifier from the Version Control System (VCS). This allows me to see at a glance when this package was built and what it contains.

Build

Conceptually the work on a single package, as opposed to a set of multiple ones that comprise one application, is a bit different in that you simply deal with only one artifact that gets released. In many projects I have seen, people take a slightly different approach and see the entirety of the project as their to-be-released artifact. This approach is supported by how the related tools (Asset Build Environment and Deployer) work: You simply throw in the source code for several packages, create an archive with metadata (esp. dependencies), and deploy it. Of course you could do this on a per-package basis. But it is easier to have just one big project for all of them.

Like almost always in live, nothing comes for free. What it practically means is that for every change in only one of the potentially many packages of the application, all of them need a re-deployment. Suppose you have an update in a maintenance module that is somewhat unrelated to normal daily operation. If you deploy everything in one big archive, this will effectively cause an outage for your application. So you just introduced a massive hindrance for Continuous Deployment. Of course this can be mitigated with blue-green deployments and you are well advised to have that in place anyway. But in reality few customers are there. What I recommend instead is an approach where you “cut” your packages in such a way that they each of them performs a clearly defined job. And then you have discrete CI job for each of them, of course with the dependencies taken into account.

Artifact Storage

Once your build has been created, it must reside somewhere. In a plain webMethods environment this is normally the file system, where the build was performed by Asset Build Environment (ABE). From there it would be picked up by Deployer and moved to the defined target environment(s). While this has the advantage of being a quite simple setup, it also has the downside that you loose the history of your builds. What you should do instead, is follow the same approach that has been hugely successful for Maven: use a binary repository like Artifactory, Nexus, or one of the others (a good comparison can be found here). I create a ZIP archive of the ABE result and have Jenkins upload it to Artifactory using the respective plugin.

To have the full history and at the same time a fixed download location, I perform this upload twice. The first contains a date-time-stamp and the change set identifier from the Version Control System (VCS), exactly like for the package’s build information. This is used for audit purposes only and gives me the full history of everything that has ever been built. But it is never used for actually performing the deployment. For that purpose I upload the ZIP archive a second time, but in this case without any changing parts in the URL. So it effectively makes it behave a bit like a permalink and I have a nice source for download. And since the packages themselves also contain the date-time stamp and change set identifier, I still know where they came from.

Deployment

Depending on your overall IT landscape there are two possible approaches for handling deployments. The recommended way is to use a general-purpose configuration management tool like Chef, Puppet, Ansible, Salt, etc. This should then also be the master of your webMethods deployments. Just point your script to your “permalink” in the binary reposiory and take it from there. I use Chef and its remote file mechanism. The latter nicely detects if the archive has changed on Artifactory and only then executes the download and deployment.

You can also develop your own scripts to do the download etc., and it may appear to be easier at a first glance. But there is a reason that configuration management tools like Chef et al. have had such success over the last couple years, compared to home-grown scripts. In my opinion it simply does not make sense to spend the time to re-invent the wheel here. So you should invest some time (there are many good videos on YouTube about this topic) and figure out which system is best for your needs. And if you still think that you will be faster with your own script, chances are that you overlooked some requirements. The usual suspects are logging, error handling, security, user management, documentation, etc.

With either approach this makes deployment a completely local operation and that has a number of benefits. In particular you can easily perform any preparatory work like e.g. adjusting content of files, create needed directories, etc.

Summary

All in all, this approach has worked extremely well for me. While it was first developed for an “isolated” utility package, it has proven to be even more useful for entire applications, comprised of multiple packages; in other words, it scales well.

Another big advantages is separation of concerns. It is always clear which activity is done by what component. The CI server performs the checkout from VCS and orchestrates the build and upload to the binary repository. The binary repository holds the deployable artifact and also maintains an audit trail of everything that has ever been built. The general-purpose configuration management tool orchestrates the download from the binary repository and the actual deployment.

With this split of the overall process into discrete steps, it is easier to extend and especially to debug. You can “inject” additional logic (think user exits) and especially implement things like blue-green deployments for a zero-downtime architecture. The latter will require some upfront thinking about shared state, but this is a conceptual problem and not specific to Integration Server.

One more word about scalability. If you have a big-enough farm of Integration Servers running (and some customers have hundreds of them), the local execution of deployments also is much faster than doing it from a central place.

I hope you find this information useful and would love to get your thoughts on it.

Configuration Management – Part 6: The Secrets

Every non-trivial application needs to deal with configuration data that require special protection. In most cases they are password or something similar. Putting those items into configuration files in clear is a pretty bad idea. Especially so, because these configuration files are almost always stored in a VCS (version control system), that many people have access to.

Other systems replace clear-text passwords found in files automatically with an encrypted version. (You may have seen cryptic values starting with something like {AES} in the past.) But apart from the conceptual issue that parts of the files are changed outside the developer’s control, this also not exactly an easy thing to implement. How do you tell the system, which values to encrypt? What about those time periods that passwords exist in clear text on disk, especially on production systems?

My approach was to leverage the built-in password manager facility of webMethods Integration Server instead. This is an encrypted data store that can be secured on multiple levels, up to HSMs (hardware security module). You can look at it as an associative array (in Java usually referred to as map) where a handle is used to retrieve the actual secret value. With a special syntax (e.g. secretValue=[[encrypted:handleToSecretValue]]) you declare the encrypted value. Once you have done that, this “pointer”will of course return no value, because you still need to actually define it in the password manager. This can be done via web UI, a service, or by importing a flat file. The flat file import, by the way, works really well with general purpose configuration management systems like Chef, Puppet, Ansible etc.

A nice side-effect of storing the actual value outside the regular configuration file is that within your configuration files you do not need to bother with the various environments (add that aspect to the complexity when looking at in-file encryption from the second paragraph). Because the part that is environment-specific is the actual value; the handle can, and in fact should, be the same across all environments. And since you define the specific value directly within in each system, you are already done.