Integrations

Best in breed procurement - where many systems, the best in class for each function, are procured independently and then integrated with one another - has had a tricky decade.

The underlying principle was good. Take lots of vendors who do just one thing really well and connect them to one another to form one fully integrated super system.

But multiple studies suggest that 70-85% of the integration projects which are essential to make these systems work together fail to achieve their objectives.

This post looks at what's necessary to consider to avoid these failures, especially in the workforce management space, and breaks out specific things to explore when considering an integration project.

Things which have been consistently true

After 15 years of building technology businesses, I've changed my views on a lot of things from how to build software to what a good culture looks like.

A small subset of things have remained consistently true.

All of these are "true" for me; as always, when I write "you" I really mean me. Writing is hard.

Information Density and why you should always show your working

A fun thing about being alive now is how much knowledge and information is available in the form of books, blog posts, audio books, social media posts and YouTube videos, to name just a few. But now more than ever there's no way we can consume all of it, so the content we choose to focus our time and attention on has the potential to drive wildly different outcomes.

In particular, I believe an organisation that is driven heavily by short-form content loses its ability both to reason about novel situations and to discuss decisions constructively, leading to a lower likelihood of success for that organisation.

Writing is thinking

In the book "High Output Management," author Andrew Grove writes:

“Reports are more a medium of self-discipline than a way to communicate information. Writing the report is important; reading it often is not.”

Over the last ten years of building startups, this concept - that the purpose of writing is often more to help the writer develop their thinking than it is to help others understand them - has become increasingly central to the culture I aim to drive in an organisation.

Meetings vs Documents

A common topic of discussion is whether meetings should be used for information distribution or only discussion.

Probably the most famous example of a conclusion on this is the Amazon approach (documented in Working Backwards), where all meetings begin with some form of document and 15 minutes of silence for people to read, followed by discussion.

I'm a strong advocate of this approach but wanted to be better able to explain why I believe it's applicable to most situations and what's different about the small subset of exceptions.

So I attempted to break down the factors which play into it and what I believe is different between the possible approaches.

Deploying Rails to a VPS with Capistrano V3

Deploying Rails to a VPS with Capistrano remains one of the simplest and most reliable methods for getting a Rails app up and running. With the likes of Hetzner Cloud, Digital Ocean and Linode providing inexpensive, reliable virtual machines, Rails apps serving substantial amounts of traffic can be hosted with minimal cost and complexity.

In the previous post we used Chef to prepare an Ubuntu 20.04 server for deployment of our Rails application. This included installing Nginx, PostgreSQL, Redis and our Ruby version of choice. We used Chef for this rather than entering commands manually so that we can trivially create additional identical servers in future without needing to remember lots of terminal commands and config file edits.

In this tutorial we'll use Capistrano to automate deployment of our application, including generating all required config files, obtaining a free SSL certificate with Let's Encrypt and enabling zero downtime deployment.
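
To give a feel for where we end up, here's a minimal sketch of the Capistrano configuration; the application name, repository URL, paths and server address are placeholders, and the full version is built up over the course of the post:

# config/deploy.rb - minimal sketch; names, paths and the repo URL are placeholders
lock "~> 3.17"

set :application, "our_app"
set :repo_url, "git@github.com:example/our_app.git"
set :deploy_to, "/home/deploy/our_app"
set :keep_releases, 5

# files and directories shared between releases rather than recreated on each deploy
append :linked_files, "config/master.key"
append :linked_dirs, "log", "tmp/pids", "tmp/sockets", "storage"

# config/deploy/production.rb - the VPS we provisioned in the previous post
server "203.0.113.10", user: "deploy", roles: %w[app db web]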

Setting up Ubuntu 20.04 for Rails app Deployment

Deploying Rails to a VPS with Capistrano remains one of the simplest and most reliable methods for getting a Rails app up and running. With the likes of Hetzner Cloud, Digital Ocean and Linode providing inexpensive, reliable virtual machines, Rails apps serving substantial amounts of traffic can be hosted with minimal cost and complexity.

We'll first use Chef to provision a VPS, including securing and hardening the server, installing the correct Ruby version(s) and setting up Postgres and Redis. We'll then use Capistrano to deploy our Rails app, including appropriate systemd units to ensure our services are started automatically on boot.
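
To give a rough idea of the shape of the Chef side, a wrapper recipe might look something like the following sketch; the our_base cookbook and recipe names are hypothetical placeholders rather than the exact cookbooks used in the post:

# cookbooks/our_base/recipes/default.rb - hypothetical sketch, names are placeholders
include_recipe 'our_base::hardening'

# install the system packages our Rails stack depends on
package %w[nginx postgresql postgresql-contrib redis-server]

# make sure nginx is enabled and running
service 'nginx' do
  action [:enable, :start]
end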

Managing puma with the systemd user instance and monit

Many guides to deploying Rails with Capistrano use systemd to have Puma auto-started when the system boots. This is often done using the system instance of systemd, which by default can only be controlled by root.

The typical workaround for this is either to grant our Capistrano deployment user passwordless sudo access, or to grant it passwordless sudo access to just the commands required to restart the Rails (and potentially Sidekiq) systemd services.

This can be avoided by using the systemd user instance, which allows persistent services to be managed as a non-root user. This is compatible with the default systemd configuration in Ubuntu 20.04.
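
As a rough sketch of how this looks in practice (assuming a deploy user and a unit file generated by the capistrano puma gem; the unit name below is a placeholder), the unit lives in the deploy user's home directory and lingering is enabled so the service starts at boot rather than at login:

# the unit file lives under the deploy user's home rather than /etc/systemd/system
#   ~/.config/systemd/user/puma_our_app_production.service

# run once as root: allow the deploy user's services to run without an active login session
loginctl enable-linger deploy

# run as the deploy user: enable and start the service in the systemd user instance
systemctl --user enable --now puma_our_app_production.service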

Capistrano & Puma with Systemd; Permission denied @ rb_io_reopen

When using the capistrano puma gem with systemd, we may get the error:

Permission denied @ rb_io_reopen - /home/deploy/LOG_FILE_PATH/shared/log/puma_access.log (Errno::EACCES)

This may be caused by doubling up on the puma app server's logging.

Typically our systemd unit will contain something like:

StandardOutput=append:/home/deploy/LOG_FILE_PATH/shared/log/puma_access.log

Which means that any data written to standard output will be appended to the log file specified by systemd.

If we're getting the above error, it's also likely that our puma.rb configuration file contains something like:

stdout_redirect '/home/deploy/LOG_FILE_PATH/shared/log/puma_access.log', true

Which tells puma itself to write to a log file instead of to stdout.

This doubling up leads to the following:

  • systemd creates the log file as the root user
  • puma, which we will generally have running as a different user, then tries to write to the same file but doesn't have permission because the file was created by root

The solution is simple: we can completely remove this line from puma.rb:

stdout_redirect '/home/deploy/LOG_FILE_PATH/shared/log/puma_access.log', true

Since the redirection of stdout is already being handled by systemd.
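
For illustration, a minimal puma.rb along these lines (paths are placeholders) is all that's needed; there's no stdout_redirect because systemd is already capturing stdout and stderr via the StandardOutput and StandardError directives:

# config/puma.rb - minimal sketch, logging left to systemd
directory '/home/deploy/our_app/current'
environment ENV.fetch('RAILS_ENV') { 'production' }
pidfile '/home/deploy/our_app/shared/tmp/pids/puma.pid'
bind 'unix:///home/deploy/our_app/shared/tmp/sockets/puma.sock'
threads 5, 5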

Capistrano & Puma; service puma is not active, cannot reload

When trying to use the Capistrano Puma gem to restart Puma via systemd, we may run into an error along the lines of:

puma_APP_NAME.service is not active, cannot reload

This typically happens either because the service was never enabled or because in the time which elapsed between it being enabled and the first deploy taking place, it has crashed a sufficient number of times that it is no longer active.

The behaviour we want in this scenario is to reload the service if it is active, otherwise to restart it.

Happily systemctl [supports this out of the box](https://www.freedesktop.org/software/systemd/man/systemctl.html) with systemctl reload-or-restart.

We can add the following to lib/capistrano/tasks to add a task using reload-or-restart to the puma namespace provided by the capistrano puma gem:

namespace :puma do
  namespace :systemd do
    desc 'Reload the puma service via systemd by sending USR1 (e.g. trigger a zero downtime deploy)'
    task :reload do
      on roles(fetch(:puma_role)) do
        if fetch(:puma_systemctl_user) == :system
          sudo "#{fetch(:puma_systemctl_bin)} reload-or-restart #{fetch(:puma_service_unit_name)}"
        else
          execute "#{fetch(:puma_systemctl_bin)}", "--user", "reload", fetch(:puma_service_unit_name)
          execute :loginctl, "enable-linger", fetch(:puma_lingering_user) if fetch(:puma_enable_lingering)
        end
      end
    end
  end
end

after 'deploy:finished', 'puma:systemd:reload'

This should be used in conjunction with including the puma systemd tasks in our Capfile using the load_hooks: false option which prevents the default restart task from being called.

install_plugin Capistrano::Puma::Systemd, load_hooks: false

The use of the above task also allows for zero downtime deploys when used with the relevant puma configuration and systemd unit file. See this post for more on the systemd unit file and this repository for a working example.

Capistrano & Puma; neither a valid executable name nor an absolute path

When attempting to deploy a Rails application using the puma web server via the systemd functionality in the capistrano puma gem, we may receive the following error when the systemd service is started:

Neither a valid executable name nor an absolute path

This most often occurs when using the capistrano rbenv plugin, because the plugin modifies the SSHKit.config.command_map[:bundle] path to include the RBENV_ROOT and RBENV_VERSION environment variables at the start of the bundle path. Systemd doesn't support Exec commands (such as ExecStart) beginning with environment variable assignments; they must instead be set in separate Environment lines.

We can fix this by overriding the puma.service.erb template with a new systemd unit file as follows:

[Unit]
Description=Puma HTTP Server for <%= "#{fetch(:application)} (#{fetch(:stage)})" %>
After=network.target

[Service]
Type=simple
<%="User=#{puma_user(@role)}" if fetch(:puma_systemctl_user) == :system %>
WorkingDirectory=<%= current_path %>
ExecStart=/usr/local/rbenv/bin/rbenv exec bundle exec puma -C <%= fetch(:puma_conf) %>
ExecReload=/bin/kill -USR1 $MAINPID
ExecStop=/bin/kill -TSTP $MAINPID
StandardOutput=append:<%= fetch(:puma_access_log) %>
StandardError=append:<%= fetch(:puma_error_log) %>
<%="EnvironmentFile=#{fetch(:puma_service_unit_env_file)}" if fetch(:puma_service_unit_env_file) %>
<% fetch(:puma_service_unit_env_vars, []).each do |environment_variable| %>
<%="Environment=#{environment_variable}" %>
<% end %>

Environment=RBENV_VERSION=<%= fetch(:rbenv_ruby) %>
Environment=RBENV_ROOT=/usr/local/rbenv

Restart=always
RestartSec=1

SyslogIdentifier=puma_<%= fetch(:application) %>_<%= fetch(:stage) %>

[Install]
WantedBy=<%=(fetch(:puma_systemctl_user) == :system) ? "multi-user.target" : "default.target"%>

Note that this hardcodes the path to rbenv, so if the path is different - for example because it's a user install rather than a system install - it will need updating.

This unit file also adds an ExecReload option to allow us to use systemd for zero downtime deploys.

For a fully working example see this repository.

There's more information in this GitHub issue.

Kubernetes Single Sign On - A detailed guide

In this series of posts we cover how to set up a comprehensive group-based single sign-on system for Kubernetes, including the kubectl CLI, any web application with ingress, a Docker registry and Gitea. We'll cover most of the common SSO models, so adapting what's here to other applications such as GitLab, Kibana, Grafana etc. is simple.

The full solution uses Keycloak backed by OpenLDAP. OpenLDAP is required for the Gitea component, but can be skipped for the other components, including OIDC based SSO for kubectl.

Some of the highlights this series covers are:

  1. Login to the kubectl CLI using SSO credentials via the browser
  2. Replace basic auth ingress annotations with equally simple but much more secure SSO annotations
  3. Push and pull to a secure private Docker registry with full ACL

The posts in the series are:

  1. Contents and overview
  2. Installing OpenLDAP
  3. Installing Keycloak
  4. Linking Keycloak and OpenLDAP
  5. OIDC Kubectl Login with Keycloak
  6. Authenticate any web app using ingress annotations
  7. Gitea (requires LDAP)
  8. Simple Docker Registry
  9. Harbor Docker Registry with ACL

Finally, there were a lot of excellent resources I leant on when creating this series; there's a summary of the key ones here.

OIDC Login to Kubernetes and Kubectl with Keycloak

A commonly cited pain point for teams working with Kubernetes clusters is managing the configuration to connect to the cluster. All too often this ends up being either sending KUBECONFIG files with hardcoded credentials back and forth, or relying on fragile custom shell scripts wrapping the AWS or GCP CLIs.

In this post we'll integrate Kubernetes with Keycloak so that when we execute a kubectl or helm command, if the user is not already authenticated, they'll be presented with a Keycloak browser login where they can enter their credentials. No more sharing KUBECONFIG files and forgetting to export different KUBECONFIG paths!

We'll also configure group-based access control, so we can, for example, create a KubernetesAdministrators group and have all users in that group given cluster-admin access automatically.

When we remove a user from Keycloak (or remove them from the relevant groups within Keycloak) they will then lose access to the cluster (subject to token expiry).
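
To give a flavour of what the group-based access control looks like on the Kubernetes side, a ClusterRoleBinding along these lines maps the group to cluster-admin; the group name is an example and may need the relevant OIDC groups prefix depending on how the cluster (or kube-oidc-proxy) is configured:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: keycloak-kubernetes-administrators
subjects:
  - kind: Group
    name: KubernetesAdministrators
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io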

For this we'll be using OpenID Connect, more here on how this works.

By default, configuring Kubernetes to support OIDC auth requires passing flags to the kube-apiserver. The challenge with this approach is that only one such provider can be configured, and managed Kubernetes offerings - e.g. GCP or AWS - use this for their proprietary IAM systems.
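
For reference, the built-in approach amounts to passing flags like the following to the API server (the values here are placeholders); it's these flags that can only point at a single provider, and that managed offerings typically don't expose:

--oidc-issuer-url=https://keycloak.example.com/auth/realms/kubernetes
--oidc-client-id=kubernetes
--oidc-username-claim=email
--oidc-groups-claim=groups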

To address this we will use kube-oidc-proxy, a tool from Jetstack which allows us to connect to a proxy server that manages OIDC authentication and uses impersonation to give the authenticating user the required permissions. This approach has the benefit of being universal across clusters, so we don't have to follow different approaches for our managed vs unmanaged clusters.

This post is part of a series on single sign on for Kubernetes.

Web application authentication and authorization with Keycloak and OAuth2 Proxy on Kubernetes using Nginx Ingress

In this post we'll set up a generic solution which allows us to add authentication via Keycloak to any application, simply by adding an ingress annotation. This gives us a much more extendable and secure alternative to basic auth.
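
As a preview, protecting an application ends up being roughly a matter of adding annotations along these lines to its ingress; the oauth2-proxy hostname here is a placeholder for whatever we deploy later in the post:

nginx.ingress.kubernetes.io/auth-url: "https://oauth2-proxy.example.com/oauth2/auth"
nginx.ingress.kubernetes.io/auth-signin: "https://oauth2-proxy.example.com/oauth2/start?rd=$scheme://$host$request_uri"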

Comprehensive docker registry on Kubernetes with Harbor and Keycloak for single sign on

In this post we'll install a feature-rich but lightweight Docker registry and integrate login and authorization with Keycloak users and groups.

Harbor is an open source registry which can serve multiple types of cloud artifacts and secure them using fine-grained access control. In this case we'll be focussed on using Harbor as a Docker image registry and linking its authentication with Keycloak, but it is also capable of serving multiple other types of artifact, including helm charts.

This post is part of a series on single sign on for Kubernetes.