Change Management in the Content Project

Apr 25, 2023  -  Matěj Týč

Systems that our content aims to harden are very often large collections of separate components whose life cycle is independent from the life cycle of the product that ships them. For example, the Red Hat Enterprise Linux consists of more than six thousand of such distinct components, although only a fraction of them s.a. OpenSSH, Grub2 is supported by the ComplianceAsCode project.

Our Red Hat group is contemplating the idea of making the content project better equipped to handle changes in component behavior better. As the project is open-source and has a community, we would like to get it involved from the very start. The purpose of this post is twofold:

  • To serve as a resource of brainstorming and of forming ideas for improvements, and
  • to put incoming implementation of improvements into the appropriate context.

Introduction

Software changes all the time, not only when major version numbers change. Sometimes it happens that what users consider a feature, developers see as a loophole, and when they close it, the component changes its behavior significantly.

Our ComplianceAsCode project that configures components needs to accomodate for this change, and it shouldn’t be done impulsively. We don’t want the project’s code polluted by difficult-to-read if-else conditions and copy-pasted snippets that undermine maintainability.

Anyway, when we talk about component changes, what do we have in mind? Here are some examples:

  • Configuration preference changes - instead of putting everything into one file, prefer distributing the configuration into a directory, for example prefer sshd_config.d directory over the sshd_config file.
  • A configuration option is renamed, e.g. from a whitelist to allowlist.
  • A feature disappears, and an alternative approach is needed - consider the case of OpenSSH losing capabilities of the SetIdleTimeout 0 option to drop inactive sessions.

New Approach Needed

If we think in a very simple way, we clearly need to design smart procedures or workflows that react to changes, so no major brain power is needed to make the right decisions. However, in order to be able to come with workflows that introduce an added value, the project needs to gain some extra capabilities.

Our team has come up with these ideas so far:

  • Lower the pressure on rules - enable coverage of a security requirement by more than one rule.
  • Be able to split a rule into a set of rules easily.
  • Enhance the declarative aspect of the project.
  • Introduce a component-centric view to the project.
  • Facilitate extension of a rule’s scope.

Closer Look

Let’s take a look how those ideas could align to the subject of component changes:

Multiple rules to address one requirement

In practice, we may not know at build-time what exact version of a component will be scanned, and we may address this uncertainty by having multiple rules prepared to handle it. When it comes to the actual scanning, this collection of rules has to make sure that the evaluation will be carried out correctly, regardless of what component version is present on the system.

This could be achieved by a set of rules with disjoint applicability, so at the end, at most one rule is active. In this context, the recently-introduced CPE Applicability Language functionality can be helpful, as it enables content authors to specify that the rule is applicable only when certain conditions s.a. packages versions are met.

Support of rule splitting

Splitting a rule is not rocket science. One could do that by copy-pasting the original rule and then modifying the copy, but it is desirable to minimize the amount of manual operations to the minimum.

At the time of writing this post, creation of a new rule involves noticeable boilerplate. You need to find a place for the yaml file, then you have to create it, input all required keys, populate CCEs, and then you can get on with associated content s.a. remediations.

Therefore, copy-pasting an existing rule is a much better approach, but this approach increases risks of copy-pasting code that will need to be modified one day, and there are still steps remaining that will need automation. This has to be reversed - what can be automated should be automated, and the duplication of data has to decrease by applying a more aggressive normal form.

In this context, the first thing that comes to mind are references. Those are defined in rule files, while they can also be deduced from controls. And perhaps there are other examples of redundant data that you are already thinking of.

Macros can also help to reduce copy-pasting, but let’s face it - the manual work one needs to perform to extract shared parts into macros in a smart way is too tedious.

Go Declarative

Some areas of the project are already declarative, but to have more is better. Declarative way of doing things allows for the introduction of more levels of abstraction, and when something changes, it is much more likely that a change will stay contained in one of such abstraction levels.

For example, we already have an enhanced declarative part in the form of control files. Thanks to that, we don’t have to specify relations of rules to the profile directly, but we can focus on a much easier to grasp relation of a rule to a security control. Profiles then can be defined using security controls that stay, even though their rule composition may change. As an additional benefit, the assignment of rules to security controls can be reused by other parts of the project.

However, this declarative concept can be extended beyond profile compositions. We often use constructs s.a. instead of a more generic and understandable or similar.

In other words, we shouldn’t miss an opportunity to declare that a certain product has a particular property, and every other part of the content should refer to those properties rather than to product names. Designing such product properties in a way that is smart, doesn’t get in the way and that can be reused in prose, checks and remediations is not trivial.

Can you think of other ways of bringing declarative principles to the project?

Ability to track component changes and our reactions to them.

Imagine that a component changes behavior, and you dispatch pull requests that react to that change, and they get merged successfully. However, later you discover that something is still not right, and you need to check out the reaction to that change. What do you do?

You probably check the history of files that had to be touched at that time, and you are able to recover the exact date range of the change, and then related pull requests along with the discussion. However, this information needs to be manually recovered from other changes to the project. That’s doable, but it is, to some degree, a detective work.

Wouldn’t it be nice to have a capability that would allow us to query a component, and we would get change information from the project using some automation? This can be very difficult to achieve exhaustively, but a partial and good enough solution may not be so difficult.

Polymorphic rules

Rules that try to do two things don’t sound like a good idea, but what if those two things would heavily overlap? Great candidates for polymorphism are changes when component configuration doesn’t change, as it is the way of how to achieve this configuration that does. In cases like this, a rule can get a polymorphic remediation and keep the CCE identifier if it has one, as CCEs don’t change if the configuration stays the same.

For instance, the rule no_empty_passwords also mentioned earlier, has polymorphic Bash remediation and a rather simple OVAL check. Another example is different - enable_fips_mode has a simple Bash remediation, but complex OVAL and this example nicely illustrates the richness of the rule-level polymorphism in the project.

We have some more polymorphic rules in our content, but they became polymorphic “maually” - they don’t declare their polymorphism, and there are no build system features that would guide a rule’s transformation to a polymorphic rule. Project support for polymorphism of rules could reduce maintenance costs by making changes cheaper, and also by enabling maintainers overview of what rules are polymorphic and what aren’t.

What now

So that’s it - things need to be done, and there are a lot of questions. Do you have answers, or even more questions? Or do you have worries or objections? In any case, reach out to us on Gitter!

Attempts to implement some improvements outlined here will probably start coming up in some form in the course of 2023. We are early on the cycle, and we are collecting and processing feedback, so our intentions can change is definitely our aim.