Keeping Documentation and Code Synchronized

Posted on Aug 1, 2018
By Jacob Lärfors

Background

All complex software is designed before it is developed. Unlike software, however, design documents tend to stay static and not evolve together with the software. What we end up with are disconnected pieces of documentation and yet more contributors to the technical debt of a system. For instance, coding guidelines and software architectures are designed to produce the highest quality system possible. But how do we manage the adherence to them? Inevitably requirements change and design documentation needs to be updated, e.g. new language constructs means updating coding guidelines, new functionality means new components in the architecture, and a new technology might put a greater demand on the portability of application code. Unfortunately, the value in this work is diminished if the documentation and code have already drifted apart and this is exactly the problem we want to solve.

Software Architecture

The software architecture design of a software system is used to describe things like how the software will be structured and how data will flow through it. It acts almost like a schematic or blueprint which should be frequently referenced by developers. It is typical that software evolves over time, and the design should evolve in unison.

As a general overview, software architecture typically defines:

How a system will be divided up into components to enable parallel software development and help create a more decoupled system which will significantly contribute to development velocity, maintainability, testability and the evolvability of a system.
How these components communicate with each other, e.g. through a web API, message/data structures, or an embedded message bus. Better documentation and adherence to this design will enable practices such as Data-Driven Testing (DDT), Test Driven Development (TDD), etc.
Which component dependencies are permitted, and which are forbidden. From this, one can begin to understand the transitive closure of component dependencies. This is extremely valuable for regression testing and understanding how changes in one component will impact the system.
The guidelines for complexity of individual components to ensure that components do not evolve into systems themselves. Once guideline figures start to be reached or exceeded, a system or component redesign is likely due; stressing the importance of up-to-date documentation that accurately reflects the real-world implementation.

How can we maximize the return on the investment into the design of a system’s software architecture? After the hard work of creating a solid architecture, the software may end up reflecting a very different architecture and thus render the initial design work redundant. And what are the repercussions of not following the software architecture? A maintenance and release nightmare?

Coding Guidelines

In an ideal project, the author of code should only be known from header comments or revision history, and not through ideological views on “the best naming convention”. Coding guidelines exist to ensure that code is developed in a consistent format and to promote best practices that help to avoid common pitfalls. As such, coding guidelines influence several quality attributes such as maintainability, safety, security, reliability, evolvability, etc.

Typically, projects will use existing coding guidelines in combination with specialized custom coding rules. For example, an embedded C++ project may use the AUTOSAR C++ Coding Guidelines (to cover safety, reliability, portability, etc.) together with the Google Style Guide for formatting and relevant CERT Secure Coding Guidelines to cover the security aspect. How can projects empower developers to learn and understand both the necessity and adherence to such a range of coding rules?

Keeping Documentation and Code in Sync

The most common practice to ensure software designs are followed is through code reviews. Code reviews, however, require frequent manual work which could be used to discuss more relevant topics about the code. Occupying developer hours with the mundane task of checking spellings, formatting, and so on will not only cut productivity but also be performed poorly due to the required attention to detail and the nature of developers - to be creative and loathe repetitive tasks.

I am not suggesting the abolishment of code reviews, but quite the contrary; removing inefficiency from code reviews to make them shorter, more productive and focused on more valuable topics.

One can use design documentation, such as UML models, data structure designs, coding guidelines, etc., to implement automated adherence checking and build this into the development process. Ideally, we want a way to proof code as it is being written (much like a spell checker) to empower developers to implement the code as it was supposed to be written; according to the design of the best possible system. Moreover, where deviations from the original design occur (due to new language constructs, newer technology, etc.) we can ensure that the documentation is evolving together with the software and accurately reflects the current state of the software.

Architectural Compliance

Lattix is a great tool for understanding, assessing and defining your architecture using a Dependency Structure Matrix. We have used it in numerous projects to help make the software reflect its intended architecture and keep design documentation updated as the software evolves.

Coding Guideline Compliance

There are lots of different static analysis tools out there to help with coding guideline compliance, though we have found that results are only as good as their visibility. And often, multiple static analysis tools are needed to help cover all the requirements and propagating this information into a single source is incredibly useful for monitoring and managing your technical debt.

For this purpose we generally propose SonarQube and have developed several connectors for the different tools that we use, such as Lattix, allowing us to get architectural violations inside the SonarQube dashboard.

Conclusion

Keeping your documentation and code synchronized will help increase the quality systems, and especially making them more maintainable, reusable and evolvable for the future.

If you are interested in learning more or getting some help in this area then please get in touch!