Software Composition Analysis
The first step toward open source license compliance is to identify which components your project uses. A project can include internal proprietary components as well as open source components. Both must be taken into consideration to get full knowledge of the components used and to create the bill of materials (BOM), which by definition describes the structure of the project.
The natural place to do software composition analysis, or package scanning, is the build process, so it makes sense to integrate it into the continuous integration pipeline. Every build then produces an updated bill of materials. This matters because it lets us keep track of the project's composition at all times, and we can change the structure during development by consulting a single document. There is no need to go through source files to get this knowledge, which is a big help for project managers as well as developers.
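As a rough illustration of what such a build step could produce, the sketch below turns a pinned dependency list into a small BOM. The `name==version` input format, the `origin` labels, the package names and the output layout are all assumptions made for this example. It is not the format of any particular scanner or BOM standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    origin: str  # "open-source" or "proprietary" (labels assumed for this sketch)


def parse_pinned(lines, internal_names=frozenset()):
    """Turn 'name==version' lines into Component records."""
    components = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"unpinned dependency: {line!r}")
        name = name.strip().lower()
        origin = "proprietary" if name in internal_names else "open-source"
        components.append(Component(name, version.strip(), origin))
    return components


def build_bom(project, components):
    """Return the BOM as a plain dict, de-duplicated and sorted so builds can be diffed."""
    unique = sorted(set(components), key=lambda c: (c.name, c.version))
    return {"project": project, "components": [asdict(c) for c in unique]}


# Hypothetical input, as a CI job might read it from a lock file.
pinned = [
    "requests==2.31.0",
    "acme-billing==1.4.2  # internal package",
    "",
    "requests==2.31.0",
]
bom = build_bom("demo-app", parse_pinned(pinned, internal_names={"acme-billing"}))
print(json.dumps(bom, indent=2))
```

Sorting and de-duplicating the entries keeps the output stable, so two builds with the same dependencies produce identical documents and any change in composition shows up as a plain diff.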
Once we know which components are in our project, we need to store that information somewhere. It should be a place where the information is easy to access and where the most relevant information is clearly presented. It should also enable other support functions for open source license compliance, such as generating license documentation and serving as an integration point for other tools in the toolchain. This kind of component management software is called a software component catalogue. It lists both proprietary and open source components.
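To make the idea concrete, here is a minimal in-memory sketch of a catalogue that stores components and generates a simple license notice from them. The class name, its methods and the sample components are hypothetical and chosen only for illustration. A real catalogue is a full application with persistence and access control.

```python
from collections import defaultdict


class ComponentCatalogue:
    """In-memory stand-in for a component catalogue (hypothetical API)."""

    def __init__(self):
        self._entries = {}

    def register(self, name, version, license_id, proprietary=False):
        # Key on name and version so two versions of one library can coexist.
        self._entries[(name, version)] = {
            "license": license_id,
            "proprietary": proprietary,
        }

    def license_notice(self):
        """Group open source components by license for a notice file."""
        by_license = defaultdict(list)
        for (name, version), entry in sorted(self._entries.items()):
            if not entry["proprietary"]:
                by_license[entry["license"]].append(f"{name} {version}")
        lines = []
        for license_id in sorted(by_license):
            lines.append(f"{license_id}:")
            lines.extend(f"  - {item}" for item in by_license[license_id])
        return "\n".join(lines)


catalogue = ComponentCatalogue()
catalogue.register("libfoo", "1.2.3", "MIT")
catalogue.register("libbar", "2.1.0", "Apache-2.0")
catalogue.register("acme-billing", "1.4.2", "Proprietary", proprietary=True)
notice = catalogue.license_notice()
print(notice)
```

Proprietary components stay in the catalogue but are left out of the notice, which reflects the point that the catalogue lists both kinds while the license documentation concerns the open source ones.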
The clearing process, or component clearing, is the process of identifying the license under which a software component is licensed. This can be done with a license scanner, which scans the source code using different pattern matching algorithms to recognize the license text or parts of it.
Finding license-related text in source code is not a simple task, for several reasons: a file may contain only part of the full license text, the license text may be slightly modified, or there may be only a reference to a URL or to some file. Also, a whole software project will usually contain several different open source licenses, so we must also check that those licenses are mutually compatible. Component clearing is not a straightforward process, which is why it should have its own dedicated team.
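The difficulties above can be illustrated with a very small scanner sketch. It looks for three kinds of evidence: an SPDX tag, a distinctive phrase from the license text, and a reference to a well-known license URL. The rule table `LICENSE_HINTS` is a deliberately tiny, hypothetical stand-in for the much larger corpora that real scanners use, and the matching is plain substring search after normalization rather than any specific scanner's algorithm.

```python
import re

# Hypothetical rule set: one distinctive phrase and one URL fragment per license.
LICENSE_HINTS = {
    "MIT": {
        "phrases": ["permission is hereby granted, free of charge"],
        "urls": ["opensource.org/licenses/mit"],
    },
    "Apache-2.0": {
        "phrases": ["licensed under the apache license, version 2.0"],
        "urls": ["apache.org/licenses/license-2.0"],
    },
}

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([\w.+-]+)")
# Comment markers at the start of a line: //, #, *, /* and */
COMMENT_PREFIX = re.compile(r"^[ \t]*(?://+|#+|/?\*+/?)", re.MULTILINE)


def normalize(text):
    """Strip leading comment markers, collapse whitespace, lowercase."""
    without_markers = COMMENT_PREFIX.sub(" ", text)
    return " ".join(without_markers.split()).lower()


def detect_licenses(text):
    """Return sorted (license_id, evidence) pairs found in the text."""
    findings = set()
    for spdx_id in SPDX_TAG.findall(text):
        findings.add((spdx_id, "spdx-tag"))
    cleaned = normalize(text)
    for license_id, hints in LICENSE_HINTS.items():
        if any(phrase in cleaned for phrase in hints["phrases"]):
            findings.add((license_id, "text"))
        if any(url in cleaned for url in hints["urls"]):
            findings.add((license_id, "reference"))
    return sorted(findings)


# A partial MIT text, re-wrapped inside a C-style comment.
mit_header = """/*
 * Permission is hereby
 * granted, free of charge, to any person obtaining a copy
 */"""
print(detect_licenses(mit_header))
print(detect_licenses("// See http://www.apache.org/licenses/LICENSE-2.0 for details"))
```

Normalization handles re-wrapped text and comment markers, but not changed wording. Recognizing a license whose text has been edited would need some form of approximate matching, which is one reason the results still need human review.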
Component clearing and license clearing are two different things. License clearing is the process in which the license text is read and the granted rights, obligations and restrictions are evaluated for each license. License clearing is done at least with help from legal, and often by legal alone.
Another major topic in open source compliance is awareness of vulnerabilities. An open source component, like any piece of software, can have security holes. The good thing about software being open source is that its vulnerabilities are commonly known, and therefore the fix is usually known as well. Because of this transparency, open source components are often more secure than their commercial counterparts.
Known vulnerabilities are listed in the NVD (National Vulnerability Database), which is hosted by the US government. The riskiest period runs from the moment a vulnerability is found and published to the NVD until a fix for it is made and published.
A known vulnerability is always tied to a certain component and version, so possible vulnerabilities in a software project should be identified right after the software composition analysis. At that point we know which components the project contains, so we can simply query the NVD for vulnerabilities affecting them. Any vulnerabilities found should then be recorded somewhere and, of course, fixed.
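The matching step can be sketched as follows. To keep the example self-contained, it checks components against a small local advisory list instead of querying the NVD. The advisory identifiers, component names and version ranges are invented for illustration, and versions are assumed to be purely numeric and dot-separated.

```python
from dataclasses import dataclass
from typing import Optional


def parse_version(text):
    """Convert '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Advisory:
    advisory_id: str
    component: str
    introduced: str          # first affected version (inclusive)
    fixed_in: Optional[str]  # first fixed version (exclusive), None if unfixed


def affected(advisory, name, version):
    """True if the given component version falls in the advisory's range."""
    if advisory.component != name:
        return False
    current = parse_version(version)
    if current < parse_version(advisory.introduced):
        return False
    return advisory.fixed_in is None or current < parse_version(advisory.fixed_in)


def find_vulnerabilities(bom, advisories):
    """Map 'name@version' to the advisory ids that affect it."""
    report = {}
    for name, version in bom:
        hits = [a.advisory_id for a in advisories if affected(a, name, version)]
        if hits:
            report[f"{name}@{version}"] = hits
    return report


# Invented data standing in for the result of a vulnerability database query.
advisories = [
    Advisory("EXAMPLE-0001", "libfoo", "1.0.0", "1.2.5"),
    Advisory("EXAMPLE-0002", "libfoo", "1.2.0", None),
    Advisory("EXAMPLE-0003", "libbar", "2.0.0", "2.0.1"),
]
bom = [("libfoo", "1.2.3"), ("libbar", "2.1.0"), ("libbaz", "0.9.0")]
report = find_vulnerabilities(bom, advisories)
print(report)
```

An advisory without a `fixed_in` version stays open for every later release, which corresponds to the risky period between publication of a vulnerability and publication of its fix.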
In open source compliance we usually rely on a toolchain to perform all the tasks, rather than a single tool that handles every step.
For software composition analysis we have used a tool called Whitesource. It actually does more than composition analysis: it also lists vulnerabilities and identifies the main license of each library. Whitesource has many integration plugins for software build tools, which makes it well suited to identifying libraries.
For the open source component catalogue we have used sw360. It is a solid platform for viewing information about projects and their open source components.
One tool that is very reliable in component clearing is Fossology. It does a deep scan of source code files for license texts and provides a good platform for reviewing scan results.
For vulnerability management we have used Whitesource and sw360.