Description
Describe the problem
Background
This project includes reusable assets for checking for incompatibly licensed application dependencies:
- https://github.com/arduino/tooling-project-assets/blob/26aefc2bb13d8449b7a687ba2c5fb35ac47a8f35/workflow-templates/check-go-dependencies-task.md
- https://github.com/arduino/tooling-project-assets/blob/26aefc2bb13d8449b7a687ba2c5fb35ac47a8f35/workflow-templates/check-npm-dependencies-task.md
The system relies on a collection of metadata that defines the licensing of each of the project's dependencies. This "license metadata cache" is generated by the general:cache-dep-licenses task, which invokes the licensed cache command.
The licensed cache command attempts to automatically detect the license type of each dependency, which is recorded in the license key of the metadata file. However, in cases where the developers of a dependency have not defined the license in a standardized manner, such a detection is not possible. In this case, Licensed sets the license key to the placeholder other license type identifier.
The compatibility of dependency licenses is checked using the general:check-dep-licenses task, which invokes the licensed status command. If the metadata of any dependency indicates it has a license type other than the list of allowed licenses defined in the project's Licensed configuration file, the task will fail. In the case where a license type of other is specified in a dependency's metadata due to the licensed cache command not being able to automatically determine the dependency's license type, the general:check-dep-licenses task will fail due to an other license type not being in the list of allowed license types. The project maintainer must then manually determine the license type of the dependency and set the value of the license key in the metadata file for the dependency accordingly.
Problem
It turns out that Licensed has an inexplicable behavior. When licensed cache finds multiple possible license sources and detects a different license type for each source, it sets the license key of the metadata file to other. This is reasonable behavior: it is important that the license type be correctly determined, so if there is uncertainty in the automated detection, the task should be delegated to the human project maintainers. What is not reasonable is that licensed status will pass under these conditions, so long as the license type internally detected by Licensed for the dependency is on the list of allowed licenses, even though that license type has not been recorded in the metadata.
🐛 A pass from the general:check-dep-licenses task even though an other license type is defined in the dependency metadata can cause confusion and wasted time for project maintainers who will be given the impression that the system is not enforcing compatible licenses.
🐛 Lack of documentation of determined license type for the dependency makes evaluation of open source compliance of a project more challenging.
To reproduce
Setup
$ mkdir /tmp/foo-module
$ cd /tmp/foo-module
$ echo 'sources:
  go: true
allowed:
  - apache-2.0
' > .licensed.yml
$ wget --output-document=LICENSE.txt https://raw.githubusercontent.com/spdx/license-list-data/refs/tags/v3.27.0/text/Apache-2.0.txt
[...]
$ echo 'Licensed under the MIT and Apache License 2.0 licenses.
' > COPYING.txt
$ go mod init example.com/foo-module
go: creating new go.mod: module example.com/foo-module
$ echo 'package main
func main() {}
' > main.go
$ licensed cache
Caching dependency records for foo-module
go
Caching example.com/foo-module ()
* 1 go dependencies
$ cat .licenses/go/example.com/foo-module.dep.yml
---
name: example.com/foo-module
version:
type: go
summary:
homepage: https://pkg.go.dev/example.com/foo-module
license: other
licenses:
- sources: LICENSE.txt
  text: |
    Apache License
    Version 2.0, January 2004
    http://www.apache.org/licenses/
    [...]
- sources: COPYING.txt
  text: "Licensed under the MIT and Apache License 2.0 licenses.\n\n"
notices: []
🙂 From the content of the metadata file, we can see that two possible sources of licensing information were identified. Since the automated license type identifications from the two sources were not consistent, the license key was set to other.
Reproduction
$ licensed status
Checking cached dependency records for foo-module
.
1 dependencies checked, 0 errors found.
$ echo $?
0
🐛 The licensed status command returned a zero exit status even though the license type of the dependency is not explicitly defined in the metadata.
Expected behavior
A license type determination must be documented in the metadata for every dependency. The general:check-dep-licenses task should fail whenever an other license type value is present in the metadata.
arduino/tooling-project-assets version
Operating system
- Linux
- Windows
Operating system version
- Ubuntu 24.04
- Windows 11
Additional context
In the above demo, I set up the necessary conditions in the project's own license documentation solely for the sake of simplicity. Since Licensed treats the project's package the same as any other dependency, that is sufficient. However, the more significant impact is when it occurs with an actual external dependency. An example of a dependency that produces the conditions is gopkg.in/yaml.v3@v3.0.1.
Since licensed status does not have the desired behavior, I attempted to add a separate check to the general:check-dep-licenses task for presence of other license types in the metadata. I assumed I could use licensed list for this purpose:
licensed list \
--format=json \
--licenses \
| \
jq \
--exit-status \
'[.apps[].sources[].dependencies[]] | all(.license != "other")'
Unfortunately, this doesn't work because the list command only provides the license type identified from the dependency codebase, not the type defined in the dependency license metadata cache.
So the above command will produce false positives due to returning a non-zero exit status when run in any project that has dependencies that are automatically assigned a license type of other by Licensed, even after the maintainer has manually defined the correct license type in the dependency's license metadata cache. Such dependencies are very common, so this renders licensed list unusable for this purpose. Unfortunately, from the reception of licensee/licensed#646, it doesn't seem likely that the required capability will ever be added to Licensed.
Likewise, since the project is not under active development, it seems unlikely that the Licensed developers will ever correct the bad behavior of licensed status.
For these reasons, it is likely the only resolution would be implementing a bespoke check from scratch, consisting of code to recurse through the dependency license metadata cache folder and parse each of the YAML files contained therein.
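As a rough illustration of what such a bespoke check might look like, here is a minimal sketch. It makes several assumptions: the function name find_other_licenses is hypothetical, the metadata files are assumed to use the .dep.yml extension seen in the demo above, and GNU grep is assumed to be available. Rather than fully parsing the YAML, it relies on the top-level license key starting at the first column of the file, which may not be robust enough for a real implementation.

```shell
# Hypothetical helper: not part of Licensed or of the existing project assets.
# Prints the path of every metadata file under the given cache folder whose
# top-level "license" key is set to the placeholder value "other".
# Returns 1 if any such file is found, 2 if the folder is missing, 0 otherwise.
find_other_licenses() {
  local cache_dir="${1:-.licenses}"
  local matches

  if [ ! -d "$cache_dir" ]; then
    echo "error: metadata cache folder not found: $cache_dir" >&2
    return 2
  fi

  # The pattern is anchored to the start of the line, so indented or quoted
  # license text stored under the "licenses" key is not matched.
  # grep exits non-zero when nothing matches, hence the "|| true".
  matches="$(
    grep \
      --recursive \
      --files-with-matches \
      --include='*.dep.yml' \
      --extended-regexp \
      '^license:[[:space:]]*"?other"?[[:space:]]*$' \
      "$cache_dir" || true
  )"

  if [ -n "$matches" ]; then
    echo "License type not determined in metadata for:" >&2
    printf '%s\n' "$matches"
    return 1
  fi

  return 0
}
```

Under these assumptions, the general:check-dep-licenses task could run find_other_licenses .licenses in addition to licensed status, and fail if either returns a non-zero exit status. Because it reads the metadata cache directly, a dependency whose license key has been manually corrected by the maintainer would no longer be reported.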