Commit c081df0

KevinJi22 authored and AyanSinhaMahapatra committed
Add report
Signed-off-by: Kevin Ji <kyji1011@gmail.com>
1 parent 55d9be6 commit c081df0

2 files changed: 61 additions, 1 deletion

docs/source/archive/gsoc-toc.rst

Lines changed: 1 addition & 1 deletion
@@ -17,8 +17,8 @@ GSoC 2022
1717
gsoc/reports/2022/scancodeio_akhil
1818
gsoc/reports/2022/scancode_workbench_omkar
1919
gsoc/reports/2022/vulnerablecode_vulntotal_keshav
20-
2120
gsoc/reports/2022/vulnerablecode_ziad
21+
gsoc/reports/2022/scancode_kevin
2222

2323
GSoC 2021
2424
---------
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
1+
========================================================================
2+
Extending license detection to use licenses external to ScanCode Toolkit
3+
========================================================================
4+
5+
6+
| **Organization:** `AboutCode <https://aboutcode.org>`_
7+
| **Project:** `ScanCode Toolkit <https://github.com/nexB/scancode-toolkit>`_
8+
| **Mentee:** `Kevin Ji (KevinJi22) <https://github.com/KevinJi22>`_
9+
| **Mentors:** Philippe Ombredanne, AyanSinhaMahapatra, Jono Yang
10+
11+
Overview
12+
--------
13+
14+
When doing license detection, ScanCode uses the licenses and rules in the ScanCode LicenseDB.
15+
The goal of this project is to extend the capabilities of ScanCode license detection to include
16+
licenses that are external to LicenseDB, such as proprietary licenses to be kept within an
17+
organization. I also extended it to include licenses installed from external sources.
18+
19+
Implementation
20+
--------------
21+
22+
All the work I did is contained in `this single PR <https://github.com/nexB/scancode-toolkit/pull/2979>`_.
23+
I added a new command-line option called ``--additional-license-directory`` that users can use
24+
to include additional licenses/rules contained in other directories in the license index.
25+
ScanCode Toolkit uses this license index when doing license detection.
26+
This option must be used together with ``--reindex-licenses`` to explicitly regenerate the license cache,
27+
and then when doing license scans, users can just use the regular ``--license`` option and these
28+
additional licenses and/or rules will be used in license detection.
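
The two-step workflow described above can be sketched as a pair of command lines. This is only an illustration, not output from the PR: the directory and file paths are hypothetical, ``--json-pp`` is assumed as the output option, and the exact way the flags combine may differ between ScanCode Toolkit versions. The helper functions only build and print the commands; they do not run ScanCode.

```python
import shlex
from typing import List


def reindex_command(license_dir: str) -> List[str]:
    """Build the argv that regenerates the license cache,
    including licenses/rules from an additional directory."""
    return [
        "scancode",
        "--reindex-licenses",
        "--additional-license-directory",
        license_dir,  # hypothetical path supplied by the user
    ]


def scan_command(target: str, output: str) -> List[str]:
    """Build the argv for a regular license scan.

    No extra flag is needed here: the additional licenses are
    already part of the regenerated license index."""
    return ["scancode", "--license", "--json-pp", output, target]


if __name__ == "__main__":
    # Step 1: rebuild the index once, with the extra directory.
    print(shlex.join(reindex_command("/path/to/private-licenses")))
    # Step 2: scan as usual.
    print(shlex.join(scan_command("src/", "scan.json")))
```

The point of the split is that reindexing is the only step that needs to know about the extra directory; subsequent scans reuse the cached index.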
29+
30+
This change also allows users to install directories of licenses or rules to their local machine,
31+
and then ScanCode Toolkit will detect and include them in the license cache when someone is
32+
reindexing the licenses. If someone wants to create a directory of licenses or rules that they
33+
want to install and use in ScanCode Toolkit, they must subclass a new Plugin class I added.
34+
This allows ScanCode Toolkit to identify the location of these installed licenses/rules
35+
through a unique entry point and add them to the license index.
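
The plugin mechanism described above can be illustrated with a minimal, self-contained sketch. The report does not name the plugin class or its methods, so everything here is a hypothetical stand-in: ``LicenseDirectoryPlugin``, ``get_license_directory``, ``AcmeLicenses``, and the path are assumptions made for illustration, not the API added in the PR. In a real installable package the subclass would be advertised through a packaging entry point rather than passed in as a list.

```python
from pathlib import Path
from typing import Iterable, List


class LicenseDirectoryPlugin:
    """Hypothetical stand-in for the plugin base class added in the PR."""

    def get_license_directory(self) -> Path:
        """Return the directory holding the installed licenses/rules."""
        raise NotImplementedError


class AcmeLicenses(LicenseDirectoryPlugin):
    """Hypothetical plugin shipped by a package of private licenses."""

    def get_license_directory(self) -> Path:
        return Path("/opt/acme/licenses")  # hypothetical install location


def collect_license_directories(plugin_classes: Iterable[type]) -> List[Path]:
    """Gather the extra directories a reindex step would add to the index.

    Classes that do not subclass the plugin base are rejected, which
    mirrors the idea of a dedicated, unique extension point."""
    directories = []
    for cls in plugin_classes:
        if not issubclass(cls, LicenseDirectoryPlugin):
            raise TypeError(f"{cls.__name__} is not a license directory plugin")
        directories.append(cls().get_license_directory())
    return directories


if __name__ == "__main__":
    for directory in collect_license_directories([AcmeLicenses]):
        print(directory)
```

Requiring a subclass gives the indexing step one well-defined place to ask each installed package where its licenses live, instead of guessing at file system locations.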
36+
37+
Finally, all these changes are tested through multiple unit tests validating both correct
38+
behavior and error handling as needed.
39+
40+
Post GSoC
41+
---------
42+
43+
I would like to merge this PR into ScanCode Toolkit, hopefully allowing users to leverage
44+
this feature to expand their license detection capabilities.
45+
46+
Links
47+
-----
48+
49+
`Project idea <https://github.com/nexB/aboutcode/wiki/GSOC-2022#scancode-toolkit-enable-detection-of-private-licenses>`_
50+
51+
`Official GSoC project page <https://summerofcode.withgoogle.com/programs/2022/projects/e2m1eokW>`_
52+
53+
`GSoC Proposal <https://docs.google.com/document/d/1FGkFTN79Hq
54+
-Z0FLVZdeqn1B9TgTamo9T3Mux1HU4h8M/edit?usp=sharing>`_
55+
56+
Acknowledgements
57+
----------------
58+
59+
Thanks to Jono and Philippe for being my mentors. I enjoyed all the meetings, code reviews,
60+
and design discussions. Thank you for your time and your patience!
