Commit dba69d9

Merge pull request #101 from JohT/feature/update-documentation
Update documentation to showcase anomaly archetype graph visualizations
2 parents 540d31d + 7059b37 commit dba69d9

1 file changed

+49
-19
lines changed

README.md

Lines changed: 49 additions & 19 deletions
@@ -1,14 +1,14 @@
11
# Code Graph Analysis Pipeline Examples
22

3-
This repository provides examples of how to analyze TypeScript code and Java artifacts using a fully automated GitHub Workflows pipeline with the [code-graph-analysis-pipeline](https://github.com/JohT/code-graph-analysis-pipeline).
3+
This repository provides examples of how to analyze TypeScript code and Java artifacts using a fully automated GitHub Actions workflow pipeline with the [code-graph-analysis-pipeline](https://github.com/JohT/code-graph-analysis-pipeline).
44

55
The process involves three steps:
66

7-
1. **Extract**: Upload TypeScript source code and/or Java artifacts, optionally including their git history, using [actions/upload-artifact](https://github.com/actions/upload-artifact).
7+
1. **Extract**: Upload TypeScript source code and/or Java artifacts, optionally including their Git history, using [actions/upload-artifact](https://github.com/actions/upload-artifact).
88

99
1. **Analyze**: Use the shared workflow [JohT/code-graph-analysis-pipeline/.github/workflows/public-analyze-code-graph.yml](https://github.com/JohT/code-graph-analysis-pipeline/blob/main/.github/workflows/public-analyze-code-graph.yml) to analyze the code and artifacts, then upload the results.
1010

11-
1. **Use**: Download the analysis results with [actions/download-artifact](https://github.com/actions/download-artifact) and utilize them as needed.
11+
1. **Use**: Download the analysis results with [actions/download-artifact](https://github.com/actions/download-artifact) and consume them as needed.
1212

1313
## Table of Contents
1414
<!-- TOC -->
@@ -37,17 +37,21 @@ The process involves three steps:
3737
- [Clustering coefficient vs. Page Rank](#clustering-coefficient-vs-page-rank)
3838
- [Java Types that are surprisingly central or popular](#java-types-that-are-surprisingly-central-or-popular)
3939
- [Largest Java Type Clusters](#largest-java-type-clusters)
40-
- [Java Type Anomalies](#java-type-anomalies)
40+
- [Java Type Top 1 Authority](#java-type-top-1-authority)
41+
- [Java Type Top 1 Bottleneck](#java-type-top-1-bottleneck)
42+
- [Java Type Top 1 Bridge](#java-type-top-1-bridge)
43+
- [Java Type Top 1 Hub](#java-type-top-1-hub)
44+
- [Java Type Top 1 Outlier](#java-type-top-1-outlier)
4145

4246
<!-- /TOC -->
4347

4448
## :rocket: TypeScript Code Pipeline
4549

46-
This example demonstrates how to analyze TypeScript code in a GitHub Workflows pipeline.
50+
This example demonstrates how to analyze TypeScript code in a GitHub Actions workflow.
4751

48-
1. The first job, [prepare-code-to-analyze](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/typescript-code-analysis.yml#L40), in the GitHub Actions Workflow [typescript-code-analysis.yml](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/typescript-code-analysis.yml), shows how to extract TypeScript code from a repository and upload it for analysis.
52+
1. The first job, [prepare-code-to-analyze](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/typescript-code-analysis.yml#L40), in the workflow [typescript-code-analysis.yml](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/typescript-code-analysis.yml), shows how to extract TypeScript code from a repository and upload it for analysis.
4953

50-
2. The second job, [analyze-code-graph](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/typescript-code-analysis.yml#L89), calls the shared analysis workflows using the uploaded artifacts' names as parameters. Here is a simple example:
54+
2. The second job, [analyze-code-graph](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/typescript-code-analysis.yml#L89), calls the shared analysis workflow using the uploaded artifacts' names as parameters. Example:
5155

5256
```yaml
5357
name: Analyze Code Graph
@@ -64,11 +68,11 @@ This example demonstrates how to analyze TypeScript code in a GitHub Workflows p
6468
6569
Java artifacts are analyzed similarly to TypeScript code. The main difference is that Java artifacts are downloaded from a Maven repository instead of being part of the repository.
6670
67-
To include the git history in the analysis, checkout the corresponding source repository and upload it as the source artifact, similar to the TypeScript example. The Java source code isn't used for the analysis, so a bare git clone is sufficient.
71+
To include Git history in the analysis, check out the corresponding source repository and upload it as the source artifact, as in the TypeScript example. The Java source code isn't used in the analysis, so a bare git clone is sufficient.
6872
69-
The first job, [prepare-code-to-analyze](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/java-code-analysis.yml#L40), in the GitHub Actions Workflow [java-code-analysis.yml](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/java-code-analysis.yml), shows how to prepare the Java artifacts and git history for analysis.
73+
The first job, [prepare-code-to-analyze](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/java-code-analysis.yml#L40), in the workflow [java-code-analysis.yml](https://github.com/JohT/code-graph-analysis-examples/blob/23143b34d8fc6e0ab7d80102d8de0b6e6a4ec98e/.github/workflows/java-code-analysis.yml), shows how to prepare the Java artifacts and Git history for analysis.
7074
71-
The second and third jobs are the same as for the TypeScript example.
75+
The second and third jobs are the same as in the TypeScript example.
7276
7377
## :bookmark_tabs: CSV Report Reference
7478
@@ -100,7 +104,7 @@ This repository is licensed under the Apache License, Version 2.0. See [LICENSE]
100104

101105
## :bar_chart: Analysis Results
102106

103-
Here are some examples from over a hundred reports generated by the analysis. These examples illustrate the results of analyzing [AxonFramework](https://github.com/AxonFramework/AxonFramework), a Java framework for Evolutionary Message-Driven Microservices on the JVM. For the complete set of reports, visit the [analysis-results](./analysis-results) directory.
107+
Below are examples drawn from more than a hundred reports produced by the analysis. They illustrate results from analyzing [AxonFramework](https://github.com/AxonFramework/AxonFramework), a Java framework for evolutionary, message-driven microservices on the JVM. For the complete set of reports, see the [analysis-results](./analysis-results) directory.
104108

105109
### External Dependencies of Java Packages
106110

@@ -120,7 +124,7 @@ Here are some examples from over a hundred reports generated by the analysis. Th
120124

121125
### Object-Oriented Design Metrics for Java Packages
122126

123-
<img src="./analysis-results/AxonFramework/latest/object-oriented-design-metrics-java/ObjectOrientedDesignMetricsJava_files/ObjectOrientedDesignMetricsJava_41_0.png" width="600" alt="Object-Oriented Design Metrics for Java packages">
127+
<img src="./analysis-results/AxonFramework/latest/object-oriented-design-metrics-java/ObjectOrientedDesignMetricsJava_files/ObjectOrientedDesignMetricsJava_41_0.png" width="600" alt="Object-oriented design metrics for Java packages">
124128

125129
### Effective Line Count of Java Methods
126130

@@ -140,7 +144,7 @@ Here are some examples from over a hundred reports generated by the analysis. Th
140144

141145
### Word Cloud of Git Authors
142146

143-
<img src="./analysis-results/AxonFramework/latest/wordcloud/Wordcloud_files/Wordcloud_16_0.png" width="600" alt="Word cloud of git authors">
147+
<img src="./analysis-results/AxonFramework/latest/wordcloud/Wordcloud_files/Wordcloud_16_0.png" width="600" alt="Word cloud of Git authors">
144148

145149
### Number of distinct commit authors
146150

@@ -152,18 +156,44 @@ Here are some examples from over a hundred reports generated by the analysis. Th
152156

153157
### Clustering coefficient vs. Page Rank
154158

155-
This scatter plot compares the importance of Java types to the density of their connections. The Y axis shows the [PageRank](https://en.wikipedia.org/wiki/PageRank) score. Higher values indicate more important and frequently used types. The X axis shows the [clustering coefficient](https://en.wikipedia.org/wiki/Clustering_coefficient). Higher values mean more densely connected neighborhoods. Important bridge or hub Types can be found on the Top-left. Highly influential nodes in dense, well-connected communities can be found on the Top-Right.
159+
The scatter plot below compares the importance of Java types to the density of their connections. The Y axis shows the [PageRank](https://en.wikipedia.org/wiki/PageRank) score (higher values indicate more important and frequently used types). The X axis shows the [clustering coefficient](https://en.wikipedia.org/wiki/Clustering_coefficient) (higher values indicate more densely connected neighborhoods). Important bridge or hub types appear toward the top-left; highly influential nodes in dense communities appear toward the top-right.
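
As a rough illustration of the two axes, the following sketch computes a local clustering coefficient (X axis) and a PageRank score (Y axis) for a small, made-up dependency graph. The edge list, type names, damping factor, and iteration count are assumptions for illustration only; the pipeline's actual implementation and parameters are not shown in this README.

```python
from itertools import combinations

# Hypothetical dependency edges: (dependent type, type it depends on).
EDGES = [
    ("A", "Core"), ("B", "Core"), ("C", "Core"), ("D", "Core"),
    ("X", "Y"), ("X", "Z"), ("Y", "Z"),
]
NODES = sorted({node for edge in EDGES for node in edge})


def undirected_neighbors(edges):
    """Build adjacency sets, ignoring edge direction."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)
    return adj


def local_clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are directly connected."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(sorted(nbrs), 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)


def page_rank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; dangling rank is spread evenly."""
    nodes = sorted({node for edge in edges for node in edge})
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new = {n: base for n in nodes}
        for n in nodes:
            for d in out[n]:
                new[d] += damping * rank[n] / len(out[n])
        rank = new
    return rank


adj = undirected_neighbors(EDGES)
clustering = {n: local_clustering_coefficient(adj, n) for n in NODES}
ranks = page_rank(EDGES)

for n in NODES:
    print(f"{n}: clustering={clustering[n]:.2f} pagerank={ranks[n]:.3f}")
```

In this toy graph, `Core` has the highest PageRank and a clustering coefficient of 0, so it would sit toward the top-left of such a plot, while `X`, `Y`, and `Z` form a fully connected triangle with a clustering coefficient of 1.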
156160

157-
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type_ClusteringCoefficient_versus_PageRank.svg" width="600" alt="Clustering Coefficient vs. PageRank">
161+
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type/ClusteringCoefficient_versus_PageRank.svg" width="600" alt="Clustering Coefficient vs. PageRank">
158162

159163
### Java Types that are surprisingly central or popular
160164

161-
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type_ClusterNoise_highly_central_and_popular.svg" width="600" alt="">
165+
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type/ClusterNoise_highly_central_and_popular.svg" width="600" alt="Surprisingly central or popular Java Types">
162166

163167
### Largest Java Type Clusters
164168

165-
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type_Clusters_largest_size.svg" width="600" alt="">
169+
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type/Clusters_largest_size.svg" width="600" alt="Largest Java Type Clusters">
166170

167-
### Java Type Anomalies
171+
### Java Type Top 1 Authority
168172

169-
<img src="./analysis-results/AxonFramework/latest/anomaly-detection/Java_Type_Anomalies.svg" width="600" alt="">
173+
An "Authority" is a code unit many important parts depend on: it has high global importance (PageRank) but low local support (ArticleRank). A large PageRank − ArticleRank gap flags widely used utilities or entry points that are central but not well supported locally.
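
The gap described above can be sketched as follows. This is a minimal example that assumes a common formulation of ArticleRank, in which each contribution is divided by the source's out-degree plus the average out-degree of the graph; the README does not state which formulation or parameters the pipeline uses, and the edge list and type names are hypothetical.

```python
# Hypothetical dependency edges: (dependent type, type it depends on).
EDGES = [("A", "Util"), ("B", "Util"), ("C", "Util")]


def rank(edges, damping=0.85, iterations=50, article=False):
    """Unnormalized PageRank, or ArticleRank when article=True.

    Assumption: ArticleRank dampens contributions by adding the average
    out-degree to the denominator.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    avg_out = len(edges) / len(nodes)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: 1 - damping for n in nodes}
        for n in nodes:
            denominator = len(out[n]) + (avg_out if article else 0)
            for d in out[n]:
                new[d] += damping * score[n] / denominator
        score = new
    return score


page_rank = rank(EDGES)
article_rank = rank(EDGES, article=True)
gap = {n: page_rank[n] - article_rank[n] for n in page_rank}
top_authority = max(gap, key=gap.get)
print(top_authority, round(gap[top_authority], 3))
```

With this toy data, only `Util` shows a positive gap, because it is the only node that receives rank from other nodes.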
174+
175+
<img src="./analysis-results/AxonFramework/AxonFramework-4.12.1/anomaly-detection/Java_Type/GraphVisualizations/TopAuthority1.svg" width="600" alt="Top 1 Java Type Authority Graph Visualization">
176+
177+
### Java Type Top 1 Bottleneck
178+
179+
A "Bottleneck" is a code unit with exceptionally high Betweenness centrality: it lies on many shortest paths between other nodes, so it mediates a large fraction of dependency flows and is a potential single point of failure or architectural hotspot. It may indicate an unintended concentration of dependencies: if it were removed, communication between modules would break.
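
To make "lies on many shortest paths" concrete, here is a small sketch of betweenness centrality using Brandes' algorithm on an unweighted, directed graph. The graph and its type names are made up, and the pipeline may compute or normalize this metric differently.

```python
from collections import deque

# Hypothetical dependency graph: type -> types it depends on.
GRAPH = {
    "A": ["Gateway"],
    "B": ["Gateway"],
    "Gateway": ["X", "Y"],
    "X": [],
    "Y": [],
}


def betweenness(adj):
    """Brandes' algorithm for unweighted directed graphs (unnormalized)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


scores = betweenness(GRAPH)
print(scores)
```

Here every shortest path from `A` or `B` to `X` or `Y` passes through `Gateway`, so it is the only node with a non-zero score.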
180+
181+
<img src="./analysis-results/AxonFramework/AxonFramework-4.12.1/anomaly-detection/Java_Type/GraphVisualizations/TopBottleneck1.svg" width="600" alt="Top 1 Java Type Bottleneck Graph Visualization">
182+
183+
### Java Type Top 1 Bridge
184+
185+
A "Bridge" is a code unit that connects different parts of the codebase. It is detected as an anomaly with a high contribution of node embedding features, which encode the structural position in the graph. It highlights code that might integrate various layers or boundaries (e.g., API facades) or violate the architecture (tangled dependencies).
186+
187+
<img src="./analysis-results/AxonFramework/AxonFramework-4.12.1/anomaly-detection/Java_Type/GraphVisualizations/TopBridge1.svg" width="600" alt="Top 1 Java Type Bridge Graph Visualization">
188+
189+
### Java Type Top 1 Hub
190+
191+
A "Hub" is a code unit with a high out-degree (many dependencies) but low clustering coefficient (its neighbors are not well connected). Hubs are central dependencies that many other parts rely on, making them potential fragile hotspots in the architecture. The low clustering coefficient indicates that these hubs may not be well integrated into the surrounding code, increasing the risk of failure if the hub encounters issues.
192+
193+
<img src="./analysis-results/AxonFramework/AxonFramework-4.12.1/anomaly-detection/Java_Type/GraphVisualizations/TopHub1.svg" width="600" alt="Top 1 Java Type Hub Graph Visualization">
194+
195+
### Java Type Top 1 Outlier
196+
197+
An "Outlier" is a code unit that significantly deviates from typical patterns in the codebase. It has a low clustering probability and a high distance to the nearest cluster centroid in the node embedding space. This indicates that the outlier has a unique structural position in the dependency graph, potentially representing specialized functionality or an architectural anomaly.
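
The "distance to the nearest cluster centroid" part of this definition can be illustrated with a minimal sketch. The two-dimensional embeddings, cluster assignments, and type names below are invented for readability; real node embeddings have many more dimensions, and the clustering-probability signal is not modeled here.

```python
import math

# Hypothetical 2-D node embeddings per Java type.
EMBEDDINGS = {
    "TypeA": (0.0, 0.0),
    "TypeB": (0.0, 2.0),
    "TypeC": (10.0, 0.0),
    "TypeD": (10.0, 2.0),
    "TypeE": (5.0, 1.0),   # not assigned to any cluster
}

# Hypothetical cluster assignments (e.g. from a density-based clustering).
CLUSTERS = {
    "cluster-0": ["TypeA", "TypeB"],
    "cluster-1": ["TypeC", "TypeD"],
}


def centroid(points):
    """Component-wise mean of a list of equally sized vectors."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def nearest_centroid_distance(point, centroids):
    """Euclidean distance from a point to its closest centroid."""
    return min(math.dist(point, c) for c in centroids)


centroids = [
    centroid([EMBEDDINGS[name] for name in members])
    for members in CLUSTERS.values()
]
distances = {
    name: nearest_centroid_distance(vector, centroids)
    for name, vector in EMBEDDINGS.items()
}
top_outlier = max(distances, key=distances.get)
print(top_outlier, distances[top_outlier])
```

In this toy data, `TypeE` lies midway between both clusters and therefore has the largest distance to any centroid.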
198+
199+
<img src="./analysis-results/AxonFramework/AxonFramework-4.12.1/anomaly-detection/Java_Type/GraphVisualizations/TopOutlier1.svg" width="600" alt="Top 1 Java Type Outlier Graph Visualization">
