Merge pull request #52 from github/examples-docs

Patrick Thomson · web-flow · commit afa4511b8b45 · 2019-06-04T11:28:42.000-04:00
Add documentation detailing example uses for the CLI.
diff --git a/README.md b/README.md
@@ -2,6 +2,8 @@
 
 `semantic` is a Haskell library and command line tool for parsing, analyzing, and comparing source code.
 
+In a hurry? Check out our documentation of [example uses for the `semantic` command line tool](docs/examples.md).
+
 | Table of Contents |
 | :------------- |
 | [Usage](#usage) |
diff --git a/docs/examples.md b/docs/examples.md
@@ -0,0 +1,147 @@
+# Quick usage examples
+
+## Parse trees
+
+Semantic uses [tree-sitter](https://github.com/tree-sitter/tree-sitter) to generate parse trees, but layers in a more generalized notion of syntax terms across all supported programming languages. We'll see why this is important when we get to diffs and program analysis, but for now let's just inspect some output. It helps to have a simple program to parse, so let's create one. Open a file `test.A.py` and paste in the following:
+
+``` python
+def Foo(x):
+    return x
+print Foo("hi")
+```
+
+Now, let's generate an abstract syntax tree (AST).
+
+``` bash
+$ semantic parse test.A.py
+(Statements
+  (Annotation
+    (Function
+      (Identifier)
+      (Identifier)
+      (Return
+        (Identifier)))
+    (Empty))
+  (Call
+    (Identifier)
+    (Call
+      (Identifier)
+      (TextElement)
+      (Empty))
+    (Empty)))
+```
+
+The default s-expression output is a good format for quickly visualizing the structure of code. We can see a function declaration followed by a call expression nested inside another call expression, which matches the function calls to `print` and `Foo`. Feel free to play with some of the other output formats. For example, the following gives back the same AST in JSON, with much more information about each node, including the span and range of each syntactic element in the original source file.
+
+``` bash
+$ semantic parse --json test.A.py
+```
+
+## Diffs
+
+Now, let's look at a simple, syntax-aware diff. Create a second file `test.B.py` that looks like this (the function `Foo` has been renamed to `Bar`):
+
+``` python
+def Bar(x):
+    return x
+print Bar("hi")
+```
+
+First, let's see what a conventional line-based diff looks like.
+
+``` bash
+$ git diff --no-index test.A.py test.B.py
+```
+``` diff
+diff --git a/test.A.py b/test.B.py
+index 81f210023..5f37f4260 100644
+--- a/test.A.py
++++ b/test.B.py
+@@ -1,3 +1,3 @@
+-def Foo(x):
++def Bar(x):
+     return x
+-print Foo("hi")
++print Bar("hi")
+```
+
+Now, let's look at a syntax-aware diff.
+
+``` bash
+$ semantic diff test.A.py test.B.py
+(Statements
+  (Annotation
+    (Function
+    { (Identifier)
+    ->(Identifier) }
+      (Identifier)
+      (Return
+        (Identifier)))
+    (Empty))
+  (Call
+    (Identifier)
+    (Call
+    { (Identifier)
+    ->(Identifier) }
+      (TextElement)
+      (Empty))
+    (Empty)))
+```
+
+Notice the difference? Instead of showing that entire lines were added and removed, the semantic diff is aware that the identifier of the function declaration and function call changed. Pretty cool.
+
+## Import graphs
+
+OK, now for the fun stuff. Semantic can currently produce a couple of different graph-based views of source code. Let's first take a look at import graphs. An import graph shows how files include other files within a software project. For this example, we are going to write a little more code to see this work. Start by creating a couple of new files:
+
+``` python
+# main.py
+import numpy as np
+from a import foo as foo_
+
+def qux():
+    return foo_()
+
+foo_(1)
+```
+
+``` python
+# a.py
+def foo(x):
+    return x
+```
+
+Now, let's graph.
+
+``` bash
+$ semantic graph main.py
+digraph
+{
+
+  "a.py (Module)" [style="dotted, rounded" shape="box"]
+  "main.py (Module)" [style="dotted, rounded" shape="box"]
+  "numpy (Unknown Module)" [style="dotted, rounded" shape="box" color="red" fontcolor="red"]
+  "main.py (Module)" -> "a.py (Module)" [len="5.0" label="imports"]
+  "main.py (Module)" -> "numpy (Unknown Module)" [len="5.0" label="imports"]
+}
+```
+
+To make this easier to visualize, let's use the `dot` utility from `graphviz` and write this graph to SVG:
+
+``` bash
+$ semantic graph main.py | dot -Tsvg > main.html && open main.html
+```
+
+You'll get something that looks like this:
+
+![an import graph](images/import_graph.svg)
+
+## Call graphs
+
+Call graphs expand on the import graphing capabilities by adding vertices and edges that identify named symbols and the connections between them. Taking the same example code, simply add `--calls` to the invocation of `semantic`:
+
+``` bash
+$ semantic graph --calls main.py | dot -Tsvg > main.html && open main.html
+```
+
+![a call graph](images/call_graph.svg)
diff --git a/docs/images/call_graph.svg b/docs/images/call_graph.svg
@@ -0,0 +1,133 @@
+<svg width="333pt" height="410pt"
+ viewBox="0.00 0.00 332.61 410.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 406)">
+<title>%3</title>
+<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-406 328.6089,-406 328.6089,4 -4,4"/>
+<!-- a.py (Module) -->
+<g id="node1" class="node">
+<title>a.py (Module)</title>
+<path fill="none" stroke="#000000" stroke-dasharray="1,5" d="M84.1348,-262C84.1348,-262 11.955,-262 11.955,-262 5.955,-262 -.045,-256 -.045,-250 -.045,-250 -.045,-238 -.045,-238 -.045,-232 5.955,-226 11.955,-226 11.955,-226 84.1348,-226 84.1348,-226 90.1348,-226 96.1348,-232 96.1348,-238 96.1348,-238 96.1348,-250 96.1348,-250 96.1348,-256 90.1348,-262 84.1348,-262"/>
+<text text-anchor="middle" x="48.0449" y="-239.8" font-family="Times,serif" font-size="14.00" fill="#000000">a.py (Module)</text>
+</g>
+<!-- a.py::foo (Function [1, 1] &#45; [3, 1]) -->
+<g id="node7" class="node">
+<title>a.py::foo (Function [1, 1] &#45; [3, 1])</title>
+<g id="a_node7"><a xlink:title="[1, 1] &#45; [3, 1]">
+<path fill="none" stroke="#000000" d="M184.8109,-122C184.8109,-122 111.2789,-122 111.2789,-122 105.2789,-122 99.2789,-116 99.2789,-110 99.2789,-110 99.2789,-98 99.2789,-98 99.2789,-92 105.2789,-86 111.2789,-86 111.2789,-86 184.8109,-86 184.8109,-86 190.8109,-86 196.8109,-92 196.8109,-98 196.8109,-98 196.8109,-110 196.8109,-110 196.8109,-116 190.8109,-122 184.8109,-122"/>
+<text text-anchor="middle" x="148.0449" y="-99.8" font-family="Times,serif" font-size="14.00" fill="#000000">foo (Function)</text>
+</a>
+</g>
+</g>
+<!-- a.py (Module)&#45;&gt;a.py::foo (Function [1, 1] &#45; [3, 1]) -->
+<g id="edge1" class="edge">
+<title>a.py (Module)&#45;&gt;a.py::foo (Function [1, 1] &#45; [3, 1])</title>
+<path fill="none" stroke="#ff0000" d="M60.9135,-225.9841C78.219,-201.7563 109.12,-158.4949 129.0891,-130.5382"/>
+<polygon fill="#ff0000" stroke="#ff0000" points="131.9439,-132.5631 134.9082,-122.3914 126.2477,-128.4944 131.9439,-132.5631"/>
+<text text-anchor="middle" x="117.876" y="-185.8" font-family="Times,serif" font-size="14.00" fill="#000000">defines</text>
+</g>
+<!-- main.py (Module) -->
+<g id="node2" class="node">
+<title>main.py (Module)</title>
+<path fill="none" stroke="#000000" stroke-dasharray="1,5" d="M200.9141,-402C200.9141,-402 107.1757,-402 107.1757,-402 101.1757,-402 95.1757,-396 95.1757,-390 95.1757,-390 95.1757,-378 95.1757,-378 95.1757,-372 101.1757,-366 107.1757,-366 107.1757,-366 200.9141,-366 200.9141,-366 206.9141,-366 212.9141,-372 212.9141,-378 212.9141,-378 212.9141,-390 212.9141,-390 212.9141,-396 206.9141,-402 200.9141,-402"/>
+<text text-anchor="middle" x="154.0449" y="-379.8" font-family="Times,serif" font-size="14.00" fill="#000000">main.py (Module)</text>
+</g>
+<!-- main.py (Module)&#45;&gt;a.py (Module) -->
+<g id="edge2" class="edge">
+<title>main.py (Module)&#45;&gt;a.py (Module)</title>
+<path fill="none" stroke="#000000" d="M100.9885,-365.9661C79.1556,-355.3239 56.7292,-339.2668 47.0449,-316 41.397,-302.4308 41.1477,-286.139 42.5296,-272.4873"/>
+<polygon fill="#000000" stroke="#000000" points="46.048,-272.5855 43.9155,-262.2075 39.1107,-271.6501 46.048,-272.5855"/>
+<text text-anchor="middle" x="93.4346" y="-336.8" font-family="Times,serif" font-size="14.00" fill="#000000">imports</text>
+</g>
+<!-- numpy (Module) -->
+<g id="node3" class="node">
+<title>numpy (Module)</title>
+<path fill="none" stroke="#000000" stroke-dasharray="1,5" d="M155.8109,-316C155.8109,-316 68.279,-316 68.279,-316 62.279,-316 56.279,-310 56.279,-304 56.279,-304 56.279,-292 56.279,-292 56.279,-286 62.279,-280 68.279,-280 68.279,-280 155.8109,-280 155.8109,-280 161.8109,-280 167.8109,-286 167.8109,-292 167.8109,-292 167.8109,-304 167.8109,-304 167.8109,-310 161.8109,-316 155.8109,-316"/>
+<text text-anchor="middle" x="112.0449" y="-293.8" font-family="Times,serif" font-size="14.00" fill="#000000">numpy (Module)</text>
+</g>
+<!-- main.py (Module)&#45;&gt;numpy (Module) -->
+<g id="edge3" class="edge">
+<title>main.py (Module)&#45;&gt;numpy (Module)</title>
+<path fill="none" stroke="#000000" d="M141.5019,-365.7227C137.9566,-360.1633 134.254,-353.948 131.2656,-348 127.7638,-341.03 124.4933,-333.26 121.6786,-325.9623"/>
+<polygon fill="#000000" stroke="#000000" points="124.9005,-324.5841 118.1438,-316.4233 118.3367,-327.0164 124.9005,-324.5841"/>
+<text text-anchor="middle" x="153.4346" y="-336.8" font-family="Times,serif" font-size="14.00" fill="#000000">imports</text>
+</g>
+<!-- main.py::foo_ (Variable [7, 1] &#45; [7, 5]) -->
+<g id="node5" class="node">
+<title>main.py::foo_ (Variable [7, 1] &#45; [7, 5])</title>
+<g id="a_node5"><a xlink:title="[7, 1] &#45; [7, 5]">
+<path fill="none" stroke="#000000" d="M210.6729,-262C210.6729,-262 133.4169,-262 133.4169,-262 127.4169,-262 121.4169,-256 121.4169,-250 121.4169,-250 121.4169,-238 121.4169,-238 121.4169,-232 127.4169,-226 133.4169,-226 133.4169,-226 210.6729,-226 210.6729,-226 216.6729,-226 222.6729,-232 222.6729,-238 222.6729,-238 222.6729,-250 222.6729,-250 222.6729,-256 216.6729,-262 210.6729,-262"/>
+<text text-anchor="middle" x="172.0449" y="-239.8" font-family="Times,serif" font-size="14.00" fill="#000000">foo_ (Variable)</text>
+</a>
+</g>
+</g>
+<!-- main.py (Module)&#45;&gt;main.py::foo_ (Variable [7, 1] &#45; [7, 5]) -->
+<g id="edge4" class="edge">
+<title>main.py (Module)&#45;&gt;main.py::foo_ (Variable [7, 1] &#45; [7, 5])</title>
+<path fill="none" stroke="#00ff00" d="M165.9663,-365.9735C169.1046,-360.4211 172.1415,-354.1468 174.0449,-348 181.722,-323.2074 180.1489,-293.4742 177.2921,-272.1616"/>
+<polygon fill="#00ff00" stroke="#00ff00" points="180.7329,-271.5068 175.7727,-262.1448 173.8121,-272.5566 180.7329,-271.5068"/>
+<text text-anchor="middle" x="189.8726" y="-336.8" font-family="Times,serif" font-size="14.00" fill="#000000">calls</text>
+</g>
+<!-- main.py::qux (Function [4, 1] &#45; [6, 1]) -->
+<g id="node8" class="node">
+<title>main.py::qux (Function [4, 1] &#45; [6, 1])</title>
+<g id="a_node8"><a xlink:title="[4, 1] &#45; [6, 1]">
+<path fill="none" stroke="#000000" d="M311.6492,-316C311.6492,-316 236.4406,-316 236.4406,-316 230.4406,-316 224.4406,-310 224.4406,-304 224.4406,-304 224.4406,-292 224.4406,-292 224.4406,-286 230.4406,-280 236.4406,-280 236.4406,-280 311.6492,-280 311.6492,-280 317.6492,-280 323.6492,-286 323.6492,-292 323.6492,-292 323.6492,-304 323.6492,-304 323.6492,-310 317.6492,-316 311.6492,-316"/>
+<text text-anchor="middle" x="274.0449" y="-293.8" font-family="Times,serif" font-size="14.00" fill="#000000">qux (Function)</text>
+</a>
+</g>
+</g>
+<!-- main.py (Module)&#45;&gt;main.py::qux (Function [4, 1] &#45; [6, 1]) -->
+<g id="edge5" class="edge">
+<title>main.py (Module)&#45;&gt;main.py::qux (Function [4, 1] &#45; [6, 1])</title>
+<path fill="none" stroke="#ff0000" d="M196.2031,-365.9415C206.3744,-360.7811 216.9296,-354.7123 226.0449,-348 235.5406,-341.0076 244.8004,-332.0415 252.6132,-323.6375"/>
+<polygon fill="#ff0000" stroke="#ff0000" points="255.2622,-325.9261 259.3468,-316.1503 250.0575,-321.2452 255.2622,-325.9261"/>
+<text text-anchor="middle" x="260.876" y="-336.8" font-family="Times,serif" font-size="14.00" fill="#000000">defines</text>
+</g>
+<!-- main.py::foo_ (Variable [5, 12] &#45; [5, 16]) -->
+<g id="node4" class="node">
+<title>main.py::foo_ (Variable [5, 12] &#45; [5, 16])</title>
+<g id="a_node4"><a xlink:title="[5, 12] &#45; [5, 16]">
+<path fill="none" stroke="#000000" d="M312.6729,-208C312.6729,-208 235.4169,-208 235.4169,-208 229.4169,-208 223.4169,-202 223.4169,-196 223.4169,-196 223.4169,-184 223.4169,-184 223.4169,-178 229.4169,-172 235.4169,-172 235.4169,-172 312.6729,-172 312.6729,-172 318.6729,-172 324.6729,-178 324.6729,-184 324.6729,-184 324.6729,-196 324.6729,-196 324.6729,-202 318.6729,-208 312.6729,-208"/>
+<text text-anchor="middle" x="274.0449" y="-185.8" font-family="Times,serif" font-size="14.00" fill="#000000">foo_ (Variable)</text>
+</a>
+</g>
+</g>
+<!-- main.py::foo_ (Variable [5, 12] &#45; [5, 16])&#45;&gt;a.py::foo (Function [1, 1] &#45; [3, 1]) -->
+<g id="edge6" class="edge">
+<title>main.py::foo_ (Variable [5, 12] &#45; [5, 16])&#45;&gt;a.py::foo (Function [1, 1] &#45; [3, 1])</title>
+<path fill="none" stroke="#0000ff" d="M247.6312,-171.9716C228.8148,-159.1286 203.3349,-141.7376 182.8589,-127.7619"/>
+<polygon fill="#0000ff" stroke="#0000ff" points="184.6565,-124.7513 174.4239,-122.0047 180.7103,-130.533 184.6565,-124.7513"/>
+<text text-anchor="middle" x="246.7969" y="-142.8" font-family="Times,serif" font-size="14.00" fill="#000000">references</text>
+</g>
+<!-- main.py::foo_ (Variable [7, 1] &#45; [7, 5])&#45;&gt;a.py::foo (Function [1, 1] &#45; [3, 1]) -->
+<g id="edge7" class="edge">
+<title>main.py::foo_ (Variable [7, 1] &#45; [7, 5])&#45;&gt;a.py::foo (Function [1, 1] &#45; [3, 1])</title>
+<path fill="none" stroke="#0000ff" d="M162.8094,-225.7163C160.3767,-220.1567 158.0211,-213.9429 156.541,-208 150.315,-183.0011 148.3844,-153.6829 147.8982,-132.5587"/>
+<polygon fill="#0000ff" stroke="#0000ff" points="151.3948,-132.2773 147.7605,-122.3253 144.3955,-132.3715 151.3948,-132.2773"/>
+<text text-anchor="middle" x="185.7969" y="-185.8" font-family="Times,serif" font-size="14.00" fill="#000000">references</text>
+</g>
+<!-- a.py::x (Variable [2, 12] &#45; [2, 13]) -->
+<g id="node6" class="node">
+<title>a.py::x (Variable [2, 12] &#45; [2, 13])</title>
+<g id="a_node6"><a xlink:title="[2, 12] &#45; [2, 13]">
+<path fill="none" stroke="#000000" d="M177.5121,-36C177.5121,-36 118.5778,-36 118.5778,-36 112.5778,-36 106.5778,-30 106.5778,-24 106.5778,-24 106.5778,-12 106.5778,-12 106.5778,-6 112.5778,0 118.5778,0 118.5778,0 177.5121,0 177.5121,0 183.5121,0 189.5121,-6 189.5121,-12 189.5121,-12 189.5121,-24 189.5121,-24 189.5121,-30 183.5121,-36 177.5121,-36"/>
+<text text-anchor="middle" x="148.0449" y="-13.8" font-family="Times,serif" font-size="14.00" fill="#000000">x (Variable)</text>
+</a>
+</g>
+</g>
+<!-- a.py::foo (Function [1, 1] &#45; [3, 1])&#45;&gt;a.py::x (Variable [2, 12] &#45; [2, 13]) -->
+<g id="edge8" class="edge">
+<title>a.py::foo (Function [1, 1] &#45; [3, 1])&#45;&gt;a.py::x (Variable [2, 12] &#45; [2, 13])</title>
+<path fill="none" stroke="#00ff00" d="M148.0449,-85.7616C148.0449,-74.3597 148.0449,-59.4342 148.0449,-46.494"/>
+<polygon fill="#00ff00" stroke="#00ff00" points="151.545,-46.2121 148.0449,-36.2121 144.545,-46.2121 151.545,-46.2121"/>
+<text text-anchor="middle" x="160.8726" y="-56.8" font-family="Times,serif" font-size="14.00" fill="#000000">calls</text>
+</g>
+<!-- main.py::qux (Function [4, 1] &#45; [6, 1])&#45;&gt;main.py::foo_ (Variable [5, 12] &#45; [5, 16]) -->
+<g id="edge9" class="edge">
+<title>main.py::qux (Function [4, 1] &#45; [6, 1])&#45;&gt;main.py::foo_ (Variable [5, 12] &#45; [5, 16])</title>
+<path fill="none" stroke="#00ff00" d="M274.0449,-279.6793C274.0449,-262.821 274.0449,-237.5651 274.0449,-218.147"/>
+<polygon fill="#00ff00" stroke="#00ff00" points="277.545,-218.0501 274.0449,-208.0502 270.545,-218.0502 277.545,-218.0501"/>
+<text text-anchor="middle" x="286.8726" y="-239.8" font-family="Times,serif" font-size="14.00" fill="#000000">calls</text>
+</g>
+</g>
+</svg>
diff --git a/docs/images/import_graph.svg b/docs/images/import_graph.svg
@@ -0,0 +1,39 @@
+<svg width="292pt" height="130pt"
+ viewBox="0.00 0.00 292.04 130.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 126)">
+<title>%3</title>
+<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-126 288.0381,-126 288.0381,4 -4,4"/>
+<!-- a.py (Module) -->
+<g id="node1" class="node">
+<title>a.py (Module)</title>
+<path fill="none" stroke="#000000" stroke-dasharray="1,5" d="M84.1348,-36C84.1348,-36 11.955,-36 11.955,-36 5.955,-36 -.045,-30 -.045,-24 -.045,-24 -.045,-12 -.045,-12 -.045,-6 5.955,0 11.955,0 11.955,0 84.1348,0 84.1348,0 90.1348,0 96.1348,-6 96.1348,-12 96.1348,-12 96.1348,-24 96.1348,-24 96.1348,-30 90.1348,-36 84.1348,-36"/>
+<text text-anchor="middle" x="48.0449" y="-13.8" font-family="Times,serif" font-size="14.00" fill="#000000">a.py (Module)</text>
+</g>
+<!-- main.py (Module) -->
+<g id="node2" class="node">
+<title>main.py (Module)</title>
+<path fill="none" stroke="#000000" stroke-dasharray="1,5" d="M169.9141,-122C169.9141,-122 76.1757,-122 76.1757,-122 70.1757,-122 64.1757,-116 64.1757,-110 64.1757,-110 64.1757,-98 64.1757,-98 64.1757,-92 70.1757,-86 76.1757,-86 76.1757,-86 169.9141,-86 169.9141,-86 175.9141,-86 181.9141,-92 181.9141,-98 181.9141,-98 181.9141,-110 181.9141,-110 181.9141,-116 175.9141,-122 169.9141,-122"/>
+<text text-anchor="middle" x="123.0449" y="-99.8" font-family="Times,serif" font-size="14.00" fill="#000000">main.py (Module)</text>
+</g>
+<!-- main.py (Module)&#45;&gt;a.py (Module) -->
+<g id="edge1" class="edge">
+<title>main.py (Module)&#45;&gt;a.py (Module)</title>
+<path fill="none" stroke="#000000" d="M107.1394,-85.7616C96.4997,-73.5615 82.342,-57.3273 70.5516,-43.8076"/>
+<polygon fill="#000000" stroke="#000000" points="73.1381,-41.4483 63.9276,-36.2121 67.8624,-46.0492 73.1381,-41.4483"/>
+<text text-anchor="middle" x="111.4346" y="-56.8" font-family="Times,serif" font-size="14.00" fill="#000000">imports</text>
+</g>
+<!-- numpy (Unknown Module) -->
+<g id="node3" class="node">
+<title>numpy (Unknown Module)</title>
+<path fill="none" stroke="#ff0000" stroke-dasharray="1,5" d="M272.0313,-36C272.0313,-36 126.0586,-36 126.0586,-36 120.0586,-36 114.0586,-30 114.0586,-24 114.0586,-24 114.0586,-12 114.0586,-12 114.0586,-6 120.0586,0 126.0586,0 126.0586,0 272.0313,0 272.0313,0 278.0313,0 284.0313,-6 284.0313,-12 284.0313,-12 284.0313,-24 284.0313,-24 284.0313,-30 278.0313,-36 272.0313,-36"/>
+<text text-anchor="middle" x="199.0449" y="-13.8" font-family="Times,serif" font-size="14.00" fill="#ff0000">numpy (Unknown Module)</text>
+</g>
+<!-- main.py (Module)&#45;&gt;numpy (Unknown Module) -->
+<g id="edge2" class="edge">
+<title>main.py (Module)&#45;&gt;numpy (Unknown Module)</title>
+<path fill="none" stroke="#000000" d="M139.1626,-85.7616C149.944,-73.5615 164.2905,-57.3273 176.2382,-43.8076"/>
+<polygon fill="#000000" stroke="#000000" points="178.9512,-46.0231 182.9505,-36.2121 173.7058,-41.3877 178.9512,-46.0231"/>
+<text text-anchor="middle" x="186.4346" y="-56.8" font-family="Times,serif" font-size="14.00" fill="#000000">imports</text>
+</g>
+</g>
+</svg>