This repository was archived by the owner on Apr 1, 2025. It is now read-only.

Commit 31d396e

Merge branch 'master' into semantic-python
2 parents: e9968ca + 23df12a

62 files changed (+474, -904 lines)


.dockerignore

Lines changed: 7 additions & 0 deletions
@@ -1 +1,8 @@
 Dockerfile
+.licenses
+.ghc.environment.x86_64-darwin-8.6.5
+
+/bin
+/dist-newstyle
+/notices
+/docs

.hlint.yaml

Lines changed: 1 addition & 1 deletion
@@ -67,7 +67,7 @@
 # Conveniences
 - warning: {lhs: maybe a pure, rhs: maybeM a, name: Use maybeM}
 - warning: {lhs: either (const a) id, rhs: fromRight a, name: use fromRight}
-- warning: {lhs: either id (const a), rhs: fromLeft a, name: use fromRight}
+- warning: {lhs: either id (const a), rhs: fromLeft a, name: use fromLeft}
 
 # Applicative style
 - warning: {lhs: f <$> pure a <*> b, rhs: f a <$> b, name: Avoid redundant pure}

.travis.yml

Lines changed: 7 additions & 5 deletions
@@ -4,6 +4,7 @@ cache:
 directories:
 - $HOME/.cabal/packages
 - $HOME/.cabal/store
+- $TRAVIS_BUILD_DIR/dist-newstyle
 
 before_cache:
 - rm -fv $HOME/.cabal/packages/hackage.haskell.org/build-reports.log
@@ -28,15 +29,16 @@ before_install:
 - cabal --version
 
 install:
-- cabal new-update hackage.haskell.org,HEAD
-- cabal new-configure --enable-tests --write-ghc-environment-files=always
-- cabal new-build --only-dependencies -j
+- cabal new-update -v
+- cabal new-configure --enable-tests --disable-optimization --write-ghc-environment-files=always --jobs=2
+- cabal new-build --only-dependencies
 
 script:
-- cabal new-build -j
+- cabal new-build
 - cabal new-run semantic:test
 - cabal new-run semantic-core:spec
-- cabal new-run semantic:parse-examples
+# parse-examples is disabled because it slaughters our CI
+# - cabal new-run semantic:parse-examples
 
 # Any branch linked with a pull request will be built, as well as the non-PR
 # branches listed below:

Dockerfile

Lines changed: 8 additions & 12 deletions
@@ -1,23 +1,19 @@
 FROM haskell:8.6 as build
 WORKDIR /build
-RUN cabal new-update
 
-# Build our upstream dependencies after copying in only enough to tell cabal
-# what they are. This will make these layers cache better even as we change the
-# code of semantic itself.
+# Build and cache the dependencies first so we can cache these layers.
 COPY semantic.cabal .
-COPY cabal.project .
-COPY semantic-core/semantic-core.cabal ./semantic-core/
-COPY vendor ./vendor
+COPY semantic-core semantic-core
+RUN cabal new-update hackage.haskell.org,HEAD
+RUN cabal new-configure semantic semantic-core
 RUN cabal new-build --only-dependencies
 
-# Once the dependencies are built, copy in the rest of the code and compile
-# semantic itself.
-COPY . /build
-RUN cabal new-build semantic:exe:semantic
+# Copy in and build the entire project
+COPY . .
+RUN cabal new-build --flags="release" semantic:exe:semantic
 
 # A fake `install` target until we can get `cabal new-install` to work
-RUN cp $(find dist-newstyle -name semantic -type f -perm -u=x) /usr/local/bin/semantic
+RUN cp $(find dist-newstyle/build/x86_64-linux -name semantic -type f -perm -u=x) /usr/local/bin/semantic
 
 # Create a fresh image containing only the compiled CLI program, so that the
 # image isn't bulked up by all of the extra build state.
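
The narrowed `find` in the fake `install` step above can be tried outside Docker. What follows is a minimal sketch and not part of the commit: `find_built_exe` is a hypothetical helper name, and any directory you point it at is your own assumption about what `cabal new-build` leaves under `dist-newstyle`.

```shell
#!/bin/sh
# find_built_exe ROOT NAME  (hypothetical helper, not part of the commit)
# Print every regular file called NAME under ROOT whose owner-execute bit is
# set. `-type f` skips directories that happen to share the name, and
# `-perm -u=x` skips files their owner cannot execute.
find_built_exe() {
  find "$1" -name "$2" -type f -perm -u=x
}
```

Called as `find_built_exe dist-newstyle/build/x86_64-linux semantic`, it runs the same search as the new `RUN cp` line. Like the Dockerfile, a `cp "$(find_built_exe ...)" /usr/local/bin/semantic` built on top of it only behaves well when exactly one file matches.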

README.md

Lines changed: 1 addition & 1 deletion
@@ -108,7 +108,7 @@ cabal new-test
 cabal new-run semantic -- --help
 ```
 
-`semantic` requires at least GHC 8.6.4. We recommend using [`ghcup`][ghcup] to sandbox GHC versions. Our version bounds are based on [Stackage][stackage] LTS versions. The current LTS version is 13.13. `stack` as a build tool is not officially supported; there is an unofficial [`stack.yaml`](https://gist.github.com/jkachmar/f200caee83280f1f25e9cfa2dd2b16bb) available, though we cannot make guarantees as to its stability.
+`semantic` requires at least GHC 8.6.4 and Cabal 2.4. We recommend using [`ghcup`][ghcup] to sandbox GHC versions. `stack` as a build tool is not officially supported; there is an unofficial [`stack.yaml`](https://gist.github.com/jkachmar/f200caee83280f1f25e9cfa2dd2b16bb) available, though we cannot make guarantees as to its stability.
 
 [nix]: https://www.haskell.org/cabal/users-guide/nix-local-build-overview.html
 [stackage]: https://stackage.org

cabal.project

Lines changed: 0 additions & 5 deletions
@@ -14,8 +14,3 @@ source-repository-package
 type: git
 location: https://github.com/joshvera/proto3-wire.git
 tag: 84664e22f01beb67870368f1f88ada5d0ad01f56
-
-source-repository-package
-type: git
-location: https://github.com/rewinfrey/hspec-expectations-pretty-diff
-tag: 94af5871c24ba319f7f72fefa53c1a4d074c9a29

docs/adding-new-languages.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# Adding new languages to Semantic
+
+This document exists to outline the process associated with adding a new language to Semantic. Though the Semantic authors have architected the library such that adding new languages and syntax [requires no changes to existing code](https://en.wikipedia.org/wiki/Expression_problem), adding support for a new language is a nontrivial amount of work. Those willing to take the plunge will probably need a degree of Haskell experience.
+
+Please note that this list of steps reflects the state of Semantic as is, not where we authors are taking it: we're working on significant simplifications to this process (see the FAQs below).
+
+## The procedure
+
+1. **Find or write a [tree-sitter](https://tree-sitter.github.io) parser for your language.** The tree-sitter [organization page](https://github.com/tree-sitter) has a number of parsers beyond those we currently support in Semantic; look there first to make sure you're not duplicating work. The tree-sitter [documentation on creating parsers](http://tree-sitter.github.io/tree-sitter/creating-parsers) provides an exhaustive look at the process of developing and debugging tree-sitter parsers. Though we do not support grammars written with other toolkits such as [ANTLR](https://www.antlr.org), translating an ANTLR or other BNF-style grammar into a tree-sitter grammar is usually straightforward.
+2. **Create a Haskell library providing an interface to that C source.** The [`haskell-tree-sitter`](https://github.com/tree-sitter/haskell-tree-sitter/tree/master/languages) repository provides a Cabal package for each supported language. You can find an example of a pull request to add such a package here. Each package needs to provide two API surfaces:
+* a bridged (via the FFI) reference to the toplevel parser in the generated file ([example](https://github.com/tree-sitter/haskell-tree-sitter/blob/master/languages/json/internal/TreeSitter/JSON/Internal.hs))
+* symbol datatypes for each syntax node in the parser, generated with the `mkSymbolDatatype` Template Haskell splice ([example](https://github.com/tree-sitter/haskell-tree-sitter/blob/master/languages/json/TreeSitter/JSON.hs))
+3. **Identify the new syntax nodes required to represent your language.** While we provide an extensive library of reusable AST nodes for [literals](https://github.com/github/semantic/blob/master/src/Data/Syntax/Literal.hs), [expressions](https://github.com/github/semantic/blob/master/src/Data/Syntax/Expression.hs), [statements](https://github.com/github/semantic/blob/master/src/Data/Syntax/Statement.hs), and [types](https://github.com/github/semantic/blob/master/src/Data/Syntax/Type.hs), most languages will require some syntax nodes not found in other languages. You'll need to create a new module providing those data types, and those data types must be written as an open union: [here](https://github.com/github/semantic/commits/master/src/Language/Ruby/Syntax.hs?author=charliesome) is an example for Ruby's syntactic details.
+4. **Write an assignment step that translates tree-sitter trees into Haskell datatypes.** More information about this can be found in the [assignment documentation](assignment.md). This is currently the most time-consuming and error-prone part of the process (see [issue #77](https://github.com/github/semantic/issues/77)).
+5. **Implement `Evaluatable` instances and add new [`Value` effects](https://github.com/github/semantic/blob/master/src/Control/Abstract/Value.hs) as needed to describe the control flow of your language.** While several features of Semantic (e.g. `semantic parse --symbols` and `semantic diff`) will become fully available given a working assignment step, further features based on concrete or abstract interpretation (such as `semantic graph`) require implementing the `Evaluatable` typeclass and providing value-style effects for each control flow feature provided by the language. This means that language support is a spectrum: Semantic can provide useful information without any knowledge of a language's semantics, but each successive addition to its interpretive capabilities enables more functionality.
+6. **Add tests for diffing, tagging, graphing, and evaluating code written in that language.** Because tree-sitter grammars often change, we require extensive testing so as to avoid the unhappy situation of bitrotted languages that break as soon as a new grammar comes down the line.
+
+To summarize, each interaction made possible by the Semantic CLI corresponds to one (or more) of the above steps:
+
+| Step | Interaction |
+|------|-----------------|
+| 1, 2 | `ts-parse` |
+| 3, 4 | `parse`, `diff` |
+| 5, 6 | `graph` |
+
+
+# FAQs
+
+**This sounds hard.** You're right! It is currently a lot of work: just because the Semantic architecture is extensible in the expression-problem manner does not mean that adding new support is trivial.
+
+**Will this get easier in the future?** Unequivocally, yes. The Semantic authors are currently working on a new architecture for language support and parsing, one that dispenses with the assignment step altogether: in the future, `haskell-tree-sitter` will generate Haskell data types from tree-sitter grammars; instead of assigning these types into an open-union of syntax functors, you'll describe how these types are translated into the [Semantic core language](https://github.com/github/semantic/blob/master/semantic-core/src/Data/Core.hs). This will decouple syntax nodes from the process of interpretation and evaluation; all evaluators will be written in terms of the Core language. We hope that this will make the process of adding new languages significantly easier than it currently is, given that it entirely obviates the third and fourth steps listed above.

docs/why-tree-sitter.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ To serve these goals, the following options were considered alongside `tree-sitt
 5. **Performance is decoupled from specific algorithm.** Similarly, grammar specifications are intimately coupled to performance characteristics using whatever algorithms will support them; a grammar which parses very efficiently with one algorithm may be a worst case for another.
 6. **There isn’t a universally accepted format for grammar specification.** BNF, and EBNF, are under-specified, and often unsupported; useful only for informal specification to humans, and not for formal specification to machines.
 7. **Language specifications are complex.** Some languages’ grammar specs turn out to be complex, for example [Java's language specification](https://docs.oracle.com/javase/specs/jls/se9/html/index.html). Similarly problematic is [Swift's spec](https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html), described by @robrix as "a subtle and intricate work of fiction".
-8. **Open source.** By using tree-sitterwe can lean on open source contributors to do grammar development work.
+8. **Open source.** By using tree-sitter we can lean on open source contributors to do grammar development work.
 9. **Low learning curve.** Writing grammars in JavaScript (as opposed to some custom notation/language) is quite powerful.
 10. **Multiple algorithms for handling ambiguity.** Precedence annotations at compile time, GLR at runtime.
 11. **External scanner support.** In case you need to parse a context free grammar. An example of an external scanner is in [Ruby's language support](https://github.com/tree-sitter/tree-sitter-ruby/blob/master/src/scanner.cc).

script/clone-example-repos

Lines changed: 3 additions & 3 deletions
@@ -13,10 +13,10 @@
 set -e
 cd $(dirname "$0")/..
 
-mkdir vendor || true
-git clone --single-branch --recurse-submodules https://github.com/tree-sitter/haskell-tree-sitter.git vendor/haskell-tree-sitter
+mkdir -p test/examplerepos || true
+git clone --single-branch --recurse-submodules https://github.com/tree-sitter/haskell-tree-sitter.git tmp/haskell-tree-sitter || true
 
-dir="vendor/haskell-tree-sitter/languages"
+dir="tmp/haskell-tree-sitter/languages"
 
 # clone_repo LOCAL_PATH URL SHA
 function clone_repo {

script/publish

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+#!/bin/bash
+#/ Usage: script/publish
+#/
+#/ Build a docker image of the semantic CLI and publish to the GitHub Package Registry
+
+set -e
+cd $(dirname "$0")/..
+
+VERSION="0.6.0"
+BUILD_SHA=$(git rev-parse HEAD 2>/dev/null)
+DOCKER_IMAGE=docker.pkg.github.com/github/semantic/semantic
+
+# Build
+docker build -t $DOCKER_IMAGE .
+
+# Make sure semantic is in the image.
+docker run --rm $DOCKER_IMAGE --version
+
+# Requires that you've logged in to the GPR (e.g. `docker login docker.pkg.github.com`)
+# https://help.github.com/en/articles/configuring-docker-for-use-with-github-package-registry
+docker tag $DOCKER_IMAGE $DOCKER_IMAGE:latest
+docker tag $DOCKER_IMAGE $DOCKER_IMAGE:$VERSION
+docker tag $DOCKER_IMAGE $DOCKER_IMAGE:sha_$BUILD_SHA
+docker push $DOCKER_IMAGE:sha_$BUILD_SHA
+docker push $DOCKER_IMAGE:$VERSION
+docker push $DOCKER_IMAGE:latest
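
The three `docker tag` and `docker push` pairs in `script/publish` amount to a small naming scheme: one image published under a commit-specific, a version, and a `latest` reference. The sketch below is not part of the commit; `image_refs` is a hypothetical helper name, and it only prints the references in the order the script pushes them.

```shell
#!/bin/sh
# image_refs IMAGE VERSION SHA  (hypothetical helper, not part of the commit)
# Print the three references script/publish pushes, one per line, in push
# order: the commit-specific tag first, then the version, then `latest`.
image_refs() {
  printf '%s:sha_%s\n' "$1" "$3"
  printf '%s:%s\n' "$1" "$2"
  printf '%s:latest\n' "$1"
}
```

With the script's own variables, `image_refs "$DOCKER_IMAGE" "$VERSION" "$BUILD_SHA"` would list the same three references. Each printed line could then be passed to `docker tag "$DOCKER_IMAGE" <ref>` followed by `docker push <ref>`.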
