1- # Compiler Spec & Design
1+ # _ Compiler _ Spec & Design
22
3- The Compiler component sits between the Frontend component and the Codegen
4- component. The purpose of this component is to typecheck (or, more accurately,
5- kindcheck) the Frontend's output (_ Compiler Input_ ) and perform other additional
6- validation checks necessary to ensure that the Codegen component is capable of
7- processing the _ Compiler Output_ .
3+ The _ Compiler_ component sits between the _ Frontend_ component and the code
4+ generation component named _ Codegen_ . The purpose of the _ Compiler_ is to
5+ typecheck terms generated by the Frontend (terms referred to as _ Compiler Input_ )
6+ and perform other additional validation checks. The end goal of the _ Compiler_
7+ is to ensure that the _ Codegen_ component is capable of processing the _ Compiler
8+ Output_ .
89
9- ## Compiler interface
10+ ## _ Compiler _ interface
1011
11- The Compiler operates on the Compiler Input proto, effectively enabling any
12- number of Frontends to interface with the Compiler in a language agnostic
13- manner.
12+ The _ Compiler_ operates on the _ Compiler Input proto_ - enabling, any _ Frontend_
13+ to interface with the _ Compiler_ in a language agnostic manner.
1414
15- Similarly, the Compiler Output is also a proto that is then consumed by Codegen
16- modules that can be written in any programming environment capable of
17- communicating using Google Protobufs .
15+ Similarly, the _ Compiler Output _ is also a _ proto _ that is then consumed by
16+ _ Codegen _ modules, able to be written in any programming environment capable of
17+ communicating via Google Protocol Buffers .
1818
1919## Checking type definitions
2020
21- One primary purpose of the compiler is to typecheck the schemata that users
22- define via the Frontend. The lambda calculus representation of schemata is "one
23- level up" from the term level: ADT schemata declarations are the "terms" and
24- kinds are the "types".
25-
26- Because the schema language only supports data declarations for algebraic data
27- types, and does not support function definitions, the only two admissible kinds
28- are ` Type ` and ` TyCon ` s of a specific arity, such that a ` TyCon ` can accept an
29- arbitrary number of arguments of kind ` Type ` but can ** only** accept arguments
30- of kind ` Type ` . Put another way: An expression's kind is the same thing as its
31- arity, and only expressions of arity=0 (i.e. expressions of kind ` Type ` ) can be
32- applied.
33-
34- To elaborate on the previous point: A user can define a higher kinded data type,
35- such as ` data Maybe a = Nothing | Just a ` in the schema language. However users
36- cannot define a data type which has a higher kinded type as a _ parameter_ , and
37- users must "fully saturate" higher kinded types with well-scoped type variables
38- or zero arity ` Type ` s to pass them as arguments to a ` TyCon ` . Consequently, a
39- definition such as ` data HKT f a = HKT (f a) ` is not valid because `f :: Type ->
40- Type`. Restricting the schema language in this way does reduce its
41- expressiveness, but allowing higher kinded type parameters would greatly limit
42- (or, at the very least, greatly complicate) our ability to codegen for a variety
43- of target languages with type systems less expressive than Haskell.
44-
45- The schema language supports recursive types, and so the compiler must be
46- capable of checking recursive types.
47-
48- In addition to typechecking, the compiler must perform a special check for
21+ The primary purpose of the compiler is to typecheck the schemata that users
22+ define via the _ Frontend_ . The schemata terms are ADT declarations and their
23+ types are kinds.
24+
25+ Currently the schema language supports:
26+
27+ 1 . types (such as ` Int ` or ` Bool ` ),
28+
29+ 2 . and type functions (such as ` Maybe ` or ` Either ` ).
30+
31+ There are future plans to expand this to Higher Kinded Types (such as ` MaybeT ` ,
32+ ` StateT ` etc.) - subject to research into _ Codegen_ of such types in the target
33+ languages. The schema language and the _ Compiler_ do support recursive
34+ types.
35+
36+ Schemata terms must be monomorphically kinded, with polymorphic kinds defaulting
37+ to monomorphic ones. For example ` Phantom a = Phantom ` would resolve to the
38+ monomorphic kind ` Type → Type ` rather than the polymorphic kind ` ∀a. a → Type ` .
39+
40+ <!-- cstml comments: Is this true? I think it's an Aim not a requirement?
41+ :todo: -->
42+
43+ In addition to typechecking, the compiler must perform a special check for
4944recursive types: It must validate that a recursive type is inhabited (or
5045inhabitable). The purpose of this check is to ensure that any schema which
5146passes validation is (in principle) a schema for which type definitions and
5247typeclass "instances" (which may be simple functions in languages without
5348typeclass support) can be generated. As an example, the compiler should reject
54- types such as ` data F a = F (F a) ` , which is uninhabited (and uninhabitable).
49+ types such as ` data F a = F (F a) ` , which is uninhabited.
50+
51+ <!-- cstml comments: Is this true? Seems as though this should fail
52+ parsing. :todo: -->
5553
5654(Provisional:) Finally, the compiler should _ normalize_ expressions as far as it
5755is able to. For example, it may be possible to define a data type in the schema
@@ -62,12 +60,12 @@ generation.
6260
6361## Checking type class definitions and instance clauses
6462
65- The Compiler should, if possible, ensure that all instance declarations for
63+ The _ Compiler _ should, if possible, ensure that all instance declarations for
6664schemata are derivable using hard-coded derivation axioms. Because the checks
6765relevant to validating type class instances ough to be entirely separate from
6866the checks enumerated above, they can be worked on separately at a later date
6967when the design of the typeclass system has been fleshed out more.
7068
7169## Unsolved Problems
7270
73- - How do we represent recursive types in our lambda calculus AST?
71+ - [ ] How do we represent recursive types in our lambda calculus AST?
0 commit comments