Skip to content

Commit f39b9da

Browse files
Merge branch 'develop-copilot-theorem-counterexamples-take-three'. Close #589.
**Description** Currently, the Copilot.Theorem.What4.prove function returns a list of results, where each result contains a `SatResult` that describes whether a property is `Valid`, `Invalid`, or `Unknown`. The `Invalid` result has the limitation that it does not give any information about a specific counterexample that could drive Copilot into falsifying the property, however. This makes it challenging to interpret what the results of `prove` mean. It would be helpful if `Copilot.Theorem.What4` could offer an API to prove or disprove a property such that disproven properties come with a concrete counterexample. This counterexample information could then be interpreted by users. **Type** * Feature: Add counterexample capabilities to the What4 backend in * `copilot-theorem`. **Additional context** None. **Requester** * Ryan Scott (Galois). **Method to check presence of bug** Not applicable (not a bug). **Expected result** Introduce a new function to `Copilot.Theorem.What4` that mirrors the type signature of `prove`, except that it returns a variant of `SatResult` where the `Invalid` equivalent encodes counterexample information. `copilot-theorem` users can then interpret the results of the counterexample in Copilot specifications. The following Dockerfile runs an example included in the PR that demonstrates the new feature by printing the results of proving some combination of valid and invalid properties, for which counterexamples are produced when applicable with enough information to help diagnose the problem: ```Dockerfile FROM ubuntu:focal ENV DEBIAN_FRONTEND=noninteractive RUN apt-get update RUN apt-get install --yes \ libz-dev \ git \ curl \ gcc \ g++ \ make \ libgmp3-dev \ pkg-config \ z3 RUN mkdir -p $HOME/.ghcup/bin RUN curl https://downloads.haskell.org/~ghcup/0.1.19.2/x86_64-linux-ghcup-0.1.19.2 -o $HOME/.ghcup/bin/ghcup RUN chmod a+x $HOME/.ghcup/bin/ghcup ENV PATH=$PATH:/root/.ghcup/bin/ ENV PATH=$PATH:/root/.cabal/bin/ SHELL ["/bin/bash", "-c"] RUN ghcup install ghc 9.10 RUN ghcup install cabal 3.2 RUN ghcup set ghc 9.10 RUN cabal update SHELL ["/bin/bash", "-c"] CMD git clone $REPO && cd $NAME && git checkout $COMMIT && cd .. \ && cabal v1-sandbox init \ && cabal v1-install alex happy --constraint='happy <= 2' \ && cabal v1-install $NAME/copilot**/ \ && cabal v1-exec -- runhaskell $NAME/copilot/examples/what4/ArithmeticCounterExamples.hs \ && echo "Success" ``` Command (substitute variables based on new path after merge): ``` $ docker run -e "REPO=https://github.com/Copilot-Language/copilot" -e "NAME=copilot" -e "COMMIT=<HASH>" -it copilot-verify-589 ``` **Solution implemented** - Define a `Show` instance for `Type`, to help print error messages. - Add constraints to `XExpr` constructors, to help create `Array`s of `Typed` elements. - Add cases to `Copilot.Theorem.What4.valFromExpr` to handle arrays. - Introduce a new `Copilot.Theorem.What4.proveWithCounterExample` function that produces counter examples for properties found to be invalid. - Add `Show` and `ShowF` instances for counter examples, to help visualize the results of `proveWithCounterExample`. - Add test cases for `proveWithCounterExample`. - Add example demonstrating new features. **Further notes** None.
2 parents e234f1b + 9861071 commit f39b9da

File tree

10 files changed

+584
-62
lines changed

10 files changed

+584
-62
lines changed

copilot-core/CHANGELOG

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
1-
2025-01-28
1+
2025-02-24
22
* Fix typo in documentation. (#587)
3+
* Add a Show instance for Type. (#589)
34

45
2025-01-07
56
* Version bump (4.2). (#577)

copilot-core/src/Copilot/Core/Type.hs

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
{-# LANGUAGE GADTs #-}
66
{-# LANGUAGE KindSignatures #-}
77
{-# LANGUAGE ScopedTypeVariables #-}
8+
{-# LANGUAGE StandaloneDeriving #-}
89
{-# LANGUAGE Trustworthy #-}
910
{-# LANGUAGE TypeApplications #-}
1011
{-# LANGUAGE TypeOperators #-}
@@ -154,6 +155,7 @@ data Type :: * -> * where
154155
, Typed t
155156
) => Type t -> Type (Array n t)
156157
Struct :: (Typed a, Struct a) => a -> Type a
158+
deriving instance Show (Type a)
157159

158160
-- | Return the length of an array from its type
159161
typeLength :: forall n t . KnownNat n => Type (Array n t) -> Int

copilot-theorem/CHANGELOG

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
1-
2025-01-28
1+
2025-02-24
22
* Fix multiple typos in README. (#560)
33
* Fix typo in documentation. (#587)
4+
* Add function to produce counterexamples for invalid properties. (#589)
45

56
2025-01-07
67
* Version bump (4.2). (#577)

copilot-theorem/copilot-theorem.cabal

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -121,6 +121,7 @@ test-suite unit-tests
121121

122122
build-depends:
123123
base
124+
, HUnit
124125
, QuickCheck
125126
, test-framework
126127
, test-framework-quickcheck2

0 commit comments

Comments
 (0)