microsoft
diff --git a/README.md b/README.md
Lines changed: 98 additions & 19 deletions
diff --git a/generators/IMO.py b/generators/IMO.py
Lines changed: 6 additions & 4 deletions
diff --git a/generators/__init__.py b/generators/__init__.py
Lines changed: 0 additions & 1 deletion
diff --git a/generators/classic_puzzles.py b/generators/classic_puzzles.py
Lines changed: 20 additions & 33 deletions
diff --git a/generators/codeforces.py b/generators/codeforces.py
Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,10 @@
 # Python Programming Puzzles (P3)
 
 This repo contains a dataset of Python programming puzzles which can be used to teach and evaluate
-an AI's programming proficiency. We hope this dataset will **grow rapidly**, and it is already diverse in 
-terms of problem difficulty, domain, 
+an AI's programming proficiency. We present code generated by OpenAI's recently released 
+[Codex](https://arxiv.org/abs/2107.03374) 12-billion-parameter neural network 
+that solves many of these puzzles. We hope this dataset will 
+**grow rapidly**, and it is already diverse in terms of problem difficulty, domain, 
 and algorithmic tools needed to solve the problems. Please
 [propose a new puzzle](../../issues/new?assignees=akalai&labels=New-puzzle&template=new-puzzle.md&title=New+puzzle) 
  or [browse newly proposed puzzles](../../issues?q=is%3Aopen+is%3Aissue+label%3ANew-puzzle) 
@@ -33,48 +35,125 @@ your programming compares.
 ## What is a Python programming puzzle?
 
 Each puzzle takes the form of a Python function that takes an answer as an argument. 
-The goal is to find an answer which makes the function return `True`. 
+The answer is an input which makes the function return `True`. 
 This is called *satisfying* the puzzle, and that is why the puzzles are all named `sat`.
 
 ```python
 def sat(s: str):
     return "Hello " + s == "Hello world"
 ```
 
-The answer to the above puzzle is the string `"world"` because `sat("world")` returns `True`. The puzzles range from trivial problems like this, to classic puzzles, 
+The answer to the above puzzle is the string `"world"` because `sat("world")` returns `True`. The puzzles range from
+trivial problems like this, to classic puzzles, 
 to programming competition problems, all the way through open problems in algorithms and mathematics. 
-A slightly harder example is:
+
+The classic [Towers of Hanoi](https://en.wikipedia.org/wiki/Tower_of_Hanoi) puzzle can be written as follows:
 ```python
-def sat(s: str):  
-    """find a string with 1000 o's but no consecutive o's."""
-    return s.count("o") == 1000 and s.count("oo") == 0
+def sat(moves: List[List[int]]):  
+    """
+    Eight disks of sizes 1-8 are stacked on three towers, with each tower having disks in order of largest to
+    smallest. Move [i, j] corresponds to taking the smallest disk off tower i and putting it on tower j, and it
+    is legal as long as the towers remain in sorted order. Find a sequence of moves that moves all the disks
+    from the first to last towers.
+    """
+    rods = ([8, 7, 6, 5, 4, 3, 2, 1], [], [])
+    for [i, j] in moves:
+        rods[j].append(rods[i].pop())
+        assert rods[j][-1] == min(rods[j]), "larger disk on top of smaller disk"
+    return rods[0] == rods[1] == []
 ```
+The shortest answer is a list of 255 moves, so instead we ask the AI to generate *code* that outputs an answer. In 
+this case, the [Codex API](https://beta.openai.com/) generated the following code:
+```python
+def sol():
+    # taken from https://www.geeksforgeeks.org/c-program-for-tower-of-hanoi/
+    moves = []
+    def hanoi(n, source, temp, dest):
+        if n > 0:
+            hanoi(n - 1, source, dest, temp)
+            moves.append([source, dest])
+            hanoi(n - 1, temp, source, dest)
+    hanoi(8, 0, 1, 2)
+    return moves
+```
+This was not its first attempt, but that is one of the advantages of puzzles: the computer can easily check 
+its answers, so it can generate many candidates until one works. For this puzzle, about 1 in 1,000 generated 
+solutions was satisfactory. Clearly, Codex has seen this problem before in other input formats; it even generated a URL!
+(Upon closer inspection, the website exists and contains Python Tower-of-Hanoi code in a completely different format 
+with different variable names.)
+On a harder, less standard [Hanoi puzzle variant](puzzles/README.md#towersofhanoiarbitrary) that 
+requires moving from a particular start position to a particular end position, Codex found no solution in 10,000 attempts. 
+
+Next, consider a puzzle inspired by [this easy competitive programming problem](https://codeforces.com/problemset/problem/58/A) 
+from the [codeforces.com](https://codeforces.com) website:
+```python
+def sat(inds: List[int], string="Sssuubbstrissiingg"):
+    """Find increasing indices to make the substring "substring"""
+    return inds == sorted(inds) and "".join(string[i] for i in inds) == "substring"
+```  
+Codex generated the code below, which when run gives the valid answer `[1, 3, 5, 7, 8, 9, 10, 15, 16]`.
+This satisfies the puzzle because it is an increasing list of indices, and joining the
+characters of `"Sssuubbstrissiingg"` at these indices yields `"substring"`.    
+```python
+def sol(string="Sssuubbstrissiingg"):
+    x = "substring"
+    pos = string.index(x[0])
+    inds = [pos]
+    while True:
+        x = x[1:]
+        if not x:
+            return inds
+        pos = string.find(x[0], pos+1)
+        if pos == -1:
+            return inds
+        inds.append(pos)
+```  
+Again, there are multiple valid answers, and again this took many attempts (only 1 success in 10,000). 
+
 
 A more challenging puzzle that requires [dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming) is the 
 [longest increasing subsequence](https://en.wikipedia.org/wiki/Longest_increasing_subsequence) problem
 which we can also describe with strings:
 ```python
-from typing import List
+def sat(x: List[int], length=20, s="Dynamic programming solves this classic job-interview puzzle!!!"):
+    """Find the indices of the longest subsequence with characters in sorted order"""
+    return all(s[x[i]] <= s[x[i + 1]] and x[i + 1] > x[i] for i in range(length - 1))
 
-def sat(x: List[int], s="Dynamic programming solves this classic job-interview puzzle!!!"): 
-    """Find the indexes (possibly negative!) of the longest monotonic subsequence"""    
-    return all(s[x[i]] <= s[x[i+1]] and x[i+1] > x[i] for i in range(25))
 ```
+Codex didn't solve this one.
 
-The classic [Towers of Hanoi](https://en.wikipedia.org/wiki/Tower_of_Hanoi) puzzle can be written as follows:
+The dataset also has a number of open problems in computer science and mathematics. For example,  
+[Conway's 99-graph problem](https://en.wikipedia.org/w/index.php?title=Conway%27s_99-graph_problem)
+is an unsolved problem in graph theory 
+(see also [Five $1,000 Problems (Update 2017)](https://oeis.org/A248380/a248380.pdf))        
 ```python
-def sat(moves: List[List[int]]):  
-    """moves is list of [from, to] pairs"""
-    t = ([8, 7, 6, 5, 4, 3, 2, 1], [], [])  # towers state
-    return all(t[j].append(t[i].pop()) or t[j][-1] == min(t[j]) for i, j in moves) and t[0] == t[1]
-
+def sat(edges: List[List[int]]):
+    """
+    Find an undirected graph with 99 vertices, in which each two adjacent vertices have exactly one common
+    neighbor, and in which each two non-adjacent vertices have exactly two common neighbors.
+    """
+    # first compute neighbors sets, N:
+    N = {i: {j for j in range(99) if j != i and ([i, j] in edges or [j, i] in edges)} for i in range(99)}
+    return all(len(N[i].intersection(N[j])) == (1 if j in N[i] else 2) for i in range(99) for j in range(i))
 ```
 
+Why puzzles? One reason is that, if we can solve them better than human programmers, 
+then we could make progress on some important algorithmic problems.
+But until then, a second reason is that they can be valuable for training and evaluating AI systems. 
+Many programming datasets have been proposed over the years, and several have problems of a similar nature
+(like programming competition problems). In puzzles, the spec is defined by code, while
+other datasets usually use a combination of English and a hidden test set of input-output pairs. English-based
+specs are notoriously ambiguous and test the system's understanding of English. 
+And with input-output test cases, you would have to have solved a puzzle before you pose it,
+so what is the use there? Code-based specs
+have the advantage that they are unambiguous: there is no need to debug the AI-generated code or to fear that it 
+doesn't do what you want. If it solved the puzzle, then it succeeded by definition. 
+
 For more information on the motivation and how programming puzzles can help AI learn to program, see 
 the paper:  
 *Programming Puzzles*, by Tal Schuster, Ashwin Kalyan, Alex Polozov, and Adam Tauman Kalai. 2021 (Link to be added shortly)  
 
-# [Click here to browse the puzzles](/puzzles/README.md)
+# [Click here to browse the puzzles and solutions](/puzzles/README.md)
 
 The problems in this repo are based on:
 * Wikipedia articles about [algorithms](https://en.wikipedia.org/wiki/List_of_algorithms), [puzzles](https://en.wikipedia.org/wiki/Category:Logic_puzzles),
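
Note on the README hunk above: the generate-and-check loop it describes (sample many candidate programs, keep the first whose output satisfies `sat`) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code. `first_satisfying` and `bad_sol` are hypothetical names introduced here; `sat` and `sol` are copied from the README text, with the `typing` import and the docstring's closing quote added.

```python
from typing import Callable, List, Optional


def sat(inds: List[int], string="Sssuubbstrissiingg"):
    """Find increasing indices to make the substring "substring"."""
    return inds == sorted(inds) and "".join(string[i] for i in inds) == "substring"


def sol(string="Sssuubbstrissiingg"):
    # Greedy left-to-right scan, as in the Codex-generated code quoted in the README.
    x = "substring"
    pos = string.index(x[0])
    inds = [pos]
    while True:
        x = x[1:]
        if not x:
            return inds
        pos = string.find(x[0], pos + 1)
        if pos == -1:
            return inds
        inds.append(pos)


def bad_sol():
    # Hypothetical failing candidate: increasing indices, but they spell "Sss".
    return [0, 1, 2]


def first_satisfying(candidates: List[Callable[[], List[int]]]) -> Optional[List[int]]:
    """Hypothetical helper: return the first candidate output that satisfies sat."""
    for candidate in candidates:
        try:
            answer = candidate()
            if sat(answer) is True:
                return answer
        except Exception:
            # A crashing candidate simply counts as a failed attempt.
            continue
    return None


print(first_satisfying([bad_sol, sol]))  # [1, 3, 5, 7, 8, 9, 10, 15, 16]
```

Because `sat` is the whole specification, no hidden test set is needed: any candidate whose output makes it return `True` is accepted.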
 
@@ -108,7 +108,7 @@ class NoRelativePrimes(PuzzleGenerator):
 
 
     @staticmethod
-    def sat(nums: List[int], b=6, m=2):
+    def sat(nums: List[int], b=7, m=6):
         """
         Let P(n) = n^2 + n + 1.
 
@@ -337,7 +337,7 @@ class HalfTag(PuzzleGenerator):
     taint_date = [2020, 9, 19]
 
     @staticmethod
-    def sat(li: List[int], n=3, tags=[0, 1, 2, 0, 0, 1, 1, 1, 2, 2, 0, 2]):
+    def sat(li: List[int], tags=[3, 0, 3, 2, 0, 1, 0, 3, 1, 1, 2, 2, 0, 2, 1, 3]):
         """
         The input tags is a list of 4n integer tags each in range(n) with each tag occurring 4 times.
         The goal is to find a subset (list) li of half the indices such that:
@@ -353,12 +353,14 @@ def sat(li: List[int], n=3, tags=[0, 1, 2, 0, 0, 1, 1, 1, 2, 2, 0, 2]):
 
         Note the sum of the output is 33 = (0+1+2+...+11)/2 and the selected tags are [0, 0, 1, 1, 2, 2]
         """
+        n = max(tags) + 1
         assert sorted(tags) == sorted(list(range(n)) * 4), "hint: each tag occurs exactly four times"
         assert len(li) == len(set(li)) and min(li) >= 0
         return sum(li) * 2 == sum(range(4 * n)) and sorted([tags[i] for i in li]) == [i // 2 for i in range(2 * n)]
 
     @staticmethod
-    def sol(n, tags):
+    def sol(tags):
+        n = max(tags) + 1
         pairs = {(i, 4 * n - i - 1) for i in range(2 * n)}
         by_tag = {tag: [] for tag in range(n)}
         for p in pairs:
@@ -419,7 +421,7 @@ def gen_random(self):
         tags = [i // 4 for i in range(4 * n)]
         self.random.shuffle(tags)
         # print(self.__class__, n, tick())
-        self.add(dict(n=n, tags=tags))
+        self.add(dict(tags=tags))
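
Note on the `HalfTag` hunks above: after this change `n` is no longer an argument but is inferred from `tags` as `max(tags) + 1`. The sketch below reproduces the new `sat` from the diff and checks it with an exhaustive search. `brute_force` is a hypothetical helper introduced here for illustration; it is not the repository's `sol`, whose body is only partly visible in the diff.

```python
from itertools import combinations
from typing import List, Optional

TAGS = [3, 0, 3, 2, 0, 1, 0, 3, 1, 1, 2, 2, 0, 2, 1, 3]  # new default from the diff


def sat(li: List[int], tags=TAGS):
    n = max(tags) + 1  # inferred from tags instead of being passed in
    assert sorted(tags) == sorted(list(range(n)) * 4), "hint: each tag occurs exactly four times"
    assert len(li) == len(set(li)) and min(li) >= 0
    return sum(li) * 2 == sum(range(4 * n)) and sorted([tags[i] for i in li]) == [i // 2 for i in range(2 * n)]


def brute_force(tags=TAGS) -> Optional[List[int]]:
    """Hypothetical exhaustive search: try every subset of half the indices."""
    n = max(tags) + 1
    for li in combinations(range(4 * n), 2 * n):
        if sat(list(li), tags):
            return list(li)
    return None


answer = brute_force()
print(answer, sat(answer))
```

One answer that can be checked by hand pairs each index `i` with `15 - i`: the list `[0, 15, 1, 14, 3, 12, 5, 10]` sums to 60, half of `sum(range(16))`, and selects every tag exactly twice.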
 
 
 
 
@@ -10,7 +10,6 @@
 from . import compression
 from . import conways_game_of_life
 from . import games
-from . import game_theory
 from . import graphs
 from . import ICPC
 from . import IMO
 
@@ -190,10 +190,6 @@ def sat(quine: str):
     def sol():
         return "(lambda x: f'({x})({chr(34)}{x}{chr(34)})')(\"lambda x: f'({x})({chr(34)}{x}{chr(34)})'\")"
 
-    @staticmethod
-    def sol2():  # thanks for this simple solution, GPT-3!
-        return 'quine'
-
 
 class RevQuine(PuzzleGenerator):
     """Reverse [Quine](https://en.wikipedia.org/wiki/Quine_%28computing%29). The solution we give is from GPT3."""
@@ -252,25 +248,26 @@ class ClockAngle(PuzzleGenerator):
     @staticmethod
     def sat(hands: List[int], target_angle=45):
         """Find clock hands = [hour, min] such that the angle is target_angle degrees."""
-        hour, min = hands
-        return 0 < hour <= 12 and 0 <= min < 60 and ((60 * hour + min) - 12 * min) % 720 == 2 * target_angle
+        h, m = hands
+        assert 0 < h <= 12 and 0 <= m < 60
+        hour_angle = 30 * h + m / 2
+        minute_angle = 6 * m
+        return abs(hour_angle - minute_angle) in [target_angle, 360 - target_angle]
 
     @staticmethod
     def sol(target_angle):
-        for hour in range(1, 13):
-            for min in range(60):
-                if ((60 * hour + min) - 12 * min) % 720 == 2 * target_angle:
-                    return [hour, min]
+        for h in range(1, 13):
+            for m in range(60):
+                hour_angle = 30 * h + m / 2
+                minute_angle = 6 * m
+                if abs(hour_angle - minute_angle) % 360 in [target_angle, 360 - target_angle]:
+                    return [h, m]
+
+    def gen_random(self):
+        target_angle = self.random.randrange(0, 360)
+        if self.sol(target_angle):
+            self.add(dict(target_angle=target_angle))
 
-    def gen(self, target_num_instances):
-        for hour in range(1, 13):
-            for min in range(60):
-                if len(self.instances) == target_num_instances:
-                    return
-                double_angle = ((60 * hour + min) - 12 * min) % 720
-                if double_angle % 2 == 0:
-                    target_angle = double_angle // 2
-                    self.add(dict(target_angle=target_angle))
 
 class Kirkman(PuzzleGenerator):
     """[Kirkman's problem](https://en.wikipedia.org/wiki/Kirkman%27s_schoolgirl_problem)"""
@@ -408,7 +405,6 @@ def mirror(coords):  # rotate to all four corners
             return next(list(mirror(coords)) for coords in combinations(grid, side // 2) if
                         test(coords) and test(mirror(coords)))
 
-
     def gen(self, target_num_instances):
         for easy in range(47):
             for side in range(47):
@@ -449,12 +445,7 @@ def gen_random(self):
 class SquaringTheSquare(PuzzleGenerator):
     """[Squaring the square](https://en.wikipedia.org/wiki/Squaring_the_square)
     Wikipedia gives a minimal [solution with 21 squares](https://en.wikipedia.org/wiki/Squaring_the_square)
-    due to Duijvestijn (1978):
-    ```python
-    [[0, 0, 50], [0, 50, 29], [0, 79, 33], [29, 50, 25], [29, 75, 4], [33, 75, 37], [50, 0, 35],
-     [50, 35, 15], [54, 50, 9], [54, 59, 16], [63, 50, 2], [63, 52, 7], [65, 35, 17], [70, 52, 18],
-     [70, 70, 42], [82, 35, 11], [82, 46, 6], [85, 0, 27], [85, 27, 8], [88, 46, 24], [93, 27, 19]]
-    ```
+    due to Duijvestijn (1978).
     """
 
     @staticmethod
@@ -484,7 +475,7 @@ class NecklaceSplit(PuzzleGenerator):
     """
 
     @staticmethod
-    def sat(n: int, lace="bbbbrrbrbrbbrrrr"):
+    def sat(n: int, lace="bbrbrbbbbbbrrrrrrrbrrrrbbbrbrrbbbrbrrrbrrbrrbrbbrrrrrbrbbbrrrbbbrbbrbbbrbrbb"):
         """
         Find a split dividing the given red/blue necklace in half at n so that each piece has an equal number of
         reds and blues.
@@ -633,15 +624,11 @@ class WaterPouring(PuzzleGenerator):
     """[Water pouring puzzle](https://en.wikipedia.org/w/index.php?title=Water_pouring_puzzle&oldid=985741928)"""
 
     @staticmethod
-    def sat(
-            moves: List[List[int]],
-            capacities=[8, 5, 3],
-            init=[8, 0, 0],
-            goal=[4, 4, 0]
-    ):  # moves is list of [from, to] pairs
+    def sat(moves: List[List[int]], capacities=[8, 5, 3], init=[8, 0, 0], goal=[4, 4, 0]):
         """
         Given an initial state of water quantities in jugs and jug capacities, find a sequence of moves (pouring
         one jug into another until it is full or the first is empty) to reach the given goal state.
+        moves is a list of [from, to] pairs
         """
         state = init.copy()
 
 
@@ -1,4 +1,4 @@
-"""Problems inspired by [codeforces](https://codeforces.com)."""
+"""Problems inspired by the popular programming competition site [codeforces.com](https://codeforces.com)"""
 
 from puzzle_generator import PuzzleGenerator
 from typing import List
@@ -656,7 +656,7 @@ class Sssuubbstriiingg(PuzzleGenerator):
     """Inspired by [Codeforces Problem 58 A](https://codeforces.com/problemset/problem/58/A)"""
 
     @staticmethod
-    def sat(inds: List[int], string="Sssuubbstriiingg"):
+    def sat(inds: List[int], string="Sssuubbstrissiingg"):
         """Find increasing indices to make the substring "substring"""
         return inds == sorted(inds) and "".join(string[i] for i in inds) == "substring"
 
@@ -725,7 +725,7 @@ class Moving0s(PuzzleGenerator):
     @staticmethod
     def sat(seq: List[int], target=[1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0], n_steps=4):
         """
-        Find a sequence of 0's and 1's so that, after n_steps of swapping each adjacent (0, 1), target target sequence
+        Find a sequence of 0's and 1's so that, after n_steps of swapping each adjacent (0, 1), the target sequence
         is achieved.
         """
         s = seq[:]  # copy