Commit 0557455

docs, readme
1 parent a8134f1 commit 0557455

3 files changed

+257
-52
lines changed

README.md

Lines changed: 91 additions & 14 deletions
@@ -1,29 +1,35 @@
11
# sqlite-regex
22

3-
A fast and performant SQLite extension for regular expressions.
3+
A fast and performant SQLite extension for regular expressions. Based on [`sqlite-loadable-rs`](https://github.com/asg017/sqlite-loadable-rs), and the [regex crate](https://crates.io/crates/regex).
44

5-
See [`sqlite-loadable-rs`](https://github.com/asg017/sqlite-loadable-rs), the framework that makes this extension possible.
5+
See [_Introducing sqlite-regex: The fastest Regular Expression Extension for SQLite_](https://observablehq.com/@asg017/introducing-sqlite-regex) (Jan 2023) for more details!
66

7-
## WORK IN PROGRESS
7+
If your company or organization finds this library useful, consider [supporting my work](#supporting)!
88

9-
This extension isn't 100% complete yet, but hoping to release in the next 1-2 weeks! A sneak peek at what to expect:
9+
![](./benchmarks/dates.png)
1010

11-
### The fastest `REGEXP()` implementation in SQLite
11+
## Usage
1212

13-
I don't have a fancy benchmark screenshot yet, but in my Mac, I get ~50% faster results with the `regexp()` in `sqlite-regex` over the "official" [regexp.c](https://github.com/sqlite/sqlite/blob/master/ext/misc/regexp.c) SQLite extension.
14-
15-
### More regex utilities
13+
```sql
14+
.load ./regex0
15+
select 'foo' matches 'f';
1616

17-
Very rarely does `regexp` cover all your regular expression needs. `sqlite-regex` also includes support for many other regex operations, such as:
17+
```
1818

19-
**Find all occurances of a pattern in a string**
19+
**Find all occurrences of a pattern in a string**
2020

2121
```sql
22-
select regex_find('[0-9]{3}-[0-9]{3}-[0-9]{4}', 'phone: 111-222-3333');
22+
select regex_find(
23+
'[0-9]{3}-[0-9]{3}-[0-9]{4}',
24+
'phone: 111-222-3333'
25+
);
2326
-- '111-222-3333'
2427

2528
select rowid, *
26-
from regex_find_all('\b\w{13}\b', 'Retroactively relinquishing remunerations is reprehensible.');
29+
from regex_find_all(
30+
'\b\w{13}\b',
31+
'Retroactively relinquishing remunerations is reprehensible.'
32+
);
2733
/*
2834
┌───────┬───────┬─────┬───────────────┐
2935
│ rowid │ start │ end │ match │
@@ -36,6 +42,19 @@ from regex_find_all('\b\w{13}\b', 'Retroactively relinquishing remunerations is
3642
*/
3743
```
3844

45+
**Use RegexSets to match a string on multiple patterns in linear time**
46+
47+
```sql
48+
select regexset_is_match(
49+
regexset(
50+
'bar',
51+
'foo',
52+
'barfoo'
53+
),
54+
'foobar'
55+
);
56+
```
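
The semantics of the RegexSet example can be sketched in plain Python. This is only an illustration, assuming `regexset_is_match` behaves like `RegexSet::is_match` in the Rust regex crate (true when at least one pattern in the set matches). Python's `re` module stands in for the extension here, `regexset_matches` is a hypothetical helper added for illustration, and the loop checks one pattern at a time rather than scanning the string once as a real RegexSet does.

```python
import re


def regexset_is_match(patterns, text):
    """Return True if at least one pattern matches anywhere in text."""
    compiled = [re.compile(p) for p in patterns]
    return any(r.search(text) is not None for r in compiled)


def regexset_matches(patterns, text):
    """Hypothetical helper: indexes of the patterns that match text."""
    return [i for i, p in enumerate(patterns) if re.search(p, text)]


patterns = ["bar", "foo", "barfoo"]

# 'foobar' contains "foo" and "bar", but not "barfoo".
print(regexset_is_match(patterns, "foobar"))  # True
print(regexset_matches(patterns, "foobar"))   # [0, 1]
```
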
57+
3958
**Split the string on the given pattern delimiter**
4059

4160
```sql
@@ -54,7 +73,7 @@ from regex_split('[ \t]+', 'a b c d e');
5473
*/
5574
```
5675

57-
**Replace occurances of a pattern with another string**
76+
**Replace occurrences of a pattern with another string**
5877

5978
```sql
6079
select regex_replace(
@@ -68,4 +87,62 @@ select regex_replace_all('a', 'abc abc', '');
6887
-- 'bc bc'
6988
```
7089

71-
And more!
90+
## Documentation
91+
92+
See [`docs.md`](./docs.md) for a full API reference.
93+
94+
## Installing
95+
96+
The [Releases page](https://github.com/asg017/sqlite-regex/releases) contains pre-built binaries for Linux x86_64, macOS, and Windows.
97+
98+
### As a loadable extension
99+
100+
If you want to use `sqlite-regex` as a [runtime-loadable extension](https://www.sqlite.org/loadext.html), download the `regex0.dylib` (for macOS), `regex0.so` (Linux), or `regex0.dll` (Windows) file from a release and load it into your SQLite environment.
101+
102+
> **Note:**
103+
> The `0` in the filename (`regex0.dylib`/`regex0.so`/`regex0.dll`) denotes the major version of `sqlite-regex`. Currently `sqlite-regex` is pre-v1, so expect breaking changes in future versions.
104+
105+
For example, if you are using the [SQLite CLI](https://www.sqlite.org/cli.html), you can load the library like so:
106+
107+
```sql
108+
.load ./regex0
109+
select regex_version();
110+
-- v0.1.0
111+
```
112+
113+
Or in Python, using the built-in [sqlite3 module](https://docs.python.org/3/library/sqlite3.html):
114+
115+
```python
116+
import sqlite3
117+
con = sqlite3.connect(":memory:")
118+
con.enable_load_extension(True)
119+
con.load_extension("./regex0")
120+
print(con.execute("select regex_version()").fetchone())
121+
# ('v0.1.0',)
122+
```
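
Loading can fail if the file is missing or if the Python build was compiled without extension support. A minimal defensive sketch follows; the helper name `load_regex_extension` and its `path` default are assumptions for illustration, not part of `sqlite-regex`.

```python
import sqlite3


def load_regex_extension(con, path="./regex0"):
    """Try to load the extension; return True on success, False otherwise.

    Hypothetical helper: the name and the default path are illustrative.
    """
    # Some Python builds omit extension loading entirely.
    if not hasattr(con, "enable_load_extension"):
        return False
    con.enable_load_extension(True)
    try:
        con.load_extension(path)
    except sqlite3.OperationalError:
        # Raised when the shared library cannot be found or loaded.
        return False
    finally:
        # Turn extension loading back off once we are done.
        con.enable_load_extension(False)
    return True


con = sqlite3.connect(":memory:")
if load_regex_extension(con):
    print(con.execute("select regex_version()").fetchone())
else:
    print("regex0 could not be loaded")
```

Disabling extension loading again after the call keeps the connection from loading other libraries later by accident.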
123+
124+
Or in Node.js using [better-sqlite3](https://github.com/WiseLibs/better-sqlite3):
125+
126+
```javascript
127+
const Database = require("better-sqlite3");
128+
const db = new Database(":memory:");
129+
db.loadExtension("./regex0");
130+
console.log(db.prepare("select regex_version()").get());
131+
// { 'regex_version()': 'v0.1.0' }
132+
```
133+
134+
Or with [Datasette](https://datasette.io/):
135+
136+
```
137+
datasette data.db --load-extension ./regex0
138+
```
139+
140+
## Supporting
141+
142+
I (Alex 👋🏼) spent a lot of time and energy on this project and [many other open source projects](https://github.com/asg017?tab=repositories&q=&type=&language=&sort=stargazers). If your company or organization uses this library (or you're feeling generous), then please [consider supporting my work](https://alexgarcia.xyz/work.html), or share this project with a friend!
143+
144+
## See also
145+
146+
- [sqlite-xsv](https://github.com/asg017/sqlite-xsv), a SQLite extension for working with CSVs
147+
- [sqlite-loadable](https://github.com/asg017/sqlite-loadable-rs), a framework for writing SQLite extensions in Rust
148+
- [sqlite-http](https://github.com/asg017/sqlite-http), a SQLite extension for making HTTP requests

benchmarks/README.md

Lines changed: 14 additions & 0 deletions
@@ -1,3 +1,17 @@
1+
# `sqlite-regex` Benchmarks
2+
3+
## Caveat: Benchmarks are hard and easy to game
4+
5+
This benchmark isn't exhaustive: it only compares `sqlite-regex` against other widely used SQLite regex extensions.
6+
7+
## `REGEXP()` across all SQLite regex extensions
8+
9+
![](./dates.png)
10+
11+
Explanation: Essentially running `select count(*) from corpus where line regexp "\d{4}-\d{2}-\d{2}"`, though `regexp` and `sqlean/re` don't support the `\d` or `{4}` syntax.
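
The shape of that query can be sketched in Python. SQLite rewrites `X REGEXP Y` as a call to a user-defined `regexp(Y, X)` function, so any extension (or, here, a Python callback) can supply the implementation. In this sketch Python's `re` module stands in for the extensions being benchmarked, and the `corpus` rows and the spelled-out `LONG` pattern are assumptions chosen for illustration; the patterns actually used for `regexp` and `sqlean/re` are not given above.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite calls regexp(Y, X) for `X REGEXP Y`, so the pattern comes first.
    if value is None:
        return None
    return int(re.search(pattern, value) is not None)


con = sqlite3.connect(":memory:")
con.create_function("regexp", 2, regexp)

# Illustrative stand-in for the benchmark corpus.
con.execute("create table corpus(line text)")
con.executemany(
    "insert into corpus(line) values (?)",
    [
        ("released 2023-01-10",),
        ("no date here",),
        ("2022-12-31 and 2023-01-01",),
        ("12-34-56",),
    ],
)

SHORT = r"\d{4}-\d{2}-\d{2}"
# Equivalent pattern without `\d` or `{n}`, for engines lacking that syntax.
LONG = "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"


def count_matches(pattern):
    sql = "select count(*) from corpus where line regexp ?"
    return con.execute(sql, (pattern,)).fetchone()[0]


short_count = count_matches(SHORT)
long_count = count_matches(LONG)
print(short_count, long_count)  # 2 2
```
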
12+
13+
```
114
gcc -O3 -shared -fPIC regexp.c -o regexp.dylib
215
316
gcc -O3 -shared -fPIC -I./ re.c sqlite3-re.c -o re.dylib
17+
```
