Commit eb3bd3c

New pcre2_next_match() API to simplify pcre2demo, test, and substitute (#733)
* The primary purpose of pcre2_next_match() is to make it much easier for PCRE2 clients to iterate over matches, without needing an advanced knowledge of regular expressions.
* Secondly, we can simplify our own code by merging the three duplicate implementations of the /g global match behaviour: pcre2demo, pcre2_substitute, and pcre2test.
* Thirdly, as I look closely at the issue, I can improve the documentation.
* Fourthly, I would like to actually simplify the logic, removing a complex loop which makes several match attempts, swallows duplicate matches, and more. We can have identical behaviour with a simple retry using PCRE2_NOTEMPTY_ATSTART.
1 parent f63b5d2 commit eb3bd3c

52 files changed, with 1425 additions and 1005 deletions

BUILD.bazel

Lines changed: 1 addition & 0 deletions
@@ -80,6 +80,7 @@ cc_library(
 "src/pcre2_maketables.c",
 "src/pcre2_match.c",
 "src/pcre2_match_data.c",
+"src/pcre2_match_next.c",
 "src/pcre2_newline.c",
 "src/pcre2_ord2utf.c",
 "src/pcre2_pattern_info.c",

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -782,6 +782,7 @@ set(
 src/pcre2_maketables.c
 src/pcre2_match.c
 src/pcre2_match_data.c
+src/pcre2_match_next.c
 src/pcre2_newline.c
 src/pcre2_ord2utf.c
 src/pcre2_pattern_info.c

Makefile.am

Lines changed: 3 additions & 0 deletions
@@ -73,6 +73,7 @@ dist_html_DATA = \
 doc/html/pcre2_match_data_create.html \
 doc/html/pcre2_match_data_create_from_pattern.html \
 doc/html/pcre2_match_data_free.html \
+doc/html/pcre2_next_match.html \
 doc/html/pcre2_pattern_convert.html \
 doc/html/pcre2_pattern_info.html \
 doc/html/pcre2_serialize_decode.html \
@@ -174,6 +175,7 @@ dist_man_MANS = \
 doc/pcre2_match_data_create.3 \
 doc/pcre2_match_data_create_from_pattern.3 \
 doc/pcre2_match_data_free.3 \
+doc/pcre2_next_match.3 \
 doc/pcre2_pattern_convert.3 \
 doc/pcre2_pattern_info.3 \
 doc/pcre2_serialize_decode.3 \
@@ -419,6 +421,7 @@ COMMON_SOURCES = \
 src/pcre2_maketables.c \
 src/pcre2_match.c \
 src/pcre2_match_data.c \
+src/pcre2_match_next.c \
 src/pcre2_newline.c \
 src/pcre2_ord2utf.c \
 src/pcre2_pattern_info.c \

NON-AUTOTOOLS-BUILD

Lines changed: 1 addition & 0 deletions
@@ -120,6 +120,7 @@ example.
 pcre2_maketables.c
 pcre2_match.c
 pcre2_match_data.c
+pcre2_match_next.c
 pcre2_newline.c
 pcre2_ord2utf.c
 pcre2_pattern_info.c

README

Lines changed: 1 addition & 0 deletions
@@ -866,6 +866,7 @@ The distribution should contain the files listed below.
 src/pcre2_maketables.c ) sources for the functions in the library,
 src/pcre2_match.c ) and some internal functions that they use
 src/pcre2_match_data.c )
+src/pcre2_match_next.c )
 src/pcre2_newline.c )
 src/pcre2_ord2utf.c )
 src/pcre2_pattern_info.c )

build.zig

Lines changed: 1 addition & 0 deletions
@@ -92,6 +92,7 @@ pub fn build(b: *std.Build) !void {
 "src/pcre2_maketables.c",
 "src/pcre2_match.c",
 "src/pcre2_match_data.c",
+"src/pcre2_match_next.c",
 "src/pcre2_newline.c",
 "src/pcre2_ord2utf.c",
 "src/pcre2_pattern_info.c",

doc/html/NON-AUTOTOOLS-BUILD.txt

Lines changed: 1 addition & 0 deletions
@@ -120,6 +120,7 @@ example.
 pcre2_maketables.c
 pcre2_match.c
 pcre2_match_data.c
+pcre2_match_next.c
 pcre2_newline.c
 pcre2_ord2utf.c
 pcre2_pattern_info.c

doc/html/README.txt

Lines changed: 1 addition & 0 deletions
@@ -866,6 +866,7 @@ The distribution should contain the files listed below.
 src/pcre2_maketables.c ) sources for the functions in the library,
 src/pcre2_match.c ) and some internal functions that they use
 src/pcre2_match_data.c )
+src/pcre2_match_next.c )
 src/pcre2_newline.c )
 src/pcre2_ord2utf.c )
 src/pcre2_pattern_info.c )

doc/html/index.html

Lines changed: 3 additions & 0 deletions
@@ -220,6 +220,9 @@ <h1>Perl-compatible Regular Expressions (revised API: PCRE2)</h1>
 <tr><td><a href="pcre2_match_data_free.html">pcre2_match_data_free</a></td>
 <td>Free a match data block</td></tr>

+<tr><td><a href="pcre2_next_match.html">pcre2_next_match</a></td>
+<td>Get the match parameters for the next match</td></tr>
+
 <tr><td><a href="pcre2_pattern_convert.html">pcre2_pattern_convert</a></td>
 <td>Experimental foreign pattern converter</td></tr>

doc/html/pcre2_next_match.html

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+<html>
+<head>
+<title>pcre2_next_match specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_next_match man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<h2>
+SYNOPSIS
+</h2>
+<p>
+<b>#include &#60;pcre2.h&#62;</b>
+</p>
+<p>
+<b>int pcre2_next_match(pcre2_match_data *<i>match_data</i>,</b>
+<b> PCRE2_SIZE *<i>pstart_offset</i>, uint32_t *<i>poptions</i>);</b>
+</p>
+<h2>
+DESCRIPTION
+</h2>
+<p>
+This function can be called after one of the match functions
+(<b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or <b>pcre2_jit_match()</b>), and
+must be provided with the same <i>match_data</i> parameter. It outputs the
+appropriate parameters for searching for the next match in the same subject
+string, and is suitable for applications providing "global" matching behaviour
+(for example, replacing all matches in the subject, or splitting the subject on
+all matches, or simply counting the number of matches).
+</p>
+<p>
+It returns 0 ("false") if there is no need to make any further match attempts,
+or 1 ("true") if another match should be attempted.
+</p>
+<p>
+The *<i>pstart_offset</i> and *<i>poptions</i> are set if the function returns 1.
+The *<i>pstart_offset</i> should be passed to the next match attempt directly,
+and the *<i>poptions</i> should be passed to the next match attempt by combining
+with the application's match options using OR.
+</p>
+<p>
+There is a complete description of the PCRE2 native API in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+page and a description of the POSIX API in the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+page.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
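
Note on the calling convention described in the man page above: a client loops, making one match attempt per iteration, and asks after each attempt whether to continue and with which start offset and extra options. The sketch below is a self-contained toy model of that loop, not PCRE2 code. The names toy_match, toy_next_match, toy_match_data and TOY_NOTEMPTY_ATSTART are hypothetical stand-ins for pcre2_match(), pcre2_next_match(), the match data block and PCRE2_NOTEMPTY_ATSTART, and the pattern a* is hard-coded. How the stand-in picks the next offset and options is an assumption made for illustration; only the 0/1 return value and the rule that the returned options are ORed with the application's options come from the documentation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for PCRE2_NOTEMPTY_ATSTART. */
#define TOY_NOTEMPTY_ATSTART 0x1u

/* Hypothetical stand-in for a match data block. */
typedef struct {
    int rc;              /* 1 after a match, -1 after no match */
    size_t subject_len;  /* length of the subject last searched */
    size_t start;        /* offset where the last match starts */
    size_t end;          /* offset just past the last match */
} toy_match_data;

/* Toy matcher for the hard-coded pattern a*: scans forward from
   start_offset and records the first acceptable match. */
static int toy_match(const char *subject, size_t len, size_t start_offset,
                     uint32_t options, toy_match_data *md)
{
    md->subject_len = len;
    for (size_t pos = start_offset; pos <= len; pos++) {
        size_t end = pos;
        while (end < len && subject[end] == 'a') end++;
        /* Refuse an empty match at the starting offset when asked to. */
        if (end == pos && pos == start_offset &&
            (options & TOY_NOTEMPTY_ATSTART) != 0)
            continue;
        md->start = pos;
        md->end = end;
        return md->rc = 1;
    }
    return md->rc = -1;
}

/* Toy model of the "what next?" step. Returns 0 when no further attempt
   is needed, or 1 after setting *pstart_offset and *poptions. */
static int toy_next_match(const toy_match_data *md, size_t *pstart_offset,
                          uint32_t *poptions)
{
    if (md->rc < 0) return 0;                      /* last attempt failed */
    if (md->start == md->end) {                    /* last match was empty */
        if (md->end >= md->subject_len) return 0;  /* nothing left to scan */
        *pstart_offset = md->end;
        *poptions = TOY_NOTEMPTY_ATSTART;          /* retry, refusing "" here */
        return 1;
    }
    *pstart_offset = md->end;                      /* resume after the match */
    *poptions = 0;
    return 1;
}

/* The client loop: one match attempt per iteration, with the returned
   options ORed into the application's own options. */
int count_matches(const char *subject, uint32_t app_options)
{
    toy_match_data md;
    size_t len = strlen(subject);
    size_t start_offset = 0;
    uint32_t next_options = 0;
    int count = 0;

    int rc = toy_match(subject, len, start_offset, app_options, &md);
    while (rc >= 0) {
        count++;
        if (!toy_next_match(&md, &start_offset, &next_options)) break;
        rc = toy_match(subject, len, start_offset,
                       app_options | next_options, &md);
    }
    return count;
}
```

Under this model, count_matches("baac", 0) visits four matches: the empty string at offset 0, "aa" at offsets 1 to 3, and the empty strings at offsets 3 and 4. The loop body never inspects the match itself to decide how to continue, which is the simplification the commit message is after.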
