Commit f9eaba2
Speed up `codespell:ignore` check by skipping the regex in most cases
The changes to provide a public API had a performance-related cost
of about 1% of runtime. There is no trivial way to offset this any
further without undermining the API we are building. However, we can
pull performance-related shenanigans to compensate for the cost
introduced.
The codespell codebase unsurprisingly spends the vast majority of its
runtime in various regex-related code such as `search` and `finditer`.
The best way to optimize runtime spent in regexes is to not run a regex
in the first place, since the regex engine has a rather steep overhead
over regular string primitives (that is the cost of flexibility). If
the regex rarely matches and there is a very easy static substring
that can be used to rule out the match, then you can speed up the code
by using `substring in string` as a conditional to skip the
regex. This assumes the regex is used often enough for the performance
to matter.
An obvious choice here falls on the `codespell:ignore` regex, because
it has a very distinctive substring in the form of `codespell:ignore`,
which will rule out almost all lines that will not match.
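A minimal sketch of the guard described above. The pattern and the names
`IGNORE_RE`, `MARKER` and `find_ignore_directive` are hypothetical and
chosen for illustration; the actual regex in codespell may differ. The
trick is only safe when every possible match of the regex contains the
literal substring.

```python
import re

# Hypothetical pattern for illustration; the real codespell regex may differ.
# Every match necessarily contains the literal text "codespell:ignore".
IGNORE_RE = re.compile(r"codespell:ignore\b(\s+(?P<words>[\w,]*))?")
MARKER = "codespell:ignore"


def find_ignore_directive(line):
    """Return the regex match for an inline ignore directive, or None."""
    # Cheap substring test first: most lines lack the marker, so the
    # regex engine is never entered for them.
    if MARKER not in line:
        return None
    return IGNORE_RE.search(line)
```

The guarded function returns the same result as calling
`IGNORE_RE.search(line)` directly; it only avoids the regex call on lines
that cannot match.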
With this little trick, runtime goes from ~5.6s to ~4.9s on the corpus
mentioned in #3419.
1 parent 4ae56b4 commit f9eaba2
1 file changed: 6 additions, 1 deletion
Diff (line contents not preserved): original line 112 replaced by new lines 112-115; new lines 216-217 added after original line 212.