Skip to content

Commit 5154fcc

Browse files
authored
Always parse json in strict mode (#507)
* Make #'handler reloadable during test iteration * Add test-case for lazily parsing large json array that throws IOException * Respect org file fill-column * Add instructions for how to incrementally/lazily parse json * Remove lazy json parsing from coercions * Fix tests -- should always return a vector now
1 parent d68242c commit 5154fcc

File tree

6 files changed

+182
-44
lines changed

6 files changed

+182
-44
lines changed

README.org

Lines changed: 38 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,7 @@
6060
- [[#proxies][Proxies]]
6161
- [[#custom-middleware][Custom Middleware]]
6262
- [[#modifying-apache-specific-features-of-the-httpclientbuilder-and-httpasyncclientbuilder][Modifying Apache-specific features of the =HttpClientBuilder= and =HttpAsyncClientBuilder=]]
63+
- [[#incrementally-json-parsing][Incrementally JSON Parsing]]
6364
- [[#development][Development]]
6465
- [[#faking-responses][Faking Responses]]
6566
- [[#optional-dependencies][Optional Dependencies]]
@@ -72,7 +73,6 @@
7273
- [[#other-libraries-providing-middleware][Other Libraries Providing Middleware]]
7374
- [[#license][License]]
7475

75-
7676
* Branches
7777
:PROPERTIES:
7878
:CUSTOM_ID: h-e390585c-cbd8-4e94-b36b-4e9c27c16720
@@ -514,9 +514,7 @@ it's received (output coercion) from the server.
514514

515515
;; Coerce as json
516516
(client/get "http://example.com/foo.json" {:as :json})
517-
(client/get "http://example.com/foo.json" {:as :json-strict})
518517
(client/get "http://example.com/foo.json" {:as :json-string-keys})
519-
(client/get "http://example.com/foo.json" {:as :json-strict-string-keys})
520518

521519
;; Coerce as Transit encoded JSON or MessagePack
522520
(client/get "http://example.com/foo" {:as :transit+json})
@@ -547,7 +545,7 @@ it's received (output coercion) from the server.
547545
;; if any, defaulting to UTF-8.
548546
#+END_SRC
549547

550-
Output coercion with =:as :json=, =:as :json-strict=, =:as :json-strict-string-keys=, =:as :json-string-keys= or =:as :x-www-form-urlencoded= will only work with an optional dependency, see [[#optional-dependencies][Optional Dependencies]].
548+
Output coercion with =:as :json=, =:as :json-string-keys= or =:as :x-www-form-urlencoded=, will only work with an optional dependency, see [[#optional-dependencies][Optional Dependencies]].
551549

552550
By default, JSON coercion is only applied when the response's status
553551
is considered "unexceptional". If the =:unexceptional-status= option
@@ -1401,6 +1399,38 @@ clj-http's =:connection-manager=, =:redirect-strategy=,
14011399
=:retry-handler=, =:cache=, or =:cookie-policy-registry= or
14021400
=:cookie-spec=, respectively.
14031401

1402+
** Incrementally JSON Parsing
1403+
[[https://github.com/dakrone/cheshire][cheshire]] supports incrementally parsing JSON using lazy sequences. This approach can useful for
1404+
processing large top-level JSON arrays because it doesn't require upfront work consuming the entire stream.
1405+
1406+
#+begin_src clojure
1407+
(require '[cheshire.core :as json])
1408+
1409+
(defn print-all-pokemon-names [pokemons]
1410+
(for [pokemon pokemons]
1411+
(println (get-in pokemon [:name :english]))))
1412+
1413+
(let [url "https://raw.githubusercontent.com/fanzeyi/pokemon.json/master/pokedex.json"
1414+
response (get url {:as :reader})]
1415+
(with-open [reader (:body response)] ; closes the underlying connection when we're done
1416+
(let [pokemons (json/parse-stream reader true)]
1417+
; You must perform all reads from the stream inside `with-open`,
1418+
; any , any lazy
1419+
(doall (print-all-pokemon-names pokemons)))))
1420+
#+end_src
1421+
1422+
Keep in mind that the =reader= object wraps a HTTP connection. The user needs to be aware of two
1423+
things:
1424+
1425+
1. The user should close the reader after processing the stream, otherwise the underlying HTTP
1426+
Connection may leak and create subtle bugs. Clojure's [[https://clojuredocs.org/clojure.core/with-open][with-open]] is useful here.
1427+
1428+
2. You should realize any lazy sequences before closing the connection. Use [[https://clojuredocs.org/clojure.core/doall][doall]] or [[https://clojure.org/reference/transducers][transducers]] to
1429+
prevent bugs from lazy IO. See [[https://stuartsierra.com/2015/08/25/clojure-donts-lazy-effects][Clojure Don'ts: Lazy Effects]].
1430+
1431+
In previous versions of =clj-http= (<= 3.10.0), =clj-http= defaulted to lazily parsing JSON, but this
1432+
was slow and also confused users who didn't expect laziness.
1433+
14041434
* Development
14051435
:PROPERTIES:
14061436
:CUSTOM_ID: h-65bbf017-2e8b-4c43-824b-24b89cc27a70
@@ -1543,3 +1573,7 @@ Libraries inspired by clj-http:
15431573

15441574
Released under the MIT License:
15451575
<http://www.opensource.org/licenses/mit-license.php>
1576+
1577+
# Local Variables:
1578+
# fill-column: 100
1579+
# End:

changelog.org

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
List of user-visible changes that have gone into each release
1212

1313
** 3.10.1 (Unreleased)
14-
- TBD
14+
- JSON parsing is always strict. See [[file:README.org::*Incrementally%20JSON%20Parsing][README#Incrementally JSON Parsing]]. Requires cheshire > 5.9.0.
1515
** 3.10.0
1616
- Add trust-manager and key-managers support to the client
1717
https://github.com/dakrone/clj-http/pull/469

project.clj

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@
1818
[potemkin "0.4.5"]]
1919
:resource-paths ["resources"]
2020
:profiles {:dev {:dependencies [;; optional deps
21-
[cheshire "5.8.1"]
21+
[cheshire "5.9.0"]
2222
[crouton "0.1.2" :exclusions [[org.jsoup/jsoup]]]
2323
[org.jsoup/jsoup "1.11.3"]
2424
[org.clojure/tools.reader "1.3.2"]

src/clj_http/client.clj

Lines changed: 23 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -128,20 +128,7 @@
128128
"Resolve and apply cheshire's json decoding dynamically."
129129
[& args]
130130
{:pre [json-enabled?]}
131-
(apply (ns-resolve (symbol "cheshire.core") (symbol "decode")) args))
132-
133-
(defn ^:dynamic json-decode-strict
134-
"Resolve and apply cheshire's json decoding dynamically (with lazy parsing
135-
disabled)."
136-
[& args]
137-
{:pre [json-enabled?]}
138-
(apply (ns-resolve (symbol "cheshire.core") (symbol "decode-strict")) args))
139-
140-
(defn ^:dynamic json-decode-stream
141-
"Resolve and apply cheshire's json stream decoding dynamically."
142-
[& args]
143-
{:pre [json-enabled?]}
144-
(apply (ns-resolve (symbol "cheshire.core") (symbol "decode-stream")) args))
131+
(apply (ns-resolve (symbol "cheshire.core") (symbol "parse-stream-strict")) args))
145132

146133
(defn ^:dynamic form-decode
147134
"Resolve and apply ring-codec's form decoding dynamically."
@@ -459,26 +446,23 @@
459446
(and (not (unexceptional-status-for-request? request status))
460447
(= coerce :exceptional))))
461448

462-
(defn- decode-json-body [body keyword? strict? charset]
463-
(if strict?
464-
;; OPTIMIZE: When/if Cheshire gets a parse-stream-strict this won't need to go through String:
465-
(json-decode-strict (util/force-string body charset) keyword?)
466-
(let [^BufferedReader br (io/reader (util/force-stream body))]
467-
(try
468-
(.mark br 1)
469-
(let [first-char (int (try (.read br) (catch EOFException _ -1)))]
470-
(case first-char
471-
-1 nil
472-
(do (.reset br)
473-
(json-decode-stream br keyword?))))
474-
(finally (.close br))))))
449+
(defn- decode-json-body [body keyword? charset]
450+
(let [^BufferedReader br (io/reader (util/force-stream body))]
451+
(try
452+
(.mark br 1)
453+
(let [^int first-char (try (.read br) (catch EOFException _ -1))]
454+
(case first-char
455+
-1 nil
456+
(do (.reset br)
457+
(json-decode br keyword?))))
458+
(finally (.close br)))))
475459

476460
(defn coerce-json-body
477-
[request {:keys [body] :as resp} keyword? strict? & [charset]]
461+
[request {:keys [body] :as resp} keyword? & [charset]]
478462
{:pre [json-enabled?]}
479463
(let [charset (or charset (response-charset resp))
480464
body (if (can-parse-body? request resp)
481-
(decode-json-body body keyword? strict? charset)
465+
(decode-json-body body keyword? charset)
482466
(util/force-string body charset))]
483467
(assoc resp :body body)))
484468

@@ -540,16 +524,20 @@
540524
(coerce-content-type request))))
541525

542526
(defmethod coerce-response-body :json [req resp]
543-
(coerce-json-body req resp true false))
527+
(coerce-json-body req resp true))
528+
529+
(defmethod coerce-response-body :json-string-keys [req resp]
530+
(coerce-json-body req resp false))
544531

532+
;; There is no longer any distinction between strict and non-strict JSON parsing
533+
;; options.
534+
;;
535+
;; `:json-strict` and `:json-strict-string-keys` will be removed in a future version
545536
(defmethod coerce-response-body :json-strict [req resp]
546-
(coerce-json-body req resp true true))
537+
(coerce-json-body req resp true))
547538

548539
(defmethod coerce-response-body :json-strict-string-keys [req resp]
549-
(coerce-json-body req resp false true))
550-
551-
(defmethod coerce-response-body :json-string-keys [req resp]
552-
(coerce-json-body req resp false false))
540+
(coerce-json-body req resp false))
553541

554542
(defmethod coerce-response-body :clojure [req resp]
555543
(coerce-clojure-body req resp))

test-resources/big_array_json.json

Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,102 @@
1+
[
2+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
3+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
4+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
5+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
6+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
7+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
8+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
9+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
10+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
11+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
12+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
13+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
14+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
15+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
16+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
17+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
18+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
19+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
20+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
21+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
22+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
23+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
24+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
25+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
26+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
27+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
28+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
29+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
30+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
31+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
32+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
33+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
34+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
35+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
36+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
37+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
38+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
39+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
40+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
41+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
42+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
43+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
44+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
45+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
46+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
47+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
48+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
49+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
50+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
51+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
52+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
53+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
54+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
55+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
56+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
57+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
58+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
59+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
60+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
61+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
62+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
63+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
64+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
65+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
66+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
67+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
68+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
69+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
70+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
71+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
72+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
73+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
74+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
75+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
76+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
77+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
78+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
79+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
80+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
81+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
82+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
83+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
84+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
85+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
86+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
87+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
88+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
89+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
90+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
91+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
92+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
93+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
94+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
95+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
96+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
97+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
98+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
99+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
100+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]},
101+
{"foo": "bar", "baz": "qux", "values": [1, 2, 3, 4, 5]}
102+
]

test/clj_http/test/core_test.clj

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,9 @@
5858
[:get "/json-array"]
5959
{:status 200 :body "[\"foo\", \"bar\"]"
6060
:headers {"content-type" "application/json"}}
61+
[:get "/json-large-array"]
62+
{:status 200 :body (file "test-resources/big_array_json.json")
63+
:headers {"content-type" "application/json"}}
6164
[:get "/json-bad"]
6265
{:status 400 :body "{\"foo\":\"bar\"}"}
6366
[:get "/redirect"]
@@ -147,7 +150,7 @@
147150
(defn run-server
148151
[]
149152
(defonce server
150-
(ring/run-jetty (add-headers-if-requested handler) {:port 18080 :join? false})))
153+
(ring/run-jetty (add-headers-if-requested #'handler) {:port 18080 :join? false})))
151154

152155
(defn localhost [path]
153156
(str "http://localhost:18080" path))
@@ -427,7 +430,10 @@
427430
(deftest ^:integration t-json-output-coercion
428431
(run-server)
429432
(let [resp (client/get (localhost "/json") {:as :json})
430-
resp-array (client/get (localhost "/json-array") {:as :json-strict})
433+
resp-array (client/get (localhost "/json-array") {:as :json})
434+
resp-array-strict (client/get (localhost "/json-array") {:as :json-strict})
435+
resp-large-array (client/get (localhost "/json-large-array") {:as :json})
436+
resp-large-array-strict (client/get (localhost "/json-large-array") {:as :json-strict})
431437
resp-str (client/get (localhost "/json")
432438
{:as :json :coerce :exceptional})
433439
resp-str-keys (client/get (localhost "/json") {:as :json-string-keys})
@@ -445,6 +451,9 @@
445451
(is (= 200
446452
(:status resp)
447453
(:status resp-array)
454+
(:status resp-array-strict)
455+
(:status resp-large-array)
456+
(:status resp-large-array-strict)
448457
(:status resp-str)
449458
(:status resp-str-keys)
450459
(:status resp-strict-str-keys)
@@ -459,6 +468,7 @@
459468
(:body resp-str-keys)))
460469
;; '("foo" "bar") and ["foo" "bar"] compare as equal with =.
461470
(is (vector? (:body resp-array)))
471+
(is (vector? (:body resp-array-strict)))
462472
(is (= "{\"foo\":\"bar\"}" (:body resp-str)))
463473
(is (= 400
464474
(:status bad-resp)
@@ -467,7 +477,11 @@
467477
(is (= "{\"foo\":\"bar\"}" (:body bad-resp))
468478
"don't coerce on bad response status by default")
469479
(is (= {:foo "bar"} (:body bad-resp-json)))
470-
(is (= "{\"foo\":\"bar\"}" (:body bad-resp-json2)))))
480+
(is (= "{\"foo\":\"bar\"}" (:body bad-resp-json2)))
481+
482+
(testing "lazily parsed stream completes parsing."
483+
(is (= 100 (count (:body resp-large-array)))))
484+
(is (= 100 (count (:body resp-large-array-strict))))))
471485

472486
(deftest ^:integration t-ipv6
473487
(run-server)

0 commit comments

Comments
 (0)