If you need to handle a series of JSON objects (NOT a JSON array) from stdin or some other stream, you’ll need a
less common method of parsing JSON than Python’s json.loads(str).
Let’s say your input looks like this:
{"somekey": 5, "more": [1,2,3]} 5
"a string"
{"somekey2":
123}
You need to parse a variety of JSON values (objects, numbers, strings) but you don’t know how many there will be. We’re going to use raw_decode from Python 3’s json implementation.
 | loads | raw_decode |
---|---|---|
Accepts | str | str |
Returns | parsed JSON | tuple of (parsed JSON, index the JSON value ended at) |
Extra chars at the end? | raises json.decoder.JSONDecodeError | leaves them |
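To make the last row of the table concrete, here is a small sketch comparing the two calls on the same string. The input string is made up for illustration.

```python
import json
from json import JSONDecoder

# Illustrative input: a JSON object immediately followed by a number
text = '{"content": 5}4'

# json.loads refuses anything left over after the first JSON value
try:
    json.loads(text)
    loads_error = None
except json.JSONDecodeError as err:
    loads_error = err

# raw_decode parses the first value and reports the index where it stopped
value, end = JSONDecoder().raw_decode(text)
rest = text[end:]  # whatever was not consumed

print(type(loads_error).__name__)
print(value, end, repr(rest))
```

The leftover text in rest is what gets fed back into raw_decode on the next pass.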
from json import JSONDecoder
import sys
# All objects we find
json_found = []
# raw_decode expects the first character to start a JSON value, so strip leading whitespace
stdin = sys.stdin.read().lstrip()
decoder = JSONDecoder()
while len(stdin) > 0:
    # Returns the parsed value and the number of characters it used
    parsed_json, consumed = decoder.raw_decode(stdin)
    # Remove the characters that were consumed by this value
    stdin = stdin[consumed:]
    # Save this parsed value
    json_found.append(parsed_json)
    # Remove any whitespace before the next JSON value
    stdin = stdin.lstrip()
print(json_found)
So let’s try it out (with the script saved as run):
echo "{\"content\": 5}4" | python3 run
> [{'content': 5}, 4]
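The loop above re-slices the remaining input after every value, which copies the string each time. As an alternative sketch (not the approach used above), raw_decode also accepts a start index as its second argument, so the input can be walked in place. The function name iter_json_values and the sample string are hypothetical, chosen only for this example.

```python
from json import JSONDecoder


def iter_json_values(text):
    """Yield each JSON value found in text, one after another."""
    decoder = JSONDecoder()
    pos = 0
    length = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            return
        # The second return value is the absolute index where the value ended
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical sample: several JSON values separated by spaces and newlines
sample = '{"somekey": 5, "more": [1,2,3]} 5\n"a string"\n{"somekey2":\n123}'
print(list(iter_json_values(sample)))
```

Being a generator, it hands back values as they are parsed, though the whole input still has to be in memory as one string. Malformed input still raises json.decoder.JSONDecodeError at the point where parsing fails.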