Nate Tracy-Amoroso

How to parse a series of json objects from stdin or another string in python3

If you need to handle a series of JSON objects (NOT a JSON array) from stdin or some other stream, you’ll need a less common parsing method than Python’s json.loads(str), which raises an error if anything follows the first JSON value.

Let’s say your input looks like this:

 {"somekey": 5,  "more": [1,2,3]} 5
"a string"
{"somekey2":
123}

You need to parse a series of JSON objects, but you don’t know how many there will be. We’re going to use raw_decode, a method of JSONDecoder in Python 3’s json module.

                         loads                          raw_decode
Accepts                  str                            str
Returns                  parsed JSON                    parsed JSON, index the JSON object ended at
Extra chars at the end?  json.decoder.JSONDecodeError   Leaves them

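To see the difference on a concrete string, here is a minimal sketch that feeds the same input to both functions. The variable names are only for illustration.

```python
import json
from json import JSONDecoder

text = '{"content": 5}4'

# json.loads insists that the whole string is exactly one JSON value
try:
    json.loads(text)
    loads_error = None
except json.JSONDecodeError as err:
    loads_error = err.msg

# raw_decode stops at the end of the first value and reports where that was
value, end = JSONDecoder().raw_decode(text)

print(loads_error)   # Extra data
print(value, end)    # {'content': 5} 14
print(text[end:])    # 4
```

The leftover text[end:] is what gets parsed on the next pass.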
from json import JSONDecoder
import sys

# All objects we find
json_found = []
# raw_decode expects the first character to start a JSON value, so strip whitespace from the left
text = sys.stdin.read().lstrip()
decoder = JSONDecoder()

while text:
    # raw_decode returns the parsed value and the index in text where it ended
    # (a count of characters, not bytes, since text is a str)
    parsed_json, consumed = decoder.raw_decode(text)
    # Save this parsed object
    json_found.append(parsed_json)
    # Drop the characters consumed by this object and any whitespace before the next one
    text = text[consumed:].lstrip()

print(json_found)
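If the input is large, re-slicing the string on every pass copies the remainder each time. raw_decode also takes a second argument, idx, that says where to start, so one possible variation is to keep a position and never slice. This is a sketch, not the only way to do it. iter_json_values is a hypothetical helper name, and the sample string is shaped like the input at the top of the post, written as strict JSON (no trailing comma, which Python’s json module rejects).

```python
from json import JSONDecoder


def iter_json_values(text):
    """Yield each JSON value found in text, one after another (hypothetical helper)."""
    decoder = JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so step over it manually
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        # idx tells raw_decode where to start; it returns the value and the index where it ended
        value, pos = decoder.raw_decode(text, pos)
        yield value


sample = ' {"somekey": 5,  "more": [1,2,3]} 5\n"a string"\n{"somekey2":\n123}'
print(list(iter_json_values(sample)))
```

Because it is a generator, each value can be handled as soon as it is parsed instead of waiting for the full list.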

So let’s try it out:

echo "{\"content\": 5}4" | python3 run
> [{'content': 5}, 4]

LGTM