Python JSON error handling

The idea above is good, but I ran into a problem with it: my JSON string contained just one extra double quote.
So I made a fix to the code given above.

The jsonStr was:

{
    "api_version": "1.3",
    "response_code": "200",
    "id": "3237490513229753",
    "lon": "38.969916127827",
    "lat": "45.069889625267",
    "page_url": null,
    "name": "ATB",
    "firm_group": {
        "id": "3237499103085728",
        "count": "1"
    },
    "city_name": "Krasnodar",
    "city_id": "3237585002430511",
    "address": "Turgeneva,   172/1",
    "create_time": "2008-07-22 10:02:04 07",
    "modification_time": "2013-08-09 20:04:36 07",
    "see_also": [
        {
            "id": "3237491513434577",
            "lon": 38.973110606808,
            "lat": 45.029031222211,
            "name": "Advance",
            "hash": "5698hn745A8IJ1H86177uvgn94521J3464he26763737242Cf6e654G62J0I7878e",
            "ads": {
                "sponsored_article": {
                    "title": "Center "ADVANCE",
                    "text": "Business.English."
                },
                "warning": null
            }
        }
    ]
}

The fix is as follows:

import json, re

def fixJSON(jsonStr):
    # Remove all backslashes from the JSON string.
    jsonStr = re.sub(r'\\', '', jsonStr)
    try:
        return json.loads(jsonStr)
    except ValueError:
        while True:
            # Search for a stray '"' that has a word character or another '"'
            # on each side (optionally separated by one whitespace character).
            b = re.search(r'[\w|"]\s?(")\s?[\w|"]', jsonStr)

            # If we don't find any, we leave the loop.
            if not b:
                break

            # Get the location of the stray '"'.
            s, e = b.span(1)
            c = jsonStr[s:e]

            # Replace the '"' with "'".
            c = c.replace('"', "'")
            jsonStr = jsonStr[:s] + c + jsonStr[e:]
        return json.loads(jsonStr)

This code also works for the JSON string mentioned in the problem statement.


Alternatively, you can do this:

import json, re

def fixJSON(jsonStr):
    # Assumes compact JSON (no whitespace between tokens) and no backticks in the data.
    # First, temporarily replace every structural " with a backtick.
    jsonStr = re.sub(r'\\', '', jsonStr)
    jsonStr = re.sub(r'{"', '{`', jsonStr)
    jsonStr = re.sub(r'"}', '`}', jsonStr)
    jsonStr = re.sub(r'":"', '`:`', jsonStr)
    jsonStr = re.sub(r'":', '`:', jsonStr)
    jsonStr = re.sub(r'","', '`,`', jsonStr)
    jsonStr = re.sub(r'",', '`,', jsonStr)
    jsonStr = re.sub(r',"', ',`', jsonStr)
    # The replacement strings must not contain a backslash, or it ends up in the output.
    jsonStr = re.sub(r'\["', '[`', jsonStr)
    jsonStr = re.sub(r'"\]', '`]', jsonStr)

    # Replace all the remaining (unwanted) " with a space.
    jsonStr = re.sub(r'"', ' ', jsonStr)

    # Put the " back where they are supposed to be.
    jsonStr = re.sub(r'`', '"', jsonStr)

    return json.loads(jsonStr)

JSON, which stands for JavaScript Object Notation, is a popular format for representing data in web applications. Python provides an easy way to work with JSON data, but parsing errors can be frustratingly difficult to debug. This guide will provide a comprehensive look at various types of JSON parsing errors in Python and how to effectively debug them.

Understanding JSON Syntax

Before diving into JSON parsing errors, it’s important to understand the syntax of a well-formed JSON object. A JSON object consists of zero or more key-value pairs, separated by commas and enclosed in curly braces ({}). Each key is a string enclosed in double quotes ("") followed by a colon (:) and a value. Values can be one of several types: a string (also enclosed in double quotes), a number, a boolean (true or false), null, an array (enclosed in square brackets []) or another JSON object.

Here’s an example of a simple JSON object:

{
    "name": "John",
    "age": 30,
    "isMarried": false,
    "hobbies": ["reading", "running", "cooking"],
    "address": {
        "street": "123 Main St",
        "city": "New York",
        "state": "NY"
    }
}
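
As a quick illustration of how these JSON types map onto Python objects, the sketch below parses the example above with json.loads and prints the type of each value. The variable names (document, person) are illustrative only.

```python
import json

# The example object from above, as a JSON document string.
document = """
{
    "name": "John",
    "age": 30,
    "isMarried": false,
    "hobbies": ["reading", "running", "cooking"],
    "address": {
        "street": "123 Main St",
        "city": "New York",
        "state": "NY"
    }
}
"""

person = json.loads(document)

# JSON object -> dict, array -> list, string -> str,
# number -> int or float, true/false -> bool, null -> None.
for key, value in person.items():
    print(f"{key}: {type(value).__name__}")
```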

Common JSON Parsing Errors

Syntax errors (json.JSONDecodeError)

json.loads and json.load raise json.JSONDecodeError when the JSON document is not properly formatted. This can occur for several reasons, such as missing quotes, missing or misplaced commas, or incorrectly nested curly braces. (Python's built-in SyntaxError is not raised by the json module.) Here’s an example of a JSON object that fails to parse:

{
    "name": "John",
    "age": 30,
    "isMarried": false,
    "hobbies": ["reading", "running", "cooking"]
    "address": {
        "street": "123 Main St",
        "city": "New York",
        "state": "NY"
    }
}

Note the missing comma after the "hobbies" array, which makes json.loads raise json.JSONDecodeError.

To fix a syntax error, look at the line and column reported in the exception message and correct the JSON at that position.
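
To make this concrete, here is a minimal sketch that parses a shortened version of the document above and reports where the parser gave up. The msg, lineno and colno attributes are part of json.JSONDecodeError; the variable names (broken, error_info) are illustrative.

```python
import json

# Shortened version of the document above: the comma after the
# "hobbies" array is missing.
broken = """{
    "name": "John",
    "hobbies": ["reading", "running", "cooking"]
    "address": {"city": "New York"}
}"""

try:
    json.loads(broken)
except json.JSONDecodeError as exc:
    # msg, lineno, colno (and pos) describe where parsing failed.
    error_info = (exc.msg, exc.lineno, exc.colno)
    print(f"{exc.msg} at line {exc.lineno}, column {exc.colno}")
```

The reported position is where the parser noticed the problem (the start of the "address" key), which is one line after the place where the comma is actually missing.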

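ValueError

json.JSONDecodeError is a subclass of ValueError, so every parse failure can also be caught as a ValueError. A plain ValueError typically appears after parsing, when your code converts a value that does not have the expected form. Note that an unexpected type is not a parse error: both "age": 30 and "age": "30" are valid JSON. Here’s an example of a JSON object that parses successfully but causes a ValueError when the age is converted with int():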

{
    "name": "John",
    "age": "thirty",
    "isMarried": false,
    "hobbies": ["reading", "running", "cooking"],
    "address": {
        "street": "123 Main St",
        "city": "New York",
        "state": "NY"
    }
}

Note that the "age" value is a string ("thirty") instead of a number. Parsing succeeds, but a later conversion such as int(person["age"]) raises a ValueError.

To fix a ValueError, check that the JSON values have the types your code expects, and validate or convert them explicitly.
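
A short sketch of that situation follows. Note that json.loads itself accepts a document whose "age" is the string "thirty"; the ValueError only appears when the value is converted. The helper parse_age is a hypothetical name used for illustration.

```python
import json

# Parses without error: "thirty" is a perfectly valid JSON string.
person = json.loads('{"name": "John", "age": "thirty"}')

def parse_age(record):
    """Return the age as an int, or None if it cannot be converted."""
    try:
        return int(record["age"])
    except (TypeError, ValueError):
        # int("thirty") raises ValueError; int(None) raises TypeError.
        return None

print(parse_age(person))
print(parse_age({"age": "30"}))

# Parse failures can be caught as ValueError too.
print(issubclass(json.JSONDecodeError, ValueError))
```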

KeyError

A KeyError is raised when attempting to retrieve a non-existent key from a parsed JSON object (a Python dict). This can happen when the key is misspelled or when the JSON object is not structured as expected. Here’s an example of a JSON object and a lookup that raises a KeyError:

{
    "name": "John",
    "age": 30,
    "isMarried": false,
    "hobbies": ["reading", "running", "cooking"],
    "address": {
        "street": "123 Main St",
        "city": "New York",
        "state": "NY"
    }
}

import json

# my_json is assumed to hold the JSON text above as a string
person = json.loads(my_json)
print(person["lastname"])

Note that there is no "lastname" key in the JSON object, so attempting to retrieve it will raise a KeyError.

To fix a KeyError, ensure that the correct key is being used to access the JSON object.
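
When a key may legitimately be absent, two common patterns are dict.get with a default and an explicit try/except. The sketch below shows both on a shortened version of the object above; the variable names are illustrative.

```python
import json

person = json.loads('{"name": "John", "age": 30}')

# dict.get returns a default instead of raising KeyError.
lastname = person.get("lastname", "<unknown>")
print(lastname)

# An explicit try/except is useful when a missing key is a real error
# and you want to report which key was missing.
try:
    city = person["address"]["city"]
except KeyError as exc:
    city = None
    missing_key = exc.args[0]
    print(f"missing key: {missing_key!r}")
```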

Debugging JSON Parsing Errors

Debugging JSON parsing errors can be a time-consuming process. Here are some tips to help make the process easier:

  • Use a JSON validator such as JSONLint to verify that the JSON syntax is well-formed.
  • Use try...except blocks to handle parsing errors gracefully and provide better user feedback.
  • Use print() statements to inspect the JSON object and identify the location of parsing errors.
  • Break down the JSON object into smaller components to pinpoint where the error may be occurring.

With these tips and a solid understanding of JSON syntax and common parsing errors, you’ll be well-equipped to debug JSON parsing errors in Python.
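
The try...except and print() tips can be combined into one small helper. This is only a sketch: load_json_or_explain is a hypothetical name, and the exact wording of the parser message varies between Python versions.

```python
import json

def load_json_or_explain(text):
    """Parse text as JSON; on failure return (None, message) with context."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        lines = exc.doc.splitlines()
        # lineno is 1-based; guard against a position past the last line.
        bad_line = lines[exc.lineno - 1] if exc.lineno <= len(lines) else ""
        pointer = " " * (exc.colno - 1) + "^"
        message = (
            f"{exc.msg} (line {exc.lineno}, column {exc.colno})\n"
            f"{bad_line}\n{pointer}"
        )
        return None, message

# A trailing comma inside the array on the second line is not valid JSON.
data, problem = load_json_or_explain('{"a": 1,\n "b": [1, 2,]}')
if problem:
    print(problem)
```

Returning the message instead of printing it inside the helper keeps the function usable both in scripts and in code that wants to log or display the problem elsewhere.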

Why do you have a list of both numbers and some other kind of object? It seems like you’re trying to compensate for a design flaw.

As a matter of fact, I want it to work this way because I want to keep data that is already encoded in JsonedData(), and I want the json module to give me some way to insert a ‘raw’ item rather than the default encoding, so that the encoded JsonedData can be reused.

Here’s the code, thanks.

import json

class JsonedData:
    def __init__(self, data):
        self.data = data

def main():
    try:
        for chunk in json.JSONEncoder().iterencode([1, 2, 3, JsonedData('4'), 5]):
            print(chunk)
    except TypeError:
        pass  # need some way to make the printing continue
    # so that the printed data is something like:
    # [1
    # ,2
    # ,3
    # ,
    # ,5]

asked Oct 18, 2011 at 2:52 by tdihp

Put the try/except inside the loop, around the json.JSONEncoder().encode(item) call:

# JsonedData is the class from the question.
print("[", end=" ")
lst = [1, 2, 3, JsonedData('4'), 5]
for i, item in enumerate(lst):
    try:
        chunk = json.JSONEncoder().encode(item)
    except TypeError:
        pass
    else:
        print(chunk)
    finally:
        # don't print the ',' if this is the last item in lst
        if i + 1 != len(lst):
            print(",")
print("]")

answered Oct 18, 2011 at 3:01 by chown

Create a default method for your JsonedData object so that the encoder knows how to handle it. Note that the skipkeys option of JSONEncoder() only skips dict keys that are not of a basic type; it does not skip list items that it can’t encode. See the docs.

answered Oct 18, 2011 at 2:59 by imm
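
A minimal sketch of the default-hook approach, assuming JsonedData simply wraps a value that json already knows how to encode. The function name encode_jsoned is hypothetical; json.dumps(..., default=...) and skipkeys are standard library parameters.

```python
import json

class JsonedData:
    """Wrapper from the question: holds a value the encoder does not know."""
    def __init__(self, data):
        self.data = data

def encode_jsoned(obj):
    # json calls this hook only for objects it cannot serialize itself.
    if isinstance(obj, JsonedData):
        return obj.data
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

encoded = json.dumps([1, 2, 3, JsonedData("4"), 5], default=encode_jsoned)
print(encoded)

# skipkeys affects dict keys only: the tuple key is dropped, values are untouched.
skipped = json.dumps({(1, 2): "x", "a": 1}, skipkeys=True)
print(skipped)
```

The hook returns a plain Python value that json then encodes as usual, so this does not insert a pre-encoded raw fragment verbatim; it only teaches the encoder how to unwrap the object.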

When an invalid instance is encountered, a ValidationError will be
raised or returned, depending on which method or function is used.

exception jsonschema.exceptions.ValidationError(message: str, validator=<unset>, path=(), cause=None, context=(), validator_value=<unset>, instance=<unset>, schema=<unset>, schema_path=(), parent=None, type_checker=<unset>)

An instance was invalid under a provided schema.

The information carried by an error roughly breaks down into:

What Happened    Why Did It Happen    What Was Being Validated
message          context              instance
                 cause                json_path
                                      path
                                      schema
                                      schema_path
                                      validator
                                      validator_value

message

A human readable message explaining the error.

validator

The name of the failed keyword.

validator_value

The associated value for the failed keyword in the schema.

schema

The full schema that this error came from. This is potentially a
subschema from within the schema that was passed in originally,
or even an entirely different schema if a $ref was
followed.

relative_schema_path

A collections.deque containing the path to the failed keyword
within the schema.

absolute_schema_path

A collections.deque containing the path to the failed
keyword within the schema, but always relative to the
original schema as opposed to any subschema (i.e. the one
originally passed into a validator class, not schema).

schema_path

Same as relative_schema_path.

relative_path

A collections.deque containing the path to the
offending element within the instance. The deque can be empty if
the error happened at the root of the instance.

absolute_path

A collections.deque containing the path to the
offending element within the instance. The absolute path
is always relative to the original instance that was
validated (i.e. the one passed into a validation method, not
instance). The deque can be empty if the error happened
at the root of the instance.

json_path

A JSON path
to the offending element within the instance.

path

Same as relative_path.

instance

The instance that was being validated. This will differ from
the instance originally passed into validate if the
validator object was in the process of validating a (possibly
nested) element within the top-level instance. The path within
the top-level instance (i.e. ValidationError.path) could
be used to find this object, but it is provided for convenience.

context

If the error was caused by errors in subschemas, the list of errors
from the subschemas will be available on this property. The
schema_path and path of these errors will be relative
to the parent error.

cause

If the error was caused by a non-validation error, the
exception object will be here. Currently this is only used
for the exception raised by a failed format checker in
jsonschema.FormatChecker.check.

parent

A validation error which this error is the context of.
None if there wasn’t one.

In case an invalid schema itself is encountered, a SchemaError is
raised.

exception jsonschema.exceptions.SchemaError(message: str, validator=<unset>, path=(), cause=None, context=(), validator_value=<unset>, instance=<unset>, schema=<unset>, schema_path=(), parent=None, type_checker=<unset>)

A schema was invalid under its corresponding metaschema.

The same attributes are present as for ValidationErrors.

These attributes can be clarified with a short example:

from jsonschema import Draft202012Validator

schema = {
    "items": {
        "anyOf": [
            {"type": "string", "maxLength": 2},
            {"type": "integer", "minimum": 5}
        ]
    }
}
instance = [{}, 3, "foo"]
v = Draft202012Validator(schema)
errors = sorted(v.iter_errors(instance), key=lambda e: e.path)

The error messages in this situation are not very helpful on their own.

for error in errors:
    print(error.message)

outputs:

{} is not valid under any of the given schemas
3 is not valid under any of the given schemas
'foo' is not valid under any of the given schemas

If we look at ValidationError.path on each of the errors, we can find
out which elements in the instance correspond to each of the errors. In
this example, ValidationError.path will have only one element, which
will be the index in our list.

for error in errors:
    print(list(error.path))

Since our schema contained nested subschemas, it can be helpful to look at
the specific part of the instance and subschema that caused each of the errors.
This can be seen with the ValidationError.instance and
ValidationError.schema attributes.

With keywords like anyOf, the ValidationError.context
attribute can be used to see the sub-errors which caused the failure. Since
these errors actually came from two separate subschemas, it can be helpful to
look at the ValidationError.schema_path attribute as well to see where
exactly in the schema each of these errors come from. In the case of sub-errors
from the ValidationError.context attribute, this path will be relative
to the ValidationError.schema_path of the parent error.

for error in errors:
    for suberror in sorted(error.context, key=lambda e: e.schema_path):
        print(list(suberror.schema_path), suberror.message, sep=", ")

outputs:

[0, 'type'], {} is not of type 'string'
[1, 'type'], {} is not of type 'integer'
[0, 'type'], 3 is not of type 'string'
[1, 'minimum'], 3 is less than the minimum of 5
[0, 'maxLength'], 'foo' is too long
[1, 'type'], 'foo' is not of type 'integer'

The string representation of an error combines some of these attributes for
easier debugging.

3 is not valid under any of the given schemas

Failed validating 'anyOf' in schema['items']:
    {'anyOf': [{'maxLength': 2, 'type': 'string'},
               {'minimum': 5, 'type': 'integer'}]}

On instance[1]:
    3

ErrorTrees

If you want to programmatically query which validation keywords
failed when validating a given instance, you may want to do so using
jsonschema.exceptions.ErrorTree objects.

class jsonschema.exceptions.ErrorTree(errors=())

ErrorTrees make it easier to check which validations failed.

errors

The mapping of validation keywords to the error objects (usually jsonschema.exceptions.ValidationErrors) at this level of the tree.

__contains__(index)

Check whether instance[index] has any errors.

__getitem__(index)

Retrieve the child tree one level down at the given index.

If the index is not in the instance that this tree corresponds
to and is not known by this tree, whatever error would be raised
by instance.__getitem__ will be propagated (usually this is
some subclass of LookupError).

__init__(errors=())

__iter__()

Iterate (non-recursively) over the indices in the instance with errors.

__len__()

Return the total_errors.

__repr__()

Return repr(self).

__setitem__(index, value)

Add an error to the tree at the given index.

property total_errors

The total number of errors in the entire tree, including children.

Consider the following example:

schema = {
    "type" : "array",
    "items" : {"type" : "number", "enum" : [1, 2, 3]},
    "minItems" : 3,
}
instance = ["spam", 2]

For clarity’s sake, the given instance has three errors under this schema:

v = Draft202012Validator(schema)
for error in sorted(v.iter_errors(["spam", 2]), key=str):
    print(error.message)

outputs:

'spam' is not of type 'number'
'spam' is not one of [1, 2, 3]
['spam', 2] is too short

Let’s construct a jsonschema.exceptions.ErrorTree so that we
can query the errors a bit more easily than by just iterating over the
error objects.

from jsonschema.exceptions import ErrorTree
tree = ErrorTree(v.iter_errors(instance))

As you can see, jsonschema.exceptions.ErrorTree takes an iterable of ValidationErrors when constructing a tree so you can directly pass it the return value of a validator’s jsonschema.protocols.Validator.iter_errors method.

ErrorTrees support a number of useful operations. The first one we
might want to perform is to check whether a given element in our instance
failed validation. We do so using the in operator:

>>> 0 in tree
True

>>> 1 in tree
False

The interpretation here is that index 0 of the instance ("spam")
did have an error (in fact it had 2), while index 1 (2) did not (i.e.
it was valid).

If we want to see which errors a child had, we index into the tree and look at
the ErrorTree.errors attribute.

>>> sorted(tree[0].errors)
['enum', 'type']

Here we see that the enum and type keywords failed for
index 0. In fact ErrorTree.errors is a dict, whose values are the
ValidationErrors, so we can get at those directly if we want them.

>>> print(tree[0].errors["type"].message)
'spam' is not of type 'number'

Of course this means that if we want to know if a given validation
keyword failed for a given index, we check for its presence in
ErrorTree.errors:

>>> "enum" in tree[0].errors
True

>>> "minimum" in tree[0].errors
False

Finally, if you were paying close enough attention, you’ll notice that
we haven’t seen our minItems error appear anywhere yet. This is
because minItems is an error that applies globally to the instance
itself. So it appears in the root node of the tree.

>>> "minItems" in tree.errors
True

That’s all you need to know to use error trees.

To summarize, each tree contains child trees that can be accessed by
indexing the tree to get the corresponding child tree for a given
index into the instance. Each tree and child has an ErrorTree.errors
attribute, a dict, that maps the failed validation keyword to the
corresponding validation error.

best_match and relevance

The best_match function is a simple but useful function for attempting
to guess the most relevant error in a given bunch.

>>> from jsonschema import Draft202012Validator
>>> from jsonschema.exceptions import best_match

>>> schema = {
...     "type": "array",
...     "minItems": 3,
... }
>>> print(best_match(Draft202012Validator(schema).iter_errors(11)).message)
11 is not of type 'array'

jsonschema.exceptions.best_match(errors, key=<function by_relevance.<locals>.relevance>)

Try to find an error that appears to be the best match among given errors.

In general, errors that are higher up in the instance (i.e. for which
ValidationError.path is shorter) are considered better matches,
since they indicate “more” is wrong with the instance.

If the resulting match is either oneOf or anyOf, the
opposite assumption is made – i.e. the deepest error is picked,
since these keywords only need to match once, and any other errors
may not be relevant.

Parameters:
  • errors (collections.abc.Iterable) – the errors to select from. Do not provide a mixture of
    errors from different validation attempts (i.e. from
    different instances or schemas), since it won’t produce
    sensical output.

  • key (collections.abc.Callable) – the key to use when sorting errors. See relevance and
    transitively by_relevance for more details (the default is
    to sort with the defaults of that function). Changing the
    default is only useful if you want to change the function
    that rates errors but still want the error context descent
    done by this function.

Returns:

the best matching error, or None if the iterable was empty

Note

This function is a heuristic. Its return value may change for a given
set of inputs from version to version if better heuristics are added.

jsonschema.exceptions.relevance(validation_error)

A key function that sorts errors based on heuristic relevance.

If you want to sort a bunch of errors entirely, you can use
this function to do so. Using this function as a key to e.g.
sorted or max will cause more relevant errors to be
considered greater than less relevant ones.

Within the different validation keywords that can fail, this
function considers anyOf and oneOf to be weak
validation errors, and will sort them lower than other errors at the
same level in the instance.

If you want to change the set of weak [or strong] validation
keywords you can create a custom version of this function with
by_relevance and provide a different set of each.

>>> from jsonschema import exceptions
>>> schema = {
...     "properties": {
...         "name": {"type": "string"},
...         "phones": {
...             "properties": {
...                 "home": {"type": "string"}
...             },
...         },
...     },
... }
>>> instance = {"name": 123, "phones": {"home": [123]}}
>>> errors = Draft202012Validator(schema).iter_errors(instance)
>>> [
...     e.path[-1]
...     for e in sorted(errors, key=exceptions.relevance)
... ]
['home', 'name']

jsonschema.exceptions.by_relevance(weak=frozenset({'anyOf', 'oneOf'}), strong=frozenset({}))

Create a key function that can be used to sort errors by relevance.

Parameters:
  • weak (set) – a collection of validation keywords to consider to be
    “weak”. If there are two errors at the same level of the
    instance and one is in the set of weak validation keywords,
    the other error will take priority. By default, anyOf
    and oneOf are considered weak keywords and will be
    superseded by other same-level validation errors.

  • strong (set) – a collection of validation keywords to consider to be
    “strong”

JSON (JavaScript Object Notation) is a universal data format that is widely used to exchange data between a web server and a client. Problems with parsing the data often come up when working with JSON in Python.

Consider a typical problem. Suppose there is a file with data in JSON format:

{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": [
        "id": "valore"
    ],
    "om_points": "value",
    "parameters": [
        "id": "valore"
    ]
}

And there is a Python script that tries to read this data:

import json
from pprint import pprint

with open('data.json') as f:
    data = json.load(f)

pprint(data)

Running this script raises json.decoder.JSONDecodeError. This happens because the data in the JSON file is not in a valid format. In JSON, key-value pairs are allowed only inside an object (curly braces). In this example, however, the "masks" and "parameters" arrays contain bare key-value pairs that are not wrapped in an object.

To fix the error, make sure that all the data in the JSON file follows the correct format. In this case, the corrected file could look like this:

{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": [
        {
            "id": "valore"
        }
    ],
    "om_points": "value",
    "parameters": [
        {
            "id": "valore"
        }
    ]
}

So when errors occur while parsing JSON in Python, it is important to check carefully that the data conforms to valid JSON syntax.
