Use-case

JSON logging is very common when you process your logs with third-party tools or open-source projects like Logstash. These tools usually do more complex filtering on structured data, so JSON is the preferred format there.

We also wanted to integrate with a third-party tool at work, so we needed to add JSON-formatted logs to our projects.

I will not go into the details of whether or not you should use JSON logging, as each approach has its pros and cons. Instead, I will explain how you can do it in Python.

The final goal would be to go from:

2022-09-14 23:47:11,506248 - myapp - DEBUG - debug message

To:

{
    "threadName": "MainThread",
    "name": "root",
    "thread": 140735202359648,
    "created": 1336281068.506248,
    "process": 41937,
    "processName": "MainProcess",
    "relativeCreated": 9.100914001464844,
    "module": "app",
    "funcName": "do_logging",
    "levelno": 20,
    "pathname": "app.py",
    "lineno": 20,
    "asctime": ["2022-09-14 23:47:11,506248"],
    "message": "debug message",
    "filename": "main.py",
    "levelname": "DEBUG",
}

Existing projects

If you do a quick search, like I did, you will find two (more or less) active projects which do this, python-json-logger being one of them.

And it’s interesting that if you check the downloads of these projects on PePy or some other tool, you can see that many people actually use them. As of writing this, python-json-logger had a daily download rate of around 200K!

Why I think you probably don’t need that

The Python logging module provides a Formatter class which can be used to emit logs in any format you want.
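
For example, the plain-text line at the beginning of this post comes from a regular format string. A minimal sketch (the exact format string is an assumption for illustration, not necessarily what your project uses):

import logging

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler()
# A classic text formatter; swapping this out is all that is needed for JSON.
handler.setFormatter(
    logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
)
logger.addHandler(handler)

logger.debug("debug message")
# e.g. 2022-09-14 23:47:11,506 - myapp - DEBUG - debug message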

A very simple and minimal example of a JSON formatter can be written as:

import json
import logging


class JSONFormatter(logging.Formatter):
    def __init__(self) -> None:
        super().__init__()

        # "msg" and "args" are dropped; the interpolated "message" key
        # added below carries the final text instead.
        self._ignore_keys = {"msg", "args"}

    def format(self, record: logging.LogRecord) -> str:
        # Start from the record's attribute dict and add the interpolated message.
        message = record.__dict__.copy()
        message["message"] = record.getMessage()

        for key in self._ignore_keys:
            message.pop(key, None)

        # Format exception and stack information, if present, the same way
        # the standard Formatter does.
        if record.exc_info and record.exc_text is None:
            record.exc_text = self.formatException(record.exc_info)

        if record.exc_text:
            message["exc_info"] = record.exc_text

        if record.stack_info:
            message["stack_info"] = self.formatStack(record.stack_info)

        return json.dumps(message)

The code is really simple: for each record, it takes a copy of the record’s __dict__ and turns it into JSON with json.dumps().

The only extra conditions add exc_info and stack_info when they are available, formatting the exception and stack information from the record. You can easily modify that to fit your needs.
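
For instance, a variant that keeps only a handful of fields and stamps every line with a static service name could look like this (a sketch built on top of the JSONFormatter above; the chosen fields and the "my-service" value are placeholders, not part of the original code):

class SelectiveJSONFormatter(JSONFormatter):
    # Hypothetical variant: whitelist a few fields and tag every line.
    _keep_keys = {"name", "levelname", "message", "lineno", "exc_info", "stack_info"}

    def format(self, record: logging.LogRecord) -> str:
        full = json.loads(super().format(record))
        trimmed = {key: value for key, value in full.items() if key in self._keep_keys}
        trimmed["service"] = "my-service"  # assumed static field, adjust as needed
        return json.dumps(trimmed)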

And to wire the JSONFormatter into a logger:

import logging

# import JSONFormatter

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())

logger.addHandler(handler)

logger.debug("debug message")

which will output:

{
    "name": "__main__",
    "levelname": "DEBUG",
    "levelno": 10,
    "pathname": "main.py",
    "filename": "main.py",
    "module": "main",
    "exc_info": null,
    "exc_text": null,
    "stack_info": null,
    "lineno": 38,
    "funcName": "<module>",
    "created": 1663168021.864416,
    "msecs": 864.4158840179443,
    "relativeCreated": 1.2068748474121094,
    "thread": 8673392128,
    "threadName": "MainThread",
    "processName": "MainProcess",
    "process": 14747,
    "message": "debug message",
}
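
To see the exception handling in action, log from inside an except block. logger.exception() attaches exc_info automatically, and the formatter above serializes the formatted traceback into the exc_info key (the output shown is a sketch, trimmed for brevity):

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("division failed")

# Output will contain something like (other fields omitted):
# {"levelname": "ERROR", "message": "division failed", "exc_info": "Traceback (most recent call last): ..."}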

For a list of all LogRecord attributes, you can check Python’s documentation.
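
If you configure logging declaratively, the same class can also be plugged in via logging.config.dictConfig using the "()" factory key. A minimal sketch, assuming JSONFormatter lives in a (hypothetical) myapp.logging module:

import logging.config

logging.config.dictConfig(
    {
        "version": 1,
        "formatters": {
            # "()" tells dictConfig to instantiate a custom formatter class.
            "json": {"()": "myapp.logging.JSONFormatter"},
        },
        "handlers": {
            "stdout": {
                "class": "logging.StreamHandler",
                "formatter": "json",
            },
        },
        "root": {"level": "DEBUG", "handlers": ["stdout"]},
    }
)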

That’s why I think, for code this simple, you are probably better off implementing it in your own codebase rather than relying on a third-party package.