Overview of Paperless-ng

Compared to paperless, paperless-ng works a little differently under the hood and has more moving parts that work together. While this increases the complexity of the system, it also brings many benefits.

Paperless consists of the following components:

  • The webserver: This is pretty much the same as in paperless. It serves the administration pages, the API, and the new frontend. This is the main tool you’ll be using to interact with paperless. You may start the webserver with

    $ cd /path/to/paperless/src/
    $ pipenv run gunicorn -c /usr/src/paperless/ -b paperless.wsgi

    or by any other means such as Apache mod_wsgi.

  • The consumer: This is what watches your consumption folder for documents. However, the consumer itself does not really consume your documents anymore. Instead, it notifies a task processor that a new file is ready for consumption. I suppose it should be named differently. The consumer also used to check your emails, but that task has moved elsewhere as well.

    Start the consumer with the management command document_consumer:

    $ cd /path/to/paperless/src/
    $ pipenv run python3 manage.py document_consumer
  • The task processor: Paperless relies on Django Q for doing much of the heavy lifting. This is a task queue that accepts tasks from multiple sources and processes tasks in parallel. It also comes with a scheduler that executes certain commands periodically.

    This task processor is responsible for:

    • Consuming documents. When the consumer finds new documents, it notifies the task processor to start a consumption task.

    • Consuming emails. It periodically checks your configured accounts for new mails and produces consumption tasks for any documents it finds.

    • The task processor also performs the consumption of any documents you upload through the web interface.

    • Maintain the search index and the automatic matching algorithm. These are things that paperless needs to do from time to time in order to operate properly.

    This allows paperless to process multiple documents from your consumption folder in parallel! On a modern multi-core system, consumption with full OCR is blazing fast.

    The task processor comes with a built-in admin interface that you can use to see whenever any of the tasks fail and inspect the errors (e.g., wrong email credentials, errors during consuming a specific file, etc).

    You may start the task processor by executing:

    $ cd /path/to/paperless/src/
    $ pipenv run python3 manage.py qcluster
  • A redis message broker: This is a really lightweight service that is responsible for getting the tasks from the webserver and consumer to the task processor. Because these run in different processes (maybe even on different machines!), a broker between them is necessary.

  • Optional: A database server. Paperless supports both PostgreSQL and SQLite for storing its data.


You can go multiple routes with setting up and running Paperless:

The Docker routes are quick and easy, and they are the recommended routes. They configure all the components described above automatically, so that everything just works, and use sensible defaults for all configuration options.

The bare metal route is more complicated to set up but makes it easier should you want to contribute some code back. You need to configure and run the above-mentioned components yourself.

Install Paperless from Docker Hub

  1. Go to the /docker/compose directory on the project page and download one of the docker-compose.*.yml files, depending on which database backend you want to use. Rename this file to docker-compose.yml. If you want to enable optional support for Office documents, download a file with -tika in its name. Download the docker-compose.env file and the .env file as well and store them in the same directory.


    For new installations, it is recommended to use PostgreSQL as the database backend.

  2. Install Docker and docker-compose.


    If you want to use the included docker-compose.*.yml file, you need to have at least Docker version 17.09.0 and docker-compose version 1.17.0.

    See the Docker installation guide on how to install the current version of Docker for your operating system or Linux distribution of choice. To get an up-to-date version of docker-compose, follow the docker-compose installation guide if your package repository doesn’t include it.

  3. Modify docker-compose.yml to your preferences. You may want to change the path to the consumption directory in this file. Find the line that specifies where to mount the consumption directory:

    - ./consume:/usr/src/paperless/consume

    Replace the part BEFORE the colon with a local directory of your choice:

    - /home/jonaswinkler/paperless-inbox:/usr/src/paperless/consume

    Don’t change the part after the colon or paperless won’t find your documents.

  4. Modify docker-compose.env, following the comments in the file. The most important change is to set USERMAP_UID and USERMAP_GID to the uid and gid of your user on the host system. This ensures that both the docker container and you on the host machine have write access to the consumption directory. If your UID and GID on the host system are 1000 (the default for the first normal user on most systems), it will work out of the box without any modifications.


    You can use any settings from the file paperless.conf.example in this file. Have a look at Configuration to see what’s available.


    Certain file systems such as NFS network shares don’t support file system notifications with inotify. When storing the consumption directory on such a file system, paperless will be unable to pick up new files with the default configuration. You will need to use PAPERLESS_CONSUMER_POLLING, which disables inotify.

  5. Run docker-compose up -d. This will create and start the necessary containers.

  6. To be able to log in, you will need a superuser. To create it, execute the following command:

    $ docker-compose run --rm webserver createsuperuser

    This will prompt you to set a username, an optional e-mail address and finally a password.

  7. The default docker-compose.yml exports the webserver on your local port 8000. If you haven’t adapted this, you should now be able to reach your Paperless instance on that port. You can log in with the user and password you just created.
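
Steps 3 and 4 can also be scripted. The following is a minimal sketch and not part of paperless itself: the function names set_consume_dir and usermap_lines are made up for illustration, and the sketch assumes that the mount line in docker-compose.yml looks exactly like the one shown in step 3 and that the chosen directory contains no | or & characters.

```shell
#!/bin/sh
# Hypothetical helpers for steps 3 and 4; the names are illustrative only.

# Rewrite the host side of the consumption mount in a compose file.
# $1: path to docker-compose.yml, $2: local directory to use as the inbox.
set_consume_dir() {
    compose_file=$1
    host_dir=$2
    tmp_file=$(mktemp) || return 1
    # Only the part BEFORE the colon changes; the container path stays fixed.
    sed "s|- [^:]*:/usr/src/paperless/consume|- ${host_dir}:/usr/src/paperless/consume|" \
        "$compose_file" > "$tmp_file" && cat "$tmp_file" > "$compose_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Print the two lines to place in docker-compose.env for the current user.
usermap_lines() {
    printf 'USERMAP_UID=%s\n' "$(id -u)"
    printf 'USERMAP_GID=%s\n' "$(id -g)"
}

# Example usage (not executed here):
#   set_consume_dir docker-compose.yml /home/jonaswinkler/paperless-inbox
#   usermap_lines >> docker-compose.env
```

Check the resulting files by hand before running docker-compose up -d, since the sketch does not verify that a matching line was found.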

Build the docker image yourself

  1. Clone the entire repository of paperless:

    git clone

    The master branch always reflects the latest stable version.

  2. Copy one of the docker/compose/docker-compose.*.yml to docker-compose.yml in the root folder, depending on which database backend you want to use. Copy docker-compose.env into the project root as well.

  3. In the docker-compose.yml file, find the line that instructs docker-compose to pull the paperless image from Docker Hub:

        image: jonaswinkler/paperless-ng:latest

    and replace it with a line that instructs docker-compose to build the image from the current working directory instead:

        build: .
  4. Run the script. This requires node and npm >= v15.

  5. Follow steps 2 to 7 of Install Paperless from Docker Hub. When asked to run docker-compose up -d to start the containers, do

    $ docker-compose build

    before that to build the image.

Bare Metal Route

Paperless runs on Linux only. The following procedure has been tested on a minimal installation of Debian/Buster, which is the current stable release at the time of writing. Windows is not and will never be supported.

  1. Install dependencies. Paperless requires the following packages.

    • python3 3.6, 3.7, 3.8 (3.9 is untested).

    • python3-pip, optionally pipenv for package installation

    • python3-dev

    • fonts-liberation for generating thumbnails for plain text files

    • imagemagick >= 6 for PDF conversion

    • optipng for optimizing thumbnails

    • gnupg for handling encrypted documents

    • libpoppler-cpp-dev for PDF to text conversion

    • libpq-dev for PostgreSQL

    • libmagic-dev for mime type detection

    • mime-support for mime type detection

    These dependencies are required for OCRmyPDF, which is used for text recognition.

    • unpaper

    • ghostscript

    • icc-profiles-free

    • qpdf

    • liblept5

    • libxml2

    • pngquant

    • zlib1g

    • tesseract-ocr >= 4.0.0 for OCR

    • tesseract-ocr language packs (tesseract-ocr-eng, tesseract-ocr-deu, etc)

    On Raspberry Pi, these libraries are required as well:

    • libatlas-base-dev

    • libxslt1-dev

    You will also need build-essential, python3-setuptools and python3-wheel for installing some of the python dependencies.

  2. Install redis >= 5.0 and configure it to start automatically.

  3. Optional. Install postgresql and configure a database, user and password for paperless. If you do not wish to use PostgreSQL, SQLite is available as well.

  4. Get the release archive. If you clone the git repo as it is, you also have to compile the front end by yourself. Extract the archive to a place from where you wish to execute it, such as /opt/paperless.

  5. Configure paperless. See Configuration for details. Edit the included paperless.conf and adjust the settings to your needs. Required settings for getting paperless running are:

    • PAPERLESS_REDIS should point to your redis server, such as redis://localhost:6379.

    • PAPERLESS_DBHOST should be the hostname on which your PostgreSQL server is running. Leave this setting unset to use SQLite instead. Also configure port, database name, user and password as necessary.

    • PAPERLESS_CONSUMPTION_DIR should point to a folder which paperless should watch for documents. You may choose any location for this folder. Likewise, PAPERLESS_DATA_DIR and PAPERLESS_MEDIA_ROOT define where paperless stores its data. If you like, you can point both to the same directory.

    • PAPERLESS_SECRET_KEY should be a random sequence of characters. It’s used for authentication. Failure to set it allows third parties to forge authentication credentials.

    Many more adjustments can be made to paperless, especially the OCR part. The following options are recommended for everyone:

    • Set PAPERLESS_OCR_LANGUAGE to the language most of your documents are written in.

    • Set PAPERLESS_TIME_ZONE to your local time zone.

  6. Set up permissions. Create a system user under which you wish to run paperless. Ensure that the following directories exist and that this user has write permissions to them:

    • /opt/paperless/media

    • /opt/paperless/data

    • /opt/paperless/consume

    Adjust as necessary if you configured different folders.

  7. Install python requirements. Paperless comes with both Pipfiles for pipenv as well as with a requirements.txt. Both will install exactly the same requirements. It is up to you if you wish to use a virtual environment or not.

  8. Go to /opt/paperless/src, and execute the following commands:

    # This creates the database schema.
    python3 manage.py migrate
    # This creates your first paperless user
    python3 manage.py createsuperuser
  9. Optional: Test that paperless is working by executing

    # This starts the Django development server.
    python3 manage.py runserver

    and pointing your browser to http://localhost:8000/.


    This is a development server which should not be used in production.


    This will not start the consumer. Paperless does this in a separate process.

  10. Set up systemd services to run paperless automatically. You may use the service definition files included in the scripts folder as a starting point.

    Paperless needs the webserver script to run the webserver, the consumer script to watch the input folder, and the scheduler script to run tasks such as email checking and document consumption.

    These services rely on redis and optionally the database server, but don’t need to be started in any particular order. The example files depend on redis being started. If you use a database server, you should add additional dependencies.


    You may optionally set up your preferred web server to serve paperless as a wsgi application directly instead of running the webserver service. The module containing the wsgi application is named paperless.wsgi.


    The included scripts run a gunicorn standalone server, which is fine for running paperless. It does support SSL; however, the gunicorn documentation states that you should use a proxy server in front of gunicorn instead.

  11. Optional: Install a samba server and make the consumption folder available as a network share.

  12. Configure ImageMagick to allow processing of PDF documents. Most distributions have this disabled by default, since PDF documents can contain malware. If you don’t do this, paperless will fall back to ghostscript for certain steps such as thumbnail generation.

    Edit /etc/ImageMagick-6/policy.xml and adjust

    <policy domain="coder" rights="none" pattern="PDF" />

    to

    <policy domain="coder" rights="read|write" pattern="PDF" />
  13. Optional: Install the jbig2enc encoder. This will reduce the size of generated PDF documents. You’ll most likely need to compile this by yourself, because this software has been patented until around 2017 and binary packages are not available for most distributions.
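
Parts of steps 5, 6 and 12 lend themselves to scripting. The following is a minimal sketch under stated assumptions, not something shipped with paperless: the function names gen_secret_key, prepare_dirs and allow_pdf_policy are hypothetical, the directory layout follows the /opt/paperless example from step 6, and the policy line is assumed to match the one shown in step 12 character for character.

```shell
#!/bin/sh
# Hypothetical helpers for steps 5, 6 and 12; the names are illustrative only.

# Print a random 50-character alphanumeric string,
# e.g. as a value for PAPERLESS_SECRET_KEY.
gen_secret_key() {
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 50
    echo
}

# Create media, data and consume below a base directory (step 6).
# $1: base directory, $2: optional user to hand the directories to.
prepare_dirs() {
    base=$1
    user=${2:-}
    for name in media data consume; do
        mkdir -p "$base/$name" || return 1
        if [ -n "$user" ]; then
            chown "$user" "$base/$name" || return 1
        fi
    done
}

# Switch the PDF coder policy from "none" to "read|write" (step 12).
# $1: path to the ImageMagick policy.xml.
allow_pdf_policy() {
    policy_file=$1
    tmp_file=$(mktemp) || return 1
    sed 's#\(<policy domain="coder" rights="\)none\(" pattern="PDF" />\)#\1read|write\2#' \
        "$policy_file" > "$tmp_file" && cat "$tmp_file" > "$policy_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Example usage (not executed here; the last two calls usually need root,
# and the user name "paperless" is an assumption):
#   gen_secret_key
#   prepare_dirs /opt/paperless paperless
#   allow_pdf_policy /etc/ImageMagick-6/policy.xml
```

Keep a copy of policy.xml before editing it, so that the distribution default can be restored later.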

Migration to paperless-ng

At its core, paperless-ng is still paperless and fully compatible. However, some things have changed under the hood, so you need to adapt your setup depending on how you installed paperless. The important things to keep in mind are as follows.

  • Read the changelog and take note of breaking changes.

  • You should decide if you want to stick with SQLite or want to migrate your database to PostgreSQL. See Moving data from SQLite to PostgreSQL for details on how to move your data from SQLite to PostgreSQL. Both work fine with paperless. However, if you already have a database server running for other services, you might as well use it for paperless as well.

  • The task scheduler of paperless, which is used to execute periodic tasks such as email checking and maintenance, requires a redis message broker instance. The docker-compose route takes care of that.

  • The layout of the folder structure for your documents and data remains the same, so you can just plug your old docker volumes into paperless-ng and expect it to find everything where it should be.

Migration to paperless-ng is then performed in a few simple steps:

  1. Stop paperless.

    $ cd /path/to/current/paperless
    $ docker-compose down
  2. Do a backup, for two reasons: First, if something goes wrong, you still have your data. Second, if you don’t like paperless-ng, you can switch back to paperless.

  3. Download the latest release of paperless-ng. You can either go with the provided docker-compose files (see Install Paperless from Docker Hub) or clone the repository to build the image yourself (see above). You can either replace your current paperless folder or put paperless-ng in a different location.


    Paperless includes a .env file. This will set the project name for docker compose to paperless so that paperless-ng will automatically reuse your existing paperless volumes. When you start it, it will migrate your existing data. After that, your old paperless installation will be incompatible with the migrated volumes.

  4. Download the docker-compose.sqlite.yml file to docker-compose.yml. If you want to switch to PostgreSQL, do that after you migrated your existing SQLite database.

  5. Adjust docker-compose.yml and docker-compose.env to your needs. See Install Paperless from Docker Hub for details on which edits are advised.

  6. Update paperless.

  7. In order to find your existing documents with the new search feature, you need to invoke a one-time operation that will create the search index:

    $ docker-compose run --rm webserver document_index reindex

    This will migrate your database and create the search index. After that, paperless will take care of maintaining the index by itself.

  8. Start paperless-ng.

    $ docker-compose up -d

    This will run paperless in the background and automatically start it on system boot.

  9. Paperless installed a permanent redirect to admin/ in your browser. This redirect is still in place and prevents access to the new UI. Clear your browsing cache in order to fix this.

  10. Optionally, follow the instructions below to migrate your existing data to PostgreSQL.

Moving data from SQLite to PostgreSQL

Moving your data from SQLite to PostgreSQL is done by executing a series of django management commands, as described below.


Make sure that your SQLite database is migrated to the latest version. Starting paperless will make sure that this is the case. If you try to load data from an old database schema in SQLite into a newer database schema in PostgreSQL, you will run into trouble.


On some database fields, PostgreSQL enforces predefined limits on maximum length, whereas SQLite does not. The fields in question are the title of documents (128 characters), names of document types, tags and correspondents (128 characters), and filenames (1024 characters). If you have data in these fields that surpasses these limits, migration to PostgreSQL is not possible and will fail with an error.

  1. Stop paperless, if it is running.

  2. Tell paperless to use PostgreSQL:

    1. With docker, copy the provided docker-compose.postgres.yml file to docker-compose.yml. Remember to adjust the consumption directory, if necessary.

    2. Without docker, configure the database in your paperless.conf file. See Configuration for details.

  3. Open a shell and initialize the database:

    1. With docker, run the following command to open a shell within the paperless container:

      $ cd /path/to/paperless
      $ docker-compose run --rm webserver /bin/bash

      This will launch the container and initialize the PostgreSQL database.

    2. Without docker, open a shell in your virtual environment, switch to the src directory and create the database schema:

      $ cd /path/to/paperless
      $ pipenv shell
      $ cd src
      $ python3 manage.py migrate

      This will not copy any data yet.

  4. Dump your data from SQLite:

    $ python3 manage.py dumpdata --database=sqlite --exclude=contenttypes --exclude=auth.Permission > data.json
  5. Load your data into PostgreSQL:

    $ python3 manage.py loaddata data.json
  6. Exit the shell.

    $ exit
  7. Start paperless.
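
Because over-long values make the migration fail, it can help to look for them in the SQLite database before step 1. The following is a minimal sketch that assumes the sqlite3 command-line tool is installed. The function name count_overlong is hypothetical, and the database path and the table and column names in the example calls are assumptions based on Django’s default naming, so verify them against your own database first.

```shell
#!/bin/sh
# Count rows whose text column exceeds a length limit.
# $1: SQLite file, $2: table, $3: column, $4: maximum length in characters.
count_overlong() {
    sqlite3 "$1" "SELECT count(*) FROM \"$2\" WHERE length(\"$3\") > $4;"
}

# Example calls (not executed here). Path, table and column names are
# assumptions; list the real tables with: sqlite3 <file> .tables
#   count_overlong data/db.sqlite3 documents_document title 128
#   count_overlong data/db.sqlite3 documents_tag name 128
#   count_overlong data/db.sqlite3 documents_correspondent name 128
```

A result of 0 for every field listed above suggests that the length limits will not stand in the way of the migration.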

Moving back to paperless

Let’s say you migrated to Paperless-ng and used it for a while, but decided that you don’t like it and want to move back. (If you do, send me a mail about what part you didn’t like!) You can do that with a few simple steps.

Paperless-ng modified the database schema slightly. However, these changes can be reverted while keeping your current data, so that it will be compatible with the original Paperless.

Execute this:

$ cd /path/to/paperless
$ docker-compose run --rm webserver migrate documents 0023

Or without docker:

$ cd /path/to/paperless/src
$ python3 manage.py migrate documents 0023

After that, you need to clear your cookies (Paperless-ng comes with updated dependencies that do cookie-processing differently) and probably your cache as well.

Considerations for less powerful devices

Paperless runs on Raspberry Pi. However, some things are rather slow on the Pi and configuring some options in paperless can help improve performance immensely:

  • Stick with SQLite to save some resources.

  • Consider setting PAPERLESS_OCR_PAGES to 1, so that paperless will only OCR the first page of your documents.

  • PAPERLESS_TASK_WORKERS and PAPERLESS_THREADS_PER_WORKER are configured to use all cores. The Raspberry Pi models 3 and up have 4 cores, meaning that paperless will use 2 workers and 2 threads per worker. This may result in sluggish response times during consumption, so you might want to lower these settings (example: 2 workers and 1 thread to always have some computing power left for other tasks).

  • Keep PAPERLESS_OCR_MODE at its default value skip and consider OCR’ing your documents before feeding them into paperless. Some scanners are able to do this! You might even want to specify skip_noarchive to skip archive file generation for already OCR’ed documents entirely.

  • Set PAPERLESS_OPTIMIZE_THUMBNAILS to ‘false’ if you want faster consumption times. Thumbnails will be about 20% larger.

For details, refer to Configuration.
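
Taken together, the suggestions above amount to a handful of settings. The sketch below prints them in env-file syntax. The function name low_power_settings is hypothetical, and the worker and thread counts are just the example values mentioned above, so treat them as a starting point and not as a recommendation for every device.

```shell
#!/bin/sh
# Print the low-power settings discussed above, one per line.
low_power_settings() {
    cat <<'EOF'
PAPERLESS_OCR_PAGES=1
PAPERLESS_TASK_WORKERS=2
PAPERLESS_THREADS_PER_WORKER=1
PAPERLESS_OCR_MODE=skip
PAPERLESS_OPTIMIZE_THUMBNAILS=false
EOF
}

# Example usage (not executed here):
#   low_power_settings >> docker-compose.env
```

On a bare metal installation, the same lines would go into paperless.conf instead.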


Updating the automatic matching algorithm takes quite a bit of time. However, the update mechanism checks if your data has changed before doing the heavy lifting. If you experience the algorithm taking too much CPU time, consider changing the schedule in the admin interface to daily. You can also manually invoke the task by changing the date and time of the next run to today/now.

The actual matching of the algorithm is fast and works on Raspberry Pi as well as on any other device.