Last updated: April 09, 2024
Run DQOps in Docker
This guide shows how to pull the DQOps Docker image from Docker Hub and how to pass the right parameters to the container to start it in production mode.
Overview
DQOps can run as a Docker container in server mode or in Shell mode. You can also build a custom DQOps container image.
Note
Running DQOps as a Docker container is the preferred method for starting it in a long-running production mode.
Prerequisites
To run DQOps as a Docker container you need:
- Docker running locally. Follow the instructions to download and install Docker.
- A DQOps Cloud account and a DQOps Cloud API Key, if you want to use all DQOps features, such as storing data quality definitions and results in the cloud or data quality dashboards. Create a new DQOps Cloud account if you do not have one yet.
- A DQOps User Home folder created locally, which will be mounted to your container. Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. The DQOps User Home folder stores local data such as sensor readouts, data quality check results, and the data source configuration.
Start DQOps in DQOps interactive shell mode
To start DQOps in Shell mode, follow the steps below.
- Download the DQOps image from Docker Hub with the docker pull command (docker pull dqops/dqo), run in a terminal.
- Create an empty folder where you want to create your DQOps User Home. The DQOps User Home is a folder where DQOps will store the metadata of imported data sources, the configuration of activated data quality checks, and the data quality results.
- Run the DQOps Docker image:
docker run -v [path to local DQOps user home folder]:/dqo/userhome -it -p 8888:8888 dqops/dqo [--dqo.cloud.api-key=here-your-DQOps-Cloud-API-key]
  - The -v flag mounts your locally created DQOps User Home folder into the container. You need to provide the path to your local DQOps User Home folder.
  - The -i flag keeps STDIN open even if not attached.
  - The -t flag allocates a pseudo-TTY.
  - The -p flag maps the host's port 8888 to the container's port 8888. Without the port mapping, you would not be able to access the application.
  - The --dqo.cloud.api-key argument specifies the API Key of your DQOps Cloud registration. When the DQOps Cloud API Key is not specified and you are starting DQOps using an empty DQOps User Home folder, DQOps will not be able to open the browser. Please copy the URL of the DQOps Cloud Login page that is shown in the terminal into a browser, and create or log in to your DQOps Cloud account.
If you want to use the current folder as your DQOps User Home, you can bind this folder to the /dqo/userhome mount point in the DQOps Docker image. Please keep in mind that the DQOps User Home folder should be empty (to initialize it on startup) or it should already be a valid DQOps User Home folder. Read the DQOps user home folder concept to learn more.
- After a few seconds, you can use the DQOps terminal or open the user interface at http://localhost:8888 in a web browser.
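The Shell mode steps above can be sketched as one small script. This is only an illustration: USER_HOME_DIR, DRY_RUN, and the invoke helper are hypothetical names introduced here, not part of DQOps or Docker. By default the sketch only prints the docker commands so they can be reviewed first.

```shell
#!/bin/sh
# Illustrative names, not part of DQOps: USER_HOME_DIR, DRY_RUN, invoke.
USER_HOME_DIR="${USER_HOME_DIR:-$PWD/dqops-userhome}"

# Print the command when DRY_RUN=1 (the default in this sketch); run it otherwise.
invoke() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: download the image from Docker Hub.
invoke docker pull dqops/dqo

# Step 2: create the folder that will become the DQOps User Home.
mkdir -p "$USER_HOME_DIR"

# Step 3: start DQOps in Shell mode with the folder mounted at /dqo/userhome.
# Append --dqo.cloud.api-key=... after the image name to pass a DQOps Cloud API Key.
invoke docker run -v "$USER_HOME_DIR:/dqo/userhome" -it -p 8888:8888 dqops/dqo
```

Running the script with DRY_RUN=0 executes the commands instead of printing them; the interactive DQOps Shell then needs a real terminal because of the -it flags.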
Start DQOps in server mode
To start DQOps in server mode, follow the steps below.
- Download the dqops/dqo image from Docker Hub with the docker pull command (docker pull dqops/dqo), run in a terminal.
- Run the DQOps Docker image with the docker run command, passing the parameters described below.
  - The -v flag mounts your locally created DQOps User Home folder into the container. You need to provide the path to your local DQOps User Home folder.
  - The -p flag maps the host's port 8888 to the container's port 8888. Without the port mapping, you would not be able to access the application.
  - The -d flag runs the container in detached (daemon) mode.
  - The -m parameter configures the memory limit for the container. We advise allocating at least 2 GB of memory for the DQOps container, which is configured by -m=2g. The DQOps container runs one Java (JVM) process and several small Python processes (two per core) that run the rules. The DQOps runtime allocates 80% of the container memory for the JVM heap. The memory is used for caching YAML and Parquet files. The JVM heap share can be changed by passing the DQO_JAVA_OPTS environment variable to the container using the following docker run parameter: -e DQO_JAVA_OPTS=-XX:MaxRAMPercentage=60.0
  - The --dqo.cloud.api-key argument specifies the API Key of your DQOps Cloud account.
  - The run command at the end runs the run CLI command and activates server mode without the DQOps Shell.
- After a few seconds, open http://localhost:8888/ in your web browser. You should see the DQOps user interface.
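As a rough sketch, the parameters described above combine into a single docker run command. The exact ordering shown here is reconstructed from the parameter descriptions and should be treated as an assumption. USER_HOME_DIR, API_KEY, DRY_RUN, invoke, and start_dqops_server are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Illustrative names, not part of DQOps: USER_HOME_DIR, API_KEY, DRY_RUN,
# invoke, start_dqops_server.
USER_HOME_DIR="${USER_HOME_DIR:-$PWD/dqops-userhome}"
API_KEY="${API_KEY:-here-your-DQOps-Cloud-API-key}"

# Print the command when DRY_RUN=1 (the default in this sketch); run it otherwise.
invoke() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assemble the server-mode command from the parameters listed above.
start_dqops_server() {
  invoke docker run \
    -v "$USER_HOME_DIR:/dqo/userhome" \
    -p 8888:8888 \
    -d \
    -m=2g \
    dqops/dqo \
    "--dqo.cloud.api-key=$API_KEY" \
    run
}

# Make sure the folder to be mounted exists, then start (or print) the command.
mkdir -p "$USER_HOME_DIR"
start_dqops_server
```

To change the share of container memory given to the JVM heap, the -e DQO_JAVA_OPTS=-XX:MaxRAMPercentage=60.0 parameter would go before the image name, next to the other docker run options.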
Build a custom DQOps container image
- Create an empty folder.
- Open a terminal, navigate to the created directory, and clone the DQOps repository from GitHub.
- Modify the DQOps Dockerfile located in the main directory of the repository.
- Build a DQOps container image from the Dockerfile with the docker build command. The -t parameter specifies the name for the container image, for example "your_dqo_image_name".
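The build steps above might look like the following sketch. The repository URL is an assumption and should be checked against the DQOps GitHub page. BUILD_DIR, IMAGE_NAME, REPO_URL, DRY_RUN, and invoke are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Illustrative names, not part of DQOps: BUILD_DIR, IMAGE_NAME, REPO_URL,
# DRY_RUN, invoke. The repository URL below is an assumption.
BUILD_DIR="${BUILD_DIR:-$PWD/dqops-build}"
IMAGE_NAME="${IMAGE_NAME:-your_dqo_image_name}"
REPO_URL="${REPO_URL:-https://github.com/dqops/dqo.git}"

# Print the command when DRY_RUN=1 (the default in this sketch); run it otherwise.
invoke() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: create an empty folder.
mkdir -p "$BUILD_DIR"

# Step 2: clone the DQOps repository into that folder.
invoke git clone "$REPO_URL" "$BUILD_DIR/dqo"

# Step 3 is manual: edit "$BUILD_DIR/dqo/Dockerfile" as needed.

# Step 4: build the image, using the repository root as the build context.
# The -t parameter sets the image name.
invoke docker build -t "$IMAGE_NAME" "$BUILD_DIR/dqo"
```

Passing the repository path as the build context is equivalent to changing into that directory and using "." as the context.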