Distinct count
This example checks the data in bigquery-public-data.covid_italy.national_trends using the distinct_count check.
The data is updated daily. The goal is to set up a uniqueness check and verify how many distinct values are in the data.
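To make the idea concrete, here is a minimal sketch of what a distinct count measures, using plain shell tools on a small list. The sample values are hypothetical and are not taken from the national_trends table; the actual check is run by DQO against BigQuery.

```shell
# Hypothetical sample values standing in for one column of a table.
values="2020-03-01
2020-03-02
2020-03-02
2020-03-03"

# sort -u keeps one copy of each value; wc -l counts the remaining lines.
# tr strips the padding that some wc implementations add.
distinct_count=$(printf '%s\n' "$values" | sort -u | wc -l | tr -d ' ')

echo "$distinct_count"
```

The list has four rows but only three different values, so the distinct count is lower than the row count. A gap like this is what a uniqueness check is meant to surface.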
Adding connection
GCP
Download and install the Google Cloud CLI. After installing it, log in to your GCP account (you can start one for free) by running:
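A minimal sketch of the login step, assuming the example relies on Application Default Credentials. The choice of this particular gcloud command is an assumption and is not confirmed by the text of this page.

```shell
# Assumption: the example authenticates through Application Default Credentials.
# This opens a browser window to complete the sign-in.
gcloud auth application-default login

# Optional: confirm that the credentials were stored and can issue a token.
gcloud auth application-default print-access-token > /dev/null && echo "credentials OK"
```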
After setting up the GCP account, create a GCP project. That will be the GCP billing project used to run SQL sensors on the public datasets provided by Google.
The examples read the name of the GCP billing project from the GCP_PROJECT environment variable.
Set and export this variable before starting the DQO shell.
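For instance, in a POSIX-compatible shell the variable could be set as follows. The project name my-gcp-project is a placeholder, so substitute the ID of the project you created.

```shell
# "my-gcp-project" is a placeholder; replace it with your own GCP project ID.
export GCP_PROJECT="my-gcp-project"

# Confirm the value that child processes will see.
echo "$GCP_PROJECT"
```

Because the variable is exported, any program started from the same terminal session afterwards inherits it.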
Navigate to the example directory and run the check
After starting the example, run the following commands in the DQO shell:
This command will let you log in or sign up for a cloud.dqo.ai account. The data quality checks will be executed, and the result files will be pushed to cloud.dqo.ai. Now you can open the browser, navigate to https://cloud.dqo.ai/, and review the sensor results on the dashboards.