Sensor are a predefined and customizable Jinja templates whose goal is to render SQL query in accordance with provider's SQL dialect.
Dialects (Jinja2 macros) are variables obtained from the table metadata and macros performing various functions, written in Jinja2.
User can configure basic statements as
order by, time series mode,
and different parameters
characteristic for the individual sensors (see the examples).
When changing configuration in YAML file in your editor, you can use code completion and check possible fields. We highly recommend using Visual Studio Code with installed Better Jinja by Samuel Colvin and YAML by Red Hat plugins.
Sensor types and categories
Sensors are divided into types:
Each check adds value to the level indicated by the type. All timeliness checks, for example, are table type since they correspond to the various time aspects of the whole table, whereas column level checks only provide value to the data inside solitary columns.
Each check category relates to a distinct set of business requirements. Validity tests, for example, ensure that the data follows particular guidelines, such as numerical numbers being inside a certain range.
Choosing sensor for a check
Check configuration is very simple, in fact it is almost the same both for column and table type.
To configure quality checks on a table level, you have to add field
checks under table configuration, choose the
category and desired sensor. In case of column level checks, the configuration is the same, only done under chosen
Configuration of a column level validity check
# yaml-language-server: $schema=https://cloud.dqo.ai/dqo-yaml-schema/TableYaml-schema.json apiVersion: dqo/v1 kind: table spec: target: schema_name: dqo_ai_test_data table_name: continuous_days_one_row_per_day_7839946914895804974 time_series: mode: current_time time_gradient: day columns: id: type_snapshot: column_type: INT64 nullable: true date: type_snapshot: column_type: DATE nullable: true checks: validity: value_in_range_date_percent: parameters: min_value: "2022-01-01" max_value: "2022-02-13" include_max_value: false rules: count_equals: low: expected_value: 90.0 error_margin: 5.0 medium: expected_value: 80.0 error_margin: 5.0 high: expected_value: 70.0 error_margin: 5.0 value: type_snapshot: column_type: STRING nullable: true