resolve conflict

This commit is contained in:
Dave Connors
2022-12-06 09:56:39 -06:00
10 changed files with 78 additions and 15 deletions

View File

@@ -486,7 +486,12 @@ or any other nested information.
### Testing
#### Missing Primary Key Tests
`fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) lists every model that does not meet the minimum testing requirement of testing primary keys. Any models that does not have both a `not_null` and `unique` test configured will be highlighted in this model.
`fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) lists every model that does not meet the minimum testing requirement of testing primary keys. Any model that does not have either
1. a `not_null` test and a `unique` test applied to a single column OR
2. a `dbt_utils.unique_combination_of_columns` test applied to a set of columns
will be flagged by this model.
<details>
<summary><b>Reason to Flag</b></summary>
@@ -496,15 +501,16 @@ Tests are assertions you make about your models and other resources in your dbt
<details>
<summary><b>How to Remediate</b></summary>
Apply a [uniqueness test](https://docs.getdbt.com/reference/resource-properties/tests#unique) and a [not null test](https://docs.getdbt.com/reference/resource-properties/tests#not_null) to the column that represents the grain of your model in its schema entry. For models that are unique across a combination of columns, we recommend adding a surrogate key column to your model, then applying these tests to that new model. See the [`surrogate_key`](https://github.com/dbt-labs/dbt-utils#surrogate_key-source) macro from dbt_utils for more info!
Apply a [uniqueness test](https://docs.getdbt.com/reference/resource-properties/tests#unique) and a [not null test](https://docs.getdbt.com/reference/resource-properties/tests#not_null) to the column that represents the grain of your model in its schema entry. For models that are unique across a combination of columns, we recommend adding a surrogate key column to your model, then applying these tests to that new model. See the [`surrogate_key`](https://github.com/dbt-labs/dbt-utils#surrogate_key-source) macro from dbt_utils for more info! Alternatively, you can use the [`dbt_utils.unique_combination_of_columns`](<https://github.com/dbt-labs/dbt-utils#unique_combination_of_columns-source>) test from `dbt_utils`. Check out the [overriding variables section](#overriding-variables) to read more about configuring other primary key tests for your project!
Additional tests can be configured by applying a [generic test](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-tests) in the model's `.yml` entry or by creating a [singular test](https://docs.getdbt.com/docs/building-a-dbt-project/tests#singular-tests)
in the `tests` directory of you project.
in the `tests` directory of you project.
</details>
#### Test Coverage
`fct_test_coverage` ([source](models/marts/tests/fct_test_coverage.sql)) contains metrics pertaining to project-wide test coverage.
Specifically, this models measures:
1. `test_coverage_pct`: the percentage of your models that have minimum 1 test applied.
2. `test_to_model_ratio`: the ratio of the number of tests in your dbt project to the number of models in your dbt project
3. `< model_type >_test_coverage_pct`: the percentage of each of your model types that have minimum 1 test applied.
@@ -892,21 +898,39 @@ models:
Currently, this package uses different variables to adapt the models to your objectives and naming conventions. They can all be updated directly in `dbt_project.yml`
<details>
<summary><b>Coverage Variables</b></summary>
<summary><b>Testing and Documentation Variables</b></summary>
| variable | description | default |
| ----------- | ----------- | ----------- |
| `test_coverage_pct` | the minimum acceptable test coverage percentage | 100% |
| `documentation_coverage_pct` | the minimum acceptable documentation coverage percentage | 100% |
| `primary_key_test_macros` | the set(s) of dbt tests used to check validity of a primary key | [["dbt.test_unique", "dbt.test_not_null"], ["dbt_utils.test_unique_combination_of_columns"]] |
**Usage notes for `primary_key_test_macros:`**
The `primary_key_test_macros` variable determines how the `fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) model evaluates whether the models in your project are properly tested for their grain. This variable is a list and each entry **must be a list of test names in `project_name.test_macro_name` format**.
For each entry in the parent list, the logic in `int_model_test_summary` will evaluate whether each model has all of the tests in that entry applied. If a model meets the criteria of any of the entries in the parent list, it will be considered a pass. The default behavior for this package will check for whether each model has either:
1. __Both__ the `not_null` and `unique` tests applied to a single column OR
2. The `dbt_utils.unique_combination_of_columns` applied to the model.
Each set of test(s) that define a primary key requirement must be grouped together in a sub-list to ensure they are evaluated together (e.g. [`dbt.test_unique`, `dbt.test_not_null`] ).
*While it's not explicitly tested in this package, we strongly encourage adding a `not_null` test on each of the columns listed in the `dbt_utils.unique_combination_of_columns` tests.*
```yml
# dbt_project.yml
# set your test and doc coverage to 75% instead
# use the dbt_constraints.test_primary_key test to check for validity of your primary keys
vars:
dbt_project_evaluator:
documentation_coverage_target: 75
test_coverage_target: 75
primary_key_test_macros: [["dbt_constraints.test_primary_key"]]
```
</details>

View File

@@ -57,6 +57,8 @@ vars:
documentation_coverage_target: 100
test_coverage_target: 100
primary_key_test_macros: [["dbt.test_unique", "dbt.test_not_null"], ["dbt_utils.test_unique_combination_of_columns"]]
# -- DAG variables --
models_fanout_threshold: 3

View File

@@ -8,6 +8,12 @@ models:
description: hocus pocus
tests:
- unique
- name: stg_model_3
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- id
- color
- name: stg_model_2
columns:
- name: id

View File

@@ -1,2 +1,4 @@
-- depends on: {{ source('source_2', 'table_3') }}
select 1 as id
select 1 as id, 'blue' as color
union all
select 1 as id, 'red' as color

View File

@@ -8,5 +8,4 @@ report_1,FALSE,0
report_2,FALSE,0
report_3,FALSE,0
stg_model_1,FALSE,1
stg_model_3,FALSE,0
stg_model_5,FALSE,0
1 resource_name is_primary_key_tested number_of_tests_on_model
8 report_2 FALSE 0
9 report_3 FALSE 0
10 stg_model_1 FALSE 1
stg_model_3 FALSE 0
11 stg_model_5 FALSE 0

View File

@@ -1,2 +1,2 @@
total_models,total_tests,tested_models,test_coverage_pct,staging_test_coverage_pct,intermediate_test_coverage_pct,marts_test_coverage_pct,other_test_coverage_pct,test_to_model_ratio
14,9,4,28.57,60.00,50.00,0.00,0.00,0.6429
14,10,5,35.71,80.00,50.00,0.00,0.00,0.7143
1 total_models total_tests tested_models test_coverage_pct staging_test_coverage_pct intermediate_test_coverage_pct marts_test_coverage_pct other_test_coverage_pct test_to_model_ratio
2 14 9 10 4 5 28.57 35.71 60.00 80.00 50.00 0.00 0.00 0.6429 0.7143

View File

@@ -9,7 +9,6 @@
{%- set values = [] -%}
{%- for node in nodes_list -%}
{%- set values_line =
[
wrap_string_with_quotes(node.unique_id),
@@ -25,7 +24,8 @@
wrap_string_with_quotes(node.alias),
"cast(" ~ dbt_project_evaluator.is_not_empty_string(node.description) | trim ~ " as boolean)",
"''" if not node.column_name else wrap_string_with_quotes(dbt.escape_single_quotes(node.column_name)),
wrap_string_with_quotes(node.meta | tojson)
wrap_string_with_quotes(node.meta | tojson),
wrap_string_with_quotes(node.depends_on.macros | tojson)
]
%}
@@ -51,7 +51,8 @@
'alias',
('is_described', 'boolean'),
'column_name',
'meta'
'meta',
'macro_dependencies'
]
)
) }}

View File

@@ -1,4 +1,13 @@
-- one row for each resource in the graph
{# flatten the sets of permissable primary key test sets to one level for later iteration #}
{%- set test_macro_list = [] %}
{%- for test_set in var('primary_key_test_macros') -%}
{%- for test in test_set %}
{%- do test_macro_list.append(test) -%}
{%- endfor %}
{%- endfor -%}
with unioned as (
{{ dbt_utils.union_relations([
@@ -56,8 +65,9 @@ joined as (
end as model_type_folder,
{{ dbt.position('naming_convention_folders.folder_name_value','unioned_with_calc.directory_path') }} as position_folder,
nullif(unioned_with_calc.column_name, '') as column_name,
unioned_with_calc.resource_name like 'unique%' and unioned_with_calc.resource_type = 'test' as is_not_null_test,
unioned_with_calc.resource_name like 'not_null%' and unioned_with_calc.resource_type = 'test' as is_unique_test,
{% for test in test_macro_list %}
unioned_with_calc.macro_dependencies like '%macro.{{ test }}%' and unioned_with_calc.resource_type = 'test' as is_{{ test.split('.')[1] }},
{% endfor %}
unioned_with_calc.is_enabled,
unioned_with_calc.materialized,
unioned_with_calc.on_schema_change,
@@ -72,6 +82,7 @@ joined as (
unioned_with_calc.owner_name,
unioned_with_calc.owner_email,
unioned_with_calc.meta,
unioned_with_calc.macro_dependencies,
unioned_with_calc.metric_type,
unioned_with_calc.model,
unioned_with_calc.label,

View File

@@ -13,7 +13,15 @@ count_column_tests as (
select
relationships.direct_parent_id,
all_graph_resources.column_name,
count(distinct case when all_graph_resources.is_unique_test or all_graph_resources.is_not_null_test then relationships.resource_id else null end) primary_key_tests_count,
{%- for test_set in var('primary_key_test_macros') %}
{%- set outer_loop = loop -%}
count(distinct case when
{%- for test in test_set %}
all_graph_resources.is_{{ test.split('.')[1] }} {%- if not loop.last %} or {% endif %}
{%- endfor %}
then relationships.resource_id else null end
) as primary_key_method_{{ outer_loop.index }}_count,
{%- endfor %}
count(distinct relationships.resource_id) as tests_count
from all_graph_resources
left join relationships
@@ -27,7 +35,17 @@ agg_test_relationships as (
select
direct_parent_id,
sum(case when primary_key_tests_count = 2 then 1 else 0 end) >= 1 as is_primary_key_tested,
sum(case
when (
{%- for test_set in var('primary_key_test_macros') %}
{%- set compare_value = test_set | length %}
primary_key_method_{{ loop.index }}_count = {{ compare_value}}
{%- if not loop.last %} or {% endif %}
{%- endfor %}
) then 1
else 0
end
) >= 1 as is_primary_key_tested,
sum(tests_count) as number_of_tests_on_model
from count_column_tests
group by 1

View File

@@ -1,3 +1,3 @@
packages:
- package: dbt-labs/dbt_utils
version: [">1.0.0", "<2.0.0"]
version: [">1.0.0", "<2.0.0"]