resolve conflict

2025-12-18 02:11:27 +00:00 · 2022-12-06 09:56:39 -06:00
parent 6786792bae 33e0a596d8
commit 518161ae39
10 changed files with 78 additions and 15 deletions
--- a/README.md
+++ b/README.md
@@ -486,7 +486,12 @@ or any other nested information.

 ### Testing
 #### Missing Primary Key Tests
-`fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) lists every model that does not meet the minimum testing requirement of testing primary keys. Any models that does not have both a `not_null` and `unique` test configured will be highlighted in this model. 
+`fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) lists every model that does not meet the minimum testing requirement of testing primary keys. Any model that does not have either
+
+1. a `not_null` test and a `unique` test applied to a single column OR 
+2. a `dbt_utils.unique_combination_of_columns` test applied to a set of columns 
+
+will be flagged by this model. 

 <details>
 <summary><b>Reason to Flag</b></summary>
@@ -496,15 +501,16 @@ Tests are assertions you make about your models and other resources in your dbt
 <details>
 <summary><b>How to Remediate</b></summary>

-Apply a [uniqueness test](https://docs.getdbt.com/reference/resource-properties/tests#unique) and a [not null test](https://docs.getdbt.com/reference/resource-properties/tests#not_null) to the column that represents the grain of your model in its schema entry. For models that are unique across a combination of columns, we recommend adding a surrogate key column to your model, then applying these tests to that new model. See the [`surrogate_key`](https://github.com/dbt-labs/dbt-utils#surrogate_key-source) macro from dbt_utils for more info!
+Apply a [uniqueness test](https://docs.getdbt.com/reference/resource-properties/tests#unique) and a [not null test](https://docs.getdbt.com/reference/resource-properties/tests#not_null) to the column that represents the grain of your model in its schema entry. For models that are unique across a combination of columns, we recommend adding a surrogate key column to your model, then applying these tests to that new model. See the [`surrogate_key`](https://github.com/dbt-labs/dbt-utils#surrogate_key-source) macro from dbt_utils for more info! Alternatively, you can use the [`dbt_utils.unique_combination_of_columns`](<https://github.com/dbt-labs/dbt-utils#unique_combination_of_columns-source>) test from `dbt_utils`. Check out the [overriding variables section](#overriding-variables) to read more about configuring other primary key tests for your project!

 Additional tests can be configured by applying a [generic test](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-tests) in the model's `.yml` entry or by creating a [singular test](https://docs.getdbt.com/docs/building-a-dbt-project/tests#singular-tests) 
-in the `tests` directory of you project. 
+in the `tests` directory of your project.
 </details>

 #### Test Coverage
 `fct_test_coverage` ([source](models/marts/tests/fct_test_coverage.sql)) contains metrics pertaining to project-wide test coverage.
 Specifically, this model measures:
+
 1. `test_coverage_pct`: the percentage of your models that have minimum 1 test applied.
 2. `test_to_model_ratio`: the ratio of the number of tests in your dbt project to the number of models in your dbt project
 3. `< model_type >_test_coverage_pct`: the percentage of each of your model types that have minimum 1 test applied.
@@ -892,21 +898,39 @@ models:
 Currently, this package uses different variables to adapt the models to your objectives and naming conventions. They can all be updated directly in `dbt_project.yml`

 <details>
-<summary><b>Coverage Variables</b></summary>
+<summary><b>Testing and Documentation Variables</b></summary>

 | variable    | description | default     |
 | ----------- | ----------- | ----------- |
 | `test_coverage_pct` | the minimum acceptable test coverage percentage | 100% |
 | `documentation_coverage_pct` | the minimum acceptable documentation coverage percentage | 100% |
+| `primary_key_test_macros` | the set(s) of dbt tests used to check validity of a primary key | [["dbt.test_unique", "dbt.test_not_null"], ["dbt_utils.test_unique_combination_of_columns"]] |
+
+**Usage notes for `primary_key_test_macros`:**
+
+The `primary_key_test_macros` variable determines how the `fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) model evaluates whether the models in your project are properly tested for their grain. This variable is a list and each entry **must be a list of test names in `project_name.test_macro_name` format**.
+
+For each entry in the parent list, the logic in `int_model_test_summary` evaluates whether each model has all of the tests in that entry applied. A model that meets the criteria of any entry in the parent list is considered a pass. By default, this package checks whether each model has either:
+
+1. __Both__ the `not_null` and `unique` tests applied to a single column OR
+2. The `dbt_utils.unique_combination_of_columns` test applied to the model.
+
+Each set of tests that defines a primary key requirement must be grouped together in a sub-list so that they are evaluated together (e.g. `["dbt.test_unique", "dbt.test_not_null"]`).
+
+*While it's not explicitly tested in this package, we strongly encourage adding a `not_null` test on each of the columns listed in the `dbt_utils.unique_combination_of_columns` tests.*
+

 ```yml
 # dbt_project.yml
 # set your test and doc coverage to 75% instead
+# use the dbt_constraints.test_primary_key test to check for validity of your primary keys

 vars:
  dbt_project_evaluator:
    documentation_coverage_target: 75
    test_coverage_target: 75
+    primary_key_test_macros: [["dbt_constraints.test_primary_key"]]
+    
 ```
 </details>

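The pass/fail rule described in the README usage notes above can be illustrated outside dbt. This is a minimal Python sketch of the documented rule, not the package's implementation: the function name `is_primary_key_tested` and the `(column_name, test_macro)` pair shape are assumptions made for illustration, with `None` standing in for model-level tests.

```python
from collections import defaultdict

# Default value of primary_key_test_macros as set in this commit.
DEFAULT_PRIMARY_KEY_TEST_MACROS = [
    ["dbt.test_unique", "dbt.test_not_null"],
    ["dbt_utils.test_unique_combination_of_columns"],
]


def is_primary_key_tested(applied_tests, test_sets=DEFAULT_PRIMARY_KEY_TEST_MACROS):
    """Return True if at least one test set is fully applied to one column.

    applied_tests is a hypothetical input shape: an iterable of
    (column_name, test_macro) pairs, with column_name=None for
    model-level tests such as unique_combination_of_columns.
    """
    tests_by_column = defaultdict(set)
    for column_name, test_macro in applied_tests:
        tests_by_column[column_name].add(test_macro)

    # A model passes if ANY column satisfies ALL tests of ANY configured set.
    return any(
        set(test_set) <= tests
        for tests in tests_by_column.values()
        for test_set in test_sets
    )


# Mirrors stg_model_3 in the integration tests: one model-level combination test.
stg_model_3 = [(None, "dbt_utils.test_unique_combination_of_columns")]
# A unique test without a matching not_null test on the same column.
only_unique = [("id", "dbt.test_unique")]

print(is_primary_key_tested(stg_model_3))  # True
print(is_primary_key_tested(only_unique))  # False
```

Passing a different `test_sets` value, such as `[["dbt_constraints.test_primary_key"]]`, corresponds to overriding the variable in `dbt_project.yml`.
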
--- a/dbt_project.yml
+++ b/dbt_project.yml
@@ -57,6 +57,8 @@ vars:
  documentation_coverage_target: 100
  test_coverage_target: 100

+  primary_key_test_macros: [["dbt.test_unique", "dbt.test_not_null"], ["dbt_utils.test_unique_combination_of_columns"]]
+
  # -- DAG variables --
  models_fanout_threshold: 3

--- a/integration_tests/models/staging/source_1/schema.yml
+++ b/integration_tests/models/staging/source_1/schema.yml
@@ -8,6 +8,12 @@ models:
        description: hocus pocus
        tests:
          - unique
+  - name: stg_model_3
+    tests:
+      - dbt_utils.unique_combination_of_columns:
+          combination_of_columns:
+            - id
+            - color
  - name: stg_model_2
    columns:
      - name: id 
--- a/integration_tests/models/staging/source_1/stg_model_3.sql
+++ b/integration_tests/models/staging/source_1/stg_model_3.sql
@@ -1,2 +1,4 @@
 -- depends on: {{ source('source_2', 'table_3') }}
-select 1 as id
+select 1 as id, 'blue' as color
+union all 
+select 1 as id, 'red' as color
--- a/integration_tests/seeds/tests/test_fct_missing_primary_key_tests.csv
+++ b/integration_tests/seeds/tests/test_fct_missing_primary_key_tests.csv
@@ -8,5 +8,4 @@ report_1,FALSE,0
 report_2,FALSE,0
 report_3,FALSE,0
 stg_model_1,FALSE,1
-stg_model_3,FALSE,0
 stg_model_5,FALSE,0
--- a/integration_tests/seeds/tests/test_fct_test_coverage.csv
+++ b/integration_tests/seeds/tests/test_fct_test_coverage.csv
@@ -1,2 +1,2 @@
 total_models,total_tests,tested_models,test_coverage_pct,staging_test_coverage_pct,intermediate_test_coverage_pct,marts_test_coverage_pct,other_test_coverage_pct,test_to_model_ratio
-14,9,4,28.57,60.00,50.00,0.00,0.00,0.6429
+14,10,5,35.71,80.00,50.00,0.00,0.00,0.7143
--- a/macros/unpack/get_nodes.sql
+++ b/macros/unpack/get_nodes.sql
@@ -9,7 +9,6 @@
    {%- set values = [] -%}

    {%- for node in nodes_list -%}
-
        {%- set values_line  = 
            [
                wrap_string_with_quotes(node.unique_id),
@@ -25,7 +24,8 @@
                wrap_string_with_quotes(node.alias),
                "cast(" ~ dbt_project_evaluator.is_not_empty_string(node.description) | trim ~ " as boolean)",
                "''" if not node.column_name else wrap_string_with_quotes(dbt.escape_single_quotes(node.column_name)),
-                wrap_string_with_quotes(node.meta | tojson)
+                wrap_string_with_quotes(node.meta | tojson),
+                wrap_string_with_quotes(node.depends_on.macros | tojson)
            ]
        %}

@@ -51,7 +51,8 @@
              'alias',
              ('is_described', 'boolean'),
              'column_name',
-              'meta'
+              'meta',
+              'macro_dependencies'
            ]
         )
    ) }}
--- a/models/marts/core/int_all_graph_resources.sql
+++ b/models/marts/core/int_all_graph_resources.sql
@@ -1,4 +1,13 @@
 -- one row for each resource in the graph
+
+{# flatten the permissible primary key test sets into a single list for later iteration #}
+{%- set test_macro_list = [] %}
+{%- for test_set in var('primary_key_test_macros') -%}
+      {%- for test in test_set %}
+        {%- do test_macro_list.append(test) -%}
+      {%- endfor %}
+{%- endfor -%}
+
 with unioned as (

    {{ dbt_utils.union_relations([
@@ -56,8 +65,9 @@ joined as (
        end as model_type_folder,
        {{ dbt.position('naming_convention_folders.folder_name_value','unioned_with_calc.directory_path') }} as position_folder,  
        nullif(unioned_with_calc.column_name, '') as column_name,
-        unioned_with_calc.resource_name like 'unique%' and unioned_with_calc.resource_type = 'test' as is_not_null_test,
-        unioned_with_calc.resource_name like 'not_null%' and unioned_with_calc.resource_type = 'test' as is_unique_test,
+        {% for test in test_macro_list %}
+        unioned_with_calc.macro_dependencies like '%macro.{{ test }}%' and unioned_with_calc.resource_type = 'test' as is_{{ test.split('.')[1] }},  
+        {% endfor %}
        unioned_with_calc.is_enabled, 
        unioned_with_calc.materialized, 
        unioned_with_calc.on_schema_change, 
@@ -72,6 +82,7 @@ joined as (
        unioned_with_calc.owner_name,
        unioned_with_calc.owner_email,
        unioned_with_calc.meta,
+        unioned_with_calc.macro_dependencies,
        unioned_with_calc.metric_type, 
        unioned_with_calc.model, 
        unioned_with_calc.label, 
--- a/models/marts/tests/intermediate/int_model_test_summary.sql
+++ b/models/marts/tests/intermediate/int_model_test_summary.sql
@@ -13,7 +13,15 @@ count_column_tests as (
    select 
        relationships.direct_parent_id, 
        all_graph_resources.column_name,
-        count(distinct case when all_graph_resources.is_unique_test or all_graph_resources.is_not_null_test then relationships.resource_id else null end) primary_key_tests_count,
+        {%- for test_set in var('primary_key_test_macros') %}
+            {%- set outer_loop = loop -%}
+        count(distinct case when 
+                {%- for test in test_set %} 
+                all_graph_resources.is_{{ test.split('.')[1] }} {%- if not loop.last %} or {% endif %} 
+                {%- endfor %}
+            then relationships.resource_id else null end
+        ) as primary_key_method_{{ outer_loop.index }}_count,
+        {%- endfor %}
        count(distinct relationships.resource_id) as tests_count
    from all_graph_resources
    left join relationships
@@ -27,7 +35,17 @@ agg_test_relationships as (

    select 
        direct_parent_id, 
-        sum(case when primary_key_tests_count = 2 then 1 else 0 end) >= 1 as is_primary_key_tested,
+        sum(case 
+                when (
+                    {%- for test_set in var('primary_key_test_macros') %}
+                        {%- set compare_value = test_set | length %}
+                    primary_key_method_{{ loop.index }}_count = {{ compare_value }}
+                        {%- if not loop.last %} or {% endif %}
+                    {%- endfor %} 
+                ) then 1 
+                else 0 
+            end
+        ) >= 1 as is_primary_key_tested,
        sum(tests_count) as number_of_tests_on_model
    from count_column_tests
    group by 1
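
To see what the Jinja loops above expand to, the following Python sketch mimics the name and condition generation for a given `primary_key_test_macros` value. It is an approximation for illustration only: the helper names `flag_column_name` and `render_primary_key_condition` are hypothetical, and the whitespace of the SQL that dbt actually compiles will differ.

```python
def flag_column_name(test):
    """Mimic `is_{{ test.split('.')[1] }}` from int_all_graph_resources."""
    return "is_" + test.split(".")[1]


def render_primary_key_condition(test_sets):
    """Mimic the loop in agg_test_relationships: one comparison per test set,
    each comparing the per-set count to the set's length, joined with `or`.
    Jinja's loop.index is 1-based, hence start=1."""
    return " or ".join(
        f"primary_key_method_{index}_count = {len(test_set)}"
        for index, test_set in enumerate(test_sets, start=1)
    )


default_sets = [
    ["dbt.test_unique", "dbt.test_not_null"],
    ["dbt_utils.test_unique_combination_of_columns"],
]

print([flag_column_name(t) for s in default_sets for t in s])
# ['is_test_unique', 'is_test_not_null', 'is_test_unique_combination_of_columns']
print(render_primary_key_condition(default_sets))
# primary_key_method_1_count = 2 or primary_key_method_2_count = 1
```

With the default configuration, a column group therefore passes when it carries both tests of the first set or the single test of the second set.
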
--- a/packages.yml
+++ b/packages.yml
@@ -1,3 +1,3 @@
 packages:
  - package: dbt-labs/dbt_utils
-    version: [">1.0.0", "<2.0.0"]
+    version: [">1.0.0", "<2.0.0"]