Compare commits

201 Commits

Author SHA1 Message Date
Jeremy Cohen
2c8856da3b Add changelog entry 2023-07-19 12:37:50 +02:00
Jeremy Cohen
029045e556 Pin sqlparse<0.5 2023-07-19 12:35:31 +02:00
Jeremy Cohen
433e5c670e Pin click<9 2023-07-19 12:35:17 +02:00
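Both pins above are upper bounds on runtime dependencies. As a hedged illustration only (the file name, package name, and lower bounds below are assumptions, not taken from these commits), such pins typically look like this in a setuptools-based project:

```python
# Hypothetical setup.py excerpt showing upper-bound pins like the two commits above.
from setuptools import setup

setup(
    name="example-package",  # placeholder project name
    version="0.0.0",
    install_requires=[
        "click>=7.0,<9",         # cap click below 9 until compatibility is confirmed
        "sqlparse>=0.2.3,<0.5",  # cap sqlparse below 0.5 for the same reason
    ],
)
```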
FishtownBuildBot
cbfc6a8baf Add new index.html and changelog yaml files from dbt-docs (#8131) 2023-07-19 11:23:35 +02:00
Michelle Ark
9765596247 wrap deprecation warnings in warn_or_error calls (#8129) 2023-07-18 18:08:52 -04:00
Kshitij Aranke
1b1a291fae Publish coverage results to codecov.io (#8127) 2023-07-18 16:47:44 -05:00
FishtownBuildBot
867534c1f4 Cleanup main after cutting new 1.6.latest branch (#8102) 2023-07-17 19:49:55 -04:00
Michelle Ark
6d8b6459eb bumpversion 1.7.0a1 (#8111) 2023-07-17 18:10:29 -04:00
Mike Alfare
203bd8defd Apply new integration tests to existing framework to identify supported features (#8099)
* applied new integration tests to existing framework

* applied new integration tests to existing framework

* generalized tests for reusability in adapters; fixed drop index issue

* generalized tests for reusability in adapters; fixed drop index issue

* removed unnecessary overrides in tests

* adjusted import to allow for usage in adapters

* adjusted import to allow for usage in adapters

* removed fixture artifact

* generalized the materialized view fixture which will need to be specific to the adapter

* unskipped tests in the test runner package

* corrected test condition

* corrected test condition

* added missing initial build for the relation type swap tests
2023-07-17 12:17:02 -04:00
Emily Rockman
949680a5ce add env vars for datadog ci visibility (#8097)
* add env vars for datadog ci visibility

* modify pytest command for tracing

* fix posargs

* move env vars to job that needs them

* add test repeater to DD

* swap flags
2023-07-17 09:52:21 -05:00
Damian Owsianny
015c490b63 Fix query comment tests (#7928) (#7928) 2023-07-13 13:45:14 -06:00
Quigley Malcolm
95a916936e [CT-2821] Support dbt-semantic-interfaces~=0.1.0rc1 (#8085)
* Bump version support for `dbt-semantic-interfaces` to `~=0.1.0rc1`

* Add tests for asserting WhereFilter satisfies protocol

* Add `call_parameter_sets` to `WhereFilter` class to satisfy protocol

* Changie doc for moving to DSI 0.1.0rc1

* [CT-2822]  Fix `NonAdditiveDimension` Implementation (#8089)

* Add test to ensure `NonAdditiveDimension` implementation satisfies protocol

* Fix typo in `NonAdditiveDimension`: `window_grouples` -> `window_groupings`

* Add changie doc for typo fix in NonAdditiveDimension
2023-07-13 12:51:23 +02:00
Michelle Ark
961d69d8c2 gitignore user.yml and profiles.yml (#8087) 2023-07-12 17:46:49 -07:00
Michelle Ark
be4d0a5b88 add __test__ = False to non-test classes that start with Test (#8086) 2023-07-12 17:35:38 -07:00
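The `__test__ = False` convention exists because pytest collects any class whose name matches its default `Test*` pattern. A minimal sketch of the pattern (the class name is hypothetical):

```python
# pytest collects classes named "Test*" by default; helper classes that happen to
# share that prefix can opt out of collection with __test__ = False.
class TestFixtureBuilder:
    __test__ = False  # this is a helper, not a test case

    def build(self) -> dict:
        return {"rows": []}
```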
Quigley Malcolm
5310d3715c CT-2691 Fix the populating of a Metric's depends_on property (#8015)
* Add metrics from metric type params to a metric's depends_on

* Add Lookup utility for finding `SemanticModel`s by measure names

* Add the `SemanticModel` of a `Metric`'s measure property to the `Metric`'s `depends_on`

* Add `SemanticModelConfig` to `SemanticModel`

Some tests were failing due to `Metric`s referencing `SemanticModel`s.
Specifically, there was a check to see if a referenced node was disabled,
and because `SemanticModel`s didn't have a `config` holding the `enabled`
boolean attr, core would blow up.

* Checkpoint on test fixing

* Correct metricflow_time_spine_sql in test fixtures

* Add check for `SemanticModel` nodes in `Linker.link_node`

Now that `Metric`s depend on `SemanticModel`s, and `SemanticModel`s
have their own dependencies on `Model`s, they need to be checked for
in `Linker.link_node`. I forget the details, but things blow up
without it. Basically it adds the SemanticModels to the dependency
graph.

* Fix artifacts/test_previous_version_state.py tests

* fix access/test_access.py tests

* Fix function metric tests

* Fix functional partial_parsing tests

* Add time dimension to semantic model in exposures fixture

* Bump DSI version to a minimum of 0.1.0dev10

DSI 0.1.0dev10 fixes an incoherence issue in DSI around `agg_time_dimension`
setting. This incoherence was that `measure.agg_time_dimension` was being
required, even though it was no longer supposed to be a required attribute
(it's specifically typed as optional in the protocol). This was causing
a handful of tests to fail because the `semantic_model.defaults.agg_time_dimension`
value wasn't being respected. Pulling in the fix from DSI 0.1.0dev10 fixes
the issue.

Interestingly after bumping the DSI version, the integration tests were
still failing. If I ran the tests individually they passed though. To get
`make integration` to run properly I ended up having to clear my `.tox`
cache, as it seems some outdated state was being persisted.

* Add test specifically for checking the `depends_on` of `Metric` nodes

* Re-enable test asserting calling metric nodes in models

* Migrate `checked_agg_time_dimension` to `checked_agg_time_dimension_for_measure`

DSI 0.1.0dev10 moved `checked_agg_time_dimension` from the `Measure`
protocol to the `SemanticModel` protocol as `checked_agg_time_dimension_for_measure`.
This finishes a change where for a given measure either the `Measure.agg_time_dimension`
or the measure's parent `SemanticModel.defaults.agg_time_dimension` needs to be
set, instead of always requiring the measure's `Measure.agg_time_dimension`.

* Add changie doc for populating metric

---------

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2023-07-12 13:42:44 -07:00
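A hedged sketch of the dependency-population idea described in the commit above: map each measure name to the semantic model that defines it, then record that model's id in the metric's `depends_on`. The classes and attribute names below are simplified stand-ins, not dbt-core's actual node types.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemanticModel:
    unique_id: str
    measure_names: List[str]


@dataclass
class Metric:
    unique_id: str
    input_measures: List[str]
    depends_on_nodes: List[str] = field(default_factory=list)


def build_measure_lookup(models: List[SemanticModel]) -> Dict[str, str]:
    """Map each measure name to the unique_id of the semantic model that defines it."""
    return {name: model.unique_id for model in models for name in model.measure_names}


def populate_depends_on(metric: Metric, measure_lookup: Dict[str, str]) -> None:
    """Record the semantic model behind each of the metric's measures as a dependency."""
    for measure in metric.input_measures:
        model_id = measure_lookup.get(measure)
        if model_id and model_id not in metric.depends_on_nodes:
            metric.depends_on_nodes.append(model_id)
```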
Jeremy Cohen
6bdf983e0b Add semantic_models to tracked resource counts (#8078)
* Add semantic_models to tracked resource counts

* Add changelog entry

* Simplify node statistic tabulation.

* Remove review comment. Replace with explanation.

---------

Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
2023-07-12 14:19:19 -04:00
Michelle Ark
6604b9ca31 8030/fix contract checksum (#8072) 2023-07-12 09:36:15 -07:00
Emily Rockman
305241fe86 Er/ct 2675 test custom target (#8079)
* remove skip

* fix retry test
2023-07-12 11:03:19 -05:00
Michelle Ark
2d686b73fd update contributing.md reference to test/integration (#8073) 2023-07-12 09:03:02 -07:00
Alex Rosenfeld
30def98ed9 Remove volume declaration (#8069)
* Remove volume declaration

* Changelog entry

---------

Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
2023-07-12 10:39:24 -04:00
Thomas Lento
b78d23f68d Update validate sql test classes to new nomenclature (#8076)
The original implementation of validate_sql was called dry_run,
but in the rename the test classes and much of their associated
documentation still retained the old naming.

This is mainly cosmetic, but since these test classes will be
imported into adapter repositories we should fix this now before
the wrong name proliferates.
2023-07-12 10:20:25 +02:00
Thomas Lento
4ffd633e40 Add validate_sql method to base adapter with implementation for SQLAdapters (#8001)
* Add dry_run method to base adapter with implementation for SQLAdapters

resolves #7839

In the CLI integration, MetricFlow will issue dry run queries as
part of its warehouse-level validation of the semantic manifest,
including all semantic model and metric definitions.

In most cases, issuing an `explain` query is adequate; however,
BigQuery does not support the `explain` keyword and so we cannot
simply prepend `explain` to our input queries and expect the
correct behavior across all contexts.

This commit adds a dry_run() method to the BaseAdapter which mirrors
the execute() method in that it simply delegates to the ConnectionManager.
It also adds a working implementation to the SQLConnectionManager and
includes a few test cases for adapter maintainers to try out on their own.

The current implementation should work out of the box with most
of our adapters. BigQuery will require us to implement the dry_run
method on the BigQueryConnectionManager, and community-maintained
adapters can opt in by enabling the test and ensuring their own
implementations work as expected.

Note - we decided to make these concrete methods that throw runtime
exceptions for direct descendants of BaseAdapter in order to avoid
forcing community adapter maintainers to implement a method that does
not currently have any use cases in dbt proper.

* Switch dry_run implementation to be macro-based

The common pattern for engine-specific SQL statement construction
in dbt is to provide a default macro which can then be overridden
on a per-adapter basis by either adapter maintainers or end users.
The advantage of this is users can take advantage of alternative
SQL syntax for performance or other reasons, or even to enable
local usage if an engine relies on a non-standard expression and
the adapter maintainer has not updated the package.

Although there are some risks here, they are minimal, and the benefit
of added expressiveness and consistency with other similar constructs
is clear, so we adopt this approach here.

* Improve error message for InvalidConnectionError in test_invalid_dry_run.

* Rename dry_run to validate_sql

The validate_sql name has less chance of colliding with dbt's
command nomenclature, both now and in some future where we have
dry-run operations.

* Rename macro and test files to validate_sql

* Fix changelog entry
2023-07-11 18:24:18 -04:00
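A simplified Python sketch of the delegation pattern described in the commit above: the adapter's `validate_sql` mirrors `execute` by handing the work to the connection manager, which wraps the query in an engine-appropriate dry-run statement (commonly `explain`). Class and method names are illustrative, not dbt-core's actual signatures.

```python
class SQLConnectionManager:
    def execute(self, sql: str):
        # Placeholder: a real connection manager would run `sql` against the warehouse.
        print(f"running: {sql}")

    def dry_run_sql(self, compiled_sql: str) -> str:
        # Default strategy: prepend EXPLAIN; engines without EXPLAIN (e.g. BigQuery)
        # would override this with their own validation mechanism.
        return f"explain {compiled_sql}"

    def validate_sql(self, compiled_sql: str):
        return self.execute(self.dry_run_sql(compiled_sql))


class SQLAdapter:
    def __init__(self, connections: SQLConnectionManager):
        self.connections = connections

    def validate_sql(self, compiled_sql: str):
        # Mirrors execute(): simply delegate to the connection manager.
        return self.connections.validate_sql(compiled_sql)


adapter = SQLAdapter(SQLConnectionManager())
adapter.validate_sql("select 1")  # prints: running: explain select 1
```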
Kshitij Aranke
07c3dcd21c Fixes #7785: fail-fast behavior (#8066) 2023-07-11 17:05:35 -05:00
Doug Beatty
fd233eac62 Use Ubuntu 22.04.2 LTS (Jammy Jellyfish) since it is a long-term supported release (#8071)
* Use Ubuntu 22.04.2 LTS (Jammy Jellyfish) since it is a long-term supported release

* Changelog entry
2023-07-11 15:56:31 -06:00
Emily Rockman
d8f38ca48b Flaky Test Workflow (#8055)
* add permissions

* replace db setup

* try with bash instead of just pytest flags

* fix test command

* remove spaces

* remove force-flaky flag

* add starting values

* add mac and windows postgres install

* define use bash

* fix typo

* update output report

* tweak last if condition

* clarify failures/successful runs

* print running success and failure tally

* just output pytest instead of capturing it

* set shell to not exit immediately on exit code

* add formatting around results for easier scanning

* more output formatting

* add matrix to unlock parallel runners

* increase to ten batches

* update debug

* add comment

* clean up comments
2023-07-11 12:58:46 -05:00
Quigley Malcolm
7740bd6b45 Remove create_metric as a public facing SemanticModel.Measure property (#8068)
* Remove `create_metric` as a public facing `SemanticModel.Measure` property

We want to add `create_metric`. The `create_metric` property will be
incredibly useful. However, at this time it is not hooked up, and we don't
have time to hook it up before the code freeze for 1.6.0rc of core. As
it doesn't do anything, we shouldn't allow people to specify it, because
it won't do what one would expect. We plan on making the implementation
of `create_metric` a priority for 1.7 of core.

* Changie doc for the removal of create_metric property
2023-07-11 09:36:10 -07:00
dave-connors-3
a57fdf008e add negative part number test case for split part cross db util (#7200)
* add negative test case

* changie

* missed a comma

* Update changelog entry

* Add a negative number (rather than subtract a positive number)

---------

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
2023-07-11 08:56:52 -06:00
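The semantics of the negative test case above, illustrated in Python rather than the actual cross-database macro: a negative part number counts segments from the end, mirroring 1-based counting from the front.

```python
def split_part(text: str, delimiter: str, part_number: int) -> str:
    """Toy illustration only: 1 is the first segment, -1 the last."""
    parts = text.split(delimiter)
    index = part_number - 1 if part_number > 0 else part_number
    return parts[index]


assert split_part("a|b|c", "|", 1) == "a"
assert split_part("a|b|c", "|", -1) == "c"  # the negative part number case
```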
Peter Webb
a8e3afe8af Fix accidental propagation of log messages to root logger (#7882)
* Fix accidental propagation of log messages to root logger.

* Add changelog entry

* Fixed an issue which blocked debug logging to stdout with --log-level debug, unless --debug was also used.
2023-07-11 10:40:33 -04:00
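A minimal Python `logging` sketch of the failure mode fixed above: when a named logger propagates to the root logger, records can reach handlers they were never meant for (or be emitted twice). The logger names are illustrative.

```python
import logging
import sys

root = logging.getLogger()
root.addHandler(logging.StreamHandler(sys.stderr))

cli_logger = logging.getLogger("example.cli")
cli_logger.addHandler(logging.StreamHandler(sys.stdout))
cli_logger.propagate = False  # without this, each record would also hit the root handler

cli_logger.warning("emitted once, on stdout only")
```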
Peter Webb
44572e72f0 Semantic Model Validation (#8049)
* Use dbt-semantic-interface validations on semantic models and metrics defined in Core.

* Remove empty test, since semantic models don't generate any validation warnings.

* Add changelog entry.

* Temporarily remove requirement that there must be semantic models defined in order to define metrics
2023-07-10 14:48:20 -04:00
Nathaniel May
54b1e5699c Update the PR template (#7892)
* add interface changes section to the PR template

* update entire template

* split up choices for tests and interfaces

* minor formatting change

* add line breaks

* actually put in line breaks

* revert split choices in checklist

* add line breaks to top

* move docs link

* typo
2023-07-10 13:13:02 -04:00
Chenyu Li
ee7bc24903 partial parse file path (#8032) 2023-07-10 08:52:52 -07:00
Emily Rockman
15ef88d2ed add workflow for flaky test testing (#8044)
* add workflow for flaky test testing

* improve docs

* rename workflow

* update default input

* add min passing tests
2023-07-07 12:48:40 -05:00
Emily Rockman
7c56d72b46 pin click (#8050)
* pin click

* changelog
2023-07-07 11:20:27 -05:00
Michelle Ark
5d28e4744e ModelNodeArgs.unique_id - include v (#8038) 2023-07-06 11:54:12 -04:00
Jeremy Cohen
746ca7d149 Nicer error message for contracted model missing 'columns' (#8024) 2023-07-06 11:06:56 +02:00
Grant Murray
a58b5ee8fb CT-2780 [Docs] Fix-toc-links-in-contributing-md (#8017)
* docs(contributing): fix-toc-link-in-contributing-md

* docs(contributing-md): fix-link2

* Fix-typo

* Remove backtick from href

* changie new

* Cough commit / trigger CI
2023-07-05 11:47:43 +02:00
Peter Webb
7fbfd53c3e CT-2752: Extract methods to new SemanticManifest class for better encapsulation of details. (#8012) 2023-07-01 16:59:44 -04:00
Chenyu Li
4c44c29ee4 fire proper event for inline query error (#7960) 2023-06-30 14:50:52 -07:00
FishtownBuildBot
8ee0fe0a64 [Automated] Merged prep-release/1.6.0b8_5425945126 into target main during release process 2023-06-30 14:18:53 -05:00
Github Build Bot
307a618ea8 Bumping version to 1.6.0b8 and generate changelog 2023-06-30 18:36:40 +00:00
Michelle Ark
ce07ce58e1 versioned node selection with underscore delimiting (#7995) 2023-06-30 14:28:19 -04:00
Michelle Ark
7ea51df6ae allow on_schema_change: fail for incremental models with contracts (#8006) 2023-06-30 13:27:48 -04:00
Gerda Shank
fe463c79fe Add time spine table configuration to semantic manifest (#7996)
* Bump dbt-semantic-interface to dev8

* Create time_spine_table_configuration in semantic manifest

* Add metricflow_time_spine to semantic_models tests

* Remove skip from test

* Changie

* Update exception message

Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>

---------

Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
2023-06-30 13:22:24 -04:00
d-kaneshiro
d7d6843c5f Added note before running integration tests (#7657)
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
2023-06-30 09:32:08 -05:00
Tyler Rouze
adcf8bcbb3 [CT-2551] Make state selection MECE (#7773)
* ct-2551: adds old and unmodified state selection methods

* ct-2551: update check_unmodified_content to simplify

* add unit and integration tests for unmodified and old

* add changelog entry

* ct-2551: reformatting of contingent adapter assignment list
2023-06-30 09:30:10 -04:00
Gerda Shank
5d937802f1 Remove pin of sqlparse, minor refactoring, add tests (#7993) 2023-06-29 16:24:17 -04:00
Michelle Ark
8c201e88a7 type + fix typo in ModelNodeArgs.unique_id (#7992) 2023-06-29 13:10:26 -04:00
d-kaneshiro
b8bc264731 Unified to UTC (#7665)
* UnifiedToUTC

* Check proximity of dbt_valid_to and deleted time

* update the message to print if the assertion fails

* add CHANGELOG entries

* test only if naive

* Added comments about naive and aware

* Generalize comparison of datetimes that are "close enough"

---------

Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
2023-06-29 11:52:25 -04:00
Quigley Malcolm
9c6fbff0c3 CT-2707: Populate metric input measures (#7984)
* Fix test fixtures which were using measures for metric numerators/denominators

In our previous upgrade to DSI dev7, numerators and denominators for
metrics switched from being `MetricInputMeasure`s to `MetricInput`s.
I.e. metric numerators and denominators should reference other metrics,
not semantic model measures. However, at that time, we weren't actually
doing anything with numerators and denominators in core, so no issue
got raised. The changes we are about to make, though, are going to surface
these issues.

* Add tests for ensuring a metric's `input_measures` gets properly populated

* Begin populating `metric.type_params.input_measures`

This isn't my favorite bit of code, mostly because there are checks for
existence which really should be handled before this point; however, a
good place for that to happen doesn't exist currently. For instance,
in an ideal world, by the time we get to `_process_metric_node`, a
metric of type `RATIO` should already have its numerator and denominator
guaranteed.

* Update test checking that disabled metrics aren't added to the manifest metrics

We updated from the metric `number_of_people` to `average_tenure_minus_people` for
this test because disabling `number_of_people` raised other exceptions at parse
time due to a metric referencing a disabled metric. The metric `average_tenure_minus_people`
is a leaf metric, and so for this test, it is a better candidate.

* Update `test_disabled_metric_ref_model` to have more disabled metrics

There are metrics which depend on the metric `number_of_people`. If
`number_of_people` is disabled without the metrics that depend on it
being disabled, then a different (expected) exception would be raised
than the one this test is testing for. Thus we've disabled those
downstream metrics.

* Add test which checks that metrics depending on disabled metrics raise an exception

* Add changie doc for populating metric input measures
2023-06-29 08:30:58 -07:00
Kshitij Aranke
5c7aa7f9ce dbt clone (#7881)
Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
2023-06-28 19:22:07 -05:00
Peter Webb
1af94dedad CT-2757: Fix unit test which broke due to merge issues (#7978) 2023-06-28 17:34:45 -04:00
Gerda Shank
2e7c968419 Use events.contextvar because of multiprocessing unable to pickle ContextVar (#7949)
* Add task contextvars to events/contextvars.py

* Use events.contextvars instead of task.contextvars

* Changie
2023-06-28 16:55:50 -04:00
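Background for the commit above, as a hedged sketch: `contextvars.ContextVar` instances cannot be pickled, so they have to live at module level (e.g. in an events/contextvars module) rather than on task objects that get shipped to worker processes. The variable and class names below are illustrative.

```python
import contextvars
import pickle

# Module-level: never pickled, only read/written by whichever process is running.
TASK_NODE_INFO = contextvars.ContextVar("task_node_info", default=None)


class Task:
    """Holds only plain, picklable state; no ContextVar attributes."""

    def __init__(self, name: str):
        self.name = name


# Demonstrate why a ContextVar must not ride along on pickled objects:
try:
    pickle.dumps(contextvars.ContextVar("demo"))
except TypeError as exc:
    print(f"ContextVar is not picklable: {exc}")
```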
Jeremy Cohen
05b0820a9e Replace space with underscore in NodeType strings (#7947) 2023-06-28 20:04:32 +02:00
FishtownBuildBot
d4e620eb50 [Automated] Merged prep-release/1.6.0b7_5402737814 into target main during release process 2023-06-28 11:03:18 -05:00
Will Bryant
0f52505dbe Fix CTE insertion position when the model uses WITH RECURSIVE (#7350) (#7414) 2023-06-28 11:41:31 -04:00
Github Build Bot
cb754fd97b Bumping version to 1.6.0b7 and generate changelog 2023-06-28 15:11:49 +00:00
Michelle Ark
e01d4c0a6e Add restrict-access to dbt_project.yml (#7962) 2023-06-28 10:55:11 -04:00
Michelle Ark
7a6bedaae3 consolidate cross-project ref entrypoint + plugin framework (#7955) 2023-06-28 10:54:55 -04:00
Niall Woodward
22145e7e5f Add invocation command flag (#7939)
* Add invocation command flag

* Add changie entry

* Update .changes/unreleased/Features-20230623-111254.yaml
2023-06-28 10:47:07 -04:00
Niall Woodward
b3ac41ff9a Add thread_id context var (#7942)
* Add thread_id context var

* Changie

* Fix context test

* Update .changes/unreleased/Features-20230623-173357.yaml

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>

* Fix tests

---------

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
2023-06-28 10:44:51 -04:00
Michelle Ark
036b95e5b2 Handle state:modified for external nodes (#7925) 2023-06-28 10:34:17 -04:00
Rainer Mensing
2ce0c5ccf5 Add merge incremental strategy for postgres (#6951)
* Add merge incremental strategy

* Expect merge to be a valid strategy for Postgres

---------

Co-authored-by: Anders Swanson <anders.swanson@dbtlabs.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
2023-06-28 10:28:40 -04:00
Peter Webb
7156cc5c1d Add Partial Parsing Support for Semantic Models (#7964)
* CT-2711: Add partial parsing support for semantic models

* CT-2711: Fix typo identified in code review
2023-06-27 17:40:27 -04:00
Michelle Ark
fcd30b1de2 Set access model node args (#7966) 2023-06-27 16:10:13 -04:00
Michelle Ark
a84fa50166 allow setting enabled and depends_on_nodes from ModelNodeArgs (#7930) 2023-06-27 16:09:35 -04:00
Doug Beatty
6a1e3a6db8 Fix macro namespace search packages (#5804) 2023-06-27 14:55:38 -04:00
Peter Webb
b37e5b5198 Factor Out Repeated Logic in the PartialParsing Class (#7952)
* CT-2711: Add remove_tests() call to delete_schema_source() so that call sites are more uniform with other node deletion call sites. This will enable further code factorization.

* CT-2711: Factor repeated code section (mostly) out of PartialParsing.handle_schema_file_changes()

* CT-2711: Factor a repeated code section out of schedule_nodes_for_parsing()
2023-06-26 15:20:50 -04:00
Doug Beatty
f9d4e9e03d Fix comment for dbt retry (#7932) 2023-06-26 12:10:44 -06:00
Gerda Shank
9c97d30702 update mashumaro to 3.8.1 (#7951)
* Update mashumaro to 3.8

* Change to 3.8.1

* Changie
2023-06-26 14:01:22 -04:00
FishtownBuildBot
9836f7bdef [Automated] Merged prep-release/1.6.0b6_5360267609 into target main during release process 2023-06-23 16:11:51 -05:00
Github Build Bot
b07ff7aebd Bumping version to 1.6.0b6 and generate changelog 2023-06-23 20:31:02 +00:00
Peter Webb
aecbb4564c CT-2732: Fix selector methods to include semantic models (#7936) 2023-06-23 15:54:33 -04:00
Quigley Malcolm
779663b39c Improved Semantic Model Measure Percentile defaults (#7877)
* Update semantic model parsing test to check measure agg params

* Make `use_discrete_percentile` and `use_approximate_percentile` non optional and default false

This was a mistake in our implementation of the MeasureAggregationParams.
We had defined them as optional and defaulting to `None`. However, as the
protocol states, they cannot be `None`; they must be a boolean value.
Thus we now ensure they are.

* Add changie doc for measure percentile fixes
2023-06-23 11:20:59 -07:00
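A sketch of the fix described above using a simplified dataclass: the two percentile flags become plain booleans defaulting to `False` instead of `Optional` fields defaulting to `None`. The field names follow the commit message; the surrounding class is a stand-in, not dbt-core's actual definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MeasureAggregationParameters:
    percentile: Optional[float] = None
    use_discrete_percentile: bool = False      # previously Optional[bool] = None
    use_approximate_percentile: bool = False   # previously Optional[bool] = None
```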
Quigley Malcolm
7934af2974 Improved Semantic expr specification handling (#7876)
* Update semantic model parsing test to check different measure expr types

* Allow semantic model measure exprs to be defined with ints and bools in yaml

Sometimes the expr for a measure can be defined in yaml with a bool or an int.
However, we were only allowing for strings. There was a workaround for this,
which was wrapping your bool or int in double quotes in the yaml, but
this can be fairly annoying for the end user.

* Changie doc for fixing measure expr yaml specification
2023-06-23 10:34:11 -07:00
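One plausible shape of the yaml-reading accommodation described above, shown as a hypothetical helper (the function name and exact coercion rules are assumptions):

```python
from typing import Optional, Union


def normalize_measure_expr(expr: Optional[Union[str, int, bool]]) -> Optional[str]:
    """Accept a measure expr written in yaml as a string, int, or bool."""
    if expr is None or isinstance(expr, str):
        return expr
    return str(expr)  # e.g. 1 -> "1", True -> "True"
```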
FishtownBuildBot
533988233e [Automated] Merged prep-release/1.6.0b5_5349512955 into target main during release process 2023-06-22 15:07:48 -05:00
Peter Webb
8bc0e77a1d CT-2719: Rename the semantic_nodes collection on the manifest to semantic_models (#7927) 2023-06-22 15:59:54 -04:00
Github Build Bot
1c93c9bb58 Bumping version to 1.6.0b5 and generate changelog 2023-06-22 19:27:29 +00:00
Michelle Ark
6d7b32977c Fix: safe remove of external nodes from nodes.depends_on (#7923) 2023-06-22 13:42:06 -04:00
Michelle Ark
bf15466bec UninstalledPackagesFoundError references correct packages specified path (#7886) 2023-06-22 11:40:59 -04:00
Gerda Shank
fb1ebe48f0 Resolve SemanticModel refs in the same way as other refs (#7895) 2023-06-22 11:26:08 -04:00
Peter Webb
de65697ff9 Further Integrate Semantic Models (#7917)
* CT-2651: Add Semantic Models to the manifest and various pieces of graph linking code

* CT-2651: Finish integrating semantic models into the partial parsing system

* CT-2651: More semantic model details for partial parsing

* CT-2651: Remove merged references to project_dependencies

* CT-2651: Revise changelog entry

* CT-2651: Disable unit test until partial parsing of semantic models is complete.

* CT-2651: Temporarily disable an apparently-flaky test.
2023-06-21 18:31:54 -04:00
Michelle Ark
ecf90d689e Refactor/unify public and model nodes (#7891) 2023-06-21 17:12:15 -04:00
Mila Page
4cdeff11cd Remove --config-dir instead of fixing #7774 (#7793)
* Stringify the dir always to solve the bug.

* Add changelog

---------

Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
2023-06-21 11:36:47 -07:00
Gerda Shank
9ff2f6e430 Do not jinja render packages from dependencies yml (#7910)
* Add some comments to methods constructing Project/RuntimeConfig

* Save flag that packages dict came from dependencies.yml

* Test for not rendering packages_dict

* Changie

* Ensure packages_yml_dict and dependencies_yml_dict are dictionaries

* Ensure "packages" passed to render_packages is a dict
2023-06-21 14:33:23 -04:00
Jeremy Cohen
73a0dc6d14 Reorganize, annotate, and revise dependency pins (#7368)
* Reorganize + annotate dependencies

* Loosen pins on other dbt-labs packages

* Minor pins, no patch pins

* Rm support for py37

* Fix protobuf pin

* Bump networkx upper bound #6551
2023-06-21 07:08:43 +02:00
Kshitij Aranke
0a1c73e0fd Fixes #7753: Fix regression in run-operation to not require the name of the package to run (#7811) 2023-06-20 11:19:16 -05:00
Quigley Malcolm
8653ffc5a4 Upgrade core to support dbt-semantic-interfaces 0.1.0dev7 (#7903)
* Bump DSI dependency version to 0.1.0dev7

* Cleaner DSI type enum importing

Previously we had to use individual import paths for each type enum
that dbt-semantic-interfaces provided. However, dbt-semantic-interfaces
has been updated to allow for importing all the type enums from a
singular path.

* Cleaner DSI protocol importing

Previously we had to use individual import paths for each protocol
that dbt-semantic-interfaces provided. However, dbt-semantic-interfaces
has been updated to allow for importing all the protocols from a
singular path.

* Add semantic protocol satisfaction test for metric type params

* Replace `metric.type_params.measures` with `metric.type_params.input_measures`

In DSI 0.1.0dev7 `measures` on metric type params became `input_measures`.
Additionally `input_measures` should not be user specified but something
we compile at parse time, thus we've removed it from `UnparsedMetricTypeParams`.
Finally, actually populating `input_measures` is somewhat complicated due
to the existence of derived metrics, thus that work is being pushed
off to CT-2707.

* Update metric numerator/denominator to be `MetricInput`s

In DSI 0.1.0dev7 `metric.type_params.numerator` and `metric.type_params.denominator`
switched from being `MetricInputMeasure`s to `MetricInput`s. This
commit reflects that change. Additionally, some helper functions on
metric type params were removed related to the numerator and denominator.
Thus we've removed them respectively in this commit.

* Add protocol satisfaction tests for `MetricInput` and `MetricInputMeasure`

* Add `post_aggregation_reference` to `MetricInput` and fix typo in `MetricInputMeasure`

DSI 0.1.0dev7 added `post_aggregation_reference` to the `MetricInput` protocol,
thus we've added it to our implementation in core. Additionally, we had a typo
in a method name in our implementation of `MetricInputMeasure`, ironically
a similar function to the one we've added for `MetricInput`.

* Changie doc for upgraded to DSI 0.1.0dev7

* Fix parsing of metric numerator and denominator in schema_yaml_readers

Previously numerator and denominator of a metric were `MetricInputMeasure`s,
now they're `MetricInput`s. Changing the typing isn't enough though.
We have parsing functions in `schema_yaml_readers` which were specifically
parsing the numerator and denominator as if they were `MetricInputMeasure`s.
Thus we had to update the schema_yaml_readers to parse them as `MetricInput`s.
During this we had some logic in a parsing function `_get_metric_inputs` which
could be abstracted to newly added functions.
2023-06-20 08:08:57 -07:00
Quigley Malcolm
86583a350f Ct-2690 Support dbt-semantic-interfaces 0.1.0dev5 (#7888)
* Upgrade to dbt-semantic-interfaces v0.1.0dev5

This is a fairly simple upgrade. Literally it's just pointing at
the new versions. The v3 schemas are directly compatible with v5 because
there were no protocol level changes from v3 to v5. All the changes were
updates to tools MetricFlow uses from DSI, not tools that we ourselves
are using in core (yet).

* Add changie doc for DSI version bump
2023-06-16 12:47:55 -07:00
Quigley Malcolm
fafab5d557 [CT-2696] Skip jinja parsing of metric filters (#7885)
* Update metric filters in testing fixtures

I incorrectly wrote the tests such that they didn't include curly
braces, `{{..}}`, around things like `dimension(..)` for filters.
This updates the test fixtures to have proper filter specifications.

* Skip jinja rendering of `filter` key of metrics

Note that `filter` can show up in multiple places: as a root key
on a metric (`metric.filter`), on a metric input (`metric.type_params.metrics[x].filter`),
denominator (`metric.type_params.denominator.filter`), numerator
(`metric.type_params.numerator.filter`), and a metric input measure
(`metric.type_params.measure.filter` and `metric.type_params.measures[x].filter`).
In this commit we skip all of them :)

* Add changie doc for skipping jinja parsing for metric filters

* Update yaml renderer test for metrics
2023-06-15 15:41:31 -07:00
Gerda Shank
39e0c22353 Allow setting packages in dependencies.yml and move dependencies to runtime config (#7857) 2023-06-15 14:41:57 -04:00
colin-rogers-dbt
f767943fb2 Add AdapterRegistered event log message (#7862)
* Add AdapterRegistered event log message

* Add AdapterRegistered to unit test

* make versioning and logging consistent

* make versioning and logging consistent

* add to_version_string

* remove extra equals

* format fire_event
2023-06-14 14:56:47 -07:00
dependabot[bot]
ae97831ebf Bump mypy from 0.981 to 1.0.1 (#7027)
* Bump mypy from 0.981 to 1.0.1

Bumps [mypy](https://github.com/python/mypy) from 0.981 to 1.0.1.
- [Release notes](https://github.com/python/mypy/releases)
- [Commits](https://github.com/python/mypy/compare/v0.981...v1.0.1)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Add automated changelog yaml from template for bot PR

* upgrade mypy and fix all errors

* fixing some duplicate imports from conflict resolution

* fix mypy errors from merging in main

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
2023-06-14 14:31:57 -05:00
Jeremy Cohen
f16bae0ab9 Fix: dbt show --inline with private models (#7838)
* Add functional test

* Check resource_type before DbtReferenceError

* Changelog entry
2023-06-14 12:23:20 -04:00
FishtownBuildBot
b947b2bc7e [Automated] Merged prep-release/1.6.0b4_5260249343 into target main during release process 2023-06-13 16:40:37 -05:00
Github Build Bot
7068688181 Bumping version to 1.6.0b4 and generate changelog 2023-06-13 20:54:08 +00:00
Quigley Malcolm
38c0600982 Update SemanticModel node to match DSI 0.1.0dev3 protocols (#7848)
* Add tests to ensure our semantic layer nodes satisfy the DSI protocols

These tests create runtime checkable versions of the protocols defined in
DSI. Thus we can instantiate instances of our semantic layer nodes and
use `isinstance` to check that they satisfy the protocol. These `runtime_checkable`
versions of the protocols should only exist in testing and should never
be used in the actual package code.

* Update the `Dimension` object of `SemanticModel` node to match DSI protocol

* Make `UnparsedDimension` more strict and update schema readers accordingly

* Update the `Entity` object of `SemanticModel` node to match DSI protocol

* Make `UnparsedEntity` more strict and update schema readers accordingly

* Update the `Measure` object of `SemanticModel` node to match DSI protocol

* Make `UnparsedMeasure` more strict and update schema readers accordingly

* Update the `SemanticModel` node to match DSI protocol

A lot of the additions are helper functions which we don't actually
use in core. This is a known issue. We're in the process of removing
a fair number of them from the DSI protocol spec. However, in the meantime
we need to implement them to satisfy the protocol unfortunately.

* Make `UnparsedSemanticModel` more strict and update schema readers accordingly

* Changie entry for updating SemanticModel node
2023-06-13 13:25:35 -07:00
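A toy version of the testing approach described above: declare a `runtime_checkable` copy of a protocol (for tests only) so `isinstance()` can confirm a concrete node satisfies it. The protocol and node below are simplified stand-ins for the DSI protocols and dbt nodes.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class RuntimeCheckableDimension(Protocol):
    name: str
    expr: Optional[str]


class Dimension:
    def __init__(self, name: str, expr: Optional[str] = None):
        self.name = name
        self.expr = expr


def test_dimension_satisfies_protocol():
    # isinstance() with a runtime_checkable protocol checks member presence only.
    assert isinstance(Dimension(name="ds"), RuntimeCheckableDimension)
```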
Jeremy Cohen
83d163add5 Respect column quote config in model contracts (#7537) 2023-06-13 15:32:56 -04:00
Emily Rockman
d46e8855ef Allow ProjectDependency to have extra fields (#7834)
* Allow ProjectDependency to have extra fields

* changelog
2023-06-13 09:47:50 -05:00
mirnawong1
60524c0f8e update adapters url (#7779)
* update adapters url

in response to [docs.getdbt.com pr 3465](https://github.com/dbt-labs/docs.getdbt.com/issues/3465), updating this error message to point to the correct URL, which was recently changed.

old URL: https://docs.getdbt.com/docs/supported-data-platforms#adapter-installation
new URL: https://docs.getdbt.com/docs/connect-adapters#install-using-the-cli

thank you @dbeatty10 for your 🦅 👀 !

* adding changie entry

* Update .changes/unreleased/Breaking Changes-20230612-161159.yaml

---------

Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
2023-06-13 08:51:21 -05:00
Gerda Shank
ca73a2aa15 Use project directory in path selector instead of cwd (#7829)
* Use contextvar to store and get project_root for path selector method

* Changie

* Modify test to check Path selector with project-dir

* Don't set cv_project_root in base task if no config
2023-06-13 09:08:09 -04:00
Emily Rockman
4a833a4272 Move remaining unit tests to /tests directory (#7843)
* moved remaining unit tests to the /tests directory

* fix path

* Delete profiles.yml

* rename remaining test refs
2023-06-13 07:08:28 -05:00
FishtownBuildBot
f9abeca231 Add new index.html and changelog yaml files from dbt-docs (#7836) 2023-06-12 11:55:42 -04:00
Jeremy Cohen
5f9e527768 Rm spaces from NodeType strings (#7842)
* Rm space from NodeType strings

* Add changelog entry
2023-06-12 11:55:07 -04:00
FishtownBuildBot
6f51de4cb5 [Automated] Merged prep-release/1.6.0b3_5215324721 into target main during release process 2023-06-08 15:52:09 -05:00
Github Build Bot
cb64682d33 Bumping version to 1.6.0b3 and generate changelog 2023-06-08 20:15:20 +00:00
Quigley Malcolm
98d1a94b60 Update spec MetricNode for dbt x MetricFlow integration and begin outputting semantic manifest artifact (#7812)
* Refactor MetricNode definition to satisfy DSI Metric protocol

* Fix tests involving metrics to have updated properties

* Update UnparsedMetricNode to match new metric yaml spec

* Update MetricParser for new unparsed and parsed MetricNodes

* Remove `rename_metric_attr`

We're intentionally breaking the spec. There will be a separate tool provided
for migrating from dbt-metrics to dbt x metricflow. This bit of code was renaming
things like `type` to `calculation_method`. This is problematic because `type` is
on the new spec, while `calculation_method` is not. Additionally, since we're
intentionally breaking the spec, this function, `rename_metric_attr`, shouldn't be
used for any property renaming.

* Fix tests for Metrics (1.6) changes

* Regenerated v10 manifest schema and associated functional test artifact state

* Remove no longer needed tests

* Skip / comment out tests for metrics functionality that we'll be implementing later

* Begin outputting semantic manifest artifact on every run

* Drop metrics during upgrade_manifest_json if manifest is v9 or before

* Update properties of `minimal_parsed_metric_dict` to match new metric spec

* Add changie entry for metric node breaking changes

* Add semantic model nodes to semantic manifest
2023-06-08 10:23:36 -07:00
Peter Webb
a89da7ca88 Add SemanticModel Node Type (#7769)
* Add dbt-semantic-interfaces as a dependency

With the integration with MetricFlow we're taking a dependency on
`dbt-semantic-interfaces` which acts as the source of truth for
protocols which MetricFlow and dbt-core need to agree on. Additionally
we're hard pinning to 0.1.0.dev3 for now. We plan on having a less
restrictive specification when dbt-core 1.6 hits GA.

* Add implementations of DSI Metadata protocol to nodes.py

* CT-2521: Initial work on adding new SemanticModel node

* CT-2521: Second rough draft of SemanticModels

* CT-2521: Update schema v10

* CT-2521: Update unit tests for new SemanticModel collection in manifest

* CT-2521: Add changelog entry

* CT-2521: Final touches on initial implementation of SemanticModel parsing

* Change name of Metadata class to reduce potential for confusion

* Remove "Replaceable" inheritance, per review

* CT-2521: Rename internal variables from semantic_models to semantic_nodes

* CT-2521: Update manifest schema to reflect change

---------

Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
2023-06-08 09:39:04 -04:00
Mike Alfare
2d237828ae ADAP-2: Materialized Views (#7239)
* changie

* ADAP-387: Stub materialized view as a materialization (#7211)

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* remove unneeded return statement, rename directory

* remove unneeded ()

* responding to some pr feedback

* adjusting order of events for mv base work

* move up pre-existing drop of backup

* change relation type to view to be consistent

* add base test case

* fix jinja exception message expression, basic test passing

* response to feedback, removal of refresh in favor of combined create_as, etc.

* swapping to api layer and strategies for default implementation (basing off postgres, redshift)

* remove strategy to limit need for now

* remove unneeded story level changelog entry

* add strategies to conditional in place of old macros

* macro name fix

* rename refresh macro in api level

* align names between postgres and default to same convention

* align names between postgres and default to same convention

* change a create call to full refresh

* pull adapter rename into strategy, add backup_relation as optional arg

* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic

* updating to feature main

---------

Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>

* ADAP-387: reverting db_api implementation (#7322)

* changie

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* remove unneeded return statement, rename directory

* remove unneeded ()

* responding to some pr feedback

* adjusting order of events for mv base work

* move up pre-existing drop of backup

* change relation type to view to be consistent

* add base test case

* fix jinja exception message expression, basic test passing

* response to feedback, removal of refresh in favor of combined create_as, etc.

* swapping to api layer and strategies for default implementation (basing off postgres, redshift)

* remove strategy to limit need for now

* remove unneeded story level changelog entry

* add strategies to conditional in place of old macros

* macro name fix

* rename refresh macro in api level

* align names between postgres and default to same convention

* change a create call to full refresh

* pull adapter rename into strategy, add backup_relation as optional arg

* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic

* updating to feature main

* removing db_api and strategies directories in favor of matching current materialization setups

* macro name change

* revert to current approach for materializations

* added tests

* added `is_materialized_view` to `BaseRelation`

* updated materialized view stored value to snake case

* typo

* moved materialized view tests into adapter test framework

* add enum to relation for comparison in jinja

---------

Co-authored-by: Mike Alfare <mike.alfare@dbtlabs.com>

* ADAP-391: Add configuration change option (#7272)

* changie

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* move up pre-existing drop of backup

* change relation type to view to be consistent

* add base test case

* fix jinja exception message expression, basic test passing

* align names between postgres and default to same convention

* init set of Enum for config

* work on initial Enum class for on_configuration_change, basing it off ConstraintTypes which is also a str based Enum in core

* add on_configuration_change to unit test expected values

* make suggested name change to Enum class

* add on_configuration_change to some integration tests

* add on_configuration_change to expected_manifest to pass functional tests

* added `is_materialized_view` to `BaseRelation`

* updated materialized view stored value to snake case

* moved materialized view tests into adapter test framework

* add alter materialized view macro

* change class name, and config setup

* play with field setup for on_configuration_change

* add method for default selection in enum class

* renamed get_refresh_data_in_materialized_view_sql to align with experimental package

* changed expected values to default string

* added in `on_configuration_change` setting

* change ignore to skip

* updated default option for on_configuration_change on NodeConfig

* removed explicit calls to enum values

* add test setup for testing fail config option

* updated `config_updates` to `configuration_changes` to align with `on_configuration_change` name

* setup configuration change framework

* skipped tests that are expected to fail without adapter implementation

* cleaned up log checks

---------

Co-authored-by: Mike Alfare <mike.alfare@dbtlabs.com>

* ADAP-388: Stub materialized view as a materialization - postgres (#7244)

* move the body of the default macros into the postgres implementation, throw errors if the default is used, indicating that materialized views have not been implemented for that adapter

---------

Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>

* ADAP-402: Add configuration change option - postgres (#7334)

* changie

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* remove unneeded return statement, rename directory

* remove unneeded ()

* responding to some pr feedback

* adjusting order of events for mv base work

* move up pre-existing drop of backup

* change relation type to view to be consistent

* add base test case

* fix jinja exception message expression, basic test passing

* added materialized view stubs and test

* response to feedback, removal of refresh in favor of combined create_as, etc.

* updated postgres to use the new macros structure

* swapping to api layer and strategies for default implementation (basing off postgres, redshift)

* remove strategy to limit need for now

* remove unneeded story level changelog entry

* add strategies to conditional in place of old macros

* macro name fix

* rename refresh macro in api level

* align names between postgres and default to same convention

* change a create call to full refresh

* pull adapter rename into strategy, add backup_relation as optional arg

* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic

* init copy of pr 387 to begin 391 implementation

* init set of Enum for config

* work on initial Enum class for on_configuration_change, basing it off ConstraintTypes which is also a str based Enum in core

* remove postgres-specific materialization in favor of core default materialization

* update db_api to use native types (e.g. str) and avoid direct calls to relation or config, which would alter the run order for all db_api dependencies

* add clarifying comment as to why we have a single test that's expected to fail at the dbt-core layer

* add on_configuration_change to unit test expected values

* make suggested name change to Enum class

* add on_configuration_change to some integration tests

* add on_configuration_change to expected_manifest to pass functional tests

* removing db_api and strategies directories in favor of matching current materialization setups

* macro name change

* revert to current approach for materializations

* revert to current approach for materializations

* added tests

* move materialized view logic into the `/materializations` directory in line with `dbt-core`

* moved default macros in `dbt-core` into `dbt-postgres`

* added `is_materialized_view` to `BaseRelation`

* updated materialized view stored value to snake case

* moved materialized view tests into adapter test framework

* updated materialized view tests to use adapter test framework

* add alter materialized view macro

* add alter materialized view macro

* change class name, and config setup

* change class name, and config setup

* play with field setup for on_configuration_change

* add method for default selection in enum class

* renamed get_refresh_data_in_materialized_view_sql to align with experimental package

* changed expected values to default string

* added in `on_configuration_change` setting

* change ignore to skip

* added in `on_configuration_change` setting

* updated default option for on_configuration_change on NodeConfig

* updated default option for on_configuration_change on NodeConfig

* fixed list being passed as string bug

* removed explicit calls to enum values

* removed unneeded test class

* fixed on_configuration_change to be picked up appropriately

* add test setup for testing fail config option

* remove breakpoint, uncomment tests

* update skip scenario to use empty strings

* update skip scenario to avoid using sql at all, remove extra whitespace in some templates

* push up initial addition of indexes for mv macro

* push slight change up

* reverting alt macro and moving the do create_index call to be more in line with other materializations

* Merge branch 'feature/materialized-views/ADAP-2' into feature/materialized-views/ADAP-402

# Conflicts:
#	core/dbt/contracts/graph/model_config.py
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/alter_materialized_view.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/create_materialized_view_as.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/get_materialized_view_configuration_changes.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/materialized_view.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/refresh_materialized_view.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/replace_materialized_view.sql
#	plugins/postgres/dbt/include/postgres/macros/materializations/materialized_view.sql
#	tests/adapter/dbt/tests/adapter/materialized_views/base.py
#	tests/functional/materializations/test_materialized_view.py

* merge feature branch into story branch

* merge feature branch into story branch

* added indexes into the workflow

* fix error in jinja that caused print error

* working on test messaging and skipping tests that might not fit quite into current system

* add drop and show macros for indexes

* add drop and show macros for indexes

* add logic to determine the indexes to create or drop

* pulled index updates through the workflow properly

* convert configuration changes to fixtures, implement index changes into tests

* created Model dataclass for readability, added column to swap index columns for testing

* fixed typo

---------

Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>

* ADAP-395: Implement native materialized view DDL (#7336)

* changie

* changie

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources

* remove unneeded return statement, rename directory

* remove unneeded ()

* responding to some pr feedback

* adjusting order of events for mv base work

* move up pre-existing drop of backup

* change relation type to view to be consistent

* add base test case

* fix jinja exception message expression, basic test passing

* added materialized view stubs and test

* response to feedback, removal of refresh in favor of combined create_as, etc.

* updated postgres to use the new macros structure

* swapping to api layer and strategies for default implementation (basing off postgres, redshift)

* remove strategy to limit need for now

* remove unneeded story level changelog entry

* add strategies to conditional in place of old macros

* macro name fix

* rename refresh macro in api level

* align names between postgres and default to same convention

* align names between postgres and default to same convention

* change a create call to full refresh

* pull adapter rename into strategy, add backup_relation as optional arg

* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic

* init copy of pr 387 to begin 391 implementation

* updating to feature main

* updating to feature main

* init set of Enum for config

* work on initial Enum class for on_configuration_change, basing it off ConstraintTypes which is also a str based Enum in core

* remove postgres-specific materialization in favor of core default materialization

* update db_api to use native types (e.g. str) and avoid direct calls to relation or config, which would alter the run order for all db_api dependencies

* add clarifying comment as to why we have a single test that's expected to fail at the dbt-core layer

* add on_configuration_change to unit test expected values

* make suggested name change to Enum class

* add on_configuration_change to some integration tests

* add on_configuration_change to expected_manifest to pass functional tests

* removing db_api and strategies directories in favor of matching current materialization setups

* macro name change

* revert to current approach for materializations

* revert to current approach for materializations

* added tests

* move materialized view logic into the `/materializations` directory in line with `dbt-core`

* moved default macros in `dbt-core` into `dbt-postgres`

* added `is_materialized_view` to `BaseRelation`

* updated materialized view stored value to snake case

* typo

* moved materialized view tests into adapter test framework

* updated materialized view tests to use adapter test framework

* add alter materialized view macro

* add alter materialized view macro

* added basic sql to default macros, added postgres-specific sql for alter scenario, stubbed a test case for index update

* change class name, and config setup

* change class name, and config setup

* play with field setup for on_configuration_change

* add method for default selection in enum class

* renamed get_refresh_data_in_materialized_view_sql to align with experimental package

* changed expected values to default string

* added in `on_configuration_change` setting

* change ignore to skip

* added in `on_configuration_change` setting

* updated default option for on_configuration_change on NodeConfig

* updated default option for on_configuration_change on NodeConfig

* fixed list being passed as string bug

* fixed list being passed as string bug

* removed explicit calls to enum values

* removed explicit calls to enum values

* removed unneeded test class

* fixed on_configuration_change to be picked up appropriately

* add test setup for testing fail config option

* remove breakpoint, uncomment tests

* update skip scenario to use empty strings

* update skip scenario to avoid using sql at all, remove extra whitespace in some templates

* push up initial addition of indexes for mv macro

* push slight change up

* reverting alt macro and moving the do create_index call to be more in line with other materializations

* Merge branch 'feature/materialized-views/ADAP-2' into feature/materialized-views/ADAP-402

# Conflicts:
#	core/dbt/contracts/graph/model_config.py
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/alter_materialized_view.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/create_materialized_view_as.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/get_materialized_view_configuration_changes.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/materialized_view.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/refresh_materialized_view.sql
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/replace_materialized_view.sql
#	plugins/postgres/dbt/include/postgres/macros/materializations/materialized_view.sql
#	tests/adapter/dbt/tests/adapter/materialized_views/base.py
#	tests/functional/materializations/test_materialized_view.py

* merge feature branch into story branch

* merge feature branch into story branch

* added indexes into the workflow

* fix error in jinja that caused print error

* working on test messaging and skipping tests that might not fit quite into current system

* Merge branch 'feature/materialized-views/ADAP-2' into feature/materialized-views/ADAP-395

# Conflicts:
#	core/dbt/include/global_project/macros/materializations/models/materialized_view/get_materialized_view_configuration_changes.sql
#	plugins/postgres/dbt/include/postgres/macros/adapters.sql
#	plugins/postgres/dbt/include/postgres/macros/materializations/materialized_view.sql
#	tests/adapter/dbt/tests/adapter/materialized_views/test_on_configuration_change.py
#	tests/functional/materializations/test_materialized_view.py

* moved postgres implementation into plugin directory

* update index methods to align with the configuration update macro

* added native ddl to postgres macros

* removed extra docstring

* updated references to View, now references MaterializedView

* decomposed materialization into macros

* refactor index create statement parser, add exceptions for unexpected formats

* swapped conditional to check for positive state

* removed skipped test now that materialized view is being used

* return the results and logs of the run so that additional checks can be applied at the adapter level, add check for refresh to a test

* add check for indexes in particular for apply on configuration scenario

* removed extra argument

* add materialized views to get_relations / list_relations

* typos in index change logic

* moved full refresh check inside the build sql step

---------

Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>

* removing returns from tests to stop logs from printing

* moved test cases into postgres tests, left non-test functionality in base as new methods or fixtures

* fixed overwrite issue, simplified assertion method

* updated import order to standard

* fixed test import paths

* updated naming convention for proper test collection with the test runner

* still trying to make the test runner happy

* rewrite index updates to use a better source in Postgres

* break out a large test suite as a separate run

* update `skip` and `fail` scenarios with more descriptive results

* typo

* removed call to skip status

* reverting `exceptions_jinja.py`

* added FailFastError back, the right way

* removed PostgresIndex in favor of the already existing PostgresIndexConfig, pulled it into its own file to avoid circular imports

* removed assumed models in method calls, removed odd insert records and replaced with get row count

* fixed index issue, removed some indirection in testing

* made test more readable

* remove the "apply" from the tests and put it on the base as the default

* generalized assertion for reuse with dbt-snowflake, fixed bug in record count utility

* fixed type to be more generic to accommodate adapters with their own relation types

* fixed all the broken index stuff

* updated on_configuration_change to use existing patterns

* updated on_configuration_change to use existing patterns

* reflected update in tests and materialization logic

* reflected update in tests and materialization logic

* reverted the change to create a config object from the option object, using just the option object now

* reverted the change to create a config object from the option object, using just the option object now

* modelled database objects to support monitoring all configuration changes

* updated "skip" to "continue", throw an error on non-implemented macro defaults

* updated "skip" to "continue", throw an error on non-implemented macro defaults

* updated "skip" to "continue", throw an error on non-implemented macro defaults

* updated "skip" to "continue", throw an error on non-implemented macro defaults

* reverted centralized framework, retained a few reusable base classes

* updated names to be more consistent

* readability updates

* added readme specifying that `relation_configs` only supports materialized views for now

---------

Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
2023-06-07 19:19:09 -04:00
Michelle Ark
f4253da72a fix: removing dependency from dependencies.yml (#7743) 2023-06-07 13:48:39 -04:00
Mila Page
919822e583 Adap 496/add test connection mode to debug (#7741)
* --connection-flag

* Standardize the plugin functions used by DebugTask

* Clean up redundant code and help the logic along.

* Add more output tests to add logic coverage and formatting.

* Code review

---------

Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
2023-06-07 09:56:35 -07:00
Michelle Ark
444c787729 fix error message for empty/None: --warn-error-options handling (#7735) 2023-06-07 12:23:40 -04:00
Michelle Ark
3b63dd9f11 Validate public models are not materialized as ephemeral (#7794) 2023-06-07 12:22:57 -04:00
Kshitij Aranke
84166bf457 Fixes #7551: Create add_from_artifact to populate state_relation field of nodes (#7796) 2023-06-06 15:25:12 -07:00
Michelle Ark
dd445e1fde generalize BaseModelConstraintsRuntimeEnforcement (#7805) 2023-06-06 16:30:50 -04:00
Michelle Ark
6a22ec1b2e Package-namespaced generate_x_name macro resolution (#7509) 2023-06-06 15:05:44 -04:00
Emily Rockman
587bbcbf0d Improve warnings for constraints and materialization types (#7696)
* first pass

* debugging

* regen proto types

* refactor to use warn_supported flag

* PR feedback
2023-06-06 12:50:58 -05:00
Doug Beatty
8e1c4ec116 Fix not equals comparison to be null-safe for adapters/utils tests (#7776)
* Fix names within functional test

* Changelog entry

* Test for implementation of null-safe equals comparison

* Remove duplicated where filter

* Fix null-safe equals comparison

* Fix tests for `concat` and `hash` by using empty strings (`''`) instead of `null`

* Remove macro namespace interpolation
2023-06-06 06:11:48 -06:00
Kshitij Aranke
dc35f56baa Fixes #7299: dbt retry (#7763) 2023-06-05 15:51:00 -07:00
Michelle Ark
60d116b5b5 log PublicationArtifactAvailable even when partially parsing & public models unchanged (#7783) 2023-06-05 14:59:38 -04:00
Emily Rockman
4dbc4a41c4 remove entire changes folder (#7766)
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
2023-06-05 09:51:30 -05:00
Michelle Ark
89541faec9 force dependency between test models (#7767) 2023-06-05 10:45:59 -04:00
Gerda Shank
79bd98560b Version 0 for model works for latest_version (#7712) 2023-06-05 10:21:39 -04:00
Michelle Ark
7917bd5033 add project_name to manifest metadata (#7754) 2023-06-02 16:16:14 -04:00
Michelle Ark
05b0ebb184 Fix constraint rendering for expressions and foreign key constraint types (#7512) 2023-06-02 15:05:09 -04:00
Quazi Irfan
e1d7a53325 Fix doc link in selector.py (#7755)
* Fix doc link in selector.py

* Ran changie to modify changelog entry
2023-06-02 09:00:58 -05:00
Michelle Ark
7a06d354aa pass optional sql_header to empty subquery sql rendering (#7734) 2023-06-01 14:08:02 -04:00
dave-connors-3
9dd5ab90bf add ability to select models by access (#7739)
* add ability to select models by access

* changie

* Update core/dbt/graph/selector_methods.py
2023-06-01 09:25:59 -04:00
Michelle Ark
45d614533f fix StopIteration error when publication not found (#7710) 2023-05-30 16:50:36 -04:00
Peter Webb
00a531d9d6 Template rendering optimization (#7451)
* CT-2478: Template rendering optimization

* CT-2478: Fix type annotation, and accommodate non-string unit test cases.
2023-05-30 12:48:47 -04:00
Sam Debruyn
fd301a38db Dropped support for Python 3.7 (#7623) 2023-05-30 12:12:57 -04:00
Jeremy Cohen
9c7e01dbca Readd exp_path for config deprecation warnings (#7536) 2023-05-30 12:04:49 -04:00
github-actions[bot]
1ac6df0996 Adding performance modeling for 1.2.0 to refs/heads/main (#7560)
* adding performance baseline for 1.2.0

* Adding newline

---------

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-05-27 22:10:38 -04:00
Gerda Shank
38ca4fce25 Target path should be relative to project dir, rather than current working directory (#7706) 2023-05-26 18:50:38 -04:00
Kshitij Aranke
7e3a6eec96 fix #7300: Enable state for deferral to be separate from state for selectors (#7690) 2023-05-26 13:00:16 -07:00
Emily Rockman
ac16a55c64 Update to reusable workflow for branch testing (#7676)
* fix overlooked node12 case with abandoned marketplace action

* update slack notification

* remove spaces per formatting

* replace with cli dispatch

* move conditional

* add explicit token, temp comment out slack

* add checkout

* checkout the right branch

* switch to PAT

* add back repo checkout

* manually check workflow status

* fix notification

* swap to reusable workflow

* fix path

* swap permissions

* remove trigger

* fix secrets

* point to main
2023-05-26 14:51:00 -05:00
Doug Beatty
620ca40b85 Add % to adapter suite test cases for persist_docs (#7699)
* Test table/view/column-level comments with `%` symbol

* Test docs block with `%` symbol

* Changelog entry
2023-05-26 12:48:08 -04:00
leahwicz
aa11cf2956 Adding link to 1.5 release notes (#7707) 2023-05-26 08:38:13 -04:00
FishtownBuildBot
feb06e2107 [Automated] Merged prep-release/1.6.0b2_5081502847 into target main during release process 2023-05-25 10:41:03 -05:00
Github Build Bot
a3d40e0abf Bumping version to 1.6.0b2 and generate changelog 2023-05-25 15:06:24 +00:00
Gerda Shank
7c1bd91d0a CT 2590 write pub artifact to log (#7686) 2023-05-24 13:54:58 -04:00
leahwicz
70a132d059 Updating CODEOWNERS to consolidate (#7693)
Consolidating to remove Language and Execution and instead default to the `core-team`
2023-05-24 09:12:25 -04:00
Anis Nasir
1fdebc660b Relaxed the pyyaml dependency to >=5.3. (#7681) 2023-05-24 08:41:38 -04:00
Gerda Shank
0516192d69 CT 2516 ensure that paths in Jinja context flags object are strings (#7678) 2023-05-24 08:24:36 -04:00
github-actions[bot]
f99be58217 Adding performance modeling for 1.4.6 to refs/heads/main (#7532)
* adding performance baseline for 1.4.6

* Add newline

---------

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-05-24 08:17:19 -04:00
github-actions[bot]
3b6222e516 Adding performance modeling for 1.3.0 to refs/heads/main (#7530)
* adding performance baseline for 1.3.0

* Add newline

---------

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-05-24 08:16:30 -04:00
github-actions[bot]
b88e60f8dd Adding performance modeling for 1.4.1 to refs/heads/main (#7527)
* adding performance baseline for 1.4.1

* Add newline

---------

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-05-24 08:16:02 -04:00
github-actions[bot]
9373c4d1e4 Adding performance modeling for 1.3.4 to refs/heads/main (#7525)
* adding performance baseline for 1.3.4

* Add newline

---------

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-05-24 08:15:26 -04:00
dependabot[bot]
0fe3ee8eca Bump ubuntu from 23.04 to 23.10 (#7675)
* Bump ubuntu from 23.04 to 23.10

Bumps ubuntu from 23.04 to 23.10.

---
updated-dependencies:
- dependency-name: ubuntu
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Add automated changelog yaml from template for bot PR

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2023-05-24 08:01:10 -04:00
Sam Debruyn
0d71a32aa2 Include null checks in utils test base (#7672)
* Include null checks in utils test base

* Add tests for the schema test

* Add tests for this macro

---------

Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
2023-05-23 19:49:48 -07:00
Doug Beatty
0f223663bb Honor --skip-profile-setup parameter when inside an existing project (#7609)
* Honor `--skip-profile-setup` parameter when inside an existing project

* Use project name as the profile name

* Use separate file connections for reading and writing

* Raise a custom exception when no adapters are installed

* Test skipping interactive profile setup when inside a dbt project

* Replace `assert_not_called()` since it does not work

* Verbose CLI argument for skipping profile setup

* Use separate file connections for reading and writing

* Check empty list in a Pythonic manner
2023-05-23 17:56:54 -06:00
Kshitij Aranke
c25d0c9f9c fix #7502: write run_results.json for run operation (#7655) 2023-05-23 14:56:23 -07:00
Peter Webb
4a4b7beeb9 Model Deprecation (#7562)
* CT-2461: Work toward model deprecation

* CT-2461: Remove unneeded conversions

* CT-2461: Fix up unit tests for new fields, correct a couple oversights

* CT-2461: Remaining implementation and tests for model/ref deprecation warnings

* CT-2461: Changelog entry for deprecation warnings

* CT-2461: Refine datetime handling and tests

* CT-2461: Fix up unit test data

* CT-2461: Fix some more unit test data.

* CT-2461: Fix merge issues

* CT-2461: Code review items.

* CT-2461: Improve version -> str conversion
2023-05-23 09:30:32 -04:00
Ian Knox
265e09dc93 Remove DelayedFileHandler (#7661)
* remove DelayedFileHandler

* Changelog

* set_path to no-op

* more no-ops for rpc

* Clearer comments
2023-05-22 16:42:48 -04:00
Mike Alfare
87ea28fe84 break out a large test suite as a separate execution to avoid memory issues on windows CI runs (#7669) 2023-05-19 17:04:35 -04:00
Michelle Ark
af0f786f2e Accept PublicationArtifacts in dbtRunner.invoke (#7656) 2023-05-18 16:42:50 -04:00
David Bloss
50528a009d update used gh actions ahead of node12 deprecation (#7651)
* update used gh actions ahead of node12 deprecation

* replace with valid tag

---------

Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
2023-05-17 16:20:26 -05:00
Stu Kilgore
f6e5582370 Add "other" relation to reffable node classes (#7645) 2023-05-17 12:14:16 -05:00
Peter Webb
dea3181d96 Exclude some profile fields from Jinja rendering when they are not valid Jinja. (#7630)
* CT-2583: Exclude some profile fields from Jinja rendering.

* CT-2583: Add functional test.

* CT-2583: Change approach to password jinja detection

* CT-2583: Extract string constant and add additional checks

* CT-2583: Improve unit test coverage
2023-05-17 11:38:48 -04:00
Gerda Shank
5f7ae2fd4c Move node patch method to schema parser patch_node_properties and refactor schema parsing (#7640) 2023-05-16 21:08:04 -04:00
Daniel Reeves
4f249b8652 Add target_path to more cli commands that use it (#7647) 2023-05-16 16:22:59 -05:00
Ian Knox
df23f68dd4 Missed PR fedback (#7642) 2023-05-16 13:50:28 -05:00
Stu Kilgore
4b091cee9e Instantiate Flags class from dict and command name (#7624) 2023-05-16 13:31:23 -05:00
Ian Knox
dcb5acdf29 bugfix: Deps hangs when using relative paths via --project-dir (#7628) 2023-05-16 10:00:23 -05:00
Emily Rockman
7fbeced315 updates for github deprecations (#7614)
* updates for github deprecations

* fix jira file name

* swap out abandoned action

* add quotes to env var

* revert main.yml
2023-05-15 15:28:52 -05:00
Mike Alfare
47e7b1cc80 Feature/drop relation/ct 2581 (#7626)
* changie
* move drop_relation macros into their own file, add scenarios for table, view, and materialized view
2023-05-15 15:51:19 -04:00
FishtownBuildBot
8f998c218e [Automated] Merged prep-release/1.6.0b1_4961250999 into target main during release process 2023-05-12 12:37:41 -05:00
Github Build Bot
41c0797d7a Bumping version to 1.6.0b1 and generate changelog 2023-05-12 17:04:26 +00:00
Michelle Ark
3f2cba0dec add --artifact flag to scripts/collect-artifact-schema (#7599) 2023-05-11 14:31:00 -04:00
Michelle Ark
b60c67d107 add publication artifact to schemas (#7590) 2023-05-11 13:18:35 -04:00
Doug Beatty
630cd3aba0 Allow missing profiles.yml for dbt deps and dbt init (#7546)
* Allow missing `profiles.yml` for `dbt deps` and `dbt init`

* Some commands allow the `--profiles-dir` to not exist

* Remove fix to verify that CI tests work

* Allow missing `profiles.yml` for `dbt deps` and `dbt init`

* CI is not finding any installed adapters

* Remove functional test for `dbt init`
2023-05-10 19:24:04 -06:00
Emily Rockman
05595f5920 Detect breaking changes to constraints in state:modified (#7476)
* added test that fails

* added new exception

* add partial error checking - needs more specifics

* move contract check under modelnode

* try adding only enforced constraints

* add checks for enforced constraints

* changelog

* add materialization logic

* clean up tests, tweak materializations

* PR feedback

* more PR feedback

* change to tuple
2023-05-10 15:39:26 -05:00
Gerda Shank
29f2cfc48d CT 2510 Throw error for duplicate versioned and non versioned model names (#7577)
* Check for versioned/unversioned duplicates

* Add new exception DuplicateVersionedUnversionedError

* Changie

* Handle packages when finding versioned and unversioned duplicates
2023-05-10 16:16:38 -04:00
Gerda Shank
43d949c5cc CT 2494 check for project level dependency cycles (#7558)
* Raise error if dependent project depends on current project

* Test for project dependency cycle

* Changie
2023-05-10 10:20:03 -04:00
Mike Alfare
58312f1816 CT-2556: pin urllib3 to 1.x (#7574)
* pin urllib3 to 1.x

* changie
2023-05-09 18:36:20 -04:00
Ian Knox
dffbb6a659 Always write run_results.json (#7539) 2023-05-09 13:46:21 -05:00
Kshitij Aranke
272beb21a9 fix #7413: inject sql header in query for show (#7568) 2023-05-09 11:29:12 -07:00
Gerda Shank
d34c511fa5 CT 2552 pin protobuf to >=4.0.0 (#7566)
* Pin protobuf to >=4.0.0

* Changie
2023-05-09 13:10:09 -04:00
github-actions[bot]
2945619eb8 Adding performance modeling for 1.4.0 to refs/heads/main (#7523)
* adding performance baseline for 1.4.0

* Fix formatting

---------

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-05-09 09:14:53 -04:00
Kshitij Aranke
078a83679a fix #7390: push down limit filtering to adapter (#7545) 2023-05-08 22:22:58 -07:00
Kshitij Aranke
881437e890 fix #7273: enable dbt show for seeds (#7544) 2023-05-08 15:08:24 -07:00
Kshitij Aranke
40aca4bc17 fix #7407: print model version in dbt show if specified (#7543) 2023-05-08 14:30:07 -07:00
Michelle Ark
0de046dfbe Allow duplicate refable node names across packages (#7374) 2023-05-08 16:41:37 -04:00
Jeremy Cohen
5a7b73be26 Do not rewrite manifest.json during 'docs serve' command (#7554) 2023-05-08 15:36:07 -04:00
Jeremy Cohen
35f8ceb7f1 Back compat for previous return type of collect_freshness (#7535)
* Back compat for previous return type of 'collect_freshness'

* Test fixups

* PR feedback
2023-05-08 10:11:42 -04:00
Doug Beatty
19d6dab973 Fix inverted --print/--no-print flag (#7524) 2023-05-08 07:20:29 -06:00
leahwicz
810ef7f556 Adding perf testing GHA (#5851)
* Adding perf testing GHA

* Fixing trigger syntax

* Fixing PR creation issue

* Updating testing var

* Remove unneeded branch names

* Fixing branch naming convention

* Standardizing branch name to var

* Consolidating PR jobs

* Updating inputs and making more readable

* Splitting steps up

* Making some updates here to simplify and update

* Remove tab

* Cleaned up testing TODOs before committing

* Fixing spacing

* Fixing spacing issue
2023-05-05 09:41:41 -04:00
Gerda Shank
fd7306643f Initial implementation of cross-project ref (#7276)
* Create publication.py, various Publication classes, Dependency class

* Load dependencies.yml and the corresponding publication file

* Add "public_nodes" and populate ref_lookup

* resolve_ref working

* Add public nodes to parent and child maps

* Bump manifest version and fix tests, use ModelDependsOn

* Split out PublicationArtifact and PublicationConfig, store public_models
separately

* Store dependencies in publication artifact

* change detection of PublicModel for >= python3.10

* Handle removing references for re-processing if publication has changed

* Handle only changed publication artifacts

* Add some logging events

* Remove duplicate nodes from manifest

* refactor relation_from_relation_name

* Remove duplicate writing of manifest.json

* Add public_nodes to flat_graph

* Move some file name constants to core/dbt/constants.py

* Remove "environment" from ProjectDependency. Add
database/schema/identifier to PublicModel. Update TargetNotFound
exception.

* Include external publication dependencies in publication artifact dependencies

* Remove create_from_relation_name, call create_from_node instead

* Change PublicationArtifactChanged message to debug level

* Make write_publication_artifact a function in parser/manifest.py

* Create fixture to create minimal alternate project (just models)

* develop multi project test case
2023-05-03 10:56:40 -04:00
Brice Luu
f1dddaa6e9 Ignore parent tests added edges for build selection (#7431) 2023-04-28 17:01:06 -05:00
Michelle Ark
a7eb89d645 active project > local project in ConfiguredVar (#7441) 2023-04-28 16:17:28 -04:00
Peter Webb
c56a9b2b7f CT-2414: Add graph summaries to target directory output (#7358)
* CT-2414: Add graph summaries to target directory output

* CT-2414: Make graph representation more compact

* CT-2414: Add changelog entry

* CT-2414: Remove temporary diagnostic code.

* CT-2414: Combine graphs into a single file

* CT-2414: Simplify graph summary format.

* CT-2414: Add invocation id to summary, add unit test
2023-04-28 16:00:31 -04:00
Emily Rockman
17a8f462dd Update CODEOWNERS to include the OSS Tooling Guild (#7472)
* Update CODEOWNERS to include the OSS Tooling Guild

* add a few more files
2023-04-28 08:17:53 -05:00
Jeremy Cohen
e3498bdaa5 Remove noisy parse events (#7388)
* Rm noisy parse events

* PR feedback
2023-04-27 18:17:53 +02:00
Gerda Shank
d2f963e20e CT 2483 duplicate depends on nodes (#7455)
* Remove unnecessary "_update_into" methods
2023-04-25 14:10:59 -04:00
Jeremy Cohen
d53bb37186 UX improvements to model versions (#7435)
* Latest version should use un-suffixed alias

* Latest version can be in un-suffixed file

* FYI when unpinned ref to model with prerelease version

* [WIP] Nicer error if versioned ref to unversioned model

* Revert "Latest version should use un-suffixed alias"

This reverts commit 3616c52c1eed7588b9e210e1c957dfda598be550.

* Revert "[WIP] Nicer error if versioned ref to unversioned model"

This reverts commit c9ae4af1cfbd6b7bfc5dcbb445556233eb4bd2c0.

* Define real event for UnpinnedRefNewVersionAvailable

* Update pp test for implicit unsuffixed defined_in

* Add changelog entry

* Fix unit test

* marky feedback

* Add test case for UnpinnedRefNewVersionAvailable event
2023-04-25 19:55:58 +02:00
Michelle Ark
9874f9e004 Fix groupable node partial parsing, raise DbtReferenceError in RuntimeRefResolver (#7438) 2023-04-25 11:09:35 -04:00
Michelle Ark
2739d5f4c4 fix partial parsing of versioned models - schedule child nodes if latest version has been modified (#7439) 2023-04-25 10:30:03 -04:00
Ian Knox
d07603b288 Clear cached statement results when retrieved from ProviderContext and subclasses (#7371) 2023-04-24 12:36:50 -05:00
Jeremy Cohen
723ac9493d Fix .gitignore to take heed of tests/functional/build (#7436) 2023-04-24 19:23:09 +02:00
Stu Kilgore
de75777ede Persist timing info for failed nodes (#7353) 2023-04-24 11:13:41 -05:00
Daniel Reeves
75703c10ee add --target-path to snapshot command (#7419) 2023-04-21 08:29:15 -07:00
Michelle Ark
1722079a43 fix v0 ref resolution and latest_version configuration (#7415) 2023-04-20 12:06:14 -04:00
337 changed files with 20571 additions and 4820 deletions

View File

@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.6.0a1
current_version = 1.7.0a1
parse = (?P<major>[\d]+) # major version number
\.(?P<minor>[\d]+) # minor version number
\.(?P<patch>[\d]+) # patch version number
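The `parse` pattern above is built from named regex groups; the prerelease portion of the pattern is truncated by this hunk, so the sketch below covers only the major/minor/patch parts that are shown:

```python
import re

# Mirrors only the visible portion of the bumpversion `parse` expression;
# the prerelease suffix handling is omitted here, so this is just a sketch.
VERSION_RE = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)")

match = VERSION_RE.match("1.7.0a1")
assert match is not None
print(match.group("major"), match.group("minor"), match.group("patch"))  # 1 7 0
```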

View File

@@ -3,6 +3,8 @@
For information on prior major and minor releases, see their changelogs:
* [1.6](https://github.com/dbt-labs/dbt-core/blob/1.6.latest/CHANGELOG.md)
* [1.5](https://github.com/dbt-labs/dbt-core/blob/1.5.latest/CHANGELOG.md)
* [1.4](https://github.com/dbt-labs/dbt-core/blob/1.4.latest/CHANGELOG.md)
* [1.3](https://github.com/dbt-labs/dbt-core/blob/1.3.latest/CHANGELOG.md)
* [1.2](https://github.com/dbt-labs/dbt-core/blob/1.2.latest/CHANGELOG.md)

View File

@@ -1 +0,0 @@
## dbt-core 1.6.0-a1 - April 17, 2023

View File

@@ -0,0 +1,6 @@
kind: Dependencies
body: Pin click<9 + sqlparse<0.5
time: 2023-07-19T12:37:43.716495+02:00
custom:
Author: jtcohen6
PR: "8146"

View File

@@ -0,0 +1,6 @@
kind: Docs
body: Fix for column tests not rendering on quoted columns
time: 2023-05-31T11:54:19.687363-04:00
custom:
Author: drewbanin
Issue: "201"

View File

@@ -0,0 +1,6 @@
kind: Docs
body: Remove static SQL codeblock for metrics
time: 2023-07-18T19:24:22.155323+02:00
custom:
Author: marcodamore
Issue: "436"

View File

@@ -1,6 +0,0 @@
kind: Features
body: Skip catalog generation
time: 2023-03-21T21:33:38.513443Z
custom:
Author: AndyBys
Issue: "6980"

View File

@@ -1,6 +0,0 @@
kind: Fixes
body: fix typo in unpacking statically parsed ref
time: 2023-04-14T16:36:42.279838-04:00
custom:
Author: MichelleArk
Issue: "7364"

View File

@@ -1,6 +0,0 @@
kind: Fixes
body: safe version attribute access in _check_resource_uniqueness
time: 2023-04-18T13:52:57.367108-04:00
custom:
Author: MichelleArk
Issue: "7375"

View File

@@ -1,6 +0,0 @@
kind: Fixes
body: Fix dbt command missing target-path param
time: 2023-04-19T14:21:50.959786-07:00
custom:
Author: ChenyuLInx
Issue: "\t7411"

View File

@@ -0,0 +1,6 @@
kind: Fixes
body: Enable converting deprecation warnings to errors
time: 2023-07-18T12:55:18.03914-04:00
custom:
Author: michelleark
Issue: "8130"

View File

@@ -1,6 +0,0 @@
kind: Under the Hood
body: Update docs link in ContractBreakingChangeError message
time: 2023-04-17T11:45:01.005104+02:00
custom:
Author: jtcohen6
Issue: "7366"

View File

@@ -1,6 +0,0 @@
kind: Under the Hood
body: Update --help text for cache-related parameters
time: 2023-04-18T12:23:23.276693+02:00
custom:
Author: jtcohen6
Issue: "7381"

49
.github/CODEOWNERS vendored
View File

@@ -11,44 +11,24 @@
# As a default for areas with no assignment,
# the core team as a whole will be assigned
* @dbt-labs/core
* @dbt-labs/core-team
# Changes to GitHub configurations including Actions
/.github/ @leahwicz
### OSS Tooling Guild
### LANGUAGE
/.github/ @dbt-labs/guild-oss-tooling
.bumpversion.cfg @dbt-labs/guild-oss-tooling
# Language core modules
/core/dbt/config/ @dbt-labs/core-language
/core/dbt/context/ @dbt-labs/core-language
/core/dbt/contracts/ @dbt-labs/core-language
/core/dbt/deps/ @dbt-labs/core-language
/core/dbt/events/ @dbt-labs/core-language # structured logging
/core/dbt/parser/ @dbt-labs/core-language
.changie.yaml @dbt-labs/guild-oss-tooling
# Language misc files
/core/dbt/dataclass_schema.py @dbt-labs/core-language
/core/dbt/hooks.py @dbt-labs/core-language
/core/dbt/node_types.py @dbt-labs/core-language
/core/dbt/semver.py @dbt-labs/core-language
### EXECUTION
# Execution core modules
/core/dbt/graph/ @dbt-labs/core-execution
/core/dbt/task/ @dbt-labs/core-execution
# Execution misc files
/core/dbt/compilation.py @dbt-labs/core-execution
/core/dbt/flags.py @dbt-labs/core-execution
/core/dbt/lib.py @dbt-labs/core-execution
/core/dbt/main.py @dbt-labs/core-execution
/core/dbt/profiler.py @dbt-labs/core-execution
/core/dbt/selected_resources.py @dbt-labs/core-execution
/core/dbt/tracking.py @dbt-labs/core-execution
/core/dbt/version.py @dbt-labs/core-execution
pre-commit-config.yaml @dbt-labs/guild-oss-tooling
pytest.ini @dbt-labs/guild-oss-tooling
tox.ini @dbt-labs/guild-oss-tooling
pyproject.toml @dbt-labs/guild-oss-tooling
requirements.txt @dbt-labs/guild-oss-tooling
dev_requirements.txt @dbt-labs/guild-oss-tooling
/core/setup.py @dbt-labs/guild-oss-tooling
/core/MANIFEST.in @dbt-labs/guild-oss-tooling
### ADAPTERS
@@ -60,6 +40,7 @@
# Postgres plugin
/plugins/ @dbt-labs/core-adapters
/plugins/postgres/setup.py @dbt-labs/core-adapters @dbt-labs/guild-oss-tooling
# Functional tests for adapter plugins
/tests/adapter @dbt-labs/core-adapters
@@ -71,7 +52,7 @@
# Perf regression testing framework
# This excludes the test project files itself since those aren't specific
# framework changes (excluded by not setting an owner next to it; no owner)
/performance @nathaniel-may
/performance @nathaniel-may
/performance/projects
### ARTIFACTS

2
.github/_README.md vendored
View File

@@ -197,7 +197,7 @@ ___
```yaml
- name: Configure AWS credentials from Test account
uses: aws-actions/configure-aws-credentials@v1
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

View File

@@ -35,7 +35,7 @@ jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v3
- name: Wrangle latest tag
id: is_latest
uses: ./.github/actions/latest-wrangler

View File

@@ -13,7 +13,7 @@ jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v3
- name: Wrangle latest tag
id: is_latest
uses: ./.github/actions/latest-wrangler

View File

@@ -1,23 +1,35 @@
resolves #
resolves #
[docs](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose) dbt-labs/docs.getdbt.com/#
<!---
Include the number of the issue addressed by this PR above if applicable.
PRs for code changes without an associated issue *will not be merged*.
See CONTRIBUTING.md for more information.
Include the number of the docs issue that was opened for this PR. If
this change has no user-facing implications, "N/A" suffices instead. New
docs tickets can be created by clicking the link above or by going to
https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose.
-->
### Description
### Problem
<!---
Describe the Pull Request here. Add any references and info to help reviewers
understand your changes. Include any tradeoffs you considered.
Describe the problem this PR is solving. What is the application state
before this PR is merged?
-->
### Solution
<!---
Describe the way this PR solves the above problem. Add as much detail as you
can to help reviewers understand your changes. Include any alternatives and
tradeoffs you considered.
-->
### Checklist
- [ ] I have read [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) and understand what's expected of me
- [ ] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements)
- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] I have read [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) and understand what's expected of me
- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] This PR includes tests, or tests are not required/relevant for this PR
- [ ] I have [opened an issue to add/update docs](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose), or docs changes are not required/relevant for this PR
- [ ] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-a-changelog-entry)
- [ ] This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

View File

@@ -35,6 +35,6 @@ jobs:
github.event.pull_request.merged
&& contains(github.event.label.name, 'backport')
steps:
- uses: tibdex/backport@v2.0.2
- uses: tibdex/backport@v2.0.3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -50,7 +50,7 @@ jobs:
- name: Create and commit changelog on bot PR
if: ${{ contains(github.event.pull_request.labels.*.name, matrix.label) }}
id: bot_changelog
uses: emmyoop/changie_bot@v1.0.1
uses: emmyoop/changie_bot@v1.1.0
with:
GITHUB_TOKEN: ${{ secrets.FISHTOWN_BOT_PAT }}
commit_author_name: "Github Build Bot"

View File

@@ -18,8 +18,8 @@ permissions:
issues: write
jobs:
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-creation.yml@main
call-creation-action:
uses: dbt-labs/actions/.github/workflows/jira-creation-actions.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}

View File

@@ -19,7 +19,7 @@ permissions:
jobs:
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-label.yml@main
uses: dbt-labs/actions/.github/workflows/jira-label-actions.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}

View File

@@ -19,8 +19,8 @@ on:
permissions: read-all
jobs:
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-transition.yml@main
call-transition-action:
uses: dbt-labs/actions/.github/workflows/jira-transition-actions.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}

View File

@@ -45,7 +45,7 @@ jobs:
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4.3.0
uses: actions/setup-python@v4
with:
python-version: '3.8'
@@ -69,18 +69,17 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11"]
env:
TOXENV: "unit"
PYTEST_ADDOPTS: "-v --color=yes --csv unit_results.csv"
steps:
- name: Check out the repository
uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4.3.0
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
@@ -101,22 +100,22 @@ jobs:
CURRENT_DATE=$(date +'%Y-%m-%dT%H_%M_%S') # no colons allowed for artifacts
echo "date=$CURRENT_DATE" >> $GITHUB_OUTPUT
- uses: actions/upload-artifact@v3
if: always()
with:
name: unit_results_${{ matrix.python-version }}-${{ steps.date.outputs.date }}.csv
path: unit_results.csv
- name: Upload Unit Test Coverage to Codecov
if: ${{ matrix.python-version == '3.11' }}
uses: codecov/codecov-action@v3
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
integration:
name: integration test / python ${{ matrix.python-version }} / ${{ matrix.os }}
runs-on: ${{ matrix.os }}
timeout-minutes: 45
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11"]
os: [ubuntu-20.04]
include:
- python-version: 3.8
@@ -126,18 +125,22 @@ jobs:
env:
TOXENV: integration
PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
DBT_INVOCATION_ENV: github-actions
DBT_TEST_USER_1: dbt_test_user_1
DBT_TEST_USER_2: dbt_test_user_2
DBT_TEST_USER_3: dbt_test_user_3
DD_CIVISIBILITY_AGENTLESS_ENABLED: true
DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
DD_SITE: datadoghq.com
DD_ENV: ci
DD_SERVICE: ${{ github.event.repository.name }}
steps:
- name: Check out the repository
uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4.3.0
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
@@ -161,7 +164,7 @@ jobs:
tox --version
- name: Run tests
run: tox
run: tox -- --ddtrace
- name: Get current date
if: always()
@@ -176,11 +179,11 @@ jobs:
name: logs_${{ matrix.python-version }}_${{ matrix.os }}_${{ steps.date.outputs.date }}
path: ./logs
- uses: actions/upload-artifact@v3
if: always()
with:
name: integration_results_${{ matrix.python-version }}_${{ matrix.os }}_${{ steps.date.outputs.date }}.csv
path: integration_results.csv
- name: Upload Integration Test Coverage to Codecov
if: ${{ matrix.python-version == '3.11' }}
uses: codecov/codecov-action@v3
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
build:
name: build packages
@@ -192,7 +195,7 @@ jobs:
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4.3.0
uses: actions/setup-python@v4
with:
python-version: '3.8'

265
.github/workflows/model_performance.yml vendored Normal file
View File

@@ -0,0 +1,265 @@
# **what?**
# This workflow models the performance characteristics of a point in time in dbt.
# It runs specific dbt commands on committed projects multiple times to create and
# commit information about the distribution to the current branch. For more information
# see the readme in the performance module at /performance/README.md.
#
# **why?**
# When developing new features, we can take quick performance samples and compare
# them against the committed baseline measurements produced by this workflow to detect
# some performance regressions at development time before they reach users.
#
# **when?**
# This is only run once directly after each release (for non-prereleases). If for some
# reason the results of a run are not satisfactory, it can also be triggered manually.
name: Model Performance Characteristics
on:
# runs after non-prereleases are published.
release:
types: [released]
# run manually from the actions tab
workflow_dispatch:
inputs:
release_id:
description: 'dbt version to model (must be non-prerelease in Pypi)'
type: string
required: true
env:
RUNNER_CACHE_PATH: performance/runner/target/release/runner
# both jobs need to write
permissions:
contents: write
pull-requests: write
jobs:
set-variables:
name: Setting Variables
runs-on: ubuntu-latest
outputs:
cache_key: ${{ steps.variables.outputs.cache_key }}
release_id: ${{ steps.semver.outputs.base-version }}
release_branch: ${{ steps.variables.outputs.release_branch }}
steps:
# explicitly checkout the performance runner from main regardless of which
# version we are modeling.
- name: Checkout
uses: actions/checkout@v3
with:
ref: main
- name: Parse version into parts
id: semver
uses: dbt-labs/actions/parse-semver@v1
with:
version: ${{ github.event.inputs.release_id || github.event.release.tag_name }}
# collect all the variables that need to be used in subsequent jobs
- name: Set variables
id: variables
run: |
# create a cache key that will be used in the next job. without this the
# next job would have to checkout from main and hash the files itself.
echo "cache_key=${{ runner.os }}-${{ hashFiles('performance/runner/Cargo.toml')}}-${{ hashFiles('performance/runner/src/*') }}" >> $GITHUB_OUTPUT
branch_name="${{steps.semver.outputs.major}}.${{steps.semver.outputs.minor}}.latest"
echo "release_branch=$branch_name" >> $GITHUB_OUTPUT
echo "release branch is inferred to be ${branch_name}"
latest-runner:
name: Build or Fetch Runner
runs-on: ubuntu-latest
needs: [set-variables]
env:
RUSTFLAGS: "-D warnings"
steps:
- name: '[DEBUG] print variables'
run: |
echo "all variables defined in set-variables"
echo "cache_key: ${{ needs.set-variables.outputs.cache_key }}"
echo "release_id: ${{ needs.set-variables.outputs.release_id }}"
echo "release_branch: ${{ needs.set-variables.outputs.release_branch }}"
# explicitly checkout the performance runner from main regardless of which
# version we are modeling.
- name: Checkout
uses: actions/checkout@v3
with:
ref: main
# attempts to access a previously cached runner
- uses: actions/cache@v3
id: cache
with:
path: ${{ env.RUNNER_CACHE_PATH }}
key: ${{ needs.set-variables.outputs.cache_key }}
- name: Fetch Rust Toolchain
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- name: Add fmt
if: steps.cache.outputs.cache-hit != 'true'
run: rustup component add rustfmt
- name: Cargo fmt
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/cargo@v1
with:
command: fmt
args: --manifest-path performance/runner/Cargo.toml --all -- --check
- name: Test
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/cargo@v1
with:
command: test
args: --manifest-path performance/runner/Cargo.toml
- name: Build (optimized)
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/cargo@v1
with:
command: build
args: --release --manifest-path performance/runner/Cargo.toml
# the cache action automatically caches this binary at the end of the job
model:
# depends on `latest-runner` as a separate job so that failures in this job do not prevent
# a successfully tested and built binary from being cached.
needs: [set-variables, latest-runner]
name: Model a release
runs-on: ubuntu-latest
steps:
- name: '[DEBUG] print variables'
run: |
echo "all variables defined in set-variables"
echo "cache_key: ${{ needs.set-variables.outputs.cache_key }}"
echo "release_id: ${{ needs.set-variables.outputs.release_id }}"
echo "release_branch: ${{ needs.set-variables.outputs.release_branch }}"
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dbt
run: pip install dbt-postgres==${{ needs.set-variables.outputs.release_id }}
- name: Install Hyperfine
run: wget https://github.com/sharkdp/hyperfine/releases/download/v1.11.0/hyperfine_1.11.0_amd64.deb && sudo dpkg -i hyperfine_1.11.0_amd64.deb
# explicitly checkout main to get the latest project definitions
- name: Checkout
uses: actions/checkout@v3
with:
ref: main
# this was built in the previous job so it will be there.
- name: Fetch Runner
uses: actions/cache@v3
id: cache
with:
path: ${{ env.RUNNER_CACHE_PATH }}
key: ${{ needs.set-variables.outputs.cache_key }}
- name: Move Runner
run: mv performance/runner/target/release/runner performance/app
- name: Change Runner Permissions
run: chmod +x ./performance/app
- name: '[DEBUG] ls baseline directory before run'
run: ls -R performance/baselines/
# `${{ github.workspace }}` is used to pass the absolute path
- name: Create directories
run: |
mkdir ${{ github.workspace }}/performance/tmp/
mkdir -p performance/baselines/${{ needs.set-variables.outputs.release_id }}/
# Run modeling, taking 20 samples
- name: Run Measurement
run: |
performance/app model -v ${{ needs.set-variables.outputs.release_id }} -b ${{ github.workspace }}/performance/baselines/ -p ${{ github.workspace }}/performance/projects/ -t ${{ github.workspace }}/performance/tmp/ -n 20
- name: '[DEBUG] ls baseline directory after run'
run: ls -R performance/baselines/
- uses: actions/upload-artifact@v3
with:
name: baseline
path: performance/baselines/${{ needs.set-variables.outputs.release_id }}/
create-pr:
name: Open PR for ${{ matrix.base-branch }}
# depends on `model` as a separate job so that the baseline can be committed to more than one branch
# i.e. release branch and main
needs: [set-variables, latest-runner, model]
runs-on: ubuntu-latest
strategy:
matrix:
include:
- base-branch: refs/heads/main
target-branch: performance-bot/main_${{ needs.set-variables.outputs.release_id }}_${{GITHUB.RUN_ID}}
- base-branch: refs/heads/${{ needs.set-variables.outputs.release_branch }}
target-branch: performance-bot/release_${{ needs.set-variables.outputs.release_id }}_${{GITHUB.RUN_ID}}
steps:
- name: '[DEBUG] print variables'
run: |
echo "all variables defined in set-variables"
echo "cache_key: ${{ needs.set-variables.outputs.cache_key }}"
echo "release_id: ${{ needs.set-variables.outputs.release_id }}"
echo "release_branch: ${{ needs.set-variables.outputs.release_branch }}"
- name: Checkout
uses: actions/checkout@v3
with:
ref: ${{ matrix.base-branch }}
- name: Create PR branch
run: |
git checkout -b ${{ matrix.target-branch }}
git push origin ${{ matrix.target-branch }}
git branch --set-upstream-to=origin/${{ matrix.target-branch }} ${{ matrix.target-branch }}
- uses: actions/download-artifact@v3
with:
name: baseline
path: performance/baselines/${{ needs.set-variables.outputs.release_id }}
- name: '[DEBUG] ls baselines after artifact download'
run: ls -R performance/baselines/
- name: Commit baseline
uses: EndBug/add-and-commit@v9
with:
add: 'performance/baselines/*'
author_name: 'Github Build Bot'
author_email: 'buildbot@fishtownanalytics.com'
message: 'adding performance baseline for ${{ needs.set-variables.outputs.release_id }}'
push: 'origin origin/${{ matrix.target-branch }}'
- name: Create Pull Request
uses: peter-evans/create-pull-request@v5
with:
author: 'Github Build Bot <buildbot@fishtownanalytics.com>'
base: ${{ matrix.base-branch }}
branch: '${{ matrix.target-branch }}'
title: 'Adding performance modeling for ${{needs.set-variables.outputs.release_id}} to ${{ matrix.base-branch }}'
body: 'Committing perf results for tracking for the ${{needs.set-variables.outputs.release_id}}'
labels: |
Skip Changelog
Performance
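The header comments in this workflow describe the approach: commit timing distributions per release so that quick samples taken during development can be checked against a known baseline. A minimal sketch of that comparison, assuming a baseline stored as a mean and standard deviation per dbt command (the JSON field names are illustrative, not the Rust runner's actual schema):

```python
import json
from pathlib import Path


def is_regression(baseline_path: Path, command: str, sample_secs: float, n_sigma: float = 3.0) -> bool:
    """Flag a timing sample that is more than n_sigma above the committed mean.

    Assumes entries of the form {"command": ..., "mean": ..., "stddev": ...};
    the runner's real on-disk format may differ.
    """
    for entry in json.loads(baseline_path.read_text()):
        if entry["command"] == command:
            return sample_secs > entry["mean"] + n_sigma * entry["stddev"]
    raise KeyError(f"no baseline recorded for {command!r}")

# e.g. is_regression(Path("performance/baselines/1.7.0/parse.json"), "dbt parse", 12.4)
```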

View File

@@ -1,11 +1,7 @@
# **what?**
# The purpose of this workflow is to trigger CI to run for each
# release branch and main branch on a regular cadence. If the CI workflow
# fails for a branch, it will post to dev-core-alerts to raise awareness.
# The 'aurelien-baudet/workflow-dispatch' Action triggers the existing
# CI workflow file on the given branch to run so that even if we change the
# CI workflow file in the future, the one that is tailored for the given
# release branch will be used.
# fails for a branch, it will post to #dev-core-alerts to raise awareness.
# **why?**
# Ensures release branches and main are always shippable and not broken.
@@ -28,63 +24,8 @@ on:
permissions: read-all
jobs:
fetch-latest-branches:
runs-on: ubuntu-latest
outputs:
latest-branches: ${{ steps.get-latest-branches.outputs.repo-branches }}
steps:
- name: "Fetch dbt-core Latest Branches"
uses: dbt-labs/actions/fetch-repo-branches@v1.1.1
id: get-latest-branches
with:
repo_name: ${{ github.event.repository.name }}
organization: "dbt-labs"
pat: ${{ secrets.GITHUB_TOKEN }}
fetch_protected_branches_only: true
regex: "^1.[0-9]+.latest$"
perform_match_method: "match"
retries: 3
- name: "[ANNOTATION] ${{ github.event.repository.name }} - branches to test"
run: |
title="${{ github.event.repository.name }} - branches to test"
message="The workflow will run tests for the following branches of the ${{ github.event.repository.name }} repo: ${{ steps.get-latest-branches.outputs.repo-branches }}"
echo "::notice $title::$message"
kick-off-ci:
needs: [fetch-latest-branches]
name: Kick-off CI
runs-on: ubuntu-latest
strategy:
# must run CI 1 branch at a time b/c the workflow-dispatch Action polls for
# latest run for results and it gets confused when we kick off multiple runs
# at once. There is a race condition so we will just run in sequential order.
max-parallel: 1
fail-fast: false
matrix:
branch: ${{ fromJSON(needs.fetch-latest-branches.outputs.latest-branches) }}
include:
- branch: 'main'
steps:
- name: Call CI workflow for ${{ matrix.branch }} branch
id: trigger-step
uses: aurelien-baudet/workflow-dispatch@v2.1.1
with:
workflow: main.yml
ref: ${{ matrix.branch }}
token: ${{ secrets.FISHTOWN_BOT_PAT }}
- name: Post failure to Slack
uses: ravsamhq/notify-slack-action@v1
if: ${{ always() && !contains(steps.trigger-step.outputs.workflow-conclusion,'success') }}
with:
status: ${{ job.status }}
notification_title: 'dbt-core scheduled run of "${{ matrix.branch }}" branch not successful'
message_format: ':x: CI on branch "${{ matrix.branch }}" ${{ steps.trigger-step.outputs.workflow-conclusion }}'
footer: 'Linked failed CI run ${{ steps.trigger-step.outputs.workflow-url }}'
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_DEV_CORE_ALERTS }}
run_tests:
uses: dbt-labs/actions/.github/workflows/release-branch-tests.yml@main
with:
workflows_to_run: '["main.yml"]'
secrets: inherit

View File

@@ -36,7 +36,7 @@ jobs:
latest: ${{ steps.latest.outputs.latest }}
minor_latest: ${{ steps.latest.outputs.minor_latest }}
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v3
- name: Split version
id: version
run: |
@@ -60,7 +60,7 @@ jobs:
needs: [get_version_meta]
steps:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
uses: docker/setup-buildx-action@v2
build_and_push:
name: Build images and push to GHCR
@@ -76,14 +76,14 @@ jobs:
echo "build_arg_value=$BUILD_ARG_VALUE" >> $GITHUB_OUTPUT
- name: Log in to the GHCR
uses: docker/login-action@v1
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push MAJOR.MINOR.PATCH tag
uses: docker/build-push-action@v2
uses: docker/build-push-action@v4
with:
file: docker/Dockerfile
push: True
@@ -94,7 +94,7 @@ jobs:
ghcr.io/dbt-labs/${{ github.event.inputs.package }}:${{ github.event.inputs.version_number }}
- name: Build and push MINOR.latest tag
uses: docker/build-push-action@v2
uses: docker/build-push-action@v4
if: ${{ needs.get_version_meta.outputs.minor_latest == 'True' }}
with:
file: docker/Dockerfile
@@ -106,7 +106,7 @@ jobs:
ghcr.io/dbt-labs/${{ github.event.inputs.package }}:${{ needs.get_version_meta.outputs.major }}.${{ needs.get_version_meta.outputs.minor }}.latest
- name: Build and push latest tag
uses: docker/build-push-action@v2
uses: docker/build-push-action@v4
if: ${{ needs.get_version_meta.outputs.latest == 'True' }}
with:
file: docker/Dockerfile
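The three `build-push-action` steps above encode the image tagging scheme: always push the MAJOR.MINOR.PATCH tag, add MINOR.latest when the version is the newest in its minor line, and add latest when it is the newest overall. A sketch of that decision, with `is_minor_latest`/`is_latest` standing in for the flags the `get_version_meta` job computes:

```python
def docker_tags(package: str, version: str, is_minor_latest: bool, is_latest: bool) -> list[str]:
    """Return the GHCR tags to push, mirroring the three conditional steps above."""
    major, minor, _patch = version.split(".", 2)
    tags = [f"ghcr.io/dbt-labs/{package}:{version}"]
    if is_minor_latest:
        tags.append(f"ghcr.io/dbt-labs/{package}:{major}.{minor}.latest")
    if is_latest:
        tags.append(f"ghcr.io/dbt-labs/{package}:latest")
    return tags

# docker_tags("dbt-postgres", "1.6.2", True, False)
# -> ['ghcr.io/dbt-labs/dbt-postgres:1.6.2', 'ghcr.io/dbt-labs/dbt-postgres:1.6.latest']
```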

View File

@@ -37,17 +37,17 @@ jobs:
steps:
- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Checkout dbt repo
uses: actions/checkout@v2.3.4
uses: actions/checkout@v3
with:
path: ${{ env.DBT_REPO_DIRECTORY }}
- name: Checkout schemas.getdbt.com repo
uses: actions/checkout@v2.3.4
uses: actions/checkout@v3
with:
repository: dbt-labs/schemas.getdbt.com
ref: 'main'
@@ -83,7 +83,7 @@ jobs:
fi
- name: Upload schema diff
uses: actions/upload-artifact@v2.2.4
uses: actions/upload-artifact@v3
if: ${{ failure() }}
with:
name: 'schema_schanges.txt'

View File

@@ -39,12 +39,12 @@ jobs:
steps:
- name: checkout dev
uses: actions/checkout@v2
uses: actions/checkout@v3
with:
persist-credentials: false
- name: Setup Python
uses: actions/setup-python@v2.2.2
uses: actions/setup-python@v4
with:
python-version: "3.8"

155
.github/workflows/test-repeater.yml vendored Normal file
View File

@@ -0,0 +1,155 @@
# **what?**
# This workflow will run the test(s) at the input path a given number of times to determine whether they are flaky. You can test with any supported OS/Python combination.
# The runs are split into 10 batches so more test iterations can complete faster.
# **why?**
# Testing if a test is flaky and if a previously flaky test has been fixed. This allows easy testing on supported python versions and OS combinations.
# **when?**
# This is triggered manually from dbt-core.
name: Flaky Tester
on:
workflow_dispatch:
inputs:
branch:
description: 'Branch to check out'
type: string
required: true
default: 'main'
test_path:
description: 'Path to single test to run (ex: tests/functional/retry/test_retry.py::TestRetry::test_fail_fast)'
type: string
required: true
default: 'tests/functional/...'
python_version:
description: 'Version of Python to Test Against'
type: choice
options:
- '3.8'
- '3.9'
- '3.10'
- '3.11'
os:
description: 'OS to run test in'
type: choice
options:
- 'ubuntu-latest'
- 'macos-latest'
- 'windows-latest'
num_runs_per_batch:
description: 'Max number of times to run the test per batch. We always run 10 batches.'
type: number
required: true
default: '50'
permissions: read-all
defaults:
run:
shell: bash
jobs:
debug:
runs-on: ubuntu-latest
steps:
- name: "[DEBUG] Output Inputs"
run: |
echo "Branch: ${{ inputs.branch }}"
echo "test_path: ${{ inputs.test_path }}"
echo "python_version: ${{ inputs.python_version }}"
echo "os: ${{ inputs.os }}"
echo "num_runs_per_batch: ${{ inputs.num_runs_per_batch }}"
pytest:
runs-on: ${{ inputs.os }}
strategy:
# run all batches, even if one fails. This informs how flaky the test may be.
fail-fast: false
# using a matrix to speed up the jobs since the matrix will run in parallel when runners are available
matrix:
batch: ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
env:
PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
DBT_TEST_USER_1: dbt_test_user_1
DBT_TEST_USER_2: dbt_test_user_2
DBT_TEST_USER_3: dbt_test_user_3
DD_CIVISIBILITY_AGENTLESS_ENABLED: true
DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
DD_SITE: datadoghq.com
DD_ENV: ci
DD_SERVICE: ${{ github.event.repository.name }}
steps:
- name: "Checkout code"
uses: actions/checkout@v3
with:
ref: ${{ inputs.branch }}
- name: "Setup Python"
uses: actions/setup-python@v4
with:
python-version: "${{ inputs.python_version }}"
- name: "Setup Dev Environment"
run: make dev
- name: "Set up postgres (linux)"
if: inputs.os == 'ubuntu-latest'
run: make setup-db
# mac and windows don't use make due to limitations with docker on those runners in GitHub
- name: "Set up postgres (macos)"
if: inputs.os == 'macos-latest'
uses: ./.github/actions/setup-postgres-macos
- name: "Set up postgres (windows)"
if: inputs.os == 'windows-latest'
uses: ./.github/actions/setup-postgres-windows
- name: "Test Command"
id: command
run: |
test_command="python -m pytest ${{ inputs.test_path }}"
echo "test_command=$test_command" >> $GITHUB_OUTPUT
- name: "Run test ${{ inputs.num_runs_per_batch }} times"
id: pytest
run: |
set +e
for ((i=1; i<=${{ inputs.num_runs_per_batch }}; i++))
do
echo "Running pytest iteration $i..."
python -m pytest --ddtrace ${{ inputs.test_path }}
exit_code=$?
if [[ $exit_code -eq 0 ]]; then
success=$((success + 1))
echo "Iteration $i: Success"
else
failure=$((failure + 1))
echo "Iteration $i: Failure"
fi
echo
echo "==========================="
echo "Successful runs: $success"
echo "Failed runs: $failure"
echo "==========================="
echo
done
echo "failure=$failure" >> $GITHUB_OUTPUT
- name: "Success and Failure Summary: ${{ inputs.os }}/Python ${{ inputs.python_version }}"
run: |
echo "Batch: ${{ matrix.batch }}"
echo "Successful runs: ${{ steps.pytest.outputs.success }}"
echo "Failed runs: ${{ steps.pytest.outputs.failure }}"
- name: "Error for Failures"
if: ${{ steps.pytest.outputs.failure }}
run: |
echo "Batch ${{ matrix.batch }} failed ${{ steps.pytest.outputs.failure }} of ${{ inputs.num_runs_per_batch }} tests"
exit 1
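The batch loop above is plain shell; here is the same tally logic in Python, for repeating a single test locally the way the workflow does (the pytest invocation is a plain example, not the workflow's exact command line):

```python
import subprocess
import sys


def repeat_test(test_path: str, runs: int) -> tuple[int, int]:
    """Run one pytest target `runs` times and tally passes/failures,
    mirroring the shell loop in the workflow above."""
    passed = failed = 0
    for i in range(1, runs + 1):
        result = subprocess.run([sys.executable, "-m", "pytest", test_path])
        if result.returncode == 0:
            passed += 1
        else:
            failed += 1
        print(f"Iteration {i}: {'Success' if result.returncode == 0 else 'Failure'}")
    return passed, failed

# passed, failed = repeat_test("tests/functional/retry/test_retry.py::TestRetry::test_fail_fast", 50)
```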

View File

@@ -24,10 +24,8 @@ permissions:
jobs:
triage_label:
if: contains(github.event.issue.labels.*.name, 'awaiting_response')
runs-on: ubuntu-latest
steps:
- name: initial labeling
uses: andymckay/labeler@master
with:
add-labels: "triage"
remove-labels: "awaiting_response"
uses: dbt-labs/actions/.github/workflows/swap-labels.yml@main
with:
add_label: "triage"
remove_label: "awaiting_response"
secrets: inherit

3
.gitignore vendored
View File

@@ -11,6 +11,7 @@ __pycache__/
env*/
dbt_env/
build/
!tests/functional/build
!core/dbt/docs/build
develop-eggs/
dist/
@@ -28,6 +29,8 @@ var/
.mypy_cache/
.dmypy.json
logs/
.user.yml
profiles.yml
# PyInstaller
# Usually these files are written by a python script from a template

View File

@@ -37,7 +37,7 @@ repos:
alias: flake8-check
stages: [manual]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v0.981
rev: v1.3.0
hooks:
- id: mypy
# N.B.: Mypy is... a bit fragile.

View File

@@ -5,14 +5,12 @@
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry)
## dbt-core 1.6.0-a1 - April 17, 2023
## Previous Releases
For information on prior major and minor releases, see their changelogs:
* [1.6](https://github.com/dbt-labs/dbt-core/blob/1.6.latest/CHANGELOG.md)
* [1.5](https://github.com/dbt-labs/dbt-core/blob/1.5.latest/CHANGELOG.md)
* [1.4](https://github.com/dbt-labs/dbt-core/blob/1.4.latest/CHANGELOG.md)
* [1.3](https://github.com/dbt-labs/dbt-core/blob/1.3.latest/CHANGELOG.md)

View File

@@ -5,10 +5,10 @@
1. [About this document](#about-this-document)
2. [Getting the code](#getting-the-code)
3. [Setting up an environment](#setting-up-an-environment)
4. [Running `dbt` in development](#running-dbt-core-in-development)
4. [Running dbt-core in development](#running-dbt-core-in-development)
5. [Testing dbt-core](#testing)
6. [Debugging](#debugging)
7. [Adding a changelog entry](#adding-a-changelog-entry)
7. [Adding or modifying a changelog entry](#adding-or-modifying-a-changelog-entry)
8. [Submitting a Pull Request](#submitting-a-pull-request)
## About this document
@@ -56,7 +56,7 @@ There are some tools that will be helpful to you in developing locally. While th
These are the tools used in `dbt-core` development and testing:
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.7, 3.8, 3.9, 3.10 and 3.11
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.8, 3.9, 3.10 and 3.11
- [`pytest`](https://docs.pytest.org/en/latest/) to define, discover, and run tests
- [`flake8`](https://flake8.pycqa.org/en/latest/) for code linting
- [`black`](https://github.com/psf/black) for code formatting
@@ -113,7 +113,7 @@ When installed in this way, any changes you make to your local copy of the sourc
With your virtualenv activated, the `dbt` script should point back to the source code you've cloned on your machine. You can verify this by running `which dbt`. This command should show you a path to an executable in your virtualenv.
Configure your [profile](https://docs.getdbt.com/docs/configure-your-profile) as necessary to connect to your target databases. It may be a good idea to add a new profile pointing to a local Postgres instance, or a specific test sandbox within your data warehouse if appropriate.
Configure your [profile](https://docs.getdbt.com/docs/configure-your-profile) as necessary to connect to your target databases. It may be a good idea to add a new profile pointing to a local Postgres instance, or a specific test sandbox within your data warehouse if appropriate. Make sure to create a profile before running integration tests.
## Testing
@@ -163,7 +163,7 @@ suites.
#### `tox`
[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel; for example, you can run unit tests for Python 3.7, Python 3.8, Python 3.9, Python 3.10 and Python 3.11 in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py37`. The configuration for these tests is located in `tox.ini`.
[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel; for example, you can run unit tests for Python 3.8, Python 3.9, Python 3.10 and Python 3.11 in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py38`. The configuration for these tests is located in `tox.ini`.
#### `pytest`
@@ -171,12 +171,10 @@ Finally, you can also run a specific test or group of tests using [`pytest`](htt
```sh
# run all unit tests in a file
python3 -m pytest test/unit/test_graph.py
python3 -m pytest tests/unit/test_graph.py
# run a specific unit test
python3 -m pytest test/unit/test_graph.py::GraphTest::test__dependency_list
# run specific Postgres integration tests (old way)
python3 -m pytest -m profile_postgres test/integration/074_postgres_unlogged_table_tests
# run specific Postgres integration tests (new way)
python3 -m pytest tests/unit/test_graph.py::GraphTest::test__dependency_list
# run specific Postgres functional tests
python3 -m pytest tests/functional/sources
```
@@ -185,9 +183,8 @@ python3 -m pytest tests/functional/sources
### Unit, Integration, Functional?
Here are some general rules for adding tests:
* unit tests (`test/unit` & `tests/unit`) don't need to access a database; "pure Python" tests should be written as unit tests
* functional tests (`test/integration` & `tests/functional`) cover anything that interacts with a database, namely adapter code
* *everything in* `test/*` *is being steadily migrated to* `tests/*`
* unit tests (`tests/unit`) don't need to access a database; "pure Python" tests should be written as unit tests (see the sketch below)
* functional tests (`tests/functional`) cover anything that interacts with a database, namely adapter code
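To illustrate the first rule, here is a minimal sketch of a database-free unit test that could live under `tests/unit`. The file name, and the assumption that `dbt.utils.filter_null_values` drops `None` entries from a dict, are illustrative rather than part of the contributing guide.

```py
# tests/unit/test_example_utils.py -- hypothetical file shown for illustration
from dbt.utils import filter_null_values


def test_filter_null_values_drops_none_entries():
    # a "pure Python" check: no database connection or profile is required
    assert filter_null_values({"a": 1, "b": None}) == {"a": 1}
```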
## Debugging

View File

@@ -3,13 +3,13 @@
# See `/docker` for a generic and production-ready docker file
##
FROM ubuntu:23.04
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
software-properties-common \
software-properties-common gpg-agent \
&& add-apt-repository ppa:git-core/ppa -y \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
@@ -30,16 +30,9 @@ RUN apt-get update \
unixodbc-dev \
&& add-apt-repository ppa:deadsnakes/ppa \
&& apt-get install -y \
python \
python-dev \
python-is-python3 \
python-dev-is-python3 \
python3-pip \
python3.6 \
python3.6-dev \
python3-pip \
python3.6-venv \
python3.7 \
python3.7-dev \
python3.7-venv \
python3.8 \
python3.8-dev \
python3.8-venv \

View File

@@ -71,7 +71,7 @@ from dbt.adapters.base.relation import (
from dbt.adapters.base import Column as BaseColumn
from dbt.adapters.base import Credentials
from dbt.adapters.cache import RelationsCache, _make_ref_key_dict
from dbt import deprecations
GET_CATALOG_MACRO_NAME = "get_catalog"
FRESHNESS_MACRO_NAME = "collect_freshness"
@@ -274,7 +274,7 @@ class BaseAdapter(metaclass=AdapterMeta):
@available.parse(lambda *a, **k: ("", empty_table()))
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
self, sql: str, auto_begin: bool = False, fetch: bool = False, limit: Optional[int] = None
) -> Tuple[AdapterResponse, agate.Table]:
"""Execute the given SQL. This is a thin wrapper around
ConnectionManager.execute.
@@ -283,10 +283,22 @@ class BaseAdapter(metaclass=AdapterMeta):
:param bool auto_begin: If set, and dbt is not currently inside a
transaction, automatically begin one.
:param bool fetch: If set, fetch results.
:param Optional[int] limit: If set, fetch at most this many rows
:return: A tuple of the query status and results (empty if fetch=False).
:rtype: Tuple[AdapterResponse, agate.Table]
"""
return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch)
return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch, limit=limit)
def validate_sql(self, sql: str) -> AdapterResponse:
"""Submit the given SQL to the engine for validation, but not execution.
This should throw an appropriate exception if the input SQL is invalid, although
in practice that will generally be handled by delegating to an existing method
for execution and allowing the error handler to take care of the rest.
:param str sql: The sql to validate
"""
raise NotImplementedError("`validate_sql` is not implemented for this adapter!")
@available.parse(lambda *a, **k: [])
def get_column_schema_from_query(self, sql: str) -> List[BaseColumn]:
@@ -383,7 +395,7 @@ class BaseAdapter(metaclass=AdapterMeta):
return {
self.Relation.create_from(self.config, node).without_identifier()
for node in manifest.nodes.values()
if (node.is_relational and not node.is_ephemeral_model)
if (node.is_relational and not node.is_ephemeral_model and not node.is_external_node)
}
def _get_catalog_schemas(self, manifest: Manifest) -> SchemaSearchMap:
@@ -414,7 +426,7 @@ class BaseAdapter(metaclass=AdapterMeta):
return info_schema_name_map
def _relations_cache_for_schemas(
self, manifest: Manifest, cache_schemas: Set[BaseRelation] = None
self, manifest: Manifest, cache_schemas: Optional[Set[BaseRelation]] = None
) -> None:
"""Populate the relations cache for the given schemas. Returns an
iterable of the schemas populated, as strings.
@@ -450,7 +462,7 @@ class BaseAdapter(metaclass=AdapterMeta):
self,
manifest: Manifest,
clear: bool = False,
required_schemas: Set[BaseRelation] = None,
required_schemas: Optional[Set[BaseRelation]] = None,
) -> None:
"""Run a query that gets a populated cache of the relations in the
database and set the cache on this adapter.
@@ -784,7 +796,6 @@ class BaseAdapter(metaclass=AdapterMeta):
schema: str,
identifier: str,
) -> List[BaseRelation]:
matches = []
search = self._make_match_kwargs(database, schema, identifier)
@@ -985,7 +996,7 @@ class BaseAdapter(metaclass=AdapterMeta):
manifest: Optional[Manifest] = None,
project: Optional[str] = None,
context_override: Optional[Dict[str, Any]] = None,
kwargs: Dict[str, Any] = None,
kwargs: Optional[Dict[str, Any]] = None,
text_only_columns: Optional[Iterable[str]] = None,
) -> AttrDict:
"""Look macro_name up in the manifest and execute its results.
@@ -1062,7 +1073,6 @@ class BaseAdapter(metaclass=AdapterMeta):
schemas: Set[str],
manifest: Manifest,
) -> agate.Table:
kwargs = {"information_schema": information_schema, "schemas": schemas}
table = self.execute_macro(
GET_CATALOG_MACRO_NAME,
@@ -1104,7 +1114,7 @@ class BaseAdapter(metaclass=AdapterMeta):
loaded_at_field: str,
filter: Optional[str],
manifest: Optional[Manifest] = None,
) -> Tuple[AdapterResponse, Dict[str, Any]]:
) -> Tuple[Optional[AdapterResponse], Dict[str, Any]]:
"""Calculate the freshness of sources in dbt, and return it"""
kwargs: Dict[str, Any] = {
"source": source,
@@ -1113,8 +1123,19 @@ class BaseAdapter(metaclass=AdapterMeta):
}
# run the macro
# in older versions of dbt-core, the 'collect_freshness' macro returned the table of results directly
# starting in v1.5, by default, we return both the table and the adapter response (metadata about the query)
result: Union[
AttrDict, # current: contains AdapterResponse + agate.Table
agate.Table, # previous: just table
]
result = self.execute_macro(FRESHNESS_MACRO_NAME, kwargs=kwargs, manifest=manifest)
adapter_response, table = result.response, result.table # type: ignore[attr-defined]
if isinstance(result, agate.Table):
deprecations.warn("collect-freshness-return-signature")
adapter_response = None
table = result
else:
adapter_response, table = result.response, result.table # type: ignore[attr-defined]
# now we have a 1-row table of the maximum `loaded_at_field` value and
# the current time according to the db.
if len(table) != 1 or len(table[0]) != 2:
@@ -1307,20 +1328,26 @@ class BaseAdapter(metaclass=AdapterMeta):
def render_column_constraint(cls, constraint: ColumnLevelConstraint) -> Optional[str]:
"""Render the given constraint as DDL text. Should be overriden by adapters which need custom constraint
rendering."""
if constraint.type == ConstraintType.check and constraint.expression:
return f"check {constraint.expression}"
constraint_expression = constraint.expression or ""
rendered_column_constraint = None
if constraint.type == ConstraintType.check and constraint_expression:
rendered_column_constraint = f"check ({constraint_expression})"
elif constraint.type == ConstraintType.not_null:
return "not null"
rendered_column_constraint = f"not null {constraint_expression}"
elif constraint.type == ConstraintType.unique:
return "unique"
rendered_column_constraint = f"unique {constraint_expression}"
elif constraint.type == ConstraintType.primary_key:
return "primary key"
elif constraint.type == ConstraintType.foreign_key:
return "foreign key"
elif constraint.type == ConstraintType.custom and constraint.expression:
return constraint.expression
else:
return None
rendered_column_constraint = f"primary key {constraint_expression}"
elif constraint.type == ConstraintType.foreign_key and constraint_expression:
rendered_column_constraint = f"references {constraint_expression}"
elif constraint.type == ConstraintType.custom and constraint_expression:
rendered_column_constraint = constraint_expression
if rendered_column_constraint:
rendered_column_constraint = rendered_column_constraint.strip()
return rendered_column_constraint
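As a rough illustration of the generic behaviour above, the sketch below shows the strings the base implementation would now produce for a couple of constraints. The import path for `ColumnLevelConstraint` and `ConstraintType` is an assumption and may differ between dbt-core versions; treat this as a sketch, not adapter guidance.

```py
# illustrative only; the contracts import path is assumed
from dbt.adapters.base import BaseAdapter
from dbt.contracts.graph.nodes import ColumnLevelConstraint, ConstraintType

check = ColumnLevelConstraint(type=ConstraintType.check, expression="id > 0")
fk = ColumnLevelConstraint(type=ConstraintType.foreign_key, expression="other_table (id)")

BaseAdapter.render_column_constraint(check)  # "check (id > 0)"
BaseAdapter.render_column_constraint(fk)     # "references other_table (id)"
```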
@available
@classmethod
@@ -1328,7 +1355,8 @@ class BaseAdapter(metaclass=AdapterMeta):
rendered_column_constraints = []
for v in raw_columns.values():
rendered_column_constraint = [f"{v['name']} {v['data_type']}"]
col_name = cls.quote(v["name"]) if v.get("quote") else v["name"]
rendered_column_constraint = [f"{col_name} {v['data_type']}"]
for con in v.get("constraints", None):
constraint = cls._parse_column_constraint(con)
c = cls.process_parsed_constraint(constraint, cls.render_column_constraint)
@@ -1387,13 +1415,15 @@ class BaseAdapter(metaclass=AdapterMeta):
constraint_prefix = f"constraint {constraint.name} " if constraint.name else ""
column_list = ", ".join(constraint.columns)
if constraint.type == ConstraintType.check and constraint.expression:
return f"{constraint_prefix}check {constraint.expression}"
return f"{constraint_prefix}check ({constraint.expression})"
elif constraint.type == ConstraintType.unique:
return f"{constraint_prefix}unique ({column_list})"
constraint_expression = f" {constraint.expression}" if constraint.expression else ""
return f"{constraint_prefix}unique{constraint_expression} ({column_list})"
elif constraint.type == ConstraintType.primary_key:
return f"{constraint_prefix}primary key ({column_list})"
elif constraint.type == ConstraintType.foreign_key:
return f"{constraint_prefix}foreign key ({column_list})"
constraint_expression = f" {constraint.expression}" if constraint.expression else ""
return f"{constraint_prefix}primary key{constraint_expression} ({column_list})"
elif constraint.type == ConstraintType.foreign_key and constraint.expression:
return f"{constraint_prefix}foreign key ({column_list}) references {constraint.expression}"
elif constraint.type == ConstraintType.custom and constraint.expression:
return f"{constraint_prefix}{constraint.expression}"
else:
@@ -1432,7 +1462,6 @@ join diff_count using (id)
def catch_as_completed(
futures, # typing: List[Future[agate.Table]]
) -> Tuple[agate.Table, List[Exception]]:
# catalogs: agate.Table = agate.Table(rows=[])
tables: List[agate.Table] = []
exceptions: List[Exception] = []

View File

@@ -227,7 +227,7 @@ class BaseRelation(FakeAPIObject, Hashable):
def create_from_node(
cls: Type[Self],
config: HasQuoting,
node: ManifestNode,
node,
quote_policy: Optional[Dict[str, bool]] = None,
**kwargs: Any,
) -> Self:
@@ -328,6 +328,10 @@ class BaseRelation(FakeAPIObject, Hashable):
def is_view(self) -> bool:
return self.type == RelationType.View
@property
def is_materialized_view(self) -> bool:
return self.type == RelationType.MaterializedView
@classproperty
def Table(cls) -> str:
return str(RelationType.Table)
@@ -344,6 +348,10 @@ class BaseRelation(FakeAPIObject, Hashable):
def External(cls) -> str:
return str(RelationType.External)
@classproperty
def MaterializedView(cls) -> str:
return str(RelationType.MaterializedView)
@classproperty
def get_relation_type(cls) -> Type[RelationType]:
return RelationType

View File

@@ -9,10 +9,11 @@ from dbt.adapters.base.plugin import AdapterPlugin
from dbt.adapters.protocol import AdapterConfig, AdapterProtocol, RelationProtocol
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
from dbt.events.functions import fire_event
from dbt.events.types import AdapterImportError, PluginLoadError
from dbt.events.types import AdapterImportError, PluginLoadError, AdapterRegistered
from dbt.exceptions import DbtInternalError, DbtRuntimeError
from dbt.include.global_project import PACKAGE_PATH as GLOBAL_PROJECT_PATH
from dbt.include.global_project import PROJECT_NAME as GLOBAL_PROJECT_NAME
from dbt.semver import VersionSpecifier
Adapter = AdapterProtocol
@@ -89,7 +90,13 @@ class AdapterContainer:
def register_adapter(self, config: AdapterRequiredConfig) -> None:
adapter_name = config.credentials.type
adapter_type = self.get_adapter_class_by_name(adapter_name)
adapter_version = import_module(f".{adapter_name}.__version__", "dbt.adapters").version
adapter_version_specifier = VersionSpecifier.from_version_string(
adapter_version
).to_version_string()
fire_event(
AdapterRegistered(adapter_name=adapter_name, adapter_version=adapter_version_specifier)
)
with self.lock:
if adapter_name in self.adapters:
# this shouldn't really happen...
@@ -158,6 +165,9 @@ class AdapterContainer:
def get_adapter_type_names(self, name: Optional[str]) -> List[str]:
return [p.adapter.type() for p in self.get_adapter_plugins(name)]
def get_adapter_constraint_support(self, name: Optional[str]) -> List[str]:
return self.lookup_adapter(name).CONSTRAINT_SUPPORT # type: ignore
FACTORY: AdapterContainer = AdapterContainer()
@@ -214,6 +224,10 @@ def get_adapter_type_names(name: Optional[str]) -> List[str]:
return FACTORY.get_adapter_type_names(name)
def get_adapter_constraint_support(name: Optional[str]) -> List[str]:
return FACTORY.get_adapter_constraint_support(name)
@contextmanager
def adapter_management():
reset_adapters()

View File

@@ -0,0 +1,25 @@
# RelationConfig
This package serves as an initial abstraction for managing the inspection of existing relations and determining
changes on those relations. It arose from the materialized view work and currently only supports
materialized views for Postgres and Redshift, as well as dynamic tables for Snowflake. There are three main
classes in this package.
## RelationConfigBase
This is a very small class that only has a `from_dict()` method and a default `NotImplementedError()`. At some
point this could be replaced by a more robust framework, like `mashumaro` or `pydantic`.
## RelationConfigChange
This class inherits from `RelationConfigBase`; however, it can be thought of as a separate class. The subclassing
merely points to the idea that both classes would likely inherit from the same class in a `mashumaro` or
`pydantic` implementation. This class is much more restricted in its attributes: it should really only
ever need an `action` and a `context`. This can be thought of as analogous to a web request. You need to
know what you're doing (`action`: 'create' = GET, 'drop' = DELETE, etc.) and the information (`context`) needed
to make the change. In our scenarios, the context tends to be an instance of `RelationConfigBase` corresponding
to the new state.
## RelationConfigValidationMixin
This mixin provides optional validation mechanics that can be applied to either `RelationConfigBase` or
`RelationConfigChange` subclasses. A validation rule is a combination of a `validation_check`, something
that should evaluate to `True`, and an optional `validation_error`, an instance of `DbtRuntimeError`
that should be raised in the event the `validation_check` fails. While optional, it's recommended that
the `validation_error` be provided for clearer transparency to the end user.
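To make these pieces concrete, here is a minimal, hypothetical sketch that combines `RelationConfigBase` with the validation mixin. The class and field names are invented for illustration; only the import paths mirror what this package exposes.

```py
from dataclasses import dataclass
from typing import Optional, Set

from dbt.adapters.relation_configs import (
    RelationConfigBase,
    RelationConfigValidationMixin,
    RelationConfigValidationRule,
)
from dbt.exceptions import DbtRuntimeError


@dataclass(frozen=True, eq=True, unsafe_hash=True)
class ExampleMaterializedViewConfig(RelationConfigBase, RelationConfigValidationMixin):
    # hypothetical adapter-side config; real adapters define their own fields
    mv_name: str
    query: Optional[str] = None

    @property
    def validation_rules(self) -> Set[RelationConfigValidationRule]:
        # each rule pairs a boolean check with an optional error raised on failure
        return {
            RelationConfigValidationRule(
                validation_check=bool(self.mv_name),
                validation_error=DbtRuntimeError("A materialized view must have a name."),
            ),
        }


# validation runs in __post_init__, so invalid configs fail at construction time
config = ExampleMaterializedViewConfig.from_dict({"mv_name": "orders_mv", "query": "select 1"})
```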

View File

@@ -0,0 +1,12 @@
from dbt.adapters.relation_configs.config_base import ( # noqa: F401
RelationConfigBase,
RelationResults,
)
from dbt.adapters.relation_configs.config_change import ( # noqa: F401
RelationConfigChangeAction,
RelationConfigChange,
)
from dbt.adapters.relation_configs.config_validation import ( # noqa: F401
RelationConfigValidationMixin,
RelationConfigValidationRule,
)

View File

@@ -0,0 +1,44 @@
from dataclasses import dataclass
from typing import Union, Dict
import agate
from dbt.utils import filter_null_values
"""
This is what relation metadata from the database looks like. It's a dictionary because there will be
multiple grains of data for a single object. For example, a materialized view in Postgres has base-level information,
like its name, but it can also have multiple indexes, which need to be queried separately. It might look like this:
{
"base": agate.Row({"table_name": "table_abc", "query": "select * from table_def"})
"indexes": agate.Table("rows": [
agate.Row({"name": "index_a", "columns": ["column_a"], "type": "hash", "unique": False}),
agate.Row({"name": "index_b", "columns": ["time_dim_a"], "type": "btree", "unique": False}),
])
}
"""
RelationResults = Dict[str, Union[agate.Row, agate.Table]]
@dataclass(frozen=True)
class RelationConfigBase:
@classmethod
def from_dict(cls, kwargs_dict) -> "RelationConfigBase":
"""
This assumes the subclass of `RelationConfigBase` is flat, in the sense that no attribute is
itself another subclass of `RelationConfigBase`. If that's not the case, this should be overridden
to manually manage that complexity.
Args:
kwargs_dict: the dict representation of this instance
Returns: the `RelationConfigBase` representation associated with the provided dict
"""
return cls(**filter_null_values(kwargs_dict)) # type: ignore
@classmethod
def _not_implemented_error(cls) -> NotImplementedError:
return NotImplementedError(
"This relation type has not been fully configured for this adapter."
)

View File

@@ -0,0 +1,23 @@
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Hashable
from dbt.adapters.relation_configs.config_base import RelationConfigBase
from dbt.dataclass_schema import StrEnum
class RelationConfigChangeAction(StrEnum):
alter = "alter"
create = "create"
drop = "drop"
@dataclass(frozen=True, eq=True, unsafe_hash=True)
class RelationConfigChange(RelationConfigBase, ABC):
action: RelationConfigChangeAction
context: Hashable # this is usually a RelationConfig, e.g. IndexConfig, but shouldn't be limited
@property
@abstractmethod
def requires_full_refresh(self) -> bool:
raise self._not_implemented_error()

View File

@@ -0,0 +1,57 @@
from dataclasses import dataclass
from typing import Set, Optional
from dbt.exceptions import DbtRuntimeError
@dataclass(frozen=True, eq=True, unsafe_hash=True)
class RelationConfigValidationRule:
validation_check: bool
validation_error: Optional[DbtRuntimeError]
@property
def default_error(self):
return DbtRuntimeError(
"There was a validation error in preparing this relation config."
"No additional context was provided by this adapter."
)
@dataclass(frozen=True)
class RelationConfigValidationMixin:
def __post_init__(self):
self.run_validation_rules()
@property
def validation_rules(self) -> Set[RelationConfigValidationRule]:
"""
A set of validation rules to run against the object upon creation.
A validation rule is a combination of a validation check (bool) and an optional error message.
This defaults to no validation rules if not implemented. It's recommended to override this with values,
but that may not always be necessary.
Returns: a set of validation rules
"""
return set()
def run_validation_rules(self):
for validation_rule in self.validation_rules:
try:
assert validation_rule.validation_check
except AssertionError:
if validation_rule.validation_error:
raise validation_rule.validation_error
else:
raise validation_rule.default_error
self.run_child_validation_rules()
def run_child_validation_rules(self):
for attr_value in vars(self).values():
if hasattr(attr_value, "validation_rules"):
attr_value.run_validation_rules()
if isinstance(attr_value, set):
for member in attr_value:
if hasattr(member, "validation_rules"):
member.run_validation_rules()

View File

@@ -52,7 +52,6 @@ class SQLConnectionManager(BaseConnectionManager):
bindings: Optional[Any] = None,
abridge_sql_log: bool = False,
) -> Tuple[Connection, Any]:
connection = self.get_thread_connection()
if auto_begin and connection.transaction_open is False:
self.begin()
@@ -118,13 +117,16 @@ class SQLConnectionManager(BaseConnectionManager):
return [dict(zip(column_names, row)) for row in rows]
@classmethod
def get_result_from_cursor(cls, cursor: Any) -> agate.Table:
def get_result_from_cursor(cls, cursor: Any, limit: Optional[int]) -> agate.Table:
data: List[Any] = []
column_names: List[str] = []
if cursor.description is not None:
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()
if limit:
rows = cursor.fetchmany(limit)
else:
rows = cursor.fetchall()
data = cls.process_results(column_names, rows)
return dbt.clients.agate_helper.table_from_data_flat(data, column_names)
@@ -138,13 +140,13 @@ class SQLConnectionManager(BaseConnectionManager):
)
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
self, sql: str, auto_begin: bool = False, fetch: bool = False, limit: Optional[int] = None
) -> Tuple[AdapterResponse, agate.Table]:
sql = self._add_query_comment(sql)
_, cursor = self.add_query(sql, auto_begin)
response = self.get_response(cursor)
if fetch:
table = self.get_result_from_cursor(cursor)
table = self.get_result_from_cursor(cursor, limit)
else:
table = dbt.clients.agate_helper.empty_table()
return response, table
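As a usage sketch of the new `limit` argument (the helper name and SQL are hypothetical), a caller can cap how many rows are pulled from the cursor:

```py
from typing import Tuple

import agate

from dbt.adapters.base import BaseAdapter
from dbt.contracts.connection import AdapterResponse


def preview_rows(adapter: BaseAdapter, sql: str, n: int = 5) -> Tuple[AdapterResponse, agate.Table]:
    # fetch=True materializes an agate.Table; limit is forwarded to
    # get_result_from_cursor, which calls cursor.fetchmany(limit) instead of fetchall()
    return adapter.execute(sql, fetch=True, limit=n)
```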

View File

@@ -1,7 +1,7 @@
import agate
from typing import Any, Optional, Tuple, Type, List
from dbt.contracts.connection import Connection
from dbt.contracts.connection import Connection, AdapterResponse
from dbt.exceptions import RelationTypeNullError
from dbt.adapters.base import BaseAdapter, available
from dbt.adapters.cache import _make_ref_key_dict
@@ -22,6 +22,7 @@ RENAME_RELATION_MACRO_NAME = "rename_relation"
TRUNCATE_RELATION_MACRO_NAME = "truncate_relation"
DROP_RELATION_MACRO_NAME = "drop_relation"
ALTER_COLUMN_TYPE_MACRO_NAME = "alter_column_type"
VALIDATE_SQL_MACRO_NAME = "validate_sql"
class SQLAdapter(BaseAdapter):
@@ -197,6 +198,7 @@ class SQLAdapter(BaseAdapter):
)
return relations
@classmethod
def quote(self, identifier):
return '"{}"'.format(identifier)
@@ -217,6 +219,34 @@ class SQLAdapter(BaseAdapter):
results = self.execute_macro(CHECK_SCHEMA_EXISTS_MACRO_NAME, kwargs=kwargs)
return results[0][0] > 0
def validate_sql(self, sql: str) -> AdapterResponse:
"""Submit the given SQL to the engine for validation, but not execution.
By default we simply prefix the query with the explain keyword and allow the
exceptions thrown by the underlying engine on invalid SQL inputs to bubble up
to the exception handler. For adjustments to the explain statement - such as
for adapters that have different mechanisms for hinting at query validation
or dry-run - callers may be able to override the validate_sql macro with
the addition of an <adapter>__validate_sql implementation.
:param str sql: The sql to validate
"""
kwargs = {
"sql": sql,
}
result = self.execute_macro(VALIDATE_SQL_MACRO_NAME, kwargs=kwargs)
# The statement macro always returns an AdapterResponse in the output AttrDict's
# `response` property, and we preserve the full payload in case we want to
# return fetched output for engines where explain plans are emitted as columnar
# results. Any macro override that deviates from this behavior may encounter an
# assertion error in the runtime.
adapter_response = result.response # type: ignore[attr-defined]
assert isinstance(adapter_response, AdapterResponse), (
f"Expected AdapterResponse from validate_sql macro execution, "
f"got {type(adapter_response)}."
)
return adapter_response
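A brief usage sketch (the wrapper function is hypothetical): on a SQL adapter this runs the `validate_sql` macro, typically an `explain`, and lets engine errors on invalid SQL bubble up to the caller.

```py
from dbt.adapters.sql import SQLAdapter
from dbt.contracts.connection import AdapterResponse


def check_compiled_sql(adapter: SQLAdapter, compiled_sql: str) -> AdapterResponse:
    # raises if the engine rejects the SQL; otherwise returns the macro's AdapterResponse
    return adapter.validate_sql(compiled_sql)
```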
# This is for use in the test suite
def run_sql_for_tests(self, sql, fetch, conn):
cursor = conn.handle.cursor()

View File

@@ -1,3 +1,25 @@
# Adding a new command
## `main.py`
Add the new command with all necessary decorators. Every command will need at minimum:
- a decorator for the click group it belongs to which also names the command
- the postflight decorator (must come before other decorators from the `requires` module for error handling)
- the preflight decorator
```py
@cli.command("my-new-command")
@requires.postflight
@requires.preflight
def my_new_command(ctx, **kwargs):
...
```
## `types.py`
Add an entry to the `Command` enum with your new command. Commands that are sub-commands should have entries
that represent their full command path (e.g. `source freshness -> SOURCE_FRESHNESS`, `docs serve -> DOCS_SERVE`).
## `flags.py`
Add the new command to the dictionary within the `command_args` function.
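As a sketch, the two registration edits for the hypothetical `my-new-command` above would look roughly like this (entry and attribute names are illustrative):

```py
# core/dbt/cli/types.py -- add an entry to the Command enum:
#     MY_NEW_COMMAND = "my-new-command"
# sub-commands map to their full path via to_list(), e.g. DOCS_SERVE -> ["docs", "serve"]

# core/dbt/cli/flags.py -- register the click command in the dict inside command_args():
#     CliCommand.MY_NEW_COMMAND: cli.my_new_command,
```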
# Exception Handling
## `requires.py`

View File

@@ -4,14 +4,16 @@ from dataclasses import dataclass
from importlib import import_module
from multiprocessing import get_context
from pprint import pformat as pf
from typing import Callable, Dict, List, Set, Union
from typing import Any, Callable, Dict, List, Optional, Set, Union
from click import Context, get_current_context
from click.core import Command, Group, ParameterSource
from click import Context, get_current_context, Parameter
from click.core import Command as ClickCommand, Group, ParameterSource
from dbt.cli.exceptions import DbtUsageException
from dbt.cli.resolvers import default_log_path, default_project_dir
from dbt.cli.types import Command as CliCommand
from dbt.config.profile import read_user_config
from dbt.contracts.project import UserConfig
from dbt.exceptions import DbtInternalError
from dbt.deprecations import renamed_env_var
from dbt.helper_types import WarnErrorOptions
@@ -37,6 +39,9 @@ DEPRECATED_PARAMS = {
}
WHICH_KEY = "which"
def convert_config(config_name, config_value):
"""Convert the values from config and original set_from_args to the correct type."""
ret = config_value
@@ -58,10 +63,10 @@ def args_to_context(args: List[str]) -> Context:
sub_command_name, sub_command, args = cli.resolve_command(cli_ctx, args)
# Handle source and docs group.
if type(sub_command) == Group:
if isinstance(sub_command, Group):
sub_command_name, sub_command, args = sub_command.resolve_command(cli_ctx, args)
assert type(sub_command) == Command
assert isinstance(sub_command, ClickCommand)
sub_command_ctx = sub_command.make_context(sub_command_name, args)
sub_command_ctx.parent = cli_ctx
return sub_command_ctx
@@ -71,7 +76,9 @@ def args_to_context(args: List[str]) -> Context:
class Flags:
"""Primary configuration artifact for running dbt"""
def __init__(self, ctx: Context = None, user_config: UserConfig = None) -> None:
def __init__(
self, ctx: Optional[Context] = None, user_config: Optional[UserConfig] = None
) -> None:
# Set the default flags.
for key, value in FLAGS_DEFAULTS.items():
@@ -199,6 +206,9 @@ class Flags:
profiles_dir = getattr(self, "PROFILES_DIR", None)
user_config = read_user_config(profiles_dir) if profiles_dir else None
# Add entire invocation command to flags
object.__setattr__(self, "INVOCATION_COMMAND", "dbt " + " ".join(sys.argv[1:]))
# Overwrite default assignments with user config if available.
if user_config:
param_assigned_from_default_copy = params_assigned_from_default.copy()
@@ -277,3 +287,118 @@ class Flags:
# It is necessary to remove this attr from the class so it does
# not get pickled when written to disk as json.
object.__delattr__(self, "deprecated_env_var_warnings")
@classmethod
def from_dict(cls, command: CliCommand, args_dict: Dict[str, Any]) -> "Flags":
command_arg_list = command_params(command, args_dict)
ctx = args_to_context(command_arg_list)
flags = cls(ctx=ctx)
flags.fire_deprecations()
return flags
CommandParams = List[str]
def command_params(command: CliCommand, args_dict: Dict[str, Any]) -> CommandParams:
"""Given a command and a dict, returns a list of strings representing
the CLI params for that command. The order of this list is consistent with
which flags are expected at the parent level vs the command level.
e.g. fn("run", {"defer": True, "print": False}) -> ["--no-print", "run", "--defer"]
The result of this function can be passed in to the args_to_context function
to produce a click context to instantiate Flags with.
"""
cmd_args = set(command_args(command))
prnt_args = set(parent_args())
default_args = set([x.lower() for x in FLAGS_DEFAULTS.keys()])
res = command.to_list()
for k, v in args_dict.items():
k = k.lower()
# if a "which" value exists in the args dict, it should match the command provided
if k == WHICH_KEY:
if v != command.value:
raise DbtInternalError(
f"Command '{command.value}' does not match value of which: '{v}'"
)
continue
# param was assigned from defaults and should not be included
if k not in (cmd_args | prnt_args) - default_args:
continue
# if the param is in parent args, it should come before the arg name
# e.g. ["--print", "run"] vs ["run", "--print"]
add_fn = res.append
if k in prnt_args:
def add_fn(x):
res.insert(0, x)
spinal_cased = k.replace("_", "-")
if k == "macro" and command == CliCommand.RUN_OPERATION:
add_fn(v)
elif v in (None, False):
add_fn(f"--no-{spinal_cased}")
elif v is True:
add_fn(f"--{spinal_cased}")
else:
add_fn(f"--{spinal_cased}={v}")
return res
ArgsList = List[str]
def parent_args() -> ArgsList:
"""Return a list representing the params the base click command takes."""
from dbt.cli.main import cli
return format_params(cli.params)
def command_args(command: CliCommand) -> ArgsList:
"""Given a command, return a list of strings representing the params
that command takes. This function only returns params assigned to a
specific command, not those of its parent command.
e.g. fn("run") -> ["defer", "favor_state", "exclude", ...]
"""
import dbt.cli.main as cli
CMD_DICT: Dict[CliCommand, ClickCommand] = {
CliCommand.BUILD: cli.build,
CliCommand.CLEAN: cli.clean,
CliCommand.CLONE: cli.clone,
CliCommand.COMPILE: cli.compile,
CliCommand.DOCS_GENERATE: cli.docs_generate,
CliCommand.DOCS_SERVE: cli.docs_serve,
CliCommand.DEBUG: cli.debug,
CliCommand.DEPS: cli.deps,
CliCommand.INIT: cli.init,
CliCommand.LIST: cli.list,
CliCommand.PARSE: cli.parse,
CliCommand.RUN: cli.run,
CliCommand.RUN_OPERATION: cli.run_operation,
CliCommand.SEED: cli.seed,
CliCommand.SHOW: cli.show,
CliCommand.SNAPSHOT: cli.snapshot,
CliCommand.SOURCE_FRESHNESS: cli.freshness,
CliCommand.TEST: cli.test,
CliCommand.RETRY: cli.retry,
}
click_cmd: Optional[ClickCommand] = CMD_DICT.get(command, None)
if click_cmd is None:
raise DbtInternalError(f"No command found for name '{command.name}'")
return format_params(click_cmd.params)
def format_params(params: List[Parameter]) -> ArgsList:
return [str(x.name) for x in params if not str(x.name).lower().startswith("deprecated_")]

View File

@@ -19,11 +19,11 @@ from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
CatalogArtifact,
RunExecutionResult,
RunOperationResultsArtifact,
)
from dbt.events.base_types import EventMsg
from dbt.task.build import BuildTask
from dbt.task.clean import CleanTask
from dbt.task.clone import CloneTask
from dbt.task.compile import CompileTask
from dbt.task.debug import DebugTask
from dbt.task.deps import DepsTask
@@ -31,6 +31,7 @@ from dbt.task.freshness import FreshnessTask
from dbt.task.generate import GenerateTask
from dbt.task.init import InitTask
from dbt.task.list import ListTask
from dbt.task.retry import RetryTask
from dbt.task.run import RunTask
from dbt.task.run_operation import RunOperationTask
from dbt.task.seed import SeedTask
@@ -53,8 +54,7 @@ class dbtRunnerResult:
List[str], # list/ls
Manifest, # parse
None, # clean, deps, init, source
RunExecutionResult, # build, compile, run, seed, snapshot, test
RunOperationResultsArtifact, # run-operation
RunExecutionResult, # build, compile, run, seed, snapshot, test, run-operation
] = None
@@ -62,8 +62,8 @@ class dbtRunnerResult:
class dbtRunner:
def __init__(
self,
manifest: Manifest = None,
callbacks: List[Callable[[EventMsg], None]] = None,
manifest: Optional[Manifest] = None,
callbacks: Optional[List[Callable[[EventMsg], None]]] = None,
):
self.manifest = manifest
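For context, here is a hedged sketch of a programmatic invocation using these constructor arguments; it assumes the runner exposes an `invoke` method and that `dbtRunnerResult` carries a `success` flag alongside `result`.

```py
from dbt.cli.main import dbtRunner, dbtRunnerResult

# callbacks receive structured events; a pre-built Manifest may also be passed in
runner = dbtRunner(callbacks=[lambda event: None])
res: dbtRunnerResult = runner.invoke(["run", "--select", "my_model"])
if res.success:
    print(res.result)  # RunExecutionResult for `run`
```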
@@ -139,6 +139,7 @@ class dbtRunner:
@p.log_path
@p.macro_debugging
@p.partial_parse
@p.partial_parse_file_path
@p.populate_cache
@p.print
@p.printer_width
@@ -180,6 +181,7 @@ def cli(ctx, **kwargs):
@p.selector
@p.show
@p.state
@p.defer_state
@p.deprecated_state
@p.store_failures
@p.target
@@ -213,6 +215,7 @@ def build(ctx, **kwargs):
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.vars
@requires.postflight
@requires.preflight
@@ -250,6 +253,7 @@ def docs(ctx, **kwargs):
@p.selector
@p.empty_catalog
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@@ -284,19 +288,18 @@ def docs_generate(ctx, **kwargs):
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.vars
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def docs_serve(ctx, **kwargs):
"""Serve the documentation website for your project"""
task = ServeTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
@@ -323,6 +326,7 @@ def docs_serve(ctx, **kwargs):
@p.selector
@p.inline
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@@ -369,6 +373,7 @@ def compile(ctx, **kwargs):
@p.selector
@p.inline
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@@ -398,6 +403,7 @@ def show(ctx, **kwargs):
# dbt debug
@cli.command("debug")
@click.pass_context
@p.debug_connection
@p.config_dir
@p.profile
@p.profiles_dir_exists_false
@@ -408,7 +414,7 @@ def show(ctx, **kwargs):
@requires.postflight
@requires.preflight
def debug(ctx, **kwargs):
"""Test the database connection and show information for debugging purposes. Not to be confused with the --debug option which increases verbosity."""
"""Show information on the current dbt environment and check dependencies, then test the database connection. Not to be confused with the --debug option which increases verbosity."""
task = DebugTask(
ctx.obj["flags"],
@@ -424,7 +430,7 @@ def debug(ctx, **kwargs):
@cli.command("deps")
@click.pass_context
@p.profile
@p.profiles_dir
@p.profiles_dir_exists_false
@p.project_dir
@p.target
@p.vars
@@ -446,7 +452,7 @@ def deps(ctx, **kwargs):
# for backwards compatibility, accept 'project_name' as an optional positional argument
@click.argument("project_name", required=False)
@p.profile
@p.profiles_dir
@p.profiles_dir_exists_false
@p.project_dir
@p.skip_profile_setup
@p.target
@@ -477,8 +483,10 @@ def init(ctx, **kwargs):
@p.raw_select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.vars
@requires.postflight
@requires.preflight
@@ -545,6 +553,7 @@ def parse(ctx, **kwargs):
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@@ -570,6 +579,73 @@ def run(ctx, **kwargs):
return results, success
# dbt retry
@cli.command("retry")
@click.pass_context
@p.project_dir
@p.profiles_dir
@p.vars
@p.profile
@p.target
@p.state
@p.threads
@p.fail_fast
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def retry(ctx, **kwargs):
"""Retry the nodes that failed in the previous run."""
task = RetryTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# dbt clone
@cli.command("clone")
@click.pass_context
@p.defer_state
@p.exclude
@p.full_refresh
@p.profile
@p.profiles_dir
@p.project_dir
@p.resource_type
@p.select
@p.selector
@p.state # required
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
@requires.postflight
def clone(ctx, **kwargs):
"""Create clones of selected nodes based on their location in the manifest provided to --state."""
task = CloneTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# dbt run operation
@cli.command("run-operation")
@click.pass_context
@@ -579,6 +655,8 @@ def run(ctx, **kwargs):
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.threads
@p.vars
@requires.postflight
@requires.preflight
@@ -611,6 +689,7 @@ def run_operation(ctx, **kwargs):
@p.selector
@p.show
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@@ -649,8 +728,10 @@ def seed(ctx, **kwargs):
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@requires.postflight
@@ -690,8 +771,10 @@ def source(ctx, **kwargs):
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@requires.postflight
@@ -735,6 +818,7 @@ cli.commands["source"].add_command(snapshot_freshness, "snapshot-freshness") #
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.store_failures
@p.target

View File

@@ -1,7 +1,7 @@
from click import ParamType, Choice
from dbt.config.utils import parse_cli_vars
from dbt.exceptions import ValidationError
from dbt.config.utils import parse_cli_yaml_string
from dbt.exceptions import ValidationError, DbtValidationError, OptionNotYamlDictError
from dbt.helper_types import WarnErrorOptions
@@ -16,8 +16,9 @@ class YAML(ParamType):
if not isinstance(value, str):
self.fail(f"Cannot load YAML from type {type(value)}", param, ctx)
try:
return parse_cli_vars(value)
except ValidationError:
param_option_name = param.opts[0] if param.opts else param.name
return parse_cli_yaml_string(value, param_option_name.strip("-"))
except (ValidationError, DbtValidationError, OptionNotYamlDictError):
self.fail(f"String '{value}' is not valid YAML", param, ctx)

View File

@@ -43,7 +43,7 @@ compile_docs = click.option(
config_dir = click.option(
"--config-dir",
envvar=None,
help="Show the configured location for the profiles.yml file and exit",
help="Print a system-specific command to access the directory that the current dbt project is searching for a profiles.yml. Then, exit. This flag renders other debug step flags no-ops.",
is_flag=True,
)
@@ -239,6 +239,15 @@ partial_parse = click.option(
default=True,
)
partial_parse_file_path = click.option(
"--partial-parse-file-path",
envvar="DBT_PARTIAL_PARSE_FILE_PATH",
help="Internal flag for path to partial_parse.manifest file.",
default=None,
hidden=True,
type=click.Path(exists=True, dir_okay=False, resolve_path=True),
)
populate_cache = click.option(
"--populate-cache/--no-populate-cache",
envvar="DBT_POPULATE_CACHE",
@@ -293,6 +302,8 @@ profiles_dir = click.option(
)
# `dbt debug` uses this because it implements custom behaviour for non-existent profiles.yml directories
# `dbt deps` does not load a profile at all
# `dbt init` will write profiles.yml if it doesn't yet exist
profiles_dir_exists_false = click.option(
"--profiles-dir",
envvar="DBT_PROFILES_DIR",
@@ -424,12 +435,25 @@ empty_catalog = click.option(
state = click.option(
"--state",
envvar="DBT_STATE",
help="If set, use the given directory as the source for JSON files to compare with this project.",
help="Unless overridden, use this state directory for both state comparison and deferral.",
type=click.Path(
dir_okay=True,
file_okay=False,
readable=True,
resolve_path=True,
resolve_path=False,
path_type=Path,
),
)
defer_state = click.option(
"--defer-state",
envvar="DBT_DEFER_STATE",
help="Override the state directory for deferral only.",
type=click.Path(
dir_okay=True,
file_okay=False,
readable=True,
resolve_path=False,
path_type=Path,
),
)
@@ -476,6 +500,13 @@ target_path = click.option(
type=click.Path(),
)
debug_connection = click.option(
"--connection",
envvar=None,
help="Test the connection to the target database independent of dependency checks.",
is_flag=True,
)
threads = click.option(
"--threads",
envvar=None,

View File

@@ -23,6 +23,7 @@ from dbt.parser.manifest import ManifestLoader, write_manifest
from dbt.profiler import profiler
from dbt.tracking import active_user, initialize_from_flags, track_run
from dbt.utils import cast_dict_to_dict_of_strings
from dbt.plugins import set_up_plugin_manager, get_plugin_manager
from click import Context
from functools import update_wrapper
@@ -160,6 +161,9 @@ def project(func):
)
ctx.obj["project"] = project
# Plugins
set_up_plugin_manager(project_name=project.project_name)
if dbt.tracking.active_user is not None:
project_id = None if project is None else project.hashed_name()
@@ -240,12 +244,17 @@ def manifest(*args0, write=True, write_perf_info=False):
# a manifest has already been set on the context, so don't overwrite it
if ctx.obj.get("manifest") is None:
manifest = ManifestLoader.get_full_manifest(
runtime_config, write_perf_info=write_perf_info
runtime_config,
write_perf_info=write_perf_info,
)
ctx.obj["manifest"] = manifest
if write and ctx.obj["flags"].write_json:
write_manifest(manifest, ctx.obj["runtime_config"].target_path)
write_manifest(manifest, runtime_config.project_target_path)
pm = get_plugin_manager(runtime_config.project_name)
plugin_artifacts = pm.get_manifest_artifacts(manifest)
for path, plugin_artifact in plugin_artifacts.items():
plugin_artifact.write(path)
return func(*args, **kwargs)

core/dbt/cli/types.py Normal file
View File

@@ -0,0 +1,40 @@
from enum import Enum
from typing import List
from dbt.exceptions import DbtInternalError
class Command(Enum):
BUILD = "build"
CLEAN = "clean"
COMPILE = "compile"
CLONE = "clone"
DOCS_GENERATE = "generate"
DOCS_SERVE = "serve"
DEBUG = "debug"
DEPS = "deps"
INIT = "init"
LIST = "list"
PARSE = "parse"
RUN = "run"
RUN_OPERATION = "run-operation"
SEED = "seed"
SHOW = "show"
SNAPSHOT = "snapshot"
SOURCE_FRESHNESS = "freshness"
TEST = "test"
RETRY = "retry"
@classmethod
def from_str(cls, s: str) -> "Command":
try:
return cls(s)
except ValueError:
raise DbtInternalError(f"No value '{s}' exists in Command enum")
def to_list(self) -> List[str]:
return {
Command.DOCS_GENERATE: ["docs", "generate"],
Command.DOCS_SERVE: ["docs", "serve"],
Command.SOURCE_FRESHNESS: ["source", "freshness"],
}.get(self, [self.value])
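A small sketch of how this enum round-trips CLI names, following the definitions above:

```py
from dbt.cli.types import Command

Command.from_str("run-operation")  # Command.RUN_OPERATION
Command.DOCS_GENERATE.to_list()    # ["docs", "generate"]
Command.RUN.to_list()              # ["run"]
```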

View File

@@ -565,6 +565,8 @@ def _requote_result(raw_value: str, rendered: str) -> str:
# is small enough that I've just chosen the more readable option.
_HAS_RENDER_CHARS_PAT = re.compile(r"({[{%#]|[#}%]})")
_render_cache: Dict[str, Any] = dict()
def get_rendered(
string: str,
@@ -572,15 +574,21 @@ def get_rendered(
node=None,
capture_macros: bool = False,
native: bool = False,
) -> str:
) -> Any:
# performance optimization: if there are no jinja control characters in the
# string, we can just return the input. Fall back to jinja if the type is
# not a string or if native rendering is enabled (so '1' -> 1, etc...)
# If this is desirable in the native env as well, we could handle the
# native=True case by passing the input string to ast.literal_eval, like
# the native renderer does.
if not native and isinstance(string, str) and _HAS_RENDER_CHARS_PAT.search(string) is None:
return string
has_render_chars = not isinstance(string, str) or _HAS_RENDER_CHARS_PAT.search(string)
if not has_render_chars:
if not native:
return string
elif string in _render_cache:
return _render_cache[string]
template = get_template(
string,
ctx,
@@ -588,7 +596,13 @@ def get_rendered(
capture_macros=capture_macros,
native=native,
)
return render_template(template, ctx, node)
rendered = render_template(template, ctx, node)
if not has_render_chars and native:
_render_cache[string] = rendered
return rendered
def undefined_error(msg) -> NoReturn:

View File

@@ -141,7 +141,7 @@ def statically_parse_adapter_dispatch(func_call, ctx, db_wrapper):
macro = db_wrapper.dispatch(func_name, macro_namespace=macro_namespace).macro
func_name = f"{macro.package_name}.{macro.name}"
possible_macro_calls.append(func_name)
else: # this is only for test/unit/test_macro_calls.py
else: # this is only for tests/unit/test_macro_calls.py
if macro_namespace:
packages = [macro_namespace]
else:

View File

@@ -211,7 +211,7 @@ def _windows_rmdir_readonly(func: Callable[[str], Any], path: str, exc: Tuple[An
def resolve_path_from_base(path_to_resolve: str, base_path: str) -> str:
"""
If path-to_resolve is a relative path, create an absolute path
If path_to_resolve is a relative path, create an absolute path
with base_path as the base.
If path_to_resolve is an absolute path or a user path (~), just

View File

@@ -1,4 +1,6 @@
import argparse
import json
import networkx as nx # type: ignore
import os
import pickle
@@ -27,8 +29,8 @@ from dbt.exceptions import (
DbtRuntimeError,
)
from dbt.graph import Graph
from dbt.events.functions import fire_event
from dbt.events.types import FoundStats, WritingInjectedSQLForNode
from dbt.events.functions import fire_event, get_invocation_id
from dbt.events.types import FoundStats, Note, WritingInjectedSQLForNode
from dbt.events.contextvars import get_node_info
from dbt.node_types import NodeType, ModelLanguage
from dbt.events.format import pluralize
@@ -46,9 +48,10 @@ def print_compile_stats(stats):
NodeType.Analysis: "analysis",
NodeType.Macro: "macro",
NodeType.Operation: "operation",
NodeType.Seed: "seed file",
NodeType.Seed: "seed",
NodeType.Source: "source",
NodeType.Exposure: "exposure",
NodeType.SemanticModel: "semantic model",
NodeType.Metric: "metric",
NodeType.Group: "group",
}
@@ -61,7 +64,8 @@ def print_compile_stats(stats):
resource_counts = {k.pluralize(): v for k, v in results.items()}
dbt.tracking.track_resource_counts(resource_counts)
stat_line = ", ".join([pluralize(ct, names.get(t)) for t, ct in results.items() if t in names])
# do not include resource types that are not actually defined in the project
stat_line = ", ".join([pluralize(ct, names.get(t)) for t, ct in stats.items() if t in names])
fire_event(FoundStats(stat_line=stat_line))
@@ -80,16 +84,16 @@ def _generate_stats(manifest: Manifest):
if _node_enabled(node):
stats[node.resource_type] += 1
for source in manifest.sources.values():
stats[source.resource_type] += 1
for exposure in manifest.exposures.values():
stats[exposure.resource_type] += 1
for metric in manifest.metrics.values():
stats[metric.resource_type] += 1
for macro in manifest.macros.values():
stats[macro.resource_type] += 1
for group in manifest.groups.values():
stats[group.resource_type] += 1
# Disabled nodes don't appear in the following collections, so we don't check.
stats[NodeType.Source] += len(manifest.sources)
stats[NodeType.Exposure] += len(manifest.exposures)
stats[NodeType.Metric] += len(manifest.metrics)
stats[NodeType.Macro] += len(manifest.macros)
stats[NodeType.Group] += len(manifest.groups)
stats[NodeType.SemanticModel] += len(manifest.semantic_models)
# TODO: should we be counting dimensions + entities?
return stats
@@ -161,13 +165,120 @@ class Linker:
with open(outfile, "wb") as outfh:
pickle.dump(out_graph, outfh, protocol=pickle.HIGHEST_PROTOCOL)
def link_node(self, node: GraphMemberNode, manifest: Manifest):
self.add_node(node.unique_id)
for dependency in node.depends_on_nodes:
if dependency in manifest.nodes:
self.dependency(node.unique_id, (manifest.nodes[dependency].unique_id))
elif dependency in manifest.sources:
self.dependency(node.unique_id, (manifest.sources[dependency].unique_id))
elif dependency in manifest.metrics:
self.dependency(node.unique_id, (manifest.metrics[dependency].unique_id))
elif dependency in manifest.semantic_models:
self.dependency(node.unique_id, (manifest.semantic_models[dependency].unique_id))
else:
raise GraphDependencyNotFoundError(node, dependency)
def link_graph(self, manifest: Manifest):
for source in manifest.sources.values():
self.add_node(source.unique_id)
for semantic_model in manifest.semantic_models.values():
self.add_node(semantic_model.unique_id)
for node in manifest.nodes.values():
self.link_node(node, manifest)
for exposure in manifest.exposures.values():
self.link_node(exposure, manifest)
for metric in manifest.metrics.values():
self.link_node(metric, manifest)
cycle = self.find_cycles()
if cycle:
raise RuntimeError("Found a cycle: {}".format(cycle))
def add_test_edges(self, manifest: Manifest) -> None:
"""This method adds additional edges to the DAG. For a given non-test
executable node, add an edge from an upstream test to the given node if
the set of nodes the test depends on is a subset of the upstream nodes
for the given node."""
# Given a graph:
# model1 --> model2 --> model3
# | |
# | \/
# \/ test 2
# test1
#
# Produce the following graph:
# model1 --> model2 --> model3
# | /\ | /\ /\
# | | \/ | |
# \/ | test2 ----| |
# test1 ----|---------------|
for node_id in self.graph:
# If node is executable (in manifest.nodes) and does _not_
# represent a test, continue.
if (
node_id in manifest.nodes
and manifest.nodes[node_id].resource_type != NodeType.Test
):
# Get *everything* upstream of the node
all_upstream_nodes = nx.traversal.bfs_tree(self.graph, node_id, reverse=True)
# Get the set of upstream nodes not including the current node.
upstream_nodes = set([n for n in all_upstream_nodes if n != node_id])
# Get all tests that depend on any upstream nodes.
upstream_tests = []
for upstream_node in upstream_nodes:
upstream_tests += _get_tests_for_node(manifest, upstream_node)
for upstream_test in upstream_tests:
# Get the set of all nodes that the test depends on
# including the upstream_node itself. This is necessary
# because tests can depend on multiple nodes (ex:
# relationship tests). Test nodes do not distinguish
# between what node the test is "testing" and what
# node(s) it depends on.
test_depends_on = set(manifest.nodes[upstream_test].depends_on_nodes)
# If the set of nodes that an upstream test depends on
# is a subset of all upstream nodes of the current node,
# add an edge from the upstream test to the current node.
if test_depends_on.issubset(upstream_nodes):
self.graph.add_edge(upstream_test, node_id, edge_type="parent_test")
def get_graph(self, manifest: Manifest) -> Graph:
self.link_graph(manifest)
return Graph(self.graph)
def get_graph_summary(self, manifest: Manifest) -> Dict[int, Dict[str, Any]]:
"""Create a smaller summary of the graph, suitable for basic diagnostics
and performance tuning. The summary includes only the edge structure,
node types, and node names. Each of the n nodes is assigned an integer
index 0, 1, 2,..., n-1 for compactness"""
graph_nodes = dict()
index_dict = dict()
for node_index, node_name in enumerate(self.graph):
index_dict[node_name] = node_index
data = manifest.expect(node_name).to_dict(omit_none=True)
graph_nodes[node_index] = {"name": node_name, "type": data["resource_type"]}
for node_index, node in graph_nodes.items():
successors = [index_dict[n] for n in self.graph.successors(node["name"])]
if successors:
node["succ"] = [index_dict[n] for n in self.graph.successors(node["name"])]
return graph_nodes
class Compiler:
def __init__(self, config):
self.config = config
def initialize(self):
make_directory(self.config.target_path)
make_directory(self.config.project_target_path)
make_directory(self.config.packages_install_path)
# creates a ModelContext which is converted to
@@ -193,62 +304,6 @@ class Compiler:
relation_cls = adapter.Relation
return relation_cls.add_ephemeral_prefix(name)
def _inject_ctes_into_sql(self, sql: str, ctes: List[InjectedCTE]) -> str:
"""
`ctes` is a list of InjectedCTEs like:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__cte__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__cte__events as (select id, type from events)",
),
]
Given `sql` like:
"with internal_cte as (select * from sessions)
select * from internal_cte"
This will spit out:
"with __dbt__cte__ephemeral as (select * from table),
__dbt__cte__events as (select id, type from events),
with internal_cte as (select * from sessions)
select * from internal_cte"
(Whitespace enhanced for readability.)
"""
if len(ctes) == 0:
return sql
parsed_stmts = sqlparse.parse(sql)
parsed = parsed_stmts[0]
with_stmt = None
for token in parsed.tokens:
if token.is_keyword and token.normalized == "WITH":
with_stmt = token
break
if with_stmt is None:
# no with stmt, add one, and inject CTEs right at the beginning
first_token = parsed.token_first()
with_stmt = sqlparse.sql.Token(sqlparse.tokens.Keyword, "with")
parsed.insert_before(first_token, with_stmt)
else:
# stmt exists, add a comma (which will come after injected CTEs)
trailing_comma = sqlparse.sql.Token(sqlparse.tokens.Punctuation, ",")
parsed.insert_after(with_stmt, trailing_comma)
token = sqlparse.sql.Token(sqlparse.tokens.Keyword, ", ".join(c.sql for c in ctes))
parsed.insert_after(with_stmt, token)
return str(parsed)
def _recursively_prepend_ctes(
self,
model: ManifestSQLNode,
@@ -323,7 +378,7 @@ class Compiler:
_add_prepended_cte(prepended_ctes, InjectedCTE(id=cte.id, sql=sql))
injected_sql = self._inject_ctes_into_sql(
injected_sql = inject_ctes_into_sql(
model.compiled_code,
prepended_ctes,
)
@@ -385,102 +440,39 @@ class Compiler:
return node
def write_graph_file(self, linker: Linker, manifest: Manifest):
filename = graph_file_name
graph_path = os.path.join(self.config.target_path, filename)
flags = get_flags()
if flags.WRITE_JSON:
linker.write_graph(graph_path, manifest)
def link_node(self, linker: Linker, node: GraphMemberNode, manifest: Manifest):
linker.add_node(node.unique_id)
for dependency in node.depends_on_nodes:
if dependency in manifest.nodes:
linker.dependency(node.unique_id, (manifest.nodes[dependency].unique_id))
elif dependency in manifest.sources:
linker.dependency(node.unique_id, (manifest.sources[dependency].unique_id))
elif dependency in manifest.metrics:
linker.dependency(node.unique_id, (manifest.metrics[dependency].unique_id))
else:
raise GraphDependencyNotFoundError(node, dependency)
def link_graph(self, linker: Linker, manifest: Manifest, add_test_edges: bool = False):
for source in manifest.sources.values():
linker.add_node(source.unique_id)
for node in manifest.nodes.values():
self.link_node(linker, node, manifest)
for exposure in manifest.exposures.values():
self.link_node(linker, exposure, manifest)
for metric in manifest.metrics.values():
self.link_node(linker, metric, manifest)
cycle = linker.find_cycles()
if cycle:
raise RuntimeError("Found a cycle: {}".format(cycle))
if add_test_edges:
manifest.build_parent_and_child_maps()
self.add_test_edges(linker, manifest)
def add_test_edges(self, linker: Linker, manifest: Manifest) -> None:
"""This method adds additional edges to the DAG. For a given non-test
executable node, add an edge from an upstream test to the given node if
the set of nodes the test depends on is a subset of the upstream nodes
for the given node."""
# Given a graph:
# model1 --> model2 --> model3
# | |
# | \/
# \/ test 2
# test1
#
# Produce the following graph:
# model1 --> model2 --> model3
# | /\ | /\ /\
# | | \/ | |
# \/ | test2 ----| |
# test1 ----|---------------|
for node_id in linker.graph:
# If node is executable (in manifest.nodes) and does _not_
# represent a test, continue.
if (
node_id in manifest.nodes
and manifest.nodes[node_id].resource_type != NodeType.Test
):
# Get *everything* upstream of the node
all_upstream_nodes = nx.traversal.bfs_tree(linker.graph, node_id, reverse=True)
# Get the set of upstream nodes not including the current node.
upstream_nodes = set([n for n in all_upstream_nodes if n != node_id])
# Get all tests that depend on any upstream nodes.
upstream_tests = []
for upstream_node in upstream_nodes:
upstream_tests += _get_tests_for_node(manifest, upstream_node)
for upstream_test in upstream_tests:
# Get the set of all nodes that the test depends on
# including the upstream_node itself. This is necessary
# because tests can depend on multiple nodes (ex:
# relationship tests). Test nodes do not distinguish
# between what node the test is "testing" and what
# node(s) it depends on.
test_depends_on = set(manifest.nodes[upstream_test].depends_on_nodes)
# If the set of nodes that an upstream test depends on
# is a subset of all upstream nodes of the current node,
# add an edge from the upstream test to the current node.
if test_depends_on.issubset(upstream_nodes):
linker.graph.add_edge(upstream_test, node_id)
# This method doesn't actually "compile" any of the nodes. That is done by the
# "compile_node" method. This creates a Linker and builds the networkx graph,
# writes out the graph.gpickle file, and prints the stats, returning a Graph object.
def compile(self, manifest: Manifest, write=True, add_test_edges=False) -> Graph:
self.initialize()
linker = Linker()
linker.link_graph(manifest)
self.link_graph(linker, manifest, add_test_edges)
# Create a file containing basic information about graph structure,
# supporting diagnostics and performance analysis.
summaries: Dict = dict()
summaries["_invocation_id"] = get_invocation_id()
summaries["linked"] = linker.get_graph_summary(manifest)
if add_test_edges:
manifest.build_parent_and_child_maps()
linker.add_test_edges(manifest)
# Create another diagnostic summary, just as above, but this time
# including the test edges.
summaries["with_test_edges"] = linker.get_graph_summary(manifest)
with open(
os.path.join(self.config.project_target_path, "graph_summary.json"), "w"
) as out_stream:
try:
out_stream.write(json.dumps(summaries))
except Exception as e: # This is non-essential information, so merely note failures.
fire_event(
Note(
msg=f"An error was encountered writing the graph summary information: {e}"
)
)
stats = _generate_stats(manifest)
@@ -492,10 +484,18 @@ class Compiler:
self.config.args.__class__ == argparse.Namespace
and self.config.args.cls == list_task.ListTask
):
stats = _generate_stats(manifest)
print_compile_stats(stats)
return Graph(linker.graph)
def write_graph_file(self, linker: Linker, manifest: Manifest):
filename = graph_file_name
graph_path = os.path.join(self.config.project_target_path, filename)
flags = get_flags()
if flags.WRITE_JSON:
linker.write_graph(graph_path, manifest)
# writes the "compiled_code" into the target/compiled directory
def _write_node(self, node: ManifestSQLNode) -> ManifestSQLNode:
if not node.extra_ctes_injected or node.resource_type in (
@@ -506,9 +506,8 @@ class Compiler:
fire_event(WritingInjectedSQLForNode(node_info=get_node_info()))
if node.compiled_code:
node.compiled_path = node.write_node(
self.config.target_path, "compiled", node.compiled_code
)
node.compiled_path = node.get_target_write_path(self.config.target_path, "compiled")
node.write_node(self.config.project_root, node.compiled_path, node.compiled_code)
return node
def compile_node(
@@ -530,3 +529,74 @@ class Compiler:
if write:
self._write_node(node)
return node
def inject_ctes_into_sql(sql: str, ctes: List[InjectedCTE]) -> str:
"""
`ctes` is a list of InjectedCTEs like:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__cte__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__cte__events as (select id, type from events)",
),
]
Given `sql` like:
"with internal_cte as (select * from sessions)
select * from internal_cte"
This will spit out:
"with __dbt__cte__ephemeral as (select * from table),
__dbt__cte__events as (select id, type from events),
internal_cte as (select * from sessions)
select * from internal_cte"
(Whitespace enhanced for readability.)
"""
if len(ctes) == 0:
return sql
parsed_stmts = sqlparse.parse(sql)
parsed = parsed_stmts[0]
with_stmt = None
for token in parsed.tokens:
if token.is_keyword and token.normalized == "WITH":
with_stmt = token
elif token.is_keyword and token.normalized == "RECURSIVE" and with_stmt is not None:
with_stmt = token
break
elif not token.is_whitespace and with_stmt is not None:
break
if with_stmt is None:
# no with stmt, add one, and inject CTEs right at the beginning
# [original_sql]
first_token = parsed.token_first()
with_token = sqlparse.sql.Token(sqlparse.tokens.Keyword, "with")
parsed.insert_before(first_token, with_token)
# [with][original_sql]
injected_ctes = ", ".join(c.sql for c in ctes) + " "
injected_ctes_token = sqlparse.sql.Token(sqlparse.tokens.Keyword, injected_ctes)
parsed.insert_after(with_token, injected_ctes_token)
# [with][joined_ctes][original_sql]
else:
# with stmt exists so we don't need to add one, but we do need to add a comma
# between the injected ctes and the original sql
# [with][original_sql]
injected_ctes = ", ".join(c.sql for c in ctes)
injected_ctes_token = sqlparse.sql.Token(sqlparse.tokens.Keyword, injected_ctes)
parsed.insert_after(with_stmt, injected_ctes_token)
# [with][joined_ctes][original_sql]
comma_token = sqlparse.sql.Token(sqlparse.tokens.Punctuation, ", ")
parsed.insert_after(injected_ctes_token, comma_token)
# [with][joined_ctes][, ][original_sql]
return str(parsed)
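# A minimal usage sketch of the function above; the InjectedCTE fields are taken from
# the docstring example, and the result matches it (whitespace collapsed onto one line).
example_ctes = [
    InjectedCTE(id="cte_id_1", sql="__dbt__cte__ephemeral as (select * from table)"),
    InjectedCTE(id="cte_id_2", sql="__dbt__cte__events as (select id, type from events)"),
]
example_sql = "with internal_cte as (select * from sessions) select * from internal_cte"
print(inject_ctes_into_sql(example_sql, example_ctes))
# with __dbt__cte__ephemeral as (select * from table), __dbt__cte__events as
# (select id, type from events), internal_cte as (select * from sessions)
# select * from internal_cte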

View File

@@ -16,6 +16,7 @@ import os
from dbt.flags import get_flags
from dbt import deprecations
from dbt.constants import DEPENDENCIES_FILE_NAME, PACKAGES_FILE_NAME
from dbt.clients.system import path_exists, resolve_path_from_base, load_file_contents
from dbt.clients.yaml_helper import load_yaml_text
from dbt.contracts.connection import QueryComment
@@ -93,17 +94,36 @@ def _load_yaml(path):
return load_yaml_text(contents)
def package_data_from_root(project_root):
package_filepath = resolve_path_from_base("packages.yml", project_root)
def package_and_project_data_from_root(project_root):
package_filepath = resolve_path_from_base(PACKAGES_FILE_NAME, project_root)
dependencies_filepath = resolve_path_from_base(DEPENDENCIES_FILE_NAME, project_root)
packages_yml_dict = {}
dependencies_yml_dict = {}
if path_exists(package_filepath):
packages_dict = _load_yaml(package_filepath)
else:
packages_dict = None
return packages_dict
packages_yml_dict = _load_yaml(package_filepath) or {}
if path_exists(dependencies_filepath):
dependencies_yml_dict = _load_yaml(dependencies_filepath) or {}
if "packages" in packages_yml_dict and "packages" in dependencies_yml_dict:
msg = "The 'packages' key cannot be specified in both packages.yml and dependencies.yml"
raise DbtProjectError(msg)
if "projects" in packages_yml_dict:
msg = "The 'projects' key cannot be specified in packages.yml"
raise DbtProjectError(msg)
packages_specified_path = PACKAGES_FILE_NAME
packages_dict = {}
if "packages" in dependencies_yml_dict:
packages_dict["packages"] = dependencies_yml_dict["packages"]
packages_specified_path = DEPENDENCIES_FILE_NAME
else: # don't check for "packages" here so we capture invalid keys in packages.yml
packages_dict = packages_yml_dict
return packages_dict, packages_specified_path
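# A stand-alone sketch (not dbt code) of the precedence implemented above, with plain
# dicts standing in for the parsed packages.yml / dependencies.yml contents:
def pick_packages(packages_yml, dependencies_yml):
    if "packages" in packages_yml and "packages" in dependencies_yml:
        raise ValueError("'packages' cannot be specified in both files")
    if "projects" in packages_yml:
        raise ValueError("'projects' cannot be specified in packages.yml")
    if "packages" in dependencies_yml:
        return dependencies_yml, "dependencies.yml"
    return packages_yml, "packages.yml"  # even if empty, so invalid keys still surface later

pick_packages({}, {"packages": [{"local": "../some_pkg"}]})  # -> ({'packages': [...]}, 'dependencies.yml')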
def package_config_from_data(packages_data: Dict[str, Any]):
def package_config_from_data(packages_data: Dict[str, Any]) -> PackageConfig:
if not packages_data:
packages_data = {"packages": []}
@@ -244,6 +264,7 @@ class RenderComponents:
@dataclass
class PartialProject(RenderComponents):
# This class includes the project_dict, packages_dict, selectors_dict, etc from RenderComponents
profile_name: Optional[str] = field(
metadata=dict(description="The unrendered profile name in the project, if set")
)
@@ -260,6 +281,9 @@ class PartialProject(RenderComponents):
verify_version: bool = field(
metadata=dict(description=("If True, verify the dbt version matches the required version"))
)
packages_specified_path: str = field(
metadata=dict(description="The filename where packages were specified")
)
def render_profile_name(self, renderer) -> Optional[str]:
if self.profile_name is None:
@@ -272,7 +296,9 @@ class PartialProject(RenderComponents):
) -> RenderComponents:
rendered_project = renderer.render_project(self.project_dict, self.project_root)
rendered_packages = renderer.render_packages(self.packages_dict)
rendered_packages = renderer.render_packages(
self.packages_dict, self.packages_specified_path
)
rendered_selectors = renderer.render_selectors(self.selectors_dict)
return RenderComponents(
@@ -281,7 +307,7 @@ class PartialProject(RenderComponents):
selectors_dict=rendered_selectors,
)
# Called by 'collect_parts' in RuntimeConfig
# Called by Project.from_project_root (not PartialProject.from_project_root!)
def render(self, renderer: DbtProjectYamlRenderer) -> "Project":
try:
rendered = self.get_rendered(renderer)
@@ -315,10 +341,10 @@ class PartialProject(RenderComponents):
# this field is no longer supported, but many projects may specify it with the default value
# if so, let's only raise this deprecation warning if they set a custom value
if not default_value or project_dict[deprecated_path] != default_value:
deprecations.warn(
f"project-config-{deprecated_path}",
deprecated_path=deprecated_path,
)
kwargs = {"deprecated_path": deprecated_path}
if expected_path:
kwargs.update({"exp_path": expected_path})
deprecations.warn(f"project-config-{deprecated_path}", **kwargs)
def create_project(self, rendered: RenderComponents) -> "Project":
unrendered = RenderComponents(
@@ -424,7 +450,7 @@ class PartialProject(RenderComponents):
query_comment = _query_comment_from_cfg(cfg.query_comment)
packages = package_config_from_data(rendered.packages_dict)
packages: PackageConfig = package_config_from_data(rendered.packages_dict)
selectors = selector_config_from_data(rendered.selectors_dict)
manifest_selectors: Dict[str, Any] = {}
if rendered.selectors_dict and rendered.selectors_dict["selectors"]:
@@ -450,6 +476,7 @@ class PartialProject(RenderComponents):
clean_targets=clean_targets,
log_path=log_path,
packages_install_path=packages_install_path,
packages_specified_path=self.packages_specified_path,
quoting=quoting,
models=models,
on_run_start=on_run_start,
@@ -470,6 +497,7 @@ class PartialProject(RenderComponents):
config_version=cfg.config_version,
unrendered=unrendered,
project_env_vars=project_env_vars,
restrict_access=cfg.restrict_access,
)
# sanity check - this means an internal issue
project.validate()
@@ -484,11 +512,13 @@ class PartialProject(RenderComponents):
selectors_dict: Dict[str, Any],
*,
verify_version: bool = False,
packages_specified_path: str = PACKAGES_FILE_NAME,
):
"""Construct a partial project from its constituent dicts."""
project_name = project_dict.get("name")
profile_name = project_dict.get("profile")
# Create a PartialProject
return cls(
profile_name=profile_name,
project_name=project_name,
@@ -497,6 +527,7 @@ class PartialProject(RenderComponents):
packages_dict=packages_dict,
selectors_dict=selectors_dict,
verify_version=verify_version,
packages_specified_path=packages_specified_path,
)
@classmethod
@@ -505,7 +536,10 @@ class PartialProject(RenderComponents):
) -> "PartialProject":
project_root = os.path.normpath(project_root)
project_dict = load_raw_project(project_root)
packages_dict = package_data_from_root(project_root)
(
packages_dict,
packages_specified_path,
) = package_and_project_data_from_root(project_root)
selectors_dict = selector_data_from_root(project_root)
return cls.from_dicts(
project_root=project_root,
@@ -513,6 +547,7 @@ class PartialProject(RenderComponents):
selectors_dict=selectors_dict,
packages_dict=packages_dict,
verify_version=verify_version,
packages_specified_path=packages_specified_path,
)
@@ -552,6 +587,7 @@ class Project:
clean_targets: List[str]
log_path: str
packages_install_path: str
packages_specified_path: str
quoting: Dict[str, Any]
models: Dict[str, Any]
on_run_start: List[str]
@@ -565,13 +601,14 @@ class Project:
exposures: Dict[str, Any]
vars: VarProvider
dbt_version: List[VersionSpecifier]
packages: Dict[str, Any]
packages: PackageConfig
manifest_selectors: Dict[str, Any]
selectors: SelectorConfig
query_comment: QueryComment
config_version: int
unrendered: RenderComponents
project_env_vars: Dict[str, Any]
restrict_access: bool
@property
def all_source_paths(self) -> List[str]:
@@ -640,6 +677,7 @@ class Project:
"vars": self.vars.to_dict(),
"require-dbt-version": [v.to_version_string() for v in self.dbt_version],
"config-version": self.config_version,
"restrict-access": self.restrict_access,
}
)
if self.query_comment:
@@ -656,13 +694,9 @@ class Project:
except ValidationError as e:
raise ProjectContractBrokenError(e) from e
@classmethod
def partial_load(cls, project_root: str, *, verify_version: bool = False) -> PartialProject:
return PartialProject.from_project_root(
project_root,
verify_version=verify_version,
)
# Called by:
# RtConfig.load_dependencies => RtConfig.load_projects => RtConfig.new_project => Project.from_project_root
# RtConfig.from_args => RtConfig.collect_parts => load_project => Project.from_project_root
@classmethod
def from_project_root(
cls,
@@ -700,3 +734,8 @@ class Project:
if dispatch_entry["macro_namespace"] == macro_namespace:
return dispatch_entry["search_order"]
return None
@property
def project_target_path(self):
# If target_path is absolute, project_root will not be included
return os.path.join(self.project_root, self.target_path)
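# Sketch of the os.path.join behavior the comment above relies on (paths hypothetical):
import os
os.path.join("/home/me/project", "target")       # -> "/home/me/project/target"
os.path.join("/home/me/project", "/tmp/target")  # -> "/tmp/target": an absolute target_path wins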

View File

@@ -1,9 +1,10 @@
from typing import Dict, Any, Tuple, Optional, Union, Callable
import re
import os
from datetime import date
from dbt.clients.jinja import get_rendered, catch_jinja
from dbt.constants import SECRET_ENV_PREFIX
from dbt.constants import SECRET_ENV_PREFIX, DEPENDENCIES_FILE_NAME
from dbt.context.target import TargetContext
from dbt.context.secret import SecretContext, SECRET_PLACEHOLDER
from dbt.context.base import BaseContext
@@ -33,10 +34,10 @@ class BaseRenderer:
return self.render_value(value, keypath)
def render_value(self, value: Any, keypath: Optional[Keypath] = None) -> Any:
# keypath is ignored.
# if it wasn't read as a string, ignore it
# keypath is ignored (and someone who knows should explain why here)
if not isinstance(value, str):
return value
return value if not isinstance(value, date) else value.isoformat()
try:
with catch_jinja():
return get_rendered(value, self.context, native=True)
@@ -131,10 +132,15 @@ class DbtProjectYamlRenderer(BaseRenderer):
rendered_project["project-root"] = project_root
return rendered_project
def render_packages(self, packages: Dict[str, Any]):
def render_packages(self, packages: Dict[str, Any], packages_specified_path: str):
"""Render the given packages dict"""
packages = packages or {}  # Sometimes this is None in tests
package_renderer = self.get_package_renderer()
return package_renderer.render_data(packages)
if packages_specified_path == DEPENDENCIES_FILE_NAME:
# We don't want to render the "packages" dictionary that came from dependencies.yml
return packages
else:
return package_renderer.render_data(packages)
def render_selectors(self, selectors: Dict[str, Any]):
return self.render_data(selectors)
@@ -182,7 +188,17 @@ class SecretRenderer(BaseRenderer):
# First, standard Jinja rendering, with special handling for 'secret' environment variables
# "{{ env_var('DBT_SECRET_ENV_VAR') }}" -> "$$$DBT_SECRET_START$$$DBT_SECRET_ENV_{VARIABLE_NAME}$$$DBT_SECRET_END$$$"
# This prevents Jinja manipulation of secrets via macros/filters that might leak partial/modified values in logs
rendered = super().render_value(value, keypath)
try:
rendered = super().render_value(value, keypath)
except Exception as ex:
if keypath and "password" in keypath:
# Passwords sometimes contain jinja-esque characters, but we
# don't want to render them if they aren't valid jinja.
rendered = value
else:
raise ex
# Now, detect instances of the placeholder value ($$$DBT_SECRET_START...DBT_SECRET_END$$$)
# and replace them with the actual secret value
if SECRET_ENV_PREFIX in str(rendered):

View File

@@ -38,6 +38,7 @@ from .project import Project
from .renderer import DbtProjectYamlRenderer, ProfileRenderer
# Called by RuntimeConfig.collect_parts class method
def load_project(
project_root: str,
version_check: bool,
@@ -150,6 +151,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
clean_targets=project.clean_targets,
log_path=project.log_path,
packages_install_path=project.packages_install_path,
packages_specified_path=project.packages_specified_path,
quoting=quoting,
models=project.models,
on_run_start=project.on_run_start,
@@ -170,6 +172,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
config_version=project.config_version,
unrendered=project.unrendered,
project_env_vars=project.project_env_vars,
restrict_access=project.restrict_access,
profile_env_vars=profile.profile_env_vars,
profile_name=profile.profile_name,
target_name=profile.target_name,
@@ -236,6 +239,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
except ValidationError as e:
raise ConfigContractBrokenError(e) from e
# Called by RuntimeConfig.from_args
@classmethod
def collect_parts(cls: Type["RuntimeConfig"], args: Any) -> Tuple[Project, Profile]:
# profile_name from the project
@@ -250,7 +254,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
project = load_project(project_root, bool(flags.VERSION_CHECK), profile, cli_vars)
return project, profile
# Called in main.py, lib.py, task/base.py
# Called in task/base.py, in BaseTask.from_args
@classmethod
def from_args(cls, args: Any) -> "RuntimeConfig":
"""Given arguments, read in dbt_project.yml from the current directory,
@@ -271,7 +275,11 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
)
def get_metadata(self) -> ManifestMetadata:
return ManifestMetadata(project_id=self.hashed_name(), adapter_type=self.credentials.type)
return ManifestMetadata(
project_name=self.project_name,
project_id=self.hashed_name(),
adapter_type=self.credentials.type,
)
def _get_v2_config_paths(
self,
@@ -358,6 +366,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
raise UninstalledPackagesFoundError(
count_packages_specified,
count_packages_installed,
self.packages_specified_path,
self.packages_install_path,
)
project_paths = itertools.chain(internal_packages, self._get_project_directories())

View File

@@ -21,7 +21,7 @@ The selectors.yml file in this project is malformed. Please double check
the contents of this file and fix any errors before retrying.
You can find more information on the syntax for this file here:
https://docs.getdbt.com/docs/package-management
https://docs.getdbt.com/reference/node-selection/yaml-selectors
Validator Error:
{error}

View File

@@ -19,6 +19,6 @@ def parse_cli_yaml_string(var_string: str, cli_option_name: str) -> Dict[str, An
return cli_vars
else:
raise OptionNotYamlDictError(var_type, cli_option_name)
except DbtValidationError:
except (DbtValidationError, OptionNotYamlDictError):
fire_event(InvalidOptionYAML(option_name=cli_option_name))
raise

View File

@@ -8,3 +8,9 @@ MAXIMUM_SEED_SIZE_NAME = "1MB"
PIN_PACKAGE_URL = (
"https://docs.getdbt.com/docs/package-management#section-specifying-package-versions"
)
PACKAGES_FILE_NAME = "packages.yml"
DEPENDENCIES_FILE_NAME = "dependencies.yml"
MANIFEST_FILE_NAME = "manifest.json"
SEMANTIC_MANIFEST_FILE_NAME = "semantic_manifest.json"
PARTIAL_PARSE_FILE_NAME = "partial_parse.msgpack"

View File

@@ -1,6 +1,7 @@
import json
import os
from typing import Any, Dict, NoReturn, Optional, Mapping, Iterable, Set, List
import threading
from dbt.flags import get_flags
import dbt.flags as flags_module
@@ -596,6 +597,11 @@ class BaseContext(metaclass=ContextMeta):
"""
return get_invocation_id()
@contextproperty
def thread_id(self) -> str:
"""thread_id outputs an ID for the current thread (useful for auditing)"""
return threading.current_thread().name
@contextproperty
def modules(self) -> Dict[str, Any]:
"""The `modules` variable in the Jinja context contains useful Python
@@ -652,7 +658,7 @@ class BaseContext(metaclass=ContextMeta):
{% endmacro %}"
"""
if not get_flags().PRINT:
if get_flags().PRINT:
print(msg)
return ""

View File

@@ -52,10 +52,11 @@ class ConfiguredVar(Var):
adapter_type = self._config.credentials.type
lookup = FQNLookup(self._project_name)
active_vars = self._config.vars.vars_for(lookup, adapter_type)
all_vars = MultiDict([active_vars])
all_vars = MultiDict()
if self._config.project_name != my_config.project_name:
all_vars.add(my_config.vars.vars_for(lookup, adapter_type))
all_vars.add(active_vars)
if var_name in all_vars:
return all_vars[var_name]
@@ -118,7 +119,9 @@ class MacroResolvingContext(ConfiguredContext):
def generate_schema_yml_context(
config: AdapterRequiredConfig, project_name: str, schema_yaml_vars: SchemaYamlVars = None
config: AdapterRequiredConfig,
project_name: str,
schema_yaml_vars: Optional[SchemaYamlVars] = None,
) -> Dict[str, Any]:
ctx = SchemaYamlContext(config, project_name, schema_yaml_vars)
return ctx.to_dict()

View File

@@ -1,7 +1,7 @@
from abc import abstractmethod
from copy import deepcopy
from dataclasses import dataclass
from typing import List, Iterator, Dict, Any, TypeVar, Generic
from typing import List, Iterator, Dict, Any, TypeVar, Generic, Optional
from dbt.config import RuntimeConfig, Project, IsFQNResource
from dbt.contracts.graph.model_config import BaseConfig, get_config_for, _listify
@@ -130,7 +130,7 @@ class BaseContextConfigGenerator(Generic[T]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: Dict[str, Any] = None,
patch_config_dict: Optional[Dict[str, Any]] = None,
) -> BaseConfig:
own_config = self.get_node_project(project_name)
@@ -166,7 +166,7 @@ class BaseContextConfigGenerator(Generic[T]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: Dict[str, Any],
patch_config_dict: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
...
@@ -200,7 +200,7 @@ class ContextConfigGenerator(BaseContextConfigGenerator[C]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: dict = None,
patch_config_dict: Optional[dict] = None,
) -> Dict[str, Any]:
config = self.calculate_node_config(
config_call_dict=config_call_dict,
@@ -225,7 +225,7 @@ class UnrenderedConfigGenerator(BaseContextConfigGenerator[Dict[str, Any]]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: dict = None,
patch_config_dict: Optional[dict] = None,
) -> Dict[str, Any]:
# TODO CT-211
return self.calculate_node_config(
@@ -318,7 +318,11 @@ class ContextConfig:
config_call_dict[k] = v
def build_config_dict(
self, base: bool = False, *, rendered: bool = True, patch_config_dict: dict = None
self,
base: bool = False,
*,
rendered: bool = True,
patch_config_dict: Optional[dict] = None,
) -> Dict[str, Any]:
if rendered:
# TODO CT-211

View File

@@ -25,6 +25,7 @@ from dbt.exceptions import (
RelationWrongTypeError,
ContractError,
ColumnTypeMissingError,
FailFastError,
)
@@ -107,6 +108,10 @@ def column_type_missing(column_names) -> NoReturn:
raise ColumnTypeMissingError(column_names)
def raise_fail_fast_error(msg, node=None) -> NoReturn:
raise FailFastError(msg, node=node)
# Update this when a new function should be added to the
# dbt context's `exceptions` key!
CONTEXT_EXPORTS = {
@@ -131,6 +136,7 @@ CONTEXT_EXPORTS = {
relation_wrong_type,
raise_contract_error,
column_type_missing,
raise_fail_fast_error,
]
}

View File

@@ -32,12 +32,13 @@ from dbt.contracts.graph.manifest import Manifest, Disabled
from dbt.contracts.graph.nodes import (
Macro,
Exposure,
Metric,
SeedNode,
SourceDefinition,
Resource,
ManifestNode,
RefArgs,
AccessType,
SemanticModel,
)
from dbt.contracts.graph.metrics import MetricReference, ResolvedMetricReference
from dbt.contracts.graph.unparsed import NodeVersion
@@ -54,6 +55,7 @@ from dbt.exceptions import (
LoadAgateTableNotSeedError,
LoadAgateTableValueError,
MacroDispatchArgError,
MacroResultAlreadyLoadedError,
MacrosSourcesUnWriteableError,
MetricArgsError,
MissingConfigError,
@@ -65,11 +67,12 @@ from dbt.exceptions import (
DbtRuntimeError,
TargetNotFoundError,
DbtValidationError,
DbtReferenceError,
)
from dbt.config import IsFQNResource
from dbt.node_types import NodeType, ModelLanguage
from dbt.utils import merge, AttrDict, MultiDict, args_to_dict
from dbt.utils import merge, AttrDict, MultiDict, args_to_dict, cast_to_str
from dbt import selected_resources
@@ -130,6 +133,25 @@ class BaseDatabaseWrapper:
search_prefixes = get_adapter_type_names(self._adapter.type()) + ["default"]
return search_prefixes
def _get_search_packages(self, namespace: Optional[str] = None) -> List[Optional[str]]:
search_packages: List[Optional[str]] = [None]
if namespace is None:
search_packages = [None]
elif isinstance(namespace, str):
macro_search_order = self._adapter.config.get_macro_search_order(namespace)
if macro_search_order:
search_packages = macro_search_order
elif not macro_search_order and namespace in self._adapter.config.dependencies:
search_packages = [self.config.project_name, namespace]
else:
raise CompilationError(
f"In adapter.dispatch, got a {type(namespace)} macro_namespace argument "
f'("{namespace}"), but macro_namespace should be None or a string.'
)
return search_packages
def dispatch(
self,
macro_name: str,
@@ -151,20 +173,7 @@ class BaseDatabaseWrapper:
if packages is not None:
raise MacroDispatchArgError(macro_name)
namespace = macro_namespace
if namespace is None:
search_packages = [None]
elif isinstance(namespace, str):
search_packages = self._adapter.config.get_macro_search_order(namespace)
if not search_packages and namespace in self._adapter.config.dependencies:
search_packages = [self.config.project_name, namespace]
else:
# Not a string and not None so must be a list
raise CompilationError(
f"In adapter.dispatch, got a list macro_namespace argument "
f'("{macro_namespace}"), but macro_namespace should be None or a string.'
)
search_packages = self._get_search_packages(macro_namespace)
attempts = []
@@ -188,7 +197,7 @@ class BaseDatabaseWrapper:
return macro
searched = ", ".join(repr(a) for a in attempts)
msg = f"In dispatch: No macro named '{macro_name}' found\n Searched for: {searched}"
msg = f"In dispatch: No macro named '{macro_name}' found within namespace: '{macro_namespace}'\n Searched for: {searched}"
raise CompilationError(msg)
@@ -282,6 +291,7 @@ class BaseSourceResolver(BaseResolver):
class BaseMetricResolver(BaseResolver):
@abc.abstractmethod
def resolve(self, name: str, package: Optional[str] = None) -> MetricReference:
...
@@ -464,6 +474,7 @@ class ParseRefResolver(BaseRefResolver):
) -> RelationProxy:
self.model.refs.append(self._repack_args(name, package, version))
# This is not the ref for the "name" passed in, but for the current model.
return self.Relation.create_from(self.config, self.model)
@@ -478,6 +489,7 @@ class RuntimeRefResolver(BaseRefResolver):
target_version: Optional[NodeVersion] = None,
) -> RelationProxy:
target_model = self.manifest.resolve_ref(
self.model,
target_name,
target_package,
target_version,
@@ -494,6 +506,25 @@ class RuntimeRefResolver(BaseRefResolver):
target_version=target_version,
disabled=isinstance(target_model, Disabled),
)
elif self.manifest.is_invalid_private_ref(
self.model, target_model, self.config.dependencies
):
raise DbtReferenceError(
unique_id=self.model.unique_id,
ref_unique_id=target_model.unique_id,
access=AccessType.Private,
scope=cast_to_str(target_model.group),
)
elif self.manifest.is_invalid_protected_ref(
self.model, target_model, self.config.dependencies
):
raise DbtReferenceError(
unique_id=self.model.unique_id,
ref_unique_id=target_model.unique_id,
access=AccessType.Protected,
scope=target_model.package_name,
)
self.validate(target_model, target_name, target_package, target_version)
return self.create_relation(target_model)
@@ -704,7 +735,7 @@ class ProviderContext(ManifestContext):
self.config: RuntimeConfig
self.model: Union[Macro, ManifestNode] = model
super().__init__(config, manifest, model.package_name)
self.sql_results: Dict[str, AttrDict] = {}
self.sql_results: Dict[str, Optional[AttrDict]] = {}
self.context_config: Optional[ContextConfig] = context_config
self.provider: Provider = provider
self.adapter = get_adapter(self.config)
@@ -732,12 +763,29 @@ class ProviderContext(ManifestContext):
return args_to_dict(self.config.args)
@contextproperty
def _sql_results(self) -> Dict[str, AttrDict]:
def _sql_results(self) -> Dict[str, Optional[AttrDict]]:
return self.sql_results
@contextmember
def load_result(self, name: str) -> Optional[AttrDict]:
return self.sql_results.get(name)
if name in self.sql_results:
# handle the special case of "main" macro
# See: https://github.com/dbt-labs/dbt-core/blob/ada8860e48b32ac712d92e8b0977b2c3c9749981/core/dbt/task/run.py#L228
if name == "main":
return self.sql_results["main"]
# handle a None, which indicates this name was populated but has since been loaded
elif self.sql_results[name] is None:
raise MacroResultAlreadyLoadedError(name)
# Handle the regular use case
else:
ret_val = self.sql_results[name]
self.sql_results[name] = None
return ret_val
else:
# Handle trying to load a result that was never stored
return None
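# A stand-alone sketch (not dbt code) of the load/store semantics above: results are
# single-use except for the special "main" entry, and a consumed entry is flagged with None.
def load_once(sql_results, name):
    if name not in sql_results:
        return None                                   # never stored
    if name == "main":
        return sql_results["main"]                    # "main" may be read repeatedly
    if sql_results[name] is None:
        raise RuntimeError(f"{name} already loaded")  # stands in for MacroResultAlreadyLoadedError
    value, sql_results[name] = sql_results[name], None  # consume and mark as loaded
    return value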
@contextmember
def store_result(
@@ -793,7 +841,8 @@ class ProviderContext(ManifestContext):
# macros/source defs aren't 'writeable'.
if isinstance(self.model, (Macro, SourceDefinition)):
raise MacrosSourcesUnWriteableError(node=self.model)
self.model.build_path = self.model.write_node(self.config.target_path, "run", payload)
self.model.build_path = self.model.get_target_write_path(self.config.target_path, "run")
self.model.write_node(self.config.project_root, self.model.build_path, payload)
return ""
@contextmember
@@ -1329,20 +1378,30 @@ class ModelContext(ProviderContext):
@contextproperty
def sql(self) -> Optional[str]:
# only doing this in sql models for backward compatibility
if (
getattr(self.model, "extra_ctes_injected", None)
and self.model.language == ModelLanguage.sql # type: ignore[union-attr]
):
# TODO CT-211
return self.model.compiled_code # type: ignore[union-attr]
return None
if self.model.language == ModelLanguage.sql: # type: ignore[union-attr]
# If the model is deferred and the adapter doesn't support zero-copy cloning, then select * from the prod
# relation
if getattr(self.model, "defer_relation", None):
# TODO https://github.com/dbt-labs/dbt-core/issues/7976
return f"select * from {self.model.defer_relation.relation_name or str(self.defer_relation)}" # type: ignore[union-attr]
elif getattr(self.model, "extra_ctes_injected", None):
# TODO CT-211
return self.model.compiled_code # type: ignore[union-attr]
else:
return None
else:
return None
@contextproperty
def compiled_code(self) -> Optional[str]:
if getattr(self.model, "extra_ctes_injected", None):
if getattr(self.model, "defer_relation", None):
# TODO https://github.com/dbt-labs/dbt-core/issues/7976
return f"select * from {self.model.defer_relation.relation_name or str(self.defer_relation)}" # type: ignore[union-attr]
elif getattr(self.model, "extra_ctes_injected", None):
# TODO CT-211
return self.model.compiled_code # type: ignore[union-attr]
return None
else:
return None
@contextproperty
def database(self) -> str:
@@ -1387,6 +1446,20 @@ class ModelContext(ProviderContext):
return None
return self.db_wrapper.Relation.create_from(self.config, self.model)
@contextproperty
def defer_relation(self) -> Optional[RelationProxy]:
"""
For commands which add information about this node's corresponding
production version (via a --state artifact), access the Relation
object for that stateful other
"""
if getattr(self.model, "defer_relation", None):
return self.db_wrapper.Relation.create_from_node(
self.config, self.model.defer_relation # type: ignore
)
else:
return None
# This is called by '_context_for', used in 'render_with_context'
def generate_parser_model_context(
@@ -1493,7 +1566,8 @@ def generate_parse_exposure(
}
class MetricRefResolver(BaseResolver):
# applies to SemanticModels
class SemanticModelRefResolver(BaseResolver):
def __call__(self, *args, **kwargs) -> str:
package = None
if len(args) == 1:
@@ -1506,34 +1580,30 @@ class MetricRefResolver(BaseResolver):
version = kwargs.get("version") or kwargs.get("v")
self.validate_args(name, package, version)
# "model" here is any node
self.model.refs.append(RefArgs(package=package, name=name, version=version))
return ""
def validate_args(self, name, package, version):
if not isinstance(name, str):
raise ParsingError(
f"In a metrics section in {self.model.original_file_path} "
f"In a semantic model or metrics section in {self.model.original_file_path} "
"the name argument to ref() must be a string"
)
def generate_parse_metrics(
metric: Metric,
# used for semantic models
def generate_parse_semantic_models(
semantic_model: SemanticModel,
config: RuntimeConfig,
manifest: Manifest,
package_name: str,
) -> Dict[str, Any]:
project = config.load_dependencies()[package_name]
return {
"ref": MetricRefResolver(
"ref": SemanticModelRefResolver(
None,
metric,
project,
manifest,
),
"metric": ParseMetricResolver(
None,
metric,
semantic_model,
project,
manifest,
),

View File

@@ -228,6 +228,7 @@ class SchemaSourceFile(BaseSourceFile):
groups: List[str] = field(default_factory=list)
# node patches contain models, seeds, snapshots, analyses
ndp: List[str] = field(default_factory=list)
semantic_models: List[str] = field(default_factory=list)
# any macro patches in this file by macro unique_id.
mcp: Dict[str, str] = field(default_factory=dict)
# any source patches in this file. The entries are package, name pairs

View File

@@ -1,9 +1,11 @@
import enum
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import chain, islice
from mashumaro.mixins.msgpack import DataClassMessagePackMixin
from multiprocessing.synchronize import Lock
from typing import (
DefaultDict,
Dict,
List,
Optional,
@@ -23,20 +25,23 @@ from typing_extensions import Protocol
from uuid import UUID
from dbt.contracts.graph.nodes import (
Macro,
Documentation,
SourceDefinition,
GenericTestNode,
Exposure,
Metric,
Group,
UnpatchedSourceDefinition,
ManifestNode,
GraphMemberNode,
ResultNode,
BaseNode,
Documentation,
Exposure,
GenericTestNode,
GraphMemberNode,
Group,
Macro,
ManifestNode,
Metric,
ModelNode,
DeferRelation,
ResultNode,
SemanticModel,
SourceDefinition,
UnpatchedSourceDefinition,
)
from dbt.contracts.graph.unparsed import SourcePatch, NodeVersion
from dbt.contracts.graph.unparsed import SourcePatch, NodeVersion, UnparsedVersion
from dbt.contracts.graph.manifest_upgrade import upgrade_manifest_json
from dbt.contracts.files import SourceFile, SchemaSourceFile, FileHash, AnySourceFile
from dbt.contracts.util import BaseArtifactMetadata, SourceKey, ArtifactMixin, schema_version
@@ -46,15 +51,18 @@ from dbt.exceptions import (
DuplicateResourceNameError,
DuplicateMacroInPackageError,
DuplicateMaterializationNameError,
AmbiguousResourceNameRefError,
)
from dbt.helper_types import PathSet
from dbt.events.functions import fire_event
from dbt.events.types import MergedFromState
from dbt.node_types import NodeType
from dbt.events.types import MergedFromState, UnpinnedRefNewVersionAvailable
from dbt.events.contextvars import get_node_info
from dbt.node_types import NodeType, AccessType
from dbt.flags import get_flags, MP_CONTEXT
from dbt import tracking
import dbt.utils
NodeEdgeMap = Dict[str, List[str]]
PackageName = str
DocName = str
@@ -148,28 +156,70 @@ class RefableLookup(dbtClassMixin):
_lookup_types: ClassVar[set] = set(NodeType.refable())
_versioned_types: ClassVar[set] = set(NodeType.versioned())
# refables are actually unique, so the Dict[PackageName, UniqueID] will
# only ever have exactly one value, but doing 3 dict lookups instead of 1
# is not a big deal at all and retains consistency
def __init__(self, manifest: "Manifest"):
self.storage: Dict[str, Dict[PackageName, UniqueID]] = {}
self.populate(manifest)
def get_unique_id(self, key, package: Optional[PackageName], version: Optional[NodeVersion]):
def get_unique_id(
self,
key: str,
package: Optional[PackageName],
version: Optional[NodeVersion],
node: Optional[GraphMemberNode] = None,
):
if version:
key = f"{key}.v{version}"
return find_unique_id_for_package(self.storage, key, package)
unique_ids = self._find_unique_ids_for_package(key, package)
if len(unique_ids) > 1:
raise AmbiguousResourceNameRefError(key, unique_ids, node)
else:
return unique_ids[0] if unique_ids else None
def find(
self,
key,
key: str,
package: Optional[PackageName],
version: Optional[NodeVersion],
manifest: "Manifest",
source_node: Optional[GraphMemberNode] = None,
):
unique_id = self.get_unique_id(key, package, version)
unique_id = self.get_unique_id(key, package, version, source_node)
if unique_id is not None:
return self.perform_lookup(unique_id, manifest)
node = self.perform_lookup(unique_id, manifest)
# If this is an unpinned ref (no 'version' arg was passed),
# AND this is a versioned node,
# AND this ref is being resolved at runtime -- get_node_info != {}
# Only ModelNodes can be versioned.
if (
isinstance(node, ModelNode)
and version is None
and node.is_versioned
and get_node_info()
):
# Check to see if newer versions are available, and log an "FYI" if so
max_version: UnparsedVersion = max(
[
UnparsedVersion(v.version)
for v in manifest.nodes.values()
if isinstance(v, ModelNode)
and v.name == node.name
and v.version is not None
]
)
assert node.latest_version is not None  # for mypy, whenever I may find it
if max_version > UnparsedVersion(node.latest_version):
fire_event(
UnpinnedRefNewVersionAvailable(
node_info=get_node_info(),
ref_node_name=node.name,
ref_node_package=node.package_name,
ref_node_version=str(node.version),
ref_max_version=str(max_version.v),
)
)
return node
return None
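# Sketch of the "newer version available" check above, with integers standing in for
# UnparsedVersion objects (the real comparison uses UnparsedVersion ordering):
versions_in_manifest = [1, 2, 3]  # versions of every ModelNode sharing this name
latest_version = 2                # node.latest_version, what an unpinned ref resolves to
if max(versions_in_manifest) > latest_version:
    # the real code fires UnpinnedRefNewVersionAvailable here
    print("FYI: ref() is unpinned and a newer model version exists")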
def add_node(self, node: ManifestNode):
@@ -177,7 +227,7 @@ class RefableLookup(dbtClassMixin):
if node.name not in self.storage:
self.storage[node.name] = {}
if node.resource_type in self._versioned_types and node.version:
if node.is_versioned:
if node.search_name not in self.storage:
self.storage[node.search_name] = {}
self.storage[node.search_name][node.package_name] = node.unique_id
@@ -191,11 +241,29 @@ class RefableLookup(dbtClassMixin):
self.add_node(node)
def perform_lookup(self, unique_id: UniqueID, manifest) -> ManifestNode:
if unique_id not in manifest.nodes:
if unique_id in manifest.nodes:
node = manifest.nodes[unique_id]
else:
raise dbt.exceptions.DbtInternalError(
f"Node {unique_id} found in cache but not found in manifest"
)
return manifest.nodes[unique_id]
return node
def _find_unique_ids_for_package(self, key, package: Optional[PackageName]) -> List[str]:
if key not in self.storage:
return []
pkg_dct: Mapping[PackageName, UniqueID] = self.storage[key]
if package is None:
if not pkg_dct:
return []
else:
return list(pkg_dct.values())
elif package in pkg_dct:
return [pkg_dct[package]]
else:
return []
class MetricLookup(dbtClassMixin):
@@ -231,6 +299,49 @@ class MetricLookup(dbtClassMixin):
return manifest.metrics[unique_id]
class SemanticModelByMeasureLookup(dbtClassMixin):
"""Lookup utility for finding SemanticModel by measure
This is possible because measure names are supposed to be unique across
the semantic models in a manifest.
"""
def __init__(self, manifest: "Manifest"):
self.storage: DefaultDict[str, Dict[PackageName, UniqueID]] = defaultdict(dict)
self.populate(manifest)
def get_unique_id(self, search_name: str, package: Optional[PackageName]):
return find_unique_id_for_package(self.storage, search_name, package)
def find(
self, search_name: str, package: Optional[PackageName], manifest: "Manifest"
) -> Optional[SemanticModel]:
"""Tries to find a SemanticModel based on a measure name"""
unique_id = self.get_unique_id(search_name, package)
if unique_id is not None:
return self.perform_lookup(unique_id, manifest)
return None
def add(self, semantic_model: SemanticModel):
"""Sets all measures for a SemanticModel as paths to the SemanticModel's `unique_id`"""
for measure in semantic_model.measures:
self.storage[measure.name][semantic_model.package_name] = semantic_model.unique_id
def populate(self, manifest: "Manifest"):
"""Populate storage with all the measure + package paths to the Manifest's SemanticModels"""
for semantic_model in manifest.semantic_models.values():
self.add(semantic_model=semantic_model)
def perform_lookup(self, unique_id: UniqueID, manifest: "Manifest") -> SemanticModel:
"""Tries to get a SemanticModel from the Manifest"""
semantic_model = manifest.semantic_models.get(unique_id)
if semantic_model is None:
raise dbt.exceptions.DbtInternalError(
f"Semantic model `{unique_id}` found in cache but not found in manifest"
)
return semantic_model
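# Sketch of the storage shape the lookup above maintains (ids are hypothetical):
# measure name -> {package name -> unique_id of the owning SemanticModel}.
from collections import defaultdict

storage = defaultdict(dict)
storage["order_total"]["my_project"] = "semantic_model.my_project.orders"
storage["num_orders"]["my_project"] = "semantic_model.my_project.orders"
# find("order_total", "my_project", manifest) resolves that unique_id and then returns
# manifest.semantic_models["semantic_model.my_project.orders"].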
# This handles both models/seeds/snapshots and sources/metrics/exposures
class DisabledLookup(dbtClassMixin):
def __init__(self, manifest: "Manifest"):
@@ -277,7 +388,7 @@ class AnalysisLookup(RefableLookup):
_versioned_types: ClassVar[set] = set()
def _search_packages(
def _packages_to_search(
current_project: str,
node_package: str,
target_package: Optional[str] = None,
@@ -297,10 +408,16 @@ class ManifestMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
default_factory=lambda: str(WritableManifest.dbt_schema_version)
)
project_name: Optional[str] = field(
default=None,
metadata={
"description": "Name of the root project",
},
)
project_id: Optional[str] = field(
default=None,
metadata={
"description": "A unique identifier for the project",
"description": "A unique identifier for the project, hashed from the project name",
},
)
user_id: Optional[UUID] = field(
@@ -354,7 +471,7 @@ def build_node_edges(nodes: List[ManifestNode]):
forward_edges: Dict[str, List[str]] = {n.unique_id: [] for n in nodes}
for node in nodes:
backward_edges[node.unique_id] = node.depends_on_nodes[:]
for unique_id in node.depends_on_nodes:
for unique_id in backward_edges[node.unique_id]:
if unique_id in forward_edges.keys():
forward_edges[unique_id].append(node.unique_id)
return _sort_values(forward_edges), _sort_values(backward_edges)
@@ -498,25 +615,6 @@ MaybeNonSource = Optional[Union[ManifestNode, Disabled[ManifestNode]]]
T = TypeVar("T", bound=GraphMemberNode)
def _update_into(dest: MutableMapping[str, T], new_item: T):
"""Update dest to overwrite whatever is at dest[new_item.unique_id] with
new_item. There must be an existing value to overwrite, and the two nodes
must have the same original file path.
"""
unique_id = new_item.unique_id
if unique_id not in dest:
raise dbt.exceptions.DbtRuntimeError(
f"got an update_{new_item.resource_type} call with an "
f"unrecognized {new_item.resource_type}: {new_item.unique_id}"
)
existing = dest[unique_id]
if new_item.original_file_path != existing.original_file_path:
raise dbt.exceptions.DbtRuntimeError(
f"cannot update a {new_item.resource_type} to have a new file path!"
)
dest[unique_id] = new_item
# This contains macro methods that are in both the Manifest
# and the MacroManifest
class MacroMethods:
@@ -549,26 +647,36 @@ class MacroMethods:
return candidates.last()
def find_generate_macro_by_name(
self, component: str, root_project_name: str
self, component: str, root_project_name: str, imported_package: Optional[str] = None
) -> Optional[Macro]:
"""
The `generate_X_name` macros are similar to regular ones, but ignore
imported packages.
The default `generate_X_name` macros are similar to regular ones, but only
include imported packages when searching for a package.
- if package is not provided:
    - if there is a `generate_{component}_name` macro in the root
      project, return it
    - otherwise, return the `generate_{component}_name` macro from the 'dbt'
      internal project
- if package is provided:
    - return the `generate_{component}_name` macro from the imported
      package, if one exists
"""
def filter(candidate: MacroCandidate) -> bool:
return candidate.locality != Locality.Imported
if imported_package:
return (
candidate.locality == Locality.Imported
and imported_package == candidate.macro.package_name
)
else:
return candidate.locality != Locality.Imported
candidates: CandidateList = self._find_macros_by_name(
name=f"generate_{component}_name",
root_project_name=root_project_name,
# filter out imported packages
filter=filter,
)
return candidates.last()
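# A stand-alone sketch (not dbt code) of the precedence described in the docstring above,
# with plain dicts standing in for the macro candidate machinery:
def resolve_generate_macro(root_macros, internal_macros, package_macros, name, imported_package=None):
    if imported_package:
        # only the named imported package's macro qualifies
        return package_macros.get(imported_package, {}).get(name)
    # imported packages are filtered out: the root project wins, else dbt's built-in
    return root_macros.get(name) or internal_macros.get(name)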
def _find_macros_by_name(
@@ -633,6 +741,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
source_patches: MutableMapping[SourceKey, SourcePatch] = field(default_factory=dict)
disabled: MutableMapping[str, List[GraphMemberNode]] = field(default_factory=dict)
env_vars: MutableMapping[str, str] = field(default_factory=dict)
semantic_models: MutableMapping[str, SemanticModel] = field(default_factory=dict)
_doc_lookup: Optional[DocLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
@@ -646,6 +755,9 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
_metric_lookup: Optional[MetricLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
_semantic_model_by_measure_lookup: Optional[SemanticModelByMeasureLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
_disabled_lookup: Optional[DisabledLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
@@ -672,18 +784,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
obj._lock = MP_CONTEXT.Lock()
return obj
def update_exposure(self, new_exposure: Exposure):
_update_into(self.exposures, new_exposure)
def update_metric(self, new_metric: Metric):
_update_into(self.metrics, new_metric)
def update_node(self, new_node: ManifestNode):
_update_into(self.nodes, new_node)
def update_source(self, new_source: SourceDefinition):
_update_into(self.sources, new_source)
def build_flat_graph(self):
"""This attribute is used in context.common by each node, so we want to
only build it once and avoid any concurrency issues around it.
@@ -696,6 +796,9 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
"metrics": {k: v.to_dict(omit_none=False) for k, v in self.metrics.items()},
"nodes": {k: v.to_dict(omit_none=False) for k, v in self.nodes.items()},
"sources": {k: v.to_dict(omit_none=False) for k, v in self.sources.items()},
"semantic_models": {
k: v.to_dict(omit_none=False) for k, v in self.semantic_models.items()
},
}
def build_disabled_by_file_id(self):
@@ -756,6 +859,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.nodes.values(),
self.sources.values(),
self.metrics.values(),
self.semantic_models.values(),
)
for resource in all_resources:
resource_type_plural = resource.resource_type.pluralize()
@@ -790,6 +894,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
disabled={k: _deepcopy(v) for k, v in self.disabled.items()},
files={k: _deepcopy(v) for k, v in self.files.items()},
state_check=_deepcopy(self.state_check),
semantic_models={k: _deepcopy(v) for k, v in self.semantic_models.items()},
)
copy.build_flat_graph()
return copy
@@ -801,6 +906,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.sources.values(),
self.exposures.values(),
self.metrics.values(),
self.semantic_models.values(),
)
)
forward_edges, backward_edges = build_node_edges(edge_members)
@@ -830,7 +936,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
group_map[node.group].append(node.unique_id)
self.group_map = group_map
def writable_manifest(self):
def writable_manifest(self) -> "WritableManifest":
self.build_parent_and_child_maps()
self.build_group_map()
return WritableManifest(
@@ -847,6 +953,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
child_map=self.child_map,
parent_map=self.parent_map,
group_map=self.group_map,
semantic_models=self.semantic_models,
)
def write(self, path):
@@ -863,6 +970,8 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
return self.exposures[unique_id]
elif unique_id in self.metrics:
return self.metrics[unique_id]
elif unique_id in self.semantic_models:
return self.semantic_models[unique_id]
else:
# something terrible has happened
raise dbt.exceptions.DbtInternalError(
@@ -899,6 +1008,13 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self._metric_lookup = MetricLookup(self)
return self._metric_lookup
@property
def semantic_model_by_measure_lookup(self) -> SemanticModelByMeasureLookup:
"""Gets (and creates if necessary) the lookup utility for getting SemanticModels by measures"""
if self._semantic_model_by_measure_lookup is None:
self._semantic_model_by_measure_lookup = SemanticModelByMeasureLookup(self)
return self._semantic_model_by_measure_lookup
def rebuild_ref_lookup(self):
self._ref_lookup = RefableLookup(self)
@@ -917,10 +1033,34 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self._analysis_lookup = AnalysisLookup(self)
return self._analysis_lookup
# Called by dbt.parser.manifest._resolve_refs_for_exposure
@property
def external_node_unique_ids(self):
return [node.unique_id for node in self.nodes.values() if node.is_external_node]
def resolve_refs(
self,
source_node: ModelNode,
current_project: str, # TODO: ModelNode is overly restrictive typing
) -> List[MaybeNonSource]:
resolved_refs: List[MaybeNonSource] = []
for ref in source_node.refs:
resolved = self.resolve_ref(
source_node,
ref.name,
ref.package,
ref.version,
current_project,
source_node.package_name,
)
resolved_refs.append(resolved)
return resolved_refs
# Called by dbt.parser.manifest._process_refs_for_exposure, _process_refs_for_metric,
# and dbt.parser.manifest._process_refs_for_node
def resolve_ref(
self,
source_node: GraphMemberNode,
target_model_name: str,
target_model_package: Optional[str],
target_model_version: Optional[NodeVersion],
@@ -931,11 +1071,13 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
node: Optional[ManifestNode] = None
disabled: Optional[List[ManifestNode]] = None
candidates = _search_packages(current_project, node_package, target_model_package)
candidates = _packages_to_search(current_project, node_package, target_model_package)
for pkg in candidates:
node = self.ref_lookup.find(target_model_name, pkg, target_model_version, self)
node = self.ref_lookup.find(
target_model_name, pkg, target_model_version, self, source_node
)
if node is not None and node.config.enabled:
if node is not None and hasattr(node, "config") and node.config.enabled:
return node
# it's possible that the node is disabled
@@ -956,7 +1098,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
node_package: str,
) -> MaybeParsedSource:
search_name = f"{target_source_name}.{target_table_name}"
candidates = _search_packages(current_project, node_package)
candidates = _packages_to_search(current_project, node_package)
source: Optional[SourceDefinition] = None
disabled: Optional[List[SourceDefinition]] = None
@@ -986,7 +1128,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
metric: Optional[Metric] = None
disabled: Optional[List[Metric]] = None
candidates = _search_packages(current_project, node_package, target_metric_package)
candidates = _packages_to_search(current_project, node_package, target_metric_package)
for pkg in candidates:
metric = self.metric_lookup.find(target_metric_name, pkg, self)
@@ -1000,6 +1142,25 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
return Disabled(disabled[0])
return None
def resolve_semantic_model_for_measure(
self,
target_measure_name: str,
current_project: str,
node_package: str,
target_package: Optional[str] = None,
) -> Optional[SemanticModel]:
"""Tries to find the SemanticModel that a measure belongs to"""
candidates = _packages_to_search(current_project, node_package, target_package)
for pkg in candidates:
semantic_model = self.semantic_model_by_measure_lookup.find(
target_measure_name, pkg, self
)
if semantic_model is not None:
return semantic_model
return None
# Called by DocsRuntimeContext.doc
def resolve_doc(
self,
@@ -1012,7 +1173,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
resolve_ref except the is_enabled checks are unnecessary as docs are
always enabled.
"""
candidates = _search_packages(current_project, node_package, package)
candidates = _packages_to_search(current_project, node_package, package)
for pkg in candidates:
result = self.doc_lookup.find(name, pkg, self)
@@ -1020,6 +1181,50 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
return result
return None
def is_invalid_private_ref(
self, node: GraphMemberNode, target_model: MaybeNonSource, dependencies: Optional[Mapping]
) -> bool:
dependencies = dependencies or {}
if not isinstance(target_model, ModelNode):
return False
is_private_ref = (
target_model.access == AccessType.Private
# don't raise this reference error for ad hoc 'preview' queries
and node.resource_type != NodeType.SqlOperation
and node.resource_type != NodeType.RPCCall # TODO: rm
)
target_dependency = dependencies.get(target_model.package_name)
restrict_package_access = target_dependency.restrict_access if target_dependency else False
# TODO: SemanticModel and SourceDefinition do not have group, and so should not be able to make _any_ private ref.
return is_private_ref and (
not hasattr(node, "group")
or not node.group
or node.group != target_model.group
or restrict_package_access
)
def is_invalid_protected_ref(
self, node: GraphMemberNode, target_model: MaybeNonSource, dependencies: Optional[Mapping]
) -> bool:
dependencies = dependencies or {}
if not isinstance(target_model, ModelNode):
return False
is_protected_ref = (
target_model.access == AccessType.Protected
# don't raise this reference error for ad hoc 'preview' queries
and node.resource_type != NodeType.SqlOperation
and node.resource_type != NodeType.RPCCall # TODO: rm
)
target_dependency = dependencies.get(target_model.package_name)
restrict_package_access = target_dependency.restrict_access if target_dependency else False
return is_protected_ref and (
node.package_name != target_model.package_name and restrict_package_access
)
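# A condensed sketch (not dbt code) of the two access checks above, for a ref from
# `node` to a model whose package may set restrict-access:
def ref_is_blocked(access, same_group, same_package, restrict_access):
    if access == "private":
        # blocked when the referencing node has no/another group, or the target package restricts access
        return (not same_group) or restrict_access
    if access == "protected":
        # blocked only across package boundaries, and only when access is restricted
        return (not same_package) and restrict_access
    return False  # public models are always referenceable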
# Called by RunTask.defer_to_manifest
def merge_from_artifact(
self,
@@ -1057,6 +1262,25 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
sample = list(islice(merged, 5))
fire_event(MergedFromState(num_merged=len(merged), sample=sample))
# Called by CloneTask.defer_to_manifest
def add_from_artifact(
self,
other: "WritableManifest",
) -> None:
"""Update this manifest by *adding* information about each node's location
in the other manifest.
Only non-ephemeral refable nodes are examined.
"""
refables = set(NodeType.refable())
for unique_id, node in other.nodes.items():
current = self.nodes.get(unique_id)
if current and (node.resource_type in refables and not node.is_ephemeral):
defer_relation = DeferRelation(
node.database, node.schema, node.alias, node.relation_name
)
self.nodes[unique_id] = current.replace(defer_relation=defer_relation)
# Methods that were formerly in ParseResult
def add_macro(self, source_file: SourceFile, macro: Macro):
@@ -1142,6 +1366,11 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.docs[doc.unique_id] = doc
source_file.docs.append(doc.unique_id)
def add_semantic_model(self, source_file: SchemaSourceFile, semantic_model: SemanticModel):
_check_duplicates(semantic_model, self.semantic_models)
self.semantic_models[semantic_model.unique_id] = semantic_model
source_file.semantic_models.append(semantic_model.unique_id)
# end of methods formerly in ParseResult
# Provide support for copy.deepcopy() - we just need to avoid the lock!
@@ -1168,10 +1397,12 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.source_patches,
self.disabled,
self.env_vars,
self.semantic_models,
self._doc_lookup,
self._source_lookup,
self._ref_lookup,
self._metric_lookup,
self._semantic_model_by_measure_lookup,
self._disabled_lookup,
self._analysis_lookup,
)
@@ -1191,7 +1422,7 @@ AnyManifest = Union[Manifest, MacroManifest]
@dataclass
@schema_version("manifest", 9)
@schema_version("manifest", 10)
class WritableManifest(ArtifactMixin):
nodes: Mapping[UniqueID, ManifestNode] = field(
metadata=dict(description=("The nodes defined in the dbt project and its dependencies"))
@@ -1237,6 +1468,9 @@ class WritableManifest(ArtifactMixin):
description="A mapping from group names to their nodes",
)
)
semantic_models: Mapping[UniqueID, SemanticModel] = field(
metadata=dict(description=("The semantic models defined in the dbt project"))
)
metadata: ManifestMetadata = field(
metadata=dict(
description="Metadata about the manifest",
@@ -1251,20 +1485,24 @@ class WritableManifest(ArtifactMixin):
("manifest", 6),
("manifest", 7),
("manifest", 8),
("manifest", 9),
]
@classmethod
def upgrade_schema_version(cls, data):
"""This overrides the "upgrade_schema_version" call in VersionedSchema (via
ArtifactMixin) to modify the dictionary passed in from earlier versions of the manifest."""
if get_manifest_schema_version(data) <= 8:
data = upgrade_manifest_json(data)
manifest_schema_version = get_manifest_schema_version(data)
if manifest_schema_version <= 9:
data = upgrade_manifest_json(data, manifest_schema_version)
return cls.from_dict(data)
def __post_serialize__(self, dct):
for unique_id, node in dct["nodes"].items():
if "config_call_dict" in node:
del node["config_call_dict"]
if "defer_relation" in node:
del node["defer_relation"]
return dct

View File

@@ -1,42 +1,3 @@
from dbt import deprecations
from dbt.dataclass_schema import ValidationError
# we renamed these properties in v1.3
# this method allows us to be nice to the early adopters
def rename_metric_attr(data: dict, raise_deprecation_warning: bool = False) -> dict:
metric_name = data["name"]
if raise_deprecation_warning and (
"sql" in data.keys()
or "type" in data.keys()
or data.get("calculation_method") == "expression"
):
deprecations.warn("metric-attr-renamed", metric_name=metric_name)
duplicated_attribute_msg = """\n
The metric '{}' contains both the deprecated metric property '{}'
and the up-to-date metric property '{}'. Please remove the deprecated property.
"""
if "sql" in data.keys():
if "expression" in data.keys():
raise ValidationError(
duplicated_attribute_msg.format(metric_name, "sql", "expression")
)
else:
data["expression"] = data.pop("sql")
if "type" in data.keys():
if "calculation_method" in data.keys():
raise ValidationError(
duplicated_attribute_msg.format(metric_name, "type", "calculation_method")
)
else:
calculation_method = data.pop("type")
data["calculation_method"] = calculation_method
# we also changed "type: expression" -> "calculation_method: derived"
if data.get("calculation_method") == "expression":
data["calculation_method"] = "derived"
return data
def rename_sql_attr(node_content: dict) -> dict:
if "raw_sql" in node_content:
node_content["raw_code"] = node_content.pop("raw_sql")
@@ -88,7 +49,24 @@ def upgrade_seed_content(node_content):
node_content.get("depends_on", {}).pop("nodes", None)
def upgrade_manifest_json(manifest: dict) -> dict:
def drop_v9_and_prior_metrics(manifest: dict) -> None:
manifest["metrics"] = {}
filtered_disabled_entries = {}
for entry_name, resource_list in manifest.get("disabled", {}).items():
filtered_resource_list = []
for resource in resource_list:
if resource.get("resource_type") != "metric":
filtered_resource_list.append(resource)
filtered_disabled_entries[entry_name] = filtered_resource_list
manifest["disabled"] = filtered_disabled_entries
def upgrade_manifest_json(manifest: dict, manifest_schema_version: int) -> dict:
# this check should remain at 9 even if the version check in `upgrade_schema_version` changes
if manifest_schema_version <= 9:
drop_v9_and_prior_metrics(manifest=manifest)
for node_content in manifest.get("nodes", {}).values():
upgrade_node_content(node_content)
if node_content["resource_type"] == "seed":
@@ -107,7 +85,6 @@ def upgrade_manifest_json(manifest: dict) -> dict:
manifest["group_map"] = {}
for metric_content in manifest.get("metrics", {}).values():
# handle attr renames + value translation ("expression" -> "derived")
metric_content = rename_metric_attr(metric_content)
metric_content = upgrade_ref_content(metric_content)
if "root_path" in metric_content:
del metric_content["root_path"]
@@ -125,4 +102,6 @@ def upgrade_manifest_json(manifest: dict) -> dict:
if "root_path" in doc_content:
del doc_content["root_path"]
doc_content["resource_type"] = "doc"
if "semantic_models" not in manifest:
manifest["semantic_models"] = {}
return manifest
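A toy walkthrough of the metric-dropping step above, on a hand-built manifest dict (resource shapes simplified for illustration):

manifest = {
    "metrics": {"metric.proj.revenue": {"name": "revenue"}},
    "disabled": {
        "metric.proj.old_revenue": [{"resource_type": "metric"}],
        "model.proj.users": [{"resource_type": "model"}],
    },
}
# Mirrors drop_v9_and_prior_metrics: wipe metrics, filter disabled entries.
manifest["metrics"] = {}
manifest["disabled"] = {
    name: [r for r in resources if r.get("resource_type") != "metric"]
    for name, resources in manifest["disabled"].items()
}
assert manifest["disabled"]["metric.proj.old_revenue"] == []
assert manifest["disabled"]["model.proj.users"] == [{"resource_type": "model"}]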

View File

@@ -2,15 +2,17 @@ from dataclasses import field, Field, dataclass
from enum import Enum
from itertools import chain
from typing import Any, List, Optional, Dict, Union, Type, TypeVar, Callable
from dbt.dataclass_schema import (
dbtClassMixin,
ValidationError,
register_pattern,
StrEnum,
)
from dbt.contracts.graph.unparsed import AdditionalPropertiesAllowed, Docs
from dbt.contracts.graph.utils import validate_color
from dbt.exceptions import DbtInternalError, CompilationError
from dbt.contracts.util import Replaceable, list_str
from dbt.exceptions import DbtInternalError, CompilationError
from dbt import hooks
from dbt.node_types import NodeType
@@ -189,6 +191,16 @@ class Severity(str):
register_pattern(Severity, insensitive_patterns("warn", "error"))
class OnConfigurationChangeOption(StrEnum):
Apply = "apply"
Continue = "continue"
Fail = "fail"
@classmethod
def default(cls) -> "OnConfigurationChangeOption":
return cls.Apply
@dataclass
class ContractConfig(dbtClassMixin, Replaceable):
enforced: bool = False
@@ -287,11 +299,17 @@ class BaseConfig(AdditionalPropertiesAllowed, Replaceable):
return False
return True
# This is used in 'add_config_call' to created the combined config_call_dict.
# This is used in 'add_config_call' to create the combined config_call_dict.
# 'meta' moved here from node
mergebehavior = {
"append": ["pre-hook", "pre_hook", "post-hook", "post_hook", "tags"],
"update": ["quoting", "column_types", "meta", "docs", "contract"],
"update": [
"quoting",
"column_types",
"meta",
"docs",
"contract",
],
"dict_key_append": ["grants"],
}
@@ -368,6 +386,11 @@ class BaseConfig(AdditionalPropertiesAllowed, Replaceable):
return self.from_dict(dct)
@dataclass
class SemanticModelConfig(BaseConfig):
enabled: bool = True
@dataclass
class MetricConfig(BaseConfig):
enabled: bool = True
@@ -445,6 +468,9 @@ class NodeConfig(NodeAndTestConfig):
# sometimes getting the Union order wrong, causing serialization failures.
unique_key: Union[str, List[str], None] = None
on_schema_change: Optional[str] = "ignore"
on_configuration_change: OnConfigurationChangeOption = field(
default_factory=OnConfigurationChangeOption.default
)
grants: Dict[str, Any] = field(
default_factory=dict, metadata=MergeBehavior.DictKeyAppend.meta()
)
@@ -474,12 +500,12 @@ class NodeConfig(NodeAndTestConfig):
if (
self.contract.enforced
and self.materialized == "incremental"
and self.on_schema_change != "append_new_columns"
and self.on_schema_change not in ("append_new_columns", "fail")
):
raise ValidationError(
f"Invalid value for on_schema_change: {self.on_schema_change}. Models "
"materialized as incremental with contracts enabled must set "
"on_schema_change to 'append_new_columns'"
"on_schema_change to 'append_new_columns' or 'fail'"
)
@classmethod
@@ -529,6 +555,8 @@ class SeedConfig(NodeConfig):
@dataclass
class TestConfig(NodeAndTestConfig):
__test__ = False
# this is repeated because of a different default
schema: Optional[str] = field(
default="dbt_test__audit",

View File

@@ -0,0 +1,31 @@
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, List
from dbt.contracts.graph.unparsed import NodeVersion
from dbt.node_types import NodeType, AccessType
@dataclass
class ModelNodeArgs:
name: str
package_name: str
identifier: str
schema: str
database: Optional[str] = None
relation_name: Optional[str] = None
version: Optional[NodeVersion] = None
latest_version: Optional[NodeVersion] = None
deprecation_date: Optional[datetime] = None
access: Optional[str] = AccessType.Protected.value
generated_at: datetime = field(default_factory=datetime.utcnow)
depends_on_nodes: List[str] = field(default_factory=list)
enabled: bool = True
@property
def unique_id(self) -> str:
unique_id = f"{NodeType.Model}.{self.package_name}.{self.name}"
if self.version:
unique_id = f"{unique_id}.v{self.version}"
return unique_id
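Illustrative use of the new ModelNodeArgs (values made up; NodeType is a string enum, so f"{NodeType.Model}" renders as "model"):

args = ModelNodeArgs(
    name="dim_customers",
    package_name="analytics",
    identifier="dim_customers",
    schema="prod",
    version=2,
)
assert args.unique_id == "model.analytics.dim_customers.v2"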

View File

@@ -1,25 +1,24 @@
import os
from datetime import datetime
import time
from dataclasses import dataclass, field
from enum import Enum
import hashlib
from mashumaro.types import SerializableType
from typing import (
Optional,
Union,
List,
Dict,
Any,
Sequence,
Tuple,
Iterator,
)
from typing import Optional, Union, List, Dict, Any, Sequence, Tuple, Iterator
from dbt.dataclass_schema import dbtClassMixin, ExtensibleDbtClassMixin
from dbt.clients.system import write_file
from dbt.contracts.files import FileHash
from dbt.contracts.graph.semantic_models import (
Defaults,
Dimension,
Entity,
Measure,
SourceFileMetadata,
)
from dbt.contracts.graph.unparsed import (
Docs,
ExposureType,
@@ -28,8 +27,6 @@ from dbt.contracts.graph.unparsed import (
HasYamlMetadata,
MacroArgument,
MaturityType,
MetricFilter,
MetricTime,
Owner,
Quoting,
TestDef,
@@ -38,19 +35,29 @@ from dbt.contracts.graph.unparsed import (
UnparsedSourceTableDefinition,
UnparsedColumn,
)
from dbt.contracts.graph.node_args import ModelNodeArgs
from dbt.contracts.util import Replaceable, AdditionalPropertiesMixin
from dbt.events.functions import warn_or_error
from dbt.exceptions import ParsingError, InvalidAccessTypeError, ContractBreakingChangeError
from dbt.exceptions import ParsingError, ContractBreakingChangeError
from dbt.events.types import (
SeedIncreased,
SeedExceedsLimitSamePath,
SeedExceedsLimitAndPathChanged,
SeedExceedsLimitChecksumChanged,
ValidationWarning,
)
from dbt.events.contextvars import set_contextvars
from dbt.events.contextvars import set_log_contextvars
from dbt.flags import get_flags
from dbt.node_types import ModelLanguage, NodeType, AccessType
from dbt_semantic_interfaces.call_parameter_sets import FilterCallParameterSets
from dbt_semantic_interfaces.references import (
MeasureReference,
LinkableElementReference,
SemanticModelReference,
TimeDimensionReference,
)
from dbt_semantic_interfaces.references import MetricReference as DSIMetricReference
from dbt_semantic_interfaces.type_enums import MetricType, TimeGranularity
from dbt_semantic_interfaces.parsing.where_filter_parser import WhereFilterParser
from .model_config import (
NodeConfig,
@@ -61,6 +68,7 @@ from .model_config import (
ExposureConfig,
EmptySnapshotConfig,
SnapshotConfig,
SemanticModelConfig,
)
@@ -187,6 +195,9 @@ class ConstraintType(str, Enum):
class ColumnLevelConstraint(dbtClassMixin):
type: ConstraintType
name: Optional[str] = None
# expression is a user-provided field that will depend on the constraint type.
# It could be a predicate (check type) or a sequence of SQL keywords (e.g. unique type),
# so the vague naming of 'expression' is intended to capture this range.
expression: Optional[str] = None
warn_unenforced: bool = (
True  # Warn if the constraint cannot be enforced by the platform but will still be included in the DDL
@@ -250,6 +261,16 @@ class MacroDependsOn(dbtClassMixin, Replaceable):
self.macros.append(value)
@dataclass
class DeferRelation(HasRelationMetadata):
alias: str
relation_name: Optional[str]
@property
def identifier(self):
return self.alias
@dataclass
class DependsOn(MacroDependsOn):
nodes: List[str] = field(default_factory=list)
@@ -300,7 +321,7 @@ class NodeInfoMixin:
def update_event_status(self, **kwargs):
for k, v in kwargs.items():
self._event_status[k] = v
set_contextvars(node_info=self.node_info)
set_log_contextvars(node_info=self.node_info)
def clear_event_status(self):
self._event_status = dict()
@@ -323,17 +344,23 @@ class ParsedNode(NodeInfoMixin, ParsedNodeMandatory, SerializableType):
relation_name: Optional[str] = None
raw_code: str = ""
def write_node(self, target_path: str, subdirectory: str, payload: str):
def get_target_write_path(self, target_path: str, subdirectory: str):
# This is called for both the "compiled" subdirectory of "target" and the "run" subdirectory
if os.path.basename(self.path) == os.path.basename(self.original_file_path):
# One-to-one relationship of nodes to files.
path = self.original_file_path
else:
# Many-to-one relationship of nodes to files.
path = os.path.join(self.original_file_path, self.path)
full_path = os.path.join(target_path, subdirectory, self.package_name, path)
target_write_path = os.path.join(target_path, subdirectory, self.package_name, path)
return target_write_path
write_file(full_path, payload)
return full_path
def write_node(self, project_root: str, compiled_path, compiled_code: str):
if os.path.isabs(compiled_path):
full_path = compiled_path
else:
full_path = os.path.join(project_root, compiled_path)
write_file(full_path, compiled_code)
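A standalone sketch of the branch inside get_target_write_path, with made-up paths; versioned models are one case where several nodes share one original file path:

import os

def sketch_target_write_path(path, original_file_path, target_path, subdirectory, package_name):
    if os.path.basename(path) == os.path.basename(original_file_path):
        rel = original_file_path  # one node per file
    else:
        rel = os.path.join(original_file_path, path)  # many nodes per file
    return os.path.join(target_path, subdirectory, package_name, rel)

print(sketch_target_write_path(
    "dim_customers_v2.sql", "models/dim_customers.sql", "target", "compiled", "analytics"
))
# target/compiled/analytics/models/dim_customers.sql/dim_customers_v2.sql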
def _serialize(self):
return self.to_dict()
@@ -428,74 +455,17 @@ class ParsedNode(NodeInfoMixin, ParsedNodeMandatory, SerializableType):
def build_contract_checksum(self):
pass
def same_contract(self, old) -> bool:
def same_contract(self, old, adapter_type=None) -> bool:
# This would only apply to seeds
return True
def patch(self, patch: "ParsedNodePatch"):
"""Given a ParsedNodePatch, add the new information to the node."""
# NOTE: Constraint patching is awkwardly done in the parse_patch function
# which calls this one. We need to combine the logic.
# explicitly pick out the parts to update so we don't inadvertently
# step on the model name or anything
# Note: config should already be updated
self.patch_path: Optional[str] = patch.file_id
# update created_at so process_docs will run in partial parsing
self.created_at = time.time()
self.description = patch.description
self.columns = patch.columns
self.name = patch.name
# TODO: version, latest_version, and access are specific to ModelNodes, consider splitting out to ModelNode
if self.resource_type != NodeType.Model:
if patch.version:
warn_or_error(
ValidationWarning(
field_name="version",
resource_type=self.resource_type.value,
node_name=patch.name,
)
)
if patch.latest_version:
warn_or_error(
ValidationWarning(
field_name="latest_version",
resource_type=self.resource_type.value,
node_name=patch.name,
)
)
self.version = patch.version
self.latest_version = patch.latest_version
# This might not be the ideal place to validate the "access" field,
# but at this point we have the information we need to validate
# properly, and we don't have it before this.
if patch.access:
if self.resource_type == NodeType.Model:
if AccessType.is_valid(patch.access):
self.access = AccessType(patch.access)
else:
raise InvalidAccessTypeError(
unique_id=self.unique_id,
field_value=patch.access,
)
else:
warn_or_error(
ValidationWarning(
field_name="access",
resource_type=self.resource_type.value,
node_name=patch.name,
)
)
def same_contents(self, old) -> bool:
def same_contents(self, old, adapter_type) -> bool:
if old is None:
return False
# Need to ensure that same_contract is called because it
# could throw an error
same_contract = self.same_contract(old)
same_contract = self.same_contract(old, adapter_type)
return (
self.same_body(old)
and self.same_config(old)
@@ -506,6 +476,10 @@ class ParsedNode(NodeInfoMixin, ParsedNodeMandatory, SerializableType):
and True
)
@property
def is_external_node(self):
return False
@dataclass
class InjectedCTE(dbtClassMixin, Replaceable):
@@ -572,76 +546,6 @@ class CompiledNode(ParsedNode):
def depends_on_macros(self):
return self.depends_on.macros
def build_contract_checksum(self):
# We don't need to construct the checksum if the model does not
# have its contract enforced, because it won't be used.
# This needs to be executed after contract config is set
if self.contract.enforced is True:
contract_state = ""
# We need to sort the columns so that order doesn't matter
# columns is a str: ColumnInfo dictionary
sorted_columns = sorted(self.columns.values(), key=lambda col: col.name)
for column in sorted_columns:
contract_state += f"|{column.name}"
contract_state += str(column.data_type)
data = contract_state.encode("utf-8")
self.contract.checksum = hashlib.new("sha256", data).hexdigest()
def same_contract(self, old) -> bool:
# If the contract wasn't previously enforced:
if old.contract.enforced is False and self.contract.enforced is False:
# No change -- same_contract: True
return True
if old.contract.enforced is False and self.contract.enforced is True:
# Now it's enforced. This is a change, but not a breaking change -- same_contract: False
return False
# Otherwise: The contract was previously enforced, and we need to check for changes.
# Happy path: The contract is still being enforced, and the checksums are identical.
if self.contract.enforced is True and self.contract.checksum == old.contract.checksum:
# No change -- same_contract: True
return True
# Otherwise: There has been a change.
# We need to determine if it is a **breaking** change.
# These are the categories of breaking changes:
contract_enforced_disabled: bool = False
columns_removed: List[str] = []
column_type_changes: List[Tuple[str, str, str]] = []
if old.contract.enforced is True and self.contract.enforced is False:
# Breaking change: the contract was previously enforced, and it no longer is
contract_enforced_disabled = True
# Next, compare each column from the previous contract (old.columns)
for key, value in sorted(old.columns.items()):
# Has this column been removed?
if key not in self.columns.keys():
columns_removed.append(value.name)
# Has this column's data type changed?
elif value.data_type != self.columns[key].data_type:
column_type_changes.append(
(str(value.name), str(value.data_type), str(self.columns[key].data_type))
)
# If a column has been added, it will be missing in the old.columns, and present in self.columns
# That's a change (caught by the different checksums), but not a breaking change
# Did we find any changes that we consider breaking? If so, that's an error
if contract_enforced_disabled or columns_removed or column_type_changes:
raise (
ContractBreakingChangeError(
contract_enforced_disabled=contract_enforced_disabled,
columns_removed=columns_removed,
column_type_changes=column_type_changes,
node=self,
)
)
# Otherwise, though we didn't find any *breaking* changes, the contract has still changed -- same_contract: False
else:
return False
# ====================================
# CompiledNode subclasses
@@ -666,10 +570,49 @@ class ModelNode(CompiledNode):
constraints: List[ModelLevelConstraint] = field(default_factory=list)
version: Optional[NodeVersion] = None
latest_version: Optional[NodeVersion] = None
deprecation_date: Optional[datetime] = None
defer_relation: Optional[DeferRelation] = None
@classmethod
def from_args(cls, args: ModelNodeArgs) -> "ModelNode":
unique_id = args.unique_id
# build unrendered config -- for usage in ParsedNode.same_contents
unrendered_config = {}
unrendered_config["alias"] = args.identifier
unrendered_config["schema"] = args.schema
if args.database:
unrendered_config["database"] = args.database
return cls(
resource_type=NodeType.Model,
name=args.name,
package_name=args.package_name,
unique_id=unique_id,
fqn=[args.package_name, args.name],
version=args.version,
latest_version=args.latest_version,
relation_name=args.relation_name,
database=args.database,
schema=args.schema,
alias=args.identifier,
deprecation_date=args.deprecation_date,
checksum=FileHash.from_contents(f"{unique_id},{args.generated_at}"),
access=AccessType(args.access),
original_file_path="",
path="",
unrendered_config=unrendered_config,
depends_on=DependsOn(nodes=args.depends_on_nodes),
config=NodeConfig(enabled=args.enabled),
)
@property
def is_latest_version(self):
return self.version and self.version == self.latest_version
@property
def is_external_node(self) -> bool:
return not self.original_file_path and not self.path
@property
def is_latest_version(self) -> bool:
return self.version is not None and self.version == self.latest_version
@property
def search_name(self):
@@ -678,6 +621,155 @@ class ModelNode(CompiledNode):
else:
return f"{self.name}.v{self.version}"
@property
def materialization_enforces_constraints(self) -> bool:
return self.config.materialized in ["table", "incremental"]
def build_contract_checksum(self):
# We don't need to construct the checksum if the model does not
# have its contract enforced, because it won't be used.
# This needs to be executed after contract config is set
# Avoid rebuilding the checksum if it has already been set.
if self.contract.checksum is not None:
return
if self.contract.enforced is True:
contract_state = ""
# We need to sort the columns so that order doesn't matter
# columns is a str: ColumnInfo dictionary
sorted_columns = sorted(self.columns.values(), key=lambda col: col.name)
for column in sorted_columns:
contract_state += f"|{column.name}"
contract_state += str(column.data_type)
contract_state += str(column.constraints)
if self.materialization_enforces_constraints:
contract_state += self.config.materialized
contract_state += str(self.constraints)
data = contract_state.encode("utf-8")
self.contract.checksum = hashlib.new("sha256", data).hexdigest()
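A toy version of the checksum input built above (column data simplified to tuples); sorting by column name first is what makes the checksum order-independent:

import hashlib

columns = [("name", "text", "[]"), ("id", "integer", "[]")]
contract_state = ""
for col_name, data_type, constraints in sorted(columns):
    contract_state += f"|{col_name}" + data_type + constraints
print(hashlib.new("sha256", contract_state.encode("utf-8")).hexdigest()[:16])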
def same_contract(self, old, adapter_type=None) -> bool:
# If the contract wasn't previously enforced:
if old.contract.enforced is False and self.contract.enforced is False:
# No change -- same_contract: True
return True
if old.contract.enforced is False and self.contract.enforced is True:
# Now it's enforced. This is a change, but not a breaking change -- same_contract: False
return False
# Otherwise: The contract was previously enforced, and we need to check for changes.
# Happy path: The contract is still being enforced, and the checksums are identical.
if self.contract.enforced is True and self.contract.checksum == old.contract.checksum:
# No change -- same_contract: True
return True
# Otherwise: There has been a change.
# We need to determine if it is a **breaking** change.
# These are the categories of breaking changes:
contract_enforced_disabled: bool = False
columns_removed: List[str] = []
column_type_changes: List[Tuple[str, str, str]] = []
enforced_column_constraint_removed: List[Tuple[str, str]] = [] # column, constraint_type
enforced_model_constraint_removed: List[
Tuple[str, List[str]]
] = [] # constraint_type, columns
materialization_changed: List[str] = []
if old.contract.enforced is True and self.contract.enforced is False:
# Breaking change: the contract was previously enforced, and it no longer is
contract_enforced_disabled = True
# TODO: this avoids the circular imports but isn't ideal
from dbt.adapters.factory import get_adapter_constraint_support
from dbt.adapters.base import ConstraintSupport
constraint_support = get_adapter_constraint_support(adapter_type)
column_constraints_exist = False
# Next, compare each column from the previous contract (old.columns)
for old_key, old_value in sorted(old.columns.items()):
# Has this column been removed?
if old_key not in self.columns.keys():
columns_removed.append(old_value.name)
# Has this column's data type changed?
elif old_value.data_type != self.columns[old_key].data_type:
column_type_changes.append(
(
str(old_value.name),
str(old_value.data_type),
str(self.columns[old_key].data_type),
)
)
# track whether there are any column-level constraints for the materialization check later
if old_value.constraints:
column_constraints_exist = True
# Have enforced column-level constraints changed?
# Constraints are only enforced for table and incremental materializations.
# We only really care if the old node was one of those materializations for breaking changes
if (
old_key in self.columns.keys()
and old_value.constraints != self.columns[old_key].constraints
and old.materialization_enforces_constraints
):
for old_constraint in old_value.constraints:
if (
old_constraint not in self.columns[old_key].constraints
and constraint_support[old_constraint.type] == ConstraintSupport.ENFORCED
):
enforced_column_constraint_removed.append(
(old_key, str(old_constraint.type))
)
# Now compare the model level constraints
if old.constraints != self.constraints and old.materialization_enforces_constraints:
for old_constraint in old.constraints:
if (
old_constraint not in self.constraints
and constraint_support[old_constraint.type] == ConstraintSupport.ENFORCED
):
enforced_model_constraint_removed.append(
(str(old_constraint.type), old_constraint.columns)
)
# Check for relevant materialization changes.
if (
old.materialization_enforces_constraints
and not self.materialization_enforces_constraints
and (old.constraints or column_constraints_exist)
):
materialization_changed = [old.config.materialized, self.config.materialized]
# If a column has been added, it will be missing in the old.columns, and present in self.columns
# That's a change (caught by the different checksums), but not a breaking change
# Did we find any changes that we consider breaking? If so, that's an error
if (
contract_enforced_disabled
or columns_removed
or column_type_changes
or enforced_model_constraint_removed
or enforced_column_constraint_removed
or materialization_changed
):
raise (
ContractBreakingChangeError(
contract_enforced_disabled=contract_enforced_disabled,
columns_removed=columns_removed,
column_type_changes=column_type_changes,
enforced_column_constraint_removed=enforced_column_constraint_removed,
enforced_model_constraint_removed=enforced_model_constraint_removed,
materialization_changed=materialization_changed,
node=self,
)
)
# Otherwise, though we didn't find any *breaking* changes, the contract has still changed -- same_contract: False
else:
return False
# TODO: rm?
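A toy comparison in the spirit of same_contract, reduced to plain name-to-data-type dicts (real nodes carry ColumnInfo objects plus constraints):

old_cols = {"id": "integer", "email": "text"}
new_cols = {"id": "bigint", "name": "text"}
columns_removed = [c for c in old_cols if c not in new_cols]
column_type_changes = [
    (c, old_cols[c], new_cols[c])
    for c in old_cols
    if c in new_cols and old_cols[c] != new_cols[c]
]
assert columns_removed == ["email"]  # breaking
assert column_type_changes == [("id", "integer", "bigint")]  # breaking
# An added column ("name") changes the checksum but is not breaking.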
@dataclass
@@ -703,6 +795,7 @@ class SeedNode(ParsedNode): # No SQLDefaults!
# and we need the root_path to load the seed later
root_path: Optional[str] = None
depends_on: MacroDependsOn = field(default_factory=MacroDependsOn)
defer_relation: Optional[DeferRelation] = None
def same_seeds(self, other: "SeedNode") -> bool:
# for seeds, we check the hashes. If the hashes are different types,
@@ -839,6 +932,8 @@ class SingularTestNode(TestShouldStoreFailures, CompiledNode):
@dataclass
class TestMetadata(dbtClassMixin, Replaceable):
__test__ = False
name: str
# kwargs are the args that are left in the test builder after
# removing configs. They are set from the test builder when
@@ -864,7 +959,7 @@ class GenericTestNode(TestShouldStoreFailures, CompiledNode, HasTestMetadata):
config: TestConfig = field(default_factory=TestConfig) # type: ignore
attached_node: Optional[str] = None
def same_contents(self, other) -> bool:
def same_contents(self, other, adapter_type: Optional[str]) -> bool:
if other is None:
return False
@@ -897,6 +992,7 @@ class IntermediateSnapshotNode(CompiledNode):
class SnapshotNode(CompiledNode):
resource_type: NodeType = field(metadata={"restrict": [NodeType.Snapshot]})
config: SnapshotConfig
defer_relation: Optional[DeferRelation] = None
# ====================================
@@ -917,14 +1013,6 @@ class Macro(BaseNode):
created_at: float = field(default_factory=lambda: time.time())
supported_languages: Optional[List[ModelLanguage]] = None
def patch(self, patch: "ParsedMacroPatch"):
self.patch_path: Optional[str] = patch.file_id
self.description = patch.description
self.created_at = time.time()
self.meta = patch.meta
self.docs = patch.docs
self.arguments = patch.arguments
def same_contents(self, other: Optional["Macro"]) -> bool:
if other is None:
return False
@@ -1217,16 +1305,75 @@ class Exposure(GraphNode):
and True
)
@property
def group(self):
return None
# ====================================
# Metric node
# ====================================
@dataclass
class WhereFilter(dbtClassMixin):
where_sql_template: str
@property
def call_parameter_sets(self) -> FilterCallParameterSets:
return WhereFilterParser.parse_call_parameter_sets(self.where_sql_template)
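Hedged usage sketch: WhereFilterParser (from dbt-semantic-interfaces) extracts the Dimension()/Entity()/TimeDimension() calls referenced by a filter template; the template syntax below is assumed:

from dbt_semantic_interfaces.parsing.where_filter_parser import WhereFilterParser

param_sets = WhereFilterParser.parse_call_parameter_sets(
    "{{ Dimension('customer__region') }} = 'EMEA'"
)
print(param_sets.dimension_call_parameter_sets)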
@dataclass
class MetricInputMeasure(dbtClassMixin):
name: str
filter: Optional[WhereFilter] = None
alias: Optional[str] = None
def measure_reference(self) -> MeasureReference:
return MeasureReference(element_name=self.name)
def post_aggregation_measure_reference(self) -> MeasureReference:
return MeasureReference(element_name=self.alias or self.name)
@dataclass
class MetricTimeWindow(dbtClassMixin):
count: int
granularity: TimeGranularity
@dataclass
class MetricInput(dbtClassMixin):
name: str
filter: Optional[WhereFilter] = None
alias: Optional[str] = None
offset_window: Optional[MetricTimeWindow] = None
offset_to_grain: Optional[TimeGranularity] = None
def as_reference(self) -> DSIMetricReference:
return DSIMetricReference(element_name=self.name)
def post_aggregation_reference(self) -> DSIMetricReference:
return DSIMetricReference(element_name=self.alias or self.name)
@dataclass
class MetricTypeParams(dbtClassMixin):
measure: Optional[MetricInputMeasure] = None
input_measures: List[MetricInputMeasure] = field(default_factory=list)
numerator: Optional[MetricInput] = None
denominator: Optional[MetricInput] = None
expr: Optional[str] = None
window: Optional[MetricTimeWindow] = None
grain_to_date: Optional[TimeGranularity] = None
metrics: Optional[List[MetricInput]] = None
@dataclass
class MetricReference(dbtClassMixin, Replaceable):
sql: Optional[Union[str, int]]
unique_id: Optional[str]
sql: Optional[Union[str, int]] = None
unique_id: Optional[str] = None
@dataclass
@@ -1234,16 +1381,11 @@ class Metric(GraphNode):
name: str
description: str
label: str
calculation_method: str
expression: str
filters: List[MetricFilter]
time_grains: List[str]
dimensions: List[str]
type: MetricType
type_params: MetricTypeParams
filter: Optional[WhereFilter] = None
metadata: Optional[SourceFileMetadata] = None
resource_type: NodeType = field(metadata={"restrict": [NodeType.Metric]})
timestamp: Optional[str] = None
window: Optional[MetricTime] = None
model: Optional[str] = None
model_unique_id: Optional[str] = None
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
config: MetricConfig = field(default_factory=MetricConfig)
@@ -1263,17 +1405,17 @@ class Metric(GraphNode):
def search_name(self):
return self.name
def same_model(self, old: "Metric") -> bool:
return self.model == old.model
@property
def input_measures(self) -> List[MetricInputMeasure]:
return self.type_params.input_measures
def same_window(self, old: "Metric") -> bool:
return self.window == old.window
@property
def measure_references(self) -> List[MeasureReference]:
return [x.measure_reference() for x in self.input_measures]
def same_dimensions(self, old: "Metric") -> bool:
return self.dimensions == old.dimensions
def same_filters(self, old: "Metric") -> bool:
return self.filters == old.filters
@property
def input_metrics(self) -> List[MetricInput]:
return self.type_params.metrics or []
def same_description(self, old: "Metric") -> bool:
return self.description == old.description
@@ -1281,24 +1423,24 @@ class Metric(GraphNode):
def same_label(self, old: "Metric") -> bool:
return self.label == old.label
def same_calculation_method(self, old: "Metric") -> bool:
return self.calculation_method == old.calculation_method
def same_expression(self, old: "Metric") -> bool:
return self.expression == old.expression
def same_timestamp(self, old: "Metric") -> bool:
return self.timestamp == old.timestamp
def same_time_grains(self, old: "Metric") -> bool:
return self.time_grains == old.time_grains
def same_config(self, old: "Metric") -> bool:
return self.config.same_contents(
self.unrendered_config,
old.unrendered_config,
)
def same_filter(self, old: "Metric") -> bool:
return True # TODO
def same_metadata(self, old: "Metric") -> bool:
return True # TODO
def same_type(self, old: "Metric") -> bool:
return self.type == old.type
def same_type_params(self, old: "Metric") -> bool:
return True # TODO
def same_contents(self, old: Optional["Metric"]) -> bool:
# existing when it didn't before is a change!
# metadata/tags changes are not "changes"
@@ -1306,16 +1448,12 @@ class Metric(GraphNode):
return True
return (
self.same_model(old)
and self.same_window(old)
and self.same_dimensions(old)
and self.same_filters(old)
self.same_filter(old)
and self.same_metadata(old)
and self.same_type(old)
and self.same_type_params(old)
and self.same_description(old)
and self.same_label(old)
and self.same_calculation_method(old)
and self.same_expression(old)
and self.same_timestamp(old)
and self.same_time_grains(old)
and self.same_config(old)
and True
)
@@ -1333,6 +1471,115 @@ class Group(BaseNode):
resource_type: NodeType = field(metadata={"restrict": [NodeType.Group]})
# ====================================
# SemanticModel and related classes
# ====================================
@dataclass
class NodeRelation(dbtClassMixin):
alias: str
schema_name: str # TODO: Could this be called simply "schema" so we could reuse StateRelation?
database: Optional[str] = None
relation_name: Optional[str] = None
@dataclass
class SemanticModel(GraphNode):
model: str
node_relation: Optional[NodeRelation]
description: Optional[str] = None
defaults: Optional[Defaults] = None
entities: Sequence[Entity] = field(default_factory=list)
measures: Sequence[Measure] = field(default_factory=list)
dimensions: Sequence[Dimension] = field(default_factory=list)
metadata: Optional[SourceFileMetadata] = None
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[RefArgs] = field(default_factory=list)
created_at: float = field(default_factory=lambda: time.time())
config: SemanticModelConfig = field(default_factory=SemanticModelConfig)
@property
def entity_references(self) -> List[LinkableElementReference]:
return [entity.reference for entity in self.entities]
@property
def dimension_references(self) -> List[LinkableElementReference]:
return [dimension.reference for dimension in self.dimensions]
@property
def measure_references(self) -> List[MeasureReference]:
return [measure.reference for measure in self.measures]
@property
def has_validity_dimensions(self) -> bool:
return any([dim.validity_params is not None for dim in self.dimensions])
@property
def validity_start_dimension(self) -> Optional[Dimension]:
validity_start_dims = [
dim for dim in self.dimensions if dim.validity_params and dim.validity_params.is_start
]
if not validity_start_dims:
return None
return validity_start_dims[0]
@property
def validity_end_dimension(self) -> Optional[Dimension]:
validity_end_dims = [
dim for dim in self.dimensions if dim.validity_params and dim.validity_params.is_end
]
if not validity_end_dims:
return None
return validity_end_dims[0]
@property
def partitions(self) -> List[Dimension]: # noqa: D
return [dim for dim in self.dimensions or [] if dim.is_partition]
@property
def partition(self) -> Optional[Dimension]:
partitions = self.partitions
if not partitions:
return None
return partitions[0]
@property
def reference(self) -> SemanticModelReference:
return SemanticModelReference(semantic_model_name=self.name)
@property
def depends_on_nodes(self):
return self.depends_on.nodes
@property
def depends_on_macros(self):
return self.depends_on.macros
def checked_agg_time_dimension_for_measure(
self, measure_reference: MeasureReference
) -> TimeDimensionReference:
measure: Optional[Measure] = None
for candidate in self.measures:
    if candidate.reference == measure_reference:
        measure = candidate
        break
assert (
measure is not None
), f"No measure with name ({measure_reference.element_name}) in semantic_model with name ({self.name})"
default_agg_time_dimension = (
    self.defaults.agg_time_dimension if self.defaults is not None else None
)
agg_time_dimension_name = measure.agg_time_dimension or default_agg_time_dimension
assert agg_time_dimension_name is not None, (
f"Aggregation time dimension for measure {measure.name} is not set! This should either be set directly on "
f"the measure specification in the model, or else defaulted to the primary time dimension in the data "
f"source containing the measure."
)
return TimeDimensionReference(element_name=agg_time_dimension_name)
# ====================================
# Patches
# ====================================
@@ -1356,6 +1603,8 @@ class ParsedNodePatch(ParsedPatch):
access: Optional[str]
version: Optional[NodeVersion]
latest_version: Optional[NodeVersion]
constraints: List[Dict[str, Any]]
deprecation_date: Optional[datetime]
@dataclass
@@ -1397,6 +1646,7 @@ GraphMemberNode = Union[
ResultNode,
Exposure,
Metric,
SemanticModel,
]
# All "nodes" (or node-like objects) in this file

View File

@@ -0,0 +1,95 @@
from dbt_semantic_interfaces.implementations.metric import PydanticMetric
from dbt_semantic_interfaces.implementations.project_configuration import (
PydanticProjectConfiguration,
)
from dbt_semantic_interfaces.implementations.semantic_manifest import PydanticSemanticManifest
from dbt_semantic_interfaces.implementations.semantic_model import PydanticSemanticModel
from dbt_semantic_interfaces.implementations.time_spine_table_configuration import (
PydanticTimeSpineTableConfiguration,
)
from dbt_semantic_interfaces.type_enums import TimeGranularity
from dbt_semantic_interfaces.validations.semantic_manifest_validator import (
SemanticManifestValidator,
)
from dbt.clients.system import write_file
from dbt.events.base_types import EventLevel
from dbt.events.functions import fire_event
from dbt.events.types import SemanticValidationFailure
from dbt.exceptions import ParsingError
class SemanticManifest:
def __init__(self, manifest):
self.manifest = manifest
def validate(self) -> bool:
# TODO: Enforce this check.
# if self.manifest.metrics and not self.manifest.semantic_models:
# fire_event(
# SemanticValidationFailure(
# msg="Metrics require semantic models, but none were found."
# ),
# EventLevel.ERROR,
# )
# return False
if not self.manifest.metrics or not self.manifest.semantic_models:
return True
semantic_manifest = self._get_pydantic_semantic_manifest()
validator = SemanticManifestValidator[PydanticSemanticManifest]()
validation_results = validator.validate_semantic_manifest(semantic_manifest)
for warning in validation_results.warnings:
fire_event(SemanticValidationFailure(msg=warning.message))
for error in validation_results.errors:
fire_event(SemanticValidationFailure(msg=error.message), EventLevel.ERROR)
return not validation_results.errors
def write_json_to_file(self, file_path: str):
semantic_manifest = self._get_pydantic_semantic_manifest()
json = semantic_manifest.json()
write_file(file_path, json)
def _get_pydantic_semantic_manifest(self) -> PydanticSemanticManifest:
project_config = PydanticProjectConfiguration(
time_spine_table_configurations=[],
)
pydantic_semantic_manifest = PydanticSemanticManifest(
metrics=[], semantic_models=[], project_configuration=project_config
)
for semantic_model in self.manifest.semantic_models.values():
pydantic_semantic_manifest.semantic_models.append(
PydanticSemanticModel.parse_obj(semantic_model.to_dict())
)
for metric in self.manifest.metrics.values():
pydantic_semantic_manifest.metrics.append(PydanticMetric.parse_obj(metric.to_dict()))
# Look for time-spine table model and create time spine table configuration
if self.manifest.semantic_models:
# Get model for time_spine_table
time_spine_model_name = "metricflow_time_spine"
model = self.manifest.ref_lookup.find(time_spine_model_name, None, None, self.manifest)
if not model:
raise ParsingError(
"The semantic layer requires a 'metricflow_time_spine' model in the project, but none was found. "
"Guidance on creating this model can be found on our docs site ("
"https://docs.getdbt.com/docs/build/metricflow-time-spine) "
)
# Create time_spine_table_config, set it in project_config, and add to semantic manifest
time_spine_table_config = PydanticTimeSpineTableConfiguration(
location=model.relation_name,
column_name="date_day",
grain=TimeGranularity.DAY,
)
pydantic_semantic_manifest.project_configuration.time_spine_table_configurations = [
time_spine_table_config
]
return pydantic_semantic_manifest
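A hedged usage sketch of the wiring above, validating an empty manifest (mirrors the construction in _get_pydantic_semantic_manifest):

from dbt_semantic_interfaces.implementations.project_configuration import (
    PydanticProjectConfiguration,
)
from dbt_semantic_interfaces.implementations.semantic_manifest import (
    PydanticSemanticManifest,
)
from dbt_semantic_interfaces.validations.semantic_manifest_validator import (
    SemanticManifestValidator,
)

manifest = PydanticSemanticManifest(
    metrics=[],
    semantic_models=[],
    project_configuration=PydanticProjectConfiguration(
        time_spine_table_configurations=[]
    ),
)
results = SemanticManifestValidator[PydanticSemanticManifest]().validate_semantic_manifest(
    manifest
)
print(len(results.errors), len(results.warnings))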

View File

@@ -0,0 +1,147 @@
from dataclasses import dataclass
from dbt.dataclass_schema import dbtClassMixin
from dbt_semantic_interfaces.references import (
DimensionReference,
EntityReference,
MeasureReference,
TimeDimensionReference,
)
from dbt_semantic_interfaces.type_enums import (
AggregationType,
DimensionType,
EntityType,
TimeGranularity,
)
from typing import List, Optional
@dataclass
class FileSlice(dbtClassMixin):
"""Provides file slice level context about what something was created from.
Implementation of the dbt-semantic-interfaces `FileSlice` protocol
"""
filename: str
content: str
start_line_number: int
end_line_number: int
@dataclass
class SourceFileMetadata(dbtClassMixin):
"""Provides file context about what something was created from.
Implementation of the dbt-semantic-interfaces `Metadata` protocol
"""
repo_file_path: str
file_slice: FileSlice
@dataclass
class Defaults(dbtClassMixin):
agg_time_dimension: Optional[str] = None
# ====================================
# Dimension objects
# ====================================
@dataclass
class DimensionValidityParams(dbtClassMixin):
is_start: bool = False
is_end: bool = False
@dataclass
class DimensionTypeParams(dbtClassMixin):
time_granularity: TimeGranularity
validity_params: Optional[DimensionValidityParams] = None
@dataclass
class Dimension(dbtClassMixin):
name: str
type: DimensionType
description: Optional[str] = None
is_partition: bool = False
type_params: Optional[DimensionTypeParams] = None
expr: Optional[str] = None
metadata: Optional[SourceFileMetadata] = None
@property
def reference(self) -> DimensionReference:
return DimensionReference(element_name=self.name)
@property
def time_dimension_reference(self) -> Optional[TimeDimensionReference]:
if self.type == DimensionType.TIME:
return TimeDimensionReference(element_name=self.name)
else:
return None
@property
def validity_params(self) -> Optional[DimensionValidityParams]:
if self.type_params:
return self.type_params.validity_params
else:
return None
# ====================================
# Entity objects
# ====================================
@dataclass
class Entity(dbtClassMixin):
name: str
type: EntityType
description: Optional[str] = None
role: Optional[str] = None
expr: Optional[str] = None
@property
def reference(self) -> EntityReference:
return EntityReference(element_name=self.name)
@property
def is_linkable_entity_type(self) -> bool:
return self.type in (EntityType.PRIMARY, EntityType.UNIQUE, EntityType.NATURAL)
# ====================================
# Measure objects
# ====================================
@dataclass
class MeasureAggregationParameters(dbtClassMixin):
percentile: Optional[float] = None
use_discrete_percentile: bool = False
use_approximate_percentile: bool = False
@dataclass
class NonAdditiveDimension(dbtClassMixin):
name: str
window_choice: AggregationType
window_groupings: List[str]
@dataclass
class Measure(dbtClassMixin):
name: str
agg: AggregationType
description: Optional[str] = None
create_metric: bool = False
expr: Optional[str] = None
agg_params: Optional[MeasureAggregationParameters] = None
non_additive_dimension: Optional[NonAdditiveDimension] = None
agg_time_dimension: Optional[str] = None
@property
def reference(self) -> MeasureReference:
return MeasureReference(element_name=self.name)
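Illustrative construction of the dataclasses above (names made up), exercising the derived reference and validity properties:

dim = Dimension(
    name="valid_from",
    type=DimensionType.TIME,
    type_params=DimensionTypeParams(
        time_granularity=TimeGranularity.DAY,
        validity_params=DimensionValidityParams(is_start=True),
    ),
)
assert dim.time_dimension_reference is not None
assert dim.validity_params is not None and dim.validity_params.is_start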

View File

@@ -1,13 +1,18 @@
import datetime
import re
from dbt import deprecations
from dbt.node_types import NodeType
from dbt.contracts.graph.semantic_models import (
Defaults,
DimensionValidityParams,
MeasureAggregationParameters,
)
from dbt.contracts.util import (
AdditionalPropertiesMixin,
Mergeable,
Replaceable,
)
from dbt.contracts.graph.manifest_upgrade import rename_metric_attr
# trigger the PathEncoder
import dbt.helper_types # noqa:F401
@@ -154,6 +159,7 @@ class UnparsedVersion(dbtClassMixin):
columns: Sequence[Union[dbt.helper_types.IncludeExclude, UnparsedColumn]] = field(
default_factory=list
)
deprecation_date: Optional[datetime.datetime] = None
def __lt__(self, other):
try:
@@ -192,6 +198,8 @@ class UnparsedVersion(dbtClassMixin):
else:
self._unparsed_columns.append(column)
self.deprecation_date = normalize_date(self.deprecation_date)
@dataclass
class UnparsedAnalysisUpdate(HasConfig, HasColumnDocs, HasColumnProps, HasYamlMetadata):
@@ -210,6 +218,7 @@ class UnparsedModelUpdate(UnparsedNodeUpdate):
access: Optional[str] = None
latest_version: Optional[NodeVersion] = None
versions: Sequence[UnparsedVersion] = field(default_factory=list)
deprecation_date: Optional[datetime.datetime] = None
def __post_init__(self):
if self.latest_version:
@@ -229,6 +238,8 @@ class UnparsedModelUpdate(UnparsedNodeUpdate):
self._version_map = {version.v: version for version in self.versions}
self.deprecation_date = normalize_date(self.deprecation_date)
def get_columns_for_version(self, version: NodeVersion) -> List[UnparsedColumn]:
if version not in self._version_map:
raise DbtInternalError(
@@ -587,25 +598,47 @@ class MetricTime(dbtClassMixin, Mergeable):
@dataclass
class UnparsedMetric(dbtClassMixin, Replaceable):
class UnparsedMetricInputMeasure(dbtClassMixin):
name: str
filter: Optional[str] = None
alias: Optional[str] = None
@dataclass
class UnparsedMetricInput(dbtClassMixin):
name: str
filter: Optional[str] = None
alias: Optional[str] = None
offset_window: Optional[str] = None
offset_to_grain: Optional[str] = None # str is really a TimeGranularity Enum
@dataclass
class UnparsedMetricTypeParams(dbtClassMixin):
measure: Optional[Union[UnparsedMetricInputMeasure, str]] = None
numerator: Optional[Union[UnparsedMetricInput, str]] = None
denominator: Optional[Union[UnparsedMetricInput, str]] = None
expr: Optional[Union[str, bool]] = None
window: Optional[str] = None
grain_to_date: Optional[str] = None # str is really a TimeGranularity Enum
metrics: Optional[List[Union[UnparsedMetricInput, str]]] = None
@dataclass
class UnparsedMetric(dbtClassMixin):
name: str
label: str
calculation_method: str
expression: str
type: str
type_params: UnparsedMetricTypeParams
description: str = ""
timestamp: Optional[str] = None
time_grains: List[str] = field(default_factory=list)
dimensions: List[str] = field(default_factory=list)
window: Optional[MetricTime] = None
model: Optional[str] = None
filters: List[MetricFilter] = field(default_factory=list)
filter: Optional[str] = None
# metadata: Optional[UnparsedMetadata] = None  # TODO
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
config: Dict[str, Any] = field(default_factory=dict)
@classmethod
def validate(cls, data):
data = rename_metric_attr(data, raise_deprecation_warning=True)
super(UnparsedMetric, cls).validate(data)
if "name" in data:
errors = []
@@ -625,22 +658,6 @@ class UnparsedMetric(dbtClassMixin, Replaceable):
f"The metric name '{data['name']}' is invalid. It {', '.join(e for e in errors)}"
)
if data.get("timestamp") is None and data.get("time_grains") is not None:
raise ValidationError(
f"The metric '{data['name']} has time_grains defined but is missing a timestamp dimension."
)
if data.get("timestamp") is None and data.get("window") is not None:
raise ValidationError(
f"The metric '{data['name']} has a window defined but is missing a timestamp dimension."
)
if data.get("model") is None and data.get("calculation_method") != "derived":
raise ValidationError("Non-derived metrics require a 'model' property")
if data.get("model") is not None and data.get("calculation_method") == "derived":
raise ValidationError("Derived metrics cannot have a 'model' property")
@dataclass
class UnparsedGroup(dbtClassMixin, Replaceable):
@@ -652,3 +669,77 @@ class UnparsedGroup(dbtClassMixin, Replaceable):
super(UnparsedGroup, cls).validate(data)
if data["owner"].get("name") is None and data["owner"].get("email") is None:
raise ValidationError("Group owner must have at least one of 'name' or 'email'.")
#
# semantic interfaces unparsed objects
#
@dataclass
class UnparsedEntity(dbtClassMixin):
name: str
type: str # EntityType enum
description: Optional[str] = None
role: Optional[str] = None
expr: Optional[str] = None
@dataclass
class UnparsedNonAdditiveDimension(dbtClassMixin):
name: str
window_choice: str # AggregationType enum
window_groupings: List[str]
@dataclass
class UnparsedMeasure(dbtClassMixin):
name: str
agg: str # actually an enum
description: Optional[str] = None
expr: Optional[Union[str, bool, int]] = None
agg_params: Optional[MeasureAggregationParameters] = None
non_additive_dimension: Optional[UnparsedNonAdditiveDimension] = None
agg_time_dimension: Optional[str] = None
@dataclass
class UnparsedDimensionTypeParams(dbtClassMixin):
time_granularity: str # TimeGranularity enum
validity_params: Optional[DimensionValidityParams] = None
@dataclass
class UnparsedDimension(dbtClassMixin):
name: str
type: str # actually an enum
description: Optional[str] = None
is_partition: bool = False
type_params: Optional[UnparsedDimensionTypeParams] = None
expr: Optional[str] = None
@dataclass
class UnparsedSemanticModel(dbtClassMixin):
name: str
model: str # looks like "ref(...)"
description: Optional[str] = None
defaults: Optional[Defaults] = None
entities: List[UnparsedEntity] = field(default_factory=list)
measures: List[UnparsedMeasure] = field(default_factory=list)
dimensions: List[UnparsedDimension] = field(default_factory=list)
def normalize_date(d: Optional[datetime.date]) -> Optional[datetime.datetime]:
"""Convert date to datetime (at midnight), and add local time zone if naive"""
if d is None:
return None
# convert date to datetime
dt = d if type(d) == datetime.datetime else datetime.datetime(d.year, d.month, d.day)
if not dt.tzinfo:
# date is naive, re-interpret as system time zone
dt = dt.astimezone()
return dt
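normalize_date in action: a plain date becomes a timezone-aware datetime at local midnight:

import datetime

dt = normalize_date(datetime.date(2023, 7, 19))
assert dt is not None and dt.tzinfo is not None and (dt.hour, dt.minute) == (0, 0)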

View File

@@ -223,6 +223,7 @@ class Project(HyphenatedDbtClassMixin, Replaceable):
)
packages: List[PackageSpec] = field(default_factory=list)
query_comment: Optional[Union[QueryComment, NoValue, str]] = field(default_factory=NoValue)
restrict_access: bool = False
@classmethod
def validate(cls, data):

View File

@@ -17,7 +17,7 @@ class RelationType(StrEnum):
Table = "table"
View = "view"
CTE = "cte"
MaterializedView = "materializedview"
MaterializedView = "materialized_view"
External = "external"

View File

@@ -1,3 +1,5 @@
import threading
from dbt.contracts.graph.unparsed import FreshnessThreshold
from dbt.contracts.graph.nodes import SourceDefinition, ResultNode
from dbt.contracts.util import (
@@ -21,13 +23,14 @@ import agate
from dataclasses import dataclass, field
from datetime import datetime
from typing import (
Union,
Any,
Callable,
Dict,
List,
Optional,
Any,
NamedTuple,
Optional,
Sequence,
Union,
)
from dbt.clients.system import write_json
@@ -56,15 +59,16 @@ class TimingInfo(dbtClassMixin):
# This is a context manager
class collect_timing_info:
def __init__(self, name: str):
def __init__(self, name: str, callback: Callable[[TimingInfo], None]):
self.timing_info = TimingInfo(name=name)
self.callback = callback
def __enter__(self):
self.timing_info.begin()
return self.timing_info
def __exit__(self, exc_type, exc_value, traceback):
self.timing_info.end()
self.callback(self.timing_info)
# Note: when legacy logger is removed, we can remove the following line
with TimingProcessor(self.timing_info):
fire_event(
@@ -159,6 +163,20 @@ class RunResult(NodeResult):
def skipped(self):
return self.status == RunStatus.Skipped
@classmethod
def from_node(cls, node: ResultNode, status: RunStatus, message: Optional[str]):
thread_id = threading.current_thread().name
return RunResult(
status=status,
thread_id=thread_id,
execution_time=0,
timing=[],
message=message,
node=node,
adapter_response={},
failures=None,
)
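A minimal usage sketch of the reworked collect_timing_info above: the caller now supplies a callback that receives the completed TimingInfo:

timings = []
with collect_timing_info("compile", timings.append):
    pass  # ... timed work ...
assert timings[0].name == "compile"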
@dataclass
class ExecutionResult(dbtClassMixin):
@@ -245,40 +263,6 @@ class RunResultsArtifact(ExecutionResult, ArtifactMixin):
write_json(path, self.to_dict(omit_none=False))
@dataclass
class RunOperationResult(ExecutionResult):
success: bool
@dataclass
class RunOperationResultMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
default_factory=lambda: str(RunOperationResultsArtifact.dbt_schema_version)
)
@dataclass
@schema_version("run-operation-result", 1)
class RunOperationResultsArtifact(RunOperationResult, ArtifactMixin):
@classmethod
def from_success(
cls,
success: bool,
elapsed_time: float,
generated_at: datetime,
):
meta = RunOperationResultMetadata(
dbt_schema_version=str(cls.dbt_schema_version),
generated_at=generated_at,
)
return cls(
metadata=meta,
results=[],
elapsed_time=elapsed_time,
success=success,
)
# due to issues with typing.Union collapsing subclasses, this can't subclass
# PartialResult
@@ -391,6 +375,9 @@ class FreshnessResult(ExecutionResult):
meta = FreshnessMetadata(generated_at=generated_at)
return cls(metadata=meta, results=results, elapsed_time=elapsed_time)
def write(self, path):
FreshnessExecutionResultArtifact.from_result(self).write(path)
@dataclass
@schema_version("sources", 3)

View File

@@ -7,15 +7,17 @@ from dbt.exceptions import IncompatibleSchemaError
class PreviousState:
def __init__(self, path: Path, current_path: Path):
self.path: Path = path
self.current_path: Path = current_path
def __init__(self, state_path: Path, target_path: Path, project_root: Path):
self.state_path: Path = state_path
self.target_path: Path = target_path
self.project_root: Path = project_root
self.manifest: Optional[WritableManifest] = None
self.results: Optional[RunResultsArtifact] = None
self.sources: Optional[FreshnessExecutionResultArtifact] = None
self.sources_current: Optional[FreshnessExecutionResultArtifact] = None
manifest_path = self.path / "manifest.json"
# Note: if state_path is absolute, project_root will be ignored.
manifest_path = self.project_root / self.state_path / "manifest.json"
if manifest_path.exists() and manifest_path.is_file():
try:
self.manifest = WritableManifest.read_and_check_versions(str(manifest_path))
@@ -23,7 +25,7 @@ class PreviousState:
exc.add_filename(str(manifest_path))
raise
results_path = self.path / "run_results.json"
results_path = self.project_root / self.state_path / "run_results.json"
if results_path.exists() and results_path.is_file():
try:
self.results = RunResultsArtifact.read_and_check_versions(str(results_path))
@@ -31,7 +33,7 @@ class PreviousState:
exc.add_filename(str(results_path))
raise
sources_path = self.path / "sources.json"
sources_path = self.project_root / self.state_path / "sources.json"
if sources_path.exists() and sources_path.is_file():
try:
self.sources = FreshnessExecutionResultArtifact.read_and_check_versions(
@@ -41,7 +43,7 @@ class PreviousState:
exc.add_filename(str(sources_path))
raise
sources_current_path = self.current_path / "sources.json"
sources_current_path = self.project_root / self.target_path / "sources.json"
if sources_current_path.exists() and sources_current_path.is_file():
try:
self.sources_current = FreshnessExecutionResultArtifact.read_and_check_versions(
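The new path resolution leans on pathlib semantics: joining onto an absolute right-hand path discards project_root, which is what the note above relies on:

from pathlib import Path

project_root = Path("/home/me/analytics")
assert project_root / Path("state") / "manifest.json" == Path(
    "/home/me/analytics/state/manifest.json"
)
assert project_root / Path("/abs/state") / "manifest.json" == Path(
    "/abs/state/manifest.json"
)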

View File

@@ -91,6 +91,11 @@ class ConfigTargetPathDeprecation(DBTDeprecation):
_event = "ConfigTargetPathDeprecation"
class CollectFreshnessReturnSignature(DBTDeprecation):
_name = "collect-freshness-return-signature"
_event = "CollectFreshnessReturnSignature"
def renamed_env_var(old_name: str, new_name: str):
class EnvironmentVariableRenamed(DBTDeprecation):
_name = f"environment-variable-renamed:{old_name}"
@@ -128,6 +133,7 @@ deprecations_list: List[DBTDeprecation] = [
ExposureNameDeprecation(),
ConfigLogPathDeprecation(),
ConfigTargetPathDeprecation(),
CollectFreshnessReturnSignature(),
]
deprecations: Dict[str, DBTDeprecation] = {d.name: d for d in deprecations_list}
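Firing the new deprecation is done by name, the same way metric-attr-renamed is warned earlier in this diff; the name comes from the class's _name attribute:

from dbt import deprecations

deprecations.warn("collect-freshness-return-signature")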

View File

@@ -6,6 +6,7 @@ import sys
from google.protobuf.json_format import ParseDict, MessageToDict, MessageToJson
from google.protobuf.message import Message
from dbt.events.helpers import get_json_string_utcnow
from typing import Optional
if sys.version_info >= (3, 8):
from typing import Protocol
@@ -126,7 +127,7 @@ class EventMsg(Protocol):
data: Message
def msg_from_base_event(event: BaseEvent, level: EventLevel = None):
def msg_from_base_event(event: BaseEvent, level: Optional[EventLevel] = None):
msg_class_name = f"{type(event).__name__}Msg"
msg_cls = getattr(types_pb2, msg_class_name)

View File

@@ -5,48 +5,65 @@ from typing import Any, Generator, Mapping, Dict
LOG_PREFIX = "log_"
LOG_PREFIX_LEN = len(LOG_PREFIX)
TASK_PREFIX = "task_"
_log_context_vars: Dict[str, contextvars.ContextVar] = {}
_context_vars: Dict[str, contextvars.ContextVar] = {}
def get_contextvars() -> Dict[str, Any]:
def get_contextvars(prefix: str) -> Dict[str, Any]:
rv = {}
ctx = contextvars.copy_context()
prefix_len = len(prefix)
for k in ctx:
if k.name.startswith(LOG_PREFIX) and ctx[k] is not Ellipsis:
rv[k.name[LOG_PREFIX_LEN:]] = ctx[k]
if k.name.startswith(prefix) and ctx[k] is not Ellipsis:
rv[k.name[prefix_len:]] = ctx[k]
return rv
def get_node_info():
cvars = get_contextvars()
cvars = get_contextvars(LOG_PREFIX)
if "node_info" in cvars:
return cvars["node_info"]
else:
return {}
def clear_contextvars() -> None:
def get_project_root():
cvars = get_contextvars(TASK_PREFIX)
if "project_root" in cvars:
return cvars["project_root"]
else:
return None
def clear_contextvars(prefix: str) -> None:
ctx = contextvars.copy_context()
for k in ctx:
if k.name.startswith(LOG_PREFIX):
if k.name.startswith(prefix):
k.set(Ellipsis)
def set_log_contextvars(**kwargs: Any) -> Mapping[str, contextvars.Token]:
return set_contextvars(LOG_PREFIX, **kwargs)
def set_task_contextvars(**kwargs: Any) -> Mapping[str, contextvars.Token]:
return set_contextvars(TASK_PREFIX, **kwargs)
# Put keys and values into the context. Returns the contextvars.Token mapping;
# save it and pass it to reset_contextvars.
def set_contextvars(**kwargs: Any) -> Mapping[str, contextvars.Token]:
def set_contextvars(prefix: str, **kwargs: Any) -> Mapping[str, contextvars.Token]:
cvar_tokens = {}
for k, v in kwargs.items():
log_key = f"{LOG_PREFIX}{k}"
log_key = f"{prefix}{k}"
try:
var = _log_context_vars[log_key]
var = _context_vars[log_key]
except KeyError:
var = contextvars.ContextVar(log_key, default=Ellipsis)
_log_context_vars[log_key] = var
_context_vars[log_key] = var
cvar_tokens[k] = var.set(v)
@@ -54,30 +71,44 @@ def set_contextvars(**kwargs: Any) -> Mapping[str, contextvars.Token]:
# reset by Tokens
def reset_contextvars(**kwargs: contextvars.Token) -> None:
def reset_contextvars(prefix: str, **kwargs: contextvars.Token) -> None:
for k, v in kwargs.items():
log_key = f"{LOG_PREFIX}{k}"
var = _log_context_vars[log_key]
log_key = f"{prefix}{k}"
var = _context_vars[log_key]
var.reset(v)
# remove from contextvars
def unset_contextvars(*keys: str) -> None:
def unset_contextvars(prefix: str, *keys: str) -> None:
for k in keys:
if k in _log_context_vars:
log_key = f"{LOG_PREFIX}{k}"
_log_context_vars[log_key].set(Ellipsis)
log_key = f"{prefix}{k}"
if log_key in _context_vars:
    _context_vars[log_key].set(Ellipsis)
# Context manager or decorator to set and unset the context vars
@contextlib.contextmanager
def log_contextvars(**kwargs: Any) -> Generator[None, None, None]:
context = get_contextvars()
context = get_contextvars(LOG_PREFIX)
saved = {k: context[k] for k in context.keys() & kwargs.keys()}
set_contextvars(**kwargs)
set_contextvars(LOG_PREFIX, **kwargs)
try:
yield
finally:
unset_contextvars(*kwargs.keys())
set_contextvars(**saved)
unset_contextvars(LOG_PREFIX, *kwargs.keys())
set_contextvars(LOG_PREFIX, **saved)
# Context manager used early in task.run to set task-scoped context vars
@contextlib.contextmanager
def task_contextvars(**kwargs: Any) -> Generator[None, None, None]:
context = get_contextvars(TASK_PREFIX)
saved = {k: context[k] for k in context.keys() & kwargs.keys()}
set_contextvars(TASK_PREFIX, **kwargs)
try:
yield
finally:
unset_contextvars(TASK_PREFIX, *kwargs.keys())
set_contextvars(TASK_PREFIX, **saved)
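Usage sketch of the task-scoped variant added above; outside the with block the variable is cleared again:

with task_contextvars(project_root="/home/me/analytics"):
    assert get_project_root() == "/home/me/analytics"
assert get_project_root() is None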

View File

@@ -110,6 +110,7 @@ class _Logger:
log.setLevel(_log_level_map[self.level])
handler.setFormatter(logging.Formatter(fmt="%(message)s"))
log.handlers.clear()
log.propagate = False
log.addHandler(handler)
return log
@@ -185,7 +186,7 @@ class EventManager:
self.callbacks: List[Callable[[EventMsg], None]] = []
self.invocation_id: str = str(uuid4())
def fire_event(self, e: BaseEvent, level: EventLevel = None) -> None:
def fire_event(self, e: BaseEvent, level: Optional[EventLevel] = None) -> None:
msg = msg_from_base_event(e, level=level)
if os.environ.get("DBT_TEST_BINARY_SERIALIZATION"):

View File

@@ -39,14 +39,18 @@ def setup_event_logger(flags, callbacks: List[Callable[[EventMsg], None]] = [])
else:
if flags.LOG_LEVEL != "none":
line_format = _line_format_from_str(flags.LOG_FORMAT, LineFormat.PlainText)
log_level = EventLevel.DEBUG if flags.DEBUG else EventLevel(flags.LOG_LEVEL)
log_level = (
EventLevel.ERROR
if flags.QUIET
else EventLevel.DEBUG
if flags.DEBUG
else EventLevel(flags.LOG_LEVEL)
)
console_config = _get_stdout_config(
line_format,
flags.DEBUG,
flags.USE_COLORS,
log_level,
flags.LOG_CACHE_EVENTS,
flags.QUIET,
)
EVENT_MANAGER.add_logger(console_config)
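The nested conditional above encodes a simple precedence rule for the console logger: --quiet forces ERROR, otherwise --debug forces DEBUG, otherwise the configured LOG_LEVEL is used. A standalone sketch of that rule, assuming EventLevel is importable from dbt.events.base_types:

from dbt.events.base_types import EventLevel  # assumed import path

def resolve_console_level(quiet: bool, debug: bool, log_level: str) -> EventLevel:
    if quiet:
        return EventLevel.ERROR
    if debug:
        return EventLevel.DEBUG
    return EventLevel(log_level)

# resolve_console_level(quiet=True, debug=True, log_level="info") -> EventLevel.ERROR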
@@ -81,11 +85,9 @@ def _line_format_from_str(format_str: str, default: LineFormat) -> LineFormat:
def _get_stdout_config(
line_format: LineFormat,
debug: bool,
use_colors: bool,
level: EventLevel,
log_cache_events: bool,
quiet: bool,
) -> LoggerConfig:
return LoggerConfig(
@@ -97,8 +99,6 @@ def _get_stdout_config(
filter=partial(
_stdout_filter,
log_cache_events,
debug,
quiet,
line_format,
),
output_stream=sys.stdout,
@@ -107,16 +107,11 @@ def _get_stdout_config(
def _stdout_filter(
log_cache_events: bool,
debug_mode: bool,
quiet_mode: bool,
line_format: LineFormat,
msg: EventMsg,
) -> bool:
return (
(msg.info.name not in ["CacheAction", "CacheDumpGraph"] or log_cache_events)
and (EventLevel(msg.info.level) != EventLevel.DEBUG or debug_mode)
and (EventLevel(msg.info.level) == EventLevel.ERROR or not quiet_mode)
and not (line_format == LineFormat.Json and type(msg.data) == Formatting)
return (msg.info.name not in ["CacheAction", "CacheDumpGraph"] or log_cache_events) and not (
line_format == LineFormat.Json and type(msg.data) == Formatting
)
@@ -147,11 +142,9 @@ def _get_logbook_log_config(
) -> LoggerConfig:
config = _get_stdout_config(
LineFormat.PlainText,
debug,
use_colors,
EventLevel.DEBUG if debug else EventLevel.INFO,
EventLevel.ERROR if quiet else EventLevel.DEBUG if debug else EventLevel.INFO,
log_cache_events,
quiet,
)
config.name = "logbook_log"
config.filter = (
@@ -183,7 +176,7 @@ EVENT_MANAGER: EventManager = EventManager()
EVENT_MANAGER.add_logger(
_get_logbook_log_config(False, True, False, False) # type: ignore
if ENABLE_LEGACY_LOGGER
else _get_stdout_config(LineFormat.PlainText, False, True, EventLevel.INFO, False, False)
else _get_stdout_config(LineFormat.PlainText, True, EventLevel.INFO, False)
)
# This global, and the following two functions for capturing stdout logs are
@@ -247,17 +240,24 @@ def warn_or_error(event, node=None):
# an alternative to fire_event which only creates and logs the event value
# if the condition is met. Does nothing otherwise.
def fire_event_if(
conditional: bool, lazy_e: Callable[[], BaseEvent], level: EventLevel = None
conditional: bool, lazy_e: Callable[[], BaseEvent], level: Optional[EventLevel] = None
) -> None:
if conditional:
fire_event(lazy_e(), level=level)
# a special case of fire_event_if, to only fire events in our unit/functional tests
def fire_event_if_test(
lazy_e: Callable[[], BaseEvent], level: Optional[EventLevel] = None
) -> None:
fire_event_if(conditional=("pytest" in sys.modules), lazy_e=lazy_e, level=level)
# top-level method for accessing the new eventing system
# this is where all the side effects happen branched by event type
# (i.e. - mutating the event history, printing to stdout, logging
# to files, etc.)
def fire_event(e: BaseEvent, level: EventLevel = None) -> None:
def fire_event(e: BaseEvent, level: Optional[EventLevel] = None) -> None:
EVENT_MANAGER.fire_event(e, level=level)
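A hedged usage sketch for the lazy helpers above: the event object is only constructed when the condition holds, and fire_event_if_test additionally gates on running under pytest. Import paths are assumptions, and Formatting is used only because it is a simple message-only event type referenced elsewhere in this diff:

import sys
from dbt.events.base_types import EventLevel          # assumed import paths
from dbt.events.functions import fire_event_if, fire_event_if_test
from dbt.events.types import Formatting

fire_event_if(
    conditional=("pytest" in sys.modules),
    lazy_e=lambda: Formatting(msg="only built and logged when the condition holds"),
    level=EventLevel.DEBUG,
)
fire_event_if_test(lazy_e=lambda: Formatting(msg="fires only under the test suite"))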

View File

@@ -385,6 +385,15 @@ message ConfigTargetPathDeprecationMsg {
ConfigTargetPathDeprecation data = 2;
}
// D012
message CollectFreshnessReturnSignature {
}
message CollectFreshnessReturnSignatureMsg {
EventInfo info = 1;
CollectFreshnessReturnSignature data = 2;
}
// E - DB Adapter
// E001
@@ -662,6 +671,19 @@ message CacheDumpGraphMsg {
// Skipping E032, E033, E034
// E034
message AdapterRegistered {
string adapter_name = 1;
string adapter_version = 2;
}
message AdapterRegisteredMsg {
EventInfo info = 1;
AdapterRegistered data = 2;
}
// E035
message AdapterImportError {
string exc = 1;
@@ -866,26 +888,6 @@ message ParsePerfInfoPathMsg {
ParsePerfInfoPath data = 2;
}
// I011
message GenericTestFileParse {
string path = 1;
}
message GenericTestFileParseMsg {
EventInfo info = 1;
GenericTestFileParse data = 2;
}
// I012
message MacroFileParse {
string path = 1;
}
message MacroFileParseMsg {
EventInfo info = 1;
MacroFileParse data = 2;
}
// Skipping I013
// I014
@@ -1145,7 +1147,7 @@ message JinjaLogInfo {
message JinjaLogInfoMsg {
EventInfo info = 1;
JinjaLogInfo data = 2;
}
// I063
@@ -1159,6 +1161,94 @@ message JinjaLogDebugMsg {
JinjaLogDebug data = 2;
}
// I064
message UnpinnedRefNewVersionAvailable {
NodeInfo node_info = 1;
string ref_node_name = 2;
string ref_node_package = 3;
string ref_node_version = 4;
string ref_max_version = 5;
}
message UnpinnedRefNewVersionAvailableMsg {
EventInfo info = 1;
UnpinnedRefNewVersionAvailable data = 2;
}
// I065
message DeprecatedModel {
string model_name = 1;
string model_version = 2;
string deprecation_date = 3;
}
message DeprecatedModelMsg {
EventInfo info = 1;
DeprecatedModel data = 2;
}
// I066
message UpcomingReferenceDeprecation {
string model_name = 1;
string ref_model_package = 2;
string ref_model_name = 3;
string ref_model_version = 4;
string ref_model_latest_version = 5;
string ref_model_deprecation_date = 6;
}
message UpcomingReferenceDeprecationMsg {
EventInfo info = 1;
UpcomingReferenceDeprecation data = 2;
}
// I067
message DeprecatedReference {
string model_name = 1;
string ref_model_package = 2;
string ref_model_name = 3;
string ref_model_version = 4;
string ref_model_latest_version = 5;
string ref_model_deprecation_date = 6;
}
message DeprecatedReferenceMsg {
EventInfo info = 1;
DeprecatedReference data = 2;
}
// I068
message UnsupportedConstraintMaterialization {
string materialized = 1;
}
message UnsupportedConstraintMaterializationMsg {
EventInfo info = 1;
UnsupportedConstraintMaterialization data = 2;
}
// I069
message ParseInlineNodeError {
NodeInfo node_info = 1;
string exc = 2;
}
message ParseInlineNodeErrorMsg {
EventInfo info = 1;
ParseInlineNodeError data = 2;
}
// I070
message SemanticValidationFailure {
string msg = 2;
}
message SemanticValidationFailureMsg {
EventInfo info = 1;
SemanticValidationFailure data = 2;
}
// M - Deps generation
// M001

View File

@@ -31,6 +31,7 @@ from dbt.node_types import NodeType
# | E | DB adapter |
# | I | Project parsing |
# | M | Deps generation |
# | P | Artifacts |
# | Q | Node execution |
# | W | Node testing |
# | Z | Misc |
@@ -280,7 +281,7 @@ class ConfigSourcePathDeprecation(WarnLevel):
def message(self):
description = (
f"The `{self.deprecated_path}` config has been renamed to `{self.exp_path}`."
f"The `{self.deprecated_path}` config has been renamed to `{self.exp_path}`. "
"Please update your `dbt_project.yml` configuration to reflect this change."
)
return line_wrap_message(warning_tag(f"Deprecated functionality\n\n{description}"))
@@ -292,7 +293,7 @@ class ConfigDataPathDeprecation(WarnLevel):
def message(self):
description = (
f"The `{self.deprecated_path}` config has been renamed to `{self.exp_path}`."
f"The `{self.deprecated_path}` config has been renamed to `{self.exp_path}`. "
"Please update your `dbt_project.yml` configuration to reflect this change."
)
return line_wrap_message(warning_tag(f"Deprecated functionality\n\n{description}"))
@@ -407,6 +408,19 @@ class ConfigTargetPathDeprecation(WarnLevel):
return line_wrap_message(warning_tag(f"Deprecated functionality\n\n{description}"))
class CollectFreshnessReturnSignature(WarnLevel):
def code(self):
return "D012"
def message(self):
description = (
"The 'collect_freshness' macro signature has changed to return the full "
"query result, rather than just a table of values. See the v1.5 migration guide "
"for details on how to update your custom macro: https://docs.getdbt.com/guides/migration/versions/upgrading-to-v1.5"
)
return line_wrap_message(warning_tag(f"Deprecated functionality\n\n{description}"))
# =======================================================
# E - DB Adapter
# =======================================================
@@ -641,6 +655,14 @@ class CacheDumpGraph(DebugLevel):
# Skipping E032, E033, E034
class AdapterRegistered(InfoLevel):
def code(self):
return "E034"
def message(self) -> str:
return f"Registered adapter: {self.adapter_name}{self.adapter_version}"
class AdapterImportError(InfoLevel):
def code(self):
return "E035"
@@ -788,7 +810,7 @@ class InputFileDiffError(DebugLevel):
return f"Error processing file diff: {self.category}, {self.file_id}"
# Skipping I002, I003, I004, I005, I006, I007
# Skipping I003, I004, I005, I006, I007
class InvalidValueForField(WarnLevel):
@@ -815,20 +837,10 @@ class ParsePerfInfoPath(InfoLevel):
return f"Performance info: {self.path}"
class GenericTestFileParse(DebugLevel):
def code(self):
return "I011"
def message(self) -> str:
return f"Parsing {self.path}"
# Removed I011: GenericTestFileParse
class MacroFileParse(DebugLevel):
def code(self):
return "I012"
def message(self) -> str:
return f"Parsing {self.path}"
# Removed I012: MacroFileParse
# Skipping I013
@@ -1118,6 +1130,109 @@ class JinjaLogDebug(DebugLevel):
return self.msg
class UnpinnedRefNewVersionAvailable(InfoLevel):
def code(self):
return "I064"
def message(self) -> str:
msg = (
f"While compiling '{self.node_info.node_name}':\n"
f"Found an unpinned reference to versioned model '{self.ref_node_name}' in project '{self.ref_node_package}'.\n"
f"Resolving to latest version: {self.ref_node_name}.v{self.ref_node_version}\n"
f"A prerelease version {self.ref_max_version} is available. It has not yet been marked 'latest' by its maintainer.\n"
f"When that happens, this reference will resolve to {self.ref_node_name}.v{self.ref_max_version} instead.\n\n"
f" Try out v{self.ref_max_version}: {{{{ ref('{self.ref_node_package}', '{self.ref_node_name}', v='{self.ref_max_version}') }}}}\n"
f" Pin to v{self.ref_node_version}: {{{{ ref('{self.ref_node_package}', '{self.ref_node_name}', v='{self.ref_node_version}') }}}}\n"
)
return msg
class DeprecatedModel(WarnLevel):
def code(self):
return "I065"
def message(self) -> str:
version = ".v" + self.model_version if self.model_version else ""
msg = (
f"Model {self.model_name}{version} has passed its deprecation date of {self.deprecation_date}. "
"This model should be disabled or removed."
)
return warning_tag(msg)
class UpcomingReferenceDeprecation(WarnLevel):
def code(self):
return "I066"
def message(self) -> str:
ref_model_version = ".v" + self.ref_model_version if self.ref_model_version else ""
msg = (
f"While compiling '{self.model_name}': Found a reference to {self.ref_model_name}{ref_model_version}, "
f"which is slated for deprecation on '{self.ref_model_deprecation_date}'. "
)
if self.ref_model_version and self.ref_model_version != self.ref_model_latest_version:
coda = (
f"A new version of '{self.ref_model_name}' is available. Try it out: "
f"{{{{ ref('{self.ref_model_package}', '{self.ref_model_name}', "
f"v='{self.ref_model_latest_version}') }}}}."
)
msg = msg + coda
return warning_tag(msg)
class DeprecatedReference(WarnLevel):
def code(self):
return "I067"
def message(self) -> str:
ref_model_version = ".v" + self.ref_model_version if self.ref_model_version else ""
msg = (
f"While compiling '{self.model_name}': Found a reference to {self.ref_model_name}{ref_model_version}, "
f"which was deprecated on '{self.ref_model_deprecation_date}'. "
)
if self.ref_model_version and self.ref_model_version != self.ref_model_latest_version:
coda = (
f"A new version of '{self.ref_model_name}' is available. Migrate now: "
f"{{{{ ref('{self.ref_model_package}', '{self.ref_model_name}', "
f"v='{self.ref_model_latest_version}') }}}}."
)
msg = msg + coda
return warning_tag(msg)
class UnsupportedConstraintMaterialization(WarnLevel):
def code(self):
return "I068"
def message(self) -> str:
msg = (
f"Constraint types are not supported for {self.materialized} materializations and will "
"be ignored. Set 'warn_unsupported: false' on this constraint to ignore this warning."
)
return line_wrap_message(warning_tag(msg))
class ParseInlineNodeError(ErrorLevel):
def code(self):
return "I069"
def message(self) -> str:
return "Error while parsing node: " + self.node_info.node_name + "\n" + self.exc
class SemanticValidationFailure(WarnLevel):
def code(self):
return "I070"
def message(self) -> str:
return self.msg
# =======================================================
# M - Deps generation
# =======================================================

File diff suppressed because one or more lines are too long

View File

@@ -3,11 +3,11 @@ import json
import re
import io
import agate
from typing import Any, Dict, List, Mapping, Optional, Union
from typing import Any, Dict, List, Mapping, Optional, Tuple, Union
from dbt.dataclass_schema import ValidationError
from dbt.events.helpers import env_secrets, scrub_secrets
from dbt.node_types import NodeType
from dbt.node_types import NodeType, AccessType
from dbt.ui import line_wrap_message
import dbt.dataclass_schema
@@ -212,11 +212,21 @@ class ContractBreakingChangeError(DbtRuntimeError):
MESSAGE = "Breaking Change to Contract"
def __init__(
self, contract_enforced_disabled, columns_removed, column_type_changes, node=None
self,
contract_enforced_disabled: bool,
columns_removed: List[str],
column_type_changes: List[Tuple[str, str, str]],
enforced_column_constraint_removed: List[Tuple[str, str]],
enforced_model_constraint_removed: List[Tuple[str, List[str]]],
materialization_changed: List[str],
node=None,
):
self.contract_enforced_disabled = contract_enforced_disabled
self.columns_removed = columns_removed
self.column_type_changes = column_type_changes
self.enforced_column_constraint_removed = enforced_column_constraint_removed
self.enforced_model_constraint_removed = enforced_model_constraint_removed
self.materialization_changed = materialization_changed
super().__init__(self.message(), node)
@property
@@ -237,6 +247,27 @@ class ContractBreakingChangeError(DbtRuntimeError):
breaking_changes.append(
f"Columns with data_type changes: \n - {column_type_changes_str}"
)
if self.enforced_column_constraint_removed:
column_constraint_changes_str = "\n - ".join(
[f"{c[0]} ({c[1]})" for c in self.enforced_column_constraint_removed]
)
breaking_changes.append(
f"Enforced column level constraints were removed: \n - {column_constraint_changes_str}"
)
if self.enforced_model_constraint_removed:
model_constraint_changes_str = "\n - ".join(
[f"{c[0]} -> {c[1]}" for c in self.enforced_model_constraint_removed]
)
breaking_changes.append(
f"Enforced model level constraints were removed: \n - {model_constraint_changes_str}"
)
if self.materialization_changed:
materialization_changes_str = (
f"{self.materialization_changed[0]} -> {self.materialization_changed[1]}"
)
breaking_changes.append(
f"Materialization changed with enforced constraints: \n - {materialization_changes_str}"
)
reasons = "\n\n".join(breaking_changes)
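For reference, a hedged construction of the widened exception above, with illustrative values for each of the new breaking-change categories:

from dbt.exceptions import ContractBreakingChangeError

raise ContractBreakingChangeError(
    contract_enforced_disabled=False,
    columns_removed=["customer_id"],
    column_type_changes=[("order_total", "numeric", "text")],
    enforced_column_constraint_removed=[("customer_id", "not_null")],
    enforced_model_constraint_removed=[("primary_key", ["customer_id"])],
    materialization_changed=["table", "view"],
)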
@@ -266,6 +297,11 @@ class ParsingError(DbtRuntimeError):
return "Parsing"
class dbtPluginError(DbtRuntimeError):
CODE = 10020
MESSAGE = "Plugin Error"
# TODO: this isn't raised in the core codebase. Is it raised elsewhere?
class JSONValidationError(DbtValidationError):
def __init__(self, typename, errors):
@@ -375,7 +411,7 @@ class DbtProfileError(DbtConfigError):
class SemverError(Exception):
def __init__(self, msg: str = None):
def __init__(self, msg: Optional[str] = None):
self.msg = msg
if msg is not None:
super().__init__(msg)
@@ -654,6 +690,15 @@ class UnknownGitCloningProblemError(DbtRuntimeError):
return msg
class NoAdaptersAvailableError(DbtRuntimeError):
def __init__(self):
super().__init__(msg=self.get_message())
def get_message(self) -> str:
msg = "No adapters available. Learn how to install an adapter by going to https://docs.getdbt.com/docs/connect-adapters#install-using-the-cli"
return msg
class BadSpecError(DbtInternalError):
def __init__(self, repo, revision, error):
self.repo = repo
@@ -1019,6 +1064,17 @@ class DuplicateMacroNameError(CompilationError):
return msg
class MacroResultAlreadyLoadedError(CompilationError):
def __init__(self, result_name):
self.result_name = result_name
super().__init__(msg=self.get_message())
def get_message(self) -> str:
msg = f"The 'statement' result named '{self.result_name}' has already been loaded into a variable"
return msg
# parser level exceptions
class DictParseError(ParsingError):
def __init__(self, exc: ValidationError, node):
@@ -1163,26 +1219,31 @@ class SnapshopConfigError(ParsingError):
class DbtReferenceError(ParsingError):
def __init__(self, unique_id: str, ref_unique_id: str, group: str):
def __init__(self, unique_id: str, ref_unique_id: str, access: AccessType, scope: str):
self.unique_id = unique_id
self.ref_unique_id = ref_unique_id
self.group = group
self.access = access
self.scope = scope
self.scope_type = "group" if self.access == AccessType.Private else "package"
super().__init__(msg=self.get_message())
def get_message(self) -> str:
return (
f"Node {self.unique_id} attempted to reference node {self.ref_unique_id}, "
f"which is not allowed because the referenced node is private to the {self.group} group."
f"which is not allowed because the referenced node is {self.access} to the '{self.scope}' {self.scope_type}."
)
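A hedged example of the reworked message, with illustrative identifiers; with AccessType.Private the scope_type resolves to "group", and any other access value resolves to "package":

from dbt.exceptions import DbtReferenceError
from dbt.node_types import AccessType

raise DbtReferenceError(
    unique_id="model.my_project.downstream_model",
    ref_unique_id="model.my_project.private_model",
    access=AccessType.Private,
    scope="finance",
)
# -> "... the referenced node is private to the 'finance' group."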
class InvalidAccessTypeError(ParsingError):
def __init__(self, unique_id: str, field_value: str):
def __init__(self, unique_id: str, field_value: str, materialization: Optional[str] = None):
self.unique_id = unique_id
self.field_value = field_value
msg = (
f"Node {self.unique_id} has an invalid value ({self.field_value}) for the access field"
self.materialization = materialization
with_materialization = (
f"with '{self.materialization}' materialization " if self.materialization else ""
)
msg = f"Node {self.unique_id} {with_materialization}has an invalid value ({self.field_value}) for the access field"
super().__init__(msg=msg)
@@ -1349,7 +1410,7 @@ class TargetNotFoundError(CompilationError):
target_package_string = ""
if self.target_package is not None:
target_package_string = f"in package '{self.target_package}' "
target_package_string = f"in package or project '{self.target_package}' "
msg = (
f"{resource_type_title} '{unique_id}' ({original_file_path}) depends on a "
@@ -1763,17 +1824,19 @@ class UninstalledPackagesFoundError(CompilationError):
self,
count_packages_specified: int,
count_packages_installed: int,
packages_specified_path: str,
packages_install_path: str,
):
self.count_packages_specified = count_packages_specified
self.count_packages_installed = count_packages_installed
self.packages_specified_path = packages_specified_path
self.packages_install_path = packages_install_path
super().__init__(msg=self.get_message())
def get_message(self) -> str:
msg = (
f"dbt found {self.count_packages_specified} package(s) "
"specified in packages.yml, but only "
f"specified in {self.packages_specified_path}, but only "
f"{self.count_packages_installed} package(s) installed "
f'in {self.packages_install_path}. Run "dbt deps" to '
"install package dependencies."
@@ -1950,6 +2013,23 @@ class AmbiguousAliasError(CompilationError):
return msg
class AmbiguousResourceNameRefError(CompilationError):
def __init__(self, duped_name, unique_ids, node=None):
self.duped_name = duped_name
self.unique_ids = unique_ids
self.packages = [unique_id.split(".")[1] for unique_id in unique_ids]
super().__init__(msg=self.get_message(), node=node)
def get_message(self) -> str:
formatted_unique_ids = "'{0}'".format("', '".join(self.unique_ids))
formatted_packages = "'{0}'".format("' or '".join(self.packages))
msg = (
f"When referencing '{self.duped_name}', dbt found nodes in multiple packages: {formatted_unique_ids}"
f"\nTo fix this, use two-argument 'ref', with the package name first: {formatted_packages}"
)
return msg
class AmbiguousCatalogMatchError(CompilationError):
def __init__(self, unique_id: str, match_1, match_2):
self.unique_id = unique_id
@@ -2145,6 +2225,26 @@ To fix this, change the name of one of these resources:
return msg
class DuplicateVersionedUnversionedError(ParsingError):
def __init__(self, versioned_node, unversioned_node):
self.versioned_node = versioned_node
self.unversioned_node = unversioned_node
super().__init__(msg=self.get_message())
def get_message(self) -> str:
msg = f"""
dbt found versioned and unversioned models with the name "{self.versioned_node.name}".
Since these resources have the same name, dbt will be unable to find the correct resource
when looking for ref('{self.versioned_node.name}').
To fix this, change the name of the unversioned resource
{self.unversioned_node.unique_id} ({self.unversioned_node.original_file_path})
or add the unversioned model to the versions in {self.versioned_node.patch_path}
""".strip()
return msg
class PropertyYMLError(CompilationError):
def __init__(self, path: str, issue: str):
self.path = path
@@ -2230,6 +2330,11 @@ class ContractError(CompilationError):
return table_from_data_flat(mismatches_sorted, column_names)
def get_message(self) -> str:
if not self.yaml_columns:
return (
"This model has an enforced contract, and its 'columns' specification is missing"
)
table: agate.Table = self.get_mismatches()
# Hack to get Agate table output as string
output = io.StringIO()
@@ -2302,7 +2407,7 @@ class RPCCompiling(DbtRuntimeError):
CODE = 10010
MESSAGE = 'RPC server is compiling the project, call the "status" method for' " compile status"
def __init__(self, msg: str = None, node=None):
def __init__(self, msg: Optional[str] = None, node=None):
if msg is None:
msg = "compile in progress"
super().__init__(msg, node)

View File

@@ -3,6 +3,7 @@ from os import getenv as os_getenv
from argparse import Namespace
from multiprocessing import get_context
from typing import Optional
from pathlib import Path
# for setting up logger for legacy logger
@@ -86,6 +87,7 @@ def get_flag_dict():
"introspect",
"target_path",
"log_path",
"invocation_command",
}
return {key: getattr(GLOBAL_FLAGS, key.upper(), None) for key in flag_attr}
@@ -95,6 +97,8 @@ def get_flag_dict():
def get_flag_obj():
new_flags = Namespace()
for key, val in get_flag_dict().items():
if isinstance(val, Path):
val = str(val)
setattr(new_flags, key.upper(), val)
# The following 3 are CLI arguments only so they're not full-fledged flags,
# but we put in flags for users.

View File

@@ -28,16 +28,29 @@ class Graph:
"""Returns all nodes having a path to `node` in `graph`"""
if not self.graph.has_node(node):
raise DbtInternalError(f"Node {node} not found in the graph!")
filtered_graph = self.exclude_edge_type("parent_test")
return {
child
for _, child in nx.bfs_edges(self.graph, node, reverse=True, depth_limit=max_depth)
for _, child in nx.bfs_edges(filtered_graph, node, reverse=True, depth_limit=max_depth)
}
def descendants(self, node: UniqueId, max_depth: Optional[int] = None) -> Set[UniqueId]:
"""Returns all nodes reachable from `node` in `graph`"""
if not self.graph.has_node(node):
raise DbtInternalError(f"Node {node} not found in the graph!")
return {child for _, child in nx.bfs_edges(self.graph, node, depth_limit=max_depth)}
filtered_graph = self.exclude_edge_type("parent_test")
return {child for _, child in nx.bfs_edges(filtered_graph, node, depth_limit=max_depth)}
def exclude_edge_type(self, edge_type_to_exclude):
return nx.restricted_view(
self.graph,
nodes=[],
edges=(
(a, b)
for a, b in self.graph.edges
if self.graph[a][b].get("edge_type") == edge_type_to_exclude
),
)
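A self-contained sketch of the edge-filtering trick above, using plain networkx to show that restricted_view hides "parent_test" edges during traversal (node names are illustrative):

import networkx as nx

g = nx.DiGraph()
g.add_edge("model.a", "model.b")
g.add_edge("model.b", "test.b_not_null", edge_type="parent_test")

filtered = nx.restricted_view(
    g,
    nodes=[],
    edges=[(a, b) for a, b in g.edges if g[a][b].get("edge_type") == "parent_test"],
)

# The test node is unreachable in the filtered view, so descendants computed
# over it no longer pull in attached tests.
print({child for _, child in nx.bfs_edges(filtered, "model.a")})  # {'model.b'}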
def select_childrens_parents(self, selected: Set[UniqueId]) -> Set[UniqueId]:
ancestors_for = self.select_children(selected) | selected

View File

@@ -36,16 +36,18 @@ def can_select_indirectly(node):
class NodeSelector(MethodManager):
"""The node selector is aware of the graph and manifest,"""
"""The node selector is aware of the graph and manifest"""
def __init__(
self,
graph: Graph,
manifest: Manifest,
previous_state: Optional[PreviousState] = None,
include_empty_nodes: bool = False,
):
super().__init__(manifest, previous_state)
self.full_graph = graph
self.include_empty_nodes = include_empty_nodes
# build a subgraph containing only non-empty, enabled nodes and enabled
# sources.
@@ -166,8 +168,14 @@ class NodeSelector(MethodManager):
elif unique_id in self.manifest.metrics:
metric = self.manifest.metrics[unique_id]
return metric.config.enabled
elif unique_id in self.manifest.semantic_models:
return True
node = self.manifest.nodes[unique_id]
return not node.empty and node.config.enabled
if self.include_empty_nodes:
return node.config.enabled
else:
return not node.empty and node.config.enabled
def node_is_match(self, node: GraphMemberNode) -> bool:
"""Determine if a node is a match for the selector. Non-match nodes
@@ -185,6 +193,8 @@ class NodeSelector(MethodManager):
node = self.manifest.exposures[unique_id]
elif unique_id in self.manifest.metrics:
node = self.manifest.metrics[unique_id]
elif unique_id in self.manifest.semantic_models:
node = self.manifest.semantic_models[unique_id]
else:
raise DbtInternalError(f"Node {unique_id} not found in the manifest!")
return self.node_is_match(node)
@@ -313,11 +323,13 @@ class ResourceTypeSelector(NodeSelector):
manifest: Manifest,
previous_state: Optional[PreviousState],
resource_types: List[NodeType],
include_empty_nodes: bool = False,
):
super().__init__(
graph=graph,
manifest=manifest,
previous_state=previous_state,
include_empty_nodes=include_empty_nodes,
)
self.resource_types: Set[NodeType] = set(resource_types)

View File

@@ -26,6 +26,7 @@ from dbt.exceptions import (
DbtRuntimeError,
)
from dbt.node_types import NodeType
from dbt.events.contextvars import get_project_root
SELECTOR_GLOB = "*"
@@ -36,6 +37,7 @@ class MethodName(StrEnum):
FQN = "fqn"
Tag = "tag"
Group = "group"
Access = "access"
Source = "source"
Path = "path"
File = "file"
@@ -59,8 +61,8 @@ def is_selected_node(fqn: List[str], node_selector: str, is_versioned: bool) ->
flat_node_selector = node_selector.split(".")
if fqn[-2] == node_selector:
return True
# If this is a versioned model, then the last two segments should be allowed to exactly match
elif fqn[-2:] == flat_node_selector[-2:]:
# If this is a versioned model, then the last two segments should be allowed to exactly match on either the '.' or '_' delimiter
elif "_".join(fqn[-2:]) == "_".join(flat_node_selector[-2:]):
return True
else:
if fqn[-1] == node_selector:
@@ -230,6 +232,16 @@ class GroupSelectorMethod(SelectorMethod):
yield node
class AccessSelectorMethod(SelectorMethod):
def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
"""yields model nodes matching the specified access level"""
for node, real_node in self.parsed_nodes(included_nodes):
if not isinstance(real_node, ModelNode):
continue
if selector == real_node.access:
yield node
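Presumably this makes access-based graph selection available alongside the existing tag: and group: methods, e.g. a selector value of access:public to pick only public models; since only ModelNode instances are considered, sources, tests, and other resource types never match.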
class SourceSelectorMethod(SelectorMethod):
def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
"""yields nodes from included are the specified source."""
@@ -313,8 +325,12 @@ class MetricSelectorMethod(SelectorMethod):
class PathSelectorMethod(SelectorMethod):
def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
"""Yields nodes from included that match the given path."""
# use '.' and not 'root' for easy comparison
root = Path.cwd()
# get project root from contextvar
project_root = get_project_root()
if project_root:
root = Path(project_root)
else:
root = Path.cwd()
paths = set(p.relative_to(root) for p in root.glob(selector))
for node, real_node in self.all_nodes(included_nodes):
ofp = Path(real_node.original_file_path)
@@ -335,6 +351,8 @@ class FileSelectorMethod(SelectorMethod):
for node, real_node in self.all_nodes(included_nodes):
if fnmatch(Path(real_node.original_file_path).name, selector):
yield node
elif fnmatch(Path(real_node.original_file_path).stem, selector):
yield node
class PackageSelectorMethod(SelectorMethod):
@@ -419,6 +437,8 @@ class ResourceTypeSelectorMethod(SelectorMethod):
class TestNameSelectorMethod(SelectorMethod):
__test__ = False
def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
for node, real_node in self.parsed_nodes(included_nodes):
if real_node.resource_type == NodeType.Test and hasattr(real_node, "test_metadata"):
@@ -427,6 +447,8 @@ class TestNameSelectorMethod(SelectorMethod):
class TestTypeSelectorMethod(SelectorMethod):
__test__ = False
def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
search_type: Type
# continue supporting 'schema' + 'data' for backwards compatibility
@@ -514,12 +536,24 @@ class StateSelectorMethod(SelectorMethod):
return self.recursively_check_macros_modified(node, visited_macros)
# TODO: check_modified_content and check_modified_macros seem a bit redundant
def check_modified_content(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
different_contents = not new.same_contents(old) # type: ignore
def check_modified_content(
self, old: Optional[SelectorTarget], new: SelectorTarget, adapter_type: str
) -> bool:
if isinstance(new, (SourceDefinition, Exposure, Metric)):
# these all overwrite `same_contents`
different_contents = not new.same_contents(old) # type: ignore
else:
different_contents = not new.same_contents(old, adapter_type) # type: ignore
upstream_macro_change = self.check_macros_modified(new)
return different_contents or upstream_macro_change
def check_modified_macros(self, _, new: SelectorTarget) -> bool:
def check_unmodified_content(
self, old: Optional[SelectorTarget], new: SelectorTarget, adapter_type: str
) -> bool:
return not self.check_modified_content(old, new, adapter_type)
def check_modified_macros(self, old, new: SelectorTarget) -> bool:
return self.check_macros_modified(new)
@staticmethod
@@ -536,6 +570,21 @@ class StateSelectorMethod(SelectorMethod):
return check_modified_things
@staticmethod
def check_modified_contract(
compare_method: str,
adapter_type: Optional[str],
) -> Callable[[Optional[SelectorTarget], SelectorTarget], bool]:
# get a function that compares two selector targets based on the compare method provided
def check_modified_contract(old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, compare_method):
# when old body does not exist or old and new are not the same
return not old or not getattr(new, compare_method)(old, adapter_type) # type: ignore
else:
return False
return check_modified_contract
def check_new(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
return old is None
@@ -543,11 +592,15 @@ class StateSelectorMethod(SelectorMethod):
if self.previous_state is None or self.previous_state.manifest is None:
raise DbtRuntimeError("Got a state selector method, but no comparison manifest")
adapter_type = self.manifest.metadata.adapter_type
state_checks = {
# it's new if there is no old version
"new": lambda old, _: old is None,
"new": lambda old, new: old is None,
"old": lambda old, new: old is not None,
# use methods defined above to compare properties of old + new
"modified": self.check_modified_content,
"unmodified": self.check_unmodified_content,
"modified.body": self.check_modified_factory("same_body"),
"modified.configs": self.check_modified_factory("same_config"),
"modified.persisted_descriptions": self.check_modified_factory(
@@ -555,7 +608,7 @@ class StateSelectorMethod(SelectorMethod):
),
"modified.relation": self.check_modified_factory("same_database_representation"),
"modified.macros": self.check_modified_macros,
"modified.contract": self.check_modified_factory("same_contract"),
"modified.contract": self.check_modified_contract("same_contract", adapter_type),
}
if selector in state_checks:
checker = state_checks[selector]
@@ -568,6 +621,7 @@ class StateSelectorMethod(SelectorMethod):
for node, real_node in self.all_nodes(included_nodes):
previous_node: Optional[SelectorTarget] = None
if node in manifest.nodes:
previous_node = manifest.nodes[node]
elif node in manifest.sources:
@@ -577,7 +631,15 @@ class StateSelectorMethod(SelectorMethod):
elif node in manifest.metrics:
previous_node = manifest.metrics[node]
if checker(previous_node, real_node):
keyword_args = {}
if checker.__name__ in [
"same_contract",
"check_modified_content",
"check_unmodified_content",
]:
keyword_args["adapter_type"] = adapter_type # type: ignore
if checker(previous_node, real_node, **keyword_args): # type: ignore
yield node
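With the added dictionary entries above, state:old (the resource also exists in the comparison manifest) and state:unmodified (the negation of the content check) presumably become selectable alongside state:modified, and modified.contract is now routed through the adapter-aware same_contract comparison so contract checks can account for per-adapter constraint support.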
@@ -684,6 +746,7 @@ class MethodManager:
MethodName.FQN: QualifiedNameSelectorMethod,
MethodName.Tag: TagSelectorMethod,
MethodName.Group: GroupSelectorMethod,
MethodName.Access: AccessSelectorMethod,
MethodName.Source: SourceSelectorMethod,
MethodName.Path: PathSelectorMethod,
MethodName.File: FileSelectorMethod,

View File

@@ -17,15 +17,18 @@
{% endmacro %}
{% macro get_empty_subquery_sql(select_sql) -%}
{{ return(adapter.dispatch('get_empty_subquery_sql', 'dbt')(select_sql)) }}
{% macro get_empty_subquery_sql(select_sql, select_sql_header=none) -%}
{{ return(adapter.dispatch('get_empty_subquery_sql', 'dbt')(select_sql, select_sql_header)) }}
{% endmacro %}
{#
Builds a query that results in the same schema as the given select_sql statement, without necessitating a data scan.
Useful for running a query in a 'pre-flight' context, such as model contract enforcement (assert_columns_equivalent macro).
#}
{% macro default__get_empty_subquery_sql(select_sql) %}
{% macro default__get_empty_subquery_sql(select_sql, select_sql_header=none) %}
{%- if select_sql_header is not none -%}
{{ select_sql_header }}
{%- endif -%}
select * from (
{{ select_sql }}
) as __dbt_sbq
@@ -46,17 +49,18 @@
{%- if col['data_type'] is not defined -%}
{{ col_err.append(col['name']) }}
{%- endif -%}
cast(null as {{ col['data_type'] }}) as {{ col['name'] }}{{ ", " if not loop.last }}
{% set col_name = adapter.quote(col['name']) if col.get('quote') else col['name'] %}
cast(null as {{ col['data_type'] }}) as {{ col_name }}{{ ", " if not loop.last }}
{%- endfor -%}
{%- if (col_err | length) > 0 -%}
{{ exceptions.column_type_missing(column_names=col_err) }}
{%- endif -%}
{% endmacro %}
{% macro get_column_schema_from_query(select_sql) -%}
{% macro get_column_schema_from_query(select_sql, select_sql_header=none) -%}
{% set columns = [] %}
{# -- Using an 'empty subquery' here to get the same schema as the given select_sql statement, without necessitating a data scan.#}
{% set sql = get_empty_subquery_sql(select_sql) %}
{% set sql = get_empty_subquery_sql(select_sql, select_sql_header) %}
{% set column_schema = adapter.get_column_schema_from_query(sql) %}
{{ return(column_schema) }}
{% endmacro %}
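Threading select_sql_header through these macros presumably lets a configured sql_header (session settings, temporary UDFs, and the like) run ahead of the zero-row subquery, so the column schema inferred for contract enforcement is computed under the same session state as the real model query.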

View File

@@ -0,0 +1,44 @@
{% macro drop_relation(relation) -%}
{{ return(adapter.dispatch('drop_relation', 'dbt')(relation)) }}
{% endmacro %}
{% macro default__drop_relation(relation) -%}
{% call statement('drop_relation', auto_begin=False) -%}
{%- if relation.is_table -%}
{{- drop_table(relation) -}}
{%- elif relation.is_view -%}
{{- drop_view(relation) -}}
{%- elif relation.is_materialized_view -%}
{{- drop_materialized_view(relation) -}}
{%- else -%}
drop {{ relation.type }} if exists {{ relation }} cascade
{%- endif -%}
{%- endcall %}
{% endmacro %}
{% macro drop_table(relation) -%}
{{ return(adapter.dispatch('drop_table', 'dbt')(relation)) }}
{%- endmacro %}
{% macro default__drop_table(relation) -%}
drop table if exists {{ relation }} cascade
{%- endmacro %}
{% macro drop_view(relation) -%}
{{ return(adapter.dispatch('drop_view', 'dbt')(relation)) }}
{%- endmacro %}
{% macro default__drop_view(relation) -%}
drop view if exists {{ relation }} cascade
{%- endmacro %}
{% macro drop_materialized_view(relation) -%}
{{ return(adapter.dispatch('drop_materialized_view', 'dbt')(relation)) }}
{%- endmacro %}
{% macro default__drop_materialized_view(relation) -%}
drop materialized view if exists {{ relation }} cascade
{%- endmacro %}

View File

@@ -21,3 +21,21 @@
{% endif %}
{% endfor %}
{% endmacro %}
{% macro get_drop_index_sql(relation, index_name) -%}
{{ adapter.dispatch('get_drop_index_sql', 'dbt')(relation, index_name) }}
{%- endmacro %}
{% macro default__get_drop_index_sql(relation, index_name) -%}
{{ exceptions.raise_compiler_error("`get_drop_index_sql has not been implemented for this adapter.") }}
{%- endmacro %}
{% macro get_show_indexes_sql(relation) -%}
{{ adapter.dispatch('get_show_indexes_sql', 'dbt')(relation) }}
{%- endmacro %}
{% macro default__get_show_indexes_sql(relation) -%}
{{ exceptions.raise_compiler_error("`get_show_indexes_sql has not been implemented for this adapter.") }}
{%- endmacro %}

View File

@@ -31,16 +31,6 @@
{{ return(backup_relation) }}
{% endmacro %}
{% macro drop_relation(relation) -%}
{{ return(adapter.dispatch('drop_relation', 'dbt')(relation)) }}
{% endmacro %}
{% macro default__drop_relation(relation) -%}
{% call statement('drop_relation', auto_begin=False) -%}
drop {{ relation.type }} if exists {{ relation }} cascade
{%- endcall %}
{% endmacro %}
{% macro truncate_relation(relation) -%}
{{ return(adapter.dispatch('truncate_relation', 'dbt')(relation)) }}

View File

@@ -0,0 +1,10 @@
{% macro validate_sql(sql) -%}
{{ return(adapter.dispatch('validate_sql', 'dbt')(sql)) }}
{% endmacro %}
{% macro default__validate_sql(sql) -%}
{% call statement('validate_sql') -%}
explain {{ sql }}
{% endcall %}
{{ return(load_result('validate_sql')) }}
{% endmacro %}
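The default implementation validates by running explain over the supplied SQL and returning the loaded result, so it checks the statement against the warehouse planner without executing it; adapters whose engines use a different dry-run or planning syntax will presumably need to override default__validate_sql.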

Some files were not shown because too many files have changed in this diff.