Compare commits


49 Commits

Author SHA1 Message Date
Michelle Ark
e603d9e2fa sqlparse <0.4.4 2023-04-18 10:18:43 -04:00
github-actions[bot]
7d1c7518f8 Raise upper pin for hologram to 0.0.16 (#7221) (#7290)
(cherry picked from commit b718c537a7)

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2023-04-07 15:34:47 -04:00
Jeremy Cohen
30f8c6269a Make use of hashlib.md5() FIPS compliant (#6982) (#7074)
Signed-off-by: Niels Pardon <par@zurich.ibm.com>
Co-authored-by: Niels Pardon <mail@niels-pardon.de>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-04-04 15:53:29 +02:00
FishtownBuildBot
65fcbe7b79 [Automated] Merged prep-release/1.4.5_4389129737 into target 1.4.latest during release process 2023-03-10 17:30:19 -06:00
Github Build Bot
f745e7c823 Bumping version to 1.4.5 and generate changelog 2023-03-10 23:03:03 +00:00
colin-rogers-dbt
14f966c24f backport 7115 (#7152) 2023-03-10 10:46:39 -08:00
Doug Beatty
d0a6ea96e4 Add new index.html and changelog yaml files from dbt-docs (#7142)
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2023-03-08 13:26:27 -07:00
github-actions[bot]
f3f1244eb0 [CT-1959]: moving simple_seed tests to adapter zone (#6859) (#7131)
* Formatting

* Changelog entry

* Rename to BaseSimpleSeedColumnOverride

* Better error handling

* Update test to include the BOM test

* Cleanup and formating

* Unused import remove

* nit line

* Pr comments

(cherry picked from commit 4c63b630de)

Co-authored-by: Neelesh Salian <nssalian@users.noreply.github.com>
2023-03-07 08:37:59 -08:00
github-actions[bot]
0c067350da CT 2057 Fix compilation logic for ephemeral nodes (#7023) (#7037)
* Don't overwrite sql in extra_ctes when compiling (rendering) nodes

(cherry picked from commit 70c26f5c74)

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2023-03-01 07:53:49 -05:00
FishtownBuildBot
57a6963913 [Automated] Merged prep-release/1.4.4_4294080763 into target 1.4.latest during release process 2023-02-28 09:07:01 -06:00
Github Build Bot
d2c9ee46a1 Bumping version to 1.4.4 and generate changelog 2023-02-28 14:38:35 +00:00
Sam Debruyn
89fa60d86a fix: add pytz dependency (#7078) 2023-02-28 09:07:34 -05:00
FishtownBuildBot
e39505fec6 [Automated] Merged prep-release/1.4.3_4258471710 into target 1.4.latest during release process 2023-02-23 20:11:31 -06:00
Github Build Bot
c82795a99f Bumping version to 1.4.3 and generate changelog 2023-02-24 01:35:10 +00:00
Neelesh Salian
f0f4af5c8d Fixing the release secret (#7043) 2023-02-23 17:31:04 -08:00
github-actions[bot]
3fe1502c15 Fix regression in semver comparison logic (#7040) (#7041)
(cherry picked from commit 915585c36e)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2023-02-23 16:33:28 -08:00
FishtownBuildBot
cc2f259108 [Automated] Merged prep-release/1.4.2_4255074059 into target 1.4.latest during release process 2023-02-23 11:41:42 -06:00
Github Build Bot
51d216186d Bumping version to 1.4.2 and generate changelog 2023-02-23 17:14:29 +00:00
leahwicz
bdf7c1ff82 Revert "CT 2057 Fix compilation logic for ephemeral nodes (#7023) (#7032)" (#7035)
This reverts commit f23dbb6f6b.
2023-02-23 12:09:33 -05:00
github-actions[bot]
f23dbb6f6b CT 2057 Fix compilation logic for ephemeral nodes (#7023) (#7032)
* Don't overwrite sql in extra_ctes when compiling (rendering) nodes

(cherry picked from commit 70c26f5c74)

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2023-02-22 18:05:00 -05:00
FishtownBuildBot
ec17ecec86 [Automated] Merged prep-release/1.4.2rc2_4246496113 into target 1.4.latest during release process 2023-02-22 14:36:47 -06:00
Github Build Bot
a3d326404d Bumping version to 1.4.2rc2 and generate changelog 2023-02-22 20:06:40 +00:00
Peter Webb
c785f320ea CT-2160: Fix logbook (legacy logging) regression (#7029)
* CT-2160: Fix logbook (legacy logging) regression

* CT-2160: Changelog entry
2023-02-22 14:19:10 -05:00
FishtownBuildBot
1fe0c32890 [Automated] Merged prep-release/1.4.2rc1_4187540534 into target 1.4.latest during release process 2023-02-15 14:15:08 -06:00
Github Build Bot
11752edef8 Bumping version to 1.4.2rc1 and generate changelog 2023-02-15 19:47:50 +00:00
github-actions[bot]
16cb498b56 [Backport 1.4.latest] Set relation_name in tests at compile time (#6978)
* Set relation_name in tests at compile time (#6949)

(cherry picked from commit 480e0e55c5)

* Tweak artifact test

---------

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2023-02-14 18:33:12 -05:00
Gerda Shank
03f6655f1a Remove empty changie file (#6977) 2023-02-14 15:52:35 -05:00
Gerda Shank
8dc3d0a531 Fix state comparison with disabled exposure (or metric) (#6967) 2023-02-13 21:55:18 -05:00
github-actions[bot]
c172687736 CT 2000 fix semver prerelease comparisons (#6838) (#6958)
* Modify semver.py to not use packaging.version.parse

* Changie

(cherry picked from commit d9424cc710)

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2023-02-13 22:39:59 +01:00
Emily Rockman
8a31eab181 Add most recent dbt-docs changes (#6923) (#6925)
* Add new index.html and changelog yaml files from dbt-docs

* Update .changes/unreleased/Docs-20230209-082901.yaml

---------

Co-authored-by: FishtownBuildBot <77737458+FishtownBuildBot@users.noreply.github.com>
2023-02-10 11:14:33 -06:00
Peter Webb
4d0ee2fc47 Ensure flush() after logging write() (#6909) (#6922)
* ct-2063: Ensure flush after logging, by using Python's logging subsystem directly

* ct-2063: Add changelog entry
2023-02-09 10:48:30 -05:00
github-actions[bot]
ddb2f0f71d [Backport 1.4.latest] Add back depends_on for seeds - only macros, never nodes (#6920)
* Add back `depends_on` for seeds - only `macros`, never `nodes` (#6851)

* Extend functional tests for seeds w hooks

* Add MacroDependsOn to seeds, raise exception for other deps

* Add changelog entry

* Fix unit tests

* Update upgrade_seed_content

* Cleanup

* Regen manifest v8 schema. Fix tests

* Be less magical

* PR feedback

(cherry picked from commit 298bf8a1d4)

* Update manifest v8 for v1.4.x

* Cleanup

---------

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2023-02-09 12:37:46 +01:00
Emily Rockman
2bb2e73df9 1.4 regression: Check if status has node attribute (#6899) (#6904)
* check for node

* add changelog

* add test for regression
2023-02-08 15:00:32 -06:00
Peter Webb
790fecab92 CT-1917: Fix a regression in the behavior of the -q/--quiet cli parameter (#6886) (#6889) 2023-02-08 10:38:42 -05:00
Emily Rockman
f6926c0ed6 [BACKPORT] 1.4.latest update regex to match all iterations (#6839) (#6846)
* update regex to match all iterations (#6839)

* update regex to match all iterations

* convert to num to match all adapters

* add comments, remove extra .

* clarify with more comments

* Update .bumpversion.cfg

Co-authored-by: Nathaniel May <nathaniel.may@fishtownanalytics.com>

---------

Co-authored-by: Nathaniel May <nathaniel.may@fishtownanalytics.com>
# Conflicts:
#	.bumpversion.cfg

* put back correct version

* put back correct version
2023-02-06 13:54:34 -06:00
Alexander Smolyakov
52be5ffaa6 [CI/CD] Backport release workflow to 1.4.latest (#6793)
* [CI/CD] Update release workflow and introduce workflow for nightly releases (#6602)

* Add release workflows

* Update nightly-release.yml

* Set default `test_run` value to `true`

* Update .bumpversion.cfg

* Resolve review comment

- Update workflow docs
- Change workflow name
- Set `test_run` default value to `true`

* Update Slack secret

* PyPI

* Update release workflow (#6778)

- Update AWS secrets
- Rework condition for Slack notification

---------

Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
2023-02-01 14:20:33 -06:00
Emily Rockman
31bfff81d9 fix contributor list generation (#6799) (#6808) 2023-01-31 21:39:13 -06:00
github-actions[bot]
3fbfa6e6f3 [Backport 1.4.latest] CT 1894 log partial parsing var changes and sort cli vars before hashing (#6758)
* CT 1894 log partial parsing var changes and sort cli vars before hashing (#6713)

* Log information about vars_hash, normalize cli_vars before hashing

* Changie
2023-01-26 15:04:25 -05:00
github-actions[bot]
165035692f Bumping version to 1.4.1 and generate changelog (#6743)
* Bumping version to 1.4.1rc1 and generate CHANGELOG

* Bumping version to 1.4.1 and generate CHANGELOG (#6744)

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>

* Adjust docker version image

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2023-01-25 20:48:44 -05:00
github-actions[bot]
80f76e9e6e change exposure_content to source_content (#6739) (#6740)
* change `exposure_content` to `source_content`

* Adding changelog

Co-authored-by: Leah Antkiewicz <leah.antkiewicz@fishtownanalytics.com>
(cherry picked from commit b0651b13b5)

Co-authored-by: Matthew Beall <matthew@beall.org>
2023-01-25 20:11:38 -05:00
github-actions[bot]
a2e7249c44 Bumping version to 1.4.0 and generate CHANGELOG (#6727)
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2023-01-25 10:17:36 -05:00
github-actions[bot]
9700ff1866 CT 1886 include adapter_response in NodeFinished log message (#6709) (#6714)
* Include adapter_response in run_result in NodeFinished log event

* Changie

(cherry picked from commit e2ccf011d9)

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2023-01-24 15:38:51 -05:00
colin-rogers-dbt
e61a39f27f mv on_schema_change tests -> "adapter zone" (#6618) (#6686)
* mv `on_schema_change` tests -> "adapter zone" (#6618)

* Mv incremental on_schema_change tests to 'adapter zone'

* Use type_string()

* Cleanup

* mv `on_schema_change` tests -> "adapter zone" (#6618)

* Mv incremental on_schema_change tests to 'adapter zone'

* Use type_string()

* Cleanup

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2023-01-20 12:26:32 -08:00
github-actions[bot]
d934e713db Bumping version to 1.4.0rc2 and generate CHANGELOG (#6661)
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2023-01-19 12:13:06 -05:00
github-actions[bot]
ef9bb925d3 add backwards compatibility and default argument for incremental_predicates (#6628) (#6660)
* add backwards compatibility and default argument

* changie <3

* Update .changes/unreleased/Fixes-20230117-101342.yaml

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
(cherry picked from commit f841a7ca76)

Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
2023-01-19 10:24:49 -05:00
github-actions[bot]
f73359b87c [Backport 1.4.latest] convert 062_defer_state_tests (#6657)
* convert 062_defer_state_tests (#6616)

* Fix --favor-state flag

* Convert 062_defer_state_tests

* Revert "Fix --favor-state flag"

This reverts commit ccbdcbad98b26822629364e6fdbd2780db0c20d3.

* Reformat

* Revert "Revert "Fix --favor-state flag""

This reverts commit fa9d2a09d693b1870bd724a694fce2883748c987.

(cherry picked from commit 07a004b301)

* Add changelog entry

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2023-01-19 13:27:23 +01:00
Emily Rockman
b4706c4dec finish message rename in types.proto (#6594) (#6596)
* finish message rename in types.proto

* add new parameter
2023-01-13 10:20:34 -06:00
github-actions[bot]
b46d35c13f Call update_event_status earlier + rename an event (#6572) (#6591)
* Rename HookFinished -> FinishedRunningStats

* Move update_event_status earlier when node finishes

* Add changelog entry

* Add update_event_status for skip

* Update changelog entry

(cherry picked from commit 86e8722cd8)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2023-01-13 11:53:14 +01:00
github-actions[bot]
eba90863ed Bumping version to 1.4.0rc1 and generate CHANGELOG (#6569)
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2023-01-10 22:04:55 -05:00
699 changed files with 32399 additions and 70039 deletions


@@ -1,18 +1,14 @@
[bumpversion]
current_version = 1.7.0a1
current_version = 1.4.5
parse = (?P<major>[\d]+) # major version number
\.(?P<minor>[\d]+) # minor version number
\.(?P<patch>[\d]+) # patch version number
(?P<prerelease> # optional pre-release - ex: a1, b2, rc25
(?P<prekind>a|b|rc) # pre-release type
(?P<num>[\d]+) # pre-release version number
(((?P<prekind>a|b|rc) # optional pre-release type
?(?P<num>[\d]+?)) # optional pre-release version number
\.?(?P<nightly>[a-z0-9]+\+[a-z]+)? # optional nightly release indicator
)?
( # optional nightly release indicator
\.(?P<nightly>dev[0-9]+) # ex: .dev02142023
)? # expected matches: `1.15.0`, `1.5.0a11`, `1.5.0a1.dev123`, `1.5.0.dev123457`, expected failures: `1`, `1.5`, `1.5.2-a1`, `text1.5.0`
serialize =
{major}.{minor}.{patch}{prekind}{num}.{nightly}
{major}.{minor}.{patch}.{nightly}
{major}.{minor}.{patch}{prekind}{num}
{major}.{minor}.{patch}
commit = False
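
The comment above lists the versions this pattern should and should not accept. A minimal Python sketch (a reconstruction for illustration, not the exact bumpversion config) that checks those documented examples:

```python
import re

# Reconstruction of the version pattern described in the config comment above.
VERSION = re.compile(
    r"^(?P<major>\d+)"
    r"\.(?P<minor>\d+)"
    r"\.(?P<patch>\d+)"
    r"(?:(?P<prekind>a|b|rc)(?P<num>\d+))?"  # optional pre-release: a1, b2, rc25
    r"(?:\.(?P<nightly>dev\d+))?$"           # optional nightly: .dev02142023
)

# Expected matches and failures, copied from the config comment.
for v in ("1.15.0", "1.5.0a11", "1.5.0a1.dev123", "1.5.0.dev123457"):
    assert VERSION.match(v), f"should match: {v}"
for v in ("1", "1.5", "1.5.2-a1", "text1.5.0"):
    assert not VERSION.match(v), f"should fail: {v}"
```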


@@ -3,9 +3,6 @@
For information on prior major and minor releases, see their changelogs:
* [1.6](https://github.com/dbt-labs/dbt-core/blob/1.6.latest/CHANGELOG.md)
* [1.5](https://github.com/dbt-labs/dbt-core/blob/1.5.latest/CHANGELOG.md)
* [1.4](https://github.com/dbt-labs/dbt-core/blob/1.4.latest/CHANGELOG.md)
* [1.3](https://github.com/dbt-labs/dbt-core/blob/1.3.latest/CHANGELOG.md)
* [1.2](https://github.com/dbt-labs/dbt-core/blob/1.2.latest/CHANGELOG.md)
* [1.1](https://github.com/dbt-labs/dbt-core/blob/1.1.latest/CHANGELOG.md)

.changes/1.4.0.md Normal file

@@ -0,0 +1,126 @@
## dbt-core 1.4.0 - January 25, 2023
### Breaking Changes
- Cleaned up exceptions to directly raise in code. Also updated the existing exceptions to meet PEP guidelines. Removed use of all exception functions in the code base and marked them all as deprecated, to be removed in the next minor release. ([#6339](https://github.com/dbt-labs/dbt-core/issues/6339), [#6393](https://github.com/dbt-labs/dbt-core/issues/6393), [#6460](https://github.com/dbt-labs/dbt-core/issues/6460))
### Features
- Added favor-state flag to optionally favor state nodes even if unselected node exists ([#5016](https://github.com/dbt-labs/dbt-core/issues/5016))
- Update structured logging. Convert to using protobuf messages. Ensure events are enriched with node_info. ([#5610](https://github.com/dbt-labs/dbt-core/issues/5610))
- incremental predicates ([#5680](https://github.com/dbt-labs/dbt-core/issues/5680))
- Friendlier error messages when packages.yml is malformed ([#5486](https://github.com/dbt-labs/dbt-core/issues/5486))
- Allow partitions in external tables to be supplied as a list ([#5929](https://github.com/dbt-labs/dbt-core/issues/5929))
- extend -f flag shorthand for seed command ([#5990](https://github.com/dbt-labs/dbt-core/issues/5990))
- This pulls the profile name from args when constructing a RuntimeConfig in lib.py, enabling the dbt-server to override the value that's in the dbt_project.yml ([#6201](https://github.com/dbt-labs/dbt-core/issues/6201))
- Adding tarball install method for packages. Allowing package tarball to be specified via url in the packages.yaml. ([#4205](https://github.com/dbt-labs/dbt-core/issues/4205))
- Added an md5 function to the base context ([#6246](https://github.com/dbt-labs/dbt-core/issues/6246))
- Exposures support metrics in lineage ([#6057](https://github.com/dbt-labs/dbt-core/issues/6057))
- Add support for Python 3.11 ([#6147](https://github.com/dbt-labs/dbt-core/issues/6147))
- Making timestamp optional for metrics ([#6398](https://github.com/dbt-labs/dbt-core/issues/6398))
- The meta configuration field is now included in the node_info property of structured logs. ([#6216](https://github.com/dbt-labs/dbt-core/issues/6216))
- Adds buildable selection mode ([#6365](https://github.com/dbt-labs/dbt-core/issues/6365))
- --warn-error-options: Treat warnings as errors for specific events, based on user configuration ([#6165](https://github.com/dbt-labs/dbt-core/issues/6165))
### Fixes
- Account for disabled flags on models in schema files more completely ([#3992](https://github.com/dbt-labs/dbt-core/issues/3992))
- Add validation of enabled config for metrics, exposures and sources ([#6030](https://github.com/dbt-labs/dbt-core/issues/6030))
- check length of args of python model function before accessing it ([#6041](https://github.com/dbt-labs/dbt-core/issues/6041))
- Add functors to ensure event types with str-type attributes are initialized to spec, even when provided non-str type params. ([#5436](https://github.com/dbt-labs/dbt-core/issues/5436))
- Allow hooks to fail without halting execution flow ([#5625](https://github.com/dbt-labs/dbt-core/issues/5625))
- fix missing f-strings, convert old .format() messages to f-strings for consistency ([#6241](https://github.com/dbt-labs/dbt-core/issues/6241))
- Clarify Error Message for how many models are allowed in a Python file ([#6245](https://github.com/dbt-labs/dbt-core/issues/6245))
- Fix typo in util.py ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904))
- After this, will be possible to use default values for dbt.config.get ([#6309](https://github.com/dbt-labs/dbt-core/issues/6309))
- Use full path for writing manifest ([#6055](https://github.com/dbt-labs/dbt-core/issues/6055))
- add pre-commit install to make dev script in Makefile ([#6269](https://github.com/dbt-labs/dbt-core/issues/6269))
- Late-rendering for `pre_` and `post_hook`s in `dbt_project.yml` ([#6411](https://github.com/dbt-labs/dbt-core/issues/6411))
- [CT-1284] Change Python model default materialization to table ([#5989](https://github.com/dbt-labs/dbt-core/issues/5989))
- [CT-1591] Don't parse empty Python files ([#6345](https://github.com/dbt-labs/dbt-core/issues/6345))
- Repair a regression which prevented basic logging before the logging subsystem is completely configured. ([#6434](https://github.com/dbt-labs/dbt-core/issues/6434))
- fix docs generate --defer by adding defer_to_manifest to before_run ([#6488](https://github.com/dbt-labs/dbt-core/issues/6488))
- Bug when partial parsing with an empty schema file ([#4850](https://github.com/dbt-labs/dbt-core/issues/4850))
- Fix DBT_FAVOR_STATE env var ([#5859](https://github.com/dbt-labs/dbt-core/issues/5859))
- Restore historical behavior of certain disabled test messages, so that they are at the less obtrusive debug level, rather than the warning level. ([#6501](https://github.com/dbt-labs/dbt-core/issues/6501))
- Bump mashumuro version to get regression fix and add unit test to verify that fix. ([#6428](https://github.com/dbt-labs/dbt-core/issues/6428))
- Call update_event_status earlier for node results. Rename event 'HookFinished' -> FinishedRunningStats ([#6571](https://github.com/dbt-labs/dbt-core/issues/6571))
- Provide backward compatibility for `get_merge_sql` arguments ([#6625](https://github.com/dbt-labs/dbt-core/issues/6625))
- Fix behavior of --favor-state with --defer ([#6617](https://github.com/dbt-labs/dbt-core/issues/6617))
- Include adapter_response in NodeFinished run_result log event ([#6703](https://github.com/dbt-labs/dbt-core/issues/6703))
### Docs
- minor doc correction ([dbt-docs/#5791](https://github.com/dbt-labs/dbt-docs/issues/5791))
- Generate API docs for new CLI interface ([dbt-docs/#5528](https://github.com/dbt-labs/dbt-docs/issues/5528))
- ([dbt-docs/#5880](https://github.com/dbt-labs/dbt-docs/issues/5880))
- Fix rendering of sample code for metrics ([dbt-docs/#323](https://github.com/dbt-labs/dbt-docs/issues/323))
- Alphabetize `core/dbt/README.md` ([dbt-docs/#6368](https://github.com/dbt-labs/dbt-docs/issues/6368))
- Updated minor typos encountered when skipping profile setup ([dbt-docs/#6529](https://github.com/dbt-labs/dbt-docs/issues/6529))
### Under the Hood
- Put black config in explicit config ([#5946](https://github.com/dbt-labs/dbt-core/issues/5946))
- Added flat_graph attribute the Manifest class's deepcopy() coverage ([#5809](https://github.com/dbt-labs/dbt-core/issues/5809))
- Add mypy configs so `mypy` passes from CLI ([#5983](https://github.com/dbt-labs/dbt-core/issues/5983))
- Exception message cleanup. ([#6023](https://github.com/dbt-labs/dbt-core/issues/6023))
- Add dmypy cache to gitignore ([#6028](https://github.com/dbt-labs/dbt-core/issues/6028))
- Provide useful errors when the value of 'materialized' is invalid ([#5229](https://github.com/dbt-labs/dbt-core/issues/5229))
- Clean up string formatting ([#6068](https://github.com/dbt-labs/dbt-core/issues/6068))
- Fixed extra whitespace in strings introduced by black. ([#1350](https://github.com/dbt-labs/dbt-core/issues/1350))
- Remove the 'root_path' field from most nodes ([#6171](https://github.com/dbt-labs/dbt-core/issues/6171))
- Combine certain logging events with different levels ([#6173](https://github.com/dbt-labs/dbt-core/issues/6173))
- Convert threading tests to pytest ([#5942](https://github.com/dbt-labs/dbt-core/issues/5942))
- Convert postgres index tests to pytest ([#5770](https://github.com/dbt-labs/dbt-core/issues/5770))
- Convert use color tests to pytest ([#5771](https://github.com/dbt-labs/dbt-core/issues/5771))
- Add github actions workflow to generate high level CLI API docs ([#5942](https://github.com/dbt-labs/dbt-core/issues/5942))
- Functionality-neutral refactor of event logging system to improve encapsulation and modularity. ([#6139](https://github.com/dbt-labs/dbt-core/issues/6139))
- Consolidate ParsedNode and CompiledNode classes ([#6383](https://github.com/dbt-labs/dbt-core/issues/6383))
- Prevent doc gen workflow from running on forks ([#6386](https://github.com/dbt-labs/dbt-core/issues/6386))
- Fix intermittent database connection failure in Windows CI test ([#6394](https://github.com/dbt-labs/dbt-core/issues/6394))
- Refactor and clean up manifest nodes ([#6426](https://github.com/dbt-labs/dbt-core/issues/6426))
- Restore important legacy logging behaviors, following refactor which removed them ([#6437](https://github.com/dbt-labs/dbt-core/issues/6437))
- Treat dense text blobs as binary for `git grep` ([#6294](https://github.com/dbt-labs/dbt-core/issues/6294))
- Prune partial parsing logging events ([#6313](https://github.com/dbt-labs/dbt-core/issues/6313))
- Updating the deprecation warning in the metric attributes renamed event ([#6507](https://github.com/dbt-labs/dbt-core/issues/6507))
- [CT-1693] Port severity test to Pytest ([#6466](https://github.com/dbt-labs/dbt-core/issues/6466))
- [CT-1694] Deprecate event tracking tests ([#6467](https://github.com/dbt-labs/dbt-core/issues/6467))
- Reorganize structured logging events to have two top keys ([#6311](https://github.com/dbt-labs/dbt-core/issues/6311))
- Combine some logging events ([#1716](https://github.com/dbt-labs/dbt-core/issues/1716), [#1717](https://github.com/dbt-labs/dbt-core/issues/1717), [#1719](https://github.com/dbt-labs/dbt-core/issues/1719))
- Check length of escaped strings in the adapter test ([#6566](https://github.com/dbt-labs/dbt-core/issues/6566))
### Dependencies
- Update pathspec requirement from ~=0.9.0 to >=0.9,<0.11 in /core ([#5917](https://github.com/dbt-labs/dbt-core/pull/5917))
- Bump black from 22.8.0 to 22.10.0 ([#6019](https://github.com/dbt-labs/dbt-core/pull/6019))
- Bump mashumaro[msgpack] from 3.0.4 to 3.1.1 in /core ([#6108](https://github.com/dbt-labs/dbt-core/pull/6108))
- Update colorama requirement from <0.4.6,>=0.3.9 to >=0.3.9,<0.4.7 in /core ([#6144](https://github.com/dbt-labs/dbt-core/pull/6144))
- Bump mashumaro[msgpack] from 3.1.1 to 3.2 in /core ([#6375](https://github.com/dbt-labs/dbt-core/pull/6375))
- Update agate requirement from <1.6.4,>=1.6 to >=1.6,<1.7.1 in /core ([#6506](https://github.com/dbt-labs/dbt-core/pull/6506))
### Contributors
- [@NiallRees](https://github.com/NiallRees) ([#5859](https://github.com/dbt-labs/dbt-core/issues/5859))
- [@agpapa](https://github.com/agpapa) ([#6365](https://github.com/dbt-labs/dbt-core/issues/6365))
- [@andy-clapson](https://github.com/andy-clapson) ([dbt-docs/#5791](https://github.com/dbt-labs/dbt-docs/issues/5791))
- [@callum-mcdata](https://github.com/callum-mcdata) ([#6398](https://github.com/dbt-labs/dbt-core/issues/6398), [#6507](https://github.com/dbt-labs/dbt-core/issues/6507))
- [@chamini2](https://github.com/chamini2) ([#6041](https://github.com/dbt-labs/dbt-core/issues/6041))
- [@daniel-murray](https://github.com/daniel-murray) ([#5016](https://github.com/dbt-labs/dbt-core/issues/5016))
- [@dave-connors-3](https://github.com/dave-connors-3) ([#5680](https://github.com/dbt-labs/dbt-core/issues/5680), [#5990](https://github.com/dbt-labs/dbt-core/issues/5990), [#6625](https://github.com/dbt-labs/dbt-core/issues/6625))
- [@dbeatty10](https://github.com/dbeatty10) ([#6411](https://github.com/dbt-labs/dbt-core/issues/6411), [dbt-docs/#6368](https://github.com/dbt-labs/dbt-docs/issues/6368), [#6394](https://github.com/dbt-labs/dbt-core/issues/6394), [#6294](https://github.com/dbt-labs/dbt-core/issues/6294), [#6566](https://github.com/dbt-labs/dbt-core/issues/6566))
- [@devmessias](https://github.com/devmessias) ([#6309](https://github.com/dbt-labs/dbt-core/issues/6309))
- [@eltociear](https://github.com/eltociear) ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904))
- [@eve-johns](https://github.com/eve-johns) ([#6068](https://github.com/dbt-labs/dbt-core/issues/6068))
- [@haritamar](https://github.com/haritamar) ([#6246](https://github.com/dbt-labs/dbt-core/issues/6246))
- [@jared-rimmer](https://github.com/jared-rimmer) ([#5486](https://github.com/dbt-labs/dbt-core/issues/5486))
- [@josephberni](https://github.com/josephberni) ([#5016](https://github.com/dbt-labs/dbt-core/issues/5016))
- [@joshuataylor](https://github.com/joshuataylor) ([#6147](https://github.com/dbt-labs/dbt-core/issues/6147))
- [@justbldwn](https://github.com/justbldwn) ([#6241](https://github.com/dbt-labs/dbt-core/issues/6241), [#6245](https://github.com/dbt-labs/dbt-core/issues/6245), [#6269](https://github.com/dbt-labs/dbt-core/issues/6269))
- [@luke-bassett](https://github.com/luke-bassett) ([#1350](https://github.com/dbt-labs/dbt-core/issues/1350))
- [@max-sixty](https://github.com/max-sixty) ([#5946](https://github.com/dbt-labs/dbt-core/issues/5946), [#5983](https://github.com/dbt-labs/dbt-core/issues/5983), [#6028](https://github.com/dbt-labs/dbt-core/issues/6028))
- [@mivanicova](https://github.com/mivanicova) ([#6488](https://github.com/dbt-labs/dbt-core/issues/6488))
- [@nshuman1](https://github.com/nshuman1) ([dbt-docs/#6529](https://github.com/dbt-labs/dbt-docs/issues/6529))
- [@paulbenschmidt](https://github.com/paulbenschmidt) ([dbt-docs/#5880](https://github.com/dbt-labs/dbt-docs/issues/5880))
- [@pgoslatara](https://github.com/pgoslatara) ([#5929](https://github.com/dbt-labs/dbt-core/issues/5929))
- [@racheldaniel](https://github.com/racheldaniel) ([#6201](https://github.com/dbt-labs/dbt-core/issues/6201))
- [@timle2](https://github.com/timle2) ([#4205](https://github.com/dbt-labs/dbt-core/issues/4205))
- [@tmastny](https://github.com/tmastny) ([#6216](https://github.com/dbt-labs/dbt-core/issues/6216))

.changes/1.4.1.md Normal file

@@ -0,0 +1,8 @@
## dbt-core 1.4.1 - January 26, 2023
### Fixes
- [Regression] exposure_content referenced incorrectly ([#6738](https://github.com/dbt-labs/dbt-core/issues/6738))
### Contributors
- [@Mathyoub](https://github.com/Mathyoub) ([#6738](https://github.com/dbt-labs/dbt-core/issues/6738))

.changes/1.4.2.md Normal file

@@ -0,0 +1,20 @@
## dbt-core 1.4.2 - February 23, 2023
### Fixes
- Sort cli vars before hashing for partial parsing ([#6710](https://github.com/dbt-labs/dbt-core/issues/6710))
- Remove pin on packaging and stop using it for prerelease comparisons ([#6834](https://github.com/dbt-labs/dbt-core/issues/6834))
- Readd depends_on.macros to SeedNode, to support seeds with hooks calling macros ([#6806](https://github.com/dbt-labs/dbt-core/issues/6806))
- Fix regression of --quiet cli parameter behavior ([#6749](https://github.com/dbt-labs/dbt-core/issues/6749))
- Ensure results from hooks contain nodes when processing them ([#6796](https://github.com/dbt-labs/dbt-core/issues/6796))
- Always flush stdout after logging ([#6901](https://github.com/dbt-labs/dbt-core/issues/6901))
- Set relation_name in test nodes at compile time ([#6930](https://github.com/dbt-labs/dbt-core/issues/6930))
- Fix disabled definition in WritableManifest ([#6752](https://github.com/dbt-labs/dbt-core/issues/6752))
- Fix regression in logbook log output ([#7028](https://github.com/dbt-labs/dbt-core/issues/7028))
### Docs
- Fix JSON path to overview docs ([dbt-docs/#366](https://github.com/dbt-labs/dbt-docs/issues/366))
### Contributors
- [@halvorlu](https://github.com/halvorlu) ([#366](https://github.com/dbt-labs/dbt-core/issues/366))

.changes/1.4.3.md Normal file

@@ -0,0 +1,5 @@
## dbt-core 1.4.3 - February 24, 2023
### Fixes
- Fix semver comparison logic by ensuring numeric values ([#7039](https://github.com/dbt-labs/dbt-core/issues/7039))
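
Illustrative of the bug class fixed here: comparing version components as strings orders them lexicographically. A two-line sketch (assuming the fix casts matched components to integers before comparing, as the entry's title suggests):

```python
# String comparison is lexicographic, so "10" sorts before "9" -- wrong for semver.
assert "10" < "9"
# Casting to int restores numeric ordering.
assert int("10") > int("9")
```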

.changes/1.4.4.md Normal file

@@ -0,0 +1,8 @@
## dbt-core 1.4.4 - February 28, 2023
### Fixes
- add pytz dependency ([#7077](https://github.com/dbt-labs/dbt-core/issues/7077))
### Contributors
- [@sdebruyn](https://github.com/sdebruyn) ([#7077](https://github.com/dbt-labs/dbt-core/issues/7077))

.changes/1.4.5.md Normal file

@@ -0,0 +1,19 @@
## dbt-core 1.4.5 - March 10, 2023
### Fixes
- Fix compilation logic for ephemeral nodes ([#6885](https://github.com/dbt-labs/dbt-core/issues/6885))
- allow adapters to change model name resolution in py models ([#7114](https://github.com/dbt-labs/dbt-core/issues/7114))
### Docs
- Fix JSON path to package overview docs ([dbt-docs/#390](https://github.com/dbt-labs/dbt-docs/issues/390))
### Under the Hood
- Moving simple_seed to adapter zone to help adapter test conversions ([#CT-1959](https://github.com/dbt-labs/dbt-core/issues/CT-1959))
### Contributors
- [@dbeatty10](https://github.com/dbeatty10) ([#390](https://github.com/dbt-labs/dbt-core/issues/390))
- [@nssalian](https://github.com/nssalian) ([#CT-1959](https://github.com/dbt-labs/dbt-core/issues/CT-1959))
- [@rlh1994](https://github.com/rlh1994) ([#390](https://github.com/dbt-labs/dbt-core/issues/390))


@@ -1,6 +0,0 @@
kind: Dependencies
body: Pin click<9 + sqlparse<0.5
time: 2023-07-19T12:37:43.716495+02:00
custom:
Author: jtcohen6
PR: "8146"


@@ -1,6 +0,0 @@
kind: Docs
body: Fix for column tests not rendering on quoted columns
time: 2023-05-31T11:54:19.687363-04:00
custom:
Author: drewbanin
Issue: "201"


@@ -1,6 +0,0 @@
kind: Docs
body: Remove static SQL codeblock for metrics
time: 2023-07-18T19:24:22.155323+02:00
custom:
Author: marcodamore
Issue: "436"


@@ -0,0 +1,6 @@
kind: Fixes
body: Make use of hashlib.md5() FIPS compliant
time: 2023-02-15T10:45:36.755797+01:00
custom:
Author: nielspardon
Issue: "6900"


@@ -1,6 +0,0 @@
kind: Fixes
body: Enable converting deprecation warnings to errors
time: 2023-07-18T12:55:18.03914-04:00
custom:
Author: michelleark
Issue: "8130"


@@ -0,0 +1,6 @@
kind: Under the Hood
body: Remove upper pin for hologram/jsonschema
time: 2023-03-24T14:40:50.574108-04:00
custom:
Author: gshank
Issue: "6775"


@@ -4,7 +4,6 @@ headerPath: header.tpl.md
versionHeaderPath: ""
changelogPath: CHANGELOG.md
versionExt: md
envPrefix: "CHANGIE_"
versionFormat: '## dbt-core {{.Version}} - {{.Time.Format "January 02, 2006"}}'
kindFormat: '### {{.Kind}}'
changeFormat: |-
@@ -88,21 +87,15 @@ custom:
footerFormat: |
{{- $contributorDict := dict }}
{{- /* ensure all names in this list are all lowercase for later matching purposes */}}
{{- $core_team := splitList " " .Env.CORE_TEAM }}
{{- /* ensure we always skip snyk and dependabot in addition to the core team */}}
{{- $maintainers := list "dependabot[bot]" "snyk-bot"}}
{{- range $team_member := $core_team }}
{{- $team_member_lower := lower $team_member }}
{{- $maintainers = append $maintainers $team_member_lower }}
{{- end }}
{{- /* any names added to this list should be all lowercase for later matching purposes */}}
{{- $core_team := list "michelleark" "peterallenwebb" "emmyoop" "nathaniel-may" "gshank" "leahwicz" "chenyulinx" "stu-k" "iknox-fa" "versusfacit" "mcknight-42" "jtcohen6" "aranke" "dependabot[bot]" "snyk-bot" "colin-rogers-dbt" }}
{{- range $change := .Changes }}
{{- $authorList := splitList " " $change.Custom.Author }}
{{- /* loop through all authors for a single changelog */}}
{{- range $author := $authorList }}
{{- $authorLower := lower $author }}
{{- /* we only want to include non-core team contributors */}}
{{- if not (has $authorLower $maintainers)}}
{{- if not (has $authorLower $core_team)}}
{{- $changeList := splitList " " $change.Custom.Author }}
{{- $IssueList := list }}
{{- $changeLink := $change.Kind }}


@@ -9,4 +9,4 @@ ignore =
E203 # makes Flake8 work like black
E741
E501 # long line checking is done in black
exclude = test/
exclude = test

.gitattributes vendored

@@ -1,6 +1,2 @@
core/dbt/include/index.html binary
tests/functional/artifacts/data/state/*/manifest.json binary
core/dbt/docs/build/html/searchindex.js binary
core/dbt/docs/build/html/index.html binary
performance/runner/Cargo.lock binary
core/dbt/events/types_pb2.py binary

.github/CODEOWNERS vendored

@@ -11,24 +11,44 @@
# As a default for areas with no assignment,
# the core team as a whole will be assigned
* @dbt-labs/core-team
* @dbt-labs/core
### OSS Tooling Guild
# Changes to GitHub configurations including Actions
/.github/ @leahwicz
/.github/ @dbt-labs/guild-oss-tooling
.bumpversion.cfg @dbt-labs/guild-oss-tooling
### LANGUAGE
.changie.yaml @dbt-labs/guild-oss-tooling
# Language core modules
/core/dbt/config/ @dbt-labs/core-language
/core/dbt/context/ @dbt-labs/core-language
/core/dbt/contracts/ @dbt-labs/core-language
/core/dbt/deps/ @dbt-labs/core-language
/core/dbt/events/ @dbt-labs/core-language # structured logging
/core/dbt/parser/ @dbt-labs/core-language
pre-commit-config.yaml @dbt-labs/guild-oss-tooling
pytest.ini @dbt-labs/guild-oss-tooling
tox.ini @dbt-labs/guild-oss-tooling
# Language misc files
/core/dbt/dataclass_schema.py @dbt-labs/core-language
/core/dbt/hooks.py @dbt-labs/core-language
/core/dbt/node_types.py @dbt-labs/core-language
/core/dbt/semver.py @dbt-labs/core-language
### EXECUTION
# Execution core modules
/core/dbt/graph/ @dbt-labs/core-execution
/core/dbt/task/ @dbt-labs/core-execution
# Execution misc files
/core/dbt/compilation.py @dbt-labs/core-execution
/core/dbt/flags.py @dbt-labs/core-execution
/core/dbt/lib.py @dbt-labs/core-execution
/core/dbt/main.py @dbt-labs/core-execution
/core/dbt/profiler.py @dbt-labs/core-execution
/core/dbt/selected_resources.py @dbt-labs/core-execution
/core/dbt/tracking.py @dbt-labs/core-execution
/core/dbt/version.py @dbt-labs/core-execution
pyproject.toml @dbt-labs/guild-oss-tooling
requirements.txt @dbt-labs/guild-oss-tooling
dev_requirements.txt @dbt-labs/guild-oss-tooling
/core/setup.py @dbt-labs/guild-oss-tooling
/core/MANIFEST.in @dbt-labs/guild-oss-tooling
### ADAPTERS
@@ -40,7 +60,6 @@ dev_requirements.txt @dbt-labs/guild-oss-tooling
# Postgres plugin
/plugins/ @dbt-labs/core-adapters
/plugins/postgres/setup.py @dbt-labs/core-adapters @dbt-labs/guild-oss-tooling
# Functional tests for adapter plugins
/tests/adapter @dbt-labs/core-adapters
@@ -52,9 +71,5 @@ dev_requirements.txt @dbt-labs/guild-oss-tooling
# Perf regression testing framework
# This excludes the test project files itself since those aren't specific
# framework changes (excluded by not setting an owner next to it- no owner)
/performance @nathaniel-may
/performance @nathaniel-may
/performance/projects
### ARTIFACTS
/schemas/dbt @dbt-labs/cloud-artifacts

.github/_README.md vendored

@@ -63,12 +63,12 @@ permissions:
contents: read
pull-requests: write
```
### Secrets
- When to use a [Personal Access Token (PAT)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) vs the [GITHUB_TOKEN](https://docs.github.com/en/actions/security-guides/automatic-token-authentication) generated for the action?
The `GITHUB_TOKEN` is used by default. In most cases it is sufficient for what you need.
If you expect the workflow to result in a commit that should retrigger workflows, you will need to use a Personal Access Token for the bot to commit the file. When using the GITHUB_TOKEN, the resulting commit will not trigger another GitHub Actions Workflow run. This is due to limitations set by GitHub. See [the docs](https://docs.github.com/en/actions/security-guides/automatic-token-authentication#using-the-github_token-in-a-workflow) for a more detailed explanation.
For example, we must use a PAT in our workflow to commit a new changelog yaml file for bot PRs. Once the file has been committed to the branch, it should retrigger the check to validate that a changelog exists on the PR. Otherwise, it would stay in a failed state since the check would never retrigger.
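
A minimal sketch of that pattern: checking out with a PAT (here the repo's `FISHTOWN_BOT_PAT` secret, which appears elsewhere in this diff) so a commit pushed by the workflow still retriggers checks. Illustrative only, not a specific workflow from this comparison:

```yaml
# Sketch: use a PAT instead of the default GITHUB_TOKEN so that the
# bot's pushed commit retriggers workflow runs on the PR branch.
- name: git checkout
  uses: actions/checkout@v3
  with:
    ref: ${{ github.head_ref }}
    token: ${{ secrets.FISHTOWN_BOT_PAT }}
```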
@@ -105,7 +105,7 @@ Some triggers of note that we use:
```
# **what?**
# Describe what the action does.
# Describe what the action does.
# **why?**
# Why does this action exist?
@@ -138,7 +138,7 @@ Some triggers of note that we use:
id: fp
run: |
FILEPATH=.changes/unreleased/Dependencies-${{ steps.filename_time.outputs.time }}.yaml
echo "FILEPATH=$FILEPATH" >> $GITHUB_OUTPUT
echo "::set-output name=FILEPATH::$FILEPATH"
```
- Print out all variables you will reference as the first step of a job. This allows for easier debugging. The first job should log all inputs. Subsequent jobs should reference outputs of other jobs, if present.
@@ -158,14 +158,14 @@ Some triggers of note that we use:
echo "The build_script_path: ${{ inputs.build_script_path }}"
echo "The s3_bucket_name: ${{ inputs.s3_bucket_name }}"
echo "The package_test_command: ${{ inputs.package_test_command }}"
# collect all the variables that need to be used in subsequent jobs
- name: Set Variables
id: variables
run: |
echo "important_path='performance/runner/Cargo.toml'" >> $GITHUB_OUTPUT
echo "release_id=${{github.event.inputs.release_id}}" >> $GITHUB_OUTPUT
echo "open_prs=${{github.event.inputs.open_prs}}" >> $GITHUB_OUTPUT
echo "::set-output name=important_path::'performance/runner/Cargo.toml'"
echo "::set-output name=release_id::${{github.event.inputs.release_id}}"
echo "::set-output name=open_prs::${{github.event.inputs.open_prs}}"
job2:
needs: [job1]
@@ -190,14 +190,14 @@ ___
### Actions from the Marketplace
- Don't use external actions for things that can easily be accomplished manually.
- Always read through what an external action does before using it! Often an action in the GitHub Actions Marketplace can be replaced with a few lines in bash. This is much more maintainable (and won't change under us) and clear as to what's actually happening. It also prevents any
- Pin actions _we don't control_ to tags.
- Pin actions _we don't control_ to tags.
### Connecting to AWS
- Authenticate with the aws managed workflow
```yaml
- name: Configure AWS credentials from Test account
uses: aws-actions/configure-aws-credentials@v2
uses: aws-actions/configure-aws-credentials@v1
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
@@ -208,7 +208,7 @@ ___
```yaml
- name: Copy Artifacts from S3 via CLI
run: aws s3 cp ${{ env.s3_bucket }} . --recursive
run: aws s3 cp ${{ env.s3_bucket }} . --recursive
```
### Testing


@@ -35,7 +35,7 @@ jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v1
- name: Wrangle latest tag
id: is_latest
uses: ./.github/actions/latest-wrangler


@@ -13,7 +13,7 @@ jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v1
- name: Wrangle latest tag
id: is_latest
uses: ./.github/actions/latest-wrangler


@@ -28,12 +28,11 @@ if __name__ == "__main__":
if package_request.status_code == 404:
if halt_on_missing:
sys.exit(1)
# everything is the latest if the package doesn't exist
github_output = os.environ.get("GITHUB_OUTPUT")
with open(github_output, "at", encoding="utf-8") as gh_output:
gh_output.write("latest=True")
gh_output.write("minor_latest=True")
sys.exit(0)
else:
# everything is the latest if the package doesn't exist
print(f"::set-output name=latest::{True}")
print(f"::set-output name=minor_latest::{True}")
sys.exit(0)
# TODO: verify package meta is "correct"
# https://github.com/dbt-labs/dbt-core/issues/4640
@@ -92,7 +91,5 @@ if __name__ == "__main__":
latest = is_latest(pre_rel, new_version, current_latest)
minor_latest = is_latest(pre_rel, new_version, current_minor_latest)
github_output = os.environ.get("GITHUB_OUTPUT")
with open(github_output, "at", encoding="utf-8") as gh_output:
gh_output.write(f"latest={latest}")
gh_output.write(f"minor_latest={minor_latest}")
print(f"::set-output name=latest::{latest}")
print(f"::set-output name=minor_latest::{minor_latest}")


@@ -1,35 +1,23 @@
resolves #
[docs](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose) dbt-labs/docs.getdbt.com/#
resolves #
<!---
Include the number of the issue addressed by this PR above if applicable.
PRs for code changes without an associated issue *will not be merged*.
See CONTRIBUTING.md for more information.
Include the number of the docs issue that was opened for this PR. If
this change has no user-facing implications, "N/A" suffices instead. New
docs tickets can be created by clicking the link above or by going to
https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose.
-->
### Problem
### Description
<!---
Describe the problem this PR is solving. What is the application state
before this PR is merged?
-->
### Solution
<!---
Describe the way this PR solves the above problem. Add as much detail as you
can to help reviewers understand your changes. Include any alternatives and
tradeoffs you considered.
Describe the Pull Request here. Add any references and info to help reviewers
understand your changes. Include any tradeoffs you considered.
-->
### Checklist
- [ ] I have read [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) and understand what's expected of me
- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] I have read [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) and understand what's expected of me
- [ ] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements)
- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] This PR includes tests, or tests are not required/relevant for this PR
- [ ] This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
- [ ] I have [opened an issue to add/update docs](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose), or docs changes are not required/relevant for this PR
- [ ] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-a-changelog-entry)


@@ -35,6 +35,6 @@ jobs:
github.event.pull_request.merged
&& contains(github.event.label.name, 'backport')
steps:
- uses: tibdex/backport@v2.0.3
- uses: tibdex/backport@v2.0.2
with:
github_token: ${{ secrets.GITHUB_TOKEN }}


@@ -50,7 +50,7 @@ jobs:
- name: Create and commit changelog on bot PR
if: ${{ contains(github.event.pull_request.labels.*.name, matrix.label) }}
id: bot_changelog
uses: emmyoop/changie_bot@v1.1.0
uses: emmyoop/changie_bot@v1.0.1
with:
GITHUB_TOKEN: ${{ secrets.FISHTOWN_BOT_PAT }}
commit_author_name: "Github Build Bot"


@@ -1,41 +0,0 @@
# **what?**
# Cuts a new `*.latest` branch
# Also cleans up all files in `.changes/unreleased` and `.changes/previous version` on
# `main` and bumps `main` to the input version.
# **why?**
# Generally reduces the workload of engineers and reduces error. Allow automation.
# **when?**
# This will run when called manually.
name: Cut new release branch
on:
workflow_dispatch:
inputs:
version_to_bump_main:
description: 'The alpha version main should bump to (ex. 1.6.0a1)'
required: true
new_branch_name:
description: 'The full name of the new branch (ex. 1.5.latest)'
required: true
defaults:
run:
shell: bash
permissions:
contents: write
jobs:
cut_branch:
name: "Cut branch and clean up main for dbt-core"
uses: dbt-labs/actions/.github/workflows/cut-release-branch.yml@main
with:
version_to_bump_main: ${{ inputs.version_to_bump_main }}
new_branch_name: ${{ inputs.new_branch_name }}
PR_title: "Cleanup main after cutting new ${{ inputs.new_branch_name }} branch"
PR_body: "All adapter PRs will fail CI until the dbt-core PR has been merged due to release version conflicts."
secrets:
FISHTOWN_BOT_PAT: ${{ secrets.FISHTOWN_BOT_PAT }}


@@ -0,0 +1,165 @@
# **what?**
# On push, if anything in core/dbt/docs or core/dbt/cli has been
# created or modified, regenerate the CLI API docs using sphinx.
# **why?**
# We watch for changes in core/dbt/cli because the CLI API docs rely on click
# and all supporting flags/params to be generated. We watch for changes in
# core/dbt/docs since any changes to sphinx configuration or any of the
# .rst files there could result in a differently built final index.html file.
# **when?**
# Whenever a change has been pushed to a branch, and only if there is a diff
# between the PR branch and main's core/dbt/cli and/or core/dbt/docs dirs.
# TODO: add bot comment to PR informing contributor that the docs have been committed
# TODO: figure out why github action triggered pushes cause github to fail to report
# the status of jobs
name: Generate CLI API docs
on:
pull_request:
permissions:
contents: write
pull-requests: write
env:
CLI_DIR: ${{ github.workspace }}/core/dbt/cli
DOCS_DIR: ${{ github.workspace }}/core/dbt/docs
DOCS_BUILD_DIR: ${{ github.workspace }}/core/dbt/docs/build
jobs:
check_gen:
name: check if generation needed
runs-on: ubuntu-latest
if: ${{ github.event.pull_request.head.repo.fork == false }}
outputs:
cli_dir_changed: ${{ steps.check_cli.outputs.cli_dir_changed }}
docs_dir_changed: ${{ steps.check_docs.outputs.docs_dir_changed }}
steps:
- name: "[DEBUG] print variables"
run: |
echo "env.CLI_DIR: ${{ env.CLI_DIR }}"
echo "env.DOCS_BUILD_DIR: ${{ env.DOCS_BUILD_DIR }}"
echo "env.DOCS_DIR: ${{ env.DOCS_DIR }}"
- name: git checkout
uses: actions/checkout@v3
with:
fetch-depth: 0
ref: ${{ github.head_ref }}
- name: set shas
id: set_shas
run: |
THIS_SHA=$(git rev-parse @)
LAST_SHA=$(git rev-parse @~1)
echo "this sha: $THIS_SHA"
echo "last sha: $LAST_SHA"
echo "this_sha=$THIS_SHA" >> $GITHUB_OUTPUT
echo "last_sha=$LAST_SHA" >> $GITHUB_OUTPUT
- name: check for changes in core/dbt/cli
id: check_cli
run: |
CLI_DIR_CHANGES=$(git diff \
${{ steps.set_shas.outputs.last_sha }} \
${{ steps.set_shas.outputs.this_sha }} \
-- ${{ env.CLI_DIR }})
if [ -n "$CLI_DIR_CHANGES" ]; then
echo "changes found"
echo $CLI_DIR_CHANGES
echo "cli_dir_changed=true" >> $GITHUB_OUTPUT
exit 0
fi
echo "cli_dir_changed=false" >> $GITHUB_OUTPUT
echo "no changes found"
- name: check for changes in core/dbt/docs
id: check_docs
if: steps.check_cli.outputs.cli_dir_changed == 'false'
run: |
DOCS_DIR_CHANGES=$(git diff --name-only \
${{ steps.set_shas.outputs.last_sha }} \
${{ steps.set_shas.outputs.this_sha }} \
-- ${{ env.DOCS_DIR }} ':!${{ env.DOCS_BUILD_DIR }}')
DOCS_BUILD_DIR_CHANGES=$(git diff --name-only \
${{ steps.set_shas.outputs.last_sha }} \
${{ steps.set_shas.outputs.this_sha }} \
-- ${{ env.DOCS_BUILD_DIR }})
if [ -n "$DOCS_DIR_CHANGES" ] && [ -z "$DOCS_BUILD_DIR_CHANGES" ]; then
echo "changes found"
echo $DOCS_DIR_CHANGES
echo "docs_dir_changed=true" >> $GITHUB_OUTPUT
exit 0
fi
echo "docs_dir_changed=false" >> $GITHUB_OUTPUT
echo "no changes found"
gen_docs:
name: generate docs
runs-on: ubuntu-latest
needs: [check_gen]
if: |
needs.check_gen.outputs.cli_dir_changed == 'true'
|| needs.check_gen.outputs.docs_dir_changed == 'true'
steps:
- name: "[DEBUG] print variables"
run: |
echo "env.DOCS_DIR: ${{ env.DOCS_DIR }}"
echo "github head_ref: ${{ github.head_ref }}"
- name: git checkout
uses: actions/checkout@v3
with:
ref: ${{ github.head_ref }}
- name: install python
uses: actions/setup-python@v4.3.0
with:
python-version: 3.8
- name: install dev requirements
run: |
python3 -m venv env
source env/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt -r dev-requirements.txt
- name: generate docs
run: |
source env/bin/activate
cd ${{ env.DOCS_DIR }}
echo "cleaning existing docs"
make clean
echo "creating docs"
make html
- name: debug
run: |
echo ">>>>> status"
git status
echo ">>>>> remotes"
git remote -v
echo ">>>>> branch"
git branch -v
echo ">>>>> log"
git log --pretty=oneline | head -5
- name: commit docs
run: |
git config user.name 'Github Build Bot'
git config user.email 'buildbot@fishtownanalytics.com'
git commit -am "Add generated CLI API docs"
git push -u origin ${{ github.head_ref }}


@@ -18,8 +18,8 @@ permissions:
issues: write
jobs:
call-creation-action:
uses: dbt-labs/actions/.github/workflows/jira-creation-actions.yml@main
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-creation.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}


@@ -19,7 +19,7 @@ permissions:
jobs:
call-label-action:
uses: dbt-labs/actions/.github/workflows/jira-label-actions.yml@main
uses: dbt-labs/jira-actions/.github/workflows/jira-label.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}


@@ -19,8 +19,8 @@ on:
permissions: read-all
jobs:
call-transition-action:
uses: dbt-labs/actions/.github/workflows/jira-transition-actions.yml@main
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-transition.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}


@@ -42,10 +42,10 @@ jobs:
steps:
- name: Check out the repository
uses: actions/checkout@v3
uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v4.3.0
with:
python-version: '3.8'
@@ -53,8 +53,12 @@ jobs:
run: |
python -m pip install --user --upgrade pip
python -m pip --version
make dev
python -m pip install pre-commit
pre-commit --version
python -m pip install mypy==0.942
mypy --version
python -m pip install -r requirements.txt
python -m pip install -r dev-requirements.txt
dbt --version
- name: Run pre-commit hooks
@@ -69,17 +73,18 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
env:
TOXENV: "unit"
PYTEST_ADDOPTS: "-v --color=yes --csv unit_results.csv"
steps:
- name: Check out the repository
uses: actions/checkout@v3
uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v4.3.0
with:
python-version: ${{ matrix.python-version }}
@@ -96,26 +101,24 @@ jobs:
- name: Get current date
if: always()
id: date
run: |
CURRENT_DATE=$(date +'%Y-%m-%dT%H_%M_%S') # no colons allowed for artifacts
echo "date=$CURRENT_DATE" >> $GITHUB_OUTPUT
run: echo "::set-output name=date::$(date +'%Y-%m-%dT%H_%M_%S')" #no colons allowed for artifacts
- name: Upload Unit Test Coverage to Codecov
if: ${{ matrix.python-version == '3.11' }}
uses: codecov/codecov-action@v3
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
- uses: actions/upload-artifact@v2
if: always()
with:
name: unit_results_${{ matrix.python-version }}-${{ steps.date.outputs.date }}.csv
path: unit_results.csv
integration:
name: integration test / python ${{ matrix.python-version }} / ${{ matrix.os }}
runs-on: ${{ matrix.os }}
timeout-minutes: 60
timeout-minutes: 45
strategy:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
os: [ubuntu-20.04]
include:
- python-version: 3.8
@@ -125,22 +128,18 @@ jobs:
env:
TOXENV: integration
PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
DBT_INVOCATION_ENV: github-actions
DBT_TEST_USER_1: dbt_test_user_1
DBT_TEST_USER_2: dbt_test_user_2
DBT_TEST_USER_3: dbt_test_user_3
DD_CIVISIBILITY_AGENTLESS_ENABLED: true
DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
DD_SITE: datadoghq.com
DD_ENV: ci
DD_SERVICE: ${{ github.event.repository.name }}
steps:
- name: Check out the repository
uses: actions/checkout@v3
uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v4.3.0
with:
python-version: ${{ matrix.python-version }}
@@ -164,26 +163,24 @@ jobs:
tox --version
- name: Run tests
run: tox -- --ddtrace
run: tox
- name: Get current date
if: always()
id: date
run: |
CURRENT_DATE=$(date +'%Y-%m-%dT%H_%M_%S') # no colons allowed for artifacts
echo "date=$CURRENT_DATE" >> $GITHUB_OUTPUT
run: echo "::set-output name=date::$(date +'%Y_%m_%dT%H_%M_%S')" #no colons allowed for artifacts
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v2
if: always()
with:
name: logs_${{ matrix.python-version }}_${{ matrix.os }}_${{ steps.date.outputs.date }}
path: ./logs
- name: Upload Integration Test Coverage to Codecov
if: ${{ matrix.python-version == '3.11' }}
uses: codecov/codecov-action@v3
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
- uses: actions/upload-artifact@v2
if: always()
with:
name: integration_results_${{ matrix.python-version }}_${{ matrix.os }}_${{ steps.date.outputs.date }}.csv
path: integration_results.csv
build:
name: build packages
@@ -192,10 +189,10 @@ jobs:
steps:
- name: Check out the repository
uses: actions/checkout@v3
uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v4.3.0
with:
python-version: '3.8'


@@ -1,265 +0,0 @@
# **what?**
# This workflow models the performance characteristics of a point in time in dbt.
# It runs specific dbt commands on committed projects multiple times to create and
# commit information about the distribution to the current branch. For more information
# see the readme in the performance module at /performance/README.md.
#
# **why?**
# When developing new features, we can take quick performance samples and compare
# them against the committed baseline measurements produced by this workflow to detect
# some performance regressions at development time before they reach users.
#
# **when?**
# This is only run once directly after each release (for non-prereleases). If for some
# reason the results of a run are not satisfactory, it can also be triggered manually.
name: Model Performance Characteristics
on:
# runs after non-prereleases are published.
release:
types: [released]
# run manually from the actions tab
workflow_dispatch:
inputs:
release_id:
description: 'dbt version to model (must be non-prerelease in Pypi)'
type: string
required: true
env:
RUNNER_CACHE_PATH: performance/runner/target/release/runner
# both jobs need to write
permissions:
contents: write
pull-requests: write
jobs:
set-variables:
name: Setting Variables
runs-on: ubuntu-latest
outputs:
cache_key: ${{ steps.variables.outputs.cache_key }}
release_id: ${{ steps.semver.outputs.base-version }}
release_branch: ${{ steps.variables.outputs.release_branch }}
steps:
# explicitly checkout the performance runner from main regardless of which
# version we are modeling.
- name: Checkout
uses: actions/checkout@v3
with:
ref: main
- name: Parse version into parts
id: semver
uses: dbt-labs/actions/parse-semver@v1
with:
version: ${{ github.event.inputs.release_id || github.event.release.tag_name }}
# collect all the variables that need to be used in subsequent jobs
- name: Set variables
id: variables
run: |
# create a cache key that will be used in the next job. without this the
# next job would have to checkout from main and hash the files itself.
echo "cache_key=${{ runner.os }}-${{ hashFiles('performance/runner/Cargo.toml')}}-${{ hashFiles('performance/runner/src/*') }}" >> $GITHUB_OUTPUT
branch_name="${{steps.semver.outputs.major}}.${{steps.semver.outputs.minor}}.latest"
echo "release_branch=$branch_name" >> $GITHUB_OUTPUT
echo "release branch is inferred to be ${branch_name}"
latest-runner:
name: Build or Fetch Runner
runs-on: ubuntu-latest
needs: [set-variables]
env:
RUSTFLAGS: "-D warnings"
steps:
- name: '[DEBUG] print variables'
run: |
echo "all variables defined in set-variables"
echo "cache_key: ${{ needs.set-variables.outputs.cache_key }}"
echo "release_id: ${{ needs.set-variables.outputs.release_id }}"
echo "release_branch: ${{ needs.set-variables.outputs.release_branch }}"
# explicitly checkout the performance runner from main regardless of which
# version we are modeling.
- name: Checkout
uses: actions/checkout@v3
with:
ref: main
# attempts to access a previously cached runner
- uses: actions/cache@v3
id: cache
with:
path: ${{ env.RUNNER_CACHE_PATH }}
key: ${{ needs.set-variables.outputs.cache_key }}
- name: Fetch Rust Toolchain
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- name: Add fmt
if: steps.cache.outputs.cache-hit != 'true'
run: rustup component add rustfmt
- name: Cargo fmt
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/cargo@v1
with:
command: fmt
args: --manifest-path performance/runner/Cargo.toml --all -- --check
- name: Test
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/cargo@v1
with:
command: test
args: --manifest-path performance/runner/Cargo.toml
- name: Build (optimized)
if: steps.cache.outputs.cache-hit != 'true'
uses: actions-rs/cargo@v1
with:
command: build
args: --release --manifest-path performance/runner/Cargo.toml
# the cache action automatically caches this binary at the end of the job
model:
# depends on `latest-runner` as a separate job so that failures in this job do not prevent
# a successfully tested and built binary from being cached.
needs: [set-variables, latest-runner]
name: Model a release
runs-on: ubuntu-latest
steps:
- name: '[DEBUG] print variables'
run: |
echo "all variables defined in set-variables"
echo "cache_key: ${{ needs.set-variables.outputs.cache_key }}"
echo "release_id: ${{ needs.set-variables.outputs.release_id }}"
echo "release_branch: ${{ needs.set-variables.outputs.release_branch }}"
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dbt
run: pip install dbt-postgres==${{ needs.set-variables.outputs.release_id }}
- name: Install Hyperfine
run: wget https://github.com/sharkdp/hyperfine/releases/download/v1.11.0/hyperfine_1.11.0_amd64.deb && sudo dpkg -i hyperfine_1.11.0_amd64.deb
# explicitly checkout main to get the latest project definitions
- name: Checkout
uses: actions/checkout@v3
with:
ref: main
# this was built in the previous job so it will be there.
- name: Fetch Runner
uses: actions/cache@v3
id: cache
with:
path: ${{ env.RUNNER_CACHE_PATH }}
key: ${{ needs.set-variables.outputs.cache_key }}
- name: Move Runner
run: mv performance/runner/target/release/runner performance/app
- name: Change Runner Permissions
run: chmod +x ./performance/app
- name: '[DEBUG] ls baseline directory before run'
run: ls -R performance/baselines/
# `${{ github.workspace }}` is used to pass the absolute path
- name: Create directories
run: |
mkdir ${{ github.workspace }}/performance/tmp/
mkdir -p performance/baselines/${{ needs.set-variables.outputs.release_id }}/
# Run modeling, taking 20 samples (-n 20)
- name: Run Measurement
run: |
performance/app model -v ${{ needs.set-variables.outputs.release_id }} -b ${{ github.workspace }}/performance/baselines/ -p ${{ github.workspace }}/performance/projects/ -t ${{ github.workspace }}/performance/tmp/ -n 20
- name: '[DEBUG] ls baseline directory after run'
run: ls -R performance/baselines/
- uses: actions/upload-artifact@v3
with:
name: baseline
path: performance/baselines/${{ needs.set-variables.outputs.release_id }}/
create-pr:
name: Open PR for ${{ matrix.base-branch }}
# depends on `model` as a separate job so that the baseline can be committed to more than one branch
# i.e. release branch and main
needs: [set-variables, latest-runner, model]
runs-on: ubuntu-latest
strategy:
matrix:
include:
- base-branch: refs/heads/main
target-branch: performance-bot/main_${{ needs.set-variables.outputs.release_id }}_${{GITHUB.RUN_ID}}
- base-branch: refs/heads/${{ needs.set-variables.outputs.release_branch }}
target-branch: performance-bot/release_${{ needs.set-variables.outputs.release_id }}_${{GITHUB.RUN_ID}}
steps:
- name: '[DEBUG] print variables'
run: |
echo "all variables defined in set-variables"
echo "cache_key: ${{ needs.set-variables.outputs.cache_key }}"
echo "release_id: ${{ needs.set-variables.outputs.release_id }}"
echo "release_branch: ${{ needs.set-variables.outputs.release_branch }}"
- name: Checkout
uses: actions/checkout@v3
with:
ref: ${{ matrix.base-branch }}
- name: Create PR branch
run: |
git checkout -b ${{ matrix.target-branch }}
git push origin ${{ matrix.target-branch }}
git branch --set-upstream-to=origin/${{ matrix.target-branch }} ${{ matrix.target-branch }}
- uses: actions/download-artifact@v3
with:
name: baseline
path: performance/baselines/${{ needs.set-variables.outputs.release_id }}
- name: '[DEBUG] ls baselines after artifact download'
run: ls -R performance/baselines/
- name: Commit baseline
uses: EndBug/add-and-commit@v9
with:
add: 'performance/baselines/*'
author_name: 'Github Build Bot'
author_email: 'buildbot@fishtownanalytics.com'
message: 'adding performance baseline for ${{ needs.set-variables.outputs.release_id }}'
push: 'origin origin/${{ matrix.target-branch }}'
- name: Create Pull Request
uses: peter-evans/create-pull-request@v5
with:
author: 'Github Build Bot <buildbot@fishtownanalytics.com>'
base: ${{ matrix.base-branch }}
branch: '${{ matrix.target-branch }}'
title: 'Adding performance modeling for ${{needs.set-variables.outputs.release_id}} to ${{ matrix.base-branch }}'
body: 'Committing perf results for tracking for release ${{needs.set-variables.outputs.release_id}}'
labels: |
Skip Changelog
Performance

View File

@@ -68,7 +68,7 @@ jobs:
- name: "Generate Nightly Release Version Number"
id: nightly-release-version
run: |
number="${{ steps.semver.outputs.version }}.dev${{ steps.current-date.outputs.date }}"
number="${{ steps.semver.outputs.version }}.dev${{ steps.current-date.outputs.date }}+nightly"
echo "number=$number" >> $GITHUB_OUTPUT
- name: "Audit Nightly Release Version And Parse Into Parts"
@@ -98,7 +98,7 @@ jobs:
uses: ./.github/workflows/release.yml
with:
sha: ${{ needs.aggregate-release-data.outputs.commit_sha }}
target_branch: ${{ needs.aggregate-release-data.outputs.release_branch }}
target_branch: ${{ needs.aggregate-release-data.outputs.release-branch }}
version_number: ${{ needs.aggregate-release-data.outputs.version_number }}
build_script_path: "scripts/build-dist.sh"
env_setup_script_path: "scripts/env-setup.sh"

View File

@@ -1,7 +1,11 @@
# **what?**
# The purpose of this workflow is to trigger CI to run for each
# release branch and main branch on a regular cadence. If the CI workflow
# fails for a branch, it will post to #dev-core-alerts to raise awareness.
# fails for a branch, it will post to dev-core-alerts to raise awareness.
# The 'aurelien-baudet/workflow-dispatch' Action triggers the existing
# CI workflow file on the given branch to run so that even if we change the
# CI workflow file in the future, the one that is tailored for the given
# release branch will be used.
# **why?**
# Ensures release branches and main are always shippable and not broken.
@@ -24,8 +28,35 @@ on:
permissions: read-all
jobs:
run_tests:
uses: dbt-labs/actions/.github/workflows/release-branch-tests.yml@main
with:
workflows_to_run: '["main.yml"]'
secrets: inherit
kick-off-ci:
name: Kick-off CI
runs-on: ubuntu-latest
strategy:
# must run CI one branch at a time because the workflow-dispatch Action polls the
# latest run for results, and it gets confused when we kick off multiple runs
# at once. There is a race condition, so we just run in sequential order.
max-parallel: 1
fail-fast: false
matrix:
branch: [1.0.latest, 1.1.latest, 1.2.latest, 1.3.latest, main]
steps:
- name: Call CI workflow for ${{ matrix.branch }} branch
id: trigger-step
uses: aurelien-baudet/workflow-dispatch@v2.1.1
with:
workflow: main.yml
ref: ${{ matrix.branch }}
token: ${{ secrets.FISHTOWN_BOT_PAT }}
- name: Post failure to Slack
uses: ravsamhq/notify-slack-action@v1
if: ${{ always() && !contains(steps.trigger-step.outputs.workflow-conclusion,'success') }}
with:
status: ${{ job.status }}
notification_title: 'dbt-core scheduled run of "${{ matrix.branch }}" branch not successful'
message_format: ':x: CI on branch "${{ matrix.branch }}" ${{ steps.trigger-step.outputs.workflow-conclusion }}'
footer: 'Linked failed CI run ${{ steps.trigger-step.outputs.workflow-url }}'
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_DEV_CORE_ALERTS }}

View File

@@ -36,14 +36,14 @@ jobs:
latest: ${{ steps.latest.outputs.latest }}
minor_latest: ${{ steps.latest.outputs.minor_latest }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v1
- name: Split version
id: version
run: |
IFS="." read -r MAJOR MINOR PATCH <<< ${{ github.event.inputs.version_number }}
echo "major=$MAJOR" >> $GITHUB_OUTPUT
echo "minor=$MINOR" >> $GITHUB_OUTPUT
echo "patch=$PATCH" >> $GITHUB_OUTPUT
echo "::set-output name=major::$MAJOR"
echo "::set-output name=minor::$MINOR"
echo "::set-output name=patch::$PATCH"
- name: Is pkg 'latest'
id: latest
@@ -60,7 +60,7 @@ jobs:
needs: [get_version_meta]
steps:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v1
build_and_push:
name: Build images and push to GHCR
@@ -70,20 +70,18 @@ jobs:
- name: Get docker build arg
id: build_arg
run: |
BUILD_ARG_NAME=$(echo ${{ github.event.inputs.package }} | sed 's/\-/_/g')
BUILD_ARG_VALUE=$(echo ${{ github.event.inputs.package }} | sed 's/postgres/core/g')
echo "build_arg_name=$BUILD_ARG_NAME" >> $GITHUB_OUTPUT
echo "build_arg_value=$BUILD_ARG_VALUE" >> $GITHUB_OUTPUT
echo "::set-output name=build_arg_name::"$(echo ${{ github.event.inputs.package }} | sed 's/\-/_/g')
echo "::set-output name=build_arg_value::"$(echo ${{ github.event.inputs.package }} | sed 's/postgres/core/g')
- name: Log in to the GHCR
uses: docker/login-action@v2
uses: docker/login-action@v1
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push MAJOR.MINOR.PATCH tag
uses: docker/build-push-action@v4
uses: docker/build-push-action@v2
with:
file: docker/Dockerfile
push: True
@@ -94,7 +92,7 @@ jobs:
ghcr.io/dbt-labs/${{ github.event.inputs.package }}:${{ github.event.inputs.version_number }}
- name: Build and push MINOR.latest tag
uses: docker/build-push-action@v4
uses: docker/build-push-action@v2
if: ${{ needs.get_version_meta.outputs.minor_latest == 'True' }}
with:
file: docker/Dockerfile
@@ -106,7 +104,7 @@ jobs:
ghcr.io/dbt-labs/${{ github.event.inputs.package }}:${{ needs.get_version_meta.outputs.major }}.${{ needs.get_version_meta.outputs.minor }}.latest
- name: Build and push latest tag
uses: docker/build-push-action@v4
uses: docker/build-push-action@v2
if: ${{ needs.get_version_meta.outputs.latest == 'True' }}
with:
file: docker/Dockerfile

View File

@@ -37,17 +37,17 @@ jobs:
steps:
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v2
with:
python-version: 3.8
- name: Checkout dbt repo
uses: actions/checkout@v3
uses: actions/checkout@v2.3.4
with:
path: ${{ env.DBT_REPO_DIRECTORY }}
- name: Checkout schemas.getdbt.com repo
uses: actions/checkout@v3
uses: actions/checkout@v2.3.4
with:
repository: dbt-labs/schemas.getdbt.com
ref: 'main'
@@ -83,7 +83,7 @@ jobs:
fi
- name: Upload schema diff
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v2.2.4
if: ${{ failure() }}
with:
name: 'schema_changes.txt'

View File

@@ -30,8 +30,6 @@ jobs:
LOG_DIR: "/home/runner/work/dbt-core/dbt-core/logs"
# tells integration tests to output into json format
DBT_LOG_FORMAT: "json"
# tell eventmgr to convert logging events into bytes
DBT_TEST_BINARY_SERIALIZATION: "true"
# Additional test users
DBT_TEST_USER_1: dbt_test_user_1
DBT_TEST_USER_2: dbt_test_user_2
@@ -39,12 +37,12 @@ jobs:
steps:
- name: checkout dev
uses: actions/checkout@v3
uses: actions/checkout@v2
with:
persist-credentials: false
- name: Setup Python
uses: actions/setup-python@v4
uses: actions/setup-python@v2.2.2
with:
python-version: "3.8"

View File

@@ -1,155 +0,0 @@
# **what?**
# This workflow runs the test(s) at the given input path a specified number of times to determine whether they are flaky. You can test with any supported OS/Python combination.
# Runs are split into 10 batches so more test iterations complete faster.
# **why?**
# Testing if a test is flaky and if a previously flaky test has been fixed. This allows easy testing on supported python versions and OS combinations.
# **when?**
# This is triggered manually from dbt-core.
name: Flaky Tester
on:
workflow_dispatch:
inputs:
branch:
description: 'Branch to check out'
type: string
required: true
default: 'main'
test_path:
description: 'Path to single test to run (ex: tests/functional/retry/test_retry.py::TestRetry::test_fail_fast)'
type: string
required: true
default: 'tests/functional/...'
python_version:
description: 'Version of Python to Test Against'
type: choice
options:
- '3.8'
- '3.9'
- '3.10'
- '3.11'
os:
description: 'OS to run test in'
type: choice
options:
- 'ubuntu-latest'
- 'macos-latest'
- 'windows-latest'
num_runs_per_batch:
description: 'Max number of times to run the test per batch. We always run 10 batches.'
type: number
required: true
default: '50'
permissions: read-all
defaults:
run:
shell: bash
jobs:
debug:
runs-on: ubuntu-latest
steps:
- name: "[DEBUG] Output Inputs"
run: |
echo "Branch: ${{ inputs.branch }}"
echo "test_path: ${{ inputs.test_path }}"
echo "python_version: ${{ inputs.python_version }}"
echo "os: ${{ inputs.os }}"
echo "num_runs_per_batch: ${{ inputs.num_runs_per_batch }}"
pytest:
runs-on: ${{ inputs.os }}
strategy:
# run all batches, even if one fails. This informs how flaky the test may be.
fail-fast: false
# using a matrix to speed up the jobs since the matrix will run in parallel when runners are available
matrix:
batch: ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
env:
PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
DBT_TEST_USER_1: dbt_test_user_1
DBT_TEST_USER_2: dbt_test_user_2
DBT_TEST_USER_3: dbt_test_user_3
DD_CIVISIBILITY_AGENTLESS_ENABLED: true
DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
DD_SITE: datadoghq.com
DD_ENV: ci
DD_SERVICE: ${{ github.event.repository.name }}
steps:
- name: "Checkout code"
uses: actions/checkout@v3
with:
ref: ${{ inputs.branch }}
- name: "Setup Python"
uses: actions/setup-python@v4
with:
python-version: "${{ inputs.python_version }}"
- name: "Setup Dev Environment"
run: make dev
- name: "Set up postgres (linux)"
if: inputs.os == 'ubuntu-latest'
run: make setup-db
# mac and windows don't use make due to limitations with Docker on those GitHub runners
- name: "Set up postgres (macos)"
if: inputs.os == 'macos-latest'
uses: ./.github/actions/setup-postgres-macos
- name: "Set up postgres (windows)"
if: inputs.os == 'windows-latest'
uses: ./.github/actions/setup-postgres-windows
- name: "Test Command"
id: command
run: |
test_command="python -m pytest ${{ inputs.test_path }}"
echo "test_command=$test_command" >> $GITHUB_OUTPUT
- name: "Run test ${{ inputs.num_runs_per_batch }} times"
id: pytest
run: |
set +e
success=0
failure=0
for ((i=1; i<=${{ inputs.num_runs_per_batch }}; i++))
do
echo "Running pytest iteration $i..."
python -m pytest --ddtrace ${{ inputs.test_path }}
exit_code=$?
if [[ $exit_code -eq 0 ]]; then
success=$((success + 1))
echo "Iteration $i: Success"
else
failure=$((failure + 1))
echo "Iteration $i: Failure"
fi
echo
echo "==========================="
echo "Successful runs: $success"
echo "Failed runs: $failure"
echo "==========================="
echo
done
echo "failure=$failure" >> $GITHUB_OUTPUT
- name: "Success and Failure Summary: ${{ inputs.os }}/Python ${{ inputs.python_version }}"
run: |
echo "Batch: ${{ matrix.batch }}"
echo "Successful runs: ${{ steps.pytest.outputs.success }}"
echo "Failed runs: ${{ steps.pytest.outputs.failure }}"
- name: "Error for Failures"
if: ${{ steps.pytest.outputs.failure }}
run: |
echo "Batch ${{ matrix.batch }} failed ${{ steps.pytest.outputs.failure }} of ${{ inputs.num_runs_per_batch }} tests"
exit 1

View File

@@ -24,8 +24,10 @@ permissions:
jobs:
triage_label:
if: contains(github.event.issue.labels.*.name, 'awaiting_response')
uses: dbt-labs/actions/.github/workflows/swap-labels.yml@main
with:
add_label: "triage"
remove_label: "awaiting_response"
secrets: inherit
runs-on: ubuntu-latest
steps:
- name: initial labeling
uses: andymckay/labeler@master
with:
add-labels: "triage"
remove-labels: "awaiting_response"

View File

@@ -20,9 +20,106 @@ on:
description: 'The version number to bump to (ex. 1.2.0, 1.3.0b1)'
required: true
permissions:
contents: write
pull-requests: write
jobs:
version_bump_and_changie:
uses: dbt-labs/actions/.github/workflows/version-bump.yml@main
with:
version_number: ${{ inputs.version_number }}
secrets: inherit # ok since what we are calling is internally maintained
bump:
runs-on: ubuntu-latest
steps:
- name: "[DEBUG] Print Variables"
run: |
echo "all variables defined as inputs"
echo The version_number: ${{ github.event.inputs.version_number }}
- name: Check out the repository
uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
python-version: "3.8"
- name: Install python dependencies
run: |
python3 -m venv env
source env/bin/activate
pip install --upgrade pip
- name: Add Homebrew to PATH
run: |
echo "/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin" >> $GITHUB_PATH
- name: Install Homebrew packages
run: |
brew install pre-commit
brew tap miniscruff/changie https://github.com/miniscruff/changie
brew install changie
- name: Audit Version and Parse Into Parts
id: semver
uses: dbt-labs/actions/parse-semver@v1
with:
version: ${{ github.event.inputs.version_number }}
- name: Set branch value
id: variables
run: |
echo "::set-output name=BRANCH_NAME::prep-release/${{ github.event.inputs.version_number }}_$GITHUB_RUN_ID"
- name: Create PR branch
run: |
git checkout -b ${{ steps.variables.outputs.BRANCH_NAME }}
git push origin ${{ steps.variables.outputs.BRANCH_NAME }}
git branch --set-upstream-to=origin/${{ steps.variables.outputs.BRANCH_NAME }} ${{ steps.variables.outputs.BRANCH_NAME }}
- name: Bump version
run: |
source env/bin/activate
pip install -r dev-requirements.txt
env/bin/bumpversion --allow-dirty --new-version ${{ github.event.inputs.version_number }} major
git status
- name: Run changie
run: |
if [[ ${{ steps.semver.outputs.is-pre-release }} -eq 1 ]]
then
changie batch ${{ steps.semver.outputs.base-version }} --move-dir '${{ steps.semver.outputs.base-version }}' --prerelease '${{ steps.semver.outputs.pre-release }}'
else
changie batch ${{ steps.semver.outputs.base-version }} --include '${{ steps.semver.outputs.base-version }}' --remove-prereleases
fi
changie merge
git status
# this step will fail on whitespace errors but also correct them
- name: Remove trailing whitespace
continue-on-error: true
run: |
pre-commit run trailing-whitespace --files .bumpversion.cfg CHANGELOG.md .changes/*
git status
# this step will fail on newline errors but also correct them
- name: Removing extra newlines
continue-on-error: true
run: |
pre-commit run end-of-file-fixer --files .bumpversion.cfg CHANGELOG.md .changes/*
git status
- name: Commit version bump to branch
uses: EndBug/add-and-commit@v7
with:
author_name: 'Github Build Bot'
author_email: 'buildbot@fishtownanalytics.com'
message: 'Bumping version to ${{ github.event.inputs.version_number }} and generate CHANGELOG'
branch: '${{ steps.variables.outputs.BRANCH_NAME }}'
push: 'origin origin/${{ steps.variables.outputs.BRANCH_NAME }}'
- name: Create Pull Request
uses: peter-evans/create-pull-request@v3
with:
author: 'Github Build Bot <buildbot@fishtownanalytics.com>'
base: ${{github.ref}}
title: 'Bumping version to ${{ github.event.inputs.version_number }} and generate changelog'
branch: '${{ steps.variables.outputs.BRANCH_NAME }}'
labels: |
Skip Changelog

.gitignore (vendored): 4 changes
View File

@@ -11,7 +11,6 @@ __pycache__/
env*/
dbt_env/
build/
!tests/functional/build
!core/dbt/docs/build
develop-eggs/
dist/
@@ -29,8 +28,6 @@ var/
.mypy_cache/
.dmypy.json
logs/
.user.yml
profiles.yml
# PyInstaller
# Usually these files are written by a python script from a template
@@ -54,7 +51,6 @@ coverage.xml
*,cover
.hypothesis/
test.env
makefile.test.env
*.pytest_cache/

View File

@@ -1,7 +1,8 @@
# Configuration for pre-commit hooks (see https://pre-commit.com/).
# Eventually the hooks described here will be run as tests before merging each PR.
exclude: ^(core/dbt/docs/build/|core/dbt/events/types_pb2.py)
# TODO: remove global exclusion of tests when testing overhaul is complete
exclude: ^(test/|core/dbt/docs/build/)
# Force all unspecified python hooks to run python 3.8
default_language_version:
@@ -37,7 +38,7 @@ repos:
alias: flake8-check
stages: [manual]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.3.0
rev: v0.942
hooks:
- id: mypy
# N.B.: Mypy is... a bit fragile.

View File

@@ -5,14 +5,204 @@
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry)
## dbt-core 1.4.5 - March 10, 2023
### Fixes
- Fix compilation logic for ephemeral nodes ([#6885](https://github.com/dbt-labs/dbt-core/issues/6885))
- allow adapters to change model name resolution in py models ([#7114](https://github.com/dbt-labs/dbt-core/issues/7114))
### Docs
- Fix JSON path to package overview docs ([dbt-docs/#390](https://github.com/dbt-labs/dbt-docs/issues/390))
### Under the Hood
- Moving simple_seed to adapter zone to help adapter test conversions ([#CT-1959](https://github.com/dbt-labs/dbt-core/issues/CT-1959))
### Contributors
- [@dbeatty10](https://github.com/dbeatty10) ([#390](https://github.com/dbt-labs/dbt-core/issues/390))
- [@nssalian](https://github.com/nssalian) ([#CT-1959](https://github.com/dbt-labs/dbt-core/issues/CT-1959))
- [@rlh1994](https://github.com/rlh1994) ([#390](https://github.com/dbt-labs/dbt-core/issues/390))
## dbt-core 1.4.4 - February 28, 2023
### Fixes
- add pytz dependency ([#7077](https://github.com/dbt-labs/dbt-core/issues/7077))
### Contributors
- [@sdebruyn](https://github.com/sdebruyn) ([#7077](https://github.com/dbt-labs/dbt-core/issues/7077))
## dbt-core 1.4.3 - February 24, 2023
### Fixes
- Fix semver comparison logic by ensuring numeric values ([#7039](https://github.com/dbt-labs/dbt-core/issues/7039))
## dbt-core 1.4.2 - February 23, 2023
### Fixes
- Sort cli vars before hashing for partial parsing ([#6710](https://github.com/dbt-labs/dbt-core/issues/6710))
- Remove pin on packaging and stop using it for prerelease comparisons ([#6834](https://github.com/dbt-labs/dbt-core/issues/6834))
- Readd depends_on.macros to SeedNode, to support seeds with hooks calling macros ([#6806](https://github.com/dbt-labs/dbt-core/issues/6806))
- Fix regression of --quiet cli parameter behavior ([#6749](https://github.com/dbt-labs/dbt-core/issues/6749))
- Ensure results from hooks contain nodes when processing them ([#6796](https://github.com/dbt-labs/dbt-core/issues/6796))
- Always flush stdout after logging ([#6901](https://github.com/dbt-labs/dbt-core/issues/6901))
- Set relation_name in test nodes at compile time ([#6930](https://github.com/dbt-labs/dbt-core/issues/6930))
- Fix disabled definition in WritableManifest ([#6752](https://github.com/dbt-labs/dbt-core/issues/6752))
- Fix regression in logbook log output ([#7028](https://github.com/dbt-labs/dbt-core/issues/7028))
### Docs
- Fix JSON path to overview docs ([dbt-docs/#366](https://github.com/dbt-labs/dbt-docs/issues/366))
### Contributors
- [@halvorlu](https://github.com/halvorlu) ([#366](https://github.com/dbt-labs/dbt-core/issues/366))
## dbt-core 1.4.1 - January 26, 2023
### Fixes
- [Regression] exposure_content referenced incorrectly ([#6738](https://github.com/dbt-labs/dbt-core/issues/6738))
### Contributors
- [@Mathyoub](https://github.com/Mathyoub) ([#6738](https://github.com/dbt-labs/dbt-core/issues/6738))
## dbt-core 1.4.0 - January 25, 2023
### Breaking Changes
- Cleaned up exceptions to directly raise in code. Also updated the existing exceptions to meet PEP guidelines. Removed use of all exception functions in the code base and marked them all as deprecated, to be removed next minor release. ([#6339](https://github.com/dbt-labs/dbt-core/issues/6339), [#6393](https://github.com/dbt-labs/dbt-core/issues/6393), [#6460](https://github.com/dbt-labs/dbt-core/issues/6460))
### Features
- Added favor-state flag to optionally favor state nodes even if unselected node exists ([#5016](https://github.com/dbt-labs/dbt-core/issues/5016))
- Update structured logging. Convert to using protobuf messages. Ensure events are enriched with node_info. ([#5610](https://github.com/dbt-labs/dbt-core/issues/5610))
- incremental predicates ([#5680](https://github.com/dbt-labs/dbt-core/issues/5680))
- Friendlier error messages when packages.yml is malformed ([#5486](https://github.com/dbt-labs/dbt-core/issues/5486))
- Allow partitions in external tables to be supplied as a list ([#5929](https://github.com/dbt-labs/dbt-core/issues/5929))
- extend -f flag shorthand for seed command ([#5990](https://github.com/dbt-labs/dbt-core/issues/5990))
- This pulls the profile name from args when constructing a RuntimeConfig in lib.py, enabling the dbt-server to override the value that's in the dbt_project.yml ([#6201](https://github.com/dbt-labs/dbt-core/issues/6201))
- Adding tarball install method for packages. Allowing package tarball to be specified via url in the packages.yaml. ([#4205](https://github.com/dbt-labs/dbt-core/issues/4205))
- Added an md5 function to the base context ([#6246](https://github.com/dbt-labs/dbt-core/issues/6246))
- Exposures support metrics in lineage ([#6057](https://github.com/dbt-labs/dbt-core/issues/6057))
- Add support for Python 3.11 ([#6147](https://github.com/dbt-labs/dbt-core/issues/6147))
- Making timestamp optional for metrics ([#6398](https://github.com/dbt-labs/dbt-core/issues/6398))
- The meta configuration field is now included in the node_info property of structured logs. ([#6216](https://github.com/dbt-labs/dbt-core/issues/6216))
- Adds buildable selection mode ([#6365](https://github.com/dbt-labs/dbt-core/issues/6365))
- --warn-error-options: Treat warnings as errors for specific events, based on user configuration ([#6165](https://github.com/dbt-labs/dbt-core/issues/6165))
### Fixes
- Account for disabled flags on models in schema files more completely ([#3992](https://github.com/dbt-labs/dbt-core/issues/3992))
- Add validation of enabled config for metrics, exposures and sources ([#6030](https://github.com/dbt-labs/dbt-core/issues/6030))
- check length of args of python model function before accessing it ([#6041](https://github.com/dbt-labs/dbt-core/issues/6041))
- Add functors to ensure event types with str-type attributes are initialized to spec, even when provided non-str type params. ([#5436](https://github.com/dbt-labs/dbt-core/issues/5436))
- Allow hooks to fail without halting execution flow ([#5625](https://github.com/dbt-labs/dbt-core/issues/5625))
- fix missing f-strings, convert old .format() messages to f-strings for consistency ([#6241](https://github.com/dbt-labs/dbt-core/issues/6241))
- Clarify Error Message for how many models are allowed in a Python file ([#6245](https://github.com/dbt-labs/dbt-core/issues/6245))
- Fix typo in util.py ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904))
- After this, it will be possible to use default values for dbt.config.get ([#6309](https://github.com/dbt-labs/dbt-core/issues/6309))
- Use full path for writing manifest ([#6055](https://github.com/dbt-labs/dbt-core/issues/6055))
- add pre-commit install to make dev script in Makefile ([#6269](https://github.com/dbt-labs/dbt-core/issues/6269))
- Late-rendering for `pre_` and `post_hook`s in `dbt_project.yml` ([#6411](https://github.com/dbt-labs/dbt-core/issues/6411))
- [CT-1284] Change Python model default materialization to table ([#5989](https://github.com/dbt-labs/dbt-core/issues/5989))
- [CT-1591] Don't parse empty Python files ([#6345](https://github.com/dbt-labs/dbt-core/issues/6345))
- Repair a regression which prevented basic logging before the logging subsystem is completely configured. ([#6434](https://github.com/dbt-labs/dbt-core/issues/6434))
- fix docs generate --defer by adding defer_to_manifest to before_run ([#6488](https://github.com/dbt-labs/dbt-core/issues/6488))
- Bug when partial parsing with an empty schema file ([#4850](https://github.com/dbt-labs/dbt-core/issues/4850))
- Fix DBT_FAVOR_STATE env var ([#5859](https://github.com/dbt-labs/dbt-core/issues/5859))
- Restore historical behavior of certain disabled test messages, so that they are at the less obtrusive debug level, rather than the warning level. ([#6501](https://github.com/dbt-labs/dbt-core/issues/6501))
- Bump mashumaro version to get regression fix and add unit test to verify that fix. ([#6428](https://github.com/dbt-labs/dbt-core/issues/6428))
- Call update_event_status earlier for node results. Rename event 'HookFinished' -> FinishedRunningStats ([#6571](https://github.com/dbt-labs/dbt-core/issues/6571))
- Provide backward compatibility for `get_merge_sql` arguments ([#6625](https://github.com/dbt-labs/dbt-core/issues/6625))
- Fix behavior of --favor-state with --defer ([#6617](https://github.com/dbt-labs/dbt-core/issues/6617))
- Include adapter_response in NodeFinished run_result log event ([#6703](https://github.com/dbt-labs/dbt-core/issues/6703))
### Docs
- minor doc correction ([dbt-docs/#5791](https://github.com/dbt-labs/dbt-docs/issues/5791))
- Generate API docs for new CLI interface ([dbt-docs/#5528](https://github.com/dbt-labs/dbt-docs/issues/5528))
- ([dbt-docs/#5880](https://github.com/dbt-labs/dbt-docs/issues/5880))
- Fix rendering of sample code for metrics ([dbt-docs/#323](https://github.com/dbt-labs/dbt-docs/issues/323))
- Alphabetize `core/dbt/README.md` ([dbt-docs/#6368](https://github.com/dbt-labs/dbt-docs/issues/6368))
- Updated minor typos encountered when skipping profile setup ([dbt-docs/#6529](https://github.com/dbt-labs/dbt-docs/issues/6529))
### Under the Hood
- Put black config in explicit config ([#5946](https://github.com/dbt-labs/dbt-core/issues/5946))
- Added flat_graph attribute to the Manifest class's deepcopy() coverage ([#5809](https://github.com/dbt-labs/dbt-core/issues/5809))
- Add mypy configs so `mypy` passes from CLI ([#5983](https://github.com/dbt-labs/dbt-core/issues/5983))
- Exception message cleanup. ([#6023](https://github.com/dbt-labs/dbt-core/issues/6023))
- Add dmypy cache to gitignore ([#6028](https://github.com/dbt-labs/dbt-core/issues/6028))
- Provide useful errors when the value of 'materialized' is invalid ([#5229](https://github.com/dbt-labs/dbt-core/issues/5229))
- Clean up string formatting ([#6068](https://github.com/dbt-labs/dbt-core/issues/6068))
- Fixed extra whitespace in strings introduced by black. ([#1350](https://github.com/dbt-labs/dbt-core/issues/1350))
- Remove the 'root_path' field from most nodes ([#6171](https://github.com/dbt-labs/dbt-core/issues/6171))
- Combine certain logging events with different levels ([#6173](https://github.com/dbt-labs/dbt-core/issues/6173))
- Convert threading tests to pytest ([#5942](https://github.com/dbt-labs/dbt-core/issues/5942))
- Convert postgres index tests to pytest ([#5770](https://github.com/dbt-labs/dbt-core/issues/5770))
- Convert use color tests to pytest ([#5771](https://github.com/dbt-labs/dbt-core/issues/5771))
- Add github actions workflow to generate high level CLI API docs ([#5942](https://github.com/dbt-labs/dbt-core/issues/5942))
- Functionality-neutral refactor of event logging system to improve encapsulation and modularity. ([#6139](https://github.com/dbt-labs/dbt-core/issues/6139))
- Consolidate ParsedNode and CompiledNode classes ([#6383](https://github.com/dbt-labs/dbt-core/issues/6383))
- Prevent doc gen workflow from running on forks ([#6386](https://github.com/dbt-labs/dbt-core/issues/6386))
- Fix intermittent database connection failure in Windows CI test ([#6394](https://github.com/dbt-labs/dbt-core/issues/6394))
- Refactor and clean up manifest nodes ([#6426](https://github.com/dbt-labs/dbt-core/issues/6426))
- Restore important legacy logging behaviors, following refactor which removed them ([#6437](https://github.com/dbt-labs/dbt-core/issues/6437))
- Treat dense text blobs as binary for `git grep` ([#6294](https://github.com/dbt-labs/dbt-core/issues/6294))
- Prune partial parsing logging events ([#6313](https://github.com/dbt-labs/dbt-core/issues/6313))
- Updating the deprecation warning in the metric attributes renamed event ([#6507](https://github.com/dbt-labs/dbt-core/issues/6507))
- [CT-1693] Port severity test to Pytest ([#6466](https://github.com/dbt-labs/dbt-core/issues/6466))
- [CT-1694] Deprecate event tracking tests ([#6467](https://github.com/dbt-labs/dbt-core/issues/6467))
- Reorganize structured logging events to have two top keys ([#6311](https://github.com/dbt-labs/dbt-core/issues/6311))
- Combine some logging events ([#1716](https://github.com/dbt-labs/dbt-core/issues/1716), [#1717](https://github.com/dbt-labs/dbt-core/issues/1717), [#1719](https://github.com/dbt-labs/dbt-core/issues/1719))
- Check length of escaped strings in the adapter test ([#6566](https://github.com/dbt-labs/dbt-core/issues/6566))
### Dependencies
- Update pathspec requirement from ~=0.9.0 to >=0.9,<0.11 in /core ([#5917](https://github.com/dbt-labs/dbt-core/pull/5917))
- Bump black from 22.8.0 to 22.10.0 ([#6019](https://github.com/dbt-labs/dbt-core/pull/6019))
- Bump mashumaro[msgpack] from 3.0.4 to 3.1.1 in /core ([#6108](https://github.com/dbt-labs/dbt-core/pull/6108))
- Update colorama requirement from <0.4.6,>=0.3.9 to >=0.3.9,<0.4.7 in /core ([#6144](https://github.com/dbt-labs/dbt-core/pull/6144))
- Bump mashumaro[msgpack] from 3.1.1 to 3.2 in /core ([#6375](https://github.com/dbt-labs/dbt-core/pull/6375))
- Update agate requirement from <1.6.4,>=1.6 to >=1.6,<1.7.1 in /core ([#6506](https://github.com/dbt-labs/dbt-core/pull/6506))
### Contributors
- [@NiallRees](https://github.com/NiallRees) ([#5859](https://github.com/dbt-labs/dbt-core/issues/5859))
- [@agpapa](https://github.com/agpapa) ([#6365](https://github.com/dbt-labs/dbt-core/issues/6365))
- [@andy-clapson](https://github.com/andy-clapson) ([dbt-docs/#5791](https://github.com/dbt-labs/dbt-docs/issues/5791))
- [@callum-mcdata](https://github.com/callum-mcdata) ([#6398](https://github.com/dbt-labs/dbt-core/issues/6398), [#6507](https://github.com/dbt-labs/dbt-core/issues/6507))
- [@chamini2](https://github.com/chamini2) ([#6041](https://github.com/dbt-labs/dbt-core/issues/6041))
- [@daniel-murray](https://github.com/daniel-murray) ([#5016](https://github.com/dbt-labs/dbt-core/issues/5016))
- [@dave-connors-3](https://github.com/dave-connors-3) ([#5680](https://github.com/dbt-labs/dbt-core/issues/5680), [#5990](https://github.com/dbt-labs/dbt-core/issues/5990), [#6625](https://github.com/dbt-labs/dbt-core/issues/6625))
- [@dbeatty10](https://github.com/dbeatty10) ([#6411](https://github.com/dbt-labs/dbt-core/issues/6411), [dbt-docs/#6368](https://github.com/dbt-labs/dbt-docs/issues/6368), [#6394](https://github.com/dbt-labs/dbt-core/issues/6394), [#6294](https://github.com/dbt-labs/dbt-core/issues/6294), [#6566](https://github.com/dbt-labs/dbt-core/issues/6566))
- [@devmessias](https://github.com/devmessias) ([#6309](https://github.com/dbt-labs/dbt-core/issues/6309))
- [@eltociear](https://github.com/eltociear) ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904))
- [@eve-johns](https://github.com/eve-johns) ([#6068](https://github.com/dbt-labs/dbt-core/issues/6068))
- [@haritamar](https://github.com/haritamar) ([#6246](https://github.com/dbt-labs/dbt-core/issues/6246))
- [@jared-rimmer](https://github.com/jared-rimmer) ([#5486](https://github.com/dbt-labs/dbt-core/issues/5486))
- [@josephberni](https://github.com/josephberni) ([#5016](https://github.com/dbt-labs/dbt-core/issues/5016))
- [@joshuataylor](https://github.com/joshuataylor) ([#6147](https://github.com/dbt-labs/dbt-core/issues/6147))
- [@justbldwn](https://github.com/justbldwn) ([#6241](https://github.com/dbt-labs/dbt-core/issues/6241), [#6245](https://github.com/dbt-labs/dbt-core/issues/6245), [#6269](https://github.com/dbt-labs/dbt-core/issues/6269))
- [@luke-bassett](https://github.com/luke-bassett) ([#1350](https://github.com/dbt-labs/dbt-core/issues/1350))
- [@max-sixty](https://github.com/max-sixty) ([#5946](https://github.com/dbt-labs/dbt-core/issues/5946), [#5983](https://github.com/dbt-labs/dbt-core/issues/5983), [#6028](https://github.com/dbt-labs/dbt-core/issues/6028))
- [@mivanicova](https://github.com/mivanicova) ([#6488](https://github.com/dbt-labs/dbt-core/issues/6488))
- [@nshuman1](https://github.com/nshuman1) ([dbt-docs/#6529](https://github.com/dbt-labs/dbt-docs/issues/6529))
- [@paulbenschmidt](https://github.com/paulbenschmidt) ([dbt-docs/#5880](https://github.com/dbt-labs/dbt-docs/issues/5880))
- [@pgoslatara](https://github.com/pgoslatara) ([#5929](https://github.com/dbt-labs/dbt-core/issues/5929))
- [@racheldaniel](https://github.com/racheldaniel) ([#6201](https://github.com/dbt-labs/dbt-core/issues/6201))
- [@timle2](https://github.com/timle2) ([#4205](https://github.com/dbt-labs/dbt-core/issues/4205))
- [@tmastny](https://github.com/tmastny) ([#6216](https://github.com/dbt-labs/dbt-core/issues/6216))
## Previous Releases
For information on prior major and minor releases, see their changelogs:
* [1.6](https://github.com/dbt-labs/dbt-core/blob/1.6.latest/CHANGELOG.md)
* [1.5](https://github.com/dbt-labs/dbt-core/blob/1.5.latest/CHANGELOG.md)
* [1.4](https://github.com/dbt-labs/dbt-core/blob/1.4.latest/CHANGELOG.md)
* [1.3](https://github.com/dbt-labs/dbt-core/blob/1.3.latest/CHANGELOG.md)
* [1.2](https://github.com/dbt-labs/dbt-core/blob/1.2.latest/CHANGELOG.md)
* [1.1](https://github.com/dbt-labs/dbt-core/blob/1.1.latest/CHANGELOG.md)

View File

@@ -5,10 +5,10 @@
1. [About this document](#about-this-document)
2. [Getting the code](#getting-the-code)
3. [Setting up an environment](#setting-up-an-environment)
4. [Running dbt-core in development](#running-dbt-core-in-development)
4. [Running `dbt` in development](#running-dbt-core-in-development)
5. [Testing dbt-core](#testing)
6. [Debugging](#debugging)
7. [Adding or modifying a changelog entry](#adding-or-modifying-a-changelog-entry)
7. [Adding a changelog entry](#adding-a-changelog-entry)
8. [Submitting a Pull Request](#submitting-a-pull-request)
## About this document
@@ -56,7 +56,7 @@ There are some tools that will be helpful to you in developing locally. While th
These are the tools used in `dbt-core` development and testing:
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.8, 3.9, 3.10 and 3.11
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.7, 3.8, 3.9, 3.10 and 3.11
- [`pytest`](https://docs.pytest.org/en/latest/) to define, discover, and run tests
- [`flake8`](https://flake8.pycqa.org/en/latest/) for code linting
- [`black`](https://github.com/psf/black) for code formatting
@@ -113,7 +113,7 @@ When installed in this way, any changes you make to your local copy of the sourc
With your virtualenv activated, the `dbt` script should point back to the source code you've cloned on your machine. You can verify this by running `which dbt`. This command should show you a path to an executable in your virtualenv.
Configure your [profile](https://docs.getdbt.com/docs/configure-your-profile) as necessary to connect to your target databases. It may be a good idea to add a new profile pointing to a local Postgres instance, or a specific test sandbox within your data warehouse if appropriate. Make sure to create a profile before running integration tests.
Configure your [profile](https://docs.getdbt.com/docs/configure-your-profile) as necessary to connect to your target databases. It may be a good idea to add a new profile pointing to a local Postgres instance, or a specific test sandbox within your data warehouse if appropriate.
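For illustration, a minimal `profiles.yml` sketch for a local Postgres instance might look like the following; the profile name, credentials, and schema here are placeholders, not values required by dbt-core:

```yaml
# ~/.dbt/profiles.yml (hypothetical local development profile)
dbt_dev:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_test_user_1   # placeholder user
      password: password      # placeholder password
      dbname: dbt             # placeholder database
      schema: dev_schema      # placeholder schema
      threads: 4
```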
## Testing
@@ -163,7 +163,7 @@ suites.
#### `tox`
[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel, for example, you can run unit tests for Python 3.8, Python 3.9, Python 3.10 and Python 3.11 in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py38`. The configuration for these tests is located in `tox.ini`.
[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel, for example, you can run unit tests for Python 3.7, Python 3.8, Python 3.9, Python 3.10 and Python 3.11 in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py37`. The configuration for these tests is located in `tox.ini`.
#### `pytest`
@@ -171,10 +171,12 @@ Finally, you can also run a specific test or group of tests using [`pytest`](htt
```sh
# run all unit tests in a file
python3 -m pytest tests/unit/test_graph.py
python3 -m pytest test/unit/test_graph.py
# run a specific unit test
python3 -m pytest tests/unit/test_graph.py::GraphTest::test__dependency_list
# run specific Postgres functional tests
python3 -m pytest test/unit/test_graph.py::GraphTest::test__dependency_list
# run specific Postgres integration tests (old way)
python3 -m pytest -m profile_postgres test/integration/074_postgres_unlogged_table_tests
# run specific Postgres integration tests (new way)
python3 -m pytest tests/functional/sources
```
@@ -183,8 +185,9 @@ python3 -m pytest tests/functional/sources
### Unit, Integration, Functional?
Here are some general rules for adding tests:
* unit tests (`tests/unit`) don't need to access a database; "pure Python" tests should be written as unit tests
* functional tests (`tests/functional`) cover anything that interacts with a database, namely adapter
* unit tests (`test/unit` & `tests/unit`) don't need to access a database; "pure Python" tests should be written as unit tests
* functional tests (`test/integration` & `tests/functional`) cover anything that interacts with a database, namely adapter
* *everything in* `test/*` *is being steadily migrated to* `tests/*`
## Debugging

View File

@@ -9,7 +9,7 @@ ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
software-properties-common gpg-agent \
software-properties-common \
&& add-apt-repository ppa:git-core/ppa -y \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
@@ -30,9 +30,16 @@ RUN apt-get update \
unixodbc-dev \
&& add-apt-repository ppa:deadsnakes/ppa \
&& apt-get install -y \
python-is-python3 \
python-dev-is-python3 \
python \
python-dev \
python3-pip \
python3.6 \
python3.6-dev \
python3-pip \
python3.6-venv \
python3.7 \
python3.7-dev \
python3.7-venv \
python3.8 \
python3.8-dev \
python3.8-venv \

View File

@@ -6,42 +6,29 @@ ifeq ($(USE_DOCKER),true)
DOCKER_CMD := docker-compose run --rm test
endif
#
# To override CI_FLAGS, create a file at this repo's root dir named `makefile.test.env`. Fill it
# with any ENV_VAR overrides required by your test environment, e.g.
# DBT_TEST_USER_1=user
# LOG_DIR="dir with a space in it"
#
# Warning: restrict each line to one variable only.
#
ifeq (./makefile.test.env,$(wildcard ./makefile.test.env))
include ./makefile.test.env
endif
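# For illustration only, a hypothetical makefile.test.env could contain:
#   DBT_TEST_USER_1=alice
#   DBT_TEST_USER_2=bob
#   LOG_DIR="dir with a space in it"
# (placeholder values; one variable per line, as noted above)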
LOGS_DIR := ./logs
# Optional flag to invoke tests using our CI env.
# But we always want these active for structured
# log testing.
CI_FLAGS =\
DBT_TEST_USER_1=$(if $(DBT_TEST_USER_1),$(DBT_TEST_USER_1),dbt_test_user_1)\
DBT_TEST_USER_2=$(if $(DBT_TEST_USER_2),$(DBT_TEST_USER_2),dbt_test_user_2)\
DBT_TEST_USER_3=$(if $(DBT_TEST_USER_3),$(DBT_TEST_USER_3),dbt_test_user_3)\
RUSTFLAGS=$(if $(RUSTFLAGS),$(RUSTFLAGS),"-D warnings")\
LOG_DIR=$(if $(LOG_DIR),$(LOG_DIR),./logs)\
DBT_LOG_FORMAT=$(if $(DBT_LOG_FORMAT),$(DBT_LOG_FORMAT),json)
DBT_TEST_USER_1=dbt_test_user_1\
DBT_TEST_USER_2=dbt_test_user_2\
DBT_TEST_USER_3=dbt_test_user_3\
RUSTFLAGS="-D warnings"\
LOG_DIR=./logs\
DBT_LOG_FORMAT=json
.PHONY: dev_req
dev_req: ## Installs dbt-* packages in develop mode along with only development dependencies.
@\
pip install -r dev-requirements.txt
pip install -r editable-requirements.txt
pip install -r dev-requirements.txt -r editable-requirements.txt
.PHONY: dev
dev: dev_req ## Installs dbt-* packages in develop mode along with development dependencies and pre-commit.
@\
pre-commit install
.PHONY: proto_types
proto_types: ## generates google protobuf python file from types.proto
protoc -I=./core/dbt/events --python_out=./core/dbt/events ./core/dbt/events/types.proto
.PHONY: mypy
mypy: .env ## Runs mypy against staged changes for static type checking.
@\
@@ -79,7 +66,7 @@ test: .env ## Runs unit tests with py and code checks against staged changes.
.PHONY: integration
integration: .env ## Runs postgres integration tests with py-integration
@\
$(CI_FLAGS) $(DOCKER_CMD) tox -e py-integration -- -nauto
$(if $(USE_CI_FLAGS), $(CI_FLAGS)) $(DOCKER_CMD) tox -e py-integration -- -nauto
.PHONY: integration-fail-fast
integration-fail-fast: .env ## Runs postgres integration tests with py-integration in "fail fast" mode.
@@ -89,9 +76,9 @@ integration-fail-fast: .env ## Runs postgres integration tests with py-integrati
.PHONY: interop
interop: clean
@\
mkdir $(LOG_DIR) && \
mkdir $(LOGS_DIR) && \
$(CI_FLAGS) $(DOCKER_CMD) tox -e py-integration -- -nauto && \
LOG_DIR=$(LOG_DIR) cargo run --manifest-path test/interop/log_parsing/Cargo.toml
LOG_DIR=$(LOGS_DIR) cargo run --manifest-path test/interop/log_parsing/Cargo.toml
.PHONY: setup-db
setup-db: ## Setup Postgres database with docker-compose for system testing.

View File

@@ -21,7 +21,7 @@ These select statements, or "models", form a dbt project. Models frequently buil
## Getting started
- [Install dbt](https://docs.getdbt.com/docs/get-started/installation)
- [Install dbt](https://docs.getdbt.com/docs/installation)
- Read the [introduction](https://docs.getdbt.com/docs/introduction/) and [viewpoint](https://docs.getdbt.com/docs/about/viewpoint/)
## Join the dbt Community

View File

@@ -1,19 +1,14 @@
# these are all just exports, #noqa them so flake8 will be happy
# TODO: Should we still include this in the `adapters` namespace?
from dbt.contracts.connection import Credentials # noqa: F401
from dbt.adapters.base.meta import available # noqa: F401
from dbt.adapters.base.connections import BaseConnectionManager # noqa: F401
from dbt.adapters.base.relation import ( # noqa: F401
from dbt.contracts.connection import Credentials # noqa
from dbt.adapters.base.meta import available # noqa
from dbt.adapters.base.connections import BaseConnectionManager # noqa
from dbt.adapters.base.relation import ( # noqa
BaseRelation,
RelationType,
SchemaSearchMap,
)
from dbt.adapters.base.column import Column # noqa: F401
from dbt.adapters.base.impl import ( # noqa: F401
AdapterConfig,
BaseAdapter,
PythonJobHelper,
ConstraintSupport,
)
from dbt.adapters.base.plugin import AdapterPlugin # noqa: F401
from dbt.adapters.base.column import Column # noqa
from dbt.adapters.base.impl import AdapterConfig, BaseAdapter, PythonJobHelper # noqa
from dbt.adapters.base.plugin import AdapterPlugin # noqa

View File

@@ -60,7 +60,6 @@ class Column:
"float",
"double precision",
"float8",
"double",
]
def is_integer(self) -> bool:

View File

@@ -142,44 +142,44 @@ class BaseConnectionManager(metaclass=abc.ABCMeta):
)
def set_connection_name(self, name: Optional[str] = None) -> Connection:
"""Called by 'acquire_connection' in BaseAdapter, which is called by
'connection_named', called by 'connection_for(node)'.
Creates a connection for this thread if one doesn't already
exist, and will rename an existing connection."""
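# In short: with no name, the shared per-thread handle named "master" is
# reused; with a name, the thread's existing connection is renamed, or a new
# connection is created whose LazyHandle defers opening until first use.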
conn_name: str
if name is None:
# if a name isn't specified, we'll re-use a single handle
# named 'master'
conn_name = "master"
else:
if not isinstance(name, str):
raise dbt.exceptions.CompilerException(
f"For connection name, got {name} - not a string!"
)
assert isinstance(name, str)
conn_name = name
conn_name: str = "master" if name is None else name
# Get a connection for this thread
conn = self.get_if_exists()
if conn and conn.name == conn_name and conn.state == "open":
# Found a connection and nothing to do, so just return it
return conn
if conn is None:
# Create a new connection
conn = Connection(
type=Identifier(self.TYPE),
name=conn_name,
name=None,
state=ConnectionState.INIT,
transaction_open=False,
handle=None,
credentials=self.profile.credentials,
)
conn.handle = LazyHandle(self.open)
# Add the connection to thread_connections for this thread
self.set_thread_connection(conn)
fire_event(
NewConnection(conn_name=conn_name, conn_type=self.TYPE, node_info=get_node_info())
)
else: # existing connection either wasn't open or didn't have the right name
if conn.state != "open":
conn.handle = LazyHandle(self.open)
if conn.name != conn_name:
orig_conn_name: str = conn.name or ""
conn.name = conn_name
fire_event(ConnectionReused(orig_conn_name=orig_conn_name, conn_name=conn_name))
if conn.name == conn_name and conn.state == "open":
return conn
fire_event(
NewConnection(conn_name=conn_name, conn_type=self.TYPE, node_info=get_node_info())
)
if conn.state == "open":
fire_event(ConnectionReused(conn_name=conn_name))
else:
conn.handle = LazyHandle(self.open)
conn.name = conn_name
return conn
@classmethod

View File

@@ -2,48 +2,46 @@ import abc
from concurrent.futures import as_completed, Future
from contextlib import contextmanager
from datetime import datetime
from enum import Enum
import time
from itertools import chain
from typing import (
Any,
Optional,
Tuple,
Callable,
Dict,
Iterable,
Iterator,
Type,
Dict,
Any,
List,
Mapping,
Optional,
Iterator,
Set,
Tuple,
Type,
Union,
)
from dbt.contracts.graph.nodes import ColumnLevelConstraint, ConstraintType, ModelLevelConstraint
import agate
import pytz
from dbt.exceptions import (
DbtInternalError,
DbtRuntimeError,
DbtValidationError,
MacroArgTypeError,
MacroResultError,
QuoteConfigTypeError,
NotImplementedError,
NullRelationCacheAttemptedError,
NullRelationDropAttemptedError,
QuoteConfigTypeError,
RelationReturnedMultipleResultsError,
RenameToNoneAttemptedError,
DbtRuntimeError,
SnapshotTargetIncompleteError,
SnapshotTargetNotSnapshotTableError,
UnexpectedNonTimestampError,
UnexpectedNullError,
UnexpectedNonTimestampError,
)
from dbt.adapters.protocol import AdapterConfig, ConnectionManagerProtocol
from dbt.adapters.protocol import (
AdapterConfig,
ConnectionManagerProtocol,
)
from dbt.clients.agate_helper import empty_table, merge_tables, table_from_rows
from dbt.clients.jinja import MacroGenerator
from dbt.contracts.graph.manifest import Manifest, MacroManifest
@@ -55,10 +53,8 @@ from dbt.events.types import (
CodeExecution,
CodeExecutionStatus,
CatalogGenerationError,
ConstraintNotSupported,
ConstraintNotEnforced,
)
from dbt.utils import filter_null_values, executor, cast_to_str, AttrDict
from dbt.utils import filter_null_values, executor, cast_to_str
from dbt.adapters.base.connections import Connection, AdapterResponse
from dbt.adapters.base.meta import AdapterMeta, available
@@ -70,19 +66,13 @@ from dbt.adapters.base.relation import (
)
from dbt.adapters.base import Column as BaseColumn
from dbt.adapters.base import Credentials
from dbt.adapters.cache import RelationsCache, _make_ref_key_dict
from dbt import deprecations
from dbt.adapters.cache import RelationsCache, _make_ref_key_msg
GET_CATALOG_MACRO_NAME = "get_catalog"
FRESHNESS_MACRO_NAME = "collect_freshness"
class ConstraintSupport(str, Enum):
ENFORCED = "enforced"
NOT_ENFORCED = "not_enforced"
NOT_SUPPORTED = "not_supported"
def _expect_row_value(key: str, row: agate.Row):
if key not in row.keys():
raise DbtInternalError(
@@ -187,7 +177,6 @@ class BaseAdapter(metaclass=AdapterMeta):
- truncate_relation
- rename_relation
- get_columns_in_relation
- get_column_schema_from_query
- expand_column_types
- list_relations_without_caching
- is_cancelable
@@ -214,14 +203,6 @@ class BaseAdapter(metaclass=AdapterMeta):
# for use in materializations
AdapterSpecificConfigs: Type[AdapterConfig] = AdapterConfig
CONSTRAINT_SUPPORT = {
ConstraintType.check: ConstraintSupport.NOT_SUPPORTED,
ConstraintType.not_null: ConstraintSupport.ENFORCED,
ConstraintType.unique: ConstraintSupport.NOT_ENFORCED,
ConstraintType.primary_key: ConstraintSupport.NOT_ENFORCED,
ConstraintType.foreign_key: ConstraintSupport.ENFORCED,
}
def __init__(self, config):
self.config = config
self.cache = RelationsCache()
@@ -274,7 +255,7 @@ class BaseAdapter(metaclass=AdapterMeta):
@available.parse(lambda *a, **k: ("", empty_table()))
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False, limit: Optional[int] = None
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[AdapterResponse, agate.Table]:
"""Execute the given SQL. This is a thin wrapper around
ConnectionManager.execute.
@@ -283,35 +264,10 @@ class BaseAdapter(metaclass=AdapterMeta):
:param bool auto_begin: If set, and dbt is not currently inside a
transaction, automatically begin one.
:param bool fetch: If set, fetch results.
:param Optional[int] limit: If set, only fetch n number of rows
:return: A tuple of the query status and results (empty if fetch=False).
:rtype: Tuple[AdapterResponse, agate.Table]
"""
return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch, limit=limit)
def validate_sql(self, sql: str) -> AdapterResponse:
"""Submit the given SQL to the engine for validation, but not execution.
This should throw an appropriate exception if the input SQL is invalid, although
in practice that will generally be handled by delegating to an existing method
for execution and allowing the error handler to take care of the rest.
:param str sql: The sql to validate
"""
raise NotImplementedError("`validate_sql` is not implemented for this adapter!")
@available.parse(lambda *a, **k: [])
def get_column_schema_from_query(self, sql: str) -> List[BaseColumn]:
"""Get a list of the Columns with names and data types from the given sql."""
_, cursor = self.connections.add_select_query(sql)
columns = [
self.Column.create(
column_name, self.connections.data_type_code_to_name(column_type_code)
)
# https://peps.python.org/pep-0249/#description
for column_name, column_type_code, *_ in cursor.description
]
return columns
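# Usage sketch (hypothetical call): introspect the column names and
# adapter-reported types of an arbitrary query, e.g.
#   columns = adapter.get_column_schema_from_query("select 1 as id")
# which yields one Column object per entry in cursor.description.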
return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch)
@available.parse(lambda *a, **k: ("", empty_table()))
def get_partitions_metadata(self, table: str) -> Tuple[agate.Table]:
@@ -395,7 +351,7 @@ class BaseAdapter(metaclass=AdapterMeta):
return {
self.Relation.create_from(self.config, node).without_identifier()
for node in manifest.nodes.values()
if (node.is_relational and not node.is_ephemeral_model and not node.is_external_node)
if (node.is_relational and not node.is_ephemeral_model)
}
def _get_catalog_schemas(self, manifest: Manifest) -> SchemaSearchMap:
@@ -426,7 +382,7 @@ class BaseAdapter(metaclass=AdapterMeta):
return info_schema_name_map
def _relations_cache_for_schemas(
self, manifest: Manifest, cache_schemas: Optional[Set[BaseRelation]] = None
self, manifest: Manifest, cache_schemas: Set[BaseRelation] = None
) -> None:
"""Populate the relations cache for the given schemas. Returns an
iterable of the schemas populated, as strings.
@@ -462,7 +418,7 @@ class BaseAdapter(metaclass=AdapterMeta):
self,
manifest: Manifest,
clear: bool = False,
required_schemas: Optional[Set[BaseRelation]] = None,
required_schemas: Set[BaseRelation] = None,
) -> None:
"""Run a query that gets a populated cache of the relations in the
database and set the cache on this adapter.
@@ -748,23 +704,11 @@ class BaseAdapter(metaclass=AdapterMeta):
# we can't build the relations cache because we don't have a
# manifest so we can't run any operations.
relations = self.list_relations_without_caching(schema_relation)
# if the cache is already populated, add this schema in
# otherwise, skip updating the cache and just ignore
if self.cache:
for relation in relations:
self.cache.add(relation)
if not relations:
# it's possible that there were no relations in some schemas. We want
# to insert the schemas we query into the cache's `.schemas` attribute
# so we can check it later
self.cache.update_schemas([(database, schema)])
fire_event(
ListRelations(
database=cast_to_str(database),
schema=schema,
relations=[_make_ref_key_dict(x) for x in relations],
relations=[_make_ref_key_msg(x) for x in relations],
)
)
@@ -796,6 +740,7 @@ class BaseAdapter(metaclass=AdapterMeta):
schema: str,
identifier: str,
) -> List[BaseRelation]:
matches = []
search = self._make_match_kwargs(database, schema, identifier)
@@ -996,9 +941,9 @@ class BaseAdapter(metaclass=AdapterMeta):
manifest: Optional[Manifest] = None,
project: Optional[str] = None,
context_override: Optional[Dict[str, Any]] = None,
kwargs: Optional[Dict[str, Any]] = None,
kwargs: Dict[str, Any] = None,
text_only_columns: Optional[Iterable[str]] = None,
) -> AttrDict:
) -> agate.Table:
"""Look macro_name up in the manifest and execute its results.
:param macro_name: The name of the macro to execute.
@@ -1073,6 +1018,7 @@ class BaseAdapter(metaclass=AdapterMeta):
schemas: Set[str],
manifest: Manifest,
) -> agate.Table:
kwargs = {"information_schema": information_schema, "schemas": schemas}
table = self.execute_macro(
GET_CATALOG_MACRO_NAME,
@@ -1082,7 +1028,7 @@ class BaseAdapter(metaclass=AdapterMeta):
manifest=manifest,
)
results = self._catalog_filter_table(table, manifest) # type: ignore[arg-type]
results = self._catalog_filter_table(table, manifest)
return results
def get_catalog(self, manifest: Manifest) -> Tuple[agate.Table, List[Exception]]:
@@ -1114,7 +1060,7 @@ class BaseAdapter(metaclass=AdapterMeta):
loaded_at_field: str,
filter: Optional[str],
manifest: Optional[Manifest] = None,
) -> Tuple[Optional[AdapterResponse], Dict[str, Any]]:
) -> Dict[str, Any]:
"""Calculate the freshness of sources in dbt, and return it"""
kwargs: Dict[str, Any] = {
"source": source,
@@ -1123,19 +1069,7 @@ class BaseAdapter(metaclass=AdapterMeta):
}
# run the macro
# in older versions of dbt-core, the 'collect_freshness' macro returned the table of results directly
# starting in v1.5, by default, we return both the table and the adapter response (metadata about the query)
result: Union[
AttrDict, # current: contains AdapterResponse + agate.Table
agate.Table, # previous: just table
]
result = self.execute_macro(FRESHNESS_MACRO_NAME, kwargs=kwargs, manifest=manifest)
if isinstance(result, agate.Table):
deprecations.warn("collect-freshness-return-signature")
adapter_response = None
table = result
else:
adapter_response, table = result.response, result.table # type: ignore[attr-defined]
table = self.execute_macro(FRESHNESS_MACRO_NAME, kwargs=kwargs, manifest=manifest)
# now we have a 1-row table of the maximum `loaded_at_field` value and
# the current time according to the db.
if len(table) != 1 or len(table[0]) != 2:
@@ -1149,12 +1083,11 @@ class BaseAdapter(metaclass=AdapterMeta):
snapshotted_at = _utc(table[0][1], source, loaded_at_field)
age = (snapshotted_at - max_loaded_at).total_seconds()
freshness = {
return {
"max_loaded_at": max_loaded_at,
"snapshotted_at": snapshotted_at,
"age": age,
}
return adapter_response, freshness
def pre_model_hook(self, config: Mapping[str, Any]) -> Any:
"""A hook for running some operation before the model materialization
@@ -1316,119 +1249,6 @@ class BaseAdapter(metaclass=AdapterMeta):
# This returns a callable macro
return model_context[macro_name]
@classmethod
def _parse_column_constraint(cls, raw_constraint: Dict[str, Any]) -> ColumnLevelConstraint:
try:
ColumnLevelConstraint.validate(raw_constraint)
return ColumnLevelConstraint.from_dict(raw_constraint)
except Exception:
raise DbtValidationError(f"Could not parse constraint: {raw_constraint}")
@classmethod
def render_column_constraint(cls, constraint: ColumnLevelConstraint) -> Optional[str]:
"""Render the given constraint as DDL text. Should be overriden by adapters which need custom constraint
rendering."""
constraint_expression = constraint.expression or ""
rendered_column_constraint = None
if constraint.type == ConstraintType.check and constraint_expression:
rendered_column_constraint = f"check ({constraint_expression})"
elif constraint.type == ConstraintType.not_null:
rendered_column_constraint = f"not null {constraint_expression}"
elif constraint.type == ConstraintType.unique:
rendered_column_constraint = f"unique {constraint_expression}"
elif constraint.type == ConstraintType.primary_key:
rendered_column_constraint = f"primary key {constraint_expression}"
elif constraint.type == ConstraintType.foreign_key and constraint_expression:
rendered_column_constraint = f"references {constraint_expression}"
elif constraint.type == ConstraintType.custom and constraint_expression:
rendered_column_constraint = constraint_expression
if rendered_column_constraint:
rendered_column_constraint = rendered_column_constraint.strip()
return rendered_column_constraint
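As an illustration of the branches above (a hedged sketch; `MyAdapter` stands in for any concrete subclass, and the expected strings are read directly off the code):

```python
# Hypothetical concrete adapter; outputs follow from the branches above.
raw = {"type": "check", "expression": "id > 0"}
constraint = MyAdapter._parse_column_constraint(raw)  # validates, then builds
assert MyAdapter.render_column_constraint(constraint) == "check (id > 0)"

raw = {"type": "not_null"}  # empty expression; trailing space is stripped
constraint = MyAdapter._parse_column_constraint(raw)
assert MyAdapter.render_column_constraint(constraint) == "not null"
```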
@available
@classmethod
def render_raw_columns_constraints(cls, raw_columns: Dict[str, Dict[str, Any]]) -> List:
rendered_column_constraints = []
for v in raw_columns.values():
col_name = cls.quote(v["name"]) if v.get("quote") else v["name"]
rendered_column_constraint = [f"{col_name} {v['data_type']}"]
            for con in v.get("constraints", []):
constraint = cls._parse_column_constraint(con)
c = cls.process_parsed_constraint(constraint, cls.render_column_constraint)
if c is not None:
rendered_column_constraint.append(c)
rendered_column_constraints.append(" ".join(rendered_column_constraint))
return rendered_column_constraints
@classmethod
def process_parsed_constraint(
cls, parsed_constraint: Union[ColumnLevelConstraint, ModelLevelConstraint], render_func
) -> Optional[str]:
if (
parsed_constraint.warn_unsupported
and cls.CONSTRAINT_SUPPORT[parsed_constraint.type] == ConstraintSupport.NOT_SUPPORTED
):
warn_or_error(
ConstraintNotSupported(constraint=parsed_constraint.type.value, adapter=cls.type())
)
if (
parsed_constraint.warn_unenforced
and cls.CONSTRAINT_SUPPORT[parsed_constraint.type] == ConstraintSupport.NOT_ENFORCED
):
warn_or_error(
ConstraintNotEnforced(constraint=parsed_constraint.type.value, adapter=cls.type())
)
if cls.CONSTRAINT_SUPPORT[parsed_constraint.type] != ConstraintSupport.NOT_SUPPORTED:
return render_func(parsed_constraint)
return None
@classmethod
def _parse_model_constraint(cls, raw_constraint: Dict[str, Any]) -> ModelLevelConstraint:
try:
ModelLevelConstraint.validate(raw_constraint)
c = ModelLevelConstraint.from_dict(raw_constraint)
return c
except Exception:
raise DbtValidationError(f"Could not parse constraint: {raw_constraint}")
@available
@classmethod
def render_raw_model_constraints(cls, raw_constraints: List[Dict[str, Any]]) -> List[str]:
return [c for c in map(cls.render_raw_model_constraint, raw_constraints) if c is not None]
@classmethod
def render_raw_model_constraint(cls, raw_constraint: Dict[str, Any]) -> Optional[str]:
constraint = cls._parse_model_constraint(raw_constraint)
return cls.process_parsed_constraint(constraint, cls.render_model_constraint)
@classmethod
def render_model_constraint(cls, constraint: ModelLevelConstraint) -> Optional[str]:
"""Render the given constraint as DDL text. Should be overriden by adapters which need custom constraint
rendering."""
constraint_prefix = f"constraint {constraint.name} " if constraint.name else ""
column_list = ", ".join(constraint.columns)
if constraint.type == ConstraintType.check and constraint.expression:
return f"{constraint_prefix}check ({constraint.expression})"
elif constraint.type == ConstraintType.unique:
constraint_expression = f" {constraint.expression}" if constraint.expression else ""
return f"{constraint_prefix}unique{constraint_expression} ({column_list})"
elif constraint.type == ConstraintType.primary_key:
constraint_expression = f" {constraint.expression}" if constraint.expression else ""
return f"{constraint_prefix}primary key{constraint_expression} ({column_list})"
elif constraint.type == ConstraintType.foreign_key and constraint.expression:
return f"{constraint_prefix}foreign key ({column_list}) references {constraint.expression}"
elif constraint.type == ConstraintType.custom and constraint.expression:
return f"{constraint_prefix}{constraint.expression}"
else:
return None
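A matching sketch for the model-level branches (again with a hypothetical `MyAdapter` subclass):

```python
# Hypothetical concrete adapter; the expected string follows from the
# primary_key branch above (no expression, so nothing is interpolated).
raw = {"type": "primary_key", "columns": ["id", "org_id"], "name": "pk_orders"}
constraint = MyAdapter._parse_model_constraint(raw)
rendered = MyAdapter.render_model_constraint(constraint)
assert rendered == "constraint pk_orders primary key (id, org_id)"
```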
COLUMNS_EQUAL_SQL = """
with diff_count as (
@@ -1462,6 +1282,7 @@ join diff_count using (id)
def catch_as_completed(
futures, # typing: List[Future[agate.Table]]
) -> Tuple[agate.Table, List[Exception]]:
# catalogs: agate.Table = agate.Table(rows=[])
tables: List[agate.Table] = []
exceptions: List[Exception] = []

View File

@@ -7,9 +7,9 @@ from dbt.adapters.protocol import AdapterProtocol
def project_name_from_path(include_path: str) -> str:
# avoid an import cycle
from dbt.config.project import PartialProject
from dbt.config.project import Project
partial = PartialProject.from_project_root(include_path)
partial = Project.partial_load(include_path)
if partial.project_name is None:
raise CompilationError(f"Invalid project at {include_path}: name not set!")
return partial.project_name

View File

@@ -227,7 +227,7 @@ class BaseRelation(FakeAPIObject, Hashable):
def create_from_node(
cls: Type[Self],
config: HasQuoting,
node,
node: ManifestNode,
quote_policy: Optional[Dict[str, bool]] = None,
**kwargs: Any,
) -> Self:
@@ -328,10 +328,6 @@ class BaseRelation(FakeAPIObject, Hashable):
def is_view(self) -> bool:
return self.type == RelationType.View
@property
def is_materialized_view(self) -> bool:
return self.type == RelationType.MaterializedView
@classproperty
def Table(cls) -> str:
return str(RelationType.Table)
@@ -348,10 +344,6 @@ class BaseRelation(FakeAPIObject, Hashable):
def External(cls) -> str:
return str(RelationType.External)
@classproperty
def MaterializedView(cls) -> str:
return str(RelationType.MaterializedView)
@classproperty
def get_relation_type(cls) -> Type[RelationType]:
return RelationType

View File

@@ -4,7 +4,8 @@ from typing import Any, Dict, Iterable, List, Optional, Set, Tuple
from dbt.adapters.reference_keys import (
_make_ref_key,
_make_ref_key_dict,
_make_ref_key_msg,
_make_msg_from_ref_key,
_ReferenceKey,
)
from dbt.exceptions import (
@@ -16,7 +17,7 @@ from dbt.exceptions import (
)
from dbt.events.functions import fire_event, fire_event_if
from dbt.events.types import CacheAction, CacheDumpGraph
from dbt.flags import get_flags
import dbt.flags as flags
from dbt.utils import lowercase
@@ -229,7 +230,7 @@ class RelationsCache:
# self.relations or any cache entry's referenced_by during iteration
# it's a runtime error!
with self.lock:
return {dot_separated(k): str(v.dump_graph_entry()) for k, v in self.relations.items()}
return {dot_separated(k): v.dump_graph_entry() for k, v in self.relations.items()}
def _setdefault(self, relation: _CachedRelation):
"""Add a relation to the cache, or return it if it already exists.
@@ -289,8 +290,8 @@ class RelationsCache:
# a link - we will never drop the referenced relation during a run.
fire_event(
CacheAction(
ref_key=ref_key._asdict(),
ref_key_2=dep_key._asdict(),
ref_key=_make_msg_from_ref_key(ref_key),
ref_key_2=_make_msg_from_ref_key(dep_key),
)
)
return
@@ -305,8 +306,8 @@ class RelationsCache:
fire_event(
CacheAction(
action="add_link",
ref_key=dep_key._asdict(),
ref_key_2=ref_key._asdict(),
ref_key=_make_msg_from_ref_key(dep_key),
ref_key_2=_make_msg_from_ref_key(ref_key),
)
)
with self.lock:
@@ -318,13 +319,12 @@ class RelationsCache:
:param BaseRelation relation: The underlying relation.
"""
flags = get_flags()
cached = _CachedRelation(relation)
fire_event_if(
flags.LOG_CACHE_EVENTS,
lambda: CacheDumpGraph(before_after="before", action="adding", dump=self.dump_graph()),
)
fire_event(CacheAction(action="add_relation", ref_key=_make_ref_key_dict(cached)))
fire_event(CacheAction(action="add_relation", ref_key=_make_ref_key_msg(cached)))
with self.lock:
self._setdefault(cached)
@@ -358,7 +358,7 @@ class RelationsCache:
:param str identifier: The identifier of the relation to drop.
"""
dropped_key = _make_ref_key(relation)
dropped_key_msg = _make_ref_key_dict(relation)
dropped_key_msg = _make_ref_key_msg(relation)
fire_event(CacheAction(action="drop_relation", ref_key=dropped_key_msg))
with self.lock:
if dropped_key not in self.relations:
@@ -366,7 +366,7 @@ class RelationsCache:
return
consequences = self.relations[dropped_key].collect_consequences()
# convert from a list of _ReferenceKeys to a list of ReferenceKeyMsgs
consequence_msgs = [key._asdict() for key in consequences]
consequence_msgs = [_make_msg_from_ref_key(key) for key in consequences]
fire_event(
CacheAction(
action="drop_cascade", ref_key=dropped_key_msg, ref_list=consequence_msgs
@@ -396,9 +396,9 @@ class RelationsCache:
fire_event(
CacheAction(
action="update_reference",
ref_key=_make_ref_key_dict(old_key),
ref_key_2=_make_ref_key_dict(new_key),
ref_key_3=_make_ref_key_dict(cached.key()),
ref_key=_make_ref_key_msg(old_key),
ref_key_2=_make_ref_key_msg(new_key),
ref_key_3=_make_ref_key_msg(cached.key()),
)
)
@@ -429,7 +429,9 @@ class RelationsCache:
raise TruncatedModelNameCausedCollisionError(new_key, self.relations)
if old_key not in self.relations:
fire_event(CacheAction(action="temporary_relation", ref_key=old_key._asdict()))
fire_event(
CacheAction(action="temporary_relation", ref_key=_make_msg_from_ref_key(old_key))
)
return False
return True
@@ -450,11 +452,11 @@ class RelationsCache:
fire_event(
CacheAction(
action="rename_relation",
ref_key=old_key._asdict(),
ref_key_2=new_key._asdict(),
ref_key=_make_msg_from_ref_key(old_key),
ref_key_2=_make_msg_from_ref_key(new),
)
)
flags = get_flags()
fire_event_if(
flags.LOG_CACHE_EVENTS,
lambda: CacheDumpGraph(before_after="before", action="rename", dump=self.dump_graph()),

View File

@@ -9,11 +9,10 @@ from dbt.adapters.base.plugin import AdapterPlugin
from dbt.adapters.protocol import AdapterConfig, AdapterProtocol, RelationProtocol
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
from dbt.events.functions import fire_event
from dbt.events.types import AdapterImportError, PluginLoadError, AdapterRegistered
from dbt.events.types import AdapterImportError, PluginLoadError
from dbt.exceptions import DbtInternalError, DbtRuntimeError
from dbt.include.global_project import PACKAGE_PATH as GLOBAL_PROJECT_PATH
from dbt.include.global_project import PROJECT_NAME as GLOBAL_PROJECT_NAME
from dbt.semver import VersionSpecifier
Adapter = AdapterProtocol
@@ -90,13 +89,7 @@ class AdapterContainer:
def register_adapter(self, config: AdapterRequiredConfig) -> None:
adapter_name = config.credentials.type
adapter_type = self.get_adapter_class_by_name(adapter_name)
adapter_version = import_module(f".{adapter_name}.__version__", "dbt.adapters").version
adapter_version_specifier = VersionSpecifier.from_version_string(
adapter_version
).to_version_string()
fire_event(
AdapterRegistered(adapter_name=adapter_name, adapter_version=adapter_version_specifier)
)
with self.lock:
if adapter_name in self.adapters:
# this shouldn't really happen...
@@ -165,9 +158,6 @@ class AdapterContainer:
def get_adapter_type_names(self, name: Optional[str]) -> List[str]:
return [p.adapter.type() for p in self.get_adapter_plugins(name)]
def get_adapter_constraint_support(self, name: Optional[str]) -> List[str]:
return self.lookup_adapter(name).CONSTRAINT_SUPPORT # type: ignore
FACTORY: AdapterContainer = AdapterContainer()
@@ -224,10 +214,6 @@ def get_adapter_type_names(name: Optional[str]) -> List[str]:
return FACTORY.get_adapter_type_names(name)
def get_adapter_constraint_support(name: Optional[str]) -> List[str]:
return FACTORY.get_adapter_constraint_support(name)
@contextmanager
def adapter_management():
reset_adapters()

View File

@@ -2,6 +2,7 @@
from collections import namedtuple
from typing import Any, Optional
from dbt.events.proto_types import ReferenceKeyMsg
_ReferenceKey = namedtuple("_ReferenceKey", "database schema identifier")
@@ -29,9 +30,11 @@ def _make_ref_key(relation: Any) -> _ReferenceKey:
)
def _make_ref_key_dict(relation: Any):
return {
"database": relation.database,
"schema": relation.schema,
"identifier": relation.identifier,
}
def _make_ref_key_msg(relation: Any):
return _make_msg_from_ref_key(_make_ref_key(relation))
def _make_msg_from_ref_key(ref_key: _ReferenceKey) -> ReferenceKeyMsg:
return ReferenceKeyMsg(
database=ref_key.database, schema=ref_key.schema, identifier=ref_key.identifier
)

View File

@@ -1,25 +0,0 @@
# RelationConfig
This package serves as an initial abstraction for managing the inspection of existing relations and determining
changes on those relations. It arose from the materialized view work and currently supports only materialized
views for Postgres and Redshift, as well as dynamic tables for Snowflake. There are three main
classes in this package.
## RelationConfigBase
This is a very small class that only has a `from_dict()` method and a default `NotImplementedError()`. At some
point this could be replaced by a more robust framework, like `mashumaro` or `pydantic`.
## RelationConfigChange
This class inherits from `RelationConfigBase`; however, it can be thought of as a separate class. The subclassing
merely points to the idea that both classes would likely inherit from the same class in a `mashumaro` or
`pydantic` implementation. This class is much more restricted in its attributes: it should really only
ever need an `action` and a `context`. This can be thought of as being analogous to a web request. You need to
know what you're doing (`action`: 'create' = POST, 'drop' = DELETE, etc.) and the information (`context`) needed
to make the change. In our scenarios, the context tends to be an instance of `RelationConfigBase` corresponding
to the new state.
## RelationConfigValidationMixin
This mixin provides optional validation mechanics that can be applied to either `RelationConfigBase` or
`RelationConfigChange` subclasses. A validation rule is a combination of a `validation_check`, something
that should evaluate to `True`, and an optional `validation_error`, an instance of `DbtRuntimeError`
that should be raised in the event the `validation_check` fails. While optional, it's recommended that
the `validation_error` be provided for clearer feedback to the end user.
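
A minimal sketch of how these classes compose, using an invented index config (only the imports come from this package; the `IndexConfig` names are hypothetical):

```python
from dataclasses import dataclass
from typing import Tuple

from dbt.adapters.relation_configs import (
    RelationConfigBase,
    RelationConfigChange,
    RelationConfigChangeAction,
)


@dataclass(frozen=True)
class IndexConfig(RelationConfigBase):
    """Hypothetical config describing the new state of an index."""

    name: str
    column_names: Tuple[str, ...]


@dataclass(frozen=True, eq=True, unsafe_hash=True)
class IndexConfigChange(RelationConfigChange):
    context: IndexConfig  # the `RelationConfigBase` instance for the new state

    @property
    def requires_full_refresh(self) -> bool:
        return False  # assume indexes can be swapped without a rebuild


change = IndexConfigChange(
    action=RelationConfigChangeAction.create,
    context=IndexConfig.from_dict({"name": "index_a", "column_names": ("column_a",)}),
)
```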

View File

@@ -1,12 +0,0 @@
from dbt.adapters.relation_configs.config_base import ( # noqa: F401
RelationConfigBase,
RelationResults,
)
from dbt.adapters.relation_configs.config_change import ( # noqa: F401
RelationConfigChangeAction,
RelationConfigChange,
)
from dbt.adapters.relation_configs.config_validation import ( # noqa: F401
RelationConfigValidationMixin,
RelationConfigValidationRule,
)

View File

@@ -1,44 +0,0 @@
from dataclasses import dataclass
from typing import Union, Dict
import agate
from dbt.utils import filter_null_values
"""
This is what relation metadata from the database looks like. It's a dictionary because there will be
multiple grains of data for a single object. For example, a materialized view in Postgres has base-level
information, like its name. But it can also have multiple indexes, which need a separate query. It might look like this:
{
    "base": agate.Row({"table_name": "table_abc", "query": "select * from table_def"}),
    "indexes": agate.Table(rows=[
        agate.Row({"name": "index_a", "columns": ["column_a"], "type": "hash", "unique": False}),
        agate.Row({"name": "index_b", "columns": ["time_dim_a"], "type": "btree", "unique": False}),
    ]),
}
"""
RelationResults = Dict[str, Union[agate.Row, agate.Table]]
@dataclass(frozen=True)
class RelationConfigBase:
@classmethod
def from_dict(cls, kwargs_dict) -> "RelationConfigBase":
"""
This assumes the subclass of `RelationConfigBase` is flat, in the sense that no attribute is
        itself another subclass of `RelationConfigBase`. If that's not the case, this should be overridden
to manually manage that complexity.
Args:
kwargs_dict: the dict representation of this instance
Returns: the `RelationConfigBase` representation associated with the provided dict
"""
return cls(**filter_null_values(kwargs_dict)) # type: ignore
@classmethod
def _not_implemented_error(cls) -> NotImplementedError:
return NotImplementedError(
"This relation type has not been fully configured for this adapter."
)

View File

@@ -1,23 +0,0 @@
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Hashable
from dbt.adapters.relation_configs.config_base import RelationConfigBase
from dbt.dataclass_schema import StrEnum
class RelationConfigChangeAction(StrEnum):
alter = "alter"
create = "create"
drop = "drop"
@dataclass(frozen=True, eq=True, unsafe_hash=True)
class RelationConfigChange(RelationConfigBase, ABC):
action: RelationConfigChangeAction
context: Hashable # this is usually a RelationConfig, e.g. IndexConfig, but shouldn't be limited
@property
@abstractmethod
def requires_full_refresh(self) -> bool:
raise self._not_implemented_error()

View File

@@ -1,57 +0,0 @@
from dataclasses import dataclass
from typing import Set, Optional
from dbt.exceptions import DbtRuntimeError
@dataclass(frozen=True, eq=True, unsafe_hash=True)
class RelationConfigValidationRule:
validation_check: bool
validation_error: Optional[DbtRuntimeError]
@property
def default_error(self):
return DbtRuntimeError(
"There was a validation error in preparing this relation config."
"No additional context was provided by this adapter."
)
@dataclass(frozen=True)
class RelationConfigValidationMixin:
def __post_init__(self):
self.run_validation_rules()
@property
def validation_rules(self) -> Set[RelationConfigValidationRule]:
"""
A set of validation rules to run against the object upon creation.
A validation rule is a combination of a validation check (bool) and an optional error message.
This defaults to no validation rules if not implemented. It's recommended to override this with values,
but that may not always be necessary.
Returns: a set of validation rules
"""
return set()
def run_validation_rules(self):
for validation_rule in self.validation_rules:
try:
assert validation_rule.validation_check
except AssertionError:
if validation_rule.validation_error:
raise validation_rule.validation_error
else:
raise validation_rule.default_error
self.run_child_validation_rules()
def run_child_validation_rules(self):
for attr_value in vars(self).values():
if hasattr(attr_value, "validation_rules"):
attr_value.run_validation_rules()
if isinstance(attr_value, set):
for member in attr_value:
if hasattr(member, "validation_rules"):
member.run_validation_rules()
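
A hedged sketch of the mixin in use; the config class and its rule are invented, and the rule fires at construction time via `__post_init__`:

```python
from dataclasses import dataclass
from typing import Set

from dbt.adapters.relation_configs import (
    RelationConfigValidationMixin,
    RelationConfigValidationRule,
)
from dbt.exceptions import DbtRuntimeError


@dataclass(frozen=True)
class HypotheticalTableConfig(RelationConfigValidationMixin):
    table_name: str = ""

    @property
    def validation_rules(self) -> Set[RelationConfigValidationRule]:
        return {
            RelationConfigValidationRule(
                validation_check=self.table_name != "",
                validation_error=DbtRuntimeError("table_name must be provided"),
            ),
        }


HypotheticalTableConfig(table_name="table_abc")  # passes validation
# HypotheticalTableConfig()  # would raise DbtRuntimeError at construction
```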

View File

@@ -1,6 +1,6 @@
import abc
import time
from typing import List, Optional, Tuple, Any, Iterable, Dict, Union
from typing import List, Optional, Tuple, Any, Iterable, Dict
import agate
@@ -117,36 +117,25 @@ class SQLConnectionManager(BaseConnectionManager):
return [dict(zip(column_names, row)) for row in rows]
@classmethod
def get_result_from_cursor(cls, cursor: Any, limit: Optional[int]) -> agate.Table:
def get_result_from_cursor(cls, cursor: Any) -> agate.Table:
data: List[Any] = []
column_names: List[str] = []
if cursor.description is not None:
column_names = [col[0] for col in cursor.description]
if limit:
rows = cursor.fetchmany(limit)
else:
rows = cursor.fetchall()
rows = cursor.fetchall()
data = cls.process_results(column_names, rows)
return dbt.clients.agate_helper.table_from_data_flat(data, column_names)
@classmethod
def data_type_code_to_name(cls, type_code: Union[int, str]) -> str:
"""Get the string representation of the data type from the type_code."""
# https://peps.python.org/pep-0249/#type-objects
raise dbt.exceptions.NotImplementedError(
"`data_type_code_to_name` is not implemented for this adapter!"
)
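A hedged sketch of what an override might look like; the connection-manager subclass and the code-to-name mapping are invented, and a real adapter would derive the mapping from its DB-API driver's type objects instead:

```python
from typing import Union

class HypotheticalConnectionManager(SQLConnectionManager):
    # Invented mapping for illustration; real adapters consult their
    # driver's type codes (see PEP 249 type objects).
    TYPE_CODE_TO_NAME = {16: "boolean", 23: "integer", 1043: "character varying"}

    @classmethod
    def data_type_code_to_name(cls, type_code: Union[int, str]) -> str:
        return cls.TYPE_CODE_TO_NAME.get(type_code, "unknown")
```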
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False, limit: Optional[int] = None
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[AdapterResponse, agate.Table]:
sql = self._add_query_comment(sql)
_, cursor = self.add_query(sql, auto_begin)
response = self.get_response(cursor)
if fetch:
table = self.get_result_from_cursor(cursor, limit)
table = self.get_result_from_cursor(cursor)
else:
table = dbt.clients.agate_helper.empty_table()
return response, table
@@ -157,10 +146,6 @@ class SQLConnectionManager(BaseConnectionManager):
def add_commit_query(self):
return self.add_query("COMMIT", auto_begin=False)
def add_select_query(self, sql: str) -> Tuple[Connection, Any]:
sql = self._add_query_comment(sql)
return self.add_query(sql, auto_begin=False)
def begin(self):
connection = self.get_thread_connection()
if connection.transaction_open is True:

View File

@@ -1,10 +1,10 @@
import agate
from typing import Any, Optional, Tuple, Type, List
from dbt.contracts.connection import Connection, AdapterResponse
from dbt.contracts.connection import Connection
from dbt.exceptions import RelationTypeNullError
from dbt.adapters.base import BaseAdapter, available
from dbt.adapters.cache import _make_ref_key_dict
from dbt.adapters.cache import _make_ref_key_msg
from dbt.adapters.sql import SQLConnectionManager
from dbt.events.functions import fire_event
from dbt.events.types import ColTypeChange, SchemaCreation, SchemaDrop
@@ -22,7 +22,6 @@ RENAME_RELATION_MACRO_NAME = "rename_relation"
TRUNCATE_RELATION_MACRO_NAME = "truncate_relation"
DROP_RELATION_MACRO_NAME = "drop_relation"
ALTER_COLUMN_TYPE_MACRO_NAME = "alter_column_type"
VALIDATE_SQL_MACRO_NAME = "validate_sql"
class SQLAdapter(BaseAdapter):
@@ -110,7 +109,7 @@ class SQLAdapter(BaseAdapter):
ColTypeChange(
orig_type=target_column.data_type,
new_type=new_type,
table=_make_ref_key_dict(current),
table=_make_ref_key_msg(current),
)
)
@@ -153,7 +152,7 @@ class SQLAdapter(BaseAdapter):
def create_schema(self, relation: BaseRelation) -> None:
relation = relation.without_identifier()
fire_event(SchemaCreation(relation=_make_ref_key_dict(relation)))
fire_event(SchemaCreation(relation=_make_ref_key_msg(relation)))
kwargs = {
"relation": relation,
}
@@ -164,7 +163,7 @@ class SQLAdapter(BaseAdapter):
def drop_schema(self, relation: BaseRelation) -> None:
relation = relation.without_identifier()
fire_event(SchemaDrop(relation=_make_ref_key_dict(relation)))
fire_event(SchemaDrop(relation=_make_ref_key_msg(relation)))
kwargs = {
"relation": relation,
}
@@ -198,7 +197,6 @@ class SQLAdapter(BaseAdapter):
)
return relations
@classmethod
def quote(self, identifier):
return '"{}"'.format(identifier)
@@ -219,34 +217,6 @@ class SQLAdapter(BaseAdapter):
results = self.execute_macro(CHECK_SCHEMA_EXISTS_MACRO_NAME, kwargs=kwargs)
return results[0][0] > 0
def validate_sql(self, sql: str) -> AdapterResponse:
"""Submit the given SQL to the engine for validation, but not execution.
By default we simply prefix the query with the explain keyword and allow the
exceptions thrown by the underlying engine on invalid SQL inputs to bubble up
to the exception handler. For adjustments to the explain statement - such as
for adapters that have different mechanisms for hinting at query validation
or dry-run - callers may be able to override the validate_sql_query macro with
the addition of an <adapter>__validate_sql implementation.
:param sql str: The sql to validate
"""
kwargs = {
"sql": sql,
}
result = self.execute_macro(VALIDATE_SQL_MACRO_NAME, kwargs=kwargs)
        # The statement macro always returns an AdapterResponse in the `response`
        # property of the output AttrDict. We preserve the full payload in case we
        # later want to return fetched output for engines that emit explain plans
        # as columnar results. Any macro override that deviates from this shape
        # will trip the assertion below at runtime.
adapter_response = result.response # type: ignore[attr-defined]
assert isinstance(adapter_response, AdapterResponse), (
f"Expected AdapterResponse from validate_sql macro execution, "
f"got {type(adapter_response)}."
)
return adapter_response
# This is for use in the test suite
def run_sql_for_tests(self, sql, fetch, conn):
cursor = conn.handle.cursor()

View File

@@ -1,71 +1 @@
# Adding a new command
## `main.py`
Add the new command with all necessary decorators. Every command will need at minimum:
- a decorator for the click group it belongs to which also names the command
- the postflight decorator (must come before other decorators from the `requires` module for error handling)
- the preflight decorator
```py
@cli.command("my-new-command")
@requires.postflight
@requires.preflight
def my_new_command(ctx, **kwargs):
...
```
## `types.py`
Add an entry to the `Command` enum with your new command. Commands that are sub-commands should have entries
that represent their full command path (e.g. `source freshness -> SOURCE_FRESHNESS`, `docs serve -> DOCS_SERVE`).
## `flags.py`
Add the new command to the dictionary within the `command_args` function.
# Exception Handling
## `requires.py`
### `postflight`
In the postflight decorator, the click command is invoked (i.e. `func(*args, **kwargs)`) and wrapped in a `try/except` block to handle any exceptions thrown.
Any exceptions caught in `postflight` are wrapped in custom exceptions from the `dbt.cli.exceptions` module (i.e. `ResultExit`, `ExceptionExit`) to instruct click to complete execution with a particular exit code.
Some `dbt-core` handled exceptions have an attribute named `results` which contains results from running nodes (e.g. `FailFastError`). These are wrapped in the `ResultExit` exception to represent runs that have failed in a way that `dbt-core` expects.
If the invocation of the command does not throw any exceptions but does not succeed, `postflight` will still raise the `ResultExit` exception to make use of the exit code.
These exceptions produce an exit code of `1`.
Exceptions wrapped with `ExceptionExit` may be thrown by `dbt-core` intentionally (i.e. an exception that inherits from `dbt.exceptions.Exception`) or unintentionally (i.e. exceptions thrown by the python runtime). In either case these are considered errors that `dbt-core` did not expect and are treated as genuine exceptions.
These exceptions produce an exit code of `2`.
If no exceptions are thrown from invoking the command and the command succeeds, `postflight` will not raise any exceptions.
When no exceptions are raised an exit code of `0` is produced.
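A minimal sketch of the pattern `postflight` implements (simplified; the real decorator lives in `dbt/cli/requires.py` and handles more cases, and the import paths here are assumed from the modules referenced above):

```python
from functools import wraps

from dbt.cli.exceptions import ExceptionExit, ResultExit
from dbt.exceptions import FailFastError


def postflight(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            result, success = func(*args, **kwargs)
        except FailFastError as e:       # dbt-handled exception carrying results
            raise ResultExit(e.results)  # exit code 1
        except Exception as e:           # anything dbt-core did not expect
            raise ExceptionExit(e)       # exit code 2
        if not success:
            raise ResultExit(result)     # ran without exceptions, but failed: exit code 1
        return result, success           # exit code 0
    return wrapper
```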
## `main.py`
### `dbtRunner`
`dbtRunner` provides a programmatic interface for our click CLI and wraps the invocation of the click commands to handle any exceptions thrown.
`dbtRunner.invoke` should ideally only ever return an instantiated `dbtRunnerResult` which contains the following fields:
- `success`: A boolean representing whether the command invocation was successful
- `result`: The optional result of the command invoked. This attribute can have many types, please see the definition of `dbtRunnerResult` for more information
- `exception`: If an exception was thrown during command invocation it will be saved here, otherwise it will be `None`. Please note that the exceptions held in this attribute are not the `ResultExit` / `ExceptionExit` exceptions raised by `postflight`, but instead the underlying exceptions that they wrap
Programmatic exception handling might look like the following:
```python
res = dbtRunner().invoke(["run"])
if not res.success:
...
if isinstance(res.exception, SomeExceptionType):
...
```
## `dbt/tests/util.py`
### `run_dbt`
In many of our functional and integration tests, we want to be sure that an invocation of `dbt` raises a certain exception.
A common pattern for these assertions:
```python
class TestSomething:
def test_something(self, project):
with pytest.raises(SomeException):
run_dbt(["run"])
```
To allow these tests to assert that exceptions have been thrown, the `run_dbt` function will raise any exceptions it receives from the invocation of a `dbt` command.
TODO

View File

@@ -1 +0,0 @@
from .main import cli as dbt_cli # noqa

View File

@@ -1,16 +0,0 @@
import click
from typing import Optional
from dbt.cli.main import cli as dbt
def make_context(args, command=dbt) -> Optional[click.Context]:
try:
ctx = command.make_context(command.name, args)
except click.exceptions.Exit:
return None
ctx.invoked_subcommand = ctx.protected_args[0] if ctx.protected_args else None
ctx.obj = {}
return ctx

View File

@@ -1,43 +0,0 @@
from typing import Optional, IO
from click.exceptions import ClickException
from dbt.utils import ExitCodes
class DbtUsageException(Exception):
pass
class DbtInternalException(Exception):
pass
class CliException(ClickException):
"""The base exception class for our implementation of the click CLI.
The exit_code attribute is used by click to determine which exit code to produce
after an invocation."""
def __init__(self, exit_code: ExitCodes) -> None:
self.exit_code = exit_code.value
# the typing of _file is to satisfy the signature of ClickException.show
# overriding this method prevents click from printing any exceptions to stdout
def show(self, _file: Optional[IO] = None) -> None:
pass
class ResultExit(CliException):
"""This class wraps any exception that contains results while invoking dbt, or the
results of an invocation that did not succeed but did not throw any exceptions."""
def __init__(self, result) -> None:
super().__init__(ExitCodes.ModelError)
self.result = result
class ExceptionExit(CliException):
"""This class wraps any exception that does not contain results thrown while invoking dbt."""
def __init__(self, exception: Exception) -> None:
super().__init__(ExitCodes.UnhandledError)
self.exception = exception

View File

@@ -1,404 +1,44 @@
# TODO Move this to /core/dbt/flags.py when we're ready to break things
import os
import sys
from dataclasses import dataclass
from importlib import import_module
from multiprocessing import get_context
from pprint import pformat as pf
from typing import Any, Callable, Dict, List, Optional, Set, Union
from click import Context, get_current_context, Parameter
from click.core import Command as ClickCommand, Group, ParameterSource
from dbt.cli.exceptions import DbtUsageException
from dbt.cli.resolvers import default_log_path, default_project_dir
from dbt.cli.types import Command as CliCommand
from dbt.config.profile import read_user_config
from dbt.contracts.project import UserConfig
from dbt.exceptions import DbtInternalError
from dbt.deprecations import renamed_env_var
from dbt.helper_types import WarnErrorOptions
from click import get_current_context
if os.name != "nt":
# https://bugs.python.org/issue41567
import multiprocessing.popen_spawn_posix # type: ignore # noqa: F401
FLAGS_DEFAULTS = {
"INDIRECT_SELECTION": "eager",
"TARGET_PATH": None,
# Cli args without user_config or env var option.
"FULL_REFRESH": False,
"STRICT_MODE": False,
"STORE_FAILURES": False,
"INTROSPECT": True,
}
DEPRECATED_PARAMS = {
"deprecated_defer": "defer",
"deprecated_favor_state": "favor_state",
"deprecated_print": "print",
"deprecated_state": "state",
}
WHICH_KEY = "which"
def convert_config(config_name, config_value):
"""Convert the values from config and original set_from_args to the correct type."""
ret = config_value
if config_name.lower() == "warn_error_options" and type(config_value) == dict:
ret = WarnErrorOptions(
include=config_value.get("include", []), exclude=config_value.get("exclude", [])
)
return ret
def args_to_context(args: List[str]) -> Context:
"""Convert a list of args to a click context with proper hierarchy for dbt commands"""
from dbt.cli.main import cli
cli_ctx = cli.make_context(cli.name, args)
    # Split args if they're a comma-separated string.
if len(args) == 1 and "," in args[0]:
args = args[0].split(",")
sub_command_name, sub_command, args = cli.resolve_command(cli_ctx, args)
# Handle source and docs group.
if isinstance(sub_command, Group):
sub_command_name, sub_command, args = sub_command.resolve_command(cli_ctx, args)
assert isinstance(sub_command, ClickCommand)
sub_command_ctx = sub_command.make_context(sub_command_name, args)
sub_command_ctx.parent = cli_ctx
return sub_command_ctx
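For example, a hedged usage sketch (flag names as defined on the `run` command below; `--select` is a multi-value option, so click stores it as a tuple):

```python
# Build a click context for `dbt run --select my_model` without invoking it.
ctx = args_to_context(["run", "--select", "my_model"])
assert ctx.command.name == "run"
assert ctx.params["select"] == ("my_model",)
```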
@dataclass(frozen=True)
class Flags:
"""Primary configuration artifact for running dbt"""
def __init__(
self, ctx: Optional[Context] = None, user_config: Optional[UserConfig] = None
) -> None:
# Set the default flags.
for key, value in FLAGS_DEFAULTS.items():
object.__setattr__(self, key, value)
def __init__(self, ctx=None) -> None:
if ctx is None:
ctx = get_current_context()
def _get_params_by_source(ctx: Context, source_type: ParameterSource):
"""Generates all params of a given source type."""
yield from [
name for name, source in ctx._parameter_source.items() if source is source_type
]
if ctx.parent:
yield from _get_params_by_source(ctx.parent, source_type)
# Ensure that any params sourced from the commandline are not present more than once.
# Click handles this exclusivity, but only at a per-subcommand level.
seen_params = []
for param in _get_params_by_source(ctx, ParameterSource.COMMANDLINE):
if param in seen_params:
raise DbtUsageException(
f"{param.lower()} was provided both before and after the subcommand, it can only be set either before or after.",
)
seen_params.append(param)
def _assign_params(
ctx: Context,
params_assigned_from_default: set,
deprecated_env_vars: Dict[str, Callable],
):
def assign_params(ctx):
"""Recursively adds all click params to flag object"""
for param_name, param_value in ctx.params.items():
# N.B. You have to use the base MRO method (object.__setattr__) to set attributes
# when using frozen dataclasses.
# https://docs.python.org/3/library/dataclasses.html#frozen-instances
                # Handle deprecated env vars while still respecting old values:
                # e.g. DBT_NO_PRINT -> DBT_PRINT; if DBT_NO_PRINT is set, it is
                # respected over DBT_PRINT or --print.
new_name: Union[str, None] = None
if param_name in DEPRECATED_PARAMS:
# Deprecated env vars can only be set via env var.
# We use the deprecated option in click to serialize the value
# from the env var string.
param_source = ctx.get_parameter_source(param_name)
if param_source == ParameterSource.DEFAULT:
continue
elif param_source != ParameterSource.ENVIRONMENT:
raise DbtUsageException(
"Deprecated parameters can only be set via environment variables",
)
# Rename for clarity.
dep_name = param_name
new_name = DEPRECATED_PARAMS.get(dep_name)
try:
assert isinstance(new_name, str)
except AssertionError:
raise Exception(
f"No deprecated param name match in DEPRECATED_PARAMS from {dep_name} to {new_name}"
)
# Find param objects for their envvar name.
try:
dep_param = [x for x in ctx.command.params if x.name == dep_name][0]
new_param = [x for x in ctx.command.params if x.name == new_name][0]
except IndexError:
raise Exception(
f"No deprecated param name match in context from {dep_name} to {new_name}"
)
# Remove param from defaulted set since the deprecated
# value is not set from default, but from an env var.
if new_name in params_assigned_from_default:
params_assigned_from_default.remove(new_name)
# Add the deprecation warning function to the set.
assert isinstance(dep_param.envvar, str)
assert isinstance(new_param.envvar, str)
deprecated_env_vars[new_name] = renamed_env_var(
old_name=dep_param.envvar,
new_name=new_param.envvar,
)
# Set the flag value.
is_duplicate = hasattr(self, param_name.upper())
is_default = ctx.get_parameter_source(param_name) == ParameterSource.DEFAULT
flag_name = (new_name or param_name).upper()
if (is_duplicate and not is_default) or not is_duplicate:
object.__setattr__(self, flag_name, param_value)
# Track default assigned params.
if is_default:
params_assigned_from_default.add(param_name)
if hasattr(self, param_name):
raise Exception(f"Duplicate flag names found in click command: {param_name}")
object.__setattr__(self, param_name.upper(), param_value)
if ctx.parent:
_assign_params(ctx.parent, params_assigned_from_default, deprecated_env_vars)
assign_params(ctx.parent)
params_assigned_from_default = set() # type: Set[str]
deprecated_env_vars: Dict[str, Callable] = {}
_assign_params(ctx, params_assigned_from_default, deprecated_env_vars)
assign_params(ctx)
# Set deprecated_env_var_warnings to be fired later after events have been init.
object.__setattr__(
self, "deprecated_env_var_warnings", [x for x in deprecated_env_vars.values()]
)
# Get the invoked command flags.
invoked_subcommand_name = (
ctx.invoked_subcommand if hasattr(ctx, "invoked_subcommand") else None
)
if invoked_subcommand_name is not None:
invoked_subcommand = getattr(import_module("dbt.cli.main"), invoked_subcommand_name)
invoked_subcommand.allow_extra_args = True
invoked_subcommand.ignore_unknown_options = True
invoked_subcommand_ctx = invoked_subcommand.make_context(None, sys.argv)
_assign_params(
invoked_subcommand_ctx, params_assigned_from_default, deprecated_env_vars
)
if not user_config:
profiles_dir = getattr(self, "PROFILES_DIR", None)
user_config = read_user_config(profiles_dir) if profiles_dir else None
# Add entire invocation command to flags
object.__setattr__(self, "INVOCATION_COMMAND", "dbt " + " ".join(sys.argv[1:]))
# Overwrite default assignments with user config if available.
if user_config:
param_assigned_from_default_copy = params_assigned_from_default.copy()
for param_assigned_from_default in params_assigned_from_default:
user_config_param_value = getattr(user_config, param_assigned_from_default, None)
if user_config_param_value is not None:
object.__setattr__(
self,
param_assigned_from_default.upper(),
convert_config(param_assigned_from_default, user_config_param_value),
)
param_assigned_from_default_copy.remove(param_assigned_from_default)
params_assigned_from_default = param_assigned_from_default_copy
# Set hard coded flags.
object.__setattr__(self, "WHICH", invoked_subcommand_name or ctx.info_name)
# Hard coded flags
object.__setattr__(self, "WHICH", ctx.info_name)
object.__setattr__(self, "MP_CONTEXT", get_context("spawn"))
# Apply the lead/follow relationship between some parameters.
self._override_if_set("USE_COLORS", "USE_COLORS_FILE", params_assigned_from_default)
self._override_if_set("LOG_LEVEL", "LOG_LEVEL_FILE", params_assigned_from_default)
self._override_if_set("LOG_FORMAT", "LOG_FORMAT_FILE", params_assigned_from_default)
# Set default LOG_PATH from PROJECT_DIR, if available.
# Starting in v1.5, if `log-path` is set in `dbt_project.yml`, it will raise a deprecation warning,
# with the possibility of removing it in a future release.
if getattr(self, "LOG_PATH", None) is None:
project_dir = getattr(self, "PROJECT_DIR", default_project_dir())
version_check = getattr(self, "VERSION_CHECK", True)
object.__setattr__(self, "LOG_PATH", default_log_path(project_dir, version_check))
# Support console DO NOT TRACK initiative.
if os.getenv("DO_NOT_TRACK", "").lower() in ("1", "t", "true", "y", "yes"):
object.__setattr__(self, "SEND_ANONYMOUS_USAGE_STATS", False)
# Check mutual exclusivity once all flags are set.
self._assert_mutually_exclusive(
params_assigned_from_default, ["WARN_ERROR", "WARN_ERROR_OPTIONS"]
)
# Support lower cased access for legacy code.
params = set(
x for x in dir(self) if not callable(getattr(self, x)) and not x.startswith("__")
)
for param in params:
object.__setattr__(self, param.lower(), getattr(self, param))
        # Support console DO NOT TRACK initiative
        if os.getenv("DO_NOT_TRACK", "").lower() in ("1", "t", "true", "y", "yes"):
            object.__setattr__(self, "ANONYMOUS_USAGE_STATS", False)
def __str__(self) -> str:
return str(pf(self.__dict__))
def _override_if_set(self, lead: str, follow: str, defaulted: Set[str]) -> None:
"""If the value of the lead parameter was set explicitly, apply the value to follow, unless follow was also set explicitly."""
if lead.lower() not in defaulted and follow.lower() in defaulted:
object.__setattr__(self, follow.upper(), getattr(self, lead.upper(), None))
def _assert_mutually_exclusive(
self, params_assigned_from_default: Set[str], group: List[str]
) -> None:
"""
        Ensure no elements from group are simultaneously provided by a user, as inferred from params_assigned_from_default.
        Raises DbtUsageException if any two elements from group are simultaneously provided by a user.
"""
set_flag = None
for flag in group:
flag_set_by_user = flag.lower() not in params_assigned_from_default
if flag_set_by_user and set_flag:
raise DbtUsageException(
f"{flag.lower()}: not allowed with argument {set_flag.lower()}"
)
elif flag_set_by_user:
set_flag = flag
def fire_deprecations(self):
"""Fires events for deprecated env_var usage."""
[dep_fn() for dep_fn in self.deprecated_env_var_warnings]
# It is necessary to remove this attr from the class so it does
# not get pickled when written to disk as json.
object.__delattr__(self, "deprecated_env_var_warnings")
@classmethod
def from_dict(cls, command: CliCommand, args_dict: Dict[str, Any]) -> "Flags":
command_arg_list = command_params(command, args_dict)
ctx = args_to_context(command_arg_list)
flags = cls(ctx=ctx)
flags.fire_deprecations()
return flags
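A hedged usage sketch of this programmatic path (the flag name is assumed from the `run` command's params shown elsewhere in this diff):

```python
# Build Flags for `dbt run --fail-fast` without a live CLI invocation.
flags = Flags.from_dict(CliCommand.RUN, {"fail_fast": True})
assert flags.FAIL_FAST is True
```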
CommandParams = List[str]
def command_params(command: CliCommand, args_dict: Dict[str, Any]) -> CommandParams:
"""Given a command and a dict, returns a list of strings representing
the CLI params for that command. The order of this list is consistent with
which flags are expected at the parent level vs the command level.
e.g. fn("run", {"defer": True, "print": False}) -> ["--no-print", "run", "--defer"]
The result of this function can be passed in to the args_to_context function
to produce a click context to instantiate Flags with.
"""
cmd_args = set(command_args(command))
prnt_args = set(parent_args())
default_args = set([x.lower() for x in FLAGS_DEFAULTS.keys()])
res = command.to_list()
for k, v in args_dict.items():
k = k.lower()
# if a "which" value exists in the args dict, it should match the command provided
if k == WHICH_KEY:
if v != command.value:
raise DbtInternalError(
f"Command '{command.value}' does not match value of which: '{v}'"
)
continue
# param was assigned from defaults and should not be included
if k not in (cmd_args | prnt_args) - default_args:
continue
# if the param is in parent args, it should come before the arg name
# e.g. ["--print", "run"] vs ["run", "--print"]
add_fn = res.append
if k in prnt_args:
def add_fn(x):
res.insert(0, x)
spinal_cased = k.replace("_", "-")
if k == "macro" and command == CliCommand.RUN_OPERATION:
add_fn(v)
elif v in (None, False):
add_fn(f"--no-{spinal_cased}")
elif v is True:
add_fn(f"--{spinal_cased}")
else:
add_fn(f"--{spinal_cased}={v}")
return res
ArgsList = List[str]
def parent_args() -> ArgsList:
"""Return a list representing the params the base click command takes."""
from dbt.cli.main import cli
return format_params(cli.params)
def command_args(command: CliCommand) -> ArgsList:
"""Given a command, return a list of strings representing the params
that command takes. This function only returns params assigned to a
specific command, not those of its parent command.
e.g. fn("run") -> ["defer", "favor_state", "exclude", ...]
"""
import dbt.cli.main as cli
CMD_DICT: Dict[CliCommand, ClickCommand] = {
CliCommand.BUILD: cli.build,
CliCommand.CLEAN: cli.clean,
CliCommand.CLONE: cli.clone,
CliCommand.COMPILE: cli.compile,
CliCommand.DOCS_GENERATE: cli.docs_generate,
CliCommand.DOCS_SERVE: cli.docs_serve,
CliCommand.DEBUG: cli.debug,
CliCommand.DEPS: cli.deps,
CliCommand.INIT: cli.init,
CliCommand.LIST: cli.list,
CliCommand.PARSE: cli.parse,
CliCommand.RUN: cli.run,
CliCommand.RUN_OPERATION: cli.run_operation,
CliCommand.SEED: cli.seed,
CliCommand.SHOW: cli.show,
CliCommand.SNAPSHOT: cli.snapshot,
CliCommand.SOURCE_FRESHNESS: cli.freshness,
CliCommand.TEST: cli.test,
CliCommand.RETRY: cli.retry,
}
click_cmd: Optional[ClickCommand] = CMD_DICT.get(command, None)
if click_cmd is None:
raise DbtInternalError(f"No command found for name '{command.name}'")
return format_params(click_cmd.params)
def format_params(params: List[Parameter]) -> ArgsList:
return [str(x.name) for x in params if not str(x.name).lower().startswith("deprecated_")]

View File

@@ -1,121 +1,22 @@
import inspect # This is temporary for RAT-ing
from copy import copy
from dataclasses import dataclass
from typing import Callable, List, Optional, Union
from pprint import pformat as pf # This is temporary for RAT-ing
import click
from click.exceptions import (
Exit as ClickExit,
BadOptionUsage,
NoSuchOption,
UsageError,
)
from dbt.cli import requires, params as p
from dbt.cli.exceptions import (
DbtInternalException,
DbtUsageException,
)
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
CatalogArtifact,
RunExecutionResult,
)
from dbt.events.base_types import EventMsg
from dbt.task.build import BuildTask
from dbt.task.clean import CleanTask
from dbt.task.clone import CloneTask
from dbt.task.compile import CompileTask
from dbt.task.debug import DebugTask
from dbt.task.deps import DepsTask
from dbt.task.freshness import FreshnessTask
from dbt.task.generate import GenerateTask
from dbt.task.init import InitTask
from dbt.task.list import ListTask
from dbt.task.retry import RetryTask
from dbt.task.run import RunTask
from dbt.task.run_operation import RunOperationTask
from dbt.task.seed import SeedTask
from dbt.task.serve import ServeTask
from dbt.task.show import ShowTask
from dbt.task.snapshot import SnapshotTask
from dbt.task.test import TestTask
from dbt.adapters.factory import adapter_management
from dbt.cli import params as p
from dbt.cli.flags import Flags
from dbt.profiler import profiler
@dataclass
class dbtRunnerResult:
"""Contains the result of an invocation of the dbtRunner"""
def cli_runner():
# Alias "list" to "ls"
ls = copy(cli.commands["list"])
ls.hidden = True
cli.add_command(ls, "ls")
success: bool
exception: Optional[BaseException] = None
result: Union[
bool, # debug
CatalogArtifact, # docs generate
List[str], # list/ls
Manifest, # parse
None, # clean, deps, init, source
RunExecutionResult, # build, compile, run, seed, snapshot, test, run-operation
] = None
# Programmatic invocation
class dbtRunner:
def __init__(
self,
manifest: Optional[Manifest] = None,
callbacks: Optional[List[Callable[[EventMsg], None]]] = None,
):
self.manifest = manifest
if callbacks is None:
callbacks = []
self.callbacks = callbacks
def invoke(self, args: List[str], **kwargs) -> dbtRunnerResult:
try:
dbt_ctx = cli.make_context(cli.name, args)
dbt_ctx.obj = {
"manifest": self.manifest,
"callbacks": self.callbacks,
}
for key, value in kwargs.items():
dbt_ctx.params[key] = value
# Hack to set parameter source to custom string
dbt_ctx.set_parameter_source(key, "kwargs") # type: ignore
result, success = cli.invoke(dbt_ctx)
return dbtRunnerResult(
result=result,
success=success,
)
except requires.ResultExit as e:
return dbtRunnerResult(
result=e.result,
success=False,
)
except requires.ExceptionExit as e:
return dbtRunnerResult(
exception=e.exception,
success=False,
)
except (BadOptionUsage, NoSuchOption, UsageError) as e:
return dbtRunnerResult(
exception=DbtUsageException(e.message),
success=False,
)
except ClickExit as e:
if e.exit_code == 0:
return dbtRunnerResult(success=True)
return dbtRunnerResult(
exception=DbtInternalException(f"unhandled exit code {e.exit_code}"),
success=False,
)
except BaseException as e:
return dbtRunnerResult(
exception=e,
success=False,
)
# Run the cli
cli()
# dbt
@@ -126,30 +27,21 @@ class dbtRunner:
epilog="Specify one of these sub-commands and you can find more help from there.",
)
@click.pass_context
@p.anonymous_usage_stats
@p.cache_selected_only
@p.debug
@p.deprecated_print
@p.enable_legacy_logger
@p.fail_fast
@p.log_cache_events
@p.log_format
@p.log_format_file
@p.log_level
@p.log_level_file
@p.log_path
@p.macro_debugging
@p.partial_parse
@p.partial_parse_file_path
@p.populate_cache
@p.print
@p.printer_width
@p.quiet
@p.record_timing_info
@p.send_anonymous_usage_stats
@p.single_threaded
@p.static_parser
@p.use_colors
@p.use_colors_file
@p.use_experimental_parser
@p.version
@p.version_check
@@ -160,52 +52,49 @@ def cli(ctx, **kwargs):
"""An ELT tool for managing your SQL transformations and data models.
For more documentation on these commands, visit: docs.getdbt.com
"""
incomplete_flags = Flags()
# Profiling
if incomplete_flags.RECORD_TIMING_INFO:
ctx.with_resource(profiler(enable=True, outfile=incomplete_flags.RECORD_TIMING_INFO))
# Adapter management
ctx.with_resource(adapter_management())
# Version info
if incomplete_flags.VERSION:
click.echo(f"`version` called\n ctx.params: {pf(ctx.params)}")
return
else:
del ctx.params["version"]
# dbt build
@cli.command("build")
@click.pass_context
@p.defer
@p.deprecated_defer
@p.exclude
@p.fail_fast
@p.favor_state
@p.deprecated_favor_state
@p.full_refresh
@p.indirect_selection
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.resource_type
@p.select
@p.selector
@p.show
@p.state
@p.defer_state
@p.deprecated_state
@p.store_failures
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def build(ctx, **kwargs):
"""Run all seeds, models, snapshots, and tests in DAG order"""
task = BuildTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
"""Run all Seeds, Models, Snapshots, and tests in DAG order"""
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt clean
@@ -215,19 +104,11 @@ def build(ctx, **kwargs):
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.vars
@requires.postflight
@requires.preflight
@requires.unset_profile
@requires.project
def clean(ctx, **kwargs):
"""Delete all folders in the clean-targets list (usually the dbt_packages and target directories.)"""
task = CleanTask(ctx.obj["flags"], ctx.obj["project"])
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt docs
@@ -242,41 +123,23 @@ def docs(ctx, **kwargs):
@click.pass_context
@p.compile_docs
@p.defer
@p.deprecated_defer
@p.exclude
@p.favor_state
@p.deprecated_favor_state
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.empty_catalog
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest(write=False)
def docs_generate(ctx, **kwargs):
"""Generate the documentation website for your project"""
task = GenerateTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt docs serve
@@ -288,184 +151,81 @@ def docs_generate(ctx, **kwargs):
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.vars
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
def docs_serve(ctx, **kwargs):
"""Serve the documentation website for your project"""
task = ServeTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt compile
@cli.command("compile")
@click.pass_context
@p.defer
@p.deprecated_defer
@p.exclude
@p.favor_state
@p.deprecated_favor_state
@p.full_refresh
@p.show_output_format
@p.indirect_selection
@p.introspect
@p.log_path
@p.models
@p.parse_only
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.inline
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def compile(ctx, **kwargs):
"""Generates executable SQL from source, model, test, and analysis files. Compiled SQL files are written to the
target/ directory."""
task = CompileTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# dbt show
@cli.command("show")
@click.pass_context
@p.defer
@p.deprecated_defer
@p.exclude
@p.favor_state
@p.deprecated_favor_state
@p.full_refresh
@p.show_output_format
@p.show_limit
@p.indirect_selection
@p.introspect
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.inline
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def show(ctx, **kwargs):
"""Generates executable SQL for a named resource or inline query, runs that SQL, and returns a preview of the
results. Does not materialize anything to the warehouse."""
task = ShowTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
"""Generates executable SQL from source, model, test, and analysis files. Compiled SQL files are written to the target/ directory."""
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt debug
@cli.command("debug")
@click.pass_context
@p.debug_connection
@p.config_dir
@p.profile
@p.profiles_dir_exists_false
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
def debug(ctx, **kwargs):
"""Show information on the current dbt environment and check dependencies, then test the database connection. Not to be confused with the --debug option which increases verbosity."""
task = DebugTask(
ctx.obj["flags"],
None,
)
results = task.run()
success = task.interpret_results(results)
return results, success
"""Show some helpful information about dbt for debugging. Not to be confused with the --debug option which increases verbosity."""
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt deps
@cli.command("deps")
@click.pass_context
@p.profile
@p.profiles_dir_exists_false
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
@requires.postflight
@requires.preflight
@requires.unset_profile
@requires.project
def deps(ctx, **kwargs):
"""Pull the most recent version of the dependencies listed in packages.yml"""
task = DepsTask(ctx.obj["flags"], ctx.obj["project"])
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt init
@cli.command("init")
@click.pass_context
# for backwards compatibility, accept 'project_name' as an optional positional argument
@click.argument("project_name", required=False)
@p.profile
@p.profiles_dir_exists_false
@p.profiles_dir
@p.project_dir
@p.skip_profile_setup
@p.target
@p.vars
@requires.postflight
@requires.preflight
def init(ctx, **kwargs):
"""Initialize a new dbt project."""
task = InitTask(ctx.obj["flags"], None)
results = task.run()
success = task.interpret_results(results)
return results, success
"""Initialize a new DBT project."""
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt list
@@ -480,42 +240,21 @@ def init(ctx, **kwargs):
@p.profiles_dir
@p.project_dir
@p.resource_type
@p.raw_select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.vars
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def list(ctx, **kwargs):
"""List the resources in your project"""
task = ListTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# Alias "list" to "ls"
ls = copy(cli.commands["list"])
ls.hidden = True
cli.add_command(ls, "ls")
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt parse
@cli.command("parse")
@click.pass_context
@p.compile_parse
@p.log_path
@p.profile
@p.profiles_dir
@p.project_dir
@@ -524,157 +263,51 @@ cli.add_command(ls, "ls")
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest(write_perf_info=True)
@p.write_manifest
def parse(ctx, **kwargs):
"""Parses the project and provides information on performance"""
# manifest generation and writing happens in @requires.manifest
return ctx.obj["manifest"], True
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt run
@cli.command("run")
@click.pass_context
@p.defer
@p.deprecated_defer
@p.favor_state
@p.deprecated_favor_state
@p.exclude
@p.fail_fast
@p.full_refresh
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def run(ctx, **kwargs):
"""Compile SQL and execute against the current target database."""
task = RunTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# dbt retry
@cli.command("retry")
@click.pass_context
@p.project_dir
@p.profiles_dir
@p.vars
@p.profile
@p.target
@p.state
@p.threads
@p.fail_fast
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def retry(ctx, **kwargs):
"""Retry the nodes that failed in the previous run."""
task = RetryTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# dbt clone
@cli.command("clone")
@click.pass_context
@p.defer_state
@p.exclude
@p.full_refresh
@p.profile
@p.profiles_dir
@p.project_dir
@p.resource_type
@p.select
@p.selector
@p.state # required
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
@requires.postflight
def clone(ctx, **kwargs):
"""Create clones of selected nodes based on their location in the manifest provided to --state."""
task = CloneTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt run operation
@cli.command("run-operation")
@click.pass_context
@click.argument("macro")
@p.args
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.threads
@p.vars
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def run_operation(ctx, **kwargs):
"""Run the named macro with any supplied arguments."""
task = RunOperationTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt seed
@@ -682,75 +315,43 @@ def run_operation(ctx, **kwargs):
@click.pass_context
@p.exclude
@p.full_refresh
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.show
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def seed(ctx, **kwargs):
"""Load data from csv files into your data warehouse."""
task = SeedTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt snapshot
@cli.command("snapshot")
@click.pass_context
@p.defer
@p.deprecated_defer
@p.exclude
@p.favor_state
@p.deprecated_favor_state
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def snapshot(ctx, **kwargs):
"""Execute snapshots defined in your project"""
task = SnapshotTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt source
@@ -764,87 +365,48 @@ def source(ctx, **kwargs):
@source.command("freshness")
@click.pass_context
@p.exclude
@p.models
@p.output_path # TODO: Is this ok to re-use? We have three different output params, how much can we consolidate?
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.target
@p.target_path
@p.threads
@p.vars
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def freshness(ctx, **kwargs):
"""check the current freshness of the project's sources"""
task = FreshnessTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
# Alias "source freshness" to "snapshot-freshness"
snapshot_freshness = copy(cli.commands["source"].commands["freshness"]) # type: ignore
snapshot_freshness.hidden = True
cli.commands["source"].add_command(snapshot_freshness, "snapshot-freshness") # type: ignore
"""Snapshots the current freshness of the project's sources"""
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt test
@cli.command("test")
@click.pass_context
@p.defer
@p.deprecated_defer
@p.exclude
@p.fail_fast
@p.favor_state
@p.deprecated_favor_state
@p.indirect_selection
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.select
@p.selector
@p.state
@p.defer_state
@p.deprecated_state
@p.store_failures
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@requires.postflight
@requires.preflight
@requires.profile
@requires.project
@requires.runtime_config
@requires.manifest
def test(ctx, **kwargs):
"""Runs tests on data in deployed models. Run this after `dbt run`"""
task = TestTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
)
results = task.run()
success = task.interpret_results(results)
return results, success
flags = Flags()
click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# Support running as a module
if __name__ == "__main__":
cli()
cli_runner()

View File

@@ -1,7 +1,5 @@
from click import ParamType, Choice
from dbt.config.utils import parse_cli_yaml_string
from dbt.exceptions import ValidationError, DbtValidationError, OptionNotYamlDictError
from click import ParamType
import yaml
from dbt.helper_types import WarnErrorOptions
@@ -16,9 +14,8 @@ class YAML(ParamType):
if not isinstance(value, str):
self.fail(f"Cannot load YAML from type {type(value)}", param, ctx)
try:
param_option_name = param.opts[0] if param.opts else param.name
return parse_cli_yaml_string(value, param_option_name.strip("-"))
except (ValidationError, DbtValidationError, OptionNotYamlDictError):
return yaml.load(value, Loader=yaml.Loader)
except yaml.parser.ParserError:
self.fail(f"String '{value}' is not valid YAML", param, ctx)
@@ -28,7 +25,6 @@ class WarnErrorOptionsType(YAML):
name = "WarnErrorOptionsType"
def convert(self, value, param, ctx):
# this function is being used by param in click
include_exclude = super().convert(value, param, ctx)
return WarnErrorOptions(
@@ -50,13 +46,3 @@ class Truthy(ParamType):
return None
else:
return value
class ChoiceTuple(Choice):
name = "CHOICE_TUPLE"
def convert(self, value, param, ctx):
for value_item in value:
super().convert(value_item, param, ctx)
return value
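A quick usage sketch (a hypothetical toy command, not part of this file; it assumes the module above is importable as dbt.cli.option_types and that parse_cli_yaml_string yields a parsed dict): click calls convert() on the raw argument string, so a YAML-typed option reaches the command body already parsed.
import click
from dbt.cli.option_types import YAML

@click.command()
@click.option("--vars", "vars_", type=YAML(), default="{}")
def demo(vars_):
    # e.g. `demo --vars '{my_var: 1}'` prints {'my_var': 1}
    click.echo(vars_)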

View File

@@ -1,75 +0,0 @@
import click
import inspect
import typing as t
from click import Context
from dbt.cli.option_types import ChoiceTuple
# Implementation from: https://stackoverflow.com/a/48394004
# Note MultiOption options must be specified with type=tuple or type=ChoiceTuple (https://github.com/pallets/click/issues/2012)
class MultiOption(click.Option):
def __init__(self, *args, **kwargs):
self.save_other_options = kwargs.pop("save_other_options", True)
nargs = kwargs.pop("nargs", -1)
assert nargs == -1, "nargs, if set, must be -1 not {}".format(nargs)
super(MultiOption, self).__init__(*args, **kwargs)
self._previous_parser_process = None
self._eat_all_parser = None
# validate that multiple=True
multiple = kwargs.pop("multiple", None)
msg = f"MultiOption named `{self.name}` must have multiple=True (rather than {multiple})"
assert multiple, msg
# validate that type=tuple or type=ChoiceTuple
option_type = kwargs.pop("type", None)
msg = f"MultiOption named `{self.name}` must be tuple or ChoiceTuple (rather than {option_type})"
if inspect.isclass(option_type):
assert issubclass(option_type, tuple), msg
else:
assert isinstance(option_type, ChoiceTuple), msg
def add_to_parser(self, parser, ctx):
def parser_process(value, state):
# method to hook to the parser.process
done = False
value = [value]
if self.save_other_options:
# grab everything up to the next option
while state.rargs and not done:
for prefix in self._eat_all_parser.prefixes:
if state.rargs[0].startswith(prefix):
done = True
if not done:
value.append(state.rargs.pop(0))
else:
# grab everything remaining
value += state.rargs
state.rargs[:] = []
value = tuple(value)
# call the actual process
self._previous_parser_process(value, state)
retval = super(MultiOption, self).add_to_parser(parser, ctx)
for name in self.opts:
our_parser = parser._long_opt.get(name) or parser._short_opt.get(name)
if our_parser:
self._eat_all_parser = our_parser
self._previous_parser_process = our_parser.process
our_parser.process = parser_process
break
return retval
def type_cast_value(self, ctx: Context, value: t.Any) -> t.Any:
def flatten(data):
if isinstance(data, tuple):
for x in data:
yield from flatten(x)
else:
yield data
# there will be nested tuples to flatten when multiple=True
value = super(MultiOption, self).type_cast_value(ctx, value)
if value:
value = tuple(flatten(value))
return value
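A hedged usage sketch (a toy command, not part of this file; it assumes the class above is importable as dbt.cli.options.MultiOption and, per the note above, that click's internal parser hooks are available): a single occurrence of the flag greedily consumes the bare arguments that follow it, so `--select a b c` is equivalent to repeating the flag for each value.
import click
from dbt.cli.options import MultiOption

@click.command()
@click.option("--select", type=tuple, cls=MultiOption, multiple=True)
def demo(select):
    # `demo --select a b c` -> select == ("a", "b", "c")
    click.echo(" ".join(select))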

View File

@@ -1,10 +1,20 @@
from pathlib import Path
from pathlib import Path, PurePath
import click
from dbt.cli.options import MultiOption
from dbt.cli.option_types import YAML, ChoiceTuple, WarnErrorOptionsType
from dbt.cli.option_types import YAML, WarnErrorOptionsType
from dbt.cli.resolvers import default_project_dir, default_profiles_dir
from dbt.version import get_version_information
# TODO: The name (reflected in flags) is a correction!
# The original name was `SEND_ANONYMOUS_USAGE_STATS` and used an env var called "DBT_SEND_ANONYMOUS_USAGE_STATS"
# Both of which break existing naming conventions (doesn't match param flag).
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
anonymous_usage_stats = click.option(
"--anonymous-usage-stats/--no-anonymous-usage-stats",
envvar="DBT_ANONYMOUS_USAGE_STATS",
help="Send anonymous usage stats to dbt Labs.",
default=True,
)
args = click.option(
"--args",
@@ -23,28 +33,28 @@ browser = click.option(
cache_selected_only = click.option(
"--cache-selected-only/--no-cache-selected-only",
envvar="DBT_CACHE_SELECTED_ONLY",
help="At start of run, populate relational cache only for schemas containing selected nodes, or for all schemas of interest.",
)
introspect = click.option(
"--introspect/--no-introspect",
envvar="DBT_INTROSPECT",
help="Whether to scaffold introspective queries as part of compilation",
default=True,
help="Pre cache database objects relevant to selected resource only.",
)
compile_docs = click.option(
"--compile/--no-compile",
envvar=None,
help="Whether or not to run 'dbt compile' as part of docs generation",
help="Wether or not to run 'dbt compile' as part of docs generation",
default=True,
)
compile_parse = click.option(
"--compile/--no-compile",
envvar=None,
help="TODO: No help text currently available",
default=True,
)
config_dir = click.option(
"--config-dir",
envvar=None,
help="Print a system-specific command to access the directory that the current dbt project is searching for a profiles.yml. Then, exit. This flag renders other debug step flags no-ops.",
is_flag=True,
help="If specified, DBT will show path information for this project",
type=click.STRING,
)
debug = click.option(
@@ -54,19 +64,14 @@ debug = click.option(
help="Display debug logging during dbt execution. Useful for debugging and making bug reports.",
)
# flag was previously named DEFER_MODE
# TODO: The env var and name (reflected in flags) are corrections!
# The original name was `DEFER_MODE` and used an env var called "DBT_DEFER_TO_STATE"
# Both of which break existing naming conventions.
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
defer = click.option(
"--defer/--no-defer",
envvar="DBT_DEFER",
help="If set, resolve unselected nodes by deferring to the manifest within the --state directory.",
)
deprecated_defer = click.option(
"--deprecated-defer",
envvar="DBT_DEFER_TO_STATE",
help="Internal flag for deprecating old env var.",
default=False,
hidden=True,
help="If set, defer to the state variable for resolving unselected nodes.",
)
enable_legacy_logger = click.option(
@@ -75,14 +80,7 @@ enable_legacy_logger = click.option(
hidden=True,
)
exclude = click.option(
"--exclude",
envvar=None,
type=tuple,
cls=MultiOption,
multiple=True,
help="Specify the nodes to exclude.",
)
exclude = click.option("--exclude", envvar=None, help="Specify the nodes to exclude.")
fail_fast = click.option(
"--fail-fast/--no-fail-fast",
@@ -91,18 +89,6 @@ fail_fast = click.option(
help="Stop execution on first failure.",
)
favor_state = click.option(
"--favor-state/--no-favor-state",
envvar="DBT_FAVOR_STATE",
help="If set, defer to the argument provided to the state flag for resolving unselected nodes, even if the node(s) exist as a database object in the current environment.",
)
deprecated_favor_state = click.option(
"--deprecated-favor-state",
envvar="DBT_FAVOR_STATE_MODE",
help="Internal flag for deprecating old env var.",
)
full_refresh = click.option(
"--full-refresh",
"-f",
@@ -114,69 +100,30 @@ full_refresh = click.option(
indirect_selection = click.option(
"--indirect-selection",
envvar="DBT_INDIRECT_SELECTION",
help="Choose which tests to select that are adjacent to selected resources. Eager is most inclusive, cautious is most exclusive, and buildable is in between. Empty includes no tests at all.",
type=click.Choice(["eager", "cautious", "buildable", "empty"], case_sensitive=False),
help="Select all tests that are adjacent to selected resources, even if they those resources have been explicitly selected.",
type=click.Choice(["eager", "cautious"], case_sensitive=False),
default="eager",
)
log_cache_events = click.option(
"--log-cache-events/--no-log-cache-events",
help="Enable verbose logging for relational cache events to help when debugging.",
help="Enable verbose adapter cache logging.",
envvar="DBT_LOG_CACHE_EVENTS",
)
log_format = click.option(
"--log-format",
envvar="DBT_LOG_FORMAT",
help="Specify the format of logging to the console and the log file. Use --log-format-file to configure the format for the log file differently than the console.",
type=click.Choice(["text", "debug", "json", "default"], case_sensitive=False),
help="Specify the log format, overriding the command's default.",
type=click.Choice(["text", "json", "default"], case_sensitive=False),
default="default",
)
log_format_file = click.option(
"--log-format-file",
envvar="DBT_LOG_FORMAT_FILE",
help="Specify the format of logging to the log file by overriding the default value and the general --log-format setting.",
type=click.Choice(["text", "debug", "json", "default"], case_sensitive=False),
default="debug",
)
log_level = click.option(
"--log-level",
envvar="DBT_LOG_LEVEL",
help="Specify the minimum severity of events that are logged to the console and the log file. Use --log-level-file to configure the severity for the log file differently than the console.",
type=click.Choice(["debug", "info", "warn", "error", "none"], case_sensitive=False),
default="info",
)
log_level_file = click.option(
"--log-level-file",
envvar="DBT_LOG_LEVEL_FILE",
help="Specify the minimum severity of events that are logged to the log file by overriding the default value and the general --log-level setting.",
type=click.Choice(["debug", "info", "warn", "error", "none"], case_sensitive=False),
default="debug",
)
use_colors = click.option(
"--use-colors/--no-use-colors",
envvar="DBT_USE_COLORS",
help="Specify whether log output is colorized in the console and the log file. Use --use-colors-file/--no-use-colors-file to colorize the log file differently than the console.",
default=True,
)
use_colors_file = click.option(
"--use-colors-file/--no-use-colors-file",
envvar="DBT_USE_COLORS_FILE",
help="Specify whether log file output is colorized by overriding the default value and the general --use-colors/--no-use-colors setting.",
default=True,
)
log_path = click.option(
"--log-path",
envvar="DBT_LOG_PATH",
help="Configure the 'log-path'. Only applies this setting for the current run. Overrides the 'DBT_LOG_PATH' if it is set.",
default=None,
type=click.Path(resolve_path=True, path_type=Path),
type=click.Path(),
)
macro_debugging = click.option(
@@ -185,51 +132,41 @@ macro_debugging = click.option(
hidden=True,
)
# This less standard usage of --output where output_path below is more standard
models = click.option(
"-m",
"-s",
"models",
envvar=None,
help="Specify the nodes to include.",
multiple=True,
)
output = click.option(
"--output",
envvar=None,
help="Specify the output format: either JSON or a newline-delimited list of selectors, paths, or names",
help="TODO: No current help text",
type=click.Choice(["json", "name", "path", "selector"], case_sensitive=False),
default="selector",
)
show_output_format = click.option(
"--output",
envvar=None,
help="Output format for dbt compile and dbt show",
type=click.Choice(["json", "text"], case_sensitive=False),
default="text",
)
show_limit = click.option(
"--limit",
envvar=None,
help="Limit the number of results returned by dbt show",
type=click.INT,
default=5,
default="name",
)
output_keys = click.option(
"--output-keys",
envvar=None,
help=(
"Space-delimited listing of node properties to include as custom keys for JSON output "
"(e.g. `--output json --output-keys name resource_type description`)"
),
type=tuple,
cls=MultiOption,
multiple=True,
default=[],
"--output-keys", envvar=None, help="TODO: No current help text", type=click.STRING
)
output_path = click.option(
"--output",
"-o",
envvar=None,
help="Specify the output path for the JSON report. By default, outputs to 'target/sources.json'",
help="Specify the output path for the json report. By default, outputs to 'target/sources.json'",
type=click.Path(file_okay=True, dir_okay=False, writable=True),
default=None,
default=PurePath.joinpath(Path.cwd(), "target/sources.json"),
)
parse_only = click.option(
"--parse-only",
envvar=None,
help="TODO: No help text currently available",
is_flag=True,
)
partial_parse = click.option(
@@ -239,22 +176,6 @@ partial_parse = click.option(
default=True,
)
partial_parse_file_path = click.option(
"--partial-parse-file-path",
envvar="DBT_PARTIAL_PARSE_FILE_PATH",
help="Internal flag for path to partial_parse.manifest file.",
default=None,
hidden=True,
type=click.Path(exists=True, dir_okay=False, resolve_path=True),
)
populate_cache = click.option(
"--populate-cache/--no-populate-cache",
envvar="DBT_POPULATE_CACHE",
help="At start of run, use `show` or `information_schema` queries to populate a relational cache, which can speed up subsequent materializations.",
default=True,
)
port = click.option(
"--port",
envvar=None,
@@ -263,6 +184,10 @@ port = click.option(
type=click.INT,
)
# TODO: The env var and name (reflected in flags) are corrections!
# The original name was `NO_PRINT` and used the env var `DBT_NO_PRINT`.
# Both of which break existing naming conventions.
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
print = click.option(
"--print/--no-print",
envvar="DBT_PRINT",
@@ -270,15 +195,6 @@ print = click.option(
default=True,
)
deprecated_print = click.option(
"--deprecated-print/--deprecated-no-print",
envvar="DBT_NO_PRINT",
help="Internal flag for deprecating old env var.",
default=True,
hidden=True,
callback=lambda ctx, param, value: not value,
)
printer_width = click.option(
"--printer-width",
envvar="DBT_PRINTER_WIDTH",
@@ -297,32 +213,20 @@ profiles_dir = click.option(
"--profiles-dir",
envvar="DBT_PROFILES_DIR",
help="Which directory to look in for the profiles.yml file. If not set, dbt will look in the current working directory first, then HOME/.dbt/",
default=default_profiles_dir,
default=default_profiles_dir(),
type=click.Path(exists=True),
)
# `dbt debug` uses this because it implements custom behaviour for non-existent profiles.yml directories
# `dbt deps` does not load a profile at all
# `dbt init` will write profiles.yml if it doesn't yet exist
profiles_dir_exists_false = click.option(
"--profiles-dir",
envvar="DBT_PROFILES_DIR",
help="Which directory to look in for the profiles.yml file. If not set, dbt will look in the current working directory first, then HOME/.dbt/",
default=default_profiles_dir,
type=click.Path(exists=False),
)
project_dir = click.option(
"--project-dir",
envvar="DBT_PROJECT_DIR",
envvar=None,
help="Which directory to look in for the dbt_project.yml file. Default is the current working directory and its parents.",
default=default_project_dir,
default=default_project_dir(),
type=click.Path(exists=True),
)
quiet = click.option(
"--quiet/--no-quiet",
"-q",
envvar="DBT_QUIET",
help="Suppress all non-error logging to stdout. Does not affect {{ print() }} macro calls.",
)
@@ -336,11 +240,10 @@ record_timing_info = click.option(
)
resource_type = click.option(
"--resource-types",
"--resource-type",
envvar=None,
help="Restricts the types of resources that dbt will include",
type=ChoiceTuple(
help="TODO: No current help text",
type=click.Choice(
[
"metric",
"source",
@@ -355,120 +258,35 @@ resource_type = click.option(
],
case_sensitive=False,
),
cls=MultiOption,
multiple=True,
default=(),
default="default",
)
model_decls = ("-m", "--models", "--model")
select_decls = ("-s", "--select")
select_attrs = {
"envvar": None,
"help": "Specify the nodes to include.",
"cls": MultiOption,
"multiple": True,
"type": tuple,
}
inline = click.option(
"--inline",
envvar=None,
help="Pass SQL inline to dbt compile and show",
)
# `--select` and `--models` are analogous for most commands except `dbt list` for legacy reasons.
# Most CLI arguments should use the combined `select` option that aliases `--models` to `--select`.
# However, if you need to split out these selectors (like `dbt ls`), use the `models` and `raw_select` options instead.
# See https://github.com/dbt-labs/dbt-core/pull/6774#issuecomment-1408476095 for more info.
models = click.option(*model_decls, **select_attrs)
raw_select = click.option(*select_decls, **select_attrs)
select = click.option(*select_decls, *model_decls, **select_attrs)
selector = click.option(
"--selector",
envvar=None,
help="The selector name to use, as defined in selectors.yml",
)
send_anonymous_usage_stats = click.option(
"--send-anonymous-usage-stats/--no-send-anonymous-usage-stats",
envvar="DBT_SEND_ANONYMOUS_USAGE_STATS",
help="Send anonymous usage stats to dbt Labs.",
default=True,
"--selector", envvar=None, help="The selector name to use, as defined in selectors.yml"
)
show = click.option(
"--show",
envvar=None,
help="Show a sample of the loaded data in the terminal",
is_flag=True,
)
# TODO: The env var is a correction!
# The original env var was `DBT_TEST_SINGLE_THREADED`.
# This broke the existing naming convention.
# This will need to be communicated as a change to the community!
#
# N.B. This flag is only used for testing, hence it's hidden from help text.
single_threaded = click.option(
"--single-threaded/--no-single-threaded",
envvar="DBT_SINGLE_THREADED",
default=False,
hidden=True,
"--show", envvar=None, help="Show a sample of the loaded data in the terminal", is_flag=True
)
skip_profile_setup = click.option(
"--skip-profile-setup",
"-s",
envvar=None,
help="Skip interactive profile setup.",
is_flag=True,
)
empty_catalog = click.option(
"--empty-catalog",
help="If specified, generate empty catalog.json file during the `dbt docs generate` command.",
default=False,
is_flag=True,
"--skip-profile-setup", "-s", envvar=None, help="Skip interactive profile setup.", is_flag=True
)
# TODO: The env var and name (reflected in flags) are corrections!
# The original name was `ARTIFACT_STATE_PATH` and used the env var `DBT_ARTIFACT_STATE_PATH`.
# Both of which break existing naming conventions.
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
state = click.option(
"--state",
envvar="DBT_STATE",
help="Unless overridden, use this state directory for both state comparison and deferral.",
type=click.Path(
dir_okay=True,
file_okay=False,
readable=True,
resolve_path=False,
path_type=Path,
),
)
defer_state = click.option(
"--defer-state",
envvar="DBT_DEFER_STATE",
help="Override the state directory for deferral only.",
type=click.Path(
dir_okay=True,
file_okay=False,
readable=True,
resolve_path=False,
path_type=Path,
),
)
deprecated_state = click.option(
"--deprecated-state",
envvar="DBT_ARTIFACT_STATE_PATH",
help="Internal flag for deprecating old env var.",
hidden=True,
help="If set, use the given directory as the source for json files to compare with this project.",
type=click.Path(
dir_okay=True,
exists=True,
file_okay=False,
readable=True,
resolve_path=True,
path_type=Path,
),
)
@@ -487,10 +305,7 @@ store_failures = click.option(
)
target = click.option(
"--target",
"-t",
envvar=None,
help="Which target to load for the given profile",
"--target", "-t", envvar=None, help="Which target to load for the given profile"
)
target_path = click.option(
@@ -500,21 +315,21 @@ target_path = click.option(
type=click.Path(),
)
debug_connection = click.option(
"--connection",
envvar=None,
help="Test the connection to the target database independent of dependency checks.",
is_flag=True,
)
threads = click.option(
"--threads",
envvar=None,
help="Specify number of threads to use while executing models. Overrides settings in profiles.yml.",
default=None,
default=1,
type=click.INT,
)
use_colors = click.option(
"--use-colors/--no-use-colors",
envvar="DBT_USE_COLORS",
help="Output is colorized by default and may also be set in a profile or at the command line.",
default=True,
)
use_experimental_parser = click.option(
"--use-experimental-parser/--no-use-experimental-parser",
envvar="DBT_USE_EXPERIMENTAL_PARSER",
@@ -526,35 +341,19 @@ vars = click.option(
envvar=None,
help="Supply variables to the project. This argument overrides variables defined in your dbt_project.yml file. This argument should be a YAML string, eg. '{my_variable: my_value}'",
type=YAML(),
default="{}",
)
# TODO: when legacy flags are deprecated use
# click.version_option instead of a callback
def _version_callback(ctx, _param, value):
if not value or ctx.resilient_parsing:
return
click.echo(get_version_information())
ctx.exit()
version = click.option(
"--version",
"-V",
"-v",
callback=_version_callback,
envvar=None,
expose_value=False,
help="Show version information and exit",
is_eager=True,
help="Show version information",
is_flag=True,
)
version_check = click.option(
"--version-check/--no-version-check",
envvar="DBT_VERSION_CHECK",
help="If set, ensure the installed dbt version matches the require-dbt-version specified in the dbt_project.yml file (if any). Otherwise, allow them to differ.",
help="Ensure dbt's version matches the one specified in the dbt_project.yml file ('require-dbt-version')",
default=True,
)
@@ -563,13 +362,13 @@ warn_error = click.option(
envvar="DBT_WARN_ERROR",
help="If dbt would normally warn, instead raise an exception. Examples include --select that selects nothing, deprecations, configurations with no associated models, invalid test configurations, and missing sources/refs in tests.",
default=None,
is_flag=True,
flag_value=True,
)
warn_error_options = click.option(
"--warn-error-options",
envvar="DBT_WARN_ERROR_OPTIONS",
default="{}",
default=None,
help="""If dbt would normally warn, instead raise an exception based on include/exclude configuration. Examples include --select that selects nothing, deprecations, configurations with no associated models, invalid test configurations,
and missing sources/refs in tests. This argument should be a YAML string, with keys 'include' or 'exclude'. eg. '{"include": "all", "exclude": ["NoNodesForSelectionCriteria"]}'""",
type=WarnErrorOptionsType(),
@@ -578,6 +377,13 @@ warn_error_options = click.option(
write_json = click.option(
"--write-json/--no-write-json",
envvar="DBT_WRITE_JSON",
help="Whether or not to write the manifest.json and run_results.json files to the target directory",
help="Writing the manifest and run_results.json files to disk",
default=True,
)
write_manifest = click.option(
"--write-manifest/--no-write-manifest",
envvar=None,
help="TODO: No help text currently available",
default=True,
)

View File

@@ -1,267 +0,0 @@
import dbt.tracking
from dbt.version import installed as installed_version
from dbt.adapters.factory import adapter_management, register_adapter
from dbt.flags import set_flags, get_flag_dict
from dbt.cli.exceptions import (
ExceptionExit,
ResultExit,
)
from dbt.cli.flags import Flags
from dbt.config import RuntimeConfig
from dbt.config.runtime import load_project, load_profile, UnsetProfile
from dbt.events.functions import fire_event, LOG_VERSION, set_invocation_id, setup_event_logger
from dbt.events.types import (
CommandCompleted,
MainReportVersion,
MainReportArgs,
MainTrackingUserState,
)
from dbt.events.helpers import get_json_string_utcnow
from dbt.events.types import MainEncounteredError, MainStackTrace
from dbt.exceptions import Exception as DbtException, DbtProjectError, FailFastError
from dbt.parser.manifest import ManifestLoader, write_manifest
from dbt.profiler import profiler
from dbt.tracking import active_user, initialize_from_flags, track_run
from dbt.utils import cast_dict_to_dict_of_strings
from dbt.plugins import set_up_plugin_manager, get_plugin_manager
from click import Context
from functools import update_wrapper
import time
import traceback
def preflight(func):
def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)
ctx.obj = ctx.obj or {}
# Flags
flags = Flags(ctx)
ctx.obj["flags"] = flags
set_flags(flags)
# Logging
callbacks = ctx.obj.get("callbacks", [])
set_invocation_id()
setup_event_logger(flags=flags, callbacks=callbacks)
# Tracking
initialize_from_flags(flags.SEND_ANONYMOUS_USAGE_STATS, flags.PROFILES_DIR)
ctx.with_resource(track_run(run_command=flags.WHICH))
# Now that we have our logger, fire away!
fire_event(MainReportVersion(version=str(installed_version), log_version=LOG_VERSION))
flags_dict_str = cast_dict_to_dict_of_strings(get_flag_dict())
fire_event(MainReportArgs(args=flags_dict_str))
# Deprecation warnings
flags.fire_deprecations()
if active_user is not None: # mypy appeasement, always true
fire_event(MainTrackingUserState(user_state=active_user.state()))
# Profiling
if flags.RECORD_TIMING_INFO:
ctx.with_resource(profiler(enable=True, outfile=flags.RECORD_TIMING_INFO))
# Adapter management
ctx.with_resource(adapter_management())
return func(*args, **kwargs)
return update_wrapper(wrapper, func)
def postflight(func):
"""The decorator that handles all exception handling for the click commands.
This decorator must be used before any other decorators that may throw an exception."""
def wrapper(*args, **kwargs):
ctx = args[0]
start_func = time.perf_counter()
success = False
try:
result, success = func(*args, **kwargs)
except FailFastError as e:
fire_event(MainEncounteredError(exc=str(e)))
raise ResultExit(e.result)
except DbtException as e:
fire_event(MainEncounteredError(exc=str(e)))
raise ExceptionExit(e)
except BaseException as e:
fire_event(MainEncounteredError(exc=str(e)))
fire_event(MainStackTrace(stack_trace=traceback.format_exc()))
raise ExceptionExit(e)
finally:
fire_event(
CommandCompleted(
command=ctx.command_path,
success=success,
completed_at=get_json_string_utcnow(),
elapsed=time.perf_counter() - start_func,
)
)
if not success:
raise ResultExit(result)
return (result, success)
return update_wrapper(wrapper, func)
# TODO: UnsetProfile is necessary for deps and clean to load a project.
# This decorator and its usage can be removed once https://github.com/dbt-labs/dbt-core/issues/6257 is closed.
def unset_profile(func):
def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)
profile = UnsetProfile()
ctx.obj["profile"] = profile
return func(*args, **kwargs)
return update_wrapper(wrapper, func)
def profile(func):
def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)
flags = ctx.obj["flags"]
# TODO: Generalize safe access to flags.THREADS:
# https://github.com/dbt-labs/dbt-core/issues/6259
threads = getattr(flags, "THREADS", None)
profile = load_profile(flags.PROJECT_DIR, flags.VARS, flags.PROFILE, flags.TARGET, threads)
ctx.obj["profile"] = profile
return func(*args, **kwargs)
return update_wrapper(wrapper, func)
def project(func):
def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)
# TODO: Decouple target from profile, and remove the need for profile here:
# https://github.com/dbt-labs/dbt-core/issues/6257
if not ctx.obj.get("profile"):
raise DbtProjectError("profile required for project")
flags = ctx.obj["flags"]
project = load_project(
flags.PROJECT_DIR, flags.VERSION_CHECK, ctx.obj["profile"], flags.VARS
)
ctx.obj["project"] = project
# Plugins
set_up_plugin_manager(project_name=project.project_name)
if dbt.tracking.active_user is not None:
project_id = None if project is None else project.hashed_name()
dbt.tracking.track_project_id({"project_id": project_id})
return func(*args, **kwargs)
return update_wrapper(wrapper, func)
def runtime_config(func):
"""A decorator used by click command functions for generating a runtime
config given a profile and project.
"""
def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)
req_strs = ["profile", "project"]
reqs = [ctx.obj.get(req_str) for req_str in req_strs]
if None in reqs:
raise DbtProjectError("profile and project required for runtime_config")
config = RuntimeConfig.from_parts(
ctx.obj["project"],
ctx.obj["profile"],
ctx.obj["flags"],
)
ctx.obj["runtime_config"] = config
if dbt.tracking.active_user is not None:
adapter_type = (
getattr(config.credentials, "type", None)
if hasattr(config, "credentials")
else None
)
adapter_unique_id = (
config.credentials.hashed_unique_field()
if hasattr(config, "credentials")
else None
)
dbt.tracking.track_adapter_info(
{
"adapter_type": adapter_type,
"adapter_unique_id": adapter_unique_id,
}
)
return func(*args, **kwargs)
return update_wrapper(wrapper, func)
def manifest(*args0, write=True, write_perf_info=False):
"""A decorator used by click command functions for generating a manifest
given a profile, project, and runtime config. This also registers the adapter
from the runtime config and conditionally writes the manifest to disk.
"""
def outer_wrapper(func):
def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)
req_strs = ["profile", "project", "runtime_config"]
reqs = [ctx.obj.get(dep) for dep in req_strs]
if None in reqs:
raise DbtProjectError("profile, project, and runtime_config required for manifest")
runtime_config = ctx.obj["runtime_config"]
register_adapter(runtime_config)
# a manifest has already been set on the context, so don't overwrite it
if ctx.obj.get("manifest") is None:
manifest = ManifestLoader.get_full_manifest(
runtime_config,
write_perf_info=write_perf_info,
)
ctx.obj["manifest"] = manifest
if write and ctx.obj["flags"].write_json:
write_manifest(manifest, runtime_config.project_target_path)
pm = get_plugin_manager(runtime_config.project_name)
plugin_artifacts = pm.get_manifest_artifacts(manifest)
for path, plugin_artifact in plugin_artifacts.items():
plugin_artifact.write(path)
return func(*args, **kwargs)
return update_wrapper(wrapper, func)
# if there are no args, the decorator was used without params @decorator
# otherwise, the decorator was called with params @decorator(arg)
if len(args0) == 0:
return outer_wrapper
return outer_wrapper(args0[0])
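A minimal standalone sketch of the dual-mode pattern `manifest` uses above: the bare form receives the decorated function as the lone positional argument, while the parametrized form returns the real decorator.
from functools import update_wrapper

def demo(*args0, flag=True):
    def outer_wrapper(func):
        def wrapper(*args, **kwargs):
            print(f"flag={flag}")
            return func(*args, **kwargs)
        return update_wrapper(wrapper, func)
    if len(args0) == 0:
        return outer_wrapper        # called as @demo(flag=False)
    return outer_wrapper(args0[0])  # used bare as @demo

@demo
def with_default():
    ...

@demo(flag=False)
def with_param():
    ...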

View File

@@ -1,31 +1,11 @@
from pathlib import Path
from dbt.config.project import PartialProject
from dbt.exceptions import DbtProjectError
def default_project_dir() -> Path:
def default_project_dir():
paths = list(Path.cwd().parents)
paths.insert(0, Path.cwd())
return next((x for x in paths if (x / "dbt_project.yml").exists()), Path.cwd())
def default_profiles_dir() -> Path:
def default_profiles_dir():
return Path.cwd() if (Path.cwd() / "profiles.yml").exists() else Path.home() / ".dbt"
def default_log_path(project_dir: Path, verify_version: bool = False) -> Path:
"""If available, derive a default log path from dbt_project.yml. Otherwise, default to "logs".
Known limitations:
1. Using PartialProject here, so no jinja rendering of log-path.
2. Programmatic invocations of the cli via dbtRunner may pass a Project object directly,
which is not being taken into consideration here to extract a log-path.
"""
default_log_path = Path("logs")
try:
partial = PartialProject.from_project_root(str(project_dir), verify_version=verify_version)
partial_log_path = partial.project_dict.get("log-path") or default_log_path
default_log_path = Path(project_dir) / partial_log_path
except DbtProjectError:
pass
return default_log_path
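A short usage sketch (illustrative, not part of this file): resolution is purely filesystem-based.
project_dir = default_project_dir()    # nearest of cwd and its parents holding dbt_project.yml
profiles_dir = default_profiles_dir()  # cwd if it holds profiles.yml, else ~/.dbt
log_path = default_log_path(project_dir)  # <project>/<log-path> when dbt_project.yml parses, else "logs"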

View File

@@ -1,40 +0,0 @@
from enum import Enum
from typing import List
from dbt.exceptions import DbtInternalError
class Command(Enum):
BUILD = "build"
CLEAN = "clean"
COMPILE = "compile"
CLONE = "clone"
DOCS_GENERATE = "generate"
DOCS_SERVE = "serve"
DEBUG = "debug"
DEPS = "deps"
INIT = "init"
LIST = "list"
PARSE = "parse"
RUN = "run"
RUN_OPERATION = "run-operation"
SEED = "seed"
SHOW = "show"
SNAPSHOT = "snapshot"
SOURCE_FRESHNESS = "freshness"
TEST = "test"
RETRY = "retry"
@classmethod
def from_str(cls, s: str) -> "Command":
try:
return cls(s)
except ValueError:
raise DbtInternalError(f"No value '{s}' exists in Command enum")
def to_list(self) -> List[str]:
return {
Command.DOCS_GENERATE: ["docs", "generate"],
Command.DOCS_SERVE: ["docs", "serve"],
Command.SOURCE_FRESHNESS: ["source", "freshness"],
}.get(self, [self.value])
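A small usage sketch of the enum above:
assert Command.from_str("run").to_list() == ["run"]
assert Command.from_str("generate") is Command.DOCS_GENERATE
assert Command.DOCS_GENERATE.to_list() == ["docs", "generate"]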

View File

@@ -40,7 +40,7 @@ from dbt.exceptions import (
UndefinedCompilationError,
UndefinedMacroError,
)
from dbt.flags import get_flags
from dbt import flags
from dbt.node_types import ModelLanguage
@@ -99,9 +99,8 @@ class MacroFuzzEnvironment(jinja2.sandbox.SandboxedEnvironment):
If the value is 'write', also write the files to disk.
WARNING: This can write a ton of data if you aren't careful.
"""
macro_debugging = get_flags().MACRO_DEBUGGING
if filename == "<template>" and macro_debugging:
write = macro_debugging == "write"
if filename == "<template>" and flags.MACRO_DEBUGGING:
write = flags.MACRO_DEBUGGING == "write"
filename = _linecache_inject(source, write)
return super()._compile(source, filename) # type: ignore
@@ -483,7 +482,7 @@ def get_environment(
native: bool = False,
) -> jinja2.Environment:
args: Dict[str, List[Union[str, Type[jinja2.ext.Extension]]]] = {
"extensions": ["jinja2.ext.do", "jinja2.ext.loopcontrols"]
"extensions": ["jinja2.ext.do"]
}
if capture_macros:
@@ -565,8 +564,6 @@ def _requote_result(raw_value: str, rendered: str) -> str:
# is small enough that I've just chosen the more readable option.
_HAS_RENDER_CHARS_PAT = re.compile(r"({[{%#]|[#}%]})")
_render_cache: Dict[str, Any] = dict()
def get_rendered(
string: str,
@@ -574,21 +571,15 @@ def get_rendered(
node=None,
capture_macros: bool = False,
native: bool = False,
) -> Any:
) -> str:
# performance optimization: if there are no jinja control characters in the
# string, we can just return the input. Fall back to jinja if the type is
# not a string or if native rendering is enabled (so '1' -> 1, etc...)
# If this is desirable in the native env as well, we could handle the
# native=True case by passing the input string to ast.literal_eval, like
# the native renderer does.
has_render_chars = not isinstance(string, str) or _HAS_RENDER_CHARS_PAT.search(string)
if not has_render_chars:
if not native:
return string
elif string in _render_cache:
return _render_cache[string]
if not native and isinstance(string, str) and _HAS_RENDER_CHARS_PAT.search(string) is None:
return string
template = get_template(
string,
ctx,
@@ -596,13 +587,7 @@ def get_rendered(
capture_macros=capture_macros,
native=native,
)
rendered = render_template(template, ctx, node)
if not has_render_chars and native:
_render_cache[string] = rendered
return rendered
return render_template(template, ctx, node)
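# Illustration of the fast path above (per _HAS_RENDER_CHARS_PAT): a string
# with no jinja control characters, e.g. "select 1", is returned unchanged
# without building a template, while "{{ ref('events') }}" contains "{{" and
# falls through to get_template/render_template.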
def undefined_error(msg) -> NoReturn:

View File

@@ -141,7 +141,7 @@ def statically_parse_adapter_dispatch(func_call, ctx, db_wrapper):
macro = db_wrapper.dispatch(func_name, macro_namespace=macro_namespace).macro
func_name = f"{macro.package_name}.{macro.name}"
possible_macro_calls.append(func_name)
else: # this is only for tests/unit/test_macro_calls.py
else: # this is only for test/unit/test_macro_calls.py
if macro_namespace:
packages = [macro_namespace]
else:

View File

@@ -1,31 +1,30 @@
import errno
import fnmatch
import functools
import fnmatch
import json
import os
import os.path
import re
import shutil
import stat
import subprocess
import sys
import tarfile
from pathlib import Path
from typing import Any, Callable, Dict, List, NoReturn, Optional, Tuple, Type, Union
import dbt.exceptions
import requests
import stat
from typing import Type, NoReturn, List, Optional, Dict, Any, Tuple, Callable, Union
from pathspec import PathSpec # type: ignore
from dbt.events.functions import fire_event
from dbt.events.types import (
SystemErrorRetrievingModTime,
SystemCouldNotWrite,
SystemExecutingCmd,
SystemStdOut,
SystemStdErr,
SystemReportReturnCode,
)
from dbt.exceptions import DbtInternalError
import dbt.exceptions
from dbt.utils import _connection_exception_retry as connection_exception_retry
from pathspec import PathSpec # type: ignore
if sys.platform == "win32":
from ctypes import WinDLL, c_bool
@@ -76,7 +75,11 @@ def find_matching(
relative_path = os.path.relpath(absolute_path, absolute_path_to_search)
relative_path_to_root = os.path.join(relative_path_to_search, relative_path)
modification_time = os.path.getmtime(absolute_path)
modification_time = 0.0
try:
modification_time = os.path.getmtime(absolute_path)
except OSError:
fire_event(SystemErrorRetrievingModTime(path=absolute_path))
if reobj.match(local_file) and (
not ignore_spec or not ignore_spec.match_file(relative_path_to_root)
):
@@ -103,18 +106,12 @@ def load_file_contents(path: str, strip: bool = True) -> str:
return to_return
@functools.singledispatch
def make_directory(path=None) -> None:
def make_directory(path: str) -> None:
"""
Make a directory and any intermediate directories that don't already
exist. This function handles the case where two threads try to create
a directory at once.
"""
raise DbtInternalError(f"Cannot create directory from {type(path)}")
@make_directory.register
def _(path: str) -> None:
path = convert_path(path)
if not os.path.exists(path):
# concurrent writes that try to create the same dir can fail
@@ -128,11 +125,6 @@ def _(path: str) -> None:
raise e
@make_directory.register
def _(path: Path) -> None:
path.mkdir(parents=True, exist_ok=True)
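# Usage sketch: both registered overloads resolve through the same entrypoint,
# so make_directory("target/compiled") takes the str branch (with its guard
# against concurrent creation) and make_directory(Path("target/compiled"))
# takes the Path branch; any other argument type raises DbtInternalError via
# the base dispatch above.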
def make_file(path: str, contents: str = "", overwrite: bool = False) -> bool:
"""
Make a file at `path` assuming that the directory it resides in already
@@ -211,7 +203,7 @@ def _windows_rmdir_readonly(func: Callable[[str], Any], path: str, exc: Tuple[An
def resolve_path_from_base(path_to_resolve: str, base_path: str) -> str:
"""
If path_to_resolve is a relative path, create an absolute path
If path-to_resolve is a relative path, create an absolute path
with base_path as the base.
If path_to_resolve is an absolute path or a user path (~), just
@@ -449,8 +441,8 @@ def run_cmd(cwd: str, cmd: List[str], env: Optional[Dict[str, Any]] = None) -> T
except OSError as exc:
_interpret_oserror(exc, cwd, cmd)
fire_event(SystemStdOut(bmsg=str(out)))
fire_event(SystemStdErr(bmsg=str(err)))
fire_event(SystemStdOut(bmsg=out))
fire_event(SystemStdErr(bmsg=err))
if proc.returncode != 0:
fire_event(SystemReportReturnCode(returncode=proc.returncode))

View File

@@ -1,15 +1,12 @@
import argparse
import json
import networkx as nx # type: ignore
import os
import pickle
import sqlparse
from collections import defaultdict
from typing import List, Dict, Any, Tuple, Optional
from dbt.flags import get_flags
import networkx as nx # type: ignore
import pickle
import sqlparse
from dbt import flags
from dbt.adapters.factory import get_adapter
from dbt.clients import jinja
from dbt.clients.system import make_directory
@@ -29,13 +26,12 @@ from dbt.exceptions import (
DbtRuntimeError,
)
from dbt.graph import Graph
from dbt.events.functions import fire_event, get_invocation_id
from dbt.events.types import FoundStats, Note, WritingInjectedSQLForNode
from dbt.events.functions import fire_event
from dbt.events.types import FoundStats, WritingInjectedSQLForNode
from dbt.events.contextvars import get_node_info
from dbt.node_types import NodeType, ModelLanguage
from dbt.events.format import pluralize
import dbt.tracking
import dbt.task.list as list_task
graph_file_name = "graph.gpickle"
@@ -48,12 +44,10 @@ def print_compile_stats(stats):
NodeType.Analysis: "analysis",
NodeType.Macro: "macro",
NodeType.Operation: "operation",
NodeType.Seed: "seed",
NodeType.Seed: "seed file",
NodeType.Source: "source",
NodeType.Exposure: "exposure",
NodeType.SemanticModel: "semantic model",
NodeType.Metric: "metric",
NodeType.Group: "group",
}
results = {k: 0 for k in names.keys()}
@@ -64,8 +58,7 @@ def print_compile_stats(stats):
resource_counts = {k.pluralize(): v for k, v in results.items()}
dbt.tracking.track_resource_counts(resource_counts)
# do not include resource types that are not actually defined in the project
stat_line = ", ".join([pluralize(ct, names.get(t)) for t, ct in stats.items() if t in names])
stat_line = ", ".join([pluralize(ct, names.get(t)) for t, ct in results.items() if t in names])
fire_event(FoundStats(stat_line=stat_line))
@@ -84,16 +77,14 @@ def _generate_stats(manifest: Manifest):
if _node_enabled(node):
stats[node.resource_type] += 1
# Disabled nodes don't appear in the following collections, so we don't check.
stats[NodeType.Source] += len(manifest.sources)
stats[NodeType.Exposure] += len(manifest.exposures)
stats[NodeType.Metric] += len(manifest.metrics)
stats[NodeType.Macro] += len(manifest.macros)
stats[NodeType.Group] += len(manifest.groups)
stats[NodeType.SemanticModel] += len(manifest.semantic_models)
# TODO: should we be counting dimensions + entities?
for source in manifest.sources.values():
stats[source.resource_type] += 1
for exposure in manifest.exposures.values():
stats[exposure.resource_type] += 1
for metric in manifest.metrics.values():
stats[metric.resource_type] += 1
for macro in manifest.macros.values():
stats[macro.resource_type] += 1
return stats
@@ -165,120 +156,13 @@ class Linker:
with open(outfile, "wb") as outfh:
pickle.dump(out_graph, outfh, protocol=pickle.HIGHEST_PROTOCOL)
def link_node(self, node: GraphMemberNode, manifest: Manifest):
self.add_node(node.unique_id)
for dependency in node.depends_on_nodes:
if dependency in manifest.nodes:
self.dependency(node.unique_id, (manifest.nodes[dependency].unique_id))
elif dependency in manifest.sources:
self.dependency(node.unique_id, (manifest.sources[dependency].unique_id))
elif dependency in manifest.metrics:
self.dependency(node.unique_id, (manifest.metrics[dependency].unique_id))
elif dependency in manifest.semantic_models:
self.dependency(node.unique_id, (manifest.semantic_models[dependency].unique_id))
else:
raise GraphDependencyNotFoundError(node, dependency)
def link_graph(self, manifest: Manifest):
for source in manifest.sources.values():
self.add_node(source.unique_id)
for semantic_model in manifest.semantic_models.values():
self.add_node(semantic_model.unique_id)
for node in manifest.nodes.values():
self.link_node(node, manifest)
for exposure in manifest.exposures.values():
self.link_node(exposure, manifest)
for metric in manifest.metrics.values():
self.link_node(metric, manifest)
cycle = self.find_cycles()
if cycle:
raise RuntimeError("Found a cycle: {}".format(cycle))
def add_test_edges(self, manifest: Manifest) -> None:
"""This method adds additional edges to the DAG. For a given non-test
executable node, add an edge from an upstream test to the given node if
the set of nodes the test depends on is a subset of the upstream nodes
for the given node."""
# Given a graph:
# model1 --> model2 --> model3
#   |           |
#   |           \/
#   \/        test 2
# test1
#
# Produce the following graph:
# model1 --> model2 --> model3
#   |  /\       |  /\     /\
#   |  |        \/  |      |
#   \/ |      test2 ----|  |
# test1 ----|---------------|
for node_id in self.graph:
# If node is executable (in manifest.nodes) and does _not_
# represent a test, continue.
if (
node_id in manifest.nodes
and manifest.nodes[node_id].resource_type != NodeType.Test
):
# Get *everything* upstream of the node
all_upstream_nodes = nx.traversal.bfs_tree(self.graph, node_id, reverse=True)
# Get the set of upstream nodes not including the current node.
upstream_nodes = set([n for n in all_upstream_nodes if n != node_id])
# Get all tests that depend on any upstream nodes.
upstream_tests = []
for upstream_node in upstream_nodes:
upstream_tests += _get_tests_for_node(manifest, upstream_node)
for upstream_test in upstream_tests:
# Get the set of all nodes that the test depends on
# including the upstream_node itself. This is necessary
# because tests can depend on multiple nodes (ex:
# relationship tests). Test nodes do not distinguish
# between what node the test is "testing" and what
# node(s) it depends on.
test_depends_on = set(manifest.nodes[upstream_test].depends_on_nodes)
# If the set of nodes that an upstream test depends on
# is a subset of all upstream nodes of the current node,
# add an edge from the upstream test to the current node.
if test_depends_on.issubset(upstream_nodes):
self.graph.add_edge(upstream_test, node_id, edge_type="parent_test")
def get_graph(self, manifest: Manifest) -> Graph:
self.link_graph(manifest)
return Graph(self.graph)
def get_graph_summary(self, manifest: Manifest) -> Dict[int, Dict[str, Any]]:
"""Create a smaller summary of the graph, suitable for basic diagnostics
and performance tuning. The summary includes only the edge structure,
node types, and node names. Each of the n nodes is assigned an integer
index 0, 1, 2,..., n-1 for compactness"""
graph_nodes = dict()
index_dict = dict()
for node_index, node_name in enumerate(self.graph):
index_dict[node_name] = node_index
data = manifest.expect(node_name).to_dict(omit_none=True)
graph_nodes[node_index] = {"name": node_name, "type": data["resource_type"]}
for node_index, node in graph_nodes.items():
successors = [index_dict[n] for n in self.graph.successors(node["name"])]
if successors:
node["succ"] = [index_dict[n] for n in self.graph.successors(node["name"])]
return graph_nodes
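# Shape sketch of the returned summary (illustrative ids and names):
#   {0: {"name": "model.proj.a", "type": "model", "succ": [1]},
#    1: {"name": "model.proj.b", "type": "model"}}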
class Compiler:
def __init__(self, config):
self.config = config
def initialize(self):
make_directory(self.config.project_target_path)
make_directory(self.config.target_path)
make_directory(self.config.packages_install_path)
# creates a ModelContext which is converted to
@@ -304,6 +188,62 @@ class Compiler:
relation_cls = adapter.Relation
return relation_cls.add_ephemeral_prefix(name)
def _inject_ctes_into_sql(self, sql: str, ctes: List[InjectedCTE]) -> str:
"""
`ctes` is a list of InjectedCTEs like:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__cte__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__cte__events as (select id, type from events)",
),
]
Given `sql` like:
"with internal_cte as (select * from sessions)
select * from internal_cte"
This will spit out:
"with __dbt__cte__ephemeral as (select * from table),
__dbt__cte__events as (select id, type from events),
internal_cte as (select * from sessions)
select * from internal_cte"
(Whitespace enhanced for readability.)
"""
if len(ctes) == 0:
return sql
parsed_stmts = sqlparse.parse(sql)
parsed = parsed_stmts[0]
with_stmt = None
for token in parsed.tokens:
if token.is_keyword and token.normalized == "WITH":
with_stmt = token
break
if with_stmt is None:
# no with stmt, add one, and inject CTEs right at the beginning
first_token = parsed.token_first()
with_stmt = sqlparse.sql.Token(sqlparse.tokens.Keyword, "with")
parsed.insert_before(first_token, with_stmt)
else:
# stmt exists, add a comma (which will come after injected CTEs)
trailing_comma = sqlparse.sql.Token(sqlparse.tokens.Punctuation, ",")
parsed.insert_after(with_stmt, trailing_comma)
token = sqlparse.sql.Token(sqlparse.tokens.Keyword, ", ".join(c.sql for c in ctes))
parsed.insert_after(with_stmt, token)
return str(parsed)
def _recursively_prepend_ctes(
self,
model: ManifestSQLNode,
@@ -378,7 +318,7 @@ class Compiler:
_add_prepended_cte(prepended_ctes, InjectedCTE(id=cte.id, sql=sql))
injected_sql = inject_ctes_into_sql(
injected_sql = self._inject_ctes_into_sql(
model.compiled_code,
prepended_ctes,
)
@@ -406,6 +346,13 @@ class Compiler:
extra_context = {}
if node.language == ModelLanguage.python:
# TODO could we also 'minify' this code at all? just aesthetic, not functional
# quoting seems like something very specific to SQL so far;
# for all Python implementations we are seeing, there's no quoting.
# TODO: try to find a better way to do this
original_quoting = self.config.quoting
self.config.quoting = {key: False for key in original_quoting.keys()}
context = self._create_node_context(node, manifest, extra_context)
postfix = jinja.get_rendered(
@@ -415,6 +362,8 @@ class Compiler:
)
# we should NOT jinja render the python model's 'raw code'
node.compiled_code = f"{node.raw_code}\n\n{postfix}"
# restore quoting settings in the end since context is lazy evaluated
self.config.quoting = original_quoting
else:
context = self._create_node_context(node, manifest, extra_context)
@@ -440,62 +389,110 @@ class Compiler:
return node
# This method doesn't actually "compile" any of the nodes. That is done by the
# "compile_node" method. This method creates a Linker and builds the networkx graph,
# writes out the graph.gpickle file, and prints the stats, returning a Graph object.
def compile(self, manifest: Manifest, write=True, add_test_edges=False) -> Graph:
self.initialize()
linker = Linker()
linker.link_graph(manifest)
def write_graph_file(self, linker: Linker, manifest: Manifest):
filename = graph_file_name
graph_path = os.path.join(self.config.target_path, filename)
if flags.WRITE_JSON:
linker.write_graph(graph_path, manifest)
# Create a file containing basic information about graph structure,
# supporting diagnostics and performance analysis.
summaries: Dict = dict()
summaries["_invocation_id"] = get_invocation_id()
summaries["linked"] = linker.get_graph_summary(manifest)
def link_node(self, linker: Linker, node: GraphMemberNode, manifest: Manifest):
linker.add_node(node.unique_id)
for dependency in node.depends_on_nodes:
if dependency in manifest.nodes:
linker.dependency(node.unique_id, (manifest.nodes[dependency].unique_id))
elif dependency in manifest.sources:
linker.dependency(node.unique_id, (manifest.sources[dependency].unique_id))
elif dependency in manifest.metrics:
linker.dependency(node.unique_id, (manifest.metrics[dependency].unique_id))
else:
raise GraphDependencyNotFoundError(node, dependency)
def link_graph(self, linker: Linker, manifest: Manifest, add_test_edges: bool = False):
for source in manifest.sources.values():
linker.add_node(source.unique_id)
for node in manifest.nodes.values():
self.link_node(linker, node, manifest)
for exposure in manifest.exposures.values():
self.link_node(linker, exposure, manifest)
for metric in manifest.metrics.values():
self.link_node(linker, metric, manifest)
cycle = linker.find_cycles()
if cycle:
raise RuntimeError("Found a cycle: {}".format(cycle))
if add_test_edges:
manifest.build_parent_and_child_maps()
linker.add_test_edges(manifest)
self.add_test_edges(linker, manifest)
# Create another diagnostic summary, just as above, but this time
# including the test edges.
summaries["with_test_edges"] = linker.get_graph_summary(manifest)
def add_test_edges(self, linker: Linker, manifest: Manifest) -> None:
"""This method adds additional edges to the DAG. For a given non-test
executable node, add an edge from an upstream test to the given node if
the set of nodes the test depends on is a subset of the upstream nodes
for the given node."""
with open(
os.path.join(self.config.project_target_path, "graph_summary.json"), "w"
) as out_stream:
try:
out_stream.write(json.dumps(summaries))
except Exception as e: # This is non-essential information, so merely note failures.
fire_event(
Note(
msg=f"An error was encountered writing the graph summary information: {e}"
)
)
# Given a graph:
#   model1 --> model2 --> model3
#     |          |
#     |          \/
#     \/       test2
#   test1
#
# Produce the following graph:
#   model1 --> model2 --> model3
#     |    /\    |    /\    /\
#     |    |     \/   |     |
#     \/   |   test2 -+     |
#   test1 -+----------------+
for node_id in linker.graph:
# If the node is executable (in manifest.nodes) and does _not_
# represent a test, process it.
if (
node_id in manifest.nodes
and manifest.nodes[node_id].resource_type != NodeType.Test
):
# Get *everything* upstream of the node
all_upstream_nodes = nx.traversal.bfs_tree(linker.graph, node_id, reverse=True)
# Get the set of upstream nodes not including the current node.
upstream_nodes = set([n for n in all_upstream_nodes if n != node_id])
# Get all tests that depend on any upstream nodes.
upstream_tests = []
for upstream_node in upstream_nodes:
upstream_tests += _get_tests_for_node(manifest, upstream_node)
for upstream_test in upstream_tests:
# Get the set of all nodes that the test depends on
# including the upstream_node itself. This is necessary
# because tests can depend on multiple nodes (ex:
# relationship tests). Test nodes do not distinguish
# between what node the test is "testing" and what
# node(s) it depends on.
test_depends_on = set(manifest.nodes[upstream_test].depends_on_nodes)
# If the set of nodes that an upstream test depends on
# is a subset of all upstream nodes of the current node,
# add an edge from the upstream test to the current node.
if test_depends_on.issubset(upstream_nodes):
linker.graph.add_edge(upstream_test, node_id)
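# Toy illustration of the subset rule above, in pure networkx (node names
# are invented): test1 depends only on model1, which is upstream of model2,
# so an edge test1 -> model2 is added.
import networkx as nx

g = nx.DiGraph()
g.add_edge("model1", "model2")
g.add_edge("model1", "test1")  # test1 tests model1

upstream = {n for n in nx.bfs_tree(g, "model2", reverse=True) if n != "model2"}
test_depends_on = {"model1"}
if test_depends_on.issubset(upstream):
    g.add_edge("test1", "model2")
# In dbt terms, "model2" now waits for "test1" to succeed before running.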
def compile(self, manifest: Manifest, write=True, add_test_edges=False) -> Graph:
self.initialize()
linker = Linker()
self.link_graph(linker, manifest, add_test_edges)
stats = _generate_stats(manifest)
if write:
self.write_graph_file(linker, manifest)
# Do not print these for ListTask runs
if not (
self.config.args.__class__ == argparse.Namespace
and self.config.args.cls == list_task.ListTask
):
stats = _generate_stats(manifest)
print_compile_stats(stats)
print_compile_stats(stats)
return Graph(linker.graph)
def write_graph_file(self, linker: Linker, manifest: Manifest):
filename = graph_file_name
graph_path = os.path.join(self.config.project_target_path, filename)
flags = get_flags()
if flags.WRITE_JSON:
linker.write_graph(graph_path, manifest)
# writes the "compiled_code" into the target/compiled directory
def _write_node(self, node: ManifestSQLNode) -> ManifestSQLNode:
if not node.extra_ctes_injected or node.resource_type in (
@@ -506,8 +503,9 @@ class Compiler:
fire_event(WritingInjectedSQLForNode(node_info=get_node_info()))
if node.compiled_code:
node.compiled_path = node.get_target_write_path(self.config.target_path, "compiled")
node.write_node(self.config.project_root, node.compiled_path, node.compiled_code)
node.compiled_path = node.write_node(
self.config.target_path, "compiled", node.compiled_code
)
return node
def compile_node(
@@ -529,74 +527,3 @@ class Compiler:
if write:
self._write_node(node)
return node
def inject_ctes_into_sql(sql: str, ctes: List[InjectedCTE]) -> str:
"""
`ctes` is a list of InjectedCTEs like:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__cte__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__cte__events as (select id, type from events)",
),
]
Given `sql` like:
"with internal_cte as (select * from sessions)
select * from internal_cte"
This will spit out:
"with __dbt__cte__ephemeral as (select * from table),
__dbt__cte__events as (select id, type from events),
internal_cte as (select * from sessions)
select * from internal_cte"
(Whitespace enhanced for readability.)
"""
if len(ctes) == 0:
return sql
parsed_stmts = sqlparse.parse(sql)
parsed = parsed_stmts[0]
with_stmt = None
for token in parsed.tokens:
if token.is_keyword and token.normalized == "WITH":
with_stmt = token
elif token.is_keyword and token.normalized == "RECURSIVE" and with_stmt is not None:
with_stmt = token
break
elif not token.is_whitespace and with_stmt is not None:
break
if with_stmt is None:
# no with stmt, add one, and inject CTEs right at the beginning
# [original_sql]
first_token = parsed.token_first()
with_token = sqlparse.sql.Token(sqlparse.tokens.Keyword, "with")
parsed.insert_before(first_token, with_token)
# [with][original_sql]
injected_ctes = ", ".join(c.sql for c in ctes) + " "
injected_ctes_token = sqlparse.sql.Token(sqlparse.tokens.Keyword, injected_ctes)
parsed.insert_after(with_token, injected_ctes_token)
# [with][joined_ctes][original_sql]
else:
# with stmt exists so we don't need to add one, but we do need to add a comma
# between the injected ctes and the original sql
# [with][original_sql]
injected_ctes = ", ".join(c.sql for c in ctes)
injected_ctes_token = sqlparse.sql.Token(sqlparse.tokens.Keyword, injected_ctes)
parsed.insert_after(with_stmt, injected_ctes_token)
# [with][joined_ctes][original_sql]
comma_token = sqlparse.sql.Token(sqlparse.tokens.Punctuation, ", ")
parsed.insert_after(injected_ctes_token, comma_token)
# [with][joined_ctes][, ][original_sql]
return str(parsed)
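# A runnable approximation of the injection performed above, with a tiny
# stand-in for InjectedCTE (the real class lives in dbt's contracts; this
# namedtuple is only for illustration). Requires the sqlparse package.
from collections import namedtuple
import sqlparse

InjectedCTE = namedtuple("InjectedCTE", ["id", "sql"])

def demo_inject(sql, ctes):
    parsed = sqlparse.parse(sql)[0]
    for token in parsed.tokens:
        if token.is_keyword and token.normalized == "WITH":
            # [with][joined_ctes][, ][original_sql]
            joined = sqlparse.sql.Token(
                sqlparse.tokens.Keyword, ", ".join(c.sql for c in ctes)
            )
            parsed.insert_after(token, joined)
            parsed.insert_after(
                joined, sqlparse.sql.Token(sqlparse.tokens.Punctuation, ", ")
            )
            break
    return str(parsed)

print(demo_inject(
    "with internal_cte as (select * from sessions) select * from internal_cte",
    [InjectedCTE("model.proj.eph", "__dbt__cte__eph as (select 1 as id)")],
))
# -> roughly: with __dbt__cte__eph as (select 1 as id), internal_cte as (...) ...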


@@ -1,4 +1,4 @@
# all these are just exports, they need "noqa" so flake8 will not complain.
from .profile import Profile, read_user_config # noqa
from .project import Project, IsFQNResource, PartialProject # noqa
from .runtime import RuntimeConfig # noqa
from .project import Project, IsFQNResource # noqa
from .runtime import RuntimeConfig, UnsetProfileConfig # noqa


@@ -4,7 +4,7 @@ import os
from dbt.dataclass_schema import ValidationError
from dbt.flags import get_flags
from dbt import flags
from dbt.clients.system import load_file_contents
from dbt.clients.yaml_helper import load_yaml_text
from dbt.contracts.connection import Credentials, HasCredentials
@@ -32,6 +32,22 @@ dbt encountered an error while trying to read your profiles.yml file.
"""
NO_SUPPLIED_PROFILE_ERROR = """\
dbt cannot run because no profile was specified for this dbt project.
To specify a profile for this project, add a line like this to
your dbt_project.yml file:
profile: [profile name]
Here, [profile name] should be replaced with a profile name
defined in your profiles.yml file. You can find profiles.yml here:
{profiles_file}/profiles.yml
""".format(
profiles_file=flags.DEFAULT_PROFILES_DIR
)
def read_profile(profiles_dir: str) -> Dict[str, Any]:
path = os.path.join(profiles_dir, "profiles.yml")
@@ -181,33 +197,10 @@ class Profile(HasCredentials):
args_profile_name: Optional[str],
project_profile_name: Optional[str] = None,
) -> str:
# TODO: Duplicating this method as direct copy of the implementation in dbt.cli.resolvers
# dbt.cli.resolvers implementation can't be used because it causes a circular dependency.
# This should be removed and use a safe default access on the Flags module when
# https://github.com/dbt-labs/dbt-core/issues/6259 is closed.
def default_profiles_dir():
from pathlib import Path
return Path.cwd() if (Path.cwd() / "profiles.yml").exists() else Path.home() / ".dbt"
profile_name = project_profile_name
if args_profile_name is not None:
profile_name = args_profile_name
if profile_name is None:
NO_SUPPLIED_PROFILE_ERROR = """\
dbt cannot run because no profile was specified for this dbt project.
To specify a profile for this project, add a line like this to
your dbt_project.yml file:
profile: [profile name]
Here, [profile name] should be replaced with a profile name
defined in your profiles.yml file. You can find profiles.yml here:
{profiles_file}/profiles.yml
""".format(
profiles_file=default_profiles_dir()
)
raise DbtProjectError(NO_SUPPLIED_PROFILE_ERROR)
return profile_name
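# Toy restatement of the precedence implemented above: a profile name from
# the command line beats the project's `profile:` entry, and having neither
# is an error (function and names invented for illustration).
def pick(args_profile_name, project_profile_name):
    name = args_profile_name if args_profile_name is not None else project_profile_name
    if name is None:
        raise ValueError("no profile was specified for this dbt project")
    return name

assert pick("cli_prof", "proj_prof") == "cli_prof"
assert pick(None, "proj_prof") == "proj_prof"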
@@ -408,13 +401,11 @@ defined in your profiles.yml file. You can find profiles.yml here:
)
@classmethod
def render(
def render_from_args(
cls,
args: Any,
renderer: ProfileRenderer,
project_profile_name: Optional[str],
profile_name_override: Optional[str] = None,
target_override: Optional[str] = None,
threads_override: Optional[int] = None,
) -> "Profile":
"""Given the raw profiles as read from disk and the name of the desired
profile if specified, return the profile component of the runtime
@@ -430,9 +421,10 @@ defined in your profiles.yml file. You can find profiles.yml here:
target could not be found.
:returns Profile: The new Profile object.
"""
flags = get_flags()
threads_override = getattr(args, "threads", None)
target_override = getattr(args, "target", None)
raw_profiles = read_profile(flags.PROFILES_DIR)
profile_name = cls.pick_profile_name(profile_name_override, project_profile_name)
profile_name = cls.pick_profile_name(getattr(args, "profile", None), project_profile_name)
return cls.from_raw_profiles(
raw_profiles=raw_profiles,
profile_name=profile_name,


@@ -14,9 +14,7 @@ from typing_extensions import Protocol, runtime_checkable
import os
from dbt.flags import get_flags
from dbt import deprecations
from dbt.constants import DEPENDENCIES_FILE_NAME, PACKAGES_FILE_NAME
from dbt import flags, deprecations
from dbt.clients.system import path_exists, resolve_path_from_base, load_file_contents
from dbt.clients.yaml_helper import load_yaml_text
from dbt.contracts.connection import QueryComment
@@ -38,9 +36,9 @@ from dbt.contracts.project import (
Project as ProjectContract,
SemverString,
)
from dbt.contracts.project import PackageConfig, ProjectPackageMetadata
from dbt.contracts.project import PackageConfig
from dbt.dataclass_schema import ValidationError
from .renderer import DbtProjectYamlRenderer, PackageRenderer
from .renderer import DbtProjectYamlRenderer
from .selectors import (
selector_config_from_data,
selector_data_from_root,
@@ -76,11 +74,6 @@ Validator Error:
{error}
"""
MISSING_DBT_PROJECT_ERROR = """\
No dbt_project.yml found at expected path {path}
Verify that each entry within packages.yml (and its transitive dependencies) contains a file named dbt_project.yml
"""
@runtime_checkable
class IsFQNResource(Protocol):
@@ -94,36 +87,17 @@ def _load_yaml(path):
return load_yaml_text(contents)
def package_and_project_data_from_root(project_root):
package_filepath = resolve_path_from_base(PACKAGES_FILE_NAME, project_root)
dependencies_filepath = resolve_path_from_base(DEPENDENCIES_FILE_NAME, project_root)
def package_data_from_root(project_root):
package_filepath = resolve_path_from_base("packages.yml", project_root)
packages_yml_dict = {}
dependencies_yml_dict = {}
if path_exists(package_filepath):
packages_yml_dict = _load_yaml(package_filepath) or {}
if path_exists(dependencies_filepath):
dependencies_yml_dict = _load_yaml(dependencies_filepath) or {}
if "packages" in packages_yml_dict and "packages" in dependencies_yml_dict:
msg = "The 'packages' key cannot be specified in both packages.yml and dependencies.yml"
raise DbtProjectError(msg)
if "projects" in packages_yml_dict:
msg = "The 'projects' key cannot be specified in packages.yml"
raise DbtProjectError(msg)
packages_specified_path = PACKAGES_FILE_NAME
packages_dict = {}
if "packages" in dependencies_yml_dict:
packages_dict["packages"] = dependencies_yml_dict["packages"]
packages_specified_path = DEPENDENCIES_FILE_NAME
else: # don't check for "packages" here so we capture invalid keys in packages.yml
packages_dict = packages_yml_dict
return packages_dict, packages_specified_path
packages_dict = _load_yaml(package_filepath)
else:
packages_dict = None
return packages_dict
def package_config_from_data(packages_data: Dict[str, Any]) -> PackageConfig:
def package_config_from_data(packages_data: Dict[str, Any]):
if not packages_data:
packages_data = {"packages": []}
@@ -157,10 +131,11 @@ def _all_source_paths(
analysis_paths: List[str],
macro_paths: List[str],
) -> List[str]:
paths = chain(model_paths, seed_paths, snapshot_paths, analysis_paths, macro_paths)
# Strip trailing slashes since the path is the same even though the name is not
stripped_paths = map(lambda s: s.rstrip("/"), paths)
return list(set(stripped_paths))
# We need to turn a list of lists into just a list, then convert to a set to
# get only unique elements, then back to a list
return list(
set(list(chain(model_paths, seed_paths, snapshot_paths, analysis_paths, macro_paths)))
)
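# Quick check of the trailing-slash normalization in the variant above:
# "models/" and "models" name the same directory, so both should collapse
# to one entry after stripping.
paths = ["models", "models/", "seeds"]
assert sorted({p.rstrip("/") for p in paths}) == ["models", "seeds"]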
T = TypeVar("T")
@@ -180,14 +155,16 @@ def value_or(value: Optional[T], default: T) -> T:
return value
def load_raw_project(project_root: str) -> Dict[str, Any]:
def _raw_project_from(project_root: str) -> Dict[str, Any]:
project_root = os.path.normpath(project_root)
project_yaml_filepath = os.path.join(project_root, "dbt_project.yml")
# get the project.yml contents
if not path_exists(project_yaml_filepath):
raise DbtProjectError(MISSING_DBT_PROJECT_ERROR.format(path=project_yaml_filepath))
raise DbtProjectError(
"no dbt_project.yml found at expected path {}".format(project_yaml_filepath)
)
project_dict = _load_yaml(project_yaml_filepath)
@@ -264,7 +241,6 @@ class RenderComponents:
@dataclass
class PartialProject(RenderComponents):
# This class includes the project_dict, packages_dict, selectors_dict, etc from RenderComponents
profile_name: Optional[str] = field(
metadata=dict(description="The unrendered profile name in the project, if set")
)
@@ -281,9 +257,6 @@ class PartialProject(RenderComponents):
verify_version: bool = field(
metadata=dict(description=("If True, verify the dbt version matches the required version"))
)
packages_specified_path: str = field(
metadata=dict(description="The filename where packages were specified")
)
def render_profile_name(self, renderer) -> Optional[str]:
if self.profile_name is None:
@@ -296,9 +269,7 @@ class PartialProject(RenderComponents):
) -> RenderComponents:
rendered_project = renderer.render_project(self.project_dict, self.project_root)
rendered_packages = renderer.render_packages(
self.packages_dict, self.packages_specified_path
)
rendered_packages = renderer.render_packages(self.packages_dict)
rendered_selectors = renderer.render_selectors(self.selectors_dict)
return RenderComponents(
@@ -307,7 +278,7 @@ class PartialProject(RenderComponents):
selectors_dict=rendered_selectors,
)
# Called by Project.from_project_root (not PartialProject.from_project_root!)
# Called by 'collect_parts' in RuntimeConfig
def render(self, renderer: DbtProjectYamlRenderer) -> "Project":
try:
rendered = self.get_rendered(renderer)
@@ -317,34 +288,23 @@ class PartialProject(RenderComponents):
exc.path = os.path.join(self.project_root, "dbt_project.yml")
raise
def render_package_metadata(self, renderer: PackageRenderer) -> ProjectPackageMetadata:
packages_data = renderer.render_data(self.packages_dict)
packages_config = package_config_from_data(packages_data)
if not self.project_name:
raise DbtProjectError("Package dbt_project.yml must have a name!")
return ProjectPackageMetadata(self.project_name, packages_config.packages)
def check_config_path(
self, project_dict, deprecated_path, expected_path=None, default_value=None
):
def check_config_path(self, project_dict, deprecated_path, exp_path):
if deprecated_path in project_dict:
if expected_path in project_dict:
if exp_path in project_dict:
msg = (
"{deprecated_path} and {expected_path} cannot both be defined. The "
"`{deprecated_path}` config has been deprecated in favor of `{expected_path}`. "
"{deprecated_path} and {exp_path} cannot both be defined. The "
"`{deprecated_path}` config has been deprecated in favor of `{exp_path}`. "
"Please update your `dbt_project.yml` configuration to reflect this "
"change."
)
raise DbtProjectError(
msg.format(deprecated_path=deprecated_path, expected_path=expected_path)
msg.format(deprecated_path=deprecated_path, exp_path=exp_path)
)
# this field is no longer supported, but many projects may still specify it with the
# default value, so only raise this deprecation warning if a custom value is set
if not default_value or project_dict[deprecated_path] != default_value:
kwargs = {"deprecated_path": deprecated_path}
if expected_path:
kwargs.update({"exp_path": expected_path})
deprecations.warn(f"project-config-{deprecated_path}", **kwargs)
deprecations.warn(
f"project-config-{deprecated_path}",
deprecated_path=deprecated_path,
exp_path=exp_path,
)
def create_project(self, rendered: RenderComponents) -> "Project":
unrendered = RenderComponents(
@@ -359,8 +319,6 @@ class PartialProject(RenderComponents):
self.check_config_path(rendered.project_dict, "source-paths", "model-paths")
self.check_config_path(rendered.project_dict, "data-paths", "seed-paths")
self.check_config_path(rendered.project_dict, "log-path", default_value="logs")
self.check_config_path(rendered.project_dict, "target-path", default_value="target")
try:
ProjectContract.validate(rendered.project_dict)
@@ -404,13 +362,9 @@ class PartialProject(RenderComponents):
docs_paths: List[str] = value_or(cfg.docs_paths, all_source_paths)
asset_paths: List[str] = value_or(cfg.asset_paths, [])
flags = get_flags()
flag_target_path = str(flags.TARGET_PATH) if flags.TARGET_PATH else None
target_path: str = flag_or(flag_target_path, cfg.target_path, "target")
log_path: str = str(flags.LOG_PATH)
target_path: str = flag_or(flags.TARGET_PATH, cfg.target_path, "target")
clean_targets: List[str] = value_or(cfg.clean_targets, [target_path])
log_path: str = flag_or(flags.LOG_PATH, cfg.log_path, "logs")
packages_install_path: str = value_or(cfg.packages_install_path, "dbt_packages")
# in the default case we'll populate this once we know the adapter type
# It would be nice to just pass along a Quoting here, but that would
@@ -450,7 +404,7 @@ class PartialProject(RenderComponents):
query_comment = _query_comment_from_cfg(cfg.query_comment)
packages: PackageConfig = package_config_from_data(rendered.packages_dict)
packages = package_config_from_data(rendered.packages_dict)
selectors = selector_config_from_data(rendered.selectors_dict)
manifest_selectors: Dict[str, Any] = {}
if rendered.selectors_dict and rendered.selectors_dict["selectors"]:
@@ -476,7 +430,6 @@ class PartialProject(RenderComponents):
clean_targets=clean_targets,
log_path=log_path,
packages_install_path=packages_install_path,
packages_specified_path=self.packages_specified_path,
quoting=quoting,
models=models,
on_run_start=on_run_start,
@@ -497,7 +450,6 @@ class PartialProject(RenderComponents):
config_version=cfg.config_version,
unrendered=unrendered,
project_env_vars=project_env_vars,
restrict_access=cfg.restrict_access,
)
# sanity check - this means an internal issue
project.validate()
@@ -512,13 +464,11 @@ class PartialProject(RenderComponents):
selectors_dict: Dict[str, Any],
*,
verify_version: bool = False,
packages_specified_path: str = PACKAGES_FILE_NAME,
):
"""Construct a partial project from its constituent dicts."""
project_name = project_dict.get("name")
profile_name = project_dict.get("profile")
# Create a PartialProject
return cls(
profile_name=profile_name,
project_name=project_name,
@@ -527,7 +477,6 @@ class PartialProject(RenderComponents):
packages_dict=packages_dict,
selectors_dict=selectors_dict,
verify_version=verify_version,
packages_specified_path=packages_specified_path,
)
@classmethod
@@ -535,11 +484,15 @@ class PartialProject(RenderComponents):
cls, project_root: str, *, verify_version: bool = False
) -> "PartialProject":
project_root = os.path.normpath(project_root)
project_dict = load_raw_project(project_root)
(
packages_dict,
packages_specified_path,
) = package_and_project_data_from_root(project_root)
project_dict = _raw_project_from(project_root)
config_version = project_dict.get("config-version", 1)
if config_version != 2:
raise DbtProjectError(
f"Invalid config version: {config_version}, expected 2",
path=os.path.join(project_root, "dbt_project.yml"),
)
packages_dict = package_data_from_root(project_root)
selectors_dict = selector_data_from_root(project_root)
return cls.from_dicts(
project_root=project_root,
@@ -547,7 +500,6 @@ class PartialProject(RenderComponents):
selectors_dict=selectors_dict,
packages_dict=packages_dict,
verify_version=verify_version,
packages_specified_path=packages_specified_path,
)
@@ -572,7 +524,7 @@ class VarProvider:
@dataclass
class Project:
project_name: str
version: Optional[Union[SemverString, float]]
version: Union[SemverString, float]
project_root: str
profile_name: Optional[str]
model_paths: List[str]
@@ -587,7 +539,6 @@ class Project:
clean_targets: List[str]
log_path: str
packages_install_path: str
packages_specified_path: str
quoting: Dict[str, Any]
models: Dict[str, Any]
on_run_start: List[str]
@@ -601,14 +552,13 @@ class Project:
exposures: Dict[str, Any]
vars: VarProvider
dbt_version: List[VersionSpecifier]
packages: PackageConfig
packages: Dict[str, Any]
manifest_selectors: Dict[str, Any]
selectors: SelectorConfig
query_comment: QueryComment
config_version: int
unrendered: RenderComponents
project_env_vars: Dict[str, Any]
restrict_access: bool
@property
def all_source_paths(self) -> List[str]:
@@ -677,7 +627,6 @@ class Project:
"vars": self.vars.to_dict(),
"require-dbt-version": [v.to_version_string() for v in self.dbt_version],
"config-version": self.config_version,
"restrict-access": self.restrict_access,
}
)
if self.query_comment:
@@ -694,9 +643,13 @@ class Project:
except ValidationError as e:
raise ProjectContractBrokenError(e) from e
# Called by:
# RtConfig.load_dependencies => RtConfig.load_projects => RtConfig.new_project => Project.from_project_root
# RtConfig.from_args => RtConfig.collect_parts => load_project => Project.from_project_root
@classmethod
def partial_load(cls, project_root: str, *, verify_version: bool = False) -> PartialProject:
return PartialProject.from_project_root(
project_root,
verify_version=verify_version,
)
@classmethod
def from_project_root(
cls,
@@ -705,7 +658,7 @@ class Project:
*,
verify_version: bool = False,
) -> "Project":
partial = PartialProject.from_project_root(project_root, verify_version=verify_version)
partial = cls.partial_load(project_root, verify_version=verify_version)
return partial.render(renderer)
def hashed_name(self):
@@ -734,8 +687,3 @@ class Project:
if dispatch_entry["macro_namespace"] == macro_namespace:
return dispatch_entry["search_order"]
return None
@property
def project_target_path(self):
# If target_path is absolute, project_root will not be included
return os.path.join(self.project_root, self.target_path)


@@ -1,10 +1,9 @@
from typing import Dict, Any, Tuple, Optional, Union, Callable
import re
import os
from datetime import date
from dbt.clients.jinja import get_rendered, catch_jinja
from dbt.constants import SECRET_ENV_PREFIX, DEPENDENCIES_FILE_NAME
from dbt.constants import SECRET_ENV_PREFIX
from dbt.context.target import TargetContext
from dbt.context.secret import SecretContext, SECRET_PLACEHOLDER
from dbt.context.base import BaseContext
@@ -34,10 +33,10 @@ class BaseRenderer:
return self.render_value(value, keypath)
def render_value(self, value: Any, keypath: Optional[Keypath] = None) -> Any:
# keypath is ignored (and someone who knows should explain why here)
# keypath is ignored.
# if it wasn't read as a string, ignore it
if not isinstance(value, str):
return value if not isinstance(value, date) else value.isoformat()
return value
try:
with catch_jinja():
return get_rendered(value, self.context, native=True)
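# A rough, standalone illustration of what native=True rendering means here,
# using plain Jinja2 (dbt wraps this machinery inside get_rendered; the
# NativeEnvironment below is the underlying Jinja2 feature, shown directly).
from jinja2.nativetypes import NativeEnvironment

env = NativeEnvironment()
assert env.from_string("{{ 1 + 1 }}").render() == 2  # a real int, not "2"
assert env.from_string("{{ [1, 2] | length }}").render() == 2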
@@ -108,7 +107,7 @@ class DbtProjectYamlRenderer(BaseRenderer):
if cli_vars is None:
cli_vars = {}
if profile:
self.ctx_obj = TargetContext(profile.to_target_dict(), cli_vars)
self.ctx_obj = TargetContext(profile, cli_vars)
else:
self.ctx_obj = BaseContext(cli_vars) # type:ignore
context = self.ctx_obj.to_dict()
@@ -132,15 +131,10 @@ class DbtProjectYamlRenderer(BaseRenderer):
rendered_project["project-root"] = project_root
return rendered_project
def render_packages(self, packages: Dict[str, Any], packages_specified_path: str):
def render_packages(self, packages: Dict[str, Any]):
"""Render the given packages dict"""
packages = packages or {} # Sometimes this is none in tests
package_renderer = self.get_package_renderer()
if packages_specified_path == DEPENDENCIES_FILE_NAME:
# We don't want to render the "packages" dictionary that came from dependencies.yml
return packages
else:
return package_renderer.render_data(packages)
return package_renderer.render_data(packages)
def render_selectors(self, selectors: Dict[str, Any]):
return self.render_data(selectors)
@@ -188,17 +182,7 @@ class SecretRenderer(BaseRenderer):
# First, standard Jinja rendering, with special handling for 'secret' environment variables
# "{{ env_var('DBT_SECRET_ENV_VAR') }}" -> "$$$DBT_SECRET_START$$$DBT_SECRET_ENV_{VARIABLE_NAME}$$$DBT_SECRET_END$$$"
# This prevents Jinja manipulation of secrets via macros/filters that might leak partial/modified values in logs
try:
rendered = super().render_value(value, keypath)
except Exception as ex:
if keypath and "password" in keypath:
# Passwords sometimes contain jinja-esque characters, but we
# don't want to render them if they aren't valid jinja.
rendered = value
else:
raise ex
rendered = super().render_value(value, keypath)
# Now, detect instances of the placeholder value ($$$DBT_SECRET_START...DBT_SECRET_END$$$)
# and replace them with the actual secret value
if SECRET_ENV_PREFIX in str(rendered):

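# Toy end-to-end illustration of the placeholder scheme described above. The
# prefix value and helper are illustrative stand-ins, not dbt's exact
# constants or API: env_var() first renders to an opaque placeholder, and the
# real secret is substituted only after Jinja has finished, so macros and
# filters never see or transform the raw value.
SECRET_PREFIX = "DBT_SECRET_ENV_"  # assumed placeholder prefix
PLACEHOLDER = "$$$DBT_SECRET_START$$${}$$$DBT_SECRET_END$$$"

def resolve_secrets(rendered: str, env: dict) -> str:
    for name, value in env.items():
        rendered = rendered.replace(PLACEHOLDER.format(SECRET_PREFIX + name), value)
    return rendered

assert resolve_secrets(
    "token=" + PLACEHOLDER.format(SECRET_PREFIX + "API_KEY"),
    {"API_KEY": "hunter2"},
) == "token=hunter2"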

@@ -1,7 +1,7 @@
import itertools
import os
from copy import deepcopy
from dataclasses import dataclass
from dataclasses import dataclass, field
from pathlib import Path
from typing import (
Any,
@@ -13,18 +13,17 @@ from typing import (
Optional,
Tuple,
Type,
Union,
)
from dbt.flags import get_flags
from dbt import flags
from dbt.adapters.factory import get_include_paths, get_relation_class_by_name
from dbt.config.project import load_raw_project
from dbt.contracts.connection import AdapterRequiredConfig, Credentials, HasCredentials
from dbt.config.profile import read_user_config
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
from dbt.contracts.graph.manifest import ManifestMetadata
from dbt.contracts.project import Configuration, UserConfig
from dbt.contracts.relation import ComponentName
from dbt.dataclass_schema import ValidationError
from dbt.events.functions import warn_or_error
from dbt.events.types import UnusedResourceConfigPath
from dbt.exceptions import (
ConfigContractBrokenError,
DbtProjectError,
@@ -32,47 +31,14 @@ from dbt.exceptions import (
DbtRuntimeError,
UninstalledPackagesFoundError,
)
from dbt.events.functions import warn_or_error
from dbt.events.types import UnusedResourceConfigPath
from dbt.helper_types import DictDefaultEmptyStr, FQNPath, PathSet
from .profile import Profile
from .project import Project
from .project import Project, PartialProject
from .renderer import DbtProjectYamlRenderer, ProfileRenderer
# Called by RuntimeConfig.collect_parts class method
def load_project(
project_root: str,
version_check: bool,
profile: HasCredentials,
cli_vars: Optional[Dict[str, Any]] = None,
) -> Project:
# get the project with all of the provided information
project_renderer = DbtProjectYamlRenderer(profile, cli_vars)
project = Project.from_project_root(
project_root, project_renderer, verify_version=version_check
)
# Save env_vars encountered in rendering for partial parsing
project.project_env_vars = project_renderer.ctx_obj.env_vars
return project
def load_profile(
project_root: str,
cli_vars: Dict[str, Any],
profile_name_override: Optional[str] = None,
target_override: Optional[str] = None,
threads_override: Optional[int] = None,
) -> Profile:
raw_project = load_raw_project(project_root)
raw_profile_name = raw_project.get("profile")
profile_renderer = ProfileRenderer(cli_vars)
profile_name = profile_renderer.render_value(raw_profile_name)
profile = Profile.render(
profile_renderer, profile_name, profile_name_override, target_override, threads_override
)
# Save env_vars encountered in rendering for partial parsing
profile.profile_env_vars = profile_renderer.ctx_obj.env_vars
return profile
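# Hypothetical usage of the two loaders above, following the signatures they
# declare; the project path is a placeholder and no overrides are passed.
profile = load_profile("/path/to/my_project", cli_vars={})
project = load_project(
    "/path/to/my_project", version_check=True, profile=profile, cli_vars={}
)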
from .utils import parse_cli_vars
def _project_quoting_dict(proj: Project, profile: Profile) -> Dict[ComponentName, bool]:
@@ -96,21 +62,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
def __post_init__(self):
self.validate()
@classmethod
def get_profile(
cls,
project_root: str,
cli_vars: Dict[str, Any],
args: Any,
) -> Profile:
return load_profile(
project_root,
cli_vars,
args.profile,
args.target,
args.threads,
)
# Called by 'new_project' and 'from_args'
@classmethod
def from_parts(
@@ -133,7 +84,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
.replace_dict(_project_quoting_dict(project, profile))
).to_dict(omit_none=True)
cli_vars: Dict[str, Any] = getattr(args, "vars", {})
cli_vars: Dict[str, Any] = parse_cli_vars(getattr(args, "vars", "{}"))
return cls(
project_name=project.project_name,
@@ -151,7 +102,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
clean_targets=project.clean_targets,
log_path=project.log_path,
packages_install_path=project.packages_install_path,
packages_specified_path=project.packages_specified_path,
quoting=quoting,
models=project.models,
on_run_start=project.on_run_start,
@@ -172,7 +122,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
config_version=project.config_version,
unrendered=project.unrendered,
project_env_vars=project.project_env_vars,
restrict_access=project.restrict_access,
profile_env_vars=profile.profile_env_vars,
profile_name=profile.profile_name,
target_name=profile.target_name,
@@ -200,10 +149,11 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
# load the new project and its packages. Don't pass cli variables.
renderer = DbtProjectYamlRenderer(profile)
project = Project.from_project_root(
project_root,
renderer,
verify_version=bool(getattr(self.args, "VERSION_CHECK", True)),
verify_version=bool(flags.VERSION_CHECK),
)
runtime_config = self.from_parts(
@@ -239,22 +189,66 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
except ValidationError as e:
raise ConfigContractBrokenError(e) from e
# Called by RuntimeConfig.from_args
@classmethod
def _get_rendered_profile(
cls,
args: Any,
profile_renderer: ProfileRenderer,
profile_name: Optional[str],
) -> Profile:
return Profile.render_from_args(args, profile_renderer, profile_name)
@classmethod
def collect_parts(cls: Type["RuntimeConfig"], args: Any) -> Tuple[Project, Profile]:
# profile_name from the project
project_root = args.project_dir if args.project_dir else os.getcwd()
cli_vars: Dict[str, Any] = getattr(args, "vars", {})
profile = cls.get_profile(
project_root,
cli_vars,
args,
)
flags = get_flags()
project = load_project(project_root, bool(flags.VERSION_CHECK), profile, cli_vars)
return project, profile
# Called in task/base.py, in BaseTask.from_args
cli_vars: Dict[str, Any] = parse_cli_vars(getattr(args, "vars", "{}"))
profile = cls.collect_profile(args=args)
project_renderer = DbtProjectYamlRenderer(profile, cli_vars)
project = cls.collect_project(args=args, project_renderer=project_renderer)
assert type(project) is Project
return (project, profile)
@classmethod
def collect_profile(
cls: Type["RuntimeConfig"], args: Any, profile_name: Optional[str] = None
) -> Profile:
cli_vars: Dict[str, Any] = parse_cli_vars(getattr(args, "vars", "{}"))
profile_renderer = ProfileRenderer(cli_vars)
# build the profile using the base renderer and the one fact we know
if profile_name is None:
# Note: only the named profile section is rendered here. The rest of the
# profile is ignored.
partial = cls.collect_project(args)
assert type(partial) is PartialProject
profile_name = partial.render_profile_name(profile_renderer)
profile = cls._get_rendered_profile(args, profile_renderer, profile_name)
# Save env_vars encountered in rendering for partial parsing
profile.profile_env_vars = profile_renderer.ctx_obj.env_vars
return profile
@classmethod
def collect_project(
cls: Type["RuntimeConfig"],
args: Any,
project_renderer: Optional[DbtProjectYamlRenderer] = None,
) -> Union[Project, PartialProject]:
project_root = args.project_dir if args.project_dir else os.getcwd()
version_check = bool(flags.VERSION_CHECK)
partial = Project.partial_load(project_root, verify_version=version_check)
if project_renderer is None:
return partial
else:
project = partial.render(project_renderer)
project.project_env_vars = project_renderer.ctx_obj.env_vars
return project
# Called in main.py, lib.py, task/base.py
@classmethod
def from_args(cls, args: Any) -> "RuntimeConfig":
"""Given arguments, read in dbt_project.yml from the current directory,
@@ -275,11 +269,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
)
def get_metadata(self) -> ManifestMetadata:
return ManifestMetadata(
project_name=self.project_name,
project_id=self.hashed_name(),
adapter_type=self.credentials.type,
)
return ManifestMetadata(project_id=self.hashed_name(), adapter_type=self.credentials.type)
def _get_v2_config_paths(
self,
@@ -366,7 +356,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
raise UninstalledPackagesFoundError(
count_packages_specified,
count_packages_installed,
self.packages_specified_path,
self.packages_install_path,
)
project_paths = itertools.chain(internal_packages, self._get_project_directories())
@@ -422,8 +411,8 @@ class UnsetCredentials(Credentials):
return ()
# This is used by commands which do not require
# a profile, i.e. dbt deps and clean
# This is used by UnsetProfileConfig, for commands which do
# not require a profile, i.e. dbt deps and clean
class UnsetProfile(Profile):
def __init__(self):
self.credentials = UnsetCredentials()
@@ -442,12 +431,182 @@ class UnsetProfile(Profile):
return Profile.__getattribute__(self, name)
UNUSED_RESOURCE_CONFIGURATION_PATH_MESSAGE = """\
Configuration paths exist in your dbt_project.yml file which do not \
apply to any resources.
There are {} unused configuration paths:
{}
"""
# This class is used by the dbt deps and clean commands, because they don't
# require a functioning profile.
@dataclass
class UnsetProfileConfig(RuntimeConfig):
"""This class acts a lot _like_ a RuntimeConfig, except if your profile is
missing, any access to profile members results in an exception.
"""
profile_name: str = field(repr=False)
target_name: str = field(repr=False)
def __post_init__(self):
# instead of futzing with InitVar overrides or rewriting __init__, just
# `del` the attrs we don't want users touching.
del self.profile_name
del self.target_name
# don't call super().__post_init__(), as that calls validate(), and
# this object isn't very valid
def __getattribute__(self, name):
# Override __getattribute__ to check that the attribute isn't 'banned'.
if name in {"profile_name", "target_name"}:
raise DbtRuntimeError(f'Error: disallowed attribute "{name}" - no profile!')
# avoid every attribute access triggering infinite recursion
return RuntimeConfig.__getattribute__(self, name)
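# A toy, self-contained version of the "banned attribute" trick used by
# UnsetProfileConfig above (class and message invented for illustration):
class NoProfile:
    def __getattribute__(self, name):
        if name in {"profile_name", "target_name"}:
            raise RuntimeError(f'disallowed attribute "{name}" - no profile!')
        return object.__getattribute__(self, name)

# NoProfile().profile_name  ->  RuntimeError: disallowed attribute ...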
def to_target_dict(self):
# re-override the poisoned profile behavior
return DictDefaultEmptyStr({})
def to_project_config(self, with_packages=False):
"""Return a dict representation of the config that could be written to
disk with `yaml.safe_dump` to get this configuration.
Overrides dbt.config.Project.to_project_config to omit undefined profile
attributes.
:param with_packages bool: If True, include the serialized packages
file in the root.
:returns dict: The serialized profile.
"""
result = deepcopy(
{
"name": self.project_name,
"version": self.version,
"project-root": self.project_root,
"profile": "",
"model-paths": self.model_paths,
"macro-paths": self.macro_paths,
"seed-paths": self.seed_paths,
"test-paths": self.test_paths,
"analysis-paths": self.analysis_paths,
"docs-paths": self.docs_paths,
"asset-paths": self.asset_paths,
"target-path": self.target_path,
"snapshot-paths": self.snapshot_paths,
"clean-targets": self.clean_targets,
"log-path": self.log_path,
"quoting": self.quoting,
"models": self.models,
"on-run-start": self.on_run_start,
"on-run-end": self.on_run_end,
"dispatch": self.dispatch,
"seeds": self.seeds,
"snapshots": self.snapshots,
"sources": self.sources,
"tests": self.tests,
"metrics": self.metrics,
"exposures": self.exposures,
"vars": self.vars.to_dict(),
"require-dbt-version": [v.to_version_string() for v in self.dbt_version],
"config-version": self.config_version,
}
)
if self.query_comment:
result["query-comment"] = self.query_comment.to_dict(omit_none=True)
if with_packages:
result.update(self.packages.to_dict(omit_none=True))
return result
@classmethod
def from_parts(
cls,
project: Project,
profile: Profile,
args: Any,
dependencies: Optional[Mapping[str, "RuntimeConfig"]] = None,
) -> "RuntimeConfig":
"""Instantiate a RuntimeConfig from its components.
:param profile: Ignored.
:param project: A parsed dbt Project.
:param args: The parsed command-line arguments.
:returns RuntimeConfig: The new configuration.
"""
cli_vars: Dict[str, Any] = parse_cli_vars(getattr(args, "vars", "{}"))
return cls(
project_name=project.project_name,
version=project.version,
project_root=project.project_root,
model_paths=project.model_paths,
macro_paths=project.macro_paths,
seed_paths=project.seed_paths,
test_paths=project.test_paths,
analysis_paths=project.analysis_paths,
docs_paths=project.docs_paths,
asset_paths=project.asset_paths,
target_path=project.target_path,
snapshot_paths=project.snapshot_paths,
clean_targets=project.clean_targets,
log_path=project.log_path,
packages_install_path=project.packages_install_path,
quoting=project.quoting, # we never use this anyway.
models=project.models,
on_run_start=project.on_run_start,
on_run_end=project.on_run_end,
dispatch=project.dispatch,
seeds=project.seeds,
snapshots=project.snapshots,
dbt_version=project.dbt_version,
packages=project.packages,
manifest_selectors=project.manifest_selectors,
selectors=project.selectors,
query_comment=project.query_comment,
sources=project.sources,
tests=project.tests,
metrics=project.metrics,
exposures=project.exposures,
vars=project.vars,
config_version=project.config_version,
unrendered=project.unrendered,
project_env_vars=project.project_env_vars,
profile_env_vars=profile.profile_env_vars,
profile_name="",
target_name="",
user_config=UserConfig(),
threads=getattr(args, "threads", 1),
credentials=UnsetCredentials(),
args=args,
cli_vars=cli_vars,
dependencies=dependencies,
)
@classmethod
def _get_rendered_profile(
cls,
args: Any,
profile_renderer: ProfileRenderer,
profile_name: Optional[str],
) -> Profile:
profile = UnsetProfile()
# The profile (for warehouse connection) is not needed, but we want
# to get the UserConfig, which is also in profiles.yml
user_config = read_user_config(flags.PROFILES_DIR)
profile.user_config = user_config
return profile
@classmethod
def from_args(cls: Type[RuntimeConfig], args: Any) -> "RuntimeConfig":
"""Given arguments, read in dbt_project.yml from the current directory,
read in packages.yml if it exists, and use them to find the profile to
load.
:param args: The arguments as parsed from the cli.
:raises DbtProjectError: If the project is invalid or missing.
:raises DbtProfileError: If the profile is invalid or missing.
:raises DbtValidationError: If the cli variables are invalid.
"""
project, profile = cls.collect_parts(args)
return cls.from_parts(project=project, profile=profile, args=args)
def _is_config_used(path, fqns):


@@ -21,7 +21,7 @@ The selectors.yml file in this project is malformed. Please double check
the contents of this file and fix any errors before retrying.
You can find more information on the syntax for this file here:
https://docs.getdbt.com/reference/node-selection/yaml-selectors
https://docs.getdbt.com/docs/package-management
Validator Error:
{error}


@@ -1,7 +1,12 @@
from typing import Any, Dict
from argparse import Namespace
from typing import Any, Dict, Optional, Union
from xmlrpc.client import Boolean
from dbt.contracts.project import UserConfig
import dbt.flags as flags
from dbt.clients import yaml_helper
from dbt.config import Profile, Project, read_user_config
from dbt.config.renderer import DbtProjectYamlRenderer, ProfileRenderer
from dbt.events.functions import fire_event
from dbt.events.types import InvalidOptionYAML
from dbt.exceptions import DbtValidationError, OptionNotYamlDictError
@@ -19,6 +24,52 @@ def parse_cli_yaml_string(var_string: str, cli_option_name: str) -> Dict[str, An
return cli_vars
else:
raise OptionNotYamlDictError(var_type, cli_option_name)
except (DbtValidationError, OptionNotYamlDictError):
except DbtValidationError:
fire_event(InvalidOptionYAML(option_name=cli_option_name))
raise
def get_project_config(
project_path: str,
profile_name: str,
args: Namespace = Namespace(),
cli_vars: Optional[Dict[str, Any]] = None,
profile: Optional[Profile] = None,
user_config: Optional[UserConfig] = None,
return_dict: Boolean = True,
) -> Union[Project, Dict]:
"""Returns a project config (dict or object) from a given project path and profile name.
Args:
project_path: Path to project
profile_name: Name of profile
args: An argparse.Namespace that represents what would have been passed in on the
command line (optional)
cli_vars: A dict of any vars that would have been passed in on the command line (optional)
(see parse_cli_vars above for formatting details)
profile: A dbt.config.profile.Profile object (optional)
user_config: A dbt.contracts.project.UserConfig object (optional)
return_dict: Return a dict if True; otherwise return the full dbt.config.project.Project object
Returns:
A full project config
"""
# Generate a profile if not provided
if profile is None:
# Generate user_config if not provided
if user_config is None:
user_config = read_user_config(flags.PROFILES_DIR)
# Update flags
flags.set_from_args(args, user_config)
if cli_vars is None:
cli_vars = {}
profile = Profile.render_from_args(args, ProfileRenderer(cli_vars), profile_name)
# Generate a project
project = Project.from_project_root(
project_path,
DbtProjectYamlRenderer(profile),
verify_version=bool(flags.VERSION_CHECK),
)
# Return
return project.to_project_config() if return_dict else project
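# Hypothetical usage of get_project_config above; the path and profile name
# are placeholders, and the defaults return a plain dict serialization.
config_dict = get_project_config(
    project_path="/path/to/my_project",
    profile_name="my_profile",
)
print(config_dict["name"], config_dict["config-version"])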


@@ -8,9 +8,3 @@ MAXIMUM_SEED_SIZE_NAME = "1MB"
PIN_PACKAGE_URL = (
"https://docs.getdbt.com/docs/package-management#section-specifying-package-versions"
)
PACKAGES_FILE_NAME = "packages.yml"
DEPENDENCIES_FILE_NAME = "dependencies.yml"
MANIFEST_FILE_NAME = "manifest.json"
SEMANTIC_MANIFEST_FILE_NAME = "semantic_manifest.json"
PARTIAL_PARSE_FILE_NAME = "partial_parse.msgpack"


@@ -1,10 +1,8 @@
import json
import os
from typing import Any, Dict, NoReturn, Optional, Mapping, Iterable, Set, List
import threading
from dbt.flags import get_flags
import dbt.flags as flags_module
from dbt import flags
from dbt import tracking
from dbt import utils
from dbt.clients.jinja import get_rendered
@@ -597,11 +595,6 @@ class BaseContext(metaclass=ContextMeta):
"""
return get_invocation_id()
@contextproperty
def thread_id(self) -> str:
"""thread_id outputs an ID for the current thread (useful for auditing)"""
return threading.current_thread().name
@contextproperty
def modules(self) -> Dict[str, Any]:
"""The `modules` variable in the Jinja context contains useful Python
@@ -642,7 +635,7 @@ class BaseContext(metaclass=ContextMeta):
This supports all flags defined in flags submodule (core/dbt/flags.py)
"""
return flags_module.get_flag_obj()
return flags.get_flag_obj()
@contextmember
@staticmethod
@@ -658,7 +651,7 @@ class BaseContext(metaclass=ContextMeta):
{% endmacro %}"
"""
if get_flags().PRINT:
if not flags.NO_PRINT:
print(msg)
return ""


@@ -16,8 +16,7 @@ class ConfiguredContext(TargetContext):
config: AdapterRequiredConfig
def __init__(self, config: AdapterRequiredConfig) -> None:
super().__init__(config.to_target_dict(), config.cli_vars)
self.config = config
super().__init__(config, config.cli_vars)
@contextproperty
def project_name(self) -> str:
@@ -52,11 +51,10 @@ class ConfiguredVar(Var):
adapter_type = self._config.credentials.type
lookup = FQNLookup(self._project_name)
active_vars = self._config.vars.vars_for(lookup, adapter_type)
all_vars = MultiDict([active_vars])
all_vars = MultiDict()
if self._config.project_name != my_config.project_name:
all_vars.add(my_config.vars.vars_for(lookup, adapter_type))
all_vars.add(active_vars)
if var_name in all_vars:
return all_vars[var_name]
@@ -119,9 +117,7 @@ class MacroResolvingContext(ConfiguredContext):
def generate_schema_yml_context(
config: AdapterRequiredConfig,
project_name: str,
schema_yaml_vars: Optional[SchemaYamlVars] = None,
config: AdapterRequiredConfig, project_name: str, schema_yaml_vars: SchemaYamlVars = None
) -> Dict[str, Any]:
ctx = SchemaYamlContext(config, project_name, schema_yaml_vars)
return ctx.to_dict()


@@ -1,7 +1,7 @@
from abc import abstractmethod
from copy import deepcopy
from dataclasses import dataclass
from typing import List, Iterator, Dict, Any, TypeVar, Generic, Optional
from typing import List, Iterator, Dict, Any, TypeVar, Generic
from dbt.config import RuntimeConfig, Project, IsFQNResource
from dbt.contracts.graph.model_config import BaseConfig, get_config_for, _listify
@@ -130,7 +130,7 @@ class BaseContextConfigGenerator(Generic[T]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: Optional[Dict[str, Any]] = None,
patch_config_dict: Dict[str, Any] = None,
) -> BaseConfig:
own_config = self.get_node_project(project_name)
@@ -166,7 +166,7 @@ class BaseContextConfigGenerator(Generic[T]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: Optional[Dict[str, Any]] = None,
patch_config_dict: Dict[str, Any],
) -> Dict[str, Any]:
...
@@ -200,7 +200,7 @@ class ContextConfigGenerator(BaseContextConfigGenerator[C]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: Optional[dict] = None,
patch_config_dict: dict = None,
) -> Dict[str, Any]:
config = self.calculate_node_config(
config_call_dict=config_call_dict,
@@ -225,7 +225,7 @@ class UnrenderedConfigGenerator(BaseContextConfigGenerator[Dict[str, Any]]):
resource_type: NodeType,
project_name: str,
base: bool,
patch_config_dict: Optional[dict] = None,
patch_config_dict: dict = None,
) -> Dict[str, Any]:
# TODO CT-211
return self.calculate_node_config(
@@ -318,11 +318,7 @@ class ContextConfig:
config_call_dict[k] = v
def build_config_dict(
self,
base: bool = False,
*,
rendered: bool = True,
patch_config_dict: Optional[dict] = None,
self, base: bool = False, *, rendered: bool = True, patch_config_dict: dict = None
) -> Dict[str, Any]:
if rendered:
# TODO CT-211


@@ -23,9 +23,6 @@ from dbt.exceptions import (
PropertyYMLError,
NotImplementedError,
RelationWrongTypeError,
ContractError,
ColumnTypeMissingError,
FailFastError,
)
@@ -68,10 +65,6 @@ def raise_compiler_error(msg, node=None) -> NoReturn:
raise CompilationError(msg, node)
def raise_contract_error(yaml_columns, sql_columns) -> NoReturn:
raise ContractError(yaml_columns, sql_columns)
def raise_database_error(msg, node=None) -> NoReturn:
raise DbtDatabaseError(msg, node)
@@ -104,14 +97,6 @@ def relation_wrong_type(relation, expected_type, model=None) -> NoReturn:
raise RelationWrongTypeError(relation, expected_type, model)
def column_type_missing(column_names) -> NoReturn:
raise ColumnTypeMissingError(column_names)
def raise_fail_fast_error(msg, node=None) -> NoReturn:
raise FailFastError(msg, node=node)
# Update this when a new function should be added to the
# dbt context's `exceptions` key!
CONTEXT_EXPORTS = {
@@ -134,9 +119,6 @@ CONTEXT_EXPORTS = {
raise_invalid_property_yml_version,
raise_not_implemented,
relation_wrong_type,
raise_contract_error,
column_type_missing,
raise_fail_fast_error,
]
}


@@ -32,16 +32,13 @@ from dbt.contracts.graph.manifest import Manifest, Disabled
from dbt.contracts.graph.nodes import (
Macro,
Exposure,
Metric,
SeedNode,
SourceDefinition,
Resource,
ManifestNode,
RefArgs,
AccessType,
SemanticModel,
)
from dbt.contracts.graph.metrics import MetricReference, ResolvedMetricReference
from dbt.contracts.graph.unparsed import NodeVersion
from dbt.events.functions import get_metadata_vars
from dbt.exceptions import (
CompilationError,
@@ -55,7 +52,6 @@ from dbt.exceptions import (
LoadAgateTableNotSeedError,
LoadAgateTableValueError,
MacroDispatchArgError,
MacroResultAlreadyLoadedError,
MacrosSourcesUnWriteableError,
MetricArgsError,
MissingConfigError,
@@ -67,12 +63,11 @@ from dbt.exceptions import (
DbtRuntimeError,
TargetNotFoundError,
DbtValidationError,
DbtReferenceError,
)
from dbt.config import IsFQNResource
from dbt.node_types import NodeType, ModelLanguage
from dbt.utils import merge, AttrDict, MultiDict, args_to_dict, cast_to_str
from dbt.utils import merge, AttrDict, MultiDict, args_to_dict
from dbt import selected_resources
@@ -133,25 +128,6 @@ class BaseDatabaseWrapper:
search_prefixes = get_adapter_type_names(self._adapter.type()) + ["default"]
return search_prefixes
def _get_search_packages(self, namespace: Optional[str] = None) -> List[Optional[str]]:
search_packages: List[Optional[str]] = [None]
if namespace is None:
search_packages = [None]
elif isinstance(namespace, str):
macro_search_order = self._adapter.config.get_macro_search_order(namespace)
if macro_search_order:
search_packages = macro_search_order
elif not macro_search_order and namespace in self._adapter.config.dependencies:
search_packages = [self.config.project_name, namespace]
else:
raise CompilationError(
f"In adapter.dispatch, got a {type(namespace)} macro_namespace argument "
f'("{namespace}"), but macro_namespace should be None or a string.'
)
return search_packages
def dispatch(
self,
macro_name: str,
@@ -173,7 +149,20 @@ class BaseDatabaseWrapper:
if packages is not None:
raise MacroDispatchArgError(macro_name)
search_packages = self._get_search_packages(macro_namespace)
namespace = macro_namespace
if namespace is None:
search_packages = [None]
elif isinstance(namespace, str):
search_packages = self._adapter.config.get_macro_search_order(namespace)
if not search_packages and namespace in self._adapter.config.dependencies:
search_packages = [self.config.project_name, namespace]
else:
# Not a string and not None so must be a list
raise CompilationError(
f"In adapter.dispatch, got a list macro_namespace argument "
f'("{macro_namespace}"), but macro_namespace should be None or a string.'
)
attempts = []
@@ -197,7 +186,7 @@ class BaseDatabaseWrapper:
return macro
searched = ", ".join(repr(a) for a in attempts)
msg = f"In dispatch: No macro named '{macro_name}' found within namespace: '{macro_namespace}'\n Searched for: {searched}"
msg = f"In dispatch: No macro named '{macro_name}' found\n Searched for: {searched}"
raise CompilationError(msg)
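# A toy restatement of the namespace-resolution branch above (all names
# invented; this is not dbt's API, just the decision logic in isolation):
def search_packages(namespace, macro_search_order, dependencies, project_name):
    if namespace is None:
        return [None]
    if isinstance(namespace, str):
        if macro_search_order:
            return macro_search_order
        if namespace in dependencies:
            return [project_name, namespace]
        return [None]  # unknown namespace: fall back to the default search
    raise TypeError("macro_namespace should be None or a string")

# search_packages("dbt_utils", [], {"dbt_utils"}, "my_project")
# -> ["my_project", "dbt_utils"]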
@@ -223,17 +212,16 @@ class BaseResolver(metaclass=abc.ABCMeta):
class BaseRefResolver(BaseResolver):
@abc.abstractmethod
def resolve(
self, name: str, package: Optional[str] = None, version: Optional[NodeVersion] = None
) -> RelationProxy:
def resolve(self, name: str, package: Optional[str] = None) -> RelationProxy:
...
def _repack_args(
self, name: str, package: Optional[str], version: Optional[NodeVersion]
) -> RefArgs:
return RefArgs(package=package, name=name, version=version)
def _repack_args(self, name: str, package: Optional[str]) -> List[str]:
if package is None:
return [name]
else:
return [package, name]
def validate_args(self, name: str, package: Optional[str], version: Optional[NodeVersion]):
def validate_args(self, name: str, package: Optional[str]):
if not isinstance(name, str):
raise CompilationError(
f"The name argument to ref() must be a string, got {type(name)}"
@@ -244,15 +232,9 @@ class BaseRefResolver(BaseResolver):
f"The package argument to ref() must be a string or None, got {type(package)}"
)
if version is not None and not isinstance(version, (str, int, float)):
raise CompilationError(
f"The version argument to ref() must be a string, int, float, or None - got {type(version)}"
)
def __call__(self, *args: str, **kwargs) -> RelationProxy:
def __call__(self, *args: str) -> RelationProxy:
name: str
package: Optional[str] = None
version: Optional[NodeVersion] = None
if len(args) == 1:
name = args[0]
@@ -260,10 +242,8 @@ class BaseRefResolver(BaseResolver):
package, name = args
else:
raise RefArgsError(node=self.model, args=args)
version = kwargs.get("version") or kwargs.get("v")
self.validate_args(name, package, version)
return self.resolve(name, package, version)
self.validate_args(name, package)
return self.resolve(name, package)
class BaseSourceResolver(BaseResolver):
@@ -291,7 +271,6 @@ class BaseSourceResolver(BaseResolver):
class BaseMetricResolver(BaseResolver):
@abc.abstractmethod
def resolve(self, name: str, package: Optional[str] = None) -> MetricReference:
...
@@ -469,12 +448,9 @@ class RuntimeDatabaseWrapper(BaseDatabaseWrapper):
# `ref` implementations
class ParseRefResolver(BaseRefResolver):
def resolve(
self, name: str, package: Optional[str] = None, version: Optional[NodeVersion] = None
) -> RelationProxy:
self.model.refs.append(self._repack_args(name, package, version))
def resolve(self, name: str, package: Optional[str] = None) -> RelationProxy:
self.model.refs.append(self._repack_args(name, package))
# This is not the ref for the "name" passed in, but for the current model.
return self.Relation.create_from(self.config, self.model)
@@ -482,17 +458,10 @@ ResolveRef = Union[Disabled, ManifestNode]
class RuntimeRefResolver(BaseRefResolver):
def resolve(
self,
target_name: str,
target_package: Optional[str] = None,
target_version: Optional[NodeVersion] = None,
) -> RelationProxy:
def resolve(self, target_name: str, target_package: Optional[str] = None) -> RelationProxy:
target_model = self.manifest.resolve_ref(
self.model,
target_name,
target_package,
target_version,
self.current_project,
self.model.package_name,
)
@@ -503,32 +472,12 @@ class RuntimeRefResolver(BaseRefResolver):
target_name=target_name,
target_kind="node",
target_package=target_package,
target_version=target_version,
disabled=isinstance(target_model, Disabled),
)
elif self.manifest.is_invalid_private_ref(
self.model, target_model, self.config.dependencies
):
raise DbtReferenceError(
unique_id=self.model.unique_id,
ref_unique_id=target_model.unique_id,
access=AccessType.Private,
scope=cast_to_str(target_model.group),
)
elif self.manifest.is_invalid_protected_ref(
self.model, target_model, self.config.dependencies
):
raise DbtReferenceError(
unique_id=self.model.unique_id,
ref_unique_id=target_model.unique_id,
access=AccessType.Protected,
scope=target_model.package_name,
)
self.validate(target_model, target_name, target_package)
return self.create_relation(target_model, target_name)
self.validate(target_model, target_name, target_package, target_version)
return self.create_relation(target_model)
def create_relation(self, target_model: ManifestNode) -> RelationProxy:
def create_relation(self, target_model: ManifestNode, name: str) -> RelationProxy:
if target_model.is_ephemeral_model:
self.model.set_cte(target_model.unique_id, None)
return self.Relation.create_ephemeral_from_node(self.config, target_model)
@@ -536,14 +485,10 @@ class RuntimeRefResolver(BaseRefResolver):
return self.Relation.create_from(self.config, target_model)
def validate(
self,
resolved: ManifestNode,
target_name: str,
target_package: Optional[str],
target_version: Optional[NodeVersion],
self, resolved: ManifestNode, target_name: str, target_package: Optional[str]
) -> None:
if resolved.unique_id not in self.model.depends_on.nodes:
args = self._repack_args(target_name, target_package, target_version)
args = self._repack_args(target_name, target_package)
raise RefBadContextError(node=self.model, args=args)
@@ -553,17 +498,16 @@ class OperationRefResolver(RuntimeRefResolver):
resolved: ManifestNode,
target_name: str,
target_package: Optional[str],
target_version: Optional[NodeVersion],
) -> None:
pass
def create_relation(self, target_model: ManifestNode) -> RelationProxy:
def create_relation(self, target_model: ManifestNode, name: str) -> RelationProxy:
if target_model.is_ephemeral_model:
# In operations, we can't ref() ephemeral nodes, because
# Macros do not support set_cte
raise OperationsCannotRefEphemeralNodesError(target_model.name, node=self.model)
else:
return super().create_relation(target_model)
return super().create_relation(target_model, name)
# `source` implementations
@@ -735,7 +679,7 @@ class ProviderContext(ManifestContext):
self.config: RuntimeConfig
self.model: Union[Macro, ManifestNode] = model
super().__init__(config, manifest, model.package_name)
self.sql_results: Dict[str, Optional[AttrDict]] = {}
self.sql_results: Dict[str, AttrDict] = {}
self.context_config: Optional[ContextConfig] = context_config
self.provider: Provider = provider
self.adapter = get_adapter(self.config)
@@ -763,29 +707,12 @@ class ProviderContext(ManifestContext):
return args_to_dict(self.config.args)
@contextproperty
def _sql_results(self) -> Dict[str, Optional[AttrDict]]:
def _sql_results(self) -> Dict[str, AttrDict]:
return self.sql_results
@contextmember
def load_result(self, name: str) -> Optional[AttrDict]:
if name in self.sql_results:
# handle the special case of "main" macro
# See: https://github.com/dbt-labs/dbt-core/blob/ada8860e48b32ac712d92e8b0977b2c3c9749981/core/dbt/task/run.py#L228
if name == "main":
return self.sql_results["main"]
# handle a None, which indicates this name was populated but has since been loaded
elif self.sql_results[name] is None:
raise MacroResultAlreadyLoadedError(name)
# Handle the regular use case
else:
ret_val = self.sql_results[name]
self.sql_results[name] = None
return ret_val
else:
# Handle trying to load a result that was never stored
return None
return self.sql_results.get(name)
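The longer branch above implements load-once semantics, which the one-line `get` replaces on the other side of the diff. A self-contained sketch of the load-once behavior, with a stand-in exception class:

class MacroResultAlreadyLoadedError(Exception):
    pass

class ResultStore:
    def __init__(self):
        self.sql_results = {}

    def store_result(self, name, result):
        self.sql_results[name] = result

    def load_result(self, name):
        if name not in self.sql_results:
            return None                      # never stored
        if name == "main":
            return self.sql_results["main"]  # special case: always reloadable
        if self.sql_results[name] is None:
            raise MacroResultAlreadyLoadedError(name)
        ret_val = self.sql_results[name]
        self.sql_results[name] = None        # mark as consumed
        return ret_val

store = ResultStore()
store.store_result("my_query", {"rows": 3})
assert store.load_result("my_query") == {"rows": 3}  # first load succeeds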
@contextmember
def store_result(
@@ -841,8 +768,7 @@ class ProviderContext(ManifestContext):
# macros/source defs aren't 'writeable'.
if isinstance(self.model, (Macro, SourceDefinition)):
raise MacrosSourcesUnWriteableError(node=self.model)
self.model.build_path = self.model.get_target_write_path(self.config.target_path, "run")
self.model.write_node(self.config.project_root, self.model.build_path, payload)
self.model.build_path = self.model.write_node(self.config.target_path, "run", payload)
return ""
@contextmember
@@ -1378,30 +1304,20 @@ class ModelContext(ProviderContext):
@contextproperty
def sql(self) -> Optional[str]:
# only doing this in sql model for backward compatible
if self.model.language == ModelLanguage.sql: # type: ignore[union-attr]
# If the model is deferred and the adapter doesn't support zero-copy cloning, then select * from the prod
# relation
if getattr(self.model, "defer_relation", None):
# TODO https://github.com/dbt-labs/dbt-core/issues/7976
return f"select * from {self.model.defer_relation.relation_name or str(self.defer_relation)}" # type: ignore[union-attr]
elif getattr(self.model, "extra_ctes_injected", None):
# TODO CT-211
return self.model.compiled_code # type: ignore[union-attr]
else:
return None
else:
return None
if (
getattr(self.model, "extra_ctes_injected", None)
and self.model.language == ModelLanguage.sql # type: ignore[union-attr]
):
# TODO CT-211
return self.model.compiled_code # type: ignore[union-attr]
return None
@contextproperty
def compiled_code(self) -> Optional[str]:
if getattr(self.model, "defer_relation", None):
# TODO https://github.com/dbt-labs/dbt-core/issues/7976
return f"select * from {self.model.defer_relation.relation_name or str(self.defer_relation)}" # type: ignore[union-attr]
elif getattr(self.model, "extra_ctes_injected", None):
if getattr(self.model, "extra_ctes_injected", None):
# TODO CT-211
return self.model.compiled_code # type: ignore[union-attr]
else:
return None
return None
@contextproperty
def database(self) -> str:
@@ -1446,20 +1362,6 @@ class ModelContext(ProviderContext):
return None
return self.db_wrapper.Relation.create_from(self.config, self.model)
@contextproperty
def defer_relation(self) -> Optional[RelationProxy]:
"""
For commands which add information about this node's corresponding
production version (via a --state artifact), access the Relation
object for that stateful other
"""
if getattr(self.model, "defer_relation", None):
return self.db_wrapper.Relation.create_from_node(
self.config, self.model.defer_relation # type: ignore
)
else:
return None
# This is called by '_context_for', used in 'render_with_context'
def generate_parser_model_context(
@@ -1506,18 +1408,10 @@ def generate_runtime_macro_context(
class ExposureRefResolver(BaseResolver):
def __call__(self, *args, **kwargs) -> str:
package = None
if len(args) == 1:
name = args[0]
elif len(args) == 2:
package, name = args
else:
def __call__(self, *args) -> str:
if len(args) not in (1, 2):
raise RefArgsError(node=self.model, args=args)
version = kwargs.get("version") or kwargs.get("v")
self.model.refs.append(RefArgs(package=package, name=name, version=version))
self.model.refs.append(list(args))
return ""
@@ -1566,9 +1460,8 @@ def generate_parse_exposure(
}
# applies to SemanticModels
class SemanticModelRefResolver(BaseResolver):
def __call__(self, *args, **kwargs) -> str:
class MetricRefResolver(BaseResolver):
def __call__(self, *args) -> str:
package = None
if len(args) == 1:
name = args[0]
@@ -1576,34 +1469,35 @@ class SemanticModelRefResolver(BaseResolver):
package, name = args
else:
raise RefArgsError(node=self.model, args=args)
version = kwargs.get("version") or kwargs.get("v")
self.validate_args(name, package, version)
# "model" here is any node
self.model.refs.append(RefArgs(package=package, name=name, version=version))
self.validate_args(name, package)
self.model.refs.append(list(args))
return ""
def validate_args(self, name, package, version):
def validate_args(self, name, package):
if not isinstance(name, str):
raise ParsingError(
f"In a semantic model or metrics section in {self.model.original_file_path} "
f"In a metrics section in {self.model.original_file_path} "
"the name argument to ref() must be a string"
)
# used for semantic models
def generate_parse_semantic_models(
semantic_model: SemanticModel,
def generate_parse_metrics(
metric: Metric,
config: RuntimeConfig,
manifest: Manifest,
package_name: str,
) -> Dict[str, Any]:
project = config.load_dependencies()[package_name]
return {
"ref": SemanticModelRefResolver(
"ref": MetricRefResolver(
None,
semantic_model,
metric,
project,
manifest,
),
"metric": ParseMetricResolver(
None,
metric,
project,
manifest,
),


@@ -1,13 +1,15 @@
from typing import Any, Dict
from dbt.contracts.connection import HasCredentials
from dbt.context.base import BaseContext, contextproperty
class TargetContext(BaseContext):
# subclass is ConfiguredContext
def __init__(self, target_dict: Dict[str, Any], cli_vars: Dict[str, Any]):
def __init__(self, config: HasCredentials, cli_vars: Dict[str, Any]):
super().__init__(cli_vars=cli_vars)
self.target_dict = target_dict
self.config = config
@contextproperty
def target(self) -> Dict[str, Any]:
@@ -71,4 +73,9 @@ class TargetContext(BaseContext):
|----------|-----------|------------------------------------------|
"""
return self.target_dict
return self.config.to_target_dict()
def generate_target_context(config: HasCredentials, cli_vars: Dict[str, Any]) -> Dict[str, Any]:
ctx = TargetContext(config, cli_vars)
return ctx.to_dict()


@@ -61,6 +61,8 @@ class FilePath(dbtClassMixin):
@property
def original_file_path(self) -> str:
# this is mostly used for reporting errors. It doesn't show the project
# name, should it?
return os.path.join(self.searched_path, self.relative_path)
def seed_too_large(self) -> bool:
@@ -225,10 +227,8 @@ class SchemaSourceFile(BaseSourceFile):
sources: List[str] = field(default_factory=list)
exposures: List[str] = field(default_factory=list)
metrics: List[str] = field(default_factory=list)
groups: List[str] = field(default_factory=list)
# node patches contain models, seeds, snapshots, analyses
ndp: List[str] = field(default_factory=list)
semantic_models: List[str] = field(default_factory=list)
# any macro patches in this file by macro unique_id.
mcp: Dict[str, str] = field(default_factory=dict)
# any source patches in this file. The entries are package, name pairs


@@ -1,11 +1,9 @@
import enum
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import chain, islice
from mashumaro.mixins.msgpack import DataClassMessagePackMixin
from multiprocessing.synchronize import Lock
from typing import (
DefaultDict,
Dict,
List,
Optional,
@@ -25,24 +23,19 @@ from typing_extensions import Protocol
from uuid import UUID
from dbt.contracts.graph.nodes import (
BaseNode,
Documentation,
Exposure,
GenericTestNode,
GraphMemberNode,
Group,
Macro,
ManifestNode,
Metric,
ModelNode,
DeferRelation,
ResultNode,
SemanticModel,
Documentation,
SourceDefinition,
GenericTestNode,
Exposure,
Metric,
UnpatchedSourceDefinition,
ManifestNode,
GraphMemberNode,
ResultNode,
BaseNode,
)
from dbt.contracts.graph.unparsed import SourcePatch, NodeVersion, UnparsedVersion
from dbt.contracts.graph.manifest_upgrade import upgrade_manifest_json
from dbt.contracts.graph.unparsed import SourcePatch
from dbt.contracts.files import SourceFile, SchemaSourceFile, FileHash, AnySourceFile
from dbt.contracts.util import BaseArtifactMetadata, SourceKey, ArtifactMixin, schema_version
from dbt.dataclass_schema import dbtClassMixin
@@ -51,18 +44,15 @@ from dbt.exceptions import (
DuplicateResourceNameError,
DuplicateMacroInPackageError,
DuplicateMaterializationNameError,
AmbiguousResourceNameRefError,
)
from dbt.helper_types import PathSet
from dbt.events.functions import fire_event
from dbt.events.types import MergedFromState, UnpinnedRefNewVersionAvailable
from dbt.events.contextvars import get_node_info
from dbt.node_types import NodeType, AccessType
from dbt.flags import get_flags, MP_CONTEXT
from dbt.events.types import MergedFromState
from dbt.node_types import NodeType
from dbt import flags
from dbt import tracking
import dbt.utils
NodeEdgeMap = Dict[str, List[str]]
PackageName = str
DocName = str
@@ -154,116 +144,39 @@ class SourceLookup(dbtClassMixin):
class RefableLookup(dbtClassMixin):
# model, seed, snapshot
_lookup_types: ClassVar[set] = set(NodeType.refable())
_versioned_types: ClassVar[set] = set(NodeType.versioned())
# refables are actually unique, so the Dict[PackageName, UniqueID] will
# only ever have exactly one value, but doing 3 dict lookups instead of 1
# is not a big deal at all and retains consistency
def __init__(self, manifest: "Manifest"):
self.storage: Dict[str, Dict[PackageName, UniqueID]] = {}
self.populate(manifest)
def get_unique_id(
self,
key: str,
package: Optional[PackageName],
version: Optional[NodeVersion],
node: Optional[GraphMemberNode] = None,
):
if version:
key = f"{key}.v{version}"
def get_unique_id(self, key, package: Optional[PackageName]):
return find_unique_id_for_package(self.storage, key, package)
unique_ids = self._find_unique_ids_for_package(key, package)
if len(unique_ids) > 1:
raise AmbiguousResourceNameRefError(key, unique_ids, node)
else:
return unique_ids[0] if unique_ids else None
def find(
self,
key: str,
package: Optional[PackageName],
version: Optional[NodeVersion],
manifest: "Manifest",
source_node: Optional[GraphMemberNode] = None,
):
unique_id = self.get_unique_id(key, package, version, source_node)
def find(self, key, package: Optional[PackageName], manifest: "Manifest"):
unique_id = self.get_unique_id(key, package)
if unique_id is not None:
node = self.perform_lookup(unique_id, manifest)
# If this is an unpinned ref (no 'version' arg was passed),
# AND this is a versioned node,
# AND this ref is being resolved at runtime -- get_node_info != {}
# Only ModelNodes can be versioned.
if (
isinstance(node, ModelNode)
and version is None
and node.is_versioned
and get_node_info()
):
# Check to see if newer versions are available, and log an "FYI" if so
max_version: UnparsedVersion = max(
[
UnparsedVersion(v.version)
for v in manifest.nodes.values()
if isinstance(v, ModelNode)
and v.name == node.name
and v.version is not None
]
)
assert node.latest_version is not None # for mypy, whenever i may find it
if max_version > UnparsedVersion(node.latest_version):
fire_event(
UnpinnedRefNewVersionAvailable(
node_info=get_node_info(),
ref_node_name=node.name,
ref_node_package=node.package_name,
ref_node_version=str(node.version),
ref_max_version=str(max_version.v),
)
)
return node
return self.perform_lookup(unique_id, manifest)
return None
def add_node(self, node: ManifestNode):
if node.resource_type in self._lookup_types:
if node.name not in self.storage:
self.storage[node.name] = {}
if node.is_versioned:
if node.search_name not in self.storage:
self.storage[node.search_name] = {}
self.storage[node.search_name][node.package_name] = node.unique_id
if node.is_latest_version: # type: ignore
self.storage[node.name][node.package_name] = node.unique_id
else:
self.storage[node.name][node.package_name] = node.unique_id
self.storage[node.name][node.package_name] = node.unique_id
def populate(self, manifest):
for node in manifest.nodes.values():
self.add_node(node)
def perform_lookup(self, unique_id: UniqueID, manifest) -> ManifestNode:
if unique_id in manifest.nodes:
node = manifest.nodes[unique_id]
else:
if unique_id not in manifest.nodes:
raise dbt.exceptions.DbtInternalError(
f"Node {unique_id} found in cache but not found in manifest"
)
return node
def _find_unique_ids_for_package(self, key, package: Optional[PackageName]) -> List[str]:
if key not in self.storage:
return []
pkg_dct: Mapping[PackageName, UniqueID] = self.storage[key]
if package is None:
if not pkg_dct:
return []
else:
return list(pkg_dct.values())
elif package in pkg_dct:
return [pkg_dct[package]]
else:
return []
return manifest.nodes[unique_id]
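A hedged sketch of the versioned-ref storage that `add_node` above maintains: a versioned model is keyed by its search_name ("name.vN"), and the latest version is stored a second time under the bare name so unpinned refs resolve to it. Names here are illustrative.

storage = {}

def add(name, package, unique_id, version=None, is_latest=False):
    key = f"{name}.v{version}" if version is not None else name
    storage.setdefault(key, {})[package] = unique_id
    if version is not None and is_latest:
        storage.setdefault(name, {})[package] = unique_id  # unpinned alias

add("dim_customers", "jaffle_shop", "model.jaffle_shop.dim_customers.v1", version=1)
add("dim_customers", "jaffle_shop", "model.jaffle_shop.dim_customers.v2",
    version=2, is_latest=True)
assert storage["dim_customers"] == storage["dim_customers.v2"]  # unpinned -> latest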
class MetricLookup(dbtClassMixin):
@@ -299,49 +212,6 @@ class MetricLookup(dbtClassMixin):
return manifest.metrics[unique_id]
class SemanticModelByMeasureLookup(dbtClassMixin):
"""Lookup utility for finding SemanticModel by measure
This is possible because measure names are supposed to be unique across
the semantic models in a manifest.
"""
def __init__(self, manifest: "Manifest"):
self.storage: DefaultDict[str, Dict[PackageName, UniqueID]] = defaultdict(dict)
self.populate(manifest)
def get_unique_id(self, search_name: str, package: Optional[PackageName]):
return find_unique_id_for_package(self.storage, search_name, package)
def find(
self, search_name: str, package: Optional[PackageName], manifest: "Manifest"
) -> Optional[SemanticModel]:
"""Tries to find a SemanticModel based on a measure name"""
unique_id = self.get_unique_id(search_name, package)
if unique_id is not None:
return self.perform_lookup(unique_id, manifest)
return None
def add(self, semantic_model: SemanticModel):
"""Sets all measures for a SemanticModel as paths to the SemanticModel's `unique_id`"""
for measure in semantic_model.measures:
self.storage[measure.name][semantic_model.package_name] = semantic_model.unique_id
def populate(self, manifest: "Manifest"):
"""Populate storage with all the measure + package paths to the Manifest's SemanticModels"""
for semantic_model in manifest.semantic_models.values():
self.add(semantic_model=semantic_model)
def perform_lookup(self, unique_id: UniqueID, manifest: "Manifest") -> SemanticModel:
"""Tries to get a SemanticModel from the Manifest"""
semantic_model = manifest.semantic_models.get(unique_id)
if semantic_model is None:
raise dbt.exceptions.DbtInternalError(
f"Semantic model `{unique_id}` found in cache but not found in manifest"
)
return semantic_model
# This handles both models/seeds/snapshots and sources/metrics/exposures
class DisabledLookup(dbtClassMixin):
def __init__(self, manifest: "Manifest"):
@@ -361,12 +231,7 @@ class DisabledLookup(dbtClassMixin):
# This should return a list of disabled nodes. It's different from
# the other Lookup functions in that it returns full nodes, not just unique_ids
def find(
self, search_name, package: Optional[PackageName], version: Optional[NodeVersion] = None
):
if version:
search_name = f"{search_name}.v{version}"
def find(self, search_name, package: Optional[PackageName]):
if search_name not in self.storage:
return None
@@ -385,10 +250,9 @@ class DisabledLookup(dbtClassMixin):
class AnalysisLookup(RefableLookup):
_lookup_types: ClassVar[set] = set([NodeType.Analysis])
_versioned_types: ClassVar[set] = set()
def _packages_to_search(
def _search_packages(
current_project: str,
node_package: str,
target_package: Optional[str] = None,
@@ -408,16 +272,10 @@ class ManifestMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
default_factory=lambda: str(WritableManifest.dbt_schema_version)
)
project_name: Optional[str] = field(
default=None,
metadata={
"description": "Name of the root project",
},
)
project_id: Optional[str] = field(
default=None,
metadata={
"description": "A unique identifier for the project, hashed from the project name",
"description": "A unique identifier for the project",
},
)
user_id: Optional[UUID] = field(
@@ -445,7 +303,7 @@ class ManifestMetadata(BaseArtifactMetadata):
self.user_id = tracking.active_user.id
if self.send_anonymous_usage_stats is None:
self.send_anonymous_usage_stats = get_flags().SEND_ANONYMOUS_USAGE_STATS
self.send_anonymous_usage_stats = flags.SEND_ANONYMOUS_USAGE_STATS
@classmethod
def default(cls):
@@ -471,7 +329,7 @@ def build_node_edges(nodes: List[ManifestNode]):
forward_edges: Dict[str, List[str]] = {n.unique_id: [] for n in nodes}
for node in nodes:
backward_edges[node.unique_id] = node.depends_on_nodes[:]
for unique_id in backward_edges[node.unique_id]:
for unique_id in node.depends_on_nodes:
if unique_id in forward_edges.keys():
forward_edges[unique_id].append(node.unique_id)
return _sort_values(forward_edges), _sort_values(backward_edges)
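A self-contained illustration of `build_node_edges` above: backward edges copy each node's depends_on list, and forward edges are the reverse mapping.

depends_on = {"model.b": ["model.a"], "model.c": ["model.a", "model.b"]}
forward = {uid: [] for uid in ("model.a", "model.b", "model.c")}
backward = {}
for uid, deps in depends_on.items():
    backward[uid] = deps[:]           # backward edge: node -> its dependencies
    for dep in deps:
        if dep in forward:
            forward[dep].append(uid)  # forward edge: dependency -> dependent
assert forward["model.a"] == ["model.b", "model.c"]
assert backward["model.c"] == ["model.a", "model.b"]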
@@ -615,6 +473,25 @@ MaybeNonSource = Optional[Union[ManifestNode, Disabled[ManifestNode]]]
T = TypeVar("T", bound=GraphMemberNode)
def _update_into(dest: MutableMapping[str, T], new_item: T):
"""Update dest to overwrite whatever is at dest[new_item.unique_id] with
new_item. There must be an existing value to overwrite, and the two nodes
must have the same original file path.
"""
unique_id = new_item.unique_id
if unique_id not in dest:
raise dbt.exceptions.DbtRuntimeError(
f"got an update_{new_item.resource_type} call with an "
f"unrecognized {new_item.resource_type}: {new_item.unique_id}"
)
existing = dest[unique_id]
if new_item.original_file_path != existing.original_file_path:
raise dbt.exceptions.DbtRuntimeError(
f"cannot update a {new_item.resource_type} to have a new file path!"
)
dest[unique_id] = new_item
# This contains macro methods that are in both the Manifest
# and the MacroManifest
class MacroMethods:
@@ -647,36 +524,26 @@ class MacroMethods:
return candidates.last()
def find_generate_macro_by_name(
self, component: str, root_project_name: str, imported_package: Optional[str] = None
self, component: str, root_project_name: str
) -> Optional[Macro]:
"""
The default `generate_X_name` macros are similar to regular ones, but only
includes imported packages when searching for a package.
- if package is not provided:
The `generate_X_name` macros are similar to regular ones, but ignore
imported packages.
- if there is a `generate_{component}_name` macro in the root
project, return it
- return the `generate_{component}_name` macro from the 'dbt'
internal project
- if package is provided
- return the `generate_{component}_name` macro from the imported
package, if one exists
"""
def filter(candidate: MacroCandidate) -> bool:
if imported_package:
return (
candidate.locality == Locality.Imported
and imported_package == candidate.macro.package_name
)
else:
return candidate.locality != Locality.Imported
return candidate.locality != Locality.Imported
candidates: CandidateList = self._find_macros_by_name(
name=f"generate_{component}_name",
root_project_name=root_project_name,
# filter out imported packages
filter=filter,
)
return candidates.last()
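A self-contained sketch of the locality filter described in the docstring above: with no imported_package, imported macros are excluded; with one, only macros from exactly that package pass. MacroCandidate is simplified here to a namedtuple, which is an assumption for illustration.

from collections import namedtuple
from enum import Enum

class Locality(Enum):
    Core = 1
    Root = 2
    Imported = 3

Candidate = namedtuple("Candidate", "locality package_name")

def make_filter(imported_package=None):
    def _filter(candidate):
        if imported_package:
            return (candidate.locality is Locality.Imported
                    and candidate.package_name == imported_package)
        return candidate.locality is not Locality.Imported
    return _filter

assert make_filter()(Candidate(Locality.Root, "my_project"))
assert not make_filter()(Candidate(Locality.Imported, "dbt_utils"))
assert make_filter("dbt_utils")(Candidate(Locality.Imported, "dbt_utils"))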
def _find_macros_by_name(
@@ -732,7 +599,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
docs: MutableMapping[str, Documentation] = field(default_factory=dict)
exposures: MutableMapping[str, Exposure] = field(default_factory=dict)
metrics: MutableMapping[str, Metric] = field(default_factory=dict)
groups: MutableMapping[str, Group] = field(default_factory=dict)
selectors: MutableMapping[str, Any] = field(default_factory=dict)
files: MutableMapping[str, AnySourceFile] = field(default_factory=dict)
metadata: ManifestMetadata = field(default_factory=ManifestMetadata)
@@ -741,7 +607,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
source_patches: MutableMapping[SourceKey, SourcePatch] = field(default_factory=dict)
disabled: MutableMapping[str, List[GraphMemberNode]] = field(default_factory=dict)
env_vars: MutableMapping[str, str] = field(default_factory=dict)
semantic_models: MutableMapping[str, SemanticModel] = field(default_factory=dict)
_doc_lookup: Optional[DocLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
@@ -755,9 +620,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
_metric_lookup: Optional[MetricLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
_semantic_model_by_measure_lookup: Optional[SemanticModelByMeasureLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
_disabled_lookup: Optional[DisabledLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
@@ -769,7 +631,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
metadata={"serialize": lambda x: None, "deserialize": lambda x: None},
)
_lock: Lock = field(
default_factory=MP_CONTEXT.Lock,
default_factory=flags.MP_CONTEXT.Lock,
metadata={"serialize": lambda x: None, "deserialize": lambda x: None},
)
@@ -781,9 +643,21 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
@classmethod
def __post_deserialize__(cls, obj):
obj._lock = MP_CONTEXT.Lock()
obj._lock = flags.MP_CONTEXT.Lock()
return obj
def update_exposure(self, new_exposure: Exposure):
_update_into(self.exposures, new_exposure)
def update_metric(self, new_metric: Metric):
_update_into(self.metrics, new_metric)
def update_node(self, new_node: ManifestNode):
_update_into(self.nodes, new_node)
def update_source(self, new_source: SourceDefinition):
_update_into(self.sources, new_source)
def build_flat_graph(self):
"""This attribute is used in context.common by each node, so we want to
only build it once and avoid any concurrency issues around it.
@@ -792,13 +666,9 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
"""
self.flat_graph = {
"exposures": {k: v.to_dict(omit_none=False) for k, v in self.exposures.items()},
"groups": {k: v.to_dict(omit_none=False) for k, v in self.groups.items()},
"metrics": {k: v.to_dict(omit_none=False) for k, v in self.metrics.items()},
"nodes": {k: v.to_dict(omit_none=False) for k, v in self.nodes.items()},
"sources": {k: v.to_dict(omit_none=False) for k, v in self.sources.items()},
"semantic_models": {
k: v.to_dict(omit_none=False) for k, v in self.semantic_models.items()
},
}
def build_disabled_by_file_id(self):
@@ -859,7 +729,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.nodes.values(),
self.sources.values(),
self.metrics.values(),
self.semantic_models.values(),
)
for resource in all_resources:
resource_type_plural = resource.resource_type.pluralize()
@@ -888,13 +757,11 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
docs={k: _deepcopy(v) for k, v in self.docs.items()},
exposures={k: _deepcopy(v) for k, v in self.exposures.items()},
metrics={k: _deepcopy(v) for k, v in self.metrics.items()},
groups={k: _deepcopy(v) for k, v in self.groups.items()},
selectors={k: _deepcopy(v) for k, v in self.selectors.items()},
metadata=self.metadata,
disabled={k: _deepcopy(v) for k, v in self.disabled.items()},
files={k: _deepcopy(v) for k, v in self.files.items()},
state_check=_deepcopy(self.state_check),
semantic_models={k: _deepcopy(v) for k, v in self.semantic_models.items()},
)
copy.build_flat_graph()
return copy
@@ -906,7 +773,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.sources.values(),
self.exposures.values(),
self.metrics.values(),
self.semantic_models.values(),
)
)
forward_edges, backward_edges = build_node_edges(edge_members)
@@ -923,22 +789,8 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
forward_edges = build_macro_edges(edge_members)
return forward_edges
def build_group_map(self):
groupable_nodes = list(
chain(
self.nodes.values(),
self.metrics.values(),
)
)
group_map = {group.name: [] for group in self.groups.values()}
for node in groupable_nodes:
if node.group is not None:
group_map[node.group].append(node.unique_id)
self.group_map = group_map
def writable_manifest(self) -> "WritableManifest":
def writable_manifest(self):
self.build_parent_and_child_maps()
self.build_group_map()
return WritableManifest(
nodes=self.nodes,
sources=self.sources,
@@ -946,14 +798,11 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
docs=self.docs,
exposures=self.exposures,
metrics=self.metrics,
groups=self.groups,
selectors=self.selectors,
metadata=self.metadata,
disabled=self.disabled,
child_map=self.child_map,
parent_map=self.parent_map,
group_map=self.group_map,
semantic_models=self.semantic_models,
)
def write(self, path):
@@ -970,8 +819,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
return self.exposures[unique_id]
elif unique_id in self.metrics:
return self.metrics[unique_id]
elif unique_id in self.semantic_models:
return self.semantic_models[unique_id]
else:
# something terrible has happened
raise dbt.exceptions.DbtInternalError(
@@ -1008,13 +855,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self._metric_lookup = MetricLookup(self)
return self._metric_lookup
@property
def semantic_model_by_measure_lookup(self) -> SemanticModelByMeasureLookup:
"""Gets (and creates if necessary) the lookup utility for getting SemanticModels by measures"""
if self._semantic_model_by_measure_lookup is None:
self._semantic_model_by_measure_lookup = SemanticModelByMeasureLookup(self)
return self._semantic_model_by_measure_lookup
def rebuild_ref_lookup(self):
self._ref_lookup = RefableLookup(self)
@@ -1033,37 +873,12 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self._analysis_lookup = AnalysisLookup(self)
return self._analysis_lookup
@property
def external_node_unique_ids(self):
return [node.unique_id for node in self.nodes.values() if node.is_external_node]
def resolve_refs(
self,
source_node: ModelNode,
current_project: str, # TODO: ModelNode is overly restrictive typing
) -> List[MaybeNonSource]:
resolved_refs: List[MaybeNonSource] = []
for ref in source_node.refs:
resolved = self.resolve_ref(
source_node,
ref.name,
ref.package,
ref.version,
current_project,
source_node.package_name,
)
resolved_refs.append(resolved)
return resolved_refs
# Called by dbt.parser.manifest._process_refs_for_exposure, _process_refs_for_metric,
# Called by dbt.parser.manifest._resolve_refs_for_exposure
# and dbt.parser.manifest._process_refs_for_node
def resolve_ref(
self,
source_node: GraphMemberNode,
target_model_name: str,
target_model_package: Optional[str],
target_model_version: Optional[NodeVersion],
current_project: str,
node_package: str,
) -> MaybeNonSource:
@@ -1071,18 +886,16 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
node: Optional[ManifestNode] = None
disabled: Optional[List[ManifestNode]] = None
candidates = _packages_to_search(current_project, node_package, target_model_package)
candidates = _search_packages(current_project, node_package, target_model_package)
for pkg in candidates:
node = self.ref_lookup.find(
target_model_name, pkg, target_model_version, self, source_node
)
node = self.ref_lookup.find(target_model_name, pkg, self)
if node is not None and hasattr(node, "config") and node.config.enabled:
if node is not None and node.config.enabled:
return node
# it's possible that the node is disabled
if disabled is None:
disabled = self.disabled_lookup.find(target_model_name, pkg, target_model_version)
disabled = self.disabled_lookup.find(target_model_name, pkg)
if disabled:
return Disabled(disabled[0])
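Both names above (`_packages_to_search` / `_search_packages`) refer to the same helper. Its candidate ordering, sketched from dbt's behavior (treat the exact ordering as an assumption): an explicit target package is searched alone, and `None` in the list means "search every package".

def packages_to_search(current_project, node_package, target_package=None):
    if target_package is not None:
        return [target_package]
    if current_project == node_package:
        return [current_project, None]
    return [node_package, current_project, None]

assert packages_to_search("root", "root") == ["root", None]
assert packages_to_search("root", "dep_pkg") == ["dep_pkg", "root", None]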
@@ -1098,7 +911,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
node_package: str,
) -> MaybeParsedSource:
search_name = f"{target_source_name}.{target_table_name}"
candidates = _packages_to_search(current_project, node_package)
candidates = _search_packages(current_project, node_package)
source: Optional[SourceDefinition] = None
disabled: Optional[List[SourceDefinition]] = None
@@ -1128,7 +941,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
metric: Optional[Metric] = None
disabled: Optional[List[Metric]] = None
candidates = _packages_to_search(current_project, node_package, target_metric_package)
candidates = _search_packages(current_project, node_package, target_metric_package)
for pkg in candidates:
metric = self.metric_lookup.find(target_metric_name, pkg, self)
@@ -1142,25 +955,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
return Disabled(disabled[0])
return None
def resolve_semantic_model_for_measure(
self,
target_measure_name: str,
current_project: str,
node_package: str,
target_package: Optional[str] = None,
) -> Optional[SemanticModel]:
"""Tries to find the SemanticModel that a measure belongs to"""
candidates = _packages_to_search(current_project, node_package, target_package)
for pkg in candidates:
semantic_model = self.semantic_model_by_measure_lookup.find(
target_measure_name, pkg, self
)
if semantic_model is not None:
return semantic_model
return None
# Called by DocsRuntimeContext.doc
def resolve_doc(
self,
@@ -1173,7 +967,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
resolve_ref except the is_enabled checks are unnecessary as docs are
always enabled.
"""
candidates = _packages_to_search(current_project, node_package, package)
candidates = _search_packages(current_project, node_package, package)
for pkg in candidates:
result = self.doc_lookup.find(name, pkg, self)
@@ -1181,50 +975,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
return result
return None
def is_invalid_private_ref(
self, node: GraphMemberNode, target_model: MaybeNonSource, dependencies: Optional[Mapping]
) -> bool:
dependencies = dependencies or {}
if not isinstance(target_model, ModelNode):
return False
is_private_ref = (
target_model.access == AccessType.Private
# don't raise this reference error for ad hoc 'preview' queries
and node.resource_type != NodeType.SqlOperation
and node.resource_type != NodeType.RPCCall # TODO: rm
)
target_dependency = dependencies.get(target_model.package_name)
restrict_package_access = target_dependency.restrict_access if target_dependency else False
# TODO: SemanticModel and SourceDefinition do not have group, and so should not be able to make _any_ private ref.
return is_private_ref and (
not hasattr(node, "group")
or not node.group
or node.group != target_model.group
or restrict_package_access
)
def is_invalid_protected_ref(
self, node: GraphMemberNode, target_model: MaybeNonSource, dependencies: Optional[Mapping]
) -> bool:
dependencies = dependencies or {}
if not isinstance(target_model, ModelNode):
return False
is_protected_ref = (
target_model.access == AccessType.Protected
# don't raise this reference error for ad hoc 'preview' queries
and node.resource_type != NodeType.SqlOperation
and node.resource_type != NodeType.RPCCall # TODO: rm
)
target_dependency = dependencies.get(target_model.package_name)
restrict_package_access = target_dependency.restrict_access if target_dependency else False
return is_protected_ref and (
node.package_name != target_model.package_name and restrict_package_access
)
# Called by RunTask.defer_to_manifest
def merge_from_artifact(
self,
@@ -1262,25 +1012,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
sample = list(islice(merged, 5))
fire_event(MergedFromState(num_merged=len(merged), sample=sample))
# Called by CloneTask.defer_to_manifest
def add_from_artifact(
self,
other: "WritableManifest",
) -> None:
"""Update this manifest by *adding* information about each node's location
in the other manifest.
Only non-ephemeral refable nodes are examined.
"""
refables = set(NodeType.refable())
for unique_id, node in other.nodes.items():
current = self.nodes.get(unique_id)
if current and (node.resource_type in refables and not node.is_ephemeral):
defer_relation = DeferRelation(
node.database, node.schema, node.alias, node.relation_name
)
self.nodes[unique_id] = current.replace(defer_relation=defer_relation)
# Methods that were formerly in ParseResult
def add_macro(self, source_file: SourceFile, macro: Macro):
@@ -1321,8 +1052,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
source_file.metrics.append(node.unique_id)
if isinstance(node, Exposure):
source_file.exposures.append(node.unique_id)
if isinstance(node, Group):
source_file.groups.append(node.unique_id)
else:
source_file.nodes.append(node.unique_id)
@@ -1336,11 +1065,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.metrics[metric.unique_id] = metric
source_file.metrics.append(metric.unique_id)
def add_group(self, source_file: SchemaSourceFile, group: Group):
_check_duplicates(group, self.groups)
self.groups[group.unique_id] = group
source_file.groups.append(group.unique_id)
def add_disabled_nofile(self, node: GraphMemberNode):
# There can be multiple disabled nodes for the same unique_id
if node.unique_id in self.disabled:
@@ -1366,11 +1090,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.docs[doc.unique_id] = doc
source_file.docs.append(doc.unique_id)
def add_semantic_model(self, source_file: SchemaSourceFile, semantic_model: SemanticModel):
_check_duplicates(semantic_model, self.semantic_models)
self.semantic_models[semantic_model.unique_id] = semantic_model
source_file.semantic_models.append(semantic_model.unique_id)
# end of methods formerly in ParseResult
# Provide support for copy.deepcopy() - we just need to avoid the lock!
@@ -1388,7 +1107,6 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.docs,
self.exposures,
self.metrics,
self.groups,
self.selectors,
self.files,
self.metadata,
@@ -1397,12 +1115,10 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.source_patches,
self.disabled,
self.env_vars,
self.semantic_models,
self._doc_lookup,
self._source_lookup,
self._ref_lookup,
self._metric_lookup,
self._semantic_model_by_measure_lookup,
self._disabled_lookup,
self._analysis_lookup,
)
@@ -1422,7 +1138,7 @@ AnyManifest = Union[Manifest, MacroManifest]
@dataclass
@schema_version("manifest", 10)
@schema_version("manifest", 8)
class WritableManifest(ArtifactMixin):
nodes: Mapping[UniqueID, ManifestNode] = field(
metadata=dict(description=("The nodes defined in the dbt project and its dependencies"))
@@ -1444,9 +1160,6 @@ class WritableManifest(ArtifactMixin):
metrics: Mapping[UniqueID, Metric] = field(
metadata=dict(description=("The metrics defined in the dbt project and its dependencies"))
)
groups: Mapping[UniqueID, Group] = field(
metadata=dict(description=("The groups defined in the dbt project"))
)
selectors: Mapping[UniqueID, Any] = field(
metadata=dict(description=("The selectors defined in selectors.yml"))
)
@@ -1463,14 +1176,6 @@ class WritableManifest(ArtifactMixin):
description="A mapping from parent nodes to their dependents",
)
)
group_map: Optional[NodeEdgeMap] = field(
metadata=dict(
description="A mapping from group names to their nodes",
)
)
semantic_models: Mapping[UniqueID, SemanticModel] = field(
metadata=dict(description=("The semantic models defined in the dbt project"))
)
metadata: ManifestMetadata = field(
metadata=dict(
description="Metadata about the manifest",
@@ -1479,40 +1184,15 @@ class WritableManifest(ArtifactMixin):
@classmethod
def compatible_previous_versions(self):
return [
("manifest", 4),
("manifest", 5),
("manifest", 6),
("manifest", 7),
("manifest", 8),
("manifest", 9),
]
@classmethod
def upgrade_schema_version(cls, data):
"""This overrides the "upgrade_schema_version" call in VersionedSchema (via
ArtifactMixin) to modify the dictionary passed in from earlier versions of the manifest."""
manifest_schema_version = get_manifest_schema_version(data)
if manifest_schema_version <= 9:
data = upgrade_manifest_json(data, manifest_schema_version)
return cls.from_dict(data)
return [("manifest", 4), ("manifest", 5), ("manifest", 6), ("manifest", 7)]
def __post_serialize__(self, dct):
for unique_id, node in dct["nodes"].items():
if "config_call_dict" in node:
del node["config_call_dict"]
if "defer_relation" in node:
del node["defer_relation"]
return dct
def get_manifest_schema_version(dct: dict) -> int:
schema_version = dct.get("metadata", {}).get("dbt_schema_version", None)
if not schema_version:
raise ValueError("Manifest doesn't have schema version")
return int(schema_version.split(".")[-2][-1])
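A worked example of the version parsing above. Note that taking the last character of the second-to-last dot-segment only handles single-digit versions (a hypothetical "v10" would parse as 0), which holds for the schema versions this module supports.

url = "https://schemas.getdbt.com/dbt/manifest/v7.json"
assert url.split(".")[-2] == "com/dbt/manifest/v7"
assert int(url.split(".")[-2][-1]) == 7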
def _check_duplicates(value: BaseNode, src: Mapping[str, BaseNode]):
if value.unique_id in src:
raise DuplicateResourceNameError(value, src[value.unique_id])


@@ -1,107 +0,0 @@
def rename_sql_attr(node_content: dict) -> dict:
if "raw_sql" in node_content:
node_content["raw_code"] = node_content.pop("raw_sql")
if "compiled_sql" in node_content:
node_content["compiled_code"] = node_content.pop("compiled_sql")
node_content["language"] = "sql"
return node_content
def upgrade_ref_content(node_content: dict) -> dict:
# In v1.5 we switched Node.refs from List[List[str]] to List[Dict[str, Union[NodeVersion, str]]]
# Previous versions did not have a version keyword argument for ref
if "refs" in node_content:
upgraded_refs = []
for ref in node_content["refs"]:
if isinstance(ref, list):
if len(ref) == 1:
upgraded_refs.append({"package": None, "name": ref[0], "version": None})
else:
upgraded_refs.append({"package": ref[0], "name": ref[1], "version": None})
node_content["refs"] = upgraded_refs
return node_content
def upgrade_node_content(node_content):
rename_sql_attr(node_content)
upgrade_ref_content(node_content)
if node_content["resource_type"] != "seed" and "root_path" in node_content:
del node_content["root_path"]
def upgrade_seed_content(node_content):
# Remove compilation related attributes
for attr_name in (
"language",
"refs",
"sources",
"metrics",
"compiled_path",
"compiled",
"compiled_code",
"extra_ctes_injected",
"extra_ctes",
"relation_name",
):
if attr_name in node_content:
del node_content[attr_name]
# In v1.4, we switched SeedNode.depends_on from DependsOn to MacroDependsOn
node_content.get("depends_on", {}).pop("nodes", None)
def drop_v9_and_prior_metrics(manifest: dict) -> None:
manifest["metrics"] = {}
filtered_disabled_entries = {}
for entry_name, resource_list in manifest.get("disabled", {}).items():
filtered_resource_list = []
for resource in resource_list:
if resource.get("resource_type") != "metric":
filtered_resource_list.append(resource)
filtered_disabled_entries[entry_name] = filtered_resource_list
manifest["disabled"] = filtered_disabled_entries
def upgrade_manifest_json(manifest: dict, manifest_schema_version: int) -> dict:
# this should remain 9 while the check in `upgrade_schema_version` may change
if manifest_schema_version <= 9:
drop_v9_and_prior_metrics(manifest=manifest)
for node_content in manifest.get("nodes", {}).values():
upgrade_node_content(node_content)
if node_content["resource_type"] == "seed":
upgrade_seed_content(node_content)
for disabled in manifest.get("disabled", {}).values():
# There can be multiple disabled nodes for the same unique_id
# so make sure all the nodes get the attr renamed
for node_content in disabled:
upgrade_node_content(node_content)
if node_content["resource_type"] == "seed":
upgrade_seed_content(node_content)
# add group key
if "groups" not in manifest:
manifest["groups"] = {}
if "group_map" not in manifest:
manifest["group_map"] = {}
for metric_content in manifest.get("metrics", {}).values():
# handle attr renames + value translation ("expression" -> "derived")
metric_content = upgrade_ref_content(metric_content)
if "root_path" in metric_content:
del metric_content["root_path"]
for exposure_content in manifest.get("exposures", {}).values():
exposure_content = upgrade_ref_content(exposure_content)
if "root_path" in exposure_content:
del exposure_content["root_path"]
for source_content in manifest.get("sources", {}).values():
if "root_path" in source_content:
del source_content["root_path"]
for macro_content in manifest.get("macros", {}).values():
if "root_path" in macro_content:
del macro_content["root_path"]
for doc_content in manifest.get("docs", {}).values():
if "root_path" in doc_content:
del doc_content["root_path"]
doc_content["resource_type"] = "doc"
if "semantic_models" not in manifest:
manifest["semantic_models"] = {}
return manifest
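A worked example of `upgrade_ref_content` from the removed module above: v1.4-style list refs become dicts with an explicit null version. The transformation is restated inline so the snippet runs standalone.

old_refs = [["my_model"], ["pkg", "other_model"]]
new_refs = [
    {"package": None, "name": ref[0], "version": None} if len(ref) == 1
    else {"package": ref[0], "name": ref[1], "version": None}
    for ref in old_refs
]
assert new_refs[1] == {"package": "pkg", "name": "other_model", "version": None}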


@@ -2,17 +2,15 @@ from dataclasses import field, Field, dataclass
from enum import Enum
from itertools import chain
from typing import Any, List, Optional, Dict, Union, Type, TypeVar, Callable
from dbt.dataclass_schema import (
dbtClassMixin,
ValidationError,
register_pattern,
StrEnum,
)
from dbt.contracts.graph.unparsed import AdditionalPropertiesAllowed, Docs
from dbt.contracts.graph.utils import validate_color
from dbt.contracts.util import Replaceable, list_str
from dbt.exceptions import DbtInternalError, CompilationError
from dbt.contracts.util import Replaceable, list_str
from dbt import hooks
from dbt.node_types import NodeType
@@ -191,21 +189,6 @@ class Severity(str):
register_pattern(Severity, insensitive_patterns("warn", "error"))
class OnConfigurationChangeOption(StrEnum):
Apply = "apply"
Continue = "continue"
Fail = "fail"
@classmethod
def default(cls) -> "OnConfigurationChangeOption":
return cls.Apply
@dataclass
class ContractConfig(dbtClassMixin, Replaceable):
enforced: bool = False
@dataclass
class Hook(dbtClassMixin, Replaceable):
sql: str
@@ -299,17 +282,11 @@ class BaseConfig(AdditionalPropertiesAllowed, Replaceable):
return False
return True
# This is used in 'add_config_call' to create the combined config_call_dict.
# This is used in 'add_config_call' to created the combined config_call_dict.
# 'meta' moved here from node
mergebehavior = {
"append": ["pre-hook", "pre_hook", "post-hook", "post_hook", "tags"],
"update": [
"quoting",
"column_types",
"meta",
"docs",
"contract",
],
"update": ["quoting", "column_types", "meta", "docs"],
"dict_key_append": ["grants"],
}
@@ -386,15 +363,9 @@ class BaseConfig(AdditionalPropertiesAllowed, Replaceable):
return self.from_dict(dct)
@dataclass
class SemanticModelConfig(BaseConfig):
enabled: bool = True
@dataclass
class MetricConfig(BaseConfig):
enabled: bool = True
group: Optional[str] = None
@dataclass
@@ -432,10 +403,6 @@ class NodeAndTestConfig(BaseConfig):
default_factory=dict,
metadata=MergeBehavior.Update.meta(),
)
group: Optional[str] = field(
default=None,
metadata=CompareBehavior.Exclude.meta(),
)
@dataclass
@@ -468,9 +435,6 @@ class NodeConfig(NodeAndTestConfig):
# sometimes getting the Union order wrong, causing serialization failures.
unique_key: Union[str, List[str], None] = None
on_schema_change: Optional[str] = "ignore"
on_configuration_change: OnConfigurationChangeOption = field(
default_factory=OnConfigurationChangeOption.default
)
grants: Dict[str, Any] = field(
default_factory=dict, metadata=MergeBehavior.DictKeyAppend.meta()
)
@@ -482,13 +446,9 @@ class NodeConfig(NodeAndTestConfig):
default_factory=Docs,
metadata=MergeBehavior.Update.meta(),
)
contract: ContractConfig = field(
default_factory=ContractConfig,
metadata=MergeBehavior.Update.meta(),
)
# we validate that node_color has a suitable value to prevent dbt-docs from crashing
def __post_init__(self):
# we validate that node_color has a suitable value to prevent dbt-docs from crashing
if self.docs.node_color:
node_color = self.docs.node_color
if not validate_color(node_color):
@@ -497,17 +457,6 @@ class NodeConfig(NodeAndTestConfig):
"It is neither a valid HTML color name nor a valid HEX code."
)
if (
self.contract.enforced
and self.materialized == "incremental"
and self.on_schema_change not in ("append_new_columns", "fail")
):
raise ValidationError(
f"Invalid value for on_schema_change: {self.on_schema_change}. Models "
"materialized as incremental with contracts enabled must set "
"on_schema_change to 'append_new_columns' or 'fail'"
)
@classmethod
def __pre_deserialize__(cls, data):
data = super().__pre_deserialize__(data)
@@ -555,8 +504,6 @@ class SeedConfig(NodeConfig):
@dataclass
class TestConfig(NodeAndTestConfig):
__test__ = False
# this is repeated because of a different default
schema: Optional[str] = field(
default="dbt_test__audit",


@@ -1,31 +0,0 @@
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, List
from dbt.contracts.graph.unparsed import NodeVersion
from dbt.node_types import NodeType, AccessType
@dataclass
class ModelNodeArgs:
name: str
package_name: str
identifier: str
schema: str
database: Optional[str] = None
relation_name: Optional[str] = None
version: Optional[NodeVersion] = None
latest_version: Optional[NodeVersion] = None
deprecation_date: Optional[datetime] = None
access: Optional[str] = AccessType.Protected.value
generated_at: datetime = field(default_factory=datetime.utcnow)
depends_on_nodes: List[str] = field(default_factory=list)
enabled: bool = True
@property
def unique_id(self) -> str:
unique_id = f"{NodeType.Model}.{self.package_name}.{self.name}"
if self.version:
unique_id = f"{unique_id}.v{self.version}"
return unique_id
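A worked example of the unique_id layout produced above; `NodeType.Model` renders as "model" because NodeType is a string enum in dbt.

name, package_name, version = "dim_customers", "jaffle_shop", 2
unique_id = f"model.{package_name}.{name}"
if version:
    unique_id = f"{unique_id}.v{version}"
assert unique_id == "model.jaffle_shop.dim_customers.v2"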


@@ -1,63 +1,54 @@
import os
from datetime import datetime
import time
from dataclasses import dataclass, field
from enum import Enum
import hashlib
from mashumaro.types import SerializableType
from typing import Optional, Union, List, Dict, Any, Sequence, Tuple, Iterator
from typing import (
Optional,
Union,
List,
Dict,
Any,
Sequence,
Tuple,
Iterator,
)
from dbt.dataclass_schema import dbtClassMixin, ExtensibleDbtClassMixin
from dbt.clients.system import write_file
from dbt.contracts.files import FileHash
from dbt.contracts.graph.semantic_models import (
Defaults,
Dimension,
Entity,
Measure,
SourceFileMetadata,
)
from dbt.contracts.graph.unparsed import (
Quoting,
Docs,
ExposureType,
ExternalTable,
FreshnessThreshold,
ExternalTable,
HasYamlMetadata,
MacroArgument,
MaturityType,
Owner,
Quoting,
TestDef,
NodeVersion,
UnparsedSourceDefinition,
UnparsedSourceTableDefinition,
UnparsedColumn,
TestDef,
ExposureOwner,
ExposureType,
MaturityType,
MetricFilter,
MetricTime,
)
from dbt.contracts.graph.node_args import ModelNodeArgs
from dbt.contracts.util import Replaceable, AdditionalPropertiesMixin
from dbt.events.proto_types import NodeInfo
from dbt.events.functions import warn_or_error
from dbt.exceptions import ParsingError, ContractBreakingChangeError
from dbt.exceptions import ParsingError
from dbt.events.types import (
SeedIncreased,
SeedExceedsLimitSamePath,
SeedExceedsLimitAndPathChanged,
SeedExceedsLimitChecksumChanged,
)
from dbt.events.contextvars import set_log_contextvars
from dbt.flags import get_flags
from dbt.node_types import ModelLanguage, NodeType, AccessType
from dbt_semantic_interfaces.call_parameter_sets import FilterCallParameterSets
from dbt_semantic_interfaces.references import (
MeasureReference,
LinkableElementReference,
SemanticModelReference,
TimeDimensionReference,
)
from dbt_semantic_interfaces.references import MetricReference as DSIMetricReference
from dbt_semantic_interfaces.type_enums import MetricType, TimeGranularity
from dbt_semantic_interfaces.parsing.where_filter_parser import WhereFilterParser
from dbt.events.contextvars import set_contextvars
from dbt import flags
from dbt.node_types import ModelLanguage, NodeType
from dbt.utils import cast_dict_to_dict_of_strings
from .model_config import (
NodeConfig,
@@ -68,10 +59,8 @@ from .model_config import (
ExposureConfig,
EmptySnapshotConfig,
SnapshotConfig,
SemanticModelConfig,
)
# =====================================================================
# This contains the classes for all of the nodes and node-like objects
# in the manifest. In the "nodes" dictionary of the manifest we find
@@ -127,10 +116,6 @@ class BaseNode(dbtClassMixin, Replaceable):
def is_relational(self):
return self.resource_type in NodeType.refable()
@property
def is_versioned(self):
return self.resource_type in NodeType.versioned() and self.version is not None
@property
def is_ephemeral(self):
return self.config.materialized == "ephemeral"
@@ -153,65 +138,6 @@ class GraphNode(BaseNode):
return self.fqn == other.fqn
@dataclass
class RefArgs(dbtClassMixin):
name: str
package: Optional[str] = None
version: Optional[NodeVersion] = None
@property
def positional_args(self) -> List[str]:
if self.package:
return [self.package, self.name]
else:
return [self.name]
@property
def keyword_args(self) -> Dict[str, Optional[NodeVersion]]:
if self.version:
return {"version": self.version}
else:
return {}
class ConstraintType(str, Enum):
check = "check"
not_null = "not_null"
unique = "unique"
primary_key = "primary_key"
foreign_key = "foreign_key"
custom = "custom"
@classmethod
def is_valid(cls, item):
try:
cls(item)
except ValueError:
return False
return True
@dataclass
class ColumnLevelConstraint(dbtClassMixin):
type: ConstraintType
name: Optional[str] = None
# expression is a user-provided field that will depend on the constraint type.
# It could be a predicate (check type), or a sequence sql keywords (e.g. unique type),
# so the vague naming of 'expression' is intended to capture this range.
expression: Optional[str] = None
warn_unenforced: bool = (
True # Warn if constraint cannot be enforced by platform but will be in DDL
)
warn_unsupported: bool = (
True # Warn if constraint is not supported by the platform and won't be in DDL
)
@dataclass
class ModelLevelConstraint(ColumnLevelConstraint):
columns: List[str] = field(default_factory=list)
@dataclass
class ColumnInfo(AdditionalPropertiesMixin, ExtensibleDbtClassMixin, Replaceable):
"""Used in all ManifestNodes and SourceDefinition"""
@@ -220,18 +146,11 @@ class ColumnInfo(AdditionalPropertiesMixin, ExtensibleDbtClassMixin, Replaceable
description: str = ""
meta: Dict[str, Any] = field(default_factory=dict)
data_type: Optional[str] = None
constraints: List[ColumnLevelConstraint] = field(default_factory=list)
quote: Optional[bool] = None
tags: List[str] = field(default_factory=list)
_extra: Dict[str, Any] = field(default_factory=dict)
@dataclass
class Contract(dbtClassMixin, Replaceable):
enforced: bool = False
checksum: Optional[str] = None
# Metrics, exposures,
@dataclass
class HasRelationMetadata(dbtClassMixin, Replaceable):
@@ -261,16 +180,6 @@ class MacroDependsOn(dbtClassMixin, Replaceable):
self.macros.append(value)
@dataclass
class DeferRelation(HasRelationMetadata):
alias: str
relation_name: Optional[str]
@property
def identifier(self):
return self.alias
@dataclass
class DependsOn(MacroDependsOn):
nodes: List[str] = field(default_factory=list)
@@ -299,6 +208,8 @@ class NodeInfoMixin:
@property
def node_info(self):
meta = getattr(self, "meta", {})
meta_stringified = cast_dict_to_dict_of_strings(meta)
node_info = {
"node_path": getattr(self, "path", None),
"node_name": getattr(self, "name", None),
@@ -308,20 +219,15 @@ class NodeInfoMixin:
"node_status": str(self._event_status.get("node_status")),
"node_started_at": self._event_status.get("started_at"),
"node_finished_at": self._event_status.get("finished_at"),
"meta": getattr(self, "meta", {}),
"node_relation": {
"database": getattr(self, "database", None),
"schema": getattr(self, "schema", None),
"alias": getattr(self, "alias", None),
"relation_name": getattr(self, "relation_name", None),
},
"meta": meta_stringified,
}
return node_info
node_info_msg = NodeInfo(**node_info)
return node_info_msg
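A sketch of the meta stringification in the proto-based variant above, assuming the helper simply str()s every key and value so the protobuf-typed NodeInfo accepts them; treat that behavior as an assumption.

def cast_dict_to_dict_of_strings(dct):
    return {str(k): str(v) for k, v in dct.items()}

assert cast_dict_to_dict_of_strings({"owner": "data-eng", "rows": 10}) == {
    "owner": "data-eng",
    "rows": "10",
}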
def update_event_status(self, **kwargs):
for k, v in kwargs.items():
self._event_status[k] = v
set_log_contextvars(node_info=self.node_info)
set_contextvars(node_info=self.node_info)
def clear_event_status(self):
self._event_status = dict()
@@ -333,7 +239,6 @@ class ParsedNode(NodeInfoMixin, ParsedNodeMandatory, SerializableType):
description: str = field(default="")
columns: Dict[str, ColumnInfo] = field(default_factory=dict)
meta: Dict[str, Any] = field(default_factory=dict)
group: Optional[str] = None
docs: Docs = field(default_factory=Docs)
patch_path: Optional[str] = None
build_path: Optional[str] = None
@@ -344,23 +249,17 @@ class ParsedNode(NodeInfoMixin, ParsedNodeMandatory, SerializableType):
relation_name: Optional[str] = None
raw_code: str = ""
def get_target_write_path(self, target_path: str, subdirectory: str):
# This is called for both the "compiled" subdirectory of "target" and the "run" subdirectory
def write_node(self, target_path: str, subdirectory: str, payload: str):
if os.path.basename(self.path) == os.path.basename(self.original_file_path):
# One-to-one relationship of nodes to files.
path = self.original_file_path
else:
# Many-to-one relationship of nodes to files.
path = os.path.join(self.original_file_path, self.path)
target_write_path = os.path.join(target_path, subdirectory, self.package_name, path)
return target_write_path
full_path = os.path.join(target_path, subdirectory, self.package_name, path)
def write_node(self, project_root: str, compiled_path, compiled_code: str):
if os.path.isabs(compiled_path):
full_path = compiled_path
else:
full_path = os.path.join(project_root, compiled_path)
write_file(full_path, compiled_code)
write_file(full_path, payload)
return full_path
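Both variants above resolve to the same directory shape. A runnable sketch of the target write path: a node whose compiled file name matches its source file reuses the original path (one node per file); otherwise the node path is nested under the file path (many nodes per file).

import os

def target_write_path(target_path, subdirectory, package_name,
                      path, original_file_path):
    if os.path.basename(path) == os.path.basename(original_file_path):
        rel = original_file_path                      # one-to-one node/file mapping
    else:
        rel = os.path.join(original_file_path, path)  # many-to-one mapping
    return os.path.join(target_path, subdirectory, package_name, rel)

assert target_write_path("target", "run", "my_project",
                         "my_model.sql", os.path.join("models", "my_model.sql")) \
    == os.path.join("target", "run", "my_project", "models", "my_model.sql")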
def _serialize(self):
return self.to_dict()
@@ -452,34 +351,30 @@ class ParsedNode(NodeInfoMixin, ParsedNodeMandatory, SerializableType):
old.unrendered_config,
)
def build_contract_checksum(self):
pass
def patch(self, patch: "ParsedNodePatch"):
"""Given a ParsedNodePatch, add the new information to the node."""
# explicitly pick out the parts to update so we don't inadvertently
# step on the model name or anything
# Note: config should already be updated
self.patch_path: Optional[str] = patch.file_id
# update created_at so process_docs will run in partial parsing
self.created_at = time.time()
self.description = patch.description
self.columns = patch.columns
def same_contract(self, old, adapter_type=None) -> bool:
# This would only apply to seeds
return True
def same_contents(self, old, adapter_type) -> bool:
def same_contents(self, old) -> bool:
if old is None:
return False
# Need to ensure that same_contract is called because it
# could throw an error
same_contract = self.same_contract(old, adapter_type)
return (
self.same_body(old)
and self.same_config(old)
and self.same_persisted_description(old)
and self.same_fqn(old)
and self.same_database_representation(old)
and same_contract
and True
)
@property
def is_external_node(self):
return False
@dataclass
class InjectedCTE(dbtClassMixin, Replaceable):
@@ -495,7 +390,7 @@ class CompiledNode(ParsedNode):
so all ManifestNodes except SeedNode."""
language: str = "sql"
refs: List[RefArgs] = field(default_factory=list)
refs: List[List[str]] = field(default_factory=list)
sources: List[List[str]] = field(default_factory=list)
metrics: List[List[str]] = field(default_factory=list)
depends_on: DependsOn = field(default_factory=DependsOn)
@@ -505,7 +400,6 @@ class CompiledNode(ParsedNode):
extra_ctes_injected: bool = False
extra_ctes: List[InjectedCTE] = field(default_factory=list)
_pre_injected_sql: Optional[str] = None
contract: Contract = field(default_factory=Contract)
@property
def empty(self):
@@ -566,209 +460,6 @@ class HookNode(CompiledNode):
@dataclass
class ModelNode(CompiledNode):
resource_type: NodeType = field(metadata={"restrict": [NodeType.Model]})
access: AccessType = AccessType.Protected
constraints: List[ModelLevelConstraint] = field(default_factory=list)
version: Optional[NodeVersion] = None
latest_version: Optional[NodeVersion] = None
deprecation_date: Optional[datetime] = None
defer_relation: Optional[DeferRelation] = None
@classmethod
def from_args(cls, args: ModelNodeArgs) -> "ModelNode":
unique_id = args.unique_id
# build unrendered config -- for usage in ParsedNode.same_contents
unrendered_config = {}
unrendered_config["alias"] = args.identifier
unrendered_config["schema"] = args.schema
if args.database:
unrendered_config["database"] = args.database
return cls(
resource_type=NodeType.Model,
name=args.name,
package_name=args.package_name,
unique_id=unique_id,
fqn=[args.package_name, args.name],
version=args.version,
latest_version=args.latest_version,
relation_name=args.relation_name,
database=args.database,
schema=args.schema,
alias=args.identifier,
deprecation_date=args.deprecation_date,
checksum=FileHash.from_contents(f"{unique_id},{args.generated_at}"),
access=AccessType(args.access),
original_file_path="",
path="",
unrendered_config=unrendered_config,
depends_on=DependsOn(nodes=args.depends_on_nodes),
config=NodeConfig(enabled=args.enabled),
)
@property
def is_external_node(self) -> bool:
return not self.original_file_path and not self.path
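For orientation, a hedged sketch of what from_args yields; `args` stands in for a ModelNodeArgs built by the caller, whose construction is not shown in this diff:

node = ModelNode.from_args(args)           # args: a ModelNodeArgs instance
assert node.fqn == [args.package_name, args.name]
assert node.is_external_node               # path and original_file_path are empty
# node.checksum is derived from f"{args.unique_id},{args.generated_at}"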
@property
def is_latest_version(self) -> bool:
return self.version is not None and self.version == self.latest_version
@property
def search_name(self):
if self.version is None:
return self.name
else:
return f"{self.name}.v{self.version}"
@property
def materialization_enforces_constraints(self) -> bool:
return self.config.materialized in ["table", "incremental"]
def build_contract_checksum(self):
# We don't need to construct the checksum if the model's contract is not
# enforced, because it won't be used.
# This needs to run after the contract config is set.
# Avoid rebuilding the checksum if it has already been set.
if self.contract.checksum is not None:
return
if self.contract.enforced is True:
contract_state = ""
# Sort the columns so that column order doesn't affect the checksum.
# self.columns is a Dict[str, ColumnInfo].
sorted_columns = sorted(self.columns.values(), key=lambda col: col.name)
for column in sorted_columns:
contract_state += f"|{column.name}"
contract_state += str(column.data_type)
contract_state += str(column.constraints)
if self.materialization_enforces_constraints:
contract_state += self.config.materialized
contract_state += str(self.constraints)
data = contract_state.encode("utf-8")
self.contract.checksum = hashlib.new("sha256", data).hexdigest()
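A runnable sketch of the checksum construction above, with hypothetical (name, data_type, constraints) tuples standing in for the ColumnInfo values:

import hashlib

columns = [("customer_id", "bigint", []), ("id", "bigint", [])]
contract_state = ""
for name, data_type, constraints in sorted(columns):  # sorted: order-insensitive
    contract_state += f"|{name}"
    contract_state += str(data_type)
    contract_state += str(constraints)
# For table/incremental materializations, the materialization name and the
# model-level constraints are appended as well.
print(hashlib.new("sha256", contract_state.encode("utf-8")).hexdigest())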
def same_contract(self, old, adapter_type=None) -> bool:
# If the contract wasn't previously enforced:
if old.contract.enforced is False and self.contract.enforced is False:
# No change -- same_contract: True
return True
if old.contract.enforced is False and self.contract.enforced is True:
# Now it's enforced. This is a change, but not a breaking change -- same_contract: False
return False
# Otherwise: The contract was previously enforced, and we need to check for changes.
# Happy path: The contract is still being enforced, and the checksums are identical.
if self.contract.enforced is True and self.contract.checksum == old.contract.checksum:
# No change -- same_contract: True
return True
# Otherwise: There has been a change.
# We need to determine if it is a **breaking** change.
# These are the categories of breaking changes:
contract_enforced_disabled: bool = False
columns_removed: List[str] = []
column_type_changes: List[Tuple[str, str, str]] = []
enforced_column_constraint_removed: List[Tuple[str, str]] = [] # column, constraint_type
enforced_model_constraint_removed: List[
Tuple[str, List[str]]
] = [] # constraint_type, columns
materialization_changed: List[str] = []
if old.contract.enforced is True and self.contract.enforced is False:
# Breaking change: the contract was previously enforced, and it no longer is
contract_enforced_disabled = True
# TODO: this avoids the circular imports but isn't ideal
from dbt.adapters.factory import get_adapter_constraint_support
from dbt.adapters.base import ConstraintSupport
constraint_support = get_adapter_constraint_support(adapter_type)
column_constraints_exist = False
# Next, compare each column from the previous contract (old.columns)
for old_key, old_value in sorted(old.columns.items()):
# Has this column been removed?
if old_key not in self.columns.keys():
columns_removed.append(old_value.name)
# Has this column's data type changed?
elif old_value.data_type != self.columns[old_key].data_type:
column_type_changes.append(
(
str(old_value.name),
str(old_value.data_type),
str(self.columns[old_key].data_type),
)
)
# Track whether there are any column-level constraints, for the materialization check later.
if old_value.constraints:
column_constraints_exist = True
# Have enforced column-level constraints changed?
# Constraints are only enforced for table and incremental materializations.
# For breaking changes, we only care if the old node used one of those materializations.
if (
old_key in self.columns.keys()
and old_value.constraints != self.columns[old_key].constraints
and old.materialization_enforces_constraints
):
for old_constraint in old_value.constraints:
if (
old_constraint not in self.columns[old_key].constraints
and constraint_support[old_constraint.type] == ConstraintSupport.ENFORCED
):
enforced_column_constraint_removed.append(
(old_key, str(old_constraint.type))
)
# Now compare the model-level constraints.
if old.constraints != self.constraints and old.materialization_enforces_constraints:
for old_constraint in old.constraints:
if (
old_constraint not in self.constraints
and constraint_support[old_constraint.type] == ConstraintSupport.ENFORCED
):
enforced_model_constraint_removed.append(
(str(old_constraint.type), old_constraint.columns)
)
# Check for relevant materialization changes.
if (
old.materialization_enforces_constraints
and not self.materialization_enforces_constraints
and (old.constraints or column_constraints_exist)
):
materialization_changed = [old.config.materialized, self.config.materialized]
# If a column has been added, it will be missing from old.columns and present in self.columns.
# That's a change (caught by the different checksums), but not a breaking change
# Did we find any changes that we consider breaking? If so, that's an error
if (
contract_enforced_disabled
or columns_removed
or column_type_changes
or enforced_model_constraint_removed
or enforced_column_constraint_removed
or materialization_changed
):
raise (
ContractBreakingChangeError(
contract_enforced_disabled=contract_enforced_disabled,
columns_removed=columns_removed,
column_type_changes=column_type_changes,
enforced_column_constraint_removed=enforced_column_constraint_removed,
enforced_model_constraint_removed=enforced_model_constraint_removed,
materialization_changed=materialization_changed,
node=self,
)
)
# Otherwise, though we didn't find any *breaking* changes, the contract has still changed -- same_contract: False
else:
return False
# TODO: rm?
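To summarize the control flow above in a usage sketch (node, old_node, and adapter_type are placeholders, not from this diff):

try:
    if node.same_contract(old_node, adapter_type):
        pass  # True: the enforced contract is unchanged
    else:
        pass  # False: the contract changed, but not in a breaking way
              # (e.g. newly enforced, or a column was added)
except ContractBreakingChangeError:
    pass  # breaking: enforcement disabled, column removed or retyped, enforced
          # constraint removed, or constraint-enforcing materialization changed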
@@ -795,7 +486,6 @@ class SeedNode(ParsedNode): # No SQLDefaults!
# and we need the root_path to load the seed later
root_path: Optional[str] = None
depends_on: MacroDependsOn = field(default_factory=MacroDependsOn)
defer_relation: Optional[DeferRelation] = None
def same_seeds(self, other: "SeedNode") -> bool:
# for seeds, we check the hashes. If the hashes are different types,
@@ -904,7 +594,7 @@ class TestShouldStoreFailures:
def should_store_failures(self):
if self.config.store_failures:
return self.config.store_failures
return get_flags().STORE_FAILURES
return flags.STORE_FAILURES
@property
def is_relational(self):
@@ -932,8 +622,6 @@ class SingularTestNode(TestShouldStoreFailures, CompiledNode):
@dataclass
class TestMetadata(dbtClassMixin, Replaceable):
__test__ = False
name: str
# kwargs are the args that are left in the test builder after
# removing configs. They are set from the test builder when
@@ -957,9 +645,8 @@ class GenericTestNode(TestShouldStoreFailures, CompiledNode, HasTestMetadata):
# Was not able to make mypy happy and keep the code working. We need to
# refactor the various configs.
config: TestConfig = field(default_factory=TestConfig) # type: ignore
attached_node: Optional[str] = None
def same_contents(self, other, adapter_type: Optional[str]) -> bool:
def same_contents(self, other) -> bool:
if other is None:
return False
@@ -992,7 +679,6 @@ class IntermediateSnapshotNode(CompiledNode):
class SnapshotNode(CompiledNode):
resource_type: NodeType = field(metadata={"restrict": [NodeType.Snapshot]})
config: SnapshotConfig
defer_relation: Optional[DeferRelation] = None
# ====================================
@@ -1013,6 +699,14 @@ class Macro(BaseNode):
created_at: float = field(default_factory=lambda: time.time())
supported_languages: Optional[List[ModelLanguage]] = None
def patch(self, patch: "ParsedMacroPatch"):
self.patch_path: Optional[str] = patch.file_id
self.description = patch.description
self.created_at = time.time()
self.meta = patch.meta
self.docs = patch.docs
self.arguments = patch.arguments
def same_contents(self, other: Optional["Macro"]) -> bool:
if other is None:
return False
@@ -1166,7 +860,7 @@ class SourceDefinition(NodeInfoMixin, ParsedSourceMandatory):
if old is None:
return True
# config changes are changes (because the only config is "enforced", and
# config changes are changes (because the only config is "enabled", and
# enabling a source is a change!)
# changing the database/schema/identifier is a change
# messing around with external stuff is a change (uh, right?)
@@ -1235,7 +929,7 @@ class SourceDefinition(NodeInfoMixin, ParsedSourceMandatory):
@dataclass
class Exposure(GraphNode):
type: ExposureType
owner: Owner
owner: ExposureOwner
resource_type: NodeType = field(metadata={"restrict": [NodeType.Exposure]})
description: str = ""
label: Optional[str] = None
@@ -1246,7 +940,7 @@ class Exposure(GraphNode):
unrendered_config: Dict[str, Any] = field(default_factory=dict)
url: Optional[str] = None
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[RefArgs] = field(default_factory=list)
refs: List[List[str]] = field(default_factory=list)
sources: List[List[str]] = field(default_factory=list)
metrics: List[List[str]] = field(default_factory=list)
created_at: float = field(default_factory=lambda: time.time())
@@ -1305,75 +999,16 @@ class Exposure(GraphNode):
and True
)
@property
def group(self):
return None
# ====================================
# Metric node
# ====================================
@dataclass
class WhereFilter(dbtClassMixin):
where_sql_template: str
@property
def call_parameter_sets(self) -> FilterCallParameterSets:
return WhereFilterParser.parse_call_parameter_sets(self.where_sql_template)
@dataclass
class MetricInputMeasure(dbtClassMixin):
name: str
filter: Optional[WhereFilter] = None
alias: Optional[str] = None
def measure_reference(self) -> MeasureReference:
return MeasureReference(element_name=self.name)
def post_aggregation_measure_reference(self) -> MeasureReference:
return MeasureReference(element_name=self.alias or self.name)
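A small runnable illustration of the alias behavior, using the dataclass as defined above:

m = MetricInputMeasure(name="order_total", alias="revenue")
print(m.measure_reference())                   # element_name="order_total"
print(m.post_aggregation_measure_reference())  # element_name="revenue"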
@dataclass
class MetricTimeWindow(dbtClassMixin):
count: int
granularity: TimeGranularity
@dataclass
class MetricInput(dbtClassMixin):
name: str
filter: Optional[WhereFilter] = None
alias: Optional[str] = None
offset_window: Optional[MetricTimeWindow] = None
offset_to_grain: Optional[TimeGranularity] = None
def as_reference(self) -> DSIMetricReference:
return DSIMetricReference(element_name=self.name)
def post_aggregation_reference(self) -> DSIMetricReference:
return DSIMetricReference(element_name=self.alias or self.name)
@dataclass
class MetricTypeParams(dbtClassMixin):
measure: Optional[MetricInputMeasure] = None
input_measures: List[MetricInputMeasure] = field(default_factory=list)
numerator: Optional[MetricInput] = None
denominator: Optional[MetricInput] = None
expr: Optional[str] = None
window: Optional[MetricTimeWindow] = None
grain_to_date: Optional[TimeGranularity] = None
metrics: Optional[List[MetricInput]] = None
@dataclass
class MetricReference(dbtClassMixin, Replaceable):
sql: Optional[Union[str, int]] = None
unique_id: Optional[str] = None
sql: Optional[Union[str, int]]
unique_id: Optional[str]
@dataclass
@@ -1381,21 +1016,25 @@ class Metric(GraphNode):
name: str
description: str
label: str
type: MetricType
type_params: MetricTypeParams
filter: Optional[WhereFilter] = None
metadata: Optional[SourceFileMetadata] = None
calculation_method: str
expression: str
filters: List[MetricFilter]
time_grains: List[str]
dimensions: List[str]
resource_type: NodeType = field(metadata={"restrict": [NodeType.Metric]})
timestamp: Optional[str] = None
window: Optional[MetricTime] = None
model: Optional[str] = None
model_unique_id: Optional[str] = None
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
config: MetricConfig = field(default_factory=MetricConfig)
unrendered_config: Dict[str, Any] = field(default_factory=dict)
sources: List[List[str]] = field(default_factory=list)
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[RefArgs] = field(default_factory=list)
refs: List[List[str]] = field(default_factory=list)
metrics: List[List[str]] = field(default_factory=list)
created_at: float = field(default_factory=lambda: time.time())
group: Optional[str] = None
@property
def depends_on_nodes(self):
@@ -1405,17 +1044,17 @@ class Metric(GraphNode):
def search_name(self):
return self.name
@property
def input_measures(self) -> List[MetricInputMeasure]:
return self.type_params.input_measures
def same_model(self, old: "Metric") -> bool:
return self.model == old.model
@property
def measure_references(self) -> List[MeasureReference]:
return [x.measure_reference() for x in self.input_measures]
def same_window(self, old: "Metric") -> bool:
return self.window == old.window
@property
def input_metrics(self) -> List[MetricInput]:
return self.type_params.metrics or []
def same_dimensions(self, old: "Metric") -> bool:
return self.dimensions == old.dimensions
def same_filters(self, old: "Metric") -> bool:
return self.filters == old.filters
def same_description(self, old: "Metric") -> bool:
return self.description == old.description
@@ -1423,24 +1062,24 @@ class Metric(GraphNode):
def same_label(self, old: "Metric") -> bool:
return self.label == old.label
def same_calculation_method(self, old: "Metric") -> bool:
return self.calculation_method == old.calculation_method
def same_expression(self, old: "Metric") -> bool:
return self.expression == old.expression
def same_timestamp(self, old: "Metric") -> bool:
return self.timestamp == old.timestamp
def same_time_grains(self, old: "Metric") -> bool:
return self.time_grains == old.time_grains
def same_config(self, old: "Metric") -> bool:
return self.config.same_contents(
self.unrendered_config,
old.unrendered_config,
)
def same_filter(self, old: "Metric") -> bool:
return True # TODO
def same_metadata(self, old: "Metric") -> bool:
return True # TODO
def same_type(self, old: "Metric") -> bool:
return self.type == old.type
def same_type_params(self, old: "Metric") -> bool:
return True # TODO
def same_contents(self, old: Optional["Metric"]) -> bool:
# a metric existing now, when it didn't exist before, is a change!
# metadata/tags changes are not "changes"
@@ -1448,138 +1087,21 @@ class Metric(GraphNode):
return True
return (
self.same_filter(old)
and self.same_metadata(old)
and self.same_type(old)
and self.same_type_params(old)
self.same_model(old)
and self.same_window(old)
and self.same_dimensions(old)
and self.same_filters(old)
and self.same_description(old)
and self.same_label(old)
and self.same_calculation_method(old)
and self.same_expression(old)
and self.same_timestamp(old)
and self.same_time_grains(old)
and self.same_config(old)
and True
)
# ====================================
# Group node
# ====================================
@dataclass
class Group(BaseNode):
name: str
owner: Owner
resource_type: NodeType = field(metadata={"restrict": [NodeType.Group]})
# ====================================
# SemanticModel and related classes
# ====================================
@dataclass
class NodeRelation(dbtClassMixin):
alias: str
schema_name: str # TODO: Could this be called simply "schema" so we could reuse StateRelation?
database: Optional[str] = None
relation_name: Optional[str] = None
@dataclass
class SemanticModel(GraphNode):
model: str
node_relation: Optional[NodeRelation]
description: Optional[str] = None
defaults: Optional[Defaults] = None
entities: Sequence[Entity] = field(default_factory=list)
measures: Sequence[Measure] = field(default_factory=list)
dimensions: Sequence[Dimension] = field(default_factory=list)
metadata: Optional[SourceFileMetadata] = None
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[RefArgs] = field(default_factory=list)
created_at: float = field(default_factory=lambda: time.time())
config: SemanticModelConfig = field(default_factory=SemanticModelConfig)
@property
def entity_references(self) -> List[LinkableElementReference]:
return [entity.reference for entity in self.entities]
@property
def dimension_references(self) -> List[LinkableElementReference]:
return [dimension.reference for dimension in self.dimensions]
@property
def measure_references(self) -> List[MeasureReference]:
return [measure.reference for measure in self.measures]
@property
def has_validity_dimensions(self) -> bool:
return any([dim.validity_params is not None for dim in self.dimensions])
@property
def validity_start_dimension(self) -> Optional[Dimension]:
validity_start_dims = [
dim for dim in self.dimensions if dim.validity_params and dim.validity_params.is_start
]
if not validity_start_dims:
return None
return validity_start_dims[0]
@property
def validity_end_dimension(self) -> Optional[Dimension]:
validity_end_dims = [
dim for dim in self.dimensions if dim.validity_params and dim.validity_params.is_end
]
if not validity_end_dims:
return None
return validity_end_dims[0]
@property
def partitions(self) -> List[Dimension]: # noqa: D
return [dim for dim in self.dimensions or [] if dim.is_partition]
@property
def partition(self) -> Optional[Dimension]:
partitions = self.partitions
if not partitions:
return None
return partitions[0]
@property
def reference(self) -> SemanticModelReference:
return SemanticModelReference(semantic_model_name=self.name)
@property
def depends_on_nodes(self):
return self.depends_on.nodes
@property
def depends_on_macros(self):
return self.depends_on.macros
def checked_agg_time_dimension_for_measure(
self, measure_reference: MeasureReference
) -> TimeDimensionReference:
measure: Optional[Measure] = None
for candidate in self.measures:
if candidate.reference == measure_reference:
measure = candidate
break
assert (
measure is not None
), f"No measure with name ({measure_reference.element_name}) in semantic_model with name ({self.name})"
default_agg_time_dimension = (
self.defaults.agg_time_dimension if self.defaults is not None else None
)
agg_time_dimension_name = measure.agg_time_dimension or default_agg_time_dimension
assert agg_time_dimension_name is not None, (
f"Aggregation time dimension for measure {measure.name} is not set! This should either be set directly on "
f"the measure specification in the model, or else defaulted to the primary time dimension in the data "
f"source containing the measure."
)
return TimeDimensionReference(element_name=agg_time_dimension_name)
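The lookup above prefers a measure-level agg_time_dimension and falls back to the semantic model's default; a minimal runnable sketch with hypothetical values:

measure_agg_time_dimension = None          # not set on the measure itself
default_agg_time_dimension = "ordered_at"  # from self.defaults
agg_time_dimension_name = measure_agg_time_dimension or default_agg_time_dimension
assert agg_time_dimension_name is not None
print(agg_time_dimension_name)  # ordered_at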
# ====================================
# Patches
# ====================================
@@ -1600,11 +1122,6 @@ class ParsedPatch(HasYamlMetadata, Replaceable):
@dataclass
class ParsedNodePatch(ParsedPatch):
columns: Dict[str, ColumnInfo]
access: Optional[str]
version: Optional[NodeVersion]
latest_version: Optional[NodeVersion]
constraints: List[Dict[str, Any]]
deprecation_date: Optional[datetime]
@dataclass
@@ -1646,7 +1163,6 @@ GraphMemberNode = Union[
ResultNode,
Exposure,
Metric,
SemanticModel,
]
# All "nodes" (or node-like objects) in this file
@@ -1654,7 +1170,6 @@ Resource = Union[
GraphMemberNode,
Documentation,
Macro,
Group,
]
TestNode = Union[


@@ -1,95 +0,0 @@
from dbt_semantic_interfaces.implementations.metric import PydanticMetric
from dbt_semantic_interfaces.implementations.project_configuration import (
PydanticProjectConfiguration,
)
from dbt_semantic_interfaces.implementations.semantic_manifest import PydanticSemanticManifest
from dbt_semantic_interfaces.implementations.semantic_model import PydanticSemanticModel
from dbt_semantic_interfaces.implementations.time_spine_table_configuration import (
PydanticTimeSpineTableConfiguration,
)
from dbt_semantic_interfaces.type_enums import TimeGranularity
from dbt_semantic_interfaces.validations.semantic_manifest_validator import (
SemanticManifestValidator,
)
from dbt.clients.system import write_file
from dbt.events.base_types import EventLevel
from dbt.events.functions import fire_event
from dbt.events.types import SemanticValidationFailure
from dbt.exceptions import ParsingError
class SemanticManifest:
def __init__(self, manifest):
self.manifest = manifest
def validate(self) -> bool:
# TODO: Enforce this check.
# if self.manifest.metrics and not self.manifest.semantic_models:
# fire_event(
# SemanticValidationFailure(
# msg="Metrics require semantic models, but none were found."
# ),
# EventLevel.ERROR,
# )
# return False
if not self.manifest.metrics or not self.manifest.semantic_models:
return True
semantic_manifest = self._get_pydantic_semantic_manifest()
validator = SemanticManifestValidator[PydanticSemanticManifest]()
validation_results = validator.validate_semantic_manifest(semantic_manifest)
for warning in validation_results.warnings:
fire_event(SemanticValidationFailure(msg=warning.message))
for error in validation_results.errors:
fire_event(SemanticValidationFailure(msg=error.message), EventLevel.ERROR)
return not validation_results.errors
def write_json_to_file(self, file_path: str):
semantic_manifest = self._get_pydantic_semantic_manifest()
json = semantic_manifest.json()
write_file(file_path, json)
def _get_pydantic_semantic_manifest(self) -> PydanticSemanticManifest:
project_config = PydanticProjectConfiguration(
time_spine_table_configurations=[],
)
pydantic_semantic_manifest = PydanticSemanticManifest(
metrics=[], semantic_models=[], project_configuration=project_config
)
for semantic_model in self.manifest.semantic_models.values():
pydantic_semantic_manifest.semantic_models.append(
PydanticSemanticModel.parse_obj(semantic_model.to_dict())
)
for metric in self.manifest.metrics.values():
pydantic_semantic_manifest.metrics.append(PydanticMetric.parse_obj(metric.to_dict()))
# Look for time-spine table model and create time spine table configuration
if self.manifest.semantic_models:
# Get model for time_spine_table
time_spine_model_name = "metricflow_time_spine"
model = self.manifest.ref_lookup.find(time_spine_model_name, None, None, self.manifest)
if not model:
raise ParsingError(
"The semantic layer requires a 'metricflow_time_spine' model in the project, but none was found. "
"Guidance on creating this model can be found on our docs site ("
"https://docs.getdbt.com/docs/build/metricflow-time-spine) "
)
# Create time_spine_table_config, set it in project_config, and add to semantic manifest
time_spine_table_config = PydanticTimeSpineTableConfiguration(
location=model.relation_name,
column_name="date_day",
grain=TimeGranularity.DAY,
)
pydantic_semantic_manifest.project_configuration.time_spine_table_configurations = [
time_spine_table_config
]
return pydantic_semantic_manifest
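Illustrative end-to-end usage of the class above, assuming a fully parsed `manifest` object; the output path is hypothetical:

sm = SemanticManifest(manifest)
if sm.validate():  # fires SemanticValidationFailure events for warnings/errors
    sm.write_json_to_file("target/semantic_manifest.json")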

Some files were not shown because too many files have changed in this diff.