Mirror of https://github.com/dbt-labs/dbt-core (synced 2025-12-17 19:31:34 +00:00)

Compare commits: `fix/docker` ... `v1.1.3` (107 commits)

| Author | SHA1 | Date |
|---|---|---|
|  | e660fefc52 |  |
|  | 7e23a3e05d |  |
|  | 6dbea380de |  |
|  | cfd6a6cc93 |  |
|  | 2298c9fbfa |  |
|  | 9f865fb148 |  |
|  | b6c9221100 |  |
|  | ad710f51cd |  |
|  | 8ef7259d9c |  |
|  | 424b454918 |  |
|  | 437b763390 |  |
|  | d10ebecdb6 |  |
|  | 8052331439 |  |
|  | 50d2981320 |  |
|  | 5ef0442654 |  |
|  | 4d63fb2ed2 |  |
|  | 413a19abe6 |  |
|  | a3dcc14f87 |  |
|  | 5e298b1854 |  |
|  | c74e9dde2b |  |
|  | 9c6bdcabf8 |  |
|  | 4ae2e41cc6 |  |
|  | cbd4570c67 |  |
|  | 5ae33b90bf |  |
|  | 37d78338c2 |  |
|  | 698420b420 |  |
|  | e23f0e0747 |  |
|  | 106da05db7 |  |
|  | 09a396a731 |  |
|  | 9c233f27ab |  |
|  | a09bc28768 |  |
|  | 365414b5fc |  |
|  | ec46be7368 |  |
|  | f23a403468 |  |
|  | 15ad34e415 |  |
|  | bacc891703 |  |
|  | a2e167761c |  |
|  | cce8fda06c |  |
|  | dd4ac1ba4a |  |
|  | 401ebc2768 |  |
|  | 83612a98b7 |  |
|  | 827eae2750 |  |
|  | 3a3bedcd8e |  |
|  | c1dfb4e6e6 |  |
|  | 5852f17f0b |  |
|  | a94156703d |  |
|  | 2b3fb7a5d0 |  |
|  | 5f2a43864f |  |
|  | 88fbc94db2 |  |
|  | 6c277b5fe1 |  |
|  | 40e64b238c |  |
|  | 581bf51574 |  |
|  | 899b0ef224 |  |
|  | 3ade206e86 |  |
|  | 58bd750007 |  |
|  | 0ec829a096 |  |
|  | 7f953a6d48 |  |
|  | 0b92f04683 |  |
|  | 3f37a43a8c |  |
|  | 204d53516a |  |
|  | 5071b00baa |  |
|  | 81118d904a |  |
|  | 69cdc4148e |  |
|  | 1c71bf414d |  |
|  | 7cf57ae72d |  |
|  | 1b6f95fef4 |  |
|  | 38940eeeea |  |
|  | 6c950bad7c |  |
|  | 5e681929ae |  |
|  | ea5a9da71e |  |
|  | 9c5ee59e19 |  |
|  | 55b1d3a191 |  |
|  | a968aa7725 |  |
|  | 5e0a765917 |  |
|  | 0aeb9976f4 |  |
|  | 30a7da8112 |  |
|  | f6a9dae422 |  |
|  | 62a7163334 |  |
|  | e2f0467f5d |  |
|  | 3e3ecb1c3f |  |
|  | 27511d807f |  |
|  | 15077d087c |  |
|  | 5b01cc0c22 |  |
|  | d1bcff865d |  |
|  | 0fce63665c |  |
|  | 1183e85eb4 |  |
|  | 3b86243f04 |  |
|  | c251dae75e |  |
|  | ecfd77f1ca |  |
|  | 9a0abc1bfc |  |
|  | 490d68e076 |  |
|  | c45147fe6d |  |
|  | bc3468e649 |  |
|  | 8fff6729a2 |  |
|  | 08f50acb9e |  |
|  | 436a5f5cd4 |  |
|  | aca710048f |  |
|  | 673ad50e21 |  |
|  | 8ee86a61a0 |  |
|  | 0dda0a90cf |  |
|  | 220d8b888c |  |
|  | 42d5812577 |  |
|  | dea4f5f8ff |  |
|  | 8f50eee330 |  |
|  | 8fd8dfcf74 |  |
|  | 10b27b9633 |  |
|  | 5808ee6dd7 |  |

`.bumpversion.cfg`

```
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.0.1
current_version = 1.1.3
parse = (?P<major>\d+)
    \.(?P<minor>\d+)
    \.(?P<patch>\d+)
@@ -24,8 +24,6 @@ values =
[bumpversion:part:pre]
first_value = 1

[bumpversion:file:setup.py]

[bumpversion:file:core/setup.py]

[bumpversion:file:core/dbt/version.py]
@@ -37,3 +35,7 @@ first_value = 1
[bumpversion:file:plugins/postgres/dbt/adapters/postgres/__version__.py]

[bumpversion:file:docker/Dockerfile]

[bumpversion:file:tests/adapter/setup.py]

[bumpversion:file:tests/adapter/dbt/tests/adapter/__version__.py]
```

16 `.changes/0.0.0.md` (new file)

@@ -0,0 +1,16 @@

## Previous Releases

For information on prior major and minor releases, see their changelogs:

* [1.0](https://github.com/dbt-labs/dbt-core/blob/1.0.latest/CHANGELOG.md)
* [0.21](https://github.com/dbt-labs/dbt-core/blob/0.21.latest/CHANGELOG.md)
* [0.20](https://github.com/dbt-labs/dbt-core/blob/0.20.latest/CHANGELOG.md)
* [0.19](https://github.com/dbt-labs/dbt-core/blob/0.19.latest/CHANGELOG.md)
* [0.18](https://github.com/dbt-labs/dbt-core/blob/0.18.latest/CHANGELOG.md)
* [0.17](https://github.com/dbt-labs/dbt-core/blob/0.17.latest/CHANGELOG.md)
* [0.16](https://github.com/dbt-labs/dbt-core/blob/0.16.latest/CHANGELOG.md)
* [0.15](https://github.com/dbt-labs/dbt-core/blob/0.15.latest/CHANGELOG.md)
* [0.14](https://github.com/dbt-labs/dbt-core/blob/0.14.latest/CHANGELOG.md)
* [0.13](https://github.com/dbt-labs/dbt-core/blob/0.13.latest/CHANGELOG.md)
* [0.12](https://github.com/dbt-labs/dbt-core/blob/0.12.latest/CHANGELOG.md)
* [0.11 and earlier](https://github.com/dbt-labs/dbt-core/blob/0.11.latest/CHANGELOG.md)

96 `.changes/1.1.0.md` (new file)

@@ -0,0 +1,96 @@

## dbt-core 1.1.0 - April 28, 2022

### Breaking Changes
- **Relevant to maintainers of adapter plugins _only_:** The abstractmethods `get_response` and `execute` now only return a `connection.AdapterResponse` in type hints. (Previously, they could return a string.) We encourage you to update your methods to return an object of class `AdapterResponse`, or implement a subclass specific to your adapter ([#4499](https://github.com/dbt-labs/dbt-core/issues/4499), [#4869](https://github.com/dbt-labs/dbt-core/pull/4869))
- For adapter plugin maintainers only: Internal adapter methods `set_relations_cache` + `_relations_cache_for_schemas` each take an additional argument, for use with experimental `CACHE_SELECTED_ONLY` config ([#4688](https://github.com/dbt-labs/dbt-core/issues/4688), [#4860](https://github.com/dbt-labs/dbt-core/pull/4860))

### Features
- Change behaviour of `not_null` test so that it only `select`s all columns if `--store-failures` is enabled. ([#4769](https://github.com/dbt-labs/dbt-core/issues/4769), [#4777](https://github.com/dbt-labs/dbt-core/pull/4777))
- Testing framework for dbt adapter testing ([#4730](https://github.com/dbt-labs/dbt-core/issues/4730), [#4846](https://github.com/dbt-labs/dbt-core/pull/4846))
- Allow unique key to take a list implementation for postgres/redshift ([#4738](https://github.com/dbt-labs/dbt-core/issues/4738), [#4858](https://github.com/dbt-labs/dbt-core/pull/4858))
- Add `--cache_selected_only` flag to cache schema object of selected models only. ([#4688](https://github.com/dbt-labs/dbt-core/issues/4688), [#4860](https://github.com/dbt-labs/dbt-core/pull/4860))
- Support custom names for generic tests ([#3348](https://github.com/dbt-labs/dbt-core/issues/3348), [#4898](https://github.com/dbt-labs/dbt-core/pull/4898))
- Added Support for Semantic Versioning ([#4453](https://github.com/dbt-labs/dbt-core/issues/4453), [#4644](https://github.com/dbt-labs/dbt-core/pull/4644))
- New Dockerfile to support specific db adapters and platforms. See docker/README.md for details ([#4495](https://github.com/dbt-labs/dbt-core/issues/4495), [#4487](https://github.com/dbt-labs/dbt-core/pull/4487))
- Allow unique_key to take a list ([#2479](https://github.com/dbt-labs/dbt-core/issues/2479), [#4618](https://github.com/dbt-labs/dbt-core/pull/4618))
- Add `--quiet` global flag and `print` Jinja function ([#3451](https://github.com/dbt-labs/dbt-core/issues/3451), [#4701](https://github.com/dbt-labs/dbt-core/pull/4701))
- Add space before justification periods ([#4737](https://github.com/dbt-labs/dbt-core/issues/4737), [#4744](https://github.com/dbt-labs/dbt-core/pull/4744))
- Enable dbt jobs to run downstream models based on fresher sources. Compare the source freshness results between previous and current state. If any source is fresher and/or new in current vs. previous state, dbt will run and test the downstream models in scope. Example command: `dbt build --select source_status:fresher+` ([#4050](https://github.com/dbt-labs/dbt-core/issues/4050), [#4256](https://github.com/dbt-labs/dbt-core/pull/4256))
- Converting unique key as list tests to new pytest format ([#4882](https://github.com/dbt-labs/dbt-core/issues/4882), [#4958](https://github.com/dbt-labs/dbt-core/pull/4958))
- Add a variable called selected_resources in the Jinja context containing a list of all the resources matching the nodes for the --select, --exclude and/or --selector parameters. ([#3471](https://github.com/dbt-labs/dbt-core/issues/3471), [#5001](https://github.com/dbt-labs/dbt-core/pull/5001))
- Support the DO_NOT_TRACK environment variable from the consoledonottrack.com initiative ([#3540](https://github.com/dbt-labs/dbt-core/issues/3540), [#5000](https://github.com/dbt-labs/dbt-core/pull/5000))
- Add `--no-print` global flag ([#4710](https://github.com/dbt-labs/dbt-core/issues/4710), [#4854](https://github.com/dbt-labs/dbt-core/pull/4854))
- Add enabled as a source config ([#3662](https://github.com/dbt-labs/dbt-core/issues/3662), [#5008](https://github.com/dbt-labs/dbt-core/pull/5008))

### Fixes
- Fix bug causing empty node level meta, snapshot config errors ([#4459](https://github.com/dbt-labs/dbt-core/issues/4459), [#4726](https://github.com/dbt-labs/dbt-core/pull/4726))
- Inconsistent timestamps between inserted/updated and deleted rows in snapshots ([#4347](https://github.com/dbt-labs/dbt-core/issues/4347), [#4513](https://github.com/dbt-labs/dbt-core/pull/4513))
- Catch all Requests Exceptions on deps install to attempt retries. Also log the exceptions hit. ([#4849](https://github.com/dbt-labs/dbt-core/issues/4849), [#4865](https://github.com/dbt-labs/dbt-core/pull/4865))
- Use cli_vars instead of context to create package and selector renderers ([#4876](https://github.com/dbt-labs/dbt-core/issues/4876), [#4878](https://github.com/dbt-labs/dbt-core/pull/4878))
- Depend on new dbt-extractor version with fixed github links ([#4891](https://github.com/dbt-labs/dbt-core/issues/4891), [#4890](https://github.com/dbt-labs/dbt-core/pull/4890))
- Update bumpversion config to stop looking for missing setup.py ([#-1](https://github.com/dbt-labs/dbt-core/issues/-1), [#4896](https://github.com/dbt-labs/dbt-core/pull/4896))
- Corrected permissions settings for docker release workflow ([#4902](https://github.com/dbt-labs/dbt-core/issues/4902), [#4903](https://github.com/dbt-labs/dbt-core/pull/4903))
- User wasn't asked for permission to overwrite a profile entry when running init inside an existing project ([#4375](https://github.com/dbt-labs/dbt-core/issues/4375), [#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
- Add project name validation to `dbt init` ([#4490](https://github.com/dbt-labs/dbt-core/issues/4490), [#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
- Allow override of string and numeric types for adapters. ([#4603](https://github.com/dbt-labs/dbt-core/issues/4603), [#4604](https://github.com/dbt-labs/dbt-core/pull/4604))
- A change in secret environment variables won't trigger a full reparse ([#4650](https://github.com/dbt-labs/dbt-core/issues/4650), [#4665](https://github.com/dbt-labs/dbt-core/pull/4665))
- Fix misspellings and typos in docstrings ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904), [#4545](https://github.com/dbt-labs/dbt-core/pull/4545))
- Catch more cases to retry package retrieval for deps pointing to the hub. Also start to cache the package requests. ([#4849](https://github.com/dbt-labs/dbt-core/issues/4849), [#4982](https://github.com/dbt-labs/dbt-core/pull/4982))
- Make the warning message for a full event deque more descriptive ([#4962](https://github.com/dbt-labs/dbt-core/issues/4962), [#5011](https://github.com/dbt-labs/dbt-core/pull/5011))
- Fix hard delete snapshot test ([#4916](https://github.com/dbt-labs/dbt-core/issues/4916), [#5020](https://github.com/dbt-labs/dbt-core/pull/5020))
- Restore ability to utilize `updated_at` for check_cols snapshots ([#5076](https://github.com/dbt-labs/dbt-core/issues/5076), [#5077](https://github.com/dbt-labs/dbt-core/pull/5077))
- Use yaml renderer (with target context) for rendering selectors ([#5131](https://github.com/dbt-labs/dbt-core/issues/5131), [#5136](https://github.com/dbt-labs/dbt-core/pull/5136))
- Fix retry logic to return values after initial try ([#5023](https://github.com/dbt-labs/dbt-core/issues/5023), [#5137](https://github.com/dbt-labs/dbt-core/pull/5137))
- Scrub secret env vars from CommandError in exception stacktrace ([#5151](https://github.com/dbt-labs/dbt-core/issues/5151), [#5152](https://github.com/dbt-labs/dbt-core/pull/5152))

### Docs
- Resolve errors related to operations preventing DAG from generating in the docs. Also patch a spark issue to allow search to filter accurately past the missing columns. ([#4578](https://github.com/dbt-labs/dbt-core/issues/4578), [#4763](https://github.com/dbt-labs/dbt-core/pull/4763))
- Fixed capitalization in UI for exposures of `type: ml` ([#4984](https://github.com/dbt-labs/dbt-core/issues/4984), [#4995](https://github.com/dbt-labs/dbt-core/pull/4995))
- List packages and tags in alphabetical order ([#4984](https://github.com/dbt-labs/dbt-core/issues/4984), [#4995](https://github.com/dbt-labs/dbt-core/pull/4995))
- Bump jekyll from 3.8.7 to 3.9.0 ([#4984](https://github.com/dbt-labs/dbt-core/issues/4984), [#4995](https://github.com/dbt-labs/dbt-core/pull/4995))
- Updated docker README to reflect necessity of using BuildKit ([#4990](https://github.com/dbt-labs/dbt-core/issues/4990), [#5018](https://github.com/dbt-labs/dbt-core/pull/5018))

### Under the Hood
- Automate changelog generation with changie ([#4652](https://github.com/dbt-labs/dbt-core/issues/4652), [#4743](https://github.com/dbt-labs/dbt-core/pull/4743))
- Add performance regression testing runner without orchestration ([#4021](https://github.com/dbt-labs/dbt-core/issues/4021), [#4602](https://github.com/dbt-labs/dbt-core/pull/4602))
- Fix broken links for changelog generation and tweak GHA to only post a comment once when changelog entry is missing. ([#4848](https://github.com/dbt-labs/dbt-core/issues/4848), [#4857](https://github.com/dbt-labs/dbt-core/pull/4857))
- Add support for Python 3.10 ([#4562](https://github.com/dbt-labs/dbt-core/issues/4562), [#4866](https://github.com/dbt-labs/dbt-core/pull/4866))
- Enable more dialects to snapshot sources with added columns, even those that don't support boolean datatypes ([#4488](https://github.com/dbt-labs/dbt-core/issues/4488), [#4871](https://github.com/dbt-labs/dbt-core/pull/4871))
- Add Graph Compilation and Adapter Cache tracking ([#4625](https://github.com/dbt-labs/dbt-core/issues/4625), [#4912](https://github.com/dbt-labs/dbt-core/pull/4912))
- Testing cleanup ([#3648](https://github.com/dbt-labs/dbt-core/issues/3648), [#4509](https://github.com/dbt-labs/dbt-core/pull/4509))
- Clean up test deprecation warnings ([#3988](https://github.com/dbt-labs/dbt-core/issues/3988), [#4556](https://github.com/dbt-labs/dbt-core/pull/4556))
- Use mashumaro for serialization in event logging ([#4504](https://github.com/dbt-labs/dbt-core/issues/4504), [#4505](https://github.com/dbt-labs/dbt-core/pull/4505))
- Drop support for Python 3.7.0 + 3.7.1 ([#4584](https://github.com/dbt-labs/dbt-core/issues/4584), [#4585](https://github.com/dbt-labs/dbt-core/pull/4585))
- Drop support for Python <3.7.2 ([#4584](https://github.com/dbt-labs/dbt-core/issues/4584), [#4643](https://github.com/dbt-labs/dbt-core/pull/4643))
- Re-format codebase (except tests) using pre-commit hooks ([#3195](https://github.com/dbt-labs/dbt-core/issues/3195), [#4697](https://github.com/dbt-labs/dbt-core/pull/4697))
- Add deps module README ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904), [#4686](https://github.com/dbt-labs/dbt-core/pull/4686))
- Initial conversion of tests to pytest ([#4690](https://github.com/dbt-labs/dbt-core/issues/4690), [#4691](https://github.com/dbt-labs/dbt-core/pull/4691))
- Fix errors in Windows for tests/functions ([#4782](https://github.com/dbt-labs/dbt-core/issues/4782), [#4767](https://github.com/dbt-labs/dbt-core/pull/4767))
- Create a dbt.tests.adapter release when releasing dbt and postgres ([#4812](https://github.com/dbt-labs/dbt-core/issues/4812), [#4948](https://github.com/dbt-labs/dbt-core/pull/4948))
- Update docker image to use python 3.10.3 ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904), [#4963](https://github.com/dbt-labs/dbt-core/pull/4963))
- Updates black to 22.3.0 which fixes dependency incompatibility when running with precommit. ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904), [#4972](https://github.com/dbt-labs/dbt-core/pull/4972))
- Adds config util for ad-hoc creation of project objs or dicts ([#4808](https://github.com/dbt-labs/dbt-core/issues/4808), [#4981](https://github.com/dbt-labs/dbt-core/pull/4981))
- Remove TableComparison and convert existing calls to use dbt.tests.util ([#4778](https://github.com/dbt-labs/dbt-core/issues/4778), [#4986](https://github.com/dbt-labs/dbt-core/pull/4986))
- Remove unneeded create_schema in snapshot materialization ([#4742](https://github.com/dbt-labs/dbt-core/issues/4742), [#4993](https://github.com/dbt-labs/dbt-core/pull/4993))
- Added .git-blame-ignore-revs file to mask re-formatting commits from git blame ([#5004](https://github.com/dbt-labs/dbt-core/issues/5004), [#5019](https://github.com/dbt-labs/dbt-core/pull/5019))
- Convert version tests to pytest ([#5024](https://github.com/dbt-labs/dbt-core/issues/5024), [#5026](https://github.com/dbt-labs/dbt-core/pull/5026))
- Updating tests and docs to show that we now support Python 3.10 ([#4974](https://github.com/dbt-labs/dbt-core/issues/4974), [#5025](https://github.com/dbt-labs/dbt-core/pull/5025))
- Update --version output and logic ([#4724](https://github.com/dbt-labs/dbt-core/issues/4724), [#5029](https://github.com/dbt-labs/dbt-core/pull/5029))
- ([#5033](https://github.com/dbt-labs/dbt-core/issues/5033), [#5032](https://github.com/dbt-labs/dbt-core/pull/5032))

### Contributors
- [@NiallRees](https://github.com/NiallRees) ([#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
- [@agoblet](https://github.com/agoblet) ([#5000](https://github.com/dbt-labs/dbt-core/pull/5000))
- [@alswang18](https://github.com/alswang18) ([#4644](https://github.com/dbt-labs/dbt-core/pull/4644))
- [@amirkdv](https://github.com/amirkdv) ([#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
- [@anaisvaillant](https://github.com/anaisvaillant) ([#4256](https://github.com/dbt-labs/dbt-core/pull/4256))
- [@b-per](https://github.com/b-per) ([#5001](https://github.com/dbt-labs/dbt-core/pull/5001))
- [@dbeatty10](https://github.com/dbeatty10) ([#5077](https://github.com/dbt-labs/dbt-core/pull/5077))
- [@ehmartens](https://github.com/ehmartens) ([#4701](https://github.com/dbt-labs/dbt-core/pull/4701))
- [@joellabes](https://github.com/joellabes) ([#4744](https://github.com/dbt-labs/dbt-core/pull/4744))
- [@jonstacks](https://github.com/jonstacks) ([#4995](https://github.com/dbt-labs/dbt-core/pull/4995))
- [@kadero](https://github.com/kadero) ([#4513](https://github.com/dbt-labs/dbt-core/pull/4513))
- [@karunpoudel](https://github.com/karunpoudel) ([#4860](https://github.com/dbt-labs/dbt-core/pull/4860), [#4860](https://github.com/dbt-labs/dbt-core/pull/4860))
- [@kazanzhy](https://github.com/kazanzhy) ([#4545](https://github.com/dbt-labs/dbt-core/pull/4545))
- [@matt-winkler](https://github.com/matt-winkler) ([#4256](https://github.com/dbt-labs/dbt-core/pull/4256))
- [@mdesmet](https://github.com/mdesmet) ([#4604](https://github.com/dbt-labs/dbt-core/pull/4604))
- [@pgoslatara](https://github.com/pgoslatara) ([#4995](https://github.com/dbt-labs/dbt-core/pull/4995))
- [@poloaraujo](https://github.com/poloaraujo) ([#4854](https://github.com/dbt-labs/dbt-core/pull/4854))
- [@sungchun12](https://github.com/sungchun12) ([#4256](https://github.com/dbt-labs/dbt-core/pull/4256))
- [@willbowditch](https://github.com/willbowditch) ([#4777](https://github.com/dbt-labs/dbt-core/pull/4777))

14 `.changes/1.1.1.md` (new file)

@@ -0,0 +1,14 @@

## dbt-core 1.1.1 - June 15, 2022

### Fixes
- Relax minimum supported version of MarkupSafe ([#4745](https://github.com/dbt-labs/dbt-core/issues/4745), [#5039](https://github.com/dbt-labs/dbt-core/pull/5039))
- When parsing, 'all_sources' should be a list of unique dirs ([#5120](https://github.com/dbt-labs/dbt-core/issues/5120), [#5176](https://github.com/dbt-labs/dbt-core/pull/5176))
- Remove docs file from manifest when removing doc node ([#4146](https://github.com/dbt-labs/dbt-core/issues/4146), [#5270](https://github.com/dbt-labs/dbt-core/pull/5270))
- Fixing Windows color regression ([#5191](https://github.com/dbt-labs/dbt-core/issues/5191), [#5327](https://github.com/dbt-labs/dbt-core/pull/5327))

### Under the Hood
- Update context readme + clean up context code ([#4796](https://github.com/dbt-labs/dbt-core/issues/4796), [#5334](https://github.com/dbt-labs/dbt-core/pull/5334))

### Dependencies
- Bumping hologram version ([#5219](https://github.com/dbt-labs/dbt-core/issues/5219), [#5218](https://github.com/dbt-labs/dbt-core/pull/5218))
- Pin networkx to <2.8.4 for v1.1 patches ([#5286](https://github.com/dbt-labs/dbt-core/issues/5286), [#5334](https://github.com/dbt-labs/dbt-core/pull/5334))

### Contributors
- [@adamantike](https://github.com/adamantike) ([#5039](https://github.com/dbt-labs/dbt-core/pull/5039))

5 `.changes/1.1.2.md` (new file)

@@ -0,0 +1,5 @@

## dbt-core 1.1.2 - July 28, 2022

### Fixes
- Define compatibility for older manifest versions when using state: selection methods ([#5213](https://github.com/dbt-labs/dbt-core/issues/5213), [#5346](https://github.com/dbt-labs/dbt-core/pull/5346))

### Under the Hood
- Add annotation to render_value method reimplemented in #5334 ([#4796](https://github.com/dbt-labs/dbt-core/issues/4796), [#5382](https://github.com/dbt-labs/dbt-core/pull/5382))

3 `.changes/1.1.3.md` (new file)

@@ -0,0 +1,3 @@

## dbt-core 1.1.3 - January 05, 2023

### Fixes
- Bug when partial parsing with an empty schema file ([#4850](https://github.com/dbt-labs/dbt-core/issues/4850), [#<no value>](https://github.com/dbt-labs/dbt-core/pull/<no value>))

53 `.changes/README.md` (new file)

@@ -0,0 +1,53 @@

# CHANGELOG Automation

We use [changie](https://changie.dev/) to automate `CHANGELOG` generation. For installation and format/command specifics, see the documentation.

### Quick Tour

- All new change entries get generated under `/.changes/unreleased` as a yaml file
- `header.tpl.md` contains the contents of the header for the entire CHANGELOG file
- `0.0.0.md` contains the contents of the footer for the entire CHANGELOG file. changie appears to be adding support for a footer file, analogous to its header file support; switch to that when it becomes available. For now, the 0.0.0 in the file name forces it to the bottom of the changelog no matter what version we are releasing.
- `.changie.yaml` contains the fields in a change, the format of a single change, as well as the format of the Contributors section for each version.

### Workflow

#### Daily workflow
Almost every code change we make associated with an issue will require a `CHANGELOG` entry. After you have created the PR in GitHub, run `changie new` and follow the command prompts to generate a yaml file with your change details. This only needs to be done once per PR.

The `changie new` command will ensure correct file format and file name. There is a one-to-one mapping of issues to changes; multiple issues cannot be lumped into a single entry. If you make a mistake, the yaml file may be directly modified and saved as long as the format is preserved.

Note: If your PR has been cleared by the Core Team as not needing a changelog entry, the `Skip Changelog` label may be put on the PR to bypass the GitHub action that blocks PRs from being merged when they are missing a `CHANGELOG` entry.

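As a rough illustration (not copied from changie output), an entry written by `changie new` is a small yaml file under `.changes/unreleased/` whose fields match the `kinds` and `custom` keys defined in `.changie.yaml` further down this compare. The file name, timestamp, and author below are placeholders; the body, issue, and PR mirror one of the 1.1.0 entries above:

```
# .changes/unreleased/Fixes-20220415-123456.yaml  (hypothetical file name)
kind: Fixes
body: Fix hard delete snapshot test
time: 2022-04-15T12:34:56.000000-05:00   # placeholder timestamp
custom:
  Author: someuser   # GitHub username(s), space-separated if multiple (placeholder)
  Issue: 4916
  PR: 5020
```
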
#### Prerelease Workflow
These commands batch up the changes in `/.changes/unreleased` to be included in this prerelease and move those files to a directory named for the release version. The `--move-dir` is created under `/.changes` if it does not already exist.

```
changie batch <version> --move-dir '<version>' --prerelease 'rc1'
changie merge
```

Example
```
changie batch 1.0.5 --move-dir '1.0.5' --prerelease 'rc1'
changie merge
```

#### Final Release Workflow
These commands batch up changes in `/.changes/unreleased` as well as `/.changes/<version>` to be included in this final release and delete all prereleases. This rolls all prereleases up into a single final release. All `yaml` files in `/unreleased` and `<version>` will be deleted at this point.

```
changie batch <version> --include '<version>' --remove-prereleases
changie merge
```

Example
```
changie batch 1.0.5 --include '1.0.5' --remove-prereleases
changie merge
```

### A Note on Manual Edits & Gotchas
- Changie generates markdown files in the `.changes` directory that are parsed together with the `changie merge` command. Every time `changie merge` is run, it regenerates the entire file. For this reason, any changes made directly to `CHANGELOG.md` will be overwritten on the next run of `changie merge`.
- If changes need to be made to the `CHANGELOG.md`, make the changes to the relevant `<version>.md` file located in the `/.changes` directory. You will then run `changie merge` to regenerate the `CHANGELOG.md`.
- Do not run `changie batch` again on released versions. Our final release workflow deletes all of the yaml files associated with individual changes. If for some reason modifications to the `CHANGELOG.md` are required after we've generated the final release `CHANGELOG.md`, the modifications need to be made manually to the `<version>.md` file in the `/.changes` directory.
- changie can modify, create, and delete files depending on the command you run. This is expected. Be sure to commit everything that has been modified and deleted.

6 `.changes/header.tpl.md` (new executable file)

@@ -0,0 +1,6 @@

# dbt Core Changelog

- This file provides a full account of all changes to `dbt-core` and `dbt-postgres`
- Changes are listed under the (pre)release in which they first appear. Subsequent releases include changes from previous releases.
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry)

0 `test/integration/004_simple_snapshot_tests/models/.gitkeep` → `.changes/unreleased/.gitkeep` (Normal file → Executable file)

61 `.changie.yaml` (new executable file)

@@ -0,0 +1,61 @@

```
changesDir: .changes
unreleasedDir: unreleased
headerPath: header.tpl.md
versionHeaderPath: ""
changelogPath: CHANGELOG.md
versionExt: md
versionFormat: '## dbt-core {{.Version}} - {{.Time.Format "January 02, 2006"}}'
kindFormat: '### {{.Kind}}'
changeFormat: '- {{.Body}} ([#{{.Custom.Issue}}](https://github.com/dbt-labs/dbt-core/issues/{{.Custom.Issue}}), [#{{.Custom.PR}}](https://github.com/dbt-labs/dbt-core/pull/{{.Custom.PR}}))'
kinds:
- label: Breaking Changes
- label: Features
- label: Fixes
- label: Docs
- label: Under the Hood
- label: Dependencies
- label: Security
custom:
- key: Author
  label: GitHub Username(s) (separated by a single space if multiple)
  type: string
  minLength: 3
- key: Issue
  label: GitHub Issue Number
  type: int
  minLength: 4
- key: PR
  label: GitHub Pull Request Number
  type: int
  minLength: 4
footerFormat: |
  {{- $contributorDict := dict }}
  {{- /* any names added to this list should be all lowercase for later matching purposes */}}
  {{- $core_team := list "emmyoop" "nathaniel-may" "gshank" "leahwicz" "chenyulinx" "stu-k" "iknox-fa" "versusfacit" "mcknight-42" "jtcohen6" "dependabot" }}
  {{- range $change := .Changes }}
  {{- $authorList := splitList " " $change.Custom.Author }}
  {{- /* loop through all authors for a PR */}}
  {{- range $author := $authorList }}
  {{- $authorLower := lower $author }}
  {{- /* we only want to include non-core team contributors */}}
  {{- if not (has $authorLower $core_team)}}
  {{- $pr := $change.Custom.PR }}
  {{- /* check if this contributor has other PRs associated with them already */}}
  {{- if hasKey $contributorDict $author }}
  {{- $prList := get $contributorDict $author }}
  {{- $prList = append $prList $pr }}
  {{- $contributorDict := set $contributorDict $author $prList }}
  {{- else }}
  {{- $prList := list $change.Custom.PR }}
  {{- $contributorDict := set $contributorDict $author $prList }}
  {{- end }}
  {{- end}}
  {{- end}}
  {{- end }}
  {{- /* no indentation here for formatting so the final markdown doesn't have unneeded indentations */}}
  {{- if $contributorDict}}
  ### Contributors
  {{- range $k,$v := $contributorDict }}
  - [@{{$k}}](https://github.com/{{$k}}) ({{ range $index, $element := $v }}{{if $index}}, {{end}}[#{{$element}}](https://github.com/dbt-labs/dbt-core/pull/{{$element}}){{end}})
  {{- end }}
  {{- end }}
```

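To make the format strings above concrete, here is a sketch (reusing the hypothetical entry from the `.changes/README.md` section earlier) of how one unreleased entry flows through `changeFormat` into the bullet lines visible in `.changes/1.1.0.md` above; the comments describe the rendering, not literal changie output:

```
# Input: one unreleased entry (abbreviated)
kind: Fixes
body: Fix hard delete snapshot test
custom:
  Issue: 4916
  PR: 5020
# changeFormat renders this entry as the changelog line:
#   - Fix hard delete snapshot test ([#4916](https://github.com/dbt-labs/dbt-core/issues/4916), [#5020](https://github.com/dbt-labs/dbt-core/pull/5020))
# versionFormat and kindFormat produce the surrounding "## dbt-core 1.1.0 - April 28, 2022"
# and "### Fixes" headings seen above.
```
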
2 `.flake8`

```
@@ -8,5 +8,5 @@ ignore =
W504
E203 # makes Flake8 work like black
E741
max-line-length = 99
E501 # long line checking is done in black
exclude = test
```

2 `.git-blame-ignore-revs` (new file)

@@ -0,0 +1,2 @@

```
# Reformatting dbt-core via black, flake8, mypy, and assorted pre-commit hooks.
43e3fc22c4eae4d3d901faba05e33c40f1f1dc5a
```

2 `.github/pull_request_template.md` (vendored)

@@ -18,4 +18,4 @@ resolves #

- [ ] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements)
- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] This PR includes tests, or tests are not required/relevant for this PR
- [ ] I have updated the `CHANGELOG.md` and added information about my change
- [ ] I have added information about my change to be included in the [CHANGELOG](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#Adding-CHANGELOG-Entry).

95 `.github/scripts/integration-test-matrix.js` (vendored)

@@ -1,95 +0,0 @@

```
module.exports = ({ context }) => {
  const defaultPythonVersion = "3.8";
  const supportedPythonVersions = ["3.7", "3.8", "3.9"];
  const supportedAdapters = ["postgres"];

  // if PR, generate matrix based on files changed and PR labels
  if (context.eventName.includes("pull_request")) {
    // `changes` is a list of adapter names that have related
    // file changes in the PR
    // ex: ['postgres', 'snowflake']
    const changes = JSON.parse(process.env.CHANGES);
    const labels = context.payload.pull_request.labels.map(({ name }) => name);
    console.log("labels", labels);
    console.log("changes", changes);
    const testAllLabel = labels.includes("test all");
    const include = [];

    for (const adapter of supportedAdapters) {
      if (
        changes.includes(adapter) ||
        testAllLabel ||
        labels.includes(`test ${adapter}`)
      ) {
        for (const pythonVersion of supportedPythonVersions) {
          if (
            pythonVersion === defaultPythonVersion ||
            labels.includes(`test python${pythonVersion}`) ||
            testAllLabel
          ) {
            // always run tests on ubuntu by default
            include.push({
              os: "ubuntu-latest",
              adapter,
              "python-version": pythonVersion,
            });

            if (labels.includes("test windows") || testAllLabel) {
              include.push({
                os: "windows-latest",
                adapter,
                "python-version": pythonVersion,
              });
            }

            if (labels.includes("test macos") || testAllLabel) {
              include.push({
                os: "macos-latest",
                adapter,
                "python-version": pythonVersion,
              });
            }
          }
        }
      }
    }

    console.log("matrix", { include });

    return {
      include,
    };
  }
  // if not PR, generate matrix of python version, adapter, and operating
  // system to run integration tests on

  const include = [];
  // run for all adapters and python versions on ubuntu
  for (const adapter of supportedAdapters) {
    for (const pythonVersion of supportedPythonVersions) {
      include.push({
        os: 'ubuntu-latest',
        adapter: adapter,
        "python-version": pythonVersion,
      });
    }
  }

  // additionally include runs for all adapters, on macos and windows,
  // but only for the default python version
  for (const adapter of supportedAdapters) {
    for (const operatingSystem of ["windows-latest", "macos-latest"]) {
      include.push({
        os: operatingSystem,
        adapter: adapter,
        "python-version": defaultPythonVersion,
      });
    }
  }

  console.log("matrix", { include });

  return {
    include,
  };
};
```

2 `.github/workflows/backport.yml` (vendored)

```
@@ -29,6 +29,6 @@ jobs:
    name: Backport
    steps:
      - name: Backport
        uses: tibdex/backport@v1.1.1
        uses: dbt-labs/backport@v1.1.1
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
```

78 `.github/workflows/changelog-check.yml` (vendored, new file)

@@ -0,0 +1,78 @@

```
# **what?**
# Checks that a file has been committed under the /.changes directory
# as a new CHANGELOG entry. Cannot check for a specific filename as
# it is dynamically generated by change type and timestamp.
# This workflow should not require any secrets since it runs for PRs
# from forked repos.
# By default, secrets are not passed to workflows running from
# a forked repo.

# **why?**
# Ensure code change gets reflected in the CHANGELOG.

# **when?**
# This will run for all PRs going into main and *.latest. It will
# run when they are opened, reopened, when any label is added or removed
# and when new code is pushed to the branch. The action will then get
# skipped if the 'Skip Changelog' label is present in any of the labels.

name: Check Changelog Entry

on:
  pull_request:
    types: [opened, reopened, labeled, unlabeled, synchronize]
  workflow_dispatch:

defaults:
  run:
    shell: bash

permissions:
  contents: read
  pull-requests: write

env:
  changelog_comment: 'Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry).'

jobs:
  changelog:
    name: changelog
    if: "!contains(github.event.pull_request.labels.*.name, 'Skip Changelog')"

    runs-on: ubuntu-latest

    steps:
      - name: Check if changelog file was added
        # https://github.com/marketplace/actions/paths-changes-filter
        # For each filter, it sets output variable named by the filter to the text:
        #  'true' - if any of changed files matches any of filter rules
        #  'false' - if none of changed files matches any of filter rules
        # also, returns:
        #  `changes` - JSON array with names of all filters matching any of the changed files
        uses: dorny/paths-filter@v2
        id: filter
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          filters: |
            changelog:
              - added: '.changes/unreleased/**.yaml'
      - name: Check if comment already exists
        uses: peter-evans/find-comment@v1
        id: changelog_comment
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: 'github-actions[bot]'
          body-includes: ${{ env.changelog_comment }}
      - name: Create PR comment if changelog entry is missing, required, and does not exist
        if: |
          steps.filter.outputs.changelog == 'false' &&
          steps.changelog_comment.outputs.comment-body == ''
        uses: peter-evans/create-or-update-comment@v1
        with:
          issue-number: ${{ github.event.pull_request.number }}
          body: ${{ env.changelog_comment }}
      - name: Fail job if changelog entry is missing and required
        if: steps.filter.outputs.changelog == 'false'
        uses: actions/github-script@v6
        with:
          script: core.setFailed('Changelog entry required to merge.')
```

222 `.github/workflows/integration.yml` (vendored)

@@ -1,222 +0,0 @@

```
# **what?**
# This workflow runs all integration tests for supported OS
# and python versions and core adapters. If triggered by PR,
# the workflow will only run tests for adapters related
# to code changes. Use the `test all` and `test ${adapter}`
# label to run all or additional tests. Use `ok to test`
# label to mark PRs from forked repositories that are safe
# to run integration tests for. Requires secrets to run
# against different warehouses.

# **why?**
# This checks the functionality of dbt from a user's perspective
# and attempts to catch functional regressions.

# **when?**
# This workflow will run on every push to a protected branch
# and when manually triggered. It will also run for all PRs, including
# PRs from forks. The workflow will be skipped until there is a label
# to mark the PR as safe to run.

name: Adapter Integration Tests

on:
  # pushes to release branches
  push:
    branches:
      - "main"
      - "develop"
      - "*.latest"
      - "releases/*"
  # all PRs, important to note that `pull_request_target` workflows
  # will run in the context of the target branch of a PR
  pull_request_target:
  # manual trigger
  workflow_dispatch:

# explicitly turn off permissions for `GITHUB_TOKEN`
permissions: read-all

# will cancel previous workflows triggered by the same event and for the same ref for PRs or same SHA otherwise
concurrency:
  group: ${{ github.workflow }}-${{ github.event_name }}-${{ contains(github.event_name, 'pull_request') && github.event.pull_request.head.ref || github.sha }}
  cancel-in-progress: true

# sets default shell to bash, for all operating systems
defaults:
  run:
    shell: bash

jobs:
  # generate test metadata about what files changed and the testing matrix to use
  test-metadata:
    # run if not a PR from a forked repository or has a label to mark as safe to test
    if: >-
      github.event_name != 'pull_request_target' ||
      github.event.pull_request.head.repo.full_name == github.repository ||
      contains(github.event.pull_request.labels.*.name, 'ok to test')

    runs-on: ubuntu-latest

    outputs:
      matrix: ${{ steps.generate-matrix.outputs.result }}

    steps:
      - name: Check out the repository (non-PR)
        if: github.event_name != 'pull_request_target'
        uses: actions/checkout@v2
        with:
          persist-credentials: false

      - name: Check out the repository (PR)
        if: github.event_name == 'pull_request_target'
        uses: actions/checkout@v2
        with:
          persist-credentials: false
          ref: ${{ github.event.pull_request.head.sha }}

      - name: Check if relevant files changed
        # https://github.com/marketplace/actions/paths-changes-filter
        # For each filter, it sets output variable named by the filter to the text:
        #  'true' - if any of changed files matches any of filter rules
        #  'false' - if none of changed files matches any of filter rules
        # also, returns:
        #  `changes` - JSON array with names of all filters matching any of the changed files
        uses: dorny/paths-filter@v2
        id: get-changes
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          filters: |
            postgres:
              - 'core/**'
              - 'plugins/postgres/**'
              - 'dev-requirements.txt'

      - name: Generate integration test matrix
        id: generate-matrix
        uses: actions/github-script@v4
        env:
          CHANGES: ${{ steps.get-changes.outputs.changes }}
        with:
          script: |
            const script = require('./.github/scripts/integration-test-matrix.js')
            const matrix = script({ context })
            console.log(matrix)
            return matrix

  test:
    name: ${{ matrix.adapter }} / python ${{ matrix.python-version }} / ${{ matrix.os }}

    # run if not a PR from a forked repository or has a label to mark as safe to test
    # also checks that the matrix generated is not empty
    if: >-
      needs.test-metadata.outputs.matrix &&
      fromJSON( needs.test-metadata.outputs.matrix ).include[0] &&
      (
        github.event_name != 'pull_request_target' ||
        github.event.pull_request.head.repo.full_name == github.repository ||
        contains(github.event.pull_request.labels.*.name, 'ok to test')
      )

    runs-on: ${{ matrix.os }}

    needs: test-metadata

    strategy:
      fail-fast: false
      matrix: ${{ fromJSON(needs.test-metadata.outputs.matrix) }}

    env:
      TOXENV: integration-${{ matrix.adapter }}
      PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
      DBT_INVOCATION_ENV: github-actions

    steps:
      - name: Check out the repository
        if: github.event_name != 'pull_request_target'
        uses: actions/checkout@v2
        with:
          persist-credentials: false

      # explicitly check out the branch for the PR,
      # this is necessary for the `pull_request_target` event
      - name: Check out the repository (PR)
        if: github.event_name == 'pull_request_target'
        uses: actions/checkout@v2
        with:
          persist-credentials: false
          ref: ${{ github.event.pull_request.head.sha }}

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Set up postgres (linux)
        if: |
          matrix.adapter == 'postgres' &&
          runner.os == 'Linux'
        uses: ./.github/actions/setup-postgres-linux

      - name: Set up postgres (macos)
        if: |
          matrix.adapter == 'postgres' &&
          runner.os == 'macOS'
        uses: ./.github/actions/setup-postgres-macos

      - name: Set up postgres (windows)
        if: |
          matrix.adapter == 'postgres' &&
          runner.os == 'Windows'
        uses: ./.github/actions/setup-postgres-windows

      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install tox
          pip --version
          tox --version

      - name: Run tox (postgres)
        if: matrix.adapter == 'postgres'
        run: tox

      - uses: actions/upload-artifact@v2
        if: always()
        with:
          name: logs
          path: ./logs

      - name: Get current date
        if: always()
        id: date
        run: echo "::set-output name=date::$(date +'%Y-%m-%dT%H_%M_%S')" # no colons allowed for artifacts

      - uses: actions/upload-artifact@v2
        if: always()
        with:
          name: integration_results_${{ matrix.python-version }}_${{ matrix.os }}_${{ matrix.adapter }}-${{ steps.date.outputs.date }}.csv
          path: integration_results.csv

  require-label-comment:
    runs-on: ubuntu-latest

    needs: test

    permissions:
      pull-requests: write

    steps:
      - name: Needs permission PR comment
        if: >-
          needs.test.result == 'skipped' &&
          github.event_name == 'pull_request_target' &&
          github.event.pull_request.head.repo.full_name != github.repository
        uses: unsplash/comment-on-pr@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          msg: |
            "You do not have permissions to run integration tests, @dbt-labs/core "\
            "needs to label this PR with `ok to test` in order to run integration tests!"
          check_for_duplicate_msg: true
```

151 `.github/workflows/main.yml` (vendored)

```
@@ -1,9 +1,8 @@
# **what?**
# Runs code quality checks, unit tests, and verifies python build on
# all code committed to the repository. This workflow should not
# require any secrets since it runs for PRs from forked repos.
# By default, secrets are not passed to workflows running from
# a forked repo.
# Runs code quality checks, unit tests, integration tests and
# verifies python build on all code committed to the repository. This workflow
# should not require any secrets since it runs for PRs from forked repos. By
# default, secrets are not passed to workflows running from a forked repo.

# **why?**
# Ensure code for dbt meets a certain quality standard.
@@ -18,7 +17,6 @@ on:
  push:
    branches:
      - "main"
      - "develop"
      - "*.latest"
      - "releases/*"
  pull_request:
@@ -39,26 +37,26 @@ jobs:
  code-quality:
    name: code-quality

    runs-on: ubuntu-latest
    runs-on: ubuntu-20.04

    steps:
      - name: Check out the repository
        uses: actions/checkout@v2
        with:
          persist-credentials: false

      - name: Set up Python
        uses: actions/setup-python@v2
        uses: actions/setup-python@v4.3.0
        with:
          python-version: '3.8'

      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install pre-commit
          pip install mypy==0.782
          pip install -r editable-requirements.txt
          pip --version
          pip install pre-commit
          pre-commit --version
          pip install mypy==0.782
          mypy --version
          pip install -r editable-requirements.txt
          dbt --version

      - name: Run pre-commit hooks
@@ -67,12 +65,12 @@ jobs:
  unit:
    name: unit test / python ${{ matrix.python-version }}

    runs-on: ubuntu-latest
    runs-on: ubuntu-20.04

    strategy:
      fail-fast: false
      matrix:
        python-version: [3.7, 3.8, 3.9]
        python-version: ['3.7', '3.8', '3.9', '3.10']

    env:
      TOXENV: "unit"
@@ -81,19 +79,17 @@ jobs:
    steps:
      - name: Check out the repository
        uses: actions/checkout@v2
        with:
          persist-credentials: false

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        uses: actions/setup-python@v4.3.0
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install tox
          pip --version
          pip install tox
          tox --version

      - name: Run tox
@@ -110,21 +106,88 @@ jobs:
          name: unit_results_${{ matrix.python-version }}-${{ steps.date.outputs.date }}.csv
          path: unit_results.csv

  build:
    name: build packages
  integration:
    name: integration test / python ${{ matrix.python-version }} / ${{ matrix.os }}

    runs-on: ubuntu-latest
    runs-on: ${{ matrix.os }}

    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.7', '3.8', '3.9', '3.10']
        os: [ubuntu-20.04]
        include:
          - python-version: 3.8
            os: windows-latest
          - python-version: 3.8
            os: macos-latest

    env:
      TOXENV: integration
      PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
      DBT_INVOCATION_ENV: github-actions

    steps:
      - name: Check out the repository
        uses: actions/checkout@v2

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4.3.0
        with:
          persist-credentials: false
          python-version: ${{ matrix.python-version }}

      - name: Set up postgres (linux)
        if: runner.os == 'Linux'
        uses: ./.github/actions/setup-postgres-linux

      - name: Set up postgres (macos)
        if: runner.os == 'macOS'
        uses: ./.github/actions/setup-postgres-macos

      - name: Set up postgres (windows)
        if: runner.os == 'Windows'
        uses: ./.github/actions/setup-postgres-windows

      - name: Install python tools
        run: |
          # pip install --user --upgrade pip
          # pip --version
          pip install tox
          tox --version

      - name: Run tests
        run: tox

      - name: Get current date
        if: always()
        id: date
        run: echo "::set-output name=date::$(date +'%Y_%m_%dT%H_%M_%S')" # no colons allowed for artifacts

      - uses: actions/upload-artifact@v2
        if: always()
        with:
          name: logs_${{ matrix.python-version }}_${{ matrix.os }}_${{ steps.date.outputs.date }}
          path: ./logs

      - uses: actions/upload-artifact@v2
        if: always()
        with:
          name: integration_results_${{ matrix.python-version }}_${{ matrix.os }}_${{ steps.date.outputs.date }}.csv
          path: integration_results.csv

  build:
    name: build packages

    runs-on: ubuntu-20.04

    steps:
      - name: Check out the repository
        uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        uses: actions/setup-python@v4.3.0
        with:
          python-version: 3.8
          python-version: '3.8'

      - name: Install python dependencies
        run: |
@@ -146,44 +209,6 @@ jobs:
        run: |
          check-wheel-contents dist/*.whl --ignore W007,W008

      - uses: actions/upload-artifact@v2
        with:
          name: dist
          path: dist/

  test-build:
    name: verify packages / python ${{ matrix.python-version }} / ${{ matrix.os }}

    needs: build

    runs-on: ${{ matrix.os }}

    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: [3.7, 3.8, 3.9]

    steps:
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install --upgrade wheel
          pip --version

      - uses: actions/download-artifact@v2
        with:
          name: dist
          path: dist/

      - name: Show distributions
        run: ls -lh dist/

      - name: Install wheel distributions
        run: |
          find ./dist/*.whl -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
```

176 `.github/workflows/performance.yml` (vendored)

@@ -1,176 +0,0 @@

```
name: Performance Regression Tests
# Schedule triggers
on:
  # runs twice a day at 10:05am and 10:05pm
  schedule:
    - cron: "5 10,22 * * *"
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  # checks fmt of runner code
  # purposefully not a dependency of any other job
  # will block merging, but not prevent developing
  fmt:
    name: Cargo fmt
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - run: rustup component add rustfmt
      - uses: actions-rs/cargo@v1
        with:
          command: fmt
          args: --manifest-path performance/runner/Cargo.toml --all -- --check

  # runs any tests associated with the runner
  # these tests make sure the runner logic is correct
  test-runner:
    name: Test Runner
    runs-on: ubuntu-latest
    env:
      # turns warnings into errors
      RUSTFLAGS: "-D warnings"
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - uses: actions-rs/cargo@v1
        with:
          command: test
          args: --manifest-path performance/runner/Cargo.toml

  # build an optimized binary to be used as the runner in later steps
  build-runner:
    needs: [test-runner]
    name: Build Runner
    runs-on: ubuntu-latest
    env:
      RUSTFLAGS: "-D warnings"
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - uses: actions-rs/cargo@v1
        with:
          command: build
          args: --release --manifest-path performance/runner/Cargo.toml
      - uses: actions/upload-artifact@v2
        with:
          name: runner
          path: performance/runner/target/release/runner

  # run the performance measurements on the current or default branch
  measure-dev:
    needs: [build-runner]
    name: Measure Dev Branch
    runs-on: ubuntu-latest
    steps:
      - name: checkout dev
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2.2.2
        with:
          python-version: "3.8"
      - name: install dbt
        run: pip install -r dev-requirements.txt -r editable-requirements.txt
      - name: install hyperfine
        run: wget https://github.com/sharkdp/hyperfine/releases/download/v1.11.0/hyperfine_1.11.0_amd64.deb && sudo dpkg -i hyperfine_1.11.0_amd64.deb
      - uses: actions/download-artifact@v2
        with:
          name: runner
      - name: change permissions
        run: chmod +x ./runner
      - name: run
        run: ./runner measure -b dev -p ${{ github.workspace }}/performance/projects/
      - uses: actions/upload-artifact@v2
        with:
          name: dev-results
          path: performance/results/

  # run the performance measurements on the release branch which we use
  # as a performance baseline. This part takes by far the longest, so
  # we do everything we can first so the job fails fast.
  # -----
  # we need to checkout dbt twice in this job: once for the baseline dbt
  # version, and once to get the latest regression testing projects,
  # metrics, and runner code from the develop or current branch so that
  # the calculations match for both versions of dbt we are comparing.
  measure-baseline:
    needs: [build-runner]
    name: Measure Baseline Branch
    runs-on: ubuntu-latest
    steps:
      - name: checkout latest
        uses: actions/checkout@v2
        with:
          ref: "0.20.latest"
      - name: Setup Python
        uses: actions/setup-python@v2.2.2
        with:
          python-version: "3.8"
      - name: move repo up a level
        run: mkdir ${{ github.workspace }}/../baseline/ && cp -r ${{ github.workspace }} ${{ github.workspace }}/../baseline
      - name: "[debug] ls new dbt location"
        run: ls ${{ github.workspace }}/../baseline/dbt/
      # installation creates egg-links so we have to preserve source
      - name: install dbt from new location
        run: cd ${{ github.workspace }}/../baseline/dbt/ && pip install -r dev-requirements.txt -r editable-requirements.txt
      # checkout the current branch to get all the target projects
      # this deletes the old checked out code which is why we had to copy before
      - name: checkout dev
        uses: actions/checkout@v2
      - name: install hyperfine
        run: wget https://github.com/sharkdp/hyperfine/releases/download/v1.11.0/hyperfine_1.11.0_amd64.deb && sudo dpkg -i hyperfine_1.11.0_amd64.deb
      - uses: actions/download-artifact@v2
        with:
          name: runner
      - name: change permissions
        run: chmod +x ./runner
      - name: run runner
        run: ./runner measure -b baseline -p ${{ github.workspace }}/performance/projects/
      - uses: actions/upload-artifact@v2
        with:
          name: baseline-results
          path: performance/results/

  # detect regressions on the output generated from measuring
  # the two branches. Exits with non-zero code if a regression is detected.
  calculate-regressions:
    needs: [measure-dev, measure-baseline]
    name: Compare Results
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v2
        with:
          name: dev-results
      - uses: actions/download-artifact@v2
        with:
          name: baseline-results
      - name: "[debug] ls result files"
        run: ls
      - uses: actions/download-artifact@v2
        with:
          name: runner
      - name: change permissions
        run: chmod +x ./runner
      - name: make results directory
        run: mkdir ./final-output/
      - name: run calculation
        run: ./runner calculate -r ./ -o ./final-output/
      # always attempt to upload the results even if there were regressions found
      - uses: actions/upload-artifact@v2
        if: ${{ always() }}
        with:
          name: final-calculations
          path: ./final-output/*
```

@@ -12,6 +12,9 @@
|
||||
|
||||
name: Docker release
|
||||
|
||||
permissions:
|
||||
packages: write
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
@@ -6,7 +6,6 @@
|
||||
# version of our structured logging and add new documentation to
|
||||
# communicate these changes.
|
||||
|
||||
|
||||
name: Structured Logging Schema Check
|
||||
on:
|
||||
push:
|
||||
@@ -23,16 +22,15 @@ jobs:
|
||||
# run the performance measurements on the current or default branch
|
||||
test-schema:
|
||||
name: Test Log Schema
|
||||
runs-on: ubuntu-latest
|
||||
runs-on: ubuntu-20.04
|
||||
env:
|
||||
# turns warnings into errors
|
||||
RUSTFLAGS: "-D warnings"
|
||||
# points tests to the log file
|
||||
LOG_DIR: "/home/runner/work/dbt-core/dbt-core/logs"
|
||||
# tells integration tests to output into json format
|
||||
DBT_LOG_FORMAT: 'json'
|
||||
DBT_LOG_FORMAT: "json"
|
||||
steps:
|
||||
|
||||
- name: checkout dev
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
@@ -49,8 +47,12 @@ jobs:
|
||||
toolchain: stable
|
||||
override: true
|
||||
|
||||
- name: install dbt
|
||||
run: pip install -r dev-requirements.txt -r editable-requirements.txt
|
||||
- name: Install python dependencies
|
||||
run: |
|
||||
pip install --user --upgrade pip
|
||||
pip --version
|
||||
pip install tox
|
||||
tox --version
|
||||
|
||||
- name: Set up postgres
|
||||
uses: ./.github/actions/setup-postgres-linux
|
||||
@@ -61,7 +63,7 @@ jobs:
|
||||
# integration tests generate a ton of logs in different files. the next step will find them all.
|
||||
# we actually care if these pass, because the normal test run doesn't usually include many json log outputs
|
||||
- name: Run integration tests
|
||||
run: tox -e py38-postgres -- -nauto
|
||||
run: tox -e integration -- -nauto
|
||||
|
||||
# apply our schema tests to every log event from the previous step
|
||||
# skips any output that isn't valid json
|
||||
|
||||
.github/workflows/version-bump.yml (vendored, 122 changes)
@@ -1,18 +1,15 @@
|
||||
# **what?**
|
||||
# This workflow will take a version number and a dry run flag. With that
|
||||
# This workflow will take the new version number to bump to. With that
|
||||
# it will run versionbump to update the version number everywhere in the
|
||||
# code base and then generate an update Docker requirements file. If this
|
||||
# is a dry run, a draft PR will open with the changes. If this isn't a dry
|
||||
# run, the changes will be committed to the branch this is run on.
|
||||
# code base and then run changie to create the corresponding changelog.
|
||||
# A PR will be created with the changes that can be reviewed before committing.
|
||||
|
||||
# **why?**
|
||||
# This is to aid in releasing dbt and making sure we have updated
|
||||
# the versions and Docker requirements in all places.
|
||||
# the version in all places and generated the changelog.
|
||||
|
||||
# **when?**
|
||||
# This is triggered either manually OR
|
||||
# from the repository_dispatch event "version-bump" which is sent from
|
||||
# the dbt-release repo Action
|
||||
# This is triggered manually
|
||||
|
||||
name: Version Bump
|
||||
|
||||
@@ -20,35 +17,21 @@ on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
version_number:
|
||||
description: 'The version number to bump to'
|
||||
description: 'The version number to bump to (ex. 1.2.0, 1.3.0b1)'
|
||||
required: true
|
||||
is_dry_run:
|
||||
description: 'Creates a draft PR to allow testing instead of committing to a branch'
|
||||
required: true
|
||||
default: 'true'
|
||||
repository_dispatch:
|
||||
types: [version-bump]
|
||||
|
||||
jobs:
|
||||
bump:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: "[DEBUG] Print Variables"
|
||||
run: |
|
||||
echo "all variables defined as inputs"
|
||||
echo The version_number: ${{ github.event.inputs.version_number }}
|
||||
|
||||
- name: Check out the repository
|
||||
uses: actions/checkout@v2
|
||||
|
||||
- name: Set version and dry run values
|
||||
id: variables
|
||||
env:
|
||||
VERSION_NUMBER: "${{ github.event.client_payload.version_number == '' && github.event.inputs.version_number || github.event.client_payload.version_number }}"
|
||||
IS_DRY_RUN: "${{ github.event.client_payload.is_dry_run == '' && github.event.inputs.is_dry_run || github.event.client_payload.is_dry_run }}"
|
||||
run: |
|
||||
echo Repository dispatch event version: ${{ github.event.client_payload.version_number }}
|
||||
echo Repository dispatch event dry run: ${{ github.event.client_payload.is_dry_run }}
|
||||
echo Workflow dispatch event version: ${{ github.event.inputs.version_number }}
|
||||
echo Workflow dispatch event dry run: ${{ github.event.inputs.is_dry_run }}
|
||||
echo ::set-output name=VERSION_NUMBER::$VERSION_NUMBER
|
||||
echo ::set-output name=IS_DRY_RUN::$IS_DRY_RUN
|
||||
|
||||
- uses: actions/setup-python@v2
|
||||
with:
|
||||
python-version: "3.8"
|
||||
@@ -59,51 +42,80 @@ jobs:
|
||||
source env/bin/activate
|
||||
pip install --upgrade pip
|
||||
|
||||
- name: Create PR branch
|
||||
if: ${{ steps.variables.outputs.IS_DRY_RUN == 'true' }}
|
||||
- name: Add Homebrew to PATH
|
||||
run: |
|
||||
git checkout -b bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
|
||||
git push origin bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
|
||||
git branch --set-upstream-to=origin/bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
|
||||
echo "/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin" >> $GITHUB_PATH
|
||||
|
||||
# - name: Generate Docker requirements
|
||||
# run: |
|
||||
# source env/bin/activate
|
||||
# pip install -r requirements.txt
|
||||
# pip freeze -l > docker/requirements/requirements.txt
|
||||
# git status
|
||||
- name: Install Homebrew packages
|
||||
run: |
|
||||
brew install pre-commit
|
||||
brew tap miniscruff/changie https://github.com/miniscruff/changie
|
||||
brew install changie
|
||||
|
||||
- name: Audit Version and Parse Into Parts
|
||||
id: semver
|
||||
uses: dbt-labs/actions/parse-semver@v1
|
||||
with:
|
||||
version: ${{ github.event.inputs.version_number }}
|
||||
|
||||
- name: Set branch value
|
||||
id: variables
|
||||
run: |
|
||||
echo "::set-output name=BRANCH_NAME::prep-release/${{ github.event.inputs.version_number }}_$GITHUB_RUN_ID"
|
||||
|
||||
- name: Create PR branch
|
||||
run: |
|
||||
git checkout -b ${{ steps.variables.outputs.BRANCH_NAME }}
|
||||
git push origin ${{ steps.variables.outputs.BRANCH_NAME }}
|
||||
git branch --set-upstream-to=origin/${{ steps.variables.outputs.BRANCH_NAME }} ${{ steps.variables.outputs.BRANCH_NAME }}
|
||||
|
||||
- name: Bump version
|
||||
run: |
|
||||
source env/bin/activate
|
||||
pip install -r dev-requirements.txt
|
||||
env/bin/bumpversion --allow-dirty --new-version ${{steps.variables.outputs.VERSION_NUMBER}} major
|
||||
env/bin/bumpversion --allow-dirty --new-version ${{ github.event.inputs.version_number }} major
|
||||
git status
|
||||
|
||||
- name: Commit version bump directly
|
||||
uses: EndBug/add-and-commit@v7
|
||||
if: ${{ steps.variables.outputs.IS_DRY_RUN == 'false' }}
|
||||
with:
|
||||
author_name: 'Github Build Bot'
|
||||
author_email: 'buildbot@fishtownanalytics.com'
|
||||
message: 'Bumping version to ${{steps.variables.outputs.VERSION_NUMBER}}'
|
||||
- name: Run changie
|
||||
run: |
|
||||
if [[ ${{ steps.semver.outputs.is-pre-release }} -eq 1 ]]
|
||||
then
|
||||
changie batch ${{ steps.semver.outputs.base-version }} --move-dir '${{ steps.semver.outputs.base-version }}' --prerelease '${{ steps.semver.outputs.pre-release }}'
|
||||
else
|
||||
changie batch ${{ steps.semver.outputs.base-version }} --include '${{ steps.semver.outputs.base-version }}' --remove-prereleases
|
||||
fi
|
||||
changie merge
|
||||
git status
|
||||
|
||||
# this step will fail on whitespace errors but also correct them
|
||||
- name: Remove trailing whitespace
|
||||
continue-on-error: true
|
||||
run: |
|
||||
pre-commit run trailing-whitespace --files .bumpversion.cfg CHANGELOG.md .changes/*
|
||||
git status
|
||||
|
||||
# this step will fail on newline errors but also correct them
|
||||
- name: Removing extra newlines
|
||||
continue-on-error: true
|
||||
run: |
|
||||
pre-commit run end-of-file-fixer --files .bumpversion.cfg CHANGELOG.md .changes/*
|
||||
git status
|
||||
|
||||
- name: Commit version bump to branch
|
||||
uses: EndBug/add-and-commit@v7
|
||||
if: ${{ steps.variables.outputs.IS_DRY_RUN == 'true' }}
|
||||
with:
|
||||
author_name: 'Github Build Bot'
|
||||
author_email: 'buildbot@fishtownanalytics.com'
|
||||
message: 'Bumping version to ${{steps.variables.outputs.VERSION_NUMBER}}'
|
||||
branch: 'bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_${{GITHUB.RUN_ID}}'
|
||||
push: 'origin origin/bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_${{GITHUB.RUN_ID}}'
|
||||
message: 'Bumping version to ${{ github.event.inputs.version_number }} and generate CHANGELOG'
|
||||
branch: '${{ steps.variables.outputs.BRANCH_NAME }}'
|
||||
push: 'origin origin/${{ steps.variables.outputs.BRANCH_NAME }}'
|
||||
|
||||
- name: Create Pull Request
|
||||
uses: peter-evans/create-pull-request@v3
|
||||
if: ${{ steps.variables.outputs.IS_DRY_RUN == 'true' }}
|
||||
with:
|
||||
author: 'Github Build Bot <buildbot@fishtownanalytics.com>'
|
||||
draft: true
|
||||
base: ${{github.ref}}
|
||||
title: 'Bumping version to ${{steps.variables.outputs.VERSION_NUMBER}}'
|
||||
branch: 'bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_${{GITHUB.RUN_ID}}'
|
||||
title: 'Bumping version to ${{ github.event.inputs.version_number }} and generate changelog'
|
||||
branch: '${{ steps.variables.outputs.BRANCH_NAME }}'
|
||||
labels: |
|
||||
Skip Changelog
|
||||
|
||||
@@ -21,7 +21,7 @@ repos:
|
||||
- "markdown"
|
||||
- id: check-case-conflict
|
||||
- repo: https://github.com/psf/black
|
||||
rev: 21.12b0
|
||||
rev: 22.3.0
|
||||
hooks:
|
||||
- id: black
|
||||
args:
|
||||
@@ -35,7 +35,7 @@ repos:
|
||||
- "--target-version=py38"
|
||||
- "--check"
|
||||
- "--diff"
|
||||
- repo: https://gitlab.com/pycqa/flake8
|
||||
- repo: https://github.com/pycqa/flake8
|
||||
rev: 4.0.1
|
||||
hooks:
|
||||
- id: flake8
|
||||
|
||||
CHANGELOG.md (3655 changes, Normal file → Executable file; diff suppressed because it is too large)
@@ -103,7 +103,7 @@ There are some tools that will be helpful to you in developing locally. While th

A short list of tools used in `dbt-core` testing that will be helpful to your understanding:

- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.7, Python 3.8, and Python 3.9
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.7, Python 3.8, Python 3.9, and Python 3.10
- [`pytest`](https://docs.pytest.org/en/latest/) to discover/run tests
- [`make`](https://users.cs.duke.edu/~ola/courses/programming/Makefiles/Makefiles.html) - but don't worry too much, nobody _really_ understands how make works and our Makefile is super simple
- [`flake8`](https://flake8.pycqa.org/en/latest/) for code linting
@@ -203,7 +203,7 @@ suites.

#### `tox`

[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and install dependencies in order to run tests. You can also run tests in parallel, for example, you can run unit tests for Python 3.7, Python 3.8, and Python 3.9 checks in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py37`. The configuration for these tests in located in `tox.ini`.
[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and install dependencies in order to run tests. You can also run tests in parallel, for example, you can run unit tests for Python 3.7, Python 3.8, Python 3.9, and Python 3.10 checks in parallel with `tox -p`. Also, you can run unit tests for specific python versions with `tox -e py37`. The configuration for these tests in located in `tox.ini`.

#### `pytest`

@@ -219,6 +219,15 @@ python -m pytest test/unit/test_graph.py::GraphTest::test__dependency_list
```
> [Here](https://docs.pytest.org/en/reorganize-docs/new-docs/user/commandlineuseful.html)
> is a list of useful command-line options for `pytest` to use while developing.

## Adding CHANGELOG Entry

We use [changie](https://changie.dev) to generate `CHANGELOG` entries. Do not edit the `CHANGELOG.md` directly. Your modifications will be lost.

Follow the steps to [install `changie`](https://changie.dev/guide/installation/) for your system.

Once changie is installed and your PR is created, simply run `changie new` and changie will walk you through the process of creating a changelog entry. Commit the file that's created and your changelog entry is complete!

## Submitting a Pull Request

dbt Labs provides a CI environment to test changes to specific adapters, and periodic maintenance checks of `dbt-core` through Github Actions. For example, if you submit a pull request to the `dbt-redshift` repo, GitHub will trigger automated code checks and tests against Redshift.

@@ -46,6 +46,9 @@ RUN apt-get update \
    python3.9 \
    python3.9-dev \
    python3.9-venv \
    python3.10 \
    python3.10-dev \
    python3.10-venv \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Makefile (18 changes)
@@ -41,26 +41,20 @@ unit: .env ## Runs unit tests with py38.
.PHONY: test
test: .env ## Runs unit tests with py38 and code checks against staged changes.
    @\
    $(DOCKER_CMD) tox -p -e py38; \
    $(DOCKER_CMD) tox -e py38; \
    $(DOCKER_CMD) pre-commit run black-check --hook-stage manual | grep -v "INFO"; \
    $(DOCKER_CMD) pre-commit run flake8-check --hook-stage manual | grep -v "INFO"; \
    $(DOCKER_CMD) pre-commit run mypy-check --hook-stage manual | grep -v "INFO"

.PHONY: integration
integration: .env integration-postgres ## Alias for integration-postgres.
integration: .env ## Runs postgres integration tests with py38.
    @\
    $(DOCKER_CMD) tox -e py38-integration -- -nauto

.PHONY: integration-fail-fast
integration-fail-fast: .env integration-postgres-fail-fast ## Alias for integration-postgres-fail-fast.

.PHONY: integration-postgres
integration-postgres: .env ## Runs postgres integration tests with py38.
integration-fail-fast: .env ## Runs postgres integration tests with py38 in "fail fast" mode.
    @\
    $(DOCKER_CMD) tox -e py38-postgres -- -nauto

.PHONY: integration-postgres-fail-fast
integration-postgres-fail-fast: .env ## Runs postgres integration tests with py38 in "fail fast" mode.
    @\
    $(DOCKER_CMD) tox -e py38-postgres -- -x -nauto
    $(DOCKER_CMD) tox -e py38-integration -- -x -nauto

.PHONY: setup-db
setup-db: ## Setup Postgres database with docker-compose for system testing.

@@ -3,10 +3,7 @@
</p>
<p align="center">
  <a href="https://github.com/dbt-labs/dbt-core/actions/workflows/main.yml">
    <img src="https://github.com/dbt-labs/dbt-core/actions/workflows/main.yml/badge.svg?event=push" alt="Unit Tests Badge"/>
  </a>
  <a href="https://github.com/dbt-labs/dbt-core/actions/workflows/integration.yml">
    <img src="https://github.com/dbt-labs/dbt-core/actions/workflows/integration.yml/badge.svg?event=push" alt="Integration Tests Badge"/>
    <img src="https://github.com/dbt-labs/dbt-core/actions/workflows/main.yml/badge.svg?event=push" alt="CI Badge"/>
  </a>
</p>


@@ -3,10 +3,7 @@
</p>
<p align="center">
  <a href="https://github.com/dbt-labs/dbt-core/actions/workflows/main.yml">
    <img src="https://github.com/dbt-labs/dbt-core/actions/workflows/main.yml/badge.svg?event=push" alt="Unit Tests Badge"/>
  </a>
  <a href="https://github.com/dbt-labs/dbt-core/actions/workflows/integration.yml">
    <img src="https://github.com/dbt-labs/dbt-core/actions/workflows/integration.yml/badge.svg?event=push" alt="Integration Tests Badge"/>
    <img src="https://github.com/dbt-labs/dbt-core/actions/workflows/main.yml/badge.svg?event=push" alt="CI Badge"/>
  </a>
</p>

@@ -4,7 +4,7 @@ import os
# multiprocessing.RLock is a function returning this type
from multiprocessing.synchronize import RLock
from threading import get_ident
from typing import Dict, Tuple, Hashable, Optional, ContextManager, List, Union
from typing import Dict, Tuple, Hashable, Optional, ContextManager, List

import agate

@@ -281,15 +281,15 @@ class BaseConnectionManager(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def execute(
        self, sql: str, auto_begin: bool = False, fetch: bool = False
    ) -> Tuple[Union[str, AdapterResponse], agate.Table]:
    ) -> Tuple[AdapterResponse, agate.Table]:
        """Execute the given SQL.

        :param str sql: The sql to execute.
        :param bool auto_begin: If set, and dbt is not currently inside a
            transaction, automatically begin one.
        :param bool fetch: If set, fetch results.
        :return: A tuple of the status and the results (empty if fetch=False).
        :rtype: Tuple[Union[str, AdapterResponse], agate.Table]
        :return: A tuple of the query status and results (empty if fetch=False).
        :rtype: Tuple[AdapterResponse, agate.Table]
        """
        raise dbt.exceptions.NotImplementedException(
            "`execute` is not implemented for this adapter!"

@@ -221,7 +221,7 @@ class BaseAdapter(metaclass=AdapterMeta):
    @available.parse(lambda *a, **k: ("", empty_table()))
    def execute(
        self, sql: str, auto_begin: bool = False, fetch: bool = False
    ) -> Tuple[Union[str, AdapterResponse], agate.Table]:
    ) -> Tuple[AdapterResponse, agate.Table]:
        """Execute the given SQL. This is a thin wrapper around
        ConnectionManager.execute.

@@ -229,8 +229,8 @@ class BaseAdapter(metaclass=AdapterMeta):
        :param bool auto_begin: If set, and dbt is not currently inside a
            transaction, automatically begin one.
        :param bool fetch: If set, fetch results.
        :return: A tuple of the status and the results (empty if fetch=False).
        :rtype: Tuple[Union[str, AdapterResponse], agate.Table]
        :return: A tuple of the query status and results (empty if fetch=False).
        :rtype: Tuple[AdapterResponse, agate.Table]
        """
        return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch)

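The hunks above (and the matching hunks for the adapter protocol and SQL connection manager further down) tighten the return type of `execute` from `Union[str, AdapterResponse]` to `AdapterResponse`, so callers no longer have to branch on a bare string status. A minimal sketch of what that contract looks like to calling code, using a stand-in dataclass rather than the real `dbt.contracts.connection.AdapterResponse`; the field names (`_message`, `code`, `rows_affected`) and the fake adapter are assumptions for illustration only.

```python
# Hedged sketch of the narrowed execute() contract shown in the hunks above.
# The real classes live in dbt.adapters.base / dbt.contracts.connection; everything
# defined here is a stand-in so the example runs without dbt installed.
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AdapterResponse:
    _message: str                       # assumed field name, e.g. "SELECT 2"
    code: Optional[str] = None
    rows_affected: Optional[int] = None


class FakeAdapter:
    def execute(
        self, sql: str, auto_begin: bool = False, fetch: bool = False
    ) -> Tuple[AdapterResponse, List[tuple]]:
        # A real adapter returns (AdapterResponse, agate.Table); a plain list of
        # row tuples stands in for the table to keep the sketch dependency-free.
        rows = [(1,), (2,)] if fetch else []
        return AdapterResponse(_message="SELECT 2", rows_affected=len(rows)), rows


# Callers can now rely on a structured response object instead of a string status.
response, table = FakeAdapter().execute("select 1 union all select 2", fetch=True)
print(response._message, response.rows_affected, len(table))
```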
@@ -270,12 +270,15 @@ class BaseAdapter(metaclass=AdapterMeta):
|
||||
"""
|
||||
return self._macro_manifest_lazy
|
||||
|
||||
def load_macro_manifest(self) -> MacroManifest:
|
||||
def load_macro_manifest(self, base_macros_only=False) -> MacroManifest:
|
||||
# base_macros_only is for the test framework
|
||||
if self._macro_manifest_lazy is None:
|
||||
# avoid a circular import
|
||||
from dbt.parser.manifest import ManifestLoader
|
||||
|
||||
manifest = ManifestLoader.load_macros(self.config, self.connections.set_query_header)
|
||||
manifest = ManifestLoader.load_macros(
|
||||
self.config, self.connections.set_query_header, base_macros_only=base_macros_only
|
||||
)
|
||||
# TODO CT-211
|
||||
self._macro_manifest_lazy = manifest # type: ignore[assignment]
|
||||
# TODO CT-211
|
||||
@@ -337,11 +340,14 @@ class BaseAdapter(metaclass=AdapterMeta):
|
||||
# databases
|
||||
return info_schema_name_map
|
||||
|
||||
def _relations_cache_for_schemas(self, manifest: Manifest) -> None:
|
||||
def _relations_cache_for_schemas(
|
||||
self, manifest: Manifest, cache_schemas: Set[BaseRelation] = None
|
||||
) -> None:
|
||||
"""Populate the relations cache for the given schemas. Returns an
|
||||
iterable of the schemas populated, as strings.
|
||||
"""
|
||||
cache_schemas = self._get_cache_schemas(manifest)
|
||||
if not cache_schemas:
|
||||
cache_schemas = self._get_cache_schemas(manifest)
|
||||
with executor(self.config) as tpe:
|
||||
futures: List[Future[List[BaseRelation]]] = []
|
||||
for cache_schema in cache_schemas:
|
||||
@@ -367,14 +373,16 @@ class BaseAdapter(metaclass=AdapterMeta):
|
||||
cache_update.add((relation.database, relation.schema))
|
||||
self.cache.update_schemas(cache_update)
|
||||
|
||||
def set_relations_cache(self, manifest: Manifest, clear: bool = False) -> None:
|
||||
def set_relations_cache(
|
||||
self, manifest: Manifest, clear: bool = False, required_schemas: Set[BaseRelation] = None
|
||||
) -> None:
|
||||
"""Run a query that gets a populated cache of the relations in the
|
||||
database and set the cache on this adapter.
|
||||
"""
|
||||
with self.cache.lock:
|
||||
if clear:
|
||||
self.cache.clear()
|
||||
self._relations_cache_for_schemas(manifest)
|
||||
self._relations_cache_for_schemas(manifest, required_schemas)
|
||||
|
||||
@available
|
||||
def cache_added(self, relation: Optional[BaseRelation]) -> str:
|
||||
|
||||
@@ -177,6 +177,10 @@ def get_adapter(config: AdapterRequiredConfig):
|
||||
return FACTORY.lookup_adapter(config.credentials.type)
|
||||
|
||||
|
||||
def get_adapter_by_type(adapter_type):
|
||||
return FACTORY.lookup_adapter(adapter_type)
|
||||
|
||||
|
||||
def reset_adapters():
|
||||
"""Clear the adapters. This is useful for tests, which change configs."""
|
||||
FACTORY.reset_adapters()
|
||||
|
||||
@@ -155,7 +155,7 @@ class AdapterProtocol( # type: ignore[misc]
|
||||
|
||||
def execute(
|
||||
self, sql: str, auto_begin: bool = False, fetch: bool = False
|
||||
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
|
||||
) -> Tuple[AdapterResponse, agate.Table]:
|
||||
...
|
||||
|
||||
def get_compiler(self) -> Compiler_T:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
import abc
|
||||
import time
|
||||
from typing import List, Optional, Tuple, Any, Iterable, Dict, Union
|
||||
from typing import List, Optional, Tuple, Any, Iterable, Dict
|
||||
|
||||
import agate
|
||||
|
||||
@@ -78,7 +78,7 @@ class SQLConnectionManager(BaseConnectionManager):
|
||||
return connection, cursor
|
||||
|
||||
@abc.abstractclassmethod
|
||||
def get_response(cls, cursor: Any) -> Union[AdapterResponse, str]:
|
||||
def get_response(cls, cursor: Any) -> AdapterResponse:
|
||||
"""Get the status of the cursor."""
|
||||
raise dbt.exceptions.NotImplementedException(
|
||||
"`get_response` is not implemented for this adapter!"
|
||||
@@ -117,7 +117,7 @@ class SQLConnectionManager(BaseConnectionManager):
|
||||
|
||||
def execute(
|
||||
self, sql: str, auto_begin: bool = False, fetch: bool = False
|
||||
) -> Tuple[Union[AdapterResponse, str], agate.Table]:
|
||||
) -> Tuple[AdapterResponse, agate.Table]:
|
||||
sql = self._add_query_comment(sql)
|
||||
_, cursor = self.add_query(sql, auto_begin)
|
||||
response = self.get_response(cursor)
|
||||
|
||||
@@ -27,7 +27,7 @@ ALTER_COLUMN_TYPE_MACRO_NAME = "alter_column_type"
|
||||
|
||||
class SQLAdapter(BaseAdapter):
|
||||
"""The default adapter with the common agate conversions and some SQL
|
||||
methods implemented. This adapter has a different much shorter list of
|
||||
methods was implemented. This adapter has a different much shorter list of
|
||||
methods to implement, but some more macros that must be implemented.
|
||||
|
||||
To implement a macro, implement "${adapter_type}__${macro_name}". in the
|
||||
@@ -218,3 +218,25 @@ class SQLAdapter(BaseAdapter):
|
||||
kwargs = {"information_schema": information_schema, "schema": schema}
|
||||
results = self.execute_macro(CHECK_SCHEMA_EXISTS_MACRO_NAME, kwargs=kwargs)
|
||||
return results[0][0] > 0
|
||||
|
||||
# This is for use in the test suite
|
||||
def run_sql_for_tests(self, sql, fetch, conn):
|
||||
cursor = conn.handle.cursor()
|
||||
try:
|
||||
cursor.execute(sql)
|
||||
if hasattr(conn.handle, "commit"):
|
||||
conn.handle.commit()
|
||||
if fetch == "one":
|
||||
return cursor.fetchone()
|
||||
elif fetch == "all":
|
||||
return cursor.fetchall()
|
||||
else:
|
||||
return
|
||||
except BaseException as e:
|
||||
if conn.handle and not getattr(conn.handle, "closed", True):
|
||||
conn.handle.rollback()
|
||||
print(sql)
|
||||
print(e)
|
||||
raise
|
||||
finally:
|
||||
conn.transaction_open = False
|
||||
|
||||
@@ -80,7 +80,7 @@ def table_from_rows(


def table_from_data(data, column_names: Iterable[str]) -> agate.Table:
    "Convert list of dictionaries into an Agate table"
    "Convert a list of dictionaries into an Agate table"

    # The agate table is generated from a list of dicts, so the column order
    # from `data` is not preserved. We can use `select` to reorder the columns

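The comment in the hunk above notes that an agate table built from a list of dicts does not preserve the desired column order, so the columns are reselected afterwards. A small sketch of that pattern with plain agate; the sample rows and column names are invented for illustration.

```python
# Hedged sketch of the "build, then select() to fix column order" pattern noted above.
import agate

data = [{"b": 2, "a": 1}, {"b": 4, "a": 3}]  # arbitrary sample rows

# Column order coming out of the dicts is not guaranteed to match what callers want,
# so reorder explicitly, as the comment in table_from_data suggests.
table = agate.Table.from_object(data)
table = table.select(["a", "b"])
table.print_table()
```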
@@ -28,7 +28,7 @@ def _is_commit(revision: str) -> bool:
|
||||
|
||||
|
||||
def _raise_git_cloning_error(repo, revision, error):
|
||||
stderr = error.stderr.decode("utf-8").strip()
|
||||
stderr = error.stderr.strip()
|
||||
if "usage: git" in stderr:
|
||||
stderr = stderr.split("\nusage: git")[0]
|
||||
if re.match("fatal: destination path '(.+)' already exists", stderr):
|
||||
@@ -115,8 +115,8 @@ def checkout(cwd, repo, revision=None):
|
||||
try:
|
||||
return _checkout(cwd, repo, revision)
|
||||
except CommandResultError as exc:
|
||||
stderr = exc.stderr.decode("utf-8").strip()
|
||||
bad_package_spec(repo, revision, stderr)
|
||||
stderr = exc.stderr.strip()
|
||||
bad_package_spec(repo, revision, stderr)
|
||||
|
||||
|
||||
def get_current_sha(cwd):
|
||||
@@ -142,7 +142,7 @@ def clone_and_checkout(
|
||||
subdirectory=subdirectory,
|
||||
)
|
||||
except CommandResultError as exc:
|
||||
err = exc.stderr.decode("utf-8")
|
||||
err = exc.stderr
|
||||
exists = re.match("fatal: destination path '(.+)' already exists", err)
|
||||
if not exists:
|
||||
raise_git_cloning_problem(repo)
|
||||
|
||||
@@ -103,7 +103,7 @@ class NativeSandboxEnvironment(MacroFuzzEnvironment):
|
||||
|
||||
|
||||
class TextMarker(str):
|
||||
"""A special native-env marker that indicates that a value is text and is
|
||||
"""A special native-env marker that indicates a value is text and is
|
||||
not to be evaluated. Use this to prevent your numbery-strings from becoming
|
||||
numbers!
|
||||
"""
|
||||
@@ -580,7 +580,7 @@ def extract_toplevel_blocks(
|
||||
allowed_blocks: Optional[Set[str]] = None,
|
||||
collect_raw_data: bool = True,
|
||||
) -> List[Union[BlockData, BlockTag]]:
|
||||
"""Extract the top level blocks with matching block types from a jinja
|
||||
"""Extract the top-level blocks with matching block types from a jinja
|
||||
file, with some special handling for block nesting.
|
||||
|
||||
:param data: The data to extract blocks from.
|
||||
|
||||
@@ -1,7 +1,17 @@
|
||||
import functools
|
||||
from typing import Any, Dict, List
|
||||
import requests
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import RegistryProgressMakingGETRequest, RegistryProgressGETResponse
|
||||
from dbt.events.types import (
|
||||
RegistryProgressMakingGETRequest,
|
||||
RegistryProgressGETResponse,
|
||||
RegistryIndexProgressMakingGETRequest,
|
||||
RegistryIndexProgressGETResponse,
|
||||
RegistryResponseUnexpectedType,
|
||||
RegistryResponseMissingTopKeys,
|
||||
RegistryResponseMissingNestedKeys,
|
||||
RegistryResponseExtraNestedKeys,
|
||||
)
|
||||
from dbt.utils import memoized, _connection_exception_retry as connection_exception_retry
|
||||
from dbt import deprecations
|
||||
import os
|
||||
@@ -12,55 +22,77 @@ else:
|
||||
DEFAULT_REGISTRY_BASE_URL = "https://hub.getdbt.com/"
|
||||
|
||||
|
||||
def _get_url(url, registry_base_url=None):
|
||||
def _get_url(name, registry_base_url=None):
|
||||
if registry_base_url is None:
|
||||
registry_base_url = DEFAULT_REGISTRY_BASE_URL
|
||||
url = "api/v1/{}.json".format(name)
|
||||
|
||||
return "{}{}".format(registry_base_url, url)
|
||||
|
||||
|
||||
def _get_with_retries(path, registry_base_url=None):
|
||||
get_fn = functools.partial(_get, path, registry_base_url)
|
||||
def _get_with_retries(package_name, registry_base_url=None):
|
||||
get_fn = functools.partial(_get, package_name, registry_base_url)
|
||||
return connection_exception_retry(get_fn, 5)
|
||||
|
||||
|
||||
def _get(path, registry_base_url=None):
|
||||
url = _get_url(path, registry_base_url)
|
||||
def _get(package_name, registry_base_url=None):
|
||||
url = _get_url(package_name, registry_base_url)
|
||||
fire_event(RegistryProgressMakingGETRequest(url=url))
|
||||
# all exceptions from requests get caught in the retry logic so no need to wrap this here
|
||||
resp = requests.get(url, timeout=30)
|
||||
fire_event(RegistryProgressGETResponse(url=url, resp_code=resp.status_code))
|
||||
resp.raise_for_status()
|
||||
|
||||
# It is unexpected for the content of the response to be None so if it is, raising this error
|
||||
# will cause this function to retry (if called within _get_with_retries) and hopefully get
|
||||
# a response. This seems to happen when there's an issue with the Hub.
|
||||
# The response should always be a dictionary. Anything else is unexpected, raise error.
|
||||
# Raising this error will cause this function to retry (if called within _get_with_retries)
|
||||
# and hopefully get a valid response. This seems to happen when there's an issue with the Hub.
|
||||
# Since we control what we expect the HUB to return, this is safe.
|
||||
# See https://github.com/dbt-labs/dbt-core/issues/4577
|
||||
if resp.json() is None:
|
||||
raise requests.exceptions.ContentDecodingError(
|
||||
"Request error: The response is None", response=resp
|
||||
# and https://github.com/dbt-labs/dbt-core/issues/4849
|
||||
response = resp.json()
|
||||
|
||||
if not isinstance(response, dict): # This will also catch Nonetype
|
||||
error_msg = (
|
||||
f"Request error: Expected a response type of <dict> but got {type(response)} instead"
|
||||
)
|
||||
return resp.json()
|
||||
fire_event(RegistryResponseUnexpectedType(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
# check for expected top level keys
|
||||
expected_keys = {"name", "versions"}
|
||||
if not expected_keys.issubset(response):
|
||||
error_msg = (
|
||||
f"Request error: Expected the response to contain keys {expected_keys} "
|
||||
f"but is missing {expected_keys.difference(set(response))}"
|
||||
)
|
||||
fire_event(RegistryResponseMissingTopKeys(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
def index(registry_base_url=None):
|
||||
return _get_with_retries("api/v1/index.json", registry_base_url)
|
||||
# check for the keys we need nested under each version
|
||||
expected_version_keys = {"name", "packages", "downloads"}
|
||||
all_keys = set().union(*(response["versions"][d] for d in response["versions"]))
|
||||
if not expected_version_keys.issubset(all_keys):
|
||||
error_msg = (
|
||||
"Request error: Expected the response for the version to contain keys "
|
||||
f"{expected_version_keys} but is missing {expected_version_keys.difference(all_keys)}"
|
||||
)
|
||||
fire_event(RegistryResponseMissingNestedKeys(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
|
||||
index_cached = memoized(index)
|
||||
|
||||
|
||||
def packages(registry_base_url=None):
|
||||
return _get_with_retries("api/v1/packages.json", registry_base_url)
|
||||
|
||||
|
||||
def package(name, registry_base_url=None):
|
||||
response = _get_with_retries("api/v1/{}.json".format(name), registry_base_url)
|
||||
# all version responses should contain identical keys.
|
||||
has_extra_keys = set().difference(*(response["versions"][d] for d in response["versions"]))
|
||||
if has_extra_keys:
|
||||
error_msg = (
|
||||
"Request error: Keys for all versions do not match. Found extra key(s) "
|
||||
f"of {has_extra_keys}."
|
||||
)
|
||||
fire_event(RegistryResponseExtraNestedKeys(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
# Either redirectnamespace or redirectname in the JSON response indicate a redirect
|
||||
# redirectnamespace redirects based on package ownership
|
||||
# redirectname redirects based on package name
|
||||
# Both can be present at the same time, or neither. Fails gracefully to old name
|
||||
|
||||
if ("redirectnamespace" in response) or ("redirectname" in response):
|
||||
|
||||
if ("redirectnamespace" in response) and response["redirectnamespace"] is not None:
|
||||
@@ -74,15 +106,59 @@ def package(name, registry_base_url=None):
|
||||
use_name = response["name"]
|
||||
|
||||
new_nwo = use_namespace + "/" + use_name
|
||||
deprecations.warn("package-redirect", old_name=name, new_name=new_nwo)
|
||||
deprecations.warn("package-redirect", old_name=package_name, new_name=new_nwo)
|
||||
|
||||
return response
|
||||
|
||||
|
||||
def package_version(name, version, registry_base_url=None):
    return _get_with_retries("api/v1/{}/{}.json".format(name, version), registry_base_url)
_get_cached = memoized(_get_with_retries)


def get_available_versions(name):
    response = package(name)
    return list(response["versions"])
def package(package_name, registry_base_url=None) -> Dict[str, Any]:
    # returns a dictionary of metadata for all versions of a package
    response = _get_cached(package_name, registry_base_url)
    return response["versions"]


def package_version(package_name, version, registry_base_url=None) -> Dict[str, Any]:
    # returns the metadata of a specific version of a package
    response = package(package_name, registry_base_url)
    return response[version]


def get_available_versions(package_name) -> List["str"]:
    # returns a list of all available versions of a package
    response = package(package_name)
    return list(response)


def _get_index(registry_base_url=None):

    url = _get_url("index", registry_base_url)
    fire_event(RegistryIndexProgressMakingGETRequest(url=url))
    # all exceptions from requests get caught in the retry logic so no need to wrap this here
    resp = requests.get(url, timeout=30)
    fire_event(RegistryIndexProgressGETResponse(url=url, resp_code=resp.status_code))
    resp.raise_for_status()

    # The response should be a list. Anything else is unexpected, raise an error.
    # Raising this error will cause this function to retry and hopefully get a valid response.

    response = resp.json()

    if not isinstance(response, list):  # This will also catch Nonetype
        error_msg = (
            f"Request error: The response type of {type(response)} is not valid: {resp.text}"
        )
        raise requests.exceptions.ContentDecodingError(error_msg, response=resp)

    return response


def index(registry_base_url=None) -> List[str]:
    # this returns a list of all packages on the Hub
    get_index_fn = functools.partial(_get_index, registry_base_url)
    return connection_exception_retry(get_index_fn, 5)


index_cached = memoized(index)

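The reworked registry module above stacks up as: `_get` validates the Hub payload, `package()` returns the `versions` mapping, `package_version()` indexes into it, and `get_available_versions()` lists its keys. A self-contained sketch of those relationships using a canned Hub-style response instead of live calls to hub.getdbt.com; the package name and version metadata below are invented for illustration.

```python
# Hedged sketch of the package() / package_version() / get_available_versions() shapes
# described above, with a canned response standing in for requests + retry logic.
from typing import Any, Dict, List

FAKE_HUB_RESPONSE = {  # shape mirrors the keys the new _get() validates: name + versions
    "name": "dbt_utils",
    "versions": {
        "0.8.0": {"name": "dbt_utils", "packages": [], "downloads": {"tarball": "..."}},
        "0.8.6": {"name": "dbt_utils", "packages": [], "downloads": {"tarball": "..."}},
    },
}


def package(package_name: str) -> Dict[str, Any]:
    # real code: response = _get_cached(package_name, registry_base_url)
    return FAKE_HUB_RESPONSE["versions"]


def package_version(package_name: str, version: str) -> Dict[str, Any]:
    return package(package_name)[version]


def get_available_versions(package_name: str) -> List[str]:
    return list(package(package_name))


print(get_available_versions("dbt-labs/dbt_utils"))    # ['0.8.0', '0.8.6']
print(package_version("dbt-labs/dbt_utils", "0.8.6")["downloads"]["tarball"])
```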
@@ -335,7 +335,7 @@ def _handle_posix_cmd_error(exc: OSError, cwd: str, cmd: List[str]) -> NoReturn:
|
||||
|
||||
|
||||
def _handle_posix_error(exc: OSError, cwd: str, cmd: List[str]) -> NoReturn:
|
||||
"""OSError handling for posix systems.
|
||||
"""OSError handling for POSIX systems.
|
||||
|
||||
Some things that could happen to trigger an OSError:
|
||||
- cwd could not exist
|
||||
@@ -386,7 +386,7 @@ def _handle_windows_error(exc: OSError, cwd: str, cmd: List[str]) -> NoReturn:
|
||||
|
||||
|
||||
def _interpret_oserror(exc: OSError, cwd: str, cmd: List[str]) -> NoReturn:
|
||||
"""Interpret an OSError exc and raise the appropriate dbt exception."""
|
||||
"""Interpret an OSError exception and raise the appropriate dbt exception."""
|
||||
if len(cmd) == 0:
|
||||
raise dbt.exceptions.CommandError(cwd, cmd)
|
||||
|
||||
@@ -501,7 +501,7 @@ def move(src, dst):
|
||||
directory on windows when it has read-only files in it and the move is
|
||||
between two drives.
|
||||
|
||||
This is almost identical to the real shutil.move, except it uses our rmtree
|
||||
This is almost identical to the real shutil.move, except it, uses our rmtree
|
||||
and skips handling non-windows OSes since the existing one works ok there.
|
||||
"""
|
||||
src = convert_path(src)
|
||||
@@ -536,7 +536,7 @@ def move(src, dst):
|
||||
|
||||
|
||||
def rmtree(path):
|
||||
"""Recursively remove path. On permissions errors on windows, try to remove
|
||||
"""Recursively remove the path. On permissions errors on windows, try to remove
|
||||
the read-only flag and try again.
|
||||
"""
|
||||
path = convert_path(path)
|
||||
|
||||
@@ -132,7 +132,11 @@ def _all_source_paths(
    analysis_paths: List[str],
    macro_paths: List[str],
) -> List[str]:
    return list(chain(model_paths, seed_paths, snapshot_paths, analysis_paths, macro_paths))
    # We need to turn a list of lists into just a list, then convert to a set to
    # get only unique elements, then back to a list
    return list(
        set(list(chain(model_paths, seed_paths, snapshot_paths, analysis_paths, macro_paths)))
    )


T = TypeVar("T")

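The new `_all_source_paths` deduplicates by round-tripping through `set()`, which also discards the original ordering. A tiny sketch of that behavior, plus an order-preserving alternative via `dict.fromkeys`; the alternative is only an illustration, not what dbt does.

```python
# Hedged sketch: set()-based dedup (as in the hunk above) versus an order-preserving variant.
from itertools import chain

model_paths, seed_paths = ["models", "shared"], ["seeds", "shared"]

deduped = list(set(chain(model_paths, seed_paths)))            # unique, order arbitrary
ordered = list(dict.fromkeys(chain(model_paths, seed_paths)))  # unique, first-seen order

print(sorted(deduped))  # ['models', 'seeds', 'shared']
print(ordered)          # ['models', 'seeds', 'shared']
```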
@@ -1,12 +1,15 @@
|
||||
from typing import Dict, Any, Tuple, Optional, Union, Callable
|
||||
import re
|
||||
import os
|
||||
|
||||
from dbt.clients.jinja import get_rendered, catch_jinja
|
||||
from dbt.context.target import TargetContext
|
||||
from dbt.context.secret import SecretContext
|
||||
from dbt.context.secret import SecretContext, SECRET_PLACEHOLDER
|
||||
from dbt.context.base import BaseContext
|
||||
from dbt.contracts.connection import HasCredentials
|
||||
from dbt.exceptions import DbtProjectError, CompilationException, RecursionException
|
||||
from dbt.utils import deep_map_render
|
||||
from dbt.logger import SECRET_ENV_PREFIX
|
||||
|
||||
|
||||
Keypath = Tuple[Union[str, int], ...]
|
||||
@@ -114,11 +117,9 @@ class DbtProjectYamlRenderer(BaseRenderer):
|
||||
def name(self):
|
||||
"Project config"
|
||||
|
||||
# Uses SecretRenderer
|
||||
def get_package_renderer(self) -> BaseRenderer:
|
||||
return PackageRenderer(self.context)
|
||||
|
||||
def get_selector_renderer(self) -> BaseRenderer:
|
||||
return SelectorRenderer(self.context)
|
||||
return PackageRenderer(self.ctx_obj.cli_vars)
|
||||
|
||||
def render_project(
|
||||
self,
|
||||
@@ -136,8 +137,7 @@ class DbtProjectYamlRenderer(BaseRenderer):
|
||||
return package_renderer.render_data(packages)
|
||||
|
||||
def render_selectors(self, selectors: Dict[str, Any]):
|
||||
selector_renderer = self.get_selector_renderer()
|
||||
return selector_renderer.render_data(selectors)
|
||||
return self.render_data(selectors)
|
||||
|
||||
def render_entry(self, value: Any, keypath: Keypath) -> Any:
|
||||
result = super().render_entry(value, keypath)
|
||||
@@ -165,18 +165,10 @@ class DbtProjectYamlRenderer(BaseRenderer):
|
||||
return True
|
||||
|
||||
|
||||
class SelectorRenderer(BaseRenderer):
|
||||
@property
|
||||
def name(self):
|
||||
return "Selector config"
|
||||
|
||||
|
||||
class SecretRenderer(BaseRenderer):
|
||||
def __init__(self, cli_vars: Optional[Dict[str, Any]] = None) -> None:
|
||||
def __init__(self, cli_vars: Dict[str, Any] = {}) -> None:
|
||||
# Generate contexts here because we want to save the context
|
||||
# object in order to retrieve the env_vars.
|
||||
if cli_vars is None:
|
||||
cli_vars = {}
|
||||
self.ctx_obj = SecretContext(cli_vars)
|
||||
context = self.ctx_obj.to_dict()
|
||||
super().__init__(context)
|
||||
@@ -185,6 +177,28 @@ class SecretRenderer(BaseRenderer):
    def name(self):
        return "Secret"

    def render_value(self, value: Any, keypath: Optional[Keypath] = None) -> Any:
        # First, standard Jinja rendering, with special handling for 'secret' environment variables
        # "{{ env_var('DBT_SECRET_ENV_VAR') }}" -> "$$$DBT_SECRET_START$$$DBT_SECRET_ENV_{VARIABLE_NAME}$$$DBT_SECRET_END$$$"
        # This prevents Jinja manipulation of secrets via macros/filters that might leak partial/modified values in logs
        rendered = super().render_value(value, keypath)
        # Now, detect instances of the placeholder value ($$$DBT_SECRET_START...DBT_SECRET_END$$$)
        # and replace them with the actual secret value
        if SECRET_ENV_PREFIX in str(rendered):
            search_group = f"({SECRET_ENV_PREFIX}(.*))"
            pattern = SECRET_PLACEHOLDER.format(search_group).replace("$", r"\$")
            m = re.search(
                pattern,
                rendered,
            )
            if m:
                found = m.group(1)
                value = os.environ[found]
                replace_this = SECRET_PLACEHOLDER.format(found)
                return rendered.replace(replace_this, value)
        else:
            return rendered


class ProfileRenderer(SecretRenderer):
    @property

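The `render_value` override above is the second half of a two-step dance: during Jinja rendering, `env_var()` returns an opaque placeholder for any `DBT_ENV_SECRET_`-prefixed variable (see the `SecretContext` hunk near the end of this diff), and the placeholder is swapped for the real value only after rendering finishes. A runnable sketch of that round trip, with the constants copied from the hunks; the env var name and URL are made up.

```python
# Hedged sketch of the secret-placeholder round trip implemented above.
import os
import re

SECRET_ENV_PREFIX = "DBT_ENV_SECRET_"  # prefix documented in the contexts README below
SECRET_PLACEHOLDER = "$$$DBT_SECRET_START$$${}$$$DBT_SECRET_END$$$"

os.environ["DBT_ENV_SECRET_GIT_TOKEN"] = "hunter2"  # pretend secret for the example

# Step 1: what env_var() hands back during Jinja rendering -- a placeholder, not the value.
placeholder = SECRET_PLACEHOLDER.format("DBT_ENV_SECRET_GIT_TOKEN")
rendered = f"https://user:{placeholder}@github.com/org/private-repo.git"

# Step 2: what render_value() does afterwards -- find the placeholder, substitute the real value.
search_group = f"({SECRET_ENV_PREFIX}(.*))"
pattern = SECRET_PLACEHOLDER.format(search_group).replace("$", r"\$")
m = re.search(pattern, rendered)
if m:
    found = m.group(1)
    rendered = rendered.replace(SECRET_PLACEHOLDER.format(found), os.environ[found])

print(rendered)  # https://user:hunter2@github.com/org/private-repo.git
```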
@@ -11,7 +11,7 @@ from .renderer import DbtProjectYamlRenderer, ProfileRenderer
|
||||
from .utils import parse_cli_vars
|
||||
from dbt import flags
|
||||
from dbt.adapters.factory import get_relation_class_by_name, get_include_paths
|
||||
from dbt.helper_types import FQNPath, PathSet
|
||||
from dbt.helper_types import FQNPath, PathSet, DictDefaultEmptyStr
|
||||
from dbt.config.profile import read_user_config
|
||||
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
|
||||
from dbt.contracts.graph.manifest import ManifestMetadata
|
||||
@@ -312,22 +312,26 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
|
||||
|
||||
warn_or_error(msg, log_fmt=warning_tag("{}"))
|
||||
|
||||
def load_dependencies(self) -> Mapping[str, "RuntimeConfig"]:
|
||||
def load_dependencies(self, base_only=False) -> Mapping[str, "RuntimeConfig"]:
|
||||
if self.dependencies is None:
|
||||
all_projects = {self.project_name: self}
|
||||
internal_packages = get_include_paths(self.credentials.type)
|
||||
# raise exception if fewer installed packages than in packages.yml
|
||||
count_packages_specified = len(self.packages.packages) # type: ignore
|
||||
count_packages_installed = len(tuple(self._get_project_directories()))
|
||||
if count_packages_specified > count_packages_installed:
|
||||
raise_compiler_error(
|
||||
f"dbt found {count_packages_specified} package(s) "
|
||||
f"specified in packages.yml, but only "
|
||||
f"{count_packages_installed} package(s) installed "
|
||||
f'in {self.packages_install_path}. Run "dbt deps" to '
|
||||
f"install package dependencies."
|
||||
)
|
||||
project_paths = itertools.chain(internal_packages, self._get_project_directories())
|
||||
if base_only:
|
||||
# Test setup -- we want to load macros without dependencies
|
||||
project_paths = itertools.chain(internal_packages)
|
||||
else:
|
||||
# raise exception if fewer installed packages than in packages.yml
|
||||
count_packages_specified = len(self.packages.packages) # type: ignore
|
||||
count_packages_installed = len(tuple(self._get_project_directories()))
|
||||
if count_packages_specified > count_packages_installed:
|
||||
raise_compiler_error(
|
||||
f"dbt found {count_packages_specified} package(s) "
|
||||
f"specified in packages.yml, but only "
|
||||
f"{count_packages_installed} package(s) installed "
|
||||
f'in {self.packages_install_path}. Run "dbt deps" to '
|
||||
f"install package dependencies."
|
||||
)
|
||||
project_paths = itertools.chain(internal_packages, self._get_project_directories())
|
||||
for project_name, project in self.load_projects(project_paths):
|
||||
if project_name in all_projects:
|
||||
raise_compiler_error(
|
||||
@@ -396,7 +400,7 @@ class UnsetProfile(Profile):
|
||||
self.threads = -1
|
||||
|
||||
def to_target_dict(self):
|
||||
return {}
|
||||
return DictDefaultEmptyStr({})
|
||||
|
||||
def __getattribute__(self, name):
|
||||
if name in {"profile_name", "target_name", "threads"}:
|
||||
@@ -431,7 +435,7 @@ class UnsetProfileConfig(RuntimeConfig):
|
||||
|
||||
def to_target_dict(self):
|
||||
# re-override the poisoned profile behavior
|
||||
return {}
|
||||
return DictDefaultEmptyStr({})
|
||||
|
||||
@classmethod
|
||||
def from_parts(
|
||||
|
||||
@@ -3,7 +3,7 @@ from typing import Dict, Any, Union
|
||||
from dbt.clients.yaml_helper import yaml, Loader, Dumper, load_yaml_text # noqa: F401
|
||||
from dbt.dataclass_schema import ValidationError
|
||||
|
||||
from .renderer import SelectorRenderer
|
||||
from .renderer import BaseRenderer
|
||||
|
||||
from dbt.clients.system import (
|
||||
load_file_contents,
|
||||
@@ -57,7 +57,7 @@ class SelectorConfig(Dict[str, Dict[str, Union[SelectionSpec, bool]]]):
|
||||
def render_from_dict(
|
||||
cls,
|
||||
data: Dict[str, Any],
|
||||
renderer: SelectorRenderer,
|
||||
renderer: BaseRenderer,
|
||||
) -> "SelectorConfig":
|
||||
try:
|
||||
rendered = renderer.render_data(data)
|
||||
@@ -72,7 +72,7 @@ class SelectorConfig(Dict[str, Dict[str, Union[SelectionSpec, bool]]]):
|
||||
def from_path(
|
||||
cls,
|
||||
path: Path,
|
||||
renderer: SelectorRenderer,
|
||||
renderer: BaseRenderer,
|
||||
) -> "SelectorConfig":
|
||||
try:
|
||||
data = load_yaml_text(load_file_contents(str(path)))
|
||||
|
||||
@@ -1,9 +1,15 @@
|
||||
from typing import Dict, Any
|
||||
from argparse import Namespace
|
||||
from typing import Any, Dict, Optional, Union
|
||||
from xmlrpc.client import Boolean
|
||||
from dbt.contracts.project import UserConfig
|
||||
|
||||
import dbt.flags as flags
|
||||
from dbt.clients import yaml_helper
|
||||
from dbt.config import Profile, Project, read_user_config
|
||||
from dbt.config.renderer import DbtProjectYamlRenderer, ProfileRenderer
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.exceptions import raise_compiler_error, ValidationException
|
||||
from dbt.events.types import InvalidVarsYAML
|
||||
from dbt.exceptions import ValidationException, raise_compiler_error
|
||||
|
||||
|
||||
def parse_cli_vars(var_string: str) -> Dict[str, Any]:
|
||||
@@ -21,3 +27,49 @@ def parse_cli_vars(var_string: str) -> Dict[str, Any]:
    except ValidationException:
        fire_event(InvalidVarsYAML())
        raise


def get_project_config(
    project_path: str,
    profile_name: str,
    args: Namespace = Namespace(),
    cli_vars: Optional[Dict[str, Any]] = None,
    profile: Optional[Profile] = None,
    user_config: Optional[UserConfig] = None,
    return_dict: Boolean = True,
) -> Union[Project, Dict]:
    """Returns a project config (dict or object) from a given project path and profile name.

    Args:
        project_path: Path to project
        profile_name: Name of profile
        args: An argparse.Namespace that represents what would have been passed in on the
            command line (optional)
        cli_vars: A dict of any vars that would have been passed in on the command line (optional)
            (see parse_cli_vars above for formatting details)
        profile: A dbt.config.profile.Profile object (optional)
        user_config: A dbt.contracts.project.UserConfig object (optional)
        return_dict: Return a dict if true, return the full dbt.config.project.Project object if false

    Returns:
        A full project config

    """
    # Generate a profile if not provided
    if profile is None:
        # Generate user_config if not provided
        if user_config is None:
            user_config = read_user_config(flags.PROFILES_DIR)
        # Update flags
        flags.set_from_args(args, user_config)
        if cli_vars is None:
            cli_vars = {}
        profile = Profile.render_from_args(args, ProfileRenderer(cli_vars), profile_name)
    # Generate a project
    project = Project.from_project_root(
        project_path,
        DbtProjectYamlRenderer(profile),
        verify_version=bool(flags.VERSION_CHECK),
    )
    # Return
    return project.to_project_config() if return_dict else project

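A hedged usage sketch for `get_project_config` as added above. It assumes dbt-core is installed, a dbt project exists on disk, and a matching profile is defined in `profiles.yml`; the paths and names are placeholders, and the import path (`dbt.config.utils`) is inferred from the surrounding hunks rather than stated in the diff.

```python
# Hedged sketch: calling get_project_config() as defined in the hunk above. Requires a
# real dbt project at PROJECT_PATH and a profile named PROFILE_NAME in profiles.yml.
from dbt.config.utils import get_project_config, parse_cli_vars  # module path inferred

PROJECT_PATH = "/path/to/jaffle_shop"   # placeholder
PROFILE_NAME = "jaffle_shop"            # placeholder

cli_vars = parse_cli_vars('{"order_status": "completed"}')

# Default return is a plain dict (return_dict=True); pass return_dict=False for a Project object.
project_dict = get_project_config(PROJECT_PATH, PROFILE_NAME, cli_vars=cli_vars)
print(project_dict.get("name"), project_dict.get("profile"))
```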
@@ -1 +1,51 @@
# Contexts and Jinja rendering

Contexts are used for Jinja rendering. They include context methods, executable macros, and various settings that are available in Jinja.

The most common entrypoint to Jinja rendering in dbt is a method named `get_rendered`, which takes two arguments: templated code (string), and a context used to render it (dictionary).

The context is the bundle of information that is in "scope" when rendering Jinja-templated code. For instance, imagine a simple Jinja template:
```
{% set new_value = some_macro(some_variable) %}
```
Both `some_macro()` and `some_variable` must be defined in that context. Otherwise, it will raise an error when rendering.

Different contexts are used in different places because we allow access to different methods and data in different places. Executable SQL, for example, includes all available macros and the model being run. The variables and macros in scope for Jinja defined in yaml files is much more limited.

### Implementation

The context that is passed to Jinja is always in a dictionary format, not an actual class, so a `to_dict()` is executed on a context class before it is used for rendering.

Each context has a `generate_<name>_context` function to create the context. `ProviderContext` subclasses have different generate functions for parsing and for execution, so that certain functions (notably `ref`, `source`, and `config`) can return different results

### Hierarchy

All contexts inherit from the `BaseContext`, which includes "pure" methods (e.g. `tojson`), `env_var()`, and `var()` (but only CLI values, passed via `--vars`).

Methods available in parent contexts are also available in child contexts.

```
BaseContext -- core/dbt/context/base.py
  SecretContext -- core/dbt/context/secret.py
  TargetContext -- core/dbt/context/target.py
    ConfiguredContext -- core/dbt/context/configured.py
      SchemaYamlContext -- core/dbt/context/configured.py
        DocsRuntimeContext -- core/dbt/context/configured.py
      MacroResolvingContext -- core/dbt/context/configured.py
      ManifestContext -- core/dbt/context/manifest.py
        QueryHeaderContext -- core/dbt/context/manifest.py
        ProviderContext -- core/dbt/context/provider.py
          MacroContext -- core/dbt/context/provider.py
          ModelContext -- core/dbt/context/provider.py
          TestContext -- core/dbt/context/provider.py
```

### Contexts for configuration

Contexts for rendering "special" `.yml` (configuration) files:
- `SecretContext`: Supports "secret" env vars, which are prefixed with `DBT_ENV_SECRET_`. Used for rendering in `profiles.yml` and `packages.yml` ONLY. Secrets defined elsewhere will raise explicit errors.
- `TargetContext`: The same as `Base`, plus `target` (connection profile). Used most notably in `dbt_project.yml` and `selectors.yml`.

Contexts for other `.yml` files in the project:
- `SchemaYamlContext`: Supports `vars` declared on the CLI and in `dbt_project.yml`. Does not support custom macros, beyond `var()` + `env_var()` methods. Used for all `.yml` files, to define properties and configuration.
- `DocsRuntimeContext`: Standard `.yml` file context, plus `doc()` method (with all `docs` blocks in scope). Used to resolve `description` properties.

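The README above boils rendering down to "templated string in, context dictionary in, rendered string out". A minimal sketch of that idea using plain Jinja2 rather than dbt's `get_rendered` wrapper, with the macro and variable names taken from the README's example; the upper-casing macro is invented for illustration.

```python
# Hedged sketch of "template + context dict" rendering, using plain Jinja2 instead of
# dbt's get_rendered wrapper. The context must supply every name the template references.
from jinja2 import Environment

template_source = "{% set new_value = some_macro(some_variable) %}{{ new_value }}"

context = {
    "some_macro": lambda value: value.upper(),  # invented macro for the example
    "some_variable": "orders",
}

env = Environment()
print(env.from_string(template_source).render(**context))  # ORDERS

# Rendering without the context raises jinja2.exceptions.UndefinedError, which is the
# "must be defined in that context" failure the README describes.
```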
@@ -24,38 +24,7 @@ import pytz
|
||||
import datetime
|
||||
import re
|
||||
|
||||
# Contexts in dbt Core
|
||||
# Contexts are used for Jinja rendering. They include context methods,
|
||||
# executable macros, and various settings that are available in Jinja.
|
||||
#
|
||||
# Different contexts are used in different places because we allow access
|
||||
# to different methods and data in different places. Executable SQL, for
|
||||
# example, includes the available macros and the model, while Jinja in
|
||||
# yaml files is more limited.
|
||||
#
|
||||
# The context that is passed to Jinja is always in a dictionary format,
|
||||
# not an actual class, so a 'to_dict()' is executed on a context class
|
||||
# before it is used for rendering.
|
||||
#
|
||||
# Each context has a generate_<name>_context function to create the context.
|
||||
# ProviderContext subclasses have different generate functions for
|
||||
# parsing and for execution.
|
||||
#
|
||||
# Context class hierarchy
|
||||
#
|
||||
# BaseContext -- core/dbt/context/base.py
|
||||
# SecretContext -- core/dbt/context/secret.py
|
||||
# TargetContext -- core/dbt/context/target.py
|
||||
# ConfiguredContext -- core/dbt/context/configured.py
|
||||
# SchemaYamlContext -- core/dbt/context/configured.py
|
||||
# DocsRuntimeContext -- core/dbt/context/configured.py
|
||||
# MacroResolvingContext -- core/dbt/context/configured.py
|
||||
# ManifestContext -- core/dbt/context/manifest.py
|
||||
# QueryHeaderContext -- core/dbt/context/manifest.py
|
||||
# ProviderContext -- core/dbt/context/provider.py
|
||||
# MacroContext -- core/dbt/context/provider.py
|
||||
# ModelContext -- core/dbt/context/provider.py
|
||||
# TestContext -- core/dbt/context/provider.py
|
||||
# See the `contexts` module README for more information on how contexts work
|
||||
|
||||
|
||||
def get_pytz_module_context() -> Dict[str, Any]:
|
||||
@@ -552,9 +521,8 @@ class BaseContext(metaclass=ContextMeta):
|
||||
{% endif %}
|
||||
|
||||
This supports all flags defined in flags submodule (core/dbt/flags.py)
|
||||
TODO: Replace with object that provides read-only access to flag values
|
||||
"""
|
||||
return flags
|
||||
return flags.get_flag_obj()
|
||||
|
||||
@contextmember
|
||||
@staticmethod
|
||||
@@ -569,7 +537,9 @@ class BaseContext(metaclass=ContextMeta):
|
||||
{{ print("Running some_macro: " ~ arg1 ~ ", " ~ arg2) }}
|
||||
{% endmacro %}"
|
||||
"""
|
||||
print(msg)
|
||||
|
||||
if not flags.NO_PRINT:
|
||||
print(msg)
|
||||
return ""
|
||||
|
||||
|
||||
|
||||
@@ -62,6 +62,8 @@ from dbt.node_types import NodeType
|
||||
|
||||
from dbt.utils import merge, AttrDict, MultiDict
|
||||
|
||||
from dbt import selected_resources
|
||||
|
||||
import agate
|
||||
|
||||
|
||||
@@ -1143,11 +1145,20 @@ class ProviderContext(ManifestContext):
|
||||
msg = f"Env var required but not provided: '{var}'"
|
||||
raise_parsing_error(msg)
|
||||
|
||||
@contextproperty
|
||||
def selected_resources(self) -> List[str]:
|
||||
"""The `selected_resources` variable contains a list of the resources
|
||||
selected based on the parameters provided to the dbt command.
|
||||
Currently, it is not populated for the `run-operation` command, which
|
||||
doesn't support `--select`.
|
||||
"""
|
||||
return selected_resources.SELECTED_RESOURCES
|
||||
|
||||
|
||||
class MacroContext(ProviderContext):
|
||||
"""Internally, macros can be executed like nodes, with some restrictions:
|
||||
|
||||
- they don't have have all values available that nodes do:
|
||||
- they don't have all values available that nodes do:
|
||||
- 'this', 'pre_hooks', 'post_hooks', and 'sql' are missing
|
||||
- 'schema' does not use any 'model' information
|
||||
- they can't be configured with config() directives
|
||||
|
||||
@@ -7,6 +7,9 @@ from dbt.exceptions import raise_parsing_error
|
||||
from dbt.logger import SECRET_ENV_PREFIX
|
||||
|
||||
|
||||
SECRET_PLACEHOLDER = "$$$DBT_SECRET_START$$${}$$$DBT_SECRET_END$$$"
|
||||
|
||||
|
||||
class SecretContext(BaseContext):
|
||||
"""This context is used in profiles.yml + packages.yml. It can render secret
|
||||
env vars that aren't usable elsewhere"""
|
||||
@@ -18,21 +21,29 @@ class SecretContext(BaseContext):
|
||||
|
||||
If the default is None, raise an exception for an undefined variable.
|
||||
|
||||
In this context *only*, env_var will return the actual values of
|
||||
env vars prefixed with DBT_ENV_SECRET_
|
||||
In this context *only*, env_var will accept env vars prefixed with DBT_ENV_SECRET_.
|
||||
It will return the name of the secret env var, wrapped in 'start' and 'end' identifiers.
|
||||
The actual value will be subbed in later in SecretRenderer.render_value()
|
||||
"""
|
||||
return_value = None
|
||||
if var in os.environ:
|
||||
|
||||
# if this is a 'secret' env var, just return the name of the env var
|
||||
# instead of rendering the actual value here, to avoid any risk of
|
||||
# Jinja manipulation. it will be subbed out later, in SecretRenderer.render_value
|
||||
if var in os.environ and var.startswith(SECRET_ENV_PREFIX):
|
||||
return SECRET_PLACEHOLDER.format(var)
|
||||
|
||||
elif var in os.environ:
|
||||
return_value = os.environ[var]
|
||||
elif default is not None:
|
||||
return_value = default
|
||||
|
||||
if return_value is not None:
|
||||
# do not save secret environment variables
|
||||
# store env vars in the internal manifest to power partial parsing
|
||||
# if it's a 'secret' env var, we shouldn't even get here
|
||||
# but just to be safe — don't save secrets
|
||||
if not var.startswith(SECRET_ENV_PREFIX):
|
||||
self.env_vars[var] = return_value
|
||||
|
||||
# return the value even if its a secret
|
||||
return return_value
|
||||
else:
|
||||
msg = f"Env var required but not provided: '{var}'"
|
||||
|
||||
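To make the secrets change above concrete: for variables prefixed with `DBT_ENV_SECRET_`, `env_var()` no longer hands the secret value to Jinja during rendering; it returns the variable name wrapped in start/end markers, and the real value is substituted afterwards. Below is a minimal sketch of that placeholder pattern. Only the prefix and placeholder format are taken from the diff; the helper names and the final substitution step are simplified stand-ins for what `SecretRenderer.render_value()` does.

```python
import os
import re

SECRET_ENV_PREFIX = "DBT_ENV_SECRET_"
SECRET_PLACEHOLDER = "$$$DBT_SECRET_START$$${}$$$DBT_SECRET_END$$$"

def env_var_for_secrets(name, default=None):
    # Secret env vars are never exposed to Jinja directly: only a placeholder
    # containing the *name* of the variable is returned at render time.
    if name in os.environ and name.startswith(SECRET_ENV_PREFIX):
        return SECRET_PLACEHOLDER.format(name)
    if name in os.environ:
        return os.environ[name]
    if default is not None:
        return default
    raise ValueError(f"Env var required but not provided: '{name}'")

def substitute_secrets(rendered: str) -> str:
    # After rendering, swap each placeholder for the real value (simplified).
    pattern = re.compile(r"\$\$\$DBT_SECRET_START\$\$\$(.*?)\$\$\$DBT_SECRET_END\$\$\$")
    return pattern.sub(lambda m: os.environ[m.group(1)], rendered)

if __name__ == "__main__":
    os.environ["DBT_ENV_SECRET_GIT_TOKEN"] = "s3cr3t"
    rendered = "https://user:" + env_var_for_secrets("DBT_ENV_SECRET_GIT_TOKEN") + "@example.com"
    print(substitute_secrets(rendered))  # https://user:s3cr3t@example.com
```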
@@ -104,7 +104,7 @@ class Connection(ExtensibleDbtClassMixin, Replaceable):
|
||||
|
||||
|
||||
class LazyHandle:
|
||||
"""Opener must be a callable that takes a Connection object and opens the
|
||||
"""The opener must be a callable that takes a Connection object and opens the
|
||||
connection, updating the handle on the Connection.
|
||||
"""
|
||||
|
||||
|
||||
@@ -453,7 +453,7 @@ T = TypeVar("T", bound=GraphMemberNode)
|
||||
|
||||
def _update_into(dest: MutableMapping[str, T], new_item: T):
|
||||
"""Update dest to overwrite whatever is at dest[new_item.unique_id] with
|
||||
new_itme. There must be an existing value to overwrite, and they two nodes
|
||||
new_itme. There must be an existing value to overwrite, and the two nodes
|
||||
must have the same original file path.
|
||||
"""
|
||||
unique_id = new_item.unique_id
|
||||
@@ -1091,7 +1091,7 @@ AnyManifest = Union[Manifest, MacroManifest]
|
||||
|
||||
|
||||
@dataclass
|
||||
@schema_version("manifest", 4)
|
||||
@schema_version("manifest", 5)
|
||||
class WritableManifest(ArtifactMixin):
|
||||
nodes: Mapping[UniqueID, ManifestNode] = field(
|
||||
metadata=dict(description=("The nodes defined in the dbt project and its dependencies"))
|
||||
@@ -1135,6 +1135,10 @@ class WritableManifest(ArtifactMixin):
|
||||
)
|
||||
)
|
||||
|
||||
@classmethod
|
||||
def compatible_previous_versions(self):
|
||||
return [("manifest", 4)]
|
||||
|
||||
|
||||
def _check_duplicates(value: HasUniqueID, src: Mapping[str, HasUniqueID]):
|
||||
if value.unique_id in src:
|
||||
|
||||
@@ -335,6 +335,40 @@ class BaseConfig(AdditionalPropertiesAllowed, Replaceable):
|
||||
@dataclass
|
||||
class SourceConfig(BaseConfig):
|
||||
enabled: bool = True
|
||||
# to be implemented to complete CT-201
|
||||
# quoting: Dict[str, Any] = field(
|
||||
# default_factory=dict,
|
||||
# metadata=MergeBehavior.Update.meta(),
|
||||
# )
|
||||
# freshness: Optional[Dict[str, Any]] = field(
|
||||
# default=None,
|
||||
# metadata=CompareBehavior.Exclude.meta(),
|
||||
# )
|
||||
# loader: Optional[str] = field(
|
||||
# default=None,
|
||||
# metadata=CompareBehavior.Exclude.meta(),
|
||||
# )
|
||||
# # TODO what type is this? docs say: "<column_name_or_expression>"
|
||||
# loaded_at_field: Optional[str] = field(
|
||||
# default=None,
|
||||
# metadata=CompareBehavior.Exclude.meta(),
|
||||
# )
|
||||
# database: Optional[str] = field(
|
||||
# default=None,
|
||||
# metadata=CompareBehavior.Exclude.meta(),
|
||||
# )
|
||||
# schema: Optional[str] = field(
|
||||
# default=None,
|
||||
# metadata=CompareBehavior.Exclude.meta(),
|
||||
# )
|
||||
# meta: Dict[str, Any] = field(
|
||||
# default_factory=dict,
|
||||
# metadata=MergeBehavior.Update.meta(),
|
||||
# )
|
||||
# tags: Union[List[str], str] = field(
|
||||
# default_factory=list_str,
|
||||
# metadata=metas(ShowBehavior.Hide, MergeBehavior.Append, CompareBehavior.Exclude),
|
||||
# )
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -389,7 +423,9 @@ class NodeConfig(NodeAndTestConfig):
|
||||
metadata=MergeBehavior.Update.meta(),
|
||||
)
|
||||
full_refresh: Optional[bool] = None
|
||||
unique_key: Optional[Union[str, List[str]]] = None
|
||||
# 'unique_key' doesn't use 'Optional' because typing.get_type_hints was
|
||||
# sometimes getting the Union order wrong, causing serialization failures.
|
||||
unique_key: Union[str, List[str], None] = None
|
||||
on_schema_change: Optional[str] = "ignore"
|
||||
|
||||
@classmethod
|
||||
@@ -483,7 +519,8 @@ class SnapshotConfig(EmptySnapshotConfig):
|
||||
target_schema: Optional[str] = None
|
||||
target_database: Optional[str] = None
|
||||
updated_at: Optional[str] = None
|
||||
check_cols: Optional[Union[str, List[str]]] = None
|
||||
# Not using Optional because of serialization issues with a Union of str and List[str]
|
||||
check_cols: Union[str, List[str], None] = None
|
||||
|
||||
@classmethod
|
||||
def validate(cls, data):
|
||||
|
||||
@@ -242,6 +242,7 @@ class Quoting(dbtClassMixin, Mergeable):
|
||||
|
||||
@dataclass
|
||||
class UnparsedSourceTableDefinition(HasColumnTests, HasTests):
|
||||
config: Dict[str, Any] = field(default_factory=dict)
|
||||
loaded_at_field: Optional[str] = None
|
||||
identifier: Optional[str] = None
|
||||
quoting: Quoting = field(default_factory=Quoting)
|
||||
@@ -322,6 +323,7 @@ class SourcePatch(dbtClassMixin, Replaceable):
|
||||
path: Path = field(
|
||||
metadata=dict(description="The path to the patch-defining yml file"),
|
||||
)
|
||||
config: Dict[str, Any] = field(default_factory=dict)
|
||||
description: Optional[str] = None
|
||||
meta: Optional[Dict[str, Any]] = None
|
||||
database: Optional[str] = None
|
||||
|
||||
@@ -253,6 +253,7 @@ class UserConfig(ExtensibleDbtClassMixin, Replaceable, UserConfigContract):
|
||||
use_experimental_parser: Optional[bool] = None
|
||||
static_parser: Optional[bool] = None
|
||||
indirect_selection: Optional[str] = None
|
||||
cache_selected_only: Optional[bool] = None
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -85,6 +85,7 @@ class RunStatus(StrEnum):
|
||||
|
||||
|
||||
class TestStatus(StrEnum):
|
||||
__test__ = False
|
||||
Pass = NodeStatus.Pass
|
||||
Error = NodeStatus.Error
|
||||
Fail = NodeStatus.Fail
|
||||
|
||||
@@ -1,20 +1,23 @@
|
||||
from pathlib import Path
|
||||
from .graph.manifest import WritableManifest
|
||||
from .results import RunResultsArtifact
|
||||
from .results import FreshnessExecutionResultArtifact
|
||||
from typing import Optional
|
||||
from dbt.exceptions import IncompatibleSchemaException
|
||||
|
||||
|
||||
class PreviousState:
|
||||
def __init__(self, path: Path):
|
||||
def __init__(self, path: Path, current_path: Path):
|
||||
self.path: Path = path
|
||||
self.current_path: Path = current_path
|
||||
self.manifest: Optional[WritableManifest] = None
|
||||
self.results: Optional[RunResultsArtifact] = None
|
||||
self.sources: Optional[FreshnessExecutionResultArtifact] = None
|
||||
self.sources_current: Optional[FreshnessExecutionResultArtifact] = None
|
||||
|
||||
manifest_path = self.path / "manifest.json"
|
||||
if manifest_path.exists() and manifest_path.is_file():
|
||||
try:
|
||||
# we want to bail with an error if schema versions don't match
|
||||
self.manifest = WritableManifest.read_and_check_versions(str(manifest_path))
|
||||
except IncompatibleSchemaException as exc:
|
||||
exc.add_filename(str(manifest_path))
|
||||
@@ -23,8 +26,27 @@ class PreviousState:
|
||||
results_path = self.path / "run_results.json"
|
||||
if results_path.exists() and results_path.is_file():
|
||||
try:
|
||||
# we want to bail with an error if schema versions don't match
|
||||
self.results = RunResultsArtifact.read_and_check_versions(str(results_path))
|
||||
except IncompatibleSchemaException as exc:
|
||||
exc.add_filename(str(results_path))
|
||||
raise
|
||||
|
||||
sources_path = self.path / "sources.json"
|
||||
if sources_path.exists() and sources_path.is_file():
|
||||
try:
|
||||
self.sources = FreshnessExecutionResultArtifact.read_and_check_versions(
|
||||
str(sources_path)
|
||||
)
|
||||
except IncompatibleSchemaException as exc:
|
||||
exc.add_filename(str(sources_path))
|
||||
raise
|
||||
|
||||
sources_current_path = self.current_path / "sources.json"
|
||||
if sources_current_path.exists() and sources_current_path.is_file():
|
||||
try:
|
||||
self.sources_current = FreshnessExecutionResultArtifact.read_and_check_versions(
|
||||
str(sources_current_path)
|
||||
)
|
||||
except IncompatibleSchemaException as exc:
|
||||
exc.add_filename(str(sources_current_path))
|
||||
raise
|
||||
|
||||
@@ -201,6 +201,14 @@ class VersionedSchema(dbtClassMixin):
|
||||
result["$id"] = str(cls.dbt_schema_version)
|
||||
return result
|
||||
|
||||
@classmethod
|
||||
def is_compatible_version(cls, schema_version):
|
||||
compatible_versions = [str(cls.dbt_schema_version)]
|
||||
if hasattr(cls, "compatible_previous_versions"):
|
||||
for name, version in cls.compatible_previous_versions():
|
||||
compatible_versions.append(str(SchemaVersion(name, version)))
|
||||
return str(schema_version) in compatible_versions
|
||||
|
||||
@classmethod
|
||||
def read_and_check_versions(cls, path: str):
|
||||
try:
|
||||
@@ -217,7 +225,7 @@ class VersionedSchema(dbtClassMixin):
|
||||
if "metadata" in data and "dbt_schema_version" in data["metadata"]:
|
||||
previous_schema_version = data["metadata"]["dbt_schema_version"]
|
||||
# cls.dbt_schema_version is a SchemaVersion object
|
||||
if str(cls.dbt_schema_version) != previous_schema_version:
|
||||
if not cls.is_compatible_version(previous_schema_version):
|
||||
raise IncompatibleSchemaException(
|
||||
expected=str(cls.dbt_schema_version), found=previous_schema_version
|
||||
)
|
||||
|
||||
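The net effect of `compatible_previous_versions` and `is_compatible_version` above is that a reader expecting manifest schema v5 will also accept an artifact written with schema v4, instead of raising `IncompatibleSchemaException`. A stripped-down sketch of that check follows; the class name is hypothetical and the schema version is reduced to a simple string, whereas real artifacts use full schema URLs.

```python
class SchemaVersion:
    def __init__(self, name: str, version: int):
        self.name = name
        self.version = version

    def __str__(self) -> str:
        # Real artifacts use a schema URL here; a short "name/vN" string is
        # enough to show the comparison.
        return f"{self.name}/v{self.version}"


class WritableArtifactSketch:
    dbt_schema_version = SchemaVersion("manifest", 5)

    @classmethod
    def compatible_previous_versions(cls):
        # Older schema versions this reader still understands.
        return [("manifest", 4)]

    @classmethod
    def is_compatible_version(cls, schema_version: str) -> bool:
        compatible = [str(cls.dbt_schema_version)]
        compatible += [str(SchemaVersion(n, v)) for n, v in cls.compatible_previous_versions()]
        return schema_version in compatible


if __name__ == "__main__":
    print(WritableArtifactSketch.is_compatible_version("manifest/v4"))  # True
    print(WritableArtifactSketch.is_compatible_version("manifest/v3"))  # False
```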
@@ -35,7 +35,7 @@ class DateTimeSerialization(SerializationStrategy):
|
||||
# jsonschemas for every class and the 'validate' method
|
||||
# come from Hologram.
|
||||
class dbtClassMixin(DataClassDictMixin, JsonSchemaMixin):
|
||||
"""Mixin which adds methods to generate a JSON schema and
|
||||
"""The Mixin adds methods to generate a JSON schema and
|
||||
convert to and from JSON encodable dicts with validation
|
||||
against the schema
|
||||
"""
|
||||
|
||||
@@ -64,7 +64,7 @@ class Event(metaclass=ABCMeta):
|
||||
|
||||
# in theory threads can change so we don't cache them.
|
||||
def get_thread_name(self) -> str:
|
||||
return threading.current_thread().getName()
|
||||
return threading.current_thread().name
|
||||
|
||||
@classmethod
|
||||
def get_invocation_id(cls) -> str:
|
||||
|
||||
@@ -15,7 +15,7 @@ def format_fancy_output_line(
|
||||
progress = ""
|
||||
else:
|
||||
progress = "{} of {} ".format(index, total)
|
||||
prefix = "{progress}{message}".format(progress=progress, message=msg)
|
||||
prefix = "{progress}{message} ".format(progress=progress, message=msg)
|
||||
|
||||
truncate_width = ui.printer_width() - 3
|
||||
justified = prefix.ljust(ui.printer_width(), ".")
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
import colorama
|
||||
from colorama import Style
|
||||
import dbt.events.functions as this # don't worry I hate it too.
|
||||
from dbt.events.base_types import NoStdOut, Event, NoFile, ShowException, Cache
|
||||
@@ -50,25 +49,6 @@ format_color = True
|
||||
format_json = False
|
||||
invocation_id: Optional[str] = None
|
||||
|
||||
# Colorama needs some help on windows because we're using logger.info
|
||||
# intead of print(). If the Windows env doesn't have a TERM var set,
|
||||
# then we should override the logging stream to use the colorama
|
||||
# converter. If the TERM var is set (as with Git Bash), then it's safe
|
||||
# to send escape characters and no log handler injection is needed.
|
||||
colorama_stdout = sys.stdout
|
||||
colorama_wrap = True
|
||||
|
||||
colorama.init(wrap=colorama_wrap)
|
||||
|
||||
if sys.platform == "win32" and not os.getenv("TERM"):
|
||||
colorama_wrap = False
|
||||
colorama_stdout = colorama.AnsiToWin32(sys.stdout).stream
|
||||
|
||||
elif sys.platform == "win32":
|
||||
colorama_wrap = False
|
||||
|
||||
colorama.init(wrap=colorama_wrap)
|
||||
|
||||
|
||||
def setup_event_logger(log_path, level_override=None):
|
||||
# flags have been resolved, and log_path is known
|
||||
@@ -205,8 +185,8 @@ def create_debug_text_log_line(e: T_Event) -> str:
|
||||
scrubbed_msg: str = scrub_secrets(e.message(), env_secrets())
|
||||
level: str = e.level_tag() if len(e.level_tag()) == 5 else f"{e.level_tag()} "
|
||||
thread = ""
|
||||
if threading.current_thread().getName():
|
||||
thread_name = threading.current_thread().getName()
|
||||
if threading.current_thread().name:
|
||||
thread_name = threading.current_thread().name
|
||||
thread_name = thread_name[:10]
|
||||
thread_name = thread_name.ljust(10, " ")
|
||||
thread = f" [{thread_name}]:"
|
||||
|
||||
@@ -291,6 +291,25 @@ class GitProgressCheckedOutAt(DebugLevel):
|
||||
return f" Checked out at {self.end_sha}."
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryIndexProgressMakingGETRequest(DebugLevel):
|
||||
url: str
|
||||
code: str = "M022"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Making package index registry request: GET {self.url}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryIndexProgressGETResponse(DebugLevel):
|
||||
url: str
|
||||
resp_code: int
|
||||
code: str = "M023"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Response from registry index: GET {self.url} {self.resp_code}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryProgressMakingGETRequest(DebugLevel):
|
||||
url: str
|
||||
@@ -310,6 +329,45 @@ class RegistryProgressGETResponse(DebugLevel):
|
||||
return f"Response from registry: GET {self.url} {self.resp_code}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseUnexpectedType(DebugLevel):
|
||||
response: str
|
||||
code: str = "M024"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Response was None: {self.response}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseMissingTopKeys(DebugLevel):
|
||||
response: str
|
||||
code: str = "M025"
|
||||
|
||||
def message(self) -> str:
|
||||
# expected/actual keys logged in exception
|
||||
return f"Response missing top level keys: {self.response}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseMissingNestedKeys(DebugLevel):
|
||||
response: str
|
||||
code: str = "M026"
|
||||
|
||||
def message(self) -> str:
|
||||
# expected/actual keys logged in exception
|
||||
return f"Response missing nested keys: {self.response}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseExtraNestedKeys(DebugLevel):
|
||||
response: str
|
||||
code: str = "M027"
|
||||
|
||||
def message(self) -> str:
|
||||
# expected/actual keys logged in exception
|
||||
return f"Response contained inconsistent keys: {self.response}"
|
||||
|
||||
|
||||
# TODO this was actually `logger.exception(...)` not `logger.error(...)`
|
||||
@dataclass
|
||||
class SystemErrorRetrievingModTime(ErrorLevel):
|
||||
@@ -2346,7 +2404,7 @@ class TrackingInitializeFailure(ShowException, DebugLevel):
|
||||
class RetryExternalCall(DebugLevel):
|
||||
attempt: int
|
||||
max: int
|
||||
code: str = "Z045"
|
||||
code: str = "M020"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Retrying external call. Attempt: {self.attempt} Max attempts: {self.max}"
|
||||
@@ -2381,7 +2439,19 @@ class EventBufferFull(WarnLevel):
|
||||
code: str = "Z048"
|
||||
|
||||
def message(self) -> str:
|
||||
return "Internal event buffer full. Earliest events will be dropped (FIFO)."
|
||||
return (
|
||||
"Internal logging/event buffer full."
|
||||
"Earliest logs/events will be dropped as new ones are fired (FIFO)."
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class RecordRetryException(DebugLevel):
|
||||
exc: Exception
|
||||
code: str = "M021"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"External call exception: {self.exc}"
|
||||
|
||||
|
||||
# since mypy doesn't run on every file we need to suggest to mypy that every
|
||||
@@ -2413,6 +2483,14 @@ if 1 == 0:
|
||||
GitNothingToDo(sha="")
|
||||
GitProgressUpdatedCheckoutRange(start_sha="", end_sha="")
|
||||
GitProgressCheckedOutAt(end_sha="")
|
||||
RegistryIndexProgressMakingGETRequest(url="")
|
||||
RegistryIndexProgressGETResponse(url="", resp_code=1234)
|
||||
RegistryProgressMakingGETRequest(url="")
|
||||
RegistryProgressGETResponse(url="", resp_code=1234)
|
||||
RegistryResponseUnexpectedType(response=""),
|
||||
RegistryResponseMissingTopKeys(response=""),
|
||||
RegistryResponseMissingNestedKeys(response=""),
|
||||
RegistryResponseExtraNestedKeys(response=""),
|
||||
SystemErrorRetrievingModTime(path="")
|
||||
SystemCouldNotWrite(path="", reason="", exc=Exception(""))
|
||||
SystemExecutingCmd(cmd=[""])
|
||||
@@ -2737,3 +2815,4 @@ if 1 == 0:
|
||||
GeneralWarningMsg(msg="", log_fmt="")
|
||||
GeneralWarningException(exc=Exception(""), log_fmt="")
|
||||
EventBufferFull()
|
||||
RecordRetryException(exc=Exception(""))
|
||||
|
||||
@@ -383,10 +383,11 @@ class FailedToConnectException(DatabaseException):
|
||||
|
||||
class CommandError(RuntimeException):
|
||||
def __init__(self, cwd, cmd, message="Error running command"):
|
||||
cmd_scrubbed = list(scrub_secrets(cmd_txt, env_secrets()) for cmd_txt in cmd)
|
||||
super().__init__(message)
|
||||
self.cwd = cwd
|
||||
self.cmd = cmd
|
||||
self.args = (cwd, cmd, message)
|
||||
self.cmd = cmd_scrubbed
|
||||
self.args = (cwd, cmd_scrubbed, message)
|
||||
|
||||
def __str__(self):
|
||||
if len(self.cmd) == 0:
|
||||
@@ -411,9 +412,9 @@ class CommandResultError(CommandError):
|
||||
def __init__(self, cwd, cmd, returncode, stdout, stderr, message="Got a non-zero returncode"):
|
||||
super().__init__(cwd, cmd, message)
|
||||
self.returncode = returncode
|
||||
self.stdout = stdout
|
||||
self.stderr = stderr
|
||||
self.args = (cwd, cmd, returncode, stdout, stderr, message)
|
||||
self.stdout = scrub_secrets(stdout.decode("utf-8"), env_secrets())
|
||||
self.stderr = scrub_secrets(stderr.decode("utf-8"), env_secrets())
|
||||
self.args = (cwd, self.cmd, returncode, self.stdout, self.stderr, message)
|
||||
|
||||
def __str__(self):
|
||||
return "{} running: {}".format(self.msg, self.cmd)
|
||||
@@ -501,7 +502,7 @@ def invalid_type_error(
|
||||
|
||||
|
||||
def invalid_bool_error(got_value, macro_name) -> NoReturn:
|
||||
"""Raise a CompilationException when an macro expects a boolean but gets some
|
||||
"""Raise a CompilationException when a macro expects a boolean but gets some
|
||||
other value.
|
||||
"""
|
||||
msg = (
|
||||
@@ -704,7 +705,6 @@ def missing_materialization(model, adapter_type):
|
||||
|
||||
def bad_package_spec(repo, spec, error_message):
|
||||
msg = "Error checking out spec='{}' for repo {}\n{}".format(spec, repo, error_message)
|
||||
|
||||
raise InternalException(scrub_secrets(msg, env_secrets()))
|
||||
|
||||
|
||||
@@ -838,31 +838,47 @@ def raise_duplicate_macro_name(node_1, node_2, namespace) -> NoReturn:
|
||||
|
||||
def raise_duplicate_resource_name(node_1, node_2):
|
||||
duped_name = node_1.name
|
||||
node_type = NodeType(node_1.resource_type)
|
||||
pluralized = (
|
||||
node_type.pluralize()
|
||||
if node_1.resource_type == node_2.resource_type
|
||||
else "resources" # still raise if ref() collision, e.g. model + seed
|
||||
)
|
||||
|
||||
if node_1.resource_type in NodeType.refable():
|
||||
get_func = 'ref("{}")'.format(duped_name)
|
||||
elif node_1.resource_type == NodeType.Source:
|
||||
action = "looking for"
|
||||
# duplicate 'ref' targets
|
||||
if node_type in NodeType.refable():
|
||||
formatted_name = f'ref("{duped_name}")'
|
||||
# duplicate sources
|
||||
elif node_type == NodeType.Source:
|
||||
duped_name = node_1.get_full_source_name()
|
||||
get_func = node_1.get_source_representation()
|
||||
elif node_1.resource_type == NodeType.Documentation:
|
||||
get_func = 'doc("{}")'.format(duped_name)
|
||||
elif node_1.resource_type == NodeType.Test and "schema" in node_1.tags:
|
||||
return
|
||||
formatted_name = node_1.get_source_representation()
|
||||
# duplicate docs blocks
|
||||
elif node_type == NodeType.Documentation:
|
||||
formatted_name = f'doc("{duped_name}")'
|
||||
# duplicate generic tests
|
||||
elif node_type == NodeType.Test and hasattr(node_1, "test_metadata"):
|
||||
column_name = f'column "{node_1.column_name}" in ' if node_1.column_name else ""
|
||||
model_name = node_1.file_key_name
|
||||
duped_name = f'{node_1.name}" defined on {column_name}"{model_name}'
|
||||
action = "running"
|
||||
formatted_name = "tests"
|
||||
# all other resource types
|
||||
else:
|
||||
get_func = '"{}"'.format(duped_name)
|
||||
formatted_name = duped_name
|
||||
|
||||
# should this be raise_parsing_error instead?
|
||||
raise_compiler_error(
|
||||
'dbt found two resources with the name "{}". Since these resources '
|
||||
"have the same name,\ndbt will be unable to find the correct resource "
|
||||
"when {} is used. To fix this,\nchange the name of one of "
|
||||
"these resources:\n- {} ({})\n- {} ({})".format(
|
||||
duped_name,
|
||||
get_func,
|
||||
node_1.unique_id,
|
||||
node_1.original_file_path,
|
||||
node_2.unique_id,
|
||||
node_2.original_file_path,
|
||||
)
|
||||
f"""
|
||||
dbt found two {pluralized} with the name "{duped_name}".
|
||||
|
||||
Since these resources have the same name, dbt will be unable to find the correct resource
|
||||
when {action} {formatted_name}.
|
||||
|
||||
To fix this, change the name of one of these resources:
|
||||
- {node_1.unique_id} ({node_1.original_file_path})
|
||||
- {node_2.unique_id} ({node_2.original_file_path})
|
||||
""".strip()
|
||||
)
|
||||
|
||||
|
||||
@@ -966,11 +982,11 @@ def raise_duplicate_source_patch_name(patch_1, patch_2):
|
||||
)
|
||||
|
||||
|
||||
def raise_invalid_schema_yml_version(path, issue):
|
||||
def raise_invalid_property_yml_version(path, issue):
|
||||
raise_compiler_error(
|
||||
"The schema file at {} is invalid because {}. Please consult the "
|
||||
"documentation for more information on schema.yml syntax:\n\n"
|
||||
"https://docs.getdbt.com/docs/schemayml-files".format(path, issue)
|
||||
"The yml property file at {} is invalid because {}. Please consult the "
|
||||
"documentation for more information on yml property file syntax:\n\n"
|
||||
"https://docs.getdbt.com/reference/configs-and-properties".format(path, issue)
|
||||
)
|
||||
|
||||
|
||||
@@ -1048,7 +1064,7 @@ CONTEXT_EXPORTS = {
|
||||
raise_dependency_error,
|
||||
raise_duplicate_patch_name,
|
||||
raise_duplicate_resource_name,
|
||||
raise_invalid_schema_yml_version,
|
||||
raise_invalid_property_yml_version,
|
||||
raise_not_implemented,
|
||||
relation_wrong_type,
|
||||
]
|
||||
|
||||
@@ -1,7 +1,9 @@
|
||||
import os
|
||||
# Do not import the os package because we expose this package in jinja
|
||||
from os import name as os_name, path as os_path, getenv as os_getenv
|
||||
import multiprocessing
|
||||
from argparse import Namespace
|
||||
|
||||
if os.name != "nt":
|
||||
if os_name != "nt":
|
||||
# https://bugs.python.org/issue41567
|
||||
import multiprocessing.popen_spawn_posix # type: ignore
|
||||
from pathlib import Path
|
||||
@@ -10,8 +12,8 @@ from typing import Optional
|
||||
# PROFILES_DIR must be set before the other flags
|
||||
# It also gets set in main.py and in set_from_args because the rpc server
|
||||
# doesn't go through exactly the same main arg processing.
|
||||
DEFAULT_PROFILES_DIR = os.path.join(os.path.expanduser("~"), ".dbt")
|
||||
PROFILES_DIR = os.path.expanduser(os.getenv("DBT_PROFILES_DIR", DEFAULT_PROFILES_DIR))
|
||||
DEFAULT_PROFILES_DIR = os_path.join(os_path.expanduser("~"), ".dbt")
|
||||
PROFILES_DIR = os_path.expanduser(os_getenv("DBT_PROFILES_DIR", DEFAULT_PROFILES_DIR))
|
||||
|
||||
STRICT_MODE = False # Only here for backwards compatibility
|
||||
FULL_REFRESH = False # subcommand
|
||||
@@ -35,6 +37,18 @@ INDIRECT_SELECTION = None
|
||||
LOG_CACHE_EVENTS = None
|
||||
EVENT_BUFFER_SIZE = 100000
|
||||
QUIET = None
|
||||
NO_PRINT = None
|
||||
CACHE_SELECTED_ONLY = None
|
||||
|
||||
_NON_BOOLEAN_FLAGS = [
|
||||
"LOG_FORMAT",
|
||||
"PRINTER_WIDTH",
|
||||
"PROFILES_DIR",
|
||||
"INDIRECT_SELECTION",
|
||||
"EVENT_BUFFER_SIZE",
|
||||
]
|
||||
|
||||
_NON_DBT_ENV_FLAGS = ["DO_NOT_TRACK"]
|
||||
|
||||
# Global CLI defaults. These flags are set from three places:
|
||||
# CLI args, environment variables, and user_config (profiles.yml).
|
||||
@@ -57,14 +71,16 @@ flag_defaults = {
|
||||
"LOG_CACHE_EVENTS": False,
|
||||
"EVENT_BUFFER_SIZE": 100000,
|
||||
"QUIET": False,
|
||||
"NO_PRINT": False,
|
||||
"CACHE_SELECTED_ONLY": False,
|
||||
}
|
||||
|
||||
|
||||
def env_set_truthy(key: str) -> Optional[str]:
|
||||
"""Return the value if it was set to a "truthy" string value, or None
|
||||
"""Return the value if it was set to a "truthy" string value or None
|
||||
otherwise.
|
||||
"""
|
||||
value = os.getenv(key)
|
||||
value = os_getenv(key)
|
||||
if not value or value.lower() in ("0", "false", "f"):
|
||||
return None
|
||||
return value
|
||||
@@ -77,7 +93,7 @@ def env_set_bool(env_value):
|
||||
|
||||
|
||||
def env_set_path(key: str) -> Optional[Path]:
|
||||
value = os.getenv(key)
|
||||
value = os_getenv(key)
|
||||
if value is None:
|
||||
return value
|
||||
else:
|
||||
@@ -106,7 +122,7 @@ def set_from_args(args, user_config):
|
||||
global STRICT_MODE, FULL_REFRESH, WARN_ERROR, USE_EXPERIMENTAL_PARSER, STATIC_PARSER
|
||||
global WRITE_JSON, PARTIAL_PARSE, USE_COLORS, STORE_FAILURES, PROFILES_DIR, DEBUG, LOG_FORMAT
|
||||
global INDIRECT_SELECTION, VERSION_CHECK, FAIL_FAST, SEND_ANONYMOUS_USAGE_STATS
|
||||
global PRINTER_WIDTH, WHICH, LOG_CACHE_EVENTS, EVENT_BUFFER_SIZE, QUIET
|
||||
global PRINTER_WIDTH, WHICH, LOG_CACHE_EVENTS, EVENT_BUFFER_SIZE, QUIET, NO_PRINT, CACHE_SELECTED_ONLY
|
||||
|
||||
STRICT_MODE = False # backwards compatibility
|
||||
# cli args without user_config or env var option
|
||||
@@ -132,40 +148,69 @@ def set_from_args(args, user_config):
|
||||
LOG_CACHE_EVENTS = get_flag_value("LOG_CACHE_EVENTS", args, user_config)
|
||||
EVENT_BUFFER_SIZE = get_flag_value("EVENT_BUFFER_SIZE", args, user_config)
|
||||
QUIET = get_flag_value("QUIET", args, user_config)
|
||||
NO_PRINT = get_flag_value("NO_PRINT", args, user_config)
|
||||
CACHE_SELECTED_ONLY = get_flag_value("CACHE_SELECTED_ONLY", args, user_config)
|
||||
|
||||
_set_overrides_from_env()
|
||||
|
||||
|
||||
def _set_overrides_from_env():
|
||||
global SEND_ANONYMOUS_USAGE_STATS
|
||||
|
||||
flag_value = _get_flag_value_from_env("DO_NOT_TRACK")
|
||||
if flag_value is None:
|
||||
return
|
||||
|
||||
SEND_ANONYMOUS_USAGE_STATS = not flag_value
|
||||
|
||||
|
||||
def get_flag_value(flag, args, user_config):
|
||||
lc_flag = flag.lower()
|
||||
flag_value = getattr(args, lc_flag, None)
|
||||
if flag_value is None:
|
||||
# Environment variables use pattern 'DBT_{flag name}'
|
||||
env_flag = f"DBT_{flag}"
|
||||
env_value = os.getenv(env_flag)
|
||||
if env_value is not None and env_value != "":
|
||||
env_value = env_value.lower()
|
||||
# non Boolean values
|
||||
if flag in [
|
||||
"LOG_FORMAT",
|
||||
"PRINTER_WIDTH",
|
||||
"PROFILES_DIR",
|
||||
"INDIRECT_SELECTION",
|
||||
"EVENT_BUFFER_SIZE",
|
||||
]:
|
||||
flag_value = env_value
|
||||
else:
|
||||
flag_value = env_set_bool(env_value)
|
||||
elif user_config is not None and getattr(user_config, lc_flag, None) is not None:
|
||||
flag_value = getattr(user_config, lc_flag)
|
||||
else:
|
||||
flag_value = flag_defaults[flag]
|
||||
flag_value = _load_flag_value(flag, args, user_config)
|
||||
|
||||
if flag in ["PRINTER_WIDTH", "EVENT_BUFFER_SIZE"]: # must be ints
|
||||
flag_value = int(flag_value)
|
||||
if flag == "PROFILES_DIR":
|
||||
flag_value = os.path.abspath(flag_value)
|
||||
flag_value = os_path.abspath(flag_value)
|
||||
|
||||
return flag_value
|
||||
|
||||
|
||||
def _load_flag_value(flag, args, user_config):
|
||||
lc_flag = flag.lower()
|
||||
flag_value = getattr(args, lc_flag, None)
|
||||
if flag_value is not None:
|
||||
return flag_value
|
||||
|
||||
flag_value = _get_flag_value_from_env(flag)
|
||||
if flag_value is not None:
|
||||
return flag_value
|
||||
|
||||
if user_config is not None and getattr(user_config, lc_flag, None) is not None:
|
||||
return getattr(user_config, lc_flag)
|
||||
|
||||
return flag_defaults[flag]
|
||||
|
||||
|
||||
def _get_flag_value_from_env(flag):
|
||||
# Environment variables use pattern 'DBT_{flag name}'
|
||||
env_flag = _get_env_flag(flag)
|
||||
env_value = os_getenv(env_flag)
|
||||
if env_value is None or env_value == "":
|
||||
return None
|
||||
|
||||
env_value = env_value.lower()
|
||||
if flag in _NON_BOOLEAN_FLAGS:
|
||||
flag_value = env_value
|
||||
else:
|
||||
flag_value = env_set_bool(env_value)
|
||||
|
||||
return flag_value
|
||||
|
||||
|
||||
def _get_env_flag(flag):
|
||||
return flag if flag in _NON_DBT_ENV_FLAGS else f"DBT_{flag}"
|
||||
|
||||
|
||||
def get_flag_dict():
|
||||
return {
|
||||
"use_experimental_parser": USE_EXPERIMENTAL_PARSER,
|
||||
@@ -185,4 +230,20 @@ def get_flag_dict():
|
||||
"log_cache_events": LOG_CACHE_EVENTS,
|
||||
"event_buffer_size": EVENT_BUFFER_SIZE,
|
||||
"quiet": QUIET,
|
||||
"no_print": NO_PRINT,
|
||||
"cache_selected_only": CACHE_SELECTED_ONLY,
|
||||
}
|
||||
|
||||
|
||||
# This is used by core/dbt/context/base.py to return a flag object
|
||||
# in Jinja.
|
||||
def get_flag_obj():
|
||||
new_flags = Namespace()
|
||||
for k, v in get_flag_dict().items():
|
||||
setattr(new_flags, k.upper(), v)
|
||||
# The following 3 are CLI arguments only so they're not full-fledged flags,
|
||||
# but we put in flags for users.
|
||||
setattr(new_flags, "FULL_REFRESH", FULL_REFRESH)
|
||||
setattr(new_flags, "STORE_FAILURES", STORE_FAILURES)
|
||||
setattr(new_flags, "WHICH", WHICH)
|
||||
return new_flags
|
||||
|
||||
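The flags refactor above splits resolution into small helpers, but the precedence is unchanged: a CLI argument wins, then a `DBT_<FLAG>` environment variable, then the user config, then the built-in default, with a short list of flags treated as non-boolean strings. Here is a compact sketch of that precedence using two representative flags; the defaults, helper names, and conversion details are illustrative only.

```python
import os
from argparse import Namespace

FLAG_DEFAULTS = {"FAIL_FAST": False, "PRINTER_WIDTH": 80}
NON_BOOLEAN_FLAGS = {"PRINTER_WIDTH"}

def _env_to_bool(value: str) -> bool:
    # Mirrors the truthy/falsy convention above: "0", "false", "f" mean False.
    return value.lower() not in ("0", "false", "f")

def resolve_flag(flag: str, args: Namespace, user_config):
    """CLI arg > DBT_<FLAG> env var > user_config > default."""
    cli_value = getattr(args, flag.lower(), None)
    if cli_value is not None:
        return cli_value

    env_value = os.getenv(f"DBT_{flag}")
    if env_value:
        return env_value.lower() if flag in NON_BOOLEAN_FLAGS else _env_to_bool(env_value)

    config_value = getattr(user_config, flag.lower(), None) if user_config else None
    if config_value is not None:
        return config_value

    return FLAG_DEFAULTS[flag]

if __name__ == "__main__":
    os.environ["DBT_FAIL_FAST"] = "true"
    args = Namespace(fail_fast=None, printer_width=None)
    print(resolve_flag("FAIL_FAST", args, user_config=None))      # True, from the env var
    print(resolve_flag("PRINTER_WIDTH", args, user_config=None))  # 80, from the default
```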
@@ -17,6 +17,8 @@ from dbt.contracts.graph.compiled import GraphMemberNode
|
||||
from dbt.contracts.graph.manifest import Manifest
|
||||
from dbt.contracts.state import PreviousState
|
||||
|
||||
from dbt import selected_resources
|
||||
|
||||
|
||||
def get_package_names(nodes):
|
||||
return set([node.split(".")[1] for node in nodes])
|
||||
@@ -269,6 +271,7 @@ class NodeSelector(MethodManager):
|
||||
dependencies.
|
||||
"""
|
||||
selected_nodes = self.get_selected(spec)
|
||||
selected_resources.set_selected_resources(selected_nodes)
|
||||
new_graph = self.full_graph.get_subset_graph(selected_nodes)
|
||||
# should we give a way here for consumers to mutate the graph?
|
||||
return GraphQueue(new_graph.graph, self.manifest, selected_nodes)
|
||||
|
||||
@@ -48,6 +48,7 @@ class MethodName(StrEnum):
|
||||
Exposure = "exposure"
|
||||
Metric = "metric"
|
||||
Result = "result"
|
||||
SourceStatus = "source_status"
|
||||
|
||||
|
||||
def is_selected_node(fqn: List[str], node_selector: str):
|
||||
@@ -414,20 +415,24 @@ class StateSelectorMethod(SelectorMethod):
|
||||
|
||||
return modified
|
||||
|
||||
def recursively_check_macros_modified(self, node, previous_macros):
|
||||
def recursively_check_macros_modified(self, node, visited_macros):
|
||||
# loop through all macros that this node depends on
|
||||
for macro_uid in node.depends_on.macros:
|
||||
# avoid infinite recursion if we've already seen this macro
|
||||
if macro_uid in previous_macros:
|
||||
if macro_uid in visited_macros:
|
||||
continue
|
||||
previous_macros.append(macro_uid)
|
||||
visited_macros.append(macro_uid)
|
||||
# is this macro one of the modified macros?
|
||||
if macro_uid in self.modified_macros:
|
||||
return True
|
||||
# if not, and this macro depends on other macros, keep looping
|
||||
macro_node = self.manifest.macros[macro_uid]
|
||||
if len(macro_node.depends_on.macros) > 0:
|
||||
return self.recursively_check_macros_modified(macro_node, previous_macros)
|
||||
return self.recursively_check_macros_modified(macro_node, visited_macros)
|
||||
# this macro hasn't been modified, but we haven't checked
|
||||
# the other macros the node depends on, so keep looking
|
||||
elif len(node.depends_on.macros) > len(visited_macros):
|
||||
continue
|
||||
else:
|
||||
return False
|
||||
|
||||
@@ -440,8 +445,8 @@ class StateSelectorMethod(SelectorMethod):
|
||||
return False
|
||||
# recursively loop through upstream macros to see if any is modified
|
||||
else:
|
||||
previous_macros = []
|
||||
return self.recursively_check_macros_modified(node, previous_macros)
|
||||
visited_macros = []
|
||||
return self.recursively_check_macros_modified(node, visited_macros)
|
||||
|
||||
# TODO check modified_content and check_modified macro seems a bit redundant
|
||||
def check_modified_content(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
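The rename from `previous_macros` to `visited_macros` above makes the intent clearer: the list acts as a visited set that guards the recursion against cycles in macro dependencies. A small generic sketch of the same traversal, over a made-up `depends_on` mapping rather than a real manifest:

```python
# macro_id -> macro_ids it depends on (hypothetical data; a cycle is included on purpose)
MACRO_DEPENDS_ON = {
    "macro.a": ["macro.b"],
    "macro.b": ["macro.c", "macro.a"],  # cycles back to macro.a
    "macro.c": [],
}
MODIFIED_MACROS = {"macro.c"}

def any_upstream_macro_modified(macro_id: str, visited: set) -> bool:
    # Depth-first walk; `visited` prevents infinite recursion on cycles.
    for dep in MACRO_DEPENDS_ON.get(macro_id, []):
        if dep in visited:
            continue
        visited.add(dep)
        if dep in MODIFIED_MACROS:
            return True
        if any_upstream_macro_modified(dep, visited):
            return True
    return False

if __name__ == "__main__":
    print(any_upstream_macro_modified("macro.a", set()))  # True, via a -> b -> c
```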
@@ -522,6 +527,62 @@ class ResultSelectorMethod(SelectorMethod):
|
||||
yield node
|
||||
|
||||
|
||||
class SourceStatusSelectorMethod(SelectorMethod):
|
||||
def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
|
||||
|
||||
if self.previous_state is None or self.previous_state.sources is None:
|
||||
raise InternalException(
|
||||
"No previous state comparison freshness results in sources.json"
|
||||
)
|
||||
elif self.previous_state.sources_current is None:
|
||||
raise InternalException(
|
||||
"No current state comparison freshness results in sources.json"
|
||||
)
|
||||
|
||||
current_state_sources = {
|
||||
result.unique_id: getattr(result, "max_loaded_at", None)
|
||||
for result in self.previous_state.sources_current.results
|
||||
if hasattr(result, "max_loaded_at")
|
||||
}
|
||||
|
||||
current_state_sources_runtime_error = {
|
||||
result.unique_id
|
||||
for result in self.previous_state.sources_current.results
|
||||
if not hasattr(result, "max_loaded_at")
|
||||
}
|
||||
|
||||
previous_state_sources = {
|
||||
result.unique_id: getattr(result, "max_loaded_at", None)
|
||||
for result in self.previous_state.sources.results
|
||||
if hasattr(result, "max_loaded_at")
|
||||
}
|
||||
|
||||
previous_state_sources_runtime_error = {
|
||||
result.unique_id
|
||||
for result in self.previous_state.sources_current.results
|
||||
if not hasattr(result, "max_loaded_at")
|
||||
}
|
||||
|
||||
matches = set()
|
||||
if selector == "fresher":
|
||||
for unique_id in current_state_sources:
|
||||
if unique_id not in previous_state_sources:
|
||||
matches.add(unique_id)
|
||||
elif current_state_sources[unique_id] > previous_state_sources[unique_id]:
|
||||
matches.add(unique_id)
|
||||
|
||||
for unique_id in matches:
|
||||
if (
|
||||
unique_id in previous_state_sources_runtime_error
|
||||
or unique_id in current_state_sources_runtime_error
|
||||
):
|
||||
matches.remove(unique_id)
|
||||
|
||||
for node, real_node in self.all_nodes(included_nodes):
|
||||
if node in matches:
|
||||
yield node
|
||||
|
||||
|
||||
class MethodManager:
|
||||
SELECTOR_METHODS: Dict[MethodName, Type[SelectorMethod]] = {
|
||||
MethodName.FQN: QualifiedNameSelectorMethod,
|
||||
@@ -537,6 +598,7 @@ class MethodManager:
|
||||
MethodName.Exposure: ExposureSelectorMethod,
|
||||
MethodName.Metric: MetricSelectorMethod,
|
||||
MethodName.Result: ResultSelectorMethod,
|
||||
MethodName.SourceStatus: SourceStatusSelectorMethod,
|
||||
}
|
||||
|
||||
def __init__(
|
||||
|
||||
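The new `SourceStatusSelectorMethod` above compares `max_loaded_at` between the previous and current `sources.json` artifacts and selects sources whose freshness advanced, skipping anything that hit a runtime error in either run. A reduced sketch of that comparison, using plain dicts and hypothetical unique IDs instead of the artifact classes:

```python
from datetime import datetime

def fresher_sources(previous: dict, current: dict, errored: set) -> set:
    """Return unique_ids whose max_loaded_at moved forward (or are new), minus errors.

    `previous` and `current` map unique_id -> max_loaded_at; `errored` holds
    unique_ids with runtime errors in either artifact.
    """
    matches = set()
    for unique_id, loaded_at in current.items():
        if unique_id not in previous or loaded_at > previous[unique_id]:
            matches.add(unique_id)
    return matches - errored

if __name__ == "__main__":
    prev = {"source.jaffle.orders": datetime(2022, 5, 1, 10, 0)}
    curr = {
        "source.jaffle.orders": datetime(2022, 5, 1, 12, 0),     # fresher than before
        "source.jaffle.customers": datetime(2022, 5, 1, 12, 0),  # new in the current run
    }
    print(fresher_sources(prev, curr, errored=set()))
```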
@@ -27,7 +27,7 @@ class IndirectSelection(StrEnum):
|
||||
|
||||
|
||||
def _probably_path(value: str):
|
||||
"""Decide if value is probably a path. Windows has two path separators, so
|
||||
"""Decide if the value is probably a path. Windows has two path separators, so
|
||||
we should check both sep ('\\') and altsep ('/') there.
|
||||
"""
|
||||
if os.path.sep in value:
|
||||
|
||||
@@ -131,3 +131,10 @@ class Lazy(Generic[T]):
|
||||
if self.memo is None:
|
||||
self.memo = self._typed_eval_f()
|
||||
return self.memo
|
||||
|
||||
|
||||
# This class is used in to_target_dict, so that accesses to missing keys
|
||||
# will return an empty string instead of Undefined
|
||||
class DictDefaultEmptyStr(dict):
|
||||
def __getitem__(self, key):
|
||||
return dict.get(self, key, "")
|
||||
|
||||
@@ -1 +1,15 @@
|
||||
# Include README
|
||||
# Include Module
|
||||
|
||||
The Include module is responsible for housing the default macro definitions, the starter project scaffold, and the HTML file used to generate the docs page.
|
||||
|
||||
# Directories
|
||||
|
||||
## `global_project`
|
||||
Defines the default implementations of Jinja2 macros for `dbt-core`, which each adapter repo can override to better fit that adapter plugin. To view adapter-specific Jinja2 changes, check the relevant adapter repo's [`adapters.sql`](https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/include/bigquery/macros/adapters.sql) file in its `include` directory, or its [`impl.py`](https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/impl.py) file (e.g. `truncate_relation` in BigQuery).
|
||||
|
||||
## `starter_project`
|
||||
Produces the default project after running the `dbt init` command for the CLI. `dbt-cloud` initializes the project by using [dbt-starter-project](https://github.com/dbt-labs/dbt-starter-project).
|
||||
|
||||
|
||||
# Files
|
||||
- `index.html`: a file generated from [dbt-docs](https://github.com/dbt-labs/dbt-docs) prior to new releases and copied into the `dbt-core` directory. It is used to render the docs page after running the `dbt docs generate` command.
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
{% macro default__test_not_null(model, column_name) %}
|
||||
|
||||
select *
|
||||
{% set column_list = '*' if should_store_failures() else column_name %}
|
||||
|
||||
select {{ column_list }}
|
||||
from {{ model }}
|
||||
where {{ column_name }} is null
|
||||
|
||||
|
||||
@@ -56,13 +56,26 @@
|
||||
|
||||
{%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}
|
||||
|
||||
{% if unique_key is not none %}
|
||||
delete from {{ target }}
|
||||
where ({{ unique_key }}) in (
|
||||
select ({{ unique_key }})
|
||||
from {{ source }}
|
||||
);
|
||||
{% endif %}
|
||||
{% if unique_key %}
|
||||
{% if unique_key is sequence and unique_key is not string %}
|
||||
delete from {{target }}
|
||||
using {{ source }}
|
||||
where (
|
||||
{% for key in unique_key %}
|
||||
{{ source }}.{{ key }} = {{ target }}.{{ key }}
|
||||
{{ "and " if not loop.last }}
|
||||
{% endfor %}
|
||||
);
|
||||
{% else %}
|
||||
delete from {{ target }}
|
||||
where (
|
||||
{{ unique_key }}) in (
|
||||
select ({{ unique_key }})
|
||||
from {{ source }}
|
||||
);
|
||||
|
||||
{% endif %}
|
||||
{% endif %}
|
||||
|
||||
insert into {{ target }} ({{ dest_cols_csv }})
|
||||
(
|
||||
|
||||
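The macro change above teaches the default delete+insert incremental strategy to handle a composite `unique_key` (a list) by deleting with a `using` clause that joins on every key column, while a single string key keeps the older `in (select ...)` form. As a Python sketch of the SQL the two branches produce (relation and column names are placeholders, and the real macro renders this through Jinja):

```python
def build_incremental_delete(target: str, source: str, unique_key) -> str:
    # Composite key: join the target to the temp source on every key column.
    if isinstance(unique_key, (list, tuple)):
        predicate = "\n      and ".join(
            f"{source}.{key} = {target}.{key}" for key in unique_key
        )
        return (
            f"delete from {target}\n"
            f"using {source}\n"
            f"where (\n      {predicate}\n);"
        )
    # Single-column key: keep the simpler IN (subquery) form.
    return (
        f"delete from {target}\n"
        f"where ({unique_key}) in (\n"
        f"    select ({unique_key})\n"
        f"    from {source}\n);"
    )

if __name__ == "__main__":
    print(build_incremental_delete("analytics.orders", "orders__dbt_tmp", ["order_id", "order_date"]))
    print(build_incremental_delete("analytics.orders", "orders__dbt_tmp", "order_id"))
```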
@@ -22,6 +22,13 @@
|
||||
{# no-op #}
|
||||
{% endmacro %}
|
||||
|
||||
{% macro get_true_sql() %}
|
||||
{{ adapter.dispatch('get_true_sql', 'dbt')() }}
|
||||
{% endmacro %}
|
||||
|
||||
{% macro default__get_true_sql() %}
|
||||
{{ return('TRUE') }}
|
||||
{% endmacro %}
|
||||
|
||||
{% macro snapshot_staging_table(strategy, source_sql, target_relation) -%}
|
||||
{{ adapter.dispatch('snapshot_staging_table', 'dbt')(strategy, source_sql, target_relation) }}
|
||||
|
||||
@@ -6,10 +6,6 @@
|
||||
{%- set strategy_name = config.get('strategy') -%}
|
||||
{%- set unique_key = config.get('unique_key') %}
|
||||
|
||||
{% if not adapter.check_schema_exists(model.database, model.schema) %}
|
||||
{% do create_schema(model.database, model.schema) %}
|
||||
{% endif %}
|
||||
|
||||
{% set target_relation_exists, target_relation = get_or_create_relation(
|
||||
database=model.database,
|
||||
schema=model.schema,
|
||||
|
||||
@@ -132,17 +132,7 @@
|
||||
{% set check_cols_config = config['check_cols'] %}
|
||||
{% set primary_key = config['unique_key'] %}
|
||||
{% set invalidate_hard_deletes = config.get('invalidate_hard_deletes', false) %}
|
||||
|
||||
{% set select_current_time -%}
|
||||
select {{ snapshot_get_time() }} as snapshot_start
|
||||
{%- endset %}
|
||||
|
||||
{#-- don't access the column by name, to avoid dealing with casing issues on snowflake #}
|
||||
{%- set now = run_query(select_current_time)[0][0] -%}
|
||||
{% if now is none or now is undefined -%}
|
||||
{%- do exceptions.raise_compiler_error('Could not get a snapshot start time from the database') -%}
|
||||
{%- endif %}
|
||||
{% set updated_at = config.get('updated_at', snapshot_string_as_time(now)) %}
|
||||
{% set updated_at = config.get('updated_at', snapshot_get_time()) %}
|
||||
|
||||
{% set column_added = false %}
|
||||
|
||||
@@ -157,7 +147,7 @@
|
||||
{%- set row_changed_expr -%}
|
||||
(
|
||||
{%- if column_added -%}
|
||||
TRUE
|
||||
{{ get_true_sql() }}
|
||||
{%- else -%}
|
||||
{%- for col in check_cols -%}
|
||||
{{ snapshotted_rel }}.{{ col }} != {{ current_rel }}.{{ col }}
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -15,26 +15,16 @@ import colorama
|
||||
import logbook
|
||||
from dbt.dataclass_schema import dbtClassMixin
|
||||
|
||||
# Colorama needs some help on windows because we're using logger.info
|
||||
# intead of print(). If the Windows env doesn't have a TERM var set,
|
||||
# then we should override the logging stream to use the colorama
|
||||
# converter. If the TERM var is set (as with Git Bash), then it's safe
|
||||
# to send escape characters and no log handler injection is needed.
|
||||
colorama_stdout = sys.stdout
|
||||
colorama_wrap = True
|
||||
|
||||
colorama.init(wrap=colorama_wrap)
|
||||
|
||||
|
||||
if sys.platform == "win32" and not os.getenv("TERM"):
|
||||
colorama_wrap = False
|
||||
colorama_stdout = colorama.AnsiToWin32(sys.stdout).stream
|
||||
|
||||
elif sys.platform == "win32":
|
||||
colorama_wrap = False
|
||||
|
||||
colorama.init(wrap=colorama_wrap)
|
||||
# Colorama is needed for colored logs on Windows because we're using logger.info
|
||||
# instead of print(). If the Windows env doesn't have a TERM var set or it is set to None
|
||||
# (i.e. in the case of Git Bash on Windows- this emulates Unix), then it's safe to initialize
|
||||
# Colorama with wrapping turned on which allows us to strip ANSI sequences from stdout.
|
||||
# You can safely initialize Colorama for any OS and the coloring stays the same except
|
||||
# when piped to another process for Linux and MacOS, then it loses the coloring. To combat
|
||||
# that, we will just initialize Colorama when needed on Windows using a non-Unix terminal.
|
||||
|
||||
if sys.platform == "win32" and (not os.getenv("TERM") or os.getenv("TERM") == "None"):
|
||||
colorama.init(wrap=True)
|
||||
|
||||
STDOUT_LOG_FORMAT = "{record.message}"
|
||||
DEBUG_LOG_FORMAT = (
|
||||
@@ -464,7 +454,7 @@ class DelayedFileHandler(logbook.RotatingFileHandler, FormatterMixin):
|
||||
|
||||
|
||||
class LogManager(logbook.NestedSetup):
|
||||
def __init__(self, stdout=colorama_stdout, stderr=sys.stderr):
|
||||
def __init__(self, stdout=sys.stdout, stderr=sys.stderr):
|
||||
self.stdout = stdout
|
||||
self.stderr = stderr
|
||||
self._null_handler = logbook.NullHandler()
|
||||
@@ -632,7 +622,7 @@ def list_handler(
|
||||
lst: Optional[List[LogMessage]],
|
||||
level=logbook.NOTSET,
|
||||
) -> ContextManager:
|
||||
"""Return a context manager that temporarly attaches a list to the logger."""
|
||||
"""Return a context manager that temporarily attaches a list to the logger."""
|
||||
return ListLogHandler(lst=lst, level=level, bubble=True)
|
||||
|
||||
|
||||
|
||||
@@ -5,6 +5,7 @@ import argparse
|
||||
import os.path
|
||||
import sys
|
||||
import traceback
|
||||
import warnings
|
||||
from contextlib import contextmanager
|
||||
from pathlib import Path
|
||||
|
||||
@@ -46,7 +47,7 @@ from dbt.exceptions import InternalException, NotImplementedException, FailedToC
|
||||
|
||||
|
||||
class DBTVersion(argparse.Action):
|
||||
"""This is very very similar to the builtin argparse._Version action,
|
||||
"""This is very similar to the built-in argparse._Version action,
|
||||
except it just calls dbt.version.get_version_information().
|
||||
"""
|
||||
|
||||
@@ -118,6 +119,9 @@ class DBTArgumentParser(argparse.ArgumentParser):
|
||||
|
||||
|
||||
def main(args=None):
|
||||
# Logbook warnings are ignored so we don't have to fork logbook to support python 3.10.
|
||||
# This _only_ works for regular cli invocations.
|
||||
warnings.filterwarnings("ignore", category=DeprecationWarning, module="logbook")
|
||||
if args is None:
|
||||
args = sys.argv[1:]
|
||||
with log_manager.applicationbound():
|
||||
@@ -1084,6 +1088,36 @@ def parse_args(args, cls=DBTArgumentParser):
|
||||
""",
|
||||
)
|
||||
|
||||
p.add_argument(
|
||||
"--no-print",
|
||||
action="store_true",
|
||||
default=None,
|
||||
help="""
|
||||
Suppress all {{ print() }} macro calls.
|
||||
""",
|
||||
)
|
||||
|
||||
schema_cache_flag = p.add_mutually_exclusive_group()
|
||||
schema_cache_flag.add_argument(
|
||||
"--cache-selected-only",
|
||||
action="store_const",
|
||||
const=True,
|
||||
default=None,
|
||||
dest="cache_selected_only",
|
||||
help="""
|
||||
Pre cache database objects relevant to selected resource only.
|
||||
""",
|
||||
)
|
||||
schema_cache_flag.add_argument(
|
||||
"--no-cache-selected-only",
|
||||
action="store_const",
|
||||
const=False,
|
||||
dest="cache_selected_only",
|
||||
help="""
|
||||
Pre cache all database objects related to the project.
|
||||
""",
|
||||
)
|
||||
|
||||
subs = p.add_subparsers(title="Available sub-commands")
|
||||
|
||||
base_subparser = _build_base_subparser()
|
||||
|
||||
@@ -25,9 +25,13 @@ from dbt.exceptions import raise_compiler_error, raise_parsing_error
|
||||
from dbt.parser.search import FileBlock
|
||||
|
||||
|
||||
def get_nice_generic_test_name(
|
||||
def synthesize_generic_test_names(
|
||||
test_type: str, test_name: str, args: Dict[str, Any]
|
||||
) -> Tuple[str, str]:
|
||||
# Using the type, name, and arguments to this generic test, synthesize a (hopefully) unique name
|
||||
# Will not be unique if multiple tests have same name + arguments, and only configs differ
|
||||
# Returns a shorter version (hashed/truncated, for the compiled file)
|
||||
# as well as the full name (for the unique_id + FQN)
|
||||
flat_args = []
|
||||
for arg_name in sorted(args):
|
||||
# the model is already embedded in the name, so skip it
|
||||
@@ -263,13 +267,25 @@ class TestBuilder(Generic[Testable]):
|
||||
if self.namespace is not None:
|
||||
self.package_name = self.namespace
|
||||
|
||||
compiled_name, fqn_name = self.get_test_name()
|
||||
self.compiled_name: str = compiled_name
|
||||
self.fqn_name: str = fqn_name
|
||||
# If the user has provided a custom name for this generic test, use it
|
||||
# Then delete the "name" argument to avoid passing it into the test macro
|
||||
# Otherwise, use an auto-generated name synthesized from test inputs
|
||||
self.compiled_name: str = ""
|
||||
self.fqn_name: str = ""
|
||||
|
||||
# use hashed name as alias if too long
|
||||
if compiled_name != fqn_name and "alias" not in self.config:
|
||||
self.config["alias"] = compiled_name
|
||||
if "name" in self.args:
|
||||
# Assign the user-defined name here, which will be checked for uniqueness later
|
||||
# we will raise an error if two tests have same name for same model + column combo
|
||||
self.compiled_name = self.args["name"]
|
||||
self.fqn_name = self.args["name"]
|
||||
del self.args["name"]
|
||||
else:
|
||||
short_name, full_name = self.get_synthetic_test_names()
|
||||
self.compiled_name = short_name
|
||||
self.fqn_name = full_name
|
||||
# use hashed name as alias if full name is too long
|
||||
if short_name != full_name and "alias" not in self.config:
|
||||
self.config["alias"] = short_name
|
||||
|
||||
def _bad_type(self) -> TypeError:
|
||||
return TypeError('invalid target type "{}"'.format(type(self.target)))
|
||||
@@ -281,13 +297,23 @@ class TestBuilder(Generic[Testable]):
|
||||
"test must be dict or str, got {} (value {})".format(type(test), test)
|
||||
)
|
||||
|
||||
test = list(test.items())
|
||||
if len(test) != 1:
|
||||
raise_parsing_error(
|
||||
"test definition dictionary must have exactly one key, got"
|
||||
" {} instead ({} keys)".format(test, len(test))
|
||||
)
|
||||
test_name, test_args = test[0]
|
||||
# If the test is a dictionary with top-level keys, the test name is "test_name"
|
||||
# and the rest are arguments
|
||||
# {'name': 'my_favorite_test', 'test_name': 'unique', 'config': {'where': '1=1'}}
|
||||
if "test_name" in test.keys():
|
||||
test_name = test.pop("test_name")
|
||||
test_args = test
|
||||
# If the test is a nested dictionary with one top-level key, the test name
|
||||
# is the dict name, and nested keys are arguments
|
||||
# {'unique': {'name': 'my_favorite_test', 'config': {'where': '1=1'}}}
|
||||
else:
|
||||
test = list(test.items())
|
||||
if len(test) != 1:
|
||||
raise_parsing_error(
|
||||
"test definition dictionary must have exactly one key, got"
|
||||
" {} instead ({} keys)".format(test, len(test))
|
||||
)
|
||||
test_name, test_args = test[0]
|
||||
|
||||
if not isinstance(test_args, dict):
|
||||
raise_parsing_error(
|
||||
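Restating the parsing logic above: a generic test entry may now be a flat dict with a `test_name` key, where everything else (including an optional custom `name`) is treated as arguments, or the older one-key nested dict whose single key is the test name. A standalone sketch of that branching on plain dicts; the inputs are illustrative and not tied to dbt's parser classes:

```python
def split_test_definition(test: dict):
    """Return (test_name, test_args) for either supported dict shape."""
    test = dict(test)  # don't mutate the caller's dict
    # Flat form: {'name': 'my_favorite_test', 'test_name': 'unique', 'config': {...}}
    if "test_name" in test:
        return test.pop("test_name"), test
    # Nested form: {'unique': {'name': 'my_favorite_test', 'config': {...}}}
    items = list(test.items())
    if len(items) != 1:
        raise ValueError(
            f"test definition dictionary must have exactly one key, got {len(items)}"
        )
    return items[0]

if __name__ == "__main__":
    print(split_test_definition({"name": "my_favorite_test", "test_name": "unique"}))
    print(split_test_definition({"unique": {"name": "my_favorite_test"}}))
```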
@@ -401,7 +427,8 @@ class TestBuilder(Generic[Testable]):
|
||||
macro_name = "{}.{}".format(self.namespace, macro_name)
|
||||
return macro_name
|
||||
|
||||
def get_test_name(self) -> Tuple[str, str]:
|
||||
def get_synthetic_test_names(self) -> Tuple[str, str]:
|
||||
# Returns two names: shorter (for the compiled file), full (for the unique_id + FQN)
|
||||
if isinstance(self.target, UnparsedNodeUpdate):
|
||||
name = self.name
|
||||
elif isinstance(self.target, UnpatchedSourceDefinition):
|
||||
@@ -410,7 +437,7 @@ class TestBuilder(Generic[Testable]):
|
||||
raise self._bad_type()
|
||||
if self.namespace is not None:
|
||||
name = "{}_{}".format(self.namespace, name)
|
||||
return get_nice_generic_test_name(name, self.target.name, self.args)
|
||||
return synthesize_generic_test_names(name, self.target.name, self.args)
|
||||
|
||||
def construct_config(self) -> str:
|
||||
configs = ",".join(
|
||||
|
||||
@@ -465,6 +465,8 @@ class ManifestLoader:
|
||||
else:
|
||||
dct = block.file.dict_from_yaml
|
||||
parser.parse_file(block, dct=dct)
|
||||
# Came out of here with UnpatchedSourceDefinition containing configs at the source level
|
||||
# and not configs at the table level (as expected)
|
||||
else:
|
||||
parser.parse_file(block)
|
||||
project_parsed_path_count += 1
|
||||
@@ -659,7 +661,8 @@ class ManifestLoader:
|
||||
reparse_reason = ReparseReason.file_not_found
|
||||
|
||||
# this event is only fired if a full reparse is needed
|
||||
dbt.tracking.track_partial_parser({"full_reparse_reason": reparse_reason})
|
||||
if dbt.tracking.active_user is not None: # no active_user if doing load_macros
|
||||
dbt.tracking.track_partial_parser({"full_reparse_reason": reparse_reason})
|
||||
|
||||
return None
|
||||
|
||||
@@ -777,9 +780,13 @@ class ManifestLoader:
|
||||
cls,
|
||||
root_config: RuntimeConfig,
|
||||
macro_hook: Callable[[Manifest], Any],
|
||||
base_macros_only=False,
|
||||
) -> Manifest:
|
||||
with PARSING_STATE:
|
||||
projects = root_config.load_dependencies()
|
||||
# base_only/base_macros_only: for testing only,
|
||||
# allows loading macros without running 'dbt deps' first
|
||||
projects = root_config.load_dependencies(base_only=base_macros_only)
|
||||
|
||||
# This creates a loader object, including result,
|
||||
# and then throws it away, returning only the
|
||||
# manifest
|
||||
@@ -1035,7 +1042,7 @@ def _process_docs_for_metrics(context: Dict[str, Any], metric: ParsedMetric) ->
|
||||
|
||||
|
||||
def _process_refs_for_exposure(manifest: Manifest, current_project: str, exposure: ParsedExposure):
|
||||
"""Given a manifest and a exposure in that manifest, process its refs"""
|
||||
"""Given a manifest and exposure in that manifest, process its refs"""
|
||||
for ref in exposure.refs:
|
||||
target_model: Optional[Union[Disabled, ManifestNode]] = None
|
||||
target_model_name: str
|
||||
|
||||
@@ -291,10 +291,10 @@ class PartialParsing:
if self.already_scheduled_for_parsing(old_source_file):
return

# These files only have one node.
unique_id = None
# These files only have one node except for snapshots
unique_ids = []
if old_source_file.nodes:
unique_id = old_source_file.nodes[0]
unique_ids = old_source_file.nodes
else:
# It's not clear when this would actually happen.
# Logging in case there are other associated errors.
@@ -305,7 +305,7 @@ class PartialParsing:
self.deleted_manifest.files[file_id] = old_source_file
self.saved_files[file_id] = deepcopy(new_source_file)
self.add_to_pp_files(new_source_file)
if unique_id:
for unique_id in unique_ids:
self.remove_node_in_saved(new_source_file, unique_id)

def remove_node_in_saved(self, source_file, unique_id):
@@ -379,7 +379,7 @@ class PartialParsing:
if not source_file.nodes:
fire_event(PartialParsingMissingNodes(file_id=source_file.file_id))
return
# There is generally only 1 node for SQL files, except for macros
# There is generally only 1 node for SQL files, except for macros and snapshots
for unique_id in source_file.nodes:
self.remove_node_in_saved(source_file, unique_id)
self.schedule_referencing_nodes_for_parsing(unique_id)
@@ -573,6 +573,10 @@ class PartialParsing:
# doc source_file.nodes
self.schedule_nodes_for_parsing(source_file.nodes)
source_file.nodes = []
# Remove the file object
self.deleted_manifest.files[source_file.file_id] = self.saved_manifest.files.pop(
source_file.file_id
)

# Schema files -----------------------
# Changed schema files
@@ -49,7 +49,7 @@ from dbt.exceptions import (
warn_invalid_patch,
validator_error_message,
JSONValidationException,
raise_invalid_schema_yml_version,
raise_invalid_property_yml_version,
ValidationException,
ParsingException,
raise_duplicate_patch_name,
@@ -94,7 +94,10 @@ schema_file_keys = (


def error_context(
path: str, key: str, data: Any, cause: Union[str, ValidationException, JSONValidationException]
path: str,
key: str,
data: Any,
cause: Union[str, ValidationException, JSONValidationException],
) -> str:
"""Provide contextual information about an error while parsing"""
if isinstance(cause, str):
@@ -112,7 +115,8 @@ def yaml_from_file(source_file: SchemaSourceFile) -> Dict[str, Any]:
"""If loading the yaml fails, raise an exception."""
path = source_file.path.relative_path
try:
return load_yaml_text(source_file.contents)
# source_file.contents can sometimes be None
return load_yaml_text(source_file.contents or "")
except ValidationException as e:
reason = validator_error_message(e)
raise ParsingException(
@@ -544,15 +548,23 @@ class SchemaParser(SimpleParser[GenericTestBlock, ParsedGenericTestNode]):

def check_format_version(file_path, yaml_dct) -> None:
if "version" not in yaml_dct:
raise_invalid_schema_yml_version(file_path, "no version is specified")
raise_invalid_property_yml_version(
file_path, "the yml property file {} is missing a version tag".format(file_path)
)

version = yaml_dct["version"]
# if it's not an integer, the version is malformed, or not
# set. Either way, only 'version: 2' is supported.
if not isinstance(version, int):
raise_invalid_schema_yml_version(file_path, "the version is not an integer")
raise_invalid_property_yml_version(
file_path,
"its 'version:' tag must be an integer (e.g. version: 2)."
" {} is not an integer".format(version),
)
if version != 2:
raise_invalid_schema_yml_version(file_path, "version {} is not supported".format(version))
raise_invalid_property_yml_version(
file_path, "its 'version:' tag is set to {}. Only 2 is supported".format(version)
)


Parsed = TypeVar("Parsed", UnpatchedSourceDefinition, ParsedNodePatch, ParsedMacroPatch)
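To illustrate the behavior of the new check_format_version above (the file path and dictionaries here are hypothetical, not part of the diff):

check_format_version("models/schema.yml", {"version": 2})    # passes
check_format_version("models/schema.yml", {"models": []})    # raises: the file is missing a version tag
check_format_version("models/schema.yml", {"version": "2"})  # raises: the 'version:' tag must be an integer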
@@ -1,6 +1,6 @@
import itertools
from pathlib import Path
from typing import Iterable, Dict, Optional, Set, List, Any
from typing import Iterable, Dict, Optional, Set, Any
from dbt.adapters.factory import get_adapter
from dbt.config import RuntimeConfig
from dbt.context.context_config import (
@@ -137,15 +137,13 @@ class SourcePatcher:
tags = sorted(set(itertools.chain(source.tags, table.tags)))

config = self._generate_source_config(
fqn=target.fqn,
target=target,
rendered=True,
project_name=target.package_name,
)

unrendered_config = self._generate_source_config(
fqn=target.fqn,
target=target,
rendered=False,
project_name=target.package_name,
)

if not isinstance(config, SourceConfig):
@@ -261,19 +259,29 @@ class SourcePatcher:
)
return node

def _generate_source_config(self, fqn: List[str], rendered: bool, project_name: str):
def _generate_source_config(self, target: UnpatchedSourceDefinition, rendered: bool):
generator: BaseContextConfigGenerator
if rendered:
generator = ContextConfigGenerator(self.root_project)
else:
generator = UnrenderedConfigGenerator(self.root_project)

# configs with precedence set
precedence_configs = dict()
# first apply source configs
precedence_configs.update(target.source.config)
# then override anything that is defined on source tables
# this is not quite complex enough for configs that can be set as top-level node keys, but
# it works while source configs can only include `enabled`.
precedence_configs.update(target.table.config)

return generator.calculate_node_config(
config_call_dict={},
fqn=fqn,
fqn=target.fqn,
resource_type=NodeType.Source,
project_name=project_name,
project_name=target.package_name,
base=False,
patch_config_dict=precedence_configs,
)

def _get_relation_name(self, node: ParsedSourceDefinition):
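As a small illustration of the precedence described in the comments above (a hypothetical source entry, not taken from this diff), a table-level config overrides the source-level one:

# Hypothetical source properties; the table-level `enabled` wins for that table.
source_properties = {
    "name": "raw",
    "config": {"enabled": True},  # applied first, at the source level
    "tables": [
        {"name": "orders", "config": {"enabled": False}},  # overrides for this table
    ],
}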
6 core/dbt/selected_resources.py (new file)
@@ -0,0 +1,6 @@
SELECTED_RESOURCES = []


def set_selected_resources(selected_resources):
global SELECTED_RESOURCES
SELECTED_RESOURCES = list(selected_resources)
@@ -41,7 +41,7 @@ CATALOG_FILENAME = "catalog.json"


def get_stripped_prefix(source: Dict[str, Any], prefix: str) -> Dict[str, Any]:
"""Go through source, extracting every key/value pair where the key starts
"""Go through the source, extracting every key/value pair where the key starts
with the given prefix.
"""
cut = len(prefix)

@@ -436,8 +436,9 @@ class RunTask(CompileTask):

def before_run(self, adapter, selected_uids: AbstractSet[str]):
with adapter.connection_named("master"):
self.create_schemas(adapter, selected_uids)
self.populate_adapter_cache(adapter)
required_schemas = self.get_model_schemas(adapter, selected_uids)
self.create_schemas(adapter, required_schemas)
self.populate_adapter_cache(adapter, required_schemas)
self.defer_to_manifest(adapter, selected_uids)
self.safe_run_hooks(adapter, RunHookType.Start, {})
@@ -1,6 +1,7 @@
import os
import time
import json
from pathlib import Path
from abc import abstractmethod
from concurrent.futures import as_completed
from datetime import datetime
@@ -50,6 +51,7 @@ from dbt.exceptions import (

from dbt.graph import GraphQueue, NodeSelector, SelectionSpec, parse_difference, Graph
from dbt.parser.manifest import ManifestLoader
import dbt.tracking

import dbt.exceptions
from dbt import flags
@@ -88,7 +90,12 @@ class ManifestTask(ConfiguredTask):

def _runtime_initialize(self):
self.load_manifest()

start_compile_manifest = time.perf_counter()
self.compile_manifest()
compile_time = time.perf_counter() - start_compile_manifest
if dbt.tracking.active_user is not None:
dbt.tracking.track_runnable_timing({"graph_compilation_elapsed": compile_time})


class GraphRunnableTask(ManifestTask):
@@ -110,7 +117,9 @@ class GraphRunnableTask(ManifestTask):

def set_previous_state(self):
if self.args.state is not None:
self.previous_state = PreviousState(self.args.state)
self.previous_state = PreviousState(
path=self.args.state, current_path=Path(self.config.target_path)
)

def index_offset(self, value: int) -> int:
return value
@@ -390,8 +399,17 @@ class GraphRunnableTask(ManifestTask):
for dep_node_id in self.graph.get_dependent_nodes(node_id):
self._skipped_children[dep_node_id] = cause

def populate_adapter_cache(self, adapter):
adapter.set_relations_cache(self.manifest)
def populate_adapter_cache(self, adapter, required_schemas: Set[BaseRelation] = None):
start_populate_cache = time.perf_counter()
if flags.CACHE_SELECTED_ONLY is True:
adapter.set_relations_cache(self.manifest, required_schemas=required_schemas)
else:
adapter.set_relations_cache(self.manifest)
cache_populate_time = time.perf_counter() - start_populate_cache
if dbt.tracking.active_user is not None:
dbt.tracking.track_runnable_timing(
{"adapter_cache_construction_elapsed": cache_populate_time}
)

def before_hooks(self, adapter):
pass
@@ -489,8 +507,7 @@ class GraphRunnableTask(ManifestTask):

return result

def create_schemas(self, adapter, selected_uids: Iterable[str]):
required_schemas = self.get_model_schemas(adapter, selected_uids)
def create_schemas(self, adapter, required_schemas: Set[BaseRelation]):
# we want the string form of the information schema database
required_databases: Set[BaseRelation] = set()
for required in required_schemas:

@@ -230,6 +230,8 @@ class TestTask(RunTask):
constraints are satisfied.
"""

__test__ = False

def raise_on_first_error(self):
return False
1 core/dbt/tests/fixtures/__init__.py (vendored, new file)
@@ -0,0 +1 @@
# dbt.tests.fixtures directory

462 core/dbt/tests/fixtures/project.py (vendored, new file)
@@ -0,0 +1,462 @@
import os
|
||||
import pytest # type: ignore
|
||||
import random
|
||||
from argparse import Namespace
|
||||
from datetime import datetime
|
||||
import warnings
|
||||
import yaml
|
||||
|
||||
import dbt.flags as flags
|
||||
from dbt.config.runtime import RuntimeConfig
|
||||
from dbt.adapters.factory import get_adapter, register_adapter, reset_adapters, get_adapter_by_type
|
||||
from dbt.events.functions import setup_event_logger
|
||||
from dbt.tests.util import (
|
||||
write_file,
|
||||
run_sql_with_adapter,
|
||||
TestProcessingException,
|
||||
get_connection,
|
||||
)
|
||||
|
||||
|
||||
# These are the fixtures that are used in dbt core functional tests
|
||||
#
|
||||
# The main functional test fixture is the 'project' fixture, which combines
|
||||
# other fixtures, writes out a dbt project in a temporary directory, creates a temp
|
||||
# schema in the testing database, and returns a `TestProjInfo` object that
|
||||
# contains information from the other fixtures for convenience.
|
||||
#
|
||||
# The models, macros, seeds, snapshots, tests, and analysis fixtures all
|
||||
# represent directories in a dbt project, and are all dictionaries with
|
||||
# file name keys and file contents values.
|
||||
#
|
||||
# The other commonly used fixture is 'project_config_update'. Other
|
||||
# occasionally used fixtures are 'profiles_config_update', 'packages',
|
||||
# and 'selectors'.
|
||||
#
|
||||
# Most test cases have fairly small files which are best included in
|
||||
# the test case file itself as string variables, to make it easy to
|
||||
# understand what is happening in the test. Files which are used
|
||||
# in multiple test case files can be included in a common file, such as
|
||||
# files.py or fixtures.py. Large files, such as seed files, which would
|
||||
# just clutter the test file can be pulled in from 'data' subdirectories
|
||||
# in the test directory.
|
||||
#
|
||||
# Test logs are written in the 'logs' directory in the root of the repo.
|
||||
# Every test case writes to a log directory with the same 'prefix' as the
|
||||
# test's unique schema.
|
||||
#
|
||||
# These fixtures have "class" scope. Class scope fixtures can be used both
|
||||
# in classes and in single test functions (which act as classes for this
|
||||
# purpose). Pytest will collect all classes starting with 'Test', so if
|
||||
# you have a class that you want to be subclassed, it's generally best to
|
||||
# not start the class name with 'Test'. All standalone functions starting with
|
||||
# 'test_' and methods in classes starting with 'test_' (in classes starting
|
||||
# with 'Test') will be collected.
|
||||
#
|
||||
# Please see the pytest docs for further information:
|
||||
# https://docs.pytest.org
|
||||
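As a quick sketch of how these fixtures fit together (a hypothetical test case, not part of this change), a functional test provides file fixtures and then drives dbt with run_dbt from dbt.tests.util:

import pytest
from dbt.tests.util import run_dbt

# Hypothetical model contents used only for this sketch
my_model_sql = "select 1 as id"

class TestMyModel:
    @pytest.fixture(scope="class")
    def models(self):
        return {"my_model.sql": my_model_sql}

    def test_run(self, project):
        results = run_dbt(["run"])
        assert len(results) == 1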
|
||||
|
||||
# Used in constructing the unique_schema and logs_dir
|
||||
@pytest.fixture(scope="class")
|
||||
def prefix():
|
||||
# create a directory name that will be unique per test session
|
||||
_randint = random.randint(0, 9999)
|
||||
_runtime_timedelta = datetime.utcnow() - datetime(1970, 1, 1, 0, 0, 0)
|
||||
_runtime = (int(_runtime_timedelta.total_seconds() * 1e6)) + _runtime_timedelta.microseconds
|
||||
prefix = f"test{_runtime}{_randint:04}"
|
||||
return prefix
|
||||
|
||||
|
||||
# Every test has a unique schema
|
||||
@pytest.fixture(scope="class")
|
||||
def unique_schema(request, prefix) -> str:
|
||||
test_file = request.module.__name__
|
||||
# We only want the last part of the name
|
||||
test_file = test_file.split(".")[-1]
|
||||
unique_schema = f"{prefix}_{test_file}"
|
||||
return unique_schema
|
||||
|
||||
|
||||
# Create a directory for the profile using tmpdir fixture
|
||||
@pytest.fixture(scope="class")
|
||||
def profiles_root(tmpdir_factory):
|
||||
return tmpdir_factory.mktemp("profile")
|
||||
|
||||
|
||||
# Create a directory for the project using tmpdir fixture
|
||||
@pytest.fixture(scope="class")
|
||||
def project_root(tmpdir_factory):
|
||||
# tmpdir docs - https://docs.pytest.org/en/6.2.x/tmpdir.html
|
||||
project_root = tmpdir_factory.mktemp("project")
|
||||
print(f"\n=== Test project_root: {project_root}")
|
||||
return project_root
|
||||
|
||||
|
||||
# This is for data used by multiple tests, in the 'tests/data' directory
|
||||
@pytest.fixture(scope="session")
|
||||
def shared_data_dir(request):
|
||||
return os.path.join(request.config.rootdir, "tests", "data")
|
||||
|
||||
|
||||
# This is for data for a specific test directory, i.e. tests/basic/data
|
||||
@pytest.fixture(scope="module")
|
||||
def test_data_dir(request):
|
||||
return os.path.join(request.fspath.dirname, "data")
|
||||
|
||||
|
||||
# This contains the profile target information, for simplicity in setting
|
||||
# up different profiles, particularly in the adapter repos.
|
||||
# Note: because we load the profile to create the adapter, this
|
||||
# fixture can't be used to test vars and env_vars or errors. The
|
||||
# profile must be written out after the test starts.
|
||||
@pytest.fixture(scope="class")
|
||||
def dbt_profile_target():
|
||||
return {
|
||||
"type": "postgres",
|
||||
"threads": 4,
|
||||
"host": "localhost",
|
||||
"port": int(os.getenv("POSTGRES_TEST_PORT", 5432)),
|
||||
"user": os.getenv("POSTGRES_TEST_USER", "root"),
|
||||
"pass": os.getenv("POSTGRES_TEST_PASS", "password"),
|
||||
"dbname": os.getenv("POSTGRES_TEST_DATABASE", "dbt"),
|
||||
}
|
||||
|
||||
|
||||
# This fixture can be overridden in a project. The data provided in this
|
||||
# fixture will be merged into the default project dictionary via a python 'update'.
|
||||
@pytest.fixture(scope="class")
|
||||
def profiles_config_update():
|
||||
return {}
|
||||
|
||||
|
||||
# The profile dictionary, used to write out profiles.yml. It will pull in updates
|
||||
# from two separate sources, the 'profile_target' and 'profiles_config_update'.
|
||||
# The second one is useful when using alternative targets, etc.
|
||||
@pytest.fixture(scope="class")
|
||||
def dbt_profile_data(unique_schema, dbt_profile_target, profiles_config_update):
|
||||
profile = {
|
||||
"config": {"send_anonymous_usage_stats": False},
|
||||
"test": {
|
||||
"outputs": {
|
||||
"default": {},
|
||||
},
|
||||
"target": "default",
|
||||
},
|
||||
}
|
||||
target = dbt_profile_target
|
||||
target["schema"] = unique_schema
|
||||
profile["test"]["outputs"]["default"] = target
|
||||
|
||||
if profiles_config_update:
|
||||
profile.update(profiles_config_update)
|
||||
return profile
|
||||
|
||||
|
||||
# Write out the profile data as a yaml file
|
||||
@pytest.fixture(scope="class")
|
||||
def profiles_yml(profiles_root, dbt_profile_data):
|
||||
os.environ["DBT_PROFILES_DIR"] = str(profiles_root)
|
||||
write_file(yaml.safe_dump(dbt_profile_data), profiles_root, "profiles.yml")
|
||||
yield dbt_profile_data
|
||||
del os.environ["DBT_PROFILES_DIR"]
|
||||
|
||||
|
||||
# Data used to update the dbt_project config data.
|
||||
@pytest.fixture(scope="class")
|
||||
def project_config_update():
|
||||
return {}
|
||||
|
||||
|
||||
# Combines the project_config_update dictionary with project_config defaults to
|
||||
# produce a project_yml config and write it out as dbt_project.yml
|
||||
@pytest.fixture(scope="class")
|
||||
def dbt_project_yml(project_root, project_config_update, logs_dir):
|
||||
project_config = {
|
||||
"config-version": 2,
|
||||
"name": "test",
|
||||
"version": "0.1.0",
|
||||
"profile": "test",
|
||||
"log-path": logs_dir,
|
||||
}
|
||||
if project_config_update:
|
||||
project_config.update(project_config_update)
|
||||
write_file(yaml.safe_dump(project_config), project_root, "dbt_project.yml")
|
||||
return project_config
|
||||
|
||||
|
||||
# Fixture to provide packages as either yaml or dictionary
|
||||
@pytest.fixture(scope="class")
|
||||
def packages():
|
||||
return {}
|
||||
|
||||
|
||||
# Write out the packages.yml file
|
||||
@pytest.fixture(scope="class")
|
||||
def packages_yml(project_root, packages):
|
||||
if packages:
|
||||
if isinstance(packages, str):
|
||||
data = packages
|
||||
else:
|
||||
data = yaml.safe_dump(packages)
|
||||
write_file(data, project_root, "packages.yml")
|
||||
|
||||
|
||||
# Fixture to provide selectors as either yaml or dictionary
|
||||
@pytest.fixture(scope="class")
|
||||
def selectors():
|
||||
return {}
|
||||
|
||||
|
||||
# Write out the selectors.yml file
|
||||
@pytest.fixture(scope="class")
|
||||
def selectors_yml(project_root, selectors):
|
||||
if selectors:
|
||||
if isinstance(selectors, str):
|
||||
data = selectors
|
||||
else:
|
||||
data = yaml.safe_dump(selectors)
|
||||
write_file(data, project_root, "selectors.yml")
|
||||
|
||||
|
||||
# This creates an adapter that is used for running test setup, such as creating
|
||||
# the test schema, and sql commands that are run in tests prior to the first
|
||||
# dbt command. After a dbt command is run, the project.adapter property will
|
||||
# return the current adapter (for this adapter type) from the adapter factory.
|
||||
# The adapter produced by this fixture will contain the "base" macros (not including
|
||||
# macros from dependencies).
|
||||
#
|
||||
# Anything used here must be actually working (dbt_project, profile, project and internal macros),
|
||||
# otherwise this will fail. So to test errors in those areas, you need to copy the files
|
||||
# into the project in the tests instead of putting them in the fixtures.
|
||||
@pytest.fixture(scope="class")
|
||||
def adapter(unique_schema, project_root, profiles_root, profiles_yml, dbt_project_yml):
|
||||
# The profiles.yml and dbt_project.yml should already be written out
|
||||
args = Namespace(
|
||||
profiles_dir=str(profiles_root), project_dir=str(project_root), target=None, profile=None
|
||||
)
|
||||
flags.set_from_args(args, {})
|
||||
runtime_config = RuntimeConfig.from_args(args)
|
||||
register_adapter(runtime_config)
|
||||
adapter = get_adapter(runtime_config)
|
||||
# We only need the base macros, not macros from dependencies, and don't want
|
||||
# to run 'dbt deps' here.
|
||||
adapter.load_macro_manifest(base_macros_only=True)
|
||||
yield adapter
|
||||
adapter.cleanup_connections()
|
||||
reset_adapters()
|
||||
|
||||
|
||||
# Start at directory level.
|
||||
def write_project_files(project_root, dir_name, file_dict):
|
||||
path = project_root.mkdir(dir_name)
|
||||
if file_dict:
|
||||
write_project_files_recursively(path, file_dict)
|
||||
|
||||
|
||||
# Write files out from file_dict. Can be nested directories...
|
||||
def write_project_files_recursively(path, file_dict):
|
||||
if type(file_dict) is not dict:
|
||||
raise TestProcessingException(f"Error creating {path}. Did you forget the file extension?")
|
||||
for name, value in file_dict.items():
|
||||
if name.endswith(".sql") or name.endswith(".csv") or name.endswith(".md"):
|
||||
write_file(value, path, name)
|
||||
elif name.endswith(".yml") or name.endswith(".yaml"):
|
||||
if isinstance(value, str):
|
||||
data = value
|
||||
else:
|
||||
data = yaml.safe_dump(value)
|
||||
write_file(data, path, name)
|
||||
else:
|
||||
write_project_files_recursively(path.mkdir(name), value)
|
||||
|
||||
|
||||
# models, macros, seeds, snapshots, tests, analysis
|
||||
# Provide a dictionary of file names to contents. Nested directories
|
||||
# are handled by nested dictionaries.
|
||||
|
||||
# models directory
|
||||
@pytest.fixture(scope="class")
|
||||
def models():
|
||||
return {}
|
||||
|
||||
|
||||
# macros directory
|
||||
@pytest.fixture(scope="class")
|
||||
def macros():
|
||||
return {}
|
||||
|
||||
|
||||
# seeds directory
|
||||
@pytest.fixture(scope="class")
|
||||
def seeds():
|
||||
return {}
|
||||
|
||||
|
||||
# snapshots directory
|
||||
@pytest.fixture(scope="class")
|
||||
def snapshots():
|
||||
return {}
|
||||
|
||||
|
||||
# tests directory
|
||||
@pytest.fixture(scope="class")
|
||||
def tests():
|
||||
return {}
|
||||
|
||||
|
||||
# analysis directory
|
||||
@pytest.fixture(scope="class")
|
||||
def analysis():
|
||||
return {}
|
||||
|
||||
|
||||
# Write out the files provided by models, macros, snapshots, seeds, tests, analysis
|
||||
@pytest.fixture(scope="class")
|
||||
def project_files(project_root, models, macros, snapshots, seeds, tests, analysis):
|
||||
write_project_files(project_root, "models", models)
|
||||
write_project_files(project_root, "macros", macros)
|
||||
write_project_files(project_root, "snapshots", snapshots)
|
||||
write_project_files(project_root, "seeds", seeds)
|
||||
write_project_files(project_root, "tests", tests)
|
||||
write_project_files(project_root, "analysis", analysis)
|
||||
|
||||
|
||||
# We have a separate logs dir for every test
|
||||
@pytest.fixture(scope="class")
|
||||
def logs_dir(request, prefix):
|
||||
return os.path.join(request.config.rootdir, "logs", prefix)
|
||||
|
||||
|
||||
# This fixture is for customizing tests that need overrides in adapter
|
||||
# repos. Example in dbt.tests.adapter.basic.test_base.
|
||||
@pytest.fixture(scope="class")
|
||||
def test_config():
|
||||
return {}
|
||||
|
||||
|
||||
# This class is returned from the 'project' fixture, and contains information
|
||||
# from the pytest fixtures that may be needed in the test functions, including
|
||||
# a 'run_sql' method.
|
||||
class TestProjInfo:
|
||||
def __init__(
|
||||
self,
|
||||
project_root,
|
||||
profiles_dir,
|
||||
adapter_type,
|
||||
test_dir,
|
||||
shared_data_dir,
|
||||
test_data_dir,
|
||||
test_schema,
|
||||
database,
|
||||
test_config,
|
||||
):
|
||||
self.project_root = project_root
|
||||
self.profiles_dir = profiles_dir
|
||||
self.adapter_type = adapter_type
|
||||
self.test_dir = test_dir
|
||||
self.shared_data_dir = shared_data_dir
|
||||
self.test_data_dir = test_data_dir
|
||||
self.test_schema = test_schema
|
||||
self.database = database
|
||||
self.test_config = test_config
|
||||
|
||||
@property
|
||||
def adapter(self):
|
||||
# This returns the last created "adapter" from the adapter factory. Each
|
||||
# dbt command will create a new one. This allows us to avoid patching the
|
||||
# providers 'get_adapter' function.
|
||||
return get_adapter_by_type(self.adapter_type)
|
||||
|
||||
# Run sql from a path
|
||||
def run_sql_file(self, sql_path, fetch=None):
|
||||
with open(sql_path, "r") as f:
|
||||
statements = f.read().split(";")
|
||||
for statement in statements:
|
||||
self.run_sql(statement, fetch)
|
||||
|
||||
# Run sql from a string, using adapter saved at test startup
|
||||
def run_sql(self, sql, fetch=None):
|
||||
return run_sql_with_adapter(self.adapter, sql, fetch=fetch)
|
||||
|
||||
# Create the unique test schema. Used in test setup, so that we're
|
||||
# ready for initial sql prior to a run_dbt command.
|
||||
def create_test_schema(self):
|
||||
with get_connection(self.adapter):
|
||||
relation = self.adapter.Relation.create(
|
||||
database=self.database, schema=self.test_schema
|
||||
)
|
||||
self.adapter.create_schema(relation)
|
||||
|
||||
# Drop the unique test schema, usually called in test cleanup
|
||||
def drop_test_schema(self):
|
||||
with get_connection(self.adapter):
|
||||
relation = self.adapter.Relation.create(
|
||||
database=self.database, schema=self.test_schema
|
||||
)
|
||||
self.adapter.drop_schema(relation)
|
||||
|
||||
# This returns a dictionary of table names to 'view' or 'table' values.
|
||||
def get_tables_in_schema(self):
|
||||
sql = """
|
||||
select table_name,
|
||||
case when table_type = 'BASE TABLE' then 'table'
|
||||
when table_type = 'VIEW' then 'view'
|
||||
else table_type
|
||||
end as materialization
|
||||
from information_schema.tables
|
||||
where {}
|
||||
order by table_name
|
||||
"""
|
||||
sql = sql.format("{} ilike '{}'".format("table_schema", self.test_schema))
|
||||
result = self.run_sql(sql, fetch="all")
|
||||
return {model_name: materialization for (model_name, materialization) in result}
|
||||
|
||||
|
||||
# This is the main fixture that is used in all functional tests. It pulls in the other
|
||||
# fixtures that are necessary to set up a dbt project, and saves some of the information
|
||||
# in a TestProjInfo class, which it returns, so that individual test cases do not have
|
||||
# to pull in the other fixtures individually to access their information.
|
||||
@pytest.fixture(scope="class")
|
||||
def project(
|
||||
project_root,
|
||||
profiles_root,
|
||||
request,
|
||||
unique_schema,
|
||||
profiles_yml,
|
||||
dbt_project_yml,
|
||||
packages_yml,
|
||||
selectors_yml,
|
||||
adapter,
|
||||
project_files,
|
||||
shared_data_dir,
|
||||
test_data_dir,
|
||||
logs_dir,
|
||||
test_config,
|
||||
):
|
||||
# Logbook warnings are ignored so we don't have to fork logbook to support python 3.10.
|
||||
# This _only_ works for tests in `tests/` that use the project fixture.
|
||||
warnings.filterwarnings("ignore", category=DeprecationWarning, module="logbook")
|
||||
setup_event_logger(logs_dir)
|
||||
orig_cwd = os.getcwd()
|
||||
os.chdir(project_root)
|
||||
# Return whatever is needed later in tests but can only come from fixtures, so we can keep
|
||||
# the signatures in the test signature to a minimum.
|
||||
project = TestProjInfo(
|
||||
project_root=project_root,
|
||||
profiles_dir=profiles_root,
|
||||
adapter_type=adapter.type(),
|
||||
test_dir=request.fspath.dirname,
|
||||
shared_data_dir=shared_data_dir,
|
||||
test_data_dir=test_data_dir,
|
||||
test_schema=unique_schema,
|
||||
database=adapter.config.credentials.database,
|
||||
test_config=test_config,
|
||||
)
|
||||
project.drop_test_schema()
|
||||
project.create_test_schema()
|
||||
|
||||
yield project
|
||||
|
||||
project.drop_test_schema()
|
||||
os.chdir(orig_cwd)
|
||||
432 core/dbt/tests/util.py (new file)
@@ -0,0 +1,432 @@
import os
|
||||
import shutil
|
||||
import yaml
|
||||
import json
|
||||
import warnings
|
||||
from typing import List
|
||||
from contextlib import contextmanager
|
||||
|
||||
from dbt.main import handle_and_check
|
||||
from dbt.logger import log_manager
|
||||
from dbt.contracts.graph.manifest import Manifest
|
||||
from dbt.events.functions import fire_event, capture_stdout_logs, stop_capture_stdout_logs
|
||||
from dbt.events.test_types import IntegrationTestDebug
|
||||
|
||||
# =============================================================================
|
||||
# Test utilities
|
||||
# run_dbt
|
||||
# run_dbt_and_capture
|
||||
# get_manifest
|
||||
# copy_file
|
||||
# rm_file
|
||||
# write_file
|
||||
# read_file
|
||||
# get_artifact
|
||||
# update_config_file
|
||||
# write_config_file
|
||||
# get_unique_ids_in_results
|
||||
# check_result_nodes_by_name
|
||||
# check_result_nodes_by_unique_id
|
||||
|
||||
# SQL related utilities that use the adapter
|
||||
# run_sql_with_adapter
|
||||
# relation_from_name
|
||||
# check_relation_types (table/view)
|
||||
# check_relations_equal
|
||||
# check_relations_equal_with_relations
|
||||
# check_table_does_exist
|
||||
# check_table_does_not_exist
|
||||
# get_relation_columns
|
||||
# update_rows
|
||||
# generate_update_clause
|
||||
|
||||
# =============================================================================
|
||||
|
||||
|
||||
# 'run_dbt' is used in pytest tests to run dbt commands. It will return
|
||||
# different objects depending on the command that is executed.
|
||||
# For a run command (and most other commands) it will return a list
|
||||
# of results. For the 'docs generate' command it returns a CatalogArtifact.
|
||||
# The first parameter is a list of dbt command line arguments, such as
|
||||
# run_dbt(["run", "--vars", "seed_name: base"])
|
||||
# If the command is expected to fail, pass in "expect_pass=False"):
|
||||
# run_dbt("test"], expect_pass=False)
|
||||
def run_dbt(args: List[str] = None, expect_pass=True):
|
||||
# Ignore logbook warnings
|
||||
warnings.filterwarnings("ignore", category=DeprecationWarning, module="logbook")
|
||||
|
||||
# The logger will complain about already being initialized if
|
||||
# we don't do this.
|
||||
log_manager.reset_handlers()
|
||||
if args is None:
|
||||
args = ["run"]
|
||||
|
||||
print("\n\nInvoking dbt with {}".format(args))
|
||||
res, success = handle_and_check(args)
|
||||
assert success == expect_pass, "dbt exit state did not match expected"
|
||||
return res
|
||||
|
||||
|
||||
# Use this if you need to capture the command logs in a test
|
||||
def run_dbt_and_capture(args: List[str] = None, expect_pass=True):
|
||||
try:
|
||||
stringbuf = capture_stdout_logs()
|
||||
res = run_dbt(args, expect_pass=expect_pass)
|
||||
stdout = stringbuf.getvalue()
|
||||
|
||||
finally:
|
||||
stop_capture_stdout_logs()
|
||||
|
||||
return res, stdout
|
||||
|
||||
|
||||
# Used in test cases to get the manifest from the partial parsing file
|
||||
# Note: this uses an internal version of the manifest, and in the future
|
||||
# parts of it will not be supported for external use.
|
||||
def get_manifest(project_root):
|
||||
path = os.path.join(project_root, "target", "partial_parse.msgpack")
|
||||
if os.path.exists(path):
|
||||
with open(path, "rb") as fp:
|
||||
manifest_mp = fp.read()
|
||||
manifest: Manifest = Manifest.from_msgpack(manifest_mp)
|
||||
return manifest
|
||||
else:
|
||||
return None
|
||||
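For example (a hypothetical assertion; 'test' is the project name written out by the dbt_project_yml fixture), a test can inspect the parsed nodes:

manifest = get_manifest(project.project_root)
assert manifest is not None
assert "model.test.my_model" in manifest.nodes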
|
||||
|
||||
# Used in tests to copy a file, usually from a data directory to the project directory
|
||||
def copy_file(src_path, src, dest_path, dest) -> None:
|
||||
# dest is a list, so that we can provide nested directories, like 'models' etc.
|
||||
# copy files from the data_dir to appropriate project directory
|
||||
shutil.copyfile(
|
||||
os.path.join(src_path, src),
|
||||
os.path.join(dest_path, *dest),
|
||||
)
|
||||
|
||||
|
||||
# Used in tests when you want to remove a file from the project directory
|
||||
def rm_file(*paths) -> None:
|
||||
# remove files from proj_path
|
||||
os.remove(os.path.join(*paths))
|
||||
|
||||
|
||||
# Used in tests to write out the string contents of a file to a
|
||||
# file in the project directory.
|
||||
# We need to explicitly use encoding="utf-8" because otherwise on
|
||||
# Windows we'll get codepage 1252 and things might break
|
||||
def write_file(contents, *paths):
|
||||
with open(os.path.join(*paths), "w", encoding="utf-8") as fp:
|
||||
fp.write(contents)
|
||||
|
||||
|
||||
# Used in test utilities
|
||||
def read_file(*paths):
|
||||
contents = ""
|
||||
with open(os.path.join(*paths), "r") as fp:
|
||||
contents = fp.read()
|
||||
return contents
|
||||
|
||||
|
||||
# Get an artifact (usually from the target directory) such as
|
||||
# manifest.json or catalog.json to use in a test
|
||||
def get_artifact(*paths):
|
||||
contents = read_file(*paths)
|
||||
dct = json.loads(contents)
|
||||
return dct
|
||||
|
||||
|
||||
# For updating yaml config files
|
||||
def update_config_file(updates, *paths):
|
||||
current_yaml = read_file(*paths)
|
||||
config = yaml.safe_load(current_yaml)
|
||||
config.update(updates)
|
||||
new_yaml = yaml.safe_dump(config)
|
||||
write_file(new_yaml, *paths)
|
||||
|
||||
|
||||
# Write new config file
|
||||
def write_config_file(data, *paths):
|
||||
if type(data) is dict:
|
||||
data = yaml.safe_dump(data)
|
||||
write_file(data, *paths)
|
||||
|
||||
|
||||
# Get the unique_ids in dbt command results
|
||||
def get_unique_ids_in_results(results):
|
||||
unique_ids = []
|
||||
for result in results:
|
||||
unique_ids.append(result.node.unique_id)
|
||||
return unique_ids
|
||||
|
||||
|
||||
# Check the nodes in the results returned by a dbt run command
|
||||
def check_result_nodes_by_name(results, names):
|
||||
result_names = []
|
||||
for result in results:
|
||||
result_names.append(result.node.name)
|
||||
assert set(names) == set(result_names)
|
||||
|
||||
|
||||
# Check the nodes in the results returned by a dbt run command
|
||||
def check_result_nodes_by_unique_id(results, unique_ids):
|
||||
result_unique_ids = []
|
||||
for result in results:
|
||||
result_unique_ids.append(result.node.unique_id)
|
||||
assert set(unique_ids) == set(result_unique_ids)
|
||||
|
||||
|
||||
class TestProcessingException(Exception):
|
||||
pass
|
||||
|
||||
|
||||
# Testing utilities that use adapter code
|
||||
|
||||
# Uses:
|
||||
# adapter.config.credentials
|
||||
# adapter.quote
|
||||
# adapter.run_sql_for_tests
|
||||
def run_sql_with_adapter(adapter, sql, fetch=None):
|
||||
if sql.strip() == "":
|
||||
return
|
||||
# substitute schema and database in sql
|
||||
kwargs = {
|
||||
"schema": adapter.config.credentials.schema,
|
||||
"database": adapter.quote(adapter.config.credentials.database),
|
||||
}
|
||||
sql = sql.format(**kwargs)
|
||||
|
||||
msg = f'test connection "__test" executing: {sql}'
|
||||
fire_event(IntegrationTestDebug(msg=msg))
|
||||
with get_connection(adapter) as conn:
|
||||
return adapter.run_sql_for_tests(sql, fetch, conn)
|
||||
|
||||
|
||||
# Get a Relation object from the identifier (name of table/view).
|
||||
# Uses the default database and schema. If you need a relation
|
||||
# with a different schema, it should be constructed in the test.
|
||||
# Uses:
|
||||
# adapter.Relation
|
||||
# adapter.config.credentials
|
||||
# Relation.get_default_quote_policy
|
||||
# Relation.get_default_include_policy
|
||||
def relation_from_name(adapter, name: str):
|
||||
"""reverse-engineer a relation from a given name and
|
||||
the adapter. The relation name is split by the '.' character.
|
||||
"""
|
||||
|
||||
# Different adapters have different Relation classes
|
||||
cls = adapter.Relation
|
||||
credentials = adapter.config.credentials
|
||||
quote_policy = cls.get_default_quote_policy().to_dict()
|
||||
include_policy = cls.get_default_include_policy().to_dict()
|
||||
|
||||
# Make sure we have database/schema/identifier parts, even if
|
||||
# only identifier was supplied.
|
||||
relation_parts = name.split(".")
|
||||
if len(relation_parts) == 1:
|
||||
relation_parts.insert(0, credentials.schema)
|
||||
if len(relation_parts) == 2:
|
||||
relation_parts.insert(0, credentials.database)
|
||||
kwargs = {
|
||||
"database": relation_parts[0],
|
||||
"schema": relation_parts[1],
|
||||
"identifier": relation_parts[2],
|
||||
}
|
||||
|
||||
relation = cls.create(
|
||||
include_policy=include_policy,
|
||||
quote_policy=quote_policy,
|
||||
**kwargs,
|
||||
)
|
||||
return relation
|
||||
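A typical use (hypothetical relation name) is to build a relation for ad-hoc SQL in a test:

relation = relation_from_name(project.adapter, "base")
result = project.run_sql(f"select count(*) from {relation}", fetch="one")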
|
||||
|
||||
# Ensure that models with different materializations have the
|
||||
# correct table/view.
|
||||
# Uses:
|
||||
# adapter.list_relations_without_caching
|
||||
def check_relation_types(adapter, relation_to_type):
|
||||
"""
|
||||
Relation name to table/view
|
||||
{
|
||||
"base": "table",
|
||||
"other": "view",
|
||||
}
|
||||
"""
|
||||
|
||||
expected_relation_values = {}
|
||||
found_relations = []
|
||||
schemas = set()
|
||||
|
||||
for key, value in relation_to_type.items():
|
||||
relation = relation_from_name(adapter, key)
|
||||
expected_relation_values[relation] = value
|
||||
schemas.add(relation.without_identifier())
|
||||
|
||||
with get_connection(adapter):
|
||||
for schema in schemas:
|
||||
found_relations.extend(adapter.list_relations_without_caching(schema))
|
||||
|
||||
for key, value in relation_to_type.items():
|
||||
for relation in found_relations:
|
||||
# this might be too broad
|
||||
if relation.identifier == key:
|
||||
assert relation.type == value, (
|
||||
f"Got an unexpected relation type of {relation.type} "
|
||||
f"for relation {key}, expected {value}"
|
||||
)
|
||||
|
||||
|
||||
# Replaces assertTablesEqual. assertManyTablesEqual can be replaced
|
||||
# by doing a separate call for each set of tables/relations.
|
||||
# Wraps check_relations_equal_with_relations by creating relations
|
||||
# from the list of names passed in.
|
||||
def check_relations_equal(adapter, relation_names, compare_snapshot_cols=False):
|
||||
if len(relation_names) < 2:
|
||||
raise TestProcessingException(
|
||||
"Not enough relations to compare",
|
||||
)
|
||||
relations = [relation_from_name(adapter, name) for name in relation_names]
|
||||
return check_relations_equal_with_relations(
|
||||
adapter, relations, compare_snapshot_cols=compare_snapshot_cols
|
||||
)
|
||||
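A typical call (hypothetical relation names) compares a seed against the model built from it:

check_relations_equal(project.adapter, ["base", "view_model"])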
|
||||
|
||||
# This can be used when checking relations in different schemas, by supplying
|
||||
# a list of relations. Called by 'check_relations_equal'.
|
||||
# Uses:
|
||||
# adapter.get_columns_in_relation
|
||||
# adapter.get_rows_different_sql
|
||||
# adapter.execute
|
||||
def check_relations_equal_with_relations(adapter, relations, compare_snapshot_cols=False):
|
||||
|
||||
with get_connection(adapter):
|
||||
basis, compares = relations[0], relations[1:]
|
||||
# Skip columns starting with "dbt_" because we don't want to
|
||||
# compare those, since they are time sensitive
|
||||
# (unless comparing "dbt_" snapshot columns is explicitly enabled)
|
||||
column_names = [
|
||||
c.name
|
||||
for c in adapter.get_columns_in_relation(basis)
|
||||
if not c.name.lower().startswith("dbt_") or compare_snapshot_cols
|
||||
]
|
||||
|
||||
for relation in compares:
|
||||
sql = adapter.get_rows_different_sql(basis, relation, column_names=column_names)
|
||||
_, tbl = adapter.execute(sql, fetch=True)
|
||||
num_rows = len(tbl)
|
||||
assert (
|
||||
num_rows == 1
|
||||
), f"Invalid sql query from get_rows_different_sql: incorrect number of rows ({num_rows})"
|
||||
num_cols = len(tbl[0])
|
||||
assert (
|
||||
num_cols == 2
|
||||
), f"Invalid sql query from get_rows_different_sql: incorrect number of cols ({num_cols})"
|
||||
row_count_difference = tbl[0][0]
|
||||
assert (
|
||||
row_count_difference == 0
|
||||
), f"Got {row_count_difference} difference in row count betwen {basis} and {relation}"
|
||||
rows_mismatched = tbl[0][1]
|
||||
assert (
|
||||
rows_mismatched == 0
|
||||
), f"Got {rows_mismatched} different rows between {basis} and {relation}"
|
||||
|
||||
|
||||
# Uses:
|
||||
# adapter.update_column_sql
|
||||
# adapter.execute
|
||||
# adapter.commit_if_has_connection
|
||||
def update_rows(adapter, update_rows_config):
|
||||
"""
|
||||
{
|
||||
"name": "base",
|
||||
"dst_col": "some_date"
|
||||
"clause": {
|
||||
"type": "add_timestamp",
|
||||
"src_col": "some_date",
|
||||
"where" "id > 10"
|
||||
}
|
||||
"""
|
||||
for key in ["name", "dst_col", "clause"]:
|
||||
if key not in update_rows_config:
|
||||
raise TestProcessingException(f"Invalid update_rows: no {key}")
|
||||
|
||||
clause = update_rows_config["clause"]
|
||||
clause = generate_update_clause(adapter, clause)
|
||||
|
||||
where = None
|
||||
if "where" in update_rows_config:
|
||||
where = update_rows_config["where"]
|
||||
|
||||
name = update_rows_config["name"]
|
||||
dst_col = update_rows_config["dst_col"]
|
||||
relation = relation_from_name(adapter, name)
|
||||
|
||||
with get_connection(adapter):
|
||||
sql = adapter.update_column_sql(
|
||||
dst_name=str(relation),
|
||||
dst_column=dst_col,
|
||||
clause=clause,
|
||||
where_clause=where,
|
||||
)
|
||||
adapter.execute(sql, auto_begin=True)
|
||||
adapter.commit_if_has_connection()
|
||||
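A hypothetical invocation matching the documented shape (note that the optional 'where' filter is read from the top level of the config, not from the clause):

update_rows(
    project.adapter,
    {
        "name": "base",
        "dst_col": "some_date",
        "where": "id > 10",
        "clause": {"type": "add_timestamp", "src_col": "some_date"},
    },
)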
|
||||
|
||||
# This is called by the 'update_rows' function.
|
||||
# Uses:
|
||||
# adapter.timestamp_add_sql
|
||||
# adapter.string_add_sql
|
||||
def generate_update_clause(adapter, clause) -> str:
|
||||
"""
|
||||
Called by update_rows function. Expects the "clause" dictionary
|
||||
documented in 'update_rows'.
|
||||
"""
|
||||
|
||||
if "type" not in clause or clause["type"] not in ["add_timestamp", "add_string"]:
|
||||
raise TestProcessingException("invalid update_rows clause: type missing or incorrect")
|
||||
clause_type = clause["type"]
|
||||
|
||||
if clause_type == "add_timestamp":
|
||||
if "src_col" not in clause:
|
||||
raise TestProcessingException("Invalid update_rows clause: no src_col")
|
||||
add_to = clause["src_col"]
|
||||
kwargs = {k: v for k, v in clause.items() if k in ("interval", "number")}
|
||||
with get_connection(adapter):
|
||||
return adapter.timestamp_add_sql(add_to=add_to, **kwargs)
|
||||
elif clause_type == "add_string":
|
||||
for key in ["src_col", "value"]:
|
||||
if key not in clause:
|
||||
raise TestProcessingException(f"Invalid update_rows clause: no {key}")
|
||||
src_col = clause["src_col"]
|
||||
value = clause["value"]
|
||||
location = clause.get("location", "append")
|
||||
with get_connection(adapter):
|
||||
return adapter.string_add_sql(src_col, value, location)
|
||||
return ""
|
||||
|
||||
|
||||
@contextmanager
|
||||
def get_connection(adapter, name="_test"):
|
||||
with adapter.connection_named(name):
|
||||
conn = adapter.connections.get_thread_connection()
|
||||
yield conn
|
||||
|
||||
|
||||
# Uses:
|
||||
# adapter.get_columns_in_relation
|
||||
def get_relation_columns(adapter, name):
|
||||
relation = relation_from_name(adapter, name)
|
||||
with get_connection(adapter):
|
||||
columns = adapter.get_columns_in_relation(relation)
|
||||
return sorted(((c.name, c.dtype, c.char_size) for c in columns), key=lambda x: x[0])
|
||||
|
||||
|
||||
def check_table_does_not_exist(adapter, name):
|
||||
columns = get_relation_columns(adapter, name)
|
||||
assert len(columns) == 0
|
||||
|
||||
|
||||
def check_table_does_exist(adapter, name):
|
||||
columns = get_relation_columns(adapter, name)
|
||||
assert len(columns) > 0
|
||||
@@ -44,6 +44,7 @@ LOAD_ALL_TIMING_SPEC = "iglu:com.dbt/load_all_timing/jsonschema/1-0-3"
RESOURCE_COUNTS = "iglu:com.dbt/resource_counts/jsonschema/1-0-0"
EXPERIMENTAL_PARSER = "iglu:com.dbt/experimental_parser/jsonschema/1-0-0"
PARTIAL_PARSER = "iglu:com.dbt/partial_parser/jsonschema/1-0-1"
RUNNABLE_TIMING = "iglu:com.dbt/runnable/jsonschema/1-0-0"
DBT_INVOCATION_ENV = "DBT_INVOCATION_ENV"


@@ -412,6 +413,18 @@ def track_partial_parser(options):
)


def track_runnable_timing(options):
context = [SelfDescribingJson(RUNNABLE_TIMING, options)]
assert active_user is not None, "Cannot track runnable info when active user is None"
track(
active_user,
category="dbt",
action="runnable_timing",
label=get_invocation_id(),
context=context,
)


def flush():
fire_event(FlushEvents())
try:
@@ -17,7 +17,7 @@ from pathlib import PosixPath, WindowsPath
from contextlib import contextmanager
from dbt.exceptions import ConnectionException
from dbt.events.functions import fire_event
from dbt.events.types import RetryExternalCall
from dbt.events.types import RetryExternalCall, RecordRetryException
from dbt import flags
from enum import Enum
from typing_extensions import Protocol
@@ -211,7 +211,7 @@ def deep_map_render(func: Callable[[Any, Tuple[Union[str, int], ...]], Any], val

It maps the function func() onto each non-container value in 'value'
recursively, returning a new value. As long as func does not manipulate
value, then deep_map_render will also not manipulate it.
the value, then deep_map_render will also not manipulate it.

value should be a value returned by `yaml.safe_load` or `json.load` - the
only expected types are list, dict, native python number, str, NoneType,
@@ -319,7 +319,7 @@ def timestring() -> str:

class JSONEncoder(json.JSONEncoder):
"""A 'custom' json encoder that does normal json encoder things, but also
handles `Decimal`s. and `Undefined`s. Decimals can lose precision because
handles `Decimal`s and `Undefined`s. Decimals can lose precision because
they get converted to floats. Undefined's are serialized to an empty string
"""

@@ -394,7 +394,7 @@ def translate_aliases(

If recurse is True, perform this operation recursively.

:return: A dict containing all the values in kwargs referenced by their
:returns: A dict containing all the values in kwargs referenced by their
canonical key.
:raises: `AliasException`, if a canonical key is defined more than once.
"""
@@ -600,22 +600,22 @@ class MultiDict(Mapping[str, Any]):

def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
"""Attempts to run a function that makes an external call, if the call fails
on a connection error, timeout or decompression issue, it will be tried up to 5 more times.
See https://github.com/dbt-labs/dbt-core/issues/4579 for context on this decompression issues
specifically.
on a Requests exception or decompression issue (ReadError), it will be tried
up to 5 more times. All exceptions that Requests explicitly raises inherit from
requests.exceptions.RequestException. See https://github.com/dbt-labs/dbt-core/issues/4579
for context on this decompression issues specifically.
"""
try:
return fn()
except (
requests.exceptions.ConnectionError,
requests.exceptions.Timeout,
requests.exceptions.ContentDecodingError,
requests.exceptions.RequestException,
ReadError,
) as exc:
if attempt <= max_attempts - 1:
fire_event(RecordRetryException(exc=exc))
fire_event(RetryExternalCall(attempt=attempt, max=max_attempts))
time.sleep(1)
_connection_exception_retry(fn, max_attempts, attempt + 1)
return _connection_exception_retry(fn, max_attempts, attempt + 1)
else:
raise ConnectionException("External connection exception occurred: " + str(exc))
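A sketch of how a caller might wrap an external call with the retry helper (the URL and callback here are assumptions for illustration, not taken from this diff):

def fetch_registry_index():
    # any callable that raises a requests exception on failure will be retried
    response = requests.get("https://hub.getdbt.com/api/v1/index.json", timeout=30)
    response.raise_for_status()
    return response.json()

index = _connection_exception_retry(fetch_registry_index, max_attempts=5)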
@@ -3,7 +3,7 @@ import importlib.util
|
||||
import os
|
||||
import glob
|
||||
import json
|
||||
from typing import Iterator
|
||||
from typing import Iterator, List, Optional, Tuple
|
||||
|
||||
import requests
|
||||
|
||||
@@ -16,7 +16,34 @@ from dbt import flags
|
||||
PYPI_VERSION_URL = "https://pypi.org/pypi/dbt-core/json"
|
||||
|
||||
|
||||
def get_latest_version(version_url: str = PYPI_VERSION_URL):
|
||||
def get_version_information() -> str:
|
||||
flags.USE_COLORS = True if not flags.USE_COLORS else None
|
||||
|
||||
installed = get_installed_version()
|
||||
latest = get_latest_version()
|
||||
|
||||
core_msg_lines, core_info_msg = _get_core_msg_lines(installed, latest)
|
||||
core_msg = _format_core_msg(core_msg_lines)
|
||||
plugin_version_msg = _get_plugins_msg(installed)
|
||||
|
||||
msg_lines = [core_msg]
|
||||
|
||||
if core_info_msg != "":
|
||||
msg_lines.append(core_info_msg)
|
||||
|
||||
msg_lines.append(plugin_version_msg)
|
||||
msg_lines.append("")
|
||||
|
||||
return "\n\n".join(msg_lines)
|
||||
|
||||
|
||||
def get_installed_version() -> dbt.semver.VersionSpecifier:
|
||||
return dbt.semver.VersionSpecifier.from_version_string(__version__)
|
||||
|
||||
|
||||
def get_latest_version(
|
||||
version_url: str = PYPI_VERSION_URL,
|
||||
) -> Optional[dbt.semver.VersionSpecifier]:
|
||||
try:
|
||||
resp = requests.get(version_url)
|
||||
data = resp.json()
|
||||
@@ -27,81 +54,168 @@ def get_latest_version(version_url: str = PYPI_VERSION_URL):
|
||||
return dbt.semver.VersionSpecifier.from_version_string(version_string)
|
||||
|
||||
|
||||
def get_installed_version():
|
||||
return dbt.semver.VersionSpecifier.from_version_string(__version__)
|
||||
def _get_core_msg_lines(installed, latest) -> Tuple[List[List[str]], str]:
|
||||
installed_s = installed.to_version_string(skip_matcher=True)
|
||||
installed_line = ["installed", installed_s, ""]
|
||||
update_info = ""
|
||||
|
||||
if latest is None:
|
||||
update_info = (
|
||||
" The latest version of dbt-core could not be determined!\n"
|
||||
" Make sure that the following URL is accessible:\n"
|
||||
f" {PYPI_VERSION_URL}"
|
||||
)
|
||||
return [installed_line], update_info
|
||||
|
||||
latest_s = latest.to_version_string(skip_matcher=True)
|
||||
latest_line = ["latest", latest_s, green("Up to date!")]
|
||||
|
||||
if installed > latest:
|
||||
latest_line[2] = green("Ahead of latest version!")
|
||||
elif installed < latest:
|
||||
latest_line[2] = yellow("Update available!")
|
||||
update_info = (
|
||||
" Your version of dbt-core is out of date!\n"
|
||||
" You can find instructions for upgrading here:\n"
|
||||
" https://docs.getdbt.com/docs/installation"
|
||||
)
|
||||
|
||||
return [
|
||||
installed_line,
|
||||
latest_line,
|
||||
], update_info
|
||||
|
||||
|
||||
def _format_core_msg(lines: List[List[str]]) -> str:
|
||||
msg = "Core:\n"
|
||||
msg_lines = []
|
||||
|
||||
for name, version, update_msg in _pad_lines(lines, seperator=":"):
|
||||
line_msg = f" - {name} {version}"
|
||||
if update_msg != "":
|
||||
line_msg += f" - {update_msg}"
|
||||
msg_lines.append(line_msg)
|
||||
|
||||
return msg + "\n".join(msg_lines)
|
||||
|
||||
|
||||
def _get_plugins_msg(installed: dbt.semver.VersionSpecifier) -> str:
|
||||
msg_lines = ["Plugins:"]
|
||||
|
||||
plugins = []
|
||||
display_update_msg = False
|
||||
for name, version_s in _get_dbt_plugins_info():
|
||||
compatability_msg, needs_update = _get_plugin_msg_info(name, version_s, installed)
|
||||
if needs_update:
|
||||
display_update_msg = True
|
||||
plugins.append([name, version_s, compatability_msg])
|
||||
|
||||
for plugin in _pad_lines(plugins, seperator=":"):
|
||||
msg_lines.append(_format_single_plugin(plugin, ""))
|
||||
|
||||
if display_update_msg:
|
||||
update_msg = (
|
||||
" At least one plugin is out of date or incompatible with dbt-core.\n"
|
||||
" You can find instructions for upgrading here:\n"
|
||||
" https://docs.getdbt.com/docs/installation"
|
||||
)
|
||||
msg_lines += ["", update_msg]
|
||||
|
||||
return "\n".join(msg_lines)
|
||||
|
||||
|
||||
def _get_plugin_msg_info(
|
||||
name: str, version_s: str, core: dbt.semver.VersionSpecifier
|
||||
) -> Tuple[str, bool]:
|
||||
plugin = dbt.semver.VersionSpecifier.from_version_string(version_s)
|
||||
latest_plugin = get_latest_version(version_url=get_package_pypi_url(name))
|
||||
|
||||
needs_update = False
|
||||
|
||||
if plugin.major != core.major or plugin.minor != core.minor:
|
||||
compatibility_msg = red("Not compatible!")
|
||||
needs_update = True
|
||||
return (compatibility_msg, needs_update)
|
||||
|
||||
if not latest_plugin:
|
||||
compatibility_msg = yellow("Could not determine latest version")
|
||||
return (compatibility_msg, needs_update)
|
||||
|
||||
if plugin < latest_plugin:
|
||||
compatibility_msg = yellow("Update available!")
|
||||
needs_update = True
|
||||
elif plugin > latest_plugin:
|
||||
compatibility_msg = green("Ahead of latest version!")
|
||||
else:
|
||||
compatibility_msg = green("Up to date!")
|
||||
|
||||
return (compatibility_msg, needs_update)
|
||||
|
||||
|
||||
def _format_single_plugin(plugin: List[str], update_msg: str) -> str:
|
||||
name, version_s, compatability_msg = plugin
|
||||
msg = f" - {name} {version_s} - {compatability_msg}"
|
||||
if update_msg != "":
|
||||
msg += f"\n{update_msg}\n"
|
||||
return msg
|
||||
|
||||
|
||||
def _pad_lines(lines: List[List[str]], seperator: str = "") -> List[List[str]]:
|
||||
if len(lines) == 0:
|
||||
return []
|
||||
|
||||
# count the max line length for each column in the line
|
||||
counter = [0] * len(lines[0])
|
||||
for line in lines:
|
||||
for i, item in enumerate(line):
|
||||
counter[i] = max(counter[i], len(item))
|
||||
|
||||
result: List[List[str]] = []
|
||||
for i, line in enumerate(lines):
|
||||
|
||||
# add another list to hold padded strings
|
||||
if len(result) == i:
|
||||
result.append([""] * len(line))
|
||||
|
||||
# iterate over columns in the line
|
||||
for j, item in enumerate(line):
|
||||
|
||||
# the last column does not need padding
|
||||
if j == len(line) - 1:
|
||||
result[i][j] = item
|
||||
continue
|
||||
|
||||
# if the following column has no length
|
||||
# the string does not need padding
|
||||
if counter[j + 1] == 0:
|
||||
result[i][j] = item
|
||||
continue
|
||||
|
||||
# only add the seperator to the first column
|
||||
offset = 0
|
||||
if j == 0 and seperator != "":
|
||||
item += seperator
|
||||
offset = len(seperator)
|
||||
|
||||
result[i][j] = item.ljust(counter[j] + offset)
|
||||
|
||||
return result
|
||||
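A small illustration of the padding helper above (input values assumed for the sketch; note the parameter is spelled 'seperator' in the source):

lines = [["installed", "1.1.3", ""], ["latest", "1.1.3", "Up to date!"]]
_pad_lines(lines, seperator=":")
# -> [["installed:", "1.1.3", ""], ["latest:   ", "1.1.3", "Up to date!"]]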
|
||||
|
||||
def get_package_pypi_url(package_name: str) -> str:
|
||||
return f"https://pypi.org/pypi/dbt-{package_name}/json"
|
||||
|
||||
|
||||
def get_version_information():
    flags.USE_COLORS = True if not flags.USE_COLORS else None

    installed = get_installed_version()
    latest = get_latest_version()

    installed_s = installed.to_version_string(skip_matcher=True)
    if latest is None:
        latest_s = "unknown"
    else:
        latest_s = latest.to_version_string(skip_matcher=True)

    version_msg = "installed version: {}\n" " latest version: {}\n\n".format(
        installed_s, latest_s
    )

    plugin_version_msg = "Plugins:\n"
    for plugin_name, version in _get_dbt_plugins_info():
        plugin_version = dbt.semver.VersionSpecifier.from_version_string(version)
        latest_plugin_version = get_latest_version(version_url=get_package_pypi_url(plugin_name))
        plugin_update_msg = ""
        if installed == plugin_version or (
            latest_plugin_version and plugin_version == latest_plugin_version
        ):
            compatibility_msg = green("Up to date!")
        else:
            if latest_plugin_version:
                if installed.major == plugin_version.major:
                    compatibility_msg = yellow("Update available!")
                else:
                    compatibility_msg = red("Out of date!")
                    plugin_update_msg = (
                        " Your version of dbt-{} is out of date! "
                        "You can find instructions for upgrading here:\n"
                        " https://docs.getdbt.com/dbt-cli/install/overview\n\n"
                    ).format(plugin_name)
            else:
                compatibility_msg = yellow("No PYPI version available")

        plugin_version_msg += (" - {}: {} - {}\n" "{}").format(
            plugin_name, version, compatibility_msg, plugin_update_msg
        )

    if latest is None:
        return (
            "{}The latest version of dbt could not be determined!\n"
            "Make sure that the following URL is accessible:\n{}\n\n{}".format(
                version_msg, PYPI_VERSION_URL, plugin_version_msg
            )
        )

    if installed == latest:
        return f"{version_msg}{green('Up to date!')}\n\n{plugin_version_msg}"

    elif installed > latest:
        return "{}Your version of dbt is ahead of the latest " "release!\n\n{}".format(
            version_msg, plugin_version_msg
        )

    else:
        return (
            "{}Your version of dbt is out of date! "
            "You can find instructions for upgrading here:\n"
            "https://docs.getdbt.com/docs/installation\n\n{}".format(
                version_msg, plugin_version_msg
            )
        )


def _get_dbt_plugins_info() -> Iterator[Tuple[str, str]]:
    for plugin_name in _get_adapter_plugin_names():
        if plugin_name == "core":
            continue
        try:
            mod = importlib.import_module(f"dbt.adapters.{plugin_name}.__version__")
        except ImportError:
            # not an adapter
            continue
        yield plugin_name, mod.version  # type: ignore


def _get_adapter_plugin_names() -> Iterator[str]:
@@ -120,17 +234,5 @@ def _get_adapter_plugin_names() -> Iterator[str]:
        yield plugin_name


def _get_dbt_plugins_info():
    for plugin_name in _get_adapter_plugin_names():
        if plugin_name == "core":
            continue
        try:
            mod = importlib.import_module(f"dbt.adapters.{plugin_name}.__version__")
        except ImportError:
            # not an adapter
            continue
        yield plugin_name, mod.version


__version__ = "1.0.1"
__version__ = "1.1.3"
installed = get_installed_version()

@@ -273,12 +273,12 @@ def parse_args(argv=None):
    parser.add_argument("adapter")
    parser.add_argument("--title-case", "-t", default=None)
    parser.add_argument("--dependency", action="append")
    parser.add_argument("--dbt-core-version", default="1.0.1")
    parser.add_argument("--dbt-core-version", default="1.1.3")
    parser.add_argument("--email")
    parser.add_argument("--author")
    parser.add_argument("--url")
    parser.add_argument("--sql", action="store_true")
    parser.add_argument("--package-version", default="1.0.1")
    parser.add_argument("--package-version", default="1.1.3")
    parser.add_argument("--project-version", default="1.0")
    parser.add_argument("--no-dependency", action="store_false", dest="set_dependency")
    parsed = parser.parse_args()

@@ -25,7 +25,7 @@ with open(os.path.join(this_directory, "README.md")) as f:


package_name = "dbt-core"
package_version = "1.0.1"
package_version = "1.1.3"
description = """With dbt, data analysts and engineers can build analytics \
the way engineers build applications."""

@@ -52,20 +52,20 @@ setup(
    ],
    install_requires=[
        "Jinja2==2.11.3",
        "MarkupSafe==2.0.1",
        "MarkupSafe>=0.23,<2.1",
        "agate>=1.6,<1.6.4",
        "click>=7.0,<9",
        "colorama>=0.3.9,<0.4.5",
        "hologram==0.0.14",
        "hologram>=0.0.14,<=0.0.15",
        "isodate>=0.6,<0.7",
        "logbook>=1.5,<1.6",
        "mashumaro==2.9",
        "minimal-snowplow-tracker==0.0.2",
        "networkx>=2.3,<3",
        "networkx>=2.3,<2.8.4",
        "packaging>=20.9,<22.0",
        "sqlparse>=0.2.3,<0.5",
        "dbt-extractor==0.4.0",
        "typing-extensions>=3.7.4,<3.11",
        "dbt-extractor~=0.4.1",
        "typing-extensions>=3.7.4",
        "werkzeug>=1,<3",
        # the following are all to match snowflake-connector-python
        "requests<3.0.0",
@@ -82,6 +82,7 @@ setup(
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
    ],
    python_requires=">=3.7.2",
)

@@ -1,4 +1,4 @@
black==21.12b0
black==22.3.0
bumpversion
flake8
flaky
@@ -8,9 +8,11 @@ mypy==0.782
pip-tools
pre-commit
pytest
pytest-cov
pytest-csv
pytest-dotenv
pytest-logbook
pytest-csv
pytest-mock
pytest-xdist
pytz
tox>=3.13

@@ -9,17 +9,17 @@ ARG build_for=linux/amd64
##
# base image (abstract)
##
FROM --platform=$build_for python:3.9.9-slim-bullseye as base
FROM --platform=$build_for python:3.10.3-slim-bullseye as base

# N.B. The refs updated automagically every release via bumpversion
# N.B. dbt-postgres is currently found in the core codebase so a value of dbt-core@<some_version> is correct

ARG dbt_core_ref=dbt-core@v1.0.1
ARG dbt_postgres_ref=dbt-core@v1.0.1
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
ARG dbt_spark_ref=dbt-spark@v1.0.0
ARG dbt_core_ref=dbt-core@v1.1.3
ARG dbt_postgres_ref=dbt-core@v1.1.3
ARG dbt_redshift_ref=dbt-redshift@v1.1.0
ARG dbt_bigquery_ref=dbt-bigquery@v1.1.0
ARG dbt_snowflake_ref=dbt-snowflake@v1.1.0
ARG dbt_spark_ref=dbt-spark@v1.1.0
# special case args
ARG dbt_spark_version=all
ARG dbt_third_party

@@ -17,6 +17,11 @@ In order to build a new image, run the following docker command.
```
docker build --tag <your_image_name> --target <target_name> <path/to/dockerfile>
```
---
> **Note:** Docker must be configured to use [BuildKit](https://docs.docker.com/develop/develop-images/build_enhancements/) in order for images to build properly!

---

By default the images will be populated with the most recent release of `dbt-core` and whatever database adapter you select. If you need to use a different version you can specify it by git ref using the `--build-arg` flag:
```
docker build --tag <your_image_name> \
@@ -32,7 +37,10 @@ valid arg names for versioning are:
* `dbt_snowflake_ref`
* `dbt_spark_ref`

> Note: Only overide a _single_ build arg for each build. Using multiple overides may lead to a non-functioning image.
---
>**NOTE:** Only override a _single_ build arg for each build. Using multiple overrides may lead to a non-functioning image.

---

If you wish to build an image with a third-party adapter you can use the `dbt-third-party` target. This target requires you provide a path to the adapter that can be processed by `pip` by using the `dbt_third_party` build arg:
```
@@ -101,6 +109,9 @@ docker run \
my-dbt \
ls
```
> Notes:
> * Bind-mount sources _must_ be an absolute path
> * You may need to make adjustments to the docker networking setting depending on the specifics of your data warehouse/database host.
---
**Notes:**
* Bind-mount sources _must_ be an absolute path
* You may need to make adjustments to the docker networking setting depending on the specifics of your data warehouse/database host.

---

@@ -1,2 +1,3 @@
-e ./core
-e ./plugins/postgres
-e ./tests/adapter

@@ -1,18 +1,118 @@
# Performance Regression Testing
This directory includes dbt project setups to test on and a test runner written in Rust which runs specific dbt commands on each of the projects. Orchestration is done via the GitHub Action workflow in `/.github/workflows/performance.yml`. The workflow is scheduled to run every night, but it can also be triggered manually.

The github workflow hardcodes our baseline branch for performance metrics as `0.20.latest`. As future versions become faster, this branch will be updated to hold us to those new standards.
## Attention!

## Adding a new dbt project
Just make a new directory under `performance/projects/`. It will automatically be picked up by the tests.
PLEASE READ THIS README IN THE MAIN BRANCH
The performance runner is always pulled from main regardless of the version being modeled or sampled. If you are not in the main branch, this information may be stale.

## Adding a new dbt command
In `runner/src/measure.rs::measure` add a metric to the `metrics` Vec. The Github Action will handle recompilation if you don't have the rust toolchain installed.
## Description

This test suite samples the performance characteristics of individual commits against performance models for prior releases. Performance is measured in project-command pairs which are assumed to conform to a normal distribution. The sampling and comparison are efficient enough to run against PRs.

This collection of projects and commands should expand over time to reflect user feedback about poorly performing projects, to protect against poor performance in these scenarios in future versions.

Here are all the components of the testing module:

- dbt project setups that are known performance bottlenecks, which you can find in `/performance/projects/`, and a runner written in Rust that runs specific dbt commands on each of the projects.
- Performance characteristics called "baselines" from released dbt versions in `/performance/baselines/`. Each branch will only have the baselines for its ancestors because when we compare samples, we compare against the latest baseline available in the branch.
- A GitHub action for modeling the performance distribution for a new release: `/.github/workflows/model_performance.yml`.
- A GitHub action for sampling performance of dbt at your commit and comparing it against a previous release: `/.github/workflows/sample_performance.yml`.

At this time, the biggest risk in the design of this project is how to account for the natural variation of GitHub Action runs. Typically, performance work is done on dedicated hardware to eliminate this factor. However, there are ways to integrate the variation into observation tools if it can be measured.

## Adding Test Scenarios

A clear process for maintainers and community members to add new performance testing targets will exist after the next stage of the test suite is complete. For details, see #4768.

## Investigating Regressions

If your commit has failed one of the performance regression tests, it does not necessarily mean your commit has a performance regression. However, the observed runtime value was so much slower than the expected value that it was unlikely to be random noise. If it is not due to random noise, this commit contains the code that is causing this performance regression. However, it may not be the commit that introduced that code. That code may have been introduced in an earlier commit that passed due to natural variation in sampling. When investigating a performance regression, start with the failing commit and work your way backwards.

Here's an example of how this could happen:

```
Commit
A <- last release
B
C <- perf regression
D
E
F <- the first failing commit
```
- Commit A is measured to have an expected value for one performance metric of 30 seconds with a standard deviation of 0.5 seconds.
- Commit B doesn't introduce a performance regression and passes the performance regression tests.
- Commit C introduces a performance regression such that the new expected value of the metric is 32 seconds with the standard deviation still at 0.5 seconds, but we don't know this because estimating the whole performance distribution is far too much work to run on every commit. It passes the performance regression test because we happened to sample a value of 31 seconds, which is within our threshold for the original model. It's also only 2 standard deviations away from the actual performance model of commit C, so even though it's not going to be a common situation, it is expected to happen sometimes.
- Commit D samples a value of 31.4 seconds and passes.
- Commit E samples a value of 31.2 seconds and passes.
- Commit F samples a value of 32.9 seconds and fails.

Because these performance regression tests are non-deterministic, it is frequently going to be possible to rerun the test on a failing commit and get it to pass. The more often we do this, the farther down the commit history we will be punting detection.
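
To put rough numbers on how commit C can slip through, here is a small back-of-the-envelope calculation using only the figures from the example above; it is illustrative and is not part of the runner:

```
import math

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    # P(X <= x) for a normal distribution with mean mu and stddev sigma.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Commit A's baseline: mean 30s, stddev 0.5s -> one-sided 3 sigma threshold.
threshold = 30 + 3 * 0.5  # 31.5 seconds

# Commit C's true (but unknown) distribution: mean 32s, stddev 0.5s.
p_pass = normal_cdf(threshold, mu=32, sigma=0.5)
print(f"chance a single sample from commit C still passes: {p_pass:.1%}")  # ~15.9%

p_31 = normal_cdf(31, mu=32, sigma=0.5)
print(f"chance of drawing a sample as fast as 31s: {p_31:.1%}")  # ~2.3%
```

So even after a real regression lands, roughly one sample in six from commit C would still pass the old threshold, which is why detection can be punted to a later commit.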

If your PR is against `main`, your commits will be compared against the latest baseline measurement found in `performance/baselines`. If this commit needs to be backported, that PR will be against the `.latest` branch and will also compare against the latest baseline measurement found in `performance/baselines` in that branch. These two versions may be the same or they may be different. For example, if the latest version of dbt is v1.99.0, the performance sample of your PR against main will compare against the baseline for v1.99.0. When those commits are backported to `1.98.latest`, they will be compared against the baseline for v1.98.6 (or whatever the latest is at that time). Even if the compared baseline is the same, a different sample is taken for each PR. In this case, even though it should be rare, it is possible for a performance regression to be detected in one of the two PRs and not the other, even with the same baseline, due to variation in sampling.

## The Statistics
Particle physicists need to be confident in declaring new discoveries, snack manufacturers need to be sure each individual item is within the regulated margin of error for nutrition facts, and weight-rated climbing gear needs to be produced so you can trust your life to every unit that comes off the line. All of these use cases use the same kind of math to meet their needs: sigma-based p-values. This section will peel apart that math with the help of a physicist and walk through how we apply this approach to performance regression testing in this test suite.

You are likely familiar with forming a hypothesis of the form "A and B are correlated", which is known as _the research hypothesis_. Additionally, it follows that the hypothesis "A and B are not correlated" is relevant and is known as _the null hypothesis_. When looking at data, we commonly use a _p-value_ to determine the significance of the data. Formally, a _p-value_ is the probability of obtaining data at least as extreme as the ones observed, if the null hypothesis is true. To refine this definition, the experimental particle physicist [Dr. Tommaso Dorigo](https://userswww.pd.infn.it/~dorigo/#about) has an excellent [glossary](https://www.science20.com/quantum_diaries_survivor/fundamental_glossary_higgs_broadcast-85365) of these terms that helps clarify: "'Extreme' is quite tricky instead: it depends on what is your 'alternate hypothesis' of reference, and what kind of departure it would produce on the studied statistic derived from the data. So 'extreme' will mean 'departing from the typical values expected for the null hypothesis, toward the values expected from the alternate hypothesis.'" In the context of performance regression testing, our research hypothesis is that "after commit A, the codebase includes a performance regression", which means we expect the runtime of our measured processes to be _slower_, not faster, than the expected value.

Given this definition of p-value, we need to explicitly call out the common tendency to apply _probability inversion_ to our observations. To quote [Dr. Tommaso Dorigo](https://www.science20.com/quantum_diaries_survivor/fundamental_glossary_higgs_broadcast-85365) again, "If your ability on the long jump puts you in the 99.99% percentile, that does not mean that you are a kangaroo, and neither can one infer that the probability that you belong to the human race is 0.01%." Using our previously defined terms, the p-value is _not_ the probability that the null hypothesis _is true_.

This brings us to calculating sigma values. Sigma refers to the standard deviation of a statistical model, which is used as a measurement of how far away an observed value is from the expected value. When we say that we have a "3 sigma result" we are saying that if the null hypothesis is true, this is a particularly unlikely observation, not that the null hypothesis is false. Exactly how unlikely depends on what the expected values from our research hypothesis are. In the context of performance regression testing, if the null hypothesis is false, we are expecting the results to be _slower_ than the expected value, not _slower or faster_. Looking at the normal distribution below, we can see that we only care about one _half_ of the distribution: the half where the values are slower than the expected value. This means that when we're calculating the p-value we are not including both sides of the normal distribution.



Because of this, the following table describes the significance of each sigma level for our _one-sided_ hypothesis:

| σ   | p-value        | scientific significance |
| --- | -------------- | ------------------------ |
| 1 σ | 1 in 6         |                          |
| 2 σ | 1 in 44        |                          |
| 3 σ | 1 in 741       | evidence                 |
| 4 σ | 1 in 31,574    |                          |
| 5 σ | 1 in 3,486,914 | discovery                |
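
The p-values in this table are just one-sided tail probabilities of a normal distribution, so they can be double-checked with a few lines of Python. This snippet is purely illustrative (it is not part of the test suite or the Rust runner), and the printed reciprocals may differ from the table by small rounding amounts:

```
import math

# One-sided tail probability of a normal distribution at an n-sigma threshold:
# P(X > mu + n * sigma). The reciprocal gives the "1 in N" column above.
for n in range(1, 6):
    p = 0.5 * math.erfc(n / math.sqrt(2))
    print(f"{n} sigma: p = {p:.3g}  (about 1 in {round(1 / p):,})")
```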

When detecting performance regressions that trigger alerts, block PRs, or delay releases, we want to be conservative enough that detections are infrequently triggered by noise, but not so conservative as to miss most actual regressions. This test suite uses a 3 sigma standard, so only about 1 in every 700 runs is expected to fail the performance regression test suite due to expected variance in our measurements.

In practice, the number of performance regression failures due to random noise will be higher because we are not incorporating the variance of the tools we use to measure, namely GHA.

### Concrete Example: Performance Regression Detection

The following example data was collected by running the code in this repository in GitHub Actions.

In dbt v1.0.3, we have the following mean and standard deviation when parsing a dbt project with 2000 models:

μ (mean): 41.22<br/>
σ (stddev): 0.2525<br/>

The 2-sided 3 sigma range can be calculated from these two values via:

x < μ - 3 σ or x > μ + 3 σ<br/>
x < 41.22 - 3 * 0.2525 or x > 41.22 + 3 * 0.2525<br/>
x < 40.46 or x > 41.98<br/>

It follows that the 1-sided 3 sigma range for performance regressions is just:<br/>
x > 41.98

If, when we sample a single `dbt parse` of the same project with a commit slated to go into dbt v1.0.4, we observe a 42s parse time, then this observation is so unlikely in the absence of a code-induced performance regression that we should investigate whether there is a regression in any of the commits between this failure and the commit where the initial distribution was measured.
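
The same check can be written out in a few lines. This is only a sketch of the arithmetic above, not the logic the Rust runner actually uses:

```
# Recompute the one-sided 3 sigma threshold from the baseline above
# and compare a fresh sample against it.
baseline_mean = 41.22      # seconds: dbt v1.0.3 parse of the 2000-model project
baseline_stddev = 0.2525   # seconds
threshold = baseline_mean + 3 * baseline_stddev   # 41.9775 ~= 41.98

sample = 42.0  # observed parse time for the candidate commit
if sample > threshold:
    print(f"{sample:.2f}s > {threshold:.2f}s: flag this commit for investigation")
else:
    print(f"{sample:.2f}s is within the expected range (<= {threshold:.2f}s)")
```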

Observations with 3 sigma significance that are _not_ performance regressions could be due to observing unlikely values (roughly 1 in every 750 observations), or to variation in the instruments we use to take these measurements, such as GitHub Actions. At this time we do not measure the variation in the instruments we use to account for it in our calculations, which means failures due to random noise are more likely than they would be if we did take them into account.

### Concrete Example: Performance Modeling

Once a new dbt version is released (excluding pre-releases), the performance characteristics of that released version need to be measured. In this repository this measurement is referred to as a baseline.

After dbt v1.0.99 is released, a GitHub Action running from `main` (so that the latest version of that action is used) takes the following steps:
- Checks out main for the latest performance runner
- pip installs dbt v1.0.99
- builds the runner if it's not already in the GitHub Actions cache
- uses the performance runner's model subcommand via `./runner model`.
- The model subcommand calls hyperfine to run all of the project-command pairs a large number of times (maybe 20 or so) and saves the hyperfine output to files in `performance/baselines/1.0.99/`, one file per command-project pair (a sketch of reading these files follows this list).
- The action opens two PRs with these files: one against `main` and one against `1.0.latest`, so that future PRs against these branches will detect regressions against the performance characteristics of dbt v1.0.99 instead of v1.0.98.
- The release driver for dbt v1.0.99 reviews and merges these PRs, which is the sole deliverable of the performance modeling work.
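
The comparison itself only needs a mean and a standard deviation per baseline file. The sketch below assumes each baseline is a raw `hyperfine --export-json` dump and uses a hypothetical file name; the real runner is written in Rust and its file layout may differ:

```
import json
from pathlib import Path

def read_baseline(path: Path) -> tuple[float, float]:
    # hyperfine's JSON export stores per-command results with
    # "mean" and "stddev" fields measured in seconds.
    data = json.loads(path.read_text())
    result = data["results"][0]
    return result["mean"], result["stddev"]

def one_sided_threshold(mean: float, stddev: float, sigmas: float = 3.0) -> float:
    # We only care about samples that are *slower* than expected.
    return mean + sigmas * stddev

# Hypothetical usage; the path is illustrative, not the repo's actual layout:
# mean, stddev = read_baseline(Path("performance/baselines/1.0.99/parse-2000-models.json"))
# print(one_sided_threshold(mean, stddev))
```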

## Future work
- add more projects to test different configurations that have been known bottlenecks
- add more dbt commands to measure
- possibly using the uploaded json artifacts to store these results so they can be graphed over time
- reading new metrics from a file so no one has to edit rust source to add them to the suite
- instead of building the rust every time, we could publish and pull down the latest version.
- instead of manually setting the baseline version of dbt to test, pull down the latest stable version as the baseline.
- pin commands to projects by reading commands from a file defined in the project.
- add a postgres warehouse to run `dbt compile` and `dbt run` commands
- add more projects to test different configurations that have been known performance bottlenecks
- Account for github action variation: Either measure it, or eliminate it. To measure it we could set up another action that periodically samples the same version of dbt and use a 7 day rolling variation. To eliminate it we could run the action using something like [act](https://github.com/nektos/act) on dedicated hardware.
- build in a git-bisect run to automatically identify the commits that caused a performance regression by modeling each commit's expected value for the failing metric. Running this automatically, or even providing a script to do this locally would be useful.

Some files were not shown because too many files have changed in this diff.