forked from repo-mirrors/dbt-core
Compare commits: adding-sem...v1.0.8 (86 commits)
| SHA1 |
|---|
| f1f0065a47 |
| 2cd8b1c137 |
| b33b143da1 |
| 6a772f185d |
| fd49525905 |
| 016613552e |
| 0c228c5383 |
| 6cdf373143 |
| b771d8b59e |
| de1c1a1b29 |
| c7e5a6c6b3 |
| 9f5688bf84 |
| 4838411039 |
| 37344dd87c |
| 7202a1c78e |
| 8489e99854 |
| 4a1d8a2986 |
| 64ff87d7e4 |
| 5d0ebd502b |
| 7aa7259b1a |
| 7d1410acc9 |
| 88fc45b156 |
| c6cde6ee2d |
| c8f3f22e15 |
| 2748e4b822 |
| 7fca9ec2c9 |
| ad3063a612 |
| 5218438704 |
| 33d08f8faa |
| 9ff2c8024c |
| 75696a1797 |
| 5b41b12779 |
| 27ed2f961b |
| f2dcb6f23c |
| 77afe63c7c |
| ca7c4c147a |
| 4145834c5b |
| aaeb94d683 |
| a2662b2f83 |
| 056db408cf |
| bec6becd18 |
| 3be057b6a4 |
| e2a6c25a6d |
| 92b3fc470d |
| 1e9fe67393 |
| d9361259f4 |
| 7990974bd8 |
| 544d3e7a3a |
| 31962beb14 |
| f6a0853901 |
| 336a3d4987 |
| 74dc5c49ae |
| 29fa687349 |
| 39d4e729c9 |
| 406bdcc89c |
| 9702aa733f |
| 44265716f9 |
| 20b27fd3b6 |
| 76c2e182ba |
| 791625ddf5 |
| 1baa05a764 |
| 1b47b53aff |
| ec1f609f3e |
| b4ea003559 |
| 23e1a9aa4f |
| 9882d08a24 |
| 79cc811a68 |
| c82572f745 |
| 42a38e4deb |
| ecf0ffe68c |
| e9f26ef494 |
| c77dc59af8 |
| a5ebe4ff59 |
| 5c01f9006c |
| c92e1ed9f2 |
| 85dee41a9f |
| a4456feff0 |
| 8d27764b0f |
| e56256d968 |
| 86cb3ba6fa |
| 4d0d2d0d6f |
| f8a3c27fb8 |
| 30f05b0213 |
| f1bebb3629 |
| e7a40345ad |
| ba94b8212c |
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 1.0.0rc3
+current_version = 1.0.8
 parse = (?P<major>\d+)
     \.(?P<minor>\d+)
     \.(?P<patch>\d+)
15  .changes/0.0.0.md  Normal file
@@ -0,0 +1,15 @@
## Previous Releases

For information on prior major and minor releases, see their changelogs:

* [0.21](https://github.com/dbt-labs/dbt-core/blob/0.21.latest/CHANGELOG.md)
* [0.20](https://github.com/dbt-labs/dbt-core/blob/0.20.latest/CHANGELOG.md)
* [0.19](https://github.com/dbt-labs/dbt-core/blob/0.19.latest/CHANGELOG.md)
* [0.18](https://github.com/dbt-labs/dbt-core/blob/0.18.latest/CHANGELOG.md)
* [0.17](https://github.com/dbt-labs/dbt-core/blob/0.17.latest/CHANGELOG.md)
* [0.16](https://github.com/dbt-labs/dbt-core/blob/0.16.latest/CHANGELOG.md)
* [0.15](https://github.com/dbt-labs/dbt-core/blob/0.15.latest/CHANGELOG.md)
* [0.14](https://github.com/dbt-labs/dbt-core/blob/0.14.latest/CHANGELOG.md)
* [0.13](https://github.com/dbt-labs/dbt-core/blob/0.13.latest/CHANGELOG.md)
* [0.12](https://github.com/dbt-labs/dbt-core/blob/0.12.latest/CHANGELOG.md)
* [0.11 and earlier](https://github.com/dbt-labs/dbt-core/blob/0.11.latest/CHANGELOG.md)
250  .changes/1.0.3.md  Normal file
@@ -0,0 +1,250 @@
## dbt-core 1.0.3 (February 21, 2022)

### Fixes
- Fix bug accessing target fields in deps and clean commands ([#4752](https://github.com/dbt-labs/dbt-core/issues/4752), [#4758](https://github.com/dbt-labs/dbt-core/issues/4758))

## dbt-core 1.0.2 (February 18, 2022)

### Dependencies
- Pin `MarkupSafe==2.0.1`. Deprecation of `soft_unicode` in `MarkupSafe==2.1.0` is not supported by `Jinja2==2.11`

## dbt-core 1.0.2rc1 (February 4, 2022)

### Fixes
- Projects created using `dbt init` now have the correct `seeds` directory created (instead of `data`) ([#4588](https://github.com/dbt-labs/dbt-core/issues/4588), [#4599](https://github.com/dbt-labs/dbt-core/pull/4589))
- Don't require a profile for dbt deps and clean commands ([#4554](https://github.com/dbt-labs/dbt-core/issues/4554), [#4610](https://github.com/dbt-labs/dbt-core/pull/4610))
- Select modified.body works correctly when a new model is added ([#4570](https://github.com/dbt-labs/dbt-core/issues/4570), [#4631](https://github.com/dbt-labs/dbt-core/pull/4631))
- Fix bug in retry logic for bad response from hub and when there is a bad git tarball download. ([#4577](https://github.com/dbt-labs/dbt-core/issues/4577), [#4579](https://github.com/dbt-labs/dbt-core/issues/4579), [#4609](https://github.com/dbt-labs/dbt-core/pull/4609))
- Restore previous log level (DEBUG) when a test depends on a disabled resource. Still WARN if the resource is missing ([#4594](https://github.com/dbt-labs/dbt-core/issues/4594), [#4647](https://github.com/dbt-labs/dbt-core/pull/4647))
- User wasn't asked for permission to overwrite a profile entry when running init inside an existing project ([#4375](https://github.com/dbt-labs/dbt-core/issues/4375), [#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
- A change in secret environment variables won't trigger a full reparse ([#4650](https://github.com/dbt-labs/dbt-core/issues/4650), [#4665](https://github.com/dbt-labs/dbt-core/pull/4665))
- Adapter compatibility messaging added ([#4438](https://github.com/dbt-labs/dbt-core/pull/4438), [#4565](https://github.com/dbt-labs/dbt-core/pull/4565))
- Add project name validation to `dbt init` ([#4490](https://github.com/dbt-labs/dbt-core/issues/4490), [#4536](https://github.com/dbt-labs/dbt-core/pull/4536))

Contributors:
- [@NiallRees](https://github.com/NiallRees) ([#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
- [@amirkdv](https://github.com/amirkdv) ([#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
- [@nkyuray](https://github.com/nkyuray) ([#4565](https://github.com/dbt-labs/dbt-core/pull/4565))

## dbt-core 1.0.1 (January 03, 2022)

## dbt-core 1.0.1rc1 (December 20, 2021)

### Fixes
- Fix wrong url in the dbt docs overview homepage ([#4442](https://github.com/dbt-labs/dbt-core/pull/4442))
- Fix redefined status param of SQLQueryStatus to typecheck the string which passes on `._message` value of `AdapterResponse` or the `str` value sent by adapter plugin. ([#4463](https://github.com/dbt-labs/dbt-core/pull/4463#issuecomment-990174166))
- Fix `DepsStartPackageInstall` event to use package name instead of version number. ([#4482](https://github.com/dbt-labs/dbt-core/pull/4482))
- Reimplement log message to use adapter name instead of the object method. ([#4501](https://github.com/dbt-labs/dbt-core/pull/4501))
- Issue better error message for incompatible schemas ([#4470](https://github.com/dbt-labs/dbt-core/pull/4442), [#4497](https://github.com/dbt-labs/dbt-core/pull/4497))
- Remove secrets from error related to packages. ([#4507](https://github.com/dbt-labs/dbt-core/pull/4507))
- Prevent coercion of boolean values (`True`, `False`) to numeric values (`0`, `1`) in query results ([#4511](https://github.com/dbt-labs/dbt-core/issues/4511), [#4512](https://github.com/dbt-labs/dbt-core/pull/4512))
- Fix error with an env_var in a project hook ([#4523](https://github.com/dbt-labs/dbt-core/issues/4523), [#4524](https://github.com/dbt-labs/dbt-core/pull/4524))

### Docs
- Fix missing data on exposures in docs ([#4467](https://github.com/dbt-labs/dbt-core/issues/4467))

Contributors:
- [remoyson](https://github.com/remoyson) ([#4442](https://github.com/dbt-labs/dbt-core/pull/4442))

## dbt-core 1.0.0 (December 3, 2021)

### Fixes
- Configure the CLI logger destination to use stdout instead of stderr ([#4368](https://github.com/dbt-labs/dbt-core/pull/4368))
- Make the size of `EVENT_HISTORY` configurable, via `EVENT_BUFFER_SIZE` global config ([#4411](https://github.com/dbt-labs/dbt-core/pull/4411), [#4416](https://github.com/dbt-labs/dbt-core/pull/4416))
- Change type of `log_format` in `profiles.yml` user config to be string, not boolean ([#4394](https://github.com/dbt-labs/dbt-core/pull/4394))

### Under the hood
- Only log cache events if `LOG_CACHE_EVENTS` is enabled, and disable by default. This restores previous behavior ([#4369](https://github.com/dbt-labs/dbt-core/pull/4369))
- Move event codes to be a top-level attribute of JSON-formatted logs, rather than nested in `data` ([#4381](https://github.com/dbt-labs/dbt-core/pull/4381))
- Fix failing integration test on Windows ([#4380](https://github.com/dbt-labs/dbt-core/pull/4380))
- Clean up warning messages for `clean` + `deps` ([#4366](https://github.com/dbt-labs/dbt-core/pull/4366))
- Use RFC3339 timestamps for log messages ([#4384](https://github.com/dbt-labs/dbt-core/pull/4384))
- Different text output for console (info) and file (debug) logs ([#4379](https://github.com/dbt-labs/dbt-core/pull/4379), [#4418](https://github.com/dbt-labs/dbt-core/pull/4418))
- Remove unused events. More structured `ConcurrencyLine`. Replace `\n` message starts/ends with `EmptyLine` events, and exclude `EmptyLine` from JSON-formatted output ([#4388](https://github.com/dbt-labs/dbt-core/pull/4388))
- Update `events` module README ([#4395](https://github.com/dbt-labs/dbt-core/pull/4395))
- Rework approach to JSON serialization for events with non-standard properties ([#4396](https://github.com/dbt-labs/dbt-core/pull/4396))
- Update legacy logger file name to `dbt.log.legacy` ([#4402](https://github.com/dbt-labs/dbt-core/pull/4402))
- Rollover `dbt.log` at 10 MB, and keep up to 5 backups, restoring previous behavior ([#4405](https://github.com/dbt-labs/dbt-core/pull/4405))
- Use reference keys instead of full relation objects in cache events ([#4410](https://github.com/dbt-labs/dbt-core/pull/4410))
- Add `node_type` contextual info to more events ([#4378](https://github.com/dbt-labs/dbt-core/pull/4378))
- Make `materialized` config optional in `node_type` ([#4417](https://github.com/dbt-labs/dbt-core/pull/4417))
- Stringify exception in `GenericExceptionOnRun` to support JSON serialization ([#4424](https://github.com/dbt-labs/dbt-core/pull/4424))
- Add "interop" tests for machine consumption of structured log output ([#4327](https://github.com/dbt-labs/dbt-core/pull/4327))
- Relax version specifier for `dbt-extractor` to `~=0.4.0`, to support compiled wheels for additional architectures when available ([#4427](https://github.com/dbt-labs/dbt-core/pull/4427))

## dbt-core 1.0.0rc3 (November 30, 2021)

### Fixes
- Support partial parsing of env_vars in metrics ([#4253](https://github.com/dbt-labs/dbt-core/issues/4293), [#4322](https://github.com/dbt-labs/dbt-core/pull/4322))
- Fix typo in `UnparsedSourceDefinition.__post_serialize__` ([#3545](https://github.com/dbt-labs/dbt-core/issues/3545), [#4349](https://github.com/dbt-labs/dbt-core/pull/4349))

### Under the hood
- Change some CompilationExceptions to ParsingExceptions ([#4254](http://github.com/dbt-labs/dbt-core/issues/4254), [#4328](https://github.com/dbt-core/pull/4328))
- Reorder logic for static parser sampling to speed up model parsing ([#4332](https://github.com/dbt-labs/dbt-core/pull/4332))
- Use more augmented assignment statements ([#4315](https://github.com/dbt-labs/dbt-core/issues/4315)), ([#4311](https://github.com/dbt-labs/dbt-core/pull/4331))
- Adjust logic when finding approximate matches for models and tests ([#3835](https://github.com/dbt-labs/dbt-core/issues/3835), [#4076](https://github.com/dbt-labs/dbt-core/pull/4076))
- Restore small previous behaviors for logging: JSON formatting for first few events; `WARN`-level stdout for `list` task; include tracking events in `dbt.log` ([#4341](https://github.com/dbt-labs/dbt-core/pull/4341))

Contributors:
- [@sarah-weatherbee](https://github.com/sarah-weatherbee) ([#4331](https://github.com/dbt-labs/dbt-core/pull/4331))
- [@emilieschario](https://github.com/emilieschario) ([#4076](https://github.com/dbt-labs/dbt-core/pull/4076))
- [@sneznaj](https://github.com/sneznaj) ([#4349](https://github.com/dbt-labs/dbt-core/pull/4349))

## dbt-core 1.0.0rc2 (November 22, 2021)

### Breaking changes
- Restrict secret env vars (prefixed `DBT_ENV_SECRET_`) to `profiles.yml` + `packages.yml` _only_. Raise an exception if a secret env var is used elsewhere ([#4310](https://github.com/dbt-labs/dbt-core/issues/4310), [#4311](https://github.com/dbt-labs/dbt-core/pull/4311))
- Reorder arguments to `config.get()` so that `default` is second ([#4273](https://github.com/dbt-labs/dbt-core/issues/4273), [#4297](https://github.com/dbt-labs/dbt-core/pull/4297))

### Features
- Avoid error when missing column in YAML description ([#4151](https://github.com/dbt-labs/dbt-core/issues/4151), [#4285](https://github.com/dbt-labs/dbt-core/pull/4285))
- Allow `--defer` flag to `dbt snapshot` ([#4110](https://github.com/dbt-labs/dbt-core/issues/4110), [#4296](https://github.com/dbt-labs/dbt-core/pull/4296))
- Install prerelease packages when `version` explicitly references a prerelease version, regardless of `install-prerelease` status ([#4243](https://github.com/dbt-labs/dbt-core/issues/4243), [#4295](https://github.com/dbt-labs/dbt-core/pull/4295))
- Add data attributes to json log messages ([#4301](https://github.com/dbt-labs/dbt-core/pull/4301))
- Add event codes to all log events ([#4319](https://github.com/dbt-labs/dbt-core/pull/4319))

### Fixes
- Allow specifying default in Jinja config.get with default keyword ([#4273](https://github.com/dbt-labs/dbt-core/issues/4273), [#4297](https://github.com/dbt-labs/dbt-core/pull/4297))
- Fix serialization error with missing quotes in metrics model ref ([#4252](https://github.com/dbt-labs/dbt-core/issues/4252), [#4287](https://github.com/dbt-labs/dbt-core/pull/4289))
- Correct definition of 'created_at' in ParsedMetric nodes ([#4298](https://github.com/dbt-labs/dbt-core/issues/4298), [#4299](https://github.com/dbt-labs/dbt-core/pull/4299))

### Under the hood
- Add --indirect-selection parameter to profiles.yml and builtin DBT_ env vars; stringified parameter to enable multi-modal use ([#3997](https://github.com/dbt-labs/dbt-core/issues/3997), [#4270](https://github.com/dbt-labs/dbt-core/pull/4270))
- Fix filesystem searcher test failure on Python 3.9 ([#3689](https://github.com/dbt-labs/dbt-core/issues/3689), [#4271](https://github.com/dbt-labs/dbt-core/pull/4271))
- Clean up deprecation warnings shown for `dbt_project.yml` config renames ([#4276](https://github.com/dbt-labs/dbt-core/issues/4276), [#4291](https://github.com/dbt-labs/dbt-core/pull/4291))
- Fix metrics count in compiled project stats ([#4290](https://github.com/dbt-labs/dbt-core/issues/4290), [#4292](https://github.com/dbt-labs/dbt-core/pull/4292))
- First pass at supporting more dbt tasks via python lib ([#4200](https://github.com/dbt-labs/dbt-core/pull/4200))

Contributors:
- [@kadero](https://github.com/kadero) ([#4285](https://github.com/dbt-labs/dbt-core/pull/4285), [#4296](https://github.com/dbt-labs/dbt-core/pull/4296))
- [@joellabes](https://github.com/joellabes) ([#4295](https://github.com/dbt-labs/dbt-core/pull/4295))

## dbt-core 1.0.0rc1 (November 10, 2021)

### Breaking changes
- Replace `greedy` flag/property for test selection with `indirect_selection: eager/cautious` flag/property. Set to `eager` by default. **Note:** This reverts test selection to its pre-v0.20 behavior by default. `dbt test -s my_model` _will_ select multi-parent tests, such as `relationships`, that depend on unselected resources. To achieve the behavior change in v0.20 + v0.21, set `--indirect-selection=cautious` on the CLI or `indirect_selection: cautious` in yaml selectors. ([#4082](https://github.com/dbt-labs/dbt-core/issues/4082), [#4104](https://github.com/dbt-labs/dbt-core/pull/4104))
- In v1.0.0, **`pip install dbt` will raise an explicit error.** Instead, please use `pip install dbt-<adapter>` (to use dbt with that database adapter), or `pip install dbt-core` (for core functionality). For parity with the previous behavior of `pip install dbt`, you can use: `pip install dbt-core dbt-postgres dbt-redshift dbt-snowflake dbt-bigquery` ([#4100](https://github.com/dbt-labs/dbt-core/issues/4100), [#4133](https://github.com/dbt-labs/dbt-core/pull/4133))
- Reorganize the `global_project` (macros) into smaller files with clearer names. Remove unused global macros: `column_list`, `column_list_for_create_table`, `incremental_upsert` ([#4154](https://github.com/dbt-labs/dbt-core/pull/4154))
- Introduce structured event interface, and begin conversion of all legacy logging ([#3359](https://github.com/dbt-labs/dbt-core/issues/3359), [#4055](https://github.com/dbt-labs/dbt-core/pull/4055))
  - **This is a breaking change for adapter plugins, requiring a very simple migration.** See [`events` module README](core/dbt/events/README.md#adapter-maintainers) for details.
  - If you maintain another kind of dbt-core plugin that makes heavy use of legacy logging, and you need time to cut over to the new event interface, you can re-enable the legacy logger via an environment variable shim, `DBT_ENABLE_LEGACY_LOGGER=True`. Be advised that we will remove this capability in a future version of dbt-core.

### Features
- Allow nullable `error_after` in source freshness ([#3874](https://github.com/dbt-labs/dbt-core/issues/3874), [#3955](https://github.com/dbt-labs/dbt-core/pull/3955))
- Add `metrics` nodes ([#4071](https://github.com/dbt-labs/dbt-core/issues/4071), [#4235](https://github.com/dbt-labs/dbt-core/pull/4235))
- Add support for `dbt init <project_name>`, and support for `skip_profile_setup` argument (`dbt init -s`) ([#4156](https://github.com/dbt-labs/dbt-core/issues/4156), [#4249](https://github.com/dbt-labs/dbt-core/pull/4249))

### Fixes
- Changes unit tests using `assertRaisesRegexp` to `assertRaisesRegex` ([#4136](https://github.com/dbt-labs/dbt-core/issues/4132), [#4136](https://github.com/dbt-labs/dbt-core/pull/4136))
- Allow retries when the answer from a `dbt deps` is `None` ([#4178](https://github.com/dbt-labs/dbt-core/issues/4178), [#4225](https://github.com/dbt-labs/dbt-core/pull/4225))

### Docs

- Fix non-alphabetical sort of Source Tables in source overview page ([docs#81](https://github.com/dbt-labs/dbt-docs/issues/81), [docs#218](https://github.com/dbt-labs/dbt-docs/pull/218))
- Add title tag to node elements in tree ([docs#202](https://github.com/dbt-labs/dbt-docs/issues/202), [docs#203](https://github.com/dbt-labs/dbt-docs/pull/203))
- Account for test rename: `schema` → `generic`, `data` → `singular`. Use `test_metadata` instead of `schema`/`data` tags to differentiate ([docs#216](https://github.com/dbt-labs/dbt-docs/issues/216), [docs#222](https://github.com/dbt-labs/dbt-docs/pull/222))
- Add `metrics` ([core#216](https://github.com/dbt-labs/dbt-core/issues/4235), [docs#223](https://github.com/dbt-labs/dbt-docs/pull/223))

### Under the hood
- Bump artifact schema versions for 1.0.0: manifest v4, run results v4, sources v3. Notable changes: added `metrics` nodes; schema test + data test nodes are renamed to generic test + singular test nodes; freshness threshold default values ([#4191](https://github.com/dbt-labs/dbt-core/pull/4191))
- Speed up node selection by skipping `incorporate_indirect_nodes` if not needed ([#4213](https://github.com/dbt-labs/dbt-core/issues/4213), [#4214](https://github.com/dbt-labs/dbt-core/issues/4214))
- When `on_schema_change` is set, pass common columns as `dest_columns` in incremental merge macros ([#4144](https://github.com/dbt-labs/dbt-core/issues/4144), [#4170](https://github.com/dbt-labs/dbt-core/pull/4170))
- Clear adapters before registering in `lib` module config generation ([#4218](https://github.com/dbt-labs/dbt-core/pull/4218))
- Remove official support for python 3.6, which is reaching end of life on December 23, 2021 ([#4134](https://github.com/dbt-labs/dbt-core/issues/4134), [#4223](https://github.com/dbt-labs/dbt-core/pull/4223))

Contributors:
- [@kadero](https://github.com/kadero) ([#3955](https://github.com/dbt-labs/dbt-core/pull/3955), [#4249](https://github.com/dbt-labs/dbt-core/pull/4249))
- [@frankcash](https://github.com/frankcash) ([#4136](https://github.com/dbt-labs/dbt-core/pull/4136))
- [@Kayrnt](https://github.com/Kayrnt) ([#4136](https://github.com/dbt-labs/dbt-core/pull/4170))
- [@VersusFacit](https://github.com/VersusFacit) ([#4104](https://github.com/dbt-labs/dbt-core/pull/4104))
- [@joellabes](https://github.com/joellabes) ([#4104](https://github.com/dbt-labs/dbt-core/pull/4104))
- [@b-per](https://github.com/b-per) ([#4225](https://github.com/dbt-labs/dbt-core/pull/4225))
- [@salmonsd](https://github.com/salmonsd) ([docs#218](https://github.com/dbt-labs/dbt-docs/pull/218))
- [@miike](https://github.com/miike) ([docs#203](https://github.com/dbt-labs/dbt-docs/pull/203))

## dbt-core 1.0.0b2 (October 25, 2021)

### Breaking changes

- Enable `on-run-start` and `on-run-end` hooks for `dbt test`. Add `flags.WHICH` to execution context, representing current task ([#3463](https://github.com/dbt-labs/dbt-core/issues/3463), [#4004](https://github.com/dbt-labs/dbt-core/pull/4004))

### Features
- Normalize global CLI arguments/flags ([#2990](https://github.com/dbt-labs/dbt/issues/2990), [#3839](https://github.com/dbt-labs/dbt/pull/3839))
- Turns on the static parser by default and adds the flag `--no-static-parser` to disable it. ([#3377](https://github.com/dbt-labs/dbt/issues/3377), [#3939](https://github.com/dbt-labs/dbt/pull/3939))
- Generic test FQNs have changed to include the relative path, resource, and column (if applicable) where they are defined. This makes it easier to configure them from the `tests` block in `dbt_project.yml` ([#3259](https://github.com/dbt-labs/dbt/pull/3259), [#3880](https://github.com/dbt-labs/dbt/pull/3880))
- Turn on partial parsing by default ([#3867](https://github.com/dbt-labs/dbt/issues/3867), [#3989](https://github.com/dbt-labs/dbt/issues/3989))
- Add `result:<status>` selectors to automatically rerun failed tests and erroneous models. This makes it easier to rerun failed dbt jobs with a simple selector flag instead of restarting from the beginning or manually running the dbt models in scope. ([#3859](https://github.com/dbt-labs/dbt/issues/3891), [#4017](https://github.com/dbt-labs/dbt/pull/4017))
- `dbt init` is now interactive, generating profiles.yml when run inside existing project ([#3625](https://github.com/dbt-labs/dbt/pull/3625))

### Under the hood

- Fix intermittent errors in partial parsing tests ([#4060](https://github.com/dbt-labs/dbt-core/issues/4060), [#4068](https://github.com/dbt-labs/dbt-core/pull/4068))
- Make finding disabled nodes more consistent ([#4069](https://github.com/dbt-labs/dbt-core/issues/4069), [#4073](https://github.com/dbt-labas/dbt-core/pull/4073))
- Remove connection from `render_with_context` during parsing, thereby removing misleading log message ([#3137](https://github.com/dbt-labs/dbt-core/issues/3137), [#4062](https://github.com/dbt-labas/dbt-core/pull/4062))
- Wait for postgres docker container to be ready in `setup_db.sh`. ([#3876](https://github.com/dbt-labs/dbt-core/issues/3876), [#3908](https://github.com/dbt-labs/dbt-core/pull/3908))
- Prefer macros defined in the project over the ones in a package by default ([#4106](https://github.com/dbt-labs/dbt-core/issues/4106), [#4114](https://github.com/dbt-labs/dbt-core/pull/4114))
- Dependency updates ([#4079](https://github.com/dbt-labs/dbt-core/pull/4079), [#3532](https://github.com/dbt-labs/dbt-core/pull/3532))
- Schedule partial parsing for SQL files with env_var changes ([#3885](https://github.com/dbt-labs/dbt-core/issues/3885), [#4101](https://github.com/dbt-labs/dbt-core/pull/4101))
- Schedule partial parsing for schema files with env_var changes ([#3885](https://github.com/dbt-labs/dbt-core/issues/3885), [#4162](https://github.com/dbt-labs/dbt-core/pull/4162))
- Skip partial parsing when env_vars change in dbt_project or profile ([#3885](https://github.com/dbt-labs/dbt-core/issues/3885), [#4212](https://github.com/dbt-labs/dbt-core/pull/4212))

Contributors:
- [@sungchun12](https://github.com/sungchun12) ([#4017](https://github.com/dbt-labs/dbt/pull/4017))
- [@matt-winkler](https://github.com/matt-winkler) ([#4017](https://github.com/dbt-labs/dbt/pull/4017))
- [@NiallRees](https://github.com/NiallRees) ([#3625](https://github.com/dbt-labs/dbt/pull/3625))
- [@rvacaru](https://github.com/rvacaru) ([#3908](https://github.com/dbt-labs/dbt/pull/3908))
- [@JCZuurmond](https://github.com/jczuurmond) ([#4114](https://github.com/dbt-labs/dbt-core/pull/4114))
- [@ljhopkins2](https://github.com/ljhopkins2) ([#4079](https://github.com/dbt-labs/dbt-core/pull/4079))

## dbt-core 1.0.0b1 (October 11, 2021)

### Breaking changes

- The two types of test definitions are now "singular" and "generic" (instead of "data" and "schema", respectively). The `test_type:` selection method accepts `test_type:singular` and `test_type:generic`. (It will also accept `test_type:schema` and `test_type:data` for backwards compatibility) ([#3234](https://github.com/dbt-labs/dbt-core/issues/3234), [#3880](https://github.com/dbt-labs/dbt-core/pull/3880)). **Not backwards compatible:** The `--data` and `--schema` flags to `dbt test` are no longer supported, and tests no longer have the tags `'data'` and `'schema'` automatically applied.
- Deprecated the use of the `packages` arg `adapter.dispatch` in favor of the `macro_namespace` arg. ([#3895](https://github.com/dbt-labs/dbt-core/issues/3895))

### Features
- Normalize global CLI arguments/flags ([#2990](https://github.com/dbt-labs/dbt-core/issues/2990), [#3839](https://github.com/dbt-labs/dbt-core/pull/3839))
- Turns on the static parser by default and adds the flag `--no-static-parser` to disable it. ([#3377](https://github.com/dbt-labs/dbt-core/issues/3377), [#3939](https://github.com/dbt-labs/dbt-core/pull/3939))
- Generic test FQNs have changed to include the relative path, resource, and column (if applicable) where they are defined. This makes it easier to configure them from the `tests` block in `dbt_project.yml` ([#3259](https://github.com/dbt-labs/dbt-core/pull/3259), [#3880](https://github.com/dbt-labs/dbt-core/pull/3880))
- Turn on partial parsing by default ([#3867](https://github.com/dbt-labs/dbt-core/issues/3867), [#3989](https://github.com/dbt-labs/dbt-core/issues/3989))
- Generic tests can now be added under a `generic` subfolder in the `test-paths` directory. ([#4052](https://github.com/dbt-labs/dbt-core/pull/4052))

### Fixes
- Add generic tests defined on sources to the manifest once, not twice ([#3347](https://github.com/dbt-labs/dbt/issues/3347), [#3880](https://github.com/dbt-labs/dbt/pull/3880))
- Skip partial parsing if certain macros have changed ([#3810](https://github.com/dbt-labs/dbt/issues/3810), [#3982](https://github.com/dbt-labs/dbt/pull/3892))
- Enable cataloging of unlogged Postgres tables ([#3961](https://github.com/dbt-labs/dbt/issues/3961), [#3993](https://github.com/dbt-labs/dbt/pull/3993))
- Fix multiple disabled nodes ([#4013](https://github.com/dbt-labs/dbt/issues/4013), [#4018](https://github.com/dbt-labs/dbt/pull/4018))
- Fix multiple partial parsing errors ([#3996](https://github.com/dbt-labs/dbt/issues/3006), [#4020](https://github.com/dbt-labs/dbt/pull/4018))
- Return an error instead of a warning when running with `--warn-error` and no models are selected ([#4006](https://github.com/dbt-labs/dbt/issues/4006), [#4019](https://github.com/dbt-labs/dbt/pull/4019))
- Fixed bug with `error_if` test option ([#4070](https://github.com/dbt-labs/dbt-core/pull/4070))

### Under the hood
- Enact deprecation for `materialization-return` and replace deprecation warning with an exception. ([#3896](https://github.com/dbt-labs/dbt-core/issues/3896))
- Build catalog for only relational, non-ephemeral nodes in the graph ([#3920](https://github.com/dbt-labs/dbt-core/issues/3920))
- Enact deprecation to remove the `release` arg from the `execute_macro` method. ([#3900](https://github.com/dbt-labs/dbt-core/issues/3900))
- Enact deprecation for default quoting to be True. Override for the `dbt-snowflake` adapter so it stays `False`. ([#3898](https://github.com/dbt-labs/dbt-core/issues/3898))
- Enact deprecation for objects used as dictionaries when they should be dataclasses. Replace deprecation warning with an exception for the dunder methods of `__iter__` and `__len__` for all superclasses of FakeAPIObject. ([#3897](https://github.com/dbt-labs/dbt-core/issues/3897))
- Enact deprecation for `adapter-macro` and replace deprecation warning with an exception. ([#3901](https://github.com/dbt-labs/dbt-core/issues/3901))
- Add warning when trying to put a node under the wrong key, e.g. a seed under models in a `schema.yml` file. ([#3899](https://github.com/dbt-labs/dbt-core/issues/3899))
- Plugins for `redshift`, `snowflake`, and `bigquery` have moved to separate repos: [`dbt-redshift`](https://github.com/dbt-labs/dbt-redshift), [`dbt-snowflake`](https://github.com/dbt-labs/dbt-snowflake), [`dbt-bigquery`](https://github.com/dbt-labs/dbt-bigquery)
- Change the default dbt packages installation directory to `dbt_packages` from `dbt_modules`. Also rename `module-path` to `packages-install-path` to allow default overrides of package install directory. Deprecation warning added for projects using the old `dbt_modules` name without specifying a `packages-install-path`. ([#3523](https://github.com/dbt-labs/dbt-core/issues/3523))
- Update the default project paths to be `analysis-paths = ['analyses']` and `test-paths = ['tests']`. Also have starter project set `analysis-paths: ['analyses']` from now on. ([#2659](https://github.com/dbt-labs/dbt-core/issues/2659))
- Define the data type of `sources` as an array of arrays of string in the manifest artifacts. ([#3966](https://github.com/dbt-labs/dbt-core/issues/3966), [#3967](https://github.com/dbt-labs/dbt-core/pull/3967))
- Marked `source-paths` and `data-paths` as deprecated keys in `dbt_project.yml` in favor of `model-paths` and `seed-paths` respectively. ([#1607](https://github.com/dbt-labs/dbt-core/issues/1607))
- Surface git errors to `stdout` when cloning dbt packages from Github. ([#3167](https://github.com/dbt-labs/dbt-core/issues/3167))

Contributors:

- [@dave-connors-3](https://github.com/dave-connors-3) ([#3920](https://github.com/dbt-labs/dbt-core/pull/3922))
- [@kadero](https://github.com/kadero) ([#3952](https://github.com/dbt-labs/dbt-core/pull/3953))
- [@samlader](https://github.com/samlader) ([#3993](https://github.com/dbt-labs/dbt-core/pull/3993))
- [@yu-iskw](https://github.com/yu-iskw) ([#3967](https://github.com/dbt-labs/dbt-core/pull/3967))
- [@laxjesse](https://github.com/laxjesse) ([#4019](https://github.com/dbt-labs/dbt-core/pull/4019))
- [@gitznik](https://github.com/Gitznik) ([#4124](https://github.com/dbt-labs/dbt-core/pull/4124))
3  .changes/1.0.4.md  Normal file
@@ -0,0 +1,3 @@
## dbt-core 1.0.4 - March 18, 2022
### Fixes
- Depend on new dbt-extractor version with fixed GitHub links to resolve Homebrew installation issues ([#4891](https://github.com/dbt-labs/dbt-core/issues/4891), [#4890](https://github.com/dbt-labs/dbt-core/pull/4890))
20  .changes/1.0.5.md  Normal file
@@ -0,0 +1,20 @@
## dbt-core 1.0.5 - April 20, 2022
### Fixes
- Fix bug causing empty node level meta, snapshot config errors ([#4459](https://github.com/dbt-labs/dbt-core/issues/4459), [#4726](https://github.com/dbt-labs/dbt-core/pull/4726))
- Support click versions in the v7.x series ([#4566](https://github.com/dbt-labs/dbt-core/issues/4566), [#4681](https://github.com/dbt-labs/dbt-core/pull/4681))
- Fixed a bug where nodes that depend on multiple macros couldn't be selected using `-s state:modified` ([#4678](https://github.com/dbt-labs/dbt-core/issues/4678), [#4820](https://github.com/dbt-labs/dbt-core/pull/4820))
- Catch all Requests Exceptions on deps install to attempt retries. Also log the exceptions hit. ([#4849](https://github.com/dbt-labs/dbt-core/issues/4849), [#4865](https://github.com/dbt-labs/dbt-core/pull/4865))
- Fix partial parsing bug with multiple snapshot blocks ([#4771](https://github.com/dbt-labs/dbt-core/issues/4771), [#4773](https://github.com/dbt-labs/dbt-core/pull/4773))
- Use cli_vars instead of context to create package and selector renderers ([#4876](https://github.com/dbt-labs/dbt-core/issues/4876), [#4878](https://github.com/dbt-labs/dbt-core/pull/4878))
- Catch more cases to retry package retrieval for deps pointing to the hub. Also start to cache the package requests. ([#4849](https://github.com/dbt-labs/dbt-core/issues/4849), [#4982](https://github.com/dbt-labs/dbt-core/pull/4982))
- Relax minimum supported version of MarkupSafe ([#4745](https://github.com/dbt-labs/dbt-core/issues/4745), [#5039](https://github.com/dbt-labs/dbt-core/pull/5039))
### Under the Hood
- Automate changelog generation with changie ([#4652](https://github.com/dbt-labs/dbt-core/issues/4652), [#4743](https://github.com/dbt-labs/dbt-core/pull/4743))
- Fix broken links for changelog generation and tweak GHA to only post a comment once when changelog entry is missing ([#4848](https://github.com/dbt-labs/dbt-core/issues/4848), [#4857](https://github.com/dbt-labs/dbt-core/pull/4857))
### Docs
- Resolve errors related to operations preventing DAG from generating in the docs. Also patch a spark issue to allow search to filter accurately past the missing columns. ([#4578](https://github.com/dbt-labs/dbt-core/issues/4578), [#4763](https://github.com/dbt-labs/dbt-core/pull/4763))
- backporting performance regression testing readme ([#4904](https://github.com/dbt-labs/dbt-core/issues/4904), [#5042](https://github.com/dbt-labs/dbt-core/pull/5042))

### Contributors
- [@adamantike](https://github.com/adamantike) ([#5039](https://github.com/dbt-labs/dbt-core/pull/5039))
- [@twilly](https://github.com/twilly) ([#4681](https://github.com/dbt-labs/dbt-core/pull/4681))
8  .changes/1.0.6.md  Normal file
@@ -0,0 +1,8 @@
## dbt-core 1.0.6 - April 27, 2022
### Fixes
- Use yaml renderer (with target context) for rendering selectors ([#5131](https://github.com/dbt-labs/dbt-core/issues/5131), [#5136](https://github.com/dbt-labs/dbt-core/pull/5136))
- Fix retry logic to return values after initial try ([#5023](https://github.com/dbt-labs/dbt-core/issues/5023), [#5137](https://github.com/dbt-labs/dbt-core/pull/5137))
- Scrub secret env vars from CommandError in exception stacktrace ([#5151](https://github.com/dbt-labs/dbt-core/issues/5151), [#5152](https://github.com/dbt-labs/dbt-core/pull/5152))
### Under the Hood
- Move package deprecation check outside of package cache ([#5068](https://github.com/dbt-labs/dbt-core/issues/5068), [#5069](https://github.com/dbt-labs/dbt-core/pull/5069))
4  .changes/1.0.7.md  Normal file
@@ -0,0 +1,4 @@
## dbt-core 1.0.7 - May 20, 2022
### Dependencies
- Lowering networkx dependency range due to new version's breaking change ([#5254](https://github.com/dbt-labs/dbt-core/issues/5254), [#5280](https://github.com/dbt-labs/dbt-core/pull/5280))
4  .changes/1.0.8.md  Normal file
@@ -0,0 +1,4 @@
## dbt-core 1.0.8 - June 15, 2022
### Under the Hood
- Update context readme + clean up context code ([#4796](https://github.com/dbt-labs/dbt-core/issues/4796), [#5334](https://github.com/dbt-labs/dbt-core/pull/5334))
40  .changes/README.md  Normal file
@@ -0,0 +1,40 @@
# CHANGELOG Automation

We use [changie](https://changie.dev/) to automate `CHANGELOG` generation. For installation and format/command specifics, see the documentation.

### Quick Tour

- All new change entries get generated under `/.changes/unreleased` as a yaml file
- `header.tpl.md` contains the contents of the header for the entire CHANGELOG file
- `0.0.0.md` contains the contents of the footer for the entire CHANGELOG file. changie looks to be in the process of supporting a footer file the same as it supports a header file. Switch to that when available. For now, the 0.0.0 in the file name forces it to the bottom of the changelog no matter what version we are releasing.
- `.changie.yaml` contains the fields in a change, the format of a single change, as well as the format of the Contributors section for each version.

### Workflow

#### Daily workflow
Almost every code change we make associated with an issue will require a `CHANGELOG` entry. After you have created the PR in GitHub, run `changie new` and follow the command prompts to generate a yaml file with your change details. This only needs to be done once per PR.

The `changie new` command will ensure correct file format and file name. There is a one-to-one mapping of issues to changes. Multiple issues cannot be lumped into a single entry. If you make a mistake, the yaml file may be directly modified and saved as long as the format is preserved.

Note: If your PR has been cleared by the Core Team as not needing a changelog entry, the `Skip Changelog` label may be put on the PR to bypass the GitHub action that blocks PRs from being merged when they are missing a `CHANGELOG` entry.

#### Prerelease Workflow
These commands batch up changes in `/.changes/unreleased` to be included in this prerelease and move those files to a directory named for the release version. The `--move-dir` will be created if it does not exist and is created in `/.changes`.

```
changie batch <version> --move-dir '<version>' --prerelease 'rc1'
changie merge
```

#### Final Release Workflow
These commands batch up changes in `/.changes/unreleased` as well as `/.changes/<version>` to be included in this final release and delete all prereleases. This rolls all prereleases up into a single final release. All `yaml` files in `/unreleased` and `<version>` will be deleted at this point.

```
changie batch <version> --include '<version>' --remove-prereleases
changie merge
```

### A Note on Manual Edits & Gotchas
- Changie generates markdown files in the `.changes` directory that are parsed together with the `changie merge` command. Every time `changie merge` is run, it regenerates the entire file. For this reason, any changes made directly to `CHANGELOG.md` will be overwritten on the next run of `changie merge`.
- If changes need to be made to the `CHANGELOG.md`, make the changes to the relevant `<version>.md` file located in the `/.changes` directory. You will then run `changie merge` to regenerate the `CHANGELOG.md`.
- Do not run `changie batch` again on released versions. Our final release workflow deletes all of the yaml files associated with individual changes. If for some reason modifications to the `CHANGELOG.md` are required after we've generated the final release `CHANGELOG.md`, the modifications need to be done manually to the `<version>.md` file in the `/.changes` directory.
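As an illustrative aside (not part of the diffed file): to make the daily workflow above concrete, here is a sketch of what a generated entry under `/.changes/unreleased` might look like. The kind and custom field names come from the `.changie.yaml` shown later in this diff; the body, issue, and PR numbers are borrowed from the 1.0.6 entries above purely for illustration, and the file name and timestamp are hypothetical, since changie derives the real ones from the change kind and the current time.

```yaml
# Hypothetical file name: .changes/unreleased/Fixes-20220615-120000.yaml
kind: Fixes
body: Fix retry logic to return values after initial try
time: 2022-06-15T12:00:00.000000+00:00   # illustrative timestamp; changie fills this in
custom:
  Author: your-github-username   # GitHub username(s), space-separated if multiple
  Issue: 5023                    # GitHub issue number (example taken from 1.0.6 above)
  PR: 5137                       # GitHub pull request number (example taken from 1.0.6 above)
```

This is the shape the `Check Changelog Entry` workflow later in this diff looks for when it filters on `.changes/unreleased/**.yaml`.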
6  .changes/header.tpl.md  Executable file
@@ -0,0 +1,6 @@
# dbt Core Changelog

- This file provides a full account of all changes to `dbt-core` and `dbt-postgres`
- Changes are listed under the (pre)release in which they first appear. Subsequent releases include changes from previous releases.
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry)
0  core/dbt/include/starter_project/data/.gitkeep → .changes/unreleased/.gitkeep  Normal file → Executable file
60  .changie.yaml  Executable file
@@ -0,0 +1,60 @@
changesDir: .changes
unreleasedDir: unreleased
headerPath: header.tpl.md
versionHeaderPath: ""
changelogPath: CHANGELOG.md
versionExt: md
versionFormat: '## dbt-core {{.Version}} - {{.Time.Format "January 02, 2006"}}'
kindFormat: '### {{.Kind}}'
changeFormat: '- {{.Body}} ([#{{.Custom.Issue}}](https://github.com/dbt-labs/dbt-core/issues/{{.Custom.Issue}}), [#{{.Custom.PR}}](https://github.com/dbt-labs/dbt-core/pull/{{.Custom.PR}}))'
kinds:
  - label: Fixes
  - label: Features
  - label: Under the Hood
  - label: Breaking Changes
  - label: Docs
  - label: Dependencies
custom:
  - key: Author
    label: GitHub Username(s) (separated by a single space if multiple)
    type: string
    minLength: 3
  - key: Issue
    label: GitHub Issue Number
    type: int
    minLength: 4
  - key: PR
    label: GitHub Pull Request Number
    type: int
    minLength: 4
footerFormat: |
  {{- $contributorDict := dict }}
  {{- /* any names added to this list should be all lowercase for later matching purposes */}}
  {{- $core_team := list "emmyoop" "nathaniel-may" "gshank" "leahwicz" "chenyulinx" "stu-k" "iknox-fa" "versusfacit" "mcknight-42" "jtcohen6" "dependabot" }}
  {{- range $change := .Changes }}
  {{- $authorList := splitList " " $change.Custom.Author }}
  {{- /* loop through all authors for a PR */}}
  {{- range $author := $authorList }}
  {{- $authorLower := lower $author }}
  {{- /* we only want to include non-core team contributors */}}
  {{- if not (has $authorLower $core_team)}}
  {{- $pr := $change.Custom.PR }}
  {{- /* check if this contributor has other PRs associated with them already */}}
  {{- if hasKey $contributorDict $author }}
  {{- $prList := get $contributorDict $author }}
  {{- $prList = append $prList $pr }}
  {{- $contributorDict := set $contributorDict $author $prList }}
  {{- else }}
  {{- $prList := list $change.Custom.PR }}
  {{- $contributorDict := set $contributorDict $author $prList }}
  {{- end }}
  {{- end}}
  {{- end}}
  {{- end }}
  {{- /* no indentation here for formatting so the final markdown doesn't have unneeded indentations */}}
  {{- if $contributorDict}}
  ### Contributors
  {{- range $k,$v := $contributorDict }}
  - [@{{$k}}](https://github.com/{{$k}}) ({{ range $index, $element := $v }}{{if $index}}, {{end}}[#{{$element}}](https://github.com/dbt-labs/dbt-core/pull/{{$element}}){{end}})
  {{- end }}
  {{- end }}
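As an illustrative aside (not part of the diffed file): for a release that includes outside contributors, the `footerFormat` template above renders a Contributors block into the generated `<version>.md`, in the same shape as the one already visible in `.changes/1.0.5.md`, for example:

```markdown
### Contributors
- [@adamantike](https://github.com/adamantike) ([#5039](https://github.com/dbt-labs/dbt-core/pull/5039))
- [@twilly](https://github.com/twilly) ([#4681](https://github.com/dbt-labs/dbt-core/pull/4681))
```

Authors listed in `$core_team` are filtered out, and a contributor with multiple PRs has them joined onto a single line.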
2  .github/pull_request_template.md  vendored
@@ -18,4 +18,4 @@ resolves #
 - [ ] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements)
 - [ ] I have run this code in development and it appears to resolve the stated issue
 - [ ] This PR includes tests, or tests are not required/relevant for this PR
-- [ ] I have updated the `CHANGELOG.md` and added information about my change
+- [ ] I have added information about my change to be included in the [CHANGELOG](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#Adding-CHANGELOG-Entry).
76  .github/workflows/changelog-check.yml  vendored  Normal file
@@ -0,0 +1,76 @@
# **what?**
# Checks that a file has been committed under the /.changes directory
# as a new CHANGELOG entry. Cannot check for a specific filename as
# it is dynamically generated by change type and timestamp.
# This workflow should not require any secrets since it runs for PRs
# from forked repos.
# By default, secrets are not passed to workflows running from
# a forked repo.

# **why?**
# Ensure code change gets reflected in the CHANGELOG.

# **when?**
# This will run for all PRs going into main and *.latest.

name: Check Changelog Entry

on:
  pull_request:
  workflow_dispatch:

defaults:
  run:
    shell: bash

permissions:
  contents: read
  pull-requests: write

env:
  changelog_comment: 'Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry).'

jobs:
  changelog:
    name: changelog

    runs-on: ubuntu-latest

    steps:
      - name: Check if changelog file was added
        # https://github.com/marketplace/actions/paths-changes-filter
        # For each filter, it sets output variable named by the filter to the text:
        #   'true' - if any of changed files matches any of filter rules
        #   'false' - if none of changed files matches any of filter rules
        # also, returns:
        #   `changes` - JSON array with names of all filters matching any of the changed files
        uses: dorny/paths-filter@v2
        id: filter
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          filters: |
            changelog:
              - added: '.changes/unreleased/**.yaml'
      - name: Check if comment already exists
        uses: peter-evans/find-comment@v1
        id: changelog_comment
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: 'github-actions[bot]'
          body-includes: ${{ env.changelog_comment }}
      - name: Create PR comment if changelog entry is missing, required, and does not exist
        if: |
          steps.filter.outputs.changelog == 'false' &&
          !contains( github.event.pull_request.labels.*.name, 'Skip Changelog') &&
          steps.changelog_comment.outputs.comment-body == ''
        uses: peter-evans/create-or-update-comment@v1
        with:
          issue-number: ${{ github.event.pull_request.number }}
          body: ${{ env.changelog_comment }}
      - name: Fail job if changelog entry is missing and required
        if: |
          steps.filter.outputs.changelog == 'false' &&
          !contains( github.event.pull_request.labels.*.name, 'Skip Changelog')
        uses: actions/github-script@v6
        with:
          script: core.setFailed('Changelog entry required to merge.')
6  .github/workflows/integration.yml  vendored
@@ -172,9 +172,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install tox
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install tox
+         python -m pip --version
          tox --version

      - name: Run tox (postgres)
28  .github/workflows/main.yml  vendored
@@ -61,9 +61,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install tox
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install tox
+         python -m pip --version
          tox --version

      - name: Run tox
@@ -96,9 +96,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install tox
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install tox
+         python -m pip --version
          tox --version

      - name: Run tox
@@ -133,9 +133,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install --upgrade setuptools wheel twine check-wheel-contents
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install --upgrade setuptools wheel twine check-wheel-contents
+         python -m pip --version

      - name: Build distributions
        run: ./scripts/build-dist.sh
@@ -177,9 +177,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install --upgrade wheel
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install --upgrade wheel
+         python -m pip --version

      - uses: actions/download-artifact@v2
        with:
@@ -191,7 +191,7 @@ jobs:

      - name: Install wheel distributions
        run: |
-         find ./dist/*.whl -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
+         find ./dist/*.whl -maxdepth 1 -type f | xargs python -m pip install --force-reinstall --find-links=dist/

      - name: Check wheel distributions
        run: |
@@ -200,7 +200,7 @@ jobs:
      - name: Install source distributions
        # ignore dbt-1.0.0, which intentionally raises an error when installed from source
        run: |
-         find ./dist/dbt-[a-z]*.gz -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
+         find ./dist/dbt-[a-z]*.gz -maxdepth 1 -type f | xargs python -m pip install --force-reinstall --find-links=dist/

      - name: Check source distributions
        run: |
26  .github/workflows/release.yml  vendored
@@ -47,9 +47,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install tox
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install tox
+         python -m pip --version
          tox --version

      - name: Run tox
@@ -74,9 +74,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install --upgrade setuptools wheel twine check-wheel-contents
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install --upgrade setuptools wheel twine check-wheel-contents
+         python -m pip --version

      - name: Build distributions
        run: ./scripts/build-dist.sh
@@ -95,7 +95,9 @@ jobs:
      - uses: actions/upload-artifact@v2
        with:
          name: dist
-         path: dist/
+         path: |
+           dist/
+           !dist/dbt-${{github.event.inputs.version_number}}.tar.gz

  test-build:
    name: verify packages
@@ -112,9 +114,9 @@ jobs:

      - name: Install python dependencies
        run: |
-         pip install --user --upgrade pip
-         pip install --upgrade wheel
-         pip --version
+         python -m pip install --user --upgrade pip
+         python -m pip install --upgrade wheel
+         python -m pip --version

      - uses: actions/download-artifact@v2
        with:
@@ -126,7 +128,7 @@ jobs:

      - name: Install wheel distributions
        run: |
-         find ./dist/*.whl -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
+         find ./dist/*.whl -maxdepth 1 -type f | xargs python -m pip install --force-reinstall --find-links=dist/

      - name: Check wheel distributions
        run: |
@@ -134,7 +136,7 @@ jobs:

      - name: Install source distributions
        run: |
-         find ./dist/*.gz -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
+         find ./dist/*.gz -maxdepth 1 -type f | xargs python -m pip install --force-reinstall --find-links=dist/

      - name: Check source distributions
        run: |
71  .github/workflows/structured-logging-schema-check.yml  vendored  Normal file
@@ -0,0 +1,71 @@
# This Action makes a dbt run to sample json structured logs
# and checks that they conform to the currently documented schema.
#
# If this action fails it either means we have unintentionally deviated
# from our documented structured logging schema, or we need to bump the
# version of our structured logging and add new documentation to
# communicate these changes.

name: Structured Logging Schema Check
on:
  push:
    branches:
      - "main"
      - "*.latest"
      - "releases/*"
  pull_request:
  workflow_dispatch:

permissions: read-all

jobs:
  # run the performance measurements on the current or default branch
  test-schema:
    name: Test Log Schema
    runs-on: ubuntu-latest
    env:
      # turns warnings into errors
      RUSTFLAGS: "-D warnings"
      # points tests to the log file
      LOG_DIR: "/home/runner/work/dbt-core/dbt-core/logs"
      # tells integration tests to output into json format
      DBT_LOG_FORMAT: 'json'
    steps:

      - name: checkout dev
        uses: actions/checkout@v2
        with:
          persist-credentials: false

      - name: Setup Python
        uses: actions/setup-python@v2.2.2
        with:
          python-version: "3.8"

      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true

      - name: install dbt
        run: python -m pip install -r dev-requirements.txt -r editable-requirements.txt

      - name: Set up postgres
        uses: ./.github/actions/setup-postgres-linux

      - name: ls
        run: ls

      # integration tests generate a ton of logs in different files. the next step will find them all.
      # we actually care if these pass, because the normal test run doesn't usually include many json log outputs
      - name: Run integration tests
        run: tox -e py38-postgres -- -nauto

      # apply our schema tests to every log event from the previous step
      # skips any output that isn't valid json
      - uses: actions-rs/cargo@v1
        with:
          command: run
          args: --manifest-path test/interop/log_parsing/Cargo.toml
8  .github/workflows/version-bump.yml  vendored
@@ -57,7 +57,7 @@ jobs:
        run: |
          python3 -m venv env
          source env/bin/activate
-         pip install --upgrade pip
+         python -m pip install --upgrade pip

      - name: Create PR branch
        if: ${{ steps.variables.outputs.IS_DRY_RUN == 'true' }}
@@ -69,14 +69,14 @@
      - name: Generate Docker requirements
        run: |
          source env/bin/activate
-         pip install -r requirements.txt
-         pip freeze -l > docker/requirements/requirements.txt
+         python -m pip install -r requirements.txt
+         python -m pip freeze -l > docker/requirements/requirements.txt
          git status

      - name: Bump version
        run: |
          source env/bin/activate
-         pip install -r dev-requirements.txt
+         python -m pip install -r dev-requirements.txt
          env/bin/bumpversion --allow-dirty --new-version ${{steps.variables.outputs.VERSION_NUMBER}} major
          git status
3406
CHANGELOG.md
Normal file → Executable file
File diff suppressed because it is too large
@@ -226,6 +226,15 @@ python -m pytest test/unit/test_graph.py::GraphTest::test__dependency_list
|
||||
```
|
||||
> [Here](https://docs.pytest.org/en/reorganize-docs/new-docs/user/commandlineuseful.html)
|
||||
> is a list of useful command-line options for `pytest` to use while developing.
|
||||
|
||||
## Adding CHANGELOG Entry
|
||||
|
||||
We use [changie](https://changie.dev) to generate `CHANGELOG` entries. Do not edit the `CHANGELOG.md` directly. Your modifications will be lost.
|
||||
|
||||
Follow the steps to [install `changie`](https://changie.dev/guide/installation/) for your system.
|
||||
|
||||
Once changie is installed and your PR is created, simply run `changie new` and changie will walk you through the process of creating a changelog entry. Commit the file that's created and your changelog entry is complete!
|
||||
|
||||
## Submitting a Pull Request
|
||||
|
||||
dbt Labs provides a CI environment to test changes to specific adapters, and periodic maintenance checks of `dbt-core` through Github Actions. For example, if you submit a pull request to the `dbt-redshift` repo, GitHub will trigger automated code checks and tests against Redshift.
|
||||
|
||||
@@ -39,7 +39,7 @@ from dbt.adapters.base.relation import (
|
||||
ComponentName, BaseRelation, InformationSchema, SchemaSearchMap
|
||||
)
|
||||
from dbt.adapters.base import Column as BaseColumn
|
||||
from dbt.adapters.cache import RelationsCache
|
||||
from dbt.adapters.cache import RelationsCache, _make_key
|
||||
|
||||
|
||||
SeedModel = Union[ParsedSeedNode, CompiledSeedNode]
|
||||
@@ -291,7 +291,7 @@ class BaseAdapter(metaclass=AdapterMeta):
|
||||
if (database, schema) not in self.cache:
|
||||
fire_event(
|
||||
CacheMiss(
|
||||
conn_name=self.nice_connection_name,
|
||||
conn_name=self.nice_connection_name(),
|
||||
database=database,
|
||||
schema=schema
|
||||
)
|
||||
@@ -676,7 +676,11 @@ class BaseAdapter(metaclass=AdapterMeta):
|
||||
relations = self.list_relations_without_caching(
|
||||
schema_relation
|
||||
)
|
||||
fire_event(ListRelations(database=database, schema=schema, relations=relations))
|
||||
fire_event(ListRelations(
|
||||
database=database,
|
||||
schema=schema,
|
||||
relations=[_make_key(x) for x in relations]
|
||||
))
|
||||
|
||||
return relations
|
||||
|
||||
|
||||
@@ -1,10 +1,10 @@
|
||||
import threading
|
||||
from collections import namedtuple
|
||||
from copy import deepcopy
|
||||
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple
|
||||
|
||||
from dbt.adapters.reference_keys import _make_key, _ReferenceKey
|
||||
import dbt.exceptions
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.functions import fire_event, fire_event_if
|
||||
from dbt.events.types import (
|
||||
AddLink,
|
||||
AddRelation,
|
||||
@@ -20,20 +20,9 @@ from dbt.events.types import (
|
||||
UncachedRelation,
|
||||
UpdateReference
|
||||
)
|
||||
import dbt.flags as flags
|
||||
from dbt.utils import lowercase
|
||||
|
||||
_ReferenceKey = namedtuple('_ReferenceKey', 'database schema identifier')
|
||||
|
||||
|
||||
def _make_key(relation) -> _ReferenceKey:
|
||||
"""Make _ReferenceKeys with lowercase values for the cache so we don't have
|
||||
to keep track of quoting
|
||||
"""
|
||||
# databases and schemas can both be None
|
||||
return _ReferenceKey(lowercase(relation.database),
|
||||
lowercase(relation.schema),
|
||||
lowercase(relation.identifier))
|
||||
|
||||
|
||||
def dot_separated(key: _ReferenceKey) -> str:
|
||||
"""Return the key in dot-separated string form.
|
||||
@@ -334,12 +323,12 @@ class RelationsCache:
|
||||
:param BaseRelation relation: The underlying relation.
|
||||
"""
|
||||
cached = _CachedRelation(relation)
|
||||
fire_event(AddRelation(relation=cached))
|
||||
fire_event(DumpBeforeAddGraph(dump=self.dump_graph()))
|
||||
fire_event(AddRelation(relation=_make_key(cached)))
|
||||
fire_event_if(flags.LOG_CACHE_EVENTS, lambda: DumpBeforeAddGraph(dump=self.dump_graph()))
|
||||
|
||||
with self.lock:
|
||||
self._setdefault(cached)
|
||||
fire_event(DumpAfterAddGraph(dump=self.dump_graph()))
|
||||
fire_event_if(flags.LOG_CACHE_EVENTS, lambda: DumpAfterAddGraph(dump=self.dump_graph()))
|
||||
|
||||
def _remove_refs(self, keys):
|
||||
"""Removes all references to all entries in keys. This does not
|
||||
@@ -452,8 +441,10 @@ class RelationsCache:
|
||||
old_key = _make_key(old)
|
||||
new_key = _make_key(new)
|
||||
fire_event(RenameSchema(old_key=old_key, new_key=new_key))
|
||||
|
||||
fire_event(DumpBeforeRenameSchema(dump=self.dump_graph()))
|
||||
fire_event_if(
|
||||
flags.LOG_CACHE_EVENTS,
|
||||
lambda: DumpBeforeRenameSchema(dump=self.dump_graph())
|
||||
)
|
||||
|
||||
with self.lock:
|
||||
if self._check_rename_constraints(old_key, new_key):
|
||||
@@ -461,7 +452,10 @@ class RelationsCache:
|
||||
else:
|
||||
self._setdefault(_CachedRelation(new))
|
||||
|
||||
fire_event(DumpAfterRenameSchema(dump=self.dump_graph()))
|
||||
fire_event_if(
|
||||
flags.LOG_CACHE_EVENTS,
|
||||
lambda: DumpAfterRenameSchema(dump=self.dump_graph())
|
||||
)
|
||||
|
||||
def get_relations(
|
||||
self, database: Optional[str], schema: Optional[str]
|
||||
|
||||
24
core/dbt/adapters/reference_keys.py
Normal file
@@ -0,0 +1,24 @@
|
||||
# this module exists to resolve circular imports with the events module
|
||||
|
||||
from collections import namedtuple
|
||||
from typing import Optional
|
||||
|
||||
|
||||
_ReferenceKey = namedtuple('_ReferenceKey', 'database schema identifier')
|
||||
|
||||
|
||||
def lowercase(value: Optional[str]) -> Optional[str]:
|
||||
if value is None:
|
||||
return None
|
||||
else:
|
||||
return value.lower()
|
||||
|
||||
|
||||
def _make_key(relation) -> _ReferenceKey:
|
||||
"""Make _ReferenceKeys with lowercase values for the cache so we don't have
|
||||
to keep track of quoting
|
||||
"""
|
||||
# databases and schemas can both be None
|
||||
return _ReferenceKey(lowercase(relation.database),
|
||||
lowercase(relation.schema),
|
||||
lowercase(relation.identifier))
|
||||
@@ -75,7 +75,8 @@ class SQLConnectionManager(BaseConnectionManager):
|
||||
|
||||
fire_event(
|
||||
SQLQueryStatus(
|
||||
status=str(self.get_response(cursor)), elapsed=round((time.time() - pre), 2)
|
||||
status=str(self.get_response(cursor)),
|
||||
elapsed=round((time.time() - pre), 2)
|
||||
)
|
||||
)
|
||||
|
||||
|
||||
@@ -5,6 +5,7 @@ import dbt.clients.agate_helper
|
||||
from dbt.contracts.connection import Connection
|
||||
import dbt.exceptions
|
||||
from dbt.adapters.base import BaseAdapter, available
|
||||
from dbt.adapters.cache import _make_key
|
||||
from dbt.adapters.sql import SQLConnectionManager
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import ColTypeChange, SchemaCreation, SchemaDrop
|
||||
@@ -182,7 +183,7 @@ class SQLAdapter(BaseAdapter):
|
||||
|
||||
def create_schema(self, relation: BaseRelation) -> None:
|
||||
relation = relation.without_identifier()
|
||||
fire_event(SchemaCreation(relation=relation))
|
||||
fire_event(SchemaCreation(relation=_make_key(relation)))
|
||||
kwargs = {
|
||||
'relation': relation,
|
||||
}
|
||||
@@ -193,7 +194,7 @@ class SQLAdapter(BaseAdapter):
|
||||
|
||||
def drop_schema(self, relation: BaseRelation) -> None:
|
||||
relation = relation.without_identifier()
|
||||
fire_event(SchemaDrop(relation=relation))
|
||||
fire_event(SchemaDrop(relation=_make_key(relation)))
|
||||
kwargs = {
|
||||
'relation': relation,
|
||||
}
|
||||
|
||||
@@ -13,6 +13,18 @@ from dbt.exceptions import RuntimeException
|
||||
BOM = BOM_UTF8.decode('utf-8') # '\ufeff'
|
||||
|
||||
|
||||
class Number(agate.data_types.Number):
|
||||
# undo the change in https://github.com/wireservice/agate/pull/733
|
||||
# i.e. do not cast True and False to numeric 1 and 0
|
||||
def cast(self, d):
|
||||
if type(d) == bool:
|
||||
raise agate.exceptions.CastError(
|
||||
'Do not cast True to 1 or False to 0.'
|
||||
)
|
||||
else:
|
||||
return super().cast(d)
|
||||
|
||||
|
||||
class ISODateTime(agate.data_types.DateTime):
|
||||
def cast(self, d):
|
||||
# this is agate.data_types.DateTime.cast with the "clever" bits removed
|
||||
@@ -41,7 +53,7 @@ def build_type_tester(
|
||||
) -> agate.TypeTester:
|
||||
|
||||
types = [
|
||||
agate.data_types.Number(null_values=('null', '')),
|
||||
Number(null_values=('null', '')),
|
||||
agate.data_types.Date(null_values=('null', ''),
|
||||
date_format='%Y-%m-%d'),
|
||||
agate.data_types.DateTime(null_values=('null', ''),
|
||||
|
||||
@@ -4,11 +4,21 @@ import os.path
|
||||
from dbt.clients.system import run_cmd, rmdir
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import (
|
||||
GitSparseCheckoutSubdirectory, GitProgressCheckoutRevision,
|
||||
GitProgressUpdatingExistingDependency, GitProgressPullingNewDependency,
|
||||
GitNothingToDo, GitProgressUpdatedCheckoutRange, GitProgressCheckedOutAt
|
||||
GitSparseCheckoutSubdirectory,
|
||||
GitProgressCheckoutRevision,
|
||||
GitProgressUpdatingExistingDependency,
|
||||
GitProgressPullingNewDependency,
|
||||
GitNothingToDo,
|
||||
GitProgressUpdatedCheckoutRange,
|
||||
GitProgressCheckedOutAt,
|
||||
)
|
||||
from dbt.exceptions import (
|
||||
CommandResultError,
|
||||
RuntimeException,
|
||||
bad_package_spec,
|
||||
raise_git_cloning_error,
|
||||
raise_git_cloning_problem,
|
||||
)
|
||||
import dbt.exceptions
|
||||
from packaging import version
|
||||
|
||||
|
||||
@@ -18,23 +28,23 @@ def _is_commit(revision: str) -> bool:
|
||||
|
||||
|
||||
def _raise_git_cloning_error(repo, revision, error):
|
||||
stderr = error.stderr.decode('utf-8').strip()
|
||||
if 'usage: git' in stderr:
|
||||
stderr = stderr.split('\nusage: git')[0]
|
||||
stderr = error.stderr.strip()
|
||||
if "usage: git" in stderr:
|
||||
stderr = stderr.split("\nusage: git")[0]
|
||||
if re.match("fatal: destination path '(.+)' already exists", stderr):
|
||||
raise error
|
||||
raise_git_cloning_error(error)
|
||||
|
||||
dbt.exceptions.bad_package_spec(repo, revision, stderr)
|
||||
bad_package_spec(repo, revision, stderr)
|
||||
|
||||
|
||||
def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirectory=None):
|
||||
has_revision = revision is not None
|
||||
is_commit = _is_commit(revision or "")
|
||||
|
||||
clone_cmd = ['git', 'clone', '--depth', '1']
|
||||
clone_cmd = ["git", "clone", "--depth", "1"]
|
||||
if subdirectory:
|
||||
fire_event(GitSparseCheckoutSubdirectory(subdir=subdirectory))
|
||||
out, _ = run_cmd(cwd, ['git', '--version'], env={'LC_ALL': 'C'})
|
||||
out, _ = run_cmd(cwd, ["git", "--version"], env={"LC_ALL": "C"})
|
||||
git_version = version.parse(re.search(r"\d+\.\d+\.\d+", out.decode("utf-8")).group(0))
|
||||
if not git_version >= version.parse("2.25.0"):
|
||||
# 2.25.0 introduces --sparse
|
||||
@@ -42,37 +52,37 @@ def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirec
|
||||
"Please update your git version to pull a dbt package "
|
||||
"from a subdirectory: your version is {}, >= 2.25.0 needed".format(git_version)
|
||||
)
|
||||
clone_cmd.extend(['--filter=blob:none', '--sparse'])
|
||||
clone_cmd.extend(["--filter=blob:none", "--sparse"])
|
||||
|
||||
if has_revision and not is_commit:
|
||||
clone_cmd.extend(['--branch', revision])
|
||||
clone_cmd.extend(["--branch", revision])
|
||||
|
||||
clone_cmd.append(repo)
|
||||
|
||||
if dirname is not None:
|
||||
clone_cmd.append(dirname)
|
||||
try:
|
||||
result = run_cmd(cwd, clone_cmd, env={'LC_ALL': 'C'})
|
||||
except dbt.exceptions.CommandResultError as exc:
|
||||
result = run_cmd(cwd, clone_cmd, env={"LC_ALL": "C"})
|
||||
except CommandResultError as exc:
|
||||
_raise_git_cloning_error(repo, revision, exc)
|
||||
|
||||
if subdirectory:
|
||||
cwd_subdir = os.path.join(cwd, dirname or '')
|
||||
clone_cmd_subdir = ['git', 'sparse-checkout', 'set', subdirectory]
|
||||
cwd_subdir = os.path.join(cwd, dirname or "")
|
||||
clone_cmd_subdir = ["git", "sparse-checkout", "set", subdirectory]
|
||||
try:
|
||||
run_cmd(cwd_subdir, clone_cmd_subdir)
|
||||
except dbt.exceptions.CommandResultError as exc:
|
||||
except CommandResultError as exc:
|
||||
_raise_git_cloning_error(repo, revision, exc)
|
||||
|
||||
if remove_git_dir:
|
||||
rmdir(os.path.join(dirname, '.git'))
|
||||
rmdir(os.path.join(dirname, ".git"))
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def list_tags(cwd):
|
||||
out, err = run_cmd(cwd, ['git', 'tag', '--list'], env={'LC_ALL': 'C'})
|
||||
tags = out.decode('utf-8').strip().split("\n")
|
||||
out, err = run_cmd(cwd, ["git", "tag", "--list"], env={"LC_ALL": "C"})
|
||||
tags = out.decode("utf-8").strip().split("\n")
|
||||
return tags
|
||||
|
||||
|
||||
@@ -84,44 +94,44 @@ def _checkout(cwd, repo, revision):
|
||||
if _is_commit(revision):
|
||||
run_cmd(cwd, fetch_cmd + [revision])
|
||||
else:
|
||||
run_cmd(cwd, ['git', 'remote', 'set-branches', 'origin', revision])
|
||||
run_cmd(cwd, ["git", "remote", "set-branches", "origin", revision])
|
||||
run_cmd(cwd, fetch_cmd + ["--tags", revision])
|
||||
|
||||
if _is_commit(revision):
|
||||
spec = revision
|
||||
# Prefer tags to branches if one exists
|
||||
elif revision in list_tags(cwd):
|
||||
spec = 'tags/{}'.format(revision)
|
||||
spec = "tags/{}".format(revision)
|
||||
else:
|
||||
spec = 'origin/{}'.format(revision)
|
||||
spec = "origin/{}".format(revision)
|
||||
|
||||
out, err = run_cmd(cwd, ['git', 'reset', '--hard', spec],
|
||||
env={'LC_ALL': 'C'})
|
||||
out, err = run_cmd(cwd, ["git", "reset", "--hard", spec], env={"LC_ALL": "C"})
|
||||
return out, err
|
||||
|
||||
|
||||
def checkout(cwd, repo, revision=None):
|
||||
if revision is None:
|
||||
revision = 'HEAD'
|
||||
revision = "HEAD"
|
||||
try:
|
||||
return _checkout(cwd, repo, revision)
|
||||
except dbt.exceptions.CommandResultError as exc:
|
||||
stderr = exc.stderr.decode('utf-8').strip()
|
||||
dbt.exceptions.bad_package_spec(repo, revision, stderr)
|
||||
except CommandResultError as exc:
|
||||
stderr = exc.stderr.strip()
|
||||
bad_package_spec(repo, revision, stderr)
|
||||
|
||||
|
||||
def get_current_sha(cwd):
|
||||
out, err = run_cmd(cwd, ['git', 'rev-parse', 'HEAD'], env={'LC_ALL': 'C'})
|
||||
out, err = run_cmd(cwd, ["git", "rev-parse", "HEAD"], env={"LC_ALL": "C"})
|
||||
|
||||
return out.decode('utf-8')
|
||||
return out.decode("utf-8")
|
||||
|
||||
|
||||
def remove_remote(cwd):
|
||||
return run_cmd(cwd, ['git', 'remote', 'rm', 'origin'], env={'LC_ALL': 'C'})
|
||||
return run_cmd(cwd, ["git", "remote", "rm", "origin"], env={"LC_ALL": "C"})
|
||||
|
||||
|
||||
def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
|
||||
revision=None, subdirectory=None):
|
||||
def clone_and_checkout(
|
||||
repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirectory=None
|
||||
):
|
||||
exists = None
|
||||
try:
|
||||
_, err = clone(
|
||||
@@ -131,14 +141,11 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
|
||||
remove_git_dir=remove_git_dir,
|
||||
subdirectory=subdirectory,
|
||||
)
|
||||
except dbt.exceptions.CommandResultError as exc:
|
||||
err = exc.stderr.decode('utf-8')
|
||||
except CommandResultError as exc:
|
||||
err = exc.stderr
|
||||
exists = re.match("fatal: destination path '(.+)' already exists", err)
|
||||
if not exists:
|
||||
print(
|
||||
'\nSomething went wrong while cloning {}'.format(repo) +
|
||||
'\nCheck the debug logs for more information')
|
||||
raise
|
||||
raise_git_cloning_problem(repo)
|
||||
|
||||
directory = None
|
||||
start_sha = None
|
||||
@@ -146,11 +153,9 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
|
||||
directory = exists.group(1)
|
||||
fire_event(GitProgressUpdatingExistingDependency(dir=directory))
|
||||
else:
|
||||
matches = re.match("Cloning into '(.+)'", err.decode('utf-8'))
|
||||
matches = re.match("Cloning into '(.+)'", err.decode("utf-8"))
|
||||
if matches is None:
|
||||
raise dbt.exceptions.RuntimeException(
|
||||
f'Error cloning {repo} - never saw "Cloning into ..." from git'
|
||||
)
|
||||
raise RuntimeException(f'Error cloning {repo} - never saw "Cloning into ..." from git')
|
||||
directory = matches.group(1)
|
||||
fire_event(GitProgressPullingNewDependency(dir=directory))
|
||||
full_path = os.path.join(cwd, directory)
|
||||
@@ -161,9 +166,9 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
|
||||
if start_sha == end_sha:
|
||||
fire_event(GitNothingToDo(sha=start_sha[:7]))
|
||||
else:
|
||||
fire_event(GitProgressUpdatedCheckoutRange(
|
||||
start_sha=start_sha[:7], end_sha=end_sha[:7]
|
||||
))
|
||||
fire_event(
|
||||
GitProgressUpdatedCheckoutRange(start_sha=start_sha[:7], end_sha=end_sha[:7])
|
||||
)
|
||||
else:
|
||||
fire_event(GitProgressCheckedOutAt(end_sha=end_sha[:7]))
|
||||
return os.path.join(directory, subdirectory or '')
|
||||
return os.path.join(directory, subdirectory or "")
|
||||
|
||||
@@ -1,9 +1,16 @@
|
||||
import functools
|
||||
from typing import Any, Dict, List
|
||||
import requests
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import (
|
||||
RegistryProgressMakingGETRequest,
|
||||
RegistryProgressGETResponse
|
||||
RegistryProgressGETResponse,
|
||||
RegistryIndexProgressMakingGETRequest,
|
||||
RegistryIndexProgressGETResponse,
|
||||
RegistryResponseUnexpectedType,
|
||||
RegistryResponseMissingTopKeys,
|
||||
RegistryResponseMissingNestedKeys,
|
||||
RegistryResponseExtraNestedKeys,
|
||||
)
|
||||
from dbt.utils import memoized, _connection_exception_retry as connection_exception_retry
|
||||
from dbt import deprecations
|
||||
@@ -15,51 +22,87 @@ else:
|
||||
DEFAULT_REGISTRY_BASE_URL = 'https://hub.getdbt.com/'
|
||||
|
||||
|
||||
def _get_url(url, registry_base_url=None):
|
||||
def _get_url(name, registry_base_url=None):
|
||||
if registry_base_url is None:
|
||||
registry_base_url = DEFAULT_REGISTRY_BASE_URL
|
||||
url = "api/v1/{}.json".format(name)
|
||||
|
||||
return '{}{}'.format(registry_base_url, url)
|
||||
|
||||
|
||||
def _get_with_retries(path, registry_base_url=None):
|
||||
get_fn = functools.partial(_get, path, registry_base_url)
|
||||
def _get_with_retries(package_name, registry_base_url=None):
|
||||
get_fn = functools.partial(_get, package_name, registry_base_url)
|
||||
return connection_exception_retry(get_fn, 5)
|
||||
|
||||
|
||||
def _get(path, registry_base_url=None):
|
||||
url = _get_url(path, registry_base_url)
|
||||
def _get(package_name, registry_base_url=None):
|
||||
url = _get_url(package_name, registry_base_url)
|
||||
fire_event(RegistryProgressMakingGETRequest(url=url))
|
||||
# all exceptions from requests get caught in the retry logic so no need to wrap this here
|
||||
resp = requests.get(url, timeout=30)
|
||||
fire_event(RegistryProgressGETResponse(url=url, resp_code=resp.status_code))
|
||||
resp.raise_for_status()
|
||||
if resp is None:
|
||||
raise requests.exceptions.ContentDecodingError(
|
||||
'Request error: The response is None', response=resp
|
||||
|
||||
# The response should always be a dictionary. Anything else is unexpected, raise error.
|
||||
# Raising this error will cause this function to retry (if called within _get_with_retries)
|
||||
# and hopefully get a valid response. This seems to happen when there's an issue with the Hub.
|
||||
# Since we control what we expect the HUB to return, this is safe.
|
||||
# See https://github.com/dbt-labs/dbt-core/issues/4577
|
||||
# and https://github.com/dbt-labs/dbt-core/issues/4849
|
||||
response = resp.json()
|
||||
|
||||
if not isinstance(response, dict): # This will also catch Nonetype
|
||||
error_msg = (
|
||||
f"Request error: Expected a response type of <dict> but got {type(response)} instead"
|
||||
)
|
||||
return resp.json()
|
||||
fire_event(RegistryResponseUnexpectedType(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
# check for expected top level keys
|
||||
expected_keys = {"name", "versions"}
|
||||
if not expected_keys.issubset(response):
|
||||
error_msg = (
|
||||
f"Request error: Expected the response to contain keys {expected_keys} "
|
||||
f"but is missing {expected_keys.difference(set(response))}"
|
||||
)
|
||||
fire_event(RegistryResponseMissingTopKeys(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
# check for the keys we need nested under each version
|
||||
expected_version_keys = {"name", "packages", "downloads"}
|
||||
all_keys = set().union(*(response["versions"][d] for d in response["versions"]))
|
||||
if not expected_version_keys.issubset(all_keys):
|
||||
error_msg = (
|
||||
"Request error: Expected the response for the version to contain keys "
|
||||
f"{expected_version_keys} but is missing {expected_version_keys.difference(all_keys)}"
|
||||
)
|
||||
fire_event(RegistryResponseMissingNestedKeys(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
# all version responses should contain identical keys.
|
||||
has_extra_keys = set().difference(*(response["versions"][d] for d in response["versions"]))
|
||||
if has_extra_keys:
|
||||
error_msg = (
|
||||
"Request error: Keys for all versions do not match. Found extra key(s) "
|
||||
f"of {has_extra_keys}."
|
||||
)
|
||||
fire_event(RegistryResponseExtraNestedKeys(response=response))
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
return response
|
||||
|
||||
|
||||
def index(registry_base_url=None):
|
||||
return _get_with_retries('api/v1/index.json', registry_base_url)
|
||||
_get_cached = memoized(_get_with_retries)
|
||||
|
||||
|
||||
index_cached = memoized(index)
|
||||
|
||||
|
||||
def packages(registry_base_url=None):
|
||||
return _get_with_retries('api/v1/packages.json', registry_base_url)
|
||||
|
||||
|
||||
def package(name, registry_base_url=None):
|
||||
response = _get_with_retries('api/v1/{}.json'.format(name), registry_base_url)
|
||||
|
||||
def package(package_name, registry_base_url=None) -> Dict[str, Any]:
|
||||
# returns a dictionary of metadata for all versions of a package
|
||||
response = _get_cached(package_name, registry_base_url)
|
||||
# Either redirectnamespace or redirectname in the JSON response indicate a redirect
|
||||
# redirectnamespace redirects based on package ownership
|
||||
# redirectname redirects based on package name
|
||||
# Both can be present at the same time, or neither. Fails gracefully to old name
|
||||
|
||||
if ('redirectnamespace' in response) or ('redirectname' in response):
|
||||
if ("redirectnamespace" in response) or ("redirectname" in response):
|
||||
|
||||
if ('redirectnamespace' in response) and response['redirectnamespace'] is not None:
|
||||
use_namespace = response['redirectnamespace']
|
||||
@@ -72,15 +115,49 @@ def package(name, registry_base_url=None):
|
||||
use_name = response['name']
|
||||
|
||||
new_nwo = use_namespace + "/" + use_name
|
||||
deprecations.warn('package-redirect', old_name=name, new_name=new_nwo)
|
||||
deprecations.warn("package-redirect", old_name=package_name, new_name=new_nwo)
|
||||
return response["versions"]
|
||||
|
||||
|
||||
def package_version(package_name, version, registry_base_url=None) -> Dict[str, Any]:
|
||||
# returns the metadata of a specific version of a package
|
||||
response = package(package_name, registry_base_url)
|
||||
return response[version]
|
||||
|
||||
|
||||
def get_available_versions(package_name) -> List["str"]:
|
||||
# returns a list of all available versions of a package
|
||||
response = package(package_name)
|
||||
return list(response)
|
||||
|
||||
|
||||
def _get_index(registry_base_url=None):
|
||||
|
||||
url = _get_url("index", registry_base_url)
|
||||
fire_event(RegistryIndexProgressMakingGETRequest(url=url))
|
||||
# all exceptions from requests get caught in the retry logic so no need to wrap this here
|
||||
resp = requests.get(url, timeout=30)
|
||||
fire_event(RegistryIndexProgressGETResponse(url=url, resp_code=resp.status_code))
|
||||
resp.raise_for_status()
|
||||
|
||||
# The response should be a list. Anything else is unexpected, raise an error.
|
||||
# Raising this error will cause this function to retry and hopefully get a valid response.
|
||||
|
||||
response = resp.json()
|
||||
|
||||
if not isinstance(response, list): # This will also catch Nonetype
|
||||
error_msg = (
|
||||
f"Request error: The response type of {type(response)} is not valid: {resp.text}"
|
||||
)
|
||||
raise requests.exceptions.ContentDecodingError(error_msg, response=resp)
|
||||
|
||||
return response
|
||||
|
||||
|
||||
def package_version(name, version, registry_base_url=None):
|
||||
return _get_with_retries('api/v1/{}/{}.json'.format(name, version), registry_base_url)
|
||||
def index(registry_base_url=None) -> List[str]:
|
||||
# this returns a list of all packages on the Hub
|
||||
get_index_fn = functools.partial(_get_index, registry_base_url)
|
||||
return connection_exception_retry(get_index_fn, 5)
|
||||
|
||||
|
||||
def get_available_versions(name):
|
||||
response = package(name)
|
||||
return list(response['versions'])
|
||||
index_cached = memoized(index)
|
||||
|
||||
@@ -485,7 +485,7 @@ def untar_package(
|
||||
) -> None:
|
||||
tar_path = convert_path(tar_path)
|
||||
tar_dir_name = None
|
||||
with tarfile.open(tar_path, 'r') as tarball:
|
||||
with tarfile.open(tar_path, 'r:gz') as tarball:
|
||||
tarball.extractall(dest_dir)
|
||||
tar_dir_name = os.path.commonprefix(tarball.getnames())
|
||||
if rename_to:
|
||||
|
||||
@@ -45,7 +45,7 @@ INVALID_VERSION_ERROR = """\
|
||||
This version of dbt is not supported with the '{package}' package.
|
||||
Installed version of dbt: {installed}
|
||||
Required version of dbt for '{package}': {version_spec}
|
||||
Check the requirements for the '{package}' package, or run dbt again with \
|
||||
Check for a different version of the '{package}' package, or run dbt again with \
|
||||
--no-version-check
|
||||
"""
|
||||
|
||||
@@ -54,7 +54,7 @@ IMPOSSIBLE_VERSION_ERROR = """\
|
||||
The package version requirement can never be satisfied for the '{package}
|
||||
package.
|
||||
Required versions of dbt for '{package}': {version_spec}
|
||||
Check the requirements for the '{package}' package, or run dbt again with \
|
||||
Check for a different version of the '{package}' package, or run dbt again with \
|
||||
--no-version-check
|
||||
"""
|
||||
|
||||
|
||||
@@ -1,14 +1,17 @@
|
||||
from typing import Dict, Any, Tuple, Optional, Union, Callable
|
||||
import re
|
||||
import os
|
||||
|
||||
from dbt.clients.jinja import get_rendered, catch_jinja
|
||||
from dbt.context.target import TargetContext
|
||||
from dbt.context.secret import SecretContext
|
||||
from dbt.context.secret import SecretContext, SECRET_PLACEHOLDER
|
||||
from dbt.context.base import BaseContext
|
||||
from dbt.contracts.connection import HasCredentials
|
||||
from dbt.exceptions import (
|
||||
DbtProjectError, CompilationException, RecursionException
|
||||
)
|
||||
from dbt.utils import deep_map_render
|
||||
from dbt.logger import SECRET_ENV_PREFIX
|
||||
|
||||
|
||||
Keypath = Tuple[Union[str, int], ...]
|
||||
@@ -122,11 +125,9 @@ class DbtProjectYamlRenderer(BaseRenderer):
|
||||
def name(self):
|
||||
'Project config'
|
||||
|
||||
# Uses SecretRenderer
|
||||
def get_package_renderer(self) -> BaseRenderer:
|
||||
return PackageRenderer(self.context)
|
||||
|
||||
def get_selector_renderer(self) -> BaseRenderer:
|
||||
return SelectorRenderer(self.context)
|
||||
return PackageRenderer(self.ctx_obj.cli_vars)
|
||||
|
||||
def render_project(
|
||||
self,
|
||||
@@ -144,8 +145,7 @@ class DbtProjectYamlRenderer(BaseRenderer):
|
||||
return package_renderer.render_data(packages)
|
||||
|
||||
def render_selectors(self, selectors: Dict[str, Any]):
|
||||
selector_renderer = self.get_selector_renderer()
|
||||
return selector_renderer.render_data(selectors)
|
||||
return self.render_data(selectors)
|
||||
|
||||
def render_entry(self, value: Any, keypath: Keypath) -> Any:
|
||||
result = super().render_entry(value, keypath)
|
||||
@@ -176,20 +176,10 @@ class DbtProjectYamlRenderer(BaseRenderer):
|
||||
return True
|
||||
|
||||
|
||||
class SelectorRenderer(BaseRenderer):
|
||||
@property
|
||||
def name(self):
|
||||
return 'Selector config'
|
||||
|
||||
|
||||
class SecretRenderer(BaseRenderer):
|
||||
def __init__(
|
||||
self, cli_vars: Optional[Dict[str, Any]] = None
|
||||
) -> None:
|
||||
def __init__(self, cli_vars: Dict[str, Any] = {}) -> None:
|
||||
# Generate contexts here because we want to save the context
|
||||
# object in order to retrieve the env_vars.
|
||||
if cli_vars is None:
|
||||
cli_vars = {}
|
||||
self.ctx_obj = SecretContext(cli_vars)
|
||||
context = self.ctx_obj.to_dict()
|
||||
super().__init__(context)
|
||||
@@ -198,6 +188,23 @@ class SecretRenderer(BaseRenderer):
|
||||
def name(self):
|
||||
return 'Secret'
|
||||
|
||||
def render_value(self, value: Any, keypath: Optional[Keypath] = None) -> Any:
|
||||
rendered = super().render_value(value, keypath)
|
||||
if SECRET_ENV_PREFIX in str(rendered):
|
||||
search_group = f"({SECRET_ENV_PREFIX}(.*))"
|
||||
pattern = SECRET_PLACEHOLDER.format(search_group).replace("$", r"\$")
|
||||
m = re.search(
|
||||
pattern,
|
||||
rendered,
|
||||
)
|
||||
if m:
|
||||
found = m.group(1)
|
||||
value = os.environ[found]
|
||||
replace_this = SECRET_PLACEHOLDER.format(found)
|
||||
return rendered.replace(replace_this, value)
|
||||
else:
|
||||
return rendered
|
||||
|
||||
|
||||
class ProfileRenderer(SecretRenderer):
|
||||
@property
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import itertools
|
||||
import os
|
||||
from copy import deepcopy
|
||||
from dataclasses import dataclass, fields
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import (
|
||||
Dict, Any, Optional, Mapping, Iterator, Iterable, Tuple, List, MutableSet,
|
||||
@@ -13,20 +13,17 @@ from .project import Project
|
||||
from .renderer import DbtProjectYamlRenderer, ProfileRenderer
|
||||
from .utils import parse_cli_vars
|
||||
from dbt import flags
|
||||
from dbt import tracking
|
||||
from dbt.adapters.factory import get_relation_class_by_name, get_include_paths
|
||||
from dbt.helper_types import FQNPath, PathSet
|
||||
from dbt.helper_types import FQNPath, PathSet, DictDefaultEmptyStr
|
||||
from dbt.config.profile import read_user_config
|
||||
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
|
||||
from dbt.contracts.graph.manifest import ManifestMetadata
|
||||
from dbt.contracts.relation import ComponentName
|
||||
from dbt.events.types import ProfileLoadError, ProfileNotFound
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.ui import warning_tag
|
||||
|
||||
from dbt.contracts.project import Configuration, UserConfig
|
||||
from dbt.exceptions import (
|
||||
RuntimeException,
|
||||
DbtProfileError,
|
||||
DbtProjectError,
|
||||
validator_error_message,
|
||||
warn_or_error,
|
||||
@@ -191,6 +188,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
|
||||
profile_renderer: ProfileRenderer,
|
||||
profile_name: Optional[str],
|
||||
) -> Profile:
|
||||
|
||||
return Profile.render_from_args(
|
||||
args, profile_renderer, profile_name
|
||||
)
|
||||
@@ -412,27 +410,18 @@ class UnsetCredentials(Credentials):
|
||||
return ()
|
||||
|
||||
|
||||
class UnsetConfig(UserConfig):
|
||||
def __getattribute__(self, name):
|
||||
if name in {f.name for f in fields(UserConfig)}:
|
||||
raise AttributeError(
|
||||
f"'UnsetConfig' object has no attribute {name}"
|
||||
)
|
||||
|
||||
def __post_serialize__(self, dct):
|
||||
return {}
|
||||
|
||||
|
||||
# This is used by UnsetProfileConfig, for commands which do
|
||||
# not require a profile, i.e. dbt deps and clean
|
||||
class UnsetProfile(Profile):
|
||||
def __init__(self):
|
||||
self.credentials = UnsetCredentials()
|
||||
self.user_config = UnsetConfig()
|
||||
self.user_config = UserConfig() # This will be read in _get_rendered_profile
|
||||
self.profile_name = ''
|
||||
self.target_name = ''
|
||||
self.threads = -1
|
||||
|
||||
def to_target_dict(self):
|
||||
return {}
|
||||
return DictDefaultEmptyStr({})
|
||||
|
||||
def __getattribute__(self, name):
|
||||
if name in {'profile_name', 'target_name', 'threads'}:
|
||||
@@ -443,6 +432,8 @@ class UnsetProfile(Profile):
|
||||
return Profile.__getattribute__(self, name)
|
||||
|
||||
|
||||
# This class is used by the dbt deps and clean commands, because they don't
|
||||
# require a functioning profile.
|
||||
@dataclass
|
||||
class UnsetProfileConfig(RuntimeConfig):
|
||||
"""This class acts a lot _like_ a RuntimeConfig, except if your profile is
|
||||
@@ -469,7 +460,7 @@ class UnsetProfileConfig(RuntimeConfig):
|
||||
|
||||
def to_target_dict(self):
|
||||
# re-override the poisoned profile behavior
|
||||
return {}
|
||||
return DictDefaultEmptyStr({})
|
||||
|
||||
@classmethod
|
||||
def from_parts(
|
||||
@@ -525,7 +516,7 @@ class UnsetProfileConfig(RuntimeConfig):
|
||||
profile_env_vars=profile.profile_env_vars,
|
||||
profile_name='',
|
||||
target_name='',
|
||||
user_config=UnsetConfig(),
|
||||
user_config=UserConfig(),
|
||||
threads=getattr(args, 'threads', 1),
|
||||
credentials=UnsetCredentials(),
|
||||
args=args,
|
||||
@@ -540,17 +531,12 @@ class UnsetProfileConfig(RuntimeConfig):
|
||||
profile_renderer: ProfileRenderer,
|
||||
profile_name: Optional[str],
|
||||
) -> Profile:
|
||||
try:
|
||||
profile = Profile.render_from_args(
|
||||
args, profile_renderer, profile_name
|
||||
)
|
||||
except (DbtProjectError, DbtProfileError) as exc:
|
||||
fire_event(ProfileLoadError(exc=exc))
|
||||
fire_event(ProfileNotFound(profile_name=profile_name))
|
||||
# return the poisoned form
|
||||
profile = UnsetProfile()
|
||||
# disable anonymous usage statistics
|
||||
tracking.disable_tracking()
|
||||
|
||||
profile = UnsetProfile()
|
||||
# The profile (for warehouse connection) is not needed, but we want
|
||||
# to get the UserConfig, which is also in profiles.yml
|
||||
user_config = read_user_config(flags.PROFILES_DIR)
|
||||
profile.user_config = user_config
|
||||
return profile
|
||||
|
||||
@classmethod
|
||||
@@ -565,9 +551,6 @@ class UnsetProfileConfig(RuntimeConfig):
|
||||
:raises ValidationException: If the cli variables are invalid.
|
||||
"""
|
||||
project, profile = cls.collect_parts(args)
|
||||
if not isinstance(profile, UnsetProfile):
|
||||
# if it's a real profile, return a real config
|
||||
cls = RuntimeConfig
|
||||
|
||||
return cls.from_parts(
|
||||
project=project,
|
||||
|
||||
@@ -5,7 +5,7 @@ from dbt.clients.yaml_helper import ( # noqa: F401
|
||||
)
|
||||
from dbt.dataclass_schema import ValidationError
|
||||
|
||||
from .renderer import SelectorRenderer
|
||||
from .renderer import BaseRenderer
|
||||
|
||||
from dbt.clients.system import (
|
||||
load_file_contents,
|
||||
@@ -60,8 +60,8 @@ class SelectorConfig(Dict[str, Dict[str, Union[SelectionSpec, bool]]]):
|
||||
def render_from_dict(
|
||||
cls,
|
||||
data: Dict[str, Any],
|
||||
renderer: SelectorRenderer,
|
||||
) -> 'SelectorConfig':
|
||||
renderer: BaseRenderer,
|
||||
) -> "SelectorConfig":
|
||||
try:
|
||||
rendered = renderer.render_data(data)
|
||||
except (ValidationError, RuntimeException) as exc:
|
||||
@@ -73,8 +73,10 @@ class SelectorConfig(Dict[str, Dict[str, Union[SelectionSpec, bool]]]):
|
||||
|
||||
@classmethod
|
||||
def from_path(
|
||||
cls, path: Path, renderer: SelectorRenderer,
|
||||
) -> 'SelectorConfig':
|
||||
cls,
|
||||
path: Path,
|
||||
renderer: BaseRenderer,
|
||||
) -> "SelectorConfig":
|
||||
try:
|
||||
data = load_yaml_text(load_file_contents(str(path)))
|
||||
except (ValidationError, RuntimeException) as exc:
|
||||
|
||||
51
core/dbt/context/README.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# Contexts and Jinja rendering
|
||||
|
||||
Contexts are used for Jinja rendering. They include context methods, executable macros, and various settings that are available in Jinja.
|
||||
|
||||
The most common entrypoint to Jinja rendering in dbt is a method named `get_rendered`, which takes two arguments: templated code (string), and a context used to render it (dictionary).
|
||||
|
||||
The context is the bundle of information that is in "scope" when rendering Jinja-templated code. For instance, imagine a simple Jinja template:
|
||||
```
|
||||
{% set new_value = some_macro(some_variable) %}
|
||||
```
|
||||
Both `some_macro()` and `some_variable` must be defined in that context. Otherwise, it will raise an error when rendering.
|
||||
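As a rough illustration (not part of this diff), the sketch below renders that same template through the `get_rendered` entrypoint against a hand-built context dictionary; the lambda standing in for `some_macro` and the value bound to `some_variable` are made-up placeholders.

```
# Minimal sketch, assuming the get_rendered(template, context) entrypoint
# described above; the context values are illustrative placeholders.
from dbt.clients.jinja import get_rendered

context = {
    "some_variable": "orders",
    # context "methods" are plain callables placed in the dictionary
    "some_macro": lambda name: f"analytics.{name}",
}

rendered = get_rendered(
    "{% set new_value = some_macro(some_variable) %}{{ new_value }}",
    context,
)
print(rendered)  # -> analytics.orders
```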
|
||||
Different contexts are used in different places because we allow access to different methods and data in different places. Executable SQL, for example, includes all available macros and the model being run. The variables and macros in scope for Jinja defined in yaml files is much more limited.
|
||||
|
||||
### Implementation
|
||||
|
||||
The context that is passed to Jinja is always in a dictionary format, not an actual class, so a `to_dict()` is executed on a context class before it is used for rendering.
|
||||
|
||||
Each context has a `generate_<name>_context` function to create the context. `ProviderContext` subclasses have different generate functions for parsing and for execution, so that certain functions (notably `ref`, `source`, and `config`) can return different results
|
||||
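A hedged sketch of that flow (not from this diff), assuming the `BaseContext(cli_vars)` constructor and the `to_dict()` method referenced in this README:

```
# Build a context object, flatten it with to_dict(), then render against it.
from dbt.clients.jinja import get_rendered
from dbt.context.base import BaseContext

ctx = BaseContext(cli_vars={"start_date": "2022-01-01"}).to_dict()
sql = get_rendered(
    "select * from events where created_at >= '{{ var('start_date') }}'",
    ctx,
)
# -> select * from events where created_at >= '2022-01-01'
```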
|
||||
### Hierarchy
|
||||
|
||||
All contexts inherit from the `BaseContext`, which includes "pure" methods (e.g. `tojson`), `env_var()`, and `var()` (but only CLI values, passed via `--vars`).
|
||||
|
||||
Methods available in parent contexts are also available in child contexts.
|
||||
|
||||
```
|
||||
BaseContext -- core/dbt/context/base.py
|
||||
SecretContext -- core/dbt/context/secret.py
|
||||
TargetContext -- core/dbt/context/target.py
|
||||
ConfiguredContext -- core/dbt/context/configured.py
|
||||
SchemaYamlContext -- core/dbt/context/configured.py
|
||||
DocsRuntimeContext -- core/dbt/context/configured.py
|
||||
MacroResolvingContext -- core/dbt/context/configured.py
|
||||
ManifestContext -- core/dbt/context/manifest.py
|
||||
QueryHeaderContext -- core/dbt/context/manifest.py
|
||||
ProviderContext -- core/dbt/context/provider.py
|
||||
MacroContext -- core/dbt/context/provider.py
|
||||
ModelContext -- core/dbt/context/provider.py
|
||||
TestContext -- core/dbt/context/provider.py
|
||||
```
|
||||
|
||||
### Contexts for configuration
|
||||
|
||||
Contexts for rendering "special" `.yml` (configuration) files:
|
||||
- `SecretContext`: Supports "secret" env vars, which are prefixed with `DBT_ENV_SECRET_`. Used for rendering in `profiles.yml` and `packages.yml` ONLY. Secrets defined elsewhere will raise explicit errors.
|
||||
- `TargetContext`: The same as `Base`, plus `target` (connection profile). Used most notably in `dbt_project.yml` and `selectors.yml`.
|
||||
|
||||
Contexts for other `.yml` files in the project:
|
||||
- `SchemaYamlContext`: Supports `vars` declared on the CLI and in `dbt_project.yml`. Does not support custom macros, beyond `var()` + `env_var()` methods. Used for all `.yml` files, to define properties and configuration.
|
||||
- `DocsRuntimeContext`: Standard `.yml` file context, plus `doc()` method (with all `docs` blocks in scope). Used to resolve `description` properties.
|
||||
@@ -25,38 +25,7 @@ import pytz
|
||||
import datetime
|
||||
import re
|
||||
|
||||
# Contexts in dbt Core
|
||||
# Contexts are used for Jinja rendering. They include context methods,
|
||||
# executable macros, and various settings that are available in Jinja.
|
||||
#
|
||||
# Different contexts are used in different places because we allow access
|
||||
# to different methods and data in different places. Executable SQL, for
|
||||
# example, includes the available macros and the model, while Jinja in
|
||||
# yaml files is more limited.
|
||||
#
|
||||
# The context that is passed to Jinja is always in a dictionary format,
|
||||
# not an actual class, so a 'to_dict()' is executed on a context class
|
||||
# before it is used for rendering.
|
||||
#
|
||||
# Each context has a generate_<name>_context function to create the context.
|
||||
# ProviderContext subclasses have different generate functions for
|
||||
# parsing and for execution.
|
||||
#
|
||||
# Context class hierarchy
|
||||
#
|
||||
# BaseContext -- core/dbt/context/base.py
|
||||
# SecretContext -- core/dbt/context/secret.py
|
||||
# TargetContext -- core/dbt/context/target.py
|
||||
# ConfiguredContext -- core/dbt/context/configured.py
|
||||
# SchemaYamlContext -- core/dbt/context/configured.py
|
||||
# DocsRuntimeContext -- core/dbt/context/configured.py
|
||||
# MacroResolvingContext -- core/dbt/context/configured.py
|
||||
# ManifestContext -- core/dbt/context/manifest.py
|
||||
# QueryHeaderContext -- core/dbt/context/manifest.py
|
||||
# ProviderContext -- core/dbt/context/provider.py
|
||||
# MacroContext -- core/dbt/context/provider.py
|
||||
# ModelContext -- core/dbt/context/provider.py
|
||||
# TestContext -- core/dbt/context/provider.py
|
||||
# See the `contexts` module README for more information on how contexts work
|
||||
|
||||
|
||||
def get_pytz_module_context() -> Dict[str, Any]:
|
||||
|
||||
@@ -1186,10 +1186,12 @@ class ProviderContext(ManifestContext):
|
||||
# If this is compiling, do not save because it's irrelevant to parsing.
|
||||
if self.model and not hasattr(self.model, 'compiled'):
|
||||
self.manifest.env_vars[var] = return_value
|
||||
source_file = self.manifest.files[self.model.file_id]
|
||||
# Schema files should never get here
|
||||
if source_file.parse_file_type != 'schema':
|
||||
source_file.env_vars.append(var)
|
||||
# hooks come from dbt_project.yml which doesn't have a real file_id
|
||||
if self.model.file_id in self.manifest.files:
|
||||
source_file = self.manifest.files[self.model.file_id]
|
||||
# Schema files should never get here
|
||||
if source_file.parse_file_type != 'schema':
|
||||
source_file.env_vars.append(var)
|
||||
return return_value
|
||||
else:
|
||||
msg = f"Env var required but not provided: '{var}'"
|
||||
|
||||
@@ -4,6 +4,10 @@ from typing import Any, Dict, Optional
|
||||
from .base import BaseContext, contextmember
|
||||
|
||||
from dbt.exceptions import raise_parsing_error
|
||||
from dbt.logger import SECRET_ENV_PREFIX
|
||||
|
||||
|
||||
SECRET_PLACEHOLDER = "$$$DBT_SECRET_START$$${}$$$DBT_SECRET_END$$$"
|
||||
|
||||
|
||||
class SecretContext(BaseContext):
|
||||
@@ -17,17 +21,29 @@ class SecretContext(BaseContext):
|
||||
|
||||
If the default is None, raise an exception for an undefined variable.
|
||||
|
||||
In this context *only*, env_var will return the actual values of
|
||||
env vars prefixed with DBT_ENV_SECRET_
|
||||
In this context *only*, env_var will accept env vars prefixed with DBT_ENV_SECRET_.
|
||||
It will return the name of the secret env var, wrapped in 'start' and 'end' identifiers.
|
||||
The actual value will be subbed in later in SecretRenderer.render_value()
|
||||
"""
|
||||
return_value = None
|
||||
if var in os.environ:
|
||||
|
||||
# if this is a 'secret' env var, just return the name of the env var
|
||||
# instead of rendering the actual value here, to avoid any risk of
|
||||
# Jinja manipulation. it will be subbed out later, in SecretRenderer.render_value
|
||||
if var in os.environ and var.startswith(SECRET_ENV_PREFIX):
|
||||
return SECRET_PLACEHOLDER.format(var)
|
||||
|
||||
elif var in os.environ:
|
||||
return_value = os.environ[var]
|
||||
elif default is not None:
|
||||
return_value = default
|
||||
|
||||
if return_value is not None:
|
||||
self.env_vars[var] = return_value
|
||||
# store env vars in the internal manifest to power partial parsing
|
||||
# if it's a 'secret' env var, we shouldn't even get here
|
||||
# but just to be safe — don't save secrets
|
||||
if not var.startswith(SECRET_ENV_PREFIX):
|
||||
self.env_vars[var] = return_value
|
||||
return return_value
|
||||
else:
|
||||
msg = f"Env var required but not provided: '{var}'"
|
||||
|
||||
@@ -153,7 +153,6 @@ class ParsedNodeMixins(dbtClassMixin):
|
||||
self.created_at = time.time()
|
||||
self.description = patch.description
|
||||
self.columns = patch.columns
|
||||
self.meta = patch.meta
|
||||
self.docs = patch.docs
|
||||
|
||||
def get_materialization(self):
|
||||
@@ -431,6 +430,10 @@ class ParsedSingularTestNode(ParsedNode):
|
||||
# refactor the various configs.
|
||||
config: TestConfig = field(default_factory=TestConfig) # type: ignore
|
||||
|
||||
@property
|
||||
def test_node_type(self):
|
||||
return 'singular'
|
||||
|
||||
|
||||
@dataclass
|
||||
class ParsedGenericTestNode(ParsedNode, HasTestMetadata):
|
||||
@@ -452,6 +455,10 @@ class ParsedGenericTestNode(ParsedNode, HasTestMetadata):
|
||||
True
|
||||
)
|
||||
|
||||
@property
|
||||
def test_node_type(self):
|
||||
return 'generic'
|
||||
|
||||
|
||||
@dataclass
|
||||
class IntermediateSnapshotNode(ParsedNode):
|
||||
|
||||
@@ -18,6 +18,18 @@ DEFAULT_SEND_ANONYMOUS_USAGE_STATS = True
|
||||
class Name(ValidatedStringMixin):
|
||||
ValidationRegex = r'^[^\d\W]\w*$'
|
||||
|
||||
@classmethod
|
||||
def is_valid(cls, value: Any) -> bool:
|
||||
if not isinstance(value, str):
|
||||
return False
|
||||
|
||||
try:
|
||||
cls.validate(value)
|
||||
except ValidationError:
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
|
||||
register_pattern(Name, r'^[^\d\W]\w*$')
|
||||
|
||||
@@ -231,7 +243,7 @@ class UserConfig(ExtensibleDbtClassMixin, Replaceable, UserConfigContract):
|
||||
printer_width: Optional[int] = None
|
||||
write_json: Optional[bool] = None
|
||||
warn_error: Optional[bool] = None
|
||||
log_format: Optional[bool] = None
|
||||
log_format: Optional[str] = None
|
||||
debug: Optional[bool] = None
|
||||
version_check: Optional[bool] = None
|
||||
fail_fast: Optional[bool] = None
|
||||
|
||||
@@ -14,7 +14,8 @@ class PreviousState:
|
||||
manifest_path = self.path / 'manifest.json'
|
||||
if manifest_path.exists() and manifest_path.is_file():
|
||||
try:
|
||||
self.manifest = WritableManifest.read(str(manifest_path))
|
||||
# we want to bail with an error if schema versions don't match
|
||||
self.manifest = WritableManifest.read_and_check_versions(str(manifest_path))
|
||||
except IncompatibleSchemaException as exc:
|
||||
exc.add_filename(str(manifest_path))
|
||||
raise
|
||||
@@ -22,7 +23,8 @@ class PreviousState:
|
||||
results_path = self.path / 'run_results.json'
|
||||
if results_path.exists() and results_path.is_file():
|
||||
try:
|
||||
self.results = RunResultsArtifact.read(str(results_path))
|
||||
# we want to bail with an error if schema versions don't match
|
||||
self.results = RunResultsArtifact.read_and_check_versions(str(results_path))
|
||||
except IncompatibleSchemaException as exc:
|
||||
exc.add_filename(str(results_path))
|
||||
raise
|
||||
|
||||
@@ -9,6 +9,7 @@ from dbt.clients.system import write_json, read_json
|
||||
from dbt.exceptions import (
|
||||
InternalException,
|
||||
RuntimeException,
|
||||
IncompatibleSchemaException
|
||||
)
|
||||
from dbt.version import __version__
|
||||
from dbt.events.functions import get_invocation_id
|
||||
@@ -158,6 +159,8 @@ def get_metadata_env() -> Dict[str, str]:
|
||||
}
|
||||
|
||||
|
||||
# This is used in the ManifestMetadata, RunResultsMetadata, RunOperationResultMetadata,
|
||||
# FreshnessMetadata, and CatalogMetadata classes
|
||||
@dataclasses.dataclass
|
||||
class BaseArtifactMetadata(dbtClassMixin):
|
||||
dbt_schema_version: str
|
||||
@@ -177,6 +180,17 @@ class BaseArtifactMetadata(dbtClassMixin):
|
||||
return dct
|
||||
|
||||
|
||||
# This is used as a class decorator to set the schema_version in the
|
||||
# 'dbt_schema_version' class attribute. (It's copied into the metadata objects.)
|
||||
# Name attributes of SchemaVersion in classes with the 'schema_version' decorator:
|
||||
# manifest
|
||||
# run-results
|
||||
# run-operation-result
|
||||
# sources
|
||||
# catalog
|
||||
# remote-compile-result
|
||||
# remote-execution-result
|
||||
# remote-run-result
|
||||
def schema_version(name: str, version: int):
|
||||
def inner(cls: Type[VersionedSchema]):
|
||||
cls.dbt_schema_version = SchemaVersion(
|
||||
@@ -187,6 +201,7 @@ def schema_version(name: str, version: int):
|
||||
return inner
|
||||
|
||||
|
||||
# This is used in the ArtifactMixin and RemoteResult classes
|
||||
@dataclasses.dataclass
|
||||
class VersionedSchema(dbtClassMixin):
|
||||
dbt_schema_version: ClassVar[SchemaVersion]
|
||||
@@ -198,6 +213,30 @@ class VersionedSchema(dbtClassMixin):
|
||||
result['$id'] = str(cls.dbt_schema_version)
|
||||
return result
|
||||
|
||||
@classmethod
|
||||
def read_and_check_versions(cls, path: str):
|
||||
try:
|
||||
data = read_json(path)
|
||||
except (EnvironmentError, ValueError) as exc:
|
||||
raise RuntimeException(
|
||||
f'Could not read {cls.__name__} at "{path}" as JSON: {exc}'
|
||||
) from exc
|
||||
|
||||
# Check metadata version. There is a class variable 'dbt_schema_version', but
|
||||
# that doesn't show up in artifacts, where it only exists in the 'metadata'
|
||||
# dictionary.
|
||||
if hasattr(cls, 'dbt_schema_version'):
|
||||
if 'metadata' in data and 'dbt_schema_version' in data['metadata']:
|
||||
previous_schema_version = data['metadata']['dbt_schema_version']
|
||||
# cls.dbt_schema_version is a SchemaVersion object
|
||||
if str(cls.dbt_schema_version) != previous_schema_version:
|
||||
raise IncompatibleSchemaException(
|
||||
expected=str(cls.dbt_schema_version),
|
||||
found=previous_schema_version
|
||||
)
|
||||
|
||||
return cls.from_dict(data) # type: ignore
|
||||
|
||||
|
||||
T = TypeVar('T', bound='ArtifactMixin')
|
||||
|
||||
@@ -205,6 +244,8 @@ T = TypeVar('T', bound='ArtifactMixin')
|
||||
# metadata should really be a Generic[T_M] where T_M is a TypeVar bound to
|
||||
# BaseArtifactMetadata. Unfortunately this isn't possible due to a mypy issue:
|
||||
# https://github.com/python/mypy/issues/7520
|
||||
# This is used in the WritableManifest, RunResultsArtifact, RunOperationResultsArtifact,
|
||||
# and CatalogArtifact
|
||||
@dataclasses.dataclass(init=False)
|
||||
class ArtifactMixin(VersionedSchema, Writable, Readable):
|
||||
metadata: BaseArtifactMetadata
|
||||
|
||||
@@ -36,9 +36,9 @@ class DBTDeprecation:
|
||||
if self.name not in active_deprecations:
|
||||
desc = self.description.format(**kwargs)
|
||||
msg = ui.line_wrap_message(
|
||||
desc, prefix='* Deprecation Warning:\n\n'
|
||||
desc, prefix='Deprecated functionality\n\n'
|
||||
)
|
||||
dbt.exceptions.warn_or_error(msg)
|
||||
dbt.exceptions.warn_or_error(msg, log_fmt=ui.warning_tag('{}'))
|
||||
self.track_deprecation_warn()
|
||||
active_deprecations.add(self.name)
|
||||
|
||||
@@ -62,7 +62,7 @@ class PackageInstallPathDeprecation(DBTDeprecation):
|
||||
|
||||
class ConfigPathDeprecation(DBTDeprecation):
|
||||
_description = '''\
|
||||
The `{deprecated_path}` config has been deprecated in favor of `{exp_path}`.
|
||||
The `{deprecated_path}` config has been renamed to `{exp_path}`.
|
||||
Please update your `dbt_project.yml` configuration to reflect this change.
|
||||
'''
|
||||
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import os
|
||||
import functools
|
||||
from typing import List
|
||||
|
||||
from dbt import semver
|
||||
@@ -14,6 +15,7 @@ from dbt.exceptions import (
|
||||
DependencyException,
|
||||
package_not_found,
|
||||
)
|
||||
from dbt.utils import _connection_exception_retry as connection_exception_retry
|
||||
|
||||
|
||||
class RegistryPackageMixin:
|
||||
@@ -68,9 +70,28 @@ class RegistryPinnedPackage(RegistryPackageMixin, PinnedPackage):
|
||||
system.make_directory(os.path.dirname(tar_path))
|
||||
|
||||
download_url = metadata.downloads.tarball
|
||||
system.download_with_retries(download_url, tar_path)
|
||||
deps_path = project.packages_install_path
|
||||
package_name = self.get_project_name(project, renderer)
|
||||
|
||||
download_untar_fn = functools.partial(
|
||||
self.download_and_untar,
|
||||
download_url,
|
||||
tar_path,
|
||||
deps_path,
|
||||
package_name
|
||||
)
|
||||
connection_exception_retry(download_untar_fn, 5)
|
||||
|
||||
def download_and_untar(self, download_url, tar_path, deps_path, package_name):
|
||||
"""
|
||||
Sometimes the download of the files fails and we want to retry. Sometimes the
|
||||
download appears successful but the file did not make it through as expected
|
||||
(generally due to a github incident). Either way we want to retry downloading
|
||||
and untarring to see if we can get a success. Call this within
|
||||
`_connection_exception_retry`
|
||||
"""
|
||||
|
||||
system.download(download_url, tar_path)
|
||||
system.untar_package(tar_path, deps_path, package_name)
|
||||
|
||||
|
||||
|
||||
@@ -6,7 +6,53 @@ The Events module is the implementation for structured logging. These events repr
|
||||
The event module provides types that represent what is happening in dbt in `events.types`. These types are intended to represent an exhaustive list of all things happening within dbt that will need to be logged, streamed, or printed. To fire an event, `events.functions::fire_event` is the entry point to the module from everywhere in dbt.
|
||||
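As a small illustration (not part of this diff), a call site looks like the ones touched elsewhere in this compare view, for example in `core/dbt/clients/git.py`:

```
# Hedged sketch of the fire_event entrypoint; GitNothingToDo is one of the
# event types exercised in core/dbt/clients/git.py in this diff.
from dbt.events.functions import fire_event
from dbt.events.types import GitNothingToDo

fire_event(GitNothingToDo(sha="abc1234"))
```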
|
||||
# Adding a New Event
In `events.types` add a new class that represents the new event. This may be a simple class with no values, or it may be a dataclass with some values to construct downstream messaging. Only include the data necessary to construct this message within this class. You must extend all destinations (e.g. - if your log message belongs on the cli, extend `CliEventABC`) as well as the loglevel this event belongs to.
In `events.types` add a new class that represents the new event. All events must be a dataclass with, at minimum, a code. You may also include some other values to construct downstream messaging. Only include the data necessary to construct this message within this class. You must extend all destinations (e.g. - if your log message belongs on the cli, extend `Cli`) as well as the loglevel this event belongs to. This system has been designed to take full advantage of mypy, so running it will catch anything you may miss.

## Required for Every Event

- a string attribute `code` that's unique across events
- assign a log level by extending `DebugLevel`, `InfoLevel`, `WarnLevel`, or `ErrorLevel`
- a `message()` method
- extend `File` and/or `Cli` based on where it should output

Example
```
@dataclass
class PartialParsingDeletedExposure(DebugLevel, Cli, File):
    unique_id: str
    code: str = "I049"

    def message(self) -> str:
        return f"Partial parsing: deleted exposure {self.unique_id}"

```

## Optional (based on your event)

- Events associated with node status changes must have `report_node_data` passed in and be extended with `NodeInfo`
- define `asdict` if your data is not serializable to json

Example
```
@dataclass
class SuperImportantNodeEvent(InfoLevel, File, NodeInfo):
    node_name: str
    run_result: RunResult
    report_node_data: ParsedModelNode  # may vary
    code: str = "Q036"

    def message(self) -> str:
        return f"{self.node_name} had overly verbose result of {self.run_result}"

    @classmethod
    def asdict(cls, data: list) -> dict:
        return dict((k, str(v)) for k, v in data)

```

All values other than `code` and `report_node_data` will be included in the `data` node of the json log output.

Once your event has been added, add a dummy call to your new event at the bottom of `types.py` and also add your new Event to the list `sample_values` in `test/unit/test_events.py`.
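For the first example above, the dummy call and the test entry would look roughly like this (a sketch; the `if 1 == 0:` guard mirrors the pattern already used at the bottom of `types.py`, which lets mypy check the constructor call without ever executing it, and `sample_values` is assumed to be a flat list of event instances):

```
# bottom of types.py
if 1 == 0:
    PartialParsingDeletedExposure(unique_id='')

# test/unit/test_events.py
sample_values = [
    # ... existing events ...
    PartialParsingDeletedExposure(unique_id=''),
]
```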

# Adapter Maintainers
To integrate existing log messages from adapters, you likely have a line of code like this in your adapter already:
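The snippet itself falls outside this hunk; in a typical pre-1.0 adapter it is an import of dbt's legacy global logger, along these lines (an assumed example, not taken from this diff):

```
# hypothetical legacy import in an adapter plugin
from dbt.logger import GLOBAL_LOGGER as logger
```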
@@ -1,7 +1,6 @@
|
||||
from abc import ABCMeta, abstractmethod, abstractproperty
|
||||
from dataclasses import dataclass
|
||||
from datetime import datetime
|
||||
import json
|
||||
import os
|
||||
import threading
|
||||
from typing import Any, Optional
|
||||
@@ -38,6 +37,11 @@ class ErrorLevel():
|
||||
return "error"
|
||||
|
||||
|
||||
class Cache():
|
||||
# Events with this class will only be logged when the `--log-cache-events` flag is passed
|
||||
pass
|
||||
|
||||
|
||||
@dataclass
|
||||
class Node():
|
||||
node_path: str
|
||||
@@ -70,6 +74,7 @@ class Event(metaclass=ABCMeta):
|
||||
# fields that should be on all events with their default implementations
|
||||
log_version: int = 1
|
||||
ts: Optional[datetime] = None # use getter for non-optional
|
||||
ts_rfc3339: Optional[str] = None # use getter for non-optional
|
||||
pid: Optional[int] = None # use getter for non-optional
|
||||
node_info: Optional[Node]
|
||||
|
||||
@@ -91,32 +96,20 @@ class Event(metaclass=ABCMeta):
|
||||
def message(self) -> str:
|
||||
raise Exception("msg not implemented for Event")
|
||||
|
||||
# override this method to convert non-json serializable fields to json.
|
||||
# for override examples, see existing concrete types.
|
||||
#
|
||||
# there is no type-level mechanism to have mypy enforce json serializability, so we just try
|
||||
# to serialize and raise an exception at runtime when that fails. This safety mechanism
|
||||
# only works if we have attempted to serialize every concrete event type in our tests.
|
||||
def fields_to_json(self, field_value: Any) -> Any:
|
||||
try:
|
||||
json.dumps(field_value, sort_keys=True)
|
||||
return field_value
|
||||
except TypeError:
|
||||
val_type = type(field_value).__name__
|
||||
event_type = type(self).__name__
|
||||
return Exception(
|
||||
f"type {val_type} is not serializable to json."
|
||||
f" First make sure that the call sites for {event_type} match the type hints"
|
||||
f" and if they do, you can override Event::fields_to_json in {event_type} in"
|
||||
" types.py to define your own serialization function to any valid json type"
|
||||
)
|
||||
|
||||
# exactly one time stamp per concrete event
|
||||
def get_ts(self) -> datetime:
|
||||
if not self.ts:
|
||||
self.ts = datetime.now()
|
||||
self.ts = datetime.utcnow()
|
||||
self.ts_rfc3339 = self.ts.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
|
||||
return self.ts
|
||||
|
||||
# preformatted time stamp
|
||||
def get_ts_rfc3339(self) -> str:
|
||||
if not self.ts_rfc3339:
|
||||
# get_ts() creates the formatted string too so all time logic is centralized
|
||||
self.get_ts()
|
||||
return self.ts_rfc3339 # type: ignore
|
||||
|
||||
# exactly one pid per concrete event
|
||||
def get_pid(self) -> int:
|
||||
if not self.pid:
|
||||
@@ -132,6 +125,21 @@ class Event(metaclass=ABCMeta):
|
||||
from dbt.events.functions import get_invocation_id
|
||||
return get_invocation_id()
|
||||
|
||||
# default dict factory for all events. can override on concrete classes.
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
d = dict()
|
||||
for k, v in data:
|
||||
# stringify all exceptions
|
||||
if isinstance(v, Exception) or isinstance(v, BaseException):
|
||||
d[k] = str(v)
|
||||
# skip all binary data
|
||||
elif isinstance(v, bytes):
|
||||
continue
|
||||
else:
|
||||
d[k] = v
|
||||
return d
|
||||
|
||||
|
||||
@dataclass # type: ignore
|
||||
class NodeInfo(Event, metaclass=ABCMeta):
|
||||
@@ -143,7 +151,7 @@ class NodeInfo(Event, metaclass=ABCMeta):
|
||||
node_name=self.report_node_data.name,
|
||||
unique_id=self.report_node_data.unique_id,
|
||||
resource_type=self.report_node_data.resource_type.value,
|
||||
materialized=self.report_node_data.config.materialized,
|
||||
materialized=self.report_node_data.config.get('materialized'),
|
||||
node_status=str(self.report_node_data._event_status.get('node_status')),
|
||||
node_started_at=self.report_node_data._event_status.get("started_at"),
|
||||
node_finished_at=self.report_node_data._event_status.get("finished_at")
|
||||
|
||||
@@ -2,8 +2,8 @@
|
||||
from colorama import Style
|
||||
from datetime import datetime
|
||||
import dbt.events.functions as this # don't worry I hate it too.
|
||||
from dbt.events.base_types import Cli, Event, File, ShowException, NodeInfo
|
||||
from dbt.events.types import EventBufferFull, T_Event
|
||||
from dbt.events.base_types import Cli, Event, File, ShowException, NodeInfo, Cache
|
||||
from dbt.events.types import EventBufferFull, T_Event, MainReportVersion, EmptyLine
|
||||
import dbt.flags as flags
|
||||
# TODO this will need to move eventually
|
||||
from dbt.logger import SECRET_ENV_PREFIX, make_log_dir_if_missing, GLOBAL_LOGGER
|
||||
@@ -13,19 +13,21 @@ from io import StringIO, TextIOWrapper
|
||||
import logbook
|
||||
import logging
|
||||
from logging import Logger
|
||||
import sys
|
||||
from logging.handlers import RotatingFileHandler
|
||||
import os
|
||||
import uuid
|
||||
import threading
|
||||
from typing import Any, Callable, Dict, List, Optional, Union
|
||||
import dataclasses
|
||||
from collections import deque
|
||||
|
||||
|
||||
# create the global event history buffer with a max size of 100k records
|
||||
# create the global event history buffer with the default max size (10k)
|
||||
# python 3.7 doesn't support type hints on globals, but mypy requires them. hence the ignore.
|
||||
# TODO: make the maxlen something configurable from the command line via args(?)
|
||||
# TODO the flags module has not yet been resolved when this is created
|
||||
global EVENT_HISTORY
|
||||
EVENT_HISTORY = deque(maxlen=100000) # type: ignore
|
||||
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE) # type: ignore
|
||||
|
||||
# create the global file logger with no configuration
|
||||
global FILE_LOG
|
||||
@@ -38,7 +40,7 @@ FILE_LOG.addHandler(null_handler)
|
||||
global STDOUT_LOG
|
||||
STDOUT_LOG = logging.getLogger('default_stdout')
|
||||
STDOUT_LOG.setLevel(logging.INFO)
|
||||
stdout_handler = logging.StreamHandler()
|
||||
stdout_handler = logging.StreamHandler(sys.stdout)
|
||||
stdout_handler.setLevel(logging.INFO)
|
||||
STDOUT_LOG.addHandler(stdout_handler)
|
||||
|
||||
@@ -48,6 +50,10 @@ invocation_id: Optional[str] = None
|
||||
|
||||
|
||||
def setup_event_logger(log_path, level_override=None):
|
||||
# flags have been resolved, and log_path is known
|
||||
global EVENT_HISTORY
|
||||
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE) # type: ignore
|
||||
|
||||
make_log_dir_if_missing(log_path)
|
||||
this.format_json = flags.LOG_FORMAT == 'json'
|
||||
# USE_COLORS can be None if the app just started and the cli flags
|
||||
@@ -64,7 +70,7 @@ def setup_event_logger(log_path, level_override=None):
|
||||
FORMAT = "%(message)s"
|
||||
stdout_passthrough_formatter = logging.Formatter(fmt=FORMAT)
|
||||
|
||||
stdout_handler = logging.StreamHandler()
|
||||
stdout_handler = logging.StreamHandler(sys.stdout)
|
||||
stdout_handler.setFormatter(stdout_passthrough_formatter)
|
||||
stdout_handler.setLevel(level)
|
||||
# clear existing stdout TextIOWrapper stream handlers
|
||||
@@ -80,7 +86,12 @@ def setup_event_logger(log_path, level_override=None):
|
||||
|
||||
file_passthrough_formatter = logging.Formatter(fmt=FORMAT)
|
||||
|
||||
file_handler = RotatingFileHandler(filename=log_dest, encoding='utf8')
|
||||
file_handler = RotatingFileHandler(
|
||||
filename=log_dest,
|
||||
encoding='utf8',
|
||||
maxBytes=10 * 1024 * 1024, # 10 mb
|
||||
backupCount=5
|
||||
)
|
||||
file_handler.setFormatter(file_passthrough_formatter)
|
||||
file_handler.setLevel(logging.DEBUG) # always debug regardless of user input
|
||||
this.FILE_LOG.handlers.clear()
|
||||
@@ -130,17 +141,25 @@ def event_to_serializable_dict(
|
||||
) -> Dict[str, Any]:
|
||||
data = dict()
|
||||
node_info = dict()
|
||||
if hasattr(e, '__dataclass_fields__'):
|
||||
for field, value in dataclasses.asdict(e).items(): # type: ignore[attr-defined]
|
||||
_json_value = e.fields_to_json(value)
|
||||
log_line = dict()
|
||||
try:
|
||||
log_line = dataclasses.asdict(e, dict_factory=type(e).asdict)
|
||||
except AttributeError:
|
||||
event_type = type(e).__name__
|
||||
raise Exception( # TODO this may hang async threads
|
||||
f"type {event_type} is not serializable to json."
|
||||
f" First make sure that the call sites for {event_type} match the type hints"
|
||||
f" and if they do, you can override the dataclass method `asdict` in {event_type} in"
|
||||
" types.py to define your own serialization function to a dictionary of valid json"
|
||||
" types"
|
||||
)
|
||||
|
||||
if isinstance(e, NodeInfo):
|
||||
node_info = dataclasses.asdict(e.get_node_info())
|
||||
if isinstance(e, NodeInfo):
|
||||
node_info = dataclasses.asdict(e.get_node_info())
|
||||
|
||||
if not isinstance(_json_value, Exception):
|
||||
data[field] = _json_value
|
||||
else:
|
||||
data[field] = f"JSON_SERIALIZE_FAILED: {type(value).__name__, 'NA'}"
|
||||
for field, value in log_line.items(): # type: ignore[attr-defined]
|
||||
if field not in ["code", "report_node_data"]:
|
||||
data[field] = value
|
||||
|
||||
event_dict = {
|
||||
'type': 'log_line',
|
||||
@@ -152,7 +171,8 @@ def event_to_serializable_dict(
|
||||
'data': data,
|
||||
'invocation_id': e.get_invocation_id(),
|
||||
'thread_name': e.get_thread_name(),
|
||||
'node_info': node_info
|
||||
'node_info': node_info,
|
||||
'code': e.code
|
||||
}
|
||||
|
||||
return event_dict
|
||||
@@ -161,35 +181,64 @@ def event_to_serializable_dict(
|
||||
# translates an Event to a completely formatted text-based log line
|
||||
# you have to specify which message you want. (i.e. - e.message, e.cli_msg(), e.file_msg())
|
||||
# type hinting everything as strings so we don't get any unintentional string conversions via str()
|
||||
def create_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
|
||||
def create_info_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
|
||||
color_tag: str = '' if this.format_color else Style.RESET_ALL
|
||||
ts: str = e.get_ts().strftime("%H:%M:%S")
|
||||
scrubbed_msg: str = scrub_secrets(msg_fn(e), env_secrets())
|
||||
log_line: str = f"{color_tag}{ts} {scrubbed_msg}"
|
||||
return log_line
|
||||
|
||||
|
||||
def create_debug_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
|
||||
log_line: str = ''
|
||||
# Create a separator if this is the beginning of an invocation
|
||||
if type(e) == MainReportVersion:
|
||||
separator = 30 * '='
|
||||
log_line = f'\n\n{separator} {e.get_ts()} | {get_invocation_id()} {separator}\n'
|
||||
color_tag: str = '' if this.format_color else Style.RESET_ALL
|
||||
ts: str = e.get_ts().strftime("%H:%M:%S.%f")
|
||||
scrubbed_msg: str = scrub_secrets(msg_fn(e), env_secrets())
|
||||
level: str = e.level_tag() if len(e.level_tag()) == 5 else f"{e.level_tag()} "
|
||||
log_line: str = f"{color_tag}{ts} | [ {level} ] | {scrubbed_msg}"
|
||||
thread = ''
|
||||
if threading.current_thread().getName():
|
||||
thread_name = threading.current_thread().getName()
|
||||
thread_name = thread_name[:10]
|
||||
thread_name = thread_name.ljust(10, ' ')
|
||||
thread = f' [{thread_name}]:'
|
||||
log_line = log_line + f"{color_tag}{ts} [{level}]{thread} {scrubbed_msg}"
|
||||
return log_line
|
||||
|
||||
|
||||
# translates an Event to a completely formatted json log line
|
||||
# you have to specify which message you want. (i.e. - e.message(), e.cli_msg(), e.file_msg())
|
||||
def create_json_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
|
||||
values = event_to_serializable_dict(e, lambda dt: dt.isoformat(), lambda x: msg_fn(x))
|
||||
def create_json_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> Optional[str]:
|
||||
if type(e) == EmptyLine:
|
||||
return None # will not be sent to logger
|
||||
# using preformatted string instead of formatting it here to be extra careful about timezone
|
||||
values = event_to_serializable_dict(e, lambda _: e.get_ts_rfc3339(), lambda x: msg_fn(x))
|
||||
raw_log_line = json.dumps(values, sort_keys=True)
|
||||
return scrub_secrets(raw_log_line, env_secrets())
|
||||
|
||||
|
||||
# calls create_text_log_line() or create_json_log_line() according to logger config
|
||||
def create_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
|
||||
return (
|
||||
create_json_log_line(e, msg_fn)
|
||||
if this.format_json else
|
||||
create_text_log_line(e, msg_fn)
|
||||
)
|
||||
# calls create_stdout_text_log_line() or create_json_log_line() according to logger config
|
||||
def create_log_line(
|
||||
e: T_Event,
|
||||
msg_fn: Callable[[T_Event], str],
|
||||
file_output=False
|
||||
) -> Optional[str]:
|
||||
if this.format_json:
|
||||
return create_json_log_line(e, msg_fn) # json output, both console and file
|
||||
elif file_output is True or flags.DEBUG:
|
||||
return create_debug_text_log_line(e, msg_fn) # default file output
|
||||
else:
|
||||
return create_info_text_log_line(e, msg_fn) # console output
|
||||
|
||||
|
||||
# allows for reuse of this obnoxious if-else tree.
|
||||
# do not use for exceptions, it doesn't pass along exc_info, stack_info, or extra
|
||||
def send_to_logger(l: Union[Logger, logbook.Logger], level_tag: str, log_line: str):
|
||||
if not log_line:
|
||||
return
|
||||
if level_tag == 'test':
|
||||
# TODO after implementing #3977 send to new test level
|
||||
l.debug(log_line)
|
||||
@@ -257,33 +306,46 @@ def send_exc_to_logger(
|
||||
)
|
||||
|
||||
|
||||
# an alternative to fire_event which only creates and logs the event value
# if the condition is met. Does nothing otherwise.
def fire_event_if(conditional: bool, lazy_e: Callable[[], Event]) -> None:
    if conditional:
        fire_event(lazy_e())
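A typical call site, sketched under the assumption that an expensive-to-build cache dump is guarded behind the `--log-cache-events` flag (the `Dump*` cache events later in this diff note "delay creation with fire_event_if"):

```
fire_event_if(
    flags.LOG_CACHE_EVENTS,
    lambda: DumpBeforeAddGraph(dump=build_dump())  # build_dump() is a hypothetical helper
)
```
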
# top-level method for accessing the new eventing system
|
||||
# this is where all the side effects happen branched by event type
|
||||
# (i.e. - mutating the event history, printing to stdout, logging
|
||||
# to files, etc.)
|
||||
def fire_event(e: Event) -> None:
|
||||
# skip logs when `--log-cache-events` is not passed
|
||||
if isinstance(e, Cache) and not flags.LOG_CACHE_EVENTS:
|
||||
return
|
||||
|
||||
# if and only if the event history deque will be completely filled by this event
|
||||
# fire warning that old events are now being dropped
|
||||
global EVENT_HISTORY
|
||||
if len(EVENT_HISTORY) == ((EVENT_HISTORY.maxlen or 100000) - 1):
|
||||
if len(EVENT_HISTORY) == (flags.EVENT_BUFFER_SIZE - 1):
|
||||
EVENT_HISTORY.append(e)
|
||||
fire_event(EventBufferFull())
|
||||
|
||||
EVENT_HISTORY.append(e)
|
||||
else:
|
||||
EVENT_HISTORY.append(e)
|
||||
|
||||
# backwards compatibility for plugins that require old logger (dbt-rpc)
|
||||
if flags.ENABLE_LEGACY_LOGGER:
|
||||
# using Event::message because the legacy logger didn't differentiate messages by
|
||||
# destination
|
||||
log_line = create_log_line(e, msg_fn=lambda x: x.message())
|
||||
|
||||
send_to_logger(GLOBAL_LOGGER, e.level_tag(), log_line)
|
||||
if log_line:
|
||||
send_to_logger(GLOBAL_LOGGER, e.level_tag(), log_line)
|
||||
return # exit the function to avoid using the current logger as well
|
||||
|
||||
# always logs debug level regardless of user input
|
||||
if isinstance(e, File):
|
||||
log_line = create_log_line(e, msg_fn=lambda x: x.file_msg())
|
||||
log_line = create_log_line(e, msg_fn=lambda x: x.file_msg(), file_output=True)
|
||||
# doesn't send exceptions to exception logger
|
||||
send_to_logger(FILE_LOG, level_tag=e.level_tag(), log_line=log_line)
|
||||
if log_line:
|
||||
send_to_logger(FILE_LOG, level_tag=e.level_tag(), log_line=log_line)
|
||||
|
||||
if isinstance(e, Cli):
|
||||
# explicitly checking the debug flag here so that potentially expensive-to-construct
|
||||
@@ -292,18 +354,19 @@ def fire_event(e: Event) -> None:
|
||||
return # eat the message in case it was one of the expensive ones
|
||||
|
||||
log_line = create_log_line(e, msg_fn=lambda x: x.cli_msg())
|
||||
if not isinstance(e, ShowException):
|
||||
send_to_logger(STDOUT_LOG, level_tag=e.level_tag(), log_line=log_line)
|
||||
# CliEventABC and ShowException
|
||||
else:
|
||||
send_exc_to_logger(
|
||||
STDOUT_LOG,
|
||||
level_tag=e.level_tag(),
|
||||
log_line=log_line,
|
||||
exc_info=e.exc_info,
|
||||
stack_info=e.stack_info,
|
||||
extra=e.extra
|
||||
)
|
||||
if log_line:
|
||||
if not isinstance(e, ShowException):
|
||||
send_to_logger(STDOUT_LOG, level_tag=e.level_tag(), log_line=log_line)
|
||||
# CliEventABC and ShowException
|
||||
else:
|
||||
send_exc_to_logger(
|
||||
STDOUT_LOG,
|
||||
level_tag=e.level_tag(),
|
||||
log_line=log_line,
|
||||
exc_info=e.exc_info,
|
||||
stack_info=e.stack_info,
|
||||
extra=e.extra
|
||||
)
|
||||
|
||||
|
||||
def get_invocation_id() -> str:
|
||||
|
||||
@@ -1,16 +1,16 @@
|
||||
import argparse
|
||||
from dataclasses import dataclass
|
||||
from dbt.adapters.reference_keys import _make_key, _ReferenceKey
|
||||
from dbt.events.stubs import (
|
||||
_CachedRelation,
|
||||
BaseRelation,
|
||||
ParsedModelNode,
|
||||
ParsedHookNode,
|
||||
_ReferenceKey,
|
||||
ParsedModelNode,
|
||||
RunResult
|
||||
)
|
||||
from dbt import ui
|
||||
from dbt.events.base_types import (
|
||||
Cli, Event, File, DebugLevel, InfoLevel, WarnLevel, ErrorLevel, ShowException, NodeInfo
|
||||
Cli, Event, File, DebugLevel, InfoLevel, WarnLevel, ErrorLevel, ShowException, NodeInfo, Cache
|
||||
)
|
||||
from dbt.events.format import format_fancy_output_line, pluralize
|
||||
from dbt.node_types import NodeType
|
||||
@@ -115,14 +115,6 @@ class MainEncounteredError(ErrorLevel, Cli):
|
||||
def message(self) -> str:
|
||||
return f"Encountered an error:\n{str(self.e)}"
|
||||
|
||||
# overriding default json serialization for this event
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
# equality on BaseException is not good enough of a comparison here
|
||||
if isinstance(val, BaseException):
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class MainStackTrace(DebugLevel, Cli):
|
||||
@@ -150,12 +142,9 @@ class MainReportArgs(DebugLevel, Cli, File):
|
||||
def message(self):
|
||||
return f"running dbt with arguments {str(self.args)}"
|
||||
|
||||
# overriding default json serialization for this event
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if isinstance(val, argparse.Namespace):
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
return dict((k, str(v)) for k, v in data)
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -312,6 +301,25 @@ class GitProgressCheckedOutAt(DebugLevel, Cli, File):
|
||||
return f" Checked out at {self.end_sha}."
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryIndexProgressMakingGETRequest(DebugLevel, Cli, File):
|
||||
url: str
|
||||
code: str = "M022"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Making package index registry request: GET {self.url}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryIndexProgressGETResponse(DebugLevel, Cli, File):
|
||||
url: str
|
||||
resp_code: int
|
||||
code: str = "M023"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Response from registry index: GET {self.url} {self.resp_code}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryProgressMakingGETRequest(DebugLevel, Cli, File):
|
||||
url: str
|
||||
@@ -331,6 +339,45 @@ class RegistryProgressGETResponse(DebugLevel, Cli, File):
|
||||
return f"Response from registry: GET {self.url} {self.resp_code}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseUnexpectedType(DebugLevel, File):
|
||||
response: str
|
||||
code: str = "M024"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Response was None: {self.response}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseMissingTopKeys(DebugLevel, File):
|
||||
response: str
|
||||
code: str = "M025"
|
||||
|
||||
def message(self) -> str:
|
||||
# expected/actual keys logged in exception
|
||||
return f"Response missing top level keys: {self.response}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseMissingNestedKeys(DebugLevel, File):
|
||||
response: str
|
||||
code: str = "M026"
|
||||
|
||||
def message(self) -> str:
|
||||
# expected/actual keys logged in exception
|
||||
return f"Response missing nested keys: {self.response}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class RegistryResponseExtraNestedKeys(DebugLevel, File):
|
||||
response: str
|
||||
code: str = "M027"
|
||||
|
||||
def message(self) -> str:
|
||||
# expected/actual keys logged in exception
|
||||
return f"Response contained inconsistent keys: {self.response}"
|
||||
|
||||
|
||||
# TODO this was actually `logger.exception(...)` not `logger.error(...)`
|
||||
@dataclass
|
||||
class SystemErrorRetrievingModTime(ErrorLevel, Cli, File):
|
||||
@@ -354,13 +401,6 @@ class SystemCouldNotWrite(DebugLevel, Cli, File):
|
||||
f"{self.reason}\nexception: {self.exc}"
|
||||
)
|
||||
|
||||
# overriding default json serialization for this event
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class SystemExecutingCmd(DebugLevel, Cli, File):
|
||||
@@ -397,40 +437,6 @@ class SystemReportReturnCode(DebugLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f"command return code={self.returncode}"
|
||||
|
||||
# TODO remove?? Not called outside of this file
|
||||
|
||||
|
||||
@dataclass
|
||||
class SelectorAlertUpto3UnusedNodes(InfoLevel, Cli, File):
|
||||
node_names: List[str]
|
||||
code: str = "I_NEED_A_CODE_5"
|
||||
|
||||
def message(self) -> str:
|
||||
summary_nodes_str = ("\n - ").join(self.node_names[:3])
|
||||
and_more_str = (
|
||||
f"\n - and {len(self.node_names) - 3} more" if len(self.node_names) > 4 else ""
|
||||
)
|
||||
return (
|
||||
f"\nSome tests were excluded because at least one parent is not selected. "
|
||||
f"Use the --greedy flag to include them."
|
||||
f"\n - {summary_nodes_str}{and_more_str}"
|
||||
)
|
||||
|
||||
# TODO remove?? Not called outside of this file
|
||||
|
||||
|
||||
@dataclass
|
||||
class SelectorAlertAllUnusedNodes(DebugLevel, Cli, File):
|
||||
node_names: List[str]
|
||||
code: str = "I_NEED_A_CODE_6"
|
||||
|
||||
def message(self) -> str:
|
||||
debug_nodes_str = ("\n - ").join(self.node_names)
|
||||
return (
|
||||
f"Full list of tests that were excluded:"
|
||||
f"\n - {debug_nodes_str}"
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SelectorReportInvalidSelector(InfoLevel, Cli, File):
|
||||
@@ -542,7 +548,7 @@ class Rollback(DebugLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class CacheMiss(DebugLevel, Cli, File):
|
||||
conn_name: Any # TODO mypy says this is `Callable[[], str]`?? ¯\_(ツ)_/¯
|
||||
conn_name: str
|
||||
database: Optional[str]
|
||||
schema: str
|
||||
code: str = "E013"
|
||||
@@ -558,12 +564,20 @@ class CacheMiss(DebugLevel, Cli, File):
|
||||
class ListRelations(DebugLevel, Cli, File):
|
||||
database: Optional[str]
|
||||
schema: str
|
||||
relations: List[BaseRelation]
|
||||
relations: List[_ReferenceKey]
|
||||
code: str = "E014"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"with database={self.database}, schema={self.schema}, relations={self.relations}"
|
||||
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
d = dict()
|
||||
for k, v in data:
|
||||
if type(v) == list:
|
||||
d[k] = [str(x) for x in v]
|
||||
return d
|
||||
|
||||
|
||||
@dataclass
|
||||
class ConnectionUsed(DebugLevel, Cli, File):
|
||||
@@ -587,7 +601,7 @@ class SQLQuery(DebugLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class SQLQueryStatus(DebugLevel, Cli, File):
|
||||
status: str # could include AdapterResponse if we resolve circular imports
|
||||
status: str
|
||||
elapsed: float
|
||||
code: str = "E017"
|
||||
|
||||
@@ -617,7 +631,7 @@ class ColTypeChange(DebugLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class SchemaCreation(DebugLevel, Cli, File):
|
||||
relation: BaseRelation
|
||||
relation: _ReferenceKey
|
||||
code: str = "E020"
|
||||
|
||||
def message(self) -> str:
|
||||
@@ -626,17 +640,21 @@ class SchemaCreation(DebugLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class SchemaDrop(DebugLevel, Cli, File):
|
||||
relation: BaseRelation
|
||||
relation: _ReferenceKey
|
||||
code: str = "E021"
|
||||
|
||||
def message(self) -> str:
|
||||
return f'Dropping schema "{self.relation}".'
|
||||
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
return dict((k, str(v)) for k, v in data)
|
||||
|
||||
|
||||
# TODO pretty sure this is only ever called in dead code
|
||||
# see: core/dbt/adapters/cache.py _add_link vs add_link
|
||||
@dataclass
|
||||
class UncachedRelation(DebugLevel, Cli, File):
|
||||
class UncachedRelation(DebugLevel, Cli, File, Cache):
|
||||
dep_key: _ReferenceKey
|
||||
ref_key: _ReferenceKey
|
||||
code: str = "E022"
|
||||
@@ -650,7 +668,7 @@ class UncachedRelation(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class AddLink(DebugLevel, Cli, File):
|
||||
class AddLink(DebugLevel, Cli, File, Cache):
|
||||
dep_key: _ReferenceKey
|
||||
ref_key: _ReferenceKey
|
||||
code: str = "E023"
|
||||
@@ -660,23 +678,16 @@ class AddLink(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class AddRelation(DebugLevel, Cli, File):
|
||||
relation: _CachedRelation
|
||||
class AddRelation(DebugLevel, Cli, File, Cache):
|
||||
relation: _ReferenceKey
|
||||
code: str = "E024"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Adding relation: {str(self.relation)}"
|
||||
|
||||
# overriding default json serialization for this event
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if isinstance(val, _CachedRelation):
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class DropMissingRelation(DebugLevel, Cli, File):
|
||||
class DropMissingRelation(DebugLevel, Cli, File, Cache):
|
||||
relation: _ReferenceKey
|
||||
code: str = "E025"
|
||||
|
||||
@@ -685,7 +696,7 @@ class DropMissingRelation(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class DropCascade(DebugLevel, Cli, File):
|
||||
class DropCascade(DebugLevel, Cli, File, Cache):
|
||||
dropped: _ReferenceKey
|
||||
consequences: Set[_ReferenceKey]
|
||||
code: str = "E026"
|
||||
@@ -693,9 +704,19 @@ class DropCascade(DebugLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f"drop {self.dropped} is cascading to {self.consequences}"
|
||||
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
d = dict()
|
||||
for k, v in data:
|
||||
if isinstance(v, list):
|
||||
d[k] = [str(x) for x in v]
|
||||
else:
|
||||
d[k] = str(v) # type: ignore
|
||||
return d
|
||||
|
||||
|
||||
@dataclass
|
||||
class DropRelation(DebugLevel, Cli, File):
|
||||
class DropRelation(DebugLevel, Cli, File, Cache):
|
||||
dropped: _ReferenceKey
|
||||
code: str = "E027"
|
||||
|
||||
@@ -704,7 +725,7 @@ class DropRelation(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class UpdateReference(DebugLevel, Cli, File):
|
||||
class UpdateReference(DebugLevel, Cli, File, Cache):
|
||||
old_key: _ReferenceKey
|
||||
new_key: _ReferenceKey
|
||||
cached_key: _ReferenceKey
|
||||
@@ -716,7 +737,7 @@ class UpdateReference(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class TemporaryRelation(DebugLevel, Cli, File):
|
||||
class TemporaryRelation(DebugLevel, Cli, File, Cache):
|
||||
key: _ReferenceKey
|
||||
code: str = "E029"
|
||||
|
||||
@@ -725,7 +746,7 @@ class TemporaryRelation(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class RenameSchema(DebugLevel, Cli, File):
|
||||
class RenameSchema(DebugLevel, Cli, File, Cache):
|
||||
old_key: _ReferenceKey
|
||||
new_key: _ReferenceKey
|
||||
code: str = "E030"
|
||||
@@ -735,8 +756,8 @@ class RenameSchema(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class DumpBeforeAddGraph(DebugLevel, Cli, File):
|
||||
# large value. delay not necessary since every debug level message is logged anyway.
|
||||
class DumpBeforeAddGraph(DebugLevel, Cli, File, Cache):
|
||||
# large value. delay creation with fire_event_if.
|
||||
dump: Dict[str, List[str]]
|
||||
code: str = "E031"
|
||||
|
||||
@@ -745,8 +766,8 @@ class DumpBeforeAddGraph(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class DumpAfterAddGraph(DebugLevel, Cli, File):
|
||||
# large value. delay not necessary since every debug level message is logged anyway.
|
||||
class DumpAfterAddGraph(DebugLevel, Cli, File, Cache):
|
||||
# large value. delay creation with fire_event_if.
|
||||
dump: Dict[str, List[str]]
|
||||
code: str = "E032"
|
||||
|
||||
@@ -755,8 +776,8 @@ class DumpAfterAddGraph(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class DumpBeforeRenameSchema(DebugLevel, Cli, File):
|
||||
# large value. delay not necessary since every debug level message is logged anyway.
|
||||
class DumpBeforeRenameSchema(DebugLevel, Cli, File, Cache):
|
||||
# large value. delay creation with fire_event_if.
|
||||
dump: Dict[str, List[str]]
|
||||
code: str = "E033"
|
||||
|
||||
@@ -765,8 +786,8 @@ class DumpBeforeRenameSchema(DebugLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class DumpAfterRenameSchema(DebugLevel, Cli, File):
|
||||
# large value. delay not necessary since every debug level message is logged anyway.
|
||||
class DumpAfterRenameSchema(DebugLevel, Cli, File, Cache):
|
||||
# large value. delay creation with fire_event_if.
|
||||
dump: Dict[str, List[str]]
|
||||
code: str = "E034"
|
||||
|
||||
@@ -782,11 +803,9 @@ class AdapterImportError(InfoLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f"Error importing adapter: {self.exc}"
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val())
|
||||
|
||||
return val
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
return dict((k, str(v)) for k, v in data)
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -834,114 +853,12 @@ class MissingProfileTarget(InfoLevel, Cli, File):
|
||||
return f"target not specified in profile '{self.profile_name}', using '{self.target_name}'"
|
||||
|
||||
|
||||
@dataclass
|
||||
class ProfileLoadError(ShowException, DebugLevel, Cli, File):
|
||||
exc: Exception
|
||||
code: str = "A006"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Profile not loaded due to error: {self.exc}"
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class ProfileNotFound(InfoLevel, Cli, File):
|
||||
profile_name: Optional[str]
|
||||
code: str = "A007"
|
||||
|
||||
def message(self) -> str:
|
||||
return f'No profile "{self.profile_name}" found, continuing with no target'
|
||||
|
||||
|
||||
@dataclass
|
||||
class InvalidVarsYAML(ErrorLevel, Cli, File):
|
||||
code: str = "A008"
|
||||
|
||||
def message(self) -> str:
|
||||
return "The YAML provided in the --vars argument is not valid.\n"
|
||||
|
||||
|
||||
# TODO: Remove? (appears to be uncalled)
|
||||
@dataclass
|
||||
class CatchRunException(ShowException, DebugLevel, Cli, File):
|
||||
build_path: Any
|
||||
exc: Exception
|
||||
code: str = "I_NEED_A_CODE_1"
|
||||
|
||||
def message(self) -> str:
|
||||
INTERNAL_ERROR_STRING = """This is an error in dbt. Please try again. If the \
|
||||
error persists, open an issue at https://github.com/dbt-labs/dbt-core
|
||||
""".strip()
|
||||
prefix = f'Internal error executing {self.build_path}'
|
||||
error = "{prefix}\n{error}\n\n{note}".format(
|
||||
prefix=ui.red(prefix),
|
||||
error=str(self.exc).strip(),
|
||||
note=INTERNAL_ERROR_STRING
|
||||
)
|
||||
return error
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
# TODO: Remove? (appears to be uncalled)
|
||||
@dataclass
|
||||
class HandleInternalException(ShowException, DebugLevel, Cli, File):
|
||||
exc: Exception
|
||||
code: str = "I_NEED_A_CODE_2"
|
||||
|
||||
def message(self) -> str:
|
||||
return str(self.exc)
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
# TODO: Remove? (appears to be uncalled)
|
||||
|
||||
|
||||
@dataclass
|
||||
class MessageHandleGenericException(ErrorLevel, Cli, File):
|
||||
build_path: str
|
||||
unique_id: str
|
||||
exc: Exception
|
||||
code: str = "I_NEED_A_CODE_3"
|
||||
|
||||
def message(self) -> str:
|
||||
node_description = self.build_path
|
||||
if node_description is None:
|
||||
node_description = self.unique_id
|
||||
prefix = "Unhandled error while executing {}".format(node_description)
|
||||
return "{prefix}\n{error}".format(
|
||||
prefix=ui.red(prefix),
|
||||
error=str(self.exc).strip()
|
||||
)
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
# TODO: Remove? (appears to be uncalled)
|
||||
|
||||
|
||||
@dataclass
|
||||
class DetailsHandleGenericException(ShowException, DebugLevel, Cli, File):
|
||||
code: str = "I_NEED_A_CODE_4"
|
||||
|
||||
def message(self) -> str:
|
||||
return ''
|
||||
return "The YAML provided in the --vars argument is not valid."
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -1110,12 +1027,6 @@ class ParsedFileLoadFailed(ShowException, DebugLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f"Failed to load parsed file from disk at {self.path}: {self.exc}"
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class PartialParseSaveFileNotFound(InfoLevel, Cli, File):
|
||||
@@ -1313,12 +1224,12 @@ class InvalidDisabledSourceInTestNode(WarnLevel, Cli, File):
|
||||
|
||||
|
||||
@dataclass
|
||||
class InvalidRefInTestNode(WarnLevel, Cli, File):
|
||||
class InvalidRefInTestNode(DebugLevel, Cli, File):
|
||||
msg: str
|
||||
code: str = "I051"
|
||||
|
||||
def message(self) -> str:
|
||||
return ui.warning_tag(self.msg)
|
||||
return self.msg
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -1329,12 +1240,6 @@ class RunningOperationCaughtError(ErrorLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f'Encountered an error while running operation: {self.exc}'
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class RunningOperationUncaughtError(ErrorLevel, Cli, File):
|
||||
@@ -1344,12 +1249,6 @@ class RunningOperationUncaughtError(ErrorLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f'Encountered an error while running operation: {self.exc}'
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class DbtProjectError(ErrorLevel, Cli, File):
|
||||
@@ -1367,12 +1266,6 @@ class DbtProjectErrorException(ErrorLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f" ERROR: {str(self.exc)}"
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class DbtProfileError(ErrorLevel, Cli, File):
|
||||
@@ -1390,12 +1283,6 @@ class DbtProfileErrorException(ErrorLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f" ERROR: {str(self.exc)}"
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class ProfileListTitle(InfoLevel, Cli, File):
|
||||
@@ -1443,12 +1330,6 @@ class CatchableExceptionOnRun(ShowException, DebugLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return str(self.exc)
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class InternalExceptionOnRun(DebugLevel, Cli, File):
|
||||
@@ -1469,12 +1350,6 @@ the error persists, open an issue at https://github.com/dbt-labs/dbt-core
|
||||
note=INTERNAL_ERROR_STRING
|
||||
)
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
# This prints the stack trace at the debug level while allowing just the nice exception message
|
||||
# at the error level - or whatever other level chosen. Used in multiple places.
|
||||
@@ -1488,9 +1363,9 @@ class PrintDebugStackTrace(ShowException, DebugLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class GenericExceptionOnRun(ErrorLevel, Cli, File):
|
||||
build_path: str
|
||||
build_path: Optional[str]
|
||||
unique_id: str
|
||||
exc: Exception
|
||||
exc: str # TODO: make this the actual exception once we have a better serialization strategy
|
||||
code: str = "W004"
|
||||
|
||||
def message(self) -> str:
|
||||
@@ -1503,12 +1378,6 @@ class GenericExceptionOnRun(ErrorLevel, Cli, File):
|
||||
error=str(self.exc).strip()
|
||||
)
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class NodeConnectionReleaseError(ShowException, DebugLevel, Cli, File):
|
||||
@@ -1520,12 +1389,6 @@ class NodeConnectionReleaseError(ShowException, DebugLevel, Cli, File):
|
||||
return ('Error releasing connection for node {}: {!s}'
|
||||
.format(self.node_name, self.exc))
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class CheckCleanPath(InfoLevel, Cli):
|
||||
@@ -1591,11 +1454,11 @@ class DepsNoPackagesFound(InfoLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class DepsStartPackageInstall(InfoLevel, Cli, File):
|
||||
package: str
|
||||
package_name: str
|
||||
code: str = "M014"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Installing {self.package}"
|
||||
return f"Installing {self.package_name}"
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -1639,7 +1502,7 @@ class DepsNotifyUpdatesAvailable(InfoLevel, Cli, File):
|
||||
code: str = "M019"
|
||||
|
||||
def message(self) -> str:
|
||||
return ('\nUpdates available for packages: {} \
|
||||
return ('Updates available for packages: {} \
|
||||
\nUpdate your versions in packages.yml, then run dbt deps'.format(self.packages))
|
||||
|
||||
|
||||
@@ -1756,7 +1619,7 @@ class ServingDocsExitInfo(InfoLevel, Cli, File):
|
||||
code: str = "Z020"
|
||||
|
||||
def message(self) -> str:
|
||||
return "Press Ctrl+C to exit.\n\n"
|
||||
return "Press Ctrl+C to exit."
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -1807,7 +1670,7 @@ class StatsLine(InfoLevel, Cli, File):
|
||||
code: str = "Z023"
|
||||
|
||||
def message(self) -> str:
|
||||
stats_line = ("\nDone. PASS={pass} WARN={warn} ERROR={error} SKIP={skip} TOTAL={total}")
|
||||
stats_line = ("Done. PASS={pass} WARN={warn} ERROR={error} SKIP={skip} TOTAL={total}")
|
||||
return stats_line.format(**self.stats)
|
||||
|
||||
|
||||
@@ -1846,12 +1709,6 @@ class SQlRunnerException(ShowException, DebugLevel, Cli, File):
|
||||
def message(self) -> str:
|
||||
return f"Got an exception: {self.exc}"
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class CheckNodeTestFailure(InfoLevel, Cli, File):
|
||||
@@ -1910,7 +1767,7 @@ class PrintStartLine(InfoLevel, Cli, File, NodeInfo):
|
||||
index: int
|
||||
total: int
|
||||
report_node_data: ParsedModelNode
|
||||
code: str = "Z031"
|
||||
code: str = "Q033"
|
||||
|
||||
def message(self) -> str:
|
||||
msg = f"START {self.description}"
|
||||
@@ -1928,8 +1785,8 @@ class PrintHookStartLine(InfoLevel, Cli, File, NodeInfo):
|
||||
index: int
|
||||
total: int
|
||||
truncate: bool
|
||||
report_node_data: Any # TODO use ParsedHookNode here
|
||||
code: str = "Z032"
|
||||
report_node_data: Any # TODO: resolve ParsedHookNode circular import
|
||||
code: str = "Q032"
|
||||
|
||||
def message(self) -> str:
|
||||
msg = f"START hook: {self.statement}"
|
||||
@@ -1948,7 +1805,7 @@ class PrintHookEndLine(InfoLevel, Cli, File, NodeInfo):
|
||||
total: int
|
||||
execution_time: int
|
||||
truncate: bool
|
||||
report_node_data: Any # TODO use ParsedHookNode here
|
||||
report_node_data: Any # TODO: resolve ParsedHookNode circular import
|
||||
code: str = "Q007"
|
||||
|
||||
def message(self) -> str:
|
||||
@@ -1969,7 +1826,7 @@ class SkippingDetails(InfoLevel, Cli, File, NodeInfo):
|
||||
index: int
|
||||
total: int
|
||||
report_node_data: ParsedModelNode
|
||||
code: str = "Z033"
|
||||
code: str = "Q034"
|
||||
|
||||
def message(self) -> str:
|
||||
if self.resource_type in NodeType.refable():
|
||||
@@ -2084,7 +1941,7 @@ class PrintModelErrorResultLine(ErrorLevel, Cli, File, NodeInfo):
|
||||
total: int
|
||||
execution_time: int
|
||||
report_node_data: ParsedModelNode
|
||||
code: str = "Z035"
|
||||
code: str = "Q035"
|
||||
|
||||
def message(self) -> str:
|
||||
info = "ERROR creating"
|
||||
@@ -2322,6 +2179,10 @@ class NodeFinished(DebugLevel, Cli, File, NodeInfo):
|
||||
def message(self) -> str:
|
||||
return f"Finished running node {self.unique_id}"
|
||||
|
||||
@classmethod
|
||||
def asdict(cls, data: list) -> dict:
|
||||
return dict((k, str(v)) for k, v in data)
|
||||
|
||||
|
||||
@dataclass
|
||||
class QueryCancelationUnsupported(InfoLevel, Cli, File):
|
||||
@@ -2337,11 +2198,12 @@ class QueryCancelationUnsupported(InfoLevel, Cli, File):
|
||||
|
||||
@dataclass
|
||||
class ConcurrencyLine(InfoLevel, Cli, File):
|
||||
concurrency_line: str
|
||||
num_threads: int
|
||||
target_name: str
|
||||
code: str = "Q026"
|
||||
|
||||
def message(self) -> str:
|
||||
return self.concurrency_line
|
||||
return f"Concurrency: {self.num_threads} threads (target='{self.target_name}')"
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -2577,7 +2439,7 @@ class TrackingInitializeFailure(ShowException, DebugLevel, Cli, File):
|
||||
class RetryExternalCall(DebugLevel, Cli, File):
|
||||
attempt: int
|
||||
max: int
|
||||
code: str = "Z045"
|
||||
code: str = "M020"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"Retrying external call. Attempt: {self.attempt} Max attempts: {self.max}"
|
||||
@@ -2606,12 +2468,6 @@ class GeneralWarningException(WarnLevel, Cli, File):
|
||||
return self.log_fmt.format(str(self.exc))
|
||||
return str(self.exc)
|
||||
|
||||
def fields_to_json(self, val: Any) -> Any:
|
||||
if val == self.exc:
|
||||
return str(val)
|
||||
|
||||
return val
|
||||
|
||||
|
||||
@dataclass
|
||||
class EventBufferFull(WarnLevel, Cli, File):
|
||||
@@ -2621,6 +2477,15 @@ class EventBufferFull(WarnLevel, Cli, File):
|
||||
return "Internal event buffer full. Earliest events will be dropped (FIFO)."
|
||||
|
||||
|
||||
@dataclass
|
||||
class RecordRetryException(DebugLevel, Cli, File):
|
||||
exc: Exception
|
||||
code: str = "M021"
|
||||
|
||||
def message(self) -> str:
|
||||
return f"External call exception: {self.exc}"
|
||||
|
||||
|
||||
# since mypy doesn't run on every file we need to suggest to mypy that every
|
||||
# class gets instantiated. But we don't actually want to run this code.
|
||||
# making the conditional `if False` causes mypy to skip it as dead code so
|
||||
@@ -2650,6 +2515,14 @@ if 1 == 0:
|
||||
GitNothingToDo(sha="")
|
||||
GitProgressUpdatedCheckoutRange(start_sha="", end_sha="")
|
||||
GitProgressCheckedOutAt(end_sha="")
|
||||
RegistryIndexProgressMakingGETRequest(url="")
|
||||
RegistryIndexProgressGETResponse(url="", resp_code=1234)
|
||||
RegistryProgressMakingGETRequest(url="")
|
||||
RegistryProgressGETResponse(url="", resp_code=1234)
|
||||
RegistryResponseUnexpectedType(response=""),
|
||||
RegistryResponseMissingTopKeys(response=""),
|
||||
RegistryResponseMissingNestedKeys(response=""),
|
||||
RegistryResponseExtraNestedKeys(response=""),
|
||||
SystemErrorRetrievingModTime(path="")
|
||||
SystemCouldNotWrite(path="", reason="", exc=Exception(""))
|
||||
SystemExecutingCmd(cmd=[""])
|
||||
@@ -2675,8 +2548,8 @@ if 1 == 0:
|
||||
SQLQueryStatus(status="", elapsed=0.1)
|
||||
SQLCommit(conn_name="")
|
||||
ColTypeChange(orig_type="", new_type="", table="")
|
||||
SchemaCreation(relation=BaseRelation())
|
||||
SchemaDrop(relation=BaseRelation())
|
||||
SchemaCreation(relation=_make_key(BaseRelation()))
|
||||
SchemaDrop(relation=_make_key(BaseRelation()))
|
||||
UncachedRelation(
|
||||
dep_key=_ReferenceKey(database="", schema="", identifier=""),
|
||||
ref_key=_ReferenceKey(database="", schema="", identifier=""),
|
||||
@@ -2685,7 +2558,7 @@ if 1 == 0:
|
||||
dep_key=_ReferenceKey(database="", schema="", identifier=""),
|
||||
ref_key=_ReferenceKey(database="", schema="", identifier=""),
|
||||
)
|
||||
AddRelation(relation=_CachedRelation())
|
||||
AddRelation(relation=_make_key(_CachedRelation()))
|
||||
DropMissingRelation(relation=_ReferenceKey(database="", schema="", identifier=""))
|
||||
DropCascade(
|
||||
dropped=_ReferenceKey(database="", schema="", identifier=""),
|
||||
@@ -2708,14 +2581,10 @@ if 1 == 0:
|
||||
AdapterImportError(ModuleNotFoundError())
|
||||
PluginLoadError()
|
||||
SystemReportReturnCode(returncode=0)
|
||||
SelectorAlertUpto3UnusedNodes(node_names=[])
|
||||
SelectorAlertAllUnusedNodes(node_names=[])
|
||||
NewConnectionOpening(connection_state='')
|
||||
TimingInfoCollected()
|
||||
MergedFromState(nbr_merged=0, sample=[])
|
||||
MissingProfileTarget(profile_name='', target_name='')
|
||||
ProfileLoadError(exc=Exception(''))
|
||||
ProfileNotFound(profile_name='')
|
||||
InvalidVarsYAML()
|
||||
GenericTestFileParse(path='')
|
||||
MacroFileParse(path='')
|
||||
@@ -2755,8 +2624,6 @@ if 1 == 0:
|
||||
PartialParsingDeletedExposure(unique_id='')
|
||||
InvalidDisabledSourceInTestNode(msg='')
|
||||
InvalidRefInTestNode(msg='')
|
||||
MessageHandleGenericException(build_path='', unique_id='', exc=Exception(''))
|
||||
DetailsHandleGenericException()
|
||||
RunningOperationCaughtError(exc=Exception(''))
|
||||
RunningOperationUncaughtError(exc=Exception(''))
|
||||
DbtProjectError()
|
||||
@@ -2769,7 +2636,7 @@ if 1 == 0:
|
||||
ProfileHelpMessage()
|
||||
CatchableExceptionOnRun(exc=Exception(''))
|
||||
InternalExceptionOnRun(build_path='', exc=Exception(''))
|
||||
GenericExceptionOnRun(build_path='', unique_id='', exc=Exception(''))
|
||||
GenericExceptionOnRun(build_path='', unique_id='', exc='')
|
||||
NodeConnectionReleaseError(node_name='', exc=Exception(''))
|
||||
CheckCleanPath(path='')
|
||||
ConfirmCleanPath(path='')
|
||||
@@ -2777,7 +2644,7 @@ if 1 == 0:
|
||||
FinishedCleanPaths()
|
||||
OpenCommand(open_cmd='', profiles_dir='')
|
||||
DepsNoPackagesFound()
|
||||
DepsStartPackageInstall(package='')
|
||||
DepsStartPackageInstall(package_name='')
|
||||
DepsInstallInfo(version_name='')
|
||||
DepsUpdateAvailable(version_latest='')
|
||||
DepsListSubdirectory(subdirectory='')
|
||||
@@ -2952,7 +2819,7 @@ if 1 == 0:
|
||||
NodeStart(report_node_data=ParsedModelNode(), unique_id='')
|
||||
NodeFinished(report_node_data=ParsedModelNode(), unique_id='', run_result=RunResult())
|
||||
QueryCancelationUnsupported(type='')
|
||||
ConcurrencyLine(concurrency_line='')
|
||||
ConcurrencyLine(num_threads=0, target_name='')
|
||||
NodeCompiling(report_node_data=ParsedModelNode(), unique_id='')
|
||||
NodeExecuting(report_node_data=ParsedModelNode(), unique_id='')
|
||||
StarterProjectPath(dir='')
|
||||
@@ -2982,3 +2849,4 @@ if 1 == 0:
|
||||
GeneralWarningMsg(msg='', log_fmt='')
|
||||
GeneralWarningException(exc=Exception(''), log_fmt='')
|
||||
EventBufferFull()
|
||||
RecordRetryException(exc=Exception(""))
|
||||
|
||||
File diff suppressed because it is too large
@@ -33,6 +33,8 @@ SEND_ANONYMOUS_USAGE_STATS = None
|
||||
PRINTER_WIDTH = 80
|
||||
WHICH = None
|
||||
INDIRECT_SELECTION = None
|
||||
LOG_CACHE_EVENTS = None
|
||||
EVENT_BUFFER_SIZE = 100000
|
||||
|
||||
# Global CLI defaults. These flags are set from three places:
|
||||
# CLI args, environment variables, and user_config (profiles.yml).
|
||||
@@ -51,7 +53,9 @@ flag_defaults = {
|
||||
"FAIL_FAST": False,
|
||||
"SEND_ANONYMOUS_USAGE_STATS": True,
|
||||
"PRINTER_WIDTH": 80,
|
||||
"INDIRECT_SELECTION": 'eager'
|
||||
"INDIRECT_SELECTION": 'eager',
|
||||
"LOG_CACHE_EVENTS": False,
|
||||
"EVENT_BUFFER_SIZE": 100000
|
||||
}
|
||||
|
||||
|
||||
@@ -99,7 +103,7 @@ def set_from_args(args, user_config):
|
||||
USE_EXPERIMENTAL_PARSER, STATIC_PARSER, WRITE_JSON, PARTIAL_PARSE, \
|
||||
USE_COLORS, STORE_FAILURES, PROFILES_DIR, DEBUG, LOG_FORMAT, INDIRECT_SELECTION, \
|
||||
VERSION_CHECK, FAIL_FAST, SEND_ANONYMOUS_USAGE_STATS, PRINTER_WIDTH, \
|
||||
WHICH
|
||||
WHICH, LOG_CACHE_EVENTS, EVENT_BUFFER_SIZE
|
||||
|
||||
STRICT_MODE = False # backwards compatibility
|
||||
# cli args without user_config or env var option
|
||||
@@ -122,6 +126,8 @@ def set_from_args(args, user_config):
|
||||
SEND_ANONYMOUS_USAGE_STATS = get_flag_value('SEND_ANONYMOUS_USAGE_STATS', args, user_config)
|
||||
PRINTER_WIDTH = get_flag_value('PRINTER_WIDTH', args, user_config)
|
||||
INDIRECT_SELECTION = get_flag_value('INDIRECT_SELECTION', args, user_config)
|
||||
LOG_CACHE_EVENTS = get_flag_value('LOG_CACHE_EVENTS', args, user_config)
|
||||
EVENT_BUFFER_SIZE = get_flag_value('EVENT_BUFFER_SIZE', args, user_config)
|
||||
|
||||
|
||||
def get_flag_value(flag, args, user_config):
|
||||
@@ -134,7 +140,13 @@ def get_flag_value(flag, args, user_config):
|
||||
if env_value is not None and env_value != '':
|
||||
env_value = env_value.lower()
|
||||
# non Boolean values
|
||||
if flag in ['LOG_FORMAT', 'PRINTER_WIDTH', 'PROFILES_DIR', 'INDIRECT_SELECTION']:
|
||||
if flag in [
|
||||
'LOG_FORMAT',
|
||||
'PRINTER_WIDTH',
|
||||
'PROFILES_DIR',
|
||||
'INDIRECT_SELECTION',
|
||||
'EVENT_BUFFER_SIZE'
|
||||
]:
|
||||
flag_value = env_value
|
||||
else:
|
||||
flag_value = env_set_bool(env_value)
|
||||
@@ -142,7 +154,7 @@ def get_flag_value(flag, args, user_config):
|
||||
flag_value = getattr(user_config, lc_flag)
|
||||
else:
|
||||
flag_value = flag_defaults[flag]
|
||||
if flag == 'PRINTER_WIDTH': # printer_width must be an int or it hangs
|
||||
if flag in ['PRINTER_WIDTH', 'EVENT_BUFFER_SIZE']: # must be ints
|
||||
flag_value = int(flag_value)
|
||||
if flag == 'PROFILES_DIR':
|
||||
flag_value = os.path.abspath(flag_value)
|
||||
@@ -165,5 +177,7 @@ def get_flag_dict():
|
||||
"fail_fast": FAIL_FAST,
|
||||
"send_anonymous_usage_stats": SEND_ANONYMOUS_USAGE_STATS,
|
||||
"printer_width": PRINTER_WIDTH,
|
||||
"indirect_selection": INDIRECT_SELECTION
|
||||
"indirect_selection": INDIRECT_SELECTION,
|
||||
"log_cache_events": LOG_CACHE_EVENTS,
|
||||
"event_buffer_size": EVENT_BUFFER_SIZE
|
||||
}
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import abc
|
||||
from itertools import chain
|
||||
from pathlib import Path
|
||||
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional
|
||||
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional, Callable
|
||||
|
||||
from dbt.dataclass_schema import StrEnum
|
||||
|
||||
@@ -449,20 +449,24 @@ class StateSelectorMethod(SelectorMethod):
|
||||
|
||||
return modified
|
||||
|
||||
def recursively_check_macros_modified(self, node, previous_macros):
|
||||
def recursively_check_macros_modified(self, node, visited_macros):
|
||||
# loop through all macros that this node depends on
|
||||
for macro_uid in node.depends_on.macros:
|
||||
# avoid infinite recursion if we've already seen this macro
|
||||
if macro_uid in previous_macros:
|
||||
if macro_uid in visited_macros:
|
||||
continue
|
||||
previous_macros.append(macro_uid)
|
||||
visited_macros.append(macro_uid)
|
||||
# is this macro one of the modified macros?
|
||||
if macro_uid in self.modified_macros:
|
||||
return True
|
||||
# if not, and this macro depends on other macros, keep looping
|
||||
macro_node = self.manifest.macros[macro_uid]
|
||||
if len(macro_node.depends_on.macros) > 0:
|
||||
return self.recursively_check_macros_modified(macro_node, previous_macros)
|
||||
return self.recursively_check_macros_modified(macro_node, visited_macros)
|
||||
# this macro hasn't been modified, but we haven't checked
|
||||
# the other macros the node depends on, so keep looking
|
||||
elif len(node.depends_on.macros) > len(visited_macros):
|
||||
continue
|
||||
else:
|
||||
return False
|
||||
|
||||
@@ -475,45 +479,31 @@ class StateSelectorMethod(SelectorMethod):
|
||||
return False
|
||||
# recursively loop through upstream macros to see if any is modified
|
||||
else:
|
||||
previous_macros = []
|
||||
return self.recursively_check_macros_modified(node, previous_macros)
|
||||
visited_macros = []
|
||||
return self.recursively_check_macros_modified(node, visited_macros)
|
||||
|
||||
def check_modified(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
# TODO check_modified_content and check_modified_macros seem a bit redundant
|
||||
def check_modified_content(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
different_contents = not new.same_contents(old) # type: ignore
|
||||
upstream_macro_change = self.check_macros_modified(new)
|
||||
return different_contents or upstream_macro_change
|
||||
|
||||
def check_modified_body(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
if hasattr(new, "same_body"):
|
||||
return not new.same_body(old) # type: ignore
|
||||
else:
|
||||
return False
|
||||
|
||||
def check_modified_configs(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
if hasattr(new, "same_config"):
|
||||
return not new.same_config(old) # type: ignore
|
||||
else:
|
||||
return False
|
||||
|
||||
def check_modified_persisted_descriptions(
|
||||
self, old: Optional[SelectorTarget], new: SelectorTarget
|
||||
) -> bool:
|
||||
if hasattr(new, "same_persisted_description"):
|
||||
return not new.same_persisted_description(old) # type: ignore
|
||||
else:
|
||||
return False
|
||||
|
||||
def check_modified_relation(
|
||||
self, old: Optional[SelectorTarget], new: SelectorTarget
|
||||
) -> bool:
|
||||
if hasattr(new, "same_database_representation"):
|
||||
return not new.same_database_representation(old) # type: ignore
|
||||
else:
|
||||
return False
|
||||
|
||||
def check_modified_macros(self, _, new: SelectorTarget) -> bool:
|
||||
return self.check_macros_modified(new)
|
||||
|
||||
@staticmethod
|
||||
def check_modified_factory(
|
||||
compare_method: str
|
||||
) -> Callable[[Optional[SelectorTarget], SelectorTarget], bool]:
|
||||
# get a function that compares two selector target based on compare method provided
|
||||
def check_modified_things(old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
if hasattr(new, compare_method):
|
||||
# when old body does not exist or old and new are not the same
|
||||
return not old or not getattr(new, compare_method)(old) # type: ignore
|
||||
else:
|
||||
return False
|
||||
return check_modified_things
|
||||
|
||||
def check_new(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
|
||||
return old is None
|
||||
|
||||
@@ -527,14 +517,21 @@ class StateSelectorMethod(SelectorMethod):
|
||||
|
||||
state_checks = {
|
||||
# it's new if there is no old version
|
||||
'new': lambda old, _: old is None,
|
||||
'new':
|
||||
lambda old, _: old is None,
|
||||
# use methods defined above to compare properties of old + new
|
||||
'modified': self.check_modified,
|
||||
'modified.body': self.check_modified_body,
|
||||
'modified.configs': self.check_modified_configs,
|
||||
'modified.persisted_descriptions': self.check_modified_persisted_descriptions,
|
||||
'modified.relation': self.check_modified_relation,
|
||||
'modified.macros': self.check_modified_macros,
|
||||
'modified':
|
||||
self.check_modified_content,
|
||||
'modified.body':
|
||||
self.check_modified_factory('same_body'),
|
||||
'modified.configs':
|
||||
self.check_modified_factory('same_config'),
|
||||
'modified.persisted_descriptions':
|
||||
self.check_modified_factory('same_persisted_description'),
|
||||
'modified.relation':
|
||||
self.check_modified_factory('same_database_representation'),
|
||||
'modified.macros':
|
||||
self.check_modified_macros,
|
||||
}
|
||||
if selector in state_checks:
|
||||
checker = state_checks[selector]
|
||||
|
||||
@@ -93,3 +93,10 @@ dbtClassMixin.register_field_encoders({
|
||||
|
||||
FQNPath = Tuple[str, ...]
|
||||
PathSet = AbstractSet[FQNPath]
|
||||
|
||||
|
||||
# This class is used in to_target_dict, so that accesses to missing keys
|
||||
# will return an empty string instead of Undefined
|
||||
class DictDefaultEmptyStr(dict):
|
||||
def __getitem__(self, key):
|
||||
return dict.get(self, key, "")
|
||||
|
||||
@@ -35,7 +35,7 @@ Note that you can also right-click on models to interactively filter and explore
|
||||
|
||||
### More information
|
||||
|
||||
- [What is dbt](https://docs.getdbt.com/docs/overview)?
|
||||
- [What is dbt](https://docs.getdbt.com/docs/introduction)?
|
||||
- Read the [dbt viewpoint](https://docs.getdbt.com/docs/viewpoint)
|
||||
- [Installation](https://docs.getdbt.com/docs/installation)
|
||||
- Join the [dbt Community](https://www.getdbt.com/community/) for questions and discussion
|
||||
|
||||
File diff suppressed because one or more lines are too long
0
core/dbt/include/starter_project/seeds/.gitkeep
Normal file
@@ -424,7 +424,7 @@ class DelayedFileHandler(logbook.RotatingFileHandler, FormatterMixin):
|
||||
return
|
||||
|
||||
make_log_dir_if_missing(log_dir)
|
||||
log_path = os.path.join(log_dir, 'dbt.log.old') # TODO hack for now
|
||||
log_path = os.path.join(log_dir, 'dbt.log.legacy') # TODO hack for now
|
||||
self._super_init(log_path)
|
||||
self._replay_buffered()
|
||||
self._log_path = log_path
|
||||
|
||||
@@ -221,24 +221,22 @@ def track_run(task):
|
||||
def run_from_args(parsed):
|
||||
log_cache_events(getattr(parsed, 'log_cache_events', False))
|
||||
|
||||
# we can now use the logger for stdout
|
||||
# set log_format in the logger
|
||||
# if 'list' task: set stdout to WARN instead of INFO
|
||||
level_override = parsed.cls.pre_init_hook(parsed)
|
||||
|
||||
fire_event(MainReportVersion(v=str(dbt.version.installed)))
|
||||
|
||||
# this will convert DbtConfigErrors into RuntimeExceptions
|
||||
# task could be any one of the task objects
|
||||
task = parsed.cls.from_args(args=parsed)
|
||||
fire_event(MainReportArgs(args=parsed))
|
||||
|
||||
# Set up logging
|
||||
log_path = None
|
||||
if task.config is not None:
|
||||
log_path = getattr(task.config, 'log_path', None)
|
||||
# we can finally set the file logger up
|
||||
log_manager.set_path(log_path)
|
||||
# if 'list' task: set stdout to WARN instead of INFO
|
||||
level_override = parsed.cls.pre_init_hook(parsed)
|
||||
setup_event_logger(log_path or 'logs', level_override)
|
||||
|
||||
fire_event(MainReportVersion(v=str(dbt.version.installed)))
|
||||
fire_event(MainReportArgs(args=parsed))
|
||||
|
||||
if dbt.tracking.active_user is not None: # mypy appeasement, always true
|
||||
fire_event(MainTrackingUserState(dbt.tracking.active_user.state()))
|
||||
|
||||
@@ -1078,6 +1076,14 @@ def parse_args(args, cls=DBTArgumentParser):
|
||||
'''
|
||||
)
|
||||
|
||||
p.add_argument(
|
||||
'--event-buffer-size',
|
||||
dest='event_buffer_size',
|
||||
help='''
|
||||
Sets the max number of events to buffer in EVENT_HISTORY
|
||||
'''
|
||||
)
|
||||
|
||||
subs = p.add_subparsers(title="Available sub-commands")
|
||||
|
||||
base_subparser = _build_base_subparser()
|
||||
|
||||
@@ -246,7 +246,7 @@ class ManifestLoader:
|
||||
project_parser_files = self.partial_parser.get_parsing_files()
|
||||
self.partially_parsing = True
|
||||
self.manifest = self.saved_manifest
|
||||
except Exception:
|
||||
except Exception as exc:
|
||||
# pp_files should still be the full set and manifest is new manifest,
|
||||
# since get_parsing_files failed
|
||||
fire_event(PartialParsingFullReparseBecauseOfError())
|
||||
@@ -284,6 +284,9 @@ class ManifestLoader:
|
||||
exc_info['full_reparse_reason'] = ReparseReason.exception
|
||||
dbt.tracking.track_partial_parser(exc_info)
|
||||
|
||||
if os.environ.get('DBT_PP_TEST'):
|
||||
raise exc
|
||||
|
||||
if self.manifest._parsing_info is None:
|
||||
self.manifest._parsing_info = ParsingInfo()
|
||||
|
||||
|
||||
@@ -272,10 +272,10 @@ class PartialParsing:
|
||||
if self.already_scheduled_for_parsing(old_source_file):
|
||||
return
|
||||
|
||||
# These files only have one node.
|
||||
unique_id = None
|
||||
# These files only have one node except for snapshots
|
||||
unique_ids = []
|
||||
if old_source_file.nodes:
|
||||
unique_id = old_source_file.nodes[0]
|
||||
unique_ids = old_source_file.nodes
|
||||
else:
|
||||
# It's not clear when this would actually happen.
|
||||
# Logging in case there are other associated errors.
|
||||
@@ -286,7 +286,7 @@ class PartialParsing:
|
||||
self.deleted_manifest.files[file_id] = old_source_file
|
||||
self.saved_files[file_id] = deepcopy(new_source_file)
|
||||
self.add_to_pp_files(new_source_file)
|
||||
if unique_id:
|
||||
for unique_id in unique_ids:
|
||||
self.remove_node_in_saved(new_source_file, unique_id)
|
||||
|
||||
def remove_node_in_saved(self, source_file, unique_id):
|
||||
@@ -315,7 +315,7 @@ class PartialParsing:
|
||||
if node.patch_path:
|
||||
file_id = node.patch_path
|
||||
# it might be changed... then what?
|
||||
if file_id not in self.file_diff['deleted']:
|
||||
if file_id not in self.file_diff['deleted'] and file_id in self.saved_files:
|
||||
# schema_files should already be updated
|
||||
schema_file = self.saved_files[file_id]
|
||||
dict_key = parse_file_type_to_key[source_file.parse_file_type]
|
||||
@@ -358,7 +358,7 @@ class PartialParsing:
|
||||
if not source_file.nodes:
|
||||
fire_event(PartialParsingMissingNodes(file_id=source_file.file_id))
|
||||
return
|
||||
# There is generally only 1 node for SQL files, except for macros
|
||||
# There is generally only 1 node for SQL files, except for macros and snapshots
|
||||
for unique_id in source_file.nodes:
|
||||
self.remove_node_in_saved(source_file, unique_id)
|
||||
self.schedule_referencing_nodes_for_parsing(unique_id)
|
||||
@@ -375,7 +375,7 @@ class PartialParsing:
|
||||
for unique_id in unique_ids:
|
||||
if unique_id in self.saved_manifest.nodes:
|
||||
node = self.saved_manifest.nodes[unique_id]
|
||||
if node.resource_type == NodeType.Test:
|
||||
if node.resource_type == NodeType.Test and node.test_node_type == 'generic':
|
||||
# test nodes are handled separately. Must be removed from schema file
|
||||
continue
|
||||
file_id = node.file_id
|
||||
@@ -435,7 +435,9 @@ class PartialParsing:
|
||||
self.check_for_special_deleted_macros(source_file)
|
||||
self.handle_macro_file_links(source_file, follow_references)
|
||||
file_id = source_file.file_id
|
||||
self.deleted_manifest.files[file_id] = self.saved_files.pop(file_id)
|
||||
# It's not clear when this file_id would not exist in saved_files
|
||||
if file_id in self.saved_files:
|
||||
self.deleted_manifest.files[file_id] = self.saved_files.pop(file_id)
|
||||
|
||||
def check_for_special_deleted_macros(self, source_file):
|
||||
for unique_id in source_file.macros:
|
||||
@@ -498,7 +500,9 @@ class PartialParsing:
|
||||
for unique_id in unique_ids:
|
||||
if unique_id in self.saved_manifest.nodes:
|
||||
node = self.saved_manifest.nodes[unique_id]
|
||||
if node.resource_type == NodeType.Test:
|
||||
# Both generic tests from yaml files and singular tests have NodeType.Test
|
||||
# so check for generic test.
|
||||
if node.resource_type == NodeType.Test and node.test_node_type == 'generic':
|
||||
schema_file_id = node.file_id
|
||||
schema_file = self.saved_manifest.files[schema_file_id]
|
||||
(key, name) = schema_file.get_key_and_name_for_test(node.unique_id)
|
||||
@@ -670,8 +674,8 @@ class PartialParsing:
|
||||
continue
|
||||
elem = self.get_schema_element(new_yaml_dict[dict_key], name)
|
||||
if elem:
|
||||
self.delete_schema_macro_patch(schema_file, macro)
|
||||
self.merge_patch(schema_file, dict_key, macro)
|
||||
self.delete_schema_macro_patch(schema_file, elem)
|
||||
self.merge_patch(schema_file, dict_key, elem)
|
||||
|
||||
# exposures
|
||||
dict_key = 'exposures'
|
||||
|
||||
@@ -960,10 +960,9 @@ class MacroPatchParser(NonSourceParser[UnparsedMacroUpdate, ParsedMacroPatch]):
|
||||
unique_id = f'macro.{patch.package_name}.{patch.name}'
|
||||
macro = self.manifest.macros.get(unique_id)
|
||||
if not macro:
|
||||
warn_or_error(
|
||||
f'WARNING: Found patch for macro "{patch.name}" '
|
||||
f'which was not found'
|
||||
)
|
||||
msg = f'Found patch for macro "{patch.name}" ' \
|
||||
f'which was not found'
|
||||
warn_or_error(msg, log_fmt=warning_tag('{}'))
|
||||
return
|
||||
if macro.patch_path:
|
||||
package_name, existing_file_path = macro.patch_path.split('://')
|
||||
|
||||
@@ -63,8 +63,14 @@ class SnapshotParser(
|
||||
|
||||
def transform(self, node: IntermediateSnapshotNode) -> ParsedSnapshotNode:
|
||||
try:
|
||||
# The config_call_dict is not serialized, because normally
|
||||
# it is not needed after parsing. But since the snapshot node
|
||||
# does this extra to_dict, save and restore it, to keep
|
||||
# the model config when there is also schema config.
|
||||
config_call_dict = node.config_call_dict
|
||||
dct = node.to_dict(omit_none=True)
|
||||
parsed_node = ParsedSnapshotNode.from_dict(dct)
|
||||
parsed_node.config_call_dict = config_call_dict
|
||||
self.set_snapshot_attributes(parsed_node)
|
||||
return parsed_node
|
||||
except ValidationError as exc:
|
||||
|
||||
@@ -334,7 +334,7 @@ class BaseRunner(metaclass=ABCMeta):
|
||||
GenericExceptionOnRun(
|
||||
build_path=self.node.build_path,
|
||||
unique_id=self.node.unique_id,
|
||||
exc=e
|
||||
exc=str(e) # TODO: unstring this when serialization is fixed
|
||||
)
|
||||
)
|
||||
fire_event(PrintDebugStackTrace())
|
||||
|
||||
@@ -38,7 +38,7 @@ class CleanTask(BaseTask):
|
||||
"""
|
||||
move_to_nearest_project_dir(self.args)
|
||||
if ('dbt_modules' in self.config.clean_targets and
|
||||
self.config.packages_install_path != 'dbt_modules'):
|
||||
self.config.packages_install_path not in self.config.clean_targets):
|
||||
deprecations.warn('install-packages-path')
|
||||
for path in self.config.clean_targets:
|
||||
fire_event(CheckCleanPath(path=path))
|
||||
|
||||
@@ -10,7 +10,7 @@ from dbt.deps.resolver import resolve_packages
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import (
|
||||
DepsNoPackagesFound, DepsStartPackageInstall, DepsUpdateAvailable, DepsUTD,
|
||||
DepsInstallInfo, DepsListSubdirectory, DepsNotifyUpdatesAvailable
|
||||
DepsInstallInfo, DepsListSubdirectory, DepsNotifyUpdatesAvailable, EmptyLine
|
||||
)
|
||||
from dbt.clients import system
|
||||
|
||||
@@ -63,7 +63,7 @@ class DepsTask(BaseTask):
|
||||
source_type = package.source_type()
|
||||
version = package.get_version()
|
||||
|
||||
fire_event(DepsStartPackageInstall(package=package))
|
||||
fire_event(DepsStartPackageInstall(package_name=package_name))
|
||||
package.install(self.config, renderer)
|
||||
fire_event(DepsInstallInfo(version_name=package.nice_version_name()))
|
||||
if source_type == 'hub':
|
||||
@@ -81,6 +81,7 @@ class DepsTask(BaseTask):
|
||||
source_type=source_type,
|
||||
version=version)
|
||||
if packages_to_upgrade:
|
||||
fire_event(EmptyLine())
|
||||
fire_event(DepsNotifyUpdatesAvailable(packages=packages_to_upgrade))
|
||||
|
||||
@classmethod
|
||||
|
||||
@@ -14,6 +14,8 @@ from dbt import flags
|
||||
from dbt.version import _get_adapter_plugin_names
|
||||
from dbt.adapters.factory import load_plugin, get_include_paths
|
||||
|
||||
from dbt.contracts.project import Name as ProjectName
|
||||
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import (
|
||||
StarterProjectPath, ConfigFolderDirectory, NoSampleProfileFound, ProfileWrittenWithSample,
|
||||
@@ -48,7 +50,8 @@ Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:
|
||||
Happy modeling!
|
||||
"""
|
||||
|
||||
# https://click.palletsprojects.com/en/8.0.x/api/?highlight=float#types
|
||||
# https://click.palletsprojects.com/en/8.0.x/api/#types
|
||||
# click v7.0 has UNPROCESSED, STRING, INT, FLOAT, BOOL, and UUID available.
|
||||
click_type_mapping = {
|
||||
"string": click.STRING,
|
||||
"int": click.INT,
|
||||
@@ -269,6 +272,16 @@ class InitTask(BaseTask):
|
||||
numeric_choice = click.prompt(prompt_msg, type=click.INT)
|
||||
return available_adapters[numeric_choice - 1]
|
||||
|
||||
def get_valid_project_name(self) -> str:
|
||||
"""Returns a valid project name, either from CLI arg or user prompt."""
|
||||
name = self.args.project_name
|
||||
while not ProjectName.is_valid(name):
|
||||
if name:
|
||||
click.echo(name + " is not a valid project name.")
|
||||
name = click.prompt("Enter a name for your project (letters, digits, underscore)")
|
||||
|
||||
return name
|
||||
|
||||
def run(self):
|
||||
"""Entry point for the init task."""
|
||||
profiles_dir = flags.PROFILES_DIR
|
||||
@@ -285,6 +298,8 @@ class InitTask(BaseTask):
|
||||
# just setup the user's profile.
|
||||
fire_event(SettingUpProfile())
|
||||
profile_name = self.get_profile_name_from_current_project()
|
||||
if not self.check_if_can_write_profile(profile_name=profile_name):
|
||||
return
|
||||
# If a profile_template.yml exists in the project root, that effectively
|
||||
# overrides the profile_template.yml for the given target.
|
||||
profile_template_path = Path("profile_template.yml")
|
||||
@@ -296,8 +311,6 @@ class InitTask(BaseTask):
|
||||
return
|
||||
except Exception:
|
||||
fire_event(InvalidProfileTemplateYAML())
|
||||
if not self.check_if_can_write_profile(profile_name=profile_name):
|
||||
return
|
||||
adapter = self.ask_for_adapter_choice()
|
||||
self.create_profile_from_target(
|
||||
adapter, profile_name=profile_name
|
||||
@@ -306,11 +319,7 @@ class InitTask(BaseTask):
|
||||
|
||||
# When dbt init is run outside of an existing project,
|
||||
# create a new project and set up the user's profile.
|
||||
project_name = self.args.project_name
|
||||
if project_name is None:
|
||||
# If project name is not provided,
|
||||
# ask the user which project name they'd like to use.
|
||||
project_name = click.prompt("What is the desired project name?")
|
||||
project_name = self.get_valid_project_name()
|
||||
project_path = Path(project_name)
|
||||
if project_path.exists():
|
||||
fire_event(ProjectNameAlreadyExists(name=project_name))
|
||||
|
||||
@@ -65,6 +65,8 @@ def print_run_status_line(results) -> None:
|
||||
stats[result_type] += 1
|
||||
stats['total'] += 1
|
||||
|
||||
with TextOnly():
|
||||
fire_event(EmptyLine())
|
||||
fire_event(StatsLine(stats=stats))
|
||||
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ from .printer import (
|
||||
print_run_end_messages,
|
||||
get_counts,
|
||||
)
|
||||
|
||||
from datetime import datetime
|
||||
from dbt import tracking
|
||||
from dbt import utils
|
||||
from dbt.adapters.base import BaseRelation
|
||||
@@ -21,7 +21,7 @@ from dbt.contracts.graph.compiled import CompileResultNode
|
||||
from dbt.contracts.graph.manifest import WritableManifest
|
||||
from dbt.contracts.graph.model_config import Hook
|
||||
from dbt.contracts.graph.parsed import ParsedHookNode
|
||||
from dbt.contracts.results import NodeStatus, RunResult, RunStatus
|
||||
from dbt.contracts.results import NodeStatus, RunResult, RunStatus, RunningStatus
|
||||
from dbt.exceptions import (
|
||||
CompilationException,
|
||||
InternalException,
|
||||
@@ -342,6 +342,8 @@ class RunTask(CompileTask):
|
||||
finishctx = TimestampNamed('node_finished_at')
|
||||
|
||||
for idx, hook in enumerate(ordered_hooks, start=1):
|
||||
hook._event_status['started_at'] = datetime.utcnow().isoformat()
|
||||
hook._event_status['node_status'] = RunningStatus.Started
|
||||
sql = self.get_hook_sql(adapter, hook, idx, num_hooks,
|
||||
extra_context)
|
||||
|
||||
@@ -360,19 +362,21 @@ class RunTask(CompileTask):
|
||||
)
|
||||
)
|
||||
|
||||
status = 'OK'
|
||||
|
||||
with Timer() as timer:
|
||||
if len(sql.strip()) > 0:
|
||||
status, _ = adapter.execute(sql, auto_begin=False,
|
||||
fetch=False)
|
||||
self.ran_hooks.append(hook)
|
||||
response, _ = adapter.execute(sql, auto_begin=False, fetch=False)
|
||||
status = response._message
|
||||
else:
|
||||
status = 'OK'
|
||||
|
||||
self.ran_hooks.append(hook)
|
||||
hook._event_status['finished_at'] = datetime.utcnow().isoformat()
|
||||
with finishctx, DbtModelState({'node_status': 'passed'}):
|
||||
hook._event_status['node_status'] = RunStatus.Success
|
||||
fire_event(
|
||||
PrintHookEndLine(
|
||||
statement=hook_text,
|
||||
status=str(status),
|
||||
status=status,
|
||||
index=idx,
|
||||
total=num_hooks,
|
||||
execution_time=timer.elapsed,
|
||||
@@ -380,6 +384,11 @@ class RunTask(CompileTask):
|
||||
report_node_data=hook
|
||||
)
|
||||
)
|
||||
# `_event_status` dict is only used for logging. Make sure
|
||||
# it gets deleted when we're done with it
|
||||
del hook._event_status["started_at"]
|
||||
del hook._event_status["finished_at"]
|
||||
del hook._event_status["node_status"]
|
||||
|
||||
self._total_executed += len(ordered_hooks)
|
||||
|
||||
|
||||
@@ -56,6 +56,7 @@ from dbt.parser.manifest import ManifestLoader
|
||||
import dbt.exceptions
|
||||
from dbt import flags
|
||||
import dbt.utils
|
||||
from dbt.ui import warning_tag
|
||||
|
||||
RESULT_FILE_NAME = 'run_results.json'
|
||||
MANIFEST_FILE_NAME = 'manifest.json'
|
||||
@@ -208,7 +209,7 @@ class GraphRunnableTask(ManifestTask):
|
||||
with RUNNING_STATE, uid_context:
|
||||
startctx = TimestampNamed('node_started_at')
|
||||
index = self.index_offset(runner.node_index)
|
||||
runner.node._event_status['dbt_internal__started_at'] = datetime.utcnow().isoformat()
|
||||
runner.node._event_status['started_at'] = datetime.utcnow().isoformat()
|
||||
runner.node._event_status['node_status'] = RunningStatus.Started
|
||||
extended_metadata = ModelMetadata(runner.node, index)
|
||||
|
||||
@@ -224,8 +225,7 @@ class GraphRunnableTask(ManifestTask):
|
||||
result = runner.run_with_hooks(self.manifest)
|
||||
status = runner.get_result_status(result)
|
||||
runner.node._event_status['node_status'] = result.status
|
||||
runner.node._event_status['dbt_internal__finished_at'] = \
|
||||
datetime.utcnow().isoformat()
|
||||
runner.node._event_status['finished_at'] = datetime.utcnow().isoformat()
|
||||
finally:
|
||||
finishctx = TimestampNamed('finished_at')
|
||||
with finishctx, DbtModelState(status):
|
||||
@@ -238,8 +238,8 @@ class GraphRunnableTask(ManifestTask):
|
||||
)
|
||||
# `_event_status` dict is only used for logging. Make sure
|
||||
# it gets deleted when we're done with it
|
||||
del runner.node._event_status["dbt_internal__started_at"]
|
||||
del runner.node._event_status["dbt_internal__finished_at"]
|
||||
del runner.node._event_status["started_at"]
|
||||
del runner.node._event_status["finished_at"]
|
||||
del runner.node._event_status["node_status"]
|
||||
|
||||
fail_fast = flags.FAIL_FAST
|
||||
@@ -359,7 +359,7 @@ class GraphRunnableTask(ManifestTask):
|
||||
adapter = get_adapter(self.config)
|
||||
|
||||
if not adapter.is_cancelable():
|
||||
fire_event(QueryCancelationUnsupported(type=adapter.type))
|
||||
fire_event(QueryCancelationUnsupported(type=adapter.type()))
|
||||
else:
|
||||
with adapter.connection_named('master'):
|
||||
for conn_name in adapter.cancel_open_connections():
|
||||
@@ -377,10 +377,8 @@ class GraphRunnableTask(ManifestTask):
|
||||
num_threads = self.config.threads
|
||||
target_name = self.config.target_name
|
||||
|
||||
text = "Concurrency: {} threads (target='{}')"
|
||||
concurrency_line = text.format(num_threads, target_name)
|
||||
with NodeCount(self.num_nodes):
|
||||
fire_event(ConcurrencyLine(concurrency_line=concurrency_line))
|
||||
fire_event(ConcurrencyLine(num_threads=num_threads, target_name=target_name))
|
||||
with TextOnly():
|
||||
fire_event(EmptyLine())
|
||||
|
||||
@@ -461,8 +459,11 @@ class GraphRunnableTask(ManifestTask):
|
||||
)
|
||||
|
||||
if len(self._flattened_nodes) == 0:
|
||||
warn_or_error("\nWARNING: Nothing to do. Try checking your model "
|
||||
"configs and model specification args")
|
||||
with TextOnly():
|
||||
fire_event(EmptyLine())
|
||||
msg = "Nothing to do. Try checking your model " \
|
||||
"configs and model specification args"
|
||||
warn_or_error(msg, log_fmt=warning_tag('{}'))
|
||||
result = self.get_result(
|
||||
results=[],
|
||||
generated_at=datetime.utcnow(),
|
||||
|
||||
@@ -6,7 +6,7 @@ from dbt.include.global_project import DOCS_INDEX_FILE_PATH
|
||||
from http.server import SimpleHTTPRequestHandler
|
||||
from socketserver import TCPServer
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import ServingDocsPort, ServingDocsAccessInfo, ServingDocsExitInfo
|
||||
from dbt.events.types import ServingDocsPort, ServingDocsAccessInfo, ServingDocsExitInfo, EmptyLine
|
||||
|
||||
from dbt.task.base import ConfiguredTask
|
||||
|
||||
@@ -22,6 +22,8 @@ class ServeTask(ConfiguredTask):
|
||||
|
||||
fire_event(ServingDocsPort(address=address, port=port))
|
||||
fire_event(ServingDocsAccessInfo(port=port))
|
||||
fire_event(EmptyLine())
|
||||
fire_event(EmptyLine())
|
||||
fire_event(ServingDocsExitInfo())
|
||||
|
||||
# mypy doesn't think SimpleHTTPRequestHandler is ok here, but it is
|
||||
|
||||
@@ -66,6 +66,4 @@ def line_wrap_message(
|
||||
|
||||
|
||||
def warning_tag(msg: str) -> str:
|
||||
# no longer needed, since new logging includes colorized log level
|
||||
# return f'[{yellow("WARNING")}]: {msg}'
|
||||
return msg
|
||||
return f'[{yellow("WARNING")}]: {msg}'
|
||||
|
||||
@@ -10,12 +10,13 @@ import jinja2
|
||||
import json
|
||||
import os
|
||||
import requests
|
||||
from tarfile import ReadError
|
||||
import time
|
||||
|
||||
from contextlib import contextmanager
|
||||
from dbt.exceptions import ConnectionException
|
||||
from dbt.events.functions import fire_event
|
||||
from dbt.events.types import RetryExternalCall
|
||||
from dbt.events.types import RetryExternalCall, RecordRetryException
|
||||
from enum import Enum
|
||||
from typing_extensions import Protocol
|
||||
from typing import (
|
||||
@@ -598,18 +599,21 @@ class MultiDict(Mapping[str, Any]):
|
||||
|
||||
def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
|
||||
"""Attempts to run a function that makes an external call, if the call fails
|
||||
on a connection error or timeout, it will be tried up to 5 more times.
|
||||
on a Requests exception or decompression issue (ReadError), it will be tried
|
||||
up to 5 more times. All exceptions that Requests explicitly raises inherit from
|
||||
requests.exceptions.RequestException. See https://github.com/dbt-labs/dbt-core/issues/4579
|
||||
for context on this decompression issue specifically.
|
||||
"""
|
||||
try:
|
||||
return fn()
|
||||
except (
|
||||
requests.exceptions.ConnectionError,
|
||||
requests.exceptions.Timeout,
|
||||
requests.exceptions.ContentDecodingError,
|
||||
requests.exceptions.RequestException,
|
||||
ReadError,
|
||||
) as exc:
|
||||
if attempt <= max_attempts - 1:
|
||||
fire_event(RecordRetryException(exc=exc))
|
||||
fire_event(RetryExternalCall(attempt=attempt, max=max_attempts))
|
||||
time.sleep(1)
|
||||
_connection_exception_retry(fn, max_attempts, attempt + 1)
|
||||
return _connection_exception_retry(fn, max_attempts, attempt + 1)
|
||||
else:
|
||||
raise ConnectionException('External connection exception occurred: ' + str(exc))
|
||||
|
||||
@@ -10,13 +10,15 @@ import requests
|
||||
import dbt.exceptions
|
||||
import dbt.semver
|
||||
|
||||
from dbt.ui import green, red, yellow
|
||||
from dbt import flags
|
||||
|
||||
PYPI_VERSION_URL = 'https://pypi.org/pypi/dbt/json'
|
||||
PYPI_VERSION_URL = 'https://pypi.org/pypi/dbt-core/json'
|
||||
|
||||
|
||||
def get_latest_version():
|
||||
def get_latest_version(version_url: str = PYPI_VERSION_URL):
|
||||
try:
|
||||
resp = requests.get(PYPI_VERSION_URL)
|
||||
resp = requests.get(version_url)
|
||||
data = resp.json()
|
||||
version_string = data['info']['version']
|
||||
except (json.JSONDecodeError, KeyError, requests.RequestException):
|
||||
@@ -29,7 +31,13 @@ def get_installed_version():
|
||||
return dbt.semver.VersionSpecifier.from_version_string(__version__)
|
||||
|
||||
|
||||
def get_package_pypi_url(package_name: str) -> str:
|
||||
return f'https://pypi.org/pypi/dbt-{package_name}/json'
|
||||
|
||||
|
||||
def get_version_information():
|
||||
flags.USE_COLORS = True if not flags.USE_COLORS else None
|
||||
|
||||
installed = get_installed_version()
|
||||
latest = get_latest_version()
|
||||
|
||||
@@ -44,16 +52,40 @@ def get_version_information():
|
||||
|
||||
plugin_version_msg = "Plugins:\n"
|
||||
for plugin_name, version in _get_dbt_plugins_info():
|
||||
plugin_version_msg += ' - {plugin_name}: {version}\n'.format(
|
||||
plugin_name=plugin_name, version=version
|
||||
)
|
||||
plugin_version = dbt.semver.VersionSpecifier.from_version_string(version)
|
||||
latest_plugin_version = get_latest_version(version_url=get_package_pypi_url(plugin_name))
|
||||
plugin_update_msg = ''
|
||||
if installed == plugin_version or (
|
||||
latest_plugin_version and plugin_version == latest_plugin_version
|
||||
):
|
||||
compatibility_msg = green('Up to date!')
|
||||
else:
|
||||
if latest_plugin_version:
|
||||
if installed.major == plugin_version.major:
|
||||
compatibility_msg = yellow('Update available!')
|
||||
else:
|
||||
compatibility_msg = red('Out of date!')
|
||||
plugin_update_msg = (
|
||||
" Your version of dbt-{} is out of date! "
|
||||
"You can find instructions for upgrading here:\n"
|
||||
" https://docs.getdbt.com/dbt-cli/install/overview\n\n"
|
||||
).format(plugin_name)
|
||||
else:
|
||||
compatibility_msg = yellow('No PYPI version available')
|
||||
|
||||
plugin_version_msg += (
|
||||
" - {}: {} - {}\n"
|
||||
"{}"
|
||||
).format(plugin_name, version, compatibility_msg, plugin_update_msg)
|
||||
|
||||
if latest is None:
|
||||
return ("{}The latest version of dbt could not be determined!\n"
|
||||
"Make sure that the following URL is accessible:\n{}\n\n{}"
|
||||
.format(version_msg, PYPI_VERSION_URL, plugin_version_msg))
|
||||
.format(version_msg, PYPI_VERSION_URL, plugin_version_msg)
|
||||
)
|
||||
|
||||
if installed == latest:
|
||||
return "{}Up to date!\n\n{}".format(version_msg, plugin_version_msg)
|
||||
return f"{version_msg}{green('Up to date!')}\n\n{plugin_version_msg}"
|
||||
|
||||
elif installed > latest:
|
||||
return ("{}Your version of dbt is ahead of the latest "
|
||||
@@ -91,10 +123,10 @@ def _get_dbt_plugins_info():
|
||||
f'dbt.adapters.{plugin_name}.__version__'
|
||||
)
|
||||
except ImportError:
|
||||
# not an adpater
|
||||
# not an adapter
|
||||
continue
|
||||
yield plugin_name, mod.version
|
||||
|
||||
|
||||
__version__ = '1.0.0rc3'
|
||||
__version__ = '1.0.8'
|
||||
installed = get_installed_version()
|
||||
|
||||
@@ -284,12 +284,12 @@ def parse_args(argv=None):
|
||||
parser.add_argument('adapter')
|
||||
parser.add_argument('--title-case', '-t', default=None)
|
||||
parser.add_argument('--dependency', action='append')
|
||||
parser.add_argument('--dbt-core-version', default='1.0.0rc3')
|
||||
parser.add_argument('--dbt-core-version', default='1.0.8')
|
||||
parser.add_argument('--email')
|
||||
parser.add_argument('--author')
|
||||
parser.add_argument('--url')
|
||||
parser.add_argument('--sql', action='store_true')
|
||||
parser.add_argument('--package-version', default='1.0.0rc3')
|
||||
parser.add_argument('--package-version', default='1.0.8')
|
||||
parser.add_argument('--project-version', default='1.0')
|
||||
parser.add_argument(
|
||||
'--no-dependency', action='store_false', dest='set_dependency'
|
||||
|
||||
@@ -25,7 +25,7 @@ with open(os.path.join(this_directory, 'README.md')) as f:
|
||||
|
||||
|
||||
package_name = "dbt-core"
|
||||
package_version = "1.0.0rc3"
|
||||
package_version = "1.0.8"
|
||||
description = """With dbt, data analysts and engineers can build analytics \
|
||||
the way engineers build applications."""
|
||||
|
||||
@@ -52,18 +52,19 @@ setup(
|
||||
],
|
||||
install_requires=[
|
||||
'Jinja2==2.11.3',
|
||||
'MarkupSafe>=0.23,<2.1',
|
||||
'agate>=1.6,<1.6.4',
|
||||
'click>=8,<9',
|
||||
'click>=7.0,<9',
|
||||
'colorama>=0.3.9,<0.4.5',
|
||||
'hologram==0.0.14',
|
||||
'isodate>=0.6,<0.7',
|
||||
'logbook>=1.5,<1.6',
|
||||
'mashumaro==2.9',
|
||||
'minimal-snowplow-tracker==0.0.2',
|
||||
'networkx>=2.3,<3',
|
||||
'networkx>=2.3,<2.8.1',
|
||||
'packaging>=20.9,<22.0',
|
||||
'sqlparse>=0.2.3,<0.5',
|
||||
'dbt-extractor==0.4.0',
|
||||
'dbt-extractor~=0.4.1',
|
||||
'typing-extensions>=3.7.4,<3.11',
|
||||
'werkzeug>=1,<3',
|
||||
# the following are all to match snowflake-connector-python
|
||||
|
||||
@@ -1,19 +1,19 @@
|
||||
agate==1.6.3
|
||||
attrs==21.2.0
|
||||
Babel==2.9.1
|
||||
certifi==2021.10.8
|
||||
attrs==21.4.0
|
||||
Babel==2.10.2
|
||||
certifi==2022.5.18.1
|
||||
cffi==1.15.0
|
||||
charset-normalizer==2.0.8
|
||||
click==8.0.3
|
||||
charset-normalizer==2.0.12
|
||||
click==8.1.3
|
||||
colorama==0.4.4
|
||||
dbt-core==1.0.0rc3
|
||||
dbt-extractor==0.4.0
|
||||
dbt-postgres==1.0.0rc3
|
||||
dbt-core==1.0.8
|
||||
dbt-extractor==0.4.1
|
||||
dbt-postgres==1.0.8
|
||||
future==0.18.2
|
||||
hologram==0.0.14
|
||||
idna==3.3
|
||||
importlib-metadata==4.8.2
|
||||
isodate==0.6.0
|
||||
importlib-metadata==4.11.4
|
||||
isodate==0.6.1
|
||||
Jinja2==2.11.3
|
||||
jsonschema==3.1.1
|
||||
leather==0.3.4
|
||||
@@ -21,24 +21,24 @@ Logbook==1.5.3
|
||||
MarkupSafe==2.0.1
|
||||
mashumaro==2.9
|
||||
minimal-snowplow-tracker==0.0.2
|
||||
msgpack==1.0.3
|
||||
networkx==2.6.3
|
||||
msgpack==1.0.4
|
||||
networkx==2.8
|
||||
packaging==21.3
|
||||
parsedatetime==2.4
|
||||
psycopg2-binary==2.9.2
|
||||
psycopg2-binary==2.9.3
|
||||
pycparser==2.21
|
||||
pyparsing==3.0.6
|
||||
pyrsistent==0.18.0
|
||||
pyparsing==3.0.9
|
||||
pyrsistent==0.18.1
|
||||
python-dateutil==2.8.2
|
||||
python-slugify==5.0.2
|
||||
python-slugify==6.1.2
|
||||
pytimeparse==1.1.8
|
||||
pytz==2021.3
|
||||
pytz==2022.1
|
||||
PyYAML==6.0
|
||||
requests==2.26.0
|
||||
requests==2.28.0
|
||||
six==1.16.0
|
||||
sqlparse==0.4.2
|
||||
text-unidecode==1.3
|
||||
typing-extensions==3.10.0.2
|
||||
urllib3==1.26.7
|
||||
Werkzeug==2.0.2
|
||||
zipp==3.6.0
|
||||
urllib3==1.26.9
|
||||
Werkzeug==2.1.2
|
||||
zipp==3.8.0
|
||||
|
||||
@@ -1,18 +1,118 @@
|
||||
# Performance Regression Testing
|
||||
This directory includes dbt project setups to test on and a test runner written in Rust which runs specific dbt commands on each of the projects. Orchestration is done via the GitHub Action workflow in `/.github/workflows/performance.yml`. The workflow is scheduled to run every night, but it can also be triggered manually.
|
||||
|
||||
The github workflow hardcodes our baseline branch for performance metrics as `0.20.latest`. As future versions become faster, this branch will be updated to hold us to those new standards.
|
||||
## Attention!
|
||||
|
||||
## Adding a new dbt project
|
||||
Just make a new directory under `performance/projects/`. It will automatically be picked up by the tests.
|
||||
PLEASE READ THIS README IN THE MAIN BRANCH
|
||||
The performance runner is always pulled from main regardless of the version being modeled or sampled. If you are not in the main branch, this information may be stale.
|
||||
|
||||
## Adding a new dbt command
|
||||
In `runner/src/measure.rs::measure` add a metric to the `metrics` Vec. The Github Action will handle recompilation if you don't have the rust toolchain installed.
|
||||
## Description
|
||||
|
||||
This test suite samples the performance characteristics of individual commits against performance models for prior releases. Performance is measured in project-command pairs which are assumed to conform to a normal distribution. The sampling and comparison are efficient enough to run against PRs.
|
||||
|
||||
This collection of projects and commands should expand over time to reflect user feedback about poorly performing projects, protecting future versions against regressions in those scenarios.
|
||||
|
||||
Here are all the components of the testing module:
|
||||
|
||||
- dbt project setups that are known performance bottlenecks which you can find in `/performance/projects/`, and a runner written in Rust that runs specific dbt commands on each of the projects.
|
||||
- Performance characteristics called "baselines" from released dbt versions in `/performance/baselines/`. Each branch will only have the baselines for its ancestors because when we compare samples, we compare against the latest baseline available in the branch.
|
||||
- A GitHub action for modeling the performance distribution for a new release: `/.github/workflows/model_performance.yml`.
|
||||
- A GitHub action for sampling performance of dbt at your commit and comparing it against a previous release: `/.github/workflows/sample_performance.yml`.
|
||||
|
||||
At this time, the biggest risk in the design of this project is how to account for the natural variation of GitHub Action runs. Typically, performance work is done on dedicated hardware to eliminate this factor. However, there are ways to account for the variation of the observation tools if it can be measured.
|
||||
|
||||
## Adding Test Scenarios
|
||||
|
||||
A clear process for maintainers and community members to add new performance testing targets will exist after the next stage of the test suite is complete. For details, see #4768.
|
||||
|
||||
## Investigating Regressions
|
||||
|
||||
If your commit has failed one of the performance regression tests, it does not necessarily mean your commit has a performance regression. However, the observed runtime value was so much slower than the expected value that it was unlikely to be random noise. If it is not due to random noise, this commit contains the code that is causing the performance regression; however, it may not be the commit that introduced that code. That code may have been introduced in an earlier commit that passed only because of natural variation in sampling. When investigating a performance regression, start with the failing commit and work your way backwards.
|
||||
|
||||
Here's an example of how this could happen:
|
||||
|
||||
```
|
||||
Commit
|
||||
A <- last release
|
||||
B
|
||||
C <- perf regression
|
||||
D
|
||||
E
|
||||
F <- the first failing commit
|
||||
```
|
||||
- Commit A is measured to have an expected value for one performance metric of 30 seconds with a standard deviation of 0.5 seconds.
|
||||
- Commit B doesn't introduce a performance regression and passes the performance regression tests.
|
||||
- Commit C introduces a performance regression such that the new expected value of the metric is 32 seconds, with the standard deviation still at 0.5 seconds. We don't know this, because estimating the whole performance distribution on every commit is far too much work. Commit C passes the performance regression test because we happened to sample a value of 31 seconds, which is within our threshold for the original model. That sample is also only 2 standard deviations away from commit C's actual performance model, so while it won't be a common outcome, it is expected to happen sometimes.
|
||||
- Commit D samples a value of 31.4 seconds and passes
|
||||
- Commit E samples a value of 31.2 seconds and passes
|
||||
- Commit F samples a value of 32.9 seconds and fails
|
||||
|
||||
Because these performance regression tests are non-deterministic, it is frequently going to be possible to rerun the test on a failing commit and get it to pass. The more often we do this, the farther down the commit history we will be punting detection.
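To make that concrete, here is a quick back-of-the-envelope check using only the hypothetical numbers from the story above (commit A modeled at a 30 second mean with a 0.5 second standard deviation, commit C's true mean at 32 seconds). It is an illustration of the punting effect, not part of the test suite:

```python
from statistics import NormalDist

# Commit A's (baseline) model and commit C's true model, using the
# hypothetical numbers from the example above.
baseline = NormalDist(mu=30.0, sigma=0.5)
regressed = NormalDist(mu=32.0, sigma=0.5)

# One-sided 3 sigma threshold derived from the baseline model.
threshold = baseline.mean + 3 * baseline.stdev   # 31.5 s

# Probability that a single sample from the regressed model still passes.
p_pass = regressed.cdf(threshold)                # ~0.16

print(f"threshold = {threshold:.1f}s, chance commit C still passes ~ {p_pass:.0%}")
```

Roughly 1 in 6 single samples from the regressed distribution would still sneak under the threshold, which is why detection can land on a later commit than the one that introduced the regression.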
|
||||
|
||||
If your PR is against `main`, your commits will be compared against the latest baseline measurement found in `performance/baselines`. If this commit needs to be backported, that PR will be against the `.latest` branch and will also compare against the latest baseline measurement found in `performance/baselines` in that branch. These two versions may be the same or they may be different. For example, if the latest version of dbt is v1.99.0, the performance sample of your PR against main will compare against the baseline for v1.99.0. When those commits are backported to `1.98.latest`, they will be compared against the baseline for v1.98.6 (or whatever the latest is at that time). Even if the compared baseline is the same, a different sample is taken for each PR, so it is possible (though it should be rare) for a performance regression to be detected in one of the two PRs but not the other due to variation in sampling.
|
||||
|
||||
## The Statistics
|
||||
Particle physicists need to be confident in declaring new discoveries, snack manufacturers need to be sure each individual item is within the regulated margin of error for nutrition facts, and weight-rated climbing gear needs to be produced so you can trust your life to every unit that comes off the line. All of these use cases use the same kind of math to meet their needs: sigma-based p-values. This section will peel apart that math with the help of a physicist and walk through how we apply this approach to performance regression testing in this test suite.
|
||||
|
||||
You are likely familiar with forming a hypothesis of the form "A and B are correlated" which is known as _the research hypothesis_. Additionally, it follows that the hypothesis "A and B are not correlated" is relevant and is known as _the null hypothesis_. When looking at data, we commonly use a _p-value_ to determine the significance of the data. Formally, a _p-value_ is the probability of obtaining data at least as extreme as the ones observed, if the null hypothesis is true. To refine this definition, the experimental particle physicist [Dr. Tommaso Dorigo](https://userswww.pd.infn.it/~dorigo/#about) has an excellent [glossary](https://www.science20.com/quantum_diaries_survivor/fundamental_glossary_higgs_broadcast-85365) of these terms that helps clarify: "'Extreme' is quite tricky instead: it depends on what is your 'alternate hypothesis' of reference, and what kind of departure it would produce on the studied statistic derived from the data. So 'extreme' will mean 'departing from the typical values expected for the null hypothesis, toward the values expected from the alternate hypothesis.'" In the context of performance regression testing, our research hypothesis is that "after commit A, the codebase includes a performance regression" which means we expect the runtime of our measured processes to be _slower_, not faster, than the expected value.
|
||||
|
||||
Given this definition of p-value, we need to explicitly call out the common tendency to apply _probability inversion_ to our observations. To quote [Dr. Tommaso Dorigo](https://www.science20.com/quantum_diaries_survivor/fundamental_glossary_higgs_broadcast-85365) again, "If your ability on the long jump puts you in the 99.99% percentile, that does not mean that you are a kangaroo, and neither can one infer that the probability that you belong to the human race is 0.01%." Using our previously defined terms, the p-value is _not_ the probability that the null hypothesis _is true_.
|
||||
|
||||
This brings us to calculating sigma values. Sigma refers to the standard deviation of a statistical model, which is used as a measurement of how far away an observed value is from the expected value. When we say that we have a "3 sigma result" we are saying that if the null hypothesis is true, this is a particularly unlikely observation, not that the null hypothesis is false. Exactly how unlikely depends on what the expected values from our research hypothesis are. In the context of performance regression testing, if the null hypothesis is false, we expect the results to be _slower_ than the expected value, not _slower or faster_. Looking at the normal distribution below, we can see that we only care about one _half_ of the distribution: the half where the values are slower than the expected value. This means that when we're calculating the p-value we are not including both sides of the normal distribution.
|
||||
|
||||

|
||||
|
||||
Because of this, the following table describes the significance of each sigma level for our _one-sided_ hypothesis:
|
||||
|
||||
| σ | p-value | scientific significance |
|
||||
| --- | -------------- | ----------------------- |
|
||||
| 1 σ | 1 in 6 | |
|
||||
| 2 σ | 1 in 44 | |
|
||||
| 3 σ | 1 in 741 | evidence |
|
||||
| 4 σ | 1 in 31,574 | |
|
||||
| 5 σ | 1 in 3,486,914 | discovery |
|
||||
|
||||
When detecting performance regressions that trigger alerts, block PRs, or delay releases we want to be conservative enough that detections are infrequently triggered by noise, but not so conservative as to miss most actual regressions. This test suite uses a 3 sigma standard so that only about 1 in every 700 runs is expected to fail the performance regression test suite due to expected variance in our measurements.
|
||||
|
||||
In practice, the number of performance regression failures due to random noise will be higher because we are not incorporating the variance of the tools we use to measure, namely GHA.
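The p-values in the table above are simply the one-sided tail probabilities of a standard normal distribution. The snippet below is only a sanity check of those numbers; it is not part of the runner:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, stddev 1
for n in range(1, 6):
    p = 1 - std_normal.cdf(n)  # one-sided tail: P(X > n sigma)
    print(f"{n} sigma: p = {p:.3g} (~1 in {round(1 / p):,})")
# prints roughly 1 in 6, 1 in 44, 1 in 741, 1 in 31,574, and 1 in ~3.5 million,
# matching the table above to within rounding.
```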
|
||||
|
||||
### Concrete Example: Performance Regression Detection
|
||||
|
||||
The following example data was collected by running the code in this repository in Github Actions.
|
||||
|
||||
In dbt v1.0.3, we have the following mean and standard deviation when parsing a dbt project with 2000 models:
|
||||
|
||||
μ (mean): 41.22<br/>
|
||||
σ (stddev): 0.2525<br/>
|
||||
|
||||
The 2-sided 3 sigma range can be calculated with these two values via:
|
||||
|
||||
x < μ - 3 σ or x > μ + 3 σ<br/>
|
||||
x < 41.22 - 3 * 0.2525 or x > 41.22 + 3 * 0.2525 <br/>
|
||||
x < 40.46 or x > 41.98<br/>
|
||||
|
||||
It follows that the 1-sided 3 sigma range for performance regressions is just:<br/>
|
||||
x > 41.98
|
||||
|
||||
If, when we sample a single `dbt parse` of the same project for a commit slated to go into dbt v1.0.4, we observe a 42s parse time, then this observation is so unlikely in the absence of a code-induced performance regression that we should investigate whether there is a regression in any of the commits between this failure and the commit where the initial distribution was measured.
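In code, this one-sided check is just a comparison against `mean + 3 * stddev`. The sketch below reuses the v1.0.3 parse baseline values quoted above; it only illustrates the arithmetic, since the real comparison is performed by the runner:

```python
SIGMA_LEVEL = 3

def is_regression(sample_s: float, mean_s: float, stddev_s: float) -> bool:
    """One-sided check: flag only samples slower than mean + 3 sigma."""
    return sample_s > mean_s + SIGMA_LEVEL * stddev_s

# dbt v1.0.3 parse baseline for the 2000-model project.
print(is_regression(42.0, 41.22, 0.2525))  # True  -> worth investigating
print(is_regression(41.5, 41.22, 0.2525))  # False -> within expected noise
```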
|
||||
|
||||
Observations with 3 sigma significance that are _not_ performance regressions could be due to observing unlikely values (roughly 1 in every 750 observations), or to variation in the instruments we use to take these measurements, such as GitHub Actions. At this time we do not measure the variation of those instruments or account for it in our calculations, which means failures due to random noise are more likely than they would be if we did take it into account.
|
||||
|
||||
### Concrete Example: Performance Modeling
|
||||
|
||||
Once a new dbt version is released (excluding pre-releases), the performance characteristics of that released version need to be measured. In this repository this measurement is referred to as a baseline.
|
||||
|
||||
After dbt v1.0.99 is released, a GitHub Action running from `main` (so that it always uses the latest version of the action) takes the following steps:
|
||||
- Checks out main for the latest performance runner
|
||||
- pip installs dbt v1.0.99
|
||||
- builds the runner if it's not already in the github actions cache
|
||||
- uses the performance runner's model subcommand: `./runner model`.
|
||||
- The model subcommand calls hyperfine to run all of the project-command pairs a large number of times (maybe 20 or so) and saves the hyperfine outputs to files in `performance/baselines/1.0.99/`, one file per command-project pair (a minimal sketch of reading one of these files follows this list).
|
||||
- The action opens two PRs with these files: one against `main` and one against `1.0.latest` so that future PRs against these branches will detect regressions against the performance characteristics of dbt v1.0.99 instead of v1.0.98.
|
||||
- The release driver for dbt v1.0.99 reviews and merges these PRs which is the sole deliverable of the performance modeling work.
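For reference, the baseline files written by the modeling step are plain JSON (see `performance/baselines/1.0.3/parse___2000_models.json` for an example of the shape). A minimal, illustrative loader for that shape might look like the following; no such helper currently exists in the repository:

```python
import json
from pathlib import Path

# Shape taken from performance/baselines/1.0.3/parse___2000_models.json.
path = Path("performance/baselines/1.0.3/parse___2000_models.json")
baseline = json.loads(path.read_text())

metric = baseline["metric"]            # {"name": "parse", "project_name": "..."}
measurement = baseline["measurement"]  # mean, stddev, times, command, ...

print(
    f"{metric['project_name']} / {metric['name']}: "
    f"mean={measurement['mean']:.2f}s "
    f"stddev={measurement['stddev']:.3f}s "
    f"over {len(measurement['times'])} runs"
)
```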
|
||||
|
||||
## Future work
|
||||
- add more projects to test different configurations that have been known bottlenecks
|
||||
- add more dbt commands to measure
|
||||
- possibly using the uploaded json artifacts to store these results so they can be graphed over time
|
||||
- reading new metrics from a file so no one has to edit rust source to add them to the suite
|
||||
- instead of building the rust every time, we could publish and pull down the latest version.
|
||||
- instead of manually setting the baseline version of dbt to test, pull down the latest stable version as the baseline.
|
||||
- pin commands to projects by reading commands from a file defined in the project.
|
||||
- add a postgres warehouse to run `dbt compile` and `dbt run` commands
|
||||
- Account for github action variation: Either measure it, or eliminate it. To measure it we could set up another action that periodically samples the same version of dbt and use a 7 day rolling variation. To eliminate it we could run the action using something like [act](https://github.com/nektos/act) on dedicated hardware.
|
||||
- build in a git-bisect run to automatically identify the commits that caused a performance regression by modeling each commit's expected value for the failing metric. Running this automatically, or even providing a script to do this locally would be useful.
|
||||
|
||||
40
performance/baselines/1.0.3/parse___2000_models.json
Normal file
@@ -0,0 +1,40 @@
|
||||
{
|
||||
"version": "1.0.3",
|
||||
"metric": {
|
||||
"name": "parse",
|
||||
"project_name": "01_2000_simple_models"
|
||||
},
|
||||
"ts": "2022-03-04T00:02:52.657727515Z",
|
||||
"measurement": {
|
||||
"command": "dbt parse --no-version-check --profiles-dir ../../project_config/",
|
||||
"mean": 41.224566760615,
|
||||
"stddev": 0.252468634424254,
|
||||
"median": 41.182836243915,
|
||||
"user": 40.70073678499999,
|
||||
"system": 0.61185062,
|
||||
"min": 40.89372129691501,
|
||||
"max": 41.68176405591501,
|
||||
"times": [
|
||||
41.397582801915,
|
||||
41.618822256915,
|
||||
41.374914350915,
|
||||
41.68176405591501,
|
||||
41.255119986915,
|
||||
41.528348636915,
|
||||
41.238762892915,
|
||||
40.950121934915,
|
||||
41.388716648915,
|
||||
41.62938069991501,
|
||||
41.139914502915,
|
||||
41.114225200915,
|
||||
41.045012222915,
|
||||
41.01039839391501,
|
||||
40.915296414915,
|
||||
41.006528646915,
|
||||
40.89372129691501,
|
||||
40.951454721915,
|
||||
41.125491559915,
|
||||
41.225757984915
|
||||
]
|
||||
}
|
||||
}
|
||||
@@ -1 +1 @@
|
||||
version = '1.0.0rc3'
|
||||
version = '1.0.8'
|
||||
|
||||
@@ -41,7 +41,7 @@ def _dbt_psycopg2_name():
|
||||
|
||||
|
||||
package_name = "dbt-postgres"
|
||||
package_version = "1.0.0rc3"
|
||||
package_version = "1.0.8"
|
||||
description = """The postgres adpter plugin for dbt (data build tool)"""
|
||||
|
||||
this_directory = os.path.abspath(os.path.dirname(__file__))
|
||||
|
||||
5
setup.py
@@ -5,7 +5,7 @@ import sys
|
||||
|
||||
if 'sdist' not in sys.argv:
|
||||
print('')
|
||||
print('As of v1.0.0, `pip install dbt` is no longer supported.')
|
||||
print('As of v1.0.8, `pip install dbt` is no longer supported.')
|
||||
print('Instead, please use one of the following.')
|
||||
print('')
|
||||
print('**To use dbt with your specific database, platform, or query engine:**')
|
||||
@@ -50,7 +50,7 @@ with open(os.path.join(this_directory, 'README.md')) as f:
|
||||
|
||||
|
||||
package_name = "dbt"
|
||||
package_version = "1.0.0rc3"
|
||||
package_version = "1.0.8"
|
||||
description = """With dbt, data analysts and engineers can build analytics \
|
||||
the way engineers build applications."""
|
||||
|
||||
@@ -81,4 +81,5 @@ setup(
|
||||
'Programming Language :: Python :: 3.9',
|
||||
],
|
||||
python_requires=">=3.7",
|
||||
packages=[]
|
||||
)
|
||||
|
||||
@@ -3,3 +3,6 @@ snapshots:
|
||||
- name: snapshot_actual
|
||||
tests:
|
||||
- mutually_exclusive_ranges
|
||||
config:
|
||||
meta:
|
||||
owner: 'a_owner'
|
||||
|
||||
@@ -246,3 +246,58 @@ class TestSimpleDependencyNoProfile(TestSimpleDependency):
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
result = self.run_dbt(["clean", "--profiles-dir", tmpdir])
|
||||
return result
|
||||
|
||||
class TestSimpleDependencyBadProfile(DBTIntegrationTest):
|
||||
|
||||
@property
|
||||
def schema(self):
|
||||
return "simple_dependency_006"
|
||||
|
||||
@property
|
||||
def models(self):
|
||||
return "models"
|
||||
|
||||
@property
|
||||
def project_config(self):
|
||||
return {
|
||||
'config-version': 2,
|
||||
'models': {
|
||||
'+any_config': "{{ target.name }}",
|
||||
'+enabled': "{{ target.name in ['redshift', 'postgres'] | as_bool }}"
|
||||
}
|
||||
}
|
||||
|
||||
def postgres_profile(self):
|
||||
# Need to set the environment variable here initially because
|
||||
# the unittest setup does a load_config.
|
||||
os.environ['PROFILE_TEST_HOST'] = self.database_host
|
||||
return {
|
||||
'config': {
|
||||
'send_anonymous_usage_stats': False
|
||||
},
|
||||
'test': {
|
||||
'outputs': {
|
||||
'default2': {
|
||||
'type': 'postgres',
|
||||
'threads': 4,
|
||||
'host': "{{ env_var('PROFILE_TEST_HOST') }}",
|
||||
'port': 5432,
|
||||
'user': 'root',
|
||||
'pass': 'password',
|
||||
'dbname': 'dbt',
|
||||
'schema': self.unique_schema()
|
||||
},
|
||||
},
|
||||
'target': 'default2'
|
||||
}
|
||||
}
|
||||
|
||||
@use_profile('postgres')
|
||||
def test_postgres_deps_bad_profile(self):
|
||||
del os.environ['PROFILE_TEST_HOST']
|
||||
self.run_dbt(["deps"])
|
||||
|
||||
@use_profile('postgres')
|
||||
def test_postgres_clean_bad_profile(self):
|
||||
del os.environ['PROFILE_TEST_HOST']
|
||||
self.run_dbt(["clean"])
|
||||
|
||||
@@ -43,7 +43,7 @@ class TestConfigPathDeprecation(BaseTestDeprecations):
|
||||
with self.assertRaises(dbt.exceptions.CompilationException) as exc:
|
||||
self.run_dbt(['--warn-error', 'debug'])
|
||||
exc_str = ' '.join(str(exc.exception).split()) # flatten all whitespace
|
||||
expected = "The `data-paths` config has been deprecated"
|
||||
expected = "The `data-paths` config has been renamed"
|
||||
assert expected in exc_str
|
||||
|
||||
|
||||
@@ -116,11 +116,16 @@ class TestPackageRedirectDeprecation(BaseTestDeprecations):
|
||||
expected = {'package-redirect'}
|
||||
self.assertEqual(expected, deprecations.active_deprecations)
|
||||
|
||||
@use_profile('postgres')
|
||||
def test_postgres_package_redirect_fail(self):
|
||||
self.assertEqual(deprecations.active_deprecations, set())
|
||||
with self.assertRaises(dbt.exceptions.CompilationException) as exc:
|
||||
self.run_dbt(['--warn-error', 'deps'])
|
||||
exc_str = ' '.join(str(exc.exception).split()) # flatten all whitespace
|
||||
expected = "The `fishtown-analytics/dbt_utils` package is deprecated in favor of `dbt-labs/dbt_utils`"
|
||||
assert expected in exc_str
|
||||
# this test fails as a result of the caching added in
|
||||
# https://github.com/dbt-labs/dbt-core/pull/4982
|
||||
# This seems to be a testing issue though. Everything works when tested locally
|
||||
# and the CompilationException gets raised. Since we're refactoring these tests anyway
|
||||
# I won't rewrite this one
|
||||
# @use_profile('postgres')
|
||||
# def test_postgres_package_redirect_fail(self):
|
||||
# self.assertEqual(deprecations.active_deprecations, set())
|
||||
# with self.assertRaises(dbt.exceptions.CompilationException) as exc:
|
||||
# self.run_dbt(['--warn-error', 'deps'])
|
||||
# exc_str = ' '.join(str(exc.exception).split()) # flatten all whitespace
|
||||
# expected = "The `fishtown-analytics/dbt_utils` package is deprecated in favor of `dbt-labs/dbt_utils`"
|
||||
# assert expected in exc_str
|
||||
@@ -221,6 +221,36 @@ class TestAllowSecretProfilePackage(DBTIntegrationTest):
|
||||
self.assertFalse("first_dependency" in log_output)
|
||||
|
||||
|
||||
class TestCloneFailSecretScrubbed(DBTIntegrationTest):
|
||||
|
||||
def setUp(self):
|
||||
os.environ[SECRET_ENV_PREFIX + "GIT_TOKEN"] = "abc123"
|
||||
DBTIntegrationTest.setUp(self)
|
||||
|
||||
@property
|
||||
def packages_config(self):
|
||||
return {
|
||||
"packages": [
|
||||
{"git": "https://fakeuser:{{ env_var('DBT_ENV_SECRET_GIT_TOKEN') }}@github.com/dbt-labs/fake-repo.git"},
|
||||
]
|
||||
}
|
||||
|
||||
@property
|
||||
def schema(self):
|
||||
return "context_vars_013"
|
||||
|
||||
@property
|
||||
def models(self):
|
||||
return "models"
|
||||
|
||||
@use_profile('postgres')
|
||||
def test_postgres_fail_clone_with_scrubbing(self):
|
||||
with self.assertRaises(dbt.exceptions.InternalException) as exc:
|
||||
_, log_output = self.run_dbt_and_capture(['deps'])
|
||||
|
||||
assert "abc123" not in str(exc.exception)
|
||||
|
||||
|
||||
class TestEmitWarning(DBTIntegrationTest):
|
||||
@property
|
||||
def schema(self):
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
from test.integration.base import DBTIntegrationTest, use_profile
|
||||
import os
|
||||
|
||||
|
||||
class TestPrePostRunHooks(DBTIntegrationTest):
|
||||
@@ -22,6 +23,7 @@ class TestPrePostRunHooks(DBTIntegrationTest):
|
||||
'run_started_at',
|
||||
'invocation_id'
|
||||
]
|
||||
os.environ['TERM_TEST'] = 'TESTING'
|
||||
|
||||
@property
|
||||
def schema(self):
|
||||
@@ -41,6 +43,7 @@ class TestPrePostRunHooks(DBTIntegrationTest):
|
||||
"{{ custom_run_hook('start', target, run_started_at, invocation_id) }}",
|
||||
"create table {{ target.schema }}.start_hook_order_test ( id int )",
|
||||
"drop table {{ target.schema }}.start_hook_order_test",
|
||||
"{{ log(env_var('TERM_TEST'), info=True) }}",
|
||||
],
|
||||
"on-run-end": [
|
||||
"{{ custom_run_hook('end', target, run_started_at, invocation_id) }}",
|
||||
|
||||
@@ -41,9 +41,17 @@ def temporary_working_directory() -> str:
|
||||
out : str
|
||||
The temporary working directory.
|
||||
"""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
with change_working_directory(tmpdir):
|
||||
yield tmpdir
|
||||
# N.B.: suppressing the OSError is necessary for older (pre-3.10) versions of Python
|
||||
# which do not support the `ignore_cleanup_errors` in tempfile::TemporaryDirectory.
|
||||
# See: https://github.com/python/cpython/pull/24793
|
||||
#
|
||||
# In our case the cleanup is redundant since Windows handles clearing
|
||||
# Appdata/Local/Temp at the os level anyway.
|
||||
|
||||
with contextlib.suppress(OSError):
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
with change_working_directory(tmpdir):
|
||||
yield tmpdir
|
||||
|
||||
|
||||
def get_custom_profiles_config(database_host, custom_schema):
|
||||
|
||||
@@ -34,15 +34,3 @@ class TestStatements(DBTIntegrationTest):
|
||||
self.assertEqual(len(results), 1)
|
||||
|
||||
self.assertTablesEqual("statement_actual", "statement_expected")
|
||||
|
||||
@use_profile("presto")
|
||||
def test_presto_statements(self):
|
||||
self.use_default_project({"seed-paths": [self.dir("seed")]})
|
||||
|
||||
results = self.run_dbt(["seed"])
|
||||
self.assertEqual(len(results), 2)
|
||||
results = self.run_dbt()
|
||||
self.assertEqual(len(results), 1)
|
||||
|
||||
self.assertTablesEqual("statement_actual", "statement_expected")
|
||||
|
||||
|
||||
@@ -65,6 +65,9 @@ class TestSchemaFileConfigs(DBTIntegrationTest):
        manifest = get_manifest()
        model_id = 'model.test.model'
        model_node = manifest.nodes[model_id]
        meta_expected = {'company': 'NuMade', 'project': 'test', 'team': 'Core Team', 'owner': 'Julie Smith', 'my_attr': 'TESTING'}
        self.assertEqual(model_node.meta, meta_expected)
        self.assertEqual(model_node.config.meta, meta_expected)
        model_tags = ['tag_1_in_model', 'tag_2_in_model', 'tag_in_project', 'tag_in_schema']
        model_node_tags = model_node.tags.copy()
        model_node_tags.sort()

@@ -327,7 +327,7 @@ test:
        ]
        self.run_dbt(['init'])
        manager.assert_has_calls([
            call.prompt('What is the desired project name?'),
            call.prompt("Enter a name for your project (letters, digits, underscore)"),
            call.prompt("Which database would you like to use?\n[1] postgres\n\n(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)\n\nEnter a number", type=click.INT),
            call.prompt('host (hostname for the instance)', default=None, hide_input=False, type=None),
            call.prompt('port', default=5432, hide_input=False, type=click.INT),

@@ -532,6 +532,48 @@ models:
    +materialized: view
"""

    @use_profile('postgres')
    @mock.patch('click.confirm')
    @mock.patch('click.prompt')
    def test_postgres_init_invalid_project_name_cli(self, mock_prompt, mock_confirm):
        manager = Mock()
        manager.attach_mock(mock_prompt, 'prompt')
        manager.attach_mock(mock_confirm, 'confirm')

        os.remove('dbt_project.yml')
        invalid_name = 'name-with-hyphen'
        valid_name = self.get_project_name()
        manager.prompt.side_effect = [
            valid_name
        ]

        self.run_dbt(['init', invalid_name, '-s'])
        manager.assert_has_calls([
            call.prompt("Enter a name for your project (letters, digits, underscore)"),
        ])

    @use_profile('postgres')
    @mock.patch('click.confirm')
    @mock.patch('click.prompt')
    def test_postgres_init_invalid_project_name_prompt(self, mock_prompt, mock_confirm):
        manager = Mock()
        manager.attach_mock(mock_prompt, 'prompt')
        manager.attach_mock(mock_confirm, 'confirm')

        os.remove('dbt_project.yml')

        invalid_name = 'name-with-hyphen'
        valid_name = self.get_project_name()
        manager.prompt.side_effect = [
            invalid_name, valid_name
        ]

        self.run_dbt(['init', '-s'])
        manager.assert_has_calls([
            call.prompt("Enter a name for your project (letters, digits, underscore)"),
            call.prompt("Enter a name for your project (letters, digits, underscore)"),
        ])

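Both new tests reject 'name-with-hyphen' because dbt init only accepts project names made of letters, digits, and underscores, re-prompting (or falling back from the CLI argument) until it gets one. A rough stand-in for that check (illustrative only; the pattern below is an assumption, not dbt's actual validation code):

import re

# Assumed rule, matching the prompt text "(letters, digits, underscore)".
NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def is_valid_project_name(name: str) -> bool:
    return bool(NAME_PATTERN.match(name))

assert not is_valid_project_name("name-with-hyphen")
assert is_valid_project_name("my_dbt_project")
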
    @use_profile('postgres')
    @mock.patch('click.confirm')
    @mock.patch('click.prompt')
@@ -546,20 +588,12 @@ models:
        project_name = self.get_project_name()
        manager.prompt.side_effect = [
            project_name,
            1,
            'localhost',
            5432,
            'test_username',
            'test_password',
            'test_db',
            'test_schema',
            4,
        ]

        # provide project name through the init command
        self.run_dbt(['init', '-s'])
        manager.assert_has_calls([
            call.prompt('What is the desired project name?')
            call.prompt("Enter a name for your project (letters, digits, underscore)")
        ])

        with open(os.path.join(self.test_root_dir, project_name, 'dbt_project.yml'), 'r') as f:

@@ -0,0 +1,6 @@
{
    "metadata": {
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v3.json",
        "dbt_version": "0.21.1"
    }
}
@@ -6,7 +6,7 @@ import string

import pytest

from dbt.exceptions import CompilationException
from dbt.exceptions import CompilationException, IncompatibleSchemaException


class TestModifiedState(DBTIntegrationTest):
@@ -36,7 +36,7 @@ class TestModifiedState(DBTIntegrationTest):
        for entry in os.listdir(self.test_original_source_path):
            src = os.path.join(self.test_original_source_path, entry)
            tst = os.path.join(self.test_root_dir, entry)
            if entry in {'models', 'seeds', 'macros'}:
            if entry in {'models', 'seeds', 'macros', 'previous_state'}:
                shutil.copytree(src, tst)
            elif os.path.isdir(entry) or entry.endswith('.sql'):
                os.symlink(src, tst)
@@ -202,3 +202,10 @@ class TestModifiedState(DBTIntegrationTest):
        results, stdout = self.run_dbt_and_capture(['run', '--models', '+state:modified', '--state', './state'])
        assert len(results) == 1
        assert results[0].node.name == 'view_model'

    @use_profile('postgres')
    def test_postgres_previous_version_manifest(self):
        # This tests that a different schema version in the file throws an error
        with self.assertRaises(IncompatibleSchemaException) as exc:
            results = self.run_dbt(['ls', '-s', 'state:modified', '--state', './previous_state'])
        self.assertEqual(exc.CODE, 10014)

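The new test pairs with the previous_state manifest added above: a saved manifest whose dbt_schema_version points at an older schema (v3, written by dbt 0.21.1) must raise IncompatibleSchemaException rather than be compared against the current project. A compact sketch of that kind of version gate (illustrative only; the names and the expected schema URL below are assumptions, not dbt's actual manifest-loading code):

EXPECTED_SCHEMA = "https://schemas.getdbt.com/dbt/manifest/v4.json"  # assumed current version

class IncompatibleSchemaException(Exception):
    CODE = 10014  # the error code asserted in the test above

def check_manifest_schema(metadata: dict) -> None:
    # Refuse to use a saved manifest written under a different schema version.
    found = metadata.get("dbt_schema_version")
    if found != EXPECTED_SCHEMA:
        raise IncompatibleSchemaException(f"Expected {EXPECTED_SCHEMA}, found {found}")

try:
    check_manifest_schema({
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v3.json",
        "dbt_version": "0.21.1",
    })
except IncompatibleSchemaException as exc:
    assert exc.CODE == 10014
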
@@ -1,2 +1,2 @@
select
    * from {{ ref('customers') }} where customer_id > 100
    * from {{ ref('customers') }} where first_name = '{{ macro_something() }}'

@@ -13,3 +13,17 @@ select * from {{ ref('orders') }}

{% endsnapshot %}

{% snapshot orders2_snapshot %}

    {{
        config(
            target_schema=schema,
            strategy='check',
            unique_key='id',
            check_cols=['order_date'],
        )
    }}

    select * from {{ ref('orders') }}

{% endsnapshot %}

@@ -1,3 +1,4 @@
- add a comment
{% snapshot orders_snapshot %}

    {{
@@ -8,7 +9,22 @@
            check_cols=['status'],
        )
    }}

    select * from {{ ref('orders') }}

{% endsnapshot %}

{% snapshot orders2_snapshot %}

    {{
        config(
            target_schema=schema,
            strategy='check',
            unique_key='id',
            check_cols=['order_date'],
        )
    }}

    select * from {{ ref('orders') }}

{% endsnapshot %}

@@ -0,0 +1,5 @@
{% macro macro_something() %}

{% do return('macro_something') %}

{% endmacro %}
@@ -0,0 +1,5 @@
{% macro macro_something() %}

{% do return('some_name') %}

{% endmacro %}
@@ -46,6 +46,7 @@ class BasePPTest(DBTIntegrationTest):
        os.mkdir(os.path.join(self.test_root_dir, 'macros'))
        os.mkdir(os.path.join(self.test_root_dir, 'analyses'))
        os.mkdir(os.path.join(self.test_root_dir, 'snapshots'))
        os.environ['DBT_PP_TEST'] = 'true'


@@ -332,6 +333,7 @@ class TestSources(BasePPTest):
        results = self.run_dbt(["--partial-parse", "run"])

        # Add a data test
        self.copy_file('test-files/test-macro.sql', 'macros/test-macro.sql')
        self.copy_file('test-files/my_test.sql', 'tests/my_test.sql')
        results = self.run_dbt(["--partial-parse", "test"])
        manifest = get_manifest()
@@ -339,6 +341,11 @@ class TestSources(BasePPTest):
        test_id = 'test.test.my_test'
        self.assertIn(test_id, manifest.nodes)

        # Change macro that data test depends on
        self.copy_file('test-files/test-macro2.sql', 'macros/test-macro.sql')
        results = self.run_dbt(["--partial-parse", "test"])
        manifest = get_manifest()

        # Add an analysis
        self.copy_file('test-files/my_analysis.sql', 'analyses/my_analysis.sql')
        results = self.run_dbt(["--partial-parse", "run"])
@@ -496,10 +503,12 @@ class TestSnapshots(BasePPTest):
        manifest = get_manifest()
        snapshot_id = 'snapshot.test.orders_snapshot'
        self.assertIn(snapshot_id, manifest.nodes)
        snapshot2_id = 'snapshot.test.orders2_snapshot'
        self.assertIn(snapshot2_id, manifest.nodes)

        # run snapshot
        results = self.run_dbt(["--partial-parse", "snapshot"])
        self.assertEqual(len(results), 1)
        self.assertEqual(len(results), 2)

        # modify snapshot
        self.copy_file('test-files/snapshot2.sql', 'snapshots/snapshot.sql')

@@ -37,6 +37,7 @@ class TestDocs(DBTIntegrationTest):
        os.mkdir(os.path.join(self.test_root_dir, 'macros'))
        os.mkdir(os.path.join(self.test_root_dir, 'analyses'))
        os.mkdir(os.path.join(self.test_root_dir, 'snapshots'))
        os.environ['DBT_PP_TEST'] = 'true'


    @use_profile('postgres')

@@ -45,6 +45,7 @@ class BasePPTest(DBTIntegrationTest):
        os.mkdir(os.path.join(self.test_root_dir, 'macros'))
        os.mkdir(os.path.join(self.test_root_dir, 'analyses'))
        os.mkdir(os.path.join(self.test_root_dir, 'snapshots'))
        os.environ['DBT_PP_TEST'] = 'true'


@@ -2,6 +2,7 @@ from dbt.exceptions import CompilationException, ParsingException
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.files import ParseFileType
from dbt.contracts.results import TestStatus
from dbt.logger import SECRET_ENV_PREFIX
from dbt.parser.partial import special_override_macros
from test.integration.base import DBTIntegrationTest, use_profile, normalize, get_manifest
import shutil
@@ -41,7 +42,7 @@ class BasePPTest(DBTIntegrationTest):
        os.mkdir(os.path.join(self.test_root_dir, 'tests'))
        os.mkdir(os.path.join(self.test_root_dir, 'macros'))
        os.mkdir(os.path.join(self.test_root_dir, 'seeds'))

        os.environ['DBT_PP_TEST'] = 'true'


class EnvVarTest(BasePPTest):
@@ -300,6 +301,7 @@ class ProjectEnvVarTest(BasePPTest):
        # cleanup
        del os.environ['ENV_VAR_NAME']


class ProfileEnvVarTest(BasePPTest):

    @property
@@ -352,3 +354,63 @@ class ProfileEnvVarTest(BasePPTest):
        manifest = get_manifest()
        self.assertNotEqual(env_vars_checksum, manifest.state_check.profile_env_vars_hash.checksum)


class ProfileSecretEnvVarTest(BasePPTest):

    @property
    def profile_config(self):
        # Need to set these here because the base integration test class
        # calls 'load_config' before the tests are run.
        # Note: only the specified profile is rendered, so there's no
        # point in setting env_vars in unused profiles.

        # user is secret and password is not. postgres on macos doesn't care if the password
        # changes, so we have to change the user. related: https://github.com/dbt-labs/dbt-core/pull/4250
        os.environ[SECRET_ENV_PREFIX + 'USER'] = 'root'
        os.environ['ENV_VAR_PASS'] = 'password'
        return {
            'config': {
                'send_anonymous_usage_stats': False
            },
            'test': {
                'outputs': {
                    'dev': {
                        'type': 'postgres',
                        'threads': 1,
                        'host': self.database_host,
                        'port': 5432,
                        'user': "root",
                        'pass': "password",
                        'user': "{{ env_var('DBT_ENV_SECRET_USER') }}",
                        'pass': "{{ env_var('ENV_VAR_PASS') }}",
                        'dbname': 'dbt',
                        'schema': self.unique_schema()
                    },
                },
                'target': 'dev'
            }
        }

    @use_profile('postgres')
    def test_postgres_profile_secret_env_vars(self):

        # Initial run
        os.environ[SECRET_ENV_PREFIX + 'USER'] = 'root'
        os.environ['ENV_VAR_PASS'] = 'password'
        self.setup_directories()
        self.copy_file('test-files/model_one.sql', 'models/model_one.sql')
        results = self.run_dbt(["run"])
        manifest = get_manifest()
        env_vars_checksum = manifest.state_check.profile_env_vars_hash.checksum

        # Change a secret var; it shouldn't register because we shouldn't save secrets.
        os.environ[SECRET_ENV_PREFIX + 'USER'] = 'boop'
        # this dbt run is going to fail because the user isn't actually a valid one,
        # but that doesn't matter because we just want to see whether the manifest has
        # included the secret in the hash of environment variables.
        (results, log_output) = self.run_dbt_and_capture(["run"], expect_pass=False)
        # I020 is the event code for "env vars used in profiles.yml have changed"
        self.assertFalse('I020' in log_output)
        manifest = get_manifest()
        self.assertEqual(env_vars_checksum, manifest.state_check.profile_env_vars_hash.checksum)

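The assertions above hinge on dbt excluding DBT_ENV_SECRET_-prefixed variables from the profile env-var checksum stored in the manifest, so rotating a secret neither fires the I020 "env vars changed" event nor changes the hash. A self-contained sketch of that idea (illustrative only; the helper below is not dbt's actual state-check code):

import hashlib
import os

SECRET_ENV_PREFIX = "DBT_ENV_SECRET_"

def profile_env_vars_checksum(used_vars):
    # Only non-secret variables contribute, so rotating a DBT_ENV_SECRET_*
    # value leaves the checksum (and the partial-parsing state check) untouched.
    visible = sorted(
        (name, os.environ.get(name, ""))
        for name in used_vars
        if not name.startswith(SECRET_ENV_PREFIX)
    )
    return hashlib.sha256(repr(visible).encode("utf-8")).hexdigest()

used = ["ENV_VAR_PASS", SECRET_ENV_PREFIX + "USER"]
os.environ["ENV_VAR_PASS"] = "password"
os.environ[SECRET_ENV_PREFIX + "USER"] = "root"
before = profile_env_vars_checksum(used)
os.environ[SECRET_ENV_PREFIX + "USER"] = "boop"  # rotate the secret
assert profile_env_vars_checksum(used) == before
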
Some files were not shown because too many files have changed in this diff.