Compare commits

...

54 Commits

Author SHA1 Message Date
Drew Banin
396f56f82f Explore nicer error messages for sql operations 2022-09-23 17:48:36 -04:00
Chenyu Li
87c9974be1 improve error message for parsing args (#5895)
* improve error message for parsing args

* update error message

* Update models.py

* skip stack_trace for all dbt Exceptions
2022-09-23 13:31:10 -07:00
Emily Rockman
f3f509da92 update disabled metrics/exposures to use add_disabled (#5909)
* update disabled metrics/exposures to use add_disabled

* put back func fo tests

* add pp logic

* switch elif

* fix test name

* return node
2022-09-23 13:24:37 -05:00
dave-connors-3
5e8dcec2c5 update flag to -f (#5908) 2022-09-23 20:20:29 +02:00
Chenyu Li
56783446db PySpark dataframe related tests (#5906)
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
2022-09-22 16:27:27 -07:00
Chenyu Li
207cc0383d initial support for a .dbtignore (#5897) 2022-09-22 09:06:35 -07:00
leahwicz
49ecd6a6a4 Adding missing permissions on GHA (#5870)
* Adding missing permissions on GHA

* Adding read all permissions explicitly
2022-09-21 14:53:27 -04:00
dave-connors-3
c109f39d82 add optional shorthand to full refresh command (#5879)
* add optional shorthand to full refresh command

* changie

* replace full refresh with shorthand in functional test
2022-09-21 14:37:28 -04:00
Ian Knox
fd778dceb5 Profiling and Adapter management work with Click (#5892) 2022-09-21 11:23:26 -05:00
Emily Rockman
e402241e0e Validate exposure names and add label (#5844)
* first pass

* add label and name validation

* changelog

* fix tests

* convert ParsingError to Deprecation

* fix bug where label did not end up in parsed node

* update deprecation msg
2022-09-21 10:24:44 -05:00
Emily Rockman
a6c37c948d Add name validation for metrics (#5841)
* add tests, add name validation

* tweak test

* small update

* change to 250 char limit, tweak message
2022-09-21 09:31:13 -05:00
Daniel Messias
fd886cb7dd ConfigSelectorMethod should check for bools (#5889)
* ConfigSelectorMethod should check for bools

* Add changelog entry

* Add support for lists and test cases

* Typo and formatting in test

* pre-commit linting
2022-09-21 10:20:14 -04:00
James McNeill
b089a471b7 implement type_boolean macro (#5875)
* implement type_boolean macro

* changie result
2022-09-20 19:24:25 -06:00
Emily Rockman
ae294b643b Manifest deserialization error with disabled models (#5891)
* account for disabled nodes in renaming attributes

* tweak naming
2022-09-20 14:27:25 -05:00
colin-rogers-dbt
0bd6df0d1b [CT-1101] 024_custom_schema_tests (#5828)
* create functional custom schema tests

* delete old integration tests

* changie log

* formatting fixes

* delete changie entry and errant comment
2022-09-20 10:42:07 -07:00
Gerda Shank
7b1d61c956 Ct 1191 event history cleanup (#5858)
* Change Exceptions in events to strings. Refactor event_buffer_handling.

* Changie

* fix fire_event call MainEncounteredError

* Set EventBufferFull message when event buffer >= 10,000
2022-09-20 12:44:33 -04:00
Chenyu Li
646a0c704f restrict python submission (#5822)
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2022-09-20 09:39:20 -07:00
Doug Beatty
bbf4fc30a5 Default to current working directory for profiles.yml and fall back to ~/.dbt (#5717)
* Method for capturing standard out during testing (rather than logs)

* Allow dbt exit code assertion to be optional

* Verify priority order to search for profiles.yml configuration

* Updates after pre-commit checks

* Test searching for profiles.yml within the dbt project directory before `~/.dbt/`

* Refactor `dbt debug` to move to the project directory prior to looking up profiles directory

* Search the current working directory for profiles.yml

* Changelog

* Formatting with Black

* Move `run_dbt_and_capture_stdout` into the test case

* Update CLI help text

* Unify separate DEFAULT_PROFILES_DIR definitions

* Remove unused PROFILE_DIR_MESSAGE

* Remove unused DEFAULT_PROFILES_DIR

* Use shared definition of DEFAULT_PROFILES_DIR

* Define global vs. local profiles location and dynamically determine the default

* Restore original

* Remove function for determining the default profiles directory
2022-09-20 08:44:15 -04:00
jared-rimmer
6baaa2bcb0 Add metadata env method to provider context (#5794)
* Add dbt_metadata_envs contextproperty to ProviderContext

* Refactor helper methods into functions

* Fix code quality failures

* Add Changelog
2022-09-20 08:41:44 -04:00
Chenyu Li
13a595722a add support for custom file ending (#5845) 2022-09-19 13:58:38 -07:00
Sam Debruyn
3680b6ad0e remove source quoting setting in adapter tests (#5839)
* remove source quoting setting in adapter tests

* changelog entry

* formatting
2022-09-19 11:00:12 -07:00
Ian Knox
4c29d48d1c Flags work with Click (#5790) 2022-09-19 12:36:38 -05:00
Chenyu Li
e00eb9aa3a fix multiple dbt.config.get in python model (#5850)
* fix multiple dbt.config.get in python model

Co-authored-by: Stu Kilgore <stuart.kilgore@gmail.com>
2022-09-16 15:07:59 -07:00
Jeremy Cohen
f5a94fc774 Back/fwd compatibility for renamed metrics attributes (#5825)
* Naive handling for metric attr renames

* Add tests for bwd/fwd compatibility

* Add deprecation

* Add changelog entry

* PR feedback

* Small fixups

* emmyoop's suggestions

Co-authored-by: Callum McCann <cmccann51@gmail.com>
2022-09-16 11:21:35 +02:00
Sam Debruyn
b98af4ce17 remove key as reserved keyword from test_bool_or (#5818) 2022-09-15 09:44:25 -05:00
Matthew McKnight
b0f8d3d2f1 [CT-1100] 021_test_concurrency test conversion (#5753)
* init push for 021_test_concurrency conversion

* ref to self, delete old integration tests, core passing locally

* creating base class to send setup to snowflake

* making changes to store all setup in core, todo: remove util changes after 1050 is merged

* swap sql seeds to csv

* white space removal

* rewriting seed to see if it fixes issue in snowflake

* attempt to rewrite file for test in snowflake

* update to main

* remove unneeded variable to seeds

* remove unneeded snowflake specific code
2022-09-14 15:49:36 -05:00
Emily Rockman
6c4577f44e Add config to disable metrics/exposures (#5815)
* first pass adding disabled functionality to metrics and exposures

* first pass at getting metrics disabled

* add unsaved file

* fix up comments

* Delete tmp.csv

* fix test

* add exposure logic, fix merge from main

* change when nodes are added to manifest, finish tests

* add changelog

* removed unused code

* minor cleanup
2022-09-14 14:35:38 -05:00
Matthew McKnight
89ee5962f5 [CT-1050] convert 020_ephemeral_test (#5699)
* init file creation for test_ephemeral conversion

* creating base class to run seed through and pass along to classes to test against

* laid out basic flow of tests, need to finish by figuring out how to handle the assertTrue sections and fix error thats occuring

* added creation and comparison of sql and expected result, seeing issue with extra appended test_ on some and issue with errorhandling regarding expect pass

* working on fixing view structure

* update to expected_sql file

* update to expected_sql file

* directory rename, close on all tests need to fix the test_test_ name change for first two tests and figure out why the new test is calling error instead of skipped in status

* renamed expected_sql to include the test_test_ephemeral style name, organized how models are imported into test classes

* move ephemeral functional test to adapter zone

* trying to include the BaseEphemeralMulti class to send to snowflake

* trying to fix snowflake test

* trying to fix snowflake test

* creation of second Base class to feed into others for testing purposes

* found way to check type of warehouse to make data type change for snowflake

* move seed into fixture, to be able to import it from core for adapter tests

* convert to csv and get test passing in core

* remove snowflake specific stuff from util

* remove whitespace

* update to main
2022-09-14 12:00:02 -05:00
Ian Knox
a096202b28 Complete CLI modeling for Click (#5789) 2022-09-14 10:27:47 -05:00
Stu Kilgore
7da7c2d692 Convert default selectors tests to pytest (#5820) 2022-09-13 11:47:13 -05:00
dependabot[bot]
1db48b3cca Bump python from 3.10.6-slim-bullseye to 3.10.7-slim-bullseye in /docker (#5805)
* Bump python from 3.10.6-slim-bullseye to 3.10.7-slim-bullseye in /docker

Bumps python from 3.10.6-slim-bullseye to 3.10.7-slim-bullseye.

---
updated-dependencies:
- dependency-name: python
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Add automated changelog yaml from template for bot PR

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2022-09-12 10:58:24 -07:00
Chenyu Li
567847a5b0 add a base python job helper class (#5802)
* add a base python job helper class

* fix comment, add changelog

* fix quote for adapters
2022-09-12 10:44:19 -07:00
Sam Debruyn
9894c04d38 fix dead anchor link in PR template (#5814)
* fix dead anchor link in PR template

The link did not go to the anchor directly, now it does

* changelog entry
2022-09-12 11:08:30 -04:00
Callum McCann
b26280d1cf Altering Window Metric Attribute To Match Freshness Tests (#5793)
* changing window spec

* more updates

* adding to v7 json?

* chenyu rules

* updating for formatting

* updating metric deferral test
2022-09-09 16:33:32 -07:00
Callum McCann
cfece2cf51 Renaming Attributes In Metric Spec (#5775)
* making updates - see what fails

* updating tests

* adding timestamp to ok_metric_no_model

* adding changie and fixing description error

* test fixes

* updating schema renderer

* fixing test_yaml_render

* file cleaning and window tests
2022-09-09 14:59:52 -04:00
Stu Kilgore
79da002c3c Fix warnings as errors during tests (#5800)
Added RunResultWarningMessage event to support this change.
2022-09-09 12:50:15 -05:00
Bertjan Broeksema
e3f827513f Remove tmp file after test passed (#5749)
* Remove tmp file after test passed

* Add changelog entry
2022-09-09 11:14:43 -04:00
Gerda Shank
10b2a7e7ff Convert test/integration/074_postgres_unlogged_table_tests (#5752)
* Convert test/integration//074_postgres_unlogged_table_tests

* Remove old test
2022-09-08 16:54:03 -04:00
jared-rimmer
82c8d6a7a8 Add invocation_args_dict to ProviderContext (#5782)
* Add invocation_args_to_dict to ProviderContext

* Change invocation_args_to_dict contextproperty to invocation_args_dict

* Fix invocation_args_dict builtin test

* Add CHANGELOG entry

* Fix formatting
2022-09-08 15:31:40 -04:00
Gerda Shank
c994717cbc Call build_flat_graph in merge_from_artifact (#5786) 2022-09-08 15:30:26 -04:00
Callum McCann
e3452b9a8f Add Window Attribute for Metrics (#5722)
* file changes

* changing to window

* adding test

* adding changie for feature

* fixing commits

* fixing tests

* adding timestamp

* fixing graph unparsed

* changing default value
2022-09-07 10:45:08 -04:00
Aram Panasenco
e95e36d63b Include py.typed in MANIFEST.in (#5703)
This enables packages that install dbt-core from pypi to use mypy.
2022-09-06 22:03:12 -04:00
jared-rimmer
74f7416144 Add extra rm command in make clean to remove all .coverage files (#5759)
* Add extra rm command in make clean to remove all .coverage files

* Add Changie entry
2022-09-06 21:36:35 -04:00
Mila Page
1feeb804f4 Update makefile to match our CI (#5763)
* Add structured logging test and provide CI env vars to integration conditionally.
* Add the crazy inline if make feature and ax unneeded variable

Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
2022-09-06 15:42:50 -07:00
Mila Page
0f6e4f0e32 Ct 866/migrate hook tests fixed (#5760)
* Finish converting first test file.
* Finish test conversion.
* Remove old integration hook tests.
* Move location of schema.yml to models directory.
* fix snapshot delete test that was failing
* Add the extra env var check for our CI.
* Add changelog
* Remove naive json flag check and instead force all integration tests to check for environment variables using flag routine.
* Revise the changelog to be more of an explanation.

Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
2022-09-06 12:22:29 -07:00
Emily Rockman
2b44c2b456 Convert experimental parser tests (#5772)
* WIP

* fixed up tests

* remove old test
2022-09-06 14:11:22 -05:00
leahwicz
2bb31ade39 Update release-branch-tests.yml (#5767) 2022-09-06 14:46:53 -04:00
Mila Page
0ce12405c0 Move docs in from Jinja. (#5762)
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
2022-09-06 10:05:05 -07:00
dependabot[bot]
b8c13e05db Bump black from 22.6.0 to 22.8.0 (#5750)
* Bump black from 22.6.0 to 22.8.0

Bumps [black](https://github.com/psf/black) from 22.6.0 to 22.8.0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/22.6.0...22.8.0)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Add automated changelog yaml from template for bot PR

* Remove newline

* Delete Dependency-20220901-154946.yaml

* Add automated changelog yaml from template for bot PR

* Delete Dependency-20220906-132643.yaml

* Add automated changelog yaml from template for bot PR

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
2022-09-06 11:31:13 -05:00
Emily Rockman
64268d2f9b Fix label selection syntax and trigger (#5754)
* Fix label selection syntax and trigger

* update version
2022-09-06 08:21:21 -05:00
Jeremy Cohen
8c8be68701 Roadmap update (Aug 2022) (#5748)
* Add dbt Core roadmap as of August 2022

* Cody intro

* Florian intro

* Lint my markdown

* add blurb on 1.5+ for Python next steps

* Revert "add blurb on 1.5+ for Python next steps"

This reverts commit 1659a5a727.

* PR feedback, self review

Co-authored-by: Cody Peterson <cody.dkdc2@gmail.com>
Co-authored-by: Florian Eiden <florian.eiden@dbtlabs.com>
2022-09-01 00:31:51 +02:00
Doug Beatty
1df713fee9 Test priority order of profiles directory configuration (#5715)
* Method for capturing standard out during testing (rather than logs)

* Allow dbt exit code assertion to be optional

* Verify priority order to search for profiles.yml configuration

* Updates after pre-commit checks

* Move `run_dbt_and_capture_stdout` into the test case
2022-08-30 17:18:17 -06:00
Gerda Shank
758afd4071 ADR for why we're using betterproto for protobuf (#5726) 2022-08-30 12:58:44 -04:00
Jeremy Cohen
0f9200d356 Update team ownership (#5694) 2022-08-30 15:24:34 +02:00
245 changed files with 7028 additions and 2902 deletions


@@ -0,0 +1,7 @@
kind: Breaking Changes
body: Renaming Metric Spec Attributes
time: 2022-09-06T15:45:21.2769-05:00
custom:
  Author: callum-mcdata
  Issue: "5774"
  PR: "5775"


@@ -0,0 +1,7 @@
kind: "Dependency"
body: "Bump black from 22.6.0 to 22.8.0"
time: 2022-09-06T13:48:58.00000Z
custom:
  Author: dependabot[bot]
  Issue: 4904
  PR: 5750


@@ -0,0 +1,7 @@
kind: "Dependency"
body: "Bump python from 3.10.6-slim-bullseye to 3.10.7-slim-bullseye in /docker"
time: 2022-09-12T00:22:53.00000Z
custom:
  Author: dependabot[bot]
  Issue: 4904
  PR: 5805


@@ -0,0 +1,7 @@
kind: Features
body: Search current working directory for `profiles.yml`
time: 2022-08-25T19:50:23.940417-06:00
custom:
  Author: dbeatty10
  Issue: "5411"
  PR: "5717"


@@ -0,0 +1,7 @@
kind: Features
body: Adding the `window` parameter to the metric spec.
time: 2022-08-31T12:13:19.48356-05:00
custom:
  Author: callum-mcdata
  Issue: "5721"
  PR: "5722"


@@ -0,0 +1,7 @@
kind: Features
body: Add invocation args dict to ProviderContext class
time: 2022-09-08T08:13:15.17337+01:00
custom:
  Author: jared-rimmer
  Issue: "5524"
  PR: "5782"


@@ -0,0 +1,7 @@
kind: Features
body: Adds new cli framework
time: 2022-09-08T10:41:49.375734-05:00
custom:
  Author: iknox-fa
  Issue: "5526"
  PR: "5647"


@@ -0,0 +1,7 @@
kind: Features
body: Flags work with new Click CLI
time: 2022-09-08T12:36:50.386978-05:00
custom:
  Author: iknox-fa
  Issue: "5529"
  PR: "5790"


@@ -0,0 +1,7 @@
kind: Features
body: Add metadata env method to ProviderContext class
time: 2022-09-09T20:46:43.889302+01:00
custom:
  Author: jared-rimmer
  Issue: "5522"
  PR: "5794"


@@ -0,0 +1,7 @@
kind: Features
body: Add enabled config to exposures and metrics
time: 2022-09-13T09:59:24.445918-05:00
custom:
  Author: emmyoop
  Issue: "5422"
  PR: "5815"


@@ -0,0 +1,7 @@
kind: Features
body: add -fr flag shorthand
time: 2022-09-19T11:29:03.774678-05:00
custom:
  Author: dave-connors-3
  Issue: "5878"
  PR: "5879"


@@ -0,0 +1,7 @@
kind: Features
body: add type_boolean as a data type macro
time: 2022-09-19T23:14:14.9871+01:00
custom:
  Author: jpmmcneill
  Issue: "5739"
  PR: "5875"


@@ -0,0 +1,7 @@
kind: Features
body: Support .dbtignore in project root to ignore certain files being read by dbt
time: 2022-09-21T14:52:22.131627-07:00
custom:
  Author: ChenyuLInx
  Issue: "5733"
  PR: "5897"


@@ -0,0 +1,8 @@
kind: Fixes
body: Include py.typed in MANIFEST.in. This enables packages that install dbt-core
  from pypi to use mypy.
time: 2022-08-23T11:26:33.8415455-07:00
custom:
  Author: panasenco
  Issue: "5703"
  PR: "5703"


@@ -0,0 +1,7 @@
kind: Fixes
body: Removal of all .coverage files when using make clean command
time: 2022-09-03T15:54:09.741554082+01:00
custom:
  Author: jared-rimmer
  Issue: "5633"
  PR: "5759"


@@ -0,0 +1,7 @@
kind: Fixes
body: Remove temp files generated by unit tests
time: 2022-09-09T09:28:32.590208+02:00
custom:
  Author: bbroeksema
  Issue: "5631"
  PR: "5749"


@@ -0,0 +1,7 @@
kind: Fixes
body: Fix warnings as errors during tests
time: 2022-09-09T09:56:27.90654-05:00
custom:
  Author: stu-k
  Issue: "5424"
  PR: "5800"


@@ -0,0 +1,7 @@
kind: Fixes
body: Prevent event_history from holding references
time: 2022-09-16T09:17:23.273847-04:00
custom:
  Author: gshank
  Issue: "5848"
  PR: "5858"


@@ -0,0 +1,7 @@
kind: Fixes
body: ConfigSelectorMethod should check for bools
time: 2022-09-20T18:18:56.630628+01:00
custom:
  Author: danielcmessias
  Issue: "5890"
  PR: "5889"


@@ -0,0 +1,7 @@
kind: Fixes
body: shorthand for full refresh should be one character
time: 2022-09-22T08:39:26.948671-05:00
custom:
  Author: dave-connors-3
  Issue: "5878"
  PR: "5908"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Migrate integration test 014, fix the snapshot hard-delete test's timezone logic, and force all integration tests to run flags.set_from_args so that environment variables are accessible to all integration test threads.
time: 2022-09-05T00:17:49.564534-07:00
custom:
  Author: versusfacit
  Issue: "5760"
  PR: "5760"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Support dbt-metrics compilation by rebuilding flat_graph
time: 2022-09-08T14:56:44.173322-04:00
custom:
  Author: gshank
  Issue: "5525"
  PR: "5786"


@@ -0,0 +1,8 @@
kind: Under the Hood
body: Reworking the way we define the window attribute of metrics to match freshness
  tests
time: 2022-09-08T18:07:31.532608-05:00
custom:
  Author: callum-mcdata
  Issue: "5722"
  PR: "5793"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Add PythonJobHelper base class in core and add more type checking
time: 2022-09-09T11:52:20.419364-07:00
custom:
  Author: ChenyuLInx
  Issue: "5802"
  PR: "5802"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Convert default selector tests to pytest
time: 2022-09-12T13:40:00.625912-05:00
custom:
  Author: stu-k
  Issue: "5728"
  PR: "5820"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: The link did not go to the anchor directly, now it does
time: 2022-09-12T14:00:35.899828+02:00
custom:
  Author: sdebruyn
  Issue: "5813"
  PR: "5814"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: remove key as reserved keyword from test_bool_or
time: 2022-09-12T19:03:41.481601+02:00
custom:
  Author: sdebruyn
  Issue: "5817"
  PR: "5818"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Compatibility for metric attribute renaming
time: 2022-09-13T11:17:44.953536+02:00
custom:
  Author: jtcohen6 callum-mcdata
  Issue: "5807"
  PR: "5825"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Add name validation for metrics
time: 2022-09-14T13:26:32.387524-05:00
custom:
  Author: emmyoop
  Issue: "5456"
  PR: "5841"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Validate exposure name and add label
time: 2022-09-14T15:00:58.982982-05:00
custom:
  Author: emmyoop
  Issue: "5606"
  PR: "5844"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: remove source quoting setting in adapter tests
time: 2022-09-14T19:39:33.688385+02:00
custom:
  Author: sdebruyn
  Issue: "5836"
  PR: "5839"


@@ -0,0 +1,7 @@
kind: Under the Hood
body: Profiling and Adapter Management work with Click CLI
time: 2022-09-20T14:48:42.070256-05:00
custom:
  Author: iknox-fa
  Issue: "5531"
  PR: "5892"

.github/CODEOWNERS

@@ -16,25 +16,57 @@
# Changes to GitHub configurations including Actions
/.github/ @leahwicz
### LANGUAGE
# Language core modules
/core/dbt/config/ @dbt-labs/core-language
/core/dbt/context/ @dbt-labs/core-language
/core/dbt/contracts/ @dbt-labs/core-language
/core/dbt/deps/ @dbt-labs/core-language
/core/dbt/parser/ @dbt-labs/core-language
/core/dbt/config/ @dbt-labs/core-language
/core/dbt/context/ @dbt-labs/core-language
/core/dbt/contracts/ @dbt-labs/core-language
/core/dbt/deps/ @dbt-labs/core-language
/core/dbt/events/ @dbt-labs/core-language # structured logging
/core/dbt/parser/ @dbt-labs/core-language
# Language misc files
/core/dbt/dataclass_schema.py @dbt-labs/core-language
/core/dbt/hooks.py @dbt-labs/core-language
/core/dbt/node_types.py @dbt-labs/core-language
/core/dbt/semver.py @dbt-labs/core-language
### EXECUTION
# Execution core modules
/core/dbt/events/ @dbt-labs/core-execution @dbt-labs/core-language # eventually remove language but they have knowledge here now
/core/dbt/graph/ @dbt-labs/core-execution
/core/dbt/task/ @dbt-labs/core-execution
/core/dbt/graph/ @dbt-labs/core-execution
/core/dbt/task/ @dbt-labs/core-execution
# Adapter interface, scaffold, Postgres plugin
/core/dbt/adapters @dbt-labs/core-adapters
/core/scripts/create_adapter_plugin.py @dbt-labs/core-adapters
/plugins/ @dbt-labs/core-adapters
# Execution misc files
/core/dbt/compilation.py @dbt-labs/core-execution
/core/dbt/flags.py @dbt-labs/core-execution
/core/dbt/lib.py @dbt-labs/core-execution
/core/dbt/main.py @dbt-labs/core-execution
/core/dbt/profiler.py @dbt-labs/core-execution
/core/dbt/selected_resources.py @dbt-labs/core-execution
/core/dbt/tracking.py @dbt-labs/core-execution
/core/dbt/version.py @dbt-labs/core-execution
# Global project: default macros, including generic tests + materializations
/core/dbt/include/global_project @dbt-labs/core-execution @dbt-labs/core-adapters
### ADAPTERS
# Adapter interface ("base" + "sql" adapter defaults, cache)
/core/dbt/adapters @dbt-labs/core-adapters
# Global project (default macros + materializations), starter project
/core/dbt/include @dbt-labs/core-adapters
# Postgres plugin
/plugins/ @dbt-labs/core-adapters
# Functional tests for adapter plugins
/tests/adapter @dbt-labs/core-adapters
### TESTS
# Overlapping ownership for vast majority of unit + functional tests
# Perf regression testing framework
# This excludes the test project files itself since those aren't specific


@@ -20,4 +20,4 @@ resolves #
- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] This PR includes tests, or tests are not required/relevant for this PR
- [ ] I have [opened an issue to add/update docs](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose), or docs changes are not required/relevant for this PR
- [ ] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#Adding-CHANGELOG-Entry)
- [ ] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-a-changelog-entry)


@@ -28,7 +28,7 @@ name: Bot Changelog
on:
  pull_request:
    # catch when the PR is opened with the label or when the label is added
    types: [opened, labeled]
    types: [labeled]

permissions:
  contents: write
@@ -48,9 +48,9 @@ jobs:
    steps:
      - name: Create and commit changelog on bot PR
        if: "contains(github.event.pull_request.labels.*.name, ${{ matrix.label }})"
        if: ${{ contains(github.event.pull_request.labels.*.name, matrix.label) }}
        id: bot_changelog
        uses: emmyoop/changie_bot@v1.0
        uses: emmyoop/changie_bot@v1.0.1
        with:
          GITHUB_TOKEN: ${{ secrets.FISHTOWN_BOT_PAT }}
          commit_author_name: "Github Build Bot"
@@ -58,4 +58,4 @@ jobs:
          commit_message: "Add automated changelog yaml from template for bot PR"
          changie_kind: ${{ matrix.changie_kind }}
          label: ${{ matrix.label }}
          custom_changelog_string: "custom:\n  Author: ${{ github.event.pull_request.user.login }}\n  Issue: 4904\n  PR: ${{ github.event.pull_request.number }}\n"
          custom_changelog_string: "custom:\n  Author: ${{ github.event.pull_request.user.login }}\n  Issue: 4904\n  PR: ${{ github.event.pull_request.number }}"


@@ -15,6 +15,9 @@ on:
  issues:
    types: [closed, deleted, reopened]

# no special access is needed
permissions: read-all

jobs:
  call-label-action:
    uses: dbt-labs/jira-actions/.github/workflows/jira-transition.yml@main


@@ -39,7 +39,7 @@ jobs:
      max-parallel: 1
      fail-fast: false
      matrix:
        branch: [1.0.latest, 1.1.latest, main]
        branch: [1.0.latest, 1.1.latest, 1.2.latest, main]
    steps:
      - name: Call CI workflow for ${{ matrix.branch }} branch


@@ -20,6 +20,9 @@ on:
        description: 'The release version number (i.e. 1.0.0b1)'
        required: true

permissions:
  contents: write  # this is the permission that allows creating a new release

defaults:
  run:
    shell: bash


@@ -21,6 +21,9 @@ on:
- "*.latest"
- "releases/*"
# no special access is needed
permissions: read-all
env:
LATEST_SCHEMA_PATH: ${{ github.workspace }}/new_schemas
SCHEMA_DIFF_ARTIFACT: ${{ github.workspace }}//schema_schanges.txt


@@ -3,6 +3,10 @@ on:
  schedule:
    - cron: "30 1 * * *"

permissions:
  issues: write
  pull-requests: write

jobs:
  stale:
    runs-on: ubuntu-latest


@@ -20,6 +20,10 @@ on:
        description: 'The version number to bump to (ex. 1.2.0, 1.3.0b1)'
        required: true

permissions:
  contents: write
  pull-requests: write

jobs:
  bump:
    runs-on: ubuntu-latest

.gitignore

@@ -95,6 +95,7 @@ venv/
# vscode
.vscode/
*.code-workspace
# poetry
pyproject.toml


@@ -6,7 +6,7 @@ exclude: ^test/
# Force all unspecified python hooks to run python 3.8
default_language_version:
  python: python3.8
  python: python3

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks


@@ -7,7 +7,9 @@
3. [Setting up an environment](#setting-up-an-environment)
4. [Running `dbt` in development](#running-dbt-core-in-development)
5. [Testing dbt-core](#testing)
6. [Submitting a Pull Request](#submitting-a-pull-request)
6. [Debugging](#debugging)
7. [Adding a changelog entry](#adding-a-changelog-entry)
8. [Submitting a Pull Request](#submitting-a-pull-request)
## About this document
@@ -21,7 +23,8 @@ If you get stuck, we're happy to help! Drop us a line in the `#dbt-core-developm
- **Adapters:** Is your issue or proposed code change related to a specific [database adapter](https://docs.getdbt.com/docs/available-adapters)? If so, please open issues, PRs, and discussions in that adapter's repository instead. The sole exception is Postgres; the `dbt-postgres` plugin lives in this repository (`dbt-core`).
- **CLA:** Please note that anyone contributing code to `dbt-core` must sign the [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements). If you are unable to sign the CLA, the `dbt-core` maintainers will unfortunately be unable to merge any of your Pull Requests. We welcome you to participate in discussions, open issues, and comment on existing ones.
- **Branches:** All pull requests from community contributors should target the `main` branch (default). If the change is needed as a patch for a minor version of dbt that has already been released (or is already a release candidate), a maintainer will backport the changes in your PR to the relevant "latest" release branch (`1.0.latest`, `1.1.latest`, ...)
- **Branches:** All pull requests from community contributors should target the `main` branch (default). If the change is needed as a patch for a minor version of dbt that has already been released (or is already a release candidate), a maintainer will backport the changes in your PR to the relevant "latest" release branch (`1.0.latest`, `1.1.latest`, ...). If an issue fix applies to a release branch, that fix should be first committed to the development branch and then to the release branch (rarely release-branch fixes may not apply to `main`).
- **Releases**: Before releasing a new minor version of Core, we prepare a series of alphas and release candidates to allow users (especially employees of dbt Labs!) to test the new version in live environments. This is an important quality assurance step, as it exposes the new code to a wide variety of complicated deployments and can surface bugs before official release. Releases are accessible via pip, homebrew, and dbt Cloud.
## Getting the code
@@ -41,7 +44,9 @@ If you are not a member of the `dbt-labs` GitHub organization, you can contribut
### dbt Labs contributors
If you are a member of the `dbt-labs` GitHub organization, you will have push access to the `dbt-core` repo. Rather than forking `dbt-core` to make your changes, just clone the repository, check out a new branch, and push directly to that branch.
If you are a member of the `dbt-labs` GitHub organization, you will have push access to the `dbt-core` repo. Rather than forking `dbt-core` to make your changes, just clone the repository, check out a new branch, and push directly to that branch. Branch names should be prefixed with `CT-XXX/` where:
* CT stands for 'core team'
* XXX stands for a JIRA ticket number
## Setting up an environment
@@ -151,7 +156,7 @@ Check out the other targets in the Makefile to see other commonly used test
suites.
#### `pre-commit`
[`pre-commit`](https://pre-commit.com) takes care of running all code-checks for formatting and linting. Run `make dev` to install `pre-commit` in your local environment. Once this is done you can use any of the linter-based make targets as well as a git pre-commit hook that will ensure proper formatting and linting.
[`pre-commit`](https://pre-commit.com) takes care of running all code-checks for formatting and linting. Run `make dev` to install `pre-commit` in your local environment (we recommend running this command with a python virtual environment active). This command installs several pip executables including black, mypy, and flake8. Once this is done you can use any of the linter-based make targets as well as a git pre-commit hook that will ensure proper formatting and linting.
#### `tox`
@@ -174,7 +179,29 @@ python3 -m pytest tests/functional/sources
> See [pytest usage docs](https://docs.pytest.org/en/6.2.x/usage.html) for an overview of useful command-line options.
## Adding CHANGELOG Entry
### Unit, Integration, Functional?
Here are some general rules for adding tests:
* unit tests (`test/unit` & `tests/unit`) don't need to access a database; "pure Python" tests should be written as unit tests
* functional tests (`test/integration` & `tests/functional`) cover anything that interacts with a database, namely adapter
* *everything in* `test/*` *is being steadily migrated to* `tests/*`
## Debugging
1. The logs for a `dbt run` have stack traces and other information for debugging errors (in `logs/dbt.log` in your project directory).
2. Try using a debugger, like `ipdb`. For pytest: `--pdb --pdbcls=IPython.terminal.debugger:pdb`
3. Sometimes, it's easier to debug on a single thread: `dbt --single-threaded run`
4. To make print statements from Jinja macros: `{{ log(msg, info=true) }}`
5. You can also add `{{ debug() }}` statements, which will drop you into some auto-generated code that the macro wrote.
6. The dbt “artifacts” are written out to the target directory of your dbt project. They are in unformatted json, which can be hard to read. Format them with:
> python -m json.tool target/run_results.json > run_results.json
### Assorted development tips
* Append `# type: ignore` to the end of a line if you need to disable `mypy` on that line.
* Sometimes flake8 complains about lines that are actually fine, in which case you can put a comment on the line such as: # noqa or # noqa: ANNN, where ANNN is the error code that flake8 issues.
* To collect output for `CProfile`, run dbt with the `-r` option and the name of an output file, i.e. `dbt -r dbt.cprof run`. If you just want to profile parsing, you can do: `dbt -r dbt.cprof parse`. `pip` install `snakeviz` to view the output. Run `snakeviz dbt.cprof` and output will be rendered in a browser window.
## Adding a CHANGELOG Entry
We use [changie](https://changie.dev) to generate `CHANGELOG` entries. **Note:** Do not edit the `CHANGELOG.md` directly. Your modifications will be lost.
@@ -186,8 +213,10 @@ You don't need to worry about which `dbt-core` version your change will go into.
## Submitting a Pull Request
A `dbt-core` maintainer will review your PR. They may suggest code revision for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.
Code can be merged into the current development branch `main` by opening a pull request. A `dbt-core` maintainer will review your PR. They may suggest code revision for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.
Automated tests run via GitHub Actions. If you're a first-time contributor, all tests (including code checks and unit tests) will require a maintainer to approve. Changes in the `dbt-core` repository trigger integration tests against Postgres. dbt Labs also provides CI environments in which to test changes to other adapters, triggered by PRs in those adapters' repositories, as well as periodic maintenance checks of each adapter in concert with the latest `dbt-core` code changes.
Once all tests are passing and your PR has been approved, a `dbt-core` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:
Sometimes, the Contributor License Agreement auto-check bot doesn't find a user's entry in its roster. If you need to force a rerun, add `@cla-bot check` in a comment on the pull request.


@@ -6,6 +6,19 @@ ifeq ($(USE_DOCKER),true)
DOCKER_CMD := docker-compose run --rm test
endif
LOGS_DIR := ./logs
# Optional flag to invoke tests using our CI env.
# But we always want these active for structured
# log testing.
CI_FLAGS =\
DBT_TEST_USER_1=dbt_test_user_1\
DBT_TEST_USER_2=dbt_test_user_2\
DBT_TEST_USER_3=dbt_test_user_3\
RUSTFLAGS="-D warnings"\
LOG_DIR=./logs\
DBT_LOG_FORMAT=json
.PHONY: dev
dev: ## Installs dbt-* packages in develop mode along with development dependencies.
@\
@@ -48,13 +61,20 @@ test: .env ## Runs unit tests with py and code checks against staged changes.
.PHONY: integration
integration: .env ## Runs postgres integration tests with py-integration
@\
$(DOCKER_CMD) tox -e py-integration -- -nauto
$(if $(USE_CI_FLAGS), $(CI_FLAGS)) $(DOCKER_CMD) tox -e py-integration -- -nauto
.PHONY: integration-fail-fast
integration-fail-fast: .env ## Runs postgres integration tests with py-integration in "fail fast" mode.
@\
$(DOCKER_CMD) tox -e py-integration -- -x -nauto
.PHONY: interop
interop: clean
@\
mkdir $(LOGS_DIR) && \
$(CI_FLAGS) $(DOCKER_CMD) tox -e py-integration -- -nauto && \
LOG_DIR=$(LOGS_DIR) cargo run --manifest-path test/interop/log_parsing/Cargo.toml
.PHONY: setup-db
setup-db: ## Setup Postgres database with docker-compose for system testing.
@\
@@ -76,6 +96,7 @@ endif
clean: ## Resets development environment.
@echo 'cleaning repo...'
@rm -f .coverage
@rm -f .coverage.*
@rm -rf .eggs/
@rm -f .env
@rm -rf .tox/


@@ -1 +1,2 @@
recursive-include dbt/include *.py *.sql *.yml *.html *.md .gitkeep .gitignore
include dbt/py.typed


@@ -0,0 +1,10 @@
## Base adapters
### impl.py
The class `SQLAdapter` in [base/impl.py](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/adapters/base/impl.py) is a (mostly) abstract class that adapter classes inherit from. The base class scaffolds out the methods that every adapter project usually should implement for smooth communication between dbt and the database.
Some target databases require more or fewer methods; it all depends on the warehouse's feature set.
Look into the class for function-level comments.
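As a rough sketch of the inheritance pattern this README describes, an adapter plugin subclasses `SQLAdapter` and fills in the warehouse-specific pieces. The class name and return value below are invented for illustration; `date_function` is one of the methods the base class expects an adapter to supply:

from dbt.adapters.sql import SQLAdapter


class MyWarehouseAdapter(SQLAdapter):  # hypothetical adapter class
    @classmethod
    def date_function(cls) -> str:
        # warehouse-specific "current timestamp" expression
        return "current_timestamp()"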


@@ -10,5 +10,5 @@ from dbt.adapters.base.relation import ( # noqa
    SchemaSearchMap,
)
from dbt.adapters.base.column import Column  # noqa
from dbt.adapters.base.impl import AdapterConfig, BaseAdapter  # noqa
from dbt.adapters.base.impl import AdapterConfig, BaseAdapter, PythonJobHelper  # noqa
from dbt.adapters.base.plugin import AdapterPlugin  # noqa


@@ -12,6 +12,7 @@ class Column:
"TIMESTAMP": "TIMESTAMP",
"FLOAT": "FLOAT",
"INTEGER": "INT",
"BOOLEAN": "BOOLEAN",
}
column: str
dtype: str


@@ -60,6 +60,7 @@ from dbt.adapters.base.relation import (
    SchemaSearchMap,
)
from dbt.adapters.base import Column as BaseColumn
from dbt.adapters.base import Credentials
from dbt.adapters.cache import RelationsCache, _make_key
@@ -127,6 +128,35 @@ def _relation_name(rel: Optional[BaseRelation]) -> str:
    return str(rel)


def log_code_execution(code_execution_function):
    # decorator to log code and execution time
    if code_execution_function.__name__ != "submit_python_job":
        raise ValueError("this should be only used to log submit_python_job now")

    def execution_with_log(*args):
        self = args[0]
        connection_name = self.connections.get_thread_connection().name
        fire_event(CodeExecution(conn_name=connection_name, code_content=args[2]))
        start_time = time.time()
        response = code_execution_function(*args)
        fire_event(
            CodeExecutionStatus(
                status=response._message, elapsed=round((time.time() - start_time), 2)
            )
        )
        return response

    return execution_with_log


class PythonJobHelper:
    def __init__(self, parsed_model: Dict, credential: Credentials) -> None:
        raise NotImplementedError("PythonJobHelper is not implemented yet")

    def submit(self, compiled_code: str) -> Any:
        raise NotImplementedError("PythonJobHelper submit function is not implemented yet")


class BaseAdapter(metaclass=AdapterMeta):
    """The BaseAdapter provides an abstract base class for adapters.
@@ -1182,9 +1212,36 @@ class BaseAdapter(metaclass=AdapterMeta):
        return sql

    @available.parse_none
    def submit_python_job(self, parsed_model: dict, compiled_code: str):
        raise NotImplementedException("`submit_python_job` is not implemented for this adapter!")

    @property
    def python_submission_helpers(self) -> Dict[str, Type[PythonJobHelper]]:
        raise NotImplementedError("python_submission_helpers is not specified")

    @property
    def default_python_submission_method(self) -> str:
        raise NotImplementedError("default_python_submission_method is not specified")

    @log_code_execution
    def submit_python_job(self, parsed_model: dict, compiled_code: str) -> AdapterResponse:
        submission_method = parsed_model["config"].get(
            "submission_method", self.default_python_submission_method
        )
        if submission_method not in self.python_submission_helpers:
            raise NotImplementedError(
                "Submission method {} is not supported for current adapter".format(
                    submission_method
                )
            )
        job_helper = self.python_submission_helpers[submission_method](
            parsed_model, self.connections.profile.credentials
        )
        submission_result = job_helper.submit(compiled_code)
        # process submission result to generate adapter response
        return self.generate_python_submission_response(submission_result)

    def generate_python_submission_response(self, submission_result: Any) -> AdapterResponse:
        raise NotImplementedException(
            "Your adapter need to implement generate_python_submission_response"
        )
    def valid_incremental_strategies(self):
        """The set of standard builtin strategies which this adapter supports out-of-the-box.
@@ -1274,24 +1331,3 @@ def catch_as_completed(
            # exc is not None, derives from Exception, and isn't ctrl+c
            exceptions.append(exc)
    return merge_tables(tables), exceptions


def log_code_execution(code_execution_function):
    # decorator to log code and execution time
    if code_execution_function.__name__ != "submit_python_job":
        raise ValueError("this should be only used to log submit_python_job now")

    def execution_with_log(*args):
        self = args[0]
        connection_name = self.connections.get_thread_connection().name
        fire_event(CodeExecution(conn_name=connection_name, code_content=args[2]))
        start_time = time.time()
        response = code_execution_function(*args)
        fire_event(
            CodeExecutionStatus(
                status=response._message, elapsed=round((time.time() - start_time), 2)
            )
        )
        return response

    return execution_with_log
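Taken together, the new code in this file defines a plugin contract for Python models: an adapter exposes its helper classes via python_submission_helpers, names a default via default_python_submission_method, and submit_python_job dispatches on the model's submission_method config. A minimal sketch of an adapter opting in; both classes and the "dummy" method name are invented for illustration, and the typing imports mirror those already used above:

from typing import Any, Dict, Type


class DummyJobHelper(PythonJobHelper):  # hypothetical helper
    def __init__(self, parsed_model: Dict, credential: Credentials) -> None:
        self.parsed_model = parsed_model
        self.credential = credential

    def submit(self, compiled_code: str) -> Any:
        # hand the compiled Python model off to some execution backend
        ...


class DummyAdapter(BaseAdapter):  # hypothetical adapter
    @property
    def python_submission_helpers(self) -> Dict[str, Type[PythonJobHelper]]:
        return {"dummy": DummyJobHelper}

    @property
    def default_python_submission_method(self) -> str:
        return "dummy"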


@@ -1,23 +1,17 @@
import threading
from pathlib import Path
from contextlib import contextmanager
from importlib import import_module
from typing import Type, Dict, Any, List, Optional, Set
from pathlib import Path
from typing import Any, Dict, List, Optional, Set, Type
from dbt.exceptions import RuntimeException, InternalException
from dbt.include.global_project import (
    PACKAGE_PATH as GLOBAL_PROJECT_PATH,
    PROJECT_NAME as GLOBAL_PROJECT_NAME,
)
from dbt.adapters.base.plugin import AdapterPlugin
from dbt.adapters.protocol import AdapterConfig, AdapterProtocol, RelationProtocol
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
from dbt.events.functions import fire_event
from dbt.events.types import AdapterImportError, PluginLoadError
from dbt.contracts.connection import Credentials, AdapterRequiredConfig
from dbt.adapters.protocol import (
    AdapterProtocol,
    AdapterConfig,
    RelationProtocol,
)
from dbt.adapters.base.plugin import AdapterPlugin
from dbt.exceptions import InternalException, RuntimeException
from dbt.include.global_project import PACKAGE_PATH as GLOBAL_PROJECT_PATH
from dbt.include.global_project import PROJECT_NAME as GLOBAL_PROJECT_NAME
Adapter = AdapterProtocol
@@ -64,7 +58,7 @@ class AdapterContainer:
            # if we failed to import the target module in particular, inform
            # the user about it via a runtime error
            if exc.name == "dbt.adapters." + name:
                fire_event(AdapterImportError(exc=exc))
                fire_event(AdapterImportError(exc=str(exc)))
                raise RuntimeException(f"Could not find adapter type {name}!")
            # otherwise, the error had to have come from some underlying
            # library. Log the stack trace.
@@ -217,3 +211,12 @@ def get_adapter_package_names(name: Optional[str]) -> List[str]:
def get_adapter_type_names(name: Optional[str]) -> List[str]:
    return FACTORY.get_adapter_type_names(name)


@contextmanager
def adapter_management():
    reset_adapters()
    try:
        yield
    finally:
        cleanup_connections()
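adapter_management gives callers a scoped adapter lifecycle: adapters are reset on entry and connections are cleaned up on exit, even if an exception is raised. A minimal usage sketch (do_dbt_work is a placeholder, not a real function):

from dbt.adapters.factory import adapter_management

with adapter_management():
    do_dbt_work()  # placeholder; connections opened here are cleaned up on exit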

core/dbt/cli/README.md

@@ -0,0 +1 @@
TODO

core/dbt/cli/__init__.py (new, empty file)

core/dbt/cli/flags.py

@@ -0,0 +1,44 @@
# TODO Move this to /core/dbt/flags.py when we're ready to break things
import os
from dataclasses import dataclass
from multiprocessing import get_context
from pprint import pformat as pf

from click import get_current_context

if os.name != "nt":
    # https://bugs.python.org/issue41567
    import multiprocessing.popen_spawn_posix  # type: ignore  # noqa: F401


@dataclass(frozen=True)
class Flags:
    def __init__(self, ctx=None) -> None:

        if ctx is None:
            ctx = get_current_context()

        def assign_params(ctx):
            """Recursively adds all click params to flag object"""
            for param_name, param_value in ctx.params.items():
                # N.B. You have to use the base MRO method (object.__setattr__) to set attributes
                # when using frozen dataclasses.
                # https://docs.python.org/3/library/dataclasses.html#frozen-instances
                if hasattr(self, param_name):
                    raise Exception(f"Duplicate flag names found in click command: {param_name}")
                object.__setattr__(self, param_name.upper(), param_value)
            if ctx.parent:
                assign_params(ctx.parent)

        assign_params(ctx)

        # Hard coded flags
        object.__setattr__(self, "WHICH", ctx.info_name)
        object.__setattr__(self, "MP_CONTEXT", get_context("spawn"))

        # Support console DO NOT TRACK initiative
        # (compare against the string "1", not the int 1, since getenv returns str)
        if os.getenv("DO_NOT_TRACK", "").lower() in ("1", "t", "true", "y", "yes"):
            object.__setattr__(self, "ANONYMOUS_USAGE_STATS", False)

    def __str__(self) -> str:
        return str(pf(self.__dict__))
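Because Flags() reads the active Click context, it is normally constructed inside a command callback, where each ctx.params entry surfaces as an upper-cased attribute. A sketch with a hypothetical command (not part of the module above):

import click
from dbt.cli.flags import Flags


@click.command()
@click.option("--debug/--no-debug")
def example(**kwargs):
    flags = Flags()  # picks up ctx.params from the current Click context
    click.echo(flags.DEBUG)  # the "debug" param, upper-cased


if __name__ == "__main__":
    example()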

core/dbt/cli/main.py

@@ -0,0 +1,412 @@
import inspect # This is temporary for RAT-ing
from copy import copy
from pprint import pformat as pf # This is temporary for RAT-ing
import click
from dbt.adapters.factory import adapter_management
from dbt.cli import params as p
from dbt.cli.flags import Flags
from dbt.profiler import profiler
def cli_runner():
    # Alias "list" to "ls"
    ls = copy(cli.commands["list"])
    ls.hidden = True
    cli.add_command(ls, "ls")

    # Run the cli
    cli()
# dbt
@click.group(
context_settings={"help_option_names": ["-h", "--help"]},
invoke_without_command=True,
no_args_is_help=True,
epilog="Specify one of these sub-commands and you can find more help from there.",
)
@click.pass_context
@p.anonymous_usage_stats
@p.cache_selected_only
@p.debug
@p.enable_legacy_logger
@p.event_buffer_size
@p.fail_fast
@p.log_cache_events
@p.log_format
@p.macro_debugging
@p.partial_parse
@p.print
@p.printer_width
@p.quiet
@p.record_timing_info
@p.static_parser
@p.use_colors
@p.use_experimental_parser
@p.version
@p.version_check
@p.warn_error
@p.write_json
def cli(ctx, **kwargs):
    """An ELT tool for managing your SQL transformations and data models.

    For more documentation on these commands, visit: docs.getdbt.com
    """
    incomplete_flags = Flags()

    # Profiling
    if incomplete_flags.RECORD_TIMING_INFO:
        ctx.with_resource(profiler(enable=True, outfile=incomplete_flags.RECORD_TIMING_INFO))

    # Adapter management
    ctx.with_resource(adapter_management())

    # Version info
    if incomplete_flags.VERSION:
        click.echo(f"`version` called\n ctx.params: {pf(ctx.params)}")
        return
    else:
        del ctx.params["version"]
# dbt build
@cli.command("build")
@click.pass_context
@p.defer
@p.exclude
@p.fail_fast
@p.full_refresh
@p.indirect_selection
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.show
@p.state
@p.store_failures
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
def build(ctx, **kwargs):
    """Run all Seeds, Models, Snapshots, and tests in DAG order"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt clean
@cli.command("clean")
@click.pass_context
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
def clean(ctx, **kwargs):
    """Delete all folders in the clean-targets list (usually the dbt_packages and target directories.)"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt docs
@cli.group()
@click.pass_context
def docs(ctx, **kwargs):
    """Generate or serve the documentation website for your project"""
# dbt docs generate
@docs.command("generate")
@click.pass_context
@p.compile_docs
@p.defer
@p.exclude
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
def docs_generate(ctx, **kwargs):
    """Generate the documentation website for your project"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt docs serve
@docs.command("serve")
@click.pass_context
@p.browser
@p.port
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
def docs_serve(ctx, **kwargs):
    """Serve the documentation website for your project"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt compile
@cli.command("compile")
@click.pass_context
@p.defer
@p.exclude
@p.full_refresh
@p.log_path
@p.models
@p.parse_only
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
def compile(ctx, **kwargs):
    """Generates executable SQL from source, model, test, and analysis files. Compiled SQL files are written to the target/ directory."""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt debug
@cli.command("debug")
@click.pass_context
@p.config_dir
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
@p.version_check
def debug(ctx, **kwargs):
    """Show some helpful information about dbt for debugging. Not to be confused with the --debug option which increases verbosity."""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt deps
@cli.command("deps")
@click.pass_context
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
def deps(ctx, **kwargs):
    """Pull the most recent version of the dependencies listed in packages.yml"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt init
@cli.command("init")
@click.pass_context
@p.profile
@p.profiles_dir
@p.project_dir
@p.skip_profile_setup
@p.target
@p.vars
def init(ctx, **kwargs):
    """Initialize a new DBT project."""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt list
@cli.command("list")
@click.pass_context
@p.exclude
@p.indirect_selection
@p.models
@p.output
@p.output_keys
@p.profile
@p.profiles_dir
@p.project_dir
@p.resource_type
@p.selector
@p.state
@p.target
@p.vars
def list(ctx, **kwargs):
    """List the resources in your project"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt parse
@cli.command("parse")
@click.pass_context
@p.compile_parse
@p.log_path
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
@p.write_manifest
def parse(ctx, **kwargs):
    """Parses the project and provides information on performance"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt run
@cli.command("run")
@click.pass_context
@p.defer
@p.exclude
@p.fail_fast
@p.full_refresh
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
def run(ctx, **kwargs):
    """Compile SQL and execute against the current target database."""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt run operation
@cli.command("run-operation")
@click.pass_context
@p.args
@p.profile
@p.profiles_dir
@p.project_dir
@p.target
@p.vars
def run_operation(ctx, **kwargs):
    """Run the named macro with any supplied arguments."""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt seed
@cli.command("seed")
@click.pass_context
@p.exclude
@p.full_refresh
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.show
@p.state
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
def seed(ctx, **kwargs):
    """Load data from csv files into your data warehouse."""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt snapshot
@cli.command("snapshot")
@click.pass_context
@p.defer
@p.exclude
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.state
@p.target
@p.threads
@p.vars
def snapshot(ctx, **kwargs):
    """Execute snapshots defined in your project"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt source
@cli.group()
@click.pass_context
def source(ctx, **kwargs):
    """Manage your project's sources"""
# dbt source freshness
@source.command("freshness")
@click.pass_context
@p.exclude
@p.models
@p.output_path # TODO: Is this ok to re-use? We have three different output params, how much can we consolidate?
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.state
@p.target
@p.threads
@p.vars
def freshness(ctx, **kwargs):
    """Snapshots the current freshness of the project's sources"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# dbt test
@cli.command("test")
@click.pass_context
@p.defer
@p.exclude
@p.fail_fast
@p.indirect_selection
@p.log_path
@p.models
@p.profile
@p.profiles_dir
@p.project_dir
@p.selector
@p.state
@p.store_failures
@p.target
@p.target_path
@p.threads
@p.vars
@p.version_check
def test(ctx, **kwargs):
    """Runs tests on data in deployed models. Run this after `dbt run`"""
    flags = Flags()
    click.echo(f"`{inspect.stack()[0][3]}` called\n flags: {flags}")
# Support running as a module
if __name__ == "__main__":
    cli_runner()
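Since the module exposes a standard Click group, the stubbed commands above can be exercised with Click's test runner; a sketch (the printed output is simply whatever the stub echoes):

from click.testing import CliRunner

from dbt.cli.main import cli

runner = CliRunner()
result = runner.invoke(cli, ["run", "--fail-fast"])
print(result.output)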


@@ -0,0 +1,33 @@
from click import ParamType
import yaml


class YAML(ParamType):
    """The Click YAML type. Converts YAML strings into objects."""

    name = "YAML"

    def convert(self, value, param, ctx):
        # assume non-string values are a problem
        if not isinstance(value, str):
            self.fail(f"Cannot load YAML from type {type(value)}", param, ctx)
        try:
            return yaml.load(value, Loader=yaml.Loader)
        except yaml.parser.ParserError:
            self.fail(f"String '{value}' is not valid YAML", param, ctx)


class Truthy(ParamType):
    """The Click Truthy type. Converts strings into a "truthy" type"""

    name = "TRUTHY"

    def convert(self, value, param, ctx):
        # assume non-string / non-None values are a problem
        # (isinstance needs type(None), not None, in its tuple of types)
        if not isinstance(value, (str, type(None))):
            self.fail(f"Cannot load TRUTHY from type {type(value)}", param, ctx)
        if value is None or value.lower() in ("0", "false", "f"):
            return None
        else:
            return value
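A usage sketch for these param types, mirroring how params.py (below) attaches YAML() to --args; the command itself is hypothetical:

import click

from dbt.cli.option_types import YAML


@click.command()
@click.option("--args", type=YAML())
def show(args):
    click.echo(repr(args))  # e.g. '{my_variable: my_value}' parses to a dict


if __name__ == "__main__":
    show()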

core/dbt/cli/params.py

@@ -0,0 +1,386 @@
from pathlib import Path, PurePath
import click
from dbt.cli.option_types import YAML
# TODO: The name (reflected in flags) is a correction!
# The original name was `SEND_ANONYMOUS_USAGE_STATS` and used an env var called "DBT_SEND_ANONYMOUS_USAGE_STATS"
# Both of which break existing naming conventions (doesn't match param flag).
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
anonymous_usage_stats = click.option(
"--anonymous-usage-stats/--no-anonymous-usage-stats",
envvar="DBT_ANONYMOUS_USAGE_STATS",
help="Send anonymous usage stats to dbt Labs.",
default=True,
)
args = click.option(
"--args",
envvar=None,
help="Supply arguments to the macro. This dictionary will be mapped to the keyword arguments defined in the selected macro. This argument should be a YAML string, eg. '{my_variable: my_value}'",
type=YAML(),
)
browser = click.option(
"--browser/--no-browser",
envvar=None,
help="Wether or not to open a local web browser after starting the server",
default=True,
)
cache_selected_only = click.option(
"--cache-selected-only/--no-cache-selected-only",
envvar="DBT_CACHE_SELECTED_ONLY",
help="Pre cache database objects relevant to selected resource only.",
)
compile_docs = click.option(
"--compile/--no-compile",
envvar=None,
help="Wether or not to run 'dbt compile' as part of docs generation",
default=True,
)
compile_parse = click.option(
"--compile/--no-compile",
envvar=None,
help="TODO: No help text currently available",
default=True,
)
config_dir = click.option(
"--config-dir",
envvar=None,
help="If specified, DBT will show path information for this project",
type=click.STRING,
)
debug = click.option(
"--debug/--no-debug",
"-d/ ",
envvar="DBT_DEBUG",
help="Display debug logging during dbt execution. Useful for debugging and making bug reports.",
)
# TODO: The env var and name (reflected in flags) are corrections!
# The original name was `DEFER_MODE` and used an env var called "DBT_DEFER_TO_STATE"
# Both of which break existing naming conventions.
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
defer = click.option(
"--defer/--no-defer",
envvar="DBT_DEFER",
help="If set, defer to the state variable for resolving unselected nodes.",
)
enable_legacy_logger = click.option(
"--enable-legacy-logger/--no-enable-legacy-logger",
envvar="DBT_ENABLE_LEGACY_LOGGER",
hidden=True,
)
event_buffer_size = click.option(
"--event-buffer-size",
envvar="DBT_EVENT_BUFFER_SIZE",
help="Sets the max number of events to buffer in EVENT_HISTORY.",
default=100000,
type=click.INT,
)
exclude = click.option("--exclude", envvar=None, help="Specify the nodes to exclude.")
fail_fast = click.option(
"--fail-fast/--no-fail-fast",
"-x/ ",
envvar="DBT_FAIL_FAST",
help="Stop execution on first failure.",
)
full_refresh = click.option(
"--full-refresh",
"-f",
envvar="DBT_FULL_REFRESH",
help="If specified, dbt will drop incremental models and fully-recalculate the incremental table from the model definition.",
is_flag=True,
)
indirect_selection = click.option(
"--indirect-selection",
envvar="DBT_INDIRECT_SELECTION",
help="Select all tests that are adjacent to selected resources, even if they those resources have been explicitly selected.",
type=click.Choice(["eager", "cautious"], case_sensitive=False),
default="eager",
)
log_cache_events = click.option(
"--log-cache-events/--no-log-cache-events",
help="Enable verbose adapter cache logging.",
envvar="DBT_LOG_CACHE_EVENTS",
)
log_format = click.option(
"--log-format",
envvar="DBT_LOG_FORMAT",
help="Specify the log format, overriding the command's default.",
type=click.Choice(["text", "json", "default"], case_sensitive=False),
default="default",
)
log_path = click.option(
"--log-path",
envvar="DBT_LOG_PATH",
help="Configure the 'log-path'. Only applies this setting for the current run. Overrides the 'DBT_LOG_PATH' if it is set.",
type=click.Path(),
)
macro_debugging = click.option(
"--macro-debugging/--no-macro-debugging",
envvar="DBT_MACRO_DEBUGGING",
hidden=True,
)
models = click.option(
"-m",
"-s",
"models",
envvar=None,
help="Specify the nodes to include.",
multiple=True,
)
output = click.option(
"--output",
envvar=None,
help="TODO: No current help text",
type=click.Choice(["json", "name", "path", "selector"], case_sensitive=False),
default="name",
)
output_keys = click.option(
"--output-keys", envvar=None, help="TODO: No current help text", type=click.STRING
)
output_path = click.option(
"--output",
"-o",
envvar=None,
help="Specify the output path for the json report. By default, outputs to 'target/sources.json'",
type=click.Path(file_okay=True, dir_okay=False, writable=True),
default=PurePath.joinpath(Path.cwd(), "target/sources.json"),
)
parse_only = click.option(
"--parse-only",
envvar=None,
help="TODO: No help text currently available",
is_flag=True,
)
partial_parse = click.option(
"--partial-parse/--no-partial-parse",
envvar="DBT_PARTIAL_PARSE",
help="Allow for partial parsing by looking for and writing to a pickle file in the target directory. This overrides the user configuration file.",
default=True,
)
port = click.option(
"--port",
envvar=None,
help="Specify the port number for the docs server",
default=8080,
type=click.INT,
)
# TODO: The env var and name (reflected in flags) are corrections!
# The original name was `NO_PRINT` and used the env var `DBT_NO_PRINT`.
# Both of which break existing naming conventions.
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
print = click.option(
"--print/--no-print",
envvar="DBT_PRINT",
help="Output all {{ print() }} macro calls.",
default=True,
)
printer_width = click.option(
"--printer-width",
envvar="DBT_PRINTER_WIDTH",
help="Sets the width of terminal output",
type=click.INT,
default=80,
)
profile = click.option(
"--profile",
envvar=None,
help="Which profile to load. Overrides setting in dbt_project.yml.",
)
profiles_dir = click.option(
"--profiles-dir",
envvar="DBT_PROFILES_DIR",
help="Which directory to look in for the profiles.yml file. If not set, dbt will look in the current working directory first, then HOME/.dbt/",
default=PurePath.joinpath(Path.home(), ".dbt"),
type=click.Path(
exists=True,
),
)
project_dir = click.option(
"--project-dir",
envvar=None,
help="Which directory to look in for the dbt_project.yml file. Default is the current working directory and its parents.",
default=Path.cwd(),
type=click.Path(exists=True),
)
quiet = click.option(
"--quiet/--no-quiet",
envvar="DBT_QUIET",
help="Suppress all non-error logging to stdout. Does not affect {{ print() }} macro calls.",
)
record_timing_info = click.option(
"--record-timing-info",
"-r",
envvar=None,
help="When this option is passed, dbt will output low-level timing stats to the specified file. Example: `--record-timing-info output.profile`",
type=click.Path(exists=False),
)
resource_type = click.option(
"--resource-type",
envvar=None,
help="TODO: No current help text",
type=click.Choice(
[
"metric",
"source",
"analysis",
"model",
"test",
"exposure",
"snapshot",
"seed",
"default",
"all",
],
case_sensitive=False,
),
default="default",
)
selector = click.option(
"--selector", envvar=None, help="The selector name to use, as defined in selectors.yml"
)
show = click.option(
"--show", envvar=None, help="Show a sample of the loaded data in the terminal", is_flag=True
)
skip_profile_setup = click.option(
"--skip-profile-setup", "-s", envvar=None, help="Skip interative profile setup.", is_flag=True
)
# TODO: The env var and name (reflected in flags) are corrections!
# The original name was `ARTIFACT_STATE_PATH` and used the env var `DBT_ARTIFACT_STATE_PATH`.
# Both of which break existing naming conventions.
# This will need to be fixed before use in the main codebase and communicated as a change to the community!
state = click.option(
"--state",
envvar="DBT_STATE",
help="If set, use the given directory as the source for json files to compare with this project.",
type=click.Path(
dir_okay=True,
exists=True,
file_okay=False,
readable=True,
resolve_path=True,
),
)
static_parser = click.option(
"--static-parser/--no-static-parser",
envvar="DBT_STATIC_PARSER",
help="Use the static parser.",
default=True,
)
store_failures = click.option(
"--store-failures",
envvar="DBT_STORE_FAILURES",
help="Store test results (failing rows) in the database",
is_flag=True,
)
target = click.option(
"--target", "-t", envvar=None, help="Which target to load for the given profile"
)
target_path = click.option(
"--target-path",
envvar="DBT_TARGET_PATH",
help="Configure the 'target-path'. Only applies this setting for the current run. Overrides the 'DBT_TARGET_PATH' if it is set.",
type=click.Path(),
)
threads = click.option(
"--threads",
envvar=None,
help="Specify number of threads to use while executing models. Overrides settings in profiles.yml.",
default=1,
type=click.INT,
)
use_colors = click.option(
"--use-colors/--no-use-colors",
envvar="DBT_USE_COLORS",
help="Output is colorized by default and may also be set in a profile or at the command line.",
default=True,
)
use_experimental_parser = click.option(
"--use-experimental-parser/--no-use-experimental-parser",
envvar="DBT_USE_EXPERIMENTAL_PARSER",
help="Enable experimental parsing features.",
)
vars = click.option(
"--vars",
envvar=None,
help="Supply variables to the project. This argument overrides variables defined in your dbt_project.yml file. This argument should be a YAML string, eg. '{my_variable: my_value}'",
type=YAML(),
)
version = click.option(
"--version",
envvar=None,
help="Show version information",
is_flag=True,
)
version_check = click.option(
"--version-check/--no-version-check",
envvar="DBT_VERSION_CHECK",
help="Ensure dbt's version matches the one specified in the dbt_project.yml file ('require-dbt-version')",
default=True,
)
warn_error = click.option(
"--warn-error/--no-warn-error",
envvar="DBT_WARN_ERROR",
help="If dbt would normally warn, instead raise an exception. Examples include --models that selects nothing, deprecations, configurations with no associated models, invalid test configurations, and missing sources/refs in tests.",
)
write_json = click.option(
"--write-json/--no-write-json",
envvar="DBT_WRITE_JSON",
help="Writing the manifest and run_results.json files to disk",
default=True,
)
write_manifest = click.option(
"--write-manifest/--no-write-manifest",
envvar=None,
help="TODO: No help text currently available",
default=True,
)
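Each name in this module is a preconfigured `click.option` decorator, so commands consume them by stacking. A hedged sketch, assuming the module is importable as `dbt.cli.params`; the `run` command itself is illustrative, not part of this changeset:
```
# Illustrative only: composing the shared option decorators onto a command.
import click
from dbt.cli import params

@click.command()
@params.fail_fast
@params.full_refresh
@params.threads
def run(fail_fast, full_refresh, threads):
    # Click derives the parameter names from the long flag names above
    click.echo(f"fail_fast={fail_fast} full_refresh={full_refresh} threads={threads}")
```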


@@ -1 +1,19 @@
# Clients README
### Jinja
#### How are materializations defined
Model materializations are kept in `core/dbt/include/global_project/macros/materializations/models/`. Materializations are defined using syntax that isn't part of the Jinja standard library. These tags are referenced internally, and materializations can be overridden in user projects when users have specific needs.
```
-- Pseudocode for arguments
{% materialization <name>, <target name := one_of{default, adapter}> %}
{% endmaterialization %}
```
These blocks are referred to as Jinja extensions. Extensions are defined as part of the accepted Jinja code encapsulated within a dbt project. This includes system code used internally by dbt and user space (i.e. user-defined) macros. Extensions exist to help Jinja users create reusable code blocks or abstract objects--for us, materializations are a great use-case since we pass these around as arguments within dbt system code.
The code that defines this extension is a class `MaterializationExtension` with a `parse` routine; it lives in [clients/jinja.py](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/clients/jinja.py). The routine enables Jinja to parse (i.e. recognize) the unique comma-separated arg structure our `materialization` tags exhibit (the `table, default` as seen above).


@@ -27,6 +27,7 @@ from dbt.utils import (
from dbt.clients._jinja_blocks import BlockIterator, BlockData, BlockTag
from dbt.contracts.graph.compiled import CompiledGenericTestNode
from dbt.contracts.graph.parsed import ParsedGenericTestNode
from dbt.exceptions import (
InternalException,
raise_compiler_error,
@@ -305,13 +306,13 @@ class MacroGenerator(BaseMacroGenerator):
@contextmanager
def track_call(self):
# This is only called from __call__
if self.stack is None or self.node is None:
if self.stack is None:
yield
else:
unique_id = self.macro.unique_id
depth = self.stack.depth
# only mark depth=0 as a dependency
if depth == 0:
# only mark depth=0 as a dependency, when creating this dependency we don't pass in stack
if depth == 0 and self.node:
self.node.depends_on.add_macro(unique_id)
self.stack.push(unique_id)
try:


@@ -12,6 +12,7 @@ import tarfile
import requests
import stat
from typing import Type, NoReturn, List, Optional, Dict, Any, Tuple, Callable, Union
from pathspec import PathSpec # type: ignore
from dbt.events.functions import fire_event
from dbt.events.types import (
@@ -36,6 +37,7 @@ def find_matching(
root_path: str,
relative_paths_to_search: List[str],
file_pattern: str,
ignore_spec: Optional[PathSpec] = None,
) -> List[Dict[str, Any]]:
"""
Given an absolute `root_path`, a list of relative paths to that
@@ -57,19 +59,30 @@ def find_matching(
reobj = re.compile(regex, re.IGNORECASE)
for relative_path_to_search in relative_paths_to_search:
# potential speedup for ignore_spec
# if ignore_spec.matches(relative_path_to_search):
# continue
absolute_path_to_search = os.path.join(root_path, relative_path_to_search)
walk_results = os.walk(absolute_path_to_search)
for current_path, subdirectories, local_files in walk_results:
# potential speedup for ignore_spec
# relative_dir = os.path.relpath(current_path, root_path) + os.sep
# if ignore_spec.match(relative_dir):
# continue
for local_file in local_files:
absolute_path = os.path.join(current_path, local_file)
relative_path = os.path.relpath(absolute_path, absolute_path_to_search)
relative_path_to_root = os.path.join(relative_path_to_search, relative_path)
modification_time = 0.0
try:
modification_time = os.path.getmtime(absolute_path)
except OSError:
fire_event(SystemErrorRetrievingModTime(path=absolute_path))
if reobj.match(local_file):
if reobj.match(local_file) and (
not ignore_spec or not ignore_spec.match_file(relative_path_to_root)
):
matching.append(
{
"searched_path": relative_path_to_search,
@@ -164,7 +177,7 @@ def write_file(path: str, contents: str = "") -> bool:
reason = "Path was possibly too long"
# all our hard work and the path was still too long. Log and
# continue.
fire_event(SystemCouldNotWrite(path=path, reason=reason, exc=exc))
fire_event(SystemCouldNotWrite(path=path, reason=reason, exc=str(exc)))
else:
raise
return True
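The new `ignore_spec` parameter is a `pathspec.PathSpec`. A hedged sketch of how a `.dbtignore` file could be loaded into one; `from_lines` with the `gitwildmatch` style is the pathspec library's documented API, while the file name and sample path are illustrative:
```
# Illustrative only: building the PathSpec that find_matching consumes.
from pathspec import PathSpec

with open(".dbtignore") as f:
    ignore_spec = PathSpec.from_lines("gitwildmatch", f.read().splitlines())

# match_file is the same check used inside find_matching above
print(ignore_spec.match_file("models/scratch/tmp_model.sql"))
```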


@@ -183,6 +183,7 @@ class Compiler:
context = generate_runtime_model_context(node, self.config, manifest)
context.update(extra_context)
if isinstance(node, CompiledGenericTestNode):
# for test nodes, add a special keyword args value to the context
jinja.add_rendered_test_kwargs(context, node)


@@ -23,8 +23,6 @@ from .renderer import ProfileRenderer
DEFAULT_THREADS = 1
DEFAULT_PROFILES_DIR = os.path.join(os.path.expanduser("~"), ".dbt")
INVALID_PROFILE_MESSAGE = """
dbt encountered an error while trying to read your profiles.yml file.
@@ -44,7 +42,7 @@ defined in your profiles.yml file. You can find profiles.yml here:
{profiles_file}/profiles.yml
""".format(
profiles_file=DEFAULT_PROFILES_DIR
profiles_file=flags.DEFAULT_PROFILES_DIR
)


@@ -380,6 +380,8 @@ class PartialProject(RenderComponents):
snapshots: Dict[str, Any]
sources: Dict[str, Any]
tests: Dict[str, Any]
metrics: Dict[str, Any]
exposures: Dict[str, Any]
vars_value: VarProvider
dispatch = cfg.dispatch
@@ -388,6 +390,8 @@ class PartialProject(RenderComponents):
snapshots = cfg.snapshots
sources = cfg.sources
tests = cfg.tests
metrics = cfg.metrics
exposures = cfg.exposures
if cfg.vars is None:
vars_dict: Dict[str, Any] = {}
else:
@@ -441,6 +445,8 @@ class PartialProject(RenderComponents):
query_comment=query_comment,
sources=sources,
tests=tests,
metrics=metrics,
exposures=exposures,
vars=vars_value,
config_version=cfg.config_version,
unrendered=unrendered,
@@ -543,6 +549,8 @@ class Project:
snapshots: Dict[str, Any]
sources: Dict[str, Any]
tests: Dict[str, Any]
metrics: Dict[str, Any]
exposures: Dict[str, Any]
vars: VarProvider
dbt_version: List[VersionSpecifier]
packages: Dict[str, Any]
@@ -615,6 +623,8 @@ class Project:
"snapshots": self.snapshots,
"sources": self.sources,
"tests": self.tests,
"metrics": self.metrics,
"exposures": self.exposures,
"vars": self.vars.to_dict(),
"require-dbt-version": [v.to_version_string() for v in self.dbt_version],
"config-version": self.config_version,


@@ -105,6 +105,8 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
query_comment=project.query_comment,
sources=project.sources,
tests=project.tests,
metrics=project.metrics,
exposures=project.exposures,
vars=project.vars,
config_version=project.config_version,
unrendered=project.unrendered,
@@ -274,6 +276,8 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
"snapshots": self._get_config_paths(self.snapshots),
"sources": self._get_config_paths(self.sources),
"tests": self._get_config_paths(self.tests),
"metrics": self._get_config_paths(self.metrics),
"exposures": self._get_config_paths(self.exposures),
}
def get_unused_resource_config_paths(
@@ -477,6 +481,8 @@ class UnsetProfileConfig(RuntimeConfig):
"snapshots": self.snapshots,
"sources": self.sources,
"tests": self.tests,
"metrics": self.metrics,
"exposures": self.exposures,
"vars": self.vars.to_dict(),
"require-dbt-version": [v.to_version_string() for v in self.dbt_version],
"config-version": self.config_version,
@@ -537,6 +543,8 @@ class UnsetProfileConfig(RuntimeConfig):
query_comment=project.query_comment,
sources=project.sources,
tests=project.tests,
metrics=project.metrics,
exposures=project.exposures,
vars=project.vars,
config_version=project.config_version,
unrendered=project.unrendered,


@@ -2,7 +2,7 @@
Contexts are used for Jinja rendering. They include context methods, executable macros, and various settings that are available in Jinja.
The most common entrypoint to Jinja rendering in dbt is a method named `get_rendered`, which takes two arguments: templated code (string), and a context used to render it (dictionary).
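A hedged sketch of that entrypoint in isolation; the import path reflects where `get_rendered` lives in `dbt.clients.jinja`, and the template is illustrative:
```
# Illustrative only: rendering a template against a minimal context.
from dbt.clients.jinja import get_rendered

rendered = get_rendered("select * from {{ table_name }}", {"table_name": "events"})
assert rendered == "select * from events"
```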
The context is the bundle of information that is in "scope" when rendering Jinja-templated code. For instance, imagine a simple Jinja template:
```


@@ -43,9 +43,12 @@ class UnrenderedConfig(ConfigSource):
model_configs = unrendered.get("sources")
elif resource_type == NodeType.Test:
model_configs = unrendered.get("tests")
elif resource_type == NodeType.Metric:
model_configs = unrendered.get("metrics")
elif resource_type == NodeType.Exposure:
model_configs = unrendered.get("exposures")
else:
model_configs = unrendered.get("models")
if model_configs is None:
return {}
else:
@@ -65,6 +68,10 @@ class RenderedConfig(ConfigSource):
model_configs = self.project.sources
elif resource_type == NodeType.Test:
model_configs = self.project.tests
elif resource_type == NodeType.Metric:
model_configs = self.project.metrics
elif resource_type == NodeType.Exposure:
model_configs = self.project.exposures
else:
model_configs = self.project.models
return model_configs


@@ -4,6 +4,7 @@ from dbt.clients.jinja import MacroStack
from dbt.contracts.connection import AdapterRequiredConfig
from dbt.contracts.graph.manifest import Manifest
from dbt.context.macro_resolver import TestMacroNamespace
from .base import contextproperty
from .configured import ConfiguredContext
@@ -66,6 +67,10 @@ class ManifestContext(ConfiguredContext):
dct.update(self.namespace)
return dct
@contextproperty
def context_macro_stack(self):
return self.macro_stack
class QueryHeaderContext(ManifestContext):
def __init__(self, config: AdapterRequiredConfig, manifest: Manifest) -> None:


@@ -41,6 +41,7 @@ from dbt.contracts.graph.parsed import (
ParsedSourceDefinition,
)
from dbt.contracts.graph.metrics import MetricReference, ResolvedMetricReference
from dbt.contracts.util import get_metadata_env
from dbt.exceptions import (
CompilationException,
ParsingException,
@@ -63,7 +64,7 @@ from dbt.exceptions import (
from dbt.config import IsFQNResource
from dbt.node_types import NodeType, ModelLanguage
from dbt.utils import merge, AttrDict, MultiDict
from dbt.utils import merge, AttrDict, MultiDict, args_to_dict
from dbt import selected_resources
@@ -710,6 +711,14 @@ class ProviderContext(ManifestContext):
self.model,
)
@contextproperty
def dbt_metadata_envs(self) -> Dict[str, str]:
return get_metadata_env()
@contextproperty
def invocation_args_dict(self):
return args_to_dict(self.config.args)
@contextproperty
def _sql_results(self) -> Dict[str, AttrDict]:
return self.sql_results
@@ -1240,6 +1249,19 @@ class ProviderContext(ManifestContext):
"""
return selected_resources.SELECTED_RESOURCES
@contextmember
def submit_python_job(self, parsed_model: Dict, compiled_code: str) -> AdapterResponse:
# Check macro_stack and that the unique id is for a materialization macro
if not (
self.context_macro_stack.depth == 2
and self.context_macro_stack.call_stack[1] == "macro.dbt.statement"
and "materialization" in self.context_macro_stack.call_stack[0]
):
raise RuntimeException(
f"submit_python_job is not intended to be called here, at model {parsed_model['alias']}, with macro call_stack {self.context_macro_stack.call_stack}."
)
return self.adapter.submit_python_job(parsed_model, compiled_code)
class MacroContext(ProviderContext):
"""Internally, macros can be executed like nodes, with some restrictions:


@@ -114,25 +114,34 @@ class FileHash(dbtClassMixin):
@dataclass
class RemoteFile(dbtClassMixin):
def __init__(self, language) -> None:
if language == "sql":
self.path_end = ".sql"
elif language == "python":
self.path_end = ".py"
else:
raise RuntimeError(f"Invalid language for remote File {language}")
self.path = f"from remote system{self.path_end}"
@property
def searched_path(self) -> str:
return "from remote system"
return self.path
@property
def relative_path(self) -> str:
return "from remote system"
return self.path
@property
def absolute_path(self) -> str:
return "from remote system"
return self.path
@property
def original_file_path(self):
return "from remote system"
return self.path
@property
def modification_time(self):
return "from remote system"
return self.path
@dataclass
@@ -202,9 +211,9 @@ class SourceFile(BaseSourceFile):
# TODO: do this a different way. This remote file kludge isn't going
# to work long term
@classmethod
def remote(cls, contents: str, project_name: str) -> "SourceFile":
def remote(cls, contents: str, project_name: str, language: str) -> "SourceFile":
self = cls(
path=RemoteFile(),
path=RemoteFile(language),
checksum=FileHash.from_contents(contents),
project_name=project_name,
contents=contents,
@@ -268,11 +277,13 @@ class SchemaSourceFile(BaseSourceFile):
self.tests[key][name] = []
self.tests[key][name].append(node_unique_id)
# this is only used in unit tests
def remove_tests(self, yaml_key, name):
if yaml_key in self.tests:
if name in self.tests[yaml_key]:
del self.tests[yaml_key][name]
# this is only used in tests (unit + functional)
def get_tests(self, yaml_key, name):
if yaml_key in self.tests:
if name in self.tests[yaml_key]:


@@ -33,6 +33,7 @@ from dbt.contracts.graph.parsed import (
ParsedMacro,
ParsedDocumentation,
ParsedSourceDefinition,
ParsedGenericTestNode,
ParsedExposure,
ParsedMetric,
HasUniqueID,
@@ -216,7 +217,7 @@ class MetricLookup(dbtClassMixin):
return manifest.metrics[unique_id]
# This handles both models/seeds/snapshots and sources
# This handles both models/seeds/snapshots and sources/metrics/exposures
class DisabledLookup(dbtClassMixin):
def __init__(self, manifest: "Manifest"):
self.storage: Dict[str, Dict[PackageName, List[Any]]] = {}
@@ -464,7 +465,7 @@ class Disabled(Generic[D]):
target: D
MaybeMetricNode = Optional[ParsedMetric]
MaybeMetricNode = Optional[Union[ParsedMetric, Disabled[ParsedMetric]]]
MaybeDocumentation = Optional[ParsedDocumentation]
@@ -616,7 +617,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
flat_graph: Dict[str, Any] = field(default_factory=dict)
state_check: ManifestStateCheck = field(default_factory=ManifestStateCheck)
source_patches: MutableMapping[SourceKey, SourcePatch] = field(default_factory=dict)
disabled: MutableMapping[str, List[CompileResultNode]] = field(default_factory=dict)
disabled: MutableMapping[str, List[GraphMemberNode]] = field(default_factory=dict)
env_vars: MutableMapping[str, str] = field(default_factory=dict)
_doc_lookup: Optional[DocLookup] = field(
@@ -964,13 +965,22 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
current_project: str,
node_package: str,
) -> MaybeMetricNode:
metric: Optional[ParsedMetric] = None
disabled: Optional[List[ParsedMetric]] = None
candidates = _search_packages(current_project, node_package, target_metric_package)
for pkg in candidates:
metric = self.metric_lookup.find(target_metric_name, pkg, self)
if metric is not None:
if metric is not None and metric.config.enabled:
return metric
# it's possible that the node is disabled
if disabled is None:
disabled = self.disabled_lookup.find(f"{target_metric_name}", pkg)
if disabled:
return Disabled(disabled[0])
return None
# Called by DocsRuntimeContext.doc
@@ -1018,6 +1028,10 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
merged.add(unique_id)
self.nodes[unique_id] = node.replace(deferred=True)
# Rebuild the flat_graph, which powers the 'graph' context variable,
# now that we've deferred some nodes
self.build_flat_graph()
# log up to 5 items
sample = list(islice(merged, 5))
fire_event(MergedFromState(nbr_merged=len(merged), sample=sample))
@@ -1089,7 +1103,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
self.metrics[metric.unique_id] = metric
source_file.metrics.append(metric.unique_id)
def add_disabled_nofile(self, node: CompileResultNode):
def add_disabled_nofile(self, node: GraphMemberNode):
# There can be multiple disabled nodes for the same unique_id
if node.unique_id in self.disabled:
self.disabled[node.unique_id].append(node)
@@ -1099,8 +1113,13 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
def add_disabled(self, source_file: AnySourceFile, node: CompileResultNode, test_from=None):
self.add_disabled_nofile(node)
if isinstance(source_file, SchemaSourceFile):
assert test_from
source_file.add_test(node.unique_id, test_from)
if isinstance(node, ParsedGenericTestNode):
assert test_from
source_file.add_test(node.unique_id, test_from)
if isinstance(node, ParsedMetric):
source_file.metrics.append(node.unique_id)
if isinstance(node, ParsedExposure):
source_file.exposures.append(node.unique_id)
else:
source_file.nodes.append(node.unique_id)


@@ -49,13 +49,17 @@ class ResolvedMetricReference(MetricReference):
@classmethod
def reverse_dag_parsing(cls, metric_node, manifest, metric_depth_count):
if metric_node.type == "expression":
if metric_node.calculation_method == "derived":
yield {metric_node.name: metric_depth_count}
metric_depth_count = metric_depth_count + 1
for parent_unique_id in metric_node.depends_on.nodes:
node = manifest.metrics.get(parent_unique_id)
if node and node.resource_type == NodeType.Metric and node.type == "expression":
if (
node
and node.resource_type == NodeType.Metric
and node.calculation_method == "derived"
):
yield from cls.reverse_dag_parsing(node, manifest, metric_depth_count)
def full_metric_dependency(self):
@@ -67,7 +71,7 @@ class ResolvedMetricReference(MetricReference):
to_return = []
for metric in in_scope_metrics:
if metric.type != "expression" and metric.name not in to_return:
if metric.calculation_method != "derived" and metric.name not in to_return:
to_return.append(metric.name)
return to_return
@@ -77,7 +81,7 @@ class ResolvedMetricReference(MetricReference):
to_return = []
for metric in in_scope_metrics:
if metric.type == "expression" and metric.name not in to_return:
if metric.calculation_method == "derived" and metric.name not in to_return:
to_return.append(metric.name)
return to_return


@@ -363,6 +363,16 @@ class BaseConfig(AdditionalPropertiesAllowed, Replaceable):
return self.from_dict(dct)
@dataclass
class MetricConfig(BaseConfig):
enabled: bool = True
@dataclass
class ExposureConfig(BaseConfig):
enabled: bool = True
@dataclass
class SourceConfig(BaseConfig):
enabled: bool = True
@@ -613,6 +623,8 @@ class SnapshotConfig(EmptySnapshotConfig):
RESOURCE_TYPES: Dict[NodeType, Type[BaseConfig]] = {
NodeType.Metric: MetricConfig,
NodeType.Exposure: ExposureConfig,
NodeType.Source: SourceConfig,
NodeType.Seed: SeedConfig,
NodeType.Test: TestConfig,


@@ -37,6 +37,7 @@ from dbt.contracts.graph.unparsed import (
ExposureType,
MaturityType,
MetricFilter,
MetricTime,
)
from dbt.contracts.util import Replaceable, AdditionalPropertiesMixin
from dbt.exceptions import warn_or_error
@@ -49,6 +50,8 @@ from .model_config import (
SeedConfig,
TestConfig,
SourceConfig,
MetricConfig,
ExposureConfig,
EmptySnapshotConfig,
SnapshotConfig,
)
@@ -742,9 +745,12 @@ class ParsedExposure(UnparsedBaseNode, HasUniqueID, HasFqn):
owner: ExposureOwner
resource_type: NodeType = NodeType.Exposure
description: str = ""
label: Optional[str] = None
maturity: Optional[MaturityType] = None
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
config: ExposureConfig = field(default_factory=ExposureConfig)
unrendered_config: Dict[str, Any] = field(default_factory=dict)
url: Optional[str] = None
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[List[str]] = field(default_factory=list)
@@ -765,6 +771,9 @@ class ParsedExposure(UnparsedBaseNode, HasUniqueID, HasFqn):
def same_description(self, old: "ParsedExposure") -> bool:
return self.description == old.description
def same_label(self, old: "ParsedExposure") -> bool:
return self.label == old.label
def same_maturity(self, old: "ParsedExposure") -> bool:
return self.maturity == old.maturity
@@ -777,6 +786,12 @@ class ParsedExposure(UnparsedBaseNode, HasUniqueID, HasFqn):
def same_url(self, old: "ParsedExposure") -> bool:
return self.url == old.url
def same_config(self, old: "ParsedExposure") -> bool:
return self.config.same_contents(
self.unrendered_config,
old.unrendered_config,
)
def same_contents(self, old: Optional["ParsedExposure"]) -> bool:
# existing when it didn't before is a change!
# metadata/tags changes are not "changes"
@@ -790,7 +805,9 @@ class ParsedExposure(UnparsedBaseNode, HasUniqueID, HasFqn):
and self.same_maturity(old)
and self.same_url(old)
and self.same_description(old)
and self.same_label(old)
and self.same_depends_on(old)
and self.same_config(old)
and True
)
@@ -806,17 +823,20 @@ class ParsedMetric(UnparsedBaseNode, HasUniqueID, HasFqn):
name: str
description: str
label: str
type: str
sql: str
timestamp: Optional[str]
calculation_method: str
expression: str
timestamp: str
filters: List[MetricFilter]
time_grains: List[str]
dimensions: List[str]
window: Optional[MetricTime] = None
model: Optional[str] = None
model_unique_id: Optional[str] = None
resource_type: NodeType = NodeType.Metric
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
config: MetricConfig = field(default_factory=MetricConfig)
unrendered_config: Dict[str, Any] = field(default_factory=dict)
sources: List[List[str]] = field(default_factory=list)
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[List[str]] = field(default_factory=list)
@@ -834,6 +854,9 @@ class ParsedMetric(UnparsedBaseNode, HasUniqueID, HasFqn):
def same_model(self, old: "ParsedMetric") -> bool:
return self.model == old.model
def same_window(self, old: "ParsedMetric") -> bool:
return self.window == old.window
def same_dimensions(self, old: "ParsedMetric") -> bool:
return self.dimensions == old.dimensions
@@ -846,11 +869,11 @@ class ParsedMetric(UnparsedBaseNode, HasUniqueID, HasFqn):
def same_label(self, old: "ParsedMetric") -> bool:
return self.label == old.label
def same_type(self, old: "ParsedMetric") -> bool:
return self.type == old.type
def same_calculation_method(self, old: "ParsedMetric") -> bool:
return self.calculation_method == old.calculation_method
def same_sql(self, old: "ParsedMetric") -> bool:
return self.sql == old.sql
def same_expression(self, old: "ParsedMetric") -> bool:
return self.expression == old.expression
def same_timestamp(self, old: "ParsedMetric") -> bool:
return self.timestamp == old.timestamp
@@ -858,6 +881,12 @@ class ParsedMetric(UnparsedBaseNode, HasUniqueID, HasFqn):
def same_time_grains(self, old: "ParsedMetric") -> bool:
return self.time_grains == old.time_grains
def same_config(self, old: "ParsedMetric") -> bool:
return self.config.same_contents(
self.unrendered_config,
old.unrendered_config,
)
def same_contents(self, old: Optional["ParsedMetric"]) -> bool:
# existing when it didn't before is a change!
# metadata/tags changes are not "changes"
@@ -866,14 +895,16 @@ class ParsedMetric(UnparsedBaseNode, HasUniqueID, HasFqn):
return (
self.same_model(old)
and self.same_window(old)
and self.same_dimensions(old)
and self.same_filters(old)
and self.same_description(old)
and self.same_label(old)
and self.same_type(old)
and self.same_sql(old)
and self.same_calculation_method(old)
and self.same_expression(old)
and self.same_timestamp(old)
and self.same_time_grains(old)
and self.same_config(old)
and True
)


@@ -1,5 +1,13 @@
import re
from dbt import deprecations
from dbt.node_types import NodeType
from dbt.contracts.util import AdditionalPropertiesMixin, Mergeable, Replaceable
from dbt.contracts.util import (
AdditionalPropertiesMixin,
Mergeable,
Replaceable,
rename_metric_attr,
)
# trigger the PathEncoder
import dbt.helper_types # noqa:F401
@@ -429,11 +437,21 @@ class UnparsedExposure(dbtClassMixin, Replaceable):
type: ExposureType
owner: ExposureOwner
description: str = ""
label: Optional[str] = None
maturity: Optional[MaturityType] = None
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
url: Optional[str] = None
depends_on: List[str] = field(default_factory=list)
config: Dict[str, Any] = field(default_factory=dict)
@classmethod
def validate(cls, data):
super(UnparsedExposure, cls).validate(data)
if "name" in data:
# name can only contain alphanumeric chars and underscores
if not (re.match(r"[\w-]+$", data["name"])):
deprecations.warn("exposure-name", exposure=data["name"])
@dataclass
@@ -444,35 +462,66 @@ class MetricFilter(dbtClassMixin, Replaceable):
value: str
class MetricTimePeriod(StrEnum):
day = "day"
week = "week"
month = "month"
year = "year"
def plural(self) -> str:
return str(self) + "s"
@dataclass
class MetricTime(dbtClassMixin, Mergeable):
count: Optional[int] = None
period: Optional[MetricTimePeriod] = None
def __bool__(self):
return self.count is not None and self.period is not None
@dataclass
class UnparsedMetric(dbtClassMixin, Replaceable):
# TODO : verify that this disallows metric names with spaces
# TODO: fix validation that you broke :p
# name: Identifier
name: str
label: str
type: str
model: Optional[str] = None
calculation_method: str
timestamp: str
description: str = ""
sql: Union[str, int] = ""
timestamp: Optional[str] = None
expression: Union[str, int] = ""
time_grains: List[str] = field(default_factory=list)
dimensions: List[str] = field(default_factory=list)
window: Optional[MetricTime] = None
model: Optional[str] = None
filters: List[MetricFilter] = field(default_factory=list)
meta: Dict[str, Any] = field(default_factory=dict)
tags: List[str] = field(default_factory=list)
config: Dict[str, Any] = field(default_factory=dict)
@classmethod
def validate(cls, data):
# super().validate(data)
# TODO: putting this back for now to get tests passing. Do we want to implement name: Identifier?
data = rename_metric_attr(data, raise_deprecation_warning=True)
super(UnparsedMetric, cls).validate(data)
if "name" in data and " " in data["name"]:
raise ParsingException(f"Metrics name '{data['name']}' cannot contain spaces")
if "name" in data:
errors = []
if " " in data["name"]:
errors.append("cannot contain spaces")
# This handles failing queries due to too long metric names.
# It only occurs in BigQuery and Snowflake (Postgres/Redshift truncate)
if len(data["name"]) > 250:
errors.append("cannot contain more than 250 characters")
if not (re.match(r"^[A-Za-z]", data["name"])):
errors.append("must begin with a letter")
if not (re.match(r"[\w-]+$", data["name"])):
errors.append("must contain only letters, numbers and underscores")
# TODO: Expressions _cannot_ have `model` properties
if data.get("model") is None and data.get("type") != "expression":
raise ValidationError("Non-expression metrics require a 'model' property")
if errors:
raise ParsingException(
f"The metric name '{data['name']}' is invalid. It {', '.join(e for e in errors)}"
)
if data.get("model") is not None and data.get("type") == "expression":
raise ValidationError("Expression metrics cannot have a 'model' property")
if data.get("model") is None and data.get("calculation_method") != "derived":
raise ValidationError("Non-derived metrics require a 'model' property")
if data.get("model") is not None and data.get("calculation_method") == "derived":
raise ValidationError("Derived metrics cannot have a 'model' property")


@@ -192,6 +192,8 @@ class Project(HyphenatedDbtClassMixin, Replaceable):
analyses: Dict[str, Any] = field(default_factory=dict)
sources: Dict[str, Any] = field(default_factory=dict)
tests: Dict[str, Any] = field(default_factory=dict)
metrics: Dict[str, Any] = field(default_factory=dict)
exposures: Dict[str, Any] = field(default_factory=dict)
vars: Optional[Dict[str, Any]] = field(
default=None,
metadata=dict(
@@ -234,6 +236,7 @@ class UserConfig(ExtensibleDbtClassMixin, Replaceable, UserConfigContract):
static_parser: Optional[bool] = None
indirect_selection: Optional[str] = None
cache_selected_only: Optional[bool] = None
event_buffer_size: Optional[int] = None
@dataclass


@@ -4,6 +4,7 @@ from datetime import datetime
from typing import List, Tuple, ClassVar, Type, TypeVar, Dict, Any, Optional
from dbt.clients.system import write_json, read_json
from dbt import deprecations
from dbt.exceptions import (
InternalException,
RuntimeException,
@@ -207,13 +208,60 @@ def get_manifest_schema_version(dct: dict) -> int:
return int(schema_version.split(".")[-2][-1])
# we renamed these properties in v1.3
# this method allows us to be nice to the early adopters
def rename_metric_attr(data: dict, raise_deprecation_warning: bool = False) -> dict:
metric_name = data["name"]
if raise_deprecation_warning and (
"sql" in data.keys()
or "type" in data.keys()
or data.get("calculation_method") == "expression"
):
deprecations.warn("metric-attr-renamed", metric_name=metric_name)
duplicated_attribute_msg = """\n
The metric '{}' contains both the deprecated metric property '{}'
and the up-to-date metric property '{}'. Please remove the deprecated property.
"""
if "sql" in data.keys():
if "expression" in data.keys():
raise ValidationError(
duplicated_attribute_msg.format(metric_name, "sql", "expression")
)
else:
data["expression"] = data.pop("sql")
if "type" in data.keys():
if "calculation_method" in data.keys():
raise ValidationError(
duplicated_attribute_msg.format(metric_name, "type", "calculation_method")
)
else:
calculation_method = data.pop("type")
data["calculation_method"] = calculation_method
# we also changed "type: expression" -> "calculation_method: derived"
if data.get("calculation_method") == "expression":
data["calculation_method"] = "derived"
return data
def rename_sql_attr(node_content: dict) -> dict:
if "raw_sql" in node_content:
node_content["raw_code"] = node_content.pop("raw_sql")
if "compiled_sql" in node_content:
node_content["compiled_code"] = node_content.pop("compiled_sql")
node_content["language"] = "sql"
return node_content
def upgrade_manifest_json(manifest: dict) -> dict:
for node_content in manifest.get("nodes", {}).values():
if "raw_sql" in node_content:
node_content["raw_code"] = node_content.pop("raw_sql")
if "compiled_sql" in node_content:
node_content["compiled_code"] = node_content.pop("compiled_sql")
node_content["language"] = "sql"
node_content = rename_sql_attr(node_content)
for disabled in manifest.get("disabled", {}).values():
# There can be multiple disabled nodes for the same unique_id
# so make sure all the nodes get the attr renamed
disabled = [rename_sql_attr(n) for n in disabled]
for metric_content in manifest.get("metrics", {}).values():
# handle attr renames + value translation ("expression" -> "derived")
metric_content = rename_metric_attr(metric_content)
return manifest
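A hedged example of the rename helper in action; the metric payload is illustrative:
```
# Illustrative only: v1.2-style metric attributes are translated in place.
from dbt.contracts.util import rename_metric_attr

old = {"name": "revenue", "sql": "amount", "type": "expression"}
new = rename_metric_attr(old)
assert new == {
    "name": "revenue",
    "expression": "amount",
    "calculation_method": "derived",
}
```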


@@ -87,6 +87,30 @@ def renamed_method(old_name: str, new_name: str):
deprecations[dep.name] = dep
class MetricAttributesRenamed(DBTDeprecation):
_name = "metric-attr-renamed"
_description = """\
dbt-core v1.3 renamed attributes for metrics:
\n 'sql' -> 'expression'
\n 'type' -> 'calculation_method'
\n 'type: expression' -> 'calculation_method: derived'
\nThe old metric parameter names will be fully deprecated in v1.4.
\nPlease remove them from the metric definition of metric '{metric_name}'
\nRelevant issue here: https://github.com/dbt-labs/dbt-core/issues/5849
"""
class ExposureNameDeprecation(DBTDeprecation):
_name = "exposure-name"
_description = """\
Starting in v1.3, the 'name' of an exposure should contain only letters, numbers, and underscores.
Exposures support a new property, 'label', which may contain spaces, capital letters, and special characters.
{exposure} does not follow this pattern.
Please update the 'name', and use the 'label' property for a human-friendly title.
This will raise an error in a future version of dbt-core.
"""
def warn(name, *args, **kwargs):
if name not in deprecations:
# this should (hopefully) never happen
@@ -101,10 +125,12 @@ def warn(name, *args, **kwargs):
active_deprecations: Set[str] = set()
deprecations_list: List[DBTDeprecation] = [
ExposureNameDeprecation(),
ConfigSourcePathDeprecation(),
ConfigDataPathDeprecation(),
PackageInstallPathDeprecation(),
PackageRedirectDeprecation(),
MetricAttributesRenamed(),
]
deprecations: Dict[str, DBTDeprecation] = {d.name: d for d in deprecations_list}


@@ -1,7 +1,7 @@
from colorama import Style
import dbt.events.functions as this # don't worry I hate it too.
from dbt.events.base_types import NoStdOut, Event, NoFile, ShowException, Cache
from dbt.events.types import EventBufferFull, T_Event, MainReportVersion, EmptyLine
from dbt.events.types import T_Event, MainReportVersion, EmptyLine, EventBufferFull
import dbt.flags as flags
from dbt.constants import SECRET_ENV_PREFIX
@@ -22,24 +22,16 @@ import threading
from typing import Any, Dict, List, Optional, Union
from collections import deque
global LOG_VERSION
LOG_VERSION = 2
# create the global event history buffer with the default max size (10k)
# python 3.7 doesn't support type hints on globals, but mypy requires them. hence the ignore.
# TODO the flags module has not yet been resolved when this is created
global EVENT_HISTORY
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE) # type: ignore
EVENT_HISTORY = None
# create the global file logger with no configuration
global FILE_LOG
FILE_LOG = logging.getLogger("default_file")
null_handler = logging.NullHandler()
FILE_LOG.addHandler(null_handler)
# set up logger to go to stdout with defaults
# setup_event_logger will be called once args have been parsed
global STDOUT_LOG
STDOUT_LOG = logging.getLogger("default_stdout")
STDOUT_LOG.setLevel(logging.INFO)
stdout_handler = logging.StreamHandler(sys.stdout)
@@ -52,11 +44,8 @@ invocation_id: Optional[str] = None
def setup_event_logger(log_path, level_override=None):
# flags have been resolved, and log_path is known
global EVENT_HISTORY
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE) # type: ignore
make_log_dir_if_missing(log_path)
this.format_json = flags.LOG_FORMAT == "json"
# USE_COLORS can be None if the app just started and the cli flags
# havent been applied yet
@@ -270,14 +259,7 @@ def fire_event(e: Event) -> None:
if isinstance(e, Cache) and not flags.LOG_CACHE_EVENTS:
return
# if and only if the event history deque will be completely filled by this event
# fire warning that old events are now being dropped
global EVENT_HISTORY
if len(EVENT_HISTORY) == (flags.EVENT_BUFFER_SIZE - 1):
EVENT_HISTORY.append(e)
fire_event(EventBufferFull())
else:
EVENT_HISTORY.append(e)
add_to_event_history(e)
# backwards compatibility for plugins that require old logger (dbt-rpc)
if flags.ENABLE_LEGACY_LOGGER:
@@ -343,3 +325,20 @@ def get_ts_rfc3339() -> str:
ts = get_ts()
ts_rfc3339 = ts.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
return ts_rfc3339
def add_to_event_history(event):
if flags.EVENT_BUFFER_SIZE == 0:
return
global EVENT_HISTORY
if EVENT_HISTORY is None:
reset_event_history()
EVENT_HISTORY.append(event)
# We only set the EventBufferFull message for event buffers >= 10,000
if flags.EVENT_BUFFER_SIZE >= 10000 and len(EVENT_HISTORY) == (flags.EVENT_BUFFER_SIZE - 1):
fire_event(EventBufferFull())
def reset_event_history():
global EVENT_HISTORY
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE)
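The buffer itself is a plain bounded deque, so eviction of old events is handled by `maxlen`. A hedged illustration with a toy buffer size:
```
# Illustrative only: maxlen-bounded deque semantics behind EVENT_HISTORY.
from collections import deque

history = deque(maxlen=3)
for i in range(5):
    history.append(f"event-{i}")

print(list(history))  # ['event-2', 'event-3', 'event-4']; oldest dropped first
```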


@@ -103,11 +103,11 @@ class MainKeyboardInterrupt(InfoLevel):
@dataclass
class MainEncounteredError(ErrorLevel):
e: BaseException
exc: str
code: str = "Z002"
def message(self) -> str:
return f"Encountered an error:\n{self.e}"
return f"Encountered an error:\n{self.exc}"
@dataclass
@@ -382,7 +382,7 @@ class SystemErrorRetrievingModTime(ErrorLevel):
class SystemCouldNotWrite(DebugLevel):
path: str
reason: str
exc: Exception
exc: str
code: str = "Z005"
def message(self) -> str:
@@ -762,7 +762,7 @@ class DumpAfterRenameSchema(DebugLevel, Cache):
@dataclass
class AdapterImportError(InfoLevel):
exc: Exception
exc: str
code: str = "E035"
def message(self) -> str:
@@ -1008,7 +1008,7 @@ class PartialParsingNotEnabled(DebugLevel):
@dataclass
class ParsedFileLoadFailed(ShowException, DebugLevel):
path: str
exc: Exception
exc: str
code: str = "I029"
def message(self) -> str:
@@ -1223,7 +1223,7 @@ class InvalidRefInTestNode(DebugLevel):
@dataclass
class RunningOperationCaughtError(ErrorLevel):
exc: Exception
exc: str
code: str = "Q001"
def message(self) -> str:
@@ -1232,7 +1232,7 @@ class RunningOperationCaughtError(ErrorLevel):
@dataclass
class RunningOperationUncaughtError(ErrorLevel):
exc: Exception
exc: str
code: str = "FF01"
def message(self) -> str:
@@ -1249,7 +1249,7 @@ class DbtProjectError(ErrorLevel):
@dataclass
class DbtProjectErrorException(ErrorLevel):
exc: Exception
exc: str
code: str = "A010"
def message(self) -> str:
@@ -1266,7 +1266,7 @@ class DbtProfileError(ErrorLevel):
@dataclass
class DbtProfileErrorException(ErrorLevel):
exc: Exception
exc: str
code: str = "A012"
def message(self) -> str:
@@ -1313,7 +1313,7 @@ https://docs.getdbt.com/docs/configure-your-profile
@dataclass
class CatchableExceptionOnRun(ShowException, DebugLevel):
exc: Exception
exc: str
code: str = "W002"
def message(self) -> str:
@@ -1323,7 +1323,7 @@ class CatchableExceptionOnRun(ShowException, DebugLevel):
@dataclass
class InternalExceptionOnRun(DebugLevel):
build_path: str
exc: Exception
exc: str
code: str = "W003"
def message(self) -> str:
@@ -1352,7 +1352,7 @@ class PrintDebugStackTrace(ShowException, DebugLevel):
class GenericExceptionOnRun(ErrorLevel):
build_path: Optional[str]
unique_id: str
exc: Exception
exc: str
code: str = "W004"
def message(self) -> str:
@@ -1366,7 +1366,7 @@ class GenericExceptionOnRun(ErrorLevel):
@dataclass
class NodeConnectionReleaseError(ShowException, DebugLevel):
node_name: str
exc: Exception
exc: str
code: str = "W005"
def message(self) -> str:
@@ -1640,6 +1640,15 @@ class RunResultWarning(WarnLevel):
return ui.yellow(f"{info} in {self.resource_type} {self.node_name} ({self.path})")
@dataclass
class RunResultWarningMessage(WarnLevel):
msg: str
code: str = "Z049"
def message(self) -> str:
return f" {self.msg}"
@dataclass
class RunResultFailure(ErrorLevel):
resource_type: str
@@ -1691,7 +1700,7 @@ class SQLCompiledPath(InfoLevel):
@dataclass
class SQlRunnerException(ShowException, DebugLevel):
exc: Exception
exc: str
code: str = "Q006"
def message(self) -> str:
@@ -2449,7 +2458,7 @@ class GeneralWarningMsg(WarnLevel):
@dataclass
class GeneralWarningException(WarnLevel):
exc: Exception
exc: str
log_fmt: str
code: str = "Z047"
@@ -2470,7 +2479,7 @@ class EventBufferFull(WarnLevel):
@dataclass
class RecordRetryException(DebugLevel):
exc: Exception
exc: str
code: str = "M021"
def message(self) -> str:
@@ -2486,7 +2495,7 @@ class RecordRetryException(DebugLevel):
if 1 == 0:
MainReportVersion(v="")
MainKeyboardInterrupt()
MainEncounteredError(e=BaseException(""))
MainEncounteredError(exc="")
MainStackTrace(stack_trace="")
MainTrackingUserState(user_state="")
ParsingStart()
@@ -2515,7 +2524,7 @@ if 1 == 0:
RegistryResponseMissingNestedKeys(response=""),
RegistryResponseExtraNestedKeys(response=""),
SystemErrorRetrievingModTime(path="")
SystemCouldNotWrite(path="", reason="", exc=Exception(""))
SystemCouldNotWrite(path="", reason="", exc="")
SystemExecutingCmd(cmd=[""])
SystemStdOutMsg(bmsg=b"")
SystemStdErrMsg(bmsg=b"")
@@ -2571,7 +2580,7 @@ if 1 == 0:
DumpAfterAddGraph(Lazy.defer(lambda: dict()))
DumpBeforeRenameSchema(Lazy.defer(lambda: dict()))
DumpAfterRenameSchema(Lazy.defer(lambda: dict()))
AdapterImportError(exc=Exception())
AdapterImportError(exc="")
PluginLoadError()
SystemReportReturnCode(returncode=0)
NewConnectionOpening(connection_state="")
@@ -2594,7 +2603,7 @@ if 1 == 0:
PartialParsingFailedBecauseNewProjectDependency()
PartialParsingFailedBecauseHashChanged()
PartialParsingDeletedMetric(id="")
ParsedFileLoadFailed(path="", exc=Exception(""))
ParsedFileLoadFailed(path="", exc="")
PartialParseSaveFileNotFound()
StaticParserCausedJinjaRendering(path="")
UsingExperimentalParser(path="")
@@ -2617,20 +2626,20 @@ if 1 == 0:
PartialParsingDeletedExposure(unique_id="")
InvalidDisabledSourceInTestNode(msg="")
InvalidRefInTestNode(msg="")
RunningOperationCaughtError(exc=Exception(""))
RunningOperationUncaughtError(exc=Exception(""))
RunningOperationCaughtError(exc="")
RunningOperationUncaughtError(exc="")
DbtProjectError()
DbtProjectErrorException(exc=Exception(""))
DbtProjectErrorException(exc="")
DbtProfileError()
DbtProfileErrorException(exc=Exception(""))
DbtProfileErrorException(exc="")
ProfileListTitle()
ListSingleProfile(profile="")
NoDefinedProfiles()
ProfileHelpMessage()
CatchableExceptionOnRun(exc=Exception(""))
InternalExceptionOnRun(build_path="", exc=Exception(""))
GenericExceptionOnRun(build_path="", unique_id="", exc=Exception(""))
NodeConnectionReleaseError(node_name="", exc=Exception(""))
CatchableExceptionOnRun(exc="")
InternalExceptionOnRun(build_path="", exc="")
GenericExceptionOnRun(build_path="", unique_id="", exc="")
NodeConnectionReleaseError(node_name="", exc="")
CheckCleanPath(path="")
ConfirmCleanPath(path="")
ProtectedCleanPath(path="")
@@ -2838,6 +2847,6 @@ if 1 == 0:
TrackingInitializeFailure()
RetryExternalCall(attempt=0, max=0)
GeneralWarningMsg(msg="", log_fmt="")
GeneralWarningException(exc=Exception(""), log_fmt="")
GeneralWarningException(exc="", log_fmt="")
EventBufferFull()
RecordRetryException(exc=Exception(""))
RecordRetryException(exc="")


@@ -21,6 +21,28 @@ def validator_error_message(exc):
return "at path {}: {}".format(path, exc.message)
def generic_format_node(node, include_path: bool, node_attr: str):
source_path = ''
if include_path:
source_path = " ({})".format(node.original_file_path)
node_identifier = getattr(node, node_attr)
return f"{node.resource_type.title()} '{node_identifier}'{source_path}"
def format_node(node, include_path: bool, node_attr: str = "name"):
# Sql Operations are one-off queries. If we used generic formatting,
# the node string would include a nonsense unique id & path like:
#
# sql operation.demo_data.name' (from remote system.sql)
#
# which is terribly hard to understand!
if node.resource_type == NodeType.SqlOperation:
return 'query'
else:
return generic_format_node(node, include_path, node_attr)
class Exception(builtins.Exception):
CODE = -32000
MESSAGE = "Server Error"
@@ -77,7 +99,8 @@ class RuntimeException(RuntimeError, Exception):
# out the path we know at least. This indicates an error during
# block parsing.
return "{}".format(node.path.original_file_path)
return "{} {} ({})".format(node.resource_type, node.name, node.original_file_path)
return format_node(node, include_path=True)
def process_stack(self):
lines = []
@@ -582,14 +605,9 @@ def _get_target_failure_msg(
if target_model_package is not None:
target_package_string = "in package '{}' ".format(target_model_package)
source_path_string = ""
if include_path:
source_path_string = " ({})".format(model.original_file_path)
return "{} '{}'{} depends on a {} named '{}' {}which {}".format(
model.resource_type.title(),
model.unique_id,
source_path_string,
model_string = format_node(model, include_path, node_attr='unique_id')
return "{} depends on a {} named '{}' {}which {}".format(
model_string,
target_kind,
target_name,
target_package_string,
@@ -631,13 +649,13 @@ def ref_target_not_found(
raise_compiler_error(msg, model)
def get_source_not_found_or_disabled_msg(
model,
def get_not_found_or_disabled_msg(
node,
target_name: str,
target_table_name: str,
target_kind: str,
target_package: Optional[str] = None,
disabled: Optional[bool] = None,
) -> str:
full_name = f"{target_name}.{target_table_name}"
if disabled is None:
reason = "was not found or is disabled"
elif disabled is True:
@@ -645,34 +663,57 @@ def get_source_not_found_or_disabled_msg(
else:
reason = "was not found"
return _get_target_failure_msg(
model, full_name, None, include_path=True, reason=reason, target_kind="source"
node,
target_name,
target_package,
include_path=True,
reason=reason,
target_kind=target_kind,
)
def source_target_not_found(
model, target_name: str, target_table_name: str, disabled: Optional[bool] = None
) -> NoReturn:
msg = get_source_not_found_or_disabled_msg(model, target_name, target_table_name, disabled)
msg = get_not_found_or_disabled_msg(
node=model,
target_name=f"{target_name}.{target_table_name}",
target_kind="source",
disabled=disabled,
)
raise_compiler_error(msg, model)
def get_metric_not_found_msg(
model,
target_name: str,
target_package: Optional[str],
) -> str:
reason = "was not found"
return _get_target_failure_msg(
model, target_name, target_package, include_path=True, reason=reason, target_kind="metric"
def metric_target_not_found(
metric, target_name: str, target_package: Optional[str], disabled: Optional[bool] = None
) -> NoReturn:
msg = get_not_found_or_disabled_msg(
node=metric,
target_name=target_name,
target_kind="metric",
target_package=target_package,
disabled=disabled,
)
def metric_target_not_found(metric, target_name: str, target_package: Optional[str]) -> NoReturn:
msg = get_metric_not_found_msg(metric, target_name, target_package)
raise_compiler_error(msg, metric)
def exposure_target_not_found(
exposure, target_name: str, target_package: Optional[str], disabled: Optional[bool] = None
) -> NoReturn:
msg = get_not_found_or_disabled_msg(
node=exposure,
target_name=target_name,
target_kind="exposure",
target_package=target_package,
disabled=disabled,
)
raise_compiler_error(msg, exposure)
def dependency_not_found(model, target_model_name):
raise_compiler_error(
"'{}' depends on '{}' which is not in the graph!".format(
@@ -1075,7 +1116,7 @@ def warn_or_raise(exc, log_fmt=None):
if flags.WARN_ERROR:
raise exc
else:
fire_event(GeneralWarningException(exc=exc, log_fmt=log_fmt))
fire_event(GeneralWarningException(exc=str(exc), log_fmt=log_fmt))
def warn(msg, node=None):


@@ -10,7 +10,13 @@ from typing import Optional
# PROFILES_DIR must be set before the other flags
# It also gets set in main.py and in set_from_args because the rpc server
# doesn't go through exactly the same main arg processing.
DEFAULT_PROFILES_DIR = os.path.join(os.path.expanduser("~"), ".dbt")
GLOBAL_PROFILES_DIR = os.path.join(os.path.expanduser("~"), ".dbt")
LOCAL_PROFILES_DIR = os.getcwd()
# Use the current working directory if there is a profiles.yml file present there
if os.path.exists(Path(LOCAL_PROFILES_DIR) / Path("profiles.yml")):
DEFAULT_PROFILES_DIR = LOCAL_PROFILES_DIR
else:
DEFAULT_PROFILES_DIR = GLOBAL_PROFILES_DIR
PROFILES_DIR = os.path.expanduser(os.getenv("DBT_PROFILES_DIR", DEFAULT_PROFILES_DIR))
STRICT_MODE = False # Only here for backwards compatibility
@@ -52,6 +58,7 @@ _NON_BOOLEAN_FLAGS = [
_NON_DBT_ENV_FLAGS = ["DO_NOT_TRACK"]
# Global CLI defaults. These flags are set from three places:
# CLI args, environment variables, and user_config (profiles.yml).
# Environment variables use the pattern 'DBT_{flag name}', like DBT_PROFILES_DIR
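A hedged restatement of the resolution order this change introduces, as a standalone function; the helper name is illustrative:
```
# Illustrative only: DBT_PROFILES_DIR wins, then ./profiles.yml, then ~/.dbt.
import os
from pathlib import Path

def resolve_profiles_dir() -> str:
    env = os.getenv("DBT_PROFILES_DIR")
    if env:
        return os.path.expanduser(env)
    if (Path.cwd() / "profiles.yml").exists():
        return str(Path.cwd())
    return os.path.join(os.path.expanduser("~"), ".dbt")
```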


@@ -165,7 +165,8 @@ class NodeSelector(MethodManager):
elif unique_id in self.manifest.exposures:
return True
elif unique_id in self.manifest.metrics:
return True
metric = self.manifest.metrics[unique_id]
return metric.config.enabled
node = self.manifest.nodes[unique_id]
return not node.empty and node.config.enabled


@@ -356,8 +356,21 @@ class ConfigSelectorMethod(SelectorMethod):
except AttributeError:
continue
else:
if selector == value:
yield node
if isinstance(value, list):
if (
(selector in value)
or (CaseInsensitive(selector) == "true" and True in value)
or (CaseInsensitive(selector) == "false" and False in value)
):
yield node
else:
if (
(selector == value)
or (CaseInsensitive(selector) == "true" and value is True)
or (CaseInsensitive(selector) == "false" and value is False)
):
yield node
class ResourceTypeSelectorMethod(SelectorMethod):
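A hedged, self-contained illustration of the boolean-aware matching; the `CaseInsensitive` class below is a toy stand-in for dbt's helper, and `config_matches` is illustrative:
```
# Illustrative only: why `config.<key>:true` should match a boolean True.
class CaseInsensitive(str):
    def __eq__(self, other):
        return self.casefold() == str(other).casefold()
    __hash__ = str.__hash__

def config_matches(selector: str, value) -> bool:
    # mirrors the scalar branch above
    return (
        selector == value
        or (CaseInsensitive(selector) == "true" and value is True)
        or (CaseInsensitive(selector) == "false" and value is False)
    )

print(config_matches("true", True))    # True
print(config_matches("view", "view"))  # True
```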


@@ -1,15 +1,63 @@
# Include Module
The Include module is reponsible for housing default macro definitions, starter project scaffold, and the html file used to generate the docs page.
The Include module is responsible for housing default macro definitions, starter project scaffold, and the html file used to generate the docs page.
# Directories
## `global_project`
Defines the default implementations of Jinja2 macros for `dbt-core`, which can be overridden in each adapter repo to better fit that adapter plugin. To view adapter-specific Jinja2 overrides, check the relevant adapter repo's [`adapter.sql`](https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/include/bigquery/macros/adapters.sql) file in its `include` directory, or its [`impl.py`](https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/impl.py) file, e.g. BigQuery's `truncate_relation`.
These directories contain the macros which wrap model code in DDL/DML. That code, in turn, “materializes” as a fully fledged database object in a warehouse pointed to by the current `target`. Here are the major steps of this process:
1. the SQL `select` query is compiled. In its current state, it could be run directly through a query editor to retrieve a dataset that will match 1:1 the database object materialized by `dbt`.
2. `dbt` embeds this SQL query into a materialization macro.
3. `dbt` executes this materialization macro, which expands the model into the DDL/DML that creates, truncates-and-reloads, or replaces a relation (e.g. view, table) in the `target`.
Note: `dbt compile` will not progress past step 1. It places the code produced in `target/compiled`.
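As a toy illustration of steps 2 and 3 (a sketch only; dbt's real materializations are Jinja macros, and every name below is made up):
```
# Toy sketch: wrap a compiled select (step 1) into DDL for the target (step 3).
# Illustrative only -- dbt's real materializations are Jinja macros, not Python.
compiled_sql = "select id, amount from raw.orders"   # hypothetical compiled model
relation = "analytics.orders"                        # hypothetical target relation
ddl = f"create or replace table {relation} as ({compiled_sql})"
print(ddl)
```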
### adapter.dispatch
Packages (e.g. `include` directories of adapters, any [hub](https://hub.getdbt.com/)-hosted package) can be interpreted as namespaces of functions, a.k.a. macros. In `dbt`'s macrospace, we take advantage of the multiple dispatch programming language concept. In short, multiple dispatch supports dynamically searching for a function across several namespaces, usually in a manually specified order.
Adapters can have their own implementation of the same macro X. For example, a macro executed by `dbt-redshift` may need a specific implementation different from `dbt-snowflake`'s macro. We use multiple dispatch via `adapter.dispatch`, a Jinja function, which enables polymorphic macro invocations. The chosen implementation is selected according to what the `adapter` object is set to at runtime (it could be for redshift, postgres, and so on).
For more on this object, check out the dbt docs [here](https://docs.getdbt.com/reference/dbt-jinja-functions/adapter).
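As a rough Python analogy of that search order (illustrative only; the real mechanism is the `adapter.dispatch` Jinja function, and every name below is made up), dbt prefers an adapter-prefixed implementation and falls back to the default:
```
# Rough Python analogy of adapter.dispatch (all names are illustrative).
def dispatch(macro_name, adapter_type, namespaces):
    # Prefer e.g. 'redshift__greet' over 'default__greet', searching the
    # given namespaces (packages) in order.
    for candidate in (f"{adapter_type}__{macro_name}", f"default__{macro_name}"):
        for namespace in namespaces:
            if candidate in namespace:
                return namespace[candidate]
    raise KeyError(f"no implementation of {macro_name} found")

redshift_macros = {"redshift__greet": lambda: "hello from the redshift impl"}
core_macros = {"default__greet": lambda: "hello from the default impl"}
print(dispatch("greet", "redshift", [redshift_macros, core_macros])())
```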
## `starter_project`
Produces the default project after running the `dbt init` command for the CLI. `dbt-cloud` initializes the project by using [dbt-starter-project](https://github.com/dbt-labs/dbt-starter-project).
# Files
- `index.html`: a file generated from [dbt-docs](https://github.com/dbt-labs/dbt-docs) prior to new releases and replaced in the `dbt-core` directory. It is used to generate the docs page after running the `dbt docs generate` command.
# dbt and database adapter python package interop
Let's say we have a fictional python app named `dbt-core` with this structure:
```
dbt
├── adapters
│   └── base.py
├── cli.py
└── main.py
```
`pip install dbt-core` will install this application in the python environment, maintaining the same structure. Note that `dbt.adapters` only contains a `base.py`. In this example, we can assume that `base.py` includes an abstract class for creating connections. Let's say we want to create a postgres adapter that this app can use and that can be installed independently. We can create a python package called `dbt-postgres` with the following structure:
```
dbt
└── adapters
└── postgres
└── impl.py
```
`pip install dbt-postgres` will install this package in the python environment, maintaining the same structure again. Let's say `impl.py` imports `dbt.adapters.base` and implements a concrete class inheriting from the abstract class in `base.py` from the `dbt-core` package. Since the top-level package has the same name in both distributions, `pip` will put this in the same place. We end up with this installed in our python environment:
```
dbt
├── adapters
│   ├── base.py
│   └── postgres
│   └── impl.py
├── cli.py
└── main.py
```
`dbt.adapters` now has a postgres module that dbt can easily find and call directly. dbt and its adapters follow the same file structure convention. This is the magic that allows database adapters to import `dbt.*`; using a factory pattern in dbt-core, we can create instances of the concrete classes defined in the database adapter packages (for creating connections, defining database configuration, defining credentials, etc.).
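A minimal sketch of that factory lookup (hypothetical names; not dbt's actual factory code):
```
# Minimal sketch of the factory lookup (hypothetical; not dbt's actual code).
from importlib import import_module

def load_adapter(adapter_type: str):
    # e.g. 'postgres' -> dbt.adapters.postgres.impl, contributed by the
    # separately installed dbt-postgres package.
    module = import_module(f"dbt.adapters.{adapter_type}.impl")
    # impl.py defines a concrete class (here assumed to be named Adapter)
    # inheriting from the abstract base in dbt-core's dbt/adapters/base.py.
    return module.Adapter()

# adapter = load_adapter("postgres")  # works once dbt-postgres is installed
```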

View File

@@ -12,7 +12,7 @@ The macro override naming method (spark__statement) only works for macros which
{%- if language == 'sql'-%}
{%- set res, table = adapter.execute(compiled_code, auto_begin=auto_begin, fetch=fetch_result) -%}
{%- elif language == 'python' -%}
{%- set res = adapter.submit_python_job(model, compiled_code) -%}
{%- set res = submit_python_job(model, compiled_code) -%}
{#-- TODO: What should table be for python models? --#}
{%- set table = None -%}
{%- else -%}

View File

@@ -3,7 +3,7 @@
{%- set ref_dict = {} -%}
{%- for _ref in model.refs -%}
{%- set resolved = ref(*_ref) -%}
{%- do ref_dict.update({_ref | join("."): resolved | string}) -%}
{%- do ref_dict.update({_ref | join("."): resolved.quote(database=False, schema=False, identifier=False) | string}) -%}
{%- endfor -%}
def ref(*args,dbt_load_df_function):
@@ -18,7 +18,7 @@ def ref(*args,dbt_load_df_function):
{%- set source_dict = {} -%}
{%- for _source in model.sources -%}
{%- set resolved = source(*_source) -%}
{%- do source_dict.update({_source | join("."): resolved | string}) -%}
{%- do source_dict.update({_source | join("."): resolved.quote(database=False, schema=False, identifier=False) | string}) -%}
{%- endfor -%}
def source(*args, dbt_load_df_function):
@@ -30,8 +30,8 @@ def source(*args, dbt_load_df_function):
{% macro build_config_dict(model) %}
{%- set config_dict = {} -%}
{%- for key in model.config.utilized -%}
{# TODO: weird type testing with enum, would be much easier to write this logic in Python! #}
{%- for key in model.config.config_keys_used -%}
{# weird type testing with enum, would be much easier to write this logic in Python! #}
{%- if key == 'language' -%}
{%- set value = 'python' -%}
{%- endif -%}
@@ -56,8 +56,8 @@ class config:
pass
@staticmethod
def get(key):
return config_dict.get(key)
def get(key, default=None):
return config_dict.get(key, default)
class this:
"""dbt.this() or dbt.this.identifier"""

View File

@@ -59,7 +59,7 @@ The TIMESTAMP_* variation associated with TIMESTAMP is specified by the TIMESTAM
{{ return(api.Column.translate_type("float")) }}
{% endmacro %}
{# numeric ------------------------------------------------ #}
{# numeric ------------------------------------------------- #}
{%- macro type_numeric() -%}
{{ return(adapter.dispatch('type_numeric', 'dbt')()) }}
@@ -115,3 +115,15 @@ the precision and scale explicitly.)
-- returns 'int' everywhere, except BigQuery, where it returns 'int64'
-- (but BigQuery also now accepts 'int' as a valid alias for 'int64')
{# bool ------------------------------------------------- #}
{%- macro type_boolean() -%}
{{ return(adapter.dispatch('type_boolean', 'dbt')()) }}
{%- endmacro -%}
{%- macro default__type_boolean() -%}
{{ return(api.Column.translate_type("boolean")) }}
{%- endmacro -%}
-- returns 'boolean' everywhere. BigQuery accepts 'boolean' as a valid alias for 'bool'

View File

@@ -15,7 +15,7 @@ def get_dbt_config(project_dir, args=None, single_threaded=False):
if os.getenv("DBT_PROFILES_DIR"):
profiles_dir = os.getenv("DBT_PROFILES_DIR")
else:
profiles_dir = os.path.expanduser("~/.dbt")
profiles_dir = flags.DEFAULT_PROFILES_DIR
profile = args.profile if hasattr(args, "profile") else None
target = args.target if hasattr(args, "target") else None

View File

@@ -1,4 +1,5 @@
from typing import List
from dbt.logger import log_cache_events, log_manager
import argparse
@@ -42,8 +43,13 @@ from dbt.adapters.factory import reset_adapters, cleanup_connections
import dbt.tracking
from dbt.utils import ExitCodes, args_to_dict
from dbt.config.profile import DEFAULT_PROFILES_DIR, read_user_config
from dbt.exceptions import InternalException, NotImplementedException, FailedToConnectException
from dbt.config.profile import read_user_config
from dbt.exceptions import (
Exception as dbtException,
InternalException,
NotImplementedException,
FailedToConnectException,
)
class DBTVersion(argparse.Action):
@@ -142,8 +148,9 @@ def main(args=None):
exit_code = e.code
except BaseException as e:
fire_event(MainEncounteredError(e=str(e)))
fire_event(MainStackTrace(stack_trace=traceback.format_exc()))
fire_event(MainEncounteredError(exc=str(e)))
if not isinstance(e, dbtException):
fire_event(MainStackTrace(stack_trace=traceback.format_exc()))
exit_code = ExitCodes.UnhandledError.value
sys.exit(exit_code)
@@ -201,7 +208,7 @@ def track_run(task):
yield
dbt.tracking.track_invocation_end(config=task.config, args=task.args, result_type="ok")
except (NotImplementedException, FailedToConnectException) as e:
fire_event(MainEncounteredError(e=str(e)))
fire_event(MainEncounteredError(exc=str(e)))
dbt.tracking.track_invocation_end(config=task.config, args=task.args, result_type="error")
except Exception:
dbt.tracking.track_invocation_end(config=task.config, args=task.args, result_type="error")
@@ -258,10 +265,8 @@ def _build_base_subparser():
dest="sub_profiles_dir", # Main cli arg precedes subcommand
type=str,
help="""
Which directory to look in for the profiles.yml file. Default = {}
""".format(
DEFAULT_PROFILES_DIR
),
Which directory to look in for the profiles.yml file. If not set, dbt will look in the current working directory first, then HOME/.dbt/
""",
)
base_subparser.add_argument(
@@ -620,6 +625,7 @@ def _add_table_mutability_arguments(*subparsers):
for sub in subparsers:
sub.add_argument(
"--full-refresh",
"-f",
action="store_true",
help="""
If specified, dbt will drop incremental models and
@@ -1059,10 +1065,8 @@ def parse_args(args, cls=DBTArgumentParser):
dest="profiles_dir",
type=str,
help="""
Which directory to look in for the profiles.yml file. Default = {}
""".format(
DEFAULT_PROFILES_DIR
),
Which directory to look in for the profiles.yml file. If not set, dbt will look in the current working directory first, then HOME/.dbt/
""",
)
p.add_argument(

View File

@@ -1 +1,219 @@
# Parser README
The parsing stage of dbt produces:
* the 'manifest', a Python structure in the code
* the project `target/manifest.json` file
## Top Level Parsing steps
1. Load the macro manifest (MacroManifest):
* Inputs: SQL files in the 'macros' directory
* Outputs: the MacroManifest encapsulated in a BaseAdapter; a pared-down, Manifest-like structure which contains only macros and files (to be copied into the full Manifest)
The macro manifest takes the place of the full-fledged Manifest and contains only macros, which are used for resolving tests during parsing. The 'adapter' code retains a MacroManifest, while the mainline parsing code will have a full manifest. The class MacroParser assists in this process.
2. Load all internal 'projects' (i.e. anything with a `dbt_project.yml`) and all projects in the project path.
3. Create a ManifestLoader object from the project dependencies and the current config.
4. Read and parse the project files. Results are loaded into the ManifestLoader. ManifestLoader loads the MacroManifest object into a BaseAdapter object.
5. Write the partial parse results (the pickle file). This writes out the 'results' from the ManifestLoader, so the "create the manifest" step has not occurred yet. Things yet to happen include patching the nodes, patching the macros, and processing refs, sources, and docs.
6. Sources are patched. First, source tests are parsed. Nodes, sources, macros, docs, exposures, metadata, files, and selectors are copied into the Manifest by the ManifestLoader. Finally, nodes are "patched" (from `results.patches`), as are macros (from `results.macro_patches`).
7. Process the manifest (refs, sources, docs).
* Loops through all nodes; for each node, finds the node that matches the source refs and adds the unique id of the source to the node's 'depends_on'
8. Check the Manifest and check for resource uniqueness.
9. Build the flat graph
10. Write out the target/manifest.json file.
## Parse the project files
There are several parser-type objects. Each "parser" gets a list of matching files specified by directory and file pattern ('dbt_project.yml', '*.sql', '*.yml', '*.csv', or '*.md')
### ModelParser
code: core/dbt/parser/models.py. Most of the code is in SimpleSQLParser.
paths: source\_paths + `*.sql`
Manifest: nodes
### SnapshotParser
code: core/dbt/parser/snapshots.py
paths: snapshot\_paths + `*.sql`
### AnalysisParser
code: core/dbt/parser/analysis.py
paths: analysis\_paths + `*.sql`
Manifest: nodes
### DataTestParser
code: core/dbt/parser/data\_test.py
paths: test\_paths + `*.sql`
Manifest: nodes
### HookParser
code: core/dbt/parser/hooks.py
paths: 'dbt\_project.yml'
### SeedParser
code: core/dbt/parser/seeds.py
paths: data\_paths + `*.csv`
### DocumentationParser
code: core/dbt/parser/docs.py
paths: docs\_paths, `*.md`
Manifest: docs
### SchemaParser
* Input: yaml files
* Output: various nodes in the Manifest and 'patch' files in the ParseResult/Manifest
code: core/dbt/parser/schemas.py
paths: all\_source\_paths = source\_paths, data\_paths, snapshot\_paths, analysis\_paths, macro\_paths, `*.yml`
This "parses" the `.yml` (property) files in the dbt project. It loops through each yaml file and pulls out the tests in the various config sections and jinja renders them.
A different sub-parser is called for each main dictionary key in the yaml.
* 'models' - TestablePatchParser
* plus 'patches' at create manifest time
* Manifest: nodes
* 'seeds' - TestablePatchParser
* 'snapshots' - TestablePatchParser
* 'sources' - SourceParser
* plus 'patch\_sources' at create manifest time
* Manifest: sources
* 'macros' - MacroPatchParser
* plus 'macro\_patches' at create manifest time
* Manifest: macros
* 'analyses' - AnalysisPatchParser
* 'exposures' - SchemaParser.parse\_exposures
* no 'patches'
* Manifest: exposures
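Pictured as a dispatch table (a sketch only; the class names come from the list above, while the dict and function themselves are illustrative, not dbt's code):
```
# Sketch of the key-to-sub-parser routing (illustrative, not dbt's code).
KEY_TO_SUBPARSER = {
    "models": "TestablePatchParser",
    "seeds": "TestablePatchParser",
    "snapshots": "TestablePatchParser",
    "sources": "SourceParser",
    "macros": "MacroPatchParser",
    "analyses": "AnalysisPatchParser",
    "exposures": "SchemaParser.parse_exposures",
}

def route(yaml_dict):
    # Hand each top-level section of a property file to its sub-parser.
    for key, section in yaml_dict.items():
        subparser = KEY_TO_SUBPARSER.get(key)
        if subparser:
            print(f"{key!r} -> {subparser}")

route({"models": [{"name": "my_model"}], "sources": [{"name": "raw"}]})
```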
# dbt Manifest
### nodes
These have executable SQL attached.
Models
- Are generated from SQL files in the 'models' directory
- have a unique_id starting with 'model.'
- Final object is a ParsedModelNode
Data Tests
- Are generated from SQL files in 'tests' directory
- have a unique_id starting with 'test.'
- Final object is a ParsedDataTestNode
Schema Tests
- Are generated from 'tests' in schema yaml files, which ultimately derive from tests in the 'macros' directory
- Have a unique_id starting with 'test.'
- Final object is a ParsedSchemaTestNode
- fqn is `<project>.schema_test.<generated name>`
Hooks
- come from 'on-run-start' and 'on-run-end' config attributes.
- have a unique_id starting with 'operation.'
- FQN is of the form: ["dbt_labs_internal_analytics","hooks","dbt_labs_internal_analytics-on-run-end-0"]
Analysis
- comes from SQL files in 'analysis' directory
- Final object is a ParsedAnalysisNode
RPC Node
- This is a "node" representing the bit of Jinja-SQL that gets passed into the run_sql or compile_sql methods. When you're using the Cloud IDE, and you're working in a scratch tab, and you just want to compile/run what you have there: it needs to be parsed and executed, but it's not actually a model/node in the project, so it's this special thing. This is a temporary addition to the running manifest.
- Object is a ParsedRPCNode
### sources
- comes from 'sources' sections in yaml files
- Final object is a ParsedSourceDefinition node
- have a unique_id starting with 'source.'
### macros
- comes from SQL files in 'macros' directory
- Final object is a ParsedMacro node
- have a unique_id starting with 'macro.'
- Test macros are used in schema tests
### docs
- comes from .md files in 'docs' directory
- Final object is a ParsedDocumentation
### exposures
- comes from 'exposures' sections in yaml files
- Final object is a ParsedExposure node
## Temporary patch files
The information in these structures is stored here by the schema parser, but should be resolved before the final manifest is written and does not show up in the written manifest. Ideally we'd like to skip this step and apply the changes directly to the nodes, macros, and sources instead. With the current staged parser, we have to save this with the manifest information.
### patches
### macro_patches
### source_patches
## Other
### selectors
Selectors are set in config yaml files and can be used to determine which nodes should be 'compiled' and run. Selectors can also be specified on the command line, in which case they show up in the cli args.
### disabled
Models, sources, or other nodes that the user has disabled by setting enabled: false. They should be completely ignored by dbt, as if they don't exist. In its simplest/silliest form, it's a way to keep code around without deleting it. Some folks do cleverer things, like dynamically enabling/disabling certain models based on the database adapter type or the value of --vars.
### files
This contains a list of all of the files that were processed with links to the nodes, docs, macros, sources, exposures, patches, macro_patches, source_patches, i.e. all of the other places that data is stored in the manifest. It also has a checksum of the contents. The 'files' structure is in the saved manifest, but not in the manifest.json file that is written out. It is used in partial parsing to determine whether to use previously generated nodes.
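A minimal sketch of the checksum comparison that drives this reuse decision (illustrative only; dbt's real logic lives in its file-contract classes):
```
# Sketch of checksum-based reuse in partial parsing (illustrative only).
import hashlib

def checksum(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def needs_reparse(path: str, saved_checksums: dict) -> bool:
    # If the checksum matches the saved one, previously generated nodes for
    # this file can be reused; otherwise the file must be reparsed.
    return saved_checksums.get(path) != checksum(path)
```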
### metadata
From the ManifestMetadata class. Contains dbt_schema_version, project_id, user_id, send_anonymous_usage_stats, adapter_type
### flat_graph
Used during execution in context.common (?). Builds dictionaries of nodes and sources. Not sure why this is used instead of the original nodes and sources. Not in the written manifest.
### state_check
This used to be in ParseResults (not committed yet). Contains var_hash, profile_hash, and project_hashes, which are compared against the saved Manifest's values to see if anything has changed that would invalidate it, i.e. whether the saved Manifest can still be used.
## Written Manifest only
### child_map
### parent_map

View File

@@ -233,7 +233,7 @@ class ConfiguredParser(
def render_with_context(self, parsed_node: IntermediateNode, config: ContextConfig):
# Given the parsed node and a ContextConfig to use during parsing,
# render the node's sql wtih macro capture enabled.
# render the node's sql with macro capture enabled.
# Note: this mutates the config object when config calls are rendered.
context = self._context_for(parsed_node, config)

View File

@@ -20,9 +20,7 @@ class MacroParser(BaseParser[ParsedMacro]):
# from the normal parsing flow.
def get_paths(self) -> List[FilePath]:
return filesystem_search(
project=self.project,
relative_dirs=self.project.macro_paths,
extension=".sql",
project=self.project, relative_dirs=self.project.macro_paths, extension=".sql"
)
@property

View File

@@ -74,7 +74,8 @@ from dbt.exceptions import (
get_target_not_found_or_disabled_msg,
source_target_not_found,
metric_target_not_found,
get_source_not_found_or_disabled_msg,
exposure_target_not_found,
get_not_found_or_disabled_msg,
warn_or_error,
)
from dbt.parser.base import Parser
@@ -350,7 +351,7 @@ class ManifestLoader:
project, project_parser_files[project.project_name], parser_types
)
# Now that we've loaded most of the nodes (except for schema tests and sources)
# Now that we've loaded most of the nodes (except for schema tests, sources, metrics)
# load up the Lookup objects to resolve them by name, so the SourceFiles store
# the unique_id instead of the name. Sources are loaded from yaml files, so
# aren't in place yet
@@ -377,13 +378,14 @@ class ManifestLoader:
patcher.construct_sources()
self.manifest.sources = patcher.sources
self._perf_info.patch_sources_elapsed = time.perf_counter() - start_patch
# We need to rebuild disabled in order to include disabled sources
self.manifest.rebuild_disabled_lookup()
# copy the selectors from the root_project to the manifest
self.manifest.selectors = self.root_project.manifest_selectors
# update the refs, sources, and docs
# update the refs, sources, docs and metrics depends_on.nodes
# These check the created_at time on the nodes to
# determine whether they need processing.
start_process = time.perf_counter()
@@ -658,7 +660,7 @@ class ManifestLoader:
manifest.metadata.invocation_id = get_invocation_id()
return manifest
except Exception as exc:
fire_event(ParsedFileLoadFailed(path=path, exc=exc))
fire_event(ParsedFileLoadFailed(path=path, exc=str(exc)))
reparse_reason = ReparseReason.load_file_failure
else:
fire_event(PartialParseSaveFileNotFound())
@@ -946,7 +948,12 @@ def invalid_ref_fail_unless_test(node, target_model_name, target_model_package,
def invalid_source_fail_unless_test(node, target_name, target_table_name, disabled):
if node.resource_type == NodeType.Test:
msg = get_source_not_found_or_disabled_msg(node, target_name, target_table_name, disabled)
msg = get_not_found_or_disabled_msg(
node=node,
target_name=f"{target_name}.{target_table_name}",
target_kind="source",
disabled=disabled,
)
if disabled:
fire_event(InvalidDisabledSourceInTestNode(msg=msg))
else:
@@ -968,6 +975,21 @@ def invalid_metric_fail_unless_test(node, target_metric_name, target_metric_pack
)
def invalid_exposure_fail_unless_test(node, target_exposure_name, target_exposure_package):
if node.resource_type == NodeType.Test:
msg = get_target_not_found_or_disabled_msg(
node, target_exposure_name, target_exposure_package
)
warn_or_error(msg, log_fmt=warning_tag("{}"))
else:
exposure_target_not_found(
node,
target_exposure_name,
target_exposure_package,
)
def _check_resource_uniqueness(
manifest: Manifest,
config: RuntimeConfig,
@@ -1102,6 +1124,7 @@ def _process_refs_for_exposure(manifest: Manifest, current_project: str, exposur
if target_model is None or isinstance(target_model, Disabled):
# This may raise. Even if it doesn't, we don't want to add
# this exposure to the graph b/c there is no destination exposure
exposure.config.enabled = False
invalid_ref_fail_unless_test(
exposure,
target_model_name,
@@ -1142,7 +1165,8 @@ def _process_refs_for_metric(manifest: Manifest, current_project: str, metric: P
if target_model is None or isinstance(target_model, Disabled):
# This may raise. Even if it doesn't, we don't want to add
# this exposure to the graph b/c there is no destination exposure
# this metric to the graph b/c there is no destination metric
metric.config.enabled = False
invalid_ref_fail_unless_test(
metric,
target_model_name,
@@ -1163,7 +1187,7 @@ def _process_metrics_for_node(
):
"""Given a manifest and a node in that manifest, process its metrics"""
for metric in node.metrics:
target_metric: Optional[ParsedMetric] = None
target_metric: Optional[Union[Disabled, ParsedMetric]] = None
target_metric_name: str
target_metric_package: Optional[str] = None
@@ -1176,7 +1200,6 @@ def _process_metrics_for_node(
f"Metric references should always be 1 or 2 arguments - got {len(metric)}"
)
# Resolve_ref
target_metric = manifest.resolve_metric(
target_metric_name,
target_metric_package,
@@ -1184,9 +1207,10 @@ def _process_metrics_for_node(
node.package_name,
)
if target_metric is None:
if target_metric is None or isinstance(target_metric, Disabled):
# This may raise. Even if it doesn't, we don't want to add
# this node to the graph b/c there is no destination node
node.config.enabled = False
invalid_metric_fail_unless_test(
node,
target_metric_name,
@@ -1258,6 +1282,7 @@ def _process_sources_for_exposure(
exposure.package_name,
)
if target_source is None or isinstance(target_source, Disabled):
exposure.config.enabled = False
invalid_source_fail_unless_test(
exposure, source_name, table_name, disabled=(isinstance(target_source, Disabled))
)
@@ -1277,6 +1302,7 @@ def _process_sources_for_metric(manifest: Manifest, current_project: str, metric
metric.package_name,
)
if target_source is None or isinstance(target_source, Disabled):
metric.config.enabled = False
invalid_source_fail_unless_test(
metric, source_name, table_name, disabled=(isinstance(target_source, Disabled))
)

View File

@@ -86,11 +86,12 @@ class PythonParseVisitor(ast.NodeVisitor):
def _safe_eval(self, node):
try:
return ast.literal_eval(node)
except (SyntaxError, ValueError, TypeError) as exc:
msg = validator_error_message(exc)
raise ParsingException(msg, node=self.dbt_node) from exc
except (MemoryError, RecursionError) as exc:
msg = validator_error_message(exc)
except (SyntaxError, ValueError, TypeError, MemoryError, RecursionError) as exc:
msg = validator_error_message(
f"Error when trying to literal_eval an arg to dbt.ref(), dbt.source(), dbt.config() or dbt.config.get() \n{exc}\n"
"https://docs.python.org/3/library/ast.html#ast.literal_eval\n"
"In dbt python model, `dbt.ref`, `dbt.source`, `dbt.config`, `dbt.config.get` function args only support Python literal structures"
)
raise ParsingException(msg, node=self.dbt_node) from exc
def _get_call_literals(self, node):
@@ -205,16 +206,16 @@ class ModelParser(SimpleSQLParser[ParsedModelNode]):
dbtParser = PythonParseVisitor(node)
dbtParser.visit(tree)
config_keys_used = []
for (func, args, kwargs) in dbtParser.dbt_function_calls:
# TODO decide what we want to do with detected packages
# if func == "config":
# kwargs["detected_packages"] = dbtParser.packages
if func == "get":
context["config"](utilized=args)
config_keys_used.append(args[0])
continue
context[func](*args, **kwargs)
if config_keys_used:
# this is being used in macro build_config_dict
context["config"](config_keys_used=config_keys_used)
def render_update(self, node: ParsedModelNode, config: ContextConfig) -> None:
self.manifest._parsing_info.static_analysis_path_count += 1

View File

@@ -245,6 +245,22 @@ class PartialParsing:
if "overrides" in source:
self.remove_source_override_target(source)
def delete_disabled(self, unique_id, file_id):
# This node/metric/exposure is disabled. Find it and remove it from the disabled dictionary.
for dis_index, dis_node in enumerate(self.saved_manifest.disabled[unique_id]):
if dis_node.file_id == file_id:
node = dis_node
index = dis_index
break
# Remove node from disabled
del self.saved_manifest.disabled[unique_id][index]
# if all nodes were removed for the unique id, delete the unique_id
# from the disabled dict
if not self.saved_manifest.disabled[unique_id]:
self.saved_manifest.disabled.pop(unique_id)
return node
# Deletes for all non-schema files
def delete_from_saved(self, file_id):
# Look at all things touched by file, remove those
@@ -319,15 +335,7 @@ class PartialParsing:
and unique_id in self.saved_manifest.disabled
):
# This node is disabled. Find the node and remove it from disabled dictionary.
for dis_index, dis_node in enumerate(self.saved_manifest.disabled[unique_id]):
if dis_node.file_id == source_file.file_id:
node = dis_node
break
if dis_node:
# Remove node from disabled and unique_id from disabled dict if necessary
del self.saved_manifest.disabled[unique_id][dis_index]
if not self.saved_manifest.disabled[unique_id]:
self.saved_manifest.disabled.pop(unique_id)
node = self.delete_disabled(unique_id, source_file.file_id)
else:
# Has already been deleted by another action
return
@@ -885,34 +893,38 @@ class PartialParsing:
self.add_to_pp_files(self.saved_files[macro_file_id])
# exposures are created only from schema files, so just delete
# the exposure.
# the exposure or the disabled exposure.
def delete_schema_exposure(self, schema_file, exposure_dict):
exposure_name = exposure_dict["name"]
exposures = schema_file.exposures.copy()
for unique_id in exposures:
exposure = self.saved_manifest.exposures[unique_id]
if unique_id in self.saved_manifest.exposures:
exposure = self.saved_manifest.exposures[unique_id]
if exposure.name == exposure_name:
self.deleted_manifest.exposures[unique_id] = self.saved_manifest.exposures.pop(
unique_id
)
schema_file.exposures.remove(unique_id)
fire_event(PartialParsingDeletedExposure(unique_id=unique_id))
elif unique_id in self.saved_manifest.disabled:
self.delete_disabled(unique_id, schema_file.file_id)
# metric are created only from schema files, so just delete
# the metric.
# the metric or the disabled metric.
def delete_schema_metric(self, schema_file, metric_dict):
metric_name = metric_dict["name"]
metrics = schema_file.metrics.copy()
for unique_id in metrics:
metric = self.saved_manifest.metrics[unique_id]
if unique_id in self.saved_manifest.metrics:
metric = self.saved_manifest.metrics[unique_id]
if metric.name == metric_name:
self.deleted_manifest.metrics[unique_id] = self.saved_manifest.metrics.pop(
unique_id
)
schema_file.metrics.remove(unique_id)
fire_event(PartialParsingDeletedMetric(id=unique_id))
elif unique_id in self.saved_manifest.disabled:
self.delete_disabled(unique_id, schema_file.file_id)
def get_schema_element(self, elem_list, elem_name):
for element in elem_list:

View File

@@ -1,3 +1,5 @@
import os
import pathspec # type: ignore
import pathlib
from dbt.clients.system import load_file_contents
from dbt.contracts.files import (
@@ -107,9 +109,9 @@ def load_seed_source_file(match: FilePath, project_name) -> SourceFile:
# Use the FilesystemSearcher to get a bunch of FilePaths, then turn
# them into a bunch of FileSource objects
def get_source_files(project, paths, extension, parse_file_type, saved_files):
def get_source_files(project, paths, extension, parse_file_type, saved_files, ignore_spec):
# file path list
fp_list = filesystem_search(project, paths, extension)
fp_list = filesystem_search(project, paths, extension, ignore_spec)
# file block list
fb_list = []
for fp in fp_list:
@@ -129,42 +131,84 @@ def get_source_files(project, paths, extension, parse_file_type, saved_files):
return fb_list
def read_files_for_parser(project, files, dirs, extensions, parse_ft, saved_files):
def read_files_for_parser(project, files, dirs, extensions, parse_ft, saved_files, ignore_spec):
parser_files = []
for extension in extensions:
source_files = get_source_files(project, dirs, extension, parse_ft, saved_files)
source_files = get_source_files(
project, dirs, extension, parse_ft, saved_files, ignore_spec
)
for sf in source_files:
files[sf.file_id] = sf
parser_files.append(sf.file_id)
return parser_files
def generate_dbt_ignore_spec(project_root):
ignore_file_path = os.path.join(project_root, ".dbtignore")
ignore_spec = None
if os.path.exists(ignore_file_path):
with open(ignore_file_path) as f:
ignore_spec = pathspec.PathSpec.from_lines(pathspec.patterns.GitWildMatchPattern, f)
return ignore_spec
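# For example (hypothetical patterns; not part of the function above), the
# returned spec matches files the way .gitignore would:
#   spec = pathspec.PathSpec.from_lines(pathspec.patterns.GitWildMatchPattern, ["venv/", "*.tmp"])
#   spec.match_file("venv/lib/a.py")      # True  -> skipped by the file search
#   spec.match_file("models/orders.sql")  # False -> read as usual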
# This needs to read files for multiple projects, so the 'files'
# dictionary needs to be passed in. What determines the order of
# the various projects? Is the root project always last? Do the
# non-root projects need to be done separately in order?
def read_files(project, files, parser_files, saved_files):
dbt_ignore_spec = generate_dbt_ignore_spec(project.project_root)
project_files = {}
project_files["MacroParser"] = read_files_for_parser(
project, files, project.macro_paths, [".sql"], ParseFileType.Macro, saved_files
project,
files,
project.macro_paths,
[".sql"],
ParseFileType.Macro,
saved_files,
dbt_ignore_spec,
)
project_files["ModelParser"] = read_files_for_parser(
project, files, project.model_paths, [".sql", ".py"], ParseFileType.Model, saved_files
project,
files,
project.model_paths,
[".sql", ".py"],
ParseFileType.Model,
saved_files,
dbt_ignore_spec,
)
project_files["SnapshotParser"] = read_files_for_parser(
project, files, project.snapshot_paths, [".sql"], ParseFileType.Snapshot, saved_files
project,
files,
project.snapshot_paths,
[".sql"],
ParseFileType.Snapshot,
saved_files,
dbt_ignore_spec,
)
project_files["AnalysisParser"] = read_files_for_parser(
project, files, project.analysis_paths, [".sql"], ParseFileType.Analysis, saved_files
project,
files,
project.analysis_paths,
[".sql"],
ParseFileType.Analysis,
saved_files,
dbt_ignore_spec,
)
project_files["SingularTestParser"] = read_files_for_parser(
project, files, project.test_paths, [".sql"], ParseFileType.SingularTest, saved_files
project,
files,
project.test_paths,
[".sql"],
ParseFileType.SingularTest,
saved_files,
dbt_ignore_spec,
)
# all generic tests within /tests must be nested under a /generic subfolder
@@ -175,14 +219,27 @@ def read_files(project, files, parser_files, saved_files):
[".sql"],
ParseFileType.GenericTest,
saved_files,
dbt_ignore_spec,
)
project_files["SeedParser"] = read_files_for_parser(
project, files, project.seed_paths, [".csv"], ParseFileType.Seed, saved_files
project,
files,
project.seed_paths,
[".csv"],
ParseFileType.Seed,
saved_files,
dbt_ignore_spec,
)
project_files["DocumentationParser"] = read_files_for_parser(
project, files, project.docs_paths, [".md"], ParseFileType.Documentation, saved_files
project,
files,
project.docs_paths,
[".md"],
ParseFileType.Documentation,
saved_files,
dbt_ignore_spec,
)
project_files["SchemaParser"] = read_files_for_parser(
@@ -192,6 +249,7 @@ def read_files(project, files, parser_files, saved_files):
[".yml", ".yaml"],
ParseFileType.Schema,
saved_files,
dbt_ignore_spec,
)
# Store the parser files for this particular project

View File

@@ -67,7 +67,8 @@ class SchemaYamlRenderer(BaseRenderer):
elif self._is_norender_key(keypath[0:]):
return False
elif self.key == "metrics":
if keypath[0] == "sql":
# back compat: "expression" is new name, "sql" is old name
if keypath[0] in ("expression", "sql"):
return False
elif self._is_norender_key(keypath[0:]):
return False

View File

@@ -14,6 +14,9 @@ from dbt.clients.yaml_helper import load_yaml_text
from dbt.parser.schema_renderer import SchemaYamlRenderer
from dbt.context.context_config import (
ContextConfig,
BaseContextConfigGenerator,
ContextConfigGenerator,
UnrenderedConfigGenerator,
)
from dbt.context.configured import generate_schema_yml_context, SchemaYamlVars
from dbt.context.providers import (
@@ -23,6 +26,7 @@ from dbt.context.providers import (
)
from dbt.context.macro_resolver import MacroResolver
from dbt.contracts.files import FileHash, SchemaSourceFile
from dbt.contracts.graph.model_config import MetricConfig, ExposureConfig
from dbt.contracts.graph.parsed import (
ParsedNodePatch,
ColumnInfo,
@@ -74,13 +78,6 @@ from dbt.ui import warning_tag
from dbt.utils import get_pseudo_test_path, coerce_dict_str
UnparsedSchemaYaml = Union[
UnparsedSourceDefinition,
UnparsedNodeUpdate,
UnparsedAnalysisUpdate,
UnparsedMacroUpdate,
]
TestDef = Union[str, Dict[str, Any]]
schema_file_keys = (
@@ -91,6 +88,7 @@ schema_file_keys = (
"macros",
"analyses",
"exposures",
"metrics",
)
@@ -546,14 +544,12 @@ class SchemaParser(SimpleParser[GenericTestBlock, ParsedGenericTestNode]):
# parse exposures
if "exposures" in dct:
exp_parser = ExposureParser(self, yaml_block)
for exposure_node in exp_parser.parse():
self.manifest.add_exposure(yaml_block.file, exposure_node)
exp_parser.parse()
# parse metrics
if "metrics" in dct:
metric_parser = MetricParser(self, yaml_block)
for metric_node in metric_parser.parse():
self.manifest.add_metric(yaml_block.file, metric_node)
metric_parser.parse()
def check_format_version(file_path, yaml_dct) -> None:
@@ -974,7 +970,7 @@ class ExposureParser(YamlReader):
self.schema_parser = schema_parser
self.yaml = yaml
def parse_exposure(self, unparsed: UnparsedExposure) -> ParsedExposure:
def parse_exposure(self, unparsed: UnparsedExposure):
package_name = self.project.project_name
unique_id = f"{NodeType.Exposure}.{package_name}.{unparsed.name}"
path = self.yaml.path.relative_path
@@ -982,6 +978,25 @@ class ExposureParser(YamlReader):
fqn = self.schema_parser.get_fqn_prefix(path)
fqn.append(unparsed.name)
config = self._generate_exposure_config(
target=unparsed,
fqn=fqn,
package_name=package_name,
rendered=True,
)
unrendered_config = self._generate_exposure_config(
target=unparsed,
fqn=fqn,
package_name=package_name,
rendered=False,
)
if not isinstance(config, ExposureConfig):
raise InternalException(
f"Calculated a {type(config)} for an exposure, but expected an ExposureConfig"
)
parsed = ParsedExposure(
package_name=package_name,
root_path=self.project.project_root,
@@ -995,8 +1010,11 @@ class ExposureParser(YamlReader):
meta=unparsed.meta,
tags=unparsed.tags,
description=unparsed.description,
label=unparsed.label,
owner=unparsed.owner,
maturity=unparsed.maturity,
config=config,
unrendered_config=unrendered_config,
)
ctx = generate_parse_exposure(
parsed,
@@ -1007,9 +1025,36 @@ class ExposureParser(YamlReader):
depends_on_jinja = "\n".join("{{ " + line + "}}" for line in unparsed.depends_on)
get_rendered(depends_on_jinja, ctx, parsed, capture_macros=True)
# parsed now has a populated refs/sources
return parsed
def parse(self) -> Iterable[ParsedExposure]:
if parsed.config.enabled:
self.manifest.add_exposure(self.yaml.file, parsed)
else:
self.manifest.add_disabled(self.yaml.file, parsed)
def _generate_exposure_config(
self, target: UnparsedExposure, fqn: List[str], package_name: str, rendered: bool
):
generator: BaseContextConfigGenerator
if rendered:
generator = ContextConfigGenerator(self.root_project)
else:
generator = UnrenderedConfigGenerator(self.root_project)
# configs with precedence set
precedence_configs = dict()
# apply exposure configs
precedence_configs.update(target.config)
return generator.calculate_node_config(
config_call_dict={},
fqn=fqn,
resource_type=NodeType.Exposure,
project_name=package_name,
base=False,
patch_config_dict=precedence_configs,
)
def parse(self):
for data in self.get_key_dicts():
try:
UnparsedExposure.validate(data)
@@ -1017,8 +1062,8 @@ class ExposureParser(YamlReader):
except (ValidationError, JSONValidationException) as exc:
msg = error_context(self.yaml.path, self.key, data, exc)
raise ParsingException(msg) from exc
parsed = self.parse_exposure(unparsed)
yield parsed
self.parse_exposure(unparsed)
class MetricParser(YamlReader):
@@ -1027,7 +1072,7 @@ class MetricParser(YamlReader):
self.schema_parser = schema_parser
self.yaml = yaml
def parse_metric(self, unparsed: UnparsedMetric) -> ParsedMetric:
def parse_metric(self, unparsed: UnparsedMetric):
package_name = self.project.project_name
unique_id = f"{NodeType.Metric}.{package_name}.{unparsed.name}"
path = self.yaml.path.relative_path
@@ -1035,6 +1080,25 @@ class MetricParser(YamlReader):
fqn = self.schema_parser.get_fqn_prefix(path)
fqn.append(unparsed.name)
config = self._generate_metric_config(
target=unparsed,
fqn=fqn,
package_name=package_name,
rendered=True,
)
unrendered_config = self._generate_metric_config(
target=unparsed,
fqn=fqn,
package_name=package_name,
rendered=False,
)
if not isinstance(config, MetricConfig):
raise InternalException(
f"Calculated a {type(config)} for a metric, but expected a MetricConfig"
)
parsed = ParsedMetric(
package_name=package_name,
root_path=self.project.project_root,
@@ -1046,14 +1110,17 @@ class MetricParser(YamlReader):
name=unparsed.name,
description=unparsed.description,
label=unparsed.label,
type=unparsed.type,
sql=str(unparsed.sql),
calculation_method=unparsed.calculation_method,
expression=str(unparsed.expression),
timestamp=unparsed.timestamp,
dimensions=unparsed.dimensions,
window=unparsed.window,
time_grains=unparsed.time_grains,
filters=unparsed.filters,
meta=unparsed.meta,
tags=unparsed.tags,
config=config,
unrendered_config=unrendered_config,
)
ctx = generate_parse_metrics(
@@ -1067,20 +1134,48 @@ class MetricParser(YamlReader):
model_ref = "{{ " + parsed.model + " }}"
get_rendered(model_ref, ctx, parsed)
parsed.sql = get_rendered(
parsed.sql,
parsed.expression = get_rendered(
parsed.expression,
ctx,
node=parsed,
)
return parsed
# if the metric is disabled we do not want it included in the manifest, only in the disabled dict
if parsed.config.enabled:
self.manifest.add_metric(self.yaml.file, parsed)
else:
self.manifest.add_disabled(self.yaml.file, parsed)
def parse(self) -> Iterable[ParsedMetric]:
def _generate_metric_config(
self, target: UnparsedMetric, fqn: List[str], package_name: str, rendered: bool
):
generator: BaseContextConfigGenerator
if rendered:
generator = ContextConfigGenerator(self.root_project)
else:
generator = UnrenderedConfigGenerator(self.root_project)
# configs with precedence set
precedence_configs = dict()
# first apply metric configs
precedence_configs.update(target.config)
return generator.calculate_node_config(
config_call_dict={},
fqn=fqn,
resource_type=NodeType.Metric,
project_name=package_name,
base=False,
patch_config_dict=precedence_configs,
)
def parse(self):
for data in self.get_key_dicts():
try:
UnparsedMetric.validate(data)
unparsed = UnparsedMetric.from_dict(data)
except (ValidationError, JSONValidationException) as exc:
msg = error_context(self.yaml.path, self.key, data, exc)
raise ParsingException(msg) from exc
yield self.parse_metric(unparsed)
self.parse_metric(unparsed)

View File

@@ -1,6 +1,7 @@
import os
from dataclasses import dataclass
from typing import List, Callable, Iterable, Set, Union, Iterator, TypeVar, Generic
from typing import List, Callable, Iterable, Set, Union, Iterator, TypeVar, Generic, Optional
from pathspec import PathSpec # type: ignore
from dbt.clients.jinja import extract_toplevel_blocks, BlockTag
from dbt.clients.system import find_matching
@@ -61,11 +62,16 @@ class FullBlock(FileBlock):
return self.block.full_block
def filesystem_search(project: Project, relative_dirs: List[str], extension: str):
def filesystem_search(
project: Project,
relative_dirs: List[str],
extension: str,
ignore_spec: Optional[PathSpec] = None,
):
ext = "[!.#~]*" + extension
root = project.project_root
file_path_list = []
for result in find_matching(root, relative_dirs, ext):
for result in find_matching(root, relative_dirs, ext, ignore_spec):
if "searched_path" not in result or "relative_path" not in result:
raise InternalException("Invalid result from find_matching: {}".format(result))
file_match = FilePath(

View File

@@ -43,7 +43,7 @@ class SqlBlockParser(SimpleSQLParser[ParsedSqlNode]):
return os.path.join("sql", block.name)
def parse_remote(self, sql: str, name: str) -> ParsedSqlNode:
source_file = SourceFile.remote(sql, self.project.project_name)
source_file = SourceFile.remote(sql, self.project.project_name, "sql")
contents = SqlBlock(block_name=name, file=source_file)
return self.parse_node(contents)

Some files were not shown because too many files have changed in this diff.