Compare commits

569 Commits

Author SHA1 Message Date
Nathaniel May
3144df1fa6 point to rust module 2021-02-09 15:05:25 -05:00
Nathaniel May
992dc5ce5c use relative import in tracking init 2021-02-09 14:00:09 -05:00
Nathaniel May
243c2cb0ed builds with pip. library functions are not actually included though. 2021-02-09 10:49:25 -05:00
Nathaniel May
c888fe52d6 expose functions in a python module 2021-02-08 12:15:08 -05:00
Nathaniel May
32ff2fbfd4 name project 2021-02-08 12:14:52 -05:00
Nathaniel May
7599b9bca1 add special linker rules for mac 2021-02-08 12:14:27 -05:00
Nathaniel May
0b1d93a18b expose tracking string literals in pyo3 library 2021-02-05 11:39:21 -05:00
Kyle Wigley
2b48152da6 Merge branch 'dev/0.19.1' into dev/margaret-mead 2021-01-27 17:16:13 -05:00
Christophe Blefari
e743e23d6b Update CHANGELOG to release fix in dbt 0.19.1 version 2021-01-27 16:57:29 -05:00
Christophe Blefari
f846f921f2 Bump werkzeug upper bound dependency constraint to include version 1.0 2021-01-27 16:55:56 -05:00
Github Build Bot
1060035838 Merge remote-tracking branch 'origin/releases/0.19.0' into dev/kiyoshi-kuromiya 2021-01-27 18:02:37 +00:00
Github Build Bot
69cc20013e Release dbt v0.19.0 2021-01-27 17:39:48 +00:00
Github Build Bot
3572bfd37d Merge remote-tracking branch 'origin/releases/0.19.0rc3' into dev/kiyoshi-kuromiya 2021-01-27 16:42:46 +00:00
Github Build Bot
a6b82990f5 Release dbt v0.19.0rc3 2021-01-27 16:07:41 +00:00
Kyle Wigley
540c1fd9c6 Merge pull request #3019 from fishtown-analytics/fix/cleanup-dockerfile
Clean up docker resources
2021-01-25 10:19:45 -05:00
Jeremy Cohen
46d36cd412 Merge pull request #3028 from NiallRees/lowercase_cte_names
Make generated CTE test names lowercase to match style guide
2021-01-25 14:39:26 +01:00
NiallRees
a170764fc5 Add to contributors 2021-01-25 11:16:00 +00:00
NiallRees
f72873a1ce Update CHANGELOG.md
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-25 11:13:32 +00:00
NiallRees
82496c30b1 Changelog 2021-01-24 16:35:40 +00:00
NiallRees
cb3c007acd Make generated CTE test names lowercase to match style guide 2021-01-24 16:19:20 +00:00
Jeremy Cohen
cb460a797c Merge pull request #3018 from lynxcare/fix-issue-debug-exit-code
dbt debug should return 1 when one of the tests fails
2021-01-21 16:36:03 +01:00
Sam Debruyn
df24c7d2f8 Merge branch 'dev/margaret-mead' into fix-issue-debug-exit-code 2021-01-21 15:39:18 +01:00
Sam Debruyn
133c15c0e2 move in changelog to v0.20 2021-01-21 15:38:31 +01:00
Kyle Wigley
116e18a19e rename testing dockerfile 2021-01-21 09:28:17 -05:00
Sam Debruyn
ec0af7c97b remove exitcodes and sys.exit 2021-01-21 10:36:05 +01:00
Jeremy Cohen
a34a877737 Merge pull request #2974 from rvacaru/fix-bug-2731
Fix bug #2731 on stripping query comments for snowflake
2021-01-21 09:54:22 +01:00
Sam Debruyn
f018794465 fix flake test - formatting 2021-01-20 21:09:58 +01:00
Sam Debruyn
d45f5e9791 add missing conditions 2021-01-20 18:15:32 +01:00
Razvan Vacaru
04bd0d834c added extra unit test 2021-01-20 18:06:17 +01:00
Sam Debruyn
ed4f0c4713 formatting 2021-01-20 18:04:21 +01:00
Sam Debruyn
c747068d4a use sys.exit 2021-01-20 16:51:06 +01:00
Kyle Wigley
aa0fbdc993 update changelog 2021-01-20 10:33:18 -05:00
Kyle Wigley
b50bfa7277 - rm older dockerfiles
- add dockerfile from dbt-releases
- rename the development dockerfile to Dockerfile.dev to avoid confusion
2021-01-20 10:23:03 -05:00
Sam Debruyn
e91988f679 use ExitCodes enum for exit code 2021-01-20 16:09:41 +01:00
Sam Debruyn
3ed1fce3fb update changelog 2021-01-20 16:06:24 +01:00
Sam Debruyn
e3ea0b511a dbt debug should return 1 when one of the tests fails 2021-01-20 16:00:58 +01:00
Razvan Vacaru
c411c663de moved unit tests and updated changelog.md 2021-01-19 19:04:58 +01:00
Razvan Vacaru
1c6f66fc14 Merge branch 'dev/margaret-mead' of https://github.com/fishtown-analytics/dbt into fix-bug-2731 2021-01-19 19:01:01 +01:00
Jeremy Cohen
1f927a374c Merge pull request #2928 from yu-iskw/issue-1843
Support require_partition_filter and partition_expiration_days in BQ
2021-01-19 12:11:39 +01:00
Jeremy Cohen
07c4225aa8 Merge branch 'dev/margaret-mead' into issue-1843 2021-01-19 11:24:59 +01:00
Github Build Bot
42a85ac39f Merge remote-tracking branch 'origin/releases/0.19.0rc2' into dev/kiyoshi-kuromiya 2021-01-14 17:41:49 +00:00
Github Build Bot
16e6d31ee3 Release dbt v0.19.0rc2 2021-01-14 17:21:25 +00:00
Kyle Wigley
a6db5b436d Merge pull request #2996 from fishtown-analytics/fix/rm-ellipses
Remove ellipses printed while parsing
2021-01-14 10:39:16 -05:00
Kyle Wigley
47675f2e28 update changelog 2021-01-14 09:28:28 -05:00
Kyle Wigley
0642bbefa7 remove ellipses printed while parsing 2021-01-14 09:28:05 -05:00
Kyle Wigley
43da603d52 Merge pull request #3009 from fishtown-analytics/fix/exposure-parsing
Fix exposure parsing to allow other resources with the same name
2021-01-14 09:26:02 -05:00
Kyle Wigley
f9e1f4d111 update changelog 2021-01-13 11:54:20 -05:00
Jeremy Cohen
1508564e10 Merge pull request #3008 from fishtown-analytics/feat/print-exposure-stats-too
Add exposures to print_compile_stats
2021-01-13 15:58:13 +01:00
Kyle Wigley
c14e6f4dcc add test for dupe exposures and dupe model/exposure name 2021-01-13 08:55:22 -05:00
Jeremy Cohen
75b6a20134 Add exposures to Found list 2021-01-12 19:07:52 +01:00
Kyle Wigley
d82a07c221 tweak exposure parsing logic 2021-01-12 12:41:51 -05:00
Jeremy Cohen
c6f7dbcaa5 Merge pull request #3006 from stpierre/postgres-unpin-botocore
postgres: Don't pin botocore version
2021-01-12 13:59:55 +01:00
Chris St. Pierre
82cd099e48 Update CHANGELOG 2021-01-12 06:20:09 -06:00
Chris St. Pierre
546c011dd8 postgres: Don't pin botocore version
`snowflake-connector-python` doesn't pin it, and it restricts us to a
much older version of boto3 than the boto3 pin would otherwise allow
(specifically, botocore<1.15 requires boto3<1.12).
2021-01-11 17:25:03 -06:00
Jeremy Cohen
10b33ccaf6 Merge pull request #3004 from mikaelene/Snapshot_merge_WHEN_MATCHED
This change makes the macro easier to read and workable on SQL Server
2021-01-11 16:42:09 +01:00
mikaelene
bc01572176 Same as #3003, but for postgres 2021-01-11 16:04:38 +01:00
mikaelene
ccd2064722 This change makes the macro easier to read and makes the code work for SQL Server without a custom adapter macro. Solved #3003 2021-01-11 15:04:23 +01:00
mikaelene
0fb42901dd This change makes the macro easier to read and makes the code work for SQL Server without a custom adapter macro. Solved #3003 2021-01-11 14:58:07 +01:00
Jeremy Cohen
a4280d7457 Merge pull request #3000 from swanderz/tsql_not_equal_workaround
Tsql not equal workaround
2021-01-11 09:40:33 +01:00
Anders
6966ede68b Update CHANGELOG.md
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-10 20:54:37 -08:00
Anders
27dd14a5a2 Update core/dbt/include/global_project/macros/materializations/snapshot/strategies.sql
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-10 20:54:10 -08:00
Anders
2494301f1e Update CHANGELOG.md
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-10 20:53:52 -08:00
Anders Swanson
f13143accb for posterity 2021-01-08 13:23:13 -08:00
Anders Swanson
26d340a917 temp hack 2021-01-08 12:14:08 -08:00
Anders Swanson
cc75cd4102 no tsql support for condA != condB 2021-01-08 12:10:15 -08:00
Anders Swanson
cf8615b231 Merge branch 'dev/kiyoshi-kuromiya' of https://github.com/fishtown-analytics/dbt into dev/kiyoshi-kuromiya 2021-01-08 12:03:15 -08:00
Jeremy Cohen
30f473a2b1 Merge pull request #2994 from fishtown-analytics/copyedit-changelog
Light cleanup of v0.19.0 changelogs
2021-01-07 16:00:02 +01:00
Jeremy Cohen
4618709baa Lightly edit v0.19 changelogs 2021-01-07 10:13:43 +01:00
Razvan Vacaru
16b098ea42 updated CHANGELOG.md 2021-01-04 17:43:03 +01:00
Razvan Vacaru
b31c4d407a Fix #2731 stripping snowflake comments in multiline queries 2021-01-04 17:41:00 +01:00
Kyle Wigley
28c36cc5e2 Merge pull request #2988 from fishtown-analytics/fix/dockerfile
Manually fix requirements for dockerfile using new pip version
2021-01-04 09:10:05 -05:00
Kyle Wigley
6bfbcb842e manually fix dockerfile using new pip version 2020-12-31 13:53:50 -05:00
Github Build Bot
a0eade4fdd Merge remote-tracking branch 'origin/releases/0.19.0rc1' into dev/kiyoshi-kuromiya 2020-12-29 23:07:35 +00:00
Github Build Bot
ee24b7e88a Release dbt v0.19.0rc1 2020-12-29 22:52:34 +00:00
Anders Swanson
c9baddf9a4 Merge branch 'master' of https://github.com/fishtown-analytics/dbt into dev/kiyoshi-kuromiya 2020-12-22 23:11:09 -08:00
Kyle Wigley
c5c780a685 Merge pull request #2972 from fishtown-analytics/feature/update-dbt-docs
dbt-docs changes for v0.19.0-rc1
2020-12-22 14:07:20 -05:00
Kyle Wigley
421aaabf62 Merge pull request #2961 from fishtown-analytics/feature/add-adapter-query-stats
Include adapter response info in execution results
2020-12-22 13:57:07 -05:00
Kyle Wigley
86788f034f update changelog 2020-12-22 13:30:50 -05:00
Kyle Wigley
232d3758cf update dbt docs 2020-12-22 13:17:51 -05:00
Kyle Wigley
71bcf9b31d update changelog 2020-12-22 13:08:12 -05:00
Kyle Wigley
bf4ee4f064 update api, fix tests, add placeholder for test/source results 2020-12-22 12:13:37 -05:00
Kyle Wigley
aa3bdfeb17 update naming 2020-12-21 13:35:15 -05:00
Jeremy Cohen
ce6967d396 Merge pull request #2966 from fishtown-analytics/fix/add-ctes-comment
Update comments for _add_ctes()
2020-12-18 10:54:37 -05:00
Yu ISHIKAWA
330065f5e0 Add a condition for require_partition_filter 2020-12-18 11:14:03 +09:00
Yu ISHIKAWA
944db82553 Remove unnecessary code for print debug 2020-12-18 11:14:03 +09:00
Yu ISHIKAWA
c257361f05 Fix syntax 2020-12-18 11:14:03 +09:00
Yu ISHIKAWA
ffdbfb018a Implement tests in test_bigquery_changing_partitions.py 2020-12-18 11:14:01 +09:00
Yu ISHIKAWA
cfa2bd6b08 Remove tests from test_bigquery_adapter_specific.py 2020-12-18 11:13:16 +09:00
Yu ISHIKAWA
51e90c3ce0 Format 2020-12-18 11:13:16 +09:00
Yu ISHIKAWA
d69149f43e Update 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
f261663f3d Add debug code 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
e5948dd1d3 Update 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
5f13aab7d8 Print debug 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
292d489592 Format code 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
0a01f20e35 Update CHANGELOG.md 2020-12-18 11:13:11 +09:00
Yu ISHIKAWA
2bd08d5c4c Support require_partition_filter and partition_expiration_days in BQ 2020-12-18 11:12:47 +09:00
Jeremy Cohen
adae5126db Merge pull request #2954 from fishtown-analytics/feature/defer-tests
Feature: defer tests
2020-12-17 18:01:14 -05:00
Kyle Wigley
dddf1bcb76 first pass at adding query stats, naming tbd 2020-12-17 16:39:02 -05:00
Jeremy Cohen
d23d4b0fd4 Merge pull request #2963 from tyang209/issue-2931
Bumped boto3 version upper range for dbt-redshift
2020-12-17 14:30:47 -05:00
Tao Yang
658f7550b3 Merge branch 'dev/kiyoshi-kuromiya' into issue-2931 2020-12-17 08:58:49 -08:00
Kyle Wigley
cfb50ae21e Merge pull request #2960 from fishtown-analytics/feature/python-39
Test python3.9
2020-12-17 11:11:56 -05:00
Jeremy Cohen
9b0a365822 Update comments for _add_ctes() 2020-12-17 10:35:04 -05:00
Jeremy Cohen
97ab130619 Merge pull request #2958 from fishtown-analytics/fix/keyerror-defer-missing-parent
Fix KeyError from defer + deletion
2020-12-17 10:29:51 -05:00
Tao Yang
3578fde290 Bumped boto3 version upper range for dbt-redshift 2020-12-16 20:03:53 -08:00
Jeremy Cohen
f382da69b8 Changelog 2020-12-16 17:46:00 -05:00
Jeremy Cohen
2da3d215c6 Add test case to repro bug 2020-12-16 17:38:27 -05:00
Kyle Wigley
43ed29c14c update changelog 2020-12-16 16:29:48 -05:00
Jeremy Cohen
9df0283689 Truthier? 2020-12-16 14:55:27 -05:00
Jeremy Cohen
04b82cf4a5 What is backward may not be forward 2020-12-16 14:55:27 -05:00
Jeremy Cohen
274c3012b0 Add defer to rpc test method 2020-12-16 14:53:25 -05:00
Jeremy Cohen
2b24a4934f defer tests, too 2020-12-16 14:42:00 -05:00
Kyle Wigley
692a423072 comment out snowflake py39 tests 2020-12-16 11:27:00 -05:00
Kyle Wigley
148f55335f address issue with py39 2020-12-16 11:25:31 -05:00
Kyle Wigley
2f752842a1 update hologram and add new envs to tox 2020-12-16 11:25:31 -05:00
Jeremy Cohen
aff72996a1 Merge pull request #2946 from fishtown-analytics/fix/defer-if-not-exist
Defer iff unselected reference does not exist in current env
2020-12-16 11:22:31 -05:00
Jeremy Cohen
08e425bcf6 Handle keyerror if old node missing 2020-12-16 00:24:00 -05:00
Kyle Wigley
454ddc601a Merge pull request #2943 from fishtown-analytics/feature/refactor-run-results
Clean up run results
2020-12-15 12:42:22 -05:00
Jeremy Cohen
b025f208a8 Check if relation exists before deferring 2020-12-14 22:21:43 -05:00
Kyle Wigley
b60e533b9d fix printer output 2020-12-14 19:50:17 -05:00
Kyle Wigley
37af0e0d59 update changelog 2020-12-14 16:28:23 -05:00
Kyle Wigley
ac1de5bce9 more updates 2020-12-14 16:28:23 -05:00
Kyle Wigley
ef7ff55e07 flake8 2020-12-14 16:28:23 -05:00
Kyle Wigley
608db5b982 code cleanup + swap node with unique_id 2020-12-14 16:28:23 -05:00
Kyle Wigley
8dd69efd48 address test failures 2020-12-14 16:28:23 -05:00
Kyle Wigley
73f7fba793 fix printing test status 2020-12-14 16:28:23 -05:00
Kyle Wigley
867e2402d2 chugging along 2020-12-14 16:28:23 -05:00
Kyle Wigley
a3b9e61967 first pass, lots of TODO's [skip ci] 2020-12-14 16:28:22 -05:00
Jeremy Cohen
cd149b68e8 Merge pull request #2920 from joellabes/2913-docs-block-exposures
Render docs blocks in exposures
2020-12-13 18:38:23 -05:00
Joel Labes
cd3583c736 Merge branch 'dev/kiyoshi-kuromiya' into 2913-docs-block-exposures 2020-12-13 14:27:37 +13:00
Joel Labes
441f86f3ed Add test.notebook_info to expected manifest 2020-12-13 14:25:37 +13:00
Joel Labes
f62bea65a1 Move model.test.view_summary to parent map instead of child map 2020-12-13 14:11:04 +13:00
Jeremy Cohen
886b574987 Merge pull request #2939 from fishtown-analytics/fix/big-seed-smaller-path
Use diff file path for big seed checksum
2020-12-07 11:18:15 -05:00
Joel Labes
2888bac275 Merge branch 'dev/kiyoshi-kuromiya' into 2913-docs-block-exposures 2020-12-07 21:17:21 +13:00
Joel Labes
35c9206916 Fix test failure (?) 2020-12-07 21:15:44 +13:00
Joel Labes
c4c5b59312 Stab at updating parent and child maps 2020-12-07 17:45:12 +13:00
Jeremy Cohen
f25fb4e5ac Use diff file path for big seed checksum 2020-12-04 17:04:27 -05:00
Jeremy Cohen
868bfec5e6 Merge pull request #2907 from max-sixty/raise
Remove duplicate raise
2020-12-03 14:17:58 -05:00
Jeremy Cohen
e7c242213a Merge pull request #2908 from max-sixty/bq-default-project
Allow BigQuery to default on project name
2020-12-03 14:17:02 -05:00
Jeremy Cohen
862552ead4 Merge pull request #2930 from fishtown-analytics/revert-2858-dependabot/pip/docker/requirements/cryptography-3.2
Revert dependabot cryptography upgrade for old versions
2020-12-03 13:58:26 -05:00
Jeremy Cohen
9d90e0c167 tiny changelog fixup 2020-12-03 13:27:46 -05:00
Jeremy Cohen
a281f227cd Revert "Bump cryptography from 2.9.2 to 3.2 in /docker/requirements" 2020-12-03 12:12:15 -05:00
Maximilian Roos
5b981278db changelog 2020-12-02 14:59:35 -08:00
Maximilian Roos
c1091ed3d1 Merge branch 'dev/kiyoshi-kuromiya' into bq-default-project 2020-12-02 14:55:27 -08:00
Maximilian Roos
08aed63455 Formatting 2020-12-02 11:19:02 -08:00
Maximilian Roos
90a550ee4f Update plugins/bigquery/dbt/adapters/bigquery/connections.py
Co-authored-by: Kyle Wigley <kwigley44@gmail.com>
2020-12-02 10:41:20 -08:00
Jeremy Cohen
34869fc2a2 Merge pull request #2922 from plotneishestvo/snowflake_connector_upgrade
update cryptography package and snowflake connector
2020-12-02 12:34:34 -05:00
Pavel Plotnikov
3deb10156d Merge branch 'dev/kiyoshi-kuromiya' into snowflake_connector_upgrade 2020-12-02 12:46:02 +02:00
Maximilian Roos
8c0e84de05 Move method to module func 2020-12-01 16:19:20 -08:00
Joel Labes
23be083c39 Change models folder to ref_models folder 2020-12-02 11:59:21 +13:00
Joel Labes
217aafce39 Add line break to description, fix refs and maybe fix original_file_path 2020-12-02 11:47:29 +13:00
Joel Labes
03210c63f4 Blank instead of none description 2020-12-02 10:57:47 +13:00
Joel Labes
a90510f6f2 Ref a model that actually exists 2020-12-02 10:40:34 +13:00
Joel Labes
36d91aded6 Empty description for minimal/basic exposure object tests 2020-12-01 17:56:55 +13:00
Joel Labes
9afe8a1297 Default to empty string for ParsedExposure description 2020-12-01 17:35:42 +13:00
Maximilian Roos
1e6f272034 Add test config 2020-11-30 20:06:06 -08:00
Maximilian Roos
a1aa2f81ef _ 2020-11-30 19:30:07 -08:00
Maximilian Roos
62899ef308 _ 2020-11-30 16:54:21 -08:00
Joel Labes
7f3396c002 Forgot another comma 🤦 2020-12-01 12:46:26 +13:00
Joel Labes
453bc18196 Merge branch '2913-docs-block-exposures' of https://github.com/joellabes/dbt into 2913-docs-block-exposures 2020-12-01 12:42:11 +13:00
Joel Labes
dbb6b57b76 Forgot a comma 2020-12-01 12:40:51 +13:00
Joel Labes
d7137db78c Merge branch 'dev/kiyoshi-kuromiya' into 2913-docs-block-exposures 2020-12-01 12:34:29 +13:00
Joel Labes
5ac4f2d80b Move description arg to be below default-free args 2020-12-01 12:33:08 +13:00
Jeremy Cohen
5ba5271da9 Merge pull request #2903 from db-magnus/bq-hourly-part
Hourly, monthly and yearly partitions in BigQuery
2020-11-30 09:46:36 -05:00
Pavel Plotnikov
b834e3015a update changelog md 2020-11-30 14:46:51 +02:00
Joel Labes
c8721ded62 Code review: non-optional description, docs block tests, yaml exposure attributes 2020-11-30 20:29:47 +13:00
Magnus Fagertun
1e97372d24 Update test/unit/test_bigquery_adapter.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-30 07:26:36 +01:00
Magnus Fagertun
fd4e111784 Update test/unit/test_bigquery_adapter.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-30 00:44:25 +01:00
Magnus Fagertun
75094e7e21 Update test/unit/test_bigquery_adapter.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-30 00:44:15 +01:00
Joel Labes
8db2d674ed Update CHANGELOG.md 2020-11-28 15:08:13 +13:00
Pavel Plotnikov
ffb140fab3 update cryptography package and snowflake connector 2020-11-27 16:52:13 +02:00
Joel Labes
e93543983c Follow Jeremy's wild speculation 2020-11-27 22:45:31 +13:00
Magnus Fagertun
0d066f80ff added test and enhancements from jtcohen6 2020-11-25 21:41:51 +01:00
Magnus Fagertun
ccca1b2016 Update plugins/bigquery/dbt/adapters/bigquery/impl.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-25 21:17:07 +01:00
Kyle Wigley
fec0e31a25 Merge pull request #2902 from fishtown-analytics/fix/test-selection
set default `materialized` for test node configs
2020-11-24 12:19:40 -05:00
Kyle Wigley
d246aa8f6d update readme 2020-11-24 10:40:01 -05:00
Maximilian Roos
66bfba2258 flake8 seems to sometimes be applied 2020-11-23 17:39:57 -08:00
Maximilian Roos
b53b4373cb Define database exclusively in contracts/connection.py 2020-11-23 17:32:41 -08:00
Maximilian Roos
0810f93883 Allow BigQuery to default on project name 2020-11-23 16:58:54 -08:00
Maximilian Roos
a4e696a252 Remove duplicate raise 2020-11-23 15:34:43 -08:00
Jeremy Cohen
0951d08f52 Merge pull request #2877 from max-sixty/unlock-google-api
Wider google-cloud dependencies
2020-11-23 14:16:12 -05:00
Jeremy Cohen
dbf367e070 Merge branch 'dev/kiyoshi-kuromiya' into unlock-google-api 2020-11-23 11:46:07 -05:00
Magnus Fagertun
6447ba8ec8 whitespace cleanup 2020-11-22 10:00:10 +01:00
Magnus Fagertun
43e260966f uppercase and lowercase for date partitions supported 2020-11-21 01:21:07 +01:00
Magnus Fagertun
b0e301b046 typo in _partitions_match 2020-11-21 00:40:27 +01:00
Magnus Fagertun
c8a9ea4979 added month,year to date partitioning, granularity comparison to _partitions_match 2020-11-21 00:24:20 +01:00
Maximilian Roos
afb7fc05da Changelog 2020-11-20 14:58:46 -08:00
Magnus Fagertun
14124ccca8 added tests for datetime and timestamp 2020-11-20 00:10:15 +01:00
Magnus Fagertun
df5022dbc3 moving granularity to render, not to break tests 2020-11-19 18:51:05 +01:00
Magnus Fagertun
015e798a31 more BQ partitioning 2020-11-19 17:42:27 +01:00
Kyle Wigley
c19125bb02 Merge pull request #2893 from fishtown-analytics/feature/track-parse-time
Add event tracking for project parse/load time
2020-11-19 10:30:46 -05:00
Kyle Wigley
0e6ac5baf1 can we just default materialization to test? 2020-11-19 09:27:31 -05:00
Magnus Fagertun
2c8d1b5b8c Added hour, year, month partitioning BQ 2020-11-19 13:47:42 +01:00
Kyle Wigley
f7c0c1c21a fix tests 2020-11-18 17:21:41 -05:00
Kyle Wigley
4edd98f7ce update changelog 2020-11-18 16:19:58 -05:00
Kyle Wigley
df0abb7000 flake8 fixes 2020-11-18 16:19:58 -05:00
Kyle Wigley
4f93da307f add event to track loading time 2020-11-18 16:19:58 -05:00
Gerda Shank
a8765d54aa Merge pull request #2895 from fishtown-analytics/string_selectors
convert cli-style strings in selectors to normalized dictionaries
2020-11-18 15:53:23 -05:00
Gerda Shank
bb834358d4 convert cli-style strings in selectors to normalized dictionaries
[#2879]
2020-11-18 14:43:44 -05:00
Jeremy Cohen
ec0f3d22e7 Merge pull request #2892 from rsella/dev/kiyoshi-kuromiya
Change dbt list command to always return 0 as exit code
2020-11-17 11:12:55 -05:00
Riccardo Sella
009b75cab6 Fix changelog and edit additional failing tests 2020-11-17 16:38:28 +01:00
Riccardo Sella
d64668df1e Change dbt list command to always return 0 as exit code 2020-11-17 14:49:24 +01:00
Gerda Shank
72e808c9a7 Merge pull request #2889 from fishtown-analytics/dbt-test-runner
Add scripts/dtr.py, dbt test runner. Bump hologram version.
2020-11-15 20:06:28 -05:00
Gerda Shank
96cc9223be Add scripts/dtr.py, dbt test runner. Bump hologram version. 2020-11-13 14:34:10 -05:00
Gerda Shank
13b099fbd0 Merge pull request #2883 from fishtown-analytics/feature/2824-parse-only-command
Add parse command and collect parse timing info [#2824]
2020-11-13 10:19:19 -05:00
Gerda Shank
1a8416c297 Add parse command and collect parse timing info [#2824] 2020-11-12 13:56:41 -05:00
Maximilian Roos
8538bec99e _ 2020-11-11 13:48:41 -08:00
Maximilian Roos
f983900597 google-cloud-bigquery goes to 3 2020-11-10 23:51:15 -08:00
Gerda Shank
3af02020ff Merge pull request #2866 from fishtown-analytics/feature/2693-selectors-to-manifest
Save selector dictionary and write out in manifest [#2693][#2800]
2020-11-10 11:48:19 -05:00
Maximilian Roos
8c71488757 _ 2020-11-10 08:38:43 -08:00
Gerda Shank
74316bf702 Save selector dictionary and write out in manifest [#2693][#2800] 2020-11-10 11:17:14 -05:00
Maximilian Roos
7aa8c435c9 Bump protobuf too 2020-11-09 17:36:41 -08:00
Maximilian Roos
daeb51253d Unpin google-cloud dependencies 2020-11-09 17:18:42 -08:00
Jeremy Cohen
0ce2f41db4 Reorg #2837 in changelog 2020-11-09 09:46:08 -05:00
Jeremy Cohen
02e5a962d7 Merge pull request #2837 from franloza/feature/2647-relation-name-in-metadata
Store relation name in manifest's node and source objects
2020-11-09 09:44:40 -05:00
Jeremy Cohen
dcc32dc69f Merge pull request #2850 from elexisvenator/patch-1
Postgres: Prevent temp relation identifiers from being too long
2020-11-09 09:32:35 -05:00
Gerda Shank
af3d6681dd extend timeout for test/rpc 2020-11-06 17:45:35 -05:00
Gerda Shank
106968a3be Merge pull request #2858 from fishtown-analytics/dependabot/pip/docker/requirements/cryptography-3.2
Bump cryptography from 2.9.2 to 3.2 in /docker/requirements
2020-11-06 15:54:49 -05:00
Ben Edwards
2cd56ca044 Update changelog 2020-11-03 20:58:01 +11:00
Ben Edwards
eff198d079 Add integration tests 2020-11-03 20:56:02 +11:00
Ben Edwards
c3b5b88cd2 Postgres: Prevent temp relation identifiers from being too long
Related: #2197 

The current Postgres `make_temp_relation` adds a 29-character suffix to the end of the temp relation identifier (9 from the default suffix and 20 from the timestamp). This is a problem now that relation names longer than 63 characters raise exceptions.
The fix is to shorten the suffix and also trim the base_relation identifier so that the total length is always less than 63 characters (see the sketch after this entry).

An exception can also be raised if the default suffix is overridden with a value that is too long.
2020-11-03 20:56:02 +11:00
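
A minimal Python sketch of the trimming idea described in the commit above; the actual dbt change lives in the Jinja `make_temp_relation` macro, and the constant and function names below are illustrative assumptions, not the real implementation.

```python
# Illustrative sketch only: trim a base identifier so that identifier + suffix
# stays within Postgres's 63-character identifier limit.
POSTGRES_MAX_IDENTIFIER = 63

def make_temp_identifier(base_identifier: str, suffix: str = "__dbt_tmp") -> str:
    """Build a temp relation name that never exceeds the identifier limit."""
    if len(suffix) >= POSTGRES_MAX_IDENTIFIER:
        # Mirrors the commit's note: an overridden suffix that is too long is an error.
        raise ValueError(f"Temp relation suffix is too long: {suffix!r}")
    max_base = POSTGRES_MAX_IDENTIFIER - len(suffix)
    return base_identifier[:max_base] + suffix

if __name__ == "__main__":
    long_name = "a_really_long_model_name_" + "x" * 60
    temp_name = make_temp_identifier(long_name)
    print(temp_name, len(temp_name))  # trimmed name, length 63
```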
Kyle Wigley
4e19e87bbc Merge pull request #2859 from fishtown-analytics/fix/update-test-container
add unixodbc-dev to testing docker image
2020-10-30 09:56:39 -04:00
Kyle Wigley
6be6f6585d update changelog 2020-10-29 16:52:09 -04:00
Kyle Wigley
d7579f0c99 add g++ and unixodbc-dev 2020-10-29 16:22:46 -04:00
Fran Lozano
b741679c9c Add missing key to child map in expected_bigquery_complex_manifest 2020-10-29 17:25:18 +01:00
Fran Lozano
852990e967 Fix child_map in tests 2020-10-28 22:18:32 +01:00
Fran Lozano
21fd75b500 Fix parent_map object in tests 2020-10-28 19:59:36 +01:00
Fran Lozano
3e5d9010a3 Add snapshot to additional Redshift and Bigquery manifest tests 2020-10-28 19:39:04 +01:00
Fran Lozano
784616ec29 Add relation name to source object in manifest 2020-10-28 18:58:25 +01:00
Fran Lozano
6251d19946 Use is_ephemeral_model property instead of config.materialized
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-10-28 09:49:44 +01:00
dependabot[bot]
17b1332a2a Bump cryptography from 2.9.2 to 3.2 in /docker/requirements
Bumps [cryptography](https://github.com/pyca/cryptography) from 2.9.2 to 3.2.
- [Release notes](https://github.com/pyca/cryptography/releases)
- [Changelog](https://github.com/pyca/cryptography/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/2.9.2...3.2)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-27 22:22:14 +00:00
Jeremy Cohen
74eec3bdbe Merge pull request #2855 from brangisom/brangisom/spectrum-filter-fix
Fix the filtering for external tables in the Redshift get_columns_in_relation macro
2020-10-27 15:03:59 -04:00
Fran Lozano
a9901c4ea7 Disable snapshot documentation testing for Redshift and Bigquery 2020-10-27 19:27:54 +01:00
Brandon Isom
348a2f91ee Move a CHANGELOG entry 2020-10-27 11:17:13 -07:00
Fran Lozano
7115d862ea Modify snapshot path for docs generation tests 2020-10-27 18:59:32 +01:00
Fran Lozano
52ed4aa631 Fix tests which are missing snapshot nodes 2020-10-27 18:45:00 +01:00
Fran Lozano
92cedf8931 Fix Flake8 style issue 2020-10-27 17:39:44 +01:00
Fran Lozano
e1097f11b5 Define relation_name only for non-ephemeral models, seeds and snapshots 2020-10-27 17:23:23 +01:00
Brandon Isom
eb34c0e46b Add stuff to changelog per checklist 2020-10-26 20:17:03 -07:00
Brandon Isom
ee2181b371 Merge branch 'brangisom/spectrum-filter-fix' of github.com:brangisom/dbt into brangisom/spectrum-filter-fix 2020-10-26 19:44:45 -07:00
Brandon Isom
2a5d090e91 Pushes the table_schema = '{{ relation.schema }}' filter into the svv_external_columns CTE 2020-10-26 19:38:33 -07:00
Brandon Isom
857bebe819 Pushes the table_schema = '{{ relation.schema }}' clause down into the svv_external_columns CTE. 2020-10-26 18:29:47 -07:00
Jeremy Cohen
9728152768 Merge pull request #2851 from hochoy/add-python-regex
Follow up: Support for python "re" module for doing regex in jinja templates
2020-10-26 16:27:17 -04:00
Wai Ho Choy
2566a85429 edit CHANGELOG.md 2020-10-26 12:21:30 -07:00
Wai Ho Choy
46b3130198 lint 2020-10-25 21:18:23 -07:00
Wai Ho Choy
8664516c8d fix blank line linting 2020-10-25 12:03:10 -07:00
Wai Ho Choy
0733c246ea add all exports from python re module 2020-10-25 11:31:33 -07:00
Fran Lozano
4203985e3e Adapt expected_seeded_manifest method to Snowflake identifier quoting 2020-10-25 18:50:52 +01:00
Fran Lozano
900298bce7 Fix database name in relation_name in expected_run_results 2020-10-25 18:06:18 +01:00
Fran Lozano
09c37f508e Adapt relation_name to expected_run_results parameters 2020-10-25 17:27:46 +01:00
Fran Lozano
c9e01bcc81 Fix quotes in relation name for Bigquery docs generate tests 2020-10-25 16:31:00 +01:00
Fran Lozano
b079545e0f Adapt relation_name for Bigquery and Snowflake in docs generation tests 2020-10-25 15:58:04 +01:00
Fran Lozano
c3bf0f8cbf Add relation_name to missing tests in test_docs_generate 2020-10-25 14:11:33 +01:00
Jeremy Cohen
e945bca1d9 Merge pull request #2596 from ran-eh/re-partition-metadata
Make partition metadata available to BigQuery users
2020-10-22 20:58:17 -04:00
Jeremy Cohen
bf5835de5e Merge branch 'dev/kiyoshi-kuromiya' into re-partition-metadata 2020-10-22 20:18:31 -04:00
Ran Ever-Hadani
7503f0cb10 merge from dev/kiyoshi-kuromiya 2020-10-22 16:23:02 -07:00
Ran Ever-Hadani
3a751bcf9b Update CHANGELOG.md 2020-10-22 15:53:25 -07:00
Jeremy Cohen
c31ba101d6 Add tests for get_partitions_metadata (#1)
* Add tests using get_partitions_metadata

* Readd asterisk to raw_execute
2020-10-21 16:00:10 -07:00
Jeremy Cohen
ecadc74d44 Merge pull request #2841 from feluelle/dev/kiyoshi-kuromiya
Respect --project-dir in dbt clean command
2020-10-21 16:01:11 -04:00
Jeremy Cohen
63d25aaf19 Update changelog to account for v0.19.0-b1 release 2020-10-21 09:47:36 -04:00
feluelle
5af82c3c05 Add test that checks if targets were successfully deleted 2020-10-21 10:27:39 +02:00
feluelle
8b4d74ed17 Add changelog entry for resolving issue #2840 2020-10-21 10:27:39 +02:00
feluelle
6a6a9064d5 Respect --project-dir in dbt clean command 2020-10-21 10:25:17 +02:00
Github Build Bot
b188a9488a Merge remote-tracking branch 'origin/releases/0.19.0b1' into dev/kiyoshi-kuromiya 2020-10-21 00:46:30 +00:00
Github Build Bot
7c2635f65d Release dbt v0.19.0b1 2020-10-21 00:35:44 +00:00
Jeremy Cohen
c67d0a0e1a Readd bumpversion config header 2020-10-20 18:01:04 -04:00
Fran Lozano
7ee78e89c9 Add missing relation_name fields in doc generation test manifests 2020-10-20 19:18:38 +02:00
Fran Lozano
40370e104f Fix wrong schema name in test and add missing relation_name in node 2020-10-20 18:48:59 +02:00
Fran Lozano
a8809baa6c Merge branch 'dev/kiyoshi-kuromiya' into feature/2647-relation-name-in-metadata 2020-10-20 18:32:53 +02:00
Fran Lozano
244d5d2c3b Merge remote-tracking branch 'upstream/dev/kiyoshi-kuromiya' into dev/kiyoshi-kuromiya 2020-10-20 18:26:28 +02:00
Fran Lozano
a0370a6617 Add relation_name to node object in docs generation tests 2020-10-20 18:22:32 +02:00
Jeremy Cohen
eb077fcc75 Merge pull request #2845 from fishtown-analytics/docs/0.19.0-b1
dbt-docs changes for dbt v0.19.0-b1
2020-10-20 09:57:11 -04:00
Jeremy Cohen
c5adc50eed Make flake8 happy 2020-10-19 18:42:23 -04:00
Jeremy Cohen
6e71b6fd31 Include dbt-docs changes for v0.19.0-b1 2020-10-19 18:41:31 -04:00
Gerda Shank
278382589d Merge pull request #2834 from fishtown-analytics/feature/remove_injected_sql
Remove injected_sql. Store non-ephemeral injected_sql in compiled_sql
2020-10-19 18:08:41 -04:00
Gerda Shank
6f0f6cf21a Merge branch 'dev/0.18.1' into dev/kiyoshi-kuromiya 2020-10-19 11:30:52 -04:00
Fran Lozano
01331ed311 Update CHANGELOG.md 2020-10-16 19:08:30 +02:00
Fran Lozano
f638a3d50c Store relation name in manifest's node object 2020-10-16 18:38:22 +02:00
Gerda Shank
512c41dbaf Remove injected_sql. Store non-ephemeral injected_sql in compiled_sql 2020-10-15 11:52:03 -04:00
Github Build Bot
f6bab4adcf Release dbt v0.18.1 2020-10-13 21:31:54 +00:00
Jeremy Cohen
526ecee3da Merge pull request #2832 from fishtown-analytics/fix/colorama-upper-044
Set colorama upper bound to <0.4.4
2020-10-13 17:20:05 -04:00
Jeremy Cohen
1bc9815d53 Set colorama upper bound to <0.4.4 2020-10-13 16:26:10 -04:00
Ran Ever-Hadani
78bd7c9465 Eliminate asterisk from raw_execute to try and fix integration error 2020-10-11 12:06:56 -07:00
Ran Ever-Hadani
d74df8692b Eliminate pep8 errors 2020-10-11 11:37:51 -07:00
Ran Ever-Hadani
eda86412cc Accommodate first round of comments 2020-10-11 11:03:53 -07:00
Ran Ever-Hadani
cce5945fd2 Make partition metadata available to BigQuery users (rebased to dev/kiyoshi-kuromiya) 2020-10-10 17:44:07 -07:00
Drew Banin
72038258ed Merge pull request #2805 from fishtown-analytics/feature/bigquery-oauth-token
Support BigQuery OAuth using a refresh token and client secrets
2020-10-09 14:55:00 -04:00
Drew Banin
056d8fa9ad Merge branch 'dev/kiyoshi-kuromiya' into feature/bigquery-oauth-token 2020-10-09 14:07:27 -04:00
Gerda Shank
3888e0066f Merge pull request #2813 from fishtown-analytics/feature/2510-save-args-run_results
Save args in run_results.json
2020-10-09 13:51:14 -04:00
Drew Banin
ee6571d050 Merge branch 'dev/kiyoshi-kuromiya' into feature/bigquery-oauth-token 2020-10-09 10:15:23 -04:00
Gerda Shank
9472288304 Save args in run_results.json 2020-10-08 17:39:03 -04:00
Jeremy Cohen
fd5e10cfdf Merge pull request #2817 from zmac12/feature/addDebugQueryMethod
Feature/add debug query method
2020-10-08 10:30:31 -04:00
Jeremy Cohen
aeae18ec37 Merge pull request #2821 from joelluijmes/feature/hard-delete-revival
Re-instate hard-deleted records during snapshot
2020-10-08 10:27:42 -04:00
Zach McQuiston
03d3943e99 fixing linter problem in connections.py 2020-10-08 07:47:16 -06:00
Zach McQuiston
214d137672 adding entry to changelog 2020-10-08 07:45:35 -06:00
Joël Luijmes
83db275ddf Added changelog entry 2020-10-08 15:26:51 +02:00
Joël Luijmes
b8f16d081a Remove redundant 'dbt_valid_to is null' checks 2020-10-08 15:24:04 +02:00
Joël Luijmes
675b01ed48 Re-snapshot records that were invalidated through hard-delete 2020-10-08 12:45:35 +02:00
Joël Luijmes
b20224a096 Refactor test for hard-delete snapshots 2020-10-08 12:45:35 +02:00
Zach McQuiston
fd6edfccc4 Removing accidental whitespace addition 2020-10-07 19:30:48 -06:00
Zach McQuiston
4c58438e8a removing errant reference to debug_query method 2020-10-07 19:29:49 -06:00
Zach McQuiston
5ff383a025 Adding type hint for debug_query 2020-10-07 19:25:00 -06:00
Zach McQuiston
dcb6854683 adding debug_query to base/impl.py enabling plugin authors to write their own debug_query 2020-10-07 19:20:16 -06:00
Drew Banin
e4644bfe5a support providing a token directly; update method name 2020-10-07 15:34:57 -04:00
Jeremy Cohen
93168fef87 Merge pull request #2809 from mescanne/bigquery_invocation_id
Add invocation_id to BigQuery jobs
2020-10-07 10:41:39 -04:00
Jeremy Cohen
9832822bdf Merge branch 'dev/kiyoshi-kuromiya' into bigquery_invocation_id 2020-10-07 09:45:21 -04:00
Mark Scannell
5d91aa3bcd Updated feature request resolution. 2020-10-07 10:55:35 +01:00
Zach McQuiston
354ab5229b adding new debug_query function to base debug task 2020-10-03 16:20:52 -06:00
Zach McQuiston
00de0cd4b5 adding new debug_query function to base adapter 2020-10-03 16:19:48 -06:00
Mark Scannell
26210216da Only set invocation_id if tracking is enabled 2020-10-03 15:32:53 +01:00
Mark Scannell
e29c14a22b updated changelog 2020-10-03 13:09:04 +01:00
Mark Scannell
a6990c8fb8 fix up 2020-10-03 13:04:56 +01:00
Mark Scannell
3e40e71b96 Added dbt_invocation_id to BigQuery jobs 2020-10-03 13:01:25 +01:00
Gerda Shank
3f45abe331 Merge pull request #2799 from fishtown-analytics/feature/2765-save-manifest
write manifest when writing run_results
2020-10-01 16:12:15 -04:00
Gerda Shank
6777c62789 Merge pull request #2804 from fishtown-analytics/rpc-test-timeouts
Increase rpc test timeouts to avoid local test failures
2020-10-01 16:10:59 -04:00
Github Build Bot
1aac869738 Merge remote-tracking branch 'origin/releases/0.18.1rc1' into dev/0.18.1 2020-10-01 16:52:51 +00:00
Github Build Bot
493554ea30 Release dbt v0.18.1rc1 2020-10-01 16:39:50 +00:00
Drew Banin
1cf87c639b (#2344) Support BigQuery OAuth using a refresh token and client secrets 2020-09-30 23:04:40 -04:00
Gerda Shank
2cb3d92163 Increase rpc test timeouts to avoid local test failures 2020-09-30 17:34:23 -04:00
Jeremy Cohen
89b6e52a73 Merge pull request #2791 from kingfink/tf/fix-snapshot-compilation-error
Fix snapshot compilation error
2020-09-30 16:54:45 -04:00
Tim Finkel
97407c10ff revert dockerfile 2020-09-30 15:54:50 -04:00
Tim Finkel
81222dadbc Merge branch 'tf/fix-snapshot-compilation-error' of https://github.com/kingfink/dbt into tf/fix-snapshot-compilation-error 2020-09-30 15:53:40 -04:00
Tim Finkel
400555c391 update tests 2020-09-30 15:52:12 -04:00
Tim Finkel
9125b05809 Update CHANGELOG.md 2020-09-30 14:46:45 -04:00
Jeremy Cohen
139b353a28 Merge pull request #2796 from Foxtel-DnA/feature/6434-bq-retry-rate-limit
UPDATE _is_retryable() to handle BQ rateLimitExceeded
2020-09-30 13:09:11 -04:00
Jared Champion (SYD)
fc474a07d0 REBASED on dev/0.18.1; moved CHANGELOG entries 2020-09-30 10:57:03 +10:00
championj-foxtel
8fd8fa09a5 Merge pull request #1 from fishtown-analytics/dev/0.18.1
Dev/0.18.1
2020-09-30 09:56:27 +10:00
Gerda Shank
41ae831d0e write manifest when writing run_results 2020-09-29 16:35:44 -04:00
Gerda Shank
dbca540d70 Merge pull request #2781 from fishtown-analytics/feature/2700-improve-yaml-selector-errors
Add better error messages for yaml selectors
2020-09-29 16:16:44 -04:00
Tim Finkel
dc7eca4bf9 update changelog 2020-09-25 17:07:05 -04:00
Tim Finkel
fb07149cb7 fix snapshot compilation error 2020-09-25 16:52:14 -04:00
Github Build Bot
b2bd5a5548 Merge remote-tracking branch 'origin/releases/0.18.1b3' into dev/0.18.1 2020-09-25 20:20:21 +00:00
Github Build Bot
aa6b333e79 Release dbt v0.18.1b3 2020-09-25 20:05:31 +00:00
Jeremy Cohen
0cb9740535 Merge pull request #2789 from fishtown-analytics/fix/require-keyring
Fix: require keyring on snowflake
2020-09-25 15:00:12 -04:00
Gerda Shank
46eadd54e5 Add better error messages for yaml selectors 2020-09-25 14:53:30 -04:00
Jeremy Cohen
6b032b49fe Merge branch 'dev/0.18.1' into fix/require-keyring 2020-09-25 14:13:53 -04:00
Jeremy Cohen
35f78ee0f9 Merge pull request #2754 from aiguofer/include_external_tables_in_get_columns_in_relation
Include external tables in get_columns_in_relation redshift adapter
2020-09-25 13:25:06 -04:00
Jeremy Cohen
5ec36df7f0 Merge branch 'dev/0.18.1' into include_external_tables_in_get_columns_in_relation 2020-09-25 12:52:39 -04:00
Jeremy Cohen
f918fd65b6 Merge pull request #2766 from jweibel22/fix/redshift-iam-concurrency-issue
Give each redshift client their own boto session
2020-09-25 12:50:20 -04:00
Jeremy Cohen
d08a39483d PR feedback 2020-09-25 12:11:49 -04:00
Jeremy Cohen
9191f4ff2d Merge branch 'dev/0.18.1' into fix/redshift-iam-concurrency-issue 2020-09-25 12:10:33 -04:00
Jeremy Cohen
19232f554f Merge pull request #2785 from fishtown-analytics/feature/metadata-env-vars
add env vars with a magic prefix to the metadata
2020-09-24 17:03:30 -04:00
Jeremy Cohen
b4a83414ac Require optional dep (keyring) on snowflake 2020-09-24 15:49:35 -04:00
Jeremy Cohen
cb0e62576d Merge pull request #2732 from Mr-Nobody99/feature/add-snowflake-last-modified
Added last_altered query to Snowflake catalog macro
2020-09-24 15:29:04 -04:00
Alexander Kutz
e3f557406f Updated test/integration/029_docs_generate_test.py to reflect new stat 2020-09-23 11:08:08 -05:00
Jacob Beck
676af831c0 add env vars with a magic prefix to the metadata 2020-09-23 10:04:06 -06:00
Jacob Beck
873d76d72c Merge pull request #2786 from fishtown-analytics/feature/invocation-id
add invocation_id to artifact metadata
2020-09-23 10:03:42 -06:00
Jacob Beck
8ee490b881 Merge pull request #2749 from joelluijmes/snapshot-hard-deletes-joell
Include hard-deletes when making snapshot
2020-09-23 08:31:48 -06:00
Jacob Beck
ff31b277f6 add invocation_id to artifact metadata 2020-09-23 08:07:20 -06:00
Jacob Beck
120eb5b502 Merge pull request #2778 from fishtown-analytics/feature/common-artifact-metadata
Feature: common artifact metadata
2020-09-23 08:06:45 -06:00
Alexander Kutz
a93e288d6a Added missing comma above addition. 2020-09-22 17:48:48 -05:00
Alexander Kutz
8cf9311ced Changed 'BASE_TABLE' to 'BASE TABLE' 2020-09-22 16:04:30 -05:00
Alexander Kutz
713e781473 Reset branch against dev/0.8.1 and re-added changes.
updated changelog.md
2020-09-22 14:54:56 -05:00
Jacob Beck
a32295e74a fix schema collection script 2020-09-22 13:48:31 -06:00
Jacob Beck
204b02de3e fix freshness RPC response behavior 2020-09-22 12:49:17 -06:00
Jacob Beck
8379edce99 Add a common metadata field to JSON artifacts
Adjusted how schema versions are set
RPC calls no longer have the schema version in their replies
2020-09-22 10:35:02 -06:00
Github Build Bot
e265ab67c7 Merge remote-tracking branch 'origin/releases/0.18.1b2' into dev/0.18.1 2020-09-22 14:23:45 +00:00
Github Build Bot
fde1f13b4e Release dbt v0.18.1b2 2020-09-22 14:09:51 +00:00
Jeremy Cohen
9c3839c7e2 Merge pull request #2782 from fishtown-analytics/docs/0.18.1-exposures
[revised] dbt-docs changes for v0.18.1
2020-09-22 09:51:04 -04:00
Jeremy Cohen
c0fd702cc7 Rename reports --> exposures 2020-09-22 08:44:58 -04:00
Jacob Beck
429419c4af Merge pull request #2780 from fishtown-analytics/feature/rename-results-to-exposures
reports -> exposures
2020-09-21 15:15:11 -06:00
Jacob Beck
56ae20602d reports -> exposures 2020-09-21 14:46:48 -06:00
Jacob Beck
a4b80cc2e4 Merge branch 'dev/kiyoshi-kuromiya' into snapshot-hard-deletes-joell 2020-09-21 14:11:01 -06:00
Jacob Beck
4994cc07a0 Merge pull request #2767 from fishtown-analytics/feature/schema-versions
Feature/schema versions
2020-09-21 14:10:28 -06:00
Joël Luijmes
e96cf02561 Merge remote-tracking branch 'upstream/dev/kiyoshi-kuromiya' into snapshot-hard-deletes-joell 2020-09-21 21:43:49 +02:00
Jacob Beck
764c9b2986 PR feedback 2020-09-21 11:56:07 -06:00
jweibel22
40c6499d3a Update CHANGELOG.md
Co-authored-by: Jacob Beck <beckjake@users.noreply.github.com>
2020-09-20 13:31:43 +02:00
Jimmy Rasmussen
3a78efd83c Add test cases to ensure default boto session is not used 2020-09-20 13:31:15 +02:00
Jimmy Rasmussen
eb33cf75e3 Add entry to CHANGELOG 2020-09-18 10:44:00 +02:00
Jimmy Rasmussen
863d8e6405 Give each redshift client their own boto session
Since the boto session is not thread-safe, using the default session in a multi-threaded scenario will result in concurrency errors (see the sketch after this entry).
2020-09-18 10:29:26 +02:00
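
A minimal sketch of the per-client session idea from the commit above, assuming `boto3` is installed; the function name is hypothetical and this is not the actual dbt-redshift code.

```python
import boto3

def get_redshift_client():
    # boto3.client(...) would reuse the shared, module-level default session,
    # which is not thread-safe. A dedicated Session per caller keeps each
    # thread's credentials and HTTP state isolated.
    session = boto3.session.Session()
    return session.client("redshift")
```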
Jacob Beck
1fc5a45b9e Merge pull request #2768 from fishtown-analytics/fix/makefile-on-macos
fix the new makefile on macos
2020-09-17 13:42:32 -06:00
Github Build Bot
7751fece35 Merge remote-tracking branch 'origin/releases/0.18.1b1' into dev/0.18.1 2020-09-17 19:09:23 +00:00
Github Build Bot
7670c42462 Release dbt v0.18.1b1 2020-09-17 18:54:44 +00:00
Jacob Beck
b72fc3cd25 fix the new makefile on macos 2020-09-17 11:47:06 -06:00
Jacob Beck
4cc1a4f74c changelog 2020-09-17 10:04:28 -06:00
Jacob Beck
540607086c fix docker message 2020-09-17 10:03:32 -06:00
Jacob Beck
7d929e98af Embed schema/dbt versions into the json schema for artifacts 2020-09-17 10:03:32 -06:00
Joël Luijmes
0086097639 Fix non-deterministic behavior by sorting on id (redshift test failed) 2020-09-17 11:55:23 +02:00
Jacob Beck
daff0badc8 Merge branch 'dev/0.18.1' into dev/kiyoshi-kuromiya 2020-09-16 14:16:33 -06:00
Jacob Beck
22c4d8fabe Merge pull request #2741 from heisencoder/fix/docker-testing-on-linux
Fix docker-based testing for Linux users
2020-09-16 13:39:02 -06:00
Jeremy Cohen
3485482460 Merge pull request #2760 from fishtown-analytics/docs/0.18.1-reports
dbt-docs changes for v0.18.1
2020-09-16 15:21:54 -04:00
Jacob Beck
c43873379c Merge pull request #2758 from fishtown-analytics/fix/version-bump
Bump version: 0.18.0 → 0.19.0a1
2020-09-16 13:17:58 -06:00
Jeremy Cohen
ea5e5df5a3 Support reports in dbt-docs 2020-09-16 13:44:17 -04:00
Jacob Beck
f2caf2f1ff Merge pull request #2752 from fishtown-analytics/feature/reports
Feature: reports
2020-09-16 10:41:43 -06:00
Jacob Beck
07d4020fca Bump version: 0.18.0 → 0.19.0a1 2020-09-16 10:34:53 -06:00
Jacob Beck
2142e529ff PR feedback: make report selector more like source selector, remove reports from fqn selector
Make some corresponding fqn adjustments
Add dbt ls report output
Fix dbt ls source output
The default selector now also returns reports
Update tests
2020-09-16 07:23:36 -06:00
Joël Luijmes
b9d502e2e6 Ensure dbt_valid_to is latest column 2020-09-16 09:11:42 +02:00
Joël Luijmes
8c80862c10 Snapshot hard-delete tests also for bigquery, snowflake and redshift 2020-09-16 08:41:45 +02:00
Joël Luijmes
2356c7b63d Use dict.get for optional parameter invalidate_hard_deletes 2020-09-16 07:51:01 +02:00
Diego Fernandez
9c24fc25f5 Add entry to CHANGELOG 2020-09-15 15:11:05 -06:00
Diego Fernandez
4f1a6d56c1 Include external tables in get_columns_in_relation redshift adapter 2020-09-15 15:09:55 -06:00
Joël Luijmes
b71b7e209e Update changelog 2020-09-15 22:01:02 +02:00
Joël Luijmes
2581e98aff Snapshot hard-delete opt-in during config 2020-09-15 16:47:16 +02:00
Joël Luijmes
afc7136bae Fix rpc integration snapshot tests 2020-09-15 16:47:11 +02:00
Joël Luijmes
e489170558 Add test for snapshotting hard deleted records 2020-09-15 16:45:36 +02:00
Joël Luijmes
50106f2bd3 Include hard-deletes when making snapshot
It sets dbt_valid_to to the current snapshot time.
2020-09-15 16:45:36 +02:00
Jacob Beck
e96f4a5be6 got redshifted 2020-09-14 14:35:00 -06:00
Jacob Beck
4768ac5fda Fix and add new tests, update changelog 2020-09-14 14:35:00 -06:00
Jacob Beck
c91fcc527a add comparison/selector logic for reports 2020-09-14 12:24:00 -06:00
Jacob Beck
8520ff35b3 Add reports feature
Add ParsedReport/UnparsedReport
add report parser and report node logic to manifest/results/dbt ls/selectors
NonSourceNode -> ManifestNode
add GraphMemberNode type that includes reports
2020-09-14 11:36:31 -06:00
Jacob Beck
9b8a98f4ec Test quality of life/cleanup
remove unused test folders
move rpc tests from 048 to 100 for convenience
 - Migrating these to the test/rpc model is going to take work. In the
   interim, developers can now use `tests/integration/0*` to run all non-rpc
   tests.
2020-09-14 10:26:02 -06:00
Jacob Beck
4bd4afaec7 bumpversion for 0.18.1 2020-09-11 14:21:03 -06:00
Matt Ball
69352d8414 Fix docker-based testing for Linux users
See https://github.com/fishtown-analytics/dbt/issues/2739

This change enables Linux users to run the dbt tests via the docker
image. It follows the recommendations from this article:
https://jtreminio.com/blog/running-docker-containers-as-current-host-user/

Notable changes:
*  Added a new Makefile rule to generate a .env file that sets the USER_ID
and GROUP_ID environment variables to the current user's IDs. This file
is in turn used by docker-compose and the Dockerfile to make the Docker
image run as the current user (see the sketch after this entry). Note that
on Windows or Mac, this behavior happens by default.
*  Reordered Dockerfile to allow for better caching of intermediate
images (i.e., put things that don't depend on ARGS earlier).
*  Bumped CircleCI's Dockerfile from version 7 to 9.  Jake rebuilt
9 off of this PR.
2020-09-11 11:15:18 -06:00
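
For illustration only, a Python equivalent of the .env generation described in the commit above; the real change is a Makefile rule, so the script below is an assumption sketching the mechanism rather than the actual code.

```python
import os

def write_env_file(path: str = ".env") -> None:
    # Record the invoking user's UID and GID so docker-compose can pass them
    # through to the image and the container can run as that user (Linux only).
    with open(path, "w") as f:
        f.write(f"USER_ID={os.getuid()}\n")
        f.write(f"GROUP_ID={os.getgid()}\n")

if __name__ == "__main__":
    write_env_file()
```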
Jeremy Cohen
4a21ea6575 Merge pull request #2723 from tpilewicz/fix/freshness-logs
Feat(result logs): Use three logging levels
2020-09-09 15:03:06 -04:00
Thomas Pilewicz
86bbb9fe38 Add contributors section to 0.18.1 2020-09-09 18:50:39 +02:00
Thomas Pilewicz
4030d4fc20 Move logging levels changelog entry to 0.18.1 2020-09-09 18:50:27 +02:00
tpilewicz
182f69a9ec Merge pull request #2 from fishtown-analytics/dev/0.18.1 2020-09-09 18:43:45 +02:00
Gerda Shank
fb40efe4b7 Merge pull request #2733 from fishtown-analytics/fix/2539-comment-quoting
When column config says quote, use quotes in SQL to add comments
2020-09-08 14:40:27 -04:00
Jacob Beck
9d00c00072 Merge pull request #2735 from fishtown-analytics/feature/include-unrendered-configs-2
Feature: include unrendered configs
2020-09-08 10:57:29 -06:00
Jacob Beck
10c3118f9c Merge branch 'dev/kiyoshi-kuromiya' into feature/include-unrendered-configs-2 2020-09-08 09:29:09 -06:00
Jacob Beck
1fa149dca2 Merge pull request #2736 from fishtown-analytics/feature/rpc-state-defer
Feature: state and defer in RPC calls
2020-09-08 09:28:02 -06:00
Jacob Beck
60f4c963b5 Merge branch 'dev/kiyoshi-kuromiya' into feature/rpc-state-defer 2020-09-08 08:41:21 -06:00
Gerda Shank
51b8e64972 When column config says quote, use quotes in SQL to add comments
Add separate test for column comments. Fix Snowflake catalog comments.
2020-09-04 16:44:19 -04:00
Jacob Beck
ae542dce74 Merge dev/marian-anderson 2020-09-04 07:12:04 -06:00
Github Build Bot
fa8a4f2020 Merge remote-tracking branch 'origin/releases/0.18.0' into dev/marian-anderson 2020-09-03 16:45:49 +00:00
Github Build Bot
481bdd56d3 Release dbt v0.18.0 2020-09-03 16:02:36 +00:00
Github Build Bot
1a9083ddb7 Merge remote-tracking branch 'origin/releases/0.18.0rc2' into dev/marian-anderson 2020-09-03 15:52:16 +00:00
Github Build Bot
9779f43620 Release dbt v0.18.0rc2 2020-09-03 15:49:09 +00:00
Jacob Beck
d31e82edfc this does not belong here 2020-09-02 10:13:41 -06:00
Jeremy Cohen
981535a1c3 Merge pull request #2734 from fishtown-analytics/docs/0.18.0-followup
dbt-docs changes for v0.18.0 (final)
2020-09-01 17:14:16 -04:00
Jacob Beck
5354e39e5f add defer/state args to RPC, add tests 2020-09-01 14:25:28 -06:00
Jacob Beck
ca9293cbfb changelog 2020-09-01 14:22:58 -06:00
Jeremy Cohen
e2fe6a8249 Add project-level overviews 2020-09-01 14:58:15 -04:00
Jeremy Cohen
a8347b7ada Add missing changelog entry 2020-09-01 14:57:47 -04:00
Jacob Beck
bcbf7c3b7b remove default values from unrendered configs 2020-09-01 10:30:03 -06:00
Jacob Beck
6a26cb280f fix the tests, add unrendered configs for sources 2020-09-01 10:30:03 -06:00
Jacob Beck
fd658ace9d Attach unrendered configs to parsed nodes 2020-09-01 10:30:03 -06:00
Jacob Beck
5e71a2aa3f Add unrendered configs to project 2020-09-01 10:30:03 -06:00
Jacob Beck
e3fb923b34 removed v1 config 2020-09-01 10:30:03 -06:00
Jeremy Cohen
f55b257609 Merge pull request #2722 from genos/genos/fix-for-2347
fix for 2347
2020-09-01 08:50:04 -04:00
genos
81bf3dae5c add contributors to changelog 2020-08-31 23:48:49 -04:00
Graham
d0074f3411 Merge branch 'dev/marian-anderson' into genos/fix-for-2347 2020-08-31 09:30:52 -04:00
Gerda Shank
2cc2d971c6 Merge pull request #2727 from fishtown-analytics/fix/2197-long-table-names
Check Postgres relation name lengths and throw error when over 63
2020-08-28 09:31:23 -04:00
Gerda Shank
5830f5590e Tweak error message, reformat for flake8 2020-08-27 16:35:35 -04:00
Jacob Beck
75facebe80 Merge pull request #2726 from fishtown-analytics/fix/require-version-validation
Validate require-dbt-version before validating dbt_project.yml schema
2020-08-27 10:01:13 -06:00
Jacob Beck
0130398e9f Update core/dbt/config/project.py
Co-authored-by: Kyle Wigley <kyle@fishtownanalytics.com>
2020-08-27 08:16:33 -06:00
Gerda Shank
22d9b86e9f update changelog for #2197 2020-08-26 14:15:01 -04:00
Gerda Shank
c87b671275 Use csv for data in test 063; tweak several lines 2020-08-26 13:57:03 -04:00
Jacob Beck
1eb5857811 add missing unit tests 2020-08-26 08:40:53 -06:00
Gerda Shank
5fc1cb39a6 Check Postgres relation name lengths and throw error when over 63 2020-08-26 10:28:37 -04:00
Jacob Beck
1f8e29276e move the require-dbt-version check to before parsing
update changelog
2020-08-25 13:17:00 -06:00
Thomas Pilewicz
cf02c7fd02 Fix(get_printable_result): return type hints 2020-08-25 16:52:21 +02:00
Thomas Pilewicz
5d93c64c0e Add logging levels to features of 0.18.0 2020-08-24 15:59:31 +02:00
Thomas Pilewicz
c738928ea3 Feat(result logs): Use three logging levels 2020-08-24 14:04:05 +02:00
genos
707310db64 update changelog 2020-08-23 17:40:36 -04:00
genos
59bf43dc1f Fix for #2347
**Introduction**

This PR attempts to fix #2347, wherein we wish `dbt` to complain about trying to install with a Python version < 3.6.

**Changes**

Per [the issue's suggestion](https://github.com/fishtown-analytics/dbt/issues/2347), I found every `setup.py` file I could:

```
# If you have the fantastic `fd` utility installed:
fd setup.py
# This also works
find . -name setup.py -print
```

Then to each of these, I added the following after the `import sys`:

```
if sys.version_info < (3, 6):
    print('Error: dbt does not support this version of Python.')
    print('Please upgrade to Python 3.6 or higher.')
    sys.exit(1)
```

**Testing**

I used the [`nix` package manager](https://nixos.org) to attempt installing this branch with both Python 2.7 and Python 3.8.

_Python 2.7_ fails as expected:

```
~/github/test2 ∃ cat default.nix
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/20.03.tar.gz") { };
  py = pkgs.python27Full.withPackages (p: [ p.setuptools ]);
in pkgs.mkShell {
  name = "python-2-env";
  buildInputs = [ py ];
}
~/github/test2 ∃ nix-shell --pure

[nix-shell:~/github/test2]$ python ../dbt/setup.py build
Error: dbt does not support this version of Python.
Please upgrade to Python 3.6 or higher.

[nix-shell:~/github/test2]$ echo $?
1
```

_Python 3.8_ still works:

```
~/github/test3 ∃ cat default.nix
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/20.03.tar.gz") { };
  py = pkgs.python38Full.withPackages (p: [ p.setuptools ]);
in pkgs.mkShell {
  name = "python-3-env";
  buildInputs = [ py ];
}
~/github/test3 ∃ nix-shell --pure

[nix-shell:~/github/test3]$ python ../dbt/setup.py build
running build

[nix-shell:~/github/test3]$ echo $?
0
```
2020-08-23 17:13:36 -04:00
Jacob Beck
fe461381a2 Merge pull request #2721 from fishtown-analytics/feature/more-test-only-adapter-methods
add more test helper methods
2020-08-21 13:13:28 -06:00
Jacob Beck
685873ab42 its ok to hit a deprecation with tracking disabled 2020-08-21 12:31:35 -06:00
Jacob Beck
b6a951903e add more test helper methods 2020-08-21 08:02:45 -06:00
Github Build Bot
acfa84918e Release dbt v0.18.0rc1 2020-08-19 20:10:33 +00:00
Jacob Beck
75304eb3be Merge pull request #2718 from fishtown-analytics/fix/only-jinja-is-comments
When deciding if we should bypass rendering, also check for comments
2020-08-19 13:39:03 -06:00
Jacob Beck
1d7eb59ff2 Merge pull request #2712 from fishtown-analytics/feature/adapter-cte-generation
Feature: adapter cte generation
2020-08-19 12:40:04 -06:00
Jacob Beck
4273cc9e29 Merge pull request #2709 from kconvey/kconvey-copy-job
Add a BigQuery adapter macro to enable usage of CopyJobs
2020-08-19 12:39:49 -06:00
Jacob Beck
29be2de5cb When deciding if we should bypass rendering, also check for comments 2020-08-19 12:27:40 -06:00
Jacob Beck
91b0496c89 Merge branch 'dev/marian-anderson' into kconvey-copy-job 2020-08-19 12:12:14 -06:00
Jacob Beck
f043f948de Merge pull request #2711 from kconvey/kconvey-model-ttl
Support TTL for BigQuery tables
2020-08-19 12:10:14 -06:00
Jeremy Cohen
7ef7a8f306 Merge pull request #2708 from rsenseman/enhancement/color_output_command_line_flag_second_attempt
Add --use-colors cli option (second attempt)
2020-08-19 13:42:24 -04:00
Robert
db8eea2468 Merge branch 'dev/marian-anderson' into enhancement/color_output_command_line_flag_second_attempt 2020-08-19 10:01:12 -07:00
Kurt Convey
55813d9209 hours_to_expiration 2020-08-19 10:08:52 -06:00
Kurt Convey
4c05daae1b Merge with origin 2020-08-19 09:41:37 -06:00
Kurt Convey
16eb7232c3 Check the copy model for failure 2020-08-19 09:18:33 -06:00
Kurt Convey
c28ffcdd9f merge with origin 2020-08-19 09:08:07 -06:00
Kurt Convey
4b8652f1c4 Should be two results for original table and (failing) copy 2020-08-19 08:58:39 -06:00
Jeremy Cohen
674bd8f264 Merge pull request #2715 from fishtown-analytics/docs/0.18.0-search-selectors
Docs site updates for 0.18.0
2020-08-19 10:41:48 -04:00
Jeremy Cohen
1ba832dbfe Docs site updates for 0.18.0 2020-08-18 18:42:32 -04:00
Jacob Beck
d3e4d3fbcb pr feedback: remove commented out code 2020-08-18 14:34:12 -06:00
Jacob Beck
58a3cb4fbd changelog update 2020-08-18 14:34:11 -06:00
Jacob Beck
8ad1551b15 when you think about it, data tests are really just ctes 2020-08-18 14:33:56 -06:00
Jacob Beck
123771163a hide more things from the context 2020-08-18 14:33:56 -06:00
Jacob Beck
f80a759488 Have the adapter be responsible for producing the compiler
The adapter's Relation is consulted for adding the ephemeral model prefix

Also hide some things from Jinja

Have the adapter be responsible for producing the compiler, move CTE generation into the Relation object
2020-08-18 14:33:56 -06:00
Kurt Convey
42f8a4715e Assert that single result has error 2020-08-18 14:11:28 -06:00
Jeremy Cohen
c29892e340 Merge pull request #2710 from fishtown-analytics/feature/add-deprecation-tracking
Track deprecation warnings
2020-08-18 16:11:02 -04:00
Kurt Convey
b4a2ed6bb5 Look for proper string 2020-08-18 14:06:53 -06:00
Kurt Convey
67e8caf045 No need to assert model success 2020-08-18 14:05:44 -06:00
Kurt Convey
2562debe31 Put --debug before run 2020-08-18 13:42:58 -06:00
Kurt Convey
87e2fd610c Raise error for bad materializations and set a default 2020-08-18 13:41:00 -06:00
Kurt Convey
af118bcc53 Check stdout with --debug for actual ddl 2020-08-18 13:23:28 -06:00
Kurt Convey
7e01172b4c Set status 2020-08-18 13:08:58 -06:00
Kurt Convey
f56ae93772 Return from copy_bq_table and fix test 2020-08-18 10:46:04 -06:00
Kurt Convey
3834805929 Use injected sql from results 2020-08-18 10:25:11 -06:00
Kurt Convey
50b6057bbf Make copy a proper materialization 2020-08-18 10:17:15 -06:00
Kurt Convey
c0199abacf Split failing and succeeding models 2020-08-18 08:34:41 -06:00
Kurt Convey
77688c74f3 Remove test from wrong PR 2020-08-17 17:20:24 -06:00
Kurt Convey
47ab7419ac Embed profile name 2020-08-17 17:08:38 -06:00
Kurt Convey
dc209f77ec Fix class name 2020-08-17 16:35:29 -06:00
Kurt Convey
76aa8c7df5 Fix class name 2020-08-17 16:32:22 -06:00
Kurt Convey
671a29ff34 plugins 2020-08-17 16:16:46 -06:00
Kurt Convey
eb35794aca Fix table options string 2020-08-17 16:05:44 -06:00
Kurt Convey
a8d6691dee Mock config better 2020-08-17 15:59:04 -06:00
Kurt Convey
955f4ae977 Add entry to CHANGELOG 2020-08-17 15:26:33 -06:00
Kurt Convey
25a869a686 Fix unit test 2020-08-17 15:23:25 -06:00
Kurt Convey
7aa8030b76 Fix test name 2020-08-17 15:12:02 -06:00
Kurt Convey
108d843bba Add newlines 2020-08-17 15:10:51 -06:00
Kurt Convey
099fea8565 Add integration test 2020-08-17 15:04:59 -06:00
Jeremy Cohen
a573a2ada1 Explicitly import dbt.tracking module 2020-08-17 17:00:07 -04:00
Kurt Convey
1468ca8ebc Expect to fail 2020-08-17 14:54:11 -06:00
Jeremy Cohen
274aea9f8f Track deprecation warnings 2020-08-17 16:36:04 -04:00
Kurt Convey
38a99a75ed Use correct models 2020-08-17 14:29:49 -06:00
Kurt Convey
6e06bd0cb4 Add unit test 2020-08-17 14:29:02 -06:00
Kurt Convey
02a793998a Fix macro compilation 2020-08-17 14:10:04 -06:00
Kurt Convey
81ab9469b7 Add time_to_expiration 2020-08-17 14:00:10 -06:00
Kurt Convey
bcc928495d Fix if statement 2020-08-17 13:13:48 -06:00
Kurt Convey
621ae7dbc9 make flake8 happy 2020-08-17 13:03:33 -06:00
Kurt Convey
85f2c03903 Tweak integration test 2020-08-17 12:26:25 -06:00
Kurt Convey
f7fd741d43 Attempt at integration tests 2020-08-17 12:16:20 -06:00
Kurt Convey
fad0d81837 Reference consts through the right module 2020-08-17 11:56:14 -06:00
Kurt Convey
09687409dc move bq context to impl.py 2020-08-17 11:44:51 -06:00
Kurt Convey
091bcd107c Fix test write dispositions 2020-08-17 09:20:22 -06:00
Kurt Convey
f9b300d63a Add changelog 2020-08-17 09:12:52 -06:00
Jacob Beck
4f2acc2c96 Merge pull request #2703 from vogt4nick/2702-patch-redshift-table-size-estimation
patch redshift table size estimation
2020-08-17 09:04:03 -06:00
Kurt Convey
6aa4a60d5c Add copy_job macro 2020-08-17 09:01:17 -06:00
rsenseman
b075bf51b0 one more changelog update 2020-08-16 14:48:08 -07:00
rsenseman
de2341ece0 update changelog 2020-08-16 14:44:55 -07:00
rsenseman
ff67e7d47c add missing code; finalize pr 2020-08-16 14:34:46 -07:00
rsenseman
31644ed39d initial commit 2020-08-16 13:48:55 -07:00
Jacob Beck
acb235ef4f Merge pull request #2698 from fishtown-analytics/fix/snowflake-connector-python-upgrade
bump requirements, enable the token cache
2020-08-14 13:47:32 -06:00
Jacob Beck
d554835b50 Merge pull request #2695 from fishtown-analytics/feature/state-modified-selector
Add state:modified and state:new selectors
2020-08-14 13:44:48 -06:00
Jacob Beck
c8453d80fc fix ls tests 2020-08-14 11:36:19 -06:00
Jacob Beck
f3f713ae65 fix macro change check to account for new list return value when macros are added/removed 2020-08-14 09:05:55 -06:00
Nick Vogt
cfa741e597 calculate Redshift table size in bytes, not megabytes 2020-08-14 11:02:10 -04:00
Jacob Beck
13fb2351ed PR feedback 2020-08-14 08:48:59 -06:00
Jacob Beck
3555ba518d fix unit tests 2020-08-12 14:07:20 -06:00
Jacob Beck
c5a19ca42e bump requirements, enable the token cache 2020-08-12 14:00:37 -06:00
Jacob Beck
153eb7e9d3 flake8 update found more things to complain about 2020-08-12 07:44:52 -06:00
Jacob Beck
7ba52d4931 mypy is rightfully mad about our funky inheritance, just copy+paste things
We can fix this when we drop python 3.6 and unions stop collapsing types
2020-08-12 07:44:52 -06:00
Jacob Beck
ebe5b46653 Add state:modified and state:new selectors 2020-08-12 07:44:51 -06:00
Jacob Beck
1bd82d4914 Merge pull request #2694 from kconvey/kconvey-retry-upstream
Add retry of additional errors
2020-08-12 07:43:50 -06:00
Jeremy Cohen
89775fa94f Merge pull request #2594 from brunomurino/issue-2533
added option --adapter to dbt init, to create sample profiles.yml bas…
2020-08-11 18:27:02 -04:00
Kurt Convey
4456872635 Didn't forget to add myself to the contributors 2020-08-11 15:23:02 -06:00
Kurt Convey
ee9ae22651 Fix variable name 2020-08-11 15:16:15 -06:00
Kurt Convey
afe0f46768 use isinstance with tuple 2020-08-11 15:15:04 -06:00
Kurt Convey
203d8c3481 Update CHANGELOG 2020-08-11 12:42:31 -06:00
Kurt Convey
c9ae49255d Add test from dev 2020-08-11 12:29:52 -06:00
Bruno Murino
51f17d3358 Update plugins/redshift/dbt/include/redshift/sample_profiles.yml
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-08-10 13:38:08 +01:00
Bruno Murino
cacdd58b41 Update plugins/redshift/dbt/include/redshift/sample_profiles.yml
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-08-10 13:38:01 +01:00
Bruno Murino
8b7bcbbc47 Update plugins/postgres/dbt/include/postgres/sample_profiles.yml
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-08-10 13:37:49 +01:00
Bruno Murino
bdf9482e75 Update plugins/postgres/dbt/include/postgres/sample_profiles.yml
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-08-10 13:37:42 +01:00
Bruno Murino
b94d0b66e6 Apply suggestions from code review
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-08-06 19:11:12 +01:00
Bruno Murino
1dff122a94 Update plugins/postgres/dbt/include/postgres/sample_profiles.yml
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-08-06 19:10:34 +01:00
Jacob Beck
fb8065df27 Merge pull request #2686 from fishtown-analytics/feature/override-core-macros
include project macros in the manifest the adapter stores locally
2020-08-06 11:35:25 -06:00
Jacob Beck
4274139210 Fix the test
Make sure "dbt deps" reloads the full manifest
Make sure commands that reload the dbt_project.yml properly reset the config (including adapters)
2020-08-05 14:50:59 -06:00
brunomurino
b422a44c03 update redshift and snowflake sample profiles 2020-08-05 19:56:29 +01:00
Jacob Beck
285479c0bc include project macros in the manifest the adapter stores locally 2020-08-05 09:35:00 -06:00
Jacob Beck
e641ec12fa Merge pull request #2684 from fishtown-analytics/feature/dispatch-schema-tests
convert tests
2020-08-05 09:27:55 -06:00
Jeremy Cohen
36fda28a92 Merge pull request #2677 from bbhoss/bq_impersonate
Add support for impersonating a service account with BigQuery
2020-08-04 14:37:53 -04:00
Jacob Beck
1ece515074 Merge pull request #2679 from fishtown-analytics/feature/adapter-dispatch
Create adapter.dispatch
2020-08-04 12:01:00 -06:00
Jacob Beck
04f840d907 convert tests 2020-08-04 09:47:48 -06:00
Preston Marshall
df8ccc04eb add contributors section 2020-08-04 10:49:35 -04:00
Jacob Beck
dd764b93e0 check if packages is a string and error out 2020-08-04 07:46:50 -06:00
Jacob Beck
335497f688 fixed error message, fixed the changelog 2020-08-03 15:48:04 -06:00
Jacob Beck
41a9251982 pr feedback: better errors for dotted macro names
add more tests
2020-08-03 12:31:29 -06:00
Preston Marshall
84bf03580d Merge remote-tracking branch 'upstream/dev/marian-anderson' into bq_impersonate 2020-08-03 13:00:13 -04:00
Preston Marshall
43d5dfcb71 move changelog additions to next 2020-08-03 12:57:30 -04:00
Preston Marshall
7a9fc7ef12 update changelog 2020-08-03 12:55:42 -04:00
Jacob Beck
ecf24cd4d9 adapter_macro -> adapter.dispatch
Added tests
Added deprecation warning + tests
2020-08-03 10:49:28 -06:00
Preston Marshall
c95a6792e5 add a test 2020-08-01 15:40:48 -04:00
Preston Marshall
88529d5c25 first try 2020-08-01 15:12:00 -04:00
brunomurino
3dabe62254 updated redshift and snowflake sample_profiles.yml 2020-07-04 14:13:51 +01:00
brunomurino
0d246ac95b updated postgres sample_profiles.yml 2020-07-02 21:16:45 +01:00
brunomurino
2d0612c972 updated bigquery sample_profiles.yml 2020-07-02 21:14:52 +01:00
brunomurino
13da3390e5 updated changelog 2020-06-26 18:58:45 +01:00
brunomurino
4164e6ee8e updated setup.py of adapters to include sample_profiles.yml 2020-06-26 18:12:18 +01:00
brunomurino
2b454f99dd updated 040_init_test for profile postgres to work with new --adapter option of dbt init 2020-06-25 23:54:33 +01:00
brunomurino
2025634417 added option --adapter to dbt init, to create sample profiles.yml based on chosen adapter 2020-06-25 23:27:08 +01:00
Drew Banin
1dd4187cd0 Merge branch '0.14.latest' 2019-09-05 14:32:23 -04:00
Connor McArthur
9e36ebdaab Merge branch '0.13.latest' of github.com:fishtown-analytics/dbt 2019-03-21 13:27:24 -04:00
Drew Banin
aaa0127354 Merge pull request #1241 from fishtown-analytics/0.12.latest
Merge 0.12.latest into master
2019-01-15 17:01:16 -05:00
Drew Banin
e60280c4d6 Merge branch '0.12.latest' 2018-11-15 12:24:05 -05:00
Drew Banin
aef7866e29 Update CHANGELOG.md 2018-11-13 10:36:35 -05:00
Drew Banin
70694e3bb9 Merge pull request #1118 from fishtown-analytics/0.12.latest
merge 0.12.latest to master
2018-11-13 10:19:56 -05:00
331 changed files with 14676 additions and 6330 deletions

View File

@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.18.0b2
current_version = 0.19.0
parse = (?P<major>\d+)
\.(?P<minor>\d+)
\.(?P<patch>\d+)

View File

@@ -2,7 +2,7 @@ version: 2.1
jobs:
unit:
docker: &test_only
- image: fishtownanalytics/test-container:7
- image: fishtownanalytics/test-container:9
environment:
DBT_INVOCATION_ENV: circle
steps:
@@ -30,7 +30,7 @@ jobs:
destination: dist
integration-postgres-py36:
docker: &test_and_postgres
- image: fishtownanalytics/test-container:7
- image: fishtownanalytics/test-container:9
environment:
DBT_INVOCATION_ENV: circle
- image: postgres
@@ -121,6 +121,45 @@ jobs:
- store_artifacts:
path: ./logs
integration-postgres-py39:
docker: *test_and_postgres
steps:
- checkout
- run: *setupdb
- run:
name: Run tests
command: tox -e integration-postgres-py39
- store_artifacts:
path: ./logs
integration-snowflake-py39:
docker: *test_only
steps:
- checkout
- run:
name: Run tests
command: tox -e integration-snowflake-py39
no_output_timeout: 1h
- store_artifacts:
path: ./logs
integration-redshift-py39:
docker: *test_only
steps:
- checkout
- run:
name: Run tests
command: tox -e integration-redshift-py39
- store_artifacts:
path: ./logs
integration-bigquery-py39:
docker: *test_only
steps:
- checkout
- run:
name: Run tests
command: tox -e integration-bigquery-py39
- store_artifacts:
path: ./logs
workflows:
version: 2
test-everything:
@@ -150,6 +189,18 @@ workflows:
- integration-snowflake-py38:
requires:
- integration-postgres-py38
- integration-postgres-py39:
requires:
- unit
- integration-redshift-py39:
requires:
- integration-postgres-py39
- integration-bigquery-py39:
requires:
- integration-postgres-py39
# - integration-snowflake-py39:
# requires:
# - integration-postgres-py39
- build-wheels:
requires:
- unit
@@ -161,3 +212,7 @@ workflows:
- integration-redshift-py38
- integration-bigquery-py38
- integration-snowflake-py38
- integration-postgres-py39
- integration-redshift-py39
- integration-bigquery-py39
# - integration-snowflake-py39

.gitignore
View File

@@ -8,7 +8,8 @@ __pycache__/
# Distribution / packaging
.Python
env/
env*/
dbt_env/
build/
develop-eggs/
dist/
@@ -42,6 +43,7 @@ htmlcov/
.coverage
.coverage.*
.cache
.env
nosetests.xml
coverage.xml
*,cover
@@ -83,3 +85,11 @@ target/
# pycharm
.idea/
# AWS credentials
.aws/
.DS_Store
# vscode
.vscode/

View File

@@ -1,7 +1,250 @@
## dbt 0.18.0 (Release TBD)
## dbt 0.20.0 (Release TBD)
### Fixes
- Fix exit code from dbt debug not returning a failure when one of the tests fail ([#3017](https://github.com/fishtown-analytics/dbt/issues/3017))
- Auto-generated CTEs in tests and ephemeral models have lowercase names to comply with dbt coding conventions ([#3027](https://github.com/fishtown-analytics/dbt/issues/3027), [#3028](https://github.com/fishtown-analytics/dbt/issues/3028))
### Features
- Add optional configs for `require_partition_filter` and `partition_expiration_days` in BigQuery ([#1843](https://github.com/fishtown-analytics/dbt/issues/1843), [#2928](https://github.com/fishtown-analytics/dbt/pull/2928))
- Fix for EOL SQL comments preventing entire line execution ([#2731](https://github.com/fishtown-analytics/dbt/issues/2731), [#2974](https://github.com/fishtown-analytics/dbt/pull/2974))
Contributors:
- [@yu-iskw](https://github.com/yu-iskw) ([#2928](https://github.com/fishtown-analytics/dbt/pull/2928))
- [@sdebruyn](https://github.com/sdebruyn) / [@lynxcare](https://github.com/lynxcare) ([#3018](https://github.com/fishtown-analytics/dbt/pull/3018))
- [@rvacaru](https://github.com/rvacaru) ([#2974](https://github.com/fishtown-analytics/dbt/pull/2974))
- [@NiallRees](https://github.com/NiallRees) ([#3028](https://github.com/fishtown-analytics/dbt/pull/3028))
## dbt 0.19.1 (Release TBD)
### Under the hood
- Bump werkzeug upper bound dependency to `<v2.0` ([#3011](https://github.com/fishtown-analytics/dbt/pull/3011))
Contributors:
- [@Bl3f](https://github.com/Bl3f) ([#3011](https://github.com/fishtown-analytics/dbt/pull/3011))
## dbt 0.19.0 (January 27, 2021)
## dbt 0.19.0rc3 (January 27, 2021)
### Under the hood
- Cleanup docker resources, use single `docker/Dockerfile` for publishing dbt as a docker image ([dbt-release#3](https://github.com/fishtown-analytics/dbt-release/issues/3), [#3019](https://github.com/fishtown-analytics/dbt/pull/3019))
## dbt 0.19.0rc2 (January 14, 2021)
### Fixes
- Fix regression with defining exposures and other resources with the same name ([#2969](https://github.com/fishtown-analytics/dbt/issues/2969), [#3009](https://github.com/fishtown-analytics/dbt/pull/3009))
- Remove ellipses printed while parsing ([#2971](https://github.com/fishtown-analytics/dbt/issues/2971), [#2996](https://github.com/fishtown-analytics/dbt/pull/2996))
### Under the hood
- Rewrite macro for snapshot_merge_sql to make compatible with other SQL dialects ([#3003](https://github.com/fishtown-analytics/dbt/pull/3003)
- Rewrite logic in `snapshot_check_strategy()` to make compatible with other SQL dialects ([#3000](https://github.com/fishtown-analytics/dbt/pull/3000), [#3001](https://github.com/fishtown-analytics/dbt/pull/3001))
- Remove version restrictions on `botocore` ([#3006](https://github.com/fishtown-analytics/dbt/pull/3006))
- Include `exposures` in start-of-invocation stdout summary: `Found ...` ([#3007](https://github.com/fishtown-analytics/dbt/pull/3007), [#3008](https://github.com/fishtown-analytics/dbt/pull/3008))
Contributors:
- [@mikaelene](https://github.com/mikaelene) ([#3003](https://github.com/fishtown-analytics/dbt/pull/3003))
- [@dbeatty10](https://github.com/dbeatty10) ([dbt-adapter-tests#10](https://github.com/fishtown-analytics/dbt-adapter-tests/pull/10))
- [@swanderz](https://github.com/swanderz) ([#3000](https://github.com/fishtown-analytics/dbt/pull/3000))
- [@stpierre](https://github.com/stpierre) ([#3006](https://github.com/fishtown-analytics/dbt/pull/3006))
## dbt 0.19.0rc1 (December 29, 2020)
### Breaking changes
- Defer if and only if upstream reference does not exist in current environment namespace ([#2909](https://github.com/fishtown-analytics/dbt/issues/2909), [#2946](https://github.com/fishtown-analytics/dbt/pull/2946))
- Rationalize run result status reporting and clean up artifact schema ([#2493](https://github.com/fishtown-analytics/dbt/issues/2493), [#2943](https://github.com/fishtown-analytics/dbt/pull/2943))
- Add adapter specific query execution info to run results and source freshness results artifacts. Statement call blocks return `response` instead of `status`, and the adapter method `get_status` is now `get_response` ([#2747](https://github.com/fishtown-analytics/dbt/issues/2747), [#2961](https://github.com/fishtown-analytics/dbt/pull/2961))
### Features
- Added macro `get_partitions_metadata(table)` to return partition metadata for BigQuery partitioned tables ([#2552](https://github.com/fishtown-analytics/dbt/pull/2552), [#2596](https://github.com/fishtown-analytics/dbt/pull/2596))
- Added `--defer` flag for `dbt test` as well ([#2701](https://github.com/fishtown-analytics/dbt/issues/2701), [#2954](https://github.com/fishtown-analytics/dbt/pull/2954))
- Added native python `re` module for regex in jinja templates ([#1755](https://github.com/fishtown-analytics/dbt/issues/1755), [#2851](https://github.com/fishtown-analytics/dbt/pull/2851))
- Store resolved node names in manifest ([#2647](https://github.com/fishtown-analytics/dbt/issues/2647), [#2837](https://github.com/fishtown-analytics/dbt/pull/2837))
- Save selectors dictionary to manifest, allow descriptions ([#2693](https://github.com/fishtown-analytics/dbt/issues/2693), [#2866](https://github.com/fishtown-analytics/dbt/pull/2866))
- Normalize cli-style-strings in manifest selectors dictionary ([#2879](https://github.com/fishtown-analytics/dbt/issues/2879), [#2895](https://github.com/fishtown-analytics/dbt/pull/2895))
- Hourly, monthly and yearly partitions available in BigQuery ([#2476](https://github.com/fishtown-analytics/dbt/issues/2476), [#2903](https://github.com/fishtown-analytics/dbt/pull/2903))
- Allow BigQuery to default to the environment's default project ([#2828](https://github.com/fishtown-analytics/dbt/pull/2828), [#2908](https://github.com/fishtown-analytics/dbt/pull/2908))
- Rationalize run result status reporting and clean up artifact schema ([#2493](https://github.com/fishtown-analytics/dbt/issues/2493), [#2943](https://github.com/fishtown-analytics/dbt/pull/2943))
### Fixes
- Respect `--project-dir` in `dbt clean` command ([#2840](https://github.com/fishtown-analytics/dbt/issues/2840), [#2841](https://github.com/fishtown-analytics/dbt/pull/2841))
- Fix Redshift adapter `get_columns_in_relation` macro to push schema filter down to the `svv_external_columns` view ([#2854](https://github.com/fishtown-analytics/dbt/issues/2854), [#2854](https://github.com/fishtown-analytics/dbt/issues/2854))
- Increased the supported relation name length in postgres from 29 to 51 ([#2850](https://github.com/fishtown-analytics/dbt/pull/2850))
- `dbt list` command always return `0` as exit code ([#2886](https://github.com/fishtown-analytics/dbt/issues/2886), [#2892](https://github.com/fishtown-analytics/dbt/issues/2892))
- Set default `materialized` for test node configs to `test` ([#2806](https://github.com/fishtown-analytics/dbt/issues/2806), [#2902](https://github.com/fishtown-analytics/dbt/pull/2902))
- Allow `docs` blocks in `exposure` descriptions ([#2913](https://github.com/fishtown-analytics/dbt/issues/2913), [#2920](https://github.com/fishtown-analytics/dbt/pull/2920))
- Use original file path instead of absolute path as checksum for big seeds ([#2927](https://github.com/fishtown-analytics/dbt/issues/2927), [#2939](https://github.com/fishtown-analytics/dbt/pull/2939))
- Fix KeyError if deferring to a manifest with a since-deleted source, ephemeral model, or test ([#2875](https://github.com/fishtown-analytics/dbt/issues/2875), [#2958](https://github.com/fishtown-analytics/dbt/pull/2958))
### Under the hood
- Add `unixodbc-dev` package to testing docker image ([#2859](https://github.com/fishtown-analytics/dbt/pull/2859))
- Add event tracking for project parser/load times ([#2823](https://github.com/fishtown-analytics/dbt/issues/2823),[#2893](https://github.com/fishtown-analytics/dbt/pull/2893))
- Bump `cryptography` version to `>= 3.2` and bump snowflake connector to `2.3.6` ([#2896](https://github.com/fishtown-analytics/dbt/issues/2896), [#2922](https://github.com/fishtown-analytics/dbt/issues/2922))
- Widen supported Google Cloud libraries dependencies ([#2794](https://github.com/fishtown-analytics/dbt/pull/2794), [#2877](https://github.com/fishtown-analytics/dbt/pull/2877)).
- Bump `hologram` version to `0.0.11`. Add `scripts/dtr.py` ([#2888](https://github.com/fishtown-analytics/dbt/issues/2840),[#2889](https://github.com/fishtown-analytics/dbt/pull/2889))
- Bump `hologram` version to `0.0.12`. Add testing support for python3.9 ([#2822](https://github.com/fishtown-analytics/dbt/issues/2822),[#2960](https://github.com/fishtown-analytics/dbt/pull/2960))
- Bump the version requirements for `boto3` in dbt-redshift to the upper limit `1.16` to match dbt-redshift and the `snowflake-python-connector` as of version `2.3.6`. ([#2931](https://github.com/fishtown-analytics/dbt/issues/2931), [#2963](https://github.com/fishtown-analytics/dbt/issues/2963))
### Docs
- Fixed issue where data tests with tags were not showing up in graph viz ([docs#147](https://github.com/fishtown-analytics/dbt-docs/issues/147), [docs#157](https://github.com/fishtown-analytics/dbt-docs/pull/157))
Contributors:
- [@feluelle](https://github.com/feluelle) ([#2841](https://github.com/fishtown-analytics/dbt/pull/2841))
- [@ran-eh](https://github.com/ran-eh) ([#2596](https://github.com/fishtown-analytics/dbt/pull/2596))
- [@hochoy](https://github.com/hochoy) ([#2851](https://github.com/fishtown-analytics/dbt/pull/2851))
- [@brangisom](https://github.com/brangisom) ([#2855](https://github.com/fishtown-analytics/dbt/pull/2855))
- [@elexisvenator](https://github.com/elexisvenator) ([#2850](https://github.com/fishtown-analytics/dbt/pull/2850))
- [@franloza](https://github.com/franloza) ([#2837](https://github.com/fishtown-analytics/dbt/pull/2837))
- [@max-sixty](https://github.com/max-sixty) ([#2877](https://github.com/fishtown-analytics/dbt/pull/2877), [#2908](https://github.com/fishtown-analytics/dbt/pull/2908))
- [@rsella](https://github.com/rsella) ([#2892](https://github.com/fishtown-analytics/dbt/issues/2892))
- [@joellabes](https://github.com/joellabes) ([#2913](https://github.com/fishtown-analytics/dbt/issues/2913))
- [@plotneishestvo](https://github.com/plotneishestvo) ([#2896](https://github.com/fishtown-analytics/dbt/issues/2896))
- [@db-magnus](https://github.com/db-magnus) ([#2892](https://github.com/fishtown-analytics/dbt/issues/2892))
- [@tyang209](https://github.com/tyang209) ([#2931](https://github.com/fishtown-analytics/dbt/issues/2931))
## dbt 0.19.0b1 (October 21, 2020)
### Breaking changes
- The format for `sources.json`, `run-results.json`, `manifest.json`, and `catalog.json` has changed:
- Each now has a common metadata dictionary ([#2761](https://github.com/fishtown-analytics/dbt/issues/2761), [#2778](https://github.com/fishtown-analytics/dbt/pull/2778)). The contents include: schema and dbt versions ([#2670](https://github.com/fishtown-analytics/dbt/issues/2670), [#2767](https://github.com/fishtown-analytics/dbt/pull/2767)); `invocation_id` ([#2763](https://github.com/fishtown-analytics/dbt/issues/2763), [#2784](https://github.com/fishtown-analytics/dbt/pull/2784)); custom environment variables prefixed with `DBT_ENV_CUSTOM_ENV_` ([#2764](https://github.com/fishtown-analytics/dbt/issues/2764), [#2785](https://github.com/fishtown-analytics/dbt/pull/2785)); cli and rpc arguments in the `run_results.json` ([#2510](https://github.com/fishtown-analytics/dbt/issues/2510), [#2813](https://github.com/fishtown-analytics/dbt/pull/2813)).
- Remove `injected_sql` from manifest nodes, use `compiled_sql` instead ([#2762](https://github.com/fishtown-analytics/dbt/issues/2762), [#2834](https://github.com/fishtown-analytics/dbt/pull/2834))
### Features
- dbt will compare configurations using the un-rendered form of the config block in `dbt_project.yml` ([#2713](https://github.com/fishtown-analytics/dbt/issues/2713), [#2735](https://github.com/fishtown-analytics/dbt/pull/2735))
- Added state and defer arguments to the RPC client, matching the CLI ([#2678](https://github.com/fishtown-analytics/dbt/issues/2678), [#2736](https://github.com/fishtown-analytics/dbt/pull/2736))
- Added ability to snapshot hard-deleted records (opt-in with `invalidate_hard_deletes` config option). ([#249](https://github.com/fishtown-analytics/dbt/issues/249), [#2749](https://github.com/fishtown-analytics/dbt/pull/2749))
- Added revival for snapshotting hard-deleted records. ([#2819](https://github.com/fishtown-analytics/dbt/issues/2819), [#2821](https://github.com/fishtown-analytics/dbt/pull/2821))
- Improved error messages for YAML selectors ([#2700](https://github.com/fishtown-analytics/dbt/issues/2700), [#2781](https://github.com/fishtown-analytics/dbt/pull/2781))
- Added `dbt_invocation_id` for each BigQuery job to enable performance analysis ([#2808](https://github.com/fishtown-analytics/dbt/issues/2808), [#2809](https://github.com/fishtown-analytics/dbt/pull/2809))
- Added support for BigQuery connections using refresh tokens ([#2344](https://github.com/fishtown-analytics/dbt/issues/2344), [#2805](https://github.com/fishtown-analytics/dbt/pull/2805))
### Under the hood
- Save `manifest.json` at the same time we save the `run_results.json` at the end of a run ([#2765](https://github.com/fishtown-analytics/dbt/issues/2765), [#2799](https://github.com/fishtown-analytics/dbt/pull/2799))
- Added strategy-specific validation to improve the relevancy of compilation errors for the `timestamp` and `check` snapshot strategies. ([#2787](https://github.com/fishtown-analytics/dbt/issues/2787), [#2791](https://github.com/fishtown-analytics/dbt/pull/2791))
- Changed rpc test timeouts to avoid locally run test failures ([#2803](https://github.com/fishtown-analytics/dbt/issues/2803),[#2804](https://github.com/fishtown-analytics/dbt/pull/2804))
- Added a `debug_query` on the base adapter that will allow plugin authors to create custom debug queries ([#2751](https://github.com/fishtown-analytics/dbt/issues/2751), [#2817](https://github.com/fishtown-analytics/dbt/pull/2817))
### Docs
- Add select/deselect option in DAG view dropups. ([docs#98](https://github.com/fishtown-analytics/dbt-docs/issues/98), [docs#138](https://github.com/fishtown-analytics/dbt-docs/pull/138))
- Fixed issue where sources with tags were not showing up in graph viz ([docs#93](https://github.com/fishtown-analytics/dbt-docs/issues/93), [docs#139](https://github.com/fishtown-analytics/dbt-docs/pull/139))
- Use `compiled_sql` instead of `injected_sql` for "Compiled" ([docs#146](https://github.com/fishtown-analytics/dbt-docs/issues/146), [docs#148](https://github.com/fishtown-analytics/dbt-docs/issues/148))
Contributors:
- [@joelluijmes](https://github.com/joelluijmes) ([#2749](https://github.com/fishtown-analytics/dbt/pull/2749), [#2821](https://github.com/fishtown-analytics/dbt/pull/2821))
- [@kingfink](https://github.com/kingfink) ([#2791](https://github.com/fishtown-analytics/dbt/pull/2791))
- [@zmac12](https://github.com/zmac12) ([#2817](https://github.com/fishtown-analytics/dbt/pull/2817))
- [@Mr-Nobody99](https://github.com/Mr-Nobody99) ([docs#138](https://github.com/fishtown-analytics/dbt-docs/pull/138))
- [@jplynch77](https://github.com/jplynch77) ([docs#139](https://github.com/fishtown-analytics/dbt-docs/pull/139))
## dbt 0.18.1 (October 13, 2020)
## dbt 0.18.1rc1 (October 01, 2020)
### Features
- Added retry support for rateLimitExceeded errors from BigQuery ([#2795](https://github.com/fishtown-analytics/dbt/issues/2795), [#2796](https://github.com/fishtown-analytics/dbt/issues/2796))
Contributors:
- [@championj-foxtel](https://github.com/championj-foxtel) ([#2796](https://github.com/fishtown-analytics/dbt/issues/2796))
## dbt 0.18.1b3 (September 25, 2020)
### Features
- Added 'Last Modified' stat in snowflake catalog macro. Now should be available in docs. ([#2728](https://github.com/fishtown-analytics/dbt/issues/2728))
### Fixes
- `dbt compile` and `dbt run` failed with `KeyError: 'endpoint_resolver'` when threads > 1 and `method: iam` had been specified in the profiles.yaml ([#2756](https://github.com/fishtown-analytics/dbt/issues/2756), [#2766](https://github.com/fishtown-analytics/dbt/pull/2766))
- Fix Redshift adapter to include columns from external tables when using the get_columns_in_relation macro ([#2753](https://github.com/fishtown-analytics/dbt/issues/2753), [#2754](https://github.com/fishtown-analytics/dbt/pull/2754))
### Under the hood
- Require extra `snowflake-connector-python[secure-local-storage]` on all dbt-snowflake installations ([#2779](https://github.com/fishtown-analytics/dbt/issues/2779), [#2789](https://github.com/fishtown-analytics/dbt/pull/2789))
Contributors:
- [@Mr-Nobody99](https://github.com/Mr-Nobody99) ([#2732](https://github.com/fishtown-analytics/dbt/pull/2732))
- [@jweibel22](https://github.com/jweibel22) ([#2766](https://github.com/fishtown-analytics/dbt/pull/2766))
- [@aiguofer](https://github.com/aiguofer) ([#2754](https://github.com/fishtown-analytics/dbt/pull/2754))
## dbt 0.18.1b1 (September 17, 2020)
### Under the hood
- If column config says quote, use quoting in SQL for adding a comment. ([#2539](https://github.com/fishtown-analytics/dbt/issues/2539), [#2733](https://github.com/fishtown-analytics/dbt/pull/2733))
- Added support for running docker-based tests under Linux. ([#2739](https://github.com/fishtown-analytics/dbt/issues/2739))
### Features
- Specify all three logging levels (`INFO`, `WARNING`, `ERROR`) in result logs for commands `test`, `seed`, `run`, `snapshot` and `source snapshot-freshness` ([#2680](https://github.com/fishtown-analytics/dbt/pull/2680), [#2723](https://github.com/fishtown-analytics/dbt/pull/2723))
- Added "exposures" ([#2730](https://github.com/fishtown-analytics/dbt/issues/2730), [#2752](https://github.com/fishtown-analytics/dbt/pull/2752), [#2777](https://github.com/fishtown-analytics/dbt/issues/2777))
### Docs
- Add Exposure nodes ([docs#135](https://github.com/fishtown-analytics/dbt-docs/issues/135), [docs#136](https://github.com/fishtown-analytics/dbt-docs/pull/136), [docs#137](https://github.com/fishtown-analytics/dbt-docs/pull/137))
Contributors:
- [@tpilewicz](https://github.com/tpilewicz) ([#2723](https://github.com/fishtown-analytics/dbt/pull/2723))
- [@heisencoder](https://github.com/heisencoder) ([#2739](https://github.com/fishtown-analytics/dbt/issues/2739))
## dbt 0.18.0 (September 03, 2020)
### Under the hood
- Added 3 more adapter methods that the new dbt-adapter-test suite can use for testing. ([#2492](https://github.com/fishtown-analytics/dbt/issues/2492), [#2721](https://github.com/fishtown-analytics/dbt/pull/2721))
- It is now an error to attempt installing `dbt` with a Python version less than 3.6. (resolves [#2347](https://github.com/fishtown-analytics/dbt/issues/2347))
- Check for Postgres relation names longer than 63 and throw exception. ([#2197](https://github.com/fishtown-analytics/dbt/issues/2197), [#2727](https://github.com/fishtown-analytics/dbt/pull/2727))
### Fixes
- dbt now validates the require-dbt-version field before it validates the dbt_project.yml schema ([#2638](https://github.com/fishtown-analytics/dbt/issues/2638), [#2726](https://github.com/fishtown-analytics/dbt/pull/2726))
### Docs
- Add project level overviews ([docs#127](https://github.com/fishtown-analytics/dbt-docs/issues/127))
Contributors:
- [@genos](https://github.com/genos) ([#2722](https://github.com/fishtown-analytics/dbt/pull/2722))
- [@Mr-Nobody99](https://github.com/Mr-Nobody99) ([docs#129](https://github.com/fishtown-analytics/dbt-docs/pull/129))
## dbt 0.18.0rc1 (August 19, 2020)
### Breaking changes
- `adapter_macro` is no longer a macro, instead it is a builtin context method. Any custom macros that intercepted it by going through `context['dbt']` will need to instead access it via `context['builtins']` ([#2302](https://github.com/fishtown-analytics/dbt/issues/2302), [#2673](https://github.com/fishtown-analytics/dbt/pull/2673))
- `adapter_macro` is now deprecated. Use `adapter.dispatch` instead.
- Data tests are now written as CTEs instead of subqueries. Adapter plugins for adapters that don't support CTEs may require modification. ([#2712](https://github.com/fishtown-analytics/dbt/pull/2712))
### Under the hood
- Upgraded snowflake-connector-python dependency to 2.2.10 and enabled the SSO token cache ([#2613](https://github.com/fishtown-analytics/dbt/issues/2613), [#2689](https://github.com/fishtown-analytics/dbt/issues/2689), [#2698](https://github.com/fishtown-analytics/dbt/pull/2698))
- Add deprecation warnings to anonymous usage tracking ([#2688](https://github.com/fishtown-analytics/dbt/issues/2688), [#2710](https://github.com/fishtown-analytics/dbt/issues/2710))
- Data tests now behave like dbt CTEs ([#2609](https://github.com/fishtown-analytics/dbt/issues/2609), [#2712](https://github.com/fishtown-analytics/dbt/pull/2712))
- Adapter plugins can now override the CTE prefix by overriding their `Relation` attribute with a class that has a custom `add_ephemeral_prefix` implementation. ([#2660](https://github.com/fishtown-analytics/dbt/issues/2660), [#2712](https://github.com/fishtown-analytics/dbt/pull/2712))
### Features
- Add a BigQuery adapter macro to enable usage of CopyJobs ([#2709](https://github.com/fishtown-analytics/dbt/pull/2709))
- Support TTL for BigQuery tables ([#2711](https://github.com/fishtown-analytics/dbt/pull/2711))
- Add better retry support when using the BigQuery adapter ([#2694](https://github.com/fishtown-analytics/dbt/pull/2694), follow-up to [#1963](https://github.com/fishtown-analytics/dbt/pull/1963))
- Added a `dispatch` method to the context adapter and deprecated `adapter_macro`. ([#2302](https://github.com/fishtown-analytics/dbt/issues/2302), [#2679](https://github.com/fishtown-analytics/dbt/pull/2679))
- The built-in schema tests now use `adapter.dispatch`, so they can be overridden for adapter plugins ([#2415](https://github.com/fishtown-analytics/dbt/issues/2415), [#2684](https://github.com/fishtown-analytics/dbt/pull/2684))
- Add support for impersonating a service account using `impersonate_service_account` in the BigQuery profile configuration ([#2677](https://github.com/fishtown-analytics/dbt/issues/2677)) ([docs](https://docs.getdbt.com/reference/warehouse-profiles/bigquery-profile#service-account-impersonation))
- Macros in the current project can override internal dbt macros that are called through `execute_macros`. ([#2301](https://github.com/fishtown-analytics/dbt/issues/2301), [#2686](https://github.com/fishtown-analytics/dbt/pull/2686))
- Add state:modified and state:new selectors ([#2641](https://github.com/fishtown-analytics/dbt/issues/2641), [#2695](https://github.com/fishtown-analytics/dbt/pull/2695))
- Add two new flags `--use-colors` and `--no-use-colors` to `dbt run` command to enable or disable log colorization from the command line ([#2708](https://github.com/fishtown-analytics/dbt/pull/2708))
### Fixes
- Fix Redshift table size estimation; e.g. 44 GB tables are no longer reported as 44 KB. ([#2702](https://github.com/fishtown-analytics/dbt/issues/2702))
- Fix issue where jinja that only contained jinja comments wasn't rendered. ([#2707](https://github.com/fishtown-analytics/dbt/issues/2707), [#2178](https://github.com/fishtown-analytics/dbt/pull/2178))
### Docs
- Add "Referenced By" and "Depends On" sections for each node ([docs#106](https://github.com/fishtown-analytics/dbt-docs/pull/106))
- Add Name, Description, Column, SQL, Tags filters to site search ([docs#108](https://github.com/fishtown-analytics/dbt-docs/pull/108))
- Add relevance criteria to site search ([docs#113](https://github.com/fishtown-analytics/dbt-docs/pull/113))
- Support new selector methods, intersection, and arbitrary parent/child depth in DAG selection syntax ([docs#118](https://github.com/fishtown-analytics/dbt-docs/pull/118))
- Revise anonymous event tracking: simpler URL fuzzing; differentiate between Cloud-hosted and non-Cloud docs ([docs#121](https://github.com/fishtown-analytics/dbt-docs/pull/121))
Contributors:
- [@bbhoss](https://github.com/bbhoss) ([#2677](https://github.com/fishtown-analytics/dbt/pull/2677))
- [@kconvey](https://github.com/kconvey) ([#2694](https://github.com/fishtown-analytics/dbt/pull/2694), [#2709](https://github.com/fishtown-analytics/dbt/pull/2709), [#2711](https://github.com/fishtown-analytics/dbt/pull/2711))
- [@vogt4nick](https://github.com/vogt4nick) ([#2702](https://github.com/fishtown-analytics/dbt/issues/2702))
- [@stephen8chang](https://github.com/stephen8chang) ([docs#106](https://github.com/fishtown-analytics/dbt-docs/pull/106), [docs#108](https://github.com/fishtown-analytics/dbt-docs/pull/108), [docs#113](https://github.com/fishtown-analytics/dbt-docs/pull/113))
- [@rsenseman](https://github.com/rsenseman) ([#2708](https://github.com/fishtown-analytics/dbt/pull/2708))
## dbt 0.18.0b2 (July 30, 2020)
@@ -13,6 +256,7 @@
- Previously, dbt put macros from all installed plugins into the namespace. This version of dbt will not include adapter plugin macros unless they are from the currently-in-use adapter or one of its dependencies [#2590](https://github.com/fishtown-analytics/dbt/pull/2590)
### Features
- Added option "--adapter" to `dbt init` to create a sample `profiles.yml` based on the chosen adapter ([#2533](https://github.com/fishtown-analytics/dbt/issues/2533), [#2594](https://github.com/fishtown-analytics/dbt/pull/2594))
- Added support for Snowflake query tags at the connection and model level ([#1030](https://github.com/fishtown-analytics/dbt/issues/1030), [#2555](https://github.com/fishtown-analytics/dbt/pull/2555/))
- Added new node selector methods (`config`, `test_type`, `test_name`, `package`) ([#2425](https://github.com/fishtown-analytics/dbt/issues/2425), [#2629](https://github.com/fishtown-analytics/dbt/pull/2629))
- Added option to specify profile when connecting to Redshift via IAM ([#2437](https://github.com/fishtown-analytics/dbt/issues/2437), [#2581](https://github.com/fishtown-analytics/dbt/pull/2581))
@@ -24,7 +268,7 @@
- Compile assets as part of docs generate ([#2072](https://github.com/fishtown-analytics/dbt/issues/2072), [#2623](https://github.com/fishtown-analytics/dbt/pull/2623))
Contributors:
- [@brunomurino](https://github.com/brunomurino) ([#2437](https://github.com/fishtown-analytics/dbt/pull/2581))
- [@brunomurino](https://github.com/brunomurino) ([#2581](https://github.com/fishtown-analytics/dbt/pull/2581), [#2594](https://github.com/fishtown-analytics/dbt/pull/2594))
- [@DrMcTaco](https://github.com/DrMcTaco) ([#1030](https://github.com/fishtown-analytics/dbt/issues/1030), [#2555](https://github.com/fishtown-analytics/dbt/pull/2555/))
- [@kning](https://github.com/kning) ([#2627](https://github.com/fishtown-analytics/dbt/pull/2627))
- [@azhard](https://github.com/azhard) ([#2588](https://github.com/fishtown-analytics/dbt/pull/2588))
@@ -123,11 +367,9 @@ Contributors:
- dbt compile and ls no longer create schemas if they don't already exist ([#2525](https://github.com/fishtown-analytics/dbt/issues/2525), [#2528](https://github.com/fishtown-analytics/dbt/pull/2528))
- `dbt deps` now respects the `--project-dir` flag, so using `dbt deps --project-dir=/some/path` and then `dbt run --project-dir=/some/path` will properly find dependencies ([#2519](https://github.com/fishtown-analytics/dbt/issues/2519), [#2534](https://github.com/fishtown-analytics/dbt/pull/2534))
- `packages.yml` revision/version fields can be float-like again (`revision: '1.0'` is valid). ([#2518](https://github.com/fishtown-analytics/dbt/issues/2518), [#2535](https://github.com/fishtown-analytics/dbt/pull/2535))
<<<<<<< HEAD
- dbt again respects config aliases in config() calls ([#2557](https://github.com/fishtown-analytics/dbt/issues/2557), [#2559](https://github.com/fishtown-analytics/dbt/pull/2559))
=======
- Parallel RPC requests no longer step on each others' arguments ([[#2484](https://github.com/fishtown-analytics/dbt/issues/2484), [#2554](https://github.com/fishtown-analytics/dbt/pull/2554)])
- `persist_docs` now takes into account descriptions for nested columns in bigquery ([#2549](https://github.com/fishtown-analytics/dbt/issues/2549), [#2550](https://github.com/fishtown-analytics/dbt/pull/2550))
- On windows (depending upon OS support), dbt no longer fails with errors when writing artifacts ([#2558](https://github.com/fishtown-analytics/dbt/issues/2558), [#2566](https://github.com/fishtown-analytics/dbt/pull/2566))
@@ -137,7 +379,6 @@ Contributors:
Contributors:
- [@bodschut](https://github.com/bodschut) ([#2550](https://github.com/fishtown-analytics/dbt/pull/2550))
>>>>>>> dev/0.17.1
## dbt 0.17.0 (June 08, 2020)
@@ -716,7 +957,6 @@ Thanks for your contributions to dbt!
- [@bastienboutonnet](https://github.com/bastienboutonnet) ([#1591](https://github.com/fishtown-analytics/dbt/pull/1591), [#1689](https://github.com/fishtown-analytics/dbt/pull/1689))
## dbt 0.14.0 - Wilt Chamberlain (July 10, 2019)
### Overview

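The 0.18.0 section above notes that dbt now errors on Postgres relation names longer than 63 characters instead of letting Postgres truncate them silently ([#2197], [#2727]). A standalone sketch of that kind of guard, for illustration only — the function name and exception type below are hypothetical, not dbt's implementation:

```python
# Illustrative sketch of a "fail fast on over-long Postgres identifiers" check.
POSTGRES_MAX_IDENTIFIER_LENGTH = 63  # Postgres silently truncates beyond this


def validate_relation_name(name: str) -> str:
    """Raise if a relation name would be truncated by Postgres."""
    if len(name) > POSTGRES_MAX_IDENTIFIER_LENGTH:
        raise RuntimeError(
            f"Relation name '{name}' is {len(name)} characters long, but "
            f"Postgres identifiers are limited to "
            f"{POSTGRES_MAX_IDENTIFIER_LENGTH} characters."
        )
    return name


if __name__ == "__main__":
    validate_relation_name("orders_daily")      # fine
    try:
        validate_relation_name("x" * 70)          # too long
    except RuntimeError as exc:
        print(exc)
```

dbt's own check raises a dbt exception type rather than `RuntimeError`; the point is simply to fail before the database mangles the identifier.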
View File

@@ -1,39 +0,0 @@
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND noninteractive
ARG DOCKERIZE_VERSION=v0.6.1
RUN apt-get update && \
apt-get dist-upgrade -y && \
apt-get install -y --no-install-recommends \
netcat postgresql curl git ssh software-properties-common \
make build-essential ca-certificates libpq-dev \
libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit libyaml-dev \
&& \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get install -y \
python python-dev python-pip \
python3.6 python3.6-dev python3-pip python3.6-venv \
python3.7 python3.7-dev python3.7-venv \
python3.8 python3.8-dev python3.8-venv \
python3.9 python3.9-dev python3.9-venv && \
apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN useradd -mU dbt_test_user
RUN mkdir /usr/app && chown dbt_test_user /usr/app
RUN mkdir /home/tox && chown dbt_test_user /home/tox
RUN curl -LO https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz && \
tar -C /usr/local/bin -xzvf dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz && \
rm dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz
WORKDIR /usr/app
VOLUME /usr/app
RUN pip3 install -U "tox==3.14.4" wheel "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# tox fails if the 'python' interpreter (python2) doesn't have `tox` installed
RUN pip install -U "tox==3.14.4" "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
USER dbt_test_user
ENV PYTHONIOENCODING=utf-8
ENV LANG C.UTF-8

Dockerfile.test
View File

@@ -0,0 +1,74 @@
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
netcat \
postgresql \
curl \
git \
ssh \
software-properties-common \
make \
build-essential \
ca-certificates \
libpq-dev \
libsasl2-dev \
libsasl2-2 \
libsasl2-modules-gssapi-mit \
libyaml-dev \
unixodbc-dev \
&& add-apt-repository ppa:deadsnakes/ppa \
&& apt-get install -y \
python \
python-dev \
python-pip \
python3.6 \
python3.6-dev \
python3-pip \
python3.6-venv \
python3.7 \
python3.7-dev \
python3.7-venv \
python3.8 \
python3.8-dev \
python3.8-venv \
python3.9 \
python3.9-dev \
python3.9-venv \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ARG DOCKERIZE_VERSION=v0.6.1
RUN curl -LO https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& tar -C /usr/local/bin -xzvf dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& rm dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz
RUN pip3 install -U "tox==3.14.4" wheel "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# tox fails if the 'python' interpreter (python2) doesn't have `tox` installed
RUN pip install -U "tox==3.14.4" "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# These args are passed in via docker-compose, which reads them from the .env file.
# On Linux, run `make .env` to create the .env file for the current user.
# On MacOS and Windows, these can stay unset.
ARG USER_ID
ARG GROUP_ID
RUN if [ ${USER_ID:-0} -ne 0 ] && [ ${GROUP_ID:-0} -ne 0 ]; then \
groupadd -g ${GROUP_ID} dbt_test_user && \
useradd -m -l -u ${USER_ID} -g ${GROUP_ID} dbt_test_user; \
else \
useradd -mU -l dbt_test_user; \
fi
RUN mkdir /usr/app && chown dbt_test_user /usr/app
RUN mkdir /home/tox && chown dbt_test_user /home/tox
WORKDIR /usr/app
VOLUME /usr/app
USER dbt_test_user
ENV PYTHONIOENCODING=utf-8
ENV LANG C.UTF-8

View File

@@ -5,25 +5,38 @@ changed_tests := `git status --porcelain | grep '^\(M\| M\|A\| A\)' | awk '{ pri
install:
pip install -e .
test:
test: .env
@echo "Full test run starting..."
@time docker-compose run test tox
@time docker-compose run --rm test tox
test-unit:
test-unit: .env
@echo "Unit test run starting..."
@time docker-compose run test tox -e unit-py36,flake8
@time docker-compose run --rm test tox -e unit-py36,flake8
test-integration:
test-integration: .env
@echo "Integration test run starting..."
@time docker-compose run test tox -e integration-postgres-py36,integration-redshift-py36,integration-snowflake-py36,integration-bigquery-py36
@time docker-compose run --rm test tox -e integration-postgres-py36,integration-redshift-py36,integration-snowflake-py36,integration-bigquery-py36
test-quick:
test-quick: .env
@echo "Integration test run starting..."
@time docker-compose run test tox -e integration-postgres-py36 -- -x
@time docker-compose run --rm test tox -e integration-postgres-py36 -- -x
# This rule creates a file named .env that is used by docker-compose for passing
# the USER_ID and GROUP_ID arguments to the Docker image.
.env:
@touch .env
ifneq ($(OS),Windows_NT)
ifneq ($(shell uname -s), Darwin)
@echo USER_ID=$(shell id -u) > .env
@echo GROUP_ID=$(shell id -g) >> .env
endif
endif
@time docker-compose build
clean:
rm -f .coverage
rm -rf .eggs/
rm -f .env
rm -rf .tox/
rm -rf build/
rm -rf dbt.egg-info/
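The new `.env` rule above writes the invoking user's UID and GID so docker-compose can pass them into the `Dockerfile.test` build, where they are used to create `dbt_test_user` with matching ownership on Linux. A rough Python equivalent of what that make rule produces, for illustration only (not part of dbt):

```python
# Sketch of what the Makefile's `.env` rule emits; macOS and Windows get an empty file.
import os
import platform
from pathlib import Path


def write_env_file(path: str = ".env") -> None:
    """Write USER_ID/GROUP_ID for docker-compose, mirroring the make rule."""
    env_file = Path(path)
    env_file.touch()
    if platform.system() == "Linux":
        # os.getuid/os.getgid only exist on POSIX, which is fine inside this branch.
        env_file.write_text(f"USER_ID={os.getuid()}\nGROUP_ID={os.getgid()}\n")


if __name__ == "__main__":
    write_env_file()
    print(Path(".env").read_text() or "(empty .env, as on macOS/Windows)")
```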

View File

@@ -4,14 +4,15 @@ import os
from multiprocessing.synchronize import RLock
from threading import get_ident
from typing import (
Dict, Tuple, Hashable, Optional, ContextManager, List
Dict, Tuple, Hashable, Optional, ContextManager, List, Union
)
import agate
import dbt.exceptions
from dbt.contracts.connection import (
Connection, Identifier, ConnectionState, AdapterRequiredConfig, LazyHandle
Connection, Identifier, ConnectionState,
AdapterRequiredConfig, LazyHandle, AdapterResponse
)
from dbt.contracts.graph.manifest import Manifest
from dbt.adapters.base.query_headers import (
@@ -290,7 +291,7 @@ class BaseConnectionManager(metaclass=abc.ABCMeta):
@abc.abstractmethod
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
"""Execute the given SQL.
:param str sql: The sql to execute.
@@ -298,7 +299,7 @@ class BaseConnectionManager(metaclass=abc.ABCMeta):
transaction, automatically begin one.
:param bool fetch: If set, fetch results.
:return: A tuple of the status and the results (empty if fetch=False).
:rtype: Tuple[str, agate.Table]
:rtype: Tuple[Union[str, AdapterResponse], agate.Table]
"""
raise dbt.exceptions.NotImplementedException(
'`execute` is not implemented for this adapter!'
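The hunk above widens the first element of `execute()`'s return value from a plain status string to `Union[str, AdapterResponse]`. A simplified stand-in, assuming only that the structured response stringifies to its message so existing callers keep working — the dataclass below is illustrative, not dbt's actual `AdapterResponse`:

```python
# Stand-in for the widened return type; dbt's real AdapterResponse lives in
# dbt.contracts.connection and may have different fields.
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FakeAdapterResponse:
    _message: str                      # e.g. "SELECT 12"
    rows_affected: Optional[int] = None

    def __str__(self) -> str:
        return self._message


def describe_status(status: Union[str, FakeAdapterResponse]) -> str:
    """Callers can keep treating the first element of execute() as a string."""
    return str(status)


print(describe_status("OK"))
print(describe_status(FakeAdapterResponse("SELECT 12", rows_affected=12)))
```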

View File

@@ -25,7 +25,9 @@ from dbt.adapters.protocol import (
)
from dbt.clients.agate_helper import empty_table, merge_tables, table_from_rows
from dbt.clients.jinja import MacroGenerator
from dbt.contracts.graph.compiled import CompileResultNode, CompiledSeedNode
from dbt.contracts.graph.compiled import (
CompileResultNode, CompiledSeedNode
)
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import ParsedSeedNode
from dbt.exceptions import warn_or_error
@@ -33,7 +35,7 @@ from dbt.node_types import NodeType
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.utils import filter_null_values, executor
from dbt.adapters.base.connections import Connection
from dbt.adapters.base.connections import Connection, AdapterResponse
from dbt.adapters.base.meta import AdapterMeta, available
from dbt.adapters.base.relation import (
ComponentName, BaseRelation, InformationSchema, SchemaSearchMap
@@ -158,7 +160,7 @@ class BaseAdapter(metaclass=AdapterMeta):
self.config = config
self.cache = RelationsCache()
self.connections = self.ConnectionManager(config)
self._internal_manifest_lazy: Optional[Manifest] = None
self._macro_manifest_lazy: Optional[Manifest] = None
###
# Methods that pass through to the connection manager
@@ -178,6 +180,9 @@ class BaseAdapter(metaclass=AdapterMeta):
def commit_if_has_connection(self) -> None:
self.connections.commit_if_has_connection()
def debug_query(self) -> None:
self.execute('select 1 as id')
def nice_connection_name(self) -> str:
conn = self.connections.get_if_exists()
if conn is None or conn.name is None:
@@ -208,7 +213,7 @@ class BaseAdapter(metaclass=AdapterMeta):
@available.parse(lambda *a, **k: ('', empty_table()))
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
"""Execute the given SQL. This is a thin wrapper around
ConnectionManager.execute.
@@ -217,7 +222,7 @@ class BaseAdapter(metaclass=AdapterMeta):
transaction, automatically begin one.
:param bool fetch: If set, fetch results.
:return: A tuple of the status and the results (empty if fetch=False).
:rtype: Tuple[str, agate.Table]
:rtype: Tuple[Union[str, AdapterResponse], agate.Table]
"""
return self.connections.execute(
sql=sql,
@@ -225,6 +230,21 @@ class BaseAdapter(metaclass=AdapterMeta):
fetch=fetch
)
@available.parse(lambda *a, **k: ('', empty_table()))
def get_partitions_metadata(
self, table: str
) -> Tuple[agate.Table]:
"""Obtain partitions metadata for a BigQuery partitioned table.
:param str table_id: a partitioned table id, in standard SQL format.
:return: a partition metadata tuple, as described in
https://cloud.google.com/bigquery/docs/creating-partitioned-tables#getting_partition_metadata_using_meta_tables.
:rtype: agate.Table
"""
return self.connections.get_partitions_metadata(
table=table
)
###
# Methods that should never be overridden
###
@@ -239,24 +259,30 @@ class BaseAdapter(metaclass=AdapterMeta):
return cls.ConnectionManager.TYPE
@property
def _internal_manifest(self) -> Manifest:
if self._internal_manifest_lazy is None:
return self.load_internal_manifest()
return self._internal_manifest_lazy
def _macro_manifest(self) -> Manifest:
if self._macro_manifest_lazy is None:
return self.load_macro_manifest()
return self._macro_manifest_lazy
def check_internal_manifest(self) -> Optional[Manifest]:
def check_macro_manifest(self) -> Optional[Manifest]:
"""Return the internal manifest (used for executing macros) if it's
been initialized, otherwise return None.
"""
return self._internal_manifest_lazy
return self._macro_manifest_lazy
def load_internal_manifest(self) -> Manifest:
if self._internal_manifest_lazy is None:
def load_macro_manifest(self) -> Manifest:
if self._macro_manifest_lazy is None:
# avoid a circular import
from dbt.parser.manifest import load_internal_manifest
manifest = load_internal_manifest(self.config)
self._internal_manifest_lazy = manifest
return self._internal_manifest_lazy
from dbt.parser.manifest import load_macro_manifest
manifest = load_macro_manifest(
self.config, self.connections.set_query_header
)
self._macro_manifest_lazy = manifest
return self._macro_manifest_lazy
def clear_macro_manifest(self):
if self._macro_manifest_lazy is not None:
self._macro_manifest_lazy = None
###
# Caching methods
@@ -283,7 +309,10 @@ class BaseAdapter(metaclass=AdapterMeta):
return {
self.Relation.create_from(self.config, node).without_identifier()
for node in manifest.nodes.values()
if node.resource_type in NodeType.executable()
if (
node.resource_type in NodeType.executable() and
not node.is_ephemeral_model
)
}
def _get_catalog_schemas(self, manifest: Manifest) -> SchemaSearchMap:
@@ -941,7 +970,7 @@ class BaseAdapter(metaclass=AdapterMeta):
context_override = {}
if manifest is None:
manifest = self._internal_manifest
manifest = self._macro_manifest
macro = manifest.find_macro_by_name(
macro_name, self.config.project_name, project
@@ -1107,6 +1136,44 @@ class BaseAdapter(metaclass=AdapterMeta):
"""
pass
def get_compiler(self):
from dbt.compilation import Compiler
return Compiler(self.config)
# Methods used in adapter tests
def update_column_sql(
self,
dst_name: str,
dst_column: str,
clause: str,
where_clause: Optional[str] = None,
) -> str:
clause = f'update {dst_name} set {dst_column} = {clause}'
if where_clause is not None:
clause += f' where {where_clause}'
return clause
def timestamp_add_sql(
self, add_to: str, number: int = 1, interval: str = 'hour'
) -> str:
# for backwards compatibility, we're compelled to set some sort of
# default. A lot of searching has led me to believe that the
# '+ interval' syntax used in postgres/redshift is relatively common
# and might even be the SQL standard's intention.
return f"{add_to} + interval '{number} {interval}'"
def string_add_sql(
self, add_to: str, value: str, location='append',
) -> str:
if location == 'append':
return f"{add_to} || '{value}'"
elif location == 'prepend':
return f"'{value}' || {add_to}"
else:
raise RuntimeException(
f'Got an unexpected location value of "{location}"'
)
def get_rows_different_sql(
self,
relation_a: BaseRelation,

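The hunk above adds dialect-default SQL helpers (`timestamp_add_sql`, `string_add_sql`, `update_column_sql`) used by the adapter test suite. A hypothetical override for a dialect that uses `CONCAT(...)` instead of the `||` operator might look like the sketch below; the class names are invented, and the base implementations mirror the ones shown in the diff:

```python
# Illustrative only: MyDialectAdapterSketch is not an adapter that ships with dbt.
class BaseAdapterSketch:
    def string_add_sql(self, add_to: str, value: str, location: str = "append") -> str:
        if location == "append":
            return f"{add_to} || '{value}'"
        elif location == "prepend":
            return f"'{value}' || {add_to}"
        raise ValueError(f'Got an unexpected location value of "{location}"')

    def timestamp_add_sql(self, add_to: str, number: int = 1, interval: str = "hour") -> str:
        # postgres/redshift-style default, as in the diff above
        return f"{add_to} + interval '{number} {interval}'"


class MyDialectAdapterSketch(BaseAdapterSketch):
    def string_add_sql(self, add_to: str, value: str, location: str = "append") -> str:
        # Dialects without the || operator can emit CONCAT() instead.
        if location == "append":
            return f"CONCAT({add_to}, '{value}')"
        return f"CONCAT('{value}', {add_to})"


print(MyDialectAdapterSketch().string_add_sql("first_name", "_suffix"))
```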
View File

@@ -201,6 +201,23 @@ class BaseRelation(FakeAPIObject, Hashable):
**kwargs
)
@staticmethod
def add_ephemeral_prefix(name: str):
return f'__dbt__cte__{name}'
@classmethod
def create_ephemeral_from_node(
cls: Type[Self],
config: HasQuoting,
node: Union[ParsedNode, CompiledNode],
) -> Self:
# Note that ephemeral models are based on the name.
identifier = cls.add_ephemeral_prefix(node.name)
return cls.create(
type=cls.CTE,
identifier=identifier,
).quote(identifier=False)
@classmethod
def create_from_node(
cls: Type[Self],

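The `add_ephemeral_prefix` staticmethod above is what produces the `__dbt__cte__` names for ephemeral models, and per the 0.18.0rc1 changelog an adapter plugin can change that prefix by overriding it on its `Relation` class. A minimal, hypothetical sketch of such an override (class names invented; dbt's `BaseRelation` is a much richer dataclass):

```python
# Hypothetical sketch of overriding the ephemeral CTE prefix.
class BaseRelationSketch:
    @staticmethod
    def add_ephemeral_prefix(name: str) -> str:
        return f"__dbt__cte__{name}"          # default shown in the diff


class ShortPrefixRelationSketch(BaseRelationSketch):
    @staticmethod
    def add_ephemeral_prefix(name: str) -> str:
        # A plugin that needs shorter identifiers could use a terser prefix.
        return f"_cte_{name}"


print(BaseRelationSketch.add_ephemeral_prefix("stg_orders"))        # __dbt__cte__stg_orders
print(ShortPrefixRelationSketch.add_ephemeral_prefix("stg_orders"))  # _cte_stg_orders
```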
View File

@@ -1,19 +1,25 @@
from dataclasses import dataclass
from typing import (
Type, Hashable, Optional, ContextManager, List, Generic, TypeVar, ClassVar,
Tuple, Union
Tuple, Union, Dict, Any
)
from typing_extensions import Protocol
import agate
from dbt.contracts.connection import Connection, AdapterRequiredConfig
from dbt.contracts.graph.compiled import CompiledNode
from dbt.contracts.connection import (
Connection, AdapterRequiredConfig, AdapterResponse
)
from dbt.contracts.graph.compiled import (
CompiledNode, ManifestNode, NonSourceCompiledNode
)
from dbt.contracts.graph.parsed import ParsedNode, ParsedSourceDefinition
from dbt.contracts.graph.model_config import BaseConfig
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.relation import Policy, HasQuoting
from dbt.graph import Graph
@dataclass
class AdapterConfig(BaseConfig):
@@ -45,6 +51,19 @@ class RelationProtocol(Protocol):
...
class CompilerProtocol(Protocol):
def compile(self, manifest: Manifest, write=True) -> Graph:
...
def compile_node(
self,
node: ManifestNode,
manifest: Manifest,
extra_context: Optional[Dict[str, Any]] = None,
) -> NonSourceCompiledNode:
...
AdapterConfig_T = TypeVar(
'AdapterConfig_T', bound=AdapterConfig
)
@@ -57,11 +76,18 @@ Relation_T = TypeVar(
Column_T = TypeVar(
'Column_T', bound=ColumnProtocol
)
Compiler_T = TypeVar('Compiler_T', bound=CompilerProtocol)
class AdapterProtocol(
Protocol,
Generic[AdapterConfig_T, ConnectionManager_T, Relation_T, Column_T]
Generic[
AdapterConfig_T,
ConnectionManager_T,
Relation_T,
Column_T,
Compiler_T,
]
):
AdapterSpecificConfigs: ClassVar[Type[AdapterConfig_T]]
Column: ClassVar[Type[Column_T]]
@@ -130,5 +156,8 @@ class AdapterProtocol(
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
...
def get_compiler(self) -> Compiler_T:
...
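The `CompilerProtocol` / `Compiler_T` additions above use structural typing: any class whose methods match the protocol satisfies the bound, with no inheritance required. A minimal illustration of that pattern, assuming Python 3.8+ so `Protocol` can come from `typing` rather than `typing_extensions`; the names here are invented and unrelated to dbt's actual protocols:

```python
# Minimal demonstration of a Protocol-bounded TypeVar used as a generic parameter.
from typing import Protocol, TypeVar, Generic


class CompilerLikeProtocol(Protocol):
    def compile_text(self, text: str) -> str:
        ...


Compiler_T = TypeVar("Compiler_T", bound=CompilerLikeProtocol)


class UppercaseCompiler:
    # No explicit inheritance: matching the method signature is enough.
    def compile_text(self, text: str) -> str:
        return text.upper()


class Runner(Generic[Compiler_T]):
    def __init__(self, compiler: Compiler_T) -> None:
        self.compiler = compiler

    def run(self, text: str) -> str:
        return self.compiler.compile_text(text)


print(Runner(UppercaseCompiler()).run("select 1"))
```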

View File

@@ -1,13 +1,15 @@
import abc
import time
from typing import List, Optional, Tuple, Any, Iterable, Dict
from typing import List, Optional, Tuple, Any, Iterable, Dict, Union
import agate
import dbt.clients.agate_helper
import dbt.exceptions
from dbt.adapters.base import BaseConnectionManager
from dbt.contracts.connection import Connection, ConnectionState
from dbt.contracts.connection import (
Connection, ConnectionState, AdapterResponse
)
from dbt.logger import GLOBAL_LOGGER as logger
from dbt import flags
@@ -18,7 +20,7 @@ class SQLConnectionManager(BaseConnectionManager):
Methods to implement:
- exception_handler
- cancel
- get_status
- get_response
- open
"""
@abc.abstractmethod
@@ -76,20 +78,19 @@ class SQLConnectionManager(BaseConnectionManager):
cursor = connection.handle.cursor()
cursor.execute(sql, bindings)
logger.debug(
"SQL status: {status} in {elapsed:0.2f} seconds",
status=self.get_status(cursor),
status=self.get_response(cursor),
elapsed=(time.time() - pre)
)
return connection, cursor
@abc.abstractclassmethod
def get_status(cls, cursor: Any) -> str:
def get_response(cls, cursor: Any) -> Union[AdapterResponse, str]:
"""Get the status of the cursor."""
raise dbt.exceptions.NotImplementedException(
'`get_status` is not implemented for this adapter!'
'`get_response` is not implemented for this adapter!'
)
@classmethod
@@ -118,15 +119,15 @@ class SQLConnectionManager(BaseConnectionManager):
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[AdapterResponse, str], agate.Table]:
sql = self._add_query_comment(sql)
_, cursor = self.add_query(sql, auto_begin)
status = self.get_status(cursor)
response = self.get_response(cursor)
if fetch:
table = self.get_result_from_cursor(cursor)
else:
table = dbt.clients.agate_helper.empty_table()
return status, table
return response, table
def add_begin_query(self):
return self.add_query('BEGIN', auto_begin=False)
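The rename above (`get_status` to `get_response`) lets a connection manager return a structured object built from the DBAPI cursor instead of a bare status string. A hedged sketch of what an implementation might do, using only PEP 249's `rowcount` plus an optional driver-specific attribute; `FakeCursor` and `SimpleResponse` are stand-ins, not dbt or DBAPI classes:

```python
# Sketch of a plugin-side get_response() summarizing cursor state after execution.
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimpleResponse:
    _message: str
    rows_affected: Optional[int] = None

    def __str__(self) -> str:
        return self._message


class FakeCursor:
    rowcount = 42
    statusmessage = "UPDATE 42"   # psycopg2-style attribute, read defensively below


def get_response(cursor) -> SimpleResponse:
    message = getattr(cursor, "statusmessage", None) or "OK"
    rows = cursor.rowcount if cursor.rowcount != -1 else None  # -1 means unknown per DBAPI
    return SimpleResponse(message, rows)


resp = get_response(FakeCursor())
print(resp, resp.rows_affected)   # UPDATE 42 42
```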

View File

@@ -545,7 +545,7 @@ def _requote_result(raw_value: str, rendered: str) -> str:
# checking two separate patterns, but the standard deviation is smaller with
# one pattern. The time difference between the two was ~2 std deviations, which
# is small enough that I've just chosen the more readable option.
_HAS_RENDER_CHARS_PAT = re.compile(r'({[{%]|[}%]})')
_HAS_RENDER_CHARS_PAT = re.compile(r'({[{%#]|[#}%]})')
def get_rendered(
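The widened pattern treats Jinja comment delimiters as render characters too, so a string containing only a Jinja comment is still sent through the renderer. A quick illustration using only the standard re module:

import re

# the new pattern: {{, {%, {# on the left; #}, }}, %} on the right
_HAS_RENDER_CHARS_PAT = re.compile(r'({[{%#]|[#}%]})')

assert _HAS_RENDER_CHARS_PAT.search('select 1 {# comment #}') is not None
assert _HAS_RENDER_CHARS_PAT.search('select 1') is None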


@@ -1,27 +1,35 @@
import os
from collections import defaultdict
from typing import List, Dict, Any, Tuple, cast
from typing import List, Dict, Any, Tuple, cast, Optional
import networkx as nx # type: ignore
import sqlparse
from dbt import flags
from dbt.adapters.factory import get_adapter
from dbt.clients import jinja
from dbt.clients.system import make_directory
from dbt.context.providers import generate_runtime_model
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.compiled import (
InjectedCTE,
COMPILED_TYPES,
NonSourceNode,
NonSourceCompiledNode,
CompiledDataTestNode,
CompiledSchemaTestNode,
COMPILED_TYPES,
GraphMemberNode,
InjectedCTE,
ManifestNode,
NonSourceCompiledNode,
)
from dbt.contracts.graph.parsed import ParsedNode
from dbt.exceptions import dependency_not_found, InternalException
from dbt.exceptions import (
dependency_not_found,
InternalException,
RuntimeException,
)
from dbt.graph import Graph
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.node_types import NodeType
from dbt.utils import add_ephemeral_model_prefix, pluralize
from dbt.utils import pluralize
graph_file_name = 'graph.gpickle'
@@ -44,6 +52,7 @@ def print_compile_stats(stats):
NodeType.Operation: 'operation',
NodeType.Seed: 'seed file',
NodeType.Source: 'source',
NodeType.Exposure: 'exposure',
}
results = {k: 0 for k in names.keys()}
@@ -57,7 +66,7 @@ def print_compile_stats(stats):
logger.info("Found {}".format(stat_line))
def _node_enabled(node: NonSourceNode):
def _node_enabled(node: ManifestNode):
# Disabled models are already excluded from the manifest
if node.resource_type == NodeType.Test and not node.config.enabled:
return False
@@ -73,6 +82,8 @@ def _generate_stats(manifest: Manifest):
for source in manifest.sources.values():
stats[source.resource_type] += 1
for exposure in manifest.exposures.values():
stats[exposure.resource_type] += 1
for macro in manifest.macros.values():
stats[macro.resource_type] += 1
return stats
@@ -140,12 +151,15 @@ class Compiler:
make_directory(self.config.target_path)
make_directory(self.config.modules_path)
# creates a ModelContext which is converted to
# a dict for jinja rendering of SQL
def _create_node_context(
self,
node: NonSourceCompiledNode,
manifest: Manifest,
extra_context: Dict[str, Any],
) -> Dict[str, Any]:
context = generate_runtime_model(
node, self.config, manifest
)
@@ -156,75 +170,218 @@ class Compiler:
return context
def _get_compiled_model(
self,
manifest: Manifest,
cte_id: str,
extra_context: Dict[str, Any],
) -> NonSourceCompiledNode:
def add_ephemeral_prefix(self, name: str):
adapter = get_adapter(self.config)
relation_cls = adapter.Relation
return relation_cls.add_ephemeral_prefix(name)
if cte_id not in manifest.nodes:
raise InternalException(
f'During compilation, found a cte reference that could not be '
f'resolved: {cte_id}'
)
cte_model = manifest.nodes[cte_id]
if getattr(cte_model, 'compiled', False):
assert isinstance(cte_model, tuple(COMPILED_TYPES.values()))
return cast(NonSourceCompiledNode, cte_model)
elif cte_model.is_ephemeral_model:
# this must be some kind of parsed node that we can compile.
# we know it's not a parsed source definition
assert isinstance(cte_model, tuple(COMPILED_TYPES))
# update the node so
node = self.compile_node(cte_model, manifest, extra_context)
manifest.sync_update_node(node)
return node
def _get_relation_name(self, node: ParsedNode):
relation_name = None
if (node.resource_type in NodeType.refable() and
not node.is_ephemeral_model):
adapter = get_adapter(self.config)
relation_cls = adapter.Relation
relation_name = str(relation_cls.create_from(self.config, node))
return relation_name
def _inject_ctes_into_sql(self, sql: str, ctes: List[InjectedCTE]) -> str:
"""
`ctes` is a list of InjectedCTEs like:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__cte__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__cte__events as (select id, type from events)",
),
]
Given `sql` like:
"with internal_cte as (select * from sessions)
select * from internal_cte"
This will spit out:
"with __dbt__cte__ephemeral as (select * from table),
__dbt__cte__events as (select id, type from events),
with internal_cte as (select * from sessions)
select * from internal_cte"
(Whitespace enhanced for readability.)
"""
if len(ctes) == 0:
return sql
parsed_stmts = sqlparse.parse(sql)
parsed = parsed_stmts[0]
with_stmt = None
for token in parsed.tokens:
if token.is_keyword and token.normalized == 'WITH':
with_stmt = token
break
if with_stmt is None:
# no with stmt, add one, and inject CTEs right at the beginning
first_token = parsed.token_first()
with_stmt = sqlparse.sql.Token(sqlparse.tokens.Keyword, 'with')
parsed.insert_before(first_token, with_stmt)
else:
raise InternalException(
f'During compilation, found an uncompiled cte that '
f'was not an ephemeral model: {cte_id}'
# stmt exists, add a comma (which will come after injected CTEs)
trailing_comma = sqlparse.sql.Token(
sqlparse.tokens.Punctuation, ','
)
parsed.insert_after(with_stmt, trailing_comma)
token = sqlparse.sql.Token(
sqlparse.tokens.Keyword,
", ".join(c.sql for c in ctes)
)
parsed.insert_after(with_stmt, token)
return str(parsed)
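A sketch of the transformation this method performs; config is assumed to be an already-loaded RuntimeConfig, and InjectedCTE is the dataclass imported at the top of this module:

compiler = Compiler(config)  # config: a loaded RuntimeConfig (assumed)
sql = ("with internal_cte as (select * from sessions) "
       "select * from internal_cte")
ctes = [InjectedCTE(id='model.my_project.ephemeral_model',
                    sql=' __dbt__cte__ephemeral_model as (select 1 as id)')]
print(compiler._inject_ctes_into_sql(sql, ctes))
# roughly (exact whitespace depends on sqlparse):
#   with  __dbt__cte__ephemeral_model as (select 1 as id), internal_cte as
#   (select * from sessions) select * from internal_cte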
def _get_dbt_test_name(self) -> str:
return 'dbt__cte__internal_test'
# This method is called by the 'compile_node' method. Starting
# from the node that it is passed in, it will recursively call
# itself using the 'extra_ctes'. The 'ephemeral' models do
# not produce SQL that is executed directly, instead they
# are rolled up into the models that refer to them by
# inserting CTEs into the SQL.
def _recursively_prepend_ctes(
self,
model: NonSourceCompiledNode,
manifest: Manifest,
extra_context: Dict[str, Any],
extra_context: Optional[Dict[str, Any]],
) -> Tuple[NonSourceCompiledNode, List[InjectedCTE]]:
if model.compiled_sql is None:
raise RuntimeException(
'Cannot inject ctes into an unparsed node', model
)
if model.extra_ctes_injected:
return (model, model.extra_ctes)
if flags.STRICT_MODE:
if not isinstance(model, tuple(COMPILED_TYPES.values())):
raise InternalException(
f'Bad model type: {type(model)}'
)
# Just to make it plain that nothing is actually injected for this case
if not model.extra_ctes:
model.extra_ctes_injected = True
manifest.update_node(model)
return (model, model.extra_ctes)
# This stores the ctes which will all be recursively
# gathered and then "injected" into the model.
prepended_ctes: List[InjectedCTE] = []
dbt_test_name = self._get_dbt_test_name()
# extra_ctes are added to the model by
# RuntimeRefResolver.create_relation, which adds an
# extra_cte for every model relation which is an
# ephemeral model.
for cte in model.extra_ctes:
cte_model = self._get_compiled_model(
manifest,
cte.id,
extra_context,
)
cte_model, new_prepended_ctes = self._recursively_prepend_ctes(
cte_model, manifest, extra_context
)
_extend_prepended_ctes(prepended_ctes, new_prepended_ctes)
new_cte_name = add_ephemeral_model_prefix(cte_model.name)
sql = f' {new_cte_name} as (\n{cte_model.compiled_sql}\n)'
if cte.id == dbt_test_name:
sql = cte.sql
else:
if cte.id not in manifest.nodes:
raise InternalException(
f'During compilation, found a cte reference that '
f'could not be resolved: {cte.id}'
)
cte_model = manifest.nodes[cte.id]
if not cte_model.is_ephemeral_model:
raise InternalException(f'{cte.id} is not ephemeral')
# This model has already been compiled, so it's been
# through here before
if getattr(cte_model, 'compiled', False):
assert isinstance(cte_model,
tuple(COMPILED_TYPES.values()))
cte_model = cast(NonSourceCompiledNode, cte_model)
new_prepended_ctes = cte_model.extra_ctes
# if the cte_model isn't compiled, i.e. first time here
else:
# This is an ephemeral parsed model that we can compile.
# Compile and update the node
cte_model = self._compile_node(
cte_model, manifest, extra_context)
# recursively call this method
cte_model, new_prepended_ctes = \
self._recursively_prepend_ctes(
cte_model, manifest, extra_context
)
# Save compiled SQL file and sync manifest
self._write_node(cte_model)
manifest.sync_update_node(cte_model)
_extend_prepended_ctes(prepended_ctes, new_prepended_ctes)
new_cte_name = self.add_ephemeral_prefix(cte_model.name)
sql = f' {new_cte_name} as (\n{cte_model.compiled_sql}\n)'
_add_prepended_cte(prepended_ctes, InjectedCTE(id=cte.id, sql=sql))
model.prepend_ctes(prepended_ctes)
# We don't save injected_sql into compiled sql for ephemeral models
# because it will cause problems with processing of subsequent models.
# Ephemeral models do not produce executable SQL of their own.
if not model.is_ephemeral_model:
injected_sql = self._inject_ctes_into_sql(
model.compiled_sql,
prepended_ctes,
)
model.compiled_sql = injected_sql
model.extra_ctes_injected = True
model.extra_ctes = prepended_ctes
model.validate(model.to_dict())
manifest.update_node(model)
return model, prepended_ctes
def compile_node(
self, node: NonSourceNode, manifest, extra_context=None
def _add_ctes(
self,
compiled_node: NonSourceCompiledNode,
manifest: Manifest,
extra_context: Dict[str, Any],
) -> NonSourceCompiledNode:
"""Wrap the data test SQL in a CTE."""
# for data tests, we need to insert a special CTE at the end of the
# list containing the test query, and then have the "real" query be a
# select count(*) from that model.
# the benefit of doing it this way is that _add_ctes() can be
# rewritten for different adapters to handle databases that don't
# support CTEs, or at least don't have full support.
if isinstance(compiled_node, CompiledDataTestNode):
# the last prepend (so last in order) should be the data test body.
# then we can add our select count(*) from _that_ cte as the "real"
# compiled_sql, and do the regular prepend logic from CTEs.
name = self._get_dbt_test_name()
cte = InjectedCTE(
id=name,
sql=f' {name} as (\n{compiled_node.compiled_sql}\n)'
)
compiled_node.extra_ctes.append(cte)
compiled_node.compiled_sql = f'\nselect count(*) from {name}'
return compiled_node
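Concretely, for a data test the effect of _add_ctes is the rewrite sketched below; the table name is made up and the CTE name comes from _get_dbt_test_name above:

test_body = 'select * from analytics.orders where amount < 0'
name = 'dbt__cte__internal_test'
extra_cte_sql = f' {name} as (\n{test_body}\n)'
compiled_sql = f'\nselect count(*) from {name}'
# After _recursively_prepend_ctes injects extra_cte_sql, the executed SQL is
# roughly: with dbt__cte__internal_test as (select * from analytics.orders
# where amount < 0) select count(*) from dbt__cte__internal_test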
# creates a compiled_node from the ManifestNode passed in,
# creates a "context" dictionary for jinja rendering,
# and then renders the "compiled_sql" using the node, the
# raw_sql and the context.
def _compile_node(
self,
node: ManifestNode,
manifest: Manifest,
extra_context: Optional[Dict[str, Any]] = None,
) -> NonSourceCompiledNode:
if extra_context is None:
extra_context = {}
@@ -237,7 +394,6 @@ class Compiler:
'compiled_sql': None,
'extra_ctes_injected': False,
'extra_ctes': [],
'injected_sql': None,
})
compiled_node = _compiled_type_for(node).from_dict(data)
@@ -248,15 +404,20 @@ class Compiler:
compiled_node.compiled_sql = jinja.get_rendered(
node.raw_sql,
context,
node)
node,
)
compiled_node.relation_name = self._get_relation_name(node)
compiled_node.compiled = True
injected_node, _ = self._recursively_prepend_ctes(
# add ctes for specific test nodes, and also for
# possible future use in adapters
compiled_node = self._add_ctes(
compiled_node, manifest, extra_context
)
return injected_node
return compiled_node
def write_graph_file(self, linker: Linker, manifest: Manifest):
filename = graph_file_name
@@ -265,7 +426,7 @@ class Compiler:
linker.write_graph(graph_path, manifest)
def link_node(
self, linker: Linker, node: NonSourceNode, manifest: Manifest
self, linker: Linker, node: GraphMemberNode, manifest: Manifest
):
linker.add_node(node.unique_id)
@@ -288,6 +449,9 @@ class Compiler:
linker.add_node(source.unique_id)
for node in manifest.nodes.values():
self.link_node(linker, node, manifest)
for exposure in manifest.exposures.values():
self.link_node(linker, exposure, manifest)
# linker.add_node(exposure.unique_id)
cycle = linker.find_cycles()
@@ -295,6 +459,7 @@ class Compiler:
raise RuntimeError("Found a cycle: {}".format(cycle))
def compile(self, manifest: Manifest, write=True) -> Graph:
self.initialize()
linker = Linker()
self.link_graph(linker, manifest)
@@ -307,35 +472,38 @@ class Compiler:
return Graph(linker.graph)
# writes the "compiled_sql" into the target/compiled directory
def _write_node(self, node: NonSourceCompiledNode) -> ManifestNode:
if (not node.extra_ctes_injected or
node.resource_type == NodeType.Snapshot):
return node
logger.debug(f'Writing injected SQL for node "{node.unique_id}"')
def compile_manifest(config, manifest, write=True) -> Graph:
compiler = Compiler(config)
compiler.initialize()
return compiler.compile(manifest, write=write)
if node.compiled_sql:
node.build_path = node.write_node(
self.config.target_path,
'compiled',
node.compiled_sql
)
return node
# This is the main entry point into this code. It's called by
# CompileRunner.compile, GenericRPCRunner.compile, and
# RunTask.get_hook_sql. It calls '_compile_node' to convert
# the node into a compiled node, and then calls the
# recursive method to "prepend" the ctes.
def compile_node(
self,
node: ManifestNode,
manifest: Manifest,
extra_context: Optional[Dict[str, Any]] = None,
write: bool = True,
) -> NonSourceCompiledNode:
node = self._compile_node(node, manifest, extra_context)
def _is_writable(node):
if not node.injected_sql:
return False
if node.resource_type == NodeType.Snapshot:
return False
return True
def compile_node(adapter, config, node, manifest, extra_context, write=True):
compiler = Compiler(config)
node = compiler.compile_node(node, manifest, extra_context)
if write and _is_writable(node):
logger.debug('Writing injected SQL for node "{}"'.format(
node.unique_id))
node.build_path = node.write_node(
config.target_path,
'compiled',
node.injected_sql
node, _ = self._recursively_prepend_ctes(
node, manifest, extra_context
)
return node
if write:
self._write_node(node)
return node
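The old module-level compile_node helper is gone; callers now go through the Compiler instance directly. A usage sketch, with config, manifest and node assumed to be already loaded by the task machinery:

compiler = Compiler(config)      # config: a loaded RuntimeConfig (assumed)
compiler.initialize()
compiled = compiler.compile_node(node, manifest, extra_context=None, write=True)
print(compiled.compiled_sql)     # includes any prepended ephemeral CTEs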


@@ -1,4 +1,4 @@
# all these are just exports, they need "noqa" so flake8 will not complain.
from .profile import Profile, PROFILES_DIR, read_user_config # noqa
from .project import Project # noqa
from .project import Project, IsFQNResource # noqa
from .runtime import RuntimeConfig, UnsetProfileConfig # noqa


@@ -81,6 +81,8 @@ def read_user_config(directory: str) -> UserConfig:
return UserConfig()
# The Profile class is included in RuntimeConfig, so any attribute
# additions must also be set where the RuntimeConfig class is created
@dataclass
class Profile(HasCredentials):
profile_name: str


@@ -2,10 +2,9 @@ from copy import deepcopy
from dataclasses import dataclass, field
from itertools import chain
from typing import (
List, Dict, Any, Optional, TypeVar, Union, Tuple, Callable, Mapping,
Iterable, Set
List, Dict, Any, Optional, TypeVar, Union, Mapping,
)
from typing_extensions import Protocol
from typing_extensions import Protocol, runtime_checkable
import hashlib
import os
@@ -16,7 +15,6 @@ from dbt.clients.system import load_file_contents
from dbt.clients.yaml_helper import load_yaml_text
from dbt.contracts.connection import QueryComment
from dbt.exceptions import DbtProjectError
from dbt.exceptions import RecursionException
from dbt.exceptions import SemverException
from dbt.exceptions import validator_error_message
from dbt.exceptions import RuntimeException
@@ -25,13 +23,12 @@ from dbt.helper_types import NoValue
from dbt.semver import VersionSpecifier
from dbt.semver import versions_compatible
from dbt.version import get_installed_version
from dbt.utils import deep_map, MultiDict
from dbt.legacy_config_updater import ConfigUpdater, IsFQNResource
from dbt.utils import MultiDict
from dbt.node_types import NodeType
from dbt.config.selectors import SelectorDict
from dbt.contracts.project import (
ProjectV1 as ProjectV1Contract,
ProjectV2 as ProjectV2Contract,
parse_project_config,
Project as ProjectContract,
SemverString,
)
from dbt.contracts.project import PackageConfig
@@ -75,23 +72,11 @@ Validator Error:
"""
def _list_if_none(value):
if value is None:
value = []
return value
def _dict_if_none(value):
if value is None:
value = {}
return value
def _list_if_none_or_string(value):
value = _list_if_none(value)
if isinstance(value, str):
return [value]
return value
@runtime_checkable
class IsFQNResource(Protocol):
fqn: List[str]
resource_type: NodeType
package_name: str
def _load_yaml(path):
@@ -111,8 +96,8 @@ def package_data_from_root(project_root):
return packages_dict
def package_config_from_data(packages_data):
if packages_data is None:
def package_config_from_data(packages_data: Dict[str, Any]):
if not packages_data:
packages_data = {'packages': []}
try:
@@ -197,11 +182,69 @@ def _query_comment_from_cfg(
return cfg_query_comment
def validate_version(dbt_version: List[VersionSpecifier], project_name: str):
"""Ensure this package works with the installed version of dbt."""
installed = get_installed_version()
if not versions_compatible(*dbt_version):
msg = IMPOSSIBLE_VERSION_ERROR.format(
package=project_name,
version_spec=[
x.to_version_string() for x in dbt_version
]
)
raise DbtProjectError(msg)
if not versions_compatible(installed, *dbt_version):
msg = INVALID_VERSION_ERROR.format(
package=project_name,
installed=installed.to_version_string(),
version_spec=[
x.to_version_string() for x in dbt_version
]
)
raise DbtProjectError(msg)
def _get_required_version(
project_dict: Dict[str, Any],
verify_version: bool,
) -> List[VersionSpecifier]:
dbt_raw_version: Union[List[str], str] = '>=0.0.0'
required = project_dict.get('require-dbt-version')
if required is not None:
dbt_raw_version = required
try:
dbt_version = _parse_versions(dbt_raw_version)
except SemverException as e:
raise DbtProjectError(str(e)) from e
if verify_version:
# no name is also an error that we want to raise
if 'name' not in project_dict:
raise DbtProjectError(
'Required "name" field not present in project',
)
validate_version(dbt_version, project_dict['name'])
return dbt_version
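A small sketch of the new helper's behaviour; the project dict below is a minimal stand-in for a parsed dbt_project.yml:

project_dict = {
    'name': 'my_project',
    'require-dbt-version': ['>=0.19.0', '<0.20.0'],
}
specs = _get_required_version(project_dict, verify_version=False)
print([s.to_version_string() for s in specs])
# With verify_version=True, validate_version also checks the installed dbt
# version against the range and raises DbtProjectError on a mismatch.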
@dataclass
class PartialProject:
config_version: int = field(metadata=dict(
description='The version of the configuration file format'
))
class RenderComponents:
project_dict: Dict[str, Any] = field(
metadata=dict(description='The project dictionary')
)
packages_dict: Dict[str, Any] = field(
metadata=dict(description='The packages dictionary')
)
selectors_dict: Dict[str, Any] = field(
metadata=dict(description='The selectors dictionary')
)
@dataclass
class PartialProject(RenderComponents):
profile_name: Optional[str] = field(metadata=dict(
description='The unrendered profile name in the project, if set'
))
@@ -214,178 +257,58 @@ class PartialProject:
project_root: str = field(
metadata=dict(description='The root directory of the project'),
)
project_dict: Dict[str, Any]
def render(self, renderer):
packages_dict = package_data_from_root(self.project_root)
selectors_dict = selector_data_from_root(self.project_root)
return Project.render_from_dict(
self.project_root,
self.project_dict,
packages_dict,
selectors_dict,
renderer,
)
verify_version: bool = field(
metadata=dict(description=(
'If True, verify the dbt version matches the required version'
))
)
def render_profile_name(self, renderer) -> Optional[str]:
if self.profile_name is None:
return None
return renderer.render_value(self.profile_name)
class VarProvider(Protocol):
"""Var providers are tied to a particular Project."""
def vars_for(
self, node: IsFQNResource, adapter_type: str
) -> Mapping[str, Any]:
raise NotImplementedError(
f'vars_for not implemented for {type(self)}!'
)
def to_dict(self):
raise NotImplementedError(
f'to_dict not implemented for {type(self)}!'
)
class V1VarProvider(VarProvider):
def __init__(
def get_rendered(
self,
models: Dict[str, Any],
seeds: Dict[str, Any],
snapshots: Dict[str, Any],
) -> None:
self.models = models
self.seeds = seeds
self.snapshots = snapshots
self.sources: Dict[str, Any] = {}
renderer: DbtProjectYamlRenderer,
) -> RenderComponents:
def vars_for(
self, node: IsFQNResource, adapter_type: str
) -> Mapping[str, Any]:
updater = ConfigUpdater(adapter_type)
return updater.get_project_config(node, self).get('vars', {})
rendered_project = renderer.render_project(
self.project_dict, self.project_root
)
rendered_packages = renderer.render_packages(self.packages_dict)
rendered_selectors = renderer.render_selectors(self.selectors_dict)
def to_dict(self):
raise ValidationError(
'to_dict was called on a v1 vars, but it should only be called '
'on v2 vars'
return RenderComponents(
project_dict=rendered_project,
packages_dict=rendered_packages,
selectors_dict=rendered_selectors,
)
def render(self, renderer: DbtProjectYamlRenderer) -> 'Project':
try:
rendered = self.get_rendered(renderer)
return self.create_project(rendered)
except DbtProjectError as exc:
if exc.path is None:
exc.path = os.path.join(self.project_root, 'dbt_project.yml')
raise
class V2VarProvider(VarProvider):
def __init__(
self,
vars: Dict[str, Dict[str, Any]]
) -> None:
self.vars = vars
def vars_for(
self, node: IsFQNResource, adapter_type: str
) -> Mapping[str, Any]:
# in v2, vars are only either project or globally scoped
merged = MultiDict([self.vars])
merged.add(self.vars.get(node.package_name, {}))
return merged
def to_dict(self):
return self.vars
@dataclass
class Project:
project_name: str
version: Union[SemverString, float]
project_root: str
profile_name: Optional[str]
source_paths: List[str]
macro_paths: List[str]
data_paths: List[str]
test_paths: List[str]
analysis_paths: List[str]
docs_paths: List[str]
asset_paths: List[str]
target_path: str
snapshot_paths: List[str]
clean_targets: List[str]
log_path: str
modules_path: str
quoting: Dict[str, Any]
models: Dict[str, Any]
on_run_start: List[str]
on_run_end: List[str]
seeds: Dict[str, Any]
snapshots: Dict[str, Any]
sources: Dict[str, Any]
vars: VarProvider
dbt_version: List[VersionSpecifier]
packages: Dict[str, Any]
selectors: SelectorConfig
query_comment: QueryComment
config_version: int
@property
def all_source_paths(self) -> List[str]:
return _all_source_paths(
self.source_paths, self.data_paths, self.snapshot_paths,
self.analysis_paths, self.macro_paths
def create_project(self, rendered: RenderComponents) -> 'Project':
unrendered = RenderComponents(
project_dict=self.project_dict,
packages_dict=self.packages_dict,
selectors_dict=self.selectors_dict,
)
dbt_version = _get_required_version(
rendered.project_dict,
verify_version=self.verify_version,
)
@staticmethod
def _preprocess(project_dict: Dict[str, Any]) -> Dict[str, Any]:
"""Pre-process certain special keys to convert them from None values
into empty containers, and to turn strings into arrays of strings.
"""
handlers: Dict[Tuple[Union[str, int], ...], Callable[[Any], Any]] = {
('on-run-start',): _list_if_none_or_string,
('on-run-end',): _list_if_none_or_string,
}
for k in ('models', 'seeds', 'snapshots'):
handlers[(k,)] = _dict_if_none
handlers[(k, 'vars')] = _dict_if_none
handlers[(k, 'pre-hook')] = _list_if_none_or_string
handlers[(k, 'post-hook')] = _list_if_none_or_string
handlers[('seeds', 'column_types')] = _dict_if_none
def converter(value: Any, keypath: Tuple[Union[str, int], ...]) -> Any:
if keypath in handlers:
handler = handlers[keypath]
return handler(value)
else:
return value
return deep_map(converter, project_dict)
@classmethod
def from_project_config(
cls,
project_dict: Dict[str, Any],
packages_dict: Optional[Dict[str, Any]] = None,
selectors_dict: Optional[Dict[str, Any]] = None,
) -> 'Project':
"""Create a project from its project and package configuration, as read
by yaml.safe_load().
:param project_dict: The dictionary as read from disk
:param packages_dict: If it exists, the packages file as
read from disk.
:raises DbtProjectError: If the project is missing or invalid, or if
the packages file exists and is invalid.
:returns: The project, with defaults populated.
"""
try:
project_dict = cls._preprocess(project_dict)
except RecursionException:
raise DbtProjectError(
'Cycle detected: Project input has a reference to itself',
project=project_dict
)
try:
cfg = parse_project_config(project_dict)
cfg = ProjectContract.from_dict(rendered.project_dict)
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
# name/version are required in the Project definition, so we can assume
# they are present
name = cfg.name
@@ -431,58 +354,31 @@ class Project:
sources: Dict[str, Any]
vars_value: VarProvider
if cfg.config_version == 1:
assert isinstance(cfg, ProjectV1Contract)
# extract everything named 'vars'
models = cfg.models
seeds = cfg.seeds
snapshots = cfg.snapshots
sources = {}
vars_value = V1VarProvider(
models=models, seeds=seeds, snapshots=snapshots
)
elif cfg.config_version == 2:
assert isinstance(cfg, ProjectV2Contract)
models = cfg.models
seeds = cfg.seeds
snapshots = cfg.snapshots
sources = cfg.sources
if cfg.vars is None:
vars_dict: Dict[str, Any] = {}
else:
vars_dict = cfg.vars
vars_value = V2VarProvider(vars_dict)
models = cfg.models
seeds = cfg.seeds
snapshots = cfg.snapshots
sources = cfg.sources
if cfg.vars is None:
vars_dict: Dict[str, Any] = {}
else:
raise ValidationError(
f'Got unsupported config_version={cfg.config_version}'
)
vars_dict = cfg.vars
vars_value = VarProvider(vars_dict)
on_run_start: List[str] = value_or(cfg.on_run_start, [])
on_run_end: List[str] = value_or(cfg.on_run_end, [])
# weird type handling: no value_or use
dbt_raw_version: Union[List[str], str] = '>=0.0.0'
if cfg.require_dbt_version is not None:
dbt_raw_version = cfg.require_dbt_version
query_comment = _query_comment_from_cfg(cfg.query_comment)
try:
dbt_version = _parse_versions(dbt_raw_version)
except SemverException as e:
raise DbtProjectError(str(e)) from e
packages = package_config_from_data(rendered.packages_dict)
selectors = selector_config_from_data(rendered.selectors_dict)
manifest_selectors: Dict[str, Any] = {}
if rendered.selectors_dict and rendered.selectors_dict['selectors']:
# this is a dict with a single key 'selectors' pointing to a list
# of dicts.
manifest_selectors = SelectorDict.parse_from_selectors_list(
rendered.selectors_dict['selectors'])
try:
packages = package_config_from_data(packages_dict)
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
try:
selectors = selector_config_from_data(selectors_dict)
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
project = cls(
project = Project(
project_name=name,
version=version,
project_root=project_root,
@@ -507,16 +403,131 @@ class Project:
snapshots=snapshots,
dbt_version=dbt_version,
packages=packages,
manifest_selectors=manifest_selectors,
selectors=selectors,
query_comment=query_comment,
sources=sources,
vars=vars_value,
config_version=cfg.config_version,
unrendered=unrendered,
)
# sanity check - this means an internal issue
project.validate()
return project
@classmethod
def from_dicts(
cls,
project_root: str,
project_dict: Dict[str, Any],
packages_dict: Dict[str, Any],
selectors_dict: Dict[str, Any],
*,
verify_version: bool = False,
):
"""Construct a partial project from its constituent dicts.
"""
project_name = project_dict.get('name')
profile_name = project_dict.get('profile')
return cls(
profile_name=profile_name,
project_name=project_name,
project_root=project_root,
project_dict=project_dict,
packages_dict=packages_dict,
selectors_dict=selectors_dict,
verify_version=verify_version,
)
@classmethod
def from_project_root(
cls, project_root: str, *, verify_version: bool = False
) -> 'PartialProject':
project_root = os.path.normpath(project_root)
project_dict = _raw_project_from(project_root)
config_version = project_dict.get('config-version', 1)
if config_version != 2:
raise DbtProjectError(
f'Invalid config version: {config_version}, expected 2',
path=os.path.join(project_root, 'dbt_project.yml')
)
packages_dict = package_data_from_root(project_root)
selectors_dict = selector_data_from_root(project_root)
return cls.from_dicts(
project_root=project_root,
project_dict=project_dict,
selectors_dict=selectors_dict,
packages_dict=packages_dict,
verify_version=verify_version,
)
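Put together, loading is now a two-step flow: build a PartialProject from the raw dicts on disk, then render it into a Project. A usage sketch; the path is illustrative and the renderer context normally comes from the profile/target context rather than an empty dict:

partial = PartialProject.from_project_root('/path/to/project', verify_version=True)
renderer = DbtProjectYamlRenderer(context={})
project = partial.render(renderer)
print(project.project_name, project.config_version)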
class VarProvider:
"""Var providers are tied to a particular Project."""
def __init__(
self,
vars: Dict[str, Dict[str, Any]]
) -> None:
self.vars = vars
def vars_for(
self, node: IsFQNResource, adapter_type: str
) -> Mapping[str, Any]:
# in v2, vars are only either project or globally scoped
merged = MultiDict([self.vars])
merged.add(self.vars.get(node.package_name, {}))
return merged
def to_dict(self):
return self.vars
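A small illustration of the scoping in vars_for: globally scoped entries are merged with the entries scoped to the node's package. The node below is a bare stand-in that only provides the attributes IsFQNResource requires:

class _FakeNode:
    fqn = ['my_package', 'my_model']
    resource_type = NodeType.Model
    package_name = 'my_package'

provider = VarProvider({
    'start_date': '2021-01-01',                  # globally scoped
    'my_package': {'start_date': '2021-02-01'},  # scoped to one package
})
merged = provider.vars_for(_FakeNode(), adapter_type='postgres')
print(merged['start_date'])  # resolved via dbt.utils.MultiDict precedence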
# The Project class is included in RuntimeConfig, so any attribute
# additions must also be set where the RuntimeConfig class is created
@dataclass
class Project:
project_name: str
version: Union[SemverString, float]
project_root: str
profile_name: Optional[str]
source_paths: List[str]
macro_paths: List[str]
data_paths: List[str]
test_paths: List[str]
analysis_paths: List[str]
docs_paths: List[str]
asset_paths: List[str]
target_path: str
snapshot_paths: List[str]
clean_targets: List[str]
log_path: str
modules_path: str
quoting: Dict[str, Any]
models: Dict[str, Any]
on_run_start: List[str]
on_run_end: List[str]
seeds: Dict[str, Any]
snapshots: Dict[str, Any]
sources: Dict[str, Any]
vars: VarProvider
dbt_version: List[VersionSpecifier]
packages: Dict[str, Any]
manifest_selectors: Dict[str, Any]
selectors: SelectorConfig
query_comment: QueryComment
config_version: int
unrendered: RenderComponents
@property
def all_source_paths(self) -> List[str]:
return _all_source_paths(
self.source_paths, self.data_paths, self.snapshot_paths,
self.analysis_paths, self.macro_paths
)
def __str__(self):
cfg = self.to_project_config(with_packages=True)
return str(cfg)
@@ -558,6 +569,8 @@ class Project:
'on-run-end': self.on_run_end,
'seeds': self.seeds,
'snapshots': self.snapshots,
'sources': self.sources,
'vars': self.vars.to_dict(),
'require-dbt-version': [
v.to_version_string() for v in self.dbt_version
],
@@ -569,20 +582,23 @@ class Project:
if with_packages:
result.update(self.packages.to_dict())
if self.config_version == 2:
result.update({
'sources': self.sources,
'vars': self.vars.to_dict()
})
return result
def validate(self):
try:
ProjectV2Contract.from_dict(self.to_project_config())
ProjectContract.from_dict(self.to_project_config())
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
@classmethod
def partial_load(
cls, project_root: str, *, verify_version: bool = False
) -> PartialProject:
return PartialProject.from_project_root(
project_root,
verify_version=verify_version,
)
@classmethod
def render_from_dict(
cls,
@@ -591,102 +607,32 @@ class Project:
packages_dict: Dict[str, Any],
selectors_dict: Dict[str, Any],
renderer: DbtProjectYamlRenderer,
*,
verify_version: bool = False
) -> 'Project':
rendered_project = renderer.render_data(project_dict)
rendered_project['project-root'] = project_root
package_renderer = renderer.get_package_renderer()
rendered_packages = package_renderer.render_data(packages_dict)
selectors_renderer = renderer.get_selector_renderer()
rendered_selectors = selectors_renderer.render_data(selectors_dict)
try:
return cls.from_project_config(
rendered_project,
rendered_packages,
rendered_selectors,
)
except DbtProjectError as exc:
if exc.path is None:
exc.path = os.path.join(project_root, 'dbt_project.yml')
raise
@classmethod
def partial_load(
cls, project_root: str
) -> PartialProject:
project_root = os.path.normpath(project_root)
project_dict = _raw_project_from(project_root)
project_name = project_dict.get('name')
profile_name = project_dict.get('profile')
config_version = project_dict.get('config-version', 1)
return PartialProject(
config_version=config_version,
profile_name=profile_name,
project_name=project_name,
partial = PartialProject.from_dicts(
project_root=project_root,
project_dict=project_dict,
packages_dict=packages_dict,
selectors_dict=selectors_dict,
verify_version=verify_version,
)
return partial.render(renderer)
@classmethod
def from_project_root(
cls, project_root: str, renderer: DbtProjectYamlRenderer
cls,
project_root: str,
renderer: DbtProjectYamlRenderer,
*,
verify_version: bool = False,
) -> 'Project':
partial = cls.partial_load(project_root)
renderer.version = partial.config_version
partial = cls.partial_load(project_root, verify_version=verify_version)
return partial.render(renderer)
def hashed_name(self):
return hashlib.md5(self.project_name.encode('utf-8')).hexdigest()
def validate_version(self):
"""Ensure this package works with the installed version of dbt."""
installed = get_installed_version()
if not versions_compatible(*self.dbt_version):
msg = IMPOSSIBLE_VERSION_ERROR.format(
package=self.project_name,
version_spec=[
x.to_version_string() for x in self.dbt_version
]
)
raise DbtProjectError(msg)
if not versions_compatible(installed, *self.dbt_version):
msg = INVALID_VERSION_ERROR.format(
package=self.project_name,
installed=installed.to_version_string(),
version_spec=[
x.to_version_string() for x in self.dbt_version
]
)
raise DbtProjectError(msg)
def as_v1(self, all_projects: Iterable[str]):
if self.config_version == 1:
return self
dct = self.to_project_config()
mutated = deepcopy(dct)
# remove sources, it doesn't exist
mutated.pop('sources', None)
common_config_keys = ['models', 'seeds', 'snapshots']
if 'vars' in dct and isinstance(dct['vars'], dict):
v2_vars_to_v1(mutated, dct['vars'], set(all_projects))
# ok, now we want to look through all the existing cfgkeys and mirror
# it, except expand the '+' prefix.
for cfgkey in common_config_keys:
if cfgkey not in dct:
continue
mutated[cfgkey] = _flatten_config(dct[cfgkey])
mutated['config-version'] = 1
project = Project.from_project_config(mutated)
project.packages = self.packages
return project
def get_selector(self, name: str) -> SelectionSpec:
if name not in self.selectors:
raise RuntimeException(
@@ -694,45 +640,3 @@ class Project:
f'{list(self.selectors)}'
)
return self.selectors[name]
def v2_vars_to_v1(
dst: Dict[str, Any], src_vars: Dict[str, Any], project_names: Set[str]
) -> None:
# stuff any 'vars' entries into the old-style
# models/seeds/snapshots dicts
common_config_keys = ['models', 'seeds', 'snapshots']
for project_name in project_names:
for cfgkey in common_config_keys:
if cfgkey not in dst:
dst[cfgkey] = {}
if project_name not in dst[cfgkey]:
dst[cfgkey][project_name] = {}
project_type_cfg = dst[cfgkey][project_name]
if 'vars' not in project_type_cfg:
project_type_cfg['vars'] = {}
project_type_vars = project_type_cfg['vars']
project_type_vars.update({
k: v for k, v in src_vars.items()
if not isinstance(v, dict)
})
items = src_vars.get(project_name, None)
if isinstance(items, dict):
project_type_vars.update(items)
# remove this from the v1 form
dst.pop('vars')
def _flatten_config(dct: Dict[str, Any]):
result = {}
for key, value in dct.items():
if isinstance(value, dict) and not key.startswith('+'):
result[key] = _flatten_config(value)
else:
if key.startswith('+'):
key = key[1:]
result[key] = value
return result


@@ -1,4 +1,4 @@
from typing import Dict, Any, Tuple, Optional, Union
from typing import Dict, Any, Tuple, Optional, Union, Callable
from dbt.clients.jinja import get_rendered, catch_jinja
@@ -55,12 +55,49 @@ class BaseRenderer:
)
def _list_if_none(value):
if value is None:
value = []
return value
def _dict_if_none(value):
if value is None:
value = {}
return value
def _list_if_none_or_string(value):
value = _list_if_none(value)
if isinstance(value, str):
return [value]
return value
class ProjectPostprocessor(Dict[Keypath, Callable[[Any], Any]]):
def __init__(self):
super().__init__()
self[('on-run-start',)] = _list_if_none_or_string
self[('on-run-end',)] = _list_if_none_or_string
for k in ('models', 'seeds', 'snapshots'):
self[(k,)] = _dict_if_none
self[(k, 'vars')] = _dict_if_none
self[(k, 'pre-hook')] = _list_if_none_or_string
self[(k, 'post-hook')] = _list_if_none_or_string
self[('seeds', 'column_types')] = _dict_if_none
def postprocess(self, value: Any, key: Keypath) -> Any:
if key in self:
handler = self[key]
return handler(value)
return value
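The postprocessor normalizes a handful of keypaths after rendering: bare string hooks become one-element lists and missing dicts become empty dicts. A quick check, with keypaths written as the tuples render_entry passes in:

pp = ProjectPostprocessor()
assert pp.postprocess(None, ('on-run-start',)) == []
assert pp.postprocess('vacuum analyze', ('on-run-end',)) == ['vacuum analyze']
assert pp.postprocess(None, ('models', 'vars')) == {}
assert pp.postprocess('unchanged', ('name',)) == 'unchanged'  # no handler registered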
class DbtProjectYamlRenderer(BaseRenderer):
def __init__(
self, context: Dict[str, Any], version: Optional[int] = None
) -> None:
super().__init__(context)
self.version: Optional[int] = version
_KEYPATH_HANDLERS = ProjectPostprocessor()
@property
def name(self):
@@ -72,26 +109,30 @@ class DbtProjectYamlRenderer(BaseRenderer):
def get_selector_renderer(self) -> BaseRenderer:
return SelectorRenderer(self.context)
def should_render_keypath_v1(self, keypath: Keypath) -> bool:
if not keypath:
return True
def render_project(
self,
project: Dict[str, Any],
project_root: str,
) -> Dict[str, Any]:
"""Render the project and insert the project root after rendering."""
rendered_project = self.render_data(project)
rendered_project['project-root'] = project_root
return rendered_project
first = keypath[0]
# run hooks
if first in {'on-run-start', 'on-run-end', 'query-comment'}:
return False
# models have two things to avoid
if first in {'seeds', 'models', 'snapshots'}:
# model-level hooks
if 'pre-hook' in keypath or 'post-hook' in keypath:
return False
# model-level 'vars' declarations
if 'vars' in keypath:
return False
def render_packages(self, packages: Dict[str, Any]):
"""Render the given packages dict"""
package_renderer = self.get_package_renderer()
return package_renderer.render_data(packages)
return True
def render_selectors(self, selectors: Dict[str, Any]):
selector_renderer = self.get_selector_renderer()
return selector_renderer.render_data(selectors)
def should_render_keypath_v2(self, keypath: Keypath) -> bool:
def render_entry(self, value: Any, keypath: Keypath) -> Any:
result = super().render_entry(value, keypath)
return self._KEYPATH_HANDLERS.postprocess(result, keypath)
def should_render_keypath(self, keypath: Keypath) -> bool:
if not keypath:
return True
@@ -115,26 +156,6 @@ class DbtProjectYamlRenderer(BaseRenderer):
return True
def should_render_keypath(self, keypath: Keypath) -> bool:
if self.version == 2:
return self.should_render_keypath_v2(keypath)
else: # could be None
return self.should_render_keypath_v1(keypath)
def render_data(
self, data: Dict[str, Any]
) -> Dict[str, Any]:
if self.version is None:
self.version = data.get('current-version')
try:
return deep_map(self.render_entry, data)
except RecursionException:
raise DbtProjectError(
f'Cycle detected: {self.name} input has a reference to itself',
project=data
)
class ProfileRenderer(BaseRenderer):
@property


@@ -32,7 +32,6 @@ from dbt.exceptions import (
warn_or_error,
raise_compiler_error
)
from dbt.legacy_config_updater import ConfigUpdater
from hologram import ValidationError
@@ -107,11 +106,13 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
snapshots=project.snapshots,
dbt_version=project.dbt_version,
packages=project.packages,
manifest_selectors=project.manifest_selectors,
selectors=project.selectors,
query_comment=project.query_comment,
sources=project.sources,
vars=project.vars,
config_version=project.config_version,
unrendered=project.unrendered,
profile_name=profile.profile_name,
target_name=profile.target_name,
config=profile.config,
@@ -138,7 +139,11 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
# load the new project and its packages. Don't pass cli variables.
renderer = DbtProjectYamlRenderer(generate_target_context(profile, {}))
project = Project.from_project_root(project_root, renderer)
project = Project.from_project_root(
project_root,
renderer,
verify_version=getattr(self.args, 'version_check', False),
)
cfg = self.from_parts(
project=project,
@@ -173,9 +178,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
if getattr(self.args, 'version_check', False):
self.validate_version()
@classmethod
def _get_rendered_profile(
cls,
@@ -193,7 +195,11 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
) -> Tuple[Project, Profile]:
# profile_name from the project
project_root = args.project_dir if args.project_dir else os.getcwd()
partial = Project.partial_load(project_root)
version_check = getattr(args, 'version_check', False)
partial = Project.partial_load(
project_root,
verify_version=version_check
)
# build the profile using the base renderer and the one fact we know
cli_vars: Dict[str, Any] = parse_cli_vars(getattr(args, 'vars', '{}'))
@@ -207,7 +213,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
# get a new renderer using our target information and render the
# project
ctx = generate_target_context(profile, cli_vars)
project_renderer = DbtProjectYamlRenderer(ctx, partial.config_version)
project_renderer = DbtProjectYamlRenderer(ctx)
project = partial.render(project_renderer)
return (project, profile)
@@ -249,27 +255,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
paths.add(path)
return frozenset(paths)
def _get_v1_config_paths(
self,
config: Dict[str, Any],
path: FQNPath,
paths: MutableSet[FQNPath],
) -> PathSet:
keys = ConfigUpdater(self.credentials.type).ConfigKeys
for key, value in config.items():
if isinstance(value, dict):
if key in keys:
if path not in paths:
paths.add(path)
else:
self._get_v1_config_paths(value, path + (key,), paths)
else:
if path not in paths:
paths.add(path)
return frozenset(paths)
def _get_config_paths(
self,
config: Dict[str, Any],
@@ -279,10 +264,12 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
if paths is None:
paths = set()
if self.config_version == 2:
return self._get_v2_config_paths(config, path, paths)
else:
return self._get_v1_config_paths(config, path, paths)
for key, value in config.items():
if isinstance(value, dict) and not key.startswith('+'):
self._get_v2_config_paths(value, path + (key,), paths)
else:
paths.add(path)
return frozenset(paths)
def get_resource_config_paths(self) -> Dict[str, PathSet]:
"""Return a dictionary with 'seeds' and 'models' keys whose values are
@@ -355,6 +342,9 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
self.dependencies = all_projects
return self.dependencies
def clear_dependencies(self):
self.dependencies = None
def load_projects(
self, paths: Iterable[Path]
) -> Iterator[Tuple[str, 'RuntimeConfig']]:
@@ -378,17 +368,6 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
if path.is_dir() and not path.name.startswith('__'):
yield path
def as_v1(self, all_projects: Iterable[str]):
if self.config_version == 1:
return self
return self.from_parts(
project=Project.as_v1(self, all_projects),
profile=self,
args=self.args,
dependencies=self.dependencies,
)
class UnsetCredentials(Credentials):
def __init__(self):
@@ -505,11 +484,13 @@ class UnsetProfileConfig(RuntimeConfig):
snapshots=project.snapshots,
dbt_version=project.dbt_version,
packages=project.packages,
manifest_selectors=project.manifest_selectors,
selectors=project.selectors,
query_comment=project.query_comment,
sources=project.sources,
vars=project.vars,
config_version=project.config_version,
unrendered=project.unrendered,
profile_name='',
target_name='',
config=UnsetConfig(),


@@ -1,5 +1,6 @@
from pathlib import Path
from typing import Dict, Any, Optional
from typing import Dict, Any
import yaml
from hologram import ValidationError
@@ -14,6 +15,7 @@ from dbt.clients.yaml_helper import load_yaml_text
from dbt.contracts.selection import SelectorFile
from dbt.exceptions import DbtSelectorsError, RuntimeException
from dbt.graph import parse_from_selectors_definition, SelectionSpec
from dbt.graph.selector_spec import SelectionCriteria
MALFORMED_SELECTOR_ERROR = """\
The selectors.yml file in this project is malformed. Please double check
@@ -33,7 +35,17 @@ class SelectorConfig(Dict[str, SelectionSpec]):
try:
selector_file = SelectorFile.from_dict(data)
selectors = parse_from_selectors_definition(selector_file)
except (ValidationError, RuntimeException) as exc:
except ValidationError as exc:
yaml_sel_cfg = yaml.dump(exc.instance)
raise DbtSelectorsError(
f"Could not parse selector file data: \n{yaml_sel_cfg}\n"
f"Valid root-level selector definitions: "
f"union, intersection, string, dictionary. No lists. "
f"\nhttps://docs.getdbt.com/reference/node-selection/"
f"yaml-selectors",
result_type='invalid_selector'
) from exc
except RuntimeException as exc:
raise DbtSelectorsError(
f'Could not read selector file data: {exc}',
result_type='invalid_selector',
@@ -89,9 +101,9 @@ def selector_data_from_root(project_root: str) -> Dict[str, Any]:
def selector_config_from_data(
selectors_data: Optional[Dict[str, Any]]
selectors_data: Dict[str, Any]
) -> SelectorConfig:
if selectors_data is None:
if not selectors_data:
selectors_data = {'selectors': []}
try:
@@ -102,3 +114,67 @@ def selector_config_from_data(
result_type='invalid_selector',
) from e
return selectors
# These are utilities to clean up the dictionary created from
# selectors.yml by turning the cli-string format entries into
# normalized dictionary entries. It parallels the flow in
# dbt/graph/cli.py. If changes are made there, it might
# be necessary to make changes here. Ideally it would be
# good to combine the two flows into one at some point.
class SelectorDict:
@classmethod
def parse_dict_definition(cls, definition):
key = list(definition)[0]
value = definition[key]
if isinstance(value, list):
new_values = []
for sel_def in value:
new_value = cls.parse_from_definition(sel_def)
new_values.append(new_value)
value = new_values
if key == 'exclude':
definition = {key: value}
elif len(definition) == 1:
definition = {'method': key, 'value': value}
return definition
@classmethod
def parse_a_definition(cls, def_type, definition):
# this definition must be a list
new_dict = {def_type: []}
for sel_def in definition[def_type]:
if isinstance(sel_def, dict):
sel_def = cls.parse_from_definition(sel_def)
new_dict[def_type].append(sel_def)
elif isinstance(sel_def, str):
sel_def = SelectionCriteria.dict_from_single_spec(sel_def)
new_dict[def_type].append(sel_def)
else:
new_dict[def_type].append(sel_def)
return new_dict
@classmethod
def parse_from_definition(cls, definition):
if isinstance(definition, str):
definition = SelectionCriteria.dict_from_single_spec(definition)
elif 'union' in definition:
definition = cls.parse_a_definition('union', definition)
elif 'intersection' in definition:
definition = cls.parse_a_definition('intersection', definition)
elif isinstance(definition, dict):
definition = cls.parse_dict_definition(definition)
return definition
# This is the normal entrypoint of this code. Give it the
# list of selectors generated from the selectors.yml file.
@classmethod
def parse_from_selectors_list(cls, selectors):
selector_dict = {}
for selector in selectors:
sel_name = selector['name']
selector_dict[sel_name] = selector
definition = cls.parse_from_definition(selector['definition'])
selector_dict[sel_name]['definition'] = definition
return selector_dict
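A sketch of the normalization performed here, assuming a selectors list shaped like the one SelectorFile yields; the selector name and specs are made up:

selectors = [{
    'name': 'nightly',
    'definition': {'union': [
        'tag:nightly',
        {'method': 'path', 'value': 'models/staging'},
    ]},
}]
parsed = SelectorDict.parse_from_selectors_list(selectors)
# the cli-style string 'tag:nightly' has been expanded into a method/value
# dict, so every entry under 'union' is now a dictionary
print(parsed['nightly']['definition'])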


@@ -18,6 +18,7 @@ import yaml
# approaches which will extend well to potentially many modules
import pytz
import datetime
import re
def get_pytz_module_context() -> Dict[str, Any]:
@@ -42,10 +43,19 @@ def get_datetime_module_context() -> Dict[str, Any]:
}
def get_re_module_context() -> Dict[str, Any]:
context_exports = re.__all__
return {
name: getattr(re, name) for name in context_exports
}
def get_context_modules() -> Dict[str, Dict[str, Any]]:
return {
'pytz': get_pytz_module_context(),
'datetime': get_datetime_module_context(),
're': get_re_module_context(),
}
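This exposes the standard library re module to templates through modules.re, alongside pytz and datetime. A sketch of what the mapping contains:

modules = get_context_modules()
# every name in re.__all__ is mapped to the real function, so a template
# expression such as modules.re.match(...) resolves to re.match
match = modules['re']['match'](r'^stg_', 'stg_orders')
print(bool(match))  # True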
@@ -105,39 +115,39 @@ class Var:
cli_vars: Mapping[str, Any],
node: Optional[CompiledResource] = None
) -> None:
self.context: Mapping[str, Any] = context
self.cli_vars: Mapping[str, Any] = cli_vars
self.node: Optional[CompiledResource] = node
self.merged: Mapping[str, Any] = self._generate_merged()
self._context: Mapping[str, Any] = context
self._cli_vars: Mapping[str, Any] = cli_vars
self._node: Optional[CompiledResource] = node
self._merged: Mapping[str, Any] = self._generate_merged()
def _generate_merged(self) -> Mapping[str, Any]:
return self.cli_vars
return self._cli_vars
@property
def node_name(self):
if self.node is not None:
return self.node.name
if self._node is not None:
return self._node.name
else:
return '<Configuration>'
def get_missing_var(self, var_name):
dct = {k: self.merged[k] for k in self.merged}
dct = {k: self._merged[k] for k in self._merged}
pretty_vars = json.dumps(dct, sort_keys=True, indent=4)
msg = self.UndefinedVarError.format(
var_name, self.node_name, pretty_vars
)
raise_compiler_error(msg, self.node)
raise_compiler_error(msg, self._node)
def has_var(self, var_name: str):
return var_name in self.merged
return var_name in self._merged
def get_rendered_var(self, var_name):
raw = self.merged[var_name]
raw = self._merged[var_name]
# if bool/int/float/etc are passed in, don't compile anything
if not isinstance(raw, str):
return raw
return get_rendered(raw, self.context)
return get_rendered(raw, self._context)
def __call__(self, var_name, default=_VAR_NOTSET):
if self.has_var(var_name):


@@ -36,27 +36,26 @@ class ConfiguredVar(Var):
project_name: str,
):
super().__init__(context, config.cli_vars)
self.config = config
self.project_name = project_name
self._config = config
self._project_name = project_name
def __call__(self, var_name, default=Var._VAR_NOTSET):
my_config = self.config.load_dependencies()[self.project_name]
my_config = self._config.load_dependencies()[self._project_name]
# cli vars > active project > local project
if var_name in self.config.cli_vars:
return self.config.cli_vars[var_name]
if var_name in self._config.cli_vars:
return self._config.cli_vars[var_name]
if self.config.config_version == 2 and my_config.config_version == 2:
adapter_type = self.config.credentials.type
lookup = FQNLookup(self.project_name)
active_vars = self.config.vars.vars_for(lookup, adapter_type)
all_vars = MultiDict([active_vars])
adapter_type = self._config.credentials.type
lookup = FQNLookup(self._project_name)
active_vars = self._config.vars.vars_for(lookup, adapter_type)
all_vars = MultiDict([active_vars])
if self.config.project_name != my_config.project_name:
all_vars.add(my_config.vars.vars_for(lookup, adapter_type))
if self._config.project_name != my_config.project_name:
all_vars.add(my_config.vars.vars_for(lookup, adapter_type))
if var_name in all_vars:
return all_vars[var_name]
if var_name in all_vars:
return all_vars[var_name]
if default is not Var._VAR_NOTSET:
return default


@@ -1,11 +1,11 @@
from abc import abstractmethod
from copy import deepcopy
from dataclasses import dataclass
from typing import List, Iterator, Dict, Any, TypeVar, Union
from typing import List, Iterator, Dict, Any, TypeVar, Generic
from dbt.config import RuntimeConfig, Project
from dbt.config import RuntimeConfig, Project, IsFQNResource
from dbt.contracts.graph.model_config import BaseConfig, get_config_for
from dbt.exceptions import InternalException
from dbt.legacy_config_updater import ConfigUpdater, IsFQNResource
from dbt.node_types import NodeType
from dbt.utils import fqn_search
@@ -17,84 +17,66 @@ class ModelParts(IsFQNResource):
package_name: str
class LegacyContextConfig:
def __init__(
self,
active_project: RuntimeConfig,
own_project: Project,
fqn: List[str],
node_type: NodeType,
):
self._config = None
self.active_project: RuntimeConfig = active_project
self.own_project: Project = own_project
T = TypeVar('T') # any old type
C = TypeVar('C', bound=BaseConfig)
self.model = ModelParts(
fqn=fqn,
resource_type=node_type,
package_name=self.own_project.project_name,
)
self.updater = ConfigUpdater(active_project.credentials.type)
class ConfigSource:
def __init__(self, project):
self.project = project
# the config options defined within the model
self.in_model_config: Dict[str, Any] = {}
def get_config_dict(self, resource_type: NodeType):
...
def get_default(self) -> Dict[str, Any]:
defaults = {"enabled": True, "materialized": "view"}
if self.model.resource_type == NodeType.Seed:
defaults['materialized'] = 'seed'
elif self.model.resource_type == NodeType.Snapshot:
defaults['materialized'] = 'snapshot'
class UnrenderedConfig(ConfigSource):
def __init__(self, project: Project):
self.project = project
if self.model.resource_type == NodeType.Test:
defaults['severity'] = 'ERROR'
return defaults
def build_config_dict(self, base: bool = False) -> Dict[str, Any]:
defaults = self.get_default()
active_config = self.load_config_from_active_project()
if self.active_project.project_name == self.own_project.project_name:
cfg = self.updater.merge(
defaults, active_config, self.in_model_config
)
def get_config_dict(self, resource_type: NodeType) -> Dict[str, Any]:
unrendered = self.project.unrendered.project_dict
if resource_type == NodeType.Seed:
model_configs = unrendered.get('seeds')
elif resource_type == NodeType.Snapshot:
model_configs = unrendered.get('snapshots')
elif resource_type == NodeType.Source:
model_configs = unrendered.get('sources')
else:
own_config = self.load_config_from_own_project()
model_configs = unrendered.get('models')
cfg = self.updater.merge(
defaults, own_config, self.in_model_config, active_config
)
return cfg
def _translate_adapter_aliases(self, config: Dict[str, Any]):
return self.active_project.credentials.translate_aliases(config)
def update_in_model_config(self, config: Dict[str, Any]) -> None:
config = self._translate_adapter_aliases(config)
self.updater.update_into(self.in_model_config, config)
def load_config_from_own_project(self) -> Dict[str, Any]:
return self.updater.get_project_config(self.model, self.own_project)
def load_config_from_active_project(self) -> Dict[str, Any]:
return self.updater.get_project_config(self.model, self.active_project)
if model_configs is None:
return {}
else:
return model_configs
T = TypeVar('T', bound=BaseConfig)
class RenderedConfig(ConfigSource):
def __init__(self, project: Project):
self.project = project
def get_config_dict(self, resource_type: NodeType) -> Dict[str, Any]:
if resource_type == NodeType.Seed:
model_configs = self.project.seeds
elif resource_type == NodeType.Snapshot:
model_configs = self.project.snapshots
elif resource_type == NodeType.Source:
model_configs = self.project.sources
else:
model_configs = self.project.models
return model_configs
class ContextConfigGenerator:
class BaseContextConfigGenerator(Generic[T]):
def __init__(self, active_project: RuntimeConfig):
self.active_project = active_project
self._active_project = active_project
def get_config_source(self, project: Project) -> ConfigSource:
return RenderedConfig(project)
def get_node_project(self, project_name: str):
if project_name == self.active_project.project_name:
return self.active_project
dependencies = self.active_project.load_dependencies()
if project_name == self._active_project.project_name:
return self._active_project
dependencies = self._active_project.load_dependencies()
if project_name not in dependencies:
raise InternalException(
f'Project name {project_name} not found in dependencies '
@@ -102,17 +84,11 @@ class ContextConfigGenerator:
)
return dependencies[project_name]
def project_configs(
def _project_configs(
self, project: Project, fqn: List[str], resource_type: NodeType
) -> Iterator[Dict[str, Any]]:
if resource_type == NodeType.Seed:
model_configs = project.seeds
elif resource_type == NodeType.Snapshot:
model_configs = project.snapshots
elif resource_type == NodeType.Source:
model_configs = project.sources
else:
model_configs = project.models
src = self.get_config_source(project)
model_configs = src.get_config_dict(resource_type)
for level_config in fqn_search(model_configs, fqn):
result = {}
for key, value in level_config.items():
@@ -123,20 +99,20 @@ class ContextConfigGenerator:
yield result
def active_project_configs(
def _active_project_configs(
self, fqn: List[str], resource_type: NodeType
) -> Iterator[Dict[str, Any]]:
return self.project_configs(self.active_project, fqn, resource_type)
return self._project_configs(self._active_project, fqn, resource_type)
@abstractmethod
def _update_from_config(
self, result: T, partial: Dict[str, Any], validate: bool = False
) -> T:
translated = self.active_project.credentials.translate_aliases(partial)
return result.update_from(
translated,
self.active_project.credentials.type,
validate=validate
)
...
@abstractmethod
def initial_result(self, resource_type: NodeType, base: bool) -> T:
...
def calculate_node_config(
self,
@@ -147,23 +123,120 @@ class ContextConfigGenerator:
base: bool,
) -> BaseConfig:
own_config = self.get_node_project(project_name)
result = self.initial_result(resource_type=resource_type, base=base)
project_configs = self._project_configs(own_config, fqn, resource_type)
for fqn_config in project_configs:
result = self._update_from_config(result, fqn_config)
for config_call in config_calls:
result = self._update_from_config(result, config_call)
if own_config.project_name != self._active_project.project_name:
for fqn_config in self._active_project_configs(fqn, resource_type):
result = self._update_from_config(result, fqn_config)
# this is mostly impactful in the snapshot config case
return result
@abstractmethod
def calculate_node_config_dict(
self,
config_calls: List[Dict[str, Any]],
fqn: List[str],
resource_type: NodeType,
project_name: str,
base: bool,
) -> Dict[str, Any]:
...
class ContextConfigGenerator(BaseContextConfigGenerator[C]):
def __init__(self, active_project: RuntimeConfig):
self._active_project = active_project
def get_config_source(self, project: Project) -> ConfigSource:
return RenderedConfig(project)
def initial_result(self, resource_type: NodeType, base: bool) -> C:
# defaults, own_config, config calls, active_config (if != own_config)
config_cls = get_config_for(resource_type, base=base)
# Calculate the defaults. We don't want to validate the defaults,
# because they might be invalid in the case of required config members
# (such as on snapshots!)
result = config_cls.from_dict({}, validate=False)
for fqn_config in self.project_configs(own_config, fqn, resource_type):
result = self._update_from_config(result, fqn_config)
for config_call in config_calls:
result = self._update_from_config(result, config_call)
return result
if own_config.project_name != self.active_project.project_name:
for fqn_config in self.active_project_configs(fqn, resource_type):
result = self._update_from_config(result, fqn_config)
def _update_from_config(
self, result: C, partial: Dict[str, Any], validate: bool = False
) -> C:
translated = self._active_project.credentials.translate_aliases(
partial
)
return result.update_from(
translated,
self._active_project.credentials.type,
validate=validate
)
# this is mostly impactful in the snapshot config case
return result.finalize_and_validate()
def calculate_node_config_dict(
self,
config_calls: List[Dict[str, Any]],
fqn: List[str],
resource_type: NodeType,
project_name: str,
base: bool,
) -> Dict[str, Any]:
config = self.calculate_node_config(
config_calls=config_calls,
fqn=fqn,
resource_type=resource_type,
project_name=project_name,
base=base,
)
finalized = config.finalize_and_validate()
return finalized.to_dict()
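# Illustrative aside (not part of the diff): a minimal sketch of the layering
# order used by calculate_node_config above -- defaults, then the node's own
# project configs, then in-model config() calls, then the active project's
# configs when it differs. Plain dicts stand in for the real config classes;
# the sample keys and values are hypothetical.
from typing import Any, Dict, List

def layer_configs(
    defaults: Dict[str, Any],
    own_project_configs: List[Dict[str, Any]],
    config_calls: List[Dict[str, Any]],
    active_project_configs: List[Dict[str, Any]],
) -> Dict[str, Any]:
    # Later layers win, mirroring the update order above.
    result = dict(defaults)
    for layer in (*own_project_configs, *config_calls, *active_project_configs):
        result.update(layer)
    return result

print(layer_configs(
    defaults={'materialized': 'view', 'enabled': True},
    own_project_configs=[{'materialized': 'table'}],
    config_calls=[{'tags': ['nightly']}],
    active_project_configs=[{'enabled': False}],
))
# -> {'materialized': 'table', 'enabled': False, 'tags': ['nightly']}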
class UnrenderedConfigGenerator(BaseContextConfigGenerator[Dict[str, Any]]):
def get_config_source(self, project: Project) -> ConfigSource:
return UnrenderedConfig(project)
def calculate_node_config_dict(
self,
config_calls: List[Dict[str, Any]],
fqn: List[str],
resource_type: NodeType,
project_name: str,
base: bool,
) -> Dict[str, Any]:
return self.calculate_node_config(
config_calls=config_calls,
fqn=fqn,
resource_type=resource_type,
project_name=project_name,
base=base,
)
def initial_result(
self,
resource_type: NodeType,
base: bool
) -> Dict[str, Any]:
return {}
def _update_from_config(
self,
result: Dict[str, Any],
partial: Dict[str, Any],
validate: bool = False,
) -> Dict[str, Any]:
translated = self._active_project.credentials.translate_aliases(
partial
)
result.update(translated)
return result
class ContextConfig:
@@ -174,23 +247,30 @@ class ContextConfig:
resource_type: NodeType,
project_name: str,
) -> None:
self.config_calls: List[Dict[str, Any]] = []
self.cfg_source = ContextConfigGenerator(active_project)
self.fqn = fqn
self.resource_type = resource_type
self.project_name = project_name
self._config_calls: List[Dict[str, Any]] = []
self._active_project = active_project
self._fqn = fqn
self._resource_type = resource_type
self._project_name = project_name
def update_in_model_config(self, opts: Dict[str, Any]) -> None:
self.config_calls.append(opts)
self._config_calls.append(opts)
def build_config_dict(self, base: bool = False) -> Dict[str, Any]:
return self.cfg_source.calculate_node_config(
config_calls=self.config_calls,
fqn=self.fqn,
resource_type=self.resource_type,
project_name=self.project_name,
def build_config_dict(
self,
base: bool = False,
*,
rendered: bool = True,
) -> Dict[str, Any]:
if rendered:
src = ContextConfigGenerator(self._active_project)
else:
src = UnrenderedConfigGenerator(self._active_project)
return src.calculate_node_config_dict(
config_calls=self._config_calls,
fqn=self._fqn,
resource_type=self._resource_type,
project_name=self._project_name,
base=base,
).to_dict()
ContextConfigType = Union[LegacyContextConfig, ContextConfig]
)


@@ -6,25 +6,27 @@ from typing import (
)
from typing_extensions import Protocol
from dbt import deprecations
from dbt.adapters.base.column import Column
from dbt.adapters.factory import get_adapter, get_adapter_package_names
from dbt.clients import agate_helper
from dbt.clients.jinja import get_rendered
from dbt.clients.jinja import get_rendered, MacroGenerator
from dbt.config import RuntimeConfig, Project
from .base import contextmember, contextproperty, Var
from .configured import FQNLookup
from .context_config import ContextConfigType
from .macros import MacroNamespaceBuilder
from .context_config import ContextConfig
from .macros import MacroNamespaceBuilder, MacroNamespace
from .manifest import ManifestContext
from dbt.contracts.graph.manifest import Manifest, Disabled
from dbt.contracts.connection import AdapterResponse
from dbt.contracts.graph.compiled import (
CompiledResource,
CompiledSeedNode,
NonSourceNode,
ManifestNode,
)
from dbt.contracts.graph.parsed import (
ParsedMacro,
ParsedExposure,
ParsedSeedNode,
ParsedSourceDefinition,
)
@@ -41,12 +43,12 @@ from dbt.exceptions import (
source_target_not_found,
wrapped_exports,
)
from dbt.legacy_config_updater import IsFQNResource
from dbt.config import IsFQNResource
from dbt.logger import GLOBAL_LOGGER as logger # noqa
from dbt.node_types import NodeType
from dbt.utils import (
add_ephemeral_model_prefix, merge, AttrDict, MultiDict
merge, AttrDict, MultiDict
)
import agate
@@ -58,23 +60,23 @@ _MISSING = object()
# base classes
class RelationProxy:
def __init__(self, adapter):
self.quoting_config = adapter.config.quoting
self.relation_type = adapter.Relation
self._quoting_config = adapter.config.quoting
self._relation_type = adapter.Relation
def __getattr__(self, key):
return getattr(self.relation_type, key)
return getattr(self._relation_type, key)
def create_from_source(self, *args, **kwargs):
# bypass our create when creating from source so as not to mess up
# the source quoting
return self.relation_type.create_from_source(*args, **kwargs)
return self._relation_type.create_from_source(*args, **kwargs)
def create(self, *args, **kwargs):
kwargs['quote_policy'] = merge(
self.quoting_config,
self._quoting_config,
kwargs.pop('quote_policy', {})
)
return self.relation_type.create(*args, **kwargs)
return self._relation_type.create(*args, **kwargs)
class BaseDatabaseWrapper:
@@ -82,22 +84,85 @@ class BaseDatabaseWrapper:
Wrapper for runtime database interaction. Applies the runtime quote policy
via a relation proxy.
"""
def __init__(self, adapter):
self.adapter = adapter
def __init__(self, adapter, namespace: MacroNamespace):
self._adapter = adapter
self.Relation = RelationProxy(adapter)
self._namespace = namespace
def __getattr__(self, name):
raise NotImplementedError('subclasses need to implement this')
@property
def config(self):
return self.adapter.config
return self._adapter.config
def type(self):
return self.adapter.type()
return self._adapter.type()
def commit(self):
return self.adapter.commit_if_has_connection()
return self._adapter.commit_if_has_connection()
def _get_adapter_macro_prefixes(self) -> List[str]:
# a future version of this could have plugins automatically fall back
# to their dependencies' dependencies by using
# `get_adapter_type_names` instead of `[self.config.credentials.type]`
search_prefixes = [self._adapter.type(), 'default']
return search_prefixes
def dispatch(
self, macro_name: str, packages: Optional[List[str]] = None
) -> MacroGenerator:
search_packages: List[Optional[str]]
if '.' in macro_name:
suggest_package, suggest_macro_name = macro_name.split('.', 1)
msg = (
f'In adapter.dispatch, got a macro name of "{macro_name}", '
f'but "." is not a valid macro name component. Did you mean '
f'`adapter.dispatch("{suggest_macro_name}", '
f'packages=["{suggest_package}"])`?'
)
raise CompilationException(msg)
if packages is None:
search_packages = [None]
elif isinstance(packages, str):
raise CompilationException(
f'In adapter.dispatch, got a string packages argument '
f'("{packages}"), but packages should be None or a list.'
)
else:
search_packages = packages
attempts = []
for package_name in search_packages:
for prefix in self._get_adapter_macro_prefixes():
search_name = f'{prefix}__{macro_name}'
try:
macro = self._namespace.get_from_package(
package_name, search_name
)
except CompilationException as exc:
raise CompilationException(
f'In dispatch: {exc.msg}',
) from exc
if package_name is None:
attempts.append(search_name)
else:
attempts.append(f'{package_name}.{search_name}')
if macro is not None:
return macro
searched = ', '.join(repr(a) for a in attempts)
msg = (
f"In dispatch: No macro named '{macro_name}' found\n"
f" Searched for: {searched}"
)
raise CompilationException(msg)
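# Illustrative aside (not part of the diff): the candidate names dispatch()
# tries, in order, for a given macro name, adapter type and optional package
# list. This only reproduces the search order of the loop above; the real
# lookup goes through the macro namespace. Example values are hypothetical.
from typing import List, Optional

def dispatch_candidates(
    macro_name: str,
    adapter_type: str,
    packages: Optional[List[str]] = None,
) -> List[str]:
    search_packages = packages if packages is not None else [None]
    prefixes = [adapter_type, 'default']  # as in _get_adapter_macro_prefixes()
    attempts = []
    for package_name in search_packages:
        for prefix in prefixes:
            name = f'{prefix}__{macro_name}'
            attempts.append(name if package_name is None else f'{package_name}.{name}')
    return attempts

print(dispatch_candidates('concat', 'snowflake', packages=['dbt_utils']))
# -> ['dbt_utils.snowflake__concat', 'dbt_utils.default__concat']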
class BaseResolver(metaclass=abc.ABCMeta):
@@ -190,13 +255,13 @@ class BaseSourceResolver(BaseResolver):
class Config(Protocol):
def __init__(self, model, context_config: Optional[ContextConfigType]):
def __init__(self, model, context_config: Optional[ContextConfig]):
...
# `config` implementations
class ParseConfigObject(Config):
def __init__(self, model, context_config: Optional[ContextConfigType]):
def __init__(self, model, context_config: Optional[ContextConfig]):
self.model = model
self.context_config = context_config
@@ -252,7 +317,7 @@ class ParseConfigObject(Config):
class RuntimeConfigObject(Config):
def __init__(
self, model, context_config: Optional[ContextConfigType] = None
self, model, context_config: Optional[ContextConfig] = None
):
self.model = model
# we never use or get a config, only the parser cares
@@ -316,14 +381,15 @@ class ParseDatabaseWrapper(BaseDatabaseWrapper):
"""The parser subclass of the database wrapper applies any explicit
parse-time overrides.
"""
def __getattr__(self, name):
override = (name in self.adapter._available_ and
name in self.adapter._parse_replacements_)
override = (name in self._adapter._available_ and
name in self._adapter._parse_replacements_)
if override:
return self.adapter._parse_replacements_[name]
elif name in self.adapter._available_:
return getattr(self.adapter, name)
return self._adapter._parse_replacements_[name]
elif name in self._adapter._available_:
return getattr(self._adapter, name)
else:
raise AttributeError(
"'{}' object has no attribute '{}'".format(
@@ -336,9 +402,10 @@ class RuntimeDatabaseWrapper(BaseDatabaseWrapper):
"""The runtime database wrapper exposes everything the adapter marks
available.
"""
def __getattr__(self, name):
if name in self.adapter._available_:
return getattr(self.adapter, name)
if name in self._adapter._available_:
return getattr(self._adapter, name)
else:
raise AttributeError(
"'{}' object has no attribute '{}'".format(
@@ -357,7 +424,7 @@ class ParseRefResolver(BaseRefResolver):
return self.Relation.create_from(self.config, self.model)
ResolveRef = Union[Disabled, NonSourceNode]
ResolveRef = Union[Disabled, ManifestNode]
class RuntimeRefResolver(BaseRefResolver):
@@ -381,26 +448,20 @@ class RuntimeRefResolver(BaseRefResolver):
self.validate(target_model, target_name, target_package)
return self.create_relation(target_model, target_name)
def create_ephemeral_relation(
self, target_model: NonSourceNode, name: str
) -> RelationProxy:
self.model.set_cte(target_model.unique_id, None)
return self.Relation.create(
type=self.Relation.CTE,
identifier=add_ephemeral_model_prefix(name)
).quote(identifier=False)
def create_relation(
self, target_model: NonSourceNode, name: str
self, target_model: ManifestNode, name: str
) -> RelationProxy:
if target_model.get_materialization() == 'ephemeral':
return self.create_ephemeral_relation(target_model, name)
if target_model.is_ephemeral_model:
self.model.set_cte(target_model.unique_id, None)
return self.Relation.create_ephemeral_from_node(
self.config, target_model
)
else:
return self.Relation.create_from(self.config, target_model)
def validate(
self,
resolved: NonSourceNode,
resolved: ManifestNode,
target_name: str,
target_package: Optional[str]
) -> None:
@@ -412,22 +473,25 @@ class RuntimeRefResolver(BaseRefResolver):
class OperationRefResolver(RuntimeRefResolver):
def validate(
self,
resolved: NonSourceNode,
resolved: ManifestNode,
target_name: str,
target_package: Optional[str],
) -> None:
pass
def create_ephemeral_relation(
self, target_model: NonSourceNode, name: str
def create_relation(
self, target_model: ManifestNode, name: str
) -> RelationProxy:
# In operations, we can't ref() ephemeral nodes, because ParsedMacros
# do not support set_cte
raise_compiler_error(
'Operations can not ref() ephemeral nodes, but {} is ephemeral'
.format(target_model.name),
self.model
)
if target_model.is_ephemeral_model:
# In operations, we can't ref() ephemeral nodes, because
# ParsedMacros do not support set_cte
raise_compiler_error(
'Operations can not ref() ephemeral nodes, but {} is ephemeral'
.format(target_model.name),
self.model
)
else:
return super().create_relation(target_model, name)
# `source` implementations
@@ -464,37 +528,37 @@ class ModelConfiguredVar(Var):
config: RuntimeConfig,
node: CompiledResource,
) -> None:
self.node: CompiledResource
self.config: RuntimeConfig = config
self._node: CompiledResource
self._config: RuntimeConfig = config
super().__init__(context, config.cli_vars, node=node)
def packages_for_node(self) -> Iterable[Project]:
dependencies = self.config.load_dependencies()
package_name = self.node.package_name
dependencies = self._config.load_dependencies()
package_name = self._node.package_name
if package_name != self.config.project_name:
if package_name != self._config.project_name:
if package_name not in dependencies:
# I don't think this is actually reachable
raise_compiler_error(
f'Node package named {package_name} not found!',
self.node
self._node
)
yield dependencies[package_name]
yield self.config
yield self._config
def _generate_merged(self) -> Mapping[str, Any]:
search_node: IsFQNResource
if isinstance(self.node, IsFQNResource):
search_node = self.node
if isinstance(self._node, IsFQNResource):
search_node = self._node
else:
search_node = FQNLookup(self.node.package_name)
search_node = FQNLookup(self._node.package_name)
adapter_type = self.config.credentials.type
adapter_type = self._config.credentials.type
merged = MultiDict()
for project in self.packages_for_node():
merged.add(project.vars.vars_for(search_node, adapter_type))
merged.add(self.cli_vars)
merged.add(self._cli_vars)
return merged
@@ -560,7 +624,7 @@ class ProviderContext(ManifestContext):
config: RuntimeConfig,
manifest: Manifest,
provider: Provider,
context_config: Optional[ContextConfigType],
context_config: Optional[ContextConfig],
) -> None:
if provider is None:
raise InternalException(
@@ -568,13 +632,15 @@ class ProviderContext(ManifestContext):
)
# mypy appeasement - we know it'll be a RuntimeConfig
self.config: RuntimeConfig
self.model: Union[ParsedMacro, NonSourceNode] = model
self.model: Union[ParsedMacro, ManifestNode] = model
super().__init__(config, manifest, model.package_name)
self.sql_results: Dict[str, AttrDict] = {}
self.context_config: Optional[ContextConfigType] = context_config
self.context_config: Optional[ContextConfig] = context_config
self.provider: Provider = provider
self.adapter = get_adapter(self.config)
self.db_wrapper = self.provider.DatabaseWrapper(self.adapter)
self.db_wrapper = self.provider.DatabaseWrapper(
self.adapter, self.namespace
)
def _get_namespace_builder(self):
internal_packages = get_adapter_package_names(
@@ -598,18 +664,33 @@ class ProviderContext(ManifestContext):
@contextmember
def store_result(
self, name: str, status: Any, agate_table: Optional[agate.Table] = None
self, name: str,
response: Any,
agate_table: Optional[agate.Table] = None
) -> str:
if agate_table is None:
agate_table = agate_helper.empty_table()
self.sql_results[name] = AttrDict({
'status': status,
'response': response,
'data': agate_helper.as_matrix(agate_table),
'table': agate_table
})
return ''
@contextmember
def store_raw_result(
self,
name: str,
message: Optional[str] = None,
code: Optional[str] = None,
rows_affected: Optional[str] = None,
agate_table: Optional[agate.Table] = None
) -> str:
response = AdapterResponse(
_message=message, code=code, rows_affected=rows_affected)
return self.store_result(name, response, agate_table)
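# Illustrative aside (not part of the diff): the shape store_result records per
# name after this change. Plain Python stand-ins are used for AttrDict,
# AdapterResponse and the agate table; only the key names follow the code above.
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class FakeAdapterResponse:
    _message: str
    code: Optional[str] = None
    rows_affected: Optional[int] = None

sql_results: Dict[str, Dict[str, Any]] = {}

def fake_store_result(name: str, response: Any, rows: Optional[List[List[Any]]] = None) -> str:
    sql_results[name] = {
        'response': response,   # an AdapterResponse in the real code
        'data': rows or [],     # agate_helper.as_matrix(agate_table) in the real code
        'table': rows,          # the agate table itself in the real code
    }
    return ''

fake_store_result('load_step', FakeAdapterResponse('OK', code='SELECT', rows_affected=3), [[1], [2], [3]])
print(sql_results['load_step']['response'].rows_affected)  # -> 3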
@contextproperty
def validation(self):
def validate_any(*args) -> Callable[[T], None]:
@@ -1045,13 +1126,6 @@ class ProviderContext(ManifestContext):
def sql_now(self) -> str:
return self.adapter.date_function()
def _get_adapter_macro_prefixes(self) -> List[str]:
# a future version of this could have plugins automatically fall back
# to their dependencies' dependencies by using
# `get_adapter_type_names` instead of `[self.config.credentials.type]`
search_prefixes = [self.config.credentials.type, 'default']
return search_prefixes
@contextmember
def adapter_macro(self, name: str, *args, **kwargs):
"""Find the most appropriate macro for the name, considering the
@@ -1096,38 +1170,24 @@ class ProviderContext(ManifestContext):
...
{%- endmacro %}
"""
original_name: str = name
package_name: Optional[str] = None
deprecations.warn('adapter-macro', macro_name=name)
original_name = name
package_names: Optional[List[str]] = None
if '.' in name:
package_name, name = name.split('.', 1)
package_names = [package_name]
attempts = []
for prefix in self._get_adapter_macro_prefixes():
search_name = f'{prefix}__{name}'
try:
macro = self.namespace.get_from_package(
package_name, search_name
)
except CompilationException as exc:
raise CompilationException(
f'In adapter_macro: {exc.msg}, original name '
f"'{original_name}'",
node=self.model,
) from exc
if package_name is None:
attempts.append(search_name)
else:
attempts.append(f'{package_name}.{search_name}')
if macro is not None:
return macro(*args, **kwargs)
searched = ', '.join(repr(a) for a in attempts)
raise_compiler_error(
f"In adapter_macro: No macro named '{name}' found\n"
f" Original name: '{original_name}'\n"
f" Searched for: {searched}"
)
try:
macro = self.db_wrapper.dispatch(
macro_name=name, packages=package_names
)
except CompilationException as exc:
raise CompilationException(
f'In adapter_macro: {exc.msg}\n'
f" Original name: '{original_name}'",
node=self.model
) from exc
return macro(*args, **kwargs)
class MacroContext(ProviderContext):
@@ -1138,6 +1198,7 @@ class MacroContext(ProviderContext):
- 'schema' does not use any 'model' information
- they can't be configured with config() directives
"""
def __init__(
self,
model: ParsedMacro,
@@ -1156,7 +1217,7 @@ class MacroContext(ProviderContext):
class ModelContext(ProviderContext):
model: NonSourceNode
model: ManifestNode
@contextproperty
def pre_hooks(self) -> List[Dict[str, Any]]:
@@ -1176,7 +1237,9 @@ class ModelContext(ProviderContext):
@contextproperty
def sql(self) -> Optional[str]:
return getattr(self.model, 'injected_sql', None)
if getattr(self.model, 'extra_ctes_injected', None):
return self.model.compiled_sql
return None
@contextproperty
def database(self) -> str:
@@ -1227,10 +1290,10 @@ class ModelContext(ProviderContext):
def generate_parser_model(
model: NonSourceNode,
model: ManifestNode,
config: RuntimeConfig,
manifest: Manifest,
context_config: ContextConfigType,
context_config: ContextConfig,
) -> Dict[str, Any]:
ctx = ModelContext(
model, config, manifest, ParseProvider(), context_config
@@ -1262,7 +1325,7 @@ def generate_generate_component_name_macro(
def generate_runtime_model(
model: NonSourceNode,
model: ManifestNode,
config: RuntimeConfig,
manifest: Manifest,
) -> Dict[str, Any]:
@@ -1282,3 +1345,45 @@ def generate_runtime_macro(
macro, config, manifest, OperationProvider(), package_name
)
return ctx.to_dict()
class ExposureRefResolver(BaseResolver):
def __call__(self, *args) -> str:
if len(args) not in (1, 2):
ref_invalid_args(self.model, args)
self.model.refs.append(list(args))
return ''
class ExposureSourceResolver(BaseResolver):
def __call__(self, *args) -> str:
if len(args) != 2:
raise_compiler_error(
f"source() takes exactly two arguments ({len(args)} given)",
self.model
)
self.model.sources.append(list(args))
return ''
def generate_parse_exposure(
exposure: ParsedExposure,
config: RuntimeConfig,
manifest: Manifest,
package_name: str,
) -> Dict[str, Any]:
project = config.load_dependencies()[package_name]
return {
'ref': ExposureRefResolver(
None,
exposure,
project,
manifest,
),
'source': ExposureSourceResolver(
None,
exposure,
project,
manifest,
)
}


@@ -22,6 +22,16 @@ Identifier = NewType('Identifier', str)
register_pattern(Identifier, r'^[A-Za-z_][A-Za-z0-9_]+$')
@dataclass
class AdapterResponse(JsonSchemaMixin):
_message: str
code: Optional[str] = None
rows_affected: Optional[int] = None
def __str__(self):
return self._message
class ConnectionState(StrEnum):
INIT = 'init'
OPEN = 'open'
@@ -85,6 +95,7 @@ class LazyHandle:
"""Opener must be a callable that takes a Connection object and opens the
connection, updating the handle on the Connection.
"""
def __init__(self, opener: Callable[[Connection], Connection]):
self.opener = opener
@@ -160,7 +171,7 @@ class Credentials(
class UserConfigContract(Protocol):
send_anonymous_usage_stats: bool
use_colors: bool
use_colors: Optional[bool]
partial_parse: Optional[bool]
printer_width: Optional[int]

core/dbt/contracts/files.py (new file)

@@ -0,0 +1,167 @@
import hashlib
import os
from dataclasses import dataclass, field
from typing import List, Optional, Union
from hologram import JsonSchemaMixin
from dbt.exceptions import InternalException
from .util import MacroKey, SourceKey
MAXIMUM_SEED_SIZE = 1 * 1024 * 1024
MAXIMUM_SEED_SIZE_NAME = '1MB'
@dataclass
class FilePath(JsonSchemaMixin):
searched_path: str
relative_path: str
project_root: str
@property
def search_key(self) -> str:
# TODO: should this be project name + path relative to project root?
return self.absolute_path
@property
def full_path(self) -> str:
# useful for symlink preservation
return os.path.join(
self.project_root, self.searched_path, self.relative_path
)
@property
def absolute_path(self) -> str:
return os.path.abspath(self.full_path)
@property
def original_file_path(self) -> str:
# this is mostly used for reporting errors. It doesn't show the project
# name, should it?
return os.path.join(
self.searched_path, self.relative_path
)
def seed_too_large(self) -> bool:
"""Return whether the file this represents is over the seed size limit
"""
return os.stat(self.full_path).st_size > MAXIMUM_SEED_SIZE
@dataclass
class FileHash(JsonSchemaMixin):
name: str # the hash type name
checksum: str # the hashlib.hash_type().hexdigest() of the file contents
@classmethod
def empty(cls):
return FileHash(name='none', checksum='')
@classmethod
def path(cls, path: str):
return FileHash(name='path', checksum=path)
def __eq__(self, other):
if not isinstance(other, FileHash):
return NotImplemented
if self.name == 'none' or self.name != other.name:
return False
return self.checksum == other.checksum
def compare(self, contents: str) -> bool:
"""Compare the file contents with the given hash"""
if self.name == 'none':
return False
return self.from_contents(contents, name=self.name) == self
@classmethod
def from_contents(cls, contents: str, name='sha256') -> 'FileHash':
"""Create a file hash from the given file contents. The hash is always
the utf-8 encoding of the contents given, because dbt only reads files
as utf-8.
"""
data = contents.encode('utf-8')
checksum = hashlib.new(name, data).hexdigest()
return cls(name=name, checksum=checksum)
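# Illustrative aside (not part of the diff): what FileHash.from_contents
# computes, using only hashlib from the standard library. The file contents
# below are made up.
import hashlib

contents = "select 1 as id"
checksum = hashlib.new('sha256', contents.encode('utf-8')).hexdigest()
# FileHash.from_contents(contents) would return
# FileHash(name='sha256', checksum=checksum); compare() re-hashes candidate
# contents with the same algorithm and checks the result against the stored hash.
print(len(checksum))  # -> 64 hex characters for sha256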
@dataclass
class RemoteFile(JsonSchemaMixin):
@property
def searched_path(self) -> str:
return 'from remote system'
@property
def relative_path(self) -> str:
return 'from remote system'
@property
def absolute_path(self) -> str:
return 'from remote system'
@property
def original_file_path(self):
return 'from remote system'
@dataclass
class SourceFile(JsonSchemaMixin):
"""Define a source file in dbt"""
path: Union[FilePath, RemoteFile] # the path information
checksum: FileHash
# we don't want to serialize this
_contents: Optional[str] = None
# the unique IDs contained in this file
nodes: List[str] = field(default_factory=list)
docs: List[str] = field(default_factory=list)
macros: List[str] = field(default_factory=list)
sources: List[str] = field(default_factory=list)
exposures: List[str] = field(default_factory=list)
# any node patches in this file. The entries are names, not unique ids!
patches: List[str] = field(default_factory=list)
# any macro patches in this file. The entries are package, name pairs.
macro_patches: List[MacroKey] = field(default_factory=list)
# any source patches in this file. The entries are package, name pairs
source_patches: List[SourceKey] = field(default_factory=list)
@property
def search_key(self) -> Optional[str]:
if isinstance(self.path, RemoteFile):
return None
if self.checksum.name == 'none':
return None
return self.path.search_key
@property
def contents(self) -> str:
if self._contents is None:
raise InternalException('SourceFile has no contents!')
return self._contents
@contents.setter
def contents(self, value):
self._contents = value
@classmethod
def empty(cls, path: FilePath) -> 'SourceFile':
self = cls(path=path, checksum=FileHash.empty())
self.contents = ''
return self
@classmethod
def big_seed(cls, path: FilePath) -> 'SourceFile':
"""Parse seeds over the size limit with just the path"""
self = cls(path=path, checksum=FileHash.path(path.original_file_path))
self.contents = ''
return self
@classmethod
def remote(cls, contents: str) -> 'SourceFile':
self = cls(path=RemoteFile(), checksum=FileHash.empty())
self.contents = contents
return self


@@ -5,6 +5,7 @@ from dbt.contracts.graph.parsed import (
ParsedDataTestNode,
ParsedHookNode,
ParsedModelNode,
ParsedExposure,
ParsedResource,
ParsedRPCNode,
ParsedSchemaTestNode,
@@ -13,14 +14,13 @@ from dbt.contracts.graph.parsed import (
ParsedSourceDefinition,
SeedConfig,
TestConfig,
same_seeds,
)
from dbt.node_types import NodeType
from dbt.contracts.util import Replaceable
from dbt.exceptions import RuntimeException
from hologram import JsonSchemaMixin
from dataclasses import dataclass, field
import sqlparse # type: ignore
from typing import Optional, List, Union, Dict, Type
@@ -42,20 +42,7 @@ class CompiledNode(ParsedNode, CompiledNodeMixin):
compiled_sql: Optional[str] = None
extra_ctes_injected: bool = False
extra_ctes: List[InjectedCTE] = field(default_factory=list)
injected_sql: Optional[str] = None
def prepend_ctes(self, prepended_ctes: List[InjectedCTE]):
self.extra_ctes_injected = True
self.extra_ctes = prepended_ctes
if self.compiled_sql is None:
raise RuntimeException(
'Cannot prepend ctes to an unparsed node', self
)
self.injected_sql = _inject_ctes_into_sql(
self.compiled_sql,
prepended_ctes,
)
self.validate(self.to_dict())
relation_name: Optional[str] = None
def set_cte(self, cte_id: str, sql: str):
"""This is the equivalent of what self.extra_ctes[cte_id] = sql would
@@ -94,6 +81,7 @@ class CompiledRPCNode(CompiledNode):
@dataclass
class CompiledSeedNode(CompiledNode):
# keep this in sync with ParsedSeedNode!
resource_type: NodeType = field(metadata={'restrict': [NodeType.Seed]})
config: SeedConfig = field(default_factory=SeedConfig)
@@ -102,6 +90,9 @@ class CompiledSeedNode(CompiledNode):
""" Seeds are never empty"""
return False
def same_body(self, other) -> bool:
return same_seeds(self, other)
@dataclass
class CompiledSnapshotNode(CompiledNode):
@@ -116,74 +107,34 @@ class CompiledDataTestNode(CompiledNode):
@dataclass
class CompiledSchemaTestNode(CompiledNode, HasTestMetadata):
# keep this in sync with ParsedSchemaTestNode!
resource_type: NodeType = field(metadata={'restrict': [NodeType.Test]})
column_name: Optional[str] = None
config: TestConfig = field(default_factory=TestConfig)
def same_config(self, other) -> bool:
return (
self.unrendered_config.get('severity') ==
other.unrendered_config.get('severity')
)
def same_column_name(self, other) -> bool:
return self.column_name == other.column_name
def same_contents(self, other) -> bool:
if other is None:
return False
return (
self.same_config(other) and
self.same_fqn(other) and
True
)
CompiledTestNode = Union[CompiledDataTestNode, CompiledSchemaTestNode]
def _inject_ctes_into_sql(sql: str, ctes: List[InjectedCTE]) -> str:
"""
`ctes` is a list of InjectedCTEs like:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__CTE__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__CTE__events as (select id, type from events)",
),
]
Given `sql` like:
"with internal_cte as (select * from sessions)
select * from internal_cte"
This will spit out:
"with __dbt__CTE__ephemeral as (select * from table),
__dbt__CTE__events as (select id, type from events),
with internal_cte as (select * from sessions)
select * from internal_cte"
(Whitespace enhanced for readability.)
"""
if len(ctes) == 0:
return sql
parsed_stmts = sqlparse.parse(sql)
parsed = parsed_stmts[0]
with_stmt = None
for token in parsed.tokens:
if token.is_keyword and token.normalized == 'WITH':
with_stmt = token
break
if with_stmt is None:
# no with stmt, add one, and inject CTEs right at the beginning
first_token = parsed.token_first()
with_stmt = sqlparse.sql.Token(sqlparse.tokens.Keyword, 'with')
parsed.insert_before(first_token, with_stmt)
else:
# stmt exists, add a comma (which will come after injected CTEs)
trailing_comma = sqlparse.sql.Token(sqlparse.tokens.Punctuation, ',')
parsed.insert_after(with_stmt, trailing_comma)
token = sqlparse.sql.Token(
sqlparse.tokens.Keyword,
", ".join(c.sql for c in ctes)
)
parsed.insert_after(with_stmt, token)
return str(parsed)
PARSED_TYPES: Dict[Type[CompiledNode], Type[ParsedResource]] = {
CompiledAnalysisNode: ParsedAnalysisNode,
CompiledModelNode: ParsedModelNode,
@@ -255,7 +206,7 @@ NonSourceParsedNode = Union[
# This is anything that can be in manifest.nodes.
NonSourceNode = Union[
ManifestNode = Union[
NonSourceCompiledNode,
NonSourceParsedNode,
]
@@ -264,6 +215,12 @@ NonSourceNode = Union[
# 'compile()' calls in the runner actually just return the original parsed
# node they were given.
CompileResultNode = Union[
NonSourceNode,
ManifestNode,
ParsedSourceDefinition,
]
# anything that participates in the graph: sources, exposures, manifest nodes
GraphMemberNode = Union[
CompileResultNode,
ParsedExposure,
]


@@ -1,31 +1,29 @@
import abc
import enum
import hashlib
import os
from dataclasses import dataclass, field
from datetime import datetime
from itertools import chain, islice
from multiprocessing.synchronize import Lock
from typing import (
Dict, List, Optional, Union, Mapping, MutableMapping, Any, Set, Tuple,
TypeVar, Callable, Iterable, Generic, cast
TypeVar, Callable, Iterable, Generic, cast, AbstractSet
)
from typing_extensions import Protocol
from uuid import UUID
from hologram import JsonSchemaMixin
from dbt.contracts.graph.compiled import (
CompileResultNode, NonSourceNode, NonSourceCompiledNode
CompileResultNode, ManifestNode, NonSourceCompiledNode, GraphMemberNode
)
from dbt.contracts.graph.parsed import (
ParsedMacro, ParsedDocumentation, ParsedNodePatch, ParsedMacroPatch,
ParsedSourceDefinition
ParsedSourceDefinition, ParsedExposure
)
from dbt.contracts.files import SourceFile
from dbt.contracts.util import (
BaseArtifactMetadata, MacroKey, SourceKey, ArtifactMixin, schema_version
)
from dbt.contracts.util import Readable, Writable, Replaceable
from dbt.exceptions import (
raise_duplicate_resource_name, InternalException, raise_compiler_error,
warn_or_error, raise_invalid_patch
raise_duplicate_resource_name, raise_compiler_error, warn_or_error,
raise_invalid_patch,
)
from dbt.helper_types import PathSet
from dbt.logger import GLOBAL_LOGGER as logger
@@ -36,8 +34,6 @@ from dbt import tracking
import dbt.utils
NodeEdgeMap = Dict[str, List[str]]
MacroKey = Tuple[str, str]
SourceKey = Tuple[str, str]
PackageName = str
DocName = str
RefName = str
@@ -131,7 +127,7 @@ class SourceCache(PackageAwareCache[SourceKey, ParsedSourceDefinition]):
return self._manifest.sources[unique_id]
class RefableCache(PackageAwareCache[RefName, NonSourceNode]):
class RefableCache(PackageAwareCache[RefName, ManifestNode]):
# refables are actually unique, so the Dict[PackageName, UniqueID] will
# only ever have exactly one value, but doing 3 dict lookups instead of 1
# is not a big deal at all and retains consistency
@@ -139,7 +135,7 @@ class RefableCache(PackageAwareCache[RefName, NonSourceNode]):
self._cached_types = set(NodeType.refable())
super().__init__(manifest)
def add_node(self, node: NonSourceNode):
def add_node(self, node: ManifestNode):
if node.resource_type in self._cached_types:
if node.name not in self.storage:
self.storage[node.name] = {}
@@ -151,7 +147,7 @@ class RefableCache(PackageAwareCache[RefName, NonSourceNode]):
def perform_lookup(
self, unique_id: UniqueID
) -> NonSourceNode:
) -> ManifestNode:
if unique_id not in self._manifest.nodes:
raise dbt.exceptions.InternalException(
f'Node {unique_id} found in cache but not found in manifest'
@@ -173,155 +169,11 @@ def _search_packages(
@dataclass
class FilePath(JsonSchemaMixin):
searched_path: str
relative_path: str
project_root: str
@property
def search_key(self) -> str:
# TODO: should this be project name + path relative to project root?
return self.absolute_path
@property
def full_path(self) -> str:
# useful for symlink preservation
return os.path.join(
self.project_root, self.searched_path, self.relative_path
)
@property
def absolute_path(self) -> str:
return os.path.abspath(self.full_path)
@property
def original_file_path(self) -> str:
# this is mostly used for reporting errors. It doesn't show the project
# name, should it?
return os.path.join(
self.searched_path, self.relative_path
)
@dataclass
class FileHash(JsonSchemaMixin):
name: str # the hash type name
checksum: str # the hashlib.hash_type().hexdigest() of the file contents
@classmethod
def empty(cls):
return FileHash(name='none', checksum='')
@classmethod
def path(cls, path: str):
return FileHash(name='path', checksum=path)
def __eq__(self, other):
if not isinstance(other, FileHash):
return NotImplemented
if self.name == 'none' or self.name != other.name:
return False
return self.checksum == other.checksum
def compare(self, contents: str) -> bool:
"""Compare the file contents with the given hash"""
if self.name == 'none':
return False
return self.from_contents(contents, name=self.name) == self.checksum
@classmethod
def from_contents(cls, contents: str, name='sha256'):
"""Create a file hash from the given file contents. The hash is always
the utf-8 encoding of the contents given, because dbt only reads files
as utf-8.
"""
data = contents.encode('utf-8')
checksum = hashlib.new(name, data).hexdigest()
return cls(name=name, checksum=checksum)
@dataclass
class RemoteFile(JsonSchemaMixin):
@property
def searched_path(self) -> str:
return 'from remote system'
@property
def relative_path(self) -> str:
return 'from remote system'
@property
def absolute_path(self) -> str:
return 'from remote system'
@property
def original_file_path(self):
return 'from remote system'
@dataclass
class SourceFile(JsonSchemaMixin):
"""Define a source file in dbt"""
path: Union[FilePath, RemoteFile] # the path information
checksum: FileHash
# we don't want to serialize this
_contents: Optional[str] = None
# the unique IDs contained in this file
nodes: List[str] = field(default_factory=list)
docs: List[str] = field(default_factory=list)
macros: List[str] = field(default_factory=list)
sources: List[str] = field(default_factory=list)
# any node patches in this file. The entries are names, not unique ids!
patches: List[str] = field(default_factory=list)
# any macro patches in this file. The entries are package, name pairs.
macro_patches: List[MacroKey] = field(default_factory=list)
# any source patches in this file. The entries are package, name pairs
source_patches: List[SourceKey] = field(default_factory=list)
@property
def search_key(self) -> Optional[str]:
if isinstance(self.path, RemoteFile):
return None
if self.checksum.name == 'none':
return None
return self.path.search_key
@property
def contents(self) -> str:
if self._contents is None:
raise InternalException('SourceFile has no contents!')
return self._contents
@contents.setter
def contents(self, value):
self._contents = value
@classmethod
def empty(cls, path: FilePath) -> 'SourceFile':
self = cls(path=path, checksum=FileHash.empty())
self.contents = ''
return self
@classmethod
def seed(cls, path: FilePath) -> 'SourceFile':
"""Seeds always parse the same regardless of their content."""
self = cls(path=path, checksum=FileHash.path(path.absolute_path))
self.contents = ''
return self
@classmethod
def remote(cls, contents: str) -> 'SourceFile':
self = cls(path=RemoteFile(), checksum=FileHash.empty())
self.contents = contents
return self
@dataclass
class ManifestMetadata(JsonSchemaMixin, Replaceable):
class ManifestMetadata(BaseArtifactMetadata):
"""Metadata for the manifest."""
dbt_schema_version: str = field(
default_factory=lambda: str(WritableManifest.dbt_schema_version)
)
project_id: Optional[str] = field(
default=None,
metadata={
@@ -357,6 +209,12 @@ class ManifestMetadata(JsonSchemaMixin, Replaceable):
not tracking.active_user.do_not_track
)
@classmethod
def default(cls):
return cls(
dbt_schema_version=str(WritableManifest.dbt_schema_version),
)
def _sort_values(dct):
"""Given a dictionary, sort each value. This makes output deterministic,
@@ -365,7 +223,7 @@ def _sort_values(dct):
return {k: sorted(v) for k, v in dct.items()}
def build_edges(nodes: List[NonSourceNode]):
def build_edges(nodes: List[ManifestNode]):
"""Build the forward and backward edges on the given list of ParsedNodes
and return them as two separate dictionaries, each mapping unique IDs to
lists of edges.
@@ -376,7 +234,8 @@ def build_edges(nodes: List[NonSourceNode]):
for node in nodes:
backward_edges[node.unique_id] = node.depends_on_nodes[:]
for unique_id in node.depends_on_nodes:
forward_edges[unique_id].append(node.unique_id)
if unique_id in forward_edges.keys():
forward_edges[unique_id].append(node.unique_id)
return _sort_values(forward_edges), _sort_values(backward_edges)
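# Illustrative aside (not part of the diff): a self-contained sketch of the
# edge construction above, using a toy node type. It includes the guard added
# in this change (forward edges are only appended for ids present in the node
# list).
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ToyNode:
    unique_id: str
    depends_on_nodes: List[str] = field(default_factory=list)

def toy_build_edges(nodes: List[ToyNode]):
    forward: Dict[str, List[str]] = {n.unique_id: [] for n in nodes}
    backward: Dict[str, List[str]] = {}
    for node in nodes:
        backward[node.unique_id] = node.depends_on_nodes[:]
        for parent_id in node.depends_on_nodes:
            if parent_id in forward:  # skip parents that are not manifest nodes
                forward[parent_id].append(node.unique_id)
    return ({k: sorted(v) for k, v in forward.items()},
            {k: sorted(v) for k, v in backward.items()})

print(toy_build_edges([ToyNode('model.a'), ToyNode('model.b', ['model.a'])]))
# -> ({'model.a': ['model.b'], 'model.b': []}, {'model.a': [], 'model.b': ['model.a']})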
@@ -541,12 +400,12 @@ MaybeParsedSource = Optional[Union[
MaybeNonSource = Optional[Union[
NonSourceNode,
Disabled[NonSourceNode]
ManifestNode,
Disabled[ManifestNode]
]]
T = TypeVar('T', bound=CompileResultNode)
T = TypeVar('T', bound=GraphMemberNode)
def _update_into(dest: MutableMapping[str, T], new_item: T):
@@ -573,11 +432,15 @@ def _update_into(dest: MutableMapping[str, T], new_item: T):
class Manifest:
"""The manifest for the full graph, after parsing and during compilation.
"""
nodes: MutableMapping[str, NonSourceNode]
# These attributes are both positional and by keyword. If an attribute
# is added it must all be added in the __reduce_ex__ method in the
# args tuple in the right position.
nodes: MutableMapping[str, ManifestNode]
sources: MutableMapping[str, ParsedSourceDefinition]
macros: MutableMapping[str, ParsedMacro]
docs: MutableMapping[str, ParsedDocumentation]
generated_at: datetime
exposures: MutableMapping[str, ParsedExposure]
selectors: MutableMapping[str, Any]
disabled: List[CompileResultNode]
files: MutableMapping[str, SourceFile]
metadata: ManifestMetadata = field(default_factory=ManifestMetadata)
@@ -602,7 +465,8 @@ class Manifest:
sources={},
macros=macros,
docs={},
generated_at=datetime.utcnow(),
exposures={},
selectors={},
disabled=[],
files=files,
)
@@ -627,7 +491,10 @@ class Manifest:
_update_into(self.nodes, new_node)
return new_node
def update_node(self, new_node: NonSourceNode):
def update_exposure(self, new_exposure: ParsedExposure):
_update_into(self.exposures, new_exposure)
def update_node(self, new_node: ManifestNode):
_update_into(self.nodes, new_node)
def update_source(self, new_source: ParsedSourceDefinition):
@@ -650,7 +517,7 @@ class Manifest:
def find_disabled_by_name(
self, name: str, package: Optional[str] = None
) -> Optional[NonSourceNode]:
) -> Optional[ManifestNode]:
searcher: NameSearcher = NameSearcher(
name, package, NodeType.refable()
)
@@ -780,7 +647,7 @@ class Manifest:
resource_fqns[resource_type_plural].add(tuple(resource.fqn))
return resource_fqns
def add_nodes(self, new_nodes: Mapping[str, NonSourceNode]):
def add_nodes(self, new_nodes: Mapping[str, ManifestNode]):
"""Add the given dict of new nodes to the manifest."""
for unique_id, node in new_nodes.items():
if unique_id in self.nodes:
@@ -868,14 +735,19 @@ class Manifest:
sources={k: _deepcopy(v) for k, v in self.sources.items()},
macros={k: _deepcopy(v) for k, v in self.macros.items()},
docs={k: _deepcopy(v) for k, v in self.docs.items()},
generated_at=self.generated_at,
disabled=[_deepcopy(n) for n in self.disabled],
exposures={k: _deepcopy(v) for k, v in self.exposures.items()},
selectors=self.root_project.manifest_selectors,
metadata=self.metadata,
disabled=[_deepcopy(n) for n in self.disabled],
files={k: _deepcopy(v) for k, v in self.files.items()},
)
def writable_manifest(self):
edge_members = list(chain(self.nodes.values(), self.sources.values()))
edge_members = list(chain(
self.nodes.values(),
self.sources.values(),
self.exposures.values(),
))
forward_edges, backward_edges = build_edges(edge_members)
return WritableManifest(
@@ -883,7 +755,8 @@ class Manifest:
sources=self.sources,
macros=self.macros,
docs=self.docs,
generated_at=self.generated_at,
exposures=self.exposures,
selectors=self.selectors,
metadata=self.metadata,
disabled=self.disabled,
child_map=forward_edges,
@@ -898,11 +771,13 @@ class Manifest:
def write(self, path):
self.writable_manifest().write(path)
def expect(self, unique_id: str) -> CompileResultNode:
def expect(self, unique_id: str) -> GraphMemberNode:
if unique_id in self.nodes:
return self.nodes[unique_id]
elif unique_id in self.sources:
return self.sources[unique_id]
elif unique_id in self.exposures:
return self.exposures[unique_id]
else:
# something terrible has happened
raise dbt.exceptions.InternalException(
@@ -941,8 +816,8 @@ class Manifest:
node_package: str,
) -> MaybeNonSource:
node: Optional[NonSourceNode] = None
disabled: Optional[NonSourceNode] = None
node: Optional[ManifestNode] = None
disabled: Optional[ManifestNode] = None
candidates = _search_packages(
current_project, node_package, target_model_package
@@ -1013,8 +888,9 @@ class Manifest:
def merge_from_artifact(
self,
adapter,
other: 'WritableManifest',
selected: Set[UniqueID],
selected: AbstractSet[UniqueID],
) -> None:
"""Given the selected unique IDs and a writable manifest, update this
manifest by replacing any unselected nodes with their counterpart.
@@ -1024,10 +900,14 @@ class Manifest:
refables = set(NodeType.refable())
merged = set()
for unique_id, node in other.nodes.items():
if (
current = self.nodes.get(unique_id)
if current and (
node.resource_type in refables and
not node.is_ephemeral and
unique_id not in selected
unique_id not in selected and
not adapter.get_relation(
current.database, current.schema, current.identifier
)
):
merged.add(unique_id)
self.nodes[unique_id] = node.replace(deferred=True)
@@ -1038,10 +918,36 @@ class Manifest:
f'Merged {len(merged)} items from state (sample: {sample})'
)
# Provide support for copy.deepcopy() - we just need to avoid the lock!
# pickle and deepcopy use this. It returns a callable object used to
# create the initial version of the object and a tuple of arguments
# for the object, i.e. the Manifest.
# The order of the arguments must match the order of the attributes
# in the Manifest class declaration, because they are used as
# positional arguments to construct a Manifest.
def __reduce_ex__(self, protocol):
args = (
self.nodes,
self.sources,
self.macros,
self.docs,
self.exposures,
self.selectors,
self.disabled,
self.files,
self.metadata,
self.flat_graph,
self._docs_cache,
self._sources_cache,
self._refs_cache,
)
return self.__class__, args
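# Illustrative aside (not part of the diff): the mechanism the comment above
# relies on. A class holding an unpicklable member (a lock here) can support
# copy.deepcopy by returning a constructor-plus-arguments pair from
# __reduce_ex__; the example class is hypothetical.
import copy
import threading

class Holder:
    def __init__(self, items):
        self.items = items
        self._lock = threading.Lock()  # cannot be pickled or deep-copied directly

    def __reduce_ex__(self, protocol):
        # Rebuild from the constructor and the copyable state only;
        # __init__ creates a fresh lock for the copy.
        return self.__class__, (self.items,)

clone = copy.deepcopy(Holder([1, 2, 3]))
print(clone.items)  # -> [1, 2, 3]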
@dataclass
class WritableManifest(JsonSchemaMixin, Writable, Readable):
nodes: Mapping[UniqueID, NonSourceNode] = field(
@schema_version('manifest', 1)
class WritableManifest(ArtifactMixin):
nodes: Mapping[UniqueID, ManifestNode] = field(
metadata=dict(description=(
'The nodes defined in the dbt project and its dependencies'
))
@@ -1061,12 +967,19 @@ class WritableManifest(JsonSchemaMixin, Writable, Readable):
'The docs defined in the dbt project and its dependencies'
))
)
exposures: Mapping[UniqueID, ParsedExposure] = field(
metadata=dict(description=(
'The exposures defined in the dbt project and its dependencies'
))
)
selectors: Mapping[UniqueID, Any] = field(
metadata=dict(description=(
'The selectors defined in selectors.yml'
))
)
disabled: Optional[List[CompileResultNode]] = field(metadata=dict(
description='A list of the disabled nodes in the target'
))
generated_at: datetime = field(metadata=dict(
description='The time at which the manifest was generated',
))
parent_map: Optional[NodeEdgeMap] = field(metadata=dict(
description='A mapping from child nodes to their dependencies',
))


@@ -1,8 +1,9 @@
from dataclasses import field, Field, dataclass
from enum import Enum
from itertools import chain
from typing import (
Any, List, Optional, Dict, MutableMapping, Union, Type, NewType, Tuple,
TypeVar
TypeVar, Callable, cast, Hashable
)
# TODO: patch+upgrade hologram to avoid this jsonschema import
@@ -21,7 +22,10 @@ from dbt import hooks
from dbt.node_types import NodeType
def _get_meta_value(cls: Type[Enum], fld: Field, key: str, default: Any):
M = TypeVar('M', bound='Metadata')
def _get_meta_value(cls: Type[M], fld: Field, key: str, default: Any) -> M:
# a metadata field might exist. If it does, it might have a matching key.
# If it has both, make sure the value is valid and return it. If it
# doesn't, return the default.
@@ -39,7 +43,7 @@ def _get_meta_value(cls: Type[Enum], fld: Field, key: str, default: Any):
def _set_meta_value(
obj: Enum, key: str, existing: Optional[Dict[str, Any]] = None
obj: M, key: str, existing: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
if existing is None:
result = {}
@@ -49,35 +53,82 @@ def _set_meta_value(
return result
MERGE_KEY = 'merge'
class Metadata(Enum):
@classmethod
def from_field(cls: Type[M], fld: Field) -> M:
default = cls.default_field()
key = cls.metadata_key()
return _get_meta_value(cls, fld, key, default)
def meta(
self, existing: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
key = self.metadata_key()
return _set_meta_value(self, key, existing)
@classmethod
def default_field(cls) -> 'Metadata':
raise NotImplementedError('Not implemented')
@classmethod
def metadata_key(cls) -> str:
raise NotImplementedError('Not implemented')
class MergeBehavior(Enum):
class MergeBehavior(Metadata):
Append = 1
Update = 2
Clobber = 3
@classmethod
def from_field(cls, fld: Field) -> 'MergeBehavior':
return _get_meta_value(cls, fld, MERGE_KEY, cls.Clobber)
def default_field(cls) -> 'MergeBehavior':
return cls.Clobber
def meta(self, existing: Optional[Dict[str, Any]] = None):
return _set_meta_value(self, MERGE_KEY, existing)
@classmethod
def metadata_key(cls) -> str:
return 'merge'
SHOW_HIDE_KEY = 'show_hide'
class ShowBehavior(Enum):
class ShowBehavior(Metadata):
Show = 1
Hide = 2
@classmethod
def from_field(cls, fld: Field) -> 'ShowBehavior':
return _get_meta_value(cls, fld, SHOW_HIDE_KEY, cls.Show)
def default_field(cls) -> 'ShowBehavior':
return cls.Show
def meta(self, existing: Optional[Dict[str, Any]] = None):
return _set_meta_value(self, SHOW_HIDE_KEY, existing)
@classmethod
def metadata_key(cls) -> str:
return 'show_hide'
@classmethod
def should_show(cls, fld: Field) -> bool:
return cls.from_field(fld) == cls.Show
class CompareBehavior(Metadata):
Include = 1
Exclude = 2
@classmethod
def default_field(cls) -> 'CompareBehavior':
return cls.Include
@classmethod
def metadata_key(cls) -> str:
return 'compare'
@classmethod
def should_include(cls, fld: Field) -> bool:
return cls.from_field(fld) == cls.Include
def metas(*metas: Metadata) -> Dict[str, Any]:
existing: Dict[str, Any] = {}
for m in metas:
existing = m.meta(existing)
return existing
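# Illustrative aside (not part of the diff): how these Metadata enums end up in
# a dataclass field's metadata mapping, keyed by metadata_key(). Simplified
# stand-in enums are defined here so the snippet runs on its own; the field
# name is hypothetical.
from dataclasses import dataclass, field, fields
from enum import Enum
from typing import Any, Dict, Optional

class SimpleMeta(Enum):
    def meta(self, existing: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        out = dict(existing or {})
        out[self.metadata_key()] = self
        return out

class MergeBehavior(SimpleMeta):
    Append = 1
    Update = 2
    Clobber = 3
    @classmethod
    def metadata_key(cls) -> str:
        return 'merge'

class CompareBehavior(SimpleMeta):
    Include = 1
    Exclude = 2
    @classmethod
    def metadata_key(cls) -> str:
        return 'compare'

def simple_metas(*ms: SimpleMeta) -> Dict[str, Any]:
    existing: Dict[str, Any] = {}
    for m in ms:
        existing = m.meta(existing)
    return existing

@dataclass
class ExampleConfig:
    labels: Dict[str, Any] = field(
        default_factory=dict,
        metadata=simple_metas(CompareBehavior.Exclude, MergeBehavior.Update),
    )

print(dict(fields(ExampleConfig)[0].metadata))
# -> {'compare': <CompareBehavior.Exclude: 2>, 'merge': <MergeBehavior.Update: 2>}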
def _listify(value: Any) -> List:
@@ -174,16 +225,59 @@ class BaseConfig(
else:
del self._extra[key]
def __iter__(self):
def _content_iterator(self, include_condition: Callable[[Field], bool]):
seen = set()
for fld, _ in self._get_fields():
yield fld.name
seen.add(fld.name)
if include_condition(fld):
yield fld.name
for key in self._extra:
yield key
if key not in seen:
seen.add(key)
yield key
def __iter__(self):
yield from self._content_iterator(include_condition=lambda f: True)
def __len__(self):
return len(self._get_fields()) + len(self._extra)
@staticmethod
def compare_key(
unrendered: Dict[str, Any],
other: Dict[str, Any],
key: str,
) -> bool:
if key not in unrendered and key not in other:
return True
elif key not in unrendered and key in other:
return False
elif key in unrendered and key not in other:
return False
else:
return unrendered[key] == other[key]
@classmethod
def same_contents(
cls, unrendered: Dict[str, Any], other: Dict[str, Any]
) -> bool:
"""This is like __eq__, except it ignores some fields."""
seen = set()
for fld, target_name in cls._get_fields():
key = target_name
seen.add(key)
if CompareBehavior.should_include(fld):
if not cls.compare_key(unrendered, other, key):
return False
for key in chain(unrendered, other):
if key not in seen:
seen.add(key)
if not cls.compare_key(unrendered, other, key):
return False
return True
@classmethod
def _extract_dict(
cls, src: Dict[str, Any], data: Dict[str, Any]
@@ -238,8 +332,7 @@ class BaseConfig(
if result[target_field] is not None:
continue
show_behavior = ShowBehavior.from_field(fld)
if show_behavior == ShowBehavior.Hide:
if not ShowBehavior.should_show(fld):
del result[target_field]
return result
@@ -272,6 +365,15 @@ class BaseConfig(
dct = self.to_dict(omit_none=False, validate=False)
return self.from_dict(dct)
def replace(self, **kwargs):
dct = self.to_dict(validate=False)
mapping = self.field_mapping()
for key, value in kwargs.items():
new_key = mapping.get(key, key)
dct[new_key] = value
return self.from_dict(dct, validate=False)
@dataclass
class SourceConfig(BaseConfig):
@@ -291,9 +393,10 @@ class NodeConfig(BaseConfig):
default_factory=list,
metadata=MergeBehavior.Append.meta(),
)
# this only applies for config v1, so it doesn't participate in comparison
vars: Dict[str, Any] = field(
default_factory=dict,
metadata=MergeBehavior.Update.meta(),
metadata=metas(CompareBehavior.Exclude, MergeBehavior.Update),
)
quoting: Dict[str, Any] = field(
default_factory=dict,
@@ -305,23 +408,25 @@ class NodeConfig(BaseConfig):
default_factory=dict,
metadata=MergeBehavior.Update.meta(),
)
# these fields are all config-only (they're ultimately applied to the node)
# these fields are included in serialized output, but are not part of
# config comparison (they are part of database_representation)
alias: Optional[str] = field(
default=None,
metadata=ShowBehavior.Hide.meta(),
metadata=CompareBehavior.Exclude.meta(),
)
schema: Optional[str] = field(
default=None,
metadata=ShowBehavior.Hide.meta(),
metadata=CompareBehavior.Exclude.meta(),
)
database: Optional[str] = field(
default=None,
metadata=ShowBehavior.Hide.meta(),
metadata=CompareBehavior.Exclude.meta(),
)
tags: Union[List[str], str] = field(
default_factory=list_str,
# TODO: hide this one?
metadata=MergeBehavior.Append.meta(),
metadata=metas(ShowBehavior.Hide,
MergeBehavior.Append,
CompareBehavior.Exclude),
)
full_refresh: Optional[bool] = None
@@ -345,6 +450,7 @@ class SeedConfig(NodeConfig):
@dataclass
class TestConfig(NodeConfig):
materialized: str = 'test'
severity: Severity = Severity('ERROR')
@@ -376,12 +482,28 @@ class SnapshotWrapper(JsonSchemaMixin):
@classmethod
def validate(cls, data: Any):
schema = _validate_schema(cls)
config = data.get('config', {})
if config.get('strategy') == 'check':
schema = _validate_schema(CheckSnapshotConfig)
to_validate = config
elif config.get('strategy') == 'timestamp':
schema = _validate_schema(TimestampSnapshotConfig)
to_validate = config
else:
h_cls = cast(Hashable, cls)
schema = _validate_schema(h_cls)
to_validate = data
validator = jsonschema.Draft7Validator(schema)
error = jsonschema.exceptions.best_match(
validator.iter_errors(data),
validator.iter_errors(to_validate),
key=_relevance_without_strategy,
)
if error is not None:
raise ValidationError.create_from(error) from error


@@ -10,19 +10,23 @@ from typing import (
Sequence,
Tuple,
Iterator,
TypeVar,
)
from hologram import JsonSchemaMixin
from hologram.helpers import ExtensibleJsonSchemaMixin
from dbt.clients.system import write_file
from dbt.contracts.files import FileHash, MAXIMUM_SEED_SIZE_NAME
from dbt.contracts.graph.unparsed import (
UnparsedNode, UnparsedDocumentation, Quoting, Docs,
UnparsedBaseNode, FreshnessThreshold, ExternalTable,
HasYamlMetadata, MacroArgument, UnparsedSourceDefinition,
UnparsedSourceTableDefinition, UnparsedColumn, TestDef
UnparsedSourceTableDefinition, UnparsedColumn, TestDef,
ExposureOwner, ExposureType, MaturityType
)
from dbt.contracts.util import Replaceable, AdditionalPropertiesMixin
from dbt.exceptions import warn_or_error
from dbt.logger import GLOBAL_LOGGER as logger # noqa
from dbt import flags
from dbt.node_types import NodeType
@@ -45,12 +49,16 @@ from .model_config import ( # noqa
@dataclass
class ColumnInfo(AdditionalPropertiesMixin, ExtensibleJsonSchemaMixin,
Replaceable):
class ColumnInfo(
AdditionalPropertiesMixin,
ExtensibleJsonSchemaMixin,
Replaceable
):
name: str
description: str = ''
meta: Dict[str, Any] = field(default_factory=dict)
data_type: Optional[str] = None
quote: Optional[bool] = None
tags: List[str] = field(default_factory=list)
_extra: Dict[str, Any] = field(default_factory=dict)
@@ -59,6 +67,9 @@ class ColumnInfo(AdditionalPropertiesMixin, ExtensibleJsonSchemaMixin,
class HasFqn(JsonSchemaMixin, Replaceable):
fqn: List[str]
def same_fqn(self, other: 'HasFqn') -> bool:
return self.fqn == other.fqn
@dataclass
class HasUniqueID(JsonSchemaMixin, Replaceable):
@@ -122,7 +133,7 @@ class ParsedNodeMixins(JsonSchemaMixin):
self.docs = patch.docs
if flags.STRICT_MODE:
assert isinstance(self, JsonSchemaMixin)
self.to_dict(validate=True)
self.to_dict(validate=True, omit_none=False)
def get_materialization(self):
return self.config.materialized
@@ -140,6 +151,7 @@ class ParsedNodeMandatory(
Replaceable
):
alias: str
checksum: FileHash
config: NodeConfig = field(default_factory=NodeConfig)
@property
@@ -160,6 +172,7 @@ class ParsedNodeDefaults(ParsedNodeMandatory):
patch_path: Optional[str] = None
build_path: Optional[str] = None
deferred: bool = False
unrendered_config: Dict[str, Any] = field(default_factory=dict)
def write_node(self, target_path: str, subdirectory: str, payload: str):
if (os.path.basename(self.path) ==
@@ -177,9 +190,72 @@ class ParsedNodeDefaults(ParsedNodeMandatory):
return full_path
T = TypeVar('T', bound='ParsedNode')
@dataclass
class ParsedNode(ParsedNodeDefaults, ParsedNodeMixins):
pass
def _persist_column_docs(self) -> bool:
return bool(self.config.persist_docs.get('columns'))
def _persist_relation_docs(self) -> bool:
return bool(self.config.persist_docs.get('relation'))
def same_body(self: T, other: T) -> bool:
return self.raw_sql == other.raw_sql
def same_persisted_description(self: T, other: T) -> bool:
# the check on configs will handle the case where we have different
# persist settings, so we only have to care about the cases where they
are the same.
if self._persist_relation_docs():
if self.description != other.description:
return False
if self._persist_column_docs():
# assert other._persist_column_docs()
column_descriptions = {
k: v.description for k, v in self.columns.items()
}
other_column_descriptions = {
k: v.description for k, v in other.columns.items()
}
if column_descriptions != other_column_descriptions:
return False
return True
def same_database_representation(self, other: T) -> bool:
# compare the config representation, not the node's config value. This
# compares the configured value, rather than the ultimate value (so
# generate_*_name and unset values derived from the target are
# ignored)
keys = ('database', 'schema', 'alias')
for key in keys:
mine = self.unrendered_config.get(key)
others = other.unrendered_config.get(key)
if mine != others:
return False
return True
def same_config(self, old: T) -> bool:
return self.config.same_contents(
self.unrendered_config,
old.unrendered_config,
)
def same_contents(self: T, old: Optional[T]) -> bool:
if old is None:
return False
return (
self.same_body(old) and
self.same_config(old) and
self.same_persisted_description(old) and
self.same_fqn(old) and
self.same_database_representation(old) and
True
)
@dataclass
@@ -205,8 +281,47 @@ class ParsedRPCNode(ParsedNode):
resource_type: NodeType = field(metadata={'restrict': [NodeType.RPCCall]})
def same_seeds(first: ParsedNode, second: ParsedNode) -> bool:
# for seeds, we check the hashes. If the hashes are different types,
# no match. If the hashes are both the same 'path', log a warning and
# assume they are the same
# if the current checksum is a path, we want to log a warning.
result = first.checksum == second.checksum
if first.checksum.name == 'path':
msg: str
if second.checksum.name != 'path':
msg = (
f'Found a seed ({first.package_name}.{first.name}) '
f'>{MAXIMUM_SEED_SIZE_NAME} in size. The previous file was '
f'<={MAXIMUM_SEED_SIZE_NAME}, so it has changed'
)
elif result:
msg = (
f'Found a seed ({first.package_name}.{first.name}) '
f'>{MAXIMUM_SEED_SIZE_NAME} in size at the same path, dbt '
f'cannot tell if it has changed: assuming they are the same'
)
elif not result:
msg = (
f'Found a seed ({first.package_name}.{first.name}) '
f'>{MAXIMUM_SEED_SIZE_NAME} in size. The previous file was in '
f'a different location, assuming it has changed'
)
else:
msg = (
f'Found a seed ({first.package_name}.{first.name}) '
f'>{MAXIMUM_SEED_SIZE_NAME} in size. The previous file had a '
f'checksum type of {second.checksum.name}, so it has changed'
)
warn_or_error(msg, node=first)
return result
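Seeds above MAXIMUM_SEED_SIZE carry a path-based checksum rather than a content hash, so same_seeds can only warn instead of detecting changes precisely. A small standalone sketch of the reachable warning branches; checksum names stand in for FileHash.name values (e.g. 'sha256' vs. 'path'):

# Mirrors the reachable branches of same_seeds; returns the kind of warning
# dbt would log, or None when both checksums are real content hashes.
def seed_warning(first_name, second_name, checksums_equal):
    if first_name != 'path':
        return None                                   # normal content-hash comparison
    if second_name != 'path':
        return 'grew past the size limit -> changed'
    if checksums_equal:
        return 'same oversized path -> assumed unchanged'
    return 'oversized and moved -> changed'

assert seed_warning('sha256', 'sha256', True) is None
print(seed_warning('path', 'path', True))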
@dataclass
class ParsedSeedNode(ParsedNode):
# keep this in sync with CompiledSeedNode!
resource_type: NodeType = field(metadata={'restrict': [NodeType.Seed]})
config: SeedConfig = field(default_factory=SeedConfig)
@@ -215,9 +330,12 @@ class ParsedSeedNode(ParsedNode):
""" Seeds are never empty"""
return False
def same_body(self: T, other: T) -> bool:
return same_seeds(self, other)
@dataclass
class TestMetadata(JsonSchemaMixin):
class TestMetadata(JsonSchemaMixin, Replaceable):
namespace: Optional[str]
name: str
kwargs: Dict[str, Any]
@@ -236,10 +354,30 @@ class ParsedDataTestNode(ParsedNode):
@dataclass
class ParsedSchemaTestNode(ParsedNode, HasTestMetadata):
# keep this in sync with CompiledSchemaTestNode!
resource_type: NodeType = field(metadata={'restrict': [NodeType.Test]})
column_name: Optional[str] = None
config: TestConfig = field(default_factory=TestConfig)
def same_config(self, other) -> bool:
return (
self.unrendered_config.get('severity') ==
other.unrendered_config.get('severity')
)
def same_column_name(self, other) -> bool:
return self.column_name == other.column_name
def same_contents(self, other) -> bool:
if other is None:
return False
return (
self.same_config(other) and
self.same_fqn(other) and
True
)
@dataclass
class IntermediateSnapshotNode(ParsedNode):
@@ -306,7 +444,14 @@ class ParsedMacro(UnparsedBaseNode, HasUniqueID):
self.arguments = patch.arguments
if flags.STRICT_MODE:
assert isinstance(self, JsonSchemaMixin)
self.to_dict(validate=True)
self.to_dict(validate=True, omit_none=False)
def same_contents(self, other: Optional['ParsedMacro']) -> bool:
if other is None:
return False
# the only thing that makes one macro different from another with the
# same name/package is its content
return self.macro_sql == other.macro_sql
@dataclass
@@ -318,6 +463,13 @@ class ParsedDocumentation(UnparsedDocumentation, HasUniqueID):
def search_name(self):
return self.name
def same_contents(self, other: Optional['ParsedDocumentation']) -> bool:
if other is None:
return False
# the only thing that makes one doc different from another with the
# same name/package is its content
return self.block_contents == other.block_contents
def normalize_test(testdef: TestDef) -> Dict[str, Any]:
if isinstance(testdef, str):
@@ -402,6 +554,60 @@ class ParsedSourceDefinition(
tags: List[str] = field(default_factory=list)
config: SourceConfig = field(default_factory=SourceConfig)
patch_path: Optional[Path] = None
unrendered_config: Dict[str, Any] = field(default_factory=dict)
relation_name: Optional[str] = None
def same_database_representation(
self, other: 'ParsedSourceDefinition'
) -> bool:
return (
self.database == other.database and
self.schema == other.schema and
self.identifier == other.identifier and
True
)
def same_quoting(self, other: 'ParsedSourceDefinition') -> bool:
return self.quoting == other.quoting
def same_freshness(self, other: 'ParsedSourceDefinition') -> bool:
return (
self.freshness == other.freshness and
self.loaded_at_field == other.loaded_at_field and
True
)
def same_external(self, other: 'ParsedSourceDefinition') -> bool:
return self.external == other.external
def same_config(self, old: 'ParsedSourceDefinition') -> bool:
return self.config.same_contents(
self.unrendered_config,
old.unrendered_config,
)
def same_contents(self, old: Optional['ParsedSourceDefinition']) -> bool:
# existing when it didn't before is a change!
if old is None:
return True
# config changes are changes (because the only config is "enabled", and
# enabling a source is a change!)
# changing the database/schema/identifier is a change
# messing around with external stuff is a change (uh, right?)
# quoting changes are changes
# freshness changes are changes, I guess
# metadata/tags changes are not "changes"
# patching/description changes are not "changes"
return (
self.same_database_representation(old) and
self.same_fqn(old) and
self.same_config(old) and
self.same_quoting(old) and
self.same_freshness(old) and
self.same_external(old) and
True
)
def get_full_source_name(self):
return f'{self.source_name}_{self.name}'
@@ -442,6 +648,71 @@ class ParsedSourceDefinition(
return f'{self.source_name}.{self.name}'
@dataclass
class ParsedExposure(UnparsedBaseNode, HasUniqueID, HasFqn):
name: str
type: ExposureType
owner: ExposureOwner
resource_type: NodeType = NodeType.Exposure
description: str = ''
maturity: Optional[MaturityType] = None
url: Optional[str] = None
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[List[str]] = field(default_factory=list)
sources: List[List[str]] = field(default_factory=list)
@property
def depends_on_nodes(self):
return self.depends_on.nodes
@property
def search_name(self):
return self.name
# no tags for now, but we could definitely add them
@property
def tags(self):
return []
def same_depends_on(self, old: 'ParsedExposure') -> bool:
return set(self.depends_on.nodes) == set(old.depends_on.nodes)
def same_description(self, old: 'ParsedExposure') -> bool:
return self.description == old.description
def same_maturity(self, old: 'ParsedExposure') -> bool:
return self.maturity == old.maturity
def same_owner(self, old: 'ParsedExposure') -> bool:
return self.owner == old.owner
def same_exposure_type(self, old: 'ParsedExposure') -> bool:
return self.type == old.type
def same_url(self, old: 'ParsedExposure') -> bool:
return self.url == old.url
def same_contents(self, old: Optional['ParsedExposure']) -> bool:
# existing when it didn't before is a change!
if old is None:
return True
return (
self.same_fqn(old) and
self.same_exposure_type(old) and
self.same_owner(old) and
self.same_maturity(old) and
self.same_url(old) and
self.same_description(old) and
self.same_depends_on(old) and
True
)
ParsedResource = Union[
ParsedMacro, ParsedNode, ParsedDocumentation, ParsedSourceDefinition
ParsedDocumentation,
ParsedMacro,
ParsedNode,
ParsedExposure,
ParsedSourceDefinition,
]

View File

@@ -158,19 +158,14 @@ class Time(JsonSchemaMixin, Replaceable):
return actual_age > difference
class FreshnessStatus(StrEnum):
Pass = 'pass'
Warn = 'warn'
Error = 'error'
@dataclass
class FreshnessThreshold(JsonSchemaMixin, Mergeable):
warn_after: Optional[Time] = None
error_after: Optional[Time] = None
filter: Optional[str] = None
def status(self, age: float) -> FreshnessStatus:
def status(self, age: float) -> "dbt.contracts.results.FreshnessStatus":
from dbt.contracts.results import FreshnessStatus
if self.error_after and self.error_after.exceeded(age):
return FreshnessStatus.Error
elif self.warn_after and self.warn_after.exceeded(age):
@@ -359,3 +354,63 @@ class UnparsedDocumentation(JsonSchemaMixin, Replaceable):
@dataclass
class UnparsedDocumentationFile(UnparsedDocumentation):
file_contents: str
# can't use total_ordering decorator here, as str provides an ordering already
# and it's not the one we want.
class Maturity(StrEnum):
low = 'low'
medium = 'medium'
high = 'high'
def __lt__(self, other):
if not isinstance(other, Maturity):
return NotImplemented
order = (Maturity.low, Maturity.medium, Maturity.high)
return order.index(self) < order.index(other)
def __gt__(self, other):
if not isinstance(other, Maturity):
return NotImplemented
return self != other and not (self < other)
def __ge__(self, other):
if not isinstance(other, Maturity):
return NotImplemented
return self == other or not (self < other)
def __le__(self, other):
if not isinstance(other, Maturity):
return NotImplemented
return self == other or self < other
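Because Maturity subclasses str, the default lexicographic ordering would put 'high' < 'low' < 'medium'; the hand-written comparisons above restore the intended low < medium < high. A quick sanity check, assuming Maturity is in scope as defined here:

assert Maturity.low < Maturity.medium < Maturity.high
assert Maturity.high >= Maturity.medium
assert sorted([Maturity.high, Maturity.low, Maturity.medium]) == \
    [Maturity.low, Maturity.medium, Maturity.high]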
class ExposureType(StrEnum):
Dashboard = 'dashboard'
Notebook = 'notebook'
Analysis = 'analysis'
ML = 'ml'
Application = 'application'
class MaturityType(StrEnum):
Low = 'low'
Medium = 'medium'
High = 'high'
@dataclass
class ExposureOwner(JsonSchemaMixin, Replaceable):
email: str
name: Optional[str] = None
@dataclass
class UnparsedExposure(JsonSchemaMixin, Replaceable):
name: str
type: ExposureType
owner: ExposureOwner
description: str = ''
maturity: Optional[MaturityType] = None
url: Optional[str] = None
depends_on: List[str] = field(default_factory=list)
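For reference, a minimal sketch of constructing the new exposure contract directly in Python, using the classes defined just above; all field values are illustrative:

exposure = UnparsedExposure(
    name='weekly_kpis',
    type=ExposureType.Dashboard,
    owner=ExposureOwner(email='data-team@example.com', name='Data Team'),
    description='Company-wide KPI dashboard',
    maturity=MaturityType.High,
    url='https://bi.example.com/dashboards/42',
    depends_on=["ref('fct_orders')"],
)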

View File

@@ -12,9 +12,8 @@ from hologram.helpers import HyphenatedJsonSchemaMixin, register_pattern, \
from dataclasses import dataclass, field
from typing import Optional, List, Dict, Union, Any, NewType
PIN_PACKAGE_URL = 'https://docs.getdbt.com/docs/package-management#section-specifying-package-versions' # noqa
PIN_PACKAGE_URL = 'https://docs.getdbt.com/docs/package-management#section-specifying-package-versions' # noqa
DEFAULT_SEND_ANONYMOUS_USAGE_STATS = True
DEFAULT_USE_COLORS = True
Name = NewType('Name', str)
@@ -143,6 +142,7 @@ BANNED_PROJECT_NAMES = {
'sql',
'sql_now',
'store_result',
'store_raw_result',
'target',
'this',
'tojson',
@@ -154,47 +154,7 @@ BANNED_PROJECT_NAMES = {
@dataclass
class ProjectV1(HyphenatedJsonSchemaMixin, Replaceable):
name: Name
version: Union[SemverString, float]
project_root: Optional[str] = None
source_paths: Optional[List[str]] = None
macro_paths: Optional[List[str]] = None
data_paths: Optional[List[str]] = None
test_paths: Optional[List[str]] = None
analysis_paths: Optional[List[str]] = None
docs_paths: Optional[List[str]] = None
asset_paths: Optional[List[str]] = None
target_path: Optional[str] = None
snapshot_paths: Optional[List[str]] = None
clean_targets: Optional[List[str]] = None
profile: Optional[str] = None
log_path: Optional[str] = None
modules_path: Optional[str] = None
quoting: Optional[Quoting] = None
on_run_start: Optional[List[str]] = field(default_factory=list_str)
on_run_end: Optional[List[str]] = field(default_factory=list_str)
require_dbt_version: Optional[Union[List[str], str]] = None
models: Dict[str, Any] = field(default_factory=dict)
seeds: Dict[str, Any] = field(default_factory=dict)
snapshots: Dict[str, Any] = field(default_factory=dict)
packages: List[PackageSpec] = field(default_factory=list)
query_comment: Optional[Union[QueryComment, NoValue, str]] = NoValue()
config_version: int = 1
@classmethod
def from_dict(cls, data, validate=True) -> 'ProjectV1':
result = super().from_dict(data, validate=validate)
if result.name in BANNED_PROJECT_NAMES:
raise ValidationError(
'Invalid project name: {} is a reserved word'
.format(result.name)
)
return result
@dataclass
class ProjectV2(HyphenatedJsonSchemaMixin, Replaceable):
class Project(HyphenatedJsonSchemaMixin, Replaceable):
name: Name
version: Union[SemverString, float]
config_version: int
@@ -231,7 +191,7 @@ class ProjectV2(HyphenatedJsonSchemaMixin, Replaceable):
query_comment: Optional[Union[QueryComment, NoValue, str]] = NoValue()
@classmethod
def from_dict(cls, data, validate=True) -> 'ProjectV2':
def from_dict(cls, data, validate=True) -> 'Project':
result = super().from_dict(data, validate=validate)
if result.name in BANNED_PROJECT_NAMES:
raise ValidationError(
@@ -241,25 +201,10 @@ class ProjectV2(HyphenatedJsonSchemaMixin, Replaceable):
return result
def parse_project_config(
data: Dict[str, Any], validate=True
) -> Union[ProjectV1, ProjectV2]:
config_version = data.get('config-version', 1)
if config_version == 1:
return ProjectV1.from_dict(data, validate=validate)
elif config_version == 2:
return ProjectV2.from_dict(data, validate=validate)
else:
raise ValidationError(
f'Got an unexpected config-version={config_version}, expected '
f'1 or 2'
)
@dataclass
class UserConfig(ExtensibleJsonSchemaMixin, Replaceable, UserConfigContract):
send_anonymous_usage_stats: bool = DEFAULT_SEND_ANONYMOUS_USAGE_STATS
use_colors: bool = DEFAULT_USE_COLORS
use_colors: Optional[bool] = None
partial_parse: Optional[bool] = None
printer_width: Optional[int] = None
@@ -269,8 +214,8 @@ class UserConfig(ExtensibleJsonSchemaMixin, Replaceable, UserConfigContract):
else:
tracking.do_not_track()
if self.use_colors:
ui.use_colors()
if self.use_colors is not None:
ui.use_colors(self.use_colors)
if self.printer_width:
ui.printer_width(self.printer_width)
@@ -295,7 +240,7 @@ class ConfiguredQuoting(Quoting, Replaceable):
@dataclass
class Configuration(ProjectV2, ProfileConfig):
class Configuration(Project, ProfileConfig):
cli_vars: Dict[str, Any] = field(
default_factory=dict,
metadata={'preserve_underscore': True},
@@ -305,4 +250,4 @@ class Configuration(ProjectV2, ProfileConfig):
@dataclass
class ProjectList(JsonSchemaMixin):
projects: Dict[str, Union[ProjectV2, ProjectV1]]
projects: Dict[str, Project]

View File

@@ -1,9 +1,15 @@
from dbt.contracts.graph.manifest import CompileResultNode
from dbt.contracts.graph.unparsed import (
Time, FreshnessStatus, FreshnessThreshold
FreshnessThreshold
)
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.util import Writable, Replaceable
from dbt.contracts.util import (
BaseArtifactMetadata,
ArtifactMixin,
VersionedSchema,
Replaceable,
schema_version,
)
from dbt.exceptions import InternalException
from dbt.logger import (
TimingProcessor,
@@ -18,7 +24,9 @@ import agate
from dataclasses import dataclass, field
from datetime import datetime
from typing import Union, Dict, List, Optional, Any, NamedTuple
from typing import Union, Dict, List, Optional, Any, NamedTuple, Sequence
from dbt.clients.system import write_json
@dataclass
@@ -48,47 +56,63 @@ class collect_timing_info:
logger.debug('finished collecting timing info')
class NodeStatus(StrEnum):
Success = "success"
Error = "error"
Fail = "fail"
Warn = "warn"
Skipped = "skipped"
Pass = "pass"
RuntimeErr = "runtime error"
class RunStatus(StrEnum):
Success = NodeStatus.Success
Error = NodeStatus.Error
Skipped = NodeStatus.Skipped
class TestStatus(StrEnum):
Pass = NodeStatus.Pass
Error = NodeStatus.Error
Fail = NodeStatus.Fail
Warn = NodeStatus.Warn
class FreshnessStatus(StrEnum):
Pass = NodeStatus.Pass
Warn = NodeStatus.Warn
Error = NodeStatus.Error
RuntimeErr = NodeStatus.RuntimeErr
@dataclass
class PartialResult(JsonSchemaMixin, Writable):
class BaseResult(JsonSchemaMixin):
status: Union[RunStatus, TestStatus, FreshnessStatus]
timing: List[TimingInfo]
thread_id: str
execution_time: float
message: Optional[Union[str, int]]
adapter_response: Dict[str, Any]
@dataclass
class NodeResult(BaseResult):
node: CompileResultNode
error: Optional[str] = None
status: Union[None, str, int, bool] = None
execution_time: Union[str, int] = 0
thread_id: Optional[str] = None
timing: List[TimingInfo] = field(default_factory=list)
fail: Optional[bool] = None
warn: Optional[bool] = None
# if the result got to the point where it could be skipped/failed, we would
# be returning a real result, not a partial.
@property
def skipped(self):
return False
@dataclass
class WritableRunModelResult(PartialResult):
skip: bool = False
@property
def skipped(self):
return self.skip
@dataclass
class RunModelResult(WritableRunModelResult):
class RunResult(NodeResult):
agate_table: Optional[agate.Table] = None
def to_dict(self, *args, **kwargs):
dct = super().to_dict(*args, **kwargs)
dct.pop('agate_table', None)
return dct
@property
def skipped(self):
return self.status == RunStatus.Skipped
@dataclass
class ExecutionResult(JsonSchemaMixin, Writable):
results: List[Union[WritableRunModelResult, PartialResult]]
generated_at: datetime
class ExecutionResult(JsonSchemaMixin):
results: Sequence[BaseResult]
elapsed_time: float
def __len__(self):
@@ -101,138 +125,244 @@ class ExecutionResult(JsonSchemaMixin, Writable):
return self.results[idx]
@dataclass
class RunResultsMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
default_factory=lambda: str(RunResultsArtifact.dbt_schema_version)
)
@dataclass
class RunResultOutput(BaseResult):
unique_id: str
def process_run_result(result: RunResult) -> RunResultOutput:
return RunResultOutput(
unique_id=result.node.unique_id,
status=result.status,
timing=result.timing,
thread_id=result.thread_id,
execution_time=result.execution_time,
message=result.message,
adapter_response=result.adapter_response
)
@dataclass
class RunExecutionResult(
ExecutionResult,
):
results: Sequence[RunResult]
args: Dict[str, Any] = field(default_factory=dict)
generated_at: datetime = field(default_factory=datetime.utcnow)
def write(self, path: str):
writable = RunResultsArtifact.from_execution_results(
results=self.results,
elapsed_time=self.elapsed_time,
generated_at=self.generated_at,
args=self.args,
)
writable.write(path)
@dataclass
@schema_version('run-results', 1)
class RunResultsArtifact(ExecutionResult, ArtifactMixin):
results: Sequence[RunResultOutput]
args: Dict[str, Any] = field(default_factory=dict)
@classmethod
def from_execution_results(
cls,
results: Sequence[RunResult],
elapsed_time: float,
generated_at: datetime,
args: Dict,
):
processed_results = [process_run_result(result) for result in results]
meta = RunResultsMetadata(
dbt_schema_version=str(cls.dbt_schema_version),
generated_at=generated_at,
)
return cls(
metadata=meta,
results=processed_results,
elapsed_time=elapsed_time,
args=args
)
def write(self, path: str, omit_none=False):
write_json(path, self.to_dict(omit_none=omit_none))
@dataclass
class RunOperationResult(ExecutionResult):
success: bool
@dataclass
class RunOperationResultMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(default_factory=lambda: str(
RunOperationResultsArtifact.dbt_schema_version
))
@dataclass
@schema_version('run-operation-result', 1)
class RunOperationResultsArtifact(RunOperationResult, ArtifactMixin):
@classmethod
def from_success(
cls,
success: bool,
elapsed_time: float,
generated_at: datetime,
):
meta = RunOperationResultMetadata(
dbt_schema_version=str(cls.dbt_schema_version),
generated_at=generated_at,
)
return cls(
metadata=meta,
results=[],
elapsed_time=elapsed_time,
success=success,
)
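A hedged usage sketch of the constructor above; the values are illustrative and the class is assumed importable from dbt.contracts.results as in this diff:

from datetime import datetime
from dbt.contracts.results import RunOperationResultsArtifact

artifact = RunOperationResultsArtifact.from_success(
    success=True,
    elapsed_time=1.5,
    generated_at=datetime.utcnow(),
)
payload = artifact.to_dict()  # carries metadata.dbt_schema_version, results=[], elapsed_time, success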
# due to issues with typing.Union collapsing subclasses, this can't subclass
# PartialResult
@dataclass
class SourceFreshnessResult(JsonSchemaMixin, Writable):
class SourceFreshnessResult(NodeResult):
node: ParsedSourceDefinition
status: FreshnessStatus
max_loaded_at: datetime
snapshotted_at: datetime
age: float
status: FreshnessStatus
error: Optional[str] = None
execution_time: Union[str, int] = 0
thread_id: Optional[str] = None
timing: List[TimingInfo] = field(default_factory=list)
fail: Optional[bool] = None
def __post_init__(self):
self.fail = self.status == 'error'
@property
def warned(self):
return self.status == 'warn'
@property
def skipped(self):
return False
@dataclass
class FreshnessMetadata(JsonSchemaMixin):
generated_at: datetime
elapsed_time: float
@dataclass
class FreshnessExecutionResult(FreshnessMetadata):
results: List[Union[PartialResult, SourceFreshnessResult]]
def write(self, path, omit_none=True):
"""Create a new object with the desired output schema and write it."""
meta = FreshnessMetadata(
generated_at=self.generated_at,
elapsed_time=self.elapsed_time,
)
sources = {}
for result in self.results:
result_value: Union[
SourceFreshnessRuntimeError, SourceFreshnessOutput
]
unique_id = result.node.unique_id
if result.error is not None:
result_value = SourceFreshnessRuntimeError(
error=result.error,
state=FreshnessErrorEnum.runtime_error,
)
else:
# we know that this must be a SourceFreshnessResult
if not isinstance(result, SourceFreshnessResult):
raise InternalException(
'Got {} instead of a SourceFreshnessResult for a '
'non-error result in freshness execution!'
.format(type(result))
)
# if we're here, we must have a non-None freshness threshold
criteria = result.node.freshness
if criteria is None:
raise InternalException(
'Somehow evaluated a freshness result for a source '
'that has no freshness criteria!'
)
result_value = SourceFreshnessOutput(
max_loaded_at=result.max_loaded_at,
snapshotted_at=result.snapshotted_at,
max_loaded_at_time_ago_in_s=result.age,
state=result.status,
criteria=criteria,
)
sources[unique_id] = result_value
output = FreshnessRunOutput(meta=meta, sources=sources)
output.write(path, omit_none=omit_none)
def __len__(self):
return len(self.results)
def __iter__(self):
return iter(self.results)
def __getitem__(self, idx):
return self.results[idx]
def _copykeys(src, keys, **updates):
return {k: getattr(src, k) for k in keys}
@dataclass
class FreshnessCriteria(JsonSchemaMixin):
warn_after: Time
error_after: Time
class FreshnessErrorEnum(StrEnum):
runtime_error = 'runtime error'
@dataclass
class SourceFreshnessRuntimeError(JsonSchemaMixin):
error: str
state: FreshnessErrorEnum
unique_id: str
error: Optional[Union[str, int]]
status: FreshnessErrorEnum
@dataclass
class SourceFreshnessOutput(JsonSchemaMixin):
unique_id: str
max_loaded_at: datetime
snapshotted_at: datetime
max_loaded_at_time_ago_in_s: float
state: FreshnessStatus
status: FreshnessStatus
criteria: FreshnessThreshold
SourceFreshnessRunResult = Union[SourceFreshnessOutput,
SourceFreshnessRuntimeError]
adapter_response: Dict[str, Any]
@dataclass
class FreshnessRunOutput(JsonSchemaMixin, Writable):
meta: FreshnessMetadata
sources: Dict[str, SourceFreshnessRunResult]
class PartialSourceFreshnessResult(NodeResult):
status: FreshnessStatus
@property
def skipped(self):
return False
FreshnessNodeResult = Union[PartialSourceFreshnessResult,
SourceFreshnessResult]
FreshnessNodeOutput = Union[SourceFreshnessRuntimeError, SourceFreshnessOutput]
def process_freshness_result(
result: FreshnessNodeResult
) -> FreshnessNodeOutput:
unique_id = result.node.unique_id
if result.status == FreshnessStatus.RuntimeErr:
return SourceFreshnessRuntimeError(
unique_id=unique_id,
error=result.message,
status=FreshnessErrorEnum.runtime_error,
)
# we know that this must be a SourceFreshnessResult
if not isinstance(result, SourceFreshnessResult):
raise InternalException(
'Got {} instead of a SourceFreshnessResult for a '
'non-error result in freshness execution!'
.format(type(result))
)
# if we're here, we must have a non-None freshness threshold
criteria = result.node.freshness
if criteria is None:
raise InternalException(
'Somehow evaluated a freshness result for a source '
'that has no freshness criteria!'
)
return SourceFreshnessOutput(
unique_id=unique_id,
max_loaded_at=result.max_loaded_at,
snapshotted_at=result.snapshotted_at,
max_loaded_at_time_ago_in_s=result.age,
status=result.status,
criteria=criteria,
adapter_response=result.adapter_response
)
@dataclass
class FreshnessMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
default_factory=lambda: str(
FreshnessExecutionResultArtifact.dbt_schema_version
)
)
@dataclass
class FreshnessResult(ExecutionResult):
metadata: FreshnessMetadata
results: Sequence[FreshnessNodeResult]
@classmethod
def from_node_results(
cls,
results: List[FreshnessNodeResult],
elapsed_time: float,
generated_at: datetime,
):
meta = FreshnessMetadata(generated_at=generated_at)
return cls(metadata=meta, results=results, elapsed_time=elapsed_time)
@dataclass
@schema_version('sources', 1)
class FreshnessExecutionResultArtifact(
ArtifactMixin,
VersionedSchema,
):
metadata: FreshnessMetadata
results: Sequence[FreshnessNodeOutput]
elapsed_time: float
@classmethod
def from_result(cls, base: FreshnessResult):
processed = [process_freshness_result(r) for r in base.results]
return cls(
metadata=base.metadata,
results=processed,
elapsed_time=base.elapsed_time,
)
Primitive = Union[bool, str, float, None]
@@ -293,9 +423,39 @@ class CatalogTable(JsonSchemaMixin, Replaceable):
@dataclass
class CatalogResults(JsonSchemaMixin, Writable):
class CatalogMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
default_factory=lambda: str(CatalogArtifact.dbt_schema_version)
)
@dataclass
class CatalogResults(JsonSchemaMixin):
nodes: Dict[str, CatalogTable]
sources: Dict[str, CatalogTable]
generated_at: datetime
errors: Optional[List[str]]
_compile_results: Optional[Any] = None
@dataclass
@schema_version('catalog', 1)
class CatalogArtifact(CatalogResults, ArtifactMixin):
metadata: CatalogMetadata
@classmethod
def from_results(
cls,
generated_at: datetime,
nodes: Dict[str, CatalogTable],
sources: Dict[str, CatalogTable],
compile_results: Optional[Any],
errors: Optional[List[str]]
) -> 'CatalogArtifact':
meta = CatalogMetadata(generated_at=generated_at)
return cls(
metadata=meta,
nodes=nodes,
sources=sources,
errors=errors,
_compile_results=compile_results,
)

View File

@@ -3,7 +3,7 @@ import os
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Union, List, Any, Dict, Type
from typing import Optional, Union, List, Any, Dict, Type, Sequence
from hologram import JsonSchemaMixin
from hologram.helpers import StrEnum
@@ -11,10 +11,17 @@ from hologram.helpers import StrEnum
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import WritableManifest
from dbt.contracts.results import (
TimingInfo,
RunResult, RunResultsArtifact, TimingInfo,
CatalogArtifact,
CatalogResults,
ExecutionResult,
FreshnessExecutionResultArtifact,
FreshnessResult,
RunOperationResult,
RunOperationResultsArtifact,
RunExecutionResult,
)
from dbt.contracts.util import VersionedSchema, schema_version
from dbt.exceptions import InternalException
from dbt.logger import LogMessage
from dbt.utils import restrict_to
@@ -45,6 +52,17 @@ class RPCCompileParameters(RPCParameters):
models: Union[None, str, List[str]] = None
exclude: Union[None, str, List[str]] = None
selector: Optional[str] = None
state: Optional[str] = None
@dataclass
class RPCRunParameters(RPCParameters):
threads: Optional[int] = None
models: Union[None, str, List[str]] = None
exclude: Union[None, str, List[str]] = None
selector: Optional[str] = None
state: Optional[str] = None
defer: Optional[bool] = None
@dataclass
@@ -53,12 +71,15 @@ class RPCSnapshotParameters(RPCParameters):
select: Union[None, str, List[str]] = None
exclude: Union[None, str, List[str]] = None
selector: Optional[str] = None
state: Optional[str] = None
@dataclass
class RPCTestParameters(RPCCompileParameters):
data: bool = False
schema: bool = False
state: Optional[str] = None
defer: Optional[bool] = None
@dataclass
@@ -68,11 +89,13 @@ class RPCSeedParameters(RPCParameters):
exclude: Union[None, str, List[str]] = None
selector: Optional[str] = None
show: bool = False
state: Optional[str] = None
@dataclass
class RPCDocsGenerateParameters(RPCParameters):
compile: bool = True
state: Optional[str] = None
@dataclass
@@ -81,7 +104,7 @@ class RPCCliParameters(RPCParameters):
@dataclass
class RPCNoParameters(RPCParameters):
class RPCDepsParameters(RPCParameters):
pass
@@ -155,35 +178,79 @@ class GetManifestParameters(RPCParameters):
@dataclass
class RemoteResult(JsonSchemaMixin):
class RemoteResult(VersionedSchema):
logs: List[LogMessage]
@dataclass
class RemoteEmptyResult(RemoteResult):
pass
@schema_version('remote-deps-result', 1)
class RemoteDepsResult(RemoteResult):
generated_at: datetime = field(default_factory=datetime.utcnow)
@dataclass
@schema_version('remote-catalog-result', 1)
class RemoteCatalogResults(CatalogResults, RemoteResult):
pass
generated_at: datetime = field(default_factory=datetime.utcnow)
def write(self, path: str):
artifact = CatalogArtifact.from_results(
generated_at=self.generated_at,
nodes=self.nodes,
sources=self.sources,
compile_results=self._compile_results,
errors=self.errors,
)
artifact.write(path)
@dataclass
class RemoteCompileResult(RemoteResult):
class RemoteCompileResultMixin(RemoteResult):
raw_sql: str
compiled_sql: str
node: CompileResultNode
timing: List[TimingInfo]
@dataclass
@schema_version('remote-compile-result', 1)
class RemoteCompileResult(RemoteCompileResultMixin):
generated_at: datetime = field(default_factory=datetime.utcnow)
@property
def error(self):
return None
@dataclass
@schema_version('remote-execution-result', 1)
class RemoteExecutionResult(ExecutionResult, RemoteResult):
pass
results: Sequence[RunResult]
args: Dict[str, Any] = field(default_factory=dict)
generated_at: datetime = field(default_factory=datetime.utcnow)
def write(self, path: str):
writable = RunResultsArtifact.from_execution_results(
generated_at=self.generated_at,
results=self.results,
elapsed_time=self.elapsed_time,
args=self.args,
)
writable.write(path)
@classmethod
def from_local_result(
cls,
base: RunExecutionResult,
logs: List[LogMessage],
) -> 'RemoteExecutionResult':
return cls(
generated_at=base.generated_at,
results=base.results,
elapsed_time=base.elapsed_time,
args=base.args,
logs=logs,
)
@dataclass
@@ -193,27 +260,74 @@ class ResultTable(JsonSchemaMixin):
@dataclass
class RemoteRunOperationResult(ExecutionResult, RemoteResult):
success: bool
@schema_version('remote-run-operation-result', 1)
class RemoteRunOperationResult(RunOperationResult, RemoteResult):
generated_at: datetime = field(default_factory=datetime.utcnow)
@classmethod
def from_local_result(
cls,
base: RunOperationResultsArtifact,
logs: List[LogMessage],
) -> 'RemoteRunOperationResult':
return cls(
generated_at=base.metadata.generated_at,
results=base.results,
elapsed_time=base.elapsed_time,
success=base.success,
logs=logs,
)
def write(self, path: str):
writable = RunOperationResultsArtifact.from_success(
success=self.success,
generated_at=self.generated_at,
elapsed_time=self.elapsed_time,
)
writable.write(path)
@dataclass
class RemoteRunResult(RemoteCompileResult):
@schema_version('remote-freshness-result', 1)
class RemoteFreshnessResult(FreshnessResult, RemoteResult):
@classmethod
def from_local_result(
cls,
base: FreshnessResult,
logs: List[LogMessage],
) -> 'RemoteFreshnessResult':
return cls(
metadata=base.metadata,
results=base.results,
elapsed_time=base.elapsed_time,
logs=logs,
)
def write(self, path: str):
writable = FreshnessExecutionResultArtifact.from_result(base=self)
writable.write(path)
@dataclass
@schema_version('remote-run-result', 1)
class RemoteRunResult(RemoteCompileResultMixin):
table: ResultTable
generated_at: datetime = field(default_factory=datetime.utcnow)
RPCResult = Union[
RemoteCompileResult,
RemoteExecutionResult,
RemoteFreshnessResult,
RemoteCatalogResults,
RemoteEmptyResult,
RemoteDepsResult,
RemoteRunOperationResult,
]
# GC types
class GCResultState(StrEnum):
Deleted = 'deleted' # successful GC
Missing = 'missing' # nothing to GC
@@ -221,6 +335,7 @@ class GCResultState(StrEnum):
@dataclass
@schema_version('remote-gc-result', 1)
class GCResult(RemoteResult):
logs: List[LogMessage] = field(default_factory=list)
deleted: List[TaskID] = field(default_factory=list)
@@ -314,6 +429,7 @@ class TaskRow(TaskTiming):
@dataclass
@schema_version('remote-ps-result', 1)
class PSResult(RemoteResult):
rows: List[TaskRow]
@@ -326,12 +442,14 @@ class KillResultStatus(StrEnum):
@dataclass
@schema_version('remote-kill-result', 1)
class KillResult(RemoteResult):
state: KillResultStatus = KillResultStatus.Missing
logs: List[LogMessage] = field(default_factory=list)
@dataclass
@schema_version('remote-manifest-result', 1)
class GetManifestResult(RemoteResult):
manifest: Optional[WritableManifest]
@@ -359,16 +477,18 @@ class PollResult(RemoteResult, TaskTiming):
@dataclass
class PollRemoteEmptyCompleteResult(PollResult, RemoteEmptyResult):
@schema_version('poll-remote-deps-result', 1)
class PollRemoteEmptyCompleteResult(PollResult, RemoteResult):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
TaskHandlerState.Failed),
)
generated_at: datetime = field(default_factory=datetime.utcnow)
@classmethod
def from_result(
cls: Type['PollRemoteEmptyCompleteResult'],
base: RemoteEmptyResult,
base: RemoteDepsResult,
tags: TaskTags,
timing: TaskTiming,
logs: List[LogMessage],
@@ -380,10 +500,12 @@ class PollRemoteEmptyCompleteResult(PollResult, RemoteEmptyResult):
start=timing.start,
end=timing.end,
elapsed=timing.elapsed,
generated_at=base.generated_at
)
@dataclass
@schema_version('poll-remote-killed-result', 1)
class PollKilledResult(PollResult):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Killed),
@@ -391,7 +513,11 @@ class PollKilledResult(PollResult):
@dataclass
class PollExecuteCompleteResult(RemoteExecutionResult, PollResult):
@schema_version('poll-remote-execution-result', 1)
class PollExecuteCompleteResult(
RemoteExecutionResult,
PollResult,
):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
TaskHandlerState.Failed),
@@ -407,7 +533,6 @@ class PollExecuteCompleteResult(RemoteExecutionResult, PollResult):
) -> 'PollExecuteCompleteResult':
return cls(
results=base.results,
generated_at=base.generated_at,
elapsed_time=base.elapsed_time,
logs=logs,
tags=tags,
@@ -415,11 +540,16 @@ class PollExecuteCompleteResult(RemoteExecutionResult, PollResult):
start=timing.start,
end=timing.end,
elapsed=timing.elapsed,
generated_at=base.generated_at,
)
@dataclass
class PollCompileCompleteResult(RemoteCompileResult, PollResult):
@schema_version('poll-remote-compile-result', 1)
class PollCompileCompleteResult(
RemoteCompileResult,
PollResult,
):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
TaskHandlerState.Failed),
@@ -444,11 +574,16 @@ class PollCompileCompleteResult(RemoteCompileResult, PollResult):
start=timing.start,
end=timing.end,
elapsed=timing.elapsed,
generated_at=base.generated_at
)
@dataclass
class PollRunCompleteResult(RemoteRunResult, PollResult):
@schema_version('poll-remote-run-result', 1)
class PollRunCompleteResult(
RemoteRunResult,
PollResult,
):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
TaskHandlerState.Failed),
@@ -474,11 +609,16 @@ class PollRunCompleteResult(RemoteRunResult, PollResult):
start=timing.start,
end=timing.end,
elapsed=timing.elapsed,
generated_at=base.generated_at
)
@dataclass
class PollRunOperationCompleteResult(RemoteRunOperationResult, PollResult):
@schema_version('poll-remote-run-operation-result', 1)
class PollRunOperationCompleteResult(
RemoteRunOperationResult,
PollResult,
):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
TaskHandlerState.Failed),
@@ -507,6 +647,7 @@ class PollRunOperationCompleteResult(RemoteRunOperationResult, PollResult):
@dataclass
@schema_version('poll-remote-catalog-result', 1)
class PollCatalogCompleteResult(RemoteCatalogResults, PollResult):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
@@ -537,11 +678,13 @@ class PollCatalogCompleteResult(RemoteCatalogResults, PollResult):
@dataclass
@schema_version('poll-remote-in-progress-result', 1)
class PollInProgressResult(PollResult):
pass
@dataclass
@schema_version('poll-remote-get-manifest-result', 1)
class PollGetManifestResult(GetManifestResult, PollResult):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
@@ -566,6 +709,35 @@ class PollGetManifestResult(GetManifestResult, PollResult):
elapsed=timing.elapsed,
)
@dataclass
@schema_version('poll-remote-freshness-result', 1)
class PollFreshnessResult(RemoteFreshnessResult, PollResult):
state: TaskHandlerState = field(
metadata=restrict_to(TaskHandlerState.Success,
TaskHandlerState.Failed),
)
@classmethod
def from_result(
cls: Type['PollFreshnessResult'],
base: RemoteFreshnessResult,
tags: TaskTags,
timing: TaskTiming,
logs: List[LogMessage],
) -> 'PollFreshnessResult':
return cls(
logs=logs,
tags=tags,
state=timing.state,
start=timing.start,
end=timing.end,
elapsed=timing.elapsed,
metadata=base.metadata,
results=base.results,
elapsed_time=base.elapsed_time,
)
# Manifest parsing types
@@ -577,6 +749,7 @@ class ManifestStatus(StrEnum):
@dataclass
@schema_version('remote-status-result', 1)
class LastParse(RemoteResult):
state: ManifestStatus = ManifestStatus.Init
logs: List[LogMessage] = field(default_factory=list)

View File

@@ -8,6 +8,7 @@ from typing import List, Dict, Any, Union
class SelectorDefinition(JsonSchemaMixin):
name: str
definition: Union[str, Dict[str, Any]]
description: str = ''
@dataclass

View File

@@ -0,0 +1,18 @@
from pathlib import Path
from .graph.manifest import WritableManifest
from typing import Optional
from dbt.exceptions import IncompatibleSchemaException
class PreviousState:
def __init__(self, path: Path):
self.path: Path = path
self.manifest: Optional[WritableManifest] = None
manifest_path = self.path / 'manifest.json'
if manifest_path.exists() and manifest_path.is_file():
try:
self.manifest = WritableManifest.read(str(manifest_path))
except IncompatibleSchemaException as exc:
exc.add_filename(str(manifest_path))
raise
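PreviousState eagerly loads manifest.json from whatever directory it is given, so the state: selector has an old manifest to compare against. A hedged usage sketch; the path is illustrative:

from pathlib import Path
from dbt.contracts.state import PreviousState

state = PreviousState(Path('prev-run-artifacts'))
if state.manifest is None:
    print('no comparison manifest found; state: selectors will fail')
else:
    print(f'loaded {len(state.manifest.nodes)} nodes for comparison')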

View File

@@ -1,8 +1,22 @@
import dataclasses
from typing import List
import os
from datetime import datetime
from typing import (
List, Tuple, ClassVar, Type, TypeVar, Dict, Any, Optional
)
from dbt.clients.system import write_json, read_json
from dbt.exceptions import RuntimeException
from dbt.exceptions import (
IncompatibleSchemaException,
InternalException,
RuntimeException,
)
from dbt.version import __version__
from dbt.tracking import get_invocation_id
from hologram import JsonSchemaMixin
MacroKey = Tuple[str, str]
SourceKey = Tuple[str, str]
def list_str() -> List[str]:
@@ -90,3 +104,94 @@ class Readable:
) from exc
return cls.from_dict(data) # type: ignore
BASE_SCHEMAS_URL = 'https://schemas.getdbt.com/dbt/{name}/v{version}.json'
@dataclasses.dataclass
class SchemaVersion:
name: str
version: int
def __str__(self) -> str:
return BASE_SCHEMAS_URL.format(
name=self.name,
version=self.version,
)
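str() of a SchemaVersion is the public URL under schemas.getdbt.com, which is what gets embedded in artifact metadata:

assert str(SchemaVersion(name='run-results', version=1)) == \
    'https://schemas.getdbt.com/dbt/run-results/v1.json'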
SCHEMA_VERSION_KEY = 'dbt_schema_version'
METADATA_ENV_PREFIX = 'DBT_ENV_CUSTOM_ENV_'
def get_metadata_env() -> Dict[str, str]:
return {
k[len(METADATA_ENV_PREFIX):]: v for k, v in os.environ.items()
if k.startswith(METADATA_ENV_PREFIX)
}
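Any environment variable carrying the DBT_ENV_CUSTOM_ENV_ prefix is copied into artifact metadata with the prefix stripped. A quick sketch, assuming no other prefixed variables are set:

import os

os.environ['DBT_ENV_CUSTOM_ENV_RUN_CONTEXT'] = 'ci'
assert get_metadata_env() == {'RUN_CONTEXT': 'ci'}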
@dataclasses.dataclass
class BaseArtifactMetadata(JsonSchemaMixin):
dbt_schema_version: str
dbt_version: str = __version__
generated_at: datetime = dataclasses.field(
default_factory=datetime.utcnow
)
invocation_id: Optional[str] = dataclasses.field(
default_factory=get_invocation_id
)
env: Dict[str, str] = dataclasses.field(default_factory=get_metadata_env)
def schema_version(name: str, version: int):
def inner(cls: Type[VersionedSchema]):
cls.dbt_schema_version = SchemaVersion(
name=name,
version=version,
)
return cls
return inner
@dataclasses.dataclass
class VersionedSchema(JsonSchemaMixin):
dbt_schema_version: ClassVar[SchemaVersion]
@classmethod
def json_schema(cls, embeddable: bool = False) -> Dict[str, Any]:
result = super().json_schema(embeddable=embeddable)
if not embeddable:
result['$id'] = str(cls.dbt_schema_version)
return result
T = TypeVar('T', bound='ArtifactMixin')
# metadata should really be a Generic[T_M] where T_M is a TypeVar bound to
# BaseArtifactMetadata. Unfortunately this isn't possible due to a mypy issue:
# https://github.com/python/mypy/issues/7520
@dataclasses.dataclass(init=False)
class ArtifactMixin(VersionedSchema, Writable, Readable):
metadata: BaseArtifactMetadata
@classmethod
def from_dict(
cls: Type[T], data: Dict[str, Any], validate: bool = True
) -> T:
if cls.dbt_schema_version is None:
raise InternalException(
'Cannot call from_dict with no schema version!'
)
if validate:
expected = str(cls.dbt_schema_version)
found = data.get('metadata', {}).get(SCHEMA_VERSION_KEY)
if found != expected:
raise IncompatibleSchemaException(expected, found)
return super().from_dict(data=data, validate=validate)
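Putting the pieces together, a hypothetical artifact class (not part of dbt) built on this machinery; it shows the version guard in ArtifactMixin.from_dict and assumes the names defined or imported in this module are in scope:

@dataclasses.dataclass(init=False)
@schema_version('example-artifact', 1)
class ExampleArtifact(ArtifactMixin):
    metadata: BaseArtifactMetadata

stale = {'metadata': {'dbt_schema_version':
                      'https://schemas.getdbt.com/dbt/example-artifact/v0.json'}}
try:
    ExampleArtifact.from_dict(stale)
except IncompatibleSchemaException as exc:
    print(exc)  # Expected a schema version of ".../v1.json" ... but found ".../v0.json"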

View File

@@ -3,6 +3,8 @@ from typing import Optional, Set, List, Dict, ClassVar
import dbt.exceptions
from dbt import ui
import dbt.tracking
class DBTDeprecation:
_name: ClassVar[Optional[str]] = None
@@ -16,6 +18,12 @@ class DBTDeprecation:
'name not implemented for {}'.format(self)
)
def track_deprecation_warn(self) -> None:
if dbt.tracking.active_user is not None:
dbt.tracking.track_deprecation_warn({
"deprecation_name": self.name
})
@property
def description(self) -> str:
if self._description is not None:
@@ -31,6 +39,7 @@ class DBTDeprecation:
desc, prefix='* Deprecation Warning: '
)
dbt.exceptions.warn_or_error(msg)
self.track_deprecation_warn()
active_deprecations.add(self.name)
@@ -90,23 +99,6 @@ class ModelsKeyNonModelDeprecation(DBTDeprecation):
'''
class DbtProjectYamlDeprecation(DBTDeprecation):
_name = 'dbt-project-yaml-v1'
_description = '''\
dbt v0.17.0 introduces a new config format for the dbt_project.yml file.
Support for the existing version 1 format will be removed in a future
release of dbt. The following packages are currently configured with
config version 1:{project_names}
For upgrading instructions, consult the documentation:
https://docs.getdbt.com/docs/guides/migration-guide/upgrading-to-0-17-0
'''
class ExecuteMacrosReleaseDeprecation(DBTDeprecation):
_name = 'execute-macro-release'
_description = '''\
@@ -116,6 +108,15 @@ class ExecuteMacrosReleaseDeprecation(DBTDeprecation):
'''
class AdapterMacroDeprecation(DBTDeprecation):
_name = 'adapter-macro'
_description = '''\
The "adapter_macro" macro has been deprecated. Instead, use the
`adapter.dispatch` method to find a macro and call the result.
adapter_macro was called for: {macro_name}
'''
_adapter_renamed_description = """\
The adapter function `adapter.{old_name}` is deprecated and will be removed in
a future release of dbt. Please use `adapter.{new_name}` instead.
@@ -158,8 +159,8 @@ deprecations_list: List[DBTDeprecation] = [
NotADictionaryDeprecation(),
ColumnQuotingDeprecation(),
ModelsKeyNonModelDeprecation(),
DbtProjectYamlDeprecation(),
ExecuteMacrosReleaseDeprecation(),
AdapterMacroDeprecation(),
]
deprecations: Dict[str, DBTDeprecation] = {

View File

@@ -127,7 +127,7 @@ def resolve_packages(
final = PackageListing()
ctx = generate_target_context(config, config.cli_vars)
renderer = DbtProjectYamlRenderer(ctx, config.config_version)
renderer = DbtProjectYamlRenderer(ctx)
while pending:
next_pending = PackageListing()

View File

@@ -132,7 +132,7 @@ class RuntimeException(RuntimeError, Exception):
result.update({
'raw_sql': self.node.raw_sql,
# the node isn't always compiled, but if it is, include that!
'compiled_sql': getattr(self.node, 'injected_sql', None),
'compiled_sql': getattr(self.node, 'compiled_sql', None),
})
return result
@@ -257,6 +257,34 @@ class JSONValidationException(ValidationException):
return (JSONValidationException, (self.typename, self.errors))
class IncompatibleSchemaException(RuntimeException):
def __init__(self, expected: str, found: Optional[str]):
self.expected = expected
self.found = found
self.filename = 'input file'
super().__init__(self.get_message())
def add_filename(self, filename: str):
self.filename = filename
self.msg = self.get_message()
def get_message(self) -> str:
found_str = 'nothing'
if self.found is not None:
found_str = f'"{self.found}"'
msg = (
f'Expected a schema version of "{self.expected}" in '
f'{self.filename}, but found {found_str}. Are you running with a '
f'different version of dbt?'
)
return msg
CODE = 10014
MESSAGE = "Incompatible Schema"
class JinjaRenderingException(CompilationException):
pass

View File

@@ -1,7 +1,11 @@
import os
import multiprocessing
if os.name != 'nt':
# https://bugs.python.org/issue41567
import multiprocessing.popen_spawn_posix # type: ignore
from pathlib import Path
from typing import Optional
# initially all flags are set to None, the on-load call of reset() will set
# them for their first time.
STRICT_MODE = None
@@ -11,6 +15,7 @@ WARN_ERROR = None
TEST_NEW_PARSER = None
WRITE_JSON = None
PARTIAL_PARSE = None
USE_COLORS = None
def env_set_truthy(key: str) -> Optional[str]:
@@ -48,7 +53,7 @@ MP_CONTEXT = _get_context()
def reset():
global STRICT_MODE, FULL_REFRESH, USE_CACHE, WARN_ERROR, TEST_NEW_PARSER, \
WRITE_JSON, PARTIAL_PARSE, MP_CONTEXT
WRITE_JSON, PARTIAL_PARSE, MP_CONTEXT, USE_COLORS
STRICT_MODE = False
FULL_REFRESH = False
@@ -58,11 +63,12 @@ def reset():
WRITE_JSON = True
PARTIAL_PARSE = False
MP_CONTEXT = _get_context()
USE_COLORS = True
def set_from_args(args):
global STRICT_MODE, FULL_REFRESH, USE_CACHE, WARN_ERROR, TEST_NEW_PARSER, \
WRITE_JSON, PARTIAL_PARSE, MP_CONTEXT
WRITE_JSON, PARTIAL_PARSE, MP_CONTEXT, USE_COLORS
USE_CACHE = getattr(args, 'use_cache', USE_CACHE)
@@ -78,6 +84,13 @@ def set_from_args(args):
PARTIAL_PARSE = getattr(args, 'partial_parse', None)
MP_CONTEXT = _get_context()
# The use_colors attribute will always have a value because it is assigned
# None by default from the add_mutually_exclusive_group function
use_colors_override = getattr(args, 'use_colors')
if use_colors_override is not None:
USE_COLORS = use_colors_override
# initialize everything to the defaults on module load
reset()

View File

@@ -1,5 +1,6 @@
# special support for CLI argument parsing.
import itertools
import yaml
from typing import (
Dict, List, Optional, Tuple, Any, Union
@@ -18,7 +19,7 @@ from .selector_spec import (
INTERSECTION_DELIMITER = ','
DEFAULT_INCLUDES: List[str] = ['fqn:*', 'source:*']
DEFAULT_INCLUDES: List[str] = ['fqn:*', 'source:*', 'exposure:*']
DEFAULT_EXCLUDES: List[str] = []
DATA_TEST_SELECTOR: str = 'test_type:data'
SCHEMA_TEST_SELECTOR: str = 'test_type:schema'
@@ -116,8 +117,7 @@ def _get_list_dicts(
values = dct[key]
if not isinstance(values, list):
raise ValidationException(
f'Invalid value type {type(values)} in key "{key}" '
f'(value "{values}")'
f'Invalid value for key "{key}". Expected a list.'
)
for value in values:
if isinstance(value, dict):
@@ -165,9 +165,10 @@ def _parse_include_exclude_subdefs(
if isinstance(definition, dict) and 'exclude' in definition:
# do not allow multiple exclude: defs at the same level
if diff_arg is not None:
yaml_sel_cfg = yaml.dump(definition)
raise ValidationException(
f'Got multiple exclusion definitions in definition list '
f'{definitions}'
f"You cannot provide multiple exclude arguments to the "
f"same selector set operator:\n{yaml_sel_cfg}"
)
diff_arg = _parse_exclusions(definition)
else:
@@ -198,6 +199,7 @@ def parse_intersection_definition(
intersection_def_parts = _get_list_dicts(definition, 'intersection')
include, exclude = _parse_include_exclude_subdefs(intersection_def_parts)
intersection = SelectionIntersection(components=include)
if exclude is None:
intersection.raw = definition
return intersection
@@ -210,7 +212,6 @@ def parse_intersection_definition(
def parse_dict_definition(definition: Dict[str, Any]) -> SelectionSpec:
diff_arg: Optional[SelectionSpec] = None
if len(definition) == 1:
key = list(definition)[0]
value = definition[key]
@@ -230,7 +231,7 @@ def parse_dict_definition(definition: Dict[str, Any]) -> SelectionSpec:
dct = {k: v for k, v in dct.items() if k != 'exclude'}
else:
raise ValidationException(
f'Expected exactly 1 key in the selection definition or "method" '
f'Expected either 1 key or else "method" '
f'and "value" keys, but got {list(definition)}'
)
@@ -242,7 +243,18 @@ def parse_dict_definition(definition: Dict[str, Any]) -> SelectionSpec:
return SelectionDifference(components=[base, diff_arg])
def parse_from_definition(definition: RawDefinition) -> SelectionSpec:
def parse_from_definition(
definition: RawDefinition, rootlevel=False
) -> SelectionSpec:
if (isinstance(definition, dict) and
('union' in definition or 'intersection' in definition) and
rootlevel and len(definition) > 1):
keys = ",".join(definition.keys())
raise ValidationException(
f"Only a single 'union' or 'intersection' key is allowed "
f"in a root level selector definition; found {keys}."
)
if isinstance(definition, str):
return SelectionCriteria.from_single_spec(definition)
elif 'union' in definition:
@@ -253,8 +265,8 @@ def parse_from_definition(definition: RawDefinition) -> SelectionSpec:
return parse_dict_definition(definition)
else:
raise ValidationException(
f'Expected to find str or dict, instead found '
f'{type(definition)}: {definition}'
f'Expected to find union, intersection, str or dict, instead '
f'found {type(definition)}: {definition}'
)
@@ -264,5 +276,6 @@ def parse_from_selectors_definition(
result: Dict[str, SelectionSpec] = {}
selector: SelectorDefinition
for selector in source.selectors:
result[selector.name] = parse_from_definition(selector.definition)
result[selector.name] = parse_from_definition(selector.definition,
rootlevel=True)
return result

View File

@@ -7,8 +7,8 @@ from typing import (
import networkx as nx # type: ignore
from .graph import UniqueId
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.parsed import ParsedSourceDefinition, ParsedExposure
from dbt.contracts.graph.compiled import GraphMemberNode
from dbt.contracts.graph.manifest import Manifest
from dbt.node_types import NodeType
@@ -50,8 +50,8 @@ class GraphQueue:
node = self.manifest.expect(node_id)
if node.resource_type != NodeType.Model:
return False
# must be a Model - tell mypy this won't be a Source
assert not isinstance(node, ParsedSourceDefinition)
# must be a Model - tell mypy this won't be a Source or Exposure
assert not isinstance(node, (ParsedSourceDefinition, ParsedExposure))
if node.is_ephemeral:
return False
return True
@@ -84,7 +84,7 @@ class GraphQueue:
def get(
self, block: bool = True, timeout: Optional[float] = None
) -> CompileResultNode:
) -> GraphMemberNode:
"""Get a node off the inner priority queue. By default, this blocks.
This takes the lock, but only for part of it.

View File

@@ -1,5 +1,5 @@
from typing import Set, List, Union
from typing import Set, List, Optional
from .graph import Graph, UniqueId
from .queue import GraphQueue
@@ -13,9 +13,9 @@ from dbt.exceptions import (
InvalidSelectorException,
warn_or_error,
)
from dbt.contracts.graph.compiled import NonSourceNode, CompileResultNode
from dbt.contracts.graph.compiled import GraphMemberNode
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.state import PreviousState
def get_package_names(nodes):
@@ -37,9 +37,10 @@ class NodeSelector(MethodManager):
self,
graph: Graph,
manifest: Manifest,
previous_state: Optional[PreviousState] = None,
):
super().__init__(manifest, previous_state)
self.full_graph = graph
self.manifest = manifest
# build a subgraph containing only non-empty, enabled nodes and enabled
# sources.
@@ -128,24 +129,25 @@ class NodeSelector(MethodManager):
if unique_id in self.manifest.sources:
source = self.manifest.sources[unique_id]
return source.config.enabled
elif unique_id in self.manifest.exposures:
return True
node = self.manifest.nodes[unique_id]
return not node.empty and node.config.enabled
def node_is_match(
self,
node: Union[ParsedSourceDefinition, NonSourceNode],
) -> bool:
def node_is_match(self, node: GraphMemberNode) -> bool:
"""Determine if a node is a match for the selector. Non-match nodes
will be excluded from results during filtering.
"""
return True
def _is_match(self, unique_id: UniqueId) -> bool:
node: CompileResultNode
node: GraphMemberNode
if unique_id in self.manifest.nodes:
node = self.manifest.nodes[unique_id]
elif unique_id in self.manifest.sources:
node = self.manifest.sources[unique_id]
elif unique_id in self.manifest.exposures:
node = self.manifest.exposures[unique_id]
else:
raise InternalException(
f'Node {unique_id} not found in the manifest!'
@@ -195,11 +197,13 @@ class ResourceTypeSelector(NodeSelector):
self,
graph: Graph,
manifest: Manifest,
previous_state: Optional[PreviousState],
resource_types: List[NodeType],
):
super().__init__(
graph=graph,
manifest=manifest,
previous_state=previous_state,
)
self.resource_types: Set[NodeType] = set(resource_types)

View File

@@ -1,7 +1,7 @@
import abc
from itertools import chain
from pathlib import Path
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional
from hologram.helpers import StrEnum
@@ -10,20 +10,25 @@ from .graph import UniqueId
from dbt.contracts.graph.compiled import (
CompiledDataTestNode,
CompiledSchemaTestNode,
NonSourceNode,
CompileResultNode,
ManifestNode,
)
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.manifest import Manifest, WritableManifest
from dbt.contracts.graph.parsed import (
HasTestMetadata,
ParsedDataTestNode,
ParsedExposure,
ParsedSchemaTestNode,
ParsedSourceDefinition,
)
from dbt.contracts.state import PreviousState
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.exceptions import (
InternalException,
RuntimeException,
)
from dbt.node_types import NodeType
from dbt.ui import warning_tag
SELECTOR_GLOB = '*'
@@ -40,6 +45,8 @@ class MethodName(StrEnum):
TestName = 'test_name'
TestType = 'test_type'
ResourceType = 'resource_type'
State = 'state'
Exposure = 'exposure'
def is_selected_node(real_node, node_selector):
@@ -68,18 +75,24 @@ def is_selected_node(real_node, node_selector):
return True
SelectorTarget = Union[ParsedSourceDefinition, NonSourceNode]
SelectorTarget = Union[ParsedSourceDefinition, ManifestNode, ParsedExposure]
class SelectorMethod(metaclass=abc.ABCMeta):
def __init__(self, manifest: Manifest, arguments: List[str]):
def __init__(
self,
manifest: Manifest,
previous_state: Optional[PreviousState],
arguments: List[str]
):
self.manifest: Manifest = manifest
self.previous_state = previous_state
self.arguments: List[str] = arguments
def parsed_nodes(
self,
included_nodes: Set[UniqueId]
) -> Iterator[Tuple[UniqueId, NonSourceNode]]:
) -> Iterator[Tuple[UniqueId, ManifestNode]]:
for key, node in self.manifest.nodes.items():
unique_id = UniqueId(key)
@@ -98,13 +111,39 @@ class SelectorMethod(metaclass=abc.ABCMeta):
continue
yield unique_id, source
def exposure_nodes(
self,
included_nodes: Set[UniqueId]
) -> Iterator[Tuple[UniqueId, ParsedExposure]]:
for key, exposure in self.manifest.exposures.items():
unique_id = UniqueId(key)
if unique_id not in included_nodes:
continue
yield unique_id, exposure
def all_nodes(
self,
included_nodes: Set[UniqueId]
) -> Iterator[Tuple[UniqueId, SelectorTarget]]:
yield from chain(self.parsed_nodes(included_nodes),
self.source_nodes(included_nodes),
self.exposure_nodes(included_nodes))
def configurable_nodes(
self,
included_nodes: Set[UniqueId]
) -> Iterator[Tuple[UniqueId, CompileResultNode]]:
yield from chain(self.parsed_nodes(included_nodes),
self.source_nodes(included_nodes))
def non_source_nodes(
self,
included_nodes: Set[UniqueId],
) -> Iterator[Tuple[UniqueId, Union[ParsedExposure, ManifestNode]]]:
yield from chain(self.parsed_nodes(included_nodes),
self.exposure_nodes(included_nodes))
@abc.abstractmethod
def search(
self,
@@ -199,8 +238,37 @@ class SourceSelectorMethod(SelectorMethod):
continue
if target_source not in (real_node.source_name, SELECTOR_GLOB):
continue
if target_table in (None, real_node.name, SELECTOR_GLOB):
yield node
if target_table not in (None, real_node.name, SELECTOR_GLOB):
continue
yield node
class ExposureSelectorMethod(SelectorMethod):
def search(
self, included_nodes: Set[UniqueId], selector: str
) -> Iterator[UniqueId]:
parts = selector.split('.')
target_package = SELECTOR_GLOB
if len(parts) == 1:
target_name = parts[0]
elif len(parts) == 2:
target_package, target_name = parts
else:
msg = (
'Invalid exposure selector value "{}". Exposures must be of '
'the form ${{exposure_name}} or '
'${{exposure_package.exposure_name}}'
).format(selector)
raise RuntimeException(msg)
for node, real_node in self.exposure_nodes(included_nodes):
if target_package not in (real_node.package_name, SELECTOR_GLOB):
continue
if target_name not in (real_node.name, SELECTOR_GLOB):
continue
yield node
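Exposure selectors accept either a bare exposure name or package_name.exposure_name, with * acting as a wildcard for either part. A tiny standalone mirror of the split performed above; the names are illustrative:

for selector in ('weekly_kpis', 'my_package.weekly_kpis', 'my_package.*'):
    parts = selector.split('.')
    package, name = ('*', parts[0]) if len(parts) == 1 else parts
    print(f'{selector!r} -> package={package!r}, name={name!r}')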
class PathSelectorMethod(SelectorMethod):
@@ -274,7 +342,7 @@ class ConfigSelectorMethod(SelectorMethod):
# search sources is kind of useless now source configs only have
# 'enabled', which you can't really filter on anyway, but maybe we'll
# add more someday, so search them anyway.
for node, real_node in self.all_nodes(included_nodes):
for node, real_node in self.configurable_nodes(included_nodes):
try:
value = _getattr_descend(real_node.config, parts)
except AttributeError:
@@ -329,6 +397,97 @@ class TestTypeSelectorMethod(SelectorMethod):
yield node
class StateSelectorMethod(SelectorMethod):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.macros_were_modified: Optional[List[str]] = None
def _macros_modified(self) -> List[str]:
# we checked in the caller!
if self.previous_state is None or self.previous_state.manifest is None:
raise InternalException(
'No comparison manifest in _macros_modified'
)
old_macros = self.previous_state.manifest.macros
new_macros = self.manifest.macros
modified = []
for uid, macro in new_macros.items():
name = f'{macro.package_name}.{macro.name}'
if uid in old_macros:
old_macro = old_macros[uid]
if macro.macro_sql != old_macro.macro_sql:
modified.append(f'{name} changed')
else:
modified.append(f'{name} added')
for uid, macro in old_macros.items():
if uid not in new_macros:
modified.append(f'{macro.package_name}.{macro.name} removed')
return modified[:3]
def check_modified(
self,
old: Optional[SelectorTarget],
new: SelectorTarget,
) -> bool:
# check if there are any changes in macros, if so, log a warning the
# first time
if self.macros_were_modified is None:
self.macros_were_modified = self._macros_modified()
if self.macros_were_modified:
log_str = ', '.join(self.macros_were_modified)
logger.warning(warning_tag(
f'During a state comparison, dbt detected a change in '
f'macros. This will not be marked as a modification. Some '
f'macros: {log_str}'
))
return not new.same_contents(old) # type: ignore
def check_new(
self,
old: Optional[SelectorTarget],
new: SelectorTarget,
) -> bool:
return old is None
def search(
self, included_nodes: Set[UniqueId], selector: str
) -> Iterator[UniqueId]:
if self.previous_state is None or self.previous_state.manifest is None:
raise RuntimeException(
'Got a state selector method, but no comparison manifest'
)
state_checks = {
'modified': self.check_modified,
'new': self.check_new,
}
if selector in state_checks:
checker = state_checks[selector]
else:
raise RuntimeException(
f'Got an invalid selector "{selector}", expected one of '
f'"{list(state_checks)}"'
)
manifest: WritableManifest = self.previous_state.manifest
for node, real_node in self.all_nodes(included_nodes):
previous_node: Optional[SelectorTarget] = None
if node in manifest.nodes:
previous_node = manifest.nodes[node]
elif node in manifest.sources:
previous_node = manifest.sources[node]
elif node in manifest.exposures:
previous_node = manifest.exposures[node]
if checker(previous_node, real_node):
yield node
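A minimal usage sketch (not part of this diff) of the new state method, assuming manifest and previous_state were already built from the current project and from the artifacts referenced by --state:

    state_method = StateSelectorMethod(manifest, previous_state, [])
    included = set(manifest.nodes)  # assumption: the candidate unique ids
    modified = list(state_method.search(included, 'modified'))
    new = list(state_method.search(included, 'new'))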
class MethodManager:
SELECTOR_METHODS: Dict[MethodName, Type[SelectorMethod]] = {
MethodName.FQN: QualifiedNameSelectorMethod,
@@ -339,10 +498,17 @@ class MethodManager:
MethodName.Config: ConfigSelectorMethod,
MethodName.TestName: TestNameSelectorMethod,
MethodName.TestType: TestTypeSelectorMethod,
MethodName.State: StateSelectorMethod,
MethodName.Exposure: ExposureSelectorMethod,
}
def __init__(self, manifest: Manifest):
def __init__(
self,
manifest: Manifest,
previous_state: Optional[PreviousState],
):
self.manifest = manifest
self.previous_state = previous_state
def get_method(
self, method: MethodName, method_arguments: List[str]
@@ -354,4 +520,4 @@ class MethodManager:
f'method name, but it is not handled'
)
cls: Type[SelectorMethod] = self.SELECTOR_METHODS[method]
return cls(self.manifest, method_arguments)
return cls(self.manifest, self.previous_state, method_arguments)
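With the registry above, callers obtain methods by name, and both new methods follow the same pattern. A hedged sketch reusing the manifest and previous_state from the earlier sketch; the exposure names are illustrative and follow the two accepted forms, 'name' or 'package.name':

    manager = MethodManager(manifest, previous_state)
    exposure_method = manager.get_method(MethodName.Exposure, [])
    list(exposure_method.search(included, 'weekly_report'))
    list(exposure_method.search(included, 'my_package.weekly_report'))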

View File

@@ -93,7 +93,9 @@ class SelectionCriteria:
try:
method_name = MethodName(method_parts[0])
except ValueError as exc:
raise InvalidSelectorException(method_parts[0]) from exc
raise InvalidSelectorException(
f"'{method_parts[0]}' is not a valid method name"
) from exc
method_arguments: List[str] = method_parts[1:]
@@ -121,6 +123,26 @@ class SelectionCriteria:
children_depth=children_depth,
)
@classmethod
def dict_from_single_spec(cls, raw: str):
result = RAW_SELECTOR_PATTERN.match(raw)
if result is None:
return {'error': 'Invalid selector spec'}
dct: Dict[str, Any] = result.groupdict()
method_name, method_arguments = cls.parse_method(dct)
meth_name = str(method_name)
if method_arguments:
meth_name = meth_name + '.' + '.'.join(method_arguments)
dct['method'] = meth_name
dct = {k: v for k, v in dct.items() if (v is not None and v != '')}
if 'childrens_parents' in dct:
dct['childrens_parents'] = bool(dct.get('childrens_parents'))
if 'parents' in dct:
dct['parents'] = bool(dct.get('parents'))
if 'children' in dct:
dct['children'] = bool(dct.get('children'))
return dct
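dict_from_single_spec is an introspection helper: it parses a raw spec, flattens the method name plus its dotted arguments into a single 'method' key, drops empty groups, and coerces the graph-operator groups to booleans. Roughly (the exact keys come from RAW_SELECTOR_PATTERN's group names, which this hunk does not show):

    SelectionCriteria.dict_from_single_spec('config.materialized:view')
    # roughly {'method': 'config.materialized', 'value': 'view'}
    SelectionCriteria.dict_from_single_spec('@source:raw.orders')
    # roughly {'childrens_parents': True, 'method': 'source', 'value': 'raw.orders'}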
@classmethod
def from_single_spec(cls, raw: str) -> 'SelectionCriteria':
result = RAW_SELECTOR_PATTERN.match(raw)

View File

@@ -1,5 +1,5 @@
{% macro get_columns_in_query(select_sql) -%}
{{ return(adapter_macro('get_columns_in_query', select_sql)) }}
{{ return(adapter.dispatch('get_columns_in_query')(select_sql)) }}
{% endmacro %}
{% macro default__get_columns_in_query(select_sql) %}
@@ -15,7 +15,7 @@
{% endmacro %}
{% macro create_schema(relation) -%}
{{ adapter_macro('create_schema', relation) }}
{{ adapter.dispatch('create_schema')(relation) }}
{% endmacro %}
{% macro default__create_schema(relation) -%}
@@ -25,7 +25,7 @@
{% endmacro %}
{% macro drop_schema(relation) -%}
{{ adapter_macro('drop_schema', relation) }}
{{ adapter.dispatch('drop_schema')(relation) }}
{% endmacro %}
{% macro default__drop_schema(relation) -%}
@@ -35,7 +35,7 @@
{% endmacro %}
{% macro create_table_as(temporary, relation, sql) -%}
{{ adapter_macro('create_table_as', temporary, relation, sql) }}
{{ adapter.dispatch('create_table_as')(temporary, relation, sql) }}
{%- endmacro %}
{% macro default__create_table_as(temporary, relation, sql) -%}
@@ -52,7 +52,7 @@
{% endmacro %}
{% macro create_view_as(relation, sql) -%}
{{ adapter_macro('create_view_as', relation, sql) }}
{{ adapter.dispatch('create_view_as')(relation, sql) }}
{%- endmacro %}
{% macro default__create_view_as(relation, sql) -%}
@@ -66,7 +66,7 @@
{% macro get_catalog(information_schema, schemas) -%}
{{ return(adapter_macro('get_catalog', information_schema, schemas)) }}
{{ return(adapter.dispatch('get_catalog')(information_schema, schemas)) }}
{%- endmacro %}
{% macro default__get_catalog(information_schema, schemas) -%}
@@ -81,7 +81,7 @@
{% macro get_columns_in_relation(relation) -%}
{{ return(adapter_macro('get_columns_in_relation', relation)) }}
{{ return(adapter.dispatch('get_columns_in_relation')(relation)) }}
{% endmacro %}
{% macro sql_convert_columns_in_relation(table) -%}
@@ -98,13 +98,13 @@
{% endmacro %}
{% macro alter_column_type(relation, column_name, new_column_type) -%}
{{ return(adapter_macro('alter_column_type', relation, column_name, new_column_type)) }}
{{ return(adapter.dispatch('alter_column_type')(relation, column_name, new_column_type)) }}
{% endmacro %}
{% macro alter_column_comment(relation, column_dict) -%}
{{ return(adapter_macro('alter_column_comment', relation, column_dict)) }}
{{ return(adapter.dispatch('alter_column_comment')(relation, column_dict)) }}
{% endmacro %}
{% macro default__alter_column_comment(relation, column_dict) -%}
@@ -113,7 +113,7 @@
{% endmacro %}
{% macro alter_relation_comment(relation, relation_comment) -%}
{{ return(adapter_macro('alter_relation_comment', relation, relation_comment)) }}
{{ return(adapter.dispatch('alter_relation_comment')(relation, relation_comment)) }}
{% endmacro %}
{% macro default__alter_relation_comment(relation, relation_comment) -%}
@@ -122,7 +122,7 @@
{% endmacro %}
{% macro persist_docs(relation, model, for_relation=true, for_columns=true) -%}
{{ return(adapter_macro('persist_docs', relation, model, for_relation, for_columns)) }}
{{ return(adapter.dispatch('persist_docs')(relation, model, for_relation, for_columns)) }}
{% endmacro %}
{% macro default__persist_docs(relation, model, for_relation, for_columns) -%}
@@ -157,7 +157,7 @@
{% macro drop_relation(relation) -%}
{{ return(adapter_macro('drop_relation', relation)) }}
{{ return(adapter.dispatch('drop_relation')(relation)) }}
{% endmacro %}
@@ -168,7 +168,7 @@
{% endmacro %}
{% macro truncate_relation(relation) -%}
{{ return(adapter_macro('truncate_relation', relation)) }}
{{ return(adapter.dispatch('truncate_relation')(relation)) }}
{% endmacro %}
@@ -179,7 +179,7 @@
{% endmacro %}
{% macro rename_relation(from_relation, to_relation) -%}
{{ return(adapter_macro('rename_relation', from_relation, to_relation)) }}
{{ return(adapter.dispatch('rename_relation')(from_relation, to_relation)) }}
{% endmacro %}
{% macro default__rename_relation(from_relation, to_relation) -%}
@@ -191,7 +191,7 @@
{% macro information_schema_name(database) %}
{{ return(adapter_macro('information_schema_name', database)) }}
{{ return(adapter.dispatch('information_schema_name')(database)) }}
{% endmacro %}
{% macro default__information_schema_name(database) -%}
@@ -204,7 +204,7 @@
{% macro list_schemas(database) -%}
{{ return(adapter_macro('list_schemas', database)) }}
{{ return(adapter.dispatch('list_schemas')(database)) }}
{% endmacro %}
{% macro default__list_schemas(database) -%}
@@ -218,7 +218,7 @@
{% macro check_schema_exists(information_schema, schema) -%}
{{ return(adapter_macro('check_schema_exists', information_schema, schema)) }}
{{ return(adapter.dispatch('check_schema_exists')(information_schema, schema)) }}
{% endmacro %}
{% macro default__check_schema_exists(information_schema, schema) -%}
@@ -233,7 +233,7 @@
{% macro list_relations_without_caching(schema_relation) %}
{{ return(adapter_macro('list_relations_without_caching', schema_relation)) }}
{{ return(adapter.dispatch('list_relations_without_caching')(schema_relation)) }}
{% endmacro %}
@@ -244,7 +244,7 @@
{% macro current_timestamp() -%}
{{ adapter_macro('current_timestamp') }}
{{ adapter.dispatch('current_timestamp')() }}
{%- endmacro %}
@@ -255,7 +255,7 @@
{% macro collect_freshness(source, loaded_at_field, filter) %}
{{ return(adapter_macro('collect_freshness', source, loaded_at_field, filter))}}
{{ return(adapter.dispatch('collect_freshness')(source, loaded_at_field, filter))}}
{% endmacro %}
@@ -273,7 +273,7 @@
{% endmacro %}
{% macro make_temp_relation(base_relation, suffix='__dbt_tmp') %}
{{ return(adapter_macro('make_temp_relation', base_relation, suffix))}}
{{ return(adapter.dispatch('make_temp_relation')(base_relation, suffix))}}
{% endmacro %}
{% macro default__make_temp_relation(base_relation, suffix) %}
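Throughout this file the deprecated adapter_macro helper is replaced by adapter.dispatch, which looks for an implementation named <adapter>__<macro> and falls back to default__<macro>. A hedged sketch of how an adapter package would override one of the macros above (the postgres prefix is illustrative):

    {% macro postgres__create_schema(relation) -%}
      {#-- illustrative override: dispatch('create_schema') picks this on a
           postgres target, and default__create_schema otherwise --#}
      {% call statement('create_schema') %}
        create schema if not exists {{ relation.without_identifier() }}
      {% endcall %}
    {%- endmacro %}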

View File

@@ -7,15 +7,15 @@
{{ write(sql) }}
{%- endif -%}
{%- set status, res = adapter.execute(sql, auto_begin=auto_begin, fetch=fetch_result) -%}
{%- set res, table = adapter.execute(sql, auto_begin=auto_begin, fetch=fetch_result) -%}
{%- if name is not none -%}
{{ store_result(name, status=status, agate_table=res) }}
{{ store_result(name, response=res, agate_table=table) }}
{%- endif -%}
{%- endif -%}
{%- endmacro %}
{% macro noop_statement(name=None, status=None, res=None) -%}
{% macro noop_statement(name=None, message=None, code=None, rows_affected=None, res=None) -%}
{%- set sql = caller() -%}
{%- if name == 'main' -%}
@@ -24,7 +24,7 @@
{%- endif -%}
{%- if name is not none -%}
{{ store_result(name, status=status, agate_table=res) }}
{{ store_raw_result(name, message=message, code=code, rows_affected=rows_affected, agate_table=res) }}
{%- endif -%}
{%- endmacro %}
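The statement macro now unpacks adapter.execute into a response object plus an agate table and stores them under response= rather than status=. A hedged usage sketch (the model name is illustrative, and the exact shape of load_result's return value is assumed rather than shown here):

    {% call statement('row_count', fetch_result=True) %}
      select count(*) as n from {{ ref('my_model') }}
    {% endcall %}
    {% set row_count = load_result('row_count') %}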

View File

@@ -14,7 +14,7 @@
#}
{% macro generate_database_name(custom_database_name=none, node=none) -%}
{% do return(adapter_macro('generate_database_name', custom_database_name, node)) %}
{% do return(adapter.dispatch('generate_database_name')(custom_database_name, node)) %}
{%- endmacro %}
{% macro default__generate_database_name(custom_database_name=none, node=none) -%}

View File

@@ -1,17 +1,17 @@
{% macro get_merge_sql(target, source, unique_key, dest_columns, predicates=none) -%}
{{ adapter_macro('get_merge_sql', target, source, unique_key, dest_columns, predicates) }}
{{ adapter.dispatch('get_merge_sql')(target, source, unique_key, dest_columns, predicates) }}
{%- endmacro %}
{% macro get_delete_insert_merge_sql(target, source, unique_key, dest_columns) -%}
{{ adapter_macro('get_delete_insert_merge_sql', target, source, unique_key, dest_columns) }}
{{ adapter.dispatch('get_delete_insert_merge_sql')(target, source, unique_key, dest_columns) }}
{%- endmacro %}
{% macro get_insert_overwrite_merge_sql(target, source, dest_columns, predicates, include_sql_header=false) -%}
{{ adapter_macro('get_insert_overwrite_merge_sql', target, source, dest_columns, predicates, include_sql_header) }}
{{ adapter.dispatch('get_insert_overwrite_merge_sql')(target, source, dest_columns, predicates, include_sql_header) }}
{%- endmacro %}
@@ -97,7 +97,7 @@
merge into {{ target }} as DBT_INTERNAL_DEST
using {{ source }} as DBT_INTERNAL_SOURCE
on FALSE
when not matched by source
{% if predicates %} and {{ predicates | join(' and ') }} {% endif %}
then delete

View File

@@ -1,14 +1,14 @@
{% macro create_csv_table(model, agate_table) -%}
{{ adapter_macro('create_csv_table', model, agate_table) }}
{{ adapter.dispatch('create_csv_table')(model, agate_table) }}
{%- endmacro %}
{% macro reset_csv_table(model, full_refresh, old_relation, agate_table) -%}
{{ adapter_macro('reset_csv_table', model, full_refresh, old_relation, agate_table) }}
{{ adapter.dispatch('reset_csv_table')(model, full_refresh, old_relation, agate_table) }}
{%- endmacro %}
{% macro load_csv_rows(model, agate_table) -%}
{{ adapter_macro('load_csv_rows', model, agate_table) }}
{{ adapter.dispatch('load_csv_rows')(model, agate_table) }}
{%- endmacro %}
{% macro default__create_csv_table(model, agate_table) %}
@@ -112,7 +112,7 @@
{%- set exists_as_view = (old_relation is not none and old_relation.is_view) -%}
{%- set agate_table = load_agate_table() -%}
{%- do store_result('agate_table', status='OK', agate_table=agate_table) -%}
{%- do store_result('agate_table', response='OK', agate_table=agate_table) -%}
{{ run_hooks(pre_hooks, inside_transaction=False) }}
@@ -129,11 +129,11 @@
{% set create_table_sql = create_csv_table(model, agate_table) %}
{% endif %}
{% set status = 'CREATE' if full_refresh_mode else 'INSERT' %}
{% set num_rows = (agate_table.rows | length) %}
{% set code = 'CREATE' if full_refresh_mode else 'INSERT' %}
{% set rows_affected = (agate_table.rows | length) %}
{% set sql = load_csv_rows(model, agate_table) %}
{% call noop_statement('main', status ~ ' ' ~ num_rows) %}
{% call noop_statement('main', code ~ ' ' ~ rows_affected, code, rows_affected) %}
{{ create_table_sql }};
-- dbt seed --
{{ sql }}

View File

@@ -2,7 +2,7 @@
Add new columns to the table if applicable
#}
{% macro create_columns(relation, columns) %}
{{ adapter_macro('create_columns', relation, columns) }}
{{ adapter.dispatch('create_columns')(relation, columns) }}
{% endmacro %}
{% macro default__create_columns(relation, columns) %}
@@ -15,7 +15,7 @@
{% macro post_snapshot(staging_relation) %}
{{ adapter_macro('post_snapshot', staging_relation) }}
{{ adapter.dispatch('post_snapshot')(staging_relation) }}
{% endmacro %}
{% macro default__post_snapshot(staging_relation) %}
@@ -37,6 +37,7 @@
{{ strategy.unique_key }} as dbt_unique_key
from {{ target_relation }}
where dbt_valid_to is null
),
@@ -65,6 +66,17 @@
from snapshot_query
),
{%- if strategy.invalidate_hard_deletes %}
deletes_source_data as (
select
*,
{{ strategy.unique_key }} as dbt_unique_key
from snapshot_query
),
{% endif %}
insertions as (
select
@@ -76,7 +88,6 @@
where snapshotted_data.dbt_unique_key is null
or (
snapshotted_data.dbt_unique_key is not null
and snapshotted_data.dbt_valid_to is null
and (
{{ strategy.row_changed }}
)
@@ -93,15 +104,37 @@
from updates_source_data as source_data
join snapshotted_data on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where snapshotted_data.dbt_valid_to is null
and (
where (
{{ strategy.row_changed }}
)
)
{%- if strategy.invalidate_hard_deletes -%}
,
deletes as (
select
'delete' as dbt_change_type,
source_data.*,
{{ snapshot_get_time() }} as dbt_valid_from,
{{ snapshot_get_time() }} as dbt_updated_at,
{{ snapshot_get_time() }} as dbt_valid_to,
snapshotted_data.dbt_scd_id
from snapshotted_data
left join deletes_source_data as source_data on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where source_data.dbt_unique_key is null
)
{%- endif %}
select * from insertions
union all
select * from updates
{%- if strategy.invalidate_hard_deletes %}
union all
select * from deletes
{%- endif %}
{%- endmacro %}
@@ -181,7 +214,7 @@
{% if not target_relation_exists %}
{% set build_sql = build_snapshot_table(strategy, model['injected_sql']) %}
{% set build_sql = build_snapshot_table(strategy, model['compiled_sql']) %}
{% set final_sql = create_table_as(False, target_relation, build_sql) %}
{% else %}

View File

@@ -1,6 +1,6 @@
{% macro snapshot_merge_sql(target, source, insert_cols) -%}
{{ adapter_macro('snapshot_merge_sql', target, source, insert_cols) }}
{{ adapter.dispatch('snapshot_merge_sql')(target, source, insert_cols) }}
{%- endmacro %}
@@ -13,7 +13,7 @@
when matched
and DBT_INTERNAL_DEST.dbt_valid_to is null
and DBT_INTERNAL_SOURCE.dbt_change_type = 'update'
and DBT_INTERNAL_SOURCE.dbt_change_type in ('update', 'delete')
then update
set dbt_valid_to = DBT_INTERNAL_SOURCE.dbt_valid_to

View File

@@ -36,7 +36,7 @@
Create SCD Hash SQL fields cross-db
#}
{% macro snapshot_hash_arguments(args) -%}
{{ adapter_macro('snapshot_hash_arguments', args) }}
{{ adapter.dispatch('snapshot_hash_arguments')(args) }}
{%- endmacro %}
@@ -52,7 +52,7 @@
Get the current time cross-db
#}
{% macro snapshot_get_time() -%}
{{ adapter_macro('snapshot_get_time') }}
{{ adapter.dispatch('snapshot_get_time')() }}
{%- endmacro %}
{% macro default__snapshot_get_time() -%}
@@ -66,6 +66,7 @@
{% macro snapshot_timestamp_strategy(node, snapshotted_rel, current_rel, config, target_exists) %}
{% set primary_key = config['unique_key'] %}
{% set updated_at = config['updated_at'] %}
{% set invalidate_hard_deletes = config.get('invalidate_hard_deletes', false) %}
{#/*
The snapshot relation might not have an {{ updated_at }} value if the
@@ -86,13 +87,14 @@
"unique_key": primary_key,
"updated_at": updated_at,
"row_changed": row_changed_expr,
"scd_id": scd_id_expr
"scd_id": scd_id_expr,
"invalidate_hard_deletes": invalidate_hard_deletes
}) %}
{% endmacro %}
{% macro snapshot_string_as_time(timestamp) -%}
{{ adapter_macro('snapshot_string_as_time', timestamp) }}
{{ adapter.dispatch('snapshot_string_as_time')(timestamp) }}
{%- endmacro %}
@@ -104,7 +106,7 @@
{% macro snapshot_check_all_get_existing_columns(node, target_exists) -%}
{%- set query_columns = get_columns_in_query(node['injected_sql']) -%}
{%- set query_columns = get_columns_in_query(node['compiled_sql']) -%}
{%- if not target_exists -%}
{# no table yet -> return whatever the query does #}
{{ return([false, query_columns]) }}
@@ -131,6 +133,8 @@
{% macro snapshot_check_strategy(node, snapshotted_rel, current_rel, config, target_exists) %}
{% set check_cols_config = config['check_cols'] %}
{% set primary_key = config['unique_key'] %}
{% set invalidate_hard_deletes = config.get('invalidate_hard_deletes', false) %}
{% set select_current_time -%}
select {{ snapshot_get_time() }} as snapshot_start
{%- endset %}
@@ -160,7 +164,11 @@
{%- for col in check_cols -%}
{{ snapshotted_rel }}.{{ col }} != {{ current_rel }}.{{ col }}
or
({{ snapshotted_rel }}.{{ col }} is null) != ({{ current_rel }}.{{ col }} is null)
(
(({{ snapshotted_rel }}.{{ col }} is null) and not ({{ current_rel }}.{{ col }} is null))
or
((not {{ snapshotted_rel }}.{{ col }} is null) and ({{ current_rel }}.{{ col }} is null))
)
{%- if not loop.last %} or {% endif -%}
{%- endfor -%}
{%- endif -%}
@@ -173,6 +181,7 @@
"unique_key": primary_key,
"updated_at": updated_at,
"row_changed": row_changed_expr,
"scd_id": scd_id_expr
"scd_id": scd_id_expr,
"invalidate_hard_deletes": invalidate_hard_deletes
}) %}
{% endmacro %}
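Both snapshot strategies now read an invalidate_hard_deletes config, which is what enables the extra deletes_source_data and deletes CTEs in the staging query earlier in this diff. A hedged example of a snapshot opting in (the schema, key, and source names are illustrative):

    {% snapshot orders_snapshot %}
      {{ config(
          target_schema='snapshots',
          unique_key='id',
          strategy='timestamp',
          updated_at='updated_at',
          invalidate_hard_deletes=true
      ) }}
      select * from {{ source('jaffle_shop', 'orders') }}
    {% endsnapshot %}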

View File

@@ -1,6 +1,6 @@
{% macro handle_existing_table(full_refresh, old_relation) %}
{{ adapter_macro("dbt.handle_existing_table", full_refresh, old_relation) }}
{{ adapter.dispatch("handle_existing_table", packages=['dbt'])(full_refresh, old_relation) }}
{% endmacro %}
{% macro default__handle_existing_table(full_refresh, old_relation) %}

View File

@@ -1,5 +1,5 @@
{% macro test_accepted_values(model, values) %}
{% macro default__test_accepted_values(model, values) %}
{% set column_name = kwargs.get('column_name', kwargs.get('field')) %}
{% set quote_values = kwargs.get('quote', True) %}
@@ -35,3 +35,9 @@ select count(*) as validation_errors
from validation_errors
{% endmacro %}
{% macro test_accepted_values(model, values) %}
{% set macro = adapter.dispatch('test_accepted_values') %}
{{ macro(model, values, **kwargs) }}
{% endmacro %}

View File

@@ -1,5 +1,5 @@
{% macro test_not_null(model) %}
{% macro default__test_not_null(model) %}
{% set column_name = kwargs.get('column_name', kwargs.get('arg')) %}
@@ -9,3 +9,9 @@ where {{ column_name }} is null
{% endmacro %}
{% macro test_not_null(model) %}
{% set macro = adapter.dispatch('test_not_null') %}
{{ macro(model, **kwargs) }}
{% endmacro %}

View File

@@ -1,5 +1,5 @@
{% macro test_relationships(model, to, field) %}
{% macro default__test_relationships(model, to, field) %}
{% set column_name = kwargs.get('column_name', kwargs.get('from')) %}
@@ -16,3 +16,9 @@ where child.id is not null
{% endmacro %}
{% macro test_relationships(model, to, field) %}
{% set macro = adapter.dispatch('test_relationships') %}
{{ macro(model, to, field, **kwargs) }}
{% endmacro %}

View File

@@ -1,5 +1,5 @@
{% macro test_unique(model) %}
{% macro default__test_unique(model) %}
{% set column_name = kwargs.get('column_name', kwargs.get('arg')) %}
@@ -17,3 +17,9 @@ from (
) validation_errors
{% endmacro %}
{% macro test_unique(model) %}
{% set macro = adapter.dispatch('test_unique') %}
{{ macro(model, **kwargs) }}
{% endmacro %}
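Each built-in schema test is now a thin wrapper that dispatches to default__test_<name>, so an adapter plugin can override it by defining <adapter>__test_<name>. A hedged sketch (the bigquery prefix is illustrative; without such a macro, dispatch falls back to the default implementation):

    {% macro bigquery__test_unique(model) %}
      {{ default__test_unique(model, **kwargs) }}
    {% endmacro %}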

File diff suppressed because one or more lines are too long

View File

@@ -1,214 +0,0 @@
# TODO: rename this module.
from typing import Dict, Any, Mapping, List
from typing_extensions import Protocol, runtime_checkable
import dbt.exceptions
from dbt.utils import deep_merge, fqn_search
from dbt.node_types import NodeType
from dbt.adapters.factory import get_config_class_by_name
class HasConfigFields(Protocol):
seeds: Dict[str, Any]
snapshots: Dict[str, Any]
models: Dict[str, Any]
sources: Dict[str, Any]
@runtime_checkable
class IsFQNResource(Protocol):
fqn: List[str]
resource_type: NodeType
package_name: str
def _listify(value) -> List:
if isinstance(value, tuple):
value = list(value)
elif not isinstance(value, list):
value = [value]
return value
class ConfigUpdater:
AppendListFields = {'pre-hook', 'post-hook', 'tags'}
ExtendDictFields = {'vars', 'column_types', 'quoting', 'persist_docs'}
DefaultClobberFields = {
'enabled',
'materialized',
# these 2 are additional - not defined in the NodeConfig object
'sql_header',
'incremental_strategy',
# these 3 are "special" - not defined in NodeConfig, instead set by
# update_parsed_node_name in parsing
'alias',
'schema',
'database',
# tests
'severity',
# snapshots
'unique_key',
'target_database',
'target_schema',
'strategy',
'updated_at',
# this is often a list, but it should replace and not append (sometimes
# it's 'all')
'check_cols',
# seeds
'quote_columns',
}
@property
def ClobberFields(self):
return self.DefaultClobberFields | self.AdapterSpecificConfigs
@property
def ConfigKeys(self):
return (
self.AppendListFields | self.ExtendDictFields | self.ClobberFields
)
def __init__(self, adapter_type: str):
config_class = get_config_class_by_name(adapter_type)
self.AdapterSpecificConfigs = {
target_name for _, target_name in
config_class._get_fields()
}
def update_config_keys_into(
self, mutable_config: Dict[str, Any], new_configs: Mapping[str, Any]
) -> Dict[str, Any]:
"""Update mutable_config with the contents of new_configs, but only
include "expected" config values.
Returns dict where the keys are what was updated and the update values
are what the updates were.
"""
relevant_configs: Dict[str, Any] = {
key: new_configs[key] for key
in new_configs if key in self.ConfigKeys
}
for key in self.AppendListFields:
append_fields = _listify(relevant_configs.get(key, []))
mutable_config[key].extend([
f for f in append_fields if f not in mutable_config[key]
])
for key in self.ExtendDictFields:
dict_val = relevant_configs.get(key, {})
try:
mutable_config[key].update(dict_val)
except (ValueError, TypeError, AttributeError):
dbt.exceptions.raise_compiler_error(
'Invalid config field: "{}" must be a dict'.format(key)
)
for key in self.ClobberFields:
if key in relevant_configs:
mutable_config[key] = relevant_configs[key]
return relevant_configs
def update_into(
self, mutable_config: Dict[str, Any], new_config: Mapping[str, Any]
) -> None:
"""Update mutable_config with the contents of new_config."""
for key, value in new_config.items():
if key in self.AppendListFields:
current_list: List = _listify(mutable_config.get(key, []))
current_list.extend(_listify(value))
mutable_config[key] = current_list
elif key in self.ExtendDictFields:
current_dict: Dict = mutable_config.get(key, {})
try:
current_dict.update(value)
except (ValueError, TypeError, AttributeError):
dbt.exceptions.raise_compiler_error(
'Invalid config field: "{}" must be a dict'.format(key)
)
mutable_config[key] = current_dict
else: # key in self.ClobberFields
mutable_config[key] = value
def get_project_config(
self, model: IsFQNResource, project: HasConfigFields
) -> Dict[str, Any]:
# most configs are overwritten by a more specific config, but pre/post
# hooks are appended!
config: Dict[str, Any] = {}
for k in self.AppendListFields:
config[k] = []
for k in self.ExtendDictFields:
config[k] = {}
if model.resource_type == NodeType.Seed:
model_configs = project.seeds
elif model.resource_type == NodeType.Snapshot:
model_configs = project.snapshots
elif model.resource_type == NodeType.Source:
model_configs = project.sources
else:
model_configs = project.models
if model_configs is None:
return config
# mutates config
self.update_config_keys_into(config, model_configs)
for level_config in fqn_search(model_configs, model.fqn):
relevant_configs = self.update_config_keys_into(
config, level_config
)
# mutates config
relevant_configs = self.update_config_keys_into(
config, level_config
)
# TODO: does this do anything? Doesn't update_config_keys_into
# handle the clobber case?
clobber_configs = {
k: v for (k, v) in relevant_configs.items()
if k not in self.AppendListFields and
k not in self.ExtendDictFields
}
config.update(clobber_configs)
return config
def get_project_vars(
self, project_vars: Dict[str, Any],
):
config: Dict[str, Any] = {}
# this is pretty trivial, since the new project vars don't care about
# FQNs or resource types
self.update_config_keys_into(config, project_vars)
return config
def merge(self, *configs: Dict[str, Any]) -> Dict[str, Any]:
merged_config: Dict[str, Any] = {}
for config in configs:
# Do not attempt to deep merge clobber fields
config = config.copy()
clobber = {
key: config.pop(key) for key in list(config.keys())
if key in self.ClobberFields
}
intermediary_merged = deep_merge(
merged_config, config
)
intermediary_merged.update(clobber)
merged_config.update(intermediary_merged)
return merged_config

View File

@@ -23,6 +23,7 @@ import dbt.task.generate as generate_task
import dbt.task.serve as serve_task
import dbt.task.freshness as freshness_task
import dbt.task.run_operation as run_operation_task
import dbt.task.parse as parse_task
from dbt.profiler import profiler
from dbt.task.list import ListTask
from dbt.task.rpc.server import RPCServerTask
@@ -364,6 +365,14 @@ def _build_init_subparser(subparsers, base_subparser):
Name of the new project
''',
)
sub.add_argument(
'--adapter',
default='redshift',
type=str,
help='''
Write sample profiles.yml for which adapter
''',
)
sub.set_defaults(cls=init_task.InitTask, which='init', rpc_method=None)
return sub
@@ -398,6 +407,7 @@ def _build_debug_subparser(subparsers, base_subparser):
If specified, DBT will show path information for this project
'''
)
_add_version_check(sub)
sub.set_defaults(cls=debug_task.DebugTask, which='debug', rpc_method=None)
return sub
@@ -436,6 +446,21 @@ def _build_snapshot_subparser(subparsers, base_subparser):
return sub
def _add_defer_argument(*subparsers):
for sub in subparsers:
sub.add_optional_argument_inverse(
'--defer',
enable_help='''
If set, defer to the state variable for resolving unselected nodes.
''',
disable_help='''
If set, do not defer to the state variable for resolving unselected
nodes.
''',
default=flags.DEFER_MODE,
)
def _build_run_subparser(subparsers, base_subparser):
run_sub = subparsers.add_parser(
'run',
@@ -453,28 +478,6 @@ def _build_run_subparser(subparsers, base_subparser):
'''
)
# for now, this is a "dbt run"-only thing
run_sub.add_argument(
'--state',
help='''
If set, use the given directory as the source for json files to compare
with this project.
''',
type=Path,
default=flags.ARTIFACT_STATE_PATH,
)
run_sub.add_optional_argument_inverse(
'--defer',
enable_help='''
If set, defer to the state variable for resolving unselected nodes.
''',
disable_help='''
If set, do not defer to the state variable for resolving unselected
nodes.
''',
default=flags.DEFER_MODE,
)
run_sub.set_defaults(cls=run_task.RunTask, which='run', rpc_method='run')
return run_sub
@@ -494,6 +497,21 @@ def _build_compile_subparser(subparsers, base_subparser):
return sub
def _build_parse_subparser(subparsers, base_subparser):
sub = subparsers.add_parser(
'parse',
parents=[base_subparser],
help='''
Parse the project and provide information on performance
'''
)
sub.set_defaults(cls=parse_task.ParseTask, which='parse',
rpc_method='parse')
sub.add_argument('--write-manifest', action='store_true')
sub.add_argument('--compile', action='store_true')
return sub
def _build_docs_generate_subparser(subparsers, base_subparser):
# it might look like docs_sub is the correct parents entry, but that
# will cause weird errors about 'conflicting option strings'.
@@ -511,35 +529,79 @@ def _build_docs_generate_subparser(subparsers, base_subparser):
return generate_sub
def _add_models_argument(sub, help_override=None, **kwargs):
help_str = '''
Specify the models to include.
'''
if help_override is not None:
help_str = help_override
sub.add_argument(
'-m',
'--models',
dest='models',
nargs='+',
help=help_str,
**kwargs
)
def _add_select_argument(sub, dest='models', help_override=None, **kwargs):
help_str = '''
Specify the nodes to include.
'''
if help_override is not None:
help_str = help_override
sub.add_argument(
'-s',
'--select',
dest=dest,
nargs='+',
help=help_str,
**kwargs
)
def _add_common_selector_arguments(sub):
sub.add_argument(
'--exclude',
required=False,
nargs='+',
help='''
Specify the models to exclude.
''',
)
sub.add_argument(
'--selector',
dest='selector_name',
metavar='SELECTOR_NAME',
help='''
The selector name to use, as defined in selectors.yml
'''
)
sub.add_argument(
'--state',
help='''
If set, use the given directory as the source for json files to
compare with this project.
''',
type=Path,
default=flags.ARTIFACT_STATE_PATH,
)
def _add_selection_arguments(*subparsers, **kwargs):
models_name = kwargs.get('models_name', 'models')
for sub in subparsers:
sub.add_argument(
'-{}'.format(models_name[0]),
'--{}'.format(models_name),
dest='models',
required=False,
nargs='+',
help='''
Specify the models to include.
''',
)
sub.add_argument(
'--exclude',
required=False,
nargs='+',
help='''
Specify the models to exclude.
''',
)
sub.add_argument(
'--selector',
dest='selector_name',
metavar='SELECTOR_NAME',
help='''
The selector name to use, as defined in selectors.yml
'''
)
if models_name == 'models':
_add_models_argument(sub)
elif models_name == 'select':
# these still get stored in 'models', so they present the same
# interface to the task
_add_select_argument(sub)
else:
raise InternalException(f'Unknown models style {models_name}')
_add_common_selector_arguments(sub)
def _add_table_mutability_arguments(*subparsers):
@@ -554,6 +616,18 @@ def _add_table_mutability_arguments(*subparsers):
)
def _add_version_check(sub):
sub.add_argument(
'--no-version-check',
dest='version_check',
action='store_false',
help='''
If set, skip ensuring dbt's version matches the one specified in
the dbt_project.yml file ('require-dbt-version')
'''
)
def _add_common_arguments(*subparsers):
for sub in subparsers:
sub.add_argument(
@@ -565,15 +639,7 @@ def _add_common_arguments(*subparsers):
settings in profiles.yml.
'''
)
sub.add_argument(
'--no-version-check',
dest='version_check',
action='store_false',
help='''
If set, skip ensuring dbt's version matches the one specified in
the dbt_project.yml file ('require-dbt-version')
'''
)
_add_version_check(sub)
def _build_seed_subparser(subparsers, base_subparser):
@@ -752,44 +818,24 @@ def _build_list_subparser(subparsers, base_subparser):
sub.add_argument('--output',
choices=['json', 'name', 'path', 'selector'],
default='selector')
sub.add_argument(
'-s',
'--select',
required=False,
nargs='+',
metavar='SELECTOR',
help='''
Specify the nodes to select.
''',
)
sub.add_argument(
'-m',
'--models',
required=False,
nargs='+',
metavar='SELECTOR',
help='''
_add_models_argument(
sub,
help_override='''
Specify the models to select and set the resource-type to 'model'.
Mutually exclusive with '--select' (or '-s') and '--resource-type'
''',
)
sub.add_argument(
'--exclude',
required=False,
nargs='+',
metavar='SELECTOR',
help='''
Specify the models to exclude.
'''
required=False
)
sub.add_argument(
'--selector',
metavar='SELECTOR_NAME',
dest='selector_name',
help='''
The selector name to use, as defined in selectors.yml
'''
_add_select_argument(
sub,
dest='select',
metavar='SELECTOR',
required=False,
)
_add_common_selector_arguments(sub)
return sub
@@ -879,6 +925,30 @@ def parse_args(args, cls=DBTArgumentParser):
If set, skip writing the manifest and run_results.json files to disk
'''
)
colors_flag = p.add_mutually_exclusive_group()
colors_flag.add_argument(
'--use-colors',
action='store_const',
const=True,
dest='use_colors',
help='''
Colorize the output DBT prints to the terminal. Output is colorized by
default and may also be set in a profile or at the command line.
Mutually exclusive with --no-use-colors
'''
)
colors_flag.add_argument(
'--no-use-colors',
action='store_const',
const=False,
dest='use_colors',
help='''
Do not colorize the output DBT prints to the terminal. Output is
colorized by default and may also be set in a profile or at the
command line.
Mutually exclusive with --use-colors
'''
)
p.add_argument(
'-S',
@@ -954,15 +1024,19 @@ def parse_args(args, cls=DBTArgumentParser):
rpc_sub = _build_rpc_subparser(subs, base_subparser)
run_sub = _build_run_subparser(subs, base_subparser)
compile_sub = _build_compile_subparser(subs, base_subparser)
parse_sub = _build_parse_subparser(subs, base_subparser)
generate_sub = _build_docs_generate_subparser(docs_subs, base_subparser)
test_sub = _build_test_subparser(subs, base_subparser)
seed_sub = _build_seed_subparser(subs, base_subparser)
# --threads, --no-version-check
_add_common_arguments(run_sub, compile_sub, generate_sub, test_sub,
rpc_sub, seed_sub)
rpc_sub, seed_sub, parse_sub)
# --models, --exclude
# list_sub sets up its own arguments.
_add_selection_arguments(run_sub, compile_sub, generate_sub, test_sub)
_add_selection_arguments(snapshot_sub, seed_sub, models_name='select')
# --defer
_add_defer_argument(run_sub, test_sub)
# --full-refresh
_add_table_mutability_arguments(run_sub, compile_sub)

View File

@@ -14,6 +14,7 @@ class NodeType(StrEnum):
Documentation = 'docs'
Source = 'source'
Macro = 'macro'
Exposure = 'exposure'
@classmethod
def executable(cls) -> List['NodeType']:
@@ -45,6 +46,7 @@ class NodeType(StrEnum):
cls.Source,
cls.Macro,
cls.Analysis,
cls.Exposure
]
def pluralize(self) -> str:

View File

@@ -18,11 +18,12 @@ from dbt.adapters.factory import get_adapter
from dbt.clients.jinja import get_rendered
from dbt.config import Project, RuntimeConfig
from dbt.context.context_config import (
LegacyContextConfig, ContextConfig, ContextConfigType
ContextConfig
)
from dbt.contracts.graph.manifest import (
Manifest, SourceFile, FilePath, FileHash
from dbt.contracts.files import (
SourceFile, FilePath, FileHash
)
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import HasUniqueID
from dbt.contracts.graph.unparsed import UnparsedNode
from dbt.exceptions import (
@@ -76,11 +77,19 @@ class BaseParser(Generic[FinalValue]):
self.project.project_name,
resource_name)
def load_file(self, path: FilePath) -> SourceFile:
def load_file(
self,
path: FilePath,
*,
set_contents: bool = True,
) -> SourceFile:
file_contents = load_file_contents(path.absolute_path, strip=False)
checksum = FileHash.from_contents(file_contents)
source_file = SourceFile(path=path, checksum=checksum)
source_file.contents = file_contents.strip()
if set_contents:
source_file.contents = file_contents.strip()
else:
source_file.contents = ''
return source_file
@@ -214,7 +223,7 @@ class ConfiguredParser(
self,
block: ConfiguredBlockType,
path: str,
config: ContextConfigType,
config: ContextConfig,
fqn: List[str],
name=None,
**kwargs,
@@ -239,6 +248,7 @@ class ConfiguredParser(
'raw_sql': block.contents,
'unique_id': self.generate_unique_id(name),
'config': self.config_dict(config),
'checksum': block.file.checksum.to_dict(),
}
dct.update(kwargs)
try:
@@ -256,16 +266,16 @@ class ConfiguredParser(
raise CompilationException(msg, node=node)
def _context_for(
self, parsed_node: IntermediateNode, config: ContextConfigType
self, parsed_node: IntermediateNode, config: ContextConfig
) -> Dict[str, Any]:
return generate_parser_model(
parsed_node, self.root_project, self.macro_manifest, config
)
def render_with_context(
self, parsed_node: IntermediateNode, config: ContextConfigType
self, parsed_node: IntermediateNode, config: ContextConfig
) -> None:
"""Given the parsed node and a ContextConfigType to use during parsing,
"""Given the parsed node and a ContextConfig to use during parsing,
render the node's sql with macro capture enabled.
Note: this mutates the config object when config() calls are rendered.
@@ -297,9 +307,9 @@ class ConfiguredParser(
self._update_node_alias(parsed_node, config_dict)
def update_parsed_node(
self, parsed_node: IntermediateNode, config: ContextConfigType
self, parsed_node: IntermediateNode, config: ContextConfig
) -> None:
"""Given the ContextConfigType used for parsing and the parsed node,
"""Given the ContextConfig used for parsing and the parsed node,
generate and set the true values to use, overriding the temporary parse
values set in _build_intermediate_parsed_node.
"""
@@ -309,6 +319,10 @@ class ConfiguredParser(
model_tags = config_dict.get('tags', [])
parsed_node.tags.extend(model_tags)
parsed_node.unrendered_config = config.build_config_dict(
rendered=False
)
# do this once before we parse the node database/schema/alias, so
# parsed_node.config is what it would be if they did nothing
self.update_parsed_node_config(parsed_node, config_dict)
@@ -327,20 +341,11 @@ class ConfiguredParser(
for hook in hooks:
get_rendered(hook.sql, context, parsed_node, capture_macros=True)
def initial_config(self, fqn: List[str]) -> ContextConfigType:
def initial_config(self, fqn: List[str]) -> ContextConfig:
config_version = min(
[self.project.config_version, self.root_project.config_version]
)
# grab a list of the existing project names. This is for var conversion
all_projects = self.root_project.load_dependencies()
if config_version == 1:
return LegacyContextConfig(
self.root_project.as_v1(all_projects),
self.project.as_v1(all_projects),
fqn,
self.resource_type,
)
elif config_version == 2:
if config_version == 2:
return ContextConfig(
self.root_project,
fqn,
@@ -350,18 +355,18 @@ class ConfiguredParser(
else:
raise InternalException(
f'Got an unexpected project version={config_version}, '
f'expected 1 or 2'
f'expected 2'
)
def config_dict(
self, config: ContextConfigType,
self, config: ContextConfig,
) -> Dict[str, Any]:
config_dict = config.build_config_dict(base=True)
self._mangle_hooks(config_dict)
return config_dict
def render_update(
self, node: IntermediateNode, config: ContextConfigType
self, node: IntermediateNode, config: ContextConfig
) -> None:
try:
self.render_with_context(node, config)
@@ -381,7 +386,7 @@ class ConfiguredParser(
compiled_path: str = self.get_compiled_path(block)
fqn = self.get_fqn(compiled_path, block.name)
config: ContextConfigType = self.initial_config(fqn)
config: ContextConfig = self.initial_config(fqn)
node = self._create_parsetime_node(
block=block,

View File

@@ -1,8 +1,8 @@
from dataclasses import dataclass
from typing import Iterable, Iterator, Union, List, Tuple
from dbt.context.context_config import ContextConfigType
from dbt.contracts.graph.manifest import FilePath
from dbt.context.context_config import ContextConfig
from dbt.contracts.files import FilePath
from dbt.contracts.graph.parsed import ParsedHookNode
from dbt.exceptions import InternalException
from dbt.node_types import NodeType, RunHookType
@@ -89,7 +89,7 @@ class HookParser(SimpleParser[HookBlock, ParsedHookNode]):
self,
block: HookBlock,
path: str,
config: ContextConfigType,
config: ContextConfig,
fqn: List[str],
name=None,
**kwargs,

View File

@@ -1,18 +1,18 @@
from dataclasses import dataclass
from dataclasses import field
import os
import pickle
from datetime import datetime
from typing import (
Dict, Optional, Mapping, Callable, Any, List, Type, Union, MutableMapping
)
import time
import dbt.exceptions
import dbt.tracking
import dbt.flags as flags
from dbt import deprecations
from dbt.adapters.factory import (
get_relation_class_by_name,
get_adapter_package_names,
get_include_paths,
)
from dbt.helper_types import PathSet
from dbt.logger import GLOBAL_LOGGER as logger, DbtProcessState
@@ -21,11 +21,13 @@ from dbt.clients.jinja import get_rendered
from dbt.clients.system import make_directory
from dbt.config import Project, RuntimeConfig
from dbt.context.docs import generate_runtime_docs
from dbt.contracts.graph.compiled import NonSourceNode
from dbt.contracts.graph.manifest import Manifest, FilePath, FileHash, Disabled
from dbt.contracts.files import FilePath, FileHash
from dbt.contracts.graph.compiled import ManifestNode
from dbt.contracts.graph.manifest import Manifest, Disabled
from dbt.contracts.graph.parsed import (
ParsedSourceDefinition, ParsedNode, ParsedMacro, ColumnInfo,
ParsedSourceDefinition, ParsedNode, ParsedMacro, ColumnInfo, ParsedExposure
)
from dbt.contracts.util import Writable
from dbt.exceptions import (
ref_target_not_found,
get_target_not_found_or_disabled_msg,
@@ -49,12 +51,39 @@ from dbt.parser.sources import patch_sources
from dbt.ui import warning_tag
from dbt.version import __version__
from hologram import JsonSchemaMixin
PARTIAL_PARSE_FILE_NAME = 'partial_parse.pickle'
PARSING_STATE = DbtProcessState('parsing')
DEFAULT_PARTIAL_PARSE = False
@dataclass
class ParserInfo(JsonSchemaMixin):
parser: str
elapsed: float
path_count: int = 0
@dataclass
class ProjectLoaderInfo(JsonSchemaMixin):
project_name: str
elapsed: float
parsers: List[ParserInfo]
path_count: int = 0
@dataclass
class ManifestLoaderInfo(JsonSchemaMixin, Writable):
path_count: int = 0
is_partial_parse_enabled: Optional[bool] = None
parse_project_elapsed: Optional[float] = None
patch_sources_elapsed: Optional[float] = None
process_manifest_elapsed: Optional[float] = None
load_all_elapsed: Optional[float] = None
projects: List[ProjectLoaderInfo] = field(default_factory=list)
_parser_types: List[Type[Parser]] = [
ModelParser,
SnapshotParser,
@@ -122,28 +151,26 @@ class ManifestLoader:
root_project, all_projects,
)
self._loaded_file_cache: Dict[str, FileBlock] = {}
self._perf_info = ManifestLoaderInfo(
is_partial_parse_enabled=self._partial_parse_enabled()
)
def _load_macros(
self,
old_results: Optional[ParseResult],
internal_manifest: Optional[Manifest] = None,
) -> None:
projects = self.all_projects
if internal_manifest is not None:
# skip internal packages
packages = get_adapter_package_names(
self.root_project.credentials.type
)
projects = {
k: v for k, v in self.all_projects.items() if k not in packages
}
self.results.macros.update(internal_manifest.macros)
self.results.files.update(internal_manifest.files)
for project in projects.values():
parser = MacroParser(self.results, project)
for path in parser.search():
self.parse_with_cache(path, parser, old_results)
def track_project_load(self):
invocation_id = dbt.tracking.active_user.invocation_id
dbt.tracking.track_project_load({
"invocation_id": invocation_id,
"project_id": self.root_project.hashed_name(),
"path_count": self._perf_info.path_count,
"parse_project_elapsed": self._perf_info.parse_project_elapsed,
"patch_sources_elapsed": self._perf_info.patch_sources_elapsed,
"process_manifest_elapsed": (
self._perf_info.process_manifest_elapsed
),
"load_all_elapsed": self._perf_info.load_all_elapsed,
"is_partial_parse_enabled": (
self._perf_info.is_partial_parse_enabled
),
})
def parse_with_cache(
self,
@@ -195,36 +222,69 @@ class ManifestLoader:
# per-project cache.
self._loaded_file_cache.clear()
project_parser_info: List[ParserInfo] = []
start_timer = time.perf_counter()
total_path_count = 0
for parser in parsers:
parser_path_count = 0
parser_start_timer = time.perf_counter()
for path in parser.search():
self.parse_with_cache(path, parser, old_results)
parser_path_count = parser_path_count + 1
if parser_path_count > 0:
project_parser_info.append(ParserInfo(
parser=parser.resource_type,
path_count=parser_path_count,
elapsed=time.perf_counter() - parser_start_timer
))
total_path_count = total_path_count + parser_path_count
elapsed = time.perf_counter() - start_timer
project_info = ProjectLoaderInfo(
project_name=project.project_name,
path_count=total_path_count,
elapsed=elapsed,
parsers=project_parser_info
)
self._perf_info.projects.append(project_info)
self._perf_info.path_count = (
self._perf_info.path_count + total_path_count
)
def load_only_macros(self) -> Manifest:
old_results = self.read_parse_results()
self._load_macros(old_results, internal_manifest=None)
# make a manifest with just the macros to get the context
macro_manifest = Manifest.from_macros(
macros=self.results.macros,
files=self.results.files
)
return macro_manifest
def load(self, internal_manifest: Optional[Manifest] = None):
old_results = self.read_parse_results()
if old_results is not None:
logger.debug('Got an acceptable cached parse result')
self._load_macros(old_results, internal_manifest=internal_manifest)
for project in self.all_projects.values():
parser = MacroParser(self.results, project)
for path in parser.search():
self.parse_with_cache(path, parser, old_results)
# make a manifest with just the macros to get the context
macro_manifest = Manifest.from_macros(
macros=self.results.macros,
files=self.results.files
)
self.macro_hook(macro_manifest)
return macro_manifest
def load(self, macro_manifest: Manifest):
old_results = self.read_parse_results()
if old_results is not None:
logger.debug('Got an acceptable cached parse result')
self.results.macros.update(macro_manifest.macros)
self.results.files.update(macro_manifest.files)
start_timer = time.perf_counter()
for project in self.all_projects.values():
# parse a single project
self.parse_project(project, macro_manifest, old_results)
self._perf_info.parse_project_elapsed = (
time.perf_counter() - start_timer
)
def write_parse_results(self):
path = os.path.join(self.root_project.target_path,
PARTIAL_PARSE_FILE_NAME)
@@ -324,12 +384,16 @@ class ManifestLoader:
# before we do anything else, patch the sources. This mutates
# results.disabled, so it needs to come before the final 'disabled'
# list is created
start_patch = time.perf_counter()
sources = patch_sources(self.results, self.root_project)
self._perf_info.patch_sources_elapsed = (
time.perf_counter() - start_patch
)
disabled = []
for value in self.results.disabled.values():
disabled.extend(value)
nodes: MutableMapping[str, NonSourceNode] = {
nodes: MutableMapping[str, ManifestNode] = {
k: v for k, v in self.results.nodes.items()
}
@@ -338,47 +402,58 @@ class ManifestLoader:
sources=sources,
macros=self.results.macros,
docs=self.results.docs,
generated_at=datetime.utcnow(),
exposures=self.results.exposures,
metadata=self.root_project.get_metadata(),
disabled=disabled,
files=self.results.files,
selectors=self.root_project.manifest_selectors,
)
manifest.patch_nodes(self.results.patches)
manifest.patch_macros(self.results.macro_patches)
start_process = time.perf_counter()
self.process_manifest(manifest)
self._perf_info.process_manifest_elapsed = (
time.perf_counter() - start_process
)
return manifest
@classmethod
def load_all(
cls,
root_config: RuntimeConfig,
internal_manifest: Optional[Manifest],
macro_manifest: Manifest,
macro_hook: Callable[[Manifest], Any],
) -> Manifest:
with PARSING_STATE:
start_load_all = time.perf_counter()
projects = root_config.load_dependencies()
v1_configs = []
for project in projects.values():
if project.config_version == 1:
v1_configs.append(f'\n\n - {project.project_name}')
if v1_configs:
deprecations.warn(
'dbt-project-yaml-v1',
project_names=''.join(v1_configs)
)
loader = cls(root_config, projects, macro_hook)
loader.load(internal_manifest=internal_manifest)
loader.load(macro_manifest=macro_manifest)
loader.write_parse_results()
manifest = loader.create_manifest()
_check_manifest(manifest, root_config)
manifest.build_flat_graph()
loader._perf_info.load_all_elapsed = (
time.perf_counter() - start_load_all
)
loader.track_project_load()
return manifest
@classmethod
def load_internal(cls, root_config: RuntimeConfig) -> Manifest:
def load_macros(
cls,
root_config: RuntimeConfig,
macro_hook: Callable[[Manifest], Any],
) -> Manifest:
with PARSING_STATE:
projects = load_internal_projects(root_config)
loader = cls(root_config, projects)
projects = root_config.load_dependencies()
loader = cls(root_config, projects, macro_hook)
return loader.load_only_macros()
@@ -432,8 +507,8 @@ def _check_resource_uniqueness(
manifest: Manifest,
config: RuntimeConfig,
) -> None:
names_resources: Dict[str, NonSourceNode] = {}
alias_resources: Dict[str, NonSourceNode] = {}
names_resources: Dict[str, ManifestNode] = {}
alias_resources: Dict[str, ManifestNode] = {}
for resource, node in manifest.nodes.items():
if node.resource_type not in NodeType.refable():
@@ -511,7 +586,7 @@ DocsContextCallback = Callable[
def _process_docs_for_node(
context: Dict[str, Any],
node: NonSourceNode,
node: ManifestNode,
):
node.description = get_rendered(node.description, context)
for column_name, column in node.columns.items():
@@ -543,6 +618,12 @@ def _process_docs_for_macro(
arg.description = get_rendered(arg.description, context)
def _process_docs_for_exposure(
context: Dict[str, Any], exposure: ParsedExposure
) -> None:
exposure.description = get_rendered(exposure.description, context)
def process_docs(manifest: Manifest, config: RuntimeConfig):
for node in manifest.nodes.values():
ctx = generate_runtime_docs(
@@ -568,14 +649,63 @@ def process_docs(manifest: Manifest, config: RuntimeConfig):
config.project_name,
)
_process_docs_for_macro(ctx, macro)
for exposure in manifest.exposures.values():
ctx = generate_runtime_docs(
config,
exposure,
manifest,
config.project_name,
)
_process_docs_for_exposure(ctx, exposure)
def _process_refs_for_exposure(
manifest: Manifest, current_project: str, exposure: ParsedExposure
):
"""Given a manifest and a exposure in that manifest, process its refs"""
for ref in exposure.refs:
target_model: Optional[Union[Disabled, ManifestNode]] = None
target_model_name: str
target_model_package: Optional[str] = None
if len(ref) == 1:
target_model_name = ref[0]
elif len(ref) == 2:
target_model_package, target_model_name = ref
else:
raise dbt.exceptions.InternalException(
f'Refs should always be 1 or 2 arguments - got {len(ref)}'
)
target_model = manifest.resolve_ref(
target_model_name,
target_model_package,
current_project,
exposure.package_name,
)
if target_model is None or isinstance(target_model, Disabled):
# This may raise. Even if it doesn't, we don't want to add
# this exposure to the graph b/c there is no destination exposure
invalid_ref_fail_unless_test(
exposure, target_model_name, target_model_package,
disabled=(isinstance(target_model, Disabled))
)
continue
target_model_id = target_model.unique_id
exposure.depends_on.nodes.append(target_model_id)
manifest.update_exposure(exposure)
def _process_refs_for_node(
manifest: Manifest, current_project: str, node: NonSourceNode
manifest: Manifest, current_project: str, node: ManifestNode
):
"""Given a manifest and a node in that manifest, process its refs"""
for ref in node.refs:
target_model: Optional[Union[Disabled, NonSourceNode]] = None
target_model: Optional[Union[Disabled, ManifestNode]] = None
target_model_name: str
target_model_package: Optional[str] = None
@@ -618,11 +748,37 @@ def _process_refs_for_node(
def process_refs(manifest: Manifest, current_project: str):
for node in manifest.nodes.values():
_process_refs_for_node(manifest, current_project, node)
for exposure in manifest.exposures.values():
_process_refs_for_exposure(manifest, current_project, exposure)
return manifest
def _process_sources_for_exposure(
manifest: Manifest, current_project: str, exposure: ParsedExposure
):
target_source: Optional[Union[Disabled, ParsedSourceDefinition]] = None
for source_name, table_name in exposure.sources:
target_source = manifest.resolve_source(
source_name,
table_name,
current_project,
exposure.package_name,
)
if target_source is None or isinstance(target_source, Disabled):
invalid_source_fail_unless_test(
exposure,
source_name,
table_name,
disabled=(isinstance(target_source, Disabled))
)
continue
target_source_id = target_source.unique_id
exposure.depends_on.nodes.append(target_source_id)
manifest.update_exposure(exposure)
def _process_sources_for_node(
manifest: Manifest, current_project: str, node: NonSourceNode
manifest: Manifest, current_project: str, node: ManifestNode
):
target_source: Optional[Union[Disabled, ParsedSourceDefinition]] = None
for source_name, table_name in node.sources:
@@ -654,6 +810,8 @@ def process_sources(manifest: Manifest, current_project: str):
continue
assert not isinstance(node, ParsedSourceDefinition)
_process_sources_for_node(manifest, current_project, node)
for exposure in manifest.exposures.values():
_process_sources_for_exposure(manifest, current_project, exposure)
return manifest
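_process_refs_for_exposure and _process_sources_for_exposure mirror the node versions: each declared ref or source on an exposure is resolved through the manifest and its unique_id is appended to the exposure's depends_on.nodes. The shapes the loops expect look roughly like this (values are illustrative, not taken from this diff):

    # exposure.refs: one- or two-element lists, [name] or [package, name]
    exposure.refs = [['fct_orders'], ['other_package', 'fct_orders']]
    # exposure.sources: (source_name, table_name) pairs
    exposure.sources = [('raw', 'payments')]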
@@ -670,7 +828,7 @@ def process_macro(
def process_node(
config: RuntimeConfig, manifest: Manifest, node: NonSourceNode
config: RuntimeConfig, manifest: Manifest, node: ManifestNode
):
_process_sources_for_node(
@@ -681,18 +839,16 @@ def process_node(
_process_docs_for_node(ctx, node)
def load_internal_projects(config):
project_paths = get_include_paths(config.credentials.type)
return dict(_load_projects(config, project_paths))
def load_internal_manifest(config: RuntimeConfig) -> Manifest:
return ManifestLoader.load_internal(config)
def load_macro_manifest(
config: RuntimeConfig,
macro_hook: Callable[[Manifest], Any],
) -> Manifest:
return ManifestLoader.load_macros(config, macro_hook)
def load_manifest(
config: RuntimeConfig,
internal_manifest: Optional[Manifest],
macro_manifest: Manifest,
macro_hook: Callable[[Manifest], Any],
) -> Manifest:
return ManifestLoader.load_all(config, internal_manifest, macro_hook)
return ManifestLoader.load_all(config, macro_manifest, macro_hook)
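Net effect on the public loading API: the internal-manifest path is gone and callers now do a two-phase load, building a macro-only manifest first and feeding it into the full load. A hedged sketch, assuming config is a RuntimeConfig and macro_hook is any callable that accepts a Manifest:

    macro_manifest = load_macro_manifest(config, macro_hook)
    manifest = load_manifest(config, macro_manifest, macro_hook)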

View File

@@ -3,9 +3,7 @@ from typing import TypeVar, MutableMapping, Mapping, Union, List
from hologram import JsonSchemaMixin
from dbt.contracts.graph.manifest import (
SourceFile, RemoteFile, FileHash, MacroKey, SourceKey
)
from dbt.contracts.files import RemoteFile, FileHash, SourceFile
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.parsed import (
HasUniqueID,
@@ -17,6 +15,7 @@ from dbt.contracts.graph.parsed import (
ParsedMacroPatch,
ParsedModelNode,
ParsedNodePatch,
ParsedExposure,
ParsedRPCNode,
ParsedSeedNode,
ParsedSchemaTestNode,
@@ -24,7 +23,7 @@ from dbt.contracts.graph.parsed import (
UnpatchedSourceDefinition,
)
from dbt.contracts.graph.unparsed import SourcePatch
from dbt.contracts.util import Writable, Replaceable
from dbt.contracts.util import Writable, Replaceable, MacroKey, SourceKey
from dbt.exceptions import (
raise_duplicate_resource_name, raise_duplicate_patch_name,
raise_duplicate_macro_patch_name, CompilationException, InternalException,
@@ -71,6 +70,7 @@ class ParseResult(JsonSchemaMixin, Writable, Replaceable):
sources: MutableMapping[str, UnpatchedSourceDefinition] = dict_field()
docs: MutableMapping[str, ParsedDocumentation] = dict_field()
macros: MutableMapping[str, ParsedMacro] = dict_field()
exposures: MutableMapping[str, ParsedExposure] = dict_field()
macro_patches: MutableMapping[MacroKey, ParsedMacroPatch] = dict_field()
patches: MutableMapping[str, ParsedNodePatch] = dict_field()
source_patches: MutableMapping[SourceKey, SourcePatch] = dict_field()
@@ -103,6 +103,11 @@ class ParseResult(JsonSchemaMixin, Writable, Replaceable):
self.add_node_nofile(node)
self.get_file(source_file).nodes.append(node.unique_id)
def add_exposure(self, source_file: SourceFile, exposure: ParsedExposure):
_check_duplicates(exposure, self.exposures)
self.exposures[exposure.unique_id] = exposure
self.get_file(source_file).exposures.append(exposure.unique_id)
def add_disabled_nofile(self, node: CompileResultNode):
if node.unique_id in self.disabled:
self.disabled[node.unique_id].append(node)
@@ -264,6 +269,12 @@ class ParseResult(JsonSchemaMixin, Writable, Replaceable):
continue
self._process_node(node_id, source_file, old_file, old_result)
for exposure_id in old_file.exposures:
exposure = _expect_value(
exposure_id, old_result.exposures, old_file, "exposures"
)
self.add_exposure(source_file, exposure)
patched = False
for name in old_file.patches:
patch = _expect_value(

View File

@@ -9,7 +9,11 @@ from typing import (
from dbt.clients.jinja import get_rendered, SCHEMA_TEST_KWARGS_NAME
from dbt.contracts.graph.parsed import UnpatchedSourceDefinition
from dbt.contracts.graph.unparsed import (
UnparsedNodeUpdate, UnparsedMacroUpdate, UnparsedAnalysisUpdate, TestDef,
TestDef,
UnparsedAnalysisUpdate,
UnparsedMacroUpdate,
UnparsedNodeUpdate,
UnparsedExposure,
)
from dbt.exceptions import raise_compiler_error
from dbt.parser.search import FileBlock
@@ -78,6 +82,7 @@ Target = TypeVar(
UnparsedMacroUpdate,
UnparsedAnalysisUpdate,
UnpatchedSourceDefinition,
UnparsedExposure,
)

View File

@@ -13,11 +13,15 @@ from dbt.clients.jinja import get_rendered, add_rendered_test_kwargs
from dbt.clients.yaml_helper import load_yaml_text
from dbt.config.renderer import SchemaYamlRenderer
from dbt.context.context_config import (
ContextConfigType,
BaseContextConfigGenerator,
ContextConfig,
ContextConfigGenerator,
UnrenderedConfigGenerator,
)
from dbt.context.configured import generate_schema_yml
from dbt.context.target import generate_target_context
from dbt.context.providers import generate_parse_exposure
from dbt.contracts.files import FileHash
from dbt.contracts.graph.manifest import SourceFile
from dbt.contracts.graph.model_config import SourceConfig
from dbt.contracts.graph.parsed import (
@@ -27,11 +31,20 @@ from dbt.contracts.graph.parsed import (
ParsedSchemaTestNode,
ParsedMacroPatch,
UnpatchedSourceDefinition,
ParsedExposure,
)
from dbt.contracts.graph.unparsed import (
UnparsedSourceDefinition, UnparsedNodeUpdate, UnparsedColumn,
UnparsedMacroUpdate, UnparsedAnalysisUpdate, SourcePatch,
HasDocs, HasColumnDocs, HasColumnTests, FreshnessThreshold,
FreshnessThreshold,
HasColumnDocs,
HasColumnTests,
HasDocs,
SourcePatch,
UnparsedAnalysisUpdate,
UnparsedColumn,
UnparsedMacroUpdate,
UnparsedNodeUpdate,
UnparsedExposure,
UnparsedSourceDefinition,
)
from dbt.exceptions import (
validator_error_message, JSONValidationException,
@@ -82,6 +95,7 @@ def error_context(
class ParserRef:
"""A helper object to hold parse-time references."""
def __init__(self):
self.column_info: Dict[str, ColumnInfo] = {}
@@ -94,12 +108,18 @@ class ParserRef:
):
tags: List[str] = []
tags.extend(getattr(column, 'tags', ()))
quote: Optional[bool]
if isinstance(column, UnparsedColumn):
quote = column.quote
else:
quote = None
self.column_info[column.name] = ColumnInfo(
name=column.name,
description=description,
data_type=data_type,
meta=meta,
tags=tags,
quote=quote,
_extra=column.extra
)
@@ -152,7 +172,6 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
)
self.raw_renderer = SchemaYamlRenderer(ctx)
self.config_generator = ContextConfigGenerator(self.root_project)
@classmethod
def get_compiled_path(cls, block: FileBlock) -> str:
@@ -229,6 +248,28 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
for test in column.tests:
self.parse_test(block, test, column)
def _generate_source_config(self, fqn: List[str], rendered: bool):
generator: BaseContextConfigGenerator
if rendered:
generator = ContextConfigGenerator(self.root_project)
else:
generator = UnrenderedConfigGenerator(
self.root_project
)
return generator.calculate_node_config(
config_calls=[],
fqn=fqn,
resource_type=NodeType.Source,
project_name=self.project.project_name,
base=False,
)
def _get_relation_name(self, node: ParsedSourceDefinition):
adapter = get_adapter(self.root_project)
relation_cls = adapter.Relation
return str(relation_cls.create_from(self.root_project, node))
def parse_source(
self, target: UnpatchedSourceDefinition
) -> ParsedSourceDefinition:
@@ -249,13 +290,16 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
# make sure we don't do duplicate tags from source + table
tags = sorted(set(itertools.chain(source.tags, table.tags)))
config = self.config_generator.calculate_node_config(
config_calls=[],
config = self._generate_source_config(
fqn=target.fqn,
resource_type=NodeType.Source,
project_name=self.project.project_name,
base=False,
rendered=True,
)
unrendered_config = self._generate_source_config(
fqn=target.fqn,
rendered=False,
)
if not isinstance(config, SourceConfig):
raise InternalException(
f'Calculated a {type(config)} for a source, but expected '
@@ -264,7 +308,7 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
default_database = self.root_project.credentials.database
return ParsedSourceDefinition(
parsed_source = ParsedSourceDefinition(
package_name=target.package_name,
database=(source.database or default_database),
schema=(source.schema or source.name),
@@ -289,13 +333,19 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
fqn=target.fqn,
tags=tags,
config=config,
unrendered_config=unrendered_config,
)
# relation name is added after instantiation because the adapter does
# not provide the relation name for an UnpatchedSourceDefinition object
parsed_source.relation_name = self._get_relation_name(parsed_source)
return parsed_source
def create_test_node(
self,
target: Union[UnpatchedSourceDefinition, UnparsedNodeUpdate],
path: str,
config: ContextConfigType,
config: ContextConfig,
tags: List[str],
fqn: List[str],
name: str,
@@ -321,6 +371,7 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
'config': self.config_dict(config),
'test_metadata': test_metadata,
'column_name': column_name,
'checksum': FileHash.empty().to_dict(),
}
try:
return self.parse_from_dict(dct)
@@ -450,9 +501,9 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
return node
def render_with_context(
self, node: ParsedSchemaTestNode, config: ContextConfigType,
self, node: ParsedSchemaTestNode, config: ContextConfig,
) -> None:
"""Given the parsed node and a ContextConfigType to use during
"""Given the parsed node and a ContextConfig to use during
parsing, collect all the refs that might be squirreled away in the test
arguments. This includes the implicit "model" argument.
"""
@@ -503,6 +554,11 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
for test in block.tests:
self.parse_test(block, test, None)
def parse_exposures(self, block: YamlBlock) -> None:
parser = ExposureParser(self, block)
for node in parser.parse():
self.results.add_exposure(block.file, node)
def parse_file(self, block: FileBlock) -> None:
dct = self._yaml_from_file(block.file)
# mark the file as seen, even if there are no macros in it
@@ -529,10 +585,15 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
parser = MacroPatchParser(self, yaml_block, plural)
elif key == NodeType.Analysis:
parser = AnalysisPatchParser(self, yaml_block, plural)
elif key == NodeType.Exposure:
# handle exposures separately, but they are
# technically still "documentable"
continue
else:
parser = TestablePatchParser(self, yaml_block, plural)
for test_block in parser.parse():
self.parse_tests(test_block)
self.parse_exposures(yaml_block)
Parsed = TypeVar(
@@ -549,7 +610,7 @@ NonSourceTarget = TypeVar(
)
class YamlDocsReader(metaclass=ABCMeta):
class YamlReader(metaclass=ABCMeta):
def __init__(
self, schema_parser: SchemaParser, yaml: YamlBlock, key: str
) -> None:
@@ -591,6 +652,8 @@ class YamlDocsReader(metaclass=ABCMeta):
)
raise CompilationException(msg)
class YamlDocsReader(YamlReader):
@abstractmethod
def parse(self) -> List[TestBlock]:
raise NotImplementedError('parse is abstract')
@@ -755,3 +818,57 @@ class MacroPatchParser(NonSourceParser[UnparsedMacroUpdate, ParsedMacroPatch]):
docs=block.target.docs,
)
self.results.add_macro_patch(self.yaml.file, result)
class ExposureParser(YamlReader):
def __init__(self, schema_parser: SchemaParser, yaml: YamlBlock):
super().__init__(schema_parser, yaml, NodeType.Exposure.pluralize())
self.schema_parser = schema_parser
self.yaml = yaml
def parse_exposure(self, unparsed: UnparsedExposure) -> ParsedExposure:
package_name = self.project.project_name
unique_id = f'{NodeType.Exposure}.{package_name}.{unparsed.name}'
path = self.yaml.path.relative_path
fqn = self.schema_parser.get_fqn_prefix(path)
fqn.append(unparsed.name)
parsed = ParsedExposure(
package_name=package_name,
root_path=self.project.project_root,
path=path,
original_file_path=self.yaml.path.original_file_path,
unique_id=unique_id,
fqn=fqn,
name=unparsed.name,
type=unparsed.type,
url=unparsed.url,
description=unparsed.description,
owner=unparsed.owner,
maturity=unparsed.maturity,
)
ctx = generate_parse_exposure(
parsed,
self.root_project,
self.schema_parser.macro_manifest,
package_name,
)
depends_on_jinja = '\n'.join(
'{{ ' + line + '}}' for line in unparsed.depends_on
)
get_rendered(
depends_on_jinja, ctx, parsed, capture_macros=True
)
# parsed now has populated refs/sources
return parsed
def parse(self) -> Iterable[ParsedExposure]:
for data in self.get_key_dicts():
try:
unparsed = UnparsedExposure.from_dict(data)
except (ValidationError, JSONValidationException) as exc:
msg = error_context(self.yaml.path, self.key, data, exc)
raise CompilationException(msg) from exc
parsed = self.parse_exposure(unparsed)
yield parsed
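Seen in isolation, the depends_on handling above is easy to miss: each entry in an exposure's depends_on list is a plain Jinja expression such as ref('my_model'), and wrapping it in {{ ... }} before rendering with capture_macros=True is what lets the usual ref/source capture machinery populate parsed.refs and parsed.sources. A standalone sketch of just the wrapping step, with hypothetical entries:

# Hypothetical depends_on entries; real values come from the exposure YAML.
depends_on = ["ref('orders')", "source('raw', 'payments')"]
# Same transformation as parse_exposure above: one Jinja expression per line.
depends_on_jinja = '\n'.join('{{ ' + line + '}}' for line in depends_on)
print(depends_on_jinja)
# {{ ref('orders')}}
# {{ source('raw', 'payments')}}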

View File

@@ -7,7 +7,7 @@ from typing import (
from dbt.clients.jinja import extract_toplevel_blocks, BlockTag
from dbt.clients.system import find_matching
from dbt.config import Project
from dbt.contracts.graph.manifest import SourceFile, FilePath
from dbt.contracts.files import SourceFile, FilePath
from dbt.exceptions import CompilationException, InternalException

View File

@@ -1,5 +1,5 @@
from dbt.context.context_config import ContextConfigType
from dbt.contracts.graph.manifest import SourceFile, FilePath
from dbt.context.context_config import ContextConfig
from dbt.contracts.files import SourceFile, FilePath
from dbt.contracts.graph.parsed import ParsedSeedNode
from dbt.node_types import NodeType
from dbt.parser.base import SimpleSQLParser
@@ -24,9 +24,16 @@ class SeedParser(SimpleSQLParser[ParsedSeedNode]):
return block.path.relative_path
def render_with_context(
self, parsed_node: ParsedSeedNode, config: ContextConfigType
self, parsed_node: ParsedSeedNode, config: ContextConfig
) -> None:
"""Seeds don't need to do any rendering."""
def load_file(self, match: FilePath) -> SourceFile:
return SourceFile.seed(match)
def load_file(
self, match: FilePath, *, set_contents: bool = False
) -> SourceFile:
if match.seed_too_large():
# We don't want to calculate a hash of this file. Use the path.
return SourceFile.big_seed(match)
else:
# We want to calculate a hash, but we don't need the contents
return super().load_file(match, set_contents=set_contents)

View File

@@ -7,7 +7,11 @@ from dbt.contracts.graph.manifest import Manifest
from dbt.config import RuntimeConfig
def get_full_manifest(config: RuntimeConfig) -> Manifest:
def get_full_manifest(
config: RuntimeConfig,
*,
reset: bool = False,
) -> Manifest:
"""Load the full manifest, using the adapter's internal manifest if it
exists to skip parsing internal (dbt + plugins) macros a second time.
@@ -15,9 +19,14 @@ def get_full_manifest(config: RuntimeConfig) -> Manifest:
attached to the adapter for any methods that need it.
"""
adapter = get_adapter(config) # type: ignore
internal: Manifest = adapter.load_internal_manifest()
if reset:
config.clear_dependencies()
adapter.clear_macro_manifest()
def set_header(manifest: Manifest) -> None:
adapter.connections.set_query_header(manifest)
internal: Manifest = adapter.load_macro_manifest()
return load_manifest(config, internal, set_header)
return load_manifest(
config,
internal,
adapter.connections.set_query_header,
)
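The new reset flag is aimed at long-lived processes that reparse the same project: clearing cached dependencies and the adapter's macro manifest before loading keeps stale state from one parse out of the next. A minimal usage sketch; the import path for get_full_manifest is assumed here, since the compare view does not show file names:

# Assumed module path for get_full_manifest; `config` is a RuntimeConfig
# built elsewhere (e.g. by dbt's normal argument handling).
from dbt.perf_utils import get_full_manifest

def reparse(config):
    # reset=True clears cached dependencies and the cached macro manifest
    # so a long-running server never reuses stale state between parses.
    return get_full_manifest(config, reset=True)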

View File

@@ -18,10 +18,11 @@ from dbt.contracts.rpc import (
TaskRow,
PSResult,
RemoteExecutionResult,
RemoteFreshnessResult,
RemoteRunResult,
RemoteCompileResult,
RemoteCatalogResults,
RemoteEmptyResult,
RemoteDepsResult,
RemoteRunOperationResult,
PollParameters,
PollResult,
@@ -32,6 +33,7 @@ from dbt.contracts.rpc import (
PollRunCompleteResult,
PollCompileCompleteResult,
PollCatalogCompleteResult,
PollFreshnessResult,
PollRemoteEmptyCompleteResult,
PollRunOperationCompleteResult,
TaskHandlerState,
@@ -146,7 +148,8 @@ def poll_complete(
PollCatalogCompleteResult,
PollRemoteEmptyCompleteResult,
PollRunOperationCompleteResult,
PollGetManifestResult
PollGetManifestResult,
PollFreshnessResult,
]]
if isinstance(result, RemoteExecutionResult):
@@ -158,12 +161,14 @@ def poll_complete(
cls = PollCompileCompleteResult
elif isinstance(result, RemoteCatalogResults):
cls = PollCatalogCompleteResult
elif isinstance(result, RemoteEmptyResult):
elif isinstance(result, RemoteDepsResult):
cls = PollRemoteEmptyCompleteResult
elif isinstance(result, RemoteRunOperationResult):
cls = PollRunOperationCompleteResult
elif isinstance(result, GetManifestResult):
cls = PollGetManifestResult
elif isinstance(result, RemoteFreshnessResult):
cls = PollFreshnessResult
else:
raise dbt.exceptions.InternalException(
'got invalid result in poll_complete: {}'.format(result)

View File

@@ -1,17 +1,20 @@
from abc import abstractmethod
from datetime import datetime
from typing import Generic, TypeVar
import dbt.exceptions
from dbt.compilation import compile_node
from dbt.contracts.rpc import (
RemoteCompileResult, RemoteRunResult, ResultTable,
RemoteCompileResult,
RemoteCompileResultMixin,
RemoteRunResult,
ResultTable,
)
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.task.compile import CompileRunner
from dbt.rpc.error import dbt_error, RPCException, server_error
RPCSQLResult = TypeVar('RPCSQLResult', bound=RemoteCompileResult)
RPCSQLResult = TypeVar('RPCSQLResult', bound=RemoteCompileResultMixin)
class GenericRPCRunner(CompileRunner, Generic[RPCSQLResult]):
@@ -38,8 +41,8 @@ class GenericRPCRunner(CompileRunner, Generic[RPCSQLResult]):
pass
def compile(self, manifest):
return compile_node(self.adapter, self.config, self.node, manifest, {},
write=False)
compiler = self.adapter.get_compiler()
return compiler.compile_node(self.node, manifest, {}, write=False)
@abstractmethod
def execute(self, compiled_node, manifest) -> RPCSQLResult:
@@ -62,10 +65,11 @@ class RPCCompileRunner(GenericRPCRunner[RemoteCompileResult]):
def execute(self, compiled_node, manifest) -> RemoteCompileResult:
return RemoteCompileResult(
raw_sql=compiled_node.raw_sql,
compiled_sql=compiled_node.injected_sql,
compiled_sql=compiled_node.compiled_sql,
node=compiled_node,
timing=[], # this will get added later
logs=[],
generated_at=datetime.utcnow(),
)
def from_run_result(
@@ -77,13 +81,14 @@ class RPCCompileRunner(GenericRPCRunner[RemoteCompileResult]):
node=result.node,
timing=timing_info,
logs=[],
generated_at=datetime.utcnow(),
)
class RPCExecuteRunner(GenericRPCRunner[RemoteRunResult]):
def execute(self, compiled_node, manifest) -> RemoteRunResult:
_, execute_result = self.adapter.execute(
compiled_node.injected_sql, fetch=True
compiled_node.compiled_sql, fetch=True
)
table = ResultTable(
@@ -93,11 +98,12 @@ class RPCExecuteRunner(GenericRPCRunner[RemoteRunResult]):
return RemoteRunResult(
raw_sql=compiled_node.raw_sql,
compiled_sql=compiled_node.injected_sql,
compiled_sql=compiled_node.compiled_sql,
node=compiled_node,
table=table,
timing=[],
logs=[],
generated_at=datetime.utcnow(),
)
def from_run_result(
@@ -110,4 +116,5 @@ class RPCExecuteRunner(GenericRPCRunner[RemoteRunResult]):
table=result.table,
timing=timing_info,
logs=[],
generated_at=datetime.utcnow(),
)

View File

@@ -187,6 +187,7 @@ def get_results_context(
class StateHandler:
"""A helper context manager to manage task handler state."""
def __init__(self, task_handler: 'RequestTaskHandler') -> None:
self.handler = task_handler
@@ -248,6 +249,7 @@ class SetArgsStateHandler(StateHandler):
"""A state handler that does not touch state on success and does not
execute the teardown
"""
def handle_completed(self):
pass
@@ -257,6 +259,7 @@ class SetArgsStateHandler(StateHandler):
class RequestTaskHandler(threading.Thread, TaskHandlerProtocol):
"""Handler for the single task triggered by a given jsonrpc request."""
def __init__(
self,
manager: TaskManagerProtocol,
@@ -400,6 +403,7 @@ class RequestTaskHandler(threading.Thread, TaskHandlerProtocol):
try:
with StateHandler(self):
self.result = self.get_result()
except (dbt.exceptions.Exception, RPCException):
# we probably got an error after the RPC call ran (and it was
# probably deps...). By now anyone who wanted to see it has seen it

View File

@@ -8,6 +8,7 @@ from typing import (
import dbt.exceptions
import dbt.flags as flags
from dbt.adapters.factory import reset_adapters, register_adapter
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.rpc import (
LastParse,
@@ -126,6 +127,8 @@ class TaskManager:
def reload_config(self):
config = self.config.from_args(self.args)
self.config = config
reset_adapters()
register_adapter(config)
return config
def add_request(self, request_handler: TaskHandlerProtocol):
@@ -184,7 +187,7 @@ class TaskManager:
return True
def parse_manifest(self) -> None:
self.manifest = get_full_manifest(self.config)
self.manifest = get_full_manifest(self.config, reset=True)
def set_compile_exception(self, exc, logs=List[LogMessage]) -> None:
assert self.last_parse.state == ManifestStatus.Compiling, \
@@ -227,6 +230,7 @@ class TaskManager:
return None
task = self.rpc_task(method)
return task
def task_table(self) -> List[TaskRow]:

View File

@@ -9,7 +9,7 @@ from dbt import tracking
from dbt import ui
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
RunModelResult, collect_timing_info
NodeStatus, RunResult, collect_timing_info, RunStatus
)
from dbt.exceptions import (
NotImplementedException, CompilationException, RuntimeException,
@@ -165,6 +165,7 @@ class ExecutionContext:
"""During execution and error handling, dbt makes use of mutable state:
timing information and the newest (compiled vs executed) form of the node.
"""
def __init__(self, node):
self.timing = []
self.node = node
@@ -179,20 +180,20 @@ class BaseRunner(metaclass=ABCMeta):
self.num_nodes = num_nodes
self.skip = False
self.skip_cause: Optional[RunModelResult] = None
self.skip_cause: Optional[RunResult] = None
@abstractmethod
def compile(self, manifest: Manifest) -> Any:
pass
def get_result_status(self, result) -> Dict[str, str]:
if result.error:
return {'node_status': 'error', 'node_error': str(result.error)}
elif result.skip:
if result.status == NodeStatus.Error:
return {'node_status': 'error', 'node_error': str(result.message)}
elif result.status == NodeStatus.Skipped:
return {'node_status': 'skipped'}
elif result.fail:
elif result.status == NodeStatus.Fail:
return {'node_status': 'failed'}
elif result.warn:
elif result.status == NodeStatus.Warn:
return {'node_status': 'warn'}
else:
return {'node_status': 'passed'}
@@ -212,52 +213,62 @@ class BaseRunner(metaclass=ABCMeta):
return result
def _build_run_result(self, node, start_time, error, status, timing_info,
skip=False, fail=None, warn=None, agate_table=None):
def _build_run_result(self, node, start_time, status, timing_info, message,
agate_table=None, adapter_response=None):
execution_time = time.time() - start_time
thread_id = threading.current_thread().name
return RunModelResult(
node=node,
error=error,
skip=skip,
if adapter_response is None:
adapter_response = {}
return RunResult(
status=status,
fail=fail,
warn=warn,
execution_time=execution_time,
thread_id=thread_id,
execution_time=execution_time,
timing=timing_info,
message=message,
node=node,
agate_table=agate_table,
adapter_response=adapter_response
)
def error_result(self, node, error, start_time, timing_info):
def error_result(self, node, message, start_time, timing_info):
return self._build_run_result(
node=node,
start_time=start_time,
error=error,
status='ERROR',
timing_info=timing_info
status=RunStatus.Error,
timing_info=timing_info,
message=message,
)
def ephemeral_result(self, node, start_time, timing_info):
return self._build_run_result(
node=node,
start_time=start_time,
error=None,
status=None,
timing_info=timing_info
status=RunStatus.Success,
timing_info=timing_info,
message=None
)
def from_run_result(self, result, start_time, timing_info):
return self._build_run_result(
node=result.node,
start_time=start_time,
error=result.error,
skip=result.skip,
status=result.status,
fail=result.fail,
warn=result.warn,
timing_info=timing_info,
message=result.message,
agate_table=result.agate_table,
adapter_response=result.adapter_response
)
def skip_result(self, node, message):
thread_id = threading.current_thread().name
return RunResult(
status=RunStatus.Skipped,
thread_id=thread_id,
execution_time=0,
timing=[],
message=message,
node=node,
adapter_response={}
)
def compile_and_execute(self, manifest, ctx):
@@ -340,7 +351,7 @@ class BaseRunner(metaclass=ABCMeta):
# an error
if (
exc_str is not None and result is not None and
result.error is None and error is None
result.status != NodeStatus.Error and error is None
):
error = exc_str
@@ -389,7 +400,7 @@ class BaseRunner(metaclass=ABCMeta):
schema_name = self.node.schema
node_name = self.node.name
error = None
error_message = None
if not self.node.is_ephemeral_model:
# if this model was skipped due to an upstream ephemeral model
# failure, print a special 'error skip' message.
@@ -408,7 +419,7 @@ class BaseRunner(metaclass=ABCMeta):
'an ephemeral failure'
)
# set an error so dbt will exit with an error code
error = (
error_message = (
'Compilation Error in {}, caused by compilation error '
'in referenced ephemeral model {}'
.format(self.node.unique_id,
@@ -423,7 +434,7 @@ class BaseRunner(metaclass=ABCMeta):
self.num_nodes
)
node_result = RunModelResult(self.node, skip=True, error=error)
node_result = self.skip_result(self.node, error_message)
return node_result
def do_skip(self, cause=None):

View File

@@ -2,7 +2,7 @@ import os.path
import os
import shutil
from dbt.task.base import BaseTask
from dbt.task.base import BaseTask, move_to_nearest_project_dir
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.config import UnsetProfileConfig
@@ -32,6 +32,7 @@ class CleanTask(BaseTask):
This function takes all the paths in the target file
and cleans the project paths that are not protected.
"""
move_to_nearest_project_dir(self.args)
for path in self.config.clean_targets:
logger.info("Checking {}/*".format(path))
if not self.__is_protected_path(path):

View File

@@ -1,8 +1,8 @@
import threading
from .runnable import GraphRunnableTask
from .base import BaseRunner
from dbt.compilation import compile_node
from dbt.contracts.results import RunModelResult
from dbt.contracts.results import RunStatus, RunResult
from dbt.exceptions import InternalException
from dbt.graph import ResourceTypeSelector, SelectionSpec, parse_difference
from dbt.logger import print_timestamped_line
@@ -17,10 +17,19 @@ class CompileRunner(BaseRunner):
pass
def execute(self, compiled_node, manifest):
return RunModelResult(compiled_node)
return RunResult(
node=compiled_node,
status=RunStatus.Success,
timing=[],
thread_id=threading.current_thread().name,
execution_time=0,
message=None,
adapter_response={}
)
def compile(self, manifest):
return compile_node(self.adapter, self.config, self.node, manifest, {})
compiler = self.adapter.get_compiler()
return compiler.compile_node(self.node, manifest, {})
class CompileTask(GraphRunnableTask):
@@ -42,6 +51,7 @@ class CompileTask(GraphRunnableTask):
return ResourceTypeSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
resource_types=NodeType.executable(),
)

View File

@@ -48,7 +48,6 @@ Check your database credentials and try again. For more information, visit:
{url}
'''.lstrip()
MISSING_PROFILE_MESSAGE = '''
dbt looked for a profiles.yml file in {path}, but did
not find one. For more information on configuring your profile, consult the
@@ -90,6 +89,7 @@ class DebugTask(BaseTask):
self.profile_name: Optional[str] = None
self.project: Optional[Project] = None
self.project_fail_details = ''
self.any_failure = False
self.messages: List[str] = []
@property
@@ -111,7 +111,7 @@ class DebugTask(BaseTask):
def run(self):
if self.args.config_dir:
self.path_info()
return
return not self.any_failure
version = get_installed_version().to_version_string(skip_matcher=True)
print('dbt version: {}'.format(version))
@@ -129,6 +129,11 @@ class DebugTask(BaseTask):
print(message)
print('')
return not self.any_failure
def interpret_results(self, results):
return results
def _load_project(self):
if not os.path.exists(self.project_path):
self.project_fail_details = FILE_NOT_FOUND
@@ -143,7 +148,9 @@ class DebugTask(BaseTask):
try:
self.project = Project.from_project_root(
self.project_dir, renderer
self.project_dir,
renderer,
verify_version=getattr(self.args, 'version_check', False),
)
except dbt.exceptions.DbtConfigError as exc:
self.project_fail_details = str(exc)
@@ -181,7 +188,8 @@ class DebugTask(BaseTask):
if os.path.exists(self.project_path):
try:
partial = Project.partial_load(
os.path.dirname(self.project_path)
os.path.dirname(self.project_path),
verify_version=getattr(self.args, 'version_check', False),
)
renderer = DbtProjectYamlRenderer(
generate_base_context(self.cli_vars)
@@ -242,6 +250,7 @@ class DebugTask(BaseTask):
self.messages.append(MISSING_PROFILE_MESSAGE.format(
path=self.profile_path, url=ProfileConfigDocs
))
self.any_failure = True
return red('ERROR not found')
try:
@@ -280,6 +289,7 @@ class DebugTask(BaseTask):
dbt.clients.system.run_cmd(os.getcwd(), ['git', '--help'])
except dbt.exceptions.ExecutableError as exc:
self.messages.append('Error from git --help: {!s}'.format(exc))
self.any_failure = True
return red('ERROR')
return green('OK found')
@@ -307,6 +317,8 @@ class DebugTask(BaseTask):
def _log_project_fail(self):
if not self.project_fail_details:
return
self.any_failure = True
if self.project_fail_details == FILE_NOT_FOUND:
return
print('Project loading failed for the following reason:')
@@ -316,6 +328,8 @@ class DebugTask(BaseTask):
def _log_profile_fail(self):
if not self.profile_fail_details:
return
self.any_failure = True
if self.profile_fail_details == FILE_NOT_FOUND:
return
print('Profile loading failed for the following reason:')
@@ -331,7 +345,7 @@ class DebugTask(BaseTask):
adapter = get_adapter(profile)
try:
with adapter.connection_named('debug'):
adapter.execute('select 1 as id')
adapter.debug_query()
except Exception as exc:
return COULD_NOT_CONNECT_MESSAGE.format(
err=str(exc),
@@ -344,6 +358,7 @@ class DebugTask(BaseTask):
result = self.attempt_connection(self.profile)
if result is not None:
self.messages.append(result)
self.any_failure = True
return red('ERROR')
return green('OK connection ok')
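With the any_failure bookkeeping above, run() now returns a boolean instead of nothing, so callers can turn a failed check into a non-zero exit code. A toy illustration of that mapping (the actual wiring in dbt's entry point is not part of this diff):

def exit_code_for(success: bool) -> int:
    # conventional mapping: truthy task result -> 0, failure -> 1
    return 0 if success else 1

assert exit_code_for(True) == 0   # every debug check passed
assert exit_code_for(False) == 1  # at least one check failed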

View File

@@ -1,7 +1,6 @@
import os
import threading
import time
from typing import Dict
from .base import BaseRunner
from .printer import (
@@ -12,16 +11,14 @@ from .printer import (
from .runnable import GraphRunnableTask
from dbt.contracts.results import (
FreshnessExecutionResult,
SourceFreshnessResult,
PartialResult,
FreshnessExecutionResultArtifact,
FreshnessResult, PartialSourceFreshnessResult,
SourceFreshnessResult, FreshnessStatus
)
from dbt.exceptions import RuntimeException, InternalException
from dbt.logger import print_timestamped_line
from dbt.node_types import NodeType
from dbt import utils
from dbt.graph import NodeSelector, SelectionSpec, parse_difference
from dbt.contracts.graph.parsed import ParsedSourceDefinition
@@ -35,12 +32,6 @@ class FreshnessRunner(BaseRunner):
'Freshness: nodes cannot be skipped!'
)
def get_result_status(self, result) -> Dict[str, str]:
if result.error:
return {'node_status': 'error', 'node_error': str(result.error)}
else:
return {'node_status': str(result.status)}
def before_execute(self):
description = 'freshness of {0.source_name}.{0.name}'.format(self.node)
print_start_line(description, self.node_index, self.num_nodes)
@@ -48,18 +39,33 @@ class FreshnessRunner(BaseRunner):
def after_execute(self, result):
print_freshness_result_line(result, self.node_index, self.num_nodes)
def _build_run_result(self, node, start_time, error, status, timing_info,
skip=False, failed=None):
def error_result(self, node, message, start_time, timing_info):
return self._build_run_result(
node=node,
start_time=start_time,
status=FreshnessStatus.RuntimeErr,
timing_info=timing_info,
message=message,
)
def _build_run_result(
self,
node,
start_time,
status,
timing_info,
message
):
execution_time = time.time() - start_time
thread_id = threading.current_thread().name
status = utils.lowercase(status)
return PartialResult(
node=node,
return PartialSourceFreshnessResult(
status=status,
error=error,
execution_time=execution_time,
thread_id=thread_id,
execution_time=execution_time,
timing=timing_info,
message=message,
node=node,
adapter_response={}
)
def from_run_result(self, result, start_time, timing_info):
@@ -94,6 +100,10 @@ class FreshnessRunner(BaseRunner):
node=compiled_node,
status=status,
thread_id=threading.current_thread().name,
timing=[],
execution_time=0,
message=None,
adapter_response={},
**freshness
)
@@ -140,13 +150,18 @@ class FreshnessTask(GraphRunnableTask):
return FreshnessSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
)
def get_runner_type(self):
return FreshnessRunner
def write_result(self, result):
artifact = FreshnessExecutionResultArtifact.from_result(result)
artifact.write(self.result_path())
def get_result(self, results, elapsed_time, generated_at):
return FreshnessExecutionResult(
return FreshnessResult.from_node_results(
elapsed_time=elapsed_time,
generated_at=generated_at,
results=results
@@ -154,7 +169,10 @@ class FreshnessTask(GraphRunnableTask):
def task_end_messages(self, results):
for result in results:
if result.error is not None:
if result.status in (
FreshnessStatus.Error,
FreshnessStatus.RuntimeErr
):
print_run_result_error(result)
print_timestamped_line('Done.')

View File

@@ -11,8 +11,8 @@ from dbt.adapters.factory import get_adapter
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
TableMetadata, CatalogTable, CatalogResults, Primitive, CatalogKey,
StatsItem, StatsDict, ColumnMetadata
NodeStatus, TableMetadata, CatalogTable, CatalogResults, Primitive,
CatalogKey, StatsItem, StatsDict, ColumnMetadata, CatalogArtifact
)
from dbt.exceptions import InternalException
from dbt.include.global_project import DOCS_INDEX_FILE_PATH
@@ -207,20 +207,20 @@ class GenerateTask(CompileTask):
)
return self.manifest
def run(self) -> CatalogResults:
def run(self) -> CatalogArtifact:
compile_results = None
if self.args.compile:
compile_results = CompileTask.run(self)
if any(r.error is not None for r in compile_results):
if any(r.status == NodeStatus.Error for r in compile_results):
print_timestamped_line(
'compile failed, cannot generate docs'
)
return CatalogResults(
return CatalogArtifact.from_results(
nodes={},
sources={},
generated_at=datetime.utcnow(),
errors=None,
_compile_results=compile_results
compile_results=compile_results
)
else:
self.manifest = get_full_manifest(self.config)
@@ -294,12 +294,12 @@ class GenerateTask(CompileTask):
generated_at: datetime,
compile_results: Optional[Any],
errors: Optional[List[str]]
) -> CatalogResults:
return CatalogResults(
) -> CatalogArtifact:
return CatalogArtifact.from_results(
generated_at=generated_at,
nodes=nodes,
sources=sources,
generated_at=generated_at,
_compile_results=compile_results,
compile_results=compile_results,
errors=errors,
)

View File

@@ -1,8 +1,11 @@
import os
import shutil
import dbt.config
import dbt.clients.git
import dbt.clients.system
from dbt.adapters.factory import load_plugin, get_include_paths
from dbt.exceptions import RuntimeException
from dbt.logger import GLOBAL_LOGGER as logger
@@ -11,12 +14,12 @@ from dbt.task.base import BaseTask
STARTER_REPO = 'https://github.com/fishtown-analytics/dbt-starter-project.git'
STARTER_BRANCH = 'dbt-yml-config-version-2'
DOCS_URL = 'https://docs.getdbt.com/docs/configure-your-profile'
SAMPLE_PROFILES_YML_FILE = 'https://docs.getdbt.com/docs/profile' # noqa
ON_COMPLETE_MESSAGE = """
Your new dbt project "{project_name}" was created! If this is your first time
using dbt, you'll need to set up your profiles.yml file -- this file will
tell dbt how to connect to your database. You can find this file by running:
using dbt, you'll need to set up your profiles.yml file (we've created a sample
file for you to connect to {sample_adapter}) -- this file will tell dbt how
to connect to your database. You can find this file by running:
{open_cmd} {profiles_path}
@@ -32,34 +35,6 @@ There's a link to our Slack group in the GitHub Readme. Happy modeling!
"""
STARTER_PROFILE = """
# For more information on how to configure this file, please see:
# {profiles_sample}
default:
outputs:
dev:
type: redshift
threads: 1
host: 127.0.0.1
port: 5439
user: alice
pass: pa55word
dbname: warehouse
schema: dbt_alice
prod:
type: redshift
threads: 1
host: 127.0.0.1
port: 5439
user: alice
pass: pa55word
dbname: warehouse
schema: analytics
target: dev
""".format(profiles_sample=SAMPLE_PROFILES_YML_FILE)
class InitTask(BaseTask):
def clone_starter_repo(self, project_name):
dbt.clients.git.clone(
@@ -76,34 +51,48 @@ class InitTask(BaseTask):
return True
return False
def create_profiles_file(self, profiles_file):
def create_profiles_file(self, profiles_file, sample_adapter):
# Line below raises an exception if the specified adapter is not found
load_plugin(sample_adapter)
adapter_path = get_include_paths(sample_adapter)[0]
sample_profiles_path = adapter_path / 'sample_profiles.yml'
if not sample_profiles_path.exists():
raise RuntimeException(f'No sample profile for {sample_adapter}')
if not os.path.exists(profiles_file):
dbt.clients.system.make_file(profiles_file, STARTER_PROFILE)
shutil.copyfile(sample_profiles_path, profiles_file)
return True
return False
def get_addendum(self, project_name, profiles_path):
def get_addendum(self, project_name, profiles_path, sample_adapter):
open_cmd = dbt.clients.system.open_dir_cmd()
return ON_COMPLETE_MESSAGE.format(
open_cmd=open_cmd,
project_name=project_name,
sample_adapter=sample_adapter,
profiles_path=profiles_path,
docs_url=DOCS_URL
)
def run(self):
project_dir = self.args.project_name
sample_adapter = self.args.adapter
profiles_dir = dbt.config.PROFILES_DIR
profiles_file = os.path.join(profiles_dir, 'profiles.yml')
self.create_profiles_dir(profiles_dir)
self.create_profiles_file(profiles_file)
msg = "Creating dbt configuration folder at {}"
logger.info(msg.format(profiles_dir))
msg = "With sample profiles.yml for {}"
logger.info(msg.format(sample_adapter))
self.create_profiles_dir(profiles_dir)
self.create_profiles_file(profiles_file, sample_adapter)
if os.path.exists(project_dir):
raise RuntimeError("directory {} already exists!".format(
project_dir
@@ -111,5 +100,5 @@ class InitTask(BaseTask):
self.clone_starter_repo(project_dir)
addendum = self.get_addendum(project_dir, profiles_dir)
addendum = self.get_addendum(project_dir, profiles_dir, sample_adapter)
logger.info(addendum)

View File

@@ -1,6 +1,10 @@
import json
from typing import Type
from dbt.contracts.graph.parsed import (
ParsedExposure,
ParsedSourceDefinition,
)
from dbt.graph import (
parse_difference,
ResourceTypeSelector,
@@ -20,6 +24,7 @@ class ListTask(GraphRunnableTask):
NodeType.Seed,
NodeType.Test,
NodeType.Source,
NodeType.Exposure,
))
ALL_RESOURCE_VALUES = DEFAULT_RESOURCE_VALUES | frozenset((
NodeType.Analysis,
@@ -71,6 +76,8 @@ class ListTask(GraphRunnableTask):
yield self.manifest.nodes[node]
elif node in self.manifest.sources:
yield self.manifest.sources[node]
elif node in self.manifest.exposures:
yield self.manifest.exposures[node]
else:
raise RuntimeException(
f'Got an unexpected result from node selection: "{node}"'
@@ -79,18 +86,25 @@ class ListTask(GraphRunnableTask):
def generate_selectors(self):
for node in self._iterate_selected_nodes():
selector = '.'.join(node.fqn)
if node.resource_type == NodeType.Source:
yield 'source:{}'.format(selector)
assert isinstance(node, ParsedSourceDefinition)
# sources are searched for by pkg.source_name.table_name
source_selector = '.'.join([
node.package_name, node.source_name, node.name
])
yield f'source:{source_selector}'
elif node.resource_type == NodeType.Exposure:
assert isinstance(node, ParsedExposure)
# exposures are searched for by pkg.exposure_name
exposure_selector = '.'.join([node.package_name, node.name])
yield f'exposure:{exposure_selector}'
else:
yield selector
# everything else is from `fqn`
yield '.'.join(node.fqn)
def generate_names(self):
for node in self._iterate_selected_nodes():
if node.resource_type == NodeType.Source:
yield '{0.source_name}.{0.name}'.format(node)
else:
yield node.name
yield node.search_name
def generate_json(self):
for node in self._iterate_selected_nodes():
@@ -165,13 +179,16 @@ class ListTask(GraphRunnableTask):
return TestSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
)
else:
return ResourceTypeSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
resource_types=self.resource_types,
)
def interpret_results(self, results):
return bool(results)
# list command should always return 0 as exit code
return True
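The selector strings generated above differ by resource type: most nodes fall back to their dotted fqn, while sources and exposures get prefixed forms built from the package name. A small illustration with made-up names:

# Hypothetical names, only to show the three selector shapes.
package_name = 'my_pkg'
print('.'.join([package_name, 'staging', 'stg_orders']))        # my_pkg.staging.stg_orders
print('source:' + '.'.join([package_name, 'raw', 'payments']))  # source:my_pkg.raw.payments
print('exposure:' + '.'.join([package_name, 'weekly_kpis']))    # exposure:my_pkg.weekly_kpis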

core/dbt/task/parse.py (new file, 93 lines)
View File

@@ -0,0 +1,93 @@
# This task is intended to be used for diagnosis, development and
# performance analysis.
# It separates out the parsing flows for easier logging and
# debugging.
# To store cProfile performance data, execute with the '-r'
# flag and an output file: dbt -r dbt.cprof parse.
# Use a visualizer such as snakeviz to look at the output:
# snakeviz dbt.cprof
from dbt.task.base import ConfiguredTask
from dbt.adapters.factory import get_adapter
from dbt.parser.manifest import Manifest, ManifestLoader, _check_manifest
from dbt.logger import DbtProcessState, print_timestamped_line
from dbt.graph import Graph
import time
from typing import Optional
import os
MANIFEST_FILE_NAME = 'manifest.json'
PERF_INFO_FILE_NAME = 'perf_info.json'
PARSING_STATE = DbtProcessState('parsing')
class ParseTask(ConfiguredTask):
def __init__(self, args, config):
super().__init__(args, config)
self.manifest: Optional[Manifest] = None
self.graph: Optional[Graph] = None
self.loader: Optional[ManifestLoader] = None
def write_manifest(self):
path = os.path.join(self.config.target_path, MANIFEST_FILE_NAME)
self.manifest.write(path)
def write_perf_info(self):
path = os.path.join(self.config.target_path, PERF_INFO_FILE_NAME)
self.loader._perf_info.write(path)
print_timestamped_line(f"Performance info: {path}")
# This method takes code that normally exists in other files
# and pulls it in here, to simplify logging and make the
# parsing flow-of-control easier to understand and manage,
# with the downside that if changes happen in those other methods,
# similar changes might need to be made here.
# ManifestLoader.get_full_manifest
# ManifestLoader.load
# ManifestLoader.load_all
def get_full_manifest(self):
adapter = get_adapter(self.config) # type: ignore
macro_manifest: Manifest = adapter.load_macro_manifest()
print_timestamped_line("Macro manifest loaded")
root_config = self.config
macro_hook = adapter.connections.set_query_header
with PARSING_STATE:
start_load_all = time.perf_counter()
projects = root_config.load_dependencies()
print_timestamped_line("Dependencies loaded")
loader = ManifestLoader(root_config, projects, macro_hook)
print_timestamped_line("ManifestLoader created")
loader.load(macro_manifest=macro_manifest)
print_timestamped_line("Manifest loaded")
loader.write_parse_results()
print_timestamped_line("Parse results written")
manifest = loader.create_manifest()
print_timestamped_line("Manifest created")
_check_manifest(manifest, root_config)
print_timestamped_line("Manifest checked")
manifest.build_flat_graph()
print_timestamped_line("Flat graph built")
loader._perf_info.load_all_elapsed = (
time.perf_counter() - start_load_all
)
self.loader = loader
self.manifest = manifest
print_timestamped_line("Manifest loaded")
def compile_manifest(self):
adapter = get_adapter(self.config)
compiler = adapter.get_compiler()
self.graph = compiler.compile(self.manifest)
def run(self):
print_timestamped_line('Start parsing.')
self.get_full_manifest()
if self.args.compile:
print_timestamped_line('Compiling.')
self.compile_manifest()
if self.args.write_manifest:
print_timestamped_line('Writing manifest.')
self.write_manifest()
self.write_perf_info()
print_timestamped_line('Done.')
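The header comments point to snakeviz for viewing the profile written by dbt -r dbt.cprof parse; the standard-library pstats module gives a quick text report as well. A short sketch, assuming dbt.cprof already exists in the working directory:

import pstats

# Load the cProfile output and print the 20 entries with the highest
# cumulative time, with directory prefixes stripped for readability.
stats = pstats.Stats('dbt.cprof')
stats.strip_dirs().sort_stats('cumulative').print_stats(20)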

View File

@@ -1,4 +1,4 @@
from typing import Dict, Optional, Tuple
from typing import Dict, Optional, Tuple, Callable
from dbt.logger import (
GLOBAL_LOGGER as logger,
DbtStatusMessage,
@@ -11,10 +11,15 @@ from dbt.tracking import InvocationProcessor
from dbt import ui
from dbt import utils
from dbt.contracts.results import (
FreshnessStatus, NodeResult, NodeStatus, TestStatus
)
def print_fancy_output_line(
msg: str, status: str, index: Optional[int], total: Optional[int],
execution_time: Optional[float] = None, truncate: bool = False
msg: str, status: str, logger_fn: Callable, index: Optional[int],
total: Optional[int], execution_time: Optional[float] = None,
truncate: bool = False
) -> None:
if index is None or total is None:
progress = ''
@@ -39,7 +44,7 @@ def print_fancy_output_line(
output = "{justified} [{status}{status_time}]".format(
justified=justified, status=status, status_time=status_time)
logger.info(output)
logger_fn(output)
def get_counts(flat_nodes) -> str:
@@ -63,12 +68,13 @@ def get_counts(flat_nodes) -> str:
def print_start_line(description: str, index: int, total: int) -> None:
msg = "START {}".format(description)
print_fancy_output_line(msg, 'RUN', index, total)
print_fancy_output_line(msg, 'RUN', logger.info, index, total)
def print_hook_start_line(statement: str, index: int, total: int) -> None:
msg = 'START hook: {}'.format(statement)
print_fancy_output_line(msg, 'RUN', index, total, truncate=True)
print_fancy_output_line(
msg, 'RUN', logger.info, index, total, truncate=True)
def print_hook_end_line(
@@ -76,7 +82,7 @@ def print_hook_end_line(
) -> None:
msg = 'OK hook: {}'.format(statement)
# hooks don't fail into this path, so always green
print_fancy_output_line(msg, ui.green(status), index, total,
print_fancy_output_line(msg, ui.green(status), logger.info, index, total,
execution_time=execution_time, truncate=True)
@@ -84,51 +90,58 @@ def print_skip_line(
model, schema: str, relation: str, index: int, num_models: int
) -> None:
msg = 'SKIP relation {}.{}'.format(schema, relation)
print_fancy_output_line(msg, ui.yellow('SKIP'), index, num_models)
print_fancy_output_line(
msg, ui.yellow('SKIP'), logger.info, index, num_models)
def print_cancel_line(model) -> None:
msg = 'CANCEL query {}'.format(model)
print_fancy_output_line(msg, ui.red('CANCEL'), index=None, total=None)
print_fancy_output_line(
msg, ui.red('CANCEL'), logger.error, index=None, total=None)
def get_printable_result(result, success: str, error: str) -> Tuple[str, str]:
if result.error is not None:
def get_printable_result(
result, success: str, error: str) -> Tuple[str, str, Callable]:
if result.status == NodeStatus.Error:
info = 'ERROR {}'.format(error)
status = ui.red(result.status)
status = ui.red(result.status.upper())
logger_fn = logger.error
else:
info = 'OK {}'.format(success)
status = ui.green(result.status)
status = ui.green(result.message)
logger_fn = logger.info
return info, status
return info, status, logger_fn
def print_test_result_line(
result, schema_name, index: int, total: int
result: NodeResult, schema_name, index: int, total: int
) -> None:
model = result.node
if result.error is not None:
if result.status == TestStatus.Error:
info = "ERROR"
color = ui.red
elif result.status == 0:
logger_fn = logger.error
elif result.status == TestStatus.Pass:
info = 'PASS'
color = ui.green
elif result.warn:
info = 'WARN {}'.format(result.status)
logger_fn = logger.info
elif result.status == TestStatus.Warn:
info = 'WARN {}'.format(result.message)
color = ui.yellow
elif result.fail:
info = 'FAIL {}'.format(result.status)
logger_fn = logger.warning
elif result.status == TestStatus.Fail:
info = 'FAIL {}'.format(result.message)
color = ui.red
logger_fn = logger.error
else:
raise RuntimeError("unexpected status: {}".format(result.status))
print_fancy_output_line(
"{info} {name}".format(info=info, name=model.name),
color(info),
logger_fn,
index,
total,
result.execution_time)
@@ -137,11 +150,13 @@ def print_test_result_line(
def print_model_result_line(
result, description: str, index: int, total: int
) -> None:
info, status = get_printable_result(result, 'created', 'creating')
info, status, logger_fn = get_printable_result(
result, 'created', 'creating')
print_fancy_output_line(
"{info} {description}".format(info=info, description=description),
status,
logger_fn,
index,
total,
result.execution_time)
@@ -152,7 +167,8 @@ def print_snapshot_result_line(
) -> None:
model = result.node
info, status = get_printable_result(result, 'snapshotted', 'snapshotting')
info, status, logger_fn = get_printable_result(
result, 'snapshotted', 'snapshotting')
cfg = model.config.to_dict()
msg = "{info} {description}".format(
@@ -160,6 +176,7 @@ def print_snapshot_result_line(
print_fancy_output_line(
msg,
status,
logger_fn,
index,
total,
result.execution_time)
@@ -168,7 +185,7 @@ def print_snapshot_result_line(
def print_seed_result_line(result, schema_name: str, index: int, total: int):
model = result.node
info, status = get_printable_result(result, 'loaded', 'loading')
info, status, logger_fn = get_printable_result(result, 'loaded', 'loading')
print_fancy_output_line(
"{info} seed file {schema}.{relation}".format(
@@ -176,24 +193,29 @@ def print_seed_result_line(result, schema_name: str, index: int, total: int):
schema=schema_name,
relation=model.alias),
status,
logger_fn,
index,
total,
result.execution_time)
def print_freshness_result_line(result, index: int, total: int) -> None:
if result.error:
if result.status == FreshnessStatus.RuntimeErr:
info = 'ERROR'
color = ui.red
elif result.status == 'error':
logger_fn = logger.error
elif result.status == FreshnessStatus.Error:
info = 'ERROR STALE'
color = ui.red
elif result.status == 'warn':
logger_fn = logger.error
elif result.status == FreshnessStatus.Warn:
info = 'WARN'
color = ui.yellow
logger_fn = logger.warning
else:
info = 'PASS'
color = ui.green
logger_fn = logger.info
if hasattr(result, 'node'):
source_name = result.node.source_name
@@ -202,15 +224,12 @@ def print_freshness_result_line(result, index: int, total: int) -> None:
source_name = result.source_name
table_name = result.table_name
msg = "{info} freshness of {source_name}.{table_name}".format(
info=info,
source_name=source_name,
table_name=table_name
)
msg = f"{info} freshness of {source_name}.{table_name}"
print_fancy_output_line(
msg,
color(info),
logger_fn,
index,
total,
execution_time=result.execution_time
@@ -218,14 +237,16 @@ def print_freshness_result_line(result, index: int, total: int) -> None:
def interpret_run_result(result) -> str:
if result.error is not None or result.fail:
if result.status in (NodeStatus.Error, NodeStatus.Fail):
return 'error'
elif result.skipped:
elif result.status == NodeStatus.Skipped:
return 'skip'
elif result.warn:
elif result.status == NodeStatus.Warn:
return 'warn'
else:
elif result.status in (NodeStatus.Pass, NodeStatus.Success):
return 'pass'
else:
raise RuntimeError(f"unhandled result {result}")
def print_run_status_line(results) -> None:
@@ -253,7 +274,9 @@ def print_run_result_error(
with TextOnly():
logger.info("")
if result.fail or (is_warning and result.warn):
if result.status == NodeStatus.Fail or (
is_warning and result.status == NodeStatus.Warn
):
if is_warning:
color = ui.yellow
info = 'Warning'
@@ -269,12 +292,13 @@ def print_run_result_error(
result.node.original_file_path))
try:
int(result.status)
# if message is int, must be rows returned for a test
int(result.message)
except ValueError:
logger.error(" Status: {}".format(result.status))
else:
status = utils.pluralize(result.status, 'result')
logger.error(" Got {}, expected 0.".format(status))
num_rows = utils.pluralize(result.message, 'result')
logger.error(" Got {}, expected 0.".format(num_rows))
if result.node.build_path is not None:
with TextOnly():
@@ -282,9 +306,9 @@ def print_run_result_error(
logger.info(" compiled SQL at {}".format(
result.node.build_path))
else:
elif result.message is not None:
first = True
for line in result.error.split("\n"):
for line in result.message.split("\n"):
if first:
logger.error(ui.yellow(line))
first = False
@@ -297,7 +321,8 @@ def print_skip_caused_by_error(
) -> None:
msg = ('SKIP relation {}.{} due to ephemeral model error'
.format(schema, relation))
print_fancy_output_line(msg, ui.red('ERROR SKIP'), index, num_models)
print_fancy_output_line(
msg, ui.red('ERROR SKIP'), logger.error, index, num_models)
print_run_result_error(result, newline=False)
@@ -322,8 +347,21 @@ def print_end_of_run_summary(
def print_run_end_messages(results, keyboard_interrupt: bool = False) -> None:
errors = [r for r in results if r.error is not None or r.fail]
warnings = [r for r in results if r.warn]
errors, warnings = [], []
for r in results:
if r.status in (
NodeStatus.RuntimeErr,
NodeStatus.Error,
NodeStatus.Fail
):
errors.append(r)
elif r.status == NodeStatus.Skipped and r.message is not None:
# this means we skipped a node because of an issue upstream,
# so include it as an error
errors.append(r)
elif r.status == NodeStatus.Warn:
warnings.append(r)
with DbtStatusMessage(), InvocationProcessor():
print_end_of_run_summary(len(errors),
len(warnings),

View File

@@ -1,8 +1,24 @@
from dbt.contracts.rpc import RemoteExecutionResult
from dbt.contracts.results import (
RunResult,
RunOperationResult,
FreshnessResult,
)
from dbt.contracts.rpc import (
RemoteExecutionResult,
RemoteFreshnessResult,
RemoteRunOperationResult,
)
from dbt.task.runnable import GraphRunnableTask
from dbt.rpc.method import RemoteManifestMethod, Parameters
RESULT_TYPE_MAP = {
RunResult: RemoteExecutionResult,
RunOperationResult: RemoteRunOperationResult,
FreshnessResult: RemoteFreshnessResult,
}
class RPCTask(
GraphRunnableTask,
RemoteManifestMethod[Parameters, RemoteExecutionResult]
@@ -20,9 +36,7 @@ class RPCTask(
def get_result(
self, results, elapsed_time, generated_at
) -> RemoteExecutionResult:
return RemoteExecutionResult(
results=results,
elapsed_time=elapsed_time,
generated_at=generated_at,
logs=[],
)
base = super().get_result(results, elapsed_time, generated_at)
cls = RESULT_TYPE_MAP.get(type(base), RemoteExecutionResult)
rpc_result = cls.from_local_result(base, logs=[])
return rpc_result

View File

@@ -104,7 +104,9 @@ class RemoteRPCCli(RPCTask[RPCCliParameters]):
if dumped != self.args.vars:
self.real_task.args.vars = dumped
if isinstance(self.real_task, RemoteManifestMethod):
self.real_task.manifest = get_full_manifest(self.config)
self.real_task.manifest = get_full_manifest(
self.config, reset=True
)
# we parsed args from the cli, so we're set on that front
return self.real_task.handle_request()

View File

@@ -2,7 +2,7 @@ import os
import shutil
from dbt.contracts.rpc import (
RPCNoParameters, RemoteEmptyResult, RemoteMethodFlags,
RPCDepsParameters, RemoteDepsResult, RemoteMethodFlags,
)
from dbt.rpc.method import RemoteMethod
from dbt.task.deps import DepsTask
@@ -15,7 +15,7 @@ def _clean_deps(config):
class RemoteDepsTask(
RemoteMethod[RPCNoParameters, RemoteEmptyResult],
RemoteMethod[RPCDepsParameters, RemoteDepsResult],
DepsTask,
):
METHOD_NAME = 'deps'
@@ -26,10 +26,10 @@ class RemoteDepsTask(
RemoteMethodFlags.RequiresManifestReloadAfter
)
def set_args(self, params: RPCNoParameters):
def set_args(self, params: RPCDepsParameters):
pass
def handle_request(self) -> RemoteEmptyResult:
def handle_request(self) -> RemoteDepsResult:
_clean_deps(self.config)
self.run()
return RemoteEmptyResult([])
return RemoteDepsResult([])

View File

@@ -1,12 +1,15 @@
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Union
from dbt import flags
from dbt.contracts.graph.manifest import WritableManifest
from dbt.contracts.rpc import (
GetManifestParameters,
GetManifestResult,
RPCCompileParameters,
RPCDocsGenerateParameters,
RPCRunParameters,
RPCRunOperationParameters,
RPCSeedParameters,
RPCTestParameters,
@@ -54,6 +57,15 @@ class RPCCommandTask(
return self.run()
def state_path(state: Optional[str]) -> Optional[Path]:
if state is not None:
return Path(state)
elif flags.ARTIFACT_STATE_PATH is not None:
return Path(flags.ARTIFACT_STATE_PATH)
else:
return None
class RemoteCompileProjectTask(
RPCCommandTask[RPCCompileParameters], CompileTask
):
@@ -66,16 +78,28 @@ class RemoteCompileProjectTask(
if params.threads is not None:
self.args.threads = params.threads
self.args.state = state_path(params.state)
class RemoteRunProjectTask(RPCCommandTask[RPCCompileParameters], RunTask):
self.set_previous_state()
class RemoteRunProjectTask(RPCCommandTask[RPCRunParameters], RunTask):
METHOD_NAME = 'run'
def set_args(self, params: RPCCompileParameters) -> None:
def set_args(self, params: RPCRunParameters) -> None:
self.args.models = self._listify(params.models)
self.args.exclude = self._listify(params.exclude)
self.args.selector_name = params.selector
if params.threads is not None:
self.args.threads = params.threads
if params.defer is None:
self.args.defer = flags.DEFER_MODE
else:
self.args.defer = params.defer
self.args.state = state_path(params.state)
self.set_previous_state()
class RemoteSeedProjectTask(RPCCommandTask[RPCSeedParameters], SeedTask):
@@ -90,6 +114,9 @@ class RemoteSeedProjectTask(RPCCommandTask[RPCSeedParameters], SeedTask):
self.args.threads = params.threads
self.args.show = params.show
self.args.state = state_path(params.state)
self.set_previous_state()
class RemoteTestProjectTask(RPCCommandTask[RPCTestParameters], TestTask):
METHOD_NAME = 'test'
@@ -102,6 +129,13 @@ class RemoteTestProjectTask(RPCCommandTask[RPCTestParameters], TestTask):
self.args.schema = params.schema
if params.threads is not None:
self.args.threads = params.threads
if params.defer is None:
self.args.defer = flags.DEFER_MODE
else:
self.args.defer = params.defer
self.args.state = state_path(params.state)
self.set_previous_state()
class RemoteDocsGenerateProjectTask(
@@ -116,6 +150,8 @@ class RemoteDocsGenerateProjectTask(
self.args.selector_name = None
self.args.compile = params.compile
self.args.state = state_path(params.state)
def get_catalog_results(
self, nodes, sources, generated_at, compile_results, errors
) -> RemoteCatalogResults:
@@ -161,13 +197,7 @@ class RemoteRunOperationTask(
def handle_request(self) -> RemoteRunOperationResult:
base = RunOperationTask.run(self)
result = RemoteRunOperationResult(
results=base.results,
generated_at=base.generated_at,
logs=[],
success=base.success,
elapsed_time=base.elapsed_time
)
result = RemoteRunOperationResult.from_local_result(base=base, logs=[])
return result
def interpret_results(self, results):
@@ -185,6 +215,9 @@ class RemoteSnapshotTask(RPCCommandTask[RPCSnapshotParameters], SnapshotTask):
if params.threads is not None:
self.args.threads = params.threads
self.args.state = state_path(params.state)
self.set_previous_state()
class RemoteSourceFreshnessTask(
RPCCommandTask[RPCSourceFreshnessParameters],

View File

@@ -1,7 +1,8 @@
# import these so we can find them
from . import sql_commands # noqa
from . import project_commands # noqa
from . import deps # noqa
from . import deps # noqa
import multiprocessing.queues # noqa - https://bugs.python.org/issue41567
import json
import os
import signal

View File

@@ -7,7 +7,6 @@ from typing import Dict, Any
from dbt import flags
from dbt.adapters.factory import get_adapter
from dbt.clients.jinja import extract_toplevel_blocks
from dbt.compilation import compile_manifest
from dbt.config.runtime import RuntimeConfig
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import ParsedRPCNode
@@ -129,7 +128,9 @@ class RemoteRunSQLTask(RPCTask[RPCExecParameters]):
)
# don't write our new, weird manifest!
self.graph = compile_manifest(self.config, self.manifest, write=False)
adapter = get_adapter(self.config)
compiler = adapter.get_compiler()
self.graph = compiler.compile(self.manifest, write=False)
# previously, this compiled the ancestors, but they are compiled at
# runtime now.
return rpc_node
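
The same change recurs in several tasks below: instead of calling a module-level compile helper, compilation is reached through the adapter. A toy sketch of that indirection (hypothetical classes, not the real dbt adapter interface):

class Compiler:
    def compile(self, manifest, write=True):
        # build the graph from the manifest; optionally write artifacts
        return {"nodes": list(manifest), "written": write}

class Adapter:
    def get_compiler(self) -> Compiler:
        # each adapter can hand back its own compiler implementation
        return Compiler()

adapter = Adapter()
graph = adapter.get_compiler().compile(["model.my_project.a"], write=False)
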

View File

@@ -1,7 +1,9 @@
import functools
import threading
import time
from pathlib import Path
from typing import List, Dict, Any, Iterable, Set, Tuple, Optional
from typing import List, Dict, Any, Iterable, Set, Tuple, Optional, AbstractSet
from hologram import JsonSchemaMixin
from .compile import CompileRunner, CompileTask
@@ -19,13 +21,12 @@ from dbt import tracking
from dbt import utils
from dbt.adapters.base import BaseRelation
from dbt.clients.jinja import MacroGenerator
from dbt.compilation import compile_node
from dbt.context.providers import generate_runtime_model
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import WritableManifest
from dbt.contracts.graph.model_config import Hook
from dbt.contracts.graph.parsed import ParsedHookNode
from dbt.contracts.results import RunModelResult
from dbt.contracts.results import NodeStatus, RunResult, RunStatus
from dbt.exceptions import (
CompilationException,
InternalException,
@@ -107,9 +108,9 @@ def track_model_run(index, num_nodes, run_model_result):
"index": index,
"total": num_nodes,
"execution_time": run_model_result.execution_time,
"run_status": run_model_result.status,
"run_skipped": run_model_result.skip,
"run_error": None,
"run_status": str(run_model_result.status).upper(),
"run_skipped": run_model_result.status == NodeStatus.Skipped,
"run_error": run_model_result.status == NodeStatus.Error,
"model_materialization": run_model_result.node.get_materialization(),
"model_id": utils.get_hash(run_model_result.node),
"hashed_contents": utils.get_hashed_contents(
@@ -189,7 +190,18 @@ class ModelRunner(CompileRunner):
def _build_run_model_result(self, model, context):
result = context['load_result']('main')
return RunModelResult(model, status=result.status)
adapter_response = {}
if isinstance(result.response, JsonSchemaMixin):
adapter_response = result.response.to_dict()
return RunResult(
node=model,
status=RunStatus.Success,
timing=[],
thread_id=threading.current_thread().name,
execution_time=0,
message=str(result.response),
adapter_response=adapter_response
)
def _materialization_relations(
self, result: Any, model
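
The adapter_response handling above follows a duck-typing pattern: only responses that know how to serialize themselves end up in the result. A standalone sketch, with a hypothetical Serializable base standing in for hologram's JsonSchemaMixin:

class Serializable:
    def to_dict(self) -> dict:
        raise NotImplementedError

class QueryResponse(Serializable):
    def __init__(self, code: str, rows_affected: int):
        self.code, self.rows_affected = code, rows_affected

    def to_dict(self) -> dict:
        return {"code": self.code, "rows_affected": self.rows_affected}

def adapter_response_for(response) -> dict:
    # plain strings (or other opaque responses) fall back to an empty dict
    return response.to_dict() if isinstance(response, Serializable) else {}

assert adapter_response_for("OK") == {}
assert adapter_response_for(QueryResponse("SELECT", 3)) == {"code": "SELECT", "rows_affected": 3}
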
@@ -247,32 +259,6 @@ class RunTask(CompileTask):
super().__init__(args, config)
self.ran_hooks = []
self._total_executed = 0
self.deferred_manifest: Optional[WritableManifest] = None
def _get_state_path(self) -> Path:
if self.args.state is not None:
return self.args.state
else:
raise RuntimeException(
'Received a --defer argument, but no value was provided '
'to --state'
)
def _get_deferred_manifest(self) -> Optional[WritableManifest]:
if not self.args.defer:
return None
path = self._get_state_path()
if not path.is_absolute():
path = Path(self.config.project_root) / path
if path.exists() and not path.is_file():
path = path / 'manifest.json'
if not path.exists():
raise RuntimeException(
f'Could not find --state path: "{path}"'
)
return WritableManifest.read(str(path))
def index_offset(self, value: int) -> int:
return self._total_executed + value
@@ -281,9 +267,9 @@ class RunTask(CompileTask):
return False
def get_hook_sql(self, adapter, hook, idx, num_hooks, extra_context):
compiled = compile_node(adapter, self.config, hook, self.manifest,
extra_context)
statement = compiled.injected_sql
compiler = adapter.get_compiler()
compiled = compiler.compile_node(hook, self.manifest, extra_context)
statement = compiled.compiled_sql
hook_index = hook.index or num_hooks
hook_obj = get_hook(statement, index=hook_index)
return hook_obj.sql or ''
@@ -350,7 +336,7 @@ class RunTask(CompileTask):
with finishctx, DbtModelState({'node_status': 'passed'}):
print_hook_end_line(
hook_text, status, idx, num_hooks, timer.elapsed
hook_text, str(status), idx, num_hooks, timer.elapsed
)
self._total_executed += len(ordered_hooks)
@@ -383,9 +369,26 @@ class RunTask(CompileTask):
"Finished running {stat_line}{execution}."
.format(stat_line=stat_line, execution=execution))
def defer_to_manifest(self, selected_uids):
self.deferred_manifest = self._get_deferred_manifest()
if self.deferred_manifest is None:
def _get_deferred_manifest(self) -> Optional[WritableManifest]:
if not self.args.defer:
return None
state = self.previous_state
if state is None:
raise RuntimeException(
'Received a --defer argument, but no value was provided '
'to --state'
)
if state.manifest is None:
raise RuntimeException(
f'Could not find manifest in --state path: "{self.args.state}"'
)
return state.manifest
def defer_to_manifest(self, adapter, selected_uids: AbstractSet[str]):
deferred_manifest = self._get_deferred_manifest()
if deferred_manifest is None:
return
if self.manifest is None:
raise InternalException(
@@ -393,17 +396,18 @@ class RunTask(CompileTask):
'manifest to defer from!'
)
self.manifest.merge_from_artifact(
other=self.deferred_manifest,
adapter=adapter,
other=deferred_manifest,
selected=selected_uids,
)
# TODO: is it wrong to write the manifest here? I think it's right...
self.write_manifest()
def before_run(self, adapter, selected_uids):
self.defer_to_manifest(selected_uids)
def before_run(self, adapter, selected_uids: AbstractSet[str]):
with adapter.connection_named('master'):
self.create_schemas(adapter, selected_uids)
self.populate_adapter_cache(adapter)
self.defer_to_manifest(adapter, selected_uids)
self.safe_run_hooks(adapter, RunHookType.Start, {})
def after_run(self, adapter, results):
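
Conceptually, deferral means: for nodes that were not selected in this run, fall back to their definitions from the --state manifest. A toy merge that captures the idea behind merge_from_artifact, using plain dicts instead of dbt manifests:

from typing import AbstractSet, Dict

def merge_deferred(current: Dict[str, dict],
                   deferred: Dict[str, dict],
                   selected: AbstractSet[str]) -> Dict[str, dict]:
    merged = dict(current)
    for unique_id, node in deferred.items():
        # only unselected nodes defer to the previous state's definition
        if unique_id not in selected and unique_id in merged:
            merged[unique_id] = node
    return merged

current = {"model.a": {"schema": "dev"}, "model.b": {"schema": "dev"}}
previous = {"model.a": {"schema": "prod"}, "model.b": {"schema": "prod"}}
assert merge_deferred(current, previous, selected={"model.b"}) == {
    "model.a": {"schema": "prod"},  # unselected, deferred to previous state
    "model.b": {"schema": "dev"},   # selected, so built in this run
}
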
@@ -411,10 +415,16 @@ class RunTask(CompileTask):
# list of unique database, schema pairs that successfully executed
# models were in. for backwards compatibility, include the old
# 'schemas', which did not include database information.
database_schema_set: Set[Tuple[Optional[str], str]] = {
(r.node.database, r.node.schema) for r in results
if not any((r.error is not None, r.fail, r.skipped))
if r.status not in (
NodeStatus.Error,
NodeStatus.Fail,
NodeStatus.Skipped
)
}
self._total_executed += len(results)
extras = {
@@ -436,6 +446,7 @@ class RunTask(CompileTask):
return ResourceTypeSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
resource_types=[NodeType.Model],
)

View File

@@ -8,7 +8,7 @@ from .runnable import ManifestTask
import dbt.exceptions
from dbt.adapters.factory import get_adapter
from dbt.config.utils import parse_cli_vars
from dbt.contracts.results import RunOperationResult
from dbt.contracts.results import RunOperationResultsArtifact
from dbt.exceptions import InternalException
from dbt.logger import GLOBAL_LOGGER as logger
@@ -47,7 +47,7 @@ class RunOperationTask(ManifestTask):
return res
def run(self) -> RunOperationResult:
def run(self) -> RunOperationResultsArtifact:
start = datetime.utcnow()
self._runtime_initialize()
try:
@@ -69,11 +69,10 @@ class RunOperationTask(ManifestTask):
else:
success = True
end = datetime.utcnow()
return RunOperationResult(
results=[],
return RunOperationResultsArtifact.from_success(
generated_at=end,
elapsed_time=(end - start).total_seconds(),
success=success
success=success,
)
def interpret_results(self, results):

View File

@@ -4,7 +4,8 @@ from abc import abstractmethod
from concurrent.futures import as_completed
from datetime import datetime
from multiprocessing.dummy import Pool as ThreadPool
from typing import Optional, Dict, List, Set, Tuple, Iterable
from typing import Optional, Dict, List, Set, Tuple, Iterable, AbstractSet
from pathlib import PosixPath, WindowsPath
from .printer import (
print_run_result_error,
@@ -26,12 +27,12 @@ from dbt.logger import (
NodeCount,
print_timestamped_line,
)
from dbt.compilation import compile_manifest
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.results import ExecutionResult
from dbt.contracts.results import NodeStatus, RunExecutionResult
from dbt.contracts.state import PreviousState
from dbt.exceptions import (
InternalException,
NotImplementedException,
@@ -70,7 +71,9 @@ class ManifestTask(ConfiguredTask):
raise InternalException(
'compile_manifest called before manifest was loaded'
)
self.graph = compile_manifest(self.config, self.manifest)
adapter = get_adapter(self.config)
compiler = adapter.get_compiler()
self.graph = compiler.compile(self.manifest)
def _runtime_initialize(self):
self.load_manifest()
@@ -88,6 +91,12 @@ class GraphRunnableTask(ManifestTask):
self.node_results = []
self._skipped_children = {}
self._raise_next_tick = None
self.previous_state: Optional[PreviousState] = None
self.set_previous_state()
def set_previous_state(self):
if self.args.state is not None:
self.previous_state = PreviousState(self.args.state)
def index_offset(self, value: int) -> int:
return value
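
A rough sketch of what a previous-state holder like this can look like: given the --state directory, load manifest.json if it is present and keep None otherwise (hypothetical minimal class, not dbt's PreviousState):

import json
from pathlib import Path
from typing import Optional

class PreviousStateSketch:
    def __init__(self, state_path: str):
        self.path = Path(state_path)
        self.manifest: Optional[dict] = None
        manifest_path = self.path / "manifest.json"
        if manifest_path.exists():
            # keep the parsed artifact around for deferral and state: selectors
            self.manifest = json.loads(manifest_path.read_text())

state = PreviousStateSketch("target/")  # manifest stays None if the file is absent
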
@@ -180,17 +189,17 @@ class GraphRunnableTask(ManifestTask):
fail_fast = getattr(self.config.args, 'fail_fast', False)
if (result.fail is not None or result.error is not None) and fail_fast:
if result.status in (NodeStatus.Error, NodeStatus.Fail) and fail_fast:
self._raise_next_tick = FailFastException(
message='Failing early due to test failure or runtime error',
result=result,
node=getattr(result, 'node', None)
)
elif result.error is not None and self.raise_on_first_error():
elif result.status == NodeStatus.Error and self.raise_on_first_error():
# if we raise inside a thread, it'll just get silently swallowed.
# stash the error message we want here, and it will check the
# next 'tick' - should be soon since our thread is about to finish!
self._raise_next_tick = RuntimeException(result.error)
self._raise_next_tick = RuntimeException(result.message)
return result
@@ -278,7 +287,7 @@ class GraphRunnableTask(ManifestTask):
else:
self.manifest.update_node(node)
if result.error is not None:
if result.status == NodeStatus.Error:
if is_ephemeral:
cause = result
else:
@@ -356,7 +365,7 @@ class GraphRunnableTask(ManifestTask):
def before_hooks(self, adapter):
pass
def before_run(self, adapter, selected_uids):
def before_run(self, adapter, selected_uids: AbstractSet[str]):
with adapter.connection_named('master'):
self.populate_adapter_cache(adapter)
@@ -366,7 +375,7 @@ class GraphRunnableTask(ManifestTask):
def after_hooks(self, adapter, results, elapsed):
pass
def execute_with_hooks(self, selected_uids):
def execute_with_hooks(self, selected_uids: AbstractSet[str]):
adapter = get_adapter(self.config)
try:
self.before_hooks(adapter)
@@ -387,6 +396,9 @@ class GraphRunnableTask(ManifestTask):
)
return result
def write_result(self, result):
result.write(self.result_path())
def run(self):
"""
Run dbt for the query, based on the graph.
@@ -414,7 +426,8 @@ class GraphRunnableTask(ManifestTask):
result = self.execute_with_hooks(selected_uids)
if flags.WRITE_JSON:
result.write(self.result_path())
self.write_manifest()
self.write_result(result)
self.task_end_messages(result.results)
return result
@@ -423,7 +436,14 @@ class GraphRunnableTask(ManifestTask):
if results is None:
return False
failures = [r for r in results if r.error or r.fail]
failures = [
r for r in results if r.status in (
NodeStatus.RuntimeErr,
NodeStatus.Error,
NodeStatus.Fail,
NodeStatus.Skipped  # propagate error message causing skip
)
]
return len(failures) == 0
def get_model_schemas(
@@ -518,11 +538,37 @@ class GraphRunnableTask(ManifestTask):
create_future.result()
def get_result(self, results, elapsed_time, generated_at):
return ExecutionResult(
return RunExecutionResult(
results=results,
elapsed_time=elapsed_time,
generated_at=generated_at
generated_at=generated_at,
args=self.args_to_dict(),
)
def args_to_dict(self):
var_args = vars(self.args)
dict_args = {}
# remove args keys that clutter up the dictionary
for key in var_args:
if key == 'cls':
continue
if var_args[key] is None:
continue
default_false_keys = (
'debug', 'full_refresh', 'fail_fast', 'warn_error',
'single_threaded', 'test_new_parser', 'log_cache_events',
'strict'
)
if key in default_false_keys and var_args[key] is False:
continue
if key == 'vars' and var_args[key] == '{}':
continue
# this was required for a test case
if (isinstance(var_args[key], PosixPath) or
isinstance(var_args[key], WindowsPath)):
var_args[key] = str(var_args[key])
dict_args[key] = var_args[key]
return dict_args
def task_end_messages(self, results):
print_run_end_messages(results)
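
The filtering in args_to_dict above is easy to exercise on its own: given an argparse-style namespace, drop None values, default-False booleans, the empty --vars payload, and stringify Path objects. A compressed standalone version for illustration:

from argparse import Namespace
from pathlib import Path

DEFAULT_FALSE_KEYS = {"debug", "full_refresh", "fail_fast", "warn_error",
                      "single_threaded", "test_new_parser", "log_cache_events", "strict"}

def args_to_dict(args: Namespace) -> dict:
    out = {}
    for key, value in vars(args).items():
        if key == "cls" or value is None:
            continue
        if key in DEFAULT_FALSE_KEYS and value is False:
            continue
        if key == "vars" and value == "{}":
            continue
        out[key] = str(value) if isinstance(value, Path) else value
    return out

ns = Namespace(models=["a"], debug=False, vars="{}", state=Path("target"), threads=None)
assert args_to_dict(ns) == {"models": ["a"], "state": "target"}
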

View File

@@ -7,6 +7,7 @@ from .printer import (
print_run_end_messages,
)
from dbt.contracts.results import RunStatus
from dbt.exceptions import InternalException
from dbt.graph import ResourceTypeSelector
from dbt.logger import GLOBAL_LOGGER as logger, TextOnly
@@ -37,6 +38,10 @@ class SeedRunner(ModelRunner):
class SeedTask(RunTask):
def defer_to_manifest(self, adapter, selected_uids):
# seeds don't defer
return
def raise_on_first_error(self):
return False
@@ -48,6 +53,7 @@ class SeedTask(RunTask):
return ResourceTypeSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
resource_types=[NodeType.Seed],
)
@@ -78,5 +84,5 @@ class SeedTask(RunTask):
def show_tables(self, results):
for result in results:
if result.error is None:
if result.status != RunStatus.Error:
self.show_table(result)

View File

@@ -22,6 +22,10 @@ class SnapshotTask(RunTask):
def raise_on_first_error(self):
return False
def defer_to_manifest(self, adapter, selected_uids):
# snapshots don't defer
return
def get_node_selector(self):
if self.manifest is None or self.graph is None:
raise InternalException(
@@ -30,6 +34,7 @@ class SnapshotTask(RunTask):
return ResourceTypeSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
resource_types=[NodeType.Snapshot],
)

View File

@@ -1,3 +1,4 @@
import threading
from typing import Dict, Any, Set
from .compile import CompileRunner
@@ -14,7 +15,7 @@ from dbt.contracts.graph.parsed import (
ParsedDataTestNode,
ParsedSchemaTestNode,
)
from dbt.contracts.results import RunModelResult
from dbt.contracts.results import RunResult, TestStatus
from dbt.exceptions import raise_compiler_error, InternalException
from dbt.graph import (
ResourceTypeSelector,
@@ -41,10 +42,9 @@ class TestRunner(CompileRunner):
print_start_line(description, self.node_index, self.num_nodes)
def execute_data_test(self, test: CompiledDataTestNode):
sql = (
f'select count(*) as errors from (\n{test.injected_sql}\n) sbq'
res, table = self.adapter.execute(
test.compiled_sql, auto_begin=True, fetch=True
)
res, table = self.adapter.execute(sql, auto_begin=True, fetch=True)
num_rows = len(table.rows)
if num_rows != 1:
@@ -60,7 +60,7 @@ class TestRunner(CompileRunner):
def execute_schema_test(self, test: CompiledSchemaTestNode):
res, table = self.adapter.execute(
test.injected_sql,
test.compiled_sql,
auto_begin=True,
fetch=True,
)
@@ -84,19 +84,30 @@ class TestRunner(CompileRunner):
elif isinstance(test, CompiledSchemaTestNode):
failed_rows = self.execute_schema_test(test)
else:
raise InternalException(
f'Expected compiled schema test or compiled data test, got '
f'{type(test)}'
)
severity = test.config.severity.upper()
severity = test.config.severity.upper()
thread_id = threading.current_thread().name
status = None
if failed_rows == 0:
return RunModelResult(test, status=failed_rows)
status = TestStatus.Pass
elif severity == 'ERROR' or flags.WARN_ERROR:
return RunModelResult(test, status=failed_rows, fail=True)
status = TestStatus.Fail
else:
return RunModelResult(test, status=failed_rows, warn=True)
status = TestStatus.Warn
return RunResult(
node=test,
status=status,
timing=[],
thread_id=thread_id,
execution_time=0,
message=int(failed_rows),
adapter_response={}
)
def after_execute(self, result):
self.print_result_line(result)
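
The status selection above reduces to a small decision table: zero failing rows pass; otherwise an 'error'-severity test (or --warn-error mode) fails, and anything else warns. A standalone sketch of that mapping (toy enum):

from enum import Enum

class TestStatus(str, Enum):
    Pass = "pass"
    Warn = "warn"
    Fail = "fail"

def status_for(failed_rows: int, severity: str, warn_error: bool = False) -> TestStatus:
    if failed_rows == 0:
        return TestStatus.Pass
    if severity.upper() == "ERROR" or warn_error:
        return TestStatus.Fail
    return TestStatus.Warn

assert status_for(0, "warn") is TestStatus.Pass
assert status_for(5, "warn") is TestStatus.Warn
assert status_for(5, "warn", warn_error=True) is TestStatus.Fail
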
@@ -107,18 +118,23 @@ SCHEMA_TEST_TYPES = (CompiledSchemaTestNode, ParsedSchemaTestNode)
class TestSelector(ResourceTypeSelector):
def __init__(self, graph, manifest):
def __init__(self, graph, manifest, previous_state):
super().__init__(
graph=graph,
manifest=manifest,
previous_state=previous_state,
resource_types=[NodeType.Test],
)
def expand_selection(self, selected: Set[UniqueId]) -> Set[UniqueId]:
selected_tests = {
n for n in self.graph.select_successors(selected)
if self.manifest.nodes[n].resource_type == NodeType.Test
}
# exposures can't have tests, so this is relatively easy
selected_tests = set()
for unique_id in self.graph.select_successors(selected):
if unique_id in self.manifest.nodes:
node = self.manifest.nodes[unique_id]
if node.resource_type == NodeType.Test:
selected_tests.add(unique_id)
return selected | selected_tests
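
The rewritten expand_selection guards against graph successors that are not ordinary nodes, since exposures live outside manifest.nodes. A toy version over plain dicts and sets shows the shape of it:

from typing import Dict, Set

def expand_selection(selected: Set[str],
                     successors: Dict[str, Set[str]],
                     nodes: Dict[str, str]) -> Set[str]:
    selected_tests = set()
    for start in selected:
        for unique_id in successors.get(start, set()):
            # anything outside the nodes mapping (e.g. exposures) is skipped
            if nodes.get(unique_id) == "test":
                selected_tests.add(unique_id)
    return selected | selected_tests

nodes = {"model.a": "model", "test.not_null_a": "test"}
successors = {"model.a": {"test.not_null_a", "exposure.dashboard"}}
assert expand_selection({"model.a"}, successors, nodes) == {"model.a", "test.not_null_a"}
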
@@ -128,6 +144,7 @@ class TestTask(RunTask):
Read schema files + custom data tests and validate that
constraints are satisfied.
"""
def raise_on_first_error(self):
return False
@@ -153,6 +170,7 @@ class TestTask(RunTask):
return TestSelector(
graph=self.graph,
manifest=self.manifest,
previous_state=self.previous_state,
)
def get_runner_type(self):

View File

@@ -15,19 +15,38 @@ import requests
import yaml
import os
import tracking # written in Rust
sp_logger.setLevel(100)
COLLECTOR_URL = "fishtownanalytics.sinter-collect.com"
COLLECTOR_PROTOCOL = "https"
COLLECTOR_URL = tracking.connector_url()
COLLECTOR_PROTOCOL = tracking.collector_protocol()
INVOCATION_SPEC = 'iglu:com.dbt/invocation/jsonschema/1-0-1'
PLATFORM_SPEC = 'iglu:com.dbt/platform/jsonschema/1-0-0'
RUN_MODEL_SPEC = 'iglu:com.dbt/run_model/jsonschema/1-0-1'
INVOCATION_ENV_SPEC = 'iglu:com.dbt/invocation_env/jsonschema/1-0-0'
PACKAGE_INSTALL_SPEC = 'iglu:com.dbt/package_install/jsonschema/1-0-0'
RPC_REQUEST_SPEC = 'iglu:com.dbt/rpc_request/jsonschema/1-0-1'
INVOCATION_SPEC = tracking.invocation_spec()
PLATFORM_SPEC = tracking.platform_spec()
RUN_MODEL_SPEC = tracking.run_model_spec()
INVOCATION_ENV_SPEC = tracking.invocation_env_spec()
PACKAGE_INSTALL_SPEC = tracking.package_install_spec()
RPC_REQUEST_SPEC = tracking.rpc_request_spec()
DEPRECATION_WARN_SPEC = tracking.deprecation_warn_spec()
LOAD_ALL_TIMING_SPEC = tracking.load_all_timing_spec()
DBT_INVOCATION_ENV = 'DBT_INVOCATION_ENV'
DBT_INVOCATION_ENV = tracking.dbt_invocation_env()
# --- revert to these for testing purposes --- #
# COLLECTOR_URL = "fishtownanalytics.sinter-collect.com"
# COLLECTOR_PROTOCOL = "https"
# INVOCATION_SPEC = 'iglu:com.dbt/invocation/jsonschema/1-0-1'
# PLATFORM_SPEC = 'iglu:com.dbt/platform/jsonschema/1-0-0'
# RUN_MODEL_SPEC = 'iglu:com.dbt/run_model/jsonschema/1-0-1'
# INVOCATION_ENV_SPEC = 'iglu:com.dbt/invocation_env/jsonschema/1-0-0'
# PACKAGE_INSTALL_SPEC = 'iglu:com.dbt/package_install/jsonschema/1-0-0'
# RPC_REQUEST_SPEC = 'iglu:com.dbt/rpc_request/jsonschema/1-0-1'
# DEPRECATION_WARN_SPEC = 'iglu:com.dbt/deprecation_warn/jsonschema/1-0-0'
# LOAD_ALL_TIMING_SPEC = 'iglu:com.dbt/load_all_timing/jsonschema/1-0-0'
# DBT_INVOCATION_ENV = 'DBT_INVOCATION_ENV'
class TimeoutEmitter(Emitter):
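
These constants now come from the compiled Rust tracking module imported above. For environments without that extension, a pure-Python stand-in could return the same literals that the commented-out block preserves for testing; a hypothetical stub, mirroring only the functions this file calls:

# tracking_stub.py -- hypothetical fallback, returning the literals shown above
def connector_url(): return "fishtownanalytics.sinter-collect.com"
def collector_protocol(): return "https"
def invocation_spec(): return "iglu:com.dbt/invocation/jsonschema/1-0-1"
def platform_spec(): return "iglu:com.dbt/platform/jsonschema/1-0-0"
def run_model_spec(): return "iglu:com.dbt/run_model/jsonschema/1-0-1"
def invocation_env_spec(): return "iglu:com.dbt/invocation_env/jsonschema/1-0-0"
def package_install_spec(): return "iglu:com.dbt/package_install/jsonschema/1-0-0"
def rpc_request_spec(): return "iglu:com.dbt/rpc_request/jsonschema/1-0-1"
def deprecation_warn_spec(): return "iglu:com.dbt/deprecation_warn/jsonschema/1-0-0"
def load_all_timing_spec(): return "iglu:com.dbt/load_all_timing/jsonschema/1-0-0"
def dbt_invocation_env(): return "DBT_INVOCATION_ENV"
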
@@ -272,6 +291,20 @@ def track_invocation_start(config=None, args=None):
)
def track_project_load(options):
context = [SelfDescribingJson(LOAD_ALL_TIMING_SPEC, options)]
assert active_user is not None, \
'Cannot track project loading time when active user is None'
track(
active_user,
category='dbt',
action='load_project',
label=active_user.invocation_id,
context=context
)
def track_model_run(options):
context = [SelfDescribingJson(RUN_MODEL_SPEC, options)]
assert active_user is not None, \
@@ -321,6 +354,25 @@ def track_package_install(config, args, options):
)
def track_deprecation_warn(options):
assert active_user is not None, \
'Cannot track deprecation warnings when active user is None'
context = [
SelfDescribingJson(DEPRECATION_WARN_SPEC, options)
]
track(
active_user,
category="dbt",
action='deprecation',
label=active_user.invocation_id,
property_='warn',
context=context
)
def track_invocation_end(
config=None, args=None, result_type=None
):
@@ -401,6 +453,13 @@ def initialize_tracking(cookie_dir):
active_user = User(None)
def get_invocation_id() -> Optional[str]:
if active_user is None:
return None
else:
return active_user.invocation_id
class InvocationProcessor(logbook.Processor):
def __init__(self):
super().__init__()

View File

@@ -1,3 +1,4 @@
import dbt.flags as flags
import textwrap
from typing import Dict
@@ -11,8 +12,6 @@ COLORS: Dict[str, str] = {
}
USE_COLORS = False
COLOR_FG_RED = COLORS['red']
COLOR_FG_GREEN = COLORS['green']
COLOR_FG_YELLOW = COLORS['yellow']
@@ -21,9 +20,8 @@ COLOR_RESET_ALL = COLORS['reset_all']
PRINTER_WIDTH = 80
def use_colors():
global USE_COLORS
USE_COLORS = True
def use_colors(use_colors_val=True):
flags.USE_COLORS = use_colors_val
def printer_width(printer_width):
@@ -32,7 +30,7 @@ def printer_width(printer_width):
def color(text: str, color_code: str):
if USE_COLORS:
if flags.USE_COLORS:
return "{}{}{}".format(color_code, text, COLOR_RESET_ALL)
else:
return text
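
With the module-level USE_COLORS constant gone, coloring is now toggled through dbt.flags at runtime. A minimal sketch of the same flag-driven behavior, using a stand-in flags namespace:

from types import SimpleNamespace

flags = SimpleNamespace(USE_COLORS=False)  # stand-in for dbt.flags
COLOR_RESET_ALL = "\033[0m"

def color(text: str, color_code: str) -> str:
    if flags.USE_COLORS:
        return "{}{}{}".format(color_code, text, COLOR_RESET_ALL)
    return text

assert color("ok", "\033[32m") == "ok"   # colors disabled by default
flags.USE_COLORS = True                  # e.g. via use_colors(True)
assert color("ok", "\033[32m").startswith("\033[32m")
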

Some files were not shown because too many files have changed in this diff.