Compare commits

...

363 Commits

Author SHA1 Message Date
Kyle Wigley
0259c50b49 update manifest 2021-02-09 17:43:13 -05:00
Kyle Wigley
7be2992b1b move to core 2021-02-09 17:09:02 -05:00
Kyle Wigley
e8f057d785 hacking extensions in monorepo 2021-02-09 16:45:14 -05:00
Gerda Shank
6c6649f912 Performance fixes, including supporting libyaml, caching
mapped_fields in the classes for 'from_dict', removing deepcopy
on fqn_search, separating validation from 'from_dict',
and special handling for dbt internal not_null and unique tests.
Use TestMacroNamespace instead of the original in order to limit
the number of macros in the context. Integrate mashumaro into
dbt to improve performance of 'from_dict' and 'to_dict'
2021-02-05 15:23:55 -05:00
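The mashumaro integration referenced in this commit replaces reflection-heavy dict conversion with methods generated at class-definition time. A minimal sketch of the pattern, using a hypothetical `ColumnInfo` dataclass rather than dbt's actual contract classes:

```python
# Minimal sketch of the mashumaro pattern mentioned above; ColumnInfo is a
# hypothetical example class, not one of dbt's real contracts.
from dataclasses import dataclass

from mashumaro import DataClassDictMixin


@dataclass
class ColumnInfo(DataClassDictMixin):
    name: str
    description: str = ""


col = ColumnInfo.from_dict({"name": "id", "description": "primary key"})
assert col.to_dict() == {"name": "id", "description": "primary key"}
```

Because the (de)serialization code is generated once per class, repeated `from_dict`/`to_dict` calls during parsing avoid per-call reflection, which is the performance angle the commit describes.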
Kyle Wigley
2b48152da6 Merge branch 'dev/0.19.1' into dev/margaret-mead 2021-01-27 17:16:13 -05:00
Christophe Blefari
e743e23d6b Update CHANGELOG to release fix in dbt 0.19.1 version 2021-01-27 16:57:29 -05:00
Christophe Blefari
f846f921f2 Bump werkzeug upper bound dependency constraint to include version 1.0 2021-01-27 16:55:56 -05:00
Github Build Bot
1060035838 Merge remote-tracking branch 'origin/releases/0.19.0' into dev/kiyoshi-kuromiya 2021-01-27 18:02:37 +00:00
Github Build Bot
69cc20013e Release dbt v0.19.0 2021-01-27 17:39:48 +00:00
Github Build Bot
3572bfd37d Merge remote-tracking branch 'origin/releases/0.19.0rc3' into dev/kiyoshi-kuromiya 2021-01-27 16:42:46 +00:00
Github Build Bot
a6b82990f5 Release dbt v0.19.0rc3 2021-01-27 16:07:41 +00:00
Kyle Wigley
540c1fd9c6 Merge pull request #3019 from fishtown-analytics/fix/cleanup-dockerfile
Clean up docker resources
2021-01-25 10:19:45 -05:00
Jeremy Cohen
46d36cd412 Merge pull request #3028 from NiallRees/lowercase_cte_names
Make generated CTE test names lowercase to match style guide
2021-01-25 14:39:26 +01:00
NiallRees
a170764fc5 Add to contributors 2021-01-25 11:16:00 +00:00
NiallRees
f72873a1ce Update CHANGELOG.md
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-25 11:13:32 +00:00
NiallRees
82496c30b1 Changelog 2021-01-24 16:35:40 +00:00
NiallRees
cb3c007acd Make generated CTE test names lowercase to match style guide 2021-01-24 16:19:20 +00:00
Jeremy Cohen
cb460a797c Merge pull request #3018 from lynxcare/fix-issue-debug-exit-code
dbt debug should return 1 when one of the tests fail
2021-01-21 16:36:03 +01:00
Sam Debruyn
df24c7d2f8 Merge branch 'dev/margaret-mead' into fix-issue-debug-exit-code 2021-01-21 15:39:18 +01:00
Sam Debruyn
133c15c0e2 move in changelog to v0.20 2021-01-21 15:38:31 +01:00
Kyle Wigley
116e18a19e rename testing dockerfile 2021-01-21 09:28:17 -05:00
Sam Debruyn
ec0af7c97b remove exitcodes and sys.exit 2021-01-21 10:36:05 +01:00
Jeremy Cohen
a34a877737 Merge pull request #2974 from rvacaru/fix-bug-2731
Fix bug #2731 on stripping query comments for snowflake
2021-01-21 09:54:22 +01:00
Sam Debruyn
f018794465 fix flake test - formatting 2021-01-20 21:09:58 +01:00
Sam Debruyn
d45f5e9791 add missing conditions 2021-01-20 18:15:32 +01:00
Razvan Vacaru
04bd0d834c added extra unit test 2021-01-20 18:06:17 +01:00
Sam Debruyn
ed4f0c4713 formatting 2021-01-20 18:04:21 +01:00
Sam Debruyn
c747068d4a use sys.exit 2021-01-20 16:51:06 +01:00
Kyle Wigley
aa0fbdc993 update changelog 2021-01-20 10:33:18 -05:00
Kyle Wigley
b50bfa7277 - rm older dockerfiles
- add dockerfile from dbt-releases
- rename the development dockerfile to Dockerfile.dev to avoid confusion
2021-01-20 10:23:03 -05:00
Sam Debruyn
e91988f679 use ExitCodes enum for exit code 2021-01-20 16:09:41 +01:00
Sam Debruyn
3ed1fce3fb update changelog 2021-01-20 16:06:24 +01:00
Sam Debruyn
e3ea0b511a dbt debug should return 1 when one of the tests fail 2021-01-20 16:00:58 +01:00
Razvan Vacaru
c411c663de moved unit tests and updated changelog.md 2021-01-19 19:04:58 +01:00
Razvan Vacaru
1c6f66fc14 Merge branch 'dev/margaret-mead' of https://github.com/fishtown-analytics/dbt into fix-bug-2731 2021-01-19 19:01:01 +01:00
Jeremy Cohen
1f927a374c Merge pull request #2928 from yu-iskw/issue-1843
Support require_partition_filter and partition_expiration_days in BQ
2021-01-19 12:11:39 +01:00
Jeremy Cohen
07c4225aa8 Merge branch 'dev/margaret-mead' into issue-1843 2021-01-19 11:24:59 +01:00
Github Build Bot
42a85ac39f Merge remote-tracking branch 'origin/releases/0.19.0rc2' into dev/kiyoshi-kuromiya 2021-01-14 17:41:49 +00:00
Github Build Bot
16e6d31ee3 Release dbt v0.19.0rc2 2021-01-14 17:21:25 +00:00
Kyle Wigley
a6db5b436d Merge pull request #2996 from fishtown-analytics/fix/rm-ellipses
Remove ellipses printed while parsing
2021-01-14 10:39:16 -05:00
Kyle Wigley
47675f2e28 update changelog 2021-01-14 09:28:28 -05:00
Kyle Wigley
0642bbefa7 remove ellipses printed while parsing 2021-01-14 09:28:05 -05:00
Kyle Wigley
43da603d52 Merge pull request #3009 from fishtown-analytics/fix/exposure-parsing
Fix exposure parsing to allow other resources with the same name
2021-01-14 09:26:02 -05:00
Kyle Wigley
f9e1f4d111 update changelog 2021-01-13 11:54:20 -05:00
Jeremy Cohen
1508564e10 Merge pull request #3008 from fishtown-analytics/feat/print-exposure-stats-too
Add exposures to print_compile_stats
2021-01-13 15:58:13 +01:00
Kyle Wigley
c14e6f4dcc add test for dupe exposures and dupe model/exposure name 2021-01-13 08:55:22 -05:00
Jeremy Cohen
75b6a20134 Add exposures to Found list 2021-01-12 19:07:52 +01:00
Kyle Wigley
d82a07c221 tweak exposure parsing logic 2021-01-12 12:41:51 -05:00
Jeremy Cohen
c6f7dbcaa5 Merge pull request #3006 from stpierre/postgres-unpin-botocore
postgres: Don't pin botocore version
2021-01-12 13:59:55 +01:00
Chris St. Pierre
82cd099e48 Update CHANGELOG 2021-01-12 06:20:09 -06:00
Chris St. Pierre
546c011dd8 postgres: Don't pin botocore version
`snowflake-connector-python` doesn't pin it, and it restricts us to a
much older version of boto3 than the boto3 pin would otherwise allow
(specifically, botocore<1.15 requires boto3<1.12).
2021-01-11 17:25:03 -06:00
Jeremy Cohen
10b33ccaf6 Merge pull request #3004 from mikaelene/Snapshot_merge_WHEN_MATCHED
This change makes the macro easier to read and workable on SQL Server
2021-01-11 16:42:09 +01:00
mikaelene
bc01572176 Same as #3003, but for postgres 2021-01-11 16:04:38 +01:00
mikaelene
ccd2064722 This change makes the macro easier to read and makes the code work for SQL Server without a custom adapter macro. Solved #3003 2021-01-11 15:04:23 +01:00
mikaelene
0fb42901dd This change makes the macro easier to read and makes the code work for SQL Server without a custom adapter macro. Solved #3003 2021-01-11 14:58:07 +01:00
Jeremy Cohen
a4280d7457 Merge pull request #3000 from swanderz/tsql_not_equal_workaround
Tsql not equal workaround
2021-01-11 09:40:33 +01:00
Anders
6966ede68b Update CHANGELOG.md
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-10 20:54:37 -08:00
Anders
27dd14a5a2 Update core/dbt/include/global_project/macros/materializations/snapshot/strategies.sql
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-10 20:54:10 -08:00
Anders
2494301f1e Update CHANGELOG.md
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2021-01-10 20:53:52 -08:00
Anders Swanson
f13143accb for posterity 2021-01-08 13:23:13 -08:00
Anders Swanson
26d340a917 temp hack 2021-01-08 12:14:08 -08:00
Anders Swanson
cc75cd4102 no tsql support for condA != condB 2021-01-08 12:10:15 -08:00
Anders Swanson
cf8615b231 Merge branch 'dev/kiyoshi-kuromiya' of https://github.com/fishtown-analytics/dbt into dev/kiyoshi-kuromiya 2021-01-08 12:03:15 -08:00
Jeremy Cohen
30f473a2b1 Merge pull request #2994 from fishtown-analytics/copyedit-changelog
Light cleanup of v0.19.0 changelogs
2021-01-07 16:00:02 +01:00
Jeremy Cohen
4618709baa Lightly edit v0.19 changelogs 2021-01-07 10:13:43 +01:00
Razvan Vacaru
16b098ea42 updated CHANGELOG.md 2021-01-04 17:43:03 +01:00
Razvan Vacaru
b31c4d407a Fix #2731 stripping snowflake comments in multiline queries 2021-01-04 17:41:00 +01:00
Kyle Wigley
28c36cc5e2 Merge pull request #2988 from fishtown-analytics/fix/dockerfile
Manually fix requirements for dockerfile using new pip version
2021-01-04 09:10:05 -05:00
Kyle Wigley
6bfbcb842e manually fix dockerfile using new pip version 2020-12-31 13:53:50 -05:00
Github Build Bot
a0eade4fdd Merge remote-tracking branch 'origin/releases/0.19.0rc1' into dev/kiyoshi-kuromiya 2020-12-29 23:07:35 +00:00
Github Build Bot
ee24b7e88a Release dbt v0.19.0rc1 2020-12-29 22:52:34 +00:00
Anders Swanson
c9baddf9a4 Merge branch 'master' of https://github.com/fishtown-analytics/dbt into dev/kiyoshi-kuromiya 2020-12-22 23:11:09 -08:00
Kyle Wigley
c5c780a685 Merge pull request #2972 from fishtown-analytics/feature/update-dbt-docs
dbt-docs changes for v0.19.0-rc1
2020-12-22 14:07:20 -05:00
Kyle Wigley
421aaabf62 Merge pull request #2961 from fishtown-analytics/feature/add-adapter-query-stats
Include adapter response info in execution results
2020-12-22 13:57:07 -05:00
Kyle Wigley
86788f034f update changelog 2020-12-22 13:30:50 -05:00
Kyle Wigley
232d3758cf update dbt docs 2020-12-22 13:17:51 -05:00
Kyle Wigley
71bcf9b31d update changelog 2020-12-22 13:08:12 -05:00
Kyle Wigley
bf4ee4f064 update api, fix tests, add placeholder for test/source results 2020-12-22 12:13:37 -05:00
Kyle Wigley
aa3bdfeb17 update naming 2020-12-21 13:35:15 -05:00
Jeremy Cohen
ce6967d396 Merge pull request #2966 from fishtown-analytics/fix/add-ctes-comment
Update comments for _add_ctes()
2020-12-18 10:54:37 -05:00
Yu ISHIKAWA
330065f5e0 Add a condition for require_partition_filter 2020-12-18 11:14:03 +09:00
Yu ISHIKAWA
944db82553 Remove unnecessary code for print debug 2020-12-18 11:14:03 +09:00
Yu ISHIKAWA
c257361f05 Fix syntax 2020-12-18 11:14:03 +09:00
Yu ISHIKAWA
ffdbfb018a Implement tests in test_bigquery_changing_partitions.py 2020-12-18 11:14:01 +09:00
Yu ISHIKAWA
cfa2bd6b08 Remove tests from test_bigquery_adapter_specific.py 2020-12-18 11:13:16 +09:00
Yu ISHIKAWA
51e90c3ce0 Format 2020-12-18 11:13:16 +09:00
Yu ISHIKAWA
d69149f43e Update 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
f261663f3d Add debug code 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
e5948dd1d3 Update 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
5f13aab7d8 Print debug 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
292d489592 Format code 2020-12-18 11:13:15 +09:00
Yu ISHIKAWA
0a01f20e35 Update CHANGELOG.md 2020-12-18 11:13:11 +09:00
Yu ISHIKAWA
2bd08d5c4c Support require_partition_filter and partition_expiration_days in BQ 2020-12-18 11:12:47 +09:00
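For reference, the two configs added here appear to map onto table-level settings in BigQuery itself. A hedged sketch of those settings via the `google-cloud-bigquery` client (raw client API, not dbt's model config syntax; the table id and numbers are illustrative):

```python
# Sketch of the underlying BigQuery table settings (illustrative table id and values;
# this is the client library, not a dbt model config block).
from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("my-project.analytics.events")

# require_partition_filter: queries must filter on the partition column
table.require_partition_filter = True

# partition_expiration_days expressed in the API's milliseconds (here: 7 days)
if table.time_partitioning:
    table.time_partitioning.expiration_ms = 7 * 24 * 60 * 60 * 1000

client.update_table(table, ["require_partition_filter", "time_partitioning"])
```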
Jeremy Cohen
adae5126db Merge pull request #2954 from fishtown-analytics/feature/defer-tests
Feature: defer tests
2020-12-17 18:01:14 -05:00
Kyle Wigley
dddf1bcb76 first pass at adding query stats, naming tbd 2020-12-17 16:39:02 -05:00
Jeremy Cohen
d23d4b0fd4 Merge pull request #2963 from tyang209/issue-2931
Bumped boto3 version upper range for dbt-redshift
2020-12-17 14:30:47 -05:00
Tao Yang
658f7550b3 Merge branch 'dev/kiyoshi-kuromiya' into issue-2931 2020-12-17 08:58:49 -08:00
Kyle Wigley
cfb50ae21e Merge pull request #2960 from fishtown-analytics/feature/python-39
Test python3.9
2020-12-17 11:11:56 -05:00
Jeremy Cohen
9b0a365822 Update comments for _add_ctes() 2020-12-17 10:35:04 -05:00
Jeremy Cohen
97ab130619 Merge pull request #2958 from fishtown-analytics/fix/keyerror-defer-missing-parent
Fix KeyError from defer + deletion
2020-12-17 10:29:51 -05:00
Tao Yang
3578fde290 Bumped boto3 version upper range for dbt-redshift 2020-12-16 20:03:53 -08:00
Jeremy Cohen
f382da69b8 Changelog 2020-12-16 17:46:00 -05:00
Jeremy Cohen
2da3d215c6 Add test case to repro bug 2020-12-16 17:38:27 -05:00
Kyle Wigley
43ed29c14c update changelog 2020-12-16 16:29:48 -05:00
Jeremy Cohen
9df0283689 Truthier? 2020-12-16 14:55:27 -05:00
Jeremy Cohen
04b82cf4a5 What is backward may not be forward 2020-12-16 14:55:27 -05:00
Jeremy Cohen
274c3012b0 Add defer to rpc test method 2020-12-16 14:53:25 -05:00
Jeremy Cohen
2b24a4934f defer tests, too 2020-12-16 14:42:00 -05:00
Kyle Wigley
692a423072 comment out snowflake py39 tests 2020-12-16 11:27:00 -05:00
Kyle Wigley
148f55335f address issue with py39 2020-12-16 11:25:31 -05:00
Kyle Wigley
2f752842a1 update hologram and add new envs to tox 2020-12-16 11:25:31 -05:00
Jeremy Cohen
aff72996a1 Merge pull request #2946 from fishtown-analytics/fix/defer-if-not-exist
Defer iff unselected reference does not exist in current env
2020-12-16 11:22:31 -05:00
Jeremy Cohen
08e425bcf6 Handle keyerror if old node missing 2020-12-16 00:24:00 -05:00
Kyle Wigley
454ddc601a Merge pull request #2943 from fishtown-analytics/feature/refactor-run-results
Clean up run results
2020-12-15 12:42:22 -05:00
Jeremy Cohen
b025f208a8 Check if relation exists before deferring 2020-12-14 22:21:43 -05:00
Kyle Wigley
b60e533b9d fix printer output 2020-12-14 19:50:17 -05:00
Kyle Wigley
37af0e0d59 update changelog 2020-12-14 16:28:23 -05:00
Kyle Wigley
ac1de5bce9 more updates 2020-12-14 16:28:23 -05:00
Kyle Wigley
ef7ff55e07 flake8 2020-12-14 16:28:23 -05:00
Kyle Wigley
608db5b982 code cleanup + swap node with unique_id 2020-12-14 16:28:23 -05:00
Kyle Wigley
8dd69efd48 address test failures 2020-12-14 16:28:23 -05:00
Kyle Wigley
73f7fba793 fix printing test status 2020-12-14 16:28:23 -05:00
Kyle Wigley
867e2402d2 chugging along 2020-12-14 16:28:23 -05:00
Kyle Wigley
a3b9e61967 first pass, lots of TODO's [skip ci] 2020-12-14 16:28:22 -05:00
Jeremy Cohen
cd149b68e8 Merge pull request #2920 from joellabes/2913-docs-block-exposures
Render docs blocks in exposures
2020-12-13 18:38:23 -05:00
Joel Labes
cd3583c736 Merge branch 'dev/kiyoshi-kuromiya' into 2913-docs-block-exposures 2020-12-13 14:27:37 +13:00
Joel Labes
441f86f3ed Add test.notebook_info to expected manifest 2020-12-13 14:25:37 +13:00
Joel Labes
f62bea65a1 Move model.test.view_summary to parent map instead of child map 2020-12-13 14:11:04 +13:00
Jeremy Cohen
886b574987 Merge pull request #2939 from fishtown-analytics/fix/big-seed-smaller-path
Use diff file path for big seed checksum
2020-12-07 11:18:15 -05:00
Joel Labes
2888bac275 Merge branch 'dev/kiyoshi-kuromiya' into 2913-docs-block-exposures 2020-12-07 21:17:21 +13:00
Joel Labes
35c9206916 Fix test failure (?) 2020-12-07 21:15:44 +13:00
Joel Labes
c4c5b59312 Stab at updating parent and child maps 2020-12-07 17:45:12 +13:00
Jeremy Cohen
f25fb4e5ac Use diff file path for big seed checksum 2020-12-04 17:04:27 -05:00
Jeremy Cohen
868bfec5e6 Merge pull request #2907 from max-sixty/raise
Remove duplicate raise
2020-12-03 14:17:58 -05:00
Jeremy Cohen
e7c242213a Merge pull request #2908 from max-sixty/bq-default-project
Allow BigQuery to default on project name
2020-12-03 14:17:02 -05:00
Jeremy Cohen
862552ead4 Merge pull request #2930 from fishtown-analytics/revert-2858-dependabot/pip/docker/requirements/cryptography-3.2
Revert dependabot cryptography upgrade for old versions
2020-12-03 13:58:26 -05:00
Jeremy Cohen
9d90e0c167 tiny changelog fixup 2020-12-03 13:27:46 -05:00
Jeremy Cohen
a281f227cd Revert "Bump cryptography from 2.9.2 to 3.2 in /docker/requirements" 2020-12-03 12:12:15 -05:00
Maximilian Roos
5b981278db changelog 2020-12-02 14:59:35 -08:00
Maximilian Roos
c1091ed3d1 Merge branch 'dev/kiyoshi-kuromiya' into bq-default-project 2020-12-02 14:55:27 -08:00
Maximilian Roos
08aed63455 Formatting 2020-12-02 11:19:02 -08:00
Maximilian Roos
90a550ee4f Update plugins/bigquery/dbt/adapters/bigquery/connections.py
Co-authored-by: Kyle Wigley <kwigley44@gmail.com>
2020-12-02 10:41:20 -08:00
Jeremy Cohen
34869fc2a2 Merge pull request #2922 from plotneishestvo/snowflake_connector_upgrade
update cryptography package and snowflake connector
2020-12-02 12:34:34 -05:00
Pavel Plotnikov
3deb10156d Merge branch 'dev/kiyoshi-kuromiya' into snowflake_connector_upgrade 2020-12-02 12:46:02 +02:00
Maximilian Roos
8c0e84de05 Move method to module func 2020-12-01 16:19:20 -08:00
Joel Labes
23be083c39 Change models folder to ref_models folder 2020-12-02 11:59:21 +13:00
Joel Labes
217aafce39 Add line break to description, fix refs and maybe fix original_file_path 2020-12-02 11:47:29 +13:00
Joel Labes
03210c63f4 Blank instead of none description 2020-12-02 10:57:47 +13:00
Joel Labes
a90510f6f2 Ref a model that actually exists 2020-12-02 10:40:34 +13:00
Joel Labes
36d91aded6 Empty description for minimal/basic exposure object tests 2020-12-01 17:56:55 +13:00
Joel Labes
9afe8a1297 Default to empty string for ParsedExposure description 2020-12-01 17:35:42 +13:00
Maximilian Roos
1e6f272034 Add test config 2020-11-30 20:06:06 -08:00
Maximilian Roos
a1aa2f81ef _ 2020-11-30 19:30:07 -08:00
Maximilian Roos
62899ef308 _ 2020-11-30 16:54:21 -08:00
Joel Labes
7f3396c002 Forgot another comma 🤦 2020-12-01 12:46:26 +13:00
Joel Labes
453bc18196 Merge branch '2913-docs-block-exposures' of https://github.com/joellabes/dbt into 2913-docs-block-exposures 2020-12-01 12:42:11 +13:00
Joel Labes
dbb6b57b76 Forgot a comma 2020-12-01 12:40:51 +13:00
Joel Labes
d7137db78c Merge branch 'dev/kiyoshi-kuromiya' into 2913-docs-block-exposures 2020-12-01 12:34:29 +13:00
Joel Labes
5ac4f2d80b Move description arg to be below default-free args 2020-12-01 12:33:08 +13:00
Jeremy Cohen
5ba5271da9 Merge pull request #2903 from db-magnus/bq-hourly-part
Hourly, monthly and yearly partitions in BigQuery
2020-11-30 09:46:36 -05:00
Pavel Plotnikov
b834e3015a update changelog md 2020-11-30 14:46:51 +02:00
Joel Labes
c8721ded62 Code review: non-optional description, docs block tests, yaml exposure attributes 2020-11-30 20:29:47 +13:00
Magnus Fagertun
1e97372d24 Update test/unit/test_bigquery_adapter.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-30 07:26:36 +01:00
Magnus Fagertun
fd4e111784 Update test/unit/test_bigquery_adapter.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-30 00:44:25 +01:00
Magnus Fagertun
75094e7e21 Update test/unit/test_bigquery_adapter.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-30 00:44:15 +01:00
Joel Labes
8db2d674ed Update CHANGELOG.md 2020-11-28 15:08:13 +13:00
Pavel Plotnikov
ffb140fab3 update cryptography package and snowflake connector 2020-11-27 16:52:13 +02:00
Joel Labes
e93543983c Follow Jeremy's wild speculation 2020-11-27 22:45:31 +13:00
Magnus Fagertun
0d066f80ff added test and enhancements from jtcohen6 2020-11-25 21:41:51 +01:00
Magnus Fagertun
ccca1b2016 Update plugins/bigquery/dbt/adapters/bigquery/impl.py
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-11-25 21:17:07 +01:00
Kyle Wigley
fec0e31a25 Merge pull request #2902 from fishtown-analytics/fix/test-selection
set default `materialized` for test node configs
2020-11-24 12:19:40 -05:00
Kyle Wigley
d246aa8f6d update readme 2020-11-24 10:40:01 -05:00
Maximilian Roos
66bfba2258 flake8 seems to sometimes be applied 2020-11-23 17:39:57 -08:00
Maximilian Roos
b53b4373cb Define database exclusively in contracts/connection.py 2020-11-23 17:32:41 -08:00
Maximilian Roos
0810f93883 Allow BigQuery to default on project name 2020-11-23 16:58:54 -08:00
Maximilian Roos
a4e696a252 Remove duplicate raise 2020-11-23 15:34:43 -08:00
Jeremy Cohen
0951d08f52 Merge pull request #2877 from max-sixty/unlock-google-api
Wider google-cloud dependencies
2020-11-23 14:16:12 -05:00
Jeremy Cohen
dbf367e070 Merge branch 'dev/kiyoshi-kuromiya' into unlock-google-api 2020-11-23 11:46:07 -05:00
Magnus Fagertun
6447ba8ec8 whitespace cleanup 2020-11-22 10:00:10 +01:00
Magnus Fagertun
43e260966f uppercase and lowercase for date partitions supported 2020-11-21 01:21:07 +01:00
Magnus Fagertun
b0e301b046 typo in _partitions_match 2020-11-21 00:40:27 +01:00
Magnus Fagertun
c8a9ea4979 added month,year to date partitioning, granularity comparison to _partitions_match 2020-11-21 00:24:20 +01:00
Maximilian Roos
afb7fc05da Changelog 2020-11-20 14:58:46 -08:00
Magnus Fagertun
14124ccca8 added tests for datetime and timestamp 2020-11-20 00:10:15 +01:00
Magnus Fagertun
df5022dbc3 moving granularity to render, not to break tests 2020-11-19 18:51:05 +01:00
Magnus Fagertun
015e798a31 more BQ partitioning 2020-11-19 17:42:27 +01:00
Kyle Wigley
c19125bb02 Merge pull request #2893 from fishtown-analytics/feature/track-parse-time
Add event tracking for project parse/load time
2020-11-19 10:30:46 -05:00
Kyle Wigley
0e6ac5baf1 can we just default materialization to test? 2020-11-19 09:27:31 -05:00
Magnus Fagertun
2c8d1b5b8c Added hour, year, month partitioning BQ 2020-11-19 13:47:42 +01:00
Kyle Wigley
f7c0c1c21a fix tests 2020-11-18 17:21:41 -05:00
Kyle Wigley
4edd98f7ce update changelog 2020-11-18 16:19:58 -05:00
Kyle Wigley
df0abb7000 flake8 fixes 2020-11-18 16:19:58 -05:00
Kyle Wigley
4f93da307f add event to track loading time 2020-11-18 16:19:58 -05:00
Gerda Shank
a8765d54aa Merge pull request #2895 from fishtown-analytics/string_selectors
convert cli-style strings in selectors to normalized dictionaries
2020-11-18 15:53:23 -05:00
Gerda Shank
bb834358d4 convert cli-style strings in selectors to normalized dictionaries
[#2879]
2020-11-18 14:43:44 -05:00
Jeremy Cohen
ec0f3d22e7 Merge pull request #2892 from rsella/dev/kiyoshi-kuromiya
Change dbt list command to always return 0 as exit code
2020-11-17 11:12:55 -05:00
Riccardo Sella
009b75cab6 Fix changelog and edit additional failing tests 2020-11-17 16:38:28 +01:00
Riccardo Sella
d64668df1e Change dbt list command to always return 0 as exit code 2020-11-17 14:49:24 +01:00
Gerda Shank
72e808c9a7 Merge pull request #2889 from fishtown-analytics/dbt-test-runner
Add scripts/dtr.py, dbt test runner. Bump hologram version.
2020-11-15 20:06:28 -05:00
Gerda Shank
96cc9223be Add scripts/dtr.py, dbt test runner. Bump hologram version. 2020-11-13 14:34:10 -05:00
Gerda Shank
13b099fbd0 Merge pull request #2883 from fishtown-analytics/feature/2824-parse-only-command
Add parse command and collect parse timing info [#2824]
2020-11-13 10:19:19 -05:00
Gerda Shank
1a8416c297 Add parse command and collect parse timing info [#2824] 2020-11-12 13:56:41 -05:00
Maximilian Roos
8538bec99e _ 2020-11-11 13:48:41 -08:00
Maximilian Roos
f983900597 google-cloud-bigquery goes to 3 2020-11-10 23:51:15 -08:00
Gerda Shank
3af02020ff Merge pull request #2866 from fishtown-analytics/feature/2693-selectors-to-manifest
Save selector dictionary and write out in manifest [#2693][#2800]
2020-11-10 11:48:19 -05:00
Maximilian Roos
8c71488757 _ 2020-11-10 08:38:43 -08:00
Gerda Shank
74316bf702 Save selector dictionary and write out in manifest [#2693][#2800] 2020-11-10 11:17:14 -05:00
Maximilian Roos
7aa8c435c9 Bump protobuf too 2020-11-09 17:36:41 -08:00
Maximilian Roos
daeb51253d Unpin google-cloud dependencies 2020-11-09 17:18:42 -08:00
Jeremy Cohen
0ce2f41db4 Reorg #2837 in changelog 2020-11-09 09:46:08 -05:00
Jeremy Cohen
02e5a962d7 Merge pull request #2837 from franloza/feature/2647-relation-name-in-metadata
Store relation name in manifest's node and source objects
2020-11-09 09:44:40 -05:00
Jeremy Cohen
dcc32dc69f Merge pull request #2850 from elexisvenator/patch-1
Postgres: Prevent temp relation identifiers from being too long
2020-11-09 09:32:35 -05:00
Gerda Shank
af3d6681dd extend timeout for test/rpc 2020-11-06 17:45:35 -05:00
Gerda Shank
106968a3be Merge pull request #2858 from fishtown-analytics/dependabot/pip/docker/requirements/cryptography-3.2
Bump cryptography from 2.9.2 to 3.2 in /docker/requirements
2020-11-06 15:54:49 -05:00
Ben Edwards
2cd56ca044 Update changelog 2020-11-03 20:58:01 +11:00
Ben Edwards
eff198d079 Add integration tests 2020-11-03 20:56:02 +11:00
Ben Edwards
c3b5b88cd2 Postgres: Prevent temp relation identifiers from being too long
Related: #2197 

The current postgres `make_temp_relation` adds a 29-character suffix to the end of the temp relation identifier (9 from the default suffix and 20 from the timestamp). This is a problem now that relations with more than 63 characters raise exceptions.
The fix is to shorten the suffix and also trim the base_relation identifier so that the total length is always less than 63 characters.

An exception can also be raised if the default suffix is overridden with a value that is too long.
2020-11-03 20:56:02 +11:00
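A minimal sketch of the trimming logic this commit describes, assuming a hypothetical Python helper rather than the actual Jinja macro:

```python
# Hypothetical helper illustrating the fix described above: keep the suffix intact,
# trim the base identifier so the combined name never exceeds Postgres's 63-character
# identifier limit, and fail loudly if an overridden suffix is itself too long.
def make_temp_identifier(base: str, suffix: str = "__dbt_tmp", max_length: int = 63) -> str:
    if len(suffix) >= max_length:
        raise ValueError(f"temp relation suffix {suffix!r} is too long")
    return base[: max_length - len(suffix)] + suffix


assert len(make_temp_identifier("a" * 80)) <= 63
assert make_temp_identifier("orders").endswith("__dbt_tmp")
```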
Kyle Wigley
4e19e87bbc Merge pull request #2859 from fishtown-analytics/fix/update-test-container
add unixodbc-dev to testing docker image
2020-10-30 09:56:39 -04:00
Kyle Wigley
6be6f6585d update changelog 2020-10-29 16:52:09 -04:00
Kyle Wigley
d7579f0c99 add g++ and unixodbc-dev 2020-10-29 16:22:46 -04:00
Fran Lozano
b741679c9c Add missing key to child map in expected_bigquery_complex_manifest 2020-10-29 17:25:18 +01:00
Fran Lozano
852990e967 Fix child_map in tests 2020-10-28 22:18:32 +01:00
Fran Lozano
21fd75b500 Fix parent_map object in tests 2020-10-28 19:59:36 +01:00
Fran Lozano
3e5d9010a3 Add snapshot to additional Redshift and Bigquery manifest tests 2020-10-28 19:39:04 +01:00
Fran Lozano
784616ec29 Add relation name to source object in manifest 2020-10-28 18:58:25 +01:00
Fran Lozano
6251d19946 Use is_ephemeral_model property instead of config.materialized
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
2020-10-28 09:49:44 +01:00
dependabot[bot]
17b1332a2a Bump cryptography from 2.9.2 to 3.2 in /docker/requirements
Bumps [cryptography](https://github.com/pyca/cryptography) from 2.9.2 to 3.2.
- [Release notes](https://github.com/pyca/cryptography/releases)
- [Changelog](https://github.com/pyca/cryptography/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/2.9.2...3.2)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-27 22:22:14 +00:00
Jeremy Cohen
74eec3bdbe Merge pull request #2855 from brangisom/brangisom/spectrum-filter-fix
Fix the filtering for external tables in the Redshift get_columns_in_relation macro
2020-10-27 15:03:59 -04:00
Fran Lozano
a9901c4ea7 Disable snapshot documentation testing for Redshift and Bigquery 2020-10-27 19:27:54 +01:00
Brandon Isom
348a2f91ee Move a CHANGELOG entry 2020-10-27 11:17:13 -07:00
Fran Lozano
7115d862ea Modify snapshot path for docs generation tests 2020-10-27 18:59:32 +01:00
Fran Lozano
52ed4aa631 Fix tests which are missing snapshot nodes 2020-10-27 18:45:00 +01:00
Fran Lozano
92cedf8931 Fix Flake8 style issue 2020-10-27 17:39:44 +01:00
Fran Lozano
e1097f11b5 Define relation_name only for non-ephemeral models, seeds and snapshots 2020-10-27 17:23:23 +01:00
Brandon Isom
eb34c0e46b Add stuff to changelog per checklist 2020-10-26 20:17:03 -07:00
Brandon Isom
ee2181b371 Merge branch 'brangisom/spectrum-filter-fix' of github.com:brangisom/dbt into brangisom/spectrum-filter-fix 2020-10-26 19:44:45 -07:00
Brandon Isom
2a5d090e91 Pushes the table_schema = '{{ relation.schema }}' filter into the svv_external_columns CTE 2020-10-26 19:38:33 -07:00
Brandon Isom
857bebe819 Pushes the table_schema = '{{ relation.schema }}' clause down into the svv_external_columns CTE. 2020-10-26 18:29:47 -07:00
Jeremy Cohen
9728152768 Merge pull request #2851 from hochoy/add-python-regex
Follow up: Support for python "re" module for doing regex in jinja templates
2020-10-26 16:27:17 -04:00
Wai Ho Choy
2566a85429 edit CHANGELOG.md 2020-10-26 12:21:30 -07:00
Wai Ho Choy
46b3130198 lint 2020-10-25 21:18:23 -07:00
Wai Ho Choy
8664516c8d fix blank line linting 2020-10-25 12:03:10 -07:00
Wai Ho Choy
0733c246ea add all exports from python re module 2020-10-25 11:31:33 -07:00
Fran Lozano
4203985e3e Adapt expected_seeded_manifest method to Snowflake identifier quoting 2020-10-25 18:50:52 +01:00
Fran Lozano
900298bce7 Fix database name in relation_name in expected_run_results 2020-10-25 18:06:18 +01:00
Fran Lozano
09c37f508e Adapt relation_name to expected_run_results parameters 2020-10-25 17:27:46 +01:00
Fran Lozano
c9e01bcc81 Fix quotes in relation name for Bigquery docs generate tests 2020-10-25 16:31:00 +01:00
Fran Lozano
b079545e0f Adapt relation_name for Bigquery and Snowflake in docs generation tests 2020-10-25 15:58:04 +01:00
Fran Lozano
c3bf0f8cbf Add relation_name to missing tests in test_docs_generate 2020-10-25 14:11:33 +01:00
Jeremy Cohen
e945bca1d9 Merge pull request #2596 from ran-eh/re-partition-metadata
Make partition metadata available to BigQuery users
2020-10-22 20:58:17 -04:00
Jeremy Cohen
bf5835de5e Merge branch 'dev/kiyoshi-kuromiya' into re-partition-metadata 2020-10-22 20:18:31 -04:00
Ran Ever-Hadani
7503f0cb10 merge from dev/kiyoshi-kuromiya 2020-10-22 16:23:02 -07:00
Ran Ever-Hadani
3a751bcf9b Update CHANGELOG.md 2020-10-22 15:53:25 -07:00
Jeremy Cohen
c31ba101d6 Add tests for get_partitions_metadata (#1)
* Add tests using get_partitions_metadata

* Readd asterisk to raw_execute
2020-10-21 16:00:10 -07:00
Jeremy Cohen
ecadc74d44 Merge pull request #2841 from feluelle/dev/kiyoshi-kuromiya
Respect --project-dir in dbt clean command
2020-10-21 16:01:11 -04:00
Jeremy Cohen
63d25aaf19 Update changelog to account for v0.19.0-b1 release 2020-10-21 09:47:36 -04:00
feluelle
5af82c3c05 Add test that checks if targets were successfully deleted 2020-10-21 10:27:39 +02:00
feluelle
8b4d74ed17 Add changelog entry for resolving issue #2840 2020-10-21 10:27:39 +02:00
feluelle
6a6a9064d5 Respect --project-dir in dbt clean command 2020-10-21 10:25:17 +02:00
Github Build Bot
b188a9488a Merge remote-tracking branch 'origin/releases/0.19.0b1' into dev/kiyoshi-kuromiya 2020-10-21 00:46:30 +00:00
Github Build Bot
7c2635f65d Release dbt v0.19.0b1 2020-10-21 00:35:44 +00:00
Jeremy Cohen
c67d0a0e1a Readd bumpversion config header 2020-10-20 18:01:04 -04:00
Fran Lozano
7ee78e89c9 Add missing relation_name fields in doc generation test manifests 2020-10-20 19:18:38 +02:00
Fran Lozano
40370e104f Fix wrong schema name in test and add missing relation_name in node 2020-10-20 18:48:59 +02:00
Fran Lozano
a8809baa6c Merge branch 'dev/kiyoshi-kuromiya' into feature/2647-relation-name-in-metadata 2020-10-20 18:32:53 +02:00
Fran Lozano
244d5d2c3b Merge remote-tracking branch 'upstream/dev/kiyoshi-kuromiya' into dev/kiyoshi-kuromiya 2020-10-20 18:26:28 +02:00
Fran Lozano
a0370a6617 Add relation_name to node object in docs generation tests 2020-10-20 18:22:32 +02:00
Jeremy Cohen
eb077fcc75 Merge pull request #2845 from fishtown-analytics/docs/0.19.0-b1
dbt-docs changes for dbt v0.19.0-b1
2020-10-20 09:57:11 -04:00
Jeremy Cohen
c5adc50eed Make flake8 happy 2020-10-19 18:42:23 -04:00
Jeremy Cohen
6e71b6fd31 Include dbt-docs changes for v0.19.0-b1 2020-10-19 18:41:31 -04:00
Gerda Shank
278382589d Merge pull request #2834 from fishtown-analytics/feature/remove_injected_sql
Remove injected_sql. Store non-ephemeral injected_sql in compiled_sql
2020-10-19 18:08:41 -04:00
Gerda Shank
6f0f6cf21a Merge branch 'dev/0.18.1' into dev/kiyoshi-kuromiya 2020-10-19 11:30:52 -04:00
Fran Lozano
01331ed311 Update CHANGELOG.md 2020-10-16 19:08:30 +02:00
Fran Lozano
f638a3d50c Store relation name in manifest's node object 2020-10-16 18:38:22 +02:00
Gerda Shank
512c41dbaf Remove injected_sql. Store non-ephemeral injected_sql in compiled_sql 2020-10-15 11:52:03 -04:00
Github Build Bot
f6bab4adcf Release dbt v0.18.1 2020-10-13 21:31:54 +00:00
Jeremy Cohen
526ecee3da Merge pull request #2832 from fishtown-analytics/fix/colorama-upper-044
Set colorama upper bound to <0.4.4
2020-10-13 17:20:05 -04:00
Jeremy Cohen
1bc9815d53 Set colorama upper bound to <0.4.4 2020-10-13 16:26:10 -04:00
Ran Ever-Hadani
78bd7c9465 Eliminate asterisk from raw_execute to try and fix integration error 2020-10-11 12:06:56 -07:00
Ran Ever-Hadani
d74df8692b Eliminate pep8 errors 2020-10-11 11:37:51 -07:00
Ran Ever-Hadani
eda86412cc Accommodate first round of comments 2020-10-11 11:03:53 -07:00
Ran Ever-Hadani
cce5945fd2 Make partition metadata available to BigQuery users (rebased to dev/kiyoshi-kuromiya) 2020-10-10 17:44:07 -07:00
Drew Banin
72038258ed Merge pull request #2805 from fishtown-analytics/feature/bigquery-oauth-token
Support BigQuery OAuth using a refresh token and client secrets
2020-10-09 14:55:00 -04:00
Drew Banin
056d8fa9ad Merge branch 'dev/kiyoshi-kuromiya' into feature/bigquery-oauth-token 2020-10-09 14:07:27 -04:00
Gerda Shank
3888e0066f Merge pull request #2813 from fishtown-analytics/feature/2510-save-args-run_results
Save args in run_results.json
2020-10-09 13:51:14 -04:00
Drew Banin
ee6571d050 Merge branch 'dev/kiyoshi-kuromiya' into feature/bigquery-oauth-token 2020-10-09 10:15:23 -04:00
Gerda Shank
9472288304 Save args in run_results.json 2020-10-08 17:39:03 -04:00
Jeremy Cohen
fd5e10cfdf Merge pull request #2817 from zmac12/feature/addDebugQueryMethod
Feature/add debug query method
2020-10-08 10:30:31 -04:00
Jeremy Cohen
aeae18ec37 Merge pull request #2821 from joelluijmes/feature/hard-delete-revival
Re-instate hard-deleted records during snapshot
2020-10-08 10:27:42 -04:00
Zach McQuiston
03d3943e99 fixing linter problem in connections.py 2020-10-08 07:47:16 -06:00
Zach McQuiston
214d137672 adding entry to changelog 2020-10-08 07:45:35 -06:00
Joël Luijmes
83db275ddf Added changelog entry 2020-10-08 15:26:51 +02:00
Joël Luijmes
b8f16d081a Remove redundant 'dbt_valid_to is null' checks 2020-10-08 15:24:04 +02:00
Joël Luijmes
675b01ed48 Re-snapshot records that were invalidated through hard-delete 2020-10-08 12:45:35 +02:00
Joël Luijmes
b20224a096 Refactor test for hard-delete snapshots 2020-10-08 12:45:35 +02:00
Zach McQuiston
fd6edfccc4 Removing accidental whitespace addition 2020-10-07 19:30:48 -06:00
Zach McQuiston
4c58438e8a removing errant reference to debug_query method 2020-10-07 19:29:49 -06:00
Zach McQuiston
5ff383a025 Adding type hint for debug_query 2020-10-07 19:25:00 -06:00
Zach McQuiston
dcb6854683 adding debug_query to base/impl.py enabling plugin authors to write their own debug_query 2020-10-07 19:20:16 -06:00
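A hedged sketch of what this commit enables for plugin authors: overriding `debug_query` on an adapter subclass. The class name and query below are illustrative, not a real plugin:

```python
# Illustrative adapter subclass overriding debug_query (the hook this commit adds to
# base/impl.py); the class name and query are assumptions, not dbt's real code.
from dbt.adapters.base import BaseAdapter


class MyWarehouseAdapter(BaseAdapter):
    def debug_query(self) -> None:
        # run whatever lightweight statement best proves connectivity on this warehouse
        self.execute("select 1 as id")
```

The debug task can then call the adapter's `debug_query` to validate a connection with whatever statement makes sense for that warehouse, per the companion commits to the base debug task.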
Drew Banin
e4644bfe5a support providing a token directly; update method name 2020-10-07 15:34:57 -04:00
Jeremy Cohen
93168fef87 Merge pull request #2809 from mescanne/bigquery_invocation_id
Add invocation_id to BigQuery jobs
2020-10-07 10:41:39 -04:00
Jeremy Cohen
9832822bdf Merge branch 'dev/kiyoshi-kuromiya' into bigquery_invocation_id 2020-10-07 09:45:21 -04:00
Mark Scannell
5d91aa3bcd Updated feature request resolution. 2020-10-07 10:55:35 +01:00
Zach McQuiston
354ab5229b adding new debug_query function to base debug task 2020-10-03 16:20:52 -06:00
Zach McQuiston
00de0cd4b5 adding new debug_query function to base adapter 2020-10-03 16:19:48 -06:00
Mark Scannell
26210216da Only set invocation_id if tracking is enabled 2020-10-03 15:32:53 +01:00
Mark Scannell
e29c14a22b updated changelog 2020-10-03 13:09:04 +01:00
Mark Scannell
a6990c8fb8 fix up 2020-10-03 13:04:56 +01:00
Mark Scannell
3e40e71b96 Added dbt_invocation_id to BigQuery jobs 2020-10-03 13:01:25 +01:00
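The mechanism suggested by the commit message is tagging each BigQuery job with dbt's invocation id, presumably as a job label. A sketch with the `google-cloud-bigquery` client (illustrative; not dbt's actual connection manager code):

```python
# Illustrative sketch of attaching an invocation id to a BigQuery job via labels;
# the label key mirrors the commit message, the rest is not dbt's real code path.
import uuid

from google.cloud import bigquery

client = bigquery.Client()
invocation_id = str(uuid.uuid4())

job_config = bigquery.QueryJobConfig(labels={"dbt_invocation_id": invocation_id})
job = client.query("select 1", job_config=job_config)
job.result()
```

Labels like this surface in BigQuery's job metadata, which is what makes per-invocation performance analysis possible.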
Gerda Shank
3f45abe331 Merge pull request #2799 from fishtown-analytics/feature/2765-save-manifest
write manifest when writing run_results
2020-10-01 16:12:15 -04:00
Gerda Shank
6777c62789 Merge pull request #2804 from fishtown-analytics/rpc-test-timeouts
Increase rpc test timeouts to avoid local test failures
2020-10-01 16:10:59 -04:00
Github Build Bot
1aac869738 Merge remote-tracking branch 'origin/releases/0.18.1rc1' into dev/0.18.1 2020-10-01 16:52:51 +00:00
Github Build Bot
493554ea30 Release dbt v0.18.1rc1 2020-10-01 16:39:50 +00:00
Drew Banin
1cf87c639b (#2344) Support BigQuery OAuth using a refresh token and client secrets 2020-09-30 23:04:40 -04:00
Gerda Shank
2cb3d92163 Increase rpc test timeouts to avoid local test failures 2020-09-30 17:34:23 -04:00
Jeremy Cohen
89b6e52a73 Merge pull request #2791 from kingfink/tf/fix-snapshot-compilation-error
Fix snapshot compilation error
2020-09-30 16:54:45 -04:00
Tim Finkel
97407c10ff revert dockerfile 2020-09-30 15:54:50 -04:00
Tim Finkel
81222dadbc Merge branch 'tf/fix-snapshot-compilation-error' of https://github.com/kingfink/dbt into tf/fix-snapshot-compilation-error 2020-09-30 15:53:40 -04:00
Tim Finkel
400555c391 update tests 2020-09-30 15:52:12 -04:00
Tim Finkel
9125b05809 Update CHANGELOG.md 2020-09-30 14:46:45 -04:00
Jeremy Cohen
139b353a28 Merge pull request #2796 from Foxtel-DnA/feature/6434-bq-retry-rate-limit
UPDATE _is_retryable() to handle BQ rateLimitExceeded
2020-09-30 13:09:11 -04:00
Jared Champion (SYD)
fc474a07d0 REBASED on dev/0.18.1; moved CHANGELOG entries 2020-09-30 10:57:03 +10:00
championj-foxtel
8fd8fa09a5 Merge pull request #1 from fishtown-analytics/dev/0.18.1
Dev/0.18.1
2020-09-30 09:56:27 +10:00
Gerda Shank
41ae831d0e write manifest when writing run_results 2020-09-29 16:35:44 -04:00
Tim Finkel
dc7eca4bf9 update changelog 2020-09-25 17:07:05 -04:00
Tim Finkel
fb07149cb7 fix snapshot compilation error 2020-09-25 16:52:14 -04:00
Github Build Bot
b2bd5a5548 Merge remote-tracking branch 'origin/releases/0.18.1b3' into dev/0.18.1 2020-09-25 20:20:21 +00:00
Github Build Bot
aa6b333e79 Release dbt v0.18.1b3 2020-09-25 20:05:31 +00:00
Jeremy Cohen
0cb9740535 Merge pull request #2789 from fishtown-analytics/fix/require-keyring
Fix: require keyring on snowflake
2020-09-25 15:00:12 -04:00
Jeremy Cohen
6b032b49fe Merge branch 'dev/0.18.1' into fix/require-keyring 2020-09-25 14:13:53 -04:00
Jeremy Cohen
35f78ee0f9 Merge pull request #2754 from aiguofer/include_external_tables_in_get_columns_in_relation
Include external tables in get_columns_in_relation redshift adapter
2020-09-25 13:25:06 -04:00
Jeremy Cohen
5ec36df7f0 Merge branch 'dev/0.18.1' into include_external_tables_in_get_columns_in_relation 2020-09-25 12:52:39 -04:00
Jeremy Cohen
f918fd65b6 Merge pull request #2766 from jweibel22/fix/redshift-iam-concurrency-issue
Give each redshift client their own boto session
2020-09-25 12:50:20 -04:00
Jeremy Cohen
d08a39483d PR feedback 2020-09-25 12:11:49 -04:00
Jeremy Cohen
9191f4ff2d Merge branch 'dev/0.18.1' into fix/redshift-iam-concurrency-issue 2020-09-25 12:10:33 -04:00
Jeremy Cohen
b4a83414ac Require optional dep (keyring) on snowflake 2020-09-24 15:49:35 -04:00
Jeremy Cohen
cb0e62576d Merge pull request #2732 from Mr-Nobody99/feature/add-snowflake-last-modified
Added last_altered query to Snowflake catalog macro
2020-09-24 15:29:04 -04:00
Alexander Kutz
e3f557406f Updated test/integration/029_docs_generate_test.py to reflect new stat 2020-09-23 11:08:08 -05:00
Alexander Kutz
a93e288d6a Added missing comma above addition. 2020-09-22 17:48:48 -05:00
Alexander Kutz
8cf9311ced Changed 'BASE_TABLE' to 'BASE TABLE' 2020-09-22 16:04:30 -05:00
Alexander Kutz
713e781473 Reset branch against dev/0.8.1 and re-added changes.
updated changelog.md
2020-09-22 14:54:56 -05:00
Github Build Bot
e265ab67c7 Merge remote-tracking branch 'origin/releases/0.18.1b2' into dev/0.18.1 2020-09-22 14:23:45 +00:00
Github Build Bot
fde1f13b4e Release dbt v0.18.1b2 2020-09-22 14:09:51 +00:00
Jeremy Cohen
9c3839c7e2 Merge pull request #2782 from fishtown-analytics/docs/0.18.1-exposures
[revised] dbt-docs changes for v0.18.1
2020-09-22 09:51:04 -04:00
Jeremy Cohen
c0fd702cc7 Rename reports --> exposures 2020-09-22 08:44:58 -04:00
Jacob Beck
429419c4af Merge pull request #2780 from fishtown-analytics/feature/rename-results-to-exposures
reports -> exposures
2020-09-21 15:15:11 -06:00
Jacob Beck
56ae20602d reports -> exposures 2020-09-21 14:46:48 -06:00
jweibel22
40c6499d3a Update CHANGELOG.md
Co-authored-by: Jacob Beck <beckjake@users.noreply.github.com>
2020-09-20 13:31:43 +02:00
Jimmy Rasmussen
3a78efd83c Add test cases to ensure default boto session is not used 2020-09-20 13:31:15 +02:00
Jimmy Rasmussen
eb33cf75e3 Add entry to CHANGELOG 2020-09-18 10:44:00 +02:00
Jimmy Rasmussen
863d8e6405 Give each redshift client their own boto session
since the boto session is not thread-safe, using the default session in a multi-threaded scenario will result in concurrency errors
2020-09-18 10:29:26 +02:00
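A minimal sketch of the pattern described in the commit body, assuming a hypothetical helper rather than the actual dbt-redshift connection code:

```python
# Hypothetical helper illustrating the fix: build a fresh boto3 Session per
# connection instead of relying on the shared default session, which is not
# thread-safe when multiple dbt threads fetch temporary IAM credentials at once.
import boto3


def get_redshift_client(region_name: str):
    session = boto3.session.Session()  # one Session per connection/thread
    return session.client("redshift", region_name=region_name)


# Problematic form this replaces: boto3.client("redshift", ...), which shares the
# module-level default Session across threads.
```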
Jacob Beck
1fc5a45b9e Merge pull request #2768 from fishtown-analytics/fix/makefile-on-macos
fix the new makefile on macos
2020-09-17 13:42:32 -06:00
Github Build Bot
7751fece35 Merge remote-tracking branch 'origin/releases/0.18.1b1' into dev/0.18.1 2020-09-17 19:09:23 +00:00
Github Build Bot
7670c42462 Release dbt v0.18.1b1 2020-09-17 18:54:44 +00:00
Jacob Beck
b72fc3cd25 fix the new makefile on macos 2020-09-17 11:47:06 -06:00
Diego Fernandez
9c24fc25f5 Add entry to CHANGELOG 2020-09-15 15:11:05 -06:00
Diego Fernandez
4f1a6d56c1 Include external tables in get_columns_in_relation redshift adapter 2020-09-15 15:09:55 -06:00
Drew Banin
1dd4187cd0 Merge branch '0.14.latest' 2019-09-05 14:32:23 -04:00
Connor McArthur
9e36ebdaab Merge branch '0.13.latest' of github.com:fishtown-analytics/dbt 2019-03-21 13:27:24 -04:00
Drew Banin
aaa0127354 Merge pull request #1241 from fishtown-analytics/0.12.latest
Merge 0.12.latest into master
2019-01-15 17:01:16 -05:00
Drew Banin
e60280c4d6 Merge branch '0.12.latest' 2018-11-15 12:24:05 -05:00
Drew Banin
aef7866e29 Update CHANGELOG.md 2018-11-13 10:36:35 -05:00
Drew Banin
70694e3bb9 Merge pull request #1118 from fishtown-analytics/0.12.latest
merge 0.12.latest to master
2018-11-13 10:19:56 -05:00
251 changed files with 7135 additions and 3241 deletions


@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.19.0a1
current_version = 0.19.0
parse = (?P<major>\d+)
\.(?P<minor>\d+)
\.(?P<patch>\d+)


@@ -121,6 +121,45 @@ jobs:
- store_artifacts:
path: ./logs
integration-postgres-py39:
docker: *test_and_postgres
steps:
- checkout
- run: *setupdb
- run:
name: Run tests
command: tox -e integration-postgres-py39
- store_artifacts:
path: ./logs
integration-snowflake-py39:
docker: *test_only
steps:
- checkout
- run:
name: Run tests
command: tox -e integration-snowflake-py39
no_output_timeout: 1h
- store_artifacts:
path: ./logs
integration-redshift-py39:
docker: *test_only
steps:
- checkout
- run:
name: Run tests
command: tox -e integration-redshift-py39
- store_artifacts:
path: ./logs
integration-bigquery-py39:
docker: *test_only
steps:
- checkout
- run:
name: Run tests
command: tox -e integration-bigquery-py39
- store_artifacts:
path: ./logs
workflows:
version: 2
test-everything:
@@ -150,6 +189,18 @@ workflows:
- integration-snowflake-py38:
requires:
- integration-postgres-py38
- integration-postgres-py39:
requires:
- unit
- integration-redshift-py39:
requires:
- integration-postgres-py39
- integration-bigquery-py39:
requires:
- integration-postgres-py39
# - integration-snowflake-py39:
# requires:
# - integration-postgres-py39
- build-wheels:
requires:
- unit
@@ -161,3 +212,7 @@ workflows:
- integration-redshift-py38
- integration-bigquery-py38
- integration-snowflake-py38
- integration-postgres-py39
- integration-redshift-py39
- integration-bigquery-py39
# - integration-snowflake-py39

.gitignore

@@ -8,7 +8,7 @@ __pycache__/
# Distribution / packaging
.Python
env/
env*/
dbt_env/
build/
develop-eggs/


@@ -1,19 +1,173 @@
## dbt 0.19.0 (Release TBD)
## dbt 0.20.0 (Release TBD)
### Breaking changes
- The format for sources.json, run-results.json, manifest.json, and catalog.json has changed to include a common metadata field ([#2761](https://github.com/fishtown-analytics/dbt/issues/2761), [#2778](https://github.com/fishtown-analytics/dbt/pull/2778), [#2763](https://github.com/fishtown-analytics/dbt/issues/2763), [#2784](https://github.com/fishtown-analytics/dbt/pull/2784), [#2764](https://github.com/fishtown-analytics/dbt/issues/2764), [#2785](https://github.com/fishtown-analytics/dbt/pull/2785))
### Fixes
- Fix exit code from dbt debug not returning a failure when one of the tests fails ([#3017](https://github.com/fishtown-analytics/dbt/issues/3017))
- Auto-generated CTEs in tests and ephemeral models have lowercase names to comply with dbt coding conventions ([#3027](https://github.com/fishtown-analytics/dbt/issues/3027), [#3028](https://github.com/fishtown-analytics/dbt/issues/3028))
### Features
- dbt will compare configurations using the un-rendered form of the config block in dbt_project.yml ([#2713](https://github.com/fishtown-analytics/dbt/issues/2713), [#2735](https://github.com/fishtown-analytics/dbt/pull/2735))
- Added state and defer arguments to the RPC client, matching the CLI ([#2678](https://github.com/fishtown-analytics/dbt/issues/2678), [#2736](https://github.com/fishtown-analytics/dbt/pull/2736))
- Added schema and dbt versions to JSON artifacts ([#2670](https://github.com/fishtown-analytics/dbt/issues/2670), [#2767](https://github.com/fishtown-analytics/dbt/pull/2767))
- Added ability to snapshot hard-deleted records (opt-in with `invalidate_hard_deletes` config option). ([#249](https://github.com/fishtown-analytics/dbt/issues/249), [#2749](https://github.com/fishtown-analytics/dbt/pull/2749))
- Improved error messages for YAML selectors ([#2700](https://github.com/fishtown-analytics/dbt/issues/2700), [#2781](https://github.com/fishtown-analytics/dbt/pull/2781))
- Add optional configs for `require_partition_filter` and `partition_expiration_days` in BigQuery ([#1843](https://github.com/fishtown-analytics/dbt/issues/1843), [#2928](https://github.com/fishtown-analytics/dbt/pull/2928))
- Fix for end-of-line SQL comments preventing execution of the entire line ([#2731](https://github.com/fishtown-analytics/dbt/issues/2731), [#2974](https://github.com/fishtown-analytics/dbt/pull/2974))
Contributors:
- [@joelluijmes](https://github.com/joelluijmes) ([#2749](https://github.com/fishtown-analytics/dbt/pull/2749))
- [@yu-iskw](https://github.com/yu-iskw) ([#2928](https://github.com/fishtown-analytics/dbt/pull/2928))
- [@sdebruyn](https://github.com/sdebruyn) / [@lynxcare](https://github.com/lynxcare) ([#3018](https://github.com/fishtown-analytics/dbt/pull/3018))
- [@rvacaru](https://github.com/rvacaru) ([#2974](https://github.com/fishtown-analytics/dbt/pull/2974))
- [@NiallRees](https://github.com/NiallRees) ([#3028](https://github.com/fishtown-analytics/dbt/pull/3028))
## dbt 0.18.1 (Release TBD)
## dbt 0.19.1 (Release TBD)
### Under the hood
- Bump werkzeug upper bound dependency to `<v2.0` ([#3011](https://github.com/fishtown-analytics/dbt/pull/3011))
- Performance fixes across parsing and validation, including libyaml support, cached field mappings, and mashumaro-backed `from_dict`/`to_dict` ([#2862](https://github.com/fishtown-analytics/dbt/issues/2862), [#3034](https://github.com/fishtown-analytics/dbt/pull/3034))
Contributors:
- [@Bl3f](https://github.com/Bl3f) ([#3011](https://github.com/fishtown-analytics/dbt/pull/3011))
## dbt 0.19.0 (January 27, 2021)
## dbt 0.19.0rc3 (January 27, 2021)
### Under the hood
- Cleanup docker resources, use single `docker/Dockerfile` for publishing dbt as a docker image ([dbt-release#3](https://github.com/fishtown-analytics/dbt-release/issues/3), [#3019](https://github.com/fishtown-analytics/dbt/pull/3019))
## dbt 0.19.0rc2 (January 14, 2021)
### Fixes
- Fix regression with defining exposures and other resources with the same name ([#2969](https://github.com/fishtown-analytics/dbt/issues/2969), [#3009](https://github.com/fishtown-analytics/dbt/pull/3009))
- Remove ellipses printed while parsing ([#2971](https://github.com/fishtown-analytics/dbt/issues/2971), [#2996](https://github.com/fishtown-analytics/dbt/pull/2996))
### Under the hood
- Rewrite macro for snapshot_merge_sql to make it compatible with other SQL dialects ([#3003](https://github.com/fishtown-analytics/dbt/pull/3003))
- Rewrite logic in `snapshot_check_strategy()` to make compatible with other SQL dialects ([#3000](https://github.com/fishtown-analytics/dbt/pull/3000), [#3001](https://github.com/fishtown-analytics/dbt/pull/3001))
- Remove version restrictions on `botocore` ([#3006](https://github.com/fishtown-analytics/dbt/pull/3006))
- Include `exposures` in start-of-invocation stdout summary: `Found ...` ([#3007](https://github.com/fishtown-analytics/dbt/pull/3007), [#3008](https://github.com/fishtown-analytics/dbt/pull/3008))
Contributors:
- [@mikaelene](https://github.com/mikaelene) ([#3003](https://github.com/fishtown-analytics/dbt/pull/3003))
- [@dbeatty10](https://github.com/dbeatty10) ([dbt-adapter-tests#10](https://github.com/fishtown-analytics/dbt-adapter-tests/pull/10))
- [@swanderz](https://github.com/swanderz) ([#3000](https://github.com/fishtown-analytics/dbt/pull/3000))
- [@stpierre](https://github.com/stpierre) ([#3006](https://github.com/fishtown-analytics/dbt/pull/3006))
## dbt 0.19.0rc1 (December 29, 2020)
### Breaking changes
- Defer if and only if upstream reference does not exist in current environment namespace ([#2909](https://github.com/fishtown-analytics/dbt/issues/2909), [#2946](https://github.com/fishtown-analytics/dbt/pull/2946))
- Rationalize run result status reporting and clean up artifact schema ([#2493](https://github.com/fishtown-analytics/dbt/issues/2493), [#2943](https://github.com/fishtown-analytics/dbt/pull/2943))
- Add adapter specific query execution info to run results and source freshness results artifacts. Statement call blocks return `response` instead of `status`, and the adapter method `get_status` is now `get_response` ([#2747](https://github.com/fishtown-analytics/dbt/issues/2747), [#2961](https://github.com/fishtown-analytics/dbt/pull/2961))
### Features
- Added macro `get_partitions_metadata(table)` to return partition metadata for BigQuery partitioned tables ([#2552](https://github.com/fishtown-analytics/dbt/pull/2552), [#2596](https://github.com/fishtown-analytics/dbt/pull/2596))
- Added `--defer` flag for `dbt test` as well ([#2701](https://github.com/fishtown-analytics/dbt/issues/2701), [#2954](https://github.com/fishtown-analytics/dbt/pull/2954))
- Added native python `re` module for regex in jinja templates ([#1755](https://github.com/fishtown-analytics/dbt/issues/1755), [#2851](https://github.com/fishtown-analytics/dbt/pull/2851))
- Store resolved node names in manifest ([#2647](https://github.com/fishtown-analytics/dbt/issues/2647), [#2837](https://github.com/fishtown-analytics/dbt/pull/2837))
- Save selectors dictionary to manifest, allow descriptions ([#2693](https://github.com/fishtown-analytics/dbt/issues/2693), [#2866](https://github.com/fishtown-analytics/dbt/pull/2866))
- Normalize cli-style-strings in manifest selectors dictionary ([#2879](https://github.com/fishtown-analytics/dbt/issues/2879), [#2895](https://github.com/fishtown-analytics/dbt/pull/2895))
- Hourly, monthly and yearly partitions available in BigQuery ([#2476](https://github.com/fishtown-analytics/dbt/issues/2476), [#2903](https://github.com/fishtown-analytics/dbt/pull/2903))
- Allow BigQuery to default to the environment's default project ([#2828](https://github.com/fishtown-analytics/dbt/pull/2828), [#2908](https://github.com/fishtown-analytics/dbt/pull/2908))
- Rationalize run result status reporting and clean up artifact schema ([#2493](https://github.com/fishtown-analytics/dbt/issues/2493), [#2943](https://github.com/fishtown-analytics/dbt/pull/2943))
### Fixes
- Respect `--project-dir` in `dbt clean` command ([#2840](https://github.com/fishtown-analytics/dbt/issues/2840), [#2841](https://github.com/fishtown-analytics/dbt/pull/2841))
- Fix Redshift adapter `get_columns_in_relation` macro to push schema filter down to the `svv_external_columns` view ([#2854](https://github.com/fishtown-analytics/dbt/issues/2854), [#2855](https://github.com/fishtown-analytics/dbt/pull/2855))
- Increased the supported relation name length in postgres from 29 to 51 ([#2850](https://github.com/fishtown-analytics/dbt/pull/2850))
- `dbt list` command always returns `0` as exit code ([#2886](https://github.com/fishtown-analytics/dbt/issues/2886), [#2892](https://github.com/fishtown-analytics/dbt/issues/2892))
- Set default `materialized` for test node configs to `test` ([#2806](https://github.com/fishtown-analytics/dbt/issues/2806), [#2902](https://github.com/fishtown-analytics/dbt/pull/2902))
- Allow `docs` blocks in `exposure` descriptions ([#2913](https://github.com/fishtown-analytics/dbt/issues/2913), [#2920](https://github.com/fishtown-analytics/dbt/pull/2920))
- Use original file path instead of absolute path as checksum for big seeds ([#2927](https://github.com/fishtown-analytics/dbt/issues/2927), [#2939](https://github.com/fishtown-analytics/dbt/pull/2939))
- Fix KeyError if deferring to a manifest with a since-deleted source, ephemeral model, or test ([#2875](https://github.com/fishtown-analytics/dbt/issues/2875), [#2958](https://github.com/fishtown-analytics/dbt/pull/2958))
### Under the hood
- Add `unixodbc-dev` package to testing docker image ([#2859](https://github.com/fishtown-analytics/dbt/pull/2859))
- Add event tracking for project parser/load times ([#2823](https://github.com/fishtown-analytics/dbt/issues/2823),[#2893](https://github.com/fishtown-analytics/dbt/pull/2893))
- Bump `cryptography` version to `>= 3.2` and bump snowflake connector to `2.3.6` ([#2896](https://github.com/fishtown-analytics/dbt/issues/2896), [#2922](https://github.com/fishtown-analytics/dbt/issues/2922))
- Widen supported Google Cloud libraries dependencies ([#2794](https://github.com/fishtown-analytics/dbt/pull/2794), [#2877](https://github.com/fishtown-analytics/dbt/pull/2877)).
- Bump `hologram` version to `0.0.11`. Add `scripts/dtr.py` ([#2888](https://github.com/fishtown-analytics/dbt/issues/2888), [#2889](https://github.com/fishtown-analytics/dbt/pull/2889))
- Bump `hologram` version to `0.0.12`. Add testing support for python3.9 ([#2822](https://github.com/fishtown-analytics/dbt/issues/2822),[#2960](https://github.com/fishtown-analytics/dbt/pull/2960))
- Bump the version requirements for `boto3` in dbt-redshift to the upper limit `1.16` to match dbt-redshift and the `snowflake-python-connector` as of version `2.3.6`. ([#2931](https://github.com/fishtown-analytics/dbt/issues/2931), [#2963](https://github.com/fishtown-analytics/dbt/issues/2963))
### Docs
- Fixed issue where data tests with tags were not showing up in graph viz ([docs#147](https://github.com/fishtown-analytics/dbt-docs/issues/147), [docs#157](https://github.com/fishtown-analytics/dbt-docs/pull/157))
Contributors:
- [@feluelle](https://github.com/feluelle) ([#2841](https://github.com/fishtown-analytics/dbt/pull/2841))
- [ran-eh](https://github.com/ran-eh) ([#2596](https://github.com/fishtown-analytics/dbt/pull/2596))
- [@hochoy](https://github.com/hochoy) ([#2851](https://github.com/fishtown-analytics/dbt/pull/2851))
- [@brangisom](https://github.com/brangisom) ([#2855](https://github.com/fishtown-analytics/dbt/pull/2855))
- [@elexisvenator](https://github.com/elexisvenator) ([#2850](https://github.com/fishtown-analytics/dbt/pull/2850))
- [@franloza](https://github.com/franloza) ([#2837](https://github.com/fishtown-analytics/dbt/pull/2837))
- [@max-sixty](https://github.com/max-sixty) ([#2877](https://github.com/fishtown-analytics/dbt/pull/2877), [#2908](https://github.com/fishtown-analytics/dbt/pull/2908))
- [@rsella](https://github.com/rsella) ([#2892](https://github.com/fishtown-analytics/dbt/issues/2892))
- [@joellabes](https://github.com/joellabes) ([#2913](https://github.com/fishtown-analytics/dbt/issues/2913))
- [@plotneishestvo](https://github.com/plotneishestvo) ([#2896](https://github.com/fishtown-analytics/dbt/issues/2896))
- [@db-magnus](https://github.com/db-magnus) ([#2892](https://github.com/fishtown-analytics/dbt/issues/2892))
- [@tyang209](https://github.com/tyang209) ([#2931](https://github.com/fishtown-analytics/dbt/issues/2931))
## dbt 0.19.0b1 (October 21, 2020)
### Breaking changes
- The format for `sources.json`, `run-results.json`, `manifest.json`, and `catalog.json` has changed:
- Each now has a common metadata dictionary ([#2761](https://github.com/fishtown-analytics/dbt/issues/2761), [#2778](https://github.com/fishtown-analytics/dbt/pull/2778)). The contents include: schema and dbt versions ([#2670](https://github.com/fishtown-analytics/dbt/issues/2670), [#2767](https://github.com/fishtown-analytics/dbt/pull/2767)); `invocation_id` ([#2763](https://github.com/fishtown-analytics/dbt/issues/2763), [#2784](https://github.com/fishtown-analytics/dbt/pull/2784)); custom environment variables prefixed with `DBT_ENV_CUSTOM_ENV_` ([#2764](https://github.com/fishtown-analytics/dbt/issues/2764), [#2785](https://github.com/fishtown-analytics/dbt/pull/2785)); cli and rpc arguments in the `run_results.json` ([#2510](https://github.com/fishtown-analytics/dbt/issues/2510), [#2813](https://github.com/fishtown-analytics/dbt/pull/2813)).
- Remove `injected_sql` from manifest nodes, use `compiled_sql` instead ([#2762](https://github.com/fishtown-analytics/dbt/issues/2762), [#2834](https://github.com/fishtown-analytics/dbt/pull/2834))
### Features
- dbt will compare configurations using the un-rendered form of the config block in `dbt_project.yml` ([#2713](https://github.com/fishtown-analytics/dbt/issues/2713), [#2735](https://github.com/fishtown-analytics/dbt/pull/2735))
- Added state and defer arguments to the RPC client, matching the CLI ([#2678](https://github.com/fishtown-analytics/dbt/issues/2678), [#2736](https://github.com/fishtown-analytics/dbt/pull/2736))
- Added ability to snapshot hard-deleted records (opt-in with `invalidate_hard_deletes` config option). ([#249](https://github.com/fishtown-analytics/dbt/issues/249), [#2749](https://github.com/fishtown-analytics/dbt/pull/2749))
- Added revival for snapshotting hard-deleted records. ([#2819](https://github.com/fishtown-analytics/dbt/issues/2819), [#2821](https://github.com/fishtown-analytics/dbt/pull/2821))
- Improved error messages for YAML selectors ([#2700](https://github.com/fishtown-analytics/dbt/issues/2700), [#2781](https://github.com/fishtown-analytics/dbt/pull/2781))
- Added `dbt_invocation_id` for each BigQuery job to enable performance analysis ([#2808](https://github.com/fishtown-analytics/dbt/issues/2808), [#2809](https://github.com/fishtown-analytics/dbt/pull/2809))
- Added support for BigQuery connections using refresh tokens ([#2344](https://github.com/fishtown-analytics/dbt/issues/2344), [#2805](https://github.com/fishtown-analytics/dbt/pull/2805))
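The RPC state/defer addition above mirrors the CLI flags. A hypothetical JSON-RPC request using them might look like this; the method name and parameter keys are assumptions for illustration, not a documented payload:

```python
import json

# Hypothetical rpc request body; "state", "defer" and "models" mirror the
# CLI flags and are illustrative only.
request = {
    "jsonrpc": "2.0",
    "method": "run",
    "id": "request-1",
    "params": {
        "models": "state:modified+",
        "state": "previous-artifacts/",  # directory holding a prior manifest
        "defer": True,
    },
}
print(json.dumps(request, indent=2))
```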
### Under the hood
- Save `manifest.json` at the same time we save the `run_results.json` at the end of a run ([#2765](https://github.com/fishtown-analytics/dbt/issues/2765), [#2799](https://github.com/fishtown-analytics/dbt/pull/2799))
- Added strategy-specific validation to improve the relevancy of compilation errors for the `timestamp` and `check` snapshot strategies. ([#2787](https://github.com/fishtown-analytics/dbt/issues/2787), [#2791](https://github.com/fishtown-analytics/dbt/pull/2791))
- Changed rpc test timeouts to avoid locally run test failures ([#2803](https://github.com/fishtown-analytics/dbt/issues/2803), [#2804](https://github.com/fishtown-analytics/dbt/pull/2804))
- Added a `debug_query` on the base adapter that will allow plugin authors to create custom debug queries (see the override sketch after this list) ([#2751](https://github.com/fishtown-analytics/dbt/issues/2751), [#2817](https://github.com/fishtown-analytics/dbt/pull/2817))
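A minimal, self-contained sketch of the `debug_query` hook, using stand-in classes rather than dbt's real adapter hierarchy (the base implementation in this changeset simply runs `select 1 as id`):

```python
# Stand-in classes illustrating the debug_query hook.
class ToyBaseAdapter:
    def execute(self, sql: str):
        print(f"executing: {sql}")
        return "OK", []

    def debug_query(self) -> None:
        # default connectivity probe used by `dbt debug`
        self.execute("select 1 as id")


class ToyPluginAdapter(ToyBaseAdapter):
    def debug_query(self) -> None:
        # a plugin author can substitute a dialect-specific probe
        self.execute("select 1 from dual")


ToyPluginAdapter().debug_query()
```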
### Docs
- Add select/deselect option in DAG view dropups. ([docs#98](https://github.com/fishtown-analytics/dbt-docs/issues/98), [docs#138](https://github.com/fishtown-analytics/dbt-docs/pull/138))
- Fixed issue where sources with tags were not showing up in graph viz ([docs#93](https://github.com/fishtown-analytics/dbt-docs/issues/93), [docs#139](https://github.com/fishtown-analytics/dbt-docs/pull/139))
- Use `compiled_sql` instead of `injected_sql` for "Compiled" ([docs#146](https://github.com/fishtown-analytics/dbt-docs/issues/146), [docs#148](https://github.com/fishtown-analytics/dbt-docs/issues/148))
Contributors:
- [@joelluijmes](https://github.com/joelluijmes) ([#2749](https://github.com/fishtown-analytics/dbt/pull/2749), [#2821](https://github.com/fishtown-analytics/dbt/pull/2821))
- [@kingfink](https://github.com/kingfink) ([#2791](https://github.com/fishtown-analytics/dbt/pull/2791))
- [@zmac12](https://github.com/zmac12) ([#2817](https://github.com/fishtown-analytics/dbt/pull/2817))
- [@Mr-Nobody99](https://github.com/Mr-Nobody99) ([docs#138](https://github.com/fishtown-analytics/dbt-docs/pull/138))
- [@jplynch77](https://github.com/jplynch77) ([docs#139](https://github.com/fishtown-analytics/dbt-docs/pull/139))
## dbt 0.18.1 (October 13, 2020)
## dbt 0.18.1rc1 (October 01, 2020)
### Features
- Added retry support for the rateLimitExceeded error from BigQuery ([#2795](https://github.com/fishtown-analytics/dbt/issues/2795), [#2796](https://github.com/fishtown-analytics/dbt/issues/2796))
Contributors:
- [@championj-foxtel](https://github.com/championj-foxtel) ([#2796](https://github.com/fishtown-analytics/dbt/issues/2796))
## dbt 0.18.1b3 (September 25, 2020)
### Features
- Added a 'Last Modified' stat to the Snowflake catalog macro, so it now appears in the generated docs. ([#2728](https://github.com/fishtown-analytics/dbt/issues/2728))
### Fixes
- `dbt compile` and `dbt run` failed with `KeyError: 'endpoint_resolver'` when threads > 1 and `method: iam` had been specified in profiles.yml ([#2756](https://github.com/fishtown-analytics/dbt/issues/2756), [#2766](https://github.com/fishtown-analytics/dbt/pull/2766))
- Fix Redshift adapter to include columns from external tables when using the get_columns_in_relation macro ([#2753](https://github.com/fishtown-analytics/dbt/issues/2753), [#2754](https://github.com/fishtown-analytics/dbt/pull/2754))
### Under the hood
- Require extra `snowflake-connector-python[secure-local-storage]` on all dbt-snowflake installations ([#2779](https://github.com/fishtown-analytics/dbt/issues/2779), [#2789](https://github.com/fishtown-analytics/dbt/pull/2789))
Contributors:
- [@Mr-Nobody99](https://github.com/Mr-Nobody99) ([#2732](https://github.com/fishtown-analytics/dbt/pull/2732))
- [@jweibel22](https://github.com/jweibel22) ([#2766](https://github.com/fishtown-analytics/dbt/pull/2766))
- [@aiguofer](https://github.com/aiguofer) ([#2754](https://github.com/fishtown-analytics/dbt/pull/2754))
## dbt 0.18.1b1 (September 17, 2020)
### Under the hood
- If column config says quote, use quoting in SQL for adding a comment. ([#2539](https://github.com/fishtown-analytics/dbt/issues/2539), [#2733](https://github.com/fishtown-analytics/dbt/pull/2733))
@@ -21,16 +175,15 @@ Contributors:
### Features
- Specify all three logging levels (`INFO`, `WARNING`, `ERROR`) in result logs for commands `test`, `seed`, `run`, `snapshot` and `source snapshot-freshness` ([#2680](https://github.com/fishtown-analytics/dbt/pull/2680), [#2723](https://github.com/fishtown-analytics/dbt/pull/2723))
- Added "reports" ([#2730](https://github.com/fishtown-analytics/dbt/issues/2730), [#2752](https://github.com/fishtown-analytics/dbt/pull/2752))
- Added "exposures" ([#2730](https://github.com/fishtown-analytics/dbt/issues/2730), [#2752](https://github.com/fishtown-analytics/dbt/pull/2752), [#2777](https://github.com/fishtown-analytics/dbt/issues/2777))
### Docs
- Add Report nodes ([docs#135](https://github.com/fishtown-analytics/dbt-docs/issues/135), [docs#136](https://github.com/fishtown-analytics/dbt-docs/pull/136))
- Add Exposure nodes ([docs#135](https://github.com/fishtown-analytics/dbt-docs/issues/135), [docs#136](https://github.com/fishtown-analytics/dbt-docs/pull/136), [docs#137](https://github.com/fishtown-analytics/dbt-docs/pull/137))
Contributors:
- [@tpilewicz](https://github.com/tpilewicz) ([#2723](https://github.com/fishtown-analytics/dbt/pull/2723))
- [@heisencoder](https://github.com/heisencoder) ([#2739](https://github.com/fishtown-analytics/dbt/issues/2739))
## dbt 0.18.0 (September 03, 2020)
### Under the hood
@@ -86,7 +239,6 @@ Contributors:
- Add relevance criteria to site search ([docs#113](https://github.com/fishtown-analytics/dbt-docs/pull/113))
- Support new selector methods, intersection, and arbitrary parent/child depth in DAG selection syntax ([docs#118](https://github.com/fishtown-analytics/dbt-docs/pull/118))
- Revise anonymous event tracking: simpler URL fuzzing; differentiate between Cloud-hosted and non-Cloud docs ([docs#121](https://github.com/fishtown-analytics/dbt-docs/pull/121))
Contributors:
- [@bbhoss](https://github.com/bbhoss) ([#2677](https://github.com/fishtown-analytics/dbt/pull/2677))
- [@kconvey](https://github.com/kconvey) ([#2694](https://github.com/fishtown-analytics/dbt/pull/2694), [#2709](https://github.com/fishtown-analytics/dbt/pull/2709), [#2711](https://github.com/fishtown-analytics/dbt/pull/2711))
@@ -806,7 +958,6 @@ Thanks for your contributions to dbt!
- [@bastienboutonnet](https://github.com/bastienboutonnet) ([#1591](https://github.com/fishtown-analytics/dbt/pull/1591), [#1689](https://github.com/fishtown-analytics/dbt/pull/1689))
## dbt 0.14.0 - Wilt Chamberlain (July 10, 2019)
### Overview


@@ -1,51 +0,0 @@
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && \
apt-get dist-upgrade -y && \
apt-get install -y --no-install-recommends \
netcat postgresql curl git ssh software-properties-common \
make build-essential ca-certificates libpq-dev \
libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit libyaml-dev \
&& \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get install -y \
python python-dev python-pip \
python3.6 python3.6-dev python3-pip python3.6-venv \
python3.7 python3.7-dev python3.7-venv \
python3.8 python3.8-dev python3.8-venv \
python3.9 python3.9-dev python3.9-venv && \
apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ARG DOCKERIZE_VERSION=v0.6.1
RUN curl -LO https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz && \
tar -C /usr/local/bin -xzvf dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz && \
rm dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz
RUN pip3 install -U "tox==3.14.4" wheel "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# tox fails if the 'python' interpreter (python2) doesn't have `tox` installed
RUN pip install -U "tox==3.14.4" "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# These args are passed in via docker-compose, which reads them from the .env file.
# On Linux, run `make .env` to create the .env file for the current user.
# On MacOS and Windows, these can stay unset.
ARG USER_ID
ARG GROUP_ID
RUN if [ ${USER_ID:-0} -ne 0 ] && [ ${GROUP_ID:-0} -ne 0 ]; then \
groupadd -g ${GROUP_ID} dbt_test_user && \
useradd -m -l -u ${USER_ID} -g ${GROUP_ID} dbt_test_user; \
else \
useradd -mU -l dbt_test_user; \
fi
RUN mkdir /usr/app && chown dbt_test_user /usr/app
RUN mkdir /home/tox && chown dbt_test_user /home/tox
WORKDIR /usr/app
VOLUME /usr/app
USER dbt_test_user
ENV PYTHONIOENCODING=utf-8
ENV LANG C.UTF-8

Dockerfile.test Normal file

@@ -0,0 +1,74 @@
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
netcat \
postgresql \
curl \
git \
ssh \
software-properties-common \
make \
build-essential \
ca-certificates \
libpq-dev \
libsasl2-dev \
libsasl2-2 \
libsasl2-modules-gssapi-mit \
libyaml-dev \
unixodbc-dev \
&& add-apt-repository ppa:deadsnakes/ppa \
&& apt-get install -y \
python \
python-dev \
python-pip \
python3.6 \
python3.6-dev \
python3-pip \
python3.6-venv \
python3.7 \
python3.7-dev \
python3.7-venv \
python3.8 \
python3.8-dev \
python3.8-venv \
python3.9 \
python3.9-dev \
python3.9-venv \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ARG DOCKERIZE_VERSION=v0.6.1
RUN curl -LO https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& tar -C /usr/local/bin -xzvf dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& rm dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz
RUN pip3 install -U "tox==3.14.4" wheel "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# tox fails if the 'python' interpreter (python2) doesn't have `tox` installed
RUN pip install -U "tox==3.14.4" "six>=1.14.0,<1.15.0" "virtualenv==20.0.3" setuptools
# These args are passed in via docker-compose, which reads them from the .env file.
# On Linux, run `make .env` to create the .env file for the current user.
# On MacOS and Windows, these can stay unset.
ARG USER_ID
ARG GROUP_ID
RUN if [ ${USER_ID:-0} -ne 0 ] && [ ${GROUP_ID:-0} -ne 0 ]; then \
groupadd -g ${GROUP_ID} dbt_test_user && \
useradd -m -l -u ${USER_ID} -g ${GROUP_ID} dbt_test_user; \
else \
useradd -mU -l dbt_test_user; \
fi
RUN mkdir /usr/app && chown dbt_test_user /usr/app
RUN mkdir /home/tox && chown dbt_test_user /home/tox
WORKDIR /usr/app
VOLUME /usr/app
USER dbt_test_user
ENV PYTHONIOENCODING=utf-8
ENV LANG C.UTF-8


@@ -7,25 +7,30 @@ install:
test: .env
@echo "Full test run starting..."
@time docker-compose run test tox
@time docker-compose run --rm test tox
test-unit: .env
@echo "Unit test run starting..."
@time docker-compose run test tox -e unit-py36,flake8
@time docker-compose run --rm test tox -e unit-py36,flake8
test-integration: .env
@echo "Integration test run starting..."
@time docker-compose run test tox -e integration-postgres-py36,integration-redshift-py36,integration-snowflake-py36,integration-bigquery-py36
@time docker-compose run --rm test tox -e integration-postgres-py36,integration-redshift-py36,integration-snowflake-py36,integration-bigquery-py36
test-quick: .env
@echo "Integration test run starting..."
@time docker-compose run test tox -e integration-postgres-py36 -- -x
@time docker-compose run --rm test tox -e integration-postgres-py36 -- -x
# This rule creates a file named .env that is used by docker-compose for passing
# the USER_ID and GROUP_ID arguments to the Docker image.
.env:
@touch .env
ifneq ($(OS),Windows_NT)
ifneq ($(shell uname -s), Darwin)
@echo USER_ID=$(shell id -u) > .env
@echo GROUP_ID=$(shell id -g) >> .env
endif
endif
@time docker-compose build
clean:

core/Cargo.lock generated Normal file

@@ -0,0 +1,241 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "ctor"
version = "0.1.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8f45d9ad417bcef4817d614a501ab55cdd96a6fdb24f49aab89a54acfd66b19"
dependencies = [
"quote",
"syn",
]
[[package]]
name = "extensions-tracking"
version = "0.1.0"
dependencies = [
"pyo3",
]
[[package]]
name = "ghost"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1a5bcf1bbeab73aa4cf2fde60a846858dc036163c7c33bec309f8d17de785479"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "indoc"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e5a75aeaaef0ce18b58056d306c27b07436fbb34b8816c53094b76dd81803136"
dependencies = [
"unindent",
]
[[package]]
name = "instant"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "61124eeebbd69b8190558df225adf7e4caafce0d743919e5d6b19652314ec5ec"
dependencies = [
"cfg-if",
]
[[package]]
name = "inventory"
version = "0.1.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0f0f7efb804ec95e33db9ad49e4252f049e37e8b0a4652e3cd61f7999f2eff7f"
dependencies = [
"ctor",
"ghost",
"inventory-impl",
]
[[package]]
name = "inventory-impl"
version = "0.1.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75c094e94816723ab936484666968f5b58060492e880f3c8d00489a1e244fa51"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "libc"
version = "0.2.86"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b7282d924be3275cec7f6756ff4121987bc6481325397dde6ba3e7802b1a8b1c"
[[package]]
name = "lock_api"
version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dd96ffd135b2fd7b973ac026d28085defbe8983df057ced3eb4f2130b0831312"
dependencies = [
"scopeguard",
]
[[package]]
name = "parking_lot"
version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6d7744ac029df22dca6284efe4e898991d28e3085c706c972bcd7da4a27a15eb"
dependencies = [
"instant",
"lock_api",
"parking_lot_core",
]
[[package]]
name = "parking_lot_core"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9ccb628cad4f84851442432c60ad8e1f607e29752d0bf072cbd0baf28aa34272"
dependencies = [
"cfg-if",
"instant",
"libc",
"redox_syscall",
"smallvec",
"winapi",
]
[[package]]
name = "paste"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c5d65c4d95931acda4498f675e332fcbdc9a06705cd07086c510e9b6009cd1c1"
[[package]]
name = "proc-macro2"
version = "1.0.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1e0704ee1a7e00d7bb417d0770ea303c1bccbabf0ef1667dae92b5967f5f8a71"
dependencies = [
"unicode-xid",
]
[[package]]
name = "pyo3"
version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "00ca634cf3acd58a599b535ed6cb188223298977d471d146121792bfa23b754c"
dependencies = [
"cfg-if",
"ctor",
"indoc",
"inventory",
"libc",
"parking_lot",
"paste",
"pyo3-macros",
"unindent",
]
[[package]]
name = "pyo3-macros"
version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "483ac516dbda6789a5b4be0271e7a31b9ad4ec8c0a5955050e8076f72bdbef8f"
dependencies = [
"pyo3-macros-backend",
"quote",
"syn",
]
[[package]]
name = "pyo3-macros-backend"
version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "15230cabcda008f03565ed8bac40f094cbb5ee1b46e6551f1ec3a0e922cf7df9"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "quote"
version = "1.0.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "991431c3519a3f36861882da93630ce66b52918dcf1b8e2fd66b397fc96f28df"
dependencies = [
"proc-macro2",
]
[[package]]
name = "redox_syscall"
version = "0.1.57"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41cc0f7e4d5d4544e8861606a285bb08d3e70712ccc7d2b84d7c0ccfaf4b05ce"
[[package]]
name = "scopeguard"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d29ab0c6d3fc0ee92fe66e2d99f700eab17a8d57d1c1d3b748380fb20baa78cd"
[[package]]
name = "smallvec"
version = "1.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fe0f37c9e8f3c5a4a66ad655a93c74daac4ad00c441533bf5c6e7990bb42604e"
[[package]]
name = "syn"
version = "1.0.60"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c700597eca8a5a762beb35753ef6b94df201c81cca676604f547495a0d7f0081"
dependencies = [
"proc-macro2",
"quote",
"unicode-xid",
]
[[package]]
name = "unicode-xid"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f7fe0bb3479651439c9112f72b6c505038574c9fbb575ed1bf3b797fa39dd564"
[[package]]
name = "unindent"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f14ee04d9415b52b3aeab06258a3f07093182b88ba0f9b8d203f211a7a7d41c7"
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"

core/Cargo.toml Normal file

@@ -0,0 +1,2 @@
[workspace]
members = [ "extensions/tracking",]


@@ -1 +1,3 @@
recursive-include dbt/include *.py *.sql *.yml *.html *.md
recursive-include extensions *
include Cargo.toml


@@ -1,14 +1,12 @@
from dataclasses import dataclass
import re
from hologram import JsonSchemaMixin
from dbt.exceptions import RuntimeException
from typing import Dict, ClassVar, Any, Optional
from dbt.exceptions import RuntimeException
@dataclass
class Column(JsonSchemaMixin):
class Column:
TYPE_LABELS: ClassVar[Dict[str, str]] = {
'STRING': 'TEXT',
'TIMESTAMP': 'TIMESTAMP',


@@ -4,14 +4,15 @@ import os
from multiprocessing.synchronize import RLock
from threading import get_ident
from typing import (
Dict, Tuple, Hashable, Optional, ContextManager, List
Dict, Tuple, Hashable, Optional, ContextManager, List, Union
)
import agate
import dbt.exceptions
from dbt.contracts.connection import (
Connection, Identifier, ConnectionState, AdapterRequiredConfig, LazyHandle
Connection, Identifier, ConnectionState,
AdapterRequiredConfig, LazyHandle, AdapterResponse
)
from dbt.contracts.graph.manifest import Manifest
from dbt.adapters.base.query_headers import (
@@ -290,7 +291,7 @@ class BaseConnectionManager(metaclass=abc.ABCMeta):
@abc.abstractmethod
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
"""Execute the given SQL.
:param str sql: The sql to execute.
@@ -298,7 +299,7 @@ class BaseConnectionManager(metaclass=abc.ABCMeta):
transaction, automatically begin one.
:param bool fetch: If set, fetch results.
:return: A tuple of the status and the results (empty if fetch=False).
:rtype: Tuple[str, agate.Table]
:rtype: Tuple[Union[str, AdapterResponse], agate.Table]
"""
raise dbt.exceptions.NotImplementedException(
'`execute` is not implemented for this adapter!'


@@ -28,14 +28,14 @@ from dbt.clients.jinja import MacroGenerator
from dbt.contracts.graph.compiled import (
CompileResultNode, CompiledSeedNode
)
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.manifest import Manifest, MacroManifest
from dbt.contracts.graph.parsed import ParsedSeedNode
from dbt.exceptions import warn_or_error
from dbt.node_types import NodeType
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.utils import filter_null_values, executor
from dbt.adapters.base.connections import Connection
from dbt.adapters.base.connections import Connection, AdapterResponse
from dbt.adapters.base.meta import AdapterMeta, available
from dbt.adapters.base.relation import (
ComponentName, BaseRelation, InformationSchema, SchemaSearchMap
@@ -160,7 +160,7 @@ class BaseAdapter(metaclass=AdapterMeta):
self.config = config
self.cache = RelationsCache()
self.connections = self.ConnectionManager(config)
self._macro_manifest_lazy: Optional[Manifest] = None
self._macro_manifest_lazy: Optional[MacroManifest] = None
###
# Methods that pass through to the connection manager
@@ -180,6 +180,9 @@ class BaseAdapter(metaclass=AdapterMeta):
def commit_if_has_connection(self) -> None:
self.connections.commit_if_has_connection()
def debug_query(self) -> None:
self.execute('select 1 as id')
def nice_connection_name(self) -> str:
conn = self.connections.get_if_exists()
if conn is None or conn.name is None:
@@ -210,7 +213,7 @@ class BaseAdapter(metaclass=AdapterMeta):
@available.parse(lambda *a, **k: ('', empty_table()))
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
"""Execute the given SQL. This is a thin wrapper around
ConnectionManager.execute.
@@ -219,7 +222,7 @@ class BaseAdapter(metaclass=AdapterMeta):
transaction, automatically begin one.
:param bool fetch: If set, fetch results.
:return: A tuple of the status and the results (empty if fetch=False).
:rtype: Tuple[str, agate.Table]
:rtype: Tuple[Union[str, AdapterResponse], agate.Table]
"""
return self.connections.execute(
sql=sql,
@@ -227,6 +230,21 @@ class BaseAdapter(metaclass=AdapterMeta):
fetch=fetch
)
@available.parse(lambda *a, **k: ('', empty_table()))
def get_partitions_metadata(
self, table: str
) -> Tuple[agate.Table]:
"""Obtain partitions metadata for a BigQuery partitioned table.
:param str table: a partitioned table id, in standard SQL format.
:return: a partition metadata tuple, as described in
https://cloud.google.com/bigquery/docs/creating-partitioned-tables#getting_partition_metadata_using_meta_tables.
:rtype: agate.Table
"""
return self.connections.get_partitions_metadata(
table=table
)
###
# Methods that should never be overridden
###
@@ -241,18 +259,18 @@ class BaseAdapter(metaclass=AdapterMeta):
return cls.ConnectionManager.TYPE
@property
def _macro_manifest(self) -> Manifest:
def _macro_manifest(self) -> MacroManifest:
if self._macro_manifest_lazy is None:
return self.load_macro_manifest()
return self._macro_manifest_lazy
def check_macro_manifest(self) -> Optional[Manifest]:
def check_macro_manifest(self) -> Optional[MacroManifest]:
"""Return the internal manifest (used for executing macros) if it's
been initialized, otherwise return None.
"""
return self._macro_manifest_lazy
def load_macro_manifest(self) -> Manifest:
def load_macro_manifest(self) -> MacroManifest:
if self._macro_manifest_lazy is None:
# avoid a circular import
from dbt.parser.manifest import load_macro_manifest


@@ -21,8 +21,8 @@ Self = TypeVar('Self', bound='BaseRelation')
@dataclass(frozen=True, eq=False, repr=False)
class BaseRelation(FakeAPIObject, Hashable):
type: Optional[RelationType]
path: Path
type: Optional[RelationType] = None
quote_character: str = '"'
include_policy: Policy = Policy()
quote_policy: Policy = Policy()
@@ -203,7 +203,7 @@ class BaseRelation(FakeAPIObject, Hashable):
@staticmethod
def add_ephemeral_prefix(name: str):
return f'__dbt__CTE__{name}'
return f'__dbt__cte__{name}'
@classmethod
def create_ephemeral_from_node(


@@ -7,7 +7,9 @@ from typing_extensions import Protocol
import agate
from dbt.contracts.connection import Connection, AdapterRequiredConfig
from dbt.contracts.connection import (
Connection, AdapterRequiredConfig, AdapterResponse
)
from dbt.contracts.graph.compiled import (
CompiledNode, ManifestNode, NonSourceCompiledNode
)
@@ -154,7 +156,7 @@ class AdapterProtocol(
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[str, AdapterResponse], agate.Table]:
...
def get_compiler(self) -> Compiler_T:


@@ -1,13 +1,15 @@
import abc
import time
from typing import List, Optional, Tuple, Any, Iterable, Dict
from typing import List, Optional, Tuple, Any, Iterable, Dict, Union
import agate
import dbt.clients.agate_helper
import dbt.exceptions
from dbt.adapters.base import BaseConnectionManager
from dbt.contracts.connection import Connection, ConnectionState
from dbt.contracts.connection import (
Connection, ConnectionState, AdapterResponse
)
from dbt.logger import GLOBAL_LOGGER as logger
from dbt import flags
@@ -18,7 +20,7 @@ class SQLConnectionManager(BaseConnectionManager):
Methods to implement:
- exception_handler
- cancel
- get_status
- get_response
- open
"""
@abc.abstractmethod
@@ -76,20 +78,19 @@ class SQLConnectionManager(BaseConnectionManager):
cursor = connection.handle.cursor()
cursor.execute(sql, bindings)
logger.debug(
"SQL status: {status} in {elapsed:0.2f} seconds",
status=self.get_status(cursor),
status=self.get_response(cursor),
elapsed=(time.time() - pre)
)
return connection, cursor
@abc.abstractclassmethod
def get_status(cls, cursor: Any) -> str:
def get_response(cls, cursor: Any) -> Union[AdapterResponse, str]:
"""Get the status of the cursor."""
raise dbt.exceptions.NotImplementedException(
'`get_status` is not implemented for this adapter!'
'`get_response` is not implemented for this adapter!'
)
@classmethod
@@ -118,15 +119,15 @@ class SQLConnectionManager(BaseConnectionManager):
def execute(
self, sql: str, auto_begin: bool = False, fetch: bool = False
) -> Tuple[str, agate.Table]:
) -> Tuple[Union[AdapterResponse, str], agate.Table]:
sql = self._add_query_comment(sql)
_, cursor = self.add_query(sql, auto_begin)
status = self.get_status(cursor)
response = self.get_response(cursor)
if fetch:
table = self.get_result_from_cursor(cursor)
else:
table = dbt.clients.agate_helper.empty_table()
return status, table
return response, table
def add_begin_query(self):
return self.add_query('BEGIN', auto_begin=False)
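To make the `get_status` → `get_response` change concrete, here is a self-contained sketch of the pattern; `AdapterResponse` below is a stand-in whose fields are believed to mirror `dbt.contracts.connection.AdapterResponse`, but it is reproduced here only for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdapterResponse:
    _message: str
    code: Optional[str] = None
    rows_affected: Optional[int] = None

    def __str__(self) -> str:
        return self._message


class ToyConnectionManager:
    @classmethod
    def get_response(cls, cursor) -> AdapterResponse:
        # adapters now return a structured response object instead of a bare
        # status string, so callers can inspect rows_affected, codes, etc.
        return AdapterResponse(
            _message=f"SUCCESS {cursor.rowcount}",
            rows_affected=cursor.rowcount,
        )


class FakeCursor:
    rowcount = 3


print(ToyConnectionManager.get_response(FakeCursor()))
```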


@@ -231,6 +231,7 @@ class BaseMacroGenerator:
template = self.get_template()
# make the module. previously we set both vars and local, but that's
# redundant: They both end up in the same place
# make_module is in jinja2.environment. It returns a TemplateModule
module = template.make_module(vars=self.context, shared=False)
macro = module.__dict__[get_dbt_macro_name(name)]
module.__dict__.update(self.context)
@@ -244,6 +245,7 @@ class BaseMacroGenerator:
raise_compiler_error(str(e))
def call_macro(self, *args, **kwargs):
# called from __call__ methods
if self.context is None:
raise InternalException(
'Context is still None in call_macro!'
@@ -306,8 +308,10 @@ class MacroGenerator(BaseMacroGenerator):
e.stack.append(self.macro)
raise e
# This adds the macro's unique id to the node's 'depends_on'
@contextmanager
def track_call(self):
# This is only called from __call__
if self.stack is None or self.node is None:
yield
else:
@@ -322,6 +326,7 @@ class MacroGenerator(BaseMacroGenerator):
finally:
self.stack.pop(unique_id)
# this makes MacroGenerator objects callable like functions
def __call__(self, *args, **kwargs):
with self.track_call():
return self.call_macro(*args, **kwargs)


@@ -438,7 +438,9 @@ def run_cmd(
return out, err
def download(url: str, path: str, timeout: Union[float, tuple] = None) -> None:
def download(
url: str, path: str, timeout: Optional[Union[float, tuple]] = None
) -> None:
path = convert_path(path)
connection_timeout = timeout or float(os.getenv('DBT_HTTP_TIMEOUT', 10))
response = requests.get(url, timeout=connection_timeout)


@@ -1,16 +1,19 @@
from typing import Any
import dbt.exceptions
import yaml
import yaml.scanner
# the C version is faster, but it doesn't always exist
YamlLoader: Any
try:
from yaml import CSafeLoader as YamlLoader
from yaml import (
CLoader as Loader,
CSafeLoader as SafeLoader,
CDumper as Dumper
)
except ImportError:
from yaml import SafeLoader as YamlLoader
from yaml import ( # type: ignore # noqa: F401
Loader, SafeLoader, Dumper
)
YAML_ERROR_MESSAGE = """
@@ -54,7 +57,7 @@ def contextualized_yaml_error(raw_contents, error):
def safe_load(contents):
return yaml.load(contents, Loader=YamlLoader)
return yaml.load(contents, Loader=SafeLoader)
def load_yaml_text(contents):


@@ -52,6 +52,7 @@ def print_compile_stats(stats):
NodeType.Operation: 'operation',
NodeType.Seed: 'seed file',
NodeType.Source: 'source',
NodeType.Exposure: 'exposure',
}
results = {k: 0 for k in names.keys()}
@@ -81,6 +82,8 @@ def _generate_stats(manifest: Manifest):
for source in manifest.sources.values():
stats[source.resource_type] += 1
for exposure in manifest.exposures.values():
stats[exposure.resource_type] += 1
for macro in manifest.macros.values():
stats[macro.resource_type] += 1
return stats
@@ -148,12 +151,15 @@ class Compiler:
make_directory(self.config.target_path)
make_directory(self.config.modules_path)
# creates a ModelContext which is converted to
# a dict for jinja rendering of SQL
def _create_node_context(
self,
node: NonSourceCompiledNode,
manifest: Manifest,
extra_context: Dict[str, Any],
) -> Dict[str, Any]:
context = generate_runtime_model(
node, self.config, manifest
)
@@ -169,35 +175,14 @@ class Compiler:
relation_cls = adapter.Relation
return relation_cls.add_ephemeral_prefix(name)
def _get_compiled_model(
self,
manifest: Manifest,
cte_id: str,
extra_context: Dict[str, Any],
) -> NonSourceCompiledNode:
if cte_id not in manifest.nodes:
raise InternalException(
f'During compilation, found a cte reference that could not be '
f'resolved: {cte_id}'
)
cte_model = manifest.nodes[cte_id]
if getattr(cte_model, 'compiled', False):
assert isinstance(cte_model, tuple(COMPILED_TYPES.values()))
return cast(NonSourceCompiledNode, cte_model)
elif cte_model.is_ephemeral_model:
# this must be some kind of parsed node that we can compile.
# we know it's not a parsed source definition
assert isinstance(cte_model, tuple(COMPILED_TYPES))
# update the node so
node = self.compile_node(cte_model, manifest, extra_context)
manifest.sync_update_node(node)
return node
else:
raise InternalException(
f'During compilation, found an uncompiled cte that '
f'was not an ephemeral model: {cte_id}'
)
def _get_relation_name(self, node: ParsedNode):
relation_name = None
if (node.resource_type in NodeType.refable() and
not node.is_ephemeral_model):
adapter = get_adapter(self.config)
relation_cls = adapter.Relation
relation_name = str(relation_cls.create_from(self.config, node))
return relation_name
def _inject_ctes_into_sql(self, sql: str, ctes: List[InjectedCTE]) -> str:
"""
@@ -206,11 +191,11 @@ class Compiler:
[
InjectedCTE(
id="cte_id_1",
sql="__dbt__CTE__ephemeral as (select * from table)",
sql="__dbt__cte__ephemeral as (select * from table)",
),
InjectedCTE(
id="cte_id_2",
sql="__dbt__CTE__events as (select id, type from events)",
sql="__dbt__cte__events as (select id, type from events)",
),
]
@@ -221,8 +206,8 @@ class Compiler:
This will spit out:
"with __dbt__CTE__ephemeral as (select * from table),
__dbt__CTE__events as (select id, type from events),
"with __dbt__cte__ephemeral as (select * from table),
__dbt__cte__events as (select id, type from events),
with internal_cte as (select * from sessions)
select * from internal_cte"
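The hunk above also shows the ephemeral CTE prefix being lowercased from `__dbt__CTE__` to `__dbt__cte__`. As a toy sketch of the injection step itself (dbt's real `_inject_ctes_into_sql` parses the statement and merges with any existing WITH clause, which this deliberately ignores):

```python
from typing import List


def inject_ctes(sql: str, ctes: List[str]) -> str:
    # Toy version: prepend a WITH clause listing the ephemeral CTEs.
    if not ctes:
        return sql
    return "with " + ",\n".join(ctes) + "\n" + sql


print(inject_ctes(
    "select id, type from __dbt__cte__events",
    ["__dbt__cte__events as (select id, type from events)"],
))
```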
@@ -260,85 +245,119 @@ class Compiler:
return str(parsed)
def _model_prepend_ctes(
self,
model: NonSourceCompiledNode,
prepended_ctes: List[InjectedCTE]
) -> NonSourceCompiledNode:
if model.compiled_sql is None:
raise RuntimeException(
'Cannot prepend ctes to an unparsed node', model
)
injected_sql = self._inject_ctes_into_sql(
model.compiled_sql,
prepended_ctes,
)
model.extra_ctes_injected = True
model.extra_ctes = prepended_ctes
model.injected_sql = injected_sql
model.validate(model.to_dict())
return model
def _get_dbt_test_name(self) -> str:
return 'dbt__CTE__INTERNAL_test'
return 'dbt__cte__internal_test'
# This method is called by the 'compile_node' method. Starting
# from the node that it is passed in, it will recursively call
# itself using the 'extra_ctes'. The 'ephemeral' models do
# not produce SQL that is executed directly, instead they
# are rolled up into the models that refer to them by
# inserting CTEs into the SQL.
def _recursively_prepend_ctes(
self,
model: NonSourceCompiledNode,
manifest: Manifest,
extra_context: Dict[str, Any],
extra_context: Optional[Dict[str, Any]],
) -> Tuple[NonSourceCompiledNode, List[InjectedCTE]]:
if model.compiled_sql is None:
raise RuntimeException(
'Cannot inject ctes into an unparsed node', model
)
if model.extra_ctes_injected:
return (model, model.extra_ctes)
if flags.STRICT_MODE:
if not isinstance(model, tuple(COMPILED_TYPES.values())):
raise InternalException(
f'Bad model type: {type(model)}'
)
# Just to make it plain that nothing is actually injected for this case
if not model.extra_ctes:
model.extra_ctes_injected = True
manifest.update_node(model)
return (model, model.extra_ctes)
# This stores the ctes which will all be recursively
# gathered and then "injected" into the model.
prepended_ctes: List[InjectedCTE] = []
dbt_test_name = self._get_dbt_test_name()
# extra_ctes are added to the model by
# RuntimeRefResolver.create_relation, which adds an
# extra_cte for every model relation which is an
# ephemeral model.
for cte in model.extra_ctes:
if cte.id == dbt_test_name:
sql = cte.sql
else:
cte_model = self._get_compiled_model(
manifest,
cte.id,
extra_context,
)
cte_model, new_prepended_ctes = self._recursively_prepend_ctes(
cte_model, manifest, extra_context
)
if cte.id not in manifest.nodes:
raise InternalException(
f'During compilation, found a cte reference that '
f'could not be resolved: {cte.id}'
)
cte_model = manifest.nodes[cte.id]
if not cte_model.is_ephemeral_model:
raise InternalException(f'{cte.id} is not ephemeral')
# This model has already been compiled, so it's been
# through here before
if getattr(cte_model, 'compiled', False):
assert isinstance(cte_model,
tuple(COMPILED_TYPES.values()))
cte_model = cast(NonSourceCompiledNode, cte_model)
new_prepended_ctes = cte_model.extra_ctes
# if the cte_model isn't compiled, i.e. first time here
else:
# This is an ephemeral parsed model that we can compile.
# Compile and update the node
cte_model = self._compile_node(
cte_model, manifest, extra_context)
# recursively call this method
cte_model, new_prepended_ctes = \
self._recursively_prepend_ctes(
cte_model, manifest, extra_context
)
# Save compiled SQL file and sync manifest
self._write_node(cte_model)
manifest.sync_update_node(cte_model)
_extend_prepended_ctes(prepended_ctes, new_prepended_ctes)
new_cte_name = self.add_ephemeral_prefix(cte_model.name)
sql = f' {new_cte_name} as (\n{cte_model.compiled_sql}\n)'
_add_prepended_cte(prepended_ctes, InjectedCTE(id=cte.id, sql=sql))
model = self._model_prepend_ctes(model, prepended_ctes)
# We don't save injected_sql into compiled sql for ephemeral models
# because it will cause problems with processing of subsequent models.
# Ephemeral models do not produce executable SQL of their own.
if not model.is_ephemeral_model:
injected_sql = self._inject_ctes_into_sql(
model.compiled_sql,
prepended_ctes,
)
model.compiled_sql = injected_sql
model.extra_ctes_injected = True
model.extra_ctes = prepended_ctes
model.validate(model.to_dict())
manifest.update_node(model)
return model, prepended_ctes
def _insert_ctes(
def _add_ctes(
self,
compiled_node: NonSourceCompiledNode,
manifest: Manifest,
extra_context: Dict[str, Any],
) -> NonSourceCompiledNode:
"""Insert the CTEs for the model."""
"""Wrap the data test SQL in a CTE."""
# for data tests, we need to insert a special CTE at the end of the
# list containing the test query, and then have the "real" query be a
# select count(*) from that model.
# the benefit of doing it this way is that _insert_ctes() can be
# rewritten for different adapters to handle databses that don't
# the benefit of doing it this way is that _add_ctes() can be
# rewritten for different adapters to handle databases that don't
# support CTEs, or at least don't have full support.
if isinstance(compiled_node, CompiledDataTestNode):
# the last prepend (so last in order) should be the data test body.
@@ -352,11 +371,12 @@ class Compiler:
compiled_node.extra_ctes.append(cte)
compiled_node.compiled_sql = f'\nselect count(*) from {name}'
injected_node, _ = self._recursively_prepend_ctes(
compiled_node, manifest, extra_context
)
return injected_node
return compiled_node
# creates a compiled_node from the ManifestNode passed in,
# creates a "context" dictionary for jinja rendering,
# and then renders the "compiled_sql" using the node, the
# raw_sql and the context.
def _compile_node(
self,
node: ManifestNode,
@@ -374,7 +394,6 @@ class Compiler:
'compiled_sql': None,
'extra_ctes_injected': False,
'extra_ctes': [],
'injected_sql': None,
})
compiled_node = _compiled_type_for(node).from_dict(data)
@@ -388,13 +407,17 @@ class Compiler:
node,
)
compiled_node.relation_name = self._get_relation_name(node)
compiled_node.compiled = True
injected_node = self._insert_ctes(
# add ctes for specific test nodes, and also for
# possible future use in adapters
compiled_node = self._add_ctes(
compiled_node, manifest, extra_context
)
return injected_node
return compiled_node
def write_graph_file(self, linker: Linker, manifest: Manifest):
filename = graph_file_name
@@ -426,9 +449,9 @@ class Compiler:
linker.add_node(source.unique_id)
for node in manifest.nodes.values():
self.link_node(linker, node, manifest)
for report in manifest.reports.values():
self.link_node(linker, report, manifest)
# linker.add_node(report.unique_id)
for exposure in manifest.exposures.values():
self.link_node(linker, exposure, manifest)
# linker.add_node(exposure.unique_id)
cycle = linker.find_cycles()
@@ -449,26 +472,26 @@ class Compiler:
return Graph(linker.graph)
# writes the "compiled_sql" into the target/compiled directory
def _write_node(self, node: NonSourceCompiledNode) -> ManifestNode:
if not _is_writable(node):
if (not node.extra_ctes_injected or
node.resource_type == NodeType.Snapshot):
return node
logger.debug(f'Writing injected SQL for node "{node.unique_id}"')
if node.injected_sql is None:
# this should not really happen, but it'd be a shame to crash
# over it
logger.error(
f'Compiled node "{node.unique_id}" had no injected_sql, '
'cannot write sql!'
)
else:
if node.compiled_sql:
node.build_path = node.write_node(
self.config.target_path,
'compiled',
node.injected_sql
node.compiled_sql
)
return node
# This is the main entry point into this code. It's called by
# CompileRunner.compile, GenericRPCRunner.compile, and
# RunTask.get_hook_sql. It calls '_compile_node' to convert
# the node into a compiled node, and then calls the
# recursive method to "prepend" the ctes.
def compile_node(
self,
node: ManifestNode,
@@ -478,16 +501,9 @@ class Compiler:
) -> NonSourceCompiledNode:
node = self._compile_node(node, manifest, extra_context)
if write and _is_writable(node):
node, _ = self._recursively_prepend_ctes(
node, manifest, extra_context
)
if write:
self._write_node(node)
return node
def _is_writable(node):
if not node.injected_sql:
return False
if node.resource_type == NodeType.Snapshot:
return False
return True


@@ -2,7 +2,7 @@ from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple
import os
from hologram import ValidationError
from dbt.dataclass_schema import ValidationError
from dbt.clients.system import load_file_contents
from dbt.clients.yaml_helper import load_yaml_text
@@ -75,12 +75,15 @@ def read_user_config(directory: str) -> UserConfig:
if profile:
user_cfg = coerce_dict_str(profile.get('config', {}))
if user_cfg is not None:
UserConfig.validate(user_cfg)
return UserConfig.from_dict(user_cfg)
except (RuntimeException, ValidationError):
pass
return UserConfig()
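The same two-step pattern recurs throughout these config changes: an explicit classmethod `validate()` on the raw dict, followed by `from_dict()`, instead of `from_dict(..., validate=True)`. A self-contained sketch with a stand-in class:

```python
from dataclasses import dataclass


@dataclass
class ToyUserConfig:
    send_anonymous_usage_stats: bool = True

    @classmethod
    def validate(cls, data: dict) -> None:
        # validation is an explicit, separate step
        unknown = set(data) - {"send_anonymous_usage_stats"}
        if unknown:
            raise ValueError(f"unknown keys: {sorted(unknown)}")

    @classmethod
    def from_dict(cls, data: dict) -> "ToyUserConfig":
        # construction no longer validates implicitly
        return cls(**data)


raw = {"send_anonymous_usage_stats": False}
ToyUserConfig.validate(raw)
cfg = ToyUserConfig.from_dict(raw)
```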
# The Profile class is included in RuntimeConfig, so any attribute
# additions must also be set where the RuntimeConfig class is created
@dataclass
class Profile(HasCredentials):
profile_name: str
@@ -135,10 +138,10 @@ class Profile(HasCredentials):
def validate(self):
try:
if self.credentials:
self.credentials.to_dict(validate=True)
ProfileConfig.from_dict(
self.to_profile_info(serialize_credentials=True)
)
dct = self.credentials.to_dict()
self.credentials.validate(dct)
dct = self.to_profile_info(serialize_credentials=True)
ProfileConfig.validate(dct)
except ValidationError as exc:
raise DbtProfileError(validator_error_message(exc)) from exc
@@ -158,7 +161,9 @@ class Profile(HasCredentials):
typename = profile.pop('type')
try:
cls = load_plugin(typename)
credentials = cls.from_dict(profile)
data = cls.translate_aliases(profile)
cls.validate(data)
credentials = cls.from_dict(data)
except (RuntimeException, ValidationError) as e:
msg = str(e) if isinstance(e, RuntimeException) else e.message
raise DbtProfileError(
@@ -231,6 +236,7 @@ class Profile(HasCredentials):
"""
if user_cfg is None:
user_cfg = {}
UserConfig.validate(user_cfg)
config = UserConfig.from_dict(user_cfg)
profile = cls(


@@ -25,15 +25,13 @@ from dbt.semver import versions_compatible
from dbt.version import get_installed_version
from dbt.utils import MultiDict
from dbt.node_types import NodeType
from dbt.config.selectors import SelectorDict
from dbt.contracts.project import (
Project as ProjectContract,
SemverString,
)
from dbt.contracts.project import PackageConfig
from hologram import ValidationError
from dbt.dataclass_schema import ValidationError
from .renderer import DbtProjectYamlRenderer
from .selectors import (
selector_config_from_data,
@@ -100,6 +98,7 @@ def package_config_from_data(packages_data: Dict[str, Any]):
packages_data = {'packages': []}
try:
PackageConfig.validate(packages_data)
packages = PackageConfig.from_dict(packages_data)
except ValidationError as e:
raise DbtProjectError(
@@ -305,7 +304,10 @@ class PartialProject(RenderComponents):
)
try:
cfg = ProjectContract.from_dict(rendered.project_dict)
ProjectContract.validate(rendered.project_dict)
cfg = ProjectContract.from_dict(
rendered.project_dict
)
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
# name/version are required in the Project definition, so we can assume
@@ -370,6 +372,12 @@ class PartialProject(RenderComponents):
packages = package_config_from_data(rendered.packages_dict)
selectors = selector_config_from_data(rendered.selectors_dict)
manifest_selectors: Dict[str, Any] = {}
if rendered.selectors_dict and rendered.selectors_dict['selectors']:
# this is a dict with a single key 'selectors' pointing to a list
# of dicts.
manifest_selectors = SelectorDict.parse_from_selectors_list(
rendered.selectors_dict['selectors'])
project = Project(
project_name=name,
@@ -396,6 +404,7 @@ class PartialProject(RenderComponents):
snapshots=snapshots,
dbt_version=dbt_version,
packages=packages,
manifest_selectors=manifest_selectors,
selectors=selectors,
query_comment=query_comment,
sources=sources,
@@ -458,6 +467,7 @@ class PartialProject(RenderComponents):
class VarProvider:
"""Var providers are tied to a particular Project."""
def __init__(
self,
vars: Dict[str, Dict[str, Any]]
@@ -476,6 +486,8 @@ class VarProvider:
return self.vars
# The Project class is included in RuntimeConfig, so any attribute
# additions must also be set where the RuntimeConfig class is created
@dataclass
class Project:
project_name: str
@@ -504,6 +516,7 @@ class Project:
vars: VarProvider
dbt_version: List[VersionSpecifier]
packages: Dict[str, Any]
manifest_selectors: Dict[str, Any]
selectors: SelectorConfig
query_comment: QueryComment
config_version: int
@@ -574,7 +587,7 @@ class Project:
def validate(self):
try:
ProjectContract.from_dict(self.to_project_config())
ProjectContract.validate(self.to_project_config())
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e


@@ -33,7 +33,7 @@ from dbt.exceptions import (
raise_compiler_error
)
from hologram import ValidationError
from dbt.dataclass_schema import ValidationError
def _project_quoting_dict(
@@ -106,6 +106,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
snapshots=project.snapshots,
dbt_version=project.dbt_version,
packages=project.packages,
manifest_selectors=project.manifest_selectors,
selectors=project.selectors,
query_comment=project.query_comment,
sources=project.sources,
@@ -173,7 +174,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
:raises DbtProjectError: If the configuration fails validation.
"""
try:
Configuration.from_dict(self.serialize())
Configuration.validate(self.serialize())
except ValidationError as e:
raise DbtProjectError(validator_error_message(e)) from e
@@ -390,7 +391,7 @@ class UnsetConfig(UserConfig):
f"'UnsetConfig' object has no attribute {name}"
)
def to_dict(self):
def __post_serialize__(self, dct, options=None):
return {}
@@ -483,6 +484,7 @@ class UnsetProfileConfig(RuntimeConfig):
snapshots=project.snapshots,
dbt_version=project.dbt_version,
packages=project.packages,
manifest_selectors=project.manifest_selectors,
selectors=project.selectors,
query_comment=project.query_comment,
sources=project.sources,


@@ -1,8 +1,9 @@
from pathlib import Path
from typing import Dict, Any
import yaml
from hologram import ValidationError
from dbt.clients.yaml_helper import ( # noqa: F401
yaml, Loader, Dumper, load_yaml_text
)
from dbt.dataclass_schema import ValidationError
from .renderer import SelectorRenderer
@@ -11,10 +12,10 @@ from dbt.clients.system import (
path_exists,
resolve_path_from_base,
)
from dbt.clients.yaml_helper import load_yaml_text
from dbt.contracts.selection import SelectorFile
from dbt.exceptions import DbtSelectorsError, RuntimeException
from dbt.graph import parse_from_selectors_definition, SelectionSpec
from dbt.graph.selector_spec import SelectionCriteria
MALFORMED_SELECTOR_ERROR = """\
The selectors.yml file in this project is malformed. Please double check
@@ -29,9 +30,11 @@ Validator Error:
class SelectorConfig(Dict[str, SelectionSpec]):
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> 'SelectorConfig':
def selectors_from_dict(cls, data: Dict[str, Any]) -> 'SelectorConfig':
try:
SelectorFile.validate(data)
selector_file = SelectorFile.from_dict(data)
selectors = parse_from_selectors_definition(selector_file)
except ValidationError as exc:
@@ -65,7 +68,7 @@ class SelectorConfig(Dict[str, SelectionSpec]):
f'Could not render selector data: {exc}',
result_type='invalid_selector',
) from exc
return cls.from_dict(rendered)
return cls.selectors_from_dict(rendered)
@classmethod
def from_path(
@@ -106,10 +109,74 @@ def selector_config_from_data(
selectors_data = {'selectors': []}
try:
selectors = SelectorConfig.from_dict(selectors_data)
selectors = SelectorConfig.selectors_from_dict(selectors_data)
except ValidationError as e:
raise DbtSelectorsError(
MALFORMED_SELECTOR_ERROR.format(error=str(e.message)),
result_type='invalid_selector',
) from e
return selectors
# These are utilities to clean up the dictionary created from
# selectors.yml by turning the cli-string format entries into
# normalized dictionary entries. It parallels the flow in
# dbt/graph/cli.py. If changes are made there, it might
# be necessary to make changes here. Ideally it would be
# good to combine the two flows into one at some point.
class SelectorDict:
@classmethod
def parse_dict_definition(cls, definition):
key = list(definition)[0]
value = definition[key]
if isinstance(value, list):
new_values = []
for sel_def in value:
new_value = cls.parse_from_definition(sel_def)
new_values.append(new_value)
value = new_values
if key == 'exclude':
definition = {key: value}
elif len(definition) == 1:
definition = {'method': key, 'value': value}
return definition
@classmethod
def parse_a_definition(cls, def_type, definition):
# this definition must be a list
new_dict = {def_type: []}
for sel_def in definition[def_type]:
if isinstance(sel_def, dict):
sel_def = cls.parse_from_definition(sel_def)
new_dict[def_type].append(sel_def)
elif isinstance(sel_def, str):
sel_def = SelectionCriteria.dict_from_single_spec(sel_def)
new_dict[def_type].append(sel_def)
else:
new_dict[def_type].append(sel_def)
return new_dict
@classmethod
def parse_from_definition(cls, definition):
if isinstance(definition, str):
definition = SelectionCriteria.dict_from_single_spec(definition)
elif 'union' in definition:
definition = cls.parse_a_definition('union', definition)
elif 'intersection' in definition:
definition = cls.parse_a_definition('intersection', definition)
elif isinstance(definition, dict):
definition = cls.parse_dict_definition(definition)
return definition
# This is the normal entrypoint of this code. Give it the
# list of selectors generated from the selectors.yml file.
@classmethod
def parse_from_selectors_list(cls, selectors):
selector_dict = {}
for selector in selectors:
sel_name = selector['name']
selector_dict[sel_name] = selector
definition = cls.parse_from_definition(selector['definition'])
selector_dict[sel_name]['definition'] = definition
return selector_dict
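A rough, self-contained illustration of the normalization `SelectorDict` performs on selectors.yml definitions (simplified: the real code also handles exclude blocks, nested definitions, and the full cli-string grammar):

```python
def normalize(definition):
    # "tag:nightly" -> {"method": "tag", "value": "nightly"} (simplified)
    if isinstance(definition, str):
        if ":" in definition:
            method, _, value = definition.partition(":")
            return {"method": method, "value": value}
        return {"method": "fqn", "value": definition}
    if isinstance(definition, dict):
        for key in ("union", "intersection"):
            if key in definition:
                return {key: [normalize(item) for item in definition[key]]}
        (key, value), = definition.items()
        return {"method": key, "value": value}
    return definition


print(normalize({"union": ["tag:nightly", {"path": "models/staging"}]}))
```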


@@ -7,17 +7,19 @@ from typing import (
from dbt import flags
from dbt import tracking
from dbt.clients.jinja import undefined_error, get_rendered
from dbt.clients import yaml_helper
from dbt.clients.yaml_helper import ( # noqa: F401
yaml, safe_load, SafeLoader, Loader, Dumper
)
from dbt.contracts.graph.compiled import CompiledResource
from dbt.exceptions import raise_compiler_error, MacroReturn
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.version import __version__ as dbt_version
import yaml
# These modules are added to the context. Consider alternative
# approaches which will extend well to potentially many modules
import pytz
import datetime
import re
def get_pytz_module_context() -> Dict[str, Any]:
@@ -42,10 +44,19 @@ def get_datetime_module_context() -> Dict[str, Any]:
}
def get_re_module_context() -> Dict[str, Any]:
context_exports = re.__all__
return {
name: getattr(re, name) for name in context_exports
}
def get_context_modules() -> Dict[str, Dict[str, Any]]:
return {
'pytz': get_pytz_module_context(),
'datetime': get_datetime_module_context(),
're': get_re_module_context(),
}
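For reference, the new helper just re-exports everything in `re.__all__`; the resulting mapping is what backs the `re` entry handed to the Jinja-side `modules` namespace (shown here as plain Python):

```python
import re

# Mirror of get_re_module_context(): every public name in re.__all__
# becomes available on the module object exposed to the Jinja context.
re_context = {name: getattr(re, name) for name in re.__all__}

assert re_context["match"] is re.match
assert re_context["sub"] is re.sub
```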
@@ -162,6 +173,7 @@ class BaseContext(metaclass=ContextMeta):
builtins[key] = value
return builtins
# no dbtClassMixin so this is not an actual override
def to_dict(self):
self._ctx['context'] = self._ctx
builtins = self.generate_builtins()
@@ -384,7 +396,7 @@ class BaseContext(metaclass=ContextMeta):
-- ["good"]
"""
try:
return yaml_helper.safe_load(value)
return safe_load(value)
except (AttributeError, ValueError, yaml.YAMLError):
return default


@@ -165,7 +165,7 @@ class ContextConfigGenerator(BaseContextConfigGenerator[C]):
# Calculate the defaults. We don't want to validate the defaults,
# because it might be invalid in the case of required config members
# (such as on snapshots!)
result = config_cls.from_dict({}, validate=False)
result = config_cls.from_dict({})
return result
def _update_from_config(


@@ -0,0 +1,153 @@
from typing import (
Dict, MutableMapping, Optional
)
from dbt.contracts.graph.parsed import ParsedMacro
from dbt.exceptions import raise_duplicate_macro_name, raise_compiler_error
from dbt.include.global_project import PROJECT_NAME as GLOBAL_PROJECT_NAME
from dbt.clients.jinja import MacroGenerator
MacroNamespace = Dict[str, ParsedMacro]
# This class builds the MacroResolver by adding macros
# to various categories for finding macros in the right order,
# so that higher precedence macros are found first.
# This functionality is also provided by the MacroNamespace,
# but the intention is to eventually replace that class.
# This enables us to get the macro unique_id without
# processing every macro in the project.
class MacroResolver:
def __init__(
self,
macros: MutableMapping[str, ParsedMacro],
root_project_name: str,
internal_package_names,
) -> None:
self.root_project_name = root_project_name
self.macros = macros
# internal packages comes from get_adapter_package_names
self.internal_package_names = internal_package_names
# To be filled in from macros.
self.internal_packages: Dict[str, MacroNamespace] = {}
self.packages: Dict[str, MacroNamespace] = {}
self.root_package_macros: MacroNamespace = {}
# add the macros to internal_packages, packages, and root packages
self.add_macros()
self._build_internal_packages_namespace()
self._build_macros_by_name()
def _build_internal_packages_namespace(self):
# Iterate in reverse-order and overwrite: the packages that are first
# in the list are the ones we want to "win".
self.internal_packages_namespace: MacroNamespace = {}
for pkg in reversed(self.internal_package_names):
if pkg in self.internal_packages:
# Turn the internal packages into a flat namespace
self.internal_packages_namespace.update(
self.internal_packages[pkg])
def _build_macros_by_name(self):
macros_by_name = {}
# search root package macros
for macro in self.root_package_macros.values():
macros_by_name[macro.name] = macro
# search miscellaneous non-internal packages
for fnamespace in self.packages.values():
for macro in fnamespace.values():
macros_by_name[macro.name] = macro
# search all internal packages
for macro in self.internal_packages_namespace.values():
macros_by_name[macro.name] = macro
self.macros_by_name = macros_by_name
def _add_macro_to(
self,
package_namespaces: Dict[str, MacroNamespace],
macro: ParsedMacro,
):
if macro.package_name in package_namespaces:
namespace = package_namespaces[macro.package_name]
else:
namespace = {}
package_namespaces[macro.package_name] = namespace
if macro.name in namespace:
raise_duplicate_macro_name(
macro, macro, macro.package_name
)
package_namespaces[macro.package_name][macro.name] = macro
def add_macro(self, macro: ParsedMacro):
macro_name: str = macro.name
# internal macros (from plugins) will be processed separately from
# project macros, so store them in a different place
if macro.package_name in self.internal_package_names:
self._add_macro_to(self.internal_packages, macro)
else:
# if it's not an internal package
self._add_macro_to(self.packages, macro)
# add to root_package_macros if it's in the root package
if macro.package_name == self.root_project_name:
self.root_package_macros[macro_name] = macro
def add_macros(self):
for macro in self.macros.values():
self.add_macro(macro)
def get_macro_id(self, local_package, macro_name):
local_package_macros = {}
if (local_package not in self.internal_package_names and
local_package in self.packages):
local_package_macros = self.packages[local_package]
# First: search the local packages for this macro
if macro_name in local_package_macros:
return local_package_macros[macro_name].unique_id
if macro_name in self.macros_by_name:
return self.macros_by_name[macro_name].unique_id
return None
# Currently this is just used by test processing in the schema
# parser (in connection with the MacroResolver). Future work
# will extend the use of these classes to other parsing areas.
# One of the features of this class compared to the MacroNamespace
# is that you can limit the number of macros provided to the
# context dictionary in the 'to_dict' manifest method.
class TestMacroNamespace:
def __init__(
self, macro_resolver, ctx, node, thread_ctx, depends_on_macros
):
self.macro_resolver = macro_resolver
self.ctx = ctx
self.node = node
self.thread_ctx = thread_ctx
local_namespace = {}
if depends_on_macros:
for macro_unique_id in depends_on_macros:
macro = self.manifest.macros[macro_unique_id]
local_namespace[macro.name] = MacroGenerator(
macro, self.ctx, self.node, self.thread_ctx,
)
self.local_namespace = local_namespace
def get_from_package(
self, package_name: Optional[str], name: str
) -> Optional[MacroGenerator]:
macro = None
if package_name is None:
macro = self.macro_resolver.macros_by_name.get(name)
elif package_name == GLOBAL_PROJECT_NAME:
macro = self.macro_resolver.internal_packages_namespace.get(name)
elif package_name in self.macro_resolver.packages:
macro = self.macro_resolver.packages[package_name].get(name)
else:
raise_compiler_error(
f"Could not find package '{package_name}'"
)
macro_func = MacroGenerator(
macro, self.ctx, self.node, self.thread_ctx
)
return macro_func
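As a rough, standalone illustration of what the local namespace above buys us (plain callables stand in for MacroGenerator, and the unique-id format is assumed to end with the macro name): only the macros a test node declares in depends_on are exposed, instead of every macro in the manifest.

from typing import Callable, Dict, List

def build_local_namespace(
    manifest_macros: Dict[str, Callable],   # unique_id -> macro callable
    depends_on_macros: List[str],           # unique_ids the test node depends on
) -> Dict[str, Callable]:
    local_namespace: Dict[str, Callable] = {}
    for macro_unique_id in depends_on_macros:
        macro = manifest_macros[macro_unique_id]
        # e.g. 'macro.dbt.test_unique' -> 'test_unique'
        local_namespace[macro_unique_id.split('.')[-1]] = macro
    return local_namespace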

View File

@@ -15,6 +15,10 @@ NamespaceMember = Union[FlatNamespace, MacroGenerator]
FullNamespace = Dict[str, NamespaceMember]
# The point of this class is to collect the various macros
# and provide the ability to flatten them into the ManifestContexts
# that are created for jinja, so that macro calls can be resolved.
# Creates special iterators and _keys methods to flatten the lists.
class MacroNamespace(Mapping):
def __init__(
self,
@@ -37,12 +41,16 @@ class MacroNamespace(Mapping):
}
yield self.global_project_namespace
# provides special keys method for MacroNamespace iterator
# returns keys from local_namespace, global_namespace, packages,
# global_project_namespace
def _keys(self) -> Set[str]:
keys: Set[str] = set()
for search in self._search_order():
keys.update(search)
return keys
# special iterator using special keys
def __iter__(self) -> Iterator[str]:
for key in self._keys():
yield key
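A hedged, self-contained sketch of the flattening idea behind _search_order, _keys, and __iter__ (this is not the real MacroNamespace; it only shows how a Mapping can present several namespaces as one, with earlier namespaces winning on lookup):

from collections.abc import Mapping
from typing import Dict, Iterator, Set

class FlatView(Mapping):
    def __init__(self, *namespaces: Dict[str, object]) -> None:
        self._namespaces = namespaces

    def _keys(self) -> Set[str]:
        keys: Set[str] = set()
        for namespace in self._namespaces:
            keys.update(namespace)
        return keys

    def __iter__(self) -> Iterator[str]:
        return iter(self._keys())

    def __len__(self) -> int:
        return len(self._keys())

    def __getitem__(self, key: str) -> object:
        # earlier namespaces win, mirroring a fixed search order
        for namespace in self._namespaces:
            if key in namespace:
                return namespace[key]
        raise KeyError(key)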
@@ -72,6 +80,10 @@ class MacroNamespace(Mapping):
)
# This class builds the MacroNamespace by adding macros to
# internal_packages or packages, and locals/globals.
# Call 'build_namespace' to return a MacroNamespace.
# This is used by ManifestContext (and subclasses)
class MacroNamespaceBuilder:
def __init__(
self,
@@ -83,10 +95,15 @@ class MacroNamespaceBuilder:
) -> None:
self.root_package = root_package
self.search_package = search_package
# internal packages come from get_adapter_package_names
self.internal_package_names = set(internal_packages)
self.internal_package_names_order = internal_packages
# macro_func is added here if it's in the root package
self.globals: FlatNamespace = {}
# macro_func is added here if it's the package for this node
self.locals: FlatNamespace = {}
# Create a dictionary of [package name][macro name] =
# MacroGenerator object which acts like a function
self.internal_packages: Dict[str, FlatNamespace] = {}
self.packages: Dict[str, FlatNamespace] = {}
self.thread_ctx = thread_ctx
@@ -94,25 +111,28 @@ class MacroNamespaceBuilder:
def _add_macro_to(
self,
heirarchy: Dict[str, FlatNamespace],
hierarchy: Dict[str, FlatNamespace],
macro: ParsedMacro,
macro_func: MacroGenerator,
):
if macro.package_name in heirarchy:
namespace = heirarchy[macro.package_name]
if macro.package_name in hierarchy:
namespace = hierarchy[macro.package_name]
else:
namespace = {}
heirarchy[macro.package_name] = namespace
hierarchy[macro.package_name] = namespace
if macro.name in namespace:
raise_duplicate_macro_name(
macro_func.macro, macro, macro.package_name
)
heirarchy[macro.package_name][macro.name] = macro_func
hierarchy[macro.package_name][macro.name] = macro_func
def add_macro(self, macro: ParsedMacro, ctx: Dict[str, Any]):
macro_name: str = macro.name
# MacroGenerator is in clients/jinja.py
# a MacroGenerator object is a callable object that will
# execute the MacroGenerator.__call__ function
macro_func: MacroGenerator = MacroGenerator(
macro, ctx, self.node, self.thread_ctx
)
@@ -122,10 +142,12 @@ class MacroNamespaceBuilder:
if macro.package_name in self.internal_package_names:
self._add_macro_to(self.internal_packages, macro, macro_func)
else:
# if it's not an internal package
self._add_macro_to(self.packages, macro, macro_func)
# add to locals if it's the package this node is in
if macro.package_name == self.search_package:
self.locals[macro_name] = macro_func
# add to globals if it's in the root package
elif macro.package_name == self.root_package:
self.globals[macro_name] = macro_func
@@ -143,6 +165,7 @@ class MacroNamespaceBuilder:
global_project_namespace: FlatNamespace = {}
for pkg in reversed(self.internal_package_names_order):
if pkg in self.internal_packages:
# add the macros pointed to by this package name
global_project_namespace.update(self.internal_packages[pkg])
return MacroNamespace(

View File

@@ -2,7 +2,8 @@ from typing import List
from dbt.clients.jinja import MacroStack
from dbt.contracts.connection import AdapterRequiredConfig
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.manifest import Manifest, AnyManifest
from dbt.context.macro_resolver import TestMacroNamespace
from .configured import ConfiguredContext
@@ -19,17 +20,25 @@ class ManifestContext(ConfiguredContext):
def __init__(
self,
config: AdapterRequiredConfig,
manifest: Manifest,
manifest: AnyManifest,
search_package: str,
) -> None:
super().__init__(config)
self.manifest = manifest
# this is the package of the node for which this context was built
self.search_package = search_package
self.macro_stack = MacroStack()
# This namespace is used by the BaseDatabaseWrapper in jinja rendering.
# The namespace is passed to it when it's constructed. It expects
# to be able to do: namespace.get_from_package(..)
self.namespace = self._build_namespace()
def _build_namespace(self):
# this takes all the macros in the manifest and adds them
# to the MacroNamespaceBuilder stored in self.namespace
builder = self._get_namespace_builder()
self.namespace = builder.build_namespace(
self.manifest.macros.values(),
self._ctx,
return builder.build_namespace(
self.manifest.macros.values(), self._ctx
)
def _get_namespace_builder(self) -> MacroNamespaceBuilder:
@@ -46,9 +55,15 @@ class ManifestContext(ConfiguredContext):
None,
)
# This does not use the Mashumaro code
def to_dict(self):
dct = super().to_dict()
dct.update(self.namespace)
# This moves all of the macros in the 'namespace' into top level
# keys in the manifest dictionary
if isinstance(self.namespace, TestMacroNamespace):
dct.update(self.namespace.local_namespace)
else:
dct.update(self.namespace)
return dct
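For illustration, the effect of the to_dict override above is simply that macro names land as top-level keys of the rendering context, so jinja can call them directly. A tiny standalone sketch; the names here are made up:

def flatten_context(base_context: dict, namespace_macros: dict) -> dict:
    # macros become top-level keys alongside the usual context members
    dct = dict(base_context)
    dct.update(namespace_macros)
    return dct

ctx = flatten_context(
    {'target': {'name': 'dev'}},
    {'my_macro': lambda: 'select 1'},
)
assert callable(ctx['my_macro'])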

View File

@@ -10,14 +10,18 @@ from dbt import deprecations
from dbt.adapters.base.column import Column
from dbt.adapters.factory import get_adapter, get_adapter_package_names
from dbt.clients import agate_helper
from dbt.clients.jinja import get_rendered, MacroGenerator
from dbt.clients.jinja import get_rendered, MacroGenerator, MacroStack
from dbt.config import RuntimeConfig, Project
from .base import contextmember, contextproperty, Var
from .configured import FQNLookup
from .context_config import ContextConfig
from dbt.context.macro_resolver import MacroResolver, TestMacroNamespace
from .macros import MacroNamespaceBuilder, MacroNamespace
from .manifest import ManifestContext
from dbt.contracts.graph.manifest import Manifest, Disabled
from dbt.contracts.connection import AdapterResponse
from dbt.contracts.graph.manifest import (
Manifest, AnyManifest, Disabled, MacroManifest
)
from dbt.contracts.graph.compiled import (
CompiledResource,
CompiledSeedNode,
@@ -25,7 +29,7 @@ from dbt.contracts.graph.compiled import (
)
from dbt.contracts.graph.parsed import (
ParsedMacro,
ParsedReport,
ParsedExposure,
ParsedSeedNode,
ParsedSourceDefinition,
)
@@ -83,6 +87,7 @@ class BaseDatabaseWrapper:
Wrapper for runtime database interaction. Applies the runtime quote policy
via a relation proxy.
"""
def __init__(self, adapter, namespace: MacroNamespace):
self._adapter = adapter
self.Relation = RelationProxy(adapter)
@@ -139,6 +144,7 @@ class BaseDatabaseWrapper:
for prefix in self._get_adapter_macro_prefixes():
search_name = f'{prefix}__{macro_name}'
try:
# this uses the namespace from the context
macro = self._namespace.get_from_package(
package_name, search_name
)
@@ -379,6 +385,7 @@ class ParseDatabaseWrapper(BaseDatabaseWrapper):
"""The parser subclass of the database wrapper applies any explicit
parse-time overrides.
"""
def __getattr__(self, name):
override = (name in self._adapter._available_ and
name in self._adapter._parse_replacements_)
@@ -399,6 +406,7 @@ class RuntimeDatabaseWrapper(BaseDatabaseWrapper):
"""The runtime database wrapper exposes everything the adapter marks
available.
"""
def __getattr__(self, name):
if name in self._adapter._available_:
return getattr(self._adapter, name)
@@ -634,10 +642,13 @@ class ProviderContext(ManifestContext):
self.context_config: Optional[ContextConfig] = context_config
self.provider: Provider = provider
self.adapter = get_adapter(self.config)
# The macro namespace is used in creating the DatabaseWrapper
self.db_wrapper = self.provider.DatabaseWrapper(
self.adapter, self.namespace
)
# This overrides the method in ManifestContext, and provides
# a model, which the ManifestContext builder does not
def _get_namespace_builder(self):
internal_packages = get_adapter_package_names(
self.config.credentials.type
@@ -660,18 +671,33 @@ class ProviderContext(ManifestContext):
@contextmember
def store_result(
self, name: str, status: Any, agate_table: Optional[agate.Table] = None
self, name: str,
response: Any,
agate_table: Optional[agate.Table] = None
) -> str:
if agate_table is None:
agate_table = agate_helper.empty_table()
self.sql_results[name] = AttrDict({
'status': status,
'response': response,
'data': agate_helper.as_matrix(agate_table),
'table': agate_table
})
return ''
@contextmember
def store_raw_result(
self,
name: str,
message: Optional[str] = None,
code: Optional[str] = None,
rows_affected: Optional[str] = None,
agate_table: Optional[agate.Table] = None
) -> str:
response = AdapterResponse(
_message=message, code=code, rows_affected=rows_affected)
return self.store_result(name, response, agate_table)
@contextproperty
def validation(self):
def validate_any(*args) -> Callable[[T], None]:
@@ -1179,11 +1205,12 @@ class MacroContext(ProviderContext):
- 'schema' does not use any 'model' information
- they can't be configured with config() directives
"""
def __init__(
self,
model: ParsedMacro,
config: RuntimeConfig,
manifest: Manifest,
manifest: AnyManifest,
provider: Provider,
search_package: Optional[str],
) -> None:
@@ -1217,7 +1244,9 @@ class ModelContext(ProviderContext):
@contextproperty
def sql(self) -> Optional[str]:
return getattr(self.model, 'injected_sql', None)
if getattr(self.model, 'extra_ctes_injected', None):
return self.model.compiled_sql
return None
@contextproperty
def database(self) -> str:
@@ -1267,34 +1296,28 @@ class ModelContext(ProviderContext):
return self.db_wrapper.Relation.create_from(self.config, self.model)
# This is called by '_context_for', used in 'render_with_context'
def generate_parser_model(
model: ManifestNode,
config: RuntimeConfig,
manifest: Manifest,
manifest: MacroManifest,
context_config: ContextConfig,
) -> Dict[str, Any]:
# The __init__ method of ModelContext also initializes
# a ManifestContext object which creates a MacroNamespaceBuilder
# which adds every macro in the Manifest.
ctx = ModelContext(
model, config, manifest, ParseProvider(), context_config
)
return ctx.to_dict()
def generate_parser_macro(
macro: ParsedMacro,
config: RuntimeConfig,
manifest: Manifest,
package_name: Optional[str],
) -> Dict[str, Any]:
ctx = MacroContext(
macro, config, manifest, ParseProvider(), package_name
)
# The 'to_dict' method in ManifestContext moves all of the macro names
# in the macro 'namespace' up to top level keys
return ctx.to_dict()
def generate_generate_component_name_macro(
macro: ParsedMacro,
config: RuntimeConfig,
manifest: Manifest,
manifest: MacroManifest,
) -> Dict[str, Any]:
ctx = MacroContext(
macro, config, manifest, GenerateNameProvider(), None
@@ -1325,7 +1348,7 @@ def generate_runtime_macro(
return ctx.to_dict()
class ReportRefResolver(BaseResolver):
class ExposureRefResolver(BaseResolver):
def __call__(self, *args) -> str:
if len(args) not in (1, 2):
ref_invalid_args(self.model, args)
@@ -1333,7 +1356,7 @@ class ReportRefResolver(BaseResolver):
return ''
class ReportSourceResolver(BaseResolver):
class ExposureSourceResolver(BaseResolver):
def __call__(self, *args) -> str:
if len(args) != 2:
raise_compiler_error(
@@ -1344,24 +1367,78 @@ class ReportSourceResolver(BaseResolver):
return ''
def generate_parse_report(
report: ParsedReport,
def generate_parse_exposure(
exposure: ParsedExposure,
config: RuntimeConfig,
manifest: Manifest,
manifest: MacroManifest,
package_name: str,
) -> Dict[str, Any]:
project = config.load_dependencies()[package_name]
return {
'ref': ReportRefResolver(
'ref': ExposureRefResolver(
None,
report,
exposure,
project,
manifest,
),
'source': ReportSourceResolver(
'source': ExposureSourceResolver(
None,
report,
exposure,
project,
manifest,
)
}
# This class is currently used by the schema parser in order
# to limit the number of macros in the context by using
# the TestMacroNamespace
class TestContext(ProviderContext):
def __init__(
self,
model,
config: RuntimeConfig,
manifest: Manifest,
provider: Provider,
context_config: Optional[ContextConfig],
macro_resolver: MacroResolver,
) -> None:
# this must be before super init so that macro_resolver exists for
# build_namespace
self.macro_resolver = macro_resolver
self.thread_ctx = MacroStack()
super().__init__(model, config, manifest, provider, context_config)
self._build_test_namespace()
def _build_namespace(self):
return {}
# this overrides _build_namespace in ManifestContext which provides a
# complete namespace of all macros to only specify macros in the depends_on
# This only provides a namespace with macros in the test node
# 'depends_on.macros' by using the TestMacroNamespace
def _build_test_namespace(self):
depends_on_macros = []
if self.model.depends_on and self.model.depends_on.macros:
depends_on_macros = self.model.depends_on.macros
macro_namespace = TestMacroNamespace(
self.macro_resolver, self.ctx, self.node, self.thread_ctx,
depends_on_macros
)
self._namespace = macro_namespace
def generate_test_context(
model: ManifestNode,
config: RuntimeConfig,
manifest: Manifest,
context_config: ContextConfig,
macro_resolver: MacroResolver
) -> Dict[str, Any]:
ctx = TestContext(
model, config, manifest, ParseProvider(), context_config,
macro_resolver
)
# The 'to_dict' method in ManifestContext moves all of the macro names
# in the macro 'namespace' up to top level keys
return ctx.to_dict()

View File

@@ -2,26 +2,37 @@ import abc
import itertools
from dataclasses import dataclass, field
from typing import (
Any, ClassVar, Dict, Tuple, Iterable, Optional, NewType, List, Callable,
Any, ClassVar, Dict, Tuple, Iterable, Optional, List, Callable,
)
from typing_extensions import Protocol
from hologram import JsonSchemaMixin
from hologram.helpers import (
StrEnum, register_pattern, ExtensibleJsonSchemaMixin
)
from dbt.contracts.util import Replaceable
from dbt.exceptions import InternalException
from dbt.utils import translate_aliases
from dbt.logger import GLOBAL_LOGGER as logger
from typing_extensions import Protocol
from dbt.dataclass_schema import (
dbtClassMixin, StrEnum, ExtensibleDbtClassMixin,
ValidatedStringMixin, register_pattern
)
from dbt.contracts.util import Replaceable
Identifier = NewType('Identifier', str)
class Identifier(ValidatedStringMixin):
ValidationRegex = r'^[A-Za-z_][A-Za-z0-9_]+$'
# we need register_pattern for jsonschema validation
register_pattern(Identifier, r'^[A-Za-z_][A-Za-z0-9_]+$')
@dataclass
class AdapterResponse(dbtClassMixin):
_message: str
code: Optional[str] = None
rows_affected: Optional[int] = None
def __str__(self):
return self._message
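A short usage sketch of the response object defined above, assuming the import path shown elsewhere in this diff (dbt.contracts.connection); the values are illustrative only:

from dbt.contracts.connection import AdapterResponse

# illustrative values, not output from any real adapter
response = AdapterResponse(_message='OK', code='SELECT', rows_affected=3)
assert str(response) == 'OK'
assert response.rows_affected == 3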
class ConnectionState(StrEnum):
INIT = 'init'
OPEN = 'open'
@@ -30,20 +41,19 @@ class ConnectionState(StrEnum):
@dataclass(init=False)
class Connection(ExtensibleJsonSchemaMixin, Replaceable):
class Connection(ExtensibleDbtClassMixin, Replaceable):
type: Identifier
name: Optional[str]
name: Optional[str] = None
state: ConnectionState = ConnectionState.INIT
transaction_open: bool = False
# prevent serialization
_handle: Optional[Any] = None
_credentials: JsonSchemaMixin = field(init=False)
_credentials: Optional[Any] = None
def __init__(
self,
type: Identifier,
name: Optional[str],
credentials: JsonSchemaMixin,
credentials: dbtClassMixin,
state: ConnectionState = ConnectionState.INIT,
transaction_open: bool = False,
handle: Optional[Any] = None,
@@ -85,6 +95,7 @@ class LazyHandle:
"""Opener must be a callable that takes a Connection object and opens the
connection, updating the handle on the Connection.
"""
def __init__(self, opener: Callable[[Connection], Connection]):
self.opener = opener
@@ -102,7 +113,7 @@ class LazyHandle:
# will work.
@dataclass # type: ignore
class Credentials(
ExtensibleJsonSchemaMixin,
ExtensibleDbtClassMixin,
Replaceable,
metaclass=abc.ABCMeta
):
@@ -121,7 +132,7 @@ class Credentials(
) -> Iterable[Tuple[str, Any]]:
"""Return an ordered iterator of key/value pairs for pretty-printing.
"""
as_dict = self.to_dict(omit_none=False, with_aliases=with_aliases)
as_dict = self.to_dict(options={'keep_none': True})
connection_keys = set(self._connection_keys())
aliases: List[str] = []
if with_aliases:
@@ -137,9 +148,10 @@ class Credentials(
raise NotImplementedError
@classmethod
def from_dict(cls, data):
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
data = cls.translate_aliases(data)
return super().from_dict(data)
return data
@classmethod
def translate_aliases(
@@ -147,31 +159,26 @@ class Credentials(
) -> Dict[str, Any]:
return translate_aliases(kwargs, cls._ALIASES, recurse)
def to_dict(self, omit_none=True, validate=False, *, with_aliases=False):
serialized = super().to_dict(omit_none=omit_none, validate=validate)
if with_aliases:
serialized.update({
new_name: serialized[canonical_name]
def __post_serialize__(self, dct, options=None):
# no super() -- do we need it?
if self._ALIASES:
dct.update({
new_name: dct[canonical_name]
for new_name, canonical_name in self._ALIASES.items()
if canonical_name in serialized
if canonical_name in dct
})
return serialized
return dct
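The alias handling above can be pictured with a standalone sketch: incoming dicts may use alias keys, which __pre_deserialize__ rewrites to canonical field names, and __post_serialize__ adds the alias keys back onto the serialized dict. The 'pass' -> 'password' alias here is only an example, not necessarily this class's _ALIASES.

ALIASES = {'pass': 'password'}   # alias -> canonical field name

def translate_aliases(data: dict) -> dict:
    # rewrite alias keys to canonical keys before deserializing
    return {ALIASES.get(key, key): value for key, value in data.items()}

def add_aliases(dct: dict) -> dict:
    # expose alias keys alongside canonical keys after serializing
    dct = dict(dct)
    dct.update({
        alias: dct[canonical]
        for alias, canonical in ALIASES.items()
        if canonical in dct
    })
    return dct

assert translate_aliases({'pass': 'hunter2'}) == {'password': 'hunter2'}
assert add_aliases({'password': 'hunter2'})['pass'] == 'hunter2'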
class UserConfigContract(Protocol):
send_anonymous_usage_stats: bool
use_colors: Optional[bool]
partial_parse: Optional[bool]
printer_width: Optional[int]
use_colors: Optional[bool] = None
partial_parse: Optional[bool] = None
printer_width: Optional[int] = None
def set_values(self, cookie_dir: str) -> None:
...
def to_dict(
self, omit_none: bool = True, validate: bool = False
) -> Dict[str, Any]:
...
class HasCredentials(Protocol):
credentials: Credentials
@@ -205,7 +212,7 @@ DEFAULT_QUERY_COMMENT = '''
@dataclass
class QueryComment(JsonSchemaMixin):
class QueryComment(dbtClassMixin):
comment: str = DEFAULT_QUERY_COMMENT
append: bool = False

View File

@@ -3,7 +3,7 @@ import os
from dataclasses import dataclass, field
from typing import List, Optional, Union
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
from dbt.exceptions import InternalException
@@ -15,7 +15,7 @@ MAXIMUM_SEED_SIZE_NAME = '1MB'
@dataclass
class FilePath(JsonSchemaMixin):
class FilePath(dbtClassMixin):
searched_path: str
relative_path: str
project_root: str
@@ -51,7 +51,7 @@ class FilePath(JsonSchemaMixin):
@dataclass
class FileHash(JsonSchemaMixin):
class FileHash(dbtClassMixin):
name: str # the hash type name
checksum: str # the hashlib.hash_type().hexdigest() of the file contents
@@ -91,7 +91,7 @@ class FileHash(JsonSchemaMixin):
@dataclass
class RemoteFile(JsonSchemaMixin):
class RemoteFile(dbtClassMixin):
@property
def searched_path(self) -> str:
return 'from remote system'
@@ -110,7 +110,7 @@ class RemoteFile(JsonSchemaMixin):
@dataclass
class SourceFile(JsonSchemaMixin):
class SourceFile(dbtClassMixin):
"""Define a source file in dbt"""
path: Union[FilePath, RemoteFile] # the path information
checksum: FileHash
@@ -121,7 +121,7 @@ class SourceFile(JsonSchemaMixin):
docs: List[str] = field(default_factory=list)
macros: List[str] = field(default_factory=list)
sources: List[str] = field(default_factory=list)
reports: List[str] = field(default_factory=list)
exposures: List[str] = field(default_factory=list)
# any node patches in this file. The entries are names, not unique ids!
patches: List[str] = field(default_factory=list)
# any macro patches in this file. The entries are package, name pairs.
@@ -156,7 +156,7 @@ class SourceFile(JsonSchemaMixin):
@classmethod
def big_seed(cls, path: FilePath) -> 'SourceFile':
"""Parse seeds over the size limit with just the path"""
self = cls(path=path, checksum=FileHash.path(path.absolute_path))
self = cls(path=path, checksum=FileHash.path(path.original_file_path))
self.contents = ''
return self

View File

@@ -5,7 +5,7 @@ from dbt.contracts.graph.parsed import (
ParsedDataTestNode,
ParsedHookNode,
ParsedModelNode,
ParsedReport,
ParsedExposure,
ParsedResource,
ParsedRPCNode,
ParsedSchemaTestNode,
@@ -19,19 +19,19 @@ from dbt.contracts.graph.parsed import (
from dbt.node_types import NodeType
from dbt.contracts.util import Replaceable
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
from dataclasses import dataclass, field
from typing import Optional, List, Union, Dict, Type
@dataclass
class InjectedCTE(JsonSchemaMixin, Replaceable):
class InjectedCTE(dbtClassMixin, Replaceable):
id: str
sql: str
@dataclass
class CompiledNodeMixin(JsonSchemaMixin):
class CompiledNodeMixin(dbtClassMixin):
# this is a special mixin class to provide a required argument. If a node
# is missing a `compiled` flag entirely, it must not be a CompiledNode.
compiled: bool
@@ -42,7 +42,7 @@ class CompiledNode(ParsedNode, CompiledNodeMixin):
compiled_sql: Optional[str] = None
extra_ctes_injected: bool = False
extra_ctes: List[InjectedCTE] = field(default_factory=list)
injected_sql: Optional[str] = None
relation_name: Optional[str] = None
def set_cte(self, cte_id: str, sql: str):
"""This is the equivalent of what self.extra_ctes[cte_id] = sql would
@@ -178,8 +178,7 @@ def parsed_instance_for(compiled: CompiledNode) -> ParsedResource:
raise ValueError('invalid resource_type: {}'
.format(compiled.resource_type))
# validate=False to allow extra keys from compiling
return cls.from_dict(compiled.to_dict(), validate=False)
return cls.from_dict(compiled.to_dict())
NonSourceCompiledNode = Union[
@@ -219,8 +218,8 @@ CompileResultNode = Union[
ParsedSourceDefinition,
]
# anything that participates in the graph: sources, reports, manifest nodes
# anything that participates in the graph: sources, exposures, manifest nodes
GraphMemberNode = Union[
CompileResultNode,
ParsedReport,
ParsedExposure,
]

View File

@@ -15,7 +15,7 @@ from dbt.contracts.graph.compiled import (
)
from dbt.contracts.graph.parsed import (
ParsedMacro, ParsedDocumentation, ParsedNodePatch, ParsedMacroPatch,
ParsedSourceDefinition, ParsedReport
ParsedSourceDefinition, ParsedExposure
)
from dbt.contracts.files import SourceFile
from dbt.contracts.util import (
@@ -234,7 +234,8 @@ def build_edges(nodes: List[ManifestNode]):
for node in nodes:
backward_edges[node.unique_id] = node.depends_on_nodes[:]
for unique_id in node.depends_on_nodes:
forward_edges[unique_id].append(node.unique_id)
if unique_id in forward_edges.keys():
forward_edges[unique_id].append(node.unique_id)
return _sort_values(forward_edges), _sort_values(backward_edges)
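The guard added above avoids a KeyError when a node depends on something that is not itself a graph member. A standalone sketch of the same edge building over plain dicts; the unique ids are made up:

from typing import Dict, List, Tuple

def build_edges(depends_on: Dict[str, List[str]]) -> Tuple[dict, dict]:
    forward: Dict[str, List[str]] = {uid: [] for uid in depends_on}
    backward: Dict[str, List[str]] = {}
    for uid, parents in depends_on.items():
        backward[uid] = list(parents)
        for parent in parents:
            if parent in forward:   # skip ids that are not graph members
                forward[parent].append(uid)
    return forward, backward

fwd, bwd = build_edges({'model.a': [], 'model.b': ['model.a', 'source.x']})
assert fwd['model.a'] == ['model.b'] and 'source.x' not in fwd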
@@ -427,153 +428,13 @@ def _update_into(dest: MutableMapping[str, T], new_item: T):
dest[unique_id] = new_item
@dataclass
class Manifest:
"""The manifest for the full graph, after parsing and during compilation.
"""
nodes: MutableMapping[str, ManifestNode]
sources: MutableMapping[str, ParsedSourceDefinition]
macros: MutableMapping[str, ParsedMacro]
docs: MutableMapping[str, ParsedDocumentation]
reports: MutableMapping[str, ParsedReport]
disabled: List[CompileResultNode]
files: MutableMapping[str, SourceFile]
metadata: ManifestMetadata = field(default_factory=ManifestMetadata)
flat_graph: Dict[str, Any] = field(default_factory=dict)
_docs_cache: Optional[DocCache] = None
_sources_cache: Optional[SourceCache] = None
_refs_cache: Optional[RefableCache] = None
_lock: Lock = field(default_factory=flags.MP_CONTEXT.Lock)
@classmethod
def from_macros(
cls,
macros: Optional[MutableMapping[str, ParsedMacro]] = None,
files: Optional[MutableMapping[str, SourceFile]] = None,
) -> 'Manifest':
if macros is None:
macros = {}
if files is None:
files = {}
return cls(
nodes={},
sources={},
macros=macros,
docs={},
reports={},
disabled=[],
files=files,
)
def sync_update_node(
self, new_node: NonSourceCompiledNode
) -> NonSourceCompiledNode:
"""update the node with a lock. The only time we should want to lock is
when compiling an ephemeral ancestor of a node at runtime, because
multiple threads could be just-in-time compiling the same ephemeral
dependency, and we want them to have a consistent view of the manifest.
If the existing node is not compiled, update it with the new node and
return that. If the existing node is compiled, do not update the
manifest and return the existing node.
"""
with self._lock:
existing = self.nodes[new_node.unique_id]
if getattr(existing, 'compiled', False):
# already compiled -> must be a NonSourceCompiledNode
return cast(NonSourceCompiledNode, existing)
_update_into(self.nodes, new_node)
return new_node
def update_report(self, new_report: ParsedReport):
_update_into(self.reports, new_report)
def update_node(self, new_node: ManifestNode):
_update_into(self.nodes, new_node)
def update_source(self, new_source: ParsedSourceDefinition):
_update_into(self.sources, new_source)
def build_flat_graph(self):
"""This attribute is used in context.common by each node, so we want to
only build it once and avoid any concurrency issues around it.
Make sure you don't call this until you're done with building your
manifest!
"""
self.flat_graph = {
'nodes': {
k: v.to_dict(omit_none=False) for k, v in self.nodes.items()
},
'sources': {
k: v.to_dict(omit_none=False) for k, v in self.sources.items()
}
}
def find_disabled_by_name(
self, name: str, package: Optional[str] = None
) -> Optional[ManifestNode]:
searcher: NameSearcher = NameSearcher(
name, package, NodeType.refable()
)
result = searcher.search(self.disabled)
return result
def find_disabled_source_by_name(
self, source_name: str, table_name: str, package: Optional[str] = None
) -> Optional[ParsedSourceDefinition]:
search_name = f'{source_name}.{table_name}'
searcher: NameSearcher = NameSearcher(
search_name, package, [NodeType.Source]
)
result = searcher.search(self.disabled)
if result is not None:
assert isinstance(result, ParsedSourceDefinition)
return result
def _find_macros_by_name(
self,
name: str,
root_project_name: str,
filter: Optional[Callable[[MacroCandidate], bool]] = None
) -> CandidateList:
"""Find macros by their name.
"""
# avoid an import cycle
from dbt.adapters.factory import get_adapter_package_names
candidates: CandidateList = CandidateList()
packages = set(get_adapter_package_names(self.metadata.adapter_type))
for unique_id, macro in self.macros.items():
if macro.name != name:
continue
candidate = MacroCandidate(
locality=_get_locality(macro, root_project_name, packages),
macro=macro,
)
if filter is None or filter(candidate):
candidates.append(candidate)
return candidates
def _materialization_candidates_for(
self, project_name: str,
materialization_name: str,
adapter_type: Optional[str],
) -> CandidateList:
if adapter_type is None:
specificity = Specificity.Default
else:
specificity = Specificity.Adapter
full_name = dbt.utils.get_materialization_macro_name(
materialization_name=materialization_name,
adapter_type=adapter_type,
with_prefix=False,
)
return CandidateList(
MaterializationCandidate.from_macro(m, specificity)
for m in self._find_macros_by_name(full_name, project_name)
)
# This contains macro methods that are in both the Manifest
# and the MacroManifest
class MacroMethods:
# Just to make mypy happy. There must be a better way.
def __init__(self):
self.macros = []
self.metadata = {}
def find_macro_by_name(
self, name: str, root_project_name: str, package: Optional[str]
@@ -619,6 +480,141 @@ class Manifest:
)
return candidates.last()
def _find_macros_by_name(
self,
name: str,
root_project_name: str,
filter: Optional[Callable[[MacroCandidate], bool]] = None
) -> CandidateList:
"""Find macros by their name.
"""
# avoid an import cycle
from dbt.adapters.factory import get_adapter_package_names
candidates: CandidateList = CandidateList()
packages = set(get_adapter_package_names(self.metadata.adapter_type))
for unique_id, macro in self.macros.items():
if macro.name != name:
continue
candidate = MacroCandidate(
locality=_get_locality(macro, root_project_name, packages),
macro=macro,
)
if filter is None or filter(candidate):
candidates.append(candidate)
return candidates
@dataclass
class Manifest(MacroMethods):
"""The manifest for the full graph, after parsing and during compilation.
"""
# These attributes are both positional and by keyword. If an attribute
# is added, it must also be added to the args tuple in the __reduce_ex__
# method, in the right position.
nodes: MutableMapping[str, ManifestNode]
sources: MutableMapping[str, ParsedSourceDefinition]
macros: MutableMapping[str, ParsedMacro]
docs: MutableMapping[str, ParsedDocumentation]
exposures: MutableMapping[str, ParsedExposure]
selectors: MutableMapping[str, Any]
disabled: List[CompileResultNode]
files: MutableMapping[str, SourceFile]
metadata: ManifestMetadata = field(default_factory=ManifestMetadata)
flat_graph: Dict[str, Any] = field(default_factory=dict)
_docs_cache: Optional[DocCache] = None
_sources_cache: Optional[SourceCache] = None
_refs_cache: Optional[RefableCache] = None
_lock: Lock = field(default_factory=flags.MP_CONTEXT.Lock)
def sync_update_node(
self, new_node: NonSourceCompiledNode
) -> NonSourceCompiledNode:
"""update the node with a lock. The only time we should want to lock is
when compiling an ephemeral ancestor of a node at runtime, because
multiple threads could be just-in-time compiling the same ephemeral
dependency, and we want them to have a consistent view of the manifest.
If the existing node is not compiled, update it with the new node and
return that. If the existing node is compiled, do not update the
manifest and return the existing node.
"""
with self._lock:
existing = self.nodes[new_node.unique_id]
if getattr(existing, 'compiled', False):
# already compiled -> must be a NonSourceCompiledNode
return cast(NonSourceCompiledNode, existing)
_update_into(self.nodes, new_node)
return new_node
def update_exposure(self, new_exposure: ParsedExposure):
_update_into(self.exposures, new_exposure)
def update_node(self, new_node: ManifestNode):
_update_into(self.nodes, new_node)
def update_source(self, new_source: ParsedSourceDefinition):
_update_into(self.sources, new_source)
def build_flat_graph(self):
"""This attribute is used in context.common by each node, so we want to
only build it once and avoid any concurrency issues around it.
Make sure you don't call this until you're done with building your
manifest!
"""
self.flat_graph = {
'nodes': {
k: v.to_dict(options={'keep_none': True})
for k, v in self.nodes.items()
},
'sources': {
k: v.to_dict(options={'keep_none': True})
for k, v in self.sources.items()
}
}
def find_disabled_by_name(
self, name: str, package: Optional[str] = None
) -> Optional[ManifestNode]:
searcher: NameSearcher = NameSearcher(
name, package, NodeType.refable()
)
result = searcher.search(self.disabled)
return result
def find_disabled_source_by_name(
self, source_name: str, table_name: str, package: Optional[str] = None
) -> Optional[ParsedSourceDefinition]:
search_name = f'{source_name}.{table_name}'
searcher: NameSearcher = NameSearcher(
search_name, package, [NodeType.Source]
)
result = searcher.search(self.disabled)
if result is not None:
assert isinstance(result, ParsedSourceDefinition)
return result
def _materialization_candidates_for(
self, project_name: str,
materialization_name: str,
adapter_type: Optional[str],
) -> CandidateList:
if adapter_type is None:
specificity = Specificity.Default
else:
specificity = Specificity.Adapter
full_name = dbt.utils.get_materialization_macro_name(
materialization_name=materialization_name,
adapter_type=adapter_type,
with_prefix=False,
)
return CandidateList(
MaterializationCandidate.from_macro(m, specificity)
for m in self._find_macros_by_name(full_name, project_name)
)
def find_materialization_macro_by_name(
self, project_name: str, materialization_name: str, adapter_type: str
) -> Optional[ParsedMacro]:
@@ -729,9 +725,10 @@ class Manifest:
sources={k: _deepcopy(v) for k, v in self.sources.items()},
macros={k: _deepcopy(v) for k, v in self.macros.items()},
docs={k: _deepcopy(v) for k, v in self.docs.items()},
reports={k: _deepcopy(v) for k, v in self.reports.items()},
disabled=[_deepcopy(n) for n in self.disabled],
exposures={k: _deepcopy(v) for k, v in self.exposures.items()},
selectors=self.root_project.manifest_selectors,
metadata=self.metadata,
disabled=[_deepcopy(n) for n in self.disabled],
files={k: _deepcopy(v) for k, v in self.files.items()},
)
@@ -739,7 +736,7 @@ class Manifest:
edge_members = list(chain(
self.nodes.values(),
self.sources.values(),
self.reports.values(),
self.exposures.values(),
))
forward_edges, backward_edges = build_edges(edge_members)
@@ -748,17 +745,18 @@ class Manifest:
sources=self.sources,
macros=self.macros,
docs=self.docs,
reports=self.reports,
exposures=self.exposures,
selectors=self.selectors,
metadata=self.metadata,
disabled=self.disabled,
child_map=forward_edges,
parent_map=backward_edges,
)
def to_dict(self, omit_none=True, validate=False):
return self.writable_manifest().to_dict(
omit_none=omit_none, validate=validate
)
# When 'to_dict' is called on the Manifest, it substitutes a
# WritableManifest
def __pre_serialize__(self, options=None):
return self.writable_manifest()
def write(self, path):
self.writable_manifest().write(path)
@@ -768,8 +766,8 @@ class Manifest:
return self.nodes[unique_id]
elif unique_id in self.sources:
return self.sources[unique_id]
elif unique_id in self.reports:
return self.reports[unique_id]
elif unique_id in self.exposures:
return self.exposures[unique_id]
else:
# something terrible has happened
raise dbt.exceptions.InternalException(
@@ -880,6 +878,7 @@ class Manifest:
def merge_from_artifact(
self,
adapter,
other: 'WritableManifest',
selected: AbstractSet[UniqueID],
) -> None:
@@ -891,10 +890,14 @@ class Manifest:
refables = set(NodeType.refable())
merged = set()
for unique_id, node in other.nodes.items():
if (
current = self.nodes.get(unique_id)
if current and (
node.resource_type in refables and
not node.is_ephemeral and
unique_id not in selected
unique_id not in selected and
not adapter.get_relation(
current.database, current.schema, current.identifier
)
):
merged.add(unique_id)
self.nodes[unique_id] = node.replace(deferred=True)
@@ -905,14 +908,21 @@ class Manifest:
f'Merged {len(merged)} items from state (sample: {sample})'
)
# provide support for copy.deepcopy() - we jsut need to avoid the lock!
# Provide support for copy.deepcopy() - we just need to avoid the lock!
# pickle and deepcopy use this. It returns a callable object used to
# create the initial version of the object and a tuple of arguments
# for the object, i.e. the Manifest.
# The order of the arguments must match the order of the attributes
# in the Manifest class declaration, because they are used as
# positional arguments to construct a Manifest.
def __reduce_ex__(self, protocol):
args = (
self.nodes,
self.sources,
self.macros,
self.docs,
self.reports,
self.exposures,
self.selectors,
self.disabled,
self.files,
self.metadata,
@@ -924,6 +934,19 @@ class Manifest:
return self.__class__, args
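A self-contained sketch of the __reduce_ex__ trick described above (a toy class, not the Manifest): returning the class plus a tuple of constructor arguments lets deepcopy and pickle rebuild the object, so the unpicklable lock is recreated rather than copied.

import copy
import threading

class Registry:
    def __init__(self, nodes: dict, sources: dict):
        self.nodes = nodes
        self.sources = sources
        self._lock = threading.Lock()   # cannot be deep-copied or pickled

    def __reduce_ex__(self, protocol):
        # the args order must match __init__'s positional parameters
        return self.__class__, (self.nodes, self.sources)

original = Registry({'model.a': 1}, {})
clone = copy.deepcopy(original)
assert clone.nodes == original.nodes
assert clone._lock is not original._lock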
class MacroManifest(MacroMethods):
def __init__(self, macros, files):
self.macros = macros
self.files = files
self.metadata = ManifestMetadata()
# This is returned by the 'graph' context property
# in the ProviderContext class.
self.flat_graph = {}
AnyManifest = Union[Manifest, MacroManifest]
@dataclass
@schema_version('manifest', 1)
class WritableManifest(ArtifactMixin):
@@ -947,9 +970,14 @@ class WritableManifest(ArtifactMixin):
'The docs defined in the dbt project and its dependencies'
))
)
reports: Mapping[UniqueID, ParsedReport] = field(
exposures: Mapping[UniqueID, ParsedExposure] = field(
metadata=dict(description=(
'The reports defined in the dbt project and its dependencies'
'The exposures defined in the dbt project and its dependencies'
))
)
selectors: Mapping[UniqueID, Any] = field(
metadata=dict(description=(
'The selectors defined in selectors.yml'
))
)
disabled: Optional[List[CompileResultNode]] = field(metadata=dict(

View File

@@ -2,19 +2,12 @@ from dataclasses import field, Field, dataclass
from enum import Enum
from itertools import chain
from typing import (
Any, List, Optional, Dict, MutableMapping, Union, Type, NewType, Tuple,
TypeVar, Callable
Any, List, Optional, Dict, MutableMapping, Union, Type,
TypeVar, Callable,
)
from dbt.dataclass_schema import (
dbtClassMixin, ValidationError, register_pattern,
)
# TODO: patch+upgrade hologram to avoid this jsonschema import
import jsonschema # type: ignore
# This is protected, but we really do want to reuse this logic, and the cache!
# It would be nice to move the custom error picking stuff into hologram!
from hologram import _validate_schema
from hologram import JsonSchemaMixin, ValidationError
from hologram.helpers import StrEnum, register_pattern
from dbt.contracts.graph.unparsed import AdditionalPropertiesAllowed
from dbt.exceptions import CompilationException, InternalException
from dbt.contracts.util import Replaceable, list_str
@@ -170,22 +163,15 @@ def insensitive_patterns(*patterns: str):
return '^({})$'.format('|'.join(lowercased))
Severity = NewType('Severity', str)
class Severity(str):
pass
register_pattern(Severity, insensitive_patterns('warn', 'error'))
class SnapshotStrategy(StrEnum):
Timestamp = 'timestamp'
Check = 'check'
class All(StrEnum):
All = 'all'
@dataclass
class Hook(JsonSchemaMixin, Replaceable):
class Hook(dbtClassMixin, Replaceable):
sql: str
transaction: bool = True
index: Optional[int] = None
@@ -313,29 +299,6 @@ class BaseConfig(
)
return result
def to_dict(
self,
omit_none: bool = True,
validate: bool = False,
*,
omit_hidden: bool = True,
) -> Dict[str, Any]:
result = super().to_dict(omit_none=omit_none, validate=validate)
if omit_hidden and not omit_none:
for fld, target_field in self._get_fields():
if target_field not in result:
continue
# if the field is not None, preserve it regardless of the
# setting. This is in line with existing behavior, but isn't
# an endorsement of it!
if result[target_field] is not None:
continue
if not ShowBehavior.should_show(fld):
del result[target_field]
return result
def update_from(
self: T, data: Dict[str, Any], adapter_type: str, validate: bool = True
) -> T:
@@ -344,7 +307,7 @@ class BaseConfig(
"""
# sadly, this is a circular import
from dbt.adapters.factory import get_config_class_by_name
dct = self.to_dict(omit_none=False, validate=False, omit_hidden=False)
dct = self.to_dict(options={'keep_none': True})
adapter_config_cls = get_config_class_by_name(adapter_type)
@@ -358,21 +321,23 @@ class BaseConfig(
dct.update(data)
# any validation failures must have come from the update
return self.from_dict(dct, validate=validate)
if validate:
self.validate(dct)
return self.from_dict(dct)
def finalize_and_validate(self: T) -> T:
# from_dict will validate for us
dct = self.to_dict(omit_none=False, validate=False)
dct = self.to_dict(options={'keep_none': True})
self.validate(dct)
return self.from_dict(dct)
def replace(self, **kwargs):
dct = self.to_dict(validate=False)
dct = self.to_dict()
mapping = self.field_mapping()
for key, value in kwargs.items():
new_key = mapping.get(key, key)
dct[new_key] = value
return self.from_dict(dct, validate=False)
return self.from_dict(dct)
@dataclass
@@ -431,12 +396,33 @@ class NodeConfig(BaseConfig):
full_refresh: Optional[bool] = None
@classmethod
def from_dict(cls, data, validate=True):
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
field_map = {'post-hook': 'post_hook', 'pre-hook': 'pre_hook'}
# create a new dict because otherwise it gets overwritten in
# tests
new_dict = {}
for key in data:
new_dict[key] = data[key]
data = new_dict
for key in hooks.ModelHookType:
if key in data:
data[key] = [hooks.get_hook_dict(h) for h in data[key]]
return super().from_dict(data, validate=validate)
for field_name in field_map:
if field_name in data:
new_name = field_map[field_name]
data[new_name] = data.pop(field_name)
return data
def __post_serialize__(self, dct, options=None):
dct = super().__post_serialize__(dct, options=options)
field_map = {'post_hook': 'post-hook', 'pre_hook': 'pre-hook'}
for field_name in field_map:
if field_name in dct:
dct[field_map[field_name]] = dct.pop(field_name)
return dct
# this is still used by jsonschema validation
@classmethod
def field_mapping(cls):
return {'post_hook': 'post-hook', 'pre_hook': 'pre-hook'}
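The hook key mapping above amounts to renaming dashed YAML keys to underscored field names on the way in, and back again on the way out. A standalone sketch with plain functions instead of the mixin hooks:

PRE_MAP = {'post-hook': 'post_hook', 'pre-hook': 'pre_hook'}
POST_MAP = {canonical: yaml_key for yaml_key, canonical in PRE_MAP.items()}

def pre_deserialize(data: dict) -> dict:
    data = dict(data)   # copy so the caller's dict is not mutated
    for yaml_key, field_name in PRE_MAP.items():
        if yaml_key in data:
            data[field_name] = data.pop(yaml_key)
    return data

def post_serialize(dct: dict) -> dict:
    dct = dict(dct)
    for field_name, yaml_key in POST_MAP.items():
        if field_name in dct:
            dct[yaml_key] = dct.pop(field_name)
    return dct

assert pre_deserialize({'post-hook': ['grant select on {{ this }}']}) == \
    {'post_hook': ['grant select on {{ this }}']}
assert post_serialize({'pre_hook': []}) == {'pre-hook': []}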
@@ -450,169 +436,53 @@ class SeedConfig(NodeConfig):
@dataclass
class TestConfig(NodeConfig):
materialized: str = 'test'
severity: Severity = Severity('ERROR')
SnapshotVariants = Union[
'TimestampSnapshotConfig',
'CheckSnapshotConfig',
'GenericSnapshotConfig',
]
def _relevance_without_strategy(error: jsonschema.ValidationError):
# calculate the 'relevance' of an error the normal jsonschema way, except
# if the validator is in the 'strategy' field and its conflicting with the
# 'enum'. This suppresses `"'timestamp' is not one of ['check']` and such
if 'strategy' in error.path and error.validator in {'enum', 'not'}:
length = 1
else:
length = -len(error.path)
validator = error.validator
return length, validator not in {'anyOf', 'oneOf'}
@dataclass
class SnapshotWrapper(JsonSchemaMixin):
"""This is a little wrapper to let us serialize/deserialize the
SnapshotVariants union.
"""
config: SnapshotVariants # mypy: ignore
@classmethod
def validate(cls, data: Any):
schema = _validate_schema(cls)
validator = jsonschema.Draft7Validator(schema)
error = jsonschema.exceptions.best_match(
validator.iter_errors(data),
key=_relevance_without_strategy,
)
if error is not None:
raise ValidationError.create_from(error) from error
@dataclass
class EmptySnapshotConfig(NodeConfig):
materialized: str = 'snapshot'
@dataclass(init=False)
@dataclass
class SnapshotConfig(EmptySnapshotConfig):
unique_key: str = field(init=False, metadata=dict(init_required=True))
target_schema: str = field(init=False, metadata=dict(init_required=True))
strategy: Optional[str] = None
unique_key: Optional[str] = None
target_schema: Optional[str] = None
target_database: Optional[str] = None
updated_at: Optional[str] = None
check_cols: Optional[Union[str, List[str]]] = None
def __init__(
self,
unique_key: str,
target_schema: str,
target_database: Optional[str] = None,
**kwargs
) -> None:
self.unique_key = unique_key
self.target_schema = target_schema
self.target_database = target_database
# kwargs['materialized'] = materialized
super().__init__(**kwargs)
# type hacks...
@classmethod
def _get_fields(cls) -> List[Tuple[Field, str]]: # type: ignore
fields: List[Tuple[Field, str]] = []
for old_field, name in super()._get_fields():
new_field = old_field
# tell hologram we're really an initvar
if old_field.metadata and old_field.metadata.get('init_required'):
new_field = field(init=True, metadata=old_field.metadata)
new_field.name = old_field.name
new_field.type = old_field.type
new_field._field_type = old_field._field_type # type: ignore
fields.append((new_field, name))
return fields
def validate(cls, data):
super().validate(data)
if data.get('strategy') == 'check':
if not data.get('check_cols'):
raise ValidationError(
"A snapshot configured with the check strategy must "
"specify a check_cols configuration.")
if (isinstance(data['check_cols'], str) and
data['check_cols'] != 'all'):
raise ValidationError(
f"Invalid value for 'check_cols': {data['check_cols']}. "
"Expected 'all' or a list of strings.")
def finalize_and_validate(self: 'SnapshotConfig') -> SnapshotVariants:
elif data.get('strategy') == 'timestamp':
if not data.get('updated_at'):
raise ValidationError(
"A snapshot configured with the timestamp strategy "
"must specify an updated_at configuration.")
if data.get('check_cols'):
raise ValidationError(
"A 'timestamp' snapshot should not have 'check_cols'")
# If the strategy is not 'check' or 'timestamp' it's a custom strategy,
# formerly supported with GenericSnapshotConfig
def finalize_and_validate(self):
data = self.to_dict()
return SnapshotWrapper.from_dict({'config': data}).config
@dataclass(init=False)
class GenericSnapshotConfig(SnapshotConfig):
strategy: str = field(init=False, metadata=dict(init_required=True))
def __init__(self, strategy: str, **kwargs) -> None:
self.strategy = strategy
super().__init__(**kwargs)
@classmethod
def _collect_json_schema(
cls, definitions: Dict[str, Any]
) -> Dict[str, Any]:
# this is the method you want to override in hologram if you want
# to do clever things about the json schema and have classes that
# contain instances of your JsonSchemaMixin respect the change.
schema = super()._collect_json_schema(definitions)
# Instead of just the strategy we'd calculate normally, say
# "this strategy except none of our specialization strategies".
strategies = [schema['properties']['strategy']]
for specialization in (TimestampSnapshotConfig, CheckSnapshotConfig):
strategies.append(
{'not': specialization.json_schema()['properties']['strategy']}
)
schema['properties']['strategy'] = {
'allOf': strategies
}
return schema
@dataclass(init=False)
class TimestampSnapshotConfig(SnapshotConfig):
strategy: str = field(
init=False,
metadata=dict(
restrict=[str(SnapshotStrategy.Timestamp)],
init_required=True,
),
)
updated_at: str = field(init=False, metadata=dict(init_required=True))
def __init__(
self, strategy: str, updated_at: str, **kwargs
) -> None:
self.strategy = strategy
self.updated_at = updated_at
super().__init__(**kwargs)
@dataclass(init=False)
class CheckSnapshotConfig(SnapshotConfig):
strategy: str = field(
init=False,
metadata=dict(
restrict=[str(SnapshotStrategy.Check)],
init_required=True,
),
)
# TODO: is there a way to get this to accept tuples of strings? Adding
# `Tuple[str, ...]` to the list of types results in this:
# ['email'] is valid under each of {'type': 'array', 'items':
# {'type': 'string'}}, {'type': 'array', 'items': {'type': 'string'}}
# but without it, parsing gets upset about values like `('email',)`
# maybe hologram itself should support this behavior? It's not like tuples
# are meaningful in json
check_cols: Union[All, List[str]] = field(
init=False,
metadata=dict(init_required=True),
)
def __init__(
self, strategy: str, check_cols: Union[All, List[str]],
**kwargs
) -> None:
self.strategy = strategy
self.check_cols = check_cols
super().__init__(**kwargs)
self.validate(data)
return self.from_dict(data)
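For illustration, the strategy checks consolidated into the snapshot config validation above behave roughly like this standalone function (error types and messages are simplified here; any strategy other than 'check' or 'timestamp' is treated as a custom strategy and passes through):

def validate_snapshot_config(data: dict) -> None:
    strategy = data.get('strategy')
    if strategy == 'check':
        check_cols = data.get('check_cols')
        if not check_cols:
            raise ValueError('check strategy requires check_cols')
        if isinstance(check_cols, str) and check_cols != 'all':
            raise ValueError(f'invalid check_cols: {check_cols!r}')
    elif strategy == 'timestamp':
        if not data.get('updated_at'):
            raise ValueError('timestamp strategy requires updated_at')
        if data.get('check_cols'):
            raise ValueError("a 'timestamp' snapshot should not have check_cols")

validate_snapshot_config({'strategy': 'check', 'check_cols': ['email']})
validate_snapshot_config({'strategy': 'timestamp', 'updated_at': 'updated_at'})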
RESOURCE_TYPES: Dict[NodeType, Type[BaseConfig]] = {

View File

@@ -13,8 +13,9 @@ from typing import (
TypeVar,
)
from hologram import JsonSchemaMixin
from hologram.helpers import ExtensibleJsonSchemaMixin
from dbt.dataclass_schema import (
dbtClassMixin, ExtensibleDbtClassMixin
)
from dbt.clients.system import write_file
from dbt.contracts.files import FileHash, MAXIMUM_SEED_SIZE_NAME
@@ -23,7 +24,7 @@ from dbt.contracts.graph.unparsed import (
UnparsedBaseNode, FreshnessThreshold, ExternalTable,
HasYamlMetadata, MacroArgument, UnparsedSourceDefinition,
UnparsedSourceTableDefinition, UnparsedColumn, TestDef,
ReportOwner, ExposureType, MaturityType
ExposureOwner, ExposureType, MaturityType
)
from dbt.contracts.util import Replaceable, AdditionalPropertiesMixin
from dbt.exceptions import warn_or_error
@@ -38,20 +39,14 @@ from .model_config import (
TestConfig,
SourceConfig,
EmptySnapshotConfig,
SnapshotVariants,
)
# import these 3 so the SnapshotVariants forward ref works.
from .model_config import ( # noqa
TimestampSnapshotConfig,
CheckSnapshotConfig,
GenericSnapshotConfig,
SnapshotConfig,
)
@dataclass
class ColumnInfo(
AdditionalPropertiesMixin,
ExtensibleJsonSchemaMixin,
ExtensibleDbtClassMixin,
Replaceable
):
name: str
@@ -64,7 +59,7 @@ class ColumnInfo(
@dataclass
class HasFqn(JsonSchemaMixin, Replaceable):
class HasFqn(dbtClassMixin, Replaceable):
fqn: List[str]
def same_fqn(self, other: 'HasFqn') -> bool:
@@ -72,12 +67,12 @@ class HasFqn(JsonSchemaMixin, Replaceable):
@dataclass
class HasUniqueID(JsonSchemaMixin, Replaceable):
class HasUniqueID(dbtClassMixin, Replaceable):
unique_id: str
@dataclass
class MacroDependsOn(JsonSchemaMixin, Replaceable):
class MacroDependsOn(dbtClassMixin, Replaceable):
macros: List[str] = field(default_factory=list)
# 'in' on lists is O(n) so this is O(n^2) for # of macros
@@ -96,12 +91,22 @@ class DependsOn(MacroDependsOn):
@dataclass
class HasRelationMetadata(JsonSchemaMixin, Replaceable):
class HasRelationMetadata(dbtClassMixin, Replaceable):
database: Optional[str]
schema: str
# We can't default 'database' to None the way it ought to be,
# because that breaks the subclasses and their default parameters,
# so we hack around it here.
@classmethod
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
if 'database' not in data:
data['database'] = None
return data
class ParsedNodeMixins(JsonSchemaMixin):
class ParsedNodeMixins(dbtClassMixin):
resource_type: NodeType
depends_on: DependsOn
config: NodeConfig
@@ -132,8 +137,12 @@ class ParsedNodeMixins(JsonSchemaMixin):
self.meta = patch.meta
self.docs = patch.docs
if flags.STRICT_MODE:
assert isinstance(self, JsonSchemaMixin)
self.to_dict(validate=True, omit_none=False)
# It seems odd that an instance can be invalid
# Maybe there should be validation or restrictions
# elsewhere?
assert isinstance(self, dbtClassMixin)
dct = self.to_dict(options={'keep_none': True})
self.validate(dct)
def get_materialization(self):
return self.config.materialized
@@ -335,14 +344,14 @@ class ParsedSeedNode(ParsedNode):
@dataclass
class TestMetadata(JsonSchemaMixin, Replaceable):
namespace: Optional[str]
class TestMetadata(dbtClassMixin, Replaceable):
name: str
kwargs: Dict[str, Any]
kwargs: Dict[str, Any] = field(default_factory=dict)
namespace: Optional[str] = None
@dataclass
class HasTestMetadata(JsonSchemaMixin):
class HasTestMetadata(dbtClassMixin):
test_metadata: TestMetadata
@@ -394,7 +403,7 @@ class IntermediateSnapshotNode(ParsedNode):
@dataclass
class ParsedSnapshotNode(ParsedNode):
resource_type: NodeType = field(metadata={'restrict': [NodeType.Snapshot]})
config: SnapshotVariants
config: SnapshotConfig
@dataclass
@@ -443,8 +452,10 @@ class ParsedMacro(UnparsedBaseNode, HasUniqueID):
self.docs = patch.docs
self.arguments = patch.arguments
if flags.STRICT_MODE:
assert isinstance(self, JsonSchemaMixin)
self.to_dict(validate=True, omit_none=False)
# What does this actually validate?
assert isinstance(self, dbtClassMixin)
dct = self.to_dict(options={'keep_none': True})
self.validate(dct)
def same_contents(self, other: Optional['ParsedMacro']) -> bool:
if other is None:
@@ -555,6 +566,7 @@ class ParsedSourceDefinition(
config: SourceConfig = field(default_factory=SourceConfig)
patch_path: Optional[Path] = None
unrendered_config: Dict[str, Any] = field(default_factory=dict)
relation_name: Optional[str] = None
def same_database_representation(
self, other: 'ParsedSourceDefinition'
@@ -648,14 +660,14 @@ class ParsedSourceDefinition(
@dataclass
class ParsedReport(UnparsedBaseNode, HasUniqueID, HasFqn):
class ParsedExposure(UnparsedBaseNode, HasUniqueID, HasFqn):
name: str
type: ExposureType
owner: ReportOwner
resource_type: NodeType = NodeType.Report
owner: ExposureOwner
resource_type: NodeType = NodeType.Exposure
description: str = ''
maturity: Optional[MaturityType] = None
url: Optional[str] = None
description: Optional[str] = None
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[List[str]] = field(default_factory=list)
sources: List[List[str]] = field(default_factory=list)
@@ -673,25 +685,25 @@ class ParsedReport(UnparsedBaseNode, HasUniqueID, HasFqn):
def tags(self):
return []
def same_depends_on(self, old: 'ParsedReport') -> bool:
def same_depends_on(self, old: 'ParsedExposure') -> bool:
return set(self.depends_on.nodes) == set(old.depends_on.nodes)
def same_description(self, old: 'ParsedReport') -> bool:
def same_description(self, old: 'ParsedExposure') -> bool:
return self.description == old.description
def same_maturity(self, old: 'ParsedReport') -> bool:
def same_maturity(self, old: 'ParsedExposure') -> bool:
return self.maturity == old.maturity
def same_owner(self, old: 'ParsedReport') -> bool:
def same_owner(self, old: 'ParsedExposure') -> bool:
return self.owner == old.owner
def same_exposure_type(self, old: 'ParsedReport') -> bool:
def same_exposure_type(self, old: 'ParsedExposure') -> bool:
return self.type == old.type
def same_url(self, old: 'ParsedReport') -> bool:
def same_url(self, old: 'ParsedExposure') -> bool:
return self.url == old.url
def same_contents(self, old: Optional['ParsedReport']) -> bool:
def same_contents(self, old: Optional['ParsedExposure']) -> bool:
# existing when it didn't before is a change!
if old is None:
return True
@@ -712,6 +724,6 @@ ParsedResource = Union[
ParsedDocumentation,
ParsedMacro,
ParsedNode,
ParsedReport,
ParsedExposure,
ParsedSourceDefinition,
]

View File

@@ -8,8 +8,9 @@ from dbt.contracts.util import (
import dbt.helper_types # noqa:F401
from dbt.exceptions import CompilationException
from hologram import JsonSchemaMixin
from hologram.helpers import StrEnum, ExtensibleJsonSchemaMixin
from dbt.dataclass_schema import (
dbtClassMixin, StrEnum, ExtensibleDbtClassMixin
)
from dataclasses import dataclass, field
from datetime import timedelta
@@ -18,7 +19,7 @@ from typing import Optional, List, Union, Dict, Any, Sequence
@dataclass
class UnparsedBaseNode(JsonSchemaMixin, Replaceable):
class UnparsedBaseNode(dbtClassMixin, Replaceable):
package_name: str
root_path: str
path: str
@@ -66,12 +67,12 @@ class UnparsedRunHook(UnparsedNode):
@dataclass
class Docs(JsonSchemaMixin, Replaceable):
class Docs(dbtClassMixin, Replaceable):
show: bool = True
@dataclass
class HasDocs(AdditionalPropertiesMixin, ExtensibleJsonSchemaMixin,
class HasDocs(AdditionalPropertiesMixin, ExtensibleDbtClassMixin,
Replaceable):
name: str
description: str = ''
@@ -100,7 +101,7 @@ class UnparsedColumn(HasTests):
@dataclass
class HasColumnDocs(JsonSchemaMixin, Replaceable):
class HasColumnDocs(dbtClassMixin, Replaceable):
columns: Sequence[HasDocs] = field(default_factory=list)
@@ -110,7 +111,7 @@ class HasColumnTests(HasColumnDocs):
@dataclass
class HasYamlMetadata(JsonSchemaMixin):
class HasYamlMetadata(dbtClassMixin):
original_file_path: str
yaml_key: str
package_name: str
@@ -127,7 +128,7 @@ class UnparsedNodeUpdate(HasColumnTests, HasTests, HasYamlMetadata):
@dataclass
class MacroArgument(JsonSchemaMixin):
class MacroArgument(dbtClassMixin):
name: str
type: Optional[str] = None
description: str = ''
@@ -148,7 +149,7 @@ class TimePeriod(StrEnum):
@dataclass
class Time(JsonSchemaMixin, Replaceable):
class Time(dbtClassMixin, Replaceable):
count: int
period: TimePeriod
@@ -158,19 +159,14 @@ class Time(JsonSchemaMixin, Replaceable):
return actual_age > difference
class FreshnessStatus(StrEnum):
Pass = 'pass'
Warn = 'warn'
Error = 'error'
@dataclass
class FreshnessThreshold(JsonSchemaMixin, Mergeable):
class FreshnessThreshold(dbtClassMixin, Mergeable):
warn_after: Optional[Time] = None
error_after: Optional[Time] = None
filter: Optional[str] = None
def status(self, age: float) -> FreshnessStatus:
def status(self, age: float) -> "dbt.contracts.results.FreshnessStatus":
from dbt.contracts.results import FreshnessStatus
if self.error_after and self.error_after.exceeded(age):
return FreshnessStatus.Error
elif self.warn_after and self.warn_after.exceeded(age):
@@ -185,7 +181,7 @@ class FreshnessThreshold(JsonSchemaMixin, Mergeable):
@dataclass
class AdditionalPropertiesAllowed(
AdditionalPropertiesMixin,
ExtensibleJsonSchemaMixin
ExtensibleDbtClassMixin
):
_extra: Dict[str, Any] = field(default_factory=dict)
@@ -217,7 +213,7 @@ class ExternalTable(AdditionalPropertiesAllowed, Mergeable):
@dataclass
class Quoting(JsonSchemaMixin, Mergeable):
class Quoting(dbtClassMixin, Mergeable):
database: Optional[bool] = None
schema: Optional[bool] = None
identifier: Optional[bool] = None
@@ -235,15 +231,18 @@ class UnparsedSourceTableDefinition(HasColumnTests, HasTests):
external: Optional[ExternalTable] = None
tags: List[str] = field(default_factory=list)
def to_dict(self, omit_none=True, validate=False):
result = super().to_dict(omit_none=omit_none, validate=validate)
if omit_none and self.freshness is None:
result['freshness'] = None
return result
def __post_serialize__(self, dct, options=None):
dct = super().__post_serialize__(dct)
keep_none = False
if options and 'keep_none' in options and options['keep_none']:
keep_none = True
if not keep_none and self.freshness is None:
dct['freshness'] = None
return dct
@dataclass
class UnparsedSourceDefinition(JsonSchemaMixin, Replaceable):
class UnparsedSourceDefinition(dbtClassMixin, Replaceable):
name: str
description: str = ''
meta: Dict[str, Any] = field(default_factory=dict)
@@ -262,15 +261,18 @@ class UnparsedSourceDefinition(JsonSchemaMixin, Replaceable):
def yaml_key(self) -> 'str':
return 'sources'
def to_dict(self, omit_none=True, validate=False):
result = super().to_dict(omit_none=omit_none, validate=validate)
if omit_none and self.freshness is None:
result['freshness'] = None
return result
def __post_serialize__(self, dct, options=None):
dct = super().__post_serialize__(dct)
keep_none = False
if options and 'keep_none' in options and options['keep_none']:
keep_none = True
if not keep_none and self.freshness is None:
dct['freshness'] = None
return dct
@dataclass
class SourceTablePatch(JsonSchemaMixin):
class SourceTablePatch(dbtClassMixin):
name: str
description: Optional[str] = None
meta: Optional[Dict[str, Any]] = None
@@ -288,7 +290,7 @@ class SourceTablePatch(JsonSchemaMixin):
columns: Optional[Sequence[UnparsedColumn]] = None
def to_patch_dict(self) -> Dict[str, Any]:
dct = self.to_dict(omit_none=True)
dct = self.to_dict()
remove_keys = ('name',)
for key in remove_keys:
if key in dct:
@@ -301,7 +303,7 @@ class SourceTablePatch(JsonSchemaMixin):
@dataclass
class SourcePatch(JsonSchemaMixin, Replaceable):
class SourcePatch(dbtClassMixin, Replaceable):
name: str = field(
metadata=dict(description='The name of the source to override'),
)
@@ -325,7 +327,7 @@ class SourcePatch(JsonSchemaMixin, Replaceable):
tags: Optional[List[str]] = None
def to_patch_dict(self) -> Dict[str, Any]:
dct = self.to_dict(omit_none=True)
dct = self.to_dict()
remove_keys = ('name', 'overrides', 'tables', 'path')
for key in remove_keys:
if key in dct:
@@ -345,7 +347,7 @@ class SourcePatch(JsonSchemaMixin, Replaceable):
@dataclass
class UnparsedDocumentation(JsonSchemaMixin, Replaceable):
class UnparsedDocumentation(dbtClassMixin, Replaceable):
package_name: str
root_path: str
path: str
@@ -405,17 +407,17 @@ class MaturityType(StrEnum):
@dataclass
class ReportOwner(JsonSchemaMixin, Replaceable):
class ExposureOwner(dbtClassMixin, Replaceable):
email: str
name: Optional[str] = None
@dataclass
class UnparsedReport(JsonSchemaMixin, Replaceable):
class UnparsedExposure(dbtClassMixin, Replaceable):
name: str
type: ExposureType
owner: ReportOwner
owner: ExposureOwner
description: str = ''
maturity: Optional[MaturityType] = None
url: Optional[str] = None
description: Optional[str] = None
depends_on: List[str] = field(default_factory=list)


@@ -4,25 +4,39 @@ from dbt.helper_types import NoValue
from dbt.logger import GLOBAL_LOGGER as logger # noqa
from dbt import tracking
from dbt import ui
from hologram import JsonSchemaMixin, ValidationError
from hologram.helpers import HyphenatedJsonSchemaMixin, register_pattern, \
ExtensibleJsonSchemaMixin
from dbt.dataclass_schema import (
dbtClassMixin, ValidationError,
HyphenatedDbtClassMixin,
ExtensibleDbtClassMixin,
register_pattern, ValidatedStringMixin
)
from dataclasses import dataclass, field
from typing import Optional, List, Dict, Union, Any, NewType
from typing import Optional, List, Dict, Union, Any
from mashumaro.types import SerializableType
PIN_PACKAGE_URL = 'https://docs.getdbt.com/docs/package-management#section-specifying-package-versions' # noqa
PIN_PACKAGE_URL = 'https://docs.getdbt.com/docs/package-management#section-specifying-package-versions' # noqa
DEFAULT_SEND_ANONYMOUS_USAGE_STATS = True
Name = NewType('Name', str)
class Name(ValidatedStringMixin):
ValidationRegex = r'^[^\d\W]\w*$'
register_pattern(Name, r'^[^\d\W]\w*$')
class SemverString(str, SerializableType):
def _serialize(self) -> str:
return self
@classmethod
def _deserialize(cls, value: str) -> 'SemverString':
return SemverString(value)
# this does not support the full semver (does not allow a trailing -fooXYZ) and
# is not restrictive enough for full semver, (allows '1.0'). But it's like
# 'semver lite'.
SemverString = NewType('SemverString', str)
register_pattern(
SemverString,
r'^(?:0|[1-9]\d*)\.(?:0|[1-9]\d*)(\.(?:0|[1-9]\d*))?$',
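For context on the 'semver lite' comment above, a minimal standalone check of the same pattern (plain re, outside dbt's register_pattern machinery; the candidate strings are illustrative only):

import re

# Same pattern as above: MAJOR.MINOR with an optional .PATCH, no leading
# zeros, and no pre-release suffix such as "-rc1".
SEMVER_LITE = re.compile(r'^(?:0|[1-9]\d*)\.(?:0|[1-9]\d*)(\.(?:0|[1-9]\d*))?$')

for candidate in ('1.0', '0.19.0', '01.2', '1.0.0-rc1'):
    print(candidate, bool(SEMVER_LITE.match(candidate)))
# '1.0' and '0.19.0' match; '01.2' and '1.0.0-rc1' do not.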
@@ -30,15 +44,15 @@ register_pattern(
@dataclass
class Quoting(JsonSchemaMixin, Mergeable):
identifier: Optional[bool]
schema: Optional[bool]
database: Optional[bool]
project: Optional[bool]
class Quoting(dbtClassMixin, Mergeable):
schema: Optional[bool] = None
database: Optional[bool] = None
project: Optional[bool] = None
identifier: Optional[bool] = None
@dataclass
class Package(Replaceable, HyphenatedJsonSchemaMixin):
class Package(Replaceable, HyphenatedDbtClassMixin):
pass
@@ -54,7 +68,7 @@ RawVersion = Union[str, float]
@dataclass
class GitPackage(Package):
git: str
revision: Optional[RawVersion]
revision: Optional[RawVersion] = None
warn_unpinned: Optional[bool] = None
def get_revisions(self) -> List[str]:
@@ -80,7 +94,7 @@ PackageSpec = Union[LocalPackage, GitPackage, RegistryPackage]
@dataclass
class PackageConfig(JsonSchemaMixin, Replaceable):
class PackageConfig(dbtClassMixin, Replaceable):
packages: List[PackageSpec]
@@ -96,13 +110,13 @@ class ProjectPackageMetadata:
@dataclass
class Downloads(ExtensibleJsonSchemaMixin, Replaceable):
class Downloads(ExtensibleDbtClassMixin, Replaceable):
tarball: str
@dataclass
class RegistryPackageMetadata(
ExtensibleJsonSchemaMixin,
ExtensibleDbtClassMixin,
ProjectPackageMetadata,
):
downloads: Downloads
@@ -142,6 +156,7 @@ BANNED_PROJECT_NAMES = {
'sql',
'sql_now',
'store_result',
'store_raw_result',
'target',
'this',
'tojson',
@@ -153,7 +168,7 @@ BANNED_PROJECT_NAMES = {
@dataclass
class Project(HyphenatedJsonSchemaMixin, Replaceable):
class Project(HyphenatedDbtClassMixin, Replaceable):
name: Name
version: Union[SemverString, float]
config_version: int
@@ -190,18 +205,16 @@ class Project(HyphenatedJsonSchemaMixin, Replaceable):
query_comment: Optional[Union[QueryComment, NoValue, str]] = NoValue()
@classmethod
def from_dict(cls, data, validate=True) -> 'Project':
result = super().from_dict(data, validate=validate)
if result.name in BANNED_PROJECT_NAMES:
def validate(cls, data):
super().validate(data)
if data['name'] in BANNED_PROJECT_NAMES:
raise ValidationError(
f'Invalid project name: {result.name} is a reserved word'
f"Invalid project name: {data['name']} is a reserved word"
)
return result
@dataclass
class UserConfig(ExtensibleJsonSchemaMixin, Replaceable, UserConfigContract):
class UserConfig(ExtensibleDbtClassMixin, Replaceable, UserConfigContract):
send_anonymous_usage_stats: bool = DEFAULT_SEND_ANONYMOUS_USAGE_STATS
use_colors: Optional[bool] = None
partial_parse: Optional[bool] = None
@@ -221,7 +234,7 @@ class UserConfig(ExtensibleJsonSchemaMixin, Replaceable, UserConfigContract):
@dataclass
class ProfileConfig(HyphenatedJsonSchemaMixin, Replaceable):
class ProfileConfig(HyphenatedDbtClassMixin, Replaceable):
profile_name: str = field(metadata={'preserve_underscore': True})
target_name: str = field(metadata={'preserve_underscore': True})
config: UserConfig
@@ -232,10 +245,10 @@ class ProfileConfig(HyphenatedJsonSchemaMixin, Replaceable):
@dataclass
class ConfiguredQuoting(Quoting, Replaceable):
identifier: bool
schema: bool
database: Optional[bool]
project: Optional[bool]
identifier: bool = True
schema: bool = True
database: Optional[bool] = None
project: Optional[bool] = None
@dataclass
@@ -248,5 +261,5 @@ class Configuration(Project, ProfileConfig):
@dataclass
class ProjectList(JsonSchemaMixin):
class ProjectList(dbtClassMixin):
projects: Dict[str, Project]


@@ -1,12 +1,11 @@
from collections.abc import Mapping
from dataclasses import dataclass, fields
from typing import (
Optional, TypeVar, Generic, Dict,
Optional, Dict,
)
from typing_extensions import Protocol
from hologram import JsonSchemaMixin
from hologram.helpers import StrEnum
from dbt.dataclass_schema import dbtClassMixin, StrEnum
from dbt import deprecations
from dbt.contracts.util import Replaceable
@@ -32,7 +31,7 @@ class HasQuoting(Protocol):
quoting: Dict[str, bool]
class FakeAPIObject(JsonSchemaMixin, Replaceable, Mapping):
class FakeAPIObject(dbtClassMixin, Replaceable, Mapping):
# override the mapping truthiness, len is always >1
def __bool__(self):
return True
@@ -58,16 +57,13 @@ class FakeAPIObject(JsonSchemaMixin, Replaceable, Mapping):
return self.from_dict(value)
T = TypeVar('T')
@dataclass
class _ComponentObject(FakeAPIObject, Generic[T]):
database: T
schema: T
identifier: T
class Policy(FakeAPIObject):
database: bool = True
schema: bool = True
identifier: bool = True
def get_part(self, key: ComponentName) -> T:
def get_part(self, key: ComponentName) -> bool:
if key == ComponentName.Database:
return self.database
elif key == ComponentName.Schema:
@@ -80,25 +76,18 @@ class _ComponentObject(FakeAPIObject, Generic[T]):
.format(key, list(ComponentName))
)
def replace_dict(self, dct: Dict[ComponentName, T]):
kwargs: Dict[str, T] = {}
def replace_dict(self, dct: Dict[ComponentName, bool]):
kwargs: Dict[str, bool] = {}
for k, v in dct.items():
kwargs[str(k)] = v
return self.replace(**kwargs)
@dataclass
class Policy(_ComponentObject[bool]):
database: bool = True
schema: bool = True
identifier: bool = True
@dataclass
class Path(_ComponentObject[Optional[str]]):
database: Optional[str]
schema: Optional[str]
identifier: Optional[str]
class Path(FakeAPIObject):
database: Optional[str] = None
schema: Optional[str] = None
identifier: Optional[str] = None
def __post_init__(self):
# handle pesky jinja2.Undefined sneaking in here and messing up rendering
@@ -120,3 +109,22 @@ class Path(_ComponentObject[Optional[str]]):
if part is not None:
part = part.lower()
return part
def get_part(self, key: ComponentName) -> Optional[str]:
if key == ComponentName.Database:
return self.database
elif key == ComponentName.Schema:
return self.schema
elif key == ComponentName.Identifier:
return self.identifier
else:
raise ValueError(
'Got a key of {}, expected one of {}'
.format(key, list(ComponentName))
)
def replace_dict(self, dct: Dict[ComponentName, str]):
kwargs: Dict[str, str] = {}
for k, v in dct.items():
kwargs[str(k)] = v
return self.replace(**kwargs)


@@ -1,12 +1,11 @@
from dbt.contracts.graph.manifest import CompileResultNode
from dbt.contracts.graph.unparsed import (
FreshnessStatus, FreshnessThreshold
FreshnessThreshold
)
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.util import (
BaseArtifactMetadata,
ArtifactMixin,
Writable,
VersionedSchema,
Replaceable,
schema_version,
@@ -18,18 +17,21 @@ from dbt.logger import (
GLOBAL_LOGGER as logger,
)
from dbt.utils import lowercase
from hologram.helpers import StrEnum
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin, StrEnum
import agate
from dataclasses import dataclass, field
from datetime import datetime
from typing import Union, Dict, List, Optional, Any, NamedTuple, Sequence
from typing import (
Union, Dict, List, Optional, Any, NamedTuple, Sequence,
)
from dbt.clients.system import write_json
@dataclass
class TimingInfo(JsonSchemaMixin):
class TimingInfo(dbtClassMixin):
name: str
started_at: Optional[datetime] = None
completed_at: Optional[datetime] = None
@@ -55,50 +57,73 @@ class collect_timing_info:
logger.debug('finished collecting timing info')
class NodeStatus(StrEnum):
Success = "success"
Error = "error"
Fail = "fail"
Warn = "warn"
Skipped = "skipped"
Pass = "pass"
RuntimeErr = "runtime error"
class RunStatus(StrEnum):
Success = NodeStatus.Success
Error = NodeStatus.Error
Skipped = NodeStatus.Skipped
class TestStatus(StrEnum):
Pass = NodeStatus.Pass
Error = NodeStatus.Error
Fail = NodeStatus.Fail
Warn = NodeStatus.Warn
class FreshnessStatus(StrEnum):
Pass = NodeStatus.Pass
Warn = NodeStatus.Warn
Error = NodeStatus.Error
RuntimeErr = NodeStatus.RuntimeErr
@dataclass
class BaseResult(JsonSchemaMixin):
class BaseResult(dbtClassMixin):
status: Union[RunStatus, TestStatus, FreshnessStatus]
timing: List[TimingInfo]
thread_id: str
execution_time: float
adapter_response: Dict[str, Any]
message: Optional[Union[str, int]]
@classmethod
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
if 'message' not in data:
data['message'] = None
return data
@dataclass
class NodeResult(BaseResult):
node: CompileResultNode
error: Optional[str] = None
status: Union[None, str, int, bool] = None
execution_time: Union[str, int] = 0
thread_id: Optional[str] = None
timing: List[TimingInfo] = field(default_factory=list)
fail: Optional[bool] = None
warn: Optional[bool] = None
@dataclass
class PartialResult(BaseResult, Writable):
pass
# if the result got to the point where it could be skipped/failed, we would
# be returning a real result, not a partial.
@property
def skipped(self):
return False
@dataclass
class WritableRunModelResult(BaseResult, Writable):
skip: bool = False
class RunResult(NodeResult):
agate_table: Optional[agate.Table] = field(
default=None, metadata={
'serialize': lambda x: None, 'deserialize': lambda x: None
}
)
@property
def skipped(self):
return self.skip
return self.status == RunStatus.Skipped
@dataclass
class RunModelResult(WritableRunModelResult):
agate_table: Optional[agate.Table] = None
def to_dict(self, *args, **kwargs):
dct = super().to_dict(*args, **kwargs)
dct.pop('agate_table', None)
return dct
@dataclass
class ExecutionResult(JsonSchemaMixin):
class ExecutionResult(dbtClassMixin):
results: Sequence[BaseResult]
elapsed_time: float
@@ -112,9 +137,6 @@ class ExecutionResult(JsonSchemaMixin):
return self.results[idx]
RunResult = Union[PartialResult, WritableRunModelResult]
@dataclass
class RunResultsMetadata(BaseArtifactMetadata):
dbt_schema_version: str = field(
@@ -123,30 +145,69 @@ class RunResultsMetadata(BaseArtifactMetadata):
@dataclass
@schema_version('run-results', 1)
class RunResultsArtifact(
class RunResultOutput(BaseResult):
unique_id: str
def process_run_result(result: RunResult) -> RunResultOutput:
return RunResultOutput(
unique_id=result.node.unique_id,
status=result.status,
timing=result.timing,
thread_id=result.thread_id,
execution_time=result.execution_time,
message=result.message,
adapter_response=result.adapter_response
)
@dataclass
class RunExecutionResult(
ExecutionResult,
ArtifactMixin,
):
results: Sequence[RunResult]
args: Dict[str, Any] = field(default_factory=dict)
generated_at: datetime = field(default_factory=datetime.utcnow)
def write(self, path: str):
writable = RunResultsArtifact.from_execution_results(
results=self.results,
elapsed_time=self.elapsed_time,
generated_at=self.generated_at,
args=self.args,
)
writable.write(path)
@dataclass
@schema_version('run-results', 1)
class RunResultsArtifact(ExecutionResult, ArtifactMixin):
results: Sequence[RunResultOutput]
args: Dict[str, Any] = field(default_factory=dict)
@classmethod
def from_node_results(
def from_execution_results(
cls,
results: Sequence[RunResult],
elapsed_time: float,
generated_at: datetime,
args: Dict,
):
processed_results = [process_run_result(result) for result in results]
meta = RunResultsMetadata(
dbt_schema_version=str(cls.dbt_schema_version),
generated_at=generated_at,
)
return cls(
metadata=meta,
results=results,
results=processed_results,
elapsed_time=elapsed_time,
args=args
)
def write(self, path: str):
write_json(path, self.to_dict(options={'keep_none': True}))
@dataclass
class RunOperationResult(ExecutionResult):
@@ -171,7 +232,7 @@ class RunOperationResultsArtifact(RunOperationResult, ArtifactMixin):
elapsed_time: float,
generated_at: datetime,
):
meta = RunResultsMetadata(
meta = RunOperationResultMetadata(
dbt_schema_version=str(cls.dbt_schema_version),
generated_at=generated_at,
)
@@ -182,59 +243,56 @@ class RunOperationResultsArtifact(RunOperationResult, ArtifactMixin):
success=success,
)
# due to issues with typing.Union collapsing subclasses, this can't subclass
# PartialResult
@dataclass
class SourceFreshnessResultMixin(JsonSchemaMixin):
class SourceFreshnessResult(NodeResult):
node: ParsedSourceDefinition
status: FreshnessStatus
max_loaded_at: datetime
snapshotted_at: datetime
age: float
# due to issues with typing.Union collapsing subclasses, this can't subclass
# PartialResult
@dataclass
class SourceFreshnessResult(BaseResult, Writable, SourceFreshnessResultMixin):
node: ParsedSourceDefinition
status: FreshnessStatus = FreshnessStatus.Pass
def __post_init__(self):
self.fail = self.status == 'error'
@property
def warned(self):
return self.status == 'warn'
@property
def skipped(self):
return False
def _copykeys(src, keys, **updates):
return {k: getattr(src, k) for k in keys}
class FreshnessErrorEnum(StrEnum):
runtime_error = 'runtime error'
@dataclass
class SourceFreshnessRuntimeError(JsonSchemaMixin):
class SourceFreshnessRuntimeError(dbtClassMixin):
unique_id: str
error: str
state: FreshnessErrorEnum
error: Optional[Union[str, int]]
status: FreshnessErrorEnum
@dataclass
class SourceFreshnessOutput(JsonSchemaMixin):
class SourceFreshnessOutput(dbtClassMixin):
unique_id: str
max_loaded_at: datetime
snapshotted_at: datetime
max_loaded_at_time_ago_in_s: float
state: FreshnessStatus
status: FreshnessStatus
criteria: FreshnessThreshold
adapter_response: Dict[str, Any]
FreshnessNodeResult = Union[PartialResult, SourceFreshnessResult]
@dataclass
class PartialSourceFreshnessResult(NodeResult):
status: FreshnessStatus
@property
def skipped(self):
return False
FreshnessNodeResult = Union[PartialSourceFreshnessResult,
SourceFreshnessResult]
FreshnessNodeOutput = Union[SourceFreshnessRuntimeError, SourceFreshnessOutput]
@@ -242,11 +300,11 @@ def process_freshness_result(
result: FreshnessNodeResult
) -> FreshnessNodeOutput:
unique_id = result.node.unique_id
if result.error is not None:
if result.status == FreshnessStatus.RuntimeErr:
return SourceFreshnessRuntimeError(
unique_id=unique_id,
error=result.error,
state=FreshnessErrorEnum.runtime_error,
error=result.message,
status=FreshnessErrorEnum.runtime_error,
)
# we know that this must be a SourceFreshnessResult
@@ -268,8 +326,9 @@ def process_freshness_result(
max_loaded_at=result.max_loaded_at,
snapshotted_at=result.snapshotted_at,
max_loaded_at_time_ago_in_s=result.age,
state=result.status,
status=result.status,
criteria=criteria,
adapter_response=result.adapter_response
)
@@ -327,40 +386,40 @@ CatalogKey = NamedTuple(
@dataclass
class StatsItem(JsonSchemaMixin):
class StatsItem(dbtClassMixin):
id: str
label: str
value: Primitive
description: Optional[str]
include: bool
description: Optional[str] = None
StatsDict = Dict[str, StatsItem]
@dataclass
class ColumnMetadata(JsonSchemaMixin):
class ColumnMetadata(dbtClassMixin):
type: str
comment: Optional[str]
index: int
name: str
comment: Optional[str] = None
ColumnMap = Dict[str, ColumnMetadata]
@dataclass
class TableMetadata(JsonSchemaMixin):
class TableMetadata(dbtClassMixin):
type: str
database: Optional[str]
schema: str
name: str
comment: Optional[str]
owner: Optional[str]
database: Optional[str] = None
comment: Optional[str] = None
owner: Optional[str] = None
@dataclass
class CatalogTable(JsonSchemaMixin, Replaceable):
class CatalogTable(dbtClassMixin, Replaceable):
metadata: TableMetadata
columns: ColumnMap
stats: StatsDict
@@ -383,12 +442,18 @@ class CatalogMetadata(BaseArtifactMetadata):
@dataclass
class CatalogResults(JsonSchemaMixin):
class CatalogResults(dbtClassMixin):
nodes: Dict[str, CatalogTable]
sources: Dict[str, CatalogTable]
errors: Optional[List[str]]
errors: Optional[List[str]] = None
_compile_results: Optional[Any] = None
def __post_serialize__(self, dct, options=None):
dct = super().__post_serialize__(dct, options=options)
if '_compile_results' in dct:
del dct['_compile_results']
return dct
@dataclass
@schema_version('catalog', 1)


@@ -5,13 +5,12 @@ from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Union, List, Any, Dict, Type, Sequence
from hologram import JsonSchemaMixin
from hologram.helpers import StrEnum
from dbt.dataclass_schema import dbtClassMixin, StrEnum
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import WritableManifest
from dbt.contracts.results import (
TimingInfo,
RunResult, RunResultsArtifact, TimingInfo,
CatalogArtifact,
CatalogResults,
ExecutionResult,
@@ -19,8 +18,7 @@ from dbt.contracts.results import (
FreshnessResult,
RunOperationResult,
RunOperationResultsArtifact,
RunResult,
RunResultsArtifact,
RunExecutionResult,
)
from dbt.contracts.util import VersionedSchema, schema_version
from dbt.exceptions import InternalException
@@ -35,16 +33,25 @@ TaskID = uuid.UUID
@dataclass
class RPCParameters(JsonSchemaMixin):
timeout: Optional[float]
class RPCParameters(dbtClassMixin):
task_tags: TaskTags
timeout: Optional[float]
@classmethod
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
if 'timeout' not in data:
data['timeout'] = None
if 'task_tags' not in data:
data['task_tags'] = None
return data
@dataclass
class RPCExecParameters(RPCParameters):
name: str
sql: str
macros: Optional[str]
macros: Optional[str] = None
@dataclass
@@ -80,6 +87,7 @@ class RPCTestParameters(RPCCompileParameters):
data: bool = False
schema: bool = False
state: Optional[str] = None
defer: Optional[bool] = None
@dataclass
@@ -132,7 +140,7 @@ class StatusParameters(RPCParameters):
@dataclass
class GCSettings(JsonSchemaMixin):
class GCSettings(dbtClassMixin):
# start evicting the longest-ago-ended tasks here
maxsize: int
# start evicting all tasks before now - auto_reap_age when we have this
@@ -226,32 +234,35 @@ class RemoteCompileResult(RemoteCompileResultMixin):
@schema_version('remote-execution-result', 1)
class RemoteExecutionResult(ExecutionResult, RemoteResult):
results: Sequence[RunResult]
args: Dict[str, Any] = field(default_factory=dict)
generated_at: datetime = field(default_factory=datetime.utcnow)
def write(self, path: str):
writable = RunResultsArtifact.from_node_results(
writable = RunResultsArtifact.from_execution_results(
generated_at=self.generated_at,
results=self.results,
elapsed_time=self.elapsed_time,
args=self.args,
)
writable.write(path)
@classmethod
def from_local_result(
cls,
base: RunResultsArtifact,
base: RunExecutionResult,
logs: List[LogMessage],
) -> 'RemoteExecutionResult':
return cls(
generated_at=base.metadata.generated_at,
generated_at=base.generated_at,
results=base.results,
elapsed_time=base.elapsed_time,
args=base.args,
logs=logs,
)
@dataclass
class ResultTable(JsonSchemaMixin):
class ResultTable(dbtClassMixin):
column_names: List[str]
rows: List[Any]
@@ -408,21 +419,31 @@ class TaskHandlerState(StrEnum):
@dataclass
class TaskTiming(JsonSchemaMixin):
class TaskTiming(dbtClassMixin):
state: TaskHandlerState
start: Optional[datetime]
end: Optional[datetime]
elapsed: Optional[float]
# These ought to be defaults but superclass order doesn't
# allow that to work
@classmethod
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
for field_name in ('start', 'end', 'elapsed'):
if field_name not in data:
data[field_name] = None
return data
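The comment above points at the dataclass rule that non-default fields may not follow defaulted ones once inheritance is involved; a minimal sketch of the failure being worked around (class and field names are illustrative):

from dataclasses import dataclass
from typing import Optional

@dataclass
class Base:
    state: str
    start: Optional[str] = None   # defaulted field in the base class

try:
    # Adding a non-default field in a subclass now fails at class-definition
    # time, which is why the code above backfills missing keys instead.
    @dataclass
    class Child(Base):
        task_id: str
except TypeError as exc:
    print(exc)   # non-default argument 'task_id' follows default argument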
@dataclass
class TaskRow(TaskTiming):
task_id: TaskID
request_id: Union[str, int]
request_source: str
method: str
timeout: Optional[float]
tags: TaskTags
request_id: Union[str, int]
tags: TaskTags = None
timeout: Optional[float] = None
@dataclass
@@ -448,7 +469,7 @@ class KillResult(RemoteResult):
@dataclass
@schema_version('remote-manifest-result', 1)
class GetManifestResult(RemoteResult):
manifest: Optional[WritableManifest]
manifest: Optional[WritableManifest] = None
# this is kind of carefully structured: BlocksManifestTasks is implied by
@@ -472,6 +493,16 @@ class PollResult(RemoteResult, TaskTiming):
end: Optional[datetime]
elapsed: Optional[float]
# These ought to be defaults but superclass order doesn't
# allow that to work
@classmethod
def __pre_deserialize__(cls, data, options=None):
data = super().__pre_deserialize__(data, options=options)
for field_name in ('start', 'end', 'elapsed'):
if field_name not in data:
data[field_name] = None
return data
@dataclass
@schema_version('poll-remote-deps-result', 1)


@@ -1,17 +1,18 @@
from dataclasses import dataclass
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
from typing import List, Dict, Any, Union
@dataclass
class SelectorDefinition(JsonSchemaMixin):
class SelectorDefinition(dbtClassMixin):
name: str
definition: Union[str, Dict[str, Any]]
description: str = ''
@dataclass
class SelectorFile(JsonSchemaMixin):
class SelectorFile(dbtClassMixin):
selectors: List[SelectorDefinition]
version: int = 2


@@ -7,13 +7,12 @@ from typing import (
from dbt.clients.system import write_json, read_json
from dbt.exceptions import (
IncompatibleSchemaException,
InternalException,
RuntimeException,
)
from dbt.version import __version__
from dbt.tracking import get_invocation_id
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
MacroKey = Tuple[str, str]
SourceKey = Tuple[str, str]
@@ -57,8 +56,10 @@ class Mergeable(Replaceable):
class Writable:
def write(self, path: str, omit_none: bool = False):
write_json(path, self.to_dict(omit_none=omit_none)) # type: ignore
def write(self, path: str):
write_json(
path, self.to_dict(options={'keep_none': True}) # type: ignore
)
class AdditionalPropertiesMixin:
@@ -69,22 +70,41 @@ class AdditionalPropertiesMixin:
"""
ADDITIONAL_PROPERTIES = True
# This takes attributes in the dictionary that are
# not in the class definitions and puts them in an
# _extra dict in the class
@classmethod
def from_dict(cls, data, validate=True):
self = super().from_dict(data=data, validate=validate)
keys = self.to_dict(validate=False, omit_none=False)
def __pre_deserialize__(cls, data, options=None):
# dir() did not work because fields with
# metadata settings are not found
# The original version of this would create the
# object first and then update extra with the
# extra keys, but that won't work here, so
# we're copying the dict so we don't insert the
# _extra in the original data. This also requires
# that Mashumaro actually build the '_extra' field
cls_keys = cls._get_field_names()
new_dict = {}
for key, value in data.items():
if key not in keys:
self.extra[key] = value
return self
if key not in cls_keys and key != '_extra':
if '_extra' not in new_dict:
new_dict['_extra'] = {}
new_dict['_extra'][key] = value
else:
new_dict[key] = value
data = new_dict
data = super().__pre_deserialize__(data, options=options)
return data
def to_dict(self, omit_none=True, validate=False):
data = super().to_dict(omit_none=omit_none, validate=validate)
def __post_serialize__(self, dct, options=None):
data = super().__post_serialize__(dct, options=options)
data.update(self.extra)
if '_extra' in data:
del data['_extra']
return data
def replace(self, **kwargs):
dct = self.to_dict(omit_none=False, validate=False)
dct = self.to_dict(options={'keep_none': True})
dct.update(kwargs)
return self.from_dict(dct)
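A minimal standalone sketch of the round trip described in the comments above (plain dicts rather than the mashumaro hooks; the field names are illustrative): unknown keys are stashed under '_extra' before deserialization, then merged back out and dropped after serialization.

KNOWN_FIELDS = {'name', 'description'}   # stand-in for cls._get_field_names()

def pre_deserialize(data: dict) -> dict:
    # Copy so the caller's dict is untouched; route undeclared keys to '_extra'.
    new_dict: dict = {}
    for key, value in data.items():
        if key not in KNOWN_FIELDS and key != '_extra':
            new_dict.setdefault('_extra', {})[key] = value
        else:
            new_dict[key] = value
    return new_dict

def post_serialize(dct: dict) -> dict:
    # Merge the stashed extras back onto the top level and drop '_extra'.
    out = dict(dct)
    out.update(out.pop('_extra', {}))
    return out

raw = {'name': 'events', 'description': '', 'loader': 'fivetran'}
staged = pre_deserialize(raw)
print(staged)                           # {'name': 'events', 'description': '', '_extra': {'loader': 'fivetran'}}
print(post_serialize(staged) == raw)    # True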
@@ -135,7 +155,7 @@ def get_metadata_env() -> Dict[str, str]:
@dataclasses.dataclass
class BaseArtifactMetadata(JsonSchemaMixin):
class BaseArtifactMetadata(dbtClassMixin):
dbt_schema_version: str
dbt_version: str = __version__
generated_at: datetime = dataclasses.field(
@@ -158,7 +178,7 @@ def schema_version(name: str, version: int):
@dataclasses.dataclass
class VersionedSchema(JsonSchemaMixin):
class VersionedSchema(dbtClassMixin):
dbt_schema_version: ClassVar[SchemaVersion]
@classmethod
@@ -180,18 +200,9 @@ class ArtifactMixin(VersionedSchema, Writable, Readable):
metadata: BaseArtifactMetadata
@classmethod
def from_dict(
cls: Type[T], data: Dict[str, Any], validate: bool = True
) -> T:
def validate(cls, data):
super().validate(data)
if cls.dbt_schema_version is None:
raise InternalException(
'Cannot call from_dict with no schema version!'
)
if validate:
expected = str(cls.dbt_schema_version)
found = data.get('metadata', {}).get(SCHEMA_VERSION_KEY)
if found != expected:
raise IncompatibleSchemaException(expected, found)
return super().from_dict(data=data, validate=validate)


@@ -0,0 +1,170 @@
from typing import (
Type, ClassVar, Dict, cast, TypeVar
)
import re
from dataclasses import fields
from enum import Enum
from datetime import datetime
from dateutil.parser import parse
from hologram import JsonSchemaMixin, FieldEncoder, ValidationError
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializableEncoder, SerializableType
class DateTimeSerializableEncoder(SerializableEncoder[datetime]):
@classmethod
def _serialize(cls, value: datetime) -> str:
out = value.isoformat()
# Assume UTC if timezone is missing
if value.tzinfo is None:
out = out + "Z"
return out
@classmethod
def _deserialize(cls, value: str) -> datetime:
return (
value if isinstance(value, datetime) else parse(cast(str, value))
)
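A minimal standalone sketch of the round trip this encoder performs (it relies on dateutil.parser.parse, as imported above; the timestamp is illustrative):

from datetime import datetime
from dateutil.parser import parse

def serialize_dt(value: datetime) -> str:
    out = value.isoformat()
    if value.tzinfo is None:
        out += 'Z'   # assume UTC when no timezone is attached
    return out

def deserialize_dt(value) -> datetime:
    return value if isinstance(value, datetime) else parse(value)

stamp = datetime(2021, 1, 27, 17, 39, 48)
print(serialize_dt(stamp))                       # 2021-01-27T17:39:48Z
print(deserialize_dt('2021-01-27T17:39:48Z'))    # 2021-01-27 17:39:48+00:00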
TV = TypeVar("TV")
# This class pulls in both JsonSchemaMixin from Hologram and
# DataClassDictMixin from our fork of Mashumaro. The 'to_dict'
# and 'from_dict' methods come from Mashumaro. Building
# jsonschemas for every class and the 'validate' method
# come from Hologram.
class dbtClassMixin(DataClassDictMixin, JsonSchemaMixin):
"""Mixin which adds methods to generate a JSON schema and
convert to and from JSON encodable dicts with validation
against the schema
"""
_serializable_encoders: ClassVar[Dict[str, SerializableEncoder]] = {
'datetime.datetime': DateTimeSerializableEncoder(),
}
_hyphenated: ClassVar[bool] = False
ADDITIONAL_PROPERTIES: ClassVar[bool] = False
# This is called by the mashumaro to_dict in order to handle
# nested classes.
# Munges the dict that's returned.
def __post_serialize__(self, dct, options=None):
keep_none = False
if options and 'keep_none' in options and options['keep_none']:
keep_none = True
if not keep_none: # remove attributes that are None
new_dict = {k: v for k, v in dct.items() if v is not None}
dct = new_dict
if self._hyphenated:
new_dict = {}
for key in dct:
if '_' in key:
new_key = key.replace('_', '-')
new_dict[new_key] = dct[key]
else:
new_dict[key] = dct[key]
dct = new_dict
return dct
# This is called by the mashumaro _from_dict method, before
# performing the conversion to a dict
@classmethod
def __pre_deserialize__(cls, data, options=None):
if cls._hyphenated:
new_dict = {}
for key in data:
if '-' in key:
new_key = key.replace('-', '_')
new_dict[new_key] = data[key]
else:
new_dict[key] = data[key]
data = new_dict
return data
# This is used in the hologram._encode_field method, which calls
# a 'to_dict' method which does not have the same parameters in
# hologram and in mashumaro.
def _local_to_dict(self, **kwargs):
args = {}
if 'omit_none' in kwargs and kwargs['omit_none'] is False:
args['options'] = {'keep_none': True}
return self.to_dict(**args)
class ValidatedStringMixin(str, SerializableType):
ValidationRegex = ''
@classmethod
def _deserialize(cls, value: str) -> 'ValidatedStringMixin':
cls.validate(value)
return ValidatedStringMixin(value)
def _serialize(self) -> str:
return str(self)
@classmethod
def validate(cls, value):
res = re.match(cls.ValidationRegex, value)
if res is None:
raise ValidationError(f"Invalid value: {value}") # TODO
# These classes must be in this order or it doesn't work
class StrEnum(str, SerializableType, Enum):
def __str__(self):
return self.value
# https://docs.python.org/3.6/library/enum.html#using-automatic-values
def _generate_next_value_(name, *_):
return name
def _serialize(self) -> str:
return self.value
@classmethod
def _deserialize(cls, value: str):
return cls(value)
class HyphenatedDbtClassMixin(dbtClassMixin):
# used by from_dict/to_dict
_hyphenated: ClassVar[bool] = True
# used by jsonschema validation, _get_fields
@classmethod
def field_mapping(cls):
result = {}
for field in fields(cls):
skip = field.metadata.get("preserve_underscore")
if skip:
continue
if "_" in field.name:
result[field.name] = field.name.replace("_", "-")
return result
class ExtensibleDbtClassMixin(dbtClassMixin):
ADDITIONAL_PROPERTIES = True
# This is used by Hologram in jsonschema validation
def register_pattern(base_type: Type, pattern: str) -> None:
"""base_type should be a typing.NewType that should always have the given
regex pattern. That means that its underlying type ('__supertype__') had
better be a str!
"""
class PatternEncoder(FieldEncoder):
@property
def json_schema(self):
return {"type": "string", "pattern": pattern}
dbtClassMixin.register_field_encoders({base_type: PatternEncoder()})
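A minimal standalone sketch of the munging __post_serialize__ applies, driven by the keep_none option and the _hyphenated flag described above (plain Python; the sample dict is illustrative, not a real dbt payload):

def post_serialize(dct: dict, hyphenated: bool = False, options: dict = None) -> dict:
    keep_none = bool(options and options.get('keep_none'))
    if not keep_none:
        # drop attributes that are None, mirroring hologram's omit_none=True default
        dct = {k: v for k, v in dct.items() if v is not None}
    if hyphenated:
        # swap underscores for hyphens in every key
        dct = {k.replace('_', '-'): v for k, v in dct.items()}
    return dct

row = {'config_version': 2, 'query_comment': None, 'source_paths': ['models']}
print(post_serialize(row))                               # query_comment dropped
print(post_serialize(row, options={'keep_none': True}))  # query_comment kept
print(post_serialize(row, hyphenated=True))              # config-version, source-paths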


@@ -7,14 +7,14 @@ from dbt.node_types import NodeType
from dbt import flags
from dbt.ui import line_wrap_message
import hologram
import dbt.dataclass_schema
def validator_error_message(exc):
"""Given a hologram.ValidationError (which is basically a
"""Given a dbt.dataclass_schema.ValidationError (which is basically a
jsonschema.ValidationError), return the relevant parts as a string
"""
if not isinstance(exc, hologram.ValidationError):
if not isinstance(exc, dbt.dataclass_schema.ValidationError):
return str(exc)
path = "[%s]" % "][".join(map(repr, exc.relative_path))
return 'at path {}: {}'.format(path, exc.message)
@@ -132,7 +132,7 @@ class RuntimeException(RuntimeError, Exception):
result.update({
'raw_sql': self.node.raw_sql,
# the node isn't always compiled, but if it is, include that!
'compiled_sql': getattr(self.node, 'injected_sql', None),
'compiled_sql': getattr(self.node, 'compiled_sql', None),
})
return result


@@ -1,5 +1,8 @@
import os
import multiprocessing
if os.name != 'nt':
# https://bugs.python.org/issue41567
import multiprocessing.popen_spawn_posix # type: ignore
from pathlib import Path
from typing import Optional


@@ -1,6 +1,6 @@
# special support for CLI argument parsing.
import itertools
import yaml
from dbt.clients.yaml_helper import yaml, Loader, Dumper # noqa: F401
from typing import (
Dict, List, Optional, Tuple, Any, Union
@@ -19,7 +19,7 @@ from .selector_spec import (
INTERSECTION_DELIMITER = ','
DEFAULT_INCLUDES: List[str] = ['fqn:*', 'source:*', 'report:*']
DEFAULT_INCLUDES: List[str] = ['fqn:*', 'source:*', 'exposure:*']
DEFAULT_EXCLUDES: List[str] = []
DATA_TEST_SELECTOR: str = 'test_type:data'
SCHEMA_TEST_SELECTOR: str = 'test_type:schema'
@@ -236,7 +236,7 @@ def parse_dict_definition(definition: Dict[str, Any]) -> SelectionSpec:
)
# if key isn't a valid method name, this will raise
base = SelectionCriteria.from_dict(definition, dct)
base = SelectionCriteria.selection_criteria_from_dict(definition, dct)
if diff_arg is None:
return base
else:


@@ -7,7 +7,7 @@ from typing import (
import networkx as nx # type: ignore
from .graph import UniqueId
from dbt.contracts.graph.parsed import ParsedSourceDefinition, ParsedReport
from dbt.contracts.graph.parsed import ParsedSourceDefinition, ParsedExposure
from dbt.contracts.graph.compiled import GraphMemberNode
from dbt.contracts.graph.manifest import Manifest
from dbt.node_types import NodeType
@@ -50,8 +50,8 @@ class GraphQueue:
node = self.manifest.expect(node_id)
if node.resource_type != NodeType.Model:
return False
# must be a Model - tell mypy this won't be a Source or Report
assert not isinstance(node, (ParsedSourceDefinition, ParsedReport))
# must be a Model - tell mypy this won't be a Source or Exposure
assert not isinstance(node, (ParsedSourceDefinition, ParsedExposure))
if node.is_ephemeral:
return False
return True


@@ -129,7 +129,7 @@ class NodeSelector(MethodManager):
if unique_id in self.manifest.sources:
source = self.manifest.sources[unique_id]
return source.config.enabled
elif unique_id in self.manifest.reports:
elif unique_id in self.manifest.exposures:
return True
node = self.manifest.nodes[unique_id]
return not node.empty and node.config.enabled
@@ -146,8 +146,8 @@ class NodeSelector(MethodManager):
node = self.manifest.nodes[unique_id]
elif unique_id in self.manifest.sources:
node = self.manifest.sources[unique_id]
elif unique_id in self.manifest.reports:
node = self.manifest.reports[unique_id]
elif unique_id in self.manifest.exposures:
node = self.manifest.exposures[unique_id]
else:
raise InternalException(
f'Node {unique_id} not found in the manifest!'


@@ -3,7 +3,7 @@ from itertools import chain
from pathlib import Path
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional
from hologram.helpers import StrEnum
from dbt.dataclass_schema import StrEnum
from .graph import UniqueId
@@ -17,7 +17,7 @@ from dbt.contracts.graph.manifest import Manifest, WritableManifest
from dbt.contracts.graph.parsed import (
HasTestMetadata,
ParsedDataTestNode,
ParsedReport,
ParsedExposure,
ParsedSchemaTestNode,
ParsedSourceDefinition,
)
@@ -46,7 +46,7 @@ class MethodName(StrEnum):
TestType = 'test_type'
ResourceType = 'resource_type'
State = 'state'
Report = 'report'
Exposure = 'exposure'
def is_selected_node(real_node, node_selector):
@@ -75,7 +75,7 @@ def is_selected_node(real_node, node_selector):
return True
SelectorTarget = Union[ParsedSourceDefinition, ManifestNode, ParsedReport]
SelectorTarget = Union[ParsedSourceDefinition, ManifestNode, ParsedExposure]
class SelectorMethod(metaclass=abc.ABCMeta):
@@ -111,16 +111,16 @@ class SelectorMethod(metaclass=abc.ABCMeta):
continue
yield unique_id, source
def report_nodes(
def exposure_nodes(
self,
included_nodes: Set[UniqueId]
) -> Iterator[Tuple[UniqueId, ParsedReport]]:
) -> Iterator[Tuple[UniqueId, ParsedExposure]]:
for key, report in self.manifest.reports.items():
for key, exposure in self.manifest.exposures.items():
unique_id = UniqueId(key)
if unique_id not in included_nodes:
continue
yield unique_id, report
yield unique_id, exposure
def all_nodes(
self,
@@ -128,7 +128,7 @@ class SelectorMethod(metaclass=abc.ABCMeta):
) -> Iterator[Tuple[UniqueId, SelectorTarget]]:
yield from chain(self.parsed_nodes(included_nodes),
self.source_nodes(included_nodes),
self.report_nodes(included_nodes))
self.exposure_nodes(included_nodes))
def configurable_nodes(
self,
@@ -140,9 +140,9 @@ class SelectorMethod(metaclass=abc.ABCMeta):
def non_source_nodes(
self,
included_nodes: Set[UniqueId],
) -> Iterator[Tuple[UniqueId, Union[ParsedReport, ManifestNode]]]:
) -> Iterator[Tuple[UniqueId, Union[ParsedExposure, ManifestNode]]]:
yield from chain(self.parsed_nodes(included_nodes),
self.report_nodes(included_nodes))
self.exposure_nodes(included_nodes))
@abc.abstractmethod
def search(
@@ -244,7 +244,7 @@ class SourceSelectorMethod(SelectorMethod):
yield node
class ReportSelectorMethod(SelectorMethod):
class ExposureSelectorMethod(SelectorMethod):
def search(
self, included_nodes: Set[UniqueId], selector: str
) -> Iterator[UniqueId]:
@@ -256,13 +256,13 @@ class ReportSelectorMethod(SelectorMethod):
target_package, target_name = parts
else:
msg = (
'Invalid report selector value "{}". Reports must be of '
'the form ${{report_name}} or '
'${{report_package.report_name}}'
'Invalid exposure selector value "{}". Exposures must be of '
'the form ${{exposure_name}} or '
'${{exposure_package.exposure_name}}'
).format(selector)
raise RuntimeException(msg)
for node, real_node in self.report_nodes(included_nodes):
for node, real_node in self.exposure_nodes(included_nodes):
if target_package not in (real_node.package_name, SELECTOR_GLOB):
continue
if target_name not in (real_node.name, SELECTOR_GLOB):
@@ -481,8 +481,8 @@ class StateSelectorMethod(SelectorMethod):
previous_node = manifest.nodes[node]
elif node in manifest.sources:
previous_node = manifest.sources[node]
elif node in manifest.reports:
previous_node = manifest.reports[node]
elif node in manifest.exposures:
previous_node = manifest.exposures[node]
if checker(previous_node, real_node):
yield node
@@ -499,7 +499,7 @@ class MethodManager:
MethodName.TestName: TestNameSelectorMethod,
MethodName.TestType: TestTypeSelectorMethod,
MethodName.State: StateSelectorMethod,
MethodName.Report: ReportSelectorMethod,
MethodName.Exposure: ExposureSelectorMethod,
}
def __init__(


@@ -102,7 +102,9 @@ class SelectionCriteria:
return method_name, method_arguments
@classmethod
def from_dict(cls, raw: Any, dct: Dict[str, Any]) -> 'SelectionCriteria':
def selection_criteria_from_dict(
cls, raw: Any, dct: Dict[str, Any]
) -> 'SelectionCriteria':
if 'value' not in dct:
raise RuntimeException(
f'Invalid node spec "{raw}" - no search value!'
@@ -123,6 +125,26 @@ class SelectionCriteria:
children_depth=children_depth,
)
@classmethod
def dict_from_single_spec(cls, raw: str):
result = RAW_SELECTOR_PATTERN.match(raw)
if result is None:
return {'error': 'Invalid selector spec'}
dct: Dict[str, Any] = result.groupdict()
method_name, method_arguments = cls.parse_method(dct)
meth_name = str(method_name)
if method_arguments:
meth_name = meth_name + '.' + '.'.join(method_arguments)
dct['method'] = meth_name
dct = {k: v for k, v in dct.items() if (v is not None and v != '')}
if 'childrens_parents' in dct:
dct['childrens_parents'] = bool(dct.get('childrens_parents'))
if 'parents' in dct:
dct['parents'] = bool(dct.get('parents'))
if 'children' in dct:
dct['children'] = bool(dct.get('children'))
return dct
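A minimal standalone sketch of the clean-up this new classmethod performs on the regex groupdict (the input dict is hypothetical, standing in for RAW_SELECTOR_PATTERN.match(raw).groupdict() plus the derived 'method' key): empty matches are dropped and the graph-walk flags become booleans.

dct = {
    'childrens_parents': None,
    'parents': '+',
    'parents_depth': '2',
    'method': 'tag',
    'value': 'nightly',
    'children': '',
    'children_depth': None,
}

cleaned = {k: v for k, v in dct.items() if (v is not None and v != '')}
for flag in ('childrens_parents', 'parents', 'children'):
    if flag in cleaned:
        cleaned[flag] = bool(cleaned[flag])

print(cleaned)
# {'parents': True, 'parents_depth': '2', 'method': 'tag', 'value': 'nightly'}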
@classmethod
def from_single_spec(cls, raw: str) -> 'SelectionCriteria':
result = RAW_SELECTOR_PATTERN.match(raw)
@@ -130,7 +152,7 @@ class SelectionCriteria:
# bad spec!
raise RuntimeException(f'Invalid selector spec "{raw}"')
return cls.from_dict(raw, result.groupdict())
return cls.selection_criteria_from_dict(raw, result.groupdict())
class BaseSelectionGroup(Iterable[SelectionSpec], metaclass=ABCMeta):


@@ -2,14 +2,27 @@
from dataclasses import dataclass
from datetime import timedelta
from pathlib import Path
from typing import NewType, Tuple, AbstractSet
from typing import Tuple, AbstractSet, Union
from hologram import (
FieldEncoder, JsonSchemaMixin, JsonDict, ValidationError
from dbt.dataclass_schema import (
dbtClassMixin, ValidationError, StrEnum,
)
from hologram.helpers import StrEnum
from hologram import FieldEncoder, JsonDict
from mashumaro.types import SerializableType
Port = NewType('Port', int)
class Port(int, SerializableType):
@classmethod
def _deserialize(cls, value: Union[int, str]) -> 'Port':
try:
value = int(value)
except ValueError:
raise ValidationError(f'Cannot encode {value} into port number')
return Port(value)
def _serialize(self) -> int:
return self
class PortEncoder(FieldEncoder):
@@ -66,12 +79,12 @@ class NVEnum(StrEnum):
@dataclass
class NoValue(JsonSchemaMixin):
class NoValue(dbtClassMixin):
"""Sometimes, you want a way to say none that isn't None"""
novalue: NVEnum = NVEnum.novalue
JsonSchemaMixin.register_field_encoders({
dbtClassMixin.register_field_encoders({
Port: PortEncoder(),
timedelta: TimeDeltaFieldEncoder(),
Path: PathEncoder(),
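A minimal standalone sketch of the coercion the new Port type above performs on deserialization (ValueError stands in for dbt's ValidationError; the values are illustrative):

class Port(int):
    @classmethod
    def deserialize(cls, value):
        # accept ints or numeric strings, reject anything else
        try:
            return cls(int(value))
        except ValueError:
            raise ValueError(f'Cannot encode {value} into port number')

print(Port.deserialize('5432'))   # 5432
print(Port.deserialize(8080))     # 8080
# Port.deserialize('pgbouncer') raises ValueError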


@@ -1,4 +1,4 @@
from hologram.helpers import StrEnum
from dbt.dataclass_schema import StrEnum
import json
from typing import Union, Dict, Any


@@ -7,15 +7,15 @@
{{ write(sql) }}
{%- endif -%}
{%- set status, res = adapter.execute(sql, auto_begin=auto_begin, fetch=fetch_result) -%}
{%- set res, table = adapter.execute(sql, auto_begin=auto_begin, fetch=fetch_result) -%}
{%- if name is not none -%}
{{ store_result(name, status=status, agate_table=res) }}
{{ store_result(name, response=res, agate_table=table) }}
{%- endif -%}
{%- endif -%}
{%- endmacro %}
{% macro noop_statement(name=None, status=None, res=None) -%}
{% macro noop_statement(name=None, message=None, code=None, rows_affected=None, res=None) -%}
{%- set sql = caller() -%}
{%- if name == 'main' -%}
@@ -24,7 +24,7 @@
{%- endif -%}
{%- if name is not none -%}
{{ store_result(name, status=status, agate_table=res) }}
{{ store_raw_result(name, message=message, code=code, rows_affected=rows_affected, agate_table=res) }}
{%- endif -%}
{%- endmacro %}


@@ -112,7 +112,7 @@
{%- set exists_as_view = (old_relation is not none and old_relation.is_view) -%}
{%- set agate_table = load_agate_table() -%}
{%- do store_result('agate_table', status='OK', agate_table=agate_table) -%}
{%- do store_result('agate_table', response='OK', agate_table=agate_table) -%}
{{ run_hooks(pre_hooks, inside_transaction=False) }}
@@ -129,11 +129,11 @@
{% set create_table_sql = create_csv_table(model, agate_table) %}
{% endif %}
{% set status = 'CREATE' if full_refresh_mode else 'INSERT' %}
{% set num_rows = (agate_table.rows | length) %}
{% set code = 'CREATE' if full_refresh_mode else 'INSERT' %}
{% set rows_affected = (agate_table.rows | length) %}
{% set sql = load_csv_rows(model, agate_table) %}
{% call noop_statement('main', status ~ ' ' ~ num_rows) %}
{% call noop_statement('main', code ~ ' ' ~ rows_affected, code, rows_affected) %}
{{ create_table_sql }};
-- dbt seed --
{{ sql }}


@@ -37,6 +37,7 @@
{{ strategy.unique_key }} as dbt_unique_key
from {{ target_relation }}
where dbt_valid_to is null
),
@@ -87,7 +88,6 @@
where snapshotted_data.dbt_unique_key is null
or (
snapshotted_data.dbt_unique_key is not null
and snapshotted_data.dbt_valid_to is null
and (
{{ strategy.row_changed }}
)
@@ -104,8 +104,7 @@
from updates_source_data as source_data
join snapshotted_data on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where snapshotted_data.dbt_valid_to is null
and (
where (
{{ strategy.row_changed }}
)
)
@@ -125,8 +124,7 @@
from snapshotted_data
left join deletes_source_data as source_data on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where snapshotted_data.dbt_valid_to is null
and source_data.dbt_unique_key is null
where source_data.dbt_unique_key is null
)
{%- endif %}
@@ -216,7 +214,7 @@
{% if not target_relation_exists %}
{% set build_sql = build_snapshot_table(strategy, model['injected_sql']) %}
{% set build_sql = build_snapshot_table(strategy, model['compiled_sql']) %}
{% set final_sql = create_table_as(False, target_relation, build_sql) %}
{% else %}


@@ -13,13 +13,7 @@
when matched
and DBT_INTERNAL_DEST.dbt_valid_to is null
and DBT_INTERNAL_SOURCE.dbt_change_type = 'update'
then update
set dbt_valid_to = DBT_INTERNAL_SOURCE.dbt_valid_to
when matched
and DBT_INTERNAL_DEST.dbt_valid_to is null
and DBT_INTERNAL_SOURCE.dbt_change_type = 'delete'
and DBT_INTERNAL_SOURCE.dbt_change_type in ('update', 'delete')
then update
set dbt_valid_to = DBT_INTERNAL_SOURCE.dbt_valid_to


@@ -106,7 +106,7 @@
{% macro snapshot_check_all_get_existing_columns(node, target_exists) -%}
{%- set query_columns = get_columns_in_query(node['injected_sql']) -%}
{%- set query_columns = get_columns_in_query(node['compiled_sql']) -%}
{%- if not target_exists -%}
{# no table yet -> return whatever the query does #}
{{ return([false, query_columns]) }}
@@ -164,7 +164,11 @@
{%- for col in check_cols -%}
{{ snapshotted_rel }}.{{ col }} != {{ current_rel }}.{{ col }}
or
({{ snapshotted_rel }}.{{ col }} is null) != ({{ current_rel }}.{{ col }} is null)
(
(({{ snapshotted_rel }}.{{ col }} is null) and not ({{ current_rel }}.{{ col }} is null))
or
((not {{ snapshotted_rel }}.{{ col }} is null) and ({{ current_rel }}.{{ col }} is null))
)
{%- if not loop.last %} or {% endif -%}
{%- endfor -%}
{%- endif -%}

File diff suppressed because one or more lines are too long


@@ -13,7 +13,7 @@ from typing import Optional, List, ContextManager, Callable, Dict, Any, Set
import colorama
import logbook
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
# Colorama needs some help on windows because we're using logger.info
# instead of print(). If the Windows env doesn't have a TERM var set,
@@ -45,11 +45,10 @@ DEBUG_LOG_FORMAT = (
ExceptionInformation = str
Extras = Dict[str, Any]
@dataclass
class LogMessage(JsonSchemaMixin):
class LogMessage(dbtClassMixin):
timestamp: datetime
message: str
channel: str
@@ -57,7 +56,7 @@ class LogMessage(JsonSchemaMixin):
levelname: str
thread_name: str
process: int
extra: Optional[Extras] = None
extra: Optional[Dict[str, Any]] = None
exc_info: Optional[ExceptionInformation] = None
@classmethod
@@ -215,7 +214,7 @@ class TextOnly(logbook.Processor):
class TimingProcessor(logbook.Processor):
def __init__(self, timing_info: Optional[JsonSchemaMixin] = None):
def __init__(self, timing_info: Optional[dbtClassMixin] = None):
self.timing_info = timing_info
super().__init__()


@@ -23,6 +23,7 @@ import dbt.task.generate as generate_task
import dbt.task.serve as serve_task
import dbt.task.freshness as freshness_task
import dbt.task.run_operation as run_operation_task
import dbt.task.parse as parse_task
from dbt.profiler import profiler
from dbt.task.list import ListTask
from dbt.task.rpc.server import RPCServerTask
@@ -445,6 +446,21 @@ def _build_snapshot_subparser(subparsers, base_subparser):
return sub
def _add_defer_argument(*subparsers):
for sub in subparsers:
sub.add_optional_argument_inverse(
'--defer',
enable_help='''
If set, defer to the state variable for resolving unselected nodes.
''',
disable_help='''
If set, do not defer to the state variable for resolving unselected
nodes.
''',
default=flags.DEFER_MODE,
)
def _build_run_subparser(subparsers, base_subparser):
run_sub = subparsers.add_parser(
'run',
@@ -462,19 +478,6 @@ def _build_run_subparser(subparsers, base_subparser):
'''
)
# this is a "dbt run"-only thing, for now
run_sub.add_optional_argument_inverse(
'--defer',
enable_help='''
If set, defer to the state variable for resolving unselected nodes.
''',
disable_help='''
If set, do not defer to the state variable for resolving unselected
nodes.
''',
default=flags.DEFER_MODE,
)
run_sub.set_defaults(cls=run_task.RunTask, which='run', rpc_method='run')
return run_sub
@@ -494,6 +497,21 @@ def _build_compile_subparser(subparsers, base_subparser):
return sub
def _build_parse_subparser(subparsers, base_subparser):
sub = subparsers.add_parser(
'parse',
parents=[base_subparser],
help='''
Parses the project and provides information on performance
'''
)
sub.set_defaults(cls=parse_task.ParseTask, which='parse',
rpc_method='parse')
sub.add_argument('--write-manifest', action='store_true')
sub.add_argument('--compile', action='store_true')
return sub
def _build_docs_generate_subparser(subparsers, base_subparser):
# it might look like docs_sub is the correct parents entry, but that
# will cause weird errors about 'conflicting option strings'.
@@ -1006,16 +1024,19 @@ def parse_args(args, cls=DBTArgumentParser):
rpc_sub = _build_rpc_subparser(subs, base_subparser)
run_sub = _build_run_subparser(subs, base_subparser)
compile_sub = _build_compile_subparser(subs, base_subparser)
parse_sub = _build_parse_subparser(subs, base_subparser)
generate_sub = _build_docs_generate_subparser(docs_subs, base_subparser)
test_sub = _build_test_subparser(subs, base_subparser)
seed_sub = _build_seed_subparser(subs, base_subparser)
# --threads, --no-version-check
_add_common_arguments(run_sub, compile_sub, generate_sub, test_sub,
rpc_sub, seed_sub)
rpc_sub, seed_sub, parse_sub)
# --models, --exclude
# list_sub sets up its own arguments.
_add_selection_arguments(run_sub, compile_sub, generate_sub, test_sub)
_add_selection_arguments(snapshot_sub, seed_sub, models_name='select')
# --defer
_add_defer_argument(run_sub, test_sub)
# --full-refresh
_add_table_mutability_arguments(run_sub, compile_sub)


@@ -1,6 +1,6 @@
from typing import List
from hologram.helpers import StrEnum
from dbt.dataclass_schema import StrEnum
class NodeType(StrEnum):
@@ -14,7 +14,7 @@ class NodeType(StrEnum):
Documentation = 'docs'
Source = 'source'
Macro = 'macro'
Report = 'report'
Exposure = 'exposure'
@classmethod
def executable(cls) -> List['NodeType']:
@@ -46,6 +46,7 @@ class NodeType(StrEnum):
cls.Source,
cls.Macro,
cls.Analysis,
cls.Exposure
]
def pluralize(self) -> str:


@@ -13,7 +13,9 @@ class AnalysisParser(SimpleSQLParser[ParsedAnalysisNode]):
)
def parse_from_dict(self, dct, validate=True) -> ParsedAnalysisNode:
return ParsedAnalysisNode.from_dict(dct, validate=validate)
if validate:
ParsedAnalysisNode.validate(dct)
return ParsedAnalysisNode.from_dict(dct)
@property
def resource_type(self) -> NodeType:


@@ -5,7 +5,7 @@ from typing import (
List, Dict, Any, Iterable, Generic, TypeVar
)
from hologram import ValidationError
from dbt.dataclass_schema import ValidationError
from dbt import utils
from dbt.clients.jinja import MacroGenerator
@@ -23,7 +23,7 @@ from dbt.context.context_config import (
from dbt.contracts.files import (
SourceFile, FilePath, FileHash
)
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.manifest import MacroManifest
from dbt.contracts.graph.parsed import HasUniqueID
from dbt.contracts.graph.unparsed import UnparsedNode
from dbt.exceptions import (
@@ -99,7 +99,7 @@ class Parser(BaseParser[FinalValue], Generic[FinalValue]):
results: ParseResult,
project: Project,
root_project: RuntimeConfig,
macro_manifest: Manifest,
macro_manifest: MacroManifest,
) -> None:
super().__init__(results, project)
self.root_project = root_project
@@ -108,9 +108,10 @@ class Parser(BaseParser[FinalValue], Generic[FinalValue]):
class RelationUpdate:
def __init__(
self, config: RuntimeConfig, manifest: Manifest, component: str
self, config: RuntimeConfig, macro_manifest: MacroManifest,
component: str
) -> None:
macro = manifest.find_generate_macro_by_name(
macro = macro_manifest.find_generate_macro_by_name(
component=component,
root_project_name=config.project_name,
)
@@ -120,7 +121,7 @@ class RelationUpdate:
)
root_context = generate_generate_component_name_macro(
macro, config, manifest
macro, config, macro_manifest
)
self.updater = MacroGenerator(macro, root_context)
self.component = component
@@ -144,18 +145,21 @@ class ConfiguredParser(
results: ParseResult,
project: Project,
root_project: RuntimeConfig,
macro_manifest: Manifest,
macro_manifest: MacroManifest,
) -> None:
super().__init__(results, project, root_project, macro_manifest)
self._update_node_database = RelationUpdate(
manifest=macro_manifest, config=root_project, component='database'
macro_manifest=macro_manifest, config=root_project,
component='database'
)
self._update_node_schema = RelationUpdate(
manifest=macro_manifest, config=root_project, component='schema'
macro_manifest=macro_manifest, config=root_project,
component='schema'
)
self._update_node_alias = RelationUpdate(
manifest=macro_manifest, config=root_project, component='alias'
macro_manifest=macro_manifest, config=root_project,
component='alias'
)
@abc.abstractclassmethod
@@ -252,7 +256,7 @@ class ConfiguredParser(
}
dct.update(kwargs)
try:
return self.parse_from_dict(dct)
return self.parse_from_dict(dct, validate=True)
except ValidationError as exc:
msg = validator_error_message(exc)
# this is a bit silly, but build an UnparsedNode just for error
@@ -275,20 +279,24 @@ class ConfiguredParser(
def render_with_context(
self, parsed_node: IntermediateNode, config: ContextConfig
) -> None:
"""Given the parsed node and a ContextConfig to use during parsing,
render the node's sql with macro capture enabled.
# Given the parsed node and a ContextConfig to use during parsing,
# render the node's sql with macro capture enabled.
# Note: this mutates the config object when config calls are rendered.
Note: this mutates the config object when config() calls are rendered.
"""
# during parsing, we don't have a connection, but we might need one, so
# we have to acquire it.
with get_adapter(self.root_project).connection_for(parsed_node):
context = self._context_for(parsed_node, config)
# this goes through the process of rendering, but just throws away
# the rendered result. The "macro capture" is the point?
get_rendered(
parsed_node.raw_sql, context, parsed_node, capture_macros=True
)
# This is taking the original config for the node, converting it to a dict,
# updating the config with new config passed in, then re-creating the
# config from the dict in the node.
def update_parsed_node_config(
self, parsed_node: IntermediateNode, config_dict: Dict[str, Any]
) -> None:


@@ -12,7 +12,9 @@ class DataTestParser(SimpleSQLParser[ParsedDataTestNode]):
)
def parse_from_dict(self, dct, validate=True) -> ParsedDataTestNode:
return ParsedDataTestNode.from_dict(dct, validate=validate)
if validate:
ParsedDataTestNode.validate(dct)
return ParsedDataTestNode.from_dict(dct)
@property
def resource_type(self) -> NodeType:


@@ -79,7 +79,9 @@ class HookParser(SimpleParser[HookBlock, ParsedHookNode]):
return [path]
def parse_from_dict(self, dct, validate=True) -> ParsedHookNode:
return ParsedHookNode.from_dict(dct, validate=validate)
if validate:
ParsedHookNode.validate(dct)
return ParsedHookNode.from_dict(dct)
@classmethod
def get_compiled_path(cls, block: HookBlock):


@@ -1,10 +1,14 @@
from dataclasses import dataclass
from dataclasses import field
import os
import pickle
from typing import (
Dict, Optional, Mapping, Callable, Any, List, Type, Union, MutableMapping
)
import time
import dbt.exceptions
import dbt.tracking
import dbt.flags as flags
from dbt.adapters.factory import (
@@ -19,10 +23,13 @@ from dbt.config import Project, RuntimeConfig
from dbt.context.docs import generate_runtime_docs
from dbt.contracts.files import FilePath, FileHash
from dbt.contracts.graph.compiled import ManifestNode
from dbt.contracts.graph.manifest import Manifest, Disabled
from dbt.contracts.graph.parsed import (
ParsedSourceDefinition, ParsedNode, ParsedMacro, ColumnInfo, ParsedReport
from dbt.contracts.graph.manifest import (
Manifest, MacroManifest, AnyManifest, Disabled
)
from dbt.contracts.graph.parsed import (
ParsedSourceDefinition, ParsedNode, ParsedMacro, ColumnInfo, ParsedExposure
)
from dbt.contracts.util import Writable
from dbt.exceptions import (
ref_target_not_found,
get_target_not_found_or_disabled_msg,
@@ -46,12 +53,39 @@ from dbt.parser.sources import patch_sources
from dbt.ui import warning_tag
from dbt.version import __version__
from dbt.dataclass_schema import dbtClassMixin
PARTIAL_PARSE_FILE_NAME = 'partial_parse.pickle'
PARSING_STATE = DbtProcessState('parsing')
DEFAULT_PARTIAL_PARSE = False
@dataclass
class ParserInfo(dbtClassMixin):
parser: str
elapsed: float
path_count: int = 0
@dataclass
class ProjectLoaderInfo(dbtClassMixin):
project_name: str
elapsed: float
parsers: List[ParserInfo]
path_count: int = 0
@dataclass
class ManifestLoaderInfo(dbtClassMixin, Writable):
path_count: int = 0
is_partial_parse_enabled: Optional[bool] = None
parse_project_elapsed: Optional[float] = None
patch_sources_elapsed: Optional[float] = None
process_manifest_elapsed: Optional[float] = None
load_all_elapsed: Optional[float] = None
projects: List[ProjectLoaderInfo] = field(default_factory=list)
_parser_types: List[Type[Parser]] = [
ModelParser,
SnapshotParser,
@@ -105,20 +139,43 @@ class ManifestLoader:
self,
root_project: RuntimeConfig,
all_projects: Mapping[str, Project],
macro_hook: Optional[Callable[[Manifest], Any]] = None,
macro_hook: Optional[Callable[[AnyManifest], Any]] = None,
) -> None:
self.root_project: RuntimeConfig = root_project
self.all_projects: Mapping[str, Project] = all_projects
self.macro_hook: Callable[[Manifest], Any]
self.macro_hook: Callable[[AnyManifest], Any]
if macro_hook is None:
self.macro_hook = lambda m: None
else:
self.macro_hook = macro_hook
# results holds all of the nodes created by parsing,
# in dictionaries: nodes, sources, docs, macros, exposures,
# macro_patches, patches, source_patches, files, etc
self.results: ParseResult = make_parse_result(
root_project, all_projects,
)
self._loaded_file_cache: Dict[str, FileBlock] = {}
self._perf_info = ManifestLoaderInfo(
is_partial_parse_enabled=self._partial_parse_enabled()
)
def track_project_load(self):
invocation_id = dbt.tracking.active_user.invocation_id
dbt.tracking.track_project_load({
"invocation_id": invocation_id,
"project_id": self.root_project.hashed_name(),
"path_count": self._perf_info.path_count,
"parse_project_elapsed": self._perf_info.parse_project_elapsed,
"patch_sources_elapsed": self._perf_info.patch_sources_elapsed,
"process_manifest_elapsed": (
self._perf_info.process_manifest_elapsed
),
"load_all_elapsed": self._perf_info.load_all_elapsed,
"is_partial_parse_enabled": (
self._perf_info.is_partial_parse_enabled
),
})
def parse_with_cache(
self,
@@ -158,7 +215,7 @@ class ManifestLoader:
def parse_project(
self,
project: Project,
macro_manifest: Manifest,
macro_manifest: MacroManifest,
old_results: Optional[ParseResult],
) -> None:
parsers: List[Parser] = []
@@ -170,11 +227,37 @@ class ManifestLoader:
# per-project cache.
self._loaded_file_cache.clear()
project_parser_info: List[ParserInfo] = []
start_timer = time.perf_counter()
total_path_count = 0
for parser in parsers:
parser_path_count = 0
parser_start_timer = time.perf_counter()
for path in parser.search():
self.parse_with_cache(path, parser, old_results)
parser_path_count = parser_path_count + 1
def load_only_macros(self) -> Manifest:
if parser_path_count > 0:
project_parser_info.append(ParserInfo(
parser=parser.resource_type,
path_count=parser_path_count,
elapsed=time.perf_counter() - parser_start_timer
))
total_path_count = total_path_count + parser_path_count
elapsed = time.perf_counter() - start_timer
project_info = ProjectLoaderInfo(
project_name=project.project_name,
path_count=total_path_count,
elapsed=elapsed,
parsers=project_parser_info
)
self._perf_info.projects.append(project_info)
self._perf_info.path_count = (
self._perf_info.path_count + total_path_count
)
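A stripped-down, runnable version of the timing bookkeeping added here; the (name, paths) pairs below stand in for real parser objects:

import time
from dataclasses import dataclass
from typing import List

@dataclass
class ParserTiming:
    parser: str
    elapsed: float
    path_count: int = 0

def time_parsers(parsers) -> List[ParserTiming]:
    # parsers: iterable of (name, list_of_paths) pairs -- hypothetical shape
    timings = []
    for name, paths in parsers:
        start = time.perf_counter()
        count = 0
        for _path in paths:
            count += 1  # real parsing work would happen here
        if count > 0:
            timings.append(ParserTiming(parser=name, path_count=count,
                                        elapsed=time.perf_counter() - start))
    return timings

print(time_parsers([('models', ['a.sql', 'b.sql']), ('seeds', [])]))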
def load_only_macros(self) -> MacroManifest:
old_results = self.read_parse_results()
for project in self.all_projects.values():
@@ -183,24 +266,33 @@ class ManifestLoader:
self.parse_with_cache(path, parser, old_results)
# make a manifest with just the macros to get the context
macro_manifest = Manifest.from_macros(
macro_manifest = MacroManifest(
macros=self.results.macros,
files=self.results.files
)
self.macro_hook(macro_manifest)
return macro_manifest
def load(self, macro_manifest: Manifest):
# This is where the main action happens
def load(self, macro_manifest: MacroManifest):
# if partial parse is enabled, load old results
old_results = self.read_parse_results()
if old_results is not None:
logger.debug('Got an acceptable cached parse result')
# store the macros & files from the adapter macro manifest
self.results.macros.update(macro_manifest.macros)
self.results.files.update(macro_manifest.files)
start_timer = time.perf_counter()
for project in self.all_projects.values():
# parse a single project
self.parse_project(project, macro_manifest, old_results)
self._perf_info.parse_project_elapsed = (
time.perf_counter() - start_timer
)
def write_parse_results(self):
path = os.path.join(self.root_project.target_path,
PARTIAL_PARSE_FILE_NAME)
@@ -300,7 +392,11 @@ class ManifestLoader:
# before we do anything else, patch the sources. This mutates
# results.disabled, so it needs to come before the final 'disabled'
# list is created
start_patch = time.perf_counter()
sources = patch_sources(self.results, self.root_project)
self._perf_info.patch_sources_elapsed = (
time.perf_counter() - start_patch
)
disabled = []
for value in self.results.disabled.values():
disabled.extend(value)
@@ -314,24 +410,33 @@ class ManifestLoader:
sources=sources,
macros=self.results.macros,
docs=self.results.docs,
reports=self.results.reports,
exposures=self.results.exposures,
metadata=self.root_project.get_metadata(),
disabled=disabled,
files=self.results.files,
selectors=self.root_project.manifest_selectors,
)
manifest.patch_nodes(self.results.patches)
manifest.patch_macros(self.results.macro_patches)
start_process = time.perf_counter()
self.process_manifest(manifest)
self._perf_info.process_manifest_elapsed = (
time.perf_counter() - start_process
)
return manifest
@classmethod
def load_all(
cls,
root_config: RuntimeConfig,
macro_manifest: Manifest,
macro_hook: Callable[[Manifest], Any],
macro_manifest: MacroManifest,
macro_hook: Callable[[AnyManifest], Any],
) -> Manifest:
with PARSING_STATE:
start_load_all = time.perf_counter()
projects = root_config.load_dependencies()
loader = cls(root_config, projects, macro_hook)
loader.load(macro_manifest=macro_manifest)
@@ -339,14 +444,21 @@ class ManifestLoader:
manifest = loader.create_manifest()
_check_manifest(manifest, root_config)
manifest.build_flat_graph()
loader._perf_info.load_all_elapsed = (
time.perf_counter() - start_load_all
)
loader.track_project_load()
return manifest
@classmethod
def load_macros(
cls,
root_config: RuntimeConfig,
macro_hook: Callable[[Manifest], Any],
) -> Manifest:
macro_hook: Callable[[AnyManifest], Any],
) -> MacroManifest:
with PARSING_STATE:
projects = root_config.load_dependencies()
loader = cls(root_config, projects, macro_hook)
@@ -514,6 +626,12 @@ def _process_docs_for_macro(
arg.description = get_rendered(arg.description, context)
def _process_docs_for_exposure(
context: Dict[str, Any], exposure: ParsedExposure
) -> None:
exposure.description = get_rendered(exposure.description, context)
def process_docs(manifest: Manifest, config: RuntimeConfig):
for node in manifest.nodes.values():
ctx = generate_runtime_docs(
@@ -539,13 +657,21 @@ def process_docs(manifest: Manifest, config: RuntimeConfig):
config.project_name,
)
_process_docs_for_macro(ctx, macro)
for exposure in manifest.exposures.values():
ctx = generate_runtime_docs(
config,
exposure,
manifest,
config.project_name,
)
_process_docs_for_exposure(ctx, exposure)
def _process_refs_for_report(
manifest: Manifest, current_project: str, report: ParsedReport
def _process_refs_for_exposure(
manifest: Manifest, current_project: str, exposure: ParsedExposure
):
"""Given a manifest and a report in that manifest, process its refs"""
for ref in report.refs:
"""Given a manifest and a exposure in that manifest, process its refs"""
for ref in exposure.refs:
target_model: Optional[Union[Disabled, ManifestNode]] = None
target_model_name: str
target_model_package: Optional[str] = None
@@ -563,14 +689,14 @@ def _process_refs_for_report(
target_model_name,
target_model_package,
current_project,
report.package_name,
exposure.package_name,
)
if target_model is None or isinstance(target_model, Disabled):
# This may raise. Even if it doesn't, we don't want to add
# this report to the graph b/c there is no destination report
# this exposure to the graph b/c there is no destination exposure
invalid_ref_fail_unless_test(
report, target_model_name, target_model_package,
exposure, target_model_name, target_model_package,
disabled=(isinstance(target_model, Disabled))
)
@@ -578,8 +704,8 @@ def _process_refs_for_report(
target_model_id = target_model.unique_id
report.depends_on.nodes.append(target_model_id)
manifest.update_report(report)
exposure.depends_on.nodes.append(target_model_id)
manifest.update_exposure(exposure)
def _process_refs_for_node(
@@ -630,33 +756,33 @@ def _process_refs_for_node(
def process_refs(manifest: Manifest, current_project: str):
for node in manifest.nodes.values():
_process_refs_for_node(manifest, current_project, node)
for report in manifest.reports.values():
_process_refs_for_report(manifest, current_project, report)
for exposure in manifest.exposures.values():
_process_refs_for_exposure(manifest, current_project, exposure)
return manifest
def _process_sources_for_report(
manifest: Manifest, current_project: str, report: ParsedReport
def _process_sources_for_exposure(
manifest: Manifest, current_project: str, exposure: ParsedExposure
):
target_source: Optional[Union[Disabled, ParsedSourceDefinition]] = None
for source_name, table_name in report.sources:
for source_name, table_name in exposure.sources:
target_source = manifest.resolve_source(
source_name,
table_name,
current_project,
report.package_name,
exposure.package_name,
)
if target_source is None or isinstance(target_source, Disabled):
invalid_source_fail_unless_test(
report,
exposure,
source_name,
table_name,
disabled=(isinstance(target_source, Disabled))
)
continue
target_source_id = target_source.unique_id
report.depends_on.nodes.append(target_source_id)
manifest.update_report(report)
exposure.depends_on.nodes.append(target_source_id)
manifest.update_exposure(exposure)
def _process_sources_for_node(
@@ -692,8 +818,8 @@ def process_sources(manifest: Manifest, current_project: str):
continue
assert not isinstance(node, ParsedSourceDefinition)
_process_sources_for_node(manifest, current_project, node)
for report in manifest.reports.values():
_process_sources_for_report(manifest, current_project, report)
for exposure in manifest.exposures.values():
_process_sources_for_exposure(manifest, current_project, exposure)
return manifest
@@ -723,14 +849,14 @@ def process_node(
def load_macro_manifest(
config: RuntimeConfig,
macro_hook: Callable[[Manifest], Any],
) -> Manifest:
macro_hook: Callable[[AnyManifest], Any],
) -> MacroManifest:
return ManifestLoader.load_macros(config, macro_hook)
def load_manifest(
config: RuntimeConfig,
macro_manifest: Manifest,
macro_hook: Callable[[Manifest], Any],
macro_manifest: MacroManifest,
macro_hook: Callable[[AnyManifest], Any],
) -> Manifest:
return ManifestLoader.load_all(config, macro_manifest, macro_hook)


@@ -11,7 +11,9 @@ class ModelParser(SimpleSQLParser[ParsedModelNode]):
)
def parse_from_dict(self, dct, validate=True) -> ParsedModelNode:
return ParsedModelNode.from_dict(dct, validate=validate)
if validate:
ParsedModelNode.validate(dct)
return ParsedModelNode.from_dict(dct)
@property
def resource_type(self) -> NodeType:


@@ -1,7 +1,7 @@
from dataclasses import dataclass, field
from typing import TypeVar, MutableMapping, Mapping, Union, List
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
from dbt.contracts.files import RemoteFile, FileHash, SourceFile
from dbt.contracts.graph.compiled import CompileResultNode
@@ -15,7 +15,7 @@ from dbt.contracts.graph.parsed import (
ParsedMacroPatch,
ParsedModelNode,
ParsedNodePatch,
ParsedReport,
ParsedExposure,
ParsedRPCNode,
ParsedSeedNode,
ParsedSchemaTestNode,
@@ -62,7 +62,7 @@ def dict_field():
@dataclass
class ParseResult(JsonSchemaMixin, Writable, Replaceable):
class ParseResult(dbtClassMixin, Writable, Replaceable):
vars_hash: FileHash
profile_hash: FileHash
project_hashes: MutableMapping[str, FileHash]
@@ -70,7 +70,7 @@ class ParseResult(JsonSchemaMixin, Writable, Replaceable):
sources: MutableMapping[str, UnpatchedSourceDefinition] = dict_field()
docs: MutableMapping[str, ParsedDocumentation] = dict_field()
macros: MutableMapping[str, ParsedMacro] = dict_field()
reports: MutableMapping[str, ParsedReport] = dict_field()
exposures: MutableMapping[str, ParsedExposure] = dict_field()
macro_patches: MutableMapping[MacroKey, ParsedMacroPatch] = dict_field()
patches: MutableMapping[str, ParsedNodePatch] = dict_field()
source_patches: MutableMapping[SourceKey, SourcePatch] = dict_field()
@@ -103,10 +103,10 @@ class ParseResult(JsonSchemaMixin, Writable, Replaceable):
self.add_node_nofile(node)
self.get_file(source_file).nodes.append(node.unique_id)
def add_report(self, source_file: SourceFile, report: ParsedReport):
_check_duplicates(report, self.reports)
self.reports[report.unique_id] = report
self.get_file(source_file).reports.append(report.unique_id)
def add_exposure(self, source_file: SourceFile, exposure: ParsedExposure):
_check_duplicates(exposure, self.exposures)
self.exposures[exposure.unique_id] = exposure
self.get_file(source_file).exposures.append(exposure.unique_id)
def add_disabled_nofile(self, node: CompileResultNode):
if node.unique_id in self.disabled:
@@ -269,11 +269,11 @@ class ParseResult(JsonSchemaMixin, Writable, Replaceable):
continue
self._process_node(node_id, source_file, old_file, old_result)
for report_id in old_file.reports:
report = _expect_value(
report_id, old_result.reports, old_file, "reports"
for exposure_id in old_file.exposures:
exposure = _expect_value(
exposure_id, old_result.exposures, old_file, "exposures"
)
self.add_report(source_file, report)
self.add_exposure(source_file, exposure)
patched = False
for name in old_file.patches:


@@ -26,7 +26,9 @@ class RPCCallParser(SimpleSQLParser[ParsedRPCNode]):
return []
def parse_from_dict(self, dct, validate=True) -> ParsedRPCNode:
return ParsedRPCNode.from_dict(dct, validate=validate)
if validate:
ParsedRPCNode.validate(dct)
return ParsedRPCNode.from_dict(dct)
@property
def resource_type(self) -> NodeType:


@@ -13,7 +13,7 @@ from dbt.contracts.graph.unparsed import (
UnparsedAnalysisUpdate,
UnparsedMacroUpdate,
UnparsedNodeUpdate,
UnparsedReport,
UnparsedExposure,
)
from dbt.exceptions import raise_compiler_error
from dbt.parser.search import FileBlock
@@ -82,7 +82,7 @@ Target = TypeVar(
UnparsedMacroUpdate,
UnparsedAnalysisUpdate,
UnpatchedSourceDefinition,
UnparsedReport,
UnparsedExposure,
)
@@ -179,6 +179,7 @@ class TestBuilder(Generic[Testable]):
- or it may not be namespaced (test)
"""
# The 'test_name' is used to find the 'macro' that implements the test
TEST_NAME_PATTERN = re.compile(
r'((?P<test_namespace>([a-zA-Z_][0-9a-zA-Z_]*))\.)?'
r'(?P<test_name>([a-zA-Z_][0-9a-zA-Z_]*))'
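Those two groups are enough to split an optionally namespaced test name; a quick standalone check using only the pattern shown here:

import re

TEST_NAME_PATTERN = re.compile(
    r'((?P<test_namespace>([a-zA-Z_][0-9a-zA-Z_]*))\.)?'
    r'(?P<test_name>([a-zA-Z_][0-9a-zA-Z_]*))'
)

for raw in ('unique', 'dbt_utils.at_least_one'):
    match = TEST_NAME_PATTERN.match(raw)
    # namespaced tests carry the package in test_namespace; plain tests do not
    print(raw, match.group('test_namespace'), match.group('test_name'))
# unique None unique
# dbt_utils.at_least_one dbt_utils at_least_one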
@@ -302,6 +303,8 @@ class TestBuilder(Generic[Testable]):
name = '{}_{}'.format(self.namespace, name)
return get_nice_schema_test_name(name, self.target.name, self.args)
# this is the 'raw_sql' that's used in 'render_update' and execution
# of the test macro
def build_raw_sql(self) -> str:
return (
"{{{{ config(severity='{severity}') }}}}"


@@ -6,9 +6,9 @@ from typing import (
Iterable, Dict, Any, Union, List, Optional, Generic, TypeVar, Type
)
from hologram import ValidationError, JsonSchemaMixin
from dbt.dataclass_schema import ValidationError, dbtClassMixin
from dbt.adapters.factory import get_adapter
from dbt.adapters.factory import get_adapter, get_adapter_package_names
from dbt.clients.jinja import get_rendered, add_rendered_test_kwargs
from dbt.clients.yaml_helper import load_yaml_text
from dbt.config.renderer import SchemaYamlRenderer
@@ -20,7 +20,10 @@ from dbt.context.context_config import (
)
from dbt.context.configured import generate_schema_yml
from dbt.context.target import generate_target_context
from dbt.context.providers import generate_parse_report
from dbt.context.providers import (
generate_parse_exposure, generate_test_context
)
from dbt.context.macro_resolver import MacroResolver
from dbt.contracts.files import FileHash
from dbt.contracts.graph.manifest import SourceFile
from dbt.contracts.graph.model_config import SourceConfig
@@ -31,7 +34,7 @@ from dbt.contracts.graph.parsed import (
ParsedSchemaTestNode,
ParsedMacroPatch,
UnpatchedSourceDefinition,
ParsedReport,
ParsedExposure,
)
from dbt.contracts.graph.unparsed import (
FreshnessThreshold,
@@ -43,7 +46,7 @@ from dbt.contracts.graph.unparsed import (
UnparsedColumn,
UnparsedMacroUpdate,
UnparsedNodeUpdate,
UnparsedReport,
UnparsedExposure,
UnparsedSourceDefinition,
)
from dbt.exceptions import (
@@ -95,6 +98,7 @@ def error_context(
class ParserRef:
"""A helper object to hold parse-time references."""
def __init__(self):
self.column_info: Dict[str, ColumnInfo] = {}
@@ -172,6 +176,15 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
self.raw_renderer = SchemaYamlRenderer(ctx)
internal_package_names = get_adapter_package_names(
self.root_project.credentials.type
)
self.macro_resolver = MacroResolver(
self.macro_manifest.macros,
self.root_project.project_name,
internal_package_names
)
@classmethod
def get_compiled_path(cls, block: FileBlock) -> str:
# should this raise an error?
@@ -201,9 +214,11 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
)
def parse_from_dict(self, dct, validate=True) -> ParsedSchemaTestNode:
return ParsedSchemaTestNode.from_dict(dct, validate=validate)
if validate:
ParsedSchemaTestNode.validate(dct)
return ParsedSchemaTestNode.from_dict(dct)
def _parse_format_version(
def _check_format_version(
self, yaml: YamlBlock
) -> None:
path = yaml.path.relative_path
@@ -264,6 +279,11 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
base=False,
)
def _get_relation_name(self, node: ParsedSourceDefinition):
adapter = get_adapter(self.root_project)
relation_cls = adapter.Relation
return str(relation_cls.create_from(self.root_project, node))
def parse_source(
self, target: UnpatchedSourceDefinition
) -> ParsedSourceDefinition:
@@ -302,7 +322,7 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
default_database = self.root_project.credentials.database
return ParsedSourceDefinition(
parsed_source = ParsedSourceDefinition(
package_name=target.package_name,
database=(source.database or default_database),
schema=(source.schema or source.name),
@@ -330,6 +350,11 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
unrendered_config=unrendered_config,
)
# relation name is added after instantiation because the adapter does
# not provide the relation name for a UnpatchedSourceDefinition object
parsed_source.relation_name = self._get_relation_name(parsed_source)
return parsed_source
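The relation_name produced by _get_relation_name is adapter-specific; as a rough, hypothetical stand-in (not the adapter's real Relation class), many adapters render it as a quoted dotted path:

def fake_relation_name(database, schema, identifier, quote_char='"'):
    # crude stand-in for str(adapter.Relation.create_from(config, node))
    return '.'.join(f'{quote_char}{part}{quote_char}'
                    for part in (database, schema, identifier))

print(fake_relation_name('analytics', 'raw', 'orders'))
# "analytics"."raw"."orders"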
def create_test_node(
self,
target: Union[UnpatchedSourceDefinition, UnparsedNodeUpdate],
@@ -363,7 +388,8 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
'checksum': FileHash.empty().to_dict(),
}
try:
return self.parse_from_dict(dct)
ParsedSchemaTestNode.validate(dct)
return ParsedSchemaTestNode.from_dict(dct)
except ValidationError as exc:
msg = validator_error_message(exc)
# this is a bit silly, but build an UnparsedNode just for error
@@ -376,6 +402,7 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
)
raise CompilationException(msg, node=node) from exc
# lots of time spent in this method
def _parse_generic_test(
self,
target: Testable,
@@ -414,6 +441,7 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
# is not necessarily this package's name
fqn = self.get_fqn(fqn_path, builder.fqn_name)
# this is the config that is used in render_update
config = self.initial_config(fqn)
metadata = {
@@ -436,9 +464,53 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
column_name=column_name,
test_metadata=metadata,
)
self.render_update(node, config)
self.render_test_update(node, config, builder)
return node
# This does special shortcut processing for the two
# most common internal macros, not_null and unique,
# which avoids the jinja rendering to resolve config
# and variables, etc, which might be in the macro.
# In the future we will look at generalizing this
# more to handle additional macros or to use static
# parsing to avoid jinja overhead.
def render_test_update(self, node, config, builder):
macro_unique_id = self.macro_resolver.get_macro_id(
node.package_name, 'test_' + builder.name)
# Add the depends_on here so we can limit the macros added
# to the context in rendering processing
node.depends_on.add_macro(macro_unique_id)
if (macro_unique_id in
['macro.dbt.test_not_null', 'macro.dbt.test_unique']):
self.update_parsed_node(node, config)
node.unrendered_config['severity'] = builder.severity()
node.config['severity'] = builder.severity()
# source node tests are processed at patch_source time
if isinstance(builder.target, UnpatchedSourceDefinition):
sources = [builder.target.fqn[-2], builder.target.fqn[-1]]
node.sources.append(sources)
else: # all other nodes
node.refs.append([builder.target.name])
else:
try:
# make a base context that doesn't have the magic kwargs field
context = generate_test_context(
node, self.root_project, self.macro_manifest, config,
self.macro_resolver,
)
# update with rendered test kwargs (which collects any refs)
add_rendered_test_kwargs(context, node, capture_macros=True)
# the parsed node is not rendered in the native context.
get_rendered(
node.raw_sql, context, node, capture_macros=True
)
self.update_parsed_node(node, config)
except ValidationError as exc:
# we got a ValidationError - probably bad types in config()
msg = validator_error_message(exc)
raise CompilationException(msg, node=node) from exc
def parse_source_test(
self,
target: UnpatchedSourceDefinition,
@@ -543,17 +615,20 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
for test in block.tests:
self.parse_test(block, test, None)
def parse_reports(self, block: YamlBlock) -> None:
parser = ReportParser(self, block)
def parse_exposures(self, block: YamlBlock) -> None:
parser = ExposureParser(self, block)
for node in parser.parse():
self.results.add_report(block.file, node)
self.results.add_exposure(block.file, node)
def parse_file(self, block: FileBlock) -> None:
dct = self._yaml_from_file(block.file)
# mark the file as seen, even if there are no macros in it
# mark the file as seen, in ParseResult.files
self.results.get_file(block.file)
if dct:
try:
# This does a deep_map to check for circular references
dct = self.raw_renderer.render_data(dct)
except CompilationException as exc:
raise CompilationException(
@@ -561,24 +636,58 @@ class SchemaParser(SimpleParser[SchemaTestBlock, ParsedSchemaTestNode]):
f'project {self.project.project_name}: {exc}'
) from exc
# contains the FileBlock and the data (dictionary)
yaml_block = YamlBlock.from_file_block(block, dct)
self._parse_format_version(yaml_block)
# checks version
self._check_format_version(yaml_block)
parser: YamlDocsReader
for key in NodeType.documentable():
plural = key.pluralize()
if key == NodeType.Source:
parser = SourceParser(self, yaml_block, plural)
elif key == NodeType.Macro:
parser = MacroPatchParser(self, yaml_block, plural)
elif key == NodeType.Analysis:
parser = AnalysisPatchParser(self, yaml_block, plural)
else:
parser = TestablePatchParser(self, yaml_block, plural)
# There are 7 kinds of parsers:
# Model, Seed, Snapshot, Source, Macro, Analysis, Exposures
# NonSourceParser.parse(), TestablePatchParser is a variety of
# NodePatchParser
if 'models' in dct:
parser = TestablePatchParser(self, yaml_block, 'models')
for test_block in parser.parse():
self.parse_tests(test_block)
self.parse_reports(yaml_block)
# NonSourceParser.parse()
if 'seeds' in dct:
parser = TestablePatchParser(self, yaml_block, 'seeds')
for test_block in parser.parse():
self.parse_tests(test_block)
# NonSourceParser.parse()
if 'snapshots' in dct:
parser = TestablePatchParser(self, yaml_block, 'snapshots')
for test_block in parser.parse():
self.parse_tests(test_block)
# This parser uses SourceParser.parse() which doesn't return
# any test blocks. Source tests are handled at a later point
# in the process.
if 'sources' in dct:
parser = SourceParser(self, yaml_block, 'sources')
parser.parse()
# NonSourceParser.parse()
if 'macros' in dct:
parser = MacroPatchParser(self, yaml_block, 'macros')
for test_block in parser.parse():
self.parse_tests(test_block)
# NonSourceParser.parse()
if 'analyses' in dct:
parser = AnalysisPatchParser(self, yaml_block, 'analyses')
for test_block in parser.parse():
self.parse_tests(test_block)
# parse exposures
if 'exposures' in dct:
self.parse_exposures(yaml_block)
Parsed = TypeVar(
@@ -595,11 +704,14 @@ NonSourceTarget = TypeVar(
)
# abstract base class (ABCMeta)
class YamlReader(metaclass=ABCMeta):
def __init__(
self, schema_parser: SchemaParser, yaml: YamlBlock, key: str
) -> None:
self.schema_parser = schema_parser
# key: models, seeds, snapshots, sources, macros,
# analyses, exposures
self.key = key
self.yaml = yaml
@@ -619,6 +731,9 @@ class YamlReader(metaclass=ABCMeta):
def root_project(self):
return self.schema_parser.root_project
# for the different schema subparsers ('models', 'source', etc)
# get the list of dicts pointed to by the key in the yaml config,
# ensure that the dicts have string keys
def get_key_dicts(self) -> Iterable[Dict[str, Any]]:
data = self.yaml.data.get(self.key, [])
if not isinstance(data, list):
@@ -628,7 +743,10 @@ class YamlReader(metaclass=ABCMeta):
)
path = self.yaml.path.original_file_path
# for each dict in the data (which is a list of dicts)
for entry in data:
# check that entry is a dict and that all dict values
# are strings
if coerce_dict_str(entry) is not None:
yield entry
else:
@@ -644,19 +762,22 @@ class YamlDocsReader(YamlReader):
raise NotImplementedError('parse is abstract')
T = TypeVar('T', bound=JsonSchemaMixin)
T = TypeVar('T', bound=dbtClassMixin)
class SourceParser(YamlDocsReader):
def _target_from_dict(self, cls: Type[T], data: Dict[str, Any]) -> T:
path = self.yaml.path.original_file_path
try:
cls.validate(data)
return cls.from_dict(data)
except (ValidationError, JSONValidationException) as exc:
msg = error_context(path, self.key, data, exc)
raise CompilationException(msg) from exc
# the other parse method returns TestBlocks. This one doesn't.
def parse(self) -> List[TestBlock]:
# get a verified list of dicts for the key handled by this parser
for data in self.get_key_dicts():
data = self.project.credentials.translate_aliases(
data, recurse=True
@@ -699,10 +820,12 @@ class SourceParser(YamlDocsReader):
self.results.add_source(self.yaml.file, result)
# This class has three main subclasses: TestablePatchParser (models,
# seeds, snapshots), MacroPatchParser, and AnalysisPatchParser
class NonSourceParser(YamlDocsReader, Generic[NonSourceTarget, Parsed]):
@abstractmethod
def _target_type(self) -> Type[NonSourceTarget]:
raise NotImplementedError('_unsafe_from_dict not implemented')
raise NotImplementedError('_target_type not implemented')
@abstractmethod
def get_block(self, node: NonSourceTarget) -> TargetBlock:
@@ -717,33 +840,55 @@ class NonSourceParser(YamlDocsReader, Generic[NonSourceTarget, Parsed]):
def parse(self) -> List[TestBlock]:
node: NonSourceTarget
test_blocks: List[TestBlock] = []
# get list of 'node' objects
# UnparsedNodeUpdate (TestablePatchParser, models, seeds, snapshots)
# = HasColumnTests, HasTests
# UnparsedAnalysisUpdate (UnparsedAnalysisParser, analyses)
# = HasColumnDocs, HasDocs
# UnparsedMacroUpdate (MacroPatchParser, 'macros')
# = HasDocs
# correspond to this parser's 'key'
for node in self.get_unparsed_target():
# node_block is a TargetBlock (Macro or Analysis)
# or a TestBlock (all of the others)
node_block = self.get_block(node)
if isinstance(node_block, TestBlock):
# TestablePatchParser = models, seeds, snapshots
test_blocks.append(node_block)
if isinstance(node, (HasColumnDocs, HasColumnTests)):
# UnparsedNodeUpdate and UnparsedAnalysisUpdate
refs: ParserRef = ParserRef.from_target(node)
else:
refs = ParserRef()
# This adds the node_block to self.results (a ParseResult
# object) as a ParsedNodePatch or ParsedMacroPatch
self.parse_patch(node_block, refs)
return test_blocks
def get_unparsed_target(self) -> Iterable[NonSourceTarget]:
path = self.yaml.path.original_file_path
for data in self.get_key_dicts():
# get verified list of dicts for the 'key' that this
# parser handles
key_dicts = self.get_key_dicts()
for data in key_dicts:
# add extra data to each dict. This updates the dicts
# in the parser yaml
data.update({
'original_file_path': path,
'yaml_key': self.key,
'package_name': self.project.project_name,
})
try:
model = self._target_type().from_dict(data)
# target_type: UnparsedNodeUpdate, UnparsedAnalysisUpdate,
# or UnparsedMacroUpdate
self._target_type().validate(data)
node = self._target_type().from_dict(data)
except (ValidationError, JSONValidationException) as exc:
msg = error_context(path, self.key, data, exc)
raise CompilationException(msg) from exc
else:
yield model
yield node
class NodePatchParser(
@@ -805,21 +950,21 @@ class MacroPatchParser(NonSourceParser[UnparsedMacroUpdate, ParsedMacroPatch]):
self.results.add_macro_patch(self.yaml.file, result)
class ReportParser(YamlReader):
class ExposureParser(YamlReader):
def __init__(self, schema_parser: SchemaParser, yaml: YamlBlock):
super().__init__(schema_parser, yaml, NodeType.Report.pluralize())
super().__init__(schema_parser, yaml, NodeType.Exposure.pluralize())
self.schema_parser = schema_parser
self.yaml = yaml
def parse_report(self, unparsed: UnparsedReport) -> ParsedReport:
def parse_exposure(self, unparsed: UnparsedExposure) -> ParsedExposure:
package_name = self.project.project_name
unique_id = f'{NodeType.Report}.{package_name}.{unparsed.name}'
unique_id = f'{NodeType.Exposure}.{package_name}.{unparsed.name}'
path = self.yaml.path.relative_path
fqn = self.schema_parser.get_fqn_prefix(path)
fqn.append(unparsed.name)
parsed = ParsedReport(
parsed = ParsedExposure(
package_name=package_name,
root_path=self.project.project_root,
path=path,
@@ -833,7 +978,7 @@ class ReportParser(YamlReader):
owner=unparsed.owner,
maturity=unparsed.maturity,
)
ctx = generate_parse_report(
ctx = generate_parse_exposure(
parsed,
self.root_project,
self.schema_parser.macro_manifest,
@@ -848,12 +993,13 @@ class ReportParser(YamlReader):
# parsed now has a populated refs/sources
return parsed
def parse(self) -> Iterable[ParsedReport]:
def parse(self) -> Iterable[ParsedExposure]:
for data in self.get_key_dicts():
try:
unparsed = UnparsedReport.from_dict(data)
UnparsedExposure.validate(data)
unparsed = UnparsedExposure.from_dict(data)
except (ValidationError, JSONValidationException) as exc:
msg = error_context(self.yaml.path, self.key, data, exc)
raise CompilationException(msg) from exc
parsed = self.parse_report(unparsed)
parsed = self.parse_exposure(unparsed)
yield parsed
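Assuming NodeType.Exposure stringifies to 'exposure', the unique_id built in parse_exposure ends up looking like this (the package and exposure names below are made up):

package_name = 'jaffle_shop'    # hypothetical project name
exposure_name = 'weekly_kpis'   # hypothetical exposure name
unique_id = f'exposure.{package_name}.{exposure_name}'
print(unique_id)  # exposure.jaffle_shop.weekly_kpis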


@@ -13,7 +13,9 @@ class SeedParser(SimpleSQLParser[ParsedSeedNode]):
)
def parse_from_dict(self, dct, validate=True) -> ParsedSeedNode:
return ParsedSeedNode.from_dict(dct, validate=validate)
if validate:
ParsedSeedNode.validate(dct)
return ParsedSeedNode.from_dict(dct)
@property
def resource_type(self) -> NodeType:


@@ -1,7 +1,7 @@
import os
from typing import List
from hologram import ValidationError
from dbt.dataclass_schema import ValidationError
from dbt.contracts.graph.parsed import (
IntermediateSnapshotNode, ParsedSnapshotNode
@@ -26,7 +26,9 @@ class SnapshotParser(
)
def parse_from_dict(self, dct, validate=True) -> IntermediateSnapshotNode:
return IntermediateSnapshotNode.from_dict(dct, validate=validate)
if validate:
IntermediateSnapshotNode.validate(dct)
return IntermediateSnapshotNode.from_dict(dct)
@property
def resource_type(self) -> NodeType:


@@ -6,7 +6,7 @@ from typing import (
Set,
)
from dbt.config import RuntimeConfig
from dbt.contracts.graph.manifest import Manifest, SourceKey
from dbt.contracts.graph.manifest import MacroManifest, SourceKey
from dbt.contracts.graph.parsed import (
UnpatchedSourceDefinition,
ParsedSourceDefinition,
@@ -33,7 +33,7 @@ class SourcePatcher:
) -> None:
self.results = results
self.root_project = root_project
self.macro_manifest = Manifest.from_macros(
self.macro_manifest = MacroManifest(
macros=self.results.macros,
files=self.results.files
)


@@ -3,7 +3,7 @@ little bit too much to go anywhere else.
"""
from dbt.adapters.factory import get_adapter
from dbt.parser.manifest import load_manifest
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.manifest import Manifest, MacroManifest
from dbt.config import RuntimeConfig
@@ -23,10 +23,10 @@ def get_full_manifest(
config.clear_dependencies()
adapter.clear_macro_manifest()
internal: Manifest = adapter.load_macro_manifest()
macro_manifest: MacroManifest = adapter.load_macro_manifest()
return load_manifest(
config,
internal,
macro_manifest,
adapter.connections.set_query_header,
)


@@ -1,8 +1,7 @@
import logbook
import logbook.queues
from jsonrpc.exceptions import JSONRPCError
from hologram import JsonSchemaMixin
from hologram.helpers import StrEnum
from dbt.dataclass_schema import StrEnum
from dataclasses import dataclass, field
from datetime import datetime, timedelta
@@ -25,8 +24,11 @@ class QueueMessageType(StrEnum):
terminating = frozenset((Error, Result, Timeout))
# This class was subclassed from JsonSchemaMixin, but it
# doesn't appear to be necessary, and Mashumaro does not
# handle logbook.LogRecord
@dataclass
class QueueMessage(JsonSchemaMixin):
class QueueMessage:
message_type: QueueMessageType


@@ -3,7 +3,7 @@ from abc import abstractmethod
from copy import deepcopy
from typing import List, Optional, Type, TypeVar, Generic, Dict, Any
from hologram import JsonSchemaMixin, ValidationError
from dbt.dataclass_schema import dbtClassMixin, ValidationError
from dbt.contracts.rpc import RPCParameters, RemoteResult, RemoteMethodFlags
from dbt.exceptions import NotImplementedException, InternalException
@@ -109,7 +109,7 @@ class RemoteBuiltinMethod(RemoteMethod[Parameters, Result]):
'the run() method on builtins should never be called'
)
def __call__(self, **kwargs: Dict[str, Any]) -> JsonSchemaMixin:
def __call__(self, **kwargs: Dict[str, Any]) -> dbtClassMixin:
try:
params = self.get_parameters().from_dict(kwargs)
except ValidationError as exc:


@@ -65,7 +65,7 @@ class RPCCompileRunner(GenericRPCRunner[RemoteCompileResult]):
def execute(self, compiled_node, manifest) -> RemoteCompileResult:
return RemoteCompileResult(
raw_sql=compiled_node.raw_sql,
compiled_sql=compiled_node.injected_sql,
compiled_sql=compiled_node.compiled_sql,
node=compiled_node,
timing=[], # this will get added later
logs=[],
@@ -88,7 +88,7 @@ class RPCCompileRunner(GenericRPCRunner[RemoteCompileResult]):
class RPCExecuteRunner(GenericRPCRunner[RemoteRunResult]):
def execute(self, compiled_node, manifest) -> RemoteRunResult:
_, execute_result = self.adapter.execute(
compiled_node.injected_sql, fetch=True
compiled_node.compiled_sql, fetch=True
)
table = ResultTable(
@@ -98,7 +98,7 @@ class RPCExecuteRunner(GenericRPCRunner[RemoteRunResult]):
return RemoteRunResult(
raw_sql=compiled_node.raw_sql,
compiled_sql=compiled_node.injected_sql,
compiled_sql=compiled_node.compiled_sql,
node=compiled_node,
table=table,
timing=[],


@@ -1,7 +1,7 @@
import json
from typing import Callable, Dict, Any
from hologram import JsonSchemaMixin
from dbt.dataclass_schema import dbtClassMixin
from jsonrpc.exceptions import (
JSONRPCParseError,
JSONRPCInvalidRequestException,
@@ -90,11 +90,14 @@ class ResponseManager(JSONRPCResponseManager):
@classmethod
def _get_responses(cls, requests, dispatcher):
for output in super()._get_responses(requests, dispatcher):
# if it's a result, check if it's a JsonSchemaMixin and if so call
# if it's a result, check if it's a dbtClassMixin and if so call
# to_dict
if hasattr(output, 'result'):
if isinstance(output.result, JsonSchemaMixin):
output.result = output.result.to_dict(omit_none=False)
if isinstance(output.result, dbtClassMixin):
# Note: errors in to_dict do not show up anywhere in
# the output and all you get is a generic 500 error
output.result = \
output.result.to_dict(options={'keep_none': True})
yield output
@classmethod


@@ -9,7 +9,7 @@ from typing import (
)
from typing_extensions import Protocol
from hologram import JsonSchemaMixin, ValidationError
from dbt.dataclass_schema import dbtClassMixin, ValidationError
import dbt.exceptions
import dbt.flags
@@ -187,6 +187,7 @@ def get_results_context(
class StateHandler:
"""A helper context manager to manage task handler state."""
def __init__(self, task_handler: 'RequestTaskHandler') -> None:
self.handler = task_handler
@@ -248,6 +249,7 @@ class SetArgsStateHandler(StateHandler):
"""A state handler that does not touch state on success and does not
execute the teardown
"""
def handle_completed(self):
pass
@@ -257,6 +259,7 @@ class SetArgsStateHandler(StateHandler):
class RequestTaskHandler(threading.Thread, TaskHandlerProtocol):
"""Handler for the single task triggered by a given jsonrpc request."""
def __init__(
self,
manager: TaskManagerProtocol,
@@ -280,7 +283,7 @@ class RequestTaskHandler(threading.Thread, TaskHandlerProtocol):
# - The actual thread that this represents, which writes its data to
# the result and logs. The atomicity of list.append() and item
# assignment means we don't need a lock.
self.result: Optional[JsonSchemaMixin] = None
self.result: Optional[dbtClassMixin] = None
self.error: Optional[RPCException] = None
self.state: TaskHandlerState = TaskHandlerState.NotStarted
self.logs: List[LogMessage] = []
@@ -400,6 +403,7 @@ class RequestTaskHandler(threading.Thread, TaskHandlerProtocol):
try:
with StateHandler(self):
self.result = self.get_result()
except (dbt.exceptions.Exception, RPCException):
# we probably got an error after the RPC call ran (and it was
# probably deps...). By now anyone who wanted to see it has seen it
@@ -449,6 +453,7 @@ class RequestTaskHandler(threading.Thread, TaskHandlerProtocol):
)
try:
cls.validate(self.task_kwargs)
return cls.from_dict(self.task_kwargs)
except ValidationError as exc:
# raise a TypeError to indicate invalid parameters so we get a nice


@@ -14,11 +14,11 @@ from dbt.contracts.rpc import (
class TaskHandlerProtocol(Protocol):
started: Optional[datetime]
ended: Optional[datetime]
state: TaskHandlerState
task_id: TaskID
process: Optional[multiprocessing.Process]
state: TaskHandlerState
started: Optional[datetime] = None
ended: Optional[datetime] = None
process: Optional[multiprocessing.Process] = None
@property
def request_id(self) -> Union[str, int]:


@@ -4,8 +4,7 @@ import re
from dbt.exceptions import VersionsNotCompatibleException
import dbt.utils
from hologram import JsonSchemaMixin
from hologram.helpers import StrEnum
from dbt.dataclass_schema import dbtClassMixin, StrEnum
from typing import Optional
@@ -18,12 +17,12 @@ class Matchers(StrEnum):
@dataclass
class VersionSpecification(JsonSchemaMixin):
major: Optional[str]
minor: Optional[str]
patch: Optional[str]
prerelease: Optional[str]
build: Optional[str]
class VersionSpecification(dbtClassMixin):
major: Optional[str] = None
minor: Optional[str] = None
patch: Optional[str] = None
prerelease: Optional[str] = None
build: Optional[str] = None
matcher: Matchers = Matchers.EXACT
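Because every field now defaults to None, partial version specs can be built without supplying all parts; a self-contained sketch mirroring the shape above (the Matchers value is assumed, since only the EXACT member is visible in this diff):

from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Matchers(str, Enum):
    EXACT = '='  # assumed value

@dataclass
class VersionSpec:  # stand-in mirroring VersionSpecification's fields
    major: Optional[str] = None
    minor: Optional[str] = None
    patch: Optional[str] = None
    prerelease: Optional[str] = None
    build: Optional[str] = None
    matcher: Matchers = Matchers.EXACT

print(VersionSpec(major='0', minor='19'))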


@@ -9,7 +9,7 @@ from dbt import tracking
from dbt import ui
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
RunModelResult, collect_timing_info
NodeStatus, RunResult, collect_timing_info, RunStatus
)
from dbt.exceptions import (
NotImplementedException, CompilationException, RuntimeException,
@@ -165,6 +165,7 @@ class ExecutionContext:
"""During execution and error handling, dbt makes use of mutable state:
timing information and the newest (compiled vs executed) form of the node.
"""
def __init__(self, node):
self.timing = []
self.node = node
@@ -179,20 +180,20 @@ class BaseRunner(metaclass=ABCMeta):
self.num_nodes = num_nodes
self.skip = False
self.skip_cause: Optional[RunModelResult] = None
self.skip_cause: Optional[RunResult] = None
@abstractmethod
def compile(self, manifest: Manifest) -> Any:
pass
def get_result_status(self, result) -> Dict[str, str]:
if result.error:
return {'node_status': 'error', 'node_error': str(result.error)}
elif result.skip:
if result.status == NodeStatus.Error:
return {'node_status': 'error', 'node_error': str(result.message)}
elif result.status == NodeStatus.Skipped:
return {'node_status': 'skipped'}
elif result.fail:
elif result.status == NodeStatus.Fail:
return {'node_status': 'failed'}
elif result.warn:
elif result.status == NodeStatus.Warn:
return {'node_status': 'warn'}
else:
return {'node_status': 'passed'}
@@ -212,52 +213,62 @@ class BaseRunner(metaclass=ABCMeta):
return result
def _build_run_result(self, node, start_time, error, status, timing_info,
skip=False, fail=None, warn=None, agate_table=None):
def _build_run_result(self, node, start_time, status, timing_info, message,
agate_table=None, adapter_response=None):
execution_time = time.time() - start_time
thread_id = threading.current_thread().name
return RunModelResult(
node=node,
error=error,
skip=skip,
if adapter_response is None:
adapter_response = {}
return RunResult(
status=status,
fail=fail,
warn=warn,
execution_time=execution_time,
thread_id=thread_id,
execution_time=execution_time,
timing=timing_info,
message=message,
node=node,
agate_table=agate_table,
adapter_response=adapter_response
)
def error_result(self, node, error, start_time, timing_info):
def error_result(self, node, message, start_time, timing_info):
return self._build_run_result(
node=node,
start_time=start_time,
error=error,
status='ERROR',
timing_info=timing_info
status=RunStatus.Error,
timing_info=timing_info,
message=message,
)
def ephemeral_result(self, node, start_time, timing_info):
return self._build_run_result(
node=node,
start_time=start_time,
error=None,
status=None,
timing_info=timing_info
status=RunStatus.Success,
timing_info=timing_info,
message=None
)
def from_run_result(self, result, start_time, timing_info):
return self._build_run_result(
node=result.node,
start_time=start_time,
error=result.error,
skip=result.skip,
status=result.status,
fail=result.fail,
warn=result.warn,
timing_info=timing_info,
message=result.message,
agate_table=result.agate_table,
adapter_response=result.adapter_response
)
def skip_result(self, node, message):
thread_id = threading.current_thread().name
return RunResult(
status=RunStatus.Skipped,
thread_id=thread_id,
execution_time=0,
timing=[],
message=message,
node=node,
adapter_response={}
)
def compile_and_execute(self, manifest, ctx):
@@ -340,7 +351,7 @@ class BaseRunner(metaclass=ABCMeta):
# an error
if (
exc_str is not None and result is not None and
result.error is None and error is None
result.status != NodeStatus.Error and error is None
):
error = exc_str
@@ -389,7 +400,7 @@ class BaseRunner(metaclass=ABCMeta):
schema_name = self.node.schema
node_name = self.node.name
error = None
error_message = None
if not self.node.is_ephemeral_model:
# if this model was skipped due to an upstream ephemeral model
# failure, print a special 'error skip' message.
@@ -408,7 +419,7 @@ class BaseRunner(metaclass=ABCMeta):
'an ephemeral failure'
)
# set an error so dbt will exit with an error code
error = (
error_message = (
'Compilation Error in {}, caused by compilation error '
'in referenced ephemeral model {}'
.format(self.node.unique_id,
@@ -423,7 +434,7 @@ class BaseRunner(metaclass=ABCMeta):
self.num_nodes
)
node_result = RunModelResult(self.node, skip=True, error=error)
node_result = self.skip_result(self.node, error_message)
return node_result
def do_skip(self, cause=None):


@@ -2,7 +2,7 @@ import os.path
import os
import shutil
from dbt.task.base import BaseTask
from dbt.task.base import BaseTask, move_to_nearest_project_dir
from dbt.logger import GLOBAL_LOGGER as logger
from dbt.config import UnsetProfileConfig
@@ -32,6 +32,7 @@ class CleanTask(BaseTask):
This function takes all the paths in the target file
and cleans the project paths that are not protected.
"""
move_to_nearest_project_dir(self.args)
for path in self.config.clean_targets:
logger.info("Checking {}/*".format(path))
if not self.__is_protected_path(path):


@@ -1,7 +1,8 @@
import threading
from .runnable import GraphRunnableTask
from .base import BaseRunner
from dbt.contracts.results import RunModelResult
from dbt.contracts.results import RunStatus, RunResult
from dbt.exceptions import InternalException
from dbt.graph import ResourceTypeSelector, SelectionSpec, parse_difference
from dbt.logger import print_timestamped_line
@@ -16,7 +17,15 @@ class CompileRunner(BaseRunner):
pass
def execute(self, compiled_node, manifest):
return RunModelResult(compiled_node)
return RunResult(
node=compiled_node,
status=RunStatus.Success,
timing=[],
thread_id=threading.current_thread().name,
execution_time=0,
message=None,
adapter_response={}
)
def compile(self, manifest):
compiler = self.adapter.get_compiler()


@@ -48,7 +48,6 @@ Check your database credentials and try again. For more information, visit:
{url}
'''.lstrip()
MISSING_PROFILE_MESSAGE = '''
dbt looked for a profiles.yml file in {path}, but did
not find one. For more information on configuring your profile, consult the
@@ -90,6 +89,7 @@ class DebugTask(BaseTask):
self.profile_name: Optional[str] = None
self.project: Optional[Project] = None
self.project_fail_details = ''
self.any_failure = False
self.messages: List[str] = []
@property
@@ -111,7 +111,7 @@ class DebugTask(BaseTask):
def run(self):
if self.args.config_dir:
self.path_info()
return
return not self.any_failure
version = get_installed_version().to_version_string(skip_matcher=True)
print('dbt version: {}'.format(version))
@@ -129,6 +129,11 @@ class DebugTask(BaseTask):
print(message)
print('')
return not self.any_failure
def interpret_results(self, results):
return results
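With run() returning not self.any_failure and interpret_results passing results straight through, a caller only has to map the boolean onto a process exit code; a generic sketch (not dbt's actual entry point):

import sys

def exit_from_debug(success: bool) -> None:
    # exit code 1 signals that at least one debug check failed
    sys.exit(0 if success else 1)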
def _load_project(self):
if not os.path.exists(self.project_path):
self.project_fail_details = FILE_NOT_FOUND
@@ -245,6 +250,7 @@ class DebugTask(BaseTask):
self.messages.append(MISSING_PROFILE_MESSAGE.format(
path=self.profile_path, url=ProfileConfigDocs
))
self.any_failure = True
return red('ERROR not found')
try:
@@ -283,6 +289,7 @@ class DebugTask(BaseTask):
dbt.clients.system.run_cmd(os.getcwd(), ['git', '--help'])
except dbt.exceptions.ExecutableError as exc:
self.messages.append('Error from git --help: {!s}'.format(exc))
self.any_failure = True
return red('ERROR')
return green('OK found')
@@ -310,6 +317,8 @@ class DebugTask(BaseTask):
def _log_project_fail(self):
if not self.project_fail_details:
return
self.any_failure = True
if self.project_fail_details == FILE_NOT_FOUND:
return
print('Project loading failed for the following reason:')
@@ -319,6 +328,8 @@ class DebugTask(BaseTask):
def _log_profile_fail(self):
if not self.profile_fail_details:
return
self.any_failure = True
if self.profile_fail_details == FILE_NOT_FOUND:
return
print('Profile loading failed for the following reason:')
@@ -334,7 +345,7 @@ class DebugTask(BaseTask):
adapter = get_adapter(profile)
try:
with adapter.connection_named('debug'):
adapter.execute('select 1 as id')
adapter.debug_query()
except Exception as exc:
return COULD_NOT_CONNECT_MESSAGE.format(
err=str(exc),
@@ -347,6 +358,7 @@ class DebugTask(BaseTask):
result = self.attempt_connection(self.profile)
if result is not None:
self.messages.append(result)
self.any_failure = True
return red('ERROR')
return green('OK connection ok')


@@ -1,7 +1,6 @@
import os
import threading
import time
from typing import Dict
from .base import BaseRunner
from .printer import (
@@ -13,16 +12,13 @@ from .runnable import GraphRunnableTask
from dbt.contracts.results import (
FreshnessExecutionResultArtifact,
FreshnessResult,
PartialResult,
SourceFreshnessResult,
FreshnessResult, PartialSourceFreshnessResult,
SourceFreshnessResult, FreshnessStatus
)
from dbt.exceptions import RuntimeException, InternalException
from dbt.logger import print_timestamped_line
from dbt.node_types import NodeType
from dbt import utils
from dbt.graph import NodeSelector, SelectionSpec, parse_difference
from dbt.contracts.graph.parsed import ParsedSourceDefinition
@@ -36,12 +32,6 @@ class FreshnessRunner(BaseRunner):
'Freshness: nodes cannot be skipped!'
)
def get_result_status(self, result) -> Dict[str, str]:
if result.error:
return {'node_status': 'error', 'node_error': str(result.error)}
else:
return {'node_status': str(result.status)}
def before_execute(self):
description = 'freshness of {0.source_name}.{0.name}'.format(self.node)
print_start_line(description, self.node_index, self.num_nodes)
@@ -49,18 +39,33 @@ class FreshnessRunner(BaseRunner):
def after_execute(self, result):
print_freshness_result_line(result, self.node_index, self.num_nodes)
def _build_run_result(self, node, start_time, error, status, timing_info,
skip=False, failed=None):
def error_result(self, node, message, start_time, timing_info):
return self._build_run_result(
node=node,
start_time=start_time,
status=FreshnessStatus.RuntimeErr,
timing_info=timing_info,
message=message,
)
def _build_run_result(
self,
node,
start_time,
status,
timing_info,
message
):
execution_time = time.time() - start_time
thread_id = threading.current_thread().name
status = utils.lowercase(status)
return PartialResult(
node=node,
return PartialSourceFreshnessResult(
status=status,
error=error,
execution_time=execution_time,
thread_id=thread_id,
execution_time=execution_time,
timing=timing_info,
message=message,
node=node,
adapter_response={}
)
def from_run_result(self, result, start_time, timing_info):
@@ -95,6 +100,10 @@ class FreshnessRunner(BaseRunner):
node=compiled_node,
status=status,
thread_id=threading.current_thread().name,
timing=[],
execution_time=0,
message=None,
adapter_response={},
**freshness
)
@@ -160,7 +169,10 @@ class FreshnessTask(GraphRunnableTask):
def task_end_messages(self, results):
for result in results:
if result.error is not None:
if result.status in (
FreshnessStatus.Error,
FreshnessStatus.RuntimeErr
):
print_run_result_error(result)
print_timestamped_line('Done.')


@@ -3,7 +3,7 @@ import shutil
from datetime import datetime
from typing import Dict, List, Any, Optional, Tuple, Set
from hologram import ValidationError
from dbt.dataclass_schema import ValidationError
from .compile import CompileTask
@@ -11,8 +11,8 @@ from dbt.adapters.factory import get_adapter
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
TableMetadata, CatalogTable, CatalogResults, Primitive, CatalogKey,
StatsItem, StatsDict, ColumnMetadata, CatalogArtifact
NodeStatus, TableMetadata, CatalogTable, CatalogResults, Primitive,
CatalogKey, StatsItem, StatsDict, ColumnMetadata, CatalogArtifact
)
from dbt.exceptions import InternalException
from dbt.include.global_project import DOCS_INDEX_FILE_PATH
@@ -211,7 +211,7 @@ class GenerateTask(CompileTask):
compile_results = None
if self.args.compile:
compile_results = CompileTask.run(self)
if any(r.error is not None for r in compile_results):
if any(r.status == NodeStatus.Error for r in compile_results):
print_timestamped_line(
'compile failed, cannot generate docs'
)


@@ -2,7 +2,7 @@ import json
from typing import Type
from dbt.contracts.graph.parsed import (
ParsedReport,
ParsedExposure,
ParsedSourceDefinition,
)
from dbt.graph import (
@@ -24,7 +24,7 @@ class ListTask(GraphRunnableTask):
NodeType.Seed,
NodeType.Test,
NodeType.Source,
NodeType.Report,
NodeType.Exposure,
))
ALL_RESOURCE_VALUES = DEFAULT_RESOURCE_VALUES | frozenset((
NodeType.Analysis,
@@ -76,8 +76,8 @@ class ListTask(GraphRunnableTask):
yield self.manifest.nodes[node]
elif node in self.manifest.sources:
yield self.manifest.sources[node]
elif node in self.manifest.reports:
yield self.manifest.reports[node]
elif node in self.manifest.exposures:
yield self.manifest.exposures[node]
else:
raise RuntimeException(
f'Got an unexpected result from node selection: "{node}"'
@@ -93,11 +93,11 @@ class ListTask(GraphRunnableTask):
node.package_name, node.source_name, node.name
])
yield f'source:{source_selector}'
elif node.resource_type == NodeType.Report:
assert isinstance(node, ParsedReport)
# reports are searched for by pkg.report_name
report_selector = '.'.join([node.package_name, node.name])
yield f'report:{report_selector}'
elif node.resource_type == NodeType.Exposure:
assert isinstance(node, ParsedExposure)
# exposures are searched for by pkg.exposure_name
exposure_selector = '.'.join([node.package_name, node.name])
yield f'exposure:{exposure_selector}'
else:
# everything else is from `fqn`
yield '.'.join(node.fqn)
@@ -110,7 +110,7 @@ class ListTask(GraphRunnableTask):
for node in self._iterate_selected_nodes():
yield json.dumps({
k: v
for k, v in node.to_dict(omit_none=False).items()
for k, v in node.to_dict(options={'keep_none': True}).items()
if k in self.ALLOWED_KEYS
})
@@ -190,4 +190,5 @@ class ListTask(GraphRunnableTask):
)
def interpret_results(self, results):
return bool(results)
# list command should always return 0 as exit code
return True

core/dbt/task/parse.py (new file)

@@ -0,0 +1,99 @@
# This task is intended to be used for diagnosis, development and
# performance analysis.
# It separates out the parsing flows for easier logging and
# debugging.
# To store cProfile performance data, execute with the '-r'
# flag and an output file: dbt -r dbt.cprof parse.
# Use a visualizer such as snakeviz to look at the output:
# snakeviz dbt.cprof
from dbt.task.base import ConfiguredTask
from dbt.adapters.factory import get_adapter
from dbt.parser.manifest import (
Manifest, MacroManifest, ManifestLoader, _check_manifest
)
from dbt.logger import DbtProcessState, print_timestamped_line
from dbt.clients.system import write_file
from dbt.graph import Graph
import time
from typing import Optional
import os
import json
import dbt.utils
MANIFEST_FILE_NAME = 'manifest.json'
PERF_INFO_FILE_NAME = 'perf_info.json'
PARSING_STATE = DbtProcessState('parsing')
class ParseTask(ConfiguredTask):
def __init__(self, args, config):
super().__init__(args, config)
self.manifest: Optional[Manifest] = None
self.graph: Optional[Graph] = None
self.loader: Optional[ManifestLoader] = None
def write_manifest(self):
path = os.path.join(self.config.target_path, MANIFEST_FILE_NAME)
self.manifest.write(path)
def write_perf_info(self):
path = os.path.join(self.config.target_path, PERF_INFO_FILE_NAME)
write_file(path, json.dumps(self.loader._perf_info,
cls=dbt.utils.JSONEncoder, indent=4))
print_timestamped_line(f"Performance info: {path}")
# This method takes code that normally exists in other files
# and pulls it in here, to simplify logging and make the
# parsing flow-of-control easier to understand and manage,
# with the downside that if changes happen in those other methods,
# similar changes might need to be made here.
# ManifestLoader.get_full_manifest
# ManifestLoader.load
# ManifestLoader.load_all
def get_full_manifest(self):
adapter = get_adapter(self.config) # type: ignore
macro_manifest: MacroManifest = adapter.load_macro_manifest()
print_timestamped_line("Macro manifest loaded")
root_config = self.config
macro_hook = adapter.connections.set_query_header
with PARSING_STATE:
start_load_all = time.perf_counter()
projects = root_config.load_dependencies()
print_timestamped_line("Dependencies loaded")
loader = ManifestLoader(root_config, projects, macro_hook)
print_timestamped_line("ManifestLoader created")
loader.load(macro_manifest=macro_manifest)
print_timestamped_line("Manifest loaded")
loader.write_parse_results()
print_timestamped_line("Parse results written")
manifest = loader.create_manifest()
print_timestamped_line("Manifest created")
_check_manifest(manifest, root_config)
print_timestamped_line("Manifest checked")
manifest.build_flat_graph()
print_timestamped_line("Flat graph built")
loader._perf_info.load_all_elapsed = (
time.perf_counter() - start_load_all
)
self.loader = loader
self.manifest = manifest
print_timestamped_line("Manifest loaded")
def compile_manifest(self):
adapter = get_adapter(self.config)
compiler = adapter.get_compiler()
self.graph = compiler.compile(self.manifest)
def run(self):
print_timestamped_line('Start parsing.')
self.get_full_manifest()
if self.args.compile:
print_timestamped_line('Compiling.')
self.compile_manifest()
if self.args.write_manifest:
print_timestamped_line('Writing manifest.')
self.write_manifest()
self.write_perf_info()
print_timestamped_line('Done.')



@@ -11,6 +11,10 @@ from dbt.tracking import InvocationProcessor
from dbt import ui
from dbt import utils
from dbt.contracts.results import (
FreshnessStatus, NodeResult, NodeStatus, TestStatus
)
def print_fancy_output_line(
msg: str, status: str, logger_fn: Callable, index: Optional[int],
@@ -98,37 +102,37 @@ def print_cancel_line(model) -> None:
def get_printable_result(
result, success: str, error: str) -> Tuple[str, str, Callable]:
if result.error is not None:
if result.status == NodeStatus.Error:
info = 'ERROR {}'.format(error)
status = ui.red(result.status)
status = ui.red(result.status.upper())
logger_fn = logger.error
else:
info = 'OK {}'.format(success)
status = ui.green(result.status)
status = ui.green(result.message)
logger_fn = logger.info
return info, status, logger_fn
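
The change above replaces ad-hoc error/warn flags with comparisons against status enums; below is a minimal sketch of that pattern, not dbt's actual class definitions.

# Hedged sketch: a str-based Enum compares cleanly against its members and still
# behaves like a plain string for printing and upper-casing.
from enum import Enum

class NodeStatus(str, Enum):
    Success = "success"
    Error = "error"
    Warn = "warn"
    Skipped = "skipped"

status = NodeStatus.Error
print(status == NodeStatus.Error)  # True
print(status.upper())              # ERROR, as used for the red status text above
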
def print_test_result_line(
result, schema_name, index: int, total: int
result: NodeResult, schema_name, index: int, total: int
) -> None:
model = result.node
if result.error is not None:
if result.status == TestStatus.Error:
info = "ERROR"
color = ui.red
logger_fn = logger.error
elif result.status == 0:
elif result.status == TestStatus.Pass:
info = 'PASS'
color = ui.green
logger_fn = logger.info
elif result.warn:
info = 'WARN {}'.format(result.status)
elif result.status == TestStatus.Warn:
info = 'WARN {}'.format(result.message)
color = ui.yellow
logger_fn = logger.warning
elif result.fail:
info = 'FAIL {}'.format(result.status)
elif result.status == TestStatus.Fail:
info = 'FAIL {}'.format(result.message)
color = ui.red
logger_fn = logger.error
else:
@@ -196,15 +200,15 @@ def print_seed_result_line(result, schema_name: str, index: int, total: int):
def print_freshness_result_line(result, index: int, total: int) -> None:
if result.error:
if result.status == FreshnessStatus.RuntimeErr:
info = 'ERROR'
color = ui.red
logger_fn = logger.error
elif result.status == 'error':
elif result.status == FreshnessStatus.Error:
info = 'ERROR STALE'
color = ui.red
logger_fn = logger.error
elif result.status == 'warn':
elif result.status == FreshnessStatus.Warn:
info = 'WARN'
color = ui.yellow
logger_fn = logger.warning
@@ -220,11 +224,7 @@ def print_freshness_result_line(result, index: int, total: int) -> None:
source_name = result.source_name
table_name = result.table_name
msg = "{info} freshness of {source_name}.{table_name}".format(
info=info,
source_name=source_name,
table_name=table_name
)
msg = f"{info} freshness of {source_name}.{table_name}"
print_fancy_output_line(
msg,
@@ -237,14 +237,16 @@ def print_freshness_result_line(result, index: int, total: int) -> None:
def interpret_run_result(result) -> str:
if result.error is not None or result.fail:
if result.status in (NodeStatus.Error, NodeStatus.Fail):
return 'error'
elif result.skipped:
elif result.status == NodeStatus.Skipped:
return 'skip'
elif result.warn:
elif result.status == NodeStatus.Warn:
return 'warn'
else:
elif result.status in (NodeStatus.Pass, NodeStatus.Success):
return 'pass'
else:
raise RuntimeError(f"unhandled result {result}")
def print_run_status_line(results) -> None:
@@ -272,7 +274,9 @@ def print_run_result_error(
with TextOnly():
logger.info("")
if result.fail or (is_warning and result.warn):
if result.status == NodeStatus.Fail or (
is_warning and result.status == NodeStatus.Warn
):
if is_warning:
color = ui.yellow
info = 'Warning'
@@ -288,12 +292,13 @@ def print_run_result_error(
result.node.original_file_path))
try:
int(result.status)
# if message is int, must be rows returned for a test
int(result.message)
except ValueError:
logger.error(" Status: {}".format(result.status))
else:
status = utils.pluralize(result.status, 'result')
logger.error(" Got {}, expected 0.".format(status))
num_rows = utils.pluralize(result.message, 'result')
logger.error(" Got {}, expected 0.".format(num_rows))
if result.node.build_path is not None:
with TextOnly():
@@ -301,9 +306,9 @@ def print_run_result_error(
logger.info(" compiled SQL at {}".format(
result.node.build_path))
else:
elif result.message is not None:
first = True
for line in result.error.split("\n"):
for line in result.message.split("\n"):
if first:
logger.error(ui.yellow(line))
first = False
@@ -342,8 +347,21 @@ def print_end_of_run_summary(
def print_run_end_messages(results, keyboard_interrupt: bool = False) -> None:
errors = [r for r in results if r.error is not None or r.fail]
warnings = [r for r in results if r.warn]
errors, warnings = [], []
for r in results:
if r.status in (
NodeStatus.RuntimeErr,
NodeStatus.Error,
NodeStatus.Fail
):
errors.append(r)
elif r.status == NodeStatus.Skipped and r.message is not None:
# this means we skipped a node because of an issue upstream,
# so include it as an error
errors.append(r)
elif r.status == NodeStatus.Warn:
warnings.append(r)
with DbtStatusMessage(), InvocationProcessor():
print_end_of_run_summary(len(errors),
len(warnings),


@@ -1,6 +1,6 @@
import abc
import shlex
import yaml
from dbt.clients.yaml_helper import Dumper, yaml # noqa: F401
from typing import Type, Optional


@@ -129,6 +129,10 @@ class RemoteTestProjectTask(RPCCommandTask[RPCTestParameters], TestTask):
self.args.schema = params.schema
if params.threads is not None:
self.args.threads = params.threads
if params.defer is None:
self.args.defer = flags.DEFER_MODE
else:
self.args.defer = params.defer
self.args.state = state_path(params.state)
self.set_previous_state()


@@ -1,7 +1,8 @@
# import these so we can find them
from . import sql_commands # noqa
from . import project_commands # noqa
from . import deps # noqa
import multiprocessing.queues # noqa - https://bugs.python.org/issue41567
import json
import os
import signal


@@ -1,7 +1,10 @@
import functools
import threading
import time
from typing import List, Dict, Any, Iterable, Set, Tuple, Optional, AbstractSet
from dbt.dataclass_schema import dbtClassMixin
from .compile import CompileRunner, CompileTask
from .printer import (
@@ -23,7 +26,7 @@ from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import WritableManifest
from dbt.contracts.graph.model_config import Hook
from dbt.contracts.graph.parsed import ParsedHookNode
from dbt.contracts.results import RunModelResult
from dbt.contracts.results import NodeStatus, RunResult, RunStatus
from dbt.exceptions import (
CompilationException,
InternalException,
@@ -93,6 +96,7 @@ def get_hooks_by_tags(
def get_hook(source, index):
hook_dict = get_hook_dict(source)
hook_dict.setdefault('index', index)
Hook.validate(hook_dict)
return Hook.from_dict(hook_dict)
@@ -105,9 +109,9 @@ def track_model_run(index, num_nodes, run_model_result):
"index": index,
"total": num_nodes,
"execution_time": run_model_result.execution_time,
"run_status": run_model_result.status,
"run_skipped": run_model_result.skip,
"run_error": None,
"run_status": str(run_model_result.status).upper(),
"run_skipped": run_model_result.status == NodeStatus.Skipped,
"run_error": run_model_result.status == NodeStatus.Error,
"model_materialization": run_model_result.node.get_materialization(),
"model_id": utils.get_hash(run_model_result.node),
"hashed_contents": utils.get_hashed_contents(
@@ -187,7 +191,18 @@ class ModelRunner(CompileRunner):
def _build_run_model_result(self, model, context):
result = context['load_result']('main')
return RunModelResult(model, status=result.status)
adapter_response = {}
if isinstance(result.response, dbtClassMixin):
adapter_response = result.response.to_dict()
return RunResult(
node=model,
status=RunStatus.Success,
timing=[],
thread_id=threading.current_thread().name,
execution_time=0,
message=str(result.response),
adapter_response=adapter_response
)
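
For orientation, a hedged illustration of what the adapter_response dictionary captured above might contain; the field names are typical of warehouse adapter responses and are not guaranteed by this diff.

# Illustrative only: adapter_response is whatever result.response serializes
# to via to_dict(); a typical shape might resemble this.
adapter_response = {
    "_message": "SUCCESS 1",
    "code": "SUCCESS",
    "rows_affected": 1,
}
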
def _materialization_relations(
self, result: Any, model
@@ -255,7 +270,7 @@ class RunTask(CompileTask):
def get_hook_sql(self, adapter, hook, idx, num_hooks, extra_context):
compiler = adapter.get_compiler()
compiled = compiler.compile_node(hook, self.manifest, extra_context)
statement = compiled.injected_sql
statement = compiled.compiled_sql
hook_index = hook.index or num_hooks
hook_obj = get_hook(statement, index=hook_index)
return hook_obj.sql or ''
@@ -322,7 +337,7 @@ class RunTask(CompileTask):
with finishctx, DbtModelState({'node_status': 'passed'}):
print_hook_end_line(
hook_text, status, idx, num_hooks, timer.elapsed
hook_text, str(status), idx, num_hooks, timer.elapsed
)
self._total_executed += len(ordered_hooks)
@@ -372,7 +387,7 @@ class RunTask(CompileTask):
)
return state.manifest
def defer_to_manifest(self, selected_uids: AbstractSet[str]):
def defer_to_manifest(self, adapter, selected_uids: AbstractSet[str]):
deferred_manifest = self._get_deferred_manifest()
if deferred_manifest is None:
return
@@ -382,6 +397,7 @@ class RunTask(CompileTask):
'manifest to defer from!'
)
self.manifest.merge_from_artifact(
adapter=adapter,
other=deferred_manifest,
selected=selected_uids,
)
@@ -389,10 +405,10 @@ class RunTask(CompileTask):
self.write_manifest()
def before_run(self, adapter, selected_uids: AbstractSet[str]):
self.defer_to_manifest(selected_uids)
with adapter.connection_named('master'):
self.create_schemas(adapter, selected_uids)
self.populate_adapter_cache(adapter)
self.defer_to_manifest(adapter, selected_uids)
self.safe_run_hooks(adapter, RunHookType.Start, {})
def after_run(self, adapter, results):
@@ -400,10 +416,16 @@ class RunTask(CompileTask):
# list of unique database, schema pairs that successfully executed
# models were in. for backwards compatibility, include the old
# 'schemas', which did not include database information.
database_schema_set: Set[Tuple[Optional[str], str]] = {
(r.node.database, r.node.schema) for r in results
if not any((r.error is not None, r.fail, r.skipped))
if r.status not in (
NodeStatus.Error,
NodeStatus.Fail,
NodeStatus.Skipped
)
}
self._total_executed += len(results)
extras = {


@@ -5,6 +5,7 @@ from concurrent.futures import as_completed
from datetime import datetime
from multiprocessing.dummy import Pool as ThreadPool
from typing import Optional, Dict, List, Set, Tuple, Iterable, AbstractSet
from pathlib import PosixPath, WindowsPath
from .printer import (
print_run_result_error,
@@ -30,7 +31,7 @@ from dbt.logger import (
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.results import RunResultsArtifact
from dbt.contracts.results import NodeStatus, RunExecutionResult
from dbt.contracts.state import PreviousState
from dbt.exceptions import (
InternalException,
@@ -188,17 +189,17 @@ class GraphRunnableTask(ManifestTask):
fail_fast = getattr(self.config.args, 'fail_fast', False)
if (result.fail is not None or result.error is not None) and fail_fast:
if result.status in (NodeStatus.Error, NodeStatus.Fail) and fail_fast:
self._raise_next_tick = FailFastException(
message='Failing early due to test failure or runtime error',
result=result,
node=getattr(result, 'node', None)
)
elif result.error is not None and self.raise_on_first_error():
elif result.status == NodeStatus.Error and self.raise_on_first_error():
# if we raise inside a thread, it'll just get silently swallowed.
# stash the error message we want here, and it will check the
# next 'tick' - should be soon since our thread is about to finish!
self._raise_next_tick = RuntimeException(result.error)
self._raise_next_tick = RuntimeException(result.message)
return result
@@ -286,7 +287,7 @@ class GraphRunnableTask(ManifestTask):
else:
self.manifest.update_node(node)
if result.error is not None:
if result.status == NodeStatus.Error:
if is_ephemeral:
cause = result
else:
@@ -425,6 +426,7 @@ class GraphRunnableTask(ManifestTask):
result = self.execute_with_hooks(selected_uids)
if flags.WRITE_JSON:
self.write_manifest()
self.write_result(result)
self.task_end_messages(result.results)
@@ -434,7 +436,14 @@ class GraphRunnableTask(ManifestTask):
if results is None:
return False
failures = [r for r in results if r.error or r.fail]
failures = [
r for r in results if r.status in (
NodeStatus.RuntimeErr,
NodeStatus.Error,
NodeStatus.Fail,
NodeStatus.Skipped  # propagate the error message causing the skip
)
]
return len(failures) == 0
def get_model_schemas(
@@ -529,11 +538,37 @@ class GraphRunnableTask(ManifestTask):
create_future.result()
def get_result(self, results, elapsed_time, generated_at):
return RunResultsArtifact.from_node_results(
return RunExecutionResult(
results=results,
elapsed_time=elapsed_time,
generated_at=generated_at
generated_at=generated_at,
args=self.args_to_dict(),
)
def args_to_dict(self):
var_args = vars(self.args)
dict_args = {}
# remove args keys that clutter up the dictionary
for key in var_args:
if key == 'cls':
continue
if var_args[key] is None:
continue
default_false_keys = (
'debug', 'full_refresh', 'fail_fast', 'warn_error',
'single_threaded', 'test_new_parser', 'log_cache_events',
'strict'
)
if key in default_false_keys and var_args[key] is False:
continue
if key == 'vars' and var_args[key] == '{}':
continue
# this was required for a test case
if (isinstance(var_args[key], PosixPath) or
isinstance(var_args[key], WindowsPath)):
var_args[key] = str(var_args[key])
dict_args[key] = var_args[key]
return dict_args
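
A hedged, simplified sketch of the filtering args_to_dict performs, using a throwaway argparse.Namespace; the keys and values are illustrative.

# Illustrative only: drop None values, default-False flags, empty vars, and 'cls'.
import argparse

args = argparse.Namespace(cls=None, debug=False, vars="{}", models=["my_model"])
kept = {
    key: value
    for key, value in vars(args).items()
    if key != "cls"
    and value is not None
    and not (key == "debug" and value is False)
    and not (key == "vars" and value == "{}")
}
print(kept)  # {'models': ['my_model']}
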
def task_end_messages(self, results):
print_run_end_messages(results)


@@ -7,6 +7,7 @@ from .printer import (
print_run_end_messages,
)
from dbt.contracts.results import RunStatus
from dbt.exceptions import InternalException
from dbt.graph import ResourceTypeSelector
from dbt.logger import GLOBAL_LOGGER as logger, TextOnly
@@ -37,6 +38,10 @@ class SeedRunner(ModelRunner):
class SeedTask(RunTask):
def defer_to_manifest(self, adapter, selected_uids):
# seeds don't defer
return
def raise_on_first_error(self):
return False
@@ -79,5 +84,5 @@ class SeedTask(RunTask):
def show_tables(self, results):
for result in results:
if result.error is None:
if result.status != RunStatus.Error:
self.show_table(result)


@@ -22,6 +22,10 @@ class SnapshotTask(RunTask):
def raise_on_first_error(self):
return False
def defer_to_manifest(self, adapter, selected_uids):
# snapshots don't defer
return
def get_node_selector(self):
if self.manifest is None or self.graph is None:
raise InternalException(


@@ -1,3 +1,4 @@
import threading
from typing import Dict, Any, Set
from .compile import CompileRunner
@@ -14,7 +15,7 @@ from dbt.contracts.graph.parsed import (
ParsedDataTestNode,
ParsedSchemaTestNode,
)
from dbt.contracts.results import RunModelResult
from dbt.contracts.results import RunResult, TestStatus
from dbt.exceptions import raise_compiler_error, InternalException
from dbt.graph import (
ResourceTypeSelector,
@@ -42,7 +43,7 @@ class TestRunner(CompileRunner):
def execute_data_test(self, test: CompiledDataTestNode):
res, table = self.adapter.execute(
test.injected_sql, auto_begin=True, fetch=True
test.compiled_sql, auto_begin=True, fetch=True
)
num_rows = len(table.rows)
@@ -59,7 +60,7 @@ class TestRunner(CompileRunner):
def execute_schema_test(self, test: CompiledSchemaTestNode):
res, table = self.adapter.execute(
test.injected_sql,
test.compiled_sql,
auto_begin=True,
fetch=True,
)
@@ -83,19 +84,30 @@ class TestRunner(CompileRunner):
elif isinstance(test, CompiledSchemaTestNode):
failed_rows = self.execute_schema_test(test)
else:
raise InternalException(
f'Expected compiled schema test or compiled data test, got '
f'{type(test)}'
)
severity = test.config.severity.upper()
thread_id = threading.current_thread().name
status = None
if failed_rows == 0:
return RunModelResult(test, status=failed_rows)
status = TestStatus.Pass
elif severity == 'ERROR' or flags.WARN_ERROR:
return RunModelResult(test, status=failed_rows, fail=True)
status = TestStatus.Fail
else:
return RunModelResult(test, status=failed_rows, warn=True)
status = TestStatus.Warn
return RunResult(
node=test,
status=status,
timing=[],
thread_id=thread_id,
execution_time=0,
message=int(failed_rows),
adapter_response={}
)
def after_execute(self, result):
self.print_result_line(result)
@@ -115,7 +127,7 @@ class TestSelector(ResourceTypeSelector):
)
def expand_selection(self, selected: Set[UniqueId]) -> Set[UniqueId]:
# reports can't have tests, so this is relatively easy
# exposures can't have tests, so this is relatively easy
selected_tests = set()
for unique_id in self.graph.select_successors(selected):
if unique_id in self.manifest.nodes:
@@ -132,6 +144,7 @@ class TestTask(RunTask):
Read schema files + custom data tests and validate that
constraints are satisfied.
"""
def raise_on_first_error(self):
return False

Some files were not shown because too many files have changed in this diff.