Compare commits

105 Commits

Author SHA1 Message Date
Gerda Shank
845b95f3d0 Initial file creation of code documentation READMEs 2022-01-31 22:09:17 -05:00
Nathaniel May
13b18654f0 Guard against unnecessarily calling dump_graph in logging (#4619)
* add lazy type and apply to cache events
2022-01-31 14:14:34 -05:00
Jeremy Cohen
aafa1c7f47 Change InvalidRefInTestNode level to DEBUG (#4647)
* Debug-level test depends on disabled

* Add PR link to Changelog
2022-01-31 18:28:43 +01:00
Jeremy Cohen
638e3ad299 Drop support for Python <3.7.2 (#4643)
* Drop support for 3.7.1 + 3.7.2

* Rm root level setup.py

* Rm 'dbt' pkg from build-dist script

* Fixup changelog
2022-01-31 17:31:20 +01:00
Emily Rockman
d9cfeb1ea3 Retry after failure to download or failure to open files (#4609)
* add retry logic, tests when extracting tarfile fails

* fixed bug with not catching empty responses

* specify compression type

* WIP test

* more testing work

* fixed up unit test

* add changelog

* Add more comments!

* clarify why we do the json() check for None
2022-01-31 10:26:51 -06:00
Chenyu Li
e6786a2bc3 fix comparision for new model/body (#4631)
* fix comparison for new model/body
2022-01-31 10:33:35 -05:00
leahwicz
13571435a3 Initial addition of CODEOWNERS file (#4620)
* Initial addition of CODEOWNERS file

* Proposed sub-team ownership (#4632)

* Updating for the events module to be both language and execution

* Adding more comment details

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2022-01-27 16:23:55 -05:00
Gerda Shank
efb890db2d [#4504] Use mashumaro for serializing logging events (#4505) 2022-01-27 14:43:26 -05:00
Niall Woodward
f3735187a6 Run check_if_can_write_profile before create_profile_using_project_profile_template [CT-67] [Backport 1.0.latest] (#4447)
* Run check_if_can_write_profile before create_profile_using_project_profile_template

* Changelog

Co-authored-by: Ian Knox <81931810+iknox-fa@users.noreply.github.com>
2022-01-27 11:17:28 -06:00
Gerda Shank
3032594b26 [#4554] Don't require a profile for dbt deps and clean commands (#4610) 2022-01-25 12:26:44 -05:00
Joel Labes
1df7a029b4 Clarify "incompatible package version" error msg (#4587)
* Clarify "incompatible package version" error msg

* Clarify error message when they shouldn't fall fwd
2022-01-24 18:33:45 -05:00
leahwicz
f467fba151 Changing Jira mirroring workflows to point to shared Actions (#4615) 2022-01-24 12:20:12 -05:00
Amir Kadivar
8791313ec5 Validate project names in interactive dbt init (#4536)
* Validate project names in interactive dbt init

- workflow: ask the user to provide a valid project name until they do.
- new integration tests
- supported scenarios:
  - dbt init
  - dbt init -s
  - dbt init [name]
  - dbt init [name] -s

* Update Changelog.md

* Add full URLs to CHANGELOG.md

Co-authored-by: Chenyu Li <chenyulee777@gmail.com>
2022-01-21 18:24:26 -05:00
leahwicz
7798f932a0 Add Backport Action (#4605) 2022-01-21 12:40:55 -05:00
Nathaniel May
a588607ec6 drop support for Python 3.7.0 and 3.7.1 (#4585) 2022-01-19 12:24:37 -05:00
Joel Labes
348764d99d Rename data directory to seeds (#4589)
* Rename data directory to seeds

* Update CHANGELOG.md
2022-01-19 10:04:35 -06:00
Gerda Shank
5aeb088a73 [#3988] Fix test deprecation warnings (#4556) 2022-01-12 17:03:11 -05:00
leahwicz
e943b9fc84 Mirror labels to Jira (#4550)
* Adding Jira label mirroring

* Fixing bad step name
2022-01-05 09:29:52 -05:00
leahwicz
892426eecb Mirroring issues to Jira (#4548)
* Adding issue creation Jira Action

* Adding issue closing Jira Action

* Add labeling logic
2022-01-04 17:00:03 -05:00
Emily Rockman
1d25b2b046 test name standardization (#4509)
* rename tests for standardization

* more renaming

* rename tests to remove duplicate numbers

* removed unused file

* removed unused files in 016

* removed unused files in 017

* fixed schema number mismatch 027

* fixed to be actual directory name 025

* remove unused dir 029

* remove unused files 039

* remove unused files 053

* updated changelog
2022-01-04 11:36:47 -06:00
github-actions[bot]
da70840be8 Bumping version to 1.0.1 (#4543)
* Bumping version to 1.0.1

* Update CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2022-01-03 13:04:50 -05:00
leahwicz
7632782ecd Removing Docker from bumpversion script (#4542) 2022-01-03 12:48:03 -05:00
Nathaniel May
6fae647097 copy over windows compat logic for colored log output (#4474) 2022-01-03 12:37:36 -05:00
leahwicz
fc8b8c11d5 Commenting our Docker portion of Version Bump (#4541) 2022-01-03 12:37:20 -05:00
Topherhindman
26a7922a34 Fix small typo in architecture doc (#4533) 2022-01-03 12:00:04 +01:00
Emily Rockman
c18b4f1f1a removed unused code in unit tests (#4496)
* removed unused code

* add changelog

* moved changelog entry
2021-12-23 08:26:22 -06:00
Nathaniel May
fa31a67499 Add Structured Logging ADR (#4308) 2021-12-22 10:26:14 -05:00
Ian Knox
742cd990ee New Dockerfile (#4487)
New Dockerfile supporting individual db adapters and architectures
2021-12-22 08:29:21 -06:00
Gerda Shank
8463af35c3 [#4523] Fix error with env_var in hook (#4524) 2021-12-20 14:19:05 -05:00
github-actions[bot]
b34a4ab493 Bumping version to 1.0.1rc1 (#4517)
* Bumping version to 1.0.1rc1

* Update CHANGELOG.md

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2021-12-19 15:33:38 -05:00
Jeremy Cohen
417ccdc3b4 Fix bool coercion to 0/1 (#4512)
* Fix bool coercion

* Fix unit test
2021-12-19 10:30:25 -05:00
Emily Rockman
7c46b784ef scrub message of secrets (#4507)
* scrub message of secrets

* update changelog

* use new scrubbing and scrub more places using git

* fixed small miss of string conv and missing raise

* fix bug with cloning error

* resolving message issues

* better, more specific scrubbing
2021-12-17 16:05:57 -06:00
Gerda Shank
067b861d30 Improve checking of schema version for pre-1.0.0 manifests (#4497)
* [#4470] Improve checking of schema version for pre-1.0.0 manifests

* Check exception code instead of message in test
2021-12-16 13:30:52 -05:00
Emily Rockman
9f6ed3cec3 update log message to use adapter name (#4501)
* update log message to use adapter name

* add changelog
2021-12-16 11:46:28 -06:00
Nathaniel May
43edc887f9 Simplify Log Destinations (#4483) 2021-12-16 11:40:05 -05:00
Emily Rockman
6d4c64a436 compile new index file for docs (#4484)
* compile new index file for docs

* Add changelog

* move changleog entries for docs changes
2021-12-16 10:09:02 -06:00
Gerda Shank
0ed14fa236 [#4464] Check specifically for generic node type for some partial parsing actions (#4465)
* [#4464] Check specifically for generic node type for some partial parsing actions

* Add check for existence of macro file in saved_files

* Check for existence of patch file in saved_files
2021-12-14 16:28:40 -05:00
Emily Rockman
51f2daf4b0 updated DepsStartPackageInstall event to use package name (#4482)
* updated event to user package name

* add changelog
2021-12-14 14:25:29 -06:00
Matthew McKnight
76f7bf9900 made change to test of str (#4463)
* made change to test of str

* changelog update
2021-12-13 11:55:19 -06:00
Matthew McKnight
3fc715f066 updating contributing.md based on suggestions from updates to adapter… (#4356)
* updating contributing.md based on suggestions from updates to adapter contributing files.

* removed section refering to non-postgres databases for core contributing.md

* making suggested changes to contributing.md based on kyle's initial lookover

* Update CONTRIBUTING.md

Co-authored-by: Kyle Wigley <kyle@fishtownanalytics.com>
2021-12-10 13:14:49 -06:00
Rebekka Moyson
b6811da84f Fix dbt docs overview to working url (#4442)
* Fix to working url

* add fix to changelog
2021-12-08 10:30:41 -06:00
Nathaniel May
1dffccd9da point latest version check to dbt-core package (#4434) 2021-12-03 16:13:38 -05:00
github-actions[bot]
9ed9936c84 Bumping version to 1.0.0 (#4431)
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
2021-12-03 13:27:46 -05:00
Jeremy Cohen
e75ae8c754 Changelog entries for rc3 -> final (#4389)
* Changelog entries for rc3 -> final

* More updates

* Final entry

* Last fix, and the date

* These few, these happy few
2021-12-03 19:16:46 +01:00
Nathaniel May
b68535b8cb relax version specifier for dbt-extractor (#4427) 2021-12-03 12:56:03 -05:00
Nathaniel May
5310498647 add new interop tests for black-box json log schema testing (#4327) 2021-12-03 12:51:28 -05:00
Ian Knox
22b1a09aa2 stringify generic exceptions (#4424) 2021-12-03 10:32:22 -06:00
Jeremy Cohen
6855fe06a7 Info vs debug text formatting (#4418) 2021-12-03 14:36:42 +01:00
Jeremy Cohen
affd8619c2 Sources aren't materialized (#4417) 2021-12-03 14:36:35 +01:00
Jeremy Cohen
b67d5f396b Add flag to main.py. Reinstantiate after flags (#4416) 2021-12-03 14:36:25 +01:00
Emily Rockman
b3039fdc76 add node type codes to more events + more hook log data (#4378)
* add node type codes to more events + more hook log

* minor fixes

* renames started/finished keys

* made process more clear

* fixed errors

* Put back report_node_data in fresshness.py

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2021-12-02 19:25:57 -05:00
Nathaniel May
9bdf5fe74a use reference keys instead of relations (#4410) 2021-12-02 18:35:51 -05:00
Emily Rockman
c675c2d318 Logging README (#4395)
* WIP

* more README cleanup

* readme tweaks

* small tweaks

* wording updates
2021-12-02 17:04:23 -06:00
Ian Knox
2cd1f7d98e user configurable event buffer size (#4411) 2021-12-02 16:47:31 -06:00
Jeremy Cohen
ce9ac8ea10 Rollover + backup for dbt.log (#4405) 2021-12-02 22:10:11 +01:00
Jeremy Cohen
b90ab74975 A few final logging touch-ups (#4388)
* Rm unused events, per #4104

* More structured ConcurrencyLine

* Replace \n prefixes with EmptyLine

* Reimplement ui.warning_tag to centralize logic

* Use warning_tag for deprecations too

* Rm more unused event types

* Exclude EmptyLine from json logs

* loglines are not always created by events (#4406)

Co-authored-by: Nathaniel May <nathaniel.may@fishtownanalytics.com>
2021-12-02 22:09:46 +01:00
Emily Rockman
6d3c3f1995 update file name (#4402) 2021-12-02 15:04:29 -06:00
Nathaniel May
74fbaa18cd change json override strategy (#4396) 2021-12-02 15:04:52 -05:00
Emily Rockman
fc7c073691 allow log_format to be set in profile configs (#4394) 2021-12-02 13:51:45 -06:00
leahwicz
29f504e201 Fix release process (#4385) 2021-12-02 11:18:49 -05:00
Nathaniel May
eeb490ed15 use rfc3339 format for log time stamps (#4384) 2021-12-02 09:44:10 -05:00
Gerda Shank
c220b1e42c [#4354] Different output for console and file logs (#4379)
* [#4354] Different output for console and file logs

* Tweak some log formats

* Change loging of thread names
2021-12-02 08:23:25 -05:00
Jeremy Cohen
d973ae9ec6 Tiny touchups for deps, clean (#4366)
* Use actual profile name for log msg

* Raise clean dep warning iff configured path missing
2021-12-02 12:12:49 +01:00
Ian Knox
f461683df5 Add windows OS error supressing for temp dir cleanups (#4380) 2021-12-01 17:25:56 -06:00
Nathaniel May
41ed976941 move event code up a level (#4381)
move event code up a level plus minor fixes
2021-12-01 17:30:19 -05:00
Gerda Shank
e93ad5f118 Make the stdout logger actually go to stdout (#4368) 2021-11-30 17:48:23 -05:00
Emily Rockman
d75ed964f8 only log events in cache.py when flag is set set (#4369)
flag is --log-cache-events
2021-11-30 15:17:08 -06:00
Nathaniel May
284ac9b138 better dataclass field handling (#4361)
fix serializing dataclass fields so they show up at all
2021-11-30 13:34:57 -05:00
github-actions[bot]
7448ec5adb Bumping version to 1.0.0rc3 (#4363)
* Bumping version to 1.0.0rc3

* Updating Changelog for release

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2021-11-30 09:35:03 -05:00
Emily Rockman
caa6269bc7 add node_info to relevant logs (#4336)
* WIP

* fixed some merg issues

* WIP

* first pass with node_status logging

* add node details to one more

* another pass at node info

* fixed failures

* convert to classes

* more tweaks to basic implementation

* added in ststus, organized a bit

* saving broken state

* working state with lots of todos

* formatting

* add start/end tiemstamps

* adding node_status logging to more events

* adding node_status to more events

* Add RunningStatus and set in node

* Add NodeCompiling and NodeExecuting events, switch to _event_status dict

* add _event_status to SourceDefinition

* small tweaks to NodeInfo

* fixed misnamed attr

* small fix to validation

* rename logging timestamps to minimize name collision

* fixed flake failure

* move str formatting to events

* incorporate serialization changes

* add node_status to event_to_serializable_dict

* convert nodeInfo to dict with dataclass builtin

* Try to fix failing unit, flake8, mypy tests (#4362)

* fixed leftover merge conflict

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
2021-11-30 09:34:28 -05:00
Gerda Shank
31691c3b88 Events with graph_func include actual output of graph_func (#4360) 2021-11-29 20:20:22 -05:00
Ian Knox
3a904a811f Event buffer for structlog (#4358)
Add Internal event buffer

Co-authored-by: Nathaniel May <nathaniel.may@fishtownanalytics.com>
2021-11-29 20:12:20 -05:00
Nathaniel May
b927a31a53 make json serialization overridable for events (#4326)
* simplify scrubbing

* add overridable serialize method to events

* add imperfect test for json serialization of events

Co-authored-by: Ian Knox <ian.knox@fishtownanalytics.com>
Co-authored-by: Kyle Wigley <kyle@fishtownanalytics.com>
2021-11-29 18:19:34 -05:00
Kyle Wigley
d8dd75320c set invocation id when generating psuedo config (#4359) 2021-11-29 17:29:12 -05:00
Nathaniel May
a613556246 add thread_name to json output (#4353) 2021-11-29 14:01:50 -05:00
Jeremy Cohen
8d2351d541 Logging: restore previous (small) behaviors (#4341)
* Log formatting from flags earlier

* WARN-level stdout for list task

* Readd tracking events to File

* PR feedback, annotate hacks

* Revert "PR feedback, annotate hacks"

This reverts commit 5508fa230b.

* This is maybe better

* Annotate main.py

* One more comment in base.py

* Update changelog
2021-11-29 19:05:39 +01:00
leahwicz
f72b603196 Adding release workflow (#4288) 2021-11-29 10:37:14 -05:00
Gerda Shank
4eb17b57fb Provide function to set the invocation_id (#4351) 2021-11-29 10:15:19 -05:00
Cor
85a4b87267 Use cls in classmethod (#4345)
Instead of calling the class explicitly, use the `cls` variable instead.
2021-11-29 09:57:52 -05:00
jan zens
0d320c58da fix typo in UnparsedSourceDefinition.__post_serialize_ (#4349)
* fix typo in UnparsedSourceDefinition.__post_serialize_

fix typo in UnparsedSourceDefinition.__post_serialize_

* update CHANGELOG.md

update CHANGELOG.md

add #4349

* Update changelog

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2021-11-29 11:36:11 +01:00
Emilie Lima Schario
ed1ff2caac Adjust logic when finding approx matches for model or test matching (#4076)
* adjust logic when finding approx matches

* update changelog

* Update core/dbt/adapters/base/relation.py

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* Update changelog

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2021-11-29 11:20:01 +01:00
sarah-weatherbee
d80646c258 adds additional augmented assignment statements (#4315) (#4331)
* adds additional augmented assignment statements (#4315)

* Per PR comments, revised CHANGELOG.md to note change and contributor info
2021-11-27 09:04:40 -06:00
Matthew McKnight
a9b4316346 Mc knight 42/test event codes (#4338)
* pushing up to get eye on from Nate

* updating to compare

* latest push

* finished test for duplicate codes with a lot of help from Nate

* resolving suggestions

* removed duplicated code in types.py, made minor changes to test_events.py

* added missing func call
2021-11-24 16:03:43 -06:00
Gerda Shank
36776b96e7 [#4337] Always create an invocation_id, even when not tracking (#4340) 2021-11-24 16:54:17 -05:00
Jeremy Cohen
7f2d3cd24f Fix static parser tracking logic (#4332)
* Fix static parser tracking logic

* Add changelog note
2021-11-24 17:26:56 +01:00
Gerda Shank
d046ae0606 [#4253] Support partial parsing of env_vars in metrics definitions (#4322) 2021-11-23 15:02:47 -05:00
Gerda Shank
e8c267275e [#4254] Change some CompilationExceptions to ParsingException in the parser (#4328) 2021-11-23 13:50:00 -05:00
github-actions[bot]
a4951749a8 Bumping version to 1.0.0rc2 (#4321)
* Bumping version to 1.0.0rc2

* Update changelog

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2021-11-22 21:26:15 +01:00
Ian Knox
e1a2e8d9f5 Add codes to all log events (re-work of PR #4268) (#4319)
* re-work of old branch
2021-11-22 13:14:33 -06:00
Emily Rockman
f80c78e3a5 add logic to scrub more than str types (#4317) 2021-11-22 12:58:10 -06:00
Emily Rockman
c541eca592 structured logging: add data attributes to json log output (#4301)
* simplified data construction

* fixed missed scrubbing of secrets

* switched to vars()

* scrub entire log line, update how attributes get pulled

* get ahead of serialization errors

* store if data is serialized and modify values instead of a copy of values

* fixed unused import from merge
2021-11-19 15:43:26 -06:00
Nathaniel May
726aba0586 version logging (#4289)
* start adding version logging, noticed some wrong stuff

* fix bad pid and ts

* remove level format on json logs

Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
2021-11-19 14:53:50 -06:00
Jeremy Cohen
d300addee1 SecretContext for secret env vars, profiles + packages only (#4311)
* SecretContext for secret env vars

* Cleanup exception. Add + edit tests

* Add changelog entry
2021-11-19 19:36:19 +01:00
Kyle Wigley
d5d16f01f4 Fix flags import (#4307) 2021-11-18 14:59:49 -05:00
Kyle Wigley
2cb26e2699 Add supported dbt tasks (#4200) 2021-11-18 14:05:00 -05:00
Nathaniel May
b4793b4f9b Fix adapter failures due to string formatting issues (#4305)
fix adapter failures due to string formatting issues
2021-11-18 12:54:20 -05:00
Gerda Shank
045e70ccf1 [#4298] Fix 'created_at' in ParsedMetric to allow recalculating metrics depends_on refs (#4299) 2021-11-18 09:29:09 -05:00
Jeremy Cohen
ba23395c8e Fix metrics count in compile stats (#4292)
* Fix metrics count in compile stats

* Add changelog entry
2021-11-18 09:28:13 +01:00
Joel Labes
0aacd99168 Get prerelease packages when specifically requested (#4295)
* Get prerelease packages when specifically required. Add test validating it works

* Update CHANGELOG.md
2021-11-18 09:11:49 +01:00
Nathaniel May
e4b5d73dc4 adjust level length for text only (#4303)
adjust level length for text log lines only
2021-11-17 17:32:15 -05:00
Gerda Shank
bd950f687a [#4252] Serialization error when missing quotes in metrics model ref() call (#4287) 2021-11-17 17:14:32 -05:00
Gerda Shank
aea23a488b [#4272] Move validator keyword argument in jinja 'config.get' to after 'default' (#4297) 2021-11-17 17:12:26 -05:00
Jeremy Cohen
22731df07b Fix: default log formatting (#4302)
* Respect log formatting

* PR feedback
2021-11-17 21:10:14 +01:00
Jeremy Cohen
c55be164e6 Separate warnings. Fix duplication (#4291) 2021-11-17 18:01:28 +01:00
kadero
9d73304c1a Alow snapshot defer (#4296)
* Alow snapshot defer

* Update changelog
2021-11-17 16:56:37 +01:00
715 changed files with 4874 additions and 1753 deletions

@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 1.0.0rc1
+current_version = 1.0.1
parse = (?P<major>\d+)
\.(?P<minor>\d+)
\.(?P<patch>\d+)
@@ -35,5 +35,3 @@ first_value = 1
[bumpversion:file:plugins/postgres/setup.py]
[bumpversion:file:plugins/postgres/dbt/adapters/postgres/__version__.py]
[bumpversion:file:docker/requirements/requirements.txt]
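The `parse` setting in the first hunk is a regular expression with named groups. As a rough sketch of how such a pattern splits a version string, the snippet below applies only the three groups visible in the hunk; the rest of the pattern is cut off by the hunk boundary and is not reproduced here. `parse_version` is a hypothetical helper for illustration, not part of bumpversion.

```python
import re

# Only the part of the `parse` pattern visible in the hunk above; the real
# setting continues beyond the lines shown, so suffixes like "rc1" are ignored.
VISIBLE_PARSE = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)")


def parse_version(version: str) -> dict:
    """Return the major/minor/patch parts of a version string as ints."""
    match = VISIBLE_PARSE.match(version)
    if match is None:
        raise ValueError(f"not a recognizable version: {version!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    print(parse_version("1.0.0rc1"))  # the old current_version
    print(parse_version("1.0.1"))     # the new current_version
```

With only these three groups, the two `current_version` values differ in the `patch` part alone; the prerelease suffix would be handled by the portion of the pattern not shown.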

43
.github/CODEOWNERS vendored Normal file

@@ -0,0 +1,43 @@
# This file contains the code owners for the dbt-core repo.
# PRs will be automatically assigned for review to the associated
# team(s) or person(s) that touches any files that are mapped to them.
#
# A statement takes precedence over the statements above it so more general
# assignments are found at the top with specific assignments being lower in
# the ordering (i.e. catch all assignment should be the first item)
#
# Consult GitHub documentation for formatting guidelines:
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#example-of-a-codeowners-file
# As a default for areas with no assignment,
# the core team as a whole will be assigned
* @dbt-labs/core
# Changes to GitHub configurations including Actions
/.github/ @leahwicz
# Language core modules
/core/dbt/config/ @dbt-labs/core-language
/core/dbt/context/ @dbt-labs/core-language
/core/dbt/contracts/ @dbt-labs/core-language
/core/dbt/deps/ @dbt-labs/core-language
/core/dbt/parser/ @dbt-labs/core-language
# Execution core modules
/core/dbt/events/ @dbt-labs/core-execution @dbt-labs/core-language # eventually remove language but they have knowledge here now
/core/dbt/graph/ @dbt-labs/core-execution
/core/dbt/task/ @dbt-labs/core-execution
# Adapter interface, scaffold, Postgres plugin
/core/dbt/adapters @dbt-labs/core-adapters
/core/scripts/create_adapter_plugin.py @dbt-labs/core-adapters
/plugins/ @dbt-labs/core-adapters
# Global project: default macros, including generic tests + materializations
/core/dbt/include/global_project @dbt-labs/core-execution @dbt-labs/core-adapters
# Perf regression testing framework
# This excludes the test project files itself since those aren't specific
# framework changes (excluded by not setting an owner next to it- no owner)
/performance @nathaniel-may
/performance/projects
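The header comment says that a later statement takes precedence over the ones above it. A minimal sketch of that "last match wins" rule follows, using a handful of rules copied from the file. The matching here is deliberately simplified (a bare `*` or a root-anchored path prefix) and is an assumption for illustration; GitHub's real CODEOWNERS pattern matching is richer. `matches` and `owners_for` are hypothetical names.

```python
from typing import List, Tuple

# A few rules copied from the CODEOWNERS file above, in file order.
RULES: List[Tuple[str, List[str]]] = [
    ("*", ["@dbt-labs/core"]),
    ("/.github/", ["@leahwicz"]),
    ("/core/dbt/events/", ["@dbt-labs/core-execution", "@dbt-labs/core-language"]),
    ("/performance", ["@nathaniel-may"]),
    ("/performance/projects", []),  # no owner listed: explicitly unowned
]


def matches(pattern: str, path: str) -> bool:
    """Simplified matching: '*' matches everything, anything else is
    treated as a root-anchored path prefix."""
    if pattern == "*":
        return True
    prefix = pattern.strip("/")
    return path == prefix or path.startswith(prefix + "/")


def owners_for(path: str) -> List[str]:
    """Return the owners of the last rule that matches `path`."""
    owners: List[str] = []
    for pattern, rule_owners in RULES:
        if matches(pattern, path):
            owners = rule_owners  # later rules override earlier ones
    return owners


if __name__ == "__main__":
    for example in ("README.md", "core/dbt/events/types.py",
                    "performance/projects/01/dbt_project.yml"):
        print(example, owners_for(example))
```

This is why the catch-all `*` sits first in the file, and why the owner-less `/performance/projects` line at the end removes ownership for the test projects.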

34
.github/workflows/backport.yml vendored Normal file

@@ -0,0 +1,34 @@
# **what?**
# When a PR is merged, if it has the backport label, it will create
# a new PR to backport those changes to the given branch. If it can't
# cleanly do a backport, it will comment on the merged PR of the failure.
#
# Label naming convention: "backport <branch name to backport to>"
# Example: backport 1.0.latest
#
# You MUST "Squash and merge" the original PR or this won't work.
# **why?**
# Changes sometimes need to be backported to release branches.
# This automates the backporting process
# **when?**
# Once a PR is "Squash and merge"'d and it has been correctly labeled
# according to the naming convention.
name: Backport
on:
  pull_request:
    types:
      - closed
      - labeled
jobs:
  backport:
    runs-on: ubuntu-18.04
    name: Backport
    steps:
      - name: Backport
        uses: tibdex/backport@v1.1.1
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
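The label naming convention in the header comment ("backport <branch name to backport to>") can be illustrated with a short sketch. This is not how the `tibdex/backport` action is implemented; `backport_target` is a hypothetical helper that only shows how a label following the convention maps to a target branch.

```python
from typing import Optional

# Prefix taken from the convention in the workflow comment above.
LABEL_PREFIX = "backport "


def backport_target(label: str) -> Optional[str]:
    """Return the branch named by a backport label, or None if the
    label does not follow the 'backport <branch>' convention."""
    if not label.startswith(LABEL_PREFIX):
        return None
    branch = label[len(LABEL_PREFIX):].strip()
    return branch or None


if __name__ == "__main__":
    print(backport_target("backport 1.0.latest"))
    print(backport_target("bug"))
```

Labels that do not start with the prefix, or that name no branch, yield `None` in this sketch.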

26
.github/workflows/jira-creation.yml vendored Normal file

@@ -0,0 +1,26 @@
# **what?**
# Mirrors issues into Jira. Includes the information: title,
# GitHub Issue ID and URL
# **why?**
# Jira is our tool for tracking and we need to see these issues in there
# **when?**
# On issue creation or when an issue is labeled `Jira`
name: Jira Issue Creation
on:
  issues:
    types: [opened, labeled]
permissions:
  issues: write
jobs:
  call-label-action:
    uses: dbt-labs/jira-actions/.github/workflows/jira-creation.yml@main
    secrets:
      JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
      JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
      JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}

27
.github/workflows/jira-label.yml vendored Normal file

@@ -0,0 +1,27 @@
# **what?**
# Calls mirroring Jira label Action. Includes adding a new label
# to an existing issue or removing a label as well
# **why?**
# Jira is our tool for tracking and we need to see these labels in there
# **when?**
# On labels being added or removed from issues
name: Jira Label Mirroring
on:
  issues:
    types: [labeled, unlabeled]
permissions:
  issues: read
jobs:
  call-label-action:
    uses: dbt-labs/jira-actions/.github/workflows/jira-label.yml@main
    secrets:
      JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
      JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
      JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}

24
.github/workflows/jira-transition.yml vendored Normal file

@@ -0,0 +1,24 @@
# **what?**
# Transition a Jira issue to a new state
# Only supports these GitHub Issue transitions:
# closed, deleted, reopened
# **why?**
# Jira needs to be kept up-to-date
# **when?**
# On issue closing, deletion, reopened
name: Jira Issue Transition
on:
  issues:
    types: [closed, deleted, reopened]
jobs:
  call-label-action:
    uses: dbt-labs/jira-actions/.github/workflows/jira-transition.yml@main
    secrets:
      JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
      JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
      JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}

200
.github/workflows/release.yml vendored Normal file

@@ -0,0 +1,200 @@
# **what?**
# Take the given commit, run unit tests specifically on that sha, build and
# package it, and then release to GitHub and PyPi with that specific build
# **why?**
# Ensure an automated and tested release process
# **when?**
# This will only run manually with a given sha and version
name: Release to GitHub and PyPi
on:
  workflow_dispatch:
    inputs:
      sha:
        description: 'The last commit sha in the release'
        required: true
      version_number:
        description: 'The release version number (i.e. 1.0.0b1)'
        required: true
defaults:
  run:
    shell: bash
jobs:
  unit:
    name: Unit test
    runs-on: ubuntu-latest
    env:
      TOXENV: "unit"
    steps:
      - name: Check out the repository
        uses: actions/checkout@v2
        with:
          persist-credentials: false
          ref: ${{ github.event.inputs.sha }}
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.8
      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install tox
          pip --version
          tox --version
      - name: Run tox
        run: tox
  build:
    name: build packages
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v2
        with:
          persist-credentials: false
          ref: ${{ github.event.inputs.sha }}
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.8
      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install --upgrade setuptools wheel twine check-wheel-contents
          pip --version
      - name: Build distributions
        run: ./scripts/build-dist.sh
      - name: Show distributions
        run: ls -lh dist/
      - name: Check distribution descriptions
        run: |
          twine check dist/*
      - name: Check wheel contents
        run: |
          check-wheel-contents dist/*.whl --ignore W007,W008
      - uses: actions/upload-artifact@v2
        with:
          name: dist
          path: |
            dist/
            !dist/dbt-${{github.event.inputs.version_number}}.tar.gz
  test-build:
    name: verify packages
    needs: [build, unit]
    runs-on: ubuntu-latest
    steps:
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.8
      - name: Install python dependencies
        run: |
          pip install --user --upgrade pip
          pip install --upgrade wheel
          pip --version
      - uses: actions/download-artifact@v2
        with:
          name: dist
          path: dist/
      - name: Show distributions
        run: ls -lh dist/
      - name: Install wheel distributions
        run: |
          find ./dist/*.whl -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
      - name: Check wheel distributions
        run: |
          dbt --version
      - name: Install source distributions
        run: |
          find ./dist/*.gz -maxdepth 1 -type f | xargs pip install --force-reinstall --find-links=dist/
      - name: Check source distributions
        run: |
          dbt --version
  github-release:
    name: GitHub Release
    needs: test-build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v2
        with:
          name: dist
          path: '.'
      # Need to set an output variable because env variables can't be taken as input
      # This is needed for the next step with releasing to GitHub
      - name: Find release type
        id: release_type
        env:
          IS_PRERELEASE: ${{ contains(github.event.inputs.version_number, 'rc') || contains(github.event.inputs.version_number, 'b') }}
        run: |
          echo ::set-output name=isPrerelease::$IS_PRERELEASE
      - name: Creating GitHub Release
        uses: softprops/action-gh-release@v1
        with:
          name: dbt-core v${{github.event.inputs.version_number}}
          tag_name: v${{github.event.inputs.version_number}}
          prerelease: ${{ steps.release_type.outputs.isPrerelease }}
          target_commitish: ${{github.event.inputs.sha}}
          body: |
            [Release notes](https://github.com/dbt-labs/dbt-core/blob/main/CHANGELOG.md)
          files: |
            dbt_postgres-${{github.event.inputs.version_number}}-py3-none-any.whl
            dbt_core-${{github.event.inputs.version_number}}-py3-none-any.whl
            dbt-postgres-${{github.event.inputs.version_number}}.tar.gz
            dbt-core-${{github.event.inputs.version_number}}.tar.gz
  pypi-release:
    name: Pypi release
    runs-on: ubuntu-latest
    needs: github-release
    environment: PypiProd
    steps:
      - uses: actions/download-artifact@v2
        with:
          name: dist
          path: 'dist'
      - name: Publish distribution to PyPI
        uses: pypa/gh-action-pypi-publish@v1.4.2
        with:
          password: ${{ secrets.PYPI_API_TOKEN }}
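The "Find release type" step decides whether a release is a prerelease by checking the version string for `rc` or `b`, and the release step attaches four files named after the version. The sketch below mirrors both rules in Python purely for illustration. `is_prerelease` and `release_files` are hypothetical names; the lowercasing assumes the case-insensitive behavior of the GitHub Actions `contains` expression on strings.

```python
from typing import List


def is_prerelease(version_number: str) -> bool:
    """Mirror the workflow expression:
    contains(version, 'rc') || contains(version, 'b')."""
    lowered = version_number.lower()
    return "rc" in lowered or "b" in lowered


def release_files(version_number: str) -> List[str]:
    """File names attached to the GitHub release, as listed under `files`."""
    return [
        f"dbt_postgres-{version_number}-py3-none-any.whl",
        f"dbt_core-{version_number}-py3-none-any.whl",
        f"dbt-postgres-{version_number}.tar.gz",
        f"dbt-core-{version_number}.tar.gz",
    ]


if __name__ == "__main__":
    for version in ("1.0.0b1", "1.0.0rc3", "1.0.1"):
        print(version, is_prerelease(version))
    print(release_files("1.0.1"))
```

Note that the substring test is loose by design: any version string containing the letter `b` would be flagged as a prerelease.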

@@ -0,0 +1,71 @@
# This Action checks makes a dbt run to sample json structured logs
# and checks that they conform to the currently documented schema.
#
# If this action fails it either means we have unintentionally deviated
# from our documented structured logging schema, or we need to bump the
# version of our structured logging and add new documentation to
# communicate these changes.
name: Structured Logging Schema Check
on:
  push:
    branches:
      - "main"
      - "*.latest"
      - "releases/*"
  pull_request:
  workflow_dispatch:
permissions: read-all
jobs:
  # run the performance measurements on the current or default branch
  test-schema:
    name: Test Log Schema
    runs-on: ubuntu-latest
    env:
      # turns warnings into errors
      RUSTFLAGS: "-D warnings"
      # points tests to the log file
      LOG_DIR: "/home/runner/work/dbt-core/dbt-core/logs"
      # tells integration tests to output into json format
      DBT_LOG_FORMAT: 'json'
    steps:
      - name: checkout dev
        uses: actions/checkout@v2
        with:
          persist-credentials: false
      - name: Setup Python
        uses: actions/setup-python@v2.2.2
        with:
          python-version: "3.8"
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - name: install dbt
        run: pip install -r dev-requirements.txt -r editable-requirements.txt
      - name: Set up postgres
        uses: ./.github/actions/setup-postgres-linux
      - name: ls
        run: ls
      # integration tests generate a ton of logs in different files. the next step will find them all.
      # we actually care if these pass, because the normal test run doesn't usually include many json log outputs
      - name: Run integration tests
        run: tox -e py38-postgres -- -nauto
      # apply our schema tests to every log event from the previous step
      # skips any output that isn't valid json
      - uses: actions-rs/cargo@v1
        with:
          command: run
          args: --manifest-path test/interop/log_parsing/Cargo.toml
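The last step applies schema checks to every log event and skips output that is not valid JSON. The actual checker is the Rust project under `test/interop/log_parsing`, which is not shown in this diff. As a loose Python sketch of the same idea, the snippet below skips non-JSON lines and reports events missing required keys. The key list is a hypothetical stand-in chosen for illustration, not the documented schema, and `schema_violations` is an invented name.

```python
import json
from typing import Iterable, List

# Hypothetical required keys, for illustration only; the real schema is
# defined by the Rust checker, which this diff does not show.
REQUIRED_KEYS = ("code", "level", "msg", "ts")


def schema_violations(lines: Iterable[str]) -> List[str]:
    """Return one message per JSON log event that lacks a required key."""
    problems: List[str] = []
    for number, line in enumerate(lines, start=1):
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON: skipped, as the workflow comment describes
        if not isinstance(event, dict):
            continue  # valid JSON but not an object; ignored in this sketch
        missing = [key for key in REQUIRED_KEYS if key not in event]
        if missing:
            problems.append(f"line {number}: missing {', '.join(missing)}")
    return problems


if __name__ == "__main__":
    sample = [
        "plain text output",
        '{"code": "A001", "level": "info", "msg": "hi", "ts": "2021-12-02T09:44:10Z"}',
        '{"code": "A001"}',
    ]
    print(schema_violations(sample))
```

An empty result means every JSON event carried the required keys; a CI job built on this idea would fail when the list is non-empty.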

@@ -66,12 +66,12 @@ jobs:
           git push origin bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
           git branch --set-upstream-to=origin/bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
-      - name: Generate Docker requirements
-        run: |
-          source env/bin/activate
-          pip install -r requirements.txt
-          pip freeze -l > docker/requirements/requirements.txt
-          git status
+      # - name: Generate Docker requirements
+      #   run: |
+      #     source env/bin/activate
+      #     pip install -r requirements.txt
+      #     pip freeze -l > docker/requirements/requirements.txt
+      #     git status
       - name: Bump version
         run: |

@@ -4,16 +4,17 @@ The core function of dbt is SQL compilation and execution. Users create projects
Most of the python code in the repository is within the `core/dbt` directory. Currently the main subdirectories are:
-- [`adapters`](core/dbt/adapters): Define base classes for behavior that is likely to differ across databases
-- [`clients`](core/dbt/clients): Interface with dependencies (agate, jinja) or across operating systems
-- [`config`](core/dbt/config): Reconcile user-supplied configuration from connection profiles, project files, and Jinja macros
-- [`context`](core/dbt/context): Build and expose dbt-specific Jinja functionality
-- [`contracts`](core/dbt/contracts): Define Python objects (dataclasses) that dbt expects to create and validate
-- [`deps`](core/dbt/deps): Package installation and dependency resolution
-- [`graph`](core/dbt/graph): Produce a `networkx` DAG of project resources, and selecting those resources given user-supplied criteria
-- [`include`](core/dbt/include): The dbt "global project," which defines default implementations of Jinja2 macros
-- [`parser`](core/dbt/parser): Read project files, validate, construct python objects
-- [`task`](core/dbt/task): Set forth the actions that dbt can perform when invoked
+- [`adapters`](core/dbt/adapters/README.md): Define base classes for behavior that is likely to differ across databases
+- [`clients`](core/dbt/clients/README.md): Interface with dependencies (agate, jinja) or across operating systems
+- [`config`](core/dbt/config/README.md): Reconcile user-supplied configuration from connection profiles, project files, and Jinja macros
+- [`context`](core/dbt/context/README.md): Build and expose dbt-specific Jinja functionality
+- [`contracts`](core/dbt/contracts/README.md): Define Python objects (dataclasses) that dbt expects to create and validate
+- [`deps`](core/dbt/deps/README.md): Package installation and dependency resolution
+- [`events`](core/dbt/events/README.md): Logging events
+- [`graph`](core/dbt/graph/README.md): Produce a `networkx` DAG of project resources, and selecting those resources given user-supplied criteria
+- [`include`](core/dbt/include/README.md): The dbt "global project," which defines default implementations of Jinja2 macros
+- [`parser`](core/dbt/parser/README.md): Read project files, validate, construct python objects
+- [`task`](core/dbt/task/README.md): Set forth the actions that dbt can perform when invoked
### Invoking dbt
@@ -44,4 +45,4 @@ The [`test/`](test/) subdirectory includes unit and integration tests that run a
- [docker](docker/): All dbt versions are published as Docker images on DockerHub. This subfolder contains the `Dockerfile` (constant) and `requirements.txt` (one for each version).
- [etc](etc/): Images for README
- [scripts](scripts/): Helper scripts for testing, releasing, and producing JSON schemas. These are not included in distributions of dbt, not are they rigorously tested—they're just handy tools for the dbt maintainers :)
- [scripts](scripts/): Helper scripts for testing, releasing, and producing JSON schemas. These are not included in distributions of dbt, nor are they rigorously tested—they're just handy tools for the dbt maintainers :)

@@ -1,14 +1,130 @@
## dbt-core 1.0.0rc2 (TBD)
## dbt-core 1.1.0 (TBD)
### Features
- New Dockerfile to support specific db adapters and platforms. See docker/README.md for details ([#4495](https://github.com/dbt-labs/dbt-core/issues/4495), [#4487](https://github.com/dbt-labs/dbt-core/pull/4487))
### Fixes
- User wasn't asked for permission to overwrite a profile entry when running init inside an existing project ([#4375](https://github.com/dbt-labs/dbt-core/issues/4375), [#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
- Add project name validation to `dbt init` ([#4490](https://github.com/dbt-labs/dbt-core/issues/4490),[#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
### Under the hood
- Testing cleanup ([#4496](https://github.com/dbt-labs/dbt-core/pull/4496), [#4509](https://github.com/dbt-labs/dbt-core/pull/4509))
- Clean up test deprecation warnings ([#3988](https://github.com/dbt-labs/dbt-core/issue/3988), [#4556](https://github.com/dbt-labs/dbt-core/pull/4556))
- Use mashumaro for serialization in event logging ([#4504](https://github.com/dbt-labs/dbt-core/issues/4504), [#4505](https://github.com/dbt-labs/dbt-core/pull/4505))
- Drop support for Python 3.7.0 + 3.7.1 ([#4584](https://github.com/dbt-labs/dbt-core/issues/4584), [#4585](https://github.com/dbt-labs/dbt-core/pull/4585), [#4643](https://github.com/dbt-labs/dbt-core/pull/4643))
Contributors:
- [@NiallRees](https://github.com/NiallRees) ([#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
## dbt-core 1.0.2 (TBD)
### Fixes
- Projects created using `dbt init` now have the correct `seeds` directory created (instead of `data`) ([#4588](https://github.com/dbt-labs/dbt-core/issues/4588), [#4599](https://github.com/dbt-labs/dbt-core/pull/4589))
- Don't require a profile for dbt deps and clean commands ([#4554](https://github.com/dbt-labs/dbt-core/issues/4554), [#4610](https://github.com/dbt-labs/dbt-core/pull/4610))
- Select modified.body works correctly when a new model is added ([#4570](https://github.com/dbt-labs/dbt-core/issues/4570), [#4631](https://github.com/dbt-labs/dbt-core/pull/4631))
- Fix bug in retry logic for bad response from hub and when there is a bad git tarball download. ([#4577](https://github.com/dbt-labs/dbt-core/issues/4577), [#4579](https://github.com/dbt-labs/dbt-core/issues/4579), [#4609](https://github.com/dbt-labs/dbt-core/pull/4609))
- Restore previous log level (DEBUG) when a test depends on a disabled resource. Still WARN if the resource is missing ([#4594](https://github.com/dbt-labs/dbt-core/issues/4594), [#4647](https://github.com/dbt-labs/dbt-core/pull/4647))
## dbt-core 1.0.1 (January 03, 2022)
* [@amirkdv](https://github.com/amirkdv) ([#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
## dbt-core 1.0.1rc1 (December 20, 2021)
### Fixes
- Fix wrong url in the dbt docs overview homepage ([#4442](https://github.com/dbt-labs/dbt-core/pull/4442))
- Fix redefined status param of SQLQueryStatus to typecheck the string which passes on `._message` value of `AdapterResponse` or the `str` value sent by adapter plugin. ([#4463](https://github.com/dbt-labs/dbt-core/pull/4463#issuecomment-990174166))
- Fix `DepsStartPackageInstall` event to use package name instead of version number. ([#4482](https://github.com/dbt-labs/dbt-core/pull/4482))
- Reimplement log message to use adapter name instead of the object method. ([#4501](https://github.com/dbt-labs/dbt-core/pull/4501))
- Issue better error message for incompatible schemas ([#4470](https://github.com/dbt-labs/dbt-core/pull/4442), [#4497](https://github.com/dbt-labs/dbt-core/pull/4497))
- Remove secrets from error related to packages. ([#4507](https://github.com/dbt-labs/dbt-core/pull/4507))
- Prevent coercion of boolean values (`True`, `False`) to numeric values (`0`, `1`) in query results ([#4511](https://github.com/dbt-labs/dbt-core/issues/4511), [#4512](https://github.com/dbt-labs/dbt-core/pull/4512))
- Fix error with an env_var in a project hook ([#4523](https://github.com/dbt-labs/dbt-core/issues/4523), [#4524](https://github.com/dbt-labs/dbt-core/pull/4524))
- Add additional windows compat logic for colored log output. ([#4443](https://github.com/dbt-labs/dbt-core/issues/4443))
### Docs
- Fix missing data on exposures in docs ([#4467](https://github.com/dbt-labs/dbt-core/issues/4467))
Contributors:
- [remoyson](https://github.com/remoyson) ([#4442](https://github.com/dbt-labs/dbt-core/pull/4442))
## dbt-core 1.0.0 (December 3, 2021)
### Fixes
- Configure the CLI logger destination to use stdout instead of stderr ([#4368](https://github.com/dbt-labs/dbt-core/pull/4368))
- Make the size of `EVENT_HISTORY` configurable, via `EVENT_BUFFER_SIZE` global config ([#4411](https://github.com/dbt-labs/dbt-core/pull/4411), [#4416](https://github.com/dbt-labs/dbt-core/pull/4416))
- Change type of `log_format` in `profiles.yml` user config to be string, not boolean ([#4394](https://github.com/dbt-labs/dbt-core/pull/4394))
### Under the hood
- Only log cache events if `LOG_CACHE_EVENTS` is enabled, and disable by default. This restores previous behavior ([#4369](https://github.com/dbt-labs/dbt-core/pull/4369))
- Move event codes to be a top-level attribute of JSON-formatted logs, rather than nested in `data` ([#4381](https://github.com/dbt-labs/dbt-core/pull/4381))
- Fix failing integration test on Windows ([#4380](https://github.com/dbt-labs/dbt-core/pull/4380))
- Clean up warning messages for `clean` + `deps` ([#4366](https://github.com/dbt-labs/dbt-core/pull/4366))
- Use RFC3339 timestamps for log messages ([#4384](https://github.com/dbt-labs/dbt-core/pull/4384))
- Different text output for console (info) and file (debug) logs ([#4379](https://github.com/dbt-labs/dbt-core/pull/4379), [#4418](https://github.com/dbt-labs/dbt-core/pull/4418))
- Remove unused events. More structured `ConcurrencyLine`. Replace `\n` message starts/ends with `EmptyLine` events, and exclude `EmptyLine` from JSON-formatted output ([#4388](https://github.com/dbt-labs/dbt-core/pull/4388))
- Update `events` module README ([#4395](https://github.com/dbt-labs/dbt-core/pull/4395))
- Rework approach to JSON serialization for events with non-standard properties ([#4396](https://github.com/dbt-labs/dbt-core/pull/4396))
- Update legacy logger file name to `dbt.log.legacy` ([#4402](https://github.com/dbt-labs/dbt-core/pull/4402))
- Rollover `dbt.log` at 10 MB, and keep up to 5 backups, restoring previous behavior ([#4405](https://github.com/dbt-labs/dbt-core/pull/4405))
- Use reference keys instead of full relation objects in cache events ([#4410](https://github.com/dbt-labs/dbt-core/pull/4410))
- Add `node_type` contextual info to more events ([#4378](https://github.com/dbt-labs/dbt-core/pull/4378))
- Make `materialized` config optional in `node_type` ([#4417](https://github.com/dbt-labs/dbt-core/pull/4417))
- Stringify exception in `GenericExceptionOnRun` to support JSON serialization ([#4424](https://github.com/dbt-labs/dbt-core/pull/4424))
- Add "interop" tests for machine consumption of structured log output ([#4327](https://github.com/dbt-labs/dbt-core/pull/4327))
- Relax version specifier for `dbt-extractor` to `~=0.4.0`, to support compiled wheels for additional architectures when available ([#4427](https://github.com/dbt-labs/dbt-core/pull/4427))
## dbt-core 1.0.0rc3 (November 30, 2021)
### Fixes
- Support partial parsing of env_vars in metrics ([#4253](https://github.com/dbt-labs/dbt-core/issues/4293), [#4322](https://github.com/dbt-labs/dbt-core/pull/4322))
- Fix typo in `UnparsedSourceDefinition.__post_serialize__` ([#3545](https://github.com/dbt-labs/dbt-core/issues/3545), [#4349](https://github.com/dbt-labs/dbt-core/pull/4349))
### Under the hood
- Change some CompilationExceptions to ParsingExceptions ([#4254](http://github.com/dbt-labs/dbt-core/issues/4254), [#4328](https://github.com/dbt-core/pull/4328))
- Reorder logic for static parser sampling to speed up model parsing ([#4332](https://github.com/dbt-labs/dbt-core/pull/4332))
- Use more augmented assignment statements ([#4315](https://github.com/dbt-labs/dbt-core/issues/4315)), ([#4311](https://github.com/dbt-labs/dbt-core/pull/4331))
- Adjust logic when finding approximate matches for models and tests ([#3835](https://github.com/dbt-labs/dbt-core/issues/3835), [#4076](https://github.com/dbt-labs/dbt-core/pull/4076))
- Restore small previous behaviors for logging: JSON formatting for first few events; `WARN`-level stdout for `list` task; include tracking events in `dbt.log` ([#4341](https://github.com/dbt-labs/dbt-core/pull/4341))
Contributors:
- [@sarah-weatherbee](https://github.com/sarah-weatherbee) ([#4331](https://github.com/dbt-labs/dbt-core/pull/4331))
- [@emilieschario](https://github.com/emilieschario) ([#4076](https://github.com/dbt-labs/dbt-core/pull/4076))
- [@sneznaj](https://github.com/sneznaj) ([#4349](https://github.com/dbt-labs/dbt-core/pull/4349))
## dbt-core 1.0.0rc2 (November 22, 2021)
### Breaking changes
- Restrict secret env vars (prefixed `DBT_ENV_SECRET_`) to `profiles.yml` + `packages.yml` _only_. Raise an exception if a secret env var is used elsewhere ([#4310](https://github.com/dbt-labs/dbt-core/issues/4310), [#4311](https://github.com/dbt-labs/dbt-core/pull/4311))
- Reorder arguments to `config.get()` so that `default` is second ([#4273](https://github.com/dbt-labs/dbt-core/issues/4273), [#4297](https://github.com/dbt-labs/dbt-core/pull/4297))
### Features
- Avoid error when missing column in YAML description ([#4151](https://github.com/dbt-labs/dbt-core/issues/4151), [#4285](https://github.com/dbt-labs/dbt-core/pull/4285))
- Allow `--defer` flag to `dbt snapshot` ([#4110](https://github.com/dbt-labs/dbt-core/issues/4110), [#4296](https://github.com/dbt-labs/dbt-core/pull/4296))
- Install prerelease packages when `version` explicitly references a prerelease version, regardless of `install-prerelease` status ([#4243](https://github.com/dbt-labs/dbt-core/issues/4243), [#4295](https://github.com/dbt-labs/dbt-core/pull/4295))
- Add data attributes to json log messages ([#4301](https://github.com/dbt-labs/dbt-core/pull/4301))
- Add event codes to all log events ([#4319](https://github.com/dbt-labs/dbt-core/pull/4319))
### Fixes
- Fix serialization error with missing quotes in metrics model ref ([#4252](https://github.com/dbt-labs/dbt-core/issues/4252), [#4287](https://github.com/dbt-labs/dbt-core/pull/4289))
- Correct definition of 'created_at' in ParsedMetric nodes ([#4298](http://github.com/dbt-labs/dbt-core/issues/4298), [#4299](https://github.com/dbt-labs/dbt-core/pull/4299))
### Fixes
- Allow specifying default in Jinja config.get with default keyword ([#4273](https://github.com/dbt-labs/dbt-core/issues/4273), [#4297](https://github.com/dbt-labs/dbt-core/pull/4297))
- Fix serialization error with missing quotes in metrics model ref ([#4252](https://github.com/dbt-labs/dbt-core/issues/4252), [#4287](https://github.com/dbt-labs/dbt-core/pull/4289))
- Correct definition of 'created_at' in ParsedMetric nodes ([#4298](https://github.com/dbt-labs/dbt-core/issues/4298), [#4299](https://github.com/dbt-labs/dbt-core/pull/4299))
### Under the hood
Add --indirect-selection parameter to profiles.yml and builtin DBT_ env vars; stringified parameter to enable multi-modal use ([#3997](https://github.com/dbt-labs/dbt-core/issues/3997), [PR #4270](https://github.com/dbt-labs/dbt-core/pull/4270))
- Add --indirect-selection parameter to profiles.yml and builtin DBT_ env vars; stringified parameter to enable multi-modal use ([#3997](https://github.com/dbt-labs/dbt-core/issues/3997), [#4270](https://github.com/dbt-labs/dbt-core/pull/4270))
- Fix filesystem searcher test failure on Python 3.9 ([#3689](https://github.com/dbt-labs/dbt-core/issues/3689), [#4271](https://github.com/dbt-labs/dbt-core/pull/4271))
- Clean up deprecation warnings shown for `dbt_project.yml` config renames ([#4276](https://github.com/dbt-labs/dbt-core/issues/4276), [#4291](https://github.com/dbt-labs/dbt-core/pull/4291))
- Fix metrics count in compiled project stats ([#4290](https://github.com/dbt-labs/dbt-core/issues/4290), [#4292](https://github.com/dbt-labs/dbt-core/pull/4292))
- First pass at supporting more dbt tasks via python lib ([#4200](https://github.com/dbt-labs/dbt-core/pull/4200))
Contributors:
- [@kadero](https://github.com/kadero) ([#4285](https://github.com/dbt-labs/dbt-core/pull/4285))
- [@kadero](https://github.com/kadero) ([#4285](https://github.com/dbt-labs/dbt-core/pull/4285), [#4296](https://github.com/dbt-labs/dbt-core/pull/4296))
- [@joellabes](https://github.com/joellabes) ([#4295](https://github.com/dbt-labs/dbt-core/pull/4295))
## dbt-core 1.0.0rc1 (November 10, 2021)
@@ -74,7 +190,7 @@ Contributors:
- Make finding disabled nodes more consistent ([#4069](https://github.com/dbt-labs/dbt-core/issues/4069), [#4073](https://github.com/dbt-labas/dbt-core/pull/4073))
- Remove connection from `render_with_context` during parsing, thereby removing misleading log message ([#3137](https://github.com/dbt-labs/dbt-core/issues/3137), [#4062](https://github.com/dbt-labas/dbt-core/pull/4062))
- Wait for postgres docker container to be ready in `setup_db.sh`. ([#3876](https://github.com/dbt-labs/dbt-core/issues/3876), [#3908](https://github.com/dbt-labs/dbt-core/pull/3908))
- Prefer macros defined in the project over the ones in a package by default ([#4106](https://github.com/dbt-labs/dbt-core/issues/4106), [#4114](https://github.com/dbt-labs/dbt-core/pull/4114))
- Prefer macros defined in the project over the ones in a package by default ([#4106](https://github.com/dbt-labs/dbt-core/issues/4106), [#4114](https://github.com/dbt-labs/dbt-core/pull/4114))
- Dependency updates ([#4079](https://github.com/dbt-labs/dbt-core/pull/4079), [#3532](https://github.com/dbt-labs/dbt-core/pull/3532))
- Schedule partial parsing for SQL files with env_var changes ([#3885](https://github.com/dbt-labs/dbt-core/issues/3885), [#4101](https://github.com/dbt-labs/dbt-core/pull/4101))
- Schedule partial parsing for schema files with env_var changes ([#3885](https://github.com/dbt-labs/dbt-core/issues/3885), [#4162](https://github.com/dbt-labs/dbt-core/pull/4162))
@@ -135,7 +251,7 @@ Contributors:
- [@laxjesse](https://github.com/laxjesse) ([#4019](https://github.com/dbt-labs/dbt-core/pull/4019))
- [@gitznik](https://github.com/Gitznik) ([#4124](https://github.com/dbt-labs/dbt-core/pull/4124))
## dbt 0.21.1 (Release TBD)
## dbt 0.21.1 (November 29, 2021)
### Fixes
- Add `get_where_subquery` to test macro namespace, fixing custom generic tests that rely on introspecting the `model` arg at parse time ([#4195](https://github.com/dbt-labs/dbt/issues/4195), [#4197](https://github.com/dbt-labs/dbt/pull/4197))
@@ -279,7 +395,7 @@ Contributors:
- [@jmriego](https://github.com/jmriego) ([#3526](https://github.com/dbt-labs/dbt-core/pull/3526))
- [@danielefrigo](https://github.com/danielefrigo) ([#3547](https://github.com/dbt-labs/dbt-core/pull/3547))
## dbt 0.20.2 (Release TBD)
## dbt 0.20.2 (September 07, 2021)
### Under the hood

@@ -10,7 +10,7 @@
## About this document
This document is a guide intended for folks interested in contributing to `dbt`. Below, we document the process by which members of the community should create issues and submit pull requests (PRs) in this repository. It is not intended as a guide for using `dbt`, and it assumes a certain level of familiarity with Python concepts such as virtualenvs, `pip`, python modules, filesystems, and so on. This guide assumes you are using macOS or Linux and are comfortable with the command line.
This document is a guide intended for folks interested in contributing to `dbt-core`. Below, we document the process by which members of the community should create issues and submit pull requests (PRs) in this repository. It is not intended as a guide for using `dbt-core`, and it assumes a certain level of familiarity with Python concepts such as virtualenvs, `pip`, python modules, filesystems, and so on. This guide assumes you are using macOS or Linux and are comfortable with the command line.
If you're new to python development or contributing to open-source software, we encourage you to read this document from start to finish. If you get stuck, drop us a line in the `#dbt-core-development` channel on [slack](https://community.getdbt.com).
@@ -20,101 +20,101 @@ If you have an issue or code change suggestion related to a specific database [a
### Signing the CLA
Please note that all contributors to `dbt` must sign the [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements) to have their Pull Request merged into the `dbt` codebase. If you are unable to sign the CLA, then the `dbt` maintainers will unfortunately be unable to merge your Pull Request. You are, however, welcome to open issues and comment on existing ones.
Please note that all contributors to `dbt-core` must sign the [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements) to have their Pull Request merged into the `dbt-core` codebase. If you are unable to sign the CLA, then the `dbt-core` maintainers will unfortunately be unable to merge your Pull Request. You are, however, welcome to open issues and comment on existing ones.
## Proposing a change
`dbt` is Apache 2.0-licensed open source software. `dbt` is what it is today because community members like you have opened issues, provided feedback, and contributed to the knowledge loop for the entire community. Whether you are a seasoned open source contributor or a first-time committer, we welcome and encourage you to contribute code, documentation, ideas, or problem statements to this project.
`dbt-core` is Apache 2.0-licensed open source software. `dbt-core` is what it is today because community members like you have opened issues, provided feedback, and contributed to the knowledge loop for the entire community. Whether you are a seasoned open source contributor or a first-time committer, we welcome and encourage you to contribute code, documentation, ideas, or problem statements to this project.
### Defining the problem
If you have an idea for a new feature or if you've discovered a bug in `dbt`, the first step is to open an issue. Please check the list of [open issues](https://github.com/dbt-labs/dbt-core/issues) before creating a new one. If you find a relevant issue, please add a comment to the open issue instead of creating a new one. There are hundreds of open issues in this repository and it can be hard to know where to look for a relevant open issue. **The `dbt` maintainers are always happy to point contributors in the right direction**, so please err on the side of documenting your idea in a new issue if you are unsure where a problem statement belongs.
If you have an idea for a new feature or if you've discovered a bug in `dbt-core`, the first step is to open an issue. Please check the list of [open issues](https://github.com/dbt-labs/dbt-core/issues) before creating a new one. If you find a relevant issue, please add a comment to the open issue instead of creating a new one. There are hundreds of open issues in this repository and it can be hard to know where to look for a relevant open issue. **The `dbt-core` maintainers are always happy to point contributors in the right direction**, so please err on the side of documenting your idea in a new issue if you are unsure where a problem statement belongs.
> **Note:** All community-contributed Pull Requests _must_ be associated with an open issue. If you submit a Pull Request that does not pertain to an open issue, you will be asked to create an issue describing the problem before the Pull Request can be reviewed.
### Discussing the idea
After you open an issue, a `dbt` maintainer will follow up by commenting on your issue (usually within 1-3 days) to explore your idea further and advise on how to implement the suggested changes. In many cases, community members will chime in with their own thoughts on the problem statement. If you as the issue creator are interested in submitting a Pull Request to address the issue, you should indicate this in the body of the issue. The `dbt` maintainers are _always_ happy to help contributors with the implementation of fixes and features, so please also indicate if there's anything you're unsure about or could use guidance around in the issue.
After you open an issue, a `dbt-core` maintainer will follow up by commenting on your issue (usually within 1-3 days) to explore your idea further and advise on how to implement the suggested changes. In many cases, community members will chime in with their own thoughts on the problem statement. If you as the issue creator are interested in submitting a Pull Request to address the issue, you should indicate this in the body of the issue. The `dbt-core` maintainers are _always_ happy to help contributors with the implementation of fixes and features, so please also indicate if there's anything you're unsure about or could use guidance around in the issue.
### Submitting a change
If an issue is appropriately well scoped and describes a beneficial change to the `dbt` codebase, then anyone may submit a Pull Request to implement the functionality described in the issue. See the sections below on how to do this.
If an issue is appropriately well scoped and describes a beneficial change to the `dbt-core` codebase, then anyone may submit a Pull Request to implement the functionality described in the issue. See the sections below on how to do this.
The `dbt` maintainers will add a `good first issue` label if an issue is suitable for a first-time contributor. This label often means that the required code change is small, limited to one database adapter, or a net-new addition that does not impact existing functionality. You can see the list of currently open issues on the [Contribute](https://github.com/dbt-labs/dbt-core/contribute) page.
The `dbt-core` maintainers will add a `good first issue` label if an issue is suitable for a first-time contributor. This label often means that the required code change is small, limited to one database adapter, or a net-new addition that does not impact existing functionality. You can see the list of currently open issues on the [Contribute](https://github.com/dbt-labs/dbt-core/contribute) page.
Here's a good workflow:
- Comment on the open issue, expressing your interest in contributing the required code change
- Outline your planned implementation. If you want help getting started, ask!
- Follow the steps outlined below to develop locally. Once you have opened a PR, one of the `dbt` maintainers will work with you to review your code.
- Add a test! Tests are crucial for both fixes and new features alike. We want to make sure that code works as intended, and that it avoids any bugs previously encountered. Currently, the best resource for understanding `dbt`'s [unit](test/unit) and [integration](test/integration) tests is the tests themselves. One of the maintainers can help by pointing out relevant examples.
- Follow the steps outlined below to develop locally. Once you have opened a PR, one of the `dbt-core` maintainers will work with you to review your code.
- Add a test! Tests are crucial for both fixes and new features alike. We want to make sure that code works as intended, and that it avoids any bugs previously encountered. Currently, the best resource for understanding `dbt-core`'s [unit](test/unit) and [integration](test/integration) tests is the tests themselves. One of the maintainers can help by pointing out relevant examples.
In some cases, the right resolution to an open issue might be tangential to the `dbt` codebase. The right path forward might be a documentation update or a change that can be made in user-space. In other cases, the issue might describe functionality that the `dbt` maintainers are unwilling or unable to incorporate into the `dbt` codebase. When it is determined that an open issue describes functionality that will not translate to a code change in the `dbt` repository, the issue will be tagged with the `wontfix` label (see below) and closed.
In some cases, the right resolution to an open issue might be tangential to the `dbt-core` codebase. The right path forward might be a documentation update or a change that can be made in user-space. In other cases, the issue might describe functionality that the `dbt-core` maintainers are unwilling or unable to incorporate into the `dbt-core` codebase. When it is determined that an open issue describes functionality that will not translate to a code change in the `dbt-core` repository, the issue will be tagged with the `wontfix` label (see below) and closed.
### Using issue labels
The `dbt` maintainers use labels to categorize open issues. Some labels indicate the databases impacted by the issue, while others describe the domain in the `dbt` codebase germane to the discussion. While most of these labels are self-explanatory (eg. `snowflake` or `bigquery`), there are others that are worth describing.
The `dbt-core` maintainers use labels to categorize open issues. Most labels describe the domain in the `dbt-core` codebase germane to the discussion.
| tag | description |
| --- | ----------- |
| [triage](https://github.com/dbt-labs/dbt-core/labels/triage) | This is a new issue which has not yet been reviewed by a `dbt` maintainer. This label is removed when a maintainer reviews and responds to the issue. |
| [bug](https://github.com/dbt-labs/dbt-core/labels/bug) | This issue represents a defect or regression in `dbt` |
| [enhancement](https://github.com/dbt-labs/dbt-core/labels/enhancement) | This issue represents net-new functionality in `dbt` |
| [good first issue](https://github.com/dbt-labs/dbt-core/labels/good%20first%20issue) | This issue does not require deep knowledge of the `dbt` codebase to implement. This issue is appropriate for a first-time contributor. |
| [triage](https://github.com/dbt-labs/dbt-core/labels/triage) | This is a new issue which has not yet been reviewed by a `dbt-core` maintainer. This label is removed when a maintainer reviews and responds to the issue. |
| [bug](https://github.com/dbt-labs/dbt-core/labels/bug) | This issue represents a defect or regression in `dbt-core` |
| [enhancement](https://github.com/dbt-labs/dbt-core/labels/enhancement) | This issue represents net-new functionality in `dbt-core` |
| [good first issue](https://github.com/dbt-labs/dbt-core/labels/good%20first%20issue) | This issue does not require deep knowledge of the `dbt-core` codebase to implement. This issue is appropriate for a first-time contributor. |
| [help wanted](https://github.com/dbt-labs/dbt-core/labels/help%20wanted) / [discussion](https://github.com/dbt-labs/dbt-core/labels/discussion) | Conversation around this issue is ongoing, and there isn't yet a clear path forward. Input from community members is most welcome. |
| [duplicate](https://github.com/dbt-labs/dbt-core/issues/duplicate) | This issue is functionally identical to another open issue. The `dbt` maintainers will close this issue and encourage community members to focus conversation on the other one. |
| [snoozed](https://github.com/dbt-labs/dbt-core/labels/snoozed) | This issue describes a good idea, but one which will probably not be addressed in a six-month time horizon. The `dbt` maintainers will revisit these issues periodically and re-prioritize them accordingly. |
| [stale](https://github.com/dbt-labs/dbt-core/labels/stale) | This is an old issue which has not recently been updated. Stale issues will periodically be closed by `dbt` maintainers, but they can be re-opened if the discussion is restarted. |
| [wontfix](https://github.com/dbt-labs/dbt-core/labels/wontfix) | This issue does not require a code change in the `dbt` repository, or the maintainers are unwilling/unable to merge a Pull Request which implements the behavior described in the issue. |
| [duplicate](https://github.com/dbt-labs/dbt-core/issues/duplicate) | This issue is functionally identical to another open issue. The `dbt-core` maintainers will close this issue and encourage community members to focus conversation on the other one. |
| [snoozed](https://github.com/dbt-labs/dbt-core/labels/snoozed) | This issue describes a good idea, but one which will probably not be addressed in a six-month time horizon. The `dbt-core` maintainers will revisit these issues periodically and re-prioritize them accordingly. |
| [stale](https://github.com/dbt-labs/dbt-core/labels/stale) | This is an old issue which has not recently been updated. Stale issues will periodically be closed by `dbt-core` maintainers, but they can be re-opened if the discussion is restarted. |
| [wontfix](https://github.com/dbt-labs/dbt-core/labels/wontfix) | This issue does not require a code change in the `dbt-core` repository, or the maintainers are unwilling/unable to merge a Pull Request which implements the behavior described in the issue. |
#### Branching Strategy
`dbt` has three types of branches:
`dbt-core` has three types of branches:
- **Trunks** are where active development of the next release takes place. At the time of writing, there is one trunk, named `develop`, and it will be the default branch of the repository.
- **Release Branches** track a specific, not yet complete release of `dbt`. Each minor version release has a corresponding release branch. For example, the `0.11.x` series of releases has a branch called `0.11.latest`. This allows us to release new patch versions under `0.11` without necessarily needing to pull them into the latest version of `dbt`.
- **Trunks** are where active development of the next release takes place. There is one trunk, named `main` at the time of writing, and it will be the default branch of the repository.
- **Release Branches** track a specific, not yet complete release of `dbt-core`. Each minor version release has a corresponding release branch. For example, the `0.11.x` series of releases has a branch called `0.11.latest`. This allows us to release new patch versions under `0.11` without necessarily needing to pull them into the latest version of `dbt-core`.
- **Feature Branches** track individual features and fixes. On completion they should be merged into the trunk branch or a specific release branch.
## Getting the code
### Installing git
You will need `git` in order to download and modify the `dbt` source code. On macOS, the best way to download git is to just install [Xcode](https://developer.apple.com/support/xcode/).
You will need `git` in order to download and modify the `dbt-core` source code. On macOS, the best way to download git is to just install [Xcode](https://developer.apple.com/support/xcode/).
### External contributors
If you are not a member of the `dbt-labs` GitHub organization, you can contribute to `dbt` by forking the `dbt` repository. For a detailed overview on forking, check out the [GitHub docs on forking](https://help.github.com/en/articles/fork-a-repo). In short, you will need to:
If you are not a member of the `dbt-labs` GitHub organization, you can contribute to `dbt-core` by forking the `dbt-core` repository. For a detailed overview on forking, check out the [GitHub docs on forking](https://help.github.com/en/articles/fork-a-repo). In short, you will need to:
1. fork the `dbt` repository
1. fork the `dbt-core` repository
2. clone your fork locally
3. check out a new branch for your proposed changes
4. push changes to your fork
5. open a pull request against `dbt-labs/dbt` from your forked repository
### Core contributors
### dbt Labs contributors
If you are a member of the `dbt-labs` GitHub organization, you will have push access to the `dbt` repo. Rather than forking `dbt` to make your changes, just clone the repository, check out a new branch, and push directly to that branch.
If you are a member of the `dbt-labs` GitHub organization, you will have push access to the `dbt-core` repo. Rather than forking `dbt-core` to make your changes, just clone the repository, check out a new branch, and push directly to that branch.
## Setting up an environment
There are some tools that will be helpful to you in developing locally. While this is the list relevant for `dbt` development, many of these tools are used commonly across open-source python projects.
There are some tools that will be helpful to you in developing locally. While this is the list relevant for `dbt-core` development, many of these tools are used commonly across open-source python projects.
### Tools
A short list of tools used in `dbt` testing that will be helpful to your understanding:
A short list of tools used in `dbt-core` testing that will be helpful to your understanding:
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.6, Python 3.7, Python 3.8, and Python 3.9
- [`tox`](https://tox.readthedocs.io/en/latest/) to manage virtualenvs across python versions. We currently target the latest patch releases for Python 3.7, Python 3.8, and Python 3.9
- [`pytest`](https://docs.pytest.org/en/latest/) to discover/run tests
- [`make`](https://users.cs.duke.edu/~ola/courses/programming/Makefiles/Makefiles.html) - but don't worry too much, nobody _really_ understands how make works and our Makefile is super simple
- [`flake8`](https://flake8.pycqa.org/en/latest/) for code linting
- [`mypy`](https://mypy.readthedocs.io/en/stable/) for static type checking
- [GitHub Actions](https://github.com/features/actions)
A deep understanding of these tools is not required to effectively contribute to `dbt`, but we recommend checking out the attached documentation if you're interested in learning more about them.
A deep understanding of these tools is not required to effectively contribute to `dbt-core`, but we recommend checking out the attached documentation if you're interested in learning more about them.
#### virtual environments
We strongly recommend using virtual environments when developing code in `dbt`. We recommend creating this virtualenv
in the root of the `dbt` repository. To create a new virtualenv, run:
We strongly recommend using virtual environments when developing code in `dbt-core`. We recommend creating this virtualenv
in the root of the `dbt-core` repository. To create a new virtualenv, run:
```sh
python3 -m venv env
source env/bin/activate
@@ -135,11 +135,11 @@ For testing, and later in the examples in this document, you may want to have `p
brew install postgresql
```
## Running `dbt` in development
## Running `dbt-core` in development
### Installation
First make sure that you set up your `virtualenv` as described in [Setting up an environment](#setting-up-an-environment). Also ensure you have the latest version of pip installed with `pip install --upgrade pip`. Next, install `dbt` (and its dependencies) with:
First make sure that you set up your `virtualenv` as described in [Setting up an environment](#setting-up-an-environment). Also ensure you have the latest version of pip installed with `pip install --upgrade pip`. Next, install `dbt-core` (and its dependencies) with:
```sh
make dev
@@ -147,23 +147,24 @@ make dev
pip install -r dev-requirements.txt -r editable-requirements.txt
```
When `dbt` is installed this way, any changes you make to the `dbt` source code will be reflected immediately in your next `dbt` run.
When `dbt-core` is installed this way, any changes you make to the `dbt-core` source code will be reflected immediately in your next `dbt-core` run.
### Running `dbt`
With your virtualenv activated, the `dbt` script should point back to the source code you've cloned on your machine. You can verify this by running `which dbt`. This command should show you a path to an executable in your virtualenv.
### Running `dbt-core`
With your virtualenv activated, the `dbt-core` script should point back to the source code you've cloned on your machine. You can verify this by running `which dbt`. This command should show you a path to an executable in your virtualenv.
Configure your [profile](https://docs.getdbt.com/docs/configure-your-profile) as necessary to connect to your target databases. It may be a good idea to add a new profile pointing to a local postgres instance, or a specific test sandbox within your data warehouse if appropriate.
## Testing
Getting the `dbt` integration tests set up in your local environment will be very helpful as you start to make changes to your local version of `dbt`. The section that follows outlines some helpful tips for setting up the test environment.
Getting the `dbt-core` integration tests set up in your local environment will be very helpful as you start to make changes to your local version of `dbt-core`. The section that follows outlines some helpful tips for setting up the test environment.
Although `dbt` works with a number of different databases, you won't need to supply credentials for every one of these databases in your test environment. Instead you can test all dbt-core code changes with Python and Postgres.
Although `dbt-core` works with a number of different databases, you won't need to supply credentials for every one of these databases in your test environment. Instead you can test all dbt-core code changes with Python and Postgres.
### Initial setup
We recommend starting with `dbt`'s Postgres tests. These tests cover most of the functionality in `dbt`, are the fastest to run, and are the easiest to set up. To run the Postgres integration tests, you'll have to do one extra step of setting up the test database:
We recommend starting with `dbt-core`'s Postgres tests. These tests cover most of the functionality in `dbt-core`, are the fastest to run, and are the easiest to set up. To run the Postgres integration tests, you'll have to do one extra step of setting up the test database:
```sh
make setup-db
@@ -174,15 +175,6 @@ docker-compose up -d database
PGHOST=localhost PGUSER=root PGPASSWORD=password PGDATABASE=postgres bash test/setup_db.sh
```
`dbt` uses test credentials specified in a `test.env` file in the root of the repository for non-Postgres databases. This `test.env` file is git-ignored, but please be _extra_ careful to never check in credentials or other sensitive information when developing against `dbt`. To create your `test.env` file, copy the provided sample file, then supply your relevant credentials. This step is only required to use non-Postgres databases.
```
cp test.env.sample test.env
$EDITOR test.env
```
> In general, it's most important to have successful unit and Postgres tests. Once you open a PR, `dbt` will automatically run integration tests for the other three core database adapters. Of course, if you are a BigQuery user, contributing a BigQuery-only feature, it's important to run BigQuery tests as well.
### Test commands
There are a few methods for running tests locally.
@@ -208,9 +200,9 @@ suites.
[`tox`](https://tox.readthedocs.io/en/latest/) takes care of managing virtualenvs and installing dependencies in order to run
tests. You can also run tests in parallel, for example, you can run unit tests
for Python 3.6, Python 3.7, Python 3.8, `flake8` checks, and `mypy` checks in
for Python 3.7, Python 3.8, Python 3.9, `flake8` checks, and `mypy` checks in
parallel with `tox -p`. Also, you can run unit tests for specific python versions
with `tox -e py36`. The configuration for these tests is located in `tox.ini`.
with `tox -e py37`. The configuration for these tests is located in `tox.ini`.
#### `pytest`
@@ -230,6 +222,8 @@ python -m pytest test/unit/test_graph.py::GraphTest::test__dependency_list
dbt Labs provides a CI environment to test changes to specific adapters, and periodic maintenance checks of `dbt-core` through GitHub Actions. For example, if you submit a pull request to the `dbt-redshift` repo, GitHub will trigger automated code checks and tests against Redshift.
A `dbt` maintainer will review your PR. They may suggest code revision for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.
A `dbt-core` maintainer will review your PR. They may suggest code revision for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.
- First-time contributors should note that code checks and unit tests require a maintainer to approve them.
Once all tests are passing and your PR has been approved, a `dbt` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:
Once all tests are passing and your PR has been approved, a `dbt-core` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:

@@ -1,3 +1,8 @@
##
# This dockerfile is used for local development and adapter testing only.
# See `/docker` for a generic and production-ready docker file
##
FROM ubuntu:20.04
ENV DEBIAN_FRONTEND noninteractive

core/dbt/README.md Normal file
@@ -0,0 +1,52 @@
# core/dbt directory README
## The following are individual files in this directory.
### deprecations.py
### flags.py
### main.py
### tracking.py
### version.py
### lib.py
### node_types.py
### helper_types.py
### links.py
### semver.py
### ui.py
### compilation.py
### dataclass_schema.py
### exceptions.py
### hooks.py
### logger.py
### profiler.py
### utils.py
## The subdirectories will be documented in a README in the subdirectory
* config
* include
* adapters
* context
* deps
* graph
* task
* clients
* events

@@ -0,0 +1 @@
# Adapters README

@@ -39,7 +39,7 @@ from dbt.adapters.base.relation import (
ComponentName, BaseRelation, InformationSchema, SchemaSearchMap
)
from dbt.adapters.base import Column as BaseColumn
from dbt.adapters.cache import RelationsCache
from dbt.adapters.cache import RelationsCache, _make_key
SeedModel = Union[ParsedSeedNode, CompiledSeedNode]
@@ -291,7 +291,7 @@ class BaseAdapter(metaclass=AdapterMeta):
if (database, schema) not in self.cache:
fire_event(
CacheMiss(
conn_name=self.nice_connection_name,
conn_name=self.nice_connection_name(),
database=database,
schema=schema
)
@@ -676,7 +676,11 @@ class BaseAdapter(metaclass=AdapterMeta):
relations = self.list_relations_without_caching(
schema_relation
)
fire_event(ListRelations(database=database, schema=schema, relations=relations))
fire_event(ListRelations(
database=database,
schema=schema,
relations=[_make_key(x) for x in relations]
))
return relations

@@ -89,7 +89,10 @@ class BaseRelation(FakeAPIObject, Hashable):
if not self._is_exactish_match(k, v):
exact_match = False
if self.path.get_lowered_part(k) != v.lower():
if (
self.path.get_lowered_part(k).strip(self.quote_character) !=
v.lower().strip(self.quote_character)
):
approximate_match = False
if approximate_match and not exact_match:
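
Note on the hunk above: the new comparison strips the relation's quote character from both sides before the case-insensitive check, so a quoted and an unquoted spelling of the same name count as an approximate match. A minimal sketch of that idea, assuming a double quote as the quote character; `approximately_equal` is a hypothetical helper for illustration and is not part of dbt:

```python
def approximately_equal(left: str, right: str, quote_character: str = '"') -> bool:
    """Compare two identifiers, ignoring case and surrounding quote characters.

    Hypothetical helper: the real check lives inside BaseRelation and uses
    self.quote_character rather than a function argument.
    """
    return (
        left.lower().strip(quote_character)
        == right.lower().strip(quote_character)
    )


quoted_vs_plain = approximately_equal('"Orders"', 'orders')
different_names = approximately_equal('orders', 'customers')
```

Here `quoted_vs_plain` is true and `different_names` is false; before the change, the first pair would have failed the approximate match because of the surrounding quotes.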

@@ -1,8 +1,8 @@
import threading
from collections import namedtuple
from copy import deepcopy
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple
from dbt.adapters.reference_keys import _make_key, _ReferenceKey
import dbt.exceptions
from dbt.events.functions import fire_event
from dbt.events.types import (
@@ -21,18 +21,7 @@ from dbt.events.types import (
UpdateReference
)
from dbt.utils import lowercase
_ReferenceKey = namedtuple('_ReferenceKey', 'database schema identifier')
def _make_key(relation) -> _ReferenceKey:
"""Make _ReferenceKeys with lowercase values for the cache so we don't have
to keep track of quoting
"""
# databases and schemas can both be None
return _ReferenceKey(lowercase(relation.database),
lowercase(relation.schema),
lowercase(relation.identifier))
from dbt.helper_types import Lazy
def dot_separated(key: _ReferenceKey) -> str:
@@ -303,11 +292,12 @@ class RelationsCache:
:raises InternalError: If either entry does not exist.
"""
ref_key = _make_key(referenced)
dep_key = _make_key(dependent)
if (ref_key.database, ref_key.schema) not in self:
# if we have not cached the referenced schema at all, we must be
# referring to a table outside our control. There's no need to make
# a link - we will never drop the referenced relation during a run.
fire_event(UncachedRelation(dep_key=dependent, ref_key=ref_key))
fire_event(UncachedRelation(dep_key=dep_key, ref_key=ref_key))
return
if ref_key not in self.relations:
# Insert a dummy "external" relation.
@@ -315,8 +305,6 @@ class RelationsCache:
type=referenced.External
)
self.add(referenced)
dep_key = _make_key(dependent)
if dep_key not in self.relations:
# Insert a dummy "external" relation.
dependent = dependent.replace(
@@ -334,12 +322,12 @@ class RelationsCache:
:param BaseRelation relation: The underlying relation.
"""
cached = _CachedRelation(relation)
fire_event(AddRelation(relation=cached))
fire_event(DumpBeforeAddGraph(graph_func=self.dump_graph))
fire_event(AddRelation(relation=_make_key(cached)))
fire_event(DumpBeforeAddGraph(dump=Lazy.defer(lambda: self.dump_graph())))
with self.lock:
self._setdefault(cached)
fire_event(DumpAfterAddGraph(graph_func=self.dump_graph))
fire_event(DumpAfterAddGraph(dump=Lazy.defer(lambda: self.dump_graph())))
def _remove_refs(self, keys):
"""Removes all references to all entries in keys. This does not
@@ -354,17 +342,17 @@ class RelationsCache:
for cached in self.relations.values():
cached.release_references(keys)
def _drop_cascade_relation(self, dropped):
def _drop_cascade_relation(self, dropped_key):
"""Drop the given relation and cascade it appropriately to all
dependent relations.
:param _CachedRelation dropped: An existing _CachedRelation to drop.
"""
if dropped not in self.relations:
fire_event(DropMissingRelation(relation=dropped))
if dropped_key not in self.relations:
fire_event(DropMissingRelation(relation=dropped_key))
return
consequences = self.relations[dropped].collect_consequences()
fire_event(DropCascade(dropped=dropped, consequences=consequences))
consequences = self.relations[dropped_key].collect_consequences()
fire_event(DropCascade(dropped=dropped_key, consequences=consequences))
self._remove_refs(consequences)
def drop(self, relation):
@@ -378,10 +366,10 @@ class RelationsCache:
:param str schema: The schema of the relation to drop.
:param str identifier: The identifier of the relation to drop.
"""
dropped = _make_key(relation)
fire_event(DropRelation(dropped=dropped))
dropped_key = _make_key(relation)
fire_event(DropRelation(dropped=dropped_key))
with self.lock:
self._drop_cascade_relation(dropped)
self._drop_cascade_relation(dropped_key)
def _rename_relation(self, old_key, new_relation):
"""Rename a relation named old_key to new_key, updating references.
@@ -453,7 +441,7 @@ class RelationsCache:
new_key = _make_key(new)
fire_event(RenameSchema(old_key=old_key, new_key=new_key))
fire_event(DumpBeforeRenameSchema(graph_func=self.dump_graph))
fire_event(DumpBeforeRenameSchema(dump=Lazy.defer(lambda: self.dump_graph())))
with self.lock:
if self._check_rename_constraints(old_key, new_key):
@@ -461,7 +449,7 @@ class RelationsCache:
else:
self._setdefault(_CachedRelation(new))
fire_event(DumpAfterRenameSchema(graph_func=self.dump_graph))
fire_event(DumpAfterRenameSchema(dump=Lazy.defer(lambda: self.dump_graph())))
def get_relations(
self, database: Optional[str], schema: Optional[str]
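
Note on the `Lazy.defer(...)` calls above: the dump events now carry a deferred computation instead of a bound method, so `dump_graph` only runs if something actually needs the value. The sketch below shows the general pattern with a stand-in class. `LazySketch` and its `force` method are assumptions made for illustration; the diff only shows `Lazy.defer`, and the real `dbt.helper_types.Lazy` may differ.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class LazySketch(Generic[T]):
    """Stand-in for dbt.helper_types.Lazy (assumed shape, not the real class)."""

    def __init__(self, func: Callable[[], T]) -> None:
        self._func = func
        self._value: Optional[T] = None
        self._forced = False

    @classmethod
    def defer(cls, func: Callable[[], T]) -> "LazySketch[T]":
        # Store the callable without running it.
        return cls(func)

    def force(self) -> T:
        # Run the deferred callable at most once and remember the result.
        if not self._forced:
            self._value = self._func()
            self._forced = True
        return self._value  # type: ignore[return-value]


calls: List[str] = []


def dump_graph() -> dict:
    calls.append("dump")  # record that the expensive work actually ran
    return {"db.schema.table": []}


payload = LazySketch.defer(lambda: dump_graph())
calls_before_force = list(calls)   # still empty: building the event is cheap
graph = payload.force()            # the dump happens here, on demand
payload.force()                    # a second call reuses the cached value
```

With this shape, an event that is filtered out before being rendered never pays for the dump at all.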

@@ -0,0 +1,24 @@
# this module exists to resolve circular imports with the events module
from collections import namedtuple
from typing import Optional
_ReferenceKey = namedtuple('_ReferenceKey', 'database schema identifier')
def lowercase(value: Optional[str]) -> Optional[str]:
if value is None:
return None
else:
return value.lower()
def _make_key(relation) -> _ReferenceKey:
"""Make _ReferenceKeys with lowercase values for the cache so we don't have
to keep track of quoting
"""
# databases and schemas can both be None
return _ReferenceKey(lowercase(relation.database),
lowercase(relation.schema),
lowercase(relation.identifier))

@@ -75,7 +75,8 @@ class SQLConnectionManager(BaseConnectionManager):
fire_event(
SQLQueryStatus(
status=self.get_response(cursor), elapsed=round((time.time() - pre), 2)
status=str(self.get_response(cursor)),
elapsed=round((time.time() - pre), 2)
)
)

@@ -5,6 +5,7 @@ import dbt.clients.agate_helper
from dbt.contracts.connection import Connection
import dbt.exceptions
from dbt.adapters.base import BaseAdapter, available
from dbt.adapters.cache import _make_key
from dbt.adapters.sql import SQLConnectionManager
from dbt.events.functions import fire_event
from dbt.events.types import ColTypeChange, SchemaCreation, SchemaDrop
@@ -122,7 +123,7 @@ class SQLAdapter(BaseAdapter):
ColTypeChange(
orig_type=target_column.data_type,
new_type=new_type,
table=current,
table=_make_key(current),
)
)
@@ -182,7 +183,7 @@ class SQLAdapter(BaseAdapter):
def create_schema(self, relation: BaseRelation) -> None:
relation = relation.without_identifier()
fire_event(SchemaCreation(relation=relation))
fire_event(SchemaCreation(relation=_make_key(relation)))
kwargs = {
'relation': relation,
}
@@ -193,7 +194,7 @@ class SQLAdapter(BaseAdapter):
def drop_schema(self, relation: BaseRelation) -> None:
relation = relation.without_identifier()
fire_event(SchemaDrop(relation=relation))
fire_event(SchemaDrop(relation=_make_key(relation)))
kwargs = {
'relation': relation,
}

@@ -0,0 +1 @@
# Clients README

@@ -13,6 +13,18 @@ from dbt.exceptions import RuntimeException
BOM = BOM_UTF8.decode('utf-8') # '\ufeff'
class Number(agate.data_types.Number):
# undo the change in https://github.com/wireservice/agate/pull/733
# i.e. do not cast True and False to numeric 1 and 0
def cast(self, d):
if type(d) == bool:
raise agate.exceptions.CastError(
'Do not cast True to 1 or False to 0.'
)
else:
return super().cast(d)
class ISODateTime(agate.data_types.DateTime):
def cast(self, d):
# this is agate.data_types.DateTime.cast with the "clever" bits removed
@@ -41,7 +53,7 @@ def build_type_tester(
) -> agate.TypeTester:
types = [
agate.data_types.Number(null_values=('null', '')),
Number(null_values=('null', '')),
agate.data_types.Date(null_values=('null', ''),
date_format='%Y-%m-%d'),
agate.data_types.DateTime(null_values=('null', ''),
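
Note on the `Number` subclass above: in Python `bool` is a subclass of `int`, so a numeric cast will happily turn `True` into `1` unless booleans are rejected first. The sketch below reproduces the override pattern without depending on agate. `CastError` and `BaseNumber` are simplified stand-ins written for this example, not agate's actual classes.

```python
from decimal import Decimal, InvalidOperation


class CastError(Exception):
    """Stand-in for agate.exceptions.CastError."""


class BaseNumber:
    """Stand-in for agate.data_types.Number, reduced to a Decimal cast."""

    def cast(self, d):
        try:
            return Decimal(d)
        except (InvalidOperation, TypeError, ValueError) as exc:
            raise CastError(f'Cannot convert {d!r} to a number.') from exc


class Number(BaseNumber):
    def cast(self, d):
        # Reject booleans before delegating, so True/False are not
        # silently converted to 1/0 by the parent cast.
        if type(d) == bool:
            raise CastError('Do not cast True to 1 or False to 0.')
        return super().cast(d)


parent_result = BaseNumber().cast(True)   # parent accepts the bool
try:
    Number().cast(True)
    bool_rejected = False
except CastError:
    bool_rejected = True
numeric_result = Number().cast('3')       # ordinary values still cast
```

Because the type tester tries each type in order, rejecting booleans here lets a later type in the list claim boolean-looking columns instead.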

@@ -8,7 +8,10 @@ from dbt.events.types import (
GitProgressUpdatingExistingDependency, GitProgressPullingNewDependency,
GitNothingToDo, GitProgressUpdatedCheckoutRange, GitProgressCheckedOutAt
)
import dbt.exceptions
from dbt.exceptions import (
CommandResultError, RuntimeException, bad_package_spec, raise_git_cloning_error,
raise_git_cloning_problem
)
from packaging import version
@@ -22,9 +25,9 @@ def _raise_git_cloning_error(repo, revision, error):
if 'usage: git' in stderr:
stderr = stderr.split('\nusage: git')[0]
if re.match("fatal: destination path '(.+)' already exists", stderr):
raise error
raise_git_cloning_error(error)
dbt.exceptions.bad_package_spec(repo, revision, stderr)
bad_package_spec(repo, revision, stderr)
def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirectory=None):
@@ -53,7 +56,7 @@ def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirec
clone_cmd.append(dirname)
try:
result = run_cmd(cwd, clone_cmd, env={'LC_ALL': 'C'})
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
_raise_git_cloning_error(repo, revision, exc)
if subdirectory:
@@ -61,7 +64,7 @@ def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirec
clone_cmd_subdir = ['git', 'sparse-checkout', 'set', subdirectory]
try:
run_cmd(cwd_subdir, clone_cmd_subdir)
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
_raise_git_cloning_error(repo, revision, exc)
if remove_git_dir:
@@ -105,9 +108,9 @@ def checkout(cwd, repo, revision=None):
revision = 'HEAD'
try:
return _checkout(cwd, repo, revision)
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
stderr = exc.stderr.decode('utf-8').strip()
dbt.exceptions.bad_package_spec(repo, revision, stderr)
bad_package_spec(repo, revision, stderr)
def get_current_sha(cwd):
@@ -131,14 +134,11 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
remove_git_dir=remove_git_dir,
subdirectory=subdirectory,
)
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
err = exc.stderr.decode('utf-8')
exists = re.match("fatal: destination path '(.+)' already exists", err)
if not exists:
print(
'\nSomething went wrong while cloning {}'.format(repo) +
'\nCheck the debug logs for more information')
raise
raise_git_cloning_problem(repo)
directory = None
start_sha = None
@@ -148,7 +148,7 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
else:
matches = re.match("Cloning into '(.+)'", err.decode('utf-8'))
if matches is None:
raise dbt.exceptions.RuntimeException(
raise RuntimeException(
f'Error cloning {repo} - never saw "Cloning into ..." from git'
)
directory = matches.group(1)

@@ -33,7 +33,12 @@ def _get(path, registry_base_url=None):
resp = requests.get(url, timeout=30)
fire_event(RegistryProgressGETResponse(url=url, resp_code=resp.status_code))
resp.raise_for_status()
if resp is None:
# It is unexpected for the content of the response to be None so if it is, raising this error
# will cause this function to retry (if called within _get_with_retries) and hopefully get
# a response. This seems to happen when there's an issue with the Hub.
# See https://github.com/dbt-labs/dbt-core/issues/4577
if resp.json() is None:
raise requests.exceptions.ContentDecodingError(
'Request error: The response is None', response=resp
)
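
Note on the check above: raising when the decoded body is `None` turns an empty Hub response into an error that a retry wrapper can catch. The comment in the hunk mentions `_get_with_retries`, but its body is not shown in this diff, so the following is only a sketch of the pattern. `EmptyResponseError`, `get_checked`, `get_with_retries`, and the `max_attempts` and `delay_seconds` parameters are all hypothetical names.

```python
import time
from typing import Any, Callable


class EmptyResponseError(Exception):
    """Hypothetical stand-in for requests.exceptions.ContentDecodingError."""


def get_checked(fetch_json: Callable[[], Any]) -> Any:
    # Treat a None payload as a failure so the caller can retry.
    payload = fetch_json()
    if payload is None:
        raise EmptyResponseError('Request error: The response is None')
    return payload


def get_with_retries(
    fetch_json: Callable[[], Any],
    max_attempts: int = 5,
    delay_seconds: float = 0.0,
) -> Any:
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for attempt in range(1, max_attempts + 1):
        try:
            return get_checked(fetch_json)
        except EmptyResponseError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay_seconds)


# Simulate a registry that returns nothing twice, then a real payload.
responses = iter([None, None, {'name': 'dbt_utils'}])
result = get_with_retries(lambda: next(responses))
```

The important design point is that the emptiness check raises instead of returning `None`, which keeps the retry loop's only failure signal an exception.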

@@ -441,7 +441,7 @@ def run_cmd(
fire_event(SystemStdErrMsg(bmsg=err))
if proc.returncode != 0:
fire_event(SystemReportReturnCode(code=proc.returncode))
fire_event(SystemReportReturnCode(returncode=proc.returncode))
raise dbt.exceptions.CommandResultError(cwd, cmd, proc.returncode,
out, err)
@@ -485,7 +485,7 @@ def untar_package(
) -> None:
tar_path = convert_path(tar_path)
tar_dir_name = None
with tarfile.open(tar_path, 'r') as tarball:
with tarfile.open(tar_path, 'r:gz') as tarball:
tarball.extractall(dest_dir)
tar_dir_name = os.path.commonprefix(tarball.getnames())
if rename_to:
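
Note on the `'r:gz'` change above: with mode `'r'`, `tarfile.open` detects the compression transparently, while `'r:gz'` insists on gzip and raises `tarfile.ReadError` for anything else. A self-contained sketch of that difference follows; the archive and member names are made up for the example.

```python
import io
import os
import tarfile
import tempfile

payload = b'name: my_package\n'
members = ('my_package-1.0.0/dbt_project.yml', 'my_package-1.0.0/README.md')

with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, 'package.tar.gz')
    plain_path = os.path.join(tmp, 'package.tar')

    # Write the same two members once gzip-compressed and once uncompressed.
    for path, mode in ((gz_path, 'w:gz'), (plain_path, 'w')):
        with tarfile.open(path, mode) as tarball:
            for member in members:
                info = tarfile.TarInfo(member)
                info.size = len(payload)
                tarball.addfile(info, io.BytesIO(payload))

    # A real gzip tarball opens fine with the explicit mode.
    with tarfile.open(gz_path, 'r:gz') as tarball:
        names = tarball.getnames()
        tar_dir_name = os.path.commonprefix(names)

    # An uncompressed tarball is rejected instead of being auto-detected.
    try:
        with tarfile.open(plain_path, 'r:gz'):
            rejected = False
    except tarfile.ReadError:
        rejected = True
```

Failing fast on a non-gzip download fits the retry work in this comparison: a truncated or wrong-format file surfaces as an error at open time.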

@@ -3,6 +3,7 @@ from collections import defaultdict
from typing import List, Dict, Any, Tuple, cast, Optional
import networkx as nx # type: ignore
import pickle
import sqlparse
from dbt import flags
@@ -91,6 +92,8 @@ def _generate_stats(manifest: Manifest):
stats[source.resource_type] += 1
for exposure in manifest.exposures.values():
stats[exposure.resource_type] += 1
for metric in manifest.metrics.values():
stats[metric.resource_type] += 1
for macro in manifest.macros.values():
stats[macro.resource_type] += 1
return stats
@@ -160,7 +163,8 @@ class Linker:
for node_id in self.graph:
data = manifest.expect(node_id).to_dict(omit_none=True)
out_graph.add_node(node_id, **data)
nx.write_gpickle(out_graph, outfile)
with open(outfile, 'wb') as outfh:
pickle.dump(out_graph, outfh, protocol=pickle.HIGHEST_PROTOCOL)
class Compiler:
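
Note on the `write_graph` change above: the networkx helper is replaced by a direct `pickle.dump` on an open binary file handle. The round trip can be sketched with the standard library alone. A plain dict stands in for the networkx graph here, and the file name is only an example.

```python
import os
import pickle
import tempfile

# Stand-in for out_graph: node ids mapped to their serialized attributes.
out_graph = {
    'model.my_project.orders': {'resource_type': 'model'},
    'model.my_project.customers': {'resource_type': 'model'},
}

with tempfile.TemporaryDirectory() as tmp:
    outfile = os.path.join(tmp, 'graph.gpickle')

    # Same call shape as the new code: binary mode, highest protocol.
    with open(outfile, 'wb') as outfh:
        pickle.dump(out_graph, outfh, protocol=pickle.HIGHEST_PROTOCOL)

    with open(outfile, 'rb') as infh:
        restored = pickle.load(infh)
```

Anything that reads the file back only needs `pickle.load`, and as with any pickle file it should only be loaded from a trusted source.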

@@ -0,0 +1 @@
# Config README

@@ -45,7 +45,7 @@ INVALID_VERSION_ERROR = """\
This version of dbt is not supported with the '{package}' package.
Installed version of dbt: {installed}
Required version of dbt for '{package}': {version_spec}
Check the requirements for the '{package}' package, or run dbt again with \
Check for a different version of the '{package}' package, or run dbt again with \
--no-version-check
"""
@@ -54,7 +54,7 @@ IMPOSSIBLE_VERSION_ERROR = """\
The package version requirement can never be satisfied for the '{package}
package.
Required versions of dbt for '{package}': {version_spec}
Check the requirements for the '{package}' package, or run dbt again with \
Check for a different version of the '{package}' package, or run dbt again with \
--no-version-check
"""
@@ -305,7 +305,7 @@ class PartialProject(RenderComponents):
)
raise DbtProjectError(msg.format(deprecated_path=deprecated_path,
exp_path=exp_path))
deprecations.warn('project_config_path',
deprecations.warn(f'project-config-{deprecated_path}',
deprecated_path=deprecated_path,
exp_path=exp_path)

@@ -2,6 +2,7 @@ from typing import Dict, Any, Tuple, Optional, Union, Callable
from dbt.clients.jinja import get_rendered, catch_jinja
from dbt.context.target import TargetContext
from dbt.context.secret import SecretContext
from dbt.context.base import BaseContext
from dbt.contracts.connection import HasCredentials
from dbt.exceptions import (
@@ -175,8 +176,13 @@ class DbtProjectYamlRenderer(BaseRenderer):
return True
class ProfileRenderer(BaseRenderer):
class SelectorRenderer(BaseRenderer):
@property
def name(self):
return 'Selector config'
class SecretRenderer(BaseRenderer):
def __init__(
self, cli_vars: Optional[Dict[str, Any]] = None
) -> None:
@@ -184,22 +190,22 @@ class ProfileRenderer(BaseRenderer):
# object in order to retrieve the env_vars.
if cli_vars is None:
cli_vars = {}
self.ctx_obj = BaseContext(cli_vars)
self.ctx_obj = SecretContext(cli_vars)
context = self.ctx_obj.to_dict()
super().__init__(context)
@property
def name(self):
'Profile'
return 'Secret'
class PackageRenderer(BaseRenderer):
class ProfileRenderer(SecretRenderer):
@property
def name(self):
return 'Profile'
class PackageRenderer(SecretRenderer):
@property
def name(self):
return 'Packages config'
class SelectorRenderer(BaseRenderer):
@property
def name(self):
return 'Selector config'
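
Note on the renderer hunk above: the old `ProfileRenderer.name` body was the bare string `'Profile'` with no `return`, so the property evaluated to `None`. The new hierarchy fixes that and moves the shared setup into `SecretRenderer`. A reduced sketch of both behaviours follows. `BaseRenderer` here is an empty stand-in, and `BrokenProfileRenderer` is a hypothetical name for the old version.

```python
class BaseRenderer:
    """Empty stand-in for dbt's BaseRenderer."""


class BrokenProfileRenderer(BaseRenderer):
    @property
    def name(self):
        'Profile'  # no return: this string is just the docstring


class SecretRenderer(BaseRenderer):
    @property
    def name(self):
        return 'Secret'


class ProfileRenderer(SecretRenderer):
    # Inherits SecretRenderer's setup and overrides only the label.
    @property
    def name(self):
        return 'Profile'


class PackageRenderer(SecretRenderer):
    @property
    def name(self):
        return 'Packages config'


old_name = BrokenProfileRenderer().name
new_names = [ProfileRenderer().name, PackageRenderer().name]
```

`old_name` is `None`, while `new_names` holds the two labels, which matters wherever the name is interpolated into an error message.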

@@ -1,7 +1,7 @@
import itertools
import os
from copy import deepcopy
from dataclasses import dataclass, fields
from dataclasses import dataclass
from pathlib import Path
from typing import (
Dict, Any, Optional, Mapping, Iterator, Iterable, Tuple, List, MutableSet,
@@ -13,20 +13,17 @@ from .project import Project
from .renderer import DbtProjectYamlRenderer, ProfileRenderer
from .utils import parse_cli_vars
from dbt import flags
from dbt import tracking
from dbt.adapters.factory import get_relation_class_by_name, get_include_paths
from dbt.helper_types import FQNPath, PathSet
from dbt.config.profile import read_user_config
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
from dbt.contracts.graph.manifest import ManifestMetadata
from dbt.contracts.relation import ComponentName
from dbt.events.types import ProfileLoadError, ProfileNotFound
from dbt.events.functions import fire_event
from dbt.ui import warning_tag
from dbt.contracts.project import Configuration, UserConfig
from dbt.exceptions import (
RuntimeException,
DbtProfileError,
DbtProjectError,
validator_error_message,
warn_or_error,
@@ -191,6 +188,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
profile_renderer: ProfileRenderer,
profile_name: Optional[str],
) -> Profile:
return Profile.render_from_args(
args, profile_renderer, profile_name
)
@@ -412,21 +410,12 @@ class UnsetCredentials(Credentials):
return ()
class UnsetConfig(UserConfig):
def __getattribute__(self, name):
if name in {f.name for f in fields(UserConfig)}:
raise AttributeError(
f"'UnsetConfig' object has no attribute {name}"
)
def __post_serialize__(self, dct):
return {}
# This is used by UnsetProfileConfig, for commands which do
# not require a profile, i.e. dbt deps and clean
class UnsetProfile(Profile):
def __init__(self):
self.credentials = UnsetCredentials()
self.user_config = UnsetConfig()
self.user_config = UserConfig() # This will be read in _get_rendered_profile
self.profile_name = ''
self.target_name = ''
self.threads = -1
@@ -443,6 +432,8 @@ class UnsetProfile(Profile):
return Profile.__getattribute__(self, name)
# This class is used by the dbt deps and clean commands, because they don't
# require a functioning profile.
@dataclass
class UnsetProfileConfig(RuntimeConfig):
"""This class acts a lot _like_ a RuntimeConfig, except if your profile is
@@ -525,7 +516,7 @@ class UnsetProfileConfig(RuntimeConfig):
profile_env_vars=profile.profile_env_vars,
profile_name='',
target_name='',
user_config=UnsetConfig(),
user_config=UserConfig(),
threads=getattr(args, 'threads', 1),
credentials=UnsetCredentials(),
args=args,
@@ -540,17 +531,12 @@ class UnsetProfileConfig(RuntimeConfig):
profile_renderer: ProfileRenderer,
profile_name: Optional[str],
) -> Profile:
try:
profile = Profile.render_from_args(
args, profile_renderer, profile_name
)
except (DbtProjectError, DbtProfileError) as exc:
fire_event(ProfileLoadError(exc=exc))
fire_event(ProfileNotFound(profile_name=profile_name))
# return the poisoned form
profile = UnsetProfile()
# disable anonymous usage statistics
tracking.disable_tracking()
profile = UnsetProfile()
# The profile (for warehouse connection) is not needed, but we want
# to get the UserConfig, which is also in profiles.yml
user_config = read_user_config(flags.PROFILES_DIR)
profile.user_config = user_config
return profile
@classmethod
@@ -565,9 +551,6 @@ class UnsetProfileConfig(RuntimeConfig):
:raises ValidationException: If the cli variables are invalid.
"""
project, profile = cls.collect_parts(args)
if not isinstance(profile, UnsetProfile):
# if it's a real profile, return a real config
cls = RuntimeConfig
return cls.from_parts(
project=project,


@@ -0,0 +1 @@
# Contexts and Jinja rendering


@@ -11,8 +11,11 @@ from dbt.clients.yaml_helper import ( # noqa: F401
yaml, safe_load, SafeLoader, Loader, Dumper
)
from dbt.contracts.graph.compiled import CompiledResource
from dbt.exceptions import raise_compiler_error, MacroReturn, raise_parsing_error
from dbt.events.functions import fire_event
from dbt.exceptions import (
raise_compiler_error, MacroReturn, raise_parsing_error, disallow_secret_env_var
)
from dbt.logger import SECRET_ENV_PREFIX
from dbt.events.functions import fire_event, get_invocation_id
from dbt.events.types import MacroEventInfo, MacroEventDebug
from dbt.version import __version__ as dbt_version
@@ -42,6 +45,7 @@ import re
# Context class hierarchy
#
# BaseContext -- core/dbt/context/base.py
# SecretContext -- core/dbt/context/secret.py
# TargetContext -- core/dbt/context/target.py
# ConfiguredContext -- core/dbt/context/configured.py
# SchemaYamlContext -- core/dbt/context/configured.py
@@ -313,6 +317,8 @@ class BaseContext(metaclass=ContextMeta):
If the default is None, raise an exception for an undefined variable.
"""
return_value = None
if var.startswith(SECRET_ENV_PREFIX):
disallow_secret_env_var(var)
if var in os.environ:
return_value = os.environ[var]
elif default is not None:
@@ -482,9 +488,9 @@ class BaseContext(metaclass=ContextMeta):
{% endmacro %}"
"""
if info:
fire_event(MacroEventInfo(msg))
fire_event(MacroEventInfo(msg=msg))
else:
fire_event(MacroEventDebug(msg))
fire_event(MacroEventDebug(msg=msg))
return ''
@contextproperty
@@ -520,10 +526,7 @@ class BaseContext(metaclass=ContextMeta):
"""invocation_id outputs a UUID generated for this dbt run (useful for
auditing)
"""
if tracking.active_user is not None:
return tracking.active_user.invocation_id
else:
return None
return get_invocation_id()
@contextproperty
def modules(self) -> Dict[str, Any]:


@@ -8,7 +8,7 @@ from dbt.utils import MultiDict
from dbt.context.base import contextproperty, contextmember, Var
from dbt.context.target import TargetContext
from dbt.exceptions import raise_parsing_error
from dbt.exceptions import raise_parsing_error, disallow_secret_env_var
class ConfiguredContext(TargetContext):
@@ -89,13 +89,15 @@ class SchemaYamlContext(ConfiguredContext):
@contextmember
def env_var(self, var: str, default: Optional[str] = None) -> str:
return_value = None
if var.startswith(SECRET_ENV_PREFIX):
disallow_secret_env_var(var)
if var in os.environ:
return_value = os.environ[var]
elif default is not None:
return_value = default
if return_value is not None:
if not var.startswith(SECRET_ENV_PREFIX) and self.schema_yaml_vars:
if self.schema_yaml_vars:
self.schema_yaml_vars.env_vars[var] = return_value
return return_value
else:


@@ -38,6 +38,7 @@ from dbt.contracts.graph.parsed import (
)
from dbt.exceptions import (
CompilationException,
ParsingException,
InternalException,
ValidationException,
RuntimeException,
@@ -50,6 +51,7 @@ from dbt.exceptions import (
source_target_not_found,
wrapped_exports,
raise_parsing_error,
disallow_secret_env_var,
)
from dbt.config import IsFQNResource
from dbt.node_types import NodeType
@@ -325,7 +327,7 @@ class ParseConfigObject(Config):
def require(self, name, validator=None):
return ''
def get(self, name, validator=None, default=None):
def get(self, name, default=None, validator=None):
return ''
def persist_relation_docs(self) -> bool:
@@ -369,7 +371,7 @@ class RuntimeConfigObject(Config):
return to_return
def get(self, name, validator=None, default=None):
def get(self, name, default=None, validator=None):
to_return = self._lookup(name, default)
if validator is not None and default is not None:
@@ -1172,6 +1174,8 @@ class ProviderContext(ManifestContext):
If the default is None, raise an exception for an undefined variable.
"""
return_value = None
if var.startswith(SECRET_ENV_PREFIX):
disallow_secret_env_var(var)
if var in os.environ:
return_value = os.environ[var]
elif default is not None:
@@ -1180,13 +1184,14 @@ class ProviderContext(ManifestContext):
if return_value is not None:
# Save the env_var value in the manifest and the var name in the source_file.
# If this is compiling, do not save because it's irrelevant to parsing.
if (not var.startswith(SECRET_ENV_PREFIX) and self.model and
not hasattr(self.model, 'compiled')):
if self.model and not hasattr(self.model, 'compiled'):
self.manifest.env_vars[var] = return_value
source_file = self.manifest.files[self.model.file_id]
# Schema files should never get here
if source_file.parse_file_type != 'schema':
source_file.env_vars.append(var)
# hooks come from dbt_project.yml which doesn't have a real file_id
if self.model.file_id in self.manifest.files:
source_file = self.manifest.files[self.model.file_id]
# Schema files should never get here
if source_file.parse_file_type != 'schema':
source_file.env_vars.append(var)
return return_value
else:
msg = f"Env var required but not provided: '{var}'"
@@ -1388,11 +1393,24 @@ def generate_parse_exposure(
class MetricRefResolver(BaseResolver):
def __call__(self, *args) -> str:
if len(args) not in (1, 2):
package = None
if len(args) == 1:
name = args[0]
elif len(args) == 2:
package, name = args
else:
ref_invalid_args(self.model, args)
self.validate_args(name, package)
self.model.refs.append(list(args))
return ''
def validate_args(self, name, package):
if not isinstance(name, str):
raise ParsingException(
f'In a metrics section in {self.model.original_file_path} '
f'the name argument to ref() must be a string'
)
def generate_parse_metrics(
metric: ParsedMetric,
@@ -1467,6 +1485,8 @@ class TestContext(ProviderContext):
@contextmember
def env_var(self, var: str, default: Optional[str] = None) -> str:
return_value = None
if var.startswith(SECRET_ENV_PREFIX):
disallow_secret_env_var(var)
if var in os.environ:
return_value = os.environ[var]
elif default is not None:
@@ -1474,7 +1494,7 @@ class TestContext(ProviderContext):
if return_value is not None:
# Save the env_var value in the manifest and the var name in the source_file
if not var.startswith(SECRET_ENV_PREFIX) and self.model:
if self.model:
self.manifest.env_vars[var] = return_value
# the "model" should only be test nodes, but just in case, check
if self.model.resource_type == NodeType.Test and self.model.file_key_name:


@@ -0,0 +1,40 @@
import os
from typing import Any, Dict, Optional
from .base import BaseContext, contextmember
from dbt.exceptions import raise_parsing_error
class SecretContext(BaseContext):
"""This context is used in profiles.yml + packages.yml. It can render secret
env vars that aren't usable elsewhere"""
@contextmember
def env_var(self, var: str, default: Optional[str] = None) -> str:
"""The env_var() function. Return the environment variable named 'var'.
If there is no such environment variable set, return the default.
If the default is None, raise an exception for an undefined variable.
In this context *only*, env_var will return the actual values of
env vars prefixed with DBT_ENV_SECRET_
"""
return_value = None
if var in os.environ:
return_value = os.environ[var]
elif default is not None:
return_value = default
if return_value is not None:
self.env_vars[var] = return_value
return return_value
else:
msg = f"Env var required but not provided: '{var}'"
raise_parsing_error(msg)
def generate_secret_context(cli_vars: Dict[str, Any]) -> Dict[str, Any]:
ctx = SecretContext(cli_vars)
# This is not a Mashumaro to_dict call
return ctx.to_dict()


@@ -0,0 +1 @@
# Contracts README


@@ -178,7 +178,26 @@ class ParsedNodeMandatory(
@dataclass
class ParsedNodeDefaults(ParsedNodeMandatory):
class NodeInfoMixin():
_event_status: Dict[str, Any] = field(default_factory=dict)
@property
def node_info(self):
node_info = {
"node_path": getattr(self, 'path', None),
"node_name": getattr(self, 'name', None),
"unique_id": getattr(self, 'unique_id', None),
"resource_type": str(getattr(self, 'resource_type', '')),
"materialized": self.config.get('materialized'),
"node_status": str(self._event_status.get('node_status')),
"node_started_at": self._event_status.get("started_at"),
"node_finished_at": self._event_status.get("finished_at")
}
return node_info
@dataclass
class ParsedNodeDefaults(NodeInfoMixin, ParsedNodeMandatory):
tags: List[str] = field(default_factory=list)
refs: List[List[str]] = field(default_factory=list)
sources: List[List[str]] = field(default_factory=list)
@@ -223,6 +242,8 @@ class ParsedNode(ParsedNodeDefaults, ParsedNodeMixins, SerializableType):
def __post_serialize__(self, dct):
if 'config_call_dict' in dct:
del dct['config_call_dict']
if '_event_status' in dct:
del dct['_event_status']
return dct
@classmethod
@@ -428,6 +449,10 @@ class ParsedSingularTestNode(ParsedNode):
# refactor the various configs.
config: TestConfig = field(default_factory=TestConfig) # type: ignore
@property
def test_node_type(self):
return 'singular'
@dataclass
class ParsedGenericTestNode(ParsedNode, HasTestMetadata):
@@ -449,6 +474,10 @@ class ParsedGenericTestNode(ParsedNode, HasTestMetadata):
True
)
@property
def test_node_type(self):
return 'generic'
@dataclass
class IntermediateSnapshotNode(ParsedNode):
@@ -599,12 +628,11 @@ class UnpatchedSourceDefinition(UnparsedBaseNode, HasUniqueID, HasFqn):
@dataclass
class ParsedSourceDefinition(
class ParsedSourceMandatory(
UnparsedBaseNode,
HasUniqueID,
HasRelationMetadata,
HasFqn,
):
name: str
source_name: str
@@ -612,6 +640,13 @@ class ParsedSourceDefinition(
loader: str
identifier: str
resource_type: NodeType = field(metadata={'restrict': [NodeType.Source]})
@dataclass
class ParsedSourceDefinition(
NodeInfoMixin,
ParsedSourceMandatory
):
quoting: Quoting = field(default_factory=Quoting)
loaded_at_field: Optional[str] = None
freshness: Optional[FreshnessThreshold] = None
@@ -627,6 +662,11 @@ class ParsedSourceDefinition(
relation_name: Optional[str] = None
created_at: float = field(default_factory=lambda: time.time())
def __post_serialize__(self, dct):
if '_event_status' in dct:
del dct['_event_status']
return dct
def same_database_representation(
self, other: 'ParsedSourceDefinition'
) -> bool:
@@ -800,7 +840,7 @@ class ParsedMetric(UnparsedBaseNode, HasUniqueID, HasFqn):
sources: List[List[str]] = field(default_factory=list)
depends_on: DependsOn = field(default_factory=DependsOn)
refs: List[List[str]] = field(default_factory=list)
created_at: int = field(default_factory=lambda: int(time.time()))
created_at: float = field(default_factory=lambda: time.time())
@property
def depends_on_nodes(self):


@@ -285,7 +285,7 @@ class UnparsedSourceDefinition(dbtClassMixin, Replaceable):
def __post_serialize__(self, dct):
dct = super().__post_serialize__(dct)
if 'freshnewss' not in dct and self.freshness is None:
if 'freshness' not in dct and self.freshness is None:
dct['freshness'] = None
return dct


@@ -18,6 +18,18 @@ DEFAULT_SEND_ANONYMOUS_USAGE_STATS = True
class Name(ValidatedStringMixin):
ValidationRegex = r'^[^\d\W]\w*$'
@classmethod
def is_valid(cls, value: Any) -> bool:
if not isinstance(value, str):
return False
try:
cls.validate(value)
except ValidationError:
return False
return True
register_pattern(Name, r'^[^\d\W]\w*$')
@@ -231,7 +243,7 @@ class UserConfig(ExtensibleDbtClassMixin, Replaceable, UserConfigContract):
printer_width: Optional[int] = None
write_json: Optional[bool] = None
warn_error: Optional[bool] = None
log_format: Optional[bool] = None
log_format: Optional[str] = None
debug: Optional[bool] = None
version_check: Optional[bool] = None
fail_fast: Optional[bool] = None


@@ -58,6 +58,12 @@ class collect_timing_info:
fire_event(TimingInfoCollected())
class RunningStatus(StrEnum):
Started = 'started'
Compiling = 'compiling'
Executing = 'executing'
class NodeStatus(StrEnum):
Success = "success"
Error = "error"


@@ -14,7 +14,8 @@ class PreviousState:
manifest_path = self.path / 'manifest.json'
if manifest_path.exists() and manifest_path.is_file():
try:
self.manifest = WritableManifest.read(str(manifest_path))
# we want to bail with an error if schema versions don't match
self.manifest = WritableManifest.read_and_check_versions(str(manifest_path))
except IncompatibleSchemaException as exc:
exc.add_filename(str(manifest_path))
raise
@@ -22,7 +23,8 @@ class PreviousState:
results_path = self.path / 'run_results.json'
if results_path.exists() and results_path.is_file():
try:
self.results = RunResultsArtifact.read(str(results_path))
# we want to bail with an error if schema versions don't match
self.results = RunResultsArtifact.read_and_check_versions(str(results_path))
except IncompatibleSchemaException as exc:
exc.add_filename(str(results_path))
raise


@@ -9,9 +9,10 @@ from dbt.clients.system import write_json, read_json
from dbt.exceptions import (
InternalException,
RuntimeException,
IncompatibleSchemaException
)
from dbt.version import __version__
from dbt.tracking import get_invocation_id
from dbt.events.functions import get_invocation_id
from dbt.dataclass_schema import dbtClassMixin
SourceKey = Tuple[str, str]
@@ -158,6 +159,8 @@ def get_metadata_env() -> Dict[str, str]:
}
# This is used in the ManifestMetadata, RunResultsMetadata, RunOperationResultMetadata,
# FreshnessMetadata, and CatalogMetadata classes
@dataclasses.dataclass
class BaseArtifactMetadata(dbtClassMixin):
dbt_schema_version: str
@@ -177,6 +180,17 @@ class BaseArtifactMetadata(dbtClassMixin):
return dct
# This is used as a class decorator to set the schema_version in the
# 'dbt_schema_version' class attribute. (It's copied into the metadata objects.)
# Name attributes of SchemaVersion in classes with the 'schema_version' decorator:
# manifest
# run-results
# run-operation-result
# sources
# catalog
# remote-compile-result
# remote-execution-result
# remote-run-result
def schema_version(name: str, version: int):
def inner(cls: Type[VersionedSchema]):
cls.dbt_schema_version = SchemaVersion(
@@ -187,6 +201,7 @@ def schema_version(name: str, version: int):
return inner
# This is used in the ArtifactMixin and RemoteResult classes
@dataclasses.dataclass
class VersionedSchema(dbtClassMixin):
dbt_schema_version: ClassVar[SchemaVersion]
@@ -198,6 +213,30 @@ class VersionedSchema(dbtClassMixin):
result['$id'] = str(cls.dbt_schema_version)
return result
@classmethod
def read_and_check_versions(cls, path: str):
try:
data = read_json(path)
except (EnvironmentError, ValueError) as exc:
raise RuntimeException(
f'Could not read {cls.__name__} at "{path}" as JSON: {exc}'
) from exc
# Check metadata version. There is a class variable 'dbt_schema_version', but
# that doesn't show up in artifacts, where it only exists in the 'metadata'
# dictionary.
if hasattr(cls, 'dbt_schema_version'):
if 'metadata' in data and 'dbt_schema_version' in data['metadata']:
previous_schema_version = data['metadata']['dbt_schema_version']
# cls.dbt_schema_version is a SchemaVersion object
if str(cls.dbt_schema_version) != previous_schema_version:
raise IncompatibleSchemaException(
expected=str(cls.dbt_schema_version),
found=previous_schema_version
)
return cls.from_dict(data) # type: ignore
T = TypeVar('T', bound='ArtifactMixin')
@@ -205,6 +244,8 @@ T = TypeVar('T', bound='ArtifactMixin')
# metadata should really be a Generic[T_M] where T_M is a TypeVar bound to
# BaseArtifactMetadata. Unfortunately this isn't possible due to a mypy issue:
# https://github.com/python/mypy/issues/7520
# This is used in the WritableManifest, RunResultsArtifact, RunOperationResultsArtifact,
# and CatalogArtifact
@dataclasses.dataclass(init=False)
class ArtifactMixin(VersionedSchema, Writable, Readable):
metadata: BaseArtifactMetadata


@@ -22,7 +22,7 @@ class DateTimeSerialization(SerializationStrategy):
out = value.isoformat()
# Assume UTC if timezone is missing
if value.tzinfo is None:
out = out + "Z"
out += "Z"
return out
def deserialize(self, value):


@@ -36,9 +36,9 @@ class DBTDeprecation:
if self.name not in active_deprecations:
desc = self.description.format(**kwargs)
msg = ui.line_wrap_message(
desc, prefix='* Deprecation Warning: '
desc, prefix='Deprecated functionality\n\n'
)
dbt.exceptions.warn_or_error(msg)
dbt.exceptions.warn_or_error(msg, log_fmt=ui.warning_tag('{}'))
self.track_deprecation_warn()
active_deprecations.add(self.name)
@@ -61,13 +61,20 @@ class PackageInstallPathDeprecation(DBTDeprecation):
class ConfigPathDeprecation(DBTDeprecation):
_name = 'project_config_path'
_description = '''\
The `{deprecated_path}` config has been deprecated in favor of `{exp_path}`.
The `{deprecated_path}` config has been renamed to `{exp_path}`.
Please update your `dbt_project.yml` configuration to reflect this change.
'''
class ConfigSourcePathDeprecation(ConfigPathDeprecation):
_name = 'project-config-source-paths'
class ConfigDataPathDeprecation(ConfigPathDeprecation):
_name = 'project-config-data-paths'
_adapter_renamed_description = """\
The adapter function `adapter.{old_name}` is deprecated and will be removed in
a future release of dbt. Please use `adapter.{new_name}` instead.
@@ -106,7 +113,8 @@ def warn(name, *args, **kwargs):
active_deprecations: Set[str] = set()
deprecations_list: List[DBTDeprecation] = [
ConfigPathDeprecation(),
ConfigSourcePathDeprecation(),
ConfigDataPathDeprecation(),
PackageInstallPathDeprecation(),
PackageRedirectDeprecation()
]

core/dbt/deps/README.md

@@ -0,0 +1 @@
# Deps README


@@ -1,4 +1,5 @@
import os
import functools
from typing import List
from dbt import semver
@@ -14,6 +15,7 @@ from dbt.exceptions import (
DependencyException,
package_not_found,
)
from dbt.utils import _connection_exception_retry as connection_exception_retry
class RegistryPackageMixin:
@@ -68,9 +70,28 @@ class RegistryPinnedPackage(RegistryPackageMixin, PinnedPackage):
system.make_directory(os.path.dirname(tar_path))
download_url = metadata.downloads.tarball
system.download_with_retries(download_url, tar_path)
deps_path = project.packages_install_path
package_name = self.get_project_name(project, renderer)
download_untar_fn = functools.partial(
self.download_and_untar,
download_url,
tar_path,
deps_path,
package_name
)
connection_exception_retry(download_untar_fn, 5)
def download_and_untar(self, download_url, tar_path, deps_path, package_name):
"""
Sometimes the download of the files fails and we want to retry. Sometimes the
download appears successful but the file did not make it through as expected
(generally due to a github incident). Either way we want to retry downloading
and untarring to see if we can get a success. Call this within
`_connection_exception_retry`
"""
system.download(download_url, tar_path)
system.untar_package(tar_path, deps_path, package_name)
@@ -127,9 +148,12 @@ class RegistryUnpinnedPackage(
raise DependencyException(new_msg) from e
available = registry.get_available_versions(self.package)
prerelease_version_specified = any(
bool(version.prerelease) for version in self.versions
)
installable = semver.filter_installable(
available,
self.install_prerelease
self.install_prerelease or prerelease_version_specified
)
available_latest = installable[-1]


@@ -1,12 +1,55 @@
# Events Module
The Events module is the implementation for structured logging. These events represent both a programmatic interface to dbt processes as well as human-readable messaging in one centralized place. The centralization allows for leveraging mypy to enforce interface invariants across all dbt events, and the distinct type layer allows for decoupling events and libraries such as loggers.
The Events module is responsible for communicating internal dbt structures into a consumable interface. Right now, the events module is exclusively used for structured logging, but in the future it could grow to include other user-facing components such as exceptions. These events represent both a programmatic interface to dbt processes as well as human-readable messaging in one centralized place. The centralization allows for leveraging mypy to enforce interface invariants across all dbt events, and the distinct type layer allows for decoupling events and libraries such as loggers.
# Using the Events Module
The event module provides types that represent what is happening in dbt in `events.types`. These types are intended to represent an exhaustive list of all things happening within dbt that will need to be logged, streamed, or printed. To fire an event, call `events.functions::fire_event`, which is the entry point to the module from everywhere in dbt.
# Logging
When events are processed via `fire_event`, nearly everything is logged. Whether or not the user has enabled the debug flag, all debug messages are still logged to the file. However, some events are particularly time-consuming to construct because they return a huge amount of data. Today, the only messages in this category are cache events, which are only logged if the `--log-cache-events` flag is on. This matters because these messages cause a noticeable performance degradation, so they should not be created unless they are going to be logged. We achieve this by making the event class explicitly use lazy values for the expensive fields so they are not computed until the moment they are required. This is done with the data type `core/dbt/helper_types.py::Lazy`, which includes usage documentation.
Example:
```
@dataclass
class DumpBeforeAddGraph(DebugLevel, Cache):
    dump: Lazy[Dict[str, List[str]]]
    code: str = "E031"

    def message(self) -> str:
        return f"before adding : {self.dump.force()}"
```
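The deferral idea described above can be sketched without any dbt imports. This is a minimal illustration, not the actual `Lazy` type from `core/dbt/helper_types.py`: the names `LazyValue`, `thunk`, and `expensive_dump` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class LazyValue(Generic[T]):
    """Hypothetical stand-in for a lazy wrapper; not dbt's own `Lazy`."""
    thunk: Callable[[], T]
    _memo: Optional[T] = None
    _forced: bool = False

    def force(self) -> T:
        # Run the wrapped function on first use only, then reuse the result.
        if not self._forced:
            self._memo = self.thunk()
            self._forced = True
        return self._memo  # type: ignore[return-value]


calls: List[str] = []


def expensive_dump() -> dict:
    # Records each invocation so the deferral is observable.
    calls.append("computed")
    return {"model_a": ["model_b"]}


# Wrapping the function does not call it.
dump = LazyValue(expensive_dump)

# The work happens on the first force(); the second call reuses the memo.
first = dump.force()
second = dump.force()
```

With this shape, an event that is never rendered never pays for building its payload, because only `message()` would call `force()`.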
# Adding a New Event
In `events.types`, add a new class that represents the new event. This may be a simple class with no values, or it may be a dataclass with some values to construct downstream messaging. Only include the data necessary to construct this message within this class. You must extend all destinations (e.g., if your log message belongs on the CLI, extend `CliEventABC`) as well as the log level this event belongs to.
In `events.types`, add a new class that represents the new event. All events must be a dataclass with, at minimum, a code. You may also include other values to construct downstream messaging. Only include the data necessary to construct this message within this class. You must extend all destinations (e.g., if your log message belongs on the CLI, extend `Cli`) as well as the log level this event belongs to. This system has been designed to take full advantage of mypy, so running it will catch anything you may miss.
## Required for Every Event
- a string attribute `code`, that's unique across events
- assign a log level by extending `DebugLevel`, `InfoLevel`, `WarnLevel`, or `ErrorLevel`
- a `message()` method
- extend `File` and/or `Cli` based on where it should output
Example
```
@dataclass
class PartialParsingDeletedExposure(DebugLevel, Cli, File):
    unique_id: str
    code: str = "I049"

    def message(self) -> str:
        return f"Partial parsing: deleted exposure {self.unique_id}"
```
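Because every event needs a `code` that is unique across events, a check along the following lines can catch clashes early. This is only a sketch under stated assumptions: `SketchEvent`, the three event classes, and `duplicate_codes` are hypothetical names for illustration and are not part of dbt or its test suite.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Type


class SketchEvent:
    """Hypothetical base class standing in for dbt's Event."""
    code: str = ""


@dataclass
class DeletedExposure(SketchEvent):
    unique_id: str
    code: str = "I049"


@dataclass
class DumpBeforeAdd(SketchEvent):
    code: str = "E031"


@dataclass
class Clashing(SketchEvent):
    # Deliberately reuses a code that is already taken.
    code: str = "I049"


def duplicate_codes(event_classes: List[Type[SketchEvent]]) -> List[str]:
    # Count the class-level default of `code` and report any that repeat.
    counts = Counter(cls.code for cls in event_classes)
    return sorted(code for code, n in counts.items() if n > 1)


clean = duplicate_codes([DeletedExposure, DumpBeforeAdd])
clashing = duplicate_codes([DeletedExposure, DumpBeforeAdd, Clashing])
```

The check reads the dataclass default from the class itself, so no event instances need to be constructed.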
## Optional (based on your event)
- Events associated with node status changes must extend `NodeInfo`, which contains a `node_info` attribute
All values other than `code` and `node_info` will be included in the `data` node of the json log output.
Once your event has been added, add a dummy call to your new event at the bottom of `types.py` and also add your new event to the list `sample_values` in `test/unit/test_events.py`.
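To make the `data` node concrete, the sketch below builds a reduced log line from a dataclass event. It is an illustration only: `SketchDeletedExposure` and `to_log_line` are hypothetical names, the real JSON output carries more keys than shown here, and placing `node_info` at the top level of the line is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class SketchDeletedExposure:
    """Hypothetical event used only for this example."""
    unique_id: str
    code: str = "I049"

    def message(self) -> str:
        return f"Partial parsing: deleted exposure {self.unique_id}"


def to_log_line(event: Any) -> Dict[str, Any]:
    # Start from every field on the event, then pull out the two
    # values that do not belong in the `data` node.
    data = asdict(event)
    data.pop("code", None)
    node_info = data.pop("node_info", None)

    line: Dict[str, Any] = {
        "code": event.code,
        "msg": event.message(),
        "data": data,
    }
    # Assumption: node_info, when present, sits beside `data`.
    if node_info is not None:
        line["node_info"] = node_info
    return line


line = to_log_line(SketchDeletedExposure(unique_id="exposure.my_project.weekly"))
```

Here `line["data"]` holds only `unique_id`, while `code` appears once at the top level of the line.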
# Adapter Maintainers
To integrate existing log messages from adapters, you likely have a line of code like this in your adapter already:


@@ -1,8 +1,9 @@
from abc import ABCMeta, abstractmethod
from abc import ABCMeta, abstractproperty, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from dbt.events.serialization import EventSerialization
import os
from typing import Any
import threading
from typing import Any, Dict
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
@@ -11,30 +12,9 @@ from typing import Any
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# in preparation for #3977
class TestLevel():
def level_tag(self) -> str:
return "test"
class DebugLevel():
def level_tag(self) -> str:
return "debug"
class InfoLevel():
def level_tag(self) -> str:
return "info"
class WarnLevel():
def level_tag(self) -> str:
return "warn"
class ErrorLevel():
def level_tag(self) -> str:
return "error"
class Cache():
# Events with this class will only be logged when the `--log-cache-events` flag is passed
pass
@dataclass
@@ -52,44 +32,89 @@ class ShowException():
# TODO add exhaustiveness checking for subclasses
# top-level superclass for all events
class Event(metaclass=ABCMeta):
# fields that should be on all events with their default implementations
ts: datetime = datetime.now()
pid: int = os.getpid()
# code: int
# Do not define fields with defaults here
# four digit string code that uniquely identifies this type of event
# uniqueness and valid characters are enforced by tests
@abstractproperty
@staticmethod
def code() -> str:
raise Exception("code() not implemented for event")
# The 'to_dict' method is added by mashumaro via the EventSerialization.
# It should be in all subclasses that are to record actual events.
@abstractmethod
def to_dict(self):
raise Exception('to_dict not implemented for Event')
# do not define this yourself. inherit it from one of the above level types.
@abstractmethod
def level_tag(self) -> str:
raise Exception("level_tag not implemented for event")
raise Exception("level_tag not implemented for Event")
# Solely the human readable message. Timestamps and formatting will be added by the logger.
# Must override yourself
@abstractmethod
def message(self) -> str:
raise Exception("msg not implemented for cli event")
raise Exception("msg not implemented for Event")
# returns a dictionary representation of the event fields. You must specify which of the
# available messages you would like to use (i.e. - e.message, e.cli_msg(), e.file_msg())
# used for constructing json formatted events. includes secrets which must be scrubbed at
# the usage site.
def to_dict(self, msg: str) -> dict:
level = self.level_tag()
return {
'pid': self.pid,
'msg': msg,
'level': level if len(level) == 5 else f"{level} "
}
# exactly one pid per concrete event
def get_pid(self) -> int:
return os.getpid()
# in theory threads can change so we don't cache them.
def get_thread_name(self) -> str:
return threading.current_thread().getName()
@classmethod
def get_invocation_id(cls) -> str:
from dbt.events.functions import get_invocation_id
return get_invocation_id()
class File(Event, metaclass=ABCMeta):
# Solely the human readable message. Timestamps and formatting will be added by the logger.
def file_msg(self) -> str:
# returns the event msg unless overridden in the concrete class
return self.message()
# in preparation for #3977
@dataclass # type: ignore[misc]
class TestLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "test"
class Cli(Event, metaclass=ABCMeta):
# Solely the human readable message. Timestamps and formatting will be added by the logger.
def cli_msg(self) -> str:
# returns the event msg unless overridden in the concrete class
return self.message()
@dataclass # type: ignore[misc]
class DebugLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "debug"
@dataclass # type: ignore[misc]
class InfoLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "info"
@dataclass # type: ignore[misc]
class WarnLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "warn"
@dataclass # type: ignore[misc]
class ErrorLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "error"
# prevents an event from going to the file
class NoFile():
pass
# prevents an event from going to stdout
class NoStdOut():
pass
# This class represents the node_info which is generated
# by the NodeInfoMixin class in dbt.contracts.graph.parsed
@dataclass
class NodeInfo():
node_info: Dict[str, Any]


@@ -1,39 +1,79 @@
import colorama
from colorama import Style
import dbt.events.functions as this # don't worry I hate it too.
from dbt.events.base_types import Cli, Event, File, ShowException
from dbt.events.base_types import NoStdOut, Event, NoFile, ShowException, Cache
from dbt.events.types import EventBufferFull, T_Event, MainReportVersion, EmptyLine
import dbt.flags as flags
# TODO this will need to move eventually
from dbt.logger import SECRET_ENV_PREFIX, make_log_dir_if_missing, GLOBAL_LOGGER
from datetime import datetime
import json
import io
from io import StringIO, TextIOWrapper
import json
import logbook
import logging
from logging import Logger
import sys
from logging.handlers import RotatingFileHandler
import os
from typing import Callable, List, TypeVar, Union
import uuid
import threading
from typing import Any, Dict, List, Optional, Union
from collections import deque
global LOG_VERSION
LOG_VERSION = 2
# create the global event history buffer with the default max size (10k)
# python 3.7 doesn't support type hints on globals, but mypy requires them. hence the ignore.
# TODO the flags module has not yet been resolved when this is created
global EVENT_HISTORY
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE) # type: ignore
# create the global file logger with no configuration
global FILE_LOG
FILE_LOG = logging.getLogger('default_file')
null_handler = logging.NullHandler()
FILE_LOG.addHandler(null_handler)
# set up logger to go to stdout with defaults
# setup_event_logger will be called once args have been parsed
global STDOUT_LOG
STDOUT_LOG = logging.getLogger('default_stdout')
STDOUT_LOG.setLevel(logging.INFO)
stdout_handler = logging.StreamHandler()
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setLevel(logging.INFO)
STDOUT_LOG.addHandler(stdout_handler)
format_color = True
format_json = False
invocation_id: Optional[str] = None
# Colorama needs some help on windows because we're using logger.info
# instead of print(). If the Windows env doesn't have a TERM var set,
# then we should override the logging stream to use the colorama
# converter. If the TERM var is set (as with Git Bash), then it's safe
# to send escape characters and no log handler injection is needed.
colorama_stdout = sys.stdout
colorama_wrap = True
colorama.init(wrap=colorama_wrap)
if sys.platform == 'win32' and not os.getenv('TERM'):
colorama_wrap = False
colorama_stdout = colorama.AnsiToWin32(sys.stdout).stream
elif sys.platform == 'win32':
colorama_wrap = False
colorama.init(wrap=colorama_wrap)
def setup_event_logger(log_path):
def setup_event_logger(log_path, level_override=None):
# flags have been resolved, and log_path is known
global EVENT_HISTORY
EVENT_HISTORY = deque(maxlen=flags.EVENT_BUFFER_SIZE) # type: ignore
make_log_dir_if_missing(log_path)
this.format_json = flags.LOG_FORMAT == 'json'
# USE_COLORS can be None if the app just started and the cli flags
@@ -41,7 +81,7 @@ def setup_event_logger(log_path):
this.format_color = True if flags.USE_COLORS else False
# TODO this default should live somewhere better
log_dest = os.path.join(log_path, 'dbt.log')
level = logging.DEBUG if flags.DEBUG else logging.INFO
level = level_override or (logging.DEBUG if flags.DEBUG else logging.INFO)
# overwrite the STDOUT_LOG logger with the configured one
this.STDOUT_LOG = logging.getLogger('configured_std_out')
@@ -50,7 +90,7 @@ def setup_event_logger(log_path):
FORMAT = "%(message)s"
stdout_passthrough_formatter = logging.Formatter(fmt=FORMAT)
stdout_handler = logging.StreamHandler()
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setFormatter(stdout_passthrough_formatter)
stdout_handler.setLevel(level)
# clear existing stdout TextIOWrapper stream handlers
@@ -66,7 +106,12 @@ def setup_event_logger(log_path):
file_passthrough_formatter = logging.Formatter(fmt=FORMAT)
file_handler = RotatingFileHandler(filename=log_dest, encoding='utf8')
file_handler = RotatingFileHandler(
filename=log_dest,
encoding='utf8',
maxBytes=10 * 1024 * 1024, # 10 mb
backupCount=5
)
file_handler.setFormatter(file_passthrough_formatter)
file_handler.setLevel(logging.DEBUG) # always debug regardless of user input
this.FILE_LOG.handlers.clear()
@@ -106,34 +151,100 @@ def scrub_secrets(msg: str, secrets: List[str]) -> str:
return scrubbed
# Type representing Event and all subclasses of Event
T_Event = TypeVar('T_Event', bound=Event)
# returns a dictionary representation of the event fields.
# the message may contain secrets which must be scrubbed at the usage site.
def event_to_serializable_dict(
e: T_Event,
) -> Dict[str, Any]:
log_line = dict()
code: str
try:
log_line = e.to_dict()
except AttributeError as exc:
event_type = type(e).__name__
raise Exception( # TODO this may hang async threads
f"type {event_type} is not serializable. {str(exc)}"
)
# We get the code from the event object, so we don't need it in the data
if 'code' in log_line:
del log_line['code']
event_dict = {
'type': 'log_line',
'log_version': LOG_VERSION,
'ts': get_ts_rfc3339(),
'pid': e.get_pid(),
'msg': e.message(),
'level': e.level_tag(),
'data': log_line,
'invocation_id': e.get_invocation_id(),
'thread_name': e.get_thread_name(),
'code': e.code
}
return event_dict
# translates an Event to a completely formatted text-based log line
# you have to specify which message you want. (i.e. - e.message, e.cli_msg(), e.file_msg())
# type hinting everything as strings so we don't get any unintentional string conversions via str()
def create_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
def create_info_text_log_line(e: T_Event) -> str:
color_tag: str = '' if this.format_color else Style.RESET_ALL
ts: str = e.ts.strftime("%H:%M:%S")
scrubbed_msg: str = scrub_secrets(msg_fn(e), env_secrets())
level: str = e.level_tag()
log_line: str = f"{color_tag}{ts} | [ {level} ] | {scrubbed_msg}"
ts: str = get_ts().strftime("%H:%M:%S")
scrubbed_msg: str = scrub_secrets(e.message(), env_secrets())
log_line: str = f"{color_tag}{ts} {scrubbed_msg}"
return log_line
def create_debug_text_log_line(e: T_Event) -> str:
log_line: str = ''
# Create a separator if this is the beginning of an invocation
if type(e) == MainReportVersion:
separator = 30 * '='
log_line = f'\n\n{separator} {get_ts()} | {get_invocation_id()} {separator}\n'
color_tag: str = '' if this.format_color else Style.RESET_ALL
ts: str = get_ts().strftime("%H:%M:%S.%f")
scrubbed_msg: str = scrub_secrets(e.message(), env_secrets())
level: str = e.level_tag() if len(e.level_tag()) == 5 else f"{e.level_tag()} "
thread = ''
if threading.current_thread().getName():
thread_name = threading.current_thread().getName()
thread_name = thread_name[:10]
thread_name = thread_name.ljust(10, ' ')
thread = f' [{thread_name}]:'
log_line = log_line + f"{color_tag}{ts} [{level}]{thread} {scrubbed_msg}"
return log_line
# translates an Event to a completely formatted json log line
# you have to specify which message you want. (i.e. - e.message, e.cli_msg(), e.file_msg())
def create_json_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
values = e.to_dict(scrub_secrets(msg_fn(e), env_secrets()))
values['ts'] = e.ts.isoformat()
log_line = json.dumps(values, sort_keys=True)
return log_line
def create_json_log_line(e: T_Event) -> Optional[str]:
if type(e) == EmptyLine:
return None # will not be sent to logger
# using preformatted ts string instead of formatting it here to be extra careful about timezone
values = event_to_serializable_dict(e)
raw_log_line = json.dumps(values, sort_keys=True)
return scrub_secrets(raw_log_line, env_secrets())
# calls create_json_log_line(), create_debug_text_log_line(), or
# create_info_text_log_line() according to logger config
def create_log_line(
e: T_Event,
file_output=False
) -> Optional[str]:
if this.format_json:
return create_json_log_line(e) # json output, both console and file
elif file_output is True or flags.DEBUG:
return create_debug_text_log_line(e) # default file output
else:
return create_info_text_log_line(e) # console output
# allows for reuse of this obnoxious if else tree.
# do not use for exceptions, it doesn't pass along exc_info, stack_info, or extra
def send_to_logger(l: Union[Logger, logbook.Logger], level_tag: str, log_line: str):
if not log_line:
return
if level_tag == 'test':
# TODO after implementing #3977 send to new test level
l.debug(log_line)
@@ -206,43 +317,78 @@ def send_exc_to_logger(
# (i.e. - mutating the event history, printing to stdout, logging
# to files, etc.)
def fire_event(e: Event) -> None:
# TODO manage history in phase 2: EVENT_HISTORY.append(e)
# skip logs when `--log-cache-events` is not passed
if isinstance(e, Cache) and not flags.LOG_CACHE_EVENTS:
return
# if and only if the event history deque will be completely filled by this event
# fire warning that old events are now being dropped
global EVENT_HISTORY
if len(EVENT_HISTORY) == (flags.EVENT_BUFFER_SIZE - 1):
EVENT_HISTORY.append(e)
fire_event(EventBufferFull())
else:
EVENT_HISTORY.append(e)
# backwards compatibility for plugins that require old logger (dbt-rpc)
if flags.ENABLE_LEGACY_LOGGER:
# using Event::message because the legacy logger didn't differentiate messages by
# destination
log_line = (
create_json_log_line(e, msg_fn=lambda x: x.message())
if this.format_json else
create_text_log_line(e, msg_fn=lambda x: x.message())
)
send_to_logger(GLOBAL_LOGGER, e.level_tag(), log_line)
log_line = create_log_line(e)
if log_line:
send_to_logger(GLOBAL_LOGGER, e.level_tag(), log_line)
return # exit the function to avoid using the current logger as well
# always logs debug level regardless of user input
if isinstance(e, File):
log_line = create_json_log_line(e, msg_fn=lambda x: x.file_msg())
if not isinstance(e, NoFile):
log_line = create_log_line(e, file_output=True)
# doesn't send exceptions to exception logger
send_to_logger(FILE_LOG, level_tag=e.level_tag(), log_line=log_line)
if log_line:
send_to_logger(FILE_LOG, level_tag=e.level_tag(), log_line=log_line)
if isinstance(e, Cli):
if not isinstance(e, NoStdOut):
# explicitly checking the debug flag here so that potentially expensive-to-construct
# log messages are not constructed if debug messages are never shown.
if e.level_tag() == 'debug' and not flags.DEBUG:
return # eat the message in case it was one of the expensive ones
log_line = create_json_log_line(e, msg_fn=lambda x: x.cli_msg())
if not isinstance(e, ShowException):
send_to_logger(STDOUT_LOG, level_tag=e.level_tag(), log_line=log_line)
# CliEventABC and ShowException
else:
send_exc_to_logger(
STDOUT_LOG,
level_tag=e.level_tag(),
log_line=log_line,
exc_info=e.exc_info,
stack_info=e.stack_info,
extra=e.extra
)
log_line = create_log_line(e)
if log_line:
if not isinstance(e, ShowException):
send_to_logger(STDOUT_LOG, level_tag=e.level_tag(), log_line=log_line)
else:
send_exc_to_logger(
STDOUT_LOG,
level_tag=e.level_tag(),
log_line=log_line,
exc_info=e.exc_info,
stack_info=e.stack_info,
extra=e.extra
)
def get_invocation_id() -> str:
global invocation_id
if invocation_id is None:
invocation_id = str(uuid.uuid4())
return invocation_id
def set_invocation_id() -> None:
# This is primarily for setting the invocation_id for separate
# commands in the dbt servers. It shouldn't be necessary for the CLI.
global invocation_id
invocation_id = str(uuid.uuid4())
# exactly one time stamp per concrete event
def get_ts() -> datetime:
ts = datetime.utcnow()
return ts
# preformatted time stamp
def get_ts_rfc3339() -> str:
ts = get_ts()
ts_rfc3339 = ts.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
return ts_rfc3339
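
The bounded event history in `fire_event` above can be illustrated with a small standalone sketch. This is not the dbt implementation: `EVENT_BUFFER_SIZE` here is a tiny hypothetical stand-in for `flags.EVENT_BUFFER_SIZE`, events are plain strings, and the warning is collected in a list rather than fired as an `EventBufferFull` event.

```python
from collections import deque

# Hypothetical stand-in for flags.EVENT_BUFFER_SIZE (kept tiny for illustration).
EVENT_BUFFER_SIZE = 3

# A deque with maxlen silently drops the oldest item once it is full.
history = deque(maxlen=EVENT_BUFFER_SIZE)
warnings = []


def record(event: str) -> None:
    # Warn only when this append is the one that fills the buffer,
    # mirroring the `len(EVENT_HISTORY) == (flags.EVENT_BUFFER_SIZE - 1)` check.
    if len(history) == EVENT_BUFFER_SIZE - 1:
        history.append(event)
        warnings.append("event buffer is full; old events will be dropped")
    else:
        history.append(event)


for name in ["a", "b", "c", "d"]:
    record(name)
```

In this sketch the warning is raised once, on the third append, and the fourth append evicts the oldest entry. In the diff the warning is itself an event passed back through `fire_event`, so it also lands in the history.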


@@ -1,7 +0,0 @@
from dbt.events.base_types import Event
from typing import List
# the global history of events for this session
# TODO this is naive and the memory footprint is likely far too large.
EVENT_HISTORY: List[Event] = []


@@ -0,0 +1,56 @@
from dbt.helper_types import Lazy
from mashumaro import DataClassDictMixin
from mashumaro.config import (
BaseConfig as MashBaseConfig
)
from mashumaro.types import SerializationStrategy
from typing import Dict, List
# The dbtClassMixin serialization class has a DateTime serialization strategy
# class. If a datetime ends up in an event class, we could use a similar class
# here to serialize it in our preferred format.
class ExceptionSerialization(SerializationStrategy):
def serialize(self, value):
out = str(value)
return out
def deserialize(self, value):
return (Exception(value))
class BaseExceptionSerialization(SerializationStrategy):
def serialize(self, value):
return str(value)
def deserialize(self, value):
return (BaseException(value))
# This is an explicit deserializer for the type Lazy[Dict[str, List[str]]]
# mashumaro does not support composing serialization strategies, so all
# future uses of Lazy will need to register a unique serialization class like this one.
class LazySerialization1(SerializationStrategy):
def serialize(self, value) -> Dict[str, List[str]]:
return value.force()
# we _can_ deserialize into a lazy value, but that defers running the deserialization
# function till the value is used which can raise errors at very unexpected times.
# It's best practice to do strict deserialization unless you're in a very special case.
def deserialize(self, value):
raise Exception("Don't deserialize into a Lazy value. Try just using the value itself.")
# This class is the equivalent of dbtClassMixin that's used for serialization
# in other parts of the code. That class did extra things which we didn't want
# to use for events, so this class is a simpler version of dbtClassMixin.
class EventSerialization(DataClassDictMixin):
# This is where we register serialization strategies per type.
class Config(MashBaseConfig):
serialization_strategy = {
Exception: ExceptionSerialization(),
BaseException: ExceptionSerialization(),
Lazy[Dict[str, List[str]]]: LazySerialization1()
}


@@ -1,44 +0,0 @@
from typing import (
Any,
NamedTuple,
Optional,
)
# N.B.:
# These stubs were autogenerated by stubgen and then hacked
# to pieces to ensure we had something other than "Any" types
# where using external classes to instantiate event subclasses
# in events/types.py.
#
# This goes away when we turn mypy on for everything.
#
# Don't trust them too much at all!
class _ReferenceKey(NamedTuple):
database: Any
schema: Any
identifier: Any
class _CachedRelation:
referenced_by: Any
inner: Any
class AdapterResponse:
code: Optional[str]
rows_affected: Optional[int]
class BaseRelation:
path: Any
type: Optional[Any]
quote_character: str
include_policy: Any
quote_policy: Any
dbt_created: bool
class InformationSchema(BaseRelation):
information_schema_view: Optional[str]


@@ -5,7 +5,7 @@ from .types import (
WarnLevel,
ErrorLevel,
ShowException,
Cli
NoFile
)
@@ -13,45 +13,59 @@ from .types import (
# Reuse the existing messages when adding logs to tests.
@dataclass
class IntegrationTestInfo(InfoLevel, Cli):
class IntegrationTestInfo(InfoLevel, NoFile):
msg: str
code: str = "T001"
def message(self) -> str:
return f"Integration Test: {self.msg}"
@dataclass
class IntegrationTestDebug(DebugLevel, Cli):
class IntegrationTestDebug(DebugLevel, NoFile):
msg: str
code: str = "T002"
def message(self) -> str:
return f"Integration Test: {self.msg}"
@dataclass
class IntegrationTestWarn(WarnLevel, Cli):
class IntegrationTestWarn(WarnLevel, NoFile):
msg: str
code: str = "T003"
def message(self) -> str:
return f"Integration Test: {self.msg}"
@dataclass
class IntegrationTestError(ErrorLevel, Cli):
class IntegrationTestError(ErrorLevel, NoFile):
msg: str
code: str = "T004"
def message(self) -> str:
return f"Integration Test: {self.msg}"
@dataclass
class IntegrationTestException(ShowException, ErrorLevel, Cli):
class IntegrationTestException(ShowException, ErrorLevel, NoFile):
msg: str
code: str = "T005"
def message(self) -> str:
return f"Integration Test: {self.msg}"
@dataclass
class UnitTestInfo(InfoLevel, NoFile):
msg: str
code: str = "T006"
def message(self) -> str:
return f"Unit Test: {self.msg}"
# since mypy doesn't run on every file we need to suggest to mypy that every
# class gets instantiated. But we don't actually want to run this code.
# making the conditional `if False` causes mypy to skip it as dead code so
@@ -64,3 +78,4 @@ if 1 == 0:
IntegrationTestWarn(msg='')
IntegrationTestError(msg='')
IntegrationTestException(msg='')
UnitTestInfo(msg='')

File diff suppressed because it is too large


@@ -2,8 +2,7 @@ import builtins
import functools
from typing import NoReturn, Optional, Mapping, Any
from dbt.logger import get_secret_env
from dbt.events.functions import fire_event
from dbt.events.functions import fire_event, scrub_secrets, env_secrets
from dbt.events.types import GeneralWarningMsg, GeneralWarningException
from dbt.node_types import NodeType
from dbt import flags
@@ -54,7 +53,7 @@ class RuntimeException(RuntimeError, Exception):
def __init__(self, msg, node=None):
self.stack = []
self.node = node
self.msg = msg
self.msg = scrub_secrets(msg, env_secrets())
def add_node(self, node=None):
if node is not None and node is not self.node:
@@ -401,8 +400,6 @@ class CommandError(RuntimeException):
super().__init__(message)
self.cwd = cwd
self.cmd = cmd
for secret in get_secret_env():
self.cmd = str(self.cmd).replace(secret, "*****")
self.args = (cwd, cmd, message)
def __str__(self):
@@ -466,7 +463,29 @@ def raise_database_error(msg, node=None) -> NoReturn:
def raise_dependency_error(msg) -> NoReturn:
raise DependencyException(msg)
raise DependencyException(scrub_secrets(msg, env_secrets()))
def raise_git_cloning_error(error: CommandResultError) -> NoReturn:
error.cmd = scrub_secrets(str(error.cmd), env_secrets())
raise error
def raise_git_cloning_problem(repo) -> NoReturn:
repo = scrub_secrets(repo, env_secrets())
msg = '''\
Something went wrong while cloning {}
Check the debug logs for more information
'''
raise RuntimeException(msg.format(repo))
def disallow_secret_env_var(env_var_name) -> NoReturn:
"""Raise an error when a secret env var is referenced outside allowed
rendering contexts"""
msg = ("Secret env vars are allowed only in profiles.yml or packages.yml. "
"Found '{env_var_name}' referenced elsewhere.")
raise_parsing_error(msg.format(env_var_name=env_var_name))
def invalid_type_error(method_name, arg_name, got_value, expected_type,
@@ -684,9 +703,9 @@ def missing_materialization(model, adapter_type):
def bad_package_spec(repo, spec, error_message):
raise InternalException(
"Error checking out spec='{}' for repo {}\n{}".format(
spec, repo, error_message))
msg = "Error checking out spec='{}' for repo {}\n{}".format(spec, repo, error_message)
raise InternalException(scrub_secrets(msg, env_secrets()))
def raise_cache_inconsistent(message):
@@ -755,6 +774,10 @@ def system_error(operation_name):
class ConnectionException(Exception):
"""
There was a problem with the connection that returned a bad response,
timed out, or resulted in a file that is corrupt.
"""
pass
@@ -991,7 +1014,7 @@ def raise_duplicate_alias(
def warn_or_error(msg, node=None, log_fmt=None):
if flags.WARN_ERROR:
raise_compiler_error(msg, node)
raise_compiler_error(scrub_secrets(msg, env_secrets()), node)
else:
fire_event(GeneralWarningMsg(msg=msg, log_fmt=log_fmt))


@@ -33,6 +33,8 @@ SEND_ANONYMOUS_USAGE_STATS = None
PRINTER_WIDTH = 80
WHICH = None
INDIRECT_SELECTION = None
LOG_CACHE_EVENTS = None
EVENT_BUFFER_SIZE = 100000
# Global CLI defaults. These flags are set from three places:
# CLI args, environment variables, and user_config (profiles.yml).
@@ -51,7 +53,9 @@ flag_defaults = {
"FAIL_FAST": False,
"SEND_ANONYMOUS_USAGE_STATS": True,
"PRINTER_WIDTH": 80,
"INDIRECT_SELECTION": 'eager'
"INDIRECT_SELECTION": 'eager',
"LOG_CACHE_EVENTS": False,
"EVENT_BUFFER_SIZE": 100000
}
@@ -99,7 +103,7 @@ def set_from_args(args, user_config):
USE_EXPERIMENTAL_PARSER, STATIC_PARSER, WRITE_JSON, PARTIAL_PARSE, \
USE_COLORS, STORE_FAILURES, PROFILES_DIR, DEBUG, LOG_FORMAT, INDIRECT_SELECTION, \
VERSION_CHECK, FAIL_FAST, SEND_ANONYMOUS_USAGE_STATS, PRINTER_WIDTH, \
WHICH
WHICH, LOG_CACHE_EVENTS, EVENT_BUFFER_SIZE
STRICT_MODE = False # backwards compatibility
# cli args without user_config or env var option
@@ -122,6 +126,8 @@ def set_from_args(args, user_config):
SEND_ANONYMOUS_USAGE_STATS = get_flag_value('SEND_ANONYMOUS_USAGE_STATS', args, user_config)
PRINTER_WIDTH = get_flag_value('PRINTER_WIDTH', args, user_config)
INDIRECT_SELECTION = get_flag_value('INDIRECT_SELECTION', args, user_config)
LOG_CACHE_EVENTS = get_flag_value('LOG_CACHE_EVENTS', args, user_config)
EVENT_BUFFER_SIZE = get_flag_value('EVENT_BUFFER_SIZE', args, user_config)
def get_flag_value(flag, args, user_config):
@@ -134,7 +140,13 @@ def get_flag_value(flag, args, user_config):
if env_value is not None and env_value != '':
env_value = env_value.lower()
# non Boolean values
if flag in ['LOG_FORMAT', 'PRINTER_WIDTH', 'PROFILES_DIR', 'INDIRECT_SELECTION']:
if flag in [
'LOG_FORMAT',
'PRINTER_WIDTH',
'PROFILES_DIR',
'INDIRECT_SELECTION',
'EVENT_BUFFER_SIZE'
]:
flag_value = env_value
else:
flag_value = env_set_bool(env_value)
@@ -142,7 +154,7 @@ def get_flag_value(flag, args, user_config):
flag_value = getattr(user_config, lc_flag)
else:
flag_value = flag_defaults[flag]
if flag == 'PRINTER_WIDTH': # printer_width must be an int or it hangs
if flag in ['PRINTER_WIDTH', 'EVENT_BUFFER_SIZE']: # must be ints
flag_value = int(flag_value)
if flag == 'PROFILES_DIR':
flag_value = os.path.abspath(flag_value)
@@ -165,5 +177,7 @@ def get_flag_dict():
"fail_fast": FAIL_FAST,
"send_anonymous_usage_stats": SEND_ANONYMOUS_USAGE_STATS,
"printer_width": PRINTER_WIDTH,
"indirect_selection": INDIRECT_SELECTION
"indirect_selection": INDIRECT_SELECTION,
"log_cache_events": LOG_CACHE_EVENTS,
"event_buffer_size": EVENT_BUFFER_SIZE
}
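
The lookup order described in the comments above (CLI args, then environment variables, then user_config, then defaults) can be sketched as follows. This is a simplified illustration, not the dbt function: `resolve_flag`, the `DBT_` environment prefix, the explicit `env` mapping, and the set of truthy strings are all assumptions, since only part of `get_flag_value` is visible in this diff.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional

FLAG_DEFAULTS = {"LOG_CACHE_EVENTS": False, "EVENT_BUFFER_SIZE": 100000}
NON_BOOLEAN_FLAGS = {"EVENT_BUFFER_SIZE"}
INT_FLAGS = {"EVENT_BUFFER_SIZE"}


def resolve_flag(
    flag: str,
    args: Any,
    user_config: Optional[Any],
    env: Mapping[str, str],
) -> Any:
    lc_flag = flag.lower()
    value = getattr(args, lc_flag, None)  # 1. CLI argument wins
    if value is None:
        env_value = env.get(f"DBT_{flag}")  # 2. env var (prefix is an assumption)
        if env_value:
            env_value = env_value.lower()
            if flag in NON_BOOLEAN_FLAGS:
                value = env_value
            else:
                value = env_value in ("1", "t", "true", "y", "yes")
        elif user_config is not None and getattr(user_config, lc_flag, None) is not None:
            value = getattr(user_config, lc_flag)  # 3. user_config
        else:
            value = FLAG_DEFAULTS[flag]  # 4. built-in default
    if flag in INT_FLAGS:
        # CLI and env values arrive as strings, so coerce explicitly.
        value = int(value)
    return value


from_env = resolve_flag(
    "EVENT_BUFFER_SIZE", SimpleNamespace(), None, {"DBT_EVENT_BUFFER_SIZE": "500"}
)
from_cli = resolve_flag(
    "EVENT_BUFFER_SIZE", SimpleNamespace(event_buffer_size="42"), None, {}
)
```

The integer coercion matters because the `--event-buffer-size` argument is declared without a `type`, so argparse hands it over as a string.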

core/dbt/graph/README.md Normal file

@@ -0,0 +1 @@
# Graph README


@@ -31,11 +31,13 @@ class Graph:
"""Returns all nodes having a path to `node` in `graph`"""
if not self.graph.has_node(node):
raise InternalException(f'Node {node} not found in the graph!')
with nx.utils.reversed(self.graph):
anc = nx.single_source_shortest_path_length(G=self.graph,
source=node,
cutoff=max_depth)\
.keys()
# This used to use nx.utils.reversed(self.graph), but that is deprecated,
# so changing to use self.graph.reverse(copy=False) as recommended
G = self.graph.reverse(copy=False) if self.graph.is_directed() else self.graph
anc = nx.single_source_shortest_path_length(G=G,
source=node,
cutoff=max_depth)\
.keys()
return anc - {node}
def descendants(
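
The idea behind the `ancestors` change, walking the reversed graph outward from a node with an optional depth cutoff, can be shown without networkx. The sketch below is a plain-Python stand-in for `reverse(copy=False)` plus `single_source_shortest_path_length`. The adjacency-dict input and the function name `ancestors_sketch` are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def ancestors_sketch(
    edges: Dict[str, List[str]], node: str, max_depth: Optional[int] = None
) -> Set[str]:
    # Build the reversed adjacency: child -> list of parents.
    parents: Dict[str, List[str]] = {}
    for src, dsts in edges.items():
        parents.setdefault(src, [])
        for dst in dsts:
            parents.setdefault(dst, []).append(src)
    if node not in parents:
        raise KeyError(f"Node {node} not found in the graph!")

    # Breadth-first search records the shortest distance to each ancestor.
    depth = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if max_depth is not None and depth[current] >= max_depth:
            continue  # do not expand beyond the cutoff
        for parent in parents[current]:
            if parent not in depth:
                depth[parent] = depth[current] + 1
                queue.append(parent)
    return set(depth) - {node}


dag = {"raw": ["stg"], "stg": ["mart"], "mart": []}
all_ancestors = ancestors_sketch(dag, "mart")
direct_parents = ancestors_sketch(dag, "mart", max_depth=1)
```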


@@ -86,8 +86,9 @@ class NodeSelector(MethodManager):
try:
collected = self.select_included(nodes, spec)
except InvalidSelectorException:
valid_selectors = ", ".join(self.SELECTOR_METHODS)
fire_event(SelectorReportInvalidSelector(
selector_methods=self.SELECTOR_METHODS,
valid_selectors=valid_selectors,
spec_method=spec.method,
raw_spec=spec.raw
))


@@ -1,7 +1,7 @@
import abc
from itertools import chain
from pathlib import Path
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional, Callable
from dbt.dataclass_schema import StrEnum
@@ -478,42 +478,28 @@ class StateSelectorMethod(SelectorMethod):
previous_macros = []
return self.recursively_check_macros_modified(node, previous_macros)
def check_modified(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
# TODO check_modified_content and check_modified_macros seem a bit redundant
def check_modified_content(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
different_contents = not new.same_contents(old) # type: ignore
upstream_macro_change = self.check_macros_modified(new)
return different_contents or upstream_macro_change
def check_modified_body(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, "same_body"):
return not new.same_body(old) # type: ignore
else:
return False
def check_modified_configs(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, "same_config"):
return not new.same_config(old) # type: ignore
else:
return False
def check_modified_persisted_descriptions(
self, old: Optional[SelectorTarget], new: SelectorTarget
) -> bool:
if hasattr(new, "same_persisted_description"):
return not new.same_persisted_description(old) # type: ignore
else:
return False
def check_modified_relation(
self, old: Optional[SelectorTarget], new: SelectorTarget
) -> bool:
if hasattr(new, "same_database_representation"):
return not new.same_database_representation(old) # type: ignore
else:
return False
def check_modified_macros(self, _, new: SelectorTarget) -> bool:
return self.check_macros_modified(new)
@staticmethod
def check_modified_factory(
compare_method: str
) -> Callable[[Optional[SelectorTarget], SelectorTarget], bool]:
# get a function that compares two selector targets based on the compare method provided
def check_modified_things(old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, compare_method):
# when old body does not exist or old and new are not the same
return not old or not getattr(new, compare_method)(old) # type: ignore
else:
return False
return check_modified_things
def check_new(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
return old is None
@@ -527,14 +513,21 @@ class StateSelectorMethod(SelectorMethod):
state_checks = {
# it's new if there is no old version
'new': lambda old, _: old is None,
'new':
lambda old, _: old is None,
# use methods defined above to compare properties of old + new
'modified': self.check_modified,
'modified.body': self.check_modified_body,
'modified.configs': self.check_modified_configs,
'modified.persisted_descriptions': self.check_modified_persisted_descriptions,
'modified.relation': self.check_modified_relation,
'modified.macros': self.check_modified_macros,
'modified':
self.check_modified_content,
'modified.body':
self.check_modified_factory('same_body'),
'modified.configs':
self.check_modified_factory('same_config'),
'modified.persisted_descriptions':
self.check_modified_factory('same_persisted_description'),
'modified.relation':
self.check_modified_factory('same_database_representation'),
'modified.macros':
self.check_modified_macros,
}
if selector in state_checks:
checker = state_checks[selector]
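
The factory introduced above replaces four near-identical methods with one closure over the comparison method's name. A minimal standalone sketch follows, with a hypothetical `SketchNode` class standing in for a selector target.

```python
from typing import Callable, Optional


class SketchNode:
    # Hypothetical target that only knows how to compare bodies.
    def __init__(self, body: str):
        self.body = body

    def same_body(self, other: "SketchNode") -> bool:
        return self.body == other.body


def check_modified_factory(
    compare_method: str,
) -> Callable[[Optional[SketchNode], SketchNode], bool]:
    def check_modified_things(old: Optional[SketchNode], new: SketchNode) -> bool:
        if hasattr(new, compare_method):
            # Modified when there is no old version, or the comparison fails.
            return not old or not getattr(new, compare_method)(old)
        # Targets without the method are never reported as modified.
        return False

    return check_modified_things


body_changed = check_modified_factory("same_body")
config_changed = check_modified_factory("same_config")
```

Note the `not old` guard: unlike the removed per-property methods, a target with no previous version now counts as modified instead of being passed to the comparison method as `None`.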


@@ -149,7 +149,7 @@ class SelectionCriteria:
method_name, method_arguments = cls.parse_method(dct)
meth_name = str(method_name)
if method_arguments:
meth_name = meth_name + '.' + '.'.join(method_arguments)
meth_name += '.' + '.'.join(method_arguments)
dct['method'] = meth_name
dct = {k: v for k, v in dct.items() if (v is not None and v != '')}
if 'childrens_parents' in dct:


@@ -1,4 +1,8 @@
# never name this package "types", or mypy will crash in ugly ways
# necessary for annotating constructors
from __future__ import annotations
from dataclasses import dataclass
from datetime import timedelta
from pathlib import Path
@@ -9,6 +13,7 @@ from dbt.dataclass_schema import (
)
from hologram import FieldEncoder, JsonDict
from mashumaro.types import SerializableType
from typing import Callable, cast, Generic, Optional, TypeVar
class Port(int, SerializableType):
@@ -93,3 +98,35 @@ dbtClassMixin.register_field_encoders({
FQNPath = Tuple[str, ...]
PathSet = AbstractSet[FQNPath]
T = TypeVar('T')
# A data type for representing lazily evaluated values.
#
# usage:
# x = Lazy.defer(lambda: expensive_fn())
# y = x.force()
#
# inspired by the purescript data type
# https://pursuit.purescript.org/packages/purescript-lazy/5.0.0/docs/Data.Lazy
@dataclass
class Lazy(Generic[T]):
_f: Callable[[], T]
memo: Optional[T] = None
# constructor for lazy values
@classmethod
def defer(cls, f: Callable[[], T]) -> Lazy[T]:
return Lazy(f)
# workaround for open mypy issue:
# https://github.com/python/mypy/issues/6910
def _typed_eval_f(self) -> T:
return cast(Callable[[], T], getattr(self, "_f"))()
# evaluates the function if the value has not been memoized already
def force(self) -> T:
if self.memo is None:
self.memo = self._typed_eval_f()
return self.memo
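
A short usage sketch of the `Lazy` type added above. The class is re-declared here in simplified form (without the mypy workaround) so the example runs on its own. `expensive` and `calls` are hypothetical names used to show that the deferred function runs only once.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Lazy(Generic[T]):
    _f: Callable[[], T]
    memo: Optional[T] = None

    @classmethod
    def defer(cls, f: Callable[[], T]) -> "Lazy[T]":
        return Lazy(f)

    def force(self) -> T:
        # Evaluate on first use, then reuse the memoized value.
        if self.memo is None:
            self.memo = self._f()
        return self.memo


calls: List[int] = []


def expensive() -> Dict[str, List[str]]:
    calls.append(1)  # track how often the function really runs
    return {"model_a": ["model_b"]}


x = Lazy.defer(expensive)
calls_before_force = len(calls)  # nothing has run yet
first = x.force()
second = x.force()  # served from memo, expensive() is not called again
```

One consequence of using `None` as the "not yet evaluated" marker is that a deferred function returning `None` is re-run on every `force()` call.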


@@ -0,0 +1 @@
# Include README


@@ -35,7 +35,7 @@ Note that you can also right-click on models to interactively filter and explore
### More information
- [What is dbt](https://docs.getdbt.com/docs/overview)?
- [What is dbt](https://docs.getdbt.com/docs/introduction)?
- Read the [dbt viewpoint](https://docs.getdbt.com/docs/viewpoint)
- [Installation](https://docs.getdbt.com/docs/installation)
- Join the [dbt Community](https://www.getdbt.com/community/) for questions and discussion

File diff suppressed because one or more lines are too long


@@ -1,16 +1,18 @@
# TODO: this file is one big TODO
from dbt.exceptions import RuntimeException
import os
from dbt.exceptions import RuntimeException
from dbt import flags
from collections import namedtuple
RuntimeArgs = namedtuple(
'RuntimeArgs', 'project_dir profiles_dir single_threaded'
'RuntimeArgs', 'project_dir profiles_dir single_threaded profile_name'
)
def get_dbt_config(project_dir, single_threaded=False):
from dbt.config.runtime import RuntimeConfig
import dbt.adapters.factory
import dbt.events.functions
if os.getenv('DBT_PROFILES_DIR'):
profiles_dir = os.getenv('DBT_PROFILES_DIR')
@@ -19,13 +21,16 @@ def get_dbt_config(project_dir, single_threaded=False):
# Construct a phony config
config = RuntimeConfig.from_args(RuntimeArgs(
project_dir, profiles_dir, single_threaded
project_dir, profiles_dir, single_threaded, 'user'
))
# Clear previously registered adapters--
# this fixes caching behavior on the dbt-server
flags.set_from_args('', config)
dbt.adapters.factory.reset_adapters()
# Load the relevant adapter
dbt.adapters.factory.register_adapter(config)
# Set invocation id
dbt.events.functions.set_invocation_id()
return config
@@ -34,11 +39,26 @@ def get_task_by_type(type):
# TODO: we need to tell dbt-server what tasks are available
from dbt.task.run import RunTask
from dbt.task.list import ListTask
from dbt.task.seed import SeedTask
from dbt.task.test import TestTask
from dbt.task.build import BuildTask
from dbt.task.snapshot import SnapshotTask
from dbt.task.run_operation import RunOperationTask
if type == 'run':
return RunTask
elif type == 'test':
return TestTask
elif type == 'list':
return ListTask
elif type == 'seed':
return SeedTask
elif type == 'build':
return BuildTask
elif type == 'snapshot':
return SnapshotTask
elif type == 'run_operation':
return RunOperationTask
raise RuntimeException('not a valid task')


@@ -424,7 +424,7 @@ class DelayedFileHandler(logbook.RotatingFileHandler, FormatterMixin):
return
make_log_dir_if_missing(log_dir)
log_path = os.path.join(log_dir, 'dbt.log.old') # TODO hack for now
log_path = os.path.join(log_dir, 'dbt.log.legacy') # TODO hack for now
self._super_init(log_path)
self._replay_buffered()
self._log_path = log_path


@@ -36,7 +36,7 @@ from dbt.adapters.factory import reset_adapters, cleanup_connections
import dbt.tracking
from dbt.utils import ExitCodes
from dbt.utils import ExitCodes, args_to_dict
from dbt.config.profile import DEFAULT_PROFILES_DIR, read_user_config
from dbt.exceptions import (
InternalException,
@@ -140,7 +140,7 @@ def main(args=None):
exit_code = e.code
except BaseException as e:
fire_event(MainEncounteredError(e=e))
fire_event(MainEncounteredError(e=str(e)))
fire_event(MainStackTrace(stack_trace=traceback.format_exc()))
exit_code = ExitCodes.UnhandledError.value
@@ -205,7 +205,7 @@ def track_run(task):
)
except (NotImplementedException,
FailedToConnectException) as e:
fire_event(MainEncounteredError(e=e))
fire_event(MainEncounteredError(e=str(e)))
dbt.tracking.track_invocation_end(
config=task.config, args=task.args, result_type="error"
)
@@ -221,25 +221,24 @@ def track_run(task):
def run_from_args(parsed):
log_cache_events(getattr(parsed, 'log_cache_events', False))
# we can now use the logger for stdout
# set log_format in the logger
parsed.cls.pre_init_hook(parsed)
fire_event(MainReportVersion(v=dbt.version.installed))
# this will convert DbtConfigErrors into RuntimeExceptions
# task could be any one of the task objects
task = parsed.cls.from_args(args=parsed)
fire_event(MainReportArgs(args=parsed))
# Set up logging
log_path = None
if task.config is not None:
log_path = getattr(task.config, 'log_path', None)
# we can finally set the file logger up
log_manager.set_path(log_path)
setup_event_logger(log_path or 'logs')
# if 'list' task: set stdout to WARN instead of INFO
level_override = parsed.cls.pre_init_hook(parsed)
setup_event_logger(log_path or 'logs', level_override)
fire_event(MainReportVersion(v=str(dbt.version.installed)))
fire_event(MainReportArgs(args=args_to_dict(parsed)))
if dbt.tracking.active_user is not None: # mypy appeasement, always true
fire_event(MainTrackingUserState(dbt.tracking.active_user.state()))
fire_event(MainTrackingUserState(user_state=dbt.tracking.active_user.state()))
results = None
@@ -1077,6 +1076,14 @@ def parse_args(args, cls=DBTArgumentParser):
'''
)
p.add_argument(
'--event-buffer-size',
dest='event_buffer_size',
help='''
Sets the max number of events to buffer in EVENT_HISTORY
'''
)
subs = p.add_subparsers(title="Available sub-commands")
base_subparser = _build_base_subparser()
@@ -1109,7 +1116,7 @@ def parse_args(args, cls=DBTArgumentParser):
_add_selection_arguments(
run_sub, compile_sub, generate_sub, test_sub, snapshot_sub, seed_sub)
# --defer
_add_defer_argument(run_sub, test_sub, build_sub)
_add_defer_argument(run_sub, test_sub, build_sub, snapshot_sub)
# --full-refresh
_add_table_mutability_arguments(run_sub, compile_sub, build_sub)


@@ -0,0 +1 @@
# Parser README


@@ -23,7 +23,7 @@ from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import HasUniqueID, ManifestNodes
from dbt.contracts.graph.unparsed import UnparsedNode
from dbt.exceptions import (
CompilationException, validator_error_message, InternalException
ParsingException, validator_error_message, InternalException
)
from dbt import hooks
from dbt.node_types import NodeType
@@ -247,7 +247,7 @@ class ConfiguredParser(
original_file_path=block.path.original_file_path,
raw_sql=block.contents,
)
raise CompilationException(msg, node=node)
raise ParsingException(msg, node=node)
def _context_for(
self, parsed_node: IntermediateNode, config: ContextConfig
@@ -378,7 +378,7 @@ class ConfiguredParser(
except ValidationError as exc:
# we got a ValidationError - probably bad types in config()
msg = validator_error_message(exc)
raise CompilationException(msg, node=node) from exc
raise ParsingException(msg, node=node) from exc
def add_result_node(self, block: FileBlock, node: ManifestNodes):
if node.config.enabled:


@@ -2,7 +2,7 @@ from typing import Iterable, List
import jinja2
from dbt.exceptions import CompilationException
from dbt.exceptions import ParsingException
from dbt.clients import jinja
from dbt.contracts.graph.parsed import ParsedGenericTestNode
from dbt.contracts.graph.unparsed import UnparsedMacro
@@ -55,14 +55,14 @@ class GenericTestParser(BaseParser[ParsedGenericTestNode]):
)
if isinstance(t, jinja.BlockTag)
]
except CompilationException as exc:
except ParsingException as exc:
exc.add_node(base_node)
raise
for block in blocks:
try:
ast = jinja.parse(block.full_block)
except CompilationException as e:
except ParsingException as e:
e.add_node(base_node)
raise
@@ -72,7 +72,7 @@ class GenericTestParser(BaseParser[ParsedGenericTestNode]):
if len(generic_test_nodes) != 1:
# things have gone disastrously wrong, we thought we only
# parsed one block!
raise CompilationException(
raise ParsingException(
f'Found multiple generic tests in {block.full_block}, expected 1',
node=base_node
)


@@ -6,7 +6,7 @@ from dbt.clients import jinja
from dbt.contracts.graph.unparsed import UnparsedMacro
from dbt.contracts.graph.parsed import ParsedMacro
from dbt.contracts.files import FilePath, SourceFile
from dbt.exceptions import CompilationException
from dbt.exceptions import ParsingException
from dbt.events.functions import fire_event
from dbt.events.types import MacroFileParse
from dbt.node_types import NodeType
@@ -62,14 +62,14 @@ class MacroParser(BaseParser[ParsedMacro]):
)
if isinstance(t, jinja.BlockTag)
]
except CompilationException as exc:
except ParsingException as exc:
exc.add_node(base_node)
raise
for block in blocks:
try:
ast = jinja.parse(block.full_block)
except CompilationException as e:
except ParsingException as e:
e.add_node(base_node)
raise
@@ -78,7 +78,7 @@ class MacroParser(BaseParser[ParsedMacro]):
if len(macro_nodes) != 1:
# things have gone disastrously wrong, we thought we only
# parsed one block!
raise CompilationException(
raise ParsingException(
f'Found multiple macros in {block.full_block}, expected 1',
node=base_node
)

@@ -19,7 +19,7 @@ from dbt.adapters.factory import (
get_adapter_package_names,
)
from dbt.helper_types import PathSet
from dbt.events.functions import fire_event
from dbt.events.functions import fire_event, get_invocation_id
from dbt.events.types import (
PartialParsingFullReparseBecauseOfError, PartialParsingExceptionFile, PartialParsingFile,
PartialParsingException, PartialParsingSkipParsing, PartialParsingMacroChangeStartFullParse,
@@ -195,7 +195,7 @@ class ManifestLoader:
start_load_all = time.perf_counter()
projects = config.load_dependencies()
loader = ManifestLoader(config, projects, macro_hook)
loader = cls(config, projects, macro_hook)
manifest = loader.load()
@@ -246,7 +246,7 @@ class ManifestLoader:
project_parser_files = self.partial_parser.get_parsing_files()
self.partially_parsing = True
self.manifest = self.saved_manifest
except Exception:
except Exception as exc:
# pp_files should still be the full set and manifest is new manifest,
# since get_parsing_files failed
fire_event(PartialParsingFullReparseBecauseOfError())
@@ -284,6 +284,9 @@ class ManifestLoader:
exc_info['full_reparse_reason'] = ReparseReason.exception
dbt.tracking.track_partial_parser(exc_info)
if os.environ.get('DBT_PP_TEST'):
raise exc
if self.manifest._parsing_info is None:
self.manifest._parsing_info = ParsingInfo()
@@ -398,7 +401,7 @@ class ManifestLoader:
block = FileBlock(self.manifest.files[file_id])
parser.parse_file(block)
# increment parsed path count for performance tracking
self._perf_info.parsed_path_count = self._perf_info.parsed_path_count + 1
self._perf_info.parsed_path_count += 1
# generic tests historically lived in the macros directory but can now be nested
# in a /generic directory under /tests so we want to process them here as well
if 'GenericTestParser' in parser_files:
@@ -407,7 +410,7 @@ class ManifestLoader:
block = FileBlock(self.manifest.files[file_id])
parser.parse_file(block)
# increment parsed path count for performance tracking
self._perf_info.parsed_path_count = self._perf_info.parsed_path_count + 1
self._perf_info.parsed_path_count += 1
self.build_macro_resolver()
# Look at changed macros and update the macro.depends_on.macros
@@ -450,7 +453,7 @@ class ManifestLoader:
parser.parse_file(block, dct=dct)
else:
parser.parse_file(block)
project_parsed_path_count = project_parsed_path_count + 1
project_parsed_path_count += 1
# Save timing info
project_loader_info.parsers.append(ParserInfo(
@@ -458,7 +461,7 @@ class ManifestLoader:
parsed_path_count=project_parsed_path_count,
elapsed=time.perf_counter() - parser_start_timer
))
total_parsed_path_count = total_parsed_path_count + project_parsed_path_count
total_parsed_path_count += project_parsed_path_count
# HookParser doesn't run from loaded files, just dbt_project.yml,
# so do separately
@@ -478,7 +481,7 @@ class ManifestLoader:
project_loader_info.parsed_path_count = (
project_loader_info.parsed_path_count + total_parsed_path_count
)
project_loader_info.elapsed = project_loader_info.elapsed + elapsed
project_loader_info.elapsed += elapsed
self._perf_info.parsed_path_count = (
self._perf_info.parsed_path_count + total_parsed_path_count
)
@@ -629,10 +632,7 @@ class ManifestLoader:
# We don't want to have stale generated_at dates
manifest.metadata.generated_at = datetime.utcnow()
# or invocation_ids
if dbt.tracking.active_user:
manifest.metadata.invocation_id = dbt.tracking.active_user.invocation_id
else:
manifest.metadata.invocation_id = None
manifest.metadata.invocation_id = get_invocation_id()
return manifest
except Exception as exc:
fire_event(ParsedFileLoadFailed(path=path, exc=exc))
@@ -690,7 +690,7 @@ class ManifestLoader:
key_list.sort()
env_var_str = ''
for key in key_list:
env_var_str = env_var_str + f'{key}:{config.project_env_vars[key]}|'
env_var_str += f'{key}:{config.project_env_vars[key]}|'
project_env_vars_hash = FileHash.from_contents(env_var_str)
# Create a FileHash of the env_vars in the project
@@ -698,7 +698,7 @@ class ManifestLoader:
key_list.sort()
env_var_str = ''
for key in key_list:
env_var_str = env_var_str + f'{key}:{config.profile_env_vars[key]}|'
env_var_str += f'{key}:{config.profile_env_vars[key]}|'
profile_env_vars_hash = FileHash.from_contents(env_var_str)
# Create a FileHash of the profile file
@@ -772,7 +772,7 @@ class ManifestLoader:
# Create tracking event for saving performance info
def track_project_load(self):
invocation_id = dbt.tracking.active_user.invocation_id
invocation_id = get_invocation_id()
dbt.tracking.track_project_load({
"invocation_id": invocation_id,
"project_id": self.root_project.hashed_name(),
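
The env-var hashing hunks above build a string from sorted keys and then hash it. A minimal sketch of that idea follows, assuming SHA-256 as a stand-in for `FileHash.from_contents`; the function name `env_var_fingerprint` is hypothetical and is not part of dbt.

```python
import hashlib
from typing import Mapping


def env_var_fingerprint(env_vars: Mapping[str, str]) -> str:
    """Return a digest that does not depend on dict insertion order."""
    # Sorting the keys means the same variables always serialise identically.
    env_var_str = ""
    for key in sorted(env_vars):
        env_var_str += f"{key}:{env_vars[key]}|"
    # Assumption: SHA-256 stands in for FileHash.from_contents here.
    return hashlib.sha256(env_var_str.encode("utf-8")).hexdigest()
```

Because the keys are sorted before concatenation, two runs with the same variables produce the same digest, so a change in the digest should only signal a change in a key or value.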

@@ -50,7 +50,7 @@ class ModelParser(SimpleSQLParser[ParsedModelNode]):
# not when the experimental parser flag is on.
exp_sample: bool = False
# sampling the stable static parser against jinja is significantly
# more expensive and therefor done far less frequently.
# more expensive and therefore done far less frequently.
stable_sample: bool = False
# there are two samples above, and it is perfectly fine if both happen
# at the same time. If that happens, the experimental parser, stable
@@ -148,20 +148,24 @@ class ModelParser(SimpleSQLParser[ParsedModelNode]):
)
self.manifest._parsing_info.static_analysis_parsed_path_count += 1
# if the static parser failed, add the correct messages for tracking
elif isinstance(statically_parsed, str):
if statically_parsed == "cannot_parse":
result += ["01_stable_parser_cannot_parse"]
elif statically_parsed == "has_banned_macro":
result += ["08_has_banned_macro"]
super().render_update(node, config)
fire_event(StaticParserFallbackJinjaRendering(path=node.path))
# if the static parser didn't succeed, fall back to jinja
else:
# jinja rendering
super().render_update(node, config)
fire_event(StaticParserFallbackJinjaRendering(path=node.path))
# if sampling, add the correct messages for tracking
if exp_sample and isinstance(experimental_sample, str):
if experimental_sample == "cannot_parse":
result += ["01_experimental_parser_cannot_parse"]
elif experimental_sample == "has_banned_macro":
result += ["08_has_banned_macro"]
elif stable_sample and isinstance(statically_parsed, str):
if statically_parsed == "cannot_parse":
result += ["81_stable_parser_cannot_parse"]
elif statically_parsed == "has_banned_macro":
result += ["88_has_banned_macro"]
# only send the tracking event if there is at least one result code
if result:
# fire a tracking event. this fires one event for every sample

@@ -315,7 +315,7 @@ class PartialParsing:
if node.patch_path:
file_id = node.patch_path
# it might be changed... then what?
if file_id not in self.file_diff['deleted']:
if file_id not in self.file_diff['deleted'] and file_id in self.saved_files:
# schema_files should already be updated
schema_file = self.saved_files[file_id]
dict_key = parse_file_type_to_key[source_file.parse_file_type]
@@ -375,7 +375,7 @@ class PartialParsing:
for unique_id in unique_ids:
if unique_id in self.saved_manifest.nodes:
node = self.saved_manifest.nodes[unique_id]
if node.resource_type == NodeType.Test:
if node.resource_type == NodeType.Test and node.test_node_type == 'generic':
# test nodes are handled separately. Must be removed from schema file
continue
file_id = node.file_id
@@ -435,7 +435,9 @@ class PartialParsing:
self.check_for_special_deleted_macros(source_file)
self.handle_macro_file_links(source_file, follow_references)
file_id = source_file.file_id
self.deleted_manifest.files[file_id] = self.saved_files.pop(file_id)
# It's not clear when this file_id would not exist in saved_files
if file_id in self.saved_files:
self.deleted_manifest.files[file_id] = self.saved_files.pop(file_id)
def check_for_special_deleted_macros(self, source_file):
for unique_id in source_file.macros:
@@ -498,7 +500,9 @@ class PartialParsing:
for unique_id in unique_ids:
if unique_id in self.saved_manifest.nodes:
node = self.saved_manifest.nodes[unique_id]
if node.resource_type == NodeType.Test:
# Both generic tests from yaml files and singular tests have NodeType.Test
# so check for generic test.
if node.resource_type == NodeType.Test and node.test_node_type == 'generic':
schema_file_id = node.file_id
schema_file = self.saved_manifest.files[schema_file_id]
(key, name) = schema_file.get_key_and_name_for_test(node.unique_id)
@@ -670,8 +674,8 @@ class PartialParsing:
continue
elem = self.get_schema_element(new_yaml_dict[dict_key], name)
if elem:
self.delete_schema_macro_patch(schema_file, macro)
self.merge_patch(schema_file, dict_key, macro)
self.delete_schema_macro_patch(schema_file, elem)
self.merge_patch(schema_file, dict_key, elem)
# exposures
dict_key = 'exposures'
@@ -693,21 +697,31 @@ class PartialParsing:
continue
elem = self.get_schema_element(new_yaml_dict[dict_key], name)
if elem:
self.delete_schema_exposure(schema_file, exposure)
self.merge_patch(schema_file, dict_key, exposure)
self.delete_schema_exposure(schema_file, elem)
self.merge_patch(schema_file, dict_key, elem)
# metrics
dict_key = 'metrics'
metric_diff = self.get_diff_for('metrics', saved_yaml_dict, new_yaml_dict)
if metric_diff['changed']:
for metric in metric_diff['changed']:
self.delete_schema_metric(schema_file, metric)
self.merge_patch(schema_file, 'metrics', metric)
self.merge_patch(schema_file, dict_key, metric)
if metric_diff['deleted']:
for metric in metric_diff['deleted']:
self.delete_schema_metric(schema_file, metric)
if metric_diff['added']:
for metric in metric_diff['added']:
self.merge_patch(schema_file, 'metrics', metric)
self.merge_patch(schema_file, dict_key, metric)
# Handle schema file updates due to env_var changes
if dict_key in env_var_changes and dict_key in new_yaml_dict:
for name in env_var_changes[dict_key]:
if name in metric_diff['changed_or_deleted_names']:
continue
elem = self.get_schema_element(new_yaml_dict[dict_key], name)
if elem:
self.delete_schema_metric(schema_file, elem)
self.merge_patch(schema_file, dict_key, elem)
# Take a "section" of the schema file yaml dictionary from saved and new schema files
# and determine which parts have changed

@@ -5,7 +5,7 @@ from dbt.contracts.files import (
)
from dbt.parser.schemas import yaml_from_file, schema_file_keys, check_format_version
from dbt.exceptions import CompilationException
from dbt.exceptions import ParsingException
from dbt.parser.search import filesystem_search
from typing import Optional
@@ -54,17 +54,17 @@ def validate_yaml(file_path, dct):
if not isinstance(dct[key], list):
msg = (f"The schema file at {file_path} is "
f"invalid because the value of '{key}' is not a list")
raise CompilationException(msg)
raise ParsingException(msg)
for element in dct[key]:
if not isinstance(element, dict):
msg = (f"The schema file at {file_path} is "
f"invalid because a list element for '{key}' is not a dictionary")
raise CompilationException(msg)
raise ParsingException(msg)
if 'name' not in element:
msg = (f"The schema file at {file_path} is "
f"invalid because a list element for '{key}' does not have a "
"name attribute.")
raise CompilationException(msg)
raise ParsingException(msg)
# Special processing for big seed files

@@ -960,10 +960,9 @@ class MacroPatchParser(NonSourceParser[UnparsedMacroUpdate, ParsedMacroPatch]):
unique_id = f'macro.{patch.package_name}.{patch.name}'
macro = self.manifest.macros.get(unique_id)
if not macro:
warn_or_error(
f'WARNING: Found patch for macro "{patch.name}" '
f'which was not found'
)
msg = f'Found patch for macro "{patch.name}" ' \
f'which was not found'
warn_or_error(msg, log_fmt=warning_tag('{}'))
return
if macro.patch_path:
package_name, existing_file_path = macro.patch_path.split('://')

@@ -8,7 +8,7 @@ from dbt.clients.jinja import extract_toplevel_blocks, BlockTag
from dbt.clients.system import find_matching
from dbt.config import Project
from dbt.contracts.files import FilePath, AnySourceFile
from dbt.exceptions import CompilationException, InternalException
from dbt.exceptions import ParsingException, InternalException
# What's the point of wrapping a SourceFile with this class?
@@ -113,7 +113,7 @@ class BlockSearcher(Generic[BlockSearchResult], Iterable[BlockSearchResult]):
assert isinstance(block, BlockTag)
yield block
except CompilationException as exc:
except ParsingException as exc:
if exc.node is None:
exc.add_node(source_file)
raise

@@ -7,7 +7,7 @@ from dbt.contracts.graph.parsed import (
IntermediateSnapshotNode, ParsedSnapshotNode
)
from dbt.exceptions import (
CompilationException, validator_error_message
ParsingException, validator_error_message
)
from dbt.node_types import NodeType
from dbt.parser.base import SQLParser
@@ -68,7 +68,7 @@ class SnapshotParser(
self.set_snapshot_attributes(parsed_node)
return parsed_node
except ValidationError as exc:
raise CompilationException(validator_error_message(exc), node)
raise ParsingException(validator_error_message(exc), node)
def parse_file(self, file_block: FileBlock) -> None:
blocks = BlockSearcher(

@@ -1,5 +1,6 @@
from dataclasses import dataclass
import re
import warnings
from typing import List
from packaging import version as packaging_version
@@ -145,10 +146,13 @@ class VersionSpecifier(VersionSpecification):
return 1
if b is None:
return -1
if packaging_version.parse(a) > packaging_version.parse(b):
return 1
elif packaging_version.parse(a) < packaging_version.parse(b):
return -1
# This suppresses the LegacyVersion deprecation warning
with warnings.catch_warnings():
warnings.simplefilter("ignore", category=DeprecationWarning)
if packaging_version.parse(a) > packaging_version.parse(b):
return 1
elif packaging_version.parse(a) < packaging_version.parse(b):
return -1
equal = ((self.matcher == Matchers.GREATER_THAN_OR_EQUAL and
other.matcher == Matchers.LESS_THAN_OR_EQUAL) or
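
The hunk above wraps the version comparison in `warnings.catch_warnings()` so the deprecation warning is silenced only for that block. A small sketch of the same pattern, where `legacy_parse` is a hypothetical stand-in for `packaging_version.parse` (it is not the real parser and only handles integers):

```python
import warnings


def legacy_parse(value: str) -> int:
    # Hypothetical stand-in for a parser that emits a DeprecationWarning.
    warnings.warn("legacy parsing is deprecated", DeprecationWarning)
    return int(value)


def compare_quietly(a: str, b: str) -> int:
    """Return 1, -1 or 0, hiding DeprecationWarning raised while parsing."""
    # catch_warnings restores the previous filter state on exit, so the
    # "ignore" filter does not leak to the rest of the program.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DeprecationWarning)
        left, right = legacy_parse(a), legacy_parse(b)
    if left > right:
        return 1
    if left < right:
        return -1
    return 0
```

Scoping the filter to the `with` block keeps other deprecation warnings visible elsewhere, which is presumably why the change avoids a global filter.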

core/dbt/task/README.md Normal file

@@ -0,0 +1 @@
# Task README

@@ -8,19 +8,21 @@ from dbt import tracking
from dbt import flags
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.results import (
NodeStatus, RunResult, collect_timing_info, RunStatus
NodeStatus, RunResult, collect_timing_info, RunStatus, RunningStatus
)
from dbt.exceptions import (
NotImplementedException, CompilationException, RuntimeException,
InternalException
)
from dbt.logger import log_manager
import dbt.events.functions as event_logger
from dbt.events.functions import fire_event
from dbt.events.types import (
DbtProjectError, DbtProjectErrorException, DbtProfileError, DbtProfileErrorException,
ProfileListTitle, ListSingleProfile, NoDefinedProfiles, ProfileHelpMessage,
CatchableExceptionOnRun, InternalExceptionOnRun, GenericExceptionOnRun,
NodeConnectionReleaseError, PrintDebugStackTrace, SkippingDetails, PrintSkipBecauseError
NodeConnectionReleaseError, PrintDebugStackTrace, SkippingDetails, PrintSkipBecauseError,
NodeCompiling, NodeExecuting
)
from .printer import print_run_result_error
@@ -64,6 +66,9 @@ class BaseTask(metaclass=ABCMeta):
"""A hook called before the task is initialized."""
if args.log_format == 'json':
log_manager.format_json()
# we're mutating the initialized, but not-yet-configured event logger
# because it's being configured too late -- bad! TODO refactor!
event_logger.format_json = True
else:
log_manager.format_text()
@@ -71,6 +76,9 @@ class BaseTask(metaclass=ABCMeta):
def set_log_format(cls):
if flags.LOG_FORMAT == 'json':
log_manager.format_json()
# we're mutating the initialized, but not-yet-configured event logger
# because it's being configured too late -- bad! TODO refactor!
event_logger.format_json = True
else:
log_manager.format_text()
@@ -279,6 +287,13 @@ class BaseRunner(metaclass=ABCMeta):
def compile_and_execute(self, manifest, ctx):
result = None
with self.adapter.connection_for(self.node):
ctx.node._event_status['node_status'] = RunningStatus.Compiling
fire_event(
NodeCompiling(
node_info=ctx.node.node_info,
unique_id=ctx.node.unique_id,
)
)
with collect_timing_info('compile') as timing_info:
# if we fail here, we still have a compiled node to return
# this has the benefit of showing a build path for the errant
@@ -288,6 +303,13 @@ class BaseRunner(metaclass=ABCMeta):
# for ephemeral nodes, we only want to compile, not run
if not ctx.node.is_ephemeral_model:
ctx.node._event_status['node_status'] = RunningStatus.Executing
fire_event(
NodeExecuting(
node_info=ctx.node.node_info,
unique_id=ctx.node.unique_id,
)
)
with collect_timing_info('execute') as timing_info:
result = self.run(ctx.node, manifest)
ctx.node = result.node
@@ -312,7 +334,7 @@ class BaseRunner(metaclass=ABCMeta):
GenericExceptionOnRun(
build_path=self.node.build_path,
unique_id=self.node.unique_id,
exc=e
exc=str(e) # TODO: unstring this when serialization is fixed
)
)
fire_event(PrintDebugStackTrace())
@@ -425,7 +447,8 @@ class BaseRunner(metaclass=ABCMeta):
schema=schema_name,
node_name=node_name,
index=self.node_index,
total=self.num_nodes
total=self.num_nodes,
node_info=self.node.node_info
)
)

@@ -38,7 +38,7 @@ class CleanTask(BaseTask):
"""
move_to_nearest_project_dir(self.args)
if ('dbt_modules' in self.config.clean_targets and
self.config.packages_install_path != 'dbt_modules'):
self.config.packages_install_path not in self.config.clean_targets):
deprecations.warn('install-packages-path')
for path in self.config.clean_targets:
fire_event(CheckCleanPath(path=path))

@@ -10,7 +10,7 @@ from dbt.deps.resolver import resolve_packages
from dbt.events.functions import fire_event
from dbt.events.types import (
DepsNoPackagesFound, DepsStartPackageInstall, DepsUpdateAvailable, DepsUTD,
DepsInstallInfo, DepsListSubdirectory, DepsNotifyUpdatesAvailable
DepsInstallInfo, DepsListSubdirectory, DepsNotifyUpdatesAvailable, EmptyLine
)
from dbt.clients import system
@@ -63,7 +63,7 @@ class DepsTask(BaseTask):
source_type = package.source_type()
version = package.get_version()
fire_event(DepsStartPackageInstall(package=package))
fire_event(DepsStartPackageInstall(package_name=package_name))
package.install(self.config, renderer)
fire_event(DepsInstallInfo(version_name=package.nice_version_name()))
if source_type == 'hub':
@@ -81,6 +81,7 @@ class DepsTask(BaseTask):
source_type=source_type,
version=version)
if packages_to_upgrade:
fire_event(EmptyLine())
fire_event(DepsNotifyUpdatesAvailable(packages=packages_to_upgrade))
@classmethod

@@ -40,7 +40,8 @@ class FreshnessRunner(BaseRunner):
PrintStartLine(
description=description,
index=self.node_index,
total=self.num_nodes
total=self.num_nodes,
node_info=self.node.node_info
)
)
@@ -58,7 +59,8 @@ class FreshnessRunner(BaseRunner):
table_name=table_name,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=self.node.node_info
)
)
elif result.status == FreshnessStatus.Error:
@@ -68,7 +70,8 @@ class FreshnessRunner(BaseRunner):
table_name=table_name,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=self.node.node_info
)
)
elif result.status == FreshnessStatus.Warn:
@@ -78,7 +81,8 @@ class FreshnessRunner(BaseRunner):
table_name=table_name,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=self.node.node_info
)
)
else:
@@ -88,7 +92,8 @@ class FreshnessRunner(BaseRunner):
table_name=table_name,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=self.node.node_info
)
)

@@ -14,6 +14,8 @@ from dbt import flags
from dbt.version import _get_adapter_plugin_names
from dbt.adapters.factory import load_plugin, get_include_paths
from dbt.contracts.project import Name as ProjectName
from dbt.events.functions import fire_event
from dbt.events.types import (
StarterProjectPath, ConfigFolderDirectory, NoSampleProfileFound, ProfileWrittenWithSample,
@@ -269,6 +271,16 @@ class InitTask(BaseTask):
numeric_choice = click.prompt(prompt_msg, type=click.INT)
return available_adapters[numeric_choice - 1]
def get_valid_project_name(self) -> str:
"""Returns a valid project name, either from CLI arg or user prompt."""
name = self.args.project_name
while not ProjectName.is_valid(name):
if name:
click.echo(name + " is not a valid project name.")
name = click.prompt("Enter a name for your project (letters, digits, underscore)")
return name
def run(self):
"""Entry point for the init task."""
profiles_dir = flags.PROFILES_DIR
@@ -285,6 +297,8 @@ class InitTask(BaseTask):
# just setup the user's profile.
fire_event(SettingUpProfile())
profile_name = self.get_profile_name_from_current_project()
if not self.check_if_can_write_profile(profile_name=profile_name):
return
# If a profile_template.yml exists in the project root, that effectively
# overrides the profile_template.yml for the given target.
profile_template_path = Path("profile_template.yml")
@@ -296,8 +310,6 @@ class InitTask(BaseTask):
return
except Exception:
fire_event(InvalidProfileTemplateYAML())
if not self.check_if_can_write_profile(profile_name=profile_name):
return
adapter = self.ask_for_adapter_choice()
self.create_profile_from_target(
adapter, profile_name=profile_name
@@ -306,11 +318,7 @@ class InitTask(BaseTask):
# When dbt init is run outside of an existing project,
# create a new project and set up the user's profile.
project_name = self.args.project_name
if project_name is None:
# If project name is not provided,
# ask the user which project name they'd like to use.
project_name = click.prompt("What is the desired project name?")
project_name = self.get_valid_project_name()
project_path = Path(project_name)
if project_path.exists():
fire_event(ProjectNameAlreadyExists(name=project_name))
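
The new `get_valid_project_name` method in this file keeps prompting until the name validates. The loop can be sketched in isolation as below. This is an illustration under assumptions: the regex is a guess at "letters, digits, underscore" and may not match what `ProjectName.is_valid` accepts, and `prompt` and `echo` are injected callables standing in for `click.prompt` and `click.echo`.

```python
import re
from typing import Callable, Optional

# Assumed rule for illustration only; dbt's ProjectName may differ.
_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def is_valid_name(name: Optional[str]) -> bool:
    return bool(name) and _NAME_RE.match(name) is not None


def get_valid_name(
    initial: Optional[str],
    prompt: Callable[[str], str],
    echo: Callable[[str], None] = print,
) -> str:
    """Return `initial` if valid, otherwise prompt until a valid name arrives."""
    name = initial
    while not is_valid_name(name):
        # Only complain when the user actually supplied something.
        if name:
            echo(f"{name} is not a valid project name.")
        name = prompt("Enter a name for your project (letters, digits, underscore)")
    return name
```

Passing the prompt and echo functions in as arguments keeps the loop testable without a terminal.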

@@ -11,6 +11,8 @@ from dbt.task.test import TestSelector
from dbt.node_types import NodeType
from dbt.exceptions import RuntimeException, InternalException, warn_or_error
from dbt.logger import log_manager
import logging
import dbt.events.functions as event_logger
class ListTask(GraphRunnableTask):
@@ -55,8 +57,17 @@ class ListTask(GraphRunnableTask):
@classmethod
def pre_init_hook(cls, args):
"""A hook called before the task is initialized."""
# Filter out all INFO-level logging to allow piping ls output to jq, etc
# WARN level will still include all warnings + errors
# Do this by:
# - returning the log level so that we can pass it into the 'level_override'
# arg of events.functions.setup_event_logger() -- good!
# - mutating the initialized, not-yet-configured STDOUT event logger
# because it's being configured too late -- bad! TODO refactor!
log_manager.stderr_console()
event_logger.STDOUT_LOG.level = logging.WARN
super().pre_init_hook(args)
return logging.WARN
def _iterate_selected_nodes(self):
selector = self.get_node_selector()

@@ -65,6 +65,8 @@ def print_run_status_line(results) -> None:
stats[result_type] += 1
stats['total'] += 1
with TextOnly():
fire_event(EmptyLine())
fire_event(StatsLine(stats=stats))

@@ -11,7 +11,7 @@ from .printer import (
print_run_end_messages,
get_counts,
)
from datetime import datetime
from dbt import tracking
from dbt import utils
from dbt.adapters.base import BaseRelation
@@ -21,14 +21,14 @@ from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import WritableManifest
from dbt.contracts.graph.model_config import Hook
from dbt.contracts.graph.parsed import ParsedHookNode
from dbt.contracts.results import NodeStatus, RunResult, RunStatus
from dbt.contracts.results import NodeStatus, RunResult, RunStatus, RunningStatus
from dbt.exceptions import (
CompilationException,
InternalException,
RuntimeException,
missing_materialization,
)
from dbt.events.functions import fire_event
from dbt.events.functions import fire_event, get_invocation_id
from dbt.events.types import (
DatabaseErrorRunning, EmptyLine, HooksRunning, HookFinished,
PrintModelErrorResultLine, PrintModelResultLine, PrintStartLine,
@@ -102,7 +102,7 @@ def get_hook(source, index):
def track_model_run(index, num_nodes, run_model_result):
if tracking.active_user is None:
raise InternalException('cannot track model run with no active user')
invocation_id = tracking.active_user.invocation_id
invocation_id = get_invocation_id()
tracking.track_model_run({
"invocation_id": invocation_id,
"index": index,
@@ -177,7 +177,8 @@ class ModelRunner(CompileRunner):
PrintStartLine(
description=self.describe_node(),
index=self.node_index,
total=self.num_nodes
total=self.num_nodes,
node_info=self.node.node_info
)
)
@@ -190,7 +191,8 @@ class ModelRunner(CompileRunner):
status=result.status,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=self.node.node_info
)
)
else:
@@ -200,7 +202,8 @@ class ModelRunner(CompileRunner):
status=result.message,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=self.node.node_info
)
)
@@ -339,6 +342,8 @@ class RunTask(CompileTask):
finishctx = TimestampNamed('node_finished_at')
for idx, hook in enumerate(ordered_hooks, start=1):
hook._event_status['started_at'] = datetime.utcnow().isoformat()
hook._event_status['node_status'] = RunningStatus.Started
sql = self.get_hook_sql(adapter, hook, idx, num_hooks,
extra_context)
@@ -352,29 +357,36 @@ class RunTask(CompileTask):
statement=hook_text,
index=idx,
total=num_hooks,
truncate=True
node_info=hook.node_info
)
)
status = 'OK'
with Timer() as timer:
if len(sql.strip()) > 0:
status, _ = adapter.execute(sql, auto_begin=False,
fetch=False)
self.ran_hooks.append(hook)
response, _ = adapter.execute(sql, auto_begin=False, fetch=False)
status = response._message
else:
status = 'OK'
self.ran_hooks.append(hook)
hook._event_status['finished_at'] = datetime.utcnow().isoformat()
with finishctx, DbtModelState({'node_status': 'passed'}):
hook._event_status['node_status'] = RunStatus.Success
fire_event(
PrintHookEndLine(
statement=hook_text,
status=str(status),
status=status,
index=idx,
total=num_hooks,
execution_time=timer.elapsed,
truncate=True
node_info=hook.node_info
)
)
# `_event_status` dict is only used for logging. Make sure
# it gets deleted when we're done with it
del hook._event_status["started_at"]
del hook._event_status["finished_at"]
del hook._event_status["node_status"]
self._total_executed += len(ordered_hooks)
@@ -387,7 +399,7 @@ class RunTask(CompileTask):
try:
self.run_hooks(adapter, hook_type, extra_context)
except RuntimeException:
fire_event(DatabaseErrorRunning(hook_type))
fire_event(DatabaseErrorRunning(hook_type=hook_type.value))
raise
def print_results_line(self, results, execution_time):

@@ -6,7 +6,6 @@ from concurrent.futures import as_completed
from datetime import datetime
from multiprocessing.dummy import Pool as ThreadPool
from typing import Optional, Dict, List, Set, Tuple, Iterable, AbstractSet
from pathlib import PosixPath, WindowsPath
from .printer import (
print_run_result_error,
@@ -34,7 +33,7 @@ from dbt.events.types import (
from dbt.contracts.graph.compiled import CompileResultNode
from dbt.contracts.graph.manifest import Manifest
from dbt.contracts.graph.parsed import ParsedSourceDefinition
from dbt.contracts.results import NodeStatus, RunExecutionResult
from dbt.contracts.results import NodeStatus, RunExecutionResult, RunningStatus
from dbt.contracts.state import PreviousState
from dbt.exceptions import (
InternalException,
@@ -56,6 +55,7 @@ from dbt.parser.manifest import ManifestLoader
import dbt.exceptions
from dbt import flags
import dbt.utils
from dbt.ui import warning_tag
RESULT_FILE_NAME = 'run_results.json'
MANIFEST_FILE_NAME = 'manifest.json'
@@ -189,6 +189,8 @@ class GraphRunnableTask(ManifestTask):
def get_runner(self, node):
adapter = get_adapter(self.config)
run_count: int = 0
num_nodes: int = 0
if node.is_ephemeral_model:
run_count = 0
@@ -206,17 +208,38 @@ class GraphRunnableTask(ManifestTask):
with RUNNING_STATE, uid_context:
startctx = TimestampNamed('node_started_at')
index = self.index_offset(runner.node_index)
runner.node._event_status['started_at'] = datetime.utcnow().isoformat()
runner.node._event_status['node_status'] = RunningStatus.Started
extended_metadata = ModelMetadata(runner.node, index)
with startctx, extended_metadata:
fire_event(NodeStart(unique_id=runner.node.unique_id))
status: Dict[str, str]
fire_event(
NodeStart(
node_info=runner.node.node_info,
unique_id=runner.node.unique_id,
)
)
status: Dict[str, str] = {}
try:
result = runner.run_with_hooks(self.manifest)
status = runner.get_result_status(result)
runner.node._event_status['node_status'] = result.status
runner.node._event_status['finished_at'] = datetime.utcnow().isoformat()
finally:
finishctx = TimestampNamed('node_finished_at')
finishctx = TimestampNamed('finished_at')
with finishctx, DbtModelState(status):
fire_event(NodeFinished(unique_id=runner.node.unique_id))
fire_event(
NodeFinished(
node_info=runner.node.node_info,
unique_id=runner.node.unique_id,
run_result=result.to_dict(),
)
)
# `_event_status` dict is only used for logging. Make sure
# it gets deleted when we're done with it
del runner.node._event_status["started_at"]
del runner.node._event_status["finished_at"]
del runner.node._event_status["node_status"]
fail_fast = flags.FAIL_FAST
@@ -335,7 +358,7 @@ class GraphRunnableTask(ManifestTask):
adapter = get_adapter(self.config)
if not adapter.is_cancelable():
fire_event(QueryCancelationUnsupported(type=adapter.type))
fire_event(QueryCancelationUnsupported(type=adapter.type()))
else:
with adapter.connection_named('master'):
for conn_name in adapter.cancel_open_connections():
@@ -353,10 +376,8 @@ class GraphRunnableTask(ManifestTask):
num_threads = self.config.threads
target_name = self.config.target_name
text = "Concurrency: {} threads (target='{}')"
concurrency_line = text.format(num_threads, target_name)
with NodeCount(self.num_nodes):
fire_event(ConcurrencyLine(concurrency_line=concurrency_line))
fire_event(ConcurrencyLine(num_threads=num_threads, target_name=target_name))
with TextOnly():
fire_event(EmptyLine())
@@ -437,8 +458,11 @@ class GraphRunnableTask(ManifestTask):
)
if len(self._flattened_nodes) == 0:
warn_or_error("\nWARNING: Nothing to do. Try checking your model "
"configs and model specification args")
with TextOnly():
fire_event(EmptyLine())
msg = "Nothing to do. Try checking your model " \
"configs and model specification args"
warn_or_error(msg, log_fmt=warning_tag('{}'))
result = self.get_result(
results=[],
generated_at=datetime.utcnow(),
@@ -566,38 +590,8 @@ class GraphRunnableTask(ManifestTask):
results=results,
elapsed_time=elapsed_time,
generated_at=generated_at,
args=self.args_to_dict(),
args=dbt.utils.args_to_dict(self.args),
)
def args_to_dict(self):
var_args = vars(self.args).copy()
# update the args with the flags, which could also come from environment
# variables or user_config
flag_dict = flags.get_flag_dict()
var_args.update(flag_dict)
dict_args = {}
# remove args keys that clutter up the dictionary
for key in var_args:
if key == 'cls':
continue
if var_args[key] is None:
continue
# TODO: add more default_false_keys
default_false_keys = (
'debug', 'full_refresh', 'fail_fast', 'warn_error',
'single_threaded', 'log_cache_events',
'use_experimental_parser',
)
if key in default_false_keys and var_args[key] is False:
continue
if key == 'vars' and var_args[key] == '{}':
continue
# this was required for a test case
if (isinstance(var_args[key], PosixPath) or
isinstance(var_args[key], WindowsPath)):
var_args[key] = str(var_args[key])
dict_args[key] = var_args[key]
return dict_args
def task_end_messages(self, results):
print_run_end_messages(results)

View File

@@ -11,7 +11,7 @@ from dbt.graph import ResourceTypeSelector
from dbt.logger import TextOnly
from dbt.events.functions import fire_event
from dbt.events.types import (
SeedHeader, SeedHeaderSeperator, EmptyLine, PrintSeedErrorResultLine,
SeedHeader, SeedHeaderSeparator, EmptyLine, PrintSeedErrorResultLine,
PrintSeedResultLine, PrintStartLine
)
from dbt.node_types import NodeType
@@ -27,7 +27,8 @@ class SeedRunner(ModelRunner):
PrintStartLine(
description=self.describe_node(),
index=self.node_index,
total=self.num_nodes
total=self.num_nodes,
node_info=self.node.node_info
)
)
@@ -50,7 +51,8 @@ class SeedRunner(ModelRunner):
total=self.num_nodes,
execution_time=result.execution_time,
schema=self.node.schema,
relation=model.alias
relation=model.alias,
node_info=model.node_info
)
)
else:
@@ -61,7 +63,8 @@ class SeedRunner(ModelRunner):
total=self.num_nodes,
execution_time=result.execution_time,
schema=self.node.schema,
relation=model.alias
relation=model.alias,
node_info=model.node_info
)
)
@@ -106,7 +109,7 @@ class SeedTask(RunTask):
with TextOnly():
fire_event(EmptyLine())
fire_event(SeedHeader(header=header))
fire_event(SeedHeaderSeperator(len_header=len(header)))
fire_event(SeedHeaderSeparator(len_header=len(header)))
rand_table.print_table(max_rows=10, max_columns=None)
with TextOnly():

View File

@@ -6,7 +6,7 @@ from dbt.include.global_project import DOCS_INDEX_FILE_PATH
from http.server import SimpleHTTPRequestHandler
from socketserver import TCPServer
from dbt.events.functions import fire_event
from dbt.events.types import ServingDocsPort, ServingDocsAccessInfo, ServingDocsExitInfo
from dbt.events.types import ServingDocsPort, ServingDocsAccessInfo, ServingDocsExitInfo, EmptyLine
from dbt.task.base import ConfiguredTask
@@ -22,6 +22,8 @@ class ServeTask(ConfiguredTask):
fire_event(ServingDocsPort(address=address, port=port))
fire_event(ServingDocsAccessInfo(port=port))
fire_event(EmptyLine())
fire_event(EmptyLine())
fire_event(ServingDocsExitInfo())
# mypy doesn't think SimpleHTTPRequestHandler is ok here, but it is

View File

@@ -23,7 +23,8 @@ class SnapshotRunner(ModelRunner):
cfg=cfg,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=model.node_info
)
)
else:
@@ -34,7 +35,8 @@ class SnapshotRunner(ModelRunner):
cfg=cfg,
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=model.node_info
)
)
@@ -43,10 +45,6 @@ class SnapshotTask(RunTask):
def raise_on_first_error(self):
return False
def defer_to_manifest(self, adapter, selected_uids):
# snapshots don't defer
return
def get_node_selector(self):
if self.manifest is None or self.graph is None:
raise InternalException(

View File

@@ -74,7 +74,8 @@ class TestRunner(CompileRunner):
name=model.name,
index=self.node_index,
num_models=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=model.node_info
)
)
elif result.status == TestStatus.Pass:
@@ -83,7 +84,8 @@ class TestRunner(CompileRunner):
name=model.name,
index=self.node_index,
num_models=self.num_nodes,
execution_time=result.execution_time
execution_time=result.execution_time,
node_info=model.node_info
)
)
elif result.status == TestStatus.Warn:
@@ -93,7 +95,8 @@ class TestRunner(CompileRunner):
index=self.node_index,
num_models=self.num_nodes,
execution_time=result.execution_time,
failures=result.failures
failures=result.failures,
node_info=model.node_info
)
)
elif result.status == TestStatus.Fail:
@@ -103,7 +106,8 @@ class TestRunner(CompileRunner):
index=self.node_index,
num_models=self.num_nodes,
execution_time=result.execution_time,
failures=result.failures
failures=result.failures,
node_info=model.node_info
)
)
else:
@@ -114,7 +118,8 @@ class TestRunner(CompileRunner):
PrintStartLine(
description=self.describe_node(),
index=self.node_index,
total=self.num_nodes
total=self.num_nodes,
node_info=self.node.node_info
)
)

View File

@@ -3,7 +3,7 @@ from typing import Optional
from dbt.clients.yaml_helper import ( # noqa:F401
yaml, safe_load, Loader, Dumper,
)
from dbt.events.functions import fire_event
from dbt.events.functions import fire_event, get_invocation_id
from dbt.events.types import (
DisableTracking, SendingEvent, SendEventFailure, FlushEvents,
FlushEventsFailure, TrackingInitializeFailure
@@ -103,7 +103,7 @@ class User:
self.cookie_dir = cookie_dir
self.id = None
self.invocation_id = str(uuid.uuid4())
self.invocation_id = get_invocation_id()
self.run_started_at = datetime.now(tz=pytz.utc)
def state(self):
@@ -184,7 +184,7 @@ def get_invocation_context(user, config, args):
return {
"project_id": None if config is None else config.hashed_name(),
"user_id": user.id,
"invocation_id": user.invocation_id,
"invocation_id": get_invocation_id(),
"command": args.which,
"options": None,
@@ -262,7 +262,7 @@ def track(user, *args, **kwargs):
if user.do_not_track:
return
else:
fire_event(SendingEvent(kwargs=kwargs))
fire_event(SendingEvent(kwargs=str(kwargs)))
try:
tracker.track_struct_event(*args, **kwargs)
except Exception:
@@ -294,7 +294,7 @@ def track_project_load(options):
active_user,
category='dbt',
action='load_project',
label=active_user.invocation_id,
label=get_invocation_id(),
context=context
)
@@ -308,7 +308,7 @@ def track_resource_counts(resource_counts):
active_user,
category='dbt',
action='resource_counts',
label=active_user.invocation_id,
label=get_invocation_id(),
context=context
)
@@ -322,7 +322,7 @@ def track_model_run(options):
active_user,
category="dbt",
action='run_model',
label=active_user.invocation_id,
label=get_invocation_id(),
context=context
)
@@ -336,7 +336,7 @@ def track_rpc_request(options):
active_user,
category="dbt",
action='rpc_request',
label=active_user.invocation_id,
label=get_invocation_id(),
context=context
)
@@ -356,7 +356,7 @@ def track_package_install(config, args, options):
active_user,
category="dbt",
action='package',
label=active_user.invocation_id,
label=get_invocation_id(),
property_='install',
context=context
)
@@ -375,7 +375,7 @@ def track_deprecation_warn(options):
active_user,
category="dbt",
action='deprecation',
label=active_user.invocation_id,
label=get_invocation_id(),
property_='warn',
context=context
)
@@ -441,7 +441,7 @@ def track_experimental_parser_sample(options):
active_user,
category='dbt',
action='experimental_parser',
label=active_user.invocation_id,
label=get_invocation_id(),
context=context
)
@@ -455,7 +455,7 @@ def track_partial_parser(options):
active_user,
category='dbt',
action='partial_parser',
label=active_user.invocation_id,
label=get_invocation_id(),
context=context
)
@@ -491,13 +491,6 @@ def initialize_tracking(cookie_dir):
active_user = User(None)
def get_invocation_id() -> Optional[str]:
if active_user is None:
return None
else:
return active_user.invocation_id
class InvocationProcessor(logbook.Processor):
def __init__(self):
super().__init__()
@@ -506,7 +499,7 @@ class InvocationProcessor(logbook.Processor):
if active_user is not None:
record.extra.update({
"run_started_at": active_user.run_started_at.isoformat(),
"invocation_id": active_user.invocation_id,
"invocation_id": get_invocation_id(),
})
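
Note on this file: every call site now reads the invocation id from `dbt.events.functions.get_invocation_id()` rather than from `active_user.invocation_id`, so tracking and logging share one id even when no user object exists. The body of that function is not part of this diff; the sketch below is an assumption about how a lazily created, process-wide id could work, and the module-level `_invocation_id` name is hypothetical.

```python
import uuid
from typing import Optional

# Hypothetical module-level cache; the real storage is not shown in this diff.
_invocation_id: Optional[str] = None


def get_invocation_id() -> str:
    """Return one UUID string per process, creating it on first use."""
    global _invocation_id
    if _invocation_id is None:
        _invocation_id = str(uuid.uuid4())
    return _invocation_id


first = get_invocation_id()
second = get_invocation_id()
```

Both calls return the same string, which is the property the tracking labels above rely on.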

View File

@@ -62,7 +62,7 @@ def line_wrap_message(
# (we'll turn it into a single line soon). Support windows, too.
splitter = '\r\n\r\n' if '\r\n\r\n' in msg else '\n\n'
chunks = msg.split(splitter)
return '\n'.join(textwrap.fill(chunk, width=width) for chunk in chunks)
return '\n'.join(textwrap.fill(chunk, width=width, break_on_hyphens=False) for chunk in chunks)
def warning_tag(msg: str) -> str:
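
Note on this hunk: `break_on_hyphens=False` tells `textwrap.fill` to break only at whitespace, so hyphenated tokens such as CLI flags stay on one line. A minimal sketch of the effect, using a hypothetical `wrap_chunks` helper that mirrors only the split-and-fill part of `line_wrap_message`:

```python
import textwrap


def wrap_chunks(msg: str, width: int, break_on_hyphens: bool = False) -> str:
    # Paragraphs are separated by a blank line; Windows line endings are supported too.
    splitter = '\r\n\r\n' if '\r\n\r\n' in msg else '\n\n'
    chunks = msg.split(splitter)
    return '\n'.join(
        textwrap.fill(chunk, width=width, break_on_hyphens=break_on_hyphens)
        for chunk in chunks
    )


msg = "run with the flag --use-experimental-parser enabled"
wrapped = wrap_chunks(msg, width=30)
print(wrapped)
```

The flag is shorter than the width, so it lands whole on its own line instead of being split at one of its hyphens.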

View File

@@ -10,12 +10,15 @@ import jinja2
import json
import os
import requests
from tarfile import ReadError
import time
from pathlib import PosixPath, WindowsPath
from contextlib import contextmanager
from dbt.exceptions import ConnectionException
from dbt.events.functions import fire_event
from dbt.events.types import RetryExternalCall
from dbt import flags
from enum import Enum
from typing_extensions import Protocol
from typing import (
@@ -598,7 +601,9 @@ class MultiDict(Mapping[str, Any]):
def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
"""Attempts to run a function that makes an external call, if the call fails
on a connection error or timeout, it will be tried up to 5 more times.
on a connection error, timeout or decompression issue, it will be tried up to 5 more times.
See https://github.com/dbt-labs/dbt-core/issues/4579 for context on these decompression issues
specifically.
"""
try:
return fn()
@@ -606,6 +611,7 @@ def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
requests.exceptions.ConnectionError,
requests.exceptions.Timeout,
requests.exceptions.ContentDecodingError,
ReadError,
) as exc:
if attempt <= max_attempts - 1:
fire_event(RetryExternalCall(attempt=attempt, max=max_attempts))
@@ -613,3 +619,40 @@ def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
_connection_exception_retry(fn, max_attempts, attempt + 1)
else:
raise ConnectionException('External connection exception occurred: ' + str(exc))
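
Note on the retry helper: the hunk adds `tarfile.ReadError` to the retryable exceptions. The sketch below is a standalone variant, not the function from this file: the name `retry_external_call`, the `delay` parameter, the built-in exception types, and the `RuntimeError` are all assumptions chosen so the example runs without `requests` or dbt. It also returns the result of the retried call so that a successful retry propagates its value to the caller.

```python
import time
from tarfile import ReadError

# Stand-ins for the requests exceptions listed in the hunk (assumption).
RETRYABLE = (ConnectionError, TimeoutError, ReadError)


def retry_external_call(fn, max_attempts: int, attempt: int = 0, delay: float = 0.0):
    """Call fn(); on a retryable error, try again up to max_attempts more times."""
    try:
        return fn()
    except RETRYABLE as exc:
        if attempt <= max_attempts - 1:
            time.sleep(delay)
            # Return the retried result so the caller sees it.
            return retry_external_call(fn, max_attempts, attempt + 1, delay)
        raise RuntimeError('External connection exception occurred: ' + str(exc))


calls = {"n": 0}


def flaky():
    # Fails twice with a tarfile error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ReadError("not a gzip file")
    return "ok"


result = retry_external_call(flaky, max_attempts=5)
```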
# This is used to serialize the args in the run_results and in the logs.
# We do this separately because there are a few fields that don't serialize,
# i.e. PosixPath, WindowsPath, and types. It also includes args from both
# cli args and flags, which is more complete than just the cli args.
# If new args are added that are false by default (particularly in the
# global options) they should be added to the 'default_false_keys' list.
def args_to_dict(args):
var_args = vars(args).copy()
# update the args with the flags, which could also come from environment
# variables or user_config
flag_dict = flags.get_flag_dict()
var_args.update(flag_dict)
dict_args = {}
# remove args keys that clutter up the dictionary
for key in var_args:
if key == 'cls':
continue
if var_args[key] is None:
continue
# TODO: add more default_false_keys
default_false_keys = (
'debug', 'full_refresh', 'fail_fast', 'warn_error',
'single_threaded', 'log_cache_events', 'store_failures',
'use_experimental_parser',
)
if key in default_false_keys and var_args[key] is False:
continue
if key == 'vars' and var_args[key] == '{}':
continue
# this was required for a test case
if (isinstance(var_args[key], PosixPath) or
isinstance(var_args[key], WindowsPath)):
var_args[key] = str(var_args[key])
dict_args[key] = var_args[key]
return dict_args
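
Note on `args_to_dict`: the function moves from a task method to a module-level helper and gains `'store_failures'` in `default_false_keys`. A minimal sketch of the filtering rules follows. It takes the flag dictionary as a parameter instead of calling `flags.get_flag_dict()`, and it checks `PurePath` rather than `PosixPath`/`WindowsPath`; both are assumptions made so the example is self-contained.

```python
from argparse import Namespace
from pathlib import Path, PurePath

DEFAULT_FALSE_KEYS = (
    'debug', 'full_refresh', 'fail_fast', 'warn_error',
    'single_threaded', 'log_cache_events', 'store_failures',
    'use_experimental_parser',
)


def args_to_dict(args, flag_dict=None):
    var_args = vars(args).copy()
    # Flags override CLI args; here they are passed in rather than read globally.
    var_args.update(flag_dict or {})
    dict_args = {}
    for key, value in var_args.items():
        if key == 'cls' or value is None:
            continue
        if key in DEFAULT_FALSE_KEYS and value is False:
            continue
        if key == 'vars' and value == '{}':
            continue
        if isinstance(value, PurePath):
            # Paths do not serialize to JSON, so store them as strings.
            value = str(value)
        dict_args[key] = value
    return dict_args


args = Namespace(cls=object, debug=False, vars='{}', target=None,
                 profiles_dir=Path('/tmp/profiles'), threads=4, which='run')
serialized = args_to_dict(args, flag_dict={'fail_fast': True})
```

Only `profiles_dir`, `threads`, `which`, and the overriding `fail_fast` flag survive the filter.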

View File

@@ -11,7 +11,7 @@ import dbt.exceptions
import dbt.semver
PYPI_VERSION_URL = 'https://pypi.org/pypi/dbt/json'
PYPI_VERSION_URL = 'https://pypi.org/pypi/dbt-core/json'
def get_latest_version():
@@ -96,5 +96,5 @@ def _get_dbt_plugins_info():
yield plugin_name, mod.version
__version__ = '1.0.0rc1'
__version__ = '1.0.1'
installed = get_installed_version()

Some files were not shown because too many files have changed in this diff.