* Add test that checks microbatch models are all operating with the same `current_time`
* Set an `invocated_at` on the `RuntimeConfig` and plumb to `MicrobatchBuilder`
* Add changie doc
* Rename `invocated_at` to `invoked_at`
* Simply conditional logic for setting MicrobatchBuilder.batch_current_time
* Rename `batch_current_time` to `default_end_time` for MicrobatchBuilder
* Begin testing that microbatch execution times are being tracked and set
* Begin tracking the execution time of batches for microbatch models
* Add changie doc
* Additional assertions in microbatch testing
* Validate that `event_time_start` is before `event_time_end` when passed from CLI
Sometimes CLI options have restrictions based on other CLI options. This is the case
for `--event-time-start` and `--event-time-end`. Unfortunately, click doesn't provide
a good way for validating this, at least not that I found. Additionaly I'm not sure
if we have had anything like this previously. In any case, I couldn't find a
centralized validation area for such occurances. Thus I've gone and added one,
`validate_option_interactions`. Long term if more validations are added, we should
add this wrapper to each CLI command. For now I've only added it to the commands that
support `event_time_start` and `event_time_end`, specifically `build` and `run`.
* Add changie doc
* If `--event-time-end` is not specififed, ensure `--event-time-start` is less than the current time
* Fixup error message about event_time_start and event_time_end
* Move logic to validate `event_time` cli flags to `flags.py`
* Update validation of `--event-time-start` against `datetime.now` to use UTC
* When retrying microbatch models, propagate prior successful state
* Changie doc for microbatch dbt retry fixes
* Fix test_manifest unit tests for batch_info key
* Add functional test for when a microbatch model has multiple retries
* Add comment about when batch_info will be something other than None
* Add tests to check how microbatch models respect `full_refresh` model configs
* Fix `_is_incremental` to properly respect `full_refresh` model config
In dbt-core, it is generally expected that values passed via CLI flags take
precedence over model level configs. However, `full_refresh` on a model is an
exception to this rule, where in the model config takes precedence. This
config exists specifically to _prevent_ accidental full refreshes of large
incremental models, as doing so can be costly. **_It is actually best
practice_** to set `full_refresh=False` on incremental models.
Prior to this commit, for microbatch models, the above was not happening. The
CLI flag `--full-refresh` was taking precedence over the model config
`full_refresh`. That meant that if `--full-refresh` was supplied, then the
microbatch model **_would full refresh_** even if `full_refresh=False` was
set on the model. This commit solves that problem.
* Add changie doc for microbatch `full_refresh` config handling
* Add `PartialSuccess` status type and use it for microbatch models with mixed results
* Handle `PartialSuccess` in `interpret_run_result`
* Add `BatchResults` object to `BaseResult` and begin tracking during microbatch runs
* Ensure batch_results being propagated to `run_results` artifact
* Move `batch_results` from `BaseResult` class to `RunResult` class
* Move `BatchResults` and `BatchType` to separate arifacts file to avoid circular imports
In our next commit we're gonna modify `dbt/contracts/graph/nodes.py` to import the
`BatchType` as part of our work to implement dbt retry for microbatch model nodes.
Unfortunately, the import in `nodes.py` creates a circular dependency because
`dbt/artifacts/schemas/results.py` imports from `nodes.py` and `dbt/artifacts/schemas/run/v5/run.py`
imports from that `results.py`. Thus the new import creates a circular import. Now this
_shouldn't_ be necessary as nothing in artifacts should import from the rest of dbt-core.
However, we do. We should fix this, but this is also out of scope for this segement of work.
* Add `PartialSuccess` as a retry-able status, and use batches to retry microbatch models
* Fix BatchType type so that the first datetime is no longer Optional
* Ensure `PartialSuccess` causes skipping of downstream nodes
* Alter `PartialSuccess` status to be considered an error in `interpret_run_result`
* Update schemas and test artifacts to include new batch_results run results key
* Add functional test to check that 'dbt retry' retries 'PartialSuccess' models
* Update partition failure test to assert downstream models are skipped
* Improve `success`/`error`/`partial success` messaging for microbatch models
* Include `PartialSuccess` in status that `--fail-fast` counts as a failure
* Update `LogModelResult` to handle partial successes
* Update `EndOfRunSummary` to handle partial successes
* Cleanup TODO comment
* Raise a DbtInternalError if we get a batch run result without `batch_results`
* When running a microbatch model with supplied batches, force non full-refresh behavior
This is necessary because of retry. Say on the initial run the microbatch model
succeeds on 97% of it's batches. Then on retry it does the last 3%. If the retry
of the microbatch model executes in full refresh mode it _might_ blow away the
97% of work that has been done. This edge case seems to be adapter specific.
* Only pass batches to retry for microbatch model when there was a PartialSuccess
In the previous commit we made it so that retries of microbatch models wouldn't
run in full refresh mode when the microbatch model to retry has batches already
specified from the prior run. This is only problematic when the run being retried
was a full refresh AND all the batches for a given microbatch model failed. In
that case WE DO want to do a full refresh for the given microbatch model. To better
outline the problem, consider the following:
* a microbatch model had a begin of `2020-01-01` and has been running this way for awhile
* the begin config has changed to `2024-01-01` and dbt run --full-refresh gets run
* every batch for an microbatch model fails
* on dbt retry the the relation is said to exist, and the now out of range data (2020-01-01 through 2023-12-31) is never purged
To avoid this, all we have to do is ONLY pass the batch information for partially successful microbatch
models. Note: microbatch models only have a partially successful status IFF they have both
successful and failed batches.
* Fix test_manifest unit tests to know about model 'batches' key
* Add some console output assertions to microbatch functional tests
* add batch_results: None to expected_run_results
* Add changie doc for microbatch retry functionality
* maintain protoc version 5.26.1
* Cleanup extraneous comment in LogModelResult
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Test case for `merge_exclude_columns`
* Update expected output for `merge_exclude_columns`
* Skip TestMergeExcludeColumns test
* Enable this test since PostgreSQL 15+ is available in CI now
* Undo modification to expected output
* Remove duplicated constructor for `ResourceTypeSelector`
* Add type annotation for `ResourceTypeSelector`
* Standardize on constructor for `ResourceTypeSelector` where `include_empty_nodes=True`
* Changelog entry
* Adding logic to TestSelector to remove unit tests if they are in excluded_resource_types
* Adding change log
* Respect `--resource-type` and `--exclude-resource-type` CLI flags and associated environment variables
* Test CLI flag for excluding unit tests for the `dbt test` command
* Satisy isort pre-commit hook
* Fix mypy for positional argument "resource_types" in call to "TestSelector"
* Replace `TestSelector` with `ResourceTypeSelector`
* Add co-author
* Update changelog description
* Add functional tests for new feature
* Compare the actual results, not just the count
* Remove test case covered elsewhere
* Test for `DBT_EXCLUDE_RESOURCE_TYPES` environment variable for `dbt test`
* Update per pre-commit hook
* Restore to original form (until we refactor extraneous `ResourceTypeSelector` references later)
---------
Co-authored-by: Matthew Cooper <asimov.1st@gmail.com>
* initial rough-in with CLI flags
* dbt-adapters testing against event-time-ref-filtering
* fix TestList
* Checkpoint
* fix tests
* add event_time_start params to build
* rename configs
* Gate resolve_event_time_filter via micro batch strategy and fix strptime usage
* Add unit test for resolve_event_time_filter
* Additional unit tests for `resolve_event_time_filter` to ensure lookback + batch_size work
* Remove extraneous comments and print statements from resolve_event_time_filter
* Fixup microbatch functional tests to use microbatch strategy
* Gate microbatch functionality behind env_var while in beta
* Add comment about how _is_incremental should be removed
* Improve `event_time_start/end` cli parameters to auto convert to datetime objects
* for testing: dbt-postgres 'microbatch' strategy
* rough in: chunked backfills
* partial failure of microbatch runs
* decouple run result methods
* initial refactor
* rename configs to __dbt_internal
* update compiled_code in context after re-compilation
* finish rename of context vars
* changelog entry
* fix patch_microbatch_end_time
* refactor into MicrobatchBuilder
* fix provider unit tests + add unit tests for MicrobatchBuilder
* add TestMicrobatchJinjaContextVarsAvailable
* unit test offset + truncate timestamp methods
* Remove pairing.md file
* Add tying to microbatch specific functions added in `task/run.py`
* Add doc strings to microbatch.py functions and classes
* Set microbatch node status to `ERROR` if all batches for node failed
* Fire an event for batch exceptions instead of directly printing
* Fix firing of failed microbatch log event
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Update functional tests to cover this case
* Revert "Update functional tests to cover this case"
This reverts commit 4c78e816f6.
* New functional tests to cover the resource_type config
* Separate data tests from unit tests for `resource_types` config of `dbt list` and `dbt build`
* Changelog entry
* Add functional tests for custom incremental strategies names 'microbatch'
* Point dev-requirement of `dbt-adapters` back to the main branch
The associated branch/PR in `dbt-adapters` that we were previously
pointing to has been merged. Thus we can point back to `main` again.
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* made class changing directory a context manager.
* add change log
* fix conflict
* made base as a context manager
* add assertion
* Remove index.html
* add it test to testDbtRunner
* fix deps args order
* fix test
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* use full manifest in adapter instead of macro_manifest
* Add test case
* Add changelog entry
* Remove commented code.
---------
Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
* Tests for calling a macro in a pre- or post-hook config in properties.yml
* Late render pre- and post-hooks configs in properties / schema YAML files
* Changelog entry
* Use alias instead of name when adding ephemeral model prefixes
* Adjust TestCustomSchemaWithCustomMacroFromModelName to test ephemeral models
* Add changelog entry for ephemeral model CTE identifier fix
* Reference model.identifier and model.name where appropriate to resolve typing errors
* Move test for ephemeral model with alias to dedicated test in test_compile.py
* update children search
* update search to include children in original selector
* add changie
* remove unused function
* fix wrong function call
* fix depth
* sketch
* Bring back the happy path fixture snapshot file
The commit c783a86 removed the snapshot file from the happy path fixture.
This was done because the snapshot was breaking the tests we were adding,
`test_run_commands`. However this broke `test_ls` in `test_list.py`. In order
to move forward, we need everything to be working. Maybe the idea was to delete
the `test_list.py` file, however that is not noted anywhere, and was not done.
Thus this commit ensures that test is not broken nor or new tests.
* Create conftest for `functional` tests so that happy path fixtures are accessible
* Format `test_commands.py` and update imports to appease pre-commit hooks
* Parametrize `test_run_command` to make it easier to see which command is failing (if any)
* Update the setup for `TestRunCommands.test_run_command` to be more formulaic
* Add test to ensure resource types are selectable
* Fix docstring formatting in TestRunCommands
* Fixup documentation for test_commands.py
---------
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Add breakpoint.
* Move breakpoint.
* Add fix
* Add changelog.
* Avoid sorting for the string case.
* Add unit test.
* Fix test.
* add good unit tests for coverage of sort method.
* add sql format coverage.
* Modify behavior to log a warning and proceed.
* code review comments.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Fix exclusive_primary_alt_value_setting to set warn_error_options correctly
* Add test
* Changie
* Fix unit test
* Replace conversion method
* Refactor normalize_warn_error_options
* Add changelog.
* Avoid sorting for the string case.
* add good unit tests for coverage of sort method.
* add sql format coverage.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Correct `isort` configuration to include dbt-semantic-interfaces as internal
We thought we were already doing this. However, we accidentally missed the last
`s` of `dbt-semantic-interfaces`, so imports from dbt-semantic-interfaces were not
being identified as an internal package by isort. This fixes that.
* Run isort using updated configs to mark `dbt-semantic-interfaces` as included
* Fix `test_can_silence` tests in `test_warn_error_options.py` to ensure silencing
We're fixing an issue wherein `silence` specifications in the `dbt_project.yaml`
weren't being respected. This was odd since we had tests specifically for this.
It turns out the tests were broken. Essentially the warning was instead being raised
as an error due to `include: 'all'`. Then because it was being raised as an error,
the event wasn't going through the logger. We were only asserting in these tests that
the silenced event wasn't going through the logger (which it wasn't) so everything
"appeared" to be working. Unfortunately everything wasn't actually working. This is now
highlighted because `test_warn_error_options::TestWarnErrorOptionsFromProject:test_can_silence`
is now failing with this commit.
* Fix setting `warn_error_options` via `dbt_project.yaml` flags.
Back when I did the work for #10058 (specifically c52d6531) I thought that
the `warn_error_options` would automatically be converted from the yaml
to the `WarnErrorOptions` object as we were building the `ProjectFlags` object,
which holds `warn_error_options`, via `ProjectFlags.from_dict`. And I thought
this was validated by the `test_can_silence` test added in c52d6531. However,
there were two problems:
1. The definition of `warn_error_options` on `PrjectFlags` is a dict, not a
`WarnErrorOptions` object
2. The `test_can_silence` test was broken, and not testing what I thought
The quick fix (this commit) is to ensure `silence` is passed to `WarnErrorOptions`
instantiation in `dbt.cli.flags.convert_config`. The better fix would be to make
the `warn_error_options` of `ProjectFlags` a `WarnErrorOptions` object instead of
a dict. However, to do this we first need to update dbt-common's `WarnErrorOptions`
definition to default `include` to an empty list. Doing so would allow us to get rid
of `convert_config` entirely.
* Add unit test for `ModelRunner.print_result_line`
* Add (and skip) unit test for `ModelRunner.execute`
An attempt at testing `ModelRunner.execute`. We should probably also be
asserting that the model has been executed. However before we could get there,
we're running into runtime errors during `ModelRunner.execute`. Currently the
struggle is ensuring the adapter exists in the global factory when `execute`
goes looking for. The error we're getting looks like the following:
```
def test_execute(self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner) -> None:
> model_runner.execute(model=table_model, manifest=manifest)
tests/unit/task/test_run.py:121:
----
core/dbt/task/run.py:259: in execute
context = generate_runtime_model_context(model, self.config, manifest)
core/dbt/context/providers.py:1636: in generate_runtime_model_context
ctx = ModelContext(model, config, manifest, RuntimeProvider(), None)
core/dbt/context/providers.py:834: in __init__
self.adapter = get_adapter(self.config)
venv/lib/python3.10/site-packages/dbt/adapters/factory.py:207: in get_adapter
return FACTORY.lookup_adapter(config.credentials.type)
---`
self = <dbt.adapters.factory.AdapterContainer object at 0x106e73280>, adapter_name = 'postgres'
def lookup_adapter(self, adapter_name: str) -> Adapter:
> return self.adapters[adapter_name]
E KeyError: 'postgres'
venv/lib/python3.10/site-packages/dbt/adapters/factory.py:132: KeyError
```
* Add `postgres_adapter` fixture for use in `TestModelRunner`
Previously we were running into an issue where the during `ModelRunner.execute`
the mock_adapter we were using wouldn't be found in the global adapter
factory. We've gotten past this error by supply a "real" adapter, a
`PostgresAdapter` instance. However we're now running into a new error
in which the materialization macro can't be found. This error looks like
```
model_runner = <dbt.task.run.ModelRunner object at 0x106746650>
def test_execute(
self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
> model_runner.execute(model=table_model, manifest=manifest)
tests/unit/task/test_run.py:129:
----
self = <dbt.task.run.ModelRunner object at 0x106746650>
model = ModelNode(database='dbt', schema='dbt_schema', name='table_model', resource_type=<NodeType.Model: 'model'>, package_na...ected'>, constraints=[], version=None, latest_version=None, deprecation_date=None, defer_relation=None, primary_key=[])
manifest = Manifest(nodes={'seed.pkg.seed': SeedNode(database='dbt', schema='dbt_schema', name='seed', resource_type=<NodeType.Se...s(show=True, node_color=None), patch_path=None, arguments=[], created_at=1718229810.21914, supported_languages=None)}})
def execute(self, model, manifest):
context = generate_runtime_model_context(model, self.config, manifest)
materialization_macro = manifest.find_materialization_macro_by_name(
self.config.project_name, model.get_materialization(), self.adapter.type()
)
if materialization_macro is None:
> raise MissingMaterializationError(
materialization=model.get_materialization(), adapter_type=self.adapter.type()
)
E dbt.adapters.exceptions.compilation.MissingMaterializationError: Compilation Error
E No materialization 'table' was found for adapter postgres! (searched types 'default' and 'postgres')
core/dbt/task/run.py:266: MissingMaterializationError
```
* Add spoofed macro fixture `materialization_table_default` for `test_execute` test
Previously the `TestModelRunner:test_execute` test was running into a runtime error
do to the macro `materialization_table_default` macro not existing in the project. This
commit adds that macro to the project (though it should ideally get loaded via interactions
between the manifest and adapter). Manually adding it resolved our previous issue, but created
a new one. The macro appears to not be properly loaded into the manifest, and thus isn't
discoverable later on when getting the macros for the jinja context. This leads to an error
that looks like the following:
```
model_runner = <dbt.task.run.ModelRunner object at 0x1080a4f70>
def test_execute(
self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
> model_runner.execute(model=table_model, manifest=manifest)
tests/unit/task/test_run.py:129:
----
core/dbt/task/run.py:287: in execute
result = MacroGenerator(
core/dbt/clients/jinja.py:82: in __call__
return self.call_macro(*args, **kwargs)
venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:294: in call_macro
macro = self.get_macro()
---
self = <dbt.clients.jinja.MacroGenerator object at 0x1080f3130>
def get_macro(self):
name = self.get_name()
template = self.get_template()
# make the module. previously we set both vars and local, but that's
# redundant: They both end up in the same place
# make_module is in jinja2.environment. It returns a TemplateModule
module = template.make_module(vars=self.context, shared=False)
> macro = module.__dict__[get_dbt_macro_name(name)]
E KeyError: 'dbt_macro__materialization_table_default'
venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:277: KeyError
```
It's becoming apparent that we need to find a better way to either mock or legitimately
load the default and adapter macros. At this point I think I've exausted the time box
I should be using to figure out if testing the `ModelRunner` class is possible currently,
with the result being more work has yet to be done.
* Begin adding the `LogModelResult` event catcher to event manager class fixture
* init push for issue 10198
* add changelog
* add unit tests based on michelle example
* add data_tests, and post_hook unit tests
* pull creating macro_func out of try call
* revert last commit
* pull macro_func definition back out of try
* update code formatting
* Add basic semantic layer fixture nodes to unit test `manifest` fixture
We're doing this in preperation to a for a unit test which will be testing
these nodes (as well as others) and thus we want them in the manifest.
* Add `WhereFilterInteresection` to `QueryParams` of `saved_query` fixture
In the previous commit, 58990aa450, we added
the `saved_query` fixture to the `manifest` fixture. This broke the test
`tests/unit/parser/test_manifest.py::TestPartialParse::test_partial_parse_by_version`.
It broke because the `Manifest.deepcopy` manifest basically dictifies things. When we were
dictifying the `QueryParams` of the `saved_query` before, the `where` key was getting
dropped because it was `None`. We'd then run into a runtime error on instantiation of the
`QueryParams` because although `where` is declared as _optional_, we don't set a default
value for it. And thus, kaboom :(
We should probably provide a default value for `where`. However that is out of scope
for this branch of work.
* Fix `test_select_fqn` to account for newly included semantic model
In 58990aa450 we added a semantic model
to the `manifest` fixture. This broke the test
`tests/unit/graph/test_selector_methods.py::test_select_fqn` because in
the test it selects nodes based on the string `*.*.*_model`. The newly
added semantic model matches this, and thus needed to be added to the
expected results.
* Add unit tests for `_warn_for_unused_resource_config_paths` method
Note: At this point the test when run with for a `unit_test` config
taht should be considered used, fails. This is because it is not being
properly identified as used.
* Include `unit_tests` in `Manifest.get_resouce_fqns`
Because `unit_tests` weren't being included in calls to `Manifest.get_resource.fqns`,
it always appeared to `_warn_for_unused_resource_config_paths` that there were no
unit tests in the manifest. Because of this `_warn_for_unused_resource_config_paths` thought
that _any_ `unit_test` config in `dbt_project.yaml` was unused.
* Rename `maniest` fixture in `test_selector` to `mock_manifest`
We have a globally available `manifest` fixture in our unit tests. In the
coming commits we're going to add tests to the file which use the gloablly
available `manifest` fixture. Prior to this commit, the locally defined
`manifest` fixture was taking precidence. To get around this, the easiest
solution was to rename the locally defined fixture.
I had tried to isolate the locally defined fixture by moving it, and the relevant
tests to a test class like `TestNodeSelector`. However because of _how_ the relevant
tests were parameterized, this proved difficult. Basically for readability we define
a variable which holds a list of all the parameterization variables. By moving to a
test class, the definition of the variables would have had to be defined directly in
the parameterization macro call. Although possible, it made the readability slighty
worse. It might be worth doing anyway in the long run, but instead I used a less heavy
handed alternative (already described)
* Improve type hinting in `tests/unit/utils/manifest.py`
* Ensure `args` get set from global flags for `runtime_config` fixture in unit tests
The `Compiler.compile` method accesses `self.config.args.which`. The `config`
is the `RuntimeConfig` the `Compiler` was instantiated with. Our `runtime_config`
fixture was being instatiated with an empty dict for the `args` property. Thus
a `which` property of the args wasn't being made avaiable, and if `compile` was run
a runtime error would occur. To solve this, we've begun instantiating the args from
the global flags via `get_flags()`. This works because we ensure the `set_test_flags`
fixture is run first which calls `set_from_args`.
* Create a `make_manifest` utility function for use in unit tests and fixture creation
* Refactor `Compiler` and `NodeSelector` tests in `test_selector.py` to use pytesting methodology
* Remove parsing tests that exist in `test_selector.py`
We had some tests in `test_selector.py::GraphTest` that didn't add
anything ontop of what was already being tested else where in the file
except the parsing of models. However, the tests in `test_parser.py::ModelParserTest`
cover everything being tested here (and then some). Thus these tests
in `test_selector.py::GraphTest` are unnecessary and can be deleted.
* Move `test__partial_parse` from `test_selector.py` to `test_manifest.py`
There was a test `test__partial_parse` in `test_selector.py` which tested
the functionality of `is_partial_parsable` of the `ManifestLoader`. This
doesn't really make sense to exist in `test_selector.py` where we are
testing selectors. We test the `ManifestLoader` class in `test_manifest.py`
which seemed like a more appropriate place for the test. Additionally we
renamed the test to `test_is_partial_parsable_by_version` to more accurately
describe what is being tested.
* Make `test_selector`'s manifest fixture name even more specific
* Add type hint to `expected_nodes` in `test_selector.py` tests
In the test `tests/unit/graph/test_selector.py::TestCompiler::test_two_models_simple_ref`
we have a variable `expected_nodes` that we are setting via a list comprehension.
At a glance it isn't immediately obvious what `expected_nodes` actually is. It's a
list, but a list of what? One suggestion was to explicitly write out the list of strings.
However, I worry about the brittleness of doing so. That might be the way we head long
term, but as a compromise for now, I've added a type hint the the variable definition.
* Fire skipped events at debug level
Closes https://github.com/dbt-labs/dbt-core/issues/8774
* add changelog entry
* Update to work with 1.9.*.
* Add tests for --fail-fast not showing skip messages unless --debug.
* Update test that works by itself, but assumes to much to work in integration tests.
---------
Co-authored-by: Scott Gigante <scottgigante@users.noreply.github.com>
* init push arbitrary configs for generic tests pr
* iterative work
* initial test design attempts
* test reformatting
* test rework, have basic structure for 3 of 4 passing, need to figure out how to best represent same key error, failing correctly though
* swap up test formats for new config dict and mixed varitey to run of dbt parse and inspecting the manifest
* modify tests to get passing, then modify the TestBuilder class work from earlier to be more dry
* add changelog
* modify code to match suggested changes around seperate methods and test id fix
* add column_name reference to init for deeper nested _render_values can use the input
* add type annotations
* feedback based on mike review
* Create `runtime_config` fixture and necessary upstream fixtures
* Check for better scoped `ProjectContractError` in test_runtime tests
Previously in `test_unsupported_version_extra_config` and
`test_archive_not_allowed` we were checking for `DbtProjectError`. This
worked because `ProjectContractError` is a subclass of `DbtProjectError`.
However, if we check for `DbtProjectError` in these tests than, some tangential
failure which raises a `DbtProejctError` type error would go undetected. As
we plan on modifying these tests to be pytest in the coming commits, we want to
ensure that the tests are succeeding for the right reason.
* Convert `test_str` of `TestRuntimeConfig` to a pytest test using fixtures
* Convert `test_from_parts` of `TestRuntimeConfig` to a pytest test using fixtures
While converting `test_from_parts` I noticed the comment
> TODO(jeb): Adapters must assert that quoting is populated?
This led me to beleive that `quoting` shouldn't be "fully" realized
in our project fixture unless we're saying that it's gone through
adapter instantiation. Thus I update the `quoting` on our project
fixture to be an empty dict. This change affected `test__str__` in
`test_project.py` which we thus needed to update accordingly.
* Convert runtime version specifier tests to pytest tests and move to test_project
We've done two things in this commit, which arguably _should_ have been done in
two commits. First we moved the version specifier tests from `test_runtime.py::TestRuntimeConfig`
to `test_project.py::TestGetRequiredVersion` this is because what is really being
tested is the `_get_required_version` method. Doing it via `RuntimeConfig.from_parts` method
made actually testing it a lot harder as it requires setting up more of the world and
running with a _full_ project config dict.
The second thing we did was convert it from the old unittest implementation to a pytest
implementation. This saves us from having to create most of the world as we were doing
previously in these tests.
Of note, I did not move the test `test_unsupported_version_range_bad_config`. This test
is a bit different from the rest of the version specifier tests. It was introduced in
[1eb5857811](1eb5857811)
of [#2726](https://github.com/dbt-labs/dbt-core/pull/2726) to resolve [#2638](https://github.com/dbt-labs/dbt-core/issues/2638).
The focus of #2726 was to ensure the version specifier checks were run _before_ the validation
of the `dbt_project.yml`. Thus what this test is actually testing for is order of
operations at parse time. As such, this is really more a _functional_ test than a
unit test. In the next commit we'll get this test moved (and renamed)
* Create a better test for checking that version checks come before project schema validation
* Convert `test_get_metadata` to pytest test
* Refactor `test_archive_not_allowed` to functional test
We do already have tests that ensure "extra" keys aren't allowed in
the dbt_project.yaml. This test is different because it's checking that
a specific key, `archive`, isn't allowed. We do this because at one point
in time `archive` _was_ an allowed key. Specifically, we stopped allowing
`archive` in dbt-core 0.15.0 via commit [f26948dd](f26948dde2).
Given that it's been 5 years and a major version, we could probably remove
this test, but let's keep it around unless we start supporting `archive` again.
* Convert `warn_for_unused_resource_config_paths` tests to use pytest
* Add fixtures for setting and resettign flags for unit tests
* Remove unnecessary `set_from_args` in non `unittest.TestCase` based unit tests
In the previous commit we added a pytest fixture which sets and tears down
the global flags arg via `set_from_args` for every pytest based unit test.
Previously we had added a `set_from_args` in tests or test files to reset
the global flags from if they were modified by a previous test. This is no
longer necessary because of the work done in the previous commit.
Note: We did not modify any tests that use the `unittest.TestCase` class
because they don't use pytest fixtures. Thus those tests need to continue
operating as they currently do until we shift them to pytest unit tests.
* Utilize the new `args_for_flags` fixture for setting of flags in `test_contracts_graph_parsed.py`
* Convert `test_compilation.py` from `TestCase` tests to pytest tests
We did this so in the next commit we can drop the unnecessary `set_from_args`
in the next commit. That will be it's own commit because converting these
tests is a restructuring that doing separately makes things easier to follow.
That is to say, all changes in this commit were just to convert the tests to
pytest, no other changes were made.
* Drop unnecessary `set_from_args` in `test_compilation.py`
* Add return types to all methods in `test_compilation.py`
* Reduce imports from `compilation` in `test_compilation.py`
* Update `test_logging.py` now that we don't need to worry about global flags
* Conditionally import `Generator` type for python 3.8
In python 3.9 `Generator` was moved to `collections.abc` and deprecated
in `typing`. We still support 3.8 and thus need to be conditionally
importing `Generator`. We should remove this in the future when we drop
support for 3.8.
* Add more accurate RSS high water mark measurement for Linux
* Add changelog entry.
* Checks to avoid exception based flow control, per review.
* Fix review nit.
* Add unit test to assert `setup_config_logger` clears the event manager state
* Move `setup_event_logger` tests from `test_functions.py` to `test_logging.py`
* Move `EventCatcher` to `tests.utils` for use in unit and functional tests
* Update fixture mocking global event manager to instead clear it
Previously we had started _mocking_ the global event manager. We did this
because we though that meant anything we did to the global event manager,
including modifying it via things like `setup_event_logger`, would be
automatically cleaned up at the end of any test using the fixture because
the mock would go away. However, this understanding of fixtures and mocking
was incorrect, and the global event manager wouldn't be properly isolated/reset.
Thus we changed to fixture to instead cleanup the global event manager before
any test that uses it and by using `autouse=True` in the fixture definition
we made it so that every unit test uses the fixture.
Note this will no longer be viabled if we every multi-thread our unit testing as
the event manager isn't actually isolated, and thus two tests could both modify
the event manager at the same time.
* Add test for different `write_perf_info` values to `get_full_manifest`
* Add test for different `reset` values to `get_full_manifest`
* Abstract required mocks for `get_full_manifest` tests to reduce duplication
There are a set of required mocks that `get_full_manifest` unit tests need.
Instead of doing these mocks in each test, we've abstracted these mocks into
a reusable function. I did try to do this as a fixture, but for some reaosn
the mocks didn't actually propagate when I did that.
* Add test for different `PARTIAL_PARSE_FILE_DIFF` values to `get_full_manifest`
* Refactor mock fixtures in `test_manifest.py` to make them more widely available
* Convert `set_required_mocks` of `TestGetFullManifest` into a fixture
This wasn't working before, but it does now. Not sure why.
This was done by running `pre-commit run --all`. That this was needed
is a temporary glitch in how our `Tests and Code Checks` github action
works on PRs. Basically we added `isort` to the pre-commit hooks recently, and
this does additional linting/formatting on our imports.
People reasonably have branches which were started prior to `isort` being
part of the pre-commit hooks on main. Thus, unless those branches get caught
up to main, the github action on associated PRs won't run `isort` because
it doesn't exist on those branchs. Once everyone gets their local `main`
branch updated (I suspect this might take a few days) this problem will go
away.
* Add `isort` as a dev-req and pre-commit hook
The tool `isort` reorders imports to be in alphabetical order. I've
added it because our imports in most files are in random order. The lack
of order meant that:
- sometimes the same module would be imported from twice
- figuring out if a a module was already being imported from took
longer
In the next commit I'll actually run isort to order everything. The best
part is that when developing, we don't have to put them in correct order.
Though you can if you want. However, `isort` will take care of re-ordering
things at commit time :)
* Improve isort functionality by setting initial `.isort.cfg`
Specifically we set two config values: `extend_skip_glob` and `known_first_party`.
The `extend_skip_glob` extends the default skipped paths. The defaults can be seen
here https://pycqa.github.io/isort/docs/configuration/options.html#skip. We are skipping
third party stubs because these are more so provided (I believe). We are skipping
`.github` and `scripts` as they feel out of scope and things we can be less strict with.
The `known_first_party` setting makes it so that these imports get grouped separately from
all other imports, which is useful visually of "this comes from us" vs "this comes from
someone/somewhere else".
* Add profile `black` to isort config
I was seeing some odd behavior where running pre-commit, adding the modified
files, and then running pre-commit again would result making more modifications
to some of the same files. This felt odd. You shouldn't have to run pre-commit
more multiple times for it to eventually come to a final "solution". I believe
the problem was because we are using the tool `black` to format things, but weren't
registering the black profile with `isort` this lead to some conflicting formatting
rules, and the two tools had to negotiate a few times before being both satisfied.
Registering the profile `black` with `isort` resolved this problem.
* Reorder, merge-duplicate, and format module imports using `isort`
This was done by running `pre-commit run --all`. I ran it separately from
the commit process itself because I wanted to run it against all files
instead of only changed files.
Of note, this not only reordered and formatted our imports. But we also
had 60 distinct duplicate module import paths across 50 files, which this
took care of. When I say "distinct duplicate module import paths" I mean
when `from x.y.z import` was imported more than once in a single file.
* add support for explicit nulls for loaded_at_field
* add test
* changelog
* add parsing for tests
* centralize logic a bit
* account for sources being None
* fix bug
* remove new field from SourceDefinition
* add validation for empty string, mroe tests
* Move deferral from task to manifest loading + RefResolver
* dbt clone must specify --defer
* Fix deferral for unit test type deteection
* Add changelog
* Move merge_from_artifact from end of parsing back to task before_run to reduce scope of refactor
* PR review. DeferRelation conforms to RelationConfig protocol
* Add test case for #10017
* Update manifest v12 in test_previous_version_state
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Change agate upper bound to v1.10
* Add changelog.
* update lower pin
* for testing
* put back dev requirement
* move the lower pin back to 1.7
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Refactor test class `EventCatcher` into utils to make accessible to other tests
* Raise minimum version of dbt_common to 1.0.2
We're going to start depending on `silence` existing as a attr of
`WarnErrorOptions`. The `silence` attr was only added in dbt_common
1.0.2, thus that is our new minimum.
* Add ability to silence warnings from `warn_error_options` CLI arguments
* Add `flush` to `EventCatcher` test util, and use in `test_warn_error_options`
* Add tests to `TestWarnErrorOptionsFromCLI` for `include` and `exclude`
* Test support for setting `silence` of `warn_error_options` in `dbt_project` flags
Support for `silence` was _automatically_ added when we upgraded to dbt_common 1.0.2.
This is because we build the project flags in a `.from_dict` style, which is cool. In
this case it was _automatically_ handled in `read_project_flags` in `project.py`. More
specifically here bcbde3ac42/core/dbt/config/project.py (L839)
* Add tests to `TestWarnErrorOptionsFromProject` for `include` and `exclude`
Typically we can't have multiple tests in the same `test class` if they're
utilizing/modifying file system fixtures. That is because the file system
fixtures are scoped to test classes, so they don't reset inbetween tests within
the same test class. This problem _was_ affectin these tests as they modify the
`dbt_project.yml` file which is set by a class based fixture. To get around this,
because I find it annoying to create multiple test classes when the tests really
should be grouped, I created a "function" scoped fixture to reset the `dbt_project.yml`.
* Update `warn_error_options` in CLI args to support `error` and `warn` options
Setting `error` is the same as setting `include`, but only one can be specified.
Setting `warn` is the same as setting `exclude`, but only one can be specified.
* Update `warn_error_options` in Project flags to support `error` and `warn` options
As part of this I refactored `exclusive_primary_alt_value_setting` into an upstream
location `/config/utils`. That is because importing it in `/config/project.py` from
`cli/option_types.py` caused some circular dependency issues.
* Use `DbtExclusivePropertyUseError` in `exclusive_primary_alt_value_setting` instead of `DbtConfigError`
Using `DbtConfigError` seemed reasonable. However in order to make sure the error
got raised in `read_project_flags` we had to mark it to be `DbtConfigError`s to be
re-raised. This had the uninteded consequence of reraising a smattering of errors
which were previously being swallowed. I'd argue that if those are errors we're
swallowing, the functions that raise those errors should be modified perhaps to
conditionally not raise them, but that's not the world we live in and is out of
scope for this branch of work. Thus instead we've created a error specific to
`WarnErrorOptions` issues, and now use, and catch for re-raising.
* Add unit tests for `exclusive_primary_alt_value_setting` method
I debated about parametrizing these tests, and it can be done. However,
I found that the resulting code ended up being about the same number of
lines and slightly less readable (in my opinion). Given the simplicity of
these tests, I think not parametrizing them is okay.
Letting the dbt version be dynamic in the project fixture previously was
causing some tests to break whenever the version of dbt actually got updated,
which isn't great. It'd be super annoying to have to always update tests
affected by this. To get around this we've gone and hard coded the dbt version
in the profile. The alternative was to interpolate the version during comparison
during the relevant tests, which felt less appealing.
* Move `tests/unit/test_yaml_renderer.py` to `tests/unit/parser/test_schema_renderer.py`
* Move `tests/unit/test_unit_test_parser.py` to `tests/unit/parser/test_unit_tests.py`
* Convert `tests/unit/test_tracking.py` to use pytest fixtures
* Delete `tests/unit/test_sql_result.py` as it was moved to `dbt-adapters`
* Move `tests/unit/test_semantic_models.py` to `tests/unit/graph/test_nodes.py
* Group tests of `SemanticModel` in `test_nodes.py` into a `TestSemanticModel` class
* Move `tests/unit/test_selector_errors.py` to `tests/unit/config/test_selectors.py`
* Add `Project` fixture for unit tests and test `Project` class methods
* Move `Project.__eq__` unit tests to new pytest class testing
* Move `Project.hashed_name` unit test to pytest testing class
* Rename some testing class in `test_project.py` to align with testing split
* Refactor `project` fixture to make accessible to other unit tests
* simplify dockerfile, eliminate references to adapter repos as they will be handled in those repos
* keep dbt-postgres target for historical releases of dbt-postgres
* update third party image to pip install conditionally
* Add event type for deprecation of spaces in model names
* Begin emitting deprecation warning for spaces in model names
* Only warn on first model name with spaces unless `--debug` is specified
For projects with a lot of models that have spaces in their names, the
warning about this deprecation would be incredibly annoying. Now we instead
only log the first model name issue and then a count of how many models
have the issue, unless `--debug` is specified.
* Refactor `EventCatcher` so that the event to catch is setable
We want to be able to catch more than just `SpacesInModelNameDeprecation`
events, and in the next commit we will alter our tests to do so. Thus
instead of writing a new catcher for each event type, a slight modification
to the existing `EventCatcher` makes this much easier.
* Add project flag to control whether spaces are allowed in model names
* Log errors and raise exception when `allow_spaces_in_model_names` is `False`
* Use custom event for output invalid name counts instead of `Note` events
Using `Note` events was causing test flakiness when run in a multi
worker environment using `pytest -nauto`. This is because the event
manager is currently a global. So in a situation where test `A` starts
and test `tests_debug_when_spaces_in_name` starts shortly there after,
the event manager for both tests will have the callbacks set in
`tests_debug_when_spaces_in_name`. Then if something in test `A` fired
a `Note` event, this would affect the count of `Note` events that
`tests_debug_when_spaces_in_name` sees, causing assertion failures. By
creating a custom event, `TotalModelNamesWithSpacesDeprecation`, we limit
the possible flakiness to only tests that fire the custom event. Thus
we didn't _eliminate_ all possibility of flakiness, but realistically
only the tests in `test_check_for_spaces_in_model_names.py` can now
interfere with each other. Which still isn't great, but to fully
resolve the problem we need to work on how the event manager is
handled (preferably not globally).
* Always log total invalid model names if at least one
Previously we only logged out the count of how many invalid model names
there were if there was two or more invalid names (and not in debug mode).
However this message is important if there is even one invalid model
name and regardless of whether you are running debug mode. That is because
automated tools might be looking for the event type to track if anything
is wrong.
A related change in this commit is that we now only output the debug hint
if it wasn't run with debug mode. The idea being that if they are already
running it in debug mode, the hint could come accross as somewhat
patronizing.
* Reduce duplicate `if` logic in `check_for_spaces_in_model_names`
* Improve readability of logs related to problematic model names
We want people running dbt to be able to at a glance see warnings/errors
with running their project. In this case we are focused specifically on
errors/warnings in regards to model names containing spaces. Previously
we were only ever emitting the `warning_tag` in the message even if the
event itself was being emitted at an `ERROR` level. We now properly have
`[ERROR]` or `[WARNING]` in the message depending on the level. Unfortunately
we couldn't just look what level the event was being fired at, because that
information doesn't exist on the event itself.
Additionally, we're using events that base off of `DynamicEvents` which
unfortunately hard coded to `DEBUG`. Changing this would involve still
having a `level` property on the definition in `core_types.proto` and
then having `DynamicEvent`s look to `self.level` in the `level_tag`
method. Then we could change how firing events works based on the an
event's `level_tag` return value. This all sounds like a bit of tech
debt suited for PR, possibly multiple, and thus is not being done here.
* Alter `TotalModelNamesWithSpacesDeprecation` message to handle singular and plural
* Remove duplicate import in `test_graph.py` introduced from merging in main
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Expect that the `args` variable is un-modified by `dbt.invoke(args)`
* Make `args` variable un-modified by `dbt.invoke(args)`
* Changelog entry
* Expect that the `args` variable is un-modified by `make_dbt_context`
* Make the `args` variable is un-modified by `make_dbt_context`
* Make a copy of `args` passed to `make_dbt_context`
* Revert "Make a copy of `args` passed to `make_dbt_context`"
This reverts commit 79227b4d34.
* Ensure BaseRunner handles nodes without `build_path`
Some nodes, like SourceDefinition nodes, don't have a `build_path` property.
This is problematic because we take in nodes with no type checking, and
assume they have properties sometimes, like `build_path`. This was just
the case in BaseRunner's `_handle_generic_exception` and
`_handle_interal_exception` methods. Thus to stop dbt from crashing when
trying to handle an exception related to a node without a `build_path`,
we added an private method to the BaseRunner class for safely trying
to get `build_path`.
* Use keyword arguments when instantiating `Note` events in freshness.py
Previously we were passing arguments during the `Note` event instantiations
in freshness.py as positional arguments. This would cause not the desired
`Note` event to be emitted, but instead get the message
```
[Note] Don't use positional arguments when constructing logging events
```
which was our fault, not the users'. Additionally, we were passing the
level for the event in the `Note` instantiation when we needed to be
passing it to the `fire_event` method.
* Raise error when `loaded_at_field` is `None` and metadata check isn't possible
Previously if a source freshness check didn't have a `loaded_at_field` and
metadata source freshness wasn't supported by the adapter, then we'd log
a warning message and let the source freshness check continue. This was problematic
because the source freshness check couldn't actually continue and the process
would raise an error in the form
```
type object argument after ** must be a mapping, not NoneType
```
because the `freshness` variable was never getting set. This error wasn't particularly
helpful for any person running into it. So instead of letting that error
happen we now deliberately raise an error with helpful information.
* Add test which ensures bad source freshness checks raise appropriate error
This test directly tests that when a source freshness check doesn't have a
`loaded_at_field` and the adapter in use doesn't support metadata checks,
then the appropriate error message gets raised. That is, it directly tests
the change made in a162d53a8. This test indirectly tests the changes in both
7ec2f82a9 and 7b0ff3198 as the appropriate error can only be raised because
we've fixed other upstream issues via those commits.
* Add changelog entry for source freshness edgecase fixes
* Add @p.profile and @p.target to the list of "global" CLI flags
* Add env vars (DBT_PROFILE, DBT_TARGET) to the params
* Add unit test
* Simplify unit test
* changie
* Update .changes/unreleased/Features-20231115-092005.yaml
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Fix incorrect envvar names
* Realign environment variable names
* Remove from specific subcommands
* Add test_global_flags_not_on_subcommands
* Remove one unnecessary test case
* Remove other unnecessary test case
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Our `protobuf` dep was in the section of `setup.py` which we delineate
as expecting all future versions of it to be compatible. However, this
is no longer actually the case, and in e4fe839e45
we restricted it to major vesion 4.
* [#9570] Fix fixtures in fixtures/subfolders throwing parsing error
* Fast-forward imports to match upstream
* Re-introduce doc strings on traceback info handling
* [#9570] Changelog update for fix of fixtures in fixtures/subfolders throwing parsing error
* [#9570] Improve testability and coverage for partial parsing
* Transform skip_parsing (private variable of ManifestLoader.load()) into instance-attribute of ManifestLoader(), with default value False
(to enable splitting of ManifestLoader.load())
* Split ManifestLoader.load(), to extract operation of PartialParsing into new method called ManifestLoader.safe_update_project_parser_files_partially()
(to simplify both cognitive complexity in the codebase and mocking in unittestest)
* Add "ignore" type-comments in new ManifestLoader.safe_update_project_parser_files_partially()
(to silence mypy warnings regarding instance-attributes which can be initialized as None or as something else, e.g. self.saved_manifest)[1]
[1] Although I wanted avoid "ignore" type-comments, it seems like addressing these mypy warnings in a stricter sense requires technical alignment and broader code changes.
For example, might need to initialize self.saved_manifest as Manifest, instead of Optional[Manifest], so that PartialParsing gets inputs with type it currently expects.
... perhaps too far beyond the scope of this fix?
* Check for equality with existing input_measures when adding input_measures
* Changie
* Add type annotation
* Move add_input_measure to metric from type_params
* Add tests to check that saved queries show in `dbt list`
* Update `list` task to support saved queries
This is built off of @jtcohen6 work in d6e7cda on jerco/fix-9532.
I didn't directly cherry pick because there was more work to do as
well as merge conflicts. That is to say @jtcohen6 should be credited
with some of the work.
* Update error message when iterating over nodes during list command errors
This was originally suggested by @jtcohen6 in d6e7cda of jerco/fix-9532.
This commit just makes sure the change gets included because I didn't
cherry-pick that commit into this work.
* Add test around deleting a YAML file containing semantic models and metrics
It was raised in https://github.com/dbt-labs/dbt-core/issues/8860 that an
error is being raised during partial parsing when files containing
metrics/semantic models are deleted. In further testing it looks like this
error specifically happens when a file containing both semantic models and
metrics is deleted. If the deleted file contains just semantic models or
metrics there seems to be no issue. The next commit should contain the fix.
* Skip deleted schema files when scheduling files during partial parsing
Waaaay back (in 7563b99) deleted schema files started being separated out
from deleted non-schema files. However ever since, when it came to scheduling
files for reparsing, we've only done so for deleted non-schema files. We even
missed this when we refactored the scheduling code in b37e5b5. This change
updates `_schedule_for_parsing` which is used by `schedule_nodes_for_parsing`
to begin skipping deleted schema files in addition to deleted non schema files.
* Update `add_to_pp_files` to ignore `deleted_schema_files`
As noted in the previous commit, we started separating out deleted
schema files from deleted non-schema files a looong time ago. However,
this whole time we've been adding `deleted_schema_files` to the list
of files to be parsed. This change corrects for that.
* Add changie doc for partial parsing KeyError fix
Protobuf v5 has breaking changes. Here we are limiting the protobuf
dependency to one major version, 4, so that we don't have to patch
over handling 2 different major versions of protobuf.
* Clearer no-op logging in stubbed SavedQueryRunner
* Add changelog entry
* Fix unit test
* More logging touchups
* Fix failing test
* Rename flag + refactor per #9629
* Fix failing test
* regenerate core_proto_types with libprotoc 25.3
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
A recent update to the version ranges for our internally
maintained support packages quite reasonably expanded the
allowed versions for dbt-semantic-interfaces to all minor versions
after 0.5.0, under the assumption that subsequent releases will
generally be backwards-compatible.
Unfortunately, dbt-semantic-interfaces is not yet in that state.
So we update the version range accordingly, and include some
comments around version range expectations for dependencies
listed in this section of dbt-core's package configuration.
CVE-2024-22195 identified an issue in Jinja2 versions <= 3.1.2. As such
we've gone and changed our dependency requirement specification to be
3.1.3 or greater (but less than 4).
Note: Preivously we were using the `~=` version specifier. However due
to some issues with the `~=` we've moved to using `>=` in combination
with `<`. This gives us the same range that `~=` gave us, but avoids
a pip resolution issue when multiple packages in an environment use `~=`
for the same dependency.
* Remove extraneous `/` in `schema-check.yml`
We have hypothesis that the extra `/` in `schema-check` is causing
issues we're seeing currently in the artifact check failing. It may
not be the final solution, but we should fix it anyway.
* Move `artifact_minor_upgrade` label check to job level of `Check Artifact Changes`
Previously the checking for `artifact_minor_upgrade` was happening in each job
step of `Check Artifact Changes`. By moving it up to the job level instead of
in the job steps we make it so the check for the label only happens once and
it simplifies the job steps.
* Update `Check Artifact Changes` to use `dorny/paths-filter`
Previously we were using `git diff` to check if any files had changed
in `core/dbt/artifacts`. However, our `git diff` usage was including any
changes that happened on `main` which the PR branch did not have. This
commit switches the check from using `git diff` to `dorny/paths-filter`,
which is what we use for checking for changelog existence as well. The
`dorny/paths-fitler` includes logic for excluding changes that are on main
but not the PR branch (which is what want to happen).
* Move `ColumnInfo` to dbt/artifacts
* Move `Quoting` resource to dbt/artifacts
* Move `TimePeriod` to `types.py` in dbt/artifacts
* Move `Time` class to `components`
We need to move the data parts of the `Time` definition to dbt/artifacts.
That is not what we're doing in this commit. In this commit we're simply
moving the functional `Time` definition upstream of `unparsed` and `nodes`.
This does two things
- Mirrors the import path that the resource `time` definition will have in dbt/artifacts
- Reduces the chance of ciricular import problems between `unparsed` and `nodes`
* Move data part of `Time` definition to dbt/artifacts
* Move `FreshnessThreshold` class to components module
We need to move the data parts of the `FreshnessThreshold` definition to dbt/artifacts.
That is not what we're doing in this commit. In this commit we're simply
moving the functional `FreshnessThreshold` definition upstream of `unparsed` and `nodes`.
This does two things
- Mirrors the import path that the resource `FreshnessThreshold` definition will have in dbt/artifacts
- Reduces the chance of ciricular import problems between `unparsed` and `nodes`
* Move data part of `FreshnessThreshold` to dbt/artifacts
Note: We had to override some of the attrs of the `FreshnessThreshold`
resource because the resource version only has access to the resource
version of `Time`. The overrides in the functional definition of
`FreshnessThreshold` make it so the attrs use the functional version
of `Time`.
* Move `ExternalTable` and `ExternalPartition` to `source_definition` module in dbt/artifacts
* Move `SourceConfig` to `source_definition` module in dbt/artifacts
* Move `HasRelationMetadata` to core `components` module
This is a precursor to splitting `HasRelationMetadata` into it's
data and functional parts.
* Move data portion of `HasRelationMetadata` to dbt/artifacts
* Move `SourceDefinitionMandatory` to dbt/artifacts
* Move the data parts of `SourceDefinition` to dbt/artifacts
Something interesting here is that we had to override the `freshness`
property. We had to do this because if we didn't we wouldn't get the
functional parts of `FreshnessThreshold`, we'd only get the data parts.
Also of note, the `SourceDefintion` has a lot of `@property` methods that
on other classes would be actual attribute properties of the node. There is
an argument to be made that these should be moved as well, but thats perhaps
a separate discussion.
Finally, we have not (yet) moved `NodeInfoMixin`. It is an open discussion
whether we do or not. It seems primarily functional, as a means to update the
source freshness information. As the artifacts primarily deal with the shape
of the data, not how it should be set, it seems for now that `NodeInfoMixin`
should stay in core / not move to artifacts. This thinking may change though.
* Refactor `from_resource` to no longer use generics
In the next commit we're gonna add a `to_resource` method. As we don't
want to have to pass a resource into `to_resource`, the class itself
needs to expose what resource class should be built. Thus a type annotation
is no longer enough. To solve this we've added a class method to BaseNode
which returns the associated resource class. The method on BaseNode will
raise a NotImplementedError unless the the inheriting class has overridden
the `resouce_class` method to return the a resource class.
You may be thinking "Why not a class property"? And that is absolutely a
valid question. We used to be able to chain `@classmethod` with
`@property` to create a class property. However, this was deprecated in
python 3.11 and removed in 3.13 (details on why this happened can be found
[here](https://github.com/python/cpython/issues/89519)). There is an
[alternate way to setup a class property](https://github.com/python/cpython/issues/89519#issuecomment-1397534245),
however this seems a bit convoluted if a class method easily gets the job
done. The draw back is that we must do `.resource_class()` instead of
`.resource_class` and on classes implementing `BaseNode` we have to
override it with a method instead of a property specification.
Additionally, making it a class _instance_ property won't work because
we don't want to require an _instance_ of the class to get the
`resource_class` as we might not have an instance at our dispossal.
* Add `to_resource` method to `BaseNode`
Nodes have extra attributes. We don't want these extra attributes to
get serialized. Thus we're converting back to resources prior to
serialization. There could be a CPU hit here as we're now dictifying
and undictifying right before serialization. We can do some complicated
and non-straight-forward things to get around this. However, we want
to see how big of a perforance hit we actually have before going that
route.
* Drop `__post_serialize__` from `SourceDefinition` node class
The method `__post_serialize__` on the `SourceDefinition` was used for
ensuring the property `_event_status` didn't make it to the serialized
version of the node. Now that resource definition of `SourceDefinition`
handles serialization/deserialization, we can drop `__post_serialize__`
as it is no longer needed.
* Merge functional parts of `components` into their resource counter parts
We discussed this on the PR. It seems like a minimal lift, and minimal to
support. Doing so also has the benefit of reducing a bunch of the overriding
we were previously doing.
* Fixup:: Rename variable `name` to `node_id` in `_map_nodes_to_map_resources`
Naming is hard. That is all.
* Fixup: Ensure conversion of groups to resources for `WritableManifest`
this pr provides additional command line options (cloud cli) for dbt development, as well as clarifies 'dbt core' under the 'get started' header.
cc @greg-mckeon
* update docker file to remove &subdirectory=plugins/postgres from git path
* remove extra proto file generation scripts which are no longer necessary in this repo
* Begin using `Mergeable` supplied by `dbt-common`
We're currently in the process of moving the "data resource" portion on nodes
to `dbt/artifacts`. Some of those artifacts depend on `Mergeable` which has
been defined on core. In order to move the data resources to `dbt/artifacts`,
we thus need to move `Mergeable` upstream of core. We moved `Mergeable` to
[dbt-common](https://github.com/dbt-labs/dbt-common) in
https://github.com/dbt-labs/dbt-common/pull/59, and released this change in
[dbt-common 0.1.3](https://pypi.org/project/dbt-common/0.1.3/). As such as, in
order to unblock some of the `dbt/artifacts` migration work, we first need to
update references to `Mergeable` in core to use the `dbt-common` definition.
NOTE: We include changing over to `Replaceable` from `dbt-common` in this
commit. This is because there wasn't a clean way to do it. If I moved the imports
of `Replaceable` on in the files where we updated `Mergeable` then we would
have left `Replaceable` in an inbetween state. If we had moved all instances
of `Replaceable`, it'd be out of scope for the change. As such, it makes more
sense to do that as a separate changeset.
* Remove definition of `Mergeable` from dbt/contracts/util
Although we've removed the definition of `Mergeable` we've ensured the
import paths are still available. We do this because this is under
`contracts`, and the sudden disappearance from the import path might
cause issues for community members using dbt-core as a library.
Ideally we'd define a `Mergeable` class here that inherits the
`dbt-common` definition and raise a deprecation warning on instantiation.
However, we don't have an established strategy to do so.
* Use new context invocation class.
* Adjust new constructor param on InvocationContext, make tests robust
* Add changelog entry.
* Clarify parameter name
* Move `ExposureType` to dbt/artifacts
* Move `MaturityType` to dbt/artifacts
* Move `ExposureConfig` to dbt/artifacts
* Move data parts of `Exposure` node class to dbt/artifacts
* Update leftover incorrect imports of `Owner` resource
There were a few places in the code base that were importing `Owner`
from `unparsed` or `nodes`. The places importing from `unparsed` were
working because `unparsed` itself was correctly importing from
`artifacts.resources`. However in places where it was being imported
from `nodes`, an exception was being raised because in the previous
commit we removed the import of `Owner` in `nodes` because it was
no longer needed.
* Move `SemanticModel` sub dataclasses to dbt/artifacts
* Move `NodeRelation` to dbt/artifacts
* Move `SemanticModelConfig` to dbt/artifacts
* Move data portion of `SemanticModel` to dbt/artifacts
* Add contextual comments to `semantic_model.py` about DSI protocols
* Fixup mypy complaint
* Migrate v12 manifest to use artifact definitions of `SavedQuery`, `Metric`, and `SemanticModel`
* Convert `SemanticModel` and `Metric` resources to full nodes in selector search
In the `search` method in `selector_methods.py`, we were getting object
representations from the incoming writable manifest by unique id. What we
get from the writable manifest though is increasingly the `resource`
(data artifact) part of the node, not the full node. This was problematic
because a number of the selector processes _compare_ the old node to the
new node, but the `resource` representation doesn't have the comparator
methods.
In this commit we dict-ify the resource and then get the full node by
undictifying that. We should probably have a better built in process to
the full node objects to do this, but this will do for now.
* Add `from_resource` implementation on `BaseNode` to ease resource to node conversion
We want to easily be able to create nodes from their resource counter
parts. It's actually imperative that we can do so. The previous commit
had a manual way to do so where needed. However, we don't want to have
to put `from_dict(.to_dict())` everywhere. So here we hadded a `from_resource`
class method to `BaseNode`. Everything that inherits from `BaseNode` thus
automatically gets this functionality.
HOWEVER, the implementation currently has a problem. Specifically, the
type for `resource_instance` is `BaseResource`. Which means if one is
calling say `Metric.from_resource()`, one could hand it a `SemanticModelResource`
and mypy won't complain. In this case, a semi-cryptic error might get
raised at runtime. Whether or not an error gets raised depends entirely
on whether or not the dictified resource instance manages to satisfy all
the required attributes of the desired node class. THIS IS VERY BAD.
We should be able to solve this issue in an upcoming (hopefully next)
commit, wherein we genericize `BaseNode` such that when inheriting it
you declare it with a resource type. Technically a runtime error will
still be possible, however any mixups should be caught by mypy on
pre-commit hooks as well as PRs.
* Make `BaseNode` a generic that is defined with a `ResourceType`
Turning `BaseNode` into an ABC generic allows us to say that the inheriting
class can define what resource type from artifacts it should be used with.
This gives us added type safety to what resource type can be passed into
`from_resource` when called via `SemanticModel.from_resource(...)`,
`Metric.from_resource(...)`, and etc.
NOTE: This only gives us type safety from mypy. If we begin ignoring
mypy errors during development, we can still get into a situation for
runtime errors (it's just harder to do so now).
* simplify and modularize tagging logic
* change package field to dropdown, log inputs to publish, skip actual publish for testing
* add dry run option
* update to v3 of docker actions to migrate from node16 (deprecated) to node20
* Move `MetricInputMeasure` to dbt/artifacts
* Move `MetricTimeWindow` to dbt/artifacts
* Move `MetricInput` to dbt/artifacts
* Move `ConstantPropertyInput` and `ConversionTypeParams` to dbt/artifacts
* Move `MetricTypeParams` to dbt/artifacts
* Remove obsolete `MetricReference` class from core
The `MetricReference` defined in `nodes.py` is from pre core 1.6 metrics,
i.e. the legacy semantic layer prior to integrating with MetricFlow. I
double checked and found that this `MetricReference` is found _nowhere_
in core. It is dead, with no plan of coming back. Thus deleting it seems
logical.
* Move `MetricConfig` to dbt/artifacts
* Move data portion of `Metric` node to dbt/artifacts
* Move `depends_on_nodes` and `search_name` back to core `Metric` implementation
I got a little too indiscriminate in what got moved in the `Metric`
definition split in the previous commit. Specifically `depends_on_nodes`
and `search_name` shouldn't have been moved to `dbt/artifacts` as they
are specific core internals, not artifacts to be depended on.
* Add context comment to `metric.py` artifact file about upstream protocols.
* Move the common semantic layer node components to v1 artifact resources
* Move `FileSlice` and `SourceFileMetadata` to `semantic_layer_components` in artifacts
* Split `GraphNode` into a functional class in core and data class in artifacts
* Refactor the `same_context` checks of `Exports` into `SavedQuery`
This is important because we want to move the `Export` class to artifacts.
However, because it had functional parts we would have split it in half,
with the data definition exists in artifacts and the functional specification
defined in core. At first glance thats not problematic. However, the
`SavedQuery` definition in artifacts would only be able to point at the
data definition of `Export`, and then the function `SavedQuery` spec in
core would have to override that with the functional `Export` definition
that exists in core. This would make the inheritance rather wonky and
confusing. This refactor simplifies thigs greatly because now we can move
the entirety of `Export` to artifacts, and the core `SavedQuery` won't
have to override anything.
* Move child components of `SavedQuery` to artifacts
Specifically the components in `contracts/graph/saved_queries.py` which
are `Export`, `ExportConfig`, and `QueryParams` got moved to
`artifacts/resources/v1/saved_query.py`. The moving of `Export` was
made possible by the refactor in the previous commit.
* Move `SavedQueryMandatory` to dbt/artifacts
* Move `SavedQueryConfig` to dbt/artifacts
* Move `DependsOn` class to artifacts
If we had followed the general paradigm we've set, we would have split
`DependsOn` into a data half and a functional half, with the data half
going in artifacts. However, doing so overly complicates the work that
we're doing. Additionally looking forward, we hope to simplify the
`DependsOn` (as well as `MacroDependsOn`) to use `sets` instead of
`lists`, thus allowing us to get rid of the fuctional part. We haven't
done that refactor here because there is a reasonable amount of risk
associated with such a change such that doign so should be it's own
segement of work.
* Move `NodeVersion` and `RefArgs` to dbt/artifacts
I debated about making this two commits. However I only realized we
needed to also move `NodeVersion` when I was most the way through
moving `RefArgs`, and instead of stashing, I just decided to due both.
They're kind of inseparable anyways because it only makes sense to
move `NodeVersion` if you move `RefArgs`, but you can't move `RefArgs`
unless you also move `NodeVersion`. The two in one commit are still
small enough that I'm okay with this.
* Move data portion of `SavedQuery` class to dbt/artifacts
* Update implementation-ticket.yml
Changed "Notion docs" to "documentations"
* Added changelog
* modified the contributing and readme files.
* fixed end of files as test failed on previous commit.
* fixed the test errors.
* Changes as per reviewer's request have been made.
* some changes idk
* Update .changes/unreleased/Under the Hood-20240109-091856.yaml
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Update .github/ISSUE_TEMPLATE/implementation-ticket.yml
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Update .github/ISSUE_TEMPLATE/implementation-ticket.yml
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
---------
Co-authored-by: Tania <tonayya@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* simplify release inputs
* fix vars
* add missing quote:
* drop all env vars since the workflows dont like them
* Update .github/workflows/release.yml
* Add unit test that shows unit tests work with external nodes
* Abbreviate in names in external nodes test to stay under 64 character postgres max
Was getting test failures due to resulting lengthy model names being created
by unit test task in the functional test
* Fix unit test parsing to ensure external nodes continue to keep their package name
* Add seed to test of external node unit test, and indirectly have the external node point to it
Previously I was getting an error about the columns for the external model
not being fetchable from the database via the macro `get_columns_in_relation`.
By creating a seed for the tests, which creates a table in postgres, we can then
tell the external model that it's database schema and identifier (the relation)
is that table from the seed without make the seed an actual dependency of the
external model in the dag.
* Ensure all models in unit test shadow manifest have a non `None` path
External nodes generally don't have paths, but in unit tests we write out
all models to sql files (as this allows us to test them). Thus external
nodes need to have their paths set.
* Add `run` step to function test of unit test with external nodes
This is necessary because when executing a unit tests, the columns
associated with a model in the database are retrieved. For this to
be possible, the model must exist in the database, thus we must
run the associated models at least once first.
* Create a full external package for function test of a unit test with an external node
Previously we were only pseudo creating an external package for testing
how unit tests work with external nodes. This was problematic because the
package didn't actually exist and thus wasn't seen as accessible when running
through dag dependencies. By actually creating the external package, we
ensure that all the built in normal processes happen.
* Add test for more ephemoral external models
* Flip logic in `packages_for_node` to remove error case
By flipping the logic from `not in` to `in` we can drop the exception
and instead default to the model runtime config when the package isn't
found. We're still trying to grok if there will be any fallout from this.
The tests all pass, but that doesn't guarantee nothing bad will happen.
* Add changie doc for added support of external nodes in unit tests
* Initial implementation of unit testing (from pr #2911)
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* 8295 unit testing artifacts (#8477)
* unit test config: tags & meta (#8565)
* Add additional functional test for unit testing selection, artifacts, etc (#8639)
* Enable inline csv format in unit testing (#8743)
* Support unit testing incremental models (#8891)
* update unit test key: unit -> unit-tests (#8988)
* convert to use unit test name at top level key (#8966)
* csv file fixtures (#9044)
* Unit test support for `state:modified` and `--defer` (#9032)
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Allow use of sources as unit testing inputs (#9059)
* Use daff for diff formatting in unit testing (#8984)
* Fix#8652: Use seed file from disk for unit testing if rows not specified in YAML config (#9064)
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Fix#8652: Use seed value if rows not specified
* Move unit testing to test and build commands (#9108)
* Enable unit testing in non-root packages (#9184)
* convert test to data_test (#9201)
* Make fixtures files full-fledged members of manifest and enable partial parsing (#9225)
* In build command run unit tests before models (#9273)
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
* remove dbt.contracts.connection imports from adapter module
* Move events to common (#8676)
* Move events to common
* More Type Annotations (#8536)
* Extend use of type annotations in the events module.
* Add return type of None to more __init__ definitions.
* Still more type annotations adding -> None to __init__
* Tweak per review
* Allow adapters to include python package logging in dbt logs (#8643)
* add set_package_log_level functionality
* set package handler
* set package handler
* add logging about stting up logging
* test event log handler
* add event log handler
* add event log level
* rename package and add unit tests
* revert logfile config change
* cleanup and add code comments
* add changie
* swap function for dict
* add additional unit tests
* fix unit test
* update README and protos
* fix formatting
* update precommit
---------
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
* fix import
* move types_pb2.py from events to common/events
* move agate_helper into common
* Add utils module (#8910)
* moving types_pb2.py to common/events
* split out utils into core/common/adapters
* add changie
* remove usage of dbt.config.PartialProject from dbt/adapters (#8909)
* remove usage of dbt.config.PartialProject from dbt/adapters
* add changie
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* move agate_helper unit tests under tests/unit/common
* move agate_helper into common (#8911)
* move agate_helper into common
* add changie
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* remove dbt.flags.MP_CONTEXT usage in dbt/adapters (#8931)
* remove dbt.flags.LOG_CACHE_EVENTS usage in dbt/adapters (#8933)
* Refactor Base Exceptions (#8989)
* moving types_pb2.py to common/events
* Refactor Base Exceptions
* update make_log_dir_if_missing to handle str
* move remaining adapters exception imports to common/adapters
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Remove usage of dbt.deprecations in dbt/adapters, enable core & adapter-specific (#9051)
* Decouple adapter constraints from core (#9054)
* Move constraints to dbt.common
* Move constraints to contracts folder, per review
* Add a changelog entry.
* move include/global_project to adapters (#8930)
* remove adapter.get_compiler (#9134)
* Move adapter logger to adapters (#9165)
* moving types_pb2.py to common/events
* Move AdapterLogger to adapter folder
* add changie
* delete accidentally merged types_pb2.py
* Move the semver package to common and alter references. (#9166)
* Move the semver package to common and alter references.
* Alter leftover references to dbt.semver, this time using from syntax.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Refactor EventManager setup and interaction (#9180)
* moving types_pb2.py to common/events
* move event manager setup back to core, remove ref to global EVENT_MANAGER and clean up event manager functions
* move invocation_id from events to first class common concept
* move lowercase utils to common
* move lowercase utils to common
* ref CAPTURE_STREAM through method
* add changie
* first pass: adapter migration script (#9160)
* Decouple macro generator from adapters (#9149)
* Remove usage of dbt.contracts.relation in dbt/adapters (#9207)
* Remove ResultNode usage from connections (#9211)
* Add RelationConfig Protocol for use in Relation.create_from (#9210)
* move relation contract to dbt.adapters
* changelog entry
* first pass: clean up relation.create_from
* type ignores
* type ignore
* changelog entry
* update RelationConfig variable names
* Merge main into feature/decouple-adapters-from-core (#9240)
* moving types_pb2.py to common/events
* Restore warning on unpinned git packages (#9157)
* Support --empty flag for schema-only dry runs (#8971)
* Fix ensuring we produce valid jsonschema artifacts for manifest, catalog, sources, and run-results (#9155)
* Drop `all_refs=True` from jsonschema-ization build process
Passing `all_refs=True` makes it so that Everything is a ref, even
the top level schema. In jsonschema land, this essentially makes the
produced artifact not a full schema, but a fractal object to be included
in a schema. Thus when `$id` is passed in, jsonschema tools blow up
because `$id` is for identifying a schema, which we explicitly weren't
creating. The alternative was to drop the inclusion of `$id`. Howver, we're
intending to create a schema, and having an `$id` is recommended best
practice. Additionally since we were intending to create a schema,
not a fractal, it seemed best to create to full schema.
* Explicity produce jsonschemas using DRAFT_2020_12 dialect
Previously were were implicitly using the `DRAFT_2020_12` dialect through
mashumaro. It felt wise to begin explicitly specifying this. First, it
is closest in available mashumaro provided dialects to what we produced
pre 1.7. Secondly, if mashumaro changes its default for whatever reason
(say a new dialect is added, and mashumaro moves to that), we don't want
to automatically inherit that.
* Bump manifest version to v12
Core 1.7 released with manifest v11, and we don't want to be overriding
that with 1.8. It'd be weird for 1.7 and 1.8 to both have v11 manifests,
but for them to be different, right?
* Begin including schema dialect specification in produced jsonschema
In jsonschema's documentation they state
> It's not always easy to tell which draft a JSON Schema is using.
> You can use the $schema keyword to declare which version of the JSON Schema specification the schema is written to.
> It's generally good practice to include it, though it is not required.
and
> For brevity, the $schema keyword isn't included in most of the examples in this book, but it should always be used in the real world.
Basically, to know how to parse a schema, it's important to include what
schema dialect is being used for the schema specification. The change in
this commit ensures we include that information.
* Create manifest v12 jsonschema specification
* Add change documentation for jsonschema schema production fix
* Bump run-results version to v6
* Generate new v6 run-results jsonschema
* Regenerate catalog v1 and sources v3 with fixed jsonschema production
* Update tests to handle bumped versions of manifest and run-results
---------
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
* Move BaseConfig to Common (#9224)
* moving types_pb2.py to common/events
* move BaseConfig and assorted dependencies to common
* move ShowBehavior and OnConfigurationChange to common
* add changie
* Remove manifest from catalog and connection method signatures (#9242)
* Add MacroResolverProtocol, remove lazy loading of manifest in adapter.execute_macro (#9243)
* remove manifest from adapter.execute_macro, replace with MacroResolver + remove lazy loading
* rename to MacroResolverProtocol
* pass MacroResolverProtcol in adapter.calculate_freshness_from_metadata
* changelog entry
* fix adapter.calculate_freshness call
* pass context to MacroQueryStringSetter (#9248)
* moving types_pb2.py to common/events
* remove manifest from adapter.execute_macro, replace with MacroResolver + remove lazy loading
* rename to MacroResolverProtocol
* pass MacroResolverProtcol in adapter.calculate_freshness_from_metadata
* changelog entry
* fix adapter.calculate_freshness call
* pass context to MacroQueryStringSetter
* changelog entry
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* add macro_context_generator on adapter (#9251)
* moving types_pb2.py to common/events
* remove manifest from adapter.execute_macro, replace with MacroResolver + remove lazy loading
* rename to MacroResolverProtocol
* pass MacroResolverProtcol in adapter.calculate_freshness_from_metadata
* changelog entry
* fix adapter.calculate_freshness call
* add macro_context_generator on adapter
* fix adapter test setup
* changelog entry
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Add typing for macro_context_generator, fix query_header_context
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
* Pass mp_context to adapter factory (#9275)
* moving types_pb2.py to common/events
* require core to pass mp_context to adapter factory
* add changie
* fix SpawnContext annotation
* Fix include for decoupling (#9286)
* moving types_pb2.py to common/events
* fix include path in MANIFEST.in
* Fix include for decoupling (#9288)
* moving types_pb2.py to common/events
* fix include path in MANIFEST.in
* add index.html to in MANIFEST.in
* move system client to common (#9294)
* moving types_pb2.py to common/events
* move system.py to common
* add changie update README
* remove dbt.utils from semver.py
* remove aliasing connection_exception_retry
* Update materialized views to use RelationConfigs and remove refs to dbt.utils (#9291)
* moving types_pb2.py to common/events
* add AdapterRuntimeConfig protocol and clean up dbt-postgress core imports
* add changie
* remove AdapterRuntimeConfig
* update changelog
* Add config field to RelationConfig (#9300)
* moving types_pb2.py to common/events
* add config field to RelationConfig
* merge main into feature/decouple-adapters-from-core (#9305)
* moving types_pb2.py to common/events
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Remove `--dry-run` flag from `dbt deps` (#9169)
* Rm --dry-run flag for dbt deps
* Add changelog entry
* Update test
* PR feedback
* adding clean_up methods to basic and unique_id tests (#9195)
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
---------
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
* resolve merge conflict on unparsed.py (#9309)
* moving types_pb2.py to common/events
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Remove `--dry-run` flag from `dbt deps` (#9169)
* Rm --dry-run flag for dbt deps
* Add changelog entry
* Update test
* PR feedback
* adding clean_up methods to basic and unique_id tests (#9195)
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
---------
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
* Resolve unparsed.py conflict (#9311)
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Remove `--dry-run` flag from `dbt deps` (#9169)
* Rm --dry-run flag for dbt deps
* Add changelog entry
* Update test
* PR feedback
* adding clean_up methods to basic and unique_id tests (#9195)
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
---------
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
---------
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: Mila Page <67295367+VersusFacit@users.noreply.github.com>
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
* Drop `all_refs=True` from jsonschema-ization build process
Passing `all_refs=True` makes it so that Everything is a ref, even
the top level schema. In jsonschema land, this essentially makes the
produced artifact not a full schema, but a fractal object to be included
in a schema. Thus when `$id` is passed in, jsonschema tools blow up
because `$id` is for identifying a schema, which we explicitly weren't
creating. The alternative was to drop the inclusion of `$id`. Howver, we're
intending to create a schema, and having an `$id` is recommended best
practice. Additionally since we were intending to create a schema,
not a fractal, it seemed best to create to full schema.
* Explicity produce jsonschemas using DRAFT_2020_12 dialect
Previously were were implicitly using the `DRAFT_2020_12` dialect through
mashumaro. It felt wise to begin explicitly specifying this. First, it
is closest in available mashumaro provided dialects to what we produced
pre 1.7. Secondly, if mashumaro changes its default for whatever reason
(say a new dialect is added, and mashumaro moves to that), we don't want
to automatically inherit that.
* Bump manifest version to v12
Core 1.7 released with manifest v11, and we don't want to be overriding
that with 1.8. It'd be weird for 1.7 and 1.8 to both have v11 manifests,
but for them to be different, right?
* Begin including schema dialect specification in produced jsonschema
In jsonschema's documentation they state
> It's not always easy to tell which draft a JSON Schema is using.
> You can use the $schema keyword to declare which version of the JSON Schema specification the schema is written to.
> It's generally good practice to include it, though it is not required.
and
> For brevity, the $schema keyword isn't included in most of the examples in this book, but it should always be used in the real world.
Basically, to know how to parse a schema, it's important to include what
schema dialect is being used for the schema specification. The change in
this commit ensures we include that information.
* Create manifest v12 jsonschema specification
* Add change documentation for jsonschema schema production fix
* Bump run-results version to v6
* Generate new v6 run-results jsonschema
* Regenerate catalog v1 and sources v3 with fixed jsonschema production
* Update tests to handle bumped versions of manifest and run-results
In [dbt-labs/dbt-core#7984](https://github.com/dbt-labs/dbt-core/pull/7984)
we began setting a metrics `type_params.input_measures` during metric
processing post parsing. However in that PR we didn't clean up the comment
in the parser about setting `input_measures`. This is that post fact
cleanup.
* Add test asserting GraphRunnableTasks attempt to cancel connections on SystemExit
* Add test asserting GraphRunnableTasks attempt to cancel connections on KeyboardInterrupt
* Add test asserting GraphRunnableNode doesn't try to cancel connections on generic Exception
* tarball lockfile fix
* Add changie doc for tarball deps issue
* Add integration test for ensuring tarball package specification works
This test was written _after_ the fix was commited. However, I ran this
test against main without the fix and it failed. After running the test
with the tarball fix, it passed.
* Remove unnecessary `tarball` conditional logic in `PackageConfig.validate`
We had a conditional to skip validation for a package if the package
included the `tarball` key. However, this conditional always returned
false as it was nested inside a conditional that the package had the
default `package` key, which means it's not a tarball package, but a
package package (maybe we need better differentiation here). If we need
additional validation for tarballs down the road, we should do that one
level up. At this time we have no additional validaitons to add.
* Fix typos in changie doc for tarball deps issue
* Improve tarball package test naming and add related unhappy path test
* Remove unnecessary `setUp` fixture from tarball package tests
We initially included this fixture due to copy and pasting another
test. However, this `setUp` fixture isn't actually necessary for the
tarball dependency tests.
---------
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Add test asserting `SavedQuery` configs can be set from `dbt_project.yml`
* Allow extraneous properties in Export configs
This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagate to the final config.
* Add `ExportConfig` options to `SavedQueryConfig` options
This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.
There are a couple caveots to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig` This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`
* Begin inheriting configs from saved query config, and transitively from project config
Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say an export config will inherity
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally because the saved query config has
a similar relationship with the project config, exports configs can inherit
from the project config (again with a preference for export config specifications).
* Correct conditional in export config building for map schema to schema_name
I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or none, not
a bool, and thus would never match the right hand side. Crazy. It has now
been fixed to do the thing we want to it to do. If `schema` isn't `None`,
and `schema_name` is `None`, then set `schema_name` to have the value of
`schema`.
* Update parameter names in `_get_export_config` to be more verbose
* Supports non half width alphanumeric for generic test
* Unify conditional statements
* add CHANGELOG entries
* added test for Japanese
* Move the fix further upstream
* Remove the changes in core/dbt/task/runnable.py
* Fix accidental removal of `_` substitution character
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* Handle unknown `type_code` for model contracts
* Changelog entry
* Fix changelog entry
* Functional test for a `type_code` that is not recognized by psycopg2
* Functional tests for data type mismatches
* add test
* fix test
* first pass with constraint error
* add back column checks for temp tables
* changelog
* Update .changes/unreleased/Fixes-20231024-145504.yaml
* changie doc for DSI 0.3.0 upgrade
* Gracefully handle v10 metric filters
* Fix iteration over metrics in `upgrade_v10_metric_filters`
* Update previous manifest version test fixtures to have more expressive metrics
* Regenerate the test v10 manifest artifact using the more expressive metrics from 904cc1ef
To do this I cherry-picked 904cc1ef onto my local 1.6.latest branch,
had the test regenerate the test v10 manifest artifact, and then over
wrote the test v10 manifest artifact on this branch (cherry-picking it
across the branches didn't work, had to copy paste :grimmace:)
* Regenerate test v11 manifest artifact using the fixture changes in 904cc1ef
* Update `upgrade_v10_metric_filters` to handled disabled metrics
Regenerating the v10 and v11 test manifest artifacts uncovered an
issue wherein we weren't handling disabled metrics that need to
get upgraded. This commit fixes that. Additionally, the
`upgrade_v10_metric_filters` was getting a bit unwieldy, so I broke
extracted the abstracted sub functions.
* Fix `test_backwards_compatible_versions` test
When we regenerated the v10 test manifest artifact, it started having
the `metricflow_time_sine` model, and it didn't previously. This caused
`test_backwards_compatible_versions` to start failing because it was
no longer identified as having modified state for v10. The test has
been altered accordingly
* Bump to dbt-semantic-interfaces 0.3.0b1
* Update import path of `WhereFilterParser` from `dbt-semantic-interfaces`
In 0.3.x of `dbt-semantic-intefaces` the location of the WhereFilterParser
moved to be grouped in with a bunch of new adjacent code. As such,
we needed to correct our import path of it.
* Create basic `SavedQuery` node type based on `SavedQuery` protocol from DSI
* Add ability to add SavedQueries to the manifest
* Define unparsed SavedQuery node
* Begin parsing saved_query objects to manifest
* Skip jinja rendering of `SavedQuery.where` property
* Begin propagating `SavedQueries` on the manifest to the semantic manifest
* Add tests for basic saved query parsing
* Add custom pluralization handling of SavedQuery node type
* Add a config subclass to SavedQuery node
* Move the SavedQuery node to nodes.py
Unfortunately things are a bit too intertwined currently for SavedQuery
to be in it's own file. We need to add the SavedQuery node to the
GraphMemberNode, unfortunately with SavedQuery in it's own file,
importing it would have caused a circular dependency. We'll need
to separately come in and split things up as a cleanup portion of
work.
* Add basic plumbing of saved query configs to projects
* Add basic lookup utility for saved queries, SavedQueryLookup
* Handle disabled SavedQuery nodes in parsing and lookups
* Add SavedQuery nodes to grouping process
Our grouping logic seems to be in a weird spot. It seems liek we're
moving to setting the `group` for a node in the node's `config` however,
all of the logic around grouping is still focused on the top level `group`
property on a nodes. To get group stuff plumbed I've thus added `group`
as a top level property of the `SavedQuery` node, and populated it from
the config group value.
* Plumb through saved query in a lot more places
I don't like making scatter shot commits like this. However, a lot
of this commit was written ~4am, soooo yea. Things were broken, I wanted
things to be unbroken. I mostly searched for `semantic_models` and added
the equivalent necessary `saved_queries`. Some stuff is in support of
writing out the manifest, some stuff helps with node selection, it's a
lot of miscelaneous stuff that I don't fully understand.
* Add `depends_on` to `SavedQuery` nodes and populate from `metrics` property
* Add partial parsing support to SavedQuery nodes
* Add `docs` support for SavedQuery descriptions
* Support selctor methods for SavedQuery nodes
* Add `refs` property to SavedQuery node
We don't actually append anything to `refs` for SavedQuery nodes currently.
I'm not sure if anything needs to be appended to them. Regardless, we
access the `refs` property throughout the codebase while iterating over
nodes. It seems wise to support this attribute as to not accidently blow
something up with it not existing.
* Support `saved_queries` when upgrading from manifests <= v10 (and regenerate v11)
* Add changie doc for saved query node support
* Pin to dbt-semantic-interfaces 0.3.0b1 for saved query work
We're gonna release DSI 0.3.0, and if this PR automatically pulls that
in things will break. But the things that need fixing should be handled
separately from this PR. After releasing DSI 0.3.0 I'm going to create
a branch off/ontop of this one, and open a stacked PR with the associated
changes.
* Bump supported DSI version to 0.3.x
* Switch metric filters and saved query where to use ne WhereFilterIntersection
* Update schema yaml readers to create WhereFilterInterfaces
* Expand metric filters and saved query where property to handle both str and list of strs
* Update tests which were broken by where filter changes
* Regeneate v11 manifest
* Fixup: Update `SavedQueryLookup.perform_lookup` to operate on saved queries
I missed this when I was copy and pasting 🤦
* Add support for getting freshness from DBMS metadata
* Add changelog entry
* Add simple test case
* Change parsing error to warning and add new event type for warning
* Code review simplification of capability dict.
* Revisions to the capability mechanism per review
* Move utility function.
* Reduce try/except scope
* Clean up imports.
* Simplify typing per review
* Unit test fix
* add `store_failures_as` parameter to TestConfig, catch strategy parameter in test materialization
* create test results as views
* updated test expected values for new config option
* break up tests into reusable tests and adapter specific configuration, update test to check for relation type and confirm views update
* move test configuration into base test class
* allow `store_failures_as` to drive whether failures are stored
* update expected test config dicts to include the new default value for store_failures_as
* Add `store_failures_as` config for generic tests
* cover --store-failures on CLI gap
* add generic tests test case for store_failures_as
* update object names for generic test case tests for store_failures_as
* remove unique generic test, it was not testing `store_failures_as`
* pull generic run and assertion into base test class to turn tests into quasi-parameterized tests
* add ephemeral option for store_failures_as, as a way to easily turn off store_failures at the model level
* add compilation error for invalid setting of store_failures_as
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* Explanation of Parsing vs. Compilation vs. Runtime
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Apply suggestions from code review
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
* Fix a couple markdown rendering issues
* Move to the "explain it like im 64" folder
When ELI5 just isnt detailed enough.
* Disambiguate Python references
Disambiguate Python references and delineate SQL models ("Jinja-SQL") from Python models ("dbt-py")
---------
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
* Add semantic model test to `test_contracts_graph_parsed.py`
The tests in `test_contracts_graph_parsed.py` are meant to ensure
that we can go from objects to dictionaries and back without any
changes. We've had a desire to simplify these tests. Most tests in
this file have three to four fixtures, this test only has one. What
a test of this format ensures is that parsing a SemanticModel from
a dictionary doesn't add/drop any keys from the dictionary and that
when going back to the dictionary no keys are dropped. This style of
test will still break whenever the semantic model (or sub objects)
change. However now when that happens, only one fixture will have to
be updated (whereas previously we had to update 3-4 fixtures).
* Begin using hypothesis package for symmetry testing
Hypothesis is a python package for doing property testing. The `@given`
parameterizes a test, with it generating the arguements it has following
`strategies`. The main strategies we use is `builds` this takes in a callable
passes any sub strategies for named arguements, and will try to infer any
other arguments if the callable is typed. I found that even though the
test was run many many times, some of the `SemanticModel` properties
weren't being changed. For instance `dimensions`, `entities`, and `measures`
were always empty lists. Because of this I defined sub strategies for
some attributes of `SemanticModel`s.
* Update unittest readme to have details on test_contracts_graph_parsed methodology
* Include option to generate static index.html
* Added changie
* Using DBT's system load / write file methods for better cross platform
support
* Updated docs tests with dbt.client.systems calls for file reading
* Writing out static_index.html as binary file to prevent line-ending
conversions on Windows. (similar behaviour as index.html)
* Add performance metrics to the CommandCompleted event.
* Add changelog entry.
* Add flag for controling the log level of ResourceReport.
* Update changelog entry to reflect changes
* Remove outdated attributes
* Work around missing resource module on windows
* Fix corner case where flags are not set
* Add new get_catalog_relations macro, allowing dbt to specify which relations in a schema the adapter should return data about
* Implement postgres adapter support for relation filtering on catalog queries
* Code review changes adding feature flag for catalog-by-relation-list support
* Use profile specified in --profile with dbt init (#7450)
* Use profile specified in --profile with dbt init
* Update .changes/unreleased/Fixes-20230424-161642.yaml
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Refactor run() method into functions, replace exit() calls with exceptions
* Update help text for profile option
---------
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* add TestLargeEphemeralCompilation (#8376)
* Fix a couple of issues in the postgres implementation of get_catalog_relations
* Add relation count limit at which to fall back to batch retrieval
* Better feature detection mechanism for adapters.
* Code review changes to get_catalog_relations and adapter feature checking
* Add changelog entry
---------
Co-authored-by: ezraerb <ezraerb@alum.mit.edu>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
* Add `date_spine` macro (and macros it depends on) from dbt-utils to core
The macros added are
- date_spine
- get_intervals_between
- generate_series
- get_powers_of_two
We're adding these to core because they are becoming more prevalently used
with the increase usage in the semantic layer. Basically if you are
using the semantic layer currently, then it is almost a requirement
to use dbt-utils, which is undesireable given the SL is supported directly
in core. The primary focus of this was to just add `date_spine`. However,
because `date_spine` depends on other macros, these other macros were
also moved.
* Add adapter tests for `get_powers_of_two` macro
* Add adapter tests for `generate_series` macro
* Add adapter tests for `get_intervals_between` macro
* Add adapter tests for `date_spine` macro
* Improve test fixture for `date_spine` macro to work with multiple adapters
* Cast to types to date in fixture_date_spine when targeting redshift
* Improve test fixture for `get_intervals_between` macro to work with multiple adapters
* changie doc for adding date_spine macro
* Include 'join_to_timespine` and `fill_nulls_with` in metric fixture
* Support `join_to_timespine` and `fill_nulls_with` properties on measure inputs to metrics
* Assert new `fill_nulls_with` and `join_to_timespine` properties don't break associated DSI protocol
* Add doc for metric null coalescing improvements
* Fix unit test for unparsed metric objects
The `assert_symmetric` function asserts that dictionaries are mostly
equivalent. I say mostly equivalent because it drops keys that are
`None`. The issue is that that `join_to_timespine` gets defaulted
to `False`, so we have to specify it in the `get_ok_dict` so that
they match.
* allow multioption to be quoted
* changelog
* fix test
* remove list format
* fix tests
* fix list object
* review arg change
* fix quotes
* Update .changes/unreleased/Features-20230918-150855.yaml
* add types
* convert list to set in test
* make mypy happy
* mroe mypy happiness
* more mypy happiness
* last mypy change
* add node to test
- Removed the FirstRunResultError and AfterFirstRunResultError event types, using the existing RunResultError in their place. ([#7963](https://github.com/dbt-labs/dbt-core/issues/7963))
### Features
- Enable re-population of metadata vars post-environment change during programmatic invocation ([#8010](https://github.com/dbt-labs/dbt-core/issues/8010))
- Added support to configure a delimiter for a seed file, defaults to comma ([#3990](https://github.com/dbt-labs/dbt-core/issues/3990))
- Allow specification of `create_metric: true` on measures ([#8125](https://github.com/dbt-labs/dbt-core/issues/8125))
### Fixes
- Copy dir during `dbt deps` if symlink fails ([#7428](https://github.com/dbt-labs/dbt-core/issues/7428), [#8223](https://github.com/dbt-labs/dbt-core/issues/8223))
- Display contract and column constraints on the model page ([dbt-docs/#433](https://github.com/dbt-labs/dbt-docs/issues/433))
- Display semantic model details in docs ([dbt-docs/#431](https://github.com/dbt-labs/dbt-docs/issues/431))
### Under the Hood
- Refactor flaky test pp_versioned_models ([#7781](https://github.com/dbt-labs/dbt-core/issues/7781))
- format exception from dbtPlugin.initialize ([#8152](https://github.com/dbt-labs/dbt-core/issues/8152))
- A way to control maxBytes for a single dbt.log file ([#8199](https://github.com/dbt-labs/dbt-core/issues/8199))
- Ref expressions with version can now be processed by the latest version of the high-performance dbt-extractor library. ([#7688](https://github.com/dbt-labs/dbt-core/issues/7688))
- Bump manifest schema version to v11, freeze manifest v10 ([#8333](https://github.com/dbt-labs/dbt-core/issues/8333))
- add tracking for plugin.get_nodes calls ([#8344](https://github.com/dbt-labs/dbt-core/issues/8344))
- add internal flag: --no-partial-parse-file-diff to inform whether to compute a file diff during partial parsing ([#8363](https://github.com/dbt-labs/dbt-core/issues/8363))
- Add return values to a number of functions for mypy ([#8389](https://github.com/dbt-labs/dbt-core/issues/8389))
- Fix mypy warnings for ManifestLoader.load() ([#8401](https://github.com/dbt-labs/dbt-core/issues/8401))
- Use python version 3.10.7 in Docker image. ([#8444](https://github.com/dbt-labs/dbt-core/issues/8444))
### Dependencies
- Bump mypy from 1.3.0 to 1.4.0 ([#7912](https://github.com/dbt-labs/dbt-core/pull/7912))
- Bump mypy from 1.4.0 to 1.4.1 ([#8219](https://github.com/dbt-labs/dbt-core/pull/8219))
- Update pin for click<9([#8232](https://github.com/dbt-labs/dbt-core/pull/8232))
- Add node attributes related to compilation to run_results.json ([#7519](https://github.com/dbt-labs/dbt-core/issues/7519))
- Support configuration of semantic models with the addition of enable/disable and group enablement. ([#7968](https://github.com/dbt-labs/dbt-core/issues/7968))
### Fixes
- Add support for swapping materialized views with tables/views and vice versa ([#8449](https://github.com/dbt-labs/dbt-core/issues/8449))
- Turn breaking changes to contracted models into warnings for unversioned models ([#8384](https://github.com/dbt-labs/dbt-core/issues/8384), [#8282](https://github.com/dbt-labs/dbt-core/issues/8282))
- Ensure parsing does not break when `window_groupings` is not specified for `non_additive_dimension` ([#8453](https://github.com/dbt-labs/dbt-core/issues/8453))
- fix ambiguous reference error for tests and versions when model name is duplicated across packages ([#8327](https://github.com/dbt-labs/dbt-core/issues/8327), [#8493](https://github.com/dbt-labs/dbt-core/issues/8493))
- Fix "Internal Error: Expected node <unique-id> not found in manifest" when depends_on set on ModelNodeArgs ([#8506](https://github.com/dbt-labs/dbt-core/issues/8506))
- Maximally parallelize dbt clone in clone command" ([#7914](https://github.com/dbt-labs/dbt-core/issues/7914))
- Add --host flag to dbt docs serve, defaulting to '127.0.0.1' ([#10229](https://github.com/dbt-labs/dbt-core/issues/10229))
- Update data_test to accept arbitrary config options ([#10197](https://github.com/dbt-labs/dbt-core/issues/10197))
- add pre_model and post_model hook calls to data and unit tests to be able to provide extra config options ([#10198](https://github.com/dbt-labs/dbt-core/issues/10198))
- add --empty value to jinja context as flags.EMPTY ([#10317](https://github.com/dbt-labs/dbt-core/issues/10317))
- Warning message for snapshot timestamp data types ([#10234](https://github.com/dbt-labs/dbt-core/issues/10234))
- Support cumulative_type_params & sub-daily granularities in semantic manifest. ([#10360](https://github.com/dbt-labs/dbt-core/issues/10360))
- Add time_granularity to metric spec. ([#10376](https://github.com/dbt-labs/dbt-core/issues/10376))
- Support standard schema/database fields for snapshots ([#10301](https://github.com/dbt-labs/dbt-core/issues/10301))
- Support ref and source in foreign key constraint expressions, bump dbt-common minimum to 1.6 ([#8062](https://github.com/dbt-labs/dbt-core/issues/8062))
- Support new semantic layer time spine configs to enable sub-daily granularity. ([#10475](https://github.com/dbt-labs/dbt-core/issues/10475))
- Include models that depend on changed vars in state:modified, add state:modified.vars selection method ([#4304](https://github.com/dbt-labs/dbt-core/issues/4304))
- Add support for behavior flags ([#10618](https://github.com/dbt-labs/dbt-core/issues/10618))
- Enable `--resource-type` and `--exclude-resource-type` CLI flags and environment variables for `dbt test` ([#10656](https://github.com/dbt-labs/dbt-core/issues/10656))
- Execute microbatch models in batches ([#10700](https://github.com/dbt-labs/dbt-core/issues/10700))
- Create 'skip_nodes_if_on_run_start_fails' behavior change flag ([#7387](https://github.com/dbt-labs/dbt-core/issues/7387))
- Allow snapshots to be defined in YAML. ([#10246](https://github.com/dbt-labs/dbt-core/issues/10246))
- Write microbatch compiled/run targets to separate files, one per batch ([#10714](https://github.com/dbt-labs/dbt-core/issues/10714))
- Track incremental_strategy as part of model_run tracking event ([#10761](https://github.com/dbt-labs/dbt-core/issues/10761))
- Support required 'begin' config for microbatch models ([#10701](https://github.com/dbt-labs/dbt-core/issues/10701))
- Parse-time validation of microbatch configs: require event_time, batch_size, lookback and validate input event_time ([#10709](https://github.com/dbt-labs/dbt-core/issues/10709))
- Added the --inline-direct parameter to 'dbt show' ([#10770](https://github.com/dbt-labs/dbt-core/issues/10770))
- Enable `retry` support for microbatch models ([#10715](https://github.com/dbt-labs/dbt-core/issues/10715), [#10729](https://github.com/dbt-labs/dbt-core/issues/10729))
- Use unrendered database and schema source properties during state:modified, behind state_modified_compare_more_unrendered_values behavoiur flag ([#9573](https://github.com/dbt-labs/dbt-core/issues/9573))
- Ensure microbatch models respect `full_refresh` model config ([#10785](https://github.com/dbt-labs/dbt-core/issues/10785))
- Adds validations for custom_granularities to ensure unique naming. ([#9265](https://github.com/dbt-labs/dbt-core/issues/9265))
- Test case for `merge_exclude_columns` ([#8267](https://github.com/dbt-labs/dbt-core/issues/8267))
- Convert "Skipping model due to fail_fast" message to DEBUG level ([#8774](https://github.com/dbt-labs/dbt-core/issues/8774))
- Restore previous behavior for --favor-state: only favor defer_relation if not selected in current command" ([#10107](https://github.com/dbt-labs/dbt-core/issues/10107))
- Unit test fixture (csv) returns null for empty value ([#9881](https://github.com/dbt-labs/dbt-core/issues/9881))
- Fix json format log and --quiet for ls and jinja print by converting print call to fire events ([#8756](https://github.com/dbt-labs/dbt-core/issues/8756))
- Add resource type to saved_query ([#10168](https://github.com/dbt-labs/dbt-core/issues/10168))
- Fix: Order-insensitive unit test equality assertion for expected/actual with multiple nulls ([#10167](https://github.com/dbt-labs/dbt-core/issues/10167))
- Renaming or removing a contracted model should raise a BreakingChange warning/error ([#10116](https://github.com/dbt-labs/dbt-core/issues/10116))
- prefer disabled project nodes to external node ([#10224](https://github.com/dbt-labs/dbt-core/issues/10224))
- Fix issues with selectors and inline nodes ([#8943](https://github.com/dbt-labs/dbt-core/issues/8943), [#9269](https://github.com/dbt-labs/dbt-core/issues/9269))
- Fix snapshot config to work in yaml files ([#4000](https://github.com/dbt-labs/dbt-core/issues/4000))
- Improve handling of error when loading schema file list ([#10284](https://github.com/dbt-labs/dbt-core/issues/10284))
- Use model alias for the CTE identifier generated during ephemeral materialization ([#5273](https://github.com/dbt-labs/dbt-core/issues/5273))
- Implement state:modified for saved queries ([#10294](https://github.com/dbt-labs/dbt-core/issues/10294))
- Saved Query node fail during skip ([#10029](https://github.com/dbt-labs/dbt-core/issues/10029))
- DOn't warn on `unit_test` config paths that are properly used ([#10311](https://github.com/dbt-labs/dbt-core/issues/10311))
- Fix setting `silence` of `warn_error_options` via `dbt_project.yaml` flags ([#10160](https://github.com/dbt-labs/dbt-core/issues/10160))
- Attempt to provide test fixture tables with all values to set types correctly for comparisong with source tables ([#10365](https://github.com/dbt-labs/dbt-core/issues/10365))
- Limit data_tests deprecation to root_project ([#9835](https://github.com/dbt-labs/dbt-core/issues/9835))
- CLI flags should take precedence over env var flags ([#10304](https://github.com/dbt-labs/dbt-core/issues/10304))
- Fix typing for artifact schemas ([#10442](https://github.com/dbt-labs/dbt-core/issues/10442))
- Fix over deletion of generated_metrics in partial parsing ([#10450](https://github.com/dbt-labs/dbt-core/issues/10450))
- Do not update varchar column definitions if a contract exists ([#10362](https://github.com/dbt-labs/dbt-core/issues/10362))
- fix all_constraints access, disabled node parsing of non-uniquely named resources ([#10509](https://github.com/dbt-labs/dbt-core/issues/10509))
- respect --quiet and --warn-error-options for flag deprecations ([#10105](https://github.com/dbt-labs/dbt-core/issues/10105))
- Propagate measure label when using create_metrics ([#10536](https://github.com/dbt-labs/dbt-core/issues/10536))
- Fix state:modified check for exports ([#10138](https://github.com/dbt-labs/dbt-core/issues/10138))
- Filter out empty nodes after graph selection to support consistent selection of nodes that depend on upstream public models ([#8987](https://github.com/dbt-labs/dbt-core/issues/8987))
- Late render pre- and post-hooks configs in properties / schema YAML files ([#10603](https://github.com/dbt-labs/dbt-core/issues/10603))
- Allow the use of env_var function in certain macros in which it was previously unavailable. ([#10609](https://github.com/dbt-labs/dbt-core/issues/10609))
- Remove deprecation for tests: to data_tests: change ([#10564](https://github.com/dbt-labs/dbt-core/issues/10564))
- Fix `--resource-type test` for `dbt list` and `dbt build` ([#10730](https://github.com/dbt-labs/dbt-core/issues/10730))
- Fix unit tests for incremental model with alias ([#10754](https://github.com/dbt-labs/dbt-core/issues/10754))
- Allow singular tests to be documented in properties.yml ([#9005](https://github.com/dbt-labs/dbt-core/issues/9005))
- Ignore --empty in unit test ref/source rendering ([#10516](https://github.com/dbt-labs/dbt-core/issues/10516))
- Ignore rendered jinja in configs for state:modified, behind state_modified_compare_more_unrendered_values behaviour flag ([#9564](https://github.com/dbt-labs/dbt-core/issues/9564))
- Improve performance of infer primary key ([#10781](https://github.com/dbt-labs/dbt-core/issues/10781))
- Attempt to skip saved query processing when no semantic manifest changes ([#10563](https://github.com/dbt-labs/dbt-core/issues/10563))
- Ensure dbt retry of microbatch models doesn't lose prior successful state ([#10800](https://github.com/dbt-labs/dbt-core/issues/10800))
### Docs
- Enable display of unit tests ([dbt-docs/#501](https://github.com/dbt-labs/dbt-docs/issues/501))
- Unit tests not rendering ([dbt-docs/#506](https://github.com/dbt-labs/dbt-docs/issues/506))
- Add support for Saved Query node ([dbt-docs/#486](https://github.com/dbt-labs/dbt-docs/issues/486))
- Fix npm security vulnerabilities as of June 2024 ([dbt-docs/#513](https://github.com/dbt-labs/dbt-docs/issues/513))
### Under the Hood
- Clear error message for Private package in dbt-core ([#10083](https://github.com/dbt-labs/dbt-core/issues/10083))
- Enable use of context in serialization ([#10093](https://github.com/dbt-labs/dbt-core/issues/10093))
- Make RSS high water mark measurement more accurate on Linux ([#10177](https://github.com/dbt-labs/dbt-core/issues/10177))
- Enable record filtering by type. ([#10240](https://github.com/dbt-labs/dbt-core/issues/10240))
- Additional logging for skipped ephemeral models ([#10389](https://github.com/dbt-labs/dbt-core/issues/10389))
- bump black to 24.3.0 ([#10454](https://github.com/dbt-labs/dbt-core/issues/10454))
- generate protos with protoc version 5.26.1 ([#10457](https://github.com/dbt-labs/dbt-core/issues/10457))
- Move from minimal-snowplow-tracker fork back to snowplow-tracker ([#8409](https://github.com/dbt-labs/dbt-core/issues/8409))
- Add group info to RunResultError, RunResultFailure, RunResultWarning log lines ([#](https://github.com/dbt-labs/dbt-core/issues/))
- Improve speed of tree traversal when finding children, increasing build speed for some selectors ([#10434](https://github.com/dbt-labs/dbt-core/issues/10434))
- Add test for sources tables with quotes ([#10582](https://github.com/dbt-labs/dbt-core/issues/10582))
- Additional type hints for `core/dbt/version.py` ([#10612](https://github.com/dbt-labs/dbt-core/issues/10612))
- Fix typing issues in core/dbt/contracts/sql.py ([#10614](https://github.com/dbt-labs/dbt-core/issues/10614))
- Fix type errors in `dbt/core/task/clean.py` ([#10616](https://github.com/dbt-labs/dbt-core/issues/10616))
- Add Snowplow tracking for behavior flag deprecations ([#10552](https://github.com/dbt-labs/dbt-core/issues/10552))
- Add test utility patch_microbatch_end_time for adapters testing ([#10713](https://github.com/dbt-labs/dbt-core/issues/10713))
- Replace `TestSelector` with `ResourceTypeSelector` ([#10718](https://github.com/dbt-labs/dbt-core/issues/10718))
- Standardize returning `ResourceTypeSelector` instances in `dbt list` and `dbt build` ([#10739](https://github.com/dbt-labs/dbt-core/issues/10739))
- Add group metadata info to LogModelResult and LogTestResult ([#10775](https://github.com/dbt-labs/dbt-core/issues/10775))
- Increase supported version range for dbt-semantic-interfaces. Needed to support custom calendar features. ([#9265](https://github.com/dbt-labs/dbt-core/issues/9265))
### Security
- Explicitly bind to localhost in docs serve ([#10209](https://github.com/dbt-labs/dbt-core/issues/10209))
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.