* Add event name to `message` of recently added deprecations
* Make it harder to not supply the event name to deprecation messages
* Add changie doc
* Fixup import naming
* initial hatch implementation
* cleanup docs
* replacing makefile
* cleanup hatch commands to match adapters
reorganize more to match adapters setup
script comment
don't pip install
fix test commands
* changelog
improve changelog
* CI fix
* fix for env
* use a standard version file
* remove odd license logic
* fix bumpversion
* remove sha input
* more cleanup
* fix legacy build path
* define version for pyproject.toml
* use hatch hook for license
* remove tox
* ensure tests are split
* remove temp file for testing
* explicitly match old version in pyproject.toml
* fix up testing
* get rid of bumpversion
* put dev_dependencies.txt in hatch
* setup.py is now dead
* set python version for local dev
* local dev fixes
* temp script to compare wheels
* parity with existing wheel builds
* Revert "temp script to compare wheels"
This reverts commit c31417a092.
* fix docker test file
* Allow dbt deps to run when vars lack defaults in dbt_project.yml
* Added Changelog entry
* fixed integration tests
* fixed mypy error
* Fix: Use strict var validation by default, lenient only for dbt deps to show helpful errors
* Fixed Integration tests
* fixed nit review comments
* addressed review comments and cleaned up tests
* addressed review comments and cleaned up tests
* Add test checking that `NoNodesForSelectionCriteria` is only fired once per invocation
* Stop emitting `NoNodesForSelectionCriteria` three times during `build` command
* update changelog
---------
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
* Explicitly support functions during partial parsing
* Emit a `Note` event when partial parsing is skipped due to there being no changes
* Begin testing partial parsing support of function nodes
* Add changie doc
* Move test_pp_functions to use `EventCatcher` from dbt-common
* Remove from `functions` instead of `nodes` during partial parsing function deletion
* Fix the partial parsing scheduling of function sql and yaml files
Previously we were treating the partial parsing scheduling of function
files as if they were defined only by YAML files. However, functions consist
of a "raw code file" (typically a .sql file) and a YAML file. We needed
to update the deletion handling + scheduling of functions during partial
parsing to act more like "mssat" files in order to achieve this.
This work was primarily done agentically, but then simplified by hand
afterwards.
* Test that changing the alias of a function doesn't require reparsing of the downstream nodes that reference it
* Add test to check that functions with non-default schemas get their schemas created
* Ensure schemas of function nodes are created when in DAG during `build` command
* Add changie doc for function schema bug fix
* Add tests to check parsing of function argument default values
* Begin allowing the specification of `default_value` on function arguments
* Validate that non-default function arguments don't come _after_ default function arguments
* Add changie doc
* Clean up changelog on main
* Bumping version to 1.12.0a1
* Code quality cleanup
* Update CHANGELOG.md
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Propagate measure.config to metric.config when specified during create_metric:True
* changelog
* Update the metric.expr to be populated correctly according to DSI rules
* convert setup.py to pyproject.toml
* move dev requirements into pyproject.toml
* with setup.py gone we can install from root
* lint
clearly state intention to remove
* convert precommit to use dev deps
* consolidate version to pyproject.toml
* editable req
get rid of editable-req
* docs updates
* tweak configs for builds
* fix script
* changelog
* fixes to build
* revert unnecessary changes
more simplification
revert linting
more simplification
fix
don’t need it
* Update `setup.py` to drop support for python 3.9
* Update github issue templates to not use python 3.9 as an example
* Update github workflows to no longer depend on or test python 3.9
* Drop python 3.9 from the test dockerfile
* Update `CONTRIBUTING.md` to correctly list what python versions we test
* Update comment about some code specifically needed for a python 3.9.7 issue
* Update pre-commit python version comment
* Add changie doc
* Update imports from click as upgrading to python 3.10 changed some click items
* Add test to check that python UDFs can be parsed
* Add `entry_point` and `runtime_version` to function node config
These two configs are required for python UDFs in some warehouses and
may also be required for other UDF languages moving forward. The specific
adapters implementation will enforce the requirement. By default both
configs will be `None` unless set.
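A minimal sketch of that shape, assuming a dataclass-style config (the class name is illustrative; the real function config class in dbt-core has many more fields):
```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FunctionNodeConfig:
    # Required for python UDFs on some warehouses; the adapter enforces it.
    entry_point: Optional[str] = None
    # e.g. a python runtime identifier; stays None unless explicitly set.
    runtime_version: Optional[str] = None
```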
* Begin searching for `.py` files in `functions` directory
* Switch to using `SimpleParser` for functions
Previously we were using `SimpleSQLParser` and we were _only_ parsing
SQL files. However, we're now also parsing python files for functions.
As such it makes sense to switch to the `SimpleParser`. Functionally there
is no change because we re-added the `parse_file` override that `SimpleSQLParser`
had (there was nothing sql specific about it). Hence this is mostly a
symbolic change.
* Add changie doc
* Add test which checks that function nodes can be configured from dbt_project.yml
* Support setting function node configs from dbt_project.yml
* add changie doc
* Fix unit tests to expect `functions` as part of project
* Update function node tests to look for `type` on function config
* Update `function` node to have `type` on config
* Update parsing of `function` nodes to expect `type` on the config
* Add changie doc
* Add test to check that a function's volatility is configurable
* Define the `FunctionVolatility` enum type
* Add `volatility` as a configuration on function nodes
* Add changie doc
* Ensure jsonschema validation tests aren't skipping validation because postgres isn't technically supported
* Blanket accept `functions` as top level yaml key as temp fix
For the moment we can't sync over the full jsonschema from fusion,
so this is a stopgap simply so that we don't raise deprecation
warnings if people start specifying functions.
* Move model column `meta` and `tags` into the column's config in happy path fixture
* Test that functions work properly when unit testing models
* Ensure that functions properly get propagated to the `manifest` and `depends_on` of the `unit_test` node
* Update comment about `RuntimeUnitTestFunctionResolver`
* Add changie doc
* Add test to ensure that using a function with `--empty` works
* Ensure relations for functions are created with a `type` set to `function`
Previously, on creation of function relations we weren't passing a `type`
value. This was problematic because in dbt-adapters we call `is_function`
(which uses the relation `type`) to determine whether a relation can be
filtered when filtering options (like `empty` or `event_time`) are present.
Because `type` wasn't set for function relations, `is_function` would
return `False`, and thus in the presence of a filter we would attempt to
filter the relation. This raised an error because functions can't be filtered.
Setting the type on the relation solves the issue.
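A rough sketch of the guard this enables; names other than the relation `type` check are illustrative (the real `is_function` lives in dbt-adapters):
```python
def is_function(relation) -> bool:
    # With `type` now set at creation time, this is True for function relations.
    return getattr(relation, "type", None) == "function"

def maybe_apply_filter(relation, filter_sql):
    # Functions can't be filtered, so they pass through untouched.
    if filter_sql and not is_function(relation):
        return f"(select * from {relation} where {filter_sql})"
    return relation
```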
* add changie doc
* Add `FunctionType` enum
* Add `type` property to `Function` resource
* Add `type` property to `ParsedFunctionPatch` and `UnparsedFunctionUpdate`
* Begin populating a function's `type` during patch parsing
* Regenerate v12 manifest to include function `type` property
* Add changie doc
* Begin testing that function node `type` property is setable and accessible
* Move comment about triggering the PathEncoder back to its proper place
* Allow for the defining of basic SQL UDFs (#11957)
* Add initial definition of the `Function` resource
* Add FunctionNode definition to graph contracts
* Add test which checks whether basic UDFs can be parsed
This test fails right now, which is intentional. This is test driven
development. Now I do work to make the test pass :)
* Add basic function sql parser for UDFs, and plumb it through parsing code paths
* Begin populating `functions` in the ref lookup
* Begin patching `function` nodes with their yaml definitions
Of note, presently `arguments` and `return_type` aren't populating properly.
It's likely that we'll have to do additional work to the FunctionPatchParser
to get this _fully_ working.
* Increase responsibility of FunctionPatchParser to handle entire `parse_patch` of function nodes
* Fix testing suite to accommodate addition of new `function` node
* Add changie doc for new `function` node type
* Minor refactoring of `NodePatchParser.parse_patch` to reduce code duplication in `FunctionPatchParser`
* Ability to list and select function nodes (#11968)
* Begin listing `function` nodes in `list` command
* Add ability to run `list` specifying the `function` resource type
* Function nodes support selection via: name, file path, and resource type
* Add changie doc
* Core handles lifecycle of function nodes (#12008)
* Add basic test to check that UDFs get created in data warehouse
* Add functions to the runner map of the `build` operation
* Add basic stub of `FunctionRunner` modeled after `SeedRunner`
* Begin using `FunctionRunner` for running `function` nodes
* Add stubbing of things to implement on `FunctionRunner`
* Initial implementation of execution of function nodes
This is largely a copy of the execution of model nodes (in run.py) but
with some abstractions into helper methods to make the body of the
`execute` function easier to follow. Of note, right now this appears to
be getting the incorrect macro from the adapter. This is likely because
for some reason the node's materialization config is being set to `view`
by default.
* Ensure parsed function nodes get the correct materialization type
* Begin generating context for `function` materialization macro
* Stub out adapter response in node result as it was causing some failures
* Correct the adapter response in the run result for functions
* Begin logging `LogFunctionResult` event for completed function nodes
* Add changie doc
* Temp update dev reqs to point at branch of dbt-adapters
* Add test `LogFunctionResult` event to serialization test
* Add `function` nodes to the `WritableManifest`
* Fix tests
* Remove no longer relevant `TODO`s from `function.py`
* Add a new macro `function()` to the jinja context for using functions (#12031)
* Update function tests to look for `functions` under `manifest.functions`
* Begin storing function nodes in `Manifest.functions` instead of `Manifest.nodes`
* Ensure function nodes are still included in nodes to run during `build`
* Add ability to lookup functions on the manifest
* Update patch parsing of function YAML files now that functions live on `Manifest.functions`
* Mark function nodes as no longer refable
* Ensure function nodes are still selectable
* Add `function` macro!
* Ensure functions nodes are correctly linked in the DAG
* Update jinja context tests to expect `function` macro to exist
* Fix unit tests in test suite to expect function nodes
* Add changie doc
* regen v12.json jsonschema
* Fix test `TestVerifyArtifacts::test_run_and_generate`
* Fix test `TestVerifyArtifactsReferences::test_references`
* Fix test `TestVerifyArtifactsVersions::test_versions`
* Regen manifest artifact for `TestPreviousVersionState::test_compare_state_current`
* Update `_iterate_selected_nodes` to support function nodes
* Process node functions to ensure they get added to the `depends_on`
* Take functions into account for state modified
* Regen data for `TestModifiedStateSchemaEvolution::test_modified_state_schema_evolution` test
* Default `functions` property on `WritableManifest` to a dict
I'm not sure if this is actually how we want to do this. However, without
doing this the `WritableManifest` will break on loading of older manifests
that don't have `functions`. The alternative to this would be to bump
the schema version (v12 -> v13) and create an upgrade in `upgrade_manifest.py`.
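A sketch of that backwards-compatible default, assuming dataclass-style serialization (the real `WritableManifest` has many more fields):
```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class WritableManifest:
    nodes: Dict[str, Any] = field(default_factory=dict)
    # Older manifests have no `functions` key; defaulting to an empty dict
    # avoids breaking on load without bumping the schema to v13.
    functions: Dict[str, Any] = field(default_factory=dict)
```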
* Update UDF tests to use a more general purpose function
* Add tests ensuring UDFs can be used in models and `--inline` queries
* Correct `ParseFunctionResolver` so that the name isn't added twice to the function args spec
* Drop `functions` from `Exposure` and `Metric` definitions
* Regen v12 manifest schema
* Remove unnecessary string interpolation
* Point dev reqs back to dbt-adapters@main
* Empty commit
* Increase shared memory size for postgres docker container
I recently started getting errors that look like
```
E dbt_common.exceptions.base.DbtDatabaseError: Database Error
E could not resize shared memory segment "/PostgreSQL.3814850474" to 2097152 bytes: No space left on device
```
At first I thought this was a lack of memory, disk space, or ulimit file descriptors. However,
increasing all of those did not solve the problem. I eventually found, by exec-ing into
the container and running `df -h /dev/shm && ls -lh /dev/shm`, that the container only had 64MB
of shared memory available to it. This change increases the shared memory available to the
container to 1GB, which resolved the issue.
* Use `docker compose` instead of `docker-compose`
The latter was docker compose v1, which no longer works. Use `docker compose` instead.
* Only run homebrew postgres in `setup_db.sh` if `SKIP_HOMEBREW` is not passed
Our github actions use homebrew, but our local dev uses docker. When we
were doing local development and running `make setup-db` suddenly there would
be _two_ postgres instances running. One via homebrew, and another in docker.
This was breaking the setup. Now when running `make setup-db` we skip the
homebrew relevant portions of `setup_db.sh`.
* Set more PG environment variables in `setup_db.sh`
* fix: Properly quote event_time column names in sample mode filters
When using the --sample flag with models that have camel case or
spaced column names as their event_time field, the generated SQL
would fail because column names weren't properly quoted.
This fix introduces a robust quoting system that:
- Checks column-level quote configuration first (highest precedence)
- Falls back to source-level quoting settings
- Uses the existing Column class for proper quote handling
- Centralizes the logic in a dedicated method to eliminate duplication
- Ensures sample mode works with PostgreSQL and other databases that
require quoted identifiers for column names with spaces or special characters
Fixes issue where --sample flag fails with camel case or spaced
event_time column names.
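A simplified sketch of that precedence; the method and attribute names here are illustrative rather than the exact ones in `providers.py`:
```python
def resolve_event_time_field_name(source, adapter) -> str:
    name = source.config.event_time
    column = (source.columns or {}).get(name)
    if column is not None and getattr(column, "quote", None) is not None:
        should_quote = column.quote  # column-level quote config wins
    else:
        # fall back to the source-level quoting setting
        should_quote = bool(getattr(source.quoting, "column", False))
    return adapter.quote(name) if should_quote else name
```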
* returning the same path that was used earlier for the event_time field
* adding changelog
* verify cla agreement
* test: Add comprehensive tests for _resolve_event_time_field_name method
This commit adds extensive test coverage for the _resolve_event_time_field_name
method to address the PR review feedback requesting tests.
Changes:
- Add 28 parametrized test cases covering all quoting scenarios
- Test column-level vs source-level quote precedence
- Test edge cases: missing columns, empty columns dict, no quoting attributes
- Test camel case, snake case, and spaced column names
- Test both quoted and unquoted column name scenarios
- Improve method robustness with better error handling
The tests ensure the method correctly handles:
- Column-level quote settings taking precedence over source-level
- Proper fallback to source-level quoting when column-level is not set
- Edge cases where columns don't exist or have no quoting attributes
- Various column name formats (simple, camelCase, snake_case, spaced)
Fixes: Addresses PR review feedback requesting comprehensive test coverage
* style: Apply code formatting from pre-commit hooks
- Apply black formatting to providers.py and test_providers.py
- Fix trailing whitespace issues
- Add proper type guards for event_time attribute access
- Ensure all tests continue to pass after formatting changes
* Create custom hook for checking for improper imports of artifact resources
* Fix return value of `has_bad_artifact_resource_imports.py::main`
* Regex match versioned resource imports and give import in pre-commit error
* (Tidy First): Fix imports of artifact resources to not import direct versioned resources
* Add changie doc
* feat: support nested key traversal in dbt list output
* Update version for libpq-dev in Dockerfile
The previous version we had for libpq-dev stopped being listed. As such
we need to change to installing a version that is still listed. Hence
we now install version 13.22-0+deb11u1
* Fix `FromAsCasing` warning in Docker file
Our docker file was raising the warning
`FromAsCasing: 'as' and 'FROM' keywords' casing do not match (line 27)`
because we were using `FROM` and `as`, and docker wants those words
to have the same casing. As such, the `as` instances have become `AS`.
* Add changie doc
* Pull in latest jsonschemas, primarily for improved SL definitions
* Improve metric definitions in happy path test fixture to be more expansive
* Add changie doc
* Fix test_list to know about new happy path fixture metrics
* Make `GenericJSONSchemaValidationDeprecation` a "preview" deprecation
Making the deprecation a preview will:
1. Remove it from the summary
2. Emit it as a Note event instead of the actual deprecation event
a. This does mean it'll still be in the logs (but as info level instead of warning)
* Update message of `GenericJSONSchemaValidationDeprecation` to state it's only possibly a deprecation
* Add changie doc
* fix GenericJSONSchemaValidationDeprecation related tests
* Add more details to `GenericJSONSchemaValidationDeprecation` message
* Fix tests related to GenericJSONSchemaValidationDeprecation
* Bump dbt-protos dep min to get new env var namespace deprecation event
* Define new EnvironmentVariableNamespaceDeprecation event in core
* Add new deprecation class for EnvironmentVariableNamespaceDeprecation
* Bump dbt-common dep min to get new env var prefix definition
* Add new `env_vars` module with function for validating dbt engine env vars
* Add changie doc
* Begin keeping a list of env vars associated with cli params
* Begin validating that only allowed engine environment variables exist
* Add some extra engine env vars found throughout the project to the known list
* Begin cross propagating dbt engine env vars with old names
If the old env var name is present, and the new one is not, set the
new one to have the value of the old one. Else, if the new one is set,
set/override old name to have the value of the new one.
There are some drawbacks to this approach. Namely, click only validates
environment variable types for the environment variables it is aware of.
Thus by using the new environment variable naming scheme for existing
environment variables (not newly added ones), we actually lose type guarantees.
This might require a rework.
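A sketch of the cross-propagation rule for one old/new pair; the `DBT_ENGINE_` prefix is an assumption about the new namespace:
```python
import os

def cross_propagate(old_name: str, new_name: str) -> None:
    # If the new name is set it wins and overrides the old one;
    # otherwise, if only the old name is set, populate the new one from it.
    if os.environ.get(new_name) is not None:
        os.environ[old_name] = os.environ[new_name]
    elif os.environ.get(old_name) is not None:
        os.environ[new_name] = os.environ[old_name]

cross_propagate("DBT_PROFILES_DIR", "DBT_ENGINE_PROFILES_DIR")  # hypothetical pair
```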
* Add test for validate_engine_env_vars method
* Add unit test ensuring new engine env vars get added correctly
* Add integration test for environment variable namespace deprecation
* Move logic for propagating engine env vars to pre-flight env var setting
Previously we were attempting to set it on the flags context, but that is
_not_ the environment variable context. Instead what appears to happen is
that the environment variable context is loaded, click takes this into
consideration, and then the flags are set from click's understanding of
passed cli params + env vars.
* Get the env vars from the invocation context in `validate_engine_env_vars`
* Move `_create_engine_env_var` to `__init__` of `EngineEnvVar` data class
* Fix error type in __init__ of EngineEnvVar dataclass
* Correct grammar of EnvironmentVariableNamespaceDeprecation message
* Upgrade to DSI 0.9.0
Note this new version has some breaking changes (changes to class names). This won't impact semantic manifest parsing. The changes in the new version will be used to support order_by and limit on saved queries.
* Changelog
* Update test saved query
* Improve deprecation message for SourceOverrideDeprecation
* Move SourceOverrideDeprecation to jsonschema validation code path
* Update test for checking SourceOverrideDeprecation
* Update dbt_project.yml jsonschema spec to handle nested config defs
Additionally adds some more cloud configs
* Update schema files jsonschema definition to not have `overrides` for sources
Additionally add some cloud definitions
* Add changie doc
* Update happy_path fixture to include nested config specifications in dbt_project.yml
* First draft of SourceOverrideDeprecation warning.
* Refinements and test
* Back out unneeded change
* Fix unit test.
* add changie doc
* Bump minimum dbt-protos to 1.0.335
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Stop dynamically setting ubuntu version for `main.yml` and structured logging actions
These actions are important to run on community PRs. However, these workflows
use `on: pull_request` instead of `on: pull_request_target`. That is intentional,
as `on: pull_request` doesn't give access to variables or secrets, and we need
to keep it that way for security purposes. Yet these actions were trying to access
a variable, which they don't have access to. The variable was a nicety for us, because
sometimes we'd delay moving to github's `ubuntu-latest`. However, the security
concern is more important, and thus we lose the variable for these workflows.
* Change `runs_on` of `artifact-reviews.yml`
* Stop dynamically setting mac and windows versions in main.yml
* Revert "bump dbt-common (#11640)"
This reverts commit c6b7655b65.
* update freshness model config handling
* lower case all columns when processing unit test results
* add changelog
* swap .columns for .column_names
* use rename instead of select api for normalizing agate table column casing
* Add helper to validate model configs via jsonschema
* Store jsonschemas as module vars instead of reloading everytime
Every time we called a jsonschema validation, we were _reloading_
the underlying jsonschema from file. As a one-off this isn't too costly.
However, for large projects it starts to add up. By only loading each
jsonschema once we can save a lot of time. Calling one of the functions which
loads a jsonschema 10,000 times was costing ~3.7215 seconds. By switching
to this module var paradigm we reduced that to ~0.3743 seconds.
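The load-once pattern in miniature; the commit uses module-level variables, but an `lru_cache` sketch has the same effect:
```python
import json
from functools import lru_cache

@lru_cache(maxsize=None)
def load_jsonschema(path: str) -> dict:
    # Parsed once per path for the life of the process,
    # instead of on every validation call.
    with open(path) as f:
        return json.load(f)
```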
* Begin validating configs from model `.sql` files
It was a bit of a hunt to figure out where to do this. We couldn't do
the validation in `calculate_node_config` because that function is called
4 times per node (which is an issue in itself, but out of scope for this
work). We also couldn't do the validation where `_config_call_dict` is set,
because it turns out there are multiple avenues for setting
`_config_call_dict`, which is a fun rabbit hole.
* Ensure .sql configs are validated only once
It turns out that `update_parsed_node_config` can potentially be
called twice per model. It'll be called from either `ModelParser.render_update`
or `ModelParser.populate`, and it can additionally be called from
`PatchParser.patch_node_config` if there is a .yml definition for the
model. We only want to validate the config once, and we aren't guaranteed
to have a `PatchParser` if there is no patch for the model. Thus, we've
updated `ModelParser.populate` and `ModelParser.render_update` to
request the config validation (which by default doesn't run unless requested).
* Split out the model config specific validation from general jsonschema validation
We're validating model configs from sql files via a subschema of the main
resources jsonschema, which requires different case logic for detecting the
different types of deprecation warnings present. Thus `validate_model_config`
cannot call `jsonschema_validate`. We could have had both logic paths exist in
`jsonschema_validate`, but it would have added another layer of if/elses
and bloated the function substantially.
* Handle additional properties of sub config objects
* Give better key path information for .sql config jsonschema issues
* Add tests for validate_model_config
* Add changie doc
* Fix jsonschemas unittests to avoid catching irrelevant issues
* Revert "bump dbt-common (#11640)"
This reverts commit c6b7655b65.
* update freshness model config handling
* lower case all columns when processing unit test results
* add changelog
* swap .columns for .column_names
* Loosen pydantic maximum to <3 (allowing for pydantic 2)
* Add an internal pydantic shim for getting pydantic BaseSettings regardless of pydantic v1 vs v2
* Add changie doc
In 1.10.0 we began utilizing `jsonschema._keywords`. However, the submodule
`_keywords` wasn't added until jsonschema `4.19.1`, which came out September
20th, 2023. Our jsonschema requirement was being set transitively via
dbt-common as `>=4.0,<5`. This meant people doing a _non_ fresh install of
dbt-core `1.10.0` could end up with a broken system if their existing
jsonschema dependency was anywhere in the range `>=4.0,<4.19.1`. By bumping the
minimum jsonschema version, we make it such that anyone installing dbt-core 1.10.1 will
automatically get their jsonschema updated (assuming they don't have an exclusionary
pin).
* Begin testing that model freshness can't be set as a top level model property
* Remove ability to specify freshness as top level property of models
* Add some comments to calculate_node_config for better readability
* Drop `freshness` as a top level property of models, and let `patch_node_config` handle merging config freshness
Model freshness hasn't been released in a minor release yet, nor been documented. Thus
it is safe to remove the top-level property of freshness on models. Freshness will instead
be set, and gotten, from the model config. Additionally, our way of calculating the
config model freshness only got the top-level `+freshness` from dbt_project.yml (ignoring
any path-specific definitions). By instead using the built-in `calculate_node_config` (which
is eventually called by `patch_node_config`), we get all path-specific freshness config handling,
and it also handles the precedence of `dbt_project.yml` specification, schema file specification,
and sql file specification.
* add changie doc
* Ensure source node `.freshness` is equal to node's `.config.freshness`
* Default source config freshness to empty spec if no freshness spec is given
* Update contract tests for source nodes
* Ensure `build_after` is present in model freshness in parsing, otherwise skip freshness definition
* add freshness model config test
* add changelog
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* Handle explicit setting of null for source freshness config
* Abstract out the creation of the target config
This is useful because it makes that portion of code more re-usable/portable
and makes the work we are about to do easier.
* Fix bug in `merge_source_freshness` where empty freshness was preferred over `None`
The issue was that during merging of freshnesses, an "empty freshness", one
where all values are `None`, was being preferred over `None`. This was
problematic because an "empty freshness" indicates that a freshness was not
specified at that level, while `None` means that the freshness was _explicitly_
set to `None`. As such, we should prefer the thing that was specifically set.
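A sketch of the corrected preference, with an illustrative `is_empty` helper (field names are assumptions):
```python
def is_empty(freshness) -> bool:
    # An "empty freshness" has no values set at all.
    return freshness.warn_after is None and freshness.error_after is None

def merge_source_freshness(base, update):
    # `update` explicitly set to None disables freshness and must win;
    # an empty update means "unspecified here", so the base is kept.
    if update is None:
        return None
    if is_empty(update):
        return base
    return update
```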
* Properly get dbt_project defined freshness and don't merge with schema defined freshness
Previously we were only getting the "top level" freshness from the
dbt_project.yml. This ignored freshness settings for the direct,
source, and table levels set in the dbt_project.yml. Additionally, we were
merging the dbt_project.yml freshness into the schema freshness. Long
term this merging would be desirable; however, before we do that we need
to ensure freshness at different levels within the dbt_project.yml gets
properly merged (currently the different levels clobber each other). Fixing
that is a larger issue though. So for the time being, the schema definition
of freshness will clobber any dbt_project.yml definition of freshness.
* Add changie doc
* Fix whitespace to make code quality happy
* Set the parsed source freshness to an empty FreshnessThreshold if None
This maintains backwards compatibility
* Revert "bump dbt-common (#11640)"
This reverts commit c6b7655b65.
* add file_format as a top level config in CatalogWriteIntegrationConfig
* add changelog
* Clean up changelog on main
* Bumping version to 1.11.0a1
* Code quality cleanup
* add old changelogs
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Add a robust schema definition of singular test to happy path fixture
* Add generic tests to happy path fixture
* Add unit tests to happy path fixture
* Fix data test + unit test happy path fixtures so they're valid
* Fix test_list.py for data test + unit test happy path fixture
* Fixup issues due to imperfect merge
* Drop generic data test definition style that we don't want to support from happy path fixture
* Add data test attributes to a pre-existing data test type
* Fix test_list.py again
* Don't forget to normalize in test_list.py
* Include event name in msg of deprecation warning for all recently added deprecations
* Add behavior flag for gating inclusion of event name in older deprecation messages
* Conditionally append event name to older deprecation events depending on behavior flag
* Add changie doc
* Migrate to `WarnErrorOptionsV2` and begin using `error` and `warn` as primary config keys
* Update tests using `WarnErrorOptions` to use `error` and `warn` terminology
* Begin emitting deprecation warning when include/exclude terminology is used with WarnErrorOptions
* bump minimum of dbt-protos
* Add test for new WarnErrorOptions deprecation
* add changie doc
* Fix test_warn_error_options.py tests
* Fix test_singular_tests.py tests
* Add WOEIncludeExcludeDeprecation to test_events.py serialization test
* Begin testing that `happy_path_project` and `project` fixtures have no deprecations
* Add model specific configs to model yml description in happy path test
* Add all possible model config property keys to happy path fixture
* Add more model properties to happy path fixture
* Move configs for happy path testing onto new happy path model fixture
* Fix deprecation tests names
* Add newly generated jsonschema for schema files
* Skip happy path deprecation test for now
* Fix 'database' value of model for happy path fixture
* Fix happy path fixture model grants to a role that exists
* Fix test_list.py
* Fix detection of additional config property deprecation
Previously we were taking the first `key` on the `instance` property
of the jsonschema ValidationError. However, this validation error
is raised as an "anyOf" violation, which then has sub-errors in its
`context` property. To identify the key in violation, we have to
find the `additionalProperties` validation in the sub-errors. The key
at issue can then be parsed from that sub-error.
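A sketch of that detection; the helper name is illustrative, and the message parsing follows jsonschema's standard error text:
```python
import re
from jsonschema.exceptions import ValidationError

def additional_property_keys(error: ValidationError) -> list:
    # An "anyOf" violation carries its real causes as sub-errors in
    # `error.context`. Messages of `additionalProperties` sub-errors look
    # like: "Additional properties are not allowed ('foo', 'bar' were unexpected)"
    keys = []
    for sub_error in error.context or []:
        if sub_error.validator == "additionalProperties":
            keys.extend(re.findall(r"'([^']+)'", sub_error.message))
    return keys
```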
* Refactor key parsing from jsonschema ValidationError messages to single definition
* Update handling `additionalProperties` violations to handle multiple keys in violation
* Add changelog
* Remove guard logic in jsonschemas validation rule that is no longer needed
* fix Dockerfile.test
* add change
* Ensure that all instances where `pre-commit` is called are prefixed with `$(DOCKER_CMD)`
* Changelog entry
---------
Co-authored-by: Taichi Kato <taichi-8128@outlook.jp>
In a lot of our functional deprecation warning tests we check for a
matching string within an event message. Some of these matches check
for a file path. The problem with this was that Windows formats
file paths differently. This was causing the functional tests to
_fail_ when run in a Windows environment. To fix this we've removed
the file path part of the string from the test assertions.
* Begin basic jsonschema validations of dbt_project.yml (#11505)
* Add jsonschema for validation project file
* Add utility for helping to load jsonschema resources
Currently things are a bit hard coded. We should probably alter this
to be a bit more flexible.
* Begin validating `dbt_project.yml` via jsonschema
* Begin emitting deprecation warnings for generic jsonschema violations in dbt_project.yml
* Move from `DbtInternalError` to `DbtRuntimeError` to avoid circular imports
* Add tests for basic jsonschema validation of `dbt_project.yml`
* Add changie doc
* Add serialization test for new deprecation events
* Alter the project jsonschema to not require things that are optional
* Add datafiles to package egg
* Update inclusion of project jsonschema in setup.py to get files correctly
Using the glob spec returns a list of found files. Our previous spec was
raising the error
`error: can't copy 'dbt/resources/input_schemas/project/*.json': doesn't exist or not a regular file`
* Try another approach of adding jsonschema to egg
* Add input_schemas dir to MANIFEST.in spec
* Drop jsonschema inclusion spec from setup.py
* Begin using importlib.resources.files for loading project jsonschema
This doesn't currently work with editable installs :'(
* Use relative paths for loading jsonschemas instead of importlib
Using "importlib" is the blessed way to do this sort of thing. However,
that is failing for us on editable installs. This commit switches us
to using relative paths. Technically doing this has edge cases, however
this is also what we do for the `start_project` used in `dbt init`. So
we're going to do the same, for now. We should revisit this soon.
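A sketch of the relative-path fallback; the exact path segments are illustrative:
```python
import json
from pathlib import Path

def load_project_jsonschema() -> dict:
    # Resolve the schema relative to this module's own file, which also
    # works for editable installs where importlib.resources was failing.
    schema_path = Path(__file__).parent / "resources" / "input_schemas" / "project" / "project.json"
    return json.loads(schema_path.read_text())
```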
* Drop requirement of `__additional_properties__` specified by project jsonschema
* Drop requirement for `pre-hook` and `post-hook` specified by project jsonschema
* Reset `active_deprecations` global at the end of tests using `project` fixture
* Begin validating the jsonschema of YAML resource files (#11516)
* Add jsonschema for resources
* Begin jsonschema validating YAML resource files in dbt projects
* Drop `tests` and `data_tests` as required properties of `Columns` and `Models` for resources jsonschema
* Drop `__additional_properties__` as required for `_Metrics` in resources jsonschema
* Drop `post_hook` and `pre_hook` requirement for `__SnapshotsConfig` in resources jsonschema
* Update `_error_path_to_string` to handle empty paths
* Create + use custom Draft7Validator to ignore datetime and date classes
* Break `TestRetry` functional test class into multiple test classes
There was some global state leaking from one test to another, which was
causing some of the tests to break.
* Refactor duplicate instances of `jsonschema_validate` to single definition
* Begin testing jsonschema validation of resource YAMLs
* Add changie doc
* Add Deprecation Warnings for Unexpected Jinja Blocks (#11514)
* Add deprecation warnings on unexpected jinja blocks.
* Add changelog entry.
* Add test event.
* Regen proto types.
* Fix event test.
* Add `UnexpectedJinjaBlockDeprecationSummary` and add file context to `UnexpectedJinjaBlockDeprecation` (#11517)
* Add summary event for UnexpectedJinjaBlockDeprecation
* Begin including file information in UnexpectedJinjaBlockDeprecation event
* Add UnexpectedJinjaBlockDeprecationSummary to test_events.py
* Deprecate Custom Top-Level Keys (#11518)
* Add specific deprecation for custom top level keys.
* Add changelog entry
* Add test events
* Add Check for Duplicate YAML Keys (#11510)
* Add functionality to check for duplicate yaml keys, working around a PyYAML limitation (see the sketch after this list).
* Fix up some ancient typing issues.
* Ignore typing issue, for now.
* Correct unit tests of `checked_load`
* Add event and deprecation types for duplicate yaml keys
* Begin validating `dbt_project.yml` for duplicate key violations
* Begin checking for duplicate key violations in schema files
* Add test to check duplicate keys are checked in schema files
* Refactor checked_yaml failure handling to reduce duplicate code
* Move `checked_load` utilities to separate file to avoid circular imports
* Handle yaml `start_mark` correctly for top level key errors
* Update changelog
* Fix test.
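A minimal sketch of the PyYAML workaround named above: hook `construct_mapping` to spot duplicate keys, which plain `yaml.safe_load` silently collapses (last key wins). The real `CheckedLoader` records violations as deprecations (and handles `start_mark` and anchors) rather than simply raising:
```python
import yaml

class CheckedLoader(yaml.SafeLoader):
    def construct_mapping(self, node, deep=False):
        seen = set()
        for key_node, _value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise yaml.constructor.ConstructorError(
                    None, None, f"duplicate key {key!r} found", key_node.start_mark
                )
            seen.add(key)
        return super().construct_mapping(node, deep=deep)

def checked_load(contents: str):
    return yaml.load(contents, Loader=CheckedLoader)
```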
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Fix issue with YAML anchors in new CheckedLoader class.
* Deprecate having custom keys in config blocks (#11522)
* Add deprecation event for custom keys found in configs
* Begin checking schema files for custom keys found in configs
* Test new CustomConfigInConfigDeprecation event
* Add changie doc
* Add custom config key deprecation events to event serialization test
* Provide message to ValidationError in `SelectorConfig.from_path`
This typing error is unrelated to the changes in this PR. However,
it was failing CI, so I figured it'd be simple to just fix it.
* Add some extra guards around the ValidationFailure `path` and `instance`
* [TIDY-FIRST] Use new `deprecation_tag` (#11524)
* Tidy First: Update deprecation events to use the new `deprecation_tag`
Note: we did this for a majority of deprecations, but not _all_ of them.
That is because not all deprecations were following the pattern. As some
people do string parsing of our logs with regex, altering the deprecations
that weren't already doing what `deprecation_tag` does would be a
_breaking change_ for those events, thus we did not alter those events.
* Bump minimum dbt-common to `1.22.0`
* Fix tests
* Begin emitting deprecation events for custom properties found in objects (#11526)
* Fix CustomKeyInConfigDeprecationSummary
* Add deprecation type for custom properties in YAML objects
* Begin emitting deprecation events for custom properties found in objects
* Add changie doc
* Add `loaded_at_query` property to `_Sources` definition in jsonschema
This was breaking the test tests/unit/parser/test_parser.py::SchemaParserSourceTest::test_parse_source_custom_freshness_at_source
* Move validating jsonschema of schema files earlier in the process
Previously we were validating the jsonschema of schema files in
`SchemaParser.parse_file`. However, the file is originally loaded in
`yaml_from_file` (which happens before `SchemaParser.parse_file`), and
`yaml_from_file` _modifies_ the loaded dictionary to add some additional
properties. These additional properties violate the jsonschema unfortunately,
and thus we needed to start validating the schema against the jsonschema
before any such modifications.
* Skip parser tests for `model.freshness`
Model freshness never got fully implemented, and won't be implemented nor
documented for 1.10. As such we're going to consider the `model.freshness`
property an "unknown additional property". This is actually good, as some
people have "accidentally" defined "freshness" on their models (likely due
to copy/paste of a source), and that property isn't doing anything.
* One single DeprecationsSummary event to rule them all (#11540)
* Begin emitting singular deprecations summary, instead of summary per deprecation type
* Remove concept of deprecation specific summary events in deprecations module
* Drop deprecation summary events that have been added to `feature-branch--11335-deprecations` but not `main`
These are safe to drop with no notice because they only ever existed
on a feature branch, never main.
* Correct code numbers for new events on feature-branch that haven't made it to main yet
* Kill `PackageRedirectDeprecationSummary` event, and retire its event code
* add changie doc
* Update jsonschemas to versions 0.0.110 (#11541)
* Update jsonschemas to 0.0.110
* Don't allow additional properties in configs
* Don't allow additional top level properties on objects
* Allow for 'loaded_at_query' on Sources and Tables
* Don't allow additional top level properties in schema files
---------
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
* [#9791] Fix datetime.datetime.utcnow() is deprecated as of Python 3.12
* Explicit UTC timezone declaration for instances of datetime.now()
* Keep utcnow() in functional test case to avoid setup errors
* Utilize the more specific datetime class import for microbatch config
* Replace utcnow calls in contracts and artifacts
* Replace utcnow calls in functional and unit test cases
* Test deserialization of compiled run execution results
* Test deserialization of instantiated run execution result
* Code style improvements
* rough in catalog contracts + requires.catalog
* set catalog integration
* add initial functional test for catalog parsing
* use dbt-adapters.git@feature/externalCatalogConfig
* add concrete catalog integration config
* add requires.catalog to build + reorder requires
* separate data objects from loaders
* improve functional test and fix import
* Discard changes to tests/functional/adapter/simple_seed/test_seed_type_override.py
* Change branch name for dot-adapters
* make table_format and catalog_type strings for now
* remove uv from makefile
* Discard changes to dev-requirements.txt
* Overhaul parsing catalogs.yml
* Use [] instead of None
* update postgres macos action
* Add more tests
* Add changie
* Second round of refactoring
* Address PR comments
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
* Functional test for hourly microbatch model
* Use today's date for functional test for hourly microbatch model
* Use today's date for functional test for hourly microbatch model
* Restore to original
* Only use alphanumeric characters within batch ids
* Add tests for batch_id and change expected output for format_batch_start
* Handle missing batch_start
* Revert "Handle missing batch_start"
This reverts commit 65a1db0048. Reverting this because
`batch_start` for `format_batch_start` cannot be `None` and `start_time` for `batch_id`
cannot be `None`.
* Improve BatchSize specific values for `format_batch_start` and `batch_id` methods
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Update to latest ddtrace within minor version range.
* Add test coverage for Python 3.13
* Update setup.py to indicate Python 3.13 support.
* Update freezegun version to support Python 3.13
* Add changelog entry.
* Default macro argument information from original definitions.
* Add argument type and count warnings behind behavior flag.
* Add changelog entry.
* Make flag test more robust.
* Use a unique event for macro annotation warnings, per review.
* Add event to test list.
* Regenerate core_types_pb2 using protoc 5.28.3
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* update ubuntu 20.04 to 24.04
* updates to ubuntu-latest instead
* try postgres update
* Change owner of db creation script so postgres can run it.
* Add sudos.
* Add debug logging.
* Set execute bit on scripts.
* More debug logging.
* try a service
* clean up and split the integrations tests by os
---------
Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
* Push orchestration of batches previously in the `RunTask` into `MicrobatchModelRunner`
* Split `MicrobatchModelRunner` into two separate runners
`MicrobatchModelRunner` is now an orchestrator of `MicrobatchBatchRunner`s, the latter being what handle actual batch execution
* Introduce new `DbtThreadPool` that knows if it's been closed
* Enable `MicrobatchModelRunner` to shutdown gracefully when it detects the thread pool has been closed
* Add secondary_profiles to profile.py
* Add more tests for edge cases
* Add changie
* Allow inferring target name and add tests for the same
* Incorporate review feedback
* remove unnecessary nesting
* Use typing_extensions.Self
* use quoted type again
* address pr comments round 2
* Allow for rendering of refs/sources in snapshots to be sampled
Of note, the parameterization of `test_resolve_event_time_filter` in
tests/unit/context/test_providers.py is getting large and cumbersome.
It may be time soon to split it into a few distinct tests so that each
test needs fewer parametrized arguments.
* Simplify `isinstance` checks when resolving event time filter
Previously we were doing `isinstance(a, class1) or isinstance(a, class2)`,
but this can be simplified to `isinstance(a, (class1, class2))`. Whoops.
* Ensure sampling of refs of snapshots is possible
Notably we didn't have to add `isinstance(self.target, SnapshotConfig)` to the
checks in `resolve_event_time_filter` because `SnapshotConfig` is a subclass
of `NodeConfig`.
* Add changie doc
* Reapply "Add `doc_blocks` to manifest for nodes and columns (#11224)" (#11283)
This reverts commit 55e0df181f.
* Expand doc_blocks backcompat test
* Refactor to method, add docstring
* Add `--sample` flag to `run` command
* Remove no longer needed `if` statement around EventTimeFilter creation for microbatch models
Upon the initial implementation of microbatch models, the `start` for a batch was _optional_.
However, in c3d87b89fb it became guaranteed. Thus the if statement
guarding against `start/end` not being present for microbatch models was no longer actually doing anything.
Hence, the if statement was safe to remove.
* Get sample mode working with `--event-time-start/end`
This is temporary as a POC. In the end, sample mode can't depend on the arguments
`--event-time-start/end` and will need to be split into their own CLI args / project
config, something like `--sample-window`. The issue with using `--event-time-start/end`
is that if people set those in the project configs, then their microbatch models would
_always_ run with those values even outside of sample mode. Despite that, this is a
useful checkpoint even though it will go away.
* Begin using `--sample-window` for sample mode instead of `--event-time-start/end`
Using `--event-time-start/end` for sample mode was conflicting with microbatch models
when _not_ running in sample mode. We will have to do _slightly_ more work to plumb
this new way of specifying sample time to microbatch models.
* Move `SampleWindow` class to `sample_window.py` in `event_time` submodule
This is mostly symbolic. We are going to be adding some utilities for "event_time"
type things, which will all live in the `event_time` submodule. Additionally we plan
to refactor `/incremental/materializations/microbatch.py` into the sub module as well.
* Create an `offset_timestamp` separate from MicrobatchBuilder
The `MicrobatchBuilder.offset_timestamp` _truncates_ the timestamp before
offsetting it. We don't want to do that; we want to offset the "raw" timestamp.
We could have renamed the microbatch builder function to
`truncate_and_offset_timestamp` and separated the offset logic into a
separate abstract function. However, the offset logic in the MicrobatchBuilder
context depends on the truncation. We might later be able to refactor the
Microbatch-provided function by instead truncating _after_ offsetting instead
of before. But that is out of scope for this initial work, and we should
revisit it later.
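A sketch of the separate offset helper, assuming dateutil's relativedelta and the usual microbatch grains:
```python
from datetime import datetime
from dateutil.relativedelta import relativedelta

def offset_timestamp(timestamp: datetime, batch_size: str, offset: int) -> datetime:
    # Offset the "raw" timestamp without the truncation that
    # MicrobatchBuilder.offset_timestamp performs first.
    deltas = {
        "hour": relativedelta(hours=offset),
        "day": relativedelta(days=offset),
        "month": relativedelta(months=offset),
        "year": relativedelta(years=offset),
    }
    return timestamp + deltas[batch_size]
```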
* Add `types-python-dateutil` to dev requirements
The previous commit began using a submodule of the third-party dateutil
python library. We weren't previously using this library, and thus didn't
need the type stubs for it. But now that we do use it, we need to have
the type stubs during development.
* Begin supporting microbatch models in sample mode
* Move parsing logic of `SampleWindowType` to `SampleWindow`
* Allow for specification of "specific" sample windows
In most cases people will want to set "relative" sample windows, i.e.
"3 days" to sample the last three days. However, there are some cases
where people will want to "specific" sample windows for some chunk of
historic time, i.e. `{'start': '2024-01-01', 'end': '2024-01-31'}`.
* Fix tests of `BaseResolver.resolve_event_time_filter` for sample mode changes
* Add `--no-sample` as it's necessary for retry
* Add guards to accessing of `sample` and `sample_window`
This was necessary because these aren't _always_ available. I had expected
to need to do this after putting the `sample` flag behind an environment
variable (which I haven't done yet). However, we needed to add the guards
sooner because the `render` logic is called multiple times throughout the
dbt process, and earlier on the flags aren't available.
* Gate sample mode functionality via env var `DBT_EXPERIMENTAL_SAMPLE_MODE`
At this point sample mode is _alpha_ and should not be depended upon. To make
this crystal clear we've gated the functionality behind an environment variable.
We'll likely remove this gate in the coming month.
* Add sample mode tests for incremental models
* Add changie doc for sample mode initial implementation
* Fixup sample mode functional tests
I had updated the `later_input_model.sql` to be easier to test with. However,
I didn't correspondingly update the initial `input_model.sql` to match.
* Ensure microbatch creates correct number of batches when sample mode env var isn't present
Previously microbatch was creating the _right_ number of batches when:
1. sample mode _wasn't_ being used
2. sample mode _was_ being used AND the env var was present
Unfortunately sample mode _wasn't_ creating the right number of batches when:
3. sample mode _was_ being used AND the env var _wasn't_ present.
In case (3) sample mode shouldn't be run. Unfortunately we weren't gating sample
mode by the environment variable during batch creation. This led to a situation
where batch creation used sample mode but the rendering of refs _didn't_,
putting it in an in-between state... This commit fixes that issue.
Additionally of note, we currently have duplicate sample mode gating logic in the
batch creation as well as in the rendering of refs. We should probably consolidate
this logic into a single importable function, so that any future change to how
sample mode is gated is easier to implement.
* Correct comment in SampleWindow post serialization method
* Hide CLI sample mode options
We are doing this _temporarily_ while sample mode as a feature is in
alpha/beta and locked behind an environment variable. When we remove the
environment variable we should also unhide these.
Currently, running this command on a project containing a microbatch
model results in an error, as microbatch models require a datetime
value in their config which cannot be serialized by the default JSON
serializer.
There already exists a custom JSON serializer within the dbt-core
project that converts datetime to ISO string format. This change uses
the above serializer to resolve the error.
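The gist of the fix, as a generic sketch of a datetime-aware `default` hook (the project's actual serializer lives elsewhere in dbt-core):
```python
import json
from datetime import date, datetime

def json_default(obj):
    # Convert datetimes to ISO strings so microbatch configs can be dumped.
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

print(json.dumps({"begin": datetime(2024, 1, 1)}, default=json_default))
```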
* Update `TestMicrobatchWithInputWithoutEventTime` to check running again raises warning
The first time the project is run, the appropriate warning about inputs is raised. However,
the warning is only being raised when a full parse happens. When partial parsing happens
the warning isn't getting raised. In the next commit we'll fix this issue. This commit updates
the test to show that the second run (with partial parsing) doesn't raise the warning, and thus
the test fails.
* Update manifest loading to _always_ check microbatch model inputs
Of note we are at the point where multiple validations are iterating
all of the nodes in a manifest. We should refactor these _soon_ such that
we are not iterating over the same list multiple times.
* Add changie doc
* Begin producing warning when attempting to force concurrent batches without adapter support
Batches of microbatch models can be executed sequentially or concurrently. We try to figure out which to do intelligently. As part of that, we implemented an override, the model config `concurrent_batches`, to allow the user to bypass _some_ of our automatic detection. However, a user _cannot_ force batches to run concurrently if the adapter doesn't support concurrent batches (declaring support is opt-in). Thus, if an adapter _doesn't_ support running batches concurrently, and a user tries to force concurrent execution via `concurrent_batches`, then we need to warn the user that that isn't happening.
* Add custom event type for warning about invalid `concurrent_batches` config
* Fire `InvalidConcurrentBatchesConfig` warning via `warn_or_error` so it can be silenced
* Update partial success test to assert partial successes mean that the run failed
* Update results interpretation to include `PartialSuccess` as failure status
* Update single batch test case to check for generic exceptions
* Explicitly skip last final batch execution when there is only one batch
Previously if there was only one batch, we would try to execute _two_
batches: the first batch, and a non-existent "last" batch. This would
result in an unhandled exception.
* Changie doc
* microbatch: split out first and last batch to run in serial
* only run pre_hook on first batch, post_hook on last batch
* refactor: internalize parallel to RunTask._submit_batch
* Add optional `force_sequential` to `_submit_batch` to allow for skipping parallelism check
* Force last batch to run sequentially
* Force first batch to run sequentially
* Remove batch_idx check in `should_run_in_parallel`
`should_run_in_parallel` shouldn't, and no longer needs to, take into
consideration where a batch exists within the larger run. The first and
last batch for a microbatch model are now forced to run sequentially
by `handle_microbatch_model`.
* Begin skipping batches if first batch fails
* Write custom `on_skip` for `MicrobatchModelRunner` to better handle when batches are skipped
This was necessary specifically because the default `on_skip` set the `X of Y` part
of the skipped log using the `node_index` and the `num_nodes`. If there were 2
nodes and we were on the 4th batch of the second node, we'd get a message like
`SKIPPED 4 of 2...`, which didn't make much sense. We're likely in a future commit
going to add a custom event for logging the start, result, and skipping of batches
for better readability of the logs.
* Add microbatch pre-hook, post-hook, and sequential first/last batch tests
* Fix/Add tests around first batch failure vs latter batch failure
* Correct MicrobatchModelRunner.on_skip to handle skipping the entire node
Previously `MicrobatchModelRunner.on_skip` only handled when a _batch_ of
the model was being skipped. However, that method is also used when the
entire microbatch model is being skipped due to an upstream node error. Because
we previously _weren't_ handling this second case, it'd cause an unhandled
runtime exception. Thus, we now need to check whether we're running a batch or
not, and if there is no batch, use the super's `on_skip` method.
* Correct conditional logic for setting pre- and post-hooks for batches
Previously we were doing an if+elif for setting pre- and post-hooks
for batches, wherein the `if` matched if the batch wasn't the first
batch, and the `elif` matched if the batch wasn't the last batch. The
issue with this is that if the `if` was hit, the `elif` _wouldn't_ be hit.
This caused the first batch to appropriately not run the `post-hook`, but
then every batch after would run the `post-hook`.
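A sketch of the corrected gating, with illustrative names; the point is two independent checks rather than an if/elif chain:
```python
def adjust_hooks_for_batch(node, is_first_batch: bool, is_last_batch: bool) -> None:
    if not is_first_batch:
        node.config.pre_hook = []   # only the first batch runs pre-hooks
    if not is_last_batch:
        node.config.post_hook = []  # only the last batch runs post-hooks
```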
* Add two new event types `LogStartBatch` and `LogBatchResult`
* Update MicrobatchModelRunner to use new batch specific log events
* Fix event testing
* Update microbatch integration tests to catch batch specific event types
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* New function to add graph edges.
* Clean up, leave out flag temporarily for testing.
* Put new test edge behavior behind flag.
* Final draft of documentation.
* Add `batch_id` to jinja context of microbatch batches
* Add changie doc
* Update `format_batch_start` to assume `batch_start` is always provided
* Add "runtime only" property `batch_context` to `ModelNode`
By it being "runtime only" we mean that it doesn't exist on the artifact
and thus won't be written out to the manifest artifact.
* Begin populating `batch_context` during materialization execution for microbatch batches
* Fix circular import
* Fixup MicrobatchBuilder.batch_id property method
* Ensure MicrobatchModelRunner doesn't double compile batches
We were compiling the node for each batch _twice_. Besides making microbatch
models more expensive than they needed to be, double compiling wasn't
causing any issue. However the first compilation was happening _before_ we
had added the batch context information to the model node for the batch. This
was leading to models which try to access the `batch_context` information on the
model to blow up, which was undesirable. As such, we've now gone and skipped
the first compilation. We've done this similar to how SavedQuery nodes skip
compilation.
* Add `__post_serialize__` method to `BatchContext` to ensure correct dict shape
This is weird, but necessary, I apologize. Mashumaro handles the
dictification of this class via a compile-time generated `to_dict`
method based off of the _typing_ of the class. By default `datetime`
types are converted to strings. We don't want that; we want them to
stay datetimes.
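A sketch of that hook: mashumaro's generated `to_dict` stringifies the
datetimes, and `__post_serialize__` puts the original objects back (the field
names are my guess at `BatchContext`'s shape):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict
from mashumaro import DataClassDictMixin

@dataclass
class BatchContext(DataClassDictMixin):
    id: str
    event_time_start: datetime
    event_time_end: datetime

    def __post_serialize__(self, d: Dict[str, Any]) -> Dict[str, Any]:
        # Undo the default datetime -> str conversion so the dict
        # keeps real datetime objects.
        d["event_time_start"] = self.event_time_start
        d["event_time_end"] = self.event_time_end
        return d
```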
* Update tests to check for `batch_context`
* Update `resolve_event_time_filter` to use new `batch_context`
* Stop testing for batchless compiled code for microbatch models
In 45daec72f4 we stopped an extra compilation
that was happening per batch prior to the batch_context being loaded. Stopping
this extra compilation means that compiled sql for the microbatch model without
the event time filter / batch context is no longer produced. We have discussed
this and _believe_ it is okay given that this is a new node type that has not
hit GA yet.
* Rename `ModelNode.batch_context` to `ModelNode.batch`
* Rename `build_batch_context` to `build_jinja_context_for_batch`
The name `build_batch_context` was confusing as
1) We have a `BatchContext` object, which the method was not building
2) The method builds the jinja context for the batch
As such it felt appropriate to rename the method to more accurately
communicate what it does.
* Rename test macro `invalid_batch_context_macro_sql` to `invalid_batch_jinja_context_macro_sql`
This rename was to make it more clear that the jinja context for a
batch was being checked, as a batch_context has a slightly different
connotation.
* Update changie doc
* Rename `batch_info` to `previous_batch_results`
* Exclude `previous_batch_results` from serialization of model node to avoid jinja context bloat
* Drop `previous_batch_results` key from `test_manifest.py` unit tests
In 4050e377ec we began excluding
`previous_batch_results` from the serialized representation of the
ModelNode. As such, we no longer need to check for it in `test_manifest.py`.
* Clean up changelog on main
* Bumping version to 1.10.0a1
* Code quality cleanup
* add 1.8,1.9 link
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
- Maximally parallelize dbt clone in clone command ([#7914](https://github.com/dbt-labs/dbt-core/issues/7914))
- Add --host flag to dbt docs serve, defaulting to '127.0.0.1' ([#10229](https://github.com/dbt-labs/dbt-core/issues/10229))
- Update data_test to accept arbitrary config options ([#10197](https://github.com/dbt-labs/dbt-core/issues/10197))
- add pre_model and post_model hook calls to data and unit tests to be able to provide extra config options ([#10198](https://github.com/dbt-labs/dbt-core/issues/10198))
- add --empty value to jinja context as flags.EMPTY ([#10317](https://github.com/dbt-labs/dbt-core/issues/10317))
- Warning message for snapshot timestamp data types ([#10234](https://github.com/dbt-labs/dbt-core/issues/10234))
- Support cumulative_type_params & sub-daily granularities in semantic manifest. ([#10360](https://github.com/dbt-labs/dbt-core/issues/10360))
- Add time_granularity to metric spec. ([#10376](https://github.com/dbt-labs/dbt-core/issues/10376))
- Support standard schema/database fields for snapshots ([#10301](https://github.com/dbt-labs/dbt-core/issues/10301))
- Support ref and source in foreign key constraint expressions, bump dbt-common minimum to 1.6 ([#8062](https://github.com/dbt-labs/dbt-core/issues/8062))
- Support new semantic layer time spine configs to enable sub-daily granularity. ([#10475](https://github.com/dbt-labs/dbt-core/issues/10475))
- Include models that depend on changed vars in state:modified, add state:modified.vars selection method ([#4304](https://github.com/dbt-labs/dbt-core/issues/4304))
- Add support for behavior flags ([#10618](https://github.com/dbt-labs/dbt-core/issues/10618))
- Enable `--resource-type` and `--exclude-resource-type` CLI flags and environment variables for `dbt test` ([#10656](https://github.com/dbt-labs/dbt-core/issues/10656))
- Execute microbatch models in batches ([#10700](https://github.com/dbt-labs/dbt-core/issues/10700))
- Create 'skip_nodes_if_on_run_start_fails' behavior change flag ([#7387](https://github.com/dbt-labs/dbt-core/issues/7387))
- Allow snapshots to be defined in YAML. ([#10246](https://github.com/dbt-labs/dbt-core/issues/10246))
- Write microbatch compiled/run targets to separate files, one per batch ([#10714](https://github.com/dbt-labs/dbt-core/issues/10714))
- Track incremental_strategy as part of model_run tracking event ([#10761](https://github.com/dbt-labs/dbt-core/issues/10761))
- Support required 'begin' config for microbatch models ([#10701](https://github.com/dbt-labs/dbt-core/issues/10701))
- Parse-time validation of microbatch configs: require event_time, batch_size, lookback and validate input event_time ([#10709](https://github.com/dbt-labs/dbt-core/issues/10709))
- Added the --inline-direct parameter to 'dbt show' ([#10770](https://github.com/dbt-labs/dbt-core/issues/10770))
- Enable `retry` support for microbatch models ([#10715](https://github.com/dbt-labs/dbt-core/issues/10715), [#10729](https://github.com/dbt-labs/dbt-core/issues/10729))
- Use unrendered database and schema source properties during state:modified, behind state_modified_compare_more_unrendered_values behaviour flag ([#9573](https://github.com/dbt-labs/dbt-core/issues/9573))
- Ensure microbatch models respect `full_refresh` model config ([#10785](https://github.com/dbt-labs/dbt-core/issues/10785))
- Adds validations for custom_granularities to ensure unique naming. ([#9265](https://github.com/dbt-labs/dbt-core/issues/9265))
- Test case for `merge_exclude_columns` ([#8267](https://github.com/dbt-labs/dbt-core/issues/8267))
- Convert "Skipping model due to fail_fast" message to DEBUG level ([#8774](https://github.com/dbt-labs/dbt-core/issues/8774))
- Restore previous behavior for --favor-state: only favor defer_relation if not selected in current command ([#10107](https://github.com/dbt-labs/dbt-core/issues/10107))
- Unit test fixture (csv) returns null for empty value ([#9881](https://github.com/dbt-labs/dbt-core/issues/9881))
- Fix json format log and --quiet for ls and jinja print by converting print call to fire events ([#8756](https://github.com/dbt-labs/dbt-core/issues/8756))
- Add resource type to saved_query ([#10168](https://github.com/dbt-labs/dbt-core/issues/10168))
- Fix: Order-insensitive unit test equality assertion for expected/actual with multiple nulls ([#10167](https://github.com/dbt-labs/dbt-core/issues/10167))
- Renaming or removing a contracted model should raise a BreakingChange warning/error ([#10116](https://github.com/dbt-labs/dbt-core/issues/10116))
- prefer disabled project nodes to external node ([#10224](https://github.com/dbt-labs/dbt-core/issues/10224))
- Fix issues with selectors and inline nodes ([#8943](https://github.com/dbt-labs/dbt-core/issues/8943), [#9269](https://github.com/dbt-labs/dbt-core/issues/9269))
- Fix snapshot config to work in yaml files ([#4000](https://github.com/dbt-labs/dbt-core/issues/4000))
- Improve handling of error when loading schema file list ([#10284](https://github.com/dbt-labs/dbt-core/issues/10284))
- Use model alias for the CTE identifier generated during ephemeral materialization ([#5273](https://github.com/dbt-labs/dbt-core/issues/5273))
- Implement state:modified for saved queries ([#10294](https://github.com/dbt-labs/dbt-core/issues/10294))
- Saved Query node fail during skip ([#10029](https://github.com/dbt-labs/dbt-core/issues/10029))
- Don't warn on `unit_test` config paths that are properly used ([#10311](https://github.com/dbt-labs/dbt-core/issues/10311))
- Fix setting `silence` of `warn_error_options` via `dbt_project.yaml` flags ([#10160](https://github.com/dbt-labs/dbt-core/issues/10160))
- Attempt to provide test fixture tables with all values to set types correctly for comparison with source tables ([#10365](https://github.com/dbt-labs/dbt-core/issues/10365))
- Limit data_tests deprecation to root_project ([#9835](https://github.com/dbt-labs/dbt-core/issues/9835))
- CLI flags should take precedence over env var flags ([#10304](https://github.com/dbt-labs/dbt-core/issues/10304))
- Fix typing for artifact schemas ([#10442](https://github.com/dbt-labs/dbt-core/issues/10442))
- Fix over deletion of generated_metrics in partial parsing ([#10450](https://github.com/dbt-labs/dbt-core/issues/10450))
- Do not update varchar column definitions if a contract exists ([#10362](https://github.com/dbt-labs/dbt-core/issues/10362))
- fix all_constraints access, disabled node parsing of non-uniquely named resources ([#10509](https://github.com/dbt-labs/dbt-core/issues/10509))
- respect --quiet and --warn-error-options for flag deprecations ([#10105](https://github.com/dbt-labs/dbt-core/issues/10105))
- Propagate measure label when using create_metrics ([#10536](https://github.com/dbt-labs/dbt-core/issues/10536))
- Fix state:modified check for exports ([#10138](https://github.com/dbt-labs/dbt-core/issues/10138))
- Filter out empty nodes after graph selection to support consistent selection of nodes that depend on upstream public models ([#8987](https://github.com/dbt-labs/dbt-core/issues/8987))
- Late render pre- and post-hooks configs in properties / schema YAML files ([#10603](https://github.com/dbt-labs/dbt-core/issues/10603))
- Allow the use of env_var function in certain macros in which it was previously unavailable. ([#10609](https://github.com/dbt-labs/dbt-core/issues/10609))
- Remove deprecation for tests: to data_tests: change ([#10564](https://github.com/dbt-labs/dbt-core/issues/10564))
- Fix `--resource-type test` for `dbt list` and `dbt build` ([#10730](https://github.com/dbt-labs/dbt-core/issues/10730))
- Fix unit tests for incremental model with alias ([#10754](https://github.com/dbt-labs/dbt-core/issues/10754))
- Allow singular tests to be documented in properties.yml ([#9005](https://github.com/dbt-labs/dbt-core/issues/9005))
- Ignore --empty in unit test ref/source rendering ([#10516](https://github.com/dbt-labs/dbt-core/issues/10516))
- Ignore rendered jinja in configs for state:modified, behind state_modified_compare_more_unrendered_values behaviour flag ([#9564](https://github.com/dbt-labs/dbt-core/issues/9564))
- Improve performance of infer primary key ([#10781](https://github.com/dbt-labs/dbt-core/issues/10781))
- Attempt to skip saved query processing when no semantic manifest changes ([#10563](https://github.com/dbt-labs/dbt-core/issues/10563))
- Ensure dbt retry of microbatch models doesn't lose prior successful state ([#10800](https://github.com/dbt-labs/dbt-core/issues/10800))
### Docs
- Enable display of unit tests ([dbt-docs/#501](https://github.com/dbt-labs/dbt-docs/issues/501))
- Unit tests not rendering ([dbt-docs/#506](https://github.com/dbt-labs/dbt-docs/issues/506))
- Add support for Saved Query node ([dbt-docs/#486](https://github.com/dbt-labs/dbt-docs/issues/486))
- Fix npm security vulnerabilities as of June 2024 ([dbt-docs/#513](https://github.com/dbt-labs/dbt-docs/issues/513))
### Under the Hood
- Clear error message for Private package in dbt-core ([#10083](https://github.com/dbt-labs/dbt-core/issues/10083))
- Enable use of context in serialization ([#10093](https://github.com/dbt-labs/dbt-core/issues/10093))
- Make RSS high water mark measurement more accurate on Linux ([#10177](https://github.com/dbt-labs/dbt-core/issues/10177))
- Enable record filtering by type. ([#10240](https://github.com/dbt-labs/dbt-core/issues/10240))
- Additional logging for skipped ephemeral models ([#10389](https://github.com/dbt-labs/dbt-core/issues/10389))
- bump black to 24.3.0 ([#10454](https://github.com/dbt-labs/dbt-core/issues/10454))
- generate protos with protoc version 5.26.1 ([#10457](https://github.com/dbt-labs/dbt-core/issues/10457))
- Move from minimal-snowplow-tracker fork back to snowplow-tracker ([#8409](https://github.com/dbt-labs/dbt-core/issues/8409))
- Add group info to RunResultError, RunResultFailure, RunResultWarning log lines ([#](https://github.com/dbt-labs/dbt-core/issues/))
- Improve speed of tree traversal when finding children, increasing build speed for some selectors ([#10434](https://github.com/dbt-labs/dbt-core/issues/10434))
- Add test for sources tables with quotes ([#10582](https://github.com/dbt-labs/dbt-core/issues/10582))
- Additional type hints for `core/dbt/version.py` ([#10612](https://github.com/dbt-labs/dbt-core/issues/10612))
- Fix typing issues in core/dbt/contracts/sql.py ([#10614](https://github.com/dbt-labs/dbt-core/issues/10614))
- Fix type errors in `dbt/core/task/clean.py` ([#10616](https://github.com/dbt-labs/dbt-core/issues/10616))
- Add Snowplow tracking for behavior flag deprecations ([#10552](https://github.com/dbt-labs/dbt-core/issues/10552))
- Add test utility patch_microbatch_end_time for adapters testing ([#10713](https://github.com/dbt-labs/dbt-core/issues/10713))
- Replace `TestSelector` with `ResourceTypeSelector` ([#10718](https://github.com/dbt-labs/dbt-core/issues/10718))
- Standardize returning `ResourceTypeSelector` instances in `dbt list` and `dbt build` ([#10739](https://github.com/dbt-labs/dbt-core/issues/10739))
- Add group metadata info to LogModelResult and LogTestResult ([#10775](https://github.com/dbt-labs/dbt-core/issues/10775))
- Increase supported version range for dbt-semantic-interfaces. Needed to support custom calendar features. ([#9265](https://github.com/dbt-labs/dbt-core/issues/9265))
### Security
- Explicitly bind to localhost in docs serve ([#10209](https://github.com/dbt-labs/dbt-core/issues/10209))
### Features
- Add `order_by` and `limit` fields to saved queries. ([#10531](https://github.com/dbt-labs/dbt-core/issues/10531))
- Enable specification of dbt_valid_to for current records ([#10187](https://github.com/dbt-labs/dbt-core/issues/10187))
- Enable use of multi-column unique key in snapshots ([#9992](https://github.com/dbt-labs/dbt-core/issues/9992))
- Ensure `--event-time-start` is before `--event-time-end` ([#10786](https://github.com/dbt-labs/dbt-core/issues/10786))
- Ensure microbatch models use same `current_time` value ([#10819](https://github.com/dbt-labs/dbt-core/issues/10819))
- Emit warning when microbatch model has no input with `event_time` config ([#10926](https://github.com/dbt-labs/dbt-core/issues/10926))
### Fixes
- Pass test user config to adapter pre_hook by explicitly adding test builder config to node ([#10484](https://github.com/dbt-labs/dbt-core/issues/10484))
- Handle edge cases when a specified `--event-time-end` is equivalent to the batch size truncated batch start time ([#10824](https://github.com/dbt-labs/dbt-core/issues/10824))
- Begin tracking execution time of microbatch model batches ([#10825](https://github.com/dbt-labs/dbt-core/issues/10825))
- Allow instances of generic data tests to be documented ([#2578](https://github.com/dbt-labs/dbt-core/issues/2578))
- Fix warnings for models referring to a deprecated model ([#10833](https://github.com/dbt-labs/dbt-core/issues/10833))
- Change `lookback` default from `0` to `1` to ensure better data completeness ([#10867](https://github.com/dbt-labs/dbt-core/issues/10867))
- Make `--event-time-start` and `--event-time-end` mutually required ([#10874](https://github.com/dbt-labs/dbt-core/issues/10874))
- Exclude hook result from results in on-run-end context ([#7387](https://github.com/dbt-labs/dbt-core/issues/7387))
- Implement partial parsing for all-yaml snapshots ([#10903](https://github.com/dbt-labs/dbt-core/issues/10903))
- Restore source quoting behaviour when quoting config provided in dbt_project.yml ([#10892](https://github.com/dbt-labs/dbt-core/issues/10892))
- Fix bug when referencing deprecated models ([#10915](https://github.com/dbt-labs/dbt-core/issues/10915))
- Fix 'model' jinja context variable type to dict ([#10927](https://github.com/dbt-labs/dbt-core/issues/10927))
- Take `end_time` for batches to the ceiling to handle edge case where `event_time` column is a date ([#10868](https://github.com/dbt-labs/dbt-core/issues/10868))
### Under the Hood
- Remove support and testing for Python 3.8, which is now EOL. ([#10861](https://github.com/dbt-labs/dbt-core/issues/10861))
### Dependencies
- Bump minimum allowed dbt-adapters version to 1.8.0 ([#N/A](https://github.com/dbt-labs/dbt-core/issues/N/A))
### Features
- Emit debug logging event whenever artifacts are written ([#10937](https://github.com/dbt-labs/dbt-core/issues/10937))
- Support --empty for snapshots ([#10372](https://github.com/dbt-labs/dbt-core/issues/10372))
### Fixes
- Ensure KeyboardInterrupt/SystemExit halts microbatch model execution ([#10862](https://github.com/dbt-labs/dbt-core/issues/10862))
- Handle exceptions in `get_execution_status` more broadly to better ensure `run_results.json` gets written ([#10934](https://github.com/dbt-labs/dbt-core/issues/10934))
### Under the Hood
- Behavior change for mf timespine without yaml configuration ([#10959](https://github.com/dbt-labs/dbt-core/issues/10959))
- Behavior change for cumulative metric type param ([#10960](https://github.com/dbt-labs/dbt-core/issues/10960))
body: Make `--event-time-start` and `--event-time-end` mutually required
time: 2024-10-17T14:53:57.149238-07:00
custom:
  Author: QMalcolm
  Issue: "10874"