* Add event name to `message` of recently added deprecations
* Make it harder to omit the event name from deprecation messages
* Add changie doc
* Fixup import naming
* initial hatch implementation
* cleanup docs
* replacing makefile
* cleanup hatch commands to match adapters
reorganize more to match adapters setup
script comment
don't pip install
fix test commands
* changelog
improve changelog
* CI fix
* fix for env
* use a standard version file
* remove odd license logic
* fix bumpversion
* remove sha input
* more cleanup
* fix legacy build path
* define version for pyproject.toml
* use hatch hook for license
* remove tox
* ensure tests are split
* remove temp file for testing
* explicitly match old version in pyproject.toml
* fix up testing
* get rid of bumpversion
* put dev_dependencies.txt in hatch
* setup.py is now dead
* set python version for local dev
* local dev fixes
* temp script to compare wheels
* parity with existing wheel builds
* Revert "temp script to compare wheels"
This reverts commit c31417a092.
* fix docker test file
* Allow dbt deps to run when vars lack defaults in dbt_project.yml
* Added Changelog entry
* fixed integration tests
* fixed mypy error
* Fix: Use strict var validation by default, lenient only for dbt deps to show helpful errors
* Fixed Integration tests
* fixed nit review comments
* addressed review comments and cleaned up tests
* addressed review comments and cleaned up tests
* Add test checking that `NoNodesForSelectionCriteria` is only fired once per invocation
* Stop emitting `NoNodesForSelectionCriteria` three times during `build` command
* update changelog
---------
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
* Explicitly support functions during partial parsing
* Emit a `Note` event when partial parsing is skipped due to there being no changes
* Begin testing partial parsing support of function nodes
* Add changie doc
* Move test_pp_functions to use `EventCatcher` from dbt-common
* Remove from `functions` instead of `nodes` during partial parsing function deletion
* Fix the partial parsing scheduling of function sql and yaml files
Previously we were treating the partial parsing scheduling of function
files as if they were only defined by YAML files. However functions consist
of a "raw code file" (typically a .sql file) and a YAML file. We needed
to update the deletion handling + scheduling of functions during partial
parsing to act more like "mssat" files in order to achieve this.
This work was primarily done agentically, then simplified by hand afterwards.
* Test that changing the alias of a function doesn't require reparsing of the downstream nodes that reference it
* Add test to check that functions with non-default schemas get their schemas created
* Ensure schemas of function nodes are created when in DAG during `build` command
* Add changie doc for function schema bug fix
* Add tests to check parsing of function argument default values
* Begin allowing the specification of `default_value` on function arguments
* Validate that non-default function arguments don't come _after_ default function arguments
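A minimal sketch of that ordering check (names are illustrative, not the actual parser code):
```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FunctionArgument:
    name: str
    data_type: str
    default_value: Optional[str] = None


def validate_argument_order(arguments: List[FunctionArgument]) -> None:
    """Raise if an argument without a default follows one that has a default."""
    seen_default = False
    for argument in arguments:
        if argument.default_value is not None:
            seen_default = True
        elif seen_default:
            raise ValueError(
                f"Argument '{argument.name}' has no default_value but follows an "
                "argument that does; non-default arguments must come first."
            )
```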
* Add changie doc
* Clean up changelog on main
* Bumping version to 1.12.0a1
* Code quality cleanup
* Update CHANGELOG.md
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Propagate measure.config to metric.config when specified during create_metric:True
* changelog
* Update the metric.expr to be populated correctly according to DSI rules
* convert setup.py to pyproject.toml
* move dev requirements into pyproject.toml
* with setup.py gone we can install from root
* lint
clearly state intention to remove
* convert precommit to use dev deps
* consolidate version to pyproject.toml
* editable req
get rid of editable-req
* docs updates
* tweak configs for builds
* fix script
* changelog
* fixes to build
* revert unnecessary changes
more simplification
revert linting
more simplification
fix
don’t need it
* Update `setup.py` to drop support for python 3.9
* Update github issue templates to not use python 3.9 as an example
* Update github workflows to no longer depend on or test python 3.9
* Drop python 3.9 from the test dockerfile
* Update `CONTRIBUTING.md` to correctly list what python versions we test
* Update comment about some code specifically needed for a python 3.9.7 issue
* Update pre-commit python version comment
* Add changie doc
* Update imports from click as upgrading to python 3.10 changed some click items
* Add test to check that python UDFs can be parsed
* Add `entry_point` and `runtime_version` to function node config
These two configs are required for python UDFs in some warehouses and
may also be required for other UDF languages moving forward. The specific
adapters implementation will enforce the requirement. By default both
configs will be `None` unless set.
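A rough sketch of the shape of these configs (the field names come from this change; the surrounding class is assumed):
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionNodeConfig:
    # Required by some warehouses for python UDFs (and possibly other UDF
    # languages later); the specific adapter enforces the requirement.
    entry_point: Optional[str] = None
    runtime_version: Optional[str] = None
```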
* Begin searching for `.py` files in `functions` directory
* Switch to using `SimpleParser` for functions
Previously we were using `SimpleSQLParser` and we were _only_ parsing
SQL files. However, we're now also parsing python files for functions.
As such it makes sense to switch to the `SimpleParser`. Functionally there
is no change because we re-added the `parse_file` override that `SimpleSQLParser`
had (there was nothing SQL-specific about it). Hence this is mostly a
symbolic change.
* Add changie doc
* Add test which checks that function nodes can be configured from dbt_project.yml
* Support setting function node configs from dbt_project.yml
* add changie doc
* Fix unit tests to expect `functions` as part of project
* Update function node tests to look for `type` on function config
* Update `function` node to have `type` on config
* Update parsing of `function` nodes to expect `type` on the config
* Add changie doc
* Add test to check that a function's volatility is configurable
* Define the `FunctionVolatility` enum type
* Add `volatility` as a configuration on function nodes
* Add changie doc
* Ensure jsonschema validation tests aren't skipping validation because postgres isn't technically supported
* Blanket accept `functions` as top level yaml key as temp fix
For the moment we can't sync over the full jsonschema from fusion,
so this is a stop gap to ensure we don't raise deprecation
warnings if people start specifying functions.
* Move model column `meta` and `tags` into the column's config in happy path fixture
* Test that functions work properly when unit testing models
* Ensure that functions properly get propagated to the `manifest` and `depends_on` of the `unit_test` node
* Update comment about `RuntimeUnitTestFunctionResolver`
* Add changie doc
* Add test to ensure that using a function with `--empty` works
* Ensure relations for functions are created with a `type` set to `function`
Previously on creation of function relations we weren't passing a `type`
value. This was problematic because in dbt-adapters we call `is_function`
(which uses the relation `type`) to determine whether a relation can be
filtered when filtering options (like `empty` or `event_time`) are present.
Because `type` wasn't set for function relations, `is_function` would
return `False`, and thus in the presence of a filter we would attempt to
filter it. This would raise an error because functions can't be filtered.
Setting the type on the relation solves the issue.
* add changie doc
* Add `FunctionType` enum
* Add `type` property to `Function` resource
* Add `type` property to `ParsedFunctionPatch` and `UnparsedFunctionUpdate`
* Begin populating a function's `type` during patch parsing
* Regenerate v12 manifest to include function `type` property
* Add changie doc
* Begin testing that function node `type` property is settable and accessible
* Move comment about triggering the PathEncoder back to its proper place
* Allow for the defining of basic SQL UDFs (#11957)
* Add initial definition of the `Function` resource
* Add FunctionNode definition to graph contracts
* Add test which checks whether basic UDFs can be parsed
This test fails right now, which is intentional. This is test-driven
development. Now I do work to make the test pass :)
* Add basic function sql parser for UDFs, and plumb it through parsing code paths
* Begin populating `functions` in the ref lookup
* Begin patching `function` nodes with their yaml definitions
Of note, presently `arguments` and `return_type` aren't populating properly.
It's likely that we'll have to do additional work to the FunctionPatchParser
to get this _fully_ working.
* Increase responsibility of FunctionPatchParser to handle entire `parse_patch` of function nodes
* Fix testing suite to accommodate addition of new `function` node
* Add changie doc for new `function` node type
* Minor refactoring of `NodePatchParser.parse_patch` to reduce code duplication in `FunctionPatchParser`
* Ability to list and select function nodes (#11968)
* Begin listing `function` nodes in `list` command
* Add ability to run `list` specifying the `function` resource type
* Function nodes support selection via name, file path, and resource type
* Add changie doc
* Core handles lifecycle of function nodes (#12008)
* Add basic test to check that UDFs get created in data warehouse
* Add functions to the runner map of the `build` operation
* Add basic stub of `FunctionRunner` modeled after `SeedRunner`
* Begin using `FunctionRunner` for running `function` nodes
* Add stubbing of things to implement on `FunctionRunner`
* Initial implementation of execution of function nodes
This is largely a copy of the execution of model nodes (in run.py) but
with some abstractions into helper methods to make the body of the
`execute` function easier to follow. Of note, right now this appears to
be getting the incorrect macro from the adapter. This is likely because
for some reason the node's materialization config is being set to `view`
by default.
* Ensure parsed function nodes get the correct materialization type
* Begin generating context for `function` materialization macro
* Stub out adapter response in node result as it was causing some failures
* Correct the adapter response in the run result for functions
* Begin logging `LogFunctionResult` event for completed function nodes
* Add changie doc
* Temp update dev reqs to point at branch of dbt-adapters
* Add test `LogFunctionResult` event to serialization test
* Add `function` nodes to the `WritableManifest`
* Fix tests
* Remove no longer relevant `TODO`s from `function.py`
* Add a new macro `function()` to the jinja context for using functions (#12031)
* Update function tests to look for `functions` under `manifest.functions`
* Begin storing function nodes in `Manifest.functions` instead of `Manifest.nodes`
* Ensure function nodes are still included in nodes to run during `build`
* Add ability to lookup functions on the manifest
* Update patch parsing of function YAML files now that functions live on `Manifest.functions`
* Mark function nodes as no longer refable
* Ensure function nodes are still selectable
* Add `function` macro!
* Ensure functions nodes are correctly linked in the DAG
* Update jinja context tests to expect `function` macro to exist
* Fix unit tests in test suite to expect function nodes
* Add changie doc
* regen v12.json jsonschema
* Fix test `TestVerifyArtifacts::test_run_and_generate`
* Fix test `TestVerifyArtifactsReferences::test_references`
* Fix test `TestVerifyArtifactsVersions::test_versions`
* Regen manifest artifact for `TestPreviousVersionState::test_compare_state_current`
* Update `_iterate_selected_nodes` to support function nodes
* Ensure we process node functions to ensure they get added to the `depends_on`
* Take functions into account for state modified
* Regen data for `TestModifiedStateSchemaEvolution::test_modified_state_schema_evolution` test
* Default `functions` property on `WritableManifest` to a dict
I'm not sure if this is actually how we want to do this. However, without
doing this the `WritableManifest` will break on loading of older manifests
that don't have `functions`. The alternative to this would be to bump
the schema version (v12 -> v13) and create an upgrade in `upgrade_manifest.py`.
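A sketch of the backwards-compatible default being described (other manifest fields omitted; exact types assumed):
```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class WritableManifest:
    # Older manifests have no "functions" key, so default to an empty dict
    # rather than bumping the artifact schema version to v13.
    functions: Dict[str, Any] = field(default_factory=dict)
```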
* Update UDF tests to use a more general purpose function
* Add tests ensuring UDFs can be used in models and `--inline` queries
* Correct `ParseFunctionResolver` so that the name isn't added twice to the function args spec
* Drop `functions` from `Exposure` and `Metric` definitions
* Regen v12 manifest schema
* Remove unnecessary string interpolation
* Point dev reqs back to dbt-adapters@main
* Empty commit
* Increase shared memory size for postgres docker container
I recently started getting errors that look like
```
E dbt_common.exceptions.base.DbtDatabaseError: Database Error
E could not resize shared memory segment "/PostgreSQL.3814850474" to 2097152 bytes: No space left on device
```
At first I thought this was a lack of memory, disk space, or ulimit file descriptors. However
increasing all of those things did not solve the problem. I eventually found, by exec-ing into
the container and running `df -h /dev/shm && ls -lh /dev/shm`, that the container only had 64MB
of shared memory available to it. This change increases the shared memory available to the container to 1GB,
which resolved the issue.
* Use `docker compose` instead of `docker-compose`
The latter was Docker Compose v1, which no longer works. Use `docker compose` instead.
* Only run homebrew postgres in `setup_db.sh` if `SKIP_HOMEBREW` is not passed
Our github actions use homebrew, but our local dev uses docker. When we
were doing local development and running `make setup-db`, suddenly there would
be _two_ postgres instances running: one via homebrew, and another in docker.
This was breaking the setup. Now when running `make setup-db` we skip the
homebrew-relevant portions of `setup_db.sh`.
* Set more PG environment variables in `setup_db.sh`
* fix: Properly quote event_time column names in sample mode filters
When using the --sample flag with models that have camel case or
spaced column names as their event_time field, the generated SQL
would fail because column names weren't properly quoted.
This fix introduces a robust quoting system that:
- Checks column-level quote configuration first (highest precedence)
- Falls back to source-level quoting settings
- Uses the existing Column class for proper quote handling
- Centralizes the logic in a dedicated method to eliminate duplication
- Ensures sample mode works with PostgreSQL and other databases that
require quoted identifiers for column names with spaces or special characters
Fixes issue where --sample flag fails with camel case or spaced
event_time column names.
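A simplified sketch of that precedence order (object shapes and the helper body are assumed, not the exact implementation):
```python
def _resolve_event_time_field_name(source, column_name: str) -> str:
    """Quote the event_time column when column- or source-level config asks for it."""
    column = (source.columns or {}).get(column_name)

    # Column-level quote config takes precedence when explicitly set.
    if column is not None and getattr(column, "quote", None) is not None:
        should_quote = column.quote
    else:
        # Otherwise fall back to the source-level quoting setting.
        should_quote = bool(getattr(source.quoting, "column", False))

    return f'"{column_name}"' if should_quote else column_name
```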
* returning the same path that was used earlier for the event_time field
* adding changelog
* verify cla agreement
* test: Add comprehensive tests for _resolve_event_time_field_name method
This commit adds extensive test coverage for the _resolve_event_time_field_name
method to address the PR review feedback requesting tests.
Changes:
- Add 28 parametrized test cases covering all quoting scenarios
- Test column-level vs source-level quote precedence
- Test edge cases: missing columns, empty columns dict, no quoting attributes
- Test camel case, snake case, and spaced column names
- Test both quoted and unquoted column name scenarios
- Improve method robustness with better error handling
The tests ensure the method correctly handles:
- Column-level quote settings taking precedence over source-level
- Proper fallback to source-level quoting when column-level is not set
- Edge cases where columns don't exist or have no quoting attributes
- Various column name formats (simple, camelCase, snake_case, spaced)
Fixes: Addresses PR review feedback requesting comprehensive test coverage
* style: Apply code formatting from pre-commit hooks
- Apply black formatting to providers.py and test_providers.py
- Fix trailing whitespace issues
- Add proper type guards for event_time attribute access
- Ensure all tests continue to pass after formatting changes
* Create custom hook for checking for improper imports of artifact resources
* Fix return value of `has_bad_artifact_resource_imports.py::main`
* Regex match versioned resource imports and give import in pre-commit error
* (Tidy First): Fix imports of artifact resources to not import direct versioned resources
* Add changie doc
* feat: support nested key traversal in dbt list output
* feat: support nested key traversal in dbt list output
* feat: support nested key traversal in dbt list output
* feat: support nested key traversal in dbt list output
* feat: support nested key traversal in dbt list output
* feat: support nested key traversal in dbt list output
* feat: support nested key traversal in dbt list output
* Update version for libpq-dev in Dockerfile
The previous version we had for libpq-dev stopped being listed. As such
we need to change to installing a version that is still listed. Hence
we now install version 13.22-0+deb11u1
* Fix `FromAsCasing` warning in Docker file
Our docker file was raising the warning
`FromAsCasing: 'as' and 'FROM' keywords' casing do not match (line 27)`
because we were using `FROM` and `as`, and docker wants those words
to have the same casing. As such, the `as` instances have become `AS`.
* Add changie doc
* Pull in latest jsonschemas, primarily for improved SL definitions
* Improve metric definitions in happy path test fixture to be more expansive
* Add changie doc
* Fix test_list to know about new happy path fixture metrics
* Make `GenericJSONSchemaValidationDeprecation` a "preview" deprecation
Making the deprecation a preview will:
1. Remove it from the summary
2. Emit it as a Note event instead of the actual deprecation event
a. This does mean it'll still be in the logs (but as info level instead of warning)
* Update message of `GenericJSONSchemaValidationDeprecation` to state it's only possibly a deprecation
* Add changie doc
* fix GenericJSONSchemaValidationDeprecation related tests
* Add more details to `GenericJSONSchemaValidationDeprecation` message
* Fix tests related to GenericJSONSchemaValidationDeprecation
* Bump dbt-protos dep min to get new env var namespace deprecation event
* Define new EnvironmentVariableNamespaceDeprecation event in core
* Add new deprecation class for EnvironmentVariableNamespaceDeprecation
* Bump dbt-common dep min to get new env var prefix definition
* Add new `env_vars` module with function for validating dbt engine env vars
* Add changie doc
* Begin keeping a list of env vars associated with cli params
* Begin validating that only allowed engine environment variables exist
* Add some extra engine env vars found throughout the project to the known list
* Begin cross propagating dbt engine env vars with old names
If the old env var name is present, and the new one is not, set the
new one to have the value of the old one. Else, if the new one is set,
set/override old name to have the value of the new one.
There are some drawbacks to this approach. Namely, click only validates
environment variable types for the environment variables it is aware of.
Thus by using the new environment variable naming scheme for existing
environment variables (not newly added ones), we actually lose type guarantees.
This might require a rework.
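Roughly the propagation rule described above, as a standalone sketch (function name assumed):
```python
import os


def cross_propagate_env_var(old_name: str, new_name: str) -> None:
    """Keep the old- and new-style engine env var names in sync."""
    old_value = os.environ.get(old_name)
    new_value = os.environ.get(new_name)

    if new_value is not None:
        # The new name wins: mirror its value onto the old name.
        os.environ[old_name] = new_value
    elif old_value is not None:
        # Only the old name is set: carry its value over to the new name.
        os.environ[new_name] = old_value
```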
* Add test for validate_engine_env_vars method
* Add unit test ensuring new engine env vars get added correctly
* Add integration test for environment variable namespace deprecation
* Move logic for propagating engine env vars to pre-flight env var setting
Previously we were attempting to set it on the flags context, but that is
_not_ the environment variable context. Instead what appears to happen is
that the environment variable context is loaded, click takes this into
consideration, and then the flags are set from click's understanding of
passed cli params + env vars.
* Get the env vars from the invocation context in `validate_engine_env_vars`
* Move `_create_engine_env_var` to `__init__` of `EngineEnvVar` data class
* Fix error type in __init__ of EngineEnvVar dataclass
* Correct grammar of EnvironmentVariableNamespaceDeprecation message
* Upgrade to DSI 0.9.0
Note this new version has some breaking changes (changes to class names). This won't impact semantic manifest parsing. The changes in the new version will be used to support order_by and limit on saved queries.
* Changelog
* Update test saved query
* Improve deprecation message for SourceOverrideDeprecation
* Move SourceOverrideDeprecation to jsonschema validation code path
* Update test for checking SourceOverrideDeprecation
* Update dbt_project.yml jsonschema spec to handle nested config defs
Additionally adds some more cloud configs
* Update schema files jsonschema definition to not have `overrides` for sources
Additionally add some cloud definitions
* Add changie doc
* Update happy_path fixture to include nested config specifications in dbt_project.yml
* First draft of SourceOverrideDeprecation warning.
* Refinements and test
* Back out unneeded change
* Fix unit test.
* add changie doc
* Bump minimum dbt-protos to 1.0.335
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Stop dynamically setting ubuntu version for `main.yml` and structured logging actions
These actions are important to run on community PRs. However these workflows
use `on: pull_request` instead of `on: pull_request_target`. That is intentional,
as `on: pull_request` doesn't give access to variables or secrets, and we need
to keep it that way for security purposes. These actions were trying to access
a variable which they don't have access to. This was a nicety for us, because
sometimes we'd delay moving to github's `ubuntu-latest`. However, the security
concern is more important, and thus we lose the variable for these workflows.
* Change `runs_on` of `artifact-reviews.yml`
* Stop dynamically setting mac and windows versions in main.yml
* Revert "bump dbt-common (#11640)"
This reverts commit c6b7655b65.
* update freshness model config handling
* lower case all columns when processing unit test results
* add changelog
* swap .columns for .column_names
* use rename instead of select api for normalizing agate table column casing
* Add helper to validate model configs via jsonschema
* Store jsonschemas as module vars instead of reloading every time
Every time we ran a jsonschema validation, we were _reloading_ the
underlying jsonschema from file. As a one-off, this isn't too costly.
However, for large projects it starts to add up. By only loading each json
schema once we can save a lot of time. Calling one of the functions which
loads a jsonschema 10,000 times was costing ~3.7215 seconds. By switching
to this module var paradigm we reduced that to ~0.3743 seconds.
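The pattern is essentially load-once-and-reuse; a minimal sketch (path handling and helper name assumed):
```python
import json
from functools import lru_cache
from typing import Any, Dict


@lru_cache(maxsize=None)
def load_jsonschema(path: str) -> Dict[str, Any]:
    """Read and parse a jsonschema file once; subsequent calls hit the cache."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```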
* Begin validating configs from model `.sql` files
It was a bit of a hunt to figure out where to do this. We couldn't do
the validating in `calculate_node_config` because that function is called
4 times per node (which is an issue of itself, but out of scope for this
work). We also couldn't do the validation where `_config_call_dict` is set
because it turns out there are multiple avenues for setting
`_config_call_dict`, which is a fun rabbit hole.
* Ensure .sql configs are validated only once
It turns out that `update_parsed_node_config` can potentially be
called twice per model. It'll be called from either `ModelParser.render_update`
or `ModelParser.populate`, and it can additionally be called from
`PatchParser.patch_node_config` if there is a .yml definition for the
model. We only want to validate the config once, and we aren't guaranteed
to have a `PatchParser` if there is no patch for the model. Thus, we've
updated `ModelParser.populate` and `ModelParser.render_update` to
request the config validation (which by default doesn't run unless requested).
* Split out the model config specific validation from general jsonschema validation
We're validating model configs from sql files via a subschema of the main
resources jsonschema, which requires different case logic for detecting the
different types of deprecation warnings. Thus `validate_model_config` cannot
call `jsonschema_validate`. We could have had both logic paths exist in
`jsonschema_validate`, but it would have added another layer of if/elses
and bloated the function substantially.
* Handle additional properties of sub config objects
* Give better key path information for .sql config jsonschema issues
* Add tests for validate_model_config
* Add changie doc
* Fix jsonschemas unittests to avoid catching irrelevant issues
* Revert "bump dbt-common (#11640)"
This reverts commit c6b7655b65.
* update freshness model config handling
* lower case all columns when processing unit test results
* add changelog
* swap .columns for .column_names
* Loosen pydantic maximum to <3 (allowing for pydantic 2)
* Add an internal pydantic shim for getting pydantic BaseSettings regardless of pydantic v1 vs v2
* Add changie doc
In 1.10.0 we began utilizing `jsonschema._keywords`. However, the submodule
`_keywords` wasn't added until jsonschema `4.19.1` which came out September
20th, 2023. Our jsonschema requirement was being set transitively via
dbt-common as `>=4.0,<5`. This meant people doing a _non_-fresh install of
dbt-core `1.10.0` could end up with a broken system if their existing
jsonschema dependency was anywhere in the range `>=4.0,<4.19.1`. By bumping the
minimum jsonschema version we make it such that anyone installing dbt-core 1.10.1 will
automatically get their jsonschema updated (assuming they don't have an exclusionary
pin).
* Begin testing that model freshness can't be set as a top level model property
* Remove ability to specify freshness as top level property of models
* Add some comments to calculate_node_config for better readability
* Drop `freshness` as a top level property of models, and let `patch_node_config` handle merging config freshness
Model freshness hasn't been released in a minor release yet, nor been documented. Thus
it is safe to remove the top-level freshness property on models. Freshness will instead
be set, and read, from the model config. Additionally our way of calculating the
config model freshness only got the top level `+freshness` from dbt_project.yml (ignoring
any path specific definitions). By instead using the built in `calculate_node_config` (which
is eventually called by `patch_node_config`), we get all path specific freshness config handling
and it also handles the precedence of `dbt_project.yml` specification, schema file specification,
and sql file specification.
* add changie doc
* Ensure source node `.freshness` is equal to node's `.config.freshness`
* Default source config freshness to empty spec if no freshness spec is given
* Update contract tests for source nodes
* Ensure `build_after` is present in model freshness in parsing, otherwise skip freshness definition
* add freshness model config test
* add changelog
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* Handle explicit setting of null for source freshness config
* Abstract out the creation of the target config
This is useful because it makes that portion of code more re-usable/portable
and makes the work we are about to do easier.
* Fix bug in `merge_source_freshness` where empty freshness was preferred over `None`
The issue was that during merging of freshnesses, an "empty freshness", one
where all values are `None`, was being preferred over `None`. This was
problematic because an "empty freshness" indicates that a freshness was not
specified at that level, while `None` means that the freshness was _explicitly_
set to `None`. As such we should prefer the thing that was explicitly set.
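A simplified sketch of the distinction between "not specified" (an empty spec) and "explicitly set to null" (field names assumed, not the actual merge code):
```python
def merge_source_freshness(base, update):
    """Pick between two freshness specs, preferring whichever was explicitly set.

    `None` means freshness was explicitly set to null; an "empty" spec
    (all fields None) means nothing was specified at that level.
    """
    if update is None or not _is_empty(update):
        # update was explicitly specified (even if explicitly null), so it wins
        return update
    # update said nothing at this level; fall back to the base spec
    return base


def _is_empty(freshness) -> bool:
    return freshness.warn_after is None and freshness.error_after is None
```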
* Properly get dbt_project defined freshness and don't merge with schema defined freshness
Previously we were only getting the "top level" freshness from the
dbt_project.yml. This was ignoring freshness settings for the direct,
source, and table levels set in the dbt_project.yml. Additionally, we were
merging the dbt_project.yml freshness into the schema freshness. Long
term this merging would be desirable, however before we do that we need
to ensure freshness at different levels within the dbt_project.yml gets
properly merged (currently the different levels clobber each other). Fixing
that is a larger issue though. So for the time being, the schema definition
of freshness will clobber any dbt_project.yml definition of freshness.
* Add changie doc
* Fix whitespace to make code quality happy
* Set the parsed source freshness to an empty FreshnessThreshold if None
This maintains backwards compatibility
* Revert "bump dbt-common (#11640)"
This reverts commit c6b7655b65.
* add file_format as a top level config in CatalogWriteIntegrationConfig
* add changelog
* Clean up changelog on main
* Bumping version to 1.11.0a1
* Code quality cleanup
* add old changelogs
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Add a robust schema definition of singular test to happy path fixture
* Add generic tests to happy path fixture
* Add unit tests to happy path fixture
* Fix data test + unit test happy path fixtures so they're valid
* Fix test_list.py for data test + unit test happy path fixture
* Fixup issues due to imperfect merge
* Drop generic data test definition style that we don't want to support from happy path fixture
* Add data test attributes to a pre-existing data test type
* Fix test_list.py again
* Don't forget to normalize in test_list.py
* Include event name in msg of deprecation warning for all recently added deprecations
* Add behavior flag for gating inclusion of event name in older deprecation messages
* Conditionally append event name to older deprecation events depending on behavior flag
* Add changie doc
* Migrate to `WarnErrorOptionsV2` and begin using `error` and `warn` as primary config keys
* Update tests using `WarnErrorOptions` to use `error` and `warn` terminology
* Begin emitting deprecation warning when include/exclude terminology is used with WarnErrorOptions
* bump minimum of dbt-protos
* Add test for new WarnErrorOptions deprecation
* add changie doc
* Fix test_warn_error_options.py tests
* Fix test_singular_tests.py tests
* Add WOEIncludeExcludeDeprecation to test_events.py serialization test
* Begin testing that `happy_path_project` and `project` fixtures have no deprecations
* Add model specific configs to model yml description in happy path test
* Add all possible model config property keys to happy path fixture
* Add more model properties to happy path fixture
* Move configs for happy path testing onto new happy path model fixture
* Fix deprecation tests names
* Add newly generated jsonschema for schema files
* Skip happy path deprecation test for now
* Fix 'database' value of model for happy path fixture
* Fix happy path fixture model grants to a role that exists
* Fix test_list.py
* Fix detection of additional config property deprecation
Previously we were taking the first `key` on the `instance` property
of the jsonschema ValidationError. However, this validation error
is raised as an "anyOf" violation, which then has sub-errors in its
`context` property. To identify the key in violation, we have to
find the `additionalProperties` validation in the sub-errors. The key
that is an issue can then be parsed from that sub-error.
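Roughly the traversal being described, sketched against the `jsonschema` library's error objects (helper name assumed):
```python
import re
from typing import List

from jsonschema.exceptions import ValidationError

_QUOTED_KEYS = re.compile(r"'([^']+)'")


def additional_property_keys(error: ValidationError) -> List[str]:
    """Collect keys flagged by additionalProperties violations nested under an anyOf error."""
    keys: List[str] = []
    for sub_error in error.context or []:
        if sub_error.validator == "additionalProperties":
            # e.g. "Additional properties are not allowed ('foo', 'bar' were unexpected)"
            keys.extend(_QUOTED_KEYS.findall(sub_error.message))
    return keys
```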
* Refactor key parsing from jsonschema ValidationError messages to single definition
* Update handling `additionalProperties` violations to handle multiple keys in violation
* Add changelog
* Remove guard logic in jsonschemas validation rule that is no longer needed
* fix Dockerfile.test
* add change
* Ensure that all instances where `pre-commit` is called are prefixed with `$(DOCKER_CMD)`
* Changelog entry
---------
Co-authored-by: Taichi Kato <taichi-8128@outlook.jp>
In a lot of our function deprecation warning tests we check for a
matching string within an event message. Some of these matches check
for a file path. The problem with this was that Windows formats
file paths differently. This was causing the functional tests to
_fail_ when run in a Windows environment. To fix this we've removed
the file path part of the string from the test assertions.
* Begin basic jsonschema validations of dbt_project.yml (#11505)
* Add jsonschema for validating the project file
* Add utility for helping to load jsonschema resources
Currently things are a bit hard coded. We should probably alter this
to be a bit more flexible.
* Begin validating the `dbt_project.yml` via jsonschema
* Begin emitting deprecation warnings for generic jsonschema violations in dbt_project.yml
* Move from `DbtInternalError` to `DbtRuntimeError` to avoid circular imports
* Add tests for basic jsonschema validation of `dbt_project.yml`
* Add changie doc
* Add serialization test for new deprecation events
* Alter the project jsonschema to not require things that are optional
* Add datafiles to package egg
* Update inclusion of project jsonschema in setup.py to get files correctly
Using the glob spec returns a list of found files. Our previous spec was
raising the error
`error: can't copy 'dbt/resources/input_schemas/project/*.json': doesn't exist or not a regular file`
* Try another approach of adding jsonschema to egg
* Add input_schemas dir to MANIFEST.in spec
* Drop jsonschema inclusion spec from setup.py
* Begin using importlib.resources.files for loading project jsonschema
This doesn't currently work with editable installs :'(
* Use relative paths for loading jsonschemas instead of importlib
Using "importlib" is the blessed way to do this sort of thing. However,
that is failing for us on editable installs. This commit switches us
to using relative paths. Technically doing this has edge cases, however
this is also what we do for the `start_project` used in `dbt init`. So
we're going to do the same, for now. We should revisit this soon.
* Drop requirement of `__additional_properties__` specified by project jsonschema
* Drop requirement for `pre-hook` and `post-hook` specified by project jsonschema
* Reset `active_deprecations` global at the end of tests using `project` fixture
* Begin validating the jsonschema of YAML resource files (#11516)
* Add jsonschema for resources
* Begin jsonschema validating YAML resource files in dbt projects
* Drop `tests` and `data_tests` as required properties of `Columns` and `Models` for resources jsonschema
* Drop `__additional_properties__` as required for `_Metrics` in resources jsonschema
* Drop `post_hook` and `pre_hook` requirement for `__SnapshotsConfig` in resources jsonschema
* Update `_error_path_to_string` to handle empty paths
* Create + use custom Draft7Validator to ignore datetime and date classes
* Break `TestRetry` functional test class into multiple test classes
There was some global state leaking from one test to another, which was
causing some of the tests to break.
* Refactor duplicate instances of `jsonschema_validate` to single definition
* Begin testing jsonschema validation of resource YAMLs
* Add changie doc
* Add Deprecation Warnings for Unexpected Jinja Blocks (#11514)
* Add deprecation warnings on unexpected jinja blocks.
* Add changelog entry.
* Add test event.
* Regen proto types.
* Fix event test.
* Add `UnexpectedJinjaBlockDeprecationSummary` and add file context to `UnexpectedJinjaBlockDeprecation` (#11517)
* Add summary event for UnexpectedJinjaBlockDeprecation
* Begin including file information in UnexpectedJinjaBlockDeprecation event
* Add UnexpectedJinjaBlockDeprecationSummary to test_events.py
* Deprecate Custom Top-Level Keys (#11518)
* Add specific deprecation for custom top level keys.
* Add changelog entry
* Add test events
* Add Check for Duplicate YAML Keys (#11510)
* Add functionality to check for duplicate yaml keys, working around a PyYAML limitation.
* Fix up some ancient typing issues.
* Ignore typing issue, for now.
* Correct unit tests of `checked_load`
* Add event and deprecation types for duplicate yaml keys
* Begin validating `dbt_project.yml` for duplicate key violations
* Begin checking for duplicate key violations in schema files
* Add test to check duplicate keys are checked in schema files
* Refactor checked_yaml failure handling to reduce duplicate code
* Move `checked_load` utilities to separate file to avoid circular imports
* Handle yaml `start_mark` correctly for top level key errors
* Update changelog
* Fix test.
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Fix issue with YAML anchors in new CheckedLoader class.
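PyYAML silently keeps the last value for a duplicate key, so the check has to hook into mapping construction; a self-contained sketch of the technique (details of the actual `CheckedLoader`/`checked_load` differ):
```python
import yaml


class CheckedLoader(yaml.SafeLoader):
    """SafeLoader variant that rejects duplicate keys in mappings."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        for key_node, _value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise yaml.constructor.ConstructorError(
                    None, None, f"duplicate key {key!r} found", key_node.start_mark
                )
            seen.add(key)
        return super().construct_mapping(node, deep=deep)


def checked_load(contents: str):
    return yaml.load(contents, Loader=CheckedLoader)
```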
* Deprecate having custom keys in config blocks (#11522)
* Add deprecation event for custom keys found in configs
* Begin checking schema files for custom keys found in configs
* Test new CustomConfigInConfigDeprecation event
* Add changie doc
* Add custom config key deprecation events to event serialization test
* Provide message to ValidationError in `SelectorConfig.from_path`
This typing error is unrelated to the changes in this PR. However,
it was failing CI, so I figured it'd be simple to just fix it.
* Add some extra guards around the ValidationFailure `path` and `instance`
* [TIDY-FRIST] Use new `deprecation_tag` (#11524)
* Tidy First: Update deprecation events to use the new `deprecation_tag`
Note: we did this for a majority of deprecations, but not _all_ deprecations,
because not all deprecations were following the pattern. Some people parse
our logs with regex, so altering the deprecations that weren't already doing
what `deprecation_tag` does would be a _breaking change_ for those events;
thus we did not alter them.
* Bump minimum dbt-common to `1.22.0`
* Fix tests
* Begin emitting deprecation events for custom properties found in objects (#11526)
* Fix CustomKeyInConfigDeprecationSummary
* Add deprecation type for custom properties in YAML objects
* Begin emitting deprecation events for custom properties found in objects
* Add changie doc
* Add `loaded_at_query` property to `_Sources` definition in jsonschema
This was breaking the test tests/unit/parser/test_parser.py::SchemaParserSourceTest::test_parse_source_custom_freshness_at_source
* Move validating jsonschema of schema files earlier in the process
Previously we were validating the jsonschema of schema files in
`SchemaParser.parse_file`. However, the file is originally loaded in
`yaml_from_file` (which happens before `SchemaParser.parse_file`), and
`yaml_from_file` _modifies_ the loaded dictionary to add some additional
properties. These additional properties violate the jsonschema unfortunately,
and thus we needed to start validating the schema against the jsonschema
before any such modifications.
* Skip parser tests for `model.freshness`
Model freshness never got fully implemented, and won't be implemented or
documented for 1.10. As such we're going to consider the `model.freshness`
property an "unknown additional property". This is actually good as some
people have "accidentally" defined "freshness" on their models (likely due
to copy/paste of a source), and that property isn't doing anything.
* One single DeprecationsSummary event to rule them all (#11540)
* Begin emitting singular deprecations summary, instead of summary per deprecation type
* Remove concept of deprecation specific summary events in deprecations module
* Drop deprecation summary events that have been added to `feature-branch--11335-deprecations` but not `main`
These are safe to drop with no notice because they only ever existed
on a feature branch, never main.
* Correct code numbers for new events on feature-branch that haven't made it to main yet
* Kill `PackageRedirectDeprecationSummary` event, and retire its event code
* add changie doc
* Update jsonschemas to versions 0.0.110 (#11541)
* Update jsonschemas to 0.0.110
* Don't allow additional properties in configs
* Don't allow additional top level properties on objects
* Allow for 'loaded_at_query' on Sources and Tables
* Don't allow additional top level properties in schema files
---------
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
* [#9791] Fix datetime.datetime.utcnow() is deprecated as of Python 3.12
* Explicit UTC timezone declaration for instances of datetime.now()
* Keep utcnow() in functional test case to avoid setup errors
* Utilize the more specific datetime class import for microbatch config
* Replace utcnow calls in contracts and artifacts
* Replace utcnow calls in functional and unit test cases
* Test deserialization of compiled run execution results
* Test deserialization of instantiated run execution result
* Code style improvements
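For reference, the replacement pattern:
```python
from datetime import datetime, timezone

# Deprecated as of Python 3.12 and returns a naive timestamp:
# now = datetime.utcnow()

# Preferred: an aware timestamp with an explicit UTC timezone.
now = datetime.now(timezone.utc)
```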
* rough in catalog contracts + requires.catalog
* set catalog integration
* add initial functional test for catalog parsing
* use dbt-adapters.git@feature/externalCatalogConfig
* add concrete catalog integration config
* add requires.catalog to build + reorder requires
* separate data objects from loaders
* improve functional test and fix import
* Discard changes to tests/functional/adapter/simple_seed/test_seed_type_override.py
* Change branch name for dot-adapters
* make table_format and catalog_type strings for now
* remove uv from makefile
* Discard changes to dev-requirements.txt
* Overhaul parsing catalogs.yml
* Use [] instead of None
* update postgres macos action
* Add more tests
* Add changie
* Second round of refactoring
* Address PR comments
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
* Functional test for hourly microbatch model
* Use today's date for functional test for hourly microbatch model
* Use today's date for functional test for hourly microbatch model
* Restore to original
* Only use alphanumeric characters within batch ids
* Add tests for batch_id and change expected output for format_batch_start
* Handle missing batch_start
* Revert "Handle missing batch_start"
This reverts commit 65a1db0048. Reverting this because
`batch_start` for `format_batch_start` cannot be `None` and `start_time` for `batch_id`
cannot be `None`.
* Improve BatchSize specific values for `format_batch_start` and `batch_id` methods
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Update to latest ddtrace within minor version range.
* Add test coverage for Python 3.13
* Update setup.py to indicate Python 3.13 support.
* Update freezegun version to support Python 3.13
* Add changelog entry.
* Default macro argument information from original definitions.
* Add argument type and count warnings behind behavior flag.
* Add changelog entry.
* Make flag test more robust.
* Use a unique event for macro annotation warnings, per review.
* Add event to test list.
* Regenerate core_types_pb2 using protoc 5.28.3
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* update ubuntu 20.04 to 24.04
* updates to ubuntu-latest instead
* try postgres update
* Change owner of db creation script so postgres can run it.
* Add sudos.
* Add debug logging.
* Set execute bit on scripts.
* More debug logging.
* try a service
* clean up and split the integrations tests by os
---------
Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
* Push orchestration of batches previously in the `RunTask` into `MicrobatchModelRunner`
* Split `MicrobatchModelRunner` into two separate runners
`MicrobatchModelRunner` is now an orchestrator of `MicrobatchBatchRunner`s, the latter being what handle actual batch execution
* Introduce new `DbtThreadPool` that knows if it's been closed
* Enable `MicrobatchModelRunner` to shutdown gracefully when it detects the thread pool has been closed
* Add secondary_profiles to profile.py
* Add more tests for edge cases
* Add changie
* Allow inferring target name and add tests for the same
* Incorporate review feedback
* remove unnecessary nesting
* Use typing_extensions.Self
* use quoted type again
* address pr comments round 2
* Allow for rendering of refs/sources in snapshots to be sampled
Of note the parameterization of `test_resolve_event_time_filter` in
tests/unit/context/test_providers.py is getting large and cumbersome.
It may soon be time to split it into a few distinct tests so that each
test needs fewer parametrized arguments.
* Simplify `isinstance` checks when resolving event time filter
Previously we were doing `isinstance(a, class1) or isinstance(a, class2)`,
but this can be simplified to `isinstance(a, (class1, class2))`. Whoops.
* Ensure sampling of refs of snapshots is possible
Notably we didn't have to add `isinstance(self.target, SnapshotConfig)` to the
checks in `resolve_event_time_filter` because `SnapshotConfig` is a subclass
of `NodeConfig`.
* Add changie doc
* Reapply "Add `doc_blocks` to manifest for nodes and columns (#11224)" (#11283)
This reverts commit 55e0df181f.
* Expand doc_blocks backcompat test
* Refactor to method, add docstring
* Add `--sample` flag to `run` command
* Remove no longer needed `if` statement around EventTimeFilter creation for microbatch models
Upon the initial implementation of microbatch models, the `start` for a batch was _optional_.
However, in c3d87b89fb they became guaranteed. Thus the if statement
guarding when `start/end` isn't present for microbatch models was no longer actually doing anything.
Hence, the if statement was safe to remove.
* Get sample mode working with `--event-time-start/end`
This is temporary as a POC. In the end, sample mode can't depend on the arguments
`--event-time-start/end` and will need to be split into their own CLI args / project
config, something like `--sample-window`. The issue with using `--event-time-start/end`
is that if people set those in the project configs, then their microbatch models would
_always_ run with those values even outside of sample mode. Despite that, this is a
useful checkpoint even though it will go away.
* Begin using `--sample-window` for sample mode instead of `--event-time-start/end`
Using `--event-time-start/end` for sample mode was conflicting with microbatch models
when _not_ running in sample mode. We will have to do _slightly_ more work to plumb
this new way of specifying sample time to microbatch models.
* Move `SampleWindow` class to `sample_window.py` in `event_time` submodule
This is mostly symbolic. We are going to be adding some utilities for "event_time"
type things, which will all live in the `event_time` submodule. Additionally we plan
to refactor `/incremental/materializations/microbatch.py` into the sub module as well.
* Create an `offset_timestamp` separate from MicrobatchBuilder
The `MicrobatchBuilder.offset_timestamp` _truncates_ the timestamp before
offsetting it. We don't want to do that, we want to offset the "raw" timestamp.
We could have renamed the microbatch builder function to
`truncate_and_offset_timestamp` and separated the offset logic into a separate
abstract function. However, the offset logic in the MicrobatchBuilder context
depends on the truncation. We might later be able to refactor the
MicrobatchBuilder-provided function by truncating _after_ offsetting instead of before.
But that is out of scope for this initial work, and we should instead revisit it
later.
* Add `types-python-dateutil` to dev requirements
The previous commit began using a submodule of the third-party dateutil
python library. We weren't previously using this library, and thus didn't
need the type stubs for it. But now that we do use it, we need to have
the type stubs during development.
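A sketch of offsetting the raw timestamp by batch-size units without truncating it first, using `dateutil` (helper name and batch-size strings assumed):
```python
from datetime import datetime
from dateutil.relativedelta import relativedelta


def offset_timestamp(timestamp: datetime, batch_size: str, offset: int) -> datetime:
    """Shift a timestamp by `offset` batch-size units, leaving it untruncated."""
    deltas = {
        "hour": relativedelta(hours=offset),
        "day": relativedelta(days=offset),
        "month": relativedelta(months=offset),
        "year": relativedelta(years=offset),
    }
    return timestamp + deltas[batch_size]
```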
* Begin supporting microbatch models in sample mode
* Move parsing logic of `SampleWindowType` to `SampleWindow`
* Allow for specification of "specific" sample windows
In most cases people will want to set "relative" sample windows, i.e.
"3 days" to sample the last three days. However, there are some cases
where people will want to set "specific" sample windows for some chunk of
historic time, i.e. `{'start': '2024-01-01', 'end': '2024-01-31'}`.
* Fix tests of `BaseResolver.resolve_event_time_filter` for sample mode changes
* Add `--no-sample` as it's necessary for retry
* Add guards to accessing of `sample` and `sample_window`
This was necessary because these aren't _always_ available. I had expected
to need to do this after putting the `sample` flag behind an environment
variable (which I haven't done yet). However, we needed to add the guards
sooner because the `render` logic is called multiple times throughout the
dbt process, and earlier on the flags aren't available.
* Gate sample mode functionality via env var `DBT_EXPERIMENTAL_SAMPLE_MODE`
At this point sample mode is _alpha_ and should not be depended upon. To make
this crystal clear we've gated the functionality behind an environment variable.
We'll likely remove this gate in the coming month.
* Add sample mode tests for incremental models
* Add changie doc for sample mode initial implementation
* Fixup sample mode functional tests
I had updated the `later_input_model.sql` to be easier to test with. However,
I didn't correspondingly update the initial `input_model.sql` to match.
* Ensure microbatch creates correct number of batches when sample mode env var isn't present
Previously microbatch was creating the _right_ number of batches when:
1. sample mode _wasn't_ being used
2. sample mode _was_ being used AND the env var was present
Unfortunately sample mode _wasn't_ creating the right number of batches when:
3. sample mode _was_ being used AND the env var _wasn't_ present.
In case (3) sample mode shouldn't be run. Unfortunately we weren't gating sample
mode by the environment variable during batch creation. This led to a situation
where batch creation was using sample mode but the rendering of refs
_wasn't_, putting things in an in-between state. This commit
fixes that issue.
Additionally of note, we currently have duplicate sample mode gating logic in the
batch creation as well as in the rendering of refs. We should probably consolidate
this logic into a singular importable function, that way any future changes of how
sample mode is gated are easier to implement.
* Correct comment in SampleWindow post serialization method
* Hide CLI sample mode options
We are doing this _temporarily_ while sample mode as a feature is in
alpha/beta and locked behind an environment variable. When we remove the
environment variable we should also unhide these.
Currently, running this command on a project containing a microbatch
model results in an error, as microbatch models require a datetime
value in their config which cannot be serialized by the default JSON
serializer.
There already exists a custom JSON serializer within the dbt-core
project that converts datetime to ISO string format. This change uses
the above serializer to resolve the error.
* Update `TestMicrobatchWithInputWithoutEventTime` to check running again raises warning
The first time the project is run, the appropriate warning about inputs is raised. However,
the warning is only being raised when a full parse happens. When partial parsing happens
the warning isn't getting raised. In the next commit we'll fix this issue. This commit updates
the test to show that the second run (with partial parsing) doesn't raise the warning, and thus
the test fails.
* Update manifest loading to _always_ check microbatch model inputs
Of note we are at the point where multiple validations are iterating
all of the nodes in a manifest. We should refactor these _soon_ such that
we are not iterating over the same list multiple times.
* Add changie doc
* Begin producing warning when attempting to force concurrent batches without adapter support
Batches of microbatch models can be executed sequentially or concurrently. We try to figure out which to do intelligently. As part of that, we implemented an override, the model config `concurrent_batches`, to allow the user to bypass _some_ of our automatic detection. However, a user _cannot_ force batches to run concurrently if the adapter doesn't support concurrent batches (declaring support is opt-in). Thus, if an adapter _doesn't_ support running batches concurrently, and a user tries to force concurrent execution via `concurrent_batches`, then we need to warn the user that that isn't happening.
* Add custom event type for warning about invalid `concurrent_batches` config
* Fire `InvalidConcurrentBatchesConfig` warning via `warn_or_error` so it can be silenced
* Update partial success test to assert partial successes mean that the run failed
* Update results interpretation to include `PartialSuccess` as failure status
* Update single batch test case to check for generic exceptions
* Explicitly skip the final batch execution when there is only one batch
Previously, if there was only one batch, we would try to execute _two_
batches: the first batch, and a non-existent "last" batch. This would
result in an unhandled exception.
* Changie doc
* microbatch: split out first and last batch to run in serial
* only run pre_hook on first batch, post_hook on last batch
* refactor: internalize parallel to RunTask._submit_batch
* Add optional `force_sequential` to `_submit_batch` to allow for skipping parallelism check
* Force last batch to run sequentially
* Force first batch to run sequentially
* Remove batch_idx check in `should_run_in_parallel`
`should_run_in_parallel` shouldn't, and no longer needs to, take into
consideration where a batch sits in the larger sequence. The first and
last batch for a microbatch model are now forced to run sequentially
by `handle_microbatch_model`
* Begin skipping batches if first batch fails
* Write custom `on_skip` for `MicrobatchModelRunner` to better handle when batches are skipped
This was necessary specifically because the default on skip set the `X of Y` part
of the skipped log using the `node_index` and the `num_nodes`. If there was 2
nodes and we are on the 4th batch of the second node, we'd get a message like
`SKIPPED 4 of 2...` which didn't make much sense. We're likely in a future commit
going to add a custom event for logging the start, result, and skipping of batches
for better readability of the logs.
* Add microbatch pre-hook, post-hook, and sequential first/last batch tests
* Fix/Add tests around first batch failure vs latter batch failure
* Correct MicrobatchModelRunner.on_skip to handle skipping the entire node
Previously `MicrobatchModelRunner.on_skip` only handled when a _batch_ of
the model was being skipped. However, that method is also used when the
entire microbatch model is being skipped due to an upstream node error. Because
we previously _weren't_ handling this second case, it'd cause an unhandled
runtime exception. Thus, we now check whether we're running a batch or not,
and if there is no batch, use the super's on_skip method.
* Correct conditional logic for setting pre- and post-hooks for batches
Previously we were doing an if+elif for setting pre- and post-hooks
for batches, wherein the `if` matched if the batch wasn't the first
batch, and the `elif` matched if the batch wasn't the last batch. The
issue with this is that if the `if` was hit, the `elif` _wouldn't_ be hit.
This caused the first batch to appropriately not run the `post-hook`, but
then every batch after would run the `post-hook`.
* Add two new event types `LogStartBatch` and `LogBatchResult`
* Update MicrobatchModelRunner to use new batch specific log events
* Fix event testing
* Update microbatch integration tests to catch batch specific event types
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* New function to add graph edges.
* Clean up, leave out flag temporarily for testing.
* Put new test edge behavior behind flag.
* Final draft of documentation.
* Add `batch_id` to jinja context of microbatch batches
* Add changie doc
* Update `format_batch_start` to assume `batch_start` is always provided
* Add "runtime only" property `batch_context` to `ModelNode`
By it being "runtime only" we mean that it doesn't exist on the artifact
and thus won't be written out to the manifest artifact.
* Begin populating `batch_context` during materialization execution for microbatch batches
* Fix circular import
* Fixup MicrobatchBuilder.batch_id property method
* Ensure MicrobatchModelRunner doesn't double compile batches
We were compiling the node for each batch _twice_. Besides making microbatch
models more expensive than they needed to be, double compiling wasn't
causing any issue. However the first compilation was happening _before_ we
had added the batch context information to the model node for the batch. This
was causing models that try to access the `batch_context` information on the
model to blow up, which was undesirable. As such, we've now skipped
the first compilation. We've done this similarly to how SavedQuery nodes skip
compilation.
* Add `__post_serialize__` method to `BatchContext` to ensure correct dict shape
This is weird, but necessary, I apologize. Mashumaro handles the
dictification of this class via a compile time generated `to_dict`
method based off of the _typing_ of the class. By default `datetime`
types are converted to strings. We don't want that, we want them to
stay datetimes.
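A minimal sketch of the mashumaro hook involved, using a plain `DataClassDictMixin` and made-up fields rather than dbt's actual `BatchContext` definition:
```
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict

from mashumaro import DataClassDictMixin


@dataclass
class BatchContext(DataClassDictMixin):
    id: str
    event_time_start: datetime
    event_time_end: datetime

    def __post_serialize__(self, d: Dict[str, Any]) -> Dict[str, Any]:
        # The generated to_dict() stringifies datetimes; put the original
        # datetime objects back so consumers keep working with datetimes.
        d["event_time_start"] = self.event_time_start
        d["event_time_end"] = self.event_time_end
        return d
```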
* Update tests to check for `batch_context`
* Update `resolve_event_time_filter` to use new `batch_context`
* Stop testing for batchless compiled code for microbatch models
In 45daec72f4 we stopped an extra compilation
that was happening per batch prior to the batch_context being loaded. Stopping
this extra compilation means that compiled sql for the microbatch model without
the event time filter / batch context is no longer produced. We have discussed
this and _believe_ it is okay given that this is a new node type that has not
hit GA yet.
* Rename `ModelNode.batch_context` to `ModelNode.batch`
* Rename `build_batch_context` to `build_jinja_context_for_batch`
The name `build_batch_context` was confusing as
1) We have a `BatchContext` object, which the method was not building
2) The method builds the jinja context for the batch
As such it felt appropriate to rename the method to more accurately
communicate what it does.
* Rename test macro `invalid_batch_context_macro_sql` to `invalid_batch_jinja_context_macro_sql`
This rename was to make it more clear that the jinja context for a
batch was being checked, as a batch_context has a slightly different
connotation.
* Update changie doc
* Rename `batch_info` to `previous_batch_results`
* Exclude `previous_batch_results` from serialization of model node to avoid jinja context bloat
* Drop `previous_batch_results` key from `test_manifest.py` unit tests
In 4050e377ec we began excluding
`previous_batch_results` from the serialized representation of the
ModelNode. As such, we no longer need to check for it in `test_manifest.py`.
* Clean up changelog on main
* Bumping version to 1.10.0a1
* Code quality cleanup
* add 1.8,1.9 link
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Allow `dbt show` and `dbt compile` to output JSON without extra logs
* Add `quiet` attribute for ShowNode and CompiledNode messages
* Output of protoc compiler
* Utilize the `quiet` attribute for ShowNode and CompiledNode
* Reuse the `dbt list` approach when the `--quiet` flag is used
* Use PrintEvent to get to stdout even if the logger is set to ERROR
* Functional tests for quiet compile
* Functional tests for quiet show
* Fire event same way regardless if LOG_FORMAT is json or not
* Switch back to firing ShowNode and CompiledNode events
* Make `--inline-direct` to be quiet-compatible
* Temporarily change to dev branch for dbt-common
* Remove extraneous newline
* Functional test for `--quiet` for `--inline-direct` flag
* Update changelog entry
* Update `core_types_pb2.py`
* Restore the original branch in `dev-requirements.txt`
---------
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
This is needed for dbt-core + dbt-adapters to work properly in regards to
the microbatch project_flag/behavior flag `require_batched_execution_for_custom_microbatch_strategy`
* first pass: replace os env with project flag
* Fix `TestMicrobatchMultipleRetries` to not use `os.env`
* Turn off microbatch project flag for `TestMicrobatchCustomUserStrategyDefault` as it was prior to a9df50f
* Update `BaseMicrobatchTest` to turn on microbatch via project flags
* Add changie doc
* Fix functional tests after merging in main
* Add function that determines whether the new microbatch functionality should be used
The new microbatch functionality is, unfortunately, potentially dangerous. That is,
it adds a new materialization strategy `microbatch`, which an end user could have
defined as a custom strategy previously. Additionally, we added config keys to nodes,
and as `config` is just a Dict[str, Any], it could contain anything, meaning
people could already be using the configs we're adding for different purposes. Thus
we need some intelligent gating. Specifically, something that adheres to the following:
cms = Custom Microbatch Strategy
abms = Adapter Builtin Microbatch Strategy
bf = Behavior flag
umb = Use Microbatch Batching
t/f/e = True/False/Error
| cms | abms | bf | umb |
|-----|------|----|-----|
| t   | t    | t  | t   |
| f   | t    | t  | t   |
| t   | f    | t  | t   |
| f   | f    | t  | e   |
| t   | t    | f  | f   |
| f   | t    | f  | t   |
| t   | f    | f  | f   |
| f   | f    | f  | e   |
(The above table assumes that there is a microbatch model present in the project)
In order to achieve this we need to check that either the microbatch behavior
flag is set to true OR the microbatch materialization being used is the _root_ microbatch
materialization (i.e. not custom). The function we added in this commit,
`use_microbatch_batches`, does just that.
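The table above boils down to: batched execution whenever the behavior flag is on, or when the only `microbatch` materialization in play is the built-in one; error if no `microbatch` strategy exists at all. A standalone encoding of that table (not the actual `Manifest.use_microbatch_batches` implementation):
```
# Illustrative only: a standalone encoding of the gating table above.
def use_microbatch_batches(cms: bool, abms: bool, bf: bool) -> bool:
    # cms = custom microbatch strategy defined, abms = adapter/core builtin
    # strategy available, bf = behavior flag. Assumes a microbatch model
    # exists in the project.
    if not cms and not abms:
        # No 'microbatch' strategy anywhere: the "e" rows.
        raise ValueError("No 'microbatch' materialization strategy found")
    if bf:
        # Opting in via the behavior flag always means batched execution.
        return True
    # Without the flag, only the builtin (non-custom) strategy runs in batches.
    return abms and not cms
```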
* Gate microbatch functionality by `use_microbatch_batches` manifest function
* Rename microbatch behavior flag to `require_batched_execution_for_custom_microbatch_strategy`
* Extract logic of `find_macro_by_name` to `find_macro_candidate_by_name`
In 0349968c61 I had done this for the function
`find_materialization_macro_by_name`, but that wasn't the right function to
do it to, and will be reverted shortly. `find_materialization_macro_by_name`
is used for finding the general materialization macro, whereas `find_macro_by_name`
is more general. For the work we're doing, we need to find the microbatch
macro, which is not a materialization macro.
* Use `find_macro_candidate_by_name` to find the microbatch macro
* Fix microbatch macro locality check to search for `core` locality instead of `root`
Previously we were checking for a locality of `root`. However, a locality
of `root` means it was provided by a `package`. We want to check for a locality
of `core`, which basically means `builtin via dbt-core/adapters`. There is
another locality, `imported`, which I believe means it comes from another
package.
* Move the evaluation of `use_microbatch_batches` to the last position in boolean checks
The method `use_microbatch_batches` is always invoked to evaluate an `if`
statement. In most instances, it is part of a logic chain (i.e. there are
multiple things being evaluated in the `if` statement). In `if` statements
where there are multiple things being evaluated, `use_microbatch_batches`
should come _last_ (or as late as possible). This is because it is likely
the most costly thing to evaluate in the logic chain, and thus any short-circuiting
via other evaluations in the `if` statement failing (and thus avoiding
invoking `use_microbatch_batches`) is desirable.
* Drop behavior flag setting for BaseMicrobatchTest tests
* Rename 'env_var' to 'project_flag' in test_microbatch.py
* Update microbatch tests to assert when we are/aren't running with batches
* Update `test_resolve_event_time_filter` to use `use_microbatch_batches`
* Fire deprecation warning for custom microbatch macros
* Add microbatch deprecation events to test_events.py
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Add new `ArtifactWritten` event
* Emit ArtifactWritten event whenever an artifact is written
* Get artifact_type from class name for ArtifactWritten event
* Add changie docs
* Add test to check that ArtifactWritten events are being emitted
* Regenerate core_types_pb2.py using correct protobuf version
* Regen core_types_pb2 again, using a more correct protoc version
* Add unit tests to check how `safe_run_hooks` handles exceptions
* Improve exception handling in `get_execution_status`
Previously in `get_execution_status`, if a non-`DbtRuntimeError` exception was
raised, the `finally` would be entered, but the `status`/`message` would not be
set, and thus a "status not defined" exception would get raised on attempting
to return. Tangentially, there is another issue where somehow the `node_status`
is becoming `None`. In all my playing with `get_execution_status` I found that
trying to return an undefined variable in the `finally` caused an undefined
variable exception. However, if in some python version it instead just handed
back `None`, then this fix should also solve that.
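Conceptually, the hardening looks something like the following standalone sketch with generic status strings, not dbt's actual `get_execution_status`:
```
from typing import Callable, Tuple


def get_execution_status(run: Callable[[], str]) -> Tuple[str, str]:
    # Bind defaults up front so both names exist even if `run` raises
    # something other than the expected error type.
    status, message = "error", "unknown failure"
    try:
        message = run()
        status = "success"
    except RuntimeError as exc:   # stand-in for DbtRuntimeError
        message = str(exc)
    except Exception as exc:      # previously unhandled -> undefined-variable error
        message = f"unexpected error: {exc}"
    return status, message
```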
* Add changie docs
* Ensure run_results get written if KeyboardInterrupt happens during end run hooks
* Bump minimum dbt-adapters to 1.8.0
In https://github.com/dbt-labs/dbt-core/pull/10859 we started using the
`get_adapter_run_info` method provided by `dbt-adapters`. However that
function is only available in dbt-adapters >= 1.8.0. Thus 1.8.0 is our
new minimum for dbt-adapters.
* Add changie doc
* Add function to MicrobatchBuilder to get ceiling of timestamp by batch_size
* Update `MicrobatchBuilder.build_end_time` to use `ceiling_timestamp`
* fix TestMicrobatchBuilder.test_build_end_time by specifying a BatchSize + asserting actual is a ceiling timestamp
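The ceiling behaves roughly like the following standalone helper, assuming the hour/day/month/year batch sizes that microbatch models support (not the actual `MicrobatchBuilder` code):
```
from datetime import datetime, timedelta


def truncate(ts: datetime, batch_size: str) -> datetime:
    if batch_size == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if batch_size == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if batch_size == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


def ceiling_timestamp(ts: datetime, batch_size: str) -> datetime:
    # Already on a boundary: keep it. Otherwise round up to the next boundary.
    truncated = truncate(ts, batch_size)
    if truncated == ts:
        return ts
    if batch_size == "hour":
        return truncated + timedelta(hours=1)
    if batch_size == "day":
        return truncated + timedelta(days=1)
    if batch_size == "month":
        if truncated.month == 12:
            return truncated.replace(year=truncated.year + 1, month=1)
        return truncated.replace(month=truncated.month + 1)
    return truncated.replace(year=truncated.year + 1)
```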
* Add changie
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Stop validating that `--event-time-start` is before "current" time
In the next commit we'll be adding a validation that requires that `--event-time-start`
and `--event-time-end` are mutually required. That is, whenever one is specified,
the other is required. In that world, `--event-time-start` will never need to be compared
against the "current" time, because it'll never be run in conjunction with the "current"
time.
* Validate that `--event-time-start` and `--event-time-end` are mutually present
* Add changie doc for validation changes
* Alter functional microbatch tests to work with updated `event_time_start/end` reqs
We made it such that when `event_time_start` is specified, `event_time_end` must also
be specified (and vice versa). This broke numerous tests, in a few different ways:
1. There were tests that used `--event-time-start` without `--event-time-end` but
were using event_time_start essentially as the `begin` time for models being initially
built or full refreshed. These tests could simply drop the `--event-time-start` and
instead rely on the `begin` value.
2. There was a test that was trying to load a subset of the data _excluding_ some
data which would be captured by using `begin`. In this test we added an appropriate
`--event-time-end`, as the `--event-time-start` was necessary to satisfy what the
test was testing.
3. There was a test which was trying to ensure that two microbatch models would be
given the same "current" time. Because we wanted to ensure the "current" time code
path was used, we couldn't add `--event-time-end` to resolve the problem, thus we
needed to remove the `--event-time-start` that was being used. However, this led to
the test being incredibly slow. This was resolved by switching the relevant microbatch
models from having `batch_size`s of `day` to instead have `year`. This solution should
be good enough for roughly ~40 years? We'll figure out a better solution then, so see ya
in 2064. Assuming I haven't died before my 70th birthday, feel free to ping me to get
this taken care of.
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Add adapter telemetry to snowplow event.
* Temporary dev branch switch.
* Set tracking for overrideable adapter method.
* Do safer adapter ref.
* Improve comment.
* Code review comments.
* Don't call the asdict on a dict.
* Bump ci to pull in fix from base adapter.
* Add unit tests for coverage.
* Update field name from base adapter/schema change.
* remove breakpoint.
* Change `lookback` default from `0` to `1`
* Regen jsonschema manifest v12 to include `lookback` default change
* Regen saved state of v12 manifest for functional artifact testing
* Add changie doc for lookback default change
* Avoid a KeyError if `child_unique_id` is not found in the dictionary
* Changelog entry
* Functional test when an exposure references a deprecated model
dbt-adapters updated the incremental_strategy validation of incremental models such that
the validation now _always_ happens when an incremental model is executed. A test in dbt-core
`TestMicrobatchCustomUserStrategyEnvVarTrueInvalid` was previously set to _expect_ buggy behavior
where an incremental model would succeed on its "first"/"refresh" run even if it had an invalid
incremental strategy. Thus we needed to update this test in dbt-core to expect the now correct
behavior of incremental model execution time validation
* [Tidy-First]: Fix `timings` object for hooks and macros, and make types of timings explicit
* cast literal to str
* change test
* change jsonschema to enum
* Discard changes to schemas/dbt/manifest/v12.json
* nits
---------
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Add `order_by` and `limit` fields to saved queries.
* Update JSON schema
* Add change log for #10531.
* Check order by / limit in saved-query parsing test.
* Add test that checks microbatch models are all operating with the same `current_time`
* Set an `invocated_at` on the `RuntimeConfig` and plumb to `MicrobatchBuilder`
* Add changie doc
* Rename `invocated_at` to `invoked_at`
* Simplify conditional logic for setting MicrobatchBuilder.batch_current_time
* Rename `batch_current_time` to `default_end_time` for MicrobatchBuilder
* Begin testing that microbatch execution times are being tracked and set
* Begin tracking the execution time of batches for microbatch models
* Add changie doc
* Additional assertions in microbatch testing
* Validate that `event_time_start` is before `event_time_end` when passed from CLI
Sometimes CLI options have restrictions based on other CLI options. This is the case
for `--event-time-start` and `--event-time-end`. Unfortunately, click doesn't provide
a good way of validating this, at least not that I found. Additionally, I'm not sure
if we have had anything like this previously. In any case, I couldn't find a
centralized validation area for such occurrences. Thus I've gone and added one,
`validate_option_interactions`. Long term if more validations are added, we should
add this wrapper to each CLI command. For now I've only added it to the commands that
support `event_time_start` and `event_time_end`, specifically `build` and `run`.
* Add changie doc
* If `--event-time-end` is not specified, ensure `--event-time-start` is less than the current time
* Fixup error message about event_time_start and event_time_end
* Move logic to validate `event_time` cli flags to `flags.py`
* Update validation of `--event-time-start` against `datetime.now` to use UTC
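Taken together, the rules amount to something like this standalone sketch of the checks (not the actual `validate_option_interactions` wrapper; assumes tz-aware UTC datetimes):
```
from datetime import datetime, timezone
from typing import Optional


def validate_event_time(start: Optional[datetime], end: Optional[datetime]) -> None:
    if start is None:
        return
    if end is not None:
        if start >= end:
            raise ValueError("--event-time-start must be before --event-time-end")
    elif start >= datetime.now(timezone.utc):
        raise ValueError("--event-time-start must be before the current time")
```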
* When retrying microbatch models, propagate prior successful state
* Changie doc for microbatch dbt retry fixes
* Fix test_manifest unit tests for batch_info key
* Add functional test for when a microbatch model has multiple retries
* Add comment about when batch_info will be something other than None
* Add tests to check how microbatch models respect `full_refresh` model configs
* Fix `_is_incremental` to properly respect `full_refresh` model config
In dbt-core, it is generally expected that values passed via CLI flags take
precedence over model level configs. However, `full_refresh` on a model is an
exception to this rule, wherein the model config takes precedence. This
config exists specifically to _prevent_ accidental full refreshes of large
incremental models, as doing so can be costly. **_It is actually best
practice_** to set `full_refresh=False` on incremental models.
Prior to this commit, for microbatch models, the above was not happening. The
CLI flag `--full-refresh` was taking precedence over the model config
`full_refresh`. That meant that if `--full-refresh` was supplied, then the
microbatch model **_would full refresh_** even if `full_refresh=False` was
set on the model. This commit solves that problem.
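The precedence rule itself is tiny; here it is as a sketch with hypothetical argument names rather than dbt's internal config objects:
```
from typing import Optional


def should_full_refresh(cli_full_refresh: bool, model_full_refresh: Optional[bool]) -> bool:
    # The model-level config wins when it is set; this is what lets
    # full_refresh=False protect a large incremental/microbatch model from
    # an accidental `dbt run --full-refresh`.
    if model_full_refresh is not None:
        return model_full_refresh
    return cli_full_refresh
```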
* Add changie doc for microbatch `full_refresh` config handling
* Add `PartialSuccess` status type and use it for microbatch models with mixed results
* Handle `PartialSuccess` in `interpret_run_result`
* Add `BatchResults` object to `BaseResult` and begin tracking during microbatch runs
* Ensure batch_results being propagated to `run_results` artifact
* Move `batch_results` from `BaseResult` class to `RunResult` class
* Move `BatchResults` and `BatchType` to a separate artifacts file to avoid circular imports
In our next commit we're gonna modify `dbt/contracts/graph/nodes.py` to import the
`BatchType` as part of our work to implement dbt retry for microbatch model nodes.
Unfortunately, the import in `nodes.py` creates a circular dependency because
`dbt/artifacts/schemas/results.py` imports from `nodes.py` and `dbt/artifacts/schemas/run/v5/run.py`
imports from that `results.py`. Thus the new import creates a circular import. Now this
_shouldn't_ be necessary as nothing in artifacts should import from the rest of dbt-core.
However, we do. We should fix this, but this is also out of scope for this segment of work.
* Add `PartialSuccess` as a retry-able status, and use batches to retry microbatch models
* Fix BatchType type so that the first datetime is no longer Optional
* Ensure `PartialSuccess` causes skipping of downstream nodes
* Alter `PartialSuccess` status to be considered an error in `interpret_run_result`
* Update schemas and test artifacts to include new batch_results run results key
* Add functional test to check that 'dbt retry' retries 'PartialSuccess' models
* Update partition failure test to assert downstream models are skipped
* Improve `success`/`error`/`partial success` messaging for microbatch models
* Include `PartialSuccess` in status that `--fail-fast` counts as a failure
* Update `LogModelResult` to handle partial successes
* Update `EndOfRunSummary` to handle partial successes
* Cleanup TODO comment
* Raise a DbtInternalError if we get a batch run result without `batch_results`
* When running a microbatch model with supplied batches, force non full-refresh behavior
This is necessary because of retry. Say on the initial run the microbatch model
succeeds on 97% of its batches. Then on retry it does the last 3%. If the retry
of the microbatch model executes in full refresh mode it _might_ blow away the
97% of work that has been done. This edge case seems to be adapter specific.
* Only pass batches to retry for microbatch model when there was a PartialSuccess
In the previous commit we made it so that retries of microbatch models wouldn't
run in full refresh mode when the microbatch model to retry has batches already
specified from the prior run. This is only problematic when the run being retried
was a full refresh AND all the batches for a given microbatch model failed. In
that case WE DO want to do a full refresh for the given microbatch model. To better
outline the problem, consider the following:
* a microbatch model had a begin of `2020-01-01` and has been running this way for awhile
* the begin config has changed to `2024-01-01` and dbt run --full-refresh gets run
* every batch for the microbatch model fails
* on dbt retry the relation is said to exist, and the now out-of-range data (2020-01-01 through 2023-12-31) is never purged
To avoid this, all we have to do is ONLY pass the batch information for partially successful microbatch
models. Note: microbatch models only have a partially successful status IFF they have both
successful and failed batches.
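In rough terms (hypothetical names, not the actual retry code):
```
# Only seed a retry with prior batch state for partial successes; a fully
# failed --full-refresh run should rerun from scratch so stale data is purged.
def batches_to_retry(prior_status: str, prior_batch_results):
    if prior_status == "partial success":
        return prior_batch_results
    return None
```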
* Fix test_manifest unit tests to know about model 'batches' key
* Add some console output assertions to microbatch functional tests
* add batch_results: None to expected_run_results
* Add changie doc for microbatch retry functionality
* maintain protoc version 5.26.1
* Cleanup extraneous comment in LogModelResult
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Test case for `merge_exclude_columns`
* Update expected output for `merge_exclude_columns`
* Skip TestMergeExcludeColumns test
* Enable this test since PostgreSQL 15+ is available in CI now
* Undo modification to expected output
* Remove duplicated constructor for `ResourceTypeSelector`
* Add type annotation for `ResourceTypeSelector`
* Standardize on constructor for `ResourceTypeSelector` where `include_empty_nodes=True`
* Changelog entry
* Adding logic to TestSelector to remove unit tests if they are in excluded_resource_types
* Adding change log
* Respect `--resource-type` and `--exclude-resource-type` CLI flags and associated environment variables
* Test CLI flag for excluding unit tests for the `dbt test` command
* Satisfy isort pre-commit hook
* Fix mypy for positional argument "resource_types" in call to "TestSelector"
* Replace `TestSelector` with `ResourceTypeSelector`
* Add co-author
* Update changelog description
* Add functional tests for new feature
* Compare the actual results, not just the count
* Remove test case covered elsewhere
* Test for `DBT_EXCLUDE_RESOURCE_TYPES` environment variable for `dbt test`
* Update per pre-commit hook
* Restore to original form (until we refactor extraneous `ResourceTypeSelector` references later)
---------
Co-authored-by: Matthew Cooper <asimov.1st@gmail.com>
* initial rough-in with CLI flags
* dbt-adapters testing against event-time-ref-filtering
* fix TestList
* Checkpoint
* fix tests
* add event_time_start params to build
* rename configs
* Gate resolve_event_time_filter via micro batch strategy and fix strptime usage
* Add unit test for resolve_event_time_filter
* Additional unit tests for `resolve_event_time_filter` to ensure lookback + batch_size work
* Remove extraneous comments and print statements from resolve_event_time_filter
* Fixup microbatch functional tests to use microbatch strategy
* Gate microbatch functionality behind env_var while in beta
* Add comment about how _is_incremental should be removed
* Improve `event_time_start/end` cli parameters to auto convert to datetime objects
* for testing: dbt-postgres 'microbatch' strategy
* rough in: chunked backfills
* partial failure of microbatch runs
* decouple run result methods
* initial refactor
* rename configs to __dbt_internal
* update compiled_code in context after re-compilation
* finish rename of context vars
* changelog entry
* fix patch_microbatch_end_time
* refactor into MicrobatchBuilder
* fix provider unit tests + add unit tests for MicrobatchBuilder
* add TestMicrobatchJinjaContextVarsAvailable
* unit test offset + truncate timestamp methods
* Remove pairing.md file
* Add tying to microbatch specific functions added in `task/run.py`
* Add doc strings to microbatch.py functions and classes
* Set microbatch node status to `ERROR` if all batches for node failed
* Fire an event for batch exceptions instead of directly printing
* Fix firing of failed microbatch log event
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Update functional tests to cover this case
* Revert "Update functional tests to cover this case"
This reverts commit 4c78e816f6.
* New functional tests to cover the resource_type config
* Separate data tests from unit tests for `resource_types` config of `dbt list` and `dbt build`
* Changelog entry
* Add functional tests for custom incremental strategies named 'microbatch'
* Point dev-requirement of `dbt-adapters` back to the main branch
The associated branch/PR in `dbt-adapters` that we were previously
pointing to has been merged. Thus we can point back to `main` again.
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* made the class that changes directory a context manager.
* add change log
* fix conflict
* made the base class a context manager
* add assertion
* Remove index.html
* add it test to testDbtRunner
* fix deps args order
* fix test
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* use full manifest in adapter instead of macro_manifest
* Add test case
* Add changelog entry
* Remove commented code.
---------
Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
* Tests for calling a macro in a pre- or post-hook config in properties.yml
* Late render pre- and post-hooks configs in properties / schema YAML files
* Changelog entry
* Use alias instead of name when adding ephemeral model prefixes
* Adjust TestCustomSchemaWithCustomMacroFromModelName to test ephemeral models
* Add changelog entry for ephemeral model CTE identifier fix
* Reference model.identifier and model.name where appropriate to resolve typing errors
* Move test for ephemeral model with alias to dedicated test in test_compile.py
* update children search
* update search to include children in original selector
* add changie
* remove unused function
* fix wrong function call
* fix depth
* sketch
* Bring back the happy path fixture snapshot file
The commit c783a86 removed the snapshot file from the happy path fixture.
This was done because the snapshot was breaking the tests we were adding,
`test_run_commands`. However this broke `test_ls` in `test_list.py`. In order
to move forward, we need everything to be working. Maybe the idea was to delete
the `test_list.py` file, however that is not noted anywhere, and was not done.
Thus this commit ensures that neither that test nor our new tests are broken.
* Create conftest for `functional` tests so that happy path fixtures are accessible
* Format `test_commands.py` and update imports to appease pre-commit hooks
* Parametrize `test_run_command` to make it easier to see which command is failing (if any)
* Update the setup for `TestRunCommands.test_run_command` to be more formulaic
* Add test to ensure resource types are selectable
* Fix docstring formatting in TestRunCommands
* Fixup documentation for test_commands.py
---------
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Add breakpoint.
* Move breakpoint.
* Add fix
* Add changelog.
* Avoid sorting for the string case.
* Add unit test.
* Fix test.
* add good unit tests for coverage of sort method.
* add sql format coverage.
* Modify behavior to log a warning and proceed.
* code review comments.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Fix exclusive_primary_alt_value_setting to set warn_error_options correctly
* Add test
* Changie
* Fix unit test
* Replace conversion method
* Refactor normalize_warn_error_options
* Add changelog.
* Avoid sorting for the string case.
* add good unit tests for coverage of sort method.
* add sql format coverage.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Correct `isort` configuration to include dbt-semantic-interfaces as internal
We thought we were already doing this. However, we accidentally missed the last
`s` of `dbt-semantic-interfaces`, so imports from dbt-semantic-interfaces were not
being identified as an internal package by isort. This fixes that.
* Run isort using updated configs to mark `dbt-semantic-interfaces` as included
* Fix `test_can_silence` tests in `test_warn_error_options.py` to ensure silencing
We're fixing an issue wherein `silence` specifications in the `dbt_project.yaml`
weren't being respected. This was odd since we had tests specifically for this.
It turns out the tests were broken. Essentially the warning was instead being raised
as an error due to `include: 'all'`. Then because it was being raised as an error,
the event wasn't going through the logger. We were only asserting in these tests that
the silenced event wasn't going through the logger (which it wasn't) so everything
"appeared" to be working. Unfortunately everything wasn't actually working. This is now
highlighted because `test_warn_error_options::TestWarnErrorOptionsFromProject:test_can_silence`
is now failing with this commit.
* Fix setting `warn_error_options` via `dbt_project.yaml` flags.
Back when I did the work for #10058 (specifically c52d6531) I thought that
the `warn_error_options` would automatically be converted from the yaml
to the `WarnErrorOptions` object as we were building the `ProjectFlags` object,
which holds `warn_error_options`, via `ProjectFlags.from_dict`. And I thought
this was validated by the `test_can_silence` test added in c52d6531. However,
there were two problems:
1. The definition of `warn_error_options` on `ProjectFlags` is a dict, not a
`WarnErrorOptions` object
2. The `test_can_silence` test was broken, and not testing what I thought
The quick fix (this commit) is to ensure `silence` is passed to `WarnErrorOptions`
instantiation in `dbt.cli.flags.convert_config`. The better fix would be to make
the `warn_error_options` of `ProjectFlags` a `WarnErrorOptions` object instead of
a dict. However, to do this we first need to update dbt-common's `WarnErrorOptions`
definition to default `include` to an empty list. Doing so would allow us to get rid
of `convert_config` entirely.
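The quick fix is roughly the following shape, assuming `WarnErrorOptions` from dbt_common >= 1.0.2 accepts include/exclude/silence keyword arguments:
```
from dbt_common.helper_types import WarnErrorOptions


def convert_warn_error_options(value: dict) -> WarnErrorOptions:
    # Previously `silence` was dropped here, so project-level silencing was ignored.
    return WarnErrorOptions(
        include=value.get("include", []),
        exclude=value.get("exclude", []),
        silence=value.get("silence", []),
    )
```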
* Add unit test for `ModelRunner.print_result_line`
* Add (and skip) unit test for `ModelRunner.execute`
An attempt at testing `ModelRunner.execute`. We should probably also be
asserting that the model has been executed. However before we could get there,
we're running into runtime errors during `ModelRunner.execute`. Currently the
struggle is ensuring the adapter exists in the global factory when `execute`
goes looking for it. The error we're getting looks like the following:
```
def test_execute(self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner) -> None:
> model_runner.execute(model=table_model, manifest=manifest)
tests/unit/task/test_run.py:121:
----
core/dbt/task/run.py:259: in execute
context = generate_runtime_model_context(model, self.config, manifest)
core/dbt/context/providers.py:1636: in generate_runtime_model_context
ctx = ModelContext(model, config, manifest, RuntimeProvider(), None)
core/dbt/context/providers.py:834: in __init__
self.adapter = get_adapter(self.config)
venv/lib/python3.10/site-packages/dbt/adapters/factory.py:207: in get_adapter
return FACTORY.lookup_adapter(config.credentials.type)
---`
self = <dbt.adapters.factory.AdapterContainer object at 0x106e73280>, adapter_name = 'postgres'
def lookup_adapter(self, adapter_name: str) -> Adapter:
> return self.adapters[adapter_name]
E KeyError: 'postgres'
venv/lib/python3.10/site-packages/dbt/adapters/factory.py:132: KeyError
```
* Add `postgres_adapter` fixture for use in `TestModelRunner`
Previously we were running into an issue where, during `ModelRunner.execute`,
the mock_adapter we were using wouldn't be found in the global adapter
factory. We've gotten past this error by supplying a "real" adapter, a
`PostgresAdapter` instance. However we're now running into a new error
in which the materialization macro can't be found. This error looks like
```
model_runner = <dbt.task.run.ModelRunner object at 0x106746650>
def test_execute(
self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
> model_runner.execute(model=table_model, manifest=manifest)
tests/unit/task/test_run.py:129:
----
self = <dbt.task.run.ModelRunner object at 0x106746650>
model = ModelNode(database='dbt', schema='dbt_schema', name='table_model', resource_type=<NodeType.Model: 'model'>, package_na...ected'>, constraints=[], version=None, latest_version=None, deprecation_date=None, defer_relation=None, primary_key=[])
manifest = Manifest(nodes={'seed.pkg.seed': SeedNode(database='dbt', schema='dbt_schema', name='seed', resource_type=<NodeType.Se...s(show=True, node_color=None), patch_path=None, arguments=[], created_at=1718229810.21914, supported_languages=None)}})
def execute(self, model, manifest):
context = generate_runtime_model_context(model, self.config, manifest)
materialization_macro = manifest.find_materialization_macro_by_name(
self.config.project_name, model.get_materialization(), self.adapter.type()
)
if materialization_macro is None:
> raise MissingMaterializationError(
materialization=model.get_materialization(), adapter_type=self.adapter.type()
)
E dbt.adapters.exceptions.compilation.MissingMaterializationError: Compilation Error
E No materialization 'table' was found for adapter postgres! (searched types 'default' and 'postgres')
core/dbt/task/run.py:266: MissingMaterializationError
```
* Add spoofed macro fixture `materialization_table_default` for `test_execute` test
Previously the `TestModelRunner:test_execute` test was running into a runtime error
due to the `materialization_table_default` macro not existing in the project. This
commit adds that macro to the project (though it should ideally get loaded via interactions
between the manifest and adapter). Manually adding it resolved our previous issue, but created
a new one. The macro appears to not be properly loaded into the manifest, and thus isn't
discoverable later on when getting the macros for the jinja context. This leads to an error
that looks like the following:
```
model_runner = <dbt.task.run.ModelRunner object at 0x1080a4f70>
def test_execute(
self, table_model: ModelNode, manifest: Manifest, model_runner: ModelRunner
) -> None:
> model_runner.execute(model=table_model, manifest=manifest)
tests/unit/task/test_run.py:129:
----
core/dbt/task/run.py:287: in execute
result = MacroGenerator(
core/dbt/clients/jinja.py:82: in __call__
return self.call_macro(*args, **kwargs)
venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:294: in call_macro
macro = self.get_macro()
---
self = <dbt.clients.jinja.MacroGenerator object at 0x1080f3130>
def get_macro(self):
name = self.get_name()
template = self.get_template()
# make the module. previously we set both vars and local, but that's
# redundant: They both end up in the same place
# make_module is in jinja2.environment. It returns a TemplateModule
module = template.make_module(vars=self.context, shared=False)
> macro = module.__dict__[get_dbt_macro_name(name)]
E KeyError: 'dbt_macro__materialization_table_default'
venv/lib/python3.10/site-packages/dbt_common/clients/jinja.py:277: KeyError
```
It's becoming apparent that we need to find a better way to either mock or legitimately
load the default and adapter macros. At this point I think I've exhausted the time box
I should be using to figure out if testing the `ModelRunner` class is possible currently,
with the result being more work has yet to be done.
* Begin adding the `LogModelResult` event catcher to event manager class fixture
* init push for issue 10198
* add changelog
* add unit tests based on michelle example
* add data_tests, and post_hook unit tests
* pull creating macro_func out of try call
* revert last commit
* pull macro_func definition back out of try
* update code formatting
* Add basic semantic layer fixture nodes to unit test `manifest` fixture
We're doing this in preparation for a unit test which will be testing
these nodes (as well as others) and thus we want them in the manifest.
* Add `WhereFilterIntersection` to `QueryParams` of `saved_query` fixture
In the previous commit, 58990aa450, we added
the `saved_query` fixture to the `manifest` fixture. This broke the test
`tests/unit/parser/test_manifest.py::TestPartialParse::test_partial_parse_by_version`.
It broke because `Manifest.deepcopy` basically dictifies things. When we were
dictifying the `QueryParams` of the `saved_query` before, the `where` key was getting
dropped because it was `None`. We'd then run into a runtime error on instantiation of the
`QueryParams` because although `where` is declared as _optional_, we don't set a default
value for it. And thus, kaboom :(
We should probably provide a default value for `where`. However that is out of scope
for this branch of work.
* Fix `test_select_fqn` to account for newly included semantic model
In 58990aa450 we added a semantic model
to the `manifest` fixture. This broke the test
`tests/unit/graph/test_selector_methods.py::test_select_fqn` because in
the test it selects nodes based on the string `*.*.*_model`. The newly
added semantic model matches this, and thus needed to be added to the
expected results.
* Add unit tests for `_warn_for_unused_resource_config_paths` method
Note: At this point the test fails when run for a `unit_test` config
that should be considered used. This is because it is not being
properly identified as used.
* Include `unit_tests` in `Manifest.get_resource_fqns`
Because `unit_tests` weren't being included in calls to `Manifest.get_resource_fqns`,
it always appeared to `_warn_for_unused_resource_config_paths` that there were no
unit tests in the manifest. Because of this `_warn_for_unused_resource_config_paths` thought
that _any_ `unit_test` config in `dbt_project.yaml` was unused.
* Rename `manifest` fixture in `test_selector` to `mock_manifest`
We have a globally available `manifest` fixture in our unit tests. In the
coming commits we're going to add tests to the file which use the globally
available `manifest` fixture. Prior to this commit, the locally defined
`manifest` fixture was taking precedence. To get around this, the easiest
solution was to rename the locally defined fixture.
I had tried to isolate the locally defined fixture by moving it, and the relevant
tests to a test class like `TestNodeSelector`. However because of _how_ the relevant
tests were parameterized, this proved difficult. Basically for readability we define
a variable which holds a list of all the parameterization variables. By moving to a
test class, the definition of the variables would have had to be defined directly in
the parameterization macro call. Although possible, it made the readability slightly
worse. It might be worth doing anyway in the long run, but instead I used a less
heavy-handed alternative (already described).
* Improve type hinting in `tests/unit/utils/manifest.py`
* Ensure `args` get set from global flags for `runtime_config` fixture in unit tests
The `Compiler.compile` method accesses `self.config.args.which`. The `config`
is the `RuntimeConfig` the `Compiler` was instantiated with. Our `runtime_config`
fixture was being instantiated with an empty dict for the `args` property. Thus
a `which` property of the args wasn't being made available, and if `compile` was run
a runtime error would occur. To solve this, we've begun instantiating the args from
the global flags via `get_flags()`. This works because we ensure the `set_test_flags`
fixture is run first which calls `set_from_args`.
* Create a `make_manifest` utility function for use in unit tests and fixture creation
* Refactor `Compiler` and `NodeSelector` tests in `test_selector.py` to use pytesting methodology
* Remove parsing tests that exist in `test_selector.py`
We had some tests in `test_selector.py::GraphTest` that didn't add
anything on top of what was already being tested elsewhere in the file
except the parsing of models. However, the tests in `test_parser.py::ModelParserTest`
cover everything being tested here (and then some). Thus these tests
in `test_selector.py::GraphTest` are unnecessary and can be deleted.
* Move `test__partial_parse` from `test_selector.py` to `test_manifest.py`
There was a test `test__partial_parse` in `test_selector.py` which tested
the functionality of `is_partial_parsable` of the `ManifestLoader`. This
doesn't really make sense to exist in `test_selector.py` where we are
testing selectors. We test the `ManifestLoader` class in `test_manifest.py`
which seemed like a more appropriate place for the test. Additionally we
renamed the test to `test_is_partial_parsable_by_version` to more accurately
describe what is being tested.
* Make `test_selector`'s manifest fixture name even more specific
* Add type hint to `expected_nodes` in `test_selector.py` tests
In the test `tests/unit/graph/test_selector.py::TestCompiler::test_two_models_simple_ref`
we have a variable `expected_nodes` that we are setting via a list comprehension.
At a glance it isn't immediately obvious what `expected_nodes` actually is. It's a
list, but a list of what? One suggestion was to explicitly write out the list of strings.
However, I worry about the brittleness of doing so. That might be the way we head long
term, but as a compromise for now, I've added a type hint to the variable definition.
* Fire skipped events at debug level
Closes https://github.com/dbt-labs/dbt-core/issues/8774
* add changelog entry
* Update to work with 1.9.*.
* Add tests for --fail-fast not showing skip messages unless --debug.
* Update test that works by itself, but assumes too much to work in integration tests.
---------
Co-authored-by: Scott Gigante <scottgigante@users.noreply.github.com>
* init push arbitrary configs for generic tests pr
* iterative work
* initial test design attempts
* test reformatting
* test rework, have basic structure for 3 of 4 passing, need to figure out how to best represent same key error, failing correctly though
* swap up test formats for new config dict and mixed variety; run dbt parse and inspect the manifest
* modify tests to get passing, then modify the TestBuilder class work from earlier to be more dry
* add changelog
* modify code to match suggested changes around separate methods and test id fix
* add column_name reference to init so deeper nested _render_values can use the input
* add type annotations
* feedback based on mike review
* Create `runtime_config` fixture and necessary upstream fixtures
* Check for better scoped `ProjectContractError` in test_runtime tests
Previously in `test_unsupported_version_extra_config` and
`test_archive_not_allowed` we were checking for `DbtProjectError`. This
worked because `ProjectContractError` is a subclass of `DbtProjectError`.
However, if we check for `DbtProjectError` in these tests, then some tangential
failure which raises a `DbtProjectError` type error would go undetected. As
we plan on modifying these tests to be pytest in the coming commits, we want to
ensure that the tests are succeeding for the right reason.
* Convert `test_str` of `TestRuntimeConfig` to a pytest test using fixtures
* Convert `test_from_parts` of `TestRuntimeConfig` to a pytest test using fixtures
While converting `test_from_parts` I noticed the comment
> TODO(jeb): Adapters must assert that quoting is populated?
This led me to believe that `quoting` shouldn't be "fully" realized
in our project fixture unless we're saying that it's gone through
adapter instantiation. Thus I updated the `quoting` on our project
fixture to be an empty dict. This change affected `test__str__` in
`test_project.py` which we thus needed to update accordingly.
* Convert runtime version specifier tests to pytest tests and move to test_project
We've done two things in this commit, which arguably _should_ have been done in
two commits. First we moved the version specifier tests from `test_runtime.py::TestRuntimeConfig`
to `test_project.py::TestGetRequiredVersion` this is because what is really being
tested is the `_get_required_version` method. Doing it via `RuntimeConfig.from_parts` method
made actually testing it a lot harder as it requires setting up more of the world and
running with a _full_ project config dict.
The second thing we did was convert it from the old unittest implementation to a pytest
implementation. This saves us from having to create most of the world as we were doing
previously in these tests.
Of note, I did not move the test `test_unsupported_version_range_bad_config`. This test
is a bit different from the rest of the version specifier tests. It was introduced in
[1eb5857811](1eb5857811)
of [#2726](https://github.com/dbt-labs/dbt-core/pull/2726) to resolve [#2638](https://github.com/dbt-labs/dbt-core/issues/2638).
The focus of #2726 was to ensure the version specifier checks were run _before_ the validation
of the `dbt_project.yml`. Thus what this test is actually testing for is order of
operations at parse time. As such, this is really more a _functional_ test than a
unit test. In the next commit we'll get this test moved (and renamed)
* Create a better test for checking that version checks come before project schema validation
* Convert `test_get_metadata` to pytest test
* Refactor `test_archive_not_allowed` to functional test
We do already have tests that ensure "extra" keys aren't allowed in
the dbt_project.yaml. This test is different because it's checking that
a specific key, `archive`, isn't allowed. We do this because at one point
in time `archive` _was_ an allowed key. Specifically, we stopped allowing
`archive` in dbt-core 0.15.0 via commit [f26948dd](f26948dde2).
Given that it's been 5 years and a major version, we could probably remove
this test, but let's keep it around unless we start supporting `archive` again.
* Convert `warn_for_unused_resource_config_paths` tests to use pytest
* Add fixtures for setting and resetting flags for unit tests
* Remove unnecessary `set_from_args` in non `unittest.TestCase` based unit tests
In the previous commit we added a pytest fixture which sets and tears down
the global flags arg via `set_from_args` for every pytest based unit test.
Previously we had added a `set_from_args` in tests or test files to reset
the global flags from if they were modified by a previous test. This is no
longer necessary because of the work done in the previous commit.
Note: We did not modify any tests that use the `unittest.TestCase` class
because they don't use pytest fixtures. Thus those tests need to continue
operating as they currently do until we shift them to pytest unit tests.
* Utilize the new `args_for_flags` fixture for setting of flags in `test_contracts_graph_parsed.py`
* Convert `test_compilation.py` from `TestCase` tests to pytest tests
We did this so that in the next commit we can drop the unnecessary `set_from_args`.
That will be its own commit because converting these
tests is a restructuring, and doing it separately makes things easier to follow.
That is to say, all changes in this commit were just to convert the tests to
pytest; no other changes were made.
* Drop unnecessary `set_from_args` in `test_compilation.py`
* Add return types to all methods in `test_compilation.py`
* Reduce imports from `compilation` in `test_compilation.py`
* Update `test_logging.py` now that we don't need to worry about global flags
* Conditionally import `Generator` type for python 3.8
In python 3.9 `Generator` was moved to `collections.abc` and deprecated
in `typing`. We still support 3.8 and thus need to be conditionally
importing `Generator`. We should remove this in the future when we drop
support for 3.8.
* Add more accurate RSS high water mark measurement for Linux
* Add changelog entry.
* Checks to avoid exception based flow control, per review.
* Fix review nit.
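For reference, one common way to read a Linux RSS high-water mark looks like this (illustrative only, POSIX-only, and not necessarily the exact mechanism used here):
```
import platform
import resource  # POSIX-only


def peak_rss_bytes() -> int:
    if platform.system() == "Linux":
        # /proc/self/status reports VmHWM (peak resident set size) in kB.
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmHWM:"):
                    return int(line.split()[1]) * 1024
    # Fallback: ru_maxrss is kB on Linux, bytes on macOS.
    usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return usage * 1024 if platform.system() == "Linux" else usage
```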
* Add unit test to assert `setup_config_logger` clears the event manager state
* Move `setup_event_logger` tests from `test_functions.py` to `test_logging.py`
* Move `EventCatcher` to `tests.utils` for use in unit and functional tests
* Update fixture mocking global event manager to instead clear it
Previously we had started _mocking_ the global event manager. We did this
because we thought that meant anything we did to the global event manager,
including modifying it via things like `setup_event_logger`, would be
automatically cleaned up at the end of any test using the fixture because
the mock would go away. However, this understanding of fixtures and mocking
was incorrect, and the global event manager wouldn't be properly isolated/reset.
Thus we changed the fixture to instead clean up the global event manager before
any test that uses it, and by using `autouse=True` in the fixture definition
we made it so that every unit test uses the fixture.
Note this will no longer be viable if we ever multi-thread our unit testing, as
the event manager isn't actually isolated, and thus two tests could both modify
the event manager at the same time.
* Add test for different `write_perf_info` values to `get_full_manifest`
* Add test for different `reset` values to `get_full_manifest`
* Abstract required mocks for `get_full_manifest` tests to reduce duplication
There are a set of required mocks that `get_full_manifest` unit tests need.
Instead of doing these mocks in each test, we've abstracted these mocks into
a reusable function. I did try to do this as a fixture, but for some reason
the mocks didn't actually propagate when I did that.
* Add test for different `PARTIAL_PARSE_FILE_DIFF` values to `get_full_manifest`
* Refactor mock fixtures in `test_manifest.py` to make them more widely available
* Convert `set_required_mocks` of `TestGetFullManifest` into a fixture
This wasn't working before, but it does now. Not sure why.
This was done by running `pre-commit run --all`. That this was needed
is a temporary glitch in how our `Tests and Code Checks` github action
works on PRs. Basically we added `isort` to the pre-commit hooks recently, and
this does additional linting/formatting on our imports.
People reasonably have branches which were started prior to `isort` being
part of the pre-commit hooks on main. Thus, unless those branches get caught
up to main, the github action on associated PRs won't run `isort` because
it doesn't exist on those branches. Once everyone gets their local `main`
branch updated (I suspect this might take a few days) this problem will go
away.
* Add `isort` as a dev-req and pre-commit hook
The tool `isort` reorders imports to be in alphabetical order. I've
added it because our imports in most files are in random order. The lack
of order meant that:
- sometimes the same module would be imported from twice
- figuring out if a module was already being imported from took
longer
In the next commit I'll actually run isort to order everything. The best
part is that when developing, we don't have to put them in correct order.
Though you can if you want. However, `isort` will take care of re-ordering
things at commit time :)
* Improve isort functionality by setting initial `.isort.cfg`
Specifically we set two config values: `extend_skip_glob` and `known_first_party`.
The `extend_skip_glob` extends the default skipped paths. The defaults can be seen
here https://pycqa.github.io/isort/docs/configuration/options.html#skip. We are skipping
third party stubs because these are more so provided (I believe). We are skipping
`.github` and `scripts` as they feel out of scope and things we can be less strict with.
The `known_first_party` setting makes it so that these imports get grouped separately from
all other imports, which is useful visually of "this comes from us" vs "this comes from
someone/somewhere else".
* Add profile `black` to isort config
I was seeing some odd behavior where running pre-commit, adding the modified
files, and then running pre-commit again would result in making more modifications
to some of the same files. This felt odd. You shouldn't have to run pre-commit
multiple times for it to eventually come to a final "solution". I believe
the problem was that we are using the tool `black` to format things, but weren't
registering the black profile with `isort`. This led to some conflicting formatting
rules, and the two tools had to negotiate a few times before both were satisfied.
Registering the profile `black` with `isort` resolved this problem.
* Reorder, merge-duplicate, and format module imports using `isort`
This was done by running `pre-commit run --all`. I ran it separately from
the commit process itself because I wanted to run it against all files
instead of only changed files.
Of note, this not only reordered and formatted our imports. But we also
had 60 distinct duplicate module import paths across 50 files, which this
took care of. When I say "distinct duplicate module import paths" I mean
when `from x.y.z import` was imported more than once in a single file.
* add support for explicit nulls for loaded_at_field
* add test
* changelog
* add parsing for tests
* centralize logic a bit
* account for sources being None
* fix bug
* remove new field from SourceDefinition
* add validation for empty string, more tests
* Move deferral from task to manifest loading + RefResolver
* dbt clone must specify --defer
* Fix deferral for unit test type detection
* Add changelog
* Move merge_from_artifact from end of parsing back to task before_run to reduce scope of refactor
* PR review. DeferRelation conforms to RelationConfig protocol
* Add test case for #10017
* Update manifest v12 in test_previous_version_state
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Change agate upper bound to v1.10
* Add changelog.
* update lower pin
* for testing
* put back dev requirement
* move the lower pin back to 1.7
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Refactor test class `EventCatcher` into utils to make accessible to other tests
* Raise minimum version of dbt_common to 1.0.2
We're going to start depending on `silence` existing as a attr of
`WarnErrorOptions`. The `silence` attr was only added in dbt_common
1.0.2, thus that is our new minimum.
* Add ability to silence warnings from `warn_error_options` CLI arguments
* Add `flush` to `EventCatcher` test util, and use in `test_warn_error_options`
* Add tests to `TestWarnErrorOptionsFromCLI` for `include` and `exclude`
* Test support for setting `silence` of `warn_error_options` in `dbt_project` flags
Support for `silence` was _automatically_ added when we upgraded to dbt_common 1.0.2.
This is because we build the project flags in a `.from_dict` style, which is cool. In
this case it was _automatically_ handled in `read_project_flags` in `project.py`. More
specifically here bcbde3ac42/core/dbt/config/project.py (L839)
* Add tests to `TestWarnErrorOptionsFromProject` for `include` and `exclude`
Typically we can't have multiple tests in the same `test class` if they're
utilizing/modifying file system fixtures. That is because the file system
fixtures are scoped to test classes, so they don't reset in between tests within
the same test class. This problem _was_ affecting these tests as they modify the
`dbt_project.yml` file which is set by a class based fixture. To get around this,
because I find it annoying to create multiple test classes when the tests really
should be grouped, I created a "function" scoped fixture to reset the `dbt_project.yml`.
* Update `warn_error_options` in CLI args to support `error` and `warn` options
Setting `error` is the same as setting `include`, but only one can be specified.
Setting `warn` is the same as setting `exclude`, but only one can be specified.
* Update `warn_error_options` in Project flags to support `error` and `warn` options
As part of this I refactored `exclusive_primary_alt_value_setting` into an upstream
location `/config/utils`. That is because importing it in `/config/project.py` from
`cli/option_types.py` caused some circular dependency issues.
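The alias handling itself is small; roughly like this standalone sketch, with a plain ValueError in place of the dbt error type:
```
from typing import Any, Dict, Optional


def exclusive_primary_alt_value_setting(
    dictionary: Optional[Dict[str, Any]], primary: str, alt: str
) -> None:
    # Rewrite the alternate key ('error'/'warn') onto the primary key
    # ('include'/'exclude'); specifying both is an error.
    if dictionary is None:
        return
    if primary in dictionary and alt in dictionary:
        raise ValueError(f"Only `{primary}` or `{alt}` can be specified, not both")
    if alt in dictionary:
        dictionary[primary] = dictionary.pop(alt)
```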
* Use `DbtExclusivePropertyUseError` in `exclusive_primary_alt_value_setting` instead of `DbtConfigError`
Using `DbtConfigError` seemed reasonable. However, in order to make sure the error
got raised in `read_project_flags`, we had to mark `DbtConfigError`s to be
re-raised. This had the unintended consequence of re-raising a smattering of errors
which were previously being swallowed. I'd argue that if those are errors we're
swallowing, the functions that raise those errors should perhaps be modified to
conditionally not raise them, but that's not the world we live in and is out of
scope for this branch of work. Thus instead we've created an error specific to
`WarnErrorOptions` issues, which we now use and catch for re-raising.
* Add unit tests for `exclusive_primary_alt_value_setting` method
I debated about parametrizing these tests, and it can be done. However,
I found that the resulting code ended up being about the same number of
lines and slightly less readable (in my opinion). Given the simplicity of
these tests, I think not parametrizing them is okay.
Letting the dbt version be dynamic in the project fixture previously was
causing some tests to break whenever the version of dbt actually got updated,
which isn't great. It'd be super annoying to have to always update tests
affected by this. To get around this we've gone and hard coded the dbt version
in the profile. The alternative was to interpolate the version during comparison
during the relevant tests, which felt less appealing.
* Move `tests/unit/test_yaml_renderer.py` to `tests/unit/parser/test_schema_renderer.py`
* Move `tests/unit/test_unit_test_parser.py` to `tests/unit/parser/test_unit_tests.py`
* Convert `tests/unit/test_tracking.py` to use pytest fixtures
* Delete `tests/unit/test_sql_result.py` as it was moved to `dbt-adapters`
* Move `tests/unit/test_semantic_models.py` to `tests/unit/graph/test_nodes.py
* Group tests of `SemanticModel` in `test_nodes.py` into a `TestSemanticModel` class
* Move `tests/unit/test_selector_errors.py` to `tests/unit/config/test_selectors.py`
* Add `Project` fixture for unit tests and test `Project` class methods
* Move `Project.__eq__` unit tests to new pytest class testing
* Move `Project.hashed_name` unit test to pytest testing class
* Rename some testing class in `test_project.py` to align with testing split
* Refactor `project` fixture to make accessible to other unit tests
* simplify dockerfile, eliminate references to adapter repos as they will be handled in those repos
* keep dbt-postgres target for historical releases of dbt-postgres
* update third party image to pip install conditionally
* Add event type for deprecation of spaces in model names
* Begin emitting deprecation warning for spaces in model names
* Only warn on first model name with spaces unless `--debug` is specified
For projects with a lot of models that have spaces in their names, the
warning about this deprecation would be incredibly annoying. Now we instead
only log the first model name issue and then a count of how many models
have the issue, unless `--debug` is specified.
* Refactor `EventCatcher` so that the event to catch is settable
We want to be able to catch more than just `SpacesInModelNameDeprecation`
events, and in the next commit we will alter our tests to do so. Thus
instead of writing a new catcher for each event type, a slight modification
to the existing `EventCatcher` makes this much easier.
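Roughly the shape of the refactor being described, assuming the catcher is registered as an event-manager callback; the real `EventCatcher` used by the tests may differ:
```python
from dataclasses import dataclass, field
from typing import Any, List, Type

@dataclass
class EventCatcher:
    event_to_catch: Type  # e.g. SpacesInModelNameDeprecation, or any other event class
    caught_events: List[Any] = field(default_factory=list)

    def catch(self, event: Any) -> None:
        # registered as a callback on the event manager; keep only matching events
        if isinstance(event, self.event_to_catch):
            self.caught_events.append(event)
```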
* Add project flag to control whether spaces are allowed in model names
* Log errors and raise exception when `allow_spaces_in_model_names` is `False`
* Use custom event for output invalid name counts instead of `Note` events
Using `Note` events was causing test flakiness when run in a multi-worker
environment using `pytest -nauto`. This is because the event
manager is currently a global. So in a situation where test `A` starts
and test `tests_debug_when_spaces_in_name` starts shortly thereafter,
the event manager for both tests will have the callbacks set in
`tests_debug_when_spaces_in_name`. Then if something in test `A` fired
a `Note` event, this would affect the count of `Note` events that
`tests_debug_when_spaces_in_name` sees, causing assertion failures. By
creating a custom event, `TotalModelNamesWithSpacesDeprecation`, we limit
the possible flakiness to only tests that fire the custom event. Thus
we didn't _eliminate_ all possibility of flakiness, but realistically
only the tests in `test_check_for_spaces_in_model_names.py` can now
interfere with each other. That still isn't great, but to fully
resolve the problem we need to work on how the event manager is
handled (preferably not globally).
* Always log total invalid model names if at least one
Previously we only logged out the count of how many invalid model names
there were if there were two or more invalid names (and not in debug mode).
However, this message is important even if there is only one invalid model
name, and regardless of whether you are running in debug mode. That is because
automated tools might be looking for the event type to track whether anything
is wrong.
A related change in this commit is that we now only output the debug hint
if dbt wasn't run in debug mode. The idea is that if someone is already
running in debug mode, the hint could come across as somewhat
patronizing.
* Reduce duplicate `if` logic in `check_for_spaces_in_model_names`
* Improve readability of logs related to problematic model names
We want people running dbt to be able to see, at a glance, warnings/errors
from running their project. In this case we are focused specifically on
errors/warnings regarding model names containing spaces. Previously
we were only ever emitting the `warning_tag` in the message, even if the
event itself was being emitted at an `ERROR` level. We now properly have
`[ERROR]` or `[WARNING]` in the message depending on the level. Unfortunately
we couldn't just look at what level the event was being fired at, because that
information doesn't exist on the event itself.
Additionally, we're using events that base off of `DynamicEvent`, which is
unfortunately hard-coded to `DEBUG`. Changing this would involve still
having a `level` property on the definition in `core_types.proto` and
then having `DynamicEvent`s look to `self.level` in the `level_tag`
method. Then we could change how firing events works based on an
event's `level_tag` return value. This all sounds like a bit of tech
debt suited for a PR, possibly multiple, and thus is not being done here.
* Alter `TotalModelNamesWithSpacesDeprecation` message to handle singular and plural
* Remove duplicate import in `test_graph.py` introduced from merging in main
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Expect that the `args` variable is un-modified by `dbt.invoke(args)`
* Make `args` variable un-modified by `dbt.invoke(args)`
* Changelog entry
* Expect that the `args` variable is un-modified by `make_dbt_context`
* Make the `args` variable un-modified by `make_dbt_context`
* Make a copy of `args` passed to `make_dbt_context`
* Revert "Make a copy of `args` passed to `make_dbt_context`"
This reverts commit 79227b4d34.
* Ensure BaseRunner handles nodes without `build_path`
Some nodes, like SourceDefinition nodes, don't have a `build_path` property.
This is problematic because we take in nodes with no type checking, and
sometimes assume they have properties like `build_path`. This was exactly
the case in BaseRunner's `_handle_generic_exception` and
`_handle_internal_exception` methods. Thus, to stop dbt from crashing when
trying to handle an exception related to a node without a `build_path`,
we added a private method to the BaseRunner class for safely trying
to get `build_path`.
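A minimal sketch of that kind of safe lookup using a `getattr` fallback; the method name here is illustrative, not the actual `BaseRunner` API:
```python
from typing import Any, Optional

class RunnerSketch:
    def __init__(self, node: Any) -> None:
        self.node = node

    def _safe_build_path(self) -> Optional[str]:
        # SourceDefinition nodes have no build_path attribute, so fall back to None
        return getattr(self.node, "build_path", None)
```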
* Use keyword arguments when instantiating `Note` events in freshness.py
Previously we were passing arguments to the `Note` event instantiations
in freshness.py as positional arguments. Instead of emitting the desired
`Note` event, this produced the message
```
[Note] Don't use positional arguments when constructing logging events
```
which was our fault, not the users'. Additionally, we were passing the
level for the event in the `Note` instantiation when we needed to be
passing it to the `fire_event` method.
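A sketch of the corrected call pattern, with the message passed as a keyword argument and the level passed to `fire_event` rather than to the `Note` constructor (import paths assumed from dbt-common and may differ by version):
```python
from dbt_common.events.base_types import EventLevel
from dbt_common.events.functions import fire_event
from dbt_common.events.types import Note

# msg as a keyword argument; the level goes to fire_event, not to Note()
fire_event(Note(msg="Skipping source freshness check."), level=EventLevel.WARN)
```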
* Raise error when `loaded_at_field` is `None` and metadata check isn't possible
Previously if a source freshness check didn't have a `loaded_at_field` and
metadata source freshness wasn't supported by the adapter, then we'd log
a warning message and let the source freshness check continue. This was problematic
because the source freshness check couldn't actually continue and the process
would raise an error in the form
```
type object argument after ** must be a mapping, not NoneType
```
because the `freshness` variable was never getting set. This error wasn't particularly
helpful for anyone running into it. So instead of letting that error
happen, we now deliberately raise an error with helpful information.
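A rough sketch of the guard being described; every name here is an assumption rather than the exact dbt-core identifier:
```python
from dbt_common.exceptions import DbtRuntimeError

def check_freshness_inputs(source_name: str, loaded_at_field, metadata_freshness_supported: bool) -> None:
    # fail loudly up front instead of letting a cryptic `**`-unpacking error surface later
    if loaded_at_field is None and not metadata_freshness_supported:
        raise DbtRuntimeError(
            f"Source '{source_name}' has no loaded_at_field and the adapter does not "
            "support metadata-based source freshness checks."
        )
```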
* Add test which ensures bad source freshness checks raise appropriate error
This test directly tests that when a source freshness check doesn't have a
`loaded_at_field` and the adapter in use doesn't support metadata checks,
then the appropriate error message gets raised. That is, it directly tests
the change made in a162d53a8. This test indirectly tests the changes in both
7ec2f82a9 and 7b0ff3198 as the appropriate error can only be raised because
we've fixed other upstream issues via those commits.
* Add changelog entry for source freshness edgecase fixes
* Add @p.profile and @p.target to the list of "global" CLI flags
* Add env vars (DBT_PROFILE, DBT_TARGET) to the params
* Add unit test
* Simplify unit test
* changie
* Update .changes/unreleased/Features-20231115-092005.yaml
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Fix incorrect envvar names
* Realign environment variable names
* Remove from specific subcommands
* Add test_global_flags_not_on_subcommands
* Remove one unnecessary test case
* Remove other unnecessary test case
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Our `protobuf` dep was in the section of `setup.py` which we delineate
as expecting all future versions of it to be compatible. However, this
is no longer actually the case, and in e4fe839e45
we restricted it to major version 4.
* [#9570] Fix fixtures in fixtures/subfolders throwing parsing error
* Fast-forward imports to match upstream
* Re-introduce doc strings on traceback info handling
* [#9570] Changelog update for fix of fixtures in fixtures/subfolders throwing parsing error
* [#9570] Improve testability and coverage for partial parsing
* Transform skip_parsing (private variable of ManifestLoader.load()) into instance-attribute of ManifestLoader(), with default value False
(to enable splitting of ManifestLoader.load())
* Split ManifestLoader.load(), to extract operation of PartialParsing into new method called ManifestLoader.safe_update_project_parser_files_partially()
(to simplify both cognitive complexity in the codebase and mocking in unit tests)
* Add "ignore" type-comments in new ManifestLoader.safe_update_project_parser_files_partially()
(to silence mypy warnings regarding instance-attributes which can be initialized as None or as something else, e.g. self.saved_manifest)[1]
[1] Although I wanted to avoid "ignore" type-comments, it seems like addressing these mypy warnings in a stricter sense requires technical alignment and broader code changes.
For example, we might need to initialize self.saved_manifest as Manifest, instead of Optional[Manifest], so that PartialParsing gets inputs with the type it currently expects.
... perhaps too far beyond the scope of this fix?
* Check for equality with existing input_measures when adding input_measures
* Changie
* Add type annotation
* Move add_input_measure to metric from type_params
* Add tests to check that saved queries show in `dbt list`
* Update `list` task to support saved queries
This is built off of @jtcohen6's work in d6e7cda on jerco/fix-9532.
I didn't directly cherry pick because there was more work to do as
well as merge conflicts. That is to say @jtcohen6 should be credited
with some of the work.
* Update error message when iterating over nodes during list command errors
This was originally suggested by @jtcohen6 in d6e7cda of jerco/fix-9532.
This commit just makes sure the change gets included because I didn't
cherry-pick that commit into this work.
* Add test around deleting a YAML file containing semantic models and metrics
It was raised in https://github.com/dbt-labs/dbt-core/issues/8860 that an
error is being raised during partial parsing when files containing
metrics/semantic models are deleted. In further testing it looks like this
error specifically happens when a file containing both semantic models and
metrics is deleted. If the deleted file contains just semantic models or
metrics there seems to be no issue. The next commit should contain the fix.
* Skip deleted schema files when scheduling files during partial parsing
Waaaay back (in 7563b99) deleted schema files started being separated out
from deleted non-schema files. However, ever since, when it came to scheduling
files for reparsing, we've only done so for deleted non-schema files. We even
missed this when we refactored the scheduling code in b37e5b5. This change
updates `_schedule_for_parsing`, which is used by `schedule_nodes_for_parsing`,
to begin skipping deleted schema files in addition to deleted non-schema files.
* Update `add_to_pp_files` to ignore `deleted_schema_files`
As noted in the previous commit, we started separating out deleted
schema files from deleted non-schema files a looong time ago. However,
this whole time we've been adding `deleted_schema_files` to the list
of files to be parsed. This change corrects for that.
* Add changie doc for partial parsing KeyError fix
Protobuf v5 has breaking changes. Here we are limiting the protobuf
dependency to one major version, 4, so that we don't have to patch
over handling 2 different major versions of protobuf.
* Clearer no-op logging in stubbed SavedQueryRunner
* Add changelog entry
* Fix unit test
* More logging touchups
* Fix failing test
* Rename flag + refactor per #9629
* Fix failing test
* regenerate core_proto_types with libprotoc 25.3
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
A recent update to the version ranges for our internally
maintained support packages quite reasonably expanded the
allowed versions for dbt-semantic-interfaces to all minor versions
after 0.5.0, under the assumption that subsequent releases will
generally be backwards-compatible.
Unfortunately, dbt-semantic-interfaces is not yet in that state.
So we update the version range accordingly, and include some
comments around version range expectations for dependencies
listed in this section of dbt-core's package configuration.
CVE-2024-22195 identified an issue in Jinja2 versions <= 3.1.2. As such
we've gone and changed our dependency requirement specification to be
3.1.3 or greater (but less than 4).
Note: Previously we were using the `~=` version specifier. However, due
to some issues with `~=`, we've moved to using `>=` in combination
with `<`. This gives us the same range that `~=` gave us, but avoids
a pip resolution issue when multiple packages in an environment use `~=`
for the same dependency.
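Roughly how that pin style looks in `setup.py`'s `install_requires` (only the Jinja2 bound comes from the commit message; the surrounding list is illustrative):
```python
install_requires = [
    # >= / < instead of ~= avoids the pip resolution issue described above when
    # several packages in the same environment constrain Jinja2 with ~=
    "Jinja2>=3.1.3,<4",
]
```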
* Remove extraneous `/` in `schema-check.yml`
We have a hypothesis that the extra `/` in `schema-check` is causing
the issues we're currently seeing with the artifact check failing. It may
not be the final solution, but we should fix it anyway.
* Move `artifact_minor_upgrade` label check to job level of `Check Artifact Changes`
Previously the checking for `artifact_minor_upgrade` was happening in each job
step of `Check Artifact Changes`. By moving it up to the job level instead of
in the job steps we make it so the check for the label only happens once and
it simplifies the job steps.
* Update `Check Artifact Changes` to use `dorny/paths-filter`
Previously we were using `git diff` to check if any files had changed
in `core/dbt/artifacts`. However, our `git diff` usage was including any
changes that happened on `main` which the PR branch did not have. This
commit switches the check from using `git diff` to `dorny/paths-filter`,
which is what we use for checking for changelog existence as well. The
`dorny/paths-filter` includes logic for excluding changes that are on main
but not the PR branch (which is what we want to happen).
* Move `ColumnInfo` to dbt/artifacts
* Move `Quoting` resource to dbt/artifacts
* Move `TimePeriod` to `types.py` in dbt/artifacts
* Move `Time` class to `components`
We need to move the data parts of the `Time` definition to dbt/artifacts.
That is not what we're doing in this commit. In this commit we're simply
moving the functional `Time` definition upstream of `unparsed` and `nodes`.
This does two things
- Mirrors the import path that the resource `time` definition will have in dbt/artifacts
- Reduces the chance of circular import problems between `unparsed` and `nodes`
* Move data part of `Time` definition to dbt/artifacts
* Move `FreshnessThreshold` class to components module
We need to move the data parts of the `FreshnessThreshold` definition to dbt/artifacts.
That is not what we're doing in this commit. In this commit we're simply
moving the functional `FreshnessThreshold` definition upstream of `unparsed` and `nodes`.
This does two things
- Mirrors the import path that the resource `FreshnessThreshold` definition will have in dbt/artifacts
- Reduces the chance of circular import problems between `unparsed` and `nodes`
* Move data part of `FreshnessThreshold` to dbt/artifacts
Note: We had to override some of the attrs of the `FreshnessThreshold`
resource because the resource version only has access to the resource
version of `Time`. The overrides in the functional definition of
`FreshnessThreshold` make it so the attrs use the functional version
of `Time`.
* Move `ExternalTable` and `ExternalPartition` to `source_definition` module in dbt/artifacts
* Move `SourceConfig` to `source_definition` module in dbt/artifacts
* Move `HasRelationMetadata` to core `components` module
This is a precursor to splitting `HasRelationMetadata` into its
data and functional parts.
* Move data portion of `HasRelationMetadata` to dbt/artifacts
* Move `SourceDefinitionMandatory` to dbt/artifacts
* Move the data parts of `SourceDefinition` to dbt/artifacts
Something interesting here is that we had to override the `freshness`
property. We had to do this because if we didn't we wouldn't get the
functional parts of `FreshnessThreshold`, we'd only get the data parts.
Also of note, the `SourceDefinition` has a lot of `@property` methods that
on other classes would be actual attribute properties of the node. There is
an argument to be made that these should be moved as well, but that's perhaps
a separate discussion.
Finally, we have not (yet) moved `NodeInfoMixin`. It is an open discussion
whether we do or not. It seems primarily functional, as a means to update the
source freshness information. As the artifacts primarily deal with the shape
of the data, not how it should be set, it seems for now that `NodeInfoMixin`
should stay in core / not move to artifacts. This thinking may change though.
* Refactor `from_resource` to no longer use generics
In the next commit we're gonna add a `to_resource` method. As we don't
want to have to pass a resource into `to_resource`, the class itself
needs to expose what resource class should be built. Thus a type annotation
is no longer enough. To solve this we've added a class method to BaseNode
which returns the associated resource class. The method on BaseNode will
raise a NotImplementedError unless the inheriting class has overridden
the `resource_class` method to return a resource class.
You may be thinking "Why not a class property"? And that is absolutely a
valid question. We used to be able to chain `@classmethod` with
`@property` to create a class property. However, this was deprecated in
python 3.11 and removed in 3.13 (details on why this happened can be found
[here](https://github.com/python/cpython/issues/89519)). There is an
[alternate way to setup a class property](https://github.com/python/cpython/issues/89519#issuecomment-1397534245),
however this seems a bit convoluted if a class method easily gets the job
done. The drawback is that we must do `.resource_class()` instead of
`.resource_class`, and on classes implementing `BaseNode` we have to
override it with a method instead of a property specification.
Additionally, making it a class _instance_ property won't work because
we don't want to require an _instance_ of the class to get the
`resource_class`, as we might not have an instance at our disposal.
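A condensed sketch of the `resource_class` hook described above, with simplified stand-in classes rather than dbt-core's actual definitions:
```python
from dataclasses import dataclass
from typing import Type

@dataclass
class BaseResource:
    name: str

class BaseNode:
    @classmethod
    def resource_class(cls) -> Type[BaseResource]:
        # inheriting node classes override this to point at their resource class
        raise NotImplementedError(f"{cls.__name__} must define resource_class()")

@dataclass
class SourceDefinitionResource(BaseResource):
    loader: str = ""

class SourceDefinition(BaseNode):
    @classmethod
    def resource_class(cls) -> Type[SourceDefinitionResource]:
        return SourceDefinitionResource
```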
* Add `to_resource` method to `BaseNode`
Nodes have extra attributes. We don't want these extra attributes to
get serialized. Thus we're converting back to resources prior to
serialization. There could be a CPU hit here as we're now dictifying
and undictifying right before serialization. We can do some complicated
and non-straightforward things to get around this. However, we want
to see how big of a performance hit we actually have before going that
route.
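A hedged sketch of the dict-ify round trip; real nodes use mashumaro-style `to_dict`/`from_dict`, so the plain-dataclass version here is only illustrative:
```python
from dataclasses import asdict, dataclass

@dataclass
class MetricResource:
    name: str
    label: str = ""

@dataclass
class MetricNode(MetricResource):
    created_at: float = 0.0  # node-only attribute that shouldn't be serialized

    @classmethod
    def resource_class(cls):
        return MetricResource

    def to_resource(self) -> MetricResource:
        resource_cls = self.resource_class()
        data = asdict(self)
        # keep only the fields the resource knows about, then rebuild the resource
        fields = resource_cls.__dataclass_fields__
        return resource_cls(**{k: v for k, v in data.items() if k in fields})
```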
* Drop `__post_serialize__` from `SourceDefinition` node class
The method `__post_serialize__` on the `SourceDefinition` was used for
ensuring the property `_event_status` didn't make it to the serialized
version of the node. Now that resource definition of `SourceDefinition`
handles serialization/deserialization, we can drop `__post_serialize__`
as it is no longer needed.
* Merge functional parts of `components` into their resource counterparts
We discussed this on the PR. It seems like a minimal lift, and minimal to
support. Doing so also has the benefit of reducing a bunch of the overriding
we were previously doing.
* Fixup: Rename variable `name` to `node_id` in `_map_nodes_to_map_resources`
Naming is hard. That is all.
* Fixup: Ensure conversion of groups to resources for `WritableManifest`
This PR provides additional command-line options (Cloud CLI) for dbt development, as well as clarifying 'dbt core' under the 'get started' header.
cc @greg-mckeon
* update docker file to remove &subdirectory=plugins/postgres from git path
* remove extra proto file generation scripts which are no longer necessary in this repo
* Begin using `Mergeable` supplied by `dbt-common`
We're currently in the process of moving the "data resource" portion on nodes
to `dbt/artifacts`. Some of those artifacts depend on `Mergeable` which has
been defined on core. In order to move the data resources to `dbt/artifacts`,
we thus need to move `Mergeable` upstream of core. We moved `Mergeable` to
[dbt-common](https://github.com/dbt-labs/dbt-common) in
https://github.com/dbt-labs/dbt-common/pull/59, and released this change in
[dbt-common 0.1.3](https://pypi.org/project/dbt-common/0.1.3/). As such, in
order to unblock some of the `dbt/artifacts` migration work, we first need to
update references to `Mergeable` in core to use the `dbt-common` definition.
NOTE: We include changing over to `Replaceable` from `dbt-common` in this
commit. This is because there wasn't a clean way to do it. If I moved the imports
of `Replaceable` only in the files where we updated `Mergeable`, then we would
have left `Replaceable` in an in-between state. If we had moved all instances
of `Replaceable`, it'd be out of scope for the change. As such, it makes more
sense to do that as a separate changeset.
* Remove definition of `Mergeable` from dbt/contracts/util
Although we've removed the definition of `Mergeable` we've ensured the
import paths are still available. We do this because this is under
`contracts`, and the sudden disappearance from the import path might
cause issues for community members using dbt-core as a library.
Ideally we'd define a `Mergeable` class here that inherits the
`dbt-common` definition and raises a deprecation warning on instantiation.
However, we don't have an established strategy to do so.
* Use new context invocation class.
* Adjust new constructor param on InvocationContext, make tests robust
* Add changelog entry.
* Clarify parameter name
* Move `ExposureType` to dbt/artifacts
* Move `MaturityType` to dbt/artifacts
* Move `ExposureConfig` to dbt/artifacts
* Move data parts of `Exposure` node class to dbt/artifacts
* Update leftover incorrect imports of `Owner` resource
There were a few places in the code base that were importing `Owner`
from `unparsed` or `nodes`. The places importing from `unparsed` were
working because `unparsed` itself was correctly importing from
`artifacts.resources`. However in places where it was being imported
from `nodes`, an exception was being raised because in the previous
commit we removed the import of `Owner` in `nodes` because it was
no longer needed.
* Move `SemanticModel` sub dataclasses to dbt/artifacts
* Move `NodeRelation` to dbt/artifacts
* Move `SemanticModelConfig` to dbt/artifacts
* Move data portion of `SemanticModel` to dbt/artifacts
* Add contextual comments to `semantic_model.py` about DSI protocols
* Fixup mypy complaint
* Migrate v12 manifest to use artifact definitions of `SavedQuery`, `Metric`, and `SemanticModel`
* Convert `SemanticModel` and `Metric` resources to full nodes in selector search
In the `search` method in `selector_methods.py`, we were getting object
representations from the incoming writable manifest by unique id. What we
get from the writable manifest though is increasingly the `resource`
(data artifact) part of the node, not the full node. This was problematic
because a number of the selector processes _compare_ the old node to the
new node, but the `resource` representation doesn't have the comparator
methods.
In this commit we dict-ify the resource and then get the full node by
undictifying that. We should probably have a better built-in process to
get the full node objects, but this will do for now.
* Add `from_resource` implementation on `BaseNode` to ease resource to node conversion
We want to easily be able to create nodes from their resource counterparts.
It's actually imperative that we can do so. The previous commit
had a manual way to do so where needed. However, we don't want to have
to put `from_dict(.to_dict())` everywhere. So here we added a `from_resource`
class method to `BaseNode`. Everything that inherits from `BaseNode` thus
automatically gets this functionality.
HOWEVER, the implementation currently has a problem. Specifically, the
type for `resource_instance` is `BaseResource`. Which means if one is
calling say `Metric.from_resource()`, one could hand it a `SemanticModelResource`
and mypy won't complain. In this case, a semi-cryptic error might get
raised at runtime. Whether or not an error gets raised depends entirely
on whether or not the dictified resource instance manages to satisfy all
the required attributes of the desired node class. THIS IS VERY BAD.
We should be able to solve this issue in an upcoming (hopefully next)
commit, wherein we genericize `BaseNode` such that when inheriting it
you declare it with a resource type. Technically a runtime error will
still be possible, however any mixups should be caught by mypy on
pre-commit hooks as well as PRs.
* Make `BaseNode` a generic that is defined with a `ResourceType`
Turning `BaseNode` into an ABC generic allows us to say that the inheriting
class can define what resource type from artifacts it should be used with.
This gives us added type safety to what resource type can be passed into
`from_resource` when called via `SemanticModel.from_resource(...)`,
`Metric.from_resource(...)`, etc.
NOTE: This only gives us type safety from mypy. If we begin ignoring
mypy errors during development, we can still get into a situation for
runtime errors (it's just harder to do so now).
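A condensed sketch of the generic shape being described, again with stand-in classes rather than the actual dbt-core definitions:
```python
from abc import ABC
from typing import Generic, Type, TypeVar

class BaseResource: ...
class SemanticModelResource(BaseResource): ...

ResourceT = TypeVar("ResourceT", bound=BaseResource)

class BaseNode(ABC, Generic[ResourceT]):
    @classmethod
    def resource_class(cls) -> Type[ResourceT]:
        raise NotImplementedError

    @classmethod
    def from_resource(cls, resource: ResourceT):
        # mypy now flags calls like Metric.from_resource(semantic_model_resource)
        ...

class SemanticModel(BaseNode[SemanticModelResource]):
    @classmethod
    def resource_class(cls) -> Type[SemanticModelResource]:
        return SemanticModelResource
```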
* simplify and modularize tagging logic
* change package field to dropdown, log inputs to publish, skip actual publish for testing
* add dry run option
* update to v3 of docker actions to migrate from node16 (deprecated) to node20
* Move `MetricInputMeasure` to dbt/artifacts
* Move `MetricTimeWindow` to dbt/artifacts
* Move `MetricInput` to dbt/artifacts
* Move `ConstantPropertyInput` and `ConversionTypeParams` to dbt/artifacts
* Move `MetricTypeParams` to dbt/artifacts
* Remove obsolete `MetricReference` class from core
The `MetricReference` defined in `nodes.py` is from pre-1.6 core metrics,
i.e. the legacy semantic layer prior to integrating with MetricFlow. I
double-checked and found that this `MetricReference` is used _nowhere_
in core. It is dead, with no plan of coming back. Thus deleting it seems
logical.
* Move `MetricConfig` to dbt/artifacts
* Move data portion of `Metric` node to dbt/artifacts
* Move `depends_on_nodes` and `search_name` back to core `Metric` implementation
I got a little too indiscriminate in what got moved in the `Metric`
definition split in the previous commit. Specifically `depends_on_nodes`
and `search_name` shouldn't have been moved to `dbt/artifacts` as they
are specific core internals, not artifacts to be depended on.
* Add context comment to `metric.py` artifact file about upstream protocols.
* Move the common semantic layer node components to v1 artifact resources
* Move `FileSlice` and `SourceFileMetadata` to `semantic_layer_components` in artifacts
* Split `GraphNode` into a functional class in core and data class in artifacts
* Refactor the `same_context` checks of `Exports` into `SavedQuery`
This is important because we want to move the `Export` class to artifacts.
However, because it had functional parts, we would have had to split it in half,
with the data definition existing in artifacts and the functional specification
defined in core. At first glance that's not problematic. However, the
`SavedQuery` definition in artifacts would only be able to point at the
data definition of `Export`, and then the functional `SavedQuery` spec in
core would have to override that with the functional `Export` definition
that exists in core. This would make the inheritance rather wonky and
confusing. This refactor simplifies things greatly because now we can move
the entirety of `Export` to artifacts, and the core `SavedQuery` won't
have to override anything.
* Move child components of `SavedQuery` to artifacts
Specifically the components in `contracts/graph/saved_queries.py` which
are `Export`, `ExportConfig`, and `QueryParams` got moved to
`artifacts/resources/v1/saved_query.py`. The moving of `Export` was
made possible by the refactor in the previous commit.
* Move `SavedQueryMandatory` to dbt/artifacts
* Move `SavedQueryConfig` to dbt/artifacts
* Move `DependsOn` class to artifacts
If we had followed the general paradigm we've set, we would have split
`DependsOn` into a data half and a functional half, with the data half
going in artifacts. However, doing so overly complicates the work that
we're doing. Additionally, looking forward, we hope to simplify
`DependsOn` (as well as `MacroDependsOn`) to use `sets` instead of
`lists`, thus allowing us to get rid of the functional part. We haven't
done that refactor here because there is a reasonable amount of risk
associated with such a change, such that doing so should be its own
segment of work.
* Move `NodeVersion` and `RefArgs` to dbt/artifacts
I debated about making this two commits. However, I only realized we
needed to also move `NodeVersion` when I was most of the way through
moving `RefArgs`, and instead of stashing, I just decided to do both.
They're kind of inseparable anyway because it only makes sense to
move `NodeVersion` if you move `RefArgs`, but you can't move `RefArgs`
unless you also move `NodeVersion`. The two in one commit are still
small enough that I'm okay with this.
* Move data portion of `SavedQuery` class to dbt/artifacts
* Update implementation-ticket.yml
Changed "Notion docs" to "documentations"
* Added changelog
* modified the contributing and readme files.
* fixed end of files as test failed on previous commit.
* fixed the test errors.
* Changes as per reviewer's request have been made.
* some changes idk
* Update .changes/unreleased/Under the Hood-20240109-091856.yaml
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Update .github/ISSUE_TEMPLATE/implementation-ticket.yml
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Update .github/ISSUE_TEMPLATE/implementation-ticket.yml
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
---------
Co-authored-by: Tania <tonayya@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* simplify release inputs
* fix vars
* add missing quote:
* drop all env vars since the workflows dont like them
* Update .github/workflows/release.yml
* Add unit test that shows unit tests work with external nodes
* Abbreviate names in external nodes test to stay under 64-character postgres max
Was getting test failures due to the lengthy model names being created
by the unit test task in the functional test
* Fix unit test parsing to ensure external nodes continue to keep their package name
* Add seed to test of external node unit test, and indirectly have the external node point to it
Previously I was getting an error about the columns for the external model
not being fetchable from the database via the macro `get_columns_in_relation`.
By creating a seed for the tests, which creates a table in postgres, we can then
tell the external model that its database, schema, and identifier (the relation)
are that table from the seed, without making the seed an actual dependency of the
external model in the DAG.
* Ensure all models in unit test shadow manifest have a non `None` path
External nodes generally don't have paths, but in unit tests we write out
all models to sql files (as this allows us to test them). Thus external
nodes need to have their paths set.
* Add `run` step to function test of unit test with external nodes
This is necessary because when executing a unit test, the columns
associated with a model in the database are retrieved. For this to
be possible, the model must exist in the database, thus we must
run the associated models at least once first.
* Create a full external package for function test of a unit test with an external node
Previously we were only pseudo creating an external package for testing
how unit tests work with external nodes. This was problematic because the
package didn't actually exist and thus wasn't seen as accessible when running
through dag dependencies. By actually creating the external package, we
ensure that all the built in normal processes happen.
* Add test for more ephemeral external models
* Flip logic in `packages_for_node` to remove error case
By flipping the logic from `not in` to `in` we can drop the exception
and instead default to the model runtime config when the package isn't
found. We're still trying to grok if there will be any fallout from this.
The tests all pass, but that doesn't guarantee nothing bad will happen.
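A rough sketch of the flipped check, using hypothetical names since the surrounding function isn't shown here:
```python
def packages_for_node(node, packages_by_name: dict, model_runtime_config):
    # flipped from `not in` + raise: fall back to the model's runtime config
    # when the node's package isn't found instead of erroring
    if node.package_name in packages_by_name:
        return packages_by_name[node.package_name]
    return model_runtime_config
```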
* Add changie doc for added support of external nodes in unit tests
* Initial implementation of unit testing (from pr #2911)
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* 8295 unit testing artifacts (#8477)
* unit test config: tags & meta (#8565)
* Add additional functional test for unit testing selection, artifacts, etc (#8639)
* Enable inline csv format in unit testing (#8743)
* Support unit testing incremental models (#8891)
* update unit test key: unit -> unit-tests (#8988)
* convert to use unit test name at top level key (#8966)
* csv file fixtures (#9044)
* Unit test support for `state:modified` and `--defer` (#9032)
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Allow use of sources as unit testing inputs (#9059)
* Use daff for diff formatting in unit testing (#8984)
* Fix#8652: Use seed file from disk for unit testing if rows not specified in YAML config (#9064)
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Fix#8652: Use seed value if rows not specified
* Move unit testing to test and build commands (#9108)
* Enable unit testing in non-root packages (#9184)
* convert test to data_test (#9201)
* Make fixtures files full-fledged members of manifest and enable partial parsing (#9225)
* In build command run unit tests before models (#9273)
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
* remove dbt.contracts.connection imports from adapter module
* Move events to common (#8676)
* Move events to common
* More Type Annotations (#8536)
* Extend use of type annotations in the events module.
* Add return type of None to more __init__ definitions.
* Still more type annotations adding -> None to __init__
* Tweak per review
* Allow adapters to include python package logging in dbt logs (#8643)
* add set_package_log_level functionality
* set package handler
* set package handler
* add logging about setting up logging
* test event log handler
* add event log handler
* add event log level
* rename package and add unit tests
* revert logfile config change
* cleanup and add code comments
* add changie
* swap function for dict
* add additional unit tests
* fix unit test
* update README and protos
* fix formatting
* update precommit
---------
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
* fix import
* move types_pb2.py from events to common/events
* move agate_helper into common
* Add utils module (#8910)
* moving types_pb2.py to common/events
* split out utils into core/common/adapters
* add changie
* remove usage of dbt.config.PartialProject from dbt/adapters (#8909)
* remove usage of dbt.config.PartialProject from dbt/adapters
* add changie
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* move agate_helper unit tests under tests/unit/common
* move agate_helper into common (#8911)
* move agate_helper into common
* add changie
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* remove dbt.flags.MP_CONTEXT usage in dbt/adapters (#8931)
* remove dbt.flags.LOG_CACHE_EVENTS usage in dbt/adapters (#8933)
* Refactor Base Exceptions (#8989)
* moving types_pb2.py to common/events
* Refactor Base Exceptions
* update make_log_dir_if_missing to handle str
* move remaining adapters exception imports to common/adapters
---------
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
* Remove usage of dbt.deprecations in dbt/adapters, enable core & adapter-specific (#9051)
* Decouple adapter constraints from core (#9054)
* Move constraints to dbt.common
* Move constraints to contracts folder, per review
* Add a changelog entry.
* move include/global_project to adapters (#8930)
* remove adapter.get_compiler (#9134)
* Move adapter logger to adapters (#9165)
* moving types_pb2.py to common/events
* Move AdapterLogger to adapter folder
* add changie
* delete accidentally merged types_pb2.py
* Move the semver package to common and alter references. (#9166)
* Move the semver package to common and alter references.
* Alter leftover references to dbt.semver, this time using from syntax.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Refactor EventManager setup and interaction (#9180)
* moving types_pb2.py to common/events
* move event manager setup back to core, remove ref to global EVENT_MANAGER and clean up event manager functions
* move invocation_id from events to first class common concept
* move lowercase utils to common
* move lowercase utils to common
* ref CAPTURE_STREAM through method
* add changie
* first pass: adapter migration script (#9160)
* Decouple macro generator from adapters (#9149)
* Remove usage of dbt.contracts.relation in dbt/adapters (#9207)
* Remove ResultNode usage from connections (#9211)
* Add RelationConfig Protocol for use in Relation.create_from (#9210)
* move relation contract to dbt.adapters
* changelog entry
* first pass: clean up relation.create_from
* type ignores
* type ignore
* changelog entry
* update RelationConfig variable names
* Merge main into feature/decouple-adapters-from-core (#9240)
* moving types_pb2.py to common/events
* Restore warning on unpinned git packages (#9157)
* Support --empty flag for schema-only dry runs (#8971)
* Fix ensuring we produce valid jsonschema artifacts for manifest, catalog, sources, and run-results (#9155)
* Drop `all_refs=True` from jsonschema-ization build process
Passing `all_refs=True` makes it so that everything is a ref, even
the top-level schema. In jsonschema land, this essentially makes the
produced artifact not a full schema, but a fractal object to be included
in a schema. Thus when `$id` is passed in, jsonschema tools blow up,
because `$id` is for identifying a schema, which we explicitly weren't
creating. The alternative was to drop the inclusion of `$id`. However, we're
intending to create a schema, and having an `$id` is recommended best
practice. Additionally, since we were intending to create a schema,
not a fractal, it seemed best to create the full schema.
* Explicitly produce jsonschemas using DRAFT_2020_12 dialect
Previously we were implicitly using the `DRAFT_2020_12` dialect through
mashumaro. It felt wise to begin explicitly specifying this. First, it
is the closest of the available mashumaro-provided dialects to what we produced
pre-1.7. Secondly, if mashumaro changes its default for whatever reason
(say a new dialect is added, and mashumaro moves to that), we don't want
to automatically inherit that.
* Bump manifest version to v12
Core 1.7 released with manifest v11, and we don't want to be overriding
that with 1.8. It'd be weird for 1.7 and 1.8 to both have v11 manifests,
but for them to be different, right?
* Begin including schema dialect specification in produced jsonschema
In jsonschema's documentation they state
> It's not always easy to tell which draft a JSON Schema is using.
> You can use the $schema keyword to declare which version of the JSON Schema specification the schema is written to.
> It's generally good practice to include it, though it is not required.
and
> For brevity, the $schema keyword isn't included in most of the examples in this book, but it should always be used in the real world.
Basically, to know how to parse a schema, it's important to include what
schema dialect is being used for the schema specification. The change in
this commit ensures we include that information.
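A sketch of how the combined changes could look when building an artifact schema with mashumaro (the `$id` URL, the `with_dialect_uri` flag, and the `WritableManifest` reference are assumptions about the exact call):
```python
from mashumaro.jsonschema import build_json_schema
from mashumaro.jsonschema.dialects import DRAFT_2020_12

# no all_refs=True, so we get a full standalone schema rather than a fractal;
# with_dialect_uri asks mashumaro to emit the $schema dialect keyword
schema = build_json_schema(
    WritableManifest, dialect=DRAFT_2020_12, with_dialect_uri=True
).to_dict()
schema["$id"] = "https://schemas.getdbt.com/dbt/manifest/v12.json"
```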
* Create manifest v12 jsonschema specification
* Add change documentation for jsonschema schema production fix
* Bump run-results version to v6
* Generate new v6 run-results jsonschema
* Regenerate catalog v1 and sources v3 with fixed jsonschema production
* Update tests to handle bumped versions of manifest and run-results
---------
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
* Move BaseConfig to Common (#9224)
* moving types_pb2.py to common/events
* move BaseConfig and assorted dependencies to common
* move ShowBehavior and OnConfigurationChange to common
* add changie
* Remove manifest from catalog and connection method signatures (#9242)
* Add MacroResolverProtocol, remove lazy loading of manifest in adapter.execute_macro (#9243)
* remove manifest from adapter.execute_macro, replace with MacroResolver + remove lazy loading
* rename to MacroResolverProtocol
* pass MacroResolverProtocol in adapter.calculate_freshness_from_metadata
* changelog entry
* fix adapter.calculate_freshness call
* pass context to MacroQueryStringSetter (#9248)
* moving types_pb2.py to common/events
* remove manifest from adapter.execute_macro, replace with MacroResolver + remove lazy loading
* rename to MacroResolverProtocol
* pass MacroResolverProtocol in adapter.calculate_freshness_from_metadata
* changelog entry
* fix adapter.calculate_freshness call
* pass context to MacroQueryStringSetter
* changelog entry
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
* add macro_context_generator on adapter (#9251)
* moving types_pb2.py to common/events
* remove manifest from adapter.execute_macro, replace with MacroResolver + remove lazy loading
* rename to MacroResolverProtocol
* pass MacroResolverProtocol in adapter.calculate_freshness_from_metadata
* changelog entry
* fix adapter.calculate_freshness call
* add macro_context_generator on adapter
* fix adapter test setup
* changelog entry
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Add typing for macro_context_generator, fix query_header_context
---------
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
* Pass mp_context to adapter factory (#9275)
* moving types_pb2.py to common/events
* require core to pass mp_context to adapter factory
* add changie
* fix SpawnContext annotation
* Fix include for decoupling (#9286)
* moving types_pb2.py to common/events
* fix include path in MANIFEST.in
* Fix include for decoupling (#9288)
* moving types_pb2.py to common/events
* fix include path in MANIFEST.in
* add index.html to in MANIFEST.in
* move system client to common (#9294)
* moving types_pb2.py to common/events
* move system.py to common
* add changie update README
* remove dbt.utils from semver.py
* remove aliasing connection_exception_retry
* Update materialized views to use RelationConfigs and remove refs to dbt.utils (#9291)
* moving types_pb2.py to common/events
* add AdapterRuntimeConfig protocol and clean up dbt-postgress core imports
* add changie
* remove AdapterRuntimeConfig
* update changelog
* Add config field to RelationConfig (#9300)
* moving types_pb2.py to common/events
* add config field to RelationConfig
* merge main into feature/decouple-adapters-from-core (#9305)
* moving types_pb2.py to common/events
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Remove `--dry-run` flag from `dbt deps` (#9169)
* Rm --dry-run flag for dbt deps
* Add changelog entry
* Update test
* PR feedback
* adding clean_up methods to basic and unique_id tests (#9195)
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
---------
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
* resolve merge conflict on unparsed.py (#9309)
* moving types_pb2.py to common/events
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Remove `--dry-run` flag from `dbt deps` (#9169)
* Rm --dry-run flag for dbt deps
* Add changelog entry
* Update test
* PR feedback
* adding clean_up methods to basic and unique_id tests (#9195)
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
---------
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
* Resolve unparsed.py conflict (#9311)
* Update parser to support conversion metrics (#9173)
* added ConversionTypeParams classes
* updated parser for ConversionTypeParams
* added step to populate input_measure for conversion metrics
* version bump on DSI
* comment back manifest generating line
* updated v12 schemas
* added tests
* added changelog
* Remove `--dry-run` flag from `dbt deps` (#9169)
* Rm --dry-run flag for dbt deps
* Add changelog entry
* Update test
* PR feedback
* adding clean_up methods to basic and unique_id tests (#9195)
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
---------
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
---------
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: Mila Page <67295367+VersusFacit@users.noreply.github.com>
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
Co-authored-by: William Deng <33618746+WilliamDee@users.noreply.github.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* init attempt of adding clean_up methods to basic and unique_id tests
* swapping cleanup method drop of test_schema to unique_schema to test breakage on docs_generate test
* moving the clean_up method down into class BaseDocsGenerate
* remove drop relation for unique_schema
* manually define alternate_schema for clean_up as not being seen as part of project_config
* add changelog
* remove unneeded changelog
* uncomment line that generates new manifest and delete manifest our changes created
* make sure the manifest test is deleted and readd older version of manifest.json to appease test
* manually revert file to previous commit
* Revert "manually revert file to previous commit"
This reverts commit a755419e8b.
* Drop `all_refs=True` from jsonschema-ization build process
Passing `all_refs=True` makes it so that everything is a ref, even
the top-level schema. In jsonschema land, this essentially makes the
produced artifact not a full schema, but a fractal object to be included
in a schema. Thus when `$id` is passed in, jsonschema tools blow up,
because `$id` is for identifying a schema, which we explicitly weren't
creating. The alternative was to drop the inclusion of `$id`. However, we're
intending to create a schema, and having an `$id` is recommended best
practice. Additionally, since we were intending to create a schema,
not a fractal, it seemed best to create the full schema.
* Explicitly produce jsonschemas using DRAFT_2020_12 dialect
Previously we were implicitly using the `DRAFT_2020_12` dialect through
mashumaro. It felt wise to begin explicitly specifying this. First, it
is the closest of the available mashumaro-provided dialects to what we produced
pre-1.7. Secondly, if mashumaro changes its default for whatever reason
(say a new dialect is added, and mashumaro moves to that), we don't want
to automatically inherit that.
* Bump manifest version to v12
Core 1.7 released with manifest v11, and we don't want to be overriding
that with 1.8. It'd be weird for 1.7 and 1.8 to both have v11 manifests,
but for them to be different, right?
* Begin including schema dialect specification in produced jsonschema
In jsonschema's documentation they state
> It's not always easy to tell which draft a JSON Schema is using.
> You can use the $schema keyword to declare which version of the JSON Schema specification the schema is written to.
> It's generally good practice to include it, though it is not required.
and
> For brevity, the $schema keyword isn't included in most of the examples in this book, but it should always be used in the real world.
Basically, to know how to parse a schema, it's important to include what
schema dialect is being used for the schema specification. The change in
this commit ensures we include that information.
* Create manifest v12 jsonschema specification
* Add change documentation for jsonschema schema production fix
* Bump run-results version to v6
* Generate new v6 run-results jsonschema
* Regenerate catalog v1 and sources v3 with fixed jsonschema production
* Update tests to handle bumped versions of manifest and run-results
In [dbt-labs/dbt-core#7984](https://github.com/dbt-labs/dbt-core/pull/7984)
we began setting a metric's `type_params.input_measures` during metric
processing post-parsing. However, in that PR we didn't clean up the comment
in the parser about setting `input_measures`. This is that after-the-fact
cleanup.
* Add test asserting GraphRunnableTasks attempt to cancel connections on SystemExit
* Add test asserting GraphRunnableTasks attempt to cancel connections on KeyboardInterrupt
* Add test asserting GraphRunnableTask doesn't try to cancel connections on generic Exception
* tarball lockfile fix
* Add changie doc for tarball deps issue
* Add integration test for ensuring tarball package specification works
This test was written _after_ the fix was committed. However, I ran this
test against main without the fix and it failed. After running the test
with the tarball fix, it passed.
* Remove unnecessary `tarball` conditional logic in `PackageConfig.validate`
We had a conditional to skip validation for a package if the package
included the `tarball` key. However, this conditional always returned
false, as it was nested inside a conditional checking that the package had the
default `package` key, which means it's not a tarball package, but a
package package (maybe we need better differentiation here). If we need
additional validation for tarballs down the road, we should do that one
level up. At this time we have no additional validations to add.
* Fix typos in changie doc for tarball deps issue
* Improve tarball package test naming and add related unhappy path test
* Remove unnecessary `setUp` fixture from tarball package tests
We initially included this fixture due to copy and pasting another
test. However, this `setUp` fixture isn't actually necessary for the
tarball dependency tests.
---------
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Add test asserting `SavedQuery` configs can be set from `dbt_project.yml`
* Allow extraneous properties in Export configs
This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagated to the final config.
* Add `ExportConfig` options to `SavedQueryConfig` options
This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow-up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.
There are a couple of caveats to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because we need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user-facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig`. This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`.
* Begin inheriting configs from saved query config, and transitively from project config
Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say, an export config will inherit
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally, because the saved query config has
a similar relationship with the project config, export configs can inherit
from the project config (again with a preference for export config specifications).
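A hedged sketch of that precedence, with a hypothetical helper name (export config wins, then saved-query config, then project config):
```python
def merge_export_config(export_cfg: dict, saved_query_cfg: dict, project_cfg: dict) -> dict:
    # start from the project config, then layer on the saved-query config, then the
    # export's own config; later layers only win where they actually supplied a value
    combined = dict(project_cfg)
    combined.update({k: v for k, v in saved_query_cfg.items() if v is not None})
    combined.update({k: v for k, v in export_cfg.items() if v is not None})
    return combined
```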
* Correct conditional in export config building for map schema to schema_name
I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or None, not
a bool, and thus would never match the right-hand side. Crazy. It has now
been fixed to do the thing we want it to do: if `schema` isn't `None`
and `schema_name` is `None`, then set `schema_name` to the value of
`schema`.
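A sketch of that intent written as an explicit two-part check, reusing the `combined` variable from the snippet quoted above:
```python
if combined.get("schema") is not None and combined.get("schema_name") is None:
    combined["schema_name"] = combined["schema"]
```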
* Update parameter names in `_get_export_config` to be more verbose
* Support non-half-width alphanumeric characters for generic tests
* Unify conditional statements
* add CHANGELOG entries
* added test for Japanese
* Move the fix further upstream
* Remove the changes in core/dbt/task/runnable.py
* Fix accidental removal of `_` substitution character
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* Handle unknown `type_code` for model contracts
* Changelog entry
* Fix changelog entry
* Functional test for a `type_code` that is not recognized by psycopg2
* Functional tests for data type mismatches
* add test
* fix test
* first pass with constraint error
* add back column checks for temp tables
* changelog
* Update .changes/unreleased/Fixes-20231024-145504.yaml
* changie doc for DSI 0.3.0 upgrade
* Gracefully handle v10 metric filters
* Fix iteration over metrics in `upgrade_v10_metric_filters`
* Update previous manifest version test fixtures to have more expressive metrics
* Regenerate the test v10 manifest artifact using the more expressive metrics from 904cc1ef
To do this I cherry-picked 904cc1ef onto my local 1.6.latest branch,
had the test regenerate the test v10 manifest artifact, and then
overwrote the test v10 manifest artifact on this branch (cherry-picking it
across the branches didn't work, so I had to copy and paste :grimace:).
* Regenerate test v11 manifest artifact using the fixture changes in 904cc1ef
* Update `upgrade_v10_metric_filters` to handle disabled metrics
Regenerating the v10 and v11 test manifest artifacts uncovered an
issue wherein we weren't handling disabled metrics that need to
get upgraded. This commit fixes that. Additionally, the
`upgrade_v10_metric_filters` function was getting a bit unwieldy, so I broke
it up by extracting sub-functions.
* Fix `test_backwards_compatible_versions` test
When we regenerated the v10 test manifest artifact, it started having
the `metricflow_time_spine` model, which it didn't previously. This caused
`test_backwards_compatible_versions` to start failing because it was
no longer identified as having modified state for v10. The test has
been altered accordingly.
* Bump to dbt-semantic-interfaces 0.3.0b1
* Update import path of `WhereFilterParser` from `dbt-semantic-interfaces`
In 0.3.x of `dbt-semantic-interfaces` the location of the WhereFilterParser
moved to be grouped in with a bunch of new adjacent code. As such,
we needed to correct our import path of it.
* Create basic `SavedQuery` node type based on `SavedQuery` protocol from DSI
* Add ability to add SavedQueries to the manifest
* Define unparsed SavedQuery node
* Begin parsing saved_query objects to manifest
* Skip jinja rendering of `SavedQuery.where` property
* Begin propagating `SavedQueries` on the manifest to the semantic manifest
* Add tests for basic saved query parsing
* Add custom pluralization handling of SavedQuery node type
* Add a config subclass to SavedQuery node
* Move the SavedQuery node to nodes.py
Unfortunately things are a bit too intertwined currently for SavedQuery
to be in its own file. We need to add the SavedQuery node to the
GraphMemberNode, and with SavedQuery in its own file,
importing it would have caused a circular dependency. We'll need
to separately come in and split things up as a cleanup portion of
work.
* Add basic plumbing of saved query configs to projects
* Add basic lookup utility for saved queries, SavedQueryLookup
* Handle disabled SavedQuery nodes in parsing and lookups
* Add SavedQuery nodes to grouping process
Our grouping logic seems to be in a weird spot. It seems like we're
moving to setting the `group` for a node in the node's `config`; however,
all of the logic around grouping is still focused on the top-level `group`
property on nodes. To get group stuff plumbed I've thus added `group`
as a top-level property of the `SavedQuery` node, and populated it from
the config group value.
* Plumb through saved query in a lot more places
I don't like making scattershot commits like this. However, a lot
of this commit was written around 4am, so here we are. Things were broken, I wanted
things to be unbroken. I mostly searched for `semantic_models` and added
the equivalent necessary `saved_queries`. Some stuff is in support of
writing out the manifest, some stuff helps with node selection; it's a
lot of miscellaneous stuff that I don't fully understand.
* Add `depends_on` to `SavedQuery` nodes and populate from `metrics` property
* Add partial parsing support to SavedQuery nodes
* Add `docs` support for SavedQuery descriptions
* Support selector methods for SavedQuery nodes
* Add `refs` property to SavedQuery node
We don't actually append anything to `refs` for SavedQuery nodes currently.
I'm not sure if anything needs to be appended to them. Regardless, we
access the `refs` property throughout the codebase while iterating over
nodes. It seems wise to support this attribute so as not to accidentally blow
something up when it doesn't exist.
* Support `saved_queries` when upgrading from manifests <= v10 (and regenerate v11)
* Add changie doc for saved query node support
* Pin to dbt-semantic-interfaces 0.3.0b1 for saved query work
We're going to release DSI 0.3.0, and if this PR automatically pulls that
in, things will break. But the things that need fixing should be handled
separately from this PR. After releasing DSI 0.3.0 I'm going to create
a branch off of/on top of this one, and open a stacked PR with the associated
changes.
* Bump supported DSI version to 0.3.x
* Switch metric filters and saved query where to use the new WhereFilterIntersection
* Update schema yaml readers to create WhereFilterInterfaces
* Expand metric filters and saved query where property to handle both str and list of strs
* Update tests which were broken by where filter changes
* Regenerate v11 manifest
* Fixup: Update `SavedQueryLookup.perform_lookup` to operate on saved queries
I missed this when I was copy and pasting 🤦
* Add support for getting freshness from DBMS metadata
* Add changelog entry
* Add simple test case
* Change parsing error to warning and add new event type for warning
* Code review simplification of capability dict.
* Revisions to the capability mechanism per review
* Move utility function.
* Reduce try/except scope
* Clean up imports.
* Simplify typing per review
* Unit test fix
* add `store_failures_as` parameter to TestConfig, catch strategy parameter in test materialization
* create test results as views
* updated test expected values for new config option
* break up tests into reusable tests and adapter specific configuration, update test to check for relation type and confirm views update
* move test configuration into base test class
* allow `store_failures_as` to drive whether failures are stored
* update expected test config dicts to include the new default value for store_failures_as
* Add `store_failures_as` config for generic tests
* cover --store-failures on CLI gap
* add generic tests test case for store_failures_as
* update object names for generic test case tests for store_failures_as
* remove unique generic test, it was not testing `store_failures_as`
* pull generic run and assertion into base test class to turn tests into quasi-parameterized tests
* add ephemeral option for store_failures_as, as a way to easily turn off store_failures at the model level
* add compilation error for invalid setting of store_failures_as
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* Explanation of Parsing vs. Compilation vs. Runtime
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Update core/dbt/parser/parsing-vs-compilation-vs-runtime.md
* Apply suggestions from code review
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
* Fix a couple markdown rendering issues
* Move to the "explain it like im 64" folder
When ELI5 just isn't detailed enough.
* Disambiguate Python references
Disambiguate Python references and delineate SQL models ("Jinja-SQL") from Python models ("dbt-py")
---------
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
* Add semantic model test to `test_contracts_graph_parsed.py`
The tests in `test_contracts_graph_parsed.py` are meant to ensure
that we can go from objects to dictionaries and back without any
changes. We've had a desire to simplify these tests. Most tests in
this file have three to four fixtures, this test only has one. What
a test of this format ensures is that parsing a SemanticModel from
a dictionary doesn't add/drop any keys from the dictionary and that
when going back to the dictionary no keys are dropped. This style of
test will still break whenever the semantic model (or sub objects)
change. However now when that happens, only one fixture will have to
be updated (whereas previously we had to update 3-4 fixtures).
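A hedged sketch of the symmetry check described above, using a toy dataclass in place of the real `SemanticModel` fixture; `assert_symmetric_sketch` is illustrative and simpler than the real helper (which also drops `None` keys).
```
from dataclasses import asdict, dataclass


@dataclass
class ToyNode:
    name: str
    description: str = ""


def assert_symmetric_sketch(node_cls, dct: dict) -> None:
    # dict -> object -> dict should neither add nor drop keys
    node = node_cls(**dct)
    assert asdict(node) == dct


assert_symmetric_sketch(ToyNode, {"name": "people", "description": "a node"})
```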
* Begin using hypothesis package for symmetry testing
Hypothesis is a Python package for doing property testing. The `@given`
decorator parameterizes a test, generating its arguments according to the supplied
`strategies`. The main strategy we use is `builds`: it takes a callable,
accepts sub-strategies for named arguments, and will try to infer any
other arguments if the callable is typed. I found that even though the
test was run many, many times, some of the `SemanticModel` properties
weren't being changed. For instance `dimensions`, `entities`, and `measures`
were always empty lists. Because of this I defined sub-strategies for
some attributes of `SemanticModel`s.
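A minimal illustration of the approach, assuming a toy dataclass in place of the real `SemanticModel`; the explicit sub-strategy for `dimensions` is the part that mirrors the fix described above.
```
from dataclasses import dataclass, field
from typing import List

from hypothesis import given
from hypothesis import strategies as st


@dataclass
class Dimension:
    name: str


@dataclass
class ToySemanticModel:
    name: str
    dimensions: List[Dimension] = field(default_factory=list)


# Without an explicit sub-strategy, `dimensions` tends to stay an empty list;
# spelling it out makes hypothesis actually vary it.
toy_models = st.builds(
    ToySemanticModel,
    name=st.text(min_size=1),
    dimensions=st.lists(st.builds(Dimension, name=st.text(min_size=1))),
)


@given(toy_models)
def test_round_trip_is_symmetric(model: ToySemanticModel) -> None:
    rebuilt = ToySemanticModel(name=model.name, dimensions=list(model.dimensions))
    assert rebuilt == model
```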
* Update unittest readme to have details on test_contracts_graph_parsed methodology
* Include option to generate static index.html
* Added changie
* Using dbt's system load / write file methods for better cross-platform
support
* Updated docs tests with `dbt.clients.system` calls for file reading
* Writing out static_index.html as a binary file to prevent line-ending
conversions on Windows (similar behaviour to index.html)
* Add performance metrics to the CommandCompleted event.
* Add changelog entry.
* Add flag for controlling the log level of ResourceReport.
* Update changelog entry to reflect changes
* Remove outdated attributes
* Work around missing resource module on windows
* Fix corner case where flags are not set
* Add new get_catalog_relations macro, allowing dbt to specify which relations in a schema the adapter should return data about
* Implement postgres adapter support for relation filtering on catalog queries
* Code review changes adding feature flag for catalog-by-relation-list support
* Use profile specified in --profile with dbt init (#7450)
* Use profile specified in --profile with dbt init
* Update .changes/unreleased/Fixes-20230424-161642.yaml
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Refactor run() method into functions, replace exit() calls with exceptions
* Update help text for profile option
---------
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* add TestLargeEphemeralCompilation (#8376)
* Fix a couple of issues in the postgres implementation of get_catalog_relations
* Add relation count limit at which to fall back to batch retrieval
* Better feature detection mechanism for adapters.
* Code review changes to get_catalog_relations and adapter feature checking
* Add changelog entry
---------
Co-authored-by: ezraerb <ezraerb@alum.mit.edu>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
* Add `date_spine` macro (and macros it depends on) from dbt-utils to core
The macros added are
- date_spine
- get_intervals_between
- generate_series
- get_powers_of_two
We're adding these to core because they are becoming more prevalently used
with the increased usage of the semantic layer. Basically, if you are
using the semantic layer currently, then it is almost a requirement
to use dbt-utils, which is undesirable given the SL is supported directly
in core. The primary focus of this was just to add `date_spine`. However,
because `date_spine` depends on other macros, these other macros were
also moved.
* Add adapter tests for `get_powers_of_two` macro
* Add adapter tests for `generate_series` macro
* Add adapter tests for `get_intervals_between` macro
* Add adapter tests for `date_spine` macro
* Improve test fixture for `date_spine` macro to work with multiple adapters
* Cast types to date in fixture_date_spine when targeting Redshift
* Improve test fixture for `get_intervals_between` macro to work with multiple adapters
* changie doc for adding date_spine macro
* Include `join_to_timespine` and `fill_nulls_with` in metric fixture
* Support `join_to_timespine` and `fill_nulls_with` properties on measure inputs to metrics
* Assert new `fill_nulls_with` and `join_to_timespine` properties don't break associated DSI protocol
* Add doc for metric null coalescing improvements
* Fix unit test for unparsed metric objects
The `assert_symmetric` function asserts that dictionaries are mostly
equivalent. I say mostly equivalent because it drops keys that are
`None`. The issue is that `join_to_timespine` gets defaulted
to `False`, so we have to specify it in the `get_ok_dict` so that
they match.
* allow multioption to be quoted
* changelog
* fix test
* remove list format
* fix tests
* fix list object
* review arg change
* fix quotes
* Update .changes/unreleased/Features-20230918-150855.yaml
* add types
* convert list to set in test
* make mypy happy
* more mypy happiness
* more mypy happiness
* last mypy change
* add node to test
* Extend use of type annotations in the events module.
* Add return type of None to more __init__ definitions.
* Still more type annotations adding -> None to __init__
* Tweak per review
* Use profile specified in --profile with dbt init
* Update .changes/unreleased/Fixes-20230424-161642.yaml
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Refactor run() method into functions, replace exit() calls with exceptions
* Update help text for profile option
---------
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* move config changes into alter.sql in alignment with other adapters
* move shared relations macros to relations root
* move single models files to models root
* add table to replace
* move create file into relation directory
* implement replace for postgres
* move column specific macros into column directory
* add unit test for can_be_replaced
* update renameable_relations and replaceable_relations to frozensets to set defaults
* fixed tests for new defaults
* Add docstrings to `contracts/graph/metrics.py` functions to document what they do
Used [dbt-labs/dbt-core#5607](https://github.com/dbt-labs/dbt-core/pull/5607)
for context on what the functions should do.
* Add typing to `reverse_dag_parsing` and update function to work on 1.6+ metrics
* Add typing to `parent_metrics` and `parent_metrics_names`
* Add typing to `base_metric_dependency` and `derived_metric_dependency` and update functions to work on 1.6+ metrics
* Simplify implementations of `base_metric_dependency` and `derived_metric_dependency`
* Add typing to `ResolvedMetricReference` initialization
* Add typing to `derived_metric_dependency_graph`
* Simplify conditional controls in `ResolvedMetricReference` functions
The functions in `ResolvedMetricReference` use `manifest.metrics.get(...)`,
which will only return either a `Metric` or `None`, never a different
node type. Thus we don't need to check that the returned metric is
a metric.
* Don't recurse over `depends_on` for non-derived metrics in `reverse_dag_parsing`
The function `reverse_dag_parsing` only cares about derived metrics,
that is, metrics that depend on other metrics. Metrics only depend on
other metrics if they are one of the `DERIVED_METRICS` types. Thus
making a recursive call to `reverse_dag_parsing` for non-`DERIVED_METRICS`
types is unnecessary. Previously we were iterating over a metric's
`depends_on` property regardless of whether the metric was a `DERIVED_METRICS`
type. Now we only do this work if the metric is of a `DERIVED_METRICS`
type.
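A hedged sketch of that control flow, with simplified stand-in data structures rather than the real manifest and `Metric` nodes:
```
from typing import Dict, List

DERIVED_METRIC_TYPES = {"derived", "ratio"}  # illustrative stand-in


def collect_parent_metrics(
    metric_types: Dict[str, str],
    depends_on: Dict[str, List[str]],
    metric_id: str,
) -> List[str]:
    parents = [metric_id]
    # Only derived metrics can depend on other metrics, so everything else
    # terminates the recursion immediately.
    if metric_types[metric_id] not in DERIVED_METRIC_TYPES:
        return parents
    for parent_id in depends_on.get(metric_id, []):
        parents.extend(collect_parent_metrics(metric_types, depends_on, parent_id))
    return parents
```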
* Simplify `parent_metrics_names` by having it call `parent_metrics`
* Unskip `TestMetricHelperFunctions.test_derived_metric` and update fixture setup
* Add changie doc for metric helper function updates
* Get manifest in `test_derived_metric` from the parse dbt_run invocation
* Remove `Relation` as an initialization attribute of `ResolvedMetricReference`
* Add return typing to class `__` functions of `ResolvedMetricReference`
* Move from `manifest.metrics.get` to `manifest.expect` in metric helpers
Previously with `manifest.metrics.get` we were just skipping when `None`
was returned. Getting `None` back was expected in that `parent_unique_id`s
that didn't belong to metrics should return `None` when calling
`manifest.metrics.get`, and these are fine to skip. However, there's
an edge case where a `parent_unique_id` is supposed to be a metric but
isn't found, thus returning `None`. How likely this edge case could
get hit, I'm not sure, but it's a possible edge case. Using `manifest.metrics.get`,
we can't actually tell whether we're in the edge case or not. By moving
to `manifest.expect` we get the error handling built in, and the only
trade-off is that we need to change our conditional to skip returned
nodes that aren't metrics.
* update `Number` class to handle integer values (#8306)
* add show test for json data
* oh changie my changie
* revert unnecessary change to fixture
* keep decimal class for precision methods, but return __int__ value
* jerco updates
* update integer type
* update other tests
* Update .changes/unreleased/Fixes-20230803-093502.yaml
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* account for integer vs number on table merges
* add tests for combining number with integer.
* add unit test when nulls are added
* can't cast None as an Integer
* fix null tests
---------
Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
Co-authored-by: Dave Connors <dave.connors@fishtownanalytics.com>
* first draft of adding in table - materialized view swap
* table/view/materialized view can all replace each other
* update renameable relations to a config
* migrate relations macros from `macros/adapters/relations` to `macros/relations` so that generics are close to the relation specific macros that they reference; also aligns with adapter macro files structure, to look more familiar
* move drop macro to drop macro file
* align the behavior of get_drop_sql and drop_relation, adopt existing default from drop_relation
* add explicit ddl for drop statements instead of inheriting the default from dbt-core
* update replace macro dependent macros to align with naming standards
* update type for mashumaro, update related test
* Improve typing of `ContextMember` functions
* Improve typing of `Var` functions
* Improve typing of `ContextMeta.__new__`
* Improve typing of `BaseContext` and its functions
In addition to adding parameter and return typing to
`BaseContext` functions, we also declared `_context_members_` and
`_context_attrs_` as properties of `BaseContext`. This was necessary
because they're being accessed in the class's functions. However,
because they were being indirectly instantiated by the metaclass
`ContextMeta`, the properties weren't actually known to exist. By
declaring the properties on `BaseContext`, we let mypy
know they exist.
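A small sketch of the pattern, assuming the attributes are created by the metaclass at class-creation time; the class and method bodies here are illustrative only.
```
from typing import Any, Callable, Dict


class BaseContextSketch:
    # Created by the metaclass in the real code; declared here purely so that
    # mypy knows the attributes exist when methods below reference them.
    _context_members_: Dict[str, Any]
    _context_attrs_: Dict[str, Callable[..., Any]]

    @classmethod
    def context_member_names(cls) -> Dict[str, Any]:
        return dict(cls._context_members_)
```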
* Remove bare invocations of `@contextmember` and `@contextproperty`, and add typing to them
Previously `contextmember` and `contextproperty` were 2-in-1 decorators.
This meant they could be invoked either as `@contextmember` or
`@contextmember('some_string')`. This was fine until we wanted to add return
typing to the functions. In the instance where the bare decorator was used
(i.e. no `(...)` were present) an object was expected to be returned. However,
in the instance where parameters were passed on the invocation, a callable
was expected to be returned. Putting a union of both in the return type
made the invocations complain about each others' return type. To get around this
we've dropped the bare invocation as acceptable. The parentheses are now always
required, but passing a string in them is optional.
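A hedged sketch of the new decorator shape: always invoked with parentheses, with the string argument optional. The registration body is illustrative, not dbt-core's real implementation.
```
from typing import Callable, Optional, TypeVar

F = TypeVar("F", bound=Callable)


def contextmember(name: Optional[str] = None) -> Callable[[F], F]:
    def inner(func: F) -> F:
        # Illustrative registration; the real decorator marks the function as
        # a context member under the given (or inferred) name.
        func._context_member_name_ = name or func.__name__  # type: ignore[attr-defined]
        return func

    return inner


@contextmember()            # bare @contextmember is no longer accepted
def var() -> None:
    ...


@contextmember("env_var")   # explicit name still supported
def env_var_member() -> None:
    ...
```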
* WIP
* WIP
* get group and enabled added
* changelog
* cleanup
* getting measure lookup working
* missed file
* get project level working
* fix last test
* add groups to config tests
* more group tests
* fix path
* clean up manifest.py
* update error message
* fix test assert
* remove extra check
* resolve conflicts in manifest
* update manifest
* resolve conflict
* add alias
* Add compiled node properties to run_results.json
* Include compiled-node attributes in run_results.json
* Fix typo
* Bump schema version of run_results
* Fix test assertions
* Update expected run_results to reflect new attributes
* Code review changes
* Fix mypy warnings for ManifestLoader.load() (#8443)
* revert python version for docker images (#8445)
* revert python version for docker images
* add comment to not update python version, update changelog
* Bumping version to 1.7.0b1 and generate changelog
* [CT-3013] Fix parsing of `window_groupings` (#8454)
* Update semantic model parsing tests to check measure non_additive_dimension spec
* Make `window_groupings` default to empty list if not specified on `non_additive_dimension`
* Add changie doc for `window_groupings` parsing fix
* update `Number` class to handle integer values (#8306)
* add show test for json data
* oh changie my changie
* revert unnecessary change to fixture
* keep decimal class for precision methods, but return __int__ value
* jerco updates
* update integer type
* update other tests
* Update .changes/unreleased/Fixes-20230803-093502.yaml
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Improve docker image README (#8212)
* Improve docker image README
- Fix unnecessary/missing newline escapes
- Remove double whitespace between parameters
- 2-space indent for extra lines in image build commands
* Add changelog entry for #8212
* ADAP-814: Refactor prep for MV updates (#8459)
* apply reformatting changes only for #8449
* add logging back to get_create_materialized_view_as_sql
* changie
* swap trigger (#8463)
* update the implementation template (#8466)
* update the implementation template
* add colon
* Split tests into classes (#8474)
* add flaky decorator
* split up tests into classes
* revert update agate for int (#8478)
* updated typing and methods to meet mypy standards (#8485)
* Convert error to conditional warning for unversioned contracted model, fix msg format (#8451)
* first pass, tests need updates
* update proto defn
* fixing tests
* more test fixes
* finish fixing test file
* reformat the message
* formatting messages
* changelog
* add event to unit test
* feedback on message structure
* WIP
* fix up event to take in all fields
* fix test
* Fix ambiguous reference error for duplicate model names across packages with tests (#8488)
* Safely remove external nodes from manifest (#8495)
* [CT-2840] Improved semantic layer protocol satisfaction tests (#8456)
* Test `SemanticModel` satisfies protocol when none of its `Optionals` are specified
* Add tests ensuring SourceFileMetadata and FileSlice satisfy DSI protocols
* Add test asserting Defaults obj satisfies protocol
* Add test asserting SemanticModel with optionals specified satisfies protocol
* Split dimension protocol satisfaction tests into with and without optionals
* Simplify DSI Protocol import strategy in protocol satisfaction tests
* Add test asserting DimensionValidityParams satisfies protocol
* Add test asserting DimensionTypeParams satisfies protocol
* Split entity protocol satisfaction tests into with and without optionals
* Split measure protocol satisfaction tests and add measure aggregation params satisfaction test
* Split metric protocol satisfaction test into optionals specified and unspecified
Additionally, create where_filter pytest fixture
* Improve protocol satisfaction tests for MetricTypeParams and sub protocols
Specifically we added/improved protocol satisfaction tests for
- MetricTypeParams
- MetricInput
- MetricInputMeasure
- MetricTimeWindow
* Convert to using mashumaro jsonschema with acceptable performance (#8437)
* Regenerate run_results schema after merging in changes from main.
---------
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jaime Martínez Rincón <jaime@jamezrin.name>
Co-authored-by: Mike Alfare <13974384+mikealfare@users.noreply.github.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
* Test `SemanticModel` satisfies protocol when none of its `Optionals` are specified
* Add tests ensuring SourceFileMetadata and FileSlice satisfy DSI protocols
* Add test asserting Defaults obj satisfies protocol
* Add test asserting SemanticModel with optionals specified satisfies protocol
* Split dimension protocol satisfaction tests into with and without optionals
* Simplify DSI Protocol import strategy in protocol satisfaction tests
* Add test asserting DimensionValidityParams satisfies protocol
* Add test asserting DimensionTypeParams satisfies protocol
* Split entity protocol satisfaction tests into with and without optionals
* Split measure protocol satisfaction tests and add measure aggregation params satisfaction test
* Split metric protocol satisfaction test into optionals specified and unspecified
Additionally, create where_filter pytest fixture
* Improve protocol satisfaction tests for MetricTypeParams and sub protocols
Specifically we added/improved protocol satisfaction tests for
- MetricTypeParams
- MetricInput
- MetricInputMeasure
- MetricTimeWindow
* first pass, tests need updates
* update proto defn
* fixing tests
* more test fixes
* finish fixing test file
* reformat the message
* formatting messages
* changelog
* add event to unit test
* feedback on message structure
* WIP
* fix up event to take in all fields
* fix test
* add show test for json data
* oh changie my changie
* revert unnecessary change to fixture
* keep decimal class for precision methods, but return __int__ value
* jerco updates
* update integer type
* update other tests
* Update .changes/unreleased/Fixes-20230803-093502.yaml
---------
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Update semantic model parsing tests to check measure non_additive_dimension spec
* Make `window_groupings` default to empty list if not specified on `non_additive_dimension`
* Add changie doc for `window_groupings` parsing fix
* first pass
* WIP
* update issue body
* fix triggering label
* fix docs
* add better run name
* reduce complexity
* update description
* fix PR title
* point at workflow on main
* fix wording
* add label
* Update semantic model parsing test to check `create_metric = true` functionality
* Add `create_metric` boolean property to unparsed measure objects
* Begin creating metrics from measures with `create_metric = True`
* Add test ensuring partial parsing handles metrics generated from measures
* Ensure partial parsing appropriately deletes metrics generated from semantic models
* Add changie doc for addition
* Separate generated metrics from parsed metrics for partial parsing
I was doing a demo earlier today of this branch (minus this commit)
and noticed something odd. When I changed a semantic model, metrics
that should have been technically unaffected would get dropped. Basically,
if I made a change to a semantic model which had metrics in the same
file, and then ran parse, those metrics defined in the same file
would get dropped. Then with no other changes, if I ran parse again
they would come back. What was happening was that parsed metrics
and generated metrics were getting tracked the same way on the file
objects for partial parsing. In 0787a7c7b6
we began dropping all metrics tracked in a file object when changes
to semantic models were detected. Since parsed metrics and generated
metrics were being tracked together on the file object, the parsed
metrics were getting dropped as well. In this commit we begin separating
out the tracking of generated metrics and parsed metrics on the
file object, and now only drop the generated metrics when semantic
models have a detected change.
* Assert in test that semantic model partial parsing doesn't clobber regular metrics
* Replaced the FirstRunResultError and AfterFirstRunResultError events with RunResultError.
* Attempts at reasonable unit tests.
* Restore event manager after unit test.
* Support configurable delimiter for seed files, default to comma (#3990)
* Update Features-20230317-144957.yaml
* Moved "delimiter" to seed config instead of node config
* Update core/dbt/clients/agate_helper.py
Co-authored-by: Cor <jczuurmond@protonmail.com>
* Update test_contracts_graph_parsed.py
* fixed integration tests
* Added functional tests for seed files with a unique delimiter
* Added docstrings
* Added a test for an empty string configured delimiter value
* whitespace
* ran black
* updated changie entry
* Update Features-20230317-144957.yaml
---------
Co-authored-by: Cor <jczuurmond@protonmail.com>
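For illustration, reading a pipe-delimited seed with agate directly; the real change threads the configured delimiter through dbt's agate helper, and the file path here is made up.
```
import agate

# seeds/products.csv uses "|" instead of the default ","
table = agate.Table.from_csv("seeds/products.csv", delimiter="|")
print(table.column_names)
```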
* add param to control maxBytes for single dbt.log file
* nits
* nits
* Update core/dbt/cli/params.py
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
---------
Co-authored-by: Peter Webb <peter.webb@dbtlabs.com>
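Illustrative only: the new parameter maps onto the `maxBytes` argument of a rotating file handler like the one backing `dbt.log`; the wiring shown here is plain standard-library usage, not dbt's internals.
```
import logging
from logging.handlers import RotatingFileHandler

handler = RotatingFileHandler("dbt.log", maxBytes=10 * 1024 * 1024, backupCount=5)
logging.getLogger("dbt").addHandler(handler)
```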
* Add test ensuring `warn_error_options` is dictified in `invocation_args_dict` of contexts
* Add dictification specific to `warn_error_options` in `args_to_dict`
* Changie doc for serialization changes of warn_error_options
* Add test asserting that a macro with the word `materialization` in its name doesn't cause issues
* Let macro names include the word `materialization`
Previously we were checking whether a macro included a materialization
based on whether the macro name included the word `materialization`. However,
a macro whose name includes the word `materialization` isn't guaranteed to
actually contain a materialization, and a macro that doesn't have
`materialization` in the name isn't guaranteed not to contain one.
This change detects macros with materializations based on the
detected block type of the macro.
* Add changie doc materialization in macro detection
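A hedged sketch of the detection change, with a stand-in `Block` type rather than dbt's real jinja block extraction:
```
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Block:
    block_type: str  # e.g. "macro", "materialization"


def contains_materialization(blocks: Iterable[Block]) -> bool:
    # Decide based on the parsed block types, not on the macro's name.
    return any(block.block_type == "materialization" for block in blocks)
```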
* Add test for checking that `_connection_exception_retry` handles `EOFError`s
* Update `_connection_exception_retry` to handle `EOFError` exceptions
* Add changie docs for `_connection_exception_retry` handling `EOFError` exceptions
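A minimal sketch of the retry shape, assuming a helper along these lines; the exact signature and exception set in dbt-core may differ.
```
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connection_exception_retry(fn: Callable[[], T], max_attempts: int, attempt: int = 0) -> T:
    try:
        return fn()
    except (EOFError, ConnectionError):  # EOFError is the newly handled case
        if attempt + 1 >= max_attempts:
            raise
        time.sleep(1)
        return connection_exception_retry(fn, max_attempts, attempt + 1)
```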
* applied new integration tests to existing framework
* applied new integration tests to existing framework
* generalized tests for reusability in adapters; fixed drop index issue
* generalized tests for reusability in adapters; fixed drop index issue
* removed unnecessary overrides in tests
* adjusted import to allow for usage in adapters
* adjusted import to allow for usage in adapters
* removed fixture artifact
* generalized the materialized view fixture which will need to be specific to the adapter
* unskipped tests in the test runner package
* corrected test condition
* corrected test condition
* added missing initial build for the relation type swap tests
* add env vars for datadog ci visibility
* modify pytest command for tracing
* fix posargs
* move env vars to job that needs them
* add test repeater to DD
* swap flags
* Bump version support for `dbt-semantic-interfaces` to `~=0.1.0rc1`
* Add tests for asserting WhereFilter satisfies protocol
* Add `call_parameter_sets` to `WhereFilter` class to satisfy protocol
* Changie doc for moving to DSI 0.1.0rc1
* [CT-2822] Fix `NonAdditiveDimension` Implementation (#8089)
* Add test to ensure `NonAdditiveDimension` implementation satisfies protocol
* Fix typo in `NonAdditiveDimension`: `window_grouples` -> `window_groupings`
* Add changie doc for typo fix in NonAdditiveDimension
* Add metrics from metric type params to a metric's depends_on
* Add Lookup utility for finding `SemanticModel`s by measure names
* Add the `SemanticModel` of a `Metric`'s measure property to the `Metric`'s `depends_on`
* Add `SemanticModelConfig` to `SemanticModel`
Some tests were failing due to `Metric`s referencing `SemanticModel`s.
Specifically there was a check to see if a referenced node was disabled,
and because `SemanticModel`s didn't have a `config` holding the `enabled`
boolean attr, core would blow up.
* Checkpoint on test fixing
* Correct metricflow_time_spine_sql in test fixtures
* Add check for `SemanticModel` nodes in `Linker.link_node`
Now that `Metrics` depend on `SemanticModels` and `SemanticModels`
have their own dependencies on `Models` they need to be checked for
in the `Linker.link_node`. I forget the details but things blow up
without it. Basically it adds the SemanticModels to the dependency
graph.
* Fix artifacts/test_previous_version_state.py tests
* fix access/test_access.py tests
* Fix function metric tests
* Fix functional partial_parsing tests
* Add time dimension to semantic model in exposures fixture
* Bump DSI version to a minimum of 0.1.0dev10
DSI 0.1.0dev10 fixes an incoherence issue in DSI around `agg_time_dimension`
setting. This incoherence was that `measure.agg_time_dimension` was being
required, even though it was no longer supposed to be a required attribute
(it's specifically typed as optional in the protocol). This was causing
a handful of tests to fail because the `semantic_model.defaults.agg_time_dimension`
value wasn't being respected. Pulling in the fix from DSI 0.1.0dev10 fixes
the issue.
Interestingly after bumping the DSI version, the integration tests were
still failing. If I ran the tests individually they passed though. To get
`make integration` to run properly I ended up having to clear my `.tox`
cache, as it seems some outdated state was being persisted.
* Add test specifically for checking the `depends_on` of `Metric` nodes
* Re-enable test asserting calling metric nodes in models
* Migrate `checked_agg_time_dimension` to `checked_agg_time_dimension_for_measure`
DSI 0.1.0dev10 moved `checked_agg_time_dimension` from the `Measure`
protocol to the `SemanticModel` protocol as `checked_agg_time_dimension_for_measure`.
This finishes a change where, for a given measure, either the `Measure.agg_time_dimension`
or the measure's parent `SemanticModel.defaults.agg_time_dimension` needs to be
set, instead of always requiring the measure's `Measure.agg_time_dimension`.
* Add changie doc for populating metric
---------
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
The original implementation of validate_sql was called dry_run,
but in the rename the test classes and much of their associated
documentation still retained the old naming.
This is mainly cosmetic, but since these test classes will be
imported into adapter repositories we should fix this now before
the wrong name proliferates.
* Add dry_run method to base adapter with implementation for SQLAdapters
resolves #7839
In the CLI integration, MetricFlow will issue dry run queries as
part of its warehouse-level validation of the semantic manifest,
including all semantic model and metric definitions.
In most cases, issuing an `explain` query is adequate, however,
BigQuery does not support the `explain` keyword and so we cannot
simply pre-pend `explain` to our input queries and expect the
correct behavior across all contexts.
This commit adds a dry_run() method to the BaseAdapter which mirrors
the execute() method in that it simply delegates to the ConnectionManager.
It also adds a working implementation to the SQLConnectionManager and
includes a few test cases for adapter maintainers to try out on their own.
The current implementation should work out of the box with most
of our adapters. BigQuery will require us to implement the dry_run
method on the BigQueryConnectionManager, and community-maintained
adapters can opt in by enabling the test and ensuring their own
implementations work as expected.
Note - we decided to make these concrete methods that throw runtime
exceptions for direct descendants of BaseAdapter in order to avoid
forcing community adapter maintainers to implement a method that does
not currently have any use cases in dbt proper.
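A hedged sketch of that delegation pattern (later renamed `validate_sql`), with illustrative class names and an `explain`-based default:
```
class SQLConnectionManagerSketch:
    def execute(self, sql: str):
        raise NotImplementedError("adapter-specific")

    def dry_run(self, sql: str):
        # Default SQL behavior: ask the engine to plan the query without
        # running it. BigQuery-style engines need their own implementation.
        return self.execute(f"explain {sql}")


class BaseAdapterSketch:
    def __init__(self, connections: SQLConnectionManagerSketch) -> None:
        self.connections = connections

    def dry_run(self, sql: str):
        # Mirrors execute(): simply delegate to the connection manager.
        return self.connections.dry_run(sql)
```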
* Switch dry_run implementation to be macro-based
The common pattern for engine-specific SQL statement construction
in dbt is to provide a default macro which can then be overridden
on a per-adapter basis by either adapter maintainers or end users.
The advantage of this is users can take advantage of alternative
SQL syntax for performance or other reasons, or even to enable
local usage if an engine relies on a non-standard expression and
the adapter maintainer has not updated the package.
Although there are some risks here they are minimal, and the benefit
of added expressiveness and consistency with other similar constructs
is clear, so we adopt this approach here.
* Improve error message for InvalidConnectionError in test_invalid_dry_run.
* Rename dry_run to validate_sql
The validate_sql name has less chance of colliding with dbt's
command nomenclature, both now and in some future where we have
dry-run operations.
* Rename macro and test files to validate_sql
* Fix changelog entry
* add permissions
* replace db setup
* try with bash instead of just pytest flags
* fix test command
* remove spaces
* remove force-flaky flag
* add starting values
* add mac and windows postgres install
* define use bash
* fix typo
* update output report
* tweak last if condition
* clarify failures/successful runs
* print running success and failure tally
* just output pytest instead of capturing it
* set shell to not exit immediately on exit code
* add formatting around results for easier scanning
* more output formatting
* add matrix to unlock parallel runners
* increase to ten batches
* update debug
* add comment
* clean up comments
* Remove `create_metric` as a public facing `SemanticModel.Measure` property
We want to add `create_metric`. The `create_metric` property will be
incredibly useful. However, at this time it is not hooked up, and we don't
have time to hook it up before the code freeze for 1.6.0rc of core. As
it doesn't do anything, we shouldn't allow people to specify it, because
it won't do what one would expect. We plan on making the implementation
of `create_metric` a priority for 1.7 of core.
* Changie doc for the removal of create_metric property
* add negative test case
* changie
* missed a comma
* Update changelog entry
* Add a negative number (rather than subtract a positive number)
---------
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* Fix accidental propagation of log messages to root logger.
* Add changelog entry
* Fixed an issue which blocked debug logging to stdout with --log-level debug, unless --debug was also used.
* Use dbt-semantic-interface validations on semantic models and metrics defined in Core.
* Remove empty test, since semantic models don't generate any validation warnings.
* Add changelog entry.
* Temporarily remove requirement that there must be semantic models defined in order to define metrics
* add interface changes section to the PR template
* update entire template
* split up choices for tests and interfaces
* minor formatting change
* add line breaks
* actually put in line breaks
* revert split choices in checklist
* add line breaks to top
* move docs link
* typo
* ct-2551: adds old and unmodified state selection methods
* ct-2551: update check_unmodified_content to simplify
* add unit and integration tests for unmodified and old
* add changelog entry
* ct-2551: reformatting of contingent adapter assignment list
* UnifiedToUTC
* Check proximity of dbt_valid_to and deleted time
* update the message to print if the assertion fails
* add CHANGELOG entries
* test only if naive
* Added comments about naive and aware
* Generalize comparison of datetimes that are "close enough"
---------
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* Fix tests fixtures which were using measures for metric numerator/denominators
In our previous upgrade to DSI dev7, numerators and denominators for
metrics switched from being `MetricInputMeasure`s to `MetricInput`s.
I.e. metric numerators and denominators should reference other metrics,
not semantic model measures. However, at that time, we weren't actually
doing anything with numerators and denominators in core, so no issue
got raised. The changes we are about to make, though, are going to surface
these issues.
* Add tests for ensuring a metric's `input_measures` gets properly populated
* Begin populating `metric.type_params.input_measures`
This isn't my favorite bit of code, mostly because there are checks for
existence which really should be handled before this point; however, a
good point for that to happen doesn't exist currently. For instance,
in an ideal world, by the time we get to `_process_metric_node`, if a
metric is of type `RATIO`, the numerator and denominator should be
guaranteed to exist.
* Update test checking that disabled metrics aren't added to the manifest metrics
We updated from the metric `number_of_people` to `average_tenure_minus_people` for
this test because disabling `number_of_people` raised other exceptions at parse
time due to a metric referencing a disabled metric. The metric `average_tenure_minus_people`
is a leaf metric, and so for this test, it is a better candidate.
* Update `test_disabled_metric_ref_model` to have more disabled metrics
There are metrics which depend on the metric `number_of_people`. If
`number_of_people` is disabled without the metrics that depend on it
being disabled, then a different (expected) exception would be raised
than the one this test is testing for. Thus we've disabled those
downstream metrics.
* Add test which checks that metrics depending on disabled metrics raise an exception
* Add changie doc for populating metric input measures
* Add merge incremental strategy
* Expect merge to be a valid strategy for Postgres
---------
Co-authored-by: Anders Swanson <anders.swanson@dbtlabs.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* CT-2711: Add remove_tests() call to delete_schema_source() so that call sites are more uniform with other node deletion call sites. This will enable further code factorization.
* CT-2711: Factor repeated code section (mostly) out of PartialParsing.handle_schema_file_changes()
* CT-2711: Factor a repeated code section out of schedule_nodes_for_parsing()
* Update semantic model parsing test to check measure agg params
* Make `use_discrete_percentile` and `use_approximate_percentile` non optional and default false
This was a mistake in our implementation of the MeasureAggregationParams.
We had defined them as optional and defaulting to `None`. However, as the
protocol states, they cannot be `None`; they must be a boolean value.
Thus we now ensure they are.
* Add changie doc for measure percentile fixes
* Update semantic model parsing test to check different measure expr types
* Allow semantic model measure exprs to be defined with ints and bools in yaml
Sometimes the expr for a measure can be defined in YAML with a bool or an int.
However, we were only allowing for strings. There was a workaround for this,
which was wrapping your bool or int in double quotes in the YAML, but
this can be fairly annoying for the end user.
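A minimal sketch of the normalization this implies, with an illustrative function name rather than dbt-core's actual parsing code:
```
from typing import Optional, Union


def normalize_measure_expr(expr: Optional[Union[str, int, bool]]) -> Optional[str]:
    # YAML parses `expr: 1` as an int and `expr: true` as a bool; coerce both
    # to the string form the rest of parsing expects.
    return None if expr is None else str(expr)
```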
* Changie doc for fixing measure expr yaml specification
* CT-2651: Add Semantic Models to the manifest and various pieces of graph linking code
* CT-2651: Finish integrating semantic models into the partial parsing system
* CT-2651: More semantic model details for partial parsing
* CT-2651: Remove merged references to project_dependencies
* CT-2651: Revise changelog entry
* CT-2651: Disable unit test until partial parsing of semantic models is complete.
* CT-2651: Temporarily disable an apparently-flaky test.
* Add some comments to methods constructing Project/RuntimeConfig
* Save flag that packages dict came from dependencies.yml
* Test for not rendering packages_dict
* Changie
* Ensure packages_yml_dict and dependencies_yml_dict are dictionaries
* Ensure "packages" passed to render_packages is a dict
* Bump DSI dependency version to 0.1.0dev7
* Cleaner DSI type enum importing
Previously we had to use individual import paths for each type enum
that dbt-semantic-interfaces provided. However, dbt-semantic-interfaces
has been updated to allow for importing all the type enums from a
singular path.
* Cleaner DSI protocol importing
Previously we had to use individual import paths for each protocol
that dbt-semantic-interfaces provided. However, dbt-semantic-interfaces
has been updated to allow for importing all the protocols from a
singular path.
* Add semantic protocol satisfaction test for metric type params
* Replace `metric.type_params.measures` with `metric.type_params.input_measures`
In DSI 0.1.0dev7 `measures` on metric type params became `input_measures`.
Additionally `input_measures` should not be user specified but something
we compile at parse time, thus we've removed it from `UnparsedMetricTypeParams`.
Finally, actually populating `input_measures` is somewhat complicated due
to the existence of derived metrics, thus that work is being pushed
off to CT-2707.
* Update metric numerator/denominator to be `MetricInput`s
In DSI 0.1.0dev7 `metric.type_params.numerator` and `metric.type_params.denominator`
switched from being `MetricInputMeasure`s to `MetricInput`s. This
commit reflects that change. Additionally, some helper functions on
metric type params were removed related to the numerator and denominator.
Thus we've removed them respectively in this commit.
* Add protocol satisfaction tests for `MetricInput` and `MetricInputMeasure`
* Add `post_aggregation_reference` to `MetricInput` and fix typo in `MetricInputMeasure`
DSI 0.1.0dev7 added `post_aggregation_reference` to the `MetricInput` protocol,
thus we've added it to our implementation in core. Additionally, we had a typo
in a method name in our implementation of `MetricInputMeasure`, ironically
a similar function to the one we've added for `MetricInput`
* Changie doc for upgraded to DSI 0.1.0dev7
* Fix parsing of metric numerator and denominator in schema_yaml_readers
Previously numerator and denominator of a metric were `MetricInputMeasure`s,
now they're `MetricInput`s. Changing the typing isn't enough though.
We have parsing functions in `schema_yaml_readers` which were specifically
parsing the numerator and denominator as if they were `MetricInputMeasure`s.
Thus we had to update the schema_yaml_readers to parse them as `MetricInput`s.
During this we found some logic in a parsing function, `_get_metric_inputs`, which
could be abstracted into newly added functions.
* Upgrade to dbt-semantic-interfaces v0.1.0dev5
This is a fairly simple upgrade. Literally it's just pointing at the
new versions. The v3 schemas are directly compatible with v5 because
there were no protocol-level changes from v3 to v5. All the changes were
updates to tools MetricFlow uses from DSI, not tools that we ourselves
are using in core (yet).
* Add changie doc for DSI version bump
* Update metric filters in testing fixtures
I incorrectly wrote the tests such that they didn't include curly
braces, `{{..}}`, around things like `dimension(..)` for filters.
This updates the test fixtures to have proper filter specifications.
* Skip jinja rendering of `filter` key of metrics
Note that `filter` can show up in multiple places: as a root key
on a metric (`metric.filter`), on a metric input (`metric.type_params.metrics[x].filter`),
denominator (`metric.type_params.denominator.filter`), numerator
(`metric.type_params.numerator.filter`), and a metric input measure
(`metric.type_params.measure.filter` and `metric.type_params.measures[x].filter`).
In this commit we skip all of them :)
* Add changie doc for skipping jinja parsing for metric filters
* Update yaml renderer test for metrics
* Add AdapterRegistered event log message
* Add AdapterRegistered to unit test
* make versioning and logging consistent
* make versioning and logging consistent
* add to_version_string
* remove extra equals
* format fire_event
* Add tests to ensure our semantic layer nodes satisfy the DSI protocols
These tests create runtime checkable versions of the protocols defined in
DSI. Thus we can instantiate instances of our semantic layer nodes and
use `isinstance` to check that they satisfy the protocol. These `runtime_checkable`
versions of the protocols should only exist in testing and should never
be used in the actual package code.
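A minimal sketch of the technique, with an illustrative protocol and node rather than the real DSI definitions:
```
from typing import Protocol, runtime_checkable


@runtime_checkable
class RuntimeCheckableEntity(Protocol):
    @property
    def name(self) -> str:
        ...


class EntityNode:
    def __init__(self, name: str) -> None:
        self.name = name


# Only used in tests: structural satisfaction is verified via isinstance().
assert isinstance(EntityNode("user"), RuntimeCheckableEntity)
```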
* Update the `Dimension` object of `SemanticModel` node to match DSI protocol
* Make `UnparsedDimension` more strict and update schema readers accordingly
* Update the `Entity` object of `SemanticModel` node to match DSI protocol
* Make `UnparsedEntity` more strict and update schema readers accordingly
* Update the `Measure` object of `SemanticModel` node to match DSI protocol
* Make `UnparsedMeasure` more strict and update schema readers accordingly
* Update the `SemanticModel` node to match DSI protocol
A lot of the additions are helper functions which we don't actually
use in core. This is a known issue. We're in the process of removing
a fair number of them from the DSI protocol spec. However, in the meantime
we need to implement them to satisfy the protocol unfortunately.
* Make `UnparsedSemanticModel` more strict and update schema readers accordingly
* Changie entry for updating SemanticModel node
* Use contextvar to store and get project_root for path selector method
* Changie
* Modify test to check Path selector with project-dir
* Don't set cv_project_root in base task if no config
* Refactor MetricNode definition to satisfy DSI Metric protocol
* Fix tests involving metrics to have updated properties
* Update UnparsedMetricNode to match new metric yaml spec
* Update MetricParser for new unparsed and parsed MetricNodes
* Remove `rename_metric_attr`
We're intentionally breaking the spec. There will be a separate tool provided
for migrating from dbt-metrics to dbt x metricflow. This bit of code was renaming
things like `type` to `calculation_method`. This is problematic because `type` is
on the new spec, while `calculation_method` is not. Additionally, since we're
intentionally breaking the spec, this function, `rename_metric_attr`, shouldn't be
used for any property renaming.
* Fix tests for Metrics (1.6) changes
* Regenerated v10 manifest schema and associated functional test artifact state
* Remove no longer needed tests
* Skip / comment out tests for metrics functionality that we'll be implementing later
* Begin outputting semantic manifest artifact on every run
* Drop metrics during upgrade_manifest_json if manifest is v9 or before
* Update properties of `minimal_parsed_metric_dict` to match new metric spec
* Add changie entry for metric node breaking changes
* Add semantic model nodes to semantic manifest
* Add dbt-semantic-interfaces as a dependency
With the integration with MetricFlow we're taking a dependency on
`dbt-semantic-interfaces` which acts as the source of truth for
protocols which MetricFlow and dbt-core need to agree on. Additionally
we're hard pinning to 0.1.0.dev3 for now. We plan on having a less
restrictive specification when dbt-core 1.6 hits GA.
* Add implementations of DSI Metadata protocol to nodes.py
* CT-2521: Initial work on adding new SemanticModel node
* CT-2521: Second rough draft of SemanticModels
* CT-2521: Update schema v10
* CT-2521: Update unit tests for new SemanticModel collection in manifest
* CT-2521: Add changelog entry
* CT-2521: Final touches on initial implementation of SemanticModel parsing
* Change name of Metadata class to reduce potential for confusion
* Remove "Replaceable" inheritance, per review
* CT-2521: Rename internal variables from semantic_models to semantic_nodes
* CT-2521: Update manifest schema to reflect change
---------
Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* changie
* ADAP-387: Stub materialized view as a materialization (#7211)
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* remove unneeded return statement, rename directory
* remove unneeded ()
* responding to some pr feedback
* adjusting order of events for mv base work
* move up pre-existing drop of backup
* change relation type to view to be consistent
* add base test case
* fix jinja exception message expression, basic test passing
* response to feedback, removal of refresh in favor of combined create_as, etc.
* swapping to api layer and strategies for default implementation (basing off postgres, redshift)
* remove strategy to limit need for now
* remove unneeded story level changelog entry
* add strategies to conditional in place of old macros
* macro name fix
* rename refresh macro in api level
* align names between postgres and default to same convention
* align names between postgres and default to same convention
* change a create call to full refresh
* pull adapter rename into strategy, add backup_relation as optional arg
* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic
* updating to feature main
---------
Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
* ADAP-387: reverting db_api implementation (#7322)
* changie
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* remove unneeded return statement, rename directory
* remove unneeded ()
* responding to some pr feedback
* adjusting order of events for mv base work
* move up pre-existing drop of backup
* change relation type to view to be consistent
* add base test case
* fix jinja exception message expression, basic test passing
* response to feedback, removal of refresh in favor of combined create_as, etc.
* swapping to api layer and strategies for default implementation (basing off postgres, redshift)
* remove strategy to limit need for now
* remove unneeded story level changelog entry
* add strategies to conditional in place of old macros
* macro name fix
* rename refresh macro in api level
* align names between postgres and default to same convention
* change a create call to full refresh
* pull adapter rename into strategy, add backup_relation as optional arg
* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic
* updating to feature main
* removing db_api and strategies directories in favor of matching current materialization setups
* macro name change
* revert to current approach for materializations
* added tests
* added `is_materialized_view` to `BaseRelation`
* updated materialized view stored value to snake case
* typo
* moved materialized view tests into adapter test framework
* add enum to relation for comparison in jinja
---------
Co-authored-by: Mike Alfare <mike.alfare@dbtlabs.com>
* ADAP-391: Add configuration change option (#7272)
* changie
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* move up pre-existing drop of backup
* change relation type to view to be consistent
* add base test case
* fix jinja exception message expression, basic test passing
* align names between postgres and default to same convention
* init set of Enum for config
* work on initial Enum class for on_configuration_change, basing it off ConstraintTypes, which is also a str-based Enum in core
* add on_configuration_change to unit test expected values
* make suggested name change to Enum class
* add on_configuration_change to some integration tests
* add on_configuration_change to expected_manifest to pass functional tests
* added `is_materialized_view` to `BaseRelation`
* updated materialized view stored value to snake case
* moved materialized view tests into adapter test framework
* add alter materialized view macro
* change class name, and config setup
* play with field setup for on_configuration_change
* add method for default selection in enum class
* renamed get_refresh_data_in_materialized_view_sql to align with experimental package
* changed expected values to default string
* added in `on_configuration_change` setting
* change ignore to skip
* updated default option for on_configuration_change on NodeConfig
* removed explicit calls to enum values
* add test setup for testing fail config option
* updated `config_updates` to `configuration_changes` to align with `on_configuration_change` name
* setup configuration change framework
* skipped tests that are expected to fail without adapter implementation
* cleaned up log checks
---------
Co-authored-by: Mike Alfare <mike.alfare@dbtlabs.com>
* ADAP-388: Stub materialized view as a materialization - postgres (#7244)
* move the body of the default macros into the postgres implementation, throw errors if the default is used, indicating that materialized views have not been implemented for that adapter
---------
Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
* ADAP-402: Add configuration change option - postgres (#7334)
* changie
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* remove unneeded return statement, rename directory
* remove unneeded ()
* responding to some pr feedback
* adjusting order of events for mv base work
* move up pre-existing drop of backup
* change relation type to view to be consistent
* add base test case
* fix jinja exception message expression, basic test passing
* added materialized view stubs and test
* response to feedback, removal of refresh in favor of combined create_as, etc.
* updated postgres to use the new macros structure
* swapping to api layer and strategies for default implementation (basing off postgres, redshift)
* remove strategy to limit need for now
* remove unneeded story level changelog entry
* add strategies to conditional in place of old macros
* macro name fix
* rename refresh macro in api level
* align names between postgres and default to same convention
* change a create call to full refresh
* pull adapter rename into strategy, add backup_relation as optional arg
* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic
* init copy of pr 387 to begin 391 implementation
* init set of Enum for config
* work on initial Enum class for on_configuration_change base it off ConstraintTypes which is also a str based Enum in core
* remove postgres-specific materialization in favor of core default materialization
* update db_api to use native types (e.g. str) and avoid direct calls to relation or config, which would alter the run order for all db_api dependencies
* add clarifying comment as to why we have a single test that's expected to fail at the dbt-core layer
* add on_configuration_change to unit test expected values
* make suggested name change to Enum class
* add on_configuration_change to some integration tests
* add on_configuration_change to expected_manifest to pass functional tests
* removing db_api and strategies directories in favor of matching current materialization setups
* macro name change
* revert to current approach for materializations
* revert to current approach for materializations
* added tests
* move materialized view logic into the `/materializations` directory in line with `dbt-core`
* moved default macros in `dbt-core` into `dbt-postgres`
* added `is_materialized_view` to `BaseRelation`
* updated materialized view stored value to snake case
* moved materialized view tests into adapter test framework
* updated materialized view tests to use adapter test framework
* add alter materialized view macro
* add alter materialized view macro
* change class name, and config setup
* change class name, and config setup
* play with field setup for on_configuration_change
* add method for default selection in enum class
* renamed get_refresh_data_in_materialized_view_sql to align with experimental package
* changed expected values to default string
* added in `on_configuration_change` setting
* change ignore to skip
* added in `on_configuration_change` setting
* updated default option for on_configuration_change on NodeConfig
* updated default option for on_configuration_change on NodeConfig
* fixed list being passed as string bug
* removed explicit calls to enum values
* removed unneeded test class
* fixed on_configuration_change to be picked up appropriately
* add test setup for testing fail config option
* remove breakpoint, uncomment tests
* update skip scenario to use empty strings
* update skip scenario to avoid using sql at all, remove extra whitespace in some templates
* push up initial addition of indexes for mv macro
* push slight change up
* reverting alt macro and moving the do create_index call to be more in line with other materializations
* Merge branch 'feature/materialized-views/ADAP-2' into feature/materialized-views/ADAP-402
# Conflicts:
# core/dbt/contracts/graph/model_config.py
# core/dbt/include/global_project/macros/materializations/models/materialized_view/alter_materialized_view.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/create_materialized_view_as.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/get_materialized_view_configuration_changes.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/materialized_view.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/refresh_materialized_view.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/replace_materialized_view.sql
# plugins/postgres/dbt/include/postgres/macros/materializations/materialized_view.sql
# tests/adapter/dbt/tests/adapter/materialized_views/base.py
# tests/functional/materializations/test_materialized_view.py
* merge feature branch into story branch
* merge feature branch into story branch
* added indexes into the workflow
* fix error in jinja that caused print error
* working on test messaging and skipping tests that might not fit quite into current system
* add drop and show macros for indexes
* add drop and show macros for indexes
* add logic to determine the indexes to create or drop
* pulled index updates through the workflow properly
* convert configuration changes to fixtures, implement index changes into tests
* created Model dataclass for readability, added column to swap index columns for testing
* fixed typo
---------
Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
* ADAP-395: Implement native materialized view DDL (#7336)
* changie
* changie
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* init attempt at mv and basic forms of helper macros by mixing view and experimental mv sources
* remove unneeded return statement, rename directory
* remove unneeded ()
* responding to some pr feedback
* adjusting order of events for mv base work
* move up pre-existing drop of backup
* change relation type to view to be consistent
* add base test case
* fix jinja exception message expression, basic test passing
* added materialized view stubs and test
* response to feedback, removal of refresh in favor of combined create_as, etc.
* updated postgres to use the new macros structure
* swapping to api layer and strategies for default implementation (basing off postgres, redshift)
* remove strategy to limit need for now
* remove unneeded story level changelog entry
* add strategies to conditional in place of old macros
* macro name fix
* rename refresh macro in api level
* align names between postgres and default to same convention
* align names between postgres and default to same convention
* change a create call to full refresh
* pull adapter rename into strategy, add backup_relation as optional arg
* minor typo fix, add intermediate relation to refresh strategy and initial attempt at further conditional logic
* init copy of pr 387 to begin 391 implementation
* updating to feature main
* updating to feature main
* init set of Enum for config
* work on initial Enum class for on_configuration_change base it off ConstraintTypes which is also a str based Enum in core
* remove postgres-specific materialization in favor of core default materialization
* update db_api to use native types (e.g. str) and avoid direct calls to relation or config, which would alter the run order for all db_api dependencies
* add clarifying comment as to why we have a single test that's expected to fail at the dbt-core layer
* add on_configuration_change to unit test expected values
* make suggested name change to Enum class
* add on_configuration_change to some integration tests
* add on_configuration_change to expected_manifest to pass functional tests
* removing db_api and strategies directories in favor of matching current materialization setups
* macro name change
* revert to current approach for materializations
* revert to current approach for materializations
* added tests
* move materialized view logic into the `/materializations` directory in line with `dbt-core`
* moved default macros in `dbt-core` into `dbt-postgres`
* added `is_materialized_view` to `BaseRelation`
* updated materialized view stored value to snake case
* typo
* moved materialized view tests into adapter test framework
* updated materialized view tests to use adapter test framework
* add alter materialized view macro
* add alter materialized view macro
* added basic sql to default macros, added postgres-specific sql for alter scenario, stubbed a test case for index update
* change class name, and config setup
* change class name, and config setup
* play with field setup for on_configuration_change
* add method for default selection in enum class
* renamed get_refresh_data_in_materialized_view_sql to align with experimental package
* changed expected values to default string
* added in `on_configuration_change` setting
* change ignore to skip
* added in `on_configuration_change` setting
* updated default option for on_configuration_change on NodeConfig
* updated default option for on_configuration_change on NodeConfig
* fixed list being passed as string bug
* fixed list being passed as string bug
* removed explicit calls to enum values
* removed explicit calls to enum values
* removed unneeded test class
* fixed on_configuration_change to be picked up appropriately
* add test setup for testing fail config option
* remove breakpoint, uncomment tests
* update skip scenario to use empty strings
* update skip scenario to avoid using sql at all, remove extra whitespace in some templates
* push up initial addition of indexes for mv macro
* push slight change up
* reverting alt macro and moving the do create_index call to be more in line with other materializations
* Merge branch 'feature/materialized-views/ADAP-2' into feature/materialized-views/ADAP-402
# Conflicts:
# core/dbt/contracts/graph/model_config.py
# core/dbt/include/global_project/macros/materializations/models/materialized_view/alter_materialized_view.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/create_materialized_view_as.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/get_materialized_view_configuration_changes.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/materialized_view.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/refresh_materialized_view.sql
# core/dbt/include/global_project/macros/materializations/models/materialized_view/replace_materialized_view.sql
# plugins/postgres/dbt/include/postgres/macros/materializations/materialized_view.sql
# tests/adapter/dbt/tests/adapter/materialized_views/base.py
# tests/functional/materializations/test_materialized_view.py
* merge feature branch into story branch
* merge feature branch into story branch
* added indexes into the workflow
* fix error in jinja that caused print error
* working on test messaging and skipping tests that might not fit quite into current system
* Merge branch 'feature/materialized-views/ADAP-2' into feature/materialized-views/ADAP-395
# Conflicts:
# core/dbt/include/global_project/macros/materializations/models/materialized_view/get_materialized_view_configuration_changes.sql
# plugins/postgres/dbt/include/postgres/macros/adapters.sql
# plugins/postgres/dbt/include/postgres/macros/materializations/materialized_view.sql
# tests/adapter/dbt/tests/adapter/materialized_views/test_on_configuration_change.py
# tests/functional/materializations/test_materialized_view.py
* moved postgres implementation into plugin directory
* update index methods to align with the configuration update macro
* added native ddl to postgres macros
* removed extra docstring
* updated references to View, now references MaterializedView
* decomposed materialization into macros
* refactor index create statement parser, add exceptions for unexpected formats
* swapped conditional to check for positive state
* removed skipped test now that materialized view is being used
* return the results and logs of the run so that additional checks can be applied at the adapter level, add check for refresh to a test
* add check for indexes in particular for apply on configuration scenario
* removed extra argument
* add materialized views to get_relations / list_relations
* typos in index change logic
* moved full refresh check inside the build sql step
---------
Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
* removing returns from tests to stop logs from printing
* moved test cases into postgres tests, left non-test functionality in base as new methods or fixtures
* fixed overwrite issue, simplified assertion method
* updated import order to standard
* fixed test import paths
* updated naming convention for proper test collection with the test runner
* still trying to make the test runner happy
* rewrite index updates to use a better source in Postgres
* break out a large test suite as a separate run
* update `skip` and `fail` scenarios with more descriptive results
* typo
* removed call to skip status
* reverting `exceptions_jinja.py`
* added FailFastError back, the right way
* removed PostgresIndex in favor of the already existing PostgresIndexConfig, pulled it into its own file to avoid circular imports
* removed assumed models in method calls, removed odd insert records and replaced with get row count
* fixed index issue, removed some indirection in testing
* made test more readable
* remove the "apply" from the tests and put it on the base as the default
* generalized assertion for reuse with dbt-snowflake, fixed bug in record count utility
* fixed type to be more generic to accommodate adapters with their own relation types
* fixed all the broken index stuff
* updated on_configuration_change to use existing patterns
* updated on_configuration_change to use existing patterns
* reflected update in tests and materialization logic
* reflected update in tests and materialization logic
* reverted the change to create a config object from the option object, using just the option object now
* reverted the change to create a config object from the option object, using just the option object now
* modelled database objects to support monitoring all configuration changes
* updated "skip" to "continue", throw an error on non-implemented macro defaults
* updated "skip" to "continue", throw an error on non-implemented macro defaults
* updated "skip" to "continue", throw an error on non-implemented macro defaults
* updated "skip" to "continue", throw an error on non-implemented macro defaults
* reverted centralized framework, retained a few reusable base classes
* updated names to be more consistent
* readability updates
* added readme specifying that `relation_configs` only supports materialized views for now
---------
Co-authored-by: Matthew McKnight <matthew.mcknight@dbtlabs.com>
Co-authored-by: Matthew McKnight <91097623+McKnight-42@users.noreply.github.com>
* --connection-flag
* Standardize the plugin functions used by DebugTask
* Cleanup redundant code and help logic along.
* Add more output tests to add logic coverage and formatting.
* Code review
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Fix names within functional test
* Changelog entry
* Test for implementation of null-safe equals comparison
* Remove duplicated where filter
* Fix null-safe equals comparison
* Fix tests for `concat` and `hash` by using empty strings (`''`) instead of `null`
* Remove macro namespace interpolation
* Include null checks in utils test base
* Add tests for the schema test
* Add tests for this macro
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Honor `--skip-profile-setup` parameter when inside an existing project
* Use project name as the profile name
* Use separate file connections for reading and writing
* Raise a custom exception when no adapters are installed
* Test skipping interactive profile setup when inside a dbt project
* Replace `assert_not_called()` since it does not work
* Verbose CLI argument for skipping profile setup
* Use separate file connections for reading and writing
* Check empty list in a Pythonic manner
* CT-2461: Work toward model deprecation
* CT-2461: Remove unneeded conversions
* CT-2461: Fix up unit tests for new fields, correct a couple oversights
* CT-2461: Remaining implementation and tests for model/ref deprecation warnings
* CT-2461: Changelog entry for deprecation warnings
* CT-2461: Refine datetime handling and tests
* CT-2461: Fix up unit test data
* CT-2461: Fix some more unit test data.
* CT-2461: Fix merge issues
* CT-2461: Code review items.
* CT-2461: Improve version -> str conversion
* Allow missing `profiles.yml` for `dbt deps` and `dbt init`
* Some commands allow the `--profiles-dir` to not exist
* Remove fix to verify that CI tests work
* Allow missing `profiles.yml` for `dbt deps` and `dbt init`
* CI is not finding any installed adapters
* Remove functional test for `dbt init`
* Adding perf testing GHA
* Fixing trigger syntax
* Fixing PR creation issue
* Updating testing var
* Remove unneeded branch names
* Fixing branch naming convention
* Standardizing branch name to var
* Consolidating PR jobs
* Updating inputs and making more readable
* Splitting steps up
* Making some updates here to simplify and update
* Remove tab
* Cleaned up testing TODOs before committing
* Fixing spacing
* Fixing spacing issue
* Create publication.py, various Publication classes, Dependency class
* Load dependencies.yml and the corresponding publication file
* Add "public_nodes" and populate ref_lookup
* resolve_ref working
* Add public nodes to parent and child maps
* Bump manifest version and fix tests, use ModelDependsOn
* Split out PublicationArtifact and PublicationConfig, store public_models
separately
* Store dependencies in publication artifact
* change detection of PublicModel for >= python3.10
* Handle removing references for re-processing if publication has changed
* Handle only changed publication artifacts
* Add some logging events
* Remove duplicate nodes from manifest
* refactor relation_from_relation_name
* Remove duplicate writing of manifest.json
* Add public_nodes to flat_graph
* Move some file name constants to core/dbt/constants.py
* Remove "environment" from ProjectDependency. Add
database/schema/identifier to PublicModel. Update TargetNotFound
exception.
* Include external publication dependencies in publication artifact dependencies
* Remove create_from_relation_name, call create_from_node instead
* Change PublicationArtifactChanged message to debug level
* Make write_publication_artifact a function in parser/manifest.py
* Create fixture to create minimal alternate project (just models)
* develop multi project test case
* Latest version should use un-suffixed alias
* Latest version can be in un-suffixed file
* FYI when unpinned ref to model with prerelease version
* [WIP] Nicer error if versioned ref to unversioned model
* Revert "Latest version should use un-suffixed alias"
This reverts commit 3616c52c1eed7588b9e210e1c957dfda598be550.
* Revert "[WIP] Nicer error if versioned ref to unversioned model"
This reverts commit c9ae4af1cfbd6b7bfc5dcbb445556233eb4bd2c0.
* Define real event for UnpinnedRefNewVersionAvailable
* Update pp test for implicit unsuffixed defined_in
* Add changelog entry
* Fix unit test
* marky feedback
* Add test case for UnpinnedRefNewVersionAvailable event
* Adding a new column is not a breaking contract change
* Add changelog entry
* More structured exception
* same_contract: False if non-breaking changes
* PR feedback: rm build_contract_checksum, more comments
* CT-2317: Reset invocation id in preflight for each dbt command.
* CT-2317: Add unit test for invocation_id behavior.
* CT-2317: Add changelog entry.
* CT-2317: Modify freshness test to ignore invocation_id
* CT-2317: Assign invocation_id before tracking initialization.
* CT-2317: Fix unit test failures and a bunch of other stuff
* CT-2317: Remove checks which make outdated assumptions about invocation_id being stable between runs
* CT-2317: Review tweak, more unit test fixes.
* Removed options for `dbt parse`
* Fix misspellings
* Capitalize JSON when appropriate
* Update help text for --write-json/--no-write-json
* Update help text for --config-dir
* Update help text for --resource-types
* Removed decorators for removed dbt parse options
* Remove `--write-manifest` flag from `parse`
* Remove `--parse-only` flag from `compile`
* Update help text for `dbt list --output`
* Standardize on one line per argument
* Factor 3 from 12 Factor CLI Apps
* Update help text for `dbt --version`
* Standardize capitalization of resource types for `dbt build`
* `debug --config-dir` is a boolean flag
* Update help text for `--version-check`
* Specify `-q` as a conventional alias for `--quiet`
* Update help text for `debug --config-dir`
* Update help text for `debug`
* Treat more dense text blobs as binary for `git grep`
* Update help text for `--version-check`
* Update help text for `--defer`
* Update help text for `--indirect-selection`
* Co-locate log colorization with other log settings
* Update help text for `--log-format*`, `--log-level*`, and `--use-colors*`
* Temporarily re-add option for CI tests
* Remove `--parse-only` flag from `show`
* Remove `--write-manifest` flag from `parse` (again)
* Snapshot strategies: newline for subquery
* add changie output
* add test for snapshot ending in comment
* remove import to be flake8 compliant
* add seed import
* add newlines for flake8 compliance
* typo fix
* Fixing up a test, adding a comment or two
* removed un-needed test fixtures
* removed even more un-needed fixtures, collapsed test to single class
* removed errant breakpoint()
* Fix a little typo
---------
Co-authored-by: Ian Knox <ian.knox@dbtlabs.com>
Co-authored-by: Mila Page <67295367+VersusFacit@users.noreply.github.com>
* CT-1922: Rough in functionality for parsing model level constraints
* CT-1922: (Almost) complete support for model level constraints
* CT-1922: Fix typo affecting correct model constraint parsing.
* CT-1922: Rework base class for model tests for greater simplicity
* CT-1922: Rough in functionality for parsing model level constraints
* CT-1922: Revise unit tests for new model-level constraints property
* CT-1922: (Almost) complete support for model level constraints
* first pass
* implement in core
* add proto
* WIP
* resolve errors in columns_spec_ddl
* changelog
* update comment
* move logic over to python
* rename and use enum
* update default constraint_support dict
* generate new proto definition after conflicts
* reorganize code and break warnings into each constraint
* fix postgres constraint support
* remove breakpoint
* convert constraint support to constant
* update postgres
* add to export
* add to export
* regen proto types file
* standardize names
* put back mypy error
* more naming + add back comma
* add constraint support to model level constraints
* update event message and method signature
* rename method
* CT-1922: Rough in functionality for parsing model level constraints
* CT-1922: Revise unit tests for new model-level constraints property
* CT-1922: (Almost) complete support for model level constraints
* CT-1922: Fix typo affecting correct model constraint parsing.
* CT-1922: Improve whitespace handling
* CT-1922: Render raw constraints to constraint list directly
* make method return consistent
* regenerate proto defn
* update event test
* add some code cleanup
---------
Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
* CT-1922: Rough in functionality for parsing model level constraints
* CT-1922: Revise unit tests for new model-level constraints property
* CT-1922: (Almost) complete support for model level constraints
* CT-1922: Fix typo affecting correct model constraint parsing.
* CT-1922: Minor code review refinements
* CT-1922: Improve whitespace handling
* CT-1922: Render raw constraints to constraint list directly
* CT-1922: Rework base class for model tests for greater simplicity
* CT-1922: Remove debugging properties. Oops.
* CT-1922: Fix type annotation
* improved first line of error
* added basic printing of yaml and sql cols as columns
* added changie log
* used listed dictionary as input to match columns
* swapped order of col headers for printing
* used listed dictionary as input to match columns
* removed merge conflict text from file
* Touch-ups
* Update log introspection in functional tests
* Update format_column macro. Case insensitive test
* PR feedback: just data_type, not formatted
---------
Co-authored-by: Kyle Kent <kyle.kent321@gmail.com>
* remove trial nodes before building subdag
* add changie
* Update graph.py
remove comment
* further optimize by sorting node search by degree
* change degree to product of in and out degree
* Add tests for logging jinja2.Undefined objects
[CT-2259](https://github.com/dbt-labs/dbt-core/issues/7108) identifies
an issue wherein dbt-core 1.0-1.3 raises errors when an attempt is made
to log a jinja2.Undefined object. This generally happened in the form
of `{{ log(undefined_variable, info=True) }}`. This commit adds the test
for two reasons:
1. Ensure we don't have a regression in this going forward
2. Exist as a commit to be used for backport fixes for dbt-core 1.0-1.3
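A minimal sketch of what such a regression test can look like using dbt's standard functional-test fixtures (`models`, `project`, and `run_dbt` from `dbt.tests.util`); the model and class names here are illustrative, not the actual test added by this commit:

```python
import pytest
from dbt.tests.util import run_dbt

# hypothetical model that logs an undefined variable via the Jinja log() helper
undefined_log_model_sql = """
{{ log(this_variable_is_undefined, info=True) }}
select 1 as id
"""


class TestLogUndefinedVariable:
    @pytest.fixture(scope="class")
    def models(self):
        return {"logging_model.sql": undefined_log_model_sql}

    def test_run_does_not_raise(self, project):
        # on affected dbt-core versions this invocation raised instead of logging
        run_dbt(["run"])
```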
* Add tests for checking `DBT_ENV_SECRET_`s don't break logging
[CT-1783](https://github.com/dbt-labs/dbt-core/issues/6568) describes
a bug in dbt-core 1.0-1.3 wherein, when a `DBT_ENV_SECRET_`-prefixed
environment variable is set, all `{{ log("logging stuff", info=True) }}`
invocations break. This commit adds a test for this for two reasons:
1. Ensure we don't regress to this behavior going forward
2. Act as a base commit for making the backport fixes to dbt-core 1.0-1.3
* Add tests ensuring failed event serialization is handled correctly
[CT-2264](https://github.com/dbt-labs/dbt-core/issues/7113) states
that failed serialization should take an exception handling path
which fires another event instead of raising an exception. This is
hard to test perfectly because the exception handling path for
serialization depends on whether pytest is present. If pytest isn't
present, a new event documenting the failed serialization is fired.
If pytest is present, the failed serialization gets raised as an exception.
Thus this added test ensures that the expected exception is raised and
assumes that the correct event will be fired under normal operation.
* Log warning when event serialization fails in `msg_to_dict`
This commit updates the `msg_to_dict` exception handling path to
fire a warning-level event instead of raising an exception.
Truthfully, we're not sure if this exception handling path is even
possible to hit, because we recently switched from betterproto
to Google's protobuf. However, this exception path is the subject of
[CT-2264](https://github.com/dbt-labs/dbt-core/issues/7113). Though we
don't think it's actually possible to hit it anymore, we still want
to handle the case if it is.
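A minimal sketch of the warn-instead-of-raise pattern described above, using only stdlib logging and an injected serializer so it stands alone; the real `msg_to_dict` lives in dbt-core's events module and fires a structured warning event rather than a stdlib log record:

```python
import logging

logger = logging.getLogger(__name__)


def msg_to_dict_safe(msg, to_dict):
    """Serialize an event message to a dict, but degrade to a WARNING
    instead of raising if serialization fails."""
    try:
        return to_dict(msg)
    except Exception as exc:  # broad on purpose: logging must never crash a run
        logger.warning("Failed to serialize event %r: %s", msg, exc)
        return {}
```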
* Update serialization failure note to be a warn level event in `BaseEvent`
[CT-2264](https://github.com/dbt-labs/dbt-core/issues/7113) wants
logging messages about event serialization failure to be `WARNING`
level events. This does that.
* Add changie info for changes
* Add test to check exception handling of `msg_to_dict`
* One argument per line
* Tests for multiple `--select` or `--exclude`
* Allow `--select` and `--exclude` multiple times
* Changelog entry
* MultiOption options must be specified with type=tuple or type=ChoiceTuple
* Testing for `--output-keys` and `--resource-type`
* Validate that any new param with `MultiOption` should also have `type=tuple` (or `ChoiceTuple`) and `multiple=True`
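For context, the underlying click behavior these options build on is `multiple=True`, which collects repeated flags into a tuple; `MultiOption` and `ChoiceTuple` are dbt-specific types, so this is only a generic sketch of the repeated-flag mechanics, not dbt's actual option definitions:

```python
import click


@click.command()
@click.option("--select", multiple=True, help="May be passed more than once.")
@click.option("--exclude", multiple=True, help="May be passed more than once.")
def ls(select, exclude):
    # click collects each repeated flag into a tuple of values
    click.echo(f"select={tuple(select)} exclude={tuple(exclude)}")


if __name__ == "__main__":
    ls()
```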
* first pass
* adding tests
* changelog
* split up tests due to order importance
* update test
* add back comment
* rename base test classes
* move sql
* fix test name
* move sql
* test changes to match main
* organize and cleanup fixtures
* more cleanup of tests
* add utility function to EventManager for explicitly adding callbacks
Technically these aren't necessary in their current state. We could instead
have people do `<InstantiatedEventManager>.callbacks.extend(...)` directly.
However, it's not hard to imagine a world wherein extra things need to take
place when a callback is added. Thus abstracting to a utility method
now means that as the implementation of how callbacks are actually added
changes, the invocation to do so can stay the same.
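A rough sketch of the utility method the item above describes; the real EventManager differs, but the point is a single entry point wrapping `callbacks.extend`:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EventManager:
    callbacks: List[Callable] = field(default_factory=list)

    def add_callbacks(self, callbacks: List[Callable]) -> None:
        # trivial today, but keeps one entry point in case adding a callback
        # ever needs extra bookkeeping
        self.callbacks.extend(callbacks)
```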
* update `setup_event_logger` to optionally take in callbacks add them to the EventManager
* update preflight decorator to check for and pass along callbacks for event logger setup
* Add `callbacks` to `dbtRunner`
On instantiation of `dbtRunner` one can now provide `callbacks`. These
callbacks are for the `EventLogger`. When `invoke` is called on a `dbtRunner`,
the `callbacks` are added to the cli context object. In the preflight
decorator these callbacks are extracted from the cli context and then
passed to `setup_event_logger`; finally, `setup_event_logger` ensures
the callbacks are added to the global `EVENT_MANAGER`.
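A minimal usage sketch of the resulting API, assuming the programmatic entry point in `dbt.cli.main`; the exact attributes available on each event may vary by version:

```python
from dbt.cli.main import dbtRunner


def print_event(event):
    # called with each structured event emitted during the invocation
    print(event.info.name, event.info.msg)


dbt = dbtRunner(callbacks=[print_event])
dbt.invoke(["run"])
```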
* add test to check dbtRunner callbacks get properly set
I believe this test technically qualifies as more of an integration
test, but no other tests like it currently exist (that I could find
via a cursory search). The `tests/unit/test_dbt_runner.py` seemed like
the most intuitive spot. However, if somewhere else makes sense, I'd be
happy to move it.
* add changie documentation for CT-1928
* Convert simple copy.
* Adjust class names for import.
* adjust test namespacing
* Resolve test error.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* ct-2198: clean up some type names and uses
* CT-2198: Unify constraints and constraints_check properties on columns
* Make mypy version consistently 0.981 (#7134)
* CT 1808 diff based partial parsing (#6873)
* model contracts on models materialized as views (#7120)
* first pass
* rename tests
* fix failing test
* changelog
* fix functional test
* Update core/dbt/parser/base.py
* Update core/dbt/parser/schemas.py
* Create method for env var deprecation (#7086)
* update to allow adapters to change model name resolution in py models (#7115)
* update to allow adapters to change model name resolution in py models
* add changie
* fix newline adds
* move quoting into macro
* use single quotes
* add env DBT_PROJECT_DIR support #6078 (#6659)
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
* Add new index.html and changelog yaml files from dbt-docs (#7141)
* Make version configs optional (#7060)
* [CT-1584] New top level commands: interactive compile (#7008)
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
* CT-2198: Add changelog entry
* CT-2198: Fix tests which broke after merge
* CT-2198: Add explicit validation of constraint types w/ unit test
* CT-2198: Move access property, per code review
* CT-2198: Remove a redundant macro
* CT-1298: Rework constraints to be adapter-generated in Python code
* CT-2198: Clarify function name per review
---------
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Stu Kilgore <stu.kilgore@dbtlabs.com>
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
Co-authored-by: Leo Schick <67712864+leo-schick@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: FishtownBuildBot <77737458+FishtownBuildBot@users.noreply.github.com>
Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
* add protobuf message/class for new CommandCompleted event
For [CT-2049](https://github.com/dbt-labs/dbt-core/issues/6878) we
concluded that we wanted a new event type, [CommandCompleted](https://github.com/dbt-labs/dbt-core/issues/6878#issuecomment-1419718606)
with [four (4) values](https://github.com/dbt-labs/dbt-core/issues/6878#issuecomment-1426118283):
which command was run, whether the command succeeded, the timestamp at
which the command finished, and how long the command took. This commit
adds the new event proto definition, the auto-generated proto_types, and
the instantiable event type.
* begin emitting CommandCompleted event in the preflight decorator
The [preflight decorator](4186f99b74/core/dbt/cli/requires.py (L19))
runs at the start of every CLI invocation. Thus it is a perfect candidate
for emitting the CommandCompleted event. This is noted in the [discussion
on CT-2049](https://github.com/dbt-labs/dbt-core/issues/6878#issuecomment-1428643539).
* add CommandCompleted event to event unit tests
* Add: changelog entry
* fire CommandCompleted event regardless of upstream exceptions
Previously, if `--fail-fast` was specified and an issue was hit, or an
unhandled issue surfaced as an exception, the CommandCompleted event
would not get fired, because at this point in the stack we'd already be
handling a thrown exception. If an exception does reach this point, we
still want to fire the event and continue to propagate the exception;
hence the bare `raise`, which re-raises the caught exception.
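A self-contained sketch of the completion-event pattern described above; `preflight` and `emit_command_completed` here are placeholders rather than dbt-core internals, and a `finally` block stands in for the except-plus-bare-`raise` the commit describes, with the same effect of emitting the event before the exception keeps propagating:

```python
import functools
import time
from datetime import datetime, timezone


def emit_command_completed(command: str, success: bool, completed_at, elapsed: float) -> None:
    # stand-in for firing the Debug-level CommandCompleted event
    print(
        f"CommandCompleted command={command} success={success} "
        f"completed_at={completed_at.isoformat()} elapsed={elapsed:.3f}s"
    )


def preflight(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        success = False
        try:
            result = func(*args, **kwargs)
            success = True
            return result
        finally:
            # runs whether the command returned or raised; any exception
            # continues to propagate after the event is emitted
            emit_command_completed(
                command=func.__name__,
                success=success,
                completed_at=datetime.now(timezone.utc),
                elapsed=time.perf_counter() - start,
            )
    return wrapper
```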
* Update CommandCompleted event to be a `Debug` level event
We don't actually "always" need this event to be logged. Thus we've
updated it to `Debug` level. [Discussion Context](https://github.com/dbt-labs/dbt-core/pull/7180#discussion_r1139281963)
* Init roadmap
* Rework the top paragraph
* Clean-up the whole thing
* Typos and stuff
* Add a missing word
* Fix typo
* Update "when" note
* Next draft
* Propose rename
* Resolve TODOs, still needs a reread
* Being cute
* Another read through
* Fix sentence fragment
---------
Co-authored-by: Florian Eiden <florian.eiden@dbtlabs.com>
* first pass
* WIP
* add notes/stubs on more pieces
* more work
* more cleanup
* cleanup
* add more cleanup and generalization
* update to use reusable workflow
* add TODO
* Add back initialization events
* Fix log_cache_events. Default stdout logger knows less than it used to
* Add back exception handling events
* Revert "Add back exception handling events"
This reverts commit 26f22d91b660f51f0df6a59f9e5cae16b0ee6fe5.
* Add changelog entry
* Fix test by stringifying dict values
* Add generated CLI API docs
---------
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
* part 1 of env var for core team
* add logic to use env vars to generate changelog
* modify version bump to add members via env var
* pull in main and tweak
* add token
* changes for testing
* split step
* remove leading slash
* add version check
* more debugging
* try curl
* try more things
* try more things
* change auth
* put back token
* update permissions
* add back fishtown pat
* use new pat
* fix typo
* swap token
* comment out list teams
* change url
* debug path
* add continue
* change core case
* more tweaks
* send output to file
* add file view
* make array
* tweak
* remove []
* add quotes
* add tojson
* add quotes to set
* tweak
* fix id
* tweaks
* more
* more
* remove new lines
* more tweaks
* update to generate changelog
* remove debugging bits
* use central version-bump
* use correct author list
* testing with changelog team automation
* add new token to input
* move secret
* remove testing aspects from workflow
* clean up team logic
* explicitly send secret
* move bumpversion comment
* move comments
* point workflow back to main
* point to branch for testing
* point back to main
* inherit secrets
* first pass at automating latest branches
* checkout repo first
* fetch all history
* reorg
* debugging
* update test id
* swap lines
* incorporate new branch action
* tweak vars
* Formatting
* Changelog entry
* Rename to BaseSimpleSeedColumnOverride
* Better error handling
* Update test to include the BOM test
* Cleanup and formatting
* Unused import remove
* nit line
* Pr comments
* update regex to match all iterations
* convert to num to match all adapters
* add comments, remove extra .
* clarify with more comments
* Update .bumpversion.cfg
Co-authored-by: Nathaniel May <nathaniel.may@fishtownanalytics.com>
---------
Co-authored-by: Nathaniel May <nathaniel.may@fishtownanalytics.com>
* Add clearer directions for custom test suite vars in Makefile.
* Fix up PR for review
* Fix erroneous whitespace.
* Fix a spelling error.
* Add documentation to discourage makefile edits but provide override tooling.
* Fix quotation marks. Very strange behavior
* Compact code and verify quotations happy inside bash and python.
* Fold comments into Makefile.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Convert test and make it a bit more pytest-onic
* Ax old integration test.
* Run black on test conversion
* I didn't like how pytest was running the fixture, so I wrapped it in a closure.
* Merge converted test into persist docs.
* Move persist docs tests to the adapter zone. Prep for adapter tests.
* Fix up test names
* Fix name to be less confusing.
---------
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Test converted and reformatted for pytest.
* Ax old versions of 052 test
* Nix the 'os' import and black format
* Change names of models to be more PEP like
* cleanup code
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Convert test and make it a bit more pytest-onic
* Ax old integration test.
* Run black on test conversion
* I didn't like how pytest was running the fixture, so I wrapped it in a closure.
* Merge converted test into persist docs.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* init commit for column_types test conversion
* init start of test_column_types.py
* pass tes macros into both tests
* remove alt tests, remove old tests, push up working conversion
* rename base class, move to adapter zone so adapters can use
* typo fix
* Code cleanup and adding stderr to capture dbt
* Debug with --log-format json now prints structured logs.
* Add changelog.
* Move logs into miscellaneous and add values to test.
* nix whitespace and fix log levels
* List will now do structured logging when log format set to json.
* Add a quick None check.
* Add a get guard to class check.
* Better null checking
* The boolean doesn't reflect the original logic but a try-catch does.
* Address some code review comments and get us working again.
* Simplify logic now that we have a namespace object for self.config.args.
* Simplify logic for json log format checking.
* Simplify code for allowing our GraphTest cases to pass while also hiding compile stats from dbt ls/list.
* Simplify structured logging types.
* Fix up boolean logic and simplify via De Morgan's laws.
* Nix unneeded fixture.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* CT-1786: Port docs tests to pytest
* Add generated CLI API docs
* CT-1786: Comply with the new style requirements
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
* add defer_to_manifest in before_run to fix faulty deferred docs generate
* add a changelog
* add declaration of defer_to_manifest to FreshnessTask and GraphRunnableTask
* fix: add defer_to_manifest method to ListTask
* Re-factor list of YAML keys for hooks to late-render
* Add `pre_` and `post_hook` to list of late-rendered hooks
* Check for non-empty set intersection
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
* Test functional synonymy of `*_hook` with `*-hook`
Test that `pre_hook`/`post_hook` are functionally synonymous with `pre-hook`/`post-hook` for model project config
* Undo bugfix to validate the new test fails
* Revert "Undo bugfix to validate the new test fails"
This reverts commit e83a2be2eb.
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
* add meta attribute to nodeinfo for events
* also add meta to dataclass
* add to unit test to ensure meta is added
* adding functional test to check that meta is passed to nodeinfo during logging
* changelog
* remove unused import
* add tests with non-string keys
* renaming test dict keys
* add non-string value
* resolve failing test
* test additional non-string values
* fix flake8
* Stringify meta dict in node_info
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
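The stringification fix above boils down to coercing arbitrary meta keys and values before they reach structured log output; a minimal sketch of that idea (not the actual dbt-core helper):

```python
def stringify_meta(meta: dict) -> dict:
    # coerce arbitrary meta keys/values to strings so node_info stays serializable
    return {str(k): str(v) for k, v in meta.items()}


print(stringify_meta({1: True, "owner": ["data-eng"]}))
# {'1': 'True', 'owner': "['data-eng']"}
```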
* convert the test and fix an error due to a dead code seed
* Get rid of old test
* Remove unfortunately added files. Don't use that *
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Update types.proto
* pre-commit passes
* Cleanup tests and tweak EventLevels
* Put node_info back on SQLCommit. Add "level" to fire_event function.
* use event.message() in warn_or_error
* Fix logging test
* Changie
* Fix a couple of unit tests
* import Protocol from typing_extensions for 3.7
* ✨ adding pre-commit install to make dev
* 🎨 updating format of Makefile and CONTRIBUTING.md
* 📝 adding changelog via changie new
* ✨ adding dev_req to Makefile + docs
* 🎨 remove dev_req from docs, dry makefile
* Align names of `.PHONY` targets with their associated rules
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
* starting to move jinja exceptions
* convert some exceptions
* add back old functions for backward compatibility
* organize
* more conversions
* more conversions
* add changelog
* split out CacheInconsistency
* more conversions
* convert even more
* convert parsingexceptions
* fix tests
* more conversions
* more conversions
* finish converting exception functions
* convert more tests
* standardize to msg
* remove some TODOs
* fix test param and check the rest
* add comment, move exceptions
* add types
* fix type errors
* fix type for adapter_response
* remove 0.13 version from message
* pass predicates to merge strategy
* postgres delete and insert
* merge with predicates
* update to use arbitrary list of predicates, not dictionaries, merge and delete
* changie
* add functional test to adapter zone
* comma in test config
* add test for incremental predicates delete and insert postgres
* update test structure for inheritance
* handle predicates config for backwards compatibility
* test for predicates keyword
* Add generated CLI API docs
Co-authored-by: Colin <colin.rogers@dbtlabs.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
* Remove unneeded SQL compilation attributes from SeedNode
* Fix various places that referenced removed attributes
* Cleanup a few Unions
* More formatting in nodes.py
* Mypy passing. Untested.
* Unit tests working
* use "doc" in documentation unique_ids
* update some doc_ids
* Fix some artifact tests. Still need previous version.
* Update manifest/v8.json
* Move relation_names to parsing
* Fix a couple of tests
* Update some artifacts. snapshot_seed has wrong schema.
* Changie
* Tweak NodeType.Documentation
* Put store_failures property in the right place
* Fix setting relation_name
* update changie to require issue or pr, and allow multiple
* remove extraneous data from changelog files.
* allow for multiple PR/issues to be entered
* update contributing guide
* remove issue number from bot changelogs
* update format of PR
* fix dependency changelogs
* remove extra line
* remove extra lines, tweak contributor wording
* Update CONTRIBUTING.md
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Get running with Python 3.11
* More tests passing, mypy still unhappy
* Upgrade to 3.11, and bump mashumaro
* patch importlib.import_module last
* lambda: Policy() default_factory on include and quote policy
* Add changelog entry
* Put a lambda on it
* Fix text formatting for log file
* Handle variant type return from e.log_level()
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Josh Taylor <joshuataylorx@gmail.com>
Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
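The `lambda: Policy()` change above is the standard dataclass fix for mutable defaults; a small, self-contained illustration (the `Policy` fields here are only assumptions for the sketch):

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    database: bool = True
    schema: bool = True
    identifier: bool = True


@dataclass
class RelationStub:
    # mutable defaults must come from a factory; wrapping the constructor in a
    # lambda mirrors the include/quote policy change referenced above
    include_policy: Policy = field(default_factory=lambda: Policy())
    quote_policy: Policy = field(default_factory=lambda: Policy())
```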
* feat: add a list of default values to the ctx manager
* tests: dbt.config.get default values
* feat: validate the num of args in config.get
* feat: jinja template for dbt.config.get default values
* docs: changie yaml
* fix: typo on error message
Co-authored-by: Chenyu Li <chenyulee777@gmail.com>
Co-authored-by: Chenyu Li <chenyulee777@gmail.com>
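A sketch of what the default-value support above enables in a dbt Python model; the config key and model names are illustrative, and the exact dataframe type depends on the adapter:

```python
def model(dbt, session):
    # the second argument is returned when the key is not set in the model's config
    threshold = dbt.config.get("score_threshold", 0)

    df = dbt.ref("upstream_scores")
    return df[df["score"] > threshold]
```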
* v0 - new dbt deps type: tarball url
in support of
https://github.com/dbt-labs/dbt-core/issues/4205
* flake8 fixes
* adding max size tarball condition
* clean up imports
* typing
* adding sha1 and subdirectory options; improve logging feedback
sha1: allow user to specify sha1 in packages.yaml; will only install if the package matches
subdirectory: allow user to specify a subdirectory of the package in the tarfile, if the package has a non-standard structure (like the git subdirectory option)
* simple tests added
* flake fixes
* changes to support tests; adding exceptions; fire_event logging
* new logging events
* tarball exceptions added
* build out tests
* removing in memory tarball test
* update type codes to M - Misc
* adding new events to test_events
* fix spacing for flake
* add retry download code - as used in registry calls
* clean
* remove saving tar in memory inside tarfile object
will hit url multiple times instead
* remove duplicative code after refactor
* black updates
* black formatting
* black formatting
* refactor - no more in-memory tarfile - all as file operations now
- remove tarfile passing, always use tempfile instead
- reorganize system.* functions, removing duplicative code
- more notes on current flow and structure - especially the need for the pattern of 1) unpack 2) scan for package dir 3) copy to destination.
- cleaning
* cleaning and sync to new tarball code
* cleaning and sync to new tarball code
* requested changes from PR
https://github.com/dbt-labs/dbt-core/pull/4689#discussion_r812970847
* reversions from revision 2
removing sha1 check to simplify/mirror hub install pattern
* simplify/mirror hub install pattern
to simplify/mirror hub install pattern
- removing sha1 check
- supply name/version to act as our 'metadata' source
* simplify/mirror hub install pattern
simplify with goal of mirroring hub install pattern
- supporting subfolders like git packages, and sha1 checks are removed
- existing code from RegistryPinnedPackage (install() and download_and_untar()) performs the operations
- RegistryPinnedPackage install() and download_and_untar() are not currently set up as functions that can be used across classes - this should be moved to dbt.deps.base, or to a dbt.deps.common file - need dbt labs feedback on how to proceed (or leave as is)
* remove revisions, no longer doing package check
* slim down to basic tests
more complex features have been removed (sha1, subfolder) so testing is much simpler!
* fix naming to match hubs behavior
remove version from package folder name
* refactor install and download to upstream PinnedPackage class
I'm on the fence about whether this is the right approach, but it seems the most sensible one after some thought
* Create Features-20221107-105018.yaml
* fix flake, black, mypy errors
* additional flake/black fixes
* Update .changes/unreleased/Features-20221107-105018.yaml
fix username on changelog
Co-authored-by: Emily Rockman <ebuschang@gmail.com>
* change to fstring
Co-authored-by: Emily Rockman <ebuschang@gmail.com>
* cleaning - remove comment
* remove comment/question for dbt team
* in support of issuecomment 1334055944
https://github.com/dbt-labs/dbt-core/pull/4689#issuecomment-1334055944
* in support of issuecomment 1334118433
https://github.com/dbt-labs/dbt-core/pull/4689#issuecomment-1334118433
* black fixes; remove debug bits
* remove `.format` & add 'tarball' as version
'tarball' as version so that the temp files format nicely:
[tempfile_location]/dbt_utils_2..tar.gz # old
vs
[tempfile_location]/dbt_utils_1.tarball.tar.gz # current
* port os.path refs in `PinnedPackage._install` to pathlib
* lowercase as per PR feedback
* update tests after removing version arg
goes along with 8787ba41af
Co-authored-by: Emily Rockman <ebuschang@gmail.com>
* removed Compiled versions of nodes
* Remove compiled fields from dictionary if not compiled
* check compiled is False instead of attribute existence in env_var
processing
* Update artifacts test (CompiledSnapshotNode did not have SnapshotConfig)
* Changie
* more complicated 'compiling' check in env_var
* Update test_exit_codes.py
* CT-1405: Refactor event logging code
* CT-1405: Add changelog entry
* CT-1405: Add code to protect against using closed streams from past tests.
* CT-1405: Restore unit test which was only failing locally
* CT-1405: Document a hack with issue # to resolve it in the future
* CT-1405: Make black happy
* CT-1405: Get rid of confusing factory function and duplicated function
* CT-1405: Remove unused event from types.proto and auto-gen'd file
* Fix the partial parse path
Partial parsing should use the project root or it does not resolve to the correct path.
E.g. `target-path: ../some/dir/target`, if dbt is not run from the project root, creates an erroneous folder.
* Run pre-commit
* Changie
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
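The fix above amounts to anchoring the configured path at the project root rather than the current working directory; a minimal sketch of that resolution, with illustrative paths:

```python
from pathlib import Path


def resolve_target_path(project_root: str, target_path: str) -> Path:
    # anchoring at the project root means `target-path: ../some/dir/target`
    # resolves the same way no matter where dbt is invoked from
    return (Path(project_root) / target_path).resolve()


print(resolve_target_path("/home/me/my_project", "../some/dir/target"))
```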
* reformatting of test after some spike investigation
* reformat code to pull tests back into base class definition, move a test to more appropriate spot
* Convert incremental schema tests.
* Drop the old test.
* Bad git add. My disappointment is immeasurable and my day has been ruined.
* Adjustments for flake8.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Convert old test.
Add documentation. Adapt and reenable previously skipped test.
* Convert test and adapt and comment for current standards.
* Remove old versions of tests.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Convert test 067. One bug outstanding.
* Test now working! Schema needed renaming to avoid 63 char max problems
* Remove old test.
* Add some docs and rewrite.
* Add exception for when audit tables' schema runs over the db limit.
* Code cleanup.
* Revert exception.
* Round out comments.
* Rename what shouldn't be a base class.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* BaseContext: expose md5 function in context
* BaseContext: add return value type
* Add changie entry
* rename "md5" to "local_md5"
* fix test_context.py
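The context member exposed above is a thin wrapper around hashing a string locally (in dbt itself, never in the warehouse); roughly equivalent Python, for illustration:

```python
import hashlib


def local_md5(value: str) -> str:
    # hash the string locally; in Jinja this is available as {{ local_md5("...") }}
    return hashlib.md5(value.encode("utf-8")).hexdigest()


print(local_md5("dbt"))
```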
* init pr for dbt_debug test conversion
* removal of old test
* minor test format change
* add new Base class and Test classes
* reformatting test, new method for capsys and error message to check, TODO: fix badproject
* reformatting tests, ready for review
* checking yaml file, and small reformat
* modifying since update wasn't working in ci/cd
* Combine various print result log events with different levels
* Changie
* more merge cleanup
* Specify DynamicLevel for event classes that must specify level
* Initial structured logging changes
* remove "this" from core/dbt/events/functions.py
* CT-1047: Fix execution_time definitions to use float
* CT-1047: Revert unintended checking of changes to functions.py
* WIP
* first pass to resolve circular deps
* more circular dep resolution
* remove a bunch of duplication
* move message into log line
* update comments
* fix field that went missing during rebase
* remove double import
* remove some comments and extra code
* fix pre-commit
* rework deprecations
* WIP converting messages
* WIP converting messages
* remove stray comment
* WIP more message conversion
* WIP more message conversion
* tweak the messages
* convert last message
* rename
* remove warn_or_raise as never used
* add fake calls to all new events
* fix some tests
* put back deprecation
* restore deprecation fully
* fix unit test
* fix log levels
* remove some skipped ids
* fix macro log function
* fix how messages are built to match expected outcome
* fix expected test message
* small fixes from reviews
* fix conflict resolution in UI
Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
Co-authored-by: Peter Allen Webb <peter.webb@dbtlabs.com>
* Convert test to functional set.
* Remove old statement tests from integration test set.
* Nix whitespace
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Create functors to initialize event types with str-type member attributes. Before this change, the spec of various classes expected base_msg and msg params to be strs. This assumption did not always hold true; post_init hooks ensure the spec is obeyed.
* Add new changelog.
* Add msg type change functor to a few other events that could use it.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
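A small illustration of the `__post_init__` coercion pattern those functors implement, assuming a simplified event-like dataclass rather than the real generated types:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Note:
    msg: Any

    def __post_init__(self) -> None:
        # enforce the str contract even when callers pass non-string values
        if not isinstance(self.msg, str):
            self.msg = str(self.msg)


print(Note(msg=42).msg)  # "42"
```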
* Updated string formatting on non-f-strings.
Found all cases of strings separated by white space on a single line and
removed white space separation. EX: "hello " "world" -> "hello world".
* add changelog entry
* CT-625: Fail with clear message for invalid materialized vals
* CT-625: Increase test coverage, run pre-commit checks
* CT-625: run black on problem file
* CT-625: Add changelog entry
* CT-625: Remove test that didn't make sense
* Migrate test
* Remove old integration test.
* Simplify object definitions since we enforce python 3
* Factor many fixtures into a file.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* init query_comment test conversion pr
* importing model and macro, changing to new project_config_update, tests passing locally for core
* delete old integration test
* trying to test against other adapters
* update to main
* file rename
* file rename
* import change
* move query_comment directory to functional/
* move test directory back to adapter zone
* update to main
* updating core test based on feedback from @gshank
* testing removing target checking
* edited comment to correctly specify that views are set, not tables
* updated init test to match starter project change
* added changelog
* update 3 other occurrences of the init test for text update
* clean up debugging
* reword some comments
* changelog
* add more tests
* move around the manifest.node
* fix typos
* all tests passing
* move logic for moving around nodes
* add tests
* more cleanup
* fix failing pp test
* remove comments
* add more tests, patch all disabled nodes
* fix test for windows
* fix node processing to not overwrite enabled nodes
* add checking disabled in pp, fix error msg
* stop deepcopying all nodes when processing
* update error message
* init pr for 026 test conversion
* removing old test, got all tests set up; need to find the best way to handle regex in the new test and see what we would actually want to do to check that we didn't run anything against it
* changes to test_alias_dupe_thorews_exeption, now passing locally
* adding test cases for final test
* following the create-new-schema method, tests are passing; up for review for core code
* moving alias test to adapter zone
* adding Base Classes
* changing ref to fixtures
* add double check to test
* minor change to alt schema name formation, removal of unneeded setup fixture
* typo in model names
* update to main
* pull models/schemas/macros into a fixtures file
* Preliminary changes to keep compile from connecting to the warehouse for runtime calls
* Adds option to lib to skip connecting to warehouse for compile; adds prelim tests
* Removes unused imports
* Simplifies test and renames to SqlCompileRunnerNoIntrospection
* Updates name in tests
* Spacing
* Updates test to check for adapter connection call instead of compile and execute
* Removes commented line
* Fixes test names
* Updates plugin to postgres type as snowflake isn't available
* Fixes docstring
* Fixes formatting
* Moves conditional logic out of class
* Fixes formatting
* Removes commented line
* Moves import
* Unmoves import
* Updates changelog
* Adds further info to method docstring
* first pass
* add label and name validation
* changelog
* fix tests
* convert ParsingError to Deprecation
* fix bug where label did not end up in parsed node
* update deprecation msg
* ConfigSelectorMethod should check for bools
* Add changelog entry
* Add support for lists and test cases
* Typo and formatting in test
* pre-commit linting
* Method for capturing standard out during testing (rather than logs)
* Allow dbt exit code assertion to be optional
* Verify priority order to search for profiles.yml configuration
* Updates after pre-commit checks
* Test searching for profiles.yml within the dbt project directory before `~/.dbt/`
* Refactor `dbt debug` to move to the project directory prior to looking up profiles directory
* Search the current working directory for profiles.yml
* Changelog
* Formatting with Black
* Move `run_dbt_and_capture_stdout` into the test case
* Update CLI help text
* Unify separate DEFAULT_PROFILES_DIR definitions
* Remove unused PROFILE_DIR_MESSAGE
* Remove unused DEFAULT_PROFILES_DIR
* Use shared definition of DEFAULT_PROFILES_DIR
* Define global vs. local profiles location and dynamically determine the default
* Restore original
* Remove function for determining the default profiles directory
* init push for 021_test_concurrency conversion
* ref to self, delete old integration tests, core passing locally
* creating base class to send setup to snowflake
* making changes to store all setup in core, todo: remove util changes after 1050 is merged
* swap sql seeds to csv
* white space removal
* rewriting seed to see if it fixes issue in snowflake
* attempt to rewrite file for test in snowflake
* update to main
* remove unneeded variable to seeds
* remove unneeded snowflake specific code
* first pass adding disabled functionality to metrics and exposures
* first pass at getting metrics disabled
* add unsaved file
* fix up comments
* Delete tmp.csv
* fix test
* add exposure logic, fix merge from main
* change when nodes are added to manifest, finish tests
* add changelog
* removed unused code
* minor cleanup
* init file creation for test_ephemeral conversion
* creating base class to run seed through and pass along to classes to test against
* laid out basic flow of tests, need to finish by figuring out how to handle the assertTrue sections and fix error that's occurring
* added creation and comparison of sql and expected result, seeing issue with extra appended test_ on some and issue with error handling regarding expect pass
* working on fixing view structure
* update to expected_sql file
* update to expected_sql file
* directory rename; close on all tests; need to fix the test_test_ name change for the first two tests and figure out why the new test is reporting error instead of skipped in status
* renamed expected_sql to include the test_test_ephemeral style name, organized how models are imported into test classes
* move ephemeral functional test to adapter zone
* trying to include the BaseEphemeralMulti class to send to snowflake
* trying to fix snowflake test
* trying to fix snowflake test
* creation of second Base class to feed into others for testing purposes
* found way to check type of warehouse to make data type change for snowflake
* move seed into fixture, to be able to import it from core for adapter tests
* convert to csv and get test passing in core
* remove snowflake specific stuff from util
* remove whitespace
* update to main
* Add structured logging test and provide CI env vars to integration conditionally.
* Add the crazy inline if make feature and ax unneeded variable
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Finish converting first test file.
* Finish test conversion.
* Remove old integration hook tests.
* Move location of schema.yml to models directory.
* fix snapshot delete test that was failing
* Add the extra env var check for our CI.
* Add changelog
* Remove naive json flag check and instead force all integration tests to check for environment variables using flag routine.
* Revise the changelog to be more of an explanation.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Add dbt Core roadmap as of August 2022
* Cody intro
* Florian intro
* Lint my markdown
* add blurb on 1.5+ for Python next steps
* Revert "add blurb on 1.5+ for Python next steps"
This reverts commit 1659a5a727.
* PR feedback, self review
Co-authored-by: Cody Peterson <cody.dkdc2@gmail.com>
Co-authored-by: Florian Eiden <florian.eiden@dbtlabs.com>
* Method for capturing standard out during testing (rather than logs)
* Allow dbt exit code assertion to be optional
* Verify priority order to search for profiles.yml configuration
* Updates after pre-commit checks
* Move `run_dbt_and_capture_stdout` into the test case
* Add supported languages to materializations
* Add changie entry
* Linting
* add more error and only get supported language for materialization macro, update schema
* fix test and add more check
Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* First cut at checking version compat for hub pkgs
* Account for field rename
* Add changelog entry
* Update error message
* Fix unit test
* PR feedback
* Try fixing test
* Edit exception msg
* Expand unit test to include pkg prerelease
* Update core/dbt/deps/registry.py
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* Change postgres name truncation logic to be overridable. Add exception with debugging instructions.
* Add changelog.
Co-authored-by: Mila Page <versusfacit@users.noreply.github.com>
* Only consider schema change when column cannot be expanded
* Add test for column shortening
* Add changelog entry
* Move test from integration to adapter tests
* Remove print statement
* add on_schema_change
* show reason for schema change failures
When the incremental model fails, I do not get the context I need to easily fix my discrepancy.
Adding more info
* Update on_schema_change.sql
Fix indentation
* Added changie changes
Added changie changes
* Update on_schema_change.sql
Trim whitespaces
* Update on_schema_change.sql
Log message text enhancement
* Pass patch_config_dict to build_config_dict when creating
unrendered_config
* Add test case for unrendered_config
* Changie
* formatting, fix test
* Fix test so unrendered config includes docs config
* first pass
* tweaks
* convert to use dbt-docs links in contributors section
* fix eq check
* fix format of contributors prs
* update docs changelog to point back to dbt-docs
* update beta 1.3 docs changelog
* remove optional param
* make issue inclusion conditional on being filled
* add Optional node_color config in Docs dataclass
* Remove node_color from the original docs config
* Add docs config and input validation
* Handle when docs is both under docs and config.docs
* Add node_color to Docs
* Make docs a Dict to avoid parsing errors
* Make docs a dataclass instead of a Dict
* Fix error when using docs as dataclass
* Simplify generator for the default value
* skeleton for test fixtures
* bump manifest to v7
* + config hierarchy tests
* add show override tests
* update manifest
* Remove node_color from the original docs config
* Add node_color to Docs
* Make docs a Dict to avoid parsing errors
* Make docs a dataclass instead of a Dict
* Simplify generator for the default value
* + config hierarchy tests
* add show override tests
* Fix unit tests
* Add tests in case of incorrect input for node_color
* Rename tests and Fix typos
* Fix functional tests
* Fix issues with remote branch
* Add changie entry
* modify tests to meet standards (#5608)
Co-authored-by: Matt Winkler <matt.winkler@fishtownanalytics.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* Python model beta version with update to manifest that renames `raw_sql` and `compiled_sql` to `raw_code` and `compiled_code`
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Ian Knox <ian.knox@dbtlabs.com>
Co-authored-by: Stu Kilgore <stuart.kilgore@gmail.com>
* [CT-700] [Bug] Logging tons of asterisks when sensitive env vars are missing
* [CT-700][Bug] Added changelog entry
* Updated the changelog body message
* feat: Improve generic test UndefinedMacroException message
The error message rendered from the `UndefinedMacroException` when
raised by a TestBuilder is very vague as to where the problem is
and how to resolve it. This commit adds a basic amount of
information about the specific model and column that is
referencing an undefined macro.
Note: All custom macros referenced in a generic test config will
raise an UndefinedMacroException as of v0.20.0.
* feat: Bubble CompilationException into schemas.py
I realized that this exception information would be better if
CompilationExceptions included the file that raised the exception.
To that end, I created a new exception handler in `_parse_generic_test`
to report on CompilationExceptions raised during the parsing of
generic tests. Along the way I reformatted the message returned
from TestBuilder to play nicely with the existing formatting of
`_parse_generic_test`'s exception handling code.
* feat: Add tests to confirm CompileException
I've added a basic test to confirm that the appropriate
CompilationException is raised when a custom macro is referenced
in a generic test config.
* feat: Add changie entry and tweak error msg
* Update .changes/unreleased/Under the Hood-20220617-150744.yaml
Thanks to @emmyoop for the recommendation that this be listed as a Fix change instead of an "Under the Hood" change!
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* fix: Simplified Compilation Error message
I've simplified the error message raised during a Compilation Error
sourced from a test config. Mainly by way of removing tabs and newlines
where not required.
* fix: Convert format to fstring in schemas
This commit moves a format call to a multiline fstring in the
schemas.py file for CompilationExceptions.
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
* add readme to .github
* more changes to readme
* improve docs
* more readme tweaks
* add more docs
* incorporate feedback
* removed section with no info
- This file provides a full account of all changes to `dbt-core`
- Changes are listed under the (pre)release in which they first appear. Subsequent releases include changes from previous releases.
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry)
Thanks for taking the time to fill out this bug report!
- type: checkboxes
attributes:
label: Is this a new bug in dbt-core?
description: >
In other words, is this an error, flaw, failure or fault in our software?
If this is a bug that broke existing functionality that used to work, please open a regression issue.
If this is a bug in an adapter plugin, please open an issue in the adapter's repository.
If this is a bug experienced while using dbt Cloud, please report to [support](mailto:support@getdbt.com).
If this is a request for help or troubleshooting code in your own dbt project, please join our [dbt Community Slack](https://www.getdbt.com/community/join-the-community/) or open a [Discussion question](https://github.com/dbt-labs/docs.getdbt.com/discussions).
Please search to see if an issue already exists for the bug you encountered.
options:
- label: I believe this is a new bug in dbt-core
required: true
- label: I have searched the existing issues, and I could not find an existing issue for this bug
required: true
- type: textarea
attributes:
label: Current Behavior
description: A concise description of what you're experiencing.
validations:
required: true
- type: textarea
attributes:
label: Expected Behavior
description: A concise description of what you expected to happen.
validations:
required: true
- type: textarea
attributes:
label: Steps To Reproduce
3. Run '...'
4. See error...
validations:
required: true
- type: textarea
id: logs
attributes:
label: Environment
description: |
examples:
- **OS**: Ubuntu 24.04
- **Python**: 3.10.12 (`python3 --version`)
- **dbt-core**: 1.1.1 (`dbt --version`)
value: |
- OS:
- Python:
- type: dropdown
id: database
attributes:
label: Which database adapter are you using with dbt?
description: If the bug is specific to the database or adapter, please open the issue in that adapter's repository instead
about: Problems and issues with dbt product documentation hosted on docs.getdbt.com. Issues for markdown files within this repo, such as README, should be opened using the "Code docs" template.
description: This is an implementation ticket intended for use by the maintainers of dbt-core
title: "[<project>] <title>"
labels: ["user docs"]
body:
- type: markdown
attributes:
value: This is an implementation ticket intended for use by the maintainers of dbt-core
- type: checkboxes
attributes:
label: Housekeeping
description: >
A couple friendly reminders:
1. Remove the `user docs` label if the scope of this work does not require changes to https://docs.getdbt.com/docs: no end-user interface (e.g. yml spec, CLI, error messages, etc) or functional changes
2. Link any blocking issues in the "Blocked on" field under the "Core devs & maintainers" project.
options:
- label: I am a maintainer of dbt-core
required: true
- type: textarea
attributes:
label: Short description
description: |
Describe the scope of the ticket, a high-level implementation approach and any tradeoffs to consider
validations:
required: true
- type: textarea
attributes:
label: Acceptance criteria
description: |
What is the definition of done for this ticket? Include any relevant edge cases and/or test cases
validations:
required: true
- type: textarea
attributes:
label: Suggested Tests
description: |
Provide scenarios to test. Link to existing similar tests if appropriate.
placeholder: |
1. Test with no version specified in the schema file and use selection logic on a versioned model for a specific version. Expect pass.
2. Test with a version specified in the schema file that is not valid. Expect ParsingError.
validations:
required: true
- type: textarea
attributes:
label: Impact to Other Teams
description: |
Will this change impact other teams? Include details of the kinds of changes required (new tests, code changes, related tickets) and _add the relevant `Impact:[team]` label_.
placeholder: |
Example: This change impacts `dbt-redshift` because the tests will need to be modified. The `Impact:[Adapter]` label has been added.
validations:
required: true
- type: textarea
attributes:
label: Will backports be required?
description: |
Will this change need to be backported to previous versions? Add details, possible blockers to backporting and _add the relevant backport labels `backport 1.x.latest`_
placeholder: |
Example: Backport to 1.6.latest, 1.5.latest and 1.4.latest. Since 1.4 isn't using click, the backport may be complicated. The `backport 1.6.latest`, `backport 1.5.latest` and `backport 1.4.latest` labels have been added.
validations:
required: true
- type: textarea
attributes:
label: Context
description: |
Provide the "why", motivation, and alternative approaches considered -- linking to previous refinement issues, spikes and documentation as appropriate
We try to maintain actions that are shared across repositories in a single place so that necessary changes only need to be made once.
[dbt-labs/actions](https://github.com/dbt-labs/actions/) is the central repository of actions and workflows we use across repositories.
GitHub Actions also live locally within a repository. The workflows can be found at `.github/workflows` from the root of the repository. These should be specific to that code base.
Note: We are actively moving actions into the central Action repository so there is currently some duplication across repositories.
___
## Basics of Using Actions
### Viewing Output
- View the detailed action output for your PR in the **Checks** tab of the PR. This only shows the most recent run. You can also view high-level **Checks** output at the bottom of the PR.
- View _all_ action output for a repository from the [**Actions**](https://github.com/dbt-labs/dbt-core/actions) tab. Workflow results last 1 year. Artifacts last 90 days, unless specified otherwise in individual workflows.
This view often shows what seem like duplicates of the same workflow. This occurs when files are renamed but the workflow name has not changed. These are in fact _not_ duplicates.
You can see the branch the workflow runs from in this view. It is listed in the table between the workflow name and the time/duration of the run. When blank, the workflow is running in the context of the `main` branch.
### How to view what workflow file is being referenced from a run
- When viewing the output of a specific workflow run, click the 3 dots at the top right of the display. There will be an option to `View workflow file`.
### How to manually run a workflow
- If a workflow has the `on: workflow_dispatch` trigger, it can be manually triggered
- From the [**Actions**](https://github.com/dbt-labs/dbt-core/actions) tab, find the workflow you want to run, select it and fill in any inputs required. That's it!
### How to re-run jobs
- From the UI you can rerun from failure
- You can retrigger the cla check by commenting on the PR with `@cla-bot check`
___
## General Standards
### Permissions
- By default, workflows have read permissions in the repository for the contents scope only when no permissions are explicitly set.
- It is best practice to always define the permissions explicitly. This will allow actions to continue to work when the default permissions on the repository are changed. It also allows explicit grants of the least permissions possible.
- There are a lot of permissions available. [Read up on them](https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs) if you're unsure what to use.
```yaml
permissions:
contents: read
pull-requests: write
```
### Secrets
- When to use a [Personal Access Token (PAT)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) vs the [GITHUB_TOKEN](https://docs.github.com/en/actions/security-guides/automatic-token-authentication) generated for the action?
The `GITHUB_TOKEN` is used by default. In most cases it is sufficient for what you need.
If you expect the workflow to result in a commit that should retrigger workflows, you will need to use a Personal Access Token for the bot to commit the file. When using the GITHUB_TOKEN, the resulting commit will not trigger another GitHub Actions Workflow run. This is due to limitations set by GitHub. See [the docs](https://docs.github.com/en/actions/security-guides/automatic-token-authentication#using-the-github_token-in-a-workflow) for a more detailed explanation.
For example, we must use a PAT in our workflow to commit a new changelog yaml file for bot PRs. Once the file has been committed to the branch, it should retrigger the check to validate that a changelog exists on the PR. Otherwise, it would stay in a failed state since the check would never retrigger.
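A minimal sketch of that pattern, assuming the PAT is stored in a repository secret (the secret name and checkout step below are illustrative, not the exact workflow):
```yaml
steps:
  - name: Check out the PR branch with a PAT so pushed commits retrigger checks
    uses: actions/checkout@v4
    with:
      ref: ${{ github.head_ref }}
      # illustrative secret name; the default GITHUB_TOKEN would not retrigger workflows
      token: ${{ secrets.BOT_PAT }}
```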
### Triggers
You can configure your workflows to run when specific activity on GitHub happens, at a scheduled time, or when an event outside of GitHub occurs. Read more details in the [GitHub docs](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows).
These triggers are under the `on` key of the workflow and more than one can be listed.
```yaml
on:
push:
branches:
- "main"
- "*.latest"
- "releases/*"
pull_request:
# catch when the PR is opened with the label or when the label is added
types: [opened, labeled]
workflow_dispatch:
```
Some triggers of note that we use:
- `push` - Runs your workflow when you push a commit or tag.
- `pull_request` - Runs your workflow when activity on a pull request in the workflow's repository occurs. Takes in a list of activity types (opened, labeled, etc) if appropriate.
- `pull_request_target` - Same as `pull_request` but runs in the context of the PR target branch.
- `workflow_call` - Used with reusable workflows. Triggered by another workflow calling it.
- `workflow_dispatch` - Gives the ability to manually trigger a workflow from the GitHub API, GitHub CLI, or GitHub browser interface.
### Basic Formatting
- Add a description of what your workflow does at the top in this format
- Print out all variables you will reference as the first step of a job. This allows for easier debugging. The first job should log all inputs. Subsequent jobs should reference outputs of other jobs, if present.
When possible, generate variables at the top of your workflow in a single place to reference later. This is not always strictly possible since you may generate a value to be used later mid-workflow.
Be sure to use quotes around these logs so special characters are not interpreted.
```yaml
job1:
  steps:
    - name: "[DEBUG] Print Variables"
      run: |
        echo "all variables defined as inputs"
        echo "The last commit sha in the release: ${{ inputs.sha }}"
        echo "The release version number: ${{ inputs.version_number }}"
        echo "The changelog_path: ${{ inputs.changelog_path }}"
        echo "The build_script_path: ${{ inputs.build_script_path }}"
        echo "The s3_bucket_name: ${{ inputs.s3_bucket_name }}"
        echo "The package_test_command: ${{ inputs.package_test_command }}"
    # collect all the variables that need to be used in subsequent jobs
```
- When it's not obvious what something does, add a comment!
___
## Tips
### Context
- The [GitHub CLI](https://cli.github.com/) is available in the default runners
- Actions run in your context, i.e. an action from the marketplace that uses the GITHUB_TOKEN uses the GITHUB_TOKEN generated by your workflow run.
### Runners
- We dynamically set runners based on repository vars (see the sketch after this list). Admins can view repository vars and reset them. Current values are the following but are subject to change:
- `vars.UBUNTU_LATEST` -> `ubuntu-latest`
- `vars.WINDOWS_LATEST` -> `windows-latest`
- `vars.MACOS_LATEST` -> `macos-14`
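For example, a job would reference the repository var rather than hardcoding the runner label (illustrative sketch):
```yaml
jobs:
  unit-tests:
    runs-on: ${{ vars.UBUNTU_LATEST }}
    steps:
      - uses: actions/checkout@v4
```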
### Actions from the Marketplace
- Don’t use external actions for things that can easily be accomplished manually.
- Always read through what an external action does before using it! Often an action in the GitHub Actions Marketplace can be replaced with a few lines in bash. This is much more maintainable (and won’t change under us) and clear as to what’s actually happening. It also avoids pulling unreviewed third-party code into our workflows.
- Pin actions _we don't control_ to tags.
### Connecting to AWS
- Authenticate with the aws managed workflow
```yaml
- name: Configure AWS credentials from Test account
  uses: aws-actions/configure-aws-credentials@v4  # assumed action and version; inputs omitted here
```
- Then access with the aws command that comes installed on the action runner machines
```yaml
- name: Copy Artifacts from S3 via CLI
run: aws s3 cp ${{ env.s3_bucket }} . --recursive
```
### Testing
- Depending on what your action does, you may be able to use [`act`](https://github.com/nektos/act) to test the action locally. Some features of GitHub Actions do not work with `act`, among those are reusable workflows. If you can't use `act`, you'll have to push your changes up before being able to test. This can be slow.
Include the number of the issue addressed by this PR above, if applicable.
PRs for code changes without an associated issue *will not be merged*.
See CONTRIBUTING.md for more information.
Add the `user docs` label to this PR if it will need docs changes. An
issue will get opened in docs.getdbt.com upon successful merge of this PR.
-->
### Problem
<!---
Describe the problem this PR is solving. What is the application state
before this PR is merged?
-->
### Solution
<!---
Describe the way this PR solves the above problem. Add as much detail as you
can to help reviewers understand your changes. Include any alternatives and
tradeoffs you considered.
-->
### Checklist
- [ ] I have read [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) and understand what's expected of me.
- [ ] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements)
- [ ] I have run this code in development, and it appears to resolve the stated issue.
- [ ] This PR includes tests, or tests are not required or relevant for this PR.
- [ ] I have [opened an issue to add/update docs](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose), or docs changes are not required/relevant for this PR
- [ ] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#Adding-CHANGELOG-Entry)
- [ ] This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
- [ ] This PR includes [type annotations](https://docs.python.org/3/library/typing.html) for new and modified functions.
if [[ "${{ steps.check_approvals.outputs.CORE_APPROVALS }}" -ge "${{ env.required_approvals }}" ]]; then
title="Extra requirements met"
message="Changes to artifact directory files requires at least ${{ env.required_approvals }} approvals from core team members. Current number of core team approvals: ${{ steps.check_approvals.outputs.CORE_APPROVALS }} "
echo "::notice title=$title::$message"
echo "REVIEW_STATUS=success" >> $GITHUB_OUTPUT
else
title="PR Approval Requirements Not Met"
message="Changes to artifact directory files requires at least ${{ env.required_approvals }} approvals from core team members. Current number of core team approvals: ${{ steps.check_approvals.outputs.CORE_APPROVALS }} "
echo "::notice title=$title::$message"
echo "REVIEW_STATUS=neutral" >> $GITHUB_OUTPUT
fi
id: review_check
- name: "Set check status"
id: status_check
run: |
if [[ "${{ steps.artifact_files_changed.outputs.artifact_changes }}" == 'false' ]]; then
# no extra review required
echo "current_status=success" >> $GITHUB_OUTPUT
elif [[ "${{ steps.review_check.outputs.REVIEW_STATUS }}" == "success" ]]; then
# we have all the required reviews
echo "current_status=success" >> $GITHUB_OUTPUT
else
# neutral exit - neither success nor failure
# we can't fail here because we use multiple triggers for this workflow and they won't reset the check
# workaround is to use a neutral exit to skip the check run until it's actually successful
echo "current_status=neutral" >> $GITHUB_OUTPUT
fi
- name:"Post Event"
# This step posts the status of the check because the workflow is triggered by multiple events
# and we need to ensure the check is always updated. Otherwise we would end up with duplicate
# checks in the GitHub UI.
run:|
if [[ "${{ steps.status_check.outputs.current_status }}" == "success" ]]; then
gh issue comment ${{ github.event.issue.number }} --repo ${{ github.repository }} --body "Thank you for your bug report! Our team will be out of the office for [Christmas and our Global Week of Rest](https://handbook.getdbt.com/docs/time_off#2024-us-holidays), from December 25, 2024, through January 3, 2025.
We will review your issue as soon as possible after returning.
Thank you for your understanding, and happy holidays! 🎄🎉
If you are a customer of dbt Cloud, please contact our Customer Support team via the dbt Cloud web interface or email **support@dbtlabs.com**."
changelog_comment: 'Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry).'
echo "To bypass this check, confirm that the change is not breaking (https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/artifacts/README.md#breaking-changes) and add the 'artifact_minor_upgrade' label to the PR. Modifications and additions to all fields require updates to https://github.com/dbt-labs/dbt-jsonschema."
issue_body:"At a minimum, update body to include a link to the page on docs.getdbt.com requiring updates and what part(s) of the page you would like to see updated.\n Originating from this issue: https://github.com/dbt-labs/dbt-core/issues/${{ github.event.issue.number }}"
stale-issue-message:"This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days."
stale-pr-message:"This PR has been marked as Stale because it has been open for 180 days with no activity. If you would like the PR to remain open, please remove the stale label or comment on the PR, or it will be closed in 7 days."
close-issue-message:"Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers."
# mark issues/PRs stale when they haven't seen activity in 180 days
# This workflow will test all test(s) at the input path given number of times to determine if it's flaky or not. You can test with any supported OS/Python combination.
# This is batched in groups of 10 to allow more test iterations to run faster.
# **why?**
# Testing if a test is flaky and if a previously flaky test has been fixed. This allows easy testing on supported python versions and OS combinations.
# **when?**
# This is triggered manually from dbt-core.
name: Flaky Tester
on:
workflow_dispatch:
inputs:
branch:
description:"Branch to check out"
type:string
required:true
default:"main"
test_path:
description:"Path to single test to run (ex: tests/functional/retry/test_retry.py::TestRetry::test_fail_fast)"
type:string
required:true
default:"tests/functional/..."
python_version:
description:"Version of Python to Test Against"
type:choice
options:
- "3.10"
- "3.11"
os:
description:"OS to run test in"
type:choice
options:
- "ubuntu-latest"
- "macos-14"
- "windows-latest"
num_runs_per_batch:
description:"Max number of times to run the test per batch. We always run 10 batches."
The "tasks" map to top-level dbt commands. So `dbt run` => task.run.RunTask, etc. Some are more like abstract base classes (GraphRunnableTask, for example) but all the concrete types outside of task should map to tasks. Currently one executes at a time. The tasks kick off their “Runners” and those do execute in parallel. The parallelism is managed via a thread pool, in GraphRunnableTask.
core/dbt/task/docs/index.html
This is the docs website code. It comes from the dbt-docs repository, and is generated when a release is packaged.
## Adapters
dbt uses an adapter-plugin pattern to extend support to different databases, warehouses, query engines, etc.
Note: dbt-postgres used to exist in dbt-core but is now in [the dbt-adapters repo](https://github.com/dbt-labs/dbt-adapters/tree/main/dbt-postgres)
Each adapter is a mix of python, Jinja2, and SQL. The adapter code also makes heavy use of Jinja2 to wrap modular chunks of SQL functionality, define default implementations, and allow plugins to override it.
Each adapter plugin is a standalone python package that includes:
- `dbt/include/[name]`: A "sub-global" dbt project, of YAML and SQL files, that reimplements Jinja macros to use the adapter's supported SQL syntax
- `dbt/adapters/[name]`: Python modules that inherit, and optionally reimplement, the base adapter classes defined in dbt-core
- `pyproject.toml`
The Postgres adapter code is the most central, and many of its implementations are used as the default defined in the dbt-core global project. The greater the distance of a data technology from Postgres, the more its adapter plugin may need to reimplement.
## Testing dbt
The [`tests/`](tests/) subdirectory includes unit and functional tests that run as continuous integration checks against open pull requests. Unit tests check mock inputs and outputs of specific python functions. Functional tests perform end-to-end dbt invocations against real adapters (Postgres) and assert that the results match expectations. See [the contributing guide](CONTRIBUTING.md) for a step-by-step walkthrough of setting up a local development and testing environment.
## Everything else
- [docker](docker/): All dbt versions are published as Docker images on DockerHub. This subfolder contains the `Dockerfile` (constant) and `requirements.txt` (one for each version).
- [etc](etc/): Images for README
- [scripts](scripts/): Helper scripts for testing, releasing, and producing JSON schemas. These are not included in distributions of dbt, nor are they rigorously tested—they're just handy tools for the dbt maintainers :)
- This file provides a full account of all changes to `dbt-core`
- Changes are listed under the (pre)release in which they first appear. Subsequent releases include changes from previous releases.
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md#adding-changelog-entry)
## Previous Releases
For information on prior major and minor releases, see their changelogs:
`dbt-core` is open source software. It is what it is today because community members have opened issues, provided feedback, and [contributed to the knowledge loop](https://www.getdbt.com/dbt-labs/values/). Whether you are a seasoned open source contributor or a first-time committer, we welcome and encourage you to contribute code, documentation, ideas, or problem statements to this project.
- [Contributing to `dbt-core`](#contributing-to-dbt-core)
- [About this document](#about-this-document)
- [Notes](#notes)
- [Getting the code](#getting-the-code)
- [Installing git](#installing-git)
- [External contributors](#external-contributors)
- [dbt Labs contributors](#dbt-labs-contributors)
- [Setting up an environment](#setting-up-an-environment)
- [Tools](#tools)
- [Virtual environments](#virtual-environments)
- [Docker and `docker-compose`](#docker-and-docker-compose)
- [Postgres (optional)](#postgres-optional)
- [Running `dbt-core` in development](#running-dbt-core-in-development)
- [Assorted development tips](#assorted-development-tips)
- [Adding or modifying a CHANGELOG Entry](#adding-or-modifying-a-changelog-entry)
- [Submitting a Pull Request](#submitting-a-pull-request)
- [Troubleshooting Tips](#troubleshooting-tips)
## About this document
There are many ways to contribute to the ongoing development of `dbt-core`, such as by participating in discussions and issues. We encourage you to first read our higher-level document: ["Expectations for Open Source Contributors"](https://docs.getdbt.com/docs/contributing/oss-expectations).
The rest of this document serves as a more granular guide for contributing code changes to `dbt-core` (this repository). It is not intended as a guide for using `dbt-core`, and some pieces assume a level of familiarity with Python development and package managers. Specific code snippets in this guide assume you are using macOS or Linux and are comfortable with the command line.
If you get stuck, we're happy to help! Drop us a line in the `#dbt-core-development` channel in the [dbt Community Slack](https://community.getdbt.com).
### Notes
- **Adapters:** Is your issue or proposed code change related to a specific [database adapter](https://docs.getdbt.com/docs/available-adapters)? If so, please open issues, PRs, and discussions in that adapter's repository instead.
- **CLA:** Please note that anyone contributing code to `dbt-core` must sign the [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements). If you are unable to sign the CLA, the `dbt-core` maintainers will unfortunately be unable to merge any of your Pull Requests. We welcome you to participate in discussions, open issues, and comment on existing ones.
- **Branches:** All pull requests from community contributors should target the `main` branch (default). If the change is needed as a patch for a minor version of dbt that has already been released (or is already a release candidate), a maintainer will backport the changes in your PR to the relevant "latest" release branch (`1.0.latest`, `1.1.latest`, ...). If an issue fix applies to a release branch, that fix should be first committed to the development branch and then to the release branch (rarely release-branch fixes may not apply to `main`).
- **Releases**: Before releasing a new minor version of Core, we prepare a series of alphas and release candidates to allow users (especially employees of dbt Labs!) to test the new version in live environments. This is an important quality assurance step, as it exposes the new code to a wide variety of complicated deployments and can surface bugs before official release. Releases are accessible via our [supported installation methods](https://docs.getdbt.com/docs/core/installation-overview#install-dbt-core).
## Getting the code
There are some tools that will be helpful to you in developing locally.
These are the tools used in `dbt-core` development and testing:
- [`hatch`](https://hatch.pypa.io/) for build backend, environment management, and running tests across Python versions (3.10, 3.11, 3.12, and 3.13)
- [`pytest`](https://docs.pytest.org/en/latest/) to define, discover, and run tests
- [`flake8`](https://flake8.pycqa.org/en/latest/) for code linting
- [`black`](https://github.com/psf/black) for code formatting
- [`mypy`](https://mypy.readthedocs.io/en/stable/) for static type checking
- [`pre-commit`](https://pre-commit.com) to easily run those checks
- [`changie`](https://changie.dev/) to create changelog entries, without merge conflicts
- [`make`](https://users.cs.duke.edu/~ola/courses/programming/Makefiles/Makefiles.html) to run multiple setup or test steps in combination. Don't worry too much, nobody _really_ understands how `make` works, and our Makefile aims to be super simple.
- [GitHub Actions](https://github.com/features/actions) for automating tests and checks, once a PR is pushed to the `dbt-core` repository
A deep understanding of these tools is not required to effectively contribute to `dbt-core`, but we recommend checking out the attached documentation if you're interested in learning more about each one.
#### Virtual environments
dbt-core uses [Hatch](https://hatch.pypa.io/) for dependency and environment management. Hatch automatically creates and manages isolated environments for development, testing, and building, so you don't need to manually create virtual environments.
For more information on how Hatch manages environments, see the [Hatch environment documentation](https://hatch.pypa.io/latest/environment/).
#### Docker and `docker-compose`
### Installation
First make sure you have Python 3.10 or later installed. Ensure you have the latest version of pip installed with `pip install --upgrade pip`. Next, install `hatch`. Finally set up `dbt-core` for development:
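For example (assuming you install `hatch` with pip; `hatch run setup` is the setup script referenced throughout this guide):
```sh
pip install hatch
cd core
hatch run setup
```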
This will install all development dependencies and set up pre-commit hooks.
By default, hatch will use whatever Python version is active in your environment. To specify a particular Python version, set the `HATCH_PYTHON` environment variable:
```sh
export HATCH_PYTHON=3.12
hatch env create
```
Or add it to your shell profile (e.g., `~/.zshrc` or `~/.bashrc`) for persistence.
When installed in this way, any changes you make to your local copy of the source code will be reflected immediately in your next `dbt` run.
#### Building dbt-core
dbt-core uses [Hatch](https://hatch.pypa.io/) (specifically `hatchling`) as its build backend. To build distribution packages:
```sh
cd core
hatch build
```
This will create both wheel (`.whl`) and source distribution (`.tar.gz`) files in the `dist/` directory.
The build configuration is defined in `core/pyproject.toml`. You can also use the standard `python -m build` command if you prefer.
### Running `dbt-core`
Once you've run `hatch run setup`, the `dbt` command will be available in your PATH. You can verify this by running `which dbt`.
Configure your [profile](https://docs.getdbt.com/docs/configure-your-profile) as necessary to connect to your target databases. It may be a good idea to add a new profile pointing to a local Postgres instance, or a specific test sandbox within your data warehouse if appropriate. Make sure to create a profile before running integration tests.
## Testing
Although `dbt-core` works with a number of different databases, you won't need to test against all of them.
Postgres offers the easiest way to test most `dbt-core` functionality today. They are the fastest to run, and the easiest to set up. To run the Postgres integration tests, you'll have to do one extra step of setting up the test database:
```sh
cd core
hatch run setup-db
```
Alternatively, you can run the setup commands directly:
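A sketch of what that looks like; the Docker Compose service name and setup script path below are assumptions, so check the repository's `docker-compose.yml` and test scripts for the exact commands:
```sh
# start the Postgres container used for testing (service name assumed)
docker compose up -d database
# create the test database and users (script path assumed)
bash test/setup_db.sh
```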
There are a few methods for running tests locally.
#### Hatch scripts
The primary way to run tests and checks is using hatch scripts (defined in `core/hatch.toml`):
```sh
cd core
# Run all unit tests
hatch run unit-tests
# Run unit tests and all code quality checks
hatch run test
# Run integration tests
hatch run integration-tests
# Run integration tests in fail-fast mode
hatch run integration-tests-fail-fast
# Run linting checks only
hatch run lint
hatch run flake8
hatch run mypy
hatch run black
# Run all pre-commit hooks
hatch run code-quality
# Clean build artifacts
hatch run clean
```
Hatch manages isolated environments and dependencies automatically. The commands above use the `default` environment which is recommended for most local development.
**Using the `ci` environment (optional)**
If you need to replicate exactly what runs in GitHub Actions (e.g., with coverage reporting), use the `ci` environment:
```sh
cd core
# Run unit tests with coverage
hatch run ci:unit-tests
# Run unit tests with a specific Python version
hatch run +py=3.11 ci:unit-tests
```
> **Note:** Most developers should use the default environment (`hatch run unit-tests`). The `ci` environment is primarily for debugging CI failures or running tests with coverage.
#### `pre-commit`
[`pre-commit`](https://pre-commit.com) takes care of running all code-checks for formatting and linting. Run `hatch run setup` to install `pre-commit` in your local environment (we recommend running this command with a python virtual environment active). This installs several pip executables including black, mypy, and flake8. Once installed, hooks will run automatically on `git commit`, or you can run them manually with `hatch run code-quality`.
#### `pytest`
Finally, you can also run a specific test or group of tests using [`pytest`](https://docs.pytest.org/en/latest/) directly. After running `hatch run setup`, you can run pytest commands like:
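For example (the test paths below are illustrative):
```sh
# run a single test module
python -m pytest tests/functional/retry/test_retry.py
# run a single test case within that module
python -m pytest tests/functional/retry/test_retry.py::TestRetry::test_fail_fast
# run all unit tests
python -m pytest tests/unit
```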
> See [pytest usage docs](https://docs.pytest.org/en/6.2.x/usage.html) for an overview of useful command-line options.
### Unit, Integration, Functional?
Here are some general rules for adding tests:
* unit tests (`tests/unit`) don’t need to access a database; "pure Python" tests should be written as unit tests
* functional tests (`tests/functional`) cover anything that interacts with a database, namely adapter
## Debugging
1. The logs for a `dbt run` have stack traces and other information for debugging errors (in `logs/dbt.log` in your project directory).
2. Try using a debugger, like `ipdb`. For pytest: `--pdb --pdbcls=IPython.terminal.debugger:pdb`
3. Sometimes, it’s easier to debug on a single thread: `dbt --single-threaded run`
4. To make print statements from Jinja macros: `{{ log(msg, info=true) }}`
5. You can also add `{{ debug() }}` statements, which will drop you into some auto-generated code that the macro wrote.
6. The dbt “artifacts” are written out to the ‘target’ directory of your dbt project. They are in unformatted json, which can be hard to read. Format them with:
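One option is Python's built-in `json.tool` (the artifact name here is just an example):
```sh
python -m json.tool target/manifest.json > manifest.formatted.json
```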
* Append `# type: ignore` to the end of a line if you need to disable `mypy` on that line.
* Sometimes flake8 complains about lines that are actually fine, in which case you can put a comment on the line such as: # noqa or # noqa: ANNN, where ANNN is the error code that flake8 issues.
* To collect output for `CProfile`, run dbt with the `-r` option and the name of an output file, i.e. `dbt -r dbt.cprof run`. If you just want to profile parsing, you can do: `dbt -r dbt.cprof parse`. `pip` install `snakeviz` to view the output. Run `snakeviz dbt.cprof` and output will be rendered in a browser window.
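Putting those profiling steps together:
```sh
dbt -r dbt.cprof parse      # or: dbt -r dbt.cprof run
pip install snakeviz
snakeviz dbt.cprof          # renders the profile in a browser window
```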
## Adding or modifying a CHANGELOG Entry
We use [changie](https://changie.dev) to generate `CHANGELOG` entries. **Note:** Do not edit the `CHANGELOG.md` directly. Your modifications will be lost.
Follow the steps to [install `changie`](https://changie.dev/guide/installation/) for your system.
Once changie is installed and your PR is created for a new feature, simply run the following command and changie will walk you through the process of creating a changelog entry:
```shell
changie new
```
Commit the file that's created and your changelog entry is complete!
If you are contributing to a feature already in progress, you will modify the changie yaml file in dbt/.changes/unreleased/ related to your change. If you need help finding this file, please ask within the discussion for the pull request!
You don't need to worry about which `dbt-core` version your change will go into. Just create the changelog entry with `changie`, and open your PR against the `main` branch. All merged changes will be included in the next minor version of `dbt-core`. The Core maintainers _may_ choose to "backport" specific changes in order to patch older minor versions. In that case, a maintainer will take care of that backport after merging your PR, before releasing the new version of `dbt-core`.
## Submitting a Pull Request
Code can be merged into the current development branch `main` by opening a pull request. If the proposal looks like it's on the right track, then a `dbt-core` maintainer will triage the PR and label it as `ready_for_review`. From this point, two code reviewers will be assigned with the aim of responding to any updates to the PR within about one week. They may suggest code revision for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code. Once merged, your contribution will be available for the next release of `dbt-core`.
Automated tests run via GitHub Actions. If you're a first-time contributor, all tests (including code checks and unit tests) will require a maintainer to approve. Changes in the `dbt-core` repository trigger integration tests against Postgres. dbt Labs also provides CI environments in which to test changes to other adapters, triggered by PRs in those adapters' repositories, as well as periodic maintenance checks of each adapter in concert with the latest `dbt-core` code changes.
We require signed git commits. See docs [here](https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits) for setting up code signing.
Once all tests are passing, all comments are resolved, and your PR has been approved, a `dbt-core` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:
## Troubleshooting Tips
Sometimes, the contributor license agreement auto-check bot doesn't find a user's entry in its roster. If you need to force a rerun, add `@cla-bot check` in a comment on the pull request.