* bumps to version 1.20.0
* update the hub reference docs, add CI check
* use dependency specifier in hub for plugin version check
* minimum dlt runtime cli check
* rolls back to old fsspec min version
* fixes test_hub ci workflow
* fixes flaky test
* bumps hub extra
* updates cli docs linting
* fixes docs lock
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
Co-authored-by: ivasio <ivan@dlthub.com>
* adds hub extra
* makes hub module more user friendly when hub not installed
* test and lint fixes
* adds plugin version check util function
* basic cell appearing if installed
* use data quality cell
* show raw data too
* adds dlt-runtime to hub extra, minimal import tests
* bumps to dlthub 0.20.0 alpha
* lists pipelines with cli using the same functions as dashboard, dlt pipeline will list pipelines by default
* adds configured profiles method on context so only profiles with configs or pipelines are listed
* adds list of locations that contained actual configs to provider interface
* improves workspace and profile commands
* test fixes
* fixes tests
* update text
* adds quality widget as python functions
* adds data_quality as module to hub
* adds hub extra to docs deps
* fixes dashboard imports
* bumps to alpha x.20.0a1
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
* implement RunContext.reset_config, call it in PluggableRunContext.reload_providers
* fix _config access
* reinitialize RunContext._runtime_config on access
* adjust the test to .runtime_config being always available
* fixes dlthub tests
---------
Co-authored-by: ivasio <ivan@dlthub.com>
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
* fixes historic builds
* fix broken link
* constrain docs build env to python 3.10
* switch snippets testing to python 3.10
* allows python up to py3.12 in docs project
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
* extracts adbc parquet load job with file format selector
* ports postgres parquet job to base job
* implements mssql adbc job
* adds pickle test for all destination caps
* adds dbc to adbc group, updates test workflow
* fixes sqlglot from find
* fixes docs
* adds sqlalchemy adbc docs
* adds support for sqlite and mysql in sqlalchemy
* fixes and tests str annotation resolving
* allows to disable adbc and does that in tests
* fixes imports
* docs lock bump
* fixes globalns extraction
* clarifies how adbc drivers are installed, implements fallback for postgres
* improves dashboard multi schema test
* fixes followup jobs
* fixes connection string escaping
* Update docs/website/docs/dlt-ecosystem/destinations/sqlalchemy.md
Co-authored-by: djudjuu <djudju@proton.me>
* removes code dedup
* fixes columns that receive None, simple and nested values
---------
Co-authored-by: djudjuu <djudju@proton.me>
Added a dropdown for profile selection in the dashboard interface and updated the layout to display profile and workspace information inline with pipeline selection.
* Minor hub docs polishing
* fixes workflow setup wrt not running certain steps if there are only docs changes
* Remove the duplicate content
* Fix build
---------
Co-authored-by: David Scharf <shrps@posteo.net>
* adds option in load that prevents draining pool on signal
* adds runtime pipeline option to not intercept signals
* refactors signal module
* tests new cases
* describes signal handling in running in prod docs
* bumps dlt to 1.18.0
* fixes tests forked
* removes logging and buffered console output from signals
* adds retry count to load job metrics, generates started_at in init of runnable load job
* allows to update existing metrics in load step
* finalized jobs require start and finish dates
* generates metrics in each job state and in each completed loop, does not complete package if pool drained but jobs left, adds detailed tests for metrics
* fixes remote metrics
* replaces event with package bound semaphore to complete load jobs early
* fixes dashboard on windows
* improves signals docs
* renames delayed_signals to intercepted_signals
* enable python 3.14
* try on mac
* remove beta 4 disclaimer
* adds sleep before starting windows e2e tests
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
* Feature, Add support of http based paths
* Feature, Add support of http resources
* Feature, Enforce coercion to pendulum types. Add support of RFC 1123 format
* Feature, Add cloudfront base_url to the configurations
* Feature, Add a test for http based resources
* Feature, Add a test case for RFC 1123 datetime format
* Feature, Remove test cases related to datetime parsing in RFC and timestamp formats
* Revert "Feature, Enforce coercion to pendulum types. Add support of RFC 1123 format"
This reverts commit 142624b24a.
* Feature, Restore the structure of the url for the cdn
* Feature, Replace custom datetime parser function with a single dispatched one
* Feature, Add a stub package for singledispatch
* Feature, Refactor pendulum datetime processing functions
* Feature, Fix the linting errors in time related tests
* Feature, Fix the declaration
* Feature, Revert the changes related to datetime parsing
* Feature, Add http schema for testing. Add pendulum parser to support RFC 1123 format
* Feature, Update the configuration for http bucket
* Feature, Add a http server. Update the test for http fs
* Feature, Upgrade fsspec
* Feature, Fix codestyle
* Feature, Fix the protocol validation for fsspec args
* Feature, Fix the typing annotations
* Add an example for http filesystem
* Feature, Add scheme to the urlparse call
* Feature, Fix the codestyle for http entries in MIME_DISPATCH
* Feature, Expand the list of supported locations in the docs
* uses more random port and closes httpd to release it properly, drops auto fixture as it would be attached to all tests
* moves httpd tests to common tests
* adds http extra to support fsspec
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
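For reference, the RFC 1123 timestamps mentioned in the commits above are the ones HTTP servers send in headers such as `Last-Modified`, for example `Sun, 06 Nov 1994 08:49:37 GMT`. The sketch below parses one with the Python standard library only. It is not the pendulum-based parser these commits add, and `parse_rfc1123` is a hypothetical helper name used for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_rfc1123(value: str) -> datetime:
    """Parse an RFC 1123 date string into a timezone-aware UTC datetime.

    Hypothetical helper, not part of dlt.
    """
    parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treating it as UTC is an assumption.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


modified = parse_rfc1123("Sun, 06 Nov 1994 08:49:37 GMT")
print(modified.isoformat())
```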
* does not fail config resolution if a native value is provided to a config that does not implement native values
* updates databricks docs
* allows to replace hints regexes on schema
* removes partition hint on eth merge test on databricks
* adds pokemon table count consts
* reorgs databricks dlt fix
* fixes lancedb custom destination example
* reduces number of sql_database examples run on ci
* fixes merge
* marks and skips rfam tests
* Add DLT destination capabilities tags to documentation files
This commit introduces the `<!--@@@DLT_DESTINATION_CAPABILITIES <destination>-->` tags to various destination documentation files. The following files were updated:
- athena.md
- bigquery.md
- clickhouse.md
- databricks.md
- destination.md
- dremio.md
- duckdb.md
- ducklake.md
- filesystem.md
- lancedb.md
- motherduck.md
- mssql.md
- postgres.md
- qdrant.md
- redshift.md
- snowflake.md
- sqlalchemy.md
- synapse.md
- weaviate.md
* Enhance documentation by adding destination capabilities sections
This commit adds the `## Destination capabilities` section along with the corresponding `<!--@@@DLT_DESTINATION_CAPABILITIES <destination>-->` tags to various destination documentation files. The following files were updated:
- athena.md
- bigquery.md
- clickhouse.md
- databricks.md
- destination.md
- dremio.md
- duckdb.md
- ducklake.md
- filesystem.md
- lancedb.md
- motherduck.md
- mssql.md
- postgres.md
- qdrant.md
- redshift.md
- snowflake.md
- sqlalchemy.md
- synapse.md
- weaviate.md
* Add new script for inserting DLT destination capabilities
* Update package.json and package-lock.json to include new script for inserting destination capabilities
This commit modifies the `package.json` to add a new script for inserting destination capabilities and updates the `package-lock.json` to reflect the changes in dependencies. The new script allows for better integration of destination capabilities into the documentation process.
* Revert "Update package.json and package-lock.json to include new script for inserting destination capabilities"
This reverts commit cd5d6c2fae.
* Add script for inserting destination capabilities into documentation
This commit introduces a new Python script, `insert_destination_capabilities.py`. For now it contains only a placeholder that prints to the console to test the setup.
* Add destination capabilities execution
This commit introduces a new function, `executeDestinationCapabilities`, which executes a Python script to insert destination capabilities into the documentation process.
* Enhance destination capabilities insertion script
This commit refines the `insert_destination_capabilities.py` script by adding functionality to dynamically generate and insert destination capabilities tables into documentation files. It introduces a new data structure for capabilities, improves file processing logic, and ensures that only relevant files are processed. Additionally, it enhances error handling and logging for better traceability during execution.
* Refactor destination capabilities insertion script
This commit updates the `insert_destination_capabilities.py` script to improve its functionality by dynamically retrieving supported destination names from the source directory. It enhances the file processing logic to ensure only relevant files are processed based on available destinations. Additionally, it improves error handling and logging for better execution traceability.
* Refactor and enhance destination capabilities insertion script
This commit refines the `insert_destination_capabilities.py` script by adding functionality to dynamically retrieve and format destination capabilities into markdown tables. It introduces improved error handling, validation for destination names, and enhances the file processing logic to ensure only relevant files are processed. Additionally, it updates the main function to include pre-checks for source and target directories, ensuring a more robust execution flow.
* Refactor and improve destination capabilities insertion script
This commit enhances the `insert_destination_capabilities.py` script by refining the logic for generating markdown tables of destination capabilities. It introduces new patterns for documentation links, improves error handling, and optimizes the processing of relevant capabilities. Additionally, it streamlines the file processing logic and ensures that only valid capabilities are included in the output, resulting in cleaner and more informative documentation.
* Remove destination capabilities sections from various documentation files
This commit removes the `## Destination capabilities` sections and their corresponding `<!--@@@DLT_DESTINATION_CAPABILITIES <destination>-->` tags from multiple destination documentation files, including athena.md, bigquery.md, clickhouse.md, databricks.md, dremio.md, duckdb.md, ducklake.md, filesystem.md, lancedb.md, motherduck.md, mssql.md, postgres.md, qdrant.md, redshift.md, snowflake.md, sqlalchemy.md, synapse.md, and weaviate.md. This cleanup helps streamline the documentation and focuses on relevant content.
* Add destination capabilities sections to various documentation files
This commit introduces `## Destination capabilities` sections along with their corresponding `<!--@@@DLT_DESTINATION_CAPABILITIES <destination>-->` tags in multiple destination documentation files, including athena.md, bigquery.md, clickhouse.md, databricks.md, dremio.md, duckdb.md, ducklake.md, filesystem.md, lancedb.md, motherduck.md, mssql.md, postgres.md, qdrant.md, redshift.md, snowflake.md, sqlalchemy.md, synapse.md, and weaviate.md. This addition enhances the documentation by providing clear insights into the capabilities of each destination, improving user understanding and usability.
* Update documentation for various destinations with formatting improvements
This commit enhances the documentation for multiple destinations, including BigQuery, ClickHouse, Databricks, Dremio, DuckDB, DuckLake, Filesystem, LanceDB, MotherDuck, MSSQL, Postgres, Qdrant, Redshift, Snowflake, SQLAlchemy, Synapse, and Weaviate. Changes include improved formatting for warnings, notes, and tips, as well as minor adjustments to the content for clarity and consistency. These updates aim to enhance the readability and usability of the documentation for users.
* Remove destination capabilities sections from various documentation files
* Update destinations with capabilities marker
* Added type guard to guard against Any
* Temporarily commit preprocessed docs
* Add new constants for documentation preprocessing and update requirements
This commit introduces a new `constants.py` file containing various constants for documentation preprocessing, including directory paths, file extensions, timing settings, and markers. Additionally, the `requirements.txt` file is updated to include `watchdog` and `requests` packages, enhancing the project's dependencies.
* Add tuba links processing script and remove unused line from constants
This commit introduces a new script, `preprocess_tuba.py`, which handles the fetching and formatting of tuba links for documentation. It includes functions for fetching configuration, extracting tags, and inserting links into markdown files. Additionally, an unused line has been removed from `constants.py` to clean up the code.
* Refactor tuba link processing and extract utility function
This commit refactors the `preprocess_tuba.py` script by moving the `extract_marker_content` function to a new `utils.py` file for better organization and reusability. The logic for checking the presence of the TUBA marker has been simplified, and the formatting function for tuba links has been updated to improve clarity and maintainability. These changes enhance the overall structure of the documentation preprocessing tools.
* Add snippet processing functionality for documentation
This commit introduces a new script, `preprocess_snippets.py`, which provides functions for building a map of code snippets, retrieving snippets from files, and inserting them into markdown documents. The script enhances the documentation preprocessing tools by allowing for better management and formatting of code snippets. Additionally, the `utils.py` file is updated with new utility functions for directory traversal and marker content extraction, improving overall code organization and reusability.
* Add example processing script for documentation generation
This commit introduces a new script, `process_examples.py`, which automates the generation of example documentation from Python files. The script includes functionality to build documentation by extracting headers, comments, and code snippets, while also handling exclusions and errors gracefully. Additionally, the `utils.py` file is updated with a new utility function, `trim_array`, to enhance the management of line arrays. These changes improve the documentation process by streamlining example integration and ensuring better formatting.
* Enhance documentation preprocessing with Python integration and new script
This commit updates the `package.json` to include a new script for installing Python dependencies and modifies the start and build scripts to incorporate Python preprocessing. Additionally, a new `preprocess_docs.py` script is introduced, which automates the processing of markdown files by inserting code snippets, managing links, and syncing examples. The `requirements.txt` is also updated to include a new dependency, `python-debouncer`, improving the documentation workflow.
* Refactor documentation preprocessing scripts for improved async handling and example processing
This commit enhances the `preprocess_docs.py` script by integrating asynchronous file handling and introducing a lock mechanism to manage concurrent processing. The `package.json` is updated to modify the start script for better coordination of preprocessing tasks. Additionally, a new `preprocess_examples.py` script is added to streamline the generation of example documentation, ensuring proper formatting and error handling. The `preprocess_snippets.py` script is also updated to maintain consistency in line reading methods. These changes collectively improve the efficiency and reliability of the documentation workflow.
* Refactor documentation preprocessing scripts for improved efficiency and caching
This commit updates the `package.json` to streamline the start script by removing the lock file mechanism and enhancing the coordination of preprocessing tasks. The `preprocess_docs.py` script is refactored to eliminate the lock file usage, simplifying the processing flow. Additionally, the `preprocess_tuba.py` script introduces a caching mechanism for tuba configuration to reduce redundant network requests, improving performance. These changes collectively enhance the documentation workflow and processing efficiency.
* Refactor file change handling in documentation preprocessing scripts
This commit enhances the `preprocess_docs.py` script by simplifying the file change handling logic through the introduction of a new `handle_change_impl` function. The previous `should_process` function is removed to streamline the decision-making process for file processing. Additionally, whitespace cleanup is performed for better code readability. The `preprocess_tuba.py` script also receives minor whitespace adjustments. These changes collectively improve the maintainability and clarity of the documentation preprocessing workflow.
* Add destination capabilities processing and refactor related scripts
This commit introduces a new script, `preprocess_destination_capabilities.py`, which handles the generation of destination capabilities tables for documentation. It includes caching mechanisms for improved performance and integrates with existing constants for consistency. The `insert_destination_capabilities` function is now called within `preprocess_docs.py` to streamline the documentation processing workflow. Additionally, the `insert_destination_capabilities.py` script is removed as its functionality is now encapsulated in the new script. These changes enhance the documentation generation process by providing structured capabilities information.
* Update package-lock.json and package.json for improved documentation preprocessing
This commit updates the `package-lock.json` to reflect changes in dependencies and their versions, ensuring compatibility and performance enhancements. The `package.json` is modified to streamline the `start` and `preprocess-docs` scripts by removing the installation of Python dependencies from the start command and adjusting the environment variable settings. These changes collectively enhance the efficiency and reliability of the documentation generation workflow.
* Add processed docs entry to .gitignore
This commit updates the .gitignore file to include the 'docs_processed' entry, ensuring that preprocessed documentation files are excluded from version control. This change helps maintain a cleaner repository by preventing unnecessary files from being tracked.
* Stop tracking docs_processed directory
* Remove the `preprocess_docs.js` script, which handled documentation preprocessing tasks including snippet insertion and link management. This deletion streamlines the codebase by eliminating unused functionality, following recent refactoring efforts to improve documentation processing workflows.
* Refactor destination capabilities processing script for type hinting and formatting improvements
This commit updates the `preprocess_destination_capabilities.py` script by adding type hints for caching variables, enhancing code clarity and maintainability. Additionally, it modifies the formatting of the capabilities table to ensure consistent output and appends a newline for better readability. These changes collectively improve the structure and presentation of destination capabilities in the documentation.
* Refactor documentation processing scripts by removing unnecessary argument documentation
This commit simplifies the `insert_destination_capabilities` function in `preprocess_destination_capabilities.py` by removing the detailed argument and return type documentation. Additionally, the `format_tuba_links_section` function in `preprocess_tuba.py` is updated to streamline its docstring, enhancing clarity while maintaining essential information. These changes improve the readability and maintainability of the documentation processing scripts.
* Update package.json to streamline documentation processing scripts
This commit modifies the `package.json` to include a new script for installing Python dependencies and updates the `start` and `build` scripts to ensure a more efficient workflow. The changes enhance the coordination of documentation preprocessing tasks, improving the overall efficiency of the documentation generation process.
* Added dependency installation in start
* Refactor package.json scripts for improved documentation processing
This commit updates the `package.json` to streamline the `start`, `build`, and `build:cloudflare` scripts by removing redundant installation of Python dependencies. The `preprocess-docs` script is now defined separately, enhancing clarity and efficiency in the documentation generation workflow.
* Add type checking configurations for additional modules in mypy.ini
This commit extends the mypy.ini configuration by adding ignore_missing_imports settings for several new modules, including constants and various preprocess modules. These changes aim to improve type checking flexibility and reduce false positives during type analysis, enhancing the overall development experience.
* Enhance type hinting in preprocessing scripts for improved clarity
This commit updates the type hints in `preprocess_destination_capabilities.py`, `preprocess_snippets.py`, and `preprocess_tuba.py` to provide more specific type information. Changes include casting for constants and refining list and dictionary type annotations. These improvements enhance code readability and maintainability, supporting better type checking and development practices.
* Update dependencies and refactor documentation processing scripts
This commit adds the `python-debouncer` dependency to `pyproject.toml` for improved event handling in documentation processing. Additionally, it refines the `package.json` scripts by separating the `preprocess-docs` command and optimizing the `start` script for better efficiency. The `preprocess_docs.py` script is also updated to utilize lazy imports for certain modules, enhancing performance during documentation processing. These changes collectively improve the clarity and efficiency of the documentation generation workflow.
* Remove requirements.txt and clean up whitespace in preprocess_docs.py
This commit deletes the `requirements.txt` file, which is no longer needed, and cleans up unnecessary whitespace in the `preprocess_docs.py` script. These changes help streamline the codebase and improve overall readability.
* Update documentation for Databricks and DuckLake destinations
This commit enhances the documentation for Databricks by adding a note about loading data to Managed Iceberg tables and refining the descriptions of table and column-level hints. Additionally, it updates the DuckLake documentation to recommend using a more explicit catalog name in configuration examples. These changes improve clarity and usability for users working with these destinations.
* Enhance documentation for various destinations and add requirements.txt for project dependencies
* Fix typo in DuckDB documentation regarding spatial extension installation
* Remove destination capabilities section from AWS Athena documentation
* Feat/adds workspace (#3171)
* ports toml config provider with profiles
* supports run context with profiles
* separates pluggy hooks from impls, uses pyproject and __plugins__.py for self-plugging
* implements workspace run context with profiles and basic cli
* displays workspace name and profile name before executing cli commands if run context supports profiles
* exposes dlt.current.workspace()
* converts run context protocol into abstract class
* fixes plugins tests
* refactors _workspace: private and public modules
* adds workspace test cases
* launches workspace and pipeline mcp with cli, sse by default
* tests basic workspace behaviors
* refactors code to switch context and profile
* adds default profile to run context interface
* ports pipeline and oss mcp, changes derivation structure
* adds safeguards and tests to workspace cleanup cli helper
* adds run_context to SupportsPipeline, checks run_context change on pipeline activation
* adds mcp dependency to workspace extra, fixes types
* renames test fixture
* mcp export tweak
* updates cli reference and common ci workflow
* disables dlt-plus deps in ci
* removes df from mcp tools, fixes workspace tests
* fixes tests
* Fix build scripts for Cloudflare integration in package.json
* Fix preprocess-docs:cloudflare script to use python directly instead of uv
* Restore preprocess-docs scripts in package.json for consistency
* Update preprocess-docs:cloudflare script to include requirements installation
* Add __init__.py file to tools directory
* Refactor import statements to use relative imports in preprocessing scripts
* Update import statements to use absolute paths for consistency across preprocessing scripts
* Add mypy configuration for additional modules to ignore missing imports
* Removed duplicated line
* Add mypy configuration to ignore missing imports for tools module
* Update ducklake.md
* temporarily add netlify build command back
* fix typing in snippets and update mypy.ini a bit
* reverse build commands back to previous order
* Fixed watch by changing implementation into queue and locks
* Refactor package.json for improved script organization and maintainability
* Add mypy configuration to ignore missing imports for additional modules
* Add mypy configuration to ignore missing imports for more modules
* Remove mypy configuration for preprocess_examples to streamline settings
* Update mypy configuration: rename dlt hub section to dlt plus and remove unused preprocess settings
* Refactor import statements to remove 'tools' prefix, improving module accessibility across preprocess scripts
* Refactor import statements in preprocessing scripts to use relative imports, enhancing module organization and consistency
* Refactor import statements in preprocessing scripts to use absolute imports from the tools module, improving clarity and consistency across the codebase
* Update mypy.ini
* Fix formatting in _generate_doc_link function by removing unnecessary whitespace in return statement for improved readability
* fix linting and script execution
* remove sleeping after preprocessing in favor of predictable processing before docusaurus launch
* remove unnecessary whitespace in preprocess_docs.py for cleaner code
* Update deployment script in package.json and enhance file change handling in preprocess_docs.py; remove obsolete preprocess_change.py
* Refactor preprocess_docs.py to improve file change handling; replace change counter with a pending changes flag for better processing control and enhance logging for file modifications.
* Enhance capabilities table generation in preprocess_destination_capabilities.py by adding a descriptive header and introductory text for improved clarity and context.
* Remove destination capabilities sections from multiple destination documentation files for consistency and clarity.
* Fix formatting in start script of package.json for improved readability
* Enhance capabilities table generation by improving destination name formatting; streamline file change handling in preprocess_docs.py by removing unnecessary print statements.
* update files incrementally only when in watcher mode
* make tuba link generation random per day with a seed
* fix duplicate page at examples error
* remove outdated docs deploy action
* add build docs action for better debuggability
* revert unintentional change to md file
* add info about where capabilities links should go
* refactor: improve documentation link generation for capabilities
* fix: update documentation link for replace strategy and improve link formatting
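The "random per day with a seed" behavior mentioned in the list above can be sketched as follows. This only illustrates the idea of seeding a random generator with the current date so that repeated builds on the same day pick the same links. `pick_links_for_day` and the sample link names are hypothetical and are not taken from `preprocess_tuba.py`.

```python
import random
from datetime import date
from typing import List, Optional, Sequence


def pick_links_for_day(
    links: Sequence[str], count: int, day: Optional[date] = None
) -> List[str]:
    """Pick up to `count` links; the choice is stable within one calendar day."""
    day = day or date.today()
    # A string seed makes the generator deterministic across processes and runs.
    rng = random.Random(day.isoformat())
    return rng.sample(list(links), min(count, len(links)))


sample_links = ["link-a", "link-b", "link-c", "link-d"]  # made-up names
print(pick_links_for_day(sample_links, 2))
```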
---------
Co-authored-by: rudolfix <rudolfix@rudolfix.org>
Co-authored-by: dave <shrps@posteo.net>
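As an illustration of the marker mechanism described in the commits above, the sketch below replaces each `<!--@@@DLT_DESTINATION_CAPABILITIES <destination>-->` tag in a markdown string with a generated table. It is an assumption-laden simplification, not the code in `preprocess_destination_capabilities.py`: the function names, the table layout, and the sample capability values are all made up for the example.

```python
import re
from typing import Dict

# Matches the marker named in the commit messages and captures the destination name.
MARKER_RE = re.compile(r"<!--@@@DLT_DESTINATION_CAPABILITIES\s+(\w+)-->")


def capabilities_table(caps: Dict[str, str]) -> str:
    """Render capabilities as a two-column markdown table (hypothetical layout)."""
    lines = ["| Capability | Value |", "| --- | --- |"]
    lines += [f"| {name} | {value} |" for name, value in caps.items()]
    return "\n".join(lines)


def insert_capabilities(markdown: str, caps_by_destination: Dict[str, Dict[str, str]]) -> str:
    """Replace every marker whose destination is known; leave other markers untouched."""

    def replace(match: "re.Match[str]") -> str:
        caps = caps_by_destination.get(match.group(1))
        return match.group(0) if caps is None else capabilities_table(caps)

    return MARKER_RE.sub(replace, markdown)


page = "## Destination capabilities\n<!--@@@DLT_DESTINATION_CAPABILITIES duckdb-->\n"
# The capability name and value below are placeholders, not real duckdb capabilities.
print(insert_capabilities(page, {"duckdb": {"example_capability": "yes"}}))
```

Leaving unknown markers in place keeps a misspelled destination name visible in the rendered page instead of silently dropping the section.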
* adds selective required context, checks profile support in switch_profile
* creates and tests hub module
* adds plugin version to telemetry
* renames imports in docs
* renames ci workflows
* fixes lint
* tests deploy command on duckdb
* moves cli module to workspace
* moves cli tests to workspace module
* renames fixtures, rewrites fixture to patch run context to _storage
* allows to patch global dir in workspace context
* when finding git repo, does not look up if GIT_CEILING_DIRECTORIES is set
* imports git utils only when need to clone package in dbt runner
* runs workspace tests as part of common
* fixes tests, config tests side effects
* moves dashboards to workspace
* fixes pipeline trace test
* moves dashboard helper tests
* excludes additional secret files and pinned profile from gitignore
* cleans up hatchling files in pyproject
* fixes dashboard running tests in ci
* moves git module to libs
* diff fix
* fixes fixture names
* move duckdb capabilities to utility function
* add basic DuckLake files based on DuckDB / Motherduck
* refactor ducklake config
* wip; ducklake destination
* simplified testing
* ignore ducklake files
* completed default config; TODO fix write
* unicode issues
* commented out patches
* lint
* uses destination_type as final fallback when creating default local file names, allows to copy local file context in WithLocalFiles
* creates connection pool for duckdb
* fixes exception handling in open_connection in sql_client, fixes racing when connections opened in duckdb, improves error handling if commit tx fails
* handles ducklake attach/detach in sql_client
* modifies ducklake configuration to: (1) use sqlite as default catalog (2) point all local files to local_dir (3) allow various urls to configure ducklake name (4) use parquet as default file format
* adjust caps to execute load jobs sequentially for duckdb and sqlite catalogs
* passes ducklake conn to ibis, improves how duckdb conn is passed (via open_connection which provides full context)
* adds configuration and credential tests, smoke tests for supported catalogs
* enables ducklake on ci
* fixes ducklake imports
* fixes how secrets are created from filesystem
* generates remote_url in load job metrics with real url of the ducklake table
* tests for all buckets
* adds ducklake extra
* adds hints for secrets.toml gen
* implements cursor for ducklake with correct df vector size
* forces use of ducklake/duckdb datasets in ibis handover, tests non existing dataset behavior
* removes dashboard e2e from common tests on ci
* docs WIP
* implements field resolution check and recursive copy for base configuration
* copies credentials before using as default when resolving capabilities
* allows recursive resolution traces in config field missing exception
* improves config resolve: collects traces recursive, keeps resolving if embedded config fails, collects resolved keys
* decouples connection string credentials and base duckdb credentials
* improves how duckdb handles exceptions when executing query
* makes catalog name explicit in ducklake credentials, creates default db and storage folder names after it
* supports ducklake partitioning on duckdb 1.4
* supports metadata schema on postgres, adds experimental ducklake catalog support on Motherduck
* fixes union config resolve with single base config in union
* docs WIP
* enabled ducklake remote test
* improves ibis filesystem con handover, enables databricks
* fixes tests
* fixes lancedb default name
* propagates only top level config section, replaces with embedded field name in other cases
* adds tests and examples for programmatic creation of ducklake factory
* adds merge selector in duckdb caps to enable upsert on 1.4
* ducklake code cleanups
* makes sure pipeline is dropped before run_context goes out of scope
* finalizes ducklake docs
* fallback in duckdb merge selector if duckdb not installed
* propagates persist_secret flag in filesystem sql client
* fixes tests and ci
* runs remote ducklake on local postgres catalog for low latency
* uses packaging version, not semver for python packages comparisons
* Update docs/website/docs/dlt-ecosystem/destinations/duckdb.md
* fixes recursive re-raise in sql_client
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
Co-authored-by: Anton Burnashev <anton.burnashev@gmail.com>
* answers defaults in cli if tty disconnected
* adds method to send anon tracker event even if disabled
* fixes types in source/resource build in generator
* adds dlt.hub with transformation decorator
* moves dlt-plus to separate sidebar in docs, renames to dltHub Features, adds EULA
* renamed plus to hub in docs
* fixes docs logos
* removes more dlt+
* renames plus tests
* fixes ci run main
* fixes hub workflows
* run common and dashboard tests also with newest available packages
* fix language in code block
* make basic tests work with updated versions of dependent packages
* disable most tests
* try correct windows command for running marimo e2e tests
* try without timeout
* test only launch marimo
* bump python version
* try install playwright deps
* fix e2e tests for dashboard on windows
* enable e2e tests for dashboard
* test macos 14 for dashboard e2e tests
* add basic tests for ui elements
* improve ui elements tests
* revert changes to main github workflow
* review fixes
---------
Co-authored-by: Your Name <you@example.com>
* adds databricks timestamp NTZ
* improves error messages in pyarrow tuples to arrow
* decreases timestamp precision to 6 for mssql
* adds naive datetime to all data types case, enables fallback when testing destinations not supporting it
* other test fixes
* always stores incremental state last value as present in the data, tests tz-awareness edge cases
* fixes ntz timestamp tests
* fixes sqlalchemy destination to work with mssql
* adds func to current module to get current resource instance
* generates LIMIT clause in sql_database when limit step is present
* adds basic tests for mssql in sql_database
* adds docs on tz-awareness in datetime columns in sql_database
* adds naive and tz-aware datetimes to destination caps, implements for various destinations
* caches dlt type to python type conversion
* normalizes timezone handling in timestamp and time data types, fixes remaining pendulum timezone problems, applies tz/non-tz preserving methods when necessary, improves test coverage
* fixes incremental and lag so they always follow the tz-awareness of the data under cursor column, fixes pendulum tz problems, adds tests
* moves schema inference and data coercion from Schema to item_normalizers, applies timezone normalization to json data, adjusts new columns to destination caps for json data, tests
* casts timezones in arrow table normalizations, datetime and time cases in row tuples to arrow, refactors to get generic method to cast tables to dlt schemas, tests
* tracks resource parent, along pipe parent, fixes resource cloning when adding to source, fixes source and resource iterators, makes sure that list of extracted resources always includes implicit and explicit resources
* updates dbapi sql client for dremio
* adjust column schema inferred from arrow to destination caps in extractor, tests
* moves schema and data setup for all data types tests to common code
* adds option to exclude columns in sql_table, uses LimitItem to generate LIMIT statements, tests incl. proper cursor tests for naive/tz aware incremental cursor columns
* tests sql_database on mssql for all data types and incremental cursor on dates
* improves tests for row tuples to arrow with cast to dlt schema, tests for naive datetimes
* improved test for timestamps and int with precision on duckdb
* disables Python 3.14 tests and dashboard test on mac
* better maybe transaction in job client: takes into account ddl and regular transaction destination caps
* pyodbc py3.13 bump
* move dashboard tests to own workflow
* do not crash dashboard app if credentials not available
* do not sort columns in dataset browser
* try sleep in e2e tests
* disable python 3.14 tests for now
* disable mac e2e tests for dashboard
clean up step conditions
* remove unneeded file
* fix forwarding of pipelines dir to marimo app
* disable state sync and display all schemas and remote state and schemas in pipeline overview
* add support for multiple schemas
* fix e2e tests, further updates pending
* use dropdown instead of multiselect for schema selection
add multi schema pipeline to fixtures
* add last run info in pipeline overview
add buttons to open pipeline folder and local data folder if present
* fix loads browser to select correct schema
* allow to start dashboard for a pipeline that is not there yet and add helpful error message in this case
* nicer last run time formatting
show pipeline error screen also when manually changing the pipeline name in the url
* move buttons to top, add refresh buttons to sections
* use raw query when constructing queries
* lazy load remote state tab
* fix traces and trace typing (mostly)
* add exception traces to ui
* add file watcher
* remove test code
* add source and resource state viewer to data panel
* update existing unit tests
* add unit test for new utils
* make marimo dashboard the default app for pipeline show
* update docs
* update existing e2e tests for new yaml based rendering of state
* move streamlit app down in sidebar
* grammar fixes for dashboard strings
* open duckdb in read-only mode in datapanel in dashboard
* remove old tests
re-enable dashboard main command
* add missing args to dashboard command
* small fixes to e2e tests
* add tests for exceptions
* re-organize e2e tests into individual tests
* add basic schema selection checks
* improve dashboard help and dashboard docs page
* shorten some strings in testing to make selecting predictable
* merge devel
* typo
---------
Co-authored-by: djudjuu <djudju@proton.me>
* remove transformation code and tests that now live in dlt_plus
* move lineage code and tests into dataset folder scope
* start fixing model item format tests
* revert model item format tests back to version before last big change (with some updates)
* disable transformations snippets linting and testing for now
* remove unneeded test
* enable 3.14 with orjson branch
* make example plugin a uv project
* post rebase pyproject update
* fix one dependency
update readme
* update readme about python 3.14
* make docs snippets tests independent of secrets
* move examples tests into own workflow and remove github fork marker
* make custom naming example use secrets file instead of hardcoded secrets
* run full linter step on docs changes
* disable dashboard e2e tests on 3.11
enable dashboard e2e and unit tests on 3.13
* bump marimo min dependency
* Revert "Auxiliary commit to revert individual files from 52165eaeeb543932bc917bb5efc373c02ab2937b"
This reverts commit b7c5baf7c0c51e67ad323cd1b2cb9423f48f4165.
* re-lock changes
* revert incorrect change in secrets toml
* Redact secrets in URL when logging and raising for status
* Add configuration options to show HTTP response body in exceptions and logs
* Exclude markdown files from size checks
* Moved configuration reading from _dlt_raise_for_status to RESTClient
* adds dlt workspace extra, updates exception and github workflows
* renames app from "marimo app" to "pipeline dashboard"
updates --marimo flag to --dashboard
* rename studio folders to dashboard
* removes all other references to studio
* exclude lockfile and markdown files from lfs
* update workspace extra dependency versions
* bump version
* rename flag for executing raw queries to "execute_raw_query"
* return sge queries from the internal _query method which removes a lot of unneeded transpiling
clean up make_transformation function
tests still pending
* adds some tests to readable dataset and a test for column hint merging
* allows any dialect when writing queries and fixes tests
* update docs and set correct quoting to queries in normalization and load stage
* fixes normalizer tests
* fix limit on mssql
normalize aliases in normalization step
* add missing quote to alias
* revert identifier normalization step in normalizer_query and use bigquery compiler for bigquery destinations
* post rebase fix
* smallish pr fixes
* add materializable sqlmodel and handle hints in extractor
* add and test always_materialize setting
* add test for sql transformation type
* convert transformation functions to need yield instead of return
* migrate tests and docs snippets to yield in transformations
* add simple test for materializable model
* use correct compiler for converting ibis into sqlglot for each dialect
fixes on transformation test
* add first simple version of using unbound ibis tables in transformations
* skip ibis test on python 3.9
* fix query building in new relation
* return a "real" relation from a transformation
* add ibis option when getting table from dataset
natively support unbound ibis tables in transformations and when getting relations from dataset
* update model item format tests to use relation
* remove one unneeded test (same thing is already tested in transformations)
* fix wei conversion in lineage
* adds support for adding resource hints to pyarrow items
* switch most read access tests to default dataset
* update datasets and transformations docs pages
* separate ibis and default dbapi datasets and fix typing
* update transformation tests and small typing fixes for updated datasets
* fix default dataset type
* fix wei sqlglot conversion
* add sqlglot dialect type and some cleanup
* fix dataset snippets
* fix sqlglot schema test
* removes ibis relation and dataset
consolidates relation and dataset baseclasses with implementations
updates interfaces/protocols for relation and dataset and makes those the publicly available interface with "Relation" and "Dataset"
remove query method from relation interface
* fix one doc snippet
* rename dataset and relation interfaces
* fix relationship between cursor and relation, remove function wiring hack in favor of explicit forwarding for better typing
* clean up readablerelation (no actual code changes)
* fix str test to assume pretty sql (which it is now)
fix one transformation snippet
* small changes from review comments:
* query method on dataset
* typing update of table method
* rename query method to "to_sql" on relation
* clean up transform function a bit (could maybe be even better)
reject non-sql strings in transformation to not shadow errors
* add support for "non-generator" transformations
* move hints computation into resource class
* smallish PR fixes
* add support for dynamic hints in transformations
-> this allows to have multiple relations with different schemas in the relation, so this is allowed now too
* fixes dynamic table caching
* Enhances ReadableDBAPIRelation: min/max, filter with expression (#2833)
* Min max, filter with expr_or_string
* Fix in min max test
* Overload fix and docs
* Test read interfaces partially uses default relation max
* prevent sqlglot schema from adding default hints info, only allow parametrized types and don't supply hints if none are present in dlt schema
* make multi schema transformations work again
* move model item format tests to transformations folder
* re-order interface tests and fix playground dataset access
* PR review test updated
* update dataset and transformation pages
* update transformations tests to new fruitshop
* Last PR fixes
* update columns_schema property
---------
Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
Co-authored-by: anuunchin <88698977+anuunchin@users.noreply.github.com>
* add simple wasm notebook
* add first version of deployment script
* adds pyodide exec_info helper
* small updates to the example notebook
* add example page with transformations notebook into docs
* fix stupid typing error
* disable threading in dlt if platform without threading detected
* move to playground
* simplify playground notebook
fix typos
add tests for playground notebook
* add missing marimo dependency for tests
* PR reviews plus simple tests
* add playground link to intro page
* adds marimo wasm contributing guide
* one more contributing note
* move notebook deployment to own file with own rules
* add comments to marimo cells
* make dlt app ejectable
* update app file url in makefile and tests
add missing stylesheet to package
* start marimo app in process
* convert caching toggle to button for clearer use
* exclude incomplete columns
* adds a bunch of tests for marimo app utils
* make normalized query output pretty and disable tests on 3.9
* filter out incomplete tables
* update cli strings and small changes to app ejection
* run all common tests with resolution-lowest on sync
* make model item normalizer tests pass, disable on time test for now
* fix duckdb instantiation for old versions
bump pyarrow to have version that supports "append_column" on recordbatch
exclude deltalake tests for too low pyarrow versions
* fixes errors in makefile
bump minimum pytest version to what was in lockfile
* bump pendulum min requirement
* fix common test file
* bump ibis dependency
* go back to old version of pendulum
bump to prerelease