mirror of
https://github.com/bitol-io/open-data-contract-standard.git
synced 2025-05-28 19:34:02 +00:00
218 lines
9.1 KiB
Markdown
218 lines
9.1 KiB
Markdown
---
|
|
title: "Changelog: Open Data Contract Standard (ODCS)"
|
|
description: "Home of Open Data Contract Standard (ODCS) documentation."
|
|
image: "https://raw.githubusercontent.com/bitol-io/artwork/main/horizontal/color/Bitol_Logo_color.svg"
|
|
---
|
|
|
|
This document tracks the history and evolution of the **Open Data Contract Standard**.
|
|
|
|
# v3.0.2 - 2024-03-31 - APPROVED
|
|
|
|
* Added field `physicalName` for the properties in JSON schema.
|
|
* Explicitly specifies `YYYY-MM-DDTHH:mm:ss.SSSZ` for default date format.
|
|
* Added field `name` team members in JSON schema and docs.
|
|
* Added field `description` team members in JSON schema and docs.
|
|
* Fixed Athena Server required property name from `staging_dir` to `stagingDir`
|
|
|
|
# v3.0.1 - 2024-12-22 - APPROVED
|
|
|
|
* Added field `authoritativeDefinitions` into JSON schema
|
|
* Added field `description.customProperties` into JSON schema
|
|
* Added field `description.authoritativeDefinitions` into JSON schema
|
|
* Added field `role.customProperties` into JSON schema
|
|
* Updated `status` field to include examples
|
|
* Updated `authoritativeDefinitions` description to be vendor agnostic
|
|
* Updated `tags` description and included examples
|
|
|
|
# v3.0.0 - 2024-10-21 - APPROVED
|
|
|
|
* **New section**: Support & communication channels.
|
|
* **New section**: Servers.
|
|
* **Changes** to fundamentals :
|
|
* Rename `uuid` to `id`.
|
|
* Add `name`.
|
|
* Rename `quantumName` to `dataProduct` and make it optional.
|
|
* Rename `datasetDomain` to `domain` (we avoid the dataset prefix).
|
|
* Drop `datasetKind` (example: `virtualDataset`, was optional, have not seen any usage).
|
|
* Drop `userConsumptionMode` (examples: `analytical`, was optional, already deprecated in v2.).
|
|
* Drop `sourceSystem` (example: `bigQuery`, information will be encoded in servers).
|
|
* Drop `sourcePlatform` (example: `googleCloudPlatform`, information will be encoded in servers).
|
|
* Drop `productSlackChannel` (will move to support channels).
|
|
* Drop `productFeedbackUrl` (will move to support channels).
|
|
* Drop `productDl` (will move to support channels).
|
|
* Drop `username` (credentials should not be stored in the data contract).
|
|
* Drop `password` (credentials should not be stored in the data contract).
|
|
* Drop `driverVersion` (will move to servers if needed).
|
|
* Drop `driver` (will move to servers if needed).
|
|
* Drop `server` (will move to servers if needed).
|
|
* Drop `project` (BigQuery-specific, will move to servers).
|
|
* Drop `datasetName` (BigQuery-specific, will move to servers).
|
|
* Drop `database` (BigQuery-specific, will move to servers).
|
|
* Drop `schedulerAppName` (not part of the contract).
|
|
* **Changes** to Schema:
|
|
* Major changes, check spec.
|
|
* Adds support for non table formats, hierarchies, and arrays.
|
|
* `name` is a new field
|
|
* `items` is a new field
|
|
* `priorTableName` is not supported anymore, if needed, consider a custom property.
|
|
* `table` is not supported anymore, if needed, consider using `name`.
|
|
* `columns` is now `properties`
|
|
* `dataGranularity` is now `dataGranularityDescription`.
|
|
* `encryptedColumnName`is now `encryptedName`.
|
|
* `partitionStatus` is now `partitioned`.
|
|
* `clusterStatus` is not supported anymore, if needed, consider a custom property.
|
|
* `clusterKeyPosition` is not supported anymore, if needed, consider a custom property.
|
|
* `sampleValues` is now `examples`.
|
|
* `isNullable` is now `required`.
|
|
* `isUnique` is now `unique`.
|
|
* `isPrimaryKey` is now `primaryKey`.
|
|
* `criticalDataElementStatus` is now `criticalDataElement`.
|
|
* `clusterKeyPosition` is not supported anymore, if needed, consider a custom property.
|
|
* `transformSourceTables` is now `transformSourceObjects`
|
|
* Restrict `schema.*.logicalType` to be one of `string`, `date`, `number`, `integer`, `object`, `array`, `boolean`.
|
|
* Add `schema.*.logicalTypeOptions`.
|
|
* **Changes** to Data Quality:
|
|
* Significant changes have been applied to support more tools and use cases. Please review the new section.
|
|
* If needed, `templateName` is a custom property.
|
|
* `toolName` is obsolete, replaced by `type=custom; engine: <engine name>`.
|
|
* `scheduleCronExpression` is replaced by `schedule` and `scheduler`. `scheduleCronExpression: 0 20 * * *` becomes `schedule: 0 20 * * *` and `scheduler: cron`.
|
|
* Pricing:
|
|
* No changes.
|
|
* **Changes** to team (fka stakeholders):
|
|
* Replaces `stakeholders`. Content stays the same.
|
|
* **Changes** to Role:
|
|
* Added `description`
|
|
* Changed `access` is not required anymore
|
|
* Security:
|
|
* No changes.
|
|
* **Changes** to SLA:
|
|
* Starting with v3, the schema is not purely tables and columns, hence minor modifications: columns are now elements.
|
|
* `slaDefaultColumn` is now `slaDefaultElement`.
|
|
* `column` is now `element`.
|
|
* Explicit reference to Data QoS.
|
|
* **Changes** to custom and other properties:
|
|
* `systemInstance` is not supported anymore, if needed, consider a custom property.
|
|
|
|
|
|
# v2.2.2 - 2024-05-23 - APPROVED
|
|
|
|
* In JSON schema validation:
|
|
* Change `dataset.description` data type from `array` to `string`.
|
|
* Change `dataset.column.isPrimaryKey` data type from `string` to `boolean`.
|
|
* Change `price.priceAmount` data type from `string` to `number`.
|
|
* Change `slaProperties.value` data type from `string` to `oneOf[string, number]`.
|
|
* Change `slaProperties.valueExt` data type from `string` to `oneOf[string, number]`.
|
|
* Update [examples](docs/examples/README.md) to adhere to JSON schema.
|
|
* Full example from README directs to [full-example.yaml](docs/examples/all/full-example.odcs.yaml).
|
|
* Add in mkdocs for creating a [documentation website](https://bitol-io.github.io/open-data-contract-standard/). Check [building-doc.md](building-doc.md).
|
|
* Add vendors page [vendors.md](vendors.md). Feel free to add anyone there.
|
|
|
|
|
|
# v2.2.1 - 2023-12-18 - REPLACED BY V2.2.2
|
|
|
|
* Reformat quality examples to be valid YAML.
|
|
* Type of definition for authority have standard values: `businessDefinition`, `transformationImplementation`, `videoTutorial`, `tutorial`, and `implementation`.
|
|
* Add in `isUnique`, `primaryKeyPosition`, `partitionKeyPosition`, and `clusterKeyPosition` to `column` definition.
|
|
* Add [JSON schema](https://github.com/bitol-io/open-data-contract-standard/blob/main/schema/odcs-json-schema.json) to validate YAML files for v2.2.1.
|
|
* Integrated as part of [Bitol](https://lfaidata.foundation/projects/bitol/).
|
|
* Reformat Markdown tables.
|
|
|
|
|
|
# v2.2.0 - 2023-07-27 - REPLACED BY V2.2.1
|
|
|
|
* New name to Open Data Contract Standard.
|
|
* `templateName` is now called `standardVersion`, v2.2.0 parsers should account for this change and support both to avoid a breaking change.
|
|
* Added support for `authoritativeDefinitions` at the table level.
|
|
* Added many examples.
|
|
* Various improvements and typo corrections.
|
|
* Finalization of fork under AIDA User Group.
|
|
|
|
|
|
# v2.1.1 - 2023-04-26 - REPLACED BY V2.2.0
|
|
|
|
* Open source version.
|
|
* Additional value field `valueExt` in SLA.
|
|
|
|
|
|
# v2.1.0 - 2023-03-23 - REPLACED BY V2.1.1
|
|
|
|
## Data Quality
|
|
The data contract adds elements specifically for interfacing with the Data Quality tooling.
|
|
|
|
Additions:
|
|
* quality (table level & column level check):
|
|
* templateName (called standardVersion since v2.2.0)
|
|
* dimension
|
|
* type
|
|
* severity
|
|
* businessImpact
|
|
* scheduleCronExpression
|
|
* customProperties
|
|
* columns
|
|
* isPrimaryKey
|
|
|
|
## Physical names
|
|
The data contract is a logical construct; we add more specific links to the physical world.
|
|
|
|
## Service-level agreement
|
|
The service-level agreements not previously used are more detailed to follow the DP QoS pattern. See SLA.
|
|
|
|
## Other
|
|
Removed the weight for system ratings from the data contract. Their default values remain.
|
|
|
|
# v2.0.0 - REPLACED BY V2.1.0
|
|
|
|
## Guidelines & Evolution
|
|
* [Type case](https://google.github.io/styleguide/jsoncstyleguide.xml?showone=Property_Name_Format#Property_Name_Format)
|
|
* Support for SemVer versioning.
|
|
* Tags can have values.
|
|
|
|
## Additions
|
|
* Version of contract definition: v2.0.0. A breaking change with v1.
|
|
* Description:
|
|
* Purpose (text field).
|
|
* Limitations (text field).
|
|
* Usage (text field).
|
|
* Domain.
|
|
* Dictionary section:
|
|
* Identification of masked column (encryptedColumnName property), example: the email_decrypted column would be masked by email_encrypted.
|
|
* Flag for critical data element.
|
|
* Added keys for transformation data (sources, logic, description).
|
|
* Sample values.
|
|
* Ability to specify links to authoritative sources at the column level (authoritativeDefinitions).
|
|
* Business name.
|
|
* List of stakeholders:
|
|
* Username (user account).
|
|
* Role.
|
|
* Date in.
|
|
* Date out.
|
|
* Replaced by.
|
|
* Service levels: agreements & objective [orginal inspiration](https://medium.com/@jgperrin/meet-cactar-the-mongolian-warlord-of-data-quality-d7bdbd6a5398).
|
|
* Price / cost.
|
|
* Name changes to match PPaaS type case.
|
|
* Product data:
|
|
* productDl.
|
|
* productSlackChannel.
|
|
* productFeedbackUrl.
|
|
* Renamed `tables` key to `dataset`.
|
|
* Removed `owner` key. Owner is now a stakeholder role.
|
|
* Additional quality keys:
|
|
* description.
|
|
* toolName.
|
|
* toolRuleName.
|
|
* Custom Properties.
|
|
* Product dates:
|
|
* generalAvailabilityDate.
|
|
* endOfSupportDate.
|
|
* endOfLifeDate.
|
|
|
|
# v1 - DEPRECATED
|
|
* Description of the data quantum/data artifact.
|
|
* Roles.
|
|
* Schema:
|
|
* Tables, columns.
|
|
* Data quality.
|
|
* System rating weightage.
|
|
* Ratings:
|
|
* System, user, etc.
|