Compare commits

...

36 Commits

Author SHA1 Message Date
Gerda Shank
845b95f3d0 Initial file creation of code documentation READMEs 2022-01-31 22:09:17 -05:00
Nathaniel May
13b18654f0 Guard against unnecessarily calling dump_graph in logging (#4619)
* add lazy type and apply to cache events
2022-01-31 14:14:34 -05:00
Jeremy Cohen
aafa1c7f47 Change InvalidRefInTestNode level to DEBUG (#4647)
* Debug-level test depends on disabled

* Add PR link to Changelog
2022-01-31 18:28:43 +01:00
Jeremy Cohen
638e3ad299 Drop support for Python <3.7.2 (#4643)
* Drop support for 3.7.1 + 3.7.2

* Rm root level setup.py

* Rm 'dbt' pkg from build-dist script

* Fixup changelog
2022-01-31 17:31:20 +01:00
Emily Rockman
d9cfeb1ea3 Retry after failure to download or failure to open files (#4609)
* add retry logic, tests when extracting tarfile fails

* fixed bug with not catching empty responses

* specify compression type

* WIP test

* more testing work

* fixed up unit test

* add changelog

* Add more comments!

* clarify why we do the json() check for None
2022-01-31 10:26:51 -06:00
Chenyu Li
e6786a2bc3 fix comparison for new model/body (#4631)
* fix comparison for new model/body
2022-01-31 10:33:35 -05:00
leahwicz
13571435a3 Initial addition of CODEOWNERS file (#4620)
* Initial addition of CODEOWNERS file

* Proposed sub-team ownership (#4632)

* Updating for the events module to be both language and execution

* Adding more comment details

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
2022-01-27 16:23:55 -05:00
Gerda Shank
efb890db2d [#4504] Use mashumaro for serializing logging events (#4505) 2022-01-27 14:43:26 -05:00
Niall Woodward
f3735187a6 Run check_if_can_write_profile before create_profile_using_project_profile_template [CT-67] [Backport 1.0.latest] (#4447)
* Run check_if_can_write_profile before create_profile_using_project_profile_template

* Changelog

Co-authored-by: Ian Knox <81931810+iknox-fa@users.noreply.github.com>
2022-01-27 11:17:28 -06:00
Gerda Shank
3032594b26 [#4554] Don't require a profile for dbt deps and clean commands (#4610) 2022-01-25 12:26:44 -05:00
Joel Labes
1df7a029b4 Clarify "incompatible package version" error msg (#4587)
* Clarify "incompatible package version" error msg

* Clarify error message when they shouldn't fall fwd
2022-01-24 18:33:45 -05:00
leahwicz
f467fba151 Changing Jira mirroring workflows to point to shared Actions (#4615) 2022-01-24 12:20:12 -05:00
Amir Kadivar
8791313ec5 Validate project names in interactive dbt init (#4536)
* Validate project names in interactive dbt init

- workflow: ask the user to provide a valid project name until they do.
- new integration tests
- supported scenarios:
  - dbt init
  - dbt init -s
  - dbt init [name]
  - dbt init [name] -s

* Update Changelog.md

* Add full URLs to CHANGELOG.md

Co-authored-by: Chenyu Li <chenyulee777@gmail.com>

Co-authored-by: Chenyu Li <chenyulee777@gmail.com>
2022-01-21 18:24:26 -05:00
leahwicz
7798f932a0 Add Backport Action (#4605) 2022-01-21 12:40:55 -05:00
Nathaniel May
a588607ec6 drop support for Python 3.7.0 and 3.7.1 (#4585) 2022-01-19 12:24:37 -05:00
Joel Labes
348764d99d Rename data directory to seeds (#4589)
* Rename data directory to seeds

* Update CHANGELOG.md
2022-01-19 10:04:35 -06:00
Gerda Shank
5aeb088a73 [#3988] Fix test deprecation warnings (#4556) 2022-01-12 17:03:11 -05:00
leahwicz
e943b9fc84 Mirror labels to Jira (#4550)
* Adding Jira label mirroring

* Fixing bad step name
2022-01-05 09:29:52 -05:00
leahwicz
892426eecb Mirroring issues to Jira (#4548)
* Adding issue creation Jira Action

* Adding issue closing Jira Action

* Add labeling logic
2022-01-04 17:00:03 -05:00
Emily Rockman
1d25b2b046 test name standardization (#4509)
* rename tests for standardization

* more renaming

* rename tests to remove duplicate numbers

* removed unused file

* removed unused files in 016

* removed unused files in 017

* fixed schema number mismatch 027

* fixed to be actual directory name 025

* remove unused dir 029

* remove unused files 039

* remove unused files 053

* updated changelog
2022-01-04 11:36:47 -06:00
github-actions[bot]
da70840be8 Bumping version to 1.0.1 (#4543)
* Bumping version to 1.0.1

* Update CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2022-01-03 13:04:50 -05:00
leahwicz
7632782ecd Removing Docker from bumpversion script (#4542) 2022-01-03 12:48:03 -05:00
Nathaniel May
6fae647097 copy over windows compat logic for colored log output (#4474) 2022-01-03 12:37:36 -05:00
leahwicz
fc8b8c11d5 Commenting our Docker portion of Version Bump (#4541) 2022-01-03 12:37:20 -05:00
Topherhindman
26a7922a34 Fix small typo in architecture doc (#4533) 2022-01-03 12:00:04 +01:00
Emily Rockman
c18b4f1f1a removed unused code in unit tests (#4496)
* removed unused code

* add changelog

* moved changelog entry
2021-12-23 08:26:22 -06:00
Nathaniel May
fa31a67499 Add Structured Logging ADR (#4308) 2021-12-22 10:26:14 -05:00
Ian Knox
742cd990ee New Dockerfile (#4487)
New Dockerfile supporting individual db adapters and architectures
2021-12-22 08:29:21 -06:00
Gerda Shank
8463af35c3 [#4523] Fix error with env_var in hook (#4524) 2021-12-20 14:19:05 -05:00
github-actions[bot]
b34a4ab493 Bumping version to 1.0.1rc1 (#4517)
* Bumping version to 1.0.1rc1

* Update CHANGELOG.md

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
Co-authored-by: leahwicz <60146280+leahwicz@users.noreply.github.com>
2021-12-19 15:33:38 -05:00
Jeremy Cohen
417ccdc3b4 Fix bool coercion to 0/1 (#4512)
* Fix bool coercion

* Fix unit test
2021-12-19 10:30:25 -05:00
Emily Rockman
7c46b784ef scrub message of secrets (#4507)
* scrub message of secrets

* update changelog

* use new scrubbing and scrub more places using git

* fixed small miss of string conv and missing raise

* fix bug with cloning error

* resolving message issues

* better, more specific scrubbing
2021-12-17 16:05:57 -06:00
Gerda Shank
067b861d30 Improve checking of schema version for pre-1.0.0 manifests (#4497)
* [#4470] Improve checking of schema version for pre-1.0.0 manifests

* Check exception code instead of message in test
2021-12-16 13:30:52 -05:00
Emily Rockman
9f6ed3cec3 update log message to use adapter name (#4501)
* update log message to use adapter name

* add changelog
2021-12-16 11:46:28 -06:00
Nathaniel May
43edc887f9 Simplify Log Destinations (#4483) 2021-12-16 11:40:05 -05:00
Emily Rockman
6d4c64a436 compile new index file for docs (#4484)
* compile new index file for docs

* Add changelog

* move changelog entries for docs changes
2021-12-16 10:09:02 -06:00
654 changed files with 2095 additions and 1581 deletions


@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.0.0
current_version = 1.0.1
parse = (?P<major>\d+)
\.(?P<minor>\d+)
\.(?P<patch>\d+)
@@ -35,5 +35,3 @@ first_value = 1
[bumpversion:file:plugins/postgres/setup.py]
[bumpversion:file:plugins/postgres/dbt/adapters/postgres/__version__.py]
[bumpversion:file:docker/requirements/requirements.txt]
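The `parse` pattern in the hunk above is a plain named-group regex; a quick standalone sketch of what it extracts (only the three groups visible in the hunk; the real config may define additional prerelease groups, as the `first_value = 1` line suggests):

```python
import re

# One-line equivalent of the named-group pattern spread across
# continuation lines in the setup.cfg hunk above.
VERSION_PATTERN = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)")

def parse_version(version: str) -> dict:
    """Split a version string such as '1.0.1' into its numeric parts."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a parseable version: {version!r}")
    return {name: int(value) for name, value in match.groupdict().items()}
```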

.github/CODEOWNERS

@@ -0,0 +1,43 @@
# This file contains the code owners for the dbt-core repo.
# PRs will be automatically assigned for review to the associated
# team(s) or person(s) that touches any files that are mapped to them.
#
# A statement takes precedence over the statements above it so more general
# assignments are found at the top with specific assignments being lower in
# the ordering (i.e. catch all assignment should be the first item)
#
# Consult GitHub documentation for formatting guidelines:
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#example-of-a-codeowners-file
# As a default for areas with no assignment,
# the core team as a whole will be assigned
* @dbt-labs/core
# Changes to GitHub configurations including Actions
/.github/ @leahwicz
# Language core modules
/core/dbt/config/ @dbt-labs/core-language
/core/dbt/context/ @dbt-labs/core-language
/core/dbt/contracts/ @dbt-labs/core-language
/core/dbt/deps/ @dbt-labs/core-language
/core/dbt/parser/ @dbt-labs/core-language
# Execution core modules
/core/dbt/events/ @dbt-labs/core-execution @dbt-labs/core-language # eventually remove language but they have knowledge here now
/core/dbt/graph/ @dbt-labs/core-execution
/core/dbt/task/ @dbt-labs/core-execution
# Adapter interface, scaffold, Postgres plugin
/core/dbt/adapters @dbt-labs/core-adapters
/core/scripts/create_adapter_plugin.py @dbt-labs/core-adapters
/plugins/ @dbt-labs/core-adapters
# Global project: default macros, including generic tests + materializations
/core/dbt/include/global_project @dbt-labs/core-execution @dbt-labs/core-adapters
# Perf regression testing framework
# This excludes the test project files itself since those aren't specific
# framework changes (excluded by not setting an owner next to it- no owner)
/performance @nathaniel-may
/performance/projects
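The precedence rule described in the header comment (a later, more specific pattern overrides earlier ones) can be modeled as a last-match-wins lookup. This is an illustrative sketch, not GitHub's actual matching implementation, and the rule list is a small subset of the file above:

```python
from fnmatch import fnmatch

# Ordered (pattern, owners) rules mirroring the file above: the *last*
# matching rule wins, which is why the catch-all comes first.
RULES = [
    ("*", ["@dbt-labs/core"]),
    ("/core/dbt/parser/*", ["@dbt-labs/core-language"]),
    ("/core/dbt/task/*", ["@dbt-labs/core-execution"]),
]

def owners_for(path):
    """Return the owners assigned to a path under last-match-wins rules."""
    result = []
    for pattern, owners in RULES:
        if fnmatch(path, pattern):
            result = owners  # later, more specific rules override earlier ones
    return result
```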

.github/workflows/backport.yml

@@ -0,0 +1,34 @@
# **what?**
# When a PR is merged, if it has the backport label, it will create
# a new PR to backport those changes to the given branch. If it can't
# cleanly do a backport, it will comment on the merged PR of the failure.
#
# Label naming convention: "backport <branch name to backport to>"
# Example: backport 1.0.latest
#
# You MUST "Squash and merge" the original PR or this won't work.
# **why?**
# Changes sometimes need to be backported to release branches.
# This automates the backporting process
# **when?**
# Once a PR is "Squash and merge"'d and it has been correctly labeled
# according to the naming convention.
name: Backport
on:
pull_request:
types:
- closed
- labeled
jobs:
backport:
runs-on: ubuntu-18.04
name: Backport
steps:
- name: Backport
uses: tibdex/backport@v1.1.1
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
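The "backport &lt;branch name&gt;" label convention described at the top of the workflow is easy to sketch; this hypothetical helper (not part of the tibdex/backport Action) extracts target branches from a PR's label names:

```python
def backport_targets(labels):
    """Given PR label names, return the branches to backport to,
    following the 'backport <branch name>' convention above."""
    prefix = "backport "
    return [label[len(prefix):] for label in labels if label.startswith(prefix)]
```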

.github/workflows/jira-creation.yml

@@ -0,0 +1,26 @@
# **what?**
# Mirrors issues into Jira. Includes the information: title,
# GitHub Issue ID and URL
# **why?**
# Jira is our tool for tracking and we need to see these issues in there
# **when?**
# On issue creation or when an issue is labeled `Jira`
name: Jira Issue Creation
on:
issues:
types: [opened, labeled]
permissions:
issues: write
jobs:
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-creation.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}

.github/workflows/jira-label.yml

@@ -0,0 +1,27 @@
# **what?**
# Calls mirroring Jira label Action. Includes adding a new label
# to an existing issue or removing a label as well
# **why?**
# Jira is our tool for tracking and we need to see these labels in there
# **when?**
# On labels being added or removed from issues
name: Jira Label Mirroring
on:
issues:
types: [labeled, unlabeled]
permissions:
issues: read
jobs:
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-label.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}

.github/workflows/jira-transition.yml

@@ -0,0 +1,24 @@
# **what?**
# Transition a Jira issue to a new state
# Only supports these GitHub Issue transitions:
# closed, deleted, reopened
# **why?**
# Jira needs to be kept up-to-date
# **when?**
# On issue closing, deletion, reopened
name: Jira Issue Transition
on:
issues:
types: [closed, deleted, reopened]
jobs:
call-label-action:
uses: dbt-labs/jira-actions/.github/workflows/jira-transition.yml@main
secrets:
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}


@@ -66,12 +66,12 @@ jobs:
git push origin bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
git branch --set-upstream-to=origin/bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID bumping-version/${{steps.variables.outputs.VERSION_NUMBER}}_$GITHUB_RUN_ID
- name: Generate Docker requirements
run: |
source env/bin/activate
pip install -r requirements.txt
pip freeze -l > docker/requirements/requirements.txt
git status
# - name: Generate Docker requirements
# run: |
# source env/bin/activate
# pip install -r requirements.txt
# pip freeze -l > docker/requirements/requirements.txt
# git status
- name: Bump version
run: |


@@ -4,16 +4,17 @@ The core function of dbt is SQL compilation and execution. Users create projects
Most of the python code in the repository is within the `core/dbt` directory. Currently the main subdirectories are:
- [`adapters`](core/dbt/adapters): Define base classes for behavior that is likely to differ across databases
- [`clients`](core/dbt/clients): Interface with dependencies (agate, jinja) or across operating systems
- [`config`](core/dbt/config): Reconcile user-supplied configuration from connection profiles, project files, and Jinja macros
- [`context`](core/dbt/context): Build and expose dbt-specific Jinja functionality
- [`contracts`](core/dbt/contracts): Define Python objects (dataclasses) that dbt expects to create and validate
- [`deps`](core/dbt/deps): Package installation and dependency resolution
- [`graph`](core/dbt/graph): Produce a `networkx` DAG of project resources, and select those resources given user-supplied criteria
- [`include`](core/dbt/include): The dbt "global project," which defines default implementations of Jinja2 macros
- [`parser`](core/dbt/parser): Read project files, validate, construct python objects
- [`task`](core/dbt/task): Set forth the actions that dbt can perform when invoked
- [`adapters`](core/dbt/adapters/README.md): Define base classes for behavior that is likely to differ across databases
- [`clients`](core/dbt/clients/README.md): Interface with dependencies (agate, jinja) or across operating systems
- [`config`](core/dbt/config/README.md): Reconcile user-supplied configuration from connection profiles, project files, and Jinja macros
- [`context`](core/dbt/context/README.md): Build and expose dbt-specific Jinja functionality
- [`contracts`](core/dbt/contracts/README.md): Define Python objects (dataclasses) that dbt expects to create and validate
- [`deps`](core/dbt/deps/README.md): Package installation and dependency resolution
- [`events`](core/dbt/events/README.md): Logging events
- [`graph`](core/dbt/graph/README.md): Produce a `networkx` DAG of project resources, and select those resources given user-supplied criteria
- [`include`](core/dbt/include/README.md): The dbt "global project," which defines default implementations of Jinja2 macros
- [`parser`](core/dbt/parser/README.md): Read project files, validate, construct python objects
- [`task`](core/dbt/task/README.md): Set forth the actions that dbt can perform when invoked
### Invoking dbt
@@ -44,4 +45,4 @@ The [`test/`](test/) subdirectory includes unit and integration tests that run a
- [docker](docker/): All dbt versions are published as Docker images on DockerHub. This subfolder contains the `Dockerfile` (constant) and `requirements.txt` (one for each version).
- [etc](etc/): Images for README
- [scripts](scripts/): Helper scripts for testing, releasing, and producing JSON schemas. These are not included in distributions of dbt, not are they rigorously tested—they're just handy tools for the dbt maintainers :)
- [scripts](scripts/): Helper scripts for testing, releasing, and producing JSON schemas. These are not included in distributions of dbt, nor are they rigorously tested—they're just handy tools for the dbt maintainers :)


@@ -1,9 +1,50 @@
## dbt-core 1.0.1 (TBD)
## dbt-core 1.1.0 (TBD)
### Features
- New Dockerfile to support specific db adapters and platforms. See docker/README.md for details ([#4495](https://github.com/dbt-labs/dbt-core/issues/4495), [#4487](https://github.com/dbt-labs/dbt-core/pull/4487))
### Fixes
- User wasn't asked for permission to overwrite a profile entry when running init inside an existing project ([#4375](https://github.com/dbt-labs/dbt-core/issues/4375), [#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
- Add project name validation to `dbt init` ([#4490](https://github.com/dbt-labs/dbt-core/issues/4490),[#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
### Under the hood
- Testing cleanup ([#4496](https://github.com/dbt-labs/dbt-core/pull/4496), [#4509](https://github.com/dbt-labs/dbt-core/pull/4509))
- Clean up test deprecation warnings ([#3988](https://github.com/dbt-labs/dbt-core/issues/3988), [#4556](https://github.com/dbt-labs/dbt-core/pull/4556))
- Use mashumaro for serialization in event logging ([#4504](https://github.com/dbt-labs/dbt-core/issues/4504), [#4505](https://github.com/dbt-labs/dbt-core/pull/4505))
- Drop support for Python 3.7.0 + 3.7.1 ([#4584](https://github.com/dbt-labs/dbt-core/issues/4584), [#4585](https://github.com/dbt-labs/dbt-core/pull/4585), [#4643](https://github.com/dbt-labs/dbt-core/pull/4643))
Contributors:
- [@NiallRees](https://github.com/NiallRees) ([#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
## dbt-core 1.0.2 (TBD)
### Fixes
- Projects created using `dbt init` now have the correct `seeds` directory created (instead of `data`) ([#4588](https://github.com/dbt-labs/dbt-core/issues/4588), [#4589](https://github.com/dbt-labs/dbt-core/pull/4589))
- Don't require a profile for dbt deps and clean commands ([#4554](https://github.com/dbt-labs/dbt-core/issues/4554), [#4610](https://github.com/dbt-labs/dbt-core/pull/4610))
- Selecting `modified.body` works correctly when a new model is added ([#4570](https://github.com/dbt-labs/dbt-core/issues/4570), [#4631](https://github.com/dbt-labs/dbt-core/pull/4631))
- Fix bug in retry logic for bad response from hub and when there is a bad git tarball download. ([#4577](https://github.com/dbt-labs/dbt-core/issues/4577), [#4579](https://github.com/dbt-labs/dbt-core/issues/4579), [#4609](https://github.com/dbt-labs/dbt-core/pull/4609))
- Restore previous log level (DEBUG) when a test depends on a disabled resource. Still WARN if the resource is missing ([#4594](https://github.com/dbt-labs/dbt-core/issues/4594), [#4647](https://github.com/dbt-labs/dbt-core/pull/4647))
## dbt-core 1.0.1 (January 03, 2022)
* [@amirkdv](https://github.com/amirkdv) ([#4536](https://github.com/dbt-labs/dbt-core/pull/4536))
## dbt-core 1.0.1rc1 (December 20, 2021)
### Fixes
- Fix wrong url in the dbt docs overview homepage ([#4442](https://github.com/dbt-labs/dbt-core/pull/4442))
- Fix redefined status param of SQLQueryStatus to typecheck the string which passes on `._message` value of `AdapterResponse` or the `str` value sent by adapter plugin. ([#4463](https://github.com/dbt-labs/dbt-core/pull/4463#issuecomment-990174166))
- Fix `DepsStartPackageInstall` event to use package name instead of version number. ([#4482](https://github.com/dbt-labs/dbt-core/pull/4482))
- Reimplement log message to use adapter name instead of the object method. ([#4501](https://github.com/dbt-labs/dbt-core/pull/4501))
- Issue better error message for incompatible schemas ([#4470](https://github.com/dbt-labs/dbt-core/issues/4470), [#4497](https://github.com/dbt-labs/dbt-core/pull/4497))
- Remove secrets from error related to packages. ([#4507](https://github.com/dbt-labs/dbt-core/pull/4507))
- Prevent coercion of boolean values (`True`, `False`) to numeric values (`0`, `1`) in query results ([#4511](https://github.com/dbt-labs/dbt-core/issues/4511), [#4512](https://github.com/dbt-labs/dbt-core/pull/4512))
- Fix error with an env_var in a project hook ([#4523](https://github.com/dbt-labs/dbt-core/issues/4523), [#4524](https://github.com/dbt-labs/dbt-core/pull/4524))
- Add additional Windows compat logic for colored log output. ([#4443](https://github.com/dbt-labs/dbt-core/issues/4443))
### Docs
- Fix missing data on exposures in docs ([#4467](https://github.com/dbt-labs/dbt-core/issues/4467))
Contributors:
- [remoyson](https://github.com/remoyson) ([#4442](https://github.com/dbt-labs/dbt-core/pull/4442))
@@ -210,7 +251,7 @@ Contributors:
- [@laxjesse](https://github.com/laxjesse) ([#4019](https://github.com/dbt-labs/dbt-core/pull/4019))
- [@gitznik](https://github.com/Gitznik) ([#4124](https://github.com/dbt-labs/dbt-core/pull/4124))
## dbt 0.21.1 (Release TBD)
## dbt 0.21.1 (November 29, 2021)
### Fixes
- Add `get_where_subquery` to test macro namespace, fixing custom generic tests that rely on introspecting the `model` arg at parse time ([#4195](https://github.com/dbt-labs/dbt/issues/4195), [#4197](https://github.com/dbt-labs/dbt/pull/4197))
@@ -354,7 +395,7 @@ Contributors:
- [@jmriego](https://github.com/jmriego) ([#3526](https://github.com/dbt-labs/dbt-core/pull/3526))
- [@danielefrigo](https://github.com/danielefrigo) ([#3547](https://github.com/dbt-labs/dbt-core/pull/3547))
## dbt 0.20.2 (Release TBD)
## dbt 0.20.2 (September 07, 2021)
### Under the hood


@@ -1,3 +1,8 @@
##
# This dockerfile is used for local development and adapter testing only.
# See `/docker` for a generic and production-ready docker file
##
FROM ubuntu:20.04
ENV DEBIAN_FRONTEND noninteractive

core/dbt/README.md

@@ -0,0 +1,52 @@
# core/dbt directory README
## The following are individual files in this directory.
### deprecations.py
### flags.py
### main.py
### tracking.py
### version.py
### lib.py
### node_types.py
### helper_types.py
### links.py
### semver.py
### ui.py
### compilation.py
### dataclass_schema.py
### exceptions.py
### hooks.py
### logger.py
### profiler.py
### utils.py
## The subdirectories will be documented in a README in the subdirectory
* config
* include
* adapters
* context
* deps
* graph
* task
* clients
* events


@@ -0,0 +1 @@
# Adapters README


@@ -21,6 +21,7 @@ from dbt.events.types import (
UpdateReference
)
from dbt.utils import lowercase
from dbt.helper_types import Lazy
def dot_separated(key: _ReferenceKey) -> str:
@@ -291,11 +292,12 @@ class RelationsCache:
:raises InternalError: If either entry does not exist.
"""
ref_key = _make_key(referenced)
dep_key = _make_key(dependent)
if (ref_key.database, ref_key.schema) not in self:
# if we have not cached the referenced schema at all, we must be
# referring to a table outside our control. There's no need to make
# a link - we will never drop the referenced relation during a run.
fire_event(UncachedRelation(dep_key=dependent, ref_key=ref_key))
fire_event(UncachedRelation(dep_key=dep_key, ref_key=ref_key))
return
if ref_key not in self.relations:
# Insert a dummy "external" relation.
@@ -303,8 +305,6 @@ class RelationsCache:
type=referenced.External
)
self.add(referenced)
dep_key = _make_key(dependent)
if dep_key not in self.relations:
# Insert a dummy "external" relation.
dependent = dependent.replace(
@@ -323,11 +323,11 @@ class RelationsCache:
"""
cached = _CachedRelation(relation)
fire_event(AddRelation(relation=_make_key(cached)))
fire_event(DumpBeforeAddGraph(dump=self.dump_graph()))
fire_event(DumpBeforeAddGraph(dump=Lazy.defer(lambda: self.dump_graph())))
with self.lock:
self._setdefault(cached)
fire_event(DumpAfterAddGraph(dump=self.dump_graph()))
fire_event(DumpAfterAddGraph(dump=Lazy.defer(lambda: self.dump_graph())))
def _remove_refs(self, keys):
"""Removes all references to all entries in keys. This does not
@@ -342,17 +342,17 @@ class RelationsCache:
for cached in self.relations.values():
cached.release_references(keys)
def _drop_cascade_relation(self, dropped):
def _drop_cascade_relation(self, dropped_key):
"""Drop the given relation and cascade it appropriately to all
dependent relations.
:param _CachedRelation dropped: An existing _CachedRelation to drop.
"""
if dropped not in self.relations:
fire_event(DropMissingRelation(relation=dropped))
if dropped_key not in self.relations:
fire_event(DropMissingRelation(relation=dropped_key))
return
consequences = self.relations[dropped].collect_consequences()
fire_event(DropCascade(dropped=dropped, consequences=consequences))
consequences = self.relations[dropped_key].collect_consequences()
fire_event(DropCascade(dropped=dropped_key, consequences=consequences))
self._remove_refs(consequences)
def drop(self, relation):
@@ -366,10 +366,10 @@ class RelationsCache:
:param str schema: The schema of the relation to drop.
:param str identifier: The identifier of the relation to drop.
"""
dropped = _make_key(relation)
fire_event(DropRelation(dropped=dropped))
dropped_key = _make_key(relation)
fire_event(DropRelation(dropped=dropped_key))
with self.lock:
self._drop_cascade_relation(dropped)
self._drop_cascade_relation(dropped_key)
def _rename_relation(self, old_key, new_relation):
"""Rename a relation named old_key to new_key, updating references.
@@ -441,7 +441,7 @@ class RelationsCache:
new_key = _make_key(new)
fire_event(RenameSchema(old_key=old_key, new_key=new_key))
fire_event(DumpBeforeRenameSchema(dump=self.dump_graph()))
fire_event(DumpBeforeRenameSchema(dump=Lazy.defer(lambda: self.dump_graph())))
with self.lock:
if self._check_rename_constraints(old_key, new_key):
@@ -449,7 +449,7 @@ class RelationsCache:
else:
self._setdefault(_CachedRelation(new))
fire_event(DumpAfterRenameSchema(dump=self.dump_graph()))
fire_event(DumpAfterRenameSchema(dump=Lazy.defer(lambda: self.dump_graph())))
def get_relations(
self, database: Optional[str], schema: Optional[str]
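The `Lazy.defer(lambda: self.dump_graph())` wrappers in this hunk exist so the expensive `dump_graph()` call only runs if the event is actually emitted (the point of #4619). A minimal sketch of such a helper, assuming roughly the shape of `dbt.helper_types.Lazy`:

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")

@dataclass
class Lazy(Generic[T]):
    """Defers a computation until .force() is called, so costly values
    (like a full relation-cache dump) are skipped when the event is
    filtered out before serialization."""
    _fn: Callable[[], T]

    @classmethod
    def defer(cls, fn: Callable[[], T]) -> "Lazy[T]":
        return cls(fn)

    def force(self) -> T:
        return self._fn()
```

A debug-level event can then hold `Lazy.defer(lambda: cache.dump_graph())` and call `.force()` only on its serialization path.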


@@ -123,7 +123,7 @@ class SQLAdapter(BaseAdapter):
ColTypeChange(
orig_type=target_column.data_type,
new_type=new_type,
table=current,
table=_make_key(current),
)
)


@@ -0,0 +1 @@
# Clients README


@@ -13,6 +13,18 @@ from dbt.exceptions import RuntimeException
BOM = BOM_UTF8.decode('utf-8') # '\ufeff'
class Number(agate.data_types.Number):
# undo the change in https://github.com/wireservice/agate/pull/733
# i.e. do not cast True and False to numeric 1 and 0
def cast(self, d):
if type(d) == bool:
raise agate.exceptions.CastError(
'Do not cast True to 1 or False to 0.'
)
else:
return super().cast(d)
class ISODateTime(agate.data_types.DateTime):
def cast(self, d):
# this is agate.data_types.DateTime.cast with the "clever" bits removed
@@ -41,7 +53,7 @@ def build_type_tester(
) -> agate.TypeTester:
types = [
agate.data_types.Number(null_values=('null', '')),
Number(null_values=('null', '')),
agate.data_types.Date(null_values=('null', ''),
date_format='%Y-%m-%d'),
agate.data_types.DateTime(null_values=('null', ''),


@@ -8,7 +8,10 @@ from dbt.events.types import (
GitProgressUpdatingExistingDependency, GitProgressPullingNewDependency,
GitNothingToDo, GitProgressUpdatedCheckoutRange, GitProgressCheckedOutAt
)
import dbt.exceptions
from dbt.exceptions import (
CommandResultError, RuntimeException, bad_package_spec, raise_git_cloning_error,
raise_git_cloning_problem
)
from packaging import version
@@ -22,9 +25,9 @@ def _raise_git_cloning_error(repo, revision, error):
if 'usage: git' in stderr:
stderr = stderr.split('\nusage: git')[0]
if re.match("fatal: destination path '(.+)' already exists", stderr):
raise error
raise_git_cloning_error(error)
dbt.exceptions.bad_package_spec(repo, revision, stderr)
bad_package_spec(repo, revision, stderr)
def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirectory=None):
@@ -53,7 +56,7 @@ def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirec
clone_cmd.append(dirname)
try:
result = run_cmd(cwd, clone_cmd, env={'LC_ALL': 'C'})
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
_raise_git_cloning_error(repo, revision, exc)
if subdirectory:
@@ -61,7 +64,7 @@ def clone(repo, cwd, dirname=None, remove_git_dir=False, revision=None, subdirec
clone_cmd_subdir = ['git', 'sparse-checkout', 'set', subdirectory]
try:
run_cmd(cwd_subdir, clone_cmd_subdir)
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
_raise_git_cloning_error(repo, revision, exc)
if remove_git_dir:
@@ -105,9 +108,9 @@ def checkout(cwd, repo, revision=None):
revision = 'HEAD'
try:
return _checkout(cwd, repo, revision)
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
stderr = exc.stderr.decode('utf-8').strip()
dbt.exceptions.bad_package_spec(repo, revision, stderr)
bad_package_spec(repo, revision, stderr)
def get_current_sha(cwd):
@@ -131,14 +134,11 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
remove_git_dir=remove_git_dir,
subdirectory=subdirectory,
)
except dbt.exceptions.CommandResultError as exc:
except CommandResultError as exc:
err = exc.stderr.decode('utf-8')
exists = re.match("fatal: destination path '(.+)' already exists", err)
if not exists:
print(
'\nSomething went wrong while cloning {}'.format(repo) +
'\nCheck the debug logs for more information')
raise
raise_git_cloning_problem(repo)
directory = None
start_sha = None
@@ -148,7 +148,7 @@ def clone_and_checkout(repo, cwd, dirname=None, remove_git_dir=False,
else:
matches = re.match("Cloning into '(.+)'", err.decode('utf-8'))
if matches is None:
raise dbt.exceptions.RuntimeException(
raise RuntimeException(
f'Error cloning {repo} - never saw "Cloning into ..." from git'
)
directory = matches.group(1)
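Besides swapping module-qualified `dbt.exceptions.X` references for direct imports, this hunk routes clone failures through `raise_git_cloning_problem` instead of a bare `print` and re-raise. The error path reduces to roughly the following (exception classes simplified for illustration):

```python
import re

class CommandResultError(Exception):
    """Stand-in for dbt's subprocess failure, carrying git's stderr."""
    def __init__(self, stderr: bytes):
        self.stderr = stderr

class RuntimeException(Exception):
    pass

def raise_git_cloning_problem(repo):
    raise RuntimeException(
        f"Something went wrong while cloning {repo}. "
        "Check the debug logs for more information."
    )

def handle_clone_failure(repo, exc):
    """Mirror of clone_and_checkout's error path: an 'already exists'
    failure is recoverable; anything else becomes a clear user error."""
    err = exc.stderr.decode("utf-8")
    if re.match(r"fatal: destination path '(.+)' already exists", err):
        return "already-cloned"
    raise_git_cloning_problem(repo)
```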


@@ -33,7 +33,12 @@ def _get(path, registry_base_url=None):
resp = requests.get(url, timeout=30)
fire_event(RegistryProgressGETResponse(url=url, resp_code=resp.status_code))
resp.raise_for_status()
if resp is None:
# It is unexpected for the content of the response to be None so if it is, raising this error
# will cause this function to retry (if called within _get_with_retries) and hopefully get
# a response. This seems to happen when there's an issue with the Hub.
# See https://github.com/dbt-labs/dbt-core/issues/4577
if resp.json() is None:
raise requests.exceptions.ContentDecodingError(
'Request error: The response is None', response=resp
)
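Raising when `resp.json()` is `None` turns a bad Hub payload into a retriable failure. A sketch of the retry loop this relies on (the real helper is `_get_with_retries`; the attempt count and lack of backoff here are assumptions):

```python
def get_with_retries(fetch, attempts=5):
    """Call fetch() until it returns a non-None payload, re-raising
    the last error once the attempts are exhausted."""
    last_exc = None
    for _ in range(attempts):
        try:
            payload = fetch()
            if payload is None:
                # Mirrors the hunk above: a None body is treated as an
                # error so the caller retries instead of crashing later.
                raise ValueError("Request error: The response is None")
            return payload
        except Exception as exc:
            last_exc = exc
    raise last_exc
```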


@@ -485,7 +485,7 @@ def untar_package(
) -> None:
tar_path = convert_path(tar_path)
tar_dir_name = None
with tarfile.open(tar_path, 'r') as tarball:
with tarfile.open(tar_path, 'r:gz') as tarball:
tarball.extractall(dest_dir)
tar_dir_name = os.path.commonprefix(tarball.getnames())
if rename_to:
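Opening with mode `'r:gz'` rather than `'r'` makes the expected compression explicit: a payload that is not actually gzip-compressed (for example, a saved error page from a failed download) now raises `tarfile.ReadError` immediately. A small sketch:

```python
import io
import tarfile

def open_gzipped_tarball(fileobj):
    """Open a tarball, insisting it is gzip-compressed (mode 'r:gz');
    a non-gzip payload raises tarfile.ReadError instead of half-working."""
    return tarfile.open(fileobj=fileobj, mode="r:gz")

def make_tarball_bytes(name: str, data: bytes) -> io.BytesIO:
    """Illustrative helper: build an in-memory .tar.gz with one member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tarball:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tarball.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```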


@@ -3,6 +3,7 @@ from collections import defaultdict
from typing import List, Dict, Any, Tuple, cast, Optional
import networkx as nx # type: ignore
import pickle
import sqlparse
from dbt import flags
@@ -162,7 +163,8 @@ class Linker:
for node_id in self.graph:
data = manifest.expect(node_id).to_dict(omit_none=True)
out_graph.add_node(node_id, **data)
nx.write_gpickle(out_graph, outfile)
with open(outfile, 'wb') as outfh:
pickle.dump(out_graph, outfh, protocol=pickle.HIGHEST_PROTOCOL)
class Compiler:
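`nx.write_gpickle` was replaced with the standard library's `pickle` module (newer `networkx` releases dropped the gpickle helpers). The same round-trip works for any picklable graph object; sketched here with a plain dict standing in for the `networkx` graph:

```python
import pickle

def write_graph(graph, outfile):
    """Persist the graph with plain pickle, as the hunk above does."""
    with open(outfile, "wb") as outfh:
        pickle.dump(graph, outfh, protocol=pickle.HIGHEST_PROTOCOL)

def read_graph(path):
    """Load a graph previously written by write_graph."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```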


@@ -0,0 +1 @@
# Config README


@@ -45,7 +45,7 @@ INVALID_VERSION_ERROR = """\
This version of dbt is not supported with the '{package}' package.
Installed version of dbt: {installed}
Required version of dbt for '{package}': {version_spec}
Check the requirements for the '{package}' package, or run dbt again with \
Check for a different version of the '{package}' package, or run dbt again with \
--no-version-check
"""
@@ -54,7 +54,7 @@ IMPOSSIBLE_VERSION_ERROR = """\
The package version requirement can never be satisfied for the '{package}
package.
Required versions of dbt for '{package}': {version_spec}
Check the requirements for the '{package}' package, or run dbt again with \
Check for a different version of the '{package}' package, or run dbt again with \
--no-version-check
"""


@@ -1,7 +1,7 @@
import itertools
import os
from copy import deepcopy
from dataclasses import dataclass, fields
from dataclasses import dataclass
from pathlib import Path
from typing import (
Dict, Any, Optional, Mapping, Iterator, Iterable, Tuple, List, MutableSet,
@@ -13,20 +13,17 @@ from .project import Project
from .renderer import DbtProjectYamlRenderer, ProfileRenderer
from .utils import parse_cli_vars
from dbt import flags
from dbt import tracking
from dbt.adapters.factory import get_relation_class_by_name, get_include_paths
from dbt.helper_types import FQNPath, PathSet
from dbt.config.profile import read_user_config
from dbt.contracts.connection import AdapterRequiredConfig, Credentials
from dbt.contracts.graph.manifest import ManifestMetadata
from dbt.contracts.relation import ComponentName
from dbt.events.types import ProfileLoadError, ProfileNotFound
from dbt.events.functions import fire_event
from dbt.ui import warning_tag
from dbt.contracts.project import Configuration, UserConfig
from dbt.exceptions import (
RuntimeException,
DbtProfileError,
DbtProjectError,
validator_error_message,
warn_or_error,
@@ -191,6 +188,7 @@ class RuntimeConfig(Project, Profile, AdapterRequiredConfig):
profile_renderer: ProfileRenderer,
profile_name: Optional[str],
) -> Profile:
return Profile.render_from_args(
args, profile_renderer, profile_name
)
@@ -412,21 +410,12 @@ class UnsetCredentials(Credentials):
return ()
class UnsetConfig(UserConfig):
def __getattribute__(self, name):
if name in {f.name for f in fields(UserConfig)}:
raise AttributeError(
f"'UnsetConfig' object has no attribute {name}"
)
def __post_serialize__(self, dct):
return {}
# This is used by UnsetProfileConfig, for commands which do
# not require a profile, i.e. dbt deps and clean
class UnsetProfile(Profile):
def __init__(self):
self.credentials = UnsetCredentials()
self.user_config = UnsetConfig()
self.user_config = UserConfig() # This will be read in _get_rendered_profile
self.profile_name = ''
self.target_name = ''
self.threads = -1
@@ -443,6 +432,8 @@ class UnsetProfile(Profile):
return Profile.__getattribute__(self, name)
# This class is used by the dbt deps and clean commands, because they don't
# require a functioning profile.
@dataclass
class UnsetProfileConfig(RuntimeConfig):
"""This class acts a lot _like_ a RuntimeConfig, except if your profile is
@@ -525,7 +516,7 @@ class UnsetProfileConfig(RuntimeConfig):
profile_env_vars=profile.profile_env_vars,
profile_name='',
target_name='',
user_config=UnsetConfig(),
user_config=UserConfig(),
threads=getattr(args, 'threads', 1),
credentials=UnsetCredentials(),
args=args,
@@ -540,21 +531,12 @@ class UnsetProfileConfig(RuntimeConfig):
profile_renderer: ProfileRenderer,
profile_name: Optional[str],
) -> Profile:
try:
profile = Profile.render_from_args(
args, profile_renderer, profile_name
)
except (DbtProjectError, DbtProfileError) as exc:
selected_profile_name = Profile.pick_profile_name(
args_profile_name=getattr(args, 'profile', None),
project_profile_name=profile_name
)
fire_event(ProfileLoadError(exc=exc))
fire_event(ProfileNotFound(profile_name=selected_profile_name))
# return the poisoned form
profile = UnsetProfile()
# disable anonymous usage statistics
tracking.disable_tracking()
profile = UnsetProfile()
# The profile (for warehouse connection) is not needed, but we want
# to get the UserConfig, which is also in profiles.yml
user_config = read_user_config(flags.PROFILES_DIR)
profile.user_config = user_config
return profile
@classmethod
@@ -569,9 +551,6 @@ class UnsetProfileConfig(RuntimeConfig):
:raises ValidationException: If the cli variables are invalid.
"""
project, profile = cls.collect_parts(args)
if not isinstance(profile, UnsetProfile):
# if it's a real profile, return a real config
cls = RuntimeConfig
return cls.from_parts(
project=project,

View File

@@ -0,0 +1 @@
# Contexts and Jinja rendering

View File

@@ -488,9 +488,9 @@ class BaseContext(metaclass=ContextMeta):
{% endmacro %}"
"""
if info:
fire_event(MacroEventInfo(msg))
fire_event(MacroEventInfo(msg=msg))
else:
fire_event(MacroEventDebug(msg))
fire_event(MacroEventDebug(msg=msg))
return ''
@contextproperty

View File

@@ -1186,10 +1186,12 @@ class ProviderContext(ManifestContext):
# If this is compiling, do not save because it's irrelevant to parsing.
if self.model and not hasattr(self.model, 'compiled'):
self.manifest.env_vars[var] = return_value
source_file = self.manifest.files[self.model.file_id]
# Schema files should never get here
if source_file.parse_file_type != 'schema':
source_file.env_vars.append(var)
# hooks come from dbt_project.yml which doesn't have a real file_id
if self.model.file_id in self.manifest.files:
source_file = self.manifest.files[self.model.file_id]
# Schema files should never get here
if source_file.parse_file_type != 'schema':
source_file.env_vars.append(var)
return return_value
else:
msg = f"Env var required but not provided: '{var}'"

View File

@@ -0,0 +1 @@
# Contracts README

View File

@@ -178,7 +178,26 @@ class ParsedNodeMandatory(
@dataclass
class ParsedNodeDefaults(ParsedNodeMandatory):
class NodeInfoMixin():
_event_status: Dict[str, Any] = field(default_factory=dict)
@property
def node_info(self):
node_info = {
"node_path": getattr(self, 'path', None),
"node_name": getattr(self, 'name', None),
"unique_id": getattr(self, 'unique_id', None),
"resource_type": str(getattr(self, 'resource_type', '')),
"materialized": self.config.get('materialized'),
"node_status": str(self._event_status.get('node_status')),
"node_started_at": self._event_status.get("started_at"),
"node_finished_at": self._event_status.get("finished_at")
}
return node_info
@dataclass
class ParsedNodeDefaults(NodeInfoMixin, ParsedNodeMandatory):
tags: List[str] = field(default_factory=list)
refs: List[List[str]] = field(default_factory=list)
sources: List[List[str]] = field(default_factory=list)
@@ -194,7 +213,6 @@ class ParsedNodeDefaults(ParsedNodeMandatory):
unrendered_config: Dict[str, Any] = field(default_factory=dict)
created_at: float = field(default_factory=lambda: time.time())
config_call_dict: Dict[str, Any] = field(default_factory=dict)
_event_status: Dict[str, Any] = field(default_factory=dict)
def write_node(self, target_path: str, subdirectory: str, payload: str):
if (os.path.basename(self.path) ==
@@ -610,12 +628,11 @@ class UnpatchedSourceDefinition(UnparsedBaseNode, HasUniqueID, HasFqn):
@dataclass
class ParsedSourceDefinition(
class ParsedSourceMandatory(
UnparsedBaseNode,
HasUniqueID,
HasRelationMetadata,
HasFqn,
):
name: str
source_name: str
@@ -623,6 +640,13 @@ class ParsedSourceDefinition(
loader: str
identifier: str
resource_type: NodeType = field(metadata={'restrict': [NodeType.Source]})
@dataclass
class ParsedSourceDefinition(
NodeInfoMixin,
ParsedSourceMandatory
):
quoting: Quoting = field(default_factory=Quoting)
loaded_at_field: Optional[str] = None
freshness: Optional[FreshnessThreshold] = None
@@ -637,7 +661,6 @@ class ParsedSourceDefinition(
unrendered_config: Dict[str, Any] = field(default_factory=dict)
relation_name: Optional[str] = None
created_at: float = field(default_factory=lambda: time.time())
_event_status: Dict[str, Any] = field(default_factory=dict)
def __post_serialize__(self, dct):
if '_event_status' in dct:

View File

@@ -18,6 +18,18 @@ DEFAULT_SEND_ANONYMOUS_USAGE_STATS = True
class Name(ValidatedStringMixin):
ValidationRegex = r'^[^\d\W]\w*$'
@classmethod
def is_valid(cls, value: Any) -> bool:
if not isinstance(value, str):
return False
try:
cls.validate(value)
except ValidationError:
return False
return True
register_pattern(Name, r'^[^\d\W]\w*$')
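The `is_valid` classmethod added above guards against non-string input before running the regex. A sketch of the same check as a plain function (names are illustrative):

```python
import re

# Same pattern used by Name: the first char must be a word character that is
# not a digit (i.e. a letter or underscore), followed by word characters.
NAME_RE = re.compile(r'^[^\d\W]\w*$')

def is_valid_name(value) -> bool:
    # Mirrors Name.is_valid: reject non-strings up front, then let the
    # regex decide.
    if not isinstance(value, str):
        return False
    return NAME_RE.match(value) is not None

print(is_valid_name("my_project"))   # True
print(is_valid_name("1st_project"))  # False: leading digit
print(is_valid_name(None))           # False: not a string
```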

View File

@@ -14,7 +14,8 @@ class PreviousState:
manifest_path = self.path / 'manifest.json'
if manifest_path.exists() and manifest_path.is_file():
try:
self.manifest = WritableManifest.read(str(manifest_path))
# we want to bail with an error if schema versions don't match
self.manifest = WritableManifest.read_and_check_versions(str(manifest_path))
except IncompatibleSchemaException as exc:
exc.add_filename(str(manifest_path))
raise
@@ -22,7 +23,8 @@ class PreviousState:
results_path = self.path / 'run_results.json'
if results_path.exists() and results_path.is_file():
try:
self.results = RunResultsArtifact.read(str(results_path))
# we want to bail with an error if schema versions don't match
self.results = RunResultsArtifact.read_and_check_versions(str(results_path))
except IncompatibleSchemaException as exc:
exc.add_filename(str(results_path))
raise

View File

@@ -9,6 +9,7 @@ from dbt.clients.system import write_json, read_json
from dbt.exceptions import (
InternalException,
RuntimeException,
IncompatibleSchemaException
)
from dbt.version import __version__
from dbt.events.functions import get_invocation_id
@@ -158,6 +159,8 @@ def get_metadata_env() -> Dict[str, str]:
}
# This is used in the ManifestMetadata, RunResultsMetadata, RunOperationResultMetadata,
# FreshnessMetadata, and CatalogMetadata classes
@dataclasses.dataclass
class BaseArtifactMetadata(dbtClassMixin):
dbt_schema_version: str
@@ -177,6 +180,17 @@ class BaseArtifactMetadata(dbtClassMixin):
return dct
# This is used as a class decorator to set the schema_version in the
# 'dbt_schema_version' class attribute. (It's copied into the metadata objects.)
# Name attributes of SchemaVersion in classes with the 'schema_version' decorator:
# manifest
# run-results
# run-operation-result
# sources
# catalog
# remote-compile-result
# remote-execution-result
# remote-run-result
def schema_version(name: str, version: int):
def inner(cls: Type[VersionedSchema]):
cls.dbt_schema_version = SchemaVersion(
@@ -187,6 +201,7 @@ def schema_version(name: str, version: int):
return inner
# This is used in the ArtifactMixin and RemoteResult classes
@dataclasses.dataclass
class VersionedSchema(dbtClassMixin):
dbt_schema_version: ClassVar[SchemaVersion]
@@ -198,6 +213,30 @@ class VersionedSchema(dbtClassMixin):
result['$id'] = str(cls.dbt_schema_version)
return result
@classmethod
def read_and_check_versions(cls, path: str):
try:
data = read_json(path)
except (EnvironmentError, ValueError) as exc:
raise RuntimeException(
f'Could not read {cls.__name__} at "{path}" as JSON: {exc}'
) from exc
# Check metadata version. There is a class variable 'dbt_schema_version', but
# it doesn't show up in serialized artifacts; there the version exists only in
# the 'metadata' dictionary.
if hasattr(cls, 'dbt_schema_version'):
if 'metadata' in data and 'dbt_schema_version' in data['metadata']:
previous_schema_version = data['metadata']['dbt_schema_version']
# cls.dbt_schema_version is a SchemaVersion object
if str(cls.dbt_schema_version) != previous_schema_version:
raise IncompatibleSchemaException(
expected=str(cls.dbt_schema_version),
found=previous_schema_version
)
return cls.from_dict(data) # type: ignore
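The version check in `read_and_check_versions` compares the class's expected `dbt_schema_version` against the one recorded in the artifact's `metadata` dictionary. A stdlib-only sketch of that comparison, with a stand-in exception class and illustrative schema ids:

```python
class IncompatibleSchemaError(Exception):
    # Stand-in for dbt.exceptions.IncompatibleSchemaException.
    pass

def check_schema_version(data: dict, expected: str) -> None:
    # The version lives only in the artifact's 'metadata' dictionary,
    # not at the top level.
    found = data.get("metadata", {}).get("dbt_schema_version")
    if found is not None and found != expected:
        raise IncompatibleSchemaError(f"expected {expected}, found {found}")

expected = "https://schemas.getdbt.com/dbt/manifest/v4.json"
old_manifest = {
    "metadata": {"dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v3.json"}
}
try:
    check_schema_version(old_manifest, expected)
    ok = True
except IncompatibleSchemaError:
    ok = False
print(ok)  # False: a v3 artifact is rejected against a v4 schema
```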
T = TypeVar('T', bound='ArtifactMixin')
@@ -205,6 +244,8 @@ T = TypeVar('T', bound='ArtifactMixin')
# metadata should really be a Generic[T_M] where T_M is a TypeVar bound to
# BaseArtifactMetadata. Unfortunately this isn't possible due to a mypy issue:
# https://github.com/python/mypy/issues/7520
# This is used in the WritableManifest, RunResultsArtifact, RunOperationResultsArtifact,
# and CatalogArtifact
@dataclasses.dataclass(init=False)
class ArtifactMixin(VersionedSchema, Writable, Readable):
metadata: BaseArtifactMetadata

core/dbt/deps/README.md Normal file
View File

@@ -0,0 +1 @@
# Deps README

View File

@@ -1,4 +1,5 @@
import os
import functools
from typing import List
from dbt import semver
@@ -14,6 +15,7 @@ from dbt.exceptions import (
DependencyException,
package_not_found,
)
from dbt.utils import _connection_exception_retry as connection_exception_retry
class RegistryPackageMixin:
@@ -68,9 +70,28 @@ class RegistryPinnedPackage(RegistryPackageMixin, PinnedPackage):
system.make_directory(os.path.dirname(tar_path))
download_url = metadata.downloads.tarball
system.download_with_retries(download_url, tar_path)
deps_path = project.packages_install_path
package_name = self.get_project_name(project, renderer)
download_untar_fn = functools.partial(
self.download_and_untar,
download_url,
tar_path,
deps_path,
package_name
)
connection_exception_retry(download_untar_fn, 5)
def download_and_untar(self, download_url, tar_path, deps_path, package_name):
"""
Sometimes the download of the files fails and we want to retry. Sometimes the
download appears successful but the file did not make it through as expected
(generally due to a github incident). Either way we want to retry downloading
and untarring to see if we can get a success. Call this within
`_connection_exception_retry`
"""
system.download(download_url, tar_path)
system.untar_package(tar_path, deps_path, package_name)
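`connection_exception_retry` (from `dbt.utils`) re-invokes the partial until it succeeds or attempts run out. A simplified sketch of that pattern, assuming only the standard library (the real helper also catches `requests` exceptions and sleeps between attempts):

```python
import functools

def connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
    # Call fn; on a connection-ish failure, recurse until attempts run out.
    try:
        return fn()
    except (EOFError, ConnectionError):
        if attempt + 1 >= max_attempts:
            raise
        return connection_exception_retry(fn, max_attempts, attempt + 1)

calls = {"n": 0}

def flaky_download():
    # Fails twice (e.g. a truncated tarball), then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("truncated download")
    return "ok"

result = connection_exception_retry(functools.partial(flaky_download), 5)
print(result, calls["n"])  # ok 3
```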

View File

@@ -1,10 +1,25 @@
# Events Module
The Events module is the implementation for structured logging. These events represent both a programmatic interface to dbt processes as well as human-readable messaging in one centralized place. The centralization allows for leveraging mypy to enforce interface invariants across all dbt events, and the distinct type layer allows for decoupling events and libraries such as loggers.
The Events module is responsible for communicating internal dbt structures into a consumable interface. Right now, the events module is exclusively used for structured logging, but in the future it could grow to include other user-facing components such as exceptions. These events represent both a programmatic interface to dbt processes as well as human-readable messaging in one centralized place. The centralization allows for leveraging mypy to enforce interface invariants across all dbt events, and the distinct type layer allows for decoupling events from libraries such as loggers.
# Using the Events Module
The event module provides types that represent what is happening in dbt in `events.types`. These types are intended to represent an exhaustive list of all things happening within dbt that will need to be logged, streamed, or printed. To fire an event, `events.functions::fire_event` is the entry point to the module from everywhere in dbt.
# Logging
When events are processed via `fire_event`, nearly everything is logged. Whether or not the user has enabled the debug flag, all debug messages are still logged to the file. However, some events are particularly time consuming to construct because they return a huge amount of data. Today, the only messages in this category are cache events, which are only logged if the `--log-cache-events` flag is on. This is important because these messages should not be created unless they are going to be logged, since they cause a noticeable performance degradation. We achieve this by making the event class explicitly use lazy values for the expensive fields so they are not computed until the moment they are required. This is done with the data type `core/dbt/helper_types.py::Lazy`, which includes usage documentation.
Example:
```
@dataclass
class DumpBeforeAddGraph(DebugLevel, Cache):
dump: Lazy[Dict[str, List[str]]]
code: str = "E031"
def message(self) -> str:
return f"before adding : {self.dump.force()}"
```
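A minimal sketch of what a `Lazy` type like `core/dbt/helper_types.py::Lazy` can look like: the thunk runs only when `force()` is called, so an expensive cache dump is never computed unless it will actually be logged (this is an illustration, not the exact implementation):

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")

@dataclass
class Lazy(Generic[T]):
    # Hold a zero-argument thunk; run it only when force() is called.
    _f: Callable[[], T]

    @classmethod
    def defer(cls, f: Callable[[], T]) -> "Lazy[T]":
        return cls(f)

    def force(self) -> T:
        return self._f()

computed = {"n": 0}

def expensive_dump():
    computed["n"] += 1
    return {"model.a": ["model.b"]}

dump = Lazy.defer(expensive_dump)
print(computed["n"])  # 0: nothing computed yet
print(dump.force())
print(computed["n"])  # 1: computed only when forced
```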
# Adding a New Event
In `events.types` add a new class that represents the new event. Every event must be a dataclass with, at minimum, a code. You may also include some other values to construct downstream messaging. Only include the data necessary to construct this message within this class. You must extend all destinations (e.g. if your log message belongs on the CLI, extend `Cli`) as well as the log level this event belongs to. This system has been designed to take full advantage of mypy, so running it will catch anything you may miss.
@@ -29,28 +44,10 @@ class PartialParsingDeletedExposure(DebugLevel, Cli, File):
## Optional (based on your event)
- Events associated with node status changes must have `report_node_data` passed in and be extended with `NodeInfo`
- define `asdict` if your data is not serializable to json
- Events associated with node status changes must be extended with `NodeInfo` which contains a node_info attribute
Example
```
@dataclass
class SuperImportantNodeEvent(InfoLevel, File, NodeInfo):
node_name: str
run_result: RunResult
report_node_data: ParsedModelNode # may vary
code: str = "Q036"
def message(self) -> str:
return f"{self.node_name} had overly verbose result of {run_result}"
@classmethod
def asdict(cls, data: list) -> dict:
return dict((k, str(v)) for k, v in data)
```
All values other than `code` and `report_node_data` will be included in the `data` node of the json log output.
All values other than `code` and `node_info` will be included in the `data` node of the json log output.
Once your event has been added, add a dummy call to your new event at the bottom of `types.py` and also add your new event to the list `sample_values` in `test/unit/test_events.py`.

View File

@@ -1,9 +1,10 @@
from abc import ABCMeta, abstractmethod, abstractproperty
from abc import ABCMeta, abstractproperty, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from dbt.events.serialization import EventSerialization
import os
import threading
from typing import Any, Optional
from typing import Any, Dict
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# These base types define the _required structure_ for the concrete event #
@@ -11,50 +12,11 @@ from typing import Any, Optional
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# in preparation for #3977
class TestLevel():
def level_tag(self) -> str:
return "test"
class DebugLevel():
def level_tag(self) -> str:
return "debug"
class InfoLevel():
def level_tag(self) -> str:
return "info"
class WarnLevel():
def level_tag(self) -> str:
return "warn"
class ErrorLevel():
def level_tag(self) -> str:
return "error"
class Cache():
# Events with this class will only be logged when the `--log-cache-events` flag is passed
pass
@dataclass
class Node():
node_path: str
node_name: str
unique_id: str
resource_type: str
materialized: str
node_status: str
node_started_at: datetime
node_finished_at: Optional[datetime]
type: str = 'node_status'
@dataclass
class ShowException():
# N.B.:
@@ -68,15 +30,9 @@ class ShowException():
# TODO add exhaustiveness checking for subclasses
# can't use ABCs with @dataclass because of https://github.com/python/mypy/issues/5374
# top-level superclass for all events
class Event(metaclass=ABCMeta):
# fields that should be on all events with their default implementations
log_version: int = 1
ts: Optional[datetime] = None # use getter for non-optional
ts_rfc3339: Optional[str] = None # use getter for non-optional
pid: Optional[int] = None # use getter for non-optional
node_info: Optional[Node]
# Do not define fields with defaults here
# four digit string code that uniquely identifies this type of event
# uniqueness and valid characters are enforced by tests
@@ -85,6 +41,12 @@ class Event(metaclass=ABCMeta):
def code() -> str:
raise Exception("code() not implemented for event")
# The 'to_dict' method is added by mashumaro via the EventSerialization.
# It should be in all subclasses that are to record actual events.
@abstractmethod
def to_dict(self):
raise Exception('to_dict not implemented for Event')
# do not define this yourself. inherit it from one of the above level types.
@abstractmethod
def level_tag(self) -> str:
@@ -96,25 +58,9 @@ class Event(metaclass=ABCMeta):
def message(self) -> str:
raise Exception("msg not implemented for Event")
# exactly one time stamp per concrete event
def get_ts(self) -> datetime:
if not self.ts:
self.ts = datetime.utcnow()
self.ts_rfc3339 = self.ts.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
return self.ts
# preformatted time stamp
def get_ts_rfc3339(self) -> str:
if not self.ts_rfc3339:
# get_ts() creates the formatted string too so all time logic is centralized
self.get_ts()
return self.ts_rfc3339 # type: ignore
# exactly one pid per concrete event
def get_pid(self) -> int:
if not self.pid:
self.pid = os.getpid()
return self.pid
return os.getpid()
# in theory threads can change so we don't cache them.
def get_thread_name(self) -> str:
@@ -125,49 +71,50 @@ class Event(metaclass=ABCMeta):
from dbt.events.functions import get_invocation_id
return get_invocation_id()
# default dict factory for all events. can override on concrete classes.
@classmethod
def asdict(cls, data: list) -> dict:
d = dict()
for k, v in data:
# stringify all exceptions
if isinstance(v, Exception) or isinstance(v, BaseException):
d[k] = str(v)
# skip all binary data
elif isinstance(v, bytes):
continue
else:
d[k] = v
return d
# in preparation for #3977
@dataclass # type: ignore[misc]
class TestLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "test"
@dataclass # type: ignore
class NodeInfo(Event, metaclass=ABCMeta):
report_node_data: Any # Union[ParsedModelNode, ...] TODO: resolve circular imports
def get_node_info(self):
node_info = Node(
node_path=self.report_node_data.path,
node_name=self.report_node_data.name,
unique_id=self.report_node_data.unique_id,
resource_type=self.report_node_data.resource_type.value,
materialized=self.report_node_data.config.get('materialized'),
node_status=str(self.report_node_data._event_status.get('node_status')),
node_started_at=self.report_node_data._event_status.get("started_at"),
node_finished_at=self.report_node_data._event_status.get("finished_at")
)
return node_info
@dataclass # type: ignore[misc]
class DebugLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "debug"
class File(Event, metaclass=ABCMeta):
# Solely the human readable message. Timestamps and formatting will be added by the logger.
def file_msg(self) -> str:
# returns the event msg unless overridden in the concrete class
return self.message()
@dataclass # type: ignore[misc]
class InfoLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "info"
class Cli(Event, metaclass=ABCMeta):
# Solely the human readable message. Timestamps and formatting will be added by the logger.
def cli_msg(self) -> str:
# returns the event msg unless overridden in the concrete class
return self.message()
@dataclass # type: ignore[misc]
class WarnLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "warn"
@dataclass # type: ignore[misc]
class ErrorLevel(EventSerialization, Event):
def level_tag(self) -> str:
return "error"
# prevents an event from going to the file
class NoFile():
pass
# prevents an event from going to stdout
class NoStdOut():
pass
# This class represents the node_info which is generated
# by the NodeInfoMixin class in dbt.contracts.graph.parsed
@dataclass
class NodeInfo():
node_info: Dict[str, Any]

View File

@@ -1,12 +1,12 @@
import colorama
from colorama import Style
from datetime import datetime
import dbt.events.functions as this # don't worry I hate it too.
from dbt.events.base_types import Cli, Event, File, ShowException, NodeInfo, Cache
from dbt.events.base_types import NoStdOut, Event, NoFile, ShowException, Cache
from dbt.events.types import EventBufferFull, T_Event, MainReportVersion, EmptyLine
import dbt.flags as flags
# TODO this will need to move eventually
from dbt.logger import SECRET_ENV_PREFIX, make_log_dir_if_missing, GLOBAL_LOGGER
from datetime import datetime
import json
import io
from io import StringIO, TextIOWrapper
@@ -18,10 +18,11 @@ from logging.handlers import RotatingFileHandler
import os
import uuid
import threading
from typing import Any, Callable, Dict, List, Optional, Union
import dataclasses
from typing import Any, Dict, List, Optional, Union
from collections import deque
global LOG_VERSION
LOG_VERSION = 2
# create the global event history buffer with the default max size (10k)
# python 3.7 doesn't support type hints on globals, but mypy requires them. hence the ignore.
@@ -48,6 +49,25 @@ format_color = True
format_json = False
invocation_id: Optional[str] = None
# Colorama needs some help on windows because we're using logger.info
# instead of print(). If the Windows env doesn't have a TERM var set,
# then we should override the logging stream to use the colorama
# converter. If the TERM var is set (as with Git Bash), then it's safe
# to send escape characters and no log handler injection is needed.
colorama_stdout = sys.stdout
colorama_wrap = True
colorama.init(wrap=colorama_wrap)
if sys.platform == 'win32' and not os.getenv('TERM'):
colorama_wrap = False
colorama_stdout = colorama.AnsiToWin32(sys.stdout).stream
elif sys.platform == 'win32':
colorama_wrap = False
colorama.init(wrap=colorama_wrap)
def setup_event_logger(log_path, level_override=None):
# flags have been resolved, and log_path is known
@@ -131,47 +151,36 @@ def scrub_secrets(msg: str, secrets: List[str]) -> str:
return scrubbed
# returns a dictionary representation of the event fields. You must specify which of the
# available messages you would like to use (i.e. - e.message, e.cli_msg(), e.file_msg())
# used for constructing json formatted events. includes secrets which must be scrubbed at
# the usage site.
# returns a dictionary representation of the event fields.
# the message may contain secrets which must be scrubbed at the usage site.
def event_to_serializable_dict(
e: T_Event, ts_fn: Callable[[datetime], str],
msg_fn: Callable[[T_Event], str]
e: T_Event,
) -> Dict[str, Any]:
data = dict()
node_info = dict()
log_line = dict()
code: str
try:
log_line = dataclasses.asdict(e, dict_factory=type(e).asdict)
except AttributeError:
log_line = e.to_dict()
except AttributeError as exc:
event_type = type(e).__name__
raise Exception( # TODO this may hang async threads
f"type {event_type} is not serializable to json."
f" First make sure that the call sites for {event_type} match the type hints"
f" and if they do, you can override the dataclass method `asdict` in {event_type} in"
" types.py to define your own serialization function to a dictionary of valid json"
" types"
f"type {event_type} is not serializable. {str(exc)}"
)
if isinstance(e, NodeInfo):
node_info = dataclasses.asdict(e.get_node_info())
for field, value in log_line.items(): # type: ignore[attr-defined]
if field not in ["code", "report_node_data"]:
data[field] = value
# We get the code from the event object, so we don't need it in the data
if 'code' in log_line:
del log_line['code']
event_dict = {
'type': 'log_line',
'log_version': e.log_version,
'ts': ts_fn(e.get_ts()),
'log_version': LOG_VERSION,
'ts': get_ts_rfc3339(),
'pid': e.get_pid(),
'msg': msg_fn(e),
'msg': e.message(),
'level': e.level_tag(),
'data': data,
'data': log_line,
'invocation_id': e.get_invocation_id(),
'thread_name': e.get_thread_name(),
'node_info': node_info,
'code': e.code
}
@@ -179,25 +188,24 @@ def event_to_serializable_dict(
# translates an Event to a completely formatted text-based log line
# you have to specify which message you want. (i.e. - e.message, e.cli_msg(), e.file_msg())
# type hinting everything as strings so we don't get any unintentional string conversions via str()
def create_info_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
def create_info_text_log_line(e: T_Event) -> str:
color_tag: str = '' if this.format_color else Style.RESET_ALL
ts: str = e.get_ts().strftime("%H:%M:%S")
scrubbed_msg: str = scrub_secrets(msg_fn(e), env_secrets())
ts: str = get_ts().strftime("%H:%M:%S")
scrubbed_msg: str = scrub_secrets(e.message(), env_secrets())
log_line: str = f"{color_tag}{ts} {scrubbed_msg}"
return log_line
def create_debug_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> str:
def create_debug_text_log_line(e: T_Event) -> str:
log_line: str = ''
# Create a separator if this is the beginning of an invocation
if type(e) == MainReportVersion:
separator = 30 * '='
log_line = f'\n\n{separator} {e.get_ts()} | {get_invocation_id()} {separator}\n'
log_line = f'\n\n{separator} {get_ts()} | {get_invocation_id()} {separator}\n'
color_tag: str = '' if this.format_color else Style.RESET_ALL
ts: str = e.get_ts().strftime("%H:%M:%S.%f")
scrubbed_msg: str = scrub_secrets(msg_fn(e), env_secrets())
ts: str = get_ts().strftime("%H:%M:%S.%f")
scrubbed_msg: str = scrub_secrets(e.message(), env_secrets())
level: str = e.level_tag() if len(e.level_tag()) == 5 else f"{e.level_tag()} "
thread = ''
if threading.current_thread().getName():
@@ -210,12 +218,11 @@ def create_debug_text_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) ->
# translates an Event to a completely formatted json log line
# you have to specify which message you want. (i.e. - e.message(), e.cli_msg(), e.file_msg())
def create_json_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> Optional[str]:
def create_json_log_line(e: T_Event) -> Optional[str]:
if type(e) == EmptyLine:
return None # will not be sent to logger
# using preformatted string instead of formatting it here to be extra careful about timezone
values = event_to_serializable_dict(e, lambda _: e.get_ts_rfc3339(), lambda x: msg_fn(x))
# using preformatted ts string instead of formatting it here to be extra careful about timezone
values = event_to_serializable_dict(e)
raw_log_line = json.dumps(values, sort_keys=True)
return scrub_secrets(raw_log_line, env_secrets())
@@ -223,15 +230,14 @@ def create_json_log_line(e: T_Event, msg_fn: Callable[[T_Event], str]) -> Option
# calls create_stdout_text_log_line() or create_json_log_line() according to logger config
def create_log_line(
e: T_Event,
msg_fn: Callable[[T_Event], str],
file_output=False
) -> Optional[str]:
if this.format_json:
return create_json_log_line(e, msg_fn) # json output, both console and file
return create_json_log_line(e) # json output, both console and file
elif file_output is True or flags.DEBUG:
return create_debug_text_log_line(e, msg_fn) # default file output
return create_debug_text_log_line(e) # default file output
else:
return create_info_text_log_line(e, msg_fn) # console output
return create_info_text_log_line(e) # console output
# allows for reuse of this obnoxious if-else tree.
@@ -328,29 +334,28 @@ def fire_event(e: Event) -> None:
if flags.ENABLE_LEGACY_LOGGER:
# using Event::message because the legacy logger didn't differentiate messages by
# destination
log_line = create_log_line(e, msg_fn=lambda x: x.message())
log_line = create_log_line(e)
if log_line:
send_to_logger(GLOBAL_LOGGER, e.level_tag(), log_line)
return # exit the function to avoid using the current logger as well
# always logs debug level regardless of user input
if isinstance(e, File):
log_line = create_log_line(e, msg_fn=lambda x: x.file_msg(), file_output=True)
if not isinstance(e, NoFile):
log_line = create_log_line(e, file_output=True)
# doesn't send exceptions to exception logger
if log_line:
send_to_logger(FILE_LOG, level_tag=e.level_tag(), log_line=log_line)
if isinstance(e, Cli):
if not isinstance(e, NoStdOut):
# explicitly checking the debug flag here so that potentially expensive-to-construct
# log messages are not constructed if debug messages are never shown.
if e.level_tag() == 'debug' and not flags.DEBUG:
return # eat the message in case it was one of the expensive ones
log_line = create_log_line(e, msg_fn=lambda x: x.cli_msg())
log_line = create_log_line(e)
if log_line:
if not isinstance(e, ShowException):
send_to_logger(STDOUT_LOG, level_tag=e.level_tag(), log_line=log_line)
# CliEventABC and ShowException
else:
send_exc_to_logger(
STDOUT_LOG,
@@ -374,3 +379,16 @@ def set_invocation_id() -> None:
# commands in the dbt servers. It shouldn't be necessary for the CLI.
global invocation_id
invocation_id = str(uuid.uuid4())
# exactly one time stamp per concrete event
def get_ts() -> datetime:
ts = datetime.utcnow()
return ts
# preformatted time stamp
def get_ts_rfc3339() -> str:
ts = get_ts()
ts_rfc3339 = ts.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
return ts_rfc3339
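The two helpers added above centralize timestamp creation. A sketch of the same RFC 3339 formatting with a fixed timestamp, to show the exact output shape:

```python
from datetime import datetime

def ts_to_rfc3339(ts: datetime) -> str:
    # Same format string as get_ts_rfc3339 above: UTC time with
    # microseconds and a literal 'Z' suffix.
    return ts.strftime('%Y-%m-%dT%H:%M:%S.%fZ')

stamp = ts_to_rfc3339(datetime(2022, 1, 31, 22, 9, 17, 123456))
print(stamp)  # 2022-01-31T22:09:17.123456Z
```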

View File

@@ -0,0 +1,56 @@
from dbt.helper_types import Lazy
from mashumaro import DataClassDictMixin
from mashumaro.config import (
BaseConfig as MashBaseConfig
)
from mashumaro.types import SerializationStrategy
from typing import Dict, List
# The dbtClassMixin serialization class has a DateTime serialization strategy
# class. If a datetime ends up in an event class, we could use a similar class
# here to serialize it in our preferred format.
class ExceptionSerialization(SerializationStrategy):
def serialize(self, value):
out = str(value)
return out
def deserialize(self, value):
return (Exception(value))
class BaseExceptionSerialization(SerializationStrategy):
def serialize(self, value):
return str(value)
def deserialize(self, value):
return (BaseException(value))
# This is an explicit serialization strategy for the type Lazy[Dict[str, List[str]]]
# mashumaro does not support composing serialization strategies, so all
# future uses of Lazy will need to register a unique serialization class like this one.
class LazySerialization1(SerializationStrategy):
def serialize(self, value) -> Dict[str, List[str]]:
return value.force()
# we _can_ deserialize into a lazy value, but that defers running the deserialization
# function till the value is used which can raise errors at very unexpected times.
# It's best practice to do strict deserialization unless you're in a very special case.
def deserialize(self, value):
raise Exception("Don't deserialize into a Lazy value. Try just using the value itself.")
# This class is the equivalent of dbtClassMixin that's used for serialization
# in other parts of the code. That class did extra things which we didn't want
# to use for events, so this class is a simpler version of dbtClassMixin.
class EventSerialization(DataClassDictMixin):
# This is where we register serialization strategies per type.
class Config(MashBaseConfig):
serialization_strategy = {
Exception: ExceptionSerialization(),
BaseException: BaseExceptionSerialization(),
Lazy[Dict[str, List[str]]]: LazySerialization1()
}
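A minimal stdlib-only sketch of what these strategy classes do. mashumaro registers one strategy per type in the `Config`; here the per-type lookup is done by hand, and `to_dict` is a hypothetical stand-in for the mixin's generated method:

```python
class ExceptionSerialization:
    """Strategy: exceptions serialize to their message string."""

    def serialize(self, value: BaseException) -> str:
        return str(value)

    def deserialize(self, value: str) -> Exception:
        return Exception(value)


# per-type registry, analogous to Config.serialization_strategy
STRATEGIES = {Exception: ExceptionSerialization()}


def to_dict(obj: dict) -> dict:
    # replace any value whose type has a registered strategy
    out = {}
    for key, value in obj.items():
        for typ, strategy in STRATEGIES.items():
            if isinstance(value, typ):
                value = strategy.serialize(value)
                break
        out[key] = value
    return out


event = {"code": "Z001", "exc": Exception("boom")}
serialized = to_dict(event)
```

With this registry, `serialized` becomes a plain string-valued dict that JSON can handle, which is the point of routing exceptions through a strategy rather than serializing them structurally.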


@@ -1,72 +0,0 @@
from typing import (
Any,
List,
NamedTuple,
Optional,
Dict,
)
# N.B.:
# These stubs were autogenerated by stubgen and then hacked
# to pieces to ensure we had something other than "Any" types
# when using external classes to instantiate event subclasses
# in events/types.py.
#
# This goes away when we turn mypy on for everything.
#
# Don't trust them too much at all!
class _ReferenceKey(NamedTuple):
database: Any
schema: Any
identifier: Any
class _CachedRelation:
referenced_by: Any
inner: Any
class BaseRelation:
path: Any
type: Optional[Any]
quote_character: str
include_policy: Any
quote_policy: Any
dbt_created: bool
class InformationSchema(BaseRelation):
information_schema_view: Optional[str]
class CompiledNode():
compiled_sql: Optional[str]
extra_ctes_injected: bool
extra_ctes: List[Any]
relation_name: Optional[str]
class CompiledModelNode(CompiledNode):
resource_type: Any
class ParsedModelNode():
resource_type: Any
class ParsedHookNode():
resource_type: Any
index: Optional[int]
class RunResult():
status: str
timing: List[Any]
thread_id: str
execution_time: float
adapter_response: Dict[str, Any]
message: Optional[str]
failures: Optional[int]
node: Any


@@ -5,7 +5,7 @@ from .types import (
WarnLevel,
ErrorLevel,
ShowException,
Cli
NoFile
)
@@ -13,7 +13,7 @@ from .types import (
# Reuse the existing messages when adding logs to tests.
@dataclass
class IntegrationTestInfo(InfoLevel, Cli):
class IntegrationTestInfo(InfoLevel, NoFile):
msg: str
code: str = "T001"
@@ -22,7 +22,7 @@ class IntegrationTestInfo(InfoLevel, Cli):
@dataclass
class IntegrationTestDebug(DebugLevel, Cli):
class IntegrationTestDebug(DebugLevel, NoFile):
msg: str
code: str = "T002"
@@ -31,7 +31,7 @@ class IntegrationTestDebug(DebugLevel, Cli):
@dataclass
class IntegrationTestWarn(WarnLevel, Cli):
class IntegrationTestWarn(WarnLevel, NoFile):
msg: str
code: str = "T003"
@@ -40,7 +40,7 @@ class IntegrationTestWarn(WarnLevel, Cli):
@dataclass
class IntegrationTestError(ErrorLevel, Cli):
class IntegrationTestError(ErrorLevel, NoFile):
msg: str
code: str = "T004"
@@ -49,7 +49,7 @@ class IntegrationTestError(ErrorLevel, Cli):
@dataclass
class IntegrationTestException(ShowException, ErrorLevel, Cli):
class IntegrationTestException(ShowException, ErrorLevel, NoFile):
msg: str
code: str = "T005"
@@ -58,7 +58,7 @@ class IntegrationTestException(ShowException, ErrorLevel, Cli):
@dataclass
class UnitTestInfo(InfoLevel, Cli):
class UnitTestInfo(InfoLevel, NoFile):
msg: str
code: str = "T006"

File diff suppressed because it is too large


@@ -2,8 +2,7 @@ import builtins
import functools
from typing import NoReturn, Optional, Mapping, Any
from dbt.logger import get_secret_env
from dbt.events.functions import fire_event
from dbt.events.functions import fire_event, scrub_secrets, env_secrets
from dbt.events.types import GeneralWarningMsg, GeneralWarningException
from dbt.node_types import NodeType
from dbt import flags
@@ -54,7 +53,7 @@ class RuntimeException(RuntimeError, Exception):
def __init__(self, msg, node=None):
self.stack = []
self.node = node
self.msg = msg
self.msg = scrub_secrets(msg, env_secrets())
def add_node(self, node=None):
if node is not None and node is not self.node:
@@ -401,8 +400,6 @@ class CommandError(RuntimeException):
super().__init__(message)
self.cwd = cwd
self.cmd = cmd
for secret in get_secret_env():
self.cmd = str(self.cmd).replace(secret, "*****")
self.args = (cwd, cmd, message)
def __str__(self):
@@ -466,7 +463,21 @@ def raise_database_error(msg, node=None) -> NoReturn:
def raise_dependency_error(msg) -> NoReturn:
raise DependencyException(msg)
raise DependencyException(scrub_secrets(msg, env_secrets()))
def raise_git_cloning_error(error: CommandResultError) -> NoReturn:
error.cmd = scrub_secrets(str(error.cmd), env_secrets())
raise error
def raise_git_cloning_problem(repo) -> NoReturn:
repo = scrub_secrets(repo, env_secrets())
msg = '''\
Something went wrong while cloning {}
Check the debug logs for more information
'''
raise RuntimeException(msg.format(repo))
def disallow_secret_env_var(env_var_name) -> NoReturn:
@@ -692,9 +703,9 @@ def missing_materialization(model, adapter_type):
def bad_package_spec(repo, spec, error_message):
raise InternalException(
"Error checking out spec='{}' for repo {}\n{}".format(
spec, repo, error_message))
msg = "Error checking out spec='{}' for repo {}\n{}".format(spec, repo, error_message)
raise InternalException(scrub_secrets(msg, env_secrets()))
def raise_cache_inconsistent(message):
@@ -763,6 +774,10 @@ def system_error(operation_name):
class ConnectionException(Exception):
"""
There was a problem with the connection that returned a bad response,
timed out, or resulted in a file that is corrupt.
"""
pass
@@ -999,7 +1014,7 @@ def raise_duplicate_alias(
def warn_or_error(msg, node=None, log_fmt=None):
if flags.WARN_ERROR:
raise_compiler_error(msg, node)
raise_compiler_error(scrub_secrets(msg, env_secrets()), node)
else:
fire_event(GeneralWarningMsg(msg=msg, log_fmt=log_fmt))
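The scrubbing logic threaded through these call sites can be sketched standalone. Here `env_secrets` is a hypothetical stand-in that treats any `DBT_ENV_SECRET_`-prefixed environment variable as a secret value; the masking itself matches the `"*****"` replacement shown in the diff above:

```python
import os
from typing import List


def env_secrets() -> List[str]:
    # hypothetical: collect secret values from specially prefixed env vars
    return [v for k, v in os.environ.items()
            if k.startswith("DBT_ENV_SECRET_") and v]


def scrub_secrets(msg: str, secrets: List[str]) -> str:
    # mask every secret occurrence before the message is raised or logged
    for secret in secrets:
        msg = msg.replace(secret, "*****")
    return msg


os.environ["DBT_ENV_SECRET_TOKEN"] = "hunter2"
scrubbed = scrub_secrets("git clone https://user:hunter2@host/repo", env_secrets())
```

Centralizing this in the exception constructors (rather than at each `raise` site) is what lets the diff delete the old per-command replace loop in `CommandError`.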

core/dbt/graph/README.md (new file)

@@ -0,0 +1 @@
# Graph README


@@ -31,11 +31,13 @@ class Graph:
"""Returns all nodes having a path to `node` in `graph`"""
if not self.graph.has_node(node):
raise InternalException(f'Node {node} not found in the graph!')
with nx.utils.reversed(self.graph):
anc = nx.single_source_shortest_path_length(G=self.graph,
source=node,
cutoff=max_depth)\
.keys()
# This used to use nx.utils.reversed(self.graph), but that is deprecated,
# so changing to use self.graph.reverse(copy=False) as recommended
G = self.graph.reverse(copy=False) if self.graph.is_directed() else self.graph
anc = nx.single_source_shortest_path_length(G=G,
source=node,
cutoff=max_depth)\
.keys()
return anc - {node}
def descendants(
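What the reversed shortest-path lookup computes can be sketched without networkx: the ancestors of `node` are all nodes with a directed path to it, found by BFS over the reversed edges (which is what `graph.reverse(copy=False)` provides cheaply as a view):

```python
from typing import Dict, List, Optional, Set, Tuple


def ancestors(edges: List[Tuple[str, str]], node: str,
              max_depth: Optional[int] = None) -> Set[str]:
    # build the reversed adjacency: child -> set of parents
    rev: Dict[str, Set[str]] = {}
    for src, dst in edges:
        rev.setdefault(dst, set()).add(src)
    seen, depth, frontier = {node}, 0, [node]
    while frontier and (max_depth is None or depth < max_depth):
        depth += 1
        frontier = [p for n in frontier
                    for p in rev.get(n, ()) if p not in seen]
        seen.update(frontier)
    return seen - {node}


edges = [("a", "b"), ("b", "c"), ("x", "c")]
```

Like the method above, the starting node is excluded from its own ancestor set, and `max_depth` caps the BFS the same way `cutoff` does for `single_source_shortest_path_length`.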


@@ -86,8 +86,9 @@ class NodeSelector(MethodManager):
try:
collected = self.select_included(nodes, spec)
except InvalidSelectorException:
valid_selectors = ", ".join(self.SELECTOR_METHODS)
fire_event(SelectorReportInvalidSelector(
selector_methods=self.SELECTOR_METHODS,
valid_selectors=valid_selectors,
spec_method=spec.method,
raw_spec=spec.raw
))


@@ -1,7 +1,7 @@
import abc
from itertools import chain
from pathlib import Path
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional
from typing import Set, List, Dict, Iterator, Tuple, Any, Union, Type, Optional, Callable
from dbt.dataclass_schema import StrEnum
@@ -478,42 +478,28 @@ class StateSelectorMethod(SelectorMethod):
previous_macros = []
return self.recursively_check_macros_modified(node, previous_macros)
def check_modified(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
# TODO: check_modified_content and check_modified_macros seem a bit redundant
def check_modified_content(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
different_contents = not new.same_contents(old) # type: ignore
upstream_macro_change = self.check_macros_modified(new)
return different_contents or upstream_macro_change
def check_modified_body(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, "same_body"):
return not new.same_body(old) # type: ignore
else:
return False
def check_modified_configs(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, "same_config"):
return not new.same_config(old) # type: ignore
else:
return False
def check_modified_persisted_descriptions(
self, old: Optional[SelectorTarget], new: SelectorTarget
) -> bool:
if hasattr(new, "same_persisted_description"):
return not new.same_persisted_description(old) # type: ignore
else:
return False
def check_modified_relation(
self, old: Optional[SelectorTarget], new: SelectorTarget
) -> bool:
if hasattr(new, "same_database_representation"):
return not new.same_database_representation(old) # type: ignore
else:
return False
def check_modified_macros(self, _, new: SelectorTarget) -> bool:
return self.check_macros_modified(new)
@staticmethod
def check_modified_factory(
compare_method: str
) -> Callable[[Optional[SelectorTarget], SelectorTarget], bool]:
# get a function that compares two selector targets based on the compare method provided
def check_modified_things(old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
if hasattr(new, compare_method):
# when old body does not exist or old and new are not the same
return not old or not getattr(new, compare_method)(old) # type: ignore
else:
return False
return check_modified_things
def check_new(self, old: Optional[SelectorTarget], new: SelectorTarget) -> bool:
return old is None
@@ -527,14 +513,21 @@ class StateSelectorMethod(SelectorMethod):
state_checks = {
# it's new if there is no old version
'new': lambda old, _: old is None,
'new':
lambda old, _: old is None,
# use methods defined above to compare properties of old + new
'modified': self.check_modified,
'modified.body': self.check_modified_body,
'modified.configs': self.check_modified_configs,
'modified.persisted_descriptions': self.check_modified_persisted_descriptions,
'modified.relation': self.check_modified_relation,
'modified.macros': self.check_modified_macros,
'modified':
self.check_modified_content,
'modified.body':
self.check_modified_factory('same_body'),
'modified.configs':
self.check_modified_factory('same_config'),
'modified.persisted_descriptions':
self.check_modified_factory('same_persisted_description'),
'modified.relation':
self.check_modified_factory('same_database_representation'),
'modified.macros':
self.check_modified_macros,
}
if selector in state_checks:
checker = state_checks[selector]
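The factory refactor above collapses five near-identical `check_modified_*` methods into one closure. A self-contained sketch under a toy `Target` class (hypothetical; the real targets are parsed nodes with methods like `same_body`):

```python
from typing import Callable, Optional


class Target:
    def __init__(self, body: str):
        self.body = body

    def same_body(self, old: "Target") -> bool:
        return self.body == old.body


def check_modified_factory(
    compare_method: str,
) -> Callable[[Optional[Target], Target], bool]:
    # a missing old version, or a failed comparison, counts as modified;
    # targets lacking the compare method are never considered modified
    def check_modified_things(old: Optional[Target], new: Target) -> bool:
        if hasattr(new, compare_method):
            return not old or not getattr(new, compare_method)(old)
        return False
    return check_modified_things


modified_body = check_modified_factory("same_body")
```

Each entry in `state_checks` then just binds a different method name, so adding a new `modified.*` sub-selector is one dict line instead of a new method.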


@@ -1,4 +1,8 @@
# never name this package "types", or mypy will crash in ugly ways
# necessary for annotating constructors
from __future__ import annotations
from dataclasses import dataclass
from datetime import timedelta
from pathlib import Path
@@ -9,6 +13,7 @@ from dbt.dataclass_schema import (
)
from hologram import FieldEncoder, JsonDict
from mashumaro.types import SerializableType
from typing import Callable, cast, Generic, Optional, TypeVar
class Port(int, SerializableType):
@@ -93,3 +98,35 @@ dbtClassMixin.register_field_encoders({
FQNPath = Tuple[str, ...]
PathSet = AbstractSet[FQNPath]
T = TypeVar('T')
# A data type for representing lazily evaluated values.
#
# usage:
# x = Lazy.defer(lambda: expensive_fn())
# y = x.force()
#
# inspired by the purescript data type
# https://pursuit.purescript.org/packages/purescript-lazy/5.0.0/docs/Data.Lazy
@dataclass
class Lazy(Generic[T]):
_f: Callable[[], T]
memo: Optional[T] = None
# constructor for lazy values
@classmethod
def defer(cls, f: Callable[[], T]) -> Lazy[T]:
return Lazy(f)
# workaround for open mypy issue:
# https://github.com/python/mypy/issues/6910
def _typed_eval_f(self) -> T:
return cast(Callable[[], T], getattr(self, "_f"))()
# evaluates the function if the value has not been memoized already
def force(self) -> T:
if self.memo is None:
self.memo = self._typed_eval_f()
return self.memo
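The memoization contract can be checked with a standalone copy of the class (re-declared here, with a string annotation in place of `from __future__ import annotations`, so the sketch runs on its own):

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Lazy(Generic[T]):
    _f: Callable[[], T]
    memo: Optional[T] = None

    @classmethod
    def defer(cls, f: Callable[[], T]) -> "Lazy[T]":
        return cls(f)

    def force(self) -> T:
        # evaluate at most once, then serve the cached value
        if self.memo is None:
            self.memo = self._f()
        return self.memo


calls = []
x = Lazy.defer(lambda: calls.append(1) or 42)
```

Nothing runs at `defer` time; the first `force()` evaluates and caches, and later calls hit the memo. Note that memoizing on `memo is None` means a legitimately computed `None` value would be recomputed each time.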


@@ -0,0 +1 @@
# Include README

File diff suppressed because one or more lines are too long


@@ -36,7 +36,7 @@ from dbt.adapters.factory import reset_adapters, cleanup_connections
import dbt.tracking
from dbt.utils import ExitCodes
from dbt.utils import ExitCodes, args_to_dict
from dbt.config.profile import DEFAULT_PROFILES_DIR, read_user_config
from dbt.exceptions import (
InternalException,
@@ -140,7 +140,7 @@ def main(args=None):
exit_code = e.code
except BaseException as e:
fire_event(MainEncounteredError(e=e))
fire_event(MainEncounteredError(e=str(e)))
fire_event(MainStackTrace(stack_trace=traceback.format_exc()))
exit_code = ExitCodes.UnhandledError.value
@@ -205,7 +205,7 @@ def track_run(task):
)
except (NotImplementedException,
FailedToConnectException) as e:
fire_event(MainEncounteredError(e=e))
fire_event(MainEncounteredError(e=str(e)))
dbt.tracking.track_invocation_end(
config=task.config, args=task.args, result_type="error"
)
@@ -235,10 +235,10 @@ def run_from_args(parsed):
setup_event_logger(log_path or 'logs', level_override)
fire_event(MainReportVersion(v=str(dbt.version.installed)))
fire_event(MainReportArgs(args=parsed))
fire_event(MainReportArgs(args=args_to_dict(parsed)))
if dbt.tracking.active_user is not None: # mypy appeasement, always true
fire_event(MainTrackingUserState(dbt.tracking.active_user.state()))
fire_event(MainTrackingUserState(user_state=dbt.tracking.active_user.state()))
results = None


@@ -0,0 +1 @@
# Parser README


@@ -1,5 +1,6 @@
from dataclasses import dataclass
import re
import warnings
from typing import List
from packaging import version as packaging_version
@@ -145,10 +146,13 @@ class VersionSpecifier(VersionSpecification):
return 1
if b is None:
return -1
if packaging_version.parse(a) > packaging_version.parse(b):
return 1
elif packaging_version.parse(a) < packaging_version.parse(b):
return -1
# This suppresses the LegacyVersion deprecation warning
with warnings.catch_warnings():
warnings.simplefilter("ignore", category=DeprecationWarning)
if packaging_version.parse(a) > packaging_version.parse(b):
return 1
elif packaging_version.parse(a) < packaging_version.parse(b):
return -1
equal = ((self.matcher == Matchers.GREATER_THAN_OR_EQUAL and
other.matcher == Matchers.LESS_THAN_OR_EQUAL) or

core/dbt/task/README.md (new file)

@@ -0,0 +1 @@
# Task README


@@ -290,7 +290,7 @@ class BaseRunner(metaclass=ABCMeta):
ctx.node._event_status['node_status'] = RunningStatus.Compiling
fire_event(
NodeCompiling(
report_node_data=ctx.node,
node_info=ctx.node.node_info,
unique_id=ctx.node.unique_id,
)
)
@@ -306,7 +306,7 @@ class BaseRunner(metaclass=ABCMeta):
ctx.node._event_status['node_status'] = RunningStatus.Executing
fire_event(
NodeExecuting(
report_node_data=ctx.node,
node_info=ctx.node.node_info,
unique_id=ctx.node.unique_id,
)
)
@@ -448,7 +448,7 @@ class BaseRunner(metaclass=ABCMeta):
node_name=node_name,
index=self.node_index,
total=self.num_nodes,
report_node_data=self.node
node_info=self.node.node_info
)
)


@@ -41,7 +41,7 @@ class FreshnessRunner(BaseRunner):
description=description,
index=self.node_index,
total=self.num_nodes,
report_node_data=self.node
node_info=self.node.node_info
)
)
@@ -60,7 +60,7 @@ class FreshnessRunner(BaseRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=self.node
node_info=self.node.node_info
)
)
elif result.status == FreshnessStatus.Error:
@@ -71,7 +71,7 @@ class FreshnessRunner(BaseRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=self.node
node_info=self.node.node_info
)
)
elif result.status == FreshnessStatus.Warn:
@@ -82,7 +82,7 @@ class FreshnessRunner(BaseRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=self.node
node_info=self.node.node_info
)
)
else:
@@ -93,7 +93,7 @@ class FreshnessRunner(BaseRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=self.node
node_info=self.node.node_info
)
)


@@ -14,6 +14,8 @@ from dbt import flags
from dbt.version import _get_adapter_plugin_names
from dbt.adapters.factory import load_plugin, get_include_paths
from dbt.contracts.project import Name as ProjectName
from dbt.events.functions import fire_event
from dbt.events.types import (
StarterProjectPath, ConfigFolderDirectory, NoSampleProfileFound, ProfileWrittenWithSample,
@@ -269,6 +271,16 @@ class InitTask(BaseTask):
numeric_choice = click.prompt(prompt_msg, type=click.INT)
return available_adapters[numeric_choice - 1]
def get_valid_project_name(self) -> str:
"""Returns a valid project name, either from CLI arg or user prompt."""
name = self.args.project_name
while not ProjectName.is_valid(name):
if name:
click.echo(name + " is not a valid project name.")
name = click.prompt("Enter a name for your project (letters, digits, underscore)")
return name
def run(self):
"""Entry point for the init task."""
profiles_dir = flags.PROFILES_DIR
@@ -285,6 +297,8 @@ class InitTask(BaseTask):
# just setup the user's profile.
fire_event(SettingUpProfile())
profile_name = self.get_profile_name_from_current_project()
if not self.check_if_can_write_profile(profile_name=profile_name):
return
# If a profile_template.yml exists in the project root, that effectively
# overrides the profile_template.yml for the given target.
profile_template_path = Path("profile_template.yml")
@@ -296,8 +310,6 @@ class InitTask(BaseTask):
return
except Exception:
fire_event(InvalidProfileTemplateYAML())
if not self.check_if_can_write_profile(profile_name=profile_name):
return
adapter = self.ask_for_adapter_choice()
self.create_profile_from_target(
adapter, profile_name=profile_name
@@ -306,11 +318,7 @@ class InitTask(BaseTask):
# When dbt init is run outside of an existing project,
# create a new project and set up the user's profile.
project_name = self.args.project_name
if project_name is None:
# If project name is not provided,
# ask the user which project name they'd like to use.
project_name = click.prompt("What is the desired project name?")
project_name = self.get_valid_project_name()
project_path = Path(project_name)
if project_path.exists():
fire_event(ProjectNameAlreadyExists(name=project_name))


@@ -178,7 +178,7 @@ class ModelRunner(CompileRunner):
description=self.describe_node(),
index=self.node_index,
total=self.num_nodes,
report_node_data=self.node
node_info=self.node.node_info
)
)
@@ -192,7 +192,7 @@ class ModelRunner(CompileRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=self.node
node_info=self.node.node_info
)
)
else:
@@ -203,7 +203,7 @@ class ModelRunner(CompileRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=self.node
node_info=self.node.node_info
)
)
@@ -357,8 +357,7 @@ class RunTask(CompileTask):
statement=hook_text,
index=idx,
total=num_hooks,
truncate=True,
report_node_data=hook
node_info=hook.node_info
)
)
@@ -380,8 +379,7 @@ class RunTask(CompileTask):
index=idx,
total=num_hooks,
execution_time=timer.elapsed,
truncate=True,
report_node_data=hook
node_info=hook.node_info
)
)
# `_event_status` dict is only used for logging. Make sure
@@ -401,7 +399,7 @@ class RunTask(CompileTask):
try:
self.run_hooks(adapter, hook_type, extra_context)
except RuntimeException:
fire_event(DatabaseErrorRunning(hook_type))
fire_event(DatabaseErrorRunning(hook_type=hook_type.value))
raise
def print_results_line(self, results, execution_time):


@@ -6,7 +6,6 @@ from concurrent.futures import as_completed
from datetime import datetime
from multiprocessing.dummy import Pool as ThreadPool
from typing import Optional, Dict, List, Set, Tuple, Iterable, AbstractSet
from pathlib import PosixPath, WindowsPath
from .printer import (
print_run_result_error,
@@ -216,7 +215,7 @@ class GraphRunnableTask(ManifestTask):
with startctx, extended_metadata:
fire_event(
NodeStart(
report_node_data=runner.node,
node_info=runner.node.node_info,
unique_id=runner.node.unique_id,
)
)
@@ -231,9 +230,9 @@ class GraphRunnableTask(ManifestTask):
with finishctx, DbtModelState(status):
fire_event(
NodeFinished(
report_node_data=runner.node,
node_info=runner.node.node_info,
unique_id=runner.node.unique_id,
run_result=result
run_result=result.to_dict(),
)
)
# `_event_status` dict is only used for logging. Make sure
@@ -359,7 +358,7 @@ class GraphRunnableTask(ManifestTask):
adapter = get_adapter(self.config)
if not adapter.is_cancelable():
fire_event(QueryCancelationUnsupported(type=adapter.type))
fire_event(QueryCancelationUnsupported(type=adapter.type()))
else:
with adapter.connection_named('master'):
for conn_name in adapter.cancel_open_connections():
@@ -591,38 +590,8 @@ class GraphRunnableTask(ManifestTask):
results=results,
elapsed_time=elapsed_time,
generated_at=generated_at,
args=self.args_to_dict(),
args=dbt.utils.args_to_dict(self.args),
)
def args_to_dict(self):
var_args = vars(self.args).copy()
# update the args with the flags, which could also come from environment
# variables or user_config
flag_dict = flags.get_flag_dict()
var_args.update(flag_dict)
dict_args = {}
# remove args keys that clutter up the dictionary
for key in var_args:
if key == 'cls':
continue
if var_args[key] is None:
continue
# TODO: add more default_false_keys
default_false_keys = (
'debug', 'full_refresh', 'fail_fast', 'warn_error',
'single_threaded', 'log_cache_events',
'use_experimental_parser',
)
if key in default_false_keys and var_args[key] is False:
continue
if key == 'vars' and var_args[key] == '{}':
continue
# this was required for a test case
if (isinstance(var_args[key], PosixPath) or
isinstance(var_args[key], WindowsPath)):
var_args[key] = str(var_args[key])
dict_args[key] = var_args[key]
return dict_args
def task_end_messages(self, results):
print_run_end_messages(results)


@@ -11,7 +11,7 @@ from dbt.graph import ResourceTypeSelector
from dbt.logger import TextOnly
from dbt.events.functions import fire_event
from dbt.events.types import (
SeedHeader, SeedHeaderSeperator, EmptyLine, PrintSeedErrorResultLine,
SeedHeader, SeedHeaderSeparator, EmptyLine, PrintSeedErrorResultLine,
PrintSeedResultLine, PrintStartLine
)
from dbt.node_types import NodeType
@@ -28,7 +28,7 @@ class SeedRunner(ModelRunner):
description=self.describe_node(),
index=self.node_index,
total=self.num_nodes,
report_node_data=self.node
node_info=self.node.node_info
)
)
@@ -52,7 +52,7 @@ class SeedRunner(ModelRunner):
execution_time=result.execution_time,
schema=self.node.schema,
relation=model.alias,
report_node_data=model
node_info=model.node_info
)
)
else:
@@ -64,7 +64,7 @@ class SeedRunner(ModelRunner):
execution_time=result.execution_time,
schema=self.node.schema,
relation=model.alias,
report_node_data=model
node_info=model.node_info
)
)
@@ -109,7 +109,7 @@ class SeedTask(RunTask):
with TextOnly():
fire_event(EmptyLine())
fire_event(SeedHeader(header=header))
fire_event(SeedHeaderSeperator(len_header=len(header)))
fire_event(SeedHeaderSeparator(len_header=len(header)))
rand_table.print_table(max_rows=10, max_columns=None)
with TextOnly():


@@ -24,7 +24,7 @@ class SnapshotRunner(ModelRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=model
node_info=model.node_info
)
)
else:
@@ -36,7 +36,7 @@ class SnapshotRunner(ModelRunner):
index=self.node_index,
total=self.num_nodes,
execution_time=result.execution_time,
report_node_data=model
node_info=model.node_info
)
)


@@ -75,7 +75,7 @@ class TestRunner(CompileRunner):
index=self.node_index,
num_models=self.num_nodes,
execution_time=result.execution_time,
report_node_data=model
node_info=model.node_info
)
)
elif result.status == TestStatus.Pass:
@@ -85,7 +85,7 @@ class TestRunner(CompileRunner):
index=self.node_index,
num_models=self.num_nodes,
execution_time=result.execution_time,
report_node_data=model
node_info=model.node_info
)
)
elif result.status == TestStatus.Warn:
@@ -96,7 +96,7 @@ class TestRunner(CompileRunner):
num_models=self.num_nodes,
execution_time=result.execution_time,
failures=result.failures,
report_node_data=model
node_info=model.node_info
)
)
elif result.status == TestStatus.Fail:
@@ -107,7 +107,7 @@ class TestRunner(CompileRunner):
num_models=self.num_nodes,
execution_time=result.execution_time,
failures=result.failures,
report_node_data=model
node_info=model.node_info
)
)
else:
@@ -119,7 +119,7 @@ class TestRunner(CompileRunner):
description=self.describe_node(),
index=self.node_index,
total=self.num_nodes,
report_node_data=self.node
node_info=self.node.node_info
)
)


@@ -10,12 +10,15 @@ import jinja2
import json
import os
import requests
from tarfile import ReadError
import time
from pathlib import PosixPath, WindowsPath
from contextlib import contextmanager
from dbt.exceptions import ConnectionException
from dbt.events.functions import fire_event
from dbt.events.types import RetryExternalCall
from dbt import flags
from enum import Enum
from typing_extensions import Protocol
from typing import (
@@ -598,7 +601,9 @@ class MultiDict(Mapping[str, Any]):
def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
"""Attempts to run a function that makes an external call, if the call fails
on a connection error or timeout, it will be tried up to 5 more times.
on a connection error, timeout or decompression issue, it will be tried up to 5 more times.
See https://github.com/dbt-labs/dbt-core/issues/4579 for context on the decompression issues
specifically.
"""
try:
return fn()
@@ -606,6 +611,7 @@ def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
requests.exceptions.ConnectionError,
requests.exceptions.Timeout,
requests.exceptions.ContentDecodingError,
ReadError,
) as exc:
if attempt <= max_attempts - 1:
fire_event(RetryExternalCall(attempt=attempt, max=max_attempts))
@@ -613,3 +619,40 @@ def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
_connection_exception_retry(fn, max_attempts, attempt + 1)
else:
raise ConnectionException('External connection exception occurred: ' + str(exc))
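The retry shape can be exercised with a stdlib-only sketch. This is not the real helper: the exception tuple is narrowed to `OSError`, the `fire_event` call is omitted, and unlike the snippet above this version returns the recursive call's result directly:

```python
class ConnectionException(Exception):
    """Raised once retries are exhausted."""


def connection_retry(fn, max_attempts: int, attempt: int = 0):
    # retry fn on a narrow exception set, then re-raise as ConnectionException
    try:
        return fn()
    except OSError as exc:
        if attempt <= max_attempts - 1:
            return connection_retry(fn, max_attempts, attempt + 1)
        raise ConnectionException(
            "External connection exception occurred: " + str(exc))


calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("transient")
    return "ok"


result = connection_retry(flaky, 5)
```

A call that fails twice and then succeeds is retried transparently; only a call that keeps failing past `max_attempts` surfaces as `ConnectionException`.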
# This is used to serialize the args in the run_results and in the logs.
# We do this separately because there are a few fields that don't serialize,
# i.e. PosixPath, WindowsPath, and types. It also includes args from both
# cli args and flags, which is more complete than just the cli args.
# If new args are added that are false by default (particularly in the
# global options) they should be added to the 'default_false_keys' list.
def args_to_dict(args):
var_args = vars(args).copy()
# update the args with the flags, which could also come from environment
# variables or user_config
flag_dict = flags.get_flag_dict()
var_args.update(flag_dict)
dict_args = {}
# remove args keys that clutter up the dictionary
for key in var_args:
if key == 'cls':
continue
if var_args[key] is None:
continue
# TODO: add more default_false_keys
default_false_keys = (
'debug', 'full_refresh', 'fail_fast', 'warn_error',
'single_threaded', 'log_cache_events', 'store_failures',
'use_experimental_parser',
)
if key in default_false_keys and var_args[key] is False:
continue
if key == 'vars' and var_args[key] == '{}':
continue
# this was required for a test case
if (isinstance(var_args[key], PosixPath) or
isinstance(var_args[key], WindowsPath)):
var_args[key] = str(var_args[key])
dict_args[key] = var_args[key]
return dict_args
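A trimmed sketch of the same filtering rules, using `PurePosixPath` in place of `PosixPath`/`WindowsPath` so the expected output is platform-independent (the `default_false_keys` tuple here is a subset of the real one):

```python
from argparse import Namespace
from pathlib import PurePosixPath


def args_to_dict(args) -> dict:
    # drop empty values, drop flags still at their false default,
    # and stringify paths so the result is JSON-serializable
    default_false_keys = ("debug", "full_refresh", "fail_fast", "warn_error")
    out = {}
    for key, value in vars(args).items():
        if value is None:
            continue
        if key in default_false_keys and value is False:
            continue
        if key == "vars" and value == "{}":
            continue
        if isinstance(value, PurePosixPath):
            value = str(value)
        out[key] = value
    return out


ns = Namespace(debug=False, target="dev",
               profiles_dir=PurePosixPath("/tmp/profiles"), vars="{}")
```

Only the explicitly set, non-default values survive, which keeps the serialized args in `run_results.json` and the logs readable.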


@@ -96,5 +96,5 @@ def _get_dbt_plugins_info():
yield plugin_name, mod.version
__version__ = '1.0.0'
__version__ = '1.0.1'
installed = get_installed_version()


@@ -284,12 +284,12 @@ def parse_args(argv=None):
parser.add_argument('adapter')
parser.add_argument('--title-case', '-t', default=None)
parser.add_argument('--dependency', action='append')
parser.add_argument('--dbt-core-version', default='1.0.0')
parser.add_argument('--dbt-core-version', default='1.0.1')
parser.add_argument('--email')
parser.add_argument('--author')
parser.add_argument('--url')
parser.add_argument('--sql', action='store_true')
parser.add_argument('--package-version', default='1.0.0')
parser.add_argument('--package-version', default='1.0.1')
parser.add_argument('--project-version', default='1.0')
parser.add_argument(
'--no-dependency', action='store_false', dest='set_dependency'


@@ -2,9 +2,9 @@
import os
import sys
if sys.version_info < (3, 7):
if sys.version_info < (3, 7, 2):
print('Error: dbt does not support this version of Python.')
print('Please upgrade to Python 3.7 or higher.')
print('Please upgrade to Python 3.7.2 or higher.')
sys.exit(1)
@@ -25,7 +25,7 @@ with open(os.path.join(this_directory, 'README.md')) as f:
package_name = "dbt-core"
package_version = "1.0.0"
package_version = "1.0.1"
description = """With dbt, data analysts and engineers can build analytics \
the way engineers build applications."""
@@ -85,5 +85,5 @@ setup(
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
],
python_requires=">=3.7",
python_requires=">=3.7.2",
)


@@ -1,3 +1,8 @@
##
# This compose file is used for local development and adapter testing only.
# See `/docker` for a generic and production-ready docker file
##
version: "3.5"
services:
database:


@@ -1,34 +1,134 @@
ARG BASE_IMAGE=python:3.8-slim-bullseye
##
# Generic dockerfile for dbt image building.
# See README for operational details
##
FROM $BASE_IMAGE
ARG BASE_REQUIREMENTS_SRC_PATH
ARG WHEEL_REQUIREMENTS_SRC_PATH
ARG DIST_PATH
# Top level build args
ARG build_for=linux/amd64
##
# base image (abstract)
##
FROM --platform=$build_for python:3.9.9-slim-bullseye as base
# N.B. The refs are updated automagically every release via bumpversion
# N.B. dbt-postgres is currently found in the core codebase so a value of dbt-core@<some_version> is correct
ARG dbt_core_ref=dbt-core@v1.0.0
ARG dbt_postgres_ref=dbt-core@v1.0.0
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
ARG dbt_spark_ref=dbt-spark@v1.0.0
# special case args
ARG dbt_spark_version=all
ARG dbt_third_party
# System setup
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
git \
ssh-client \
software-properties-common \
make \
build-essential \
ca-certificates \
libpq-dev \
git \
ssh-client \
software-properties-common \
make \
build-essential \
ca-certificates \
libpq-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/*
RUN echo BASE_REQUIREMENTS_SRC_PATH=$BASE_REQUIREMENTS_SRC_PATH
RUN echo WHEEL_REQUIREMENTS_SRC_PATH=$WHEEL_REQUIREMENTS_SRC_PATH
RUN echo DIST_PATH=$DIST_PATH
COPY $BASE_REQUIREMENTS_SRC_PATH ./requirements.txt
COPY $WHEEL_REQUIREMENTS_SRC_PATH ./wheel_requirements.txt
COPY $DIST_PATH ./dist
RUN pip install --upgrade pip setuptools
RUN pip install --requirement ./requirements.txt
RUN pip install --requirement ./wheel_requirements.txt
# Env vars
ENV PYTHONIOENCODING=utf-8
ENV LANG C.UTF-8
WORKDIR /usr/app
ENV LANG=C.UTF-8
# Update python
RUN python -m pip install --upgrade pip setuptools wheel --no-cache-dir
# Set docker basics
WORKDIR /usr/app/dbt/
VOLUME /usr/app
ENTRYPOINT ["dbt"]
##
# dbt-core
##
FROM base as dbt-core
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_core_ref}#egg=dbt-core&subdirectory=core"
##
# dbt-postgres
##
FROM base as dbt-postgres
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"
##
# dbt-redshift
##
FROM base as dbt-redshift
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_redshift_ref}#egg=dbt-redshift"
##
# dbt-bigquery
##
FROM base as dbt-bigquery
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_bigquery_ref}#egg=dbt-bigquery"
##
# dbt-snowflake
##
FROM base as dbt-snowflake
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_snowflake_ref}#egg=dbt-snowflake"
##
# dbt-spark
##
FROM base as dbt-spark
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
python-dev \
libsasl2-dev \
gcc \
unixodbc-dev \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/*
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
##
# dbt-third-party
##
FROM dbt-core as dbt-third-party
RUN python -m pip install --no-cache-dir "${dbt_third_party}"
##
# dbt-all
##
FROM base as dbt-all
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
python-dev \
libsasl2-dev \
gcc \
unixodbc-dev \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/*
RUN python -m pip install --no-cache "git+https://github.com/dbt-labs/${dbt_redshift_ref}#egg=dbt-redshift"
RUN python -m pip install --no-cache "git+https://github.com/dbt-labs/${dbt_bigquery_ref}#egg=dbt-bigquery"
RUN python -m pip install --no-cache "git+https://github.com/dbt-labs/${dbt_snowflake_ref}#egg=dbt-snowflake"
RUN python -m pip install --no-cache "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
RUN python -m pip install --no-cache "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"

docker/README.md Normal file

@@ -0,0 +1,106 @@
# Docker for dbt
This Dockerfile is suitable for building dbt Docker images locally, or for use with CI/CD to automate populating a container registry.
## Building an image:
This Dockerfile can create images for the following targets, each named after the database they support:
* `dbt-core` _(no db-adapter support)_
* `dbt-postgres`
* `dbt-redshift`
* `dbt-bigquery`
* `dbt-snowflake`
* `dbt-spark`
* `dbt-third-party` _(requires additional build-arg)_
* `dbt-all` _(installs all of the above in a single image)_
To build a new image, run the following docker command:
```
docker build --tag <your_image_name> --target <target_name> <path/to/dockerfile>
```
By default, the images will be populated with the most recent release of `dbt-core` and whichever database adapter you select. If you need a different version, you can specify it by git ref using the `--build-arg` flag:
```
docker build --tag <your_image_name> \
--target <target_name> \
--build-arg <arg_name>=<git_ref> \
<path/to/dockerfile>
```
Valid arg names for versioning are:
* `dbt_core_ref`
* `dbt_postgres_ref`
* `dbt_redshift_ref`
* `dbt_bigquery_ref`
* `dbt_snowflake_ref`
* `dbt_spark_ref`
> Note: Only override a _single_ build arg for each build. Using multiple overrides may lead to a non-functioning image.
If you wish to build an image with a third-party adapter, you can use the `dbt-third-party` target. This target requires that you provide an adapter reference that `pip` can process, using the `dbt_third_party` build arg:
```
docker build --tag <your_image_name> \
--target dbt-third-party \
--build-arg dbt_third_party=<pip_parsable_install_string> \
<path/to/dockerfile>
```
### Examples:
To build an image named "my-dbt" that supports redshift using the latest releases:
```
cd dbt-core/docker
docker build --tag my-dbt --target dbt-redshift .
```
To build an image named "my-other-dbt" that supports bigquery using `dbt-core` version 0.21.latest and the bigquery adapter version 1.0.0b1:
```
cd dbt-core/docker
docker build \
--tag my-other-dbt \
--target dbt-bigquery \
--build-arg dbt_bigquery_ref=dbt-bigquery@v1.0.0b1 \
--build-arg dbt_core_ref=dbt-core@0.21.latest \
.
```
To build an image named "my-third-party-dbt" that uses the [Materialize third-party adapter](https://github.com/MaterializeInc/materialize/tree/main/misc/dbt-materialize) and the latest release of `dbt-core`:
```
cd dbt-core/docker
docker build --tag my-third-party-dbt \
--target dbt-third-party \
--build-arg dbt_third_party=dbt-materialize \
.
```
## Special cases
There are a few special cases worth noting:
* The `dbt-spark` database adapter comes in three different versions named `PyHive`, `ODBC`, and the default `all`. If you wish to override the default, you can use the `--build-arg` flag with the value of `dbt_spark_version=<version_name>`. See the [docs](https://docs.getdbt.com/reference/warehouse-profiles/spark-profile) for more information.
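For example, to build a Spark image pinned to the `PyHive` extra (the image name here is arbitrary):
```
docker build --tag my-spark-dbt \
  --target dbt-spark \
  --build-arg dbt_spark_version=PyHive \
  <path/to/dockerfile>
```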
* The `dbt-postgres` database adapter is released as part of the `dbt-core` codebase. If you wish to override the version used, make sure you use the git ref for `dbt-core`:
```
docker build --tag my_dbt \
--target dbt-postgres \
--build-arg dbt_postgres_ref=dbt-core@1.0.0b1 \
  <path/to/dockerfile>
```
* If you need to build against another architecture (linux/arm64 in this example) you can override the `build_for` build arg:
```
docker build --tag my_dbt \
--target dbt-postgres \
--build-arg build_for=linux/arm64 \
  <path/to/dockerfile>
```
Supported architectures can be found on the python image's [Docker Hub page](https://hub.docker.com/_/python).
## Running an image in a container:
The `ENTRYPOINT` for this Dockerfile is the command `dbt` so you can bind-mount your project to `/usr/app` and use dbt as normal:
```
docker run \
--network=host \
--mount type=bind,source=path/to/project,target=/usr/app \
--mount type=bind,source=path/to/profiles.yml,target=/root/.dbt/ \
my-dbt \
ls
```
> Notes:
> * Bind-mount sources _must_ be absolute paths
> * You may need to adjust the Docker networking settings depending on the specifics of your data warehouse/database host.


@@ -1,44 +0,0 @@
agate==1.6.3
attrs==21.2.0
Babel==2.9.1
certifi==2021.10.8
cffi==1.15.0
charset-normalizer==2.0.8
click==8.0.3
colorama==0.4.4
dbt-core==1.0.0
dbt-extractor==0.4.0
dbt-postgres==1.0.0
future==0.18.2
hologram==0.0.14
idna==3.3
importlib-metadata==4.8.2
isodate==0.6.0
Jinja2==2.11.3
jsonschema==3.1.1
leather==0.3.4
Logbook==1.5.3
MarkupSafe==2.0.1
mashumaro==2.9
minimal-snowplow-tracker==0.0.2
msgpack==1.0.3
networkx==2.6.3
packaging==21.3
parsedatetime==2.4
psycopg2-binary==2.9.2
pycparser==2.21
pyparsing==3.0.6
pyrsistent==0.18.0
python-dateutil==2.8.2
python-slugify==5.0.2
pytimeparse==1.1.8
pytz==2021.3
PyYAML==6.0
requests==2.26.0
six==1.16.0
sqlparse==0.4.2
text-unidecode==1.3
typing-extensions==3.10.0.2
urllib3==1.26.7
Werkzeug==2.0.2
zipp==3.6.0

docker/test.sh Executable file

@@ -0,0 +1,136 @@
# - VERY rudimentary test script to run latest + specific branch image builds and test them all by running `--version`
# TODO: create a real test suite
clear \
&& echo "\n\n"\
"###################################\n"\
"##### Testing dbt-core latest #####\n"\
"###################################\n"\
&& docker build --tag dbt-core \
--target dbt-core \
docker \
&& docker run dbt-core --version \
\
&& echo "\n\n"\
"####################################\n"\
"##### Testing dbt-core-1.0.0b1 #####\n"\
"####################################\n"\
&& docker build --tag dbt-core-1.0.0b1 \
--target dbt-core \
--build-arg dbt_core_ref=dbt-core@v1.0.0b1 \
docker \
&& docker run dbt-core-1.0.0b1 --version \
\
&& echo "\n\n"\
"#######################################\n"\
"##### Testing dbt-postgres latest #####\n"\
"#######################################\n"\
&& docker build --tag dbt-postgres \
--target dbt-postgres \
docker \
&& docker run dbt-postgres --version \
\
&& echo "\n\n"\
"########################################\n"\
"##### Testing dbt-postgres-1.0.0b1 #####\n"\
"########################################\n"\
&& docker build --tag dbt-postgres-1.0.0b1 \
--target dbt-postgres \
--build-arg dbt_postgres_ref=dbt-core@v1.0.0b1 \
docker \
&& docker run dbt-postgres-1.0.0b1 --version \
\
&& echo "\n\n"\
"#######################################\n"\
"##### Testing dbt-redshift latest #####\n"\
"#######################################\n"\
&& docker build --tag dbt-redshift \
--target dbt-redshift \
docker \
&& docker run dbt-redshift --version \
\
&& echo "\n\n"\
"########################################\n"\
"##### Testing dbt-redshift-1.0.0b1 #####\n"\
"########################################\n"\
&& docker build --tag dbt-redshift-1.0.0b1 \
--target dbt-redshift \
--build-arg dbt_redshift_ref=dbt-redshift@v1.0.0b1 \
docker \
&& docker run dbt-redshift-1.0.0b1 --version \
\
&& echo "\n\n"\
"#######################################\n"\
"##### Testing dbt-bigquery latest #####\n"\
"#######################################\n"\
&& docker build --tag dbt-bigquery \
--target dbt-bigquery \
docker \
&& docker run dbt-bigquery --version \
\
&& echo "\n\n"\
"########################################\n"\
"##### Testing dbt-bigquery-1.0.0b1 #####\n"\
"########################################\n"\
&& docker build --tag dbt-bigquery-1.0.0b1 \
--target dbt-bigquery \
--build-arg dbt_bigquery_ref=dbt-bigquery@v1.0.0b1 \
docker \
&& docker run dbt-bigquery-1.0.0b1 --version \
\
&& echo "\n\n"\
"########################################\n"\
"##### Testing dbt-snowflake latest #####\n"\
"########################################\n"\
&& docker build --tag dbt-snowflake \
--target dbt-snowflake \
docker \
&& docker run dbt-snowflake --version \
\
&& echo "\n\n"\
"#########################################\n"\
"##### Testing dbt-snowflake-1.0.0b1 #####\n"\
"#########################################\n"\
&& docker build --tag dbt-snowflake-1.0.0b1 \
--target dbt-snowflake \
--build-arg dbt_snowflake_ref=dbt-snowflake@v1.0.0b1 \
docker \
&& docker run dbt-snowflake-1.0.0b1 --version \
\
&& echo "\n\n"\
"####################################\n"\
"##### Testing dbt-spark latest #####\n"\
"####################################\n"\
&& docker build --tag dbt-spark \
--target dbt-spark \
docker \
&& docker run dbt-spark --version \
\
&& echo "\n\n"\
"#####################################\n"\
"##### Testing dbt-spark-1.0.0rc2 ####\n"\
"#####################################\n"\
&& docker build --tag dbt-spark-1.0.0rc2 \
--target dbt-spark \
--build-arg dbt_spark_ref=dbt-spark@v1.0.0rc2 \
docker \
&& docker run dbt-spark-1.0.0rc2 --version \
\
&& echo "\n\n"\
"###########################\n"\
"##### Testing dbt-all #####\n"\
"###########################\n"\
&& docker build --tag dbt-all \
--target dbt-all \
docker \
&& docker run dbt-all --version \
\
&& echo "\n\n"\
"##########################################\n"\
"##### Testing third party db adapter #####\n"\
"##########################################\n"\
&& docker build --tag dbt-materialize \
--target dbt-third-party \
--build-arg dbt_third_party="dbt-materialize" \
docker \
&& docker run dbt-materialize --version


@@ -0,0 +1,34 @@
# Structured Logging Arch
## Context
Consumers of dbt have been relying on log parsing since well before this change. However, our logs were never optimized for programmatic consumption, nor were they treated as a formal interface between dbt and users. dbt's logging strategy was changed explicitly to address these two realities.
### Options
#### How to structure the data
- Using a library like structlog to represent log data with structural types like dictionaries. This would allow us to easily add data to a log event's context at each call site and have structlog do all of the string formatting and I/O work.
- Creating our own nominal type layer that describes each event in the source. This allows event fields to be enforced statically via mypy across all call sites.
#### How to output the data
- Using structlog to output log lines regardless of whether we used it to represent the data. The defaults for structlog are good, and it handles json vs. text output and formatting for us.
- Using the std lib logger to log our messages more manually. Easy to use, but does far less for us.
## Decision
#### How to structure the data
We decided to go with a custom nominal type layer even though this was going to be more work. This type layer centralizes our assumptions about what data each log event contains, and allows us to use mypy to enforce these centralized assumptions across the codebase. This is all for the purpose of treating logs as a formal interface between dbt and users. Here are two concrete, practical examples of how this pattern is used:
1. On the abstract superclass of all events, there are abstract methods and fields that each concrete class must implement such as `level_tag()` and `code`. If you make a new concrete event type without those, mypy will fail and tell you that you need them, preventing lost log lines, and json log events without a computer-friendly code.
2. On each concrete event, the fields we need to construct the message are explicitly in the source of the class. At every call site if you construct an event without all the necessary data, mypy will fail and tell you which fields you are missing.
Using mypy to enforce these assumptions is a step better than testing because we do not need to write tests that exercise every branch dbt could take. Because it is checked statically on every file, mypy gives us these guarantees as long as it is configured to run everywhere.
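The pattern described above can be sketched in miniature. This is a simplified illustration, not dbt's actual class definitions; the event name and `code` value are modeled on dbt's events but should be treated as assumptions:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Event(ABC):
    """Abstract superclass: every concrete event must declare a stable,
    machine-friendly code and implement level_tag() and message(),
    or mypy (and abc at runtime) will reject it."""

    code: str

    @abstractmethod
    def level_tag(self) -> str:
        ...

    @abstractmethod
    def message(self) -> str:
        ...


@dataclass
class MainReportVersion(Event):
    # mypy flags any call site that constructs this event without `version`
    version: str
    code: str = "A001"

    def level_tag(self) -> str:
        return "info"

    def message(self) -> str:
        return f"Running with dbt={self.version}"


event = MainReportVersion(version="1.0.1")
print(event.code, event.level_tag(), event.message())
# prints: A001 info Running with dbt=1.0.1
```

Leaving `level_tag` unimplemented on a new concrete event, or omitting `version` at a call site, is a static type error rather than a silently lost or malformed log line.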
#### How to output the data
We decided to use the std lib logger because it was far more difficult than we expected to get structlog to work properly. Documentation was lacking, and reading the source code wasn't a quick way to learn. The std lib logger was used mostly out of necessity, and because many of the conveniences you get from a logging library we had already chosen to implement explicitly as functions in our nominal typing layer. Swapping out the std lib logger in the future should be an easy task should we choose to do it.
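A rough sketch of this division of labor, under the assumption of a simplified event type (the event name, code, and `fire_event` helper here are illustrative): message construction and json rendering live in our own layer, and the std lib logger only performs output.

```python
import json
import logging
from dataclasses import asdict, dataclass

logging.basicConfig(format="%(message)s", level=logging.INFO)


@dataclass
class ConnectionUsed:
    conn_type: str
    conn_name: str
    code: str = "E006"

    def message(self) -> str:
        return f"Using {self.conn_type} connection '{self.conn_name}'"


def fire_event(event: ConnectionUsed, json_format: bool = False) -> str:
    # json vs. text rendering is decided in our own code, not by the
    # logging library; the std lib logger just writes the line out.
    if json_format:
        line = json.dumps({**asdict(event), "msg": event.message()})
    else:
        line = event.message()
    logging.info(line)
    return line


fire_event(ConnectionUsed("postgres", "model.my_model"), json_format=True)
```

Because the logger only ever sees a fully rendered string, replacing it with another output mechanism later would not disturb the event types themselves.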
## Status
Completed
## Consequences
Adding a new log event is more cumbersome than it was previously: instead of writing the message at the log call site, you must create a new concrete class in the event types module. This is more opaque for new contributors. The json serialization approach we are using via `asdict` is fragile and unoptimized, and should be replaced.
All user-facing log messages now live in one file which makes the job of conforming them much simpler. Because they are all nominally typed separately, it opens up the possibility to have log documentation generated from the type hints as well as outputting our logs in multiple human languages if we want to translate our messages.


@@ -1 +1 @@
version = '1.0.0'
version = '1.0.1'


@@ -41,7 +41,7 @@ def _dbt_psycopg2_name():
package_name = "dbt-postgres"
package_version = "1.0.0"
package_version = "1.0.1"
description = """The postgres adapter plugin for dbt (data build tool)"""
this_directory = os.path.abspath(os.path.dirname(__file__))

pytest.ini Normal file

@@ -0,0 +1,3 @@
[pytest]
filterwarnings =
ignore:.*'soft_unicode' has been renamed to 'soft_str'*:DeprecationWarning


@@ -23,7 +23,4 @@ do
cp -r "$DBT_PATH"/"$SUBPATH"/dist/* "$DBT_PATH"/dist/
done
cd "$DBT_PATH"
$PYTHON_BIN setup.py sdist
set +x


@@ -1,84 +0,0 @@
#!/usr/bin/env python
import os
import sys
if 'sdist' not in sys.argv:
print('')
print('As of v1.0.0, `pip install dbt` is no longer supported.')
print('Instead, please use one of the following.')
print('')
print('**To use dbt with your specific database, platform, or query engine:**')
print('')
print(' pip install dbt-<adapter>')
print('')
print(' See full list: https://docs.getdbt.com/docs/available-adapters')
print('')
print('**For developers of integrations with dbt Core:**')
print('')
print(' pip install dbt-core')
print('')
print(' Be advised, dbt Core\'s python API is not yet stable or documented')
print(' (https://docs.getdbt.com/docs/running-a-dbt-project/dbt-api)')
print('')
print('**For the previous behavior of `pip install dbt`:**')
print('')
print(' pip install dbt-core dbt-postgres dbt-redshift dbt-snowflake dbt-bigquery')
print('')
sys.exit(1)
if sys.version_info < (3, 7):
print('Error: dbt does not support this version of Python.')
print('Please upgrade to Python 3.7 or higher.')
sys.exit(1)
from setuptools import setup
try:
from setuptools import find_namespace_packages
except ImportError:
# the user has a downlevel version of setuptools.
print('Error: dbt requires setuptools v40.1.0 or higher.')
print('Please upgrade setuptools with "pip install --upgrade setuptools" '
'and try again')
sys.exit(1)
this_directory = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(this_directory, 'README.md')) as f:
long_description = f.read()
package_name = "dbt"
package_version = "1.0.0"
description = """With dbt, data analysts and engineers can build analytics \
the way engineers build applications."""
setup(
name=package_name,
version=package_version,
description=description,
long_description=long_description,
long_description_content_type='text/markdown',
author="dbt Labs",
author_email="info@dbtlabs.com",
url="https://github.com/dbt-labs/dbt-core",
zip_safe=False,
classifiers=[
'Development Status :: 7 - Inactive',
'License :: OSI Approved :: Apache Software License',
'Operating System :: Microsoft :: Windows',
'Operating System :: MacOS :: MacOS X',
'Operating System :: POSIX :: Linux',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
],
python_requires=">=3.7",
)


@@ -1,6 +0,0 @@
{{ config(materialized='incremental', unique_key='id') }}
select 1 as id
union all
select 1 as id


@@ -1,101 +0,0 @@
ID,FIRST_NAME,LAST_NAME,EMAIL,GENDER,IP_ADDRESS
1,Jack,Hunter,jhunter0@pbs.org,Male,59.80.20.168
2,Kathryn,Walker,kwalker1@ezinearticles.com,Female,194.121.179.35
3,Gerald,Ryan,gryan2@com.com,Male,11.3.212.243
4,Bonnie,Spencer,bspencer3@ameblo.jp,Female,216.32.196.175
5,Harold,Taylor,htaylor4@people.com.cn,Male,253.10.246.136
6,Jacqueline,Griffin,jgriffin5@t.co,Female,16.13.192.220
7,Wanda,Arnold,warnold6@google.nl,Female,232.116.150.64
8,Craig,Ortiz,cortiz7@sciencedaily.com,Male,199.126.106.13
9,Gary,Day,gday8@nih.gov,Male,35.81.68.186
10,Rose,Wright,rwright9@yahoo.co.jp,Female,236.82.178.100
11,Raymond,Kelley,rkelleya@fc2.com,Male,213.65.166.67
12,Gerald,Robinson,grobinsonb@disqus.com,Male,72.232.194.193
13,Mildred,Martinez,mmartinezc@samsung.com,Female,198.29.112.5
14,Dennis,Arnold,darnoldd@google.com,Male,86.96.3.250
15,Judy,Gray,jgraye@opensource.org,Female,79.218.162.245
16,Theresa,Garza,tgarzaf@epa.gov,Female,21.59.100.54
17,Gerald,Robertson,grobertsong@csmonitor.com,Male,131.134.82.96
18,Philip,Hernandez,phernandezh@adobe.com,Male,254.196.137.72
19,Julia,Gonzalez,jgonzalezi@cam.ac.uk,Female,84.240.227.174
20,Andrew,Davis,adavisj@patch.com,Male,9.255.67.25
21,Kimberly,Harper,kharperk@foxnews.com,Female,198.208.120.253
22,Mark,Martin,mmartinl@marketwatch.com,Male,233.138.182.153
23,Cynthia,Ruiz,cruizm@google.fr,Female,18.178.187.201
24,Samuel,Carroll,scarrolln@youtu.be,Male,128.113.96.122
25,Jennifer,Larson,jlarsono@vinaora.com,Female,98.234.85.95
26,Ashley,Perry,aperryp@rakuten.co.jp,Female,247.173.114.52
27,Howard,Rodriguez,hrodriguezq@shutterfly.com,Male,231.188.95.26
28,Amy,Brooks,abrooksr@theatlantic.com,Female,141.199.174.118
29,Louise,Warren,lwarrens@adobe.com,Female,96.105.158.28
30,Tina,Watson,twatsont@myspace.com,Female,251.142.118.177
31,Janice,Kelley,jkelleyu@creativecommons.org,Female,239.167.34.233
32,Terry,Mccoy,tmccoyv@bravesites.com,Male,117.201.183.203
33,Jeffrey,Morgan,jmorganw@surveymonkey.com,Male,78.101.78.149
34,Louis,Harvey,lharveyx@sina.com.cn,Male,51.50.0.167
35,Philip,Miller,pmillery@samsung.com,Male,103.255.222.110
36,Willie,Marshall,wmarshallz@ow.ly,Male,149.219.91.68
37,Patrick,Lopez,plopez10@redcross.org,Male,250.136.229.89
38,Adam,Jenkins,ajenkins11@harvard.edu,Male,7.36.112.81
39,Benjamin,Cruz,bcruz12@linkedin.com,Male,32.38.98.15
40,Ruby,Hawkins,rhawkins13@gmpg.org,Female,135.171.129.255
41,Carlos,Barnes,cbarnes14@a8.net,Male,240.197.85.140
42,Ruby,Griffin,rgriffin15@bravesites.com,Female,19.29.135.24
43,Sean,Mason,smason16@icq.com,Male,159.219.155.249
44,Anthony,Payne,apayne17@utexas.edu,Male,235.168.199.218
45,Steve,Cruz,scruz18@pcworld.com,Male,238.201.81.198
46,Anthony,Garcia,agarcia19@flavors.me,Male,25.85.10.18
47,Doris,Lopez,dlopez1a@sphinn.com,Female,245.218.51.238
48,Susan,Nichols,snichols1b@freewebs.com,Female,199.99.9.61
49,Wanda,Ferguson,wferguson1c@yahoo.co.jp,Female,236.241.135.21
50,Andrea,Pierce,apierce1d@google.co.uk,Female,132.40.10.209
51,Lawrence,Phillips,lphillips1e@jugem.jp,Male,72.226.82.87
52,Judy,Gilbert,jgilbert1f@multiply.com,Female,196.250.15.142
53,Eric,Williams,ewilliams1g@joomla.org,Male,222.202.73.126
54,Ralph,Romero,rromero1h@sogou.com,Male,123.184.125.212
55,Jean,Wilson,jwilson1i@ocn.ne.jp,Female,176.106.32.194
56,Lori,Reynolds,lreynolds1j@illinois.edu,Female,114.181.203.22
57,Donald,Moreno,dmoreno1k@bbc.co.uk,Male,233.249.97.60
58,Steven,Berry,sberry1l@eepurl.com,Male,186.193.50.50
59,Theresa,Shaw,tshaw1m@people.com.cn,Female,120.37.71.222
60,John,Stephens,jstephens1n@nationalgeographic.com,Male,191.87.127.115
61,Richard,Jacobs,rjacobs1o@state.tx.us,Male,66.210.83.155
62,Andrew,Lawson,alawson1p@over-blog.com,Male,54.98.36.94
63,Peter,Morgan,pmorgan1q@rambler.ru,Male,14.77.29.106
64,Nicole,Garrett,ngarrett1r@zimbio.com,Female,21.127.74.68
65,Joshua,Kim,jkim1s@edublogs.org,Male,57.255.207.41
66,Ralph,Roberts,rroberts1t@people.com.cn,Male,222.143.131.109
67,George,Montgomery,gmontgomery1u@smugmug.com,Male,76.75.111.77
68,Gerald,Alvarez,galvarez1v@flavors.me,Male,58.157.186.194
69,Donald,Olson,dolson1w@whitehouse.gov,Male,69.65.74.135
70,Carlos,Morgan,cmorgan1x@pbs.org,Male,96.20.140.87
71,Aaron,Stanley,astanley1y@webnode.com,Male,163.119.217.44
72,Virginia,Long,vlong1z@spiegel.de,Female,204.150.194.182
73,Robert,Berry,rberry20@tripadvisor.com,Male,104.19.48.241
74,Antonio,Brooks,abrooks21@unesco.org,Male,210.31.7.24
75,Ruby,Garcia,rgarcia22@ovh.net,Female,233.218.162.214
76,Jack,Hanson,jhanson23@blogtalkradio.com,Male,31.55.46.199
77,Kathryn,Nelson,knelson24@walmart.com,Female,14.189.146.41
78,Jason,Reed,jreed25@printfriendly.com,Male,141.189.89.255
79,George,Coleman,gcoleman26@people.com.cn,Male,81.189.221.144
80,Rose,King,rking27@ucoz.com,Female,212.123.168.231
81,Johnny,Holmes,jholmes28@boston.com,Male,177.3.93.188
82,Katherine,Gilbert,kgilbert29@altervista.org,Female,199.215.169.61
83,Joshua,Thomas,jthomas2a@ustream.tv,Male,0.8.205.30
84,Julie,Perry,jperry2b@opensource.org,Female,60.116.114.192
85,Richard,Perry,rperry2c@oracle.com,Male,181.125.70.232
86,Kenneth,Ruiz,kruiz2d@wikimedia.org,Male,189.105.137.109
87,Jose,Morgan,jmorgan2e@webnode.com,Male,101.134.215.156
88,Donald,Campbell,dcampbell2f@goo.ne.jp,Male,102.120.215.84
89,Debra,Collins,dcollins2g@uol.com.br,Female,90.13.153.235
90,Jesse,Johnson,jjohnson2h@stumbleupon.com,Male,225.178.125.53
91,Elizabeth,Stone,estone2i@histats.com,Female,123.184.126.221
92,Angela,Rogers,arogers2j@goodreads.com,Female,98.104.132.187
93,Emily,Dixon,edixon2k@mlb.com,Female,39.190.75.57
94,Albert,Scott,ascott2l@tinypic.com,Male,40.209.13.189
95,Barbara,Peterson,bpeterson2m@ow.ly,Female,75.249.136.180
96,Adam,Greene,agreene2n@fastcompany.com,Male,184.173.109.144
97,Earl,Sanders,esanders2o@hc360.com,Male,247.34.90.117
98,Angela,Brooks,abrooks2p@mtv.com,Female,10.63.249.126
99,Harold,Foster,hfoster2q@privacy.gov.au,Male,139.214.40.244
100,Carl,Meyer,cmeyer2r@disqus.com,Male,204.117.7.88


@@ -1,201 +0,0 @@
ID,FIRST_NAME,LAST_NAME,EMAIL,GENDER,IP_ADDRESS
1,Jack,Hunter,jhunter0@pbs.org,Male,59.80.20.168
2,Kathryn,Walker,kwalker1@ezinearticles.com,Female,194.121.179.35
3,Gerald,Ryan,gryan2@com.com,Male,11.3.212.243
4,Bonnie,Spencer,bspencer3@ameblo.jp,Female,216.32.196.175
5,Harold,Taylor,htaylor4@people.com.cn,Male,253.10.246.136
6,Jacqueline,Griffin,jgriffin5@t.co,Female,16.13.192.220
7,Wanda,Arnold,warnold6@google.nl,Female,232.116.150.64
8,Craig,Ortiz,cortiz7@sciencedaily.com,Male,199.126.106.13
9,Gary,Day,gday8@nih.gov,Male,35.81.68.186
10,Rose,Wright,rwright9@yahoo.co.jp,Female,236.82.178.100
11,Raymond,Kelley,rkelleya@fc2.com,Male,213.65.166.67
12,Gerald,Robinson,grobinsonb@disqus.com,Male,72.232.194.193
13,Mildred,Martinez,mmartinezc@samsung.com,Female,198.29.112.5
14,Dennis,Arnold,darnoldd@google.com,Male,86.96.3.250
15,Judy,Gray,jgraye@opensource.org,Female,79.218.162.245
16,Theresa,Garza,tgarzaf@epa.gov,Female,21.59.100.54
17,Gerald,Robertson,grobertsong@csmonitor.com,Male,131.134.82.96
18,Philip,Hernandez,phernandezh@adobe.com,Male,254.196.137.72
19,Julia,Gonzalez,jgonzalezi@cam.ac.uk,Female,84.240.227.174
20,Andrew,Davis,adavisj@patch.com,Male,9.255.67.25
21,Kimberly,Harper,kharperk@foxnews.com,Female,198.208.120.253
22,Mark,Martin,mmartinl@marketwatch.com,Male,233.138.182.153
23,Cynthia,Ruiz,cruizm@google.fr,Female,18.178.187.201
24,Samuel,Carroll,scarrolln@youtu.be,Male,128.113.96.122
25,Jennifer,Larson,jlarsono@vinaora.com,Female,98.234.85.95
26,Ashley,Perry,aperryp@rakuten.co.jp,Female,247.173.114.52
27,Howard,Rodriguez,hrodriguezq@shutterfly.com,Male,231.188.95.26
28,Amy,Brooks,abrooksr@theatlantic.com,Female,141.199.174.118
29,Louise,Warren,lwarrens@adobe.com,Female,96.105.158.28
30,Tina,Watson,twatsont@myspace.com,Female,251.142.118.177
31,Janice,Kelley,jkelleyu@creativecommons.org,Female,239.167.34.233
32,Terry,Mccoy,tmccoyv@bravesites.com,Male,117.201.183.203
33,Jeffrey,Morgan,jmorganw@surveymonkey.com,Male,78.101.78.149
34,Louis,Harvey,lharveyx@sina.com.cn,Male,51.50.0.167
35,Philip,Miller,pmillery@samsung.com,Male,103.255.222.110
36,Willie,Marshall,wmarshallz@ow.ly,Male,149.219.91.68
37,Patrick,Lopez,plopez10@redcross.org,Male,250.136.229.89
38,Adam,Jenkins,ajenkins11@harvard.edu,Male,7.36.112.81
39,Benjamin,Cruz,bcruz12@linkedin.com,Male,32.38.98.15
40,Ruby,Hawkins,rhawkins13@gmpg.org,Female,135.171.129.255
41,Carlos,Barnes,cbarnes14@a8.net,Male,240.197.85.140
42,Ruby,Griffin,rgriffin15@bravesites.com,Female,19.29.135.24
43,Sean,Mason,smason16@icq.com,Male,159.219.155.249
44,Anthony,Payne,apayne17@utexas.edu,Male,235.168.199.218
45,Steve,Cruz,scruz18@pcworld.com,Male,238.201.81.198
46,Anthony,Garcia,agarcia19@flavors.me,Male,25.85.10.18
47,Doris,Lopez,dlopez1a@sphinn.com,Female,245.218.51.238
48,Susan,Nichols,snichols1b@freewebs.com,Female,199.99.9.61
49,Wanda,Ferguson,wferguson1c@yahoo.co.jp,Female,236.241.135.21
50,Andrea,Pierce,apierce1d@google.co.uk,Female,132.40.10.209
51,Lawrence,Phillips,lphillips1e@jugem.jp,Male,72.226.82.87
52,Judy,Gilbert,jgilbert1f@multiply.com,Female,196.250.15.142
53,Eric,Williams,ewilliams1g@joomla.org,Male,222.202.73.126
54,Ralph,Romero,rromero1h@sogou.com,Male,123.184.125.212
55,Jean,Wilson,jwilson1i@ocn.ne.jp,Female,176.106.32.194
56,Lori,Reynolds,lreynolds1j@illinois.edu,Female,114.181.203.22
57,Donald,Moreno,dmoreno1k@bbc.co.uk,Male,233.249.97.60
58,Steven,Berry,sberry1l@eepurl.com,Male,186.193.50.50
59,Theresa,Shaw,tshaw1m@people.com.cn,Female,120.37.71.222
60,John,Stephens,jstephens1n@nationalgeographic.com,Male,191.87.127.115
61,Richard,Jacobs,rjacobs1o@state.tx.us,Male,66.210.83.155
62,Andrew,Lawson,alawson1p@over-blog.com,Male,54.98.36.94
63,Peter,Morgan,pmorgan1q@rambler.ru,Male,14.77.29.106
64,Nicole,Garrett,ngarrett1r@zimbio.com,Female,21.127.74.68
65,Joshua,Kim,jkim1s@edublogs.org,Male,57.255.207.41
66,Ralph,Roberts,rroberts1t@people.com.cn,Male,222.143.131.109
67,George,Montgomery,gmontgomery1u@smugmug.com,Male,76.75.111.77
68,Gerald,Alvarez,galvarez1v@flavors.me,Male,58.157.186.194
69,Donald,Olson,dolson1w@whitehouse.gov,Male,69.65.74.135
70,Carlos,Morgan,cmorgan1x@pbs.org,Male,96.20.140.87
71,Aaron,Stanley,astanley1y@webnode.com,Male,163.119.217.44
72,Virginia,Long,vlong1z@spiegel.de,Female,204.150.194.182
73,Robert,Berry,rberry20@tripadvisor.com,Male,104.19.48.241
74,Antonio,Brooks,abrooks21@unesco.org,Male,210.31.7.24
75,Ruby,Garcia,rgarcia22@ovh.net,Female,233.218.162.214
76,Jack,Hanson,jhanson23@blogtalkradio.com,Male,31.55.46.199
77,Kathryn,Nelson,knelson24@walmart.com,Female,14.189.146.41
78,Jason,Reed,jreed25@printfriendly.com,Male,141.189.89.255
79,George,Coleman,gcoleman26@people.com.cn,Male,81.189.221.144
80,Rose,King,rking27@ucoz.com,Female,212.123.168.231
81,Johnny,Holmes,jholmes28@boston.com,Male,177.3.93.188
82,Katherine,Gilbert,kgilbert29@altervista.org,Female,199.215.169.61
83,Joshua,Thomas,jthomas2a@ustream.tv,Male,0.8.205.30
84,Julie,Perry,jperry2b@opensource.org,Female,60.116.114.192
85,Richard,Perry,rperry2c@oracle.com,Male,181.125.70.232
86,Kenneth,Ruiz,kruiz2d@wikimedia.org,Male,189.105.137.109
87,Jose,Morgan,jmorgan2e@webnode.com,Male,101.134.215.156
88,Donald,Campbell,dcampbell2f@goo.ne.jp,Male,102.120.215.84
89,Debra,Collins,dcollins2g@uol.com.br,Female,90.13.153.235
90,Jesse,Johnson,jjohnson2h@stumbleupon.com,Male,225.178.125.53
91,Elizabeth,Stone,estone2i@histats.com,Female,123.184.126.221
92,Angela,Rogers,arogers2j@goodreads.com,Female,98.104.132.187
93,Emily,Dixon,edixon2k@mlb.com,Female,39.190.75.57
94,Albert,Scott,ascott2l@tinypic.com,Male,40.209.13.189
95,Barbara,Peterson,bpeterson2m@ow.ly,Female,75.249.136.180
96,Adam,Greene,agreene2n@fastcompany.com,Male,184.173.109.144
97,Earl,Sanders,esanders2o@hc360.com,Male,247.34.90.117
98,Angela,Brooks,abrooks2p@mtv.com,Female,10.63.249.126
99,Harold,Foster,hfoster2q@privacy.gov.au,Male,139.214.40.244
100,Carl,Meyer,cmeyer2r@disqus.com,Male,204.117.7.88
101,Michael,Perez,mperez0@chronoengine.com,Male,106.239.70.175
102,Shawn,Mccoy,smccoy1@reddit.com,Male,24.165.76.182
103,Kathleen,Payne,kpayne2@cargocollective.com,Female,113.207.168.106
104,Jimmy,Cooper,jcooper3@cargocollective.com,Male,198.24.63.114
105,Katherine,Rice,krice4@typepad.com,Female,36.97.186.238
106,Sarah,Ryan,sryan5@gnu.org,Female,119.117.152.40
107,Martin,Mcdonald,mmcdonald6@opera.com,Male,8.76.38.115
108,Frank,Robinson,frobinson7@wunderground.com,Male,186.14.64.194
109,Jennifer,Franklin,jfranklin8@mail.ru,Female,91.216.3.131
110,Henry,Welch,hwelch9@list-manage.com,Male,176.35.182.168
111,Fred,Snyder,fsnydera@reddit.com,Male,217.106.196.54
112,Amy,Dunn,adunnb@nba.com,Female,95.39.163.195
113,Kathleen,Meyer,kmeyerc@cdc.gov,Female,164.142.188.214
114,Steve,Ferguson,sfergusond@reverbnation.com,Male,138.22.204.251
115,Teresa,Hill,thille@dion.ne.jp,Female,82.84.228.235
116,Amanda,Harper,aharperf@mail.ru,Female,16.123.56.176
117,Kimberly,Ray,krayg@xing.com,Female,48.66.48.12
118,Johnny,Knight,jknighth@jalbum.net,Male,99.30.138.123
119,Virginia,Freeman,vfreemani@tiny.cc,Female,225.172.182.63
120,Anna,Austin,aaustinj@diigo.com,Female,62.111.227.148
121,Willie,Hill,whillk@mail.ru,Male,0.86.232.249
122,Sean,Harris,sharrisl@zdnet.com,Male,117.165.133.249
123,Mildred,Adams,madamsm@usatoday.com,Female,163.44.97.46
124,David,Graham,dgrahamn@zimbio.com,Male,78.13.246.202
125,Victor,Hunter,vhuntero@ehow.com,Male,64.156.179.139
126,Aaron,Ruiz,aruizp@weebly.com,Male,34.194.68.78
127,Benjamin,Brooks,bbrooksq@jalbum.net,Male,20.192.189.107
128,Lisa,Wilson,lwilsonr@japanpost.jp,Female,199.152.130.217
129,Benjamin,King,bkings@comsenz.com,Male,29.189.189.213
130,Christina,Williamson,cwilliamsont@boston.com,Female,194.101.52.60
131,Jane,Gonzalez,jgonzalezu@networksolutions.com,Female,109.119.12.87
132,Thomas,Owens,towensv@psu.edu,Male,84.168.213.153
133,Katherine,Moore,kmoorew@naver.com,Female,183.150.65.24
134,Jennifer,Stewart,jstewartx@yahoo.com,Female,38.41.244.58
135,Sara,Tucker,stuckery@topsy.com,Female,181.130.59.184
136,Harold,Ortiz,hortizz@vkontakte.ru,Male,198.231.63.137
137,Shirley,James,sjames10@yelp.com,Female,83.27.160.104
138,Dennis,Johnson,djohnson11@slate.com,Male,183.178.246.101
139,Louise,Weaver,lweaver12@china.com.cn,Female,1.14.110.18
140,Maria,Armstrong,marmstrong13@prweb.com,Female,181.142.1.249
141,Gloria,Cruz,gcruz14@odnoklassniki.ru,Female,178.232.140.243
142,Diana,Spencer,dspencer15@ifeng.com,Female,125.153.138.244
143,Kelly,Nguyen,knguyen16@altervista.org,Female,170.13.201.119
144,Jane,Rodriguez,jrodriguez17@biblegateway.com,Female,12.102.249.81
145,Scott,Brown,sbrown18@geocities.jp,Male,108.174.99.192
146,Norma,Cruz,ncruz19@si.edu,Female,201.112.156.197
147,Marie,Peters,mpeters1a@mlb.com,Female,231.121.197.144
148,Lillian,Carr,lcarr1b@typepad.com,Female,206.179.164.163
149,Judy,Nichols,jnichols1c@t-online.de,Female,158.190.209.194
150,Billy,Long,blong1d@yahoo.com,Male,175.20.23.160
151,Howard,Reid,hreid1e@exblog.jp,Male,118.99.196.20
152,Laura,Ferguson,lferguson1f@tuttocitta.it,Female,22.77.87.110
153,Anne,Bailey,abailey1g@geocities.com,Female,58.144.159.245
154,Rose,Morgan,rmorgan1h@ehow.com,Female,118.127.97.4
155,Nicholas,Reyes,nreyes1i@google.ru,Male,50.135.10.252
156,Joshua,Kennedy,jkennedy1j@house.gov,Male,154.6.163.209
157,Paul,Watkins,pwatkins1k@upenn.edu,Male,177.236.120.87
158,Kathryn,Kelly,kkelly1l@businessweek.com,Female,70.28.61.86
159,Adam,Armstrong,aarmstrong1m@techcrunch.com,Male,133.235.24.202
160,Norma,Wallace,nwallace1n@phoca.cz,Female,241.119.227.128
161,Timothy,Reyes,treyes1o@google.cn,Male,86.28.23.26
162,Elizabeth,Patterson,epatterson1p@sun.com,Female,139.97.159.149
163,Edward,Gomez,egomez1q@google.fr,Male,158.103.108.255
164,David,Cox,dcox1r@friendfeed.com,Male,206.80.80.58
165,Brenda,Wood,bwood1s@over-blog.com,Female,217.207.44.179
166,Adam,Walker,awalker1t@blogs.com,Male,253.211.54.93
167,Michael,Hart,mhart1u@wix.com,Male,230.206.200.22
168,Jesse,Ellis,jellis1v@google.co.uk,Male,213.254.162.52
169,Janet,Powell,jpowell1w@un.org,Female,27.192.194.86
170,Helen,Ford,hford1x@creativecommons.org,Female,52.160.102.168
171,Gerald,Carpenter,gcarpenter1y@about.me,Male,36.30.194.218
172,Kathryn,Oliver,koliver1z@army.mil,Female,202.63.103.69
173,Alan,Berry,aberry20@gov.uk,Male,246.157.112.211
174,Harry,Andrews,handrews21@ameblo.jp,Male,195.108.0.12
175,Andrea,Hall,ahall22@hp.com,Female,149.162.163.28
176,Barbara,Wells,bwells23@behance.net,Female,224.70.72.1
177,Anne,Wells,awells24@apache.org,Female,180.168.81.153
178,Harry,Harper,hharper25@rediff.com,Male,151.87.130.21
179,Jack,Ray,jray26@wufoo.com,Male,220.109.38.178
180,Phillip,Hamilton,phamilton27@joomla.org,Male,166.40.47.30
181,Shirley,Hunter,shunter28@newsvine.com,Female,97.209.140.194
182,Arthur,Daniels,adaniels29@reuters.com,Male,5.40.240.86
183,Virginia,Rodriguez,vrodriguez2a@walmart.com,Female,96.80.164.184
184,Christina,Ryan,cryan2b@hibu.com,Female,56.35.5.52
185,Theresa,Mendoza,tmendoza2c@vinaora.com,Female,243.42.0.210
186,Jason,Cole,jcole2d@ycombinator.com,Male,198.248.39.129
187,Phillip,Bryant,pbryant2e@rediff.com,Male,140.39.116.251
188,Adam,Torres,atorres2f@sun.com,Male,101.75.187.135
189,Margaret,Johnston,mjohnston2g@ucsd.edu,Female,159.30.69.149
190,Paul,Payne,ppayne2h@hhs.gov,Male,199.234.140.220
191,Todd,Willis,twillis2i@businessweek.com,Male,191.59.136.214
192,Willie,Oliver,woliver2j@noaa.gov,Male,44.212.35.197
193,Frances,Robertson,frobertson2k@go.com,Female,31.117.65.136
194,Gregory,Hawkins,ghawkins2l@joomla.org,Male,91.3.22.49
195,Lisa,Perkins,lperkins2m@si.edu,Female,145.95.31.186
196,Jacqueline,Anderson,janderson2n@cargocollective.com,Female,14.176.0.187
197,Shirley,Diaz,sdiaz2o@ucla.edu,Female,207.12.95.46
198,Nicole,Meyer,nmeyer2p@flickr.com,Female,231.79.115.13
199,Mary,Gray,mgray2q@constantcontact.com,Female,210.116.64.253
200,Jean,Mcdonald,jmcdonald2r@baidu.com,Female,122.239.235.117