Files
sqlfluff/AGENTS.md
2025-11-11 23:09:05 +00:00

8.5 KiB

SQLFluff AI Coding Assistant Instructions

Project Overview

SQLFluff is a dialect-flexible SQL linter and auto-fixer supporting 25+ SQL dialects including T-SQL, PostgreSQL, BigQuery, MySQL, Snowflake, and more. The project is written primarily in Python with an experimental Rust component for performance optimization.

Core Architecture:

  • Parser-first design: SQL is lexed → parsed into segment trees → linted by rules → optionally auto-fixed
  • Dialect inheritance: Each dialect extends ANSI base grammar using .replace() to override specific segments
  • Segment-based AST: Everything is a BaseSegment subclass forming a recursive tree structure
  • Rule crawlers: Rules traverse segment trees to find violations and generate LintFix objects

Repository Structure

/
├── src/sqlfluff/              # Main Python package (see src/sqlfluff/AGENTS.md)
│   ├── dialects/              # SQL dialect definitions (see src/sqlfluff/dialects/AGENTS.md)
│   ├── rules/                 # Linting rules by category
│   ├── core/                  # Parser, lexer, config infrastructure
│   ├── cli/                   # Command-line interface
│   └── api/                   # Public Python API
├── sqlfluffrs/                # Experimental Rust components (see sqlfluffrs/AGENTS.md)
├── test/                      # Test suite (see test/AGENTS.md)
│   ├── fixtures/              # Test data (SQL files, YAML expected outputs)
│   ├── dialects/              # Dialect parsing tests
│   └── rules/                 # Rule testing infrastructure
├── docs/                      # Sphinx documentation (see docs/AGENTS.md)
├── plugins/                   # Pluggable extensions (dbt templater, examples)
├── utils/                     # Build and development utilities
└── examples/                  # API usage examples

Universal Conventions

Language Support

  • Python: 3.9 minimum, 3.12 recommended for development, 3.13 supported
  • Rust: Experimental, used for performance-critical lexing/parsing

Code Quality Standards

  • Formatting: Black (Python), rustfmt (Rust)
  • Linting: Ruff + Flake8 (Python), clippy (Rust)
  • Type Checking: Mypy strict mode (Python)
  • Pre-commit hooks: Run before all commits via .venv/bin/pre-commit run --all-files

Testing Philosophy

  • All tests must pass before merging
  • Test coverage should reach 100%
  • Use YAML fixtures for dialect/rule tests
  • Mirror source structure in test directories

Commit Messages

  • Keep messages clear and descriptive
  • Reference issue numbers when applicable
  • Use conventional commit style when appropriate

Core Commands

Environment Setup

# Create development environment (first time)
tox -e py312 --devenv .venv
source .venv/bin/activate

# Always activate before working in a new terminal
source .venv/bin/activate

Testing

# Run full test suite
tox

# Run specific Python version tests
tox -e py312

# Run with coverage
tox -e cov-init,py312,cov-report,linting,mypy

# Quick dialect test (after adding SQL fixtures)
python test/generate_parse_fixture_yml.py -d tsql

Quality Checks

# Run all pre-commit checks (format, lint, type check)
.venv/bin/pre-commit run --all-files

# Individual checks
black src/ test/                # Format
ruff check src/ test/           # Lint
mypy src/sqlfluff/              # Type check

Building

# Install package in editable mode (done automatically by tox --devenv)
pip install -e .

# Install plugins
pip install -e plugins/sqlfluff-templater-dbt/

Architecture Principles

Layer Separation

The codebase enforces strict architectural boundaries (via importlinter in pyproject.toml):

  • core layer cannot import api, cli, dialects, rules, or utils
  • api layer cannot import cli
  • Dependencies flow: linterrulesparsererrors/typeshelpers

Immutability

  • Segments are immutable - never modify directly
  • Use .copy() or LintFix mechanisms for changes
  • Parser creates fresh tree structures

Lazy Loading

  • Dialects loaded via dialect_selector() or load_raw_dialect()
  • Never import dialect modules directly
  • Supports dynamic dialect discovery

Development Workflows

Adding Dialect Features

  1. Create .sql test files in test/fixtures/dialects/<dialect>/
  2. Run python test/generate_parse_fixture_yml.py -d <dialect> to generate expected .yml outputs
  3. Implement grammar in src/sqlfluff/dialects/dialect_<name>.py
  4. Use dialect.replace() to override inherited ANSI segments
  5. Verify: tox -e generate-fixture-yml -- -d <dialect>

See src/sqlfluff/dialects/AGENTS.md for detailed dialect development guide.

Adding Linting Rules

  1. Create rule class in appropriate category under src/sqlfluff/rules/
  2. Define metadata: code, name, description, groups
  3. Implement _eval(context: RuleContext) -> Optional[LintResult]
  4. Add YAML test cases to test/fixtures/rules/std_rule_cases/<category>.yml
  5. Run: tox -e py312 -- test/rules/yaml_test_cases_test.py -k <rule_code>

Fixing Parser Issues

  1. Identify failing SQL in test/fixtures/dialects/<dialect>/*.sql
  2. Run fixture generator to see current parse tree
  3. Modify grammar segments in dialect file
  4. Regenerate fixtures to verify
  5. Check that changes don't break other dialects: tox -e generate-fixture-yml

Documentation Updates

  1. Edit source files in docs/source/
  2. Build locally: cd docs && make html
  3. View: open docs/build/html/index.html
  4. Verify links and formatting

See docs/AGENTS.md for documentation-specific guidelines.

Component-Specific Instructions

For detailed instructions on specific components, refer to:

  • Python source code: src/sqlfluff/AGENTS.md
  • Dialect development: src/sqlfluff/dialects/AGENTS.md
  • Rust components: sqlfluffrs/AGENTS.md
  • Testing: test/AGENTS.md
  • Documentation: docs/AGENTS.md

Common Pitfalls

Parser Development

  • Don't modify segment instances directly (immutable)
  • Use .copy() or LintFix for modifications
  • Don't import dialect modules directly
  • Use dialect_selector() for lazy loading
  • Don't use class references in grammar definitions
  • Use Ref("SegmentName") string references

Testing

  • Don't put dialect-specific tests in ANSI fixtures
  • Place tests in the most specific applicable dialect
  • Don't forget to regenerate YAML fixtures after grammar changes
  • Always run generate_parse_fixture_yml.py after parser edits
  • Don't create monolithic test files
  • Organize by segment type (e.g., create_table.sql, select_statement.sql)

Code Quality

  • Don't skip type hints
  • All public functions need type annotations
  • Don't bypass pre-commit hooks
  • Run .venv/bin/pre-commit run --all-files before committing
  • Don't violate import layer boundaries
  • Check pyproject.toml importlinter contracts

Configuration

SQLFluff uses .sqlfluff files (INI format) for configuration:

  • Placed in project root or any parent directory
  • Key sections: [sqlfluff], [sqlfluff:rules], [sqlfluff:rules:<rule_code>]
  • Programmatic: FluffConfig.from_root(overrides={...})

Plugin System

  • Plugins live in plugins/ directory
  • Installed via pip install -e plugins/<plugin-name>/
  • Entry points defined in plugin's pyproject.toml
  • Examples: sqlfluff-templater-dbt, sqlfluff-plugin-example

Quick Reference

Most Common Tasks

# Add new SQL test case for a dialect
echo "SELECT TOP 10 * FROM users;" > test/fixtures/dialects/tsql/top_clause.sql
python test/generate_parse_fixture_yml.py -d tsql

# Test a specific rule
tox -e py312 -- test/rules/yaml_test_cases_test.py -k AL01

# Check code quality before commit
.venv/bin/pre-commit run --all-files

# Run tests for just the parser module
tox -e py312 -- test/core/parser/

# Check dialect parsing without writing fixtures
sqlfluff parse test.sql --dialect tsql

Performance Tips

  • Use -k flag in pytest to filter tests during development
  • Run generate-fixture-yml with -d <dialect> to test one dialect
  • Use tox -e py312 instead of full tox during iteration
  • Activate venv to run pytest directly (faster than tox for single runs)

Remember: The goal is to maintain SQLFluff as a high-quality, reliable SQL linting tool. Take time to understand the architecture, write comprehensive tests, and follow the established patterns. When in doubt, look at existing similar implementations in the codebase.