# SQLFluff AI Coding Assistant Instructions ## Project Overview SQLFluff is a dialect-flexible SQL linter and auto-fixer supporting 25+ SQL dialects including T-SQL, PostgreSQL, BigQuery, MySQL, Snowflake, and more. The project is written primarily in **Python** with an experimental **Rust** component for performance optimization. **Core Architecture:** - **Parser-first design**: SQL is lexed → parsed into segment trees → linted by rules → optionally auto-fixed - **Dialect inheritance**: Each dialect extends ANSI base grammar using `.replace()` to override specific segments - **Segment-based AST**: Everything is a `BaseSegment` subclass forming a recursive tree structure - **Rule crawlers**: Rules traverse segment trees to find violations and generate `LintFix` objects ## Repository Structure ``` / ├── src/sqlfluff/ # Main Python package (see src/sqlfluff/AGENTS.md) │ ├── dialects/ # SQL dialect definitions (see src/sqlfluff/dialects/AGENTS.md) │ ├── rules/ # Linting rules by category │ ├── core/ # Parser, lexer, config infrastructure │ ├── cli/ # Command-line interface │ └── api/ # Public Python API ├── sqlfluffrs/ # Experimental Rust components (see sqlfluffrs/AGENTS.md) ├── test/ # Test suite (see test/AGENTS.md) │ ├── fixtures/ # Test data (SQL files, YAML expected outputs) │ ├── dialects/ # Dialect parsing tests │ └── rules/ # Rule testing infrastructure ├── docs/ # Sphinx documentation (see docs/AGENTS.md) ├── plugins/ # Pluggable extensions (dbt templater, examples) ├── utils/ # Build and development utilities └── examples/ # API usage examples ``` ## Universal Conventions ### Language Support - **Python**: 3.9 minimum, 3.12 recommended for development, 3.13 supported - **Rust**: Experimental, used for performance-critical lexing/parsing ### Code Quality Standards - **Formatting**: Black (Python), rustfmt (Rust) - **Linting**: Ruff + Flake8 (Python), clippy (Rust) - **Type Checking**: Mypy strict mode (Python) - **Pre-commit hooks**: Run before all commits via `.venv/bin/pre-commit run --all-files` ### Testing Philosophy - All tests must pass before merging - Test coverage should reach 100% - Use YAML fixtures for dialect/rule tests - Mirror source structure in test directories ### Commit Messages - Keep messages clear and descriptive - Reference issue numbers when applicable - Use conventional commit style when appropriate ## Core Commands ### Environment Setup ```bash # Create development environment (first time) tox -e py312 --devenv .venv source .venv/bin/activate # Always activate before working in a new terminal source .venv/bin/activate ``` ### Testing ```bash # Run full test suite tox # Run specific Python version tests tox -e py312 # Run with coverage tox -e cov-init,py312,cov-report,linting,mypy # Quick dialect test (after adding SQL fixtures) python test/generate_parse_fixture_yml.py -d tsql ``` ### Quality Checks ```bash # Run all pre-commit checks (format, lint, type check) .venv/bin/pre-commit run --all-files # Individual checks black src/ test/ # Format ruff check src/ test/ # Lint mypy src/sqlfluff/ # Type check ``` ### Building ```bash # Install package in editable mode (done automatically by tox --devenv) pip install -e . # Install plugins pip install -e plugins/sqlfluff-templater-dbt/ ``` ## Architecture Principles ### Layer Separation The codebase enforces strict architectural boundaries (via `importlinter` in `pyproject.toml`): - `core` layer cannot import `api`, `cli`, `dialects`, `rules`, or `utils` - `api` layer cannot import `cli` - Dependencies flow: `linter` → `rules` → `parser` → `errors`/`types` → `helpers` ### Immutability - Segments are immutable - never modify directly - Use `.copy()` or `LintFix` mechanisms for changes - Parser creates fresh tree structures ### Lazy Loading - Dialects loaded via `dialect_selector()` or `load_raw_dialect()` - Never import dialect modules directly - Supports dynamic dialect discovery ## Development Workflows ### Adding Dialect Features 1. Create `.sql` test files in `test/fixtures/dialects//` 2. Run `python test/generate_parse_fixture_yml.py -d ` to generate expected `.yml` outputs 3. Implement grammar in `src/sqlfluff/dialects/dialect_.py` 4. Use `dialect.replace()` to override inherited ANSI segments 5. Verify: `tox -e generate-fixture-yml -- -d ` See `src/sqlfluff/dialects/AGENTS.md` for detailed dialect development guide. ### Adding Linting Rules 1. Create rule class in appropriate category under `src/sqlfluff/rules/` 2. Define metadata: `code`, `name`, `description`, `groups` 3. Implement `_eval(context: RuleContext) -> Optional[LintResult]` 4. Add YAML test cases to `test/fixtures/rules/std_rule_cases/.yml` 5. Run: `tox -e py312 -- test/rules/yaml_test_cases_test.py -k ` ### Fixing Parser Issues 1. Identify failing SQL in `test/fixtures/dialects//*.sql` 2. Run fixture generator to see current parse tree 3. Modify grammar segments in dialect file 4. Regenerate fixtures to verify 5. Check that changes don't break other dialects: `tox -e generate-fixture-yml` ### Documentation Updates 1. Edit source files in `docs/source/` 2. Build locally: `cd docs && make html` 3. View: `open docs/build/html/index.html` 4. Verify links and formatting See `docs/AGENTS.md` for documentation-specific guidelines. ## Component-Specific Instructions For detailed instructions on specific components, refer to: - **Python source code**: `src/sqlfluff/AGENTS.md` - **Dialect development**: `src/sqlfluff/dialects/AGENTS.md` - **Rust components**: `sqlfluffrs/AGENTS.md` - **Testing**: `test/AGENTS.md` - **Documentation**: `docs/AGENTS.md` ## Common Pitfalls ### Parser Development - ❌ Don't modify segment instances directly (immutable) - ✅ Use `.copy()` or `LintFix` for modifications - ❌ Don't import dialect modules directly - ✅ Use `dialect_selector()` for lazy loading - ❌ Don't use class references in grammar definitions - ✅ Use `Ref("SegmentName")` string references ### Testing - ❌ Don't put dialect-specific tests in ANSI fixtures - ✅ Place tests in the most specific applicable dialect - ❌ Don't forget to regenerate YAML fixtures after grammar changes - ✅ Always run `generate_parse_fixture_yml.py` after parser edits - ❌ Don't create monolithic test files - ✅ Organize by segment type (e.g., `create_table.sql`, `select_statement.sql`) ### Code Quality - ❌ Don't skip type hints - ✅ All public functions need type annotations - ❌ Don't bypass pre-commit hooks - ✅ Run `.venv/bin/pre-commit run --all-files` before committing - ❌ Don't violate import layer boundaries - ✅ Check `pyproject.toml` importlinter contracts ## Configuration SQLFluff uses `.sqlfluff` files (INI format) for configuration: - Placed in project root or any parent directory - Key sections: `[sqlfluff]`, `[sqlfluff:rules]`, `[sqlfluff:rules:]` - Programmatic: `FluffConfig.from_root(overrides={...})` ## Plugin System - Plugins live in `plugins/` directory - Installed via `pip install -e plugins//` - Entry points defined in plugin's `pyproject.toml` - Examples: `sqlfluff-templater-dbt`, `sqlfluff-plugin-example` ## Quick Reference ### Most Common Tasks ```bash # Add new SQL test case for a dialect echo "SELECT TOP 10 * FROM users;" > test/fixtures/dialects/tsql/top_clause.sql python test/generate_parse_fixture_yml.py -d tsql # Test a specific rule tox -e py312 -- test/rules/yaml_test_cases_test.py -k AL01 # Check code quality before commit .venv/bin/pre-commit run --all-files # Run tests for just the parser module tox -e py312 -- test/core/parser/ # Check dialect parsing without writing fixtures sqlfluff parse test.sql --dialect tsql ``` ### Performance Tips - Use `-k` flag in pytest to filter tests during development - Run `generate-fixture-yml` with `-d ` to test one dialect - Use `tox -e py312` instead of full `tox` during iteration - Activate venv to run `pytest` directly (faster than tox for single runs) --- **Remember**: The goal is to maintain SQLFluff as a high-quality, reliable SQL linting tool. Take time to understand the architecture, write comprehensive tests, and follow the established patterns. When in doubt, look at existing similar implementations in the codebase.