8.5 KiB
SQLFluff AI Coding Assistant Instructions
Project Overview
SQLFluff is a dialect-flexible SQL linter and auto-fixer supporting 25+ SQL dialects including T-SQL, PostgreSQL, BigQuery, MySQL, Snowflake, and more. The project is written primarily in Python with an experimental Rust component for performance optimization.
Core Architecture:
- Parser-first design: SQL is lexed → parsed into segment trees → linted by rules → optionally auto-fixed
- Dialect inheritance: Each dialect extends ANSI base grammar using
.replace()to override specific segments - Segment-based AST: Everything is a
BaseSegmentsubclass forming a recursive tree structure - Rule crawlers: Rules traverse segment trees to find violations and generate
LintFixobjects
Repository Structure
/
├── src/sqlfluff/ # Main Python package (see src/sqlfluff/AGENTS.md)
│ ├── dialects/ # SQL dialect definitions (see src/sqlfluff/dialects/AGENTS.md)
│ ├── rules/ # Linting rules by category
│ ├── core/ # Parser, lexer, config infrastructure
│ ├── cli/ # Command-line interface
│ └── api/ # Public Python API
├── sqlfluffrs/ # Experimental Rust components (see sqlfluffrs/AGENTS.md)
├── test/ # Test suite (see test/AGENTS.md)
│ ├── fixtures/ # Test data (SQL files, YAML expected outputs)
│ ├── dialects/ # Dialect parsing tests
│ └── rules/ # Rule testing infrastructure
├── docs/ # Sphinx documentation (see docs/AGENTS.md)
├── plugins/ # Pluggable extensions (dbt templater, examples)
├── utils/ # Build and development utilities
└── examples/ # API usage examples
Universal Conventions
Language Support
- Python: 3.9 minimum, 3.12 recommended for development, 3.13 supported
- Rust: Experimental, used for performance-critical lexing/parsing
Code Quality Standards
- Formatting: Black (Python), rustfmt (Rust)
- Linting: Ruff + Flake8 (Python), clippy (Rust)
- Type Checking: Mypy strict mode (Python)
- Pre-commit hooks: Run before all commits via
.venv/bin/pre-commit run --all-files
Testing Philosophy
- All tests must pass before merging
- Test coverage should reach 100%
- Use YAML fixtures for dialect/rule tests
- Mirror source structure in test directories
Commit Messages
- Keep messages clear and descriptive
- Reference issue numbers when applicable
- Use conventional commit style when appropriate
Core Commands
Environment Setup
# Create development environment (first time)
tox -e py312 --devenv .venv
source .venv/bin/activate
# Always activate before working in a new terminal
source .venv/bin/activate
Testing
# Run full test suite
tox
# Run specific Python version tests
tox -e py312
# Run with coverage
tox -e cov-init,py312,cov-report,linting,mypy
# Quick dialect test (after adding SQL fixtures)
python test/generate_parse_fixture_yml.py -d tsql
Quality Checks
# Run all pre-commit checks (format, lint, type check)
.venv/bin/pre-commit run --all-files
# Individual checks
black src/ test/ # Format
ruff check src/ test/ # Lint
mypy src/sqlfluff/ # Type check
Building
# Install package in editable mode (done automatically by tox --devenv)
pip install -e .
# Install plugins
pip install -e plugins/sqlfluff-templater-dbt/
Architecture Principles
Layer Separation
The codebase enforces strict architectural boundaries (via importlinter in pyproject.toml):
corelayer cannot importapi,cli,dialects,rules, orutilsapilayer cannot importcli- Dependencies flow:
linter→rules→parser→errors/types→helpers
Immutability
- Segments are immutable - never modify directly
- Use
.copy()orLintFixmechanisms for changes - Parser creates fresh tree structures
Lazy Loading
- Dialects loaded via
dialect_selector()orload_raw_dialect() - Never import dialect modules directly
- Supports dynamic dialect discovery
Development Workflows
Adding Dialect Features
- Create
.sqltest files intest/fixtures/dialects/<dialect>/ - Run
python test/generate_parse_fixture_yml.py -d <dialect>to generate expected.ymloutputs - Implement grammar in
src/sqlfluff/dialects/dialect_<name>.py - Use
dialect.replace()to override inherited ANSI segments - Verify:
tox -e generate-fixture-yml -- -d <dialect>
See src/sqlfluff/dialects/AGENTS.md for detailed dialect development guide.
Adding Linting Rules
- Create rule class in appropriate category under
src/sqlfluff/rules/ - Define metadata:
code,name,description,groups - Implement
_eval(context: RuleContext) -> Optional[LintResult] - Add YAML test cases to
test/fixtures/rules/std_rule_cases/<category>.yml - Run:
tox -e py312 -- test/rules/yaml_test_cases_test.py -k <rule_code>
Fixing Parser Issues
- Identify failing SQL in
test/fixtures/dialects/<dialect>/*.sql - Run fixture generator to see current parse tree
- Modify grammar segments in dialect file
- Regenerate fixtures to verify
- Check that changes don't break other dialects:
tox -e generate-fixture-yml
Documentation Updates
- Edit source files in
docs/source/ - Build locally:
cd docs && make html - View:
open docs/build/html/index.html - Verify links and formatting
See docs/AGENTS.md for documentation-specific guidelines.
Component-Specific Instructions
For detailed instructions on specific components, refer to:
- Python source code:
src/sqlfluff/AGENTS.md - Dialect development:
src/sqlfluff/dialects/AGENTS.md - Rust components:
sqlfluffrs/AGENTS.md - Testing:
test/AGENTS.md - Documentation:
docs/AGENTS.md
Common Pitfalls
Parser Development
- ❌ Don't modify segment instances directly (immutable)
- ✅ Use
.copy()orLintFixfor modifications - ❌ Don't import dialect modules directly
- ✅ Use
dialect_selector()for lazy loading - ❌ Don't use class references in grammar definitions
- ✅ Use
Ref("SegmentName")string references
Testing
- ❌ Don't put dialect-specific tests in ANSI fixtures
- ✅ Place tests in the most specific applicable dialect
- ❌ Don't forget to regenerate YAML fixtures after grammar changes
- ✅ Always run
generate_parse_fixture_yml.pyafter parser edits - ❌ Don't create monolithic test files
- ✅ Organize by segment type (e.g.,
create_table.sql,select_statement.sql)
Code Quality
- ❌ Don't skip type hints
- ✅ All public functions need type annotations
- ❌ Don't bypass pre-commit hooks
- ✅ Run
.venv/bin/pre-commit run --all-filesbefore committing - ❌ Don't violate import layer boundaries
- ✅ Check
pyproject.tomlimportlinter contracts
Configuration
SQLFluff uses .sqlfluff files (INI format) for configuration:
- Placed in project root or any parent directory
- Key sections:
[sqlfluff],[sqlfluff:rules],[sqlfluff:rules:<rule_code>] - Programmatic:
FluffConfig.from_root(overrides={...})
Plugin System
- Plugins live in
plugins/directory - Installed via
pip install -e plugins/<plugin-name>/ - Entry points defined in plugin's
pyproject.toml - Examples:
sqlfluff-templater-dbt,sqlfluff-plugin-example
Quick Reference
Most Common Tasks
# Add new SQL test case for a dialect
echo "SELECT TOP 10 * FROM users;" > test/fixtures/dialects/tsql/top_clause.sql
python test/generate_parse_fixture_yml.py -d tsql
# Test a specific rule
tox -e py312 -- test/rules/yaml_test_cases_test.py -k AL01
# Check code quality before commit
.venv/bin/pre-commit run --all-files
# Run tests for just the parser module
tox -e py312 -- test/core/parser/
# Check dialect parsing without writing fixtures
sqlfluff parse test.sql --dialect tsql
Performance Tips
- Use
-kflag in pytest to filter tests during development - Run
generate-fixture-ymlwith-d <dialect>to test one dialect - Use
tox -e py312instead of fulltoxduring iteration - Activate venv to run
pytestdirectly (faster than tox for single runs)
Remember: The goal is to maintain SQLFluff as a high-quality, reliable SQL linting tool. Take time to understand the architecture, write comprehensive tests, and follow the established patterns. When in doubt, look at existing similar implementations in the codebase.