Add instructions for agents (#7261)

Peter Budai
2025-11-12 00:09:05 +01:00
committed by GitHub
parent f6742cbee5
commit 6325aa7f3a
7 changed files with 2692 additions and 0 deletions

AGENTS.md
@@ -0,0 +1,234 @@
# SQLFluff AI Coding Assistant Instructions
## Project Overview
SQLFluff is a dialect-flexible SQL linter and auto-fixer supporting 25+ SQL dialects including T-SQL, PostgreSQL, BigQuery, MySQL, Snowflake, and more. The project is written primarily in **Python** with an experimental **Rust** component for performance optimization.
**Core Architecture:**
- **Parser-first design**: SQL is lexed → parsed into segment trees → linted by rules → optionally auto-fixed
- **Dialect inheritance**: Each dialect extends ANSI base grammar using `.replace()` to override specific segments
- **Segment-based AST**: Everything is a `BaseSegment` subclass forming a recursive tree structure
- **Rule crawlers**: Rules traverse segment trees to find violations and generate `LintFix` objects
## Repository Structure
```
/
├── src/sqlfluff/ # Main Python package (see src/sqlfluff/AGENTS.md)
│ ├── dialects/ # SQL dialect definitions (see src/sqlfluff/dialects/AGENTS.md)
│ ├── rules/ # Linting rules by category
│ ├── core/ # Parser, lexer, config infrastructure
│ ├── cli/ # Command-line interface
│ └── api/ # Public Python API
├── sqlfluffrs/ # Experimental Rust components (see sqlfluffrs/AGENTS.md)
├── test/ # Test suite (see test/AGENTS.md)
│ ├── fixtures/ # Test data (SQL files, YAML expected outputs)
│ ├── dialects/ # Dialect parsing tests
│ └── rules/ # Rule testing infrastructure
├── docs/ # Sphinx documentation (see docs/AGENTS.md)
├── plugins/ # Pluggable extensions (dbt templater, examples)
├── utils/ # Build and development utilities
└── examples/ # API usage examples
```
## Universal Conventions
### Language Support
- **Python**: 3.9 minimum, 3.12 recommended for development, 3.13 supported
- **Rust**: Experimental, used for performance-critical lexing/parsing
### Code Quality Standards
- **Formatting**: Black (Python), rustfmt (Rust)
- **Linting**: Ruff + Flake8 (Python), clippy (Rust)
- **Type Checking**: Mypy strict mode (Python)
- **Pre-commit hooks**: Run before all commits via `.venv/bin/pre-commit run --all-files`
### Testing Philosophy
- All tests must pass before merging
- Test coverage should reach 100%
- Use YAML fixtures for dialect/rule tests
- Mirror source structure in test directories
### Commit Messages
- Keep messages clear and descriptive
- Reference issue numbers when applicable
- Use conventional commit style when appropriate
## Core Commands
### Environment Setup
```bash
# Create development environment (first time)
tox -e py312 --devenv .venv
source .venv/bin/activate
# Always activate before working in a new terminal
source .venv/bin/activate
```
### Testing
```bash
# Run full test suite
tox
# Run specific Python version tests
tox -e py312
# Run with coverage
tox -e cov-init,py312,cov-report,linting,mypy
# Quick dialect test (after adding SQL fixtures)
python test/generate_parse_fixture_yml.py -d tsql
```
### Quality Checks
```bash
# Run all pre-commit checks (format, lint, type check)
.venv/bin/pre-commit run --all-files
# Individual checks
black src/ test/ # Format
ruff check src/ test/ # Lint
mypy src/sqlfluff/ # Type check
```
### Building
```bash
# Install package in editable mode (done automatically by tox --devenv)
pip install -e .
# Install plugins
pip install -e plugins/sqlfluff-templater-dbt/
```
## Architecture Principles
### Layer Separation
The codebase enforces strict architectural boundaries (via `importlinter` in `pyproject.toml`):
- `core` layer cannot import `api`, `cli`, `dialects`, `rules`, or `utils`
- `api` layer cannot import `cli`
- Dependencies flow: `linter` → `rules` → `parser` → `errors`/`types` → `helpers`
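The contracts take roughly this shape (an illustrative sketch using importlinter's standard contract syntax; the actual contract names and module lists live in the repository's `pyproject.toml`):
```toml
[tool.importlinter]
root_package = "sqlfluff"

# Forbid the core layer from importing the outer layers.
[[tool.importlinter.contracts]]
name = "Core does not import outer layers"
type = "forbidden"
source_modules = ["sqlfluff.core"]
forbidden_modules = [
    "sqlfluff.api",
    "sqlfluff.cli",
    "sqlfluff.dialects",
    "sqlfluff.rules",
]
```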
### Immutability
- Segments are immutable - never modify directly
- Use `.copy()` or `LintFix` mechanisms for changes
- Parser creates fresh tree structures
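A minimal sketch of the copy-then-fix pattern (`propose_replacement` is a hypothetical helper; see the rule examples in `src/sqlfluff/AGENTS.md` for real usage):
```python
from sqlfluff.core.parser import BaseSegment
from sqlfluff.core.rules import LintFix

def propose_replacement(segment: BaseSegment) -> LintFix:
    """Copy a segment and propose the edit as a fix (never mutate in place)."""
    edited = segment.copy()
    return LintFix.replace(segment, [edited])
```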
### Lazy Loading
- Dialects loaded via `dialect_selector()` or `load_raw_dialect()`
- Never import dialect modules directly
- Supports dynamic dialect discovery
## Development Workflows
### Adding Dialect Features
1. Create `.sql` test files in `test/fixtures/dialects/<dialect>/`
2. Run `python test/generate_parse_fixture_yml.py -d <dialect>` to generate expected `.yml` outputs
3. Implement grammar in `src/sqlfluff/dialects/dialect_<name>.py`
4. Use `dialect.replace()` to override inherited ANSI segments
5. Verify: `tox -e generate-fixture-yml -- -d <dialect>`
See `src/sqlfluff/dialects/AGENTS.md` for detailed dialect development guide.
### Adding Linting Rules
1. Create rule class in appropriate category under `src/sqlfluff/rules/`
2. Define metadata: `code`, `name`, `description`, `groups`
3. Implement `_eval(context: RuleContext) -> Optional[LintResult]`
4. Add YAML test cases to `test/fixtures/rules/std_rule_cases/<category>.yml`
5. Run: `tox -e py312 -- test/rules/yaml_test_cases_test.py -k <rule_code>`
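A minimal skeleton covering steps 1–3 (the rule code `XX01`, the `name`, and the target segment type are placeholders, not a real SQLFluff rule):
```python
from typing import Optional

from sqlfluff.core.rules import BaseRule, LintResult, RuleContext
from sqlfluff.core.rules.crawlers import SegmentSeekerCrawler

class Rule_XX01(BaseRule):
    """One-line description shown in the rule reference."""
    name = "example.placeholder"
    groups = ("all",)
    crawl_behaviour = SegmentSeekerCrawler({"select_statement"})

    def _eval(self, context: RuleContext) -> Optional[LintResult]:
        # Return a LintResult to flag a violation, or None to pass.
        return None
```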
### Fixing Parser Issues
1. Identify failing SQL in `test/fixtures/dialects/<dialect>/*.sql`
2. Run fixture generator to see current parse tree
3. Modify grammar segments in dialect file
4. Regenerate fixtures to verify
5. Check that changes don't break other dialects: `tox -e generate-fixture-yml`
### Documentation Updates
1. Edit source files in `docs/source/`
2. Build locally: `cd docs && make html`
3. View: `open docs/build/html/index.html`
4. Verify links and formatting
See `docs/AGENTS.md` for documentation-specific guidelines.
## Component-Specific Instructions
For detailed instructions on specific components, refer to:
- **Python source code**: `src/sqlfluff/AGENTS.md`
- **Dialect development**: `src/sqlfluff/dialects/AGENTS.md`
- **Rust components**: `sqlfluffrs/AGENTS.md`
- **Testing**: `test/AGENTS.md`
- **Documentation**: `docs/AGENTS.md`
## Common Pitfalls
### Parser Development
- ❌ Don't modify segment instances directly (immutable)
- ✅ Use `.copy()` or `LintFix` for modifications
- ❌ Don't import dialect modules directly
- ✅ Use `dialect_selector()` for lazy loading
- ❌ Don't use class references in grammar definitions
- ✅ Use `Ref("SegmentName")` string references
### Testing
- ❌ Don't put dialect-specific tests in ANSI fixtures
- ✅ Place tests in the most specific applicable dialect
- ❌ Don't forget to regenerate YAML fixtures after grammar changes
- ✅ Always run `generate_parse_fixture_yml.py` after parser edits
- ❌ Don't create monolithic test files
- ✅ Organize by segment type (e.g., `create_table.sql`, `select_statement.sql`)
### Code Quality
- ❌ Don't skip type hints
- ✅ All public functions need type annotations
- ❌ Don't bypass pre-commit hooks
- ✅ Run `.venv/bin/pre-commit run --all-files` before committing
- ❌ Don't violate import layer boundaries
- ✅ Check `pyproject.toml` importlinter contracts
## Configuration
SQLFluff uses `.sqlfluff` files (INI format) for configuration:
- Placed in project root or any parent directory
- Key sections: `[sqlfluff]`, `[sqlfluff:rules]`, `[sqlfluff:rules:<rule_code>]`
- Programmatic: `FluffConfig.from_root(overrides={...})`
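For example, a minimal `.sqlfluff` (the rule section and options shown are illustrative):
```ini
[sqlfluff]
dialect = ansi
exclude_rules = LT05

[sqlfluff:rules:capitalisation.keywords]
capitalisation_policy = upper
```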
## Plugin System
- Plugins live in `plugins/` directory
- Installed via `pip install -e plugins/<plugin-name>/`
- Entry points defined in plugin's `pyproject.toml`
- Examples: `sqlfluff-templater-dbt`, `sqlfluff-plugin-example`
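Registration happens through an entry point in the plugin's packaging metadata; roughly like the following sketch (the exact group and module names are assumptions, check `plugins/sqlfluff-plugin-example` for the canonical form):
```toml
[project.entry-points.sqlfluff]
example = "sqlfluff_plugin_example"
```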
## Quick Reference
### Most Common Tasks
```bash
# Add new SQL test case for a dialect
echo "SELECT TOP 10 * FROM users;" > test/fixtures/dialects/tsql/top_clause.sql
python test/generate_parse_fixture_yml.py -d tsql
# Test a specific rule
tox -e py312 -- test/rules/yaml_test_cases_test.py -k AL01
# Check code quality before commit
.venv/bin/pre-commit run --all-files
# Run tests for just the parser module
tox -e py312 -- test/core/parser/
# Check dialect parsing without writing fixtures
sqlfluff parse test.sql --dialect tsql
```
### Performance Tips
- Use `-k` flag in pytest to filter tests during development
- Run `generate-fixture-yml` with `-d <dialect>` to test one dialect
- Use `tox -e py312` instead of full `tox` during iteration
- Activate venv to run `pytest` directly (faster than tox for single runs)
---
**Remember**: The goal is to maintain SQLFluff as a high-quality, reliable SQL linting tool. Take time to understand the architecture, write comprehensive tests, and follow the established patterns. When in doubt, look at existing similar implementations in the codebase.

docs/AGENTS.md
@@ -0,0 +1,532 @@
# Documentation - AI Assistant Instructions
This file provides guidelines for building and maintaining SQLFluff documentation.
## Documentation System
SQLFluff uses **Sphinx** for documentation generation with:
- **Source**: `docs/source/` (reStructuredText files)
- **Build output**: `docs/build/` (HTML, generated)
- **Live docs**: https://docs.sqlfluff.com
- **Auto-generated content**: API docs, rule reference, dialect lists
## Documentation Structure
```
docs/
├── source/ # Documentation source files
│ ├── conf.py # Sphinx configuration
│ ├── index.rst # Homepage
│ ├── gettingstarted.rst # Getting started guide
│ ├── why_sqlfluff.rst # Project overview
│ ├── inthewild.rst # Real-world usage
│ ├── jointhecommunity.rst # Community info
│ ├── configuration/ # Configuration docs
│ │ ├── index.rst
│ │ └── setting_configuration.rst
│ ├── guides/ # Developer guides
│ │ ├── index.rst
│ │ ├── first_contribution.rst
│ │ └── dialect_development.rst
│ ├── reference/ # API and rule reference
│ │ ├── index.rst
│ │ ├── rules.rst
│ │ └── api.rst
│ ├── production/ # Production deployment
│ ├── _static/ # Static assets (CSS, images)
│ ├── _ext/ # Sphinx extensions
│ └── _partials/ # Reusable doc fragments
├── build/ # Generated HTML (gitignored)
├── Makefile # Build commands (Unix)
├── make.bat # Build commands (Windows)
├── requirements.txt # Doc build dependencies
├── README.md # Documentation README
└── generate-auto-docs.py # Script to generate auto-docs
```
## Building Documentation Locally
### Setup
```bash
# Activate virtual environment
source .venv/bin/activate
# Install documentation dependencies
pip install -r docs/requirements.txt
```
### Building HTML Docs
```bash
# Navigate to docs directory
cd docs
# Build HTML (Unix/Linux/Mac)
make html
# Build HTML (Windows)
make.bat html
# View built documentation
open build/html/index.html # macOS
xdg-open build/html/index.html # Linux
# Or manually open docs/build/html/index.html in browser
```
### Clean Build
```bash
cd docs
# Clean previous build
make clean
# Build fresh
make html
```
### Live Reload During Development
For rapid iteration, use `sphinx-autobuild`:
```bash
# Install sphinx-autobuild
pip install sphinx-autobuild
# Run live-reload server
cd docs
sphinx-autobuild source build/html
# Open browser to http://127.0.0.1:8000
# Docs rebuild automatically on file changes
```
## Documentation Format
### reStructuredText (RST)
SQLFluff docs use RST format (`.rst` files).
**Basic syntax:**
```rst
Page Title
==========
Section Heading
---------------
Subsection
~~~~~~~~~~
**Bold text**
*Italic text*
``inline code``
`Link text <https://example.com>`_
- Bullet list item
- Another item
1. Numbered list
2. Second item
.. code-block:: sql
SELECT * FROM users
WHERE active = 1;
.. code-block:: python
from sqlfluff.core import Linter
linter = Linter(dialect="tsql")
.. note::
This is a note box.
.. warning::
This is a warning box.
```
### Cross-References
```rst
Link to another doc:
:doc:`gettingstarted`
Link to section:
:ref:`configuration-label`
Link to Python class:
:class:`sqlfluff.core.Linter`
Link to function:
:func:`sqlfluff.lint`
```
## Documentation Types
### User-Facing Documentation
**Getting Started** (`gettingstarted.rst`):
- Installation instructions
- Quick start examples
- Basic usage patterns
**Configuration** (`configuration/`):
- Configuration file format
- Available settings
- Dialect-specific config
**Rules Reference** (`reference/rules.rst`):
- Auto-generated from rule metadata
- Rule descriptions, examples, configuration options
- **Updated automatically** via `generate-auto-docs.py`
### Developer Documentation
**Guides** (`guides/`):
- First contribution walkthrough
- Dialect development guide
- Rule development guide
- Architecture overview
**API Reference** (`reference/api.rst`):
- Auto-generated from docstrings
- Python API documentation
- Class and function references
### Production Documentation
**Production** (`production/`):
- CI/CD integration
- Performance tuning
- Deployment best practices
## Auto-Generated Documentation
### Generating Auto-Docs
Some documentation is generated from source code:
```bash
# Generate auto-documentation (rules, dialects, etc.)
python docs/generate-auto-docs.py
# Build docs after generation
cd docs
make html
```
**What gets auto-generated:**
- Rule reference (from rule metadata)
- Dialect list (from available dialects)
- API documentation (from docstrings)
### Rule Documentation
Rules are documented via their metadata:
```python
class Rule_AL01(BaseRule):
"""Implicit aliasing of table not allowed.
**Anti-pattern**
Using implicit alias for tables:
.. code-block:: sql
SELECT * FROM users u
**Best practice**
Use explicit AS keyword:
.. code-block:: sql
SELECT * FROM users AS u
"""
groups = ("all", "aliasing")
# ... rest of rule
```
The rule's docstring is extracted and rendered in the auto-generated rule reference.
## Documentation Style Guide
### Writing Style
- **Clear and concise**: Use simple language
- **Active voice**: "Run the command" not "The command should be run"
- **Present tense**: "SQLFluff parses SQL" not "SQLFluff will parse SQL"
- **Examples**: Include code examples for every feature
- **User perspective**: Write from user's point of view
### Code Examples
**Always include:**
- Context (what the example demonstrates)
- Complete, runnable code
- Expected output when relevant
**SQL examples:**
```rst
.. code-block:: sql
-- Anti-pattern: implicit alias
SELECT * FROM users u;
.. code-block:: sql
-- Best practice: explicit alias
SELECT * FROM users AS u;
```
**Python examples:**
```rst
.. code-block:: python
from sqlfluff.core import Linter
linter = Linter(dialect="tsql")
result = linter.lint_string("SELECT * FROM users")
print(result.violations)
```
**Shell examples:**
```rst
.. code-block:: bash
# Lint a SQL file
sqlfluff lint query.sql
# Fix issues automatically
sqlfluff fix query.sql
```
### Sections and Headers
Use consistent header hierarchy:
```rst
Page Title (Top Level)
======================
Major Section
-------------
Subsection
~~~~~~~~~~
Sub-subsection
^^^^^^^^^^^^^^
```
### Links and References
**External links:**
```rst
See the `official documentation <https://docs.sqlfluff.com>`_ for details.
```
**Internal cross-references:**
```rst
For configuration options, see :doc:`configuration/index`.
As described in :ref:`dialect-development`, each dialect...
```
**Define reference labels:**
```rst
.. _dialect-development:
Dialect Development
-------------------
This section covers dialect development...
```
## Checking Documentation Quality
### Sphinx Warnings
Sphinx warns about issues during build:
```bash
cd docs
make html
# Look for warnings like:
# WARNING: document isn't included in any toctree
# WARNING: undefined label: some-label
# ERROR: Unknown directive type "cod-block" (typo!)
```
Fix all warnings before committing documentation changes.
### Link Checking
```bash
cd docs
# Check for broken links
make linkcheck
# Review output for HTTP errors, redirects, broken anchors
```
### Spell Checking
SQLFluff uses `codespell` for spell checking:
```bash
# Run from repository root
codespell docs/source/
# Or via pre-commit
.venv/bin/pre-commit run codespell --all-files
```
## Documentation Workflow
### Adding New Documentation
1. **Create or edit `.rst` file** in `docs/source/`
2. **Add to table of contents** (toctree) in parent `index.rst`:
```rst
.. toctree::
:maxdepth: 2
existing_page
new_page
```
3. **Build and review:**
```bash
cd docs
make clean html
open build/html/index.html
```
4. **Check for warnings** during build
5. **Run link checker:**
```bash
make linkcheck
```
6. **Commit both source and auto-generated files** if applicable
### Updating Existing Documentation
1. **Edit `.rst` file**
2. **Rebuild docs:**
```bash
cd docs
make html
```
3. **Review changes** in browser
4. **Check for new warnings**
5. **Commit changes**
### Adding Code Examples
1. **Create example in `examples/`** directory (optional):
```python
# examples/08_new_feature.py
from sqlfluff.core import Linter
linter = Linter(dialect="tsql")
result = linter.lint_string("SELECT * FROM users")
print(result.violations)
```
2. **Reference in documentation:**
```rst
.. literalinclude:: ../../examples/08_new_feature.py
:language: python
:linenos:
```
3. **Or embed directly:**
```rst
.. code-block:: python
from sqlfluff.core import Linter
linter = Linter(dialect="tsql")
```
## Common Documentation Tasks
### Document New Rule
1. **Add docstring to rule class** with anti-pattern and best practice
2. **Regenerate docs:**
```bash
python docs/generate-auto-docs.py
```
3. **Build and verify:**
```bash
cd docs && make html
```
### Document New Dialect
1. **Add dialect overview** to `reference/dialects.rst` or create new file
2. **Include supported features** and known limitations
3. **Provide examples** of dialect-specific syntax
4. **Update auto-generated dialect list:**
```bash
python docs/generate-auto-docs.py
```
### Add Tutorial/Guide
1. **Create new `.rst` file** in `docs/source/guides/`
2. **Add to toctree** in `docs/source/guides/index.rst`
3. **Include step-by-step instructions** with examples
4. **Build and test** all commands/code in tutorial
## Sphinx Configuration
Configuration in `docs/source/conf.py`:
**Key settings:**
- `project`: "SQLFluff"
- `extensions`: Sphinx extensions used
- `html_theme`: Documentation theme
- `html_static_path`: Static assets directory
**Custom extensions** in `docs/source/_ext/`:
- Custom directives or roles
- Auto-documentation generators
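An illustrative excerpt of the kind of settings involved (the actual extension list and theme in `docs/source/conf.py` may differ):
```python
# Illustrative excerpt; check docs/source/conf.py for the real values.
project = "SQLFluff"
extensions = [
    "sphinx.ext.autodoc",  # pull API documentation from docstrings
]
html_static_path = ["_static"]
```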
## Testing Documentation Build in CI
Documentation builds are tested in CI/CD:
- Ensures no Sphinx warnings or errors
- Validates all links
- Checks for spelling errors
**Local pre-check before committing:**
```bash
# Build docs
cd docs && make clean html
# Check links
make linkcheck
# Spell check
cd .. && codespell docs/source/
# Review any warnings/errors
```
---
**See also:**
- Root `AGENTS.md` for general project overview
- `CONTRIBUTING.md` for contribution guidelines
- [Sphinx documentation](https://www.sphinx-doc.org/) for RST syntax reference

sqlfluffrs/AGENTS.md
@@ -0,0 +1,439 @@
# Rust Components - AI Assistant Instructions
This file provides guidance for SQLFluff's Rust components.
## Overview
The `sqlfluffrs/` directory contains an **experimental Rust implementation** of performance-critical SQLFluff components. This is an ongoing effort to accelerate lexing and parsing operations while maintaining compatibility with the Python implementation.
## Project Status
**Current state**: Experimental and under development
**Goals:**
- Accelerate lexing performance (tokenization)
- Speed up parsing for large SQL files
- Maintain API compatibility with Python components
- Provide optional Rust-based acceleration for production users
**Not a replacement**: The Rust components are designed to work alongside Python, not replace the entire codebase.
## Structure
```
sqlfluffrs/
├── Cargo.toml # Rust package manifest
├── pyproject.toml # Python packaging for Rust extension
├── LICENSE.md # License
├── README.md # Rust component README
├── py.typed # Type stub marker
├── sqlfluffrs.pyi # Python type stubs for Rust extension
└── src/ # Rust source code
├── lib.rs # Library root
├── python.rs # Python bindings (PyO3)
├── lexer.rs # Lexer implementation
├── marker.rs # Position markers
├── matcher.rs # Pattern matching
├── regex.rs # Regex utilities
├── slice.rs # String slicing
├── config/ # Configuration handling
├── dialect/ # Dialect definitions
├── templater/ # Template handling
└── token/ # Token types
```
## Rust Development Setup
### Requirements
- **Rust**: Install via [rustup](https://rustup.rs/)
```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
- **Cargo**: Comes with Rust installation
- **Python development headers**: Required for PyO3 bindings
### Building Rust Components
```bash
# Navigate to Rust directory
cd sqlfluffrs
# Build Rust library
cargo build
# Build release (optimized)
cargo build --release
# Run Rust tests
cargo test
# Run with output
cargo test -- --nocapture
# Check code without building
cargo check
# Format code
cargo fmt
# Lint code
cargo clippy
```
### Python Integration
The Rust components are exposed to Python via **PyO3**:
```bash
# Build and install Python extension
cd sqlfluffrs
pip install -e .
# Or from repository root
pip install -e ./sqlfluffrs/
```
## Rust Coding Standards
### Style
- **Follow Rust conventions**: Use `rustfmt` for formatting
- **Naming**:
- `snake_case` for functions, variables, modules
- `PascalCase` for types, structs, enums, traits
- `SCREAMING_SNAKE_CASE` for constants
- **Idiomatic Rust**: Prefer iterators, pattern matching, and ownership patterns
### Error Handling
**Prefer `Result` and `?` operator:**
```rust
fn parse_token(input: &str) -> Result<Token, ParseError> {
let trimmed = input.trim();
if trimmed.is_empty() {
return Err(ParseError::EmptyInput);
}
Ok(Token::new(trimmed))
}
fn process() -> Result<(), ParseError> {
let token = parse_token(" SELECT ")?; // Use ? operator
// ... use token
Ok(())
}
```
**Avoid `unwrap()` and `expect()` in production code:**
```rust
// ❌ Bad: Can panic
let value = some_option.unwrap();
// ✅ Good: Handle None case
let value = match some_option {
Some(v) => v,
None => return Err(Error::MissingValue),
};
// ✅ Also good: Use ? with Option
let value = some_option.ok_or(Error::MissingValue)?;
```
**Exception**: `unwrap()` and `expect()` are acceptable in tests.
### Testing
```rust
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_token_parsing() {
let result = parse_token("SELECT");
assert!(result.is_ok());
assert_eq!(result.unwrap().value, "SELECT");
}
#[test]
fn test_empty_input_fails() {
let result = parse_token("");
assert!(result.is_err());
}
}
```
Run tests:
```bash
cargo test
cargo test --lib # Library tests only
cargo test --release # Optimized build
```
## Python-Rust Interface (PyO3)
### Exposing Rust to Python
**Basic example** in `src/python.rs`:
```rust
use pyo3::prelude::*;
#[pyfunction]
fn tokenize(sql: &str) -> PyResult<Vec<String>> {
let tokens = internal_tokenize(sql)
.map_err(|e| PyErr::new::<pyo3::exceptions::PyValueError, _>(e.to_string()))?;
Ok(tokens)
}
#[pymodule]
fn sqlfluffrs(_py: Python, m: &PyModule) -> PyResult<()> {
m.add_function(wrap_pyfunction!(tokenize, m)?)?;
Ok(())
}
```
**Python usage:**
```python
import sqlfluffrs
tokens = sqlfluffrs.tokenize("SELECT * FROM users")
print(tokens) # ['SELECT', '*', 'FROM', 'users']
```
### Type Stubs
Provide Python type hints in `sqlfluffrs.pyi`:
```python
from typing import List
def tokenize(sql: str) -> List[str]: ...
```
## Architecture
### Lexer
The Rust lexer (`src/lexer.rs`) tokenizes SQL strings:
```rust
pub struct Lexer {
config: LexerConfig,
}
impl Lexer {
pub fn new(config: LexerConfig) -> Self {
Lexer { config }
}
pub fn lex(&self, sql: &str) -> Result<Vec<Token>, LexError> {
// Tokenization logic
}
}
```
### Matcher
Pattern matching for grammar rules (`src/matcher.rs`):
```rust
pub trait Matcher {
fn matches(&self, tokens: &[Token]) -> bool;
}
pub struct SequenceMatcher {
matchers: Vec<Box<dyn Matcher>>,
}
```
### Dialect Support
Rust dialects mirror Python dialects (`src/dialect/`):
```rust
pub struct Dialect {
name: String,
reserved_keywords: HashSet<String>,
unreserved_keywords: HashSet<String>,
}
```
## Performance Considerations
### Benchmarking
Use Criterion for benchmarks:
```rust
// benches/lexer_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
fn lexer_benchmark(c: &mut Criterion) {
c.bench_function("lex_simple_select", |b| {
b.iter(|| {
let sql = black_box("SELECT * FROM users WHERE id = 1");
lex(sql)
});
});
}
criterion_group!(benches, lexer_benchmark);
criterion_main!(benches);
```
Run benchmarks:
```bash
cargo bench
```
### Optimization
- Use `cargo build --release` for production builds
- Profile with `cargo flamegraph` or `perf`
- Prefer zero-copy operations where possible
- Use `&str` over `String` when ownership not needed
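For example, a function that borrows `&str` works with any string slice and avoids forcing callers to allocate (a small illustrative sketch, not repository code):
```rust
// Borrow the input: no allocation, works with any string slice.
fn first_keyword(sql: &str) -> Option<&str> {
    sql.split_whitespace().next()
}

fn main() {
    let sql = String::from("SELECT * FROM users");
    // Taking `String` by value here would force a move or clone;
    // `&sql` coerces cheaply to `&str`.
    assert_eq!(first_keyword(&sql), Some("SELECT"));
}
```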
## Development Workflow
### Making Changes
1. **Edit Rust code** in `src/`
2. **Run tests:**
```bash
cargo test
```
3. **Format code:**
```bash
cargo fmt
```
4. **Lint:**
```bash
cargo clippy
```
5. **Build Python extension:**
```bash
pip install -e .
```
6. **Test Python integration:**
```python
import sqlfluffrs
# Test Rust functions from Python
```
### Syncing with Python
After changing Rust lexer/parser:
1. **Regenerate dialect bindings:**
```bash
# From repository root
source .venv/bin/activate
python utils/rustify.py build
```
2. **Test against Python test suite:**
```bash
tox -e py312
```
## Common Tasks
### Adding New Lexer Pattern
1. Edit `src/lexer.rs`
2. Add pattern matching logic
3. Write tests
4. Run `cargo test`
5. Update Python bindings if needed
### Updating Dialect
1. Edit `src/dialect/<dialect>.rs`
2. Update keyword lists or grammar
3. Sync with Python via `utils/rustify.py build`
4. Test with `cargo test`
### Exposing New Function to Python
1. Add function in appropriate Rust module
2. Add Python binding in `src/python.rs`:
```rust
#[pyfunction]
fn my_new_function(input: &str) -> PyResult<String> {
// Implementation
}
```
3. Register in module:
```rust
#[pymodule]
fn sqlfluffrs(_py: Python, m: &PyModule) -> PyResult<()> {
m.add_function(wrap_pyfunction!(my_new_function, m)?)?;
Ok(())
}
```
4. Add type stub to `sqlfluffrs.pyi`:
```python
def my_new_function(input: str) -> str: ...
```
5. Rebuild and test
## Testing
### Rust Unit Tests
```bash
# All tests
cargo test
# Specific test
cargo test test_lexer_keywords
# Show output
cargo test -- --nocapture
# With release optimizations
cargo test --release
```
### Integration with Python Tests
Rust components are tested via Python test suite:
```bash
# Ensure Rust extension is built
cd sqlfluffrs && pip install -e . && cd ..
# Run Python tests
tox -e py312
```
## Resources
- **Rust Book**: https://doc.rust-lang.org/book/
- **PyO3 Guide**: https://pyo3.rs/
- **Cargo Book**: https://doc.rust-lang.org/cargo/
- **Rust by Example**: https://doc.rust-lang.org/rust-by-example/
## Current Limitations
- Experimental and incomplete
- Not all Python features implemented
- Performance gains vary by use case
- May have compatibility issues with some dialects
## Contributing to Rust Components
Rust contributions are welcome but should:
- Maintain API compatibility with Python
- Include tests
- Follow Rust conventions
- Update Python type stubs
- Sync with Python implementation via `rustify.py`
---
**See also:**
- Root `AGENTS.md` for general project overview
- `src/sqlfluff/AGENTS.md` for Python coding standards
- `sqlfluffrs/README.md` for Rust-specific README

src/sqlfluff/AGENTS.md
@@ -0,0 +1,413 @@
# Python Source Code - AI Assistant Instructions
This file provides Python-specific development guidelines for SQLFluff's main source code.
## Python Standards
### Version Support
- **Minimum**: Python 3.9
- **Recommended for development**: Python 3.12
- **Maximum tested**: Python 3.13
### Code Style & Formatting
#### Black (Auto-formatter)
- Default settings (line length: 88 characters)
- Run: `black src/ test/`
- Automatically enforced via pre-commit hooks
#### Ruff (Linter)
- Fast Python linter with isort and pydocstyle integration
- Run: `ruff check src/ test/`
- Auto-fix: `ruff check --fix src/ test/`
- Checks import order, docstring style, common code smells
#### Flake8 (Additional Linting)
- Used with flake8-black plugin
- Configured in `pyproject.toml`
### Type Annotations
**Required for all public functions and methods:**
```python
from typing import Optional, Union, List, Dict, cast, TYPE_CHECKING
def parse_sql(sql: str, dialect: str = "ansi") -> Optional[BaseSegment]:
"""Parse SQL string into segment tree.
Args:
sql: SQL string to parse.
dialect: SQL dialect name.
Returns:
Root segment or None if parsing fails.
"""
pass
```
**Key Mypy settings** (strict mode enabled):
- `warn_unused_configs = true`
- `strict_equality = true`
- `no_implicit_reexport = true`
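In `pyproject.toml` these live in the standard mypy table:
```toml
[tool.mypy]
warn_unused_configs = true
strict_equality = true
no_implicit_reexport = true
```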
**Avoiding circular imports:**
```python
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from sqlfluff.core.parser import BaseSegment
```
### Documentation Standards
**Google-style docstrings required:**
```python
def complex_function(param1: str, param2: int, flag: bool = False) -> Dict[str, Any]:
"""Short one-line description.
Longer description explaining the purpose, behavior, and usage.
Can span multiple lines when needed.
Args:
param1: Description of first parameter.
param2: Description of second parameter.
flag: Optional flag for special behavior. Defaults to False.
Returns:
Dictionary containing results with keys 'status', 'data', etc.
Raises:
ValueError: If param1 is empty.
SQLParseError: If parsing fails.
"""
pass
```
**Exceptions to docstring requirements:**
- Magic methods (e.g., `__init__`, `__str__`) - D105, D107 ignored
- Private methods may have simplified docstrings
- Test functions use descriptive names instead
### Import Organization
**Enforced order** (via Ruff isort):
```python
# 1. Standard library imports
import os
import sys
from typing import Optional
# 2. Third-party imports
import click
import yaml
# 3. First-party imports (sqlfluff packages)
from sqlfluff.core.parser import BaseSegment
from sqlfluff.core.rules import BaseRule
```
**Import linter contracts** (in `pyproject.toml`):
- `core` cannot import from `api`, `cli`, `dialects`, `rules`, `utils`
- `api` cannot import from `cli`
- Use specific imports: `from module import SpecificClass` (not `import *`)
## Architecture & Design Patterns
### Segment System
**All AST nodes inherit from `BaseSegment`:**
```python
from sqlfluff.core.parser import BaseSegment
from sqlfluff.core.parser.grammar import Sequence, Ref, OneOf
class SelectStatementSegment(BaseSegment):
"""A SELECT statement."""
type = "select_statement"
match_grammar = Sequence(
"SELECT",
Ref("SelectClauseSegment"),
Ref("FromClauseSegment", optional=True),
Ref("WhereClauseSegment", optional=True),
)
```
**Key principles:**
- Segments are **immutable** - never modify in place
- Use `.copy()` to create modified versions
- `match_grammar` defines parsing rules recursively
- Use `Ref("SegmentName")` not direct class references
### Rule System
**Rules inherit from `BaseRule`:**
```python
from sqlfluff.core.rules import BaseRule, LintResult, LintFix, RuleContext
from sqlfluff.core.rules.crawlers import SegmentSeekerCrawler
class Rule_AL01(BaseRule):
"""Implicit aliasing of table not allowed."""
groups = ("all", "aliasing")
crawl_behaviour = SegmentSeekerCrawler({"table_reference"})
def _eval(self, context: RuleContext) -> Optional[LintResult]:
"""Evaluate rule against segment.
Args:
context: Rule context with segment and dialect info.
Returns:
LintResult if violation found, None otherwise.
"""
if context.segment.has_implicit_alias:
return LintResult(
anchor=context.segment,
fixes=[LintFix.replace(context.segment, [new_segments])],
)
return None
```
**Rule metadata:**
- `code`: Unique identifier (e.g., "AL01", "LT02")
- `name`: Human-readable name
- `description`: What the rule checks
- `groups`: Categories like "all", "core", "aliasing"
- `crawl_behaviour`: Which segment types to examine
### Dialect System
**Dialects use inheritance and replacement:**
```python
from sqlfluff.core.dialects import load_raw_dialect
from sqlfluff.core.parser.grammar import Sequence, Ref
# Load parent dialect
ansi_dialect = load_raw_dialect("ansi")
tsql_dialect = ansi_dialect.copy_as("tsql")
# Override specific segments
tsql_dialect.replace(
SelectStatementSegment=Sequence(
"SELECT",
Ref("TopClauseSegment", optional=True), # T-SQL specific
Ref("SelectClauseSegment"),
Ref("FromClauseSegment", optional=True),
),
)
```
**Never import dialects directly:**
```python
# ❌ Wrong
from sqlfluff.dialects.dialect_tsql import tsql_dialect
# ✅ Correct
from sqlfluff.core.dialects import dialect_selector
dialect = dialect_selector("tsql")
```
## Testing Patterns
### Test File Organization
```
test/
├── core/
│ ├── parser/
│ │ ├── grammar_test.py
│ │ └── segments_test.py
│ └── rules/
│ └── base_test.py
├── dialects/
│ └── tsql_test.py
└── rules/
├── yaml_test_cases_test.py
└── std_fix_auto_test.py
```
**Naming convention**: `*_test.py` (configured as pytest's discovery pattern)
### Pytest Fixtures
**Use fixtures in `conftest.py`:**
```python
import pytest
from sqlfluff.core import FluffConfig
@pytest.fixture
def default_config():
"""Provide default SQLFluff config for tests."""
return FluffConfig.from_root()
def test_parser_with_config(default_config):
"""Test parser using fixture."""
assert default_config.get("dialect") == "ansi"
```
### Test Markers
```python
import pytest
@pytest.mark.dbt
def test_dbt_templater():
"""Test requiring dbt installation."""
pass
@pytest.mark.integration
def test_full_parse_flow():
"""Integration test for complete parsing flow."""
pass
```
## Common Commands
### Development Workflow
```bash
# Activate virtual environment
source .venv/bin/activate
# Run tests for specific module
pytest test/core/parser/ -v
# Run with coverage
pytest test/core/ --cov=src/sqlfluff/core --cov-report=term-missing
# Test specific function
pytest test/core/parser/grammar_test.py::test_sequence_matching -v
# Run type checking
mypy src/sqlfluff/
# Format and lint
black src/ test/
ruff check --fix src/ test/
```
### Installing Dependencies
```bash
# Install main package in editable mode
pip install -e .
# Install with development dependencies
pip install -e .[dev]
# Install specific plugin
pip install -e plugins/sqlfluff-templater-dbt/
```
## Performance Considerations
### Efficient Segment Tree Traversal
```python
# ✅ Good: Use crawlers for targeted traversal
from sqlfluff.core.rules.crawlers import SegmentSeekerCrawler
crawl_behaviour = SegmentSeekerCrawler({"select_statement", "insert_statement"})
# ❌ Bad: Manual recursive traversal
def find_all_selects(segment):
results = []
if segment.type == "select_statement":
results.append(segment)
for child in segment.segments:
results.extend(find_all_selects(child))
return results
```
### Lazy Evaluation
```python
# ✅ Good: Lazy loading
from sqlfluff.core.dialects import dialect_selector
dialect = dialect_selector("tsql") # Loaded on demand
# ❌ Bad: Eager imports
from sqlfluff.dialects.dialect_tsql import tsql_dialect
```
## Debugging Tips
### Parser Debugging
```python
# Enable detailed logging
import logging
logging.basicConfig(level=logging.DEBUG)
# Use parse debugging
from sqlfluff.core import Linter
linter = Linter(dialect="tsql")
parsed = linter.parse_string("SELECT * FROM users")
print(parsed.tree.stringify()) # View parse tree
```
### Rule Debugging
```bash
# Run single rule against SQL file
sqlfluff lint test.sql --rules AL01 -v
# Show fixes without applying
sqlfluff fix test.sql --rules AL01 --diff
# Parse and show tree structure
sqlfluff parse test.sql --dialect tsql
```
## Anti-Patterns to Avoid
```python
# ❌ Don't modify segments in place
segment.raw = "NEW VALUE" # Segments are immutable!
# ✅ Use copy or LintFix
new_segment = segment.copy(raw="NEW VALUE")
# ❌ Don't import across architectural boundaries
from sqlfluff.cli import commands # In core/ module - violation!
# ✅ Respect layer separation
# core/ should not import from cli/, api/, dialects/, rules/
# ❌ Don't use bare except
try:
parse_sql(sql)
except:
pass
# ✅ Catch specific exceptions
try:
parse_sql(sql)
except SQLParseError as e:
logger.error(f"Parse failed: {e}")
# ❌ Don't use mutable default arguments
def process_segments(segments=[]): # Bug waiting to happen!
segments.append(new_segment)
# ✅ Use None and initialize
def process_segments(segments=None):
if segments is None:
segments = []
segments.append(new_segment)
```
---
**See also:**
- `src/sqlfluff/dialects/AGENTS.md` for dialect-specific development
- `test/AGENTS.md` for testing conventions and commands
- Root `AGENTS.md` for general project overview

src/sqlfluff/dialects/AGENTS.md
@@ -0,0 +1,423 @@
# Dialect Development - AI Assistant Instructions
This file provides guidance for developing and extending SQL dialect support in SQLFluff.
## Overview
SQLFluff supports 25+ SQL dialects through an inheritance-based system. Each dialect extends the ANSI base dialect and overrides specific grammar segments to match the target SQL variant's syntax.
## Dialect Architecture
### Inheritance Hierarchy
```
ANSI (base) ← All dialects inherit from here
├── T-SQL (Microsoft SQL Server)
├── PostgreSQL
│ └── Redshift (extends PostgreSQL)
├── MySQL
│ └── MariaDB (extends MySQL)
├── BigQuery
├── Snowflake
└── ... (20+ more dialects)
```
### File Organization
```
src/sqlfluff/dialects/
├── dialect_ansi.py # Base ANSI SQL dialect
├── dialect_tsql.py # T-SQL (SQL Server)
├── dialect_postgres.py # PostgreSQL
├── dialect_bigquery.py # Google BigQuery
├── dialect_snowflake.py # Snowflake
├── ...
├── dialect_ansi_keywords.py # ANSI reserved/unreserved keywords
├── dialect_tsql_keywords.py # T-SQL keywords
└── dialect_instructions/ # Per-dialect agent instructions (optional)
├── tsql.md
├── postgres.md
└── ...
```
## Creating/Extending a Dialect
### Basic Dialect Structure
```python
"""The T-SQL (Microsoft SQL Server) dialect."""
from sqlfluff.core.dialects import load_raw_dialect
from sqlfluff.core.parser import BaseSegment
from sqlfluff.core.parser.grammar import (
Sequence, OneOf, Ref, Bracketed, Delimited, AnyNumberOf, Optional
)
# Load parent dialect
ansi_dialect = load_raw_dialect("ansi")
# Create new dialect as copy
tsql_dialect = ansi_dialect.copy_as("tsql")
# Add dialect-specific keywords (in practice imported from a keywords file)
tsql_dialect.sets("reserved_keywords").update([
"CLUSTERED", "NONCLUSTERED", "ROWGUIDCOL", "TOP"
])
# Define new segments specific to T-SQL
class TopClauseSegment(BaseSegment):
"""TOP clause for T-SQL SELECT statements."""
type = "top_clause"
match_grammar = Sequence(
"TOP",
OneOf(
Ref("NumericLiteralSegment"),
Bracketed(Ref("ExpressionSegment")),
),
Sequence("PERCENT", optional=True),
Sequence("WITH", "TIES", optional=True),
)
# Override existing ANSI segments
tsql_dialect.replace(
SelectStatementSegment=Sequence(
"SELECT",
Ref("TopClauseSegment", optional=True), # T-SQL addition
Ref("SelectClauseSegment"),
Ref("FromClauseSegment", optional=True),
Ref("WhereClauseSegment", optional=True),
),
)
```
### Grammar Composition Primitives
Located in `src/sqlfluff/core/parser/grammar/`:
| Primitive | Purpose | Example |
|-----------|---------|---------|
| `Sequence()` | Ordered sequence of elements | `Sequence("SELECT", Ref("SelectClauseSegment"))` |
| `OneOf()` | Choice between alternatives | `OneOf("ASC", "DESC")` |
| `Delimited()` | Comma-separated list | `Delimited(Ref("ColumnReferenceSegment"))` |
| `AnyNumberOf()` | Zero or more repetitions | `AnyNumberOf(Ref("WhereClauseSegment"))` |
| `Bracketed()` | Content in parentheses | `Bracketed(Ref("ExpressionSegment"))` |
| `Ref()` | Reference to another segment | `Ref("TableReferenceSegment")` |
| `Optional()` | Optional element (or use `optional=True`) | `Optional(Ref("WhereClause"))` |
### Grammar Organization Patterns
#### Internal Grammar (Private Attributes with `_` prefix)
Use for grammar components specific to one statement:
```python
class CreateDatabaseStatementSegment(BaseSegment):
"""A CREATE DATABASE statement."""
# Internal grammar - only used in this segment
_filestream_option = OneOf(
Sequence("NON_TRANSACTED_ACCESS", Ref("EqualsSegment"), "OFF"),
Sequence("DIRECTORY_NAME", Ref("EqualsSegment"), Ref("QuotedLiteralSegment")),
)
_create_database_option = OneOf(
Sequence("FILESTREAM", Bracketed(Delimited(_filestream_option))),
Sequence("DEFAULT_LANGUAGE", Ref("EqualsSegment"), Ref("LanguageNameSegment")),
Sequence("DEFAULT_FULLTEXT_LANGUAGE", Ref("EqualsSegment"), Ref("LanguageNameSegment")),
)
type = "create_database_statement"
match_grammar = Sequence(
"CREATE", "DATABASE",
Ref("DatabaseReferenceSegment"),
Sequence("WITH", Delimited(_create_database_option), optional=True),
)
```
#### Shared Segments (Named Classes)
Create separate segment classes for reusable components:
```python
class FileSpecSegment(BaseSegment):
"""File specification - reusable in CREATE/ALTER statements."""
type = "file_spec"
match_grammar = Bracketed(
Sequence(
Sequence("NAME", Ref("EqualsSegment"), Ref("QuotedLiteralSegment"), optional=True),
Sequence("FILENAME", Ref("EqualsSegment"), Ref("QuotedLiteralSegment")),
Sequence("SIZE", Ref("EqualsSegment"), Ref("FileSizeSegment"), optional=True),
)
)
# Now FileSpecSegment can be used in multiple statements
class CreateDatabaseStatementSegment(BaseSegment):
match_grammar = Sequence(
"CREATE", "DATABASE",
Ref("DatabaseReferenceSegment"),
Sequence("ON", Delimited(Ref("FileSpecSegment")), optional=True),
)
class AlterDatabaseStatementSegment(BaseSegment):
match_grammar = Sequence(
"ALTER", "DATABASE",
Ref("DatabaseReferenceSegment"),
"ADD", "FILE", Ref("FileSpecSegment"),
)
```
**Decision criteria:**
- **Use `_prefix` internal grammar** when:
- Grammar is specific to one statement type
- No other segments need to reference it
- Breaking down complex `match_grammar` for readability
- **Use shared segment classes** when:
- Multiple statements use the same construct
- Construct represents a meaningful SQL element
- Other rules or segments need to `Ref()` it by name
- Semantic meaning beyond one statement
## Development Workflow
### Step 1: Create Test SQL Files
```bash
# Add SQL test cases to test/fixtures/dialects/<dialect>/
echo "SELECT TOP 10 * FROM users;" > test/fixtures/dialects/tsql/top_clause.sql
echo "CREATE CLUSTERED INDEX idx_id ON users(id);" > test/fixtures/dialects/tsql/create_index.sql
```
**Test file conventions:**
- Organize by segment type (e.g., `select_statement.sql`, `create_table.sql`, `merge_statement.sql`)
- Include multiple test cases per file covering edge cases
- Use descriptive filenames matching the segment being tested
- Test various keyword combinations, identifier formats, literal types, comments
**Example structure:**
```
test/fixtures/dialects/tsql/
├── select_top.sql # TOP clause variations
├── create_index.sql # CLUSTERED/NONCLUSTERED indexes
├── merge_statement.sql # MERGE operations
├── pivot_unpivot.sql # PIVOT/UNPIVOT queries
└── table_hints.sql # WITH (NOLOCK) etc.
```
### Step 2: Generate Expected Parse Trees
```bash
# Activate virtual environment
source .venv/bin/activate
# Generate YAML fixtures for specific dialect
python test/generate_parse_fixture_yml.py -d tsql
# Or use tox
tox -e generate-fixture-yml -- -d tsql
```
This creates `.yml` files showing the current parse tree. Initially these may show parsing failures or incorrect structures.
### Step 3: Implement Grammar
Edit `src/sqlfluff/dialects/dialect_<name>.py`:
```python
# 1. Define new segments needed
class TopClauseSegment(BaseSegment):
"""TOP clause for T-SQL."""
type = "top_clause"
match_grammar = Sequence(
"TOP",
Ref("NumericLiteralSegment"),
Sequence("PERCENT", optional=True),
)
# 2. Override parent segments
tsql_dialect.replace(
SelectStatementSegment=Sequence(
"SELECT",
Ref("TopClauseSegment", optional=True),
Ref("SelectClauseSegment"),
# ... rest of SELECT grammar
),
)
```
### Step 4: Regenerate and Verify
```bash
# Regenerate YAML to see updated parse tree
python test/generate_parse_fixture_yml.py -d tsql
# Check that parsing now works correctly
sqlfluff parse test/fixtures/dialects/tsql/top_clause.sql --dialect tsql
```
### Step 5: Run Full Test Suite
```bash
# Test just the dialect
tox -e generate-fixture-yml -- -d tsql
# Run full test suite to ensure no regressions
tox -e py312
```
## Keywords Management
### Keyword Files
Each dialect should have a keywords file: `dialect_<name>_keywords.py`
```python
"""T-SQL reserved and unreserved keywords."""
RESERVED_KEYWORDS = [
"ADD", "ALL", "ALTER", "AND", "ANY", "AS", "ASC",
"CLUSTERED", "NONCLUSTERED", "TOP", "PIVOT", "UNPIVOT",
# ... full list
]
UNRESERVED_KEYWORDS = [
"ABSOLUTE", "ACTION", "ADA", "ALIAS", "ALLOCATE",
# ... full list
]
```
In dialect file:
```python
from sqlfluff.dialects.dialect_tsql_keywords import (
RESERVED_KEYWORDS, UNRESERVED_KEYWORDS
)
tsql_dialect.sets("reserved_keywords").update(RESERVED_KEYWORDS)
tsql_dialect.sets("unreserved_keywords").update(UNRESERVED_KEYWORDS)
```
## Common Dialect Patterns
### Adding Vendor-Specific Functions
```python
class TSQLFunctionNameSegment(BaseSegment):
"""T-SQL specific function names."""
type = "function_name"
match_grammar = OneOf(
"GETDATE", "NEWID", "SCOPE_IDENTITY",
"IDENT_CURRENT", "ROWCOUNT_BIG",
# Add more T-SQL functions
)
tsql_dialect.replace(
FunctionNameSegment=OneOf(
Ref("AnsiSQLFunctionNameSegment"), # Inherit ANSI functions
Ref("TSQLFunctionNameSegment"), # Add T-SQL specific
),
)
```
### Adding Statement Types
```python
class MergeStatementSegment(BaseSegment):
"""MERGE statement (T-SQL, Oracle, etc.)."""
type = "merge_statement"
match_grammar = Sequence(
"MERGE",
Sequence("TOP", Ref("ExpressionSegment"), optional=True),
"INTO", Ref("TableReferenceSegment"),
"USING", Ref("TableReferenceSegment"),
"ON", Ref("ExpressionSegment"),
AnyNumberOf(
Sequence("WHEN", "MATCHED", "THEN", Ref("MergeActionSegment")),
Sequence("WHEN", "NOT", "MATCHED", "THEN", Ref("MergeActionSegment")),
),
)
# Add to statement grammar
tsql_dialect.replace(
StatementSegment=OneOf(
Ref("SelectStatementSegment"),
Ref("InsertStatementSegment"),
Ref("MergeStatementSegment"), # New addition
# ... other statements
),
)
```
### Adding Data Types
```python
tsql_dialect.replace(
DatatypeSegment=OneOf(
# Inherit ANSI types
Sequence("VARCHAR", Bracketed(Ref("NumericLiteralSegment"), optional=True)),
Sequence("INT"),
# Add T-SQL specific types
Sequence("NVARCHAR",
OneOf(Bracketed(Ref("NumericLiteralSegment")), "MAX", optional=True)),
Sequence("UNIQUEIDENTIFIER"),
Sequence("DATETIME2", Bracketed(Ref("NumericLiteralSegment"), optional=True)),
Sequence("HIERARCHYID"),
),
)
```
## Testing Dialect Changes
### Dialect-Specific Tests
Located in `test/dialects/<dialect>_test.py`:
```python
"""Tests specific to T-SQL dialect."""
import pytest
from sqlfluff.core import Linter
@pytest.fixture
def tsql_linter():
"""Provide T-SQL linter for tests."""
return Linter(dialect="tsql")
def test_top_clause_parsing(tsql_linter):
"""Test TOP clause in SELECT."""
sql = "SELECT TOP 10 * FROM users;"
parsed = tsql_linter.parse_string(sql)
assert parsed.tree is not None
# Find TOP clause in parse tree
top_clause = parsed.tree.find("top_clause")
assert top_clause is not None
```
### Regression Prevention
Always run the full fixture generation to ensure your changes don't break other dialects:
```bash
# Test all dialects
tox -e generate-fixture-yml
# Or specific ones that might be affected
tox -e generate-fixture-yml -- -d ansi -d postgres -d mysql
```
## Per-Dialect Agent Instructions
For complex dialects with vendor-specific quirks, see the detailed per-dialect instructions:
**T-SQL**: `src/sqlfluff/dialects/dialect_instructions/tsql.md`
---
**See also:**
- Root `AGENTS.md` for general project overview
- `src/sqlfluff/AGENTS.md` for Python coding standards
- `test/AGENTS.md` for testing conventions
- Individual `dialect_instructions/<dialect>.md` files for dialect-specific guidance

src/sqlfluff/dialects/dialect_instructions/tsql.md
@@ -0,0 +1,153 @@
# T-SQL Dialect - AI Assistant Instructions
This file provides T-SQL (Microsoft SQL Server) specific development guidance.
## T-SQL Syntax Documentation
When implementing T-SQL features, refer to:
- **Primary**: [T-SQL Reference](https://learn.microsoft.com/en-us/sql/t-sql/)
- **Syntax Conventions**: [Transact-SQL Syntax Conventions](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/transact-sql-syntax-conventions-transact-sql)
## Microsoft Docs → SQLFluff Translation
Microsoft's syntax notation maps to SQLFluff grammar as follows:
| Microsoft Notation | Meaning | SQLFluff Translation |
|-------------------|---------|---------------------|
| `UPPERCASE` | Keyword | Literal string `"UPPERCASE"` |
| *italic* | User parameter | `Ref("SegmentName")` |
| `\|` (pipe) | Choice | `OneOf(...)` |
| `[ ]` (brackets) | Optional | `optional=True` or `Ref(..., optional=True)` |
| `{ }` (braces) | Required choice | `OneOf(...)` without optional |
| `[, ...n]` | Comma-separated repetition | `Delimited(...)` |
| `[...n]` | Space-separated repetition | `AnyNumberOf(...)` |
| `;` | Statement terminator | `Ref("SemicolonSegment")` |
| `<label> ::=` | Named syntax block | Define as separate segment class |
**Example:**
```
Microsoft Docs:
CREATE TABLE <table_name>
(
<column_definition> [, ...n]
)
[ WITH ( <table_option> [, ...n] ) ]
SQLFluff:
class CreateTableStatementSegment(BaseSegment):
type = "create_table_statement"
match_grammar = Sequence(
"CREATE", "TABLE",
Ref("TableReferenceSegment"),
Bracketed(
Delimited(Ref("ColumnDefinitionSegment"))
),
Sequence(
"WITH",
Bracketed(Delimited(Ref("TableOptionSegment"))),
optional=True,
),
)
```
## Known Edge Cases
### Quoted Identifiers
T-SQL supports:
- Square brackets: `[column name]`, `[table].[column]`
- Double quotes: `"column name"` (when `QUOTED_IDENTIFIER` is ON)
Square brackets are the standard T-SQL approach.
### String Literals
- Single quotes: `'string value'`
- Escaped quotes: `'It''s a string'` (double single quote)
- Unicode prefix: `N'Unicode string'`
### Multi-part Identifiers
T-SQL supports up to 4-part names:
- `[server].[database].[schema].[object]`
- `[database].[schema].[table]`
- `[schema].[table]`
- `[table]`
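Combining these forms in one illustrative query:
```sql
-- Bracketed identifiers, a Unicode literal, an escaped quote,
-- and a three-part table name.
SELECT [u].[user name],
       N'Unicode string',
       'It''s a string'
FROM [mydb].[dbo].[users] AS [u];
```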
### SET Statements
T-SQL uses many SET statements for session configuration:
```sql
SET NOCOUNT ON
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
```
## Testing T-SQL Features
### Test File Locations
```
test/fixtures/dialects/tsql/
├── select_top.sql
├── merge_statement.sql
├── pivot_unpivot.sql
├── table_hints.sql
├── output_clause.sql
├── cte_multiple.sql
└── create_index_clustered.sql
```
### Running T-SQL Tests
```bash
# Generate all T-SQL fixtures
python test/generate_parse_fixture_yml.py -d tsql
# Run T-SQL dialect tests
tox -e py312 -- test/dialects/tsql_test.py
# Parse single file
sqlfluff parse test/fixtures/dialects/tsql/select_top.sql --dialect tsql
```
## Common Implementation Tasks
### Adding New T-SQL Function
1. Check if function exists in Microsoft docs
2. Add to T-SQL function list in dialect file
3. Create test case in appropriate test file
4. Verify parsing
### Adding New Statement Type
1. Study Microsoft docs syntax
2. Create segment class with `match_grammar`
3. Add to `StatementSegment` via `.replace()`
4. Create comprehensive test cases
5. Regenerate fixtures
### Fixing Parsing Issue
1. Identify failing SQL in test fixtures
2. Run `sqlfluff parse <file> --dialect tsql` to see error
3. Examine parse tree output
4. Adjust grammar in dialect file
5. Regenerate and verify
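A typical debug loop using the commands above:
```bash
# Inspect the current (failing) parse tree
sqlfluff parse test/fixtures/dialects/tsql/select_top.sql --dialect tsql

# After adjusting the grammar, regenerate the expected YAML
python test/generate_parse_fixture_yml.py -d tsql
```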
## Resources
- [T-SQL Language Reference](https://learn.microsoft.com/en-us/sql/t-sql/language-reference-database-engine)
- [T-SQL Statements](https://learn.microsoft.com/en-us/sql/t-sql/statements/statements)
- [T-SQL Functions](https://learn.microsoft.com/en-us/sql/t-sql/functions/functions)
- [T-SQL Data Types](https://learn.microsoft.com/en-us/sql/t-sql/data-types/data-types-transact-sql)
---
**See also:**
- `src/sqlfluff/dialects/AGENTS.md` for general dialect development
- Root `AGENTS.md` for project overview

test/AGENTS.md
@@ -0,0 +1,498 @@
# Testing - AI Assistant Instructions
This file provides testing guidelines for SQLFluff development.
## Testing Philosophy
SQLFluff uses comprehensive testing with:
- **High coverage requirements**: Changes should reach 100% coverage
- **Multiple test types**: Unit tests, integration tests, fixture-based tests
- **Automated verification**: All tests run via tox and CI/CD
## Test Organization
```
test/
├── conftest.py # Shared pytest fixtures
├── api/ # API tests
│ ├── simple_test.py
│ └── classes_test.py
├── cli/ # CLI tests
│ ├── commands_test.py
│ └── formatters_test.py
├── core/ # Core component tests
│ ├── parser/
│ │ ├── grammar_test.py
│ │ └── segments_test.py
│ └── rules/
│ └── base_test.py
├── dialects/ # Dialect parsing tests
│ ├── ansi_test.py
│ ├── tsql_test.py
│ └── postgres_test.py
├── rules/ # Rule testing
│ ├── yaml_test_cases_test.py # YAML-based rule tests
│ └── std_fix_auto_test.py # Auto-fix integration tests
└── fixtures/ # Test data
├── dialects/
│ ├── ansi/
│ │ ├── select_statement.sql
│ │ └── select_statement.yml
│ └── tsql/
│ ├── select_top.sql
│ └── select_top.yml
└── rules/
└── std_rule_cases/
├── aliasing.yml
└── layout.yml
```
## Test Frameworks
### Pytest
Primary test framework for all Python tests.
**Key features:**
- Test discovery: `*_test.py` files
- Fixtures: Reusable test setup in `conftest.py`
- Markers: Categorize tests (`@pytest.mark.dbt`, `@pytest.mark.integration`)
- Parametrization: Run same test with different inputs
**Basic test structure:**
```python
import pytest
from sqlfluff.core import Linter
def test_simple_parsing():
"""Test basic SQL parsing."""
linter = Linter(dialect="ansi")
result = linter.parse_string("SELECT * FROM users")
assert result.tree is not None
assert result.violations == []
```
### Fixtures (Pytest)
**Common fixtures** in `conftest.py`:
- `default_config`: Default SQLFluff configuration
- `fresh_ansi_dialect`: Clean ANSI dialect instance
- `caplog`: Capture log output
**Using fixtures:**
```python
@pytest.fixture
def tsql_linter():
"""Provide T-SQL linter for tests."""
return Linter(dialect="tsql")
def test_with_fixture(tsql_linter):
"""Test using fixture."""
result = tsql_linter.parse_string("SELECT TOP 10 * FROM users")
assert result.tree is not None
```
### Test Markers
**Built-in markers:**
```python
@pytest.mark.dbt
def test_dbt_templater():
"""Test requiring dbt installation."""
pass
@pytest.mark.integration
def test_full_workflow():
"""Integration test spanning multiple components."""
pass
@pytest.mark.parametrize("sql,expected", [
("SELECT * FROM t", True),
("SELECT", False),
])
def test_multiple_cases(sql, expected):
"""Test with multiple inputs."""
result = is_valid_sql(sql)
assert result == expected
```
## Dialect Testing
### SQL Fixture Files
Located in `test/fixtures/dialects/<dialect>/`:
```sql
-- test/fixtures/dialects/tsql/select_top.sql
SELECT TOP 10 * FROM users;
SELECT TOP (10) PERCENT * FROM products;
SELECT TOP 5 WITH TIES * FROM orders ORDER BY total_amount DESC;
```
**Best practices:**
- One file per segment type or feature
- Multiple test cases per file covering variations
- Use descriptive filenames
- Include comments explaining edge cases
### YAML Expected Outputs
Generated automatically by `generate_parse_fixture_yml.py`:
```yaml
# test/fixtures/dialects/tsql/select_top.yml
- file:
statement:
- select_statement:
- keyword: SELECT
- top_clause:
- keyword: TOP
- numeric_literal: '10'
- whitespace: ' '
- select_clause_element:
- wildcard_expression:
- wildcard_identifier:
- star: '*'
# ... rest of parse tree
```
**Workflow:**
1. Create `.sql` file with test cases
2. Run `python test/generate_parse_fixture_yml.py -d <dialect>`
3. Script generates `.yml` with current parse tree
4. Review `.yml` to verify correctness
5. Commit both `.sql` and `.yml` files
### Generating Fixtures
```bash
# Activate environment
source .venv/bin/activate
# Generate for specific dialect
python test/generate_parse_fixture_yml.py -d tsql
# Generate for all dialects (slow!)
python test/generate_parse_fixture_yml.py
# Using tox
tox -e generate-fixture-yml -- -d tsql
```
### Dialect Test Files
Beyond fixtures, write explicit tests in `test/dialects/<dialect>_test.py`:
```python
"""Tests specific to T-SQL dialect."""
import pytest
from sqlfluff.core import Linter
class TestTSQLDialect:
"""T-SQL dialect tests."""
@pytest.fixture
def linter(self):
"""Provide T-SQL linter."""
return Linter(dialect="tsql")
def test_top_clause(self, linter):
"""Test TOP clause parsing."""
sql = "SELECT TOP 10 * FROM users"
result = linter.parse_string(sql)
# Verify parsing succeeded
assert result.tree is not None
# Find TOP clause in tree
top_clause = result.tree.get_child("top_clause")
assert top_clause is not None
def test_table_hint(self, linter):
"""Test table hint WITH (NOLOCK)."""
sql = "SELECT * FROM users WITH (NOLOCK)"
result = linter.parse_string(sql)
assert result.tree is not None
hints = result.tree.get_child("table_hint")
assert hints is not None
```
## Rule Testing
### YAML Test Cases
Primary method for testing rules. Located in `test/fixtures/rules/std_rule_cases/`:
```yaml
# test/fixtures/rules/std_rule_cases/aliasing.yml
rule: AL01
test_implicit_alias_fail:
fail_str: SELECT * FROM users u
test_explicit_alias_pass:
pass_str: SELECT * FROM users AS u
test_implicit_alias_fix:
fail_str: SELECT * FROM users u
fix_str: SELECT * FROM users AS u
test_with_config:
fail_str: SELECT * FROM users AS u
configs:
rules:
aliasing.table:
aliasing: implicit
```
**YAML structure:**
- `rule`: Rule code being tested
- `test_*`: Test case name (descriptive)
- `fail_str`: SQL that should fail the rule
- `pass_str`: SQL that should pass the rule
- `fix_str`: Expected SQL after auto-fix (optional)
- `configs`: Override configuration (optional)
### Running Rule Tests
```bash
# Test specific rule
tox -e py312 -- test/rules/yaml_test_cases_test.py -k AL01
# Test all rules
tox -e py312 -- test/rules/yaml_test_cases_test.py
# Test auto-fixing
tox -e py312 -- test/rules/std_fix_auto_test.py
# Direct pytest (faster during development)
pytest test/rules/yaml_test_cases_test.py -k AL01 -v
```
### Rule Unit Tests
For complex rule logic, write explicit tests:
```python
"""Tests for Rule AL01."""
import pytest
from sqlfluff.core.rules import RuleContext
from sqlfluff.rules.aliasing.AL01 import Rule_AL01
class TestRuleAL01:
"""Tests for implicit alias rule."""
def test_implicit_alias_detected(self):
"""Test that implicit alias is detected."""
rule = Rule_AL01()
# Create test context and segment
# ... test implementation
result = rule._eval(context)
assert result is not None
assert "implicit" in result.description.lower()
```
## Coverage Testing
### Running with Coverage
```bash
# Coverage for specific module
pytest test/core/parser/ --cov=src/sqlfluff/core/parser --cov-report=term-missing
# Coverage for rules (shows uncovered lines)
pytest test/rules/ --cov=src/sqlfluff/rules --cov-report=term-missing:skip-covered
# Full coverage report
pytest test/ --cov=src/sqlfluff --cov-report=term-missing
# HTML coverage report (creates htmlcov/ directory)
pytest test/ --cov=src/sqlfluff --cov-report=html
open htmlcov/index.html
# Using tox
tox -e cov-init,py312,cov-report
```
### Coverage Requirements
- New code should have high test coverage (100%)
- Changes should not decrease overall coverage
- Critical paths (parser, rules) require comprehensive coverage
## Test Commands Reference
### Quick Testing During Development
```bash
# Single test file
pytest test/core/parser/grammar_test.py -v
# Single test function
pytest test/core/parser/grammar_test.py::test_sequence_matching -v
# Tests matching pattern
pytest test/rules/ -k AL01 -v
# Specific dialect fixtures
python test/generate_parse_fixture_yml.py -d tsql
# Run and stop on first failure
pytest test/core/ -x
# Show print statements
pytest test/core/ -s
# Verbose output with captured logs
pytest test/core/ -v --log-cli-level=DEBUG
```
### Full Test Suite
```bash
# Run all tests for Python 3.12
tox -e py312
# Run with coverage
tox -e cov-init,py312,cov-report
# Run linting and type checking
tox -e linting,mypy
# Full suite (all Python versions, linting, type checking)
tox
```
### Test-Driven Development Workflow
1. **Write failing test:**
```python
def test_new_feature():
"""Test new feature."""
result = new_feature("input")
assert result == "expected"
```
2. **Run test to confirm failure:**
```bash
pytest test/core/new_feature_test.py::test_new_feature -v
```
3. **Implement feature**
4. **Run test to confirm success:**
```bash
pytest test/core/new_feature_test.py::test_new_feature -v
```
5. **Run broader tests to check for regressions:**
```bash
pytest test/core/ -v
```
6. **Check coverage:**
```bash
pytest test/core/ --cov=src/sqlfluff/core --cov-report=term-missing
```
## Test Data Management
### SQL Test Files
**Location**: `test/fixtures/dialects/<dialect>/*.sql`
**Guidelines:**
- Descriptive filenames: `select_top.sql`, `merge_statement.sql`
- Multiple test cases per file
- Include edge cases and variations
- Add comments for complex cases
**Example:**
```sql
-- test/fixtures/dialects/tsql/select_top.sql
-- Basic TOP clause
SELECT TOP 10 * FROM users;
-- TOP with parentheses
SELECT TOP (10) * FROM users;
-- TOP with PERCENT
SELECT TOP 10 PERCENT * FROM users;
-- TOP with WITH TIES (requires ORDER BY)
SELECT TOP 5 WITH TIES * FROM orders ORDER BY amount DESC;
```
### YAML Expected Outputs
**Generated automatically** - do not edit manually unless absolutely necessary.
**Regenerate after grammar changes:**
```bash
python test/generate_parse_fixture_yml.py -d <dialect>
```
### Rule Test YAML Files
**Location**: `test/fixtures/rules/std_rule_cases/<category>.yml`
**Categories:**
- `aliasing.yml`: Aliasing rules (AL*)
- `layout.yml`: Layout rules (LT*)
- `capitalisation.yml`: Capitalisation rules (CP*)
- `convention.yml`: Convention rules (CV*)
- `structure.yml`: Structure rules (ST*)
- `references.yml`: Reference rules (RF*)
## Common Testing Patterns
### Testing Exceptions
```python
import pytest
from sqlfluff.core.errors import SQLParseError
def test_invalid_sql_raises():
"""Test that invalid SQL raises error."""
with pytest.raises(SQLParseError):
parse_invalid_sql("SELECT * FROM")
```
### Parametrized Tests
```python
@pytest.mark.parametrize("sql,expected_type", [
("SELECT * FROM users", "select_statement"),
("INSERT INTO users VALUES (1)", "insert_statement"),
("UPDATE users SET name = 'x'", "update_statement"),
])
def test_statement_types(sql, expected_type):
"""Test various statement types."""
result = parse_sql(sql)
assert result.tree.type == expected_type
```
### Fixture Parametrization
```python
@pytest.fixture(params=["ansi", "tsql", "postgres"])
def dialect_linter(request):
"""Provide linter for multiple dialects."""
return Linter(dialect=request.param)
def test_across_dialects(dialect_linter):
"""Test behavior across multiple dialects."""
result = dialect_linter.parse_string("SELECT * FROM users")
assert result.tree is not None
```
---
**See also:**
- Root `AGENTS.md` for general project overview
- `src/sqlfluff/AGENTS.md` for Python coding standards
- `src/sqlfluff/dialects/AGENTS.md` for dialect development and testing