.. _developingrulesref:

Developing Rules
================

`Rules` in `SQLFluff` are implemented as classes inheriting from ``BaseRule``.
SQLFluff crawls through the parse tree of a SQL file, calling the rule's
``_eval()`` function for each segment in the tree. For many rules, this allows
the rule code to be really streamlined and only contain the logic for the rule
itself, with all the other mechanics abstracted away.

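As a rough illustration of this traversal, the following self-contained sketch
models the idea with stand-in classes. ``Segment``, ``ToyRule`` and ``crawl``
are hypothetical names invented for this example and are not part of
SQLFluff's API; a real rule would inherit from ``BaseRule`` instead.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List, Optional


    @dataclass
    class Segment:
        """Hypothetical stand-in for a node in the parse tree."""

        seg_type: str
        raw: str = ""
        children: List["Segment"] = field(default_factory=list)


    class ToyRule:
        """Hypothetical rule: only the per-segment logic lives here."""

        def _eval(self, segment: Segment) -> Optional[str]:
            # Flag keywords that are not written in upper case.
            if segment.seg_type == "keyword" and segment.raw != segment.raw.upper():
                return f"Keyword {segment.raw!r} should be upper case."
            return None


    def crawl(rule: ToyRule, segment: Segment) -> List[str]:
        """Call rule._eval() on the segment, then on each descendant."""
        results = []
        result = rule._eval(segment)
        if result is not None:
            results.append(result)
        for child in segment.children:
            results.extend(crawl(rule, child))
        return results


    tree = Segment("file", children=[
        Segment("statement", children=[
            Segment("keyword", "select"),
            Segment("identifier", "a"),
            Segment("keyword", "FROM"),
            Segment("identifier", "tbl"),
        ]),
    ])
    violations = crawl(ToyRule(), tree)
    print(violations)

The rule itself never walks the tree; the crawler does that and hands it one
segment at a time, which is the separation of concerns described above.
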
Running Tests
-------------

The majority of the test cases for most bundled rules are *"yaml test cases"*,
i.e. test cases defined in `yaml`_ files. You can find those `yaml fixtures on github`_.
While this provides a very simple way to *write* tests, it can occasionally be tedious
to *run* specific tests.

Within either a `tox` environment or `virtualenv` (as described in the `contributing.md`_
file), you can either run all of the rule yaml tests with:

.. code-block:: sh

    pytest test/rules/yaml_test_cases_test.py -vv

...or, to run only the tests for a specific rule, use either of these two
forms:

.. code-block:: sh

    pytest -vv test/rules/yaml_test_cases_test.py --rule_id=RF01
    pytest -vv test/rules/ -k RF01

The :code:`--rule_id` syntax relies on the name of the yaml file, and the :code:`-k`
option simply selects every test whose name contains the argument.
The latter is slightly less to type and so is generally the more frequently used.

.. _`yaml`: https://yaml.org/
.. _`yaml fixtures on github`: https://github.com/sqlfluff/sqlfluff/tree/main/test/fixtures/rules/std_rule_cases
.. _`contributing.md`: https://github.com/sqlfluff/sqlfluff/blob/main/CONTRIBUTING.md

Traversal Options
-----------------

``recurse_into``
^^^^^^^^^^^^^^^^
Some rules are a poor fit for the simple traversal pattern described above.
Typical reasons include:

* The rule only looks at a small portion of the file (e.g. the beginning or
  end).
* The rule needs to traverse the parse tree in a non-standard way.

These rules can override ``BaseRule``'s ``recurse_into`` field, setting it to
``False``. For these rules, ``_eval()`` is only called *once*, with
the root segment of the tree. This can be much more efficient, especially on
large files. For example, see rules ``LT13`` and ``LT12``, which only look at
the beginning or end of the file, respectively.

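The effect of the flag can be sketched with a toy crawler. Everything here
(``Segment``, ``RecordingRule``, ``RootOnlyRule``, ``crawl``) is a hypothetical
stand-in written for this example, not SQLFluff's implementation; only the
``recurse_into`` and ``_eval()`` names come from the text above.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class Segment:
        """Hypothetical stand-in for a parse tree node."""

        seg_type: str
        children: List["Segment"] = field(default_factory=list)


    class RecordingRule:
        """Toy rule that records the type of every segment it is given."""

        recurse_into = True  # default: the crawler visits every segment

        def __init__(self) -> None:
            self.seen: List[str] = []

        def _eval(self, segment: Segment) -> None:
            self.seen.append(segment.seg_type)


    class RootOnlyRule(RecordingRule):
        """Same toy rule, but opting out of the recursive traversal."""

        recurse_into = False


    def crawl(rule: RecordingRule, segment: Segment) -> None:
        """Evaluate the segment; descend only if the rule asks for it."""
        rule._eval(segment)
        if rule.recurse_into:
            for child in segment.children:
                crawl(rule, child)


    tree = Segment("file", [
        Segment("statement", [Segment("keyword"), Segment("identifier")]),
        Segment("newline"),
    ])
    every_segment = RecordingRule()
    root_only = RootOnlyRule()
    crawl(every_segment, tree)
    crawl(root_only, tree)
    print(every_segment.seen)
    print(root_only.seen)

In this sketch the root-only rule receives a single call and is free to inspect
``segment.children`` itself in whatever order it needs.
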
``_works_on_unparsable``
^^^^^^^^^^^^^^^^^^^^^^^^
By default, `SQLFluff` calls ``_eval()`` for all segments, even "unparsable"
segments, i.e. segments that didn't match the parsing rules in the dialect.
This causes issues for some rules. For those rules, setting ``_works_on_unparsable``
to ``False`` tells SQLFluff not to call ``_eval()`` for unparsable segments and
their descendants.

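A minimal sketch of that skipping behaviour, again using hypothetical stand-in
classes rather than SQLFluff's own: the toy crawler below assumes unparsable
sections are marked with a segment type of ``"unparsable"`` and prunes the
whole subtree when the rule opts out.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class Segment:
        """Hypothetical stand-in for a parse tree node."""

        seg_type: str
        children: List["Segment"] = field(default_factory=list)


    class RecordingRule:
        """Toy rule that records the type of every segment it is given."""

        _works_on_unparsable = True  # default: unparsable segments are visited

        def __init__(self) -> None:
            self.seen: List[str] = []

        def _eval(self, segment: Segment) -> None:
            self.seen.append(segment.seg_type)


    class ParsableOnlyRule(RecordingRule):
        _works_on_unparsable = False


    def crawl(rule: RecordingRule, segment: Segment) -> None:
        """Visit the tree, pruning unparsable subtrees if the rule opts out."""
        if segment.seg_type == "unparsable" and not rule._works_on_unparsable:
            return  # skips this segment and all of its descendants
        rule._eval(segment)
        for child in segment.children:
            crawl(rule, child)


    tree = Segment("file", [
        Segment("statement", [Segment("keyword")]),
        Segment("unparsable", [Segment("raw"), Segment("raw")]),
    ])
    default_rule = RecordingRule()
    parsable_only = ParsableOnlyRule()
    crawl(default_rule, tree)
    crawl(parsable_only, tree)
    print(default_rule.seen)
    print(parsable_only.seen)
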
Performance-related Options
---------------------------
These are other fields on ``BaseRule``. Rules can override them.

``needs_raw_stack``
^^^^^^^^^^^^^^^^^^^
``needs_raw_stack`` defaults to ``False``. Some rules use the
``RuleContext.raw_stack`` property to access earlier segments in the traversal.
This can be useful, but it adds significant overhead to the linting process.
For this reason, it is disabled by default.

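To show where the overhead comes from, here is a toy model in which the crawler
only accumulates the previously visited leaf segments when the rule asks for
them. The classes, the ``crawl`` function and the extra ``raw_stack`` argument
to ``_eval()`` are assumptions made for this sketch; in SQLFluff the stack is
reached through ``RuleContext.raw_stack`` as described above.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List, Tuple


    @dataclass
    class Segment:
        """Hypothetical stand-in for a parse tree node."""

        raw: str = ""
        children: List["Segment"] = field(default_factory=list)


    class AfterCommaRule:
        """Toy rule that records leaf segments directly preceded by a comma."""

        needs_raw_stack = True

        def __init__(self) -> None:
            self.hits: List[str] = []

        def _eval(self, segment: Segment, raw_stack: Tuple[Segment, ...]) -> None:
            is_leaf = not segment.children
            if is_leaf and raw_stack and raw_stack[-1].raw == ",":
                self.hits.append(segment.raw)


    class NoStackRule(AfterCommaRule):
        needs_raw_stack = False


    def crawl(rule, segment, raw_stack=()):
        """Visit the tree and return the stack of leaves seen so far."""
        rule._eval(segment, raw_stack)
        if not segment.children:
            if rule.needs_raw_stack:
                # Building a new tuple per leaf is the extra cost being modelled.
                return raw_stack + (segment,)
            return raw_stack
        for child in segment.children:
            raw_stack = crawl(rule, child, raw_stack)
        return raw_stack


    tree = Segment(children=[Segment("a"), Segment(","), Segment("b")])
    with_stack = AfterCommaRule()
    without_stack = NoStackRule()
    crawl(with_stack, tree)
    crawl(without_stack, tree)
    print(with_stack.hits)
    print(without_stack.hits)

The second rule never sees any earlier segments because the stack stays empty,
which mirrors why a rule that relies on the stack has to opt in.
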
``lint_phase``
^^^^^^^^^^^^^^
There are two phases of rule running.

1. The ``main`` phase is appropriate for most rules. These rules are assumed to
   interact and potentially cause a cascade of fixes requiring multiple passes.
   These rules run up to ``runaway_limit`` times (default 10).

2. The ``post`` phase is for post-processing rules, not expected to trigger
   any downstream rules, e.g. capitalization fixes. They are run in a
   post-processing loop at the end. This loop is identical to the ``main`` loop,
   but is only run 2 times at the end (once to fix, and once again to confirm no
   remaining issues).

The two phases add complexity, but they also improve performance by allowing
SQLFluff to run fewer rules during the ``main`` phase, which often runs several
times.

NOTE: ``post`` rules also run on the *first* pass of the ``main`` phase so that
any issues they find will be presented in the list of issues output by
``sqlfluff fix`` and ``sqlfluff lint``.

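The two loops can be sketched roughly as follows. This is a toy model under
stated assumptions, not SQLFluff's linter: rules are represented as
hypothetical ``(name, lint_phase, fix_function)`` tuples operating on plain
strings, ``run_phase`` and ``fix_text`` are invented names, and the reporting
detail from the note above is left out.

.. code-block:: python

    from typing import Callable, List, Tuple

    # Hypothetical rule representation: (name, lint_phase, fix function).
    Rule = Tuple[str, str, Callable[[str], str]]


    def run_phase(text: str, rules: List[Rule], loop_limit: int) -> Tuple[str, int]:
        """Apply the rules repeatedly until stable or loop_limit is reached."""
        passes = 0
        for _ in range(loop_limit):
            passes += 1
            before = text
            for _name, _phase, fix in rules:
                text = fix(text)
            if text == before:
                break  # nothing changed on this pass, so the phase is done
        return text, passes


    def fix_text(text: str, rules: List[Rule], runaway_limit: int = 10):
        """Run the main loop first, then a short post-processing loop."""
        main_rules = [rule for rule in rules if rule[1] == "main"]
        post_rules = [rule for rule in rules if rule[1] == "post"]
        text, main_passes = run_phase(text, main_rules, runaway_limit)
        # Once to fix, once more to confirm nothing is left.
        text, post_passes = run_phase(text, post_rules, 2)
        return text, main_passes, post_passes


    rules: List[Rule] = [
        # Needs several passes: each pass only halves a run of spaces.
        ("collapse_spaces", "main", lambda s: s.replace("  ", " ")),
        # Not expected to trigger other rules, so it runs in the post phase.
        ("upper_select", "post", lambda s: s.replace("select", "SELECT")),
    ]
    result = fix_text("select    a", rules)
    print(result)

Keeping the capitalization rule out of the ``main`` loop means it is not
re-run on each of the repeated whitespace passes, which is the performance
benefit described above.
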