Merge pull request #2224 from opensafely-core/debug
Add a debug ehrQL command
rebkwok authored Nov 15, 2024
2 parents 97b8a3d + 9b6f5c5 commit 74d59f8
Showing 17 changed files with 1,231 additions and 51 deletions.
102 changes: 101 additions & 1 deletion docs/includes/generated_docs/cli.md
@@ -89,6 +89,14 @@ Internal command for testing code isolation support.
Output the dataset definition's query graph
</p>

<div class="attr-heading">
<a href="#debug"><tt>debug</tt></a>
</div>
<p class="indent">
Internal command for getting debugging information from a dataset
definition; used by the [OpenSAFELY VSCode extension][opensafely-vscode].
</p>

</div>

<div class="attr-heading" id="ehrql.help">
@@ -638,6 +646,7 @@ Database connection string.
```
ehrql serialize-definition DEFINITION_FILE [--help]
      [--definition-type DEFINITION_TYPE] [--output OUTPUT_FILE]
      [--dummy-tables DUMMY_TABLES_PATH] [--display-format RENDER_FORMAT]
      [ -- ... PARAMETERS ...]
```
Internal command for serializing a definition file to a JSON representation.
@@ -667,7 +676,7 @@ show this help message and exit
<a class="headerlink" href="#serialize-definition.definition-type" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
-Options: `dataset`, `measures`, `test`
+Options: `dataset`, `measures`, `test`, `debug`

</div>

@@ -680,6 +689,29 @@ Output file path (stdout by default)

</div>

<div class="attr-heading" id="serialize-definition.dummy-tables">
<tt>--dummy-tables DUMMY_TABLES_PATH</tt>
<a class="headerlink" href="#serialize-definition.dummy-tables" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
Path to directory of files (one per table) to use as dummy tables
(see [`create-dummy-tables`](#create-dummy-tables)).

Files may be in any supported format: `.arrow`, `.csv`, `.csv.gz`

This argument is ignored when running against real tables.

</div>

<div class="attr-heading" id="serialize-definition.display-format">
<tt>--display-format RENDER_FORMAT</tt>
<a class="headerlink" href="#serialize-definition.display-format" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
Render format for debug command, default ascii

</div>

<div class="attr-heading" id="serialize-definition.user_args">
<tt>PARAMETERS</tt>
<a class="headerlink" href="#serialize-definition.user_args" title="Permanent link">🔗</a>
@@ -758,4 +790,72 @@ supplied after all ehrQL arguments and separated from the ehrQL arguments with a
double-dash ` -- `.


</div>


<h2 id="debug" data-toc-label="debug" markdown>
debug
</h2>
```
ehrql debug DEFINITION_FILE [--help] [--dummy-tables DUMMY_TABLES_PATH]
      [--display-format RENDER_FORMAT] [ -- ... PARAMETERS ...]
```
Internal command for getting debugging information from a dataset
definition; used by the [OpenSAFELY VSCode extension][opensafely-vscode].

Note that **this is an internal command** and not intended for end users.

[opensafely-vscode]: https://marketplace.visualstudio.com/items?itemName=bennettoxford.opensafely

<div class="attr-heading" id="debug.definition_file">
<tt>DEFINITION_FILE</tt>
<a class="headerlink" href="#debug.definition_file" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
Path of the Python file where the dataset is defined.

</div>

<div class="attr-heading" id="debug.help">
<tt>-h, --help</tt>
<a class="headerlink" href="#debug.help" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
show this help message and exit

</div>

<div class="attr-heading" id="debug.dummy-tables">
<tt>--dummy-tables DUMMY_TABLES_PATH</tt>
<a class="headerlink" href="#debug.dummy-tables" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
Path to directory of files (one per table) to use as dummy tables
(see [`create-dummy-tables`](#create-dummy-tables)).

Files may be in any supported format: `.arrow`, `.csv`, `.csv.gz`

This argument is ignored when running against real tables.

</div>

<div class="attr-heading" id="debug.display-format">
<tt>--display-format RENDER_FORMAT</tt>
<a class="headerlink" href="#debug.display-format" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
Render format for debug command, default ascii

</div>

<div class="attr-heading" id="debug.user_args">
<tt>PARAMETERS</tt>
<a class="headerlink" href="#debug.user_args" title="Permanent link">🔗</a>
</div>
<div markdown="block" class="indent">
Parameters are extra arguments you can pass to your Python definition file. They must be
supplied after all ehrQL arguments and separated from the ehrQL arguments with a
double-dash ` -- `.


</div>
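
The usage line for `debug` fixes the argument order: ehrQL's own options come first, and any user parameters follow a ` -- ` separator. A minimal sketch of assembling such a command line from Python, for instance from editor tooling. `build_debug_command`, the example paths, and the `--year` parameter are hypothetical names introduced here for illustration; they are not part of ehrQL.

```python
from pathlib import Path


def build_debug_command(
    definition_file,
    dummy_tables=None,
    display_format=None,
    parameters=(),
):
    """Return an argv list for `ehrql debug` (hypothetical helper, not an ehrQL API)."""
    argv = ["ehrql", "debug", str(Path(definition_file))]
    if dummy_tables is not None:
        argv += ["--dummy-tables", str(Path(dummy_tables))]
    if display_format is not None:
        argv += ["--display-format", display_format]
    if parameters:
        # User parameters must come after all ehrQL arguments, separated by "--".
        argv += ["--", *parameters]
    return argv


command = build_debug_command(
    "dataset_definition.py",
    dummy_tables="dummy_tables",
    display_format="ascii",
    # "--year 2024" is a made-up parameter for the definition file itself.
    parameters=["--year", "2024"],
)
print(" ".join(command))
```

In an environment where ehrQL is installed, a list built this way could be handed to `subprocess.run`. Keeping it as a list rather than a shell string avoids quoting problems with paths that contain spaces.
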
2 changes: 2 additions & 0 deletions ehrql/__init__.py
@@ -1,6 +1,7 @@
from pathlib import Path

from ehrql.codes import codelist_from_csv
from ehrql.debug import show
from ehrql.measures import INTERVAL, Measures, create_measures
from ehrql.query_language import (
    Dataset,
@@ -34,6 +35,7 @@
    "maximum_of",
    "minimum_of",
    "months",
    "show",
    "weeks",
    "when",
    "years",
52 changes: 52 additions & 0 deletions ehrql/__main__.py
@@ -15,11 +15,13 @@
    split_directory_and_extension,
)
from ehrql.loaders import DEFINITION_LOADERS, DefinitionError
from ehrql.renderers import DISPLAY_RENDERERS
from ehrql.utils.string_utils import strip_indent

from .main import (
    assure,
    create_dummy_tables,
    debug_dataset_definition,
    dump_dataset_sql,
    dump_example_data,
    generate_dataset,
@@ -161,6 +163,7 @@ def show_help(**kwargs):
    add_serialize_definition(subparsers, environ, user_args)
    add_isolation_report(subparsers, environ, user_args)
    add_graph_query(subparsers, environ, user_args)
    add_debug_dataset_definition(subparsers, environ, user_args)

    return parser

@@ -375,6 +378,29 @@ def add_run_sandbox(subparsers, environ, user_args):
    )


def add_debug_dataset_definition(subparsers, environ, user_args):
    parser = subparsers.add_parser(
        "debug",
        help=strip_indent(
            """
            Internal command for getting debugging information from a dataset
            definition; used by the [OpenSAFELY VSCode extension][opensafely-vscode].

            Note that **this is an internal command** and not intended for end users.

            [opensafely-vscode]: https://marketplace.visualstudio.com/items?itemName=bennettoxford.opensafely
            """
        ),
        formatter_class=RawTextHelpFormatter,
    )
    parser.set_defaults(function=debug_dataset_definition)
    parser.set_defaults(environ=environ)
    parser.set_defaults(user_args=user_args)
    add_dataset_definition_file_argument(parser, environ)
    add_dummy_tables_argument(parser, environ)
    add_display_renderer_argument(parser, environ)


def add_assure(subparsers, environ, user_args):
    parser = subparsers.add_parser(
        "assure",
@@ -475,6 +501,8 @@ def add_serialize_definition(subparsers, environ, user_args):
        type=existing_python_file,
        metavar="definition_file",
    )
    add_dummy_tables_argument(parser, environ)
    add_display_renderer_argument(parser, environ)


def add_isolation_report(subparsers, environ, user_args):
@@ -589,6 +617,30 @@ def add_backend_argument(parser, environ):
    )


def add_display_renderer_argument(parser, environ):
    parser.add_argument(
        "--display-format",
        help=strip_indent(
            """
            Render format for debug command, default ascii
            """
        ),
        dest="render_format",
        default="ascii",
        type=renderer,
    )


def renderer(value):
    if value not in DISPLAY_RENDERERS:
        raise ArgumentTypeError(
            f"'{value}' is not a supported display format, "
            f"must be one of: "
            f"{backtick_join((renderer_format) for renderer_format in DISPLAY_RENDERERS)}"
        )
    return value
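
The `renderer` validator relies on argparse's `type=` hook: the callable receives the raw string and either returns a value or raises `ArgumentTypeError`, which argparse reports as a usage error. Below is a self-contained sketch of the same pattern. The contents of `DISPLAY_RENDERERS` and the `backtick_join` stand-in are assumptions for illustration, since neither definition appears in this diff; only the `ascii` default is confirmed by it.

```python
import argparse

# Assumed registry: only "ascii" is confirmed by the diff (it is the default);
# "html" is a placeholder entry for illustration.
DISPLAY_RENDERERS = {"ascii": str, "html": str}


def backtick_join(strings):
    # Stand-in for ehrQL's helper of the same name (behaviour assumed).
    return ", ".join(f"`{s}`" for s in strings)


def renderer(value):
    # Reject any format name that is not a key of the registry.
    if value not in DISPLAY_RENDERERS:
        raise argparse.ArgumentTypeError(
            f"'{value}' is not a supported display format, "
            f"must be one of: {backtick_join(DISPLAY_RENDERERS)}"
        )
    return value


parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument(
    "--display-format", dest="render_format", default="ascii", type=renderer
)

print(parser.parse_args([]).render_format)
print(parser.parse_args(["--display-format", "html"]).render_format)
```

Validating in the `type=` callable rather than with `choices=` lets the error message be worded freely and keeps the list of formats tied to the registry's keys.
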


def existing_file(value):
    path = Path(value)
    if not path.exists():
66 changes: 66 additions & 0 deletions ehrql/debug.py
@@ -0,0 +1,66 @@
import inspect
import sys

from ehrql.renderers import truncate_table
from ehrql.utils.docs_utils import exclude_from_docs


@exclude_from_docs
def show(
    element,
    *other_elements,
    label: str | None = None,
    head: int | None = None,
    tail: int | None = None,
):
    """
    Show the output of the specified element within a dataset definition

    _element_<br>
    Any element within the dataset definition file; can be a string, constant value etc,
    but will typically be a dataset variable (filtered table, column, or a dataset itself.)

    _label_<br>
    Optional label which will be printed in the debug output.

    _head_<br>
    Show only the first N lines. If the output is an ehrQL column, table or dataset, it will
    print only the first N lines of the table.

    _tail_<br>
    Show only the last N lines. If the output is an ehrQL column, table or dataset, it will
    print only the last N lines of the table.

    head and tail arguments can be combined, e.g. to show the first and last 5 lines of a table:

    show(<table>, head=5, tail=5)
    """
    line_no = inspect.getframeinfo(sys._getframe(1))[1]
    elements = [element, *other_elements]
    element_reprs = [repr(el) for el in elements]
    if head or tail:
        element_reprs = [
            truncate_table(el_repr, head, tail) for el_repr in element_reprs
        ]
    label = f" {label}" if label else ""
    print(f"Debug line {line_no}:{label}", file=sys.stderr)
    for el_repr in element_reprs:
        print(el_repr, file=sys.stderr)


def stop(*, head: int | None = None, tail: int | None = None):
    """
    Stop loading the dataset definition and show the contents of the dataset at this point.

    _head_<br>
    Show only the first N lines of the dataset.

    _tail_<br>
    Show only the last N lines of the dataset.

    head and tail arguments can be combined, e.g. to show the first and last 5 lines of the dataset:

    stop(head=5, tail=5)
    """
    line_no = inspect.getframeinfo(sys._getframe(1))[1]
    print(f"Stopping at line {line_no}", file=sys.stderr)
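
`show()` reports the line it was called from by inspecting the caller's stack frame, then optionally truncates each repr before printing to stderr. The sketch below reproduces that flow in isolation. `truncate_lines` is a hypothetical stand-in for `ehrql.renderers.truncate_table`, whose implementation is not part of this excerpt, so the real truncation format may differ. `show_sketch` and `Table` are likewise illustrative names.

```python
import inspect
import sys


def truncate_lines(text, head=None, tail=None):
    # Hypothetical stand-in for truncate_table: keep the first `head` and
    # last `tail` lines and mark the gap with "...".
    lines = text.splitlines()
    head = head or 0
    tail = tail or 0
    if head + tail >= len(lines):
        return text
    kept = lines[:head] + ["..."] + (lines[-tail:] if tail else [])
    return "\n".join(kept)


def show_sketch(element, label=None, head=None, tail=None):
    # Frame 1 is the caller; its line number is where show_sketch() was invoked.
    line_no = inspect.getframeinfo(sys._getframe(1)).lineno
    text = repr(element)
    if head or tail:
        text = truncate_lines(text, head, tail)
    label = f" {label}" if label else ""
    print(f"Debug line {line_no}:{label}", file=sys.stderr)
    print(text, file=sys.stderr)


class Table:
    # Minimal object with a multi-line repr, standing in for an ehrQL table.
    def __repr__(self):
        return "\n".join(f"row {i}" for i in range(1, 11))


show_sketch(Table(), label="first and last rows", head=2, tail=2)
```

Writing to stderr keeps the debug text separate from anything the process emits on stdout, so a tool that consumes the output can read the two streams independently.
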
13 changes: 11 additions & 2 deletions ehrql/docs/language.py
@@ -56,10 +56,15 @@ def build_language():
     # The namespace we're going to document includes all the public names in `ehrql`,
     # plus all the classes in `ehrql.query_language` which we haven't explicitly
     # excluded
-    namespace = {name: getattr(ehrql, name) for name in ehrql.__all__}
+    ehrql_namespace = [(name, getattr(ehrql, name)) for name in ehrql.__all__]
+    ql_namespace = vars(ql).items()
+    namespace = {
+        name: value for name, value in ehrql_namespace if is_included_object(value)
+    }
     namespace.update(
-        (name, attr) for name, attr in vars(ql).items() if is_included_class(attr)
+        {name: value for name, value in ql_namespace if is_included_class(value)}
     )

     # Add class which exists only for documentation purposes – see above
     namespace["SortedEventFrame"] = SortedEventFrame

@@ -172,6 +177,10 @@ def is_included_attr(name, attr):
    return inspect.isfunction(attr) or inspect.isdatadescriptor(attr)


def is_included_object(value):
    return not getattr(value, "exclude_from_docs", None)
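
`is_included_object` filters on an `exclude_from_docs` attribute, which suggests that the `@exclude_from_docs` decorator applied to `show()` marks the function with that attribute. The decorator's implementation is not shown in this diff, so the version below is an assumed, minimal equivalent, and the sample functions and `candidates` list are placeholders.

```python
def exclude_from_docs(func):
    # Assumed implementation: tag the object so documentation builders skip it.
    func.exclude_from_docs = True
    return func


def is_included_object(value):
    # Same check as in the diff: objects without the marker are included.
    return not getattr(value, "exclude_from_docs", None)


def documented():
    """A public function that should appear in the docs."""


@exclude_from_docs
def hidden():
    """A public function that should be left out of the docs."""


candidates = [("documented", documented), ("hidden", hidden), ("months", 12)]
namespace = {name: value for name, value in candidates if is_included_object(value)}
print(sorted(namespace))
```

Because the check uses `getattr` with a default, values that cannot carry attributes (plain constants, for example) are included without any special casing.
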


def method_order(details):
    # We generally present methods in the order they were defined but because of the
    # inheritance hierarchy this can lead to methods which naturally belong together