Currently, when a user wants columns from a 2D dataset, each column must be explicitly indexed in the columns input value. For example, given a 2D dataset named xvar with n columns, if the user wants all n columns to appear in the output file, the columns input value must include xvarX for every X in the range 0 through n - 1. So if xvar has 4 columns, and the user wants all of them in the output, then xvar0,xvar1,xvar2,xvar3 must be included in the columns input. This is acceptable for a small number of columns, but when the 2D dataset contains more than a handful, it is tedious and error-prone.
To make it easy for users to automatically get all columns of a 2D dataset in the output, we should support a shorthand notation within the columns input. I propose that we support the syntax *VAR as a column name, where VAR is the name of a 2D dataset. For example, continuing from above, if *xvar is part of the columns input, we should automatically "spread" this into xvar0,xvar1,xvar2,xvar3, just as if the user had included such an expanded form in the columns input value to begin with. This syntax is consistent with the Python syntax for iterable unpacking.
However, we don't know in advance how many columns a 2D dataset contains. This can be resolved by first implementing #47 with one of the proposed approaches, because both approaches specify a means to readily determine the number of columns in any supported 2D dataset.
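As a rough illustration of the proposed "spread" behavior, here is a minimal sketch in Python. The function name expand_columns and the ncols_by_name mapping are hypothetical: the mapping stands in for whatever mechanism #47 ends up providing for looking up the number of columns in a 2D dataset.

```python
def expand_columns(columns, ncols_by_name):
    """Expand each *VAR entry into VAR0,VAR1,...,VAR(n-1).

    columns: the comma-separated columns input value.
    ncols_by_name: hypothetical mapping from 2D dataset name to its
        column count (a placeholder for whatever #47 provides).
    """
    expanded = []
    for name in columns.split(","):
        name = name.strip()
        if name.startswith("*"):
            var = name[1:]
            if var not in ncols_by_name:
                raise ValueError(f"{var!r} is not a known 2D dataset")
            # Spread into one indexed column name per dataset column.
            expanded.extend(f"{var}{i}" for i in range(ncols_by_name[var]))
        else:
            # Ordinary column names pass through unchanged.
            expanded.append(name)
    return ",".join(expanded)
```

With xvar having 4 columns, expand_columns("*xvar", {"xvar": 4}) would produce the same value the user would otherwise have to type by hand.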
After discussion with Abigail Barenblitt, we landed on using the Python slice syntax: xvar[a:b]
This gives users greater flexibility: a user can specify a start index (a, 0-based) and a stop index (b, exclusive) when they do not want all columns of xvar.
Further, to select all columns, the syntax would be xvar[:], again just like Python slice syntax.
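To make the slice form concrete, here is a minimal sketch of how xvar[a:b] might be expanded. The names expand_slices and ncols_by_name are hypothetical, and the column-count lookup is again a placeholder for whatever #47 provides. This sketch assumes non-negative indices only and no step; whether negative indices or steps should be supported is left open.

```python
import re

# Matches NAME[a:b], where a and b are optional non-negative integers.
_SLICE = re.compile(r"^(?P<var>\w+)\[(?P<start>\d*):(?P<stop>\d*)\]$")


def expand_slices(columns, ncols_by_name):
    """Expand each VAR[a:b] entry into VARa,...,VAR(b-1).

    ncols_by_name is a hypothetical mapping from 2D dataset name to
    its column count.
    """
    expanded = []
    for name in (c.strip() for c in columns.split(",")):
        match = _SLICE.match(name)
        if match is None:
            # Not a slice expression; keep the column name as-is.
            expanded.append(name)
            continue
        var = match["var"]
        start = int(match["start"]) if match["start"] else None
        stop = int(match["stop"]) if match["stop"] else None
        # Slicing a range applies Python's own slice semantics,
        # including clamping a stop index that exceeds the column count.
        indices = range(ncols_by_name[var])[slice(start, stop)]
        expanded.extend(f"{var}{i}" for i in indices)
    return ",".join(expanded)
```

Delegating to Python's built-in slicing keeps the behavior identical to what a Python user would expect from xvar[a:b], including the omitted-bound cases xvar[a:], xvar[:b], and xvar[:].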