DISC: consensus primitives #84
Comments
(1) and (3) seem clear to me, and sound good. Regarding (2), Regarding (4): sounds fine, as long as we can ignore that arrays may not support every dtype. Also,
@jbrockmendel regarding numerical operations, gh-50 may be relevant.
Maybe not exactly what you wanted to discuss here (rather terminology in general, not specific examples), but I don't really understand this example (at least in pandas this doesn't hold because of alignment?)
w/r/t matches/equals/assert_frame_equal I was thinking of exact matches, i.e. 0 tolerance for floating points, but I don't know if that's necessary when it comes time to instrumentalize it. The point in this context is to be able to make meaningful statements about commutativity and column-wise operations. w/r/t concat I'm only interested in axis=1 in this context. Again, the motivating case is to be able to describe
OK, but then I don't understand how "concat" is useful to describe those operations (or are the X and Y in your example columns, not dataframes?)
It should hold for any decomposition of a DataFrame along the lines of
@jbrockmendel it seems like we don't need these primitives (at least in an API; they may be useful for an implementer, but Marco's MVP code doesn't use them, I believe). I think this can be closed now - is that okay with you?
In writing up proposals for arithmetic I find myself referencing other methods/characteristics that might not be well-defined within the spec, but that I think should be clear from context. Before spending much more time on this, I want to double-check that everyone is a) in agreement about what these mean informally and b) OK with these being used informally until something more formal is available.
1) "concat", analogous to `pd.concat([left, right], axis=1)`. In the cases of interest `len(left) == len(right)`.
Example: "Scalar arithmetic commutes with concatenation, so `concat([X, Y], axis=1) + Z` matches `concat([X+Z, Y+Z], axis=1)`"
2) "matches", analogous to `pd.testing.assert_frame_equal`. Similar but not identical to `pd.DataFrame.equals`.
3) "indexed-like", analogous to `pd.DataFrame._indexed_same`. In cases without a row-index, this would require that columns match and lengths match.
Example: "DataFrame arithmetic `X+Y` is defined so long as X is indexed-like Y"
update
4) "extract_array", analogous to `pd.core.construction.extract_array`. For a single-column dataframe `df`, get the array backing it. In pandas we would usually do this on a Series with `ser._values`. This may be making assumptions about dataframe internals that we don't want to make.
Example: "This operation wraps the array behavior: `op(X)` matches `DataFrame(op(extract_array(X)))`"
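For anyone who wants to poke at the four informal primitives (concat, matches, indexed-like, extract_array), here is a rough pandas sketch. It is an illustration, not part of any spec: the helper names are hypothetical, `Z` is assumed to be a scalar, "matches" is assumed to mean exact comparison (zero floating-point tolerance), and the helpers use public pandas calls rather than the private `_indexed_same` or `pd.core.construction.extract_array`.

```python
import numpy as np
import pandas as pd


def concat(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Item 1 (hypothetical helper): column-wise concat of equal-length frames."""
    if len(left) != len(right):
        raise ValueError("only the len(left) == len(right) case is considered")
    return pd.concat([left, right], axis=1)


def matches(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Item 2 (hypothetical helper): exact comparison, assumed zero tolerance."""
    try:
        pd.testing.assert_frame_equal(left, right, check_exact=True)
    except AssertionError:
        return False
    return True


def indexed_like(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Item 3 (hypothetical helper): same row labels and same column labels."""
    return left.index.equals(right.index) and left.columns.equals(right.columns)


def extract_array(df: pd.DataFrame) -> np.ndarray:
    """Item 4 (hypothetical helper): values of a single-column frame as an array."""
    if df.shape[1] != 1:
        raise ValueError("extract_array expects a single-column dataframe")
    return df.iloc[:, 0].to_numpy()


# Small frames with default row labels, so alignment is a no-op here.
X = pd.DataFrame({"a": [1, -2, 3]})
Y = pd.DataFrame({"b": [4.0, 5.0, 6.0]})
Z = 10  # assumed scalar

# Item 1 example: scalar arithmetic commutes with concatenation.
print(matches(concat(X, Y) + Z, concat(X + Z, Y + Z)))

# Item 3 example: X + Z keeps X's labels; Y has different column labels.
print(indexed_like(X, X + Z), indexed_like(X, Y))

# Item 4 example: op(X) vs. DataFrame(op(extract_array(X))), with op = abs.
wrapped = pd.DataFrame({"a": np.abs(extract_array(X))}, index=X.index)
print(matches(X.abs(), wrapped))
```

With non-default or differently ordered row labels, pandas aligns before arithmetic, which is the case the alignment question in the comments is about; the sketch deliberately sidesteps it.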