Fill out a few function docstrings #196
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# Contributing | ||
|
||
|
||
|
||
## Contributing Time-Series Features | ||
|
||
We gratefully accept contributions of new time-series features, be they | ||
domain-specific or general. Please follow the below guidelines in order that | ||
your features may be successfully incorporated into the Cesium feature base. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,7 +20,19 @@ | |
|
||
|
||
def rectangularize_featureset(featureset): | ||
"""Convert xarray.Dataset into (2d) Pandas.DataFrame for use with sklearn.""" | ||
"""Convert xarray.Dataset into (2d) pandas.DataFrame for use with sklearn.
|
||
Parameters | ||
---------- | ||
featureset : xarray.Dataset | ||
The xarray.Dataset object containing features. | ||
|
||
Returns | ||
------- | ||
pandas.DataFrame | ||
2-D, sklearn-compatible DataFrame containing features. | ||
|
||
""" | ||
featureset = featureset.drop([coord for coord in featureset.coords | ||
if coord not in ['name', 'channel']]) | ||
feature_df = featureset.to_dataframe() | ||
|
@@ -71,7 +83,38 @@ def fit_model_optimize_hyperparams(data, targets, model, params_to_optimize, | |
def build_model_from_featureset(featureset, model=None, model_type=None, | ||
model_options={}, params_to_optimize=None, | ||
cv=None): | ||
"""Build model from (non-rectangular) xarray.Dataset of features.""" | ||
"""Build model from (non-rectangular) xarray.Dataset of features. | ||
|
||
Parameters | ||
---------- | ||
featureset : xarray.Dataset of features | ||
Features for training model. | ||
model : scikit-learn model, optional | ||
Instantiated scikit-learn model. If None, `model_type` must not be. | ||
Defaults to None. | ||
model_type : str, optional | ||
String indicating model to be used, e.g. "RandomForestClassifier". | ||
If None, `model` must not be. Defaults to None. | ||
model_options : dict, optional | ||
Dictionary with hyperparameter values to be used in model building. | ||
Review comment: Describe the structure of this dictionary. |
||
Keys are parameter names, values are the associated values. | ||
Review comment: How about saying what the keys are + what the associated values are and the type of structures they should be?
Reply: @stefanv don't think I follow... "Keys are parameter names" seems pretty straightforward to me - what are you picturing beyond that?
Reply: In the context of model hyperparameters, it seems self-explanatory
Reply: I don't know why this is called
|
||
params_to_optimize : list of str, optional | ||
List of parameters to be optimized (whose corresponding entries | ||
Review comment: What happens if this is None? |
||
in `model_options` would be a list of values to try). If None, | ||
parameters specified in `model_options` will be passed to model | ||
Review comment: I think this is what should be described above. |
||
constructor as-is (i.e. they are assumed not to be lists/grids of | ||
values to try). Defaults to None. | ||
cv : int, cross-validation generator or an iterable, optional | ||
Number of folds (defaults to 3 if None) or an iterable yielding | ||
train/test splits. See documentation for `GridSearchCV` for details. | ||
Defaults to None (yielding 3 folds). | ||
|
||
Returns | ||
------- | ||
sklearn estimator object | ||
The fitted sklearn model. | ||
|
||
""" | ||
if featureset.get('target') is None: | ||
raise ValueError("Cannot build model for unlabeled feature set.") | ||
|
||
|
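To make the dictionary structure discussed in the review thread above concrete, here is a minimal sketch of how `model_options` and `params_to_optimize` relate, based only on the docstring wording. The helper `split_model_options` is hypothetical and is not part of Cesium; the parameter names are ordinary `RandomForestClassifier` hyperparameters chosen for illustration.

```python
def split_model_options(model_options, params_to_optimize=None):
    """Hypothetical helper (not Cesium code): separate fixed hyperparameter
    values from the lists of candidate values that should be searched over."""
    params_to_optimize = params_to_optimize or []
    # Entries not named in params_to_optimize are single values, used as-is.
    fixed = {name: value for name, value in model_options.items()
             if name not in params_to_optimize}
    # Entries named in params_to_optimize hold a list of values to try.
    grid = {name: model_options[name] for name in params_to_optimize}
    return fixed, grid


# Case 1: params_to_optimize is None. Keys are parameter names and every
# value is a single setting, passed to the model constructor unchanged.
model_options = {"n_estimators": 50, "max_features": "sqrt"}
fixed, grid = split_model_options(model_options)

# Case 2: "n_estimators" is listed in params_to_optimize, so its entry in
# model_options is a list of candidate values rather than a single value.
model_options_grid = {"n_estimators": [10, 50], "max_features": "sqrt"}
fixed_2, grid_2 = split_model_options(model_options_grid, ["n_estimators"])

print(fixed, grid)
print(fixed_2, grid_2)
```

In scikit-learn terms, `fixed` corresponds to keyword arguments for the estimator constructor and `grid` to the `param_grid` argument of `GridSearchCV`. Whether Cesium splits the dictionary in exactly this way is an assumption.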
Where are the below guidelines?
@stefanv bnaul offered to fill this out - he's going to contribute to my branch.
OK, great!
@bnaul still up for adding this? If not, I'd like to sit down for a few minutes together to get a better sense of what this needs to be.