[Documentation] sagemaker-debugger open source documentation pre-launch #506
Codecov Report
@@ Coverage Diff @@
## master #506 +/- ##
===========================================
- Coverage 75.35% 50.62% -24.73%
===========================================
Files 127 117 -10
Lines 11117 10590 -527
===========================================
- Hits 8377 5361 -3016
- Misses 2740 5229 +2489
Continue to review full report at Codecov.
@@ -63,10 +63,10 @@ The following frameworks are available AWS Deep Learning Containers with the dee

| Framework | Version |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
2.4 and 2.5 are also supported
incorporated
| [MXNet](docs/mxnet.md) | 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 ([As a built-in algorithm](docs/xgboost.md#use-xgboost-as-a-built-in-algorithm)) |
Smdebug is supported on the latest versions of all available DLCs.
See page.
incorporated
README.md
Outdated
| [TensorFlow](tensorflow.md) | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
| Keras (with TensorFlow backend) | 2.3 |
| [MXNet](docs/mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 (As a framework) |
| [MXNet](mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](xgboost.md) | 0.90-2, 1.0-1 (As a framework) |
See comment above.
incorporated
Left mostly nits, otherwise looks good.
# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.6
nit: why Python 3.6? can we use Python 3.9?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can use Debugger with your training script on your own container
making only a minimal modification to your training script to add
nit: by making
Using SageMaker Debugger on custom containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Debugger is available for any deep learning models that you bring to
nit: "any deep learning model" or "all deep learning models"
~~~~~~~~~~~~~~~~~~~~

Below is a comprehensive list of the built-in collections that are
managed by SageMaker Debugger. The Hook identifes the tensors that
nit: identifies
``XGBoost`` METRICS
============== ===========================

If for some reason, you want to disable the saving of these collections,
We should tell customers to set debugger_hook_config=False
in the estimator; this is a simpler alternative.
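To illustrate the suggestion above, here is a minimal sketch of how the flag might be passed. It only assembles keyword arguments, so it runs without the SageMaker Python SDK; the script name, instance settings, and the helper `estimator_kwargs` are hypothetical placeholders and do not come from this PR.

```python
def estimator_kwargs(disable_debugger: bool) -> dict:
    """Build keyword arguments for a SageMaker estimator (placeholder values)."""
    kwargs = {
        "entry_point": "train.py",        # hypothetical training script
        "instance_count": 1,              # placeholder
        "instance_type": "ml.m5.xlarge",  # placeholder
    }
    if disable_debugger:
        # Passing False asks the SDK not to attach the Debugger hook,
        # so no collections are saved for this training job.
        kwargs["debugger_hook_config"] = False
    return kwargs


kwargs = estimator_kwargs(disable_debugger=True)

# With the SageMaker Python SDK installed, the arguments would be passed
# to a framework estimator roughly like this (role and versions elided):
#   from sagemaker.tensorflow import TensorFlow
#   estimator = TensorFlow(role=..., framework_version=..., py_version=..., **kwargs)
```

Leaving the key out entirely keeps the SDK's default behavior, which is why the helper only sets it when the caller opts out.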
name="weights",
parameters={ "parameter": "value" })

The parameters can be one of the following. The meaning of these
nit: The meaning of these parameters -> These parameters
- docutils==0.15.2
- bokeh
- ipython
- pandas
Should we pin the versions of bokeh, ipython, and pandas here?
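As a quick way to see which entries the question applies to, the following sketch flags dependencies that carry no version specifier. The helper name `unpinned` is hypothetical, and the check is a simple pattern match, not a full requirement parser.

```python
import re


def unpinned(requirements):
    """Return the entries that carry no version specifier."""
    specifier = re.compile(r"(==|>=|<=|~=|!=|<|>)")
    return [req for req in requirements if not specifier.search(req)]


# Dependency entries as they appear in the diff under review.
deps = ["docutils==0.15.2", "bokeh", "ipython", "pandas"]
print(unpinned(deps))
```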
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The available ``hook_parameters`` keys are listed in the following. The meaning
of these parameters will be clear as you review the sections of
nit: The meaning of these parameters -> These parameters
.. method:: create_from_json_file(json_file_path (str)

Takes the path of a file which holds the json configuration of the hook,
and creates hook from that configuration. This is an optional parameter.
nit: creates a hook
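For context on the method documented in this hunk, a rough sketch of preparing such a JSON file follows. The key names (`LocalPath`, `HookParameters`, `CollectionConfigurations`, `CollectionName`) and the value types are assumptions based on my reading of the smdebug documentation and should be verified there; the paths and values are placeholders.

```python
import json
import os
import tempfile

# Assumed layout of a hook configuration file; verify key names and
# value types against the smdebug documentation before relying on it.
hook_config = {
    "LocalPath": "/tmp/smdebug_output",          # placeholder output directory
    "HookParameters": {"save_interval": "100"},  # placeholder save interval
    "CollectionConfigurations": [
        {"CollectionName": "weights"},
        {"CollectionName": "losses"},
    ],
}

# Write the configuration to a temporary file.
config_path = os.path.join(tempfile.mkdtemp(), "hook_config.json")
with open(config_path, "w") as f:
    json.dump(hook_config, f, indent=2)

# With smdebug installed, the hook would then be created along these lines:
#   import smdebug.pytorch as smd
#   hook = smd.Hook.create_from_json_file(config_path)
```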
### AWS training containers with script mode
The following frameworks are available AWS Deep Learning Containers with
the deep learning frameworks for the zero script change experience.
Are we explaining what 'zero script change experience' means in the doc? If yes, can we link it here?
In other locations, I see phrasing such as 'no changes to your training script'.
However, for some advanced use cases where you need access to customized
tensors from targeted parts of a training script, you can manually
construct the hook object. The SMDebug library provides hook classes to
Are we using 'SMDebug' consistently? In other locations I see it written as 'smdebug'.
Support
-------

- Zero Script Change experience where you need no modifications to your
Same as above. If we are introducing a new term, 'Zero script change experience', it needs to be explained somewhere.
Migration to Deep Learning Containers | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
TBD |
Why TBD?
* add unified search to RTD website
* configure rtd build environment to make it functional
* add licensing information
* add licensing information
* add licensing information
* add licensing information
* add search filter
Description of changes:
readthedocs build log: https://readthedocs.org/projects/sagemaker-debugger/builds/14082688/
pre-launched doc: https://sagemaker-debugger.readthedocs.io/en/website/
Style and formatting:
I have run `pre-commit install` to ensure that auto-formatting happens with every commit.
Issue number, if available
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.