[skip ci] Update documentation
sbrugman committed Apr 13, 2020
1 parent ecd2bf1 commit 298f6ba
Showing 2 changed files with 99 additions and 33 deletions.
116 changes: 87 additions & 29 deletions docs/index.html
@@ -44,15 +44,26 @@ <h1 id="pandas-profiling">Pandas Profiling</h1>
<li><strong>Text analysis</strong> learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data.</li>
</ul>
<h2 id="announcements">Announcements</h2>
<h3 id="new-in-v260">New in v2.6.0</h3>
<h4 id="dependency-policy">Dependency policy</h4>
<p>The current dependency policy is suboptimal. Pinning dependencies is great for reproducibility (a high guarantee that things work), but it requires frequent maintenance and introduces compatibility issues with other packages. Therefore, we are moving away from pinning dependencies and instead specifying minimum versions.</p>
<h4 id="pandas-v1">Pandas v1</h4>
<p>Early releases of pandas v1 had many regressions that broke functionality (as acknowledged by the authors <a href="https://github.com/pandas-dev/pandas/issues/31523">here</a>). At this point, pandas is more stable and we notice high demand for compatibility, so we now support pandas' latest versions. To ensure compatibility with both major versions, we have extended the test matrix to test against both pandas 0.x.y and 1.x.y.</p>
<h4 id="python-36-features">Python 3.6+ features</h4>
<p>Python 3.6 introduces f-strings, and CPython 3.6 preserves dict insertion order (guaranteed by the language from Python 3.7); we now rely on both. As a result, pandas-profiling 2.6 and later require Python 3.6 or newer. Users who cannot upgrade can stay on pandas-profiling 2.5.0, which unfortunately will not receive further updates or maintenance.</p>
<h4 id="extended-continuous-integration">Extended continuous integration</h4>
<p>Starting from this release, we use GitHub Actions and Travis CI combined to increase maintainability.
Travis CI handles the testing, while GitHub Actions automates part of the development process by running black and building the docs.</p>
<h3 id="support-pandas-profiling">Support <code>pandas-profiling</code></h3>
<p>With your help, we got approved for <a href="https://github.com/sponsors/sbrugman">GitHub Sponsors</a>!
It's extra exciting that GitHub <strong>matches your contribution</strong> for the first year.
Therefore, we welcome you to support the project through GitHub! </p>
<p>The v2.5.0 release includes many new features and stability improvements.</p>
<p>Find more information here:</p>
<ul>
<li><a href="https://github.com/sponsors/sbrugman">Sponsor the project on GitHub</a></li>
<li><a href="https://github.com/pandas-profiling/pandas-profiling/releases/tag/v2.5.0">Read the release notes v2.5.0</a> </li>
<li><a href="https://github.com/pandas-profiling/pandas-profiling/releases/tag/v2.6.0">Read the release notes v2.6.0</a> </li>
</ul>
<p><em>February 14, 2020 💘</em></p>
<p><em>April 14, 2020 💘</em></p>
<hr>
<p><em>Contents:</em> <strong><a href="#examples">Examples</a></strong> |
<strong><a href="#installation">Installation</a></strong> | <strong><a href="#documentation">Documentation</a></strong> |
@@ -84,7 +95,7 @@ <h3 id="using-pip">Using pip</h3>
<p>You can install using the pip package manager by running</p>
<pre><code>pip install pandas-profiling[notebook,html]
</code></pre>
<p>Alternatively, you could install directly from GitHub:</p>
<p>Alternatively, you could install the latest version directly from GitHub:</p>
<pre><code>pip install &lt;https://github.com/pandas-profiling/pandas-profiling/archive/master.zip&gt;
</code></pre>
<h3 id="using-conda">Using conda</h3>
@@ -259,6 +270,7 @@ <h2 id="dependencies">Dependencies</h2>
.. include:: ../../README.md
&#34;&#34;&#34;
import json
import warnings
from pathlib import Path
from datetime import datetime

@@ -390,7 +402,13 @@ <h2 id="dependencies">Dependencies</h2>
elif output_file.suffix == &#34;.json&#34;:
data = self.to_json()
else:
raise ValueError(&#34;Extension not supported (please use .html, .json)&#34;)
suffix = output_file.suffix
output_file = output_file.with_suffix(&#34;.html&#34;)
data = self.to_html()
warnings.warn(
f&#34;Extension {suffix} not supported. For now we assume .html was intended. &#34;
f&#34;To remove this warning, please use .html or .json.&#34;
)

with output_file.open(&#34;w&#34;, encoding=&#34;utf8&#34;) as f:
f.write(data)
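The behavior change in this hunk (warn and fall back to `.html` instead of raising `ValueError`) can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `resolve_output` and its `to_html` / `to_json` callables are hypothetical stand-ins for the report methods, and only the standard library is used.

```python
import warnings
from pathlib import Path


def resolve_output(output_file, to_html, to_json):
    """Hypothetical helper: choose the renderer from the file suffix."""
    output_file = Path(output_file)
    if output_file.suffix == ".html":
        data = to_html()
    elif output_file.suffix == ".json":
        data = to_json()
    else:
        # Unknown suffix: warn and fall back to HTML instead of raising.
        suffix = output_file.suffix
        output_file = output_file.with_suffix(".html")
        data = to_html()
        warnings.warn(
            f"Extension {suffix} not supported. For now we assume .html was intended. "
            f"To remove this warning, please use .html or .json."
        )
    return output_file, data


# Record warnings so the fallback can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    path, data = resolve_output("report.txt", lambda: "<html></html>", lambda: "{}")

print(path)         # report.html
print(len(caught))  # 1
```

Note that the fallback rewrites the target path, so a caller who passes `report.txt` ends up with `report.html` on disk.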
@@ -443,16 +461,22 @@ <h2 id="dependencies">Dependencies</h2>

def to_json(self) -&gt; str:
class CustomEncoder(json.JSONEncoder):
def key_to_json(self, data):
if data is None or isinstance(data, (bool, int, str)):
return data
return str(data)

def default(self, o):
name = o.__class__.__name__
if isinstance(o, pd.core.series.Series) or isinstance(
o, pd.core.frame.DataFrame
):
return {f&#34;__{name}__&#34;: o.to_json()}
if isinstance(o, pd.core.series.Series):
return self.default(o.to_dict())

if isinstance(o, np.integer):
return {f&#34;__{name}__&#34;: o.tolist()}
return o.tolist()

if isinstance(o, dict):
return {self.key_to_json(key): self.default(o[key]) for key in o}

return {f&#34;__{name}__&#34;: str(o)}
return str(o)

return json.dumps(self.description_set, indent=4, cls=CustomEncoder)
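The encoder pattern in this hunk (convert a container to a plain dict, stringify keys JSON cannot represent, and fall back to `str` for everything else) can be tried without pandas or NumPy. The sketch below is an assumption-laden illustration: `FakeSeries` and `SketchEncoder` are hypothetical names, `FakeSeries` only mimics the `to_dict()` method of a Series, and, unlike the hunk, values that JSON already supports are passed through unchanged rather than routed through `default`.

```python
import json
from datetime import datetime


class FakeSeries:
    """Hypothetical stand-in for a pandas Series: only offers to_dict()."""

    def __init__(self, data):
        self._data = dict(data)

    def to_dict(self):
        return dict(self._data)


class SketchEncoder(json.JSONEncoder):
    def key_to_json(self, key):
        # Keep keys JSON can represent; stringify anything else (e.g. tuples).
        if key is None or isinstance(key, (bool, int, str)):
            return key
        return str(key)

    def default(self, o):
        # json calls default() only for objects it cannot serialize itself.
        if isinstance(o, FakeSeries):
            return {self.key_to_json(k): v for k, v in o.to_dict().items()}
        # Last resort: plain string representation, without a type wrapper.
        return str(o)


column = FakeSeries({(1, 2): datetime(2020, 4, 13), "n": 3})
encoded = json.dumps({"col": column}, cls=SketchEncoder)
decoded = json.loads(encoded)
print(decoded)
```

The key conversion matters because `json.dumps` rejects dict keys such as tuples before `default` is ever consulted, so they have to be normalized while the dict is being built.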

@@ -680,7 +704,13 @@ <h2 class="section-title" id="header-classes">Classes</h2>
elif output_file.suffix == &#34;.json&#34;:
data = self.to_json()
else:
raise ValueError(&#34;Extension not supported (please use .html, .json)&#34;)
suffix = output_file.suffix
output_file = output_file.with_suffix(&#34;.html&#34;)
data = self.to_html()
warnings.warn(
f&#34;Extension {suffix} not supported. For now we assume .html was intended. &#34;
f&#34;To remove this warning, please use .html or .json.&#34;
)

with output_file.open(&#34;w&#34;, encoding=&#34;utf8&#34;) as f:
f.write(data)
@@ -733,16 +763,22 @@ <h2 class="section-title" id="header-classes">Classes</h2>

def to_json(self) -&gt; str:
class CustomEncoder(json.JSONEncoder):
def key_to_json(self, data):
if data is None or isinstance(data, (bool, int, str)):
return data
return str(data)

def default(self, o):
name = o.__class__.__name__
if isinstance(o, pd.core.series.Series) or isinstance(
o, pd.core.frame.DataFrame
):
return {f&#34;__{name}__&#34;: o.to_json()}
if isinstance(o, pd.core.series.Series):
return self.default(o.to_dict())

if isinstance(o, np.integer):
return {f&#34;__{name}__&#34;: o.tolist()}
return o.tolist()

if isinstance(o, dict):
return {self.key_to_json(key): self.default(o[key]) for key in o}

return {f&#34;__{name}__&#34;: str(o)}
return str(o)

return json.dumps(self.description_set, indent=4, cls=CustomEncoder)

@@ -962,7 +998,13 @@ <h2 id="args">Args</h2>
elif output_file.suffix == &#34;.json&#34;:
data = self.to_json()
else:
raise ValueError(&#34;Extension not supported (please use .html, .json)&#34;)
suffix = output_file.suffix
output_file = output_file.with_suffix(&#34;.html&#34;)
data = self.to_html()
warnings.warn(
f&#34;Extension {suffix} not supported. For now we assume .html was intended. &#34;
f&#34;To remove this warning, please use .html or .json.&#34;
)

with output_file.open(&#34;w&#34;, encoding=&#34;utf8&#34;) as f:
f.write(data)
@@ -1038,16 +1080,22 @@ <h2 id="returns">Returns</h2>
</summary>
<pre><code class="python">def to_json(self) -&gt; str:
class CustomEncoder(json.JSONEncoder):
def key_to_json(self, data):
if data is None or isinstance(data, (bool, int, str)):
return data
return str(data)

def default(self, o):
name = o.__class__.__name__
if isinstance(o, pd.core.series.Series) or isinstance(
o, pd.core.frame.DataFrame
):
return {f&#34;__{name}__&#34;: o.to_json()}
if isinstance(o, pd.core.series.Series):
return self.default(o.to_dict())

if isinstance(o, np.integer):
return {f&#34;__{name}__&#34;: o.tolist()}
return o.tolist()

if isinstance(o, dict):
return {self.key_to_json(key): self.default(o[key]) for key in o}

return {f&#34;__{name}__&#34;: str(o)}
return str(o)

return json.dumps(self.description_set, indent=4, cls=CustomEncoder)</code></pre>
</details>
@@ -1120,7 +1168,17 @@ <h1>Index</h1>
<div class="toc">
<ul>
<li><a href="#pandas-profiling">Pandas Profiling</a><ul>
<li><a href="#announcements">Announcements</a></li>
<li><a href="#announcements">Announcements</a><ul>
<li><a href="#new-in-v260">New in v2.6.0</a><ul>
<li><a href="#dependency-policy">Dependency policy</a></li>
<li><a href="#pandas-v1">Pandas v1</a></li>
<li><a href="#python-36-features">Python 3.6+ features</a></li>
<li><a href="#extended-continuous-integration">Extended continuous integration</a></li>
</ul>
</li>
<li><a href="#support-pandas-profiling">Support pandas-profiling</a></li>
</ul>
</li>
<li><a href="#examples">Examples</a></li>
<li><a href="#installation">Installation</a><ul>
<li><a href="#using-pip">Using pip</a></li>
16 changes: 12 additions & 4 deletions docs/model/describe.html
@@ -141,7 +141,11 @@ <h1 class="title">Module <code>pandas_profiling.model.describe</code></h1>
Returns:
A dict containing calculated series description values.
&#34;&#34;&#34;
stats = {&#34;min&#34;: series.min(), &#34;max&#34;: series.max(), &#34;histogram_data&#34;: series}
stats = {
&#34;min&#34;: pd.Timestamp.to_pydatetime(series.min()),
&#34;max&#34;: pd.Timestamp.to_pydatetime(series.max()),
&#34;histogram_data&#34;: series,
}

bins = config[&#34;plot&#34;][&#34;histogram&#34;][&#34;bins&#34;].get(int)
# Bins should never be larger than the number of distinct values
@@ -587,7 +591,7 @@ <h1 class="title">Module <code>pandas_profiling.model.describe</code></h1>
- messages: direct special attention to these patterns in your data.
&#34;&#34;&#34;
if not isinstance(df, pd.DataFrame):
raise TypeError(&#34;df must be of type pandas.DataFrame&#34;)
warnings.warn(&#34;df is not of type pandas.DataFrame&#34;)

if df.empty:
raise ValueError(&#34;df can not be empty&#34;)
@@ -743,7 +747,7 @@ <h2 id="returns">Returns</h2>
- messages: direct special attention to these patterns in your data.
&#34;&#34;&#34;
if not isinstance(df, pd.DataFrame):
raise TypeError(&#34;df must be of type pandas.DataFrame&#34;)
warnings.warn(&#34;df is not of type pandas.DataFrame&#34;)

if df.empty:
raise ValueError(&#34;df can not be empty&#34;)
@@ -1027,7 +1031,11 @@ <h2 id="returns">Returns</h2>
Returns:
A dict containing calculated series description values.
&#34;&#34;&#34;
stats = {&#34;min&#34;: series.min(), &#34;max&#34;: series.max(), &#34;histogram_data&#34;: series}
stats = {
&#34;min&#34;: pd.Timestamp.to_pydatetime(series.min()),
&#34;max&#34;: pd.Timestamp.to_pydatetime(series.max()),
&#34;histogram_data&#34;: series,
}

bins = config[&#34;plot&#34;][&#34;histogram&#34;][&#34;bins&#34;].get(int)
# Bins should never be larger than the number of distinct values
