Feat/cleanup examples #340

Merged
merged 5 commits into from
Dec 30, 2024
8 changes: 2 additions & 6 deletions docs/example-notebooks/Extending-pandas.ipynb
Original file line number Diff line number Diff line change
@@ -385,11 +385,7 @@
"Auto cleaning and the low code UI work together for more fine-grained editing of data. The low code UI presents a GUI that works on columns and allows functions with arguments.\n",
"\n",
"Auto cleaning works to suggest operations that are then loaded into the low code UI. Then these operations can be edited or removed.\n",
"Auto cleaning options can be cycled through to generate different cleanings.\n",
"\n",
"## Why did this release remove auto cleaning and the low code UI?\n",
"\n",
"Although auto cleaning and the low code UI are my favorite features of Buckaroo, and the first parts I built, they haven't gained traction with users. Buckaroo, for that matter, hasn't gained a lot of traction. For the time being I have decided to put more effort into refining and promoting the parts of Buckaroo that people do understand. "
"Auto cleaning options can be cycled through to generate different cleanings.\n"
]
},
{
@@ -417,7 +413,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.20"
"version": "3.12.8"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion docs/example-notebooks/Extending.ipynb
@@ -434,7 +434,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
"version": "3.12.8"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion docs/example-notebooks/Full-tour.ipynb
@@ -248,7 +248,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.20"
"version": "3.12.8"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -27,7 +27,7 @@
"metadata": {},
"outputs": [],
"source": [
"w = BuckarooWidget(df, showCommands=False)\n",
"w = BuckarooWidget(df)\n",
"w"
]
},
@@ -46,7 +46,8 @@
},
"outputs": [],
"source": [
"from buckaroo.pluggable_analysis_framework import (ColAnalysis)\n",
"w = BuckarooWidget(df)\n",
"from buckaroo.pluggable_analysis_framework.pluggable_analysis_framework import ColAnalysis\n",
"from scipy.stats import skew\n",
"class Skew(ColAnalysis):\n",
"    provides_summary = [\"skew\"]\n",
@@ -60,6 +61,7 @@
"            return dict(skew=skew(sampled_ser.astype('float64')))\n",
"        else:\n",
"            return dict(skew=\"NA\")\n",
"    #fixme\n",
"    summary_stats_display = [\n",
"        'dtype',\n",
"        'length',\n",
@@ -78,7 +80,8 @@
"        'mean',\n",
"        # we must add skew to the list of summary_stats_display, otherwise our new stat won't be displayed\n",
"        'skew']\n",
"w.add_analysis(Skew)"
"w.add_analysis(Skew)\n",
"w"
]
},
{
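Review note on the `Skew` analysis in the hunks above: it returns `dict(skew=...)` for numeric data and `dict(skew="NA")` otherwise. A dependency-free sketch of that return shape follows. It assumes the biased sample skewness m3 / m2^1.5, which is what `scipy.stats.skew` computes with its default `bias=True`. `skew_summary` is a hypothetical name, and the numeric check is an assumption, since the diff elides the original condition.

```python
from numbers import Real


def skew_summary(values):
    """Return {'skew': g1} for numeric input, {'skew': 'NA'} otherwise.

    Hypothetical helper, not part of Buckaroo. g1 is the biased sample
    skewness m3 / m2**1.5 built from the central moments m2 and m3.
    """
    values = list(values)
    # Assumption: "numeric" means non-empty and made of real, non-bool numbers.
    numeric = bool(values) and all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not numeric:
        return dict(skew="NA")
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        # Constant data has undefined skewness; this sketch reports "NA".
        return dict(skew="NA")
    return dict(skew=m3 / m2 ** 1.5)


print(skew_summary([1, 2, 3, 4, 10]))
print(skew_summary(["a", "b"]))
```

Returning `"NA"` for constant data is a choice made for this sketch only; it is not taken from the notebook.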
@@ -236,8 +239,19 @@
"metadata": {},
"outputs": [],
"source": [
"from buckaroo.all_transforms import Command\n",
"from buckaroo.lispy import s\n",
"w = BuckarooWidget(df[:500])\n",
"w"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from buckaroo.customizations.all_transforms import Command\n",
"from buckaroo.jlisp.lisp_utils import s\n",
"w = BuckarooWidget(df[:500])\n",
"#Here we start adding commands to the Buckaroo Widget. Every call to add_command replaces a command with the same name\n",
"@w.add_command\n",
"class GroupBy2(Command):\n",
@@ -283,6 +297,15 @@
"source": [
"Note that `groupby2` has been added to the commands"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"w.ac_obj."
]
}
],
"metadata": {
@@ -301,7 +324,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.20"
"version": "3.12.8"
}
},
"nbformat": 4,
149 changes: 149 additions & 0 deletions docs/example-notebooks/Lowcode-UI.ipynb
@@ -0,0 +1,149 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"from buckaroo.buckaroo_widget import BuckarooInfiniteWidget, BuckarooWidget"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('./yellow_tripdata_2021-02.csv')\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"w = BuckarooInfiniteWidget(df)\n",
"w"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Adding a Command to the Low Code UI"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from buckaroo.customizations.all_transforms import Command\n",
"from buckaroo.jlisp.lisp_utils import s\n",
"w = BuckarooWidget(df[:500])\n",
"#Here we start adding commands to the Buckaroo Widget. Every call to add_command replaces a command with the same name\n",
"#@w.add_command\n",
"class GroupBy2(Command):\n",
"    command_default = [s(\"groupby2\"), s('df'), 'col', {}]\n",
"    command_pattern = [[3, 'colMap', 'colEnum', ['null', 'sum', 'mean', 'median', 'count']]]\n",
"    @staticmethod\n",
"    def transform(df, col, col_spec):\n",
"        grps = df.groupby(col)\n",
"        df_contents = {}\n",
"        for k, v in col_spec.items():\n",
"            if v == \"sum\":\n",
"                df_contents[k] = grps[k].apply(lambda x: x.sum())\n",
"            elif v == \"mean\":\n",
"                df_contents[k] = grps[k].apply(lambda x: x.mean())\n",
"            elif v == \"median\":\n",
"                df_contents[k] = grps[k].apply(lambda x: x.median())\n",
"            elif v == \"count\":\n",
"                df_contents[k] = grps[k].apply(lambda x: x.count())\n",
"        return pd.DataFrame(df_contents)\n",
"\n",
"    @staticmethod\n",
"    def transform_to_py(df, col, col_spec):\n",
"        commands = [\n",
"            \"    grps = df.groupby('%s')\" % col,\n",
"            \"    df_contents = {}\"\n",
"        ]\n",
"        for k, v in col_spec.items():\n",
"            if v == \"sum\":\n",
"                commands.append(\"    df_contents['%s'] = grps['%s'].apply(lambda x: x.sum())\" % (k, k))\n",
"            elif v == \"mean\":\n",
"                commands.append(\"    df_contents['%s'] = grps['%s'].apply(lambda x: x.mean())\" % (k, k))\n",
"            elif v == \"median\":\n",
"                commands.append(\"    df_contents['%s'] = grps['%s'].apply(lambda x: x.median())\" % (k, k))\n",
"            elif v == \"count\":\n",
"                commands.append(\"    df_contents['%s'] = grps['%s'].apply(lambda x: x.count())\" % (k, k))\n",
"        commands.append(\"    df = pd.DataFrame(df_contents)\")\n",
"        return \"\\n\".join(commands)\n",
" \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from buckaroo.dataflow.autocleaning import AutocleaningConfig\n",
"from buckaroo.customizations.pd_autoclean_conf import BASE_COMMANDS, NoCleaningConf\n",
"\n",
"LOCAL_COMMANDS = BASE_COMMANDS.copy()\n",
"LOCAL_COMMANDS.append(GroupBy2)\n",
"\n",
"class ExtraGroupbyConf(NoCleaningConf):\n",
"    command_klasses = LOCAL_COMMANDS\n",
"\n",
"class BuckarooExtraCommands(BuckarooInfiniteWidget):\n",
"    #autoclean_conf = tuple([CleaningConf, NoCleaningConf]) #override the base CustomizableDataFlow conf\n",
"    autoclean_conf = tuple([ExtraGroupbyConf])\n",
"BuckarooExtraCommands(df)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that `groupby2` has been added to the commands"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"w.ac_obj."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
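Review note on `GroupBy2.transform_to_py` in the notebook above: it builds Python source text line by line from `col` and `col_spec`. A minimal standalone sketch of that string-building pattern follows. It does not import Buckaroo, and `generate_groupby_source` and `AGGREGATIONS` are hypothetical names, not part of the library. The column names in the usage line are illustrative only. Using `%r` rather than `'%s'` is a suggestion: it keeps the generated source valid when a column name contains a quote.

```python
# Hypothetical helper, not part of Buckaroo: it mirrors the string-building
# approach of GroupBy2.transform_to_py without importing the library.
AGGREGATIONS = ("sum", "mean", "median", "count")


def generate_groupby_source(col, col_spec):
    """Return Python source that groups `df` by `col` and aggregates columns.

    col_spec maps a column name to one of AGGREGATIONS. Any other value
    (for example 'null') is skipped, as in the notebook's transform.
    """
    lines = [
        "    grps = df.groupby(%r)" % (col,),
        "    df_contents = {}",
    ]
    for name, agg in col_spec.items():
        if agg in AGGREGATIONS:
            # %r quotes the column name safely for the generated code.
            lines.append(
                "    df_contents[%r] = grps[%r].apply(lambda x: x.%s())"
                % (name, name, agg)
            )
    lines.append("    df = pd.DataFrame(df_contents)")
    return "\n".join(lines)


# Illustrative column names; 'null' entries produce no generated line.
source = generate_groupby_source(
    "VendorID", {"fare_amount": "mean", "tip_amount": "null"}
)
print(source)
```

Driving the branches from one tuple of aggregation names, instead of one `elif` per name, leaves less room for a per-branch typo in the generated text.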
10 changes: 9 additions & 1 deletion docs/example-notebooks/Styling-Gallery-Pandas.ipynb
@@ -541,6 +541,14 @@
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "25",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -559,7 +567,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.20"
"version": "3.12.8"
}
},
"nbformat": 4,
6 changes: 4 additions & 2 deletions scripts/clean_notebooks.sh
@@ -12,8 +12,10 @@ nbstripout docs/example-notebooks/DFViewer.ipynb \
docs/example-notebooks/Itables-testcases.ipynb \
docs/example-notebooks/Pluggable-Analysis-Framework.ipynb \
docs/example-notebooks/Solara-Buckaroo.ipynb \
docs/example-notebooks/introduction.ipynb \
docs/example-notebooks/Introduction.ipynb \
docs/example-notebooks/Styling-Gallery-Polars.ipynb \
docs/example-notebooks/Styling-Gallery-Pandas.ipynb \
docs/example-notebooks/styling-howto.ipynb \
docs/example-notebooks/Styling-Howto.ipynb \
docs/example-notebooks/Lowcode-UI.ipynb \
docs/example-notebooks/testcases-fast.ipynb
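
Review note on the renames in this script: `introduction.ipynb` to `Introduction.ipynb` and `styling-howto.ipynb` to `Styling-Howto.ipynb` differ only in letter case, which is easy to miss on a case-insensitive filesystem. A minimal sketch of a check for such mismatches follows, assuming it is run from the repository root. `find_case_mismatches` is a hypothetical helper, not part of this repository.

```python
import os


def find_case_mismatches(paths):
    """Return paths whose file name exists only under a different letter case.

    Hypothetical helper: a path is reported when its final component is not
    listed in its directory exactly as spelled, but is when case is ignored.
    """
    mismatches = []
    for path in paths:
        directory, name = os.path.split(path)
        try:
            entries = os.listdir(directory or ".")
        except FileNotFoundError:
            continue  # the directory itself is missing; not a case problem
        lowered = {entry.lower() for entry in entries}
        if name not in entries and name.lower() in lowered:
            mismatches.append(path)
    return mismatches


# Paths taken from the old version of scripts/clean_notebooks.sh.
print(find_case_mismatches([
    "docs/example-notebooks/introduction.ipynb",
    "docs/example-notebooks/styling-howto.ipynb",
]))
```

The check compares against `os.listdir` output rather than `os.path.exists`, because `exists` reports success for either spelling on a case-insensitive filesystem.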