
Commit

Update use case for Fondant 0.10.1 with lightweight components
RobbeSneyders committed Feb 5, 2024
1 parent 1e53df6 commit 1c3a1d8
Showing 12 changed files with 284 additions and 390 deletions.
README.md (2 changes: 1 addition & 1 deletion)
@@ -50,7 +50,7 @@ There are 5 components in total, these are:

> ⚠️ **Prerequisites:**
>
-> - A Python version between 3.8 and 3.10 installed on your system.
+> - A Python version between 3.8 and 3.11 installed on your system.
> - Docker installed and configured on your system.
> - A GPU is recommended to run the model-based components of the pipeline.
requirements.txt (2 changes: 1 addition & 1 deletion)
@@ -1,2 +1,2 @@
-fondant==0.8.0
+fondant==0.10.1
notebook==7.0.6
src/README.md (21 changes: 10 additions & 11 deletions)
@@ -47,18 +47,15 @@ For more details on the pipeline creation, you can have a look at the

## Running the pipeline

-This pipeline will generate prompts, retrieve urls of matching images in the laion dataset, download them
+This pipeline will generate prompts, retrieve urls of matching images in the LAION dataset, download them
and generate corresponding captions and segmentations. If you added the optional `write_to_hf_hub`
component, it will write the resulting dataset to the HF hub.

-Fondant provides multiple runners to run our pipeline:
-- A Docker runner for local execution
-- A Vertex AI runner for managed execution on Google Cloud
-- A Kubeflow Pipelines runner for execution anywhere
-
+Fondant provides different runners to run our pipeline.
Here we will use the local runner, which utilizes Docker compose under the hood.
+For an overview of all runners, check the [Fondant documentation](https://fondant.ai/en/latest/pipeline/#running-a-pipeline).

-The runner will first build the custom component and download the reusable components from the
+The runner will first download the reusable components from the
component hub. Afterwards, you will see the components execute one by one.

```shell
@@ -78,11 +75,13 @@ fondant explore -b data_dir
To create your own dataset, you can update the generate_prompts component to generate prompts
describing the images you want.

-Make the changes you in the
-[./components/generate_prompts/src/main.py](./components/generate_prompts/src/main.py) file.
+The component is implemented as a
+[lightweight component](https://fondant.ai/en/latest/components/lightweight_components/)
+at [./components/generate_prompts/__init__.py](./components/generate_prompts/__init__.py).
+You can update it to create your own prompts.
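
For example (a purely illustrative edit reusing the list names from the component source), trimming or extending the class-level prompt lists is enough to steer the dataset toward different images:

```python
# Illustrative only: a shortened version of the `interior_styles` list inside
# GeneratePromptsComponent, with one new style appended. The prompts are
# rebuilt from these lists on every run.
interior_styles = [
    "art deco",
    "bauhaus",
    "japandi",  # newly added style (example value)
]
```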

If you now re-run your pipeline, the new changes will be picked up and Fondant will automatically
-re-build the component with the changes included.
+execute the component with the changes included.

```shell
fondant run local pipeline.py
@@ -98,5 +97,5 @@ fondant explore -b data_dir
## Scaling up

If you're happy with your dataset, it's time to scale up. Check
-[our documentation](https://fondant.ai/en/latest/pipeline/#compiling-and-running-a-pipeline) for
+[our documentation](https://fondant.ai/en/latest/components/lightweight_components/) for
more information about the available runners.
Empty file added src/components/__init__.py
Empty file.
src/components/generate_prompts.py (125 changes: 125 additions & 0 deletions)
@@ -0,0 +1,125 @@
"""
This component generates a set of initial prompts that will be used to retrieve images
from the LAION-5B dataset.
"""
import typing as t

import dask.dataframe as dd
import pandas as pd
import pyarrow as pa

from fondant.component import DaskLoadComponent
from fondant.pipeline import lightweight_component


@lightweight_component(produces={"prompt": pa.string()})
class GeneratePromptsComponent(DaskLoadComponent):
    interior_styles = [
        "art deco",
        "bauhaus",
        "bouclé",
        "maximalist",
        "brutalist",
        "coastal",
        "minimalist",
        "rustic",
        "hollywood regency",
        "midcentury modern",
        "modern organic",
        "contemporary",
        "modern",
        "scandinavian",
        "eclectic",
        "bohemian",
        "industrial",
        "traditional",
        "transitional",
        "farmhouse",
        "country",
        "asian",
        "mediterranean",
        "rustic",
        "southwestern",
        "coastal",
    ]

    interior_prefix = [
        "comfortable",
        "luxurious",
        "simple",
    ]

    rooms = [
        "Bathroom",
        "Living room",
        "Hotel room",
        "Lobby",
        "Entrance hall",
        "Kitchen",
        "Family room",
        "Master bedroom",
        "Bedroom",
        "Kids bedroom",
        "Laundry room",
        "Guest room",
        "Home office",
        "Library room",
        "Playroom",
        "Home Theater room",
        "Gym room",
        "Basement room",
        "Garage",
        "Walk-in closet",
        "Pantry",
        "Gaming room",
        "Attic",
        "Sunroom",
        "Storage room",
        "Study room",
        "Dining room",
        "Loft",
        "Studio room",
        "Apartment",
    ]

    def __init__(self, *, n_rows_to_load: t.Optional[int]) -> None:
        """
        Generate a set of initial prompts that will be used to retrieve images from the
        LAION-5B dataset.
        Args:
            n_rows_to_load: Optional argument that defines the number of rows to load.
                Useful for testing pipeline runs on a small scale
        """
        self.n_rows_to_load = n_rows_to_load

    @staticmethod
    def make_interior_prompt(room: str, prefix: str, style: str) -> str:
        """Generate a prompt for the interior design model.
        Args:
            room: room name
            prefix: prefix for the room
            style: interior style
        Returns:
            prompt for the interior design model
        """
        return f"{prefix.lower()} {room.lower()}, {style.lower()} interior design"

    def load(self) -> dd.DataFrame:
        import itertools

        room_tuples = itertools.product(
            self.rooms, self.interior_prefix, self.interior_styles
        )
        prompts = map(lambda x: self.make_interior_prompt(*x), room_tuples)

        pandas_df = pd.DataFrame(prompts, columns=["prompt"])

        if self.n_rows_to_load:
            pandas_df = pandas_df.head(self.n_rows_to_load)

        df = dd.from_pandas(pandas_df, npartitions=1)

        return df
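
For reference, here is a minimal sketch of how a lightweight component like this is typically wired into `pipeline.py` (assuming the Fondant 0.10 `Pipeline.read` interface; the pipeline name, base path, and argument values are illustrative, so check the pinned Fondant documentation for exact signatures):

```python
# Minimal sketch, not the repository's actual pipeline.py: it assumes the
# Fondant 0.10 dataset interface and passes the lightweight component
# class directly as the component reference.
from fondant.pipeline import Pipeline

# Module added in this commit at src/components/generate_prompts.py
from components.generate_prompts import GeneratePromptsComponent

pipeline = Pipeline(
    name="controlnet-interior-design",  # illustrative pipeline name
    base_path="./data_dir",  # local artifact directory, as used by `fondant explore -b data_dir`
)

# The lightweight component acts as the load step of the pipeline.
dataset = pipeline.read(
    ref=GeneratePromptsComponent,
    arguments={"n_rows_to_load": 10},  # small value for a quick test run
)
```

Because the component code ships as part of the pipeline definition, there is no custom image to build, which is why the README above now says the runner only downloads the reusable components.
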
src/components/generate_prompts/Dockerfile (20 changes: 0 additions & 20 deletions)

This file was deleted.

src/components/generate_prompts/README.md (40 changes: 0 additions & 40 deletions)

This file was deleted.

src/components/generate_prompts/fondant_component.yaml (13 changes: 0 additions & 13 deletions)

This file was deleted.

src/components/generate_prompts/requirements.txt (1 change: 0 additions & 1 deletion)

This file was deleted.

src/components/generate_prompts/src/main.py (122 changes: 0 additions & 122 deletions)

This file was deleted.

