* initial video support
* support map and formatting
* ci test
* set row group size
* add to webdataset
* typos
* try ci without decord just in case
* import torch before decord to fix random_device could not be read
* fix CI
* minor
* better memory handling in push_to_hub
* better memory handling in load_dataset
* basic docs
* add to toc
* streaming tweaks
* keep hf:// URL in the video "path" field for the viewer
Showing 25 changed files with 710 additions and 19 deletions.
# Load video data

<Tip warning={true}>

Video support is experimental and is subject to change.

</Tip>

Video datasets have [`Video`] type columns, which contain `decord` objects.

<Tip>

To work with video datasets, you need to have the `vision` dependency installed. Check out the [installation](./installation#vision) guide to learn how to install it.

</Tip>

When you load a video dataset and call the video column, the videos are decoded as `decord` `VideoReader` objects:

```py
>>> from datasets import load_dataset, Video

>>> dataset = load_dataset("path/to/video/folder", split="train")
>>> dataset[0]["video"]
<decord.video_reader.VideoReader at 0x1652284c0>
```

<Tip warning={true}>

Index into a video dataset using the row index first and then the `video` column - `dataset[0]["video"]` - to avoid decoding and resampling all the video objects in the dataset. Otherwise, this can be a slow process for a large dataset.

</Tip>
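Why row-first indexing matters can be sketched with a toy decoder that counts decode calls (the `CountingDecoder` class below is purely illustrative, not part of `datasets`):

```python
class CountingDecoder:
    """Toy stand-in for a video decoder; counts how many items get decoded."""

    def __init__(self):
        self.decoded = 0

    def decode(self, path):
        self.decoded += 1
        return f"<decoded {path}>"


decoder = CountingDecoder()
paths = [f"video_{i}.mp4" for i in range(1000)]

# dataset[0]["video"]: fetch one row, then decode its single video.
row_first = decoder.decode(paths[0])
after_row_access = decoder.decoded  # 1 decode so far

# dataset["video"][0]: materialize the whole column (decoding every video),
# then index into it.
column = [decoder.decode(p) for p in paths]
column_first = column[0]
after_column_access = decoder.decoded  # 1 + 1000 decodes
```

Accessing the row first pays for one decode; accessing the column first pays for all of them.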

For a guide on how to load any type of dataset, take a look at the <a class="underline decoration-sky-400 decoration-2 font-semibold" href="./loading">general loading guide</a>.

## Local files

You can load a dataset from video file paths. Use the [`~Dataset.cast_column`] function to accept a column of video file paths and decode them into `decord` videos with the [`Video`] feature:

```py
>>> from datasets import Dataset, Video

>>> dataset = Dataset.from_dict({"video": ["path/to/video_1", "path/to/video_2", ..., "path/to/video_n"]}).cast_column("video", Video())
>>> dataset[0]["video"]
<decord.video_reader.VideoReader at 0x1657d0280>
```
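The list of paths passed to `from_dict` can be gathered with the standard library; for example (the directory and file names here are made up for illustration):

```python
import tempfile
from pathlib import Path

# Create a throwaway folder with a few empty .mp4 files for illustration.
root = Path(tempfile.mkdtemp())
for name in ["video_1.mp4", "video_2.mp4", "video_3.mp4"]:
    (root / name).touch()

# Collect the paths in a stable order, as strings, ready for Dataset.from_dict.
video_paths = sorted(str(p) for p in root.glob("*.mp4"))
```

Sorting keeps the column order deterministic across runs.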

If you only want to load the underlying path to the video dataset without decoding the video object, set `decode=False` in the [`Video`] feature:

```py
>>> dataset = dataset.cast_column("video", Video(decode=False))
>>> dataset[0]["video"]
{'bytes': None,
 'path': 'path/to/video/folder/video0.mp4'}
```
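With `decode=False`, each record is a dict carrying either embedded `bytes` or a `path`. A minimal helper to resolve such a record to raw bytes might look like this (`resolve_video_bytes` is a hypothetical helper, not a `datasets` API):

```python
import tempfile
from pathlib import Path


def resolve_video_bytes(record):
    """Return raw bytes for a {'bytes': ..., 'path': ...} record."""
    if record["bytes"] is not None:
        return record["bytes"]  # bytes embedded in the dataset
    return Path(record["path"]).read_bytes()  # fall back to reading the file


# Illustration with a throwaway file standing in for a video.
path = Path(tempfile.mkdtemp()) / "video0.mp4"
path.write_bytes(b"\x00\x00\x00 ftypisom")  # fake MP4 header bytes

from_path = resolve_video_bytes({"bytes": None, "path": str(path)})
from_bytes = resolve_video_bytes({"bytes": b"abc", "path": None})
```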

## VideoFolder

You can also load a dataset with the `VideoFolder` dataset builder, which does not require writing a custom dataloader. This makes `VideoFolder` ideal for quickly creating and loading video datasets with several thousand videos for different vision tasks. Your video dataset structure should look like this:

```
folder/train/dog/golden_retriever.mp4
folder/train/dog/german_shepherd.mp4
folder/train/dog/chihuahua.mp4
folder/train/cat/maine_coon.mp4
folder/train/cat/bengal.mp4
folder/train/cat/birman.mp4
```
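In this layout the label comes from each file's parent directory name. The mapping can be sketched like this (a simplified stdlib illustration, not the actual `VideoFolder` implementation):

```python
import tempfile
from pathlib import Path

# Recreate a structure like the one above with empty placeholder files.
root = Path(tempfile.mkdtemp())
files = [
    "train/dog/golden_retriever.mp4",
    "train/dog/german_shepherd.mp4",
    "train/cat/maine_coon.mp4",
    "train/cat/bengal.mp4",
]
for f in files:
    p = root / f
    p.parent.mkdir(parents=True, exist_ok=True)
    p.touch()

# Class names come from the sorted directory names; labels are their indices.
class_names = sorted({p.parent.name for p in root.glob("train/*/*.mp4")})
label2id = {name: i for i, name in enumerate(class_names)}
examples = [
    {"video": str(p), "label": label2id[p.parent.name]}
    for p in sorted(root.glob("train/*/*.mp4"))
]
```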

Load your dataset by specifying `videofolder` and the directory of your dataset in `data_dir`:

```py
>>> from datasets import load_dataset

>>> dataset = load_dataset("videofolder", data_dir="/path/to/folder")
>>> dataset["train"][0]
{"video": <decord.video_reader.VideoReader at 0x161715e50>, "label": 0}

>>> dataset["train"][-1]
{"video": <decord.video_reader.VideoReader at 0x16170bd90>, "label": 1}
```

Load remote datasets from their URLs with the `data_files` parameter:

```py
>>> dataset = load_dataset("videofolder", data_files="https://foo.bar/videos.zip", split="train")
```

Some datasets have a metadata file (`metadata.csv`/`metadata.jsonl`) associated with them, containing other information about the data like bounding boxes, text captions, and labels. The metadata is automatically loaded when you call [`load_dataset`] and specify `videofolder`.
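For illustration, a `metadata.jsonl` typically sits next to the video files and links each row to its file through a `file_name` column (written here with the standard library; the caption text and file names are made up, and the other columns are up to your dataset):

```python
import json
import tempfile
from pathlib import Path

folder = Path(tempfile.mkdtemp())
rows = [
    {"file_name": "video_1.mp4", "text": "a dog catching a frisbee"},
    {"file_name": "video_2.mp4", "text": "a cat chasing a laser pointer"},
]

# One JSON object per line, placed next to the video files it describes.
with open(folder / "metadata.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

loaded = [json.loads(line) for line in open(folder / "metadata.jsonl")]
```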

To ignore the information in the metadata file, set `drop_labels=False` in [`load_dataset`], and allow `VideoFolder` to automatically infer the label name from the directory name:

```py
>>> from datasets import load_dataset

>>> dataset = load_dataset("videofolder", data_dir="/path/to/folder", drop_labels=False)
```

## WebDataset

The [WebDataset](https://github.com/webdataset/webdataset) format is based on a folder of TAR archives and is suitable for big video datasets.
Because of their size, WebDatasets are generally loaded in streaming mode (using `streaming=True`).

You can load a WebDataset like this:

```python
>>> from datasets import load_dataset

>>> dataset = load_dataset("webdataset", data_dir="/path/to/folder", streaming=True)
```
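A WebDataset shard groups the files of one example under a shared key (the shared basename). A minimal shard can be written and inspected with the standard library (the `.mp4`/`.json` pairing and the `clip_*` keys are an illustrative convention):

```python
import io
import json
import tarfile
import tempfile
from pathlib import Path

shard = Path(tempfile.mkdtemp()) / "shard-00000.tar"

# Each example contributes files sharing the same key: "clip_000.mp4" and
# "clip_000.json" together form one example.
with tarfile.open(shard, "w") as tar:
    for key in ["clip_000", "clip_001"]:
        for suffix, payload in [
            (".mp4", b"\x00fakevideo"),
            (".json", json.dumps({"label": key}).encode()),
        ]:
            info = tarfile.TarInfo(name=key + suffix)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

# Reading the shard back, members regroup by their key stem.
with tarfile.open(shard) as tar:
    keys = sorted({Path(m.name).stem for m in tar.getmembers()})
```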