Skip to content

Commit

Permalink
Add data format to readme (#90)
Browse files Browse the repository at this point in the history
  • Loading branch information
tleyden authored Jan 19, 2024
1 parent 5196a07 commit 699aedd
Showing 1 changed file with 20 additions and 0 deletions.
20 changes: 20 additions & 0 deletions dalm/pipelines/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,4 +13,24 @@ python dalm/pipelines/reading_comprehension_pipeline.py --model_name HuggingFace
--llm_synth_model_name meta-llama/Llama-2-13b-chat-hf \
--llm_synth_model_context_length 4096

```

### Data format

```
{"messages":[
[{"role":"...", "content": "..."}, {"role":"...", "content": "..."}, ...],
[{"role":"...", "content": "..."}, {"role":"...", "content": "..."}, ...],
[{"role":"...", "content": "..."}, {"role":"...", "content": "..."}, ...],
....
]
}
```

take from this snippet:

```
import datasets
a = datasets.load_dataset("arcee-ai/azure-reading-comprehension-dataset")
print(a["train"]["messages"][0])
```

0 comments on commit 699aedd

Please sign in to comment.