Improve descriptions of task_description and domain elements #26

Merged: 1 commit, May 31, 2024
11 changes: 8 additions & 3 deletions v2/compositional_skills.json
@@ -16,9 +16,14 @@
 "minLength": 1
 },
 "task_description": {
-"description": "A description of the skill.",
+"description": "A description of the task which is used in prompts to the teacher model during synthetic data generation. The description should be detailed and prescriptive to improve the teacher model's responses.",
 "type": "string",
-"minLength": 1
+"minLength": 1,
+"examples": [
+"Extracting content from a financial report and providing it in bulleted format",
+"Providing engaging explanations for common questions across diverse topics at a primary school level",
+"Assume the roles of historical figures and provide engaging explanations for common questions across diverse topics"
+]
 },
 "seed_examples": {
 "description": "An array of seed examples for synthetic data generation.",
@@ -34,7 +39,7 @@
 "unevaluatedProperties": false,
 "properties": {
 "context": {
-"description": "Information that the model is expected to take into account during processing. This is different from knowledge, where the model is expected to gain facts and background knowledge from the tuning process.",
+"description": "Information that the teacher model is expected to take into account during processing. This is different from knowledge, where the model is expected to gain facts and background knowledge from the tuning process.",
 "type": "string",
 "minLength": 1
 },
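As a rough illustration of what the `task_description` constraints in this file amount to, here is a minimal standard-library sketch. It only mirrors the two rules visible in the diff (`"type": "string"` and `"minLength": 1`); the helper name `check_string_field` is hypothetical and is not part of the schema repository or of any validator library.

```python
# Constraints copied from the diff: "type": "string", "minLength": 1.
TASK_DESCRIPTION_RULES = {"type": "string", "minLength": 1}


def check_string_field(value, rules=TASK_DESCRIPTION_RULES):
    """Hypothetical helper: return a list of problems found in one string field."""
    problems = []
    if not isinstance(value, str):
        problems.append("must be a string")
    elif len(value) < rules["minLength"]:
        problems.append(f"must have at least {rules['minLength']} character(s)")
    return problems


if __name__ == "__main__":
    # One of the examples added to the schema in this change.
    print(check_string_field(
        "Extracting content from a financial report and providing it in bulleted format"
    ))
    print(check_string_field(""))
```

Note that the new `examples` entries are annotations only; under these rules any non-empty string still passes, so the "detailed and prescriptive" guidance is advice to authors rather than something the schema enforces.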
21 changes: 15 additions & 6 deletions v2/knowledge.json
@@ -18,14 +18,23 @@
 "minLength": 1
 },
 "domain": {
-"description": "The knowledge domain.",
+"description": "The knowledge domain which is used in prompts to the teacher model during synthetic data generation. The domain should be brief such as the title to a textbook chapter or section.",
 "type": "string",
-"minLength": 1
+"minLength": 1,
+"examples": [
+"Chemistry",
+"History",
+"Pop culture"
+]
 },
 "task_description": {
-"description": "A description of the skill.",
+"description": "A description of the task which is used in prompts to the teacher model during synthetic data generation. The description should be detailed and prescriptive to improve the teacher model's responses.",
 "type": "string",
-"minLength": 1
+"minLength": 1,
+"examples": [
+"To teach a language model about softball history",
+"To teach a language model about tabby cats"
+]
 },
 "seed_examples": {
 "description": "An array of seed examples for synthetic data generation.",
@@ -68,15 +77,15 @@
 "type": "string",
 "minLength": 1,
 "examples": [
-"https://github.com/instructlab/instructlab"
+"https://github.com/instructlab/instructlab.git"
 ]
 },
 "commit": {
 "description": "The commit in the Git repository containing the knowledge documents.",
 "type": "string",
 "minLength": 1,
 "examples": [
-"951999a"
+"951999afdc59c46d325493568193b40bd5439c9e"
 ]
 },
 "patterns": {
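The `commit` example in this file changes from an abbreviated hash to a full one. The schema shown here still only requires a non-empty string, so the sketch below is not something the schema enforces. It is a hypothetical pre-submission check, assuming the repository uses SHA-1 object names (40 lowercase hexadecimal characters); the name `is_full_sha1` is made up for illustration.

```python
import re

# Assumption: SHA-1 object names, i.e. exactly 40 lowercase hex characters.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_full_sha1(commit: str) -> bool:
    """Hypothetical helper: True only for a full 40-character SHA-1 hash."""
    return FULL_SHA1.fullmatch(commit) is not None


if __name__ == "__main__":
    # The old and new example values from the diff.
    for value in ("951999a", "951999afdc59c46d325493568193b40bd5439c9e"):
        print(value, is_full_sha1(value))
```

A full hash identifies one commit unambiguously, whereas an abbreviated one can become ambiguous as a repository grows, which is a plausible reason for the updated example.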