MultiWOZ turn based dataset #66

Open
1 of 5 tasks
oplatek opened this issue Feb 9, 2023 · 1 comment

Comments

@oplatek
Collaborator

oplatek commented Feb 9, 2023

  • Whole-conversation view with the reference instructions given to users. See Multiwoz #63
  • Turn-based view:
    • Not straightforward with a Hugging Face Dataset and tabgenie: I need a 1-to-n mapping between
      • Either pregenerate all prefixes - bad for storage
      • Or create a completely new class, DialogueDataset, equivalent to TabularDataset
        • get_example_count will be computed differently - it counts turns instead of conversations (treated as tables)
        • the turn prefixes will be generated on the fly from the stored conversation
    • system view
    • user view
  • Consider adding multiple annotations. See the "all_versions" MultiWOZ dataset: https://huggingface.co/datasets/pietrolesci/multiwoz_all_versions/viewer/pietrolesci--multiwoz_all_versions/test
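
A rough sketch of how the turn-based view could work, assuming each conversation is stored once as a list of (speaker, utterance) pairs. The class below is hypothetical and only borrows the names DialogueDataset and get_example_count from the list above; get_example, the constructor argument and the data layout are assumptions for illustration, not tabgenie's actual API. It counts turns rather than conversations and slices each prefix on demand, so no prefixes need to be pregenerated.

```python
from bisect import bisect_right
from itertools import accumulate


class DialogueDataset:
    """Hypothetical turn-based view over stored conversations (not tabgenie's API)."""

    def __init__(self, conversations):
        # Each conversation is a list of (speaker, utterance) pairs, stored once.
        self.conversations = conversations
        # offsets[i] = total number of turns in conversations[0..i]; used to map
        # a flat turn index back to the conversation that contains it.
        self.offsets = list(accumulate(len(c) for c in conversations))

    def get_example_count(self):
        # Examples are turns, not conversations.
        return self.offsets[-1] if self.offsets else 0

    def get_example(self, idx):
        if not 0 <= idx < self.get_example_count():
            raise IndexError(idx)
        # Find the first conversation whose cumulative turn count exceeds idx.
        conv_idx = bisect_right(self.offsets, idx)
        start = self.offsets[conv_idx - 1] if conv_idx > 0 else 0
        turn_idx = idx - start
        # The prefix is sliced on the fly; nothing is precomputed or duplicated.
        return self.conversations[conv_idx][: turn_idx + 1]


# Toy data for illustration only (not taken from MultiWOZ).
toy = DialogueDataset([
    [("user", "I need a hotel."), ("system", "Which area?")],
    [("user", "Book a taxi."), ("system", "Where to?"), ("user", "The station.")],
])
first_turn_of_second_dialogue = toy.get_example(2)
```

A system view or user view could be layered on top by keeping only the indices whose last turn has the matching speaker, but that filtering is left out of this sketch.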

See also budzianowski/multiwoz#119 (comment)

@oplatek oplatek self-assigned this Feb 9, 2023
@oplatek
Collaborator Author

oplatek commented Feb 9, 2023

@oplatek oplatek changed the title from "MultiWOZ dataset" to "MultiWOZ turn based dataset" Feb 10, 2023
@kasnerz kasnerz added the data Related to datasets label Feb 20, 2023