🌞🌞How to Quickly Identify and Add Newly Added Files to DVC? #10383
-
@Leo5050xvjf, please read Modifying large datasets. Let me know if you have any questions. But yes, you can track a single dataset and incorporate new files into it:
    dvc add dataset  # track a dataset
    # to only update with new files in /dataset/new-files directory
    dvc add dataset/new-files
The only downside is that you'd have to provide the filenames or directories to update explicitly.
EDIT: I forgot to mention that it can even work virtually: you don't have to download everything to update or remove items from the tracked dataset.
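If the new files are not already gathered in one directory, one rough way to spot them before the targeted `dvc add` is to compare modification times against the dataset's `.dvc` file. This is only a sketch, not a DVC feature: `list_new_files` is a hypothetical helper, the `.bmp` filter comes from the question, and it assumes `dataset.dvc` was last rewritten by the previous `dvc add`. Files copied with preserved timestamps (for example with `cp -p`) would be missed.

```shell
# Hypothetical helper (not part of DVC): print .bmp files under a dataset
# directory whose modification time is newer than the given .dvc file.
list_new_files() {
    dataset_dir="$1"   # e.g. dataset
    dvc_file="$2"      # e.g. dataset.dvc, assumed to be written by the last `dvc add`
    find "$dataset_dir" -type f -name '*.bmp' -newer "$dvc_file"
}

# Usage sketch, assuming dataset/ is already tracked through dataset.dvc:
#   list_new_files dataset dataset.dvc   # review what would be picked up
#   dvc add dataset/new-files            # update only that part of the dataset
#   git add dataset.dvc                  # record the updated pointer file
#   dvc push                             # upload the new objects to the remote
```

The listing step is just a sanity check; the update itself is still the explicit `dvc add dataset/new-files` from the reply above.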
-
Hi @Leo5050xvjf, this might be a good scenario for the upcoming DVCx release. If you are interested in discussing it and potentially being one of the first teams to try it, please get back to me and let's schedule a chat :). It would be great to learn more about your use case, and I hope I can give you more insights as well. Ping me at ivan [at] iterative.ai.
-
Imagine working with a dataset containing one million entries (e.g., .bmp files), and you've just added 1,000 new files to it. Is there a way to swiftly incorporate these 1,000 new files into DVC? Or must I resort to using commands to iteratively search for all .bmp files and attempt to add them to DVC?
What strategies or commands can be used to efficiently manage the addition of new datasets in such a large-scale project using DVC?
Looking for insights or tips from those who have navigated similar challenges.