Stuck at 'Stage 1 - Document filtering' #68
Comments
Testing this just with that doc it seems like we can make that regular expression more performant by making the start more specific, using |
Hey @peteruithoven - the initial raw export JSON from CouchDB wraps each individual document in the database in an enclosing id/key/rev/doc section. To make the export importable back into CouchDB, we need to strip this wrapper off. The stage 2 sed then removes the leftover closing curly brace of this 'wrapping' section, and sed stages 3 and 4 fix the header and footer of the JSON. I don't have a means to test it right now. Would you be able to confirm that the exported file is importable again with this change? And, for the record, could you share any detail on the speed improvement or time reduction?
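To make the wrapper stripping described above concrete, here is a minimal Python sketch of stages 1 and 2 applied to a single row. The row layout, the regular expressions, and the `unwrap_row` helper are assumptions for illustration only; they are not the script's actual sed expressions.

```python
import json
import re

# Hypothetical row, shaped like one line of an export that wraps each
# document in id/key/rev/doc (assumed layout, not copied from the script).
row = '{"id":"a1","key":"a1","value":{"rev":"1-abc"},"doc":{"_id":"a1","_rev":"1-abc","n":1}},'

# Stage 1 (illustrative pattern): drop everything up to and including
# the wrapper's "doc": key, anchored at the start of the line.
WRAPPER = re.compile(r'^\{"id".*?,"doc":')


def unwrap_row(line: str) -> str:
    stripped = WRAPPER.sub('', line, count=1)
    # Stage 2: remove the wrapper's leftover closing brace while keeping
    # the trailing comma that separates rows.
    return re.sub(r'\}\}(,?)$', r'}\1', stripped)


doc_line = unwrap_row(row)
print(doc_line)

# The unwrapped line (minus its trailing comma) parses as a plain document.
doc = json.loads(doc_line.rstrip(','))
```

The sketch covers only the per-row steps; the header and footer fixes of stages 3 and 4 are left out.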
Thanks for the clarifications. I've exported all my databases with the altered script, removed them, and then reimported them. I haven't found any issues so far. As for speed, I let the old version run for more than 10 minutes on that 39MB file with no progress, and I have never seen it finish. With my alteration, it takes maybe a few seconds.
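A rough way to check that a more specific pattern start leaves the output unchanged is to run both variants on the same row. The Python sketch below assumes a synthetic row and two illustrative patterns (neither is taken from the script). One plausible explanation for the slowdown, offered here as an assumption, is that an unanchored pattern with a global replace can be retried at every offset of a very long line, while an anchored one can only match once.

```python
import re
import time

# Synthetic stand-in for a base64 attachment body; kept small on purpose
# so the unanchored pattern finishes quickly in a backtracking engine.
payload = "QUJD" * 500

# Hypothetical export row (assumed layout) with the payload inside the doc.
row = (
    '{"id":"a1","key":"a1","value":{"rev":"1-abc"},'
    '"doc":{"_id":"a1","_rev":"1-abc","data":"' + payload + '"}},'
)

UNANCHORED = re.compile(r'.*,"doc":')       # may be retried at every offset
ANCHORED = re.compile(r'^\{"id".*,"doc":')  # can only match at the line start


def timed_sub(pattern, line):
    """Replace all matches with '' and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = pattern.sub('', line)
    return result, time.perf_counter() - start


loose_out, loose_secs = timed_sub(UNANCHORED, row)
strict_out, strict_secs = timed_sub(ANCHORED, row)

print(f"unanchored: {loose_secs:.6f}s, anchored: {strict_secs:.6f}s")
```

Timings from a toy row like this say little about sed on a 39MB file; the useful part is confirming that both patterns produce the same stripped line.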
* Optimized stage 1 reg-exp. See: #68
* Version bump to 1.1.8
Merged and closed; thanks!
Thanks for checking and merging
Thanks Darren
… On 3 Aug 2017, at 14:06, Peter Uithoven ***@***.***> wrote:
Thanks for checking and merging
I'm using couchdb-dump version 1.1.7.
I have a database that is successfully downloaded to a file (39MB), but the script then gets stuck at 'Stage 1 - Document filtering'. Since the file is below 250MB, the parsing isn't multi-threaded.
I'm assuming it's stuck at the sed line:
Could someone explain the purpose of removing '.*,"doc":'? Is this the Database Compaction or the Purge Historic and Deleted Data logic?
Looking into the JSON file, it removed the following part on each line.
I think a comment above that code would be welcome.
I'm assuming my issue is caused by the binary attachments in all the docs.
I don't think #31 helps me, since I do want this to happen.