Getting the annotation ordered by page ? #3
Comments
Hi,
thank you for your interest in this tiny script, and you are welcome.
As I briefly mentioned here:
#1 (comment)
with regard to another issue, it does not seem to me that the
KoboReader.sqlite file stores the location information in a sane way.
In particular, the location seems to be expressed relative to the
Kobo-fied version of the EPUB, in a scheme similar to the EPUB Canonical
Fragment Identifier (EPUB CFI) format. For example, in table "Bookmark"
of the SQLite file, for each annotation/bookmark/highlight, the location
is expressed by:
ContentID: 136dc3fc-b6c2-4006-bb65-390f5e26e0df!OEBPS!ch01.html
StartContainerPath: span#kobo\.3\.1
StartOffset: 18
EndContainerPath: span#kobo\.3\.1
EndOffset: 213
While the ContainerPath/Offset values seem amenable to being ordered
lexicographically without any knowledge of their semantics (again, I
speculate they are the "Kobo equivalent" of EPUB CFI), the ContentID
depends on the structure of the EPUB, and you cannot just order it
lexicographically, because e.g. "acknowledgments.html" might come after
"ch01.html" in the reading order of the EPUB.
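To make the ContentID problem concrete, here is a small Python sketch. Only the first ContentID is taken from the example above; the second one and the `reading_order` list are made up for illustration, and the helper `file_name` assumes that the file inside the EPUB is whatever follows the last "!".

```python
# Only the first ContentID comes from the example above; the second is hypothetical.
content_ids = [
    "136dc3fc-b6c2-4006-bb65-390f5e26e0df!OEBPS!ch01.html",
    "136dc3fc-b6c2-4006-bb65-390f5e26e0df!OEBPS!acknowledgments.html",
]

# Assumed reading order. In practice it would have to come from the EPUB
# itself, since the SQLite file does not seem to store it reliably.
reading_order = ["ch01.html", "acknowledgments.html"]


def file_name(content_id):
    # Assumption: the part after the last "!" is the file inside the EPUB.
    return content_id.rsplit("!", 1)[-1]


lexicographic = sorted(content_ids)
by_reading_order = sorted(
    content_ids, key=lambda cid: reading_order.index(file_name(cid))
)

print([file_name(c) for c in lexicographic])
print([file_name(c) for c in by_reading_order])
```

The plain sort puts "acknowledgments.html" first, while the assumed reading order has it last, so the two orders disagree.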
The table "content" contains the values for ContentID (so I guess there
is a foreign key relationship between tables "content" and "Bookmark"),
and there is a VolumeIndex integer field that seems to suggest some
ordering of the ContentID values. However, in the KoboReader.sqlite I
have, for some EPUB books there are gaps in the values, and for some
other EPUB books there is no VolumeIndex value at all.
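If VolumeIndex does reflect the reading order (which is only a guess), a join could be sketched as below. This uses a toy in-memory database, not the real KoboReader.sqlite schema: the two tables are cut down to the columns mentioned in this thread, the table name "Bookmark" and the "Text" column are assumptions, and all rows are invented. Gaps in VolumeIndex do not affect the ordering, but missing values do, so here they are simply pushed to the end.

```python
import sqlite3

# Toy schema with only the columns discussed here; the real tables have more.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE content (ContentID TEXT, VolumeIndex INTEGER);
    CREATE TABLE Bookmark (ContentID TEXT, Text TEXT);
    """
)

# Invented rows: one file has no VolumeIndex, and the others have a gap.
conn.executemany(
    "INSERT INTO content VALUES (?, ?)",
    [
        ("book!OEBPS!ch01.html", 3),
        ("book!OEBPS!title.html", 0),
        ("book!OEBPS!notes.html", None),
    ],
)
conn.executemany(
    "INSERT INTO Bookmark VALUES (?, ?)",
    [
        ("book!OEBPS!notes.html", "from notes"),
        ("book!OEBPS!ch01.html", "from ch01"),
        ("book!OEBPS!title.html", "from title"),
    ],
)

# Order by VolumeIndex, with rows lacking a VolumeIndex sorted last.
rows = conn.execute(
    """
    SELECT b.Text
    FROM Bookmark AS b
    LEFT JOIN content AS c ON c.ContentID = b.ContentID
    ORDER BY c.VolumeIndex IS NULL, c.VolumeIndex
    """
).fetchall()
ordered = [row[0] for row in rows]
print(ordered)
```

This only orders annotations across files; the order of several annotations inside the same file would still have to come from the ContainerPath/Offset values.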
Point #3 of https://github.com/pettarin/export-kobo#notes says: "Bear in
mind that no official specifications are published by Kobo, hence the
script works as far as my understanding of the database structure of
KoboReader.sqlite is correct, and its schema remains the same."
On 01/27/2017 03:30 PM, sappounet wrote:
Hello,
Thanks for this useful tool :)
I'm having a small problem with the annotations :
Most of the time, I read a book two or three times, and I annotate
something every time I read it.
The problem is that when I export the Annotations, they are ordered by
date and not by Page number or Line number, so when I then read the
Annotations, the order is a bit random and it makes no real sense
according to the book order/chapters.
Is it possible to add a functionality that will order the Annotations in
order of appearance in a book, instead of ordering them by the date ?
Thanks :)
|
Hi :) I kind of forgot about exporting Notes from my Kobo reader, but I decided to give it one more try today.
I've been browsing the content of the file KoboReader.sqlite, and it seems that the column "Bookmark.StartContainerPath" does indeed contain a value precise enough to order the results so that they match the order in which the highlights appear in the book.
I tried with a couple of books, and this query returns ALMOST the right order:
SELECT Bookmark.Text FROM Bookmark
I wrote "ALMOST" because the "ORDER BY" clause is a bit stupid and put the following row
But I'm pretty sure it's quite easy to fix that with Python (but my Python skills are bad, so I'm going to ask a coworker tomorrow :) )
So yeah, maybe for some books it will not work, but so far it seems to be enough to get the Notes in the right order for pretty much all the ebooks that I tried :)
I'll keep you posted :) |
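The Python fix hinted at above could look roughly like the sketch below: a sort key that compares the digit runs in StartContainerPath as numbers instead of as text. The function name `container_path_key` is made up, and every sample path except "span#kobo\.3\.1" is invented. Whether the numeric order of these values really follows the reading order is an assumption based on the observations in this thread, not on anything Kobo documents.

```python
import re


def container_path_key(path):
    # re.split with one capture group alternates text and digit runs,
    # so the digit runs always sit at the odd positions of the result.
    parts = re.split(r"(\d+)", path)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


# Invented values in the style of "span#kobo\.3\.1".
paths = [
    r"span#kobo\.10\.1",
    r"span#kobo\.3\.12",
    r"span#kobo\.3\.2",
    r"span#kobo\.3\.1",
]

plain = sorted(paths)
numeric = sorted(paths, key=container_path_key)

print(plain)
print(numeric)
```

The plain sort puts "kobo\.10\.1" before "kobo\.3\.1" and "kobo\.3\.12" before "kobo\.3\.2", while the numeric key gives 3.1, 3.2, 3.12, 10.1. This only helps within a single file of the EPUB; it says nothing about the order of the files themselves.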
If Bookmark.StartContainerPath is the right order, we can just add a numeric sort in the I'm happy to contribute a pull request on this; however I'm not totally sure about this. Need the author's confirmation. |
Well, again, nobody except Kobo knows for sure; we can only observe the values their code puts in the SQLite file. If you want to provide a PR with the functionality, I will merge it. But I am afraid that there is no simple way to correctly sort Bookmark.StartContainerPath (even parsing it to take into account numeric vs. lexicographic values) by just looking at its values, since an EPUB might contain a file "page2.xhtml" appearing before "page1.xhtml" in the TOC order. Or, if you fancy another example, if you just sort Bookmark.StartContainerPath, you get "acknowledgments.xhtml" before "title.xhtml". Unfortunately I have no time (and no longer a Kobo!) to investigate the issue further. |