Access to secondary databases #570

ederag · 2020-02-21T17:50:46Z

There is a need to access other hamster databases as well, not only the main hamster.db,
e.g. for merging a foreign database into the main one (#340).
As recently discussed, this is not possible using the org.gnome.Hamster D-Bus service,
which is tied to the main database.

One possibility:

add an option (--no-dbus ?) to hamster-cli to bypass D-Bus and use db.Storage directly.
add an option (--db-file=<filename>) to hamster-cli.
Edit: Still not fully sorted out, but the following one was probably being over-cautious,
as correctly indicated by @GeraldJansen below.
Of course this access is risky
(there is no central server to avoid simultaneous write access in this case),
so if the server is running, it should be forbidden to access the server database.
(that one will remain hard-coded)

Still thinking, open to discussion.

The text was updated successfully, but these errors were encountered:

GeraldJansen · 2020-02-23T11:04:25Z

It seems to me that you underestimate sqlite3's capabilities for handling concurrent requests from multiple processes and overestimate the risk associated with accessing hamster.db directly, by-passing the hamster-service D-Bus interface (eg. here). From a bit of reading, for example

in my assessment, considering the small size our hamster.db files, the processing speed of even oldish laptops/desktops and the low likelihood of concurrent write requests, the "risk" associated with accessing hamster.db without using the D-Bus interface is extremely low.

ederag · 2020-02-23T19:48:42Z

Thanks for the links ! Lots of food for thought.

ederag · 2020-03-01T15:07:11Z

The links were spot-on,
and reading carefully is fully convincing that sqlite is robust and perfect for hamster,
sql writes from different processes just queue up.
(timeout is possible, but the database integrity remains)
It's not a matter of probabilities, but of how sqlite works.
What a relief, thanks Gerald !

Next step is to check that the start_transaction / end_transaction blocs are enough
to protect from database changes during python code execution,
in particular this call in the middle of __add_fact__ should be checked:

hamster/src/hamster/storage/db.py

Line 665 in 6fa4c20

self.__solve_overlaps(start_time, end_time)

Not saying there is an issue, just that it deserves to be checked twice.

The first point is almost done (PR #573).
It will be used to test the second one, which should be simpler.

matthijskooijman · 2020-05-18T13:01:26Z

I actually wonder if a --no-dbus option is needed at all for this? I can imagine two flavours of "access to secondary databases":

Running hamster normally, with multiple databases active (e.g. loading data from two databases, writing data to one, or maybe other ways to decide where to write data).
Running a one-off "merge data" process.

I think the point of this issue is option 2. As for option 1, this would probably require invasive changes, which would be better suited to include in the rewrite rather than the current codebase, I believe. Also, if this would be supported usecase, I think it should actually access both databases through dbus (maybe even present them as a single db somehow), rather than bypassing dbus for this usecase.

As for option 2., I wonder if this is not something that can be implemented already? I would think you can create a script that creates two instances of hamster.storage.db.Storage and merges them into either one (or a new db) without accessing dbus at all? Or maybe even more elegant, a script that creates one extra storage and merges that into the main storage accessed through dbus. I can imagine that this needs some refactoring for code that currently assumes there is a dbus connection available and/or assumes there is only one storage available, but I expect that this would need a lot less than what #573 implements.

GeraldJansen · 2020-05-18T16:38:09Z

Yes, the point of this issue is option 2, with the proviso that "one-off" could include frequent (say daily) synchronization of databases (say work PC and home PC).

An external script (something like htool) is okay for the one-off case, but seriously risks obscurity and abandonment. Integrating that functionality into hamster-cli.py would be better. All that is needed is something like

hamster --no-dbus --db-file=<some/hamster.db> export tsv [date1 [time1]] [date2 [time2]] > some.tsv
hamster --no-dbus --db-file=<other/hamster.db> load tsv some.tsv

A lot of flexibility is gained if the whole CLI can use hamster.storage.db.Storage directly instead of the dbus interface, thereby avoiding having to be careful about stopping dbus services tied to a specific database file. This would also very convenient for mock databases for development and testing, so you don't have to continuously be worried about corrupting your precious "real" hamster.db.

matthijskooijman · 2020-05-18T17:54:25Z

Yes, the point of this issue is option 2, with the proviso that "one-off" could include frequent (say daily) synchronization of databases (say work PC and home PC).

Ah, good point. I was thinking about an "importing data from an old machine" kind of setup, but tracking on multiple machines indeed makes sense.

And indeed, just using separate export and load commands seems like an elegant way to support this usecase, while also allowing the user to do some preprocessing before import if needed.

Of course, you can still do a slightly less flexible version of this without --no-dbus, e.g.:

hamster importdb <some/hamster.db>

which would import the contents of the given db into the main db (accessed through dbus). This could be implemented by creating a secondary db.Storage next to the main client.Storage, as I previously suggested.

However, I like the elegancy and flexibility of importing using a load tsv (or similar) command.

Thanks for adding some examples :-)

ederag added the enhancement label Feb 21, 2020

ederag mentioned this issue Feb 21, 2020

Not using "exactly the main hamster machinery" GeraldJansen/hamster-tool#1

Closed

ederag mentioned this issue Mar 1, 2020

WIP: No dbus #573

Open

ederag mentioned this issue Mar 10, 2020

Welcome, a heap of gratitude and update! #574

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Access to secondary databases #570

Access to secondary databases #570

ederag commented Feb 21, 2020 •

edited

Loading

GeraldJansen commented Feb 23, 2020

ederag commented Feb 23, 2020

ederag commented Mar 1, 2020 •

edited

Loading

matthijskooijman commented May 18, 2020

GeraldJansen commented May 18, 2020

matthijskooijman commented May 18, 2020

Access to secondary databases #570

Access to secondary databases #570

Comments

ederag commented Feb 21, 2020 • edited Loading

GeraldJansen commented Feb 23, 2020

ederag commented Feb 23, 2020

ederag commented Mar 1, 2020 • edited Loading

matthijskooijman commented May 18, 2020

GeraldJansen commented May 18, 2020

matthijskooijman commented May 18, 2020

ederag commented Feb 21, 2020 •

edited

Loading

ederag commented Mar 1, 2020 •

edited

Loading