Add transaction control mechanisms for query plans
* Add automatic session pickup from request
* Add ability to automatically commit a query plan
* Clean up transaction manager interface
* Add diagnostic route /status/tx for running transactions
* Add tests
* Add exemplary python client
* Add documentation
bastih committed Aug 20, 2013
1 parent c0c70c7 commit ab19a78
Showing 27 changed files with 765 additions and 200 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
@@ -7,3 +7,6 @@
[submodule "third_party/backward"]
path = third_party/backward
url = git://github.com/bastih/backward-cpp.git
[submodule "third_party/cereal"]
path = third_party/cereal
url = https://github.com/USCiLab/cereal.git
8 changes: 4 additions & 4 deletions Makefile
@@ -13,7 +13,7 @@ lib_io := $(lib_dir)/io
lib_testing := $(lib_dir)/testing
lib_net := $(lib_dir)/net
lib_layouter:= $(lib_dir)/layouter
lib_taskscheduler := $(lib_dir)/taskscheduler
lib_taskscheduler := $(lib_dir)/taskscheduler

# third party dependencies
json := $(build_dir)/jsoncpp
@@ -75,7 +75,7 @@ $(lib_ebb):
$(lib_helper):
$(lib_taskscheduler): $(lib_helper)
$(lib_storage): $(lib_helper) $(lib_ftprinter) $(ext_gtest) $(lib_ftprinter)
$(lib_io): $(lib_storage) $(lib_helper)
$(lib_io): $(lib_storage) $(lib_helper) $(lib_net)
$(lib_access): $(lib_storage) $(lib_helper) $(lib_io) $(lib_layouter) $(json) $(lib_taskscheduler) $(lib_net)
$(lib_testing): $(ext_gtest) $(lib_storage) $(lib_taskscheduler) $(lib_access)
$(lib_net): $(lib_helper) $(json) $(lib_taskscheduler) $(lib_ebb)
@@ -122,8 +122,8 @@ endif
python_test:
python tools/test_server.py

basic_test_targets := $(basic_test_binaries)
all_test_targets := $(all_test_binaries) python_test
basic_test_targets := $(basic_test_binaries) python_test
all_test_targets := $(all_test_binaries)

# Test invocation rules
test: unit_test_params = --minimal
4 changes: 3 additions & 1 deletion build/log.properties
@@ -9,8 +9,10 @@ log4j.appender.QueryLog.MaxBackupIndex=1
log4j.appender.QueryLog.layout=org.apache.log4j.PatternLayout
log4j.appender.QueryLog.layout.ConversionPattern=%d [%t] %-5p %c - %m%n


log4j.rootLogger=error,stdout
log4j.logger.hyrise=warn
log4j.logger.hyrise=warn,stdout
log4j.additivity.hyrise=false

log4j.logger.hyrise.access.queries=error,QueryLog
#Prevent parent logging settings from propagating
2 changes: 2 additions & 0 deletions config.g++47.mk
@@ -1,3 +1,5 @@
CC := gcc-4.7
CXX:= g++-4.7
LD := g++-4.7

BUILD_FLAGS += -Wno-type-limits
3 changes: 2 additions & 1 deletion config.mk
@@ -119,7 +119,8 @@ endif

JSON_PATH := $(IMH_PROJECT_PATH)/third_party/jsoncpp
FTPRINTER_PATH := $(IMH_PROJECT_PATH)/third_party/ftprinter/include
PROJECT_INCLUDE += $(IMH_PROJECT_PATH)/src/lib $(IMH_PROJECT_PATH)/third_party $(FTPRINTER_PATH) $(JSON_PATH)
CEREAL_PATH := $(IMH_PROJECT_PATH)/third_party/cereal/include
PROJECT_INCLUDE += $(IMH_PROJECT_PATH)/src/lib $(IMH_PROJECT_PATH)/third_party $(FTPRINTER_PATH) $(JSON_PATH) $(CEREAL_PATH)
LINKER_FLAGS += -llog4cxx -lpthread -lboost_system
BINARY_LINKER_FLAGS += -lbackward-hyr

209 changes: 191 additions & 18 deletions docs/queryexecution/tx.rst
@@ -7,8 +7,186 @@ guarantees during query execution. However, currently HYRISE is only able to
guarantee atomicity, consistency, and isolation but not durability since
currently no logging is used to store the transactions on disk.

In addition, transactions in HYRISE are not enabled by default; they must be selected explicitly when building the query plan.

Using transactions in your plan
===============================

Transactions work on the basis of session contexts. Every executed query plan
receives a transaction context (`TXContext`) as a part of parsing the plan. This
context allows plan operations to act based on the context.

There are two essential parts to transaction control in HYRISE:

1. Creating a new transaction context or reusing an existing one
2. Committing or rolling back the transaction context

Single-request transactions
---------------------------

Typically, we will run plans that are self-contained. As an example,
we may grab a table, insert a few records, and then commit those
records to make our changes visible to other transactions.

.. literalinclude:: ../../test/autojson/insert_commit.json
   :language: javascript
   :linenos:
   :lines: 1-2,8-25,27-
   :emphasize-lines: 4-5,8-13,17

The above example shows what a typical insert looks like: in the
`retrieve_revenue` operation, we retrieve a handle to the `revenue`
table, which has 3 columns (year, month, revenue).

Then, we continue by inserting 4 new rows into `revenue`. Finally, we
expose the inserted rows to other transactions by committing them in
the `commit` operation.

Let's use this example to examine the essential parts of transaction
control: creating a context and committing it.

.. seqdiag::

   seqdiag {
     autonumber = True;
     activation = none;

     client -> server [label = "POST /query/"];
     server -> parser [label = "query plan"];
     parser => txmanager [label = "create new context", return="context"]
     parser --> server [label = "plan operations"];
     === execute plan operations ===
     server => txmanager [label = "commit context", return="commit_id"]
     client <-- server [label = "response"];
   }

Context creation happens during the initial plan parsing: We create a
new plan context during every request (if you need different behavior,
look at the :ref:`next section <multiple-tx>`). All the operations
created for our request may then use the context to carry out their
operations. As an example, ``InsertScan`` uses the context to insert rows
marked with our *unique* transaction, so other transactions will not
assume that they are already visible.

Finally, we commit the automatically created context in
``Commit``. Commit basically checks for possible conflicts with other
transactions - in the case of plain insertion of rows, such a conflict
is not possible and thus, our example should always succeed.
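
The visibility rule behind the insert and commit steps can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not HYRISE's transaction manager: the class ``ToyTxManager`` and all of its methods are hypothetical, and the model ignores commit IDs and conflict checks entirely.

```python
class ToyTxManager:
    """Toy model of insert/commit visibility (hypothetical, not HYRISE code)."""

    def __init__(self):
        self._next_tid = 1
        # Each entry is [row, owner_tid, committed].
        self._rows = []

    def begin(self):
        """Create a new context and return its transaction ID."""
        tid = self._next_tid
        self._next_tid += 1
        return tid

    def insert(self, tid, row):
        """Store a row marked with the inserting transaction; not yet committed."""
        self._rows.append([row, tid, False])

    def visible_rows(self, tid):
        """Rows are visible if committed or written by the asking transaction."""
        return [row for row, owner, committed in self._rows
                if committed or owner == tid]

    def commit(self, tid):
        """Expose all rows written by `tid` to other transactions."""
        for entry in self._rows:
            if entry[1] == tid:
                entry[2] = True


manager = ToyTxManager()
writer = manager.begin()
reader = manager.begin()

manager.insert(writer, (2013, 8, 100))
own_view_before = manager.visible_rows(writer)    # the writer sees its own row
other_view_before = manager.visible_rows(reader)  # other contexts do not

manager.commit(writer)
late_reader = manager.begin()
other_view_after = manager.visible_rows(late_reader)
```

The point of the sketch is only the ordering: a row becomes visible inside its own context right after the insert, and to other transactions only after the commit.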

.. important::
   Once the execution of `insert` finishes, these 4 records are
   visible to all operations within the same context, but not to
   other transactions. Only `commit` makes them available to other
   transactions.

.. _multiple-tx:

Multiple transaction steps
--------------------------

While the behavior in the previous example is sufficient for simple
transactions, more elaborate transactions involving multiple queries
and logic carried out on the client naturally involve transaction
contexts that live longer than one request.

.. literalinclude:: ../../test/autojson/insert_wo_commit.json
   :lines: 1-2,8-22,24-
   :language: javascript
   :linenos:

Now in this query, we don't use the ``Commit`` operation. Instead, we
leave the context in an uncommitted state.

The server will then send a response similar to the following::

    {
      "session_context" : 12355,
      "performanceData" : { /*...*/ },
      /* and more ... */
    }

Since the session was not ended, it is up to the client to re-use the
``session_context`` returned by the query and to eventually end the
session with a ``Commit`` or ``Rollback`` plan operation.

The following sequence diagram illustrates the first request.

.. seqdiag::

   seqdiag {
     autonumber = True;
     activation = none;

     client -> server [label = "POST /query/"];
     server -> parser [label = "query plan"];
     parser => txmanager [label = "create new context", return="context"]
     parser --> server [label = "plan operations"];
     === execute plan operations `retrieve_revenue` and `insert` ===
     client <-- server [label = "response with context"];
   }

To reuse an existing session context, we need to add the
``session_context`` as an additional ``POST`` parameter to our HTTP
request to `/query/`. An example implementation for Python can be
found in ``tools/client.py``.
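
As a rough illustration of passing the context along, the following sketch builds a form-encoded ``POST`` body using only the Python standard library. It is not the code from ``tools/client.py``. The helper ``build_query_body`` is hypothetical, and the form field name ``query`` for the plan is an assumption; only ``session_context`` is named in this document.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_query_body(plan, session_context=None):
    """Build a form-encoded POST body for /query/ (hypothetical helper)."""
    # Assumption: the plan travels in a form field called "query".
    fields = {"query": json.dumps(plan)}
    if session_context is not None:
        # Re-use an existing context instead of letting the server create one.
        fields["session_context"] = str(session_context)
    return urlencode(fields).encode("ascii")


# A plan consisting of a single Commit operation.
commit_plan = {"operations": {"commit_op": {"type": "Commit"}}}

body = build_query_body(commit_plan, session_context=12355)
decoded = parse_qs(body.decode("ascii"))
```

The resulting bytes could be passed as the ``data`` argument of ``urllib.request.Request`` together with the server's ``/query/`` URL, which depends on the local setup.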

Subsequent requests using a ``session_context`` ``POST`` parameter will
not create a new session context but re-use the existing context. If we
reran the above query with a session context set, the sequence of
activities would change accordingly:

.. seqdiag::

   seqdiag {
     autonumber = True;
     activation = none;

     client -> server [label = "POST /query/ with session context"];
     server -> parser [label = "query plan"];
     parser => txmanager [label = "retrieve from session context", return="context"]
     parser --> server [label = "plan operations"];
     === execute plan operations `retrieve_revenue` and `insert` ===
     client <-- server [label = "response with context"];
   }

Eventually, we will want to end the transaction and make our changes
visible to other transactions. This can be done either by extending
the last query with a ``Commit`` operation or by sending a plan that
consists of a single ``Commit`` operation::

    { "operations": {"commit_op": {"type" : "Commit"} } }

This would then result in a response without ``session_context`` as
the session has been closed at this point.
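
On the client side, the presence or absence of ``session_context`` can be used to track whether a session is still open. The sketch below assumes the response body is a plain JSON object (the ``/* ... */`` comments in the sample response above are illustrative only); the helper name ``extract_session_context`` and both sample strings are hypothetical.

```python
import json


def extract_session_context(response_text):
    """Return the open session context, or None if the session has ended."""
    response = json.loads(response_text)
    # A response without "session_context" means the context was closed.
    return response.get("session_context")


# Hypothetical response bodies, trimmed to the fields used here.
open_response = '{"session_context": 12355, "performanceData": {}}'
closed_response = '{"performanceData": {}}'

still_open = extract_session_context(open_response)
after_commit = extract_session_context(closed_response)
```

A client would typically keep the returned value and send it with the next request, dropping it once ``None`` comes back.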

.. seqdiag::

   seqdiag {
     autonumber = True;
     activation = none;

     client -> server [label = "POST /query/ with session context"];
     server -> parser [label = "query plan"];
     parser => txmanager [label = "retrieve from session context", return="context"]
     parser --> server [label = "plan operations"];
     server => txmanager [label = "commit context", return="commit_id"]
     client <-- server [label = "response"];
   }

.. important::
   Reusing a session_context that has been committed/rolled back
   results in *undefined behavior*.

Autocommit shortcut
-------------------

As explicitly writing out commit operations at the end of your queries
is tedious and error-prone, the ``/query/`` interface accepts an
additional ``POST`` parameter: ``autocommit``. This allows the query
parser to automatically append a ``Commit`` operation at the end of
the current query.
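
Conceptually, appending a ``Commit`` operation amounts to a small transformation of the plan. The sketch below shows that idea on a plain dictionary and is not the parser's actual code: the function ``with_commit``, the operation name ``autocommit_op``, the placeholder type ``RetrieveTable``, and the ``edges`` layout (a list of ``[source, target]`` pairs) are all assumptions made for illustration.

```python
import copy


def with_commit(plan, commit_name="autocommit_op"):
    """Return a copy of `plan` with a Commit operation appended at its end."""
    result = copy.deepcopy(plan)
    operations = result.setdefault("operations", {})
    edges = result.setdefault("edges", [])

    # Operations without an outgoing edge form the current end of the plan.
    sources = {src for src, _ in edges}
    sinks = [name for name in operations if name not in sources]

    operations[commit_name] = {"type": "Commit"}
    # Let the commit run after every operation that used to be last.
    edges.extend([sink, commit_name] for sink in sinks)
    return result


plan = {
    "operations": {
        "retrieve_revenue": {"type": "RetrieveTable"},  # placeholder type name
        "insert": {"type": "InsertScan"},
    },
    "edges": [["retrieve_revenue", "insert"]],
}
committed_plan = with_commit(plan)
```

Working on a copy keeps the caller's plan unchanged, so the same plan can still be sent without the commit when a longer-lived session is wanted.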

Architectural Overview
=======================
@@ -67,21 +245,16 @@ Possible TID Combinations
==========================

::
+----------------+---------------+--------+--------+---------------+
| CID > lastCID | TID = tx.TID | valid | Keep? | Comment |
+================+===============+========+========+===============+
| yes | yes | yes | -- | impossible |
| no | yes | yes | -- | impossible |
| yes | no | yes | -- | Future insert |
| yes | yes | no | -- | impossible |
| no | no | yes | -- | Past insert |
| yes | no | no | -- | Future delete |
| no | yes | no | -- | Own write |
| no | no | no | -- | Past delete |
+----------------+---------------+--------+--------+---------------+





+----------------+---------------+--------+--------+---------------+
| CID > lastCID | TID = tx.TID | valid | Keep? | Comment |
+================+===============+========+========+===============+
| yes | yes | yes | -- | impossible |
| no | yes | yes | -- | impossible |
| yes | no | yes | -- | Future insert |
| yes | yes | no | -- | impossible |
| no | no | yes | -- | Past insert |
| yes | no | no | -- | Future delete |
| no | yes | no | -- | Own write |
| no | no | no | -- | Past delete |
+----------------+---------------+--------+--------+---------------+
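
The table can also be read as a lookup from three boolean conditions to a row state. The sketch below encodes exactly the eight rows of the table; the names ``TID_COMBINATIONS`` and ``classify_row`` and the parameter names are hypothetical and do not come from the HYRISE sources.

```python
# Key: (CID > lastCID, TID == tx.TID, valid) -> "Comment" column of the table.
TID_COMBINATIONS = {
    (True, True, True): "impossible",
    (False, True, True): "impossible",
    (True, False, True): "Future insert",
    (True, True, False): "impossible",
    (False, False, True): "Past insert",
    (True, False, False): "Future delete",
    (False, True, False): "Own write",
    (False, False, False): "Past delete",
}


def classify_row(cid, last_cid, tid, tx_tid, valid):
    """Look up the table entry for one row as seen by one transaction."""
    return TID_COMBINATIONS[(cid > last_cid, tid == tx_tid, valid)]


future_insert = classify_row(cid=5, last_cid=3, tid=7, tx_tid=9, valid=True)
own_write = classify_row(cid=2, last_cid=3, tid=9, tx_tid=9, valid=False)
```

Since the ``Keep?`` column is not filled in yet, the sketch deliberately returns only the comment and makes no visibility decision.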
2 changes: 1 addition & 1 deletion mkplugins/maxwarnings.mk
@@ -1,5 +1,5 @@
ifneq (,$(findstring clang++,$(CXX)))
BUILD_FLAGS += -Weverything
BUILD_FLAGS += -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-padded
else
BUILD_FLAGS += -Wcast-align -Wcast-qual -Wctor-dtor-privacy -Wdisabled-optimization -Wformat=2 -Winit-self -Wmissing-declarations -Wmissing-include-dirs -Wold-style-cast -Woverloaded-virtual -Wredundant-decls -Wstrict-overflow=5 #-Wswitch-default -Wno-unused -Wsign-conversion -Wsign-promo #-Wshadow crashes on g++-4.7
endif