How to best execute arbitrary SQL on first startup? #996

Closed
KonradHoeffner opened this issue Nov 7, 2021 · 9 comments

@KonradHoeffner

I want to execute a setup.sql on the first startup of a Virtuoso Docker container. What would be the best way to do this with the official Docker image?

Below, I describe how I currently do it with the tenforce/virtuoso container. My method is similar to the one in their startup script, which I could read because they publish the source of their image in a GitHub repository.
However, I want to switch to the official image, and I don't know whether this is the intended method or whether a hook already exists for exactly this purpose.

My use case for creating a docker image of Virtuoso for a specific knowledge base is:

  1. load the knowledge base RDF files (Turtle or N-Triples) in their respective graphs
  2. set default namespaces (prefixes)
  3. set up a graph group and its members
  4. enable CORS

When I create a Virtuoso instance once, the graphical user interface of the Conductor handles all of that very well. However, because I recreate the Docker container very often, I want to automate these steps, which also helps anyone else who uses the image. I know how to do steps 1-3 using SQL, and I hope that step 4 can be done with SQL as well.

The Dockerfile

FROM tenforce/virtuoso:latest
COPY --from=rdf /rdf /data/toLoad
ENV DEFAULT_GRAPH=http://this.is/my/graph
ENV DBA_PASSWORD=dba
WORKDIR /virtuoso
COPY setup.sql .
COPY wrapper.sh .
ENTRYPOINT ["/bin/sh","./wrapper.sh"]

wrapper.sh

#!/bin/sh

if [ ! -f ./virtuoso.ini ];
then
  mv /virtuoso.ini . 2>/dev/null 
fi

if [ ! -f ".setup" ] ;
then
    echo "Start setup"
    chmod +x /virtuoso.sh
    pwd="dba"
    graph="http://localhost:8890/DAV"

    if [ "$DBA_PASSWORD" ]; then pwd="$DBA_PASSWORD" ; fi
    if [ "$DEFAULT_GRAPH" ]; then graph="$DEFAULT_GRAPH" ; fi
    echo "$(cat setup.sql)"
    virtuoso-t +wait && isql-v -U dba -P "$pwd" < setup.sql
    kill $(ps aux | grep '[v]irtuoso-t' | awk '{print $2}')
    echo "`date +%Y-%m-%dT%H:%M:%S%:z`" > .setup
fi

/virtuoso.sh

However, since I started using that wrapper, my Virtuoso Docker container seems to take longer to shut down, or it requires a kill command. Maybe Docker tries to shut down the wrapper script instead of Virtuoso itself? Should I use this approach with the official image, or is there a better or standard method to achieve this?
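One possible explanation for the slow shutdown, offered as an assumption rather than a confirmed diagnosis: because the wrapper starts /virtuoso.sh as a child process, the wrapper shell stays the container's main process, and a plain shell does not forward the stop signal to its child. Replacing the last line with `exec /virtuoso.sh` would let the start script take over the wrapper's process instead. The sketch below shows only the underlying mechanism with two hypothetical stand-in scripts (`server.sh` and `wrapper_exec.sh` are illustrative names, not part of any image): after `exec`, the "server" reports the same PID as the wrapper.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)

# Stand-in for the real start script: it only prints its own PID.
cat > "$tmp/server.sh" <<'EOF'
#!/bin/sh
echo "server pid: $$"
EOF

# Stand-in for the wrapper: prints its PID, then hands over with exec,
# so the next script replaces this process instead of becoming a child.
cat > "$tmp/wrapper_exec.sh" <<'EOF'
#!/bin/sh
echo "wrapper pid: $$"
exec /bin/sh "$1"
EOF

out=$(/bin/sh "$tmp/wrapper_exec.sh" "$tmp/server.sh")
echo "$out"

# Extract both PIDs; with exec they are identical.
wrapper_pid=$(echo "$out" | sed -n 's/^wrapper pid: //p')
server_pid=$(echo "$out" | sed -n 's/^server pid: //p')
```

Whether this fully resolves the shutdown delay also depends on how /virtuoso.sh itself launches the server, which is not shown in this thread.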

@pkleef
Collaborator

pkleef commented Nov 7, 2021

I would not recommend the method you are using as the standard entrypoint script can basically perform all of these functions and more.

I am currently writing a tutorial on how to run scripts on database creation, which I hope to finish in the coming week.

As soon as the draft is ready on our community forum, I will post a link here.

@KonradHoeffner
Author

Great to hear! I will switch to your method as soon as the draft is available.

@KonradHoeffner
Author

@pkleef: Is it possible to share the current state of the draft? I checked the community forum but haven't found it yet.

@KonradHoeffner
Author

Unfortunately, I have not been able to reach @pkleef either here or by email. Is anyone in contact with him who could ask?

@TallTed
Collaborator

TallTed commented Mar 4, 2022

@pkleef -- Can you please provide an ETA for the tutorial you had expected to deliver around November 15?

/cc @openlink @HughWilliams

@KonradHoeffner
Author

@pkleef: Is there an update on this? I still cannot find it on the forums.

@KonradHoeffner
Author

Any news about this feature?

@pkleef
Collaborator

pkleef commented Jul 13, 2022

The latest versions of our openlink/virtuoso-closedsource-8 and openlink/virtuoso-opensource-7 docker images support running a combination of shell (.sh) and Virtuoso PL (.sql) scripts as part of the initialization of the database, before it goes "online".

This feature allows you to either mount a directory with your scripts on the /initdb.d directory inside the docker container when it is created, or copy your scripts into the /initdb.d directory if you prefer to build your own image on top of our docker images.
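The mount variant could look roughly like the sketch below. The container name `my-virtuoso`, the local directory names, the script name `10-setup.sql`, and the password value are illustrative assumptions; `/database` as the data directory and `DBA_PASSWORD` as the password variable are taken from the image's reference guide as I understand it, so check them against the version you use. The script prints the `docker run` command rather than executing it.

```shell
#!/bin/sh
set -eu

# Hypothetical local layout: one directory for the database files and
# one for the initialization scripts.
mkdir -p database initdb.d

# Placeholder script; replace the comment with your own statements.
cat > initdb.d/10-setup.sql <<'EOF'
-- your Virtuoso PL statements go here
EOF

# Printed only; remove the leading "echo" to actually start the container.
echo docker run --name my-virtuoso \
  --env DBA_PASSWORD=dba \
  --publish 1111:1111 --publish 8890:8890 \
  --volume "$PWD/database:/database" \
  --volume "$PWD/initdb.d:/initdb.d" \
  openlink/virtuoso-opensource-7:latest
```

Because the scripts only run when the database is first created, starting again with an already populated `database` directory would be expected to skip them.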

We published a bulk load example on Github to show how this feature works.

We also added the following section to our Virtuoso Docker — Reference Guide as well as to the Overview page of our Docker images.

/initdb.d

This directory can contain a mix of shell (.sh) and Virtuoso PL (.sql) scripts that can perform functions such as the following:

  • Installing additional Ubuntu packages
  • Loading data from remote locations such as Amazon S3 buckets, Google Drive, or other locations
  • Bulk loading data into the Virtuoso database
  • Installing additional VAD packages into the database
  • Adding new Virtuoso users
  • Granting permissions to Virtuoso users
  • Regenerating free-text indexes or other initial data

These scripts are run only once, during the initial database creation; subsequent restarts of the docker image will not cause these scripts to be re-run.

These scripts are run in alphabetical order, so we suggest starting the script names with sequence numbers, so the ordering is explicit and obvious.
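A small sketch of why zero-padded sequence numbers help, assuming the ordering is a plain byte-wise sort (the exact collation used by the entrypoint is not stated here). All file names are made up for illustration.

```shell
#!/bin/sh
set -eu

# Zero-padded prefixes sort in the intended order.
padded=$(mktemp -d)
touch "$padded/10-load.sql" "$padded/05-fetch.sh" "$padded/20-namespaces.sql"
padded_order=$(cd "$padded" && LC_ALL=C ls)
echo "padded:"
echo "$padded_order"

# Without padding, "10-..." sorts before "2-...", which is rarely intended.
unpadded=$(mktemp -d)
touch "$unpadded/2-load.sql" "$unpadded/10-namespaces.sql"
unpadded_order=$(cd "$unpadded" && LC_ALL=C ls)
echo "unpadded:"
echo "$unpadded_order"
```

Using a fixed width such as `05-`, `10-`, `20-` also leaves room to insert a script between two existing ones later.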

For security purposes, Virtuoso will run the .sql scripts in a special mode, in which Virtuoso will not respond to connections on its SQL (1111 by default) and/or HTTP (8890 by default) ports.

At the end of each .sql script, Virtuoso automatically performs a checkpoint, to make sure the changes are fully written back to the database. This is very important for scripts that use the bulk loader function, rdf_loader_run(), or manually change the ACID mode of the database for any reason.
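As a hedged sketch, an initialization script covering steps 1-3 of the original use case might be generated like this. `ld_dir`, `rdf_loader_run`, `DB.DBA.XML_SET_NS_DECL`, `DB.DBA.RDF_GRAPH_GROUP_CREATE`, and `DB.DBA.RDF_GRAPH_GROUP_INS` are existing Virtuoso functions, but the file name, the `data` path, the file mask, the prefix, and the group IRI are assumptions made for illustration. No explicit checkpoint is included, since one is performed automatically at the end of each .sql script.

```shell
#!/bin/sh
set -eu

# Hypothetical local directory that will be mounted at /initdb.d.
mkdir -p initdb.d

# The data path is an assumption: it must lie under a directory listed
# in DirsAllowed in virtuoso.ini, otherwise ld_dir will refuse it.
cat > initdb.d/20-bulk-load.sql <<'EOF'
-- 1. register the RDF files for the bulk loader, then load them
ld_dir('data', '*.ttl', 'http://this.is/my/graph');
rdf_loader_run();

-- 2. declare a persistent namespace prefix (illustrative prefix and IRI)
DB.DBA.XML_SET_NS_DECL('my', 'http://this.is/my/', 2);

-- 3. create a graph group and add the graph as a member
DB.DBA.RDF_GRAPH_GROUP_CREATE('http://this.is/my/group', 1);
DB.DBA.RDF_GRAPH_GROUP_INS('http://this.is/my/group', 'http://this.is/my/graph');
EOF

echo "wrote initdb.d/20-bulk-load.sql"
```

The published bulk load example remains the authoritative reference for paths and settings; this sketch only shows the general shape of such a script.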

After all the initialization scripts have run to completion, Virtuoso will be started normally, and will start listening to requests on its SQL (1111) and HTTP (8890) ports.
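Since the ports stay closed while the initialization scripts run, a client that starts alongside the container may need to wait. One way to do that, sketched under the assumption that `curl` is available and that the SPARQL endpoint is served at `http://localhost:8890/sparql`, is a small retry helper. `wait_until` is a hypothetical helper name, not part of the image.

```shell
#!/bin/sh
set -eu

# wait_until TRIES COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, one second apart, and returns 0 as
# soon as it succeeds, or 1 if it never does.
wait_until() {
  tries=$1
  shift
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example usage (commented out so the sketch runs without a server):
# wait_until 600 curl --silent --fail --output /dev/null http://localhost:8890/sparql
```

The number of attempts should be generous for large bulk loads, because the server only starts listening after every script has finished.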

@KonradHoeffner
Author

Thank you, this guide works perfectly, except for the virtuoso.ini, for which I opened a separate issue at #1060.
