A unified API for the various AI applications we have built as part of the IT’s JOINTLY project, used to generate additional or missing metadata.
An OCI-compliant image can be built in one of two ways.

With Nix installed (with flakes support), the image can be copied directly to the Docker daemon through

```sh
nix run "github:openeduhub/python-kidra#docker.copyToDockerDaemon"
```

or to Podman through

```sh
nix run "github:openeduhub/python-kidra#docker.copyToPodman"
```
The image can also be built without a local Nix installation through bootstrapping. For this, another Docker image, containing a Nix installation with flakes support, will be used. Make sure to be inside the repository before running build.sh; it will not work otherwise.

```sh
git clone https://github.com/openeduhub/python-kidra.git
cd python-kidra
sh build.sh
```

The image will be available as `result`.
- Note: in order to reduce the amount of redundant building in future build processes, a persistent build container `kidra-builder` is created as part of the script. This container contains a cache of all artifacts used in previous builds. While it is safe to remove it afterward, doing so will cause a full re-build the next time the script is run.
Now, load the created image through

```sh
docker load -i result
```

A message will appear to confirm that the image has been loaded, including its name and version. Now, start the service through

```sh
docker run -p 8080:8080 python-kidra:<version>
```
This service can also be run and installed as a native Nix Flake application. In particular, the following command will run the service locally:

```sh
nix run "github:openeduhub/python-kidra"
```
We provide two additional versions of the application – one with CUDA support and one with fewer dependencies (i.e. no bundled web browsers). Note that the latter will disable some features.
- CUDA support can be accessed through the Nix Flake endpoints with the `with-cuda` suffix, like `docker-with-cuda`, `python-kidra-with-cuda`, or simply `with-cuda`:

  ```sh
  # enter development environment with CUDA
  nix develop "github:openeduhub/python-kidra#with-cuda"
  # run webservice with CUDA
  nix run "github:openeduhub/python-kidra#with-cuda"
  # build docker image with CUDA
  nix run "github:openeduhub/python-kidra#docker-with-cuda.copyToDockerDaemon"
  ```
  Do note that the resulting application will be significantly larger (almost twice as large) and that `wlo-classification` (i.e. the `/disciplines` endpoint of the API) will not be built with CUDA support regardless. Additionally, building the application with CUDA support may take a considerable amount of time, especially when the additional caches specified in this project are not used.
- Similarly, the more minimal builds can be accessed with the `without-browsers` suffix. These save a bit more than 1 GB of space.

  ```sh
  # run more minimal webservice
  nix run "github:openeduhub/python-kidra#without-browsers"
  # build more minimal docker image
  nix run "github:openeduhub/python-kidra#docker-without-browsers.copyToPodman"
  ```
The following services are currently available from the Kidra:
- text-extraction: Extract text from URLs
- text-statistics: Calculate various metrics on text, e.g. reading time or readability
- topic-statistics: Calculate various statistics for WLO topic pages
- its-jointprobability: Bayesian model that predicts multiple metadata fields, such as school discipline or educational context
- wlo-topic-assistant: Find WLO topics in texts
- wlo-classification: Predict disciplines relevant for texts
- kea: Link relevant Wikipedia articles found in texts (requests are simply forwarded to an external service)
The service requires around 8 GB of RAM to start up. Depending on the usage of the Bayesian prediction model, this requirement may be higher – specifically, the RAM usage of predictions is directly proportional to the `num_samples` parameter. At `num_samples = 1000`, around 2 GB of additional RAM are required to process the request.
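The numbers above suggest a simple rule of thumb for provisioning memory. The following sketch is purely illustrative and not part of the Kidra codebase; it just extrapolates the documented linear scaling (roughly 2 GB per 1000 samples on top of the 8 GB baseline):

```python
def estimate_ram_gb(num_samples: int, base_gb: float = 8.0) -> float:
    """Rough peak-RAM estimate in GB for a prediction request,
    assuming the documented linear scaling: ~2 GB of additional RAM
    per 1000 samples on top of the ~8 GB start-up baseline."""
    return base_gb + 2.0 * (num_samples / 1000)

print(estimate_ram_gb(1000))  # 10.0
```

Actual memory usage will vary with the model and input, so treat this only as a starting point for sizing containers.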
Each individual service available through this API is located on its own subdomain. The input data, and potential parameters, are passed as JSON objects. Once the service is running, an interface listing all the available end-points and their documentation is available at http://localhost:8080/docs. Additionally, this service implements an OpenAPI specification, which is accessible from the `/v3/api-docs` end-point.
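As a sketch of this request pattern, a JSON `POST` request can be assembled with only the Python standard library. Note that the endpoint name `text-statistics` and the payload field `text` are assumptions for illustration; consult http://localhost:8080/docs for the authoritative schemas:

```python
import json
import urllib.request

def build_request(base_url: str, endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a Kidra sub-service endpoint.
    Endpoint names and payload shapes are illustrative assumptions;
    the running service's /docs page lists the real ones."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("http://localhost:8080", "text-statistics", {"text": "Ein Beispieltext."})
# once the service is running, send it with: urllib.request.urlopen(req)
```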
To ensure that all Python packages with their correct versions are installed, we recommend using Nix. The development environment can be activated locally by running

```sh
nix develop
```

while inside this project.

With direnv installed, this process can be automated such that the development environment is loaded whenever the project is visited. To allow direnv to activate the environment automatically, run

```sh
direnv allow
```

while inside this project.
As a prerequisite to adding a new service to the Kidra, the service in question must implement a web-service that exposes its functionality through `POST` requests. Ideally, the service also provides an OpenAPI specification, which will then be integrated automatically. If the service shall be packaged as part of the Kidra and run as part of it, this web-service must also offer a way to specify the port on which it runs. For this, we recommend a CLI flag `--port`.
All services are added to the Kidra web-service in webservice.py. Here, you have two primary options:
- Add information about the service to `SERVICES`. Services collected in `SERVICES` will be automatically added to the web-service according to the information and parameters provided:
  - `name` – the name of the end-point in the Kidra that links to the service.
  - `autostart` – whether to automatically start the service from the Kidra. If the service shall be started automatically, it must be available to the Kidra, see Installing a new service.
  - `boot_timeout` – the number of seconds to wait for the service to start. No timeout is enforced when set to `None`.
  - `binary` – the name of the executable that is run when the service is started from within the Kidra.
  - `host` – the host to contact when trying to access the service. Should be set to `"localhost"` if the service is started as part of the Kidra.
  - `port` – the port to start the service with when automatically starting it. This is also the port that delegated requests to the service are sent to.
  - `post_subdomain` – the subdomain of the service to access when delegating a request to it.
  - `openapi_schema` – the subdomain of the service on which the OpenAPI specification is available.
- Alternatively, manually add an end-point to the FastAPI application (see https://fastapi.tiangolo.com/tutorial/first-steps/).
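A service entry combining these fields might look as follows. This is a hedged illustration only: the real `SERVICES` structure in webservice.py may be a dict, dataclass, or something else entirely, so only the field names are taken from the documentation; all values are example placeholders:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServiceSpec:
    """Illustrative container for the fields described above; the
    actual representation in webservice.py may differ."""
    name: str
    autostart: bool
    boot_timeout: Optional[int]
    binary: str
    host: str
    port: int
    post_subdomain: str
    openapi_schema: str

example_service = ServiceSpec(
    name="text-statistics",          # end-point name in the Kidra
    autostart=True,                  # start the service from the Kidra
    boot_timeout=60,                 # wait up to 60 s; None = no timeout
    binary="text-statistics",        # executable to launch (example value)
    host="localhost",                # started as part of the Kidra
    port=8081,                       # example sub-service port
    post_subdomain="analyze",        # hypothetical request subdomain
    openapi_schema="openapi.json",   # hypothetical schema subdomain
)
```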
When a service shall be started as part of the Kidra (i.e. it is not an external service that might run on a different system), it must be added to the run-time environment.

- If the service has already been packaged in nixpkgs, no further work is necessary here. Otherwise, we recommend packaging the service as a Flake and providing it as an input in flake.nix (see the other sub-services, such as `text-statistics`).
- Make the binaries of the service available to the Kidra in `makeWrapperArgs` of the build specification of `python-kidra` (package.nix). Additionally, add an overlay that provides the package in flake.nix.