Skip to content

Commit

Permalink
feat(reliability): integrate the ryuk container for better container …
Browse files Browse the repository at this point in the history
…cleanup (#314)

> [!NOTE]
> Editor's note from @totallyzen:
> After a thorough discussion between @santi @alexanderankin, @kiview
and @totallyzen on the
[slack](https://testcontainers.slack.com/archives/C04SRG5AXNU/p1710156743640249)
we've decided this isn't a breaking change in api, but a modification in
behaviour. Therefore not worth a 5.0 but we'll release it under 4.x
> If this did end up breaking your workflow, come talk to us about your
use-case!

**What are you trying to do?**

Use Ryuk as the default resource cleanup strategy.

**Why should it be done this way?**

The current implementation of tc-python does not handle container
lifecycle management the same way as other TC implementations. In this
PR I introduce Ryuk to be the default resource cleanup strategy, in
order to be better aligned with other TC implementations.

Ryuk is enabled by default, but can be disabled with the
`TESTCONTAINERS_RYUK_DISABLED` env variable (which follows the behavior
of other TC implementations). Ryuk behavior can further be altered with
the `TESTCONTAINERS_RYUK_PRIVILEGED`, `RYUK_CONTAINER_IMAGE` and
`TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE` (also following the same
conventions from other implementations). Documentation of these env
variables is added to the README in the root of the repo.

The implementation uses a singleton container that starts an instance of
Ryuk and stores it on class / module level, so that it is reused within
a process. This follows the same pattern as in other languages, and Ryuk
behaviour itself is implemented the exact same way as other
implementations.

**BREAKING CHANGE**

`__del__` support is now removed, in order to better align with other TC
implementations. From the comments in the `__del__` implementation, it
seems like this was added as a final attempt to cleanup resources on
exit/garbage collection. This leaves three ways to cleanup resources,
which has better defined behaviors and follows modern patterns for
resource management in Python:

Method 1: Context Manager
Init/cleanup through a context manager (__enter__/__exit__):
```
with DockerContainer("postgres") as container:
    # some logic here

# end of context: container is killed by `__exit__` method
```

Method 2: Manual start() and stop()
```
container = DockerContainer("postgres").start()
# some logic here
container.stop()
```

Method 3: Ryuk
```
container = DockerContainer("postgres").start()
# some logic here

# You forget to .stop() the container, but Ryuk kills it for you 10 seconds after your process exits.
```

_Why remove `__del__`?_ According to the previous maintainer of the
repo, it has been causing “[a bunch of
issues](https://github.com/testcontainers/testcontainers-python/pull/314#discussion_r1185321083)”,
which I have personally experienced while using TC in a Django app, due
to the automatic GC behavior when no longer referencing the container
with a variable. E.g. if you instantiate the container in a method, only
returning the connection string, the Python garbage collector will
automatically call `__del__` on the instance at the end of the function,
thus killing your container. This leads to clunky workarounds like
having to store a reference to the container in a module-level variable,
or always having to return a reference to the container from the
function creating the container. In addition, the gc behaviour is not
consistent across Python implementations, making the reliance on
`__del__` flaky at best.

Also, having the __del__ method cleanup your container prevents us from
implementing `with_reuse()` (which is implemented in other TC
implementations) in the future, as a process exit would always instantly
kill the container, preventing us to use it in another process before
Ryuk reaps it.

**Next steps**

Once this PR is accepted, my plan is to implement the `with_reuse()`
functionality seen in other implementations, to enable faster / instant
usage of existing containers. This is very useful in simple testing
scenarios or local development workflows using hot reload behaviour. The
`with_reuse()` API requires the removal of `__del__` cleanup, as
otherwise the container would not be available for reuse due to the GC
reaping the container as soon as the process exits.

**Other changes**
- Adds “x-tc-sid=SESSION_ID” header to the underlying Docker API client
with the value of the current session ID (created on module init), in
order to enable Testcontainers Cloud to operate in “Turbo mode”
#314 (comment)
- Adds labels `org.testcontainers.lang=python` and
`org.testcontainers.session-id=SESSION_ID`to the containers created by
TC
- As mentioned above, the env variables TESTCONTAINERS_RYUK_DISABLED,
TESTCONTAINERS_RYUK_PRIVILEGED, RYUK_CONTAINER_IMAGE and
TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE are now used for customizing
tc-python behavior.

---------

Co-authored-by: Andre Hofmeister <[email protected]>
Co-authored-by: Balint Bartha <[email protected]>
  • Loading branch information
3 people authored Mar 19, 2024
1 parent 08a6293 commit d019874
Show file tree
Hide file tree
Showing 8 changed files with 153 additions and 33 deletions.
15 changes: 15 additions & 0 deletions INDEX.rst
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,21 @@ When trying to launch a testcontainer from within a Docker container, e.g., in c
1. The container has to provide a docker client installation. Either use an image that has docker pre-installed (e.g. the `official docker images <https://hub.docker.com/_/docker>`_) or install the client from within the `Dockerfile` specification.
2. The container has to have access to the docker daemon which can be achieved by mounting `/var/run/docker.sock` or setting the `DOCKER_HOST` environment variable as part of your `docker run` command.

Configuration
-------------

+-------------------------------------------+-------------------------------+------------------------------------------+
| Env Variable | Example | Description |
+===========================================+===============================+==========================================+
| ``TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE`` | ``/var/run/docker.sock`` | Path to Docker's socket used by ryuk |
+-------------------------------------------+-------------------------------+------------------------------------------+
| ``TESTCONTAINERS_RYUK_PRIVILEGED`` | ``false`` | Run ryuk as a privileged container |
+-------------------------------------------+-------------------------------+------------------------------------------+
| ``TESTCONTAINERS_RYUK_DISABLED`` | ``false`` | Disable ryuk |
+-------------------------------------------+-------------------------------+------------------------------------------+
| ``RYUK_CONTAINER_IMAGE`` | ``testcontainers/ryuk:0.5.1`` | Custom image for ryuk |
+-------------------------------------------+-------------------------------+------------------------------------------+

Development and Contributing
----------------------------

Expand Down
9 changes: 9 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,3 +22,12 @@ For more information, see [the docs][readthedocs].
```

The snippet above will spin up a postgres database in a container. The `get_connection_url()` convenience method returns a `sqlalchemy` compatible url we use to connect to the database and retrieve the database version.

## Configuration

| Env Variable | Example | Description |
| ----------------------------------------- | ----------------------------- | ---------------------------------------- |
| `TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE` | `/var/run/docker.sock` | Path to Docker's socket used by ryuk |
| `TESTCONTAINERS_RYUK_PRIVILEGED` | `false` | Run ryuk as a privileged container |
| `TESTCONTAINERS_RYUK_DISABLED` | `false` | Disable ryuk |
| `RYUK_CONTAINER_IMAGE` | `testcontainers/ryuk:0.5.1` | Custom image for ryuk |
5 changes: 5 additions & 0 deletions core/testcontainers/core/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,8 @@
MAX_TRIES = int(environ.get("TC_MAX_TRIES", 120))
SLEEP_TIME = int(environ.get("TC_POOLING_INTERVAL", 1))
TIMEOUT = MAX_TRIES * SLEEP_TIME

RYUK_IMAGE: str = environ.get("RYUK_CONTAINER_IMAGE", "testcontainers/ryuk:0.5.1")
RYUK_PRIVILEGED: bool = environ.get("TESTCONTAINERS_RYUK_PRIVILEGED", "false") == "true"
RYUK_DISABLED: bool = environ.get("TESTCONTAINERS_RYUK_DISABLED", "false") == "true"
RYUK_DOCKER_SOCKET: str = environ.get("TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE", "/var/run/docker.sock")
91 changes: 72 additions & 19 deletions core/testcontainers/core/container.py
Original file line number Diff line number Diff line change
@@ -1,13 +1,16 @@
import contextlib
from platform import system
from typing import Optional

from docker.models.containers import Container
from socket import socket
from typing import TYPE_CHECKING, Optional

from testcontainers.core.config import RYUK_DISABLED, RYUK_DOCKER_SOCKET, RYUK_IMAGE, RYUK_PRIVILEGED
from testcontainers.core.docker_client import DockerClient
from testcontainers.core.exceptions import ContainerStartException
from testcontainers.core.labels import LABEL_SESSION_ID, SESSION_ID
from testcontainers.core.utils import inside_container, is_arm, setup_logger
from testcontainers.core.waiting_utils import wait_container_is_ready
from testcontainers.core.waiting_utils import wait_container_is_ready, wait_for_logs

if TYPE_CHECKING:
from docker.models.containers import Container

logger = setup_logger(__name__)

Expand All @@ -25,7 +28,12 @@ class DockerContainer:
... delay = wait_for_logs(container, "Hello from Docker!")
"""

def __init__(self, image: str, docker_client_kw: Optional[dict] = None, **kwargs) -> None:
def __init__(
self,
image: str,
docker_client_kw: Optional[dict] = None,
**kwargs,
) -> None:
self.env = {}
self.ports = {}
self.volumes = {}
Expand Down Expand Up @@ -58,7 +66,10 @@ def maybe_emulate_amd64(self) -> "DockerContainer":
return self.with_kwargs(platform="linux/amd64")
return self

def start(self) -> "DockerContainer":
def start(self):
if not RYUK_DISABLED and self.image != RYUK_IMAGE:
logger.debug("Creating Ryuk container")
Reaper.get_instance()
logger.info("Pulling image %s", self.image)
docker_client = self.get_docker_client()
self._container = docker_client.run(
Expand All @@ -69,7 +80,7 @@ def start(self) -> "DockerContainer":
ports=self.ports,
name=self._name,
volumes=self.volumes,
**self._kwargs
**self._kwargs,
)
logger.info("Container started: %s", self._container.short_id)
return self
Expand All @@ -78,21 +89,12 @@ def stop(self, force=True, delete_volume=True) -> None:
self._container.remove(force=force, v=delete_volume)
self.get_docker_client().client.close()

def __enter__(self) -> "DockerContainer":
def __enter__(self):
return self.start()

def __exit__(self, exc_type, exc_val, exc_tb) -> None:
self.stop()

def __del__(self) -> None:
"""
__del__ runs when Python attempts to garbage collect the object.
In case of leaky test design, we still attempt to clean up the container.
"""
with contextlib.suppress(Exception):
if self._container is not None:
self.stop()

def get_container_host_ip(self) -> str:
# infer from docker host
host = self.get_docker_client().host()
Expand Down Expand Up @@ -140,7 +142,7 @@ def with_volume_mapping(self, host: str, container: str, mode: str = "ro") -> "D
self.volumes[host] = mapping
return self

def get_wrapped_container(self) -> Container:
def get_wrapped_container(self) -> "Container":
return self._container

def get_docker_client(self) -> DockerClient:
Expand All @@ -155,3 +157,54 @@ def exec(self, command) -> tuple[int, str]:
if not self._container:
raise ContainerStartException("Container should be started before executing a command")
return self._container.exec_run(command)


class Reaper:
_instance: "Optional[Reaper]" = None
_container: Optional[DockerContainer] = None
_socket: Optional[socket] = None

@classmethod
def get_instance(cls) -> "Reaper":
if not Reaper._instance:
Reaper._instance = Reaper._create_instance()

return Reaper._instance

@classmethod
def delete_instance(cls) -> None:
if Reaper._socket is not None:
Reaper._socket.close()
Reaper._socket = None

if Reaper._container is not None:
Reaper._container.stop()
Reaper._container = None

if Reaper._instance is not None:
Reaper._instance = None

@classmethod
def _create_instance(cls) -> "Reaper":
logger.debug(f"Creating new Reaper for session: {SESSION_ID}")

Reaper._container = (
DockerContainer(RYUK_IMAGE)
.with_name(f"testcontainers-ryuk-{SESSION_ID}")
.with_exposed_ports(8080)
.with_volume_mapping(RYUK_DOCKER_SOCKET, "/var/run/docker.sock", "rw")
.with_kwargs(privileged=RYUK_PRIVILEGED)
.start()
)
wait_for_logs(Reaper._container, r".* Started!")

container_host = Reaper._container.get_container_host_ip()
container_port = int(Reaper._container.get_exposed_port(8080))

Reaper._socket = socket()
Reaper._socket.connect((container_host, container_port))
Reaper._socket.send(f"label={LABEL_SESSION_ID}={SESSION_ID}\r\n".encode())

Reaper._instance = Reaper()

return Reaper._instance
19 changes: 5 additions & 14 deletions core/testcontainers/core/docker_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import atexit
import functools as ft
import os
import urllib
Expand All @@ -19,10 +18,10 @@
from typing import Optional, Union

import docker
from docker.errors import NotFound
from docker.models.containers import Container, ContainerCollection

from .utils import default_gateway_ip, inside_container, setup_logger
from testcontainers.core.labels import SESSION_ID, create_labels
from testcontainers.core.utils import default_gateway_ip, inside_container, setup_logger

LOGGER = setup_logger(__name__)
TC_FILE = ".testcontainers.properties"
Expand All @@ -42,6 +41,7 @@ def __init__(self, **kwargs) -> None:
self.client = docker.DockerClient(base_url=docker_host)
else:
self.client = docker.from_env(**kwargs)
self.client.api.headers["x-tc-sid"] = SESSION_ID

@ft.wraps(ContainerCollection.run)
def run(
Expand All @@ -50,6 +50,7 @@ def run(
command: Optional[Union[str, list[str]]] = None,
environment: Optional[dict] = None,
ports: Optional[dict] = None,
labels: Optional[dict[str, str]] = None,
detach: bool = False,
stdout: bool = True,
stderr: bool = False,
Expand All @@ -65,10 +66,9 @@ def run(
detach=detach,
environment=environment,
ports=ports,
labels=create_labels(image, labels),
**kwargs,
)
if detach:
atexit.register(_stop_container, container)
return container

def port(self, container_id: str, port: int) -> int:
Expand Down Expand Up @@ -145,12 +145,3 @@ def read_tc_properties() -> dict[str, str]:
tuples = [line.split("=") for line in contents.readlines() if "=" in line]
settings = {**settings, **{item[0].strip(): item[1].strip() for item in tuples}}
return settings


def _stop_container(container: Container) -> None:
try:
container.stop()
except NotFound:
pass
except Exception as ex:
LOGGER.warning("failed to shut down container %s with image %s: %s", container.id, container.image, ex)
20 changes: 20 additions & 0 deletions core/testcontainers/core/labels.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
from typing import Optional
from uuid import uuid4

from testcontainers.core.config import RYUK_IMAGE

SESSION_ID: str = str(uuid4())
LABEL_SESSION_ID = "org.testcontainers.session-id"
LABEL_LANG = "org.testcontainers.lang"


def create_labels(image: str, labels: Optional[dict[str, str]]) -> dict[str, str]:
if labels is None:
labels = {}
labels[LABEL_LANG] = "python"

if image == RYUK_IMAGE:
return labels

labels[LABEL_SESSION_ID] = SESSION_ID
return labels
24 changes: 24 additions & 0 deletions core/tests/test_ryuk.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
from testcontainers.core import container
from testcontainers.core.container import Reaper
from testcontainers.core.container import DockerContainer
from testcontainers.core.waiting_utils import wait_for_logs


def test_wait_for_reaper():
container = DockerContainer("hello-world").start()
wait_for_logs(container, "Hello from Docker!")

assert Reaper._socket is not None
Reaper._socket.close()

assert Reaper._container is not None
wait_for_logs(Reaper._container, r".* Removed \d .*", timeout=30)

Reaper.delete_instance()


def test_container_without_ryuk(monkeypatch):
monkeypatch.setattr(container, "RYUK_DISABLED", True)
with DockerContainer("hello-world") as cont:
wait_for_logs(cont, "Hello from Docker!")
assert Reaper._instance is None
3 changes: 3 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -197,6 +197,9 @@ ignore = [
"INP001"
]

[tool.ruff.lint.pyupgrade]
keep-runtime-typing = true

[tool.ruff.lint.flake8-type-checking]
strict = true

Expand Down

0 comments on commit d019874

Please sign in to comment.