feat: reusable containers #636

Open
wants to merge 13 commits into
base: main
Changes from 9 commits
16 changes: 15 additions & 1 deletion core/testcontainers/core/config.py
@@ -39,7 +39,10 @@
return settings


_WARNINGS = {"DOCKER_AUTH_CONFIG": "DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566"}
_WARNINGS = {
"DOCKER_AUTH_CONFIG": "DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566",
"tc_properties_get_tc_host": "this method has moved to property 'tc_properties_tc_host'",
}


@dataclass
@@ -73,8 +76,19 @@
self._docker_auth_config = value

def tc_properties_get_tc_host(self) -> Union[str, None]:
if "tc_properties_get_tc_host" in _WARNINGS:
warning(_WARNINGS.pop("tc_properties_get_tc_host"))

Check warning on line 80 in core/testcontainers/core/config.py (Codecov / codecov/patch): added line #L80 was not covered by tests
return self.tc_properties.get("tc.host")

@property
def tc_properties_tc_host(self) -> Union[str, None]:
return self.tc_properties.get("tc.host")

@property
def tc_properties_testcontainers_reuse_enable(self) -> bool:
enabled = self.tc_properties.get("testcontainers.reuse.enable")
return enabled == "true"
matthiasschaub marked this conversation as resolved.

@property
def timeout(self):
return self.max_tries * self.sleep_time
59 changes: 54 additions & 5 deletions core/testcontainers/core/container.py
@@ -1,4 +1,6 @@
import contextlib
import hashlib
import logging
from platform import system
from socket import socket
from typing import TYPE_CHECKING, Optional
@@ -49,6 +51,7 @@ def __init__(
self._name = None
self._network: Optional[Network] = None
self._network_aliases: Optional[list[str]] = None
self._reuse: bool = False
self._kwargs = kwargs

def with_env(self, key: str, value: str) -> Self:
@@ -76,6 +79,10 @@ def with_kwargs(self, **kwargs) -> Self:
self._kwargs = kwargs
return self

def with_reuse(self, reuse=True) -> Self:
self._reuse = reuse
return self

def maybe_emulate_amd64(self) -> Self:
if is_arm():
return self.with_kwargs(platform="linux/amd64")
@@ -86,8 +93,49 @@ def start(self) -> Self:
logger.debug("Creating Ryuk container")
Reaper.get_instance()
logger.info("Pulling image %s", self.image)
docker_client = self.get_docker_client()
self._configure()

# container hash consisting of run arguments
args = (
self.image,
self._command,
self.env,
self.ports,
self._name,
self.volumes,
str(tuple(sorted(self._kwargs.items()))),
)
hash_ = hashlib.sha256(bytes(str(args), encoding="utf-8")).hexdigest()
Member

Ideally we use the full container create request as the hash input. In tc-java, this is the CreateContainerCmd from docker-java. I guess an equivalent request object is available somewhere in the Docker Python SDK?

Author

Yeah, that would be the ideal solution. Unfortunately, I could not find an equivalent to CreateContainerCmd in the Docker SDK for Python, neither in the documentation nor by browsing the code base.
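
Without a request object, one option is to canonicalise the run arguments before hashing them, so that the insertion order of `env` or the keyword arguments does not change the result. The following is a minimal sketch under stated assumptions, not the implementation in this PR: `run_args_hash` is a hypothetical helper, and it assumes every argument is JSON-serialisable.

```python
import hashlib
import json


def run_args_hash(image, command=None, env=None, ports=None, name=None, volumes=None, **kwargs):
    """Hypothetical helper: hash run arguments via a canonical JSON form."""
    payload = {
        "image": image,
        "command": command,
        "env": env or {},
        "ports": ports or {},
        "name": name,
        "volumes": volumes or {},
        "kwargs": kwargs,
    }
    # sort_keys makes the serialisation independent of dict insertion order.
    # Values that are not JSON-serialisable raise TypeError instead of being
    # silently stringified; dicts mixing key types also raise TypeError.
    canonical = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


hash_ab = run_args_hash("hello-world", env={"A": "1", "B": "2"})
hash_ba = run_args_hash("hello-world", env={"B": "2", "A": "1"})
hash_other = run_args_hash("alpine", env={"A": "1", "B": "2"})
```

Here `hash_ab` and `hash_ba` are equal, while `hash_other` differs because the image differs. Failing loudly on unserialisable values is a design choice of this sketch; it trades convenience for a hash that cannot drift silently.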


        if self._reuse and (not c.tc_properties_testcontainers_reuse_enable or not c.ryuk_disabled):
            logging.warning(
                "Reuse was requested (`with_reuse`) but the environment does not "
                + "support the reuse of containers. To enable container reuse, add "
                + "the 'testcontainers.reuse.enable=true' to "
                + "'~/.testcontainers.properties' and disable ryuk by setting the "
                + "environment variable 'TESTCONTAINERS_RYUK_DISABLED=true'"
            )

        if self._reuse and c.tc_properties_testcontainers_reuse_enable:
            docker_client = self.get_docker_client()
            container = docker_client.find_container_by_hash(hash_)
            if container:
                if container.status != "running":
                    container.start()
                    logger.info("Existing container started: %s", container.id)
                logger.info("Container is already running: %s", container.id)
                self._container = container
            else:
                self._start(hash_)
        else:
            self._start(hash_)
Member

Can we refactor this so that it is obvious where the if clause is that triggers this?

I want to make sure that

  1. we are doing the hash inside the clause
  2. it is more readable, not in general but specifically for ensuring the correctness of the logic that disables or enables reuse

Author

Thanks for the feedback! I will revisit this part next week and try to improve upon it.

Author

@alexanderankin I moved the generation of the hash inside the if-clause and no longer pass hash_ to start when reuse is not in use. I think that makes it more readable in general.
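
One way to keep the enable/disable logic easy to verify is to isolate it in a small pure function that can be tested without Docker. The sketch below mirrors the two conditions shown in the diff at this commit (warn when reuse is requested but the environment does not fully support it; reuse when it is requested and the property is enabled). The names `ReuseDecision` and `decide_reuse` are hypothetical and do not appear in the PR.

```python
from typing import NamedTuple


class ReuseDecision(NamedTuple):
    reuse: bool  # look up an existing container by hash
    warn: bool   # log that the environment does not fully support reuse


def decide_reuse(requested: bool, reuse_enabled: bool, ryuk_disabled: bool) -> ReuseDecision:
    """Hypothetical helper mirroring the conditions in the diff above."""
    if not requested:
        # with_reuse() was never called: plain start, no warning.
        return ReuseDecision(reuse=False, warn=False)
    # Warn if either prerequisite is missing.
    warn = not reuse_enabled or not ryuk_disabled
    # As in the diff, the reuse path itself only checks the property.
    return ReuseDecision(reuse=reuse_enabled, warn=warn)


fully_configured = decide_reuse(requested=True, reuse_enabled=True, ryuk_disabled=True)
ryuk_still_active = decide_reuse(requested=True, reuse_enabled=True, ryuk_disabled=False)
```

With such a helper, the hash would only need to be computed when `reuse` is true, which is the placement the review asks for.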


if self._network:
self._network.connect(self._container.id, self._network_aliases)
return self
Comment on lines +157 to +159
Member

Apparently this part is also fairly jank and we should remove or rework it. Note to self: I can only do that after this PR merges.


def _start(self, hash_):
docker_client = self.get_docker_client()
self._container = docker_client.run(
self.image,
command=self._command,
@@ -96,16 +144,17 @@ def start(self) -> Self:
ports=self.ports,
name=self._name,
volumes=self.volumes,
labels={"hash": hash_},
**self._kwargs,
)
logger.info("Container started: %s", self._container.short_id)
if self._network:
self._network.connect(self._container.id, self._network_aliases)
return self

def stop(self, force=True, delete_volume=True) -> None:
if self._container:
self._container.remove(force=force, v=delete_volume)
if self._reuse and c.tc_properties_testcontainers_reuse_enable:
self._container.stop()
else:
self._container.remove(force=force, v=delete_volume)
Member

Hm, isn't the point to not even stop it, so that it is warm for the next run? I guess if people are using the explicit API then whatever. I do see a bit of a mirror with start, so I guess it will just have to be consistent and maybe made clear in the docs.

Member

In other languages, having a reusable container does not change the contract of the stop() method. This obviously needs to be considered to make this a full-fledged use case, but for now I would suggest we start with an experimental reusable implementation that mirrors the Java implementation.

Author

Okay, I see. I updated the code and documentation (how to use reusable containers) to not change the contract of the stop() method.

self.get_docker_client().client.close()

def __enter__(self) -> Self:
8 changes: 7 additions & 1 deletion core/testcontainers/core/docker_client.py
@@ -215,9 +215,15 @@ def client_networks_create(self, name: str, param: dict):
labels = create_labels("", param.get("labels"))
return self.client.networks.create(name, **{**param, "labels": labels})

    def find_container_by_hash(self, hash_: str) -> Union[Container, None]:
        for container in self.client.containers.list(all=True):
            if container.labels.get("hash", None) == hash_:
                return container
        return None


def get_docker_host() -> Optional[str]:
return c.tc_properties_get_tc_host() or os.getenv("DOCKER_HOST")
return c.tc_properties_tc_host or os.getenv("DOCKER_HOST")


def get_docker_auth_config() -> Optional[str]:
105 changes: 105 additions & 0 deletions core/tests/test_reusable_containers.py
@@ -0,0 +1,105 @@
from time import sleep

from docker.models.containers import Container

from testcontainers.core.config import testcontainers_config
from testcontainers.core.container import DockerContainer
from testcontainers.core.docker_client import DockerClient
from testcontainers.core.waiting_utils import wait_for_logs
from testcontainers.core.container import Reaper


def test_docker_container_reuse_default():
    with DockerContainer("hello-world") as container:
        assert container._reuse == False
        id = container._container.id
        wait_for_logs(container, "Hello from Docker!")
    containers = DockerClient().client.containers.list(all=True)
    assert id not in [container.id for container in containers]


def test_docker_container_with_reuse_reuse_disabled():
    with DockerContainer("hello-world").with_reuse() as container:
        assert container._reuse == True
        assert testcontainers_config.tc_properties_testcontainers_reuse_enable == False
        id = container._container.id
        wait_for_logs(container, "Hello from Docker!")
    containers = DockerClient().client.containers.list(all=True)
    assert id not in [container.id for container in containers]


def test_docker_container_with_reuse_reuse_enabled_ryuk_enabled(monkeypatch):
    # Make sure Ryuk cleanup is not active from previous test runs
    Reaper.delete_instance()

    tc_properties_mock = testcontainers_config.tc_properties | {"testcontainers.reuse.enable": "true"}
    monkeypatch.setattr(testcontainers_config, "tc_properties", tc_properties_mock)
    monkeypatch.setattr(testcontainers_config, "ryuk_reconnection_timeout", "0.1s")

    with DockerContainer("hello-world").with_reuse() as container:
        id = container._container.id
        wait_for_logs(container, "Hello from Docker!")

    Reaper._socket.close()
    # Sleep until Ryuk reaps all dangling containers
    sleep(0.6)

    containers = DockerClient().client.containers.list(all=True)
    assert id not in [container.id for container in containers]

    # Cleanup Ryuk class fields after manual Ryuk shutdown
    Reaper.delete_instance()


def test_docker_container_with_reuse_reuse_enabled_ryuk_disabled(monkeypatch):
    # Make sure Ryuk cleanup is not active from previous test runs
    Reaper.delete_instance()

    tc_properties_mock = testcontainers_config.tc_properties | {"testcontainers.reuse.enable": "true"}
    monkeypatch.setattr(testcontainers_config, "tc_properties", tc_properties_mock)
    monkeypatch.setattr(testcontainers_config, "ryuk_disabled", True)

    with DockerContainer("hello-world").with_reuse() as container:
        id = container._container.id
        wait_for_logs(container, "Hello from Docker!")

    containers = DockerClient().client.containers.list(all=True)
    assert id in [container.id for container in containers]

    # Cleanup after keeping container alive (with_reuse)
    container._container.remove(force=True)


def test_docker_container_with_reuse_reuse_enabled_ryuk_disabled_same_id(monkeypatch):
    # Make sure Ryuk cleanup is not active from previous test runs
    Reaper.delete_instance()

    tc_properties_mock = testcontainers_config.tc_properties | {"testcontainers.reuse.enable": "true"}
    monkeypatch.setattr(testcontainers_config, "tc_properties", tc_properties_mock)
    monkeypatch.setattr(testcontainers_config, "ryuk_disabled", True)

    with DockerContainer("hello-world").with_reuse() as container:
        id = container._container.id
    with DockerContainer("hello-world").with_reuse() as container:
        assert id == container._container.id

    # Cleanup after keeping container alive (with_reuse)
    container._container.remove(force=True)


def test_docker_container_labels_hash():
    expected_hash = "91fde3c09244e1d3ec6f18a225b9261396b9a1cb0f6365b39b9795782817c128"
    with DockerContainer("hello-world").with_reuse() as container:
        assert container._container.labels["hash"] == expected_hash


def test_docker_client_find_container_by_hash_not_existing():
    with DockerContainer("hello-world"):
        assert DockerClient().find_container_by_hash("foo") == None


def test_docker_client_find_container_by_hash_existing():
    with DockerContainer("hello-world").with_reuse() as container:
        hash_ = container._container.labels["hash"]
        found_container = DockerClient().find_container_by_hash(hash_)
        assert isinstance(found_container, Container)
27 changes: 26 additions & 1 deletion index.rst
@@ -89,7 +89,6 @@ When trying to launch Testcontainers from within a Docker container, e.g., in co
1. The container has to provide a docker client installation. Either use an image that has docker pre-installed (e.g. the `official docker images <https://hub.docker.com/_/docker>`_) or install the client from within the `Dockerfile` specification.
2. The container has to have access to the docker daemon which can be achieved by mounting `/var/run/docker.sock` or setting the `DOCKER_HOST` environment variable as part of your `docker run` command.


Private Docker registry
-----------------------

@@ -118,6 +117,32 @@ Fetching passwords from cloud providers:
GCP_PASSWORD = $(gcloud auth print-access-token)
AZURE_PASSWORD = $(az acr login --name <registry-name> --expose-token --output tsv)

Reusable Containers (Experimental)
----------------------------------

Containers can be reused across consecutive test runs. To reuse a container, the container configuration must be the same.

Containers that are set up for reuse will not be automatically removed. Thus, those containers need to be removed manually.
Member

"...removed manually."

maybe add:
"In re-usable mode, the 'stop' api on a container will now 'stop' a container, rather than 'remove' it"

Author

After this discussion, the stop method has not been changed.


Containers should not be reused in a CI environment.

How to use?
^^^^^^^^^^^

1. Add :code:`testcontainers.reuse.enable=true` to :code:`~/.testcontainers.properties`
2. Disable ryuk by setting the environment variable :code:`TESTCONTAINERS_RYUK_DISABLED=true`
3. Instantiate a container using :code:`with_reuse`

.. doctest::

    >>> from testcontainers.core.container import DockerContainer

    >>> with DockerContainer("hello-world").with_reuse() as container:
    ...     first_id = container._container.id
    >>> with DockerContainer("hello-world").with_reuse() as container:
    ...     second_id = container._container.id
    >>> print(first_id == second_id)
    True
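
Step 1 of the how-to could also be scripted. The following is a minimal sketch, assuming a hypothetical helper `ensure_reuse_enabled` that is not part of testcontainers; it takes the properties file path as an argument rather than touching `~/.testcontainers.properties` directly, and it does not set `TESTCONTAINERS_RYUK_DISABLED`, which still has to be exported in the environment.

```python
from pathlib import Path

REUSE_KEY = "testcontainers.reuse.enable"


def ensure_reuse_enabled(properties_path: Path) -> bool:
    """Hypothetical helper: make sure the reuse property is set to true.

    Returns True if the file was written, False if it was already correct.
    """
    lines = properties_path.read_text().splitlines() if properties_path.exists() else []
    # Drop any existing entry for the key so that no conflicting value remains.
    kept = [line for line in lines if line.split("=", 1)[0].strip() != REUSE_KEY]
    new_lines = kept + [f"{REUSE_KEY}=true"]
    if new_lines == lines:
        return False
    properties_path.write_text("\n".join(new_lines) + "\n")
    return True
```

A typical call would pass `Path.home() / ".testcontainers.properties"`; calling the helper a second time leaves the file unchanged.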

Author

Should the user be warned that, when using this feature, containers need to be removed manually, and that this feature should not be used in CI?

Also, do we need to make clear how this feature works by explaining the hash in use? If a container's run configuration changes, the hash changes and a new container will be used.

Member

seems like you have added these comments to the doc, i think that is fine. the hash would be great to add as users would benefit from knowing exactly what is hashed.

  • self.image,
  • self._command,
  • self.env,
  • self.ports,
  • self._name,
  • self.volumes,
  • str(tuple(sorted(self._kwargs.items()))), (this may fail, which is why I want it tucked away inside an obviously readable if block)
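
One way the kwargs component could misbehave, offered as an illustration rather than as the specific failure the reviewer has in mind: if a kwarg value has no custom `__repr__`, its string form embeds a memory address, so two equivalent configurations yield different strings and therefore different hashes. `HealthCheck` and `kwargs_key` below are hypothetical names used only for this sketch.

```python
class HealthCheck:
    """Hypothetical kwarg value that relies on the default object repr."""

    def __init__(self, interval: int) -> None:
        self.interval = interval


def kwargs_key(kwargs: dict) -> str:
    # Same expression as the last hash input listed above.
    return str(tuple(sorted(kwargs.items())))


# Plain values are stable regardless of insertion order.
plain_a = kwargs_key({"b": 1, "a": 2})
plain_b = kwargs_key({"a": 2, "b": 1})

# Two equivalent objects produce different strings, because the default
# repr includes each object's memory address.
first = HealthCheck(interval=5)
second = HealthCheck(interval=5)
key_first = kwargs_key({"healthcheck": first})
key_second = kwargs_key({"healthcheck": second})
```

In this sketch `plain_a == plain_b`, but `key_first != key_second`, which would cause a new container to be created on every run instead of reusing the existing one.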

Configuration
-------------