Idea: network isolation between tests #1475

Open
yanns opened this issue Apr 30, 2024 · 15 comments
Labels
help wanted Extra attention is needed

Comments

@yanns

yanns commented Apr 30, 2024

use-case

My tests start different servers, each picking a free TCP port at random. I first need to pick a port and only then start the HTTP server, because the port is also used in some shared configuration. I cannot "simply" use the 127.0.0.1:0 approach. I'm using port-selector for that. I could also use reserve-port, but it does not help when running with nextest.

When running with nextest, tests run in different processes in parallel, so two tests can end up picking the same port.

My current mitigation is to use the retry mechanism to restart the tests that fail because they picked identical ports. The tests themselves are not flaky, but running them in parallel makes them flaky.
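
The race described above can be illustrated in a few lines. This is only a sketch and is not taken from port-selector, reserve-port, or nextest: `pick_free_port` and `start_server` are hypothetical helpers that mimic the common "bind to port 0, read the port, close, bind again later" pattern. Between the two steps the port is unreserved, so a test in another process can be handed the same number.

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a free TCP port, then release it immediately."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))          # port 0: let the OS choose
        return probe.getsockname()[1]  # the probe is closed on return, freeing the port


def start_server(host: str, port: int) -> socket.socket:
    """Bind and listen on a port that was chosen earlier."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind((host, port))      # OSError if another process got there first
        server.listen(1)
    except OSError:
        server.close()
        raise
    return server


if __name__ == "__main__":
    port = pick_free_port()
    # Window of vulnerability: the port is written to shared configuration
    # here, and nothing stops a parallel test process from binding it.
    server = start_server("127.0.0.1", port)
    print(f"listening on {server.getsockname()}")
    server.close()
```

Within a single process the window can be closed with a lock or a shared allocator, which is why the problem mostly shows up once each test lives in its own process.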

Idea

One possible idea would be to use network isolation on Linux, so that each process can pick the same ports without conflict.

Possible issues

  • Network namespaces are a Linux kernel feature (I haven't checked other OSs).
  • It adds complexity to the nextest code. It can also add complexity to how those namespaces are configured, for example if the tests need to access the internet.
  • Creating new namespaces requires privileges.
@NobodyXu
Contributor

> Creating new namespaces requires privileges

On Linux, if unprivileged user namespaces are enabled, you can create namespaces and get all of that isolation without privileges.

@sunshowers
Member

sunshowers commented May 6, 2024

Hi --

The immediate problem you have is probably easiest to solve via test groups: https://nexte.st/book/test-groups
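
As a rough illustration of the test-groups approach, a configuration along these lines in `.config/nextest.toml` would serialize the port-binding tests. The group name `port-binding` and the filter `test(~server)` are placeholders chosen for this sketch, not something taken from the issue; the filter should be adjusted to match the tests that actually start servers.

```toml
# Tests in this group run one at a time, so they cannot race each other for a port.
[test-groups]
port-binding = { max-threads = 1 }

# Assign every test whose name contains "server" to that group.
[[profile.default.overrides]]
filter = 'test(~server)'
test-group = 'port-binding'
```

Tests outside the group still run in parallel, so only the port-sensitive tests pay the cost of serialization.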

For network isolation, I'm not completely opposed to it, and there are related issues where something other than execing the test process would make sense. (For example, see #1371.)

  • But I don't know how complex it's going to be, especially handling all the different failure modes (what if the system doesn't allow creating unprivileged user namespaces?)
  • Are there alternatives like systemd-run --user --scope (which creates a cgroup) that can be helpful here?

I think the best way to get started would be by prototyping your solution using a target runner: https://nexte.st/book/target-runners. A target runner is a custom script or binary that gets invoked separately for each test. Hopefully it should be possible to build a proof of concept with that.
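
A proof of concept along those lines might look like the sketch below. It assumes that the `unshare` and `ip` command-line tools are installed and that the kernel allows unprivileged user namespaces; the script name and `build_command` are made up for illustration. The runner receives the test binary and its arguments, and re-executes them inside a fresh user and network namespace with only the loopback interface brought up.

```python
#!/usr/bin/env python3
"""Hypothetical target runner: one network namespace per test process."""
import os
import sys

# Runs inside the new namespace: bring loopback up, then replace the shell
# with the test binary and its arguments.
_INNER_SCRIPT = 'ip link set lo up && exec "$@"'


def build_command(test_argv: list[str]) -> list[str]:
    """Wrap the test invocation in `unshare` so it gets its own network stack."""
    return [
        "unshare",
        "--user",           # new user namespace (needs unprivileged userns support)
        "--map-root-user",  # map the calling user to root inside that namespace
        "--net",            # new network namespace, initially with loopback down
        "sh", "-c", _INNER_SCRIPT, "sh",
        *test_argv,
    ]


# Only exec when invoked with a test binary, as a target runner would be.
if __name__ == "__main__" and len(sys.argv) > 1:
    command = build_command(sys.argv[1:])
    os.execvp(command[0], command)
```

The script would be registered through Cargo's `target.<triple>.runner` setting, which the target-runner support picks up. Tests that need outbound network access would not work in this setup, because the new namespace contains nothing but loopback.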

@NobodyXu
Contributor

NobodyXu commented May 7, 2024

I think systemd-nspawn can be used to sandbox tests on Linux.

It should be fairly easy, with no changes required to nextest: it supports --as-pid2, so that systemd-nspawn's stub init runs as PID 1 inside the container and reaps children, while nextest runs as PID 2.

It also has options for dealing with networking, such as setting up a private network namespace via --private-network, setting up network bridges, etc.

It looks like you would have to use --bind-ro to manually mount things into the container, though.

@sunshowers
Member

sunshowers commented May 26, 2024

Thanks @NobodyXu. I think testing these strategies out in a target runner would be the best next step. I don't have the time to work on this myself, but I'll open the floor to contributions.

@sunshowers added the "help wanted" (Extra attention is needed) label on May 26, 2024

@yanns
Author

yanns commented May 27, 2024

For info, it seems that https://maelstrom-software.com/ embraces the idea of one container per test. I have not had the opportunity to try it out yet.

@yanns
Author

yanns commented May 28, 2024

I've played a bit with systemd-nspawn, but I could not manage to start it as a normal user (not root).

@NobodyXu
Contributor

> I've played a bit with systemd-nspawn, but I could not manage to start it as a normal user (not root).

I think you would need to enable user namespaces and use them?

@yanns
Author

yanns commented May 28, 2024

> > I've played a bit with systemd-nspawn, but I could not manage to start it as a normal user (not root).
>
> I think you would need to enable user namespaces and use them?

I'm trying, but without success:

$ systemd-nspawn  --private-users=yes --private-users-ownership=auto --as-pid2 'echo hello'
Need to be root.

@NobodyXu
Contributor

According to https://wiki.archlinux.org/title/systemd-nspawn#Unprivileged_containers , systemd-nspawn supports unprivileged containers, but it has to be spawned by root.

So I was wrong about that.

@sunshowers
Member

Is systemd-run an option? I thought I could get it to work as a user.

@NobodyXu
Contributor

I think podman might be another option: it supports a rootless mode, so it does not need root to create an unprivileged container.

@NobodyXu
Contributor

Or you could also try https://firejail.wordpress.com/

@yanns
Author

yanns commented May 29, 2024

We could also check how https://maelstrom-software.com/ is doing it.

@sunshowers
Member

> Or you could also try https://firejail.wordpress.com/

Oh this is good, I've used firejail and it works very well. I think this would be great as part of a library of target runners.
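
One entry in such a library could be a firejail-based runner, sketched below. It assumes firejail is installed; `build_firejail_command` is a hypothetical helper, and whether `--noprofile` is appropriate depends on how much sandboxing is wanted beyond the network.

```python
#!/usr/bin/env python3
"""Hypothetical target runner that isolates each test's network with firejail."""
import os
import sys


def build_firejail_command(test_argv: list[str]) -> list[str]:
    """Prefix the test invocation with firejail options for network isolation."""
    return [
        "firejail",
        "--quiet",      # suppress firejail's own startup messages
        "--noprofile",  # skip the default security profile
        "--net=none",   # new network namespace with only a loopback interface
        "--",           # end of firejail options; the rest is the test command
        *test_argv,
    ]


# Only exec when invoked with a test binary, as a target runner would be.
if __name__ == "__main__" and len(sys.argv) > 1:
    command = build_firejail_command(sys.argv[1:])
    os.execvp(command[0], command)
```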

@PegasusPlusUS

Hello, I saw the 'help wanted' tag and just read this issue. I have a simple method for selecting a TCP listen port, and I wrote a small program to check that it works.

My method is to first connect to some server: one on the LAN, on localhost, or at 8.8.8.8:53 (DNS over TCP). The OS gives each outgoing client connection a unique local port. Usually clients only use that port for the outgoing connection, which wastes the resource a bit :). In fact, the client can also listen on that port, so each test can have its own listen port and they will not conflict with each other.

Here is the simple Python verification program:

import socket


def start_client():
    # Connect to a server so that the OS assigns this socket a unique local port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
        client_socket.connect(('8.8.8.8', 53))
        local_port = client_socket.getsockname()[1]
        print(f"Connected to server, local port is {local_port}")

        # Attempt to bind and listen on the same local port while the
        # client connection is still open
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_bind_socket:
                client_bind_socket.bind(('localhost', local_port))
                client_bind_socket.listen(1)
                print(f"Client is now listening on local port {local_port}")
        except OSError as e:
            print(f"Failed to listen on local port {local_port}: {e}")


if __name__ == "__main__":
    start_client()
