-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ideas for improving E2E test developer experience #33532
Comments
Yes, it's super annoying, but it doesn't happen that often. It all depends on what we consider a priority here. The current setup enforces that the conflicts in the Gutenberg plugin are fixed as soon as possible. Some of those issues are obvious mistakes like duplicated function/class definition, others are related to changes in the default theme. There is definitely room for improvement in the process to find a better balance between the developer experience and ensuring that the Gutenberg plugin is compatible with WP core's
By looking at a random PR:
When we split into more jobs we get a major improvement for the second part of the CI job. A quick estimate would be ~11-13 minutes reduced to hopefully ~7-8 minutes on a single node. The drawback would be that we not only use 2 more nodes but also increase the total execution time by ~8-10 minutes to perform additional env setup (for nodes 5 and 6). It would be essential to figure out if there is a way to share at least the build part between more nodes so we could scale to as many nodes as we need. I also think @desrosj did some testing with splitting e2e tests into more nodes. It would be great to see what he learned. |
I tried it once in my fork, and to my surprise, sometimes building the app from scratch is actually faster than downloading and unpacking them from GH artifacts. I'm still confident that there must be some way to reduce the execution time though, just have to do some more experiments. |
I think this needs to be scoped down a little bit.
|
The current implementation proposes in #33506 slows down test execution by a few minutes per CI node, so your comment is valid. It could be useful for debugging, but overall it should be rather disabled if there is a performance penalty involved. |
I don't think a couple of minutes of slowdown in tests is that much of a problem though. E2E tests are already very slow, slowing down each by a few seconds shouldn't outweigh the benefits of debugging ability it brings. I've already encountered several tests which are only failing in CI and hard to debug/reproduce locally. Often times the only option we could make is to skip the tests and risk regressing the bug in future PRs. Furthermore, you can think of the slowdown as an emulation of CPU throttling, so that we can build more resilient tests. I've already found several flaky tests in that PR because of the slowdown. They are just hidden in plain sight, waiting to surface in some random PRs. In conclusion, I think the trade-offs are worth it. (In addition, it's currently implemented so that it's only enabled in CI by default, so no performance penalty during development) A dashboard can also help here. IMO flaky tests which are only rarely failing don't worth the time to fix it ASAP. We need a way to determine and prioritize each flaky test by its failing rate. I'm aware of the complexity of a dashboard could bring, hence I opened a separate issue #33809 to track it. |
Regarding this, I'm opening #33979 in order to experiment with re-running failed jobs instead of the entire workflow. I might look into automatically retrying if this works out; otherwise, my thought is that retrying full workflows would add way too much time to be worth it. |
Agree that we need to fix conflicts as soon as possible and keep Gutenberg tested against the latest WordPress trunk. But I don't think conflicts should block all developers (there are a lot of us now! 😀) from working and I really don't think we should have to deal with conflicts at very stressful times e.g. plugin release day. Being a Gutenberg developer should be fun and chill.
100%. Ideally the parallelised jobs that run E2E tests should happen after a single non-parallelised setup job. Maybe this won't improve total performance all that much but it would definitely improve re-run performance which I think is a big deal as many developers spend a lot of time waiting for failed tests to re-run.
I don't think we can systematically address flakey tests unless we measure what we want to improve. That's the value of a dashboard. I am thinking that it could be a GitHub Action that runs daily, scrapes the E2E test logs, and publishes to a static GitHub Pages site. If we have to set up seperate hosting, a database, etc. then I agree that the overhead is probably too high and that it would become basically unmaintained. (I think gutenberg.run suffers from this.)
No real opinion on this one. I trust @kevin940726 😛 |
Considering this hasn't been scoped down further and hasn't had much traction in a few years, I'm going to close this out but welcome folks to either reopen or start a new issue with more relevant/recent info about improving E2E test developer experience. |
A few things we could try to improve the experience of writing and running E2E tests:
npm run build
into its own GitHub action which runs prior to the fournpm run test-e2e
actions. This might make it quicker to restart failing E2E tests on a PR..wp-env.json
to a known git commit which is updated automatically via a PR every week. This might make it less disruptive (i.e. doesn't block every single developer) when a Core change breaks Gutenberg CI.npm run test-e2e
actions into 6 actions. This might speed up E2E test runs on a PR.puppeteer-testing-library
. This might help with stability. (Made obsolete by "role selectors" in Playwright)The text was updated successfully, but these errors were encountered: