[GLUTEN-8266][CI] retry more times on download spark release #8267

Closed
wants to merge 4 commits into from

Conversation

zhouyuan
Contributor

What changes were proposed in this pull request?

This patch increases the number of wget retries when downloading the Spark release.

(Fixes: #8266 )

How was this patch tested?

Passes GHA.

@github-actions github-actions bot added the INFRA label Dec 18, 2024

#8266

Signed-off-by: Yuan Zhou <[email protected]>
@zjuwangg
Contributor

I also noticed that https://archive.apache.org/ limits how much can be downloaded from the archive:

> Do note that heavy use of this service will result in immediate throttling of your download speeds to either 12 or 6 mbps for the remainder of the day, depending on severity. Continuous abuse (to the tune of more than 40 GB downloaded per week) will cause an [automatic ban](https://infra.apache.org/infra-ban.html), so please tune your services to this fact.

I don't know whether our GHA jobs will be affected by this restriction. If so, we may need to install Spark in the Docker image or download from another dist mirror.

Member

@zhztheplayer zhztheplayer left a comment

Found some mirrors that may help as well:

https://github.com/fink/fink-mirrors/blob/master/apache (a list of apache mirrors)
https://ftp.unicamp.br/pub/apache/spark (spark)

I would also suggest verifying the checksum after downloading the tarball.
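To make the mirror and checksum suggestions concrete, here is a rough sketch of what I have in mind, not a tested CI change. The names `sha512_of` and `fetch_verified` are hypothetical, the expected digest is assumed to come from a trusted source (for example the `.sha512` file published with the release), and `fetch` is an assumed callable that downloads one URL to a local path.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_verified(
    urls: Iterable[str],
    dest: Path,
    expected_sha512: str,
    fetch: Callable[[str, Path], object],
) -> str:
    """Try each URL in order and keep the first download whose digest matches.

    `fetch(url, dest)` is a hypothetical downloader supplied by the caller.
    Returns the URL that produced a verified file.
    """
    for url in urls:
        try:
            fetch(url, dest)
        except OSError:
            continue  # mirror unreachable or refusing us, try the next one
        if sha512_of(dest) == expected_sha512.lower():
            return url
        dest.unlink(missing_ok=True)  # truncated or corrupt download, discard it
    raise RuntimeError("no mirror produced a tarball with the expected checksum")
```

Something like `urllib.request.urlretrieve` could be passed as `fetch`, since its errors derive from `OSError`. Listing a mirror first and archive.apache.org last would keep the archive as a fallback only.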

@zhouyuan
Contributor Author

Note:

  • Retrying wget doesn't help.
  • The issue is that we launch these jobs concurrently, so apache.org banned some of our IPs.
  • Hosting the package somewhere else, or moving it into the Docker image, can fix it.
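The last point can be sketched as a cache check, on the assumption that the tarball is baked into the image (or otherwise pre-populated) under some known directory. `ensure_tarball`, `cache_dir` and `fetch` are hypothetical names used for illustration only; nothing like this exists in the repo yet.

```python
from pathlib import Path
from typing import Callable


def ensure_tarball(cache_dir: Path, name: str, fetch: Callable[[Path], None]) -> Path:
    """Return the cached tarball, calling `fetch` only when it is missing.

    `fetch(path)` is a hypothetical downloader that writes the tarball to `path`.
    """
    target = cache_dir / name
    if target.exists():
        return target  # already in the image or cache, no network access needed
    cache_dir.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    fetch(partial)  # download to a temporary name first
    partial.rename(target)  # publish only after the download has finished
    return target
```

With the tarball already present in the image, concurrent jobs would never reach `fetch`, so none of them would contact apache.org.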

@zhouyuan zhouyuan closed this Dec 20, 2024

Successfully merging this pull request may close these issues.

[VL] GHA is not stable when downloading Spark package
3 participants