I have been investigating random build failures in our Jenkins build system. The builds fail occasionally, maybe a few times per week, in a CentOS 7 based Docker container build phase because the EPEL repository metalink fetch fails.
I have debugged the issue further, and it seems that HTTPS requests to the Fedora mirror at proxy06.fedoraproject.org (22.214.171.124) sometimes either get completely stuck or take an extremely long time to complete.
I have verified this with the following curl command:
# curl -v --resolve "mirrors.fedoraproject.org:443:126.96.36.199" "https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=x86_64"
* About to connect() to mirrors.fedoraproject.org port 443 (#0)
* Trying 188.8.131.52...
* Connected to mirrors.fedoraproject.org (184.108.40.206) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=*.fedoraproject.org,O="Red Hat, Inc.",L=Raleigh,ST=North Carolina,C=US
* start date: Feb 27 00:00:00 2020 GMT
* expire date: Mar 02 12:00:00 2022 GMT
* common name: *.fedoraproject.org
* issuer: CN=DigiCert SHA2 High Assurance Server CA,OU=www.digicert.com,O=DigiCert Inc,C=US
> GET /metalink?repo=epel-7&arch=x86_64 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: mirrors.fedoraproject.org
> Accept: */*
>
The request above got stuck after the HTTP request had been sent. Requests do not get stuck on every try, but when running them in a loop it takes only a few minutes to see it happen. As the curl output shows, the problem occurs after TLS negotiation has completed, so it is not related to the system's trusted CA configuration.
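For anyone who wants to reproduce this, a loop of roughly the following shape should show the hang. It is a minimal sketch rather than the exact script used here: the helper names probe_once and probe_loop are hypothetical, the 30-second limit is an assumed threshold for "stuck", and the address to test has to be supplied by the caller.

```shell
# Sketch only: probe_once/probe_loop are hypothetical helper names, and the
# 30-second default is an assumed threshold for treating a request as stuck.
PROBE_URL="https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=x86_64"

# Send one request to the address in $1 and classify curl's exit status.
# $2 optionally overrides the per-request time limit in seconds.
probe_once() {
    probe_rc=0
    curl --silent --output /dev/null --max-time "${2:-30}" \
         --resolve "mirrors.fedoraproject.org:443:$1" "$PROBE_URL" || probe_rc=$?
    case "$probe_rc" in
        0)  echo "ok" ;;
        28) echo "timeout" ;;            # curl exit code 28: operation timed out
        *)  echo "error($probe_rc)" ;;
    esac
}

# Probe the address in $1 a total of $2 times, printing one line per attempt.
probe_loop() {
    probe_i=1
    while [ "$probe_i" -le "$2" ]; do
        echo "$(date +%T) attempt $probe_i: $(probe_once "$1")"
        probe_i=$((probe_i + 1))
    done
}
```

Calling probe_loop with the proxy address and a count of, say, 100 prints one timestamped line per attempt; any line ending in "timeout" is a request that did not finish within the limit.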
The reason our builds fail only occasionally, rather than on every run, is that mirrors.fedoraproject.org seems to be DNS load balanced and the name resolver picks a different proxy IP address on each resolve request:
$ host mirrors.fedoraproject.org
mirrors.fedoraproject.org is an alias for wildcard.fedoraproject.org.
wildcard.fedoraproject.org has address 220.127.116.11
wildcard.fedoraproject.org has address 18.104.22.168
wildcard.fedoraproject.org has address 22.214.171.124
wildcard.fedoraproject.org has address 126.96.36.199
wildcard.fedoraproject.org has address 188.8.131.52
wildcard.fedoraproject.org has address 184.108.40.206
wildcard.fedoraproject.org has address 220.127.116.11
wildcard.fedoraproject.org has address 18.104.22.168
wildcard.fedoraproject.org has address 22.214.171.124
wildcard.fedoraproject.org has IPv6 address 2605:bc80:3010:600:dead:beef:cafe:fed9
wildcard.fedoraproject.org has IPv6 address 2604:1580:fe00:0:dead:beef:cafe:fed1
wildcard.fedoraproject.org has IPv6 address 2600:2701:4000:5211:dead:beef:fe:fed3
wildcard.fedoraproject.org has IPv6 address 2620:52:3:1:dead:beef:cafe:fed6
wildcard.fedoraproject.org has IPv6 address 2605:bc80:3010:600:dead:beef:cafe:feda
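To check whether only one of these addresses misbehaves, each IPv4 address from the host output could be probed in turn. The following is a rough sketch under assumptions: list_ipv4 and time_each are hypothetical helper names, MAX_TIME defaults to an assumed 30 seconds, and only the IPv4 records are considered.

```shell
# Sketch only: list_ipv4/time_each are hypothetical helpers; MAX_TIME (seconds)
# is an assumed per-request limit, defaulting to 30.
MIRROR_URL="https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=x86_64"

# Read `host` output on stdin and print the unique IPv4 addresses, one per line.
list_ipv4() {
    awk '/ has address / { print $NF }' | sort -u
}

# Read addresses on stdin; request the metalink from each one and print
# the address, curl's exit status, and the elapsed whole seconds.
time_each() {
    while read -r ip; do
        rc=0
        start=$(date +%s)
        curl --silent --output /dev/null --max-time "${MAX_TIME:-30}" \
             --resolve "mirrors.fedoraproject.org:443:${ip}" "$MIRROR_URL" \
             </dev/null || rc=$?
        end=$(date +%s)
        echo "$ip exit=$rc seconds=$((end - start))"
    done
}
```

Running `host mirrors.fedoraproject.org | list_ipv4 | time_each` then prints one line per address. An exit status of 28 means curl gave up after MAX_TIME seconds, which is how a stuck request would show up.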
Does anyone know if this is a known issue, or if it is caused by some request rate limiting going on at proxy06…?
Are there any known workarounds? I am thinking of something like blacklisting the hostname proxy06… or the IP address 126.96.36.199 so that yum never attempts to use that mirror.
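One stopgap I can think of, short of blacklisting, is to wrap the failing step in a retry so that a later attempt has a chance of resolving to a different proxy. This is only a sketch of that idea, not a confirmed fix: the retry function and the RETRY_DELAY variable are hypothetical names, and it assumes the wrapped command actually exits when a fetch fails instead of hanging forever.

```shell
# Sketch only: `retry` and RETRY_DELAY are hypothetical names.
# Usage: retry ATTEMPTS command [args...]
# Runs the command until it succeeds or ATTEMPTS tries have been made,
# sleeping RETRY_DELAY seconds (default 5) between tries.
retry() {
    retry_max="$1"
    shift
    retry_n=1
    while :; do
        "$@" && return 0
        [ "$retry_n" -ge "$retry_max" ] && return 1
        retry_n=$((retry_n + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}
```

In the container build this could look like `retry 5 timeout 120 yum -y makecache`, where the coreutils `timeout` command kills an attempt that hangs for more than 120 seconds. Both numbers are guesses and would need tuning.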