I'm trying to download pages from the Web Archive (web.archive.org) with Python, pausing between requests. But in less than a minute I get this error:
Code:
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
NewConnectionError Traceback (most recent call last)
NewConnectionError: : Failed to establish a new connection: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
MaxRetryError Traceback (most recent call last)
MaxRetryError: HTTPSConnectionPool(host='web.archive.org', port=443): Max retries exceeded with url: /web/20230331062349/http://100vann.ru/ (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
ConnectionError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
517 raise SSLError(e, request=request)
518
--> 519 raise ConnectionError(e, request=request)
520
521 except ClosedPoolError as e:
ConnectionError: HTTPSConnectionPool(host='web.archive.org', port=443): Max retries exceeded with url: /web/20230331062349/http://100vann.ru/ (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))
Code:
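import time

import requests

# `url` is defined elsewhere in the script; in the traceback above it is
# https://web.archive.org/web/20230331062349/http://100vann.ru/
headers = requests.utils.default_headers()
time.sleep(1)  # the one-second pause before each request
headers.update(
    {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
    }
)
webarchive = requests.get(url, headers=headers, timeout=15)
webarchive.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx response
if webarchive.status_code == 200:
    ...  # the rest of the snippet is cut off in the original post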
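For illustration only: if the refused connections come from rate limiting on the server side (an assumption; the traceback alone does not confirm it), a fixed one-second pause may be too short, and retrying with growing pauses is one common workaround. Below is a minimal sketch of that idea. The helper name `fetch_with_backoff` and all of its parameters are hypothetical and are not part of requests.

```python
import time


def fetch_with_backoff(fetch, url, max_attempts=5, base_delay=2.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call fetch(url), retrying with exponentially growing pauses.

    Hypothetical helper: `fetch` is any callable that takes a URL and
    either returns a response or raises one of the `retry_on` exceptions.
    """
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the last error propagate
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

With the code from the question it could be called as `fetch_with_backoff(lambda u: requests.get(u, headers=headers, timeout=15), url)`. The exceptions raised by requests derive from `OSError`, so the default `retry_on` covers the `ConnectionError` shown in the traceback. The delays (2, 4, 8, 16 seconds by default) are arbitrary starting values and may need tuning.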
More details here: https://stackoverflow.com/questions/779 ... rchive-org