aiohttp and requests give different responses for the same URL and parameters
I am trying to download hundreds of files from NexusMods. Most are hundreds of mebibytes in size (1 MiB = 1048576 bytes), and many are gibibytes in size (1 GiB = 1073741824 bytes).
I am using aiohttp + aiofiles to download them. My code works, but the whole process is complicated by my network conditions: long story short, I was born in China, I am still behind the Great Firewall of China (GFW), and I use VPNs that are constantly throttled by the GFW.
It is extremely easy for the downloads to hang and for progress to freeze. Connections easily become stale and the download speed drops to zero. The program then halts, waiting for data that will never arrive, without raising an exception; it just won't time out.
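The hang described above can be made visible by bounding each individual read instead of the whole download. The following is only a minimal sketch using the standard library; `read_chunks`, `StalledDownload`, and `fake_read` are hypothetical names introduced for illustration and are not part of the original code.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


class StalledDownload(Exception):
    """Raised when no data arrives within the allowed idle window."""


async def read_chunks(
    read: Callable[[], Awaitable[bytes]], idle_timeout: float
) -> AsyncIterator[bytes]:
    """Yield chunks from `read()` until it returns b"", failing fast on stalls."""
    while True:
        try:
            # Bound each individual read, not the whole download.
            chunk = await asyncio.wait_for(read(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            raise StalledDownload(f"no data for {idle_timeout} s") from None
        if not chunk:
            return
        yield chunk


async def demo() -> tuple[list[bytes], bool]:
    # Hypothetical source: two quick chunks, then a connection that goes silent.
    pending = [b"abc", b"def"]

    async def fake_read() -> bytes:
        if pending:
            return pending.pop(0)
        await asyncio.sleep(3600)  # simulates a stale connection
        return b""

    received, stalled = [], False
    try:
        async for chunk in read_chunks(fake_read, idle_timeout=0.05):
            received.append(chunk)
    except StalledDownload:
        stalled = True
    return received, stalled


RECEIVED, STALLED = asyncio.run(demo())
```

With aiohttp, `read` could plausibly be something like `lambda: resp.content.read(1 << 16)`. aiohttp also accepts `aiohttp.ClientTimeout(total=None, sock_read=...)` on the session, which limits the wait between reads in a similar way. Whether either one helps with this particular network is an assumption that would need testing.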
Using an external downloader avoids these problems, but such downloaders only have a GUI, so they are hard to automate and hard to integrate with my own PyQt6 GUI application.
So I tried changing the hosts file and disconnecting the VPN. This makes ping faster and the requests library downloads successfully, but aiohttp can't download the file because it somehow receives a different response for exactly the same parameters...
Steps to reproduce the error:
Assuming you are running Windows 10,
open C:\Windows\System32\drivers\etc\hosts file, you must run with administrative privileges
add the following line, then save

    45.150.242.245 files.nexus-cdn.com

run the following commands in cmd.exe:

    ipconfig /release
    ipconfig /flushdns
    ipconfig /renew

Now paste these lines of code into your Python interpreter (you must have the relevant libraries installed, of course):

    import asyncio
    import aiohttp
    import json
    import requests
    from pathlib import Path

    URL = "https://files.nexus-cdn.com/120/16892/D ... 50.242.245"
    HEADERS = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"
    }

    print(requests.head(URL, headers=HEADERS).headers)

    async def test():
        async with aiohttp.ClientSession(
            headers=HEADERS, connector=aiohttp.TCPConnector(ssl=False)
        ) as session:
            async with session.head(url=URL) as resp:
                return resp.headers

    print(asyncio.run(test()))

I don't know what you will see, but for me the output is always this:

    {'Server': 'nginx/1.24.0',
     'Date': 'Sat, 02 Mar 2024 08:19:09 GMT',
     'Content-Type': 'application/x-rar-compressed',
     'Content-Length': '108004175',
     'Last-Modified': 'Wed, 07 Oct 2015 12:58:46 GMT',
     'Connection': 'keep-alive',
     'ETag': '"56151706-670034f"',
     'Expires': 'Thu, 31 Dec 2037 23:55:55 GMT',
     'Cache-Control': 'max-age=315360000',
     'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload',
     'Accept-Ranges': 'bytes'}

Somehow aiohttp can't download the file.
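To compare the two responses more precisely than by eye, the header mappings can be diffed case-insensitively. This is a small sketch; `diff_headers` is a hypothetical helper and the sample values below are made up to show the shape of the result, they are not real server output.

```python
from typing import Mapping


def diff_headers(a: Mapping[str, str], b: Mapping[str, str]) -> dict:
    """Return {name: (value_in_a, value_in_b)} for headers that differ.

    Names are compared case-insensitively; a missing header shows up as None.
    """
    la = {k.lower(): v for k, v in a.items()}
    lb = {k.lower(): v for k, v in b.items()}
    return {
        name: (la.get(name), lb.get(name))
        for name in sorted(la.keys() | lb.keys())
        if la.get(name) != lb.get(name)
    }


# Made-up sample values, only to show the shape of the result.
first = {"Content-Type": "application/x-rar-compressed", "Content-Length": "108004175"}
second = {"content-type": "text/html", "Content-Length": "108004175", "Server": "nginx"}
DIFF = diff_headers(first, second)
```

In the reproduction above, the two arguments would be the `.headers` of the requests response and of the aiohttp response. Printing the status codes as well (`status_code` in requests, `status` in aiohttp) would show whether the two libraries even get the same kind of reply.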
The download link expires, and once it has expired you will get 403 responses. The following code generates the download link programmatically:

    FIELDS = (
        "_app_session",
        "fwroute",
        "jwt_fingerprint",
        "member_id",
        "pass_hash",
        "sid_develop",
    )

    def load_cookies_list(file: str) -> list:
        lines = Path(file).read_text().splitlines()
        return [dict(zip(FIELDS, lines[i : i + 6])) for i in range(0, len(lines), 6)]

    COOKIES = load_cookies_list("D:/cookies_list.txt")
    DOWNLOAD_LINK_GENERATOR = (
        "https://www.nexusmods.com/Core/Libs/Com ... ownloadUrl"
    )

    def generate_download_link(file_id: int, game_id: int) -> str:
        resp = requests.post(
            url=DOWNLOAD_LINK_GENERATOR,
            json={"fid": file_id, "game_id": game_id},
            cookies=COOKIES[0],
        )
        return json.loads(resp.content)["url"]

You need a NexusMods account. Go to www.nexusmods.com, log in to your NexusMods account, press F12, and find the cookies. Where they are depends on your browser: in Firefox, open the Storage tab; in Chrome, open the Application tab.
Copy the value of each cookie listed in the code (double-click, Ctrl+C, then Ctrl+V) into a text file, one value per line, in the listed order. Save the file and change the path in the code.
Now you can copy and paste the code into the console and run it.
The download link is generated by generate_download_link(16892, 120). If the download link has expired, repeat the procedure above to get a new one.
I have verified that aiohttp only gets a different response when I change the hosts file. If I undo the change and run the ipconfig commands again, print(asyncio.run(test())) gives the correct output, but the latency is much higher.
Why does aiohttp give a different response when I change the hosts file, and how do I fix this?
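Since the behaviour depends on the hosts file, it may be worth confirming which address Python itself resolves for the host before and after the change. A minimal sketch with the standard library follows; `resolve_ipv4` is a hypothetical helper name.

```python
import socket


def resolve_ipv4(host: str, port: int = 443) -> list[str]:
    """Return the distinct IPv4 addresses the OS resolver gives for `host`."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


# A literal address needs no lookup, so this line works offline.
LOOPBACK = resolve_ipv4("127.0.0.1")
```

With the hosts entry active, `resolve_ipv4("files.nexus-cdn.com")` would be expected to contain 45.150.242.245. If it does not, the override is not being picked up at all. This only checks name resolution; it does not by itself explain the different response.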