Getting access denied error while scraping API data using Python and Selenium
1733068177
Anonymous
I am trying to automate a process in which I access the API behind a product listing page (PLP) on a website and retrieve product-related information. With the URLs in urls1[], my code produces the correct output, but with the URLs in urls2[], I get an access denied error.
[b]Below is the first set of URLs:[/b]
urls1 = [
'https://staging1-japan.coach.com/feature/holiday2024_nkento.html',
'https://staging1-japan.coach.com/shop/new/women',
'https://staging1-japan.coach.com/shop/women',
'https://staging1-japan.coach.com/shop/men/bags',
'https://staging1-japan.coach.com/shop/collection-2',
'https://staging1-japan.coach.com/shop/gift/women/bestseller',
'https://staging1-japan.coach.com/shop/coachworld',
'https://staging1-japan.coach.com/shop/coachreloved/coach-reloved'
]
[b]Below is the second set of URLs:[/b]
urls2 = [
'https://staging1.coachoutlet.com/shop/black-friday-deals/view-all',
'https://staging1.coachoutlet.com/shop/women/view-all',
'https://staging1.coachoutlet.com/shop/men/view-all',
'https://staging1.coachoutlet.com/shop/bags/view-all',
'https://staging1.coachoutlet.com/shop/gifts/view-all',
'https://staging1.coachoutlet.com/shop/shop-by'
]
[b]Below is my code:[/b]
import requests
import warnings
from time import sleep

# Suppress SSL warnings
warnings.simplefilter('ignore', requests.packages.urllib3.exceptions.InsecureRequestWarning)

# List of URLs to iterate over
urls = [
    'https://staging1.coachoutlet.com/shop/black-friday-deals/view-all',
    'https://staging1.coachoutlet.com/shop/women/view-all',
    'https://staging1.coachoutlet.com/shop/men/view-all',
    'https://staging1.coachoutlet.com/shop/bags/view-all',
    'https://staging1.coachoutlet.com/shop/gifts/view-all',
    'https://staging1.coachoutlet.com/shop/shop-by'
]

# Full header simulation of a real browser session
headers = {
    "Cookie": "auth-bypass=true;",  # Ensure this is up-to-date
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
    'Accept': 'application/json',
    'Connection': 'keep-alive',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Referer': 'https://staging1.coachoutlet.com/shop',
    'Upgrade-Insecure-Requests': '1',
    'TE': 'Trailers'
}

# Start a session to handle cookies across requests
session = requests.Session()

# Iterate over each URL in the list
for url in urls:
    count_of_items = 1  # Reset item count for each URL
    page = 1  # Starting page for pagination
    if '/shop' not in url:
        print(f"No products available in this PLP: {url}")
        continue
    try:
        while True:
            ct = count_of_items + 15  # Adjust count to match pagination
            if page == 1:
                full_url = f"{url}"
            else:
                full_url = f"{url}?page={page}"
            if 'api/shop' not in full_url:
                full_url = full_url.replace('/shop', '/api/shop')
            print(f"Fetching: {full_url}")
            response = session.get(full_url, headers=headers, verify=False)
            if response.status_code == 403:
                print(f"Access denied for URL {full_url}. Check cookies or headers.")
                break
            print(f"Response Status Code: {response.status_code}")
            if response.status_code != 200:
                print(f"Failed to retrieve data from {full_url}")
                break
            products = response.json().get('pageData', {}).get('products', [])
            if not products:
                print(f"No products available in this PLP: {full_url}")
                break
            pro_count = response.json()['pageData'].get('total', 0)
            print(f"Total products found: {pro_count}")
            break
    except Exception as e:
        print(f"Exception raised for URL {url}: {e}")
        continue
[b]Output of my code when using urls1[]:[/b]
No products available in this PLP: https://staging1-japan.coach.com/feature/holiday2024_nkento.html
Fetching: https://staging1-japan.coach.com/api/shop/new/women
Response Status Code: 200
Total products found: 349
Fetching: https://staging1-japan.coach.com/api/shop/women
Response Status Code: 200
Total products found: 991
Fetching: https://staging1-japan.coach.com/api/shop/men/bags
Response Status Code: 200
Total products found: 138
Fetching: https://staging1-japan.coach.com/api/shop/collection-2
Response Status Code: 200
Total products found: 346
Fetching: https://staging1-japan.coach.com/api/shop/gift/women/bestseller
Response Status Code: 200
Total products found: 76
Fetching: https://staging1-japan.coach.com/api/shop/coachworld
Response Status Code: 200
No products available in this PLP: https://staging1-japan.coach.com/api/shop/coachworld
Fetching: https://staging1-japan.coach.com/api/shop/coachreloved/coach-reloved
Response Status Code: 200
Total products found: 20
[b]Output of my code when using urls2[]:[/b]
Fetching: https://staging1.coachoutlet.com/api/shop/black-friday-deals/view-all
Access denied for URL https://staging1.coachoutlet.com/api/shop/black-friday-deals/view-all. Check cookies or headers.
Fetching: https://staging1.coachoutlet.com/api/shop/women/view-all
Access denied for URL https://staging1.coachoutlet.com/api/shop/women/view-all. Check cookies or headers.
Fetching: https://staging1.coachoutlet.com/api/shop/men/view-all
Access denied for URL https://staging1.coachoutlet.com/api/shop/men/view-all. Check cookies or headers.
Fetching: https://staging1.coachoutlet.com/api/shop/bags/view-all
Access denied for URL https://staging1.coachoutlet.com/api/shop/bags/view-all. Check cookies or headers.
Fetching: https://staging1.coachoutlet.com/api/shop/gifts/view-all
Access denied for URL https://staging1.coachoutlet.com/api/shop/gifts/view-all. Check cookies or headers.
Fetching: https://staging1.coachoutlet.com/api/shop/api/shop-by
Access denied for URL https://staging1.coachoutlet.com/api/shop/api/shop-by. Check cookies or headers.
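One detail in the output above is separate from the 403 responses: the last URL becomes /api/shop/api/shop-by, because str.replace('/shop', '/api/shop') rewrites every occurrence of '/shop', including the one inside '/shop-by'. Below is a minimal sketch of building the API URL by prefixing the path once instead. It assumes the intended endpoint is simply the page path with /api in front, which the post does not confirm, and the helper name to_api_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_api_url(page_url: str) -> str:
    """Prefix the URL path with /api exactly once (hypothetical helper).

    Assumption: the API endpoint mirrors the page path under /api.
    """
    parts = urlsplit(page_url)
    path = parts.path
    # Only add the prefix if it is not already there, so the call is idempotent.
    if not path.startswith('/api/'):
        path = '/api' + path
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(to_api_url('https://staging1.coachoutlet.com/shop/shop-by'))
# → https://staging1.coachoutlet.com/api/shop/shop-by
```

This only addresses how the URL is built; it does not by itself explain why the coachoutlet host answers 403.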
More details here: [url]https://stackoverflow.com/questions/79241946/getting-access-denied-error-while-scraping-api-data-using-python-and-selenium[/url]