Scraping a URL using Beautiful Soup

Post by Anonymous » 23 Nov 2025, 00:54

I am new to data scraping.
In this case I want to get URLs like "https:// . . .", but the result is a list in the variable link that contains all the links on the page. Here is the code:

Code:

import requests
from bs4 import BeautifulSoup
url = 'https://www.detik.com/search/searchall?query=KPK'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
artikel = soup.findAll('div', {'class' : 'list media_rows list-berita'})
p = 1
link = []
for p in artikel:
    s = p.findAll('a', href=True)['href']
    link.append(s)

Running the code above produces an error like this:

Code:

TypeError                                 Traceback (most recent call last)
 in 
3 link = []
4 for p in artikel:
5         s = p.findAll('a', href=True)['href']
6         link.append(s)
TypeError: list indices must be integers or slices, not str

As a result, I want to get all of the https:// . . . links in
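The TypeError above comes from indexing the return value of findAll with 'href': findAll (also spelled find_all) returns a list-like collection of tags, while attribute lookup such as ['href'] only works on a single tag. One possible way around it is to loop over the anchors as well. The sketch below is only an illustration under stated assumptions: SAMPLE_HTML is made-up markup standing in for the real search page, collect_links is a hypothetical helper name, and matching on the single class list-berita is an assumption about how the page is structured.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the search results page; the real markup may differ.
SAMPLE_HTML = """
<div class="list media_rows list-berita">
  <article><a href="https://news.example.com/a1">First</a></article>
  <article><a href="https://news.example.com/a2">Second</a></article>
  <article><a href="/relative/path">Relative</a></article>
</div>
<div class="footer"><a href="https://example.com/about">About</a></div>
"""


def collect_links(html, container_class="list-berita"):
    """Return every https:// href found inside the matching container divs."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # find_all returns a list-like result, so each level needs its own loop.
    for container in soup.find_all("div", class_=container_class):
        for anchor in container.find_all("a", href=True):
            href = anchor["href"]  # indexing a single tag by attribute name works
            if href.startswith("https://"):
                links.append(href)
    return links


print(collect_links(SAMPLE_HTML))
```

With the real page, the same helper could presumably be called as collect_links(page.content) after requests.get(url), provided the container class still matches the live markup.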

Read more here: https://stackoverflow.com/questions/68014275/scraping-a-url-using-beautiful-soup
