Steps in my approach
Locate Map Pins: Each pin on the map represents a location.
Click a Pin: Clicking a pin opens a modal listing swim clubs or teams associated with that location.
Extract Data from Modal:
Club Name
Email
Phone
Website
Club Size
Address
Iterate Through Pins: Repeat the process for each pin.
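The fields listed under "Extract Data from Modal" could be collected into a small record type. This is a minimal sketch, assuming the email and phone are read from `mailto:` and `tel:` links as in the script; `SwimClub` and `strip_scheme` are hypothetical names, and the sample values are made up.

```python
from dataclasses import dataclass, asdict
from typing import Optional


def strip_scheme(href: Optional[str], scheme: str) -> Optional[str]:
    """Return href without a leading 'scheme:' prefix, or None if href is empty."""
    if not href:
        return None
    prefix = scheme + ":"
    return href[len(prefix):] if href.startswith(prefix) else href


@dataclass
class SwimClub:
    # Every field except the name is optional because a club may not list it.
    name: str
    email: Optional[str] = None
    phone: Optional[str] = None
    website: Optional[str] = None
    club_size: Optional[str] = None
    address: Optional[str] = None


club = SwimClub(
    name="Example Aquatics",  # made-up sample values for illustration
    email=strip_scheme("mailto:coach@example.org", "mailto"),
    phone=strip_scheme("tel:555-0100", "tel"),
)
row = asdict(club)  # plain dict, ready for pandas.DataFrame or csv.DictWriter
```

Keeping missing fields as `None` means a club without, say, a website still produces a row instead of being dropped.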
Problems
Modal Detection Issue:
After clicking a pin, the modal sometimes isn’t detected.
I use WebDriverWait with presence_of_element_located, but the script often fails with:
Error locating modal or swim club links: Message:
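One possible cause, offered as a guess rather than a diagnosis: `presence_of_element_located` only checks that a matching element exists in the DOM, so it can return a popup that is not visible yet, or one left over from the previous pin, since the script uses the same class `popup-content-container` for both modals. The sketch below polls for a visible popup instead. It relies only on the `find_elements` and `is_displayed` methods that Selenium drivers and elements provide; the helper name `wait_for_visible` is hypothetical.

```python
import time

CSS = "css selector"  # the string value of selenium's By.CSS_SELECTOR


def wait_for_visible(driver, selector, timeout=10.0, poll=0.25):
    """Poll until an element matching selector is displayed, then return it.

    Raises TimeoutError if nothing visible shows up within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        for element in driver.find_elements(CSS, selector):
            try:
                if element.is_displayed():
                    return element
            except Exception:
                # The element may have been replaced while we looked at it
                # (a stale reference); ignore it and poll again.
                pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no visible element for {selector!r}")
        time.sleep(poll)
```

With a real driver this would be called as `modal = wait_for_visible(driver, ".popup-content-container")` right after the click. Selenium's own `EC.visibility_of_element_located` covers the simple case; the explicit loop is shown only to make the polling visible.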
Efficient Pin Iteration:
The map spans the entire USA, and pins are scattered across the country.
Panning or zooming manually isn’t scalable, and I need a way to iterate through all pins efficiently.
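One way to avoid manual panning, sketched under assumptions: cover the country with a grid of viewport centers and move the map to each one in turn. The bounding box below is a rough figure for the contiguous United States (Alaska and Hawaii would need their own boxes), and `grid_centers` is a hypothetical helper. Moving the map is left out because it depends on whether the page exposes its map object to JavaScript; if it does, MapLibre's `jumpTo` could be called through `driver.execute_script`.

```python
from typing import Iterator, Tuple

# Approximate bounds of the contiguous USA in degrees (an assumption,
# rounded outward): west, south, east, north.
USA_BOUNDS = (-125.0, 24.0, -66.0, 50.0)


def grid_centers(bounds, step_deg: float) -> Iterator[Tuple[float, float]]:
    """Yield (longitude, latitude) centers of step_deg-sized cells covering bounds."""
    west, south, east, north = bounds
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    lat = south + step_deg / 2
    while lat - step_deg / 2 < north:
        lon = west + step_deg / 2
        while lon - step_deg / 2 < east:
            yield (lon, lat)
            lon += step_deg
        lat += step_deg


centers = list(grid_centers(USA_BOUNDS, step_deg=5.0))
```

Markers that show up in more than one cell would need de-duplicating afterwards, for example by club name and address.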
Here is the script I am using:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
import time

# Initialize Selenium WebDriver
options = webdriver.ChromeOptions()
# Uncomment to run headless; leave commented to debug visually
# options.add_argument('--headless')
driver = webdriver.Chrome(options=options)

# URL of the Swim Club Finder page
url = "https://www.usaswimming.org/home/find-a-team"

# Open the website
driver.get(url)

# Explicit wait, reused for the map and the modals
wait = WebDriverWait(driver, 20)

# Data storage
swim_club_data = []

try:
    # Locate all pins on the map
    print("Locating pins on the map...")
    pins = wait.until(EC.presence_of_all_elements_located((By.CLASS_NAME, "maplibregl-marker")))
    print(f"Found {len(pins)} pins on the map.")

    # Limit for testing
    for i, pin in enumerate(pins[:5]):  # Test with the first 5 pins
        print(f"Clicking pin {i+1}...")
        driver.execute_script("arguments[0].click();", pin)  # Use JavaScript click
        time.sleep(3)  # Allow modal to load

        # Locate the modal
        modal = wait.until(EC.presence_of_element_located((By.CLASS_NAME, "popup-content-container")))
        print("Modal located.")

        # Extract links to swim clubs in the modal
        # (the leading "." keeps the XPath relative to the modal, not the whole page)
        club_links = modal.find_elements(By.XPATH, ".//ul/li/a")
        print(f"Found {len(club_links)} swim club links.")

        for club_link in club_links:
            # Click on each club link
            print(f"Clicking swim club link: {club_link.text}...")
            driver.execute_script("arguments[0].click();", club_link)
            time.sleep(3)  # Allow details to load

            # Extract swim club details
            try:
                details_modal = wait.until(EC.presence_of_element_located((By.CLASS_NAME, "popup-content-container")))
                club_name = details_modal.find_element(By.CLASS_NAME, "popupTitle").text
                email = details_modal.find_element(By.CSS_SELECTOR, "a[href^='mailto:']").get_attribute("href").replace("mailto:", "")
                phone = details_modal.find_element(By.CSS_SELECTOR, "a[href^='tel:']").get_attribute("href").replace("tel:", "")
                website = details_modal.find_element(By.CSS_SELECTOR, "a[target='_blank']").get_attribute("href")
                club_size = details_modal.find_element(By.XPATH, ".//li[contains(text(), 'Club Size')]").text.split(": ")[1]
                # Selenium cannot return XPath text() nodes as elements, so read the
                # text node after the subtitle list via JavaScript
                # (assumes the address is the node right after that list)
                subtitle = details_modal.find_element(By.XPATH, ".//ul[@class='popupSubTitle']")
                address = (driver.execute_script(
                    "return arguments[0].nextSibling ? arguments[0].nextSibling.textContent : '';",
                    subtitle) or "").strip()

                swim_club_data.append({
                    "Name": club_name,
                    "Email": email,
                    "Phone": phone,
                    "Website": website,
                    "Club Size": club_size,
                    "Address": address
                })
                print(f"Extracted data for: {club_name}")
            except Exception as e:
                print(f"Error extracting club details: {e}")

            # Close the club details modal
            try:
                close_button = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "popup-close")))
                close_button.click()
                time.sleep(2)
            except Exception as e:
                print(f"Error closing club details modal: {e}")

        # Close the pin modal
        try:
            close_button = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "popup-close")))
            close_button.click()
            time.sleep(2)
        except Exception as e:
            print(f"Error closing pin modal: {e}")
except Exception as e:
    print(f"Error interacting with the map: {e}")
finally:
    # Quit the browser even if something above failed
    driver.quit()

# Print the results
print("Final Extracted Data:")
for club in swim_club_data:
    print(club)

# Save to CSV
if swim_club_data:
    df = pd.DataFrame(swim_club_data)
    df.to_csv("swim_clubs.csv", index=False)
    print("Data saved to 'swim_clubs.csv'")
else:
    print("No data to save.")
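A side note on the extraction step in the script above: the field lookups share one `try` block, so a club with no email link, for instance, raises `NoSuchElementException` and the whole record is lost. A sketch of reading optional fields with `find_elements`, which returns an empty list instead of raising, follows; `attr_or_none` and `text_or_none` are hypothetical helper names.

```python
def attr_or_none(container, by, value, attribute):
    """Return attribute of the first match inside container, or None if no match."""
    matches = container.find_elements(by, value)
    return matches[0].get_attribute(attribute) if matches else None


def text_or_none(container, by, value):
    """Return the stripped text of the first match inside container, or None."""
    matches = container.find_elements(by, value)
    return matches[0].text.strip() if matches else None
```

In the script this might look like `email_href = attr_or_none(details_modal, By.CSS_SELECTOR, "a[href^='mailto:']", "href")`, with the `mailto:` prefix removed afterwards only when the value is not `None`.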
Questions
Why isn't the modal loading, or why isn't Selenium detecting it? Are there additional steps or elements I should wait for after clicking a pin?
The pins are spread across the entire USA. Is there a way to pan and zoom the map programmatically to discover all pins?
Is it possible to extract the data through an API or a JSON structure embedded in the page?
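On the question about embedded data, only inspection can confirm whether the page carries its data in the HTML; the browser's Network tab would also show whether the map loads its pins from a JSON endpoint. If the HTML does contain JSON inside `<script>` tags, a sketch like the following could pull it out of `driver.page_source`. The sample HTML is invented for illustration and does not reflect the real page, and `JsonScriptParser` and `embedded_json` are hypothetical names.

```python
import json
from html.parser import HTMLParser

JSON_TYPES = ("application/json", "application/ld+json")


class JsonScriptParser(HTMLParser):
    """Collect the parsed contents of <script> tags whose type is JSON."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") in JSON_TYPES:
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # not valid JSON after all; skip it


def embedded_json(html):
    """Return a list of JSON documents found in JSON-typed script tags."""
    parser = JsonScriptParser()
    parser.feed(html)
    return parser.documents


# Invented sample, standing in for driver.page_source
sample = '<html><script type="application/json">{"clubs": [{"name": "Example Aquatics"}]}</script></html>'
docs = embedded_json(sample)
```

If this returns nothing for the real page, the data is more likely fetched by a separate request after load, and that request is the thing to look for.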
So far I have adjusted the wait times and timeouts, and verified the class names and elements with the browser dev tools. Any guidance or suggestions would be greatly appreciated!
More details here: https://stackoverflow.com/questions/793 ... m-efficien