I would like to scrape the marathon results from the link (call it page A): https://www.marathon.tokyo/2023/result/index.php
If I choose 'Marathon Men' for the first option and then search, I get to the following page showing the results (call it page B):

When I click a name, I get to that athlete's individual result (call it page C):

My question is: how do I get from page A to page C? I have no problem scraping the data I want from page C. The problem is getting from page A to page B, obtaining all the URLs that point to the individual result entries (page C), and then navigating to each page C.
To get from page A to page B, I have something like the following:

    from selenium import webdriver
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

    url = 'https://www.marathon.tokyo/2023/result/index.php'
    driver = webdriver.Chrome()
    driver.get(url)

    options = driver.find_elements(By.TAG_NAME, "option")
    for option in options:
        if 'Marathon Men' in option.text:
            print(option.text)
            option.click()  # click on the option 100
            break

It does automatically select the correct option (Marathon Men), but I don't know how to click the 'search' button.
To get from page B to page C, I tried the following code while on page B:

    raw_links = driver.find_elements(By.XPATH, '//a[@href]')
    for link in raw_links:
        l = link.get_attribute("href")
        print("raw_link:{}".format(l))

And I get the following output:

    raw_link:javascript:page(2);
    raw_link:javascript:page(3);
    raw_link:javascript:page(4);
    # and so on

Again, the problem is that I don't know how to convert those to clickable URLs and navigate to them.
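One direction I have been considering, as an untested sketch rather than a working solution: the `javascript:page(N);` hrefs are not URLs at all but calls to a JavaScript function, so instead of converting them it might be possible to pull out the page numbers and have the browser run the same call through Selenium's `execute_script`. The helper names `extract_page_numbers` and `go_to_page` below are my own, and the sketch assumes that `page` is a global function on page B, which the hrefs suggest but which I have not confirmed.

```python
import re

# Matches hrefs of the form "javascript:page(2);" and captures the page number.
PAGE_CALL = re.compile(r"^javascript:\s*page\((\d+)\);?$")


def extract_page_numbers(hrefs):
    """Return the sorted, de-duplicated page numbers found in javascript:page(N) hrefs."""
    numbers = set()
    for href in hrefs:
        if not href:
            continue  # get_attribute("href") can return None
        match = PAGE_CALL.match(href.strip())
        if match:
            numbers.add(int(match.group(1)))
    return sorted(numbers)


def go_to_page(driver, number):
    """Ask the browser to run the page's own page(N) function.

    Assumption: `driver` is a Selenium WebDriver currently on page B and
    `page` is defined globally there (not verified against the real site).
    """
    driver.execute_script("page(arguments[0]);", number)
```

With the `raw_links` list from above, this would be used roughly as `extract_page_numbers(link.get_attribute("href") for link in raw_links)`, followed by `go_to_page(driver, n)` for each number. Since each entry in `raw_links` is a live element, calling `link.click()` on it may well have the same effect, although the elements would presumably need to be looked up again after each page change.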
Any help to get me started would be greatly appreciated.
Source: https://stackoverflow.com/questions/780 ... ed-website