I have written some code in Python to scrape some items from a webpage. There is a link (titled "see more") attached to each container. If I click on that link, I reach a page where all the information is available. It would be convenient to follow the "see more" link and parse everything from there. However, my goal is to parse the first two items from the first page and then, after following the "see more" link to the other page, parse the rest. I have tried to do exactly that, and it works fine. At this point I'm doubtful whether the way I'm doing things is right or error-prone, because I pass two items from the first page into the second function and print them inside the for loop defined there. It gives me accurate results, though. Any suggestion as to whether what I did here is ideal, or any guidance on why I should not do this, will be highly appreciated.
This is the site link: Page_url
This is the full script:
import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

url = 'replace_with_the_above_link'

def glean_items(main_link):
    res = requests.get(main_link)
    soup = BeautifulSoup(res.text, "lxml")
    for item in soup.select('.course-list-item'):
        # select_one() returns a single tag; select() returns a list, which has no .text
        event = item.select_one(".lead").text
        date = item.select_one(".date").text
        link = urljoin(main_link, item.select_one(".pull-right a")['href'])
        parse_more(event, date, link)

def parse_more(event_name, ev_date, i_link):  ## notice the two items (event_name, ev_date)
    res = requests.get(i_link)
    soup = BeautifulSoup(res.text, "lxml")
    for items in soup.select('.e-loc-cost'):
        location = ' '.join([' '.join(item.text.split()) for item in items.select(".event-add")])
        cost = ' '.join([' '.join(item.text.split()) for item in items.select(".costs")])
        print(event_name, ev_date, location, cost)  ## again take a look: I used those two items within this newly created for loop.

if __name__ == '__main__':
    glean_items(url)
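To make the hand-off easier to reason about, here is a minimal sketch of the same pattern with the network calls taken out. It assumes the listing page has already been reduced to (event, date, link) triples and each detail page to (location, cost) pairs. The sample data in `LISTING` and `DETAILS` and the extra `fetch` parameter are hypothetical and not part of the script above. It shows that values passed into the second function are ordinary arguments, fixed for the duration of each call, so every row produced inside the inner loop is paired with the event and date it came from.

```python
# Hypothetical stand-ins for the two parsed pages (invented sample data).
LISTING = [
    ("Event A", "1 Jan", "/a"),
    ("Event B", "2 Feb", "/b"),
]
DETAILS = {
    "/a": [("London", "100"), ("Leeds", "90")],
    "/b": [("York", "80")],
}

def parse_more(event_name, ev_date, i_link, fetch):
    """Yield one combined row per detail entry found behind i_link."""
    for location, cost in fetch(i_link):
        # event_name and ev_date are plain arguments; they do not change
        # while this call is running, so each row gets the matching pair.
        yield (event_name, ev_date, location, cost)

def glean_items(listing, fetch):
    """Walk the listing and delegate each entry to parse_more."""
    for event, date, link in listing:
        yield from parse_more(event, date, link, fetch)

# fetch is any callable mapping a link to its detail rows; here a dict lookup.
rows = list(glean_items(LISTING, DETAILS.__getitem__))
for row in rows:
    print(*row)
```

Yielding rows instead of printing them inside the loop is a design choice, not a requirement: it keeps the output step separate, so the same functions could feed a CSV writer or a test.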