Page With Anti-scraping Protection In The Code?
I am trying to extract information from a web page. With XPath Helper (a Chrome extension) the content shows up perfectly, but when I move the same query to Scrapy it returns 'None'.
Solution 1:
The element you're targeting might be dynamically rendered, meaning JavaScript adds it after the initial HTML loads, so Scrapy never sees it. I tried the spider below and got it to work by targeting the price lower down on the page instead.
import scrapy


class TestSpider(scrapy.Spider):
    name = 'testspider'

    def start_requests(self):
        # Request the page from the question
        return [scrapy.Request(
            url='https://cutt.ly/bjj3ohW',
        )]

    def parse(self, response):
        # Take the price lower down on the page instead
        price = response.css('.price-final > strong::text').get()
        print(price)
A good way to test whether it's dynamically rendered is to open the inspect panel in Chrome (F12) and look under the Network tab. Reload the page and look at the first response, which should be an .html file. Click on that file and then on Response. There you can see the HTML that Scrapy actually receives and can parse. Press Ctrl+F and search for the class or id from the CSS selector you're trying to parse. If it isn't there, the element is most likely added by JavaScript after the page loads.
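The same Ctrl+F check can be roughly scripted. This is only a minimal sketch using the standard library, not part of the spider above: the helper names (`ClassFinder`, `count_class_in_html`, `fetch_raw_html`), the sample HTML, and the `price-dynamic` class are all made up for illustration, and the User-Agent header is an assumption that some sites may reject.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ClassFinder(HTMLParser):
    """Counts tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.matches += 1


def count_class_in_html(html, class_name):
    """Return how many tags in the raw HTML carry class_name."""
    finder = ClassFinder(class_name)
    finder.feed(html)
    return finder.matches


def fetch_raw_html(url):
    """Download the page without running any JavaScript (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


# Made-up sample standing in for the first .html response in the Network tab
sample_html = '<div class="price-final"><strong>$10</strong></div><div id="app"></div>'
static_count = count_class_in_html(sample_html, "price-final")
dynamic_count = count_class_in_html(sample_html, "price-dynamic")
print(static_count, dynamic_count)
```

For a real page you would pass `fetch_raw_html(url)` instead of `sample_html`. A count of zero for a class you can see in the Elements panel suggests the element is not in the initial HTML, so a plain Scrapy request won't find it.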