XPath Works for Just One Item When I Add // to It

I have this html page

Solution 1:

Say your page looks like (page.html):

<page>
  <div id="results-list">
    <div class="item paid-featured-item">
      <div class="something"><div class="title">Title 1</div></div>
      <div class="anotherthing"></div>
    </div>
    <div class="item paid-featured-item">
      <div class="something"><div class="title">Title 2</div></div>
      <div class="anotherthing"></div>
    </div>
    <div class="item paid-featured-item">
      <div class="something"><div class="title">Title 3</div></div>
      <div class="anotherthing"></div>
    </div>
    <div class="item paid-featured-item">
      <div class="something"><div class="title">Title 4</div></div>
      <div class="anotherthing"></div>
    </div>
  </div>
</page>

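To extract each title, select the items first, then query each item with a relative XPath. Note the leading "." in ".//": a path that starts with a bare "//" is evaluated against the whole document rather than the current item, so it would match every title on the page for each item.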

from scrapy.selector import Selector

# Read the sample page and build a selector over it.
with open('page.html') as f:
    sel = Selector(text=f.read())

container = sel.xpath('//div[@id="results-list"]')
# The leading "." keeps each query relative to the node it is called on.
items = container.xpath('.//div[@class="item paid-featured-item"]')
for item in items:
    # *extracted* is a single-item list containing the title.
    extracted = item.xpath('.//div[@class="title"]/text()').extract()
    title = extracted[0]
    print(title)

This will output:

Title 1
Title 2
Title 3
Title 4
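
The scoping difference behind the question can also be illustrated without Scrapy. The following is a minimal sketch using only the standard library's xml.etree.ElementTree, assuming the markup is well-formed XML as in the sample page. The names PAGE, all_titles and per_item are made up for this illustration, and the sample is trimmed to two items. ElementTree only accepts relative paths on elements, so the document-wide search that a leading "//" performs is emulated here by querying from the root.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of the sample page (two items instead of four).
PAGE = """
<page><div id="results-list">
  <div class="item paid-featured-item">
    <div class="something"><div class="title">Title 1</div></div>
  </div>
  <div class="item paid-featured-item">
    <div class="something"><div class="title">Title 2</div></div>
  </div>
</div></page>
""".strip()

root = ET.fromstring(PAGE)
items = root.findall('.//div[@class="item paid-featured-item"]')

# Document-wide search: this is what a leading "//" amounts to,
# no matter which item the query is called on.
all_titles = [d.text for d in root.findall('.//div[@class="title"]')]

# Search scoped to each item: this is what ".//" gives you.
per_item = [
    [d.text for d in item.findall('.//div[@class="title"]')]
    for item in items
]

print(all_titles)
print(per_item)
```

The document-wide list holds every title, while each scoped query returns only the title inside its own item, which is why taking the first match of an unscoped query yields the same title on every loop iteration.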
