Python lxml XPath: get a node's attribute that matches a specific string pattern


I'm learning XPath and trying to get the value of a specific node attribute, using a Google Play Store page as an example, with Python lxml/html. In the code below I want the developer email, which is the value of the "href" attribute (starting with "mailto:") on an "a" node. My Python snippet returns the app name but an empty developer email. Thank you.

    <html>
    <div class="id-app-title" tabindex="0">candy crush saga</div>
    <div class="meta-info meta-info-wide">
      <div class="title"> developer </div>
      <a class="dev-link" href="https://www.google.com/url?q=http://candycrush.com" rel="nofollow" target="_blank"> visit website </a>
      <a class="dev-link" href="mailto:candycrush@kingping.com" rel="nofollow" target="_blank">candycrush@kingping.com </a> ##interesting part here
    </div>
    </html>

Python code (2.7):

    import requests
    from lxml import html

    def get_app_from_link(self, link):
        start_page = requests.get(link)
        #print start_page.text
        tree = html.fromstring(start_page.text)
        name = tree.xpath('//div[@class="id-app-title"]/text()')[0]
        #developer=tree.xpath('//div[@class="dev-link"]//*/div/@href')
        developer = tree.xpath('//div[contains(@href,"mailto") and @class="dev-link"]/text()')
        print name, developer
        return

Your XPath selects the tag div, but the href attribute and the dev-link class are on an a element. Use a instead of div:

    '//a[contains(@href,"mailto") and @class="dev-link"]/text()'

Also, your function doesn't return anything. Return the values like this:

    def get_app_from_link(self, link):
        # code
        return name, developer
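
Putting both fixes together, a minimal sketch might look like the following. It parses an inline copy of the sample HTML from the question instead of fetching the page with requests, and it uses starts-with() rather than contains(), since the question asks for an href starting with "mailto:". The names SAMPLE and parse_app are made up for illustration and are not part of the original code.

```python
from lxml import html

# Inline copy of the sample markup from the question (assumption: the real
# page has the same structure).
SAMPLE = """
<html>
<div class="id-app-title" tabindex="0">candy crush saga</div>
<div class="meta-info meta-info-wide">
  <div class="title"> developer </div>
  <a class="dev-link" href="https://www.google.com/url?q=http://candycrush.com" rel="nofollow" target="_blank"> visit website </a>
  <a class="dev-link" href="mailto:candycrush@kingping.com" rel="nofollow" target="_blank">candycrush@kingping.com </a>
</div>
</html>
"""


def parse_app(page_source):
    """Return (app name, developer email or None) from the page source."""
    tree = html.fromstring(page_source)
    name = tree.xpath('//div[@class="id-app-title"]/text()')[0]
    # Select the href attribute of the <a> whose href starts with "mailto:".
    hrefs = tree.xpath(
        '//a[@class="dev-link" and starts-with(@href, "mailto:")]/@href'
    )
    # Strip the "mailto:" prefix; fall back to None if no such link exists.
    email = hrefs[0][len("mailto:"):] if hrefs else None
    return name, email


name, email = parse_app(SAMPLE)
```

Selecting /@href and stripping the prefix is one option. Selecting /text() as in the answer also works when the link text is the address itself, though the text may then carry surrounding whitespace.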
