
Tuesday, December 13, 2011

Scraping

lxml is a useful Python library for scraping. Here is an example of a scraping script. Install the two dependencies first:

pip install requests
pip install lxml

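#! /usr/bin/python
import requests
from lxml import html

# Fetch the page and parse the raw response bytes into an element tree
r = requests.get('https://www.google.com/')
tree = html.fromstring(r.content)

# Look up the element with id "prm" (raises KeyError if no such element exists)
elements = tree.get_element_by_id("prm")

# Print the text of each child element (Python 2 print statement)
for el in elements:
    print el.text_content()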

Note: It works with Python 2.6.1 (it did not work with Python 3). LXML specs
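For anyone on Python 3, here is a minimal sketch of the same idea. It is an illustration under a few assumptions, not the original script: it parses an inline HTML string instead of fetching a live page, and the markup, the id "results" and the helper name child_texts are all made up for the example. It also passes a default to get_element_by_id, so a missing id gives an empty list instead of a KeyError.

```python
from lxml import html

# Hypothetical markup standing in for a downloaded page; the id "results"
# and the list items are invented for this illustration.
PAGE = b"""
<html><body>
  <ul id="results">
    <li>First <b>item</b></li>
    <li>Second item</li>
  </ul>
</body></html>
"""


def child_texts(page_bytes, element_id):
    """Return the text of each child of the element with the given id."""
    tree = html.fromstring(page_bytes)
    # get_element_by_id raises KeyError when no element has this id,
    # unless a default is passed as the second argument.
    container = tree.get_element_by_id(element_id, None)
    if container is None:
        return []
    # text_content() joins the text of an element and all its descendants.
    return [child.text_content().strip() for child in container]


for text in child_texts(PAGE, "results"):
    print(text)
```

To use it against a real page, the bytes from requests.get(url).content could be passed in place of PAGE, with whatever id that page actually uses.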