Python urllib

urllib is a Python standard-library module for opening URLs and making HTTP requests. The examples below cover GET requests, POST requests, and decoding the response body, in both Python 2 and Python 3.

Python urllib: HTTP GET

Python 2.x Version ≤ 2.7

Python 2

import urllib
response = urllib.urlopen('http://stackoverflow.com/documentation/')

urllib.urlopen() returns a response object, which can be handled much like a file.

print response.code

Prints: 200

response.code holds the HTTP status code of the response: 200 is OK, 404 is Not Found, and so on.
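As a small illustration of how status codes map to their standard names, the Python 3 sketch below looks a code up in the standard library's http.HTTPStatus enum. The helper name describe_status is hypothetical and exists only for this example; it is not part of urllib.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '200 OK' for an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Codes outside the standard registry have no predefined phrase.
        return f"{code} (unrecognized)"
    return f"{status.value} {status.phrase}"


for code in (200, 404, 500):
    print(describe_status(code))
```

A value read from response.code can be passed to the helper in the same way.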

print response.read()
'\r\n\r\n\r\n\r\nDocumentation - Stack. etc'

response.read() and response.readlines() can be used to read the actual HTML returned from the request. These methods operate similarly to file.read*.

Python 3.x Version ≥ 3.0

Python 3

import urllib.request
print(urllib.request.urlopen("http://stackoverflow.com/documentation/"))

Prints: <http.client.HTTPResponse at 0x7f37a97e3b00>

response = urllib.request.urlopen("http://stackoverflow.com/documentation/")
print(response.code)

Prints: 200

print(response.read())

Prints: b'<!DOCTYPE html>\r\n<html>\r\n<head>\r\n\r\n<title>Documentation - Stack Overflow

The module has been updated for Python 3.x, but use cases remain basically the same. urllib.request.urlopen will return a similar file-like object.
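Because the Python 3 response behaves like a file, it can also be used in a with-block and read line by line. The sketch below is one way to do that. The read_lines helper and the SAMPLE_URL value are assumptions made for illustration; a data: URL is used so that the example runs without network access.

```python
import urllib.request

# A data: URL keeps this sketch offline; an http:// URL is read the same way.
SAMPLE_URL = "data:text/plain;charset=utf-8,first%20line%0Asecond%20line%0A"


def read_lines(url):
    """Open a URL and return its body as a list of byte strings, one per line."""
    # The with-block closes the response when done, just like a file.
    with urllib.request.urlopen(url) as response:
        return response.readlines()


lines = read_lines(SAMPLE_URL)
print(lines)
```

Note that the lines come back as bytes, not str, which is why a decoding step is needed before treating them as text.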

HTTP POST

To POST data, pass the encoded query arguments as data to urlopen().
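To see what "encoded query arguments" look like before anything is sent, the Python 3 sketch below prepares the payload and wraps it in a urllib.request.Request object. Building the Request does not contact the server. Using Request here is an illustrative choice; the URL and field values are the same ones used in this section's examples.

```python
import urllib.parse
import urllib.request

# Form fields from the example in this section.
query_parms = {'username': 'stackoverflow', 'password': 'me.me'}

# urlencode() builds the "key=value&key=value" string; urlopen() needs bytes.
encoded_parms = urllib.parse.urlencode(query_parms).encode('utf-8')

# Building a Request does not contact the server; it only describes the call.
request = urllib.request.Request("https://stackoverflow.com/users/login",
                                 data=encoded_parms)

print(encoded_parms)         # the body that would be sent
print(request.get_method())  # POST, because data is present
```

When no data argument is supplied, the same Request reports GET as its method, which is why passing data is what turns the call into a POST.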

Python 2.x Version ≤ 2.7

Python 2

import urllib
query_parms = {'username':'stackoverflow', 'password':'me.me'}
encoded_parms = urllib.urlencode(query_parms)
response = urllib.urlopen("https://stackoverflow.com/users/login", encoded_parms)
response.code

Output: 200

response.read()

Output: '\r\n\r\n\r\n\r\nLog In - Stack Overflow'

Python 3.x Version ≥ 3.0

Python 3

import urllib.parse
import urllib.request
query_parms = {'username':'stackoverflow', 'password':'me.me'}
encoded_parms = urllib.parse.urlencode(query_parms).encode('utf-8')
response = urllib.request.urlopen("https://stackoverflow.com/users/login", encoded_parms)
response.code

Output: 200

response.read()

Output: b'<!DOCTYPE html>\r\n<html>….etc'

Decode received bytes according to content type encoding

The received bytes have to be decoded with the correct character encoding to be interpreted as text:

Python 3.x Version ≥ 3.0

import urllib.request
response = urllib.request.urlopen("http://stackoverflow.com/")
data = response.read()
encoding = response.info().get_content_charset()
html = data.decode(encoding)

Python 2.x Version ≤ 2.7

import urllib2
response = urllib2.urlopen("http://stackoverflow.com/")
data = response.read()
# getparam('charset') reads the charset from the Content-Type header;
# getencoding() would return the Content-Transfer-Encoding instead.
encoding = response.info().getparam('charset')
html = data.decode(encoding)
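Not every server declares a charset, in which case get_content_charset() returns None and the decode step fails. The Python 3 sketch below shows one way to guard against that. The decode_body helper and its UTF-8 default are assumptions made for illustration, and a data: URL stands in for a real page so that the example runs offline.

```python
import urllib.request


def decode_body(response, default="utf-8"):
    """Decode a response body using its declared charset, or a fallback."""
    data = response.read()
    # get_content_charset() returns None when the header names no charset.
    encoding = response.info().get_content_charset() or default
    return data.decode(encoding)


# A data: URL stands in for a real page so the sketch runs offline.
with urllib.request.urlopen("data:text/html;charset=iso-8859-1,caf%E9") as response:
    html = decode_body(response)
print(html)
```

The fallback encoding is a guess, so a page that neither declares a charset nor uses UTF-8 may still raise UnicodeDecodeError.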
