HTML Parsing in Python

HTML Parsing in Python is another important parameter used by different programmers in performing different tasks. Learn more about it here.

Using CSS selectors in BeautifulSoup

BeautifulSoup has a limited support for CSS selectors, but covers most commonly used ones. Use SELECT() method to find multiple elements and select_one() to find a single element.

Basic example:

from bs4 import BeautifulSoup
data = “””

  • item1
  • item2
  • item3

“””

soup = BeautifulSoup(data, "html.parser")
for item in soup.select("li.item"):
print(item.get_text())

Prints:

item1
item2
item3

PyQuery

pyquery is a jquery-like library for python. It has very well support for css selectors.

from pyquery import PyQuery

html = “””

Sales

Lorem46
Ipsum12
Dolor27
Sit90

“””

doc = PyQuery(html)
title = doc('h1').text()
print title
table_data = []
rows = doc('#table > tr')
for row in rows:
name = PyQuery(row).find('td').eq(0).text()
value = PyQuery(row).find('td').eq(1).text()
print "%s\t %s" % (name, value)

HTML Parsing in Python: Locate a text after an element in BeautifulSoup

Imagine you have the following HTML: Name: John Smith

And you need to locate the text “John Smith” after the label element.

In this case, you can locate the label element by text and then use .next_sibling property:

from bs4 import BeautifulSoup
data = """ Name: John Smith
"""
soup = BeautifulSoup(data, "html.parser")
label = soup.find("label", text="Name:")
print(label.next_sibling.strip())
Prints John Smith.

Read more

Must Read Python Interview Questions

200+ Python Tutorials With Coding Examples

Python Language Basics TutorialPython String Representations of Class Instances
Python For Beginners TutorialPython Debugging Tutorial
Python Data Types TutorialReading and Writing CSV File Using Python
Python Indentation TutorialWriting to CSV in Python from String/List
Python Comments and Documentation TutorialPython Dynamic Code Execution Tutorial
Python Date And Time TutorialPython Code Distributing using Pyinstaller
Python Date Formatting TutorialPython Data Visualization Tutorial
Python Enum TutorialPython Interpreter Tutorial
Python Set TutorialPython Args and Kwargs
Python Mathematical Operators TutorialPython Garbage Collection Tutorial
Python Bitwise Operators TutorialPython Pickle Data Serialisation
Python Bolean Operators TutorialPython Binary Data Tutorial
Python Operator Precedance TutorialPython Idioms Tutorial
Python Variable Scope And Binding TutorialPython Data Serialization Tutorial
Python Conditionals TutorialPython Multiprocessing Tutorial
Python Comparisons TutorialPython Multithreading Tutorial
Python Loops TutorialPython Processes and Threads
Python Arrays TutorialPython Concurrency Tutorial
Python Multidimensional Arrays TutorialPython Parallel Computation Tutorial
Python List TutorialPython Sockets Module Tutorial
Python List Comprehensions TutorialPython Websockets Tutorial
Python List Slicing TutorialSockets Encryption Decryption in Python
Python Grouby() TutorialPython Networking Tutorial
Python Linked Lists TutorialPython http Server Tutorial
Linked List Node TutorialPython Flask Tutorial
Python Filter TutorialIntroduction to Rabbitmq using Amqpstorm Python
Python Heapq TutorialPython Descriptor Tutorial
Python Tuple TutorialPython Tempflile Tutorial
Python Basic Input And Output TutorialInput Subset and Output External Data Files using Pandas in Python
Python Files And Folders I/O TutorialUnzipping Files in Python Tutorial
Python os.path TutorialWorking with Zip Archives in Python
Python Iterables And Iterators Tutorialgzip in Python Tutorial
Python Functions TutorialStack in Python Tutorial
Defining Functions With List Arguments In PythonWorking with Global Interpreter Lock (GIL)
Functional Programming In PythonPython Deployment Tutorial
Partial Functions In PythonPython Logging Tutorial
Decorators Function In PythonPython Server Sent Events Tutorial
Python Classes TutorialPython Web Server Gateway Interface (WSGI)
Python Metaclasses TutorialPython Alternatives to Switch Statement
Python String Formatting TutorialPython Packing and Unpacking Tutorial
Python String Methods TutorialAccessing Python Sourcecode and Bytecode
Using Loops Within Functions In PythonPython Mixins Tutorial
Python Importing Modules TutorialPython Attribute Access Tutorial
Difference Betweeb Module And Package In PythonPython Arcpy Tutorial
Python Math Module TutorialPython Abstract Base Class Tutorial
Python Complex Math TutorialPython Plugin and Extension Classes
Python Collections Module TutorialPython Immutable Datatypes Tutorial
Python Operator Module TutorialPython Incompatibilities Moving from Python 2 to Python 3
Python JSON Module TutorialPython 2to3 Tool Tutorial
Python Sqlite3 Module TutorialNon-Official Python implementations
Python os Module TutorialPython Abstract Syntax Tree
Python Locale Module TutorialPython Unicode and Bytes
Python Itertools Module TutorialPython Serial Communication (pyserial)
Python Asyncio Module TutorialNeo4j and Cypher using Py2Neo
Python Random Module TutorialBasic Curses with Python
Python Functools Module TutorialTemplates in Python
Python dis Module TutorialPython Pillow
Python Base64 Module TutorialPython CLI subcommands with precise help output
Python Queue Module TutorialPython Database Access
Python Deque Module TutorialConnecting Python to SQL Server
Python Webbrowser Module TutorialPython and Excel
Python tkinter TutorialPython Turtle Graphics
Python pyautogui Module TutorialPython Persistence
Python Indexing And Slicing TutorialPython Design Patterns
Python Plotting With Matplotlib TutorialPython hashlib
Python Graph Tool TutorialCreating a Windows Service Using Python
Python Generators TutorialMutable vs Immutable (and Hashable) in Python
Python Reduce TutorialPython configparser
Python Map Function TutorialPython Optical Character Recognition
Python Exponentiation TutorialPython Virtual Environments
Python Searching TutorialPython Virtual Environment – virtualenv
Sorting Minimum And Maximum In PythonPython Virtual environment with virtualenvwrapper
Python Print Function TutorialCreate virtual environment with virtualenvwrapper in windows
Python Regular Expressions Regex TutorialPython sys Tutorial
Copying Data In Python TutorialChemPy – Python package
Python Context Managers (“with” Statement) TutorialPython pygame
Python Name Special Variable TutorialPython pyglet
Checking Path Existence And Permissions In PythonWorking with Audio in Python
Creating Python Packages TutorialPython pyaudio
Usage of pip Module In Python TutorialPython shelve
Python PyPi Package Manager TutorialIoT Programming with Python and Raspberry PI
Parsing Command Line Arguments In Pythonkivy – Cross-platform Python Framework for NUI Development
Python Subprocess Library TutorialPandas Transform
Python setup.py TutorialPython vs. JavaScript
Python Recursion TutorialCall Python from C#
Python Type Hints TutorialPython Writing Extensions
Python Exceptions TutorialPython Lex-Yacc
Raise Custom Exceptions In PythonPython Unit Testing
Python Commonwealth Exceptions TutorialPython py.test
Python urllib TutorialPython Profiling
Web Scraping With Python TutorialPython Speed of Program
Python HTML Parsing TutorialPython Performance Optimization
Manipulating XML In PythonPython Security and Cryptography
Python Requests Post TutorialSecure Shell Connection in Python
Python Distribution TutorialPython Anti Patterns
Python Property Objects TutorialPython Common Pitfalls
Python Overloading TutorialPython Hidden Features
Python Polymorphism TutorialPython For Machine Learning
Python Method Overriding TutorialPython Interview Questions And Answers For Experienced
Python User Defined Methods TutorialPython Coding Interview Questions And Answers
Python Programming Tutorials With Examples

Other Python Tutorials

Leave a Comment