How to Get the HTML Source of a WebElement in Selenium WebDriver Using Python

21 May 2024 | Stephan Petzl | Tech-Help

If you work with Selenium WebDriver in Python, you may need to extract the HTML source of a specific WebElement. Getting the full page source is straightforward with the driver.page_source property, but getting the HTML of an individual element requires a different approach. This guide walks through how to do it.

Using the `get_attribute` Method

The simplest way to obtain the HTML source of a WebElement is the get_attribute method. It can retrieve either the element's innerHTML (the markup inside the element) or its outerHTML (the same markup plus the element's own opening and closing tags).

Example Code:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Initialize the WebDriver
driver = webdriver.Firefox()

# Navigate to the desired webpage
driver.get('http://example.com')

# Locate the element by CSS selector
# (the find_element_by_* helpers were removed in Selenium 4.3)
element = driver.find_element(By.CSS_SELECTOR, '#my-id')

# Retrieve the innerHTML of the element
inner_html = element.get_attribute('innerHTML')

# Retrieve the outerHTML of the element
outer_html = element.get_attribute('outerHTML')

print("Inner HTML:", inner_html)
print("Outer HTML:", outer_html)

# Close the WebDriver
driver.quit()
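
The difference between the two properties is easy to lose track of, so a small wrapper can make the choice explicit. The sketch below is hypothetical: `element_html` is not part of Selenium, and it only assumes that the object passed in has a `get_attribute` method, as a Selenium WebElement does.

```python
def element_html(element, include_self=True):
    """Return the HTML of an element as a string.

    `element` is assumed to expose get_attribute(name), as a Selenium
    WebElement does. This helper is illustrative, not part of Selenium.
    """
    # outerHTML includes the element's own tag; innerHTML only its contents.
    name = "outerHTML" if include_self else "innerHTML"
    html = element.get_attribute(name)
    # get_attribute can return None when no value is available;
    # normalize that to an empty string so callers always get a str.
    return html if html is not None else ""
```

With the element located in the example above, `element_html(element)` would return the markup including the element's own tag, and `element_html(element, include_self=False)` only the markup between its tags.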

Handling Dynamic Content

Modern web applications often use JavaScript frameworks like React, Vue.js, or Angular to render content dynamically. In such cases, the element may not yet exist, or may still be empty, when the initial page load finishes, so it is crucial to wait for it before retrieving its HTML.

Using WebDriverWait:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Initialize the WebDriver
driver = webdriver.Firefox()

# Navigate to the desired webpage
driver.get('http://example.com')

# Wait up to 20 seconds for the element to be visible
element = WebDriverWait(driver, 20).until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, '#my-id'))
)

# Retrieve the outerHTML of the element
outer_html = element.get_attribute('outerHTML')

print("Outer HTML:", outer_html)

# Close the WebDriver
driver.quit()
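
Visibility alone does not guarantee that a framework has finished filling an element in, since a container can be visible while still empty. One possible refinement, sketched below, is a custom wait condition that succeeds only once the element's innerHTML is non-blank. The name `inner_html_present` is hypothetical; the sketch relies on `WebDriverWait.until` accepting any callable that takes the driver, and on `find_element` accepting a `(by, value)` locator pair.

```python
def inner_html_present(locator):
    """Build a wait condition that passes once the located element has content.

    `locator` is a (by, value) pair such as (By.CSS_SELECTOR, '#my-id').
    Hypothetical helper, intended to be passed to WebDriverWait(...).until().
    """
    def condition(driver):
        element = driver.find_element(*locator)
        html = element.get_attribute("innerHTML")
        # Return the element (truthy) when it has non-blank content,
        # otherwise False so the wait keeps polling.
        return element if html and html.strip() else False

    return condition
```

It would be used in place of the visibility condition, for example `element = WebDriverWait(driver, 20).until(inner_html_present((By.CSS_SELECTOR, '#my-id')))`. While the element is missing, `find_element` raises `NoSuchElementException`, which `WebDriverWait` ignores by default, so polling continues until the timeout.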

Saving HTML to a File

Sometimes you may need to save the retrieved HTML to a file for further analysis. The snippet below assumes outer_html holds the string retrieved in the previous example:

Example Code:

with open('html_source_code.html', 'w', encoding='utf-8') as file:
    file.write(outer_html)
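
If several snippets are saved during a run, it can help to wrap the write in a small function that creates the target folder as needed. This is a minimal sketch using only the standard library; `save_html` and the `snapshots` folder name are illustrative choices, not part of Selenium.

```python
from pathlib import Path


def save_html(html, filename, directory="snapshots"):
    """Write an HTML string to directory/filename and return the path.

    Illustrative helper: the default folder name is an arbitrary choice.
    """
    target = Path(directory) / filename
    # Create the folder (and any missing parents) if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # UTF-8 keeps non-ASCII characters in the markup intact.
    target.write_text(html, encoding="utf-8")
    return target
```

Calling `save_html(outer_html, 'my-id.html')` would then write the markup to `snapshots/my-id.html` and return that path.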

Conclusion

By using the methods described above, you can efficiently retrieve and handle the HTML source of WebElements in Selenium WebDriver using Python. This can be particularly useful for debugging, data extraction, and automated testing purposes.

For those looking to streamline their automated testing process, consider using Repeato. Repeato is a no-code test automation tool for iOS and Android that simplifies the creation, execution, and maintenance of automated tests. With its intuitive test recorder and computer vision-based approach, Repeato can help you achieve faster and more reliable test results. Additionally, it offers a scripting interface for advanced users to automate complex scenarios.
