Playwright + DeepSeek in action: How to make AI “understand” website page content? Automatically locate page elements?

Written by
Iris Vance
Updated on:July-08th-2025
Recommendation

In-depth analysis of how AI can "understand" dynamic website page content and the key steps to achieve automated testing.

Core content:
1. The difficulty of AI understanding dynamic website content in automated testing
2. DeepSeek's "understanding" boundary and AI capability limitations
3. The bridge from URL to page content: three solutions

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)


"How do I make DeepSeek understand the content of a website page?" This question comes from a message from a WeChat reader, which reflects a major pain point of AI in automated testing. In order to give a practical answer, I will take saucedemo.com as an example and share multiple implementation solutions in Python. If you are interested in AI-driven testing, please continue reading - I hope this article can inspire you! If you find it useful, don't forget to like, follow and bookmark it!

1. Introduction

A reader asked in a message on the official account:

"How do I make DeepSeek understand website pages? I can't just enter a URL and expect it to understand the content, and each module needs to be clicked to load the page. Or was there an introduction to this content before?"

This question goes to the heart of the matter:DeepSeekSuchAIHow does the model “understand” the content of a dynamic website? A website is not like static text.URLCan't letAIGet page information directly, especially those modules that require interaction (such as clicking a button) to load. Many test engineers may also have similar confusions:AICanURLLeap to understanding page logic and even generating automation scripts?

In previous articles

Playwright + DeepSeek in action: Teach you how to use AI to achieve xmind use case generation and automated testing (taking e-commerce as an example)

I have described the website features manually.DeepSeekGenerate test cases andPlaywrightThis time, we will go a step further and explore the reader's questions.DeepSeekA feasible method to understand the website content andsaucedemo.comProvide practical examplesPythonDetailed analysis of the implementation.

2. Problem Analysis:DeepSeekWhere is the limit of "understanding"?

Let’s first break down the problem and identify the challenge:

2.1 Dynamic Page Challenge

saucedemo.comIt is an e-commerce website that includes login, product list, shopping cart and checkout functions. Some content (such as product details or shopping cart list) will not load until the user clicks on it.DeepSeekoneURL, it cannot directly access or parse these dynamic contents.

2.2 AICapacity Limitations

CurrentDeepSeekLarge language models mainly rely on text input to generate output, and do not have built-in browser or crawler functions. It cannot open web pages, click buttons, and observe page changes like humans.

2.3 Potential Demand for Test Engineers

Test engineers may wish toDeepSeekIt can "automatically" understand the website content and generate test cases or scripts instead of relying on manual input. So, is this feasible? If not, what are the alternatives? Based on these analyses, let
DeepSeekTry to understandsaucedemo.omThe goal of this article is to explore possible methods to extract page content and generate useful automated test code.

Solution: Bridge from URL to page content

now thatDeepSeekWithout direct access to the website, we need to provide it with "eyes" and "language". Here are three solutions, ranging from practical to innovative:

3.1 Method 1: Manually describe page functions (practical solution)

This is the most direct and efficient method. By manually providing a description of the website's structure and functions,DeepSeekTest cases and scripts can be generated based on text.

Practical Example: Analysissaucedemo.com

Suppose we want to testsaucedemo.comThe shopping cart functionality can be described as follows:

  • • Demonstration of prompt words
Website URL: https://www.saucedemo.com
Functional description:
1. Login page: Enter the username "standard_user" and password "secret_sauce", and click the "Login" button to enter the product page.
2. Product page: Display multiple products. Each product has an "Add to cart" button. After clicking, the shopping cart icon will display the quantity.
3. Shopping cart page: Click the shopping cart icon in the upper right corner to enter, which will display the added items and include a "Checkout" button.
4. Checkout page: Fill in your name, zip code and other information, and click "Continue" to submit.
Interaction logic:
- After clicking "Add to cart" on the product page, the button changes to "Remove" and the shopping cart quantity increases by 1.
- If the postal code is empty on the checkout page, an error message "Error: Postal Code is required" will be displayed.
Output requirements:
- Write Python Playwright automated testing scripts for the shopping cart feature of saucedemo.com
- Requires the use of POM mode
- Covers normal product addition and empty zip code exception scenarios
  • • DeepSeekGenerate code
# pages/login_page.py
from playwright.sync_api import Page

class LoginPage:
    def __init__(self, page: Page):
        self.page = page
        self.username = page.locator("#user-name")
        self.password = page.locator("#password")
        self.login_btn = page.locator("#login-button")

    def login(self, username: str, password: str):
        self.username.fill(username)
        self.password.fill(password)
        self.login_btn.click()

# pages/product_page.py
class ProductPage:
    def __init__(self, page: Page):
        self.page = page
        self.add_to_cart_btn = page.locator("button:has-text('Add to cart')").first
        self.cart_icon = page.locator(".shopping_cart_badge")

    def add_to_cart(self):
        self.add_to_cart_btn.click()

    def get_cart_count(self) -> int:
        return int(self.cart_icon.inner_text()) if self.cart_icon.is_visible() else 0

# pages/checkout_page.py
class CheckoutPage:
    def __init__(self, page: Page):
        self.page = page
        self.postal_code = page.locator("#postal-code")
        self.continue_btn = page.locator("text=Continue")
        self.error_msg = page.locator(".error-message-container")

    def fill_shipping_info(self, postal_code: str = ""):
        self.postal_code.fill(postal_code)
        self.continue_btn.click()

    def get_error_message(self) -> str:
        return self.error_msg.inner_text()

# tests/test_cart.py
import pytest
from playwright.sync_api import expect
from pages.login_page import LoginPage
from pages.product_page import ProductPage
from pages.checkout_page import CheckoutPage

def test_add_to_cart(page):
    login_page = LoginPage(page)
    product_page = ProductPage(page)
    
    page.goto("https://www.saucedemo.com")
    login_page.login("standard_user", "secret_sauce")
    product_page.add_to_cart()
    assert product_page.get_cart_count() == 1

def test_empty_postal_code(page):
    login_page = LoginPage(page)
    checkout_page = CheckoutPage(page)
    
    page.goto("https://www.saucedemo.com")
    login_page.login("standard_user", "secret_sauce")
    page.goto("https://www.saucedemo.com/checkout-step-one.html")
    checkout_page.fill_shipping_info("")
    assert "Postal Code is required" in checkout_page.get_error_message()

The comments are as follows:

  • • Advantages: Simple and efficient, the generated code precisely meets the requirements.
  • • Limitations: Page information needs to be manually refined, and the description workload increases when the website is complex.

3.2 Capture HTML source code inputDeepSeek(Advanced plan)

Practical example: UsePlaywrightGetsaucedemo.comSource code of the product page

If you want to reduce manual description, you can use tools to grab the HTML source code of the page.DeepSeekAnalysis. For example:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://www.saucedemo.com/inventory.html")
    html_content = page.content()
    print(html_content)
    browser.close()
  • • Demonstration of prompt words
    HTMLenterDeepSeek:
Please analyze the following HTML content, extract the key elements of the saucedemo.com product page (such as product list, add button), and generate a Playwright automation script.
  • • DeepSeekGenerate code
# pages/product_page.py
class ProductPage:
    def __init__(self, page: Page):
        self.page = page
        self.add_to_cart_btn = page.locator("[data-test^='add-to-cart-']")
        self.cart_icon = page.locator(".shopping_cart_badge")

    def add_to_cart(self):
        self.add_to_cart_btn.first.click()
  • • Dynamic content processing For content that needs to be clicked to load, you can simulate the interaction before crawling. For example:
page.locator("button:has-text('Add to cart')").first.click()
new_html = page.content()

Willnew_htmlenterDeepSeek, let it analyze the changes after the interaction.

The comments are as follows:

  • • Advantages: Reduces manual description and can handle some dynamic content.
  • • Limitations:HTMLIt may be lengthy and contain irrelevant information, affecting the analysis effect; dynamic interaction still requires manual design of crawling logic.

3.3 Recording user operation flow and combiningAIOptimization (Innovation Solutions)

Practical example:PlaywrightResponsible for recording the script,AIResponsible for code optimization and expansion

To further improve efficiency, we can usePlaywrightThe script recording function allowsAIOptimize and expand based on recorded scripts.

  • • The idea is as follows:
  1. 1. Recording operation: Runplaywright codegen https://www.saucedemo.com, manually perform operations (such as logging in, adding products), generatePlaywrightscript.
  2. 2. Feed AI: Provide the script toDeepSeek, suggesting optimizing code structure, improving element positioning, or expanding test cases.
  3. 3. AIOptimization and expansion:DeepSeekGenerate modular code and supplement exception scenario testing.
•  Assuming the login operation is recorded, the following script is generated
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://www.saucedemo.com/")
    page.fill("#user-name", "standard_user")
    page.fill("#password", "secret_sauce")
    page.click("#login-button")
    browser.close()
  • • Demonstration of prompt words
Please optimize the following Playwright script to make it more modular and extend an abnormal test case (such as login failure scenario).
  • • DeepSeekreply
# pages/login_page.py
class LoginPage:
    def __init__(self, page):
        self.page = page
        self.username = page.locator("#user-name")
        self.password = page.locator("#password")
        self.login_btn = page.locator("#login-button")
        self.error_msg = page.locator(".error-message-container")

    def login(self, username, password):
        self.username.fill(username)
        self.password.fill(password)
        self.login_btn.click()

    def get_error_message(self):
        return self.error_msg.inner_text()

# tests/test_login.py
from playwright.sync_api import expect

def test_successful_login(page):
    login_page = LoginPage(page)
    page.goto("https://www.saucedemo.com/")
    login_page.login("standard_user", "secret_sauce")
    expect(page).to_have_url("https://www.saucedemo.com/inventory.html")

def test_failed_login(page):
    login_page = LoginPage(page)
    page.goto("https://www.saucedemo.com/")
    login_page.login("invalid_user", "wrong_password")
    expect(login_page.get_error_message()).to_contain("Username and password do not match")

The comments are as follows:

  1. 1. High efficiency:playwright codegenRecording is simple and efficient.AIProviding concrete input can significantly improve code quality.
  2. 2. Value points: Reduce the workload of manual script writing, AI optimization improves code maintainability, and abnormal use case expansion enhances test coverage.
  3. 3. Limitations: The recorded script may contain redundant operations.AIHuman context may be required.

IV. Conclusion In summary:

  • • Method 1 (manual description): the most practical and efficient method at present, which can fully demonstrate theDeepSeeklanguage skills to ensure accurate results.
  • • Method 2 (CrawlingHTML): Suitable for scenarios where you want to automatically extract page information, but it requires tool support and the effect is affected by the quality of HTML.
  • • Method 3 (recording operation flow +AIOptimization): An innovative solution that is suitable for simple operation flows and can quickly generate high-quality code.

bysaucedemo.comFor example, the recommended process is:

  • • Simple scenario: Direct recording operation toAIoptimization.
  • • Complex scenarios: After recording the script, manually supplement the requirement description to assistAIGenerate perfect code.
  • • Manual description: For scenes that cannot be recorded, the manual description function is used. Although this method requires a certain amount of human participation, it can balance efficiency and accuracy.
    AIWith technological advancement, we may be able to directlyDeepSeekoneURL, let it "understand" the page by itself.