Microsoft Playwright MCP Server provides browser automation capabilities for LLMs

Microsoft Playwright MCP server provides a new tool for browser automation for large language models.
Core content:
1. Innovative features and advantages of Playwright MCP server
2. The role of Playwright in modern web application testing
3. Detailed explanation of Playwright's multi-browser support, execution mode and debugging tools
Give large language models real web interactivity with the Playwright-based MCP server. It lets LLMs interact with web pages through structured accessibility snapshots, with no screenshots or vision models required.
Repository
https://github.com/microsoft/playwright-mcp
Homepage
https://www.npmjs.com/package/@playwright/mcp
What is Playwright?
Playwright is an open-source browser automation tool developed by Microsoft that enables testers and developers to automatically interact with web applications across multiple browsers and platforms. Unlike traditional automation tools, Playwright is designed for modern web applications: it supports dynamic content, real-time interactions, and even network monitoring, helping teams test applications faster and more efficiently.
Automated browser testing has become indispensable in modern software development for ensuring that web applications run smoothly across different browsers and environments. If you have used Playwright, you already know how capable it is at automating web interactions. However, when external clients such as AI agents, debugging tools, or automation services need to communicate with the same Playwright instance, the Playwright Model Context Protocol (MCP) server comes into play.
Playwright's core features
Multi-Browser Support
Playwright supports Chromium, Firefox, and WebKit, which covers the engines behind all major browsers. A single test script can be executed on different browsers, reducing redundant work and ensuring a consistent user experience.
Headless and Headed execution modes
Playwright can run in headless mode (without a UI) to speed up test execution, making it ideal for CI/CD pipelines. It also supports headed mode for debugging and interactive testing, allowing developers to visually inspect test runs.
Parallel test execution
One of the biggest advantages of Playwright is its ability to execute multiple tests simultaneously. Parallel execution reduces the overall test run time, making it a good fit for large applications that require frequent and rapid testing.
Advanced debugging tools
Playwright contains built-in tools that significantly simplify the process of debugging failing tests. It provides:
• Trace Viewer: a step-by-step visual representation of test execution
• Video Recording: captures test runs for troubleshooting
• Screenshots: help detect UI inconsistencies
Powerful Web interaction API
Playwright supports a wide range of user interactions, including:
• Clicking buttons, filling out forms, and scrolling
• Capturing network requests and responses
• Handling authentication flows and cookies
• Automatically uploading and downloading files
Playwright MCP Server
Playwright MCP Server is a Playwright-based MCP server that enables testers and developers to automatically interact with web applications across multiple browsers and platforms. It allows Large Language Models (LLMs) to interact with web pages through structured accessibility snapshots, without relying on screenshots or vision-tuned models. It provides the following core features:
• Browser automation for LLMs: Connects LLMs through MCP, allowing AI to directly manipulate web pages. Compatible with large language models such as Claude, GPT-4o, and DeepSeek.
• Web page interactions: Supports common web actions, including clicking buttons, filling out forms, and scrolling pages.
• Web screenshots: Takes screenshots of web pages through the Playwright MCP server to analyze the current UI and content.
• JavaScript execution: Runs JavaScript in the browser environment to enable more complex interactions with web pages.
• Integrated convenience tools: Supports tools such as Smithery and mcp-get to simplify installation and configuration.
It is well suited to automated testing, data scraping, SEO competitor analysis, AI agents, and similar tasks. If you want AI to handle web tasks more intelligently or need an efficient automation tool, try Playwright MCP Server.
Install in Cursor
In Cursor Settings, switch to the MCP tab, click the "Add new global MCP server" button, and enter the following configuration:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp"
      ]
    }
  }
}
If you don't want to enable it globally, add the above configuration to the .cursor/mcp.json file in your project's root directory.
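If the project already has a .cursor/mcp.json with other servers in it, the Playwright entry needs to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the helper name addMcpServer and the "other-server" entry are made up for illustration and are not part of Cursor or Playwright MCP.

```javascript
// Hypothetical helper: merge one server entry into a parsed mcp.json object
// without touching any other servers already configured there.
function addMcpServer(config, name, entry) {
  const existing = config && typeof config === "object" ? config : {};
  return {
    ...existing,
    mcpServers: { ...(existing.mcpServers || {}), [name]: entry },
  };
}

// Example input: a project file that already lists another (made-up) server.
const current = {
  mcpServers: { "other-server": { command: "node", args: ["server.js"] } },
};

const updated = addMcpServer(current, "playwright", {
  command: "npx",
  args: ["@playwright/mcp"],
});

// The result can be written back to .cursor/mcp.json as formatted JSON.
console.log(JSON.stringify(updated, null, 2));
```

The helper returns a new object instead of mutating its input, so the original configuration is still available if the write fails.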
⚠️ Note: The official documentation recommends the npx @playwright/mcp@latest command, but it may cause errors during configuration:
$ npx @playwright/mcp@latest
node:internal/modules/cjs/loader:646
      throw e;
      ^
Error: Cannot find module '/Users/cnych/.npm/_npx/9833c18b2d85bc59/node_modules/yaml/dist/index.js'
    at createEsmNotFoundErr (node:internal/modules/cjs/loader:1285:15)
    at finalizeEsmResolution (node:internal/modules/cjs/loader:1273:15)
    at resolveExports (node:internal/modules/cjs/loader:639:14)
    at Module._findPath (node:internal/modules/cjs/loader:747:31)
    at Module._resolveFilename (node:internal/modules/cjs/loader:1234:27)
    at Module._load (node:internal/modules/cjs/loader:1074:27)
    at TracingChannel.traceSync (node:diagnostics_channel:315:14)
    at wrapModuleLoad (node:internal/modules/cjs/loader:217:24)
    at Module.require (node:internal/modules/cjs/loader:1339:12)
    at require (node:internal/modules/helpers:135:16) {
  code: 'MODULE_NOT_FOUND',
  path: '/Users/cnych/.npm/_npx/9833c18b2d85bc59/node_modules/yaml/package.json'
}
If you see this error, replace npx @playwright/mcp@latest with npx @playwright/mcp.
Once configured, you should see the Playwright MCP server listed as successfully configured in the MCP tab of Cursor Settings.
VS Code Installation
# For VS Code
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp"]}'
# For VS Code Insiders
code-insiders --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp"]}'
Once installed, the Playwright MCP server will be immediately available to the GitHub Copilot agent in VS Code.
Advanced Configuration
Browser options
You can add parameters to args to customize the browser:
• --browser <browser>: The browser to use. Options:
  • Standard browsers: chrome, firefox, webkit, msedge
  • Chrome variants: chrome-beta, chrome-canary, chrome-dev
  • Edge variants: msedge-beta, msedge-canary, msedge-dev
  • Default value: chrome
• --cdp-endpoint <endpoint>: Connect to an existing Chrome DevTools Protocol endpoint
• --executable-path <path>: Specify the path to a custom browser executable
• --headless: Run in headless mode (headed by default)
• --port <port>: Set the SSE transport listening port
• --user-data-dir <path>: Customize the user data directory
• --vision: Enable interactive screenshot mode
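To show how these flags end up in the args array of an MCP client configuration, here is a small sketch. The flag names and browser values come from the list above; the helper buildPlaywrightMcpEntry and its option names (cdpEndpoint, executablePath, and so on) are hypothetical and exist only for this example.

```javascript
// Browser values copied from the option list above.
const KNOWN_BROWSERS = [
  "chrome", "firefox", "webkit", "msedge",
  "chrome-beta", "chrome-canary", "chrome-dev",
  "msedge-beta", "msedge-canary", "msedge-dev",
];

// Hypothetical helper: turn an options object into a config entry for "mcpServers".
function buildPlaywrightMcpEntry(options = {}) {
  const args = ["@playwright/mcp"];
  if (options.browser !== undefined) {
    if (!KNOWN_BROWSERS.includes(options.browser)) {
      throw new Error(`Unknown browser: ${options.browser}`);
    }
    args.push("--browser", options.browser);
  }
  if (options.cdpEndpoint) args.push("--cdp-endpoint", options.cdpEndpoint);
  if (options.executablePath) args.push("--executable-path", options.executablePath);
  if (options.headless) args.push("--headless");
  if (options.port !== undefined) args.push("--port", String(options.port));
  if (options.userDataDir) args.push("--user-data-dir", options.userDataDir);
  if (options.vision) args.push("--vision");
  return { command: "npx", args };
}

// Example: headless Firefox.
const entry = buildPlaywrightMcpEntry({ browser: "firefox", headless: true });
console.log(JSON.stringify({ mcpServers: { playwright: entry } }, null, 2));
```

Rejecting unknown browser names early keeps a typo from surfacing later as a launch failure inside the MCP client.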
User profile management
Playwright MCP creates a dedicated browser profile in the following location:
• Windows: %USERPROFILE%\AppData\Local\ms-playwright\mcp-chrome-profile
• macOS: ~/Library/Caches/ms-playwright/mcp-chrome-profile
• Linux: ~/.cache/ms-playwright/mcp-chrome-profile
Deleting these directories between sessions will clear the browsing state.
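A cleanup script needs to know which of these paths applies on the current machine. The sketch below maps a platform name to the locations listed above; the function names are hypothetical, and the home directories in the example are placeholders.

```javascript
const path = require("node:path");
const fs = require("node:fs");

// Map a Node.js platform string to the profile locations listed above.
function mcpProfileDir(platform, homeDir) {
  switch (platform) {
    case "win32":
      return path.win32.join(homeDir, "AppData", "Local", "ms-playwright", "mcp-chrome-profile");
    case "darwin":
      return path.posix.join(homeDir, "Library", "Caches", "ms-playwright", "mcp-chrome-profile");
    case "linux":
      return path.posix.join(homeDir, ".cache", "ms-playwright", "mcp-chrome-profile");
    default:
      throw new Error(`No known profile location for platform: ${platform}`);
  }
}

// Remove the profile directory to clear browsing state (defined here, not called).
function clearMcpProfile(profileDir) {
  fs.rmSync(profileDir, { recursive: true, force: true });
}

// Placeholder home directories, for illustration only.
console.log(mcpProfileDir("linux", "/home/alice"));
console.log(mcpProfileDir("darwin", "/Users/alice"));
console.log(mcpProfileDir("win32", "C:\\Users\\alice"));
```

In a real script you would pass process.platform and os.homedir() instead of the placeholders, and call clearMcpProfile only while no browser is using the profile.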
Operation Modes
Headless Operation (Recommended for Automation)
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--headless"
      ]
    }
  }
}
Headed Operation on Headless Systems
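On Linux systems without a display, or when the server would otherwise be launched from an IDE worker process, you can still run a headed browser by starting the server yourself with the SSE transport. First, start the server with the following command: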
npx @playwright/mcp --port 8931
Then configure the MCP client:
{
  "mcpServers": {
    "playwright": {
      "url": "http://localhost:8931/sse"
    }
  }
}
Interactive Mode
Once the server is running and connected to the agent, the agent can call specific tools provided by MCP to control the browser. The tools available depend on whether the server is running in snapshot mode or vision mode.
Snapshot mode (recommended)
This is the default mode, using the accessibility snapshot for best performance and reliability. The provided MCP tools work primarily with the accessibility tree. A typical workflow involves:
1. Use browser_snapshot to get the current state of the accessibility tree.
2. The agent analyzes the snapshot (structured text/JSON) to understand the page content and identify the target elements. Each interactive element in the snapshot usually has a unique ref (reference identifier).
3. The agent calls interactive tools such as browser_click or browser_type, providing the ref of the target element.
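The three steps above can be sketched with a stand-in client. Everything here except the tool names and the element/ref parameters is an assumption: createFakeClient is not a real MCP client, and the snapshot shape (an array of objects with ref, role, and name) is invented to keep the example short; real snapshots are structured text.

```javascript
// Stand-in for a real MCP client: records calls and returns a canned snapshot.
function createFakeClient() {
  const calls = [];
  return {
    calls,
    async callTool(name, args = {}) {
      calls.push({ name, args });
      if (name === "browser_snapshot") {
        return [
          { ref: "e1", role: "textbox", name: "Search" },
          { ref: "e2", role: "button", name: "Submit" },
        ];
      }
      return { ok: true };
    },
  };
}

// Step 1: take a snapshot. Step 2: find the target by role and name.
// Step 3: call browser_click with the ref taken from the snapshot.
async function clickByName(client, role, name) {
  const snapshot = await client.callTool("browser_snapshot");
  const target = snapshot.find((node) => node.role === role && node.name === name);
  if (!target) throw new Error(`No ${role} named "${name}" in snapshot`);
  await client.callTool("browser_click", {
    element: `${name} ${role}`,
    ref: target.ref,
  });
  return target.ref;
}

const client = createFakeClient();
const clicked = clickByName(client, "button", "Submit");
clicked.then((ref) => console.log("clicked ref:", ref));
```

Because the ref comes from the snapshot that was just taken, the agent should take a fresh snapshot after any action that changes the page before reusing refs.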
Playwright MCP provides a set of tools for browser automation. Here are all the available tools:
• browser_navigate: Navigate to a URL
  • url (string): the URL to navigate to
• browser_go_back: Return to the previous page
  • Parameters: none
• browser_go_forward: Go forward to the next page
  • Parameters: none
• browser_click: Click an element
  • element (string): description of the element to click
  • ref (string): exact target element reference in the page snapshot
• browser_hover: Hover the mouse over an element
  • element (string): description of the element to hover over
  • ref (string): exact target element reference in the page snapshot
• browser_drag: Drag and drop an element
  • startElement (string): description of the element to drag
  • startRef (string): exact source element reference in the page snapshot
  • endElement (string): description of the element to drop onto
  • endRef (string): exact target element reference in the page snapshot
• browser_type: Type text (optionally submitting it)
  • element (string): description of the element to type into
  • ref (string): exact target element reference in the page snapshot
  • text (string): the text to type
  • submit (boolean): whether to submit the text by pressing Enter afterwards
• browser_select_option: Select an option in a drop-down
  • element (string): description of the drop-down element
  • ref (string): exact target element reference in the page snapshot
  • values (array): the option values to select
• browser_choose_file: Choose files to upload
  • paths (array): absolute paths of the files to upload; can be a single file or multiple files
• browser_press_key: Press a key on the keyboard
  • key (string): the name or character of the key to press, such as ArrowLeft
• browser_snapshot: Capture an accessibility snapshot of the current page (better than a screenshot)
  • Parameters: none
• browser_save_as_pdf: Save the page as a PDF
  • Parameters: none
• browser_take_screenshot: Capture a screenshot of the page
  • Parameters: none
• browser_wait: Wait for a specified time
  • time (number): time to wait, in seconds (up to 10)
• browser_close: Close the page
  • Parameters: none
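An agent framework can check a tool call against this list before sending it. The sketch below encodes a subset of the tools above as a lookup table; validateToolCall is a hypothetical helper, and treating every listed parameter as required is an assumption, since the list does not say which ones are optional.

```javascript
// Parameter types copied from the snapshot-mode tool list above (subset).
const SNAPSHOT_TOOLS = {
  browser_navigate: { url: "string" },
  browser_click: { element: "string", ref: "string" },
  browser_type: { element: "string", ref: "string", text: "string", submit: "boolean" },
  browser_select_option: { element: "string", ref: "string", values: "array" },
  browser_press_key: { key: "string" },
  browser_wait: { time: "number" },
  browser_snapshot: {},
};

// typeof reports arrays as "object", so handle them separately.
function typeOf(value) {
  return Array.isArray(value) ? "array" : typeof value;
}

// Returns a list of problems; an empty list means the call looks well-formed.
function validateToolCall(name, args = {}) {
  if (!Object.prototype.hasOwnProperty.call(SNAPSHOT_TOOLS, name)) {
    return [`unknown tool: ${name}`];
  }
  const errors = [];
  for (const [param, expected] of Object.entries(SNAPSHOT_TOOLS[name])) {
    if (!(param in args)) {
      errors.push(`missing parameter: ${param}`);
    } else if (typeOf(args[param]) !== expected) {
      errors.push(`${param} should be of type ${expected}`);
    }
  }
  // The list above caps browser_wait at 10 seconds.
  if (name === "browser_wait" && typeof args.time === "number" && args.time > 10) {
    errors.push("time must be at most 10 seconds");
  }
  return errors;
}

console.log(validateToolCall("browser_click", { element: "Submit button", ref: "e2" }));
console.log(validateToolCall("browser_click", { element: "Submit button" }));
console.log(validateToolCall("browser_wait", { time: 30 }));
```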
Visual Mode
For screenshot-based visual interaction, enable vision mode with the following configuration:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp",
        "--vision"
      ]
    }
  }
}
Vision mode provides MCP tools that rely on coordinates taken from the screenshot. A typical workflow includes:
1. Use browser_screenshot to capture the current view.
2. The agent (which may need vision capabilities) analyzes the screenshot to identify the target location (X, Y coordinates).
3. The agent calls browser_click, browser_type, or other interactive tools with the determined coordinates.
Vision Mode provides a set of tools for screenshot-based visual interaction. Here are all the available tools:
• browser_navigate: Navigate to a URL
  • url (string): the URL to navigate to
• browser_go_back: Return to the previous page
  • Parameters: none
• browser_go_forward: Go forward to the next page
  • Parameters: none
• browser_screenshot: Capture a screenshot of the page
  • Parameters: none
• browser_move_mouse: Move the mouse to the specified coordinates
  • x (number): X coordinate
  • y (number): Y coordinate
• browser_click: Click at the specified coordinates
  • x (number): X coordinate
  • y (number): Y coordinate
• browser_drag: Drag and drop between two points
  • startX (number): starting X coordinate
  • startY (number): starting Y coordinate
  • endX (number): ending X coordinate
  • endY (number): ending Y coordinate
• browser_type: Type text (optionally submitting it)
  • x (number): X coordinate
  • y (number): Y coordinate
  • text (string): the text to type
  • submit (boolean): whether to submit the text by pressing Enter afterwards
• browser_press_key: Press a key on the keyboard
  • key (string): the name or character of the key to press, such as ArrowLeft
• browser_choose_file: Choose files to upload
  • paths (array): absolute paths of the files to upload; can be a single file or multiple files
• browser_save_as_pdf: Save the page as a PDF
  • Parameters: none
• browser_wait: Wait for a specified time
  • time (number): time to wait, in seconds (up to 10)
• browser_close: Close the page
  • Parameters: none
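On the wire, an MCP client invokes a tool with a JSON-RPC 2.0 request whose method is tools/call. The sketch below builds such requests for a screenshot followed by a coordinate click. MCP client libraries normally construct these messages for you, so this is only to show their shape; the bounding box and the centerOf helper are hypothetical stand-ins for whatever a vision model reports.

```javascript
let nextId = 1;

// Build a JSON-RPC 2.0 request that asks the server to run one tool.
function toolCallRequest(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical: a vision model located the target as a box in screenshot pixels.
function centerOf(box) {
  return {
    x: Math.round(box.left + box.width / 2),
    y: Math.round(box.top + box.height / 2),
  };
}

const loginButton = { left: 100, top: 240, width: 80, height: 30 };

const requests = [
  toolCallRequest("browser_screenshot", {}),
  toolCallRequest("browser_click", centerOf(loginButton)),
];

console.log(JSON.stringify(requests, null, 2));
```

Clicking the center of the detected box rather than a corner leaves some margin for small errors in the model's coordinates.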
Custom settings
In addition to configuration files and automatic startup via your IDE, Playwright MCP can be integrated directly into your Node.js application. This provides more control over server settings and communication transport.
import { createServer } from "@playwright/mcp";
// Import the transport classes you need, e.g. from '@playwright/mcp/lib/sseServerTransport',
// or implement your own transport mechanism.

async function runMyMCPServer() {
  // Create the MCP server instance
  const server = createServer({
    // You can pass Playwright launch options here
    launchOptions: {
      headless: true,
      // other Playwright options...
    },
    // You might specify other server options if available
  });

  // Example using SSE transport (requires appropriate setup like an HTTP server).
  // This part is conceptual and depends on your specific server framework (e.g., Express, Node http).
  // The import path and server.disconnect() below are unverified; check them against the package.
  /*
  const http = require('http');
  const { SSEServerTransport } = require('@playwright/mcp/lib/sseServerTransport'); // Adjust path as needed

  const httpServer = http.createServer((req, res) => {
    if (req.url === '/messages' && req.method === 'GET') {
      res.writeHead(200, {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
      });
      const transport = new SSEServerTransport('/messages', res); // Pass the response object
      server.connect(transport); // Connect the MCP server to this transport

      req.on('close', () => {
        // Handle client disconnect if necessary
        server.disconnect(transport);
      });
    } else {
      res.writeHead(404);
      res.end();
    }
  });

  httpServer.listen(8931, () => {
    console.log('MCP Server with SSE transport listening on port 8931');
  });
  */

  // For simpler non-web transport, you might use other mechanisms
  // server.connect(yourCustomTransport);

  console.log("Playwright MCP server started programmatically.");

  // Keep the server running, handle connections, etc.
  // Add cleanup logic for server shutdown.
}
This custom approach allows for fine-grained control, customization of the transport layer (beyond the default mechanism or SSE), and embedding MCP functionality directly into a larger application or proxy framework.
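As a rough illustration of what a custom transport might look like, here is an in-memory loopback pair. It assumes the start/send/close methods and onmessage/onclose callbacks used by the Transport interface in the MCP TypeScript SDK; whether @playwright/mcp accepts such an object directly has not been verified here, and LoopbackTransport is a made-up name.

```javascript
// In-memory transport pair: whatever one side sends, the other side receives.
// The method and callback names follow the Transport shape of the MCP
// TypeScript SDK (an assumption for this sketch).
class LoopbackTransport {
  constructor() {
    this.peer = null;
    this.closed = false;
    this.onmessage = undefined;
    this.onclose = undefined;
  }

  static createPair() {
    const a = new LoopbackTransport();
    const b = new LoopbackTransport();
    a.peer = b;
    b.peer = a;
    return [a, b];
  }

  async start() {
    // Nothing to set up for an in-memory link.
  }

  async send(message) {
    if (this.closed) throw new Error("transport is closed");
    const peer = this.peer;
    // Deliver on a microtask so send() never re-enters the caller.
    queueMicrotask(() => {
      if (!peer.closed && peer.onmessage) peer.onmessage(message);
    });
  }

  async close() {
    if (this.closed) return;
    this.closed = true;
    if (this.onclose) this.onclose();
    await this.peer.close(); // Closing one side closes the other.
  }
}

const [clientSide, serverSide] = LoopbackTransport.createPair();
const received = [];
serverSide.onmessage = (message) => received.push(message);

const done = clientSide
  .send({ jsonrpc: "2.0", id: 1, method: "tools/list" })
  .then(() => clientSide.close());

done.then(() => console.log("server side received:", received));
```

A pair like this is mainly useful in tests, where an agent and a server run in the same process and no HTTP server or child process is wanted.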
Best Practices
1. Prefer snapshot mode in most cases: it is faster and more reliable
2. Use vision mode only when it is absolutely necessary
3. Clear user profiles between sensitive sessions
4. Use headless mode for automated workflows
5. Combine LLMs' natural language capabilities with these tools for powerful automation
Conclusion
Microsoft Playwright MCP provides a powerful and efficient way for AI agents to interact with the web. By leveraging the browser's accessibility tree in the default snapshot mode, it provides a fast, reliable, and text-friendly approach to browser automation, ideal for common tasks such as navigation, data extraction, and form filling. The optional visual mode serves as a fallback for scenarios that require coordinate-based interaction with visual elements.
With simple installation via npx, integration into MCP clients like Cursor, and flexible configuration options (including headless operation and custom transports), Playwright MCP is a versatile tool for developers building web-aware AI agents. By understanding its core concepts and available tools, you can enable your applications and agents to navigate and interact with the web effectively.