title: Browser
emoji: 🦀
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
Browser API
This document describes how to use the Browser API to search the web and scrape website content. The API is built with Gradio and Playwright, providing a simple interface for web automation tasks.
API Endpoint
The primary endpoint for this API is /api/web_browse. This is a POST endpoint that accepts a JSON payload.
Authentication
This API is public and does not require authentication.
Actions
The API can perform two main actions: Search and Scrape URL.
Search
The Search action allows you to perform a web search using a specified search engine. The API will return the content of the search results page in Markdown format.
Scrape URL
The Scrape URL action allows you to retrieve the content of a specific URL. The API will fetch the page, process the HTML, and return the main content in a clean, readable Markdown format.
Request Body
The request body must be a JSON object with the following structure:
{
  "action": "Search" | "Scrape URL",
  "query": "string",
  "browser_name": "firefox" | "chromium" | "webkit",
  "search_engine_name": "string"
}
Parameters:
- action (string, required): The action to perform. Must be either "Search" or "Scrape URL".
- query (string, required): The search query or the URL to scrape.
- browser_name (string, optional): The browser to use for the operation. Defaults to "firefox". Available options: "firefox", "chromium", "webkit".
- search_engine_name (string, optional): The search engine to use when the action is "Search". Defaults to "DuckDuckGo". A full list of supported search engines can be found in the "Supported Search Engines" section.
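The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: `build_payload`, `ACTIONS`, and `BROWSERS` are hypothetical names introduced for illustration, and the allowed values are copied from the parameter list.

```python
import json

# Allowed values, copied from the parameter list in this document.
ACTIONS = ("Search", "Scrape URL")
BROWSERS = ("firefox", "chromium", "webkit")


def build_payload(action, query, browser_name="firefox", search_engine_name=None):
    """Return a request-body dict, raising ValueError on invalid input.

    Hypothetical client-side helper; the server may validate differently.
    """
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}, got {action!r}")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    if browser_name not in BROWSERS:
        raise ValueError(f"browser_name must be one of {BROWSERS}, got {browser_name!r}")

    payload = {"action": action, "query": query, "browser_name": browser_name}
    # search_engine_name only applies to "Search". When it is omitted,
    # the API is documented to fall back to "DuckDuckGo".
    if action == "Search" and search_engine_name is not None:
        payload["search_engine_name"] = search_engine_name
    return payload


print(json.dumps(build_payload("Search", "latest AI research", search_engine_name="Bing"), indent=2))
```

Leaving `search_engine_name` out of the payload, rather than sending a client-side default, keeps the documented server default as the single source of truth.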
Response Body
The API will return a JSON object with the results of the operation.
On Success:
{
  "status": "success",
  "query": "your_query",
  "action": "Search" | "Scrape URL",
  "final_url": "https://example.com",
  "page_title": "Example Domain",
  "http_status": 200,
  "proxy_used": "Direct Connection",
  "markdown_content": "# Example Domain..."
}
On Error:
{
  "status": "error",
  "query": "your_query",
  "proxy_used": "Direct Connection",
  "error_message": "Navigation Timeout: The page for 'your_query' took too long to load."
}
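Because both response shapes carry a "status" field, a client can branch on it. The sketch below assumes the response body arrives as a JSON string in one of the two shapes shown above; `BrowseError` and `parse_response` are hypothetical names, not part of the API.

```python
import json


class BrowseError(Exception):
    """Hypothetical exception raised when the API reports status "error"."""


def parse_response(body):
    """Decode a response body and return the result dict on success.

    Raises BrowseError with the API's error_message otherwise.
    """
    data = json.loads(body)
    if data.get("status") == "success":
        return data
    raise BrowseError(data.get("error_message", "unknown error"))


# Sample bodies modelled on the success and error shapes above.
success_body = '{"status": "success", "page_title": "Example Domain", "http_status": 200}'
error_body = '{"status": "error", "error_message": "Navigation Timeout"}'

print(parse_response(success_body)["page_title"])
try:
    parse_response(error_body)
except BrowseError as exc:
    print("request failed:", exc)
```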
Examples
Here are some examples of how to use the API with curl.
Example 1: Performing a Search
This example performs a search for "latest AI research" using Google.
curl -X POST -H "Content-Type: application/json" \
-d '{
"action": "Search",
"query": "latest AI research",
"browser_name": "chromium",
"search_engine_name": "Google"
}' \
https://broadfield-dev-browser.hf.space/api/web_browse
Example 2: Scraping a URL
This example scrapes the content from the Wikipedia page for "Web scraping".
curl -X POST -H "Content-Type: application/json" \
-d '{
"action": "Scrape URL",
"query": "https://en.wikipedia.org/wiki/Web_scraping",
"browser_name": "firefox"
}' \
https://broadfield-dev-browser.hf.space/api/web_browse
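The same requests can be made from Python. This is a rough equivalent of the curl examples using only the standard library; `make_request` and `web_browse` are hypothetical helper names, and the 120-second timeout is an assumed value, not one documented by the API.

```python
import json
import urllib.request

# Endpoint URL taken from the curl examples above.
API_URL = "https://broadfield-dev-browser.hf.space/api/web_browse"


def make_request(payload, url=API_URL):
    """Build a POST request carrying the payload as a JSON body."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def web_browse(payload, timeout=120):
    """Send the request and return the decoded JSON response.

    The timeout is an assumption; page loads in a headless browser can be slow.
    urlopen raises urllib.error.HTTPError for non-2xx HTTP responses.
    """
    with urllib.request.urlopen(make_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `web_browse({"action": "Scrape URL", "query": "https://en.wikipedia.org/wiki/Web_scraping", "browser_name": "firefox"})` would mirror Example 2 and, if the call succeeds, return a dict in the shape described under Response Body.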
Supported Search Engines
The following search engines are supported when using the "Search" action:
- DuckDuckGo
- Bing
- Brave
- Ecosia
- Yahoo
- Startpage
- Qwant
- Swisscows
- You.com
- SearXNG
- MetaGer
- Yandex
- Baidu
- Perplexity