---
title: Browser
emoji: 🦀
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
---

# Browser API

This document describes how to use the Browser API to search the web and scrape website content. The API is built with Gradio and Playwright, providing a simple interface for web automation tasks.

## API Endpoint

The primary endpoint for this API is `/api/web_browse`. This is a `POST` endpoint that accepts a JSON payload.

## Authentication

This API is public and does not require authentication.

## Actions

The API can perform two main actions: `Search` and `Scrape URL`.

### Search

The `Search` action runs a web search on the specified search engine and returns the content of the search results page as Markdown.

### Scrape URL

The `Scrape URL` action retrieves the content of a specific URL. The API fetches the page, processes the HTML, and returns the main content as clean, readable Markdown.

## Request Body

The request body must be a JSON object with the following structure. The `|` notation lists the allowed values for a field and is not literal JSON:

```json
{
  "action": "Search" | "Scrape URL",
  "query": "string",
  "browser_name": "firefox" | "chromium" | "webkit",
  "search_engine_name": "string"
}
```

**Parameters:**

*   `action` (string, required): The action to perform. Must be either `"Search"` or `"Scrape URL"`.
*   `query` (string, required): The search query or the URL to scrape.
*   `browser_name` (string, optional): The browser to use for the operation. Defaults to `"firefox"`.
    *   Available options: `"firefox"`, `"chromium"`, `"webkit"`.
*   `search_engine_name` (string, optional): The search engine to use when the action is `"Search"`. Defaults to `"DuckDuckGo"`.
    *   A full list of supported search engines can be found in the "Supported Search Engines" section.
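
The parameter rules above can be sketched as a small Python helper that assembles the request body. This is only an illustration: the name `build_payload` is hypothetical and not part of the API, the allowed values and defaults are copied from the list above, and whether the server itself rejects other values is not documented here.

```python
import json

# Allowed values copied from the parameter list; the helper itself is hypothetical.
ACTIONS = ("Search", "Scrape URL")
BROWSERS = ("firefox", "chromium", "webkit")


def build_payload(action, query, browser_name="firefox", search_engine_name="DuckDuckGo"):
    """Return a dict ready to be JSON-encoded as the request body."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}, got {action!r}")
    if browser_name not in BROWSERS:
        raise ValueError(f"browser_name must be one of {BROWSERS}, got {browser_name!r}")
    if not query:
        raise ValueError("query must be a non-empty string")

    payload = {"action": action, "query": query, "browser_name": browser_name}
    # search_engine_name only applies to the "Search" action, so omit it otherwise.
    if action == "Search":
        payload["search_engine_name"] = search_engine_name
    return payload


if __name__ == "__main__":
    print(json.dumps(build_payload("Search", "latest AI research"), indent=2))
```

Omitting `search_engine_name` for `Scrape URL` is a design choice of this sketch; the parameter is optional, so sending it anyway should be equally valid.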

## Response Body

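The API returns a JSON object describing the result of the operation. The `status` field is `"success"` or `"error"` and determines which of the remaining fields are present.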

**On Success:**

```json
{
  "status": "success",
  "query": "your_query",
  "action": "Search" | "Scrape URL",
  "final_url": "https://example.com",
  "page_title": "Example Domain",
  "http_status": 200,
  "proxy_used": "Direct Connection",
  "markdown_content": "# Example Domain..."
}
```

**On Error:**

```json
{
  "status": "error",
  "query": "your_query",
  "proxy_used": "Direct Connection",
  "error_message": "Navigation Timeout: The page for 'your_query' took too long to load."
}
```
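
A client can branch on `status` to separate the two response shapes. The following is a minimal sketch that operates on an already-decoded response dict; `BrowseError` and `extract_markdown` are hypothetical names introduced for illustration, and only the fields shown in the examples above are assumed to exist.

```python
class BrowseError(Exception):
    """Hypothetical exception raised for responses whose status is not "success"."""


def extract_markdown(response):
    """Return markdown_content from a success response, or raise BrowseError."""
    status = response.get("status")
    if status == "success":
        return response["markdown_content"]
    if status == "error":
        # error_message is the field shown in the error example above.
        raise BrowseError(response.get("error_message", "unknown error"))
    raise BrowseError(f"unexpected status: {status!r}")


if __name__ == "__main__":
    ok = {"status": "success", "markdown_content": "# Example Domain..."}
    print(extract_markdown(ok))
```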

## Examples

Here are some examples of how to use the API with `curl`.

### Example 1: Performing a Search

This example searches for "latest AI research" using Google in a Chromium browser.

```bash
curl -X POST -H "Content-Type: application/json" \
-d '{
  "action": "Search",
  "query": "latest AI research",
  "browser_name": "chromium",
  "search_engine_name": "Google"
}' \
https://broadfield-dev-browser.hf.space/api/web_browse
```

### Example 2: Scraping a URL

This example scrapes the content from the Wikipedia page for "Web scraping".

```bash
curl -X POST -H "Content-Type: application/json" \
-d '{
  "action": "Scrape URL",
  "query": "https://en.wikipedia.org/wiki/Web_scraping",
  "browser_name": "firefox"
}' \
https://broadfield-dev-browser.hf.space/api/web_browse
```
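
The same requests can be issued from Python with only the standard library. The sketch below mirrors the `curl` calls above; the function names and the 120-second timeout are assumptions made for illustration, not values documented by the API.

```python
import json
import urllib.request

# URL taken from the curl examples above.
API_URL = "https://broadfield-dev-browser.hf.space/api/web_browse"


def make_request(payload, url=API_URL):
    """Build (but do not send) a POST request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def web_browse(payload, timeout=120):
    """Send the request and decode the JSON response (performs network I/O)."""
    # timeout=120 is an assumed value; page loads in a real browser can be slow.
    with urllib.request.urlopen(make_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    result = web_browse({
        "action": "Scrape URL",
        "query": "https://en.wikipedia.org/wiki/Web_scraping",
        "browser_name": "firefox",
    })
    print(result.get("status"))
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a caller may want to catch that in addition to checking the `status` field.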

## Supported Search Engines

The following search engines are supported when using the `"Search"` action:

*   Google
*   DuckDuckGo
*   Bing
*   Brave
*   Ecosia
*   Yahoo
*   Startpage
*   Qwant
*   Swisscows
*   You.com
*   SearXNG
*   MetaGer
*   Yandex
*   Baidu
*   Perplexity
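
To catch a misspelled engine name before sending a request, the list above can be kept as a constant on the client side. This is a sketch only: `check_engine` is a hypothetical helper, and it assumes names must match the spellings above exactly, which the API may or may not require.

```python
# Engine names copied verbatim from the list above.
SUPPORTED_ENGINES = frozenset({
    "Google", "DuckDuckGo", "Bing", "Brave", "Ecosia",
    "Yahoo", "Startpage", "Qwant", "Swisscows", "You.com",
    "SearXNG", "MetaGer", "Yandex", "Baidu", "Perplexity",
})


def check_engine(name):
    """Return name unchanged if it is in the supported list, else raise ValueError."""
    if name not in SUPPORTED_ENGINES:
        raise ValueError(
            f"unsupported search engine {name!r}; expected one of {sorted(SUPPORTED_ENGINES)}"
        )
    return name
```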