---
title: Browser
emoji: 🦀
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
---

# Browser API

This document describes how to use the Browser API to search the web and scrape website content. The API is built with Gradio and Playwright, providing a simple interface for web automation tasks.

## API Endpoint

The primary endpoint is `/api/web_browse`, a `POST` endpoint that accepts a JSON payload.

## Authentication

This API is public and does not require authentication.

## Actions

The API can perform two main actions: `Search` and `Scrape URL`.

### Search

The `Search` action runs a web search on the specified search engine and returns the content of the search results page in Markdown format.

### Scrape URL

The `Scrape URL` action retrieves the content of a specific URL. The API fetches the page, processes the HTML, and returns the main content in clean, readable Markdown.

## Request Body

The request body must be a JSON object with the following structure (here `|` separates the allowed values and is not part of the JSON itself):

```json
{
  "action": "Search" | "Scrape URL",
  "query": "string",
  "browser_name": "firefox" | "chromium" | "webkit",
  "search_engine_name": "string"
}
```

**Parameters:**

*   `action` (string, required): The action to perform. Must be either `"Search"` or `"Scrape URL"`.
*   `query` (string, required): The search query or the URL to scrape.
*   `browser_name` (string, optional): The browser to use for the operation. Defaults to `"firefox"`.
    *   Available options: `"firefox"`, `"chromium"`, `"webkit"`.
*   `search_engine_name` (string, optional): The search engine to use when the action is `"Search"`. Defaults to `"DuckDuckGo"`.
    *   A full list of supported search engines can be found in the "Supported Search Engines" section.
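
As a minimal sketch of the parameter rules above, the following Python helper assembles a request body and rejects values outside the documented options. The function name `build_payload` and the choice to omit `search_engine_name` for `"Scrape URL"` requests are illustrative assumptions, not part of the API.

```python
import json

# Allowed values, copied from the parameter list above.
ACTIONS = {"Search", "Scrape URL"}
BROWSERS = {"firefox", "chromium", "webkit"}


def build_payload(action, query, browser_name="firefox",
                  search_engine_name="DuckDuckGo"):
    """Return a request-body dict (hypothetical helper, not part of the API)."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {sorted(ACTIONS)}")
    if browser_name not in BROWSERS:
        raise ValueError(f"browser_name must be one of {sorted(BROWSERS)}")
    if not query:
        raise ValueError("query must be a non-empty string")

    payload = {"action": action, "query": query, "browser_name": browser_name}
    # Assumption: the search engine is only relevant to the Search action,
    # so it is left out of Scrape URL requests.
    if action == "Search":
        payload["search_engine_name"] = search_engine_name
    return payload


# Uses the documented defaults for browser and search engine.
print(json.dumps(build_payload("Search", "latest AI research"), indent=2))
```

Validating on the client side only catches typos early; the server remains the authority on which values it accepts.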

## Response Body

The API returns a JSON object with the results of the operation. The `status` field distinguishes a successful response from an error.

**On Success:**

```json
{
  "status": "success",
  "query": "your_query",
  "action": "Search" | "Scrape URL",
  "final_url": "https://example.com",
  "page_title": "Example Domain",
  "http_status": 200,
  "proxy_used": "Direct Connection",
  "markdown_content": "# Example Domain..."
}
```

**On Error:**

```json
{
  "status": "error",
  "query": "your_query",
  "proxy_used": "Direct Connection",
  "error_message": "Navigation Timeout: The page for 'your_query' took too long to load."
}
```
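
A caller will usually branch on the `status` field shown in the two response shapes above. The sketch below does that for an already-decoded response dict; `BrowseError` and `extract_markdown` are hypothetical names introduced for illustration.

```python
class BrowseError(Exception):
    """Hypothetical exception raised when the response status is not "success"."""


def extract_markdown(response):
    """Return `markdown_content` from a success response, else raise BrowseError."""
    if response.get("status") == "success":
        return response["markdown_content"]
    # Error responses carry `error_message`; fall back to a generic text
    # if the field is missing for any reason.
    raise BrowseError(response.get("error_message", "unknown error"))


# Sample dicts mirroring the documented response shapes.
ok = {"status": "success", "markdown_content": "# Example Domain..."}
failed = {"status": "error", "error_message": "Navigation Timeout"}

print(extract_markdown(ok))
try:
    extract_markdown(failed)
except BrowseError as exc:
    print(f"request failed: {exc}")
```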

## Examples

Here are some examples of how to use the API with `curl`.

### Example 1: Performing a Search

This example searches for "latest AI research" on Google, using the Chromium browser.

```bash
curl -X POST -H "Content-Type: application/json" \
-d '{
  "action": "Search",
  "query": "latest AI research",
  "browser_name": "chromium",
  "search_engine_name": "Google"
}' \
https://broadfield-dev-browser.hf.space/api/web_browse
```

### Example 2: Scraping a URL

This example scrapes the Wikipedia page for "Web scraping", using Firefox.

```bash
curl -X POST -H "Content-Type: application/json" \
-d '{
  "action": "Scrape URL",
  "query": "https://en.wikipedia.org/wiki/Web_scraping",
  "browser_name": "firefox"
}' \
https://broadfield-dev-browser.hf.space/api/web_browse
```
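
The same requests can be issued from Python. This is a rough equivalent of the `curl` calls above using only the standard library; it assumes the endpoint behaves as documented, and the 120-second timeout and the function names `make_request` and `browse` are arbitrary illustrative choices.

```python
import json
import urllib.request

# URL taken from the curl examples above.
API_URL = "https://broadfield-dev-browser.hf.space/api/web_browse"


def make_request(payload):
    """Build a POST request carrying `payload` as a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def browse(payload, timeout=120):
    """Send the request and return the decoded JSON response.

    The timeout is an assumption; page loads in a real browser can be slow.
    """
    with urllib.request.urlopen(make_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For instance, `browse({"action": "Scrape URL", "query": "https://en.wikipedia.org/wiki/Web_scraping", "browser_name": "firefox"})` would mirror Example 2.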

## Supported Search Engines

The following search engines are supported when using the `"Search"` action:

*   Google
*   DuckDuckGo
*   Bing
*   Brave
*   Ecosia
*   Yahoo
*   Startpage
*   Qwant
*   Swisscows
*   You.com
*   SearXNG
*   MetaGer
*   Yandex
*   Baidu
*   Perplexity
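
To catch misspelled engine names before sending a request, a client could check them against the list above. The sketch below maps user input to the spelling used in this document; whether the API itself is case-sensitive is not stated here, so the normalization is an assumption, and `canonical_engine` is a hypothetical helper.

```python
# Names copied from the list above.
SUPPORTED_ENGINES = [
    "Google", "DuckDuckGo", "Bing", "Brave", "Ecosia", "Yahoo", "Startpage",
    "Qwant", "Swisscows", "You.com", "SearXNG", "MetaGer", "Yandex", "Baidu",
    "Perplexity",
]

# Lookup table from lowercase name to the documented spelling.
_BY_LOWER = {name.lower(): name for name in SUPPORTED_ENGINES}


def canonical_engine(name):
    """Return the documented spelling of `name`, or raise ValueError."""
    try:
        return _BY_LOWER[name.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported search engine: {name!r}") from None


print(canonical_engine("duckduckgo"))
```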