Web Scraping
Extract clean, formatted content from web pages. Returns both raw HTML content and formatted markdown.
Overview
The NanoGPT Web Scraping API allows you to extract clean, formatted content from web pages. It uses the Firecrawl service to scrape URLs and returns both raw HTML content and formatted markdown.
Authentication
The API supports two authentication methods:
1. API Key Authentication (Recommended)
Include your API key in the request header:
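A minimal sketch, assuming the key travels in an `x-api-key` header (the exact header name is an assumption; confirm it in your dashboard):

```
x-api-key: YOUR_API_KEY
```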
2. Bearer Token Authentication
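Alternatively, pass the same API key as a standard Bearer token:

```
Authorization: Bearer YOUR_API_KEY
```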
Request Format
Headers
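Assuming JSON requests and either authentication scheme above, a typical header set looks like:

```
Content-Type: application/json
x-api-key: YOUR_API_KEY
```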
Request Body
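Based on the parameters below, a minimal request body looks like this:

```json
{
  "urls": [
    "https://example.com",
    "https://example.org"
  ]
}
```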
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | string[] | Yes | Array of URLs to scrape. Maximum 5 URLs per request. |
URL Requirements
- Must be valid HTTP or HTTPS URLs
- Must use standard web ports (80, 443, or the scheme default)
- Cannot be localhost, private IPs, or metadata endpoints
- YouTube URLs are not supported (use the YouTube transcription endpoint instead)
Response Format
Success Response (200 OK)
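A sketch of the response shape, assembled from the fields documented below (values are illustrative):

```json
{
  "results": [
    {
      "url": "https://example.com",
      "success": true,
      "title": "Example Domain",
      "content": "<html>...</html>",
      "markdown": "# Example Domain\n..."
    },
    {
      "url": "https://example.org",
      "success": false,
      "error": "Failed to scrape URL"
    }
  ],
  "summary": {
    "requested": 2,
    "processed": 2,
    "successful": 1,
    "failed": 1,
    "totalCost": 0.001
  }
}
```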
Response Fields
results
Array of scraping results for each URL:
- `url` (string): The URL that was scraped
- `success` (boolean): Whether the scraping was successful
- `title` (string, optional): Page title if successfully scraped
- `content` (string, optional): Raw HTML content
- `markdown` (string, optional): Formatted markdown version of the content
- `error` (string, optional): Error message if scraping failed
summary
Summary statistics for the request:
- `requested` (number): Number of URLs in the original request
- `processed` (number): Number of valid URLs that were processed
- `successful` (number): Number of URLs successfully scraped
- `failed` (number): Number of URLs that failed to scrape
- `totalCost` (number): Total cost in USD (only for successful scrapes)
Error Responses
- 400 Bad Request: invalid parameters, e.g. a missing urls array, more than 5 URLs, or a URL that fails the requirements above
- 401 Unauthorized: missing or invalid API key
- 402 Payment Required: insufficient balance to cover the request
- 429 Too Many Requests: rate limit exceeded (see Rate Limits below)
- 500 Internal Server Error: unexpected server-side failure
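Error responses carry a message describing the failure; the body shape below is a sketch, not a confirmed schema:

```json
{
  "error": "Maximum 5 URLs per request"
}
```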
Pricing
- Cost: $0.001 per successfully scraped URL
- Billing: You are only charged for URLs that are successfully scraped
- Payment Methods: USD balance or Nano (XNO) cryptocurrency
Rate Limits
- Default: 30 requests per minute per IP address
- With API Key: 30 requests per minute per API key
Code Examples
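A minimal Python sketch using the requests library. The endpoint path and the `x-api-key` header name are assumptions; adjust them to match your account's API reference:

```python
import requests

# Both the endpoint path and the x-api-key header name are assumptions;
# confirm them against your account's API reference.
API_URL = "https://nano-gpt.com/api/scrape-urls"
API_KEY = "YOUR_API_KEY"

response = requests.post(
    API_URL,
    headers={"x-api-key": API_KEY},
    json={"urls": ["https://example.com", "https://example.org"]},  # up to 5 URLs
    timeout=60,
)
response.raise_for_status()
data = response.json()

# Check each result's success flag before touching its content.
for result in data["results"]:
    if result["success"]:
        print(result["url"], "->", result.get("title"))
        print(result.get("markdown", "")[:200])  # formatted markdown version
    else:
        print(result["url"], "failed:", result.get("error"))

print("Total cost: $", data["summary"]["totalCost"])
```

Since billing applies only to successful scrapes, iterating on a snippet like this costs nothing for failed attempts.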
Best Practices
- Batch Requests: Send multiple URLs in a single request (up to 5) to minimize API calls
- Error Handling: Always check the `success` field for each result before accessing content
- Content Size: Scraped content is limited to 100KB per URL
- URL Validation: Validate URLs on your end before sending to reduce failed requests
- Markdown Format: Use the markdown field for better readability and formatting
Limitations
- Maximum 5 URLs per request
- Maximum content size: 100KB per URL
- No JavaScript rendering (static content only)
FAQ
Q: Why was my URL rejected? A: URLs can be rejected for several reasons:
- Invalid format (not HTTP/HTTPS)
- Pointing to localhost or private IPs
- Using non-standard ports
- Being a YouTube URL (use the YouTube transcription endpoint)
Q: Can I scrape JavaScript-heavy sites? A: The scraper fetches static HTML content. Sites that rely heavily on JavaScript may not return complete content.
Q: What happens if a URL fails to scrape? A: You are not charged for failed URLs. The response will include an error message for that specific URL.
Q: Is there a sandbox/test environment? A: You can test with your regular API key. Since you’re only charged for successful scrapes, failed attempts during testing won’t cost anything.