POST /api/scrape-urls
curl --request POST \
  --url https://nano-gpt.com/api/scrape-urls \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '{
  "urls": [
    "https://example.com/article",
    "https://blog.com/post"
  ]
}'
{
  "results": [
    {
      "url": "<string>",
      "success": true,
      "title": "<string>",
      "content": "<string>",
      "markdown": "<string>",
      "error": "<string>"
    }
  ],
  "summary": {
    "requested": 123,
    "processed": 123,
    "successful": 123,
    "failed": 123,
    "totalCost": 123
  }
}

Overview

The NanoGPT Web Scraping API allows you to extract clean, formatted content from web pages. It uses the Firecrawl service to scrape URLs and returns both raw HTML content and formatted markdown.

Authentication

The API supports two authentication methods:

1. API Key Authentication

Include your API key in the request header:

x-api-key: YOUR_API_KEY

2. Bearer Token Authentication

Alternatively, pass your API key as a bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY
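
Either method works; here is a minimal Python sketch (using the third-party requests library) showing both header styles, with only one needed per request:

import requests

API_KEY = "YOUR_API_KEY"

# Option 1: x-api-key header
headers = {"x-api-key": API_KEY}

# Option 2: Bearer token (equivalent; pick one)
# headers = {"Authorization": f"Bearer {API_KEY}"}

resp = requests.post(
    "https://nano-gpt.com/api/scrape-urls",
    json={"urls": ["https://example.com/article"]},  # requests sets Content-Type
    headers=headers,
)
print(resp.status_code)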

Request Format

Headers

Content-Type: application/json
x-api-key: YOUR_API_KEY

Request Body

{
  "urls": [
    "https://example.com/article",
    "https://blog.com/post",
    "https://news.site.com/story"
  ]
}

Parameters

  • urls (string[], required): Array of URLs to scrape. Maximum 5 URLs per request.

URL Requirements

  • Must be valid HTTP or HTTPS URLs
  • Must use a standard web port (80, 443, or the scheme default)
  • Cannot be localhost, private IPs, or metadata endpoints
  • YouTube URLs are not supported (use the YouTube transcription endpoint instead)
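
Validating URLs client-side before sending helps avoid failed requests (see Best Practices below). A rough Python sketch of these rules follows; the server's exact validation may differ, so treat it as an approximation:

from urllib.parse import urlparse
import ipaddress

def looks_scrapable(url: str) -> bool:
    """Approximate client-side check mirroring the documented URL rules."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    if parsed.port not in (None, 80, 443):  # standard web ports only
        return False
    host = parsed.hostname or ""
    if host in ("localhost", "169.254.169.254"):  # localhost / metadata endpoint
        return False
    try:
        if ipaddress.ip_address(host).is_private:  # private IP ranges
            return False
    except ValueError:
        pass  # hostname, not a literal IP address
    if "youtube.com" in host or "youtu.be" in host:  # use the transcription endpoint
        return False
    return True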

Response Format

Success Response (200 OK)

{
  "results": [
    {
      "url": "https://example.com/article",
      "success": true,
      "title": "Article Title",
      "content": "Raw HTML content...",
      "markdown": "# Article Title\n\nFormatted markdown content..."
    },
    {
      "url": "https://invalid.site.com",
      "success": false,
      "error": "Failed to scrape URL"
    }
  ],
  "summary": {
    "requested": 3,
    "processed": 3,
    "successful": 2,
    "failed": 1,
    "totalCost": 0.002
  }
}

Response Fields

results

Array of scraping results for each URL:

  • url (string): The URL that was scraped
  • success (boolean): Whether the scraping was successful
  • title (string, optional): Page title if successfully scraped
  • content (string, optional): Raw HTML content
  • markdown (string, optional): Formatted markdown version of the content
  • error (string, optional): Error message if scraping failed

summary

Summary statistics for the request:

  • requested (number): Number of URLs in the original request
  • processed (number): Number of valid URLs that were processed
  • successful (number): Number of URLs successfully scraped
  • failed (number): Number of URLs that failed to scrape
  • totalCost (number): Total cost in USD (only for successful scrapes)
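
As a quick illustration, a small Python sketch that walks a parsed response using the fields documented above:

def summarize(response_json: dict) -> None:
    """Print the outcome for each URL using the documented result fields."""
    for result in response_json["results"]:
        if result["success"]:
            print(f"OK   {result['url']}: {result.get('title', '(no title)')}")
        else:
            print(f"FAIL {result['url']}: {result.get('error', 'unknown error')}")
    s = response_json["summary"]
    print(f"{s['successful']}/{s['processed']} scraped, total cost ${s['totalCost']}")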

Error Responses

400 Bad Request

{
  "error": "Please provide an array of URLs to scrape"
}

401 Unauthorized

{
  "error": "Invalid session"
}

402 Payment Required

{
  "error": "Insufficient balance"
}

429 Too Many Requests

{
  "error": "Rate limit exceeded. Please wait before sending another request."
}

500 Internal Server Error

{
  "error": "An error occurred while processing your request"
}

Pricing

  • Cost: $0.001 per successfully scraped URL
  • Billing: You are only charged for URLs that are successfully scraped
  • Payment Methods: USD balance or Nano (XNO) cryptocurrency
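
For example, the sample response above shows 3 URLs processed with 2 successes: 2 × $0.001 = $0.002, matching its totalCost of 0.002.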

Rate Limits

  • Default: 30 requests per minute per IP address
  • With API Key: 30 requests per minute per API key
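
If you receive the 429 response shown above, backing off briefly and retrying is usually enough. A minimal Python sketch (the backoff schedule is an assumption, not a documented value):

import time
import requests

def scrape_with_retry(urls, api_key, max_retries=3):
    """POST to the scrape endpoint, backing off on 429 responses."""
    headers = {"x-api-key": api_key}
    for attempt in range(max_retries + 1):
        resp = requests.post(
            "https://nano-gpt.com/api/scrape-urls",
            json={"urls": urls},
            headers=headers,
        )
        if resp.status_code != 429:
            return resp
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ... arbitrary backoff choice
    return resp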

Code Examples

curl -X POST https://nano-gpt.com/api/scrape-urls \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "urls": [
      "https://example.com/article",
      "https://blog.com/post"
    ]
  }'
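
The same request in Python, as a sketch using the third-party requests library:

import requests

resp = requests.post(
    "https://nano-gpt.com/api/scrape-urls",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "urls": [
            "https://example.com/article",
            "https://blog.com/post",
        ]
    },
)
resp.raise_for_status()
for result in resp.json()["results"]:
    status = "ok" if result["success"] else result.get("error", "failed")
    print(result["url"], "->", status)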

Best Practices

  1. Batch Requests: Send multiple URLs in a single request (up to 5) to minimize API calls
  2. Error Handling: Always check the success field for each result before accessing content
  3. Content Size: Scraped content is limited to 100KB per URL
  4. URL Validation: Validate URLs on your end before sending to reduce failed requests
  5. Markdown Format: Use the markdown field for better readability and formatting

Limitations

  • Maximum 5 URLs per request
  • Maximum content size: 100KB per URL
  • No JavaScript rendering (static content only)

FAQ

Q: Why was my URL rejected? A: URLs can be rejected for several reasons:

  • Invalid format (not HTTP/HTTPS)
  • Pointing to localhost or private IPs
  • Using non-standard ports
  • Being a YouTube URL (use the YouTube transcription endpoint)

Q: Can I scrape JavaScript-heavy sites? A: The scraper fetches static HTML content. Sites that rely heavily on JavaScript may not return complete content.

Q: What happens if a URL fails to scrape? A: You are not charged for failed URLs. The response will include an error message for that specific URL.

Q: Is there a sandbox/test environment? A: You can test with your regular API key. Since you’re only charged for successful scrapes, failed attempts during testing won’t cost anything.

Authorizations

x-api-key (string, header, required)

Body

application/json

Web scraping parameters. The body is of type object.

Response

200 (application/json)

Web scraping response with results for each URL. The response is of type object.