Overview
Create embeddings for text using OpenAI-compatible and alternative embedding models. Our API provides access to 16 different embedding models at competitive prices, fully compatible with OpenAI’s embedding API.
Available Models
OpenAI Models
- text-embedding-3-small - 1536 dimensions, $0.02/1M tokens - Most cost-effective with dimension reduction support
- text-embedding-3-large - 3072 dimensions, $0.13/1M tokens - Highest performance with dimension reduction support
- text-embedding-ada-002 - 1536 dimensions, $0.10/1M tokens - Legacy model
Alternative Models
Multilingual:
- BAAI/bge-m3 - 1024 dimensions, $0.01/1M tokens - Multilingual support
- jina-clip-v1 - 768 dimensions, $0.04/1M tokens - Multimodal CLIP embeddings

- BAAI/bge-large-en-v1.5 - 768 dimensions, $0.01/1M tokens - English optimized
- BAAI/bge-large-zh-v1.5 - 1024 dimensions, $0.01/1M tokens - Chinese optimized
- jina-embeddings-v2-base-en - 768 dimensions, $0.05/1M tokens - English
- jina-embeddings-v2-base-de - 768 dimensions, $0.05/1M tokens - German
- jina-embeddings-v2-base-zh - 768 dimensions, $0.05/1M tokens - Chinese
- jina-embeddings-v2-base-es - 768 dimensions, $0.05/1M tokens - Spanish

- jina-embeddings-v2-base-code - 768 dimensions, $0.05/1M tokens - Code embeddings
- Baichuan-Text-Embedding - 1024 dimensions, $0.088/1M tokens
- netease-youdao/bce-embedding-base_v1 - 1024 dimensions, $0.02/1M tokens
- zhipu-embedding-2 - 1024 dimensions, $0.07/1M tokens
- Qwen/Qwen3-Embedding-0.6B - 1024 dimensions, $0.01/1M tokens - Supports dimension reduction
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| input | string or array | Yes | Single text string or array of up to 2048 strings to embed |
| model | string | Yes | ID of the embedding model to use |
| encoding_format | string | No | Format for embeddings: "float" (default) or "base64" |
| dimensions | integer | No | Reduce embedding dimensions (only for supported models) |
| user | string | No | Optional identifier for tracking usage |
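The parameters in the table map directly onto a JSON request body. A minimal sketch of assembling and checking that body before sending it; the helper name build_embedding_payload is hypothetical, and the checks only mirror the limits stated in the table.

```python
MAX_INPUTS = 2048  # per the table: an array may hold up to 2048 strings
ENCODING_FORMATS = ("float", "base64")


def build_embedding_payload(texts, model, encoding_format=None,
                            dimensions=None, user=None):
    """Assemble a request body from the documented parameters (hypothetical helper)."""
    if isinstance(texts, str):
        input_value = texts
    elif isinstance(texts, (list, tuple)) and all(isinstance(t, str) for t in texts):
        if len(texts) > MAX_INPUTS:
            raise ValueError(f"input may contain at most {MAX_INPUTS} strings")
        input_value = list(texts)
    else:
        raise TypeError("input must be a string or a sequence of strings")

    if encoding_format is not None and encoding_format not in ENCODING_FORMATS:
        raise ValueError(f"encoding_format must be one of {ENCODING_FORMATS}")

    payload = {"input": input_value, "model": model}
    # Optional fields are left out unless set, so the server defaults apply.
    if encoding_format is not None:
        payload["encoding_format"] = encoding_format
    if dimensions is not None:
        payload["dimensions"] = int(dimensions)
    if user is not None:
        payload["user"] = user
    return payload
```

Leaving optional keys out of the body, rather than sending null values, keeps the request identical to what a hand-written minimal call would send.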
Response Format
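Because the endpoint is described as OpenAI-compatible, a response can be expected to follow the shape of OpenAI's embeddings response: a data array whose items carry index and embedding, plus model and usage. The sketch below parses such a body under that assumption; the sample values are illustrative, not real API output. With encoding_format set to "base64", OpenAI returns each vector as base64 over packed little-endian float32 values, and the decoder assumes this API does the same.

```python
import base64
import json
import struct

# Illustrative body in the OpenAI embeddings shape (assumed; values are made up).
SAMPLE_RESPONSE = json.dumps({
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.25, -0.5, 1.0]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 3, "total_tokens": 3},
})


def decode_embedding(value):
    """Return a list of floats from either a float list or a base64 string."""
    if isinstance(value, str):
        raw = base64.b64decode(value)
        # Assumption: packed little-endian float32, 4 bytes per component.
        return list(struct.unpack(f"<{len(raw) // 4}f", raw))
    return list(value)


def parse_embeddings(body):
    """Return (vectors ordered by index, total token count) from a JSON body."""
    parsed = json.loads(body)
    items = sorted(parsed["data"], key=lambda item: item["index"])
    vectors = [decode_embedding(item["embedding"]) for item in items]
    return vectors, parsed["usage"]["total_tokens"]
```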
Code Examples
Python with OpenAI SDK
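A minimal sketch using the official openai Python package, assuming the service is reached by pointing the SDK at a different base URL. The base URL below is a placeholder and the environment variable name is hypothetical; substitute the values for your account.

```python
import os


def make_client(base_url="https://api.example.com/v1"):  # placeholder URL
    """Create an OpenAI SDK client aimed at an OpenAI-compatible endpoint."""
    from openai import OpenAI  # requires: pip install openai

    # EMBEDDINGS_API_KEY is a hypothetical variable name for your auth token.
    return OpenAI(api_key=os.environ["EMBEDDINGS_API_KEY"], base_url=base_url)


def embed_texts(client, texts, model="text-embedding-3-small"):
    """Embed a string or a list of strings; returns vectors in input order."""
    response = client.embeddings.create(model=model, input=texts)
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]


# Usage (needs network access and a valid token):
#   client = make_client()
#   vectors = embed_texts(client, ["Hello, world!"])
#   print(len(vectors[0]))
```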
JavaScript/TypeScript
cURL
Batch Processing
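Since a single request accepts up to 2048 strings, larger collections have to be split across requests. A sketch of that chunking, where embed_batch is a hypothetical callable standing in for whatever sends one request and returns one vector per input:

```python
MAX_BATCH = 2048  # documented per-request input limit


def chunked(items, size=MAX_BATCH):
    """Yield consecutive slices of items with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(texts, embed_batch, size=MAX_BATCH):
    """Embed any number of texts, one call per chunk, preserving input order."""
    vectors = []
    for batch in chunked(texts, size):
        vectors.extend(embed_batch(batch))
    return vectors
```

A smaller size may be preferable in practice if individual texts are long, since request size grows with both the number and the length of inputs.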
Dimension Reduction
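A sketch of requesting shorter vectors through the dimensions parameter, written against an OpenAI SDK client object that is assumed to be configured for this API. The helper name and the local allow-list are illustrative; the list simply mirrors the models this page names as supporting reduction.

```python
# Models this page lists as supporting the dimensions parameter.
REDUCIBLE_MODELS = {
    "text-embedding-3-small",
    "text-embedding-3-large",
    "Qwen/Qwen3-Embedding-0.6B",
}


def embed_reduced(client, texts, model, dimensions):
    """Request embeddings with `dimensions` components (hypothetical helper)."""
    if model not in REDUCIBLE_MODELS:
        raise ValueError(f"{model} is not listed as supporting dimension reduction")
    response = client.embeddings.create(
        model=model, input=texts, dimensions=dimensions
    )
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]
```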
The dimensions parameter applies only to models that support it: text-embedding-3-small, text-embedding-3-large, and Qwen/Qwen3-Embedding-0.6B.
Use Cases
Semantic Search
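Semantic search with embeddings usually comes down to embedding the query and ranking stored document vectors by cosine similarity. A self-contained sketch of the ranking step; the function names are illustrative, and in real use the vectors would come from the embeddings endpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, doc_vectors, top_k=3):
    """Return (doc_index, score) pairs for the top_k most similar documents."""
    scored = [
        (index, cosine_similarity(query_vector, vector))
        for index, vector in enumerate(doc_vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

For large collections a vector index or database would normally replace the linear scan; the scoring idea stays the same.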
RAG (Retrieval Augmented Generation)
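In a RAG pipeline, the embedding step retrieves the passages most relevant to a question, and those passages are then placed in the prompt given to a generative model. A sketch of the retrieval and prompt-assembly half, where embed is a hypothetical callable mapping a list of strings to a list of vectors (for example a wrapper around the embeddings endpoint); the generation call itself is out of scope here.

```python
import math


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question, passages, embed, top_k=2):
    """Return the top_k passages most similar to the question."""
    vectors = embed([question] + list(passages))  # one batched call
    question_vector, passage_vectors = vectors[0], vectors[1:]
    ranked = sorted(
        zip(passages, passage_vectors),
        key=lambda pair: _cosine(question_vector, pair[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, context_passages):
    """Place the retrieved passages ahead of the question."""
    context = "\n".join(f"- {passage}" for passage in context_passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```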
Best Practices
Model Selection
- General English text: Use text-embedding-3-small for best price/performance
- Maximum accuracy: Use text-embedding-3-large
- Multilingual: Use BAAI/bge-m3 or language-specific Jina models
- Code: Use jina-embeddings-v2-base-code
- Budget-conscious: Use BAAI models at $0.01/1M tokens
Performance Optimization
- Batch requests: Send up to 2048 texts in a single request
- Use dimension reduction: Reduce dimensions for faster similarity calculations when exact precision isn’t critical
- Cache embeddings: Store computed embeddings to avoid re-processing identical texts
- Choose appropriate models: Don’t use 3072-dimension models if 768 dimensions suffice
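The caching advice above can be sketched as a small wrapper that keys stored vectors by a hash of the model name and text, so only unseen texts are sent to the API. The in-memory dict and the embed_batch callable are illustrative assumptions; a real deployment would likely use a persistent store.

```python
import hashlib


class EmbeddingCache:
    """Cache vectors per (model, text) so identical texts are embedded once."""

    def __init__(self, embed_batch, model):
        self._embed_batch = embed_batch  # hypothetical: list of str -> list of vectors
        self._model = model
        self._store = {}

    def _key(self, text):
        # Include the model name: vectors from different models are not interchangeable.
        return hashlib.sha256(f"{self._model}\x00{text}".encode("utf-8")).hexdigest()

    def get(self, texts):
        """Return vectors for texts, calling embed_batch only for cache misses."""
        # dict.fromkeys removes duplicate misses while keeping first-seen order.
        missing = list(dict.fromkeys(
            t for t in texts if self._key(t) not in self._store
        ))
        if missing:
            for text, vector in zip(missing, self._embed_batch(missing)):
                self._store[self._key(text)] = vector
        return [self._store[self._key(t)] for t in texts]
```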
Cost Optimization
- Monitor token usage: Track the usage field in responses
- Use smaller models: Start with text-embedding-3-small before upgrading
- Implement caching: Avoid re-embedding identical content
- Batch processing: Reduce API call overhead
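Tracking the usage field turns cost estimation into simple arithmetic: tokens divided by one million, times the model's listed price. A sketch using a few prices from the model list on this page; it assumes the usage object carries total_tokens as in OpenAI's response format, and the class name is illustrative.

```python
# USD per 1M tokens, copied from the model list on this page.
PRICE_PER_MILLION = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "BAAI/bge-m3": 0.01,
}


class UsageTracker:
    """Accumulate token counts per model and estimate spend."""

    def __init__(self):
        self.tokens_by_model = {}

    def record(self, model, usage):
        """Add one response's usage; usage is a dict with total_tokens (assumed)."""
        previous = self.tokens_by_model.get(model, 0)
        self.tokens_by_model[model] = previous + usage["total_tokens"]

    def estimated_cost(self):
        """Estimated spend in USD across all recorded models."""
        return sum(
            tokens / 1_000_000 * PRICE_PER_MILLION[model]
            for model, tokens in self.tokens_by_model.items()
        )
```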
Rate Limits
- Default: 100 requests per second per IP address
- Internal requests: No rate limiting (requires internal auth token)
Error Handling
The API returns standard HTTP status codes and OpenAI-compatible error responses:
- 401: Invalid or missing API key
- 400: Invalid request parameters
- 429: Rate limit exceeded
- 500: Server error
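A common way to handle these codes is to retry 429 and 500 responses with exponential backoff and to fail immediately on 400 and 401, which a retry cannot fix. A sketch of that policy; the ApiError class and the send callable are hypothetical stand-ins for whatever HTTP client is in use.

```python
import time

RETRYABLE_STATUSES = {429, 500}


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status, message=""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(); retry retryable failures with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return send()
        except ApiError as err:
            is_last_attempt = attempt == max_attempts - 1
            if err.status not in RETRYABLE_STATUSES or is_last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
```

The sleep parameter exists so the delay can be replaced in tests; adding random jitter to each delay is a common refinement when many clients retry at once.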
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
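A sketch of attaching the bearer token to a raw HTTP request using only the Python standard library. The URL is a placeholder, and the /embeddings path is assumed from OpenAI compatibility rather than stated on this page.

```python
import json
import urllib.request

API_URL = "https://api.example.com/v1/embeddings"  # placeholder host; path assumed


def build_request(token, payload, url=API_URL):
    """Build a POST request carrying the bearer token and a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending (needs network access and a valid token):
#   with urllib.request.urlopen(build_request(token, payload)) as response:
#       body = json.load(response)
```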
Body
application/json
Parameters for creating embeddings
- input: Text to embed, either a single string or an array of up to 2048 strings
- model: ID of the embedding model to use. Example: "text-embedding-3-small"
- encoding_format: Format for embeddings. Available options: float, base64
- dimensions: Reduce embedding dimensions (only for supported models). Example: 256
- user: Optional identifier for tracking usage