Overview
Create embeddings for text using OpenAI-compatible and alternative embedding models. Our API provides access to 16 different embedding models at competitive prices, fully compatible with OpenAI’s embedding API.
Available Models
OpenAI Models
text-embedding-3-small
- 1536 dimensions, $0.02/1M tokens - Most cost-effective with dimension reduction support
text-embedding-3-large
- 3072 dimensions, $0.13/1M tokens - Highest performance with dimension reduction support
text-embedding-ada-002
- 1536 dimensions, $0.10/1M tokens - Legacy model
Alternative Models
Multilingual:
BAAI/bge-m3
- 1024 dimensions, $0.01/1M tokens - Multilingual support
jina-clip-v1
- 768 dimensions, $0.04/1M tokens - Multimodal CLIP embeddings
BAAI/bge-large-en-v1.5
- 768 dimensions, $0.01/1M tokens - English optimized
BAAI/bge-large-zh-v1.5
- 1024 dimensions, $0.01/1M tokens - Chinese optimized
jina-embeddings-v2-base-en
- 768 dimensions, $0.05/1M tokens - English
jina-embeddings-v2-base-de
- 768 dimensions, $0.05/1M tokens - German
jina-embeddings-v2-base-zh
- 768 dimensions, $0.05/1M tokens - Chinese
jina-embeddings-v2-base-es
- 768 dimensions, $0.05/1M tokens - Spanish
jina-embeddings-v2-base-code
- 768 dimensions, $0.05/1M tokens - Code embeddings
Baichuan-Text-Embedding
- 1024 dimensions, $0.088/1M tokens
netease-youdao/bce-embedding-base_v1
- 1024 dimensions, $0.02/1M tokens
zhipu-embedding-2
- 1024 dimensions, $0.07/1M tokens
Qwen/Qwen3-Embedding-0.6B
- 1024 dimensions, $0.01/1M tokens - Supports dimension reduction
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| input | string or array | Yes | Single text string or array of up to 2048 strings to embed |
| model | string | Yes | ID of the embedding model to use |
| encoding_format | string | No | Format for embeddings: "float" (default) or "base64" |
| dimensions | integer | No | Reduce embedding dimensions (only for supported models) |
| user | string | No | Optional identifier for tracking usage |
Response Format
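Because the API is OpenAI-compatible, responses follow OpenAI's embeddings response shape: a top-level object with object set to "list", a data array of embedding objects (each with an index, the embedding vector, and object set to "embedding"), the model name, and a usage object reporting prompt_tokens and total_tokens.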
Code Examples
Python with OpenAI SDK
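A minimal sketch using the openai Python SDK (v1+). The base URL and API key below are placeholders for your own endpoint and auth token, not values defined on this page.

```python
from openai import OpenAI

# Placeholders: point the client at your deployment and pass your auth token.
client = OpenAI(
    base_url="https://YOUR_API_BASE/v1",
    api_key="YOUR_API_KEY",
)

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog",
)

vector = response.data[0].embedding       # 1536 floats for this model
print(len(vector), response.usage.total_tokens)
```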
JavaScript/TypeScript
cURL
Batch Processing
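A sketch of batching under the 2048-input limit. It reuses the client from the Python example above; the chunking loop is illustrative, not part of the API.

```python
texts = [f"document {i}" for i in range(5000)]

all_embeddings = []
for start in range(0, len(texts), 2048):   # stay within the 2048-input limit
    chunk = texts[start:start + 2048]
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=chunk,
    )
    # Each item carries an index matching its position in the input chunk.
    all_embeddings.extend(
        item.embedding for item in sorted(response.data, key=lambda d: d.index)
    )

print(len(all_embeddings))   # 5000 vectors
```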
Dimension Reduction
For models that support it (text-embedding-3-small, text-embedding-3-large, Qwen/Qwen3-Embedding-0.6B), pass the dimensions parameter:
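A short sketch of the dimensions parameter from the request table, reusing the client from the Python example:

```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="dimension reduction example",
    dimensions=256,   # request 256-dim vectors instead of the native 1536
)
print(len(response.data[0].embedding))   # 256
```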
Use Cases
Semantic Search
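An illustrative sketch of ranking documents by cosine similarity. numpy and the client from the Python example are assumed; the sample documents are made up.

```python
import numpy as np

documents = [
    "How to reset your password",
    "Quarterly revenue grew 12%",
    "Embedding models map text to vectors",
]

doc_vectors = np.array([
    d.embedding
    for d in client.embeddings.create(
        model="text-embedding-3-small", input=documents
    ).data
])
query_vector = np.array(
    client.embeddings.create(
        model="text-embedding-3-small", input="vector representations of text"
    ).data[0].embedding
)

# Cosine similarity: dot product of L2-normalized vectors.
doc_norm = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
query_norm = query_vector / np.linalg.norm(query_vector)
scores = doc_norm @ query_norm

best = int(np.argmax(scores))
print(documents[best], float(scores[best]))
```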
RAG (Retrieval Augmented Generation)
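A hypothetical continuation of the semantic-search sketch: the top-ranked documents become context for whatever generation model you pair with these embeddings (generation itself is outside this API).

```python
# Reuses `documents`, `scores`, and numpy from the semantic-search sketch.
top_k = 2
context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:top_k])

prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: What do embedding models do?"
)
# `prompt` would then be sent to your chat/completion model of choice.
print(prompt)
```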
Best Practices
Model Selection
- General English text: Use text-embedding-3-small for best price/performance
- Maximum accuracy: Use text-embedding-3-large
- Multilingual: Use BAAI/bge-m3 or language-specific Jina models
- Code: Use jina-embeddings-v2-base-code
- Budget-conscious: Use BAAI models at $0.01/1M tokens
Performance Optimization
- Batch requests: Send up to 2048 texts in a single request
- Use dimension reduction: Reduce dimensions for faster similarity calculations when exact precision isn’t critical
- Cache embeddings: Store computed embeddings to avoid re-processing identical texts (see the caching sketch after this list)
- Choose appropriate models: Don’t use 3072-dimension models if 768 dimensions suffice
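A minimal in-memory caching sketch referenced in the list above; the helper name and hash-keyed store are illustrative, and the client comes from the Python example.

```python
import hashlib

_cache: dict[str, list[float]] = {}

def embed_cached(text: str, model: str = "text-embedding-3-small") -> list[float]:
    key = hashlib.sha256(f"{model}:{text}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = client.embeddings.create(
            model=model, input=text
        ).data[0].embedding
    return _cache[key]

vec = embed_cached("hello world")
vec_again = embed_cached("hello world")   # served from the cache, no API call
```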
Cost Optimization
- Monitor token usage: Track the usage field in responses (see the snippet after this list)
- Use smaller models: Start with text-embedding-3-small before upgrading
- Implement caching: Avoid re-embedding identical content
- Batch processing: Reduce API call overhead
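A small sketch of tracking the usage field and estimating cost from the per-model prices listed above, reusing the client from the Python example.

```python
PRICE_PER_MILLION_TOKENS = 0.02   # text-embedding-3-small, per the model list

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["first text", "second text"],
)
tokens = response.usage.total_tokens
estimated_cost = tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS
print(f"{tokens} tokens, estimated cost ${estimated_cost:.6f}")
```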
Rate Limits
- Default: 100 requests per second per IP address
- Internal requests: No rate limiting (requires internal auth token)
Error Handling
The API returns standard HTTP status codes and OpenAI-compatible error responses:
- 401: Invalid or missing API key
- 400: Invalid request parameters
- 429: Rate limit exceeded
- 500: Server error
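A sketch of handling 429s with exponential backoff, assuming the openai Python SDK's v1 exception classes and the client from the Python example.

```python
import time

import openai

def embed_with_retry(text: str, retries: int = 3) -> list[float]:
    for attempt in range(retries):
        try:
            return client.embeddings.create(
                model="text-embedding-3-small", input=text
            ).data[0].embedding
        except openai.RateLimitError:        # HTTP 429: back off and retry
            time.sleep(2 ** attempt)
        except openai.AuthenticationError:   # HTTP 401: retrying won't help
            raise
    raise RuntimeError("embedding request kept hitting the rate limit")
```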
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
Parameters for creating embeddings:
- input: Text to embed - a single string or an array of up to 2048 strings
- model: ID of the embedding model to use (example: "text-embedding-3-small")
- encoding_format: Format for embeddings. Available options: float, base64
- dimensions: Reduce embedding dimensions, only for supported models (example: 256)
- user: Optional identifier for tracking usage