Introduction
A single, OpenAI-compatible endpoint for healthcare AI — symptom triage, drug lookups, lab analysis, and more.
What is Vaidya AI API?
Vaidya AI API is an OpenAI/vLLM-compatible Chat Completions endpoint built for healthcare. Send a standard model and messages body; add optional case when you want a credit-priced healthcare workflow (symptom triage, drug lookup, labs, and more).
If you've used the OpenAI SDK before, you already know the request format. case is optional for general chat and required only when your deployment or product uses workflow routing.
Why use it?
- One endpoint, many workflows — symptom Q&A, drug interactions, lab analysis, health plans, and more behind a single POST.
- Drop-in compatible — works with the OpenAI Python/JS SDK, `requests`, `curl`, or any HTTP client.
- Predictable pricing — credit-based model, so you know the cost per call before you ship.
Features and credits
Each case maps to a healthcare capability and costs a fixed number of credits:
| Feature | case | Credits | File upload? | What it does |
|---|---|---|---|---|
| Symptom Q&A | symptom_qa | 1 | No | Triage and next-step guidance per turn |
| Drug Lookup | drug_lookup | 2 | No | Interaction checks, side effects, spacing advice |
| Lab Report (text) | lab_report_text | 10 | No | Structured interpretation of pasted lab values |
| Lab Report (image) | lab_report_image | 20 | Yes | OCR + interpretation from a photo of a report |
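The credit costs above can be folded into a small lookup for estimating spend before you ship. This is a sketch: the `CASE_CREDITS` mapping mirrors the table, but the helper name and shape are ours, not part of the API.

```python
# Credit costs per `case`, taken from the features table above.
CASE_CREDITS = {
    "symptom_qa": 1,
    "drug_lookup": 2,
    "lab_report_text": 10,
    "lab_report_image": 20,
}

def estimate_credits(planned_calls):
    """Estimate total credits for a batch of planned calls.

    `planned_calls` maps a case name to the number of calls planned.
    Raises KeyError for a case not in the table.
    """
    return sum(CASE_CREDITS[case] * n for case, n in planned_calls.items())
```

For example, three symptom Q&A turns plus one lab-report image would cost `estimate_credits({"symptom_qa": 3, "lab_report_image": 1})` = 23 credits.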
Quick overview
| Setting | Value |
|---|---|
| Base URL | https://api.vaidya-dev.fractal.ai |
| Endpoint | POST /chat/completions |
| Auth | Bearer token — get your key from console.vaidya.ai |
| Format | Standard OpenAI chat request; optional case for healthcare workflows |
Minimal request
The smallest valid call matches OpenAI Chat Completions — no case required for general chat:
```shell
curl --request POST \
  --url https://api.vaidya-dev.fractal.ai/chat/completions \
  --header "authorization: Bearer $VAIDYA_API_KEY" \
  --header "content-type: application/json" \
  --data '{
    "model": "Vaidya-v2",
    "messages": [
      {
        "role": "user",
        "content": "Explain the pathogenesis of rheumatoid arthritis."
      }
    ],
    "max_tokens": 1000,
    "temperature": 0.7
  }'
```

Same body as JSON:
```json
{
  "model": "Vaidya-v2",
  "messages": [
    {
      "role": "user",
      "content": "Explain the pathogenesis of rheumatoid arthritis."
    }
  ],
  "max_tokens": 1000,
  "temperature": 0.7
}
```

Add "case": "drug_lookup" (or another value from the table above) when you need a specific healthcare workflow and credit pricing.
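In Python, that body is easy to assemble programmatically. The sketch below builds the same request with an optional case; the `chat_request` helper and its defaults are ours (send the result to the endpoint with any HTTP client):

```python
def chat_request(content, case=None):
    """Build a Vaidya chat-completions body; `case` is optional."""
    body = {
        "model": "Vaidya-v2",
        "messages": [{"role": "user", "content": content}],
        "max_tokens": 1000,
        "temperature": 0.7,
    }
    if case is not None:
        # Routes the call to a credit-priced healthcare workflow.
        body["case"] = case
    return body

# General chat (no case) vs. a drug-lookup workflow call:
general = chat_request("Explain the pathogenesis of rheumatoid arthritis.")
lookup = chat_request("Can I take ibuprofen with paracetamol?", case="drug_lookup")
```

POST the resulting dict as JSON to https://api.vaidya-dev.fractal.ai/chat/completions with your `Authorization: Bearer <key>` header, exactly as in the curl example.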
Error handling scenarios
Failures return JSON with an error object (message, type, code) and an HTTP status, similar to OpenAI-style APIs. Typical situations:
| Scenario | Status | What to do |
|---|---|---|
| No Authorization header, wrong key format, or invalid key | 401 | Send Authorization: Bearer <key> with a key from console.vaidya.ai. See Authentication. |
| Key revoked or expired | 401 | Create a new key in the console and update your config. |
| Key lacks access to the resource or project | 403 | Check account or project permissions in the console. |
| Unknown or unsupported case value | 400 | Use a supported case from the table above (full list in Chat Completions). |
| messages missing roles, empty content, or invalid structure | 400 | Match the chat schema: alternating user/assistant turns with string content (or valid multimodal parts). |
| File-backed case without a file in the request | 400 | Attach the required PDF or image for that case. |
| Request or file payload too large | 413 | Shorten text, compress images, or split work across calls. |
| Field types or enums fail validation | 422 | Align types with the API reference (e.g. temperature as number). |
| Too many requests in a short window | 429 | Retry with exponential backoff; honor Retry-After when present. |
| Client or upstream timeout | 408 | Increase timeout, retry idempotent calls, or shorten prompts. |
| Transient server or overload | 500 / 503 | Retry with backoff; log the response/request id if you need support. |
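The 429 and 500/503 rows above both call for retrying with exponential backoff and honoring Retry-After. A minimal sketch of that loop (the helper name, retry counts, and delays are ours, not part of the API):

```python
import random
import time

RETRYABLE_STATUSES = {429, 500, 503}

def call_with_backoff(send, max_retries=4, base_delay=0.5):
    """Retry `send()` on 429/500/503 with exponential backoff.

    `send` is any zero-argument callable returning a response object
    with `.status_code` and `.headers` (e.g. a requests.post call).
    A numeric Retry-After header, when present, takes precedence
    over the computed delay.
    """
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code not in RETRYABLE_STATUSES or attempt == max_retries:
            return resp
        retry_after = resp.headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)  # server-suggested wait, in seconds
        else:
            # Exponential backoff with jitter to avoid thundering herds.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
        time.sleep(delay)
```

Only retry idempotent calls automatically; 400/401/403/422 responses indicate a request problem and should not be retried unchanged.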
Using LiteLLM
When you call Vaidya through LiteLLM (Python SDK or Proxy), LiteLLM maps provider HTTP errors to OpenAI-compatible exception types. You can import them from litellm or catch the matching openai types — see LiteLLM’s Exception Mapping.
Each raised exception includes at least status_code, message, and llm_provider (in addition to the standard OpenAI error fields when present).
| HTTP status | LiteLLM / OpenAI exception (typical) | Vaidya-related examples |
|---|---|---|
| 400 | BadRequestError | Invalid case, bad messages, missing file for a file-backed case, payload rules; may be ContextWindowExceededError, ContentPolicyViolationError, or UnsupportedParamsError when the message matches those cases |
| 401 | AuthenticationError | Missing/invalid/expired API key |
| 403 | PermissionDeniedError | Key or account lacks permission |
| 404 | NotFoundError | Wrong model id in config (often before the request reaches Vaidya) |
| 408 | APITimeoutError | Request timeout from client or gateway |
| 413 | BadRequestError | Request or attachment too large (treat as 400-class; confirm with status_code) |
| 422 | UnprocessableEntityError | Validation failures on typed fields |
| 429 | RateLimitError | Rate limit exceeded |
| 500 | APIError, APIConnectionError, or InternalServerError | Upstream or network failure; LiteLLM may use APIConnectionError when mapping is ambiguous |
| 503 | ServiceUnavailableError | Temporary unavailability |
LiteLLM Proxy only: spend or budget limits can raise BudgetExceededError (no fixed HTTP status — handled in your proxy config).
```python
import os

import litellm
from litellm import AuthenticationError, BadRequestError, RateLimitError
from openai import APIStatusError

try:
    response = litellm.completion(
        model="openai/Vaidya-v2",  # example: your LiteLLM model name
        api_base="https://api.vaidya-dev.fractal.ai",
        api_key=os.environ["VAIDYA_API_KEY"],
        extra_body={"case": "drug_lookup"},
        messages=[{"role": "user", "content": "Can I take ibuprofen with paracetamol?"}],
    )
except AuthenticationError as e:
    # e.status_code == 401
    ...
except BadRequestError as e:
    # e.status_code often 400; inspect e.message for Vaidya `error.code`
    ...
except RateLimitError as e:
    # e.status_code == 429
    ...
except APIStatusError as e:
    # Fallback for any mapped OpenAI-style error with a status code
    ...
```

For the raw JSON error shape, Vaidya error.code values, and retry guidance, see Error Responses and Rate Limiting and Retries in Chat Completions.
Who is this for?
- Health apps — add symptom triage, drug info, or lab analysis to your product.
- Wellness platforms — generate health scores and personalized plans.
- Clinical tools — build copilots and workflow assistants for care teams.
- Enterprise — embed healthcare AI into internal tools and customer-facing products.
Next steps
- Get your API key at console.vaidya.ai.
- Set up auth — see Authentication.
- Write better prompts — read the Prompt Guide.
- Start building — jump to the Chat Completions API reference.
- Ship safely — follow Best Practices for prompts, retries, security, and disclaimers.
- Watch usage — use the Console Monitoring Usage page for quotas, charts, and alerts.

