Best Practices
Guidance for system prompts, temperature, multi-turn chat, tokens, errors, security, and responsible use when building with Vaidya.
Best Practices
1. Use system prompts effectively
- You may set a system prompt as the first message in the
messagesarray to establish context, instructions, and constraints for the model. This is a powerful way to guide the tone, style, and content of responses across the conversation. - Include language preference, response format (bullets, structured sections, length), and tone (clinical, patient-friendly, etc.).
- Example patterns and case-specific ideas live in the Prompt Guide and Chat Completions API examples.
2. Temperature settings by use case
| Use case | Suggested temperature | Rationale |
|---|---|---|
| Emergency / clinical | 0.2–0.3 | Accuracy-focused, less randomness |
| Symptom checker | 0.4–0.6 | Balanced clarity and flexibility |
| Wellness plans | 0.6–0.7 | More personalized, varied wording |
| Admin summarization | 0.2–0.4 | Factual, consistent summaries |
Tune per deployment; lower is safer when factual correctness matters most.
3. Multi-turn conversations
- Pass the full conversation history in the
messagesarray on every request. - The model has no memory between HTTP calls-only what you send in
messagescounts as context. - Include all relevant context (prior symptoms, constraints, user goals) in each turn when it affects the answer.
4. Token management
- Monitor usage from the response
usageobject (prompt, completion, and total tokens when returned). - Set
max_tokensper use case so answers are long enough to be useful but bounded. - Longer prompts and history mean more prompt tokens and higher cost-trim redundant turns when safe.
For account-level call volume, success rate, errors, plan limits, and Console alerts, see Monitoring Usage.
5. Error handling
- Implement retry logic with exponential backoff (and jitter) for transient failures.
- Retry
429(rate limit) and5xx(server errors) gracefully; honorRetry-Afterwhen present. - Do not blindly retry
400(bad request) or401(auth)-fix the payload or credentials first.
More detail: Introduction and Chat Completions - Error Responses.
6. Security
- Keep API keys server-side only; never expose them in browsers or mobile apps without a backend.
- Load keys from environment variables or a secrets manager in production.
- Rotate keys periodically and after any suspected leak.
- Use separate keys for development, staging, and production.
See also Authentication.
7. Responsible use disclaimers
- Always make clear that Vaidya is not a replacement for qualified medical professionals or emergency care.
- Show disclaimers in user-facing UI wherever health guidance is shown.
- For emergency-style assist flows, always instruct users to contact local emergency services when symptoms may be urgent or life-threatening.
Chat Completions API
OpenAI-compatible chat endpoint for healthcare AI — supports Symptom Checker, Wellness Management, Emergency Assist, Patient Journey Assist, and Administrator Assist.
Monitoring Usage
Use the Console Usage dashboard to track API volume, success rate, errors, plan limits, and alerts.

