Vaidya.ai

Monitoring Usage

Use the Console Usage dashboard to track API volume, success rate, errors, plan limits, and alerts.

Sign in to console.vaidya.ai and open the Usage page in the dashboard. It complements per-response metrics (such as the usage object on chat completions) with account-level trends and guardrails.

What the Usage page shows

  • Daily and monthly API call counts — how many requests your keys or project sent over the selected period.
  • Success rate (%) — share of calls that completed successfully vs failed; use it to spot regressions after deploys or client changes.
  • Error breakdown by HTTP status code — e.g. 401, 429, 5xx; helps you separate auth misconfiguration, throttling, and server-side issues.
  • Usage vs plan limits — a progress bar (or equivalent) showing consumption against the allowance on your current plan.
  • Usage alerts at configurable thresholds — notifications when usage crosses 80% or 95% of your limit (or other thresholds you configure), so you are warned before hard cutoffs.
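
To cross-check the dashboard against your own client logs, the success rate and error breakdown above can be recomputed locally. This is a minimal sketch: `summarize_calls` is a hypothetical helper, and counting every 2xx response as a success is an assumption, since this page does not state how the Console defines success.

```python
from collections import Counter


def summarize_calls(status_codes):
    """Return (success_rate_percent, error_counts) for a list of HTTP status codes."""
    total = len(status_codes)
    if total == 0:
        return 0.0, {}
    # Assumption: any 2xx response is a success; everything else is an error.
    errors = Counter(code for code in status_codes if not 200 <= code < 300)
    successes = total - sum(errors.values())
    return round(100.0 * successes / total, 2), dict(errors)


# Illustrative status codes taken from a client-side request log.
rate, errors = summarize_calls([200, 200, 200, 401, 429, 429, 503, 200])
print(rate)    # 50.0
print(errors)  # {401: 1, 429: 2, 503: 1}
```

If the locally computed numbers diverge sharply from the dashboard, check whether client-side retries are being logged as separate calls.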

Reading the usage charts

  • Time range: switch between daily and monthly views to match your audience (daily for operations, monthly for finance and billing).
  • Call volume trends: a rising slope usually means more traffic or more retries, and a spike right after a release may be expected. A flat line combined with a falling success rate points to client or integration bugs.
  • Success rate: treat sustained drops as a priority. Correlate them with the error breakdown to see whether failures are mostly 429 (back off, raise limits, or optimize call patterns) or other 4xx validation/auth errors (fix clients).
  • Error-by-status chart or table: prioritize fixes by the dominant code. Many 401s → keys or headers; many 400s → payload validation; many 429s → rate limits; many 5xx → server-side issues, often transient (retry, and contact support if they persist).
  • Plan limit bar: read it as remaining headroom, not just total used. If the bar moves quickly near month-end, forecast next month and adjust plan or traffic before you hit 100%.
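
Reading the plan limit bar as headroom amounts to a small calculation: remaining allowance divided by recent daily volume. The sketch below assumes a simple average of recent daily call counts. The function name and the sample numbers are illustrative, not values from the Console.

```python
import math


def days_until_limit(used, limit, recent_daily_calls):
    """Estimate whole days until usage reaches the plan limit.

    Returns None when there is no recent traffic to extrapolate from.
    """
    remaining = max(limit - used, 0)
    if not recent_daily_calls:
        return None
    average = sum(recent_daily_calls) / len(recent_daily_calls)
    if average <= 0:
        return None
    # Round up: the limit is reached at some point during the final day.
    return math.ceil(remaining / average)


# Illustrative numbers: 82,000 of 100,000 calls used, about 3,000 calls per day.
print(days_until_limit(82_000, 100_000, [2_800, 3_100, 3_100]))  # 6
```

A plain average ignores weekly seasonality and growth. If traffic is trending upward, the real headroom is shorter than this estimate.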

Setting up usage alerts

  1. In the Console, open Usage (or Settings / Notifications, depending on your console layout) and find usage or quota alerts.
  2. Enable alerts and set thresholds — commonly 80% for an early warning and 95% for a critical warning before the limit.
  3. Choose delivery channels (email, in-app, or other options your workspace supports) so the right people (engineering + ops/finance) get notified.
  4. After a threshold fires, confirm whether the jump is legitimate growth (plan upgrade) or bugs (retry storms, runaway jobs) before changing plans.
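
The same thresholds can be mirrored in your own monitoring so that internal tooling agrees with the Console alerts. This is a sketch under assumptions: `alert_level` and its return labels are hypothetical, and the 80% and 95% defaults simply follow the common values mentioned in the steps above.

```python
def alert_level(used, limit, warn=0.80, critical=0.95):
    """Map current usage to 'ok', 'warning', 'critical', or 'over_limit'."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    ratio = used / limit
    # Check the most severe condition first so each level is reported once.
    if ratio >= 1.0:
        return "over_limit"
    if ratio >= critical:
        return "critical"
    if ratio >= warn:
        return "warning"
    return "ok"


# Illustrative values against a 100,000-call allowance.
for used in (50_000, 80_000, 96_000, 100_000):
    print(used, alert_level(used, 100_000))
```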

When you are approaching limits

Situation | What to do
80% threshold | Review trending charts; estimate days until 100%; trim unnecessary calls, fix retry loops, and cache idempotent reads where possible. Open a conversation with your account team about a plan increase if growth is expected.
95% threshold | Treat as urgent: reduce non-production traffic, pause soak tests, and throttle optional features if needed. Confirm production keys are not shared with dev/staging (use separate keys).
At or over limit | Expect 429 or policy blocks; ensure your app handles rate limits without hammering the API. Upgrade the plan or negotiate limits before users see failures.
Errors rising while usage is flat | Focus on success rate and error codes; this is usually a regression, not a quota issue.
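
One way to handle rate limits without hammering the API is capped exponential backoff with jitter. The sketch below is not an official client: `call_with_backoff` is a hypothetical wrapper, `send` stands in for whatever function performs your request, and the delays are assumed defaults. If responses carry a `Retry-After` header (this page does not say whether they do), honoring it is usually preferable to a computed delay.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable returning (status_code, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Retry only throttling (429) and server-side (5xx) responses.
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff with full jitter, capped at max_delay seconds.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, delay))


# Simulated sender: throttled twice, then succeeds. Sleeping is stubbed out.
responses = iter([(429, "slow down"), (429, "slow down"), (200, "ok")])
print(call_with_backoff(lambda: next(responses), sleep=lambda s: None))  # (200, 'ok')
```

Capping the number of attempts matters as much as the delay: unbounded retries are what turn a brief throttle into a retry storm that consumes quota.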

For per-request token and completion metrics from the API itself, see Best Practices — Token management.
