Admin Access

Sign in to manage API keys and monitor usage

Auto-refresh: ON
Active API Keys
-
Total Keys
-
Requests Today
-
Avg Latency
-
milliseconds

Daily Requests (30 days)

By Task

Live Activity


API Keys

Key Owner Tier Rate Limit Status Created Last Used Actions

Requests Over Time

Requests by Task

Usage by API Key (30 days)

Key Prefix Owner Requests

Model Performance

Task F1 Macro Accuracy Training Time
Total Predictions
-
Avg Latency
-
milliseconds
P95 Latency
-
milliseconds
Avg Confidence
-
prediction certainty
Peak Throughput
-
requests / hour

Latency Distribution

P50
-
ms
P95
-
ms
P99
-
ms
Min
-
ms
Max
-
ms
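The percentile summary above (P50/P95/P99, min, max) can be derived from raw per-request latency samples. A minimal sketch, assuming samples arrive as a list of millisecond floats and using a simple nearest-rank percentile (the dashboard's actual method may differ):

```python
def latency_stats(samples_ms):
    """Summarize raw latency samples into the stats shown in the panel."""
    s = sorted(samples_ms)
    n = len(s)

    def pct(p):
        # Nearest-rank percentile over the sorted samples.
        k = max(0, min(n - 1, int(round(p / 100 * (n - 1)))))
        return s[k]

    return {
        "p50": pct(50),
        "p95": pct(95),
        "p99": pct(99),
        "min": s[0],
        "max": s[-1],
    }
```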

Confidence Distribution

Hourly Throughput (24h)

Per-Task Confidence

Cost / 1K Requests
-
at current capacity
Avg Latency
-
milliseconds per prediction
Monthly Infrastructure
-
Monthly Capacity
-
predictions / month

Cost per 1K Requests at Scale

Because the GPU cost is fixed, each prediction gets cheaper as request volume grows.
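The arithmetic behind this chart is simple: divide the fixed monthly infrastructure bill by monthly request volume. A sketch with hypothetical figures (the dollar amounts are illustrative, not actual IRIS pricing):

```python
def cost_per_1k(monthly_infra_usd, monthly_requests):
    """With a fixed GPU bill, cost per 1K requests is infra / volume * 1000."""
    if monthly_requests <= 0:
        raise ValueError("monthly_requests must be positive")
    return monthly_infra_usd / monthly_requests * 1000

# Hypothetical: a $500/month GPU serving 10M requests costs $0.05 per 1K;
# the same GPU at 1M requests costs $0.50 per 1K -- 10x the unit cost.
```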

SLM vs LLM API — When to Use What

Use IRIS SLM when:
  • You need real-time classification (<50ms)
  • Data cannot leave your network (compliance, PII)
  • You're making high-volume classification calls
  • You need predictable latency with no cold starts
  • You want zero vendor dependency
  • The task is well-defined classification (not open-ended generation)
Use LLM APIs when:
  • You need open-ended text generation
  • The task requires reasoning or multi-step logic
  • You have low volume (<10K/month) and minimal latency needs
  • The classification categories change frequently

Per-Task Inference Metrics

Task Requests Avg Latency Min Max Avg Confidence

Request Logs

Time (IST) API Key Task Label Confidence Latency Input Size Batch IP
Use the search filters above to query request logs

Inline Model Tester

Predicted Label
-
Confidence
-
  - ms
Probability Distribution
Open in Classifier
Service State
-
Models Validated
-
Uptime Since
-
GPU
-

Maintenance Mode

Maintenance mode blocks all prediction traffic and shows a branded maintenance page to end users. Admin panel and health probes remain accessible. Use this during model updates, retraining, or scheduled downtime.

Model Reload

Hot-reload models from disk. Use this after retraining to pick up new weights without restarting the service. Models are re-validated with warm-up predictions after reload.
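One common shape for this kind of hot reload is load-validate-swap: read the new weights, run a warm-up prediction so a bad checkpoint fails before it sees traffic, then atomically replace the live model. A minimal sketch; `load_fn`, `warmup_input`, and the `predict` interface are assumptions for illustration:

```python
import threading

class ModelRegistry:
    """Swap in retrained weights without restarting the service."""

    def __init__(self, load_fn, warmup_input):
        self._load_fn = load_fn          # e.g. reads weights from disk
        self._warmup_input = warmup_input
        self._lock = threading.Lock()
        self._model = None

    def reload(self):
        candidate = self._load_fn()
        # Warm-up validation: a broken checkpoint raises here,
        # before it ever serves live traffic.
        candidate.predict(self._warmup_input)
        with self._lock:
            self._model = candidate      # atomic swap for in-flight requests

    def predict(self, x):
        with self._lock:
            model = self._model
        return model.predict(x)
```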

Model Status

Task Device Version Size Warm-up Latency F1