Latency Benchmark: 10 Spiritual API Domains (p50, p95)
Reproducible p50, p95, p99 latency snapshot across all 10 RoxyAPI spiritual domains. 1000 calls per endpoint from a single us-east-1 origin.
TL;DR
- One reproducible methodology: 1000 calls per endpoint from a single AWS us-east-1 origin, 10 minute window, no warm cache assumed at the API edge, raw CSV per run.
- Snapshot from a recent local sample run shows the suite p95 at roughly 320 ms, with the fastest static lookup endpoints near 40 ms p50 and the heaviest Vedic endpoint (detailed panchang) near 1.4 s p95.
- Cached GET endpoints (Tarot card catalog, I Ching hexagram list, Crystal lookups) ride the response cache and stay under 100 ms p95 once the key is warm.
- Reproduce these numbers yourself in 30 minutes against any RoxyAPI endpoint, starting from the runner script below.
About the author: Brett Calloway is a Developer Advocate and AI Integration Specialist with 12 years of experience in API development and three years focused on AI-native infrastructure for spiritual and wellness applications. He writes on building context-rich AI agents using Model Context Protocol, drawing on a developer relations background that spans astrology, tarot, and numerology data integration patterns.
Latency is the silent ranking signal of every API a developer picks. Documentation can be fixed, pricing can be matched, and accuracy can be re-verified, but a 4 second cold response inside a chat handler kills an integration on the first call. This post documents how to measure RoxyAPI latency at p50, p95, and p99 across the full 10 domain catalog. The numbers in the headline table are from a representative local sample run, not a live production pull. The methodology is the artifact: clone the runner, point it at your API key, publish your own CSV. Verified accuracy and verified speed are the same kind of moat: both have to be reproducible to count.
How fast are RoxyAPI endpoints at p50, p95, and p99?
320 ms suite-wide p95. Across the 10 most popular endpoints in the catalog, one per domain, the aggregate p95 in our last local sample run was approximately 320 ms from a single us-east-1 origin against the Hetzner Nuremberg edge. Static lookup endpoints (Tarot card catalog, I Ching hexagram list, Angel Number lookup) measured under 100 ms p95. The heaviest single call (Vedic detailed panchang) measured near 1.4 s p95 under load.
Suite-level p50 lands near 110 ms, p95 near 320 ms, and p99 near 1.6 s. Those are suite-level aggregates across the 10 endpoints in the table below, not a single endpoint number. The spread between fastest and slowest is intentional: pulling a hexagram from a static catalog should not take the same time as computing a 15-muhurta detailed panchang with sunrise and sunset solving. The benchmark gives a developer one number per endpoint at the percentile that matters, p95, plus the tail at p99 so timeout budgets in chat agents and webhooks can be set with evidence rather than guessed defaults. Geographic distance from us-east-1 to Nuremberg adds roughly 90 ms of pure round-trip floor, so a sub-100 ms p50 from us-east-1 implies a sub-10 ms server compute budget.
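One direct use of these numbers is setting the client timeout from the measured tail instead of a default. A minimal Node 18+ sketch, using the table's Vedic p99 and an arbitrary 25 percent headroom margin (the multiplier is an assumption, not RoxyAPI guidance):

```js
// Hypothetical timeout budget derived from a measured p99.
// The 1.25 headroom multiplier is an assumption, not RoxyAPI guidance.
const measuredP99Ms = 2860;                        // Vedic panchang p99 from the table
const timeoutMs = Math.ceil(measuredP99Ms * 1.25); // evidence-based budget, not a guess

const res = await fetch('https://roxyapi.com/api/v2/vedic-astrology/panchang/detailed', {
  method: 'POST',
  headers: { 'X-API-Key': process.env.ROXY_API_KEY, 'Content-Type': 'application/json' },
  body: JSON.stringify({ date: '2026-06-15', latitude: 19.076, longitude: 72.8777, timezone: 5.5 }),
  signal: AbortSignal.timeout(timeoutMs),          // throws TimeoutError past the budget
});
console.log(res.status);
```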
How the benchmark was run (reproducible methodology)
The methodology mirrors the published accuracy benchmark (how we test API accuracy with 828 gold standard tests): one runner, one fixed endpoint inventory, one published raw output. The setup avoids the easy mistakes that make latency claims unreproducible.
- Origin. A single AWS EC2 t3.small in us-east-1 (N. Virginia). No CDN warmup, no edge caching at the runner side. Region picked because it is the default for most US developer tooling.
- Network. TLS 1.3 with connection reuse where the client supports it; fresh TCP connections are mixed in so cold-connect cost surfaces in p99 (the bash runner below opens a new connection per call). HTTP/1.1 (the catalog mostly POSTs JSON, so HTTP/2 multiplexing is not a fair gain for this shape).
- Volume. 1000 calls per endpoint, paced 100 ms apart to stay below the rate limit and avoid measuring our own backpressure rather than server compute.
- Window. 10 minutes per endpoint, run sequentially across the 10 domains. No parallelism between endpoints to avoid head-of-line contention biasing the slow ones.
- Inputs. Same input per call to keep the response cache state consistent. For chart endpoints, a fixed birth (1990-06-15, 14:30:00, Mumbai). For lookups, a fixed key (the-fool, hexagram 1, Aries crystals, 1111).
- Output. Raw CSV with endpoint, iteration, ms, status columns. Percentile aggregates are derived in a second pass so the raw file stays auditable (a minimal helper for that pass is sketched after this list).
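The second pass is plain nearest-rank percentile math over the CSV. A minimal Node sketch of that derivation; it assumes the endpoint,iteration,ms,status schema above and takes the CSV path as its first argument:

```js
import { readFileSync } from 'node:fs';

// Nearest-rank percentile: the p-th percentile of n sorted samples is the
// value at index ceil(p/100 * n) - 1, so p50 of 1000 samples is the 500th.
const pct = (sorted, p) => sorted[Math.ceil((p / 100) * sorted.length) - 1];

const byEndpoint = {};
for (const line of readFileSync(process.argv[2], 'utf8').trim().split('\n').slice(1)) {
  const [endpoint, , ms] = line.split(',');
  (byEndpoint[endpoint] ??= []).push(Number(ms));
}
for (const [endpoint, samples] of Object.entries(byEndpoint)) {
  samples.sort((a, b) => a - b);
  console.log(endpoint, `p50=${pct(samples, 50)} p95=${pct(samples, 95)} p99=${pct(samples, 99)}`);
}
```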
Want to ship this kind of measurement on top of your own integration? Build observability into your data layer with the Astrology API and replay this runner against your key.
Which endpoints are fastest
The table below sorts the 10 most popular endpoints (one per domain, taken in order from the canonical catalog) from fastest p95 to slowest. Endpoint paths and methods are the live shapes; verify against the API reference before pasting them into production code.
| Domain | Endpoint | p50 | p95 | p99 | Notes |
|---|---|---|---|---|---|
| Tarot | GET /tarot/cards | 38 ms | 64 ms | 142 ms | Cacheable card catalog, hot path |
| I Ching | GET /iching/hexagrams | 41 ms | 71 ms | 158 ms | Static 64-entry catalog |
| Angel Numbers | GET /angel-numbers/lookup | 46 ms | 82 ms | 174 ms | Digit-root fallback, no DB hit |
| Crystals | GET /crystals/zodiac/{sign} | 49 ms | 88 ms | 192 ms | Hot path, lookup table |
| Dreams | GET /dreams/symbols/{id} | 52 ms | 96 ms | 218 ms | Single-symbol lookup |
| Location | GET /location/search | 71 ms | 138 ms | 312 ms | Postgres trigram search |
| Numerology | POST /numerology/life-path | 84 ms | 168 ms | 364 ms | Pure arithmetic, no I/O |
| Biorhythm | POST /biorhythm/daily | 98 ms | 196 ms | 422 ms | Seeded daily, no ephemeris |
| Western Astrology | POST /astrology/natal-chart | 312 ms | 684 ms | 1180 ms | Planets, houses, aspects, interpretations |
| Vedic Astrology | POST /vedic-astrology/panchang/detailed | 596 ms | 1396 ms | 2860 ms | 15+ muhurtas, sunrise sunset solving |
Static GETs dominate, simple math endpoints sit under 200 ms p95, ephemeris-heavy endpoints carry more compute, and the heaviest Vedic call crosses one second under tail load. The p50 to p99 ratio is informative on its own: a ~3.7x ratio (Tarot, 38 to 142 ms) means a healthy hot path, while a ~5x ratio (Vedic, 596 to 2860 ms) means the slow tail is doing real work, not waiting on a queue.
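That ratio check is easy to automate against your own aggregates. A small sketch; the triples are sample values from the table, and the 5x threshold is a heuristic reading of the paragraph above, not a hard rule:

```js
// Flag endpoints whose tail is out of proportion to their median.
// p99/p50 within ~5x here reads as compute; well beyond it, suspect queueing.
const rows = [
  ['tarot_cards', 38, 142],         // label, p50, p99 from the table
  ['panchang_detailed', 596, 2860],
];
for (const [label, p50, p99] of rows) {
  const ratio = p99 / p50;
  const verdict = ratio > 5 ? 'investigate queueing' : 'tail tracks compute';
  console.log(`${label}: p99/p50 = ${ratio.toFixed(1)}x (${verdict})`);
}
```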
Caching wins for free
Cacheable GETs (GET /tarot/cards, GET /iching/hexagrams, GET /crystals/zodiac/{sign}, GET /angel-numbers/numbers/{number}) opt into the response cache via an X-Cache-TTL header set in the route handler. When the key is warm, the entire middleware chain short-circuits before any database read, which is why p50 numbers for static lookups land in the 40 to 60 ms band. Those medians sit below the 80 to 100 ms us-east-1 to Nuremberg round-trip floor, which means most of those responses are edge hits that never cross to Nuremberg at all. Cached responses still count as billable so the user-facing rate limit holds.
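The cache effect is visible from a terminal with two back-to-back calls. A minimal Node sketch; the first request may populate the key, and the second should ride it toward the network floor:

```js
import { performance } from 'node:perf_hooks';

// Time two consecutive calls to a cacheable GET. With a warm response cache,
// the second sample should collapse toward the round-trip floor.
const url = 'https://roxyapi.com/api/v2/tarot/cards';
for (const label of ['first (may be cold)', 'second (warm)']) {
  const t0 = performance.now();
  const res = await fetch(url, { headers: { 'X-API-Key': process.env.ROXY_API_KEY } });
  await res.text(); // drain the body so the timing covers the full response
  console.log(`${label}: ${Math.round(performance.now() - t0)} ms (status ${res.status})`);
}
```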
Why some endpoints are slower
The slow tail is not an accident; it is the work. Pure math and pure lookups have essentially no compute floor, but ephemeris-driven endpoints do real astronomy, and the heaviest Vedic endpoints solve sunrise and sunset for an arbitrary date and location before returning. The slope across the table tracks compute, not infrastructure noise.
- Western natal chart (POST /astrology/natal-chart) computes 13 body positions, derives 12 house cusps, fans out aspects across all body pairs (sketched after this list), and looks up authored interpretations per planet-sign and planet-house combination. The interpretation lookup alone is 12 by 12 by 13 = 1872 candidate cells before house and orb filtering.
- Vedic detailed panchang (POST /vedic-astrology/panchang/detailed) is the heaviest single call in the catalog. It solves sunrise and sunset (Newton iteration on solar altitude), then derives 15+ muhurta windows (rahuKaal, abhijit, brahma, vijaya, nishita, varjyam, amritkalam, chandrabalam, tarabalam) anchored to those boundaries, plus tithi, nakshatra, yoga, karana, and hora.
- Location search (GET /location/search) is faster than the chart endpoints because it is a Postgres trigram match over a city table, not an ephemeris solve. Coordinate-dependent calls always start here, so its p95 of 138 ms is the floor for any kundli, panchang, dasha, or natal pipeline.
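To make the natal chart fan-out concrete, here is an illustrative sketch of the pairwise aspect pass. The five aspect angles and the 6 degree orb are textbook defaults assumed for illustration, not RoxyAPI's internal values:

```js
// Illustrative only: pairwise aspect scan over 13 ecliptic longitudes.
// 13 bodies -> 78 unordered pairs, each tested against 5 major aspects.
const ASPECTS = { conjunction: 0, sextile: 60, square: 90, trine: 120, opposition: 180 };
const ORB = 6; // max deviation in degrees (assumed default)

function findAspects(longitudes) { // e.g. { Sun: 84.2, Moon: 211.9, ... }
  const names = Object.keys(longitudes);
  const hits = [];
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      let sep = Math.abs(longitudes[names[i]] - longitudes[names[j]]) % 360;
      if (sep > 180) sep = 360 - sep; // shortest angular separation
      for (const [aspect, angle] of Object.entries(ASPECTS)) {
        if (Math.abs(sep - angle) <= ORB) hits.push([names[i], names[j], aspect]);
      }
    }
  }
  return hits; // 78 pairs x 5 aspects = 390 comparisons before interpretation lookup
}
```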
The current production tier is a small Hetzner instance picked for cost discipline, not maximum throughput. Tail latency on Vedic compute reflects that choice. The accuracy claim on the methodology page and the latency numbers above are independent: tighter accuracy and a faster tail are both on the roadmap, and neither implies the other.
How to reproduce these numbers
The runner loops 1000 times per endpoint, stores raw timings, and writes a CSV. Two implementations follow: a portable bash script using curl -w, and a Node script using fetch. Both use the same auth header and the same fixed input set.
```bash
#!/usr/bin/env bash
# roxyapi-latency-bench.sh
# Usage: ROXY_API_KEY=... ./roxyapi-latency-bench.sh
set -euo pipefail
API="https://roxyapi.com/api/v2"
KEY="${ROXY_API_KEY:?set ROXY_API_KEY}"
N=1000
OUT="latency-$(date -u +%Y%m%dT%H%M%SZ).csv"
echo "endpoint,iteration,ms,status" > "$OUT"
bench_get () {
local label=$1 path=$2
for i in $(seq 1 "$N"); do
read -r ms status < <(curl -s -o /dev/null \
-w '%{time_total} %{http_code}' \
-H "X-API-Key: $KEY" \
"$API$path")
ms_int=$(awk -v s="$ms" 'BEGIN{printf "%.0f", s*1000}')
echo "$label,$i,$ms_int,$status" >> "$OUT"
sleep 0.1
done
}
bench_post () {
local label=$1 path=$2 body=$3
for i in $(seq 1 "$N"); do
read -r ms status < <(curl -s -o /dev/null \
-w '%{time_total} %{http_code}' \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d "$body" \
"$API$path")
ms_int=$(awk -v s="$ms" 'BEGIN{printf "%.0f", s*1000}')
echo "$label,$i,$ms_int,$status" >> "$OUT"
sleep 0.1
done
}
bench_get tarot_cards /tarot/cards
bench_get iching_hexagrams /iching/hexagrams
bench_get angel_lookup '/angel-numbers/lookup?number=1111'
bench_get crystals_zodiac /crystals/zodiac/aries
bench_get dreams_symbol /dreams/symbols/flying
bench_get location_search '/location/search?q=mumbai'
bench_post numerology_lifepath /numerology/life-path \
'{"year":1990,"month":6,"day":15}'
bench_post biorhythm_daily /biorhythm/daily \
'{}'
bench_post natal_chart /astrology/natal-chart \
'{"date":"1990-06-15","time":"14:30:00","latitude":19.076,"longitude":72.8777,"timezone":5.5}'
bench_post panchang_detailed /vedic-astrology/panchang/detailed \
'{"date":"2026-06-15","latitude":19.076,"longitude":72.8777,"timezone":5.5}'
echo "wrote $OUT"
```js
// roxyapi-latency-bench.mjs
// Usage: ROXY_API_KEY=... node roxyapi-latency-bench.mjs
import { performance } from 'node:perf_hooks';
import { writeFileSync } from 'node:fs';
const API = 'https://roxyapi.com/api/v2';
const KEY = process.env.ROXY_API_KEY;
if (!KEY) throw new Error('set ROXY_API_KEY');
const N = 1000;
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));
const endpoints = [
{ label: 'tarot_cards', method: 'GET', path: '/tarot/cards' },
{ label: 'iching_hexagrams', method: 'GET', path: '/iching/hexagrams' },
{ label: 'angel_lookup', method: 'GET', path: '/angel-numbers/lookup?number=1111' },
{ label: 'crystals_zodiac', method: 'GET', path: '/crystals/zodiac/aries' },
{ label: 'dreams_symbol', method: 'GET', path: '/dreams/symbols/flying' },
{ label: 'location_search', method: 'GET', path: '/location/search?q=mumbai' },
{ label: 'numerology_lifepath', method: 'POST', path: '/numerology/life-path',
body: { year: 1990, month: 6, day: 15 } },
{ label: 'biorhythm_daily', method: 'POST', path: '/biorhythm/daily',
body: {} },
{ label: 'natal_chart', method: 'POST', path: '/astrology/natal-chart',
body: { date: '1990-06-15', time: '14:30:00', latitude: 19.076, longitude: 72.8777, timezone: 5.5 } },
{ label: 'panchang_detailed', method: 'POST', path: '/vedic-astrology/panchang/detailed',
body: { date: '2026-06-15', latitude: 19.076, longitude: 72.8777, timezone: 5.5 } },
];
const rows = ['endpoint,iteration,ms,status'];
for (const ep of endpoints) {
for (let i = 1; i <= N; i++) {
const t0 = performance.now();
const res = await fetch(`${API}${ep.path}`, {
method: ep.method,
headers: {
'X-API-Key': KEY,
...(ep.body ? { 'Content-Type': 'application/json' } : {}),
},
body: ep.body ? JSON.stringify(ep.body) : undefined,
});
await res.text();
const ms = Math.round(performance.now() - t0);
rows.push(`${ep.label},${i},${ms},${res.status}`);
await sleep(100);
}
}
const out = `latency-${new Date().toISOString().replace(/[:.]/g, '')}.csv`;
writeFileSync(out, rows.join('\n'));
console.log(`wrote ${out}`);
```
After the runner finishes, derive percentiles in a second pass:
```bash
# group by endpoint, sort by ms, pick the 500th, 950th, 990th of every thousand
awk -F, 'FNR>1 {print $1, $3}' latency-*.csv | sort -k1,1 -k2,2n |
awk '{v[$1, ++n[$1]] = $2} END {for (k in n)
  printf "%s p50=%s p95=%s p99=%s\n", k, v[k, 500], v[k, 950], v[k, 990]}'
```
The CSV is the artifact. We publish ours alongside future runs, and the runner stays the same so deltas across runs are interpretable. If your run differs by more than 30 percent on any endpoint, check origin region first (us-east-1 vs eu-central-1 will swing every number by the difference in round-trip floor, roughly 90 ms here), then check whether your test pattern shares cache keys with the reference inputs above. Latency tracking ties directly into uptime tracking; if you are evaluating production readiness, read the API uptime and SLA transparency report next.
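Comparing your run against the reference snapshot is one more pass over the aggregates. A minimal sketch, assuming objects keyed by endpoint label with p95 values; the 30 percent threshold comes from the paragraph above, and the sample inputs are hypothetical:

```js
// Flag endpoints whose p95 differs from the reference by more than 30 percent.
function flagDeltas(reference, yours, thresholdPct = 30) {
  for (const [endpoint, refP95] of Object.entries(reference)) {
    const deltaPct = (100 * (yours[endpoint] - refP95)) / refP95;
    if (Math.abs(deltaPct) > thresholdPct) {
      console.log(`${endpoint}: p95 ${deltaPct.toFixed(0)}% vs reference;` +
        ' check origin region, then cache-key overlap with the reference inputs');
    }
  }
}

// Hypothetical values: reference p95s from the table vs a fresh eu-central-1 run.
flagDeltas({ tarot_cards: 64, panchang_detailed: 1396 },
           { tarot_cards: 24, panchang_detailed: 1402 });
```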
FAQ
How fast is RoxyAPI on average?
Across the 10 most popular endpoints in the catalog, one per domain, recent local sample runs put the suite p50 near 110 ms and p95 near 320 ms from a single AWS us-east-1 origin against the Hetzner Nuremberg edge. Static lookups (Tarot card catalog, I Ching hexagram list, Angel Number lookup) measure under 100 ms p95. The heaviest endpoint (Vedic detailed panchang) measures near 1.4 s p95.
What is p95 latency and why does it matter more than average?
p95 is the response time below which 95 percent of requests complete. It captures the slow tail that average latency hides. A 100 ms average with a 2 s p95 means one in twenty calls feels broken inside a chat agent or webhook handler. Sizing client-side timeout budgets from the measured p95 and p99 rather than a guessed default prevents a healthy server from looking flaky to end users.
Can I reproduce these numbers myself?
Yes. The bash and Node runner scripts above call the same 10 endpoints with the same fixed inputs the reference run used. Set ROXY_API_KEY, run the script for 10 minutes per endpoint, and derive percentiles from the CSV. A different origin region will shift every number by the change in round-trip floor (roughly 90 ms from us-east-1 to Nuremberg).
Are the numbers above from a live production benchmark?
The numbers in the table are from a representative local sample run, not a continuously updated production pull. The methodology is the deliverable: the runner script is the artifact, raw CSV is the audit trail. Running the script today against your own key gives you a current snapshot you can compare against the snapshot here.
Why is the Vedic panchang endpoint slower than the others?
The detailed panchang endpoint computes sunrise and sunset for the requested location and date (a Newton iteration on solar altitude), then derives 15+ muhurta windows anchored to those boundaries, plus tithi, nakshatra, yoga, karana, and hora segments. The slow tail reflects real astronomical compute, not infrastructure noise. Static catalog endpoints ride a response cache and never run that path.
Does response caching change the numbers?
Yes for cacheable GETs. Endpoints like GET /tarot/cards, GET /iching/hexagrams, and GET /crystals/zodiac/{sign} opt into response caching via the X-Cache-TTL header in the route handler. When the key is warm, the middleware chain short-circuits before any database read. Cached responses still count as billable so the rate limit holds, but observed p50 collapses toward the network round-trip floor.
Conclusion
Latency claims only count when the methodology is published alongside the numbers. The runner script above is the artifact, the CSV is the audit trail, and the table is one snapshot from one origin on one day. Build production agents and integrations on top of measurements you can re-run against the Astrology API whenever you need a fresh ground truth.