RUM and the Front-end Gap
Series: Observability in Depth
You can have perfect backend observability and still have no idea what users are experiencing. A 50ms API response means nothing if the browser spends 4 seconds parsing a JavaScript bundle, re-layouting the DOM three times, and painting late because a third-party ad script is blocking the main thread. Backend traces end at the network edge. Real User Monitoring (RUM) begins there.
The front-end gap is the time between your API responding and the user seeing a usable page — and for many applications it is the largest contributor to perceived latency.
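To make the gap concrete, it can be estimated from timings the browser already records. The following is a minimal sketch, assuming you have the document's `responseStart` and `responseEnd` (from a `PerformanceNavigationTiming` entry) and the LCP time, all in milliseconds relative to the navigation's time origin. The function name `splitPageLoad` and the sample numbers are illustrative, not taken from any SDK.

```javascript
// Split a page load into server/network time and the front-end gap.
// All inputs are milliseconds measured from the navigation's time origin.
function splitPageLoad({ responseStart, responseEnd, lcp }) {
  const ttfb = responseStart;                          // server + network
  const download = responseEnd - responseStart;        // HTML transfer
  const frontEndGap = Math.max(0, lcp - responseEnd);  // parse, layout, paint
  return {
    ttfb,
    download,
    frontEndGap,
    // Fraction of the time-to-LCP spent after the response had fully arrived
    frontEndShare: lcp > 0 ? frontEndGap / lcp : 0,
  };
}

// Illustrative numbers: a fast response followed by about 4 s of browser work.
const result = splitPageLoad({ responseStart: 50, responseEnd: 60, lcp: 4060 });
console.log(result);
```

With inputs like these, almost all of the time to LCP falls on the browser side, which is the situation backend traces alone cannot show.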
The Observability Stack: Where RUM Lives
The connection between RUM and backend traces is a trace_id that flows from the browser's XHR/fetch call through the traceparent header into your backend spans — and then back through a response header so the browser agent can correlate the RUM session with the server-side trace.
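The traceparent header mentioned above follows the W3C Trace Context layout: `version-traceid-parentid-flags`, all lowercase hex. As a rough sketch of what a backend or a debugging tool sees, the helper below parses the version 00 layout. The name `parseTraceparent` is hypothetical, and real tracing SDKs handle this for you.

```javascript
// W3C Trace Context, version 00 layout:
//   version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(String(header).trim());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // Version "ff" and all-zero IDs are invalid per the spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return {
    version,
    traceId,   // shared by the browser span and every backend span
    parentId,  // the span that made the outgoing request
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}

const parsed = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
console.log(parsed);
```

The `traceId` field is the value that links a RUM session to the server-side trace.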
Core Web Vitals: The Metrics That Matter
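Google's Core Web Vitals (LCP, INP, and CLS) are the industry standard for user-perceived performance. FCP and TTFB are not Core Web Vitals themselves, but they are useful diagnostic metrics to track alongside them: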
| Metric | Full name | Good | Needs work | Poor | Measures |
|---|---|---|---|---|---|
| LCP | Largest Contentful Paint | < 2.5s | 2.5–4s | > 4s | Load speed (largest element) |
| INP | Interaction to Next Paint | < 200ms | 200–500ms | > 500ms | Responsiveness |
| CLS | Cumulative Layout Shift | < 0.1 | 0.1–0.25 | > 0.25 | Visual stability |
| FCP | First Contentful Paint | < 1.8s | 1.8–3s | > 3s | Perceived load start |
| TTFB | Time to First Byte | < 800ms | 800ms–1.8s | > 1.8s | Server + network response |
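The thresholds in the table can be encoded directly if you want to rate raw measurements yourself, for example in a dashboard or a test. This is a small sketch: `THRESHOLDS` and `rateMetric` are illustrative names, time-based metrics are in milliseconds, and the boundary handling (a value exactly on a threshold counts toward the better bucket) is an assumption, since the table does not specify it.

```javascript
// [upper bound for "good", upper bound for "needs-improvement"]
// Time-based metrics are in milliseconds; CLS is a unitless score.
const THRESHOLDS = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
  FCP: [1800, 3000],
  TTFB: [800, 1800],
};

function rateMetric(name, value) {
  const bounds = THRESHOLDS[name];
  if (!bounds) throw new Error(`Unknown metric: ${name}`);
  const [good, needsImprovement] = bounds;
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs-improvement';
  return 'poor';
}

console.log(rateMetric('LCP', 3100)); // needs-improvement
console.log(rateMetric('CLS', 0.31)); // poor
```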
TTFB is the one metric bridging RUM and backend. A high TTFB with normal backend P99 points to network or CDN issues. A high TTFB matching high backend P99 is a backend problem.
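The two cases above can be written as a small decision helper. This is only a sketch: `triageTtfb` is a hypothetical function, the 800 ms TTFB budget comes from the table, and the 300 ms backend budget is an assumed placeholder you would replace with your own latency objective. Both inputs should be comparable percentiles over the same time window.

```javascript
// Decide where to look first when TTFB is high.
// rumTtfbMs: TTFB as seen by browsers; backendMs: server-side latency from traces.
function triageTtfb({ rumTtfbMs, backendMs, ttfbBudgetMs = 800, backendBudgetMs = 300 }) {
  if (rumTtfbMs <= ttfbBudgetMs) return 'ok';
  // The backend is slow too, so the server explains the slow first byte.
  if (backendMs > backendBudgetMs) return 'backend';
  // The backend is healthy, so time is lost outside it (DNS, TLS, CDN, network path).
  return 'network-or-cdn';
}

console.log(triageTtfb({ rumTtfbMs: 1900, backendMs: 120 }));  // network-or-cdn
console.log(triageTtfb({ rumTtfbMs: 1900, backendMs: 1600 })); // backend
```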
Grafana Faro: Open-Source RUM
Grafana Faro is the pragmatic choice if you are already on the Grafana stack. It collects Web Vitals, JS errors, and custom events, sending them to a Faro collector that forwards to Loki and Tempo.
// Initialize Faro in your React app
import { initializeFaro, getWebInstrumentations } from '@grafana/faro-web-sdk';
import { TracingInstrumentation } from '@grafana/faro-web-tracing';
const faro = initializeFaro({
  url: 'https://faro-collector.example.com/collect',
  app: {
    name: 'shop-frontend',
    version: '2.4.1',
    environment: 'production',
  },
  instrumentations: [
    ...getWebInstrumentations({
      captureConsole: true,
      captureConsoleDisabledLevels: ['debug', 'log'],
    }),
    new TracingInstrumentation({
      instrumentationOptions: {
        propagateTraceHeaderCorsUrls: [/api\.example\.com/],
      },
    }),
  ],
});
// Custom event: user completes checkout
faro.api.pushEvent('checkout_completed', {
  order_id: orderId,
  amount_cents: String(amountCents),
  payment_method: paymentMethod,
});
The TracingInstrumentation automatically adds traceparent headers to fetch/XHR calls matching propagateTraceHeaderCorsUrls, connecting browser activity to backend traces.
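The URL patterns decide which outgoing requests carry trace headers, so it is worth checking what a pattern actually matches. The sketch below illustrates the matching idea only and is not the SDK's implementation; `shouldPropagate` and the example URLs are hypothetical. It shows that an unanchored regular expression also matches hosts that merely contain the API hostname.

```javascript
// A request gets a traceparent header only if its URL matches a configured pattern.
function shouldPropagate(url, patterns) {
  return patterns.some((pattern) => pattern.test(url));
}

const loose = [/api\.example\.com/];                  // matches the substring anywhere
const anchored = [/^https:\/\/api\.example\.com\//];  // matches scheme and exact host

console.log(shouldPropagate('https://api.example.com/cart', loose));            // true
console.log(shouldPropagate('https://api.example.com.evil.test/x', loose));     // true (unintended)
console.log(shouldPropagate('https://api.example.com.evil.test/x', anchored));  // false
```

For cross-origin API calls, the backend's CORS configuration also has to list traceparent in Access-Control-Allow-Headers, otherwise the browser's preflight check rejects the request.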
Web Vitals Collection with the web-vitals Library
If you prefer a lighter-weight approach without a full RUM SDK:
import { onCLS, onINP, onLCP, onFCP, onTTFB } from 'web-vitals';
// getSessionID() is an app-specific helper, not part of web-vitals
function sendToAnalytics({ name, value, rating, id, navigationType }) {
  const body = JSON.stringify({
    metric: name,
    // CLS is a unitless score; scale it by 1000 so it survives rounding (the backend divides by 1000 again)
    value: Math.round(name === 'CLS' ? value * 1000 : value),
    rating, // 'good' | 'needs-improvement' | 'poor'
    id,
    navigation_type: navigationType,
    page: window.location.pathname,
    timestamp: Date.now(),
    session_id: getSessionID(),
  });
  // Use sendBeacon for reliability at page unload
  navigator.sendBeacon('/metrics/web-vitals', body);
}
onCLS(sendToAnalytics);
onINP(sendToAnalytics);
onLCP(sendToAnalytics);
onFCP(sendToAnalytics);
onTTFB(sendToAnalytics);
On the backend, ingest these into Prometheus via a small proxy:
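// Go net/http handler that converts web-vitals JSON to Prometheus metrics.
// WebVitalsPayload and sanitizePath are assumed to be defined elsewhere in the package.
import (
	"encoding/json"
	"net/http"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)
var webVitalsHistogram = promauto.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "web_vitals_seconds",
	Help:    "Core Web Vitals measurements",
	Buckets: []float64{0.1, 0.25, 0.5, 1, 2.5, 4, 7.5, 15},
}, []string{"metric", "rating", "page"})
func handleWebVitals(w http.ResponseWriter, r *http.Request) {
	var payload WebVitalsPayload
	// Reject malformed bodies instead of recording a zero-value observation
	if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
		http.Error(w, "invalid payload", http.StatusBadRequest)
		return
	}
	val := float64(payload.Value) / 1000 // ms to seconds; also undoes the x1000 scaling of CLS
	webVitalsHistogram.WithLabelValues(
		payload.Metric, payload.Rating, sanitizePath(payload.Page),
	).Observe(val)
	w.WriteHeader(http.StatusNoContent)
}
Browser Error Tracking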
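JavaScript errors that don't surface in backend logs are among the most under-monitored failure categories. Faro's default web instrumentations already capture uncaught errors and unhandled rejections in recent SDK versions, so explicit handlers like the ones below are mainly useful when you want to attach extra context or are not using the defaults. To wire up a global error handler yourself:
// Capture unhandled errors and promise rejections
window.addEventListener('error', (event) => {
  // event.error can be null (for example, errors from cross-origin scripts), so fall back to the message
  faro.api.pushError(event.error ?? new Error(event.message), {
    type: 'unhandled_error',
    context: {
      message: event.message,
      filename: event.filename,
      lineno: String(event.lineno),
    },
  });
});
window.addEventListener('unhandledrejection', (event) => {
  faro.api.pushError(
    event.reason instanceof Error
      ? event.reason
      : new Error(String(event.reason)),
    { type: 'unhandled_promise_rejection' }
  );
});
Track error rate as a signal in your SLO: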
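# Browser error rate (from Faro → Loki)
# Assumes JSON log lines and that page loads are recorded with kind="navigate";
# adjust the parser (for example logfmt) and the selectors to match your collector's output.
sum(rate({app="shop-frontend"} | json | kind="exception" [5m]))
/
sum(rate({app="shop-frontend"} | json | kind="navigate" [5m]))
Sampling RUM Data Responsibly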
At 10M page views/day, capturing every interaction is prohibitively expensive. Use session sampling:
const SAMPLE_RATE = 0.10; // 10% of sessions
const faro = initializeFaro({
  // ...
  sessionTracking: {
    samplingRate: SAMPLE_RATE,
    // Intended to keep the sampling decision stable for the whole session;
    // verify this option name against your Faro SDK version
    persistSessionSampling: true,
  },
  beforeSend: (item) => {
    // Always send errors, even in unsampled sessions
    if (item.type === 'exception') return item;
    // Drop other items from unsampled sessions.
    // Assumes the session exposes a 'sampled' attribute and that unsampled items still reach this hook;
    // both depend on the SDK version, so check before relying on this.
    return faro.api.getSession()?.attributes?.sampled === 'true' ? item : null;
  },
});
Connecting RUM to Backend Traces in Grafana
Configure Grafana to correlate Faro sessions with Tempo traces:
{
  "correlations": [
    {
      "sourceUID": "faro-datasource",
      "targetUID": "tempo-prod",
      "label": "Open trace",
      "config": {
        "type": "query",
        "field": "traceId",
        "target": {
          "query": "${__value.raw}"
        }
      }
    }
  ]
}
Now in the Faro explore panel you can click a slow page load, see the traceId from the API call, and jump directly to the Tempo waterfall showing which backend service was responsible.
Key Takeaways
- Backend traces end at the network edge — RUM fills the observability gap between API response and the user seeing a usable page.
- Core Web Vitals (LCP, INP, CLS) are the standard vocabulary for front-end performance; TTFB is the bridge metric between RUM and backend observability.
- Grafana Faro's TracingInstrumentation automatically injects traceparent headers into fetch/XHR calls, linking browser sessions to backend traces without manual correlation.
- Browser JavaScript errors are the most under-monitored failure category in most stacks; unhandled errors and promise rejections must be captured explicitly.
- Session sampling at 10% is a reasonable default for high-traffic applications; always send error events, even from sessions that were not sampled.
- RUM-to-trace correlation in Grafana turns a slow page load report into a one-click path to the root-cause service, making the full observability stack genuinely end-to-end.