Operating Rust Services
← Part 8
FFI and Embedding
Building a Rust service is the first half of the job. Operating one requires instrumentation that tells you what is happening at runtime, panic handling that fails gracefully, and deployment patterns that take advantage of what Rust binaries actually are. This post covers the operational layer that most tutorials skip.
Observability with the tracing Crate
The tracing crate is the standard observability layer for Rust async services. Unlike log, tracing is structured — spans and events carry key-value fields, not just formatted strings.
[dependencies]
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
Structured Logging
use tracing::{info, warn, error, instrument};
#[instrument(skip(pool), fields(user_id = %id))]
async fn handle_user_request(id: u64, pool: &sqlx::PgPool) -> Result<User, UserError> {
info!("fetching user");
let user = fetch_user(id, pool).await.map_err(|e| {
error!(error = %e, "database query failed");
UserError::Database(e)
})?;
if user.is_suspended {
warn!(reason = "account_suspended", "user access denied");
return Err(UserError::Forbidden {
user_id: id,
resource: "profile".to_string(),
});
}
info!(email = %user.email, "user fetched successfully");
Ok(user)
}
#[instrument] automatically creates a span around the function, recording its name and arguments as fields; timing is recorded by the subscriber when the span closes. Every log event inside the function is attached to that span. In a distributed system, spans chain across service boundaries to form traces.
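With the JSON fmt layer from the next section enabled, the info!("fetching user") event above is rendered as one JSON object per event. The exact shape depends on the layer configuration; roughly (values illustrative):

```json
{
  "timestamp": "2024-01-01T00:00:00.000000Z",
  "level": "INFO",
  "fields": { "message": "fetching user" },
  "target": "myservice::users",
  "span": { "name": "handle_user_request", "user_id": "42" }
}
```

The span object is what makes this searchable: every event carries the user_id field from #[instrument] without repeating it at each call site.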
Initializing the Subscriber
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter};
fn init_tracing() {
tracing_subscriber::registry()
.with(EnvFilter::try_from_default_env()
.unwrap_or_else(|_| "info,sqlx=warn,hyper=warn".into()))
.with(
tracing_subscriber::fmt::layer()
.json() // structured JSON output
.with_current_span(true) // include span context
.with_target(true),
)
.init();
}
Set RUST_LOG=debug in development and RUST_LOG=info in production; quiet noisy dependencies with per-target directives such as sqlx=warn. The EnvFilter layer parses the directive string once at startup and caches per-callsite filtering decisions, so disabled events cost almost nothing at runtime.
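The directive syntax goes beyond global levels. A sketch of common forms (the span-name directive assumes the handle_user_request function from above):

```shell
# Global level plus per-target overrides (targets are crate or module paths)
export RUST_LOG="info,sqlx=warn,hyper=warn"

# Directives can also target spans by name:
# debug-level events, but only inside the handle_user_request span
export RUST_LOG="info,[handle_user_request]=debug"
```

Span-scoped directives are useful for turning up verbosity on one request path without flooding logs from the rest of the service.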
OpenTelemetry Integration
For distributed tracing across services, connect tracing to the OpenTelemetry pipeline:
[dependencies]
opentelemetry = "0.21"
opentelemetry_sdk = { version = "0.21", features = ["rt-tokio"] }
opentelemetry-otlp = { version = "0.14", features = ["tonic"] }
tracing-opentelemetry = "0.22"
use opentelemetry_otlp::WithExportConfig;
async fn init_telemetry() -> anyhow::Result<()> {
let exporter = opentelemetry_otlp::new_exporter()
.tonic()
.with_endpoint("http://otel-collector:4317");
let tracer = opentelemetry_otlp::new_pipeline()
.tracing()
.with_exporter(exporter)
.install_batch(opentelemetry_sdk::runtime::Tokio)?;
tracing_subscriber::registry()
.with(tracing_opentelemetry::layer().with_tracer(tracer))
.with(EnvFilter::from_default_env())
.with(tracing_subscriber::fmt::layer())
.init();
Ok(())
}
All #[instrument] spans now export to your OTLP-compatible backend (Jaeger, Tempo, Honeycomb, Datadog) with no code changes.
Metrics with Prometheus
[dependencies]
metrics = "0.22"
metrics-exporter-prometheus = "0.13"
use metrics::{counter, histogram};
fn init_metrics() {
metrics_exporter_prometheus::PrometheusBuilder::new()
.with_http_listener(([0, 0, 0, 0], 9090))
.install()
.expect("failed to install Prometheus exporter");
}
async fn handle_request(method: &str, path: &str) {
let start = std::time::Instant::now();
counter!("http_requests_total", "method" => method.to_string(), "path" => path.to_string())
.increment(1);
// ... process request ...
histogram!("http_request_duration_seconds",
"method" => method.to_string(),
"path" => path.to_string()
).record(start.elapsed().as_secs_f64());
}
The metrics facade decouples your instrumentation from the exporter. Switch from Prometheus to statsd or another backend by changing one line in init_metrics.
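Scraping the exporter on port 9090 returns the standard Prometheus exposition format. For the counter above, the output would look roughly like this (sample values invented):

```text
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/users"} 42
http_requests_total{method="POST",path="/users"} 7
```

One caution that follows from this format: every distinct label combination becomes a separate time series, so avoid labels with unbounded values such as raw request paths with IDs in them.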
Panic Handling
In Rust, a panic is not an exception — it is an unrecoverable error that unwinds the stack and terminates the thread. In an async service, an unhandled panic in a spawned task causes the task to stop, not the process.
Always handle the Err case from JoinHandle:
let handle = tokio::spawn(async move {
process_batch(batch).await
});
match handle.await {
Ok(Ok(result)) => { /* success */ }
Ok(Err(e)) => { error!(error = %e, "batch processing failed"); }
Err(panic) => { error!("task panicked: {:?}", panic); }
}
Custom Panic Hook
Install a custom panic hook to emit structured logs before the process terminates:
fn install_panic_hook() {
std::panic::set_hook(Box::new(|info| {
let location = info.location()
.map(|l| format!("{}:{}", l.file(), l.line()))
.unwrap_or_else(|| "unknown".to_string());
// String-literal panics carry a &str payload; formatted panics carry a String
let message: &str = if let Some(s) = info.payload().downcast_ref::<&str>() {
s
} else if let Some(s) = info.payload().downcast_ref::<String>() {
s.as_str()
} else {
"Box<dyn Any>"
};
tracing::error!(
panic.message = message,
panic.location = %location,
"process panicked"
);
// Flush spans before exit
opentelemetry::global::shutdown_tracer_provider();
}));
}
This ensures the panic is visible in your log aggregation system, not just on stderr.
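The payload downcast used in the hook can be exercised with plain std threads, since JoinHandle::join returns the panic payload as Box<dyn Any + Send>. A minimal stdlib sketch; panic_message is a hypothetical helper mirroring the hook's logic:

```rust
use std::any::Any;
use std::thread;

// Extract a readable message from a panic payload: panics raised with a
// string literal carry a &str, formatted panics carry a String.
fn panic_message(payload: &(dyn Any + Send)) -> &str {
    if let Some(s) = payload.downcast_ref::<&str>() {
        s
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.as_str()
    } else {
        "Box<dyn Any>"
    }
}

fn main() {
    // Silence the default panic backtrace output for the demo thread
    std::panic::set_hook(Box::new(|_| {}));
    let handle = thread::spawn(|| panic!("boom: {}", 42));
    let payload = handle.join().unwrap_err();
    assert_eq!(panic_message(payload.as_ref()), "boom: 42");
    println!("recovered: {}", panic_message(payload.as_ref()));
}
```

The same two downcasts cover the overwhelming majority of real panics; anything else is an arbitrary Any payload with no portable string representation.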
Health Checks and Graceful Shutdown
use axum::{routing::get, Router};
use std::sync::atomic::{AtomicBool, Ordering};
static READY: AtomicBool = AtomicBool::new(false);
async fn liveness() -> &'static str {
"alive"
}
async fn readiness() -> axum::response::Result<&'static str> {
if READY.load(Ordering::Relaxed) {
Ok("ready")
} else {
Err(axum::http::StatusCode::SERVICE_UNAVAILABLE.into())
}
}
async fn run_service(pool: sqlx::PgPool) {
let app = Router::new()
.route("/health/live", get(liveness))
.route("/health/ready", get(readiness))
.with_state(pool);
let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
// Signal readiness after startup is complete
READY.store(true, Ordering::Relaxed);
info!("service ready");
axum::serve(listener, app)
.with_graceful_shutdown(shutdown_signal())
.await
.unwrap();
}
async fn shutdown_signal() {
use tokio::signal;
let ctrl_c = async { signal::ctrl_c().await.expect("ctrl-c handler") };
let terminate = async {
signal::unix::signal(signal::unix::SignalKind::terminate())
.expect("SIGTERM handler")
.recv()
.await;
};
tokio::select! {
_ = ctrl_c => {},
_ = terminate => {},
}
info!("shutdown signal received — draining connections");
READY.store(false, Ordering::Relaxed);
}
Deployment Considerations
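The liveness/readiness endpoints above map directly onto orchestrator probes. A sketch of the corresponding Kubernetes container configuration; paths and port match the service above, timings are illustrative:

```yaml
livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  periodSeconds: 5
```

Because shutdown_signal flips READY to false before draining, the readiness probe starts failing immediately and the orchestrator stops routing new traffic while in-flight requests complete.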
Rust links crates statically by default, but the binary still depends on the system libc unless you build against the musl target. A musl build is fully static, so the resulting binary runs in a FROM scratch image with no OS dependencies.
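Building for musl requires the target to be installed in the toolchain. Outside a container this is a one-time rustup step; the target triple matches the Dockerfile below:

```shell
# Install the musl target, then produce a fully static release binary
rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl
```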
# Multi-stage build
FROM rust:1.75-alpine AS builder
RUN apk add --no-cache musl-dev
WORKDIR /app
COPY . .
RUN cargo build --release --target x86_64-unknown-linux-musl
FROM scratch
COPY --from=builder /app/target/x86_64-unknown-linux-musl/release/myservice /myservice
ENTRYPOINT ["/myservice"]
For services that link against system libraries (OpenSSL, libpq), use distroless instead:
FROM gcr.io/distroless/cc-debian12
COPY --from=builder /app/target/release/myservice /myservice
ENTRYPOINT ["/myservice"]
Environment variables for operational configuration:
fn load_config() -> Config {
Config {
database_url: std::env::var("DATABASE_URL").expect("DATABASE_URL required"),
port: std::env::var("PORT")
.unwrap_or_else(|_| "3000".to_string())
.parse()
.expect("PORT must be a number"),
log_level: std::env::var("RUST_LOG").unwrap_or_else(|_| "info".to_string()),
}
}
Key Takeaways
- Use tracing with the #[instrument] macro for structured, span-aware logging that integrates directly with OpenTelemetry without code changes.
- Always await JoinHandle and handle the panic case: an unhandled panic in a spawned task silently drops the task in async Rust.
- Install a custom panic hook to emit structured logs and flush telemetry before the process exits; panics are otherwise invisible in your observability stack.
- Separate liveness (/health/live) from readiness (/health/ready): liveness tells the orchestrator the process is running; readiness tells it the service is ready to serve traffic.
- Rust binaries link statically with musl and run in FROM scratch images of 3–5 MB, a significant operational advantage for image pull times and attack surface.
- Graceful shutdown via with_graceful_shutdown drains in-flight connections before the process exits, preventing dropped requests during rolling deployments.