When it comes to tracking real-time percentiles and instrumenting app spans, you have two choices: buy a managed APM that works out of the box, or build an infinitely scalable metrics pipeline from scratch. Here is how to decide between Sentry and ClickHouse.
Managed APM & Zero Ops vs. Infinite Scale & Build-It-Yourself.
Event-based pricing. Predictable and easy to forecast, but scales linearly. Once you hit massive scale, the bill can become a conversation at the board level.
Infrastructure cost. Extremely cheap at scale for the compute/storage you get, but carries a high fixed cost in engineering time and ongoing DevOps maintenance.
Drop in the SDK, configure your sample rate, and you are done. The dashboards, alerts, and rollups are pre-built.
You own the pipeline. You need to route spans (often via Kafka or Vector), design the schema, manage cluster shards, and wire it all up to Grafana or Metabase.
Opinionated dashboards tailored for software performance and error tracking. Excellent for standard APM, but rigid if you want to run complex custom business queries.
Arbitrary SQL queries over massive datasets. You can join your performance spans with your billing data, user metadata, or custom product events.
Spans are automatically linked to errors, releases, and user sessions. The context is built-in.
You build the correlation. If you want to link a slow p99 trace to a specific exception, you have to ensure the trace IDs are passed correctly and write the JOIN yourself.
Different philosophies for ingesting and querying massive amounts of span and metric data.
To render dashboards quickly, Sentry relies on pre-aggregated rollups. When spans hit their ingestion pipeline, they calculate metrics (like p50, p75, p99) into time-series buckets.
The Tradeoff: Instant dashboard loading, but you are querying aggregated data, not raw rows.
ClickHouse handles real-time percentiles over massive datasets using the AggregatingMergeTree engine. You use functions like quantileTiming() to calculate accurate percentiles on the fly.
The Tradeoff: Retain raw data for infinite drill-down capabilities, as long as your storage budget allows.