54 commits
2ce67d9
Add info about support
tothmano Jul 23, 2025
59d56d0
First draft
tothmano Jul 23, 2025
03b5d99
Fix query metrics page
tothmano Jul 24, 2025
6436905
Get query language overview ready for review
tothmano Jul 24, 2025
f1479ed
Update overview.mdx
tothmano Jul 24, 2025
c07d2dc
Apply suggestions from code review
manototh Jul 25, 2025
47f5e94
Hide unavailable features
tothmano Jul 25, 2025
834fdb3
Implement review
tothmano Jul 25, 2025
f7602d8
Merge branch 'main' into mano/metrics
manototh Jul 25, 2025
2061259
Merge branch 'main' into mano/metrics
tothmano Jul 30, 2025
681b987
Add dataset types, sample queries, migration
tothmano Jul 30, 2025
45ee75a
Add differences to migration
tothmano Jul 30, 2025
48fc557
Add to overview
tothmano Jul 30, 2025
c9bf70b
Finish overview page
tothmano Jul 30, 2025
013b880
Update datasets.mdx
tothmano Jul 30, 2025
50bf289
Update query-data/metrics/migrate-metrics.mdx
manototh Jul 30, 2025
72f5cb5
Apply changes to align syntax
manototh Oct 9, 2025
6698b37
Merge branch 'main' into mano/metrics
manototh Oct 9, 2025
f012596
Fix Vale
manototh Oct 9, 2025
b699c1b
Remove join
manototh Oct 9, 2025
c0913f8
Remove join 2
manototh Oct 9, 2025
7a9a106
Remove replace, comment out logical
manototh Oct 9, 2025
bf07dd9
Remove running avg examples
manototh Oct 9, 2025
20b5dc6
Remove currently unsupported features
manototh Oct 9, 2025
615a150
Correct syntax for align
manototh Oct 9, 2025
468bc74
Update query-data/metrics/query-metrics.mdx
manototh Oct 9, 2025
1eae185
Back to correct syntax
manototh Oct 9, 2025
774a203
Fix align syntax
manototh Oct 9, 2025
2752a6a
Add code highlighting
manototh Oct 9, 2025
60bb337
Update reference/datasets.mdx
manototh Oct 9, 2025
14877fc
Add current limitations
manototh Oct 9, 2025
d966e14
Delete space from allowed chars
manototh Oct 9, 2025
57fd6a2
First draft
manototh Oct 10, 2025
0de519e
Fixes
manototh Oct 10, 2025
aaba984
Change to private preview
manototh Oct 10, 2025
92eeeb4
Merge branch 'main' into mano/metrics-builder
manototh Oct 10, 2025
39a4662
Apply suggestions from code review
manototh Oct 10, 2025
bc1ab1c
Improve metric definition
manototh Oct 10, 2025
e822555
Add info about x-axiom-metrics-dataset header
manototh Oct 22, 2025
1a2578c
Apply suggestions from code review
manototh Oct 22, 2025
26f18d2
Merge branch 'main' into mano/metrics-builder
manototh Oct 22, 2025
e7c5858
Implement review
manototh Oct 22, 2025
a14581d
Add info about `application/x-protobuf` content type
manototh Oct 23, 2025
30940bd
Rename sidenav of Traces
manototh Oct 23, 2025
5329789
Explain design choices better
manototh Oct 23, 2025
6570a9d
Add to intro, arch, and features pages
manototh Oct 23, 2025
07222b0
Merge branch 'main' into mano/metrics-builder
manototh Oct 30, 2025
093717b
Add dashboard screenshot
manototh Oct 31, 2025
4368725
Implement review
manototh Nov 3, 2025
66c6351
Merge branch 'main' into mano/metrics-builder
manototh Nov 3, 2025
943cce7
Remove unsupported functions
manototh Nov 3, 2025
af189c2
Apply suggestions from code review
manototh Nov 3, 2025
112eab3
Implement reviews from Dom
manototh Nov 3, 2025
79c99a6
Implement review
manototh Nov 3, 2025
Binary file added doc-assets/shots/example-metrics-query.png
Binary file added doc-assets/shots/otel-metrics-dashboard.png
9 changes: 8 additions & 1 deletion docs.json
@@ -66,7 +66,14 @@
"query-data/visualizations",
"query-data/views",
"query-data/virtual-fields",
"query-data/traces",
{
  "group": "Metrics",
  "pages": [
    "query-data/metrics/overview",
    "query-data/metrics/query-metrics"
  ]
}
]
},
{
34 changes: 21 additions & 13 deletions introduction.mdx
@@ -9,34 +9,42 @@ Trusted by 30,000+ organizations, from high-growth startups to global enterprises.

## Components

Axiom consists of three core components, each purpose-built for its workload:

### EventDB

Robust, cost-effective, and scalable datastore specifically optimized for timestamped event data. Built from the ground up to handle the vast volumes and high velocity of event ingestion, EventDB ensures:

- **Scalable data loading:** Events are ingested seamlessly without complex middleware, scaling linearly with no single points of failure.
- **Extreme compression:** Tuned storage format compresses data 25-50x, significantly reducing storage costs and ensuring data remains queryable at any time.
- **Serverless querying:** Axiom spins up ephemeral, serverless runtimes on-demand to execute queries efficiently, minimizing idle compute resources and costs.
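To make the 25-50x compression figure concrete, here is a quick back-of-the-envelope calculation (illustrative numbers only, not a measurement):

```python
# Illustrative arithmetic: estimate stored size for a given ingest
# volume under the 25-50x compression range quoted above.

def compressed_size_gb(ingested_gb: float, ratio: float) -> float:
    """Return the stored size in GB for a given compression ratio."""
    return ingested_gb / ratio

# 1 TB of raw events per day:
daily_raw_gb = 1000.0
best = compressed_size_gb(daily_raw_gb, 50)   # 20 GB
worst = compressed_size_gb(daily_raw_gb, 25)  # 40 GB
print(f"1 TB/day stores as roughly {best:.0f}-{worst:.0f} GB")
```

In other words, a terabyte of daily ingest lands in the tens of gigabytes of actual storage.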

### MetricsDB

Dedicated metrics database engineered specifically for high-cardinality time-series data. Unlike traditional metrics solutions that penalize you for dimensional complexity, MetricsDB embraces high-cardinality tags as a design principle:

- **High-cardinality native:** Store metrics with high-cardinality dimensional tags without performance degradation or cost penalties.
- **Optimized storage:** Purpose-built storage format designed for time-series workloads delivers efficient compression and fast aggregations across millions of unique tag combinations.
- **Thoughtful constraints:** Design choices prioritize the most common metrics use cases while maintaining exceptional performance.

For more information, see [Axiom’s architecture](/platform-overview/architecture).

### Console

Intuitive web app built for exploration, visualization, and monitoring of your data.

- **Real-time exploration:** Effortlessly query and visualize data streams in real-time, providing instant clarity on operational and business conditions.
- **Dynamic visualizations:** Generate insightful visualizations, from straightforward counts to sophisticated aggregations, tailored specifically to your needs.
- **Robust monitoring:** Set up threshold-based and anomaly-driven alerts, ensuring proactive visibility into potential issues.

## Why choose Axiom?

- **Cost-efficiency:** Axiom dramatically lowers data ingestion and storage costs compared to traditional observability and logging solutions.
- **Flexible insights:** Real-time query capabilities and an increasingly intelligent UI help pinpoint issues and opportunities without sampling.
- **AI engineering:** Axiom provides specialized features designed explicitly for AI engineering workflows, allowing teams to confidently build, deploy, and optimize AI capabilities.

## Getting started

- [Learn more about Axiom’s features](/platform-overview/features).
- [Explore the interactive demo playground](https://play.axiom.co/).
- [Create your own organization](https://app.axiom.co/register).
98 changes: 47 additions & 51 deletions platform-overview/architecture.mdx
@@ -8,92 +8,88 @@ description: "Technical deep-dive into Axiom’s distributed architecture."
You don’t need to understand any of the following material to get massive value from Axiom. As a fully managed data platform, Axiom just works. This technical deep-dive is intended for curious minds wondering: Why is Axiom different?
</Tip>

Axiom routes ingestion requests through a distributed edge layer to a cluster of specialized services that process and store data in proprietary columnar formats optimized for different data types. EventDB handles high-volume event data, while MetricsDB is purpose-built for time-series metrics with high-cardinality dimensions. Query requests are executed by ephemeral, serverless workers that operate directly on compressed data stored in object storage.

## Ingestion architecture

Data flows through a multi-layered ingestion system designed for high throughput and reliability:

- **Regional edge layer:** HTTPS ingestion requests are received by regional edge proxies positioned to meet data jurisdiction requirements. These proxies handle protocol translation, authentication, and initial data validation. The edge layer supports multiple input formats (JSON, CSV, compressed streams) and can buffer data during downstream issues.
- **High-availability routing:** The system provides intelligent routing to healthy database nodes using real-time health monitoring. When primary ingestion paths fail, requests are automatically routed to available nodes or queued in a backlog system that processes data when systems recover.
- **Streaming pipeline:** Raw events are parsed, validated, and transformed in streaming fashion. Field limits and schema validation occur during this phase.
- **Write-ahead logging:** All ingested data is durably written to a distributed write-ahead log before being processed. This ensures zero data loss even during system failures and supports concurrent writes across multiple ingestion nodes.
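The write-ahead logging step can be illustrated with a minimal append-and-replay sketch. This shows the general durability pattern only, not Axiom’s implementation; the file format and names are invented:

```python
# Minimal write-ahead log sketch: every batch is appended and fsynced
# before it is acknowledged, so a crash after the ack can always be
# recovered by replaying the log. Illustration only.
import json
import os
import tempfile

class WriteAheadLog:
    def __init__(self, path: str):
        self.path = path

    def append(self, batch: list[dict]) -> None:
        with open(self.path, "a") as f:
            for event in batch:
                f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())  # durable before we acknowledge the write

    def replay(self) -> list[dict]:
        """Re-read every logged event, e.g. after a restart."""
        with open(self.path) as f:
            return [json.loads(line) for line in f]

path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal = WriteAheadLog(path)
wal.append([{"level": "info", "msg": "hello"}])
wal.append([{"level": "error", "msg": "boom"}])
assert len(wal.replay()) == 2  # a restart recovers both events
```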

## Storage architecture

Axiom’s storage layer uses specialized columnar formats optimized for different workload types:

### EventDB storage

EventDB’s storage is built around a custom columnar format that achieves extreme compression ratios:

- **Columnar organization:** Events are decomposed into columns and stored using specialized encodings optimized for each data type. String columns use dictionary encoding, numeric columns use various compression schemes, and boolean columns use bitmap compression.
- **Block-based storage:** Data is organized into immutable blocks that are written once and read many times. Each block contains:

  - Column metadata and statistics
  - Compressed column data in a proprietary format
  - Separate time indexes for temporal queries
  - Field schemas and type information

- **Compression pipeline:** Data flows through multiple compression stages:

  1. **Ingestion compression:** Real-time compression during ingestion (25-50% reduction)
  2. **Block compression:** Columnar compression within storage blocks (10-20x additional compression)
  3. **Compaction compression:** Background compaction further optimizes storage (additional 2-5x compression)

- **Object storage integration:** Blocks are stored in object storage (S3) with intelligent partitioning strategies that distribute load and avoid hot-spotting. The system supports multiple storage tiers and automatic lifecycle management.
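As a concrete picture of the dictionary encoding mentioned above for string columns, here is a minimal sketch; the actual proprietary format differs:

```python
# Dictionary encoding sketch: repeated string values are stored once in
# a dictionary, and the column becomes a list of small integer codes.
# Illustration only, not the real on-disk format.
def dictionary_encode(column: list[str]) -> tuple[list[str], list[int]]:
    dictionary: list[str] = []
    index: dict[str, int] = {}
    codes: list[int] = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes

col = ["GET", "POST", "GET", "GET", "PUT", "POST"]
dictionary, codes = dictionary_encode(col)
assert dictionary == ["GET", "POST", "PUT"]
assert codes == [0, 1, 0, 0, 2, 1]
# Decoding is a simple lookup, so the round-trip is lossless:
assert [dictionary[c] for c in codes] == col
```

The win is that low-cardinality columns (HTTP methods, log levels, status codes) shrink to a few bytes per value before block compression even runs.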

### MetricsDB storage

MetricsDB uses a specialized columnar format engineered for time-series metrics with high-cardinality dimensions:

- **High-cardinality optimization:** Unlike traditional metrics databases that struggle with dimensional complexity, MetricsDB is designed from the ground up to handle high numbers of unique tag combinations efficiently.
- **Intentional design constraints:** MetricsDB makes deliberate trade-offs to optimize for the most common metrics use cases. Where other systems penalize high cardinality or force you to pre-aggregate data, MetricsDB lets you store and query metrics with full dimensional flexibility.
- **Unified observability:** Query metrics alongside logs and traces, enabling powerful correlations across all your telemetry data without switching tools or learning multiple query languages.
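One way to picture the high-cardinality model is a map keyed by the full tag set, where each unique tag combination is its own series. A toy sketch, with names and structure invented for illustration:

```python
# Toy series store: each unique (metric name + tags) combination keys
# its own list of (timestamp, value) points. High cardinality simply
# means many keys. Illustration only.
from collections import defaultdict

series: dict[frozenset, list[tuple[int, float]]] = defaultdict(list)

def record(name: str, tags: dict[str, str], ts: int, value: float) -> None:
    # A frozenset of (key, value) pairs makes tag order irrelevant.
    key = frozenset({("__name__", name), *tags.items()})
    series[key].append((ts, value))

record("http_requests", {"region": "eu", "pod": "api-7f9c"}, 1700000000, 1.0)
record("http_requests", {"region": "eu", "pod": "api-7f9c"}, 1700000060, 3.0)
record("http_requests", {"region": "us", "pod": "api-1b2d"}, 1700000000, 2.0)

assert len(series) == 2  # two unique tag combinations -> two series
```

Swap `pod` for something like a request ID and the key count explodes; handling that growth efficiently is the problem MetricsDB is built around.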

## Query architecture

Axiom executes queries using a serverless architecture that spins up compute resources on-demand:

- **Query compilation:** The APL (Axiom Processing Language) query is parsed, optimized, and compiled into an execution plan. The compiler performs predicate pushdown, projection optimization, and identifies which blocks need to be read.
- **Serverless workers:** Query execution occurs in ephemeral workers optimized through "Fusion queries", a system that runs parallel queries inside a single worker to reduce costs and leave more resources for large queries. Workers download only the necessary column data from object storage, enabling efficient resource utilization. Multiple workers can process different blocks in parallel.
- **Block-level parallelism:** Each query spawns multiple workers that process different blocks concurrently. Workers read compressed column data directly from object storage, decompress it in memory, and execute the query.
- **Result aggregation:** Worker results are streamed back and aggregated by a coordinator process. Large result sets are automatically spilled to object storage and streamed to clients via signed URLs.
- **Intelligent caching:** Query results are cached in object storage with intelligent cache keys that account for time ranges and query patterns. Cache hits dramatically reduce query latency for repeated queries.
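The scatter-gather pattern described above, where workers scan blocks in parallel and a coordinator merges their partial aggregates, can be sketched as follows (a toy count-by-level query, not Axiom’s engine):

```python
# Block-level parallelism sketch: each worker computes a partial
# aggregate over one block, then a coordinator merges the partials.
# Illustration only.
from concurrent.futures import ThreadPoolExecutor

blocks = [
    [("error", 1), ("info", 1), ("error", 1)],   # block 0
    [("info", 1), ("info", 1)],                  # block 1
    [("error", 1)],                              # block 2
]

def scan_block(block: list) -> dict:
    """Worker: partial count-by-level over a single block."""
    partial: dict[str, int] = {}
    for level, n in block:
        partial[level] = partial.get(level, 0) + n
    return partial

with ThreadPoolExecutor() as pool:
    partials = list(pool.map(scan_block, blocks))

# Coordinator: merge the partial aggregates into the final result.
result: dict[str, int] = {}
for partial in partials:
    for level, n in partial.items():
        result[level] = result.get(level, 0) + n

assert result == {"error": 3, "info": 3}
```

Because each block is independent, adding workers scales the scan phase almost linearly; only the cheap merge step is serial.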

## Compaction system

A background compaction system continuously optimizes storage efficiency:


- **Automatic compaction:** The compaction scheduler identifies blocks that can be merged based on size, age, and access patterns. Small blocks are combined into larger "superblocks" that provide better compression ratios and query performance.
- **Multiple strategies:** The system supports several compaction algorithms:

  - **Default:** General-purpose compaction with optimal compression
  - **Clustered:** Groups data by common field values for better locality
  - **Fieldspace:** Optimizes for specific field access patterns
  - **Concat:** Simple concatenation for append-heavy workloads

- **Compression optimization:** During compaction, data is recompressed using more aggressive algorithms and column-specific optimizations that aren’t feasible during real-time ingestion.
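A toy version of size-based compaction merges runs of small blocks into superblocks up to a target size; the real scheduler also weighs age and access patterns:

```python
# Size-based compaction sketch: greedily merge adjacent small blocks
# until a superblock reaches the target size. Illustration only.
def compact(block_sizes: list[int], target: int) -> list[int]:
    superblocks: list[int] = []
    current = 0
    for size in block_sizes:
        if current and current + size > target:
            superblocks.append(current)  # seal the current superblock
            current = 0
        current += size
    if current:
        superblocks.append(current)
    return superblocks

# Many small blocks become a few large ones; no data is lost.
sizes = [10, 20, 15, 60, 5, 40, 30]
merged = compact(sizes, target=64)
assert sum(merged) == sum(sizes)   # total bytes preserved
assert len(merged) < len(sizes)    # fewer, larger blocks to scan
```

Fewer, larger blocks mean fewer object-storage reads per query and better compression ratios within each block.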

## System architecture

The overall system is composed of specialized microservices:

- **Core services:** Handle authentication, billing, dataset management, and API routing. These services are stateless and horizontally scalable.
- **Database layer:** The core database engine processes ingestion, manages storage, and coordinates query execution. It supports multiple deployment modes and automatic failover.
- **Orchestration layer:** Manages distributed operations, monitors system health, and coordinates background processes like compaction and maintenance.
- **Edge services:** Handle real-time data ingestion, protocol translation, and provide regional data collection points.

## Why this architecture wins

- **Cost efficiency:** Serverless query execution means you only pay for compute during active queries. Extreme compression (25-50x) dramatically reduces storage costs compared to traditional row-based systems.
- **Operational simplicity:** The system is designed to be self-managing. Automatic compaction, intelligent caching, and distributed coordination eliminate operational overhead.
- **Elastic scale:** Each component scales independently. Ingestion scales with edge capacity, storage scales with object storage, and query capacity scales with serverless workers.
- **Fault tolerance:** Write-ahead logging, distributed routing, and automatic failover ensure high availability. The system gracefully handles node failures and storage outages.
- **Real-time performance:** Despite the distributed architecture, the system maintains sub-second query performance through intelligent caching, predicate pushdown, and columnar storage optimizations.

This architecture enables Axiom to ingest millions of events per second while maintaining sub-second query latency at a fraction of the cost of traditional logging and observability solutions.