Open Source — Apache 2.0

Clean data in.
Garbage stays out.

Bastion is a lightweight, high-performance data ingestion gateway built in Rust. Validate, transform, and route your data before it touches your infrastructure. Send events to S3, Kafka, BigQuery, or custom destinations. Deploy anywhere, from a cloud VM to a Raspberry Pi.

[Diagram: Sensors, APIs, and Webhooks → Bastion (validate → transform → route) → Kafka (any region), DWH (BQ, Redshift, ...), and Cloud Storage (archive)]

Why Bastion?

Getting data into your data platform shouldn't require a PhD in distributed systems.

Validate at the Gate

Real-time schema validation before data enters your pipeline. No more 3 AM surprises from malformed payloads or silent schema changes.
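
As a rough illustration of what validating at the gate means, the sketch below checks an incoming event against a schema and reports every problem it finds. It is not Bastion's API: the `SCHEMA` layout, the field names, and the `validate` function are all hypothetical, chosen only to show the idea.

```python
from typing import Any

# Hypothetical schema: field name -> (expected type, required?)
SCHEMA = {
    "sensor_id": (str, True),
    "temperature": (float, True),
    "unit": (str, False),
}


def validate(event: dict[str, Any], schema: dict = SCHEMA) -> list[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    for field, (expected, required) in schema.items():
        if field not in event:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(event[field], expected):
            actual = type(event[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Reject unknown fields so a silent schema change surfaces immediately.
    for field in event:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors
```

An event that returns a non-empty list would be rejected before it reaches any destination.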

Edge-Native

A single binary under 20 MB. No JVM, no runtime dependencies, no GC pauses. Runs on a Raspberry Pi or a cloud VM. Built in Rust for when every millisecond matters.

Native Fan-Out

Publish to multiple destinations in a single pass — Kafka clusters, S3, webhooks. No MirrorMaker. No cross-cluster replication. Data goes where it needs to from the start.
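
The fan-out idea can be sketched in a few lines: one event is handed to every configured destination, and a failure in one does not block the rest. The `fan_out` function and the in-memory sinks below are hypothetical stand-ins for real Kafka, S3, and webhook clients, not Bastion's implementation.

```python
from typing import Callable

Sink = Callable[[dict], None]


def fan_out(event: dict, sinks: dict[str, Sink]) -> dict[str, str]:
    """Deliver one event to every sink and report a per-sink status."""
    results = {}
    for name, send in sinks.items():
        try:
            send(event)
            results[name] = "ok"
        except Exception as exc:
            # One failing destination must not block the others.
            results[name] = f"failed: {exc}"
    return results


# In-memory stand-ins for real destinations (assumptions for this sketch).
kafka_log: list[dict] = []
s3_log: list[dict] = []


def broken_webhook(event: dict) -> None:
    raise ConnectionError("webhook unreachable")


sinks = {"kafka": kafka_log.append, "s3": s3_log.append, "webhook": broken_webhook}
status = fan_out({"sensor_id": "s-01"}, sinks)
```

Here the Kafka and S3 stand-ins each receive the event, while the webhook failure is recorded in `status` instead of aborting the pass.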

Bronze → Silver → Gold

Built-in data pipeline architecture. Raw data is validated (Bronze), cleaned and transformed (Silver), and enriched with business logic (Gold) — all before storage.
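
A minimal sketch of the three stages, assuming a made-up temperature event: Bronze validates, Silver cleans and converts units, Gold adds a business rule. The field names, the Fahrenheit input, and the 30 °C threshold are illustrative assumptions, not part of Bastion.

```python
def bronze(raw: dict) -> dict:
    """Validate: reject events that are missing required fields."""
    for field in ("sensor_id", "temp_f"):
        if field not in raw:
            raise ValueError(f"missing field: {field}")
    return raw


def silver(event: dict) -> dict:
    """Clean and transform: normalize the ID, convert Fahrenheit to Celsius."""
    return {
        "sensor_id": event["sensor_id"].strip().lower(),
        "temp_c": round((float(event["temp_f"]) - 32) * 5 / 9, 2),
    }


def gold(event: dict, threshold_c: float = 30.0) -> dict:
    """Enrich with business logic: flag readings above a threshold."""
    return {**event, "overheating": event["temp_c"] > threshold_c}


def pipeline(raw: dict) -> dict:
    """Run all three stages in order before the event is stored."""
    return gold(silver(bronze(raw)))
```

Calling `pipeline({"sensor_id": " S-01 ", "temp_f": "212"})` yields a normalized ID, a Celsius reading of 100.0, and `overheating` set to `True`.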

No Stack Required

Start ingesting on day one. Bastion writes clean Parquet files directly to S3 or GCS — no Kafka, no Spark jobs, no Glue pipelines to maintain. Add destinations as your stack grows.

AI-Ready by Design

Clean, structured data isn't just good engineering — it's what makes AI actually work. The validated, well-typed data that Bastion produces is exactly the context an LLM needs to query your data accurately. No preprocessing, no guesswork. Bastion doesn't just prepare your data for your stack — it prepares it for AI.

How Bastion Compares

Bastion isn't a replacement for Kafka. It's the layer that sits in front of it, ensuring your data is clean and properly routed.

| | Bastion | REST Proxy | Kafka Connect | Custom |
| --- | --- | --- | --- | --- |
| Memory footprint | ~20 MB | 512 MB+ (JVM) | 512 MB+ (JVM) | Varies |
| Schema validation | Built-in | Separate service | Limited | Manual |
| Data transformation | Bronze → Silver → Gold | None | SMTs (limited) | Manual |
| Multi-destination fan-out | Native | Single cluster | Single cluster | Manual |
| Edge deployable | | | | Depends |
| Parquet output | Native | | Requires Spark/Glue | Manual |
| Requires Kafka | No | Yes | Yes | Depends |
| Deployment | Single binary | JVM + Schema Registry | JVM + Kafka cluster | Varies |

Roadmap

Where we are and where we're heading.

Schema Management Dashboard

Django-based dashboard for CRUD operations, schema versioning, authentication, and datasource management.

REST API & OpenAPI Documentation

Full REST API with auto-generated OpenAPI specs via drf-spectacular.

Rust Ingestion Core

High-performance HTTP server, real-time validation, Bronze → Silver pipeline, and micro-batching engine.
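
Micro-batching groups events so that downstream writes happen in chunks instead of one request per event. The sketch below flushes when a batch reaches a size limit or when its oldest event exceeds an age limit. The `MicroBatcher` class and its parameters are hypothetical. For simplicity the age check only runs when a new event arrives, whereas a production engine would also flush on a timer.

```python
import time


class MicroBatcher:
    """Collect events and flush them as a batch by size or by age."""

    def __init__(self, flush, max_size=500, max_age_s=1.0, clock=time.monotonic):
        self.flush_fn = flush        # called with a list of events
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.clock = clock           # injectable for testing
        self.buffer = []
        self.first_at = None         # arrival time of the oldest buffered event

    def add(self, event):
        if not self.buffer:
            self.first_at = self.clock()
        self.buffer.append(event)
        too_big = len(self.buffer) >= self.max_size
        too_old = self.clock() - self.first_at >= self.max_age_s
        if too_big or too_old:
            self.flush()

    def flush(self):
        """Hand the current batch to the flush function, if there is one."""
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.flush_fn(batch)
```

The size and age limits trade latency against write efficiency: larger batches mean fewer, bigger writes, at the cost of events waiting longer before they land.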

Multi-Destination Fan-Out

Native routing to Kafka, S3, webhooks, and custom destinations from a single ingestion point.

Edge Buffering & Store-and-Forward

Local buffering when downstream is unavailable. Zero data loss, automatic catch-up on reconnect.
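
Store-and-forward can be sketched as a queue that only drops an event after the downstream send succeeds. The `StoreAndForward` class below is a hypothetical illustration that buffers in memory. A real edge buffer would persist to disk so that events survive a restart.

```python
from collections import deque


class StoreAndForward:
    """Buffer events locally and forward them in order when downstream is up."""

    def __init__(self, send):
        self.send = send          # raises ConnectionError when downstream is down
        self.pending = deque()

    def submit(self, event):
        self.pending.append(event)
        self.drain()

    def drain(self) -> int:
        """Send buffered events oldest-first; stop at the first failure."""
        delivered = 0
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                break             # keep the event and retry on the next drain
            self.pending.popleft()  # remove only after a successful send
            delivered += 1
        return delivered


# Simulated downstream that can be toggled on and off (assumption for the demo).
online = {"up": False}
received = []


def send(event):
    if not online["up"]:
        raise ConnectionError("downstream unavailable")
    received.append(event)


gateway = StoreAndForward(send)
gateway.submit(1)
gateway.submit(2)       # both events stay buffered while downstream is down
online["up"] = True
gateway.submit(3)       # reconnect: the backlog is sent first, in order
```

Because an event leaves the queue only after `send` returns, a failed delivery is retried rather than lost, which gives at-least-once delivery in this sketch.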

Python Worker Enrichment (Silver → Gold)

Pluggable Python workers for data enrichment — joins, API calls, ML inference — communicating via Apache Arrow IPC.

Observability & Circuit Breakers

Prometheus metrics, structured logging, distributed tracing, and automatic circuit breaking for downstream failures.
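
A circuit breaker stops calling a failing destination for a cooldown period instead of hammering it with retries. The sketch below is a generic version of the pattern. The `CircuitBreaker` class, its thresholds, and `CircuitOpenError` are hypothetical names, not Bastion's implementation.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock        # injectable for testing
        self.failures = 0
        self.opened_at = None     # None means the circuit is closed

    def call(self, fn, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open: call skipped")
            # Cooldown elapsed: fall through and allow one trial call (half-open).
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers get `CircuitOpenError` immediately and can buffer or reroute the event instead of waiting on a destination that is known to be down.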

Get Early Access

Be the first to know when Bastion is ready. No spam — just launch updates and early access.
