Biogas Platform

The Problem#

My nephew is an environmental engineer. He told me how biogas plants are managed day-to-day: Excel spreadsheets. Sensor readings copied by hand, daily logs that nobody analyzed afterwards, decisions made on intuition because the data was inert. No real-time view, no anomaly detection, no predictive maintenance, no business intelligence — just files that grew bigger every week and were never queried.

The opportunity was obvious. Replace the spreadsheets with a real platform that ingests sensor data, lets operators work from it, and turns the historical record into something the business can actually act on.

What Biogas Platform Does#

A full operations platform for biogas plants. Sensors stream data over MQTT to a Go backend; the data lands in PostgreSQL with time-series-friendly schemas; operators use a web dashboard and a mobile app for daily workflows; a Python ML pipeline trains models that get deployed to a Rust edge service for sub-50ms inference at the plant; an AI assistant helps operators query the system in plain language. The original Excel workflow disappears.

Loading diagram...

Topología edge→cloud: el gateway Rust opera local (offline-first) y sincroniza con el backend cuando hay link; los sensores también entran por MQTT.

Constraints I Set#

Edge has to work without internet. Plants can lose connectivity for hours; operations and anomaly detection must keep running locally.
Models are versioned production assets. No ad-hoc model drops — every model is trained, validated, packaged as ONNX, versioned in S3-compatible storage, and rolled out through a controlled deployment.
Specs-driven from day one. OpenSpec is the source of truth for contracts between apps; no API is "just shipped" without a spec first.
Role-aware from the start. Plant operators, technical staff, supervisors, and owners see different views and have different capabilities — the role model is part of the data layer, not a UI afterthought.

My Role#

Single developer. Started February 9, 2026. My nephew, an environmental engineer, is the domain expert who validates that the product matches how plants actually operate. I owned every technical decision in the stack:

The monorepo structure — what is an app, what is a shared package, what is a service, and where the boundaries sit.
The contract layer — OpenSpec as the source of truth between apps before any code is written.
The edge gateway design in Rust — Modbus protocol integration, offline-first SQLite store with sync queue, the AI subsystem layout (agents, classifier, model registry, local LLM), OTA model updates with signed artifacts.
The three-layer ML architecture — what runs at the edge, what runs in the cloud, what trains in batch, what infers in real time, and how models move between them.
The role model — operators, technical staff, supervisors, owners — as a data-layer concept, not a UI toggle.
The custom Biome plugin (eslint-plugin-biogas-ssot) that enforces single-source-of-truth conventions across apps at lint time.
The CI/CD pipeline on GitLab — model versioning to S3-compatible storage, parity tests as a deployment gate, controlled rollouts.

How Biogas Platform Started, And Why It Grew#

It began as a conversation: my nephew described the Excel reality, I described the platform that should replace it. The first version was modest — a Go backend, a Postgres database, a basic dashboard. Ingest sensor data, show it on a chart, replace the daily log.

Once the basic loop worked, the questions stacked up. If the data is in a real database, why not detect anomalies automatically? If we detect anomalies, why not predict failures? If we predict failures, why not run the inference at the plant so it works offline? If we run it at the plant, how do we update the models safely? Each answer added a layer, and the platform grew into what it is today.

Key Decisions#

1. Edge gateway in Rust as a self-sufficient industrial node#

The edge gateway is the heart of the system, not a thin inference wrapper. It is the component that the plant runs locally, and it has to keep working when everything else is unavailable — the WAN link, the cloud backend, the model registry. That is why it is the largest single application in the platform: 74 Rust source files, 18 subsystems, version 2.1.0, designed as an autonomous industrial node that happens to sync with the cloud when it can.

What the gateway actually does on the plant:

Talks to the PLCs over Modbus TCP/RTU — the industrial protocol that sensors and controllers actually speak. Holding, input, coils, and discrete-input registers, with configurable scale, offset, and data types (u16/i16/f32).
Persists every reading in a local SQLite store with a sync queue, so an outbound link drop just delays sync — it never loses data. Sync is HTTP batch with exponential retry, circuit breaker, and configurable batch sizes.
Runs ML inference locally via onnxruntime 2.0 (the ort crate) — anomaly detection on every reading without a cloud round-trip.
Hosts a local AI subsystem with on-device language models (llama_cpp), classifier, correlator, and a model registry with hardware-aware selection — picks the right model size for the gateway it is running on.
Has an AI agent framework: help agent, SQLite query agent, status agent — small specialized agents an operator can talk to from the plant dashboard without any cloud connection.
Supports OTA (over-the-air) updates with ed25519-signed model artifacts — models are downloaded, signature-verified, and deployed without restarting the gateway.
Exposes Prometheus metrics on :9090/metrics and health checks on :8888/health for monitoring; ships its own embedded dashboard PWA so an operator can inspect state without an external tool.

Why Rust: a process that runs unattended on plant hardware for weeks at a time, doing real-time IO with industrial protocols and ML inference, cannot afford memory leaks, GC pauses, or unhandled panics taking the gateway down. Rust delivers predictable performance, no GC, and compile-time guarantees that match the operational profile. The Tokio async runtime makes coordinating Modbus polling, SQLite writes, HTTP sync, and the AI subsystem realistic in a single process.

Why ONNX as the model interchange: the Python training pipeline (scikit-learn for anomaly models, transformers stack for NLP) exports to ONNX, and the Rust runtime consumes the exact same file. Parity tests verify that the Rust outputs match the Python outputs bit-identically before a model is ever promoted.

Tradeoff: scope and maintenance weight. The edge gateway is essentially its own product inside the platform — it ships its own version (2.1.0), its own configuration model (edge.toml), its own dashboard, its own CLI tooling for commissioning (validate config, dry-run a register read, convert a CSV of tags into an edge.toml draft). That breadth is the right answer for an industrial node, but it is a non-trivial amount of code to keep healthy alongside the rest of the platform.

Loading diagram...

Ciclo de vida del modelo: entrenado en Python, exportado a ONNX, y un test de paridad (Rust == Python, bit-identical) es el gate antes de versionar y desplegar por OTA firmado.

2. Three-layer AI architecture#

Instead of treating ML as a feature stuck onto a screen, the platform has three explicit AI layers, each with its own purpose, latency budget, and lifecycle:

Edge inference layer — Isolation Forest + Autoencoder running locally on ONNX, scoring anomalies on every sensor reading in under 50ms.
Anomaly detection layer — 32 engineered features (temporal, change, z-score, co-variation, data quality, biogas-domain) fed into ensemble voting with dynamic per-sensor thresholds. SHAP attributions explain why a reading was flagged.
Predictive AI layer — LSTM + Prophet for biogas/energy forecasting, Random Forest + XGBoost for equipment failure prediction 4-24 hours in advance, plus optimization recommendations for operating parameters. Continuous learning with automated retraining; PSI-based drift detection monitors model degradation in production.

Each layer is independent: edge keeps working if the cloud is offline; anomaly detection works without the predictive layer; the predictive layer can be retrained without touching the edge runtime. The separation is what makes the system operable, not just impressive.

Tradeoff: model lifecycle weight. Three layers means three training pipelines, three model registries, three deployment paths, three sets of drift monitoring. It is a lot of infrastructure for a single dev to maintain — only worth it because each layer pays back operationally.

Loading diagram...

Tres capas de IA independientes: el edge sigue detectando anomalías aunque el cloud esté caído; la capa predictiva se reentrena sin tocar el runtime del edge.

3. Spec-driven development with OpenSpec from day one#

The first commit was literally "init: project structure with openspec specs and tooling". Every contract between apps — backend to frontend, edge to backend, ML service to backend — has a spec before any code is written. OpenSpec is the source of truth; the implementation has to match it.

Tradeoff: process overhead at the start. Every new endpoint takes longer because the spec comes first. The payback is that when a contract changes, all consumers see the diff explicitly, and AI assistants generating client code can use the spec instead of guessing from the implementation.

What Biogas Platform Can Do Today#

5 applications in one monorepo — Go backend, Rust edge gateway, Python ML service, React/Vite web frontend, mobile app.
Real-time ingestion over MQTT (Mosquitto broker), persistence in PostgreSQL with Redis for hot data, time-series-aware schemas.
Autonomous edge gateway in Rust (v2.1.0) — Modbus TCP/RTU to PLCs, offline-first SQLite with sync queue, sub-50ms ONNX anomaly inference, on-device LLM with AI agents, OTA model updates with ed25519 signature verification, embedded dashboard PWA.
Three-layer AI: edge anomaly detection (Isolation Forest + Autoencoder), cloud anomaly analysis with SHAP explainability, predictive layer with LSTM + Prophet forecasts and Random Forest + XGBoost failure predictions 4-24h ahead.
Predictive maintenance based on real equipment condition replaces reactive or calendar-based maintenance, reducing unplanned downtime.
Role-aware UI for operators, technical staff, supervisors, and owners — distinct views and permissions, not toggles on a single dashboard.
OpenSpec contracts for every cross-app API; custom Biome plugin (eslint-plugin-biogas-ssot) enforces single-source-of-truth conventions at lint time.
GitLab CI/CD with versioned model storage on S3-compatible object storage, controlled model rollouts, and drift monitoring feeding retraining decisions.

What I'd Reconsider#

I waited too long to ship something. The instinct was to deliver a mature product — the three AI layers, the mobile app, the role model, the edge inference, all done well before showing it. That is not how this should have gone.

A smaller, earlier delivery would have been the right call. A backend that ingests sensor data and a dashboard that shows it, shipped in week three, would have given my nephew something real to use immediately. Feedback from actual plant operations would have shaped the priorities for what comes next. Instead I built outward — adding layers, apps, and capabilities — before any of it ran against real daily use.

The cost is invisible from the outside: the platform looks comprehensive and the architecture is clean. The cost is in what I never learned because nothing was in front of users early enough. The next product I build, I will ship the smallest thing that solves the original problem in the first month, and earn the right to add layers from there.

Architecture Snapshot#

Monorepo with five applications and supporting packages:

Loading diagram...

5 apps en un monorepo, con OpenSpec como contrato único entre ellas (validado en lint con un plugin Biome propio).

apps/backend — Go + GORM + Gin, REST API, MQTT subscriber, OpenSpec source of truth.
apps/edge-gateway — Rust (Tokio async runtime), Modbus TCP/RTU client to PLCs, SQLite local store with sync queue, onnxruntime 2.0 inference, local LLM via llama_cpp, AI agent framework, OTA model updates with ed25519 signing, embedded dashboard PWA, Prometheus metrics, health endpoints.
apps/ml-service — Python + scikit-learn, training pipelines, SHAP explainability, ONNX export with parity tests.
apps/frontend-vite — React 19 + Vite + Mantine + Recharts, operator and analyst dashboards.
apps/mobile — companion app on Ionic + Capacitor + Vite for operator field workflows.
packages/eslint-plugin-biogas-ssot — custom lint rules that enforce single-source-of-truth across apps.
services/ai-assistant — natural language query interface over the operational data.

Persistence is PostgreSQL for canonical state, Redis for hot data and caching, S3-compatible object storage for model artifacts and large blobs. Messaging is Mosquitto MQTT. Deployment is GitLab CI/CD with Docker images per app and Caddyfile-based edge routing.

The Problem#

What Biogas Platform Does#

Loading diagram...

Topología edge→cloud: el gateway Rust opera local (offline-first) y sincroniza con el backend cuando hay link; los sensores también entran por MQTT.

Constraints I Set#

Edge has to work without internet. Plants can lose connectivity for hours; operations and anomaly detection must keep running locally.
Models are versioned production assets. No ad-hoc model drops — every model is trained, validated, packaged as ONNX, versioned in S3-compatible storage, and rolled out through a controlled deployment.
Specs-driven from day one. OpenSpec is the source of truth for contracts between apps; no API is "just shipped" without a spec first.
Role-aware from the start. Plant operators, technical staff, supervisors, and owners see different views and have different capabilities — the role model is part of the data layer, not a UI afterthought.

My Role#

The monorepo structure — what is an app, what is a shared package, what is a service, and where the boundaries sit.
The contract layer — OpenSpec as the source of truth between apps before any code is written.
The edge gateway design in Rust — Modbus protocol integration, offline-first SQLite store with sync queue, the AI subsystem layout (agents, classifier, model registry, local LLM), OTA model updates with signed artifacts.
The three-layer ML architecture — what runs at the edge, what runs in the cloud, what trains in batch, what infers in real time, and how models move between them.
The role model — operators, technical staff, supervisors, owners — as a data-layer concept, not a UI toggle.
The custom Biome plugin (eslint-plugin-biogas-ssot) that enforces single-source-of-truth conventions across apps at lint time.
The CI/CD pipeline on GitLab — model versioning to S3-compatible storage, parity tests as a deployment gate, controlled rollouts.

How Biogas Platform Started, And Why It Grew#

Key Decisions#

1. Edge gateway in Rust as a self-sufficient industrial node#

What the gateway actually does on the plant:

Talks to the PLCs over Modbus TCP/RTU — the industrial protocol that sensors and controllers actually speak. Holding, input, coils, and discrete-input registers, with configurable scale, offset, and data types (u16/i16/f32).
Persists every reading in a local SQLite store with a sync queue, so an outbound link drop just delays sync — it never loses data. Sync is HTTP batch with exponential retry, circuit breaker, and configurable batch sizes.
Runs ML inference locally via onnxruntime 2.0 (the ort crate) — anomaly detection on every reading without a cloud round-trip.
Hosts a local AI subsystem with on-device language models (llama_cpp), classifier, correlator, and a model registry with hardware-aware selection — picks the right model size for the gateway it is running on.
Has an AI agent framework: help agent, SQLite query agent, status agent — small specialized agents an operator can talk to from the plant dashboard without any cloud connection.
Supports OTA (over-the-air) updates with ed25519-signed model artifacts — models are downloaded, signature-verified, and deployed without restarting the gateway.
Exposes Prometheus metrics on :9090/metrics and health checks on :8888/health for monitoring; ships its own embedded dashboard PWA so an operator can inspect state without an external tool.

Loading diagram...

Ciclo de vida del modelo: entrenado en Python, exportado a ONNX, y un test de paridad (Rust == Python, bit-identical) es el gate antes de versionar y desplegar por OTA firmado.

2. Three-layer AI architecture#

Instead of treating ML as a feature stuck onto a screen, the platform has three explicit AI layers, each with its own purpose, latency budget, and lifecycle:

Edge inference layer — Isolation Forest + Autoencoder running locally on ONNX, scoring anomalies on every sensor reading in under 50ms.
Anomaly detection layer — 32 engineered features (temporal, change, z-score, co-variation, data quality, biogas-domain) fed into ensemble voting with dynamic per-sensor thresholds. SHAP attributions explain why a reading was flagged.
Predictive AI layer — LSTM + Prophet for biogas/energy forecasting, Random Forest + XGBoost for equipment failure prediction 4-24 hours in advance, plus optimization recommendations for operating parameters. Continuous learning with automated retraining; PSI-based drift detection monitors model degradation in production.

Loading diagram...

Tres capas de IA independientes: el edge sigue detectando anomalías aunque el cloud esté caído; la capa predictiva se reentrena sin tocar el runtime del edge.

3. Spec-driven development with OpenSpec from day one#

What Biogas Platform Can Do Today#

5 applications in one monorepo — Go backend, Rust edge gateway, Python ML service, React/Vite web frontend, mobile app.
Real-time ingestion over MQTT (Mosquitto broker), persistence in PostgreSQL with Redis for hot data, time-series-aware schemas.
Autonomous edge gateway in Rust (v2.1.0) — Modbus TCP/RTU to PLCs, offline-first SQLite with sync queue, sub-50ms ONNX anomaly inference, on-device LLM with AI agents, OTA model updates with ed25519 signature verification, embedded dashboard PWA.
Three-layer AI: edge anomaly detection (Isolation Forest + Autoencoder), cloud anomaly analysis with SHAP explainability, predictive layer with LSTM + Prophet forecasts and Random Forest + XGBoost failure predictions 4-24h ahead.
Predictive maintenance based on real equipment condition replaces reactive or calendar-based maintenance, reducing unplanned downtime.
Role-aware UI for operators, technical staff, supervisors, and owners — distinct views and permissions, not toggles on a single dashboard.
OpenSpec contracts for every cross-app API; custom Biome plugin (eslint-plugin-biogas-ssot) enforces single-source-of-truth conventions at lint time.
GitLab CI/CD with versioned model storage on S3-compatible object storage, controlled model rollouts, and drift monitoring feeding retraining decisions.

What I'd Reconsider#

Architecture Snapshot#

Monorepo with five applications and supporting packages:

Loading diagram...

5 apps en un monorepo, con OpenSpec como contrato único entre ellas (validado en lint con un plugin Biome propio).

apps/backend — Go + GORM + Gin, REST API, MQTT subscriber, OpenSpec source of truth.
apps/edge-gateway — Rust (Tokio async runtime), Modbus TCP/RTU client to PLCs, SQLite local store with sync queue, onnxruntime 2.0 inference, local LLM via llama_cpp, AI agent framework, OTA model updates with ed25519 signing, embedded dashboard PWA, Prometheus metrics, health endpoints.
apps/ml-service — Python + scikit-learn, training pipelines, SHAP explainability, ONNX export with parity tests.
apps/frontend-vite — React 19 + Vite + Mantine + Recharts, operator and analyst dashboards.
apps/mobile — companion app on Ionic + Capacitor + Vite for operator field workflows.
packages/eslint-plugin-biogas-ssot — custom lint rules that enforce single-source-of-truth across apps.
services/ai-assistant — natural language query interface over the operational data.

Biogas Platform

Descripción del Proyecto

The Problem#

What Biogas Platform Does#

Constraints I Set#

My Role#

How Biogas Platform Started, And Why It Grew#

Key Decisions#

1. Edge gateway in Rust as a self-sufficient industrial node#

2. Three-layer AI architecture#

3. Spec-driven development with OpenSpec from day one#

What Biogas Platform Can Do Today#

What I'd Reconsider#

Architecture Snapshot#

Tecnologías

Información

¿Te interesa trabajar conmigo?

Biogas Platform

Descripción del Proyecto

The Problem#

What Biogas Platform Does#

Constraints I Set#

My Role#

How Biogas Platform Started, And Why It Grew#

Key Decisions#

1. Edge gateway in Rust as a self-sufficient industrial node#

2. Three-layer AI architecture#

3. Spec-driven development with OpenSpec from day one#

What Biogas Platform Can Do Today#

What I'd Reconsider#

Architecture Snapshot#

Tecnologías

Información

¿Te interesa trabajar conmigo?