syslog-harness/README.md

# Syslog Harness

Operational orchestration layer for Syslog's internal AI agents.

## Architecture

```
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Agent      │────>│  Nginx       │────>│  GPU Pool   │
│  (Hermes)   │     │  Router      │     │  (MoE/Dense)│
└─────────────┘     └──────────────┘     └─────────────┘
                         │
                         ├──> :8091 Queue Service (Docker)
                         │
                         └──> :3001 Dashboard (Docker)
```

## Components

| Service | Port | Container | Purpose |
|---|---|---|---|
| Nginx Router | 8080 | Host | Routes requests to GPU backends |
| Queue Service | 8091 | `syslog-queue` | Enqueues requests when GPUs are down |
| Dashboard | 3001 | `syslog-dashboard` | Observability UI + API |

## GPU Routing

| Header `X-Syslog-Model` | Backend | Model |
|---|---|---|
| (none) / `standard` | amdpve (.15) | qwen3.6-35B-A3B (MoE) |
| `heavy` / `qwen3.5-27B` | llmgpu (.8) | qwen3.5-27B (Dense) |
| `light` / `gemma-4` | ocu_llm (.110) | gemma-4-E4B (Light) |

## Quick Start

```bash
# Build & start
docker compose build
docker compose up -d

# Verify
curl http://localhost:8091/health
curl http://localhost:3001/api/status
```

## Dashboard

- **UI:** `http://<host>:8080/dashboard/harness.html`
- **API:** `http://<host>:8080/dashboard/api/status`

## Circuit Breaker

- Rate limit: 10 req/s per IP
- Burst: 20 requests
- Excess returns 503
- Queue fallback on GPU 502/503

## Production Migration

See [MIGRATION_PLAN.md](./MIGRATION_PLAN.md)

---
*Built for Syslog Solution LLC — Quality over speed.*