abiba-bot/inference-harness: SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy - inference-harness - Syslog Solution Repository

Explore Help

Register Sign In

abiba-bot/inference-harness

Code Issues Pull Requests 1 Actions Packages Projects Releases Wiki Activity

28 Commits 1 Branch 0 Tags

93d0d3cc4b158be9e5667bd14c63e1e37573c2ef

T

Clone

Open with VS Code Open with VSCodium Open with Intellij IDEA

Download ZIP Download TAR.GZ Download BUNDLE

Abiba 93d0d3cc4b revert: MoE concurrency back to 2 (Dense-first routing handles thermal)

2026-05-27 00:04:42 +00:00

fix: default performance window to 24h so all models appear immediately

2026-05-26 12:37:52 +00:00

feat: per-request performance tracking + /metrics/performance endpoint

2026-05-25 16:50:45 +00:00

revert: MoE concurrency back to 2 (Dense-first routing handles thermal)

2026-05-27 00:04:42 +00:00

.gitignore

May 19, 2026: Full harness update

2026-05-19 15:03:47 +00:00

docker-compose.yml

fix: non-blocking GPU health checks + 256K turboquant context upgrade

2026-05-23 05:57:13 +00:00

docker-compose.yml.bak

May 19, 2026: Full harness update

2026-05-19 15:03:47 +00:00

litellm_config.yaml

May 19, 2026: Full harness update

2026-05-19 15:03:47 +00:00

maintenance.sh

May 19, 2026: Full harness update

2026-05-19 15:03:47 +00:00

S

Description

SyslogAI Inference Harness — 3-GPU router, dashboard, LiteLLM proxy

916 KiB

Languages

Python 77%

HTML 22%

Shell 0.8%

Dockerfile 0.2%

Powered by Gitea Version: 1.27.0 Page: 19ms Template: 2ms

Auto

English

Bahasa Indonesia Deutsch English Español Français Gaeilge Italiano Latviešu Magyar nyelv Nederlands Polski Português de Portugal Português do Brasil Suomi Svenska Türkçe Čeština Ελληνικά Български Русский Українська فارسی മലയാളം 日本語简体中文繁體中文（台灣）繁體中文（香港） 한국어

Licenses API