Firebird API

LLM Inference Engine with BitNet Support.

Module: src/firebird/

CLI Commands

Chat Mode

./bin/firebird chat --model path/to/model.gguf

Server Mode

./bin/firebird serve --port 8080 --model model.gguf

HTTP API

POST /v1/chat/completions

OpenAI-compatible chat endpoint.

{
  "model": "bitnet-3b",
  "messages": [{"role": "user", "content": "Hello!"}],
  "temperature": 0.7
}
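Because the endpoint follows the OpenAI chat-completions schema, any OpenAI-style client can talk to it. A minimal sketch using only the Python standard library, assuming a Firebird server running on localhost port 8080 (the `build_chat_request` helper and its defaults are illustrative, not part of the Firebird API):

```python
import json
import urllib.request

def build_chat_request(prompt, model="bitnet-3b", temperature=0.7,
                       base_url="http://localhost:8080"):
    """Build a POST request for Firebird's OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Send the request and read the assistant's reply. In the OpenAI schema,
# the generated text lives at choices[0].message.content.
req = build_chat_request("Hello!")
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```

The actual call is commented out so the snippet can be read without a running server; uncomment it once `firebird serve` is up.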

Performance

| Model Size | Memory | Tokens/sec |
|------------|--------|------------|
| 1.5B       | ~1GB   | 15-20      |
| 3B         | ~2GB   | 8-12       |
| 7B         | ~4GB   | 4-6        |