CURRENT PROGRESS
27 tokens/s on 8B-parameter models
Consumes under 10 W during inference
Costs under $100
Everything you see in the video runs locally (voice transcription, voice-to-text, and the LLM)
Exponential progress coming!
Now, we're pushing performance per dollar
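A quick back-of-envelope sketch of what those headline figures imply, using the stated upper bounds (27 tokens/s, 10 W, $100) as assumed inputs:

```python
# Derived efficiency metrics from the figures in this update.
# All three inputs are the stated upper bounds, not measured exact values.
TOKENS_PER_S = 27   # throughput on an 8B-parameter model
POWER_W = 10        # power draw during inference
COST_USD = 100      # hardware cost

# Energy per generated token (joules = watts / tokens-per-second).
joules_per_token = POWER_W / TOKENS_PER_S

# Throughput per dollar of hardware, the metric being pushed next.
tokens_per_s_per_dollar = TOKENS_PER_S / COST_USD

print(f"{joules_per_token:.2f} J/token")            # ~0.37 J/token
print(f"{tokens_per_s_per_dollar:.2f} (tok/s)/$")   # ~0.27 (tok/s)/$
```

At these numbers, a full day of continuous generation would cost well under a quarter of a kilowatt-hour.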
Want to contribute custom inference firmware, hardware, or a novel AI model? Let's talk!
