Posted by AbuAssar 6 days ago
I use an older Google Coral TPU running in my home lab being used by Frigate NVR for object detection for security cameras. It's more efficient, but less flexible than running it on the GPU.
Don't know if I need an NPU for my daily driver computer, but I would want one for my next home server.
Any context that needs some limited intelligence while consuming little power would benefit from this.
AMD employees work on it/have been making blog posts about it for a bit.
Found this on the github readme.
[1]: https://github.com/lemonade-sdk/lemonade/releases/tag/v10.0....
This way software adoption will be very limited.
"FastFlowLM (FLM) support in Lemonade is in Early Access. FLM is free for non-commercial use, however note that commercial licensing terms apply. "
Lemonade is really just a management plane/proxy. It translates ollama/anthropic APIs to OpenAI format for llama.cpp. It runs different backends for sst/tts and image generation. Lets you manage it all in one place.