Yes, but that is normal I guess:
Don't need more than 8 GB. It'll be enough power. It can do audio-to-audio.
I was able to run speech-to-text on my old Pixel 4, but it's a bit flaky (the background process occasionally loses the audio device). I just want it to catch a wake word, send everything to a remote LLM, and then run TTS on the text that comes back.
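For what it's worth, the plumbing for that loop is simple. Here's a minimal Python sketch of the structure, with the wake word and speech-to-text stages left as stubs (openWakeWord and whisper.cpp are obvious candidates); LLM_URL and the OpenAI-style chat endpoint are placeholders, not a real server:

    # Sketch only: wake word -> record/transcribe -> remote LLM -> local TTS.
    # LLM_URL is a placeholder; the stubs are where openWakeWord /
    # whisper.cpp (or similar) would be wired in.
    import requests
    import pyttsx3

    LLM_URL = "http://my-server.local:8000/v1/chat/completions"  # placeholder

    def wait_for_wake_word() -> None:
        """Block until the wake word fires (e.g. via openWakeWord)."""
        raise NotImplementedError

    def record_and_transcribe() -> str:
        """Capture audio after the wake word and run local speech-to-text."""
        raise NotImplementedError

    def main() -> None:
        tts = pyttsx3.init()
        while True:
            wait_for_wake_word()
            prompt = record_and_transcribe()
            resp = requests.post(LLM_URL, json={
                "model": "local-llm",
                "messages": [{"role": "user", "content": prompt}],
            }, timeout=60)
            reply = resp.json()["choices"][0]["message"]["content"]
            tts.say(reply)
            tts.runAndWait()

    if __name__ == "__main__":
        main()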
I was only using it for local Home Assistant tasks; I didn't try anything further, like retrieving sports scores or managing to-do lists.
TinyML is a book that goes through the process of building a wake word model for such constrained environments.
1. Can I run a local LLM that lets me control Home Assistant with natural language? Some basic stuff like timers and to-do/shopping lists would be nice (a rough sketch of the plumbing follows below).
2. Can I run object/person detection on local video streams?
I want some AI stuff, but I want it local.
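On point 1, the Home Assistant side is the easy part regardless of where the LLM runs. A hedged sketch, assuming a standard Home Assistant install: it posts text to HA's REST conversation endpoint (/api/conversation/process), which can be backed by a local LLM conversation agent. The URL and token are placeholders, and the response shape can vary by HA version:

    # Sketch: send a natural-language command to Home Assistant's
    # conversation API. HA_URL and HA_TOKEN are placeholders.
    import requests

    HA_URL = "http://homeassistant.local:8123"  # placeholder
    HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"   # placeholder

    def send_command(text: str) -> str:
        resp = requests.post(
            f"{HA_URL}/api/conversation/process",
            headers={"Authorization": f"Bearer {HA_TOKEN}"},
            json={"text": text, "language": "en"},
            timeout=30,
        )
        resp.raise_for_status()
        # Response shape may differ across HA versions.
        return resp.json()["response"]["speech"]["plain"]["speech"]

    print(send_command("set a timer for 10 minutes"))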
Looks like the answer for this one is: Meh. It can do point 2, but it's not the best option.
2. has been possible in real time since the first Pi camera was released, and it has most likely improved since. I did this years ago on the Pi Zero and it was surprisingly good.
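For reference, the classic pre-accelerator approach runs even on weak hardware. A minimal sketch using OpenCV's stock HOG person detector (assumes opencv-python is installed and a camera at index 0):

    # Sketch: classic person detection on a local video stream with OpenCV's
    # built-in HOG + linear SVM people detector. No NPU required.
    import cv2

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    cap = cv2.VideoCapture(0)  # assumes a camera at index 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        boxes, _weights = hog.detectMultiScale(frame, winStride=(8, 8))
        for (x, y, w, h) in boxes:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow("people", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()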
No. Get the larger Pi recommended in the article.
Quote from the article:
> So power holds it back, but the 8 gigs of RAM holds back the LLM use case (vs just running on the Pi's CPU) the most. The Pi 5 can be bought in up to a 16 GB configuration. That's as much as you get in decent consumer graphics cards.
> Because of that, many quantized medium-size models target 10-12 GB of RAM usage (leaving space for context, which eats up another 2+ GB of RAM).
…
> 8 GB of RAM is useful, but it's not quite enough to give this HAT an advantage over just paying for the bigger 16GB Pi with more RAM, which will be more flexible and run models faster.
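The arithmetic behind those numbers is straightforward: weights take roughly params × bits-per-weight / 8 bytes, plus a couple of GB for context. A rough sketch with illustrative parameter counts, not specific models:

    # Back-of-the-envelope RAM math for quantized models (illustrative only).
    def model_ram_gb(params_billions: float, bits_per_weight: float = 4.5) -> float:
        return params_billions * 1e9 * bits_per_weight / 8 / 1e9

    for b in (8, 14, 24):
        print(f"{b}B @ ~4.5 bits/weight: ~{model_ram_gb(b):.1f} GB + 2+ GB context")

So a ~14B model at ~4.5 bits per weight already lands around 8 GB before context, which is why the 10-12 GB targets don't fit on this HAT.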
The model specs shown for this device in the article are small, and not fit for purpose even for the relatively trivial use case you mentioned.
I mean, lots of people have lots of opinions about this (many of them wrong), and it's cheap, so you can buy one and try… but look: the OP really gave it a shot, and the results were kind of shit. The article is pretty clear.
Don’t bother.
You want a device with more memory to mess around with for what you want to do.
I once tried to run a segmentation model based on a vision transformer on a PC. That model used somewhere around 1 GB for the parameters and several gigabytes for the KV cache, and it was almost entirely compute-bound. You couldn't run that type of model on previous AI accelerators, because they only supported model sizes in the megabytes range.
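To put a number on that: at fp32, parameter memory alone is params × 4 bytes, so ~1 GB of weights corresponds to roughly a 250M-parameter model (an assumed, illustrative figure, since the exact model isn't specified):

    # Back-of-the-envelope: fp32 weight memory for the ViT example above.
    params = 250e6       # assumed, illustrative parameter count
    bytes_per_param = 4  # fp32
    print(f"weights: {params * bytes_per_param / 1e9:.1f} GB")  # ~1.0 GB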
Case closed. And that's extremely slow to begin with; the Pi 5 only gets, what, a 32-bit memory bus? Laughable performance for a purpose-built ASIC that costs more than the Pi itself.
> In my testing, Hailo's hailo-rpi5-examples were not yet updated for this new HAT, and even if I specified the Hailo 10H manually, model files would not load
Laughable levels of support too.
As another data point, I recently managed to get the 8L working natively on Ubuntu 24 with ROS, but only after significant shenanigans involving recompiling the kernel module and building their Python library for 3.12, which Hailo for some reason doesn't provide outside 3.11. They only support Pi OS (like anyone would use that in prod), and even that support is very spotty. Why would you not target the most popular robotics distro for an AI accelerator? Who else is going to buy these things, exactly?