Posted by jamesxv7 6/30/2025

Ask HN: What's the 2025 stack for a self-hosted photo library with local AI?

First of all, this is purely a personal learning project for me, aiming to combine three of my passions: photography, software engineering, and my family memories. I have a large collection of family photos and want to build an interactive experience to explore them, à la Google Photos or Apple Photos.

My goal is to create a system with smart search capabilities, and one of the most important requirements is that it must run entirely on my local hardware. Privacy is key, but the main driver is the challenge and joy of building it myself (and, obviously, learning along the way).

The key features I'm aiming for are:

Automatic identification and tagging of family members (local face recognition).

Generation of descriptive captions for each photo.

Natural language search (e.g., "Show me photos of us at the beach in Luquillo from last summer").

I've already prompted AI tools for a high-level project plan, and they provided a solid blueprint (e.g., Ollama with LLaVA, a vector DB like ChromaDB, you know the drill). Now, I'm highly interested in the real-world human experience. I'm looking for advice, learning stories, and the little details that only come from building something similar.

What tools, models, and best practices would you recommend for a project like this in 2025? Specifically, I'm curious about combining structured metadata (EXIF), face recognition data, and semantic vector search into a single, cohesive application.

Any and all advice would be deeply appreciated. Thanks!
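
One way the combination asked about above (EXIF metadata, face labels, and vector search in one application) could be sketched is to keep everything in a single SQLite file: filter on the structured fields in SQL, then rank the surviving rows by embedding similarity. This is a minimal sketch, not a recommended design. The table names, column names, person labels, and toy 2-dimensional vectors are all hypothetical; a real system would store embeddings produced by an image encoder.

```python
import json
import sqlite3


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def search(conn, query_vec, person, start, end, limit=10):
    """Filter by face label and EXIF date in SQL, then rank survivors by similarity."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.path, p.embedding
        FROM photos p JOIN faces f ON f.photo_id = p.id
        WHERE f.person = ? AND p.taken_at BETWEEN ? AND ?
        """,
        (person, start, end),
    ).fetchall()
    scored = [(dot(query_vec, json.loads(emb)), path) for path, emb in rows]
    return [path for _, path in sorted(scored, reverse=True)[:limit]]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (
        id INTEGER PRIMARY KEY,
        path TEXT NOT NULL,
        taken_at TEXT NOT NULL,   -- ISO date, e.g. taken from EXIF
        embedding TEXT NOT NULL   -- JSON list of floats (hypothetical encoder output)
    );
    CREATE TABLE faces (
        photo_id INTEGER REFERENCES photos(id),
        person TEXT NOT NULL      -- label assigned by a face recognition step
    );
""")
# Toy data: the vectors are made up and not normalized.
conn.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
    (1, "beach_2024.jpg", "2024-07-14", json.dumps([0.9, 0.1])),
    (2, "beach_2019.jpg", "2019-07-02", json.dumps([0.9, 0.1])),
    (3, "dinner_2024.jpg", "2024-08-01", json.dumps([0.1, 0.9])),
    (4, "beach_luis.jpg", "2024-07-20", json.dumps([0.8, 0.2])),
])
conn.executemany("INSERT INTO faces VALUES (?, ?)",
                 [(1, "ana"), (2, "ana"), (3, "ana"), (4, "luis")])

# "Photos of ana at the beach last summer": [1.0, 0.0] stands in for the text query embedding.
hits = search(conn, [1.0, 0.0], "ana", "2024-06-01", "2024-08-31")
```

Filtering first keeps the similarity pass small, since only photos that already match the person and date range are scored.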

230 points | 121 comments | page 2
mlunar 7/1/2025|
It's a pretty deep rabbit hole. For semantic search, CLIP and cosine similarity are just fine. SmolVLM(2), mentioned by spacecadet, looks interesting though. I haven't integrated face recognition myself, but [deepface] seemed pretty complete.

I focused more on fast rendering in [photofield] (quick [explainer] if you're interested), but even the hacked up basic semantic search with CLIP works better than it has any right to. Vector DBs are cool, but what is cooler is writing float arrays to sqlite :)

[deepface]: https://github.com/serengil/deepface

[photofield]: https://github.com/SmilyOrg/photofield

[explainer]: https://lnar.dev/blog/photofield-origins/
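
The "float arrays in sqlite" idea above could look roughly like this: pack each embedding into a BLOB column and do a brute-force cosine similarity scan at query time. This is a standard-library-only sketch and not how photofield itself does it. The table layout, helper names, and toy 3-dimensional vectors are assumptions standing in for real CLIP embeddings.

```python
import math
import sqlite3
from array import array


def to_blob(vec):
    """Pack a list of floats into raw float32 bytes."""
    return array("f", vec).tobytes()


def from_blob(blob):
    """Unpack raw float32 bytes back into a list of floats."""
    out = array("f")
    out.frombytes(blob)
    return out.tolist()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query_vec, limit=5):
    """Brute-force scan: score every stored vector against the query."""
    rows = conn.execute("SELECT path, embedding FROM photos").fetchall()
    scored = [(cosine(query_vec, from_blob(blob)), path) for path, blob in rows]
    return sorted(scored, reverse=True)[:limit]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (path TEXT PRIMARY KEY, embedding BLOB NOT NULL)")
# Toy vectors; a real library would store the image encoder's output here.
conn.executemany(
    "INSERT INTO photos VALUES (?, ?)",
    [
        ("beach.jpg", to_blob([0.9, 0.1, 0.0])),
        ("forest.jpg", to_blob([0.0, 0.2, 0.9])),
        ("pool.jpg", to_blob([0.7, 0.3, 0.1])),
    ],
)
results = search(conn, [1.0, 0.0, 0.0], limit=2)
```

A full scan like this is linear in the number of photos, which is often acceptable for a personal library; the pure-Python scoring loop is the first thing to swap for a vectorized library if it gets slow.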

sneak 6/30/2025||
I believe Ente supports all of this, and can be self-hosted. All of the AI stuff is done locally.

I pay them for service/storage as it’s e2ee and it doesn’t matter to me if they or I store the encrypted blobs.

They also have a CLI tool you can run from cron on your NAS or whatever to make sure you have a complete local copy of your data, too.

https://ente.io - if you use the referral code SNEAK we both get additional free storage.

xtrememarketers 7/5/2025||
For a 2025 self-hosted photo library with local AI, a great stack is PhotoPrism or Immich running on Docker with local storage (NAS or server). Both offer built-in face/object detection without cloud reliance, using local TensorFlow or custom models. Add a reverse proxy like Nginx for HTTPS, and use a GPU for faster tagging. Deployment is easy with Docker Compose on Raspberry Pi or x86 servers. For managing costs on gear or upgrades, try this handy Percent Off Calculator (https://www.calculate-percent.com/) to figure out your savings quickly.
chrisgd 6/30/2025||
This is my dream. I started building something that would upload all my photos from my phone to my desktop, back them up somewhere and then present them 6 at a time on a local website solely so you could look at them again and decide if you wanted to keep them. Heart any you wanted to keep, favorite some, and delete the rest then show me 6 more.

The addition of an AI tool is a great idea.

ksec 7/1/2025||
Slightly off topic: I have always wanted (old) Apple to make a Time Machine / personal cloud where data is stored and processed on my own property, with Apple offering only subscription-based cloud storage for long-term backup, plus software updates.

As for features, I don't know why there isn't a tag for screen caps. I've made lots of them and I want to group them together.

hammyhavoc 7/1/2025|
That sounds similar in concept to the original Apple TV (before the black puck one) that had a hard drive and basically ran Front Row (view your photos on your TV etc), but combined with the oldskool Apple Time Capsule.
gerdesj 6/30/2025||
Nextcloud with a few addons. Now this might look like overkill for your use case but I get the impression that you might want to go further in future.

Stock NC gets you a very solid general purpose document management system and with a few addons, you basically get self hosted SharePoint and OneDrive without the baggage. The images/pictures side of things has seen quite a lot of development and with some addons you get image classification with fairly minimal effort.

The system as a whole will quite happily handle many hundreds of thousands of files on pretty rubbish hardware if you are happy to wait for batch jobs to run; otherwise, throw more hardware at it and speed up the job schedules.

NC has a stock phone app which works very well these days, including camera folder uploads. There are several more apps that integrate with the main one to add optional functionality, for example notes and VoIP.

It is a very large and mature setup with loads of documentation and hence extensible by a determined hacker if something is missing.

ssnepenthe 6/30/2025||
The gallery I use has an "internals" page in their docs: https://docs.home-gallery.org/internals/

It gives a sort of high level system overview that might provide some useful insights or inspiration for you.

weinzierl 6/30/2025||
In addition to all of that I want an AI solution that pre-selects good images for me, so I do not have to go through all of them manually. Similar to Apple Memories or Featured Photos. Is there anything self-hosted like that?
simonw 6/30/2025||
There are some spectacular local models for generating text descriptions of images now. I suggest starting with Mistral Small 3.2, Gemma 3 and Qwen 2.5VL - all available via Ollama.

I expect we will see a Qwen 3VL soon.
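
For reference, asking a local Ollama server to caption a photo can be sketched with nothing but the standard library. The `/api/generate` endpoint, the base64 `images` field, and `stream: false` are part of Ollama's REST API; the model tag `gemma3`, the prompt wording, and the helper names are assumptions for illustration, and the tag must already be pulled locally for the call to succeed.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_caption_request(image_bytes, model="gemma3"):
    """Build the JSON body for a single non-streaming caption request."""
    return {
        "model": model,  # assumption: this tag has already been pulled locally
        "prompt": "Describe this photo in one sentence.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def caption(image_path, model="gemma3"):
    """Send one image to the local Ollama server and return the generated text."""
    with open(image_path, "rb") as f:
        body = build_caption_request(f.read(), model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `caption("some_photo.jpg")` requires a running Ollama instance; the returned text can then be stored next to the photo's other metadata and indexed for search.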

pmetras 7/2/2025|
If you want light configuration requirements with no database, you can try to enhance https://gitlab.com/paolobenve/myphotoshare. MyPhotoShare is a static photo-gallery where AI features have been added through extensions of the parser. This is a one-developer project, mostly Python and JavaScript, and he is open to contributions.