Posted by stevekrouse 5 days ago
1. How did he tell Claude to “update” based on the notebook entries?
2. Won’t he eventually run out of context window?
3. Won’t this be expensive when using hosted solutions? For just personal hacking, why not simply use ollama + your favorite model?
4. If one were to build this locally, can Vector DB similarity search or a hybrid combined with fulltext search be used to achieve this?
I can totally imagine using pgai for the notebook logs feature and local ollama + deepseek for the inference.
The email idea mentioned by other commenters is brilliant. But I don’t think you need a new mailbox; just pull from Gmail and grep for messages where the sender and receiver are both yourself (aka the self tag). A rough sketch below.
Thank you for sharing; OP’s project is something I have been thinking about for a few months now.
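Here is roughly what that Gmail pull could look like, as a sketch using google-api-python-client; it assumes you have already completed Google's OAuth flow and saved a token.json, and the query string is just Gmail's own search operators:

```python
from googleapiclient.discovery import build
from google.oauth2.credentials import Credentials

# Assumes the usual OAuth dance is done and token.json exists
creds = Credentials.from_authorized_user_file(
    "token.json", ["https://www.googleapis.com/auth/gmail.readonly"]
)
service = build("gmail", "v1", credentials=creds)

# "from:me to:me" is the self tag: notes you emailed to yourself
results = (
    service.users()
    .messages()
    .list(userId="me", q="from:me to:me")
    .execute()
)
for ref in results.get("messages", []):
    msg = (
        service.users()
        .messages()
        .get(userId="me", id=ref["id"], format="metadata")
        .execute()
    )
    print(msg.get("snippet", ""))
```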
The "memories" table has a date column which is used to record the data when the information is relevant. The prompt can then be fed just information for today and the next few days - which will always be tiny.
It's possible to save "memories" that are always included in the prompt, but even those will add up to not a lot of tokens over time.
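In code, that date-window trick might look something like this - the schema and column names here are my guesses at the pattern, not Geoffrey's exact implementation:

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect("assistant.db")

# Guessed schema: a NULL date marks an "always include" memory
conn.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        date TEXT,          -- ISO date the memory is relevant, or NULL for always
        text TEXT NOT NULL
    )
""")

today = date.today()
horizon = today + timedelta(days=7)

# Pull only today's window plus the always-included rows - a tiny prompt
rows = conn.execute(
    "SELECT date, text FROM memories "
    "WHERE date IS NULL OR date BETWEEN ? AND ? "
    "ORDER BY date",
    (today.isoformat(), horizon.isoformat()),
).fetchall()

context = "\n".join(f"{d or 'always'}: {t}" for d, t in rows)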
> Won’t this be expensive when using hosted solutions?
You may be under-estimating how absurdly cheap hosted LLMs are these days. Most prompts against most models cost a fraction of a single cent, even for tens of thousands of tokens. Play around with my LLM pricing calculator for an illustration of that: https://tools.simonwillison.net/llm-prices
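To put rough numbers on it: at, say, $0.25 per million input tokens (a made-up but realistic budget-model rate), a 10,000-token prompt costs $0.0025 - a quarter of a cent - so even a thousand such calls a month comes to around $2.50.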
> If one were to build this locally, can Vector DB similarity search or a hybrid combined with fulltext search be used to achieve this?
Geoffrey's design is so simple it doesn't even need search - all it does is dump in context that's been stamped with a date, and there are so few tokens there's no need for FTS or vector search. If you wanted to build something more sophisticated you could absolutely use those. SQLite has surprisingly capable FTS built in and there are extensions like https://github.com/asg017/sqlite-vec for doing things with vectors.
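If you did want search, FTS5 needs only a few lines from Python - a generic sketch, not something from Geoffrey's project:

```python
import sqlite3

conn = sqlite3.connect("assistant.db")

# FTS5 virtual tables ship with most SQLite builds - no extension needed
conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memory_fts USING fts5(body)")
conn.executemany(
    "INSERT INTO memory_fts (body) VALUES (?)",
    [("Dentist appointment on Tuesday",), ("Buy a gift for Sam's birthday",)],
)

# MATCH does full-text search; bm25() sorts most-relevant first
hits = conn.execute(
    "SELECT body FROM memory_fts WHERE memory_fts MATCH ? "
    "ORDER BY bm25(memory_fts)",
    ("birthday",),
).fetchall()
print(hits)  # [("Buy a gift for Sam's birthday",)]
```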
Do we even need to think of these as agents, or will the agentic frameworks move towards being a call_llm() SQL function?
Large swathes of the stack are commoditized OSS plumbing, and hosted inference is already cheap and easy.
There are obvious security issues with plugging an agent into your email and calendar, but I think many will find it preferable to control the whole stack rather than ceding control to Apple or Google.
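To make that idea concrete, here's a minimal sketch of a call_llm() SQL function using Python's sqlite3 user-defined functions; llm_complete is a hypothetical stand-in for whatever client you'd actually wire in:

```python
import sqlite3

def llm_complete(prompt: str) -> str:
    """Stand-in for a real client call, e.g. an ollama or hosted-API request."""
    raise NotImplementedError("wire this to your model of choice")

conn = sqlite3.connect("assistant.db")

# Expose the Python function to SQL under the name call_llm
conn.create_function("call_llm", 1, llm_complete)

# Now any query can invoke the model per row, e.g.:
# conn.execute("SELECT call_llm('Summarize: ' || text) FROM memories")
```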
"There are obivious security issues with plugging and agent into your email..." Isn't this how North Korea makes all their crypto happen?
For me, that is an extremely low barrier to cross.
I find Siri useful for exactly two things at the moment: setting timers and calling people while I am driving.
For these two things it is really useful, but even in these niches, when it comes to calling people, despite having been around me for years now, it insists on stupid things like telling me there is no Theresa in my contacts when I ask it to call Therese.
That said, what I really want is a reliable system that I can trust with calendar access and that I can discuss things with, ideally voice based.
Which I think is a path that people haven't considered with LLMs. We are expecting them to get better forever, but once we start using them, their legs will be cut out to force them to feed us advertising.
I still regularly experience a bug where my Mac sends sound to the speakers instead of a plugged-in headphone jack after waking from sleep. 10 years ago when I first looked into it, the official Apple response was "that's not possible with the hardware", and we haven't made any progress since. Gaslighting as a service, I guess.
Luckily I can just unplug and plug back in. Maybe they can bring the great Apple minds together to make my iPhone stop blasting an alarm in my ear at regular volume if I happen to be talking on the phone when it goes off (issue since my very first iPhone 3).
This always gets me...is there not a public bug report for this one?
What do you think of this: instead of just deleting old entries, you could either do LRU (I guess Claude can help with it), or you could summarize the responses and store the summary back into the same table — kind of like memory consolidation. That way raw data fades, but a compressed version sticks around. Might be a nice way to keep memory lightweight while preserving context.
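A rough sketch of that consolidation pass, assuming the single memories table described above and with summarize() as a stand-in for the model call:

```python
import sqlite3
from datetime import date, timedelta

def summarize(entries: list[str]) -> str:
    """Stand-in for an LLM call that compresses old entries into one note."""
    raise NotImplementedError("wire this to your model of choice")

conn = sqlite3.connect("assistant.db")
cutoff = (date.today() - timedelta(days=30)).isoformat()

# Gather the raw rows that are old enough to consolidate
old = conn.execute(
    "SELECT id, text FROM memories WHERE date < ?", (cutoff,)
).fetchall()

if old:
    summary = summarize([text for _, text in old])
    with conn:  # one transaction: insert the summary, drop the raw rows
        conn.execute(
            "INSERT INTO memories (date, text) VALUES (?, ?)",
            (cutoff, f"[consolidated] {summary}"),
        )
        conn.executemany(
            "DELETE FROM memories WHERE id = ?", [(i,) for i, _ in old]
        )
```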
TL;DR: I made shortcuts that work on my Apple Watch directly to record my voice, transcribe it, and store my daily logs in a Notion DB.
All you need are 1) a ChatGPT API key and 2) a Notion account (free).
- I made one shortcut on my iPhone to record my voice, use the Whisper model to transcribe it (done straight from the shortcut with a POST request, no server involved) and send the transcription to my Notion database (again a POST request from Shortcuts)
- I made another shortcut that records my voice, transcribes it and reads data from my Notion database to answer questions based on what exists in it. It puts all the data from the db into the context to answer -- costs a lot of tokens but it's simple and works well.
The best part is -- this workflow works without my iPhone and directly on my Apple Watch. It uses POST requests internally so there's no need to host a server. And the Notion API happens to be free for this kind of use case.
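For reference, here is roughly what those two POST requests look like outside Shortcuts, written in Python; the keys, database ID, and the "Name" title property are placeholders for whatever your own Notion setup uses:

```python
import requests

OPENAI_KEY = "sk-..."      # placeholder
NOTION_KEY = "secret_..."  # placeholder
DATABASE_ID = "..."        # placeholder Notion database ID

# 1. Transcribe the recording with OpenAI's hosted Whisper
with open("memo.m4a", "rb") as audio:
    transcript = requests.post(
        "https://api.openai.com/v1/audio/transcriptions",
        headers={"Authorization": f"Bearer {OPENAI_KEY}"},
        files={"file": audio},
        data={"model": "whisper-1"},
    ).json()["text"]

# 2. Append the transcript as a new page in the Notion database
requests.post(
    "https://api.notion.com/v1/pages",
    headers={
        "Authorization": f"Bearer {NOTION_KEY}",
        "Notion-Version": "2022-06-28",
        "Content-Type": "application/json",
    },
    json={
        "parent": {"database_id": DATABASE_ID},
        "properties": {
            "Name": {"title": [{"text": {"content": transcript}}]}
        },
    },
)
```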
I like logging my day-to-day activities just by using Siri on my watch and possibly getting insights based on them. Honestly, the Whisper model is what makes this work, because its accuracy is miles ahead of the local transcription model.
On second thought -- Apple Shortcuts is really brittle. It breaks in non-obvious ways and a lot can only be learned by trial and error lol
Edit: I just wrote up something quick https://simianwords.bearblog.dev/how-i-use-my-apple-watch-to...
I'd use a hosted platform for this kind of thing myself, because then there's less for me to have to worry about. I have dozens of little systems running in GitHub Actions right now just to save me from having to maintain a machine with a crontab.
Home-server AI is orders of magnitude more costly than the heavily subsidized cloud-based options for this use case, unless you run toy models that might hallucinate meetings.
edit: I now realize you're talking about the non-ai related functionality.
I had not thought about adding a memory log of all current things and feeding it into the context. I'll try it out.
Mine is a simple stateless thing that captures messages and voice memos and creates task entries in my org-mode file with actionable items. I only feed the current date into the context.
It's pretty amusing to see how it sometimes adds a little bit of its own personality to simple tasks. For example, if one of my tasks is phrased as a question, it will often try to answer the question in the task description.
I don't think it'll ever happen. Really the only valid use case would be for people to hack together something for themselves (like we are discussing)... They don't want to allow developers to create applications on top of this as a third party, as Informed Delivery itself has to carefully navigate privacy laws and it could be disastrous.
You can still get the parcel ID and use a public-ish web API to get tracking information on a rough level ("in transit", "being delivered") without exact address information.