Posted by ddddazed 3 days ago
The project is designed to handle communication between desktop apps in an agentic manner, so the focus is strictly on this IPC layer (forget about HTTP API calls).
At the heart of RAIL (Remote Agent Invocation Layer) are two fundamental concepts. The names might sound scary, but remember this is a research project:
- Memory Logic Injection + Reflection
- Paradigm Shift: The Chat is the Server, and the Apps are the Clients.
Why this approach? The idea was to avoid creating huge wrappers or API endpoints just to call internal methods. Instead, the agent application passes its own instance to the SDK (e.g., RailEngine.Ignite(this)).
Here is the flow that I find fascinating:
- The App passes its instance to the RailEngine library running inside its own process.
- The Chat (Orchestrator) receives the manifest of available methods (a sketch of this step follows the list).
- The Model decides what to do and sends the command back via Named Pipe.
- The Trigger: the RailEngine inside the App receives the command and uses Reflection on the held instance to call .Invoke() directly.
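To make the manifest step concrete, here is a minimal C# sketch of how the engine could enumerate the host's methods via reflection. Only RailEngine.Ignite comes from the project; ToolDescriptor and ManifestBuilder are illustrative names, not RAIL's actual API.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical DTO for one entry in the manifest the Orchestrator receives.
public record ToolDescriptor(string Name, string[] ParameterTypes, string ReturnType);

public static class ManifestBuilder
{
    // Describe the public instance methods declared on the host app's type
    // so the Chat/Orchestrator can offer them to the model.
    public static ToolDescriptor[] Describe(object host) =>
        host.GetType()
            .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Select(m => new ToolDescriptor(
                m.Name,
                m.GetParameters().Select(p => p.ParameterType.Name).ToArray(),
                m.ReturnType.Name))
            .ToArray();
}
```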
Essentially, I am injecting the "Agent Logic" directly into the application memory space via the SDK, allowing the Chat to pull the trigger on local methods remotely.
A note on the Repo: The GitHub repository has become large. The core focus is RailEngine and RailOrchestrator. You will find other connectors (C++, Python) that are frankly "trash code" or incomplete experiments. I forced RTTR in C++ to achieve reflection, but I'm not convinced by it. Please skip those; they aren't relevant to the architectural discussion.
I’d love to focus the discussion on memory-managed languages (like C#/.NET) and ask you:
- Architecture: Does this inverted architecture (Apps "dialing home" via IPC) make sense for local agents compared to the standard Server/API model?
- Performance: Regarding the use of Reflection for every call, would it be worth caching methods as Delegates at startup (a sketch follows below)? Or is the optimization irrelevant considering the latency of the LLM itself?
- Security: Since we are effectively bypassing the API layer, what would a hypothetical security layer look like to prevent malicious use? (e.g., a capability manifest signed by the user?)
I would love to hear architectural comparisons and critiques.
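On the Performance question, here is a minimal sketch of the delegate-caching idea (DelegateCache and its members are illustrative, not part of RAIL): each exposed method is compiled into a plain Func once at startup, so per-call dispatch skips MethodInfo.Invoke's argument marshalling. Whether that matters next to LLM latency is exactly the open question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class DelegateCache
{
    static readonly ConcurrentDictionary<string, Func<object, object[], object>> Cache = new();

    // Compile each public instance method into a delegate once, at startup.
    // Overloads and generic methods are skipped for brevity; a real cache
    // would key on the full signature.
    public static void Warm(object host)
    {
        var methods = host.GetType()
            .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Where(m => !m.IsGenericMethodDefinition);

        foreach (var m in methods)
        {
            var target = Expression.Parameter(typeof(object), "target");
            var args = Expression.Parameter(typeof(object[]), "args");

            // ((HostType)target).Method((T0)args[0], (T1)args[1], ...)
            var call = Expression.Call(
                Expression.Convert(target, host.GetType()),
                m,
                m.GetParameters().Select((p, i) => (Expression)Expression.Convert(
                    Expression.ArrayIndex(args, Expression.Constant(i)),
                    p.ParameterType)));

            // Wrap void methods so everything fits the same Func signature.
            Expression body = m.ReturnType == typeof(void)
                ? Expression.Block(call, Expression.Constant(null, typeof(object)))
                : Expression.Convert(call, typeof(object));

            Cache[m.Name] = Expression.Lambda<Func<object, object[], object>>(body, target, args)
                .Compile();
        }
    }

    public static object Call(object host, string method, object[] args) => Cache[method](host, args);
}
```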
I think the biggest concern is that the # of types & methods is going to be too vast for most practical projects. LLM agents fall apart beyond 10 tools or so. Think about the odds of picking the right method out of 10000+, even with strong bias toward the correct path. A lot of the AI integration pain is carefully conforming to the raw nature of the environment so that we don't overwhelm the token budget of the model (or our personal budgets).
I would consider exposing a set of ~3 generic tools like:
- SearchTypes
- GetTypeInfo
- ExecuteScript
This constrains your baseline token budget to a very reasonable starting point each time. I would also consider schemes like attributes that explicitly opt in methods and POCOs for agent inspection/use.
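The opt-in idea might look something like this sketch (AgentTool is a hypothetical attribute name, nothing from the repo): only explicitly marked methods make it into the manifest, which keeps the tool count small by construction.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute: only methods carrying it are exposed to the agent.
[AttributeUsage(AttributeTargets.Method)]
public sealed class AgentToolAttribute : Attribute
{
    public AgentToolAttribute(string description) => Description = description;
    public string Description { get; }
}

public class InvoiceApp
{
    [AgentTool("Creates a draft invoice for the given customer.")]
    public void CreateInvoice(string customer, decimal amount) { /* ... */ }

    // Not attributed: invisible to the agent even though it is public.
    public void DeleteAllData() { }
}

public static class OptInManifest
{
    public static MethodInfo[] ExposedMethods(object host) =>
        host.GetType()
            .GetMethods(BindingFlags.Public | BindingFlags.Instance)
            .Where(m => m.GetCustomAttribute<AgentToolAttribute>() is not null)
            .ToArray();
}
```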
In my work, we have multi-project solutions, so I understand what you mean, and I completely agree. I never intended this solution to eliminate the architectural design step; at this point, limiting the exposed tools can be done directly in the JSON we send to the model. With reflection, you don't need wrappers: even if we limit the functions to, say, the three you mention, I don't have to write a wrapper for each of them. I simply create a JSON file that lists those three, and reflection does the rest.
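For concreteness, a hand-written manifest limited to three functions might look like this; the shape is hypothetical, since the post doesn't pin down RAIL's actual schema:

```json
{
  "tools": [
    { "method": "CreateInvoice",    "params": { "customer": "string", "amount": "decimal" } },
    { "method": "SearchCustomers",  "params": { "query": "string" } },
    { "method": "GetInvoiceStatus", "params": { "invoiceId": "int" } }
  ]
}
```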
Can you clarify this further?
With the C++26 release, native reflection was introduced, which is arguably more powerful than RTTR.
So I'm not entirely sure on this point: how does reflection discovery work, and how are complex types handled?
nice work anyway
As for the details, the scheme I started with is “simple” and, as already mentioned, it is valid for all memory-managed languages.
The desktop application starts up and, at the entry point, does the following:
engine.Ignite(this)
The “engine” library now holds the application instance internally, and a listener is created for each application, waiting for commands.
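A minimal sketch of this step, assuming a Named Pipe transport as described elsewhere in the post; the pipe name, the loop, and everything except Ignite(this) are assumptions:

```csharp
using System;
using System.IO;
using System.IO.Pipes;
using System.Threading.Tasks;

public class RailEngine
{
    private object? _host;

    // The app hands over its own instance; the engine keeps it and starts listening.
    public void Ignite(object host)
    {
        _host = host;
        _ = Task.Run(ListenAsync); // one listener per application instance
    }

    private async Task ListenAsync()
    {
        while (true)
        {
            // Hypothetical pipe name; a real setup would derive one per app.
            using var pipe = new NamedPipeServerStream("rail.demo", PipeDirection.In);
            await pipe.WaitForConnectionAsync();
            using var reader = new StreamReader(pipe);
            string commandJson = await reader.ReadToEndAsync();
            Execute(commandJson);
        }
    }

    // Deserialize + Invoke on _host; sketched further down in the thread.
    public void Execute(string commandJson) { /* ... */ }
}
```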
On the “chat” side, the prompt is sent to the model, along with the classic JSON I built describing the methods I decided to expose.
When the chat receives the response, it sends the command to the waiting listener:
engine.Execute(method.json)
From here, the engine first deserializes the JSON, creating the objects. At this point, it has both the application instance and its method, as well as the arguments with which it must be called:
method.Invoke(instance, parameters)
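Spelled out as a sketch: the Command shape and Dispatcher name are assumptions; only method.Invoke(instance, parameters) comes from the post.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Text.Json;

// Assumed wire format: { "method": "...", "args": [ ... ] }
public record Command(string Method, JsonElement[] Args);

public static class Dispatcher
{
    public static object? Execute(object instance, string commandJson)
    {
        var command = JsonSerializer.Deserialize<Command>(commandJson,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true })!;

        // Overload resolution is elided; GetMethod would throw on ambiguous names.
        var method = instance.GetType().GetMethod(command.Method)
            ?? throw new MissingMethodException(command.Method);

        // Convert each JSON argument to the declared parameter type.
        var parameters = method.GetParameters()
            .Select((p, i) => command.Args[i].Deserialize(p.ParameterType))
            .ToArray();

        return method.Invoke(instance, parameters);
    }
}
```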
I find this interesting, and it breaks with the classic way we are used to communicating with wrapper APIs.
I think that, since the server/API model has been established for decades with the advent of the web, it was natural to keep moving in that direction. However, coming mainly from a desktop environment, I wonder if this is the “easiest” and “smartest” way to handle these cases.
I hope this can help.