Local AI Architecture Reference (10.1.5 draft)

Architectural reference for the Local AI capability in FrameworX 10.1.5: the two-path execution model, code organization, master-gate semantics, transcript cache mechanics, tool dispatch loop, and integration points with the rest of the platform.


Info

10.1.5 draft. This page documents Local AI internals as they ship in FrameworX 10.1.5. It is the architecture map — for task-oriented quick-starts and end-user reference, see Local AI. Content may change before GA.

Audience and scope

This page is for system integrators, custom-driver authors, OEM partners, and engineers who need to reason about how Local AI is composed inside the FrameworX runtime: which assembly hosts what, which thread runs which work, where gates apply, and which contracts must be honoured when extending or wrapping Local AI.

End users configuring Local AI from the Designer or wiring a chat panel from a Display do not need this page. See Local AI and the user-facing reference children there.

Capability summary

Local AI exposes one platform capability — calling an OpenAI-compatible LLM from inside a FrameworX solution — through two architecturally independent consumer paths:

| Path | Surface | Caller context | Stateful? | Tools? | Hooks? |
|---|---|---|---|---|---|
| Cached | ChatRequest Display action | Operator click → ServiceClient TCP → Kernel TCPServer dispatch → Local AI | Per-Display-panel transcript cache | Yes (UNS / Alarm / Historian / custom) | Yes (OnBeforeChat / OnAfterChatReply) |
| Atomic | TK.AIExecute script API | Server-domain script (Server.Class, Script.Task, alarm callback, report generator) → Toolkit static | Stateless — every call independent | No (single POST, no tools) | No |

Both paths POST to the same OpenAI-compatible endpoint, configured once per solution in three SolutionSettings columns. Both paths return the same reply envelope (see Local AI Reply Envelope Schema (10.1.5 draft)).
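
For orientation, the sketch below shows a hypothetical atomic-path call from a server-domain script. TK.AIExecute's exact signature is documented in the Developer Reference; a string-in / string-out shape is assumed here purely for illustration.

```csharp
// Hypothetical call shape: the signature is assumed, not confirmed by this page.
// The query JSON follows the plain-text / structured forms described under
// "Atomic-path mechanics"; the return value is the five-field reply envelope.
string reply = TK.AIExecute("{ \"user\": \"Summarize the active alarms in one sentence.\" }");
```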

Two-path model — what is shared and what is divergent

The two paths are deliberately independent code paths sharing a small set of explicit anchors. Code duplication is welcomed where the alternative would be a shared helper that couples them.

Shared by architecture (legitimate coupling)

  • The endpoint configuration — both paths read SolutionSettings.ModelSettings JSON via the same defensive accessors on the Local AI service singleton.
  • The master kill-switch — both paths check SolutionSettings.ModelEnabled at entry and short-circuit identically when off.
  • The SecuritySecrets resolver — both paths invoke Security.CheckAndResolveSecretCommConfiguration on resolved auth strings; this is a Kernel-side platform service, not a path-specific helper.
  • The reply envelope schema — both paths produce the same five-field JSON shape (text, status, toolTrace, latencyMs, warnings).
  • The auth multi-line decoder — both paths call the platform's pre-existing ReportWebData.DecodeAuthorizationHeader static, which is the single source of truth for the None / BearerToken / BasicAuth / CustomAuth wire format.

Divergent by design

| Concern | Cached path | Atomic path |
|---|---|---|
| Implementation class | Sealed orchestrator owning the per-Display-panel transcript | Separate sealed class — no shared methods, no shared state |
| HTTP-POST helper | Inline in the cached orchestrator | Inline in the atomic class — a separate copy, even if today the bodies look similar, so divergence as features grow stays isolated |
| Master tool-surface bit (ModelOptions 0x02) | Required — off → status="disabled" | Not consulted — atomic has no tools to gate |
| Per-category tool bits (0x04–0x20) | AND-ed against master to assemble the tool catalog | Not consulted |
| Hook chain | Raises OnBeforeChat and OnAfterChatReply | Does NOT raise hooks — chat-attached behaviour does not apply |
| Transcript cache | Per-ClientInfo.Guid, lazy login-reset, 0x80 EnableChatHistory sub-gate | None |
| Tool dispatch loop | Bounded by per-turn cap and 60-second wall-clock budget | Single POST, no loop |
| Caller wire signature | Receives (ObjectServer, clientGuid, userName, queryJson) | Receives (ObjectServer, queryJson) |

Code organization

Assembly home and the namespace-contribution rule

Local AI is implemented in its own peer assembly, T.LocalAI.dll, sibling to T.Modules.dll and T.Toolkit.dll. It is NOT inside T.Modules.dll.

The placement is governed by a platform rule: code lives in T.Modules.dll if and only if it contributes to the runtime Object Model — Tag namespaces, eObjType registration, ListObj / RunObj entries, user-visible runtime addressability. Local AI has zero Object Model contribution: TK.AIExecute (a Toolkit static) and the Display ChatRequest action both consume the Object Model — they read tags, dispatch runtime_* tools — but neither contributes a Tag namespace or a runtime root. Therefore Local AI is a peer infrastructure assembly, not a runtime Module.

This explains why there is no AI namespace in the runtime Object Model in 10.1.5 — Local AI is reachable only as TK.AIExecute(...) from script and as the ChatRequest Display action. A future @Solution.AI.* binding surface is post-10.1.5 work.

Build position and dependencies

| Assembly | Depends on | Consumed by |
|---|---|---|
| T.LocalAI.dll | T.Kernel.dll, T.Library.dll, T.ProjectServer.dll | T.Toolkit.dll (atomic-path entry), Kernel-side TCP service host (cached-path entry), T.Designer.UI (defaults class) |

The build position is between Kernel/ProjectServer and Toolkit. The Toolkit consumer (TK_AI.cs) takes a compile-time project reference to T.LocalAI.dll — there is no reflection-by-name on this binding.

Bind sites

Two callers reach the Local AI service entry methods. Both are compile-time-checked under the prescribed code organization, so signature changes surface as build errors rather than runtime nulls.

| Caller | Binding | Calls |
|---|---|---|
| Kernel-side TCP service handler (cached path) | Compile-time reference inside T.LocalAI.dll (the service handler is co-located with the service) | ExecuteChatAsync(ObjectServer, clientGuid, userName, queryJson) |
| Toolkit static TK.AIExecute (atomic path) | Compile-time reference via project reference on T.Toolkit.csproj | ExecuteAtomicAsync(ObjectServer, queryJson) |

Type visibility

The Local AI service singleton, the cached-path orchestrator, and the atomic-path class are all internal sealed. The single public static surface is the defaults class, which exposes the recommended endpoint constants (URL, model name, etc.) for the Designer-side configuration dialog to seed its 5-field editor.
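
A sketch of that visibility layout follows. Every type name except LocalAIService is an assumption, and the defaults shown are placeholders; only the defaults class is public.

```csharp
// Illustrative only: internal type names are assumptions. The shipped assembly
// exposes exactly one public type, the defaults class.
namespace T.LocalAI
{
    internal sealed class LocalAIService         { /* per-ObjectServer singleton      */ }
    internal sealed class CachedChatOrchestrator { /* ExecuteChatAsync + transcript   */ }
    internal sealed class AtomicExecutor         { /* ExecuteAtomicAsync              */ }

    // Public surface: endpoint constants seeding the Designer's 5-field editor.
    public static class LocalAIDefaults
    {
        public const string Url  = "http://localhost:11434/v1/chat/completions"; // assumed exact path
        public const string Name = "<recommended model>";                        // placeholder
    }
}
```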

Process and thread model

Where Local AI runs

Local AI HTTP traffic is server-side only. Both paths execute their LLM POST inside the TServer process; clients never directly contact the LLM endpoint.

| Caller surface | Process where LLM POST happens | Thread |
|---|---|---|
| ChatRequest Display action | TServer (TCP service handler dispatch) | TCPServer dispatch thread for the inbound ServiceClient call |
| TK.AIExecute from Server.Class method | TServer | Caller's thread (script execution thread); sync wrapper unwraps async via AsyncHelpers.RunSync (see the sketch after this table) |
| TK.AIExecute from Server.Task | ScriptTaskServer.exe (a server-domain process) | Task execution thread |
| TK.AIExecute from Report generator | ReportServer.exe | Report-rendering thread |
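
The Server.Class row above relies on the platform's sync-over-async helper. A sketch of that wrapping, assuming AsyncHelpers.RunSync takes a Func<Task<T>> and returns T, and that the service accessors shown are as named elsewhere on this page:

```csharp
// Sketch: the wrapper method itself is illustrative, not the shipped code.
public static string AIExecuteSync(ObjectServer objSrv, string queryJson)
{
    // RunSync manages the SynchronizationContext and unwraps AggregateException,
    // which raw .Result / .GetAwaiter().GetResult() would not.
    return T.Library.tRPC.AsyncHelpers.RunSync(
        () => LocalAIService.Get(objSrv).ExecuteAtomicAsync(objSrv, queryJson));
}
```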

Why no runtime "view-client" gate

The cached path is reachable only through the Kernel-side TCP service handler — by construction, that handler runs in the TServer process. A misrouted client-side caller cannot land in ExecuteChatAsync.

The atomic path is reachable from a Toolkit static, so technically a Display CodeBehind script (which runs at the client) could call TK.AIExecute. The platform does not block this with a runtime gate, because three platform-level safety nets already defend against degraded execution at the client:

  1. Secret resolution short-circuits off-server. Security.CheckAndResolveSecretCommConfiguration returns the unresolved literal /secret:<Name> token when not running server-side. The LLM endpoint then receives the literal token as the credential and rejects with 401.
  2. SolutionSettings reads route through ObjServer.DB. Some client-side configurations have a stripped-config DB; reads fall back to defaults targeting localhost:11434 — unreachable from a remote browser.
  3. Master kill-switch precedes everything. When ModelEnabled is off, the call short-circuits with status="disabled" before any HTTP work.

Together these mean a misrouted call from a thin client never silently succeeds with leaked credentials — it returns a normal HTTP-error envelope or the master-gate disabled envelope. No runtime view-client predicate is required.

Master gates — application order

Both paths apply gates in a fixed sequence at entry to the service method. The order is non-negotiable.

| Order | Gate | Cached path | Atomic path | Outcome when gate is closed |
|---|---|---|---|---|
| 1 | SolutionSettings.ModelEnabled = true | Yes | Yes | status="disabled", latencyMs=0, no HTTP traffic |
| 2 | SolutionSettings.ModelOptions bit 0x02 (EnableRuntimeMCP) | Yes | Skipped | Cached: status="disabled"; atomic: not applicable |
| 3 | Per-category sub-bits (0x04–0x20) | AND-ed with master to build tool catalog | Skipped | Tool category absent from the catalog the LLM sees |

The asymmetry on gate 2 is intentional. ModelOptions exists to gate the LLM's tool surface; the atomic path exposes no tools, so the bit is meaningless to it. Gating the atomic path on 0x02 would prevent customers from using TK.AIExecute for pure-language tasks (translation, summary, rephrase) while reserving tool-equipped chat for selected operator panels — a legitimate posture.
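
Condensed into code, the entry-gate sequence reads roughly as follows. This is a sketch: helper names such as Envelope.Disabled and BuildToolCatalog are assumptions, not the shipped API.

```csharp
// Gate order is fixed: kill-switch, master tool bit (cached only), category bits.
const int EnableRuntimeMCP = 0x02;   // master tool-surface bit
const int CategoryBitsMask = 0x3C;   // per-category sub-bits 0x04 through 0x20

if (!settings.ModelEnabled)                               // gate 1: both paths
    return Envelope.Disabled();                           // latencyMs = 0, no HTTP traffic

if (isCachedPath)
{
    if ((settings.ModelOptions & EnableRuntimeMCP) == 0)  // gate 2: cached path only
        return Envelope.Disabled();

    // gate 3: category bits are AND-ed with the master to build the catalog
    var toolCatalog = BuildToolCatalog(settings.ModelOptions & CategoryBitsMask);
}
```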

Cached-path mechanics

Per-Display-panel transcript cache

The cached orchestrator maintains a per-solution ConcurrentDictionary<string, ChatSession> keyed by ClientInfo.Guid — the platform's per-TCP-connection identifier (the same key shape ModuleHistorian and ServiceSyncObjects use for their own per-connection bookkeeping).

| Property | Behaviour |
|---|---|
| Cache key | ClientInfo.Guid from the inbound TCP service call's thread-local context. NOT exposed to the runtime Client.* namespace. |
| Cache participation | Gated by SolutionSettings.ModelOptions bit 0x80 (EnableChatHistory). Default ON in new 10.1.5 solutions. When OFF, the orchestrator builds a local SessionState per call without entering the cache. |
| Login reset | Lazy. The orchestrator records the last-seen UserName per cache entry; when the next chat turn arrives with a different UserName on the same ClientInfo.Guid, the transcript clears before processing the new turn. |
| Transcript truncation | Per-cache-entry, truncated to a fixed maximum number of messages calibrated for ~8B-class context windows. Older messages drop oldest-first. |
| Cache lifetime | Per-process. Restarting TServer clears all transcripts. There is no persistence to disk in 10.1.5. |
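
The cache-key and login-reset rows above amount to a few lines of orchestrator code. A minimal sketch, with ChatSession and the member names assumed (requires System.Collections.Concurrent):

```csharp
private readonly ConcurrentDictionary<string, ChatSession> _sessions = new();

private ChatSession GetSession(string clientInfoGuid, string userName)
{
    var session = _sessions.GetOrAdd(clientInfoGuid, _ => new ChatSession(userName));
    if (!string.Equals(session.UserName, userName, StringComparison.Ordinal))
    {
        // Lazy login reset: same ClientInfo.Guid, different user, so the
        // transcript clears on the first turn after the login change.
        session.Transcript.Clear();
        session.UserName = userName;
    }
    return session;
}
```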

Tool dispatch loop

When the master tool bit is on and at least one category sub-bit is on, each chat turn assembles a tool catalog and may enter a tool-dispatch loop:

  1. POST messages + tools to the LLM endpoint.
  2. Inspect the assistant message. If it contains tool_calls, dispatch each tool in-process against the live ObjectServer (no HTTP loopback to RuntimeMCP — sub-millisecond direct dispatch).
  3. Append tool-result messages to the running message list. Loop back to step 1.
  4. Terminal exit when the assistant message is plain content (no tool_calls).

The loop is bounded by two limits:

  • Per-turn dispatch cap — total tool calls per turn. The default ships conservative: MaxToolDispatchesPerTurn = 5 in 10.1.5 (the longer-term platform target is 10). The cap is sized for the wall-clock budget below: at typical CPU-LLM latencies of 3–15 s per round-trip, 5 dispatches plus the terminal POST (worst case ~6 LLM POSTs) keep the turn inside 60 seconds. On GPU-class hardware (~0.5–2.5 s per POST) the cap is not the binding constraint — the wall-clock budget is. When the cap trips, the next iteration POSTs WITHOUT new tool results so the LLM gets one final chance to compose a terminal reply, and the turn returns status="truncated" if it does not.
  • Wall-clock budget — 60 seconds covering the entire loop (LLM POSTs + in-process tool dispatches combined).
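
Put together, the loop with both limits looks roughly like the sketch below. PostChatAsync, DispatchInProcess, and Envelope are assumed helper names; MaxToolDispatchesPerTurn is the cap described above.

```csharp
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); // wall-clock budget
int dispatched = 0;
bool capTripped = false;

while (true)
{
    var assistant = await PostChatAsync(messages, toolCatalog, cts.Token); // step 1
    if (assistant.ToolCalls.Count == 0)
        return Envelope.Ok(assistant.Content);                            // step 4: terminal exit
    if (capTripped)
        return Envelope.Truncated();       // final chance spent, model still wants tools
    if (dispatched + assistant.ToolCalls.Count > MaxToolDispatchesPerTurn) // 5 in 10.1.5
    {
        capTripped = true;                 // next POST goes out WITHOUT new tool results
        continue;
    }
    foreach (var call in assistant.ToolCalls)                              // steps 2 and 3
    {
        messages.Add(DispatchInProcess(call));  // in-process, against the live ObjectServer
        dispatched++;
    }
}
```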

Hook chain

OnBeforeChat and OnAfterChatReply are event Func<string, Task<string>> members on the per-ObjectServer Local AI service singleton. They are multicast; a throwing handler is caught and logged into the reply's warnings[] with handler identity, and the chain proceeds with the un-mutated input. Hook failures do NOT flip the envelope's status away from ok — that is reserved for chain-aborting failures (LLM HTTP error, malformed query, etc.).
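
A sketch of that fault-isolation contract, using the event shape above (the raiser method itself is illustrative):

```csharp
public event Func<string, Task<string>> OnBeforeChat;

private async Task<string> RaiseBeforeChatAsync(string input, List<string> warnings)
{
    var current = input;
    foreach (Func<string, Task<string>> handler in
             OnBeforeChat?.GetInvocationList() ?? Array.Empty<Delegate>())
    {
        try { current = await handler(current); }
        catch (Exception ex)
        {
            // Logged with handler identity; envelope status stays "ok".
            warnings.Add($"{handler.Method.DeclaringType}.{handler.Method.Name}: {ex.Message}");
            // Chain proceeds with the un-mutated input (current is unchanged).
        }
    }
    return current;
}
```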

See Local AI Developer Reference (10.1.5 draft) for the reflection-attach pattern and worked hook examples.

Atomic-path mechanics

The atomic path is deliberately simple. After the master ModelEnabled gate:

  1. Parse query JSON (plain text wrap or structured form with system / user / context / metadata fields).
  2. Read fresh ModelSettings via the same defensive accessors as the cached path.
  3. Resolve /secret:<Name> tokens via the platform's secret resolver.
  4. Resolve {Tag.X} expressions in URL via the platform's expression resolver.
  5. Decode auth multi-line via ReportWebData.DecodeAuthorizationHeader.
  6. Build a single OpenAI-compatible messages[] array — no transcript prefix, no prior turns.
  7. POST with empty tools[], 60-second wall-clock CancellationTokenSource, no loop.
  8. Read choices[0].message.content. Build the SPEC envelope (toolTrace=[] always for atomic).

No hooks. No transcript. No tool dispatch. No retry. Every step lives in the atomic class; nothing calls into the cached orchestrator.
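
As a sketch of steps 6 to 8, assuming Newtonsoft.Json types and hypothetical PostAsync, settings, and query objects:

```csharp
var messages = new JArray();
if (!string.IsNullOrEmpty(query.System))
    messages.Add(new JObject { ["role"] = "system", ["content"] = query.System });
messages.Add(new JObject { ["role"] = "user", ["content"] = query.User }); // no prior turns

var body = new JObject
{
    ["model"]    = settings.Name,
    ["messages"] = messages,
    ["tools"]    = new JArray()   // always empty on the atomic path
};

using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60));      // step 7
string responseJson = await PostAsync(settings.Url, body.ToString(), cts.Token);
string text = (string)JObject.Parse(responseJson)["choices"]?[0]?["message"]?["content"] ?? ""; // step 8
```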

Configuration topology

Three columns on SolutionSettings

Local AI uses three pre-existing scaffolding columns. No schema migration; no per-solution row creation; no AI-specific table.

| Column | Type | Role |
|---|---|---|
| ModelEnabled | Boolean | Master kill-switch. Off short-circuits both paths. |
| ModelSettings | String (JSON blob) | Five-key endpoint configuration: URL, Name, Authorization, Headers, Info. |
| ModelOptions | Int (bitmask) | Master tool bit and per-category sub-bits. Shared bitmask with the AI Runtime Connector and the AI Designer connector. |

Defensive accessors

Every Local AI call re-parses ModelSettings JSON on every read. There is no caching layer with invalidation. The parse cost is negligible compared to the LLM round-trip; the alternative (cached parsed values + invalidation on column change) introduces a stale-cache bug surface that buys nothing measurable. Reads of NULL / empty / malformed JSON / missing keys / empty values transparently fall back to the defaults class.
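
A sketch of one such accessor, assuming Newtonsoft.Json and a hypothetical raw-read helper on objSrv.DB:

```csharp
internal static string GetEndpointUrl(ObjectServer objSrv)
{
    try
    {
        string raw = objSrv.DB.ReadSolutionSetting("ModelSettings");  // hypothetical helper
        if (string.IsNullOrWhiteSpace(raw))
            return LocalAIDefaults.Url;               // NULL / empty: fall back to defaults
        string url = (string)JObject.Parse(raw)["URL"];
        return string.IsNullOrWhiteSpace(url) ? LocalAIDefaults.Url : url; // missing/empty key
    }
    catch (JsonReaderException)
    {
        return LocalAIDefaults.Url;                   // malformed JSON falls back transparently
    }
}
```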

Forward-compatibility on unknown keys

The Designer's 5-field editor preserves unknown JSON keys via a capture/replay mechanism, so future SPEC extensions to ModelSettings (e.g., a Provider dialect enum) survive an open/OK round-trip in the editor without being stripped.
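
A sketch of that capture/replay round-trip; the editor field names are assumptions:

```csharp
// Capture everything on open...
var captured = JObject.Parse(storedModelSettings ?? "{}");

// ...overwrite only the five known keys on OK...
captured["URL"]           = editor.Url;
captured["Name"]          = editor.Name;
captured["Authorization"] = editor.Authorization;
captured["Headers"]       = editor.Headers;
captured["Info"]          = editor.Info;

// ...and replay: unknown keys (e.g. a future "Provider") survive untouched.
string replayed = captured.ToString(Formatting.None);
```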

SecuritySecrets integration

Local AI uses the same SecuritySecrets pattern that protocol drivers, the WebData connector, the UNS tag-provider service, and the Devices module use for credentials. There is no AI-specific secret mechanism.

| Aspect | How Local AI uses it |
|---|---|
| Reference syntax | /secret:<Name> tokens embedded in URL, Authorization, or Headers fields of ModelSettings |
| Resolver | ObjectServer.Security.CheckAndResolveSecretCommConfiguration(string) — Kernel-side platform service |
| Off-server behaviour | Resolver short-circuits when not RunningAtServer; literal token survives. Calling code falls through to a normal HTTP-error envelope. |
| Encryption | SecretValue is encrypted at rest. Field is write-only via the Designer Security UI; redacted on read-back. |
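
The off-server short-circuit in the table above, shown as observable behaviour (a sketch; the secret name is hypothetical):

```csharp
string configured = "Bearer /secret:LlmApiKey";
string resolved   = objSrv.Security.CheckAndResolveSecretCommConfiguration(configured);
// Server-side:  "Bearer <decrypted SecretValue>"   (token substituted at call time)
// Off-server:   "Bearer /secret:LlmApiKey"         (literal survives; the endpoint
//                rejects it, typically 401, and the caller gets an HTTP-error envelope)
```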

For the customer-facing walk-through and worked examples, see SecuritySecrets Authentication for Local AI (10.1.5 draft).

Reply envelope contract

The reply envelope is a strict five-field shape, identical for both paths and identical across success / error / disabled / truncated outcomes:

```json
{
  "text":      "<answer text or '' on non-ok status>",
  "status":    "ok | error | disabled | truncated",
  "toolTrace": [ ... ],
  "latencyMs": 1247,
  "warnings":  [ ... ]
}
```

Architectural guarantees:

  • Always parseable. Every envelope is well-formed JSON. JObject.Parse(reply) never throws on a Local AI reply.
  • Field stability. All five fields present on every envelope, every code path. Adding new top-level fields is non-breaking.
  • latencyMs always present — including on errors. Pre-condition errors (gates) report 0; post-condition errors report actual elapsed time.
  • Never throws. Network failures, parser failures, gate-off conditions, internal exceptions — all become envelopes with status="error" or status="disabled" and a populated warnings[].
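
Those guarantees let a caller parse unconditionally. A consumer-side sketch, where UseAnswer and Log are placeholders:

```csharp
var envelope = JObject.Parse(reply);               // never throws on a Local AI reply
if ((string)envelope["status"] == "ok")
    UseAnswer((string)envelope["text"]);
else
    foreach (var warning in envelope["warnings"])  // all five fields are always present
        Log((string)warning);
```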

Full schema and worked examples: Local AI Reply Envelope Schema (10.1.5 draft).

Integration points with other platform services

| Service | How Local AI uses it |
|---|---|
| ObjectServer (per-host runtime) | Local AI service is a per-ObjectServer singleton (LocalAIService.Get(objSrv)); accessors route configuration reads through objSrv.DB. |
| Security (SecuritySecrets resolver) | Token resolution at call time, server-side only. |
| SolutionSettings (configuration) | Three columns carry all configuration; no AI-specific table. |
| Kernel TCPServer (cached path entry) | Hosts the ChatProcessService handler that the Display ChatRequest action calls via ServiceClient TCP. |
| Toolkit (atomic path entry) | TK.AIExecute takes a project reference to T.LocalAI.dll for compile-time-clean dispatch. |
| WebData connector | Reuses ReportWebData.DecodeAuthorizationHeader for auth multi-line decoding — single source of truth for the wire format. |
| RuntimeMCP tool catalog | Cached path's tool catalog mirrors the same runtime_* tool surface that RuntimeMCP.exe exposes to external AI clients — single tool definition, two transports (in-process for cached path, HTTP for external clients). |
| AsyncHelpers | Both paths use T.Library.tRPC.AsyncHelpers.RunSync for sync-over-async wrapping (NOT raw .Result / .GetAwaiter().GetResult()) — handles SynchronizationContext correctly and unwraps AggregateException. |

What Local AI deliberately does NOT add

| Not added in 10.1.5 | Why |
|---|---|
| An AI runtime root in the Object Model | The 12-root taxonomy (Tag, Server, Client, Alarm, Device, etc.) does not gain an AI root. Surface remains TK.AIExecute + ChatRequest. A peer of InfoSolution.Settings exposing AI fields is post-10.1.5. |
| A T.Modules.AI TModule | Local AI does not contribute Object Model namespace surface; it is a peer infrastructure assembly, not a runtime Module. |
| New SolutionSettings columns | The three existing columns (ModelEnabled, ModelSettings, ModelOptions) carry everything. Schema migration would impose cross-version routing risk for no benefit. |
| A provider-dialect enum (OpenAI / Ollama / Anthropic) | The dialect is implicit (OpenAI-compatible JSON). Adding a discriminator waits for a real customer needing a non-conforming endpoint. |
| Streaming output | Sync request/response only. Each call returns one complete envelope when the model finishes. |
| A bundled LLM runtime or model weights | Customers install Ollama themselves and pull the model from the official registry. Keeps the FX installer lean and avoids redistribution licensing friction. |
| HTTP loopback to RuntimeMCP for tool dispatch | Tool dispatch is in-process direct against the live ObjectServer — sub-millisecond vs 10–50 ms over-the-wire to localhost. Also guarantees the cached path works in standalone deployments without RuntimeMCPHttp running. |
| Atomic-path hooks or atomic-path tools | Atomic is a deliberately stateless one-shot. A customer wanting hooks or tools should use the ChatRequest Display action surface. |

Versioning and forward-compatibility expectations

  • Reply envelope schema is LOCKED. Adding new top-level fields in future versions is non-breaking; existing five fields stay present and stay typed as documented.
  • Configuration topology is LOCKED. The three-column shape will not change. Future configuration extensions land as new keys inside the ModelSettings JSON or as new bits in ModelOptions.
  • Two-path independence is LOCKED. Future features ship in one path or the other (or both, with separate implementations). The two paths will not be merged behind a shared helper.
  • Code organization may evolve. The T.LocalAI.dll assembly home is stable for 10.1.5; later releases may split or rename internal types. The compile-time entry surface (ExecuteChatAsync, ExecuteAtomicAsync) is signature-stable.
