**Infra / local-first implementation details**
- sqlite
- fts5
- streaming
- sse
- local-inference
- byok
**Intended ecosystem compatibility**
- openai-compatible
- huggingface
- mistral
- llama
**Product-ish / devtool signals**
- devtools
- productivity
- citations
- interpretability
## Core Idea

Don’t stuff your library into the model. Let the model interact with the library.

Recursive Language Models (RLMs) promise effectively unlimited context windows by providing an environment in which the model can “scan” a document library and retrieve information from it by taking small snapshots, then either answering the prompt from those snapshots or calling small sub-agents to help compose its reply. This approach promises to reduce context rot to zero.

The Quidoris Engine is an RLM environment. It is a harness that lives independently of any RLM, but RLMs can dock with it and use its environment to run an effectively unlimited context window. Today’s LLMs are limited to context windows of a few million tokens at most; RLMs, in concert with the Quidoris Engine, are not. Think 1,000- or 10,000-document context windows. This is what the Quidoris Engine does for all RLMs.
# Quidoris Engine (RLM Harness)
Quidoris Engine is a local-first inference harness inspired by Recursive Language Models (RLMs): it treats long prompts and large document libraries as an external environment, letting models search, read, cite, and recurse—instead of cramming everything into a single context window.
It runs as a local daemon with an HTTP API (SSE-friendly) and ships with a lightweight local web UI.
## Model Details

### Model Description

An RLM inference harness that lets any user run RLMs locally on their own machine.
### What this is
- A local daemon (Bun) + SQLite/FTS5 index for fast retrieval
- A provider-agnostic harness (BYOK): Local CLI / Hugging Face / OpenAI-compatible providers
- A UI Launcher that can auto-start the daemon (since browsers can’t spawn local processes)
- An “evidence-first” workflow: citations and evidence spotlight are first-class
### Why it exists
Large models struggle with:
- Context window limits
- Hallucination risk when they can’t reference exact evidence
- Cost blowups when prompts get huge
Quidoris Engine makes “long context” feel interactive:
- keep the source material outside the model
- allow programmatic access via search/read APIs
- produce answers with traceable evidence
## Features

### Library + Index
- Local SQLite index (`rlm_index.sqlite`)
- FTS5 search (chunked content for recall + precise snippets; see the sketch after this list)
- Incremental indexing (re-index only when files change)
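A minimal sketch of what a chunked FTS5 layer can look like with `bun:sqlite`. The table and column names here are illustrative, not the actual `rlm_index.sqlite` schema (see `docs/db_schema.sql` for that):

```ts
import { Database } from "bun:sqlite";

const db = new Database("rlm_index.sqlite");

// Contentful FTS5 table: one row per chunk, keyed back to its source doc.
db.run(`CREATE VIRTUAL TABLE IF NOT EXISTS chunks
        USING fts5(doc_path, content)`);

// Index a chunk (real ingestion would split files and skip unchanged ones).
const insert = db.prepare(
  "INSERT INTO chunks (doc_path, content) VALUES (?, ?)",
);
insert.run(
  "notes/rlm.md",
  "Recursive language models treat long prompts as an environment...",
);

// Lexical search with ranked results and a short highlighted snippet.
const rows = db
  .query(
    `SELECT doc_path,
            snippet(chunks, 1, '[', ']', '…', 12) AS snip
       FROM chunks
      WHERE chunks MATCH ?
      ORDER BY rank
      LIMIT 5`,
  )
  .all("recursive AND prompts");

console.log(rows);
```

FTS5’s `snippet()` returns a small window around the match, which is exactly the “precise snippets” behavior described above.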
### RLM-friendly runtime
- Designed for “action → observe → recurse” loops (see the sketch after this list)
- Planned/compatible with async/batched `llm_query` patterns
- SSE endpoints for streaming runs/steps (where available)
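A minimal sketch of an action → observe → recurse loop against the daemon. The `/v1/search` and `/v1/read` paths and their query parameters are assumptions for illustration; only `/v1/health` appears elsewhere in this card, so consult `openapi/openapi.yaml` for the real contract:

```ts
// Hypothetical controller loop: the model picks actions, the harness executes
// them, and the observations feed the next step. Endpoint paths and payload
// shapes below are illustrative assumptions, not the shipped API.
const BASE = "http://127.0.0.1:8787";

type Action =
  | { kind: "search"; query: string }
  | { kind: "read"; docPath: string; offset: number; length: number }
  | { kind: "answer"; text: string };

async function runLoop(
  question: string,
  decide: (observations: string[]) => Promise<Action>,
) {
  const observations: string[] = [`question: ${question}`];
  for (let step = 0; step < 8; step++) {       // hard cap on recursion depth
    const action = await decide(observations); // model chooses the next action
    if (action.kind === "answer") return action.text;

    const url =
      action.kind === "search"
        ? `${BASE}/v1/search?q=${encodeURIComponent(action.query)}`
        : `${BASE}/v1/read?path=${encodeURIComponent(action.docPath)}` +
          `&offset=${action.offset}&len=${action.length}`;

    const res = await fetch(url);
    observations.push(await res.text()); // observe, then recurse
  }
  return "step budget exhausted";
}
```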
### Local web UI
- Glassy login/boot screen
- “Enter the Engine” auto-starts the daemon via the local launcher
- Workspace scaffold (`/app`) for Library / Runs / Evidence
## Install

### Requirements
- Bun (recommended)
- SQLite (bundled via `bun:sqlite`)

Clone / download from Hugging Face or GitHub.
### Quickstart (local)

See **How to Get Started with the Model** below for the full quickstart (build the UI, then launch).
- **Developed by:** Ken Isaacson, Sr. (“Grokci”)
- **Funded by [optional]:** Quidoris Research Group
- **Shared by [optional]:** Quidoris Research Group
- **Model type:** Recursive Language Model harness, for use with Recursive Language Models and Large Language Models
- **Language(s) (NLP):** Not applicable (implemented with Bun/TypeScript, Python, and SQLite)
- **License:** cc-by-2.5
### Model Sources [optional]
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
**Quidoris Engine is not a trained model**; it’s a *local RLM harness/engine + daemon + UI*.
---
## Uses
Quidoris Engine is intended to be used as a **local-first inference harness** that enables RLM-style workflows: treating large prompts and document libraries as an **external environment** the model can interact with via actions (search/read/cite), rather than loading everything into the model context.
Foreseeable users:
* Individual developers running local tooling
* Researchers prototyping RLM-like inference loops
* Teams running BYOK workflows on laptops/workstations
* Contributors extending adapters, ingestion, or UI
People potentially affected:
* The user (and any collaborators) whose documents are indexed and queried
* Anyone whose content is included in the local library (e.g., shared docs)
* Any downstream consumers of generated outputs (reports, emails, code, decisions)
### Direct Use
Direct use means running Quidoris Engine locally as shipped:
* Start the **launcher + UI** locally
* Auto-start the **daemon**
* Index and manage a local document library
* Use the daemon API to:
* search local docs (FTS5)
* read snippets/regions
* generate model outputs with evidence/citations (as those endpoints are implemented)
* Use the UI to:
* sign in locally
* enter the workspace (“Engine”)
* (planned) upload/manage docs, tags/folders, run tasks, view evidence spotlight
Typical scenarios:
* “I have a large folder of PDFs + notes. Let the model answer questions with citations.”
* “I want a controllable long-context tool that’s provider-agnostic.”
* “I want an RLM-like loop where the model chooses actions (search/read/subcall) and I can inspect evidence.”
### Downstream Use [optional]
Downstream use means embedding Quidoris Engine into a larger system (still local-first) or using it as a substrate for an RLM agent/controller:
* Integrating Quidoris Engine into an internal toolchain via `/v1/*` API
* Building a custom “RLM controller” that:
* decides actions (search/read/transform/query-submodel)
* calls Quidoris Engine as the environment layer
* streams steps via SSE to a UI or logger (see the sketch after this list)
* Extending adapters for model providers (HF inference, local CLI, OpenAI-compatible servers)
* Building domain-specific ingestion pipelines (PDF→text, audio→transcript, etc.)
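A minimal sketch of consuming a step stream over SSE from TypeScript. The `/v1/runs/{id}/events` path, the run id, and the payload shape are assumptions for illustration; `docs/sse.md` describes the actual contract:

```ts
// Hypothetical SSE consumer for streamed run steps. Endpoint path and
// payloads are illustrative; consult docs/sse.md for the real schema.
const res = await fetch("http://127.0.0.1:8787/v1/runs/run_123/events", {
  headers: { Accept: "text/event-stream" },
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE frames are separated by a blank line; `data:` lines carry the payload.
  let sep: number;
  while ((sep = buffer.indexOf("\n\n")) !== -1) {
    const frame = buffer.slice(0, sep);
    buffer = buffer.slice(sep + 2);
    for (const line of frame.split("\n")) {
      if (line.startsWith("data:")) {
        console.log("step:", line.slice(5).trim());
      }
    }
  }
}
```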
### Out-of-Scope Use
Quidoris Engine is **not** intended for:
* Running as a public hosted multi-tenant SaaS (this repo is local-first by design)
* Handling regulated/high-stakes workflows without additional controls (health, legal, finance)
* Serving as a security boundary between untrusted users and private documents (local indexing implies access)
* “Guaranteed correctness” answers: it can improve evidence access, but models can still misinterpret
* Real-time collaborative editing or “Google Drive-like” shared document management
* Malware, phishing, or abusive content generation (the tool should not be used to facilitate harm)
---
## Bias, Risks, and Limitations
Technical limitations:
* Retrieval is only as good as:
* ingestion quality (OCR, parsing errors)
* chunking strategy
* query formulation
* SQLite FTS5 is lexical (not semantic embeddings by default). It can miss paraphrases unless augmented.
* Evidence spotlight/citations reduce hallucinations but don’t eliminate:
* incorrect reasoning
* cherry-picked evidence
* quoting irrelevant snippets
* Provider behavior differs: different LLMs can produce different answers to identical evidence.
Sociotechnical / user risks:
* Private data risk: users may index sensitive documents; local-first reduces exposure but doesn’t remove it.
* “Trust from polish” risk: a beautiful UI can cause over-trust in outputs.
* Misuse risk: tool can be used to summarize or extract sensitive data if the library contains it.
* Attribution risk: users may treat retrieved snippets as definitive without verifying full context.
System limitations (current repo state):
* Many `/v1/*` endpoints in the OpenAPI spec may still be stubs returning `501` until implemented.
* Current auth is local-oriented and not a full enterprise IAM solution.
* Uploading non-text media (images/audio/video) requires ingestion pipelines not yet included by default.
### Recommendations
* Treat outputs as *drafts* unless validated, especially for high-stakes decisions.
* Always inspect:
* which documents were retrieved
* which chunks were used
* whether the snippet context is sufficient
* Add guardrails for sensitive libraries:
* separate profiles/databases for different projects
* explicit “do not index” directories
* local encryption at rest if required
* If you enable remote access, lock it down (a minimal binding sketch follows this list):
* bind to localhost only (default)
* require auth
* avoid exposing ports publicly
* For better recall, consider augmenting FTS with semantic retrieval later (optional plugin).
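For the “bind to localhost only” recommendation, here is what a loopback-only Bun server looks like. This is an illustrative sketch, not the shipped daemon code:

```ts
// Bind the HTTP daemon to the loopback interface only, so nothing
// off-machine can reach it. Illustrative sketch, not the real daemon.
const server = Bun.serve({
  hostname: "127.0.0.1", // loopback only; never "0.0.0.0" for sensitive libraries
  port: 8787,
  fetch(req) {
    const url = new URL(req.url);
    if (url.pathname === "/v1/health") {
      return Response.json({ ok: true });
    }
    return new Response("not found", { status: 404 });
  },
});

console.log(`daemon listening on http://${server.hostname}:${server.port}`);
```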
---
## How to Get Started with the Model
> Quidoris Engine is a tool (daemon + UI), not a trained model.
### Quickstart (local UI + auto-start daemon)
```bash
# Build the UI
cd ui
bun install
bun run build

# Start the launcher (serves UI + proxies API + can spawn the daemon)
cd ..
bun run ui:launch
```

Open the local UI URL printed by the launcher, then click **Enter the Engine**:

- the launcher starts the daemon if needed
- the UI waits for `/v1/health`, performs login, and routes to `/app`

### Daemon only (no UI)

```bash
bun run quidoris-engine.ts daemon --port 8787
# then call:
# GET http://127.0.0.1:8787/v1/health
```
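A minimal health-check poll, mirroring what the UI does while waiting for the daemon. `/v1/health` is the endpoint shown above; the retry count and delay here are illustrative choices:

```ts
// Poll /v1/health until the daemon is up, as the UI does after launch.
async function waitForDaemon(base = "http://127.0.0.1:8787", attempts = 20) {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(`${base}/v1/health`);
      if (res.ok) return true; // daemon is ready
    } catch {
      // daemon not listening yet; fall through to retry
    }
    await new Promise((r) => setTimeout(r, 500)); // wait 500 ms between tries
  }
  return false;
}

console.log((await waitForDaemon()) ? "daemon ready" : "daemon never came up");
```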
## Training Details

Not applicable. Quidoris Engine is not a trained model; it is a local-first harness for running inference workflows against user-provided documents and user-selected model providers.

### Training Data

Not applicable.

### Training Procedure

Not applicable.

#### Preprocessing [optional]

Not applicable (except local document ingestion/indexing, which is not training).

#### Training Hyperparameters

- Training regime: Not applicable.

#### Speeds, Sizes, Times [optional]

Not applicable for “training.” (Relevant performance notes would be indexing speed, query latency, and daemon startup time, which depend heavily on machine and library size.)
## Evaluation

Quidoris Engine itself is not evaluated like a model, but it can be evaluated as a system (retrieval quality, citation accuracy, latency).

### Testing Data, Factors & Metrics

#### Testing Data

Not applicable as a model-card training set; for system evaluation you can use:

- a curated local doc set with known answers
- “needle-in-haystack” tests (known facts buried in large libraries)
- long-context QA benchmarks adapted to tool-based retrieval (optional)

#### Factors

Suggested evaluation factors:

- document types: markdown, txt, html, pdf→text, code, emails
- library size: 100 docs vs 10k docs
- chunk sizes: 8KB vs 32KB
- query styles: keyword vs natural language
- provider differences: local vs hosted LLMs
#### Metrics

Suggested metrics for system-level evaluation (a recall@k sketch follows this list):

Retrieval:
- recall@k (did the correct doc appear in the top k?)
- snippet precision (did the snippet contain the needed evidence?)

Evidence correctness:
- citation validity (does the cited text actually support the claim?)

Latency:
- search time
- end-to-end “question → answer” time

Stability:
- incremental indexing correctness
- reproducibility across runs
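A minimal sketch of computing recall@k over a labelled query set. The data shapes here are illustrative assumptions, not a Quidoris API:

```ts
// recall@k: fraction of queries whose gold document appears in the top-k
// retrieved results. Data shapes are illustrative, not a Quidoris API.
interface EvalCase {
  query: string;
  goldDocPath: string; // the document known to contain the answer
  retrieved: string[]; // doc paths returned by search, best first
}

function recallAtK(cases: EvalCase[], k: number): number {
  const hits = cases.filter((c) =>
    c.retrieved.slice(0, k).includes(c.goldDocPath),
  ).length;
  return hits / cases.length;
}

// Example: one hit in the top 3 out of two cases → recall@3 = 0.5
const cases: EvalCase[] = [
  {
    query: "refund policy",
    goldDocPath: "docs/policy.md",
    retrieved: ["docs/policy.md", "docs/faq.md"],
  },
  {
    query: "ssh key rotation",
    goldDocPath: "ops/keys.md",
    retrieved: ["docs/faq.md", "ops/backup.md", "ops/deploy.md"],
  },
];
console.log("recall@3 =", recallAtK(cases, 3));
```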
### Results

Not yet reported in this repository.

#### Summary

Quidoris Engine is an infrastructure layer; evaluation is best done per deployment and provider choice.
## Model Examination [optional]

Not applicable as a trained model. However, Quidoris Engine supports system examination:

- evidence spotlight (inspect the chunks used)
- step traces (for RLM-like action/observation loops)
- logs of search/read actions
## Environmental Impact

Not applicable as a trained model.

For local usage:

- emissions are dominated by whichever model provider you choose (local GPU, hosted inference, etc.)
- Quidoris Engine overhead is primarily:
  - SQLite indexing (CPU)
  - serving the UI + HTTP daemon (minimal)

If you want to report figures, useful ones are (a timing sketch follows):

- indexing energy/time on a reference machine
- typical daemon runtime overhead
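A minimal sketch for the “indexing time on a reference machine” figure. The `indexLibrary` function is a hypothetical stand-in for the real ingestion entry point:

```ts
// Time a full indexing pass so you can report it for a reference machine.
// `indexLibrary` is a hypothetical stand-in for the real ingestion call.
async function indexLibrary(dir: string): Promise<number> {
  // ...walk `dir`, chunk files, insert into FTS5; return docs indexed...
  return 0;
}

const start = performance.now();
const docs = await indexLibrary("./library");
const seconds = (performance.now() - start) / 1000;

console.log(`indexed ${docs} docs in ${seconds.toFixed(1)} s`);
```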
## Technical Specifications [optional]

### Model Architecture and Objective

Quidoris Engine is not a model architecture. It is an inference harness + local environment that:

- stores a document library externally (SQLite + FTS5)
- exposes tools (`search`, `read`) via HTTP
- supports an agent/controller loop (RLM-style) that can recurse and cite evidence
- uses SSE-friendly API patterns for streaming steps/results
### Compute Infrastructure

Runs locally on user machines.

#### Hardware

Typical:

- CPU laptop/workstation
- optional GPU if running local models (not required for the engine itself)

#### Software

- Bun runtime
- SQLite via `bun:sqlite`
- UI: Vite + React (local web app)
- API: local HTTP daemon + launcher proxy
## Citation [optional]

BibTeX:

```bibtex
@software{quidoris_engine_2026,
  author = {Quidoris Research Group},
  title  = {Quidoris Engine: Local-first RLM Harness},
  year   = {2026},
  url    = {https://huggingface.co/Quidoris/Quidoris_Engine}
}
```

APA: Quidoris Research Group. (2026). *Quidoris Engine: Local-first RLM Harness* [Software]. Hugging Face. https://huggingface.co/Quidoris/Quidoris_Engine
## Glossary [optional]
- RLM (Recursive Language Model): an inference strategy where the model treats long prompts as an external environment and recursively operates over parts of it via tools/actions.
- FTS5: SQLite Full-Text Search extension for fast lexical search with ranking/snippets.
- BYOK: Bring Your Own Key — you supply API credentials to providers; tool does not centrally store them.
- Evidence spotlight: UI pattern that opens the exact retrieved snippet(s) supporting a claim.
- SSE: Server-Sent Events — HTTP streaming alternative to WebSockets for one-way event streams.
## More Information [optional]

- OpenAPI spec: `openapi/openapi.yaml`
- API quickstart: `docs/api.md`
- DB schema: `docs/db_schema.sql`
- Error schema: `docs/errors.md`
- Pagination: `docs/pagination.md`
- SSE notes: `docs/sse.md`
## Model Card Authors [optional]
- Quidoris Research Group
## Model Card Contact
- Project contact: Ken (@grokci)