Purgr — Documentation

00 // INSTALL

Get started with the SDK.

Install the Purgr core package via npm to begin integrating context compression into your AI pipeline.

# Install via npm
npm install purgr

View on npm →

01 // QUICK START

Integrate in minutes.

Initialize the Purgr engine and compress your conversation history before sending it to your LLM provider.

import { Purgr } from 'purgr'
import OpenAI from 'openai'

const purgr = new Purgr({ activeWindow: 8, anchorCount: 3, scorerMode: 'auto' })
const openai = new OpenAI()

// Compress before your LLM call
const result = purgr.compress(conversationHistory)

// Pass compressed messages to your LLM
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: result.messages
})

// The signed receipt is in result.signedReceipt
// Print the human-readable receipt box:
console.log(purgr.receipt(result.stats))

# Local proxy
npm install -g purgr
purgr proxy
# Point your app at http://localhost:3000
# Compression happens automatically on every request

# Claude Desktop
npm install -g purgr
purgr proxy
purgr setup claude-desktop
# Restart Claude Desktop; compression is now automatic

02 // CORE CONCEPTS

Architected for trust.

Purgr uses a multi-phase approach to ensure your most critical context is preserved while eliminating noise.

01 / WINDOWING

Active Window Protection.

The most recent N messages are never compressed and always pass through unchanged. Default: 8 messages.
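
As a rough illustration of the windowing rule, the sketch below splits a history into a protected tail and a set of compression candidates. `splitActiveWindow` is a hypothetical helper, not part of the Purgr API, and it deliberately ignores other protections such as `protectedLeadMessages`.

```javascript
// Hypothetical helper: illustrates the active-window rule only.
function splitActiveWindow(messages, activeWindow = 8) {
  // Everything before the cut is eligible for compression.
  const cut = Math.max(0, messages.length - activeWindow)
  return {
    candidates: messages.slice(0, cut),
    protectedTail: messages.slice(cut)
  }
}

const history = Array.from({ length: 10 }, (_, i) => ({
  role: i % 2 === 0 ? 'user' : 'assistant',
  content: `message ${i + 1}`
}))

const { candidates, protectedTail } = splitActiveWindow(history)
console.log(candidates.length, protectedTail.length) // 2 8
```

With a 10-message history and the default window of 8, only the two oldest messages are candidates.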

02 / MOMENTUM

Phase 1 — Momentum Scoring.

Every message receives a momentum score based on topical activity. Messages whose score decays are compressed.

03 / KOOPMAN

Phase 2 — Koopman DMD.

Models conversation as a dynamical system, identifying structural topics that persist across changes.

04 / TRUST

Cryptographic Receipts.

Ed25519-signed receipts with Merkle roots over all decisions. Verifiable locally at purgr.dev/verify.

05 / LIVENESS

CLA Liveness Scoring.

NCD/LZ hash containment identifies logically necessary messages that momentum and DMD would incorrectly compress. Rescued messages are signed in the receipt. Pro feature — enableLiveness: true.
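
For readers unfamiliar with NCD (normalized compression distance), here is a minimal sketch of the general technique using Node's built-in zlib. It shows only the distance measure, under the assumption that deflate is an acceptable compressor; it is not Purgr's liveness scorer, and the sample strings are invented.

```javascript
const zlib = require('node:zlib')

// Compressed size in bytes, used as a practical stand-in for information content.
const compressedSize = (text) => zlib.deflateSync(Buffer.from(text, 'utf8')).length

// Normalized compression distance: lower means the two texts share more content.
function ncd(a, b) {
  const ca = compressedSize(a)
  const cb = compressedSize(b)
  const cab = compressedSize(a + b)
  return (cab - Math.min(ca, cb)) / Math.max(ca, cb)
}

const decision = 'The approved budget for fiscal year 2024 is $2.4M, signed off by the finance committee.'
const followUp = 'Per the finance committee, the approved budget for fiscal year 2024 is $2.4M.'
const unrelated = 'Quick zebras vex jumpy frogs while nobody watches the lighthouse at dawn.'

console.log('related:  ', ncd(decision, followUp).toFixed(2))
console.log('unrelated:', ncd(decision, unrelated).toFixed(2))
```

A later message that depends on an earlier one tends to compress well alongside it, which yields a lower distance than an unrelated pair.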

06 / FACTS

Fact Fidelity Scoring.

Post-compression deterministic verification. Every receipt shows how many critical facts (currency values, dates, identifiers, regulatory citations) survived. Signed in the Ed25519 receipt.
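
The counting step can be pictured with a small sketch. The two regular expressions below are illustrative assumptions covering only simple currency values and ISO dates; Purgr's actual patterns are not documented here, and `factFidelity` is a hypothetical name.

```javascript
// Illustrative patterns only (assumed, not Purgr's own).
const FACT_PATTERNS = [
  /\$\d+(?:,\d{3})*(?:\.\d+)?[KMB]?/g, // currency such as $2.4M or $47,382
  /\b\d{4}-\d{2}-\d{2}\b/g             // ISO dates such as 2024-03-01
]

function extractFacts(text) {
  const facts = new Set()
  for (const pattern of FACT_PATTERNS) {
    for (const match of text.matchAll(pattern)) facts.add(match[0])
  }
  return facts
}

// Compare the facts found before compression with those found after it.
function factFidelity(original, compressed) {
  const before = extractFacts(original)
  const after = extractFacts(compressed)
  const preserved = [...before].filter((fact) => after.has(fact)).length
  const score = before.size === 0 ? 1 : preserved / before.size
  return { preserved, total: before.size, score }
}

const fidelity = factFidelity(
  'Budget of $2.4M approved on 2024-03-01. Late fee: $500.',
  'Budget of $2.4M approved on 2024-03-01.'
)
console.log(`${fidelity.preserved}/${fidelity.total} facts preserved`) // 2/3 facts preserved
```

Because extraction is pure pattern matching, the same input always yields the same score, which is what makes the result suitable for signing.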

07 / RECALL

Purgr Recall.

Runs the full intelligence pipeline on both a source document and an AI response independently. Compares what was identified as important in the source against what survived in the response. Classifies every fact as Present, Modified, or Absent. Produces a signed Recall Receipt suitable for compliance documentation. Any value discrepancy — regardless of magnitude — is flagged for analyst review.

03 // API REFERENCE

Comprehensive control.

The Purgr SDK provides low-level methods for granular control over the compression engine and receipt generation.

compress(messages, options?) → CompressionResult

Primary method for compressing conversation history.

Parameter	Type	Req	Description
messages	Message[]	Yes	Array of role/content objects
options.query	string	No	Binds receipt to this prompt
options.response	string	No	Commits response to receipt

const result = purgr.compress(messages, {
  query: "What was the approved budget?",
  response: "The approved budget is $2.4M"
})

compressDocument(text, config?) → CompressionResult

Optimized for long-form documents (PDFs, transcripts, reports).

Parameter	Type	Default	Description
text	string	(required)	Raw document text
config.profile	string	'balanced'	'conservative' | 'balanced' | 'aggressive'

Profile	Protection
conservative	████████ High protection
balanced	█████░░░ Balanced
aggressive	███░░░░░ Minimal protection

compressCorpus(chunks, options?) → CompressionResult

For RAG pipelines. Deduplicates and filters retrieved chunks.

Parameter	Type	Default	Description
chunks	CorpusChunk[]	(required)	Retrieved chunks with metadata
options.budget	number	8000	Maximum output tokens
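
To make "deduplicates and filters" concrete, the following sketch shows one simple way such a step could work. The chunk shape `{ text, score }`, the chars/4 token estimate, and the `dedupeAndFit` helper are all assumptions for illustration; the real `CorpusChunk` type and selection logic may differ.

```javascript
// Assumed chunk shape: { text, score }. Tokens are estimated as chars / 4.
const estimateTokens = (text) => Math.ceil(text.length / 4)

function dedupeAndFit(chunks, budget = 8000) {
  const seen = new Set()
  const kept = []
  let used = 0
  // Visit the highest-scoring chunks first so the budget goes to the best matches.
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    const key = chunk.text.trim().toLowerCase().replace(/\s+/g, ' ')
    if (seen.has(key)) continue // duplicate after whitespace and case normalization
    const cost = estimateTokens(chunk.text)
    if (used + cost > budget) continue // would exceed the output budget
    seen.add(key)
    kept.push(chunk)
    used += cost
  }
  return { chunks: kept, tokens: used }
}

const retrieved = [
  { text: 'Alpha beta', score: 0.9 },
  { text: 'alpha  beta ', score: 0.8 },
  { text: 'x'.repeat(40), score: 0.5 }
]
const fitted = dedupeAndFit(retrieved, 5)
console.log(fitted.chunks.length, fitted.tokens) // 1 3
```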

verifyDocument(documentText, llmResponse, options?) → VerifyResult

Standalone stateless function. No Purgr instance required. Proves which specific numbers, dates, and identifiers in an LLM response exist in the source document.

Parameter	Type	Req	Description
documentText	string	Yes	Source document
llmResponse	string	Yes	AI response to verify
options.query	string	No	Optional query for receipt binding
options.privateKey	string	No	If provided, signs the result
options.publicKey	string	No	Required when privateKey provided

Returns: { groundingScore, groundedFacts, derivedFacts, ungroundedFacts, signedReceipt? }
Three categories: Grounded — fact present verbatim in document. Derived — mathematically computable from adjacent document facts. Ungrounded ⚠ — not traceable, verify manually.
Honest scope: proves token presence, not relational accuracy.

import { verifyDocument } from 'purgr'

const result = verifyDocument(contractText, llmAnalysis)
console.log(`Grounding: ${Math.round(result.groundingScore * 100)}%`)
console.log('Grounded:', result.groundedFacts)
console.log('⚠ Ungrounded:', result.ungroundedFacts)
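
The three categories can be illustrated with a toy classifier over plain numbers. This is a sketch under stated assumptions: "derived" is narrowed to the sum or difference of two adjacent document numbers, and `classifyNumbers` is a hypothetical helper rather than the library's implementation.

```javascript
// Pull every integer or decimal out of a piece of text.
const numbersIn = (text) => (text.match(/\d+(?:\.\d+)?/g) || []).map(Number)

function classifyNumbers(documentText, responseText) {
  const docNumbers = numbersIn(documentText)
  const verbatim = new Set(docNumbers)
  // Assumption: "derived" means a sum or difference of two adjacent document numbers.
  const derivable = new Set()
  for (let i = 0; i + 1 < docNumbers.length; i++) {
    derivable.add(docNumbers[i] + docNumbers[i + 1])
    derivable.add(Math.abs(docNumbers[i] - docNumbers[i + 1]))
  }
  const verdict = { grounded: [], derived: [], ungrounded: [] }
  for (const n of numbersIn(responseText)) {
    if (verbatim.has(n)) verdict.grounded.push(n)
    else if (derivable.has(n)) verdict.derived.push(n)
    else verdict.ungrounded.push(n)
  }
  return verdict
}

const doc = 'The base fee is 100 and the surcharge is 20.'
const answer = 'The total is 120: a base fee of 100 plus a surcharge. A penalty of 7 applies.'
console.log(classifyNumbers(doc, answer))
// { grounded: [ 100 ], derived: [ 120 ], ungrounded: [ 7 ] }
```

The toy version also shows the stated scope limit: it checks that values are traceable, not that the sentence around them is true.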
              

Purgr.parseConversation(text) → ParseResult | null

Static method. Detects and parses any conversation export format into a standard messages array.

Supported formats:

Format	Description
openai-api	JSON with messages array
chatgpt-export	JSON with mapping object
claude-web-export	Claude.ai export format
human-assistant	Human:/Assistant: format
chatgpt-text	User:/ChatGPT: format
prose-transcript	Any continuous prose, podcasts, interviews, raw chat logs

Returns: { messages, format, confidence } | null

const parsed = Purgr.parseConversation(rawText)
if (parsed) {
  console.log(`Format: ${parsed.format} (${parsed.confidence} confidence)`)
  const result = purgr.compress(parsed.messages)
}
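
To show what parsing one of these formats involves, here is a minimal sketch for the human-assistant format only. `parseHumanAssistant` is a hypothetical stand-in, not the library's parser; it performs no format detection or confidence scoring and simply returns a messages array or null.

```javascript
// Hypothetical parser for the "Human:/Assistant:" text format only.
function parseHumanAssistant(text) {
  const roleMap = { Human: 'user', Assistant: 'assistant' }
  const messages = []
  for (const line of text.split(/\r?\n/)) {
    const match = /^(Human|Assistant):\s*(.*)$/.exec(line)
    if (match) {
      // A speaker label starts a new message.
      messages.push({ role: roleMap[match[1]], content: match[2] })
    } else if (messages.length > 0 && line.trim() !== '') {
      // Unlabelled lines continue the previous message.
      messages[messages.length - 1].content += '\n' + line
    }
  }
  return messages.length > 0 ? messages : null
}

const transcript = 'Human: Hi\nAssistant: Hello\nthere'
console.log(parseHumanAssistant(transcript))
```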
              

proveDecision(messageId) → MerkleProof | null

Generates a Merkle proof for a single compression decision. Allows proving one message outcome without revealing the full session.

Returns: { messageId, leafHash, proof, root } | null
The proof can be verified by any third party using verifyMerkleProof() without access to the full receipt.

import { verifyMerkleProof } from 'purgr'

// Prove a single message was preserved
const proof = purgr.proveDecision('msg-42')

// Third-party verification: no full session required.
// proveDecision() may return null, so guard before use.
if (proof) {
  const valid = verifyMerkleProof(proof.leafHash, proof.proof, proof.root)
  // proof.root should match receipt.payload.merkleRoot
}
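
For intuition about what such a proof contains, the sketch below builds a small binary Merkle tree and verifies one leaf from its sibling hashes alone. The proof step shape `{ hash, position }`, hex-string concatenation, and promotion of an unpaired node are assumptions of this sketch; Purgr's own proof encoding may differ, and `verifyProofSketch` is not the library's `verifyMerkleProof`.

```javascript
const { createHash } = require('node:crypto')

const sha256 = (data) => createHash('sha256').update(data).digest('hex')
const hashPair = (left, right) => sha256(left + right)

// Build every level of the tree, from leaf hashes up to the single root.
function buildLevels(leafHashes) {
  const levels = [leafHashes]
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1]
    const next = []
    for (let i = 0; i < prev.length; i += 2) {
      // An unpaired node is promoted unchanged (assumption of this sketch).
      next.push(i + 1 < prev.length ? hashPair(prev[i], prev[i + 1]) : prev[i])
    }
    levels.push(next)
  }
  return levels
}

// Collect the sibling hashes needed to rebuild the root from one leaf.
function proveLeaf(levels, index) {
  const proof = []
  for (let depth = 0; depth < levels.length - 1; depth++) {
    const sibling = index % 2 === 0 ? index + 1 : index - 1
    if (sibling < levels[depth].length) {
      proof.push({ hash: levels[depth][sibling], position: index % 2 === 0 ? 'right' : 'left' })
    }
    index = Math.floor(index / 2)
  }
  return proof
}

// Recompute the root from the leaf and its siblings, then compare.
function verifyProofSketch(leafHash, proof, root) {
  let current = leafHash
  for (const step of proof) {
    current = step.position === 'left' ? hashPair(step.hash, current) : hashPair(current, step.hash)
  }
  return current === root
}

const leaves = ['keep msg-1', 'compress msg-2', 'keep msg-3', 'compress msg-4', 'keep msg-5'].map(sha256)
const levels = buildLevels(leaves)
const root = levels[levels.length - 1][0]
const proofForThird = proveLeaf(levels, 2)

console.log(verifyProofSketch(leaves[2], proofForThird, root)) // true
console.log(verifyProofSketch(leaves[0], proofForThird, root)) // false
```

The verifier sees only one leaf hash and a few sibling hashes, which is why the remaining decisions stay private.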

recall(documentText, aiResponse, options?) → RecallResult

Standalone async function. No Purgr instance required. Runs the full Purgr intelligence pipeline on both the source document and the AI response with compression disabled (maximum preservation), so that only the pipeline's fact identification is used. Compares what was identified as important in the source against what is present in the response, and produces a signed Recall Receipt intended as a compliance-grade artifact.

Parameter	Type	Req	Description
documentText	string	Yes	Source document — contract, filing, report
aiResponse	string	Yes	The AI response to verify against the source
options.query	string	No	The prompt sent to AI — hashed and bound to receipt
options.chunkStrategy	string	No	'auto' (default), 'paragraph', 'sentence', 'fixed'
options.chunkSize	number	No	Chunk size when chunkStrategy is 'fixed'
options.signer	object	No	{ privateKey, publicKey } — uses persistent key if omitted

Returns: { receipt, warnings, error?, latencyMs }
Receipt contains:

factsPresent, factsAbsent, factsModified, recallScore, documentPipeline, responsePipeline, merkleRoot, signature, publicKey

Three verdict categories: Present — fact found in response with identical value. Modified ⚠ — fact found but value differs — ANY discrepancy flagged regardless of magnitude. Absent ✗ — fact in source not found in response.
Important note: Modified verdict flags ANY numeric discrepancy — $47,382 vs $47,000 is modified. The analyst determines significance. Purgr does not make materiality judgments.
Pipeline info documented in receipt: which scorer phase ran (1 or 2), whether co-occurrence activated, whether liveness ran, whether fallback to regex occurred, chunk count and strategy used.

import { recall } from 'purgr'

const result = await recall(contractText, aiAnalysis, {
  query: "Summarize key financial obligations"
})

if (result.error) {
  console.error(result.error)
} else {
  console.log(`Recall Score: ${Math.round(result.receipt.recallScore * 100)}%`)
  console.log('Modified facts requiring review:', result.receipt.factsModified)
  console.log('Absent facts:', result.receipt.factsAbsent)
  if (result.warnings.length > 0) console.warn(result.warnings)
}
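
The verdict rules can be sketched independently of the pipeline. The example assumes facts have already been extracted and paired by a hypothetical `key`; how Purgr actually pairs a source fact with its counterpart in the response is not described here. The score follows the documented definition of recallScore as the fraction of source facts present in the response.

```javascript
// Assumed fact shape: { key, value }, where `key` pairs a source fact with its counterpart.
function classifyFacts(sourceFacts, responseFacts) {
  const inResponse = new Map(responseFacts.map((fact) => [fact.key, fact.value]))
  const verdicts = { present: [], modified: [], absent: [] }
  for (const fact of sourceFacts) {
    if (!inResponse.has(fact.key)) {
      verdicts.absent.push(fact)
    } else if (inResponse.get(fact.key) === fact.value) {
      verdicts.present.push(fact)
    } else {
      // Any difference counts as Modified; no materiality threshold is applied.
      verdicts.modified.push({ ...fact, responseValue: inResponse.get(fact.key) })
    }
  }
  return verdicts
}

const sourceFacts = [
  { key: 'contract value', value: '$47,382' },
  { key: 'effective date', value: '2024-03-01' },
  { key: 'termination fee', value: '$5,000' }
]
const responseFacts = [
  { key: 'contract value', value: '$47,000' },
  { key: 'effective date', value: '2024-03-01' }
]

const verdicts = classifyFacts(sourceFacts, responseFacts)
const recallScore = verdicts.present.length / sourceFacts.length
console.log(verdicts.modified, verdicts.absent, recallScore.toFixed(2))
```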
              

05 // CONFIGURATION

Tune for performance.

Adjust constructor options to balance between aggressive token reduction and context preservation.

Option	Type	Default	Description
activeWindow	number	8	Messages protected from compression
anchorCount	number	3	Max anchor summaries injected
scorerMode	string	'auto'	'momentum', 'dmd', or 'auto'
enableLiveness	boolean	false	CLA liveness scoring — rescues logically necessary messages. Pro feature.
enableCoOccurrence	boolean	false	Co-occurrence semantic similarity. Activates after 50 messages. Pro feature.
coOccurrenceWindow	number	10	Context window size for co-occurrence pairs
protectedLeadMessages	number	5	Oldest N messages permanently protected
factFidelityMaxTokens	number	500000	Token limit for fact fidelity pass. Set 0 to disable.
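
As an example of leaning toward context preservation, the options object below uses only the documented keys. The values are illustrative rather than recommended settings, and the name `preservationFirstOptions` is made up for this sketch.

```javascript
// Every key is a documented constructor option; the values are illustrative.
const preservationFirstOptions = {
  activeWindow: 12,           // protect more recent messages than the default of 8
  anchorCount: 3,
  scorerMode: 'auto',
  protectedLeadMessages: 5,
  enableLiveness: true,       // Pro feature
  enableCoOccurrence: true,   // Pro feature; activates after 50 messages
  coOccurrenceWindow: 10,
  factFidelityMaxTokens: 500000
}

// Assumed usage, following the constructor shown in the quick start:
// const purgr = new Purgr(preservationFirstOptions)
```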

08 // LIMITATIONS

Transparent boundaries.

Purgr is highly effective but has known limitations regarding semantic paraphrase and fact protection signals.

01 / SEMANTIC

Semantic NIAH.

Pure semantic paraphrase needle-in-a-haystack (NIAH) retrieval scores 0% without an embedding model for conceptual fallback.

02 / TOPICS

Topic Resurrection.

DMD may compress early topics before resurrection signals arrive in sessions longer than 100 turns.

03 / DETAIL

General Detail.

Fact protection triggers on high-specificity signals (currency, dates). Casual details are not protected.

05 / RELATIONAL

Relational Accuracy.

Recall and verifyDocument() prove specific facts were present — dollar amounts, dates, identifiers. They do not verify relational accuracy. "Entity A wired $50K to Entity B" and "Entity B wired $50K to Entity A" both contain the same facts and both score as present. SVO triple extraction for relational grounding is on the roadmap.

06 / CUSTOM PATTERNS

Custom Fact Patterns.

The fact identification engine recognizes standard compliance fact types — currency, dates, percentages, regulatory citations, FAR/DFARS clauses, CLIN numbers. Custom fact patterns per engagement — CAGE codes, contract-specific identifiers, domain-specific terms — are on the roadmap for the compliance enterprise build.

04 // INTEGRATION PATTERNS

Works with your stack.

Purgr is provider-agnostic. Use these patterns to integrate with OpenAI, Anthropic, or local models.

// OpenAI
import { Purgr } from 'purgr'
import OpenAI from 'openai'

const purgr = new Purgr({ activeWindow: 8 })
const openai = new OpenAI()

const result = purgr.compress(conversationHistory)

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: result.messages
})

// Anthropic
import { Purgr } from 'purgr'
import Anthropic from '@anthropic-ai/sdk'

const purgr = new Purgr({ activeWindow: 8 })
const client = new Anthropic()

const result = purgr.compress(conversationHistory)

const response = await client.messages.create({
  model: 'claude-3-5-sonnet-latest',
  max_tokens: 1024,
  messages: result.messages
})

// Local models through an OpenAI-compatible endpoint
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:11434/v1', // Ollama
  apiKey: 'not-needed'
})

07 // CLI REFERENCE

Command line power.

The Purgr CLI provides tools for proxying LLM traffic and managing local receipts.

Command	Description
purgr proxy	Start the local compression proxy
purgr setup claude-desktop	Configure Claude Desktop integration
purgr mcp	Start MCP server
purgr receipts	View today's receipt log
purgr recall --document [file] --response [file]	Generate a Recall Receipt comparing document against AI response
purgr recall --document [file] --response [file] --query "text"	Recall with query binding
purgr receipts --verify [receipt.json]	Verify a saved receipt

06 // RECEIPTS

Verifiable context.

Every compression produces an Ed25519-signed receipt forming a tamper-evident chain.

Field	Description
payload.inputHash	SHA-256 of uncompressed input
payload.merkleRoot	Merkle root over all decisions
signature	Ed25519 signature over payload
payload.decisionsCommitted	Count of individual decisions committed to Merkle tree
payload.receiptChainLength	Number of prior receipts linked in session chain
payload.factFidelityScore	Fraction of critical facts preserved (0.0–1.0)
payload.factFidelityPreserved	Count of facts preserved
payload.factFidelityTotal	Total facts detected
payload.livenessRescued	Messages rescued from compression by CLA liveness
payload.coOccurrenceActive	Whether co-occurrence matrix was active

Recall Receipt — additional fields

Field	Description
payload.receiptType	'recall' — identifies this as a Recall Receipt
payload.documentHash	SHA-256 of exact source document text
payload.responseHash	SHA-256 of exact AI response text
payload.recallScore	Fraction of source facts present in response (0.0–1.0)
payload.factsPresentCount	Count of facts found in response with identical value
payload.factsAbsentCount	Count of source facts not found in response
payload.factsModifiedCount	Count of facts found but with any value discrepancy
payload.documentPipeline	Pipeline info — scorer phase, co-occurrence, liveness, fallback status
payload.responsePipeline	Same for response side
payload.merkleRoot	Binary Merkle tree root over all fact verdicts sorted by raw value
payload.keyMode	'persistent' — engagement key / 'ephemeral' — session key only

Open Verification Portal

10 // PURGR PASSPORT

Portable session state.

A .purgr file is a portable session snapshot carrying compressed messages and signed receipts.

FREE TIER

Basic Portability.

✓ Compressed messages
✓ Signed cryptographic receipt
– No DMD memory
– No encryption

PRO TIER

Full Continuity.

✓ Everything in Free
✓ Full Purgr State (DMD memory)
✓ AES-256 Encryption

STATE CONTINUITY: Purgr State is the Koopman DMD model fitted to your session. Importing it resumes true continuity rather than replaying raw messages.

purgr export --passport
purgr export --passport --output session.purgr

# Pro: encrypted export
purgr export --passport --encrypt --passphrase "yourpass"

const passportJson = await purgr.exportPassport(messages, {
  description: 'my session',
  modelHint: 'gpt-4o'
})

const payload = await purgr.importPassport(json, {
  passphrase: 'yourpass'
})
// payload.messages, payload.receipt, payload.koopmanState
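
For a sense of what passphrase-based AES-256 encryption of a passport could look like, here is a sketch with Node's crypto module. The choice of AES-256-GCM, scrypt key derivation, and the `{ salt, iv, tag, data }` envelope are all assumptions; the actual .purgr file layout and cipher mode are not documented here.

```javascript
const { createCipheriv, createDecipheriv, randomBytes, scryptSync } = require('node:crypto')

// Assumptions: AES-256-GCM with a scrypt-derived key; not the real .purgr format.
function encryptPassport(json, passphrase) {
  const salt = randomBytes(16)
  const iv = randomBytes(12)
  const key = scryptSync(passphrase, salt, 32) // 32 bytes = 256-bit key
  const cipher = createCipheriv('aes-256-gcm', key, iv)
  const data = Buffer.concat([cipher.update(json, 'utf8'), cipher.final()])
  return { salt, iv, tag: cipher.getAuthTag(), data }
}

function decryptPassport({ salt, iv, tag, data }, passphrase) {
  const key = scryptSync(passphrase, salt, 32)
  const decipher = createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag)
  // final() throws if the passphrase is wrong or the data was altered.
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8')
}

const sealed = encryptPassport(JSON.stringify({ description: 'my session' }), 'yourpass')
console.log(decryptPassport(sealed, 'yourpass')) // {"description":"my session"}
```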

09 // FAQ

Common questions.

Everything you need to know about Purgr's performance and streaming support.

Does Purgr work with streaming?

Yes. Both the proxy and the SDK support streaming. Receipts are signed after buffering completes.

How much latency does it add?

Phase 1 momentum: ~50ms amortized per call at 100k tokens. Phase 2 DMD: ~240ms amortized per call at 100k tokens. With CLA liveness enabled: adds ~942ms on 242k token sessions. All latency measured on lifecycle cadence benchmark — compress() called every 10 messages.

What is fact fidelity scoring?

After every compression, Purgr scans the output for critical facts — currency values, dates, identifiers, regulatory citations — and reports how many survived. 143/143 facts were preserved in a 242k token session at 53% compression. The score is deterministic and signed in the receipt.

What does verifyDocument() prove?

verifyDocument() proves which specific numbers, dates, and identifiers in an LLM response exist in the source document. It does not verify relational accuracy: a response that inverts a relationship while using correct tokens still scores as grounded. Treat it as hard-fact traceability, not hallucination detection.

What is selective disclosure?

Using proveDecision(), you can prove a single message outcome without revealing the full session. The Merkle proof contains only the sibling hashes needed to reconstruct the root — the rest of the session stays private. Any third party can verify using verifyMerkleProof().

What is the difference between verifyDocument() and recall()?

verifyDocument() is a lightweight stateless function — fast, no pipeline, deterministic regex extraction. recall() runs the full Purgr intelligence pipeline on both the source document and the AI response — fact identification is informed by momentum scoring, DMD structural analysis, co-occurrence semantic linking, and liveness detection where thresholds are met. recall() produces a richer, more intelligent Recall Receipt and is the recommended function for compliance use cases. verifyDocument() remains useful for quick spot checks where pipeline latency is a concern.

What does 'Modified' mean in a Recall Receipt?

A fact is classified as Modified when it appears in both the source document and the AI response but with different values. ANY discrepancy is flagged regardless of magnitude — $47,382 in the source and $47,000 in the response is Modified. $100 vs $99 is Modified. Purgr does not make materiality judgments — that is the analyst's responsibility. The receipt surfaces the discrepancy. The professional determines whether it matters.

What does 'fallbackToRegex' mean in the pipeline info?

When a document produces fewer than 5 chunks — too short for the pipeline to build meaningful signal — recall() falls back to deterministic regex extraction for fact identification on that side. The receipt is still cryptographically signed and tamper-evident. The fallbackToRegex field in documentPipeline or responsePipeline is set to true and a warning is included in the result. For compliance use, documents should be long enough to activate the full pipeline — at minimum 10-15 substantial paragraphs.
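
The threshold described above can be checked ahead of time with a rough estimate. This sketch assumes paragraph chunking on blank lines, which may not match what the 'auto' chunk strategy actually does, and `planExtraction` is a hypothetical helper.

```javascript
// Hypothetical helper: paragraph chunking plus the documented 5-chunk threshold.
const MIN_CHUNKS_FOR_PIPELINE = 5

function planExtraction(text) {
  // Assumption: one chunk per non-empty paragraph, split on blank lines.
  const chunks = text.split(/\n\s*\n/).map((p) => p.trim()).filter(Boolean)
  return {
    chunkCount: chunks.length,
    fallbackToRegex: chunks.length < MIN_CHUNKS_FOR_PIPELINE
  }
}

const shortMemo = 'Payment is due in 30 days.\n\nLate fees apply after that.'
console.log(planExtraction(shortMemo)) // { chunkCount: 2, fallbackToRegex: true }
```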