Chat with your notes.
Not in the cloud. On your machine.

PKMA is a desktop AI assistant for your Obsidian vault. Semantic search, a streaming agent, wikilink-aware retrieval — running entirely against your own files and your own model. No accounts. No upload. No telemetry.

Download for Windows · Browse source · Read the docs

0 KB data uploaded

~8 MB RAM at idle

1 binary: no Electron, no Docker

4× OpenAI-compatible providers

research-vault / Chat · Indexing into vector store

ctx 12.4k / 32k ⌘1 ⌘2

How do I keep my frontmatter consistent across atomic notes? Pull the conventions I've written down and any examples.

PKMA · normal · gemma4 (local)

vault_search · hybrid · 14 hits · 320ms

Your conventions live in Frontmatter — house rules and are referenced from Atomic note template. The short version:

Every note gets a title, a kebab-case id, an ISO created date, and a flat tags: list. type is one of atomic, map, journal, or reference.

The examples in Notes on Eberhardt 2019 drift — they nest tags under a parent key. Want me to propose a fix and show the diff?

Show me the diff for the drifted notes

effort: medium tools: sqlite mode: normal gemma4

PKMA desktop app — chat panel showing an example conversation about Obsidian frontmatter conventions

The Three Promises

Built for people who would rather not upload their second brain.

Most "chat with your notes" tools require uploading your vault to a third party. PKMA was designed from day one so that nothing ever has to leave your machine.

Local

Your vault, the index, the vector store, the model — everything runs on your machine. The server lives on 127.0.0.1:8008. There's nowhere else for your data to go.

SQLite + FTS5 + sqlite-vec on disk
No cloud bucket, no remote DB
blake3-hashed, incremental index

Offline

Pair it with Ollama or LM Studio and PKMA runs with the network unplugged. Board a flight without Wi-Fi and the agent, retrieval, and embeddings keep working.

Works fully air-gapped
No remote auth, no licence ping
Embeddings run locally too

Private

No accounts. No telemetry. No analytics SDK. Source-available so you can read the few hundred lines of network code yourself and confirm there's no phone-home.

Zero telemetry, ever
No login wall, no licence key
Source-available · PKMA PUL

Bring your own model — any OpenAI-compatible endpoint, for both chat and embeddings.

Ollama LM Studio OpenRouter llama.cpp Custom · OpenAI-compat

Everything in the binary

A research partner that actually reads your notes.

First-class understanding of wikilinks, frontmatter, tags, and the standard .md vault layout. Plus a real agent loop on top.

vault_search

Hybrid retrieval, fused with RRF

FTS5 keyword search and semantic vector search fused together with reciprocal-rank fusion. Catches the note that uses different words AND the note that uses your exact phrase.
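
For readers who want to see the fusion step spelled out, here is a minimal sketch of reciprocal-rank fusion over two ranked lists of note ids. It is an illustration under stated assumptions, not PKMA's actual code: the function name `rrf_fuse` is hypothetical, and `k = 60.0` is the constant commonly used with RRF, which may differ from whatever PKMA ships with.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of note ids with reciprocal-rank fusion.
/// Each list is ordered best-first; `k` damps the weight of the top ranks.
/// (Hypothetical helper; the name and signature are assumptions.)
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            // Ranks are 1-based: each list contributes 1 / (k + rank).
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties fall back to id order so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A note that appears in both the keyword list and the semantic list collects a contribution from each, so it tends to outrank a note that only one retriever found, which matches the behaviour described above.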

graph_neighbors

Wikilink-graph aware

A petgraph-backed link traversal lets the agent walk the [[graph]] up to three hops deep, surfacing connected ideas the embeddings missed.
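
A hop-limited walk of this kind can be sketched as a breadth-first search. The version below uses a plain `HashMap` adjacency list as a stand-in for the petgraph structure mentioned above, and the function name `neighbors_within` is an assumption made for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first walk over a wikilink adjacency map, stopping at `max_hops`.
/// Returns (note, hop distance) pairs in discovery order, excluding the start note.
/// (Hypothetical helper; a HashMap stands in for the real graph type.)
fn neighbors_within(
    links: &HashMap<&str, Vec<&str>>,
    start: &str,
    max_hops: usize,
) -> Vec<(String, usize)> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<(&str, usize)> = VecDeque::new();
    let mut found = Vec::new();
    seen.insert(start);
    queue.push_back((start, 0));
    while let Some((note, hops)) = queue.pop_front() {
        if hops == max_hops {
            continue; // hop budget exhausted along this path
        }
        if let Some(targets) = links.get(note) {
            for &next in targets {
                // `insert` returns false for notes already visited, which also breaks cycles.
                if seen.insert(next) {
                    found.push((next.to_string(), hops + 1));
                    queue.push_back((next, hops + 1));
                }
            }
        }
    }
    found
}
```

Calling it with `max_hops` set to 3 corresponds to the "three hops deep" ceiling; smaller values give the shallower walks used by the lighter agent modes.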

SSE

Streaming agent loop

Token-by-token responses with live tool-call status, thinking steps, and an interruptible composer.

watcher

Incremental indexing

File watcher + blake3 content-hash dedupe. Only changed notes get re-embedded — and quarantine catches the ones that don't parse.

tauri v2

One desktop binary

A single Tauri binary. No Electron, no Docker, no Python service, no daemon to babysit. Native window chrome, ~8 MB at idle.

multi-vault

Per-vault isolation

Each vault gets its own SQLite database. Switching vaults is cancellation-aware — in-flight indexing aborts cleanly.

obsidian-cli

Live Obsidian tools

Optionally swap the search/read surface for live obsidian-cli tools: backlinks, properties, tasks, outline, daily note.

notes/lookup

Smart citation pills

The agent's replies cite Note Title inline. Each pill is resolved with exact / FTS / LIKE fallbacks, so you see at a glance whether the match is solid or a guess. Hover to preview; click to open in Obsidian.

OpenAI-compat

Provider-agnostic, both ways

Anything speaking the OpenAI chat-completions wire format works for chat. For embeddings, either Ollama's /api/embed or the OpenAI-compatible /v1/embeddings. Swap models per vault from inside the app.

Three speeds

Agent modes for the question you're actually asking.

Each mode has its own tool budget and retrieval cap. Set a default in settings; override per-thread when one query needs more depth.

Fast

One-hop answers.

Single retrieval, no tool follow-ups. For lookups you could almost do with grep.

tool calls: ≤ 1

retrieval k: 8

typical latency: 1-2 s

Normal · default

The everyday research loop.

Multi-step retrieval with graph walks and quote-checks. The mode you actually use.

tool calls: ≤ 6

retrieval k: 20

graph depth: 2

Deep Research

Long-horizon synthesis.

Aggressive tool budgets, deep graph traversal, cross-note reasoning. Pour a coffee.

tool calls: ≤ 20

retrieval k: 60

graph depth: 3
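
One way to picture the three budgets and the per-thread override is as a small lookup table. The sketch below copies the numbers quoted on this page; the type and function names (`AgentMode`, `ModeBudget`, `effective_mode`) are assumptions, and Fast mode's graph depth is left as `None` because the page does not state one.

```rust
/// Per-mode limits, using the figures quoted on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ModeBudget {
    max_tool_calls: u32,
    retrieval_k: u32,
    /// `None` where no graph depth is listed (Fast mode).
    graph_depth: Option<u32>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentMode {
    Fast,
    Normal,
    DeepResearch,
}

impl AgentMode {
    fn budget(self) -> ModeBudget {
        match self {
            AgentMode::Fast => ModeBudget { max_tool_calls: 1, retrieval_k: 8, graph_depth: None },
            AgentMode::Normal => ModeBudget { max_tool_calls: 6, retrieval_k: 20, graph_depth: Some(2) },
            AgentMode::DeepResearch => ModeBudget { max_tool_calls: 20, retrieval_k: 60, graph_depth: Some(3) },
        }
    }
}

/// A per-thread override, when present, wins over the default from settings.
fn effective_mode(default: AgentMode, thread_override: Option<AgentMode>) -> AgentMode {
    thread_override.unwrap_or(default)
}
```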

Citations you can actually trust

Every claim links back to a note you wrote.

Wikilinks rendered as pills. Match quality visible at a glance. Hover to preview, click to open in Obsidian. No more "the model said so, somewhere".

Assistant · synthesis

The methodology you settled on in the second draft borrows the sampling frame from Eberhardt 2019 but swaps the bootstrap routine for the one in Resampling — house notes. There's also a note suggesting you reconsider — see Concerns: small-n.

Methodology — final Eberhardt 2019 Resampling — house notes Concerns: small-n

exact fts like / fuzzy unresolved

Eberhardt 2019

type: reference · tags: methods, sampling

A re-read of Eberhardt's framework. Useful for the boundary conditions; the bootstrap implementation here is the part I actually reuse...

The agent quotes you, not the internet.

PKMA only retrieves from the vault you pointed it at. There is no web search tool, no scraping, no training data leak. When the assistant cites a source, it's a note you wrote — and the pill tells you exactly how confidently it matched.

● exact — title matched verbatim. The pill is solid.
● fts — full-text fallback. Confident but not perfect.
● like / fuzzy — fuzzy match. Rendered dashed; treat with care.
● unresolved — agent invented a title that doesn't exist. Broken-link pill.
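
The four match levels above amount to a fallback chain: try the strictest match first and record which step succeeded. Here is a rough in-memory sketch of that idea. It is not the real resolver: whole-word matching stands in for the FTS5 query, a case-insensitive substring test stands in for SQL LIKE, and all names (`MatchQuality`, `resolve_citation`, `tokens`) are hypothetical.

```rust
/// How a cited title was matched, from most to least trustworthy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MatchQuality {
    Exact,
    Fts,
    Like,
    Unresolved,
}

/// Lowercased alphanumeric words of a title.
fn tokens(s: &str) -> Vec<String> {
    s.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Try each strategy in order and stop at the first hit.
fn resolve_citation(cited: &str, titles: &[&str]) -> (MatchQuality, Option<String>) {
    let wanted = cited.trim();
    // 1. Exact: the title matches verbatim.
    if let Some(t) = titles.iter().copied().find(|t| *t == wanted) {
        return (MatchQuality::Exact, Some(t.to_string()));
    }
    // 2. Stand-in for the FTS fallback: every cited word is a whole word of the title.
    let words = tokens(wanted);
    if !words.is_empty() {
        let hit = titles.iter().copied().find(|t| {
            let title_words = tokens(t);
            words.iter().all(|w| title_words.contains(w))
        });
        if let Some(t) = hit {
            return (MatchQuality::Fts, Some(t.to_string()));
        }
    }
    // 3. Stand-in for the LIKE fallback: case-insensitive substring.
    let needle = wanted.to_lowercase();
    if !needle.is_empty() {
        if let Some(t) = titles.iter().copied().find(|t| t.to_lowercase().contains(&needle)) {
            return (MatchQuality::Like, Some(t.to_string()));
        }
    }
    // 4. Nothing matched: the pill would render as a broken link.
    (MatchQuality::Unresolved, None)
}
```

Returning the quality alongside the resolved title is what lets the interface style each pill differently instead of presenting every citation as equally certain.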

Under the hood

One Tauri binary. Two things in the same process.

A Rust core boots an Axum HTTP server on 127.0.0.1:8008; a Next.js webview talks to it over HTTP + SSE. That's the whole architecture.

webview: Next.js 16 · React 19 · OKLCH shell

↑↓ HTTP + SSE

rust core (127.0.0.1:8008): Axum · Tokio · Tower · Rig agent loop

storage: VaultDb (SQLite + FTS5 + sqlite-vec)

embed: HTTP client (OpenAI-compat / Ollama)

watcher: notify + blake3 (incremental hash diff)

llm: Bring-your-own (chat completions)

Read-only pool, single writer

VaultDb is 4 read-only connections + 1 writer, all inside spawn_blocking. The chat hot path never blocks a Tokio worker on a SQLite lock.

SSE all the way down

Tokens, thinking steps, tool-call start/done, message complete — they're all distinct SSE event types. The composer disables while streaming; retry pulls the prior turn and resends.
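
To make "distinct SSE event types" concrete, the sketch below models the event kinds listed above as an enum and serializes each one into a Server-Sent Events frame. The wire names (`token`, `tool_start`, and so on) and the plain-text payloads are assumptions chosen for illustration; the real event names and payload format are not documented on this page.

```rust
/// Event kinds named on this page; the wire names and payloads are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum StreamEvent {
    Token(String),
    Thinking(String),
    ToolStart { tool: String },
    ToolDone { tool: String, millis: u64 },
    MessageComplete,
}

impl StreamEvent {
    fn name(&self) -> &'static str {
        match self {
            StreamEvent::Token(_) => "token",
            StreamEvent::Thinking(_) => "thinking",
            StreamEvent::ToolStart { .. } => "tool_start",
            StreamEvent::ToolDone { .. } => "tool_done",
            StreamEvent::MessageComplete => "message_complete",
        }
    }

    fn data(&self) -> String {
        match self {
            StreamEvent::Token(text) | StreamEvent::Thinking(text) => text.clone(),
            StreamEvent::ToolStart { tool } => tool.clone(),
            StreamEvent::ToolDone { tool, millis } => format!("{} {}ms", tool, millis),
            StreamEvent::MessageComplete => String::new(),
        }
    }

    /// One SSE frame: an `event:` line, one `data:` line per payload line, then a blank line.
    fn to_frame(&self) -> String {
        let mut frame = format!("event: {}\n", self.name());
        let data = self.data();
        // `split('\n')` yields one empty item for an empty payload,
        // so every frame carries at least one `data:` line.
        for line in data.split('\n') {
            frame.push_str("data: ");
            frame.push_str(line);
            frame.push('\n');
        }
        frame.push('\n');
        frame
    }
}
```

Because each kind has its own event name, a client can attach separate listeners for tokens and tool status rather than parsing one mixed stream.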

Hash-diff indexing

blake3 hashes every note body. Unchanged bodies skip embedding entirely. Files the parser can't handle land in a quarantine table so they don't loop in "changed" forever.
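
The skip-or-embed decision can be sketched in a few lines. Note the loud assumptions: `std`'s `DefaultHasher` stands in for blake3 so the example needs no external crates, the whole file text is hashed rather than just the body, `parses` is a toy check (frontmatter that opens must also close), and every name here is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a blake3 digest (assumption: keeps the sketch std-only).
fn content_hash(raw: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    hasher.finish()
}

/// Toy parser check: frontmatter that opens with `---` must also close.
fn parses(raw: &str) -> bool {
    match raw.strip_prefix("---\n") {
        Some(rest) => rest.contains("\n---"),
        None => true,
    }
}

#[derive(Debug, PartialEq)]
enum Action {
    Skip,
    Embed,
    Quarantine,
}

#[derive(Default)]
struct IndexState {
    indexed: HashMap<String, u64>,     // path -> hash of the last embedded content
    quarantined: HashMap<String, u64>, // path -> hash of content that failed to parse
}

impl IndexState {
    fn plan(&mut self, path: &str, raw: &str) -> Action {
        let hash = content_hash(raw);
        // Same bytes as last time, whether indexed or quarantined: nothing to do.
        if self.indexed.get(path) == Some(&hash) || self.quarantined.get(path) == Some(&hash) {
            return Action::Skip;
        }
        if parses(raw) {
            self.quarantined.remove(path);
            self.indexed.insert(path.to_string(), hash);
            Action::Embed
        } else {
            self.indexed.remove(path);
            self.quarantined.insert(path.to_string(), hash);
            Action::Quarantine
        }
    }
}
```

Recording the hash of the failing content is what stops an unparseable file from being retried on every watcher event, while still giving it another chance once its contents change.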

Cancellation-aware vault switching

Switching to a different vault aborts the in-flight indexer, filters buffered events, and reconnects the SSE stream against the new vault id.
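
A simplified, synchronous picture of that switch is sketched below: a shared flag tells the old indexer to stop, and buffered events are filtered by vault id. The real app runs on Tokio and also reconnects the SSE stream, which this sketch leaves out; the `u32` vault id and all type and function names are assumptions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// A buffered event tagged with the vault it was produced for.
#[derive(Debug, Clone, PartialEq)]
struct VaultEvent {
    vault_id: u32,
    payload: String,
}

struct Session {
    active_vault: u32,
    cancel: Arc<AtomicBool>, // shared with the in-flight indexer
    buffered: Vec<VaultEvent>,
}

impl Session {
    fn new(vault_id: u32) -> Self {
        Session {
            active_vault: vault_id,
            cancel: Arc::new(AtomicBool::new(false)),
            buffered: Vec::new(),
        }
    }

    /// Handle the indexer polls between notes.
    fn cancel_handle(&self) -> Arc<AtomicBool> {
        Arc::clone(&self.cancel)
    }

    /// Signal the old indexer, install a fresh flag, and drop stale events.
    fn switch_to(&mut self, vault_id: u32) {
        self.cancel.store(true, Ordering::SeqCst);
        self.cancel = Arc::new(AtomicBool::new(false));
        self.active_vault = vault_id;
        self.buffered.retain(|event| event.vault_id == vault_id);
    }
}

/// Indexer loop: checks the flag before each note and stops early once it is set.
/// Returns how many notes were processed.
fn index_notes(notes: &[&str], cancel: &AtomicBool) -> usize {
    let mut done = 0;
    for _note in notes {
        if cancel.load(Ordering::SeqCst) {
            break;
        }
        done += 1; // real work (parse, hash, embed) would go here
    }
    done
}
```

Giving each indexing run its own flag means a late check by the old run can never be confused with the state of the new one.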

Status · alpha

Shipped. Shipping. Next.

PKMA is under active development. APIs and on-disk schemas may change between commits. The index can be safely rebuilt at any time.

Shipped

Hybrid retrieval with RRF fusion
Three agent modes with per-thread overrides
Live obsidian-cli tool profile
Incremental indexing + quarantine
Wikilink resolution with match quality
Multi-vault with clean cancellation

Shipping

Note authoring — agent-assisted writes
Refactor pass — rename, merge, split
Per-vault custom system prompts
Daily-note operations (read + write)
Telemetry: per-tool latency histograms

Next

Plugin surface — user-extensible tools
Structured edits on Obsidian properties
Multi-modal: PDF + image notes
Encrypted index export / import
Windows + Linux signed installers

Your second brain. Yours to keep.

Free for personal use. No account, no telemetry, no subscription. Pick a build and point it at a folder of Markdown.

Download for Windows · macOS soon · Linux soon · Build from source