name: zotero
description: "Use this skill whenever the user wants to query, search, or explore their Zotero reference library, or when they need help with citations in their writing. This includes: finding papers in specific collections or group libraries, searching by author/year/topic, verifying whether cited papers actually support the claims they are cited for, finding additional citations relevant to a passage of text, reading PDF content for summaries or quotes, and any other task that involves the Zotero database or its attached PDFs. Trigger on phrases like: 'find papers on X in my Zotero', 'check my citations', 'are these references accurate?', 'find more citations for this paragraph', 'what does [Author Year] actually say?', 'search the [collection] library', 'is this paper in NatureMAP / ECEC / my Zotero?', 'summarise papers about X', 'what have I read recently?', 'show me my annotations on X', 'what did I highlight'. (Claude Code only: runs litmap against ~/LitLake/embeddings.db; not available in Cowork sandbox.)"

Zotero Skill

⚠️ Runtime: Claude Code only. This skill calls uv run litmap … against ~/LitLake/embeddings.db on the local machine. It will not work in the Cowork web sandbox. If you reached this skill from the Cowork web frontend, stop and switch to Claude Code.

Setup

Database: ~/Zotero/zotero.sqlite (mounted dynamically; use the path resolution snippet below)
PDFs: <zotero_dir>/storage/<itemKey>/<filename>.pdf
Reference file: <zotero_dir>/cowork-zotero-reference.md
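A minimal sketch of path resolution, assuming the data directory is one of a small list of candidate locations supplied by the caller. The helper names `resolve_zotero_dir` and `open_read_only` are hypothetical; `mode=ro` is SQLite's standard URI parameter for read-only access.

```python
import sqlite3
from pathlib import Path


def resolve_zotero_dir(candidates):
    """Return the first candidate directory that contains zotero.sqlite."""
    for candidate in candidates:
        directory = Path(candidate).expanduser()
        if (directory / "zotero.sqlite").is_file():
            return directory
    raise FileNotFoundError("zotero.sqlite not found in any candidate directory")


def open_read_only(zotero_dir):
    """Open zotero.sqlite read-only so no write can reach the library."""
    db_file = (Path(zotero_dir) / "zotero.sqlite").resolve()
    # as_uri() yields file:///...; mode=ro makes SQLite reject any write.
    return sqlite3.connect(db_file.as_uri() + "?mode=ro", uri=True)
```

For example, `open_read_only(resolve_zotero_dir(["~/Zotero"]))` would open the default location named above; any additional candidate paths depend on how the directory is mounted in your session.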

Read the reference file at the start of every session — it contains the schema, field IDs, and query patterns so you don't need to rediscover them. Trust the reference file for schema and IDs. Always run live queries for counts and collection listings — the reference file counts go stale.

Always query across all libraries unless the user specifies otherwise.

Libraries

libraryID	Name
1	Personal (Doug)
3	NatureMetricsZ
4	ECEC
5	BioDivAbove
6	KIZ Statistical Methods in Ecology
7	NatureMAP

Four tiers of search

Choose the right tier for the task. When in doubt, run Tier 1 first to narrow the scope, then run Tier 4 within that scope.

Tier 1 — Metadata (instant, whole library)

Query itemData/itemDataValues/creators for title, author, abstract, journal, date, DOI, tags. Use for: finding papers by author/year, listing a collection, keyword searches in titles/abstracts.

Tier 2 — Full-text index (fast, ~15,600 indexed items)

Query fulltextWords + fulltextItemWords for keyword presence across PDFs without opening them. Good for "find papers that discuss [term]" across a large collection. Words are stored lower-cased and individually (not as phrases). For phrase search, intersect multiple word queries or fall back to Tier 3.
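Intersecting word queries can be sketched as follows. This assumes `fulltextWords(wordID, word)` and `fulltextItemWords(wordID, itemID)` have the columns shown (check the reference file), and that `itemID` there refers to the attachment. The function name is hypothetical.

```python
import sqlite3


def attachments_with_all_words(con, words):
    """Return attachment itemIDs whose full-text index contains every word."""
    terms = sorted({w.lower() for w in words})  # the index stores lower-cased words
    if not terms:
        return []
    placeholders = ", ".join("?" for _ in terms)
    sql = f"""
        SELECT fiw.itemID
        FROM fulltextItemWords AS fiw
        JOIN fulltextWords AS fw ON fw.wordID = fiw.wordID
        WHERE fw.word IN ({placeholders})
        GROUP BY fiw.itemID
        HAVING COUNT(DISTINCT fw.word) = ?
        ORDER BY fiw.itemID
    """
    return [row[0] for row in con.execute(sql, (*terms, len(terms)))]
```

A hit only means all words occur somewhere in the PDF, not that they occur together, so confirm true phrases with Tier 3.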

Tier 3 — PDF reading (slower, for deep analysis)

Open PDFs directly with pdfplumber when you need to verify a specific claim, extract a quote, or check citation faithfulness.

Use re.finditer to locate specific passages rather than printing the entire text.
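A minimal sketch of Tier 3 reading with targeted passage search. The helper names and the 200-character context window are assumptions; `pdfplumber.open`, `pdf.pages`, and `page.extract_text()` are the library calls used.

```python
import re


def extract_pdf_text(pdf_path):
    """Concatenate the text of every page; pages without a text layer add ''."""
    import pdfplumber  # imported lazily so find_passages works without it

    with pdfplumber.open(pdf_path) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def find_passages(text, pattern, window=200):
    """Return each regex match with `window` characters of context either side."""
    passages = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        passages.append(text[start:end])
    return passages
```

Calling `find_passages(extract_pdf_text(path), r"occupancy model")` would return only the surrounding snippets, which keeps output short for long PDFs.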

Tier 4 — Semantic (model-backed)

Use when the query is conceptual or paraphrased — i.e. when keyword search would miss synonyms, acronyms, or rephrasings. Tier 4 calls litmap search or litmap cluster against the local embeddings database. The first call after a fresh model download or a long idle takes 10–30 seconds while the embedding model warms up; subsequent calls within the same session are sub-second.

Pattern 4a — Natural-language query → ranked papers

To run semantic search, you can query LanceDB directly using OmniMind's vector store. Write a quick TypeScript script, or use LanceDB's Python client (lancedb), to query ~/.omnimind/lancedb.

Note: OmniMind uses LM Studio for embeddings, so you must first fetch the embedding vector for the user's query from http://localhost:1234/v1/embeddings and then pass that vector to LanceDB.

Summarise the top results. The path field contains the Zotero key. Pass it to Tier 1/2/3 to open the PDF or fetch full metadata.

Zotero 9 — New tables and fields

Zotero 9 added several tables and columns. Use them where relevant.

itemAnnotations — PDF/EPUB annotations (highlights, notes, underlines, images):

type: integer code for the annotation kind (1 = highlight, 2 = note, 3 = image, 4 = ink, 5 = underline)
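As an illustration, annotations for one paper could be fetched as below. This is a sketch that assumes `itemAnnotations.parentItemID` points at the attachment and that the `text`, `comment`, `pageLabel`, and `sortIndex` columns exist; confirm against the reference file. The function name is hypothetical.

```python
import sqlite3


def annotations_for_item(con, item_key):
    """Return (type, text, comment, pageLabel) rows for a paper, in reading order."""
    sql = """
        SELECT an.type, an.text, an.comment, an.pageLabel
        FROM itemAnnotations AS an
        JOIN itemAttachments AS att ON att.itemID = an.parentItemID
        JOIN items AS parent ON parent.itemID = att.parentItemID
        WHERE parent.key = ?
        ORDER BY an.sortIndex
    """
    return con.execute(sql, (item_key,)).fetchall()
```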

itemAttachments.lastRead — Unix timestamp of when the attachment was last opened in the Zotero reader (new in Zotero 9; NULL if never opened in Zotero 9+):

Use for "what have I read recently?" queries. lastRead is seconds since Unix epoch; convert with datetime(ia.lastRead, 'unixepoch').
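A sketch of a "recently read" query using the conversion above. It assumes each attachment's `parentItemID` is the paper it belongs to; the function name and default limit are hypothetical.

```python
import sqlite3


def recently_read(con, limit=10):
    """Return (paper key, last-read UTC timestamp) pairs, most recent first."""
    sql = """
        SELECT parent.key, datetime(ia.lastRead, 'unixepoch') AS last_read
        FROM itemAttachments AS ia
        JOIN items AS parent ON parent.itemID = ia.parentItemID
        WHERE ia.lastRead IS NOT NULL
        ORDER BY ia.lastRead DESC
        LIMIT ?
    """
    return con.execute(sql, (limit,)).fetchall()
```

SQLite's `datetime(..., 'unixepoch')` returns UTC, so add the `'localtime'` modifier if the user expects local times.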

groupItems — tracks who added/last modified items in group libraries ("Added By" / "Modified By" feature in Zotero 9):

retractedItems — flags papers that have been retracted (new in Zotero 9):

When performing citation faithfulness checks, always query this table first for all cited papers. Flag any retracted source prominently at the top of the report — a retracted paper should never be cited without explicit acknowledgement of its status.
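The retraction pre-check might be sketched like this, assuming `retractedItems` is keyed by `itemID` and the cited papers have already been resolved to Zotero item keys. The function name is hypothetical.

```python
import sqlite3


def retracted_keys(con, item_keys):
    """Return the subset of item keys that appear in retractedItems."""
    keys = list(item_keys)
    if not keys:
        return []
    placeholders = ", ".join("?" for _ in keys)
    sql = f"""
        SELECT i.key
        FROM retractedItems AS r
        JOIN items AS i ON i.itemID = r.itemID
        WHERE i.key IN ({placeholders})
        ORDER BY i.key
    """
    return [row[0] for row in con.execute(sql, keys)]
```

Any key returned here belongs at the top of the faithfulness report before the claim-by-claim checks.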

Finding a PDF path — including supplementary files

When retrieving attachments for a paper, always fetch all file attachments, not just the main PDF. Supplementary information files (SI, appendices, supporting data) are stored as sibling attachments under the same parentItemID.

Classify each result by filename:

PDF path: <zotero_dir>/storage/<attachmentKey>/<filename> where <filename> is the part of ia.path after storage:.

Always read supplementary files when they exist. Information critical to a claim is often in the SI — extended methods, species lists, robustness checks, data tables. Read the main PDF first; then check whether any SI attachments exist and read them too, prioritising sections most relevant to the claim being checked.

Non-PDF SI (e.g. .xlsx, .csv) can be read with pandas or standard file tools if the content is relevant to the task.
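The path rule and filename classification above can be sketched as follows. The `SI_HINTS` pattern is a hypothetical heuristic, not a rule from Zotero, and both function names are illustrative.

```python
import re
from pathlib import Path

# Hypothetical filename hints for supplementary files; tune these to your library.
SI_HINTS = re.compile(r"supp|appendix|(?<![a-z])si(?![a-z])", re.IGNORECASE)


def attachment_path(zotero_dir, attachment_key, ia_path):
    """Resolve a stored attachment to a file path, or None for linked files."""
    prefix = "storage:"
    if not ia_path or not ia_path.startswith(prefix):
        return None
    return Path(zotero_dir) / "storage" / attachment_key / ia_path[len(prefix):]


def classify_attachment(filename):
    """Label an attachment 'supplementary', 'main', or 'other' by its filename."""
    if SI_HINTS.search(filename):
        return "supplementary"
    if filename.lower().endswith(".pdf"):
        return "main"
    return "other"
```

Filenames are an imperfect signal: an SI spreadsheet with no hint in its name is labelled "other", so still inspect every sibling attachment before concluding that no SI exists.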

Core query patterns

Find item by author + year (all libraries)
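A sketch of this lookup, using the field IDs listed under "Key field IDs" (title = 1, date = 6). It assumes Zotero stores dates with the year first (for example "2020-03-00 March 2020"), so a prefix match on the year works, and that trashed items are listed in `deletedItems`. The function name is hypothetical.

```python
import sqlite3


def find_by_author_year(con, last_name, year):
    """Return (key, libraryID, title) for items by an author in a given year."""
    sql = """
        SELECT DISTINCT i.key, i.libraryID, tv.value AS title
        FROM items AS i
        JOIN itemCreators AS ic ON ic.itemID = i.itemID
        JOIN creators AS c ON c.creatorID = ic.creatorID
        JOIN itemData AS dd ON dd.itemID = i.itemID AND dd.fieldID = 6      -- date
        JOIN itemDataValues AS dv ON dv.valueID = dd.valueID
        LEFT JOIN itemData AS td ON td.itemID = i.itemID AND td.fieldID = 1 -- title
        LEFT JOIN itemDataValues AS tv ON tv.valueID = td.valueID
        WHERE c.lastName LIKE ?
          AND dv.value LIKE ?
          AND i.itemID NOT IN (SELECT itemID FROM deletedItems)
        ORDER BY i.libraryID, i.key
    """
    return con.execute(sql, (last_name, f"{year}%")).fetchall()
```

No `libraryID` filter is applied, so results span all libraries by default.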

List items in a collection (with metadata)

Find collection by name

Disambiguation: The same collection name can appear in multiple libraries. Always check libraryID and, if ambiguous, confirm with the user which one they mean — or query both and note the distinction in your output.
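A sketch of the collection lookup with the disambiguation check, assuming `collections` has `collectionID`, `collectionName`, and `libraryID` columns. Both function names are hypothetical.

```python
import sqlite3


def find_collections(con, name):
    """Return (collectionID, collectionName, libraryID) for partial name matches."""
    sql = """
        SELECT collectionID, collectionName, libraryID
        FROM collections
        WHERE collectionName LIKE ?
        ORDER BY libraryID, collectionName
    """
    return con.execute(sql, (f"%{name}%",)).fetchall()


def is_ambiguous(matches):
    """True when the matches span more than one library."""
    return len({library_id for _, _, library_id in matches}) > 1
```

When `is_ambiguous` returns True, either ask the user which library they mean or report results per `libraryID`.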

Get abstract for an item
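A sketch of the abstract lookup by item key, using fieldID 2 (abstractNote) from the table under "Key field IDs". The function name is hypothetical. Keys are assumed unique within a library, so add a `libraryID` filter if the same key could occur in more than one.

```python
import sqlite3


def get_abstract(con, item_key):
    """Return the abstract text for an item key, or None if it has no abstract."""
    sql = """
        SELECT v.value
        FROM items AS i
        JOIN itemData AS d ON d.itemID = i.itemID AND d.fieldID = 2  -- abstractNote
        JOIN itemDataValues AS v ON v.valueID = d.valueID
        WHERE i.key = ?
    """
    row = con.execute(sql, (item_key,)).fetchone()
    return row[0] if row else None
```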

Key field IDs

fieldName	fieldID
title	1
abstractNote	2
date	6
url	13
publicationTitle	37
DOI	58

Task: Find relevant citations for a passage

When the user provides a paragraph and asks for relevant citations:

Task: Verify citation faithfulness

When the user provides a paragraph with citations and asks whether the citations faithfully represent the source:

Critical check for statistics: When a claim involves a specific number (%, count, ratio), always verify that exact figure appears in the cited paper. A common failure is citing a review paper for a statistic that originated in a paper the review cites — the review paper's text will say something like "X et al. found that < 5%..." rather than presenting it as its own finding. In that case, the original source should be cited instead, or in addition.

Task: Check whether a paper is in the library

Search by last name across all libraries, then filter by year:

Scoping to a specific library or collection

By library name: map to libraryID using the table above, add WHERE i.libraryID = N
By collection name: find collectionID first, then join via collectionItems
Excluding bulk collections: the personal library has large unsorted collections (import, need_metadata) — exclude these when doing relevance searches unless the user specifically wants them included

Notes

The SQLite database is read-only — never attempt writes
pdfplumber is available for PDF reading
For large PDFs, use re.finditer for keyword search rather than printing all text
Some collections share names across libraries — always check libraryID
Reference file counts are approximate; always query live for actual counts
itemAnnotations is the canonical source for annotation data in Zotero 9; prefer it over parsing annotation text from PDFs

Cross-tier integration rules

Tier 4 results carry zotero_key. Pass it straight to Tier 1 SQL (WHERE i.key = ?) for full metadata, or to Tier 3 for PDF reading.
When the user has specified a collection scope ("in NatureMAP, find papers about X"), pass --collection "<name>" to litmap.

Tier 4 errors and edge cases

Condition	Skill response
`litmap` command not found	"`litmap` is not installed. Run `uv pip install -e .` from `~/src/Cowork/litmap`."
First-run model download	"First-run model download (~570 MB), this takes ~1 minute."
`~/LitLake/embeddings.db` missing	"Embeddings database not found. Run `litmap sync` once to embed the library."
Auto-sync took > 30s on incremental	Note: "Sync took longer than usual — if you've recently added many papers, this is expected."
Top result similarity < 0.5	"No strong semantic matches in your library. Consider rephrasing or broadening the query."
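The last row of the table can be sketched as a small triage step. The result shape (a list of dicts with `zotero_key` and `similarity`) is an assumption about how litmap output has been parsed, not a documented format, and the function name is hypothetical.

```python
STRONG_MATCH_THRESHOLD = 0.5  # threshold taken from the table above

NO_MATCH_MESSAGE = (
    "No strong semantic matches in your library. "
    "Consider rephrasing or broadening the query."
)


def triage_results(results, threshold=STRONG_MATCH_THRESHOLD):
    """Rank results by similarity and return (ranked, warning-or-None)."""
    ranked = sorted(results, key=lambda r: r["similarity"], reverse=True)
    if not ranked or ranked[0]["similarity"] < threshold:
        return ranked, NO_MATCH_MESSAGE
    return ranked, None
```

Weak results are still returned alongside the warning, so the user can judge whether any of them are worth opening.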