⚠️ Runtime: Claude Code only. This skill calls
uv run litmap … against ~/LitLake/embeddings.db on the local machine. It will not work in the Cowork web sandbox. If you reached this skill from the Cowork web frontend, stop and switch to Claude Code.
Database: ~/Zotero/zotero.sqlite (mounted dynamically — use the path resolution snippet below)
PDFs: <zotero_dir>/storage/<itemKey>/<filename>.pdf
Reference file: <zotero_dir>/cowork-zotero-reference.md
Read the reference file at the start of every session — it contains the schema, field IDs, and query patterns so you don't need to rediscover them. Trust the reference file for schema and IDs. Always run live queries for counts and collection listings — the reference file counts go stale.
Always query across all libraries unless the user specifies otherwise.
| libraryID | Name |
|---|---|
| 1 | Personal (Doug) |
| 3 | NatureMetricsZ |
| 4 | ECEC |
| 5 | BioDivAbove |
| 6 | KIZ Statistical Methods in Ecology |
| 7 | NatureMAP |
Choose the right tier for the task:
When in doubt, Tier 1 first to scope, then Tier 4 within scope.
Tier 1 — Metadata (instant, whole library)
Query itemData/itemDataValues/creators for title, author, abstract, journal,
date, DOI, tags. Use for: finding papers by author/year, listing a collection,
keyword searches in titles/abstracts.
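A minimal sketch of a Tier 1 keyword search over titles and abstracts. It assumes the itemData / itemDataValues layout used by the other queries in this skill and the field IDs from the field table (title = 1, abstractNote = 2). The function name is hypothetical.

```python
import sqlite3

TITLE_FIELD_ID = 1      # fieldID for title, per the field table in this skill
ABSTRACT_FIELD_ID = 2   # fieldID for abstractNote


def search_title_abstract(conn: sqlite3.Connection, keyword: str) -> list[tuple]:
    """Return (itemID, libraryID, title) rows whose title or abstract contains keyword."""
    sql = """
        SELECT DISTINCT i.itemID, i.libraryID, tv.value AS title
        FROM items i
        JOIN itemData td ON i.itemID = td.itemID AND td.fieldID = ?
        JOIN itemDataValues tv ON td.valueID = tv.valueID
        LEFT JOIN itemData ad ON i.itemID = ad.itemID AND ad.fieldID = ?
        LEFT JOIN itemDataValues av ON ad.valueID = av.valueID
        WHERE tv.value LIKE ? OR av.value LIKE ?
        ORDER BY i.libraryID, i.itemID
    """
    pattern = f"%{keyword}%"  # SQLite LIKE is case-insensitive for ASCII text
    params = (TITLE_FIELD_ID, ABSTRACT_FIELD_ID, pattern, pattern)
    return conn.execute(sql, params).fetchall()
```

The query spans all libraries by default, in line with the rule above; libraryID is returned so results can be labelled by library.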
Tier 2 — Full-text index (fast, ~15,600 indexed items)
Query fulltextWords + fulltextItemWords for keyword presence across PDFs without
opening them. Good for "find papers that discuss [term]" across a large collection.
Words are lower-cased and individual (not phrases). For phrase search, intersect
multiple word queries or fall back to Tier 3.
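One way to intersect several word queries is sketched below. It assumes only the two tables named above (fulltextWords and fulltextItemWords) with the wordID / itemID / word columns used in this skill's Tier 2 query; the function name is hypothetical.

```python
import sqlite3


def items_containing_all(conn: sqlite3.Connection, words):
    """Return itemIDs whose full-text index contains every word in `words`."""
    terms = sorted({w.lower() for w in words})  # index words are stored lower-cased
    if not terms:
        return []
    placeholders = ", ".join("?" for _ in terms)
    sql = f"""
        SELECT fiw.itemID
        FROM fulltextItemWords fiw
        JOIN fulltextWords fw ON fw.wordID = fiw.wordID
        WHERE fw.word IN ({placeholders})
        GROUP BY fiw.itemID
        HAVING COUNT(DISTINCT fw.word) = ?
        ORDER BY fiw.itemID
    """
    return [row[0] for row in conn.execute(sql, (*terms, len(terms)))]
```

This only establishes that the words co-occur somewhere in the same document, not that they are adjacent, so a true phrase match still needs a Tier 3 check on the shortlisted PDFs.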
Tier 3 — PDF reading (slower, for deep analysis)
Open PDFs directly with pdfplumber when you need to verify a specific claim,
extract a quote, or check citation faithfulness.
Use re.finditer to locate specific passages rather than printing the entire text.
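A small sketch of the re.finditer approach: it returns a short window of text around each match instead of the whole document. The function name and the default window size are illustrative choices, not part of the skill.

```python
import re


def find_passages(text, pattern, context=150):
    """Return a snippet of surrounding text for each regex match in `text`."""
    snippets = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        # Collapse whitespace, since PDF extraction often breaks lines mid-sentence.
        snippets.append(" ".join(text[start:end].split()))
    return snippets
```

For example, find_passages(text, r"detection probabilit(y|ies)") would yield one snippet per occurrence, each short enough to quote in a report.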
Tier 4 — Semantic search
Use when the query is conceptual or paraphrased, i.e. when keyword search would miss synonyms, acronyms, or rephrasings. Tier 4 calls litmap search or litmap cluster against the local embeddings database. The first call after a fresh model download or a long idle period takes 10–30 seconds while the embedding model warms up; subsequent calls within the same session are sub-second.
To run semantic search, you can query LanceDB directly using OmniMind's vector store. Write a quick TypeScript script, or use LanceDB's Python client (lancedb), to query ~/.omnimind/lancedb:
Note: OmniMind uses LM Studio for embeddings, so you must first fetch the embedding vector for the user's query from http://localhost:1234/v1/embeddings and then pass that vector to LanceDB.
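The embedding step could look roughly like the sketch below. It assumes the LM Studio endpoint accepts the OpenAI-style embeddings request body ({"model", "input"}) and returns {"data": [{"embedding": [...]}]}. The model name passed in is whatever embedding model is loaded locally and is not specified by this skill. All function names are hypothetical.

```python
import json
import urllib.request

EMBEDDINGS_URL = "http://localhost:1234/v1/embeddings"  # endpoint given in the note above


def build_embedding_request(query, model):
    """Build a POST request in the (assumed) OpenAI-compatible embeddings format."""
    body = json.dumps({"model": model, "input": query}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_embedding(response_json):
    """Pull the first embedding vector out of an OpenAI-style response body."""
    return response_json["data"][0]["embedding"]


def fetch_query_vector(query, model):
    """Call the local server; needs LM Studio running with an embedding model loaded."""
    request = build_embedding_request(query, model)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_embedding(json.load(resp))
```

The returned list is what the LanceDB search call expects as its query vector. The query must be embedded with the same model that produced the stored vectors, otherwise the similarities are meaningless.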
Summarise the top results. The path field contains the Zotero key. Pass it to Tier 1/2/3 to open the PDF or fetch full metadata.
Zotero 9 added several tables and columns. Use them where relevant.
itemAnnotations — PDF/EPUB annotations (highlights, notes, underlines, images):
type: highlight, note, image, underline, ink
itemAttachments.lastRead — Unix timestamp of when the attachment was last opened in the Zotero reader (new in Zotero 9; NULL if never opened in Zotero 9+):
Use for "what have I read recently?" queries. lastRead is seconds since Unix epoch;
convert with datetime(ia.lastRead, 'unixepoch').
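When the timestamps are handled in Python rather than SQL, the same conversion can be sketched as follows. The helper names are hypothetical; the only assumption taken from the text is that lastRead is seconds since the Unix epoch, or NULL.

```python
from datetime import datetime, timedelta, timezone


def last_read_to_datetime(last_read):
    """Convert itemAttachments.lastRead (epoch seconds, or None) to a UTC datetime."""
    if last_read is None:
        return None  # never opened in the Zotero 9+ reader
    return datetime.fromtimestamp(last_read, tz=timezone.utc)


def cutoff_timestamp(days, now=None):
    """Unix timestamp `days` days before `now` (defaults to the current time)."""
    now = now or datetime.now(timezone.utc)
    return int((now - timedelta(days=days)).timestamp())
```

A "read in the last week" query can then bind cutoff_timestamp(7) to a condition such as ia.lastRead >= ?.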
groupItems — tracks who added/last modified items in group libraries ("Added By" /
"Modified By" feature in Zotero 9):
retractedItems — flags papers that have been retracted (new in Zotero 9):
When performing citation faithfulness checks, always query this table first for all cited papers. Flag any retracted source prominently at the top of the report — a retracted paper should never be cited without explicit acknowledgement of its status.
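The retraction pre-check can be sketched as a single lookup over the cited item IDs. This assumes retractedItems is keyed by itemID, as in this skill's retractedItems query; the function name is hypothetical.

```python
import sqlite3


def retracted_among(conn: sqlite3.Connection, item_ids):
    """Return the subset of `item_ids` that appear in retractedItems, sorted."""
    ids = list(dict.fromkeys(item_ids))  # de-duplicate while keeping order
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT itemID FROM retractedItems "
        f"WHERE itemID IN ({placeholders}) ORDER BY itemID"
    )
    return [row[0] for row in conn.execute(sql, ids)]
```

Any ID returned here belongs at the top of the report before the per-claim checks begin.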
When retrieving attachments for a paper, always fetch all file attachments, not
just the main PDF. Supplementary information files (SI, appendices, supporting data)
are stored as sibling attachments under the same parentItemID.
Classify each result by filename:
PDF path: <zotero_dir>/storage/<attachmentKey>/<filename>
where <filename> is the part of ia.path after storage:.
Always read supplementary files when they exist. Information critical to a claim is often in the SI — extended methods, species lists, robustness checks, data tables. Read the main PDF first; then check whether any SI attachments exist and read them too, prioritising sections most relevant to the claim being checked.
Non-PDF SI (e.g. .xlsx, .csv) can be read with pandas or standard file tools
if the content is relevant to the task.
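The filename classification and path construction might be sketched as below. The markers are the ones this skill lists for supplementary files (supplement, supporting, appendix, SI, S1, S2, ESM, Online Resource, Data S, Table S, Figure S). The exact matching rules are an assumption made here to avoid false positives, for example requiring SI to stand alone so that it does not match inside an ordinary word. Paths that do not start with storage: are treated as not stored under the Zotero directory, which is also an assumption.

```python
import os
import re

# Longer markers, matched case-insensitively anywhere in the filename.
_SI_WORDS = re.compile(r"supplement|supporting|appendix|online resource", re.IGNORECASE)
# Short markers must stand alone: "paper_SI.pdf" matches, "SIMPER.pdf" does not.
_SI_TOKENS = re.compile(r"(?<![A-Za-z0-9])(SI|S\d+|ESM)(?![A-Za-z0-9])")
# "Data S", "Table S3", "Figure_S1": a separator is required before the S.
_SI_NUMBERED = re.compile(r"(data|table|figure)[ _-]s\d*(?![A-Za-z])", re.IGNORECASE)


def is_supplementary(filename):
    """Heuristically decide whether an attachment filename looks like SI."""
    return bool(
        _SI_WORDS.search(filename)
        or _SI_TOKENS.search(filename)
        or _SI_NUMBERED.search(filename)
    )


def attachment_file(zotero_dir, attachment_key, ia_path):
    """Build the on-disk path from itemAttachments.path ('storage:<filename>')."""
    prefix = "storage:"
    if not ia_path.startswith(prefix):
        return None  # assumed to be a linked file or URL, not a stored attachment
    return os.path.join(zotero_dir, "storage", attachment_key, ia_path[len(prefix):])
```

Because the classification is only a filename heuristic, an attachment it labels as the main PDF may still be supplementary; when a paper has several PDFs it is safer to skim the first page of each.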
Disambiguation: The same collection name can appear in multiple libraries. Always
check libraryID and, if ambiguous, confirm with the user which one they mean —
or query both and note the distinction in your output.
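A sketch of the disambiguation check, assuming the collections table has the collectionID, collectionName and libraryID columns used elsewhere in this skill. Both helper names are hypothetical.

```python
import sqlite3
from collections import defaultdict


def collections_by_library(conn: sqlite3.Connection, name):
    """Map libraryID -> [collectionID, ...] for collections named `name` (case-insensitive)."""
    rows = conn.execute(
        "SELECT libraryID, collectionID FROM collections "
        "WHERE collectionName = ? COLLATE NOCASE "
        "ORDER BY libraryID, collectionID",
        (name,),
    )
    found = defaultdict(list)
    for library_id, collection_id in rows:
        found[library_id].append(collection_id)
    return dict(found)


def is_ambiguous(matches):
    """True when the name resolves to more than one collection."""
    return sum(len(ids) for ids in matches.values()) > 1
```

When is_ambiguous returns True, either ask the user which library they mean or run the query for every match and label each result set with its libraryID.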
| fieldName | fieldID |
|---|---|
| title | 1 |
| abstractNote | 2 |
| date | 6 |
| url | 13 |
| publicationTitle | 37 |
| DOI | 58 |
When the user provides a paragraph and asks for relevant citations:
When the user provides a paragraph with citations and asks whether the citations faithfully represent the source:
Critical check for statistics: When a claim involves a specific number (%, count, ratio), always verify that exact figure appears in the cited paper. A common failure is citing a review paper for a statistic that originated in a paper the review cites — the review paper's text will say something like "X et al. found that < 5%..." rather than presenting it as its own finding. In that case, the original source should be cited instead, or in addition.
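The exact-figure check can be sketched as a regex search that tolerates spacing differences ("5%" vs "5 %") but does not match the figure inside a longer number such as 15% or 0.5%. The "et al." test for second-hand statistics is a rough heuristic added here as an assumption, and the function name is hypothetical.

```python
import re


def figure_contexts(text, value, unit="%", context=100):
    """Return (snippet, secondhand) pairs for each place `value``unit` appears.

    `secondhand` is True when the surrounding text mentions "et al.", a hint
    that the figure may be reported from another paper rather than found here.
    """
    # The lookbehind stops "5" from matching inside "15" or "0.5".
    pattern = rf"(?<![\d.,]){re.escape(value)}\s*{re.escape(unit)}"
    hits = []
    for m in re.finditer(pattern, text):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        snippet = " ".join(text[start:end].split())
        secondhand = re.search(r"\bet al\.", snippet) is not None
        hits.append((snippet, secondhand))
    return hits
```

An empty result means the figure was not found in that form; it may still appear spelled out ("five percent") or in a table that did not extract cleanly, so a miss warrants a manual look rather than an automatic "unsupported" verdict.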
Search by last name across all libraries, then filter by year:
- WHERE i.libraryID = N restricts a query to a single library.
- collectionItems links items to collections.
- … (import, need_metadata): exclude these when doing relevance searches unless the user specifically wants them included.
- pdfplumber is available for PDF reading.
- Use re.finditer for keyword search rather than printing all text.
- libraryID …
- itemAnnotations is the canonical source for annotation data in Zotero 9; prefer it over parsing annotation text from PDFs.
- zotero_key: pass it straight to Tier 1 SQL (WHERE i.key = ?) for full metadata, or to Tier 3 for PDF reading.
- Pass --collection "<name>" to litmap.

| Condition | Skill response |
|---|---|
| litmap command not found | "litmap is not installed. Run uv pip install -e . from ~/src/Cowork/litmap." |
| First-run model download | "First-run model download (~570 MB), this takes ~1 minute." |
| ~/LitLake/embeddings.db missing | "Embeddings database not found. Run litmap sync once to embed the library." |
| Auto-sync took > 30s on incremental | Note: "Sync took longer than usual — if you've recently added many papers, this is expected." |
| Top result similarity < 0.5 | "No strong semantic matches in your library. Consider rephrasing or broadening the query." |
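Three of the conditions in the table above can be expressed as a simple ordered check. This is a sketch only: how the caller detects each condition (for example shutil.which("litmap") or os.path.exists on the expanded database path) is an assumption and may differ when litmap is invoked through uv run. The function name is hypothetical.

```python
def litmap_message(litmap_found, db_exists, top_similarity=None):
    """Return the message for the first failing condition, or None if all is well."""
    if not litmap_found:
        return "litmap is not installed. Run uv pip install -e . from ~/src/Cowork/litmap."
    if not db_exists:
        return "Embeddings database not found. Run litmap sync once to embed the library."
    if top_similarity is not None and top_similarity < 0.5:
        return (
            "No strong semantic matches in your library. "
            "Consider rephrasing or broadening the query."
        )
    return None
```

The installation and database checks come first because a similarity score only exists once a search has actually run.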
itemAnnotations fields:
- type: highlight, note, image, underline, ink
- text: the highlighted/underlined text verbatim
- comment: the annotation's note/comment body
- pageLabel: the page label shown in the reader (may differ from physical page number)

Filename markers for supplementary files: supplement, supporting, appendix, SI, S1, S2, ESM, Online Resource, Data S, Table S, Figure S

Citation faithfulness steps:
1. Check for retractions first: query retractedItems for all cited papers. Flag any retracted paper prominently before proceeding.
2. Find each cited paper in the database (Tier 1, by author + year).
3. Read the abstract first and flag obvious mismatches immediately.
4. Find and open the PDF for each paper (Tier 3).
5. For each specific claim attributed to a paper, search the PDF text with re.finditer for relevant passages.
6. Assess each claim.
7. Present a verdict table: one row per (citation × claim) combination, with the supporting quote from the paper where possible.
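The verdict table described above could be assembled as in this sketch. The column names and the verdict wording are placeholders, since the skill text does not fix a vocabulary for verdicts. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClaimCheck:
    citation: str    # e.g. "Smith et al. 2020"
    claim: str       # the specific claim attributed to that paper
    verdict: str     # free text; the labels used here are illustrative only
    quote: str = ""  # supporting passage from the paper, if one was found


def verdict_table(checks):
    """Render one Markdown table row per (citation, claim) combination."""
    lines = [
        "| Citation | Claim | Verdict | Supporting quote |",
        "|---|---|---|---|",
    ]
    for c in checks:
        cells = [c.citation, c.claim, c.verdict, c.quote or "(none found)"]
        # Escape pipes so quoted text cannot break the table layout.
        lines.append("| " + " | ".join(cell.replace("|", "\\|") for cell in cells) + " |")
    return "\n".join(lines)
```

A paper cited for three claims produces three rows, which keeps a partly supported citation from being summarised as simply right or wrong.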
zotero_key against libraryID via a single Tier 1 SQL query.
import glob
import os
import sqlite3

# Resolve Zotero mount path dynamically (session ID changes each run)
candidates = glob.glob('/sessions/*/mnt/Zotero/zotero.sqlite')
if candidates:
    db_path = candidates[0]
else:
    # Fall back to the default local path when no session mount is present
    db_path = os.path.expanduser('~/Zotero/zotero.sqlite')
zotero_dir = os.path.dirname(db_path)
conn = sqlite3.connect(db_path)
Exact author/year/title or specific Zotero collection? → Tier 1
A specific keyword/phrase across PDF text? → Tier 2
Deep claim verification needing PDF read? → Tier 3
Conceptual / paraphrased / "find similar" / "organise"? → Tier 4
-- Tier 2: items whose full-text index contains a given word (words are stored lower-cased)
SELECT i.itemID, i.libraryID, tv.value AS title
FROM fulltextWords fw
JOIN fulltextItemWords fiw ON fw.wordID = fiw.wordID
JOIN items i ON fiw.itemID = i.itemID
JOIN itemData td ON i.itemID = td.itemID AND td.fieldID = 1
JOIN itemDataValues tv ON td.valueID = tv.valueID
WHERE fw.word = 'keyword'
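import pdfplumber

# Tier 3: extract the text of every page; extract_text() can return None for image-only pages.
# Pages are joined with a newline so words at page boundaries do not run together.
with pdfplumber.open(pdf_path) as pdf:
    text = '\n'.join(page.extract_text() or '' for page in pdf.pages)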
import lancedb
db = lancedb.connect("~/.omnimind/lancedb")
table = db.open_table("chunks")
# You'll need to generate the query embedding via LM Studio's API first
results = table.search(query_vector).where("source = 'zotero'").limit(10).to_list()
-- itemAnnotations: every annotation on one attachment, ordered by sortIndex
SELECT ia.itemID, ia.type, ia.authorName, ia.text, ia.comment,
ia.color, ia.pageLabel, ia.sortIndex, ia.position, ia.isExternal
FROM itemAnnotations ia
WHERE ia.parentItemID = ? -- parentItemID is the attachment's itemID
ORDER BY ia.sortIndex
-- itemAttachments.lastRead: the 20 most recently opened attachments
SELECT ia.lastRead, i.key, idv.value AS title, ia.path
FROM itemAttachments ia
JOIN items i ON ia.itemID = i.itemID
JOIN items parent ON ia.parentItemID = parent.itemID
JOIN itemData id_t ON parent.itemID = id_t.itemID AND id_t.fieldID = 1
JOIN itemDataValues idv ON id_t.valueID = idv.valueID
WHERE ia.lastRead IS NOT NULL
ORDER BY ia.lastRead DESC
LIMIT 20
-- groupItems: who added and who last modified an item in a group library
SELECT gi.createdByUserID, gi.lastModifiedByUserID,
u1.username AS addedBy, u2.username AS modifiedBy
FROM groupItems gi
JOIN users u1 ON gi.createdByUserID = u1.userID
JOIN users u2 ON gi.lastModifiedByUserID = u2.userID
WHERE gi.itemID = ?
-- retractedItems: every item flagged as retracted
SELECT ri.itemID, ri.data, ri.flag
FROM retractedItems ri
-- All file attachments (main PDF and supplementary files) under one parent item
SELECT ia.path, ia.contentType, i.key AS attachmentKey
FROM itemAttachments ia
JOIN items i ON ia.itemID = i.itemID
WHERE ia.parentItemID = ?
AND ia.path IS NOT NULL
AND ia.path != ''
ORDER BY ia.orderIndex
-- Find a paper by first-author last name and year, across all libraries
SELECT DISTINCT i.itemID, i.key, i.libraryID,
tv.value AS title, dv.value AS date, jv.value AS journal,
GROUP_CONCAT(c.lastName || ', ' || c.firstName, '; ') AS authors
FROM items i
JOIN itemCreators ic ON i.itemID = ic.itemID AND ic.orderIndex = 0
JOIN creators c ON ic.creatorID = c.creatorID
JOIN itemData td ON i.itemID = td.itemID AND td.fieldID = 1
JOIN itemDataValues tv ON td.valueID = tv.valueID
LEFT JOIN itemData dd ON i.itemID = dd.itemID AND dd.fieldID = 6
LEFT JOIN itemDataValues dv ON dd.valueID = dv.valueID
LEFT JOIN itemData jd ON i.itemID = jd.itemID AND jd.fieldID = 37
LEFT JOIN itemDataValues jv ON jd.valueID = jv.valueID
WHERE c.lastName LIKE '%LastName%' AND dv.value LIKE '%YEAR%'
GROUP BY i.itemID
-- List the items in one collection, newest first
SELECT DISTINCT i.itemID, i.key,
tv.value AS title, dv.value AS date, jv.value AS journal,
av.value AS abstract,
GROUP_CONCAT(c.lastName || ', ' || c.firstName, '; ') AS authors
FROM collectionItems ci
JOIN items i ON ci.itemID = i.itemID
JOIN itemData td ON i.itemID = td.itemID AND td.fieldID = 1
JOIN itemDataValues tv ON td.valueID = tv.valueID
LEFT JOIN itemData dd ON i.itemID = dd.itemID AND dd.fieldID = 6
LEFT JOIN itemDataValues dv ON dd.valueID = dv.valueID
LEFT JOIN itemData jd ON i.itemID = jd.itemID AND jd.fieldID = 37
LEFT JOIN itemDataValues jv ON jd.valueID = jv.valueID
LEFT JOIN itemData ad ON i.itemID = ad.itemID AND ad.fieldID = 2
LEFT JOIN itemDataValues av ON ad.valueID = av.valueID
LEFT JOIN itemCreators ic ON i.itemID = ic.itemID AND ic.orderIndex = 0
LEFT JOIN creators c ON ic.creatorID = c.creatorID
WHERE ci.collectionID = ?
GROUP BY i.itemID ORDER BY dv.value DESC
-- Find collections by name across all libraries (check libraryID to disambiguate)
SELECT collectionID, collectionName, libraryID, parentCollectionID
FROM collections
WHERE collectionName LIKE '%search_term%'
ORDER BY libraryID, collectionName
-- Fetch the abstract (fieldID 2) for one item
SELECT v.value FROM itemData d
JOIN itemDataValues v ON d.valueID = v.valueID
WHERE d.itemID = ? AND d.fieldID = 2
-- Search by first-author last name across all libraries; filter by year afterwards
SELECT i.itemID, i.libraryID, tv.value AS title, dv.value AS date
FROM items i
JOIN itemCreators ic ON i.itemID = ic.itemID AND ic.orderIndex = 0
JOIN creators c ON ic.creatorID = c.creatorID
JOIN itemData td ON i.itemID = td.itemID AND td.fieldID = 1
JOIN itemDataValues tv ON td.valueID = tv.valueID
LEFT JOIN itemData dd ON i.itemID = dd.itemID AND dd.fieldID = 6
LEFT JOIN itemDataValues dv ON dd.valueID = dv.valueID
WHERE c.lastName LIKE '%LastName%'
GROUP BY i.itemID