Importing Papers
paperKB supports several ways to add papers to a knowledge base. Navigate to your KB → Docs → Import to get started.
Import methods
PubMed IDs (PMIDs)
Paste one or more PMIDs (one per line, or comma/space separated). paperKB will fetch metadata from PubMed and full text from PMC when available.
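As a rough illustration of the accepted input formats, the sketch below splits pasted text on commas and whitespace and keeps only numeric tokens. The function name `parse_pmids` and the choice to collect rejected tokens are assumptions made for this example, not paperKB's actual parser.

```python
import re


def parse_pmids(raw: str) -> tuple[list[str], list[str]]:
    """Split pasted text into (valid PMIDs, rejected tokens).

    Hypothetical helper: commas and any whitespace (including newlines)
    are treated as separators, and a PMID is assumed to be digits only.
    """
    tokens = [t for t in re.split(r"[,\s]+", raw) if t]
    valid: list[str] = []
    rejected: list[str] = []
    for token in tokens:
        if re.fullmatch(r"[0-9]+", token):
            valid.append(token)
        else:
            rejected.append(token)
    # dict.fromkeys drops duplicates while preserving input order.
    return list(dict.fromkeys(valid)), rejected
```

For example, `parse_pmids("12345678, 23456789\n34567")` returns the three IDs in the order they were pasted, with an empty rejected list.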
DOIs
Paste DOIs in any common format (10.1234/..., doi:10.1234/..., or full URLs). Metadata is fetched from CrossRef and PubMed.
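The three formats above can be reduced to one canonical form before lookup. The following is a minimal sketch under stated assumptions: `normalize_doi` is a hypothetical name, the prefix list covers only `doi:` and `doi.org` URLs, and lowercasing is a convenience choice based on DOIs being case-insensitive. paperKB's own normalization may differ.

```python
import re
from typing import Optional

# Prefixes stripped before validation (assumed set, not exhaustive).
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)
# A loose shape check: "10.", a registrant code, a slash, then a suffix.
_DOI_SHAPE = re.compile(r"10\.\d{4,9}/\S+")


def normalize_doi(raw: str) -> Optional[str]:
    """Return a bare, lowercased DOI, or None if the input does not look like one."""
    candidate = _DOI_PREFIX.sub("", raw.strip())
    if _DOI_SHAPE.fullmatch(candidate):
        return candidate.lower()
    return None
```

With this sketch, a bare DOI, a `doi:` prefixed DOI, and a `https://doi.org/` URL for the same paper all normalize to the same string.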
PDF upload
Drag and drop PDF files or click to browse. For each PDF, paperKB will:
- Extract text using pdftotext.
- Search for a DOI in the first few pages.
- If a DOI is found, fetch full metadata from CrossRef/PubMed.
- If no DOI is found, create a source using the filename as the title.
- Chunk the extracted text for search and embedding.
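The first steps of this pipeline can be sketched as follows. This is an illustrative approximation, not paperKB's implementation: the function names, the DOI pattern, and the three-page limit are assumptions. It relies on `pdftotext` writing to stdout when the output file is `-` and separating pages with form-feed characters by default.

```python
import re
import subprocess
from pathlib import Path
from typing import Optional

# Loose DOI pattern (assumption); real-world DOIs can be messier.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_text(pdf_path: Path) -> str:
    """Run pdftotext on the file; '-' sends the extracted text to stdout."""
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_doi(text: str, max_pages: int = 3) -> Optional[str]:
    """Look for a DOI in the first `max_pages` pages of pdftotext output."""
    # pdftotext emits a form feed ("\f") between pages by default.
    pages = text.split("\f")[:max_pages]
    match = DOI_PATTERN.search("\n".join(pages))
    if match is None:
        return None
    # Drop punctuation that commonly trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")


def fallback_title(pdf_path: Path) -> str:
    """When no DOI is found, use the filename (without extension) as the title."""
    return pdf_path.stem
```

A caller would run `extract_text`, pass the result to `find_doi`, and fall back to `fallback_title` when it returns `None`.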
URL import
Paste a URL to a paper. paperKB will attempt to extract metadata and content from the page.
Zotero sync
Connect your Zotero library from Settings → External Keys:
- Generate an API key at zotero.org/settings/keys (needs read access to library and files).
- Add it in Settings → External Keys with service "Zotero".
- On the Import page, select the Zotero tab, choose a library and collection, and import.
- You can also set up auto-sync to keep a Zotero collection in sync with your KB.
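To illustrate what a read-only key permits, the sketch below builds a request for the items in one collection using the Zotero Web API (version 3). The helper name and the placeholder user ID, collection key, and API key are hypothetical, and how paperKB itself talks to Zotero is not described here.

```python
from urllib.parse import urlencode


def build_collection_items_request(
    user_id: str, collection_key: str, api_key: str, limit: int = 25
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for listing items in a Zotero collection.

    Nothing is sent; the caller decides how to perform the HTTP GET.
    """
    query = urlencode({"limit": limit})
    url = (
        f"https://api.zotero.org/users/{user_id}"
        f"/collections/{collection_key}/items?{query}"
    )
    headers = {
        "Zotero-API-Key": api_key,   # the key generated in your Zotero settings
        "Zotero-API-Version": "3",
    }
    return url, headers
```

The returned URL and headers can be handed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.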
What happens after import
After a paper is imported, a background worker processes it:
- Text extraction — full text is parsed from PMC XML or PDF.
- Chunking — text is split into overlapping passages (~200 words each).
- Embedding — each chunk is embedded for semantic search.
- Observation extraction — an LLM extracts structured entities and relationships.
- Citation graph — references are parsed and linked.
- PageRank — citation-based importance scores are computed.
Monitor progress on the Status tab in KB settings.
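Two of the processing steps above, chunking and PageRank, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the 50-word overlap, the 0.85 damping factor, the iteration count, and both function names are choices made for the example, since the text specifies only overlapping passages of roughly 200 words and citation-based scores.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into passages of up to `chunk_size` words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def pagerank(
    citations: dict[str, list[str]], damping: float = 0.85, iterations: int = 50
) -> dict[str, float]:
    """Power-iteration PageRank over a {citing paper: [cited papers]} mapping."""
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    n = len(nodes)
    if n == 0:
        return {}
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Papers with no outgoing references spread their rank evenly.
        dangling = sum(ranks[node] for node in nodes if not citations.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {node: base for node in nodes}
        for citing, cited_list in citations.items():
            if not cited_list:
                continue
            share = damping * ranks[citing] / len(cited_list)
            for cited in cited_list:
                new_ranks[cited] += share
        ranks = new_ranks
    return ranks
```

In this sketch a paper cited by many others ends up with a higher score than the papers citing it, and the scores always sum to 1.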
Bulk operations
From the Docs list, you can select multiple papers and:
- Add them to a collection
- Remove them from a collection
- Remove them from the KB entirely