a small .pdf management tool with a command-line UI - 2mol
paperboy
How could I merge / join PDF on linux cmdline terminal
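Note to self: a minimal sketch of one way to do the merge, assuming pdftk is installed (pdftk's `cat` operation concatenates the inputs in the order given). The helper names `build_merge_command` and `merge_pdfs` are made up for this note; only the pdftk invocation itself is real.

```python
import shutil
import subprocess
from pathlib import Path


def build_merge_command(inputs, output):
    """Return the argv for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *map(str, inputs), "cat", "output", str(output)]


def merge_pdfs(inputs, output):
    """Run the merge. Hypothetical wrapper; fails if pdftk is not on PATH."""
    if shutil.which("pdftk") is None:
        raise FileNotFoundError("pdftk not found on PATH")
    # check=True raises CalledProcessError if pdftk exits non-zero
    subprocess.run(build_merge_command(inputs, output), check=True)
    return Path(output)
```

Usage would be something like `merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf")`; building the argv separately makes it easy to print the command before running it.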
preprocess (unpaper) and ocr (tesseract) pdf files and 'sandwich' the text behind the image -> output is a selectable pdf
seems perfect for pdf pipeline
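Rough sketch of the two steps such a pipeline chains per page, assuming the scan is already a single-page PNM image (the format unpaper has traditionally worked with). This is not the linked tool's actual code. The function name `ocr_commands` and the `-clean` file suffix are invented here; the two command shapes (`unpaper in out`, `tesseract image outputbase pdf`) are the standard invocations, and tesseract's `pdf` output is what places the recognized text as an invisible layer with the page image.

```python
from pathlib import Path


def ocr_commands(scan, workdir="."):
    """Return the unpaper and tesseract argvs for one scanned page image."""
    scan = Path(scan)
    cleaned = Path(workdir) / f"{scan.stem}-clean.pgm"  # hypothetical naming
    out_base = Path(workdir) / scan.stem  # tesseract appends ".pdf" itself
    return [
        ["unpaper", str(scan), str(cleaned)],               # deskew / clean up
        ["tesseract", str(cleaned), str(out_base), "pdf"],  # searchable PDF
    ]
```

Each argv can be passed to `subprocess.run(cmd, check=True)` in order; the per-page PDFs would then still need to be merged.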
PDF viewer for linux:
- few dependencies
- relatively small
- zathura-like hjkl movement
- rest of movement may be bindable like zathura?
- can open annotation viewer/editor with the a key
- can invert color with I, tint color with C
- runs on opengl (the newer mupdf-gl version)
- shows highlights and annotations
- can create note-based annotations (not highlights though?)
- can be customized to vim-like keys
- minimal interface and distraction
Python PDF Parser -- fork with Python 2+3 support using six - pdfminer
Extracts and formats text annotations from a PDF file - 0xabu
Python script to do PDF OCR conversion using Tesseract - virantha
This is also what paperless uses for its OCR process
A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk - hellerbarde
allows differentiating between physical and logical pages, might be useful
I want a python function that takes a pdf and returns a list of the text of the note annotations in the document. I have looked at python-poppler
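A possible sketch of such a function, swapping in pypdf (the library mentioned in the pdftk-alternative entry above) instead of python-poppler. It assumes note annotations are the ones with `/Subtype` `/Text` and that their text sits in `/Contents`, which is how the PDF format stores sticky notes. The names `note_texts`, `pdf_note_texts` and `_resolve` are invented for this note, and this is untested against real-world files.

```python
def _resolve(obj):
    """Follow an indirect reference if the object offers get_object()."""
    get_object = getattr(obj, "get_object", None)
    return get_object() if callable(get_object) else obj


def note_texts(pages):
    """Collect the /Contents of every /Text (sticky note) annotation.

    `pages` is any iterable of dict-like page objects, e.g. PdfReader.pages.
    """
    texts = []
    for page in pages:
        annots = _resolve(page.get("/Annots")) or []
        for annot in annots:
            annot = _resolve(annot)
            if annot.get("/Subtype") == "/Text" and annot.get("/Contents"):
                texts.append(str(annot["/Contents"]))
    return texts


def pdf_note_texts(path):
    """Open a PDF with pypdf (third-party, assumed installed) and list notes."""
    from pypdf import PdfReader  # imported lazily so note_texts works without it
    return note_texts(PdfReader(path).pages)
```

Highlights and other annotation types are skipped on purpose; changing the `/Subtype` check would widen the selection.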
I want to edit the metadata of a scanned PDF to assign custom page numbers to different pages. For example, what are now pages 1-3 I might want to call i, ii and iii, and what are pages 4-10, I wan...
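For reference, PDF stores this as page label ranges: each range has a 0-based start page, a numbering style and a first number. A small sketch of how such ranges map physical pages to labels follows. It only computes the labels and does not write them into a file. The style codes "r" (lowercase roman) and "D" (decimal) mirror the PDF page label styles; the function names are made up, and since the question above is cut off, numbering the remaining pages from 1 in the example is purely an assumption.

```python
def to_roman(n):
    """Lowercase roman numeral for a positive integer."""
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def page_labels(page_count, ranges):
    """Expand (start_index, style, first_number) ranges into one label per page.

    Ranges must be sorted by start_index and the first must start at 0.
    """
    labels = []
    for i, (start, style, first) in enumerate(ranges):
        end = ranges[i + 1][0] if i + 1 < len(ranges) else page_count
        for offset in range(end - start):
            n = first + offset
            labels.append(to_roman(n) if style == "r" else str(n))
    return labels


# Physical pages 1-3 become i, ii, iii; the rest restart at 1 (assumed).
print(page_labels(10, [(0, "r", 1), (3, "D", 1)]))
```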
PDF 'Meta-Data' and Formatting Tools:
- pdftk - allows changing page order, extracting pages, combining pages (or image files) to one pdf
- jPdf - similar features to pdftk; allows adding bookmarks, renumbering pages, watermarks, attachments, encryption, signing, etc.
- ABBYY FineReader (win) - OCR and image editing
- IrfanView (win) / XnView (unix) - allows batch editing the extracted image files
Annotation:
- Okular - the KDE PDF viewer; annotation works well and fast, dark mode/light mode; downside: needs many KDE dependencies installed
- MasterPDF Editor - small, fast, many annotation features; downside: previous Okular annotations don't seem to show up (don't know if they are incompatible)
Search:
MasterPDF Editor seems awesome, and can be customized almost to vim-like keybinds. It's free in the AUR (watermark-removed version)
From previous experience, I found that I was making too many knowledge items while reading, which made it hard to identify the really important ones. So now, I read through the entire document (article PDF or chapter of a monograph) and mark possible knowledge items (yellow highlight in Citavi PDFs, pencil or 3M flags in printed materials). When I've finished, I write a summary knowledge item in 2 paragraphs: the first outlines the argument, the second (in italics) gives my evaluation and thoughts about possible uses. (I don't use the evaluation field in the contents tab as I want to be able to see this quickly in the knowledge view.) Then I go back through the marks I've made and work out which ones I want to input as knowledge items. As part of the input, I assign each item to as specific a category as possible. Every few days, I look through my outline to see where the holes are (categories with few knowledge items), and to see if anything in the NOT SURE YET category sparks any new thoughts on organization.