Paperwork - a simple way to convert paper documents using a scanner and OCR
Paperwork
Description
Paperwork is a personal document manager for scanned documents (and PDFs).
It's designed to be easy and fast to use. The idea behind Paperwork is "scan & forget": You should be able to just scan a new document and forget about it until the day you need it again.
In other words, let the machine do most of the work for you.
Screenshots
Main Window
Search Suggestions
Labels
Settings window
Main features
- Scan
- Automatic detection of page orientation
- OCR
- Document labels
- Automatic guessing of the labels to apply to new documents
- Search
- Keyword suggestions
- Quick edit of scans
- PDF support
Installation
Contact/Help
Details
Papers are organized into documents. Each document contains pages.
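The document/page organization described above can be pictured with a small sketch. This is only an illustration using the Python standard library, not Paperwork's actual code: the names `Page`, `Document`, `contains` and `search` are hypothetical, and a plain substring match stands in for the real index.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """One scanned page; `text` holds the words extracted by OCR."""
    text: str = ""


@dataclass
class Document:
    """A document groups pages and carries user-assigned labels."""
    name: str
    pages: List[Page] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)

    def contains(self, keyword: str) -> bool:
        """Case-insensitive substring lookup across all pages."""
        needle = keyword.lower()
        return any(needle in page.text.lower() for page in self.pages)


def search(documents: List[Document], keyword: str) -> List[str]:
    """Return the names of the documents that mention `keyword`."""
    return [doc.name for doc in documents if doc.contains(keyword)]
```

For example, given a document named "invoice" whose pages contain the word "bill", `search(documents, "bill")` would return a list containing "invoice".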
Paperwork mainly uses:
- Sane/Pyinsane: To scan the pages
- Tesseract/Pyocr: To extract the words from the pages (OCR)
- GTK: For the user interface
- Whoosh: To index and search documents, and provide keyword suggestions
- Simplebayes: To guess the labels
- Pillow: Image manipulation
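To give an idea of how label guessing from document text can work, here is a minimal naive Bayes sketch written with the Python standard library only. It is a stand-in for illustration and does not use or reproduce the Simplebayes API. The class `LabelGuesser` and its methods are hypothetical names, and the sketch picks a single best label, whereas Paperwork may apply several labels to one document.

```python
import math
from collections import Counter, defaultdict


class LabelGuesser:
    """Tiny multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocabulary = set()

    def train(self, label, text):
        """Record the words of an already-labeled document."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def guess(self, text):
        """Return the most likely label for `text`, or None if untrained."""
        if not self.doc_counts:
            return None
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        # One extra slot accounts for words never seen during training.
        vocab_size = len(self.vocabulary) + 1
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In this sketch, each document the user labels by hand would be passed to `train`, and the OCR text of a newly scanned document would be passed to `guess` to get a suggestion.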
Licence
GPLv3 or later. See COPYING.
Archives
GitHub can automatically provide .tar.gz and .zip files if required. However, they are not required to install Paperwork. They are listed here as a convenience for package maintainers.
- Paperwork 0.3.0.1
- Paperwork 0.3.0
- Paperwork 0.2.5
- Paperwork 0.2.4
- Paperwork 0.2.3
- Paperwork 0.2.2
- Paperwork 0.2.1
- Paperwork 0.2
- Paperwork 0.1.3
- Paperwork 0.1.2
- Paperwork 0.1.1
- Paperwork 0.1
Development
All development information can be found on the wiki.