Scan with a Raspberry Pi, convert to searchable PDF with AWS Lambda
I have long dreamed of a setup that lets me just press the scan button on my scanner and, without any further input, uploads the result as a searchable PDF to a cloud drive. Thanks to the good scanner support in SANE and the ease of use of AWS Lambda, it’s actually quite easy. Judging by the length of this post it looks like quite a task, but in the end it is straightforward and, surprisingly, quite free of hacks.
In this solution you:
set up SANE on your Raspberry Pi 3 so it scans your document
set up scanbd to detect the scan button
set up an S3 bucket for uploading
set up a Lambda function which uses Tesseract to create a searchable PDF
(optionally) set up the Google API to store the PDF on Google Drive
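To give a feel for the Lambda step, here is a minimal sketch in Python of what such a function has to work out when a scan lands in the bucket. It is not the code used later in this post: the names `scanned_objects`, `tesseract_command` and `handler`, the `/tmp` working path and the `eng` language default are all assumptions made for illustration. It only plans the work; downloading the scan from S3, running Tesseract and uploading the PDF are left out.

```python
from urllib.parse import unquote_plus


def scanned_objects(event):
    """Yield (bucket, key) for each object named in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become '+').
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def tesseract_command(image_path, output_base, lang="eng"):
    """Build the Tesseract call that writes <output_base>.pdf with a text layer."""
    # 'pdf' is Tesseract's output config; the language default is an assumption.
    return ["tesseract", image_path, output_base, "-l", lang, "pdf"]


def handler(event, context):
    """Hypothetical Lambda entry point: plans the OCR work, performs no I/O."""
    plans = []
    for bucket, key in scanned_objects(event):
        # Assumed working location; /tmp is the writable directory in Lambda.
        local = "/tmp/" + key.rsplit("/", 1)[-1]
        output_base = local.rsplit(".", 1)[0]
        plans.append({
            "bucket": bucket,
            "key": key,
            "command": tesseract_command(local, output_base),
        })
    return plans
```

In a real function, each planned command would be run on the downloaded scan (for example with `subprocess.run`), and the resulting PDF uploaded afterwards.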
What you need:
A Raspberry Pi 3 (I guess the other models serve equally well)
A paper scanner with a “scan” button that is supported by SANE
An AWS account
Personally, I’m using Raspbian Stretch Lite as the OS on my Raspberry Pi and a Fujitsu S1300i as the scanner.
Before you start: you might just want to wipe your Pi and start fresh. It takes about 15 minutes extra, and you can follow my howto to do it headless (without attaching a monitor or keyboard to the Pi).