DocDigitizer is an automated data capture service that streamlines your inbound document flow by transforming unstructured, human-readable documents.
TL;DR: If you just want to jump to the solution I found without reading about all 20 years of my struggles, see the following list. You will need all of the software on it: PhotoStructure, AntiDupl.NET, Adobe Lightr