Duplicate Files Finder homepage
GitHub - datamade/dedupe: A Python library for accurate and scalable data deduplication and entity resolution.
dedupe - A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.