🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
Updated Sep 16, 2024 · Python
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining datasets whose records may or may not share a common identifier (e.g., database key, URI, national identification number), which may be due to differences in record shape, storage location, or curator style or preference.
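To make the task concrete, here is a minimal sketch that links records from two sources with no shared identifier. It uses only Python's standard-library `difflib`, not the API of any library listed on this page. The sample records, the `similarity` and `link_records` helpers, the city-based blocking and the 0.75 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical records from two sources that share no common identifier.
source_a = [
    {"name": "Jonathan Smith", "city": "Boston"},
    {"name": "Maria Garcia", "city": "Austin"},
]
source_b = [
    {"name": "Jon Smith", "city": "Boston"},
    {"name": "Maria Garcia", "city": "Austin"},
    {"name": "Wei Chen", "city": "Seattle"},
]


def similarity(a, b):
    """Return a string similarity ratio between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(left, right, threshold=0.75):
    """Pair each left record with its best-scoring right record.

    A pair is kept only if its name similarity reaches the threshold
    (an arbitrary, illustrative value).
    """
    links = []
    for i, l in enumerate(left):
        best_j, best_score = None, 0.0
        for j, r in enumerate(right):
            # Crude "blocking": only compare records from the same city.
            if l["city"].lower() != r["city"].lower():
                continue
            score = similarity(l["name"], r["name"])
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None and best_score >= threshold:
            links.append((i, best_j, round(best_score, 2)))
    return links


links = link_records(source_a, source_b)
for i, j, score in links:
    print(source_a[i]["name"], "<->", source_b[j]["name"], score)
```

Real systems typically replace the single string ratio with several field-level comparisons and a learned or probabilistic scoring model, which is the kind of functionality the projects below advertise.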
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
A powerful and modular toolkit for record linkage and duplicate detection in Python
🆔 Command line tool for deduplicating CSV files
🆔 Examples for using the dedupe library
🔎 Finds fuzzy matches between CSV files
Record Linkage ToolKit (Find and link entities)
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
Link Wikidata items to large catalogs
Python package for deduplication/entity resolution using active learning
Python implementation of anonymous linkage using cryptographic linkage keys
Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.
CLK hash: hash pii for entity matching
Merge Dirty Data with Clean Reference Tables
Learned string similarity for entity names using optimal transport.
Record matching and entity resolution at scale in Spark
A maximum-strength name parser for record linkage.
Privacy Preserving Record Linkage Service
An End-to-End Evaluation Framework for Entity Resolution Systems
Fork of the Freely Extensible Biomedical Record Linkage program
Created by Halbert L. Dunn
Released 1946