A great blog post by Kyle Rossetti and Rebecca Bilbro explains how to use the Python dedupe package to disambiguate records that refer to the same real-world entity, both across and within datasets.
It includes code and examples, so you can follow the approach and replicate it in your own work.
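
To give a feel for the underlying problem, here is a minimal sketch of pairwise record matching using only the Python standard library. It is not the dedupe package's API and not the method from the post: dedupe learns its matching rules from labeled examples, while this sketch assumes a fixed similarity threshold. The helper names (`normalize`, `similarity`, `find_duplicate_pairs`), the sample records, and the 0.85 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Lowercase every field value and collapse runs of whitespace."""
    return {k: " ".join(str(v).lower().split()) for k, v in record.items()}


def similarity(a, b, fields):
    """Average per-field string similarity, a value between 0 and 1."""
    scores = [SequenceMatcher(None, a[f], b[f]).ratio() for f in fields]
    return sum(scores) / len(scores)


def find_duplicate_pairs(records, fields, threshold=0.85):
    """Return index pairs whose average similarity meets the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    cleaned = [normalize(r) for r in records]
    return [
        (i, j)
        for i, j in combinations(range(len(cleaned)), 2)
        if similarity(cleaned[i], cleaned[j], fields) >= threshold
    ]


# Hypothetical sample data: the first two rows describe the same company.
records = [
    {"name": "Acme Corp.", "city": "Springfield"},
    {"name": "ACME Corp", "city": "Springfield"},
    {"name": "Globex Inc", "city": "Shelbyville"},
]
pairs = find_duplicate_pairs(records, fields=["name", "city"])
print(pairs)
```

Comparing every pair of records grows quadratically with the number of records, so this only suits small datasets. Handling larger ones is one of the reasons to reach for a dedicated library.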