Recently, there’s been an uproar about Georgia’s approach to voter registration. The state’s “exact match” law, passed last year, requires that citizens’ names on their government-issued IDs precisely match their names as listed on the voter rolls. If the two don’t match, a local registrar must perform additional verification. The Georgia NAACP and other civil rights groups have filed a lawsuit arguing that the measure, effective since July 2017, is aimed at disenfranchising racial minorities in the upcoming midterm elections.
Georgia Secretary of State Brian Kemp, a Republican who is running for governor against Democrat Stacey Abrams, has so far put more than 53,000 voter registrations on hold because the names in those voting records do not match the names in other sources of identification, such as driver’s licenses and Social Security cards. Under the measure, voters whose information does not exactly match across sources will need to bring a valid photo ID to the polls on Election Day to vote. That could suppress voter turnout, either because some voters lack IDs or because voters are confused about whether they are eligible.
Proponents of the rule assert that it is only meant to prevent illegal voting. But is a missing hyphen, an initial instead of a complete middle name, or a one-letter discrepancy in a voter’s name good evidence that the voter is not who they say they are? How would we know?
Researchers often need to match records — and they have to get it right
As it happens, researchers often ask that question. In empirical research, they frequently need to link separate sets of data by some imperfect identifier, such as agency names or individual addresses. The work can be tedious, but getting the matches right is crucial. Match the wrong records, and the resulting analysis may be unreliable. That leads many data analysts to retain only exact matches.
But although incorrect matches can cause problems, so can dropping records that should be matched but have small discrepancies. Eliminating those records can also corrupt an analysis.
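To make the trade-off concrete, here is a minimal sketch in Python of the difference between demanding an exact match and allowing small discrepancies. It is an illustration only, not the procedure Georgia or any particular research team uses. The names, the helper functions `normalize`, `similarity` and `classify`, and the 0.8 cutoff are all assumptions chosen for the example; the only library call is `difflib.SequenceMatcher` from Python’s standard library.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase the name, turn punctuation (hyphens, periods) into spaces,
    and collapse runs of whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity score between 0 and 1 for the two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify(a: str, b: str, threshold: float = 0.8) -> str:
    """Label a pair of names. The 0.8 threshold is an arbitrary assumption."""
    if a == b:
        return "exact"
    if similarity(a, b) >= threshold:
        return "probable"
    return "no match"


if __name__ == "__main__":
    # Hypothetical name pairs, invented for illustration.
    pairs = [
        ("Ann Lee", "Ann Lee"),
        ("Maria Garcia-Lopez", "Maria Garcia Lopez"),  # missing hyphen
        ("John Q. Smith", "John Quincy Smith"),        # initial vs. full middle name
        ("Jon Smith", "John Smith"),                   # one-letter discrepancy
        ("John Smith", "Jane Doe"),                    # different people
    ]
    for a, b in pairs:
        print(f"{a!r} vs {b!r}: {classify(a, b)}")
```

An exact-match rule would keep only the first pair. The sketch instead sets the three small discrepancies aside as probable matches, which a person could then review, and still rejects the pair that plainly refers to different people. Where to set the cutoff is a judgment call: set it too low and wrong records get linked, set it too high and valid records are dropped.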