Record Linkage

Record linkage is the process of identifying and linking records, across several files or databases, that refer to the same entity. The problem can be described formally as follows.
Let A be a dataset (a database or collection of records) containing all the data from a given census (e.g. 1871 or 1881).
A record a in A is the information collected for one particular person (entity). This information is split into attributes (also called fields or items), which correspond to the answers collected in the census. Each record has N attributes (e.g. first name, last name, date of birth, birthplace), written a = (a1, a2, ..., aN).
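
As a rough illustration of this notation, the sketch below models a record as a fixed-length tuple of attributes and the dataset A as a list of such records. The class name Record, the choice of four fields, and all the values are assumptions made up for the example; they are not taken from any actual census file.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One record a = (a1, ..., aN). Here N = 4; the field choice is illustrative."""
    first_name: str
    last_name: str
    date_of_birth: str
    birth_place: str


# Dataset A: the records collected in one census.
# All values below are invented for illustration only.
A = [
    Record("John", "Smith", "1845-03-12", "Ontario"),
    Record("Mary", "Brown", "1850-07-01", "Quebec"),
]

a = A[0]   # a record a in A
N = len(a)  # number of attributes in each record
a1 = a[0]   # first attribute (first name); same value as a.first_name
```

A named tuple keeps the positional view (a1, ..., aN) used in the text while still allowing access by attribute name, which tends to be easier to read when comparing fields across records.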