Let's say we have two tFixedFlowInput components that contain some similar
street addresses, and we want to find those matches.
Data source 1:
Data source 2:
We use the tRecordMatching component with the Jaro-Winkler fuzzy matching
function to find the duplicates.
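To give a feel for what Jaro-Winkler scoring does with two address strings, here is a rough standalone sketch in Python. It is not Talend's code: the function names, the sample addresses, and the 0.85 threshold are all assumptions made up for illustration, and this version always applies the prefix bonus, which some implementations only do above a base score.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro score boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


def find_similar(left, right, threshold=0.85):
    """Return (left, right, score) for every pair scoring at least threshold."""
    return [(a, b, jaro_winkler(a, b))
            for a in left for b in right
            if jaro_winkler(a, b) >= threshold]


# Hypothetical sample addresses, not the data from the job above.
source_1 = ["12 Main Street", "45 Park Avenue"]
source_2 = ["12 Main Stret", "99 Oak Road"]
similar = find_similar(source_1, source_2)
```

With these made-up inputs, only the "12 Main Street" / "12 Main Stret" pair clears the threshold. In the real job, tRecordMatching does the scoring for you; you only pick the matching function and the confidence settings.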
Note: we can also use the blocking definition option.
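The idea behind blocking is to compare only records that share some key, instead of comparing every record with every other record. Here is a small Python sketch of that idea, again not Talend's implementation: the `city` and `address` columns, the sample rows, and the `candidate_pairs` helper are hypothetical.

```python
from collections import defaultdict


def candidate_pairs(left, right, block_key):
    """Yield (left_row, right_row) pairs whose blocking keys are equal."""
    # Index the right-hand rows by their blocking key.
    blocks = defaultdict(list)
    for row in right:
        blocks[block_key(row)].append(row)
    # Each left-hand row is paired only with rows from its own block.
    for row in left:
        for other in blocks.get(block_key(row), []):
            yield row, other


# Hypothetical rows; the city column is used as the blocking key.
main_rows = [
    {"city": "Pune", "address": "12 Main Street"},
    {"city": "Delhi", "address": "45 Park Avenue"},
]
lookup_rows = [
    {"city": "Pune", "address": "12 Main Stret"},
    {"city": "Pune", "address": "7 Hill Road"},
    {"city": "Mumbai", "address": "45 Park Avenue"},
]
pairs = list(candidate_pairs(main_rows, lookup_rows, lambda r: r["city"]))
```

Here only two pairs are produced instead of six, because just the Pune rows share a block. Note the trade-off: the two "45 Park Avenue" rows are never compared, since their cities differ, so the blocking column has to be one you trust.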
My job design is shown below:
And the final output is:
Please do comment if you have any questions.
There is another component, tFuzzyMatch, for the same task.
Not really, no.
For a single column, yes, but for this example it is not possible with tFuzzyMatch.
When you are dealing with real record matching, you have to use tRecordMatching.