PercolatorMatching 1.1.0

.NET CLI:
dotnet add package PercolatorMatching --version 1.1.0

Package Manager Console:
NuGet\Install-Package PercolatorMatching -Version 1.1.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
<PackageReference Include="PercolatorMatching" Version="1.1.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI:
paket add PercolatorMatching --version 1.1.0

Script & Interactive:
#r "nuget: PercolatorMatching, 1.1.0"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

Cake:
// Install PercolatorMatching as a Cake Addin
#addin nuget:?package=PercolatorMatching&version=1.1.0

// Install PercolatorMatching as a Cake Tool
#tool nuget:?package=PercolatorMatching&version=1.1.0

A simple DLL containing a matching class that compares two strings and calculates a similarity score between them using the Ratcliff/Obershelp algorithm.

Compatible target framework: .NET Framework.

This package has no dependencies.

Version Downloads Last updated
1.1.0 29 4/7/2015

I originally built this when I found out that the Fuzzy Lookup and Fuzzy Grouping components of SSIS were only available in Enterprise editions of SQL Server. I've used it to scan database tables for possible duplicate entries and write the results to another table for a user to review later. I've used it in other applications as well.

Reference the DLL and import the namespace "Percolator.Matching" with a using directive. Then create a new instance of "Fuzzylator".

The "ThresholdPercentage" is the threshold that the score of the two strings must meet for them to be deemed similar. It can be set when creating the new object, or later. If no threshold is set, it defaults to "Zero" percent.
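To make the threshold idea concrete, here is a minimal sketch of how a named threshold could be compared against a score. It is not taken from the library: the enum ThresholdLevelSketch, its numeric values, the 0 to 100 score scale, and the use of "greater than or equal" are all assumptions made for illustration.

```csharp
using System;

// Hypothetical stand-in for the library's ThresholdPercentage (values assumed).
public enum ThresholdLevelSketch
{
    Zero = 0,
    Eighty = 80,
    Ninety = 90
}

public static class ThresholdSketch
{
    // Assumes scorePercent is on a 0-100 scale and that "meeting" the
    // threshold means being greater than or equal to it.
    public static bool Meets(double scorePercent, ThresholdLevelSketch level)
    {
        return scorePercent >= (int)level;
    }
}
```

Under these assumptions a "Zero" threshold accepts every pair of strings, so it is usually worth setting an explicit threshold before relying on a similarity check.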

There are several overloads of the "IsSimilar" method to accommodate a few different scenarios.
--A score is calculated during every check. The optional out parameter returns that score to the caller, so it can be used later without having to calculate it again.
--An optional ThresholdPercentage can be passed to a single method call. That call then uses it instead of the threshold set on the instance.
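The score calculated during each check comes from the Ratcliff/Obershelp algorithm named in the package description. As a rough illustration of that algorithm (not the library's actual source), the sketch below finds the longest common substring, recurses on the pieces to its left and right, and returns 2 * matched characters / total length. The class name RatcliffObershelpSketch, the 0 to 1 score scale, and the ignoreCase parameter are assumptions for this example.

```csharp
using System;

// Illustrative sketch only; PercolatorMatching's own implementation may differ.
public static class RatcliffObershelpSketch
{
    // Returns a similarity in [0, 1]: 2 * matchedChars / (a.Length + b.Length).
    public static double Score(string a, string b, bool ignoreCase = false)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));
        if (ignoreCase)
        {
            a = a.ToUpperInvariant();
            b = b.ToUpperInvariant();
        }
        if (a.Length + b.Length == 0) return 1.0; // two empty strings are identical
        int matched = Matched(a, 0, a.Length, b, 0, b.Length);
        return 2.0 * matched / (a.Length + b.Length);
    }

    // Counts matched characters between a[aLo..aHi) and b[bLo..bHi).
    private static int Matched(string a, int aLo, int aHi, string b, int bLo, int bHi)
    {
        int bestLen = 0, bestA = aLo, bestB = bLo;

        // Brute-force search for the longest common substring in the two ranges.
        for (int i = aLo; i < aHi; i++)
        {
            for (int j = bLo; j < bHi; j++)
            {
                int k = 0;
                while (i + k < aHi && j + k < bHi && a[i + k] == b[j + k]) k++;
                if (k > bestLen) { bestLen = k; bestA = i; bestB = j; }
            }
        }
        if (bestLen == 0) return 0;

        // Recurse on the unmatched parts to the left and right of the match.
        return bestLen
            + Matched(a, aLo, bestA, b, bLo, bestB)
            + Matched(a, bestA + bestLen, aHi, b, bestB + bestLen, bHi);
    }
}
```

For "Test String" and "A Test String" this sketch matches 11 characters out of 24 in total, giving 22 / 24, or about 0.917. The scale the library itself reports (for example 0 to 1 versus 0 to 100) is not stated here, so treat the number as illustrative.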

"IsUPCSimilar" is a specialized check that is streamlined specifically for UPC strings. It does not calculate longest common subsequences; it just compares the digits position by position, in order, and derives the score from that.
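A position-by-position comparison like the one described for UPC strings could look roughly like the sketch below. It is an assumption-laden illustration, not the library's code: the class name UpcScoreSketch, the 0 to 1 scale, and dividing by the longer length when the inputs differ in length are all choices made for this example.

```csharp
using System;

// Illustrative sketch only; the library's UPC scoring rules are not documented here.
public static class UpcScoreSketch
{
    // Fraction of positions holding the same character, relative to the longer input.
    public static double Score(string upc1, string upc2)
    {
        if (upc1 == null) throw new ArgumentNullException(nameof(upc1));
        if (upc2 == null) throw new ArgumentNullException(nameof(upc2));

        int longer = Math.Max(upc1.Length, upc2.Length);
        if (longer == 0) return 1.0; // two empty codes are treated as identical

        int shorter = Math.Min(upc1.Length, upc2.Length);
        int same = 0;
        for (int i = 0; i < shorter; i++)
        {
            if (upc1[i] == upc2[i]) same++; // compare digit i of each code
        }
        return (double)same / longer;
    }
}
```

Because each position is visited once, this runs in linear time, which is the kind of saving a streamlined UPC check gains over a general substring-based comparison.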

"GetScore" returns the score between the two strings, using the Ratcliff/Obershelp algorithm.

"GetUPCScore" again is a streamlined algorithm specifically for a UPC string.

Examples =>

Using the similarity bools:

using System;
using Percolator.Matching;

var fuz = new Fuzzylator(ThresholdPercentage.Eighty);

string str1 = "Test String";
string str2 = "A Test String";

if (fuz.IsSimilar(str1, str2))
{
    //Do something
}

double score;
if (fuz.IsSimilar(str1, str2, out score))
{
    //Do something
    Console.WriteLine(score); //score now contains the score of the two strings
}

if (fuz.IsSimilar(str1, str2, ThresholdPercentage.Ninety))
{
    //Do something
    //The IsSimilar check uses a Ninety percent threshold for this one time.
}

//GetScore returns the score between str1 and str2, optionally ignoring case (the third argument).
//A separate variable name is used because "score" is already declared above.
double caseInsensitiveScore = fuz.GetScore(str1, str2, true);