
Save fingerprints and identify tracks

Sergiu Ciumac edited this page Dec 24, 2021 · 8 revisions

Save and query

Save fingerprints

Once you've generated the fingerprints, you have to store them for later retrieval. The SoundFingerprinting package comes bundled with an in-memory storage that can be used for testing purposes and small projects.

var modelService = new InMemoryModelService();
var track = new TrackInfo("GBBKS1200164", "Skyfall", "Adele");
var fingerprints = await FingerprintCommandBuilder.Instance
                .BuildFingerprintCommand()
                .From(PathToWav)
                .Hash();

// store hashes in the database for later retrieval
modelService.Insert(track, fingerprints);

InMemoryModelService is not the only option available. If you are looking for persistent storage dedicated to SoundFingerprinting fingerprints, check out Emy.

Query

Querying the model service is done by creating a QueryCommand and specifying a reference to an IModelService with previously-stored tracks.

string path = "query.wav";
var queryResult = await QueryCommandBuilder.Instance
                                 .BuildQueryCommand()
                                 .From(path)
                                 .UsingServices(modelService)
                                 .Query();

// iterate over the results, if any
foreach(var (entry, _) in queryResult.ResultEntries)
{
    // output only those tracks that matched at least 5 seconds.
    if(entry.TrackCoverageWithPermittedGapsLength >= 5d)
    {
        // the $ prefix enables string interpolation of the values in braces
        Console.WriteLine($"Track {entry.Track.Id} matched. Coverage: {entry.TrackCoverageWithPermittedGapsLength:0.00} seconds");
    }
}

If you want to generate query requests from a real-time source, check the following page for details: Realtime Query.

Filtering false positives

False positives are entries that matched even though they should not have. This happens due to the nature of the approximate nearest neighbor algorithm employed in identifying similar fingerprints. There are various ways to filter them, depending on your use case.

  • TrackCoverageWithPermittedGapsLength >= 5d - consider matches that covered at least 5 seconds of the matched track. Good for a general-purpose dataset.
  • TrackRelativeCoverage >= 0.4 - consider matches that covered at least 40% of the matched track (i.e., for a 30-second track, that would be 12 seconds). Suitable for a dataset of ads.
  • Confidence >= 0.2 - confidence is a measure very similar to TrackRelativeCoverage. The two differ in how they treat matches that happen on the boundary of the issued query or the inserted track. Confidence is only useful if you send real-time requests by cutting the query into chunks of equal length.
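
The three thresholds above can be sketched as simple LINQ filters. This is only an illustration: the Match record is a hypothetical stand-in that mirrors the three property names discussed above, not a type from the SoundFingerprinting package, and the sample values are made up. In real code the same predicates would be applied to the entries returned by the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a result entry; it mirrors only the three
// properties discussed above and is NOT a SoundFingerprinting type.
public record Match(
    string TrackId,
    double TrackCoverageWithPermittedGapsLength, // seconds covered in the matched track
    double TrackRelativeCoverage,                // covered seconds / track length
    double Confidence);

public static class MatchFilters
{
    // General-purpose dataset: keep matches covering at least minSeconds.
    public static IEnumerable<Match> GeneralPurpose(IEnumerable<Match> matches, double minSeconds = 5d) =>
        matches.Where(m => m.TrackCoverageWithPermittedGapsLength >= minSeconds);

    // Dataset of ads: keep matches covering at least minRelative of the track.
    public static IEnumerable<Match> Ads(IEnumerable<Match> matches, double minRelative = 0.4) =>
        matches.Where(m => m.TrackRelativeCoverage >= minRelative);

    // Real-time queries cut into equal chunks: filter on confidence.
    public static IEnumerable<Match> Realtime(IEnumerable<Match> matches, double minConfidence = 0.2) =>
        matches.Where(m => m.Confidence >= minConfidence);
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample values for illustration only.
        var matches = new List<Match>
        {
            new("ad-1",   12d, 0.4,    0.4),  // 12 s of a 30 s ad
            new("song-1",  3d, 0.0125, 0.01), // 3 s of a 240 s song
            new("song-2",  8d, 0.04,   0.05), // 8 s of a 200 s song
        };

        foreach (var m in MatchFilters.GeneralPurpose(matches))
        {
            Console.WriteLine($"General-purpose filter kept {m.TrackId}");
        }

        foreach (var m in MatchFilters.Ads(matches))
        {
            Console.WriteLine($"Ads filter kept {m.TrackId}");
        }
    }
}
```

Which threshold to use depends on the dataset: an absolute length such as 5 seconds is a tiny fraction of a long song, so a relative threshold of 0.4 would reject most partial song matches while still suiting short ads.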

Configuration

As with the fingerprinting command, you can parameterize fingerprint creation for your query. By default, the algorithm uses the DefaultQueryConfiguration class to configure your query command and SoundFingerprintingAudioService to read samples from the provided file.

The list of query configuration options is quite extensive, covering a wide variety of scenarios. See the Query Configuration page for details.

See the Audio Services page to learn how to generate query fingerprints from various file formats.