Some time ago I did a 'sonification' of what Shazam's reduction of your music would sound like [1]. Perhaps it adds something to the article. You can actually Shazam it; it still works.
Hi, I’m the author of the article.
In fact, I did the same thing when I built my Shazam prototype.
When I wrote the article, I considered adding a subchapter to the Shazam chapter with a well-known piece of music and its fingerprinted version, so that everyone could hear what it sounds like, but I didn’t do it because I feared a copyright lawsuit.
Would that be covered under fair use, or not, because the article wasn't about those songs themselves (as in, the author could have picked any song, not necessarily a commercial one)?
This is a textbook example of fair use! A 5-second clip would have sufficed. It wouldn't have reproduced a large part of the work. It certainly wouldn't have affected the market for that work. It was for educational or criticism purposes, etc.
OTOH I can see why the author would have wanted to steer a million miles clear of reproducing ANYTHING (including so much as mentioning the title of any work, which obviously isn't copyright infringement).
In this case it's not so much about copyright infringement as about steering very, very clear of any reference to anything.
Fair use would seem to apply under most interpretations regardless (excerpts and quotations for the purposes of illustration are generally held to be covered), but you really can't say that for sure until the courts decide. There's nothing you can do to keep a rights holder from dragging you through court.
I thought they used Parsons code, since it is space-efficient as a fingerprinting technique, means less data across the wire for a partial fingerprint, and handles tempo drift. In addition, I know they were becoming CPU-bound and then moved matching to GPUs, which greatly helped them.
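For anyone unfamiliar with it: Parsons code reduces a melody to its contour, recording only whether each note goes up (U), down (D), or repeats (R) relative to the previous one, with `*` marking the first note. Because it ignores durations and exact intervals, it is unaffected by tempo changes and transposition. Here is a minimal Python sketch, assuming the pitches have already been extracted as MIDI note numbers; the function name `parsons_code` is my own, and nothing here shows that Shazam works this way.

```python
def parsons_code(pitches):
    """Return the Parsons code (melodic contour) for a sequence of pitches.

    `pitches` is assumed to be a list of comparable pitch values,
    for example MIDI note numbers. This is an illustrative helper only.
    """
    if not pitches:
        return ""
    code = ["*"]  # the first note has no predecessor
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            code.append("U")  # up
        elif cur < prev:
            code.append("D")  # down
        else:
            code.append("R")  # repeat
    return "".join(code)


# Opening of "Twinkle Twinkle Little Star": C C G G A A G
twinkle = [60, 60, 67, 67, 69, 69, 67]
print(parsons_code(twinkle))  # *RURURD
```

The same string comes out whether the tune is played fast, slow, or in another key, which is the property described above.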
When I started this side project in 2012, I looked for reliable public information (especially theses or research papers), and the only useful source I found was the paper by Shazam's co-founder.
Since this paper was written in 2003, I wouldn't be surprised if they have changed their algorithms since then.
But from my understanding, the 2003 paper describes a highly scalable architecture and a noise-tolerant and "time efficient" algorithm (that can be modified using thresholds), so it could still work in 2015 with a few optimizations. Still, I don't work at Shazam and I'm not a researcher, so I could be wrong.
[1] https://soundcloud.com/sample_noise/shazuffle-ii-shazam-me