Search Software America’s Data Matching Technology Helps EMI Music Publishing Improve Song Copyright Compliance

In terms of copyrights owned, controlled or administered, EMI Music Publishing, with offices in 30 countries and rights to more than one million musical compositions, is the world's largest music publisher.

Music publishers are in the business of acquiring and marketing the rights to musical compositions by entering into agreements with composers and writers for the use of their copyrighted words and music. Their business is to ensure that the songs are heard and enjoyed, typically by licensing them for performance, broadcast and inclusion in television programs, advertisements and motion pictures. They also make certain that all royalties due on such licensing agreements are collected and distributed.

EMI relies heavily on technology and has long been a pioneer in using it to further its business. It was the first major music publisher to establish a Web site, the first to use an advanced lyric search engine and the first to process licensing applications online.

EMI also uses computers in its back-end processing to perform data matching. For example, a performing society may send it a batch of performances, or a retail organization may send it information about CDs, tapes or DVDs sold for which the copyright holder is uncertain. These records must be matched against EMI’s copyright database to determine whether EMI is the custodian of a particular song and, if so, to whom the royalties should be distributed.

Another important function of data matching is to check for copyright compliance and infringement. Many Internet sites offer songs for downloading and, because of recent legislation, they must now ask music publishers for permission to use this copyrighted material. Accurate data matching means a fairer distribution of royalties and increased revenue.

Because there is no reliable universal song numbering system, EMI must use a combination of strategies to match song titles and writer names to its database. Inevitably, some songs remain unmatched at the end of the process (e.g. the song “LOVE” by the writer “SMITH”) and must be matched manually. The greater the number of songs that need to be manually matched, the greater the drain on productivity.
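To illustrate why a title such as “LOVE” by “SMITH” ends up in the manual queue, here is a minimal Python sketch of title-and-writer matching. It is not SSA’s or EMI’s actual method: the function names, the 60/40 weighting, the 0.85 threshold and the catalog layout are all assumptions chosen for illustration, and the sample records are invented.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Uppercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^A-Za-z0-9 ]+", " ", text).upper()
    return " ".join(text.split())


def similarity(a, b):
    """Similarity in [0, 1] between two strings after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_song(title, writer, catalog, threshold=0.85):
    """Return (record, status); status is 'matched', 'manual' or 'unmatched'.

    The weights and threshold are illustrative assumptions, not tuned values.
    """
    candidates = []
    for record in catalog:
        score = (0.6 * similarity(title, record["title"])
                 + 0.4 * similarity(writer, record["writer"]))
        if score >= threshold:
            candidates.append(record)
    if len(candidates) == 1:
        return candidates[0], "matched"
    if len(candidates) > 1:
        # Several plausible works: too ambiguous to decide automatically.
        return None, "manual"
    return None, "unmatched"


# Invented sample catalog (hypothetical records).
catalog = [
    {"id": 1, "title": "Midnight Train Home", "writer": "Garcia"},
    {"id": 2, "title": "Love", "writer": "Smith"},
    {"id": 3, "title": "Love", "writer": "Smith"},
]
```

In this sketch a distinctive title with stray punctuation or different casing still resolves to a single record, while a short, common title shared by several works is routed to manual review.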

EMI sought to improve the reliability of its data matching and reduce the number of required manual matches. It turned to Search Software America (SSA), a software company with expertise and a long history in identity data searching and matching. SSA develops and distributes a number of products that enable organizations to improve on in-house methods of searching for and matching identity data in databases and file systems.

EMI was particularly interested in SSA’s Data Clustering Engine, a product that is specifically designed for batch data matching and grouping work (SSA also has products for online searching). Based on the results of proof-of-concept tests, EMI decided to purchase the Data Clustering Engine. A snapshot of clustering results is shown below:

Although SSA had a number of customers already using its software for music-related searching or matching, EMI’s data required some fine-tuning of the software’s matching rules. After consulting with specialists from SSA, EMI experienced a 16 to 20 percent improvement in automatic match quality over its existing methods.

“To say the least, we were very pleased,” said Alec Malyon, EMI’s IT Director. “We can match files from infringing companies and claim our rights more accurately and faster, and this results in increased revenue.” Malyon added: “Processing new files is low effort: when we receive third-party files, we convert them to a standard format before sending them to the server (where the Data Clustering Engine runs) and thus need minimal changes to the SSA rules for this job. The actual runs are well within our processing time thresholds.”

“EMI’s needs were an exciting challenge for SSA, particularly due to the nature and variability of the data and the potential ROI that EMI could achieve, especially in the copyright infringement area,” said Michael Dunkerley, SSA’s vice president of global marketing. Dunkerley added: “The Data Clustering Engine can operate in a number of modes, including grouping the data in one or many files, screening an external transaction file against internal reference data, and looking for the ‘non-matches’ between files, an important process in compliance work.”
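The three modes Dunkerley describes can be sketched in a few lines of Python. This is only a conceptual illustration using exact matching on a normalized key. It does not reflect the Data Clustering Engine’s interface or matching rules, and every name below (match_key, group_records, screen, non_matches) is hypothetical.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical matching key: uppercased title and writer, whitespace collapsed."""
    title = " ".join(record["title"].upper().split())
    writer = " ".join(record["writer"].upper().split())
    return (title, writer)


def group_records(records):
    """Grouping mode: cluster records that share the same key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record)
    return list(clusters.values())


def screen(transactions, reference):
    """Screening mode: pair each transaction with the reference records it matches."""
    index = defaultdict(list)
    for ref in reference:
        index[match_key(ref)].append(ref)
    return [(t, index[match_key(t)]) for t in transactions if match_key(t) in index]


def non_matches(transactions, reference):
    """Non-match mode: transactions with no counterpart in the reference data."""
    known = {match_key(ref) for ref in reference}
    return [t for t in transactions if match_key(t) not in known]


# Invented sample data (hypothetical records).
reference = [
    {"title": "Love", "writer": "Smith"},
    {"title": "love ", "writer": "SMITH"},
    {"title": "Blue Sky", "writer": "Jones"},
]
transactions = [
    {"title": "BLUE  SKY", "writer": "jones"},
    {"title": "Unknown Tune", "writer": "Doe"},
]
```

In compliance work, the output of the non-match step is the interesting one: it lists the externally reported titles that have no corresponding entry in the reference data.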

The Data Clustering Engine runs efficiently on Unix and Windows platforms, and can source data from either flat files or directly from relational databases. For more information on this product, visit http://www.searchsoftware.com/products/dce.htm, or visit http://www.searchsoftware.com/products/index.htm to see all of SSA’s identity search and matching products.

For more information on EMI Music Publishing and to browse the largest number of sound clips available from any publisher on a Web site (over 100,000), visit www.emimusicpub.com.

# # #