Music Information Retrieval Annotated Bibliography Project


Our BibTeX Format - a slight variation of BibTeX



General Description of Our BibTeX Format:

For the Music Information Retrieval Annotated Bibliography Project, we use a custom BibTeX format that draws on many of the current BibTeX standards. Please note that we occasionally embed HTML code within the values of our BibTeX fields in order to enhance the web view in Greenstone. Pay particular attention to our special use of these fields: addedkeywords, authorrole, backgroundcode, dateentryedited, editorrole, fullpubdate, and meetingdate.

See some real examples below.

Our BibTeX Entry Types:

@article - used for any journal articles.
@book - used for entire books.
@inbook - used for parts of books (e.g. chapters).
@inproceedings - used for conference papers, conference posters, invited presentations and addresses.
@misc - used for websites and other miscellaneous materials.

BibTeX Field Descriptions:

Each entry below gives the BibTeX field, its description, and its displayed name (what the field is called in our Greenstone application).
abstract The abstract of the work. Abstract
addedkeywords* Keywords or phrases describing the work that were added beyond those given in the author provided keywords field. not displayed
address The address of the publisher. The name of the publisher is stored in the publisher field. Publication Place
affiliation The affiliation of the principal author; including email (name at uiuc.edu), the research group, the department, and the institution. For example:
affiliation = "masoud at ece.umn.edu, Department of Electrical Engineering, University of Minnesota, Minneapolis, Minnesota, USA",
Contact Information
author The author or authors of the work. Author(s) (for all authors), Principal Author (for the first author only)
authorrole* The role of the primary author (i.e. the first author listed in the author field). not displayed
backgroundcode* Information regarding whether this is a Background Reading in MIR related research. "This is a Background Reading in the Area of"
booktitle The title of a book or proceedings (usually used when citing part of it). Book/Proceedings Title
chapter The chapter number in the book (for entries representing chapters or parts of books). not displayed
copyright Information regarding who holds the copyright to this work. not displayed
dateentryedited* The date that this particular BibTeX entry was last edited in the format DD MM YYYY. The person who last edited this BibTeX entry is listed in the entryeditor field. A properly formatted (DD MM YYYY) example:

dateentryedited = "31 12 2003",
not displayed
editor The editor(s) of a book or proceedings Editor(s)
editorrole* The role(s) of the editor(s) Editor Role
entryeditor* The editor of this particular BibTeX entry. This corresponds with the person that last edited the entry on the date specified in dateentryedited. not displayed
fullpubdate* The full publication date of the work in the format DD MM YYYY. A properly formatted (DD MM YYYY) example:

fullpubdate = "31 12 2000",
not displayed, but this field might be used for sorting by date.
journal The name or title of the journal, no abbreviations please. Journal Title
keywords Keywords or phrases describing the work. These are the keywords that were given by the author of the work. Keywords
location The location where a conference took place. The date when the conference took place is stored in the meetingdate field. A properly formatted example:

location = "Baltimore, Maryland, USA",
Meeting Place
meetingdate* The full date when a conference took place. The location where the conference took place is stored in the location field. A properly formatted example:

meetingdate = "October 23-25, 2000",
Meeting Date
note Any notes or additional information regarding the work. Annotations
number The number of a journal, magazine, technical report, or of a work in a series. An issue of a journal is typically identified by its volume and number. not displayed
pages The total number of pages within the work (if followed by "p.") or the page range in which the work appears. Pages
publisher The name of the publisher of the work. The location of the publisher is stored in the address field. Publisher
title The title of the work. If the work is within a larger book or conference proceedings, the name of the book or conference proceedings is stored in the booktitle field. Paper Title
type The type of document or work (e.g. 'book', 'conference poster', 'website', etc). Document Type
url The website URL where a version of the work is accessible. URL
volume The volume of a journal, magazine, technical report, or of a work in a series. An issue of a journal is typically identified by its volume and number. not displayed
year The year in which the work was published. Further publication information is stored in the address and publisher fields. Publication Date
* - Represents custom BibTeX fields
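
As a rough illustration of how the custom fields marked with an asterisk sit alongside the standard ones, a book entry might look something like the sketch below. Every value in it (key, names, title, publisher, dates) is invented for illustration; only the field names and the DD MM YYYY date format come from the descriptions above.

```bibtex
% Hypothetical entry: all values are invented for illustration only.
@book{ hypothetical_book_key,
addedkeywords = "indexing, evaluation",
address = "Sample City",
author = "Pat Example and Lee Placeholder",
backgroundcode = "information retrieval",
dateentryedited = "15 06 2004",
entryeditor = "Name of the person who last edited this entry",
fullpubdate = "01 09 2001",
pages = "320 p.",
publisher = "Example Press",
title = "An invented book title",
type = "book",
year = "2001"
}
```

Note that pages ends in "p." here because it gives the total page count of the book rather than a page range.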

A Fake Sample BibTeX entry:

@inproceedings{ unique_key,
abstract = "The full text of the abstract goes here",
addedkeywords = "These are, comma-separated, keywords, added to those, provided by, the author",
affiliation = "Affiliation of the primary author, including his/her email address: e.g. jane at myuniversity.com",
author = "Jane Doe",
booktitle = "Proceedings of the Fourth International Conference on Stuff",
dateentryedited = "24 11 2004",
editor = "Joe Schmoe and Sally Green",
entryeditor = "J. Stephen Downie",
keywords = "These, are, comma-separated, keywords, provided by, the author",
location = "BigCity, Illinois, USA",
meetingdate = "October 26-30, 2004",
pages = "257-258",
publisher = "My University",
title = "S.T.U.F.F. - Some Things Used for Finding music Files",
type = "conference paper",
url = "<a href='http://www.my.edu/janespaper.html'>http://www.my.edu/janespaper.html</a>",
year = "2004"
}
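
For comparison, a journal article might be entered along the following lines. This is only a sketch under the field descriptions given earlier: the key, names, title, and journal are all made up, and the two date fields assume the DD MM YYYY format.

```bibtex
% Hypothetical entry: all values are invented for illustration only.
@article{ unique_article_key,
abstract = "The full text of the abstract goes here",
affiliation = "jane at myuniversity.com, Department of Stuff, My University, BigCity, Illinois, USA",
author = "Jane Doe and John Roe",
dateentryedited = "24 11 2004",
entryeditor = "Name of the person who last edited this entry",
fullpubdate = "15 03 2004",
journal = "Journal of Invented Examples",
keywords = "These, are, comma-separated, keywords, provided by, the author",
number = "2",
pages = "101-118",
title = "An invented article title",
type = "journal article",
volume = "7",
year = "2004"
}
```

Here the issue is identified by volume and number together, and pages gives a page range rather than a total page count.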

See some real examples below.


Links to Other BibTeX Resources on the Web:


For more leading-edge research like MIR, see the ISRL webpage.
music-ir.org is hosted by the ISRL (Information Science Research Laboratories), which is part of GSLIS (the Graduate School of Library and Information Science) at UIUC (the University of Illinois at Urbana-Champaign).
Maintained by: J. Stephen Downie
Comments to: jdownie@uiuc.edu
Last modified: 24 November 2004 (td)

For instance, in 2005 the first quarter of our BibTeX file looked like this:
@inproceedings{ Userdefined-music-se-1001153, 
abstract = "A system for retrieving a sequence of music excerpts or songs based on users and producers requirements is proposed in this paper. Our system provides a flexible way to retrieve music pieces based on its contents as well as user-defined constraints. The proposed system allows online users to extract a sequence of songs whose first and last tracks are known and at the same time the in-between songs have minimum inter-track differences and satisfy predefined requirements. We model the problem as a constrained minimum cost flow problem which leads to a binary integer linear program (BILP) that can be solved in a reasonable amount of time.",
addedkeywords = "user-defined retrieval",
affiliation = "masoud at ece.umn.edu, Department of Electrical Engineering, University of Minnesota, Minneapolis, Minnesota, USA",
author = "Masoud Alghoniemy and Ahmed H. Tewfik",
booktitle = "Proceedings of the Eighth ACM International Conference on Multimedia",
dateentryedited = "Jordan Seymour",
entryeditor = "28052003",
location = "Los Angeles, California, USA",
meetingdate = "October 30 - November 4, 2000",
note = "author's preprint",
pages = "356-358",
publisher = "ACM Press",
title = "User-defined music sequence retrieval",
type = "conference paper",
url = "http://www.ece.umn.edu/users/alghonie/papers/acm00.ps",
year = "2000",
}

@inproceedings{ Contentbased-identif-1175637, 
abstract = "Along with investigating similarity metrics between audio material, the topic of robust matching of pairs of audio content has gained wide interest recently. In particular, if this matching process is carried out using a compact representation of the audio content ('audio fingerprint'), it is possible to identify unknown audio material by means of matching it to a database with the fingerprints of registered works. This paper presents a system for reliable, fast and robust identification of audio material which can be run on the resources provided by today's standard computing platforms. The system is based on a general pattern recognition paradigm and exploits low level signal features standardized within the MPEG-7 framework, thus enabling interoperability on a world-wide scale.
Compared to similar systems, particular attention is given to issues of robustness with respect to common signal distortions, i.e. recognition performance for processed/modified audio signals. The system's current performance figures are benchmarked for a range of real-world signal distortions, including low bitrate coding and transmission over an acoustic channel. A number of interesting applications are discussed.",
address = "Bloomington, IN",
affiliation = "Fraunhofer IIS-A, GERMANY, alm at iis.fhg.de",
author = "Eric Allamanche and Jürgen Herre and Oliver Hellmuth and Bernhard Fröba and Thorsten Kastner and Markus Cremer",
booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
editor = "J. Stephen Downie and David Bainbridge",
editorrole = "Editors",
location = "Indiana University, Bloomington, IN",
meetingdate = "October 15-17, 2001",
pages = "197-204",
publisher = "Indiana University",
title = "Content-based identification of audio material using MPEG-7 low level description",
type = "Conference paper",
url = "http://ismir2001.indiana.edu/pdf/allamanche.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/allamanche.pdf (3.7M)",
year = "2001",
}

@inproceedings{ A-multiple-feature-m-263255,
abstract = "Despite the “fuzzy” nature of musical similarity, which varies from one person to another, perceptual low level features combined with appropriate classification schemes have proven to perform satisfactorily for this task. Since a single feature only captures some selective characteristics of an audio signal, this information may, in some cases, not be sufficient to properly identify similarities between songs. This paper presents a system which combines a set of acoustic features for the task of retrieving similar sounding songs.
The methodology for optimum feature selection and combination is explained, and the system’s performance is assessed by means of a subjective listening test.",
addedkeywords = "music similarity",
affiliation = "Fraunhofer Institut Integrierte Schaltungen, IIS, Erlangen, Germany, alm at iis.fhg.de",
author = "Eric Allamanche and Jurgen Herre and Oliver Hellmuth and Thorsten Kastner and Christian Ertel",
booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003",
editor = "Holger H. Hoos and David Bainbridge",
entryeditor = "Christopher Phillippe",
location = "Baltimore, Maryland, USA",
meetingdate = "October 26-30, 2003",
pages = "217-218",
publisher = "Johns Hopkins University",
title = "A multiple feature model for musical similarity retrieval",
type = "conference poster",
url = "http://ismir2003.ismir.net/papers/Allamanche.pdf",
year = "2003",
}

@inproceedings{ Tracking-musical-bea-99860,
abstract = "Identifying the temporal location of downbeats is a fundamental musical skill. Observing that previous attempts to automate this process are constrained to hold a single current notion of beat timing and placement, we find that they will fail to predict beats and not recover beyond the point at which the first mistake is made. We propose a new model that uses beam search to consider multiple interpretations of the performance. At any time, predictions of beat timing and placement are made according to the most credible of many interpretations under consideration.",
affiliation = "Paul.Allen at cs.cmu.edu, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA",
author = "Paul E. Allen and Roger B. Dannenberg",
booktitle = "1990 International Computer Music Conference, International Computer Music Conference",
dateentryedited = "Jordan Seymour",
entryeditor = "30102002",
location = "Glasgow, Scotland",
meetingdate = "September",
note = "author's preprint",
pages = "140-143",
title = "Tracking musical beats in real time",
type = "conference paper",
url = "http://www-2.cs.cmu.edu/\~rbd/papers/bticmc.pdf",
year = "1990",
}

@misc{ American-Musicologic-1746378,
abstract = "The American Musicological Society was founded in 1934 as a non-profit organization to advance research in the various fields of music as a branch of learning and scholarship. At present, 3,300 individual members and 1,200 institutional subscribers from forty nations participate in the Society.",
dateentryedited = "Jordan Seymour",
entryeditor = "31102002",
keywords = "musicology",
title = "American Musicological Society (website)",
type = "website",
url = "http://www.sas.upenn.edu/music/ams/",
}

@inproceedings{ STRETCH-A-system-for-128671,
abstract = "A system for storing and retrieving imaged multimedia documents by content is described. This system is being developed within the Esprit project STRETCH (STorage and RETrieval by Content of imaged documents). The core of STRETCH system is a powerful archiving and retrieval engine, based on a structured document representation and capable of activating appropriate methods to characterise and automatically index heterogeneous documents with variable layout and subsequently retrieve them by answering to complex queries. The produced document base, or 'Docu-base', relies on an object oriented internal representation and related characterisation and search methods. A prototype was implemented and successfully tested, in particular, in the creation of an invoice archive.",
affiliation = "enrico.appiani at elsag.it, RES Department Elsag spa – Via G. Puccini, 2 – 16154 Genova (Italy)",
author = "Enrico Appiani and Luisa Boato and Sandra Bruzzo and Anna Maria Colla and Marco Davite and Donatella Sciarra",
booktitle = "Proceedings of the Tenth International Workshop on Database and Expert Systems Applications, 1999., Tenth International Workshop on Database and Expert Systems Applications",
dateentryedited = "Jordan Seymour",
editor = "A. Cammelli and A. Tjoa and R.R. Wagner",
entryeditor = "31102002",
keywords = "document retrieval, content-based retrieval, complex queries, STRETCH, search methods",
meetingdate = "September 1-3, 1999",
title = "'STRETCH': A system for document storage and retrieval by content",
type = "conference paper",
year = "1999",
}

@inproceedings{ Automatic-synchroniz-1693523,
abstract = "In this paper we present algorithms for the automatic time-synchronization of score-, MIDI- or PCMdata streams representing the same polyphonic piano piece.",
addedkeywords = "synchronization",
affiliation = "Universitat Bonn, Institut fur Informatik III, Bonn, Germany, vlora at cs.uni-bonn.de",
author = "Vlora Arifi and Michael Clausen and Frank Kurth and Meinard Muller",
booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003",
editor = "Holger H. Hoos and David Bainbridge",
entryeditor = "Christopher Phillippe",
location = "Baltimore, Maryland, USA",
meetingdate = "October 26-30, 2003",
pages = "219-220",
publisher = "Johns Hopkins University",
title = "Automatic synchronization for music data in score-, MIDI- and PCM-format",
type = "conference poster",
url = "http://ismir2003.ismir.net/papers/Arifi.pdf",
year = "2003",
}

@inproceedings{ Music-similarity-mea-1251149,
abstract = "Electronic Music Distribution (EMD) is in demand of robust, automatically extracted music descriptors. We introduce a timbral similarity measures for comparing music titles. This measure is based on a Gaussian model of cepstrum coefficients.
We describe the timbre extractor and the corresponding timbral similarity relation. We describe experiments in assessing the quality of the similarity relation, and show that the measure is able to yield interesting similarity relations, in particular when used in conjunction with other similarity relations. We illustrate the use of the descriptor in several EMD applications developed in the context of the Cuidado European project.",
address = "Paris, France",
affiliation = "SONY Computer Science Lab., Paris, France, jj at csl.sony.fr",
author = "Jean-Julien Aucouturier",
booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002",
editor = "Michael Fingerhut",
editorrole = "Editor",
keywords = "CUIDADO",
location = "Paris, France",
meetingdate = "October 13-17, 2002",
pages = "157-163",
publisher = "IRCAM - Centre Pompidou",
title = "Music similarity measures: What's the use?",
type = "conference paper",
url = "http://ismir2002.ircam.fr/proceedings/02-FP05-2.pdf",
year = "2002",
}

@article{ Representing-musical-284233,
abstract = "Musical genre is probably the most popular music descriptor. In the context of large musical databases and Electronic Music Distribution, genre is therefore a crucial metadata for the description of music content. However, genre is intrinsically ill-defined and attempts at defining genre precisely have a strong tendency to end up in circular, ungrounded projections of fantasies. Is genre an intrinsic attribute of music titles, as, say, tempo? Or is genre a extrinsic description of the whole piece? In this article, we discuss the various approaches in representing musical genre, and propose to classify these approaches in three main categories: manual, prescriptive and emergent approaches.
We discuss the pros and cons of each approach, and illustrate our study with results of the Cuidado IST project.",
affiliation = "jj at csl.sony.fr, SONY Computer Science Laboratory, Paris",
author = "Jean-Julien Aucouturier and Francois Pachet",
dateentryedited = "Jordan Seymour, 28052003",
journal = "Journal of New Music Research",
note = "author's preprint",
number = "1",
title = "Representing musical genre: A state of art",
type = "journal article",
url = "http://www.csl.sony.fr/downloads/papers/2002/pachet02c.pdf",
volume = "32",
year = "2002",
}

@inproceedings{ Using-longterm-struc-1742993,
abstract = "We present a measure of the similarity of the long-term structure of musical pieces. The system deals with raw polyphonic data. Through unsupervised learning, we generate an abstract representation of music - the “texture score”. This “texture score” can be matched to other similar scores using a generalized edit distance, in order to assess structural similarity. We notably apply this algorithm to the retrieval of different interpretations of the same song within a music database.",
address = "Bloomington, IN",
affiliation = "SONY Computer Science Labs, Inc., Paris, FRANCE, jjaucouturier at caramail.com",
author = "Jean-Julien Aucouturier and Mark Sandler",
booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
editor = "J. Stephen Downie and David Bainbridge",
editorrole = "Editors",
location = "Indiana University, Bloomington, IN",
meetingdate = "October 15-17, 2001",
pages = "1-2",
publisher = "Indiana University",
title = "Using long-term structure to retrieve music: Represention and matching",
type = "Poster abstract",
url = "http://ismir2001.indiana.edu/posters/aucouturier.pdf, also http://music-ir.org/gsdl/ismir2001/posters/aucouturier.pdf (329k)",
year = "2001",
}

@inproceedings{ Finding-repeating-pa-1238490,
abstract = "Finding structure and repetitions in a musical signal is crucial to enable interactive browsing into large databases of music files. Notably, it is useful to produce short summaries of musical pieces, or ”audio thumbnails”. In this paper, we propose an algorithm to find repeating patterns in an acoustic musical signal. We first segment the signal into a meaningful succession of timbres. This gives a reduced string representation of the music, the texture score, which doesn’t encode any pitch information.
We then look for patterns in this representation, using two techniques from image processing:",
addedkeywords = "audio thumbnail",
affiliation = "jj at cls.sony.fr, Sony Computer Science Laboratory, 6 rue Amyot, 75005 Paris, France",
author = "Jean-Julien Aucouturier and Mark Sandler",
booktitle = "Proceedings of the Audio Engineering Society 22nd International Conference on Virtual, Synthetic and Entertainment Audio (AES22)",
dateentryedited = "Jordan Seymour",
entryeditor = "29052003",
location = "Espoo, Finland",
meetingdate = "June 15-17, 2002",
pages = "412-421",
title = "Finding repeating patterns in acoustic musical signals: Applications for audio thumbnailing",
type = "conference paper",
url = "http://www.csl.sony.fr/downloads/papers/2002/aucouturier02b.pdf",
year = "2002",
}

@book{ Modern-information-r-289283,
abstract = "Information retrieval (IR) has changed considerably in recent years with the expansion of the World Wide Web and the advent of modern and inexpensive graphical user interfaces and mass storage devices. As a result, traditional IR textbooks have become quite out of date and this has led to the introduction of new IR books. Nevertheless, we believe that there is still great need for a book that approaches the field in a rigorous and complete way from a computer-science perspective (as opposed to a user-centered perspective). This book is an effort to partially fulfill this gap and should be useful for a first course on information retrieval as well as for a graduate course on the topic.
[from preface]",
address = "New York",
affiliation = "Computer Science Department, University of Chile",
author = "Ricardo Baeza-Yates and Berthier Ribeiro-Neto",
backgroundcode = "information retrieval",
booktitle = "Modern information retrieval",
dateentryedited = "08072003",
entryeditor = "Jordan Seymour",
pages = "513 p.",
publisher = "ACM Press",
title = "Modern information retrieval",
type = "book",
year = "1999",
}

@inproceedings{ How-people-describe-562061,
abstract = "How do users of music information retrieval (MIR) systems express their needs? Using a Wizard of Oz approach to system evaluation, combined with a grounded theory analysis of 502 real-world music queries posted to Google Answers, this paper addresses this pivotal question.",
addedkeywords = "music information retrieval (mir)",
affiliation = "Department of Computer Science, University of Waikato, Hamilton, New Zealand, davidb at cs.waikato.ac.nz, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, Illinois",
author = "David Bainbridge and Sally Jo Cunningham and J. Stephen Downie",
booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003",
editor = "Holger H. Hoos and David Bainbridge",
entryeditor = "Christopher Phillippe",
location = "Baltimore, Maryland, USA",
meetingdate = "October 26-30, 2003",
pages = "221-222",
publisher = "Johns Hopkins University",
title = "How people describe their music information needs: A grounded theory analysis of music queries",
type = "conference poster",
url = "http://ismir2003.ismir.net/papers/Bainbridge.pdf",
year = "2003",
}

@inproceedings{ Forming-a-corpus-of-945062,
abstract = "The use of audio queries for searching multimedia content has increased rapidly with the rise of music information retrieval; there are now many Internet-accessible systems that take audio queries as input.
However, testing the robustness of such a system can be problematic, as there is currently no standard test-bed of queries and music files available. A corpus of audio queries would aid researchers in the development of both audio signal processing techniques and audio query systems. Such a corpus would also be essential for making empirical comparisons between different systems and methods. We propose a pilot study that will field test a procedure for collecting audio queries. The lessons learned in the pilot study will guide us in refining the collection methodology, and we will make a final set of queries freely available to MIR researchers. The participants for this pilot study will be attendees of the ISMIR 2002 Conference.",
address = "Paris, France",
affiliation = "Department of Computer Science, University of Waikato, Hamilton, New Zealand, davidb at cs.waikato.ac.nz",
author = "David Bainbridge and John R. McPherson and Sally Jo Cunningham",
booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002",
editor = "Michael Fingerhut",
editorrole = "Editor",
location = "Paris, France",
meetingdate = "October 13-17, 2002",
pages = "289-290",
publisher = "IRCAM - Centre Pompidou",
title = "Forming a corpus of voice queries for music information retrieval: A pilot study",
type = "poster",
url = "http://ismir2002.ircam.fr/proceedings/03-SP04-2.pdf",
year = "2002",
}

@inproceedings{ Towards-a-digital-li-732267,
abstract = "Digital libraries of music of the potential to capture popular imagination in ways that more scholarly libraries cannot. We are working towards a comprehensive digital library of musical material, including popular music. We have developed new ways of collecting musical material, accessing it through searching and browsing, and presenting the results to the user.
We work with different representations of music: facsimile images of scores, the internal representation of a music editing program, page images typeset by a music editor, MIDI files, audio files representing sung user input, and textual metadata such as title, composer and arranger, and lyrics.",
affiliation = "d.bainbridge at cs.waikato.ac.nz, University of Waikato, Hamilton, New Zealand",
author = "David Bainbridge and Craig G. Nevill-Manning and Ian H. Witten and Lloyd A. Smith and Rodger J. McNab",
booktitle = "Proceedings of the Fourth ACM Conference on Digital Libraries, International Conference on Digital Libraries",
keywords = "music representation, melody matching, optical music recognition, MIDI",
location = "Berkeley, California, USA",
meetingdate = "August 11-14, 1999",
note = "author's preprint",
pages = "161-169",
title = "Towards a digital library of popular music",
type = "conference paper",
url = "http://craig.nevill-manning.com/~nevill/publications/DL99.pdf",
year = "1999",
}

@inproceedings{ The-role-of-music-IR-509126,
abstract = "This extended abstract describes the computer music work that forms part of the New Zealand Digital Library (NZDL) project. In keeping with the scope of the general project, the music work investigates data acquisition, retrieval, presentation and scalability. These parts are described in turn in the text below.",
address = "Amherst, MA",
affiliation = "Department of Computer Science, University of Waikato, NEW ZEALAND, d.bainbridge at cs.waikato.ac.nz",
author = "David Bainbridge",
booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000",
editor = "Don Byrd and J. Stephen Downie",
editorrole = "Chairs",
location = "Plymouth, MA",
meetingdate = "October 23-25, 2000",
publisher = "University of Massachusetts at Amherst",
title = "The role of music IR in the New Zealand Digital Music Library project",
type = "Extended Abstract",
url = "http://ciir.cs.umass.edu/music2000/papers/invites/bainbridge_invite.pdf",
year = "2000",
}

@article{ The-search-for-adapt-475163,
abstract = "We deal with symbolic sequences (texts) in the analysis of natural languages, DNA molecules, musical compositions, computer routines, and elsewhere. The essential structural components of texts are repetitions--the fragments that in some sense are similar to each other. The representation of texts in terms of repetitions is used for the solutions of various classification problems (Gusev and Chuzhanova 1990).
The role of repetition in musical compositions, particularly in songs, is of special significance. The repetition of separate fragments (intonations) in melody facilitates its better learning, and the variation of repetitions enriches the melody. Together, these constitute important means for developing musical themes.
Quite often, identical or similar fragments are discovered not only in the individual melody--called repetitions of the first kind--but also in different melodies--referred to as the repetitions of the second kind--so that in some cases one can speak about the presence of (probably unconscious) adaptations. The detection and analysis of adaptations is important for understanding the psychological aspects of creative work.
The formalization of the notion 'adaptation' is impossible without the use of quantitative research techniques. Several of these are described in this article. They search for similar fragments, and also melodies as a whole, in a sufficiently large and representative sample of songs. Similar techniques can be applied to the problems of copyright protection and analysis of styles, as well as for compiling intonation vocabularies. The latter can be used in programs designed to imitate a given composer's work, and by the composers themselves in cases where 'intonational hints' are necessary.",
affiliation = "titkova at math.nsc.ru, Institute of Mathematics, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia",
author = "Irene V. Bakhmutova and Vladimir D. Gusev and Tatiana N. Titkova",
fullpubdate = "Spring 1997",
journal = "Computer Music Journal",
keywords = "adaptations, melody",
number = "1",
pages = "58-67",
title = "The search for adaptations in song melodies",
type = "journal article",
volume = "21",
year = "1997",
}

@inproceedings{ Figured-bass-and-ton-254486,
abstract = "[First paragraph] In the course of the WedelMusic project [15], we are currently implementing retrieval engines based on musical content automatically extracted from a musical score. By musical content, we mean not only main melodic motives, but also harmony, or tonality. In this paper, we first review previous research in the domain of harmonic analysis of tonal music. We then present a method for automated harmonic analysis of a music score based on the extraction of a figured bass. The figured bass is determined by means of a template-matching algorithm, where templates for chords can be entirely and easily redefined by the end-user. We also address the problem of tonality recognition with a simple algorithm based on the figured bass. Limitations of the method are discussed. Results are shown and compared to previous research.
Finally, potential uses for Music Information Retrieval are discussed.",
address = "Bloomington, IN",
affiliation = "Ircam, Paris, FRANCE, jerome.barthelemy at ircam.fr",
author = "Jerome Barthélemy and Alain Bonardi",
booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
editor = "J. Stephen Downie and David Bainbridge",
editorrole = "Editors",
keywords = "music analysis, automatic extraction of musical features, figured bass and tonality recognition",
location = "Indiana University, Bloomington, IN",
meetingdate = "October 15-17, 2001",
pages = "129-136",
publisher = "Indiana University",
title = "Figured bass and tonality recognition",
type = "Conference paper",
url = "http://ismir2001.indiana.edu/pdf/barthelemy.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/barthelemy.pdf (300k)",
year = "2001",
}

@inproceedings{ To-catch-a-chorus-Us-953046,
abstract = "An important application for use with multimedia databases is a browsing aid, which allows a user to quickly and efficiently preview selections from either a database or from the results of a database query. Methods for facilitating browsing, though, are necessarily media dependent. We present one such method that produces short, representative samples (or 'audio thumbnails') of selections of popular music. This method attempts to identify the chorus or refrain of a song by identifying repeated sections of the audio waveform. A reduced spectral representation of the selection based on a chroma transformation of the spectrum is used to find repeating patterns. This representation encodes harmonic relationships in a signal and thus is ideal for popular music, which is often characterized by prominent harmonic progressions.
The method is evaluated over a sizable database of popular music and found to perform well, with most of the errors resulting from songs that do not meet our structural assumptions.", affiliation = "mbartsch at eecs.umich.edu, EECS Department, University of Michigan, Ann Arbor, Michigan, USA", author = "Mark A. Bartsch and Gregory H. Wakefield", booktitle = "Proceedings of the 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics", location = "New Paltz, New York, USA", meetingdate = "October 21-24, 2001", note = "author's preprint", pages = "15-18", title = "To catch a chorus: Using chroma-based representations for audio thumbnailing", type = "conference paper", url = "http://musen.engin.umich.edu/papers/bartsch_wakefield_waspaa01_final.pdf", year = "2001", } @inproceedings{ Automatic-segmentati-1144782, abstract = "Music information retrieval has become a major topic in the last few years and we can find a wide range of applications that use it. For this reason, audio databases start growing in size as more and more digital audio resources have become available. However, the usefulness of an audio database relies not only on its size but also on its organization and structure. Therefore, much effort must be spent in the labeling process whose complexity grows with database size and diversity.
In this paper we introduce a new audio classification tool and we use its properties to develop an automatic system to segment audio material in a fully unsupervised way. The audio segments obtained with this process are automatically labeled in a way that two segments with similar psychoacoustics properties get the same label. By doing so, the audio signal is automatically segmented into a sequence of abstract acoustic events. This is specially useful to classify huge multimedia databases where a human driven segmentation is not practicable. This automatic classification allow a fast indexing and retrieval of audio fragments. This audio segmentation is done using competitive hidden Markov models as the main classification engine and, thus, no previous classified or hand-labeled data is needed. This powerful classification tool also has a great flexibility and offers the possibility to customize the matching criterion as well as the average segment length according to the application needs.", address = "Amherst, MA", affiliation = "Audiovisual Institute, Universitat Pompeu Fabra, SPAIN, eloi at iua.upf.es", author = "Eloi Batlle and Pedro Cano", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Automatic segmentation for music classification using competitive hidden Markov models", type = "Poster Abstract", url = "http://ciir.cs.umass.edu/music2000/posters/batlle.pdf", year = "2000", } @inproceedings{ Superconvenience-for-1491911, abstract = "Digital music distribution, the success of MP3 and the actual activities concerning the semantic web of music require for convenient music information retrieval. In this paper we will give an overview about the concepts behind our 'super-convenience' approach for MIR. 
By using natural language as input for human-oriented queries to large-scale music collections we were able to address the needs of non-musicians. The entire system is applicable for future semantic web services, existing music web-sites and mobile devices. Beside the framework we present a novel idea to incorporate the processing of lyrics based on standard information retrieval methods, i.e the vector space model.", address = "Paris, France", affiliation = "German Research Center for AI (DFKI), Kaiserslautern, Germany, Stephen.Baumann at dfki.de", author = "Stephan Baumann and Andreas Klüter", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "297-298", publisher = "IRCAM - Centre Pompidou", title = "Super-convenience for non-musicians: Querying MP3 and the semantic web", type = "poster", url = "http://ismir2002.ircam.fr/proceedings/03-SP05-3.pdf", year = "2002", } @article{ The-design-and-exper-240137, abstract = "Many applications-from planning and scheduling to problems in molecular biology-rely heavily on a temporal reasoning component. In this paper, we discuss the design and empirical analysis of algorithms for a temporal reasoning system based on Allen's influential interval-based framework for representing temporal information. At the core of the system are algorithms for determining whether the temporal information is consistent, and, if so, finding one or more scenarios that are consistent with the temporal information. Two important algorithms for these tasks are a path consistency algorithm and a backtracking algorithm. For the path consistency algorithm, we develop techniques that can result in up to a ten-fold speedup over an already highly optimized implementation. 
For the backtracking algorithm, we develop variable and value ordering heuristics that are shown empirically to dramatically improve the performance of the algorithm. As well, we show that a previously suggested reformulation of the backtracking search problem can reduce the time and space requirements of the backtracking search. Taken together, the techniques we develop allow a temporal reasoning component to solve problems that are of practical size.", affiliation = "vanbeek at cs.ualberta.ca, Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada", author = "Peter van Beek and Dennis W. Manchak", dateentryedited = "Jordan Seymour", entryeditor = "30102002", journal = "Journal of Artificial Intelligence Research", note = "author's preprint", pages = "1-18", title = "The design and experimental analysis of algorithms for temporal reasoning", type = "journal article", url = "http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume4/vanbeek96a.ps", volume = "4", year = "1996", } @inproceedings{ Techniques-for-autom-821580, abstract = "Two systems are reviewed than perform automatic music transcription. The first perform monophonic transcription using an autocorrelation pitch tracker. The algorithm takes advantage of some heuristic parameters related to the similarity between image and sound in the collector. The detection is correct between notes B1 to E6 and further timbre analysis will provide the necessary parameters to reproduce a similar copy of the original sound. The second system is able to analyse simple polyphonic tracks. It is composed of a blackboard system, receiving its input from a segmentation routine in the form of an averaged STFT matrix. The blackboard contents an hypotheses database, an scheduler and knowledge sources, one of which is a neural network chord recogniser with the ability to reconfigure the operation of the system, allowing it to output more than one note hypothesis at the time. 
Some examples are provided to illustrate the performance and the weaknesses of the current implementation. Next steps for further development are defined.", address = "Amherst, MA", affiliation = "Department of Electronic Engineering, King's College London, UK, juan.bello_correa at kcl.ac.uk", author = "Juan Pablo Bello and Giuliano Monti and Mark Sandler", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Techniques for automatic music transcription", type = "Conference Paper", url = "http://ciir.cs.umass.edu/music2000/papers/bello_paper.pdf", year = "2000", } @inproceedings{ Time-domain-extracti-481727, abstract = "Vibrato is an essential ingredient in the expressive nature of many musical instruments. Creating slight oscillations in the pitch and/or volume of the musical tone, vibrato allows a long sustained note to become more lively and dynamic. Musical instruments differ in both the technique used to create vibrato as well as the physical characteristics of the vibrato-enhanced sound produced. The goal of this research is to extract information describing the amplitude, frequency, and phase of the vibrato from a section of monophonic music.", address = "Amherst, MA", affiliation = "Undergraduate School of Electrical Engineering, University of Maryland at College Park, USA, dbendor at glue.umd.edu", author = "Daniel Bendor and Mark Sandler", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. 
Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Time domain extraction of vibrato from monophonic instruments", type = "Poster Abstract", url = "http://ciir.cs.umass.edu/music2000/posters/bendor.pdf", year = "2000", } @inproceedings{ Using-Voice-Segments-50714, abstract = "Is it easier to identify musicians by listening to their voices or their music? We show that for a small set of pop and rock songs, automatically-located singing segments form a more reliable basis for classification than using the entire track, suggesting that the singer's voice is more stable across different performances, compositions, and transformations due to audio engineering techniques than the instrumental background. The accuracy of a system trained to distinguish among a set of 21 artists improves by about 15% (relative to the baseline) when based on segments containing a strong vocal component, whereas the system suffers by about 35% (relative) when music-only segments are used. In another experiment on a smaller set, however, performance drops by about 35% (relative) when the training and test sets are selected from different albums, suggesting that the system is learning album-specific properties possibly related to audio production techniques, musical stylistic elements, or instrumentation, even when attention is directed toward the supposedly more stable vocal regions.", affiliation = "alb63 at columbia.edu, Columbia University", author = "Adam Berenzweig and Daniel P.W. 
Ellis and Steve Lawrence", booktitle = "AES 22 International Conference on Virtual, Synthetic and Entertainment Audio", fullpubdate = "June 2002", location = "Espoo, Finland", meetingdate = "June 2002", title = "Using Voice Segments to Improve Artist Classification of Music.", type = "conference paper", url = "http://blush.ee.columbia.edu/adam/papers/BEL-aclass-voxseg-aes22-2002.pdf", year = "2002", } @inproceedings{ Anchor-Space-for-Cla-990187, abstract = "This paper describes a method of mapping music into a semantic space that can be used for similarity measurement, classification, and music information retrieval. The value along each dimension of this 'anchor space' is computed as the output from a pattern classifier which is trained to measure a particular semantic feature. In anchor space, distributions that represent objects such as artists or songs are modeled with Gaussian Mixture Models, and several similarity measures are defined by computing approximations to the Kullback-Leibler divergence between distributions. Similarity measures are evaluated against human similarity judgements. The models are also used for artist classification to achieve 62% accuracy on a 25-artist set, and 38% on a 404-artist set (random guessing achieves 0.25%). Finally, we describe a music similarity browsing application that makes use of the fact that anchor space dimensions are meaningful to users.", affiliation = "madadam at ee.columbia.edu, Columbia University", author = "Adam Berenzweig and Daniel P.W. 
Ellis and Steve Lawrence", booktitle = "ICME 2003, International Conference on Multimedia and Expo", fullpubdate = "July 2003", location = "Baltimore, MD", meetingdate = "July 2003", publisher = "Institute of Electrical and Electronics Engineers", title = "Anchor Space for Classification and Similarity Measurement of Music", type = "conference paper", url = "http://blush.ee.columbia.edu/adam/papers/berenzweig-anchor-icme03.pdf", year = "2003", } @inproceedings{ A-largescale-evaluat-229299, abstract = "Subjective similarity between musical pieces and artists is an elusive concept, but one that must be pursued in support of applications to provide automatic organization of large music collections. In this paper, we examine both acoustic and subjective approaches for calculating similarity between artists, comparing their performance on a common database of 400 popular artists. Specifically, we evaluate acoustic techniques based on Mel-frequency cepstral coefficients and an intermediate ‘anchor space’ of genre classification, and subjective techniques which use data from The All Music Guide, from a survey, from playlists and personal collections, and from web-text mining. We find the following: (1) Acoustic-based measures can achieve agreement with ground truth data that is at least comparable to the internal agreement between different subjective sources. However, we observe significant differences between superficially similar distribution modeling and comparison techniques. (2) Subjective measures from diverse sources show reasonable agreement, with the measure derived from co-occurrence in personal music collections being the most reliable overall. (3) Our methodology for large-scale cross-site music similarity evaluations is practical and convenient, yielding directly comparable numbers for different approaches. 
In particular, we hope that our information-retrieval-based approach to scoring similarity measures, our paradigm of sharing common feature representations, and even our particular dataset of features for 400 artists, will be useful to other researchers.", addedkeywords = "music similarity, acoustic measures, evaluation, ground-truth", affiliation = "LabROSA Columbia University, New York, HP Labs, Cambridge, MA, Music Mind and Machine Group, MIT Media Lab, Cambridge, MA", author = "Adam Berenzweig and Beth Logan and Daniel P. W. Ellis and Brian Whitman", booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003", editor = "Holger H. Hoos and David Bainbridge", entryeditor = "Christopher Phillippe", location = "Baltimore, Maryland, USA", meetingdate = "October 26-30, 2003", pages = "99-105", publisher = "Johns Hopkins University", title = "A large-scale evaluation of acoustic and subjective music similarity measures", type = "conference paper", url = "http://ismir2003.ismir.net/papers/Berenzweig.PDF", year = "2003", } @inproceedings{ Musart-Music-retriev-897554, abstract = "MUSART is a research project developing and studying new techniques for music information retrieval. The MUSART architecture uses a variety of representations to support multiple search modes. Progress is reported on the use of Markov modeling, melodic contour, and phonetic streams for music retrieval. To enable large-scale databases and more advanced searches, musical abstraction is studied. The MME subsystem performs theme extraction, and two other analysis systems are described that discover structure in audio representations of music. 
Theme extraction and structure analysis promise to improve search quality and support better browsing and 'audio thumbnailing.' Integration of these components within a single architecture will enable scientific comparison of different techniques and, ultimately, their use in combination for improved performance and functionality.", address = "Bloomington, IN", affiliation = "University of Michigan, Ann Arbor, MI, USA, wpb at eecs.umich.edu", author = "William P. Birmingham and Roger B. Dannenberg and Gregory H. Wakefield and Mark Bartsch and David Bykowski and Dominic Mazzoni and Colin Meek and Maureen Mellody and William Rand", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. Stephen Downie and David Bainbridge", editorrole = "Chairs", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "73-81", publisher = "Indiana University", title = "Musart: Music retrieval via aural queries", type = "Conference paper", url = "http://ismir2001.indiana.edu/pdf/birmingham.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/birmingham.pdf (318k)", year = "2001", } @article{ The-MusArt-musicretr-230109, abstract = "In this paper, we describe the architecture of MusArt (shown in Figure 1). An important element of MusArt is metadata creation: we believe that it is essential to automatically abstract important musical elements, particularly themes. Theme extraction is performed by a subsystem called MME, which we describe later in this paper. Another important element of MusArt is its support for a variety of search engines, as we believe that MIR is too complex for a single approach to work for all queries. Currently, MusArt supports a dynamic time-warping search engine that has high recall, and a complementary stochastic search engine that searches over themes, emphasizing speed and relevancy. 
The stochastic search engine is discussed in this paper.", affiliation = "wpb at eecs.umich.edu, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, USA", author = "William Birmingham and Bryan Pardo and Colin Meek and Jonah Shifrin", fullpubdate = "February 2002", journal = "D-Lib Magazine", keywords = "theme extraction, stochastic search engine", note = "D-Lib Magazine article", number = "2", title = "The MusArt music-retrieval system: An overview", type = "journal article", url = "http://www.dlib.org/dlib/february02/birmingham/02birmingham.html", volume = "8", year = "2002", } @inproceedings{ A-tool-for-content-b-557383, abstract = "This paper presents a system which employs the accepted notion of melodic pitch contours to support content-based navigation around a body of multimedia documents including MIDI and digital audio files. The system adopts an open hypermedia model which enables the user to find available links from an arbitrary fragment of a piece of music, based on the content or location of that fragment. 
The design of the tools, indexed contour database and the fast contour-matching algorithms are discussed.", addedkeywords = "open hypermedia, content based navigation, branching audio, melodic contours, pitch contours, query by humming", affiliation = "sgb97r at ecs.soton.ac.uk, Multimedia Research Group, Department of Electronics and Computer Science, University of Southampton, UK", author = "Steven Blackburn and David DeRoure", booktitle = "Proceedings of the Sixth ACM International Multimedia Conference, ACM International Multimedia Conference", dateentryedited = "Jordan Seymour", editor = "Wolfgang Effelsberg", entryeditor = "02112002", location = "Bristol, UK", meetingdate = "September 14-16, 1998", note = "ACM Multimedia 98 - Electronic Proceedings article", pages = "361-368", title = "A tool for content based navigation of music", type = "conference paper", url = "http://www.acm.org/sigs/sigmm/MM98/electronic_proceedings/blackburn/index.html", year = "1998", } @inproceedings{ Usability-of-musical-1000440, abstract = "There has been substantial research on technical aspects of musical digital libraries, but comparatively little on usability aspects. We have evaluated four web-accessible music libraries, focusing particularly on features that are particular to music libraries, such as music retrieval mechanisms. Although the original focus of the work was on how modalities are combined within the interactions with such libraries, that was not where the main difficulties were found. Libraries were generally well designed for use of different modalities. The main challenges identified relate to the details of melody matching and to simplifying the choices of file format. 
These issues are discussed in detail.", address = "Paris, France", affiliation = "UCL Interaction Centre (UCLIC), University College London, London, UK, A.Blandford at ucl.ac.uk", author = "Ann Blandford and Hanna Stelmaszewska", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", keywords = "digital music libraries", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "231-237", publisher = "IRCAM - Centre Pompidou", title = "Usability of musical digital libraries: A multimodal analysis", type = "conference paper", url = "http://ismir2002.ircam.fr/proceedings/02-FP07-5.pdf", year = "2002", } @inbook{ Audio-databases-with-1089910, author = "Thom Blum and Douglas Keislar and James Wheaton and Erling Wold", booktitle = "Intelligent Multimedia Information Retrieval", chapter = "6", editor = "Mark T. Maybury", pages = "113-135", publisher = "MIT Press", title = "Audio databases with content-based retrieval (Chapter 6)", type = "chapter in a book", year = "1997", } @inproceedings{ Music-tagging-type-d-1501358, abstract = "The paper discusses the general issues surrounding structured music information representation and music information retrieval, and present the result of the UK-JISC funded “proof-of-concept” project: MuTaTeD! (C. Hall and C. Boehm, 2000) (Music Tagging Type Definition) and discuss further development within the UK Library Information Commission funded MuTaTeD'II project (C. Boehm and D. MacLellan, 2000). Work within the two projects MuTaTeD! 
and MuTaTeD'II, striving towards the design and implementation of an expandable, flexible music information retrieval system with delivery/access services for encoded music, has resulted in involvement in standards development such as MPEG7, to support the needs of developers within music information retrieval research.", addedkeywords = "MuTaTeD, music representation", affiliation = "C.Boehm at music.gla.ac.uk, University of Glasgow", author = "Carola Boehm and Donald MacLellan", booktitle = "Proceedings of the 26th Euromicro Conference, 2000.", dateentryedited = "Jordan Seymour, 28052003", location = "Maastricht, Netherlands", meetingdate = "September 5-7, 2000", pages = "340-347", title = "Music tagging type definitions, systems for music representation and retrieval", type = "conference paper", volume = "2", year = "2000", } @inproceedings{ IR-for-contemporary-1929067, abstract = "[First paragraph] Listening does not only concern receiving musical information. On the contrary, it is “active” and based on a set of interactions between listeners and musical documents (including automatic music information research and extraction) so as to discover intentions. This recognition process is based on the observation of regularities and rules, in order to build “forms” from all indications, information and redundancies. The listener interprets all the signs that are meaningful for him as intentions, attributed to the composer", address = "Amherst, MA", affiliation = "IRCAM, FRANCE, alain.bonardi at ircam.fr", author = "Alain Bonardi", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. 
Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "IR for contemporary music: What the musicologist needs", type = "Extended Abstract", url = "http://ciir.cs.umass.edu/music2000/papers/invites/bonardi_invite.pdf", year = "2000", } @article{ Calculation-of-a-con-1508351, abstract = "The frequencies that have been chosen to make up the scale of Western music are geometrically spaced. Thus the discrete Fourier transform (DFT), although extremely efficient in the fast Fourier transform implementation, yields components which do not map efficiently to musical frequencies. This is because the frequency components calculated with the DFT are separated by a constant frequency difference and with a constant resolution. A calculation similar to a discrete Fourier transform but with a constant ratio of center frequency to resolution has been made; this is a constant Q transform and is equivalent to a 1/24-oct filter bank. Thus there are two frequency components for each musical note so that two adjacent notes in the musical scale played simultaneously can be resolved anywhere in the musical frequency range. This transform against log (frequency) to obtain a constant pattern in the frequency domain for sounds with harmonic frequency components has been plotted. This is compared to the conventional DFT that yields a constant spacing between frequency components. 
In addition to advantages for resolution, representation with a constant pattern has the advantage that note identification ('note identification' rather than the term 'pitch tracking,' which is widely used in the signal processing community, is being used since the editor has correctly pointed out that 'pitch' should be reserved for a perceptual context), instrument recognition, and signal separation can be done elegantly by a straightforward pattern recognition algorithm.", affiliation = "jbrown at wellesley.edu, Physics Departments, Wellesley College, Wellesley, Massachusetts, USA", author = "Judith C. Brown", dateentryedited = "Jordan Seymour", entryeditor = "30102002", fullpubdate = "January 1991", journal = "Journal of the Acoustical Society of America", keywords = "Fourier Transformation", number = "1", pages = "425-434", title = "Calculation of a constant Q spectral transform", type = "journal article", volume = "89", year = "1991", } @inproceedings{ A-Hierarchical-Appro-1287803, abstract = "A system for the automatic classification of audio signals according to audio category is presented. The signals are recognized as speech, background noise and one of 13 musical genres. A large number of audio features are evaluated for their suitability in such a classification task, including well-known physical and perceptual features, audio descriptors defined in the MPEG-7 standard, as well as new features proposed in this work. These are selected with regard to their ability to distinguish between a given set of audio types and to their robustness to noise and bandwidth changes. In contrast to previous systems, the feature selection and the classification process itself are carried out in a hierarchical way. This is motivated by the numerous advantages of such a tree-like structure, which include easy expansion capabilities, flexibility in the design of genre-dependent features and the ability to reduce the probability of costly errors. 
The resulting application is evaluated with respect to classification accuracy and computational costs.", affiliation = "burred at zplane.de, Communication Systems Group, Technical University Berlin, Germany", author = "Juan José Burred and Alexander Lerch", booktitle = "Proceedings of the 6th International Conference on Digital Audio Effects DAFx-03, 6th International Conference on Digital Audio Effects DAFX03", location = "London, United Kingdom", meetingdate = "September 8-11, 2003", title = "A Hierarchical Approach To Automatic Musical Genre Classification", type = "conference paper", url = "http://www.elec.qmul.ac.uk/dafx03/proceedings/pdfs/dafx06.pdf", year = "2003", } @misc{ Hierarchical-Automat-1861266, abstract = "The design, implementation, and evaluation of a system for automatic audio signal classification, While research has been carried out on automated polyphonic music transcription, to-date the problem of automated polyphonic percussion transcription has not received the same degree of attention. A related problem is that of sound source separation, which attempts to separate a mixture signal into its constituent sources. This thesis focuses on the task of polyphonic percussion transcription and sound source separation of a limited set of drum instruments, namely the drums found in the standard rock/pop drum kit.", affiliation = "burred at nue.tu-berlin.de, Communication Systems Group, Technical University Berlin, Berlin, Germany, derry.fitzgerald at dit.ie, Dept. 
of Engineering, Kevin St., Dublin Institute of Technology", author = "Juan José Burred and Alexander Lerch and Derry FitzGerald", journal = "Journal of the Audio Engineering Society", number = "7/8", pages = "724, 739", title = "Hierarchical Automatic Audio Signal Classification, Automatic Drum Transcription and Source Separation", type = "journal article, other", url = "http://www.aes.org/journal/, http://homepage.eircom.net/~derryfitzgerald/ThesisFitz.pdf", volume = "52", year = "July/August 2004, May 2004", } @article{ Problems-of-music-in-649264, abstract = "Although a substantial number of research projects have addressed music information retrieval over the past three decades, the field is still very immature. Few of these projects involve complex (polyphonic) music; methods for evaluation are at a very primitive stage of development; none of the projects tackles the problem of realistically large-scale databases. Many problems to be faced are due to the nature of music itself. Among these are issues in human perception and cognition of music, especially as they concern the recognizability of a musical phrase. This paper considers some of the most fundamental problems in music information retrieval, challenging the common assumption that searching on pitch (or pitch-contour) alone is likely to be satisfactory for all purposes. This assumption may indeed be true for most monophonic (single-voice) music, but it is certainly inadequate for polyphonic (multi-voice) music. Even in the monophonic case it can lead to misleading results. The fact, long recognized in projects involving monophonic music, that a recognizable passage is usually not identical with the search pattern means that approximate matching is almost always necessary, yet this too is severely complicated by the demands of polyphonic music. Almost all text-IR methods rely on identifying approximate units of meaning, that is, words. 
A fundamental problem in music IR is that locating such units is extremely difficult, perhaps impossible.", addedkeywords = "information retrieval, searching, music, audio, midi, notation", affiliation = "dbyrd at cs.umass.edu, Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts, Amherst, Massachusetts, USA", author = "Donald Byrd and Tim Crawford", backgroundcode = "general music information retrieval", dateentryedited = "Jordan Seymour", entryeditor = "03022002", fullpubdate = "March 2002", journal = "Information Processing and Management", note = "Elsevier Science", number = "2", pages = "249-272", title = "Problems of music information retrieval in the real world", type = "journal article", url = "available through Science Direct http://www.sciencedirect.com/", volume = "38", year = "2002", } @inproceedings{ Musicnotation-search-989462, abstract = "Almost all work on music information retrieval to date has concentrated on music in the audio and event (normally MIDI) domains. However, music in the form of notation, especially Conventional Music Notation (CMN), is of much interest to musically-trained persons, both amateurs and professionals, and searching CMN has great value for digital music libraries. One obvious reason little has been done on music retrieval in CMN form is the overwhelming complexity of CMN, which requires a very substantial investment in programming before one can even begin studying music IR. 
This paper reports on work adding music-retrieval capabilities to Nightingale©, an existing professional-level music-notation editor.", affiliation = "dbyrd at cs.umass.edu, CIIR, Department of Computer Science, University of Massachusetts, Amherst, Massachusetts, USA", author = "Donald Byrd", booktitle = "Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries, International Conference on Digital Libraries", keywords = "conventional music notation, Nightingale", location = "Roanoke, Virginia, USA", meetingdate = "June 24-28, 2001", pages = "239-246", title = "Music-notation searching and digital libraries", type = "conference paper", year = "2001", } @inproceedings{ On-the-use-of-FastMa-637130, abstract = "In this article, a heuristic version of Multidimensional Scaling (MDS) named FastMap is used for audio retrieval and browsing. FastMap, like MDS, maps objects into an Euclidean space, such that similarities are preserved. In addition of being more efficient than MDS it allows query-by-example type of query, which makes it suitable for a content-based retrieval purposes.", address = "Paris, France", affiliation = "Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain, pcano at iua.upf.es", author = "Pedro Cano and Martin Kaltenbrunner and Fabien Gouyon and Eloi Batlle", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "275-276", publisher = "IRCAM - Centre Pompidou", title = "On the use of FastMap for audio retrieval and browsing", type = "poster", url = "http://ismir2002.ircam.fr/proceedings/03-SP02-6.pdf", year = "2002", } @inproceedings{ Statistical-signific-572227, abstract = "We present some methods for improving the performance a system capable of automatically identifying audio titles by listening to broadcast radio. 
We outline how the techniques, placed in an identification system, allow us detect and isolate songs embedded in hours of unlabelled audio yielding over a 91% rate of recognition of the songs and no false alarms. The whole system is also able of working real-time in an off-the-shelf computer.", address = "Bloomington, IN", affiliation = "Music Technology Group, IUA-Pompeu Fabra University, Barcelona, SPAIN, pedro.cano at iua.upf.es", author = "Pedro Cano and Martin Kaltenbrunner and Oscar Mayor and Eloi Batlle", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. Stephen Downie and David Bainbridge", editorrole = "Editors", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "3-4", publisher = "Indiana University", title = "Statistical significance in song-spotting in audio", type = "Poster Abstract", url = "http://ismir2001.indiana.edu/posters/cano.pdf, also http://music-ir.org/gsdl/ismir2001/posters/cano.pdf (210k)", year = "2001", } @inproceedings{ Using-user-models-in-1980254, abstract = "[First paragraph] Most websites providing music services only support category-based browsing and/or text-based searching. There has been some research to improve the interface either for pull applications, e.g. query-by-humming systems, or for push applications, e.g. collaborative-filtering-based or feature-based music recommendation systems. However, for content-based search or feature-based filtering systems, one important problem is to describe music by its parameters or features, so that search engines or information filtering agents can use them to measure the similarity of the target (user’s query or preference) and the candidates. 
MPEG7 (formally called “Multimedia Content Description Interface”) is an international standard, which describes the multimedia content data to allow universal indexing, retrieval, filtering, control, and other activities supported by rich metadata. However, the metadata about the multimedia content itself are still insufficient, because many features of multimedia content are quite perceptual and user-dependent. For example, emotional features are very important for multimedia retrieval, but they are hard to be described by a universal model since different users may have different emotional responses to the same multimedia content. We therefore turn to user modeling techniques and representations to describe the properties of each user, so that the retrieval will be more accurate. Besides, user modeling can be used to reduce the search space, make push service easier and improve the user interface.", address = "Amherst, MA", affiliation = "Media Laboratory, Massachusetts Institute of Technology, USA, chaiwei at media.mit.edu", author = "Wei Chai and Barry Vercoe", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Using user models in music information retrieval systems", type = "Poster Abstract", url = "http://ciir.cs.umass.edu/music2000/posters/chai.pdf", year = "2000", } @inproceedings{ Melody-retrieval-on-532800, abstract = "This paper explores issues involved in a web-based query-by-humming system, which can find a piece of music in the digital music repository based on hummed melodies. Melody representation, melody matching, melody extraction and query construction are critical for an efficient and robust query-by-humming system and thus the focuses of this paper. 
Compared to previous systems, new and more effective melody representation and matching methods which combined both pitch and rhythmic information were adopted, a whole set of tools and deliverable software were implemented, and extensive experiments were conducted to evaluate the system. The experimental results show that our methods are more effective for most users than other existing methods.", addedkeywords = "query by humming", affiliation = "Media Lab., MIT, Cambridge, Massachusetts, USA", author = "Wei Chai and Barry Vercoe", booktitle = "Proceedings of SPIE - the International Society for Optical Engineering, Multimedia Computing and Networking 2002", dateentryedited = "29052003", entryeditor = "Jordan Seymour", location = "San Jose, California, USA", meetingdate = "January 23-24, 2002", pages = "226-241", title = "Melody retrieval on the Web", type = "conference paper", volume = "4673", year = "2002", } @inproceedings{ Folk-music-classific-1854659, abstract = "Automatic music classification is essential for implementing efficient music information retrieval systems; meanwhile, it may shed light on the process of human’s music perception. This paper describes our work on the classification of folk music from different countries based on their monophonic melodies using hidden Markov models. Music corpora of Irish, German and Austrian folk music in various symbolic formats were used as the data set. Different representations and HMM structures were tested and compared. The classification performances achieved 75%, 77% and 66% for 2-way classifications and 63% for 3-way classification using 6-state left-right HMM with the interval representation in the experiment. This shows that the melodies of folk music do carry some statistical features to distinguish them. 
We expect that the result will improve if we use a", affiliation = "chaiwei at media.mit.edu, Media Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA", author = "Wei Chai and Barry Vercoe", booktitle = "Proceedings of International Conference on Artificial Intelligence", dateentryedited = "30052003", entryeditor = "Jordan Seymour", keywords = "music classification, hidden Markov model, music perception", location = "Las Vegas, Nevada, USA", meetingdate = "June 25-28, 2001", note = "author's preprint", title = "Folk music classification using hidden Markov models", type = "conference paper", url = "http://web.media.mit.edu/~chaiwei/papers/chai_ICAI183.pdf", year = "2001", } @inproceedings{ An-extensible-repres-259546, abstract = "The increasing availability of digital music has created a greater need for methods to organize large collections of music. The eXtensible PlayList (XPL) representation allows users to express playlists with varying degrees of specificity. XPL handles references to exact files or URLs as well as rules for selecting content based on metadata constraints. XPL also allows the transitions between tracks in a playlist to be specified. 
This paper describes the features of XPL, a system for rendering XPL specifications and use of an advanced XPL renderer in an existing application.", address = "Paris, France", affiliation = "Creative Advanced Technology Ctr., Scotts Valley, California, USA, amar at atc.creative.com", author = "Amar Chaudhary", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "293-294", publisher = "IRCAM - Centre Pompidou", title = "An extensible representation for playlists", type = "poster", url = "http://ismir2002.ircam.fr/proceedings/03-SP05-1.pdf", year = "2002", } @inproceedings{ Query-by-music-segme-902114, abstract = "We present the techniques for retrieving songs by music segments. A music segment consists of a segment type and the associated beat and pitch information. The similarity measures for the beat and pitch are defined. Two index structures for music segments are proposed, in which the minimal and maximal values of the beat and pitch of the music segments are stored to aid the song retrieval process. Moreover, the threshold propagation functions are developed for efficient approximate searching. Experiments are performed to show the superiority of this approach.", affiliation = "alpchen at cs.nthu.edu.tw, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C.", author = "Arbee L.P. Chen and Maggie Chang and Jesse Chen and Jia-Lien Hsu and Chih-How Hsu and Spot Y.S. 
Hua", booktitle = "Proceedings of the 2000 IEEE International Conference on Multimedia and Expo, IEEE International Conference on Multimedia and Expo (ICME 2000)", dateentryedited = "31102002", entryeditor = "Jordan Seymour", keywords = "content-based retrieval, query by music segments, similarity measures", location = "New York, New York, USA", meetingdate = "July 30-August 2, 2000", pages = "873-876", title = "Query by music segments: An efficient approach for song retrieval", type = "conference paper", volume = "2", year = "2000", } @inproceedings{ An-approach-for-song-1297813, abstract = "We propose techniques for retrieving songs by rhythm from music databases. The rhythm of songs is modeled by rhythm strings. The song retrieval problem is then transformed to the string matching problem. In order to allow approximate string matching, we define similarity measures on rhythm strings. An index structure, called L-tree, is proposed to support efficient sub-string matching. Retrieval algorithms based on L-tree are then designed to provide approximate and sub-song retrieval. Experimental results show that this approach is effective and efficient.", affiliation = "alpchen at cs.nthu.edu.tw, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan", author = "James C.C. Chen and Arbee L.P. Chen", booktitle = "Proceedings of the Eighth IEEE International Workshop on Research Issues in Data Engineering, Continuous-Media Databases and Applications", dateentryedited = "31102002", editor = "Avi Silberschatz", entryeditor = "Jordan Seymour", keywords = "string matching, query by rhythm, l-tree, retrieval algorithms", location = "Orlando, Florida, USA", meetingdate = "February 23-24, 1998", pages = "139-146", title = "An approach for song retrieval in music databases", type = "conference paper", year = "1998", } @inproceedings{ Categorizing-informa-1751800, abstract = "Many web sites have dynamic information objects whose topics change over time. 
Classifying these objects automatically and promptly is a challenging and important problem for site masters. Traditional content-based and link structure based classification techniques have intrinsic limitations for this task. This paper proposes a framework to classify an object into an existing category structure by analyzing the users' traversals in the category structure. The key idea is to infer an object's topic from the predicted preferences of users when they access the object. We compare two approaches using this idea. One analyzes collective user behavior and the other each user's accesses. We present experimental results on actual data that demonstrate a much higher prediction accuracy and applicability with the latter approach. We also analyze the correlation between classification quality and various factors such as the number of users accessing the object. To our knowledge, this work is the first effort in combining object classification with user access prediction.", affiliation = "maoch at cs.princeton.edu, Princeton University, Princeton, New Jersey, USA", author = "Mao Chen and Andrea LaPaugh and Jaswinder Pal Singh", booktitle = "Proceedings of the Eleventh International Conference on Information and Knowledge Management, Session: Classification", dateentryedited = "28052003", entryeditor = "Jordan Seymour", keywords = "classification", location = "McLean, Virginia, USA", meetingdate = "November 4-9, 2002", pages = "365-372", publisher = "ACM Press", title = "Categorizing information objects from user access patterns", type = "conference paper", year = "2002", } @inproceedings{ Music-representation-1195881, abstract = "In this extended abstract, our work on the representation, indexing and retrieval of music data is summarized. 
We treat the rhythm, melody, and chords of a music object as music features and develop various data structures and algorithms to efficiently perform approximate and partial matching for the retrieval of music data [Liu99a], [Chen98], [Chou96]. In [Chen00a], we present the techniques for retrieving songs by music segments. A music segment consists of a segment type and the associated beat and pitch information. We also propose multi-feature index structures for exact and approximate searching on different features [Lee00].
The problem of feature extraction is also studied. The repeating pattern is defined as a sequence of notes, which appears more than once in music objects. Choosing repeating patterns as the feature to represent the music objects meets both efficiency and semantic-richness requirements for content-based music data retrieval. We propose approaches to efficiently discover the repeating patterns of music objects in [Hsu98], [Liu99b]. We have also implemented Muse, a prototype system for content-based music data retrieval to illustrate the feasibility of the concepts we propose.", address = "Amherst, MA", affiliation = "Department of Computer Science, National Tsing Hua University, TAIWAN, alpchen at cs.nthu.edu.tw", author = "Arbee L.P. Chen", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Music representation, indexing and retrieval at NTHU", type = "Extended Abstract", url = "http://ciir.cs.umass.edu/music2000/papers/invites/chen_invite.pdf", year = "2000", } @inproceedings{ Determining-contextd-1930676, abstract = "This paper presents algorithms for pitch spelling using the Spiral Array model. Accurate pitch spelling, assigning contextually consistent letter names to pitch numbers (for example, MIDI), is a critical component of music transcription and analysis systems. The local context is found to be more important than the global, but a combination of both achieves the best results.", addedkeywords = "pitch spelling, music analysis, algorithm design", affiliation = "Integrated Media Systems Center and D. J. 
Epstein Department of Industrial and Systems Engineering", author = "Elaine Chew and Yun-Ching Chen", booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003", editor = "Holger H. Hoos and David Bainbridge", entryeditor = "Christopher Phillippe", location = "Baltimore, Maryland, USA", meetingdate = "October 26-30, 2003", pages = "223-224", publisher = "Johns Hopkins University", title = "Determining context-defining windows: Pitch spelling using the spiral array", type = "conference poster", url = "http://ismir2003.ismir.net/papers/Chew.PDF", year = "2003", } @inproceedings{ Technology-and-art-938072, abstract = "A set of standard technologies already developed or under development by the Moving Picture Experts Group (MPEG) promises to bring back to authors the control of their works. The technologies are those of Content Representation, Digital Item Declaration, Interoperable Intellectual Property Management and Protection, and Metadata.", address = "Paris, France", affiliation = "Telecom Italia Lab., Torino, Italy, leonardo.chiariglione at tilab.com", author = "Leonardo Chiariglione", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "238-240", publisher = "IRCAM - Centre Pompidou", title = "Technology and art - putting things in context", type = "invited presentation", url = "http://ismir2002.ircam.fr/Proceedings/02-FP08-1.pdf", year = "2002", } @inproceedings{ Music-databases-Inde-57747, abstract = "In this paper, the music database with the search-by-content ability is studied. The chords are used to represent music. With the chord-representation model, the input fault tolerance ability is equipped. PAT-tree is proposed as the index structure. The 'unstructured search' is an important characteristic of PAT-tree. 
We have implemented a music database system based on the chord-representation model and PAT-tree index structure.", affiliation = "alpchen at cs.nthu.edu.tw, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, Republic of China", author = "Ta-Chun Chou and Arbee L.P. Chen and Chih-Chin Liu", booktitle = "Proceedings of IEEE International Workshop on Multimedia Database Management Systems, International Workshop on Multimedia Database Management Systems", keywords = "PAT-tree index structure, unstructured search", location = "Blue Mountain Lake, New York, USA", meetingdate = "August 14-16, 1996", pages = "46-53", title = "Music databases: Indexing techniques and implementation", type = "conference paper", year = "1996", } @article{ Strike-up-the-score-1286440, affiliation = "sayeed at jhu.edu, Digital Knowledge Center, Milton S. Eisenhower Library, Johns Hopkins University", author = "G. Sayeed Choudhury and Tim DiLauro and Michael Droettboom and Ichiro Fujinaga and Karl MacMillan", dateentryedited = "03112002", entryeditor = "Jordan Seymour", fullpubdate = "February 2001", journal = "D-Lib Magazine", keywords = "digital music, digital libraries, music libraries", number = "2", title = "Strike up the score: Deriving searchable and playable digital formats from sheet music", type = "journal article", url = "", volume = "7", year = "2001", } @inproceedings{ Optical-music-recogn-1842526, abstract = "An adaptive optical music recognition system is being developed as part of an experiment in creating a comprehensive framework of tools to manage the workflow of large-scale digitization projects. This framework will support the path from physical object and/or digitized material into a digital library repository, and offer effective tools for incorporating metadata and perusing the content of the resulting multimedia objects.", address = "Amherst, MA", affiliation = "Digital Knowledge Center, Milton S. 
Eisenhower Library and Peabody Conservatory of Music, Johns Hopkins University, USA, ich at peabody.jhu.edu", author = "G. Sayeed Choudhury and Tim DiLauro and Michael Droettboom and Ichiro Fujinaga and Brian Harrington and Karl MacMillan", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", pages = "[4] page numbers unknown", publisher = "University of Massachusetts at Amherst", title = "Optical music recognition system within a large-scale digitization project", type = "Conference Paper", url = "http://ciir.cs.umass.edu/music2000/papers/choudhury_paper.pdf", year = "2000", } @inproceedings{ An-auditory-model-ba-1058790, abstract = "In this paper, a new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented. Although such a system may have a wider range of applications, it was mainly developed to become the acoustic module of a query-by-humming (QBH) system for retrieving pieces of music from a digitized musical library. The first part of the paper is devoted to the systematic evaluation of a variety of state-of-the art transcription systems. The main result of this evaluation is that there is clearly a need for more accurate systems. Especially the segmentation was experienced as being too error prone (≈ 20% segmentation errors). In the second part of the paper, a new auditory model based transcription system is proposed and evaluated. The results of that evaluation are very promising. Segmentation errors vary between 0 and 7% dependent on the amount of lyrics that is used by the singer. 
The paper ends with the description of an experimental study that was issued to demonstrate that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.", address = "Paris, France", affiliation = "Department of Electronics and Information Systems (ELIS), Ghent University, Gent, Belgium, martens at elis.rug.ac.be", author = "L. P. Clarisse and J. P. Martens and M. Lesaffre and B. De Baets and H. De Meyer and M. Leman", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "116-123", publisher = "IRCAM - Centre Pompidou", title = "An auditory model based transcriber of singing sequences", type = "conference paper", url = "http://ismir2002.ircam.fr/proceedings/02-FP04-3.pdf", year = "2002", } @inproceedings{ A-unified-approach-t-302484, abstract = "In this paper we propose a unified approach to content-based search in different kinds of music data. Our approach is based on a general algorithmic framework for searching patterns of complex objects in large databases. In particular we describe how this approach may be used to allow for polyphonic search in polyphonic scores as well as for the identification of PCM audio material. We give an overview on the various aspects of our technology including fault tolerant search methods. Several areas of application are suggested. We give an overview on several prototypic systems we developed for those applications including the notify! and the audentify! 
systems.", addedkeywords = "fault-tolerant search", affiliation = "clausen at cs.uni-bonn.de, Department of Computer Science III, University of Bonn, Bonn, Germany", author = "Michael Clausen and Frank Kurth", booktitle = "Proceedings Second International Conference on WEB Delivering of Music: WEDELMUSIC 2002", dateentryedited = "28052003", entryeditor = "Jordan Seymour", location = "Darmstadt, Germany", meetingdate = "December 9-11, 2002", note = "author's preprint", pages = "56-65", title = "A unified approach to content-based and fault tolerant music identification", type = "conference paper", url = "http://www.informatik.uni-bonn.de/~frank/papers/wedelmusic02.pdf", year = "2002", } @inproceedings{ Exploration-of-point-316479, abstract = "Similarity is an intuitive criterion for indexing and classification of digital audio files in music information retrieval systems. While significant work has been done on similarity-based approaches to monophonic music, methods for reliably dealing with databases of arbitrary polyphonic music remain elusive. In this paper we describe our ongoing research in exploring the use of high-order multivariate statistical techniques for similarity-based classification of polyphonic music in digital audio files. The statistical techniques we employ, known as point distribution models (PDMs), have recently proven to be of surprising value in computer vision research for rating visual similarity; here we are attempting to apply PDMs to musical similarity. This involves creating neural networks that approximate the statistical processing, to save on potentially explosive storage and processor requirements. This paper reports on work in progress: our results to date are inconclusive and somewhat negative. 
We describe our rationale for exploring PDMs in polyphonic music similarity-rating and discuss the problems we have encountered so far, with the intention of encouraging other members of the music information retrieval community to explore this and related approaches.", address = "Amherst, MA", affiliation = "Digital Media Systems Department, Hewlett-Packard Labs, UK, dave_cliff at hp.com", author = "Dave Cliff and Heppie Freeburn", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Exploration of point-distribution models for similarity-based classification and indexing of polyphonic music", type = "Poster Abstract", url = "http://ciir.cs.umass.edu/music2000/posters/cliff.pdf", year = "2000", } @article{ Webcollaborative-fil-1041315, abstract = "We show that it is possible to collect data that are useful for collaborative filtering (CF) using an autonomous Web spider. In CF, entities are recommended to a new user based on the stated preferences of other, similar users. We describe a CF spider that collects from the Web lists of semantically related entities. These lists can then be used by existing CF algorithms by encoding them as 'pseudo-users'. Importantly, the spider can collect useful data without pre-programmed knowledge about the format of particular pages or particular sites. Instead, the CF spider uses commercial Web-search engines to find pages likely to contain lists in the domain of interest, and then applies previously proposed heuristics to extract lists from these pages. We show that data collected by this spider are nearly as effective for CF as data collected from real users, and more effective than data collected by two plausible hand-programmed spiders. 
In some cases, autonomously spidered data can also be combined with actual user data to improve performance.", affiliation = "wcohen at research.att.com, AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, New Jersey, USA", author = "William W. Cohen and Wei Fan", dateentryedited = "30052003", entryeditor = "Jordan Seymour", journal = "Computer Networks", note = "WWW9 Online Conference Article", number = "1-6", pages = "685-698", title = "Web-collaborative filtering: Recommending music by crawling the Web", type = "journal article", url = "http://www9.org/w9cdrom/266/266.html", volume = "33", year = "2000", } @book{ Music-cognition-and-1336413, abstract = "This book is an introduction to psychoacoustics, specifically geared toward those interested in music. It covers both basic concepts and some brand new research, and can be used as a textbook for introductory courses at the college sophomore level or above. The book is also suitable for independent study by people interested in psychology and music. The individual chapters are designed to be 'interesting reads' as individual units on particular topics. The goal is to capture the interest of readers who are new to the topics, and provide references for those wishing further research. The lecture nature of the book lends well to classroom teaching, with the 23 chapters providing material for one or more lectures each. [from preface]", address = "Cambridge, Massachusetts, USA", affiliation = "Department of Computer Science, Princeton University", author = "Perry R. 
Cook", authorrole = "Editor", backgroundcode = "music psychology", booktitle = "Music, cognition, and computerized sound: An introduction to psychoacoustics", dateentryedited = "09072003", entryeditor = "Jordan Seymour", pages = "372 p.", publisher = "MIT Press", title = "Music, cognition, and computerized sound: An introduction to psychoacoustics", type = "book", year = "1999", } @inproceedings{ Automatic-music-summ-1117180, abstract = "We present methods for automatically producing summary excerpts or thumbnails of music. To find the most representative excerpt, we maximize the average segment similarity to the entire work. After window-based audio parameterization, a quantitative similarity measure is calculated between every pair of windows, and the results are embedded in a 2-D similarity matrix. Summing the similarity matrix over the support of a segment results in a measure of how similar that segment is to the whole. This measure is maximized to find the segment that best represents the entire work. We discuss variations on the method, and present experimental results for orchestral music, popular songs, and jazz. 
These results demonstrate that the method finds significantly representative excerpts, using very few assumptions about the source audio.", address = "Paris, France", affiliation = "FX Palo Alto Laboratory, Palo Alto, California, USA, cooper at fxpal.com", author = "Matthew Cooper and Jonathan Foote", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "81-85", publisher = "IRCAM - Centre Pompidou", title = "Automatic music summarization via similarity analysis", type = "conference paper", url = "http://ismir2002.ircam.fr/proceedings/02-FP03-1.pdf", year = "2002", } @inproceedings{ Computer-analysis-of-297544, abstract = "I will describe here a computer program called Sorcerer. Sorcerer uses what I call referential analysis, a semiotic approach roughly situated between hermeneutic and Rétian analyses, which associates patterns found in a target work —music under study— with several potential source works —music assumed to either influence or be influenced by the target work. Sorcerer then presents these patterns as possible references called allusions. The program lists its findings without regard for whether the composer of the target work consciously or subconsciously referenced the source work, only that the found allusions exist. 
I will further describe the possible relevance and importance of this type of analysis as a complementary approach to more standard harmonic, melodic, and formal types of analysis, as a method for performers to better interpret the music they play, and as one possible approach to the deeper understanding of meaning in music.", address = "Bloomington, IN", affiliation = "Music Department, Division of Arts, University of California, Santa Cruz, CA, USA, howell at cats.ucsc.edu", author = "David Cope", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. Stephen Downie and David Bainbridge", editorrole = "Editors", keywords = "Sorcerer, referential analysis, semiotic approach, allusions", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "83-84", publisher = "Indiana University", title = "Computer analysis of musical allusions", type = "Extended Abstract, Keynote Address", url = "http://music-ir.org/gsdl/ismir2001/invited/cope.pdf, and http://music-ir.org/gsdl/ismir2001/invited/cope.pdf (252k)", year = "2001", } @article{ Stringmatching-techn-1195704, affiliation = "t.crawford at kcl.ac.uk, Music Department, King's College, London, England", author = "Tim Crawford and Costas S. Iliopoulos and Rajeev Raman", dateentryedited = "02112002", entryeditor = "Jordan Seymour", journal = "Computing in Musicology", pages = "73-100", title = "String-matching techniques for musical similarity and melodic recognition", type = "journal article", volume = "11", year = "1998", } @inproceedings{ Finding-motifs-with-513775, abstract = "This paper focuses on a set of string pattern-matching problems that arise in musical analysis, and especially in musical information retrieval. 
A musical score can be viewed as a string: at a very rudimentary level, the alphabet could simply be the set of notes in the chromatic or diatonic notation, or the set of intervals that appear between notes (e.g. pitch may be represented as MIDI numbers and pitch intervals as number of semitones).", address = "Amherst, MA", affiliation = "Institut Gaspard-Monge, Laboratoire d'informatique, Universite de Marne-la-Vallee, mac at univ-mlv.fr", author = "Maxime Crochemore and Costas S. Iliopoulos and Yoan J. Pinzon and Wojciech Rytter", booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000", editor = "Don Byrd and J. Stephen Downie", editorrole = "Chairs", location = "Plymouth, MA", meetingdate = "October 23-25, 2000", publisher = "University of Massachusetts at Amherst", title = "Finding motifs with gaps", type = "Poster Abstract", url = "http://ciir.cs.umass.edu/music2000/posters/pinzon.pdf", year = "2000", } @inproceedings{ The-MUSART-Testbed-f-1356620, abstract = "Evaluating music information retrieval systems is acknowledged to be a difficult problem. We have created a database and a software testbed for the systematic evaluation of various query-by-humming (QBH) search systems. As might be expected, different queries and different databases lead to wide variations in observed search precision. “Natural” queries from two sources led to lower performance than that typically reported in the QBH literature. These results point out the importance of careful measurement and objective comparisons to study retrieval algorithms. This study compares search algorithms based on note-interval matching with dynamic programming, fixed-frame melodic contour matching with dynamic time warping, and a hidden Markov model. An examination of scaling trends is encouraging: precision falls off very slowly as the database size increases. 
This trend is simple to compute and could be useful to predict performance on larger databases.", addedkeywords = "musart, query-by-humming, note-interval algorithm, contour-matching algorithm, hidden markov model, jcs", affiliation = "School of Computer Science, Carnegie Mellon University, rbd at cs.cmu.edu", author = "Roger B. Dannenberg and William P. Birmingham and George Tzanetakis and Colin Meek and Ning Hu and Bryan Pardo", booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003", editor = "Holger H. Hoos and David Bainbridge", entryeditor = "Christopher Phillippe", location = "Baltimore, Maryland, USA", meetingdate = "October 26-30, 2003", pages = "41-47", publisher = "Johns Hopkins University", title = "The MUSART Testbed for Query-By-Humming Evaluation", type = "conference paper", url = "http://ismir2003.ismir.net/papers/Dannenberg.PDF", year = "2003", } @inproceedings{ Pattern-discovery-te-788978, abstract = "Human listeners are able to recognize structure in music through the perception of repetition and other relationships within a piece of music. This work aims to automate the task of music analysis. Music is 'explained' in terms of embedded relationships, especially repetition of segments or phrases. The steps in this process are the transcription of audio into a representation with a similarity or distance metric, the search for similar segments, forming clusters of similar segments, and explaining music in terms of these clusters. Several transcription methods are considered: monophonic pitch estimation, chroma (spectral) representation, and polyphonic transcription followed by harmonic analysis. Also, several algorithms that search for similar segments are described. 
These techniques can be used to perform an analysis of musical structure, as illustrated by examples.", affiliation = "rbd at cs.cmu.edu, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA", author = "Roger B. Dannenberg and Ning Hu", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", keywords = "musical structure analysis", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "63-70", title = "Pattern discovery techniques for music audio", type = "conference paper", url = "http://ismir2002.ircam.fr/proceedings/02-FP02-3.pdf", year = "2002", } @inproceedings{ Music-information-re-690226, abstract = "Much of the difficulty in Music Information Retrieval can be traced to problems of good music representations, understanding music structure, and adequate models of music perception. In short, the central problem of Music Information Retrieval is Music Understanding, a topic that also forms the basis for much of the work in the fields of Computer Music and Music Perception. It is important for all of these fields to communicate and share results. With this goal in mind, the author’s work on Music Understanding in interactive systems, including computer accompaniment and style recognition, are discussed.", address = "Bloomington, IN", affiliation = "Carnegie Mellon University, School of Computer Science, USA, rbd at cs.cmu.edu", author = "Roger B. Dannenberg", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. 
Stephen Downie and David Bainbridge", editorrole = "Chairs", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "139-142", publisher = "Indiana University", title = "Music information retrieval as music understanding", type = "Extended Abstract, Invited Address", url = "http://ismir2001.indiana.edu/proceedings.pdf, and http://music-ir.org/gsdl/ismir2001/invited/dannenberg.pdf (210k)", year = "2001", } @inproceedings{ A-machine-learning-a-1735650, abstract = "Much of the work on perception and understanding of music by computers has focused on low-level perceptual features such as pitch and tempo. Our work demonstrates that machine learning can be used to build effective style classifiers for interactive performance systems. We also present an analysis explaining why these techniques work so well when hand-coded approaches have consistently failed. We also describe a reliable real-time performance style classifier.", addedkeywords = "machine learning", affiliation = "rbd at cs.cmu.edu, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA", author = "Roger B. Dannenberg and Belinda Thom and David Watson", booktitle = "1997 International Computer Music Conference", dateentryedited = "29052003", entryeditor = "Jordan Seymour", location = "Aristotle University, Thessaloniki, Greece", meetingdate = "September 25-30, 1997", note = "author's preprint", pages = "344-347", publisher = "International Computer Music Association", title = "A machine learning approach to musical style recognition", type = "conference paper", url = "http://www-2.cs.cmu.edu/afs/cs.cmu.edu/user/rbd/www/papers/styleclass.pdf", year = "1997", } @inproceedings{ Managing-metadata-902856, abstract = "The All Media Guide (AMG) is a technology company that maintains the world's largest database of metadata relating to the entertainment industries. 
This document describes some of the goals of AMG, the issues uncovered during the evolution of our databases, and discusses some of the implementations we have chosen.", address = "Paris, France", affiliation = "All Media Guide, Ann Arbor, Michigan, USA, davdat at allmusic.com", author = "David Datta", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "249-251", publisher = "IRCAM - Centre Pompidou", title = "Managing metadata", type = "invited presentation", url = "http://ismir2002.ircam.fr/Proceedings/02-FP08-4.pdf", year = "2002", } @inproceedings{ A-specialized-open-a-577962, abstract = "The Open Archives Initiative (OAI) Sheet Music Project is a consortium of institutions building OAI-compliant data providers, a metadata harvester, and a web-based service provider for digital sheet music collections. The project aims to test the viability of the OAI standard for providing access to sheet music collections on the web, and to build a permanent and increasingly participatory service for the discovery of digital sheet music. The service provider design has been informed by detailed usability testing, and by limitations imposed by the variations in metadata harvested from the different participating collections. Advanced services in addition to basic searching and browsing have been developed, including the ability to save and share subsets across participating collections. Harvesting and searching strategies for overcoming metadata limitations are being developed. 
The consortium is seeking additional participants with digital sheet music collections, and is ex!", addedkeywords = "open archives initiative (oai)", affiliation = "University of California, Los Angeles, Music Library, Los Angeles, California, sdavison at library.ucla.edu, Johns Hopkins University, Special Collections, Eisenhower Library, Baltimore, Maryland, Indiana University, Digital Library Program", author = "Stephen Davison and Cynthia Requardt and Kristine Brancolini", booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003", editor = "Holger H. Hoos and David Bainbridge", entryeditor = "Christopher Phillippe", location = "Baltimore, Maryland, USA", meetingdate = "October 26-30, 2003", pages = "225-226", publisher = "Johns Hopkins University", title = "A specialized open archives initiative harvester for sheet music: A project report and examination of issues", type = "conference poster", url = "http://ismir2003.ismir.net/papers/Davison.PDF", year = "2003", } @book{ The-psychology-of-mu-1464955, abstract = "The aim of this book is to interpret musical phenomena in terms of mental function - to characterize the ways in which we perceive, remember, create, and perform music. ...[It] is intended as a comprehensive reference source for musicians, in particular for those who are interested in the way that music is perceived, apprehended, and performed. It is also intended as a reference source for perceptual and cognitive psychologists. In addition, this volume is designed for use as a textbook for advanced courses in the psychology of music. 
[from preface]", address = "San Diego, California, USA", affiliation = "Department of Psychology, University of California, San Diego, La Jolla, California", author = "Diana Deutsch", authorrole = "Editor", backgroundcode = "music psychology", booktitle = "The psychology of music", dateentryedited = "09072003", entryeditor = "Jordan Seymour", publisher = "Academic Press", title = "The psychology of music", type = "book", year = "1999", } @article{ Automated-name-autho-784269, affiliation = "timmo at jhu.edu, Digital Knowledge Center, Milton S. Eisenhower Library, Johns Hopkins University", author = "Tim DiLauro and G. Sayeed Choudhury and Mark Patton and James W. Warner and Elizabeth W. Brown", dateentryedited = "03112002", entryeditor = "Jordan Seymour", fullpubdate = "April 2001", journal = "D-Lib Magazine", keywords = "automated name authority control system, digital libraries, music libraries", note = "D-Lib online article", number = "4", title = "Automated name authority control and enhanced searching in the Levy Collection", type = "journal article", url = "http://www.dlib.org/dlib/april01/dilauro/04dilauro.html", volume = "7", year = "2001", } @inproceedings{ Saving-the-multimedi-11781, abstract = "This paper describes a project that deals with the design and development of a multimedia database to access archive material from the Teatro alla Scala in Milan - such as audio and video files, photographic pictures and music scores. 
The purpose of the project is to make the cultural heritage of Teatro alla Scala available in digital form and allow efficient search and retrieval.", affiliation = "loretta at dsi.unimi.it, LIM-DSI, Università degli Studi di Milano, Milano, Italy", author = "Loretta Diana and Elena Ferrari and Goffredo Haus", booktitle = "Proceedings First International Conference on WEB Delivering of Music: WEDELMUSIC 2001", dateentryedited = "28052003", entryeditor = "Jordan Seymour", keywords = "multimedia databases", location = "Florence, Italy", meetingdate = "November 23-24, 2001", pages = "52-59", title = "Saving the multimedia musical heritage of Teatro alla Scala for querying in a web-oriented environment", type = "conference paper", year = "2001", } @inproceedings{ Classification-of-da-1445027, abstract = "This paper addresses the genre classification problem for a specific subset of music, standard and Latin ballroom dance music, using a classification method based only on timing information. We compare two methods of extracting periodicities from audio recordings in order to find the metrical hierarchy and timing patterns by which the style of the music can be recognised: the first method performs onset detection and clustering of inter-onset intervals; the second uses autocorrelation on the amplitude envelopes of band-limited versions of the signal as its method of periodicity detection. The relationships between periodicities are then used to find the metrical hierarchy and to estimate the tempo at the beat and measure levels of the hierarchy. The periodicities are then interpreted as musical note values, and the estimated tempo, meter and the distribution of periodicities are used to predict the style of music using a simple set of rules. 
The methods are evaluated with a test set of standard and Latin dance music, for which the style and tempo are given on the CD cover, providing a “ground truth” by which the automatic classification can be measured.", addedkeywords = "classification, periodicity, inter-onset intervals (ioi)", affiliation = "Austrian Research Institute for AI, Vienna, Austria, simon at oefai.at", author = "Simon Dixon and Elias Pampalk and Gerhard Widmer", booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003", editor = "Holger H. Hoos and David Bainbridge", entryeditor = "Christopher Phillippe", location = "Baltimore, Maryland, USA", meetingdate = "October 26-30, 2003", pages = "159-165", publisher = "Johns Hopkins University", title = "Classification of dance music by periodicity patterns", type = "conference paper", url = "http://ismir2003.ismir.net/papers/Dixon.PDF", year = "2003", } @inproceedings{ An-approach-towards-506678, abstract = "Most research on music retrieval systems is based on monophonic musical sequences. In this paper, we investigate techniques for a full polyphonic music retrieval system. A method for indexing polyphonic music data files using the pitch and rhythm dimensions of music information is introduced. Our strategy is to use all combinations of monophonic musical sequences from polyphonic music data. ‘Musical words’ are then obtained using the n-gram approach enabling text retrieval methods to be used for polyphonic music retrieval. Here we extend the n-gram technique to encode rhythmic as well as interval information, using the ratios of onset time differences between two adjacent pairs of pitch events. In studying the precision in which intervals are to be represented, a mapping function is formulated in dividing intervals into smaller classes. 
To overcome the quantisation problems that arise with using rhythmic information from performance data, an encoding mechanism using ratio bins is also adopted. We present results from retrieval experiments with a database of 3096 polyphonic pieces.", address = "Bloomington, IN", affiliation = "Dept. of Computing, Imperial College, London, UK, sd3 at doc.ic.ac.uk", author = "Shyamala Doraisamy and Stefan M. Rüger", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. Stephen Downie and David Bainbridge", editorrole = "Editors", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "187-193", publisher = "Indiana University", title = "An approach towards a polyphonic music retrieval system", type = "Conference Paper", url = "http://ismir2001.indiana.edu/pdf/doraisamy.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/doraisamy.pdf (315k)", year = "2001", } @inproceedings{ Position-indexing-of-791676, abstract = "In this paper we examine the retrieval performance of adjacent and concurrent n-grams generated from polyphonic music data. We deploy a method to index polyphonic music using a word position indexer with the n-gram approach. Using all possible combinations of monophonic sequences from polyphonic music data, “overlaying” word locations within a document are obtained, such as needed with polyphony (i.e. where more than one word can assume the same word position). The feasibility in utilising the position information of polyphonic ‘musical words’ is investigated using various proximity-based and structured query operators available with text retrieval system. 
Our experiments show that nested phrase operators improve the retrieval performance and we present the results of our comparative study on a collection of 5456 polyphonic pieces encoded in the MIDI format.", addedkeywords = "position indexing", affiliation = "Department of Computing, South Kensington Campus, Imperial College, London, London, England, sd3 at imperial.ac.uk", author = "Shyamala Doraisamy and Stefan Rüger", booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003", editor = "Holger H. Hoos and David Bainbridge", entryeditor = "Christopher Phillippe", location = "Baltimore, Maryland, USA", meetingdate = "October 26-30, 2003", pages = "227-228", publisher = "Johns Hopkins University", title = "Position indexing of adjacent and concurrent N-grams for polyphonic music retrieval", type = "conference poster", url = "http://ismir2003.ismir.net/papers/Doraisamy.PDF", year = "2003", } @inproceedings{ A-comparative-and-fa-1642106, abstract = "In this paper we investigate the retrieval performance of monophonic queries made on a polyphonic music database using the n-gram approach for full-music indexing. The pitch and rhythm dimensions of music are used, and the musical words (a term coined by Downie) generated enable text retrieval methods to be used with music retrieval. We outline an experimental framework for a comparative and fault-tolerance study of various n-gramming strategies and encoding precision using six experimental databases. For monophonic queries we focus in particular on query-by-humming (QBH) systems. Error models addressed in several QBH studies are surveyed for the fault-tolerance study. Our experiments show that different n-gramming strategies and encoding precision differ widely in their effectiveness. 
We present the results of our comparative and fault-tolerance study on a collection of 5380 polyphonic music pieces encoded in the MIDI format.", address = "Paris, France", affiliation = "Department of Computing, Imperial College, London, England, UK, sd3 at doc.ic.ac.uk", author = "Shyamala Doraisamy and Stefan Rüger", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "101-106", publisher = "IRCAM - Centre Pompidou", title = "A comparative and fault-tolerance study of the use of n-grams with polyphonic music", type = "conference paper", url = "http://ismir2002.ircam.fr/proceedings/02-FP04-1.pdf", year = "2002", } @inproceedings{ A-system-for-adding-1414624, abstract = "Most online music library catalogues can only be searched by textual metadata. Whilst highly effective - since the rules for maintaining consistency have been refined over many years - this does not allow searching by musical content. Many music librarians are familiar with users humming their enquiries. Most systems providing a query by humming interface tend to run independently of music library catalogue systems and not offer similar textual metadata searching. This demonstration shows how we can integrate these two types of system based on work conducted as part of the NSF/JISC funded OMRAS project (http://www.omras.org).", addedkeywords = "z39.50", affiliation = "matthew.dovey at las.ox.ac.uk, Department of Computer Science, Kings College, London, UK", author = "Matthew J. 
Dovey", booktitle = "Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries, International Conference on Digital Libraries", dateentryedited = "02112002", entryeditor = "Jordan Seymour", location = "Roanoke, Virginia, USA", meetingdate = "June 24-28, 2001", pages = "458", title = "A system for adding content-based searching to a traditional music library catalogue server", type = "conference paper", year = "2001", } @inproceedings{ A-technique-for-quot-85907, abstract = "This paper discussed some of the ongoing investigative work on integrating these two systems conducted as part of the NSF/JISC funded OMRAS (Online Music Retrieval and Searching) project into polyphonic searching of music. It describes a simple and efficient {"}piano-roll{"} based algorithm for locating a polyphonic query within a large polyphonic text. It then describes ways in which this algorithm can be modified without affecting the performance to allow more freedom in how a match is made, allowing queries which involve something akin to polyphonic regular expressions to be located in the text.", address = "Bloomington, IN", affiliation = "Dept. of Computer Science, Kings College, London, UK, matthew.dovey at las.ox.ac.uk", author = "Matthew J. Dovey", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. Stephen Downie and David Bainbridge", editorrole = "Editors", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "179-185", publisher = "Indiana University", title = "A technique for {"}regular expression{"} style searching in polyphonic music", type = "Conference paper", url = "http://ismir2001.indiana.edu/pdf/dovey.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/dovey.pdf (2.1M)", year = "2001", } @inproceedings{ Adding-contentbased-1593136, abstract = "Most online music library catalogues can only be searched by textual metadata. 
Whilst highly effective - since the rules for maintaining consistency have been refined over many years - this does not allow searching by musical content. Many music librarians are familiar with users humming their enquiries. Most systems providing a 'query by humming' interface tend to run independently of music library catalogue systems and not offer similar textual metadata searching. This paper discusses the ongoing investigative work on integrating these two types of system conducted as part of the NSF/JISC funded OMRAS project (http://www.omras.org).", affiliation = "matthew.dovey at las.ox.ac.uk, Visiting Research Fellow, OMRAS Project, Department of Computer Science, Kings College, London, England", author = "Matthew J. Dovey", booktitle = "Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries, International Conference on Digital Libraries", dateentryedited = "30052003", entryeditor = "Jordan Seymour", keywords = "z39.50", location = "Roanoke, Virginia, USA", meetingdate = "June 24-28, 2001", pages = "249-250", publisher = "ACM Press", title = "Adding content-based searching to a traditional music library catalogue server", type = "conference paper", year = "2001", } @inproceedings{ An-algorithm-for-loc-628187, author = "Matthew Dovey", booktitle = "Proceedings of the AISB '99 [Artificial Intelligence and Simulation of Behaviour] Symposium on Musical Creativity", dateentryedited = "30052003", entryeditor = "Jordan Seymour", pages = "48-53", title = "An algorithm for locating polyphonic phrases within a polyphonic music piece", type = "conference paper", year = "1999", } @book{ Music-cognition-1039446, abstract = "This book focuses on the perception and cognition of music. The point of view we take toward psychology is what is generally called information processing. 
We view the music listener as a gatherer and interpreter of information from the environment, and we believe that it is possible to study the separate component processes by which the listener accomplishes this gathering and interpreting. [from preface]", address = "Orlando, Florida, USA", affiliation = "Program in Human Development and Communication Sciences, University of Texas at Dallas, Richardson, Texas, USA", author = "W. Jay Dowling and Dane L. Harwood", backgroundcode = "music psychology", booktitle = "Music cognition", dateentryedited = "09072003", entryeditor = "Jordan Seymour", pages = "258 p.", publisher = "Academic Press", title = "Music cognition", type = "book", year = "1986", } @article{ Scale-and-contour-Tw-1022188, abstract = "Develops a 2-component model of how melodies are stored in long- and short-term memory. The 1st component is the overlearned perceptual-motor schema of the musical scale. Evidence is presented supporting the lifetime stability of scales and the fact that they seem to have a basically logarithmic form cross-culturally. The 2nd component, melodic contour, is shown to function independently of pitch interval sequence in memory. 21 college students were studied using a recognition memory paradigm in which tonal standard stimuli were confused with same-contour comparisons, whether they were exact transpositions or tonal answers, but not with atonal comparison stimuli. This result is contrasted with earlier work using atonal melodies and shows the interdependence of the 2 components, scale and contour.", affiliation = "University of Texas Program in Psychology and Human Development, Dallas, Texas, USA", author = "W. 
Jay Dowling", fullpubdate = "July 1978", journal = "Psychological Review", keywords = "human information storage, memory", number = "4", pages = "341-354", title = "Scale and contour: Two components of a theory of memory for melodies", type = "journal article", volume = "85", year = "1978", } @inproceedings{ Toward-a-theory-of-m-1302249, abstract = "This paper analyzes a set of 161 music-related information requests posted to the rec.music.country.old-time newsgroup. These postings are categorized by the types of detail used to characterize the poster's information need, the type of music information requested, the intended use for the information, and additional social and contextual elements present in the postings. The results of this analysis suggest that similar studies of 'native' music information requests can be used to inform the design of effective, usable music information retrieval interfaces.", address = "Paris, France", affiliation = "Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA, jdownie at uiuc.edu", author = "J. Stephen Downie and Sally Jo Cunningham", booktitle = "Proceedings of the Third International Conference on Music Information Retrieval: ISMIR 2002", editor = "Michael Fingerhut", editorrole = "Editor", location = "Paris, France", meetingdate = "October 13-17, 2002", pages = "299-300", publisher = "IRCAM - Centre Pompidou", title = "Toward a theory of music information retrieval queries: System design implications", type = "poster", url = "http://ismir2002.ircam.fr/proceedings/03-SP05-4.pdf", year = "2002", } @misc{ Evaluating-a-simple-1822671, abstract = "Taking our cue from those printed thematic catalogues that have reduced the amount of music information represented we developed, and then evaluated, a Music Information Retrieval (MIR) system based upon the intervals found within the melodies of a collection of 9354 folksongs. 
We believe that there is enough information contained within an interval-only representation of monophonic melodies that effective retrieval of music information has been achieved. We extended the thematic catalogue model by affording access to musical expressions found anywhere within a melody. To achieve this extension we fragmented the melodies into length-n subsections called n-grams. The length of these n-grams and the degree to which we precisely represent the intervals are variables analyzed in this thesis.", affiliation = "jdownie at uiuc.edu, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign", author = "J. Stephen Downie", dateentryedited = "03112002", entryeditor = "Jordan Seymour", keywords = "music information retrieval, information retrieval, informetrics, informetric modeling, information systems, information system evaluation", title = "Evaluating a simple approach to music information retrieval", type = "other", url = "http://alexia.lis.uiuc.edu/\~jdownie/mir_papers/thesis_missing_some_music_figs.pdf", year = "1999", } @inproceedings{ Music-information-re-278231, abstract = "Music information retrieval (MIR) as a nascent discipline is blessed with a multi-disciplinary group of people endeavoring to bring their respective knowledge-bases and research paradigms to bear on MIR problems. Communication difficulties across disciplinary boundaries, however, threaten to impede the maturation of MIR into a full-fledged discipline. The principal causes of the communications breakdown among members of the MIR community are a) the lack of bibliographic control of the MIR literature; and, b) the use of discipline-specific languages and methodologies throughout that literature. This poster abstract reports upon the background, framework, goals and ongoing development of the MIR Annotated Bibliography Website Project. 
This project is being undertaken to specifically address and overcome these bibliographic control and communications issues.", address = "Bloomington, IN", affiliation = "Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA, jdownie at uiuc.edu", author = "J. Stephen Downie", booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001", editor = "J. Stephen Downie and David Bainbridge", editorrole = "Editors", location = "Indiana University, Bloomington, IN", meetingdate = "October 15-17, 2001", pages = "5-7", publisher = "Indiana University", title = "Music information retrieval annotated bibliography website project, phase I", type = "Poster Abstract", url = "http://ismir2001.indiana.edu/posters/downie.pdf, also http://music-ir.org/gsdl/ismir2001/posters/downie.pdf (222k)", year = "2001", } @inbook{ Music-information-re-1802190, abstract = "Myriad difficulties remain to be overcome before the creation, deployment, and evaluation of robust, large-scale, and content-based Music Information Retrieval (MIR) systems become reality. The dizzyingly complex interaction of musicIR's pitch, temporal, harmonic, timbral, editorial, textual, and bibliographic 'facets', for example, demonstrates just one of MIR's perplexing problems. The choice of music representation whether symbol-based, audio-based, or both further compounds matters, as each choice determines bandwidth, computation, storage, retrieval, and interface requirements and capabilities. Overlay the multicultural, multiexperiential, and multidisciplinary aspects of music and it becomes apparent that the challenges facing MIR research and development are far from trivial.", addedkeywords = "music information retrieval", affiliation = "jdownie at uiuc.edu, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, USA", author = "J. 
Stephen Downie",
  backgroundcode = "general music information retrieval",
  booktitle = "Annual Review of Information Science and Technology 37",
  chapter = "7",
  dateentryedited = "03022002",
  editor = "Blaise Cronin",
  entryeditor = "Jordan Seymour",
  pages = "295-340",
  publisher = "Information Today Books",
  title = "Music information retrieval (Chapter 7)",
  type = "chapter in a book",
  url = "http://music-ir.org/jdownie_papers/downie_mir_arist37.pdf",
  year = "2003",
}

@inproceedings{ Evaluation-of-a-simp-1931880,
  abstract = "We developed, and then evaluated, a music information retrieval (MIR) system based upon the intervals found within the melodies of a collection of 9354 folksongs. The songs were converted to an interval-only representation of monophonic melodies and then fragmented into length-n subsections called n-grams. The length of these n-grams and the degree to which we precisely represent the intervals are variables analyzed in this paper. We constructed a collection of “musical word” databases using the text-based, SMART information retrieval system. A group of simulated queries, some of which contained simulated errors, was run against these databases. The results were evaluated using the normalized precision and normalized recall measures. Our concept of “musical words” shows great merit thus implying that useful MIR systems can be constructed simply and efficiently using pre-existing text-based information retrieval software.",
  affiliation = "jdownie at uiuc.edu, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA",
  author = "J.
Stephen Downie and Michael Nelson",
  booktitle = "Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Annual ACM Conference on Research and Development in Information Retrieval",
  dateentryedited = "02112002",
  entryeditor = "Jordan Seymour",
  keywords = "efficient search over non-textual information, results analysis and presentation for MMIR",
  location = "Athens, Greece",
  meetingdate = "July 24-28, 2000",
  pages = "73-80",
  title = "Evaluation of a simple and effective music information retrieval method",
  type = "conference paper",
  year = "2000",
}

@inproceedings{ Toward-the-Scientifi-1894328,
  abstract = "This paper outlines the findings-to-date of a project to assist in the efforts being made to establish a TREC-like evaluation paradigm within the Music Information Retrieval (MIR) research community. The findings and recommendations are based upon expert opinion garnered from members of the Information Retrieval (IR), Music Digital Library (MDL) and MIR communities with regard to the construction and implementation of scientifically valid evaluation frameworks. Proposed recommendations include the creation of data-rich query records that are both grounded in real-world requirements and neutral with respect to retrieval technique(s) being examined; adoption, and subsequent validation, of a 'reasonable person' approach to 'relevance' assessment; and, the development of a secure, yet accessible, research environment that allows researchers to remotely access the large-scale testbed collection.",
  addedkeywords = "mir/mdl evaluation project, trec, music queries, relevance, intellectual property",
  affiliation = "Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, jdownie at uiuc.edu",
  author = "J. Stephen Downie",
  booktitle = "Proceedings of the Fourth International Conference on Music Information Retrieval: ISMIR 2003",
  editor = "Holger H.
Hoos and David Bainbridge",
  entryeditor = "Christopher Phillippe",
  location = "Baltimore, Maryland, USA",
  meetingdate = "October 26-30, 2003",
  pages = "25-32",
  publisher = "Johns Hopkins University",
  title = "Toward the Scientific Evaluation of Music Information Retrieval Systems",
  type = "conference paper",
  url = "http://ismir2003.ismir.net/papers/Downie.PDF",
  year = "2003",
}

@inproceedings{ Whither-music-inform-1295430,
  abstract = "I intend to use this forum to share with you my personal thoughts and feelings concerning the future of the music information retrieval (MIR) research community. I wish to propose that the MIR community begin, in earnest, to construct more formal and permanent organizational frameworks explicitly designed to maximize the benefits of being a multi-disciplinary and multi-national research community while at the same time minimizing their inherent costs. Throughout this presentation I make suggestions and recommendations that I hope will prompt others to take up the challenge of creating a stronger and more vibrant future for MIR research and development.",
  address = "Bloomington, IN",
  affiliation = "Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA, jdownie at uiuc.edu",
  author = "J. Stephen Downie",
  booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
  editor = "J.
Stephen Downie and David Bainbridge",
  editorrole = "Editors",
  location = "Indiana University, Bloomington, IN",
  meetingdate = "October 15-17, 2001",
  pages = "219-222",
  publisher = "Indiana University",
  title = "Whither music information retrieval: Ten suggestions to strengthen the MIR research community",
  type = "Chair's Address",
  url = "http://music-ir.org/gsdl/ismir2001/chair.pdf, and http://music-ir.org/gsdl/ismir2001/chair.pdf (294k)",
  year = "2001",
}

@inproceedings{ Expressive-and-effic-35808,
  abstract = "The ideal content-based musical search engine for large corpora must be both expressive enough to meet the needs of a diverse user base and efficient enough to perform queries in a reasonable amount of time. In this paper, we present such a system, based on an existing advanced natural language search engine. In our design, musically meaningful searching is simply a special case of more general search techniques. This approach has allowed us to create an extremely powerful and fast search engine with minimal effort.",
  address = "Bloomington, IN",
  affiliation = "The Peabody Institute, Johns Hopkins University, USA, mdboom at peabody.jhu.edu",
  author = "Michael Droettboom and Ichiro Fujinaga and Karl MacMillan and Mark Patton and James Warner and G. Sayeed Choudhury and Tim DiLauro",
  booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
  editor = "J.
Stephen Downie and David Bainbridge",
  editorrole = "Editors",
  location = "Indiana University, Bloomington, IN",
  meetingdate = "October 15-17, 2001",
  pages = "173-178",
  publisher = "Indiana University",
  title = "Expressive and efficient retrieval of symbolic musical data",
  type = "Conference paper",
  url = "http://ismir2001.indiana.edu/pdf/droettboom.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/droettboom.pdf (279k)",
  year = "2001",
}

@inproceedings{ Extracting-sound-obj-614247,
  abstract = "In this paper we present a scheme for unsupervised extraction of sound objects or sources from a single recording containing a mixture of sounds. The separation/extraction procedure is performed by orthogonal projection of the mixed sound onto sub-spaces that are derived by clustering of transform coefficients, such as coefficients obtained by PCA or ICA. The clustering step reveals a residual non-linear grouping structure of the signal that is omitted by the linear transform. To achieve independence we are searching for partitioning that maximizes the mutual information between a component and a set to which it belongs. This information is obtained by considering a pairwise distance measure among all coefficients.
Source separation experiments are reported in the paper.",
  author = "Shlomo Dubnov",
  booktitle = "Proceedings of the Audio Engineering Society 22nd International Conference on Virtual, Synthetic and Entertainment Audio (AES22)",
  dateentryedited = "29052003",
  entryeditor = "Jordan Seymour",
  location = "Espoo, Finland",
  meetingdate = "June 15-17, 2002",
  title = "Extracting sound objects by independent subspace analysis",
  type = "conference paper",
  year = "2002",
}

@book{ Pattern-Classificati-230121,
  abstract = "Practitioners developing or investigating pattern recognition systems in such diverse application areas as speech recognition, optical character recognition, image processing, or signal analysis, often face the difficult task of having to decide among a bewildering array of available techniques. This unique text/professional reference provides the information needed to choose the most appropriate method for a given class of problems, presenting an in-depth, systematic account of the major topics in pattern recognition today. A new edition of a classic work that helped define the field for over a quarter century, this practical book updates and expands the original work, focusing on pattern classification and the immense progress it has experienced in recent years.",
  address = "New York, New York, USA",
  affiliation = "Electrical Engineering Department, San Jose State University, San Jose, California, USA",
  author = "Richard O. Duda and Peter E. Hart and David G. Stork",
  backgroundcode = "audio signal processing",
  dateentryedited = "22072003",
  entryeditor = "Jordan Seymour",
  pages = "654 p.",
  publisher = "John Wiley & Sons",
  title = "Pattern Classification",
  type = "book",
  year = "2001",
}

@inproceedings{ Beyond-VARIATIONS-Cr-1248462,
  abstract = "This presentation will focus primarily on work being done at Indiana University in the area of digital music libraries, with some discussion of related efforts.
Indiana University’s VARIATIONS system serves as both an early example of a working digital library supporting music content and an early application of World Wide Web technologies to music. Since April 1996, the system has provided online access within the William and Gayle Cook Music Library to sound recordings from the library’s collections. Unlike many early university-based digital library projects whose primary goals were to provide broader access to unique and/or archival collections, VARIATIONS has built its digital collection from standard musical repertoire identified as central to the teaching mission of the Indiana University School of Music.",
  address = "Amherst, MA",
  affiliation = "Digital Library Program, Indiana University, USA, jwd at indiana.edu",
  author = "Jon Dunn",
  booktitle = "Proceedings of International Symposium on Music Information Retrieval: Music IR 2000",
  editor = "Don Byrd and J. Stephen Downie",
  editorrole = "Chairs",
  location = "Plymouth, MA",
  meetingdate = "October 23-25, 2000",
  publisher = "University of Massachusetts at Amherst",
  title = "Beyond VARIATIONS: Creating a digital music library",
  type = "Extended Abstract",
  url = "http://ciir.cs.umass.edu/music2000/papers/invites/dunn_invite.pdf",
  year = "2000",
}

@inproceedings{ Indiana-University-D-145160,
  abstract = "This talk will present a progress report on the Indiana University Digital Music Library project as it enters its second of four years.",
  address = "Bloomington, IN",
  affiliation = "Digital Library Program, Indiana University, USA, jwd at indiana.edu",
  author = "Jon W. Dunn and Mary Wallace Davidson and Eric J. Isaacson",
  booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
  editor = "J.
Stephen Downie and David Bainbridge",
  editorrole = "Editors",
  keywords = "indiana university digital music library project",
  location = "Indiana University, Bloomington, IN",
  meetingdate = "October 15-17, 2001",
  pages = "137-138",
  publisher = "Indiana University",
  title = "Indiana University Digital Music Library Project: An update",
  type = "Conference paper",
  url = "http://music-ir.org/gsdl/ismir2001/dunn.pdf (208k), and part of http://ismir2001.indiana.edu/proceedings.pdf",
  year = "2001",
}

@inproceedings{ VARIATIONS-A-digital-1324837,
  abstract = "The field of music provides an interesting context for the development of digital library systems due to the variety of information formats used by music students and scholars. The VARIATIONS digital library project at Indiana University currently delivers online access to sound recordings from the collections of IU's William and Gayle Cook Music Library and is developing access to musical score images and other formats. This paper covers the motivations for the creation of VARIATIONS, an overview of its operation and implementation, user reactions to the system, and future plans for development.",
  addedkeywords = "VARIATIONS project",
  affiliation = "jwd at indiana.edu, Digital Library Program, Indiana University",
  author = "Jon W. Dunn and Constance A.
Mayer",
  booktitle = "DL '99: Proceedings of the Fourth ACM Conference on Digital Libraries, International Conference on Digital Libraries",
  dateentryedited = "03112002",
  entryeditor = "Jordan Seymour",
  keywords = "digital audio, digital libraries, music libraries",
  location = "Berkeley, California, USA",
  meetingdate = "August 11-14, 1999",
  note = "personal preprint",
  pages = "12-19",
  title = "VARIATIONS: A digital music library system at Indiana University",
  type = "conference paper",
  url = "http://www.dlib.indiana.edu/variations/VARIATIONS-DL99.pdf",
  year = "1999",
}

@inproceedings{ Melody-spotting-usin-987218,
  abstract = "As we acquire more digitized copies of musical recordings, it becomes increasingly necessary to have the assistance of a computer in sorting through the information that it stores. In this paper, we propose a new system for melody-based retrieval of a song from a musical database which adapts wordspotting techniques from automatic speech recognition to create a melody spotting system in the musical domain. This system is tested using a variety of feature sets derived from raw audio data. It results in a successful proof of the melody spotting concept which offers great potential for development into a musical database capable of being queried by melody.",
  address = "Bloomington, IN",
  affiliation = "Center for Signal and Image Processing, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA, gte401k at ece.gatech.edu",
  author = "Adriane Swalm Durey and Mark A. Clements",
  booktitle = "Proceedings of the Second Annual International Symposium on Music Information Retrieval: ISMIR 2001",
  editor = "J.
Stephen Downie and David Bainbridge",
  editorrole = "Editors",
  location = "Indiana University, Bloomington, IN",
  meetingdate = "October 15-17, 2001",
  pages = "109-117",
  publisher = "Indiana University",
  title = "Melody spotting using hidden Markov models",
  type = "Conference Paper",
  url = "http://ismir2001.indiana.edu/pdf/durey.pdf, and http://music-ir.org/gsdl/ismir2001/pdf/durey.pdf (1022k)",
  year = "2001",
}