Networked Environment for Music Analysis (NEMA)
NEMA Alpha Release
We recently released some preliminary results produced by the NEMA infrastructure.
Phase I of the Networked Environment for Music Analysis (NEMA) framework project is a multinational, multidisciplinary cyberinfrastructure effort for music information processing. It builds upon and extends the music information retrieval research being conducted by the International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL) at the University of Illinois at Urbana-Champaign (UIUC). NEMA brings together the collective projects and associated tools of six world leaders in the domains of music information retrieval (MIR), computational musicology (CM) and e-humanities research. The NEMA team aims to create an open and extensible webservice-based resource framework that facilitates the integration of music data and analytic/evaluative tools, usable by the global MIR and CM research and education communities independent of time or location. To help achieve this goal, the NEMA team will work co-operatively with the UIUC-based, Mellon-funded Software Environment for the Advancement of Scholarly Research (SEASR) project to exploit SEASR’s expertise and technologies in data mining and webservice-based resource framework development.
An abridged PDF version of the original proposal is available: nema_abridged_proposal.pdf.
The Networked Environment for Music Analysis (NEMA) project was inspired by the lessons learned over the course of the Mellon-funded Music Information Retrieval/Music Digital Library Evaluation Project (2003-2007), led by Prof. J. Stephen Downie and his IMIRSEL team at UIUC's Graduate School of Library and Information Science (GSLIS). Downie’s experience in running the annual Music Information Retrieval Evaluation eXchange (MIREX) on behalf of the MIR community has brought to the fore three important issues that have a direct impact on the present NEMA project. The automation, distribution and integration of MIR and CM research tool development, evaluation and use are but some of the important issues being addressed under the NEMA rubric.
NEMA Phase I offers the promise of a new and expanded MIR/CM research paradigm. Under this new paradigm, it should become possible for MIR/CM researchers to overcome the limitations of time-specific and location-specific resources. In the new NEMA reality, for example, it should become commonplace for researchers at Lab A to easily build a virtual collection from Library B and Lab C, acquire the necessary ground-truth from Lab D, incorporate a feature extractor from Lab E, amalgamate the extracted features with those provided by Lab F, build a set of models based on a pair of classifiers from Labs G and H, and then validate the results against another virtual collection taken from Lab I and Library J. Once completed, the results and newly created feature sets would, in turn, be made available for others to build upon.
Figure 1. The NEMA framework model bringing the NEMA components together. The components listed in the grey portion of the diagram are independent technologies being developed by members of the NEMA team.
- Resource accessibility. For example, new means must be found to provide access to good ground-truth sets, broad-based music collections, feature sets, pre-built models, etc. Also, where items in a music collection cannot be moved, new ways of bringing researchers and their tools to the data need to be constructed. It is important to envision a future where many different collections of music materials are independently made available in such a way as to create a much larger and more diverse “super-collection.” Such “super-collections” are needed to address the current problem of data “overuse” (i.e., the “overfitting” of models to small datasets). They are also needed to allow for better scalability/stress testing of approaches. Finally, new methods of creating and providing on-demand computational and storage resources to the MIR/CM community need to be explored.
- Resource discovery. For example, even if the aforementioned resources were readily available, it would still be necessary to create appropriate music-specific location and discovery tools so that individual items or resource subsets might be put to use.
- Resource sharing/re-use. For example, new standards for ground-truth and feature sets must be developed to facilitate their re-use. Mechanisms need to be put into place to make it easy for researchers to store their own sets or make them available to others. In the same manner, mechanisms must be put in place to overcome the interoperability problems that limit the re-use of research code, including feature extractors, classifiers, pre-built classification models, etc.
- Resource customization. For example, new ways need to be developed to help researchers amalgamate aspects of independently produced feature sets to create novel feature sets. New techniques must be found to easily create on-demand “virtual” collections that span across several real-world collections regardless of their physical location. Again, interoperability problems among research code sets must be overcome so that researchers can create customized hybrid systems that integrate tools from many different research labs.
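As a rough illustration of the customization ideas above (on-demand "virtual" collections and amalgamated feature sets), the following Python sketch shows one way such operations might look. It is not part of the NEMA framework: the `Track` class, the functions `build_virtual_collection` and `amalgamate_features`, and all sample data are hypothetical names invented for this example, and real collections would be accessed through services rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Hypothetical record for one music item; not a NEMA data type."""
    track_id: str
    source: str  # assumed name of the collection hosting the item


def build_virtual_collection(*collections):
    """Combine several collections into one list, skipping repeated track IDs."""
    seen = set()
    virtual = []
    for collection in collections:
        for track in collection:
            if track.track_id not in seen:
                seen.add(track.track_id)
                virtual.append(track)
    return virtual


def amalgamate_features(*feature_sets):
    """Merge per-track feature dicts, keeping only tracks present in every set.

    If two sets define the same feature name, the later set wins.
    """
    if not feature_sets:
        return {}
    shared_ids = set(feature_sets[0])
    for feature_set in feature_sets[1:]:
        shared_ids &= set(feature_set)
    merged = {}
    for track_id in sorted(shared_ids):
        combined = {}
        for feature_set in feature_sets:
            combined.update(feature_set[track_id])
        merged[track_id] = combined
    return merged


# Made-up sample data standing in for holdings at different sites.
library_b = [Track("t1", "Library B"), Track("t2", "Library B")]
lab_c = [Track("t2", "Lab C"), Track("t3", "Lab C")]
virtual = build_virtual_collection(library_b, lab_c)

features_e = {"t1": {"tempo": 120.0}, "t2": {"tempo": 90.0}}
features_f = {"t2": {"key": "C"}, "t3": {"key": "G"}}
merged = amalgamate_features(features_e, features_f)
```

In this sketch the virtual collection keeps the first copy of any item held at more than one site, and the merged feature set covers only items described by every contributing set. Both are simplifying assumptions; a real framework would also need to reconcile identifiers and feature formats across labs.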
- Principal Investigator: J. Stephen Downie (University of Illinois at Urbana-Champaign, United States of America)
- Co-Principal Investigator: Ichiro Fujinaga (McGill University, Canada)
Key Research Partners
- David De Roure (University of Southampton, United Kingdom)
- Mark Sandler (Queen Mary University of London, United Kingdom)
- Tim Crawford (Goldsmiths, University of London, United Kingdom)
- David Bainbridge (University of Waikato, New Zealand)
Time Frame: 1 January 2008 to 31 December 2010