- Copyright. Any opinions on whether we need to contact RISM? If so, what would be the best way?
My two cents
It is great to have the ground truth set up. Thank you for the painstaking efforts! And I hope more queries can be added in the future.
For the second evaluation, we can try the pooling method used in TREC: experts judge the items pooled from the results of all participating teams, and items not retrieved by any team are assumed to be irrelevant. This is fair to all participants, but it requires experts' judgment.
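To make the pooling idea concrete, here is a rough sketch of how it could work for a single query. The function names, the optional `depth` cutoff, and the use of precision at k as the score are my own assumptions for illustration, not something we have agreed on.

```python
def build_pool(runs, depth=None):
    """Union of the items returned by every team for one query.

    runs  -- dict mapping team name -> ranked list of item ids (hypothetical format)
    depth -- optional cutoff: only the top `depth` items of each run enter the pool
    """
    pool = set()
    for ranked in runs.values():
        pool.update(ranked if depth is None else ranked[:depth])
    return pool


def precision_at_k(ranked, judged_relevant, k):
    """Fraction of the top k items that the experts judged relevant.

    Items outside the pool were never judged, so they are never in
    `judged_relevant` and count as irrelevant.
    """
    return sum(1 for item in ranked[:k] if item in judged_relevant) / k


# Toy example with two teams and made-up item ids.
runs = {
    "team_a": ["i1", "i2", "i3"],
    "team_b": ["i2", "i4"],
}
pool = build_pool(runs)          # the experts only need to look at these items
judged_relevant = {"i2", "i4"}   # pretend outcome of the expert judgments
```

The experts' workload is bounded by the size of the pool rather than by the size of the collection, which is the main attraction here.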
For the first evaluation, I agree that we will need to consider the items that were ignored in building the ground truth. Again, experts' judgment is needed. Can we ask music librarians or music students for help?
In case copyright problems prove to be insurmountable, the ground truth data could also be used in a different way: instead of letting all algorithms search the entire collection, just let them rank all incipits that are present in the ground truth. Then use the same measure that I suggested for the comparison based on the ground truth. Less interesting, but also much less work and probably not as sensitive copyright-wise.
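A minimal sketch of this fallback, assuming each algorithm can be wrapped as a scoring function that takes a query and an incipit id and returns a similarity value (higher means more similar). The name `rank_ground_truth_only` and the `score` interface are hypothetical.

```python
def rank_ground_truth_only(query, ground_truth_ids, score):
    """Rank only the incipits that occur in the ground truth.

    score -- callable (query, incipit_id) -> similarity, higher is more similar
    Ties are broken by incipit id so that the ranking is reproducible.
    """
    return sorted(ground_truth_ids, key=lambda i: (-score(query, i), i))


# Toy example with made-up ids and similarity values.
toy_similarity = {"a": 0.2, "b": 0.9, "c": 0.5}
ranking = rank_ground_truth_only(
    "some query", {"a", "b", "c"}, lambda q, i: toy_similarity[i]
)
```

The resulting ranking can then be compared against the ground truth ordering with whatever measure we settle on.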