My research department works on programming computers to analyse music.
In this field, researchers like to have some idea of whether a problem is naturally easy or difficult for humans.
For example, tapping along with the beat of a musical recording is usually easy, and it’s fairly instinctive—you don’t need much training to do it.
Identifying the instrument that is playing a solo section takes some context. (You need to learn what the instruments sound like.) But we seem well-equipped to do it once we’ve heard the possible instruments a few times.
Naming the key of a piece while listening to it is hard, or impossible, without training, but some listeners can do it easily when practised.
Tasks that a computer scientist might think of as “search problems”, such as identifying performances that are actually the same while disregarding background noise and other interference, tend to be difficult for humans no matter how much experience they have.
Ground truth
It matters to a researcher whether the problem they’re studying is easy or difficult for humans. They need to be able to judge how successful their methods are, and to do that they need to have something to compare them with. If a problem is straightforward for humans, then there’s no problem—they can just see how closely their results match those from normal people.
But if it’s a problem that humans find difficult too, that won’t work. Being as good as a human isn’t such a great result if you’re trying to do something humans are no good at.
Researchers use the term “ground truth” to refer to something they can evaluate their work against. The idea, of course, is that the ground truth is known to be true, and computer methods are supposed to approach it more or less closely depending on how good they are. (The term comes from satellite image sensing, where the ground truth is literally the set of objects on the ground that the satellite is trying to detect.)
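To make that comparison concrete, here is a minimal sketch of scoring a method against a ground truth. It assumes the simplest possible case: one label per item and an exact-match score. The instrument labels, the predicted values and the function name agreement are all made up for illustration, and real evaluations usually use measures better suited to the task.

```python
def agreement(predicted, ground_truth):
    """Fraction of items where the method's answer matches the ground truth."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("need at least one item to evaluate")
    matches = sum(p == t for p, t in zip(predicted, ground_truth))
    return matches / len(ground_truth)


# Hypothetical labels for five solo sections, as agreed by human listeners.
ground_truth = ["violin", "flute", "cello", "violin", "oboe"]
# Hypothetical output of a computer method on the same five sections.
predicted = ["violin", "flute", "violin", "violin", "oboe"]

score = agreement(predicted, ground_truth)
print(f"agreement with ground truth: {score:.2f}")
```

A score like this only means something if the ground truth itself is reliable, which is why it matters whether humans can do the task well in the first place.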
Music recommendation
Can there be a human “ground truth” for music recommendation?
When it comes to suggesting music that a listener might like, based on the music they’ve apparently enjoyed in the past—should computers be trying to approach “human” reliability? How else should we decide whether a recommendation method is successful or not?
What do you think?
How good are you at recommending music to the people you know best?
Can a human recommend music to another human better than a computer ever could? Under what circumstances? What does “better” mean anyway?
Or should a computer be able to do better than a human? Why?
(I’m not looking for academically rigorous replies—I’m just trying to get more of an idea about the fuzzy human and emotional factors that research methods would have to contend with in practice.)
Tricky question. There are so many dependencies. I think humans can be good at recommending music to others, but in order for me to be good at it, the following factors would all contribute:
1. How well I know the person’s general disposition
2. The sample size of their musical taste
3. How consistent their musical taste is
4. How similar their musical taste is to mine
5. The breadth and depth of my musical knowledge
For example, I know my brother well. I know he particularly likes electronica and alt-country (two disparate but well-defined genres). I happen to like those genres too, and I listen to quite a lot of music, so the chances are I could recommend an artist to my brother and be pretty confident he would like it.
Conversely, I have a friend at work called James. I know him pretty well. I also know he (still) likes old-school hip hop and 90s drum n’ bass – a fairly consistent taste in music, but not one that closely matches mine, so I would not feel confident recommending an artist to him.
My feeling is that (1) is not required, while 2-4 are required to some degree, and 3 and 5 are the most important.
Thanks for the comment! That sounds like quite an analytical approach to the question. If your experience says that the guiding factor is how much knowledge you have about music generally and someone’s musical taste specifically, then a system with more information (e.g. a database-driven recommender like Last.fm) ought to do better than a human generally could.
What sort of role do you think social factors play? For instance, your brother might be inclined to listen to the same stuff as you in order to have things to talk about, gigs to go to together, etc. Or conversely, people may be disinclined to look for suggestions from friends because it would dilute the feeling that their musical tastes “belong to them”. Are these factors (i.e. where the recommendation comes from) as significant as what the recommendation is?
Interesting question. I’m very open to receiving recommendations from friends – I positively encourage them because I like finding new music. I’d extend that to radio programmes and music podcasts too. That said, I tend to be turned off by ‘recommendations’ from iTunes and Last.fm. Maybe because the former in particular doesn’t do a good job.