A Scalable Curation System Is Possible And A Way Out Of Our Industry’s Data Mess
Many theories are floated about how artificial intelligence and blockchain could cure rights holders’ and creatives’ financial woes, but neither would untangle the industry’s rat’s nest of data. A scalable curation system might.
Guest post by Vasja Veber, Co-Founder and Business Development Director for Viberate
If you’ve talked to anyone in the music or entertainment space over the last ten years, you’re likely to have heard complaints and laments about the state of data in the industry. Though recording and composition metadata are often at the center of these woes in music–they are, after all, how creatives and rights holders get paid–other slices of the music business are faring even worse when it comes to data.
There’s lots of gushing about everything from AI to blockchain, technologies that many of us take very seriously, but at the bottom of the problem is just one big, tough-to-untangle data mess.
The nature of the mess may sound familiar to many outside of music and live entertainment. The data tend to be of very poor quality: ticketing information, for example, is appallingly inaccurate, so you don’t actually know who came into your club or event. Data are also very dispersed, scattered across socials, retail sites, streaming platforms, and other proprietary services. Worst of all for this machine learning-powered era, some of the early indicators of what’s going to be big (in the live music case, what’s taking off at certain small clubs, smaller tastemaker festivals, or key parties) may not be part of the mainstream data that are easy to integrate via existing APIs.
These issues find specific form in the music and entertainment industry, but have relevance to a wide range of businesses, from hospitality and event organizing to DTC and other data-reliant retail. And in live music, as in many other realms of commerce and marketing, addressing them demands a serious look at how to build a team to cultivate accurate information globally, which in turn requires a scalable approach that empowers individual data curators.
To do anything with data, you have to find and refine the necessary sources for input, the data points that actually say something about the business, community, or scene. In most cases there are so many options that it’s tempting to rely on scraping plus a few APIs from relevant platforms. Another common approach is to simply set things up for crowdsourcing and let communities or customers fill in the data, yet that can quickly turn from an exciting approach into moderation hell. Ideally, you want to combine a few firehose-like streams of data with important input from users who are incentivized to do a better-than-shoddy job at contributing information. In short, you need to tame what’s out there in the wild.
Only humans can tame this wilderness and make it productive, people specially trained to weed out poor or irrelevant data. There’s too much complexity, nuance, and regional variation at this point to find automated solutions. That’s why we knew, as we tackled the data mess in our business, that we needed curators, real humans who knew what looked reasonable and what seemed off. Because we’re growing a large network of profiles, crossing the million mark recently, we also knew we needed enough humans to do the work well, and needed them to have certain knowledge and skills.
These skills were determined by the focus we adopted early on. We knew that aiming to become something vague yet all-encompassing (“the Facebook for music,” a phrase many startups liked to bandy about at one point) would make our site useless. Furthermore, we saw a massive gap in the live event realm. So we focused on live music and how other platforms and data points speak to live music scenes. There’s a lot to be said for niche approaches, and when you want to create industry-leading data, being a generalist isn’t necessarily a logical choice.
In fact, our industry, like many others, has seen a proliferation of vanity metrics in the digital era, as well as metric fraud like purchasing follows and streams. To counteract these forces, we homed in on unexpected metrics and data points that tell stories helpful to our clients and users, who range from fans to festival organizers and booking agents. For example, we surface which artists of note are following one another, something hard to figure out when scanning an artist’s thousands or millions of Twitter followers. This can show unanticipated connections and suggest potential collaborations and partnerships.
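To make the idea concrete, here is a minimal sketch of how such a check might look. It is an illustration only, not a description of any production system: the function name, the sample accounts, and the reading of "following one another" as a mutual follow are all assumptions.

```python
from itertools import combinations


def mutual_follows(following, notable):
    """Return sorted pairs of notable artists who follow each other.

    `following` maps an artist name to the set of accounts that artist
    follows. Both arguments are hypothetical inputs for this sketch.
    """
    pairs = []
    for a, b in combinations(sorted(notable), 2):
        # Keep the pair only if the follow goes in both directions.
        if b in following.get(a, set()) and a in following.get(b, set()):
            pairs.append((a, b))
    return pairs


# Made-up sample data, for illustration only.
following = {
    "Artist A": {"Artist B", "Artist C", "fan_123"},
    "Artist B": {"Artist A"},
    "Artist C": {"Artist B"},
}
notable = {"Artist A", "Artist B", "Artist C"}

print(mutual_follows(following, notable))
```

Restricting the comparison to a curated list of notable artists is what keeps the check cheap: the work grows with the size of that list, not with the millions of ordinary followers.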
We also made sure to solve one of the industry’s toughest data problems by following one simple rule: one artist = one profile. It sounds ridiculously obvious, but even the world’s leading streaming platform doesn’t follow that rule. The only way to achieve that level of precision is by adding a human touch. We often have to defend our claim that we have one of the largest artist databases in the world, currently just shy of 500,000 profiles. We hear things like, “yeah, but I know this service that has 2 million.” They might claim this, but if you go to that particular service and type in “Tiesto”, you’ll get 10 or more profiles for the same artist. From a data perspective, this renders such a service useless, because data scattered across multiple profiles for the same artist can’t support any meaningful analysis. It’s like one person having multiple social security numbers.
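A rough sketch of the machine half of that human-machine split might look like the following. It is an assumption-laden illustration, not the method described in this post: the normalization rule, the function names, and the sample profiles are all hypothetical. The code only flags candidate duplicates; deciding whether two profiles really are the same artist is left to a curator.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Lowercase, strip accents, and drop non-alphanumeric characters."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return "".join(ch for ch in without_accents.lower() if ch.isalnum())


def duplicate_candidates(profiles):
    """Group profile ids whose normalized names collide.

    `profiles` is a list of (profile_id, name) tuples. Only groups with
    more than one profile are returned, as a queue for human review.
    """
    groups = defaultdict(list)
    for profile_id, name in profiles:
        groups[normalize_name(name)].append(profile_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up sample data, for illustration only.
profiles = [
    (1, "Tiesto"),
    (2, "Tiësto"),
    (3, "TIESTO "),
    (4, "DJ Tiesto"),
    (5, "Some Other Artist"),
]

print(duplicate_candidates(profiles))
```

Note that profile 4 ("DJ Tiesto") is not caught by this simple rule, and two different artists who share a name would be wrongly grouped. Both cases are the kind of nuance that calls for a person who knows the scene.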
Along with finding these simple but hard-to-solve data pain points, we also looked for benchmarks and metrics that made sense to our community. For example, we realized that the price of a standard-sized beer was a great benchmark for the overall cost of a festival or venue, guiding music fans to find the right experience for their budgets and helping event operators see how they measure up to the competition. People note the cost of a pint, our curators validate it, and we can then show a meaningful data point to our users. Other industries may find other quirky yet extremely telling metrics that can only be revealed by well-cultivated data.
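The report-validate-publish flow can be sketched in a few lines. This is a hypothetical illustration: the record layout, the use of the median as the published figure, and the assumption that all prices are in one currency are choices made for the example, not details given here.

```python
from statistics import median


def beer_benchmark(reports):
    """Return the median curator-validated beer price per venue.

    `reports` is a list of (venue, price, validated) tuples, where
    `validated` is True once a curator has approved the report.
    """
    by_venue = {}
    for venue, price, validated in reports:
        if validated:  # Unvalidated reports never reach users.
            by_venue.setdefault(venue, []).append(price)
    return {venue: median(prices) for venue, prices in by_venue.items()}


# Made-up sample data, for illustration only.
reports = [
    ("Festival X", 6.0, True),
    ("Festival X", 7.0, True),
    ("Festival X", 60.0, False),  # Likely a typo, rejected by a curator.
    ("Club Y", 4.0, True),
    ("Club Y", 5.0, True),
    ("Club Y", 4.5, True),
]

print(beer_benchmark(reports))
```

The median is used here because a single mistyped price that slips through moves it far less than it would move an average.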
On top of right-scaled humans and data that actually matters, you need a large dose of flexibility. To find enough skilled people with a broad grounding in pop culture and strong local knowledge, we had to get creative. We found lots of talented and qualified people in our home region of Eastern Europe. We recruited people from around the world, and used crypto to pay those in unstable regions who had the skills we needed. For example, we found a good group of curators in Venezuela, where inflation almost instantly destroys fiat currency values and where banking is chaotic, to say the least. By keeping our focus reasonable, we can make their jobs reasonable, reducing curation or moderation burnout.
These approaches need to be tailored to your industry, but the human-machine balance in cultivating quality, actionable data should be your goal. It’s allowed us to raise the bar on insights into the live music business, insights we expect to grow richer as time passes. A scalable curation system is possible, with the right mix of open-mindedness, tech tools, and smart people.