Data Discovery: Finding Your Perfect Match

As set out in Strategies for Unlocking the Value of Third-Party Data, organizations are faced with both a need and an opportunity to revisit how they acquire and extract value from third-party data. In Solving the 80/20 Data Dilemma, six phases associated with acquiring and extracting value from data were identified: Discover, Assess, Acquire, Adapt, Use, Monetize (optional).

Discovery is critical because it sets the scope for what will be possible. If the scope of your inquiry is limited, your potential outcomes will be too. That suggests going very wide and deep to discover data that’s optimal for your use case, and ideally for multiple use cases, but that’s difficult and time-consuming. How do you know you’ve investigated far and wide enough? How do you know whether what’s being offered is relevant and useful? What have people in your organization used for similar use cases? What’s the cost of a given option? These are all difficult questions to answer, and more so when there is an urgent business need. As a result, discovery can be overlooked or rushed, resulting in poor outcomes.

Historically, discovery has relied on marketing by data vendors and the use of data brokers, but it has evolved in the last decade. After a slew of failed data marketplaces, there was an explosion in ‘alternative data’ driven by financial investment managers seeking exclusive insights to compete in capital markets and data owners aiming to monetize their data. Companies like Quandl (acquired by Nasdaq), EagleAlpha, and BattleFin supported this activity and introduced new concepts like collaboration, buyer-seller networking, and try-before-you-buy. While this led to new challenges, such as how to adequately assess and integrate a bewildering range of options, it provides insight into the approaches that can be taken for general third-party data discovery.

So, what would ‘great’ discovery look like for a large organization? 

  1. Make it easy. One option is to use a data hunter/scout to help teams find what they’re looking for. This is a specialized role with extensive experience and knowledge of the industry and could be either an employee or an outsourced role. Another option is to create a marketplace or catalog of what third-party data is available. This creates an intuitive experience for the consumer and lays the foundation for other benefits such as accelerating assessment and licensing.
  2. Compare apples with apples. With limited vocabulary to describe data and high inherent complexity, it’s crucial to create a yardstick to compare third-party data. There are no industry standards for this, so you’ll need your own, and there are two ways of creating one. The first is to establish your own spec and invite people to map to it; the second is to consistently profile data and then map it to your needs. Whichever approach you choose, your teams will be able to quickly understand what data is potentially useful, and this metadata can easily be reused to support future discovery efforts.
  3. Crowd-source. Capturing knowledge of what data has previously been acquired or rejected by people in your organization, and what adequately satisfied a given use case, provides a significant advantage. While likely to be distributed across the organization, this knowledge is highly valuable in helping people rapidly discover third-party data they can more easily trust. Making this available within your marketplace/catalog not only exposes it to the widest audience but helps to drive active contributions.
  4. Build data-driven estimates. Maintain a record of what third-party data has cost for the various use cases, so teams can gauge the likely cost of data and tackle the market accordingly. Data has wildly different value propositions depending on the type and scope of the use case. Data vendors understand this and price accordingly to best serve the various markets. Understanding what is ‘fair market value’ for your organization during the discovery phase avoids issues later on.
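
To make points 2 and 4 a little more concrete, here is a minimal Python sketch of one way a catalog could hold a consistent profile for each dataset, match it against a team's needs, and estimate cost from past purchases. Everything in it is an assumption made for illustration: the profile attributes, the class and function names, the dataset and vendor names, and the prices are all hypothetical, and using the median as the estimate is just one simple choice rather than a recommended pricing method.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical common profile used to describe every dataset the same way."""
    name: str
    vendor: str
    coverage: frozenset          # e.g. regions covered
    fields: frozenset            # attributes the dataset provides
    update_frequency_days: int   # how often the data is refreshed


@dataclass(frozen=True)
class Requirement:
    """What a team needs for a use case, in the same vocabulary as the profile."""
    coverage: frozenset
    fields: frozenset
    max_update_frequency_days: int


def matches(profile, need):
    """True if the dataset covers everything the requirement asks for."""
    return (
        need.coverage <= profile.coverage
        and need.fields <= profile.fields
        and profile.update_frequency_days <= need.max_update_frequency_days
    )


def shortlist(catalog, need):
    """Names of catalog entries that satisfy the requirement."""
    return [p.name for p in catalog if matches(p, need)]


def estimate_cost(past_deals, use_case):
    """Median price previously paid for a use case, or None with no history."""
    prices = [price for uc, price in past_deals if uc == use_case]
    return median(prices) if prices else None


# Illustrative data only: names and prices are invented for this sketch.
catalog = [
    DatasetProfile("FootfallDaily", "Vendor A", frozenset({"UK", "US"}),
                   frozenset({"store_id", "visits", "date"}), 1),
    DatasetProfile("FootfallMonthly", "Vendor B", frozenset({"UK"}),
                   frozenset({"store_id", "visits"}), 30),
]
need = Requirement(frozenset({"UK"}), frozenset({"store_id", "visits"}), 7)
past_deals = [("site selection", 40000), ("site selection", 60000),
              ("credit risk", 250000)]

print(shortlist(catalog, need))
print(estimate_cost(past_deals, "site selection"))
```

Because every dataset is described with the same attributes, the shortlist is a like-for-like comparison, and the record of past deals gives teams a starting figure before they approach the market.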

Discovery is the first phase of acquiring and extracting value from data and potentially a great place for any data-driven enterprise to start focusing. Once discovery is complete and a set of options identified, the task of assessing the data begins. Not only can this be extremely time-consuming, but it can also easily lead to flawed decision-making.

Authored by Anthony Cosgrove (Co-Founder) at Harbr