Data Adaptation: The First Dance

As set out in Strategies for Unlocking the Value of Third-Party Data, organizations are faced with both a need and an opportunity to revisit how they acquire and extract value from third-party data. In Solving the 80/20 Data Dilemma, six phases associated with acquiring and extracting value from data were identified: Discover, Assess, Acquire, Adapt, Use, Monetize (optional).

With Discovery, Assessment, and Acquisition completed, the adaptation phase begins. This marks a return to more technically oriented people: those who will be actively working with the data, and the technicians supporting data movement. The challenges will differ dramatically depending on how well the previous phases were managed and whether a technology solution, such as a collaborative data exchange, is used to manage the end-to-end process. For many, this may be the first time they access the full data product, and it will be quickly followed by the first time they experience an update. While that is happening, work is underway to make the data product fit the specific use case.

Myriad issues can arise here: format conversion, volume of data, timeliness of updates, volume of change, and cleanliness. Few of these can be preemptively managed if only a static sample was used during the assessment phase. If insufficient assessment work has been done, you can also experience issues with matching or joining to your own data, missing values, inaccurate data elements, and various other suitability problems. Some of these issues can be so severe that the data simply does not work for the intended use case. There is also the need to move the data, which can involve many different technologies and (often manual) ways of managing them, generating administrative and technical overhead that can cause weeks of delay.
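To make the matching and missing-value problems concrete, here is a minimal sketch of a first-pass check on a newly delivered data product. It is an illustration under stated assumptions, not a prescribed method: the function name `profile_delivery`, the field names, and the sample rows are all hypothetical, and a real delivery would normally be read from a file or database rather than an in-memory list.

```python
from typing import Any, Dict, Iterable, List, Mapping, Set


def profile_delivery(
    rows: Iterable[Mapping[str, Any]],
    own_keys: Set[Any],
    key_field: str,
    required_fields: List[str],
) -> Dict[str, Any]:
    """Summarise how well a delivered data product joins to our own data.

    Hypothetical helper: counts rows, the share of rows whose key also
    exists in our own data, and empty values per required field.
    """
    total = 0
    matched = 0
    missing = {field: 0 for field in required_fields}
    for row in rows:
        total += 1
        if row.get(key_field) in own_keys:
            matched += 1
        for field in required_fields:
            # Treat both None and the empty string as a missing value.
            if row.get(field) in (None, ""):
                missing[field] += 1
    return {
        "rows": total,
        "match_rate": matched / total if total else 0.0,
        "missing": missing,
    }


# Illustrative data only; the identifiers and values are made up.
delivery = [
    {"company_id": "A1", "revenue": "100"},
    {"company_id": "B2", "revenue": ""},
    {"company_id": "C3", "revenue": None},
    {"company_id": "D4", "revenue": "40"},
]
own_company_ids = {"A1", "B2", "Z9"}

report = profile_delivery(delivery, own_company_ids, "company_id", ["revenue"])
print(report)
```

Running the same check on the full product and again on its first update, rather than on a static sample, is one way to surface these issues before they stall the use case.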

Unlike some other phases, there is a lot that can be done to improve the situation, but much of that opportunity resides within the Discovery and Assessment phases. However, if you find yourself in a painful situation, here are some options:

  1. Build bridges. If you start struggling with the data, remember that the provider should know their data best and that their other customers may have experienced similar issues. Try to avoid relying on calls or emails to diagnose and resolve issues. Instead, find a way for both parties to get hands-on with the data in a secure environment. This will dramatically increase the likelihood of a successful outcome, and the data provider may be able to fix issues at source rather than leaving you to fix them in isolation.
  2. Share and share alike. If you’re having problems, or just making adaptations, it is likely that other users of that data product in your organization will have the same experience. If you are able to share your work with others, it will significantly help your organization move faster. However, this only really works if the sharing happens in a technology that is highly accessible rather than a point solution, and it works best when licensing is closely coupled with where the adaptations are being created.
  3. Modular engineering. Significant adaptations to a data product are best done in a modular fashion so that different aspects can be shared and forked. The basic work done on formatting, cleansing and filtering may be widely re-used, whereas the more use case-specific work will not. Building in a modular way will allow re-use of as much as possible.
  4. Robotic movements. Investing in the automation of data movement makes sense because data will always need to be moved, unless your entire organization runs on a single database! This is a feature of collaborative data exchanges, which seek to remove the technical complexity by providing multiple movement mechanisms without added overhead, eliminating costs and managing risks.
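As one possible illustration of the modular approach in option 3, the sketch below splits adaptations into small, composable steps, so that the generic formatting and cleansing work can be shared while the use case-specific filter is layered on top. All function names, field names, and sample rows are hypothetical; this is a sketch of the idea, not a reference implementation.

```python
from functools import reduce
from typing import Any, Callable, Dict, Iterable, Iterator, List, Set

Row = Dict[str, Any]
Step = Callable[[Iterable[Row]], Iterator[Row]]


def normalise_keys(rows: Iterable[Row]) -> Iterator[Row]:
    """Formatting step: trim and lower-case column names."""
    for row in rows:
        yield {key.strip().lower(): value for key, value in row.items()}


def drop_incomplete(required: List[str]) -> Step:
    """Cleansing step: drop rows with an empty required field."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            if all(row.get(field) not in (None, "") for field in required):
                yield row
    return step


def keep_where(field: str, allowed: Set[Any]) -> Step:
    """Filtering step: keep rows whose field value is in the allowed set."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            if row.get(field) in allowed:
                yield row
    return step


def compose(*steps: Step) -> Step:
    """Chain steps left to right into a single reusable step."""
    def pipeline(rows: Iterable[Row]) -> Iterator[Row]:
        return reduce(lambda acc, step: step(acc), steps, rows)
    return pipeline


# Widely reusable base: formatting and cleansing.
shared_base = compose(normalise_keys, drop_incomplete(["id", "country"]))
# Use case-specific fork built on top of the shared base.
uk_only = compose(shared_base, keep_where("country", {"GB"}))

# Illustrative rows only.
raw = [
    {" ID ": 1, "Country": "GB"},
    {"id": 2, "country": ""},
    {"ID": 3, "Country": "FR"},
]
print(list(uk_only(raw)))
```

Because `shared_base` is itself a step, another team could fork it with a different final filter without redoing the formatting and cleansing work.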

Adapting data is inevitable when trying to extract value from third-party data. Much of the pain can be avoided by fundamentally altering what happens at the Discovery and Assessment phases and by enabling your organization to collaborate better, both internally and externally. Once the data has been adapted it can be put to use, but how do you know if it’s working?

Authored by Anthony Cosgrove (Co-Founder) at Harbr