Data Marketplace or Data Exchange: What’s the Difference and Does It Matter?

September 15, 2020

Data Marketplace or Data Exchange?

Following a spate of failed data marketplaces in the early 2010s, new platforms have launched with wider value propositions. Frustratingly, they also go by different names, which are sometimes used interchangeably. So, what is a data exchange? What is a data marketplace? What is the difference, and does it matter?

We’ve previously written about why data marketplaces tend to fail (SPOILER ALERT: they failed to adapt the marketplace paradigm to the unique attributes of data as an asset, resulting in poor value propositions). Since then, a second wave of data marketplaces, and of data exchanges, has emerged.

Generally, ‘data marketplace’ is used to describe a place for buying and selling third-party data, with examples including AWS (whose offering is actually called a data exchange), Snowflake, Quandl, BattleFin, Narrative, and Dawex. These platforms typically focus on the transactional aspect of buying and selling data, including publishing, licensing, discovering, and distributing. They rarely include data from the platform provider, so they are largely ‘platform-based businesses’ that seek to leverage scale or network effects.

Some large data vendors also have ‘data marketplaces’, but these tend to contain the vendor’s own data products, so they are less of a ‘marketplace’ and more of a ‘storefront’ enabling the digital transformation of their supply-chain businesses. In all cases, the financial transaction is typically external-facing, between legal entities.

Meanwhile, ‘data exchange’ is most widely used for technologies that enable the exchange of data without an associated financial transaction, with basic examples including Microsoft’s Azure Data Share and Snowflake’s Private Data Exchange. These are targeted at organizations that are unlikely to be selling their data and are instead seeking to extract value by ‘exchanging’ it.
The ‘exchange’ can be either one-way or mutual, to drive joint value propositions. Some data exchanges are very basic, but new offerings are emerging to provide deeper functionality. Unlike a marketplace or storefront, many of these exchanges happen within legal entities and corporate groups as well as between them.

Does it Matter?

When considering use cases and selecting or interacting with a technology, the terms do matter. Data marketplaces facilitate the external exchange of data via a financial transaction. Data exchanges can also do this, but they support a wider range of exchange-based use cases that do not involve a financial transaction, including the internal exchange of data.

The data marketplace category is established, if still struggling to achieve widespread adoption. The data exchange category is emerging, with significant innovation around how to support value transfer between suppliers and consumers across internal and external boundaries.

The workflow of a data marketplace includes publishing, licensing, discovering, and distributing data. Those capabilities are also necessary for an effective data exchange, but an exchange usually also requires functionality to help organizations extract value from the data being exchanged, because a direct financial transaction is not occurring. That often requires collaboration between the supplier and consumer to understand both the data and the use case, more so when there is a mutual exchange of data.

Exchanging data can also be accompanied by concerns around privacy and security, especially if the supplying party has limited experience or deems its data to be highly sensitive. This is rarely the case when directly selling data, as the expectation is typically that a copy of the data will be given to the buyer. Two solutions data exchanges can use to address these concerns are to limit the data that is exchanged or to limit where the data can be accessed and used.
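As a rough illustration of the first option, the sketch below drops fields that have not been agreed for sharing and replaces a sensitive identifier with a keyed token. It is a minimal example under stated assumptions, not a description of any particular platform: the function names (`tokenize`, `limit_record`), the field names, and the secret key are all hypothetical, and keyed hashing (HMAC-SHA256) is just one common way to tokenize an identifier.

```python
import hashlib
import hmac


def tokenize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed token (HMAC-SHA256 hex digest).

    The value is normalized first so that the same identifier always
    yields the same token for a given key.
    """
    normalized = value.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def limit_record(record: dict, shared_fields: set, token_fields: set,
                 secret_key: bytes) -> dict:
    """Keep only the agreed fields, tokenizing the sensitive ones.

    Any field not listed in shared_fields or token_fields is dropped.
    """
    limited = {}
    for field, value in record.items():
        if field in token_fields:
            limited[field] = tokenize(str(value), secret_key)
        elif field in shared_fields:
            limited[field] = value
    return limited


# Hypothetical example values, for illustration only.
key = b"hypothetical-shared-secret"
record = {"email": "Jane@Example.com ", "postcode": "AB1 2CD", "spend": 120.5}
limited = limit_record(record, shared_fields={"spend"},
                       token_fields={"email"}, secret_key=key)
```

In this sketch the postcode is removed entirely, the spend value passes through unchanged, and the email address becomes a token. Two parties that hold the same key could match records on the token without either seeing the other's raw identifiers, though how the key is managed and who holds it would depend on the exchange.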
Limiting the data that is exchanged means removing some of the original content, limiting what can technically be accessed, or using privacy-preserving techniques, such as encryption and tokenization, to protect the privacy of data subjects. This is a balancing act that requires a strong understanding of both the data and the use case to ensure a value proposition is maintained. Again, this usually requires collaboration, as the data supplier and the data consumer both need to understand the data and the use case. Success requires a period of co-innovation between the supplier and consumer, or via a trusted broker, to prove and operationalize the value.

In some cases, ‘blind’ matching and ‘pre-canned’ analytics can be used in addition to these privacy-preserving techniques to derive insight without the data actually being ‘shared’. Inevitably, the use cases for this are limited, and challenges around data cleanliness and schema alignment can also affect results.

Limiting where data can be accessed provides an opportunity to securely access both ‘limited’ data and data in its original form. Legal terms can be used, but they are a fairly passive control. Another option is to use a secure ‘sandbox’ environment containing tools and infrastructure, so the data can be accessed without the supplier losing ownership. This mechanism can be used at any stage, from the initial trial or assessment of data, through to it being adapted to a use case and, in some cases, all the way to realizing the value.

To be most effective, these sandbox environments need to be collaborative, to facilitate broad use cases within and between organizations. Collaboration is also necessary where a value proposition requires the mutual exchange of data or other intellectual property, such as a model.

The ‘Collaborative Data Exchange’ and the ‘Data Mesh’

The combination of a data exchange and collaborative functionality, such as a secure sandbox, is called a ‘collaborative data exchange’.
The core purpose is to enable a low-risk, iterative approach to exchanging and collaborating on data within and between organizations.

Depending on the underlying infrastructure, some collaborative data exchanges can also be described as a data mesh. The core principles of a data mesh (distributed data ownership, treating data as a product, ecosystem governance, and a self-service approach) align well with the design needs of a successful collaborative data exchange, but contrast sharply with the centralized approaches typically seen in most data exchanges and marketplaces.

While collaboration is not mandatory for a data marketplace or storefront, it is helpful for avoiding many of the issues that have led to previous data marketplaces failing. Every data supplier and consumer experiences friction as they try to prove value propositions, manage the risk of unfair/unauthorized value transfer, and compensate for an inability to track access and usage. This results in sample-based trials, hands-off support, and a lot of overhead on the consumer to integrate the data and make it useful.

By solving those issues, a collaborative data exchange provides an excellent foundation for second-wave data marketplaces and data storefronts aiming to deliver a meaningful value proposition to their users and avoid the mistakes of the past.

Authored by Anthony Cosgrove (Co-Founder) at Harbr