Data Presentation | Beyond Sharing Data as Tables

July 9, 2020

Tables Are Rarely Useful

Throughout my career, I’ve worn many hats, from data scientist to data engineer, software engineer, data analyst and now head of support. It’s safe to say that I’ve experienced a lot of the challenges associated with accessing and sharing data. One of the clearest observations from that history is that sharing data as tables simply isn’t useful. There needs to be more.

Data Diversity

A common misconception about data is that it’s all tabular, meaning it can be described as rows and columns, similar to the old-fashioned Excel table everyone has seen at some point in their lives. But data comes in many different shapes, sizes and forms, both structured and unstructured, all of which are valuable in their own way.

While a table with rows and columns is data, so is a visual representation of the data such as a map or a chart. A model built from the data is also data. An aggregation of the data is still data. Unstructured content associated with the data, such as photos, videos and sound files, is also data. Even a derivative taken from the data is data. All these different types of data possess different value propositions for different use cases, and therein lies the key.

The Importance of Use Cases

Take weather data and its various use cases: it’s clear the raw data, a table, simply isn’t enough to deliver value across all of them. Those use cases include predicting weather for the coming week, modelling the effects of climate change over a period of time, or predicting which coastal cities are most prone to hurricanes. A visualization of the data may be used to show the predicted weather patterns for the day across a region on live television; a researcher could identify a correlation between rainfall and road accidents; a farmer could determine irrigation schedules or plan for flooding.
These are very diverse use cases, and they all require the same data but in different forms, leading to different outputs. Additionally, each may carry different requirements for access credentials, the methodology for collection or aggregation, documentation, metadata, etc. For this reason, sharing a table of weather data simply isn’t enough to make it useful in real-life scenarios. Data needs to be shared in different formats and structures, and via different mechanisms, to enable multiple use cases.

Metadata Makes the World Go Round

When sharing any kind of data, metadata is hugely important but often overlooked. Consider a book without a title, chapter headings, an index or a summary; it would be much harder to read and understand. The same is true of data. Tabular data without metadata can lead to problems and restrict usability. Metadata is arguably as valuable as the data itself, because without it consumers lose context, leaving the data open to interpretation and therefore error. Sharing data with metadata, as a cohesive data product, enables consumers to interpret, engage with, and understand the data. The same is true for unstructured data, which highlights why sharing tables or any type of raw file on its own is often insufficient for achieving real-world outcomes.

Data as a Product

There is now widespread acceptance that data is valuable. But sharing tables will often fail to unlock that potential value. When supplying data, it’s critical to understand the use case your consumers have in mind, the tools they have available to them and their level of skill and knowledge. This will determine the appropriate format and the scope of the metadata and other related assets that are required. Bundling data, metadata and other assets together transforms a simple table into a data product: something that is targeted at a specific persona and a specific value proposition.
Ultimately, this increases the usability of the data and, because data is only really valuable when it’s used, usability is critical to realizing value.

In my many years as a data consumer, my needs were almost never met by a single table. If data suppliers had provided what I actually needed, in the form of a product, I would have been able to achieve my goals far, far more quickly.

Authored by Ryan Cauldwell (Head of Support) at Harbr