A Taxonomy of the Data Economy
Oil and energy dominated the list of most valuable firms in 2006. Today the same prize is awarded to the technology giants Alphabet, Amazon, Apple, Facebook, and Microsoft. Each has come to monopolise a distinct niche, yet the thread that links this diverse bunch is a voracious appetite for data and, perhaps more crucially, the ability to process it. Data, it appears, is the future of commerce. It is the promise and peril of this emerging landscape that is the focus of this piece.
What is the data economy? An all-encompassing definition might include the production, distribution, and consumption of digital information. Beyond this, all that is certain is that the data economy is different: traditional economic theories do not sufficiently explain its inner workings. Although a robust methodology has yet to be developed, estimates from Statistics Canada, the country's national statistics agency, suggest the data economy may be worth 5% of Canada's stock of private physical capital. If the volume of data generated is any guide, the data economy is growing. IDC, a market research firm, estimates that more data will be produced in the next two years than has been generated since the dawn of computing.
Location, Location, Location
What will the infrastructure of the burgeoning data economy look like? Two paradigms exist. At one end is the centralised approach, where cloud service providers like Amazon Web Services (AWS) aggregate and process data in a central repository. 'Edge computing', by contrast, processes data locally, close to where it is collected. The infrastructure of the data economy will span the spectrum between these extremes: for every data centre there will be countless local processing nodes, collecting data from every nook and cranny of the digital grid.
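To make the spectrum concrete, consider a minimal sketch (every name in it is hypothetical) in which an edge node crunches raw sensor readings locally and forwards only a compact summary to a central cloud store:

```python
# A minimal sketch of an edge node that processes data locally and ships only
# a summary to a central repository. All names here are hypothetical.
import json
import statistics

def summarise_readings(readings: list[float]) -> dict:
    """Reduce raw sensor data to a compact summary at the edge."""
    return {
        "count": len(readings),
        "mean": statistics.mean(readings),
        "max": max(readings),
    }

def upload_to_cloud(summary: dict) -> None:
    """Stand-in for a call to a central store run by a cloud provider."""
    print("uploading:", json.dumps(summary))

raw = [21.5, 22.1, 21.9, 35.0, 22.0]      # e.g. temperature readings
upload_to_cloud(summarise_readings(raw))  # three fields travel, not every reading
```

The design choice is the point: shipping three summary fields instead of every reading trades central visibility for savings in bandwidth and latency.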
How far the pendulum swings will also be a function of the industry to which it is applied. Regulatory compliance, for example, requires control over sensitive data, limiting the degree to which data can be dispersed. A central hub aggregating risk and compliance data has thus captured the imagination of an army of chief information officers (CIOs) in the financial services sector, where it is not uncommon for over a third of total IT change budgets to be geared toward compliance. As a 2019 report by the Bank of England noted, cloud computing may nonetheless offer the potential to reduce technological infrastructure costs by up to 50%. A digital transition should be viewed not only as an expense but also as an opportunity to innovate.
It's the economics, stupid
What are the economics of data? A defining feature of data is that it can be consumed repeatedly without being depleted; in economic parlance, it is non-rivalrous, a hallmark of a public good. On the other hand, data can be encrypted to exclude unauthorised users, a hallmark of a private good. In this way, data exhibits traits of both a public and a private good.
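A minimal sketch makes this dual character visible, here using the Fernet recipe from Python's cryptography package: the ciphertext can be copied endlessly (non-rivalrous), yet only a key-holder can read it (excludable):

```python
# A minimal sketch of excludability in practice: symmetric encryption with
# Fernet (from the "cryptography" package) keeps data private to key-holders.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # only the data's owner holds this key
token = Fernet(key).encrypt(b"customer purchase history")

# Anyone can copy the ciphertext (the data is non-rivalrous)...
print(token[:16], "...")
# ...but only a key-holder can read it (the owner can exclude others).
print(Fernet(key).decrypt(token))
```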
Online advertising illustrates where data has taken shape as a private good. Data brokers do a brisk business in the sale of personal information, tracking hundreds of data points per person and selling them to everyone from banks to telecoms carriers in an industry worth over $170 billion. The case for treating data solely as a private good, however, is limited by the fact that the bulk of data today rarely changes hands. To understand why, consider that corporate datasets are not fungible: each differs in how it was collected, its purpose, and its reliability. This makes it difficult for buyers and sellers to agree on a standard of value, commonly price. In response, AWS has launched a marketplace, AWS Data Exchange, to make the trading of data as frictionless as possible. It works much as you might expect of an online store, with buyers and sellers agreeing to licence terms and AWS processing the payments.
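Programmatic access hints at the intended frictionlessness. As a rough sketch, assuming boto3 (the AWS SDK for Python) and configured credentials, a subscriber might list the datasets it is entitled to:

```python
# A minimal sketch of browsing datasets on AWS Data Exchange with boto3.
# Assumes AWS credentials are configured; the region choice is arbitrary.
import boto3

client = boto3.client("dataexchange", region_name="us-east-1")

# Origin="ENTITLED" asks for datasets this account has licensed,
# as opposed to Origin="OWNED", the datasets it publishes itself.
response = client.list_data_sets(Origin="ENTITLED")
for data_set in response["DataSets"]:
    print(data_set["Id"], data_set["Name"])
```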
Why hoard data if it is not tradable? This line of argument has given rise to what is known as the open-data movement. It advocates the sharing of data and has primarily been led by the public sector. More recently, the corporate sector has also begun to play a larger role, with Microsoft's Open Data Initiative one such example. Yet open data can only go so far. The main limitation for personal data is privacy law, such as the European Union's General Data Protection Regulation (GDPR) and California's Consumer Privacy Act (CCPA). For corporate data, the checks are economic: generating data is expensive, and sharing it can reveal proprietary secrets.
Companies will have to make strategic decisions about which datasets to make public. The challenge of separating what can be shared safely will be eased as advances in differential privacy make it possible, for example, to replace raw data with anonymised data that shares similar statistical patterns. Homomorphic encryption and blockchain also offer possibilities: the former allows algorithms to crunch encrypted data without decrypting the underlying records, while the latter can fine-tune the management of access and track who has exercised it.
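To see differential privacy in miniature, the sketch below applies the Laplace mechanism, a standard technique, to a counting query; the records and the privacy budget (epsilon) are illustrative assumptions:

```python
# A minimal sketch of the Laplace mechanism, a classic differential-privacy
# technique: release a noisy count rather than the raw value.
import numpy as np

def private_count(records, predicate, epsilon=0.1):
    """Return a count perturbed with Laplace noise.

    A counting query has sensitivity 1 (adding or removing one person changes
    the count by at most 1), so noise drawn from Laplace(1/epsilon) yields an
    epsilon-differentially-private answer.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical records: did a customer default on a loan?
customers = [{"defaulted": True}, {"defaulted": False}, {"defaulted": True}]
print(private_count(customers, lambda c: c["defaulted"]))
```

Smaller values of epsilon add more noise and protect individuals more strongly, at the cost of a less accurate published statistic.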
The Devil is in the Detail
Combining datasets in novel ways while maintaining data integrity is becoming increasingly difficult as the volume of data multiplies. Even straightforward variables such as customer name can be recorded in a variety of ways. New approaches are required to store, organise, and process data. Enter data warehouses and data lakes, which, although they differ architecturally, typically live in the cloud, where data can be fed in from different sources and used by multiple users simultaneously. Kiva, an international non-profit lender, successfully built a cloud-based data warehouse with Snowflake to drive operational efficiencies and increase the value of its datasets. The collaboration between Kiva and Snowflake also highlights another important industry trend: the growing demand for analytics-as-a-service layered on cloud-based platforms, with third-party applications providing data-science capabilities as required.
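A sketch of the customer-name problem shows why integration is laborious; the variants and normalisation rules below are illustrative, and real pipelines need far more of them:

```python
# A minimal sketch of normalising one field, customer name, before merging
# records from different sources. The variants and rules are illustrative.
import re

def normalise_name(raw: str) -> str:
    """Collapse common variations of a name into one canonical form."""
    name = raw.strip().lower()
    name = re.sub(r"[.,]", " ", name)          # drop punctuation
    name = re.sub(r"\s+", " ", name).strip()   # collapse whitespace
    # Reorder "Surname, Forename" style entries into "Forename Surname".
    if "," in raw:
        parts = name.split(" ", 1)
        if len(parts) == 2:
            name = f"{parts[1]} {parts[0]}"
    return name.title()

variants = ["Jane Smith", "SMITH, Jane", "  jane  smith ", "Smith, J."]
print({v: normalise_name(v) for v in variants})
```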
Others have been less successful. A common obstacle is data silos, which reflect firm-internal boundaries. Reluctant to relinquish power, departments are loath to share data, preventing the development of a cohesive strategy. To overcome this digital division, some companies have made organisational changes. A growing number have appointed senior personnel to the role of CIO to facilitate multidisciplinary efforts. Yet changes at the top achieve little if the rest of the company is not ready. Poor data literacy is another common obstacle to corporate data projects. Changing this does not require every employee to become a data scientist, only that they have a basic grasp of which analytical tools and methods are appropriate, and that they think critically about the results data analysis yields. As the technology market research firm Forrester put it, "Businesses are drowning in data but starving for insights."
Conclusion
The shift from atoms to bits will usher in new markets, infrastructure, and business paradigms. Although many details have yet to be ironed out, one trend is clear: the skew in outcomes due to what are known as 'network effects', whereby successful platforms beget further success in a virtuous cycle. A firm that can leverage data will make better use of artificial intelligence, which, in turn, will generate yet more data. The only question that remains is: on which side of the digital divide will you find yourself?