How consumers feed the data economy ... by feeding their captive digital twins

The preceding article gave an overview of the data economy, describing how consumers provide the foundation of a huge economy with little control and limited return. This article dives into more detail about the many ways in which consumers provide their data, creating and feeding their own digital twins over which they have no control.



Sources of consumer data


What are the sources of consumer data? Credit agencies have long collected consumer credit data for use in a banking context. By using search data for advertising, Google was probably the first company to systematically collect consumer data and monetize it in a context completely different from the one in which it was collected. As suggested in a recent book, this move was motivated by the need to survive the dotcom bust.


Since then, a great variety of Internet services for consumers has been built, from e-mail to messaging, social media, news, entertainment, photo sharing, and many more. The service providers collect the transaction data from these services, as well as metadata such as users’ device types, locations, interaction behavior and more.


The Internet of Things, connecting sensors and devices to the Internet, is another prolific and highly differentiated source of consumer data. Combining data from connected thermostats, entry doors and garage doors, light switches, security cameras, and other devices can create a detailed view of people’s home lives.


Voice assistants and navigation systems add more layers of fine detail. The digitization of consumer services such as credit cards, banking, and e-commerce has created additional data sources.


Finally, wearables such as smart watches and fitness trackers aim to track not just our health data, but also our moods and our interactions with others.





All this data flows into the diffuse consumer data ecosystem of data brokers, advertisers, and other users of consumer data, which we call the data swamp. The leading big tech companies create and own detailed digital twins of consumers.


The data swamp


The combination of different data types from a variety of sources creates focused insights. Data brokers acquiring data from many sources are one avenue for this data aggregation. Moreover, some dominant service providers are branching out into adjacent spaces, augmenting their existing data with new data types. Google’s move from search into voice assistants and smart home services is just one example of this. Another example is Amazon’s entry into the pharmacy business.


The abundance of data has created a vast ecosystem in which data is traded among many participants. In addition to using their own data, the big tech companies are deeply connected with the data broker ecosystem in what could be called a data swamp. By mining all that data, the leading big tech companies create consumers’ digital twins in incredible detail. Big tech’s monopoly-like market power and its strong-arm terms of service make it hard for consumers to liberate their digital twins.



Raw data vs. actionable insight


Through their transactions on search, e-mail, social media, navigation systems, and other Internet services, consumers create raw data. For example, somebody searches for a product, mentions a vacation trip in their e-mail, drives to a particular location, or posts about a recipe. This data is sometimes called a natural resource.


Like physical natural resources, this data is not usable in its raw form; it must be processed to turn it into actionable insight. Individual raw data points are processed into consumer profiles, which are then matched with application demands. In some cases, the data from entire user classes is matched with target demographics. In other cases, profiles are used to target individuals with messages intended to influence their decisions.
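To make the step from raw data to profile concrete, here is a minimal sketch in Python. It is purely illustrative and does not describe any company's actual pipeline. The events, the topic-to-interest mapping, and the function names (build_profiles, match_segment) are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical raw events: (user_id, source, topic). All values are made up.
RAW_EVENTS = [
    ("u1", "search", "hiking boots"),
    ("u1", "email", "vacation"),
    ("u1", "navigation", "outdoor store"),
    ("u2", "social", "recipe"),
    ("u2", "search", "cookware"),
]

# Hypothetical mapping from raw topics to coarse interest categories.
TOPIC_TO_INTEREST = {
    "hiking boots": "outdoors",
    "outdoor store": "outdoors",
    "vacation": "travel",
    "recipe": "cooking",
    "cookware": "cooking",
}


def build_profiles(events):
    """Aggregate individual raw data points into per-user interest counts."""
    profiles = defaultdict(Counter)
    for user_id, _source, topic in events:
        interest = TOPIC_TO_INTEREST.get(topic)
        if interest is not None:
            profiles[user_id][interest] += 1
    return dict(profiles)


def match_segment(profiles, interest, min_signals=2):
    """Return the users whose profile shows at least min_signals for an interest."""
    return sorted(
        user_id
        for user_id, counts in profiles.items()
        if counts[interest] >= min_signals
    )


profiles = build_profiles(RAW_EVENTS)
outdoor_audience = match_segment(profiles, "outdoors")
```

No single event says much on its own, but a search and a car trip combined are enough to place user u1 in a hypothetical "outdoors" audience. Real systems work with far more sources and far richer models; the sketch only shows the basic pattern of aggregation followed by matching.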


Historically, statistical analysis was the method used for most of this processing. More recently, the advent of artificial intelligence (AI) has enabled the processing of larger and more diverse data sets to deliver more differentiated and personalized results. Analyzing text, voice, and video data creates incredibly detailed consumer insights: the consumers’ digital twins.



Thus, the value of actionable information depends on two components: the raw data and the algorithms that process the data. It can be argued that the raw data belongs to the consumers. The algorithms, however, are the tightly guarded intellectual property of the big tech companies. If either ingredient is missing, there is no value. How much of that value should be assigned to the input data? In the current environment, without a regulatory framework and with many of the data flows and value transfers hidden from consumers, the big tech companies retain the lion’s share for themselves.


The next article in this series will examine how Big Tech and other companies extract value from the digital twins.