News

Tim Wilson (Analytics Power Hour): ‘We need to get comfortable with the probabilistic nature of analytics’

21 September 2023

Tim Wilson is a seasoned analytics consultant with over two decades of experience. Lucky for us, he will be speaking at the DDMA Digital Analytics Summit on October 12th. We got the chance to talk with him beforehand, discussing everything from analytics maturity across industries to the utility of multitouch marketing attribution models. As a self-proclaimed “Analytics Curmudgeon”, he reflects on the evolving landscape of digital analytics, emphasizing the importance of shifting focus from data collection to purposeful data usage in order to unlock true business value.

Come check out Tim Wilson’s talk at the DDMA Digital Analytics Summit 2023 on October 12th. Tickets available at: shop.digitalanalyticssummit.nl.

Hi Tim, can you briefly introduce yourself? Who are you, and what do you do?

‘I’m an analyst. I stumbled into the analytics world by way of digital analytics a couple of decades ago, and I’ve been wandering around in a variety of roles in the world of analytics ever since. To be a bit more specific, I’m an analytics consultant who works in the realm of marketing and product analytics—working with organizations primarily on the people and process side of things. Or, to put it a bit more in data terms, I work with companies to help them put their data to productive use, as opposed to working with them on how they are collecting and managing their data.

At the moment, I’m between paid employment, as I left my last role at the beginning of this year to take a few breaths to figure out exactly what I’ll be doing next (as well as to have a few adventures with various of my kids as they fly the coop). So, “what I do” in analytics in the present tense is: co-host and co-produce the bi-weekly Analytics Power Hour podcast, co-run the monthly Columbus Web Analytics Wednesday meetup, speak at various conferences (like Digital Analytics Summit!), develop content for an analytics book I’m working on with a former colleague, and do gratuitous little analyses here and there to keep my R coding skills sharp.’

You’ve been a consultant for various industries, including healthcare, pharma, retail, CPG, and financial services. Do you see significant differences in the maturity and strategy of their Digital Analytics activities, for instance when it comes to governance?

‘I have to be a little careful about selection bias, as every company I work with is a company that has sought out external analytics support in some form. In theory, very analytically mature organizations—regardless of their industry—have less of a need for outside support.

Having said that, while the business models for different verticals vary, I see a lot of similarities when it comes to their analytics and analytics maturity. Perhaps I’m painting with too broad a brush, but every organization feels like its data is fragmented and incomplete, that there is more value to be mined from it, and that a deluge of actionable insights will burst forth if it can just get all of the right data pieces in place. Many organizations—again, regardless of their vertical—have a Big Project related to their data tooling or infrastructure under way: implementing a data lake, adding a customer journey analytics tool, rolling out a customer data platform (CDP), migrating to a new BI tool, or even simply shifting to a new digital analytics platform. Often, in my view, these efforts are misguided…but that’s the core of my talk at Summit, and I recognize it is a contrarian position.

I do think it’s worth noting that the nature of the data that organizations in different verticals have can be quite different. For instance, CPG/FMCG companies rarely have access to customer-level data for their customers, since much of the marketing and sales occurs through channels owned and managed by their distribution partners. Retailers often have both online and offline sales channels so, even if they have customer-level data in some form, the nature of that data varies based on the channel (and stitching together a single person’s activity across online and offline at scale is a losing proposition). And, of course, the sensitivity of the data can vary quite a bit as well—even as GDPR and other regulations require all organizations to think about personal data and be very protective of it, the nature of that data is considerably more sensitive in, say, healthcare and financial services, than it is in retail or CPG/FMCG.

I think I’ve given a prototypical consultant answer, no? Basically, “yes, no, and it depends!”’

On LinkedIn, you mention that you help clients choose algorithmic multitouch marketing attribution models. The once-promising idea that these models would be the holy grail of attribution has yet to be fully realized. How do you perceive this, and how do you ensure that these models are truly workable in today’s context?

‘Oh, dear. I have helped clients make those choices, but it’s always been under duress, because multitouch marketing attribution never did and never will actually be what many marketers expect it to be. I’ve delivered entire presentations and even posted a lengthy Twitter/X thread on the topic. Trying to be as succinct as possible, the fundamental misunderstanding is that multitouch attribution is an “assignment of value” exercise, but it gets treated as though it is a tool for “measuring value.” The latter is what marketers (and analysts) expect: how much value did channel X (or sub-channel Y, or campaign Z) deliver? The true answer to this question would be a calculation that takes the total revenue realized (or whatever the business metric of choice is) and then subtracts from that the total revenue that would have been realized in a parallel universe where channel X was not used at all. In fancy-statistics-speak, this is the concept of “counterfactuals.” Obviously, we can’t actually experience multiple universes, but there are techniques that approximate them: specifically, randomized controlled trials (RCTs, or experiments) and marketing mix modeling (MMM). Multitouch attribution, regardless of its degree of algorithmic-ness, is not particularly good at this. The other nice benefit of RCTs and MMM is that neither one relies on tracking individual users across multiple touchpoints, so a whole pile of privacy considerations—technical and regulatory—are rendered moot!
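
To make the counterfactual idea concrete, here is a minimal sketch in Python, with entirely hypothetical numbers: in an RCT, a randomly assigned holdout group approximates the parallel universe in which channel X was never used, so the incremental value is simply the difference between the groups.

```python
# Minimal sketch of the counterfactual idea (hypothetical numbers):
# a randomized holdout group approximates the "parallel universe"
# in which channel X was never used.

exposed_users = 100_000      # users eligible to see channel X
holdout_users = 100_000      # users randomly withheld from channel X

exposed_revenue = 540_000.0  # total revenue from the exposed group
holdout_revenue = 500_000.0  # total revenue from the holdout group

# Revenue per user in each "universe"
rev_per_exposed = exposed_revenue / exposed_users
rev_per_holdout = holdout_revenue / holdout_users

# Incremental value of channel X: observed minus counterfactual
incremental_per_user = rev_per_exposed - rev_per_holdout
total_incremental = incremental_per_user * exposed_users

print(f"Incremental revenue per exposed user: {incremental_per_user:.2f}")
print(f"Estimated total incremental revenue: {total_incremental:,.0f}")
```

A last-touch or multitouch report would typically credit channel X with far more, because it cannot see the revenue that would have occurred anyway.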

This doesn’t mean that RCTs and MMM are silver bullets. They’re inherently less granular, and they take time and effort to configure and run. Multitouch attribution has a place: it’s quick, it’s relatively easy, it can be very granular (keyword-, tactic-, or placement-level), and it provides some level of signal as to which activities are garnering a response. It doesn’t show, though, when any given response is cannibalizing a response that would have happened elsewhere in the absence of the tactic (think: branded paid search terms getting clicks that would have come through via organic search, anyway).

What I find exciting is that there is an increasing interest in RCTs, and MMM—which existed long before digital—is making a comeback. At the end of the day, the most mature companies use multiple techniques and use RCTs and MMM to calibrate each other and their multitouch attribution modeling.’
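
As a rough sketch of what that calibration can look like in practice (the channel names and factors below are hypothetical, not from the interview): the granular conversion counts from a multitouch attribution model can be scaled by incrementality factors measured through RCTs, which also corrects for cannibalization of the kind described above.

```python
# Hypothetical sketch: calibrating multitouch attribution (MTA) output
# with incrementality factors measured via RCTs. All channel names and
# numbers are made up for illustration.

mta_conversions = {
    "branded_paid_search": 1000,
    "display": 400,
    "email": 250,
}

# Share of MTA-credited conversions that experiments showed to be truly
# incremental (branded search often cannibalizes organic search heavily).
incrementality = {
    "branded_paid_search": 0.15,
    "display": 0.60,
    "email": 0.45,
}

calibrated = {
    channel: round(count * incrementality[channel])
    for channel, count in mta_conversions.items()
}

for channel, incremental in calibrated.items():
    print(f"{channel}: {mta_conversions[channel]} credited -> {incremental} incremental")
```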

It’s often said that the field of Digital Analytics is rapidly evolving. But is this really the case? We tend to cling to what we’re used to in our field. Can you provide your perspective as an “Analytics Curmudgeon” on this?

‘Let me first don my Curmudgeon Hat and say that, as Stéphane Hamel recently put it, “digital analytics is mostly ‘analytics engineering’ (aka ‘tagging’), and very few real analyses and business outcomes.” The data collection aspects of digital analytics have certainly been rapidly evolving: it wasn’t that long ago that we didn’t have tag managers; cookies have become increasingly unreliable as a means of identifying a single user across sessions (cookies were always a hack on which client-side tracking was built, so we shouldn’t really be surprised); and privacy regulations and browser and operating system changes have added even more challenges to comprehensively tracking users. As a result, there is a lot of handwringing by practitioners about how they’re having to work harder and harder simply to backslide as slowly as possible with the data they’re collecting.

When it comes to how data actually gets used to inform business decisions, there is also a continuing evolution. Ten years ago, very few digital analysts were even thinking about SQL, Python, or R as tools they needed to have in their toolbelt. While there are still (too) many analysts resisting that evolution, I truly believe they are limiting their career growth. Increasingly (and this is not particularly new), organizations are finding they have to work with data across different sources, and that often means some combination of programmatically extracting data through APIs and working with data that is housed in an enterprise-grade database, be it BigQuery, Azure, AWS, or something else. Along with those “broader sets of data” often comes “working with data scientists,” and that opens the door to smarter, better, and deeper thinking about different analytical techniques. My mind was blown—in a positive way—when these types of collaborations introduced me to several concepts and techniques: counterfactuals (which I referenced earlier), time-series decomposition, stationarity, first differences, and Bayesian structural time series. These are enormously useful, and they’re all much, much easier to do when using a programming language like R or Python. Really, this is an “evolution” that is about bringing time-tested techniques from other fields—econometrics, social sciences, and elsewhere—into the world of digital analytics.
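
For a feel of what a couple of these techniques look like in code, here is a small, self-contained sketch on synthetic data (it assumes numpy, pandas, and statsmodels are available; the interview itself prescribes no particular library):

```python
# Sketch of two techniques mentioned above, run on synthetic data:
# time-series decomposition and first differences.
# Assumes numpy, pandas, and statsmodels are installed.

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(42)
days = pd.date_range("2023-01-01", periods=120, freq="D")

# Synthetic daily sessions: upward trend + weekly seasonality + noise
trend = np.linspace(1000, 1400, len(days))
weekly = 120 * np.sin(2 * np.pi * np.arange(len(days)) / 7)
sessions = pd.Series(trend + weekly + rng.normal(0, 40, len(days)), index=days)

# Decompose into trend, weekly seasonal, and residual components
result = seasonal_decompose(sessions, model="additive", period=7)
print(result.trend.dropna().head())

# First differences: day-over-day change, a common step toward stationarity
diffed = sessions.diff().dropna()
print(f"Mean daily change: {diffed.mean():.2f}")  # roughly the trend's daily slope
```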

And, of course, AI will drive some evolution in the space, too. My sense is that it is both underhyped and overhyped—mishyped, maybe?—but there are more than enough people with Strong Opinions on that subject already, so I’ll leave it at that.

But, yes, I think “rapid evolution” is a fair description of what’s going on in digital analytics. Some of that evolution is for the better, some of it really isn’t!’

What trends and developments should digital analytics professionals really focus on in the upcoming years?

‘There is almost certainly a gap—potentially a massive chasm—between what the industry will focus on and what I think they should focus on. I don’t have enough hubris to declare that I’m absolutely right, but the biggest trend I see being thrust upon the industry is a decline in the availability of person-level data. We’ve already touched on this—“privacy,” from both a regulatory and a technological perspective, is driving organizations farther and farther away from the nirvana of a “360-degree view of the customer.” That nirvana was never achievable at scale, but organizations are increasingly aware that that’s the case.

What I’d like the analytics industry to do as a response to this reality is twofold.

First, I’d like for us to stop treating complete, user-level data as an analytical goal in and of itself and, instead, embrace incomplete and aggregated data as being perfectly adequate. This means getting comfortable with the probabilistic nature of analytics—eschewing a search for an “objective truth” and, instead, viewing our role as “reducing uncertainty in the service of making decisions.” This requires a mindset shift on the part of analysts and a mindset shift on the part of our business counterparts. It’s no small feat, but it’s where I hope things go.
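
One small, hypothetical illustration of that probabilistic framing: rather than reporting a conversion rate as a single “true” number, an analyst can report an interval that makes the remaining uncertainty explicit. The Beta-posterior approach and the numbers below are illustrative choices (assuming scipy is available), not something prescribed in the interview.

```python
# Hypothetical sketch: reporting a conversion rate as a range rather than
# a single "objective truth." Uses a Bayesian Beta posterior with a
# uniform prior; the numbers are made up. Assumes scipy is installed.

from scipy import stats

conversions = 130
visitors = 4000

# Posterior for the conversion rate: Beta(1 + successes, 1 + failures)
posterior = stats.beta(1 + conversions, 1 + (visitors - conversions))

point_estimate = conversions / visitors
low, high = posterior.ppf([0.025, 0.975])  # 95% credible interval

print(f"Conversion rate: {point_estimate:.2%}")
print(f"95% credible interval: {low:.2%} to {high:.2%}")
# "Reducing uncertainty in the service of making decisions" then means
# collecting just enough data to narrow this interval for the decision at hand.
```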

Second, I hope we start realizing how easy it is to get caught up in the technical and engineering challenges of collecting and managing data, and that we start actively pushing back against those forces to focus on how we’re helping our business counterparts actually use the data we’re collecting. It’s always easier to gather and integrate more data or push out another dashboard than it is to roll up our sleeves, identify the biggest problems and challenges the business is facing, and then figure out the most effective and efficient ways that we can use data (analytics, experimentation, research) to drive the business forward.

These are, admittedly, pretty lofty aspirations, but it’s where I think we need to go if we don’t want to find ourselves becoming marginalized as simply chart-generating cost centers.’

Can you provide a sneak peek of what you’ll be discussing at the Summit?

‘You kind of teed me up for this with your last question! I’ll be diving into the idea that all data work can be divided into two discrete buckets: data collection and management work, and data usage work. I’ll make the case that, while it is easy to get seduced into thinking that there is inherent business value in data collection, there really isn’t. The collection and management of data only provides the potential for business value. To actually realize business value, we have to do things with the data, and it’s either naive or irresponsible (or both) to expect our business counterparts to shoulder the entire load for that.

I’ll dive into some of the powerful forces that push us (and our business counterparts) to think that there is business value in data collection itself, and then I will (briefly) provide a framework for putting data to meaningful use.’

Come check out Tim Wilson at the DDMA Digital Analytics Summit 2023 on October 12th. Tickets available at: digitalanalyticssummit.nl.