Part 1: A better way to think about “Minimum Detectable Effect”
The Minimum Detectable Effect or MDE has always baffled me, and not because it’s a difficult concept to understand but rather because most people working in the field of experimentation have been using it counter-intuitively for so long.
For the record, I don’t think any of what I’m about to write is groundbreaking, but having worked in this field for many years, I’ve not seen anyone talk about it so I thought I would. Who knows, maybe you’ll disagree with me.
How are people currently using MDE?
The MDE is one of several inputs used to calculate the sample size required for an experiment along with:
– α: the selected level of significance
– β: the type II error rate (the selected power is 1 − β)
– σ: the standard deviation
– μ1 or p1: the baseline mean or proportion
– μ2 or p2: the proposed/expected new mean or proportion (this is where our MDE currently sits)
– r: ratio of groups (usually 1 assuming equal sample size)
– ni: sample size per variant (the value being solved for)
To calculate μ2 or p2, experiment owners are expected to estimate some percentage uplift they think a test or feature will drive — usually with very little reasoning. All of these values are then entered into one of the two formulas below:
Sample size for means (AOV, ARPU, Order Frequency)
Sample size for proportions (Conversion rate, sign up rate)
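For reference, the textbook two-sample forms of these formulas are sketched below. They assume a two-sided test, a normal approximation, a common standard deviation σ for means, unpooled variances for proportions, and r defined as n2/n1. Here z with subscript 1 − β is the normal quantile for the chosen power. The calculator you use may differ in the details.

```latex
% Per-variant sample size for a difference in means (n_2 = r n_1)
n_1 = \left(1 + \frac{1}{r}\right)
      \frac{\sigma^2 \left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{(\mu_2 - \mu_1)^2}

% Per-variant sample size for a difference in proportions (n_2 = r n_1)
n_1 = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2
            \left[\,p_1(1 - p_1) + \dfrac{p_2(1 - p_2)}{r}\right]}{(p_2 - p_1)^2}
```

In both cases the effect size, μ2 − μ1 or p2 − p1, sits in the denominator, which is why small effects demand very large samples.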
Whether you do this “by hand” or using an online calculator, you enter the inputs and hey presto, you have your sample size.
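As a rough illustration of what such a calculator does under the hood, here is a minimal Python sketch. It assumes a two-sided test, a normal approximation and unpooled variances for proportions. The function names and defaults (α = 0.05, power = 0.80) are my own choices for illustration, not taken from any particular tool.

```python
from math import ceil
from statistics import NormalDist


def z_total(alpha=0.05, power=0.80):
    """Sum of the normal quantiles for a two-sided test at `alpha` and the given power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def sample_size_means(mu1, mu2, sigma, alpha=0.05, power=0.80, r=1.0):
    """Per-variant sample size to detect a shift from mu1 to mu2 (r = n2 / n1)."""
    z = z_total(alpha, power)
    return ceil((1 + 1 / r) * sigma ** 2 * z ** 2 / (mu2 - mu1) ** 2)


def sample_size_proportions(p1, p2, alpha=0.05, power=0.80, r=1.0):
    """Per-variant sample size to detect a change from p1 to p2 (r = n2 / n1)."""
    z = z_total(alpha, power)
    variance = p1 * (1 - p1) + p2 * (1 - p2) / r
    return ceil(z ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical example: 10% baseline conversion, hoping for 12%.
print(sample_size_proportions(0.10, 0.12))
# Hypothetical example: average order value of 50 rising to 52, sigma of 20.
print(sample_size_means(50, 52, 20))
```

Notice that the effect you hope for (p2 or μ2) has to be supplied before anything else can be computed.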
This is where I’ve found the use of MDE to be counter-intuitive. Most people can make a fairly accurate estimate of how many visitors they’re likely to get in a 1, 2, 3 … n week period, but what they don’t usually know is the effect size they can measure with the visitors available to them. This results in:
a) people playing a game of trial and error with MDE until the value they’ve entered spits out a sample size that matches their available traffic over a time period they deem to be acceptable.
b) people realising that their estimated effect size is too high (returning a very small sample size) or too low (returning a sample size that’s too large), at which point they jig things around anyway.
So, if there is a minimum period of time (e.g. 1 week) and some maximum period of time (e.g. 4 weeks) that you’d want to run a test for, why do the MDE dance? Why not just treat MDE as a function of sample size and calculate what it would be for 1, 2, 3, 4 … n weeks? In other words, move MDE from an input variable to an output variable. Once you know that, you can make a better decision on how long you want to run the test. Or, if you’re an analyst and you’re doing these calculations, you can provide a range to the experiment owner and show them what’s achievable with the traffic available to them. It will also give them a steer on how ambitious their tests can afford to be. Let’s face it, most of us don’t have Facebook’s level of traffic, so knowing what minimum effect size we can detect is critical for test planning.
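To make the idea concrete, here is a hedged Python sketch of MDE as an output. It assumes the textbook two-sided, normal-approximation formulas with unpooled variances. The function names and the traffic figures are hypothetical. For means the formula rearranges cleanly. For proportions p2 appears in both the variance and the effect, so the sketch searches for it numerically rather than rearranging.

```python
from math import sqrt
from statistics import NormalDist


def z_total(alpha=0.05, power=0.80):
    """Sum of the normal quantiles for a two-sided test at `alpha` and the given power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def mde_means(n_per_variant, sigma, alpha=0.05, power=0.80, r=1.0):
    """Absolute MDE for a difference in means (closed form, r = n2 / n1)."""
    return z_total(alpha, power) * sigma * sqrt((1 + 1 / r) / n_per_variant)


def mde_proportions(n_per_variant, p1, alpha=0.05, power=0.80, r=1.0):
    """Absolute MDE (uplift only) for a proportion, found by bisection on p2."""
    z2 = z_total(alpha, power) ** 2

    def required_n(p2):
        # Sample size needed to detect a move from p1 to p2.
        return z2 * (p1 * (1 - p1) + p2 * (1 - p2) / r) / (p2 - p1) ** 2

    if required_n(1.0) > n_per_variant:
        raise ValueError("sample too small to detect any uplift from this baseline")
    lo, hi = p1, 1.0  # required_n falls as p2 moves away from p1
    while hi - lo > 1e-12:
        mid = (lo + hi) / 2
        if required_n(mid) > n_per_variant:
            lo = mid  # effect too small for this sample; look higher
        else:
            hi = mid
    return hi - p1


# Hypothetical inputs: 5,000 visitors per variant per week, 10% baseline conversion.
weekly_n, baseline = 5_000, 0.10
for weeks in range(1, 5):
    mde = mde_proportions(weeks * weekly_n, baseline)
    print(f"{weeks} week(s): absolute MDE {mde:.4f}, relative {mde / baseline:.1%}")
```

The loop produces the kind of range an analyst could hand to an experiment owner: one detectable effect per candidate test duration.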
So why do the formulas above require us to input an MDE?
I don’t know for certain, but based on the countless clinical trial papers I’ve read, I would say it’s because most of the mathematical formulas we use were born out of a need to understand the impact of medical interventions on patients. Trials needed a robust and ethical way to ensure the right number of patients took part, and researchers could draw on prior research and past trials to specify a meaningful effect size up front. That probably had something to do with it.
Luckily for us, we carry none of that burden: we don’t have to worry much about ethics and we can “enrol patients” by the thousands at the flick of a switch. What we don’t have is meaningful lab research or past trials to make a reliable estimate of the effect size. But knowing our traffic volumes means we can simply rearrange the formula to make MDE the subject, then make an educated guess on whether or not the output is achievable with our “intervention”.
In Part 2 I’ll be sharing what our newly arranged formula looks like and, if that’s going to give you a headache (which it probably will), a little trick in Excel to do the rearranging for you.