What is a Good Mystery Shop Score?
This is perhaps the most common question I’m asked by clients old and new alike. There seems to be a common misconception, among both clients and providers, that any one number, say 90%, is a “good” mystery shop score. Beware of anyone who throws out a specific number without any consideration of the context. They are either ignorant, glib or both. Like most things in life, the answer to this question is more complex.
Most mystery shopping programs apply a scoring methodology to distill the results of each shop down into a single number. Methodologies vary, but the most common is to assign points to each behavior measured and divide the total points earned by the total points possible, yielding the percentage of possible points earned.
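As a rough illustration of that points-earned-over-points-possible calculation, here is a minimal Python sketch. The function name `shop_score` and the point values in the example are my own assumptions for illustration; they are not taken from any particular program's scoring sheet.

```python
def shop_score(behaviors):
    """Return a shop score as a percentage of points possible.

    `behaviors` is a list of (points_earned, points_possible) pairs,
    one pair per behavior measured on the shop. This layout is a
    hypothetical one chosen for the example.
    """
    total_earned = sum(earned for earned, _ in behaviors)
    total_possible = sum(possible for _, possible in behaviors)
    if total_possible == 0:
        raise ValueError("a shop must have at least one point possible")
    return 100.0 * total_earned / total_possible


# Made-up example: three behaviors worth 10, 10 and 5 points.
example_shop = [(10, 10), (5, 10), (0, 5)]
print(shop_score(example_shop))  # 15 of 25 points earned
```

Note that the weights themselves are a design choice: giving a behavior more points possible makes it count for more in the final percentage.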
It amazes me how many mystery shop providers I’ve heard pull a number out of the air, again say 90%, and quote that as the benchmark with no thought given to the context of the question. The reality is more complex. Context is key. What constitutes a good score varies dramatically from client to client and program to program, and depends on the specifics of the evaluation. One program may be an easy evaluation, measuring easy behaviors, where a score must be near perfect to be considered “good”. Another may be a difficult evaluation measuring more difficult behaviors, in which case a good score will be well below perfect.
The best practice in determining what constitutes a good mystery shop score is to consider the distribution of your shop scores as a whole, determine the percentile rank of each shop (the proportion of shops that fall below a given score), and set an appropriate cutoff point. For example, if management decides the 60th percentile is an appropriate standard (6 out of 10 shops are below it), and a shop score of 86% falls at the 60th percentile, then a shop score of 86% is a “good” shop score.
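The percentile approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `percentile_rank` and `is_good_score` are hypothetical, the ten scores are invented to mirror the 86% example, and percentile rank is taken to mean the share of shops scoring strictly below a given score, as defined above.

```python
def percentile_rank(all_scores, score):
    """Percent of shops in `all_scores` that fall strictly below `score`."""
    if not all_scores:
        raise ValueError("need at least one shop score")
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def is_good_score(all_scores, score, cutoff_percentile=60):
    """True if `score` sits at or above the cutoff chosen by management."""
    return percentile_rank(all_scores, score) >= cutoff_percentile


# Invented scores for ten shops in one program (percent of points earned).
program_scores = [70, 75, 78, 80, 82, 84, 86, 90, 93, 97]

print(percentile_rank(program_scores, 86))  # 6 of 10 shops are below 86
print(is_good_score(program_scores, 86))    # meets a 60th percentile cutoff
print(is_good_score(program_scores, 84))    # only 5 of 10 shops are below 84
```

Run the same functions on a program with easier behaviors and higher scores overall, and the score that clears the 60th percentile will be higher, which is the whole point: the cutoff comes from your own distribution, not from a number pulled out of the air.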
Again, context is key. What constitutes a good score varies dramatically from client to client and program to program, and depends on the specifics of the evaluation. Discount the advice of anyone in the industry who throws out a number and calls it a good score without considering the context.