Net Promoter Score (NPS) burst onto the customer experience scene 15 years ago in a Harvard Business Review article with the confident (some might say overconfident) title “The One Number You Need to Grow.” NPS was introduced as the single survey question you need to ask your customers.
Unfortunately, I’ve seen many customer experience managers include NPS in their mystery shopping programs, which is frankly a poor research practice.
The NPS methodology is relatively simple. Ask customers a “would recommend” question, “How likely are you to recommend us to a friend, relative or colleague?” on an 11-point scale from 0-10.
Next, segment respondents according to their answers to this would recommend question. Respondents who answered “9” or “10” are labeled “promoters,” those who answered “7” or “8” are “passive referrers,” and those who answered 0-6 are “detractors.” Once this segmentation is complete, the Net Promoter Score (NPS) is calculated by subtracting the proportion of detractors from the proportion of promoters. This yields the net promoters: the proportion of promoters left after the detractors have been subtracted out.
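The segmentation and subtraction above amount to a few lines of arithmetic. Here is a minimal sketch (the function name and sample responses are my own illustration, not part of any official NPS tooling):

```python
def nps(responses):
    """Compute Net Promoter Score from 0-10 'would recommend' responses.

    Promoters answer 9-10, passives 7-8, detractors 0-6; NPS is the
    percentage of promoters minus the percentage of detractors.
    """
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return 100 * (promoters - detractors) / len(responses)

# Example: 4 promoters, 3 passives, 3 detractors out of 10 responses
scores = [10, 9, 9, 10, 8, 7, 7, 6, 5, 3]
print(nps(scores))  # 40% promoters - 30% detractors = 10.0
```

Note that the passives drop out of the calculation entirely; only the two extremes move the score.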
The theory behind NPS is simple: it serves as a proxy for customer loyalty. Loyalty is a behavior, and surveys best measure attitudes, not behaviors. Therefore customer experience researchers need a proxy measurement for loyalty. NPS is considered an excellent proxy under the theory that someone willing to put their reputation at risk by referring a brand to others is more likely to be loyal to the brand. In contrast, those who are not willing to put their reputation at risk are less likely to be loyal.
Fads in customer experience measurement come and go, but the NPS fad has been particularly stubborn, mostly because the theory behind it is intuitive, it solves the problem of measuring loyalty within a survey, and it is simple. I personally think it was oversold as the “one number you need to grow.” That billing doesn’t do justice to the complexities of managing the customer experience, nor does a single NPS number give any direction on how to improve it. An NPS score alone is just not very actionable.
While NPS is an excellent loyalty proxy and has a lot of utility in a customer experience survey, it is not an appropriate tool for mystery shopping. A mystery shop is a snapshot of one experience in time, in which a mystery shopper interacts with a representative of the brand. NPS, by contrast, measures one’s likelihood to refer the brand to others, and that likelihood is almost never the result of a single snapshot in time. Rather, it is a holistic measure of the health of the entire relationship with the brand, so it does not work well in a mystery shop context where the measurement covers a single interaction. In a mystery shop, NPS ends up measuring things unrelated to the specific experience shopped: past experiences, overall branding, alignment of the brand to customer expectations, and so on.
Now, I understand the intent of inserting NPS in the mystery shop. It is to identify a dependent variable from which to evaluate the efficacy of the experience. NPS is just the wrong solution for this objective.
There is a better way.
Instead of blindly using NPS in the wrong research context, focus on your business objectives. Ask yourself:
- What are our business objectives with respect to the experience mystery shopped?
- What do we want to accomplish?
- How do we want the customer to feel as a result of the experience?
- What do we want the customer to do as a result of the experience shopped?
Once you have determined what business objectives you want to achieve as a result of the customer experience, design a specific question to measure the influence of the customer experience on this business objective.
For example, assume your objective of the customer experience is purchase intent. You want the customer to be more motivated to purchase after the experience than before. Ask a purchase intent question, designed to capture the shopper’s change in purchase intent as a result of the shop.
Now, you have a true dependent variable from which to evaluate the behaviors measured in the mystery shop. This is what we call Key Driver Analysis – identifying the behaviors which are key drivers of the desired business objective. In the example above we want to identify key drivers of purchase intent.
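One common way to run a Key Driver Analysis is to correlate each observed behavior (pass/fail on the shop) with the dependent variable and rank behaviors by the strength of that relationship. The sketch below assumes that approach; the behavior names and shop data are hypothetical, and real programs may use regression or more sophisticated methods instead of simple correlation:

```python
from statistics import mean, pstdev

def key_drivers(behaviors, outcome):
    """Rank behaviors by their correlation with a desired outcome.

    behaviors: dict mapping behavior name -> list of 0/1 observations
    outcome:   list of outcome measurements (e.g., purchase-intent
               ratings), one per shop, aligned with the observations.
    Returns (behavior, Pearson correlation) pairs, strongest first.
    """
    def pearson(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
        return cov / (pstdev(xs) * pstdev(ys))

    return sorted(((name, round(pearson(obs, outcome), 2))
                   for name, obs in behaviors.items()),
                  key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical shop data: which behaviors were observed on each shop,
# plus the shopper's post-shop purchase-intent rating (0-10).
shops = {
    "greeted_customer": [1, 1, 0, 1, 0, 1],
    "offered_product":  [1, 0, 0, 1, 0, 1],
    "thanked_customer": [1, 1, 1, 1, 0, 1],
}
intent = [9, 6, 4, 8, 3, 9]
print(key_drivers(shops, intent))
```

Behaviors at the top of the ranking are the candidates for coaching emphasis and heavier scoring weight.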
I like to think of different question types and analytical techniques as tools in a tool box. Each is important for its specific purpose, but few are universal tools which work in every context. NPS may be a useful tool for customer experience surveys. It is not, however, an appropriate tool for mystery shopping.
Mystery shop programs measure human interactions; interactions with other humans and increasingly human interactions with automated machines. Given that humans are on one or both sides of the equation, it is not surprising that variation in the customer experience exists.
When designing a mystery shop program, a central decision is the number of shops to deploy. This decision depends on a number of factors, including desired reliability, the number of customer interactions, and the budgetary resources available for the program. However, one additional and very important consideration, which frankly doesn’t get much attention, is the amount of variation expected in the customer experience being measured.
The level of variation in the customer experience is an important consideration. Consistent customer experience processes require fewer mystery shops than those with a high degree of variation. To illustrate this, consider the following:
Assume a customer experience process is 100% consistent, with zero variation from experience to experience. Such a process would require only one shop to accurately describe the experience as a whole. Now consider a customer experience process at the opposite extreme, with maximum variation in the experience. Such a process would require far more than one shop. In fact, at maximum variation and a 95% confidence level, 400 shops would be required to achieve a margin of error of plus or minus five percent.
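The 400-shop figure comes from the standard sample-size formula for estimating a proportion, n = z²·p·(1−p)/E², evaluated at maximum variance (p = 0.5) with the conservative rounding z = 2 for 95% confidence. A quick check (my own sketch, not from the original article):

```python
import math

def shops_needed(margin_of_error, p=0.5, z=1.96):
    """Sample size for estimating a proportion: n = z^2 * p * (1-p) / E^2.

    p = 0.5 assumes maximum variation; z = 1.96 gives 95% confidence.
    """
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

print(shops_needed(0.05))         # 385 with the exact z = 1.96
print(shops_needed(0.05, z=2.0))  # 400 with the conservative z = 2
```

Widening the acceptable margin of error shrinks the requirement quickly: at plus or minus ten percent, the same formula calls for only 100 shops.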
Obviously, the variation of most customer experience processes resides somewhere between perfect consistency and maximum variation. So how do managers determine the level of variation in their process? The answer will probably be more qualitative than quantitative. Ask yourself:
- Do you have a set of standardized customer experience expectations?
- Are these expectations clearly communicated to employees?
- Other than mystery shopping, do you have any processes in place to monitor the customer experience? If so, are the results of these monitoring tools consistent from month-to-month or quarter-to-quarter?
To make it easy, I always ask new clients for a qualitative estimate of the level of variation in their customer experience: high, medium, or low. The answer is then considered along with the level of statistical reliability desired and the budgetary resources available for the program in determining the appropriate number of shops.
So ask yourself: how much variation can we expect in our customer experience?
Self-help resources typically take the form of a webpage housed on the mystery shop provider’s website or on an internal resource page. These resources provide a tutorial in the form of either a PowerPoint or video, reinforcing to stakeholders many of the subjects already discussed: definition of the brand, behavioral service expectations, and a copy of the questionnaire.
These self-help resources are also an excellent opportunity to introduce the mystery shop reports and how to read them (both on an individual shop basis and on an analytical level), and to introduce concepts designed to identify the relative importance of the specific sales and service behaviors that drive desired outcomes like purchase intent and customer loyalty.
Shop Results E-Mail
Upon distribution of the first shop, it is a best practice in launching a mystery shop program to send an e-mail to the supervisor of the employee shopped advising them of a completed shop, and containing either a PDF shop report or access to the shop via an online reporting tool.
The content of this e-mail should depend on the performance of the individuals shopped. If a shop is perfect, the e-mail should congratulate the employees on a perfect shop. If a shop is below expectations, it should inform the employees, in as positive a way as possible, that their performance was below expectations and set the stage for coaching. It should remind employees that it is not the performance of this first shop that counts, but subsequent improvement as a result of the shops.
The timing of shop e-mails varies: some clients prefer each shop to be sent as soon as it clears the provider’s quality control process, while others prefer shops to be held and released en masse at the end of a given shopping period (typically monthly). If e-mails are sent at the end of a period, this is an excellent opportunity to identify top performers who received perfect shops, both to recognize superior performance and to motivate other employees to seek similar achievement.
Finally, this e-mail should reinforce superior shop performance by reminding front-line employees and managers of the rewards earned by successful shop performance.
This e-mail should be modified for all subsequent waves of shopping and be used as a cover letter for distribution of all future shops.
Additional e-mails may be sent to notify employees and their managers of specific events, such as: perfect shops, failed shops, shops within a specific score range, or shops which identify a specific behavior of an employee like a cross-sell effort.
Post Shop Call/Presentation
Similar to the kickoff presentation, after the first wave of shopping it is a best practice to conduct a post-shop presentation, again by conference call or WebEx. The purpose of this presentation is to present the reports available, discuss how to read them, and, most importantly, take action on the results through coaching and through interpreting the call-to-action elements built into the program: elements designed to identify which behaviors are most important in driving purchase intent or loyalty.
There should be no surprises in mystery shopping. A key to keeping all stakeholders informed of the mystery shop process is pre-shop communication.
The first communication tool is the kickoff letter. This letter is most often in the form of an e-mail. Sent prior to shopping, its purpose is to introduce employees to the program, explain its purpose in a positive way, make sure employees are aware of what is expected of them, and link shopping to their best interests, by reinforcing it is designed to make them more successful.
The kickoff e-mail should:
- Define the brand and emphasize that frontline employees are the personification of the brand, its physical embodiment.
- Explain that certain behaviors are expected from them in their role as the physical embodiment of the brand.
- List the specific sales and service behaviors that shoppers are asked to observe. Stress that management wants every representative to score well. Management has no interest in setting employees up for failure. If they perform these behaviors, they will receive a perfect shop score.
- Detail the incentive and reward structures in place as a result of the mystery shop program.
A presentation, conference call, or WebEx is an excellent tool to kick off a mystery shop program. All stakeholders in the process should understand their role and what is expected of them.
As with the kickoff letter or e-mail, the presentation should define the brand, stress that employees are the physical embodiment of the brand, and identify the specific sales and service behaviors expected from employees.
It should identify the internal administrator of the program, communicate the dispute process, discuss incentives and rewards earned through positive mystery shops, as well as introduce the concept of coaching as a result of the shop – making sure that managers and customer-facing personnel understand their role in the coaching process.
Finally, this presentation should introduce employees to self-help resources available for taking positive action as a result of the shop.
Best in class mystery shop programs provide a central point of internal administration. A central administrator manages the relationship with the mystery shop provider and coordinates with other stakeholders (such as training and human resources).
This central point of administration requires a strong administrator to keep the brand focused and engaged, and to make sure that recalcitrant field managers are not able to undermine the program before it begins to realize its potential value.
A best practice in launching a mystery shop program is to identify, to all stakeholders, the main contact for internal administration, and how to communicate with them. Along with identifying the internal administrator, in most cases, it is a best practice to also identify the mystery shopping provider – just to keep employees comfortable with the measurement process. However, in some cases, such as instances where there has been a history of employees gaming the system, it may be more appropriate to keep the mystery shop provider anonymous.
Disputed shops are part of the mystery shop process. Mystery shops are just a snapshot in time, and they measure complicated service encounters. As a result, there may be extenuating circumstances that need to be addressed, or questions about the quality of the shopper’s performance, that require a fair and firm process to resolve.
The specifics of the dispute process should be aligned with the brand’s values and culture. Broadly, there are two ways to design a dispute process: arbitration and fixed number of challenges.
Arbitration: Most brands have a program manager or group of program managers act as an arbitrator of disputes, ordering reshops or adjusting points on an individual shop as they see fit. The arbiter of disputes must be both fair and firm; otherwise, employees and other managers will quickly start gaming the system, bogging the process down with frivolous disputes.
Fixed Number of Challenges: Other brands give each business unit (or store) a fixed number of challenges with which they can request an additional shop. Managers responsible for that business unit can request a reshop for any reason; however, once the fixed number of disputes is exhausted, they lose the ability to request a reshop. This approach is fair (each business unit has the same number of disputes), reduces the administrative burden on a centralized arbiter, and limits the potential for wholesale gaming of the system.
This is perhaps the most common question I’m asked by clients, old and new alike. There seems to be a common misconception among both clients and providers that some one number, say 90%, is a “good” mystery shop score. Beware of anyone who glibly throws out a specific number without any consideration of the context. Like most things in life, the answer is much more complex.
Most mystery shopping programs score shops according to some scoring methodology to distill the mystery shop results down into a single number. Scoring methodologies vary, but the most common methodology is to assign points earned for each behavior measured and divide the total points earned by the total points possible, yielding a percent of points earned relative to points possible.
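A minimal sketch of this percent-of-points methodology (the function and the sample point values are hypothetical illustrations):

```python
def shop_score(questions):
    """Score a shop as points earned divided by points possible.

    questions: list of (points_earned, points_possible) pairs, one per
    behavior measured on the shop. Returns a percentage.
    """
    earned = sum(e for e, _ in questions)
    possible = sum(p for _, p in questions)
    return round(100 * earned / possible, 1)

# A shop where the rep earned partial credit on one behavior
# and full marks on the other two.
print(shop_score([(10, 10), (9, 10), (5, 5)]))  # 24 / 25 -> 96.0
```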
It amazes me how many mystery shop providers I’ve heard pull a number out of the air, again say 90%, and quote it as the benchmark with no thought given to context. The reality is more complex: what constitutes a good score varies dramatically from client to client and program to program, based on the specifics of the evaluation. One program may be an easy evaluation measuring easy behaviors, where a score must be near perfect to be considered “good”; another may be a difficult evaluation measuring more difficult behaviors, where a good score will be well below perfect. The best practice in determining what constitutes a good mystery shop score is to consider the distribution of your shop scores as a whole, determine the percentile rank of each shop (the proportion of shops that fall below a given score), and set an appropriate cutoff point. For example, if management decides the 60th percentile is an appropriate standard (6 out of 10 shops fall below it), and a shop score of 86% is in the 60th percentile, then 86% is a “good” shop score.
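The percentile-rank approach can be sketched in a few lines (the ten-score history below is invented purely to mirror the 86%-at-the-60th-percentile example):

```python
def percentile_rank(scores, score):
    """Proportion of shops scoring below a given score, as a percentile."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical distribution of ten shop scores for a program
history = [70, 74, 78, 80, 82, 84, 86, 90, 94, 98]
print(percentile_rank(history, 86))  # 60.0 -> 86% sits at the 60th percentile
```

The same 86% could land at a very different percentile in another program's distribution, which is exactly why no single number travels across programs.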
Again, context is key. What constitutes a good score varies dramatically from client to client and program to program, based on the specifics of the evaluation. Discount the advice of anyone in the industry who glibly throws out a number as a good score without considering the context.
Drive Desired Behaviors
Some behaviors are more important than others. As a result, best in class mystery shop programs weight behaviors by assigning more points possible to those deemed more important. Best practices in mystery shop weighting assign weights either according to management standards (behaviors deemed more important, such as certain sales or customer education behaviors) or according to the strength of a behavior’s relationship to a desired outcome such as purchase intent or loyalty. Service behaviors with stronger relationships to the desired outcome receive greater weight.
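One simple way to turn importance into weights is to allocate points possible in proportion to each behavior's importance, however that importance was established. A sketch under that assumption (the behavior names and importance values are hypothetical):

```python
def weighted_points(importance, total_points=100):
    """Allocate points possible to behaviors in proportion to importance.

    importance: dict of behavior -> importance weight (e.g., key-driver
    strength). Returns points possible per behavior, summing to roughly
    total_points.
    """
    total = sum(importance.values())
    return {behavior: round(total_points * w / total)
            for behavior, w in importance.items()}

# Hypothetical importance weights for three behaviors
print(weighted_points({"greeting": 0.2,
                       "needs_assessment": 0.5,
                       "close_attempt": 0.3}))
```

The more important behavior simply carries more of the shop's total points, so failing it costs more of the final score.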
One tool to identify behavioral relationships to desired outcomes is Key Driver Analysis. See the attached post for a discussion of Key Driver Analysis.
Don’t Average Averages
It is a best practice in mystery shopping to calculate the score for each business unit independently (employee, store, region, division, corporate), rather than averaging business unit scores together (such as calculating a region’s score by averaging the scores of its individual stores, or even its individual shops). Averaging averages yields a mathematically correct score only if all shops have exactly the same points possible and all business units have exactly the same number of shops. However, if the shop has any skip logic, where some questions are answered only if specific conditions exist, different shops will have different points possible, and it is a mistake to average them together: doing so gives shops with skipped questions disproportionate weight. Rather, points earned should be divided by points possible for each business unit independently. Just remember: don’t average averages!
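A small worked example shows the distortion. The two shops below are invented: shop A hit skip logic and had only 20 points possible, while shop B had the full 50.

```python
# Two shops in one region, as (points earned, points possible) pairs.
shop_a = (18, 20)   # skip logic trimmed this shop -> 18/20 = 90%
shop_b = (40, 50)   # full questionnaire        -> 40/50 = 80%

# Wrong: averaging the two shop percentages gives the short shop
# equal weight despite it covering fewer behaviors.
wrong = (18 / 20 + 40 / 50) / 2 * 100

# Right: pool points earned and points possible for the region,
# then divide once.
right = (18 + 40) / (20 + 50) * 100

print(round(wrong, 1), round(right, 1))  # 85.0 vs 82.9
```

The two-point gap here is modest, but across hundreds of shops with varied skip logic the averaged-averages figure can drift well away from the true percent of points earned.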
Work Toward a Distribution of Shops
When all is said and done, a best in class mystery shop scoring methodology will produce a distribution of shop scores, with meaningful differentiation particularly at the low end of the distribution.
Mystery shop programs with tight distributions around the average shop score offer little opportunity to identify areas for improvement: all the shops end up very similar to each other, making it difficult to identify problem areas and improve employee behaviors. Distributions with scores spread toward the low end make it much easier to identify poor shops and offer opportunities for improvement via employee coaching. If questionnaire design and scoring create tightly clustered scores, consider a redesign.
Most mystery shopping programs score shops according to some scoring methodology. In designing that methodology, best in class programs focus on driving desired behaviors, do not average averages, and work toward a distribution of shop scores.