Tag Archive | Best Practices in Mystery Shop Scoring

Best Practices in Mystery Shop Program Launch: Program Administration

In a previous post we introduced the importance of proper program launch.

Best in class mystery shop programs provide a central point of internal administration. A central administrator manages the relationship with the mystery shop provider and coordinates with other stakeholders (such as training and human resources).

This central point of administration requires a strong administrator to keep the brand focused and engaged, and to make sure that recalcitrant field managers are not able to undermine the program before it begins to realize its potential value.

A best practice in launching a mystery shop program is to identify, to all stakeholders, the main contact for internal administration, and how to communicate with them. Along with identifying the internal administrator, in most cases, it is a best practice to also identify the mystery shopping provider – just to keep employees comfortable with the measurement process. However, in some cases, such as instances where there has been a history of employees gaming the system, it may be more appropriate to keep the mystery shop provider anonymous.

Disputed shops are part of the mystery shop process. Mystery shops are just a snapshot in time, and they measure complicated service encounters. As a result, there may be extenuating circumstances that need to be addressed, or questions about the quality of the shopper’s performance, that require both a fair and firm process to resolve.

The specifics of the dispute process should be aligned with the brand’s values and culture. Broadly, there are two ways to design a dispute process: arbitration or a fixed number of challenges.

Arbitration: Most brands have a program manager or group of program managers act as an arbitrator of disputes, ordering reshops or adjusting points on an individual shop as they see fit. The arbiter of disputes must be both fair and firm; otherwise, employees and other managers will quickly start gaming the system, bogging the process down with frivolous disputes.

Fixed Number of Challenges: Other brands give each business unit (or store) a fixed number of challenges with which they can request an additional shop. Managers responsible for that business unit can request a reshop for any reason; however, when the fixed number of disputes is exhausted, they lose the ability to request a reshop. This approach is fair (each business unit has the same number of disputes), it reduces the administrative burden on a centralized arbiter, and it limits the potential for widespread gaming of the system because the number of disputes is capped.

In a subsequent post we will discuss the importance of building and communicating call-to-action elements into a mystery shop program.

What is a Good Mystery Shop Score?

This is perhaps the most common question I’m asked by clients old and new alike.  There seems to be a common misconception among both clients and providers that any one number, say 90%, is a “good” mystery shop score.  Beware of anyone who glibly throws out a specific number without any consideration of the context.  They are either ignorant, glib, or both.  Like most things in life, the answer to this question is much more complex.

Most mystery shopping programs score shops according to some scoring methodology to distill the mystery shop results down into a single number.  Scoring methodologies vary, but the most common methodology is to assign points earned for each behavior measured and divide the total points earned by the total points possible, yielding a percent of points earned relative to points possible.

It amazes me how many mystery shop providers I’ve heard pull a number out of the air, again say 90%, and quote it as the benchmark with no thought given to the context of the question.  The fact of the matter is much more complex.  Context is key.  What constitutes a good score varies dramatically from client to client and program to program, based on the specifics of the evaluation.  One program may be an easy evaluation, measuring easy behaviors, where a score must be near perfect to be considered “good”; others may be difficult evaluations measuring more difficult behaviors, in which case a good score will be well below perfect.  The best practice in determining what constitutes a good mystery shop score is to consider the distribution of your shop scores as a whole, determine the percentile rank of each shop (the proportion of shops that fall below a given score), and set an appropriate cutoff point.  For example, if management decides the 60th percentile is an appropriate standard (6 out of 10 shops fall below it), and a shop score of 86% is in the 60th percentile, then a shop score of 86% is a “good” shop score.
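As a rough illustration of this percentile-rank approach, here is a minimal Python sketch.  The shop scores and the 60th-percentile standard below are hypothetical.

# Percentile-rank approach to defining a "good" shop score.
# The scores and the 60th-percentile standard are hypothetical.

def percentile_rank(score, all_scores):
    """Proportion of shops scoring below the given score."""
    below = sum(1 for s in all_scores if s < score)
    return below / len(all_scores)

# Hypothetical distribution of shop scores (percent of points earned).
shop_scores = [72, 76, 78, 81, 83, 85, 86, 88, 90, 95]

standard = 0.60  # management's chosen percentile standard

for score in shop_scores:
    rank = percentile_rank(score, shop_scores)
    label = "good" if rank >= standard else "needs improvement"
    print(f"Shop score {score}%: percentile rank {rank:.0%} -> {label}")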

[Figure: bell curve of shop scores]

Again, context is key.  What constitutes a good score varies dramatically from client to client and program to program, based on the specifics of the evaluation.  Discount the advice of anyone in the industry who glibly throws out a number and calls it a good score without considering the context.

Click Here for Mystery Shopping Best Practices

Best Practices in Mystery Shop Scoring

Most mystery shopping programs score shops according to some scoring methodology to distill the mystery shop results down into a single number.  Scoring methodologies vary, but the most common methodology is to assign points earned for each behavior measured and divide the total points earned by the total points possible, yielding a percentage of points earned relative to points possible.

Drive Desired Behaviors

Some behaviors are more important than others.  As a result, best in class mystery shop programs weight behaviors by assigning more points possible to those deemed more important.  Best practices in mystery shop weighting begin by assigning weights according to management standards (behaviors deemed more important, such as certain sales or customer education behaviors), or according to the strength of their relationship to a desired outcome such as purchase intent or loyalty.  Service behaviors with stronger relationships to the desired outcome receive stronger weight.
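To illustrate, here is a minimal Python sketch of weighted scoring for a single shop.  The behavior names and point values are hypothetical.

# Weighted shop scoring: more important behaviors carry more points possible.
# The behaviors and point values are hypothetical illustrations.

# (points_possible, observed) for each behavior on a single shop.
behaviors = {
    "greeted_promptly":    (5,  True),
    "uncovered_needs":     (15, True),    # weighted heavily in this example
    "recommended_product": (15, False),
    "thanked_customer":    (5,  True),
}

points_possible = sum(p for p, _ in behaviors.values())
points_earned = sum(p for p, observed in behaviors.values() if observed)

score = points_earned / points_possible
print(f"Shop score: {points_earned}/{points_possible} = {score:.1%}")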

One tool to identify behavioral relationships to desired outcomes is Key Driver Analysis.  See the attached post for a discussion of Key Driver Analysis.

Don’t Average Averages

It is a best practice in mystery shopping to calculate the score for each business unit independently (employee, store, region, division, corporate), rather than averaging business unit scores together (such as calculating a region’s score by averaging the individual stores or even shop scores for the region).  Averaging averages will only yield a mathematically correct score if all shops have exactly the same points possible, and if all business units have exactly the same number of shops.  However, if the shop has any skip logic, where some questions are only answered if specific conditions exist, different shops will have different points possible, and it is a mistake to average them together.  Averaging them together gives shops with skipped questions disproportionate weight.  Rather, points earned should be divided by points possible for each business unit independently.   Just remember – don’t average averages!
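A small Python sketch makes the pitfall concrete.  The figures are hypothetical, with one shop shortened by skip logic.

# Why averaging shop-level percentages misleads when skip logic changes
# points possible. Figures are hypothetical.

# Each shop: (points_earned, points_possible)
region_shops = [
    (18, 20),   # short shop: several questions skipped
    (45, 80),   # full shop
]

# Wrong: averaging each shop's percentage gives the short shop
# the same weight as the full shop.
average_of_averages = sum(e / p for e, p in region_shops) / len(region_shops)

# Right: pool points earned and points possible for the business unit,
# then divide once.
pooled = sum(e for e, _ in region_shops) / sum(p for _, p in region_shops)

print(f"Average of averages: {average_of_averages:.1%}")  # 73.1%
print(f"Pooled region score: {pooled:.1%}")               # 63.0%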

Work Toward a Distribution of Shops

When all is said and done, a best in class mystery shop scoring methodology will produce a distribution of shop scores with meaningful spread, particularly on the low end of the distribution.

[Figure: distribution of shop scores]

Mystery shop programs with tight distributions around the average shop score offer little opportunity to identify areas for improvement.  All the shops end up being very similar to each other, making it difficult to identify problem areas and improve employee behaviors.  Distributions with scores skewed to the low end make it much easier to identify poor shops and offer opportunities for improvement via employee coaching.  If questionnaire design and scoring create scores with tight distributions, consider a redesign.
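One way to sanity-check a program for an overly tight distribution is a quick spread diagnostic like the Python sketch below.  The scores and thresholds are hypothetical rules of thumb, not industry standards.

# Quick diagnostic for a "too tight" score distribution.
# The scores and the thresholds used are hypothetical rules of thumb.
from statistics import mean, stdev

shop_scores = [88, 89, 90, 90, 91, 91, 92, 92, 93, 94]  # hypothetical

avg = mean(shop_scores)
spread = stdev(shop_scores)
share_near_avg = sum(1 for s in shop_scores if abs(s - avg) <= 5) / len(shop_scores)

print(f"Mean {avg:.1f}, std dev {spread:.1f}, {share_near_avg:.0%} of shops within 5 points of the mean")

# If nearly all shops cluster within a few points of the mean, the questionnaire
# and scoring offer little room to differentiate performance, which is a signal
# to consider a redesign.
if spread < 5 and share_near_avg > 0.9:
    print("Tight distribution: consider redesigning the questionnaire/scoring.")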

Most mystery shopping programs score shops according to some scoring methodology.  In designing a mystery shop scoring methodology, best in class programs focus on driving desired behaviors, do not average averages, and work toward a distribution of shops.

Click Here for Mystery Shopping Best Practices

Mystery Shop Best Practices Conclusion

Previously we examined best practices in mystery shop program launch.

Plan for Change

Finally, given that mystery shopping measures employee behaviors against service standards, it is a best practice to calibrate and align those service standards with customer expectations.  This is achieved by maintaining a feedback loop: customer expectations uncovered through customer surveys inform updates to the service standards, and mystery shopping then measures and reinforces the updated standards.  Such an informed feedback loop between customer surveys and mystery shopping ensures the behaviors measured stay aligned with customer expectations.

Even a well-designed and well-administered mystery shopping program requires periodic adjustment. Performance scores eventually flatten out or cluster together, diminishing the value of the program as a tool for rewarding top performers and continuously improving quality. Periodic reviews should be built into the program design so it stays relevant and useful, and so the bar can be repeatedly raised on service quality and employee performance.

Provider Selection

Truth be told, mystery shop data collection is largely a commodity: all mystery shop providers have access to the same pool of shoppers and use similar technology to collect shop data.  The source of differentiation is the extent to which a provider can help take meaningful action on the results.

Hire a provider that can be a partner. Large companies often employ an excruciating bidding process that rarely identifies the best vendor for their needs. They issue lengthy RFPs for mystery shopping that are meant to weed out the weakest contenders, but by asking bidders to commit to overly detailed and inappropriate specifications, they effectively eliminate more sophisticated companies at the same time. The typical RFP process creates an environment in which mystery shopping vendors over-promise in order to make the first cut, thus setting themselves up for failure if they win the account. In addition, it treats mystery shopping research as a commodity, regarding it as a bulk purchase of data rather than a high-value quality improvement tool. Companies have more success when they research the market carefully and identify the providers that have the knowledge and commitment to help them build a truly valuable program.

Conclusion

It is the employees who animate the brand, and it is imperative that employee sales and service behaviors be aligned with the brand promise.  Actions speak louder than words.  Brands spend millions of dollars on external messaging to define an emotional connection with the customer.  However, when a customer perceives a disconnect between an employee representing the brand and that external messaging, they will almost certainly experience brand ambiguity.  The result severely undermines these investments, not only for the customer in question but for their entire social network.  In today’s increasingly connected world, one bad experience can be shared hundreds if not thousands of times over.  Mystery shopping is an excellent tool for aligning sales and service behaviors with the brand.

Mystery shopping programs, when administered in accordance with certain mystery shopping best practices, identify the sales and service behaviors that matter most – those which drive purchase intent and customer loyalty.

Click Here for Mystery Shopping Best Practices

Taking Action on Mystery Shop Results

Previously we examined best practices in mystery shop sample planning.

Call to Action Analysis

A best practice in mystery shop design is to build in call to action elements designed to identify key sales and service behaviors which correlate to a desired customer experience outcome.  This Key Driver Analysis determines the relationship between specific behaviors and a desired outcome.  For most brands and industries, the desired outcomes are purchase intent or return intent (customer loyalty).  This approach helps brands identify and reinforce sales and service behaviors which drive sales or loyalty – behaviors that matter.

[Figure: key driver analysis]


Earlier we suggested that anticipating the analysis in questionnaire design is a mystery shop best practice.  Here is how the three main design elements discussed provide input into the call to action analysis.

How: Shoppers are asked how, had they been an actual customer, the experience would have influenced their return intent.  Cross-tabulating responses by positive and negative return intent identifies how the responses of mystery shoppers who reported a positive influence on return intent vary from those who reported a negative influence.  This yields a ranking of the importance of each behavior by the strength of its relationship to return intent.

Why: In addition, paired with this rating is a follow-up question asking why the shopper rated their return intent as they did.  The responses are grouped into similar themes and cross-tabulated by the return intent rating described above.  This analysis produces a qualitative determination of which sales and service practices drive return intent.

What: The final step in the analysis is identifying which behaviors have the highest potential ROI in terms of driving return intent.  This is achieved by comparing the importance of each behavior (as defined above) with its performance (the frequency with which it is observed).  Mapping this comparison in a quadrant chart, like the one below, identifies behaviors with relatively high importance and low performance, which offer the highest potential ROI in terms of driving return intent.

[Figure: importance vs. performance quadrant chart]
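As a rough sketch of the importance/performance mapping described above, the Python below classifies behaviors into quadrants.  The behavior names, importance gaps, and performance rates are illustrative only.

# Importance vs. performance mapping for call-to-action analysis.
# Behavior names and figures are hypothetical.

# importance: gap in how often a behavior was observed on shops reporting a
# positive vs. a negative influence on return intent (the cross-tab result).
# performance: how often the behavior was observed overall.
behaviors = {
    #                      importance  performance
    "greeted_promptly":    (0.10,      0.95),
    "uncovered_needs":     (0.45,      0.40),
    "explained_benefits":  (0.35,      0.55),
    "offered_follow_up":   (0.30,      0.85),
}

imp_cut = sum(i for i, _ in behaviors.values()) / len(behaviors)
perf_cut = sum(p for _, p in behaviors.values()) / len(behaviors)

for name, (importance, performance) in behaviors.items():
    if importance >= imp_cut and performance < perf_cut:
        quadrant = "high importance / low performance: highest potential ROI"
    elif importance >= imp_cut:
        quadrant = "high importance / high performance: maintain"
    elif performance < perf_cut:
        quadrant = "low importance / low performance: monitor"
    else:
        quadrant = "low importance / high performance: possible over-investment"
    print(f"{name}: {quadrant}")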

 

This analysis helps brands focus training, coaching, incentives, and other motivational tools directly on the sales and service behaviors that will produce the largest return on investment – behaviors that matter.

Taking Action

Part of Balanced Scorecard

A best practice in mystery shopping is to integrate customer experience metrics from both sides of the brand-customer interface as part of an incentive plan.  The exact nature of the compensation plan should depend on broader company culture and objectives.  In our experience, a best practice is a balanced scorecard approach which incorporates customer experience metrics along with financial, internal business process (cycle time, productivity, employee satisfaction, etc.), and innovation and learning metrics.

Within these four broad categories of measurement, Kinēsis recommends managers select the specific metrics (such as ROI, mystery shop scores, customer satisfaction, and cycle time) that will best measure performance relative to company goals. Discipline should be used, however; too many metrics can be difficult to absorb. Rather, a few metrics of key significance to the organization should be collected and tracked in a balanced scorecard.
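For illustration only, here is a minimal Python sketch of a balanced scorecard with a handful of metrics per category.  The metrics, targets, and weights are hypothetical and should reflect each organization's own goals.

# A minimal balanced scorecard sketch: a few metrics per category, each
# expressed as attainment against target and weighted into a composite.
# Metrics, actuals, targets, and weights are hypothetical.

scorecard = {
    # category:            [(metric, actual, target, weight), ...]
    "financial":           [("roi", 0.12, 0.15, 0.30)],
    "customer":            [("mystery_shop_score", 0.86, 0.90, 0.25),
                            ("customer_satisfaction", 0.82, 0.85, 0.15)],
    "internal_process":    [("cycle_time_index", 0.78, 0.80, 0.15)],
    "innovation_learning": [("training_completion", 0.92, 0.95, 0.15)],
}

composite = 0.0
for category, metrics in scorecard.items():
    for name, actual, target, weight in metrics:
        attainment = min(actual / target, 1.0)   # cap at 100% of target
        composite += weight * attainment
        print(f"{category}/{name}: {attainment:.0%} of target")

print(f"Composite scorecard score: {composite:.0%}")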

Coaching

Best in class mystery shop programs identify employees in need of coaching.  Event-triggered reports should identify employees who failed to perform targeted behaviors.  For example, if it is important for a brand to track cross- and up-selling attempts in a mystery shop, a Coaching Report should be designed to flag any employees who failed to cross- or up-sell.  Managers simply consult this report to identify which employees are in need of coaching with respect to these key behaviors – behaviors that matter.
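As a simple illustration of such an event-triggered report, the Python sketch below flags employees whose shops missed the targeted cross-sell or up-sell behaviors.  The data and field names are hypothetical.

# Event-triggered coaching report: flag employees whose shops missed a
# targeted behavior (cross-sell / up-sell here). Data is hypothetical.

shops = [
    {"employee": "A. Smith", "cross_sell_attempted": True,  "up_sell_attempted": False},
    {"employee": "B. Jones", "cross_sell_attempted": False, "up_sell_attempted": False},
    {"employee": "C. Lee",   "cross_sell_attempted": True,  "up_sell_attempted": True},
]

coaching_report = [
    shop["employee"]
    for shop in shops
    if not (shop["cross_sell_attempted"] and shop["up_sell_attempted"])
]

print("Employees flagged for coaching on cross-/up-selling:")
for employee in coaching_report:
    print(f" - {employee}")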

Click here for the next installment in this series: Mystery Shop Program Launch.

Click Here for Mystery Shopping Best Practices

Best Practices in Mystery Shop Scoring

Previously we examined the process of mystery shop questionnaire design.

Most mystery shopping programs score shops according to some scoring methodology to distill the mystery shop results down into a single number.

Scoring methodologies vary, but the most common methodology is to assign points earned for each behavior measured and divide the total points earned by the total points possible, yielding a percent of points earned relative to points possible.  It is a best practice in mystery shopping to calculate the score for each business unit independently (employee, store, region, division, corporate).

Not all Behaviors are Equal

Some behaviors are more important than others.  As a result, best in class mystery shop programs weight behaviors by assigning more points possible to those deemed more important.  Best practices in mystery shop weighting begin by assigning weights according to management standards (behaviors deemed more important, such as certain sales or customer education behaviors), or according to their importance to a desired outcome such as purchase intent or loyalty.  Service behaviors with stronger relationships to the desired outcome, identified through Key Driver Analysis, receive stronger weight.  Again, see the subsequent discussion of Key Driver Analysis.

Don’t Average Averages!

It is a mistake to calculate business unit scores by averaging unit scores together (such as calculating a region’s score by averaging the individual stores or even shop scores for the region).  This will only yield a mathematically correct score if all shops have exactly the same points possible, and if all business units have exactly the same number of shops.  However, if the shop has any skip logic, where some questions are only answered if specific conditions exist, different shops will have different points possible, and it is a mistake to average them together.  Averaging them together gives shops with skipped questions disproportionate weight.  Rather, points earned should be divided by points possible for each business unit independently.   Just remember – don’t average averages!

What Is A Good Score?

This is perhaps the most common question asked by mystery shop clients, and one for which there is no simple answer.  It amazes me how many mystery shop providers I’ve heard pull a number out of the air, say 90%, and quote it as the benchmark with no thought given to the context of the question.  The fact of the matter is much more complex.  Context is key.  What constitutes a good score varies dramatically from client to client and program to program, based on the specifics of the evaluation.  One program may be an easy evaluation, measuring easy behaviors, where a score must be near perfect to be considered “good”; others may be difficult evaluations measuring more difficult behaviors, in which case a good score will be well below perfect.  The best practice in determining what constitutes a good mystery shop score is to consider the distribution of your shop scores as a whole, determine the percentile rank of each shop (the proportion of shops that fall below a given score), and set an appropriate cutoff point.  For example, if management decides the 60th percentile is an appropriate standard (6 out of 10 shops fall below it), and a shop score of 86% is in the 60th percentile, then a shop score of 86% is a “good” shop score.

Work Toward a Distribution

[Figure: distribution of shop scores]

When all is said and done, a best in class mystery shop scoring methodology will produce a distribution of shop scores with meaningful spread, particularly on the low end of the distribution. Mystery shop programs with tight distributions around the average shop score offer little opportunity to identify areas for improvement. All the shops end up being very similar to each other, making it difficult to identify problem areas and improve employee behaviors. Distributions with scores skewed to the low end make it much easier to identify poor shops and offer opportunities for improvement via employee coaching. If questionnaire design and scoring create scores with tight distributions, consider a redesign.

Click here for the next installment in this series: Mystery Shop Sample Plans.

Click Here for Mystery Shopping Best Practices
