What Are The Attributes of a Good WFM Metric?
By Ric Kosiba, Ph.D.
Starting a Metrics Discussion
When I post an SWPP tip of the week, I try to include an invitation to start a conversation with me about any WFM or planning problems you might have. I am truly interested, and maybe I can help you talk through them.
My last SWPP tip of the week generated more than its fair share of questions around definitions of metrics, for example, what to include or exclude in an occupancy or worked-to-paid calculation. Good stuff, and I love the discussions. But it got me thinking. What are the attributes of a metric that make its use good or bad? So, in no particular order, here are some attributes we should ponder when developing measures and reports for our operation:
What is the Metric’s Purpose?
The first question you should ask yourself when choosing to report a performance metric is what is the purpose of measuring it, and does it measure what it is supposed to? Is the goal to inform specific people of performance? Is it to motivate? Is it to spot operational anomalies?
Every metric should have a defined reason that justifies the effort of gathering its information, relaying it, and watching and monitoring it. If the purpose of measuring something is vague or unknown, or if it is “what we’ve always done,” then maybe we can skip that particular measure.
Can You Measure It?
This sounds a little bit basic, but can you actually measure the performance metric? I mean really measure the thing you are trying to measure? Here is an example.
First Call Resolution (FCR) has been a treasured metric for senior management for years. It is measured at most operations by sending out a survey and asking the question, “was your problem solved?” or something similar. Some sophisticated operations might employ an AI bot to “listen” to the contact and search for phrases like “thanks for solving my issue” in the transcript.
I would argue that for this measure, the “hit” rate (the percentage of people who answer the survey, or who say those magical words) is quite low. Further, the customer’s honesty when answering the survey might also be iffy. And the selection bias associated with *who* answers the survey makes the metric close to meaningless (say, if upset people are more likely to answer a survey). With FCR, we have a small sample of customers who happen to be not very representative of the whole customer population. I believe it is not a good measure.
And don’t get me started on Net Promoter Score. I absolutely hate that one.
Is It Actionable? Who is It Actionable By?
Can the person or group being measured by your metric effect change? Nothing is more demotivating than being powerless to control your own performance. But it does happen. Here are a couple of examples.
I’ve seen supervisors or agent teams measured on abandons, occupancy, or service levels. While they can lower their personal handle times and work exactly when they are supposed to, they really cannot affect abandons that much. On the other hand, the workforce management team can directly affect service levels, occupancy, and abandons by putting together a solid capacity plan and efficient schedules, and by managing adherence tightly. So, measuring WFM teams on service level and abandons is fine; it is not so fine for agents.
Adherence, as we all know, is actionable by the agent, and it serves to help with our service metrics. But the two are separate things, and agents should be measured on the appropriate one.
In our previous example, FCR can be surveyed by the organization, but it is not really actionable directly by the agent (or the organization). Active Contact Resolution (ACR), on the other hand, is both perfectly actionable by every agent and measurable on every single contact. I’ve written about it at length in this space before, and it is a fantastic metric. Reach out to me if you’d like more info on it (ric@realnumbers.com).
I’ve seen “calls handled” kept as an agent metric, and while it can be affected by each agent, other factors make it a poor proxy; the agent can instead simply be measured on handle time. For instance, the specific hours an agent is scheduled may have more impact on how many calls they handle than their personal performance does.
Is the Metric Complete?
At the fabulous SWPP conference every year, a recurring question is “what goes into your definition of…” some particular metric. The questions I received about my tip of the week on occupancy as a poor efficiency metric, and worked-to-paid as a better measure, were of the same ilk: what goes into the definition of worked-to-paid?
It is important that your metrics include all of the factors that make them complete, and interestingly, that may vary from company to company and staff group to staff group. For instance, does your abandonment metric keep “short hangups” as part of its abandoned calls tally? I’d argue that for many companies it shouldn’t. But what is short? I once worked with a company that defined a short abandon, excluded from its abandoned calls metric, as any call that abandoned in less than ten minutes! While I found that strange, the company did not, and they knew their operation and the nature of their calls better than I did.
There is judgment in the makeup of your metrics, and all factors that make sense to keep in your metric should be part of the calculation.
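As a sketch of that judgment call, here is how an abandon rate might be computed once a short-abandon threshold is chosen. The threshold, the call counts, and the abandon times below are all hypothetical; every operation will pick its own numbers.

```python
# Hypothetical threshold: abandons faster than this are excluded as "short."
SHORT_ABANDON_SECONDS = 20

# Hypothetical interval data: seconds-to-abandon for each abandoned call,
# plus the total calls offered in the same interval.
abandon_times = [5, 12, 45, 90, 8, 130]
calls_offered = 200

# Only abandons at or beyond the threshold count against the metric.
counted = [t for t in abandon_times if t >= SHORT_ABANDON_SECONDS]
abandon_rate = len(counted) / calls_offered

print(f"Abandon rate (excluding short abandons): {abandon_rate:.1%}")  # 1.5%
```

Counting all six abandons would have given 3.0%; excluding the three short ones halves the reported rate, which is exactly why the threshold deserves deliberate thought.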
For example, with the ratio of worked time to paid time (by the way, a great metric for understanding the efficiency of your operation, and much better than occupancy for measuring efficiency), it is important to understand the real definition of what comprises work. Does it include idle time? Yes! Does it include lunches and breaks? Probably not. Does it include vacation? No. How about meetings or training? Probably not, but maybe.
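To make those inclusion choices concrete, here is a minimal sketch of the calculation. The time categories, the hours, and the choice to exclude meetings and training are all hypothetical; the point is that the "worked" buckets are an explicit, documented decision.

```python
# Hypothetical time categories for one agent over a pay period, in hours.
paid_hours = {
    "handling_contacts": 120.0,
    "idle": 18.0,              # counts as worked: the agent was available
    "meetings_training": 8.0,  # judgment call; excluded in this sketch
    "lunches_breaks": 20.0,    # not worked
    "vacation": 16.0,          # not worked
}

# The inclusion decision lives in one place, so it is easy to audit and change.
worked_buckets = {"handling_contacts", "idle"}

worked = sum(hours for cat, hours in paid_hours.items() if cat in worked_buckets)
paid = sum(paid_hours.values())
worked_to_paid = worked / paid

print(f"Worked-to-paid: {worked_to_paid:.1%}")  # 75.8%
```

Moving meetings and training into the worked buckets is a one-line change, which keeps the company-to-company variation the column describes explicit rather than buried in a report.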
Is the Metric Manipulatable?
My first task, when I started working for a credit card company, was to figure out ways our agents were gaming our collections incentive program. It was a concern because many metrics can be manipulated.
The most obvious manipulatable metric is handle time. All an agent has to do is hang up on a few customers, and abracadabra, handle times drop.
Occupancy is highly manipulatable by management. If occupancy is too low, supervisors can create agent make-work or have a team meeting, and occupancy improves.
If it is important to use a manipulatable metric like agent AHT, there needs to be a mechanism to find those agents who game the system. For example, flagging short calls can help find agents who routinely disconnect calls early to improve their “performance.” In this instance, we are looking for agents who have a disproportionate number of short calls.
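A minimal sketch of that flagging mechanism follows. The 30-second short-call threshold, the 25% flag cutoff, and the call log are all hypothetical; a real operation would tune both numbers to its own call patterns.

```python
from collections import defaultdict

SHORT_CALL_SECONDS = 30   # hypothetical threshold for a "short" call
FLAG_SHARE = 0.25         # hypothetical cutoff for a disproportionate share

# Hypothetical call log: (agent_id, handle_time_seconds)
calls = [
    ("alice", 240), ("alice", 310), ("alice", 12), ("alice", 280),
    ("bob", 15), ("bob", 8), ("bob", 22), ("bob", 260),
]

totals = defaultdict(int)
shorts = defaultdict(int)
for agent, secs in calls:
    totals[agent] += 1
    if secs < SHORT_CALL_SECONDS:
        shorts[agent] += 1

# Flag agents whose share of short calls exceeds the cutoff.
flagged = {a for a in totals if shorts[a] / totals[a] > FLAG_SHARE}
print(flagged)  # {'bob'}: three of bob's four calls are under 30 seconds
```

Note that the flag is based on each agent's *share* of short calls, not the raw count, so a high-volume agent with the occasional quick call is not penalized alongside one who routinely disconnects early.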
Monitoring ACR along with handle time will also help find agents who are gaming your metrics. A high ACR score may mean agents stay on contacts longer and do more work to make sure each customer is happy. This, of course, will increase handle times on the margin. However, focusing only on AHT, making sure each call is “efficient,” could conflict with making sure each contact is resolved properly. The trick is to strike a happy balance; it makes sense to include both ACR and AHT when scoring agents.
Is It Too Complicated to Understand?
Scorecards are built to simplify. If we could roll up several measures into one, we would only have to look at one metric to understand our performance, which is a time saver. But there is a problem with this: it can actually obscure performance.
I was sitting through a demo of a performance management system, which had at the top of its agent-facing screen a single agent performance score. This number, the “score,” was an average of over 20 other metrics. This felt silly, and I asked the vendor if this example of a complicated score was ever used in the real world. I was assured that it indeed was.
How in the world could an agent make sense of that one number? How could they know if what they were doing, contact by contact, would contribute positively or negatively to the top score?
In these cases, there are just too many pieces that make up the scorecard. It becomes impossible to understand, and by extension meaningless, demotivating, and a waste of time. The agent cannot comprehend what they are supposed to do by watching this single score. Simple is good, complicated is counterproductive.
Metrics I Like
If you have read this column before, you probably know that I am a big proponent of measuring agents’ performance and sharing it with them in real time. I am very bullish on several metrics. I love ACR as a measure of both productivity and agent performance. I am a huge fan of measuring abandons and using them as a staffing goal. I understand why WFM vendors don’t like it (it is hard to forecast accurately), but it is still my favorite.
I would love to see a WFM tool develop a distribution of service levels, rather than just one as a goal. Adherence is a good agent measure, and I expect agent adherence performance has changed with more at-home agents.
On the other hand, I am not a fan of scorecards and averages of multiple metrics. However, there is one scorecard I love, which I have written about here before: a scorecard that simply keeps track of “what percentage of time do I do what I am supposed to do.” That scorecard, like ACR, could be a game-changer.
Ric Kosiba is a charter member of SWPP. He can be reached at ric@realnumbers.com or (410) 562-1217. Please know that he is *very* interested in learning about your business problems and challenges (and what you think of these articles). Want to improve that capacity plan? You can find his calendar and can schedule time with him at realnumbers.com. Follow him on LinkedIn! (www.linkedin.com/in/ric-kosiba/)