Cisco IP Phones have long been an integral part of the IP telephony environment. Even today, as more and more enterprises move to UCaaS-based solutions and adopt softphones on a wider scale, Cisco keeps selling its physical phones like hot cakes. So, if we still have them and they show no sign of disappearing, at least for a while, why not dig a bit deeper into their functionality and work towards building a custom application of our own? In this multi-part series, we will go over different facets of phone services, starting with the basics, then moving to the payload structure that these services consume and how it can be manipulated, and finally building a working solution that tackles a real-world use case. I’ll also upload the code to GitHub at the end of this series.
Background
From an administrator’s point of view, there are primarily two ways to execute an action on a Cisco IP phone remotely:
- CUCM GUI Interface
- Phone’s CLI interface
We can, of course, access the phone’s GUI as well, but only for read-only purposes. But what if we wanted to perform an action on a phone that is simply not possible through any of the above-mentioned methods? For example, executing a custom script. This is where the concept of IP Phone services comes in.
Cisco IP Phones use HTTP to communicate with external applications. The phone firmware includes both an HTTP client for making requests and an HTTP server for receiving requests. We can use a combination of XML/Text based payload wrapped in an HTTP request and send it to the phone for execution.
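To make the "payload wrapped in an HTTP request" idea a bit more concrete, here is a minimal Python sketch of how such a request could be assembled. Please treat the details as assumptions rather than something established in this post: the `/CGI/Execute` path, the `XML` form field and the use of HTTP basic authentication reflect my understanding of Cisco's IP Phone Services documentation, while the function name, IP address and credentials are purely hypothetical placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_push_request(phone_ip: str, xml_payload: str,
                       username: str, password: str) -> urllib.request.Request:
    """Wrap an XML payload in an HTTP POST aimed at the phone's web server.

    Assumption: the phone accepts pushed objects at /CGI/Execute as a
    form field named "XML", protected by HTTP basic authentication.
    """
    # Form-encode the payload so special characters survive transport.
    body = urllib.parse.urlencode({"XML": xml_payload}).encode("utf-8")

    # Build the basic authentication header by hand (standard library only).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")

    return urllib.request.Request(
        url=f"http://{phone_ip}/CGI/Execute",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {token}",
        },
    )
```

The function only builds the request and does not touch the network. To actually send it, you could pass the result to `urllib.request.urlopen(request, timeout=5)`, using a user account that is allowed to control the phone in question.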
Typical Flow
I am sure we have all experienced the power of phone services many times through one of the most commonly used services, Extension Mobility. The following is basically what happens behind the scenes when you select the “Extension Mobility” service option on the phone.
- The user presses the Services button on the phone and selects a particular service (e.g. Extension Mobility)
- The phone’s built-in HTTP client sends an HTTP GET to the specified URL (the one defined for Extension Mobility)
- The server responds with an XML object
- The phone parses the XML object
- The phone presents data and options to the user, which in this case is most likely the EM login prompt.
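The flow above can be sketched from the server's side with nothing more than Python's standard library. This is only an illustration under my own assumptions, not how Extension Mobility itself is implemented: the handler name, port and text values are hypothetical, and I am using the simple CiscoIPPhoneText object (Title, Prompt, Text) as the response.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from xml.sax.saxutils import escape


def build_text_object(title: str, prompt: str, text: str) -> str:
    """Return a CiscoIPPhoneText object as an XML string."""
    # escape() protects characters such as & and < inside the values.
    return (
        "<CiscoIPPhoneText>"
        f"<Title>{escape(title)}</Title>"
        f"<Prompt>{escape(prompt)}</Prompt>"
        f"<Text>{escape(text)}</Text>"
        "</CiscoIPPhoneText>"
    )


class ServiceHandler(BaseHTTPRequestHandler):
    """Answers the phone's HTTP GET with an XML object."""

    def do_GET(self):
        body = build_text_object(
            "Demo Service", "Status", "Hello from the server"
        ).encode("utf-8")
        self.send_response(200)
        # The phone expects XML, so declare the content type accordingly.
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8080) -> None:
    """Start the demo service; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), ServiceHandler).serve_forever()
```

Calling `run()` starts the demo service. Pointing a phone service URL at this host and port should, in principle, make the phone fetch, parse and display the text object.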
XML Objects
Cisco IP Phones have quite a few different types of XML objects defined for them. Their availability and usage depend on the phone model. The following table lists the different types of XML objects and their availability on different phone models. I’ve referenced only the 78XX and 88XX series because of their wide deployment and usage. If you want to check the availability of XML objects on any other phone model, have a look at the official Cisco documentation.
| XML object | 78XX Series | 88XX Series |
|---|---|---|
| CiscoIPPhoneMenu | Supported | Supported |
| CiscoIPPhoneText | Supported | Supported |
| CiscoIPPhoneInput | Supported | Supported |
| CiscoIPPhoneDirectory | Supported | Supported |
| CiscoIPPhoneImage | Not supported | Supported |
| CiscoIPPhoneImageFile | Not supported | Supported |
| CiscoIPPhoneGraphicMenu | Not supported | Supported |
| CiscoIPPhoneGraphicFileMenu | Not supported | Supported |
| CiscoIPPhoneIconMenu | Supported | Supported |
| CiscoIPPhoneIconFileMenu | Supported | Supported |
| CiscoIPPhoneStatus | Not supported | Supported |
| CiscoIPPhoneStatusFile | Not supported | Supported |
| CiscoIPPhoneExecute | Supported | Supported |
| CiscoIPPhoneResponse | Supported | Supported |
| CiscoIPPhoneError | Supported | Supported |
For the purpose of this series, we will keep our focus on just one object, CiscoIPPhoneExecute. This is what’s needed if we want to perform any real-time action on the phones. When we build a program to execute an action on the phone, such as a physical key or softkey press or initiating a new call, our program sends an HTTP POST request carrying a CiscoIPPhoneExecute object to the phone’s built-in HTTP server, and the phone then starts executing the URIs listed in that object.
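As a rough preview of what such an object looks like, here is a small Python sketch that builds one. The helper name is hypothetical, and the specifics are my assumptions based on Cisco's documentation rather than anything defined in this post: each action is an `ExecuteItem` element with `Priority` and `URL` attributes, a priority of `0` means "execute immediately", and a single object carries at most three items.

```python
import xml.etree.ElementTree as ET

# Assumption: a single CiscoIPPhoneExecute object may hold up to three items.
MAX_EXECUTE_ITEMS = 3


def build_execute_object(uris: list[str], priority: int = 0) -> str:
    """Build a CiscoIPPhoneExecute XML string from a list of URIs."""
    if not 1 <= len(uris) <= MAX_EXECUTE_ITEMS:
        raise ValueError(f"expected 1 to {MAX_EXECUTE_ITEMS} URIs, got {len(uris)}")

    root = ET.Element("CiscoIPPhoneExecute")
    for uri in uris:
        # One ExecuteItem per action, e.g. a key press such as "Key:Settings".
        ET.SubElement(root, "ExecuteItem", {"Priority": str(priority), "URL": uri})

    # ElementTree takes care of escaping attribute values.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example: open the Settings menu, then press keypad digit 1.
    print(build_execute_object(["Key:Settings", "Key:KeyPad1"]))
```

Building the XML with `ElementTree` instead of string concatenation keeps the output well-formed even if a URI contains characters that need escaping.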
Use Case
I can think of a couple of scenarios that occur more often than we think in day-to-day operations. For example:
- Need to modify TFTP server’s configuration on the phone if it’s not able to work with the one that it got through Option 150
- Need to reset ITL/CTL Security settings on the phone
If we were to do any of the above-mentioned activities today, we would have only two options available to us:
- Ask the end user to press a bunch of buttons on the phone.
- Or send someone there to do it on the phone if the user is not cooperative. This can take hours or even days to perform an action that itself takes no more than 10 seconds.
Both these solutions run the risk of causing customer dissatisfaction, delay in ticket completion and huge wastage of time.
Voice of a Frustrated Operations Manager 😠
Solution
- The first solution pertains to third party applications out there that can help you perform file resets or TFTP reconfiguration and much more, remotely. Whatever you do on the physical phone while standing next to it (like pressing buttons to access the menus etc) can be done remotely. The script will send instructions to the phone and you will literally see them being executed on the phone. It’s like when you are on a remote troubleshooting session and someone takes remote control of your computer and starts clicking on different folders etc.
- The second option is to build a similar setup yourself and help your customers by saving them additional expense.
The main objective of my blog posts is to advocate the practice of building and automating stuff wherever possible and to remove the reliance on third-party tools. The idea is to look at it from a cost-benefit point of view. If a third-party solution offers you 10 features but you need only one, why pay for the remaining 9? The cost you incur will outweigh the benefits you get out of the tool. On the other hand, if the tool helps you across multiple use cases and you know that you can’t automate all those features on your own, then by all means, feel free to go on a shopping spree.
In the next part, we will be focusing on the payload structure that can be sent to the phone’s server component for execution. This is the key part of the whole process, because the payload defines what we want to see happening on the phone. So, stay tuned.
Please feel free to drop your feedback/suggestions, if any. Happy Learnings!!
