In 2011 and early 2012, mobile subscribers in Japan faced a series of large-scale network outages. Between April 2011 and February 2012, NTT docomo and KDDI each had five major outages. The incidents were so severe that Japan's Internal Affairs and Communications Ministry ordered the network operators to conduct a comprehensive investigation of capacity across their entire networks and submit detailed reports.
From the reports, it was clear that the explosion in the number of smartphones had caused the outages. Compared to feature phones, smartphones cause control plane (C-plane) traffic surges 2 to 5 times greater and user plane (U-plane) traffic surges as much as 10 times greater. Service providers were eager to promote smartphones to generate data traffic revenue, but the resulting sharp rise in traffic began to exceed network capacity and cause frequent service disruptions.
Smartphones differ from feature phones in two ways: they generate more data traffic, and they access the network more frequently. Large data volumes can obviously lead to service disruptions, but so can the increased frequency with which devices access the network.
Radio resources are shared among all subscribers, and large data traffic can easily create a bottleneck by using up the available resources. When the radio resources are fully exhausted, data throughput and accessibility for all subscribers are seriously affected. Frequent network access also affects the network, but in a different way: it generates an unexpectedly high amount of C-plane traffic, which puts extreme stress on core network processing. When that traffic exceeds the network's processing capability, it can take down all the mobile services in the area. The diagram below shows a breakdown of this scenario.
Starting in some of the most densely populated areas, these types of outages spread across major cities in Japan and affected millions of subscribers between 2010 and 2012. The root cause of the outages was found to be the sharp rise in C-plane traffic. The figure below shows the relationship between C-plane traffic, U-plane traffic, and radio link status. This relationship is key to the smooth functioning of radio access networks. As an article in the Japan Times reported, NTT docomo's Executive Vice President Fumio Iwasaki said, "Our estimate (of the communication volume) was insufficient . . . We apologize to our subscribers for causing the trouble."
Boosting a Network’s Capacity for C-plane Traffic
Service operators reacted by boosting network capacity and reducing the number of C-plane messages per network access. NTT docomo and other operators have already reinforced their networks with a number of powerful packet switches and revised their C-plane processing software.
To avoid rising CAPEX, new technologies were introduced for more efficient C-plane processing. One example is Fast Dormancy. If both the terminal and the network support this specification, the number of C-plane messages can be reduced to one third of the normal volume. Some popular smartphones, such as the iPhone, had already implemented Fast Dormancy, and the service operators in Japan rolled out support for it between 2011 and 2013.
Reducing the Number of Messages in C-plane Traffic
C-plane messages are generated by terminals to set up and release their radio links. These messages consume only a very limited amount of bandwidth, yet C-plane errors can trigger fatal disruptions of the radio link. In the case of NTT docomo, a sharp rise in C-plane traffic from smartphones overloaded the packet switches and brought down the network. Each network access makes the terminal transmit C-plane messages to set up the radio link, which is then released soon afterward when the idle timeout expires with no data activity. Multiple background applications can add up to create a high volume of C-plane traffic, resulting in a high processing load in the core network. Thus, reducing C-plane messages emerged as one of the main challenges to be resolved.
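The way background applications multiply radio link setups can be illustrated with a rough sketch. The model below is an assumption made for illustration, not an operator's or vendor's method: each application sends one small packet at a fixed period, the link is released once the idle timeout passes with no activity, and the function name, periods, and timeout are all hypothetical.

```python
def count_link_setups(app_periods_s, idle_timeout_s, duration_s=3600):
    """Count radio link setups triggered by periodic background traffic.

    app_periods_s: send period in seconds for each background app (hypothetical).
    idle_timeout_s: inactivity time after which the radio link is released.
    duration_s: length of the observation window in seconds.
    """
    # Every send time of every app inside the observation window.
    send_times = sorted(
        t
        for period in app_periods_s
        for t in range(0, duration_s, period)
    )

    setups = 0
    last_activity = None
    for t in send_times:
        # If the link has already timed out (or was never up), a new
        # setup is needed, which costs a round of C-plane messages.
        if last_activity is None or t - last_activity > idle_timeout_s:
            setups += 1
        last_activity = t
    return setups


if __name__ == "__main__":
    one_app = count_link_setups([300], idle_timeout_s=10)
    two_apps = count_link_setups([300, 420], idle_timeout_s=10)
    print("setups per hour, one app: ", one_app)
    print("setups per hour, two apps:", two_apps)
```

Because the packets from different applications rarely fall inside the same idle-timeout window, nearly every packet triggers its own setup and release cycle, so the C-plane load grows with the number of applications even though the data volume stays tiny.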
Underestimating C-plane Traffic
NTT docomo admitted that it had underestimated the amount of C-plane traffic generated by smartphones. NTT docomo's spokesperson Mr. Hiramatsu noted that C-plane traffic can create high-stress conditions not only for radio access but also for the core network. The stress arises because C-plane messages rewrite data on radio network controllers (RNC) and packet switches (PS).
NTT docomo's CEO, Mr. Yamada, also stated that the company had been focusing solely on handling bursts of user traffic. NTT docomo introduced new packet switches on January 25, 2012 in response to the network incidents.
According to Ericsson Japan CTO Mr. Fujisawa, core network nodes keep logical connections for certain periods, whereas physical radio links are set up and released frequently to increase radio usage efficiency. By introducing new packet switches to accommodate smartphone traffic, NTT docomo had increased simultaneous connection capacity from 880K to 1.8M, but C-plane performance was roughly halved, from 27.5M to 14.1M packets per hour (see figure below). Sharp rises in C-plane traffic then hit NTT docomo's network before the packet switches were reinforced.
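The trade-off in these figures becomes clearer when expressed as a C-plane budget per connection. The short calculation below uses only the four numbers quoted above; the assumption that every connection slot is occupied at once, and the variable and function names, are illustrative.

```python
# Figures quoted in the text: simultaneous connections and
# C-plane packets per hour, before and after the switch upgrade.
OLD_SWITCH = {"connections": 880_000, "cplane_per_hour": 27_500_000}
NEW_SWITCH = {"connections": 1_800_000, "cplane_per_hour": 14_100_000}


def cplane_budget_per_connection(switch):
    """C-plane packets per hour available to each connection,
    assuming (for illustration) that all connection slots are in use."""
    return switch["cplane_per_hour"] / switch["connections"]


old_budget = cplane_budget_per_connection(OLD_SWITCH)
new_budget = cplane_budget_per_connection(NEW_SWITCH)

print(f"old switches: {old_budget:.2f} C-plane packets per connection per hour")
print(f"new switches: {new_budget:.2f} C-plane packets per connection per hour")
print(f"budget shrank by a factor of {old_budget / new_budget:.2f}")
```

Under that assumption, the per-connection budget falls from about 31 to about 8 C-plane packets per hour, roughly a fourfold reduction, at the same time as smartphones were generating more signaling per device.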
Introducing Fast Dormancy
At the current rate of traffic growth, reinforcing packet switches alone would lead to ever-increasing CAPEX. To avoid that cost, a new technology called Fast Dormancy was introduced in 3GPP Release 8 to reduce C-plane messages by two thirds (Figure 4). Fast Dormancy specifies an intermediate state (PCH) in which the terminal can rest between data bursts, so that it does not have to return to the idle state each time.
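The effect of the intermediate state can be sketched as a small state model. This is a simplified illustration, not the 3GPP procedure: the per-resume message costs below are hypothetical placeholders chosen only so that resuming from PCH costs one third of resuming from idle, matching the reduction described above, and release signaling is folded into the resume cost.

```python
from enum import Enum


class RadioState(Enum):
    IDLE = "idle"
    PCH = "pch"
    CONNECTED = "connected"


# Hypothetical C-plane message counts to bring the radio link back up
# from each resting state. Only the 3:1 ratio reflects the text.
RESUME_COST = {RadioState.IDLE: 30, RadioState.PCH: 10}


def cplane_messages(num_bursts, fast_dormancy):
    """Total C-plane messages for a series of short data bursts."""
    # Where the terminal rests between bursts.
    rest_state = RadioState.PCH if fast_dormancy else RadioState.IDLE

    state = RadioState.IDLE  # the terminal starts out idle
    total = 0
    for _ in range(num_bursts):
        total += RESUME_COST[state]   # signaling to reach CONNECTED
        state = RadioState.CONNECTED  # data burst is transferred
        state = rest_state            # inactivity: drop to the resting state
    return total


if __name__ == "__main__":
    bursts = 100
    without_fd = cplane_messages(bursts, fast_dormancy=False)
    with_fd = cplane_messages(bursts, fast_dormancy=True)
    print("without Fast Dormancy:", without_fd)
    print("with Fast Dormancy:   ", with_fd)
```

Only the first burst pays the full cost of leaving the idle state; every later burst resumes from PCH, so the total approaches one third of the original volume as the number of bursts grows.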
Rapidly changing cellular network environments create a challenge for service operators, especially in light of the widespread use of smartphones. Service disruptions are no longer minor, unusual incidents; highly congested networks have become a major issue in most major cities in the world. Unprecedented consumer adoption of smartphones and other smart devices is responsible for high congestion rates in terms of network capacity, but more importantly, the associated C-plane traffic can cause fatal damage that takes down mobile services. The need for comprehensive testing, including C-plane performance testing, is increasing as both vendors and operators seek to know the true limits of their network capacities. LTE network testers like the DuoSIM and DuoSIM-A are essential for evaluating the communication nodes of radio access and core networks with total testing solutions, to better prepare for worst-case scenarios.