In a recent letter to the FCC, AT&T said that it had no objection to VoIP applications on the iPhone that communicate over the Wi-Fi connection. It furthermore said:
“Consistent with this approach, we plan to take a fresh look at possibly authorizing VoIP capabilities on the iPhone for use on AT&T’s 3G network.”
So why would anybody want to do VoIP on the cellular data channel, when there is a cellular voice channel already? Wouldn’t voice on the data channel cost more? And since the voice channel is optimized for voice and the data channel isn’t, wouldn’t voice on the data channel sound even worse than cellular voice already does?
Let’s look at the “why bother?” question first. There are actually at least four reasons you might want to do voice on the cellular data channel:
- To save money. If your voice plan charges a lot for some types of call (for example international calls), you may want to use VoIP on the data channel for toll by-pass. The alternative is to use the voice channel to call a local access number for an international toll by-pass service (like RebTel).
- To get better sound quality: the cellular voice codecs are very low bandwidth and sound horrible. You can choose which codec to run over the data network and even go wideband. At IT Expo West a couple of weeks ago David Frankel of ZipDX demoed a wideband voice call on his laptop going through a Sprint Wireless Data Card. The audio quality was excellent.
- To get additional service features: companies like DiVitas offer roaming between the cellular and Wi-Fi networks that makes your cell phone act as an extension behind your corporate PBX. All these solutions currently use the cellular voice channel when out of Wi-Fi range, but if they were to go to the data channel they could offer wideband codecs and other differentiating features.
- For cases where there is no voice channel. In the example of David Frankel’s demo, the wireless data card doesn’t offer a voice channel, so VoIP on the data channel is the only option for a voice connection.
Moving on to the issue of cost, an iPhone unlimited data plan is $30 per month. “Unlimited” is AT&T’s euphemism for “limited to 5GB per month,” but translated to voice that’s a lot of minutes: even with IP packet overhead the bit-rate of compressed HD voice is going to be around 50 kilobits per second, which works out to about 13,000 minutes in 5GB. So using it for voice is unlikely to increase your bill. On the other hand, many voice plans are already effectively unlimited, what with rollover minutes, friend and family minutes, night and weekend minutes and whatnot, and you can’t get a phone without a voice plan. So for normal (non-international) use, voice on the data channel is not going to reduce your bill, but it is unlikely to increase it, either.
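The minutes figure above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, the 5GB cap is assumed to mean decimal gigabytes, and the 50 kilobit per second rate is assumed to be the total counted against the cap.

```python
def voice_minutes(cap_bytes: float, bitrate_bps: float) -> float:
    """Minutes of continuous voice that fit inside a monthly data cap."""
    cap_bits = cap_bytes * 8           # convert the cap from bytes to bits
    seconds = cap_bits / bitrate_bps   # how long the stream can run
    return seconds / 60


CAP_BYTES = 5e9        # assumption: 5GB means 5 * 10^9 bytes
BITRATE_BPS = 50_000   # roughly 50 kilobits per second, including IP overhead

minutes = voice_minutes(CAP_BYTES, BITRATE_BPS)
print(f"{minutes:,.0f} minutes, or about {minutes / 60:.0f} hours")
```

If the carrier counts sent and received traffic separately and each direction runs at the full rate, the figure would halve, which is still far more talk time than most people use in a month.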
Finally we come to the issue of whether voice sounds better on the voice channel or the data channel. The answer is, it depends on several factors, primarily the codec and the network QoS. With VoIP you can radically improve the sound quality of a call by using a wideband codec, but do impairments on the data channel nullify this benefit?
Technically, the answer is yes. The cellular data channel is not engineered for low latency. Variable delays are introduced by network routing decisions and by router queuing decisions. Latencies in the hundreds of milliseconds are not unusual. This will change with the advent of LTE, where the latencies will be of the order of 10 milliseconds. The available bandwidth is also highly variable, in contrast to the fixed bandwidth allocation of the voice channel. It can sometimes drop below what is needed for voice, even with an aggressive variable-rate codec.
In practice VoIP on the cellular data channel can sometimes sound much better than regular cellular voice. I mentioned above David Frankel’s demo at IT Expo West. I performed a similar experiment this morning with Michael Graves, with similarly good results. I was on a Polycom desk phone, Michael used Eyebeam on a laptop, and the codec was G.722. The latency on this call was appreciable – I estimated it at around 1 second round trip. There was also some packet loss – not bad for me, but it caused a sub-par experience for Michael. Earlier this week at Jeff Pulver’s HD Connect conference in New York, researchers from Qualcomm demoed a handset running on the Verizon network using EVRC-WB, transcoding to G.722 on Polycom and Gigaset phones in their lab in San Diego. The sound quality was excellent, but the latency was very high – I estimated it at around two seconds round trip.
The ITU addresses latency (delay) in Recommendation G.114. Delay is a problem because normal conversation depends on turn taking. Most people insert pauses of up to about 400 ms as they talk. If nobody else speaks during a pause, they continue. This means that if the one-way delay on a phone conversation is greater than 200 ms, the talker doesn’t hear an interruption within the 400 ms break, and starts talking again, causing frustrating collisions.
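The turn-taking argument is simple enough to sketch in Python. The model here is deliberately crude and is my simplification, not anything taken from G.114: the listener is assumed to react instantly, so an interruption reaches the talker one round trip after the pause begins. The function name and the sample delays are illustrative only.

```python
def interruption_arrives_in_time(one_way_delay_ms: float,
                                 pause_ms: float = 400.0) -> bool:
    """True if a reply to a pause can reach the talker before the pause ends.

    The start of the pause takes one_way_delay_ms to reach the listener,
    and the listener's reply takes the same time to come back.
    """
    round_trip_ms = 2 * one_way_delay_ms
    return round_trip_ms <= pause_ms


# One-way delays in milliseconds; 500 corresponds to a 1 second round trip.
for delay_ms in (100, 200, 250, 500):
    verdict = "ok" if interruption_arrives_in_time(delay_ms) else "collision likely"
    print(f"{delay_ms} ms one-way: {verdict}")
```

Under this model a one-way delay of up to 200 ms still fits inside a 400 ms pause, while a round trip of a second or more does not.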
The ITU E-Model for call quality identifies a threshold at about 170 ms one-way at which latency becomes a problem. The E-Model also tells us that increasing latency amplifies other impairments, notably echo: at low latencies even severe echo is not a problem, but at high latencies even relatively quiet echo can severely disrupt a talker.
Some people may be able to handle long latencies better than others. Michael observed that he can get used to high-latency echo after a few minutes of conversation.