ISP Performance numbers from Netflix

Interesting numbers from the Netflix Tech Blog.

Several things jump out at me. First, cable is faster than DSL, and wireless is the slowest. Second, again no surprise, urban is faster than rural. But the big surprise to me is the Verizon number. Verizon has spent a ton on FiOS, and according to Trefis about half of its broadband customers are now on FiOS. So even if we supposed that Verizon’s non-FiOS customers were getting a bandwidth of zero, these numbers imply that the average bandwidth available to a FiOS customer is less than 5 megabits per second.
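To make the arithmetic explicit, here is the back-of-the-envelope calculation as a Python sketch. The blended average is a placeholder to be read off the Netflix chart; 2.4 Mbps is an assumed value for illustration, not a figure from this post.

```python
# Back-of-the-envelope check of the FiOS inference above.
# blended_avg_mbps is a placeholder: read the real Verizon figure off
# the Netflix chart. 2.4 Mbps is an assumed value for illustration.
blended_avg_mbps = 2.4
fios_share = 0.5  # per Trefis: about half of Verizon's broadband customers

# Most generous case for FiOS: assume the non-FiOS half gets zero
# bandwidth, so FiOS accounts for the entire blended average.
# blended = fios_share * fios_avg + (1 - fios_share) * 0
fios_avg_upper_bound = blended_avg_mbps / fios_share
print(f"FiOS average is at most {fios_avg_upper_bound:.1f} Mbps")  # -> 4.8
```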

Since FiOS is a very fast last mile, the bottleneck might be in the backhaul, or, more likely, in some bandwidth-throttling device. Whichever way you slice it, it’s hard to cast these numbers in a positive light for Verizon.

Netflix measurements of ISP bandwidth
Update January 31, 2011: This story in the St. Louis Business Journal says that Charter, the ISP with the best showing in the Netflix measurements, is increasing its speed further, with no increase in price. This is good news. It is time that ISPs in the US started to compete on speed.

Contemplating the graphs, I see that the lines cluster to some extent in three bands, centered on 1.5 Mbps, 2 Mbps and 2.5 Mbps. If this is evidence of traffic shaping, these are the numbers that ISPs should be using in their promotional materials, rather than the usual “up to” numbers that don’t mention minimums or averages.

QoS meters on Voxygen

The term “QoS” is used ambiguously. There are two main categories of definition. The first is QoS Provisioning: “the capability of a network to provide better service to selected network traffic,” which means packet prioritization of one kind or another. The second is more literal: Quality of Service, the degree of perfection of a user’s audio experience in the face of potential impairments to network performance. These impairments fall into four categories: availability, packet loss, packet delay and tampering. Since this second sense is normally used in the context of trying to measure it, we could call it QoS Metrics as opposed to QoS Provisioning. I would put issues like choice of codec and echo into the larger category of Quality of Experience, which covers all the possible impairments to audio experience, not just those imposed by the network.

By “tampering” I mean any intentional changes to the media payload of a packet, and I am OK with the negative connotations of the term since I favor the “dumb pipes” view of the Internet. On phone calls the vast bulk of such tampering is transcoding: changing the media format from one codec to another. Transcoding always reduces the fidelity of the sound, even when transcoding to a “better” codec.
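A toy illustration of why this is so: each codec is, among other things, a quantizer, and re-quantizing an already-quantized signal stacks a second rounding error on top of the first. The step sizes here are made up for illustration, not real codecs:

```python
import math

def quantize(x, step):
    """Round x to the nearest multiple of step (a crude stand-in for a codec)."""
    return step * round(x / step)

samples = [math.sin(2 * math.pi * t / 50) for t in range(200)]
codec_a = [quantize(s, 0.10) for s in samples]   # first codec
codec_b = [quantize(s, 0.07) for s in codec_a]   # transcode to a finer, "better" codec

err_a  = max(abs(s - q) for s, q in zip(samples, codec_a))
err_ab = max(abs(s - q) for s, q in zip(samples, codec_b))
print(f"one codec: {err_a:.3f}   after transcoding: {err_ab:.3f}")
# The transcoded error is larger, even though the second codec is finer.
```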

Networks vary greatly in the QoS they deliver. One of the major benefits of getting VoIP service from your ISP (Internet Service Provider) is that your ISP has complete control over QoS. But a growing number of ITSPs (Internet Telephony Service Providers) contend that the open Internet provides adequate QoS for business-grade telephone service. Skype, for example.

But it’s nice to be sure. So I have added a “QoS Metrics” category in the list to the right of this post. You can use the tools there to check your connection. I particularly like the one from Voxygen, which frames the test results in terms of the number of simultaneous voice sessions that your WAN connection can comfortably handle. Here’s an example of a test of ten channels:

Screen shot of Voxygen VoIP performance metrics tool
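For a sense of the arithmetic behind a result like this, here is a rough capacity estimate. It is my own sketch, assuming G.711 with 20 ms packetization and a hypothetical 1 Mbit/s uplink; Voxygen’s actual methodology may differ.

```python
# Rough estimate: how many G.711 calls fit in a given uplink?
CODEC_RATE = 64_000          # G.711 payload, bits/s
PACKET_MS = 20               # typical packetization interval
HEADERS = 12 + 8 + 20 + 18   # RTP + UDP + IPv4 + Ethernet, bytes per packet

pps = 1000 // PACKET_MS                      # 50 packets per second
overhead_bps = HEADERS * 8 * pps             # header cost per call
per_call_bps = CODEC_RATE + overhead_bps     # ~87 kbit/s on the wire

uplink_bps = 1_000_000                       # hypothetical 1 Mbit/s uplink
usable = 0.8 * uplink_bps                    # leave headroom for data traffic
print(int(usable // per_call_bps), "concurrent calls")   # -> 9
```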

Third Generation WLAN Architectures

Aerohive claims to be the first vendor with a third-generation wireless LAN architecture.

  • The first generation was the autonomous access point.
  • The second generation was the wireless switch, or controller-based WLAN architecture.
  • The third generation is a controller-less architecture.

The move from the first generation to the second was driven by enterprise networking needs: enterprises need greater control and manageability than smaller deployments do. First-generation autonomous access points didn’t have the processing power to handle the demands of greater network control, so a separate category of device was a natural solution: in the second-generation architecture, “thin” access points did all the real-time work and delegated the less time-sensitive processing to powerful central controllers.

Now the technology transition to 802.11n enables higher capacity wireless networks with better coverage. This allows enterprises to expand the role of wireless in their networks, from convenience to an alternative access layer. This in turn further increases the capacity, performance and reliability demands on the WLAN.

Aerohive believes this generational change in technology and market requires a corresponding generational change in system architecture. A fundamental technology driver for 802.11n, the ever-increasing processing bang-for-the-buck delivered by Moore’s law, also provides enough low-cost processing power to move the control functions from central controllers back to the access points. Aerohive aspires to lead the enterprise Wi-Fi market into this new architecture generation.

Superficially, getting rid of the controller looks like a return to the first generation architecture. But an architecture with all the benefits of a controller-based WLAN, only without a controller, requires a sophisticated suite of protocols by which the smart access points can coordinate with each other. Aerohive claims to have developed such a protocol suite.

The original controller-based architectures used the controller for all network traffic: the management plane, the control plane and the data plane. The bulk of network traffic is on the data plane, so bottlenecks there do more damage than on the other planes. So modern controller-based architectures have “hybrid” access points that handle the data plane, leaving only the control and management planes to the controller device. (Aerohive’s architect, Devin Akin, says: “distributed data forwarding at Layer-2 isn’t news, as every other vendor can do this.”) Aerohive’s third-generation architecture takes the next step and distributes control plane handling as well, leaving only the management function centralized, and that’s just software on a generic server.
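One way to summarize the division of labor across the generations (my own tabulation of the argument above, not an Aerohive artifact):

```python
# Where each plane lives, by WLAN generation.
PLANE_PLACEMENT = {
    "gen 1 (autonomous APs)":   {"data": "AP",  "control": "AP",
                                 "management": "per-AP"},
    "gen 2 (controller)":       {"data": "controller", "control": "controller",
                                 "management": "controller"},
    "gen 2 (hybrid APs)":       {"data": "AP",  "control": "controller",
                                 "management": "controller"},
    "gen 3 (controller-less)":  {"data": "AP",  "control": "cooperating APs",
                                 "management": "software on a generic server"},
}

for generation, planes in PLANE_PLACEMENT.items():
    print(f"{generation:25} {planes}")
```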

Aerohive contends that controller-based architectures are expensive, poorly scalable, unreliable, hard to deploy and not needed. A controller-based architecture is more expensive than a controller-less one, because controllers aren’t free (Aerohive charges the same for its APs as other vendors do for their thin ones: under $700 for a 2×2 MIMO dual-band 802.11n device). It is not scalable because the controller constitutes a bottleneck. It is not reliable because a controller is a single point of failure, and it is not needed because processing power is now so cheap that all the functions of the controller can be put into each AP, and given the right system design, the APs can coordinate with each other without the need for centralized control.

Distributing control in this way is considerably more difficult than distributing data forwarding. Control plane functions include all the security features of the WLAN, like authentication and admission, multiple VLANs and intrusion detection (WIPS). Greg Taylor, wireless LAN services practice lead for the Professional Services Organization of BT in North America, says “The number one benefit [of a controller-based architecture] is security,” so a controller-less solution has to reassure customers that their vulnerability will not be increased. According to Dr. Amit Sinha, Chief Technology Officer at Motorola Enterprise Networking and Communications, other functions handled by controllers include “firewall, QoS, L2/L3 roaming, WIPS, AAA, site survivability, DHCP, dynamic RF management, firmware and configuration management, load balancing, statistics aggregation, etc.”

You can download a comprehensive white paper describing Aerohive’s architecture here.

Motorola recently validated Aerohive’s vision, announcing a similar architecture, described here.

Here’s another perspective on this topic.

ITExpo West — Achieving HD Voice On Smartphones

I will be moderating a panel discussion at ITExpo West on Tuesday 5th October at 11:30 am in room 306B: “Achieving HD Voice On Smartphones.”

Here’s the session description:

The communications market has been evolving toward fixed high definition voice services for some time, and nearly every desktop phone manufacturer now includes support for G.722 and other wideband codecs. Why? Because HD voice makes the entire communications experience a much better one than we are used to.

But what does it mean for the wireless industry? When will wireless communications become part of the HD revolution? How will handset vendors, network equipment providers, and service providers have to adapt their current technologies in order to deliver wireless HD voice? How will HD impact service delivery? What are the business models around mobile HD voice?

This session will answer these questions and more, discussing both the technology and business aspects of bringing HD into the mobile space.

The panelists are:

This is a deeply experienced panel; each of the panelists is a world-class expert in his field. We can expect a highly informative session, so come armed with your toughest questions.

VoIP on the cellular data channel

In a recent letter to the FCC, AT&T said that it had no objection to VoIP applications on the iPhone that communicate over the Wi-Fi connection. It furthermore said:

Consistent with this approach, we plan to take a fresh look at possibly authorizing VoIP capabilities on the iPhone for use on AT&T’s 3G network.

So why would anybody want to do VoIP on the cellular data channel, when there is a cellular voice channel already? Wouldn’t voice on the data channel cost more? And since the voice channel is optimized for voice and the data channel isn’t, wouldn’t voice on the data channel sound even worse than cellular voice already does?

Let’s look at the “why bother?” question first. There are actually at least four reasons you might want to do voice on the cellular data channel:

  1. To save money. If your voice plan has some expensive types of call (for example international calls) you may want to use VoIP on the data channel for toll bypass. The alternative is to use the voice channel to call a local access number for an international toll bypass service (like Rebtel).
  2. To get better sound quality: the cellular voice codecs are very low bitrate and sound horrible. You can choose which codec to run over the data network and even go wideband. At ITExpo West a couple of weeks ago David Frankel of ZipDX demoed a wideband voice call on his laptop going through a Sprint wireless data card. The audio quality was excellent.
  3. To get additional service features: companies like DiVitas offer roaming between the cellular and Wi-Fi networks that makes your cell phone act as an extension behind your corporate PBX. All these solutions currently use the cellular voice channel when out of Wi-Fi range, but if they were to go to the data channel they could offer wideband codecs and other differentiating features.
  4. For cases where there is no voice channel. In the example of David Frankel’s demo, the wireless data card doesn’t offer a voice channel, so VoIP on the data channel is the only option for a voice connection.

Moving on to the issue of cost, an iPhone unlimited data plan is $30 per month. “Unlimited” is AT&T’s euphemism for “limited to 5GB per month,” but translated to voice that’s a lot of minutes: even with IP packet overhead the bit-rate of compressed HD voice is going to be around 50 kbits per second, which works out to about 13,000 minutes in 5GB. So using it for voice is unlikely to increase your bill. On the other hand, many voice plans are already effectively unlimited, what with rollover minutes, friend and family minutes, night and weekend minutes and whatnot, and you can’t get a phone without a voice plan. So for normal (non-international) use voice on the data channel is not going to reduce your bill, but it is unlikely to increase it, either.
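The minutes figure checks out; here is the two-line sanity check, using the rough 50 kbit/s voice rate from the paragraph above:

```python
# Sanity-checking the minutes-per-month arithmetic above.
cap_bytes = 5 * 10**9     # AT&T's "unlimited" 5 GB monthly cap
voice_bps = 50_000        # rough on-the-wire rate for compressed HD voice
seconds = cap_bytes * 8 / voice_bps
print(round(seconds / 60), "minutes")   # -> 13333, i.e. about 13,000 minutes
```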

Finally we come to the issue of whether voice sounds better on the voice channel or the data channel. The answer is, it depends on several factors, primarily the codec and the network QoS. With VoIP you can radically improve the sound quality of a call by using a wideband codec, but do impairments on the data channel nullify this benefit?

Technically, the answer is yes. The cellular data channel is not engineered for low latency. Variable delays are introduced by network routing decisions and by router queuing decisions. Latencies in the hundreds of milliseconds are not unusual. This will change with the advent of LTE, where the latencies will be of the order of 10 milliseconds. The available bandwidth is also highly variable, in contrast to the fixed bandwidth allocation of the voice channel. It can sometimes drop below what is needed for voice with even an aggressive variable rate codec.

In practice VoIP on the cellular data channel can sometimes sound much better than regular cellular voice. I mentioned above David Frankel’s demo at ITExpo West. I performed a similar experiment this morning with Michael Graves, with similarly good results. I was on a Polycom desk phone, Michael used eyeBeam on a laptop, and the codec was G.722. The latency on this call was appreciable – I estimated it at around 1 second round trip. There was also some packet loss – not bad for me, but it caused a sub-par experience for Michael. Earlier this week at Jeff Pulver’s HD Connect conference in New York, researchers from Qualcomm demoed a handset running on the Verizon network using EVRC-WB, transcoding to G.722 on Polycom and Gigaset phones in their lab in San Diego. The sound quality was excellent, but the latency was very high – I estimated it at around two seconds round trip.

The ITU addresses latency (delay) in Recommendation G.114. Delay is a problem because normal conversation depends on turn taking. Most people insert pauses of up to about 400 ms as they talk. If nobody else speaks during a pause, they continue. This means that if the one-way delay on a phone conversation is greater than 200 ms, the talker doesn’t hear an interruption within the 400 ms break, and starts talking again, causing frustrating collisions.

The ITU E-Model for call quality identifies a threshold at about 170 ms one-way at which latency becomes a problem. The E-Model also tells us that increasing latency amplifies other impairments – notably echo: at low latencies even severe echo is not a problem, but at high latencies even relatively quiet echo can severely disrupt a talker.
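To make the thresholds concrete, here is a toy classifier applying the figures above to the two calls described earlier, approximating one-way delay as half the estimated round trip:

```python
# A toy classifier applying the thresholds above (one-way delay in ms).
def classify_one_way_delay(ms: float) -> str:
    if ms <= 170:
        return "good: below the E-Model threshold"
    if ms <= 200:
        return "marginal: other impairments such as echo are amplified"
    return "poor: expect turn-taking collisions"

# The two calls described above, halving the estimated round-trip times:
for label, rtt_ms in [("G.722 call with Michael Graves", 1000),
                      ("Qualcomm EVRC-WB/G.722 demo", 2000)]:
    print(label, "->", classify_one_way_delay(rtt_ms / 2))
```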

Some people may be able to handle long latencies better than others. Michael observed that he can get used to high latency echo after a few minutes of conversation.

Transparency and neutrality

Google and the New America Foundation have been working together for some time on White Spaces. Now they have (with PlanetLab and some academic researchers) come up with an initiative to inject some hard facts into the network neutrality debate.

The idea is that if users can easily measure their network bandwidth and quality of service, they will be able to hold their ISPs to the claims in their advertisements and “plans.” As things stand, businesses buying data links from network providers normally have a Service Level Agreement (SLA) which specifies minimum performance characteristics for their connections. For consumers, things are different. ISPs do not issue SLAs to their consumer customers. When they advertise uplink and downlink speeds, these speeds are “typical” or “maximum,” but they don’t specify a minimum speed, and they don’t offer any guarantees of latency, jitter, packet loss or even integrity of the packet contents. For example, here’s an excerpt from the Verizon Online Terms of Service:

VERIZON DOES NOT WARRANT THAT THE SERVICE OR EQUIPMENT PROVIDED BY VERIZON WILL PERFORM AT A PARTICULAR SPEED, BANDWIDTH OR DATA THROUGHPUT RATE, OR WILL BE UNINTERRUPTED, ERROR-FREE, SECURE…

Businesses pay more than consumers for their bandwidth, and providing SLAs is one of the reasons. Consumers would probably not be willing to pay more for SLAs, but they can still legitimately expect to know what they are paying for. The Measurement Lab data will be able to confirm or disprove accusations that ISPs are intentionally impairing traffic of some types.

This is a complicated issue, because one man’s traffic blocking is another man’s network management, and what a consumer might consider acceptable use (like BitTorrent) may violate an ISP’s Acceptable Use Policy (Verizon: “…it is a violation of… this AUP to… generate excessive amounts of email or other Internet traffic”). The arguments can go round in circles until terms like “excessive” and “unlimited” are defined numerically and measurements are made. So Measurement Lab is a great step forward in the Network Neutrality debate, and should be applauded by consumers and service providers alike.

Wi-Fi certification for voice devices

In news that is huge for VoWi-Fi, the Wi-Fi Alliance announced on June 30th a new certification program, “Voice-Personal.” Eight devices have already been certified under this program, including enterprise access points from Cisco and Meru, a residential access point from Broadcom, and client adapters from Intel and Redpine Signals.

Why is this huge news? Well, as the press release points out, by 2011 annual shipments of cell phones with Wi-Fi will be running at roughly 300 million units. The Wi-Fi in these phones will be used for Internet browsing, for syncing photos and music with PCs, and for cheap or free voice calls.

The certification requirements for Voice-Personal are not aggressive: only four simultaneous voice calls in the presence of data traffic, with a latency of less than 50 milliseconds and a maximum jitter of less than 50 milliseconds. These numbers will produce an acceptable call under most conditions, but a network round-trip delay of 300 ms is generally considered to approach the limit of acceptability, and with a Wi-Fi hop at each end running at the limit of these specifications there would be no room in the latency budget for any additional delays in the voice path. The packet loss requirement, 1% with no burst losses, is a stringent one, considering that modern voice codecs from companies like GIPS can yield excellent sound quality in the presence of much higher packet loss. It is also hard to achieve in the real world, as phones encounter microwave ovens, move through spots of poor coverage and transition between access points.
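Here is one plausible reading of that latency budget, using the certification ceilings (my arithmetic, not the WFA’s): four Wi-Fi traversals per round trip, plus a de-jitter buffer at each receiving end sized to the jitter ceiling.

```python
# One reading of the latency budget above (my arithmetic, not the WFA's).
RTT_LIMIT_MS   = 300   # rough round-trip limit of acceptability
HOP_LATENCY_MS = 50    # Voice-Personal per-hop latency ceiling
HOP_JITTER_MS  = 50    # per-hop jitter ceiling

wifi_transit   = 4 * HOP_LATENCY_MS   # a Wi-Fi hop at each end, both directions
jitter_buffers = 2 * HOP_JITTER_MS    # a de-jitter buffer at each receiving end
headroom = RTT_LIMIT_MS - wifi_transit - jitter_buffers
print("headroom for the rest of the path:", headroom, "ms")   # -> 0 ms
```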

Since this certification is termed “Voice-Personal,” four active calls per access point is acceptable; a residence is unlikely to need more than that. Three of the four access points submitted for this certification are enterprise access points. They should be able to handle many more calls, and probably can. The Wi-Fi Alliance is planning a “Voice-Enterprise” certification for 2009.

There are several things that are good about this certification. First, the WFA has seen fit to highlight voice as a primary use for Wi-Fi, and has set a performance baseline. Second, this certification requires some other certifications as well, like WMM power save and WMM QoS. So far in 2008, of 99 residential access points certified only 6 support WMM power save, and of 52 enterprise access points only 13 support WMM power save. One of the biggest criticisms of Wi-Fi in handsets is that it draws too much power. WMM power save yields radical improvements in battery life – better than doubling talk time and increasing standby time by over 30%, according to numbers in the WFA promotional materials.

Reliable VoIP

QoS metrics are important, and several companies have products that measure packet loss, jitter, latency and so on. But you can have perfect QoS, and your VoIP system can still be defective for all sorts of reasons.

I spoke with Gurmeet Lamba, VP of Engineering at Clarus Systems, at the Internet Telephony Expo this week. He said that even if a VoIP system is perfectly configured on installation, it can decay over time to the point of unusability. Routers go down and are brought up again with minor misconfigurations; moves, adds and changes accumulate bad settings and policy violations.

VoIP systems are rarely configured perfectly even on installation. For example, IP phones have built-in switches so you can plug your PC into your desk phone. Those ports are unlocked by default. But some phones are installed in public areas like lobbies. It’s easy for installers to forget to lock those ports, so anybody sitting in the lobby can plug their laptop into the LAN. There are numerous common errors of this kind. Clarus has an interesting product that actively and passively tests for them; it monitors policy compliance and triggers alarms on policy violations.
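A sketch of the kind of check such a product might run, in the spirit of the lobby-phone example above. The inventory records and field names here are hypothetical, not Clarus’s actual data model or API.

```python
# Sketch of an automated policy check for unlocked phone PC ports.
# Inventory records and field names are hypothetical, for illustration.
phones = [
    {"name": "lobby-phone-1",  "location": "public",  "pc_port_enabled": True},
    {"name": "desk-phone-312", "location": "private", "pc_port_enabled": True},
]

def violations(inventory):
    """Yield phones that break the 'no open PC ports in public areas' rule."""
    for phone in inventory:
        if phone["location"] == "public" and phone["pc_port_enabled"]:
            yield phone["name"]

for name in violations(phones):
    print("policy violation:", name)   # -> lobby-phone-1
```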

Clarus uses CTI to do active testing of your VoIP system, looking for badly configured devices and network bottlenecks. Currently it works only on Cisco voice networks, but Clarus plans to support other manufacturers.

Clarus started out focusing on automated testing of latency, jitter and packet loss for IP phone systems. It went on to add help desk support with remote control of handsets, and the ability to roll back phone settings to known good configurations.

The next step was to add “Business Information,” certifying deployment configurations, and helping to manage ongoing operations with change management and vulnerability reports. Clarus’ most recent announcement added passive monitoring based on a policy-based rules engine.

Clarus claims to have tested over 350,000 endpoints to date. It has partners that offer network monitoring services.

WSJ on FMC

Today’s Wall Street Journal has a good article about T-Mobile’s UMA trial in Seattle. It says that T-Mobile may be rolling it out nationally as early as next month, despite some trial participants’ complaints about handoff and battery life issues. T-Mobile will be offering a home router to help with QoS and battery life. I presume that for battery life this is just WMM Power Save (802.11e APSD), since that is what the phones in the trial (Samsung T709 and Nokia 6136) support. On the QoS side I expect these APs will support WMM (802.11e EDCF), but they could also support some proprietary QoS on the WAN access link, the way that the AT&T CallVantage routers do, which would be interesting.

There is some background on the trial here.

The article goes on to put the trial into the context of other FMC deployments, from BT Fusion, Telecom Italia and Orange. The article quotes a Verizon Wireless spokesman saying that they aren’t convinced that Wi-Fi can deliver high enough voice quality to carry Verizon branded calls. This is amusing bearing in mind the usual quality of a cellular call in a residence.

The article also quotes Frank Hanzlik, the head of the Wi-Fi Alliance, as saying that business FMC may have more potential than consumer FMC. I agree.

Dual-mode phones are the key to better-sounding calls

Potentially, VoIP calls can sound radically better than what we are used to, even on landline phones. So why don’t they? It may be lack of will. Some say the success of the mobile phone industry proves that people don’t care about sound quality on their calls. I don’t think this is a valid inference. All it proves is that people value mobility more highly than sound quality.

The telephonic journey from mouth to ear, often thousands of miles in tens of milliseconds, traverses a chain of many weak links, each compounding the impairment of the sound. First, the phone: whether it’s a headset, a desk phone or a PC, the microphone and speakers have to be capable of transmitting the full frequency spectrum of the human voice without loss, distortion or echo. Second, the digital encoding of the call: it has to be done with a wideband codec. Third, the codec has to be end-to-end, so no hops through the circuit-switched phone network. Finally, the network must convey the media packets swiftly and reliably, since delayed packets are effectively lost, and lost packets reduce sound quality.

Discussions of VoIP QoS normally dwell on the last of these factors, but the others are at least as important. The exciting thing about dual-mode cell phones is that they provide a means to cut through all of these weak links. Because they must handle polyphonic ring tones and iPod-type capabilities, the speakers on most cell phones can easily carry the full frequency range of the human voice. Cell phone microphones can also pick up the required range, and DSP techniques can mitigate the physical acoustic design challenges of the cell phone form factor. Smart phone processors have the oomph to run modern wideband codecs. This leaves the issue of staying on the IP network from end to end. The great thing about dual-mode phones is that they can connect directly to the Internet in the two places where most people spend most of their time: at work and at home.

So if you and the person you are talking to are both in a Wi-Fi enabled location, and you both have a dual-mode cell phone, your calls should not only be free, but the sound should be way better than toll quality.

Check out the V2oIP website for an industry initiative on this topic.