Skype is the gorilla of HD Voice. Looking at my Skype client I see that there are at this moment about 16 million people enjoying the wideband audio experience on Skype. The other main type of Voice over IP, SIP, is rarely used for HD Voice conversations, though I wrote an HD Voice Cookbook to help to popularize wideband codecs on SIP. Since Skype has the largest base of wideband codec users, those who are enthusiasts of both HD Voice and SIP are eager for SIP networks to interoperate with Skype, allowing all HD-capable endpoints to talk HD to each other. Skype does already kind of interoperate with SIP, but only through the PSTN, which reduces the wideband media stream to narrowband. Opening up Skype would solve this problem, so it’s obviously a good idea. What is not so clear, however, is what it means to “open up Skype.”
Skype reinvented Voice over IP, and did it better than SIP. SIP was originally intended to be a lightweight way to set up real-time communications sessions. It was the Internet Engineering Task Force’s response to the complexities of the ITU VoIP standard, H.323. But SIP got hijacked by the telephone industry, and recast into the familiar mold of proliferating standards and proprietary implementations. SIP is no longer lightweight; implementation is a challenge, and only the basic features are easily interoperable.
Take a look at my HD Voice Cookbook to see what it takes to set up a typical SIP phone, then compare this to installing Skype on your PC. Or compare it to the simplicity of plugging in a POTS phone to your wall socket. So we have:
- Skype, free video calls with HD voice from your PC to anywhere in the world;
- POTS, narrowband voice-only calls that cost about $30 per month plus per-minute charges for international calls; or
- SIP, which falls somewhere in between the two but is way too complex for consumers to set up, and which people only really use for narrowband because everybody else only uses it for narrowband, so there’s no network effect.
Open VoIP standards got a several-year start on Skype, starting with H.323 and going on to SIP; but from its inception Skype blew them out of the water. To be sure it had a strong hype amplifier since P2P file sharing was controversial at that time, and Skype came from the same people as Kazaa, but at that time NetMeeting (an H.323 VoIP program) had an enormous installed base, since it came as part of Windows. The problem Skype solved was ease of use.
Skype doesn’t just give you video and wideband voice. It’s all encrypted and you get all sorts of bonus features like conferencing, presence, chat, desktop sharing, NAT traversal and dial-by-name. And did I mention it’s free?
The open standards VoIP community was beaten fair and square by Skype, blowing a several-year start in the process.
Let me clarify that. In terms of minutes of voice traffic on network backbones, SIP traffic outweighs Skype, so from that point of view, SIP is not so beaten by Skype. The sense in which Skype has trounced the open standards VoIP community is in providing users with something better and cheaper than the decades-old PSTN experience, which carrier VoIP merely strives to emulate at a marginally lower price.
So it seems to me like sour grapes to clamor for Skype to make technical changes to conform to open standards, especially if those changes would impair some of the benefits that Skype offers users. How would users benefit from opening up Skype? Would the competition lower the cost of a Skype call? It’s hard to see how, when Skype calls are free. Would the service be more accessible, or accessible to more customers? No, because anybody with a browser can download Skype free by typing “Skype” or even “Skipe” into their browser’s search field. Would the open standards community innovate faster than Skype, and provide more and better features? Not based on their respective track records. The open standards community has had plenty of time to out-innovate Skype and manifestly failed.
Anyway, what are the senses in which Skype is not open? It is certainly interoperable with the PSTN; SkypeIn and SkypeOut are among the cheapest ways to make calls on the PSTN. Actually, this may be the greatest threat to Skype’s innovation. SkypeIn and SkypeOut are the only way that Skype makes money; this is a powerful motivation for Skype to not incent users to abandon them. If this remains the only economic force acting on the company, Skype is likely to decay into an old-style regular phone service provider.
After a lot of debate with people who know about these things, there seem to be two main ways in which Skype could be said to be not open:
- The protocol is proprietary and not published, so third parties can’t implement endpoints that interoperate with Skype endpoints.
- Only Skype can issue Skype addresses, and Skype controls the directories rather than using DNS like SIP.
Let’s look at the issue of the proprietary protocol first. Let’s break it into two parts, first who defines the protocols and second, their secrecy. In the debate between the cathedral and the bazaar, the cathedral has recently been losing out to the bazaar amongst the theorizers. We see the success of Apache, MySQL, Linux and Firefox and it looks as though the cathedral is being routed in the marketplace, too. But on the other hand we have successful companies like Apple, Google, Intel and Skype, whose success demonstrates that a design monopoly can often deliver a more elegant and tight user experience. There is no Linus Torvalds of SIP. Having taken the decision to implement a protocol other than SIP, it seems fine to me that whoever invented the Skype protocol should continue to design it, especially since they have manifestly done a much better job than the designers of SIP – ‘better’ in the sense of being more appealing to users.
What about the secrecy? A while back one of the original designers of SIP, Henning Schulzrinne, with his colleague Salman Baset, reverse engineered the Skype network and published their findings here. There is more technical background on Skype here. According to Baset and Schulzrinne:
Login is perhaps the most critical function to the Skype operation. It is during this process a Skype client authenticates its user name and password with the login server, advertises its presence to other peers and its buddies, determines the type of NAT and firewall it is behind, discovers online Skype nodes with public IP addresses, and checks the availability of latest Skype version.
Opening up the protocol to let other people use it would enable them to implement their own Skype login servers. This would enable a parallel network, but in the absence of a new protocol that enabled the login servers to exchange information, it would not lead to interoperability, in the sense of users on Skype being able to view the presence information of users on the parallel network, or even retrieve their IP address to make a call. So it would have the effect of fragmenting the Skype network, rather than opening it. Alternatively the Skype login servers could implement the SIP protocol to exchange presence information. But then it would start to be a SIP network, not a Skype network. And the market numbers say that users find SIP inferior to Skype. So why do it?
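The fragmentation argument can be made concrete with a toy model. This is only a sketch under loud assumptions: the `LoginServer` class, its methods, the user names and the IP addresses are all invented for illustration and say nothing about how Skype's real login servers work. It simply shows that two servers with separate directories and no server-to-server protocol cannot see each other's users.

```python
class LoginServer:
    """Hypothetical stand-in for a login server: it only knows its own users."""

    def __init__(self, name):
        self.name = name
        self.directory = {}  # user name -> (ip address, presence)

    def login(self, user, ip):
        # Record where the user is and that they are online.
        self.directory[user] = (ip, "online")

    def lookup(self, user):
        # There is no protocol for asking another server, so a user who
        # logged in elsewhere is simply unknown here.
        return self.directory.get(user)


# Two independent networks running the same (published) client protocol.
skype = LoginServer("skype")
parallel = LoginServer("parallel")

skype.login("alice", "198.51.100.7")   # documentation-only IP addresses
parallel.login("bob", "203.0.113.9")

alice_view_of_bob = skype.lookup("bob")    # alice's server has never heard of bob
bob_view_of_bob = parallel.lookup("bob")   # bob's own server knows him
```

In this model alice can neither see bob's presence nor retrieve his IP address, even though both clients speak the same protocol. Making the lookup work across servers would need exactly the extra exchange protocol discussed above.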
Opening up the protocol to let other people write Skype clients that logged into the Skype login servers would open up the network, but at the risk of introducing interoperability issues due to faulty interpretations of the specification. Network protocols are notoriously prone to this kind of problem. But guaranteed interoperability of the clients is one of the primary benefits of Skype over SIP from the point of view of the user, who would therefore not benefit from this step.
So why not have Skype distribute binaries that expose to third party applications the functionality of the protocols and the ability to log into the Skype login server through a published API? Wait a sec – they already do that.
Another objection to Skype publishing the protocols for third parties to implement is that there would be a danger of the third parties implementing some parts of the protocol but not others. For example not the encryption part, or not the parts that enable clients to be super-nodes or relays. A proliferation of this kind of free-rider would stress the network, making it more prone to failure.
Related to the issue of who implements the login servers is who issues Skype addresses. There is a central authority for issuing phone numbers (the ITU), and a central authority for issuing IP addresses (the IANA). But in both cases, the address space is hierarchical, allowing the central authority to delegate blocks of addresses to third party issuers. The Skype address space is not hierarchical, so it would require some kind of reworking to enable delegation. Alternatively the Skype login servers could accept logins from anybody with a SIP address. But there would be no guarantee that the client logging in was interoperable.
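To illustrate why a hierarchical address space lends itself to delegation while a flat one does not, here is a small sketch. Everything in it is an assumption made for illustration: the `Registry` and `FlatDirectory` classes, the prefixes and the sample number are hypothetical and do not describe how the ITU, the IANA or Skype actually manage addresses.

```python
class Registry:
    """Hypothetical issuer that owns every address under a prefix."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.delegated = {}  # sub-prefix -> child Registry
        self.issued = set()

    def delegate(self, sub_prefix):
        # A block can only be handed out from inside our own block.
        if not sub_prefix.startswith(self.prefix):
            raise ValueError("can only delegate blocks inside our own prefix")
        child = Registry(sub_prefix)
        self.delegated[sub_prefix] = child
        return child

    def issue(self, address):
        if not address.startswith(self.prefix):
            raise ValueError("address is outside this issuer's block")
        # Once a block is delegated, only the delegate may issue inside it.
        if any(address.startswith(p) for p in self.delegated):
            raise ValueError("that block has been delegated to another issuer")
        self.issued.add(address)
        return address


class FlatDirectory:
    """Hypothetical flat namespace: one central set, nothing to delegate."""

    def __init__(self):
        self.names = set()

    def issue(self, name):
        # Uniqueness can only be checked against the single central set.
        if name in self.names:
            raise ValueError("name already taken")
        self.names.add(name)
        return name


root = Registry("+")                    # central authority
uk = root.delegate("+44")               # third-party issuer for one block
number = uk.issue("+442079460000")      # issued without consulting the root

flat = FlatDirectory()
name = flat.issue("alice")              # every name goes through one issuer
```

With prefixes, the delegate can issue addresses on its own and collisions are impossible by construction. With flat names there is no block to hand over, so every issuer would have to consult the same central set, which is the reworking the paragraph above refers to.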
Scanning back through this posting, I see that my arguments could be parodied as “you can’t argue with success,” and “if it ain’t broke don’t fix it.” Arguments of this type are normally weak, so in this case I think my points are actually “there are reasons for Skype’s success,” “fixes could break it,” and “users would be better served if Skype competitors concentrated on seducing them with a superior offering,” the last of which, after all, is how Skype has won its users away from the traditional telecom industry. Some people are trying this approach, notably Gizmo5, which I plan to write about later.