Skype is the gorilla of HD Voice. Looking at my Skype client I see that there are at this moment about 16 million people enjoying the wideband audio experience on Skype. The other main type of Voice over IP, SIP, is rarely used for HD Voice conversations, though I wrote an HD Voice Cookbook to help to popularize wideband codecs on SIP. Since Skype has the largest base of wideband codec users, those who are enthusiasts of both HD Voice and SIP are eager for SIP networks to interoperate with Skype, allowing all HD-capable endpoints to talk HD to each other. Skype does already kind of interoperate with SIP, but only through the PSTN, which reduces the wideband media stream to narrowband. Opening up Skype would solve this problem, so it’s obviously a good idea. What is not so clear, however, is what it means to “open up Skype.”
Skype reinvented Voice over IP, and did it better than SIP. SIP was originally intended to be a lightweight way to set up real-time communications sessions. It was the Internet Engineering Task Force’s response to the complexities of the ITU VoIP standard, H.323. But SIP got hijacked by the telephone industry, and recast into the familiar mold of proliferating standards and proprietary implementations. SIP is no longer lightweight, implementation is a challenge, and only the basic features are easily interoperable.
Take a look at my HD Voice Cookbook to see what it takes to set up a typical SIP phone, then compare this to installing Skype on your PC. Or compare it to the simplicity of plugging in a POTS phone to your wall socket. So we have:
- Skype, free video calls with HD voice from your PC to anywhere in the world;
- POTS, narrowband voice-only calls that cost about $30 per month plus per-minute charges for international calls; or
- SIP, which falls somewhere in between the two but is way too complex for consumers to set up, and which people only really use for narrowband because everybody else only uses it for narrowband, so there’s no network effect.
Open VoIP standards got a several-year start on Skype, starting with H.323 and going on to SIP; but from its inception Skype blew them out of the water. To be sure it had a strong hype amplifier since P2P file sharing was controversial at that time, and Skype came from the same people as Kazaa, but at that time NetMeeting (an H.323 VoIP program) had an enormous installed base, since it came as part of Windows. The problem Skype solved was ease of use.
Skype doesn’t just give you video and wideband voice. It’s all encrypted and you get all sorts of bonus features like conferencing, presence, chat, desktop sharing, NAT traversal and dial-by-name. And did I mention it’s free?
The open standards VoIP community was beaten fair and square by Skype, blowing a several-year start in the process.
Let me clarify that. In terms of minutes of voice traffic on network backbones, SIP traffic outweighs Skype, so from that point of view, SIP is not so beaten by Skype. The sense in which Skype has trounced the open standards VoIP community is in providing users with something better and cheaper than the decades-old PSTN experience, which carrier VoIP merely strives to emulate at a marginally lower price.
So it seems to me like sour grapes to clamor for Skype to make technical changes to conform to open standards, especially if those changes would impair some of the benefits that Skype offers users. How would users benefit from opening up Skype? Would the competition lower the cost of a Skype call? It’s hard to see how, when Skype calls are free. Would the service be more accessible, or accessible to more customers? No, because anybody with a browser can download Skype free by typing “Skype” or even “Skipe” into their browser’s search field. Would the open standards community innovate faster than Skype, and provide more and better features? Not based on their respective track records. The open standards community has had plenty of time to out-innovate Skype and manifestly failed.
Anyway, what are the senses in which Skype is not open? It is certainly interoperable with the PSTN; SkypeIn and SkypeOut are among the cheapest ways to make calls on the PSTN. Actually, this may be the greatest threat to Skype’s innovation. SkypeIn and SkypeOut are the only way that Skype makes money; this is a powerful motivation for Skype to not incent users to abandon them. If this remains the only economic force acting on the company, Skype is likely to decay into an old-style regular phone service provider.
After a lot of debate with people who know about these things, there seem to be two main ways in which Skype could be said to be not open:
- The protocol is proprietary and not published, so third parties can’t implement endpoints that interoperate with Skype endpoints.
- Only Skype can issue Skype addresses, and Skype controls the directories rather than using DNS like SIP.
Let’s look at the issue of the proprietary protocol first. Let’s break it into two parts, first who defines the protocols and second, their secrecy. In the debate between the cathedral and the bazaar, the cathedral has recently been losing out to the bazaar amongst the theorizers. We see the success of Apache, MySQL, Linux and Firefox and it looks as though the cathedral is being routed in the marketplace, too. But on the other hand we have successful companies like Apple, Google, Intel and Skype, whose success demonstrates that a design monopoly can often deliver a more elegant and tight user experience. There is no Linus Torvalds of SIP. Having taken the decision to implement a protocol other than SIP, it seems fine to me that whoever invented the Skype protocol should continue to design it, especially since they have manifestly done a much better job than the designers of SIP – ‘better’ in the sense of being more appealing to users.
What about the secrecy? A while back one of the original designers of SIP, Henning Schulzrinne, with his colleague Salman Baset, reverse engineered the Skype network and published their findings here. There is more technical background on Skype here. According to Baset and Schulzrinne:
Login is perhaps the most critical function to the Skype operation. It is during this process a Skype client authenticates its user name and password with the login server, advertises its presence to other peers and its buddies, determines the type of NAT and firewall it is behind, discovers online Skype nodes with public IP addresses, and checks the availability of latest Skype version.
Opening up the protocol to let other people use it would enable them to implement their own Skype login servers. This would enable a parallel network, but in the absence of a new protocol that enabled the login servers to exchange information, it would not lead to interoperability, in the sense of users on Skype being able to view the presence information of users on the parallel network, or even retrieve their IP address to make a call. So it would have the effect of fragmenting the Skype network, rather than opening it. Alternatively the Skype login servers could implement the SIP protocol to exchange presence information. But then it would start to be a SIP network, not a Skype network. And the market numbers say that users find SIP inferior to Skype. So why do it?
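To make the fragmentation point concrete, here is a minimal Python sketch. The `LoginServer` class, the user names and the IP addresses are hypothetical illustrations, not a description of how Skype's login servers actually work. The only assumption carried over from the text is that each server knows about just the users who logged in to it, and that there is no protocol for servers to exchange that information.

```python
from typing import Optional


class LoginServer:
    """Hypothetical login server: it tracks only the users that logged in to it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._online: dict[str, str] = {}  # user name -> current IP address

    def login(self, user: str, ip: str) -> None:
        self._online[user] = ip

    def lookup(self, user: str) -> Optional[str]:
        # With no server-to-server protocol, a server can answer
        # only for its own users.
        return self._online.get(user)


skype = LoginServer("skype")
parallel = LoginServer("parallel")

skype.login("alice", "192.0.2.10")
parallel.login("bob", "198.51.100.7")

print(skype.lookup("alice"))  # alice is visible on her own network
print(skype.lookup("bob"))    # bob logged in elsewhere, so nothing is found
```

Alice cannot see Bob's presence or retrieve his address even though both servers speak the same protocol. Closing that gap needs some way for the two servers to exchange records, which is exactly the extra protocol the paragraph above says is missing.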
Opening up the protocol to let other people write Skype clients that logged into the Skype login servers would open up the network, but at the risk of introducing interoperability issues due to faulty interpretations of the specification. Network protocols are notoriously prone to this kind of problem. But guaranteed interoperability of the clients is one of the primary benefits of Skype over SIP from the point of view of the user, who would therefore not benefit from this step.
So why not have Skype distribute binaries that expose to third party applications the functionality of the protocols and the ability to log into the Skype login server through a published API? Wait a sec – they already do that.
Another objection to Skype publishing the protocols for third parties to implement is that there would be a danger of the third parties implementing some parts of the protocol but not others. For example not the encryption part, or not the parts that enable clients to be super-nodes or relays. A proliferation of this kind of free-rider would stress the network, making it more prone to failure.
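The free-rider argument can be put as rough arithmetic. The sketch below is a toy model, not a measurement of the Skype network: the function name, the client counts and the 20% share of clients needing a relay are all made-up assumptions. It also assumes that relay work is spread evenly over the clients that implement the relay part of the protocol.

```python
def relayed_clients_per_contributor(total_clients: int,
                                    free_riders: int,
                                    relayed_share: float) -> float:
    """Toy model: average number of relayed clients carried by each contributing client."""
    contributors = total_clients - free_riders
    if contributors <= 0:
        raise ValueError("no clients left to act as relays")
    return total_clients * relayed_share / contributors


# Hypothetical numbers: 1,000 clients, 20% of which need a relay.
for free_riders in (0, 500, 900):
    load = relayed_clients_per_contributor(1000, free_riders, 0.2)
    print(f"{free_riders} free riders -> {load:.1f} relayed clients per contributor")
```

In this toy model the load on each contributing client grows as the number of free riders rises, and it grows fastest when few contributors are left.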
Related to the issue of who implements the login servers is who issues Skype addresses. There is a central authority for issuing phone numbers (the ITU), and a central authority for issuing IP addresses (the IANA). But in both cases, the address space is hierarchical, allowing the central authority to delegate blocks of addresses to third party issuers. The Skype address space is not hierarchical, so it would require some kind of reworking to enable delegation. Alternatively the Skype login servers could accept logins from anybody with a SIP address. But there would be no guarantee that the client logging in was interoperable.
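What a hierarchical address space buys you can be sketched with Python's standard `ipaddress` module. The block `10.0.0.0/8`, the issuer names and the `issuer_for` helper are illustrative assumptions, not real allocations. The point is only that a central authority can carve its space into sub-blocks, and that the issuer can then be worked out from the address itself.

```python
import ipaddress
from typing import Optional

# A central authority holds one large block (a private range, used here as an example).
central_block = ipaddress.ip_network("10.0.0.0/8")

# Delegation: carve the block into smaller blocks and hand each one to an issuer.
sub_blocks = list(central_block.subnets(new_prefix=16))
delegations = {"issuer_a": sub_blocks[0], "issuer_b": sub_blocks[1]}


def issuer_for(address: str) -> Optional[str]:
    """Find the issuer from the address alone, by checking which block contains it."""
    ip = ipaddress.ip_address(address)
    for issuer, block in delegations.items():
        if ip in block:
            return issuer
    return None  # the address lies in a block that has not been delegated


print(issuer_for("10.1.2.3"))  # inside 10.1.0.0/16, the second delegated block
```

A flat name such as a Skype user name carries no prefix of this kind, so nothing in the name says who issued it. That is why delegation would need the reworking mentioned above.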
Scanning back through this posting, I see that my arguments could be parodied as “you can’t argue with success,” and “if it ain’t broke don’t fix it.” Arguments of this type are normally weak, so in this case I think my points are actually “there are reasons for Skype’s success,” “fixes could break it,” and “users would be better served if Skype competitors concentrated on seducing them with a superior offering,” the last of which, after all, is how Skype has won its users away from the traditional telecom industry. Some people are trying this approach, notably Gizmo5, which I plan to write about later.
Differences between Skype and SIP make it problematic to draw conclusions from their respective degrees of success. Ease of use and reliability are more readily achievable with a vertically integrated and proprietary solution like Skype. Open platforms enjoy a more rapid pace of innovation. The game remains in play. Consider the example of AOL and the world wide web. Vertically integrated and proprietary AOL enjoyed an early lead in the online business, but the openness of the world wide web won in the long run.
It is true that SIP may become more successful than Skype. In some ways it already is – all SkypeIn and SkypeOut minutes go over SIP, and SIP is the protocol of choice for the next generation of cellular technology as well as wireline technology (they all say they are moving to IMS). This post was really focusing on HD Communications, and Skype currently leads SIP overwhelmingly in minutes of wideband voice and video. But this lead will evaporate if the cellular companies adopt AMR-WB as planned, proving you right, Daniel.
The question is, what will cause broad adoption of wideband? Numerous forces influence innovation in the telecom industry, like regulation, standards and intellectual property. But as with all businesses, the primary driver is money. HD Voice yields a superior experience, so it has a greater value. The trouble is, nobody has figured out a way to translate that greater value into greater revenue. Even Skype makes no money on its wideband and video calls. Its entire revenue is from narrowband calls on SkypeIn and SkypeOut.
Skype HD made money for the company founders in the sale to eBay. Skype HD makes money in the sense of attracting users to whom the company can sell SD minutes. Skype makes money for companies that sell Skype-enabled devices. I believe the telecom business models will start to look like infotech business models associated with the web. Lots of companies make money serving people interested in the world wide web. The rise of software as a service even provides an example of metered usage of web-based apps. The metering will not likely involve minutes of use and tracking location of use as in the case of telecom. The big telcos did a fair amount of transforming to survive the move from wireline to wireless. They will need to keep moving in order to survive the move from the PSTN to Internet-based services.
While I agree with most of your arguments, and believe that openness is important, I think there’s one flaw in your comparison of SIP to Skype.
While Skype is a service, SIP is a protocol.
Protocols are not easy to use or install – they are just protocols.
Skype were the first to innovate in the sense of bringing a service that actually works through firewalls and gives pretty good voice quality, and they have done so while being easy to install and use.
No SIP or H.323 service at the time was able to do that.
The way I see it, innovation comes in two ways:
1. If you go proprietary and ignore standards, you can innovate faster. This is what Skype did.
2. If you go open, you can let others innovate for/with you.
Opening up Skype will let them enjoy the fruits of 3rd party innovation.
Tsahi
Thanks for pointing this out, Tsahi. You are right about Skype being a service while SIP is a protocol. Some of the references to “Skype” in this post should read “the Skype protocol,” and some of the references to “SIP” should read “SIP-based services.” I concede it is confusing to leave it to the reader to figure out which is which. I will be more careful with my terminology in the future.
I also agree with your point that proprietary implementations are faster to market, while standard implementations let you leverage the work of others. My point on this issue is that Skype can benefit from third party innovation through its published API, and that SIP has suffered from excessive “innovation” that serves mainly to complicate the protocol rather than improve it.
I still don’t know what the phrase “opening up Skype” means in practice, and I would still like to hear how it would benefit users. The primary benefit that I am currently interested in is propagation of wideband voice. Skype has already taken a step in this direction that could be viewed as opening up: the royalty-free licensing of the SILK codec.
Michael,
I am with you on the open APIs thingy – it does make a lot of sense. But in this case, as Skype is a software and not a hardware platform, the ability to use clients that are not fully controlled by Skype in front of its servers may also make sense, as it can grow the audience from PC users to consumer electronics.
Wideband voice propagation is happening with or without Skype – just check out Pulver’s latest stunt of HD Communication Summit.
Hi,
Thanks for the good article. There was one paragraph that raises an issue I have been wondering about:
“Another objection to Skype publishing the protocols for third parties to implement is that there would be a danger of the third parties implementing some parts of the protocol but not others. For example not the encryption part, or not the parts that enable clients to be super-nodes or relays. A proliferation of this kind of free-rider would stress the network, making it more prone to failure.”
I assume that mobile clients cannot or do not act as super-nodes, relying on the deskbound clients to do the legwork. With the off switch for supernodes installed to appease enterprises, could the balance of supernodes and users become, well, unbalanced?
Also, wasn’t there a kerfuffle with Skype’s Chinese partner over encryption, implying that variations of the protocol have been deployed?