Skype – and why SIP is not the answer

Once in a while I blog about VoIP. Jingle and Skype in particular are things I think about quite a lot.

And every time I blog about one of these, some users point me to a SIP solution like Wengo, Ekiga or Gizmo, which is supposed to solve all the problems I have with Skype (not standards compliant, no ALSA support until a few weeks ago, etc.). Thanks for the comments and the hints, I really appreciate the feedback. But I also have to say: they do not solve any problem. In fact, they can only introduce even more problems, because they are SIP based. And SIP is not a solution which can be compared with Skype. Not at all.

The problem with SIP is that it only works out of the box when you have either no NAT router or firewall at all in front of you, or SIP-capable firewalls/routers. In other cases, you have to start dealing with STUN, UPnP or similar stuff, and that does not work in every case. Especially when there is a NAT setup with more restrictive rules on both sides, SIP is impossible. And that’s the fault of SIP, which is designed that way.
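
To make the "restrictive NAT on both sides" case a bit more concrete, here is a minimal Python sketch. It is a toy model, not the behaviour of any particular router or SIP client: the class `ToyNat`, the function names and the addresses (taken from the documentation IP ranges) are all assumptions made up for illustration. The only thing it models is whether the public address a STUN server reports is the same one a remote peer would see; real NATs also apply filtering rules that are ignored here.

```python
from itertools import count

# Hypothetical endpoints, using IP ranges reserved for documentation.
STUN_SERVER = ("198.51.100.1", 3478)
REMOTE_PEER = ("203.0.113.7", 5060)


class ToyNat:
    """Toy NAT that only models how outbound port mappings are allocated."""

    def __init__(self, public_ip, per_destination_mapping):
        self.public_ip = public_ip
        # True  -> a fresh public port for every destination ("symmetric" style)
        # False -> one public port per internal socket ("cone" style)
        self.per_destination_mapping = per_destination_mapping
        self._next_port = count(40000)
        self._table = {}

    def map_outbound(self, src, dst):
        """Return the public (ip, port) the destination sees for this flow."""
        key = (src, dst) if self.per_destination_mapping else src
        if key not in self._table:
            self._table[key] = next(self._next_port)
        return (self.public_ip, self._table[key])


def stun_result_is_reusable(nat, src=("192.168.1.2", 5060)):
    """True if the address learned via STUN is also valid towards a peer."""
    seen_by_stun = nat.map_outbound(src, STUN_SERVER)
    seen_by_peer = nat.map_outbound(src, REMOTE_PEER)
    return seen_by_stun == seen_by_peer


def both_sides_stuck(nat_a, nat_b):
    """True if neither peer can advertise a usable address from STUN alone."""
    return not stun_result_is_reusable(nat_a) and not stun_result_is_reusable(nat_b)


cone = ToyNat("198.51.100.10", per_destination_mapping=False)
strict_a = ToyNat("198.51.100.11", per_destination_mapping=True)
strict_b = ToyNat("198.51.100.12", per_destination_mapping=True)

print(stun_result_is_reusable(cone))         # True
print(both_sides_stuck(strict_a, strict_b))  # True
```

In this model the first NAT is the friendly case where STUN is enough, while two NATs of the second kind leave both clients without an address they can hand to each other, which is the situation where a call needs some third machine to help.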

Let’s have a look at the other side: Skype, the working solution everyone knows, and Jingle, the solution that will hopefully be perfect some day. They do not have these problems; they work out of the box. There is nothing to configure at all, it just works. That’s because the protocol design is completely different: it is P2P with nodes, and can reach everything everywhere. That is the big advantage compared to SIP. And that’s the reason why Wengo, Gizmo, Ekiga and all the other SIP solutions do not work for me, and are not a threat to Skype at all. Sure, they can be useful in a corporate environment where professionals create the setup, but not for personal use, where people will abandon SIP as soon as they have trouble reaching someone once in a while (for example when both sides are behind restrictive NATs).
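
The "P2P with nodes" idea can be pictured as a simple fallback chain. The Python sketch below is only an illustration under my own assumptions: Skype's protocol is not public, so this is not its actual algorithm, and the `Node` type, the `spare_kbps` field, the `CALL_KBPS` value and the selection rule are all hypothetical. It follows the behaviour described in the comments below: if both callers sit behind NAT, the call goes through some other node that is publicly reachable and has capacity to spare.

```python
from dataclasses import dataclass
from typing import List, Optional

CALL_KBPS = 64  # assumed bandwidth of one voice stream


@dataclass
class Node:
    name: str
    behind_nat: bool
    spare_kbps: int = 0  # hypothetical measure of free relay capacity


def choose_path(caller: Node, callee: Node, others: List[Node]) -> Optional[str]:
    """Pick a path for a call, or return None if no path exists."""
    # If at least one side is publicly reachable, the other side can
    # simply open the connection towards it.
    if not caller.behind_nat or not callee.behind_nat:
        return "direct"
    # Both sides are behind NAT: look for a public node that can relay.
    candidates = [
        n for n in others
        if not n.behind_nat and n.spare_kbps >= CALL_KBPS
    ]
    if candidates:
        relay = max(candidates, key=lambda n: n.spare_kbps)
        return f"relay via {relay.name}"
    return None


alice = Node("alice", behind_nat=True)
bob = Node("bob", behind_nat=True)
network = [
    Node("n1", behind_nat=True, spare_kbps=500),   # NATed, cannot relay
    Node("n2", behind_nat=False, spare_kbps=32),   # public, but too busy
    Node("n3", behind_nat=False, spare_kbps=128),  # public, enough capacity
]

print(choose_path(alice, bob, network))  # relay via n3
print(choose_path(alice, bob, []))       # None
```

The second call shows the catch: the approach depends on a suitable third node being online, so "it just works" holds only as long as the network has enough public nodes.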

My big hope, though, is that the programs mentioned above will get Jingle support soon. Actually I hope that almost all of these tools will get Jingle support soon. At the moment there are only Coccinella (which I have never tried), Telepathy and Tapioca, and Tapioca is set to join Telepathy in the near future.

At this point I also have to admit that I’m a bit disappointed by the reaction to Jingle: it was *the* opportunity to introduce a real alternative to Skype, and it could have been introduced to Linux and other free systems almost perfectly. Instead, the only system which really added Jingle support was Mac OS, where the Adium IM client has full Jingle support.
But I’m also a bit disappointed by Google: they could have launched Google Talk as a real competitor, together with a set of services (like call in and call out), with some functions for corporate needs (think about SIP-B for example) and with strong development support to bring it into the free world. I still miss the Asterisk connector, that would have been perfect! And I also thought Google would put more power into it to make it a real competitor to MSN or ICQ: where is the video support, where is full conference support, where are nice add-ons and easy share buttons for files or directories and such stuff? And where are these functions in the free clients?

I guess I have to wait until the free world stops spreading its efforts over implementing every protocol separately and starts working together. That should free up some resources and improve the quality of the implemented functions massively. Until then, I have to stick to half-baked solutions. What a sad situation.

12 thoughts on “Skype – and why SIP is not the answer”

  1. I thought the SIP issues were sorted out by STUN which is UDP tunneling I believe – but I’m not much of an expert on this. I hate protocols like Skype precisely because they are so aggressive about finding a way through your firewall. If the IT department meant for an application to be used they’d have allowed the port. If they didn’t intend to allow the protocol then the program shouldn’t try and route around the problem.

    I agree with you that Google should pull their finger out about gtalk. If they want to index the Internet then they need to realise that people below the age of 25 spend enormous amounts of time on IM. Their conversations imply all sorts of relationships and topics that Google will want to capture for advertising. And obviously there’s a lot of commercial opportunity in voice transcripts.

  2. About STUN: afaik it cannot work if both people sit behind a restricted NAT. And I had enough people struggling with different problems that I gave up on advertising SIP as such a good idea.

    But about the IT department thing: as I mentioned, SIP is a nice solution for them, because there you have specialists who are paid to solve the problem. But that’s nothing a home user can think about.

    And Skype itself: it is a nightmare for IT departments, I agree with you. I posted somewhere hidden in this blog a bit about this topic, mainly referring to these two studies: Silver Needle in the Skype and Skype uncovered. But here again Google could be more inventive and could implement the jingle stuff in a way that would be powerful, but also stoppable for real IT firewalls. That’s one of the things I meant when I wrote about “with some functions for corporate needs”.

    Well, any idea how to reach Google in a way that makes them listen to me and act the way I want them to? 😉

  3. I guess by ‘restricted NAT’ you mean where the firewall will not allow UDP from high ports out. So where the firewall policy is ‘block everything, and only allow a specific set of services’. That’s certainly how I used to set up firewalls for companies.

    But, I’m not sure if that is the circumstance for most home/soho users. They mostly set their firewall to block incoming traffic (NAT) and allow anything outgoing. SIP with STUN should work fine in that circumstance, and particularly as vendors make their firewalls more friendly to this traffic profile. Take FTP for example, it’s a nightmare but every vendor just built knowledge of it into their firewall – it uses high random ports to start data downloads.

    Guess the only way to get feedback to Google is through their feedback system or through support. But in reality I don’t think they’ll change at this point. They’re positioning themselves as the champion of standards versus Skype, the champion of ‘secret protocol that’s secure, honest!’.

  4. Just to add: when I speak about restricted NATs I mainly refer to this information, and to the sad experience that I was not able to set up two VoIP clients on two computers behind the same Linksys NAT. :/
    But, additionally, when you buy newer routers for home users in Germany, mostly together with a DSL contract, you often get VoIP ready routers.

  5. The problem with SIP is that it only works out of the box when you have either no NAT router or firewall at all in front of you, or SIP-capable firewalls/routers. In other cases, you have to start dealing with STUN, UPnP or similar stuff, and that does not work in every case. Especially when there is a NAT setup with more restrictive rules on both sides, SIP is impossible. And that’s the fault of SIP, which is designed that way.

    That’s just not right. Of course SIP is not “designed” to not work in the presence of NAT or firewalls. It’s just that two computers that are behind NAT cannot connect to each other, unless special measures are taken. Similarly, a firewall that blocks SIP traffic will, well, block SIP traffic. You can’t reasonably fault SIP for that.

    So how come Skype works in these circumstances? Simple: it’s designed to circumvent the restrictions. If two users that are behind NAT try to communicate, the data is routed through a third machine which is not behind NAT. Skype finds this machine through a peer to peer protocol, so it will only work if another Skype program is running, not behind NAT, and has enough resources to forward the traffic. Ekiga, a SIP (and H.323) client, uses STUN, a standard protocol for traversing NAT, which uses a central server. Obviously, this means the server must be available and have enough resources.

    What about firewalls? Well, SIP clients will use specific ports to communicate. If traffic to these ports is blocked, it won’t work. You may be able to use an HTTP proxy to get your SIP traffic through the firewall, because most firewalls allow HTTP traffic. In fact, this is exactly what Skype does, if all its other options fail.

    I hope to have made it clear that SIP and Skype suffer the same problems when confronted with NAT and firewalls, and that the solutions are there not only for Skype, but also for SIP clients. And for those of you who are going to say “but Skype Just Works, and SIP doesn’t”, I will add that Ekiga has always Just Worked for me, even though I am behind the most restrictive kind of NAT. I’m sure there are other SIP clients that Just Work, as I’m sure there are some that won’t Just Work. I would urge everyone to not dismiss SIP, simply because not all clients are easy to use in restrictive environments.

  6. STUN is a protocol to find out the public IP if you are behind a NAT, and what kind of NAT it is. You cannot use it to relay the call traffic through the central server between two clients which are both behind NATs. See here for more information.

    And yes, SIP is not p2p based, that’s what I’m talking about all the time. The fact that these are two different designs is clear; however, I can still compare how both solutions work for different tasks.

    And to come back: both solutions do not suffer, as you state, from the same problems. The p2p solution just works, and that’s what we need. The SIP solution does not in every case. It is nice that it works for you, lucky you, but it didn’t work for me in different setups. But individual experiences do not count here, so: there are too many setups where it cannot work easily without any (!) additional configuration, see here or here.

    Something else comes to my mind at this moment: even if ekiga “just works” for you, how is it if you suddenly change the network heavily? Like, to another router, or without any NAT at all? Or the other way around? If someone changes from a public IP to a NAT, ekiga must be reconfigured at least once. Sure, it is easy, but still impossible for the average computer user.

    A final word: I am searching for a solution for my friends and family members which just works in every case. And I would like to have something based on open standards. The competitor is Skype. And at this point, SIP does not work. It is because of design decisions, that’s clear, and that’s ok. But that means that it is not the right solution. By design. Therefore I have to search for another solution, and the ideal solution would be an open, but also p2p based solution like jingle.

  7. STUN is a protocol to find out the public IP if you are behind a NAT, and what kind of NAT it is.

    …which will, in most cases, allow you to traverse the NAT. Skype does something similar, although it’s not exactly STUN, AFAIK.

    And to come back: both solutions do not suffer, as you state, from the same problems.

    You’re not interpreting my words the way I meant them. The problems are the same for any software that wishes to set up communication between two nodes that are behind NAT or firewalls. Skype implements solutions that overcome the problems. As far as I know, there’s nothing preventing the use of these solutions for SIP. Of course, many (all?) SIP clients haven’t implemented them, but that’s a problem with the clients (or, as I would say, with NAT), not with SIP.

    If someone changes from a public IP to a NAT, ekiga must be reconfigured at least once. Sure, it is easy, but still impossible for the average computer user.

    Which, again, is not a problem with SIP, but with the clients (or with NAT).

    Of course, I see the reality that Skype Just Works and SIP clients don’t. If this weren’t the case, there wouldn’t be so much complaining. I just find it unfair to fault SIP for it. Which brings me to my last point:

    Therefore I have to search for another solution, and the ideal solution would be an open, but also p2p based solution like jingle.

    Why Yet Another Protocol? Why not just – finally – standardize and implement a way to do NAT traversal that works, and run SIP over it?

  8. Ok, you’re probably right and I did not interpret your words as you meant them, so I will just concentrate on the last question:

    Why Yet Another Protocol? Why not just – finally – standardize and implement a way to do NAT traversal that works, and run SIP over it?

    The problem is that you have to do another standardization. That’s something new everyone has to accept, and everyone has to agree to. Keep in mind that there are quite a lot of standards which are not implemented on any operating system because no one cares, and everyone thinks these standards are not worth it.
    So, when you standardize a new way you not only need a good standard, you also have to persuade all vendors to accept this standard. Also, not everyone wants to buy a new router. In Germany roughly 25% of all homes have DSL, and therefore they already have a router. They will use these devices for the next 5 years, and during that time no new standard can be a solution because it will not affect the devices already sold.

    And afaik everyone hopes that such a standard comes up – the closing words of the RFC which defines the STUN protocol mention that STUN can be a solution in several cases until the NAT-standard is fully accepted and spread. But at the moment, they are still waiting.

    And since one step is needed anyway, there are two possibilities: you would prefer NAT standardization, I would prefer another protocol. Actually, I do not really see the problem with a new protocol. Why stick with SIP? Yes, there are quite a lot of devices and techniques you can get which are working with SIP (telephones, etc.), but there are also enough problems with it (SIP-B, afaik), and you can also get telephones and such stuff for Skype. So, it is worth a try.
    Or, which is probably the best solution: give it a try. If jabber/jingle were at least pushed as a real competitor, the market would choose.

    At the moment the home user market chose Skype…

  9. First off, I should probably clarify that I meant “standardize” in the de facto sense, not in the sense of let’s get some standards body together and spend the next 5 years coming up with some elephantine specification.

    The problem is that you have to do another standardization.

    But consider the alternative: you would have to standardize a new VoIP protocol. So you have that anyway. If you standardize a NAT traversal protocol, that solves problems for a number of protocols at once, rather than being VoIP-specific. Given the choice, I know which one I would prefer.

    That’s something new everyone has to accept, and everyone has to agree to.

    Well, it’s not as bad as it sounds. Vendors would be free to implement it or not, but those who didn’t would have a problem with NAT traversal.

    Also, not everyone wants to buy a new router.

    Depending on how the traversal works, they may not need to. What I had in mind was actually a sort of peer to peer protocol, like the one Skype uses, for example. I think there already are open protocols that fit the bill.

    Actually, I do not really see the problem with a new protocol.

    Well, interoperability with existing implementations, for one. There’s quite a lot of SIP equipment and software already out there. Plus, I take the position that the advocates of a new protocol have to convince the world of its advantages, rather than the world convincing them of the advantages of not breaking things. So, what would be the great advantage of introducing Yet Another VoIP Protocol, rather than working with what we already have?

    and you can also get telephones and such stuff for Skype.

    …and lock yourself into proprietary technology. No, thanks.

    Or, which is probably the best solution: give it a try.

    I agree with that. If the vendors of SIP clients can’t get their act together and make them work, they have themselves to blame when competitors run away with their customers. In the end, the customers decide which technology will win – unfortunately, that’s often not the best one.

    At the moment the home user market chose Skype

    Clearly. Also clearly, I resent that choice. First of all, because I hate it when a company manages to enthuse users for their proprietary product, even when equivalent open products have existed for years (in this case, for example, open source VoIP (and video conferencing) applications existed before NAT became a real problem).

    Secondly, because there are real practical disadvantages to proprietary technology. For example, AFAIK, there’s no Skype client for Linux on powerpc, and there probably won’t be one, until the protocol is publicly known. When that happens, Skype-the-company might start trying to block unofficial clients, just like we’ve seen in the IM world. Speaking of which: don’t a lot of IM clients do voice chat using SIP as well? How do they deal with NAT?

  10. Skype is a trust issue, you’re running software on your machine that is explicitly designed to penetrate firewalls and only a couple dozen people have examined the source code/protocol. Scary.

  11. Inglorion:

    Depending on how the traversal works, they may not need to. What I had in mind was actually a sort of peer to peer protocol, like the one Skype uses, for example. I think there already are open protocols that fit the bill.

    So, where is the difference to what I proposed? That’s what I’m talking about! Jingle *is* a sort of peer to peer protocol like the one Skype uses, it is already in use to some degree, and it is on the way to becoming an official Jabber/XMPP standard, and therefore an official industry standard. Since Jabber is actually used in corporate environments, afaik, jingle would already have a way to enter the real market.

    I agree with that. If the vendors of SIP clients can’t get their act together and make them work, they have themselves to blame when competitors run away with their customers.

    Well, no – because it is not the problem of the SIP clients. SIP cannot work as wanted by me by design, as we already discussed. So you need to extend it, and afterwards it is not SIP anymore! It is probably SIP-E (extended), or whatever, but not SIP.

    open source VoIP (and video conferencing) applications existed before NAT became a real problem

    Again, I unfortunately cannot agree with you: NAT has been a problem all the time, since ethernet started to spread. At least where I come from the bandwidth which you need for VoIP came together with NAT.

    don’t a lot of IM clients do voice chat using SIP as well? How do they deal with NAT?

    I know the SIP clients you get when you order a SIP number, or programs like Ekiga. From everything I have seen, the best solution available at the moment is Ekiga. And there you have the problems I mentioned. And, again, it is not a question of the program, it is a question of the design of the protocol. It is designed with special goals for specific tasks. And these goals do not meet the needs of home users in the real world today.
    With IPv6 this would not be a problem. But we are not in the IPv6 world yet.
