Yesterday the Phonon homepage was released. Several online media outlets reported on it, among them the German site pro-linux.de.
In the discussion about this article it became clear pretty soon that there are two things which bother the users about Phonon:
* first: some of them do not know how Phonon differs from ALSA or what it is needed for
* second: some users complain that KDE does not simply take GStreamer as the default
Here is an explanation of these topics, together with some general remarks on how sound works under Linux. The video part is not really addressed, although I sometimes talk about “multimedia”. If you have comments on how to extend this post, feel free to post them.
First, there is the hardware: different sound cards and different sound chips, and therefore different hardware drivers. To avoid having to implement support for every piece of hardware in every program, we need to abstract the hardware drivers into one kind of “standard driver” which all other programs can access. This abstraction layer can then wrap all the individual hardware drivers and talk to the hardware.
This abstracted hardware accepts all the audio streams and commands which would otherwise go directly to the hardware. To keep things simple, audio is only accepted as raw audio data streams. The abstraction layer should not be bothered with the different music formats.
This task is done by OSS or by ALSA: ALSA is the standard for Linux, while OSS is still used in Unix versions like BSD. The advantage of ALSA here is that it can accept not just one raw audio stream but several, and mix them together. So different applications can talk to ALSA and send it streams at the same time.
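To make the mixing idea a bit more concrete, here is a minimal Python sketch of what "mixing several raw streams" means in principle: the samples are added up and clipped. This is not how ALSA is implemented; `mix_pcm16` and `RecordingDriver` are names I made up for illustration, and the sample values are invented.

```python
import array


def mix_pcm16(streams):
    """Mix several signed 16-bit PCM buffers into one by summing samples.

    Shorter streams count as silence once they run out, and sums are
    clipped to the 16-bit range so that loud passages do not wrap around.
    """
    decoded = [array.array("h", s) for s in streams]
    length = max((len(d) for d in decoded), default=0)
    mixed = array.array("h", [0] * length)
    for i in range(length):
        total = sum(d[i] for d in decoded if i < len(d))
        mixed[i] = max(-32768, min(32767, total))
    return mixed.tobytes()


class RecordingDriver:
    """Hypothetical stand-in for a hardware driver: it only stores what it gets."""

    def __init__(self):
        self.written = b""

    def write(self, pcm):
        self.written += pcm


# Two "applications" hand over raw samples at the same time.
app_a = array.array("h", [1000, 2000, 30000]).tobytes()
app_b = array.array("h", [500, -2000, 10000, 7]).tobytes()

driver = RecordingDriver()
driver.write(mix_pcm16([app_a, app_b]))
result = array.array("h", driver.written).tolist()
print(result)
```

Note that the layer never needs to know where the samples came from: an MP3 player and a system beep look exactly the same once they are raw data.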
But, as I said, ALSA only takes raw data streams. Why? Well, imagine the number of different audio formats (MP3, OGG, WAV, AAC, WMA, RMA, etc.) and the different ways they can be delivered (stream, file, etc.). ALSA was designed as a hardware abstraction, and to keep a project alive and developing it makes sense not to put all problems into one solution. Keep it simple, you might call it. There are enough problems to fight with when you are designing a hardware abstraction layer.
So we come to the second step: the multimedia framework. This framework is the part which understands all the different media formats like the ones mentioned above. Several people therefore say it should support some kind of plugins, to provide an easy way of integrating new file formats.
Common multimedia frameworks are GStreamer, NMM, Xine, Helix and also the old aRts from KDE 3. Since OSS is not able to mix multiple sound streams, some of these frameworks can take over this job and are therefore also called sound servers. One of the reasons aRts was very popular in the beginning was that it addressed this problem very well.
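The plugin idea can be sketched in a few lines of Python. This is only an illustration of the principle and not the design of any of the real frameworks: `register_decoder`, `decode` and `make_wav` are hypothetical names, and the only decoder registered here is a WAV decoder built on Python's standard `wave` module.

```python
import io
import wave

# Registry of decoder plugins, keyed by a format name.
DECODERS = {}


def register_decoder(fmt):
    """Decorator that registers a decoder plugin for one format."""
    def wrap(func):
        DECODERS[fmt] = func
        return func
    return wrap


@register_decoder("wav")
def decode_wav(data):
    """Decode WAV bytes into raw PCM frames using the standard library."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.readframes(wav.getnframes())


def decode(fmt, data):
    """Turn encoded media into raw PCM, or fail if no plugin knows the format."""
    try:
        decoder = DECODERS[fmt]
    except KeyError:
        raise ValueError(f"no decoder plugin for format {fmt!r}") from None
    return decoder(data)


def make_wav(pcm, rate=8000):
    """Helper for the demo: wrap raw 16-bit mono PCM in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buf.getvalue()


raw = b"\x01\x00\x02\x00\x03\x00"
decoded = decode("wav", make_wav(raw))
print(decoded == raw)
```

Supporting a new format then means registering one more function, while everything behind the framework still sees nothing but raw data.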
The next step is the third: the wrapper. Imagine you are programming in a specific framework like the KDE framework: you are using your normal programming language (here C++), you are using the APIs you are used to (Qt/KDE APIs), and you are used to the style of this environment and the way problems are addressed in it.
Then it makes a lot of sense to provide you with a convenient set of APIs in your favoured language if I want you to implement a new general feature.
This feature is now multimedia, or more specifically, sound. You have to integrate some kind of multimedia framework into your program, but it is most likely not written especially for your desktop environment with APIs in the style you are used to. It is written with its own APIs, with its own way of addressing problems, and probably even in its own programming language.
That is exactly the situation we have with GStreamer and KDE!
So there is a need for a wrapper which provides the KDE developers with something they are used to, and here we have it: Phonon. Phonon provides APIs which the KDE people can easily integrate into their programs and make use of without learning too much new stuff. Keep it simple: developers, after all, are also just human beings.
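What such a wrapper does can be shown with a small Python sketch. All names here are invented for illustration: `FrameworkA` and `FrameworkB` stand for two multimedia frameworks with very different native APIs, and `Player` is the one small API the application developer gets to see. None of this is Phonon's actual API.

```python
class FrameworkA:
    """Hypothetical framework with a pipeline-style API."""

    def build_pipeline(self, uri):
        return ["source:" + uri, "decoder", "sink"]

    def set_state(self, pipeline, state):
        return f"A {state} {pipeline[0]}"


class FrameworkB:
    """Hypothetical framework with a single-call API."""

    def open_and_play(self, path):
        return f"B playing {path}"


class BackendA:
    """Adapter that translates the common call into FrameworkA's API."""

    def __init__(self):
        self._fw = FrameworkA()

    def play(self, url):
        pipeline = self._fw.build_pipeline(url)
        return self._fw.set_state(pipeline, "playing")


class BackendB:
    """Adapter that translates the common call into FrameworkB's API."""

    def __init__(self):
        self._fw = FrameworkB()

    def play(self, url):
        return self._fw.open_and_play(url)


class Player:
    """What the application developer sees: one call, whatever the backend."""

    def __init__(self, backend):
        self._backend = backend

    def play(self, url):
        return self._backend.play(url)


out_a = Player(BackendA()).play("song.ogg")
out_b = Player(BackendB()).play("song.ogg")
print(out_a)
print(out_b)
```

The application code is the same in both cases; only the adapter knows how the framework underneath wants to be talked to.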
So you need to have this wrapper, no matter if you want to tightly integrate GStreamer and nothing else into KDE or not.
Before we take a closer look at this last sentence, let us talk about the fourth and last step of the sound architecture in Linux: the applications. These should be able to hand their sound over to something as easily as possible.
If you are a developer who is used to KDE, you should use the APIs provided by KDE, which in future means those provided by Phonon. If you are a developer of professional audio software, you should integrate ALSA support directly into your application, probably with an option to switch to OSS (although I do not think that a professional sound application would be satisfied with OSS).
GNOME developers will certainly use GStreamer at the moment, but who knows what the future will bring? NMM looks very, very promising, and even Helix seems to be coming up with some interesting stuff in the near future.
But back to Phonon: I already explained why we have to have Phonon, no matter whether we integrate GStreamer as tightly as possible or not. Now have a look at the history of KDE: KDE already thought once that it had found the holy grail with aRts, and it was very painful to learn that it had not. Additionally, even today GStreamer is not supported by everyone: there are enough people who prefer Xine, aRts or Helix, and the distributors also have something to say. And do not forget that something new could suddenly appear and provide an astonishing new multimedia framework with functions no user has ever dared to dream of.
Another thought is binary incompatibility: even if GStreamer is the preferred solution for the next years, GStreamer will keep developing too. And there will probably be a point where it breaks binary compatibility between two versions. With Phonon that would be just a small correction (well, probably a new Phonon backend), and nothing to worry about. The same is true for all other backends.
With these thoughts in the background, and given that there had to be a wrapper in any case, the developers decided to make the extra effort to be able to switch between the different solutions. It is also a nice way to keep some backward compatibility with KDE 3, since Phonon can also support aRts.
It just keeps KDE flexible and still gives the opportunity to support GStreamer as the main solution.
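The flexibility argument can be illustrated with a tiny Python sketch: keep a preference list with GStreamer first, and fall back to the next backend if the preferred one is not usable. The function `pick_backend` and the backend names as plain strings are assumptions made for this example; it does not describe how Phonon really selects its backend.

```python
def pick_backend(preferred, available):
    """Return the first preferred backend that is actually available."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("no usable multimedia backend found")


# GStreamer is the main solution, the others are fallbacks.
preference = ["gstreamer", "xine", "arts"]

normal_case = pick_backend(preference, {"gstreamer", "xine"})

# Suppose a new framework release broke compatibility and the matching
# backend is not ready yet: applications simply end up on the next one.
fallback_case = pick_backend(preference, {"xine", "arts"})

print(normal_case, fallback_case)
```

The applications themselves do not change in either case, which is the whole point of the extra effort.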
So, if you think GStreamer is the best ever: well, don’t complain, GStreamer can be fully integrated into KDE and can be the default backend. You might object that it only makes sense to support one backend: different backends offer different functions, you cannot support them all, and you cannot offer function x of backend y to application developers if the other backends lack it, so you end up exposing only a limited subset of each backend instead of its full abilities. That’s right in theory, but in practice Phonon and most multimedia frameworks are aiming at the same target: the normal user. And the normal user only has limited needs. As mentioned above: if you want to write a highly professional audio application, nothing stands in your way. But then you should be as close to the hardware as you can, so you shouldn’t use a multimedia framework but should work directly with ALSA.
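The "limited functions" objection boils down to a set intersection, which a few lines of Python can show. The backend names and feature lists below are completely made up and do not describe what any real backend supports.

```python
# Invented feature lists for two hypothetical backends.
features = {
    "backend_x": {"play", "pause", "seek", "volume", "equalizer"},
    "backend_y": {"play", "pause", "seek", "volume", "crossfade"},
}

# A wrapper can only promise what every backend is able to deliver.
common = set.intersection(*features.values())

# Everything else is backend-specific and stays out of the common API.
left_out = set.union(*features.values()) - common

print(sorted(common))
print(sorted(left_out))
```

In this toy case the common part is exactly what a normal user needs to listen to music, and the parts that are left out are the ones only a few specialised applications would miss.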
So far, I hope that I have answered some questions and calmed things down a bit. Spread the word or a link to this post, and show that Phonon is not as bad as several people think. Hopefully the opposite will be the case 🙂