Ways to understand the Linux Users – Popcon and Smolt

The Linux user is largely an unknown species: since you can download your distribution anonymously and from anywhere, the distributors know almost nothing about their user base. Because of BitTorrent and FTP mirrors, even the user numbers are rough estimates at best. However, with Popcon and Smolt there are two different approaches to gathering more information about the user base.

Popcon

Popcon is short for popularity contest, as in Debian Popularity Contest. The idea is simple: a small piece of software on the user’s computer gathers data about the installed software and rough data about how it is used (regularly or not at all). These data are sent to a central server, which collects them and makes them available to everyone interested. All you have to do to participate is install the popularity-contest package.
And the results speak for themselves: over 50k people participated in the contest and submitted usage information.

Of course, not everyone installed the package, and those who did are most likely more technically inclined than the people who didn’t – but the data are still interesting. For example, you can check how well your package is adopted and used – or not. And you can find out whether a specific package you introduced is really used: webmin is installed on over 2k machines – but only used regularly on a couple of hundred. By contrast, clamav is installed on almost 5k machines and is regularly used on almost 3k machines.
Also, you can check for the general popularity of packages: totem is much more popular than Amarok (more Gnome users, I guess), but xine is much more popular than mplayer. And so on…
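To make the installed-versus-used comparison concrete, here is a minimal sketch in Python. The class and function names are my own, and the numbers are hypothetical round figures that only follow the orders of magnitude mentioned above; they are not actual Popcon output, and this is not how Popcon itself processes its data.

```python
from dataclasses import dataclass


@dataclass
class PackageStats:
    """Hypothetical container for two counts a survey might report."""
    name: str
    installed: int  # machines reporting the package as installed
    regular: int    # machines reporting regular use

    def usage_ratio(self) -> float:
        """Share of installations that report regular use."""
        if self.installed == 0:
            return 0.0  # avoid dividing by zero for unknown packages
        return self.regular / self.installed


def rank_by_usage(stats):
    """Sort packages so the most actively used ones come first."""
    return sorted(stats, key=lambda s: s.usage_ratio(), reverse=True)


# Illustrative round numbers only (assumption, not real survey data).
samples = [
    PackageStats("webmin", installed=2000, regular=200),
    PackageStats("clamav", installed=5000, regular=3000),
]

for s in rank_by_usage(samples):
    print(f"{s.name}: {s.usage_ratio():.0%} of installs used regularly")
```

A ratio like this is more telling than the raw install count, because a package that is often installed but rarely started looks very different from one that people actually run.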

I must admit that I would love to have such information for Fedora, because I would also like to get some hints about the adoption of the packages I maintain. But at the moment it is highly unlikely that we will see such information :/

Smolt

What Popcon is for installed software, Smolt is for hardware: Smolt collects hardware information from every participating client. With Fedora 7, every user has to decide whether s/he wants to take part in the data collection or not, which might increase the number of participants drastically. At the moment the database lists roughly 11k entries.

With these data at hand you can easily check which kind of hardware is in use – and where to focus on improved hardware integration and support if you want to please the Linux user base. Also, it might give some hints about what kind of hardware support you can expect.
For example: one third of all machines have 512 MB RAM or less – therefore the distributor should go easy on the RAM. On the other hand, more than one third of all machines have two or more CPUs/cores. Also, almost half of the installations are marked as Desktop and 20% as Laptop (and 20% as unknown).
But you can also check for the hardware used in one category: the Fedora people tend to use ATI hardware more than NVIDIA hardware.

Besides this hardware information, Smolt also gathers basic information about the system itself, like the default language, the version of the distribution and so on. This can be pretty important for distributors who want a picture of how many people are still using old versions of a distribution, and of what it will mean when they are forced to upgrade, for example.

Last words

I really hope that it will become normal for all distributions to collect information of both types. I would love to see cooperation here, but as usual this is unlikely in the short term.
In the meantime, every distribution goes its own way. Ubuntu, for example, has the Ubuntu Hardware Database – which has been seriously broken for months now, which is pretty disappointing. But I’m sure they will fix it eventually.

In any case, when it comes to statistics you have to be careful and skeptical every time: since Popcon and Smolt both rely on volunteers, you won’t get a representative profile of the user base.
Also, gathering numbers can be tricky – for example, Debian’s popcon package was installed on fewer machines than the number of machines that sent information, and the network device with the largest share among the Smolt users is used by 157.7% of all users…
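I don’t know how Smolt arrives at a share above 100%, but one plausible cause (my assumption, not something confirmed by the Smolt developers) is counting device entries instead of machines. The following Python sketch uses made-up records and hypothetical function names to show how that kind of counting can go wrong:

```python
# Hypothetical records: (machine_id, device_name), one per detected device.
# Two machines report the same network device twice, e.g. dual-port cards.
records = [
    ("m1", "nic-a"), ("m1", "nic-a"),
    ("m2", "nic-a"), ("m2", "nic-a"),
    ("m3", "nic-b"),
]


def naive_share(records, device):
    """Device entries divided by machine count; can exceed 100%."""
    machines = {m for m, _ in records}
    if not machines:
        return 0.0
    entries = sum(1 for _, d in records if d == device)
    return entries / len(machines)


def per_machine_share(records, device):
    """Machines with at least one such device, divided by machine count."""
    machines = {m for m, _ in records}
    if not machines:
        return 0.0
    with_device = {m for m, d in records if d == device}
    return len(with_device) / len(machines)


print(f"naive:       {naive_share(records, 'nic-a'):.1%}")
print(f"per machine: {per_machine_share(records, 'nic-a'):.1%}")
```

With these made-up records the naive count gives 133.3%, while counting each machine only once gives 66.7%. Whatever the real cause is, it shows why the denominator matters when reading such statistics.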

8 thoughts on “Ways to understand the Linux Users – Popcon and Smolt”

  1. xine more popular than mplayer? now that’s _really_ hard to believe for me, since mplayer is just better suited as general media player.

    maybe they meant xine-lib which is used just about everywhere?

  2. Yes, I was talking about xine-lib – like mplayer, it is used in many other players.
    And it is your opinion that mplayer is better suited – others might have different opinions.

    To me it was not very surprising – xine is used in several good players, and gstreamer is getting better and better.
    It might be worse in other distributions, btw.: Fedora managed to split xine into free and non-free parts, which it could not do with mplayer. Therefore Fedora does include xine, but not mplayer; you need additional repos for that.

  3. Xine is much better than mplayer for playing media. I’ve always had problems with mplayer but have not had any issues with Xine to speak of.

    I’ve had these issues copied over from distro to distro and as such now always avoid mplayer.

    Xine seems to handle avi and wmv files better once the file support is installed too.

  4. “the Fedora people tend to use ATI hardware more than NVIDIA hardware.”

    What are they, crazy or something? Everyone knows NVidia plays way nicer with Linux than ATI.

  5. Out of four Linux systems I use (Fedora 7 on one and SUSE 10.2 on the other three), the only ATI video I have is on the notebook system, since it’s onboard video. Given a choice, I’ve picked NVIDIA cards for all the desktop system graphics and unless ATI does something really drastic to change the current driver situation, I will continue as is.

  6. With regards to xine-mplayer….both deserve respect. There is no reason to select one over another just to declare some victory (until next week). Both are quite functional, load the same codecs and so forth. Xine has a GUI by default which may throw off some users of mplayer. Mplayer has some serious command line functionality which GUI users will never even know. The best thing to do…just have ’em both.., Need not be a debate….yes yes I know there’s xine-nonfree and m…restricted or whatever but by all accounts both do their job 100%. Only in the event the user does not do his is this even an issue. As of late the ATI (proprietary) drivers have been doing a good job so using either nvidia or ati – the user experience should be very much alike.

  7. I recently looked at popcon data for software that we develop. It was really useful, especially being able to see the usage data as opposed to simply the install count.

    I package our apps for Fedora and I also wish there was something similar.

    The journey started when I looked at our application on sourceforge. Seems Windows users were the most prolific users of our application. Yet when I checked popcon data it was about 5X as large on Debian. Which ignored Ubuntu data.

    I don’t know what the privacy concerns are, I’m sure they exist. But as a packager it really does help me understand my usefulness. It also helps to compare my install base with other similar applications.

  8. Popcon on Fedora would make much sense, however in the case of Fedora it would need some people to port popcon to Fedora and integrate it with the installer. Not an impossible task, but something which simply has to be done by someone.

    Once it is set up it would face the same privacy concerns as smolt, and they could be answered in the same way, so I don’t see real problems there.
    But again, first someone needs to port the tools needed.
