The Shorewall Environment Survey 2006
Paul Gear
Copyright 2006 Paul D. Gear
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover, and with no Back-Cover Texts. A copy of the license is included in the section entitled GNU Free Documentation License.
Background In early March 2006, i embarked on the journey of surveying Shorewall users. Initially this sprang from my own curiosity: i thought that some of the systems at work on which i use Shorewall may be bigger and more complex than most others, and i wanted to find out if there are people out there who use Shorewall like i do. As i started thinking about the questions i would ask, i realised that if i asked the right questions, i could create a survey that might help the Shorewall project to understand its users better. I used Zoomerang to create the survey. It has a number of tools that make it easy to create useful surveys. To get the most benefit out of Zoomerang, you have to subscribe to their professional version. In the long term, it would be great to have a practical free software alternative that could be self-hosted. A number of free content management systems such as Drupal have a survey module, but when i last looked at them, they were more limited and harder to use than Zoomerang.
Survey and results links The survey is still open as of this writing, and can be accessed at the Zoomerang survey page. Further participation is encouraged. The figures quoted in this document reflect the results at the time of writing. The public results of the survey are available. If you complete the survey, a link to the results is provided on the thank you page.
Sample size An important note about this survey is that it has a small sample size (103 complete responses at the time of writing), so any conclusions drawn should be considered tentative. To speculate on the overall number of users that this sample represents: the Debian popularity contest reports 478 installations of Shorewall, 285 of which are in active use. Assuming that the popularity contest represents 30% of the Debian installed base (likely ridiculously optimistic), and that Debian's share of all distributions is 26%, the number of active Shorewall systems would be approximately:
285 / 0.30 / 0.26 = 3654
(rounding to the nearest whole number, and assuming the percentages extrapolate regularly). This means that our survey represents a maximum of 2.8% of the installed base (103 / 3654), likely far less.
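The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name and parameter names are made up for illustration, and the 30% and 26% shares are the assumptions stated in the text, not measured values.

```python
def estimate_installed_base(active_debian_systems: int,
                            popcon_share: float,
                            debian_share: float) -> float:
    """Extrapolate active Shorewall systems across all distributions.

    popcon_share: assumed fraction of Debian systems that report to the
                  popularity contest (0.30 in the text, likely optimistic).
    debian_share: assumed fraction of all distributions that is Debian
                  (0.26 in the text).
    """
    return active_debian_systems / popcon_share / debian_share


responses = 103        # complete survey responses at the time of writing
active_on_debian = 285  # active installations reported by popcon

estimate = estimate_installed_base(active_on_debian, 0.30, 0.26)
coverage = responses / estimate

print(round(estimate))     # estimated active Shorewall systems
print(f"{coverage:.1%}")   # upper bound on the share the survey represents
```

Lowering either assumed share makes the estimated installed base larger and the survey's coverage smaller, which is why the 2.8% figure is best read as an upper bound.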
Other possible inaccuracies Additionally, since the survey was open to multiple responses, it could be that some people answered the questions about themselves more than once, despite instructions to the contrary in the introduction page. There is an error in the released version of the survey for question 15 (RAM size): it was a multiple choice question rather than single choice, and thus there were more results than expected. The number of errors doesn't seem to be significant. If you notice any errors in this analysis, or have any suggestions about how to improve it, please contact the author at pgear@shorewall.net.
Results analysis
Organisations Small organisations dominate the spectrum of Shorewall users. The largest group (44%) was 1-10 users, mostly SOHO LANs judging by the comments in that section. Ninety percent (90%) of Shorewall installations are in organisations with fewer than 500 users. The results for the questions about organisational size and the number of users serviced by Shorewall match fairly closely, which seems to indicate that the majority of Shorewall systems are servicing the entire organisation in question.
The vast majority (84%) of Shorewall systems are administered by only one person. One question that needs to be asked is, "Why?" Possible reasons for this might be:
Most of the organisations in which it is used are small, thus most of them will only have one person skilled in the area of packet filtering firewalls. This seems a likely scenario, but a cross correlation of the results of questions 1 and 2 with question 3 indicates that the number of administrators is fairly uniform across all sizes of organisation and user base.
Shorewall works so well that people don't have to touch it much. Obviously, this is the preferred interpretation of the Shorewall project team. :-)
Shorewall is too hard for new users to comprehend, so one skilled person in an organisation tends to get the job of maintaining it. Equally obviously, this is a non-preferred interpretation. :-) However, being a firewall generator, Shorewall is not likely to attract the same sort of users as a web browser or music player.
Shorewall administrators are a closed bunch and don't like sharing their job around. Given the nature of firewalls and packet filtering, this doesn't seem far-fetched.
There doesn't seem to be an easy answer to this question. In retrospect, since there were no responses indicating 10 or more administrators, i could have made the granularity of this question better. A question about a person's role in the organisation may also have been helpful.
Possibly we could follow up with a smaller survey, specifically about the people and organisations who use Shorewall.
Users Unsurprisingly, 97% of survey respondents were male. Or to put it another way: surprisingly, there are actually 3 female Shorewall users. :-) Being male seems to be an occupational hazard of life in the IT industry, and even more so in the more "nerdy" specialisations like Linux and security. The largest age group of users is 25-34 years (42% of all respondents). There were no retirees (65 and over) or minors (under 18) in the responses. The distribution of the remaining age groups was fairly even. The largest group of users in terms of education was those with a Bachelor's degree, followed by those with a high school education. Fifty-seven percent (57%) of Shorewall users have a Bachelor's degree or better. Many users' highest qualifications are not in an IT-related discipline (42%). This remains fairly constant across the spectrum when correlated with the highest level of qualifications. Those who do not claim IT as their highest discipline come from a wide variety of other fields, including agriculture, art, business, chemistry, education, various forms of engineering, law, mathematics, physics, and theology. Almost two-thirds of users (62%) use Shorewall as part of their paid employment. Of these, 12% (7 of 58) do not use Shorewall as part of their official duties. Cross correlation with level of education revealed no major variance in this trend. The majority of users (73%) began using the Internet in the 1990s. A smaller majority (61%) have been using the Internet for more than 12 years (1994 or earlier). (The single response indicating use of the Internet (then ARPANET) since the 1960s seems to be an error.) The majority of users (70%) began using Linux after it reached a certain stage of maturity - around or after the release of kernel 2.0 (1996). However, nearly all respondents (97%) have been using Linux for 5 years or more, with almost half (47%) having 10 or more years' experience with it.
It seems fair to say that as a rule, Shorewall attracts people with plenty of experience. Just under one third of users (30%) have been using Shorewall for more than 5 years, and two-thirds (66%) have used it since the 1.x series (2003 or earlier). Shorewall users appear to stick with the product once they are familiar with it. On the other hand, it seems that Shorewall is not attracting large numbers of new users, which is a concern for the future of the project.
Hardware Ninety-three percent (93%) of users run Shorewall on i386 family hardware, with a further 6% running it on x86-64/EM64T platforms. One response was received indicating use of Shorewall on MIPS (Linksys WRT platform). No responses were received for any other hardware platform. While it is not surprising that Intel would be dominant, given their market share, it seems a little skewed not to have any representatives of other architectures. A good spread of CPU power is shown in the survey responses. The largest group was 400-999 MHz (30%), with only 16% of responses indicating less than 400 MHz, and the same number greater than 2500 MHz. A number of responses in the field for additional information suggested that the machines used were either recycled desktops, or systems that were specifically built to do the job, and had been running in that role for a number of years. RAM configuration seemed to mostly mirror CPU power, with a slight bias towards higher RAM figures. The majority (52%) of systems have between 256 and 1023 MB; only 11% of systems have less than 128 MB; 28% have 1024 MB or more. This reflects the more server-oriented workload that many Shorewall systems run (see the section on server roles below). Shorewall systems on the whole tend toward smaller OS hard disks, with 42% having disks 39 GB or smaller. The largest group by a small margin was 80-159 GB at 23%, with 10-39 GB and 0-9 GB coming in a close second and third at 22% and 20% respectively.
Network The majority of Shorewall systems (82%) use between two and four network interfaces. The number of devices connected to systems closely mirrors the size of the organisations in which they are used, with 95% of systems connecting fewer than 500 devices, and the largest group (41%) connecting 2-10 other devices. Ninety percent (90%) of Shorewall systems are connected to 100 Mbps or faster local networks. Most systems have a broadband Internet connection or better, with only 7% having 512 Kbps or less, and 51% having 10 Mbps or better. DSL is the most common form of Internet connection, with over half the responses (51%).
Software The most popular Linux distribution on which users run Shorewall is Debian (26% of respondents), followed by a group consisting of Fedora Core (16%), Red Hat 9 and earlier (13%) and Red Hat Enterprise and derivatives (12%). The next group consists of SUSE (9%), Slackware (8%), Gentoo (6%), and LEAF/Bering (5%). The message about maintaining an up-to-date Shorewall system seems to have gotten through, with 61% of respondents running the latest stable version (3.0), and an additional 22% running the previous stable version (2.4). Only 14% of users are running unsupported older versions (2.2 and previous). The most common roles played by Shorewall systems are: external firewall/router (78%); DNS name server (61%); DHCP server (59%); internal firewall/router (56%); and time server (55%).
Comments from users Following is a sample of the comments we received about the survey - they have been carefully sanitised to make us look good. ;-) More power to Shorewall! Shorewall Rocks! I'm amazed how easy it is every time I need to do something, even if it's been 6+ months since the last change! :) Good job and a great product. Shorewall is good, I have recommended it to several people, mostly working in the University & academic areas. Thanks to everyone who contributes to Shorewall. That's a *great* piece of software! Shorewall has been incredible. Tom has given so much of himself to this project, I can only say thank you from one person, I look up to people like him. I have used Shorewall for many systems, I am a contractor that "set up shop" all over the world. Depending on the available ISP services, this project has been flexible in every situation to date. Also, depending on my needs, it has done the same. "IP Tables made easy" is really an accurate description. I'm quite interested in seeing what the 'cross section' of Shorewall users are like. It's made my life a lot easier over the years. Thank you.
Lessons learned about survey technique
Treat surveys like releasing free software: test on a small group before you go public; release early and often; make branches (copies) when you release alpha and beta versions; merge the changes from branches (lessons you learned in those versions) into the main trunk.
Start small and work towards what you want to know with specific, concrete questions I tried to do everything in one survey, and ended up confusing some people. For example, despite the fact that the survey's start page clearly says "Please answer the questions for only ONE SYSTEM running Shorewall", i received multiple comments saying that they couldn't answer accurately because they ran more than one Shorewall system. It would have been better to have two surveys: one about the people who use Shorewall, and another about the systems they run it on. Better still would be for Shorewall to automatically collect appropriate information about systems and request permission to send it to a central location for statistical analysis. How to do this and maintain users' privacy and obtain their permission efficiently is not an easy problem with a product like Shorewall, which doesn't actually stay running on user systems, and doesn't present a user interface per se.
Be prepared beforehand Within hours of the survey's release, 50% of the results were in. Within 3 days, it hit the Zoomerang basic survey limit of 100 responses. I had not planned for such an enthusiastic response, and also was too busy to download all of the results before the survey's time limit expired. Fortunately, i was able to obtain funding to allow a Zoomerang "pro" subscription to be purchased and thus provide advanced analysis, and complete downloads of the results.
Incrementally improve your surveys The final version of this survey was still released with a few bugs. The released version was just a copy of my master survey, and i continued to maintain the master after the final survey was released (and during this analysis), so i'm sure the next version will be even better.
Possible implications for the Shorewall project The users we have seem, on the whole, rather experienced, and very loyal. However, we don't seem to be attracting new users, despite new features such as multi-ISP support and integrated traffic shaping. The question about a GUI comes up frequently, and one wonders whether this would make a significant difference in Shorewall's uptake with new users. Shorewall seems to be predominantly used in small, i386-based environments such as home LANs and small businesses. It seems to be frequently combined with a number of other basic functions, such as DNS, DHCP, NTP, and VPN. Integration with (or perhaps providing a plug-in module for) a dedicated gateway distribution such as ipcop, Smoothwall, or Clark Connect might be a good way to serve the needs of our users.
Possible implications for other free software projects The essence of free software is software by the people, for the people. Knowing who the people are and what their needs are is critical to this process. If at all possible, build statistics gathering into your application, and find a way to encourage people to use it. This concrete data will help confirm the results of any surveys you might conduct.