Scalability and Performance
Tom Eastep
Copyright 2006 Thomas M. Eastep
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version
1.2 or any later version published by the Free Software Foundation; with
no Invariant Sections, with no Front-Cover, and with no Back-Cover
Texts. A copy of the license is included in the section entitled
"GNU Free Documentation License".
Introduction
The performance of the shorewall start and shorewall restart
commands is a frequent topic of questions. This article attempts to
explain the scalability issues involved and to offer some tips for
reducing the time required to compile a Shorewall configuration and to
execute the compiled script.
Host Groups
In this article, we will use the term host group to refer to a set
of IP addresses accessed through a particular interface. In a Shorewall
configuration, there is one host group for:
- Each entry in /etc/shorewall/interfaces that contains the name of a
  zone in the first column.
- Each entry in /etc/shorewall/hosts.
As you can see, each host group is associated with a single
zone.
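The counting rule above can be sketched in a few lines of Python. This is only an illustration: the zone names, interface names, and addresses are hypothetical, the two lists are simplified stand-ins for the real files, and the use of "-" to mark an interfaces entry that names no zone is an assumption made for this sketch.

```python
# Hypothetical, simplified stand-ins for the first two columns of
# /etc/shorewall/interfaces and /etc/shorewall/hosts.
interfaces = [
    ("net", "eth0"),
    ("loc", "eth1"),
    ("-", "eth2"),  # assumed: "-" means no zone is named in the first column
]
hosts = [
    ("dmz1", "eth2:192.168.1.0/24"),
    ("dmz2", "eth2:192.168.2.0/24"),
]


def count_host_groups(interfaces, hosts):
    """One host group per interfaces entry that names a zone,
    plus one host group per hosts entry."""
    zoned = [entry for entry in interfaces if entry[0] != "-"]
    return len(zoned) + len(hosts)


print(count_host_groups(interfaces, hosts))  # 2 zoned interfaces + 2 hosts entries
```

In this made-up configuration, eth2 contributes no host group of its own; its two host groups come from the hosts entries.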
Scaling by Host Groups
For each host group, it is possible to attempt connections to every
other host group; and if the host group has the routeback option, then it is possible for
connections to be attempted from the host group to itself. So if there are
H host groups defined in a Shorewall
configuration, then the number of unique pairs of (source host
group, destination host group) is H*H or H^2. In other words, the
number of combinations is the square of the number of host groups, and
increasing the number of groups from H to H+1 adds
(H+1)^2 - H^2 = 2H + 1 additional combinations.
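As a quick check of that arithmetic, the following Python sketch enumerates the (source, destination) pairs for a few values of H. The group names are placeholders, and the count is an upper bound that includes each group paired with itself, which only applies to groups that have the routeback option.

```python
from itertools import product


def host_group_pairs(groups):
    """All ordered (source, destination) pairs, including self-pairs."""
    return list(product(groups, repeat=2))


for h in range(1, 6):
    names = [f"g{i}" for i in range(h)]  # placeholder group names
    pairs = len(host_group_pairs(names))
    added = len(host_group_pairs(names + ["new"])) - pairs
    # pairs equals h * h, and adding one group contributes 2 * h + 1 more
    print(h, pairs, added)
```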
Scaling by Zones
A similar scaling issue applies to Shorewall zones. If there are
Z zones, then connections may be attempted from a given zone Z_n to
all of the other zones (including to Z_n itself). Hence, the number of
combinations is the square of the number of zones, or Z^2.
Scaling within the Shorewall Code
Shorewall is written entirely in Bourne Shell. While this allows
Shorewall to run on a wide range of distributions (including embedded
ones), the shell programming environment is not ideal for writing the
compiler portion of Shorewall. As a consequence, the code must repeatedly
perform sequential searches of lists. If a list has N elements and a
sequential search is made for each of those elements, then the number of
comparisons is 1 + 2 + 3 + ... + N = N * (N + 1) / 2. So again, we see
order N^2 scaling.
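The comparison count can be reproduced with a small Python sketch. It is not Shorewall's code (which is shell); it only models a sequential search over a list and counts the comparisons made when each element is looked up once.

```python
def comparisons_for_search(items, target):
    """Count the comparisons a sequential search makes before finding target."""
    count = 0
    for item in items:
        count += 1
        if item == target:
            break
    return count


def total_comparisons(n):
    """Search a list of n distinct elements once for each of its elements."""
    items = list(range(n))
    return sum(comparisons_for_search(items, target) for target in items)


for n in (10, 100, 1000):
    # total_comparisons(n) matches the closed form n * (n + 1) / 2
    print(n, total_comparisons(n), n * (n + 1) // 2)
```

Multiplying the list length by ten multiplies the work by roughly one hundred, which is why long lists are costly in this model.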
Improving Performance
Achieving good performance boils down to two things:
- Use a light-weight shell and fast hardware. Especially in the
  compiler, a light-weight shell such as ash or dash can provide
  considerable improvement over bash.
- With all of the order N^2 scaling that is implicit in the problem
  being solved, it is vital to keep N small.
So while it is tempting to create lots of zones through entries in
/etc/shorewall/hosts, such configurations
always perform badly. In these cases, it is much
better to have more rules than more zones because the performance scales
linearly with the number of rules whereas it scales with the square of
the number of zones.
Another tip worth noting has to do with the use of shell
variables.
Suppose that the following appears in
/etc/shorewall/params:
HOSTS=<ip1>,<ip2>,<ip3>,...<ipN>
and suppose that $HOSTS appears in the SOURCE column of M ACCEPT rules. That would generate a total of
N * M
iptables ACCEPT rules.
On the other hand, consider the following:
/etc/shorewall/actions:
AcceptHosts
/etc/shorewall/action.AcceptHosts:
#TARGET  SOURCE  DEST  PROTO  DEST  SOURCE   ORIGINAL  RATE
#                             PORT  PORT(S)  DEST      LIMIT
ACCEPT   $HOSTS
If the M ACCEPT rules are now replaced with M AcceptHosts rules, the
total number of rules will be N + M: N ACCEPT rules in the AcceptHosts
action plus M rules that invoke it.
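The difference between the two layouts can be modelled with a short Python sketch. It does not generate real iptables rules; the tuples, rule names, and addresses are hypothetical and serve only to count how many rules each approach would produce.

```python
def rules_with_variable(hosts, m):
    """Each of the m ACCEPT rules expands into one rule per host in $HOSTS."""
    return [(f"rule{j}", "ACCEPT", host) for j in range(m) for host in hosts]


def rules_with_action(hosts, m):
    """The action holds one ACCEPT per host; each of the m rules invokes it once."""
    action_rules = [("AcceptHosts", "ACCEPT", host) for host in hosts]
    invoking_rules = [(f"rule{j}", "AcceptHosts", None) for j in range(m)]
    return action_rules + invoking_rules


hosts = [f"10.0.0.{i}" for i in range(1, 11)]  # N = 10 made-up addresses
m = 20                                         # M = 20 rules

print(len(rules_with_variable(hosts, m)))  # N * M = 200
print(len(rules_with_action(hosts, m)))    # N + M = 30
```

With these sample sizes the action-based layout needs 30 rules instead of 200, and the gap widens as either N or M grows.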