<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
<!--$Id$-->
<articleinfo>
<title>Scalability and Performance</title>
<authorgroup>
<author>
<firstname>Tom</firstname>
<surname>Eastep</surname>
</author>
</authorgroup>
<pubdate><?dbtimestamp format="Y/m/d"?></pubdate>
<copyright>
<year>2006</year>
<holder>Thomas M. Eastep</holder>
</copyright>
<legalnotice>
<para>Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version
1.2 or any later version published by the Free Software Foundation; with
no Invariant Sections, with no Front-Cover, and with no Back-Cover
Texts. A copy of the license is included in the section entitled
<quote><ulink url="GnuCopyright.htm">GNU Free Documentation
License</ulink></quote>.</para>
</legalnotice>
</articleinfo>
<section>
<title>Introduction</title>
<para>The performance of the <emphasis role="bold">shorewall
start</emphasis> and <emphasis role="bold">shorewall restart</emphasis>
commands is a frequent topic of questions. This article attempts to
explain the scalability issues involved and to offer some tips for
reducing the time required to compile a Shorewall configuration and to
execute the compiled script.</para>
</section>
<section>
<title>Host Groups</title>
<para>In this article, we will use the term <firstterm>host
group</firstterm> to refer to a set of IP addresses accessed through a
particular interface. In a Shorewall configuration, there is one host
group for:</para>
<itemizedlist>
<listitem>
<para>Each entry in <filename>/etc/shorewall/interfaces</filename>
that contains the name of a zone in the first column.</para>
</listitem>
<listitem>
<para>Each entry in <filename>/etc/shorewall/hosts</filename>.</para>
</listitem>
</itemizedlist>
<para>As you can see, each host group is associated with a single
<firstterm>zone</firstterm>.</para>
</section>
<section>
<title>Scaling by Host Groups</title>
<para>Connections may be attempted from each host group to every other
host group; and if a host group has the <emphasis
role="bold">routeback</emphasis> option, connections may also be
attempted from that host group to itself. So if there are
<emphasis role="bold">H</emphasis> host groups defined in a Shorewall
configuration, then the number of unique pairs of (<emphasis>source host
group</emphasis>, <emphasis>destination host group</emphasis>) can be as
large as <emphasis role="bold">H</emphasis>*<emphasis role="bold">H</emphasis> or
<emphasis role="bold">H</emphasis><superscript>2</superscript>. In other
words, the number of combinations grows as the square of the number of host
groups, and increasing the number of groups from <emphasis
role="bold">H</emphasis> to <emphasis role="bold">H</emphasis>+1 adds
(<emphasis role="bold">H</emphasis>+1)<superscript>2</superscript> -
<emphasis role="bold">H</emphasis><superscript>2</superscript> =
2<emphasis role="bold">H</emphasis> + 1 additional
combinations.</para>
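<para>The arithmetic above can be sketched in a few lines of shell. The
function names <function>pair_count</function> and
<function>added_pairs</function> are hypothetical, and the sketch assumes
the worst case in which every host group has the <emphasis
role="bold">routeback</emphasis> option:</para>
<programlisting>#!/bin/sh
# Hypothetical helper: number of (source, destination) host-group pairs
# for $1 host groups, assuming every group has the routeback option.
pair_count() {
    echo $(( $1 * $1 ))
}

# Extra pairs created by going from $1 to $1 + 1 host groups.
added_pairs() {
    echo $(( $(pair_count $(( $1 + 1 ))) - $(pair_count "$1") ))
}

for h in 2 5 10 20; do
    echo "H=$h pairs=$(pair_count "$h") added_by_next_group=$(added_pairs "$h")"
done</programlisting>
<para>For ten host groups this reports 100 pairs, and an eleventh group
would add 21 more.</para>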
</section>
<section>
<title>Scaling by Zones</title>
<para>A similar scaling issue applies to Shorewall zones. If there are
<emphasis role="bold">Z</emphasis> zones, then connections may be
attempted from a given zone <emphasis
role="bold">Z</emphasis><subscript>n</subscript> to every zone,
including <emphasis role="bold">Z</emphasis><subscript>n</subscript>
itself. Hence, the number of combinations is the square of the number of
zones or <emphasis
role="bold">Z</emphasis><superscript>2</superscript>.</para>
</section>
<section>
<title>Scaling within the Shorewall Code</title>
<para>Shorewall is written entirely in Bourne Shell. While this allows
Shorewall to run on a wide range of distributions (including embedded
ones), the shell programming environment is not ideal for writing the
compiler portion of Shorewall. As a consequence, the code must repeatedly
perform sequential searches of lists. If a list has <emphasis
role="bold">N</emphasis> elements and a sequential search is made for each
of those elements, then the number of comparisons is 1 + 2 + 3 + ... +
<emphasis role="bold">N</emphasis> = <emphasis role="bold">N</emphasis> *
(<emphasis role="bold">N</emphasis> + 1) / 2. So again, we see order
<emphasis role="bold">N</emphasis><superscript>2</superscript>
scaling.</para>
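<para>A minimal sketch of that cost, assuming a list in which each element
is searched for exactly once. The functions
<function>search_cost</function> and <function>total_cost</function> are
hypothetical illustrations and are not taken from the Shorewall
code:</para>
<programlisting>#!/bin/sh
# Count the comparisons needed to find $1 in the remaining arguments
# by sequential search (hypothetical helper, not Shorewall code).
search_cost() {
    target=$1
    shift
    cost=0
    for item in "$@"; do
        cost=$(( cost + 1 ))
        if [ "$item" = "$target" ]; then
            break
        fi
    done
    echo "$cost"
}

# Total comparisons when every element of the list is searched for once.
total_cost() {
    total=0
    for wanted in "$@"; do
        total=$(( total + $(search_cost "$wanted" "$@") ))
    done
    echo "$total"
}

total_cost a b c d e</programlisting>
<para>For the five-element list the total is 15, which matches
5 * (5 + 1) / 2.</para>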
</section>
<section>
<title>Improving Performance</title>
<para>Achieving good performance boils down to two things:</para>
<itemizedlist>
<listitem>
<para>Use a light-weight shell and fast hardware. Especially in the
compiler, a light-weight shell such as <command>ash</command> or
<command>dash</command> can provide considerable improvement over
<command>bash</command>.</para>
</listitem>
<listitem>
<para>With all of the order <emphasis
role="bold">N</emphasis><superscript>2</superscript> scaling that is
implicit in the problem being solved, it is vital to keep <emphasis
role="bold">N</emphasis> small.</para>
<itemizedlist>
<listitem>
<para>If you have a large number of interfaces, use wild-cards
("+") in <filename>/etc/shorewall/interfaces</filename> and
<filename>/etc/shorewall/hosts</filename> to reduce the number of
host groups.</para>
</listitem>
<listitem>
<para>Combine host groups with similar firewall requirements into
a single zone.</para>
</listitem>
</itemizedlist>
</listitem>
</itemizedlist>
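<para>As a hedged illustration of the wild-card tip above, a single entry
in <filename>/etc/shorewall/interfaces</filename> can stand in for many
similarly named interfaces. The zone name <emphasis>vpn</emphasis> and the
interface prefix <emphasis>tun</emphasis> are assumptions chosen only for
this example:</para>
<programlisting>#ZONE   INTERFACE   BROADCAST   OPTIONS
vpn     tun+        -</programlisting>
<para>Under that assumption, tun0, tun1 and so on form one host group
rather than one host group per interface.</para>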
<para>So while it is tempting to create lots of zones through entries in
<filename>/etc/shorewall/hosts</filename>, such configurations
<emphasis>always</emphasis> perform badly. In these cases, it is much
better to have more rules than more zones because the performance scales
linearly with the number of rules whereas it scales quadratically with the
number of zones.</para>
<para>Another tip worth noting has to do with the use of shell
variables.</para>
<para>Suppose that the following appears in
<filename>/etc/shorewall/params</filename>:</para>
<programlisting>HOSTS=&lt;ip1&gt;,&lt;ip2&gt;,&lt;ip3&gt;,...&lt;ipN&gt;</programlisting>
<para>and suppose that $HOSTS appears in the SOURCE column of <emphasis
role="bold">M</emphasis> ACCEPT rules. That would generate a total of
<emphasis role="bold">N</emphasis> * <emphasis role="bold">M</emphasis>
iptables ACCEPT rules.</para>
<para>On the other hand, consider the following:</para>
<blockquote>
<para><filename>/etc/shorewall/actions</filename>:</para>
<programlisting>AcceptHosts</programlisting>
<para><filename>/etc/shorewall/action.AcceptHosts</filename>:</para>
<programlisting>#TARGET  SOURCE   DEST   PROTO   DEST    SOURCE     ORIGINAL   RATE
#                                PORT    PORT(S)    DEST       LIMIT
ACCEPT   $HOSTS</programlisting>
</blockquote>
<para>If the <emphasis role="bold">M</emphasis> ACCEPT rules are now
replaced with <emphasis role="bold">M</emphasis> AcceptHosts rules, the
total number of rules will be <emphasis role="bold">N</emphasis> +
<emphasis role="bold">M</emphasis>.</para>
<para>Example (Accept net-&gt;fw SSH from $HOSTS):</para>
<programlisting>#ACTION       SOURCE   DEST   PROTO   DEST    SOURCE     ORIGINAL   RATE    USER/
#                                     PORT    PORT(S)    DEST       LIMIT   GROUP
AcceptHosts   net      $FW    tcp     22</programlisting>
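<para>The two rule counts discussed above can be compared with a short
shell sketch. The function names <function>inline_rules</function> and
<function>action_rules</function> and the sample values of <emphasis
role="bold">N</emphasis> and <emphasis role="bold">M</emphasis> are
assumptions made for illustration:</para>
<programlisting>#!/bin/sh
# Hypothetical helpers comparing the two approaches for
# $1 addresses in $HOSTS and $2 rules that reference them.
inline_rules() {
    echo $(( $1 * $2 ))      # N * M: $HOSTS used directly in each rule
}

action_rules() {
    echo $(( $1 + $2 ))      # N + M: $HOSTS used once, inside the action
}

n=50    # assumed number of addresses in $HOSTS
m=20    # assumed number of rules
echo "inline: $(inline_rules "$n" "$m") rules"
echo "action: $(action_rules "$n" "$m") rules"</programlisting>
<para>With these sample values the direct approach yields 1000 rules and
the action-based approach yields 70.</para>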
</section>
</article>