shorewall_code/docs/ScalabilityAndPerformance.xml
2007-06-28 20:41:32 +00:00

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
<!--$Id$-->
<articleinfo>
<title>Scalability and Performance</title>
<authorgroup>
<author>
<firstname>Tom</firstname>
<surname>Eastep</surname>
</author>
</authorgroup>
<pubdate><?dbtimestamp format="Y/m/d"?></pubdate>
<copyright>
<year>2006</year>
<year>2007</year>
<holder>Thomas M. Eastep</holder>
</copyright>
<legalnotice>
<para>Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version
1.2 or any later version published by the Free Software Foundation; with
no Invariant Sections, with no Front-Cover, and with no Back-Cover
Texts. A copy of the license is included in the section entitled
<quote><ulink url="GnuCopyright.htm">GNU Free Documentation
License</ulink></quote>.</para>
</legalnotice>
</articleinfo>
<section id="Intro">
<title>Introduction</title>
<para>The performance of the <emphasis role="bold">shorewall
start</emphasis> and <emphasis role="bold">shorewall restart</emphasis>
commands when using Shorewall-shell is a frequent subject of questions. This
article explains the scalability issues involved and offers some tips for
reducing the time required to compile a Shorewall configuration and to
execute the compiled script.</para>
<para>Ultimately, the solution to these performance problems is to migrate
to Shorewall-perl if at all possible.</para>
</section>
<section id="Groups">
<title>Host Groups</title>
<para>In this article, we will use the term <firstterm>host
group</firstterm> to refer to a set of IP addresses accessed through a
particular interface. In a Shorewall configuration, there is one host
group for:</para>
<itemizedlist>
<listitem>
<para>Each entry in <filename>/etc/shorewall/interfaces</filename>
that contains the name of a zone in the first column.</para>
</listitem>
<listitem>
<para>Each entry in <filename>/etc/shorewall/hosts</filename>.</para>
</listitem>
</itemizedlist>
<para>As you can see, each host group is associated with a single
<firstterm>zone</firstterm>.</para>
</section>
<section id="GroupScale">
<title>Scaling by Host Groups</title>
<para>From each host group, it is possible to attempt connections to every
other host group; and if the host group has the <emphasis
role="bold">routeback</emphasis> option, then connections can also be
attempted from the host group to itself. So if there are
<emphasis role="bold">H</emphasis> host groups defined in a Shorewall
configuration, then the number of unique pairs of (<emphasis>source host
group</emphasis>, <emphasis>destination host group</emphasis>) can be as
large as <emphasis role="bold">H</emphasis>*<emphasis
role="bold">H</emphasis> or
<emphasis role="bold">H</emphasis><superscript>2</superscript>. In other
words, the number of combinations grows as the square of the number of host
groups: increasing the number of groups from <emphasis
role="bold">H</emphasis> to <emphasis role="bold">H</emphasis>+1 adds
<emphasis role="bold">H</emphasis> + <emphasis role="bold">H</emphasis> +
1 = 2<emphasis role="bold">H</emphasis> + 1 additional
combinations.</para>
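<para>The growth described above can be checked with a few lines of shell
arithmetic. This is only an illustrative sketch: the
<function>group_pairs</function> helper is hypothetical (it is not part of
Shorewall), and it assumes the worst case in which every host group may
also connect to itself.</para>
<programlisting># Hypothetical helper, not part of Shorewall: number of
# (source host group, destination host group) pairs for H host groups,
# assuming every group can also connect to itself (routeback).
group_pairs() {
    echo $(( $1 * $1 ))
}

for h in 4 5 10 11; do
    echo "H=$h pairs=$(group_pairs "$h")"
done</programlisting>
<para>Going from 10 to 11 host groups raises the count from 100 to 121, an
increase of 2*10 + 1 = 21 pairs.</para>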
</section>
<section id="ZoneScale">
<title>Scaling by Zones</title>
<para>A similar scaling issue applies to Shorewall zones. If there are
<emphasis role="bold">Z</emphasis> zones, then connections may be
attempted from a given zone <emphasis
role="bold">Z</emphasis><subscript>n</subscript> to every zone, including
<emphasis role="bold">Z</emphasis><subscript>n</subscript>
itself. Hence, the number of combinations is the square of the number of
zones or <emphasis
role="bold">Z</emphasis><superscript>2</superscript>.</para>
</section>
<section id="Shorewall">
<title>Scaling within the Shorewall Code</title>
<para>Shorewall is written entirely in Bourne Shell. While this allows
Shorewall to run on a wide range of distributions (including embedded
ones), the shell programming environment is not ideal for writing the
compiler portion of Shorewall. As a consequence, the code must repeatedly
perform sequential searches of lists. If a list has <emphasis
role="bold">N</emphasis> elements and a sequential search is made for each
of those elements, then the number of comparisons is 1 + 2 + 3 + ... +
<emphasis role="bold">N</emphasis> = <emphasis role="bold">N</emphasis> *
(<emphasis role="bold">N</emphasis> + 1) / 2. So again, we see order
<emphasis role="bold">N</emphasis><superscript>2</superscript>
scaling.</para>
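<para>The comparison count above can be reproduced with a small shell
sketch. The <function>count_comparisons</function> function and the
<emphasis>item</emphasis> element names are invented for this illustration
and are not taken from the Shorewall source; the code simply looks up every
element of a list by scanning the list from the front and tallies the
comparisons made.</para>
<programlisting># Illustrative only; not Shorewall code.
# Count the comparisons made when each element of an N-element list
# is located by a sequential scan from the start of the list.
count_comparisons() {
    n=$1

    # Build the list "item1 item2 ... itemN".
    list=""
    i=1
    while [ "$i" -le "$n" ]; do
        list="$list item$i"
        i=$(( i + 1 ))
    done

    total=0
    for target in $list; do
        for candidate in $list; do
            total=$(( total + 1 ))
            # Stop scanning as soon as the target is found.
            if [ "$candidate" = "$target" ]; then
                break
            fi
        done
    done

    echo "$total"
}

count_comparisons 4     # 1 + 2 + 3 + 4 = 10
count_comparisons 10</programlisting>
<para>Doubling the list length roughly quadruples the number of
comparisons, which is the behavior described above.</para>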
</section>
<section id="Improving">
<title>Improving Performance</title>
<para>Achieving good performance boils down to three things:</para>
<itemizedlist>
<listitem>
<para>Use a light-weight shell and fast hardware. Especially in the
compiler, a light-weight shell such as <command>ash</command> or
<command>dash</command> can provide considerable improvement over
<command>bash</command>.</para>
</listitem>
<listitem>
<para>With all of the order <emphasis
role="bold">N</emphasis><superscript>2</superscript> scaling that is
implicit in the problem being solved, it is vital to keep <emphasis
role="bold">N</emphasis> small.</para>
<itemizedlist>
<listitem>
<para>If you have a large number of interfaces, use wild-cards
("+") in <filename>/etc/shorewall/interfaces</filename> and
<filename>/etc/shorewall/hosts</filename> to reduce the number of
host groups.</para>
</listitem>
<listitem>
<para>Combine host groups with similar firewall requirements into
a single zone.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem>
<para>Use NONE policies wherever appropriate. This helps especially
in the rules activation phase of both script compilation and
execution.</para>
</listitem>
</itemizedlist>
<para>So while it is tempting to create lots of zones through entries in
<filename>/etc/shorewall/hosts</filename>, such configurations
<emphasis>always</emphasis> perform badly. In these cases, it is much
better to have more rules than more zones because the performance scales
linearly with the number of rules whereas it scales quadratically with the
number of zones.</para>
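<para>As a rough illustration of the tips in the list above, a single
wildcard entry can stand in for many similar interfaces, and a NONE policy
can be declared for a pair of zones that never exchange traffic. The zone
names <emphasis>vpn</emphasis> and <emphasis>dmz</emphasis> and the
interface prefix <emphasis>tun</emphasis> are assumptions chosen for this
example, and the column headings are abbreviated; substitute the names and
layout used in your own configuration.</para>
<blockquote>
<para><filename>/etc/shorewall/interfaces</filename>:</para>
<programlisting>#ZONE   INTERFACE       BROADCAST       OPTIONS
vpn     tun+            -</programlisting>
<para><filename>/etc/shorewall/policy</filename>:</para>
<programlisting>#SOURCE DEST    POLICY
vpn     dmz     NONE</programlisting>
</blockquote>
<para>With entries like these, all interfaces whose names begin with
<emphasis>tun</emphasis> form one host group rather than one group per
interface, and the vpn-&gt;dmz pair no longer contributes to the rules
activation work.</para>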
<para>Another tip worth noting has to do with the use of shell
variables.</para>
<para>Suppose that the following appears in
<filename>/etc/shorewall/params</filename>:</para>
<programlisting>HOSTS=&lt;ip1&gt;,&lt;ip2&gt;,&lt;ip3&gt;,...&lt;ipN&gt;</programlisting>
<para>and suppose that $HOSTS appears in the SOURCE column of <emphasis
role="bold">M</emphasis> ACCEPT rules. That would generate a total of
<emphasis role="bold">N</emphasis> * <emphasis role="bold">M</emphasis>
iptables ACCEPT rules.</para>
<para>The number of rules can be reduced significantly by using an <ulink
url="Actions.html">action</ulink>. Consider the following:</para>
<blockquote>
<para><filename>/etc/shorewall/actions</filename>:</para>
<programlisting>AcceptHosts</programlisting>
<para><filename>/etc/shorewall/action.AcceptHosts</filename>:</para>
<programlisting>#TARGET         SOURCE          DEST            PROTO   DEST    SOURCE          ORIGINAL        RATE
#                                                       PORT    PORT(S)         DEST            LIMIT
ACCEPT          $HOSTS</programlisting>
</blockquote>
<para>If the <emphasis role="bold">M</emphasis> ACCEPT rules are now
replaced with <emphasis role="bold">M</emphasis> AcceptHosts rules, the
total number of rules will be <emphasis role="bold">N</emphasis> +
<emphasis role="bold">M</emphasis>.</para>
<para>Example (Accept net-&gt;fw SSH from $HOSTS):</para>
<programlisting>#ACTION         SOURCE          DEST            PROTO   DEST    SOURCE          ORIGINAL        RATE    USER/
#                                                       PORT    PORT(S)         DEST            LIMIT   GROUP
AcceptHosts     net             $FW             tcp     22</programlisting>
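<para>To get a feel for the savings, the two rule counts can be compared
with a short shell sketch. The function names and the sample values of
<emphasis role="bold">N</emphasis> and <emphasis role="bold">M</emphasis>
are made up for this illustration; the arithmetic is the
<emphasis role="bold">N</emphasis> * <emphasis role="bold">M</emphasis>
versus <emphasis role="bold">N</emphasis> + <emphasis
role="bold">M</emphasis> comparison described above.</para>
<programlisting># Illustrative arithmetic only; these helpers are not part of Shorewall.
# $1 = N, the number of addresses in $HOSTS
# $2 = M, the number of rules that refer to $HOSTS

# $HOSTS used directly in the SOURCE column of M ACCEPT rules.
rules_inline() {
    echo $(( $1 * $2 ))
}

# M AcceptHosts rules plus the N rules inside the action itself.
rules_with_action() {
    echo $(( $1 + $2 ))
}

n=20
m=10
echo "inline: $(rules_inline "$n" "$m")"
echo "action: $(rules_with_action "$n" "$m")"</programlisting>
<para>With 20 addresses and 10 rules, the direct form produces 200 iptables
rules while the action form produces 30.</para>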
</section>
</article>