mirror-tutorial/en_US mirror-tutorial.xml,NONE,1.1

Paul W. Frields (pfrields) fedora-docs-commits at redhat.com
Sat Mar 4 20:58:42 UTC 2006


Author: pfrields

Update of /cvs/docs/mirror-tutorial/en_US
In directory cvs-int.fedora.redhat.com:/tmp/cvs-serv15972/en_US

Added Files:
	mirror-tutorial.xml 
Log Message:
Move to proper en_US locale


--- NEW FILE mirror-tutorial.xml ---
<!-- $Id: -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
 "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [

<!ENTITY % FEDORA-ENTITIES-EN SYSTEM "../../docs-common/common/fedora-entities-en.ent">
%FEDORA-ENTITIES-EN;

<!ENTITY DOCNAME "mirror-tutorial">
<!ENTITY DOCVERSION "1.0">
<!ENTITY DOCDATE "2006-02-06">
<!ENTITY DOCID "&DOCNAME;-&DOCVERSION; (&DOCDATE;)"> <!-- change version of manual and date here -->

<!ENTITY FCLOCALVER "4">
]>

<article id="mirror-tutorial" lang="en">

  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
    href="fdp-info.xml"/>

  <section id="sn-introduction">
    <title>Introduction</title>
    <section id="sn-purpose">
      <title>Purpose</title>
      <para>
        This tutorial presents a number of related topics that allow an
        administrator to seamlessly integrate mirroring and update
        services for &FC;.  Use these services to provision a classroom,
        laboratory, or office. These services also improve ease of
        use for your users and add to the perceived value of
        non-proprietary operating systems and software.
      </para>
      &BUG-REPORTING;
    </section>
    <section id="sn-audience">
      <title>Audience</title>
      <para>
	You will find this tutorial more useful if you are a system
	administrator, or a &FC; "power user" familiar with the
	following topics:
      </para>
      <itemizedlist>
	<listitem>
	  <para>
	    &FC; system installation and administration
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Basic Internet protocols (HTTP/Web)
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Using a command line interface
	  </para>
	</listitem>
      </itemizedlist>
    </section>
    <section id="sn-about-mirrors">
      <title>About Mirrors</title>
      <para>
	A <emphasis>mirror</emphasis>
	<indexterm><primary>mirror</primary></indexterm> is a server
	that provides a copy of one or more collections of files.
	Mirroring a site reduces traffic to the original source site,
	thus spreading the stress and bandwidth costs of many users
	across many sites. Side benefits of running a local mirror
	include very fast access through the local network, providing
	custom services to local users, and increasing your skills in
	managing Internet services.
      </para>
      <para>
	The site from which you retrieve files to build your mirror is
	called an <emphasis>upstream mirror</emphasis><indexterm>
	  <primary>mirror</primary> <secondary>upstream</secondary>
	</indexterm>. If possible, choose an upstream mirror that is
	located close to you geographically. This reduces unnecessary
	traffic across transcontinental sections of the Internet, where
	bandwidth is limited and expensive. Use only upstream mirrors
	that are intended for public access, unless you have permission
	from the upstream mirror site administrator.
      </para>
    </section>
    <section id="sn-additional-resources">
      <title>Additional Resources</title>
      <para>
	For more information on installing &FC; see the &FC; &IG; at
	&IG-URL;. For more information on basic Internet protocols, see
	<ulink
	  url="http://library.albany.edu/internet/internet.html">http://library.albany.edu/internet/internet.html</ulink>, 
	or search Google at <ulink
	  url="http://www.google.com/">http://www.google.com/</ulink>.
	For more general information about mirrors, see <ulink
	  url="http://en.wikipedia.org/wiki/Mirror_(computing)">http://en.wikipedia.org/wiki/Mirror_(computing)</ulink>.
      </para>
    </section>
    <section id="sn-acknowledgements">
      <title>Acknowledgements</title>
      <para>
	Karsten Wade provided editorial services and kept the style
	crisp and consistent.  Stuart Ellis provided some additional
	security-related information.
      </para>
    </section>
  </section>
  
  <section id="sn-planning-and-setup">
    <title>Planning and Setup</title>
    
    <section id="sn-hierarchy">
      <title>The Distribution Structure</title>
      <para>
	The &FED; <emphasis>distribution</emphasis><indexterm>
	  <primary>distribution</primary> 
	</indexterm>, which is the collection of all &FED;-related
	files, uses the directory tree in <xref
	  linkend="ex-fedora-dir-tree"/>. It may include multiple
	versions of &FC;. The tree design makes it easier to "trim"
	unnecessary or undesired files.  When you set up a mirror,
	duplicate this tree exactly, or as closely as possible.  If you
	duplicate the tree, it will be easier to automate nightly
	updates.
      </para>

      <example id="ex-fedora-dir-tree">
	<title>Fedora directory tree</title>
<screen>
<computeroutput>fedora
+-- linux
    +-- core
        |-- 1 
        |   ... 
        +-- &FCVER; 
        |   +-- SRPMS 
        |   +-- i386 
        |   |   +-- debug 
        |   |   +-- iso 
        |   |   +-- os 
        |   |       +-- Fedora 
        |   |       +-- SRPMS 
        |   |       +-- images 
        |   |       +-- isolinux 
        |   +-- x86_64 
        +-- development 
        |      ...
        +-- test 
        |      ...
        +-- updates 
            +-- 1 
            |   ... 
            +-- &FCVER; 
            |   +-- SRPMS 
            |   +-- i386 
            |   +-- x86_64 
            +-- testing 
                +-- 1 
                |   ... 
                +-- &FCVER; 
                    +-- SRPMS 
                    +-- i386 
                    +-- x86_64</computeroutput>
</screen>
      </example>

      <note>
	<title>Naming conventions</title>
	<para>
	  Throughout the rest of the document,
	  <filename>/var/www/mirror</filename> represents the folder
	  where all your mirrored files are stored. You may substitute a
	  different location. This location simplifies sharing your
	  mirror, due to the shipping configuration of &FC;. See <xref
	  linkend="sn-server-config"/> for more information.  The site
	  name <computeroutput>mirror.example.com</computeroutput>
	  represents the upstream mirror.
	</para>
      </note>
      <para>
	The
	<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/os</filename>
	directory contains a copy of all the original distribution files
	for &FC; &FCVER;. They are the same files found on the DVD and
	CD-ROM version of the distribution. The
	<filename>&FED;</filename> subfolder contains all the files that
	are necessary for installation, including the entire collection
	of &FC; RPM packages. The <filename>images</filename> folder
	contains copies of any floppy diskette or CD-ROM images that
	boot a system into installation or rescue modes. The
	<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/iso</filename>
	folder contains images of the CD-ROM version of the
	distribution.
      </para>
      <note>
	<title>RPM packages</title>
	<para>
	  <firstterm>RPM</firstterm><indexterm> <primary>RPM</primary>
	  </indexterm>, originally the Red Hat Package Manager and now
	  the RPM Package Manager, is not just a file format. RPM is
	  also a system that tracks and interconnects software and
	  version information. The RPM system is quite popular, and many
	  other Linux distributions use RPM as well. Read more
	  information on RPM at <ulink
	  url="http://www.rpm.org/">http://www.rpm.org/</ulink>.
	</para>
      </note>
      <para>
	The <filename>SRPMS</filename> folders under
	architecture-specific branches are links that point to the main
	<filename>SRPMS</filename> folder for that distribution. For
	example, <filename>fedora/linux/core/2/i386/os/SRPMS</filename>
	is a link that points to
	<filename>fedora/linux/core/2/SRPMS</filename>.
      </para>
      <para>
	A &FED; mirror consists of at least the original ISO images
	<emphasis>or</emphasis> the distribution files. If possible,
	include both, provided you have sufficient disk space and/or
	bandwidth.
      </para>
    </section>

    <section id="sn-copying-original-distribution">
      <title>Copying the Original Distribution</title>
      <para>
	If you already have reliable CD-ROM installation discs of a
	distribution, reduce your initial bandwidth and time spent
	mirroring by copying the files from the discs to your server.
	Copy all files from Installation Disc 1 into the
	<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/os</filename>
	folder. Then copy all files from the <filename>&FED;</filename>
	folder of each of the remaining Installation discs into the
	<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/os/&FED;</filename>
	folder on the server.
      </para>
      <para>
	Copy all the files from the <filename>SRPMS</filename> folder on
	each of the "Sources" discs to the
	<filename>fedora/linux/core/&FCVER;/SRPMS</filename> folder on
	the server. Make a link in the <filename>os</filename> folder
	that occurs under each architecture. Follow this example:
      </para>

<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/&FCVER;/i386/os
ln -s ../../SRPMS SRPMS</userinput>
</screen>
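
      <para>
	If your mirror carries more than one architecture, a short loop
	can create the symbolic link under each one. This is a minimal
	sketch; the architecture list shown here is an assumption, so
	adjust it to match the branches you actually mirror.
      </para>

<screen>
<userinput>for arch in i386 x86_64; do
  ln -s ../../SRPMS /var/www/mirror/fedora/linux/core/&FCVER;/$arch/os/SRPMS
done</userinput>
</screen>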

      <para>
	The documentation for <application>anaconda</application><indexterm>
	  <primary>anaconda</primary>
	</indexterm>, the &FC; installation program, calls this directory
	structure an <firstterm>exploded tree</firstterm><indexterm>
	  <primary>exploded tree</primary>
	</indexterm>. This is because the package data on each CD is extracted,
	or exploded, to a large directory tree with a predetermined structure.
	The <application>anaconda</application> installer expects this structure
	to some extent.
      </para>
      <para>
	If you <emphasis>only</emphasis> include CD images, create a mirror
	suitable for installation services by mounting each CD image under the
	<filename><replaceable>arch</replaceable>/os/</filename> directory. Make
	a directory for each disc, naming them <filename>disc1</filename>,
	<filename>disc2</filename>, and so on. Mount each disc on the
	appropriate folder, and add entries to <filename>/etc/fstab</filename>
	to perform this mount automatically in case of a reboot. Each entry
	looks like this:
      </para>

<screen>
<computeroutput>/<replaceable>path</replaceable>/i386/iso/FC&FCVER;-i386-disc1.iso  /<replaceable>path</replaceable>/i386/os/disc1  iso9660  ro,loop  0 0</computeroutput>
</screen>
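
      <para>
	Before you reboot, you can check the result by mounting one image
	by hand. The following is a minimal sketch that reuses the
	placeholder path from the entry above. Run it as root, and repeat
	it for each disc.
      </para>

<screen>
<userinput>mkdir -p /<replaceable>path</replaceable>/i386/os/disc1
mount -o loop,ro -t iso9660 /<replaceable>path</replaceable>/i386/iso/FC&FCVER;-i386-disc1.iso /<replaceable>path</replaceable>/i386/os/disc1</userinput>
</screen>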

      <para>
	The <application>anaconda</application> installer application
	automatically detects these folders and uses them properly. In
	addition, system configuration tools such as
	<application>system-config-packages</application> also continue
	to work properly when pointed at the parent of the ISO image
	mount points.
      </para>
      <para>
	There are drawbacks to using CD ISO images in this fashion. For
	instance, no one directory contains the entire distribution of RPM
	packages. Soft links circumvent this problem, but your server security
	policies may not permit them. &FC; also comes in an ISO format DVD image,
	which alleviates this problem. Users who do not have DVD burning
	hardware, however, cannot use this image to make discs for their own
	use.
      </para>
      <para>
	You only need a single line in <filename>/etc/fstab</filename>
	for mounting the &FC; DVD ISO image.  The entry looks like this:
      </para>

<screen>
<computeroutput>/<replaceable>path</replaceable>/i386/iso/FC&FCVER;-i386-DVD.iso  /<replaceable>path</replaceable>/i386/os  iso9660  ro,loop  0 0</computeroutput>
</screen>

    </section>

    <section id="sn-trimming-tree">
      <title>Trimming Branches</title>
      <para>
	You may omit almost any branch of the tree that you do not plan to use.
	Consider carefully the impact of excluding that folder. Branches you
	might trim from your mirror include:
      </para>
      <variablelist>
	<varlistentry>
	  <term>Older versions of &FC; (any numbered directory).</term>
	  <listitem>
	    <para>
	      Before you exclude an old version, ensure
	      this does not adversely affect any of your users. These adverse
	      effects can come in many forms. For example, the level of support
	      for certain hardware sometimes changes between releases of &FC;.
	      Users who cannot install a previous version may not be able to use
	      &FC;. Your users might need to perform software-related tasks such
	      as building packages for different &FC; releases. Always remain
	      aware of the needs of your users during the planning stage.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>Folders for architectures your site does not support.</term>
	  <listitem>
	    <para>
	      If you do not have any x86-64 hosts to support, trimming these
	      folders eliminates several gigabytes of extra files. If you
	      support x86-64 hosts later, though, you must restore mirroring of
	      these branches.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>The <filename>development</filename> folder (formerly
	    "Rawhide").</term>
	  <listitem>
	    <para>
	      This folder contains all the latest "bleeding-edge"
	      packages from the &FP;. If you participate in active &FED;
	      development, you should not trim this branch. &FED;
	      development moves at a rapid pace and requires frequent
	      updates to the latest development package
	      versions. However, the frequent updates cause your mirror
	      to download significant amounts of material during the
	      regular update cycle.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>The <filename>test</filename> and
	    <filename>testing</filename> folders.</term>
	  <listitem>
	    <para>
	      These branches contain the test or "pre-release" versions
	      of the &FC; distribution, as well as updates that are
	      being subjected to quality assurance through public
	      testing. The <filename>test</filename> folder under the
	      main <filename>core</filename> tree is where test versions
	      of the distribution, such as &FC; &FCTESTVER;, are kept.
	      (Users of &FC; test distributions are often directed to
	      use the <filename>development</filename> branch to update
	      packages.) The <filename>testing</filename> folder, under
	      <filename>updates</filename>, contains package updates
	      that have not yet passed the public testing phase.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>The <filename>debug</filename> folders.</term> 
	  <listitem>
	    <para>
	      These folders contain packages that enable developers and
	      skilled users to interpret data created when a program
	      crashes or encounters a bug. If you participate actively
	      in &FED; development, you should not trim these
	      folders. If you trim this branch, you may still download
	      individual packages as needed from a nearby public mirror
	      site.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>The <filename>SRPMS</filename> folders (and links
	    thereto).</term> 
	  <listitem>
	    <para>
	      These folders contain the original source for all the
	      binary RPM packages in the distribution. You may download
	      these packages individually as needed to save space on
	      your local mirror.
	    </para>
	  </listitem>
	</varlistentry>
      </variablelist>
      <para>
	Unless your site closely manages workstation configuration, you
	should probably not trim any of the <filename>updates</filename>
	branches for the distributions you support. These locations
	contain packages with bug fixes, security patches, and errata
	updates that your users probably want.
      </para>
    </section>

    <section id="sn-download-files">
      <title>Downloading the Files</title>
      <para>
	Locate a public mirror site for &FC; by referring to the main
	project site's mirror page, &FDP-URL;. Once you have selected a
	nearby mirror site, note what services it offers (FTP, HTTP,
	and/or rsync). A public mirror usually serves a large number of
	users. Choose off-peak hours, when possible, to download a large
	set of files. Be aware of any timezone differences when
	estimating off-peak hours.
      </para>

      <section id="sn-http-and-ftp-download">
	<title>Download Using HTTP or FTP</title>
	<para>
	  To download via HTTP or FTP, use either the
	  <command>wget</command> or <command>lftp</command>
	  command. The <command>wget</command> command recurses
	  subdirectories automatically and pulls down entire trees of
	  data with a single command. If you are not careful, however,
	  it is possible to pull down much more data than you
	  intended. The following commands mirror the entire current
	  &FC; distribution:
	</para>

<screen>
<userinput>cd /var/www/mirror 
wget --mirror -np -nH --cut-dirs=<replaceable>2</replaceable> http://mirror.example.com/pub/mirror/fedora/linux/core/&FCVER;/</userinput>
</screen>

	<para>
	  Note the options used above:
	</para>
	<itemizedlist>
	  <listitem>
	    <para>
	      <command>--mirror</command> turns on recursion (descends
	      into all subdirectories), and duplicates file timestamps;
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      <command>-np</command> prevents <command>wget</command>
	      from ascending into the parent directory;
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      <command>-nH</command> prevents <command>wget</command>
	      from writing a directory named after the host (in this
	      case,
	      <filename><replaceable>mirror.example.com</replaceable></filename>);
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      <command>--cut-dirs=<replaceable>n</replaceable></command>
	      truncates the first <replaceable>n</replaceable>
	      directories in the path. In the example above,
	      <command>--cut-dirs=2</command> prevents
	      <command>wget</command> from writing the
	      <filename><replaceable>/pub/mirror</replaceable></filename>
	      portion of the path into your mirror.
	    </para>
	  </listitem>
	</itemizedlist>
	<para>
	  The same syntax works for both HTTP and FTP upstream
	  mirrors. It is possible that you may download some extraneous
	  files if the HTTP site formats its pages for browser
	  viewing. These files can be safely deleted, but return each
	  time the mirror updates unless you exclude them using special
	  options. See the <command>wget</command> man pages for more
	  information.
	</para>
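
	<para>
	  As a minimal sketch of such an exclusion, the
	  <command>--reject</command> option of <command>wget</command>
	  takes a file name pattern. The pattern
	  <computeroutput>index.html*</computeroutput> below is an
	  assumption about what the upstream site generates, so check
	  the names of the extraneous files on your own mirror first.
	  Note that <command>wget</command> may still fetch these pages
	  to follow their links, and removes them afterward.
	</para>

<screen>
<userinput>cd /var/www/mirror
wget --mirror -np -nH --cut-dirs=<replaceable>2</replaceable> --reject "index.html*" http://mirror.example.com/pub/mirror/fedora/linux/core/&FCVER;/</userinput>
</screen>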

	<para>
	  The <command>lftp</command> command, like the
	  <command>wget</command> command, mirrors the content of an
	  HTTP or FTP server.  Unlike <command>wget</command>, however,
	  <command>lftp</command> can also delete local files that no
	  longer exist on the upstream mirror.  This feature is
	  important for update repository mirrors to stay synchronized
	  to upstream mirrors, where new files are created and old files
	  are removed on a frequent basis.
	</para>

	<para>
	  The <command>lftp</command> command synchronizes files and
	  directories from a remote host like <command>rsync</command>,
	  but uses the HTTP or FTP protocol.  Use the following command
	  to mirror the i386 tree of the &FC; &FCVER; distribution with
	  <command>lftp</command>:
	</para>

<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/&FCVER;/i386 &amp;&amp; \
lftp -c "open http://mirror.example.com/pub/mirror/fedora/linux/core/&FCVER;/i386/ &amp;&amp; \
mirror --delete --verbose"</userinput>
</screen>

	<para>
	  The <option>-c</option> option executes a set of commands in
	  a single <command>lftp</command> process. The commands are
	  separated with <command>&amp;&amp;</command> to prevent the
	  <command>lftp</command> command from executing if the
	  <command>cd</command> command fails.  The commands in the
	  <command>lftp</command> command set work the same way.  The
	  command syntax <command>A &amp;&amp; B</command> is shorthand
	  for "if A returns success, run B."  An explanation of the
	  <command>lftp</command> commands follows:
	</para>
	  
	<itemizedlist>
	  <listitem>
	    <para>
	      <command>open</command> connects to the site and changes
	      directory automatically.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      <command>mirror</command> fetches all files and
	      directories recursively in the current directory. The
	      <command>--delete</command> option deletes all local
	      files that are not in the remote directory.  The optional
	      <command>--verbose</command> option prints progress
	      information on the screen.
	    </para>
	  </listitem>
	</itemizedlist>

	<para>
	  The <command>lftp</command> command above maintains an exact
	  copy of the directory for you.  It downloads only new or
	  changed files, and deletes only those that no longer exist on
	  the upstream mirror.
	</para>

	<para>
	  As with <command>wget</command>, it is possible you may
	  download some unwanted files.  The <command>lftp</command>
	  command supports regular expressions for excluding files
	  within a <command>mirror</command> command.  The command below
	  shows how to mirror a current &FC; distribution updates
	  repository, excluding <filename>debug</filename> and
	  <filename>repodata</filename> directories:
	</para>

<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/updates/&FCVER;/i386 &amp;&amp; \
lftp -c "set mirror:exclude-regex 'debug\/|repodata\/' &amp;&amp; \
open http://mirror.example.com/pub/mirror/fedora/linux/core/updates/&FCVER;/i386/ &amp;&amp; \
mirror --delete --verbose"</userinput>
</screen>

	<para>Consult the <command>lftp</command> man pages for more
	details and usage options.</para>

	<tip>
	  <title>Using a Proxy for HTTP or FTP Retrieval</title>
	  <para>
	    If you are behind a proxy or firewall, you may need to use an
	    HTTP proxy to mirror files.  To do this, export the
	    environment variables <command>http_proxy</command> and
	    <command>ftp_proxy</command> before you run the
	    <command>wget</command> or <command>lftp</command> commands:
	  </para>

<screen>
<userinput>export http_proxy=http://<replaceable>username</replaceable>:<replaceable>password</replaceable>@<replaceable>host</replaceable>:<replaceable>port</replaceable>
export ftp_proxy=http://<replaceable>username</replaceable>:<replaceable>password</replaceable>@<replaceable>host</replaceable>:<replaceable>port</replaceable></userinput>
</screen>

	</tip>
      </section>

      <section id="sn-rsync">
	<title>The <command>rsync</command> Command</title>
	<para>
	  Use the <command>rsync</command> command to synchronize a set
	  of files and/or directories with a remote host. It operates in
	  much the same way as <command>rcp</command>, but it is usually
	  faster. One reason for the speed is that
	  <command>rsync</command> has a special protocol that evaluates
	  and skips files (or portions of files) that are already
	  downloaded.
	</para>
	<para>
	  Begin by identifying the modules available on the upstream
	  mirror site you have chosen. Note that the double colon "::"
	  is always used after the host name to separate it from the
	  rest of the <command>rsync</command> path. The following
	  command generates a list of "modules" on the upstream mirror.
	</para>

<screen>
<userinput>rsync mirror.example.com::</userinput>
</screen>

	<para>
	  These modules are roughly equivalent to top-level directories,
	  and they follow the same rules. To list any subdirectory of
	  the upstream mirror, add the directory path to the command
	  above. For example, on many mirrors, the
	  <filename>fedora-linux-core</filename> module is equivalent to
	  the <filename>fedora/linux/core</filename> path found at the
	  &FP; main download server. To list the contents of the &FC;
	  &FCVER; distribution folder on the upstream server, issue the
	  following command. Do not forget the trailing slash "/".
	  Without it, you only receive a listing of a folder name that
	  matches the last component of the remote path.
	</para>

<screen>
<userinput>rsync mirror.example.com::fedora-linux-core/&FCVER;/</userinput>
</screen>

      </section>

      <section id="sn-rsync-download">
	<title>Downloading Using <command>rsync</command></title>
	<para>
	  To download via <command>rsync</command>, add a destination
	  path on your system to the end of the command line. The tree
	  of files shown in the listing is downloaded to the local path
	  you specify. The <command>-r</command> switch makes
	  <command>rsync</command> recurse into subdirectories.
	  Remember, if you leave off the trailing slash on the remote
	  path, then the last component of that path is created as a
	  folder, and its contents are copied into that folder.
	</para>

<screen>
<userinput>rsync -r filehouse.example.org::files/misc/ /var/www/misc/</userinput>
</screen>

	<para>
	  When downloading using <command>rsync</command> for mirror purposes,
	  use some of the command line switches to improve performance and
	  feedback. The switches <command>-PHav</command> enable the following
	  <command>rsync</command> features:
	</para>
	<variablelist>
	  <varlistentry>
	    <term>-P</term>
	    <listitem>
	      <para>
		recover partially-downloaded files, and show a progress
		meter
	      </para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>-H</term>
	    <listitem>
	      <para>
		preserve hard links
	      </para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>-a</term>
	    <listitem>
	      <para>
		recurse all directories, and preserve as much file
		information as possible, including timestamps,
		ownership, permissions, device files (if you are running
		as root), and soft links
	      </para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>-v</term>
	    <listitem>
	      <para>
		give verbose feedback to the screen
	      </para>
	    </listitem>
	  </varlistentry>
	</variablelist>

	<para>
	  Remove the <command>-v</command> switch if you run this mirroring
	  process as part of a script, or have no need to monitor progress. The
	  following example mirrors the entire &FC; &FCVER; distribution from
	  an upstream site.
	</para>
	<caution>
	  <title>Example command downloads many gigabytes of files</title>
	  <para>
	    This command downloads many gigabytes of files, and is intended for
	    use as an example only. Do not run this command if you do not
	    understand the consequences.
	  </para>
	</caution>
	  
<screen>
<userinput>rsync -PHav mirror.example.com::fedora-linux-core/&FCVER;/ /var/www/mirror/fedora/linux/core/&FCVER;</userinput>
</screen>

	<para id="rsync-n-switch">
	  The <command>-n</command> switch performs a "dry run" using
	  the other given parameters. Use this switch to test any
	  <command>rsync</command> command if you are unsure what files
	  you will receive. See also <xref
	  linkend="rsync-possible-data-loss"/>.
	</para>
	<para>
	  The <command>-z</command> switch enables compression during the
	  <command>rsync</command> process. The server compresses data before
	  transmission, and the client decompresses the data before writing it
	  to disk.
	</para>
	<tip>
	  <title>Compression using <command>rsync</command></title>
	  <para>
	    The vast majority of the &FC; distribution consists of RPM files,
	    which are already compressed data. Therefore, additional compression
	    does not save time, and instead induces an unnecessary load on the
	    upstream mirror CPU. As a courtesy, do not use the
	    <command>-z</command> switch for this purpose.
	  </para>
	</tip>
	<para>
	  The next section features some additional switches that can be used to
	  automatically trim branches from the tree of downloaded folders. With
	  proper usage, they result in a mirror that is exactly as organized and
	  full-featured as any high-volume public upstream site.
	</para>
	<warning id="rsync-possible-data-loss">
	  <title>Possible data loss</title>
	  <para>
	    If you are not exceedingly careful in using these switches, it is
	    possible to delete large portions of your mirrored data. Fixing this
	    problem might require performing the copying steps outlined
	    in <xref linkend="sn-copying-original-distribution"/> above. On the
	    other hand, if you are also careless about your destination path,
	    and you are running as root, you could put your entire system at
	    risk. Know your environment before using these switches:
	  </para>
	  <itemizedlist>
	    <listitem>
	      <para>
		What is your current working directory? Use
		<command>pwd</command> to find out, if you are unsure.
	      </para>
	    </listitem>
	    <listitem>
	      <para>
		Are you logged in as root? If you are using SELinux extensions,
		what is your current security context?
	      </para>
	    </listitem>
	    <listitem>
	      <para>
		Have you tested this command using the <command>-n</command>
		switch (see <xref linkend="rsync-n-switch"/>)?
	      </para>
	    </listitem>
	  </itemizedlist>
	</warning>
	<para>
	  Use the <command>--exclude</command> switch, along with a simple
	  pattern, to disallow download of certain files and/or folders. For
	  instance, <command>--exclude "*.iso"</command> excludes the download
	  of any file whose name ends with the string ".iso".
	</para>
	<para>
	  Use the <command>--delete</command> switch to remove any file from
	  the local system that does not exist on the upstream mirror. This
	  switch takes no pattern. It prevents unwanted <firstterm>file
	    debris</firstterm> from cropping up in your mirror. Add the
	  <command>--delete-excluded</command> switch to also remove local
	  files that match your <command>--exclude</command> patterns, which
	  retroactively trims branches of the tree that you no longer wish to
	  maintain or download.
	</para>
	<para>
	  Wildcards are permitted with <command>rsync</command> commands,
	  including the asterisk <computeroutput>*</computeroutput>, question
	  mark <computeroutput>?</computeroutput>, and brackets
	  <computeroutput>[ ]</computeroutput>. The question mark and brackets
	  work as in the shell; the former matches any single character, while
	  the brackets define a set of characters to be matched. Asterisks are
	  especially powerful when combined with a portion of a file name. The
	  double asterisk <computeroutput>**</computeroutput> pattern matches
	  any sequence of characters, <emphasis>including slashes</emphasis>;
	  a single asterisk <computeroutput>*</computeroutput> matches any
	  sequence of characters, but stops at a slash. Therefore, be
	  judicious about using either. The
	  double asterisk is very useful for mirroring a tree that includes
	  multiple instances of directories and files that contain a pattern. A
	  good example is mirroring several versions of &FC;, where certain
	  folder names appear in every version.
	</para>
	<tip>
	  <title>Pattern matching wildcards</title>
	  <para>
	    Use double asterisks to trim out directories that repeat throughout
	    a mirrored tree. For example, when mirroring for a site that
	    only uses i386 architecture machines, you may trim all files and
	    folders marked for x86_64 architecture, using the switch
	    <command>--exclude "**x86_64**"</command>. This matches not only
	    folders marked <filename>x86_64</filename>, but also files such as
	    ISO images for x86_64, which are indicated by file names such as
	    <filename>FC&FCVER;-x86_64-disc1.iso</filename>.
	  </para>
	</tip>
	<para>
	  Process a long list of exclusions with the
	  <command>--exclude-from</command> option. Follow the option with
	  the name of a file that contains a list of patterns, one per
	  line.
	</para>
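
	<para>
	  As a minimal sketch, assume a pattern file with the
	  hypothetical name
	  <filename>/etc/mirror-excludes.txt</filename>. The file name
	  and the patterns are illustrative only. The file might contain:
	</para>

<screen>
<computeroutput>**x86_64**
**debug**
*.iso</computeroutput>
</screen>

	<para>
	  Pass the file to <command>rsync</command> as shown below. The
	  <command>-n</command> switch keeps this a dry run:
	</para>

<screen>
<userinput>rsync -Pan --exclude-from=/etc/mirror-excludes.txt \
  mirror.example.com::fedora-linux-core/&FCVER;/ \
  /var/www/mirror/fedora/linux/core/&FCVER;</userinput>
</screen>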
	<para>
	  These syntax hints only scratch the surface of
	  <command>rsync</command>, but suffice to make your first mirror. Once
	  you have selected your site and formulated your excludes and deletes,
	  run your <command>rsync</command> command with the
	  <command>-n</command> option, which performs a trial run without
	  transferring or deleting any files. Redirect output to a file so you
	  can examine the resulting list of files in the editor or pager of your
	  choice.
	</para>
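	<para>
	  A trial run of this kind might look like the following sketch. The
	  host, module, and local path are placeholders, and the output file
	  name <filename>~/mirror-test.txt</filename> is only an assumption
	  for illustration. The <command>-v</command> switch lists each file
	  that would be transferred or deleted.
	</para>

<screen>
<userinput>rsync -avn --delete --exclude "**x86_64**" \
  mirror.example.com::fedora-linux-core/&FCVER;/ \
  /var/www/mirror/fedora/core/&FCVER; &gt; ~/mirror-test.txt
less ~/mirror-test.txt</userinput>
</screen>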
	<para>
	  The following example mirrors the entire &FC; &FCVER; distribution,
	  with <command>--exclude</command> options that avoid downloading:
	</para>
	<itemizedlist>
	  <listitem>
	    <para>
	      Any information for x86_64 architecture;
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      Any <command>yum</command> headers (see <xref
	      linkend="sn-repositories"/>);
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      Any <filename>debuginfo</filename> packages; and,
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      CD or DVD images.
	    </para>
	  </listitem>
	</itemizedlist>
	<para>
	  The <command>-n</command> switch is included for testing purposes.
	  Backslashes at the ends of lines indicate this example is a single
	  command line.
	</para>

<screen>
<userinput>rsync -Pan --delete --exclude "**x86_64**" --exclude "**headers**" \
  --exclude "**debug**" --exclude "**iso**" \
  mirror.example.com::fedora-linux-core/&FCVER;/ \
  /var/www/mirror/fedora/core/&FCVER;</userinput>
</screen>

      </section>
      
    </section>

    <section id="sn-maintenance">
      <title>Maintaining Your Mirror</title>
      <para>
	&FED; mirrors are even more useful when they are more than just a
	snapshot of the distribution at release time. Most mirror administrators
	also choose to carry updates and errata packages. Repositories of
	updates or development trees change daily, and your mirror should
	reflect these changes.
      </para>
      <important>
	<title><command>rsync</command> etiquette</title>
	<para>
	  If you plan to do regular updates of your mirror that include large
	  amounts of data, you should ask permission from the administrator of
	  the upstream mirror. Downloading nightly package updates for the
	  official releases of &FC; &FCVER; should not require notification, as
	  they are rarely more than a few megabytes. However, the
	  <filename>development</filename> tree routinely turns over several
	  hundred megabytes nightly. Take these factors into consideration
	  before putting any maintenance scripts into effect.
	</para>
      </important>
      <para>
	Once your <command>rsync</command> command is working as desired, you may
	want to place it in a nightly <command>cron</command> script. The
	<command>cron</command> system allows you to schedule
	regularly-occurring jobs on your system. The intervals are highly
	configurable, but a nightly run keeps your mirror synchronized with
	updates and errata. Make sure your nightly <command>cron</command> job
	follows some simple guidelines:
      </para>
      <itemizedlist>
	<listitem>
	  <para>
	    If your upstream mirror only synchronizes once or twice daily, run
	    your job <emphasis>after</emphasis> the upstream mirror completes
	    its update. This ensures your mirror not only gets the freshest
	    material, but also does not interfere with the upstream server's
	    bandwidth while it runs its job. If you do not know this time, it is
	    usually safe to plan your downloads for pre-dawn hours.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Be sure you have sufficient disk space for additional packages. The
	    <filename>updates</filename> tree in particular grows over time as
	    more errata packages are released.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Always test your script thoroughly before allowing it to run
	    automatically. Use a <command>-n</command> or <command>-v</command>
	    switch in the <command>rsync</command> command line for testing, and
	    then remove it once you have completed testing. Remember that the
	    results are e-mailed to your account on your system unless you
	    specify differently. Read the <command>crontab(5)</command> man
	    pages for additional information, with the command <command>man 5
	      crontab</command>.
	  </para>
	</listitem>
      </itemizedlist>
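      <para>
	As a rough sketch of these guidelines, the tested
	<command>rsync</command> command could be saved in an executable
	script and scheduled with a <command>crontab</command> entry. The
	script path <filename>/usr/local/bin/mirror-sync</filename>, the
	4:30 AM start time, and the mail address are all assumptions made
	for illustration. The host, module, and local path are placeholders.
      </para>

<screen>
<computeroutput>#!/bin/sh
# Hypothetical nightly mirror script; adjust host, module, and paths.
# The -n test switch has been removed so that files are really transferred.
rsync -a --delete --exclude "**x86_64**" --exclude "**headers**" \
  --exclude "**debug**" --exclude "**iso**" \
  mirror.example.com::fedora-linux-core/&FCVER;/ \
  /var/www/mirror/fedora/core/&FCVER;</computeroutput>
</screen>

      <para>
	A matching entry, added with the command <command>crontab
	  -e</command>, might read:
      </para>

<screen>
<computeroutput># Mail job output to this (hypothetical) address
MAILTO=admin@example.com
# minute hour day-of-month month day-of-week command
30 4 * * * /usr/local/bin/mirror-sync</computeroutput>
</screen>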
    </section>

  </section>

  <section id="sn-server-config">
    <title>Server Configuration</title>

    <para>
      This section describes how to set up an HTTP (Web) server to
      support &FED; installation and software management applications.
    </para>

    <section id="sn-installing-apache">
      <title>Installing The Apache Web Server</title>
      <para>
        &FC; provides the Apache server in the
        <filename>httpd</filename> package.  The
        <filename>httpd</filename> package is included on &FED; systems
        installed with the <guilabel>Server</guilabel> installation
        type.  You may have installed it later in order to run websites
        or Web applications. &FEX; also offers alternative HTTP servers,
        which are beyond the scope of this document.
      </para>
      <para>
        To install the <filename>httpd</filename> package, if you have
        not already done so, use the following command:
      </para>

<screen>
<userinput>su -c 'yum install httpd'</userinput>
</screen>

      <para>
        Enter the password for the
        <systemitem class="username">root</systemitem> account when
        prompted.
      </para>
      <para>
        To start the service, use the following command:
      </para>

<screen>
<userinput>su -c '/sbin/service httpd start'</userinput>
</screen>

      <para>
        Enter the password for the
        <systemitem class="username">root</systemitem> account when
        prompted.
      </para>
      <para>
        To enable this service to load automatically at boot time, use
        the following command:
      </para>

<screen>
<userinput>su -c '/sbin/chkconfig --level 345 httpd on'</userinput>
</screen>

      <para>
        Enter the password for the
        <systemitem class="username">root</systemitem> account when
        prompted.
      </para>
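      <para>
        As an optional check, the following commands should report
        whether the service is running and the runlevels in which it
        starts. This is only a sketch of a quick verification; depending
        on your configuration, you may need to run the commands with
        <command>su -c</command> as shown above.
      </para>

<screen>
<userinput>/sbin/service httpd status
/sbin/chkconfig --list httpd</userinput>
</screen>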
      <para>
        The default firewall configuration for &FED; blocks access from
        remote systems. To enable other systems to connect to your HTTP
        service, use the
        <application>system-config-securitylevel</application> utility:
      </para>
      <procedure>
        <step>
          <para>
            Choose <menuchoice> <guimenu>Desktop</guimenu>
            <guisubmenu>System Settings</guisubmenu>
            <guimenuitem>Security Level</guimenuitem> </menuchoice>.
          </para>
        </step>
        <step>
          <para>
            Enter the password for the
            <systemitem class="username">root</systemitem> account when
            prompted.
          </para>
        </step>
        <step>
          <para>
            Select <guilabel>WWW (HTTP)</guilabel> from the list of
            services.
          </para>
        </step>
        <step>
          <para>
            When prompted, select <guilabel>Yes</guilabel> to update the
            firewall configuration.
          </para>
        </step>
      </procedure>
    </section>
    <section id="sn-configuring-apache">
      <title>Configuring The Apache Web Server</title>
      <para>
        To enable HTTP access to the files in your mirror directory,
        create the configuration file
        <filename>/etc/httpd/conf.d/mirror.conf</filename>. The
        following listing is an example:
       </para>
       <example>
        <title>Apache 2.x configuration file for &FED; mirror</title>

<screen>
<computeroutput><![CDATA[# The name at which the mirror will be shared, 
# followed by the name of the root directory of that tree.
Alias /mirror /var/www/mirror

# Share options for the mirror. 
# Only allow connections from localhost and 
# IP addresses which start with 192.168.1
    <Directory /var/www/mirror>
      AllowOverride None
      Order Deny,Allow
      Deny from all
      Allow from 127.0.0.1 192.168.1
      Options Indexes
    </Directory>]]></computeroutput>
</screen>

      </example>
      <para>
        You must use root privileges to create or copy files in the
        directory <filename>/etc/httpd/conf.d/</filename>.
      </para>
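      <para>
        Before you apply a new configuration, it may help to have Apache
        check the syntax of its configuration files. The
        <command>-t</command> option of <command>httpd</command> runs
        this test without starting or stopping the server, and typically
        reports <computeroutput>Syntax OK</computeroutput> when no
        errors are found.
      </para>

<screen>
<userinput>su -c '/usr/sbin/httpd -t'</userinput>
</screen>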
      <para>
        To update an active <command>httpd</command> service with a new
        configuration, use the following command:
      </para>

<screen>
<userinput>su -c '/sbin/service httpd reload'</userinput>
</screen>

      <para>
        Enter the password for the
        <systemitem class="username">root</systemitem> account when
        prompted.
      </para>
      <para>
         Your clients may now visit any area of your mirror by using the
         URL
         http://<replaceable>server.mydomain.org</replaceable>/mirror/<replaceable>path</replaceable>.
       </para>
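      <para>
        One way to check access from a client in the permitted address
        range is to request only the response headers with
        <command>curl</command>, assuming that utility is installed on
        the client. The host name is a placeholder. A
        <computeroutput>200</computeroutput> status suggests the mirror
        is reachable, while a <computeroutput>403</computeroutput>
        status usually points to the access rules in
        <filename>mirror.conf</filename>.
      </para>

<screen>
<userinput>curl -I http://<replaceable>server.mydomain.org</replaceable>/mirror/</userinput>
</screen>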
      <note>
        <title>Apache and &SEL;</title>
        <para>
          The default &SEL; configuration for &FED; permits Apache to
          use files in the <filename>/var/www/</filename> directory. If
          you build your mirror in another directory, you may need to
          modify the &SEL; policy.
        </para>
      </note>
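      <para>
        In many cases it is enough to label the mirror files with a type
        that the policy already allows Apache to read, rather than
        changing the policy itself. The following sketch assumes the
        targeted policy, where that type is
        <computeroutput>httpd_sys_content_t</computeroutput>, and a
        hypothetical mirror directory
        <filename>/srv/mirror</filename>.
      </para>

<screen>
<userinput>su -c 'chcon -R -t httpd_sys_content_t /srv/mirror'</userinput>
</screen>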
     </section>
    <section id="sn-solving-dependencies">
      <title>Solving Dependencies</title>
      <para>
	Every RPM package has a <indexterm> <primary>RPM</primary>
	  <secondary>header</secondary>
	  </indexterm><firstterm>header</firstterm> that contains all
	the vital information about that package. This information
	includes name, version and release, contents, the capabilities
	provided by the package, and any prerequisites. These
	prerequisites may include
	<emphasis>dependencies</emphasis><indexterm>
	  <primary>RPM</primary>
	  <secondary>dependencies</secondary>
	  </indexterm>. A dependency is a requirement for one or more
	additional packages.
      </para>
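      <para>
        To see this header information for an installed package, query
        it with <command>rpm</command>. The following sketch uses the
        <filename>httpd</filename> package only as an example; the first
        command lists the capabilities the package provides, and the
        second lists its prerequisites.
      </para>

<screen>
<userinput>rpm -q --provides httpd
rpm -q --requires httpd</userinput>
</screen>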
      <para>
	Packages installed without satisfying their dependencies may not
	work correctly. Dependencies may create a problem for users who
	are trying to install a single package. Manually determining and
	resolving dependencies is difficult. &FC; provides the
	<command>yum</command> utility for solving these dependencies
	automatically, providing an improved user experience.
      </para>

      <para>
      The Yellowdog Updater, Modified, or
	<emphasis>yum</emphasis><indexterm> <primary>yum</primary>
	</indexterm>, is a Python-based system for computing and solving
	RPM dependencies. A <command>yum</command> client retrieves a
	cache of headers from its repository server, as well as a list
	of available RPM packages and their exact locations on the
	server. It can do this via HTTP or FTP, as well as using
	standard file system calls (either local or remote via NFS). The
	client computes solutions to any package dependencies using the
	downloaded header information, and requests all necessary
	RPM packages once it has finished. The <command>yum</command>
	command relies on <command>rpm</command> functions to perform
	many of the computations involved in the process.
      </para>
      <para>
	A drawback to <command>yum</command> is that the first time it
	is run, it must download a header for every package installed on
	the system in order to determine available updates. However,
	running a local mirror nullifies this drawback. The
	<command>yum</command> command can download many megabytes of
	headers almost instantly on a standard Ethernet LAN. The
	<command>yum</command> utility is the most popular update method
	for &FC;.
      </para>
      <para>
	For more information about using <command>yum</command>, refer
	to <ulink url="http://fedora.redhat.com/docs/yum/"/>.
      </para>

    </section>

    <section id="sn-repositories">
      <title>Configuring Repositories</title>
      <para>
      A <command>yum</command>
	<emphasis>repository</emphasis><indexterm>
	  <primary>repository</primary>
	</indexterm> is a collection of packages on a server which
	supports <command>yum</command> clients. Repositories can serve
	both types of clients if desired.
      </para>

      <para>
	To set up a <command>yum</command> repository, you must create a
	directory that contains information which the clients require to
	resolve RPM dependencies. The directory's name depends on the
	version of <command>yum</command> it supports. It is permissible
	to have both kinds of repository information in a single
	repository.
      </para>
      <para>
	To support older <command>yum</command> clients, use the
	<command>yum-arch</command> command.  To support current
	<command>yum</command> clients, use the
	<command>createrepo</command> command.
      </para>
      <important>
	<title>Supporting &FC; 3 and beyond</title>
	<para>
	  &FC; 3 ships with a newer version of <command>yum</command>.
	  To support &FC; 3 <command>yum</command> clients, you
	  <emphasis>must</emphasis> use <command>createrepo</command> on
	  your server's repositories.
	</para>
      </important>

      <section id="sn-yum-arch">
	<title><command>yum-arch</command></title>
	<para>
	  The <command>yum-arch</command> command creates a directory
	  named <filename>headers/</filename> which supports older
	  versions of <command>yum</command> (before 2.2). The
	  <command>yum-arch</command> program searches recursively
	  through a target directory and any subdirectories for RPM
	  packages, and includes them in the header data.  The
	  <command>yum-arch</command> command always creates the
	  <filename>headers/</filename> directory in the current working
	  directory.  Therefore you should change your working directory
	  to the directory where you want <filename>headers/</filename>
	  to appear.
	</para>

<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/&FCVER;/i386/os
su -c 'yum-arch -ls .'</userinput>
</screen>

	<para>
	  Enter the root password at the prompt.  The
	  <command>-l</command> switch follows symbolic links. The
	  <command>-s</command> switch includes SRPMS (source RPM
	  packages) in the header list. The command above creates the
	  <command>yum</command> header cache in the directory
	  <filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/headers/</filename>.
	</para>
      </section>

      <section id="sn-createrepo">
	<title><command>createrepo</command></title>
	<para>
	  The <command>createrepo</command> command creates repository
	  information to support newer versions of
	  <command>yum</command> (and possibly other repository client
	  programs). The <command>createrepo</command> command stores
	  this data in a folder named <filename>repodata</filename>.
	  Run <command>createrepo</command> against the directory
	  <emphasis>under which</emphasis> you want the
	  <filename>repodata</filename> directory to appear. The
	  <command>createrepo</command> program also searches
	  recursively for RPM packages to include in the repository
	  data.
	</para>
	<para>
	  The following command creates the repository data in the
	  directory
	  <filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/repodata</filename>.
	</para>

<screen>
<userinput>su -c 'createrepo /var/www/mirror/fedora/linux/core/&FCVER;/i386/os'</userinput>
</screen>

	<para>
	  To create repository data for package groups in addition to
	  the package files, use the <command>createrepo -g</command>
	  command.  The <option>-g</option> option requires a parameter
	  which points to the group file, <emphasis>relative</emphasis>
	  to the given location of the package data.  The following
	  command creates the package group data corresponding to the
	  repository directly above.  Note the relative location of the
	  group file
	  <filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/Fedora/base/comps.xml</filename>.
	</para>

<screen>
<userinput>su -c 'createrepo -g Fedora/base/comps.xml /var/www/mirror/fedora/linux/core/&FCVER;/i386/os'</userinput>
</screen>

	<para>
	  You may have certain clients who update their version of
	  <command>yum</command> in a non-prescribed way. To minimize
	  problems for your clients, create both kinds of repository
	  data for any repositories.  The extra repository information
	  is relatively small and will not affect your mirror's proper
	  function.
	</para>
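	<para>
	  As a quick check after running <command>createrepo</command>,
	  list the new directory. It should contain an index file named
	  <filename>repomd.xml</filename> along with the compressed
	  metadata files that the index refers to. The path below assumes
	  the repository location used in the examples above.
	</para>

<screen>
<userinput>ls /var/www/mirror/fedora/linux/core/&FCVER;/i386/os/repodata</userinput>
</screen>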

      </section>

      <section id="sn-repository-locations">
	<title>Repository Locations</title>
	<para>
	  Typically you will run <command>yum-arch</command> or
	  <command>createrepo</command> against at least the following
	  locations:
	</para>
	<itemizedlist>
	  <listitem>
	    <para>
	      The stock distribution; for example,
	      <filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/</filename>. 
	      For <command>yum-arch</command>, use the
	      <command>-l</command> and <command>-s</command> options to
	      follow the linked directory <filename>SRPMS</filename> and
	      include the source packages therein.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      Official updates to the distribution; for example,
	      <filename>/var/www/mirror/fedora/linux/core/updates/&FCVER;/</filename>. 
	      Once again, for <command>yum-arch</command> use
	      <command>-l</command> and/or <command>-s</command> if
	      appropriate.
	    </para>
	  </listitem>
	</itemizedlist>
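	<para>
	  The following sketch runs <command>createrepo</command> against
	  both locations in one step. It assumes the example paths listed
	  above; adjust them to match the layout of your mirror. Run it
	  again whenever the packages in either location change.
	</para>

<screen>
<userinput>su -c 'for dir in /var/www/mirror/fedora/linux/core/&FCVER;/i386/os \
    /var/www/mirror/fedora/linux/core/updates/&FCVER;; do
  createrepo "$dir"
done'</userinput>
</screen>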
      </section>

    </section>

  </section>

  <section id="sn-client-config">
    <title>Client Configuration</title>

    <para>
      Client systems that use <command>yum</command> to contact your
      mirror also require configuration.  The <command>yum</command>
      repository configuration files are located in
      <filename>/etc/yum.repos.d</filename> and end with the suffix
      <filename>.repo</filename>.  Below is an example configuration
      file.
    </para>

    <example>
      <title>Example
      <filename>/etc/yum.repos.d/fedora-mirror.repo</filename></title>

<screen>
<computeroutput>[mirror]
name=Fedora Core $releasever - $basearch - Base
baseurl=http://server.mydomain.net/mirror/fedora/linux/core/$releasever/$basearch/os/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora</computeroutput>
</screen>

    </example>

    <para>
      Client systems should use a repository configuration file for each
      &FED; branch your mirror provides.  The base distribution and
      released updates, for example, each require a separate
      configuration file.
    </para>
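    <para>
      A second file for released updates might look like the following
      sketch. The file name
      <filename>/etc/yum.repos.d/fedora-mirror-updates.repo</filename>
      and the section name <computeroutput>[mirror-updates]</computeroutput>
      are hypothetical. The <computeroutput>baseurl</computeroutput>
      assumes you generated the repository data in the updates directory
      described in <xref linkend="sn-repository-locations"/>; it must
      point to the directory that contains the repository data.
    </para>

<screen>
<computeroutput>[mirror-updates]
name=Fedora Core $releasever - Released Updates
baseurl=http://server.mydomain.net/mirror/fedora/linux/core/updates/$releasever/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora</computeroutput>
</screen>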

    <para>
      If you want clients to use your mirror in place of the official
      repositories, disable the existing repositories.  To do this, edit
      the client's file for the official repository to include the
      directive <userinput>enabled=0</userinput>.  You will need
      <systemitem class="username">root</systemitem> access to edit
      these files.
    </para>
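    <para>
      For illustration, a disabled entry might look like the following
      fragment. The section name and description shown here are
      assumptions; keep whatever names and other lines already appear in
      the client's file, and change only the
      <computeroutput>enabled</computeroutput> line.
    </para>

<screen>
<computeroutput>[base]
name=Fedora Core $releasever - $basearch - Base
enabled=0</computeroutput>
</screen>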

    <para>
      Many repositories provide their own installable RPM packages
      containing these configuration files.  When a user installs the
      RPM, the new files in <filename>/etc/yum.repos.d/</filename>
      reference that repository.  These packages simplify the addition
      of new repositories for end users.  Whether you use such a package
      yourself will depend on the number and skill set of clients your
      repository serves.
    </para>

  </section>

<!-- 

  FIXME:
  
  The following section is out of scope for now. When more documents are
  available, this would make a great "see also" section.

  <section id="sn-advanced-topics">
    <title>Advanced Topics</title>
    <para>
      No outline here yet. Suggestions: distributing via kickstart (xref
      kickstart tutorial?); rolling custom RPMs, starting with up2date; rolling
      custom distro (xref RedHat-CD-HOWTO?)....
    </para>

  </section>

-->

  <index id="generated-index">
  </index>

</article>

<!--
Local variables:
mode: xml
fill-column: 72
End:
-->



