mirror-tutorial/en_US mirror-tutorial.xml,NONE,1.1
Paul W. Frields (pfrields)
fedora-docs-commits at redhat.com
Sat Mar 4 20:58:42 UTC 2006
Author: pfrields
Update of /cvs/docs/mirror-tutorial/en_US
In directory cvs-int.fedora.redhat.com:/tmp/cvs-serv15972/en_US
Added Files:
mirror-tutorial.xml
Log Message:
Move to proper en_US locale
--- NEW FILE mirror-tutorial.xml ---
<!-- $Id: -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY % FEDORA-ENTITIES-EN SYSTEM "../../docs-common/common/fedora-entities-en.ent">
%FEDORA-ENTITIES-EN;
<!ENTITY DOCNAME "mirror-tutorial">
<!ENTITY DOCVERSION "1.0">
<!ENTITY DOCDATE "2006-02-06">
<!ENTITY DOCID "&DOCNAME;-&DOCVERSION; (&DOCDATE;)"> <!-- change version of manual and date here -->
<!ENTITY FCLOCALVER "4">
]>
<article id="mirror-tutorial" lang="en">
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
href="fdp-info.xml"/>
<section id="sn-introduction">
<title>Introduction</title>
<section id="sn-purpose">
<title>Purpose</title>
<para>
This tutorial presents a number of related topics that allow an
administrator to seamlessly integrate mirroring and update
services for &FC;. Use these services to provision a classroom,
laboratory, or office. Local services also make &FC; easier to
use, and add to the perceived value of non-proprietary operating
systems and software.
</para>
&BUG-REPORTING;
</section>
<section id="sn-audience">
<title>Audience</title>
<para>
You will find this tutorial more useful if you are a system
administrator, or a &FC; "power user" familiar with the
following topics:
</para>
<itemizedlist>
<listitem>
<para>
&FC; system installation and administration
</para>
</listitem>
<listitem>
<para>
Basic Internet protocols (HTTP/Web)
</para>
</listitem>
<listitem>
<para>
Using a command line interface
</para>
</listitem>
</itemizedlist>
</section>
<section id="sn-about-mirrors">
<title>About Mirrors</title>
<para>
A <emphasis>mirror</emphasis>
<indexterm><primary>mirror</primary></indexterm> is a server
that provides a copy of one or more collections of files.
Mirroring a site reduces traffic to the original source site,
thus spreading the stress and bandwidth costs of many users
across many sites. Side benefits of running a local mirror
include very fast access through the local network, providing
custom services to local users, and increasing your skills in
managing Internet services.
</para>
<para>
The site from which you retrieve files to build your mirror is
called an <emphasis>upstream mirror</emphasis><indexterm>
<primary>mirror</primary> <secondary>upstream</secondary>
</indexterm>. If possible, choose an upstream mirror that is
located close to you geographically. This reduces unnecessary
traffic across transcontinental sections of the Internet, where
bandwidth is limited and expensive. Use only upstream mirrors
that are intended for public access, unless you have permission
from the upstream mirror site administrator.
</para>
</section>
<section id="sn-additional-resources">
<title>Additional Resources</title>
<para>
For more information on installing &FC; see the &FC; &IG; at
&IG-URL;. For more information on basic Internet protocols, see
<ulink
url="http://library.albany.edu/internet/internet.html">http://library.albany.edu/internet/internet.html</ulink>,
or search Google at <ulink
url="http://www.google.com/">http://www.google.com/</ulink>.
For more general information about mirrors, see <ulink
url="http://en.wikipedia.org/wiki/Mirror_(computing)">http://en.wikipedia.org/wiki/Mirror_(computing)</ulink>.
</para>
</section>
<section id="sn-acknowledgements">
<title>Acknowledgements</title>
<para>
Karsten Wade provided editorial services and kept the style
crisp and consistent. Stuart Ellis provided some additional
security-related information.
</para>
</section>
</section>
<section id="sn-planning-and-setup">
<title>Planning and Setup</title>
<section id="sn-hierarchy">
<title>The Distribution Structure</title>
<para>
The &FED; <emphasis>distribution</emphasis><indexterm>
<primary>distribution</primary>
</indexterm>, which is the collection of all &FED;-related
files, uses the directory tree in <xref
linkend="ex-fedora-dir-tree"/>. It may include multiple
versions of &FC;. The tree design makes it easier to "trim"
unnecessary or undesired files. When you set up a mirror,
duplicate this tree exactly, or as closely as possible. If you
duplicate the tree, it will be easier to automate nightly
updates.
</para>
<example id="ex-fedora-dir-tree">
<title>Fedora directory tree</title>
<screen>
<computeroutput>fedora
 +-- linux
      +-- core
           |-- 1
           |    ...
           +-- &FCVER;
           |    +-- SRPMS
           |    +-- i386
           |    |    +-- debug
           |    |    +-- iso
           |    |    +-- os
           |    |         +-- Fedora
           |    |         +-- SRPMS
           |    |         +-- images
           |    |         +-- isolinux
           |    +-- x86_64
           +-- development
           |    ...
           +-- test
           |    ...
           +-- updates
                +-- 1
                |    ...
                +-- &FCVER;
                |    +-- SRPMS
                |    +-- i386
                |    +-- x86_64
                +-- testing
                     +-- 1
                     |    ...
                     +-- &FCVER;
                          +-- SRPMS
                          +-- i386
                          +-- x86_64</computeroutput>
</screen>
</example>
<note>
<title>Naming conventions</title>
<para>
Throughout the rest of the document,
<filename>/var/www/mirror</filename> represents the folder
where all your mirrored files are stored. You may substitute a
different location. This location simplifies sharing your
mirror, due to the shipping configuration of &FC;. See <xref
linkend="sn-server-config"/> for more information. The site
name <computeroutput>mirror.example.com</computeroutput>
represents the upstream mirror.
</para>
</note>
<para>
The
<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/os</filename>
directory contains a copy of all the original distribution files
for &FC; &FCVER;. They are the same files found on the DVD and
CD-ROM version of the distribution. The
<filename>&FED;</filename> subfolder contains all the files that
are necessary for installation, including the entire collection
of &FC; RPM packages. The <filename>images</filename> folder
contains copies of any floppy diskette or CD-ROM images that
boot a system into installation or rescue modes. The
<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/iso</filename>
folder contains images of the CD-ROM version of the
distribution.
</para>
<note>
<title>RPM packages</title>
<para>
<firstterm>RPM</firstterm><indexterm> <primary>RPM</primary>
</indexterm>, originally the Red Hat Package Manager and now
the RPM Package Manager, is not just a file format. RPM is
also a system that tracks and interconnects software and
version information. The RPM system is quite popular, and many
other Linux distributions use RPM as well. Read more
information on RPM at <ulink
url="http://www.rpm.org/">http://www.rpm.org/</ulink>.
</para>
</note>
<para>
The <filename>SRPMS</filename> folders under
architecture-specific branches are links that point to the main
<filename>SRPMS</filename> folder for that distribution. For
example, <filename>fedora/linux/core/2/i386/os/SRPMS</filename>
is a link that points to
<filename>fedora/linux/core/2/SRPMS</filename>.
</para>
<para>
A &FED; mirror consists of at least the original ISO images
<emphasis>or</emphasis> the distribution files. If possible,
include both, provided you have sufficient disk space and/or
bandwidth.
</para>
</section>
<section id="sn-copying-original-distribution">
<title>Copying the Original Distribution</title>
<para>
If you already have reliable CD-ROM installation discs of a
distribution, reduce your initial bandwidth and time spent
mirroring by copying the files from the discs to your server.
Copy all files from Installation Disc 1 into the
<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/os</filename>
folder. Then copy all files from the <filename>&FED;</filename>
folder of each of the remaining Installation discs into the
<filename>fedora/linux/core/&FCVER;/<replaceable>arch</replaceable>/os/&FED;</filename>
folder on the server.
</para>
<para>
Copy all the files from the <filename>SRPMS</filename> folder on
each of the "Sources" discs to the
<filename>fedora/linux/core/&FCVER;/SRPMS</filename> folder on
the server. Make a link in the <filename>os</filename> folder
that occurs under each architecture. Follow this example:
</para>
<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/&FCVER;/i386/os
ln -s ../../SRPMS SRPMS</userinput>
</screen>
<para>
The documentation for <application>anaconda</application><indexterm>
<primary>anaconda</primary>
</indexterm>, the &FC; installation program, calls this directory
structure an <firstterm>exploded tree</firstterm><indexterm>
<primary>exploded tree</primary>
</indexterm>. This is because the package data on each CD is extracted,
or exploded, to a large directory tree with a predetermined structure.
The <application>anaconda</application> installer expects this structure
to some extent.
</para>
<para>
If you <emphasis>only</emphasis> include CD images, create a mirror
suitable for installation services by mounting each CD image under the
<filename><replaceable>arch</replaceable>/os/</filename> directory. Make
a directory for each disc, naming them <filename>disc1</filename>,
<filename>disc2</filename>, and so on. Mount each disc on the
appropriate folder, and add entries to <filename>/etc/fstab</filename>
to perform this mount automatically in case of a reboot. Each entry
looks like this:
</para>
<screen>
<computeroutput>/<replaceable>path</replaceable>/i386/iso/FC&FCVER;-i386-disc1.iso /<replaceable>path</replaceable>/i386/os/disc1 iso9660 ro,loop 0 0</computeroutput>
</screen>
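<para>
As a minimal sketch of the manual steps, the commands below create
the mount point for the first disc and mount its image through a loop
device. They assume you run them as root, and that
<replaceable>path</replaceable> stands for the same location used in
the <filename>/etc/fstab</filename> entry. Repeat both commands for
each remaining disc:
</para>
<screen>
<userinput>mkdir -p /<replaceable>path</replaceable>/i386/os/disc1
mount -o loop,ro /<replaceable>path</replaceable>/i386/iso/FC&FCVER;-i386-disc1.iso /<replaceable>path</replaceable>/i386/os/disc1</userinput>
</screen>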
<para>
The <application>anaconda</application> installer application
automatically detects these folders and uses them properly. In
addition, system configuration tools such as
<application>system-config-packages</application> also continue
to work properly when pointed at the parent of the ISO image
mount points.
</para>
<para>
There are drawbacks to using CD ISO images in this fashion. For
instance, no one directory contains the entire distribution of RPM
packages. Soft links circumvent this problem, but your server security
policies may not permit them. &FC; also comes in an ISO format DVD image,
which alleviates this problem. Users who do not have DVD burning
hardware, however, cannot use this image to make discs for their own
use.
</para>
<para>
You only need a single line in <filename>/etc/fstab</filename>
for mounting the &FC; DVD ISO image. The entry looks like this:
</para>
<screen>
<computeroutput>/<replaceable>path</replaceable>/i386/iso/FC&FCVER;-i386-DVD.iso /<replaceable>path</replaceable>/i386/os iso9660 ro,loop 0 0</computeroutput>
</screen>
</section>
<section id="sn-trimming-tree">
<title>Trimming Branches</title>
<para>
You may omit almost any branch of the tree that you do not plan to use.
Consider carefully the impact of excluding that folder. Branches you
might trim from your mirror include:
</para>
<variablelist>
<varlistentry>
<term>Older versions of &FC; (any numbered directory).</term>
<listitem>
<para>
Before you exclude an old version, ensure
this does not adversely affect any of your users. These adverse
effects can come in many forms. For example, the level of support
for certain hardware sometimes changes between releases of &FC;.
Users who cannot install a previous version may not be able to use
&FC;. Your users might need to perform software-related tasks such
as building packages for different &FC; releases. Always remain
aware of the needs of your users during the planning stage.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Folders for architectures your site does not support.</term>
<listitem>
<para>
If you do not have any x86-64 hosts to support, trimming these
folders eliminates several gigabytes of extra files. If you
support x86-64 hosts later, though, you must restore mirroring of
these branches.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>The <filename>development</filename> folder (formerly
"Rawhide").</term>
<listitem>
<para>
This folder contains all the latest "bleeding-edge"
packages from the &FP;. If you participate in active &FED;
development, you should not trim this branch. &FED;
development moves at a rapid pace and requires frequent
updates to the latest development package
versions. However, the frequent updates cause your mirror
to download significant amounts of material during the
regular update cycle.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>The <filename>test</filename> and
<filename>testing</filename> folders.</term>
<listitem>
<para>
These branches contain updates that are being subjected to
quality assurance through public testing, as well as the
test or "pre-release" versions of the &FC;
distribution. The <filename>test</filename> folder
under the main <filename>core</filename> tree is where
test versions of the distribution, such as &FC;
&FCTESTVER;, are kept. (Users of &FC; test distributions
are often directed to use the
<filename>development</filename> branch to update
packages.) The <filename>testing</filename> folder, under
<filename>updates</filename>, contains package updates
that have not yet passed the public testing phase.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>The <filename>debug</filename> folders.</term>
<listitem>
<para>
These folders contain packages that enable developers and
skilled users to interpret data created when a program
crashes or encounters a bug. If you participate actively
in &FED; development, you should not trim these
folders. If you trim this branch, you may still download
individual packages as needed from a nearby public mirror
site.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>The <filename>SRPMS</filename> folders (and links
thereto).</term>
<listitem>
<para>
These folders contain the original source for all the
binary RPM packages in the distribution. You may download
these packages individually as needed to save space on
your local mirror.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Unless your site closely manages workstation configuration, you
should probably not trim any of the <filename>updates</filename>
branches for the distributions you support. These locations
contain packages with bug fixes, security patches, and errata
updates that your users probably want.
</para>
</section>
<section id="sn-download-files">
<title>Downloading the Files</title>
<para>
Locate a public mirror site for &FC; by referring to the main
project site's mirror page, &FDP-URL;. Once you have selected a
nearby mirror site, note what services it offers (FTP, HTTP,
and/or rsync). A mirror usually serves a large number of
users. Choose off-peak hours, when possible, to download a large
set of files. Be aware of any timezone differences when
estimating off-peak hours.
</para>
<section id="sn-http-and-ftp-download">
<title>Download Using HTTP or FTP</title>
<para>
To download via HTTP or FTP, use either the
<command>wget</command> or <command>lftp</command>
command. The <command>wget</command> command recurses
subdirectories automatically and pulls down entire trees of
data with a single command. If you are not careful, however,
it is possible to pull down much more data than you
intended. The following commands mirror the entire current
&FC; distribution:
</para>
<screen>
<userinput>cd /var/www/mirror
wget --mirror -np -nH --cut-dirs=<replaceable>2</replaceable> http://mirror.example.com/pub/mirror/fedora/linux/core/&FCVER;/</userinput>
</screen>
<para>
Note the options used above:
</para>
<itemizedlist>
<listitem>
<para>
<command>--mirror</command> turns on recursion (descends
into all subdirectories), and duplicates file timestamps;
</para>
</listitem>
<listitem>
<para>
<command>-np</command> prevents <command>wget</command>
from ascending into the parent directory;
</para>
</listitem>
<listitem>
<para>
<command>-nH</command> prevents <command>wget</command>
from writing a directory named after the host (in this
case,
<filename><replaceable>mirror.example.com</replaceable></filename>);
</para>
</listitem>
<listitem>
<para>
<command>--cut-dirs=<replaceable>n</replaceable></command>
truncates the first <replaceable>n</replaceable>
directories in the path. In the example above,
<command>--cut-dirs=2</command> prevents
<command>wget</command> from writing the
<filename><replaceable>/pub/mirror</replaceable></filename>
portion of the path into your mirror.
</para>
</listitem>
</itemizedlist>
<para>
The same syntax works for both HTTP and FTP upstream
mirrors. It is possible that you may download some extraneous
files if the HTTP site formats its pages for browser
viewing. These files can be safely deleted, but return each
time the mirror updates unless you exclude them using special
options. See the <command>wget</command> man pages for more
information.
</para>
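<para>
One way to exclude such files is the <command>-R</command>
(<command>--reject</command>) option of <command>wget</command>, which
does not keep files whose names match a pattern. The sketch below
assumes that the extraneous files are generated index pages, and that
the upstream server names them <filename>index.html</filename>,
possibly with a query string appended. Check what your upstream mirror
actually produces before relying on this pattern:
</para>
<screen>
<userinput>cd /var/www/mirror
wget --mirror -np -nH --cut-dirs=<replaceable>2</replaceable> -R "index.html*" http://mirror.example.com/pub/mirror/fedora/linux/core/&FCVER;/</userinput>
</screen>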
<para>
The <command>lftp</command> command works like the
<command>wget</command> command, and mirrors the content of an
HTTP or FTP server. Unlike <command>wget</command>, however,
<command>lftp</command> can delete local files that no longer
exist on the upstream mirror. This feature is
important for update repository mirrors to stay synchronized
to upstream mirrors. New files are created and old files are
automatically removed from the upstream mirrors on a frequent
basis.
</para>
<para>
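The <command>lftp</command> command synchronizes files and
directories from a remote host like <command>rsync</command>,
but uses HTTP or FTP protocols. Use the following command to
mirror the i386 branch of the &FC; &FCVER; distribution with
<command>lftp</command>. The <command>mirror</command> command
copies the remote directory into the current local directory, so
change to the matching local folder first:
</para>
<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/&FCVER;/i386 &amp;&amp; \
lftp -c "open http://mirror.example.com/pub/mirror/fedora/linux/core/&FCVER;/i386/ &amp;&amp; \
mirror --delete --verbose"</userinput>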
</screen>
<para>
The <option>-c</option> parameter executes a set of commands
in an <command>lftp</command> process. Commands are separated
with <command>&amp;&amp;</command> to prevent the
<command>lftp</command> command from executing if the
<command>cd</command> command fails. The commands in the
<command>lftp</command> command set work the same way. The
command syntax <command>A &amp;&amp; B</command> is
shorthand for "if A returns success, run B." An explanation
of the <command>lftp</command> commands follows:
</para>
<itemizedlist>
<listitem>
<para>
<command>open</command> connects to the site and changes
directory automatically.
</para>
</listitem>
<listitem>
<para>
<command>mirror</command> fetches all files and
directories recursively in the current directory. The
<command>--delete</command> option deletes all local
files that are not in the remote directory. The
<command>--verbose</command> option prints progress
information on the screen and is optional.
</para>
</listitem>
</itemizedlist>
<para>
The <command>lftp</command> command above maintains an exact
copy of the directory for you. It downloads only new or
changed files, and deletes only those that no longer exist on
the upstream mirror.
</para>
<para>
As with <command>wget</command>, it is possible you may
download some unwanted files. The <command>lftp</command>
command supports regular expressions for excluding files
within a <command>mirror</command> command. The command below
shows how to mirror a current &FC; distribution updates
repository, excluding <filename>debug</filename> and
<filename>repodata</filename> directories:
</para>
<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/updates/&FCVER;/i386 &amp;&amp; \
lftp -c "set mirror:exclude-regex 'debug\/|repodata\/' &amp;&amp; \
open http://mirror.example.com/pub/mirror/fedora/linux/core/updates/&FCVER;/i386/ &amp;&amp; \
mirror --delete --verbose"</userinput>
</screen>
<para>Consult the <command>lftp</command> man pages for more
details and usage options.</para>
<tip>
<title>Using a proxy for HTTP or FTP retrieval</title>
<para>
If you are behind a proxy or firewall, you may need to use an
HTTP proxy to mirror files. To do this, export the
environment variables <command>http_proxy</command> and
<command>ftp_proxy</command> before you run the
<command>wget</command> or <command>lftp</command> commands:
</para>
<screen>
<userinput>export http_proxy=http://<replaceable>username</replaceable>:<replaceable>password</replaceable>@<replaceable>host</replaceable>:<replaceable>port</replaceable>
export ftp_proxy=http://<replaceable>username</replaceable>:<replaceable>password</replaceable>@<replaceable>host</replaceable>:<replaceable>port</replaceable></userinput>
</screen>
</tip>
</section>
<section id="sn-rsync">
<title>The <command>rsync</command> Command</title>
<para>
Use the <command>rsync</command> command to synchronize a set
of files and/or directories with a remote host. It operates in
much the same way as <command>rcp</command>, but it is usually
faster. One reason for the speed is that
<command>rsync</command> has a special protocol that evaluates
and skips files (or portions of files) that are already
downloaded.
</para>
<para>
Begin by identifying the modules available on the upstream
mirror site you have chosen. Note that the double colon "::"
is always used after the host name to separate it from the
rest of the <command>rsync</command> path. The following
command generates a list of "modules" on the upstream mirror.
</para>
<screen>
<userinput>rsync mirror.example.com::</userinput>
</screen>
<para>
These modules are roughly equivalent to top-level directories,
and they follow the same rules. To list any subdirectory of
the upstream mirror, add the directory path to the command
above. For example, on many mirrors, the
<filename>fedora-linux-core</filename> module is equivalent to
the <filename>fedora/linux/core</filename> path found at the
&FP; main download server. To list the contents of the &FC;
&FCVER; distribution folder on the upstream server, issue the
following command. Do not forget the trailing slash "/".
Without it, you only receive a listing of a folder name that
matches the last component of the remote path.
</para>
<screen>
<userinput>rsync mirror.example.com::fedora-linux-core/&FCVER;/</userinput>
</screen>
</section>
<section id="sn-rsync-download">
<title>Downloading Using <command>rsync</command></title>
<para>
To download via <command>rsync</command>, add a destination
path on your system to the end of the command line, and use the
<command>-r</command> switch to recurse into subdirectories. The
files shown in the listing are downloaded to the local path you
specify. Remember, if you leave off the trailing slash on the
remote path, then the last component of that path is created as
a folder under the destination, and its contents are copied into
that folder.
</para>
<screen>
<userinput>rsync -r filehouse.example.org::files/misc/ /var/www/misc/</userinput>
</screen>
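<para>
To illustrate the trailing slash rule, the following sketch uses the
same hypothetical
<computeroutput>filehouse.example.org</computeroutput> host, but
omits the trailing slash on the remote path. The
<command>-r</command> switch makes <command>rsync</command> recurse
into the directory. In this form, <command>rsync</command> creates
the <filename>misc</filename> folder inside the destination, so the
files arrive in <filename>/var/www/misc/</filename>:
</para>
<screen>
<userinput>rsync -r filehouse.example.org::files/misc /var/www/</userinput>
</screen>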
<para>
When downloading using <command>rsync</command> for mirror purposes,
use some of the command line switches to improve performance and
feedback. The switches <command>-PHav</command> enable the following
<command>rsync</command> features:
</para>
<variablelist>
<varlistentry>
<term>-P</term>
<listitem>
<para>
recover partially-downloaded files, and show a progress
meter
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-H</term>
<listitem>
<para>
preserve hard links
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-a</term>
<listitem>
<para>
recurse all directories, and preserve as much file
information as possible, including timestamps,
ownership, permissions, device files (if you are running
as root), and soft links
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-v</term>
<listitem>
<para>
give verbose feedback to the screen
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Remove the <command>-v</command> switch if you run this mirroring
process as part of a script, or have no need to monitor progress. The
following example mirrors the &FC; &FCVER; distribution from an
upstream site.
</para>
<caution>
<title>Example command downloads many gigabytes of files</title>
<para>
This command downloads many gigabytes of files, and is intended for
use as an example only. Do not run this command if you do not
understand the consequences.
</para>
</caution>
<screen>
<userinput>rsync -PHav mirror.example.com::fedora-linux-core/&FCVER;/ /var/www/mirror/fedora/linux/core/&FCVER;</userinput>
</screen>
<para id="rsync-n-switch">
The <command>-n</command> switch performs a "dry run" using
the other given parameters. Use this switch to test any
<command>rsync</command> command if you are unsure what files
you will receive. See also <xref
linkend="rsync-possible-data-loss"/>.
</para>
<para>
The <command>-z</command> switch enables compression during the
<command>rsync</command> process. The server compresses data before
transmission, and the client decompresses the data before writing it
to disk.
</para>
<tip>
<title>Compression using <command>rsync</command></title>
<para>
The vast majority of the &FC; distribution consists of RPM files,
which are already compressed data. Therefore, additional compression
does not save time, and instead induces an unnecessary load on the
upstream mirror CPU. As a courtesy, do not use the
<command>-z</command> switch for this purpose.
</para>
</tip>
<para>
The following paragraphs describe some additional switches that can be used to
automatically trim branches from the tree of downloaded folders. With
proper usage, they result in a mirror that is exactly as organized and
full-featured as any high-volume public upstream site.
</para>
<warning id="rsync-possible-data-loss">
<title>Possible data loss</title>
<para>
If you are not exceedingly careful in using these switches, it is
possible to delete large portions of your mirrored data. Fixing this
problem might require performing the copying steps outlined
in <xref linkend="sn-copying-original-distribution"/> above. On the
other hand, if you are also careless about your destination path,
and you are running as root, you could put your entire system at
risk. Know your environment before using these switches:
</para>
<itemizedlist>
<listitem>
<para>
What is your current working directory? Use
<command>pwd</command> to find out, if you are unsure.
</para>
</listitem>
<listitem>
<para>
Are you logged in as root? If you are using SELinux extensions,
what is your current security context?
</para>
</listitem>
<listitem>
<para>
Have you tested this command using the <command>-n</command>
switch (see <xref linkend="rsync-n-switch"/>)?
</para>
</listitem>
</itemizedlist>
</warning>
<para>
Use the <command>--exclude</command> switch, along with a simple
pattern, to disallow download of certain files and/or folders. For
instance, <command>--exclude "*.iso"</command> excludes the download
of any file whose name ends with the string ".iso".
</para>
<para>
Use the <command>--delete</command> switch to remove any file from
the local system which does not have a match on the upstream mirror.
This switch takes no pattern of its own. It prevents unwanted
<firstterm>file debris</firstterm> from cropping up in your mirror. To
retroactively trim branches of the tree which you no longer wish to
maintain or download, exclude them and add the
<command>--delete-excluded</command> switch, which also removes local
files that match your exclude patterns.
<para>
Wildcards are permitted with <command>rsync</command> commands,
including the asterisk <computeroutput>*</computeroutput>, question
mark <computeroutput>?</computeroutput>, and brackets
<computeroutput>[ ]</computeroutput>. The question mark and brackets
work as in the shell; the former matches any single character, while
the brackets define a set of characters to be matched. Asterisks are
especially powerful when combined with a portion of a file name. The
double asterisk <computeroutput>**</computeroutput> pattern matches
any sequence of characters, <emphasis>including slashes</emphasis>; a single
asterisk <computeroutput>*</computeroutput> matches any sequence of
characters, but stops at a slash. Therefore, be judicious about using either. The
double asterisk is very useful for mirroring a tree that includes
multiple instances of directories and files that contain a pattern. A
good example is mirroring several versions of &FC;, where certain
folder names appear in every version.
</para>
<tip>
<title>Pattern matching wildcards</title>
<para>
Use double asterisks to trim out directories that repeat throughout
a mirrored tree. For example, when mirroring for a site that
only uses i386 architecture machines, you may trim all files and
folders marked for x86_64 architecture, using the switch
<command>--exclude "**x86_64**"</command>. This matches not only
folders marked <filename>x86_64</filename>, but also files such as
ISO images for x86_64, which are indicated by file names such as
<filename>FC&FCVER;-x86_64-disc1.iso</filename>.
</para>
</tip>
<para>
Process a long list of exclusions with the
<command>--exclude-from</command> option. Follow the option with the
name of a file that contains a list of patterns, one per line.
<para>
These syntax hints only scratch the surface of
<command>rsync</command>, but suffice to make your first mirror. Once
you have selected your site and formulated your excludes and deletes,
run your <command>rsync</command> command with the
<command>-n</command> option. Redirect output to a file so you can
examine the resulting list of files in the editor or pager of your
choice.
</para>
<para>
The following example mirrors the entire &FC; &FCVER; distribution,
with <command>--exclude</command> options that avoid downloading:
</para>
<itemizedlist>
<listitem>
<para>
Any information for x86_64 architecture;
</para>
</listitem>
<listitem>
<para>
Any <command>yum</command> headers (see <xref
linkend="sn-repositories"/>);
</para>
</listitem>
<listitem>
<para>
Any <filename>debuginfo</filename> packages; and,
</para>
</listitem>
<listitem>
<para>
CD or DVD images.
</para>
</listitem>
</itemizedlist>
<para>
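The <command>-n</command> switch is included for testing purposes.
Backslashes at the ends of lines indicate this example is a single
command line. The patterns for <command>yum</command> headers and ISO
images are deliberately narrow: a broad pattern such as
<command>"**iso**"</command> also matches the
<filename>isolinux</filename> folder.
</para>
<screen>
<userinput>rsync -Pan --delete --exclude "**x86_64**" --exclude "headers/" \
--exclude "**debug**" --exclude "*.iso" \
mirror.example.com::fedora-linux-core/&FCVER;/ \
/var/www/mirror/fedora/linux/core/&FCVER;</userinput>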
</screen>
</section>
</section>
<section id="sn-maintenance">
<title>Maintaining Your Mirror</title>
<para>
&FED; mirrors are even more useful when they are more than just a
snapshot of the distribution at release time. Most mirror administrators
also choose to carry updates and errata packages. Repositories of
updates or development trees change daily, and your mirror should
reflect these changes.
</para>
<important>
<title><command>rsync</command> etiquette</title>
<para>
If you plan to do regular updates of your mirror that include large
amounts of data, you should ask permission from the administrator of
the upstream mirror. Downloading nightly package updates for the
official releases of &FC; &FCVER; should not require notification, as
they are rarely more than a few megabytes. However, the
<filename>development</filename> tree routinely turns over several
hundred megabytes nightly. Take these factors into consideration
before putting any maintenance scripts into effect.
</para>
</important>
<para>
Once your <command>rsync</command> command is working as desired, you may
want to place it in a nightly <command>cron</command> script. The
<command>cron</command> system allows you to schedule
regularly-occurring jobs on your system. The intervals are highly
configurable, but a nightly run keeps your mirror synchronized with
updates and errata. Make sure your nightly <command>cron</command> job
follows some simple guidelines:
</para>
<itemizedlist>
<listitem>
<para>
If your upstream mirror only synchronizes once or twice daily, run
your job <emphasis>after</emphasis> the upstream mirror completes
its update. This ensures your mirror not only gets the freshest
material, but also does not interfere with the upstream server's
bandwidth while it runs its job. If you do not know this time, it is
usually safe to plan your downloads for pre-dawn hours.
</para>
</listitem>
<listitem>
<para>
Be sure you have sufficient disk space for additional packages. The
<filename>updates</filename> tree in particular grows over time as
more errata packages are released.
</para>
</listitem>
<listitem>
<para>
Always test your script thoroughly before allowing it to run
automatically. Use the <command>-n</command> (dry run) or
<command>-v</command> (verbose) switch in the <command>rsync</command>
command line for testing, and then remove it once you have completed
testing. Remember that the
results are e-mailed to your account on your system unless you
specify differently. Read the <command>crontab(5)</command> man
pages for additional information, with the command <command>man 5
crontab</command>.
</para>
</listitem>
</itemizedlist>
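<para>
As a minimal sketch of these guidelines, a nightly entry in the
<filename>crontab</filename> of the account that maintains the mirror
might resemble the following. The upstream host name
mirror.example.com, the module path, the 4:30 a.m. schedule, and the
mail address are assumptions for illustration only; substitute the
values from the <command>rsync</command> command you have already
tested. Note that <command>--delete</command> removes local files that
no longer exist upstream, so add <command>-n</command> for a dry run
while you are still testing.
</para>
<screen>
<computeroutput># Mail the job's output to this (hypothetical) address
MAILTO=mirror-admin@example.com
# minute hour day-of-month month day-of-week command
30 4 * * * rsync -a --delete rsync://mirror.example.com/fedora/linux/core/updates/&FCVER;/ /var/www/mirror/fedora/linux/core/updates/&FCVER;/</computeroutput>
</screen>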
</section>
</section>
<section id="sn-server-config">
<title>Server Configuration</title>
<para>
This section describes how to set up an HTTP (Web) server to
support &FED; installation and software management applications.
</para>
<section id="sn-installing-apache">
<title>Installing The Apache Web Server</title>
<para>
&FC; provides the Apache server in the
<filename>httpd</filename> package. The
<filename>httpd</filename> package is included on &FED; systems
installed with the <guilabel>Server</guilabel> installation
type. You may have installed it later in order to run websites
or Web applications. &FEX; also offers alternative HTTP servers,
which are beyond the scope of this document.
</para>
<para>
To install the <filename>httpd</filename> package, if you have
not already done so, use the following command:
</para>
<screen>
<userinput>su -c 'yum install httpd'</userinput>
</screen>
<para>
Enter the password for the
<systemitem class="username">root</systemitem> account when
prompted.
</para>
<para>
To start the service, use the following command:
</para>
<screen>
<userinput>su -c '/sbin/service httpd start'</userinput>
</screen>
<para>
Enter the password for the
<systemitem class="username">root</systemitem> account when
prompted.
</para>
<para>
To enable this service to load automatically at boot time, use
the following command:
</para>
<screen>
<userinput>su -c '/sbin/chkconfig --level 345 httpd on'</userinput>
</screen>
<para>
Enter the password for the
<systemitem class="username">root</systemitem> account when
prompted.
</para>
<para>
The default firewall configuration for &FED; blocks access from
remote systems. To enable other systems to connect to your HTTP
service, use the
<application>system-config-securitylevel</application> utility:
</para>
<procedure>
<step>
<para>
Choose <menuchoice> <guimenu>Desktop</guimenu>
<guisubmenu>System Settings</guisubmenu>
<guimenuitem>Security Level</guimenuitem> </menuchoice>.
</para>
</step>
<step>
<para>
Enter the password for the
<systemitem class="username">root</systemitem> account when
prompted.
</para>
</step>
<step>
<para>
Select <guilabel>WWW (HTTP)</guilabel> from the list of
services.
</para>
</step>
<step>
<para>
When prompted, select <guilabel>Yes</guilabel> to update the
firewall configuration.
</para>
</step>
</procedure>
</section>
<section id="sn-configuring-apache">
<title>Configuring The Apache Web Server</title>
<para>
To enable HTTP access to the files in your mirror directory,
create the configuration file
<filename>/etc/httpd/conf.d/mirror.conf</filename>. The
following listing is an example:
</para>
<example>
<title>Apache 2.x configuration file for &FED; mirror</title>
<screen>
<computeroutput><![CDATA[# The name at which the mirror will be shared,
# followed by the name of the root directory of that tree.
Alias /mirror /var/www/mirror
# Share options for the mirror.
# Only allow connections from localhost and
# IP addresses which start with 192.168.1
<Directory /var/www/mirror>
AllowOverride None
Order Deny,Allow
Deny from all
Allow from 127.0.0.1 192.168.1
Options Indexes
</Directory>]]></computeroutput>
</screen>
</example>
<para>
You must use root privileges to create or copy files in the
directory <filename>/etc/httpd/conf.d/</filename>.
</para>
<para>
To update an active <command>httpd</command> service with a new
configuration, use the following command:
</para>
<screen>
<userinput>su -c '/sbin/service httpd reload'</userinput>
</screen>
<para>
Enter the password for the
<systemitem class="username">root</systemitem> account when
prompted.
</para>
<para>
Your clients may now visit any area of your mirror by using the
URL
http://<replaceable>server.mydomain.org</replaceable>/mirror/<replaceable>path</replaceable>.
</para>
<note>
<title>Apache and &SEL;</title>
<para>
The default &SEL; configuration for &FED; permits Apache to
use files in the <filename>/var/www/</filename> directory. If
you build your mirror in another directory, you may need to
modify the &SEL; policy.
</para>
</note>
</section>
<section id="sn-solving-dependencies">
<title>Solving Dependencies</title>
<para>
Every RPM package has a <indexterm> <primary>RPM</primary>
<secondary>header</secondary>
</indexterm><firstterm>header</firstterm> that contains all
the vital information about that package. This information
includes name, version and release, contents, the capabilities
provided by the package, and any prerequisites. These
prerequisites may include
<emphasis>dependencies</emphasis><indexterm>
<primary>RPM</primary>
<secondary>dependencies</secondary>
</indexterm>. A dependency is a requirement for one or more
additional packages.
</para>
<para>
Packages installed without satisfying their dependencies may not
work correctly. Dependencies may create a problem for users who
are trying to install a single package. Manually determining and
resolving dependencies is difficult. &FC; provides the
<command>yum</command> utility for solving these dependencies
automatically, providing an improved user experience.
</para>
<para>
The Yellow Dog Updater Modified, or
<emphasis>yum</emphasis><indexterm> <primary>yum</primary>
</indexterm>, is a Python-based system for computing and solving
RPM dependencies. A <command>yum</command> client retrieves a
cache of headers from its repository server, as well as a list
of available RPM packages and their exact locations on the
server. It can do this via HTTP or FTP, as well as using
standard file system calls (either local or remote via NFS). The
client computes solutions to any package dependencies using the
downloaded header information, and requests all necessary
RPM packages once it has finished. The <command>yum</command>
command relies on <command>rpm</command> functions to perform
many of the computations involved in the process.
</para>
<para>
A drawback to <command>yum</command> is that the first time it
is run, it must download a header for every package installed on
the system in order to determine available updates. However,
running a local mirror nullifies this drawback. The
<command>yum</command> command can download many megabytes of
headers almost instantly on a standard Ethernet LAN. The
<command>yum</command> utility is the most popular update method
for &FC;.
</para>
<para>
For more information about using <command>yum</command>, refer
to <ulink url="http://fedora.redhat.com/docs/yum/"/>.
</para>
</section>
<section id="sn-repositories">
<title>Configuring Repositories</title>
<para>
A <command>yum</command>
<emphasis>repository</emphasis><indexterm>
<primary>repository</primary>
</indexterm> is a collection of packages on a server which
supports <command>yum</command> clients. A single repository can serve
both older and current <command>yum</command> clients if desired.
</para>
<para>
To set up a <command>yum</command> repository, you must create a
directory that contains information which the clients require to
resolve RPM dependencies. The directory's name depends on the
version of <command>yum</command> it supports. It is permissible
to have both kinds of repository information in a single
repository.
</para>
<para>
To support older <command>yum</command> clients, use the
<command>yum-arch</command> command. To support current
<command>yum</command> clients, use the
<command>createrepo</command> command.
</para>
<important>
<title>Supporting &FC; 3 and beyond</title>
<para>
&FC; 3 ships with a newer version of <command>yum</command>.
To support &FC; 3 <command>yum</command> clients, you
<emphasis>must</emphasis> use <command>createrepo</command> on
your server's repositories.
</para>
</important>
<section id="sn-yum-arch">
<title><command>yum-arch</command></title>
<para>
The <command>yum-arch</command> command creates a directory
named <filename>headers/</filename> which supports older
versions of <command>yum</command> (before 2.2). The
<command>yum-arch</command> program searches recursively
through a target directory and any subdirectories for RPM
packages, and includes them in the header data. The
<command>yum-arch</command> command always creates the
<filename>headers/</filename> directory in the current working
directory. Therefore you should change your working directory
to the directory where you want <filename>headers/</filename>
to appear.
</para>
<screen>
<userinput>cd /var/www/mirror/fedora/linux/core/&FCVER;/i386/os
su -c 'yum-arch -ls .'</userinput>
</screen>
<para>
Enter the root password at the prompt. The
<command>-l</command> switch follows symbolic links. The
<command>-s</command> switch includes SRPMS (source RPM
packages) in the header list. The command above creates the
<command>yum</command> header cache in the directory
<filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/headers/</filename>.
</para>
</section>
<section id="sn-createrepo">
<title><command>createrepo</command></title>
<para>
The <command>createrepo</command> command creates repository
information to support newer versions of
<command>yum</command> (and possibly other repository client
programs). The <command>createrepo</command> command stores
this data in a directory named <filename>repodata</filename>.
Run <command>createrepo</command> against the directory
<emphasis>under which</emphasis> you want the
<filename>repodata</filename> directory to appear. The
<command>createrepo</command> program also searches
recursively for RPM packages to include in the repository
data.
</para>
<para>
The following command creates the repository data in the
directory
<filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/repodata</filename>.
</para>
<screen>
<userinput>su -c 'createrepo /var/www/mirror/fedora/linux/core/&FCVER;/i386/os'</userinput>
</screen>
<para>
To create repository data for package groups in addition to
the package files, use the <command>createrepo -g</command>
command. The <option>-g</option> option requires a parameter
which points to the group file, <emphasis>relative</emphasis>
to the given location of the package data. The following
command creates the package group data corresponding to the
repository directly above. Note the relative location of the
group file
<filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/Fedora/base/comps.xml</filename>.
</para>
<screen>
<userinput>su -c 'createrepo -g Fedora/base/comps.xml /var/www/mirror/fedora/linux/core/&FCVER;/i386/os'</userinput>
</screen>
<para>
You may have certain clients who update their version of
<command>yum</command> in a non-prescribed way. To minimize
problems for your clients, create both kinds of repository
data for any repositories. The extra repository information
is relatively small and will not affect your mirror's proper
function.
</para>
</section>
<section id="sn-repository-locations">
<title>Repository Locations</title>
<para>
Typically you will run <command>yum-arch</command> or
<command>createrepo</command> against at least the following
locations:
</para>
<itemizedlist>
<listitem>
<para>
The stock distribution; for example,
<filename>/var/www/mirror/fedora/linux/core/&FCVER;/i386/os/</filename>.
For <command>yum-arch</command>, use the
<command>-l</command> and <command>-s</command> options to
follow the linked directory <filename>SRPMS</filename> and
include the source packages therein.
</para>
</listitem>
<listitem>
<para>
Official updates to the distribution; for example,
<filename>/var/www/mirror/fedora/linux/core/updates/&FCVER;/</filename>.
Once again, for <command>yum-arch</command> use
<command>-l</command> and/or <command>-s</command> if
appropriate.
</para>
</listitem>
</itemizedlist>
</section>
</section>
</section>
<section id="sn-client-config">
<title>Client Configuration</title>
<para>
Client systems that use <command>yum</command> to contact your
mirror also require configuration. The <command>yum</command>
repository configuration files are located in
<filename>/etc/yum.repos.d</filename> and end with the suffix
<filename>.repo</filename>. Below is an example configuration
file.
</para>
<example>
<title>Example
<filename>/etc/yum.repos.d/fedora-mirror.repo</filename></title>
<screen>
<computeroutput>[mirror]
name=Fedora Core $releasever - $basearch - Base
baseurl=http://server.mydomain.net/mirror/fedora/linux/core/$releasever/$basearch/os/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora</computeroutput>
</screen>
</example>
<para>
Client systems should use a repository configuration file for each
&FED; branch your mirror provides. The base distribution and
released updates, for example, each require a separate
configuration file.
</para>
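<para>
For instance, a second file for released updates could look like the
sketch below. The file name, the section name
<literal>mirror-updates</literal>, and the server name are illustrative
assumptions. The <literal>baseurl</literal> must point at the directory
that contains the repository data; this sketch assumes you generated
that data directly in the
<filename>/var/www/mirror/fedora/linux/core/updates/&FCVER;/</filename>
directory, so adjust the path if your repository data lives in a
subdirectory.
</para>
<example>
<title>Example
<filename>/etc/yum.repos.d/fedora-mirror-updates.repo</filename></title>
<screen>
<computeroutput>[mirror-updates]
name=Fedora Core $releasever - Released Updates
baseurl=http://server.mydomain.net/mirror/fedora/linux/core/updates/$releasever/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora</computeroutput>
</screen>
</example>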
<para>
If you want clients to use your mirror in place of the official
repositories, disable the existing repositories. To do this, edit
the client's file for the official repository to include the
directive <userinput>enabled=0</userinput>. You will need
<systemitem class="username">root</systemitem> access to edit
these files.
</para>
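<para>
As an illustration, the edited section of an official repository file
might end up as shown below. The section name <literal>[base]</literal>
is an assumption; file and section names differ between releases, so
change only the <literal>enabled</literal> line in whatever file your
clients actually have and leave the other directives as they are.
</para>
<screen>
<computeroutput>[base]
name=Fedora Core $releasever - $basearch - Base
# Other directives in this section stay unchanged
enabled=0</computeroutput>
</screen>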
<para>
Many repositories provide their own installable RPM packages
containing these configuration files. When a user installs the
RPM, the new files in <filename>/etc/yum.repos.d/</filename>
reference that repository. These packages simplify the addition
of new repositories for end users. Whether you use such a package
yourself will depend on the number and skill set of clients your
repository serves.
</para>
</section>
<!--
FIXME:
The following section is out of scope for now. When more documents are
available, this would make a great "see also" section.
<section id="sn-advanced-topics">
<title>Advanced Topics</title>
<para>
No outline here yet. Suggestions: distributing via kickstart (xref
kickstart tutorial?); rolling custom RPMs, starting with up2date; rolling
custom distro (xref RedHat-CD-HOWTO?)....
</para>
</section>
-->
<index id="generated-index">
</index>
</article>
<!--
Local variables:
mode: xml
fill-column: 72
End:
-->