release-notes/F-8/xmlbeats .cvsignore, NONE, 1.1 README, NONE, 1.1 beatconvert, NONE, 1.1 beatlist, NONE, 1.1 files-map.txt, NONE, 1.1 steps-to-convert-FC7.txt, NONE, 1.1 to-do-fc5-errata-notes.txt, NONE, 1.1 to-do-fc5-gold-notes.txt, NONE, 1.1 wikixml2fdpxml, NONE, 1.1 xmlbeats, NONE, 1.1 xmlfix, NONE, 1.1

Paul W. Frields (pfrields) fedora-docs-commits at redhat.com
Tue Oct 23 01:43:28 UTC 2007


Author: pfrields

Update of /cvs/docs/release-notes/F-8/xmlbeats
In directory cvs-int.fedora.redhat.com:/tmp/cvs-serv12648/F-8/xmlbeats

Added Files:
	.cvsignore README beatconvert beatlist files-map.txt 
	steps-to-convert-FC7.txt to-do-fc5-errata-notes.txt 
	to-do-fc5-gold-notes.txt wikixml2fdpxml xmlbeats xmlfix 
Log Message:
Now that we have a package, make an F-8 branch for release-notes


--- NEW FILE .cvsignore ---
Beats
Fixed


--- NEW FILE README ---
Edit 'beatlist' to specify which wiki file names to convert to XML, then run './xmlbeats'.


--- NEW FILE beatconvert ---
#!/usr/bin/python -tt
#
# (C) 2007 Paul W. Frields.
# This file is licensed under the GNU General Public License (GPL) v2.

import os, sys
import urllib
import re
from time import sleep
from optparse import OptionParser

"""Take a list of pages from a Moin wiki and convert them to DocBook."""

parser = OptionParser()
parser.add_option("-d", "--output-directory", dest="outdir",
                  default=os.getcwd(),
                  help="Output files to directory DIR", metavar="DIR")
parser.add_option("-v", "--verbose", dest="verbose", action="store_true",
                  help="Use verbose logging", default=False)
parser.add_option("-i", "--input-file", dest="infname", default="",
                  help="Take list of pages from file FILE", metavar="FILE")
parser.add_option("-u", "--url", dest="urlbase",
                  default="http://fedoraproject.org/wiki/Docs/Beats/",
                  help="Use URL as base for pages to fetch", metavar="URL")
# type="float" so a value given on the command line can be passed to sleep()
parser.add_option("-p", "--pause", dest="pausetime", type="float",
                  default=0.5,
                  help="Wait SEC seconds between fetches", metavar="SEC")
parser.add_option("-s", "--slashes", dest="ignoreSlashes",
                  action="store_false", default=True,
                  help="Don't ignore page names with embedded slashes")
(opts, args) = parser.parse_args()

try:
    infile = open(opts.infname, "r")
except IOError:
    print "Can't open input file", opts.infname
    sys.exit(-2)

if not os.access(opts.outdir, os.W_OK):
    print "Can't write to output directory", opts.outdir
    sys.exit(-3)

pageList = [page.rstrip('\n') for page in infile.readlines()]
infile.close()
if opts.verbose: print "Read", len(pageList), "pages"

for page in pageList:
    # Skip sub-pages (names containing "/") unless -s was given
    if opts.ignoreSlashes and "/" in page:
        continue
    sleep(opts.pausetime)
    if opts.verbose: print "Retrieving", page
    # Ask the wiki to render the page as DocBook
    data = urllib.urlopen(opts.urlbase + page +
                          "?action=RenderAsDocbook").read()
    # Output file name is the page name with the slashes removed
    outXml = open(os.path.join(opts.outdir, page.replace("/", "")), "w")
    outXml.write(data)
    outXml.close()

if opts.verbose: print "Finished"
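
The mapping from a page name to its fetch URL and output file can be sketched without touching the network. This is a minimal Python 3 illustration, not part of the commit; `plan_fetch` is a hypothetical helper, and the default URL is the one given for beatconvert's `--url` option.

```python
import os

# Default base URL taken from beatconvert's --url option.
URL_BASE = "http://fedoraproject.org/wiki/Docs/Beats/"


def plan_fetch(page, outdir, urlbase=URL_BASE, ignore_slashes=True):
    """Return (url, output_path) for a page, or None if the page is skipped.

    Hypothetical helper: it mirrors the loop body of beatconvert but only
    computes names; it does not fetch anything.
    """
    if ignore_slashes and "/" in page:
        return None  # same effect as the script's 'continue'
    url = urlbase + page + "?action=RenderAsDocbook"
    # Slashes are dropped from the file name: Devel/Tools/GCC -> DevelToolsGCC
    return url, os.path.join(outdir, page.replace("/", ""))
```

With the default settings, entries in beatlist such as `ArchSpecific/PPC` return `None`, which matches the script skipping sub-pages unless `-s` is given.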


--- NEW FILE beatlist ---
Welcome
OverView
Feedback
Installer
ArchSpecific
ArchSpecific/PPC
ArchSpecific/x86
ArchSpecific/x86_64
Live
PackageNotes
Kernel
Desktop
FileSystems
WebServers
MailServers
Devel
Devel/Runtime
Devel/Tools
Devel/Tools/GCC
Devel/Tools/Eclipse
Security
Security/SELinux
Java
Multimedia
Entertainment
Virtualization
Xorg
DatabaseServers
I18n
BackwardsCompatibility
PackageChanges
Legacy
ProjectOverview
Colophon


--- NEW FILE files-map.txt ---
# map of how XML files in the release-notes module interact

RELEASE-NOTES-*.xml
    fdp-info-*.xml
        ../../docs-common/common/legalnotice-relnotes-*.xml
    Welcome-*.xml
    OverView-*.xml
    ../../docs-common/common/legalnotice-*.xml
    Feedback-*.xml
    Introduction-*.xml
    Installer-*.xml
    ArchSpecific-*.xml
        ArchSpecificPPC-*.xml
        ArchSpecificx86-*.xml
        ArchSpecificx86_64-*.xml
    Networking-*.xml
    PackageNotes-*.xml
        ServerTools-*.xml
        PackageNotesJava-*.xml
        Kernel-*.xml
        Security-*.xml
            SecuritySELinux-*.xml
        DevelopmentTools-*.xml
            DevelopmentToolsJava-*.xml
            DevelopmentToolsGCC-*.xml
        I18n-*.xml
        Printing-*.xml
        DatabaseServers-*.xml
        Multimedia-*.xml
        WebServers-*.xml
        Samba-*.xml
        Xorg-*.xml
        Entertainment-*.xml
    Legacy-*.xml
    PackageChanges-*.xml
    ProjectOverview-*.xml
    Colophon-*.xml

# Unused, but maybe we should use:

BackwardsCompatibility-*.xml
Desktop-*.xml
FileSystems-*.xml
FileServers-*.xml
SystemDaemons-*.xml
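
The indentation in the map above encodes which file includes which. As a rough sketch, assuming four spaces per level as in files-map.txt, the map could be read into a parent lookup like this. The function names are hypothetical and this is Python 3, not part of the commit.

```python
def parse_map(text, indent=4):
    """Return a list of (depth, name) pairs from an indentation-based map."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        depth = (len(line) - len(line.lstrip(" "))) // indent
        entries.append((depth, stripped))
    return entries


def parents(entries):
    """Map each file pattern to the pattern that includes it (None at top)."""
    stack, result = [], {}
    for depth, name in entries:
        del stack[depth:]  # drop siblings and deeper levels
        result[name] = stack[-1] if stack else None
        stack.append(name)
    return result
```

A pattern that appears more than once in the map keeps only its last parent in this sketch.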






--- NEW FILE steps-to-convert-FC7.txt ---
Prior to converting to XML, make sure each wiki page:

* Is edited for clarity, grammar, and technical details
* Follows the wiki markup conventions for conversion to XML
* Is formatted to follow the Docs/Beats/Template

KGW, 2007-04-03

0. Make a holding place for beats:

   mkdir Beats

1. Use beatconvert to pull the beats from the main wiki to the test
   wiki and convert to XML.

   ./beatconvert --verbose -i beatlist -d Beats

   The results are in the "Beats/" folder.

2. Run xmlfix to get the section IDs and titles fixed.

   ./xmlfix Beats

3. Edit, copy to the relnotes devel/en_US folder, and commit.  Don't
   forget to "make pot" also!


PWF, 2007-04-02


--- NEW FILE to-do-fc5-errata-notes.txt ---

1. Update parent XML to call all beats in a flat namespace to match
   the wiki Docs/Beats page.  DONE

2. Add top-level sn-BeatName ID attributes for each file. 

3. Fix all admonition tables
   - fix table, or
   - make a proper admonition  

4. Fix missing version number:  

   http://fedoraproject.org/wiki/Docs/Beats?action=fullsearch&context=180&value=GetVal&fullsearch=Text#preview

   grep "Core  " *xml
   grep "Core ." *xml

5. Search all <screen> tags and fix the line breaks; may require
   injection of fresh content  
   - look for solo-list elements surrounding <screen>
     grep -B2 "<screen>" *.xml | grep listitem  

6. Look for unnecessary linebreaks around <code/>; it is being
   treated as a block.  Is this from xmlformat or the wiki output?

7. Watch for over sub-sectioning
   - have to build to notice?

8. When done, grep all XML files for:
   grep "code> ," *xml
   grep "code> ." *xml
   grep "Core  " *xml
   grep "Core ." *xml
   grep "Core ," *xml
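
The final grep pass in item 8 could also be scripted. The following Python 3 sketch is an illustration only; it assumes the "." in the grep patterns is meant as a literal period rather than a regex wildcard, and `find_suspects` and `scan` are hypothetical names.

```python
import glob

# Literal strings from the checklist. Assumption: "." is a literal period.
SUSPECT = ["code> ,", "code> .", "Core  ", "Core .", "Core ,"]


def find_suspects(text):
    """Return (line_number, pattern) pairs for each suspect string found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern in SUSPECT:
            if pattern in line:
                hits.append((number, pattern))
    return hits


def scan(pattern="*.xml"):
    """Print hits for every file matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            for number, found in find_suspects(handle.read()):
                print("%s:%d: %r" % (path, number, found))
```

Calling `scan()` in the directory that holds the XML files lists each file, line number, and matched string.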

## non-essential

9. Figure out how to have a @@RELNAME@@ variable.

10. Add in the release name?


?. Add call to every file to ../locale-entities.xml - scriptable NOT
   NEEDED 

X. Update .pot file? AUTOMATIC
 


## to-do -- Clean-up for the Wiki

1. Change all titles to not follow format of Docs/Beats/BeatName

2. Flatten the sub-sections a bit, where needed, avoiding orphaned
   sections



--- NEW FILE to-do-fc5-gold-notes.txt ---

1. Update parent XML to call all beats in a flat namespace to match
   the wiki Docs/Beats page.  DONE

2. Add top-level sn-BeatName ID attributes for each file. DONE

3. Fix all admonition tables
   - fix table, or
   - make a proper admonition  DONE

4. Fix missing version number:  DONE

   http://fedoraproject.org/wiki/Docs/Beats?action=fullsearch&context=180&value=GetVal&fullsearch=Text#preview

   grep "Core  " *xml
   grep "Core ." *xml

5. Search all <screen> tags and fix the line breaks; may require
   injection of fresh content  DONE
   - look for solo-list elements surrounding <screen>
     grep -B2 "<screen>" *.xml | grep listitem  

6. Watch for over sub-sectioning
   - have to build to notice?

7. Figure out how to have a @@RELNAME@@ variable.

8. Add in the release name.


?. Add call to every file to ../locale-entities.xml - scriptable NOT
   NEEDED 

X. Update .pot file? AUTOMATIC
 


Clean-up for the Wiki

1. Change all titles to not follow format of Docs/Beats/BeatName

2. Flatten the sub-sections a bit, where needed, avoiding orphaned
   sections



--- NEW FILE wikixml2fdpxml ---
#!/bin/bash
#
# This file can be completely replaced with a better tool written in 
# $LANGUAGE of someone's choice
#
# Original shell script - 29-Jan-2005
# kwade at redhat.com

# Manually rename some files to include their wiki namespace
#echo "Renaming Wiki files."
#mv Beats/PPC.xml Beats/ArchSpecificPPC.xml
#mv Beats/x86_64.xml Beats/ArchSpecificx86_64.xml
#mv Beats/x86.xml Beats/ArchSpecificx86.xml
#mv Beats/GCC.xml Beats/DevelToolsGCC.xml
#mv Beats/SELinux.xml Beats/SecuritySELinux.xml
#echo "Finished renaming files."

# Fix the DocType header from bad Wiki output
#ls Beats/ > xmlfiles
#for i in `cat xmlfiles`;
#do
#  sed s'/DocBook V4\.4/DocBook XML V4\.4/g' Beats/$i > tmpfile;
#  mv tmpfile Beats/$i;
#  echo "DOCTYPE header fixed for" $i
#done
#rm xmlfiles
#echo "Done"


# Add the base language extension to the files
#ls Beats/ > xmlfiles
#for i in `cat xmlfiles`;
#  do
#  echo $i | sed 's/.xml/-en.xml/g' > newfilename;
#  mv Beats/$i Beats/`cat newfilename`;
#done
#rm xmlfiles newfilename
#echo "done"

# Right here is where we want to call perl-fu or python-fu
# to follow this pseudo-code
# 
# for each(<section>); 
#  do
#    get(contents of <title></title>) == $title;
#    replace(" " with "-") == $idattrib;
#    insert($idattrib) -> <section id="$idattrib">;
#  done

# We need to convert the targets of XREFs somehow

# This script uses the FDP implementation of xmldiff
# found in cvs.fedora:/cvs/docs/docs-common/bin/
#
# This script expects to be run in-place in
# the release-notes/xmlbeats module, as the paths
# are relative to that point
#
# $Id:
#
# First version kwade at redhat.com 2006-01-04 -- 0.1

# Variables
#XMLDIFF="../../docs-common/bin/xmldiff"
#XMLDIFF_OPTIONS="-p" # colored unified diff
#BEATPATH="./Beats"
#DBPATH=".."
#FILEEXT="*xml"

# Actions
# Run xmldiff against the beat and canonical XML

#for i in $BEATPATH/$FILEEXT;
#  do $XMLDIFF $XMLDIFF_OPTIONS $i

# Move the XML to the build directory
# mv Beats/*.xml ../

# Fix section names for the top-level
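for i in *.xml; do
  # Section ID is the file name without its extension, prefixed "sn-"
  snID="sn-${i%.xml}"
  echo "Section name ${snID} for ${i}"
  # sed works line by line, so use an address range to reach the first
  # <section> that follows </articleinfo>
  sed "/<\/articleinfo>/,/<section>/ s/<section>/<section id=\"${snID}\">/" "$i" > tmpfile
  mv tmpfile "$i"
  echo "${i} has a new section id"
done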
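
The pseudo-code in the comments of wikixml2fdpxml (take each section's title, replace spaces with "-", and insert the result as the section id) might look like this in Python 3. This is only a sketch: `add_section_ids` is a hypothetical name, the "sn-" prefix is an assumption borrowed from the to-do notes, and ElementTree does not preserve the DOCTYPE or comments, so a real tool would need to handle those separately.

```python
import xml.etree.ElementTree as ET


def add_section_ids(xml_text, prefix="sn-"):
    """Give every <section> an id derived from its <title> text."""
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        title = section.find("title")
        if title is None or not (title.text or "").strip():
            continue  # nothing to derive an id from
        # Replace spaces with "-" as the pseudo-code describes
        section.set("id", prefix + title.text.strip().replace(" ", "-"))
    return ET.tostring(root, encoding="unicode")
```

Titles that contain inline markup or characters not valid in an XML ID would need extra cleaning, which this sketch leaves out.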



--- NEW FILE xmlbeats ---
#!/bin/sh

WIKIURL="http://fedoraproject.org/wiki/"
CONVERTERURL="http://www.n-man.com/moin2docbook.htm"
PAGES="`cat beatlist`"

rm -rf Beats
mkdir -p Beats

for PAGEBASE in $PAGES; do
	PAGENAME="Docs/Beats/${PAGEBASE}"
	PAGEENCODED="`echo "$PAGENAME" | sed 's/\//%2F/g' | sed 's/:/%3A/g'`"
	PAGEOUT="Beats/`echo "${PAGEBASE}" | sed "s/\///g"`.xml"
	printf '"%s" => "%s"...' "${PAGENAME}" "${PAGEOUT}"
	wget -q "${CONVERTERURL}?submit=submit&url=${WIKIURL}${PAGEENCODED}%3Faction=raw" -O "${PAGEOUT}"
	sed -i 's/DocBook V4\.4/DocBook XML V4\.4/g' "${PAGEOUT}"
	# Reformat the fetched page in place
	xmlformat -f ../../docs-common/bin/xmlformat-fdp.conf "${PAGEOUT}" > tmpfile
	mv tmpfile "${PAGEOUT}"
	printf ' done.\n'
done



--- NEW FILE xmlfix ---
#!/bin/bash
#
# Take the output files from the "beatconvert" script and make them
# palatable for updating CVS.
#

print_usage() {
    echo "Usage: $0 [<beatdir>] [<docs-common-dir>]"
    echo "  <beatdir>: folder containing conversions of Beats (default $PWD)"
    echo "  <docs-common-dir>: location of docs-common module"
}

case $1 in
    -h|--help )
	print_usage && exit 0
	;;
    * )
	;;
esac

# Default to the current directory, as the usage text promises
BEATDIR="${1:-$PWD}"

[ $# -lt 2 ] && COMMON="../../../docs-common" || COMMON="$2"
if [ ! -d "$COMMON" ]; then
    echo "$COMMON is not the docs-common you're looking for."
    exit 1
fi

XMLFORMAT="${COMMON}/bin/xmlformat"
XMLFORMATCONF="${COMMON}/bin/xmlformat-fdp.conf"

# Use { ...; } rather than ( ... ) so that exit leaves the script itself
[ -f "$XMLFORMAT" ] || { echo "No xmlformat found"; exit 1; }
[ -f "$XMLFORMATCONF" ] || { echo "No xmlformat-fdp.conf found"; exit 1; }

mkdir -p Fixed/ || { echo "Couldn't make a directory here"; exit 1; }

for i in "$BEATDIR"/*; do
    OUTXML="Fixed/$(basename "$i").xml"
    "$XMLFORMAT" -f "$XMLFORMATCONF" "$i" > "$OUTXML"
    # Add a DOCTYPE after the XML declaration if the input had none
    if ! grep -q DOCTYPE "$i"; then
	sed -i 's@\(<?xml .\+?>\)@\1\n<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.docbook.org/xml/4.4/docbookx.dtd">@' "$OUTXML"
    fi
    sed -i 's@<\(/\)\?article>@<\1section>@' "$OUTXML"
    sed -i 's@<!DOCTYPE article@<!DOCTYPE section@' "$OUTXML"

    # Yes this is hacky. So sue me.
    sed -i "s@^<section@<section id=\"sn-${i//${BEATDIR}\//}\"@" $OUTXML
    NEWTITLE=$(grep '<h2>' $OUTXML | sed 's@.\+<h2>\([^<]\+\).\+@\1@')
    sed -i "s@^  <title>.\+@  <title>${NEWTITLE}</title>@" $OUTXML
    sed -i "s@<\(/\)\?h2>@<\1title>@g" $OUTXML
done
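
For readers who find the sed expressions in xmlfix hard to follow, here is a rough Python 3 equivalent of the DOCTYPE, article-to-section, and id rewrites. It is a sketch under stated assumptions, not the script's actual implementation: `fix_beat` is a hypothetical helper, it tags only the first section that starts a line, and the title handling is left out.

```python
import re

DOCTYPE = ('<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" '
           '"http://www.docbook.org/xml/4.4/docbookx.dtd">')


def fix_beat(xml_text, beat_name):
    """Apply xmlfix-style rewrites to one converted beat."""
    if "DOCTYPE" not in xml_text:
        # Insert the DOCTYPE right after the XML declaration.
        xml_text = re.sub(r"(<\?xml .+?\?>)",
                          lambda m: m.group(1) + "\n" + DOCTYPE,
                          xml_text, count=1)
    # <article> and </article> become <section> and </section>.
    xml_text = re.sub(r"<(/?)article>", r"<\1section>", xml_text)
    xml_text = xml_text.replace("<!DOCTYPE article", "<!DOCTYPE section")
    # Tag the first unindented <section with the sn-BeatName id.
    xml_text = re.sub(r"^<section", '<section id="sn-%s"' % beat_name,
                      xml_text, count=1, flags=re.M)
    return xml_text
```

For example, `fix_beat(text, "Kernel")` turns a top-level `<article>` into `<section id="sn-Kernel">`.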



