From bugzilla at redhat.com  Mon Jan 16 11:52:05 2012
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 16 Jan 2012 06:52:05 -0500
Subject: [publican-list] [Bug 711348] Linking to a bridgehead does not work in html formats
In-Reply-To: 
References: 
Message-ID: <201201161152.q0GBq5rq019784@bzweb01.app.bz.hst.phx2.redhat.com>

Please do not reply directly to this email. All additional
comments should be made in the comments box of this bug.


https://bugzilla.redhat.com/show_bug.cgi?id=711348

Martin Prpic  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |ASSIGNED
         Resolution|NEXTRELEASE                 |
           Keywords|                            |Reopened

--- Comment #3 from Martin Prpic  2012-01-16 06:52:04 EST ---
~]$ rpm -q publican
publican-2.8-1.fc15.noarch

Linking to a bridgehead still doesn't work. Re-opening.

-- 
Configure bugmail: https://bugzilla.redhat.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



From jfearn at redhat.com  Wed Jan 18 02:41:47 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Wed, 18 Jan 2012 12:41:47 +1000
Subject: [publican-list] Possible alternative to wkhtmltopdf
In-Reply-To: <20111206134241.GA4911@bowman.infotech.monash.edu.au>
References: <20111206134241.GA4911@bowman.infotech.monash.edu.au>
Message-ID: <4F16316B.4030906@redhat.com>

Hi Peter, sorry for the looooooong delay in responding ... people
just keep demanding my attention! :)

Is there any reason you are not applying this effort to webkit? AIUI
both Apple and Google have teams making webkit use CSS3 properly, so
it would seem to be handy to take advantage of their efforts.

wkhtmltopdf seems to be a better comparison than FOP since we are
actively moving to it. I have wkhtmltopdf built and working on all
our arches, like PPC32 and S390, so I'm very keen on having that as
a benchmark for consideration ... which of course excludes FOP! :)

I tried checking out the git repos but got errors:

fatal: The remote end hung up unexpectedly

Cheers, Jeff.

On 12/06/2011 11:42 PM, Peter Moulder wrote:
> I mentioned earlier that I was working on an HTML renderer to do
> pagination.  Let's call it Morp.  Although it isn't user-ready, the
> output is starting to look like a tempting alternative, at least for
> print usage.
>
>
> Headline features from a Publican point of view:
>
>    - HTML/CSS styling.
>
>    - Doesn't fall apart when encountering a keep-together block larger
>      than a page.
>
>    - Allows glyph fallback font substitution for mixed-script documents.
>
>    - Proper shaping for Indic scripts (using Pango).
>
>    - Decent page breaking: honours 'widows' & 'orphans' and so on, but
>      also tries to avoid breaks that are merely undesirable, such as
>      breaking a short list item, or even splitting a paragraph if this
>      can be easily avoided.  Conversely, it might allow a widow if the
>      alternatives seem worse.
>
>      (E.g. if I mark figures as page-break-before:avoid and
>      page-break-inside:avoid, then Morp chooses to give a widow on page 89 of
>      the below sample in preference to either breaking those constraints or
>      leaving the page only 60% full.)
>
>    - css3-page styling of page headings, page numbering (roman numerals
>      in preface), styling of the "blank" page before a chapter, different
>      margins between inside & outside edges, etc.
>
>    - Rounded borders for the   things.  (This is the most obvious
>      visual difference between FOP page content that I guess is due to
>      something missing from FOP.)
>
>    - Justified text good enough to actually use.
>
>      Web browsers and even word processors have taught people that
>      justified text can't be used satisfactorily, producing large gaps
>      and/or excessive hyphenation.
>
>      Morp may not apply every known technique, but already it's enough
>      that Publican-produced pages can look like a book rather than like
>      a web page or school project.
>
> (I have a feeling that FOP can do quite good justified text too, btw.)
>
>
> The most recent sample of wkhtmltopdf output that was posted to the list
> was the Red Hat Enterprise Linux 6 Installation Guide (in English):
>
>    http://fedorapeople.org/~jfearn/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US-TEST.pdf
>
> The corresponding document (though apparently a slightly different
> version) as rendered by FOP is
>
>    http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/pdf/Installation_Guide/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US.pdf
>
> while output from Morp is at
>
>    http://bowman.infotech.monash.edu.au/~pmoulder/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US-Morp.pdf
>
>
> I've tried to make the page styling match FOP output.  Given that
> Publican output has lots of screenshots (so it's hard to fill every page
> evenly no matter what you do), I've set the pagination not to try very
> hard to fill pages exactly, letting it break pages in more logical places
> (so usually breaking between paragraphs rather than within paragraphs).
>
> I've used SVG versions of the warning/note/important icons, whereas I
> replaced the list-item bitmap images with simple glyph markers (diamond
> and box).
>
> Some notable omissions are:
>
>    - No page references yet (e.g. in tables of contents).
>
>    - No clickable document outline or clickable links.  I'd do this if
>      Cairo made it convenient (someone was working on an interface for
>      that), but this isn't something my boss needs.  Otherwise, the
>      output could be labelled as "PDF for printing" or the like, and
>      steering people to EPUB or HTML for on-screen use.
>
> pjrm.
>
> _______________________________________________
> publican-list mailing list
> publican-list at redhat.com
> https://www.redhat.com/mailman/listinfo/publican-list
> Wiki: https://fedorahosted.org/publican


-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From peter.moulder at monash.edu  Thu Jan 19 05:13:39 2012
From: peter.moulder at monash.edu (Peter Moulder)
Date: Thu, 19 Jan 2012 16:13:39 +1100
Subject: [publican-list] Possible alternative to wkhtmltopdf
In-Reply-To: <4F16316B.4030906@redhat.com>
References: <20111206134241.GA4911@bowman.infotech.monash.edu.au>
	<4F16316B.4030906@redhat.com>
Message-ID: <20120119051339.GA1796@bowman.infotech.monash.edu.au>

On Wed, Jan 18, 2012 at 12:41:47PM +1000, Jeff Fearn wrote:
> Hi Peter, sorry for the looooooong delay in responding ... people
> just keep demanding my attention! :)
> 
> Is there any reason you are not applying this effort to webkit? AIUI
> both Apple and Google have teams making webkit use CSS3 properly, so
> it would seem to be handy to take advantage of their efforts.

I did seriously consider that option, and spent quite a bit of time at
the start of the project looking at the source of webkit (webcore) and to
a lesser extent khtml.

Some obstacles with WebKit are that it's mainly for web browsers, so
needs to support things like DOM manipulation and incremental rendering
(and compatibility with IE6 and the like).  This gets in the way of
completely changing the data structures as needed for doing global
optimization of page breaks or line breaks or float placement in paged
output, or allocation of width among table columns, or whatever the
research needs.

(For similar reasons, a complete change in data structures in a way that
disregards the needs of incremental rendering or DOM manipulation isn't
likely to be accepted upstream even if I had started from (more of) the
existing WebKit source.)

More generally, starting with a large existing piece of software isn't
a good way of experimenting with how to do things differently.

So in the end, I got a few ideas from WebKit, and possibly even a
function or two, but that's about it.

> wkhtmltopdf seems to be a better comparison than FOP since we are
> actively moving to it.

Hmm, wkhtmltopdf did have some serious limitations for paged output
when you last posted a sample from it many months ago.
If wkhtmltopdf hasn't improved since then, and you want to drop use of
FOP output in the near future, then I would suggest looking at offering
Morp-produced pdfs for print usage.

What if to start with I made a pdf for just the documents listed on
http://docs.redhat.com/docs/toc.html that have an html-single but no pdf
version ?  That's 89 documents, mostly *-IN languages, though for some
reason includes two small en-US documents.

OK, I've just run Morp on that set of documents and put the results at:

  http://bowman.infotech.monash.edu.au/~pmoulder/redhat-docs/

However, I haven't checked them, and I expect there are some stylesheet
changes left to make to get the page headings right for documents that
aren't of class "book".  And maybe there'll be some embarrassing bug or
bad line break or something.  Oh well, let me know of problems you find.
I might try to check & fix at least the page headings in a couple of days.

Since the previous time I posted, I've fixed a bug in coordinate
calculation that showed up in a couple of tables near the end of the
installation guide document (most notably the revision history, which is
a table as far as CSS is concerned).  I've overwritten the file at the
original URL with a fixed version.

> I have wkhtmltopdf built and working on all
> our arches, like PPC32 and S390, so I'm very keen on having that as
> a benchmark for consideration ... which of course excludes FOP! :)

I'm not aware of portability obstacles for those arches, but I've only
actually tried it on 32-bit little-endian architectures (ia32, and arm a
long time ago).

Speed-wise: when I compared it with a distro-installed copy of
wkhtmltopdf (v0.9.9 running against stock qt4 4.5.3 webkit) for documents
from docs.redhat.com, I found the two to have very roughly similar speed,
to the extent that some documents were faster with one and some faster
with the other.  I wouldn't expect them to be that similar in speed (I
certainly haven't tried as hard as webkit engineers to make things fast),
so I wonder whether there's a flaw in how I did the timings, or maybe
it's something that'll change a lot as both projects develop.  (E.g. Morp
is currently slower on average than wkhtmltopdf in my testing, but I
expect Morp's average times will jump to below wkhtmltopdf's in a couple
of months, but then increase a bit with later functional changes.)

In my estimation, the most important difference in the output I've seen
is that wkhtmltopdf's pdf output is better for on-screen pdf browsing
(clickable links), while Morp's output is better for print (more helpful
page headings, and avoids bad page breaks).

> I tried checking out the git repos but got errors:
> 
> fatal: The remote end hung up unexpectedly

Oops, drop the /srv/git from the URLs I gave, i.e. make them:

  git://bowman.infotech.monash.edu.au/libcroco.git
  git://bowman.infotech.monash.edu.au/morp.git

However, remember my comments about not being user-ready (e.g. see
message in reply to Raphael Hertzog), so you could be in for a bumpy ride
(lack of documentation, assertion failures for unimplemented things).
I'll write privately later to try to smooth things a little.


In the message quoted below, I've removed the points that don't apply in
comparison to wkhtmltopdf.

pjrm.


> >I mentioned earlier that I was working on an HTML renderer to do
> >pagination.  Let's call it Morp.  Although it isn't user-ready, the
> >output is starting to look like a tempting alternative, at least for
> >print usage.
> >
> >
> >Headline features from a Publican point of view:
> >
> >   - Decent page breaking: honours 'widows' & 'orphans' and so on, but
> >     also tries to avoid breaks that are merely undesirable, such as
> >     breaking a short list item, or even splitting a paragraph if this
> >     can be easily avoided.  Conversely, it might allow a widow if the
> >     alternatives seem worse.
> >
> >     (E.g. if I mark figures as page-break-before:avoid and
> >     page-break-inside:avoid, then Morp chooses to give a widow on page 89 of
> >     the below sample in preference to either breaking those constraints or
> >     leaving the page only 60% full.)
> >
> >   - css3-page styling of page headings, page numbering (roman numerals
> >     in preface), styling of the "blank" page before a chapter, different
> >     margins between inside & outside edges, etc.
> >
> >   - Justified text good enough to actually use.
> >
> >     Web browsers and even word processors have taught people that
> >     justified text can't be used satisfactorily, producing large gaps
> >     and/or excessive hyphenation.
> >
> >     Morp may not apply every known technique, but already it's enough
> >     that Publican-produced pages can look like a book rather than like
> >     a web page or school project.
> >
> >The most recent sample of wkhtmltopdf output that was posted to the list
> >was the Red Hat Enterprise Linux 6 Installation Guide (in English):
> >
> >   http://fedorapeople.org/~jfearn/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US-TEST.pdf
> >
> >while output from Morp is at
> >
> >   http://bowman.infotech.monash.edu.au/~pmoulder/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US-Morp.pdf
> >
> >
> >I've tried to make the page styling match FOP output.  Given that
> >Publican output has lots of screenshots (so it's hard to fill every page
> >evenly no matter what you do), I've set the pagination not to try very
> >hard to fill pages exactly, letting it break pages in more logical places
> >(so usually breaking between paragraphs rather than within paragraphs).
> >
> >I've used SVG versions of the warning/note/important icons, whereas I
> >replaced the list-item bitmap images with simple glyph markers (diamond
> >and box).
> >
> >Some notable omissions are:
> >
> >   - No page references yet (e.g. in tables of contents).
> >
> >   - No clickable document outline or clickable links.  I'd do this if
> >     Cairo made it convenient (someone was working on an interface for
> >     that), but this isn't something my boss needs.  Otherwise, the
> >     output could be labelled as "PDF for printing" or the like, and
> >     steering people to EPUB or HTML for on-screen use.

The above message didn't try to make a full comparison between Morp and
either wkhtmltopdf or FOP.  An obvious point in WebKit's favour is that
WebKit is much more widely tested than Morp, and will have better support
of whatever CSS features your designers want to experiment with.

One thing on your side in any case is that once you've set up each piece
of software in the first place, it's relatively easy to switch between
them.  [Spoken in blissful ignorance of your experience of starting to
use wkhtmltopdf.]

pjrm.



From bugzilla at redhat.com  Thu Jan 19 10:20:26 2012
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 19 Jan 2012 05:20:26 -0500
Subject: [publican-list] [Bug 663539] [FAMILY Given] shown without author's
 name on the cover page
In-Reply-To: 
References: 
Message-ID: <201201191020.q0JAKQbY011883@bzweb01.app.bz.hst.phx2.redhat.com>

Please do not reply directly to this email. All additional
comments should be made in the comments box of this bug.


https://bugzilla.redhat.com/show_bug.cgi?id=663539

Martin Prpic  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|VERIFIED                    |CLOSED
         Resolution|                            |CURRENTRELEASE
        Last Closed|2011-08-01 18:48:00         |2012-01-19 05:20:25

-- 
Configure bugmail: https://bugzilla.redhat.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



From Norman at dunbar-it.co.uk  Wed Jan 25 16:49:32 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Wed, 25 Jan 2012 16:49:32 +0000
Subject: [publican-list] PDF Indices
Message-ID: <4F20329C.30807@dunbar-it.co.uk>

Does anyone use indices in their pdf documents? Do you use page ranges 
for any entries?

I'm using these just now in a fairly large document I'm creating and 
I've noticed that setting up an index page range, as follows, gives 
strange results.

Example:

At the beginning of a page range, add the following indexterm:


Whatever

At the end of the range, put this indexterm:


Whatever

When the index is generated, you get something like

Whatever 10-22, 22

The page number for the end of range is added in as a separate single page.

If no-one has seen this, I'll log it as a problem.
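
The markup in the example above did not survive archiving. As a minimal sketch only, a DocBook 4.5 page range is normally marked with a pair of index terms like the following. The id value "idx-whatever" is a hypothetical name chosen for illustration and is not taken from the original message.

```xml
<!-- Start of the range: carries the index text and an id (hypothetical). -->
<indexterm class="startofrange" id="idx-whatever">
  <primary>Whatever</primary>
</indexterm>

<!-- ... the content covered by the range goes here ... -->

<!-- End of the range: refers back to the start through startref. -->
<indexterm class="endofrange" startref="idx-whatever"/>
```

In DocBook the end-of-range term is normally empty and only points back with startref. Whether repeating the primary text at the end of the range, as the example above appears to do, is what produces the extra single-page entry is an untested assumption.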


Thanks.

Cheers,
Norm.

-- 
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767



From Norman at dunbar-it.co.uk  Thu Jan 26 10:20:31 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Thu, 26 Jan 2012 10:20:31 +0000
Subject: [publican-list] Dbfo keep-together instructions stripped out by
	publican?
Message-ID: <4F2128EF.3080802@dunbar-it.co.uk>

I'm running Publican 2.8 on Fedora 16.

I have a manual that I'm creating and in order to prevent all tables 
splitting across pages, I've added the following to a customisation 
layer which I've called pdf.xsl after renaming the Publican file of the 
same name to pdf.original.xsl. (This is in the /usr/share/publican/xsl 
directory.)









   always



   always


...



The problem is that while the tables do not split over a page break, 
which is what I want, the two very large tables now get crushed up onto 
a page.

I looked in the tmp/en-US/xml folder for the Publican processed xml file 
and the processing instruction I added to the big tables had been 
stripped out.
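
The stylesheet listing above lost its markup in the archive. A minimal sketch of a customisation layer of this kind follows, assuming the stock DocBook XSL attribute sets table.properties and informaltable.properties (the attribute set names in the original are not recoverable, so both are assumptions) and assuming the layer imports the renamed Publican stylesheet from the same directory.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pull in the renamed Publican stylesheet (assumed to sit alongside). -->
  <xsl:import href="pdf.original.xsl"/>

  <!-- Keep formal tables together (attribute set name is an assumption). -->
  <xsl:attribute-set name="table.properties">
    <xsl:attribute name="keep-together.within-column">always</xsl:attribute>
  </xsl:attribute-set>

  <!-- Same for informal tables (attribute set name is an assumption). -->
  <xsl:attribute-set name="informaltable.properties">
    <xsl:attribute name="keep-together.within-column">always</xsl:attribute>
  </xsl:attribute-set>

</xsl:stylesheet>
```

Under plain DocBook XSL, an individual table can then opt out with the processing instruction <?dbfo keep-together="auto"?> placed inside that table, which is the kind of instruction that goes missing from the processed XML here.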

There were no warning messages about deprecated or invalid xml when 
processing the particular file in question.

Is there a way I can get around this please?

As a workaround I could add the instruction to the Publican generated 
XML file, there are only two large tables, but how do I then get the 
build to (a) stop after generating the XML and (b) generate the PDF from 
the tmp/en-US/xml files rather than starting again from my source files?


Many thanks.


Cheers,
Norm.

PS. Happy to log a bug, but a search showed nothing relevant and I 
thought I'd ask first. Thanks.




From Norman at dunbar-it.co.uk  Thu Jan 26 10:21:49 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Thu, 26 Jan 2012 10:21:49 +0000
Subject: [publican-list] PDF Indices
In-Reply-To: <4F20329C.30807@dunbar-it.co.uk>
References: <4F20329C.30807@dunbar-it.co.uk>
Message-ID: <4F21293D.3030800@dunbar-it.co.uk>

On 25/01/12 16:49, Norman Dunbar wrote:
> Does anyone use indices in their pdf documents? Do you use page ranges
> for any entries?

Sorry, I forgot to mention, I'm running Publican 2.8 on Fedora 16.

Cheers,
Norm.




From fdalrymple at redhat.com  Thu Jan 26 22:24:53 2012
From: fdalrymple at redhat.com (Fred Dalrymple)
Date: Thu, 26 Jan 2012 17:24:53 -0500 (EST)
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: 
Message-ID: 

Hi everyone -- 

Just joined this list, though I've been at Red Hat for a couple of years. I'm a marketing writer in Westford, currently writing the CloudForms Evaluation Guide, and have started using topic tools and publican. 

I understand that we've generally avoided alphabetically sorted lists, e.g., glossaries, in documents because the sort order may change when translated. 

I'm thinking of prototyping a solution that would transform a list into an appropriate alternate order and am interested in any requirements that people might have. For example: 

    - I'm assuming it should work with DocBook 4.5 (at least, that's what I'm using now) -- any other versions or tagsets?

    - what lists, other than glossaries, should be handled by this solution? An obvious possibility is .

    - is there anything wrong with the general idea of deriving an alternate version of a source file, to reorder designated lists, as long as the original source files are not harmed? I'm leaning toward doing the processing just before formatting so that it would be more independent of edits.

    - are there any constraints on which language the pre-processor is written in? For example, perl is ok?

How are translations handled? Files are sent out, translated, and returned with markup generally intact (i.e., only content translated)? I see the directory structure in the document tree for each locale.

A requirement that I'm planning to observe: this approach should not require changes to any content source files. However, new, independent, non-content files could be used to drive the process. The only exception that I can see now is, for example, an  might indicate that it is supposed to be alphabetized by using something like role="alphabetized". 
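
As one possible shape for such a pre-processing step, here is a minimal XSLT 1.0 sketch (using XSLT rather than perl is an assumption made purely for illustration). It copies a document unchanged, except that the entries of any glosslist carrying role="alphabetized" are sorted by their glossterm. The choice of glosslist and the "lang" parameter name are assumptions, and how well locale-aware sorting works depends on the XSLT processor.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Language used for collation; hypothetical parameter, set per locale. -->
  <xsl:param name="lang" select="'en'"/>

  <!-- Identity template: copy everything as-is by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Re-order only the lists that opt in via role="alphabetized". -->
  <xsl:template match="glosslist[@role='alphabetized']">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Titles and other leading elements keep their position. -->
      <xsl:apply-templates select="*[not(self::glossentry)]"/>
      <!-- Entries are emitted sorted by their term text. -->
      <xsl:apply-templates select="glossentry">
        <xsl:sort select="glossterm" lang="{$lang}"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the transform only reads the source and writes a derived copy, the original content files stay untouched. Note that a plain identity transform like this does not carry over the DOCTYPE declaration, so a real tool would need to re-emit it.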

Comments requested. 

Thanks! 

Fred 

From jmorgan at redhat.com  Fri Jan 27 03:22:24 2012
From: jmorgan at redhat.com (Jared Morgan)
Date: Thu, 26 Jan 2012 22:22:24 -0500 (EST)
Subject: [publican-list] Dbfo keep-together instructions stripped out
	by	publican?
In-Reply-To: <4F2128EF.3080802@dunbar-it.co.uk>
Message-ID: 

Hi Norman

Apologies for the tl:dr response that follows, but stick with it and hopefully it will help you.

I had trouble getting FOP playing nicely on F16 at all. I asked internally here, and it was suggested that I give the wkhtmltopdf plugin a shot for PDF generation.

You might want to revisit the thread from August 2011, which goes into a bit of detail about the testing of the tool. Jeff has rolled some of his own RPMs, which are available from the links in the email thread below.

https://www.redhat.com/archives/publican-list/2011-August/msg00063.html

Caveat: this is not yet fully implemented in Publican, and is very much a beta feature. But it is about 10 times faster than FOP, and at least for me, results in frankly beautiful PDF output (thanks Publican team for getting it this far!!).

It *might* also solve some of your table problems that FOP is causing. The keep-together issue has stripped out entire  blocks from PDFs that are larger than 1 page in length.   

If wkhtmltopdf does not fix your issue, or makes it worse, just remove both packages and you are back to FOP.

Based on previous bugs I raised against FOP issues, I don't think this will be fixed any time soon (because to get it working correctly across the many scenarios that require keep-together is a tremor-inducing nightmare).

I hope this helps you out. It certainly made the PDFs I was generating locally look fantastic, and function correctly.

Cheers

Jared Morgan
EPP Maintenance Lead | PressGang Lead
Red Hat Asia Pacific
1/193 North Quay
BRISBANE QLD 4000

P: +61 7 3514 8242
M: +61 413 005 479

Too brief? Here's why! http://emailcharter.org 

----- Original Message -----
> From: "Norman Dunbar" 
> To: publican-list at redhat.com
> Sent: Thursday, January 26, 2012 8:20:31 PM
> Subject: [publican-list] Dbfo keep-together instructions stripped out by	publican?
> 
> I'm running Publican 2.8 on Fedora 16.
> 
> I have a manual that I'm creating and in order to prevent all tables
> splitting across pages, I've added the following to a customisation
> layer which I've called pdf.xsl after renaming the Publican file of
> the
> same name to pdf.original.xsl. (This is in the
> /usr/share/publican/xsl
> directory.)
> 
> 
> 
> 
>  ...
> 
> 
> 
> 
> 
> 
>        name="keep-together.within-column">always
> 
> 
> 
>        name="keep-together.within-column">always
> 
> 
> ...
> 
> 
> 
> The problem is that while the tables do not split over a page break,
> which is what I want, the two very large tables now get crushed up
> onto
> a page.
> 
> I looked in the tmp/en-US/xml folder for the Publican processed xml
> file
> and the processing instruction I added to the big tables had been
> stripped out.
> 
> There were no warning messages about deprecated or invalid xml when
> processing the particular file in question.
> 
> Is there a way I can get around this please?
> 
> As a workaround I could add the instruction to the Publican generated
> XML file, there are only two large tables, but how do I then get the
> build to (a) stop after generating the XML and (b) generate the PDF
> from
> the tmp/en-US/xml files rather than starting again from my source
> files?
> 
> 
> Many thanks.
> 
> 
> Cheers,
> Norm.
> 
> PS. Happy to log a bug, but a search showed nothing relevant and I
> thought I'd ask first. Thanks.
> 



From Norman at dunbar-it.co.uk  Fri Jan 27 08:17:15 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Fri, 27 Jan 2012 08:17:15 +0000
Subject: [publican-list] Dbfo keep-together instructions stripped out
 by	publican?
In-Reply-To: 
References: 
Message-ID: <4F225D8B.1080702@dunbar-it.co.uk>

Morning Jared,

> Apologies for the tl:dr response that follows, but stick with it and hopefully it will help you.
Ummm, tl:dr? You've got me there!



I have looked at the wkhtmltopdf utility and found it ok, that was when 
it was first announced. I have not yet tried to interface it with 
Publican - I shall certainly give it a try as a publican plugin.



> It *might* also solve some of your table problems that FOP is causing. The keep-together issue has stripped out entire  blocks from PDFs that are larger than 1 page in length.
I'm not seeing this. The big tables are still there in the pdf, just 
wrapped around themselves on a page. The reason being, as far as I can 
see, that my processing instruction within the large tables has been 
stripped out by the publican build command, before it gets to converting 
from xml to xsl-fo and before fop is involved.

In my customisation layer over pdf.xsl I have set the keep-together to 
be "always" and for these two tables, a PI sets it to "auto". If I 
change it back to "auto" then these two tables do correctly split but so 
do all the other ones I don't want splitting.

Under plain old DocBook, it works the way I need it to work. So I'm 
assuming that Publican's preprocessing is the culprit and is quietly 
removing my PI from my source, without warning me.

Running publican build --format=test .... shows no warnings about the PI 
and the print_banned output doesn't list PIs as being a bad thing either.



> Based on previous bugs I raised against FOP issues, I don't think this will be fixed any time soon (because to get it working correctly across the many scenarios that require keep-together is a tremor-inducing nightmare).
As I mention above, I'm not convinced that this is a FOP issue, the PI 
is stripped out long before FOP is executed.

> I hope this helps you out. It certainly made the PDFs I was generating locally look fantastic, and function correctly.
Thank you again. I will give the new system a decent chance. It may well 
fix the problem without needing the PI, I hope so, and it will be a lot 
quicker than me trying to learn perl to see what's what in the source 
for Publican! ;-)

Many thanks.


Cheers,
Norm.

PS. Brisbane? I love that place.




From Norman at dunbar-it.co.uk  Fri Jan 27 09:20:41 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Fri, 27 Jan 2012 09:20:41 +0000
Subject: [publican-list] Dbfo keep-together instructions stripped out
 by	publican?
In-Reply-To: 
References: 
Message-ID: <4F226C69.9080005@dunbar-it.co.uk>

Morning Jared,

Ok, I tried wkhtmltopdf after installing the QT stuff needed by the two 
rpms you linked me to.

Good points:

It's very fast indeed!

The output format appears quite nice, as you say, fantastic looking 
documents.


Bad points (for me anyway):

The front cover image is missing and I have a lovely set of scroll bars 
where it should be. :-(

Paging is ruined - nothing throws to a new page any more - sections, 
chapters, parts etc, all start just below where the last "bit" finished.

Every part, chapter and section has a table of contents present, only 
the book itself should have one.

Every single table now splits over a page, but even worse, using FOP I 
did get a copy of the table headers on the continuation. Not any more. :-(

I presume the above is due to the initial conversion from XML to HTML 
being an html-single - there are no page breaks - so tables etc don't 
split in the html, but when converted to pdf, oh boy!

But the worst thing of all for a printed document, the indexing. I have 
4 (yes, overkill perhaps, but that's how it is!) different indices and 
instead of having page numbers, they have the section header instead.

I'm afraid it's not for me - yet - as it's not producing anything like a 
decent, printable pdf document. Sorry.


Appreciate you taking the time to remind me of this utility, but it's 
not quite ready for mainstream yet - at least, as far as my book is 
concerned.

Thanks.


Cheers,
Norm.




From jfearn at redhat.com  Sun Jan 29 23:17:22 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Mon, 30 Jan 2012 09:17:22 +1000
Subject: [publican-list] PDF Indices
In-Reply-To: <4F20329C.30807@dunbar-it.co.uk>
References: <4F20329C.30807@dunbar-it.co.uk>
Message-ID: <4F25D382.7080301@redhat.com>

On 01/26/2012 02:49 AM, Norman Dunbar wrote:
> Does anyone use indices in their pdf documents? Do you use page ranges
> for any entries?
>
> I'm using these just now in a fairly large document I'm creating and
> I've noticed that setting up an index page range, as follows, gives
> strange results.
>
> Example:
>
> At the beginning of a page range, add the following indexterm:
>
> 
> Whatever
>
> At the end of the range, put this indexterm:
>
> 
> Whatever
>
> When the index is generated, you get something like
>
> Whatever 10-22, 22
>
> The page number for the end of range is added in as a separate single page.
>
> If no-one has seen this, I'll log it as a problem.
>
>
> Thanks.
>
> Cheers,
> Norm.
>

Hi Norm, I'm unaware of anyone using this functionality, so it's 
probably never been vetted. Please open a bug and include how you think 
it should work.

Cheers, Jeff.




From jfearn at redhat.com  Sun Jan 29 23:25:12 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Mon, 30 Jan 2012 09:25:12 +1000
Subject: [publican-list] Dbfo keep-together instructions stripped out
 by	publican?
In-Reply-To: <4F2128EF.3080802@dunbar-it.co.uk>
References: <4F2128EF.3080802@dunbar-it.co.uk>
Message-ID: <4F25D558.4010905@redhat.com>

Hi Norm,

On 01/26/2012 08:20 PM, Norman Dunbar wrote:
> I'm running Publican 2.8 on Fedora 16.
>
> I have a manual that I'm creating and in order to prevent all tables
> splitting across pages, I've added the following to a customisation
> layer which I've called pdf.xsl after renaming the Publican file of the
> same name to pdf.original.xsl. (This is in the /usr/share/publican/xsl
> directory.)
>
> 
> 
>
>  ...
>
> 
> 
>
> 
> 
> always
> 
>
> 
> always
> 
>
> ...
>
> 
>
> The problem is that while the tables do not split over a page break,
> which is what I want, the two very large tables now get crushed up onto
> a page.
>
> I looked in the tmp/en-US/xml folder for the Publican processed xml file
> and the processing instruction I added to the big tables had been
> stripped out.
>
> There were no warning messages about deprecated or invalid xml when
> processing the particular file in question.

We do silently strip out processing instructions. It's quite deliberate: 
when we started making Publican, writers adding PIs caused us _endless_ 
amounts of consternation. That decision was made long before Publican 
was made public and almost as long forgotten!

> Is there a way I can get around this please?

You can override table output in your pdf.xsl file and set the attribute 
in the XSL-FO output directly. You could base it on a role on the 
DocBook table.

> As a workaround I could add the instruction to the Publican generated
> XML file, there are only two large tables, but how do I then get the
> build to (a) stop after generating the XML and (b) generate the PDF from
> the tmp/en-US/xml files rather than starting again from my source files?

It's much better for your sanity to edit your brand's XSL than to try 
interceding in the build process.

Cheers, Jeff.

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From jfearn at redhat.com  Sun Jan 29 23:29:07 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Mon, 30 Jan 2012 09:29:07 +1000
Subject: [publican-list] Dbfo keep-together instructions stripped out
 by	publican?
In-Reply-To: <4F226C69.9080005@dunbar-it.co.uk>
References: 
	<4F226C69.9080005@dunbar-it.co.uk>
Message-ID: <4F25D643.1010501@redhat.com>

Hi Norm, have you rebuilt Publican from the SVN repo? We have done a lot 
of work tweaking the CSS file used to create PDFs with wkhtmltopdf.

svn co http://svn.fedorahosted.org/svn/publican/branches/publican-2x

These changes, and hopefully a few more, will be in Publican 2.9, due 
out ... hopefully soon O_O

Cheers, Jeff.

On 01/27/2012 07:20 PM, Norman Dunbar wrote:
> Morning Jared,
>
> Ok, I tried wkhtmltopdf after installing the QT stuff needed by the two
> rpms you linked me to.
>
> Good points:
>
> It's very fast indeed!
>
> The output format appears quite nice, as you say, fantastic looking
> documents.
>
>
> Bad points (for me anyway):
>
> The front cover image is missing and I have a lovely set of scroll bars
> where it should be. :-(
>
> Paging is ruined - nothing throws to a new page any more - sections,
> chapters, parts etc, all start just below where the last "bit" finished.
>
> Every part, chapter and section has a table of contents present, only
> the book itself should have one.
>
> Every single table now splits over a page, but even worse, using FOP I
> did get a copy of the table headers on the continuation. Not any more. :-(
>
> I presume the above is due to the initial conversion from XML to HTML
> being an html-single - there are no page breaks - so tables etc don't
> split in the html, but when converted to pdf, oh boy!
>
> But the worst thing of all for a printed document, the indexing. I have
> 4 (yes, overkill perhaps, but that's how it is!) different indices and
> instead of having page numbers, they have the section header instead.
>
> I'm afraid it's not for me - yet - as it's not producing anything like a
> decent, printable pdf document. Sorry.
>
>
> Appreciate you taking the time to remind me of this utility, but it's
> not quite ready for mainstream yet - at least, as far as my book is
> concerned.
>
> Thanks.
>
>
> Cheers,
> Norm.
>


-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From jwulf at redhat.com  Mon Jan 30 05:52:33 2012
From: jwulf at redhat.com (Joshua J Wulf)
Date: Mon, 30 Jan 2012 15:52:33 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: 
References: 
Message-ID: <4F263021.7020902@redhat.com>

Hi Fred,

Sounds like a really interesting and useful idea!

Translation is accomplished with many projects using Zanata 
[www.zanata.org].

The workflow for translation is like this:

1. Generate pot files (Portable Object Template).
2. Use the zanata client to push the pot files to Zanata
3. Use the zanata client to pull the translations as po files

According to my source, the matching between source xml and po file is 
on string match, not on line number. If that is the case, then if you 
don't change the content of the source xml, but only the ordering, then 
you'll be able to rewrite the source xml based on the po file.

So normally you'd go:
4. Build the translated output using the source xml and the po file for 
the target language
  For example: publican build --langs=es-ES --formats=html

With a dynamic rewrite you might go:

reorderpublican --langs=es-ES --formats=html

Which would then:
4. Copy the original xml file
5. Scan the xml copy for reorderable elements
6. Locate the reorderable elements in the es-ES po files
7. Construct a reordering based on the translations in the po files
8. Rewrite the xml according to the reordering
9. Call publican build --langs=es-ES --formats=html using the reordered 
xml file
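
Steps 5 to 8 above might be sketched like this in Python. This is only
an illustration under stated assumptions: reorderpublican does not
exist, the translations dict stands in for msgid/msgstr pairs read from
a po file, and a plain string sort stands in for real locale-aware
collation.

```python
import xml.etree.ElementTree as ET

SOURCE_XML = """<glossary>
  <title>Glossary</title>
  <glossentry><glossterm>file</glossterm><glossdef><para>A stored document.</para></glossdef></glossentry>
  <glossentry><glossterm>keyboard</glossterm><glossdef><para>An input device.</para></glossdef></glossentry>
  <glossentry><glossterm>printer</glossterm><glossdef><para>An output device.</para></glossdef></glossentry>
</glossary>"""

# Hypothetical msgid -> msgstr pairs, as they might be read from an es-ES po file.
ES_TRANSLATIONS = {"file": "archivo", "keyboard": "teclado", "printer": "impresora"}


def reorder_glossary(xml_text, translations):
    """Return a parsed copy of the source with glossentries in translated order."""
    root = ET.fromstring(xml_text)
    # Collect the parents first so the tree is not modified while iterating it.
    parents = [p for p in root.iter() if p.find("glossentry") is not None]
    for parent in parents:
        entries = parent.findall("glossentry")

        def translated(entry):
            term = entry.findtext("glossterm", default="")
            return translations.get(term, term).casefold()

        for entry in entries:
            parent.remove(entry)
        # Re-append the entries after any title, sorted by the translated term.
        parent.extend(sorted(entries, key=translated))
    return root


reordered = reorder_glossary(SOURCE_XML, ES_TRANSLATIONS)
print([e.findtext("glossterm") for e in reordered.iter("glossentry")])
```

The source strings themselves are untouched, only their order changes,
so the string matching against the po file should still work.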

Not sure if that's what you were thinking...

I don't know how you deal with a language like Chinese, which isn't 
"alphabetical". They must have some kind of ordering though, to be able 
to produce dictionaries.

As a hack-around, you could write it in anything. If it were written in 
Perl it might be easier to get it accepted into Publican.

I'd be happy to help you to test it if you get a prototype up and working.

- Josh




On 01/27/2012 08:24 AM, Fred Dalrymple wrote:
> Hi everyone --
>
> Just joined this list, though I've been at Red Hat for a couple of 
> years.  I'm a marketing writer in Westford, currently writing the 
> CloudForms Evaluation Guide, and have started using topic tools and 
> publican.
>
> I understand that we've generally avoided alphabetically sorted lists, 
> e.g., glossaries, in documents because the sort order may change when 
> translated.
>
> I'm thinking of prototyping a solution that would transform a list 
> into an appropriate alternate order and am interested in any 
> requirements that people might have.  For example:
>
>   * I'm assuming it should work with DocBook 4.5 (at least, that's
>     what I'm using now) -- any other versions or tagsets?
>
>   * what lists, other than glossaries, should be handled by this
>     solution? An obvious possibility is .
>
>   * is there anything wrong with the general idea of deriving an
>     alternate version of a source file, to reorder designated lists,
>     as long as the original source files are not harmed? I'm leaning
>     toward doing the processing just before formatting so that it
>     would be more independent of edits.
>
>   * are there any constraints on which language the pre-processor is
>     written in? For example, perl is ok?
>
> How are translations handled?  Files are sent out, translated, and 
> returned with markup generally intact (i.e., only content translated)? 
> I see the directory structure in the document tree for each locale.
>
> A requirement that I'm planning to observe:  this approach should not 
> require changes to any content source files.  However, new, 
> independent, non-content files could be used to drive the process. The 
> only exception that I can see now is, for example, an  
> might indicate that it is supposed to be alphabetized by using 
> something like role="alphabetized".
>
> Comments requested.
>
> Thanks!
>
> Fred
>
>
> _______________________________________________
> publican-list mailing list
> publican-list at redhat.com
> https://www.redhat.com/mailman/listinfo/publican-list
> Wiki: https://fedorahosted.org/publican


From jfearn at redhat.com  Mon Jan 30 06:28:58 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Mon, 30 Jan 2012 16:28:58 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: 
References: 
Message-ID: <4F2638AA.80803@redhat.com>

Hi Fred, welcome to the list.

We have discussed this many times, see: 
https://fedorahosted.org/publican/wiki/WishList#GlossaryCollation

Collating the three kana scripts of Japanese properly is the Mt Everest 
of this challenge.

On 01/27/2012 08:24 AM, Fred Dalrymple wrote:
> Hi everyone --
>
> Just joined this list, though I've been at Red Hat for a couple of years. I'm a marketing writer in Westford, currently writing the CloudForms Evaluation Guide, and have started using topic tools and publican.
>
> I understand that we've generally avoided alphabetically sorted lists, e.g., glossaries, in documents because the sort order may change when translated.

It's not because the order changes, it's because it's not possible to 
collate kana correctly.

> I'm thinking of prototyping a solution that would transform a list into an appropriate alternate order and am interested in any requirements that people might have. For example:
>
>   * I'm assuming it should work with DocBook 4.5 (at least, that's what I'm using now) -- any other versions or tagsets?

Really it should just take two strings and return the correctly collated 
"bigger than" value. Leave it up to the app to supply the strings and 
handle what it wants to pass in.
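
The shape of such a function, two strings in and a comparison result
out, could look like the Python sketch below. It is only a toy: the
names are made up, it folds katakana onto hiragana and then compares
code points, and it does nothing at all for kanji, which is the hard
part of the problem.

```python
import unicodedata
from functools import cmp_to_key


def kana_fold(text):
    """Map katakana onto hiragana so both scripts share one ordering."""
    # NFKC also turns half-width katakana into the full-width forms.
    text = unicodedata.normalize("NFKC", text)
    folded = []
    for ch in text:
        cp = ord(ch)
        if 0x30A1 <= cp <= 0x30F6:  # katakana block, offset 0x60 from hiragana
            ch = chr(cp - 0x60)
        folded.append(ch)
    return "".join(folded)


def collate(a, b):
    """Return -1, 0 or 1, like a classic string comparison function."""
    ka, kb = kana_fold(a), kana_fold(b)
    return (ka > kb) - (ka < kb)


# Left to the caller: which strings to pass in and what to do with the result.
print(sorted(["ねこ", "イヌ", "うま"], key=cmp_to_key(collate)))
```

Without the folding step, the katakana entry would sort after every
hiragana entry simply because of where the blocks sit in Unicode.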

>   * what lists, other than glossaries, should be handled by this solution? An obvious possibility is .

Ignore that level to start with, just handle two strings.

>
>   * is there anything wrong with the general idea of deriving an alternate version of a source file, to reorder designated lists, as long as the original source files are not harmed? I'm leaning toward doing the processing just before formatting so that it would be more independent of edits.

It limits how it can be used. XSL can call arbitrary functions so it can 
be handled during the XSLT call, no need for an extra intermediate file.

>
>   * are there any constraints on which language the pre-processor is written in? For example, perl is ok?

There is no constraint. Perl would be great, but kana collation would be 
extremely useful to many open source projects, so picking a language 
that performs well and can be easily made into a library would be good.

> How are translations handled? Files are sent out, translated, and returned with markup generally intact (i.e., only content translated)? I see the directory structure in the document tree for each locale.

I wouldn't worry about that. Collating kana is an enormous challenge and 
you are better off leaving the app specific integration up to app 
developers. I'd have it working in publican in a split second if it 
existed as a library.

> A requirement that I'm planning to observe: this approach should not require changes to any content source files. However, new, independent, non-content files could be used to drive the process. The only exception that I can see now is, for example, an  might indicate that it is supposed to be alphabetized by using something like role="alphabetized".

Again, forget application use, concentrate on getting the collation working.

Good luck!

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From jwulf at redhat.com  Mon Jan 30 06:37:07 2012
From: jwulf at redhat.com (Joshua J Wulf)
Date: Mon, 30 Jan 2012 16:37:07 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F2638AA.80803@redhat.com>
References: 
	<4F2638AA.80803@redhat.com>
Message-ID: <4F263A93.6070001@redhat.com>

On 01/30/2012 04:28 PM, Jeff Fearn wrote:
> Hi Fred, welcome to the list.
>
> We have discussed this many times, see: 
> https://fedorahosted.org/publican/wiki/WishList#GlossaryCollation
>
> Collating the three kana scripts of Japanese properly is the Mt 
> Everest of this challenge.

I thought it looked too easy to still be up for grabs...



From jfearn at redhat.com  Mon Jan 30 06:58:45 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Mon, 30 Jan 2012 16:58:45 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F263A93.6070001@redhat.com>
References: 
	<4F2638AA.80803@redhat.com> <4F263A93.6070001@redhat.com>
Message-ID: <4F263FA5.1070804@redhat.com>

On 01/30/2012 04:37 PM, Joshua J Wulf wrote:
> On 01/30/2012 04:28 PM, Jeff Fearn wrote:
>> Hi Fred, welcome to the list.
>>
>> We have discussed this many times, see:
>> https://fedorahosted.org/publican/wiki/WishList#GlossaryCollation
>>
>> Collating the three kana scripts of Japanese properly is the Mt
>> Everest of this challenge.
>
> I thought it looked too easy to still be up for grabs...

Collating each of them separately is easy, but it's perfectly valid in 
Japanese to mix them so you have to be able to collate all of them 
together. AFAIK no one has done that in any open source project.

Cheers, Jeff.


-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From jfearn at redhat.com  Mon Jan 30 07:00:32 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Mon, 30 Jan 2012 17:00:32 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F263FA5.1070804@redhat.com>
References: 	<4F2638AA.80803@redhat.com>
	<4F263A93.6070001@redhat.com> <4F263FA5.1070804@redhat.com>
Message-ID: <4F264010.7010402@redhat.com>

This is a good explanation of the issue 
https://www.redhat.com/archives/publican-list/2010-May/msg00025.html

Cheers, Jeff.

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From Norman at dunbar-it.co.uk  Mon Jan 30 11:52:36 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Mon, 30 Jan 2012 11:52:36 +0000
Subject: [publican-list] PDF Indices
In-Reply-To: <4F25D382.7080301@redhat.com>
References: <4F20329C.30807@dunbar-it.co.uk> <4F25D382.7080301@redhat.com>
Message-ID: <4F268484.7030302@dunbar-it.co.uk>

On 29/01/12 23:17, Jeff Fearn wrote:
> Hi Norm, I'm unaware of anyone using this functionality, so it's
> probably never been vetted. Please open a bug and include how you think
> it should work.
>
> Cheers, Jeff.
This has been done, thanks. 
https://bugzilla.redhat.com/show_bug.cgi?id=785697

Cheers,
Norm.

-- 
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767



From fdalrymple at redhat.com  Mon Jan 30 14:16:49 2012
From: fdalrymple at redhat.com (Fred Dalrymple)
Date: Mon, 30 Jan 2012 09:16:49 -0500 (EST)
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F264010.7010402@redhat.com>
Message-ID: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>

Thanks for the pointer (I didn't look far enough back in the archives). 

In general, if there is no automated programmatic solution, then I'd probably introduce an external file that would point at entries that didn't follow the programmatic default and provide either a clue or explicit sorting key -- think RDF resources (though I'm partial to Topic Maps). Perhaps verbose, but if a machine can't figure it out automatically, what can you do? 
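
A rough sketch of that idea in Python, with the external file reduced
to a dict for brevity. The function name, the dict and the sample
readings are all assumptions made for illustration, not an existing
Publican feature.

```python
def sort_with_overrides(terms, sort_keys):
    """Sort terms, using an explicit key from sort_keys where one is given."""
    def key(term):
        # Fall back to the term itself when no explicit key was supplied.
        return sort_keys.get(term, term).casefold()
    return sorted(terms, key=key)


terms = ["東京", "かな", "大阪"]
# Explicit keys only for the entries the default ordering cannot handle.
sort_keys = {"東京": "とうきょう", "大阪": "おおさか"}
print(sort_with_overrides(terms, sort_keys))
```

Only the entries that need help get a key, which keeps the external
file small, though someone still has to write those keys by hand.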

Actually, I'd assumed this in the solution because I'm thinking about non-alphabetic sorting needs, like the order of introduction of terms, perhaps on a per-topic basis (and yes, enabling solutions in forms other than print). 

thanks -- 

Fred 

----- Original Message -----

> This is a good explanation of the issue
> https://www.redhat.com/archives/publican-list/2010-May/msg00025.html

> Cheers, Jeff.

> --
> "Reply All" why you shouldn't use it:
> http://www.emailreplies.com/#12replytoall


From Norman at dunbar-it.co.uk  Mon Jan 30 15:00:59 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Mon, 30 Jan 2012 15:00:59 +0000
Subject: [publican-list] Dbfo keep-together instructions stripped out
 by	publican?
In-Reply-To: <4F25D643.1010501@redhat.com>
References: 	<4F226C69.9080005@dunbar-it.co.uk>
	<4F25D643.1010501@redhat.com>
Message-ID: <4F26B0AB.40106@dunbar-it.co.uk>

Hi Jeff,


On 29/01/12 23:29, Jeff Fearn wrote:
> Hi Norm, have you rebuilt Publican from the SVN repo? We have done a lot
> of work tweaking the CSS file used to create PDFs with wkhtmltopdf.
Well, I hadn't built it from source, but I've just spent the afternoon 
doing just that. I'm not a Perl user at all, so it was "fun".

I now know about cpan and after much wailing and gnashing of teeth, I 
managed to get a "Build" executable created and from there, Publican 2.9 
installed.

The initial ./Build test failed to create the pdf files, but 
re-installing wkhtmltopdf resolved those issues. ./Build install (as 
root) worked fine and I now have 2.9 running.

You have indeed been doing a huge amount of work on the CSS as the 
finished pdfs now look much better. But unfortunately, still fraught 
with problems for my situation anyway:

 is formatted on as few lines as possible. I have a 5 line 
 that renders onto one line and another 40 line one that renders 
as a huge mono-block of text. :-(

Tables have no borders between rows etc. I like the fancy bi-coloured 
rows though, unfortunately I need the fully-bordered version. I assume 
the CSS could be tweaked in css/overrides.css to fix this. (Note to 
self, have to learn CSS now!)

 is no longer rendered in a mono-spaced font. The nice 
background and borders have also gone. Again, I assume tweaking the CSS 
will help restore the old ways.

Images are no longer centred/centered where specified.

Graphical admonitions seem to have lost all their quality. The title-bar 
is rendered in a two tone colour rather than a single one. The Note 
image, for example, is very fuzzy indeed.

Tables split over pages again - even when I've specified that they 
shouldn't. There are no headings for the continued tables as there were 
before.

Indices - well, my book has 4 of them. Three are specialised indices and 
the fourth is the main (default) index. Every one of the four renders 
exactly the same. It appears that the index type is being ignored. This 
could be in the initial conversion to HTML and I will check, just in case.

I'm a bit suspicious of this as the Publican User Guide renders 
perfectly in pdf.

Unfortunately, the problem with a lack of page numbers in the pdf 
indices is still there. HTML should use section headings and PDF should 
use page numbers, but now both use section headings.

The cover page image is now present and correct.

Paging is better now. There are page breaks at the top of each Part, 
Section and Chapter, as desired.

It's looking a whole lot better now, but unfortunately, there's still a 
lot of work required. Wish I could help a bit more. :-(

Appreciate all the work being done with Publican though - thanks a huge 
amount!


Cheers,
Norm.


>
> svn co http://svn.fedorahosted.org/svn/publican/branches/publican-2x
>
> These changes, and hopefully a few more, will be in Publican 2.9, due
> out ... hopefully soon O_O
>
> Cheers, Jeff.
>
> On 01/27/2012 07:20 PM, Norman Dunbar wrote:
>> Morning Jared,
>>
>> Ok, I tried wkhtmltopdf after installing the QT stuff needed by the two
>> rpms you linked me to.
>>
>> Good points:
>>
>> It's very fast indeed!
>>
>> The output format appears quite nice, as you say, fantastic looking
>> documents.
>>
>>
>> Bad points (for me anyway):
>>
>> The front cover image is missing and I have a lovely set of scroll bars
>> where it should be. :-(
>>
>> Paging is ruined - nothing throws to a new page any more - sections,
>> chapters, parts etc, all start just below where the last "bit" finished.
>>
>> Every part, chapter and section has a table of contents present, only
>> the book itself should have one.
>>
>> Every single table now splits over a page, but even worse, using FOP I
>> did get a copy of the table headers on the continuation. Not any more.
>> :-(
>>
>> I presume the above is due to the initial conversion from XML to HTML
>> being an html-single - there are no page breaks - so tables etc don't
>> split in the html, but when converted to pdf, oh boy!
>>
>> But the worst thing of all for a printed document, the indexing. I have
>> 4 (yes, overkill perhaps, but that's how it is!) different indices and
>> instead of having page numbers, they have the section header instead.
>>
>> I'm afraid it's not for me - yet - as it's not producing anything like a
>> decent, printable pdf document. Sorry.
>>
>>
>> Appreciate you taking the time to remind me of this utility, but it's
>> not quite ready for mainstream yet - at least, as far as my book is
>> concerned.
>>
>> Thanks.
>>
>>
>> Cheers,
>> Norm.
>>
>
>


-- 
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767



From Norman at dunbar-it.co.uk  Mon Jan 30 15:41:56 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Mon, 30 Jan 2012 15:41:56 +0000
Subject: [publican-list] Dbfo keep-together instructions stripped out
 by	publican?
In-Reply-To: <4F25D643.1010501@redhat.com>
References: 	<4F226C69.9080005@dunbar-it.co.uk>
	<4F25D643.1010501@redhat.com>
Message-ID: <4F26BA44.5000000@dunbar-it.co.uk>

Hi Jeff,

On 29/01/12 23:29, Jeff Fearn wrote:
> Hi Norm, have you rebuilt Publican from the SVN repo? We have done a lot
> of work tweaking the CSS file used to create PDFs with wkhtmltopdf.

One other thing I've noticed about 2.9, when I deinstall the wkhtmltopdf 
RPMs again, to get back to using FOP for my documents, running ./Build 
test again fails to create the User Guide PDF and fails 2/6 tests.

Is there any reason why the User Guide for 2.9 cannot be created in PDF 
form by FOP?


Test Summary Report
-------------------
t/910.publican.Users_Guide.t (Wstat: 256 Tests: 5 Failed: 1)
   Failed test:  5
   Non-zero exit status: 1
t/pod-coverage.t            (Wstat: 256 Tests: 9 Failed: 1)
   Failed test:  7
   Non-zero exit status: 1
Files=10, Tests=68, 141 wallclock secs ( 0.25 usr  0.18 sys + 119.40 
cusr  8.80 csys = 128.63 CPU)
Result: FAIL
Failed 2/10 test programs. 2/68 subtests failed.

I did see this whizz up the screen:

FOP error, PDF generation failed. Check log for details.

I can't seem to find a logfile though.



Cheers,
Norm.



-- 
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767



From jfearn at redhat.com  Tue Jan 31 02:19:48 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Tue, 31 Jan 2012 12:19:48 +1000
Subject: [publican-list] Publican translations now on Zanata.org!
Message-ID: <4F274FC4.5030206@redhat.com>

Hi Everybody! Just thought I'd drop a note to let you all know that the 
Publican translations are now hosted on https://translate.zanata.org. So 
hopefully it will be easier for people to help with translations now.

For full details check out the Publican wiki: 
https://fedorahosted.org/publican/wiki/WikiStart#Translation

Cheers, Jeff.

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From jfearn at redhat.com  Tue Jan 31 03:15:56 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Tue, 31 Jan 2012 13:15:56 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>
References: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>
Message-ID: <4F275CEC.8070506@redhat.com>

On 01/31/2012 12:16 AM, Fred Dalrymple wrote:
> Thanks for the pointer (I didn't look far enough back in the archives).
>
> In general, if there is no automated programmatic solution, then I'd probably introduce an external file that would point at entries that didn't follow the programmatic default and provide either a clue or explicit sorting key -- think RDF resources (though I'm partial to Topic Maps).

Requiring 1 writer to order their list is a bit spendthrift, requiring 
50 translators to order them just blew your budget. Doing it manually 
just does not scale.

Don't ask "how can I do this" or "how can I do this in $language", ask 
"how can I do this in 50 languages?" If you come up with 'manually' then 
you didn't multiply effort by 50 properly or you aren't holding yourself 
accountable for how your choices affect other people.

> Perhaps verbose, but if a machine can't figure it out automatically, what can you do?

It can depend on what you are contractually required to do. Like say if 
you were contractually required to ensure a translation had the same 
level of presentation and editorship as the source language, then doing 
such things in an un-automated fashion might expose you to very 
expensive repercussions.

> Actually, I'd assumed this in the solution because I'm thinking about non-alphabetic sorting needs, like the order of introduction of terms, perhaps on a per-topic basis (and yes, enabling solutions in forms other than print).

DocBook already has attributes to allow this kind of sorting for some 
lists, it's useless from a translation perspective. We did consider 
modifying the translation tools to expose these attributes to the 
translators, but the issue of scale raised its ugly head and we 
realised we couldn't afford the impact on translation time.

Cheers, Jeff.

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall



From jwulf at redhat.com  Tue Jan 31 03:44:28 2012
From: jwulf at redhat.com (Joshua J Wulf)
Date: Tue, 31 Jan 2012 13:44:28 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>
References: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>
Message-ID: <4F27639C.4020702@redhat.com>

On 01/31/2012 12:16 AM, Fred Dalrymple wrote:
> Thanks for the pointer (I didn't look far enough back in the archives).
>
> In general, if there is no automated programmatic solution, then I'd 
> probably introduce an external file that would point at entries that 
> didn't follow the programmatic default and provide either a clue or 
> explicit sorting key -- think RDF resources (though I'm partial to 
> Topic Maps).  Perhaps verbose, but if a machine can't figure it out 
> automatically, what can you do?
>
> Actually, I'd assumed this in the solution because I'm thinking about 
> non-alphabetic sorting needs, like the order of introduction of terms, 
> perhaps on a per-topic basis (and yes, enabling solutions in forms 
> other than print).
Fred, you've really piqued my interest now. What approaches might you 
take to order the introduction of terms using RDF?

- Josh

From dlackey at redhat.com  Tue Jan 31 04:50:42 2012
From: dlackey at redhat.com (E Deon Lackey)
Date: Mon, 30 Jan 2012 22:50:42 -0600
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F275CEC.8070506@redhat.com>
References: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>
	<4F275CEC.8070506@redhat.com>
Message-ID: <4F277322.8020102@redhat.com>

On 1/30/2012 9:15 PM, Jeff Fearn wrote:
> On 01/31/2012 12:16 AM, Fred Dalrymple wrote:
>> Thanks for the pointer (I didn't look far enough back in the archives).
>>
>> In general, if there is no automated programmatic solution, then I'd 
>> probably introduce an external file that would point at entries that 
>> didn't follow the programmatic default and provide either a clue or 
>> explicit sorting key -- think RDF resources (though I'm partial to 
>> Topic Maps).
>
> Requiring 1 writer to order their list is a bit spendthrift, requiring 
> 50 translators to order them just blew your budget. Doing it manually 
> just does not scale.

Would this be necessary? Would it be possible to be automatic for 
everything but the Japanese kana languages, and then collate that one 
set manually? (Fred, I demand your full engineering design now! :) )


>
> Don't ask "how can I do this" or "how can I do this in $language", ask 
> "how can I do this in 50 languages?" If you come up with 'manually' 
> then you didn't multiply effort by 50 properly or you aren't holding 
> yourself accountable for how your choices affect other people.

On the other hand, glossaries very seldom change, much less frequently 
than any other part of the book. That stability has to offset at least 
some of the cost, especially when compared to the benefits to users. (I 
have constant requests for glossaries from both management and GSS for 
JON, for example.)


>
>> Perhaps verbose, but if a machine can't figure it out automatically, 
>> what can you do?
>
> It can depend on what you are contractually required to do. 

Do we have any contracts that require that level of parity? For my own 
doc sets, I have real and known requests to include glossaries. So, I 
know that having a glossary matters in real life. My question is whether 
the concern about contracts is a hypothetical or a real consideration. 
That is something to balance as well -- a real customer and support 
request v. a hypothetical customer requirement. (Not that a hypothetical 
consideration can't also be the deciding factor -- it may be only a 
possibility but an important enough possibility to outweigh a real but 
relatively insignificant convenience.)



From jwulf at redhat.com  Tue Jan 31 04:58:44 2012
From: jwulf at redhat.com (Joshua J Wulf)
Date: Tue, 31 Jan 2012 14:58:44 +1000
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F277322.8020102@redhat.com>
References: <774e5bf4-16cb-42f8-92cb-3e4501fcac2a@zmail11.collab.prod.int.phx2.redhat.com>
	<4F275CEC.8070506@redhat.com> <4F277322.8020102@redhat.com>
Message-ID: <4F277504.5090609@redhat.com>

On 01/31/2012 02:50 PM, E Deon Lackey wrote:
> On 1/30/2012 9:15 PM, Jeff Fearn wrote:
>> On 01/31/2012 12:16 AM, Fred Dalrymple wrote:
>>> Thanks for the pointer (I didn't look far enough back in the archives).
>>>
>>> In general, if there is no automated programmatic solution, then I'd 
>>> probably introduce an external file that would point at entries that 
>>> didn't follow the programmatic default and provide either a clue or 
>>> explicit sorting key -- think RDF resources (though I'm partial to 
>>> Topic Maps).
>>
>> Requiring 1 writer to order their list is a bit spendthrift, 
>> requiring 50 translators to order them just blew your budget. Doing 
>> it manually just does not scale.
>
> Would this be necessary? Would it be possible to be automatic for 
> everything but the Japanese kana languages, and then collate that one 
> set manually? (Fred, I demand your full engineering design now! :) )
>
>
>>
>> Don't ask "how can I do this" or "how can I do this in $language", 
>> ask "how can I do this in 50 languages?" If you come up with 
>> 'manually' then you didn't multiply effort by 50 properly or you 
>> aren't holding yourself accountable for how your choices affect other 
>> people.
>
> On the other hand, glossaries very seldom change, much less frequently 
> than any other part of the book. That stability has to offset at least 
> some of the cost, especially when compared to the benefits to users. 
> (I have constant requests for glossaries from both management and GSS 
> for JON, for example.)
>
>
>>
>>> Perhaps verbose, but if a machine can't figure it out automatically, 
>>> what can you do?
>>
>> It can depend on what you are contractually required to do. 
>
> Do we have any contracts that require that level of parity? For my own 
> doc sets, I have real and known requests to include glossaries. So, I 
> know that having a glossary matters in real life. My question is 
> whether the concern about contracts is a hypothetical or a real 
> consideration. 

Let's just say that Japanese companies are frequently very particular 
that their language is not treated as a second-class citizen.


> That is something to balance as well -- a real customer and support 
> request v. a hypothetical customer requirement. (Not that a 
> hypothetical consideration can't also be the deciding factor -- it may 
> be only a possibility but an important enough possibility to outweigh 
> a real but relatively insignificant convenience.)
>
> _______________________________________________
> publican-list mailing list
> publican-list at redhat.com
> https://www.redhat.com/mailman/listinfo/publican-list
> Wiki: https://fedorahosted.org/publican



From peter.moulder at monash.edu  Tue Jan 31 07:45:12 2012
From: peter.moulder at monash.edu (Peter Moulder)
Date: Tue, 31 Jan 2012 18:45:12 +1100
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F263FA5.1070804@redhat.com>
References: 
	<4F2638AA.80803@redhat.com> <4F263A93.6070001@redhat.com>
	<4F263FA5.1070804@redhat.com>
Message-ID: <20120131074512.GA14761@bowman.infotech.monash.edu.au>

In two messages around Jan 30, 2012, Jeff Fearn wrote:

> Collating the three kana scripts of Japanese properly is the Mt
> Everest of this challenge.
>
> [...]
> 
> Collating each of them separately is easy, but it's perfectly valid
> in Japanese to mix them so you have to be able to collate all of
> them together. AFAIK no one has done that in any open source
> project.

Apparently, the mapping from a string of Kanji to its pronunciation
(ordering) isn't even a deterministic operation, at least for proper
names.

(The example I came across is that the woman's name ?? ?? has at
least four possible readings of the family name times two possible
readings of the given name.)

Thus, the solution would have to involve supplying pronunciations somehow
for at least some glossary entries.

Once pronunciations (in Katakana or Hiragana) are available for all the
glossary entries, the Lingua::JA::Sort::JIS perl module can be used to do
the JIS X 4061:1996 collation among them.
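
As a rough illustration of sorting by supplied pronunciations, here is a
minimal Python sketch. It is not JIS X 4061:1996 and does not use
Lingua::JA::Sort::JIS; it only folds Katakana onto Hiragana and compares
code points, which ignores the standard's rules for voiced marks, small
kana and the long-vowel mark. The sample entries and function names are
invented for the example.

```python
# Sketch only: sort glossary entries by a supplied kana reading.
# This is NOT JIS X 4061:1996 collation; it is a crude approximation.

KATAKANA_START = 0x30A1  # first Katakana letter (small a)
KATAKANA_END = 0x30F6    # last Katakana letter with a Hiragana twin
KANA_OFFSET = 0x60       # distance between the Katakana and Hiragana blocks


def to_hiragana(text):
    """Map each Katakana letter to the corresponding Hiragana letter."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_START <= ord(ch) <= KATAKANA_END
        else ch
        for ch in text
    )


def sort_by_reading(entries):
    """entries: list of (term, reading); reading is Hiragana or Katakana."""
    return sorted(entries, key=lambda entry: to_hiragana(entry[1]))


# Hypothetical glossary: each term carries a hand-supplied reading.
glossary = [
    ("東京", "トウキョウ"),
    ("大阪", "おおさか"),
    ("京都", "きょうと"),
    ("カーネル", "カーネル"),
]

for term, reading in sort_by_reading(glossary):
    print(term, reading)
```

The point of the sketch is only that the sort key is the reading, not the
term itself, so mixed Kanji, Hiragana and Katakana entries land in one
sequence.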

Really, the problem would benefit from Japanese input on how the problem
is usually solved.  The Japanese translators might be able to help there,
at least as to how they supply pronunciations to other computer software
that needs to know sorting order.


(Btw, if anyone was going to try looking up JIS X 4061:1996, then
 unfortunately it looks like it's only available for a fee and in
 Japanese:

   http://www.webstore.jsa.or.jp/webstore/Com/FlowControl.jsp?lang=en&bunsyoId=JIS+X+4061%3A1996&dantaiCd=JIS&status=1&pageNo=6

 However, I'm told that the Japanese wikipedia article

   http://ja.wikipedia.org/wiki/日本語文字列照合順番

 has an overview.  The google translation of that page is challenging to
 read, though:

   http://translate.google.com/translate?sl=ja&tl=en&u=http%3A%2F%2Fja.wikipedia.org%2Fwiki%2F%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%96%87%E5%AD%97%E5%88%97%E7%85%A7%E5%90%88%E9%A0%86%E7%95%AA

 .)


pjrm.



From Norman at dunbar-it.co.uk  Tue Jan 31 08:23:10 2012
From: Norman at dunbar-it.co.uk (Norman Dunbar)
Date: Tue, 31 Jan 2012 08:23:10 +0000
Subject: [publican-list] Publican 2.9 slight foible
Message-ID: <4F27A4EE.4020108@dunbar-it.co.uk>

On Jeff's advice I compiled Publican 2.9 from source yesterday (Fedora 
16, 64 bit) and was pleased with the results (when creating pdfs using 
FOP though) and installed it.

I've noticed that generating a pdf using a brand has replaced (in the 
generated pdf) my brand's title logo with the Publican's "red book" image.

My publican.cfg definitely specified my own brand.


Cheers,
Norm.

-- 
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767



From fdalrymple at redhat.com  Tue Jan 31 14:33:50 2012
From: fdalrymple at redhat.com (Fred Dalrymple)
Date: Tue, 31 Jan 2012 09:33:50 -0500 (EST)
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F275CEC.8070506@redhat.com>
Message-ID: 

----- Original Message -----

> On 01/31/2012 12:16 AM, Fred Dalrymple wrote:
> > Thanks for the pointer (I didn't look far enough back in the
> > archives).
> >
> > In general, if there is no automated programmatic solution, then
> > I'd probably introduce an external file that would point at
> > entries that didn't follow the programmatic default and provide
> > either a clue or explicit sorting key -- think RDF resources
> > (though I'm partial to Topic Maps).

> Requiring 1 writer to order their list is a bit spendthrift, requiring
> 50 translators to order them just blew your budget. Doing it manually
> just does not scale.

Hi Jeff -- 

Sorry if I wasn't clear enough. It would be a hybrid solution, where the
default would be automatic collation. The external annotations would
only override those cases that didn't sort correctly via the automated
solution. I'm all for going as far as possible with automation, but if
there are cases that are non-deterministic (as Peter demonstrated), then
this approach requires the least effort from writers -- unless one gives
up on the entire project :).

Yes, DocBook has anticipated some cases, but the approach I'm thinking of
(1) potentially works across any elements (including those that aren't
lists), and (2) doesn't require "intrusive" markup of the original
source. An external file is cleaner, and allows multiple annotation sets
for different uses. That's not to say that I'm committed to implementing
the complete set of features and flexibility in the prototype, just
showing a viable solution for glossary entries.
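
A minimal Python sketch of the hybrid idea, under stated assumptions: the
automatic default here is just a case-insensitive comparison, and the
external annotation file is imagined as a JSON map from entry id to an
explicit sort key. The ids, the file format and the function names are
all hypothetical; nothing here is an existing Publican or DocBook
feature.

```python
import json

# Hypothetical external annotation file: entry id -> explicit sort key.
# Only entries that the automatic default gets wrong need to appear here.
OVERRIDES_JSON = """
{
  "gloss-the-kernel": "kernel"
}
"""


def default_key(term):
    """Automatic default: case-insensitive comparison of the term itself."""
    return term.casefold()


def sort_entries(entries, overrides):
    """entries: list of (entry_id, term); overrides: dict id -> sort key."""
    return sorted(
        entries,
        key=lambda entry: overrides.get(entry[0], default_key(entry[1])),
    )


overrides = json.loads(OVERRIDES_JSON)
entries = [
    ("gloss-zone", "Zone"),
    ("gloss-the-kernel", "The Kernel"),
    ("gloss-brand", "brand"),
]

for entry_id, term in sort_entries(entries, overrides):
    print(term)
```

Keeping the overrides outside the source means the XML stays untouched,
and a different annotation set could be swapped in per language or per
output.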

Fred 

> Don't ask "how can I do this" or "how can I do this in $language", ask
> "how can I do this in 50 languages?" If you come up with 'manually' then
> you didn't multiply effort by 50 properly or you aren't holding yourself
> accountable for how your choices affect other people.

> > Perhaps verbose, but if a machine can't figure it out
> > automatically, what can you do?

> It can depend on what you are contractually required to do. Like say if
> you were contractually required to ensure a translation had the same
> level of presentation and editorship as the source language, then doing
> such things in an un-automated fashion might expose you to very
> expensive repercussions.

> > Actually, I'd assumed this in the solution because I'm thinking
> > about non-alphabetic sorting needs, like the order of introduction
> > of terms, perhaps on a per-topic basis (and yes, enabling
> > solutions in forms other than print).

> DocBook already has attributes to allow this kind of sorting for some
> lists, but it's useless from a translation perspective. We did consider
> modifying the translation tools to expose these attributes to the
> translators, but the issue of scale raised its ugly head and we
> realised we couldn't afford the impact on translation time.

> Cheers, Jeff.

> --
> "Reply All" why you shouldn't use it:
> http://www.emailreplies.com/#12replytoall


From fdalrymple at redhat.com  Tue Jan 31 15:03:46 2012
From: fdalrymple at redhat.com (Fred Dalrymple)
Date: Tue, 31 Jan 2012 10:03:46 -0500 (EST)
Subject: [publican-list] sortable lists, esp. glossaries
In-Reply-To: <4F27639C.4020702@redhat.com>
Message-ID: 

Hi Josh -- 

Are you familiar with Topic Maps? 

http://www.ontopia.net/topicmaps/materials/tao.html 
http://en.wikipedia.org/wiki/Topic_Maps 
http://www.topicmaps.org/xtm/ 

The original requirements for Topic Maps were specifically about dealing
with indexes and glossaries -- mainly regarding issues across sets /
families of documentation, such as those we had at the Open Software
Foundation two decades ago (gulp). The standard became much broader by
becoming more general, but the core capabilities include what is
required by this project.

RDF and Topic Maps don't have exactly the same internal model, but they
are probably close enough (at least for this project) for Topic Map
documents to be translated into RDF form. I'm fairly agnostic on syntax,
but I understand how to express things much better in Topic Maps than in
RDF, so that's my crutch.

Fred 

----- Original Message -----

> On 01/31/2012 12:16 AM, Fred Dalrymple wrote:
> > Thanks for the pointer (I didn't look far enough back in the
> > archives).
> >
> > In general, if there is no automated programmatic solution, then I'd
> > probably introduce an external file that would point at entries that
> > didn't follow the programmatic default and provide either a clue or
> > explicit sorting key -- think RDF resources (though I'm partial to
> > Topic Maps). Perhaps verbose, but if a machine can't figure it out
> > automatically, what can you do?
> >
> > Actually, I'd assumed this in the solution because I'm thinking about
> > non-alphabetic sorting needs, like the order of introduction of
> > terms, perhaps on a per-topic basis (and yes, enabling solutions in
> > forms other than print).
>
> Fred, you've really piqued my interest now. What approaches might you
> take to order the introduction of terms using RDF?

> - Josh

From jfearn at redhat.com  Tue Jan 31 22:32:01 2012
From: jfearn at redhat.com (Jeff Fearn)
Date: Wed, 01 Feb 2012 08:32:01 +1000
Subject: [publican-list] Publican 2.9 slight foible
In-Reply-To: <4F27A4EE.4020108@dunbar-it.co.uk>
References: <4F27A4EE.4020108@dunbar-it.co.uk>
Message-ID: <4F286BE1.4080107@redhat.com>

On 01/31/2012 06:23 PM, Norman Dunbar wrote:
> On Jeff's advice I compiled Publican 2.9 from source yesterday (Fedora
> 16, 64 bit) and was pleased with the results (when creating pdfs using
> FOP though) and installed it.
>
> I've noticed that generating a pdf using a brand has replaced (in the
> generated pdf) my brand's title logo with the Publican's "red book" image.
>
> My publican.cfg definitely specified my own brand.
>
>
> Cheers,
> Norm.
>

Hi Norman, the changes we are making for PDFs require changing the
brands. Rudi has scheduled writing up a short how-to on exactly what has
to change in brands, but it hasn't been started yet :(

Cheers, Jeff.

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall