rpms/mod_line_edit/EL-5 index.html, NONE, 1.1 mod_line_edit.c, NONE, 1.1 mod_line_edit.conf, NONE, 1.1 mod_line_edit.spec, NONE, 1.1

Rob Myers (rmyers) fedora-extras-commits at redhat.com
Tue Jan 29 11:47:44 UTC 2008


Author: rmyers

Update of /cvs/extras/rpms/mod_line_edit/EL-5
In directory cvs-int.fedora.redhat.com:/tmp/cvs-serv18096/EL-5

Added Files:
	index.html mod_line_edit.c mod_line_edit.conf 
	mod_line_edit.spec 
Log Message:
Initial import.



--- NEW FILE index.html ---
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en"><head>
<title>mod_line_edit</title>
<style type="text/css">
@import url(/index.css) ;
</style>
</head><body>
<div id="apache">
<h1>mod_line_edit</h1>
<p>mod_line_edit is a general-purpose filter for text documents.
It operates as a simple on-the-fly line editor, applying
search-and-replace rules defined in a configuration or .htaccess
file.</p>
<p>Unlike most of WebÞing's filter modules, it is not markup-aware,
so it is not an optimal choice for processing HTML or XML,
though it may nevertheless be used with caution (and may be far better
than semi-markup-aware options such as mod_layout).
</p><p>
For non-markup document types such as plain text, and non-markup
Web documents such as Javascript or Stylesheets, it is the best
available option in the absence of a filter that parses any
relevant document structures.</p>
<p>mod_line_edit is written for performance and reliability,
and should scale without problems as document size grows.
mod_line_edit is fully compatible with Apache 2.0 and 2.2,
and all operating systems and MPMs.</p>


<h2>Usage</h2>
<p><code>LoadModule line_edit_module modules/mod_line_edit.so</code></p>
<p>The module implements a single output filter named <code>line-editor</code>.
Insert it in the filter chain using the standard filter directives,
e.g. to rewrite all text documents:</p>
<pre><code>FilterProvider textedit line-editor resp=Content-Type $text/
FilterChain textedit</code></pre>
<p>or, for backward compatibility with Apache 2.0:</p>
<pre><code>AddOutputFilter	line-editor	.txt .css .js</code></pre>
<p>or</p>
<pre><code>SetOutputFilter	line-editor
SetEnv	LineEdit "text/plain;text/css;text/javascript"</code></pre>
<h3>Text Editing</h3>
<p>The <code>LERewriteRule</code> directive defines search-and-replace rules.
Both simple text and regular expression search and replace are supported.
</p>
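<p>As an illustrative sketch (the host names and strings below are
hypothetical, not taken from any real configuration), a literal rule and a
regular expression rule might look like this:</p>
<pre><code>LERewriteRule staging.example.com www.example.com
LERewriteRule "Copyright 200[0-9]" "Copyright 2008" R</code></pre>
<p>The first rule replaces every occurrence of the literal string; the second
is treated as a regular expression because of the <code>R</code> flag.</p>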
<h3>Line Modes</h3>
<p>mod_line_edit normally applies its edits line-by-line.  This avoids
the risk of missing a pattern to be matched because it is spread over
more than one chunk of data when it reaches the parser, without having
to resort to the performance and scalability limitations of loading
an entire document into memory.</p>
<h3>The LineEdit Environment Variable</h3>
<p>If the <tt>LineEdit</tt> environment variable is set, it controls
Content Types that will be filtered.  This enables it to filter
selectively on content-type in a proxy with Apache 2.0.  Just set
the variable to a list of content types, and mod_line_edit will
leave other types untouched.</p>
<p>Example: to filter plain text and javascript, but leave other types alone:
<br /><code>SetEnv LineEdit "text/plain;text/javascript"</code></p>
<p>This also works with Apache 2.2, but is of course unnecessary there.</p>
<h3>Directives</h3>
<dl>
<dt>LELineEnd</dt>
<dd>
<p><code>LELineEnd UNIX|MAC|DOS|NONE|ANY|CUSTOM [char]</code></p>
<p>This tells the parser what characters in the text to interpret
as line-endings:</p>
<ul>
<li><strong>UNIX</strong> - the line end is the traditional Unix <q>\n</q>.</li>
<li><strong>MAC</strong> - the line end is the old MacOS <q>\r</q>.
Note that modern MacOSX is Unix-based.</li>
<li><strong>DOS</strong> - the line end is the MSDOS and Windows sequence
<q>\r\n</q>.</li>
<li><strong>ANY</strong> - Any of the above will be interpreted as a line
break (with <q>\r\n</q> treated as one, not two, linebreaks).
This is the default.</li>
<li><strong>NONE</strong> - This will treat the entire document as a
single line, enabling multi-line search-and-replace.  Note that this
will incur a <strong>substantial performance penalty</strong> for
larger documents, as it requires an entire document to be loaded into
memory and processed in a single operation.</li>
<li><strong>CUSTOM</strong> - This enables you to partition the input
by splitting on some character other than a conventional line end.</li>
</ul>
<p>When you use <code>LELineEnd Custom</code>, you must specify a second
single-character argument, which is the character to split the input on.
For all other <code>LELineEnd</code> options, any second argument will
be ignored.</p>
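<p>For example, either of the following alternatives could be used (a sketch
only; the choice of <q>;</q> as a separator is an assumption about the content
being filtered, e.g. minified script or stylesheet text with few line ends):</p>
<pre><code># Split the input on Unix newlines only
LELineEnd UNIX

# Split the input on semicolons instead of line ends
LELineEnd CUSTOM ;</code></pre>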
</dd>
<dt>LERewriteRule</dt>
<dd>
<p><code>LERewriteRule from-pattern to-pattern [flags]</code></p>
<p>This directive defines a search-and-replace edit rule that will
be applied to the text.</p>
<ul>
<li>The search string <strong>from-pattern</strong> may be a literal
string or a regular expression.</li>
<li>The replacement string <strong>to-pattern</strong> may be a literal
string, or may include backreferences $1-$9 in the case of a regular
expression match.</li>
<li>The optional <strong>Flags</strong> argument may contain any
combination of:
<ul class="table">
<li><strong>R</strong> - This rule is a regular expression
search-and-replace.</li>
<li><strong>i</strong> - Use case-insensitive matching on
<strong>from-pattern</strong>.</li>
<li><strong>m</strong> - Support multi-line regexp matching
(in conjunction with the <strong>R</strong> flag and <code>LELineEnd
None</code>).</li>
<li><strong>V</strong> - Support environment variables in
<strong>to-pattern</strong>.  The string <tt>${var}</tt> will be
replaced with the value of the environment variable <tt>var</tt>.</li>
</ul>
</li>
</ul>
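<p>The flags might be combined as in the following sketch.  All strings and
the <tt>SITE_NAME</tt> variable are hypothetical.  The backreference example
assumes Apache 2.2, where the <code>(...)</code> grouping syntax is available;
the variable example assumes that no operating-system environment variable
called <tt>SITE_NAME</tt> is set when the server starts, since Apache may
expand such a reference itself while reading the configuration.</p>
<pre><code># Case-insensitive literal match
LERewriteRule colour color i

# Regular expression with backreferences
LERewriteRule "([0-9]{4})-([0-9]{2})-([0-9]{2})" "$3/$2/$1" R

# Interpolate a per-request environment variable into the replacement
SetEnv SITE_NAME "Example Corp"
LERewriteRule @SITE@ "${SITE_NAME}" V</code></pre>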
</dd>
</dl>

<h2>Availability</h2>
<p><a href="mod_line_edit.c"
>mod_line_edit.c source code</a> is available under the
<a href="http://www.fsf.org/licenses/gpl.html">GNU
General Public License (GPL)</a>.  As with other open source modules,
we can consider alternative licenses by request.</p>
<p><a href="/registration.html">Registered Users</a>
may request <strong>binaries</strong> for any available platform.</p>
<h3>Version 0.9.2</h3>
<p>Dec. 26<sup>th</sup> 2005: the first public release of
mod_line_edit is version 0.9.2.</p>
<h3>Version 1.0</h3>
<p>June 12<sup>th</sup> 2006: added capability to interpolate environment
variables in rewrite rules.  Bumping version to 1.0 because 0.9 has
proved stable over six months in the wild.</p>

</div>
<div id="navbar"><a class="internal" href="./" title="Up">Up</a>
*
<a class="internal" href="/" title="WebThing Apache Centre">Home</a>
*
<a class="internal" href="/contact.html" title="Contact WebThing">Contact</a>
*
<a class="external" href="http://www.webthing.com/" title="WebThing Ltd">WebÞing</a>
*
<a class="external" href="http://www.apache.org/" title="Apache Software Foundation">Apache</a></div></body></html>


--- NEW FILE mod_line_edit.c ---
/********************************************************************
  Copyright (c) 2005-6, WebThing Ltd
  Author: Nick Kew <nick at webthing.com>

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

*********************************************************************/


#define LINE_EDIT_VERSION "1.0.0"

#include <ctype.h>

#include <httpd.h>
#include <http_config.h>
#include <util_filter.h>

#include <apr_strmatch.h>
#include <apr_strings.h>

#ifdef AP_REG_ICASE
#define APACHE21
#else
#define APACHE20
#endif

#ifdef APACHE20
#define ap_regex_t regex_t
#define ap_regmatch_t regmatch_t
#define AP_REG_EXTENDED REG_EXTENDED
#define AP_REG_ICASE REG_ICASE
#define AP_REG_NOSUB REG_NOSUB
#define AP_REG_NEWLINE REG_NEWLINE

/* we don't have protocol handling in 2.0 */
#define ap_register_output_filter_protocol(a,b,c,d,e) \
	ap_register_output_filter(a,b,c,d)
#endif

#define M_REGEX		0x01
#define M_NOCASE	0x08
#define M_NEWLINE	0x10
#define M_ENV		0x20

typedef struct {
  union {
    const apr_strmatch_pattern* s;
    const ap_regex_t* r ;
  } from ;
  const char* to ;
  unsigned int flags ;
  unsigned int length ;
} rewriterule ;

typedef struct {
  enum {
	LINEEND_UNSET,
	LINEEND_ANY,
	LINEEND_UNIX,
	LINEEND_MAC,
	LINEEND_DOS,
	LINEEND_CUSTOM,
	LINEEND_NONE
  } lineend ;
  apr_array_header_t* rewriterules ;
  int lechar;
} line_edit_cfg ;

module AP_MODULE_DECLARE_DATA line_edit_module ;

static const char* const line_edit_filter_name = "line-editor" ;

typedef struct {
  apr_bucket_brigade* bbsave ;
  apr_pool_t* lpool ;
  apr_array_header_t* rewriterules ; /* make a copy if per-request
					interpolation is wanted */
} line_edit_ctx ;

static const char* interpolate_env(request_rec *r, const char *str) {
  /* Interpolate an env str in a configuration string
   * Syntax ${var} --> value_of(var)
   * Method: replace one var, and recurse on remainder of string
   * Nothing clever here, and crap like nested vars may do silly things
   * but we'll at least avoid sending the unwary into a loop
   */
  const char *start;
  const char *end;
  const char *var;
  const char *val;
  const char *firstpart;

  start = ap_strstr(str, "${");
  if (start == NULL) {
    return str;
  }
  end = ap_strchr(start+2, '}');
  if (end == NULL) {
    return str;
  }
  /* OK, this is syntax we want to interpolate.  Is there such a var ? */
  var = apr_pstrndup(r->pool, start+2, end-(start+2));
  val = apr_table_get(r->subprocess_env, var);
  firstpart = apr_pstrndup(r->pool, str, (start-str));

  if (val == NULL) {
    return apr_pstrcat(r->pool, firstpart, interpolate_env(r, end+1), NULL);
  } else {
    return apr_pstrcat(r->pool, firstpart, val,
	interpolate_env(r, end+1), NULL);
  }
}
static apr_status_t line_edit_filter(ap_filter_t* f, apr_bucket_brigade* bb) {
  int i, j;
  unsigned int match ;
  unsigned int nmatch = 10 ;
  ap_regmatch_t pmatch[10] ;
  const char* bufp;
  const char* subs ;
  apr_size_t bytes ;
  apr_size_t fbytes ;
  apr_size_t offs ;
  const char* buf ;
  const char* le = NULL ;
  const char* le_n ;
  const char* le_r ;
  char* fbuf ;
  apr_bucket* b = APR_BRIGADE_FIRST(bb) ;
  apr_bucket* b1 ;
  int found = 0 ;
  apr_status_t rv ;

  apr_bucket_brigade* bbline ;
  line_edit_cfg* cfg
	= ap_get_module_config(f->r->per_dir_config, &line_edit_module) ;
  rewriterule* rules = (rewriterule*) cfg->rewriterules->elts ;
  rewriterule* newrule;

  line_edit_ctx* ctx = f->ctx ;
  if (ctx == NULL) {

    /* check env to see if we're wanted, to give basic control with 2.0 */
    buf = apr_table_get(f->r->subprocess_env, "LineEdit");
    if (buf && f->r->content_type) {
      char* lcbuf = apr_pstrdup(f->r->pool, buf) ;
      char* lctype = apr_pstrdup(f->r->pool, f->r->content_type) ;
      char* c ;

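      /* lower-case both strings; cast to unsigned char because passing a
       * negative char to the <ctype.h> functions is undefined behaviour
       */
      for (c = lcbuf; *c; ++c)
	*c = tolower((unsigned char)*c) ;

      for (c = lctype; *c; ++c) {
	if (*c == ';') {
	  /* drop any parameters such as "; charset=..." */
	  *c = 0 ;
	  break ;
	}
	*c = tolower((unsigned char)*c) ;
      }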

      if (!strstr(lcbuf, lctype)) {
	/* don't filter this content type */
	ap_filter_t* fnext = f->next ;
	ap_remove_output_filter(f) ;
	return ap_pass_brigade(fnext, bb) ;
      }
    }

    ctx = f->ctx = apr_palloc(f->r->pool, sizeof(line_edit_ctx)) ;
    ctx->bbsave = apr_brigade_create(f->r->pool, f->c->bucket_alloc) ;

    /* If we have any regex matches, we'll need to copy everything, so we
     * have null-terminated strings to parse.  That's a lot of memory if
     * we're streaming anything big.  So we'll use (and reuse) a local
     * subpool.  Fall back to the request pool if anything bad happens.
     */
    ctx->lpool = f->r->pool ;
    for (i = 0; i < cfg->rewriterules->nelts; ++i) {
      if ( rules[i].flags & M_REGEX ) {
        if (apr_pool_create(&ctx->lpool, f->r->pool) != APR_SUCCESS) {
	  ctx->lpool = f->r->pool ;
        }
        break ;
      }
    }
    /* If we have env interpolation, we'll need a private copy of
     * our rewrite rules with this request's env.  Otherwise we can
     * save processing time by using the original.
     *
     * If one ENV is found, we also have to copy all previous and
     * subsequent rules, even those with no interpolation.
     */
    ctx->rewriterules = cfg->rewriterules;
    for (i = 0; i < cfg->rewriterules->nelts; ++i) {
      found |= (rules[i].flags & M_ENV) ;
      if ( found ) {
	if (ctx->rewriterules == cfg->rewriterules) {
	  ctx->rewriterules = apr_array_make(f->r->pool,
		cfg->rewriterules->nelts, sizeof(rewriterule));
	  for (j = 0; j < i; ++j) {
            newrule = apr_array_push (((line_edit_ctx*)ctx)->rewriterules) ;
	    newrule->from = rules[j].from;
	    newrule->to = rules[j].to;
	    newrule->flags = rules[j].flags;
	    newrule->length = rules[j].length;
	  }
	}
	/* this rule needs to be interpolated */
        newrule = apr_array_push (((line_edit_ctx*)ctx)->rewriterules) ;
	newrule->from = rules[i].from;
	if (rules[i].flags & M_ENV) {
	  newrule->to = interpolate_env(f->r, rules[i].to);
	} else {
	  newrule->to = rules[i].to ;
	}
	newrule->flags = rules[i].flags;
	newrule->length = rules[i].length;
      }
    }
    /* for back-compatibility with Apache 2.0, set some protocol stuff */
    apr_table_unset(f->r->headers_out, "Content-Length") ;
    apr_table_unset(f->r->headers_out, "Content-MD5") ;
    apr_table_unset(f->r->headers_out, "Accept-Ranges") ;
  }
  /* by now our rules are in ctx->rewriterules */
  rules = (rewriterule*) ctx->rewriterules->elts ;

  /* bbline is what goes to the next filter,
   * so we (can) have a new one each time.
   */
  bbline = apr_brigade_create(f->r->pool, f->c->bucket_alloc) ;

  /* first ensure we have no mid-line breaks that might be in the
   * middle of a search string causing us to miss it!  At the same
   * time we split into lines to avoid pattern-matching over big
   * chunks of memory.
   */
  while ( b != APR_BRIGADE_SENTINEL(bb) ) {
    if ( !APR_BUCKET_IS_METADATA(b) ) {
      if ( apr_bucket_read(b, &buf, &bytes, APR_BLOCK_READ) == APR_SUCCESS ) {
	if ( bytes == 0 ) {
	  APR_BUCKET_REMOVE(b) ;
	} else while ( bytes > 0 ) {
	  switch (cfg->lineend) {

	  case LINEEND_UNIX:
	    le = memchr(buf, '\n', bytes) ;
	    break ;

	  case LINEEND_MAC:
	    le = memchr(buf, '\r', bytes) ;
	    break ;

	  case LINEEND_DOS:
	    /* Edge-case issue: if a \r\n spans buckets it'll get missed.
	     * Not a problem for present purposes, but would be an issue
	     * if we claimed to support pattern matching on the lineends.
	     */
	    found = 0 ;
	    le = memchr(buf+1, '\n', bytes-1) ;
	    while ( le && !found ) {
	      if ( le[-1] == '\r' ) {
	        found = 1 ;
	      } else {
	        /* search everything left after le, including the last byte */
	        le = memchr(le+1, '\n', bytes - (le+1 - buf)) ;
	      }
	    }
	    if ( !found )
	      le = 0 ;
	    break;

	  case LINEEND_ANY:
	  case LINEEND_UNSET:
	    /* Edge-case notabug: if a \r\n spans buckets it'll get seen as
	     * two line-ends.  It'll insert the \n as a one-byte bucket.
	     */
	    le_n = memchr(buf, '\n', bytes) ;
	    le_r = memchr(buf, '\r', bytes) ;
	    if ( le_n != NULL )
	      if ( le_n == le_r + sizeof(char))
	        le = le_n ;
	      else if ( (le_r < le_n) && (le_r != NULL) )
	        le = le_r ;
	      else
	        le = le_n ;
	    else
	      le = le_r ;
	    break;

	  case LINEEND_NONE:
	    le = 0 ;
	    break;

	  case LINEEND_CUSTOM:
	    le = memchr(buf, cfg->lechar, bytes) ;
	    break;
	  }
	  if ( le ) {
	    /* found a lineend in this bucket. */
	    /* pointer difference, not a cast of pointers to unsigned int */
	    offs = 1 + (apr_size_t)(le - buf) ;
	    apr_bucket_split(b, offs) ;
	    bytes -= offs ;
	    buf += offs ;
	    b1 = APR_BUCKET_NEXT(b) ;
	    APR_BUCKET_REMOVE(b);

	    /* Is there any previous unterminated content ? */
	    if ( !APR_BRIGADE_EMPTY(ctx->bbsave) ) {
	      /* append this to any content waiting for a lineend */
	      APR_BRIGADE_INSERT_TAIL(ctx->bbsave, b) ;
	      rv = apr_brigade_pflatten(ctx->bbsave, &fbuf, &fbytes, f->r->pool) ;
	      /* make b a new bucket of the flattened stuff */
	      b = apr_bucket_pool_create(fbuf, fbytes, f->r->pool,
			f->r->connection->bucket_alloc) ;

	      /* bbsave has been consumed, so clear it */
	      apr_brigade_cleanup(ctx->bbsave) ;
	    }
	    /* b now contains exactly one line */
	    APR_BRIGADE_INSERT_TAIL(bbline, b);
	    b = b1 ;
	  } else {
	    /* no lineend found.  Remember the dangling content */
	    APR_BUCKET_REMOVE(b);
	    APR_BRIGADE_INSERT_TAIL(ctx->bbsave, b);
	    bytes = 0 ;
	  }
	} /* while bytes > 0 */
      } else {
	/* bucket read failed - oops !  Let's remove it. */
	APR_BUCKET_REMOVE(b);
      }
    } else if ( APR_BUCKET_IS_EOS(b) ) {
      /* If there's data to pass, send it in one bucket */
      if ( !APR_BRIGADE_EMPTY(ctx->bbsave) ) {
        rv = apr_brigade_pflatten(ctx->bbsave, &fbuf, &fbytes, f->r->pool) ;
        b1 = apr_bucket_pool_create(fbuf, fbytes, f->r->pool,
		f->r->connection->bucket_alloc) ;
        APR_BRIGADE_INSERT_TAIL(bbline, b1);
      }
      apr_brigade_cleanup(ctx->bbsave) ;
      /* start again rather than segfault if a seriously buggy
       * filter in front of us sent a bogus EOS
       */
      f->ctx = NULL ;

      /* move the EOS to the new brigade */
      APR_BUCKET_REMOVE(b);
      APR_BRIGADE_INSERT_TAIL(bbline, b);
    } else {
      /* chop flush or unknown metadata bucket types */
      apr_bucket_delete(b);
    }
    /* OK, reset pointer to what's left (since we're not in a for-loop) */
    b = APR_BRIGADE_FIRST(bb) ;
  }

  /* OK, now we have a bunch of complete lines in bbline,
   * so we can apply our edit rules
   */

  /* When we get a match, we split the line into before+match+after.
   * To flatten that back into one buf every time would be inefficient.
   * So we treat it as three separate bufs to apply future rules.
   *
   * We can only reasonably do that by looping over buckets *inside*
   * the loop over rules.
   *
   * That means concepts like one-match-per-line or start-of-line-only
   * won't work, except for the first rule.  So we won't pretend.
   */
  for (i = 0; i < ctx->rewriterules->nelts; ++i) {
    for ( b = APR_BRIGADE_FIRST(bbline) ;
	b != APR_BRIGADE_SENTINEL(bbline) ;
	b = APR_BUCKET_NEXT(b) ) {
      if ( !APR_BUCKET_IS_METADATA(b)
	&& (apr_bucket_read(b, &buf, &bytes, APR_BLOCK_READ) == APR_SUCCESS)) {
	if ( rules[i].flags & M_REGEX ) {
	  bufp = apr_pstrmemdup(ctx->lpool, buf, bytes) ;
	  while ( ! ap_regexec(rules[i].from.r, bufp, nmatch, pmatch, 0) ) {
	    match = pmatch[0].rm_so ;
	    subs = ap_pregsub(f->r->pool, rules[i].to, bufp, nmatch, pmatch) ;
	    apr_bucket_split(b, match) ;
	    b1 = APR_BUCKET_NEXT(b) ;
	    apr_bucket_split(b1, pmatch[0].rm_eo - match) ;
	    b = APR_BUCKET_NEXT(b1) ;
	    apr_bucket_delete(b1) ;
	    b1 = apr_bucket_pool_create(subs, strlen(subs), f->r->pool,
		  f->r->connection->bucket_alloc) ;
	    APR_BUCKET_INSERT_BEFORE(b, b1) ;
	    bufp += pmatch[0].rm_eo ;
	  }
	} else {
	  bufp = buf ;
	  while (subs = apr_strmatch(rules[i].from.s, bufp, bytes),
			subs != NULL) {
	    match = (unsigned int)(subs - bufp) ;
	    bytes -= match ;
	    bufp += match ;
	    apr_bucket_split(b, match) ;
	    b1 = APR_BUCKET_NEXT(b) ;
	    apr_bucket_split(b1, rules[i].length) ;
	    b = APR_BUCKET_NEXT(b1) ;
	    apr_bucket_delete(b1) ;
	    bytes -= rules[i].length ;
	    bufp += rules[i].length ;
	    b1 = apr_bucket_immortal_create(rules[i].to, strlen(rules[i].to),
		f->r->connection->bucket_alloc) ;
	    APR_BUCKET_INSERT_BEFORE(b, b1) ;
	  }
	}
      }
    }
    /* If we used a local pool, clear it now */
    if ( (ctx->lpool != f->r->pool) && (rules[i].flags & M_REGEX) ) {
      apr_pool_clear(ctx->lpool) ;
    }
  }

  /* now pass it down the chain */
  rv = ap_pass_brigade(f->next, bbline) ;

  /* if we have leftover data, don't risk it going out of scope */
  for ( b = APR_BRIGADE_FIRST(ctx->bbsave) ;
	b != APR_BRIGADE_SENTINEL(ctx->bbsave) ;
	b = APR_BUCKET_NEXT(b)) {
    apr_bucket_setaside(b, f->r->pool) ;
  }

  return rv ;
}
static int line_edit(apr_pool_t* pool, apr_pool_t* p1,
		apr_pool_t* p2, server_rec* s) {
  ap_add_version_component(pool, "Line-Edit/" LINE_EDIT_VERSION) ;
  return DECLINED ;
}

static void line_edit_hooks(apr_pool_t* pool) {
  ap_register_output_filter_protocol(line_edit_filter_name, line_edit_filter,
		NULL, AP_FTYPE_RESOURCE,
		AP_FILTER_PROTO_CHANGE|AP_FILTER_PROTO_CHANGE_LENGTH) ;
  ap_hook_post_config(line_edit, NULL, NULL, APR_HOOK_MIDDLE) ;
}

static const char* line_edit_lineend(cmd_parms* cmd,
		void* cfg, const char* arg, const char *ch) {
  line_edit_cfg* fcfg = cfg ;
  if (!strcasecmp(arg, "unix")) {
    fcfg->lineend = LINEEND_UNIX ;
  } else if (!strcasecmp(arg, "dos")) {
    fcfg->lineend = LINEEND_DOS ;
  } else if (!strcasecmp(arg, "mac")) {
    fcfg->lineend = LINEEND_MAC ;
  } else if (!strcasecmp(arg, "any")) {
    fcfg->lineend = LINEEND_ANY ;
  } else if (!strcasecmp(arg, "none")) {
    fcfg->lineend = LINEEND_NONE ;
  } else if (!strcasecmp(arg, "custom")) {
    if (ch) {
      fcfg->lineend = LINEEND_CUSTOM ;
      fcfg->lechar = ch[0];
    }
    else {
      return "You must specify the custom lineend character.";
    }
  } else {
    return "Unknown lineend scheme";
  }
  return NULL;
}

#define REGFLAG(n,s,c) ( (s&&(ap_strchr((char*)(s),(c))!=NULL)) ? (n) : 0 )
static const char* line_edit_rewriterule(cmd_parms* cmd, void* cfg,
		const char* from, const char* to, const char* flags) {
  rewriterule* rule = apr_array_push (((line_edit_cfg*)cfg)->rewriterules) ;
  int lflags = 0 ;

  rule->to = to ;
  if ( flags ) {
    rule->flags
	= REGFLAG(M_REGEX, flags, 'R')
	| REGFLAG(M_NOCASE, flags, 'i')
	| REGFLAG(M_NEWLINE, flags, 'm')
	| REGFLAG(M_ENV, flags, 'V')
	;
  } else {
    rule->flags = 0 ;
  }
  if ( rule->flags & M_REGEX ) {
    if ( rule->flags & M_NOCASE ) {
      lflags |= AP_REG_ICASE;
    }
    if ( rule->flags & M_NEWLINE ) {
      lflags |= AP_REG_NEWLINE;
    }
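    rule->from.r = ap_pregcomp(cmd->pool, from, lflags) ;
    if ( rule->from.r == NULL ) {
      /* a NULL pattern would crash ap_regexec() at request time */
      return "LERewriteRule: failed to compile regular expression" ;
    }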
  } else {
    lflags = (rule->flags & M_NOCASE) ? 0 : 1 ;
    rule->length = strlen(from) ;
    rule->from.s = apr_strmatch_precompile(cmd->pool, from, lflags) ;
  }
  return NULL;
}

static const command_rec line_edit_cmds[] = {
  AP_INIT_TAKE12("LELineEnd", line_edit_lineend, NULL, OR_ALL,
	"Use line ending: UNIX|MAC|DOS|ANY|NONE|CUSTOM [char]") ,
  AP_INIT_TAKE23("LERewriteRule", line_edit_rewriterule, NULL, OR_ALL,
	"Line-oriented text rewrite rule: From-pattern, To-pattern [, Flags]") ,
  {NULL}
} ;
static void* line_edit_cr_cfg(apr_pool_t* pool, char* x) {
  line_edit_cfg* ret = apr_palloc(pool, sizeof(line_edit_cfg)) ;
  ret->lineend = LINEEND_UNSET;
  ret->rewriterules = apr_array_make(pool, 8, sizeof(rewriterule)) ;
  ret->lechar = 0;
  return ret ;
}
static void* line_edit_merge(apr_pool_t* pool, void* BASE, void* ADD) {
  line_edit_cfg* base = (line_edit_cfg*) BASE ;
  line_edit_cfg* add = (line_edit_cfg*) ADD ;
  line_edit_cfg* conf = apr_palloc(pool, sizeof(line_edit_cfg)) ;

  conf->lineend = (add->lineend == LINEEND_UNSET)
	  ? base->lineend
	  : add->lineend ;
  conf->rewriterules
	  = apr_array_append(pool, base->rewriterules, add->rewriterules) ;
  conf->lechar = (add->lechar == 0) ? base->lechar : add->lechar;
  return conf ;
}

module AP_MODULE_DECLARE_DATA line_edit_module = {
  STANDARD20_MODULE_STUFF,
  line_edit_cr_cfg ,
  line_edit_merge ,
  NULL ,
  NULL ,
  line_edit_cmds ,
  line_edit_hooks
};


--- NEW FILE mod_line_edit.conf ---
LoadModule line_edit_module    modules/mod_line_edit.so

<IfModule mod_line_edit.c>

    # LELineEnd
    # 
    #     LELineEnd UNIX|MAC|DOS|NONE|ANY|CUSTOM [char]
    # 
    #     This tells the parser what characters in the text to interpret as
    #     line-endings:
    # 
    #         * UNIX - the line end is the traditional Unix \n.
    #         * MAC - the line end is the old MacOS \r. Note that modern MacOSX
    #                 is Unix-based.
    #         * DOS - the line end is the MSDOS and Windows sequence \r\n.
    #         * ANY - Any of the above will be interpreted as a line break
    #                 (with \r\n treated as one, not two, linebreaks). This is
    #                 the default.
    #         * NONE - This will treat the entire document as a single line,
    #                  enabling multi-line search-and-replace. Note that this
    #                  will incur a substantial performance penalty for larger
    #                  documents, as it requires an entire document to be loaded
    #                  into memory and processed in a single operation.
    #         * CUSTOM - This enables you to partition the input by splitting on
    #                    some character other than a conventional line end. 
    # 
    #     When you use LELineEnd Custom, you must specify a second
    #     single-character argument, which is the character to split the input
    #     on. For all other LELineEnd options, any second argument will be
    #     ignored.
    #
    # LERewriteRule
    # 
    #     LERewriteRule from-pattern to-pattern [flags]
    # 
    #     This directive defines a search-and-replace edit rule that will be
    #     applied to the text.
    # 
    #         * The search string from-pattern may be a literal string or a
    #           regular expression.
    #         * The replacement string to-pattern may be a literal string, or
    #           may include backreferences $1-$9 in the case of a regular
    #           expression match.
    #         * The optional Flags argument may contain any combination of:
    #               o R - This rule is a regular expression search-and-replace.
    #               o i - Use case-insensitive matching on from-pattern.
    #               o m - Support multi-line regexp matching (in conjunction
    #                     with the R flag and LELineEnd None).
    #               o V - Support environment variables in to-pattern. The
    #                     string ${var} will be replaced with the value of the
    #                     environment variable var.

    # FilterProvider textedit line-editor resp=Content-Type $text/
    # FilterChain textedit

    # AddOutputFilter	line-editor	.txt .css .js

    # SetOutputFilter	line-editor
    # SetEnv	LineEdit "text/plain;text/css;text/javascript"
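
    # A complete example for one URL space.  This is a sketch only: the
    # /docs path and the host names are hypothetical.
    #
    # <Location /docs>
    #     SetOutputFilter line-editor
    #     SetEnv LineEdit "text/plain;text/css"
    #     LELineEnd UNIX
    #     LERewriteRule staging.example.com www.example.com
    # </Location>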

</IfModule>


--- NEW FILE mod_line_edit.spec ---
#Module-Specific definitions
%define mod_name mod_line_edit
%define mod_conf %{mod_name}.conf
%define mod_so %{mod_name}.so

Summary:	A general-purpose filter for text documents
Name:		%{mod_name}
Version:	1.0.0
Release:	3%{?dist}
Group:		System Environment/Daemons
License:	GPLv2+
URL:		http://apache.webthing.com/mod_line_edit/
Source0:	http://apache.webthing.com/mod_line_edit/mod_line_edit.c
Source1:	%{mod_conf}
Source2:	http://apache.webthing.com/mod_line_edit/index.html
Requires:	httpd httpd-mmn = %([ -a %{_includedir}/httpd/.mmn ] && cat %{_includedir}/httpd/.mmn || echo missing)
BuildRequires:	httpd-devel
BuildRequires:	file
BuildRoot:	%(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XXXXXX)

%description
mod_line_edit is a general-purpose filter for text documents. It operates as a
simple on-the-fly line editor, applying search-and-replace rules defined in a
configuration or .htaccess file.

Unlike most of WebThing's filter modules, it is not markup-aware, so it is not
an optimal choice for processing HTML or XML, though it may nevertheless be
used with caution (and may be far better than semi-markup-aware options such as
mod_layout).

For non-markup document types such as plain text, and non-markup Web documents
such as Javascript or Stylesheets, it is the best available option in the
absence of a filter that parses any relevant document structures.

mod_line_edit is written for performance and reliability, and should scale
without problems as document size grows. mod_line_edit is fully compatible with
Apache 2.0 and 2.2, and all operating systems and MPMs.

%prep

%setup -q -T -c -n %{mod_name}-%{version}
cp %{SOURCE0} %{mod_name}.c
cp %{SOURCE1} %{mod_conf}
cp %{SOURCE2} README.html

# strip away annoying ^M
find . -type f|xargs file|grep 'CRLF'|cut -d: -f1|xargs perl -p -i -e 's/\r//'
find . -type f|xargs file|grep 'text'|cut -d: -f1|xargs perl -p -i -e 's/\r//'

head -19 %{mod_name}.c > LICENSE

%build
%{_sbindir}/apxs -c %{mod_name}.c

%install
rm -rf %{buildroot}

install -d %{buildroot}%{_libdir}/httpd/modules/
install -d %{buildroot}%{_sysconfdir}/httpd/conf.d

install -m0755 .libs/*.so %{buildroot}%{_libdir}/httpd/modules/
install -m0644 %{mod_conf} %{buildroot}%{_sysconfdir}/httpd/conf.d/%{mod_conf}

%clean
rm -rf %{buildroot}

%files
%defattr(-,root,root)
%doc LICENSE README.html
%attr(0644,root,root) %config(noreplace) %{_sysconfdir}/httpd/conf.d/%{mod_conf}
%attr(0755,root,root) %{_libdir}/httpd/modules/%{mod_so}

%changelog
* Tue Jan 22 2008 Rob Myers <rob.myers at gtri.gatech.edu> 1.0.0-3
- spec fixups from tibbs (#428981)

* Tue Jan 15 2008 Rob Myers <rob.myers at gtri.gatech.edu> 1.0.0-2
- initial fedora submission

* Sat Sep 08 2007 Oden Eriksson <oeriksson at mandriva.com> 1.0.0-2mdv2008.0
+ Revision: 82603
- rebuild


* Wed Mar 14 2007 Oden Eriksson <oeriksson at mandriva.com> 1.0.0-1mdv2007.1
+ Revision: 143722
- Import apache-mod_line_edit

* Wed Mar 14 2007 Oden Eriksson <oeriksson at mandriva.com> 1.0.0-1mdv2007.1
- initial Mandriva package




