mksios mksios - 22 days ago 13
Apache Configuration Question

Unable to ignore mod_rewrite internal redirects with NS flag

I have defined a couple mod_rewrite rules in an .htaccess file, one to rewrite the URL path from /rwtest/source.html to /rwtest/target.html, and another to prohibit direct access to /rwtest/target.html. That is, all users wishing to see the content of /rwtest/target.html must enter /rwtest/source.html in their URL bar.

I was trying to use the NS flag in the forbid rule to prevent rewritten URLs from being denied as well, but it appears this flag does not distinguish between the first request and the internal redirect. It would seem that NS should do the job, but I'm sure I'm misunderstanding something.

Can someone please clarify this behavior? What exactly makes this internal redirect not an internal subrequest that the NS flag can ignore?

Details:

Here's my full .htaccess file:

Options +FollowSymLinks -Multiviews

RewriteEngine on
RewriteBase /rwtest

# Forbid rule. Prohibit direct access to target.html. Note the NS flag.
RewriteRule ^target.html$ - [F,NS]

# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html


I'm running Apache 2.4.9 on Windows 7 x64, but I've observed similar behavior on Apache 2.4.3 on Linux. Here's the log output for a request to /rwtest/source.html:

[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] strip per-dir prefix: C:/Apache24/htdocs/rwtest/source.html -> source.html
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] applying pattern '^target.html$' to uri 'source.html'
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] strip per-dir prefix: C:/Apache24/htdocs/rwtest/source.html -> source.html
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] applying pattern '^source.html$' to uri 'source.html'
[rewrite:trace2] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] rewrite 'source.html' -> 'target.html'
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] add per-dir prefix: target.html -> C:/Apache24/htdocs/rwtest/target.html
[rewrite:trace2] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] trying to replace prefix C:/Apache24/htdocs/rwtest/ with /rwtest
[rewrite:trace5] [rid#20b6200/initial] strip matching prefix: C:/Apache24/htdocs/rwtest/target.html -> target.html
[rewrite:trace4] [rid#20b6200/initial] add subst prefix: target.html -> /rwtest/target.html
[rewrite:trace1] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] internal redirect with /rwtest/target.html [INTERNAL REDIRECT]
[rewrite:trace3] [rid#20ba360/initial/redir#1] [perdir C:/Apache24/htdocs/rwtest/] strip per-dir prefix: C:/Apache24/htdocs/rwtest/target.html -> target.html
[rewrite:trace3] [rid#20ba360/initial/redir#1] [perdir C:/Apache24/htdocs/rwtest/] applying pattern '^target.html$' to uri 'target.html'
[rewrite:trace2] [rid#20ba360/initial/redir#1] [perdir C:/Apache24/htdocs/rwtest/] forcing responsecode 403 for C:/Apache24/htdocs/rwtest/target.html


Workarounds



I've posted a few workarounds below.

Answer

There are several workarounds for this, each with its own pros and cons. As a disclaimer, I've only tested them in an .htaccess context.


Workaround 1. Check for empty REDIRECT_STATUS

Add a RewriteCond checking to see if %{ENV:REDIRECT_STATUS} is empty. If it is empty, then the current request is not an internal redirect.

Pros

  • Most direct way to determine internal redirect.

Cons

  • Lack of documentation. The page on Custom Error Responses mentions this variable briefly:

    REDIRECT_ environment variables are created from the environment variables which existed prior to the redirect. They are renamed with a REDIRECT_ prefix, i.e., HTTP_USER_AGENT becomes REDIRECT_HTTP_USER_AGENT. REDIRECT_URL, REDIRECT_STATUS, and REDIRECT_QUERY_STRING are guaranteed to be set, and the other headers will be set only if they existed prior to the error condition.

    I've tried every other REDIRECT_ variable in RewriteCond, yet all of them except REDIRECT_STATUS were empty for internal redirects. Why REDIRECT_STATUS is the special one in mod_rewrite remains a mystery.

Example

# Forbid rule. Prohibit direct access to target.html.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^target.html$ - [F]

# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html

Credit for this approach goes to the question "URL rewrite : internal server error".
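
To make the difference between NS and the REDIRECT_STATUS check concrete, here is a toy Python model of the two rule passes seen in the trace log from the question. It is only a sketch of the behavior described above, not how mod_rewrite is implemented: Request, apply_rules and handle are hypothetical names, and the one assumption baked in is that NS only skips a rule for subrequests, while the second pass after a per-directory rewrite is an internal redirect (the log tags it "initial/redir#1") that arrives with REDIRECT_STATUS set.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Request:
    path: str                                # path relative to the .htaccess directory
    is_subrequest: bool = False              # the only thing the NS flag looks at
    env: dict = field(default_factory=dict)  # per-request environment variables


def apply_rules(req: Request, check_redirect_status: bool) -> tuple:
    """Run one pass over the forbid rule and the rewrite rule."""
    # Forbid rule, modeled with NS: skipped only when the request is a subrequest.
    skipped_by_ns = req.is_subrequest
    # Optional RewriteCond %{ENV:REDIRECT_STATUS} ^$ from Workaround 1.
    cond_ok = (not check_redirect_status) or req.env.get("REDIRECT_STATUS", "") == ""
    if not skipped_by_ns and cond_ok and re.search(r"^target\.html$", req.path):
        return ("forbid", req.path)
    # Rewrite rule: source.html -> target.html.
    if re.search(r"^source\.html$", req.path):
        return ("rewrite", "target.html")
    return ("serve", req.path)


def handle(path: str, check_redirect_status: bool = False) -> int:
    """Return the HTTP status this toy model would send for `path`."""
    req = Request(path)
    for _ in range(10):  # guard against redirect loops
        action, new_path = apply_rules(req, check_redirect_status)
        if action == "forbid":
            return 403
        if action == "serve":
            return 200
        # A per-directory rewrite restarts processing as an internal redirect:
        # a new request that is NOT a subrequest, with REDIRECT_STATUS set.
        req = Request(new_path, is_subrequest=False, env={"REDIRECT_STATUS": "200"})
    raise RuntimeError("too many internal redirects")


print(handle("source.html"))                              # NS alone
print(handle("source.html", check_redirect_status=True))  # with Workaround 1
print(handle("target.html", check_redirect_status=True))  # direct access
```

In this model NS never fires, because the redirected request is not a subrequest, so the first call ends in a 403 just like the trace log. With the REDIRECT_STATUS condition, the redirected request is let through while direct access to target.html is still refused.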


Workaround 2. Halt rewrite rule processing with END

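Unlike the L flag, which only ends the current pass over the rules, END also prevents any further rewrite processing in per-directory (.htaccess) context after the internal redirect. It requires Apache 2.3.9 or later.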

Pros

  • Simple. Just an extra flag.

Cons

  • All or nothing. Once END fires, no rewrite rules run on the redirected request, so you cannot choose which rules to process and which to skip.

Example

# Forbid rule. Prohibit direct access to target.html.
RewriteRule ^target.html$ - [F]

# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html [END]

For more information, see the END flag documentation.


Workaround 3. Match against original URL in THE_REQUEST

%{THE_REQUEST}

The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1").

THE_REQUEST does not change with internal redirects, so you can match against it.

Pros

  • Can be used to match against the original URL even in the second round of URL processing.

Cons

  • Significantly more complicated than the other approaches. Forces the use of RewriteCond where just one RewriteRule would have been sufficient.

  • Matches against the full URL which has not been unescaped (decoded), unlike most other variables.

  • Inconvenient to use in multiple RewriteRules. The RewriteConds can be copied above every RewriteRule, or the value can be exported to an environment variable (see example). Both alternatives are hacky.

Example

# Forbid rule. Prohibit direct access to target.html.
# Extract the path from the request line into %1.
# (Apache does not allow a comment on the same line as a directive.)
RewriteCond %{THE_REQUEST} "^[^ ]+ ([^ ?]*)"
RewriteCond %1 ^/rwtest/target.html$
RewriteRule ^ - [F]

# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html

Or, export the path to an environment variable and use it in multiple RewriteRules.

# Extract the original URL and save it to ORIG_URL.
# (Apache does not allow a comment on the same line as a directive.)
RewriteCond %{THE_REQUEST} "^[^ ]+ ([^ ?]*)"
RewriteRule ^ - [E=ORIG_URL:%1]

# Forbid rule. Prohibit direct access to target.html.
RewriteCond %{ENV:ORIG_URL} ^/rwtest/target.html$
RewriteRule ^ - [F]

# Rewrite rule. Rewrite source.html to target.html.
RewriteCond %{ENV:ORIG_URL} ^/rwtest/source.html$
RewriteRule ^ target.html
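
The capture pattern used in both examples can be checked outside Apache. Below is a small sketch using Python's re module, which I'm assuming behaves like Apache's PCRE for a pattern this simple; original_path is a hypothetical helper name. It also illustrates the "not unescaped" con listed above: the captured path is still percent-encoded, so as far as I can tell an encoded spelling of the path would not match the forbid condition.

```python
import re

# Same capture pattern as in the RewriteCond lines: skip the method,
# then capture everything up to the next space or '?'.
REQUEST_PATH = re.compile(r"^[^ ]+ ([^ ?]*)")


def original_path(request_line: str) -> str:
    """Return the raw (still percent-encoded) path from an HTTP request line."""
    match = REQUEST_PATH.match(request_line)
    if match is None:
        raise ValueError(f"not a request line: {request_line!r}")
    return match.group(1)


plain = original_path("GET /rwtest/target.html?x=1 HTTP/1.1")
encoded = original_path("GET /rwtest/target%2Ehtml HTTP/1.1")

print(plain)    # /rwtest/target.html
print(encoded)  # /rwtest/target%2Ehtml

# The forbid condition from the example matches the plain form only.
forbid = re.compile(r"^/rwtest/target.html$")
print(bool(forbid.search(plain)))    # True
print(bool(forbid.search(encoded)))  # False
```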