isapiRewrite


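ASP sites are widely used in China, but URLs such as im286.asp?id=20050307213811 are not well suited to search-engine indexing; in other words, they do not meet the friendly-URL standard. So let's use ISAPI_Rewrite to build clean URLs and let the spiders crawl your site happily.

Enough talk, let's get to work!

1. Download ISAPI_Rewrite. It comes in a Lite and a Full version. The Lite version cannot rewrite per virtual-host site; it can only apply rules globally. For anyone running their own server, though, the Lite version is enough. Lite download address: https://www.360docs.net/doc/f28319585.html,/download/ (the one labelled "Lite Version (free)").

2. Install the .msi file the same way you install any other program. I installed it to d:\isapi_rewrite.

3. The next step is important, so read carefully. Open Internet Information Services, right-click the web site, choose Properties, and click the ISAPI Filters tab. Add a filter: choose any name, point the path at isapi_rewrite.dll, and confirm.

4. Now test it. Create a file 1ting.asp containing <%=request.querystring("inso")%>. When it runs, requesting 1ting.asp?inso=* makes the browser display *.

5. This step is important too: adding the rewrite rule. Regular expressions are a headache, but fortunately this example is simple.

Find the ISAPI_Rewrite directory, clear the read-only attribute on httpd.ini, and open it for editing. We want to map 1ting.asp?inso=im286 to 1ting-im286.html, so add this line to httpd.ini and save: RewriteRule /1ting-([0-9,a-z]*).html /1ting.asp\?inso=$1

6. Now check the result in a browser. Enter http://127.0.0.1/1ting.asp?inso=im286 and http://127.0.0.1/1ting-im286.html. Do both display im286? If so, it worked!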
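The mapping performed by the rule in step 5 can be sketched in Python. This is only an illustration, not ISAPI_Rewrite itself: the function name `rewrite` is made up here, Python's `re` module is a different regex engine, and the sketch assumes the pattern has to match the whole request path.

```python
import re

# Same pattern as the httpd.ini rule above.
# Assumption: the pattern must match the entire request path.
RULE = re.compile(r"/1ting-([0-9,a-z]*).html")


def rewrite(path):
    """Return the rewritten path, or the path unchanged if the rule does not match."""
    m = RULE.fullmatch(path)
    if m is None:
        return path
    # "$1" in the format string stands for the first captured group.
    return "/1ting.asp?inso=" + m.group(1)


print(rewrite("/1ting-im286.html"))  # /1ting.asp?inso=im286
```

Because the dot before html is not escaped in the pattern, it matches any single character, so a path such as /1ting-im286xhtml is rewritten as well.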

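A page like 1ting-im286.html is indexed more readily than 1ting.asp?inso=im286, so anyone still serving dynamic URLs may want to try this kind of static mapping. IIS Rewrite can achieve the same thing.

Afterword: this may not be all that practical. I am only discussing it from a purely technical angle, so please don't throw bricks at me; I still haven't recovered from last time.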

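Configuration:

On Windows NT, 2000, XP, and 2003 (with IIS running in IIS5 compatibility mode), the filter runs inside the INETINFO process under the SYSTEM account. The SYSTEM account should therefore be given at least read access to all ISAPI_Rewrite DLLs and all httpd.ini files. We also recommend giving the SYSTEM account write access to every folder that contains an httpd.ini file; this allows the httpd.parse.errors files, which report configuration file syntax errors, to be generated. The proxy module needs additional permissions because it runs in pooled or high-isolated application mode: the IIS accounts for the shared pool and the high-isolation pools should be given read access to rwhelper.dll. By default IWAM_<computer name> is used for all pools. Pool accounts can be set up in the corresponding COM+ application settings using the COM+ administration MMC snap-in.

Configuration file format:

There are two kinds of configuration files: global (server-level) and individual (site-level). The global configuration file must be named httpd.ini and reside in the ISAPI_Rewrite installation directory; a shortcut to it is provided in the Start menu. An individual configuration file must also be named httpd.ini and may be placed in the physical root directory of a virtual site. Both kinds use the same format, a standard Windows .ini file. All directives must be placed in the [ISAPI_Rewrite] section, one directive per line. Any text outside this section is ignored.

Sample httpd.ini file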

[ISAPI_Rewrite]

# This is a comment

# 300 = 5 minutes

CacheClockRate 300

RepeatLimit 20

# Block external access to the httpd.ini and httpd.parse.errors files

RewriteRule /httpd(?:\.ini|\.parse\.errors) / [F,I,O]

# Block external access to the Helper ISAPI Extension

RewriteRule .*\.isrwhlp / [F,I,O]

# Some custom rules

RewriteCond Host: (.+)

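RewriteCond directive

Syntax: RewriteCond TestVerb CondPattern [Flags]

This directive defines a rule condition. Place one or more RewriteCond directives before a RewriteRule, RewriteHeader, or RewriteProxy directive. The rule that follows is applied only if its own pattern matches the current state of the URI and these additional conditions match as well.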

TestVerb

Specifies the verb that will be matched against the regular expression.

TestVerb=(URL | METHOD | VERSION | HTTPHeaderName: | %ServerVariable) where:

URL - returns the Request-URI of the client request as described in RFC 2068 (HTTP 1.1);

METHOD - returns the HTTP method of the client request (OPTIONS, GET, HEAD, POST, PUT, DELETE or TRACE);

VERSION - returns the HTTP version;

HTTPHeaderName - returns the value of the specified HTTP header. HTTPHeaderName can be any valid HTTP header name. Header names should include the trailing colon ":". If the specified header does not exist in a client's request, TestVerb is treated as an empty string.

HTTPHeaderName =

Accept:

Accept-Charset:

Accept-Encoding:

Accept-Language:

Authorization:

Cookie:

From:

Host:

If-Modified-Since:

If-Match:

If-None-Match:

If-Range:

If-Unmodified-Since:

Max-Forwards:

Proxy-Authorization:

Range:

Referer:

User-Agent:

Any-Custom-Header

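For more information about HTTP headers and their values, see RFC 2068.

ServerVariable - returns the value of the specified server variable, for example the server port. The full list of server variables can be found in the IIS documentation. Variable names should be prefixed with the % sign;

CondPattern

The regular expression to match against TestVerb.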

[Flags]

Flags is a comma-separated list of the following flags:

O (nOrmalize)

Normalizes the string before processing. Normalization includes removal of URL-encoding, illegal characters, etc. This flag is useful with URLs and URL-encoded headers.
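How a TestVerb is resolved to the string that CondPattern is matched against can be sketched as follows. This is a rough Python illustration, not ISAPI_Rewrite code: the request dictionary layout and the function name `resolve_test_verb` are invented for the example, and returning an empty string for an unknown server variable is an assumption (the text above only states that behaviour for missing headers).

```python
def resolve_test_verb(request, verb):
    """Return the string that a RewriteCond pattern would be tested against."""
    if verb in ("URL", "METHOD", "VERSION"):
        return request[verb.lower()]
    if verb.startswith("%"):
        # Server variable, e.g. %SERVER_PORT (empty string if unknown: assumption).
        return request["server_variables"].get(verb[1:], "")
    if verb.endswith(":"):
        # HTTP header name with its trailing colon; a missing header gives "".
        headers = {k.lower(): v for k, v in request["headers"].items()}
        return headers.get(verb[:-1].lower(), "")
    raise ValueError("unknown TestVerb: " + verb)


# Hypothetical request used only for illustration.
request = {
    "url": "/index.asp?id=1",
    "method": "GET",
    "version": "HTTP/1.1",
    "headers": {"Host": "www.example.com"},
    "server_variables": {"SERVER_PORT": "80"},
}

print(resolve_test_verb(request, "Host:"))     # www.example.com
print(resolve_test_verb(request, "Referer:"))  # prints an empty line
```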

RewriteRule directive

Syntax: RewriteRule Pattern FormatString [Flags]

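This directive may occur more than once. Each directive defines a single rewrite rule. The order in which the rules are defined is important, because the rules are applied in that order at run time.

I (ignore case)

Forces case-insensitive character matching. This flag affects the RewriteRule directive and the corresponding RewriteCond directives.

F (Forbidden)

Stops the rewriting process and responds to the client with a 403 error. Note that in this case FormatString is unused and may be set to any non-empty string.

L (last rule)

Stops the rewriting process here and applies no further rewrite rules. Use this flag to prevent the currently rewritten URI from being rewritten again by subsequent rules.

N (Next iteration)

Forces the rewriting engine to modify the rule's target and restart rule checking from the very beginning (all modifications are kept). The number of restarts is limited by the value specified in the RepeatLimit directive. If this number is exceeded, the N flag is ignored.

NS (Next iteration of the same rule)

Works like the N flag, but restarts rule processing from the same rule rather than from the beginning (that is, it forces the rule to be applied repeatedly). The maximum number of iterations of a single rule is set by the RepeatLimit directive.

P (force proxy)

Forces the destination URI to be treated internally as a proxy request and handled immediately by the ISAPI extension that processes proxy requests. Make sure the proxy string is a valid URI, including protocol, host, and so on; otherwise the proxy will return an error.

R (explicit redirect)

Forces the server to respond to the client immediately with a redirect instruction, giving the destination URI as the new location. A redirect rule is always the last rule.

RP (permanent redirect)

Almost the same as the [R] flag, but issues HTTP status code 301 instead of 302.

U (Unmangle Log)

Logs the URI as it was originally requested rather than as it was rewritten.

O (nOrmalize)

Normalizes the string before processing. Normalization includes removal of URL-encoding, illegal characters, etc. This flag is useful with URLs and URL-encoded headers.

CL (Case Lower)

Lower case.

CU (Case Upper)

Upper case.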
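The interplay of rule order with the I, F, and L flags can be sketched in Python. This is a simplified model under stated assumptions, not the real engine: the three rules and the function name `apply_rules` are hypothetical, patterns are assumed to match the whole URI, and Python's `\1` takes the place of `$1` in the format string.

```python
import re

# (pattern, format string, flags): hypothetical rules for illustration only.
RULES = [
    (r"/secret/.*", "/", {"F", "I"}),
    (r"/old/(.*)", r"/new/\1", {"L"}),
    (r"/new/(.*)", r"/newer/\1", set()),
]


def apply_rules(uri, rules=RULES):
    """Apply the rules in definition order and return (status, uri)."""
    for pattern, fmt, flags in rules:
        m = re.fullmatch(pattern, uri, re.IGNORECASE if "I" in flags else 0)
        if m is None:
            continue
        if "F" in flags:
            # Forbidden: stop rewriting; the format string is not used.
            return 403, uri
        uri = m.expand(fmt)
        if "L" in flags:
            # Last rule: later rules do not see the rewritten URI.
            break
    return 200, uri


print(apply_rules("/SECRET/plan.txt"))  # (403, '/SECRET/plan.txt')
print(apply_rules("/old/page"))         # (200, '/new/page')
print(apply_rules("/new/page"))         # (200, '/newer/page')
```

Without the L flag on the second rule, /old/page would go on to match the third rule and end up as /newer/page.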

RewriteHeader directive

Syntax: RewriteHeader HeaderName Pattern FormatString [Flags]

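This directive is a more general variant of RewriteRule. It rewrites not only the URL part of a client request but HTTP headers as well. It can be used to rewrite, create, or delete any HTTP header, and even to change the method of the client request.

HeaderName

Specifies the client header to be rewritten. Possible values are the same as for the TestVerb parameter of the RewriteCond directive.

Pattern

Specifies the regular expression to match against the Request-URI.

FormatString

Specifies the FormatString that generates the new URI.

[Flags]

Flags is a comma-separated list of the following flags: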

I (ignore case)

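Forces case-insensitive character matching. This flag affects the RewriteRule directive and the corresponding RewriteCond directives.

F (Forbidden)

Stops the rewriting process and responds to the client with a 403 error. Note that in this case FormatString is unused and may be set to any non-empty string.

L (last rule)

Stops the rewriting process here and applies no further rewrite rules. Use this flag to prevent the currently rewritten URI from being rewritten again by subsequent rules.

N (Next iteration)

Forces the rewriting engine to modify the rule's target and restart rule checking from the very beginning (all modifications are kept). The number of restarts is limited by the value specified in the RepeatLimit directive. If this number is exceeded, the N flag is ignored.

NS (Next iteration of the same rule)

Works like the N flag, but restarts rule processing from the same rule rather than from the beginning (that is, it forces the rule to be applied repeatedly). The maximum number of iterations of a single rule is set by the RepeatLimit directive.

R (explicit redirect)

Forces the server to respond to the client immediately with a redirect instruction, giving the destination URI as the new location. A redirect rule is always the last rule.

RP (permanent redirect)

Almost the same as the [R] flag, but issues HTTP status code 301 instead of 302.

U (Unmangle Log)

Logs the URI as it was originally requested rather than as it was rewritten.

O (nOrmalize)

Normalizes the string before processing. Normalization includes removal of URL-encoding, illegal characters, etc. This flag is useful with URLs and URL-encoded headers.

CL (Case Lower)

Lower case.

CU (Case Upper)

Upper case.

To remove a header, the format string should produce an empty string. For example, this rule removes the user-agent information from the client request: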

RewriteHeader User-Agent: .* $0

And this rule adds an Old-URL header to the request:

RewriteCond URL (.*)

RewriteHeader Old-URL: ^$ $1

The last example redirects all WebDAV requests to /webdav.asp by changing the request method:

RewriteCond METHOD OPTIONS

RewriteRule (.*) /webdav.asp?$1

RewriteHeader METHOD OPTIONS GET

RewriteProxy directive

Syntax: RewriteProxy Pattern FormatString [Flags]

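Forces the destination URI to be treated internally as a proxy request and handled immediately by the ISAPI extension that processes proxy requests. This lets IIS act as a proxy server and reroute requests to other sites and servers.

Pattern

Specifies the regular expression to match against the Request-URI.

FormatString

Specifies the FormatString that generates the new URI.

[Flags]

Flags is a comma-separated list of the following flags:

D (Delegate security)

The proxy module will try to log in to the remote server with the credentials of the currently impersonated user.

C (use Credentials)

The proxy module will try to log in to the remote server with the credentials specified in the URL or in the basic authorization header. With this flag you can use the http://user:password@https://www.360docs.net/doc/f28319585.html,/path/ syntax as the URL.

F (Follow redirects)

By default ISAPI_Rewrite tries to map redirect instructions returned by the remote server into the local server's namespace. If the remote server returns a redirect that points to some other location on that same server, ISAPI_Rewrite modifies the redirect so that it points to the local server's name. This prevents users from seeing the real (internal) server name.

Use the F flag to force the proxy module to follow redirects returned by the remote server internally. Use this flag if you do not need to receive redirect instructions from the remote server at all. The WinHTTP settings include a redirect limit to avoid remote redirect loops.

I (ignore case)

Forces case-insensitive character matching.

U (Unmangle Log)

Logs the URI as it was originally requested rather than as it was rewritten.

O (nOrmalize)

Normalizes the string before processing. Normalization includes removal of URL-encoding, illegal characters, etc. This flag is useful with URLs and URL-encoded headers.

CacheClockRate directive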

Syntax: CacheClockRate Interval

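This directive may appear only in the global configuration context. If it appears in a site-level context, it is ignored and an error message is written to the httpd.parse.errors file.

ISAPI_Rewrite caches each configuration the first time it is loaded. With this directive you can specify the period of inactivity after which a particular site's configuration is purged from the cache. Setting this parameter large enough forces ISAPI_Rewrite never to purge the cache. Note that any change to a configuration file takes effect on the next request regardless of this period.

Interval

Specifies the inactivity time (in seconds) after which a particular configuration is purged from the cache. The default is 3600 (1 hour).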
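The inactivity-based purging described above can be modelled roughly like this. It is a sketch of the idea only: the class `ConfigCache`, its methods, and the strict "greater than" comparison are assumptions made for the example, and the real cache also reloads a configuration whenever its file changes, which is left out here.

```python
class ConfigCache:
    """Toy model of a per-site configuration cache with an inactivity interval."""

    def __init__(self, interval=3600):
        self.interval = interval  # seconds of inactivity before a purge
        self._entries = {}        # site -> (config, time of last use)

    def get(self, site, load, now):
        """Return the cached config for a site, loading it on first use."""
        entry = self._entries.get(site)
        config = entry[0] if entry else load(site)
        self._entries[site] = (config, now)
        return config

    def purge(self, now):
        """Drop every entry unused for longer than the interval; return their sites."""
        stale = [s for s, (_, used) in self._entries.items()
                 if now - used > self.interval]
        for site in stale:
            del self._entries[site]
        return stale


loads = []

def load(site):
    # Stand-in for parsing a site's httpd.ini.
    loads.append(site)
    return {"site": site}

cache = ConfigCache(interval=300)      # corresponds to "CacheClockRate 300"
cache.get("site1", load, now=0)
cache.get("site1", load, now=100)      # served from the cache, no reload
print(loads)                           # ['site1']
print(cache.purge(now=350))            # []
print(cache.purge(now=401))            # ['site1']
```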

EnableConfig and DisableConfig directives

Syntax:

EnableConfig [SiteID|"Site name"]

DisableConfig [SiteID|"Site name"]

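Enables or disables site-level configuration for the selected site, or changes the default setting. By default, site-level configuration is disabled. These directives may appear only in the global configuration context.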

SiteID

Numeric metabase identifier of a site

Site name

Name of the site as it appears in the IIS console

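Used without a parameter, these directives change the default setting to enable or disable configuration processing.

Example

The following example enables configuration processing only for the site with ID=1 (typically the default web site) and the site named "My site":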

DisableConfig

EnableConfig 1

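EnableConfig "My site"

The following example still enables configuration for the site named "Some site", because a site's individual setting overrides the default setting:

EnableConfig "Some site"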

DisableConfig
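The two examples above can be read as "a per-site setting wins over the default, whatever the order of the directives". A minimal Python sketch of that resolution, with directives given as hypothetical (name, target) tuples and the function name `config_enabled` made up for the illustration:

```python
def config_enabled(directives, site_id, site_name):
    """Decide whether site-level configuration is processed for one site."""
    default = False   # site-level configuration is disabled by default
    per_site = {}     # SiteID or site name -> explicit setting
    for name, target in directives:
        enabled = (name == "EnableConfig")
        if target is None:
            default = enabled          # no parameter: change the default
        else:
            per_site[target] = enabled
    for key in (site_id, site_name):
        if key in per_site:
            return per_site[key]       # the individual setting overrides the default
    return default


first = [("DisableConfig", None), ("EnableConfig", 1), ("EnableConfig", "My site")]
second = [("EnableConfig", "Some site"), ("DisableConfig", None)]

print(config_enabled(first, 1, "Default Web Site"))  # True
print(config_enabled(first, 3, "Other site"))        # False
print(config_enabled(second, 2, "Some site"))        # True
```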

EnableRewrite and DisableRewrite directives

Syntax:

EnableRewrite [SiteID|"Site name"]

DisableRewrite [SiteID|"Site name"]

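Enables or disables rewriting for the selected site, or changes the default setting. By default, rewriting is enabled. These directives may appear only in the global configuration context.

SiteID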

Numeric metabase identifier of a site

Site name

Name of the site as it appears in the IIS console.

Used without a parameter, these directives enable or disable rewriting for all sites.

RepeatLimit directive

Syntax: RepeatLimit Limit

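This directive may appear in both global and site-level configuration files. In the global configuration file it changes the global limit for all sites. In a site-level configuration it changes the limit for that site only, and this limit cannot exceed the global limit.

ISAPI_Rewrite allows loops while applying rules. This directive limits the maximum number of loop iterations. Set it to 0 or 1 to disable loops.

Limit

Specifies the maximum number of loop iterations. The default is 32.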
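The effect of a repeat limit on a rule that keeps restarting can be illustrated with a small Python loop. This is only a model: the function name `apply_with_restart` and the sample rule are hypothetical, the pattern is assumed to match the whole URI, and exactly how ISAPI_Rewrite counts restarts against the limit is an assumption here.

```python
import re


def apply_with_restart(uri, pattern, fmt, repeat_limit=32):
    """Reapply one restarting rule until it stops matching or the limit is hit."""
    restarts = 0
    while True:
        m = re.fullmatch(pattern, uri)
        if m is None:
            break
        uri = m.expand(fmt)
        if restarts >= repeat_limit:
            # Limit reached: the restart request is ignored (assumed counting).
            break
        restarts += 1
    return uri, restarts


# Hypothetical rule that turns one underscore into a dash on each pass.
print(apply_with_restart("/a_b_c_d", r"(.*)_(.*)", r"\1-\2"))
# ('/a-b-c-d', 3)
print(apply_with_restart("/a_b_c_d", r"(.*)_(.*)", r"\1-\2", repeat_limit=0))
# ('/a_b_c-d', 0)
```

With the default limit the rule runs until no underscore is left; with a limit of 0 it is applied once and never restarted.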

RFStyle directive

Syntax: RFStyle Old | New

Configuration Utility

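ISAPI_Rewrite Full includes a configuration utility (it can be started from the ISAPI_Rewrite program group). It lets you view the trial status and enter a registration code (if the product was not registered during installation), and tune some product features related to the operation of the proxy module. The utility is a property sheet consisting of three pages.

Trial page. Lets you view the trial status and enter a registration code (if the product was not registered during installation).

Settings page

This page contains edit boxes for the following parameters:

Helper URL

This parameter affects how the filter and the proxy module communicate. It can be either a file extension prefixed with a dot (such as .isrwhlp) or an absolute path.

In the first case the extension is appended to the original request URI and the proxy module is invoked through the script map. The default extension .isrwhlp is added to the global script map during installation. If you change this extension, or if your application does not inherit the global script map settings, you must add the required entry to the script map manually. The entry should have the following parameters: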

Executable: An absolute path to the rwhelper.dll in the short form

Extension: Desired extension (.isrwhlp is default)

Verbs radio button: All Verbs

Script engine checkbox: Checked

Check that file exists checkbox: Unchecked

We have created a WSH script, proxycfg.vbs, that makes registration in script maps simple. It is located in the installation folder and can be run from the command line as follows:

cscript proxycfg.vbs [-r] [MetabasePath]

Optional -r parameter forces registration of the extension.

Optional MetabasePath parameter allows specification of the first metabase key to process. By default it is "/localhost/W3SVC".

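To register in all existing script maps, invoke the script with the following command line:

cscript proxycfg.vbs -r

In the second case you should provide a URI as the value of 'Helper URL'. You should also map the ISAPI_Rewrite installation folder as a virtual folder of every site.

Note: according to customer feedback, IIS5 (and perhaps IIS4 as well) has problems with long directory names, so we strongly recommend using short directory names.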

Worker threads limit

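This parameter limits the number of worker threads in the proxy extension's thread pool. The default of 0 means the limit equals the number of processors multiplied by 2.

Active threads limit

This parameter limits the number of concurrently running threads. It cannot be greater than "Worker threads limit". The default of 0 means it equals the number of processors.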
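The defaults of the two thread limits can be written out as a small calculation. This is an illustrative sketch: the function name `effective_limits` is invented, and clamping the active limit to the worker limit is just one reading of "cannot be greater than".

```python
import os


def effective_limits(worker_limit=0, active_limit=0, cpu_count=None):
    """Resolve the 0 defaults of the worker and active thread limits."""
    cpus = cpu_count or os.cpu_count() or 1
    # 0 means "number of processors multiplied by 2".
    workers = worker_limit if worker_limit > 0 else cpus * 2
    # 0 means "number of processors".
    active = active_limit if active_limit > 0 else cpus
    # Assumption: an active limit above the worker limit is clamped to it.
    return workers, min(active, workers)


print(effective_limits(cpu_count=4))         # (8, 4)
print(effective_limits(3, 10, cpu_count=4))  # (3, 3)
```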

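Queue size

This parameter defines the maximum number of requests in the queue. If you ever see a "Queue timeout expired" message in the Application event log, you can increase this parameter.

Queue timeout

This parameter defines the maximum time to wait while placing a new request in the internal request queue. If you ever see a "Queue timeout expired" message in the Application event log, you can increase this parameter.

Connect timeout

Sets the proxy module's connect timeout in milliseconds.

Send timeout

Sets the proxy module's send timeout in milliseconds.

Receive timeout

Sets the proxy module's receive timeout in milliseconds.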

About page.

It contains copyright information and a link to the ISAPI_Rewrite's web site.

Regular expression syntax

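This section covers the regular expression syntax used by ISAPI_Rewrite.

Literals

All characters are literals except ".", "*", "?", "+", "(", ")", "{", "}", "[", "]", "^" and "$". These characters are literals when preceded by a "\". A literal is a character that matches itself.

Wildcard

The dot character "." matches any single character except the null character and the newline character.

The remaining syntax is described below.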

Repeats

A repeat is an expression that is repeated an arbitrary number of times. An expression followed by "*" can be repeated any number of times including zero. An expression followed by "+" can be repeated any number of times, but at least once. An expression followed by "?" may be repeated zero or one times only. When it is necessary to specify the minimum and maximum number of repeats explicitly, the bounds operator "{}" may be used, thus "a{2}" is the letter "a" repeated exactly twice, "a{2,4}" represents the letter "a" repeated between 2 and 4 times, and "a{2,}" represents the letter "a" repeated at least twice with no upper limit. Note that there must be no white-space inside the {}, and there is no upper limit on the values of the lower and upper bounds. All repeat expressions refer to the shortest possible previous sub-expression: a single character; a character set, or a sub-expression grouped with "()" for example.

Examples:

"ba*" will match all of "b", "ba", "baaa" etc.

"ba+" will match "ba" or "baaaa" for example but not "b".

"ba?" will match "b" or "ba".

"ba{2,4}" will match "baa", "baaa" and "baaaa".

Non-greedy repeats

Non-greedy repeats are possible by appending a '?' after the repeat; a non-greedy repeat is one which will match the shortest possible string.

For example, to match HTML tag pairs one could use something like:

"<\s*tagname[^>]*>(.*?)<\s*/tagname\s*>"

In this case $1 will contain the text between the tag pairs, and will be the shortest possible matching string.
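The repeat operators and their non-greedy forms can be checked with Python's `re` module, which follows the same conventions for these constructs (it is a different engine from the one in ISAPI_Rewrite, so treat this as an illustration). The helper names `matches` and `first_group` and the sample text are made up for the example.

```python
import re


def matches(pattern, text):
    """True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None


def first_group(pattern, text):
    """Return what the first sub-expression captured, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group(1) if m else None


# Greedy repeats, mirroring the examples above.
print(matches("ba*", "baaa"))     # True
print(matches("ba+", "b"))        # False
print(matches("ba?", "ba"))       # True
print(matches("ba{2,4}", "baa"))  # True

# Greedy versus non-greedy on a pair of tags.
text = "<b>one</b> and <b>two</b>"
print(first_group(r"<b>(.*)</b>", text))   # one</b> and <b>two
print(first_group(r"<b>(.*?)</b>", text))  # one
```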

Parenthesis

Parentheses serve two purposes: to group items together into a sub-expression, and to mark what generated the match. For example, the expression "(ab)*" would match all of the string "ababab". All sub-matches marked by parentheses can be back-referenced using the \N or $N syntax. It is permissible for sub-expressions to match null strings. Sub-expressions are indexed from left to right starting from 1; sub-expression 0 is the whole expression.

Non-Marking Parenthesis

Sometimes you need to group sub-expressions with parenthesis, but don't want the parenthesis to spit out another marked sub-expression, in this case a non-marking parenthesis (?:expression) can be used. For example the following expression creates no sub-expressions:

"(?:abc)*"

Alternatives

Alternatives occur when the expression can match either one sub-expression or another, each alternative is separated by a "|". Each alternative is the largest possible previous sub-expression; this is the opposite behaviour from repetition operators.

Examples:

"a(b|c)" could match "ab" or "ac".

"abc|def" could match "abc" or "def".

Sets

A set is a set of characters that can match any single character that is a member of the set. Sets are delimited by "[" and "]" and can contain literals, character ranges, character classes, collating elements and equivalence classes. Set declarations that start with "^" contain the complement of the elements that follow.

Examples:

Character literals:

"[abc]" will match either of "a", "b", or "c".

"[^abc]" will match any character other than "a", "b", or "c".

Character ranges:

"[a-z]" will match any character in the range "a" to "z".

"[^A-Z]" will match any character other than those in the range "A" to "Z".

Character classes

Character classes are denoted using the syntax "[:classname:]" within a set declaration, for example "[[:space:]]" is the set of all whitespace characters. The available character classes are:

alnum Any alpha numeric character.

alpha Any alphabetical character a-z and A-Z. Other characters may also be included depending upon the locale.

blank Any blank character, either a space or a tab.

cntrl Any control character.

digit Any digit 0-9.

graph Any graphical character.

lower Any lower case character a-z. Other characters may also be included depending upon the locale.

print Any printable character.

punct Any punctuation character.

space Any whitespace character.

upper Any upper case character A-Z. Other characters may also be included depending upon the locale.

xdigit Any hexadecimal digit character, 0-9, a-f and A-F.

word Any word character - all alphanumeric characters plus the underscore.

unicode Any character whose code is greater than 255, this applies to the wide character traits classes only.

There are some shortcuts that can be used in place of the character classes:

\w in place of [:word:]

\s in place of [:space:]

\d in place of [:digit:]

\l in place of [:lower:]

\u in place of [:upper:]

Collating elements

Collating elements take the general form [.tagname.] inside a set declaration, where tagname is either a single character, or a name of a collating element, for example [[.a.]] is equivalent to [a], and [[.comma.]] is equivalent to [,]. ISAPI_Rewrite supports all the standard POSIX collating element names, and in addition the following digraphs: "ae", "ch", "ll", "ss", "nj", "dz", "lj", each in lower, upper and title case variations. Multi-character collating elements can result in the set matching more than one character, for example [[.ae.]] would match two characters, but note that [^[.ae.]] would only match one character.

Equivalence classes

Equivalence classes take the general form [=tagname=] inside a set declaration, where tagname is either a single character, or a name of a collating element, and matches any character that is a member of the same primary equivalence class as the collating element [.tagname.]. An equivalence class is a set of characters that collate the same; a primary equivalence class is a set of characters whose primary sort keys are all the same (for example, strings are typically collated by character, then by accent, and then by case; the primary sort key then relates to the character, the secondary to the accentuation, and the tertiary to the case). If there is no equivalence class corresponding to tagname, then [=tagname=] is exactly the same as [.tagname.].

To include a literal "-" in a set declaration, make it the first character after the opening "[" or "[^", the endpoint of a range, a collating element, or precede it with an escape character as in "[\-]". To include a literal "[" or "]" or "^" in a set, make them the endpoint of a range, a collating element, or precede them with an escape character.

Line anchors

An anchor is something that matches the null string at the start or end of a line: "^" matches the null string at the start of a line, "$" matches the null string at the end of a line.

Back references

A back reference is a reference to a previous sub-expression that has already been matched; the reference is to what the sub-expression matched, not to the expression itself. A back reference consists of the escape character "\" followed by a digit "1" to "9": "\1" refers to the first sub-expression, "\2" to the second, and so on. For example, the expression "(.*)\1" matches any string that is repeated about its mid-point, for example "abcabc" or "xyzxyz". A back reference to a sub-expression that did not participate in any match matches the null string. In ISAPI_Rewrite all back references are global for the entire RewriteRule and the corresponding RewriteCond directives. Sub-matches are numbered from top to bottom and left to right, beginning from the first RewriteCond directive of the corresponding RewriteRule directive, if there is one.
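The "(.*)\1" example can be tried in Python, whose `re` module uses the same backslash-digit notation for back references inside a pattern. The function name `is_doubled` is made up for the illustration, and the global numbering across RewriteCond directives described above is specific to ISAPI_Rewrite and not modelled here.

```python
import re


def is_doubled(text):
    """True if the text is some string immediately followed by itself."""
    return re.fullmatch(r"(.*)\1", text) is not None


print(is_doubled("abcabc"))  # True
print(is_doubled("xyzxyz"))  # True
print(is_doubled("abcabd"))  # False
```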

Forward Lookahead Asserts

There are two forms of these; one for positive forward lookahead asserts, and one for negative lookahead asserts:

"(?=abc)" matches zero characters only if they are followed by the expression "abc".

"(?!abc)" matches zero characters only if they are not followed by the expression "abc".

Word operators

The following operators are provided for compatibility with the GNU regular expression library.

"\w" matches any single character that is a member of the "word" character class; this is identical to the expression "[[:word:]]".

"\W" matches any single character that is not a member of the "word" character class; this is identical to the expression "[^[:word:]]".

"\<" matches the null string at the start of a word.

"\>" matches the null string at the end of a word.

"\b" matches the null string at either the start or the end of a word.

"\B" matches a null string within a word.

Escape operator
