documented the new ngx.re.match() API; fixed unmatched subpattern capturing (we should return nil instead of "" here); only enable ngx.re.match when PCRE is enabled in the nginx core.

agentzh (章亦春) 2011-08-16 23:58:45 +08:00
Parent 8f021621f2
Commit bf52ee08be
7 changed files: 214 additions and 9 deletions

57
README

@ -8,9 +8,9 @@ Status
This module is under active development and is already production ready.
Version
This document describes lua-nginx-module v0.2.1rc10
This document describes lua-nginx-module v0.2.1rc11
(<https://github.com/chaoslawful/lua-nginx-module/downloads>) released
on 14 August 2011.
on 16 August 2011.
Synopsis
# set search paths for pure Lua external libraries (';;' is the default path):
@ -1280,11 +1280,64 @@ Nginx API for Lua
then ... end
ngx.is_subrequest
syntax: *value = ngx.is_subrequest*
context: *set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua**
Returns "true" if the current request is an nginx subrequest, or "false"
otherwise.
ngx.re.match
syntax: *captures = ngx.re.match(subject, regex, options?)*
context: *set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua**
Matches the "subject" string using the Perl-compatible regular
expression "regex" with the optional "options".
Only the first occurrence of the match is returned, or "nil" if no match
is found.
When a match is found, a Lua table "captures" is returned, where
"captures[0]" holds the whole matched substring, "captures[1]" holds
the capture of the first parenthesized subpattern, "captures[2]" the
second, and so on. Here are some examples:
local m = ngx.re.match("hello, 1234", "[0-9]+")
-- m[0] == "1234"
local m = ngx.re.match("hello, 1234", "([0-9])[0-9]+")
-- m[0] == "1234"
-- m[1] == "1"
Unmatched subpatterns will take "nil" values in their "captures" table
fields. For instance,
local m = ngx.re.match("hello, world", "(world)|(hello)")
-- m[0] == "hello"
-- m[1] == nil
-- m[2] == "hello"
You can also specify "options" to control how the match will be
performed. The following option characters are supported:
m multi-line mode (just like Perl 5's //m)
s single-line mode (just like Perl 5's //s)
i caseless mode (just like Perl 5's //i)
u UTF-8 mode
x extended mode (just like Perl 5's //x)
These characters can be combined together, for example,
local m = ngx.re.match("hello, world", "HEL LO", "ix")
-- m[0] == "hello"
local m = ngx.re.match("hello, 美好生活", "HELLO, (.{2})", "iu")
-- m[0] == "hello, 美好"
-- m[1] == "美好"
This method requires the PCRE library to be enabled in your Nginx build.
ndk.set_var.DIRECTIVE
syntax: *res = ndk.set_var.DIRECTIVE_NAME* context: *rewrite_by_lua*,
access_by_lua*, content_by_lua**


@ -13,7 +13,7 @@ This module is under active development and is already production ready.
Version
=======
This document describes lua-nginx-module [v0.2.1rc10](https://github.com/chaoslawful/lua-nginx-module/downloads) released on 14 August 2011.
This document describes lua-nginx-module [v0.2.1rc11](https://github.com/chaoslawful/lua-nginx-module/downloads) released on 16 August 2011.
Synopsis
========
@ -1513,10 +1513,69 @@ Parse the http time string (as returned by [ngx.http_time](http://wiki.nginx.org
ngx.is_subrequest
-----------------
**syntax:** *value = ngx.is_subrequest*
**context:** *set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua**
Returns `true` if the current request is an nginx subrequest, or `false` otherwise.
ngx.re.match
------------
**syntax:** *captures = ngx.re.match(subject, regex, options?)*
**context:** *set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua**
Matches the `subject` string using the Perl-compatible regular expression `regex` with the optional `options`.
Only the first occurrence of the match is returned, or `nil` if no match is found.
When a match is found, a Lua table `captures` is returned, where `captures[0]` holds the whole matched substring, `captures[1]` holds the capture of the first parenthesized subpattern, `captures[2]` the second, and so on. Here are some examples:
local m = ngx.re.match("hello, 1234", "[0-9]+")
-- m[0] == "1234"
local m = ngx.re.match("hello, 1234", "([0-9])[0-9]+")
-- m[0] == "1234"
-- m[1] == "1"
Unmatched subpatterns will take `nil` values in their `captures` table fields. For instance,
local m = ngx.re.match("hello, world", "(world)|(hello)")
-- m[0] == "hello"
-- m[1] == nil
-- m[2] == "hello"
You can also specify `options` to control how the match will be performed. The following option characters are supported:
m multi-line mode (just like Perl 5's //m)
s single-line mode (just like Perl 5's //s)
i caseless mode (just like Perl 5's //i)
u UTF-8 mode
x extended mode (just like Perl 5's //x)
These characters can be combined together, for example,
local m = ngx.re.match("hello, world", "HEL LO", "ix")
-- m[0] == "hello"
local m = ngx.re.match("hello, 美好生活", "HELLO, (.{2})", "iu")
-- m[0] == "hello, 美好"
-- m[1] == "美好"
This method requires the PCRE library to be enabled in your Nginx build.
ndk.set_var.DIRECTIVE
---------------------
**syntax:** *res = ndk.set_var.DIRECTIVE_NAME*


@ -10,7 +10,7 @@ This module is under active development and is already production ready.
= Version =
This document describes lua-nginx-module [https://github.com/chaoslawful/lua-nginx-module/downloads v0.2.1rc10] released on 14 August 2011.
This document describes lua-nginx-module [https://github.com/chaoslawful/lua-nginx-module/downloads v0.2.1rc11] released on 16 August 2011.
= Synopsis =
<geshi lang="nginx">
@ -1456,10 +1456,68 @@ Parse the http time string (as returned by [[#ngx.http_time|ngx.http_time]]) int
end
</geshi>
== ngx.is_subrequest ==
'''syntax:''' ''value = ngx.is_subrequest''
'''context:''' ''set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua*''
Returns <code>true</code> if the current request is an nginx subrequest, or <code>false</code> otherwise.
== ngx.re.match ==
'''syntax:''' ''captures = ngx.re.match(subject, regex, options?)''
'''context:''' ''set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua*''
Matches the <code>subject</code> string using the Perl-compatible regular expression <code>regex</code> with the optional <code>options</code>.
Only the first occurrence of the match is returned, or <code>nil</code> if no match is found.
When a match is found, a Lua table <code>captures</code> is returned, where <code>captures[0]</code> holds the whole matched substring, <code>captures[1]</code> holds the capture of the first parenthesized subpattern, <code>captures[2]</code> the second, and so on. Here are some examples:
<geshi lang="lua">
local m = ngx.re.match("hello, 1234", "[0-9]+")
-- m[0] == "1234"
</geshi>
<geshi lang="lua">
local m = ngx.re.match("hello, 1234", "([0-9])[0-9]+")
-- m[0] == "1234"
-- m[1] == "1"
</geshi>
Unmatched subpatterns will take <code>nil</code> values in their <code>captures</code> table fields. For instance,
<geshi lang="lua">
local m = ngx.re.match("hello, world", "(world)|(hello)")
-- m[0] == "hello"
-- m[1] == nil
-- m[2] == "hello"
</geshi>
You can also specify <code>options</code> to control how the match will be performed. The following option characters are supported:
<geshi lang="text">
m multi-line mode (just like Perl 5's //m)
s single-line mode (just like Perl 5's //s)
i caseless mode (just like Perl 5's //i)
u UTF-8 mode
x extended mode (just like Perl 5's //x)
</geshi>
These characters can be combined together, for example,
<geshi lang="lua">
local m = ngx.re.match("hello, world", "HEL LO", "ix")
-- m[0] == "hello"
</geshi>
<geshi lang="lua">
local m = ngx.re.match("hello, 美好生活", "HELLO, (.{2})", "iu")
-- m[0] == "hello, 美好"
-- m[1] == "美好"
</geshi>
This method requires the PCRE library to be enabled in your Nginx build.
== ndk.set_var.DIRECTIVE ==
'''syntax:''' ''res = ndk.set_var.DIRECTIVE_NAME''
'''context:''' ''rewrite_by_lua*, access_by_lua*, content_by_lua*''


@ -3482,6 +3482,7 @@ ngx_http_lua_ngx_set_ctx(lua_State *L)
}
#if (NGX_PCRE)
int
ngx_http_lua_ngx_re_match(lua_State *L)
{
@ -3596,8 +3597,8 @@ ngx_http_lua_ngx_re_match(lua_State *L)
if (rc < 0) {
ngx_pfree(r->pool, cap);
return luaL_error(L, ngx_regex_exec_n " failed: %d on \"%s\" using \"%s\"",
(int) rc, subj.data, pat.data);
return luaL_error(L, ngx_regex_exec_n " failed: %d on \"%s\" "
"using \"%s\"", (int) rc, subj.data, pat.data);
}
lua_createtable(L, re.captures + 1 /* narr */, 1 /* nrec */);
@ -3609,10 +3610,16 @@ ngx_http_lua_ngx_re_match(lua_State *L)
dd("rc = %d", (int) rc);
for (i = 0, n = 0; i <= re.captures; i++, n += 2) {
lua_pushlstring(L, (char *) &subj.data[cap[n]],
cap[n + 1] - cap[n]);
dd("capture %d: %d %d", i, cap[n], cap[n + 1]);
if (cap[n] < 0) {
lua_pushnil(L);
dd("pushing capture %s at %d", lua_tostring(L, -1), (int) i);
} else {
lua_pushlstring(L, (char *) &subj.data[cap[n]],
cap[n + 1] - cap[n]);
dd("pushing capture %s at %d", lua_tostring(L, -1), (int) i);
}
lua_rawseti(L, -2, (int) i);
}
@ -3621,4 +3628,5 @@ ngx_http_lua_ngx_re_match(lua_State *L)
return 1;
}
#endif /* NGX_PCRE */


@ -61,7 +61,9 @@ int ngx_http_lua_ngx_header_set(lua_State *L);
int ngx_http_lua_ngx_req_get_headers(lua_State *L);
#if (NGX_PCRE)
int ngx_http_lua_ngx_re_match(lua_State *L);
#endif
int ngx_http_lua_ngx_exec(lua_State *L);


@ -719,6 +719,7 @@ init_ngx_lua_globals(ngx_conf_t *cf, lua_State *L)
lua_setfield(L, -2, "var");
#if (NGX_PCRE)
/* {{{ ngx.re table */
lua_newtable(L); /* .re */
@ -729,6 +730,7 @@ init_ngx_lua_globals(ngx_conf_t *cf, lua_State *L)
lua_setfield(L, -2, "re");
/* }}} */
#endif /* NGX_PCRE */
/* {{{ ngx.req table */


@ -296,3 +296,26 @@ error: bad argument #2 to '?' (failed to compile regex "(abc": pcre_compile() fa
--- response_body
error: bad argument #3 to '?' (unknown flag "H")
=== TEST 15: extended mode (ignore whitespaces)
--- config
location /re {
content_by_lua '
m = ngx.re.match("hello, world", "(world)|(hello)", "x")
if m then
ngx.say(m[0])
ngx.say(m[1])
ngx.say(m[2])
else
ngx.say("not matched: ", m)
end
';
}
--- request
GET /re
--- response_body
hello
nil
hello