avoid exponential regex match for java and objc function names

In the old regex

^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

you can backtrack arbitrarily from [A-Za-z_0-9]* into [A-Za-z_], thus
causing an exponential number of backtracks.  Ironically it also causes
the regex not to work as intended; for example "catch" can match the
underlined part of the regex, the first repetition matching "c" and
the second matching "atch".

The replacement regex avoids this problem, because it makes sure that
at least a space/tab is eaten on each repetition.  In other words,
a suffix of a repetition can never be a prefix of the next repetition.

Signed-off-by: Paolo Bonzini <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Paolo Bonzini 2009-06-17 16:26:06 +02:00 коммит произвёл Junio C Hamano
Родитель 9b7dc71835
Коммит 959e2e64a5
1 изменённых файлов: 3 добавлений и 2 удалений

Просмотреть файл

@ -13,7 +13,8 @@ PATTERNS("html", "^[ \t]*(<[Hh][1-6][ \t].*>.*)$",
"[^<>= \t]+|[^[:space:]]|[\x80-\xff]+"),
PATTERNS("java",
"!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
"^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$",
"^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
/* -- */
"[a-zA-Z_][a-zA-Z0-9_]*"
"|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
"|[-+*/<>%&^|=!]="
@ -25,7 +26,7 @@ PATTERNS("objc",
/* Objective-C methods */
"^[ \t]*([-+][ \t]*\\([ \t]*[A-Za-z_][A-Za-z_0-9* \t]*\\)[ \t]*[A-Za-z_].*)$\n"
/* C functions */
"^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$\n"
"^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$\n"
/* Objective-C class/protocol definitions */
"^(@(implementation|interface|protocol)[ \t].*)$",
/* -- */