Take a look at the following regexp: ^[0-2]~(([0-9A-F]{2})+|~)+$. All nice and cute, isn't it? Well, don't think too fast. I have some input text that can literally kill your regexp engine if you try to match against this regexp (the match will take an exponential time to match each time the input text grows by 2 characters).
There are "killable" regexp engines, and "non-killable", so I guess it all boils down to how much the regexp engine suck. Without a big surprise, Perl's is not killable, when Ruby's, Python's and Java's are.
Killable:
[gc@meuh /tmp] time ruby -e '"0~0000000000000000000000000000000000000000000~" =~ /^[0-2]~(([0-9A-F]{2})+|~)+$/'
real 0m1.947s
[gc@meuh /tmp] time ruby -e '"0~000000000000000000000000000000000000000000000~" =~ /^[0-2]~(([0-9A-F]{2})+|~)+$/'
real 0m3.939s
[gc@meuh /tmp] time ruby -e '"0~00000000000000000000000000000000000000000000000~" =~ /^[0-2]~(([0-9A-F]{2})+|~)+$/'
real 0m7.920s
...
[gc@meuh /tmp] time python -c "import re; re.match( r'^[0-2]~(([0-9A-F]{2})+|~)+$', '0~0000000000000000000000000000000000000000000~' )"
real 0m1.343s
[gc@meuh /tmp] time python -c "import re; re.match( r'^[0-2]~(([0-9A-F]{2})+|~)+$', '0~000000000000000000000000000000000000000000000~' )"
real 0m2.667s
[gc@meuh /tmp] time python -c "import re; re.match( r'^[0-2]~(([0-9A-F]{2})+|~)+$', '0~00000000000000000000000000000000000000000000000~' )"
real 0m5.397s
..
public static void main( String args[] ) {
Pattern.compile("^[0-2]~(([0-9A-F]{2})+|~)+$").matcher(args[0]).matches();
}
[gc@meuh /tmp] time java T 0~0000000000000000000000000000000000000000000~
real 0m1.057s
[gc@meuh /tmp] time java T 0~000000000000000000000000000000000000000000000~
real 0m2.045s
[gc@meuh /tmp] time java T 0~00000000000000000000000000000000000000000000000~
real 0m3.968s
[gc@meuh /tmp] echo 'Str.string_match (Str.regexp "^[0-2]~\\(\\([0-9A-F][0-9A-F]\\)+\\|~\\)+$") "0~0000000000000000000000000000000000000000000~" 0' > t.ml; time ocaml str.cma t.ml real 0m0.288s [gc@meuh /tmp] echo 'Str.string_match (Str.regexp "^[0-2]~\\(\\([0-9A-F][0-9A-F]\\)+\\|~\\)+$") "0~000000000000000000000000000000000000000000000~" 0' > t.ml; time ocaml str.cma t.ml real 0m0.559s [gc@meuh /tmp] echo 'Str.string_match (Str.regexp "^[0-2]~\\(\\([0-9A-F][0-9A-F]\\)+\\|~\\)+$") "0~00000000000000000000000000000000000000000000000~" 0' > t.ml; time ocaml str.cma t.ml real 0m1.102s
Semi-killable:
[gc@meuh /tmp] time php -r 'preg_match("/^[0-2]~(([0-9A-F]{2})+|~)+$/", "0~0000000000000000000000000000000000000000000~");'
real 0m0.613s
[gc@meuh /tmp] time php -r 'preg_match("/^[0-2]~(([0-9A-F]{2})+|~)+$/", "0~000000000000000000000000000000000000000000000~");'
real 0m0.614s
[gc@meuh /tmp] time php -r 'preg_match("/^[0-2]~(([0-9A-F]{2})+|~)+$/", "0~00000000000000000000000000000000000000000000000~");'
real 0m0.615s
[gc@meuh /tmp] time php -r 'preg_match("/^[0-2]~(([0-9A-F]{2})+|~)+$/", "0~00000000000000000000000000000000000000000000~");'
real 0m0.071s
[olivier@angband ~]$ time -p clisp -q -K full -x '(pcre:pcre-exec (pcre:pcre-compile "^[0-2]~(([0-9A-F]{2})+|~)+$") "0~0000000000000000000000000000000000000000000~")
real 0.60
[olivier@angband ~]$ time -p clisp -q -K full -x '(pcre:pcre-exec (pcre:pcre-compile "^[0-2]~(([0-9A-F]{2})+|~)+$") "0~000000000000000000000000000000000000000000000~")'
real 0.61
[olivier@angband ~]$ time -p clisp -q -K full -x '(pcre:pcre-exec (pcre:pcre-compile "^[0-2]~(([0-9A-F]{2})+|~)+$") "0~00000000000000000000000000000000000000000000000~")'
real 0.60
[olivier@angband ~]$ time -p clisp -q -K full -x '(pcre:pcre-exec (pcre:pcre-compile "^[0-2]~(([0-9A-F]{2})+|~)+$") "0~0000000000000000000000000000000000000000000000~")'
real 0.02
Not killable:
perl -e '"0~0000000000000000000000000000000000000000000~" =~ /^[0-2]~(([0-9A-F]{2})+|~)+$/'
echo "0~0000000000000000000000000000000000000000000~" | egrep '^[0-2]~(([0-9A-F]{2})+|~)+$'
clisp -q -K full -x '(regexp:match "^[0-2]~\\(\\([0-9A-F]\\{2\\}\\)\\+\\|~\\)\\+$" "0~00000000000000000000000000000000000000000000000~")'
Last update: Wed May 4 15:32:44 2005