Many other algorithms have exponential edge cases. This can open yourself to DoS'ing if you accept regular expressions from the user (e.g. a search feature.)
No. While you are correct a DFA is far superior for parsing this specific subset of javascript regex, it does in no way make it ideal for debugging purposes.
1) In the user's program the regex is not going to be run on a dfa (since we are talking about the javascript variation which has back references). It makes more sense to warn the user about bad performance, than making them believe they are safe.
2) A debugger has to be true to the input. If the user wants to debug (a) it doesn't help that the debugger just casually transforms it into a*. That wouldn't make the diagrams fun at all.
3) It is entirely possible that in the future, the author wants to expand the awesome tool to a larger subset of javascript regex. This would probably make it break out of the finite automa space.
I do however agree that it's a pitty how many good regular expressions are run on stupid backtracking systems out there.
Javascript doesn't mandate that this regular expression be slow. In fact, in Safari and Firefox on OSX this regular expression runs fine. Chrome OSX fails but I remember it running fine in Linux yesterday (but I might be wrong.)
1) Back references do not mean you need to have exponential edge cases for vanilla REs
2) There is no one true way to execute an RE. There are good ways and bad ways, though.
3) The Thompson algorithm does not preclude non-regular extensions.
Anyway, just wanted to add the rsc link to the discussion :)
Sorry for the delayed response. Spent all day yesterday responding to feedback. The reason this crashes is due to the internal javascript engine.
In order to ensure that my engine (I simulate a kind of NFA) matches what javascript's engine matches, every time I match on my engine, I also try to exact match using javascript's engine. Unfortunately, javascript's engine always uses backtracking, even when it doesn't need to. Obviously this code should have been turned off for production, and I'll fix it on the next push.
To replicate the crash on your own, try typing:
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab'.match(/^(a)$/) in your console.
Interesting - it seems some browsers dont have this as an exponential edge case and indeed in those yours doesn't execute exponentially (I guess I got mixed up when testing.)
Anyway I dont consider it a "bug" or anything, just wanted to bring up the rsc paper for discussion :) Keep up the good work!
String: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab
This kills the browser.
ADDENDUM:
A good read on executing regular expressions in linear (and thus predictable) time is http://swtch.com/~rsc/regexp/regexp1.html
Many other algorithms have exponential edge cases. This can open yourself to DoS'ing if you accept regular expressions from the user (e.g. a search feature.)