Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

And the xml paresr will fail if the xml is not well formed while the regex will just keep sailing along. I had an example of this with BlogPoster.py which uses python xmlrpc. There are Wordpress hosts which return invalid xml and this causes an exception, I reimplemented what I needed with Bash and cURL using regexes and it works fine.


I'm not sure if you mean that pro or against regular expressions. That is a good example for #5. Regular expressions are great to quickly patch together something that kinda works, but:

1. Those hosts are still broken. The next person will have to jump the same hoops to support them.

2. You parser is very permissive. It will encourage people to create even more broken implementations.

3. The specification of this protocol is now worthless. There is no way to safely add new functionality. Any new element or attribute can break those regexes. Everyone has to take every implementation into account.

4. You are probably missing some corner cases like CDATA elements or quoted characters.

From your point of view, it probably makes sense to support even broken sites. But, you are helping to create next HTML - where every implementation works differently and you have to test everything on every browser.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: