That is fine for throwaway scripts. But such "perl duct tape" is not 100% accurate and will break for no reason. There is no place for such solutions in reliable and maintainable software.
Why would it "break for no reason"?! For all I know, a regexp matching one small piece of the page is far less prone to breaking than a parser that has to analyze the whole page. The designer changes one <div> or an id/class somewhere near the top of the DOM tree and you can't reach the node you are looking for anymore. The same goes for a regexp of course, but it looks at a smaller portion of the HTML, so it's less likely to be affected by small changes in some unrelated part of the page. And any major redesign will break any dedicated scraper, no matter which parser it uses...
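To make that concrete, here is a minimal sketch (the page structure, the price field and the class name are all invented for illustration): a selector tied to the full DOM path breaks as soon as a wrapper <div> is added, while a regex anchored on a small local fragment keeps matching.

    import re

    # Hypothetical page, before and after a redesign that adds one wrapper <div>.
    before = '<html><body><div id="main"><span class="price">9.99</span></div></body></html>'
    after = '<html><body><div id="wrap"><div id="main"><span class="price">9.99</span></div></div></body></html>'

    # A selector tied to the full DOM path, e.g. the XPath
    #   /html/body/div/span[@class="price"]
    # matches "before" but not "after": the extra wrapper changes the path.

    # A regex anchored only on the small local fragment keeps matching both.
    price_re = re.compile(r'<span class="price">([^<]+)</span>')
    print(price_re.search(before).group(1))  # 9.99
    print(price_re.search(after).group(1))   # 9.99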
And the XML parser will fail if the XML is not well-formed, while the regex will just keep sailing along. I had an example of this with BlogPoster.py, which uses Python's xmlrpc. There are WordPress hosts which return invalid XML and this causes an exception, so I reimplemented what I needed with Bash and cURL using regexes, and it works fine.
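For what it's worth, a rough Python equivalent of that workaround (not the actual Bash/cURL script; the endpoint URL is a placeholder) looks something like this:

    import re
    import urllib.request
    import xml.parsers.expat
    import xmlrpc.client

    # Placeholder endpoint; WordPress exposes XML-RPC at xmlrpc.php under the blog root.
    URL = "https://blog.example.com/xmlrpc.php"

    try:
        proxy = xmlrpc.client.ServerProxy(URL)
        # Raises xml.parsers.expat.ExpatError when the host returns XML that
        # is not well-formed (e.g. PHP warnings printed before the response).
        methods = proxy.system.listMethods()
    except xml.parsers.expat.ExpatError:
        # Regex fallback: re-send the same call, read the raw body and pull out
        # only the <string> values, ignoring whether the whole document parses.
        request = xmlrpc.client.dumps((), "system.listMethods").encode()
        body = urllib.request.urlopen(URL, data=request).read().decode("utf-8", "replace")
        methods = re.findall(r"<string>([^<]+)</string>", body)

    print(methods)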
I'm not sure whether you mean that as an argument for or against regular expressions. That is a good example for #5. Regular expressions are great for quickly patching together something that kinda works, but:
1. Those hosts are still broken. The next person will have to jump through the same hoops to support them.
2. Your parser is very permissive. It will encourage people to create even more broken implementations.
3. The specification of this protocol is now worthless. There is no way to safely add new functionality. Any new element or attribute can break those regexes. Everyone has to take every implementation into account.
4. You are probably missing some corner cases, like CDATA sections or escaped characters (see the sketch below).
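To illustrate point 4, a small sketch (the <post>/<title> document is invented): a naive regex hands back raw markup, while a real parser decodes entities and unwraps CDATA.

    import re
    import xml.etree.ElementTree as ET

    # Two equivalent documents: one uses an entity reference, one uses CDATA.
    escaped = "<post><title>Cats &amp; dogs</title></post>"
    cdata = "<post><title><![CDATA[Cats & dogs]]></title></post>"

    # The naive regex hands back whatever raw text sits between the tags...
    pattern = re.compile(r"<title>(.*?)</title>", re.S)
    print(pattern.search(escaped).group(1))  # 'Cats &amp; dogs' (entity not decoded)
    print(pattern.search(cdata).group(1))    # '<![CDATA[Cats & dogs]]>' (markers leak through)

    # ...while the parser decodes the entity and unwraps the CDATA section,
    # returning the same text in both cases.
    print(ET.fromstring(escaped).findtext("title"))  # 'Cats & dogs'
    print(ET.fromstring(cdata).findtext("title"))    # 'Cats & dogs'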
From your point of view it probably makes sense to support even broken sites. But you are helping to create the next HTML, where every implementation works differently and you have to test everything in every browser.