Comment by egonschiele

14 years ago

I love Haskell, but parsing XML with it is a huge pain. Someday I want to write a better XML parsing library, but for now I use HXT. I wrote a blog post a while back that shows some sample usage: http://adit.io/posts/2012-03-10-building_a_concurrent_web_sc....

I use hexpat and it works nicely. I wasn't able to get very far with either HaXml or HXT before running into grief.

  • I use HXT to parse HTML. AFAICT, Hexpat doesn't do much besides parse the XML file into a tree. It doesn't have the niceties that Nokogiri or BeautifulSoup do. For example, I can use Nokogiri to get all the links on a page like so: page.css("a").

    HXT allows me to come close to this:

    tree >>> getXPathTreesInDoc "//a"

    But I haven't seen a single Haskell XML parsing library that is as nice as Nokogiri.
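
    For reference, a minimal sketch of the HXT version as a complete program. It assumes the hxt and hxt-xpath packages are installed; the function name extractLinks and the sample HTML string are made up for illustration, not taken from the blog post:

    ```haskell
    import Text.XML.HXT.Core
    import Text.XML.HXT.XPath.Arrows (getXPathTreesInDoc)

    -- Hypothetical helper: parse an HTML string leniently and collect
    -- the href attribute of every <a> element found by the XPath query.
    extractLinks :: String -> IO [String]
    extractLinks html =
      runX $ readString [withParseHTML yes, withWarnings no] html
             >>> getXPathTreesInDoc "//a"
             >>> getAttrValue "href"

    main :: IO ()
    main = do
      -- Made-up sample input
      links <- extractLinks "<html><body><a href=\"/a\">A</a> <a href=\"/b\">B</a></body></html>"
      mapM_ putStrLn links
    ```

    It works, but compare the setup (two packages, arrow combinators, runX) to the Nokogiri one-liner.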

    • In my work, I read in XML, parse its elements, attributes, and data, producing new XML. Along with Parsec, Hexpat is well-suited to the task.

      I haven't had to parse HTML in Haskell. I use BeautifulSoup for that. I wouldn't be surprised if the Haskell libraries aren't as useful for that kind of thing.


The article was linkbait that rehashes tired old arguments.

But I do agree on the XML. It is just too hard for beginners to grok how to process XML in Haskell.