XML Parsing


#1

Hi,

RSS feeds are relatively easy to parse, and the SAX approach followed by NSXMLParser suits them.

But what if you were given some other website and asked to fetch information from it and populate your TableView? How would you go about it?

In my case, I am trying to develop an iPhone app for my university, the University of Massachusetts. I got the news feed working.

Now I am on to the next level, where I want the app to display the academic calendar information in a TableView.

The link is - http://www.umass.edu/registrar/gen_info/academic_calendar.htm

What parser will help me achieve this?

I have been stuck on this for the last two days.

Any thoughts?

Thanks,
S.


#2

Seems to me like you need to query the data via an API using an NSURL … and the delivery format determines which parsing system to use. In the case of a webpage built with data, images, etc. … I believe you’re dealing with a formidable challenge.

As far as parsers of choice … on iOS 5 it’s NSXMLParser and NSJSONSerialization (which I’m not familiar with yet); however, I’m told there are several third-party libraries. Which to use again depends on your needs … but parsing webpages is not happening (yet) on iOS 5 that I’m aware of. Hope I’m wrong! :)


#3

NSXMLParser is not an HTML parser, hence it doesn’t work on websites that return HTML content.


#4

If the page were XHTML, you could still use NSXMLParser. However, this page is not, so it looks like you’re left with one of the following options.

Use a 3rd-party library to clean up the HTML so that it validates as XHTML, then parse using the techniques from the book. It’s actually pretty easy to generate something akin to a DOM tree by hand with just a few lines of NSXMLParser code, as long as the parser successfully gets through the whole document. I built the code for that and then ran into an error fairly quickly (a tag that did not close properly).

Use a 3rd-party library to parse the HTML as-is. I haven’t done this myself, but the best I could find with a little search-engine elbow grease was here: http://stackoverflow.com/questions/405749/parsing-html-on-the-iphone

Extract the stuff that you’re interested in by hand, using a combination of NSString utilities and NSRegularExpression. This is probably more trouble than it’s worth, but it’s an option nonetheless.
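To give an idea of what the first option looks like once the markup is well-formed, here is a rough sketch of building a DOM-like tree from the SAX callbacks. It is written in Swift, where NSXMLParser is called XMLParser. The Node and TreeBuilder types and the parseTree function are names I made up for illustration, and the sample markup in the usage note is invented, not taken from the calendar page.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives here on non-Apple platforms
#endif

// Hypothetical node type for illustration; not part of Foundation.
final class Node {
    let name: String
    let attributes: [String: String]
    var text = ""
    var children: [Node] = []

    init(name: String, attributes: [String: String]) {
        self.name = name
        self.attributes = attributes
    }
}

// Turns the SAX-style callbacks into a tree by keeping a stack of open elements.
final class TreeBuilder: NSObject, XMLParserDelegate {
    private(set) var root: Node?
    private var stack: [Node] = []

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        let node = Node(name: elementName, attributes: attributeDict)
        if let parent = stack.last {
            parent.children.append(node)   // nested element
        } else {
            root = node                    // first element seen is the root
        }
        stack.append(node)
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Text may arrive in several chunks, so append rather than assign.
        stack.last?.text += string
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        _ = stack.popLast()
    }
}

// Returns the root node, or nil if the parser could not get through the document.
func parseTree(_ xhtml: String) -> Node? {
    let parser = XMLParser(data: Data(xhtml.utf8))
    let builder = TreeBuilder()
    parser.delegate = builder
    return parser.parse() ? builder.root : nil
}
```

Calling parseTree on a small well-formed string such as `<table><tr><td>Sample event</td><td>Sample date</td></tr></table>` should give a table node with one tr child that has two td children. Since parse() reports failure when it hits an error, a page with an unclosed tag would come back as nil here, which is why the cleanup step has to happen first.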
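For the third option, a minimal sketch of pulling table cells out of raw HTML with NSRegularExpression might look like the following (Swift again). It assumes the data you want sits in plain, non-nested td cells; I have not checked that against the actual calendar page, and the tableCells function name is just something I picked.

```swift
import Foundation

// Returns the trimmed text of every <td>...</td> cell found in `html`.
// Assumes cells are not nested; regexes cannot handle arbitrary HTML.
func tableCells(in html: String) -> [String] {
    let pattern = "<td[^>]*>(.*?)</td>"
    guard let regex = try? NSRegularExpression(
        pattern: pattern,
        options: [.caseInsensitive, .dotMatchesLineSeparators]) else {
        return []
    }
    let fullRange = NSRange(html.startIndex..., in: html)
    return regex.matches(in: html, options: [], range: fullRange).compactMap { match -> String? in
        // Capture group 1 holds whatever sits between the opening and closing tag.
        guard let innerRange = Range(match.range(at: 1), in: html) else { return nil }
        let inner = String(html[innerRange])
        // Drop any inline tags such as <b> or <a>, then trim whitespace.
        let stripped = inner.replacingOccurrences(
            of: "<[^>]+>", with: "", options: .regularExpression)
        return stripped.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}
```

You would then pair the cells up (event, date, event, date, and so on) to build the rows for the table view. This kind of thing tends to break as soon as the page layout changes, which is part of why it is probably more trouble than it is worth.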