Over the past few weeks I’ve been throwing this together in my full time – a class library to pull HTML nodes from a page using a CSS selector. hpricot for Ruby would be a good comparison. It uses the HTML Agility Pack behind the scenes, to clean up the source document, and provide means of reading the document nodes.
I’ve got a limited suite of unit tests which document the current level of support, most CSS 2.1 stuff is in there, and a couple of CSS 3 ones. The unit tests are partially pinched from jQuery’s selector engine unit tests, so thank you to the jQuery team.
It works like this:
SelectorEngine engine = new SelectorEngine(htmlString); IList
nodes = engine.Parse("#p>a");
Pretty simple stuff. There are no binaries yet, as I consider it alpha-quality, but you can check it out over at Google Code. Contributions would be appreciated.