pyparsing review

This is the sad truth: parsing is boring. And writing a parser is even worse.

If you can pick a scripting language for the job, you might think of doing it in Perl.

If you go that way, take a deep breath and dive into the black sea of Perl's funny regexps. They are funny only if you have a special love for regular expressions.

But if you are more comfortable with Python, pyparsing is a better solution.

Pyparsing is a library written in Python for building parsers: you sketch the grammar in BNF (Backus-Naur Form) and then express it directly in Python code.
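
To give an idea of what that looks like, here is a minimal sketch (my own toy example, not one from the book). The rule names and the input string are invented; the calls use pyparsing's classic camelCase names, and pyparsing 3 also offers snake_case equivalents such as parse_string.

```python
from pyparsing import Suppress, Word, alphanums, alphas, nums

# Toy grammar, written first as BNF:
#   assignment ::= identifier "=" integer
identifier = Word(alphas, alphanums + "_")
# The parse action converts the matched digits into a Python int.
integer = Word(nums).setParseAction(lambda toks: int(toks[0]))
# Suppress drops the "=" from the results; whitespace is skipped by default.
assignment = identifier + Suppress("=") + integer

result = assignment.parseString("retries = 3").asList()
print(result)  # ['retries', 3]
```

Each BNF rule becomes one Python expression, so the grammar and the code stay easy to compare side by side.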

O'Reilly has just published a "Short Cuts" e-book written by Paul McGuire; in less than 70 pages you get a very good insight into pyparsing.

Even if you are new to Python, the book is very easy to read.

And if you know nothing about parsers or about Backus and Naur, you will find an easy path to understanding them. Parsing is a tricky topic because of the grammar theory behind it, but for everyday work you can follow McGuire's introduction.

After some simple examples, you'll dive into a small web page parser.

It is amazing how you can extract data from web pages without a complex SAX parser, using only a very compact grammar.
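
As a rough illustration of how compact such a grammar can be, here is a small sketch that pulls links out of an HTML fragment. It is not the book's example: the HTML snippet, the example.org URLs and the "label" results name are made up for the demonstration.

```python
from pyparsing import SkipTo, makeHTMLTags

# Invented HTML fragment, used only for this demonstration.
html = (
    '<p>See <a href="https://example.org/a">first</a> and '
    '<a href="https://example.org/b">second</a>.</p>'
)

# makeHTMLTags returns expressions for the opening and closing tag;
# attributes of the opening tag become named results (here: href).
a_start, a_end = makeHTMLTags("a")
link = a_start + SkipTo(a_end)("label") + a_end

# scanString searches the whole text and yields (tokens, start, end).
links = [(toks.href, toks.label) for toks, start, end in link.scanString(html)]
print(links)
```

The whole "grammar" is the single line that defines link; everything around it is just input and output.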

After these intro examples, the book takes us to a more complex task: a parser for S-expressions, a Lisp-like expression language.

This example is important because complex data structures are often recursive, just as S-expressions are.
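
The recursion is the interesting part, so here is a heavily simplified sketch of it. This is not the grammar from the book (a full S-expression parser also handles strings and other token types); the atom definition and the sample input are my own assumptions.

```python
from pyparsing import Forward, Group, Suppress, Word, ZeroOrMore, alphanums

# Simplified atom: letters, digits and a few operator characters.
atom = Word(alphanums + "+-*/_")

# Forward declares a placeholder so the rule can refer to itself.
sexp = Forward()
sexp <<= atom | Group(Suppress("(") + ZeroOrMore(sexp) + Suppress(")"))

tree = sexp.parseString("(define (square x) (* x x))").asList()
print(tree)  # [['define', ['square', 'x'], ['*', 'x', 'x']]]
```

Group turns every parenthesized list into a nested Python list, so the recursive structure of the input survives in the result.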

The last chapter, "Search Engine in 100 Lines of Code", is a well-written example that shows us how to build a grammar for a small search engine.
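
To hint at what a search-engine grammar might look like, here is a small sketch of a query parser with and/or/not operators. It is my own guess at the general idea, not the code from the chapter; it leans on pyparsing's infixNotation helper, and the query string is invented.

```python
from pyparsing import CaselessKeyword, Word, alphas, infixNotation, opAssoc

AND, OR, NOT = map(CaselessKeyword, ["and", "or", "not"])

# A search term is any word that is not one of the operator keywords.
term = ~(AND | OR | NOT) + Word(alphas)

# Operators are listed from highest to lowest precedence.
query = infixNotation(
    term,
    [
        (NOT, 1, opAssoc.RIGHT),
        (AND, 2, opAssoc.LEFT),
        (OR, 2, opAssoc.LEFT),
    ],
)

tree = query.parseString("cat and not dog or fish", parseAll=True).asList()
print(tree)
```

The nested lists in tree mirror the operator precedence, which is what a search engine would then walk to evaluate the query.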

So this e-book is a "must" if you need to do even simple parsing and you… do not want to go crazy with too many regular expressions :)

 

 

 

 
