[UPHPU] Extracting templates from web pages
Richard K Miller
richardkmiller at gmail.com
Wed Apr 2 12:28:27 MDT 2008
On Apr 2, 2008, at 12:11 PM, MilesTogoe wrote:
> Richard K Miller wrote:
>> Adrian Holovaty (creator of ChicagoCrime.org and Django) has a
>> Python script called templatemaker[1][2], which in theory would do
>> what I want. You feed it a bunch of similar web pages and it
>> produces a template with "holes" where the data was different
>> across each web page. In practice, it's too granular; it doesn't
>> recognize HTML. It looks at every character. I don't care about spaces between
>> tags. I only care about substantial content differences across
>> pages. Everything else can be moved to the template.
> Sounds like your excuse to step up and move to Python & Django! :)
I'm sure Python and Django have plenty of other virtues, but
templatemaker didn't work for me. Its engine is written in C and
probably needs to be modified to recognize HTML and ignore whitespace,
but that's outside of my area of expertise.
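
For what it's worth, the coarser behavior I'm after could be sketched roughly like this. This is not templatemaker's algorithm or API; it is a minimal illustration that uses Python's standard difflib in place of the C engine. The names tokenize, merge, make_template, and HOLE are made up for this example, and it assumes the pages share the same overall tag structure. Each page is split into tags and text runs, whitespace-only runs are dropped, and any token that differs between pages becomes a hole.

```python
import re
from difflib import SequenceMatcher

# Hypothetical placeholder marking a spot where pages differ.
HOLE = "{{ hole }}"


def tokenize(html):
    """Split HTML into tags and text runs, dropping whitespace-only text."""
    tokens = []
    for part in re.split(r"(<[^>]+>)", html):
        # Collapse runs of whitespace so formatting differences are ignored.
        part = " ".join(part.split())
        if part:
            tokens.append(part)
    return tokens


def merge(template, page_tokens):
    """Keep tokens shared by both sequences; put one hole where they differ."""
    matcher = SequenceMatcher(None, template, page_tokens, autojunk=False)
    merged = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            merged.extend(template[i1:i2])
        elif not merged or merged[-1] != HOLE:
            # Any insert, delete, or replace collapses into a single hole.
            merged.append(HOLE)
    return merged


def make_template(pages):
    """Fold a list of similar pages into one token-level template."""
    template = tokenize(pages[0])
    for page in pages[1:]:
        template = merge(template, tokenize(page))
    return template


if __name__ == "__main__":
    pages = [
        "<html><body>\n  <h1>Alice</h1>\n <p>Age: 30</p></body></html>",
        "<html><body><h1>Bob</h1><p>Age: 41</p>\n</body></html>",
    ]
    print("".join(make_template(pages)))
```

Because the comparison runs on whole tags and whole text runs rather than on characters, spaces between tags never show up as differences, and a text run that changes at all becomes a single hole.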