[UPHPU] Web page data extraction
Jay Newhouse
jay at newhousenetwork.net
Wed Jan 16 22:30:36 MST 2008
You might also want to try Snoopy. It can log in to sites and retrieve
the HTML they return. I find it's a little easier to implement than cURL.
http://sourceforge.net/projects/snoopy/
http://www.jonasjohn.de/snippets/php/snoopy-example.htm
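
For comparison with the cURL route suggested further down the thread, here
is a minimal sketch of the same log-in-and-fetch flow using only PHP's
bundled cURL and DOM extensions. It is not Snoopy's API. The function names
(fetch_behind_login, extract_table_rows), the URLs, the form field names,
and the assumption that the data sits in table cells are all hypothetical,
so adjust them to the actual site.

```php
<?php
// Sketch only: URLs, form field names, and the table layout are hypothetical.

/**
 * Log in with a form POST, then fetch a page on the same cURL handle so the
 * session cookies set by the login are sent with the second request.
 * Returns the page's HTML; throws RuntimeException on a transport error.
 */
function fetch_behind_login(string $loginUrl, array $credentials, string $pageUrl): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect many sites send after login
        CURLOPT_COOKIEFILE     => '',   // empty string turns on in-memory cookie handling
    ]);

    // Step 1: submit the login form.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($credentials));
    if (curl_exec($ch) === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }

    // Step 2: switch back to GET and request the page that holds the data.
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Page request failed: ' . curl_error($ch));
    }
    return $html;
}

/**
 * Collect the trimmed text of every <td> cell, grouped by table row.
 * Rows without any <td> (for example header rows made of <th>) are skipped.
 */
function extract_table_rows(string $html): array
{
    // Real-world HTML is rarely valid, so keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//tr') as $tr) {
        $cells = [];
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Hypothetical usage:
// $html = fetch_behind_login(
//     'https://example.com/login',
//     ['username' => 'me', 'password' => 'secret'],
//     'https://example.com/report'
// );
// $rows = extract_table_rows($html);
```

Both requests share one handle, which is what carries the session from the
login to the data page. The parsing step works on any HTML string, so it can
be tried out on a saved copy of the page before the login part is wired up.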
Jay
----- Original Message -----
From: "Mike Mackrory" <mike at echovue.com>
To: <boucha at gmail.com>
Cc: <uphpu at uphpu.org>
Sent: Friday, January 11, 2008 11:49 AM
Subject: Re: [UPHPU] Web page data extraction
> Thanks guys! I'll have to give this a whirl!
>
> On Jan 11, 2008 11:02 AM, Mike Mackrory <mike at echovue.com> wrote:
>> I have an interesting question.
>>
>> I wrote an Access application a year or two ago that I'm looking at
>> rewriting as a web app. One thing I'm not sure I can move over to a web
>> app is a tool I put together to let the users extract data from web pages.
>>
>> In the Access app, I open a browser window, they log into the secure
>> site and find the page with the data they need, then click a button. The
>> program then takes the HTML source, parses out the necessary info, and
>> loads it into the local database.
>>
>> Does anyone know if this is possible using PHP or JavaScript? Using an
>> IFrame would be perfect, but since the site they want to extract the info
>> from is on a different domain, this doesn't appear to be possible. Anyone
>> have any ideas on how I could do this? The big obstacle is just finding a
>> way to get the source code of the web page being viewed.
>>
>> Thanks
>>
>> Mike
>>
>
> You can get the source of the page by using fopen.
> http://us.php.net/fopen
>
> And like Wade said, you can use Curl to handle the logging in and
> stuff. http://us.php.net/curl
>
> Dave
>
> _______________________________________________
>
> UPHPU mailing list
> UPHPU at uphpu.org
> http://uphpu.org/mailman/listinfo/uphpu
> IRC: #uphpu on irc.freenode.net
>