Flick Harrison on Thu, 7 Jan 2010 11:38:26 +0100 (CET)



Re: <nettime> fast-changing propaganda website archiving tools?


Thanks for the responses, people.

I've tested out the tools suggested and still encounter the same
problems.  Maybe this is sliding from a search for tools into a techie
discussion... but more tools would be appreciated!

On the Sri Lankan / Tamil campaign site, for instance,

http://www.defence.lk/orbat/Default.asp

wget-based interfaces don't follow links within the Flash files, nor
(I think) localize the links within Flash animations.  So clicking on
"show photos" or "show animation" in the downloaded version, for
instance, doesn't work.

I tried wgetting the entire SL Ministry of Defence site and after 6  
hours and 6.4 gigs of downloads, the downloaded version of the  
interactive battle map still doesn't work.
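
One possible workaround, as a rough sketch only: pull the URL-like strings
out of each downloaded .swf and fetch those by hand. The code below assumes
the links are stored as plain ASCII strings inside the file, which is not
guaranteed (they can be built at runtime or loaded from elsewhere). The
names swf_body and candidate_links and the extension list are my own
inventions for illustration. It relies on the SWF signature bytes: "FWS"
means uncompressed, "CWS" means everything after the first 8 bytes is
zlib-compressed.

```python
import re
import zlib

# Runs of printable ASCII, at least 5 characters long.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{5,}")

# Hypothetical list of extensions worth fetching; adjust per site.
EXTENSIONS = (".swf", ".xml", ".jpg", ".jpeg", ".png", ".gif",
              ".flv", ".mp3", ".asp", ".htm", ".html")


def swf_body(data):
    """Return the SWF payload after the 8-byte header, decompressed if needed."""
    signature = data[:3]
    if signature == b"FWS":          # uncompressed SWF
        return data[8:]
    if signature == b"CWS":          # zlib-compressed after the header
        return zlib.decompress(data[8:])
    raise ValueError("not an FWS/CWS file: %r" % signature)


def candidate_links(data):
    """List strings in the SWF that look like URLs or asset paths.

    Assumption: links appear as plain ASCII strings. Expect false
    positives and stray leading characters; check the results by hand.
    """
    found = []
    for run in PRINTABLE_RUN.findall(swf_body(data)):
        text = run.decode("ascii").strip()
        lowered = text.lower()
        looks_like_url = lowered.startswith(("http://", "https://"))
        if (looks_like_url or lowered.endswith(EXTENSIONS)) and text not in found:
            found.append(text)
    return found
```

Usage would be something like candidate_links(open("map.swf", "rb").read()),
where map.swf is a placeholder for whichever file wget saved. This only
finds what to download; it does not rewrite the links inside the Flash
file, so the localization problem remains.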

On this other site,

http://www.thisisdion.ca/Htmlsite/_old_html_index.html

JavaScript-type popup links (i.e.
onClick="MyWindow=window.open('meetDionpop.html')") don't get followed
by wget, even if it's told to follow links.
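
A minimal sketch of how those popup targets could be collected
automatically, assuming they are written as literal strings inside
window.open(...) as in the example above. The function name popup_urls is
made up for illustration, and the regex will miss targets that are built
from variables.

```python
import re
from urllib.parse import urljoin

# Matches window.open('target' ...) with single or double quotes.
WINDOW_OPEN = re.compile(r"""window\.open\(\s*(['"])(.*?)\1""", re.IGNORECASE)


def popup_urls(html, base_url):
    """Return absolute URLs of literal window.open() targets, without duplicates."""
    urls = []
    for _quote, target in WINDOW_OPEN.findall(html):
        url = urljoin(base_url, target)   # resolve relative to the page
        if target and url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Sample markup modelled on the popup link quoted above.
    sample = '''<a href="#" onClick="MyWindow=window.open('meetDionpop.html' )">x</a>'''
    base = "http://www.thisisdion.ca/Htmlsite/_old_html_index.html"
    for url in popup_urls(sample, base):
        print(url)
```

The printed URLs could be saved to a file and handed to wget with -i
(read URLs from a file), plus -p and -k to fetch page requisites and
convert links, though I have not checked how well the link conversion
copes with the popup pages.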

I solved that one by wgetting everything in the domain, then clicking
the original popups one by one in Firefox and saving them as "web
page, complete" in the same directory.  For a simple couple of pages
that's doable, but it seems error-prone (i.e. localization would get
very confused).

Thanks,
Flick

A summary of the suggestions:

On 5-Jan-10, at 01:08 , Michael van Schaik wrote:

> On mac I've had good results using sitesucker.app
> http://www.sitesucker.us/
>
> It has a GUI and can be configured to e.g. download infinitely but only
> from one domain.


On 4-Jan-10, at 14:12 , Chris wrote:

> I use HTTrack, I dunno if it does Flash, you might also need to write
> some shell script wrappers for it:
>
>  http://www.httrack.com/
>
> It's GPL'd and in debian, I have never used the GUI interface... ;-)


On 4-Jan-10, at 13:47 , Karin Spaink wrote:

> You might want to try DeepVacuum. It works with wget but it has a  
> nice user interface, and it's built for the Mac:  http://www.hexcat.com/

* FLICK's WEBSITE & BLOG: http://www.flickharrison.com
* FACEBOOK http://www.facebook.com/profile.php?id=860700553
* MYSPACE: http://myspace.com/flickharrison




#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mail.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org