
Plucker On Gsak Generated Html Files?



For some time, I have been using gpx2html to generate html files. I would then use Plucker to transfer to the PDA. I always set Plucker to go 2 levels deep, and limit to the exact server only. Everything worked just fine.

 

Now I am trying to use Gsak to generate the html files. That part seemed to go OK. I set up Plucker the same as always. The trouble is that all I got on the PDA was the index. If I click on a cache, I get this message:

“Sorry, the link you selected was not downloaded by Plucker. It was probably an external site or exceeded the maximum depth that the referring page was asked to retrieve.”

 

So, I changed my Plucker setting to go 3 levels deep and updated the same files. Now I got the cache pages. All seemed fine, except I didn’t get the hints. If I clicked on a hint, I got the same error message as above. It took Plucker about 7 minutes to process this way.

 

So, I changed my Plucker setting to go 4 levels deep and updated the files again. Now everything, including the hints worked. The problem is that it took Plucker 1 hour and 12 minutes to process this. :rolleyes: That time is completely unacceptable. I am used to doing it in probably 6 to 7 minutes. This is for a GPX file for 475 caches, which is what I get most of the time.

 

I was hoping to cut my number of apps down to just Gsak and Plucker. I don’t know why Plucker seems to act differently with html files generated by gpx2html and those generated by Gsak. What do I need to do differently? Go back to gpx2html? Cachemate? Change some settings somewhere?

 

I don’t really understand how these programs work, and am basically “cookbooking it”, so PLEASE cut me some slack here, and keep it simple. Thanks for any suggestions!


475 is a lot of caches. I get 100, and Plucker takes about 5 minutes on my old 400MHz machine using the GSAK html files. JPluck is quicker, and that's what I've been using. It runs on any machine, via Java. Give it a try & see if it's faster for you. You might consider cutting back on the number of caches - it would take months to find that many.

 

If you were getting everything with a depth of 2 via gpx2html, that means it's generating far fewer files, thus giving much less control and flexibility. The number of locations you have set to index from also makes a difference. I normally index from 2 locations - home and work, and GSAK gives pages for both distance and bearing from both; you can plug in as many as you like. IIRC, you can't do that via gpx2html. TANSTAAFL, and more files -> more flexibility -> more time to process. I prefer GSAK by a wide margin, YMMV.


 

Thanks for the reply NightPilot! I know that 475 caches is a lot, but I like to import the bunch into Streets and Trips to see where I want to go caching. Plus, we like to travel, so I "need" a lot of cache data. Anyway, the thing is, it is now taking 72 minutes to get exactly what I can get in 7 minutes with gpx2html. I normally index to two locations, but have sometimes used up to 5 and didn't notice any significant difference in processing time. I have a 1 Ghz 'puter, btw. I cannot see any significant increase in data or function when using the files generated by Gsak as opposed to those generated by gpx2html (other than the ability to index by bearing- you can do multiple reference locations in gpx2html, but not bearings, AFAICT). How many levels do you go to when using the Gsak files?

 

I love Gsak so far, and I am definitely not saying that this is a Gsak problem. I just don't know what the problem is.

 

Thanks again!

Edited by Boundertom

I set my depth to 4 also. I also restrict Plucker to the directory. If you don't restrict it, it will try to go to the web and download links from there, which can take a lot of time. For this channel, you definitely want your spidering restricted to the start directory.

 

I just did a conversion, using JPluck. I don't know exactly how long it took because I was typing this reply. I checked at just over 2 minutes, and it had finished before I checked, so it's definitely less than 2 minutes for 100 caches, and on a very old and slow system. JPluck is faster than Plucker, IME, and they both give the same results.

 

If you're taking that long, I suspect you need to limit where Plucker goes. This is in the setup configuration; I can't recall the exact page right now.


I think I've mentioned before that I love the gpx2html / plucker combo so much that it would take an act of congress to make me switch. GSAK is a fantastic piece of software so what I have been doing is creating the gpx file in GSAK that I feed to gpx2html / plucker. It might be one extra step but it is working well.

 

Cheers, Olar


Olar, can you tell any difference in what ends up in the PDA whether you run it through gpx2html or go straight to html from GSAK? I haven't tried to do a close comparison, but a casual lookover didn't detect any obvious difference.

 

I'm going quite a bit from memory, so I could easily be overlooking something substantial.


Max, you talked me into trying the html created by GSAK. The results are very similar to those created by gpx2html once run through Plucker. Actually, I'm not surprised, as anything associated with GSAK is excellent and I wouldn't expect anything less. I will give it the full test on my next weekly major update.

 

Cheers, Olar


I just finished running a test on a Pocket Query GPX file with 500 caches. In Plucker the channels were both set up to spider breadth first, limited to the exact server only, with a depth of 4 for GSAK (required to get the hints) and 3 for Spinner. Both were set to sort by distance from one location.

 

Using GSAK to generate the HTML pages it took Plucker 25 minutes as compared to only 4 minutes with HTML generated by Spinner. The pages appear pretty much identical on my Palm except that Spinner is using the old cache icons and GSAK includes a link to the cache page on Geocaching.com.

Edited by PDOP's
PDOP's, how many html files are being generated by each?  I no longer have either GPXSpinner or gpx2html on my computer to check.  Are you using Plucker Desktop or JPluck?

I'm using Plucker Desktop. GSAK generated 1013 HTML files, Spinner 986 HTML files.

 

I also just downloaded gpx2html to try it, but I can't get the Reflocation file to work. I've entered the coordinates in a couple of different formats, but each time it creates the index HTML it's blank. It took Plucker two minutes to process the 506 HTML files it created.

Edited by PDOP's

Hi PDOP's. Here is my reflocation file to show you the format that works.

 

# Enter reference locations

# Name, Lat, Long

From Home, N 43 xx.xxx, W 79 xx.xxx

From Work, N 43 xx.xxx, W 79 xx.xxx
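For anyone stuck with a blank index, a quick way to sanity-check a reflocations file in this format is a small parser sketch like the one below. This is a hypothetical helper (mine, not part of gpx2html or GSAK); it assumes the "Name, Lat, Long" layout shown above, with coordinates written as hemisphere, whole degrees, and decimal minutes:

```python
# Hypothetical checker for the reflocations format shown above:
# "#" lines are comments; data lines are "Name, Lat, Long",
# with coordinates like "N 43 12.345" (hemisphere, degrees, decimal minutes).

def to_decimal(coord):
    """Convert 'N 43 12.345' to signed decimal degrees."""
    hemi, degrees, minutes = coord.split()
    value = int(degrees) + float(minutes) / 60.0
    return -value if hemi in ("S", "W") else value

def parse_reflocations(text):
    """Return a list of (name, lat, lon) tuples in decimal degrees."""
    locations = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        name, lat, lon = (part.strip() for part in line.split(",", 2))
        locations.append((name, to_decimal(lat), to_decimal(lon)))
    return locations

sample = """\
# Enter reference locations
# Name, Lat, Long
From Home, N 43 12.345, W 79 30.000
"""
print(parse_reflocations(sample))
```

If a file parses cleanly this way, the format probably isn't what's making the index come out blank.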

 

Cheers, Olar


 

From this observation it would seem that the extra depth of 4 is causing the problem.

 

How about I put an option in the HTML generation to put the hints on the same page (similar to how gpx2html does)?

 

That way you can leave the depth at 3 and the generation (as you have already tested) will be quicker.
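To see why separate hint pages push the required depth up by one, here is a toy model (my sketch, not Plucker's actual code) of a spider that only follows links up to a maximum depth, counting the start page as depth 1. The page names are hypothetical; the point is that every extra page in the index-to-hint chain costs one more level:

```python
from collections import deque

def pages_within_depth(links, start, max_depth):
    """Breadth-first walk of a page -> [linked pages] dict.

    The start page counts as depth 1; links are not followed
    from pages already at the depth limit.
    """
    seen = {start}
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached; don't follow further links
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen

# Hypothetical layout with hints on their own pages:
# index -> list page -> cache page -> hint page (4 levels deep)
separate_hints = {
    "index": ["list"],
    "list": ["cache1"],
    "cache1": ["hint1"],
}
print("hint1" in pages_within_depth(separate_hints, "index", 3))  # False
print("hint1" in pages_within_depth(separate_hints, "index", 4))  # True
```

Fold the hint into the cache page and the chain is one page shorter, so a depth of 3 reaches everything.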


I routinely pluck many hundreds of geocache pages at a level of 4 on pages that admittedly aren't from GSAK, but the HTML isn't too much unlike GPX2HTML's output - and I do put the hints on separate pages. While my laptop (hardware not unlike what the original poster described) doesn't do it nightly, it's not THAT slow. I specify --zlib-compression --noimages --stayonhost -M 4 -V 0 and it rips right along.

 

As slow as you're describing it, I wonder if you're really staying on host? Turn on the debugging options to plucker-build (or just plain watch the lights on your 'net connection) and see if you're plucking more than you think you're plucking.

 

I've worked Plucker with lots of topologies ranging from "stalky" to "bushy" and never seen it give exponentially bad performance when it was plucking what I really wanted it to pluck. Look carefully at your "stay on host" option....


 

That sounds like it would work Clyde. I honestly don't understand too much about how these things work. Like I said initially, I am "cookbooking it"! I didn't really think this was a Gsak problem, strictly speaking, but I appreciate your willingness to help. I didn't want to bug you for anything else, after just asking about the print function a few days ago.

 

Thanks again!!!

 

Tom


OK, I don't even understand everything you have said here :rolleyes: , but I AM sure it wasn't going out on the net on that 1hr 12 min deal. I actually had to go somewhere while Plucker was working on that, so I unplugged my network cable and took my laptop with me! This was roughly 1/2 way through the process. I wanted to see what was going on, and I didn't see any significant change in speed.

I will have to study the settings in Plucker some more. Thanks!

 

Tom


I also had trouble with the ref locations file when I first started using gpx2html. I think it was Olar who helped me out with that! The format he gave you should work.


I'm a little out of my depth in saying this, but it is conceivable to me that, even unplugged from the net, Plucker might spend some time checking each off-site URL it came across if it thought it was supposed to do so. I think the acid test will be to try Plucker with the "stay on host" option enabled to see if it makes a difference.

 

This thread is long enough that I can't easily see if someone pointed you to where that option is. It's just below the place where you set the depth: File/Configure selected channel/Limits.


I copied Olar's text but it still won't work for me. :rolleyes: I'm not sure why, as this is almost the same as what Spinner uses. It's probably operator error. :blink:

 

Thanks for starting this topic. I had been using GSAK with Plucker set to a depth of 3 and hadn't realized that there was a problem with the hints. :)


There's certainly a lot more activity showing on the progress screen in Plucker when processing the GSAK HTML files. The tests I ran were with Plucker set to "stay on host".

 

I'm wondering if it's just that there are a lot more links in the GSAK HTML for Plucker to process, even if it isn't set to retrieve the images or pages from those links.

 

Edited for spelling

Edited by PDOP's
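PDOP's hunch about link volume is easy to check. Below is a rough script (my own sketch; the href-matching pattern and file extensions are assumptions, and it ignores relative-path subtleties) that tallies local versus off-site links across a directory of generated HTML, so the gpx2html, Spinner, and GSAK output directories can be compared directly:

```python
import os
import re

# Naive href extractor; assumes double-quoted attribute values.
HREF_RE = re.compile(r'href\s*=\s*"([^"]+)"', re.IGNORECASE)

def count_links(html_dir):
    """Return (local, external) link counts over all .htm/.html files."""
    local = external = 0
    for name in sorted(os.listdir(html_dir)):
        if not name.lower().endswith((".htm", ".html")):
            continue
        path = os.path.join(html_dir, name)
        with open(path, encoding="utf-8", errors="replace") as handle:
            for url in HREF_RE.findall(handle.read()):
                if url.lower().startswith(("http://", "https://")):
                    external += 1  # would pull the spider off-host
                else:
                    local += 1  # stays in the start directory
    return local, external
```

Run it on each output directory; a much larger external count for the GSAK files would explain the extra spidering work even with images turned off.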

I just ran some spot-checks on 1300 caches I happened to have laying around that I fed into GSAK 3.0 Beta 1. I wrote the HTML files to a network drive where I could get to them from a system where Plucker is installed, and ran plucker-build thusly:

 

$ time plucker-build --stayonhost --noimages -V1 -M 4 -f /tmp/booger.pdb index.htm

 

and it clocked in at:

real 2m30.136s

user 2m12.770s

sys 0m1.060s

 

My system isn't terribly girly: it's a 2.4GHz P4 with 640MB of core. So there's nothing in the GSAK output that inherently hits Plucker in an Achilles' heel.

 

Changing to "depth-first" knocks it down somewhat:

 

real 2m4.876s

user 1m53.570s

sys 0m0.620s

 

My money says you didn't set "stayonhost" and it was wandering around looking at network timeouts.

 

(Or you're being punished by some kind of data-specific or OS-specific issue.)
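For anyone who wants to reproduce this kind of before/after comparison without the Unix `time` command (say, on Windows), a tiny wrapper like this works. This is my sketch, not part of Plucker; the commented-out plucker-build invocation is only an example of how you might point it at a real conversion:

```python
import subprocess
import sys
import time

def timed_run(cmd):
    """Run a command and return (exit_code, wall_clock_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True)
    return completed.returncode, time.perf_counter() - start

# Example (hypothetical paths) - time two HTML sets with identical options:
# code, seconds = timed_run(["plucker-build", "--stayonhost", "--noimages",
#                            "-M", "4", "-f", "out.pdb", "index.htm"])

# Quick self-check with a trivial command:
code, seconds = timed_run([sys.executable, "-c", "pass"])
print(code, round(seconds, 3))
```

Running the same command over the gpx2html output and the GSAK output, with only the input directory changed, isolates the HTML as the variable.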


I double checked my Plucker settings. Of the things that have been mentioned, it is set for breadth first. For "Stay on Host" I had checked "Ignore links to a server that is different from the starting page's server". I had also selected "Limit to the exact server only". I see that "include images" is checked. I will disable that and see what happens.

 

Robert, your speeds are excellent! You do have a lot more computer than I do, but you were also doing 3 times the caches, in a third the time. Do you think there is anything in Gsak 3.0 that makes a difference? I have Version 2.03.

 

I will also try this on some other GPX files and see if they act the same for me.

 

Many thanks to everyone for your help!!!

 

Tom


Ok, I just disabled the images and ran Plucker on the same GPX file. Processed in under 7 minutes! Right back where I used to be. I had never messed with the image setting previously. I guess that when I was using files from gpx2html and only going 2 levels deep, the images didn't matter.

 

Clyde, at this point *I* don't see that you need to make any changes to Gsak regarding this "problem". I am satisfied that all is well.

 

Thanks again to everyone!!!

 

Tom


I know this is kind of off topic but...

I use CacheMate...500 caches in my Palm in less than 30 seconds! :D


Boundertom, yes, I know I couldn't compare apples to apples. Other than me running a completely different computer on a completely different set of data with a completely different OS, and apparently with different versions of GSAK and maybe even Plucker (I'm on 1.6.0) I tried to minimize the variables. I really was doing it to affirm that there isn't a generic problem with either GSAK or Plucker.

 

However, if I let it do images (which, as an aside, I almost never do; they're essentially unusable on my B/W visor and my Prism turns invisible in the sunlight, so it's not worth replucking for different options) on the same data, my pluck time soars by a whopping 16 seconds:

 

(robertl) rjloud:/tmp/blarf/Cache

$ time plucker-build --depth-first --stayonhost -V1 -M 4 -f /tmp/booger.pdb index.htm

 

real 2m20.649s

user 2m7.560s

sys 0m1.210s

 

My guess remains the same: there's something specific to your environment (specific caches that have gone nuts with graphics, the Windows version of Plucker sucking when converting graphics, some backrevved software somewhere, etc.) that's consuming mass quantities of wall time when it shouldn't be.

 

If you will send me your 'Cache' directory, I'll help analyze this for you.


I just used Plucker to convert a site I had downloaded to my computer. Nothing to do with GSAK or geocaching, just a website. I ended up with several subdirectories, and lots of images. I told Plucker to stay on host and convert, and it went crazy. Half an hour later, it was still trying to get thousands of files, most of them from the net. I stopped it, and told it to stay in the starting directory. No change. I simply cannot get Plucker to work on that set of pages, for some reason. I verified multiple times that it's being told to limit itself to the current directory, but it just won't do it. Same results with JPluck. Something in the html code seems to be sending it off into cyberspace. Perhaps it's the images, because they are necessary for the site to work correctly, but my guess is that in processing images it has to go elsewhere to get every image that is referenced.

 

In short, just because you told Plucker to stay on host, that doesn't mean it will obey.


Thanks to all who offered help and suggestions on this issue. I have tried many different things, to the point that I am very tired of fooling with it. I am definitely with Olar on this one, nothing works as well as gpx2html with Plucker. That combination is fast, easy, and it just plain WORKS! I find that it is well worth the extra step involved. YMMV!

 

Thanks again.

 

Boundertom


GSAK 3 Beta2 with the option to have the hints on the same page really sped things up for me. Using a GPX file with 527 caches and Plucker set to a depth of 3 with images included, the processing time was 3.5 minutes, as compared to 40 minutes with GSAK 3 and Plucker set to a depth of 4.


Clyde sent me an email about the latest version and then I saw PDOP's message, so I went ahead and loaded up Beta2. My Plucker processing time on a file of 475 caches was 6 minutes and 50 seconds. This is pretty much what it was taking with the gpx2html files. Everything seems to be working great! Maybe PDOP's just has a faster computer than I do (1 Ghz), but I am satisfied.

 

One more thanks to everyone, and especially to Clyde for all of your work on Gsak!

 

Boundertom


I just can't seem to lay this topic to rest. I just did another file with Plucker set to "depth first" as Robertlipe had suggested. I had forgotten to do that before. This brought the Plucker processing time down to ~4 minutes and 30 seconds. This was also with color images enabled - the little stars look purty on my new color Clie. :blink:

