thanks for the replies, i kinda thought there wasn't a way to do it
the grep idea is a good one, i can use that on some of my stuff (except where i don't have shell access, of course). i'll have to give that a whirl. i've got one site that has a ton of crap on it, and i don't even know what half of it is... lol.
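for the places where i do have access, here's a rough python sketch of how i picture the grep idea working: walk the site directory, read the text files, and flag any file whose name never shows up in another file. `find_unreferenced` and the `TEXT_EXTENSIONS` list are just names and guesses i made up, nothing standard. it only matches on bare filenames, so it can miss things (links built in code) and false-match others (two files with the same name in different directories).

```python
import os

# extensions treated as "text that might link to other files"
# (an assumption, adjust for whatever the site actually uses)
TEXT_EXTENSIONS = {".html", ".htm", ".php", ".css", ".js"}


def find_unreferenced(root):
    """Return relative paths of files whose bare filename is not
    mentioned in any other searchable text file under root."""
    all_files = []  # every file under root, as a path relative to root
    contents = {}   # relative path -> text, for files we can search
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            all_files.append(rel)
            if os.path.splitext(name)[1].lower() in TEXT_EXTENSIONS:
                with open(full, encoding="utf-8", errors="replace") as handle:
                    contents[rel] = handle.read()

    unreferenced = []
    for rel in all_files:
        name = os.path.basename(rel)
        # a file mentioning its own name doesn't count as a reference
        mentioned = any(
            name in text for other, text in contents.items() if other != rel
        )
        if not mentioned:
            unreferenced.append(rel)
    return sorted(unreferenced)
```

the entry page itself (index.html or whatever) will show up in the list too, since nothing links to it, so the output needs a sanity check before deleting anything.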
as long as stuff is linked in the file, it doesn't matter where it is, right? if i bury the index file somewhere on the server off the site's root, should it still find everything, or only things that are in directories below it?
the only other way i can think of is if you could somehow get apache on the target server to return a directory listing, like it does for a directory without an index.html, even when the directory does have an index file. i'm guessing there isn't really a way to do that. i'm not really worried about stuff with htaccess protection on it, but that would be a plus in some cases.