
Topic awaiting preservation: restricting deeplinking to pdf files (Page 1 of 1)

 
JKMabry
Maniac (V) Inmate

From: out of a sleepy funk
Insane since: Aug 2000

posted 11-13-2003 01:49

I have a directory full of PDF files in subdirectories that are going to be linked to from a main web site. Search engines will no doubt link directly to the PDF files, but the PDFs give no clue about the web site hosting them save the URL, and most people can't be trusted to figure that out.

What would be the best way to restrict direct linking to those files, instead redirecting visitors to the HTML/PHP page that links to them? I know that may disappoint a few people who were hoping to get directly to the PDF, but we'd sure like them to know there's a site hosting it and have them take a peek.

These PDFs are product brochures and manuals that live in /documents/partners/partner1/manuals/ and so on, but they are linked to from /partners/partner1/brochures.php or some such convention.

And yes, we have carte blanche permission to use the partners' PDFs

Jason

Slime
Lunatic (VI) Mad Scientist

From: Massachusetts, USA
Insane since: Mar 2000

posted 11-13-2003 02:32

Sounds like you probably just want to take some .htaccess remote image link blocking code and modify it to forward people going to certain PDF files to the URL that links to those files.
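
A sketch of the classic image-blocking version (assuming Apache with mod_rewrite enabled; mydomain.com stands in for the real domain) -- the PDF variant just swaps the file pattern and replaces the 403 with a redirect to the linking page:

code:
RewriteEngine on
# requests with no referer (direct hits, privacy proxies) pass through
RewriteCond %{HTTP_REFERER} !^$
# requests referred from this site pass through
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/ [NC]
# everything else asking for an image is refused; for PDFs, change the
# pattern to \.pdf$ and replace "- [F]" with the URL of the page that
# lists them, plus [R,L]
RewriteRule \.(gif|jpe?g|png)$ - [F]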

JKMabry
Maniac (V) Inmate

From: out of a sleepy funk
Insane since: Aug 2000

posted 11-13-2003 19:53

Yeah, I figured that, but I'm not sure about the ramifications of that practice for my search engine friendliness. I'd also think it hard to write an .htaccess file/rule that properly identified all the engines; my logs seem to be a mishmash of thousands of them.

Just wanna get some more opinions and possible alternatives before deciding on the best means to the end.

Jason

Dracusis
Maniac (V) Inmate

From: Brisbane, Australia
Insane since: Apr 2001

posted 11-14-2003 00:35

How many search engines can index PDF files?

If it's only a handful, you could run a simple PHP script that reads the HTTP_USER_AGENT string and simply doesn't print the links to the PDF files when you know the visitor is a web crawler.
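
A rough sketch of that idea (the crawler list and the manual.pdf filename are made up; a real list would come from your logs):

code:
<?php
// hypothetical list of crawler user-agent substrings
$crawlers = array('Googlebot', 'Slurp', 'msnbot', 'Teoma');

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$is_crawler = false;
foreach ($crawlers as $bot) {
    if (stristr($ua, $bot)) {   // case-insensitive substring match
        $is_crawler = true;
        break;
    }
}

// only human visitors get the direct PDF link
if (!$is_crawler) {
    echo '<a href="/documents/partners/partner1/manuals/manual.pdf">Manual (PDF)</a>';
}
?>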

Or you could try to hide the links to the PDF files behind JavaScript, Flash, or simple forms -- although I'm not sure which search engines, if any, can read JavaScript or form elements.

Edit: Ack, just woke up -- after 3000+ posts, typing in this username has become more automatic than my real name. Kinda scary, actually.

[This message has been edited by Dracusis (edited 11-14-2003).]

JKMabry
Maniac (V) Inmate

From: out of a sleepy funk
Insane since: Aug 2000

posted 11-14-2003 04:23

I'm not trying to hide the links from search engines; in fact, just the opposite. The links to them are very keyword-rich, as are the PDFs themselves -- the ones that are indexable, anyhow.

I just don't want the surfer to jump from Google to the PDF and never see the web site...

Jason

JKMabry
Maniac (V) Inmate

From: out of a sleepy funk
Insane since: Aug 2000

posted 11-18-2003 21:38
code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule \.?(pdf)$ http://www.mydomain.com/documents/ [R,L]



Decided to go with the Apache redirect, as I've done in the past with my sig-hosting directory. I modified the sig redirect code a little, but it ain't working for squat. Anybody see my error there?

Jason
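
For anyone who lands here with the same problem: the usual reason a rule set like this does nothing at all is server configuration -- if AllowOverride is None for the directory, Apache never reads the .htaccess file, and the failure is silent. Separately, the pattern \.?(pdf)$ makes the dot optional, so it also matches any path that merely ends in the letters "pdf". A tightened sketch, assuming mod_rewrite is loaded and AllowOverride permits FileInfo:

code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/ [NC]
# \.pdf$ instead of \.?(pdf)$ -- anchors the match to the extension
RewriteRule \.pdf$ http://www.mydomain.com/documents/ [R=302,L]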
