Remository Forum

 


dmjedli

Karma: 0  
Search Engine Crawler with PDFs - 2010/08/19 06:44

I've been using Remository in production for some time, and we have thousands of searchable PDF files. We've noticed that search engines aren't able to index the full text of the PDF files.

I've removed the "nofollow" addition to the links, but that doesn't seem to have made any difference (and I didn't think it would).

Does anyone have any suggestions about what is going on here? I'm guessing that the MD5 checksum may be causing some issues. Or is there some JavaScript protection on the download link itself?

Any help is appreciated!
admin

Karma: 98  
Re:Search Engine Crawler with PDFs - 2010/08/22 16:55

It used to be the case that files in a repository would be meaningless to a search engine, since search engines could handle only HTML and text.

Although things have changed, there is a difficulty in the way of making the files more accessible to search engines. It is that if search engines can get to them, then anyone at all will be able to. That would leave you vulnerable to your files being made available for download from other web sites - using your bandwidth, but giving you no credit.
Martin Brampton aka Counterpoint
http://aliro.org
http://black-sheep-research.com
dmjedli

Karma: 0  
Re:Search Engine Crawler with PDFs - 2010/08/22 19:58

I ended up bypassing all the date/time stuff in the makeCheck function and just had it return the $id value. PDF searching by the crawler works great. We are using Omnifind Yahoo Edition (for testing) and plan to switch to the full Omnifind version soon.

While it opened up the site to hot-linking, all parties involved are okay with it.

Thanks!
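
For readers wondering why that change helps a crawler, here is a rough sketch of the general idea. It is written in Python rather than Remository's PHP, and it is not Remository's actual makeCheck code, which isn't shown in this thread. The function names, the secret, the "day" parameter and the URL layout are all made up for illustration. The assumption is that the check value in a download link depends on the date/time, so a link the crawler stored earlier stops matching later; returning just the id makes the link permanent, which is also what allows hot-linking.

```python
import hashlib

# Hypothetical site secret, for illustration only (not a Remository setting).
SECRET = "example-secret"


def make_check_timed(file_id: int, day: int, secret: str = SECRET) -> str:
    """Hypothetical time-limited check: the value changes whenever `day` changes."""
    payload = f"{file_id}:{day}:{secret}".encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def make_check_stable(file_id: int) -> str:
    """Analogue of the bypass described above: the check is just the id."""
    return str(file_id)


def download_url(file_id: int, check: str) -> str:
    # Hypothetical URL layout; the real component's URLs may look different.
    return f"/download/{file_id}/{check}"


if __name__ == "__main__":
    # Two consecutive (arbitrary) day numbers: the timed URL differs between
    # them, while the stable URL stays the same and can be stored or linked.
    for day in (14840, 14841):
        timed = download_url(42, make_check_timed(42, day))
        stable = download_url(42, make_check_stable(42))
        print(day, timed, stable)
```

With the timed variant, a URL harvested on one day no longer matches on the next, so neither a crawler nor a third-party site can reuse it. With the stable variant both can, which is the trade-off discussed in this thread.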
