May 23rd 05, 02:46 AM
Mike Andrews
 

Paul Rubin wrote:
> (Mike Andrews) writes:
>>> So for pdf files without going to a jpeg --- are ascii text addresses
>>> harvestable ?
>>
>> Yes, in the sense that Optical Character Recognition (OCR) programs _can_
>> read text out of an image. In practice, it's not worth the spammers' or
>> web spider operators' trouble -- or that's been my experience, anyway.
>
> PDF files contain the underlying text strings and search engines index
> them without OCR'ing. Whether spammers bother, I don't know.


Hi, Paul. Long time no see.

Depends on whether they're text-based PDFs or image-based PDFs. If I scan
a page into a JPEG or TIFF and then convert that to PDF, the result is
just a picture of the page, and I think it's improbable that it will
have any of the text as text.
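
To make the distinction concrete, here's a rough Python sketch of the
harvesting side. It assumes some extractor (not shown) has already pulled
the text layer out of the PDF and handed it over as a plain string. The
function name, the sample strings, and the deliberately simple address
pattern are all made up for illustration; I'm not claiming this is how any
real harvester works.

```python
import re

# Deliberately simple pattern for illustration only; real-world address
# syntax is messier than this.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_addresses(extracted_text):
    """Return the unique addresses found in already-extracted text, in order."""
    found = []
    for address in ADDRESS_RE.findall(extracted_text):
        if address not in found:
            found.append(address)
    return found


# Hypothetical extractor output for a text-based PDF.
text_pdf = "Contact: alice@example.com or bob@example.org"

# A scanned page wrapped in a PDF, with no text layer, gives the
# extractor nothing to return.
scanned_pdf = ""

print(harvest_addresses(text_pdf))
print(harvest_addresses(scanned_pdf))
```

With a text-based PDF the addresses fall straight out; with a scan there
is nothing to match unless somebody runs OCR first.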

--
Mike Andrews, W5EGO

Tired old sysadmin