Thread: Question on web spiders
#3
May 22nd 05, 03:43 AM
Paul Rubin
(Mike Andrews) writes:
>> So for PDF files, without going to a JPEG, are ASCII text addresses
>> harvestable?
> Yes, in the sense that Optical Character Recognition (OCR) programs _can_
> read text out of an image. In practice, it's not worth the spammers' or
> web spider operators' trouble -- or that's been my experience, anyway.
PDF files normally contain the underlying text strings (a scanned,
image-only PDF is the exception), and search engines index them without
OCR'ing. Whether spammers bother, I don't know.
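
To make the point about underlying text strings concrete, here is a minimal Python sketch for checking whether a PDF you publish exposes an address as plain text. Everything in it is an assumption for illustration: the names `exposed_addresses` and `wrap_stream`, the loose address pattern, the sample address, and the hand-built PDF-like fragments (not complete, valid PDF files). It scans the raw bytes and also tries to inflate each stream as zlib data, which is how Flate-compressed content streams are stored.

```python
import re
import zlib

# Loose pattern for email-like strings (an assumption, not a full address grammar).
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Captures everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def exposed_addresses(pdf_bytes):
    """Return email-like strings readable from PDF bytes without any OCR."""
    chunks = [pdf_bytes]  # uncompressed text is visible in the raw bytes
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # Flate-compressed streams are zlib data; decompressobj ignores
            # the end-of-line bytes that trail the compressed data.
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not zlib data; the raw bytes were already scanned
    found = set()
    for chunk in chunks:
        found.update(m.decode("ascii") for m in EMAIL_RE.findall(chunk))
    return found


def wrap_stream(body, filtered=False):
    """Build a hypothetical, minimal PDF-like fragment around one stream."""
    entries = b"/Filter /FlateDecode " if filtered else b""
    header = b"<< " + entries + b"/Length " + str(len(body)).encode("ascii") + b" >>"
    return b"%PDF-1.4\n1 0 obj\n" + header + b"\nstream\n" + body + b"\nendstream\nendobj\n"


# The address is stored as an ordinary string operand of the Tj (show text) operator.
content = b"BT /F1 12 Tf 72 720 Td (Contact: someone@example.com) Tj ET"
plain = wrap_stream(content)
packed = wrap_stream(zlib.compress(content), filtered=True)

print(exposed_addresses(plain))
print(exposed_addresses(packed))
```

This is only a rough check. Real PDFs may split a string across several text operators, write it in hex, use fonts with custom encodings, or apply other stream filters, so a proper PDF text extractor will find text that this sketch misses. The point is just that no image recognition is involved at any step.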