Question on web spiders
Maybe this is not the right place, but there seem to be several web experts here.
Can web spiders read and harvest e-mail addresses from a PDF file? Many users, and sites like QRZ.com, list e-mail addresses as JPEG images rather than ASCII text, and this seems to work. So for PDF files that do not resort to a JPEG: are ASCII text addresses harvestable? Thanks -- CL -- I doubt, therefore I might be!
Caveat Lector wrote:
> Maybe not the right place but seems there are several web experts here. Can web spiders read and harvest e-mail addresses from a pdf file ?

Ask the Adobe people.

> Many users and folks like QRZ.com are using jpegs not ascii for listing e-mails -- this seems to work. So for pdf files without going to a jpeg --- are ascii text addresses harvestable ? Thanks

Loads fewer spams since switching to this throwaway web name, despite all of the heartfelt objections of loser K4YZ.
> Loads fewer spams since switching to this throwaway web name, despite all of the heartfelt objections of loser K4YZ.

chuckle, snort, guffaw. Pot, kettle.
I don't think so. Actually, now that I think about it, Google returns hits on PDFs, so it must be possible. It used to be that it couldn't be done, but OCR software has come a long way and could be embedded in search engines. Unless the PDF files that are being found carry a tag for the metasearchers because their authors want them to be found. If I recall, when I search the net and get a PDF hit, the words in my search are still highlighted in the document. It probably wouldn't be difficult to do the same thing with a JPG. Just my 2 cents.
Greg ki4bbl

"Caveat Lector" wrote in message news:vzQje.1452$Xh.1367@fed1read07...
> Maybe not the right place but seems there are several web experts here. Can web spiders read and harvest e-mail addresses from a pdf file ? Many users and folks like QRZ.com are using jpegs not ascii for listing e-mails -- this seems to work. So for pdf files without going to a jpeg --- are ascii text addresses harvestable ? Thanks -- CL -- I doubt, therefore I might be !
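For anyone who wants to test the question rather than guess, here is a rough Python sketch of what a harvester could do with a PDF that stores its text as text. It is only an illustration under stated assumptions: the function names are made up for this post, the e-mail pattern is deliberately simple, and it only handles page content that is either uncompressed or Flate-compressed with plain single-byte text. Many real PDFs split strings across operators or use custom font encodings, so this naive approach will miss addresses in those files, and a scanned page (an image inside the PDF) would need OCR instead.

```python
import re
import zlib

# Deliberately simple e-mail pattern (an assumption, not a full RFC 5322 parser).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Matches the body of each "stream ... endstream" section in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def find_emails(text):
    """Return e-mail-like strings found in text, in order, without repeats."""
    seen = []
    for match in EMAIL_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def harvest_emails_from_pdf_bytes(data):
    """Naively look for e-mail addresses inside the streams of a PDF."""
    chunks = []
    for raw in STREAM_RE.findall(data):
        try:
            # Most content streams are Flate (zlib) compressed.
            chunks.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data: treat the stream as uncompressed text.
            chunks.append(raw)
    # latin-1 maps every byte to a character, so decoding never fails.
    text = "\n".join(chunk.decode("latin-1") for chunk in chunks)
    return find_emails(text)


if __name__ == "__main__":
    # A hand-made fragment that imitates one compressed content stream.
    content = b"BT /F1 12 Tf (Contact: op@example.com) Tj ET"
    fake_pdf = (
        b"%PDF-1.4\n1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
        + zlib.compress(content)
        + b"\nendstream\nendobj\n"
    )
    print(harvest_emails_from_pdf_bytes(fake_pdf))
```

If a few lines like these can pull an address out of a text-based PDF, it seems reasonable to assume a spider could too, which is presumably why the image trick helps only as long as the harvester does not run OCR.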
RadioBanter.com