#1
May 22nd 05, 11:10 PM
John Smith

... shame, searchable text is nice... I have FineReader, what is the
graphic format of the scanned pages... perhaps it can work with that?

Warmest regards,
John

"Dave Platt" wrote in message
...
| In article ,
| John Smith wrote:
|
| | I am interested, did you use FineReader for scanning or another? And, did
| | you use Adobe to create the .pdf or other free software? Is the .pdf text
| | searchable (in text format) or not (in graphic format)?
|
| The work was done using only noncommercial (freely-distributable)
| software tools... the SANE scanning software, NETPBM image-processing
| programs, The GIMP for manual image processing, and GPL GhostScript to
| create the PDFs. I wrote a bunch of custom scripts to perform some
| higher-level functions (e.g. automatically levelling, centering, and
| "bleaching" the pages).
|
| The text is not searchable. I don't have access to OCR software which
| can do the job with acceptable accuracy, nor the time required to
| proofread the whole book and correct the inevitable errors. The PDF
| text is all in graphic format.
|
| --
| Dave Platt AE6EO
| Hosting the Jade Warrior home page: http://www.radagast.org/jade-warrior
| I do _not_ wish to receive unsolicited commercial email, and I will
| boycott any company which has the gall to send me such ads!



#2
May 23rd 05, 01:28 AM
Dave Platt

In article ,
John Smith wrote:

| ... shame, searchable text is nice... I have FineReader, what is the
| graphic format of the scanned pages... perhaps it can work with that?


The original scans are 300 dpi grayscale, PGM (portable graymap)
format. Easily translated to TIFF.

The data in the PDF itself is 300 dpi one-bit-deep black&white data,
compressed... converted from the grayscale data via thresholding.
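
As an illustration of the thresholding step described above (not the actual script used, which isn't posted in this thread), here is a minimal Python sketch. It assumes an 8-bit binary PGM (magic number P5) as input and writes a packed one-bit PBM (P4). The function names and the fixed cutoff of 128 are assumptions made for the example; a real pipeline would more likely use the NETPBM tools and a cutoff tuned to the scans.

```python
def read_pgm_header(data: bytes):
    """Parse a binary PGM header; return (magic, width, height, maxval, raster_offset)."""
    tokens = []
    pos = 0
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():        # skip whitespace between tokens
            pos += 1
        if data[pos:pos + 1] == b"#":             # skip a comment up to end of line
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    # Exactly one whitespace byte separates the header from the raster.
    return magic, width, height, maxval, pos + 1


def threshold_pgm(data: bytes, cutoff: int = 128) -> bytes:
    """Convert an 8-bit P5 graymap to a P4 bitmap: gray values below cutoff become black."""
    magic, width, height, maxval, offset = read_pgm_header(data)
    if magic != b"P5" or maxval > 255:
        raise ValueError("this sketch only handles 8-bit binary PGM (P5) input")
    pixels = data[offset:offset + width * height]
    row_bytes = (width + 7) // 8                  # PBM rows are padded to whole bytes
    out = bytearray(row_bytes * height)
    for y in range(height):
        for x in range(width):
            if pixels[y * width + x] < cutoff:    # dark pixel -> bit set (1 = black in PBM)
                out[y * row_bytes + x // 8] |= 0x80 >> (x % 8)
    return b"P4\n%d %d\n" % (width, height) + bytes(out)
```

Something like `open("page.pbm", "wb").write(threshold_pgm(open("page.pgm", "rb").read()))` would convert one page, where `page.pgm` is a hypothetical file name.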


#3
May 23rd 05, 02:52 AM
John Smith

Well, I will have to play with this a while... I have never attempted to use
FineReader with existing scans... and I am not having much luck working
something out -- I expected it to be more straightforward...

Warmest regards,
John

"Dave Platt" wrote in message
...
| In article ,
| John Smith wrote:
|
| | ... shame, searchable text is nice... I have FineReader, what is the
| | graphic format of the scanned pages... perhaps it can work with that?
|
| The original scans are 300 dpi grayscale, PGM (portable graymap)
| format. Easily translated to TIFF.
|
| The data in the PDF itself is 300 dpi one-bit-deep black&white data,
| compressed... converted from the grayscale data via thresholding.



#4
May 23rd 05, 03:42 AM
Dan Richardson

On Sun, 22 May 2005 17:52:48 -0700, "John Smith"
wrote:

| --I expected it to be more straightforward...


For Pete's sake. You're getting something for free and then bitching
about it.

Sheeeeeeeee

#5
May 23rd 05, 04:46 AM
John Smith

I don't think you grasp what is being done here... I am not even
contemplating using it... but transforming it into other formats for others'
use... 33 megs is pretty big for a book... down to about one meg would be more
useful...

Warmest regards,
John

"Dan Richardson arrl net" k6mhatdot wrote in message
...
| On Sun, 22 May 2005 17:52:48 -0700, "John Smith"
| wrote:
|
| | --I expected it to be more straightforward...
|
| For Pete's sake. You're getting something for free and then bitching
| about it.
|
| Sheeeeeeeee





#6
May 23rd 05, 05:28 AM
Dave Platt

In article ,
John Smith wrote:

| I don't think you grasp what is being done here... I am not even
| contemplating using it... but transforming it into other formats for others'
| use... 33 megs is pretty big for a book... down to about one meg would be more
| useful...


Getting it down to 1 meg would necessarily sacrifice almost all of the
detail in the photographs - they'd be unviewable. 1 meg might be
enough space for the text, and possibly for the black&white charts and
line drawings (as bitmaps) but the photos would be lost.


#7
May 23rd 05, 06:59 AM
Doug McLaren

In article ,
Dave Platt wrote:

| In article ,
| John Smith wrote:
|
| | I don't think you grasp what is being done here... I am not even
| | contemplating using it... but transforming it into other formats
| | for others' use... 33 megs is pretty big for a book... down to about
| | one meg would be more useful...
|
| Getting it down to 1 meg would necessarily sacrifice almost all of the
| detail in the photographs - they'd be unviewable. 1 meg might be
| enough space for the text, and possibly for the black&white charts and
| line drawings (as bitmaps) but the photos would be lost.

The reason it's 33 MB and not 1 MB is that the .pdf file is
basically a bunch of pictures, one of each page. That's also why it's
not searchable, and why you can't cut and paste text out of it.

33 MB is on the small side for books scanned like this.
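
To put rough numbers on that, here is a small Python sketch of how much raw image data a single scanned page holds before any compression. The 8.5 x 11 inch page size is purely an assumption for illustration (the book's actual page size and page count aren't given in this thread); the 300 dpi figure and the two bit depths come from the posts above.

```python
def raw_page_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of one scanned page, with rows padded to whole bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumed page size: 8.5 x 11 inches (hypothetical, not stated in the thread).
bilevel = raw_page_bytes(8.5, 11, 300, 1)   # one-bit black&white, as stored in the PDF
gray = raw_page_bytes(8.5, 11, 300, 8)      # 8-bit grayscale, like the original scans
print(f"1-bit page: {bilevel:,} bytes; 8-bit page: {gray:,} bytes")
```

Under those assumptions a single page is about 1 MB uncompressed at one bit per pixel and about 8.4 MB in grayscale, so a whole book only fits in a few tens of megabytes because the bilevel page images compress well.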

In comparison, the Bible is only 1.34 MB in size in text format after
being compressed (http://www.gutenberg.org/etext/10) -- and it's a big
book. Even War and Peace is only 1.16 MB
(http://www.gutenberg.org/etext/2600).
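
For anyone who wants to check how well plain text packs down, here is a minimal Python sketch using the standard `zlib` module (DEFLATE, the same family of compression commonly used in .zip files). The helper name and the sample string are made up for the example, and the sample is artificially repetitive, so it shrinks far more than real prose would.

```python
import zlib


def compression_report(text: str) -> tuple:
    """Return (raw_bytes, compressed_bytes) for text encoded as UTF-8."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 9)   # level 9 = best compression, slowest
    return len(raw), len(packed)


# Illustrative sample only; substitute the contents of a real text file.
sample = ("The original scans are 300 dpi grayscale, PGM (portable graymap) "
          "format. Easily translated to TIFF. ") * 50
raw_size, packed_size = compression_report(sample)
print(f"{raw_size} bytes raw, {packed_size} bytes compressed")
```

Running the same function on the text of a downloaded Project Gutenberg book would give a more realistic ratio.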

In order to get it under 1 MB, you'd generally have to use some sort
of OCR software to convert the picture of text into text. I presume
there would also be some pictures, and they'd have to be stored as
pictures, of course.

Unfortunately, good OCR software is hard to find, and I know of no
software that could take a book, scan it, convert it to text and
images as appropriate, and do it accurately enough that a human
wouldn't need to proofread the entire document carefully. And that is
a very large job.

The reason it's available with BitTorrent is that it allows lots
of people to download it relatively quickly without totally sucking up
Dave's bandwidth. It may be a bit more work to download than something
that's just a link on a web page, but it works nicely once set up.

In any event, scanning and distributing out-of-copyright books like
this is a worthy endeavor. Thank you!

Looks like there are a few other radio-related works on Project
Gutenberg. Go to `http://www.gutenberg.org/catalog/world/search' and
search for `radio' for a list. None seem to cover antennas
specifically, but `The Radio Amateur's Hand Book' looks interesting.


--
Doug McLaren,
To err is human, but to really foul things up requires a computer.

#8
May 23rd 05, 05:54 AM
John Smith

The pics would be converted to .jpeg... the text, in an efficient ebook
format, would be held compressed--very small... I did say one-meg+, I was
thinking about the pics (graphics) when I included the '+'... black and
white compresses very small, gray scale not as well... still, 33 megs is
HUGE! I don't see it being any larger than 3 megs in the worst case... 1/10 is
good...

Warmest regards,
John

"Dave Platt" wrote in message
...
| In article ,
| John Smith wrote:
|
| | I don't think you grasp what is being done here... I am not even
| | contemplating using it... but transforming it into other formats for
| | others' use... 33 megs is pretty big for a book... down to about one meg
| | would be more useful...
|
| Getting it down to 1 meg would necessarily sacrifice almost all of the
| detail in the photographs - they'd be unviewable. 1 meg might be
| enough space for the text, and possibly for the black&white charts and
| line drawings (as bitmaps) but the photos would be lost.



#9
May 23rd 05, 05:45 AM
Reg Edwards

| For Pete's sake. You're getting something for free and then bitching
| about it.

=========================

"For Pete's sake" is an interesting American exclamation. How did it
arise?

Did it arise in the 1930s? Any connection with the villain Pegleg
Pete who appeared in Mickey Mouse cartoons of that era?
----
Reg.


#10
May 23rd 05, 06:05 AM
Dan Richardson

On Mon, 23 May 2005 03:45:02 +0000 (UTC), "Reg Edwards"
wrote:

| | For Pete's sake. You're getting something for free and then bitching
| | about it.
|
| =========================
|
| "For Pete's sake" is an interesting American exclamation. How did it
| arise?
|
| Did it arise in the 1930's? Any connection with the villain Pegleg
| Pete who appeared in Mickey Mouse cartoons of that era?
| ----
| Reg.

May not be solely an American expression. The only definition I could
locate was at a British site.
http://www.phrases.org.uk/bulletin_b...sages/383.html

Danny


