How to Scan and Upload a Book

Wouldn't it be nice to be able to share those out-of-copyright books that you've saved from moldering bookshelves by scanning them? Or perhaps you would like a simple way to convert scans of your books to text, PDF, DjVu, or other formats for your Kindle or other e-reader? The Internet Archive and Book Scan Wizard make it possible to do just that.

To upload an item you need an Internet Archive “library card” (basically an account). It’s easy enough to get one, but realize that whatever email you use will be listed as the originator of the document, and will be publicly available. So if privacy is a concern you may want to use a throwaway email address.

What can be uploaded:

All uploads using this interface to the Archive are public, and may be downloaded by anyone. However, both the BSW form and the Archive upload form have a checkbox to indicate that it is a test item. Test items will be processed, OCR’ed and made available, but not be indexed (or searchable). They also will be deleted after thirty days. Marking it as a test item can be useful for testing the process, or if you are uploading things only for the purpose of OCR’ing them and they shouldn’t become part of the permanent archive.

Uploading books using the Archive.org website:

One option is to upload a book using the Archive’s own interface. You click on the “upload” button at the top of the archive. Its url is http://www.archive.org/create/
The Archive recommends uploading a pdf file. However, a zip file that contains the pages and whose name ends with _images.zip can also be used. The archive will accept a zip with jpeg, tiff, or jpeg2000 images. As part of the upload process you fill in metadata (things like the author, title, date etc).
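
If you would rather build the _images.zip by hand, here is a minimal Python sketch. The function name build_images_zip and the folder layout are my own assumptions, not part of BSW or the Archive's tools, and it only packs the page images (no metadata):

```python
import zipfile
from pathlib import Path

# Image types the post says the Archive accepts (jpeg, tiff, jpeg2000).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".tif", ".tiff", ".jp2"}


def build_images_zip(page_dir, identifier, out_dir="."):
    """Zip the page images in page_dir as <identifier>_images.zip.

    Hypothetical helper for illustration only.
    Pages are added in alphabetical order by file name.
    """
    pages = sorted(
        p for p in Path(page_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    zip_path = Path(out_dir) / f"{identifier}_images.zip"
    # The images are already compressed, so store them without recompressing.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zf:
        for page in pages:
            zf.write(page, arcname=page.name)
    return zip_path
```

Naming the pages with zero-padded numbers (0001.jpg, 0002.jpg, ...) keeps the alphabetical order the same as the page order.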

Uploading books using Book Scan Wizard:

Book Scan Wizard has a new feature that allows you to easily upload books to the Internet Archive. It can be run either interactively or as part of a batch process. The easiest way to start it is by using the Web Start version which can be accessed from this link: http://bookscanwizard.sourceforge.net/run

For an example of a book created with the upload feature, see this book. It was created by using a “New Standard” book scanner, Book Scan Wizard, and a pair of Canon A480 cameras.

Here’s the procedure: In the menu under Tools, choose “Prepare for Uploading…” and it will bring up the following screen:

upload.png

Fill in the information for the book, and it will add to the BSW script the metadata and commands to create a zip file for uploading to the Archive.

The access key and secret key are a special id and password only used for transfers. You get them from here. (Or press the “Lookup Keys” button which will also bring you to the right page).

The identifier becomes part of the url for the book. On the archive books it is usually a combination of the title and the author of the book, but it can be whatever you want. Letters, numbers, periods (.), hyphens (-), and underscores (_) are the permitted characters for the identifier. All other fields can accept any characters. If needed, multiple lines can be used. For example, if there are multiple authors, you can add the additional authors by adding additional “creator” lines to the other metadata section.
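
If you are preparing a batch of books, a quick check of the identifiers before uploading can save a failed job. This is only a sketch in Python: the two function names are hypothetical, I am assuming "letters" means plain ASCII letters, and the Archive may enforce other rules that are not covered here.

```python
import re

# Only the characters listed above: letters, numbers, periods,
# hyphens, and underscores. ASCII-only letters are an assumption.
_IDENTIFIER_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_identifier(identifier):
    """Return True if identifier uses only the permitted characters."""
    return _IDENTIFIER_RE.fullmatch(identifier) is not None


def suggest_identifier(title, author=""):
    """Hypothetical helper: join the words of title and author,
    capitalizing each word and dropping every other character."""
    words = re.findall(r"[A-Za-z0-9]+", f"{title} {author}")
    return "".join(w[:1].upper() + w[1:] for w in words)
```

For example, is_valid_identifier("Big Book") is False because of the space, while suggest_identifier("Big Book of Fairy Tales") gives a candidate with the spaces removed. Accented letters are simply dropped by this sketch, so check the result by eye.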

Once you press Ok, the following configuration will be added automatically:


    Metadata = identifier: BigBookOfFairyTalesA
    Metadata = title: Big Book of Fairy Tales
    Metadata = creator: Gustave Doré
    Metadata = date: 1896
    Metadata = subject: Childrens fairy tales
    Metadata = description: Hardcover title is Favorite Fairy Tales
    Metadata = keywords: childrens, fairy tales
    CreateArchiveZip = archive.zip 10:1
    # Uncomment the following line to send to the archive as part of this job.
    #SaveToArchive = archive.zip xxxxxxxxxxxxxxxxx xxxxxxxxxxx

To actually send it, you can do it as part of the processing by uncommenting SaveToArchive. Or if you have previously created the zip file, you can upload it by choosing from the menu Tools, Upload to the archive. Another option is to queue up your books and send them as a batch by using the -upload feature from the command line. (See the command line help for more information).

You can also create a zip file some other way, then use the command line option to send it to the archive. To do that, zip up your images, and include an xml file with the metadata. The images can be called whatever you like and will be saved in alphabetical order.

If you want to see an estimate of the size the zip file will be, you can right-click the CreateArchiveZip line. It will return this:

size.png

Then adjust the compression setting (the 10:1 in the example above) until you have a result you like.

How to Scan Books for the Archive:
While the Archive will accept any sort of scans, it is nice to provide the scans in a way that matches their own works. For that, it is best if the books meet the following criteria:

  • It should have a resolution of 300-600 DPI.
  • It should be done as a full color image that closely resembles the actual book image. The Internet Archive prefers color images because they have found people like reading the book with the original look intact.
  • The book should be deskewed and cropped.
  • You should provide good metadata such as title, author, date, subject, keywords, etc.

Tips for creating good scans to send to the Archive:

Full color images often take a bit of tweaking to look really good. Ideally you want the left and right pages to be consistent with each other, and have the colors match the original. BSW can help with that.

Once you have corrected for perspective distortion and cropped the image, it is good to increase the contrast of the image a bit. Try right clicking the image and choosing “autolevels.” This will give you a good starting point, but feel free to adjust the black and white levels until they appear accurate. The books done with Internet Archive’s Scribe scanners use the equivalent of the following, which may be helpful as a starting point if you are starting with well exposed images:
Levels = 12 94

Also, if the saturation doesn’t look right (like there is more color in the image than there was in the original), the Saturation command can be used. Or if the brightness is off, try adjusting it with the Brightness command. If your lighting isn’t quite consistent, it is sometimes necessary to adjust only the left or right images to make them match better. It’s pretty much trial and error until you get the results looking the way you like. The good thing is once you figure out the settings that work for you, you will not need to adjust them much for other books.

It’s recommended that a lossy compression that results in a compression between 10:1 and 20:1 is used for the transfer. For example at 10:1, if an image was a 10 meg uncompressed tiff, it would be about a 1 meg .jp2 file. BSW will default to a 10:1 compression, which works well for 300 DPI images. If you are providing scans closer to 600 DPI you will probably want to use a higher compression to keep the transfer sizes manageable.
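
To get a feel for the numbers before scanning a whole book, the arithmetic can be sketched in a few lines of Python. The function names are made up for illustration, and the page count and per-page size in the usage note are assumptions, not measurements; the right-click estimate in BSW is the better guide for a real job.

```python
def compressed_size_mb(uncompressed_mb, ratio):
    """Approximate size in megs after lossy compression at ratio:1."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return uncompressed_mb / ratio


def estimate_zip_mb(pages, uncompressed_mb_per_page, ratio=10):
    """Rough zip size for a whole book; ignores zip overhead.

    The default of 10 matches the 10:1 default described above.
    """
    return pages * compressed_size_mb(uncompressed_mb_per_page, ratio)
```

With an assumed 300 pages at 10 megs each uncompressed, estimate_zip_mb(300, 10) works out to 300 megs at 10:1, and estimate_zip_mb(300, 10, ratio=20) halves that to 150 megs.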

The archive will accept a zip file containing jpegs, tiffs, and jp2 files. BSW uses jp2 as it gives the most control over the file size and a bit better compression than Jpeg files.

While it is preferable to transfer color images, there may be times where you need to do the transfer as grayscale or black and white. Color images are quite large, and if you have a slow connection it might not be feasible to transfer them. Grayscale images are about a third the size of full color, and black and white are even smaller. Or if you can’t get a good color image it may be best to save it as grayscale or black and white.

How long will it take to process?

Depending on what kind of compression you are using and the length of the book, the zip files will be around 200-800 megs, so it can take quite a while to transfer, depending on your connection.
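
For a rough idea of what "quite a while" means on your own connection, here is a small Python sketch. The function name is hypothetical, it treats one meg as 8 megabits, and it ignores protocol overhead, so real transfers will be somewhat slower.

```python
def transfer_minutes(size_mb, upload_mbit_per_s):
    """Rough upload time in minutes for a file of size_mb megs.

    Assumes 1 meg = 8 megabits and a steady upload speed.
    """
    if upload_mbit_per_s <= 0:
        raise ValueError("upload speed must be positive")
    seconds = size_mb * 8 / upload_mbit_per_s
    return seconds / 60
```

On an assumed 1 Mbit/s uplink, a 200 meg zip takes a bit under half an hour and an 800 meg zip a bit under two hours.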

After the file is uploaded, it sets in motion a bunch of steps that end with the book OCR’ed and converted to pdf, DjVu, Kindle, and other files. The process will take anywhere from an hour or so to a few days depending on how backed up the Archive is. You can check on the progress by logging into the archive, choosing patron info, then choosing tasks that are not yet completed.

For further information:

For more information about uploading books to the archive you can check these links out:

General overview on uploading content:
http://www.archive.org/about/faqs.php#Uploading_Content

Information on the _images.zip format:
http://raj.blog.archive.org/2011/02/24/ ... e-uploads/

Detailed information for Internet Archive partners. This has some good information on the Internet Archive process for scanning documents:
http://www.archive.org/details/ProcessDocument

Information on the protocol Book Scan Wizard uses to communicate with the Archive:
http://www.archive.org/help/abouts3.txt


Source: https://diybookscanner.org/forum/viewtopic.php?t=907
