A document scan is nothing more than a flat photograph of that document. If we took a picture of a document with a regular camera, the image would most likely come out distorted because of the way such cameras capture depth. A scanner, on the other hand, eliminates that third dimension, which makes it well suited to capturing images of text. Here is a comparison:
Performing the Scan
There are many ways to perform a scan, depending on the interface (or software) you use. In general you can choose between the scanner's native interface and running the scanner from within another program (Microsoft Word, Acrobat, OmniPage, Photoshop, etc.). For our purposes we will use the native interface, which is usually the most trouble-free and includes the most options. At the Scholars Lab we use EPSON scanners. All of them come with the same software, EPSON Scan, which has a shortcut on the desktop:
The software has several modes. Make sure you select Professional Mode. Once you’re there, you will need to make a few selections to get the right output from the machine. Here are the settings you need to pay attention to:
- Auto Exposure Type: Document
- Image Type: 24-bit Color
Color is necessary because we are trying to capture the minutiae of type. When a scanner reduces a document to a black-and-white image, the edges of the letters become needlessly pixelated.
- Resolution: 300 dpi
A resolution of 300 dpi (dots per inch) is a standard resolution for documents that need to be digitized for archival purposes. For libraries across the world the standard ranges between 300 dpi and 600 dpi. We will use the lower of those settings because our texts do not need to be analyzed under the digital microscope (many others do). All we need to do is produce well-proofed copies of these texts, and 300 dpi is good enough for that.
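To get a feel for what these settings mean in practice, the short calculation below estimates the pixel dimensions and uncompressed size of one scanned page. It is only a sketch: the 6 x 9 inch page size is an assumption chosen for illustration, not a measurement of our books, and real TIFF files will differ somewhat because of file headers and any compression.

```python
def scan_size(width_in, height_in, dpi, bits_per_pixel=24):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # 24-bit color stores 3 bytes per pixel (8 bits each for red, green, blue).
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


# Hypothetical 6 x 9 inch page, compared at the two archival resolutions.
for dpi in (300, 600):
    w, h, size = scan_size(6, 9, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, about {size / 1_000_000:.1f} MB uncompressed")
```

For this assumed page size, 300 dpi gives 1800 x 2700 pixels (about 14.6 MB), while 600 dpi gives 3600 x 5400 pixels (about 58.3 MB). Doubling the resolution quadruples the file size, which matters when four people share one flash drive.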
Choosing File Save Settings
The next step is to choose the appropriate file saving settings before we start scanning. Since we will be working in teams, it is important that we coordinate our files carefully. Remember, each team member is responsible for one quarter of the book. There is one flash drive and four users, so we must be careful. When the time comes to put all the files in the right order, we will save a lot of time if we handle the files correctly now.
There are three possibilities here:
a) You can save directly on the hard drive of the machine at the Scholars Lab,
b) You can save on your home directory, or
c) You can save directly on the flash drive.
(a) and (b) are poor choices: (a) because computers in the Scholars Lab are reformatted every night, erasing all files saved on the local hard disk, and (b) because there is not much space on your home directory. That leaves (c), the flash drive.
In order to preserve the order of the original, each member is assigned a number from 1 to 4 according to where their page allotment falls in the book. The prefix for all of our files is then the first four characters of the book + that number. As you scan, the software will add a running count to that prefix. Now let’s see how this works in practice:
1. Choose the File Save menu on the lower right-hand side column, next to the scan button.
2. Select a location to save the file by using the Browse button. In this case, you want to look for your flash drive in the file directory.
3. Select a prefix. In this case, the first 4 letters of the book + your number in the team roster. Ex.: spec1
4. Select a file type. In our case, you must select normal TIFF (.tif). Make sure you don’t select multi-TIFF.
NOTE: For some scanner software (e.g. HP), when you select the TIFF file type the software assumes that you want to use multi-TIFF. Usually you will find an option that allows you to opt out, saying something like, “when using a multi-file format, make individual files for each scan.” It is important that we make an individual file for each image, as you will soon see when we start using those files in an OCR environment.
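The naming scheme in the steps above can be sketched in a few lines of Python. This is an illustration only: the book title "Spectator" is a hypothetical example chosen to match the spec1 prefix, and the three-digit, zero-padded running count is an assumption about how the scanning software numbers files, so check what your software actually produces.

```python
def file_prefix(book_title, member_number):
    """First four characters of the book title plus the team member's number."""
    if member_number not in (1, 2, 3, 4):
        raise ValueError("member_number must be between 1 and 4")
    # Assumption: spaces are dropped and letters lowercased before truncating.
    return book_title.replace(" ", "")[:4].lower() + str(member_number)


def file_name(prefix, count, digits=3):
    """Prefix plus a zero-padded running count (padding width is assumed)."""
    return f"{prefix}{count:0{digits}d}.tif"


# Hypothetical title; team member 1, first scan.
print(file_name(file_prefix("Spectator", 1), 1))  # spec1001.tif

# Files from different team members, listed out of order...
names = [file_name(file_prefix("Spectator", m), c) for m in (2, 1) for c in (2, 1)]
# ...fall back into book order with a plain alphabetical sort.
print(sorted(names))
```

Because the member number comes before the running count, and the count is padded to a fixed width, an ordinary alphabetical sort puts all four members' files back in the order of the original book.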
Voila! You’re ready to scan.
The first thing you want to do is scan a preview of your document. The books we’re scanning in this class happen to fit in the average scanner with a small crop of the bottom part of the page. For the purposes of producing our PDFs this is good enough. Keep in mind that for other projects where an important primary source is being scanned to be archived, this small cropping is unacceptable.
If you happen to use one of the large scanners in the Slab, you will be expected to reduce the size of the scanning area using the preview. To do this, resize the dotted lines by dragging at the edges until the scanning area is the same size as the image of the book. If you do this, it is advisable to redo the preview every 5-6 scans to make sure you are still scanning within the boundaries.
Once our preview is acceptable, the next step is to hit the scan button. You will notice that the folder where the files are being saved comes up on the screen. If this new window overlaps the scanning interface, just move it to the side so that you can save time in between reloads. At this point you can scan all of your assigned pages continuously.
NOTE: If you stop your work halfway through, make sure that the next time you come around you resume at the page where you left off and that you tell the scanning software to start numbering pages from that point. You do this in the File Save Settings dialog.
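If you are unsure which number to resume from, you can work it out from the files already on the flash drive. The sketch below assumes the file names follow the prefix + running count + .tif pattern described earlier; the function name and the sample file names are made up for illustration.

```python
import re


def next_start_number(file_names, prefix):
    """Return one more than the highest running count already used for prefix."""
    # Match names such as spec1007.tif: the prefix, then digits, then .tif or .tiff
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.tiff?", re.IGNORECASE)
    highest = 0
    for name in file_names:
        match = pattern.fullmatch(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest + 1


# Hypothetical contents of the flash drive after a first session.
existing = ["spec1001.tif", "spec1002.tif", "spec1003.tif", "spec2001.tif"]
print(next_start_number(existing, "spec1"))  # 4
```

To run this against a real folder, pass it the result of os.listdir() for that folder, then enter the number it returns as the start number in the scanning software.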
For any questions please feel free to contact Chris or Alex by email or during support hours.
Good luck and happy scanning!