The 'book info' panel holds the URL input field as well as the received book information: title, number of pages and book ID. After the book URL has been entered and the 'get Book info' button has been pushed, Hathi Download Helper reads the html document.
If desired, a proxy server can be used by selecting the corresponding checkbox.
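As an illustration of how a book ID relates to a book URL, the following minimal sketch pulls the 'id' query parameter out of a URL. It is not the code used by Hathi Download Helper, which reads the html document itself. The function name extract_book_id and the example URL layout (a page URL carrying an 'id' parameter) are assumptions made for this example.

```python
from urllib.parse import urlparse, parse_qs


def extract_book_id(url: str) -> str:
    """Return the value of the 'id' query parameter of a book URL.

    Hypothetical helper for illustration only.
    """
    # Some URLs separate parameters with ';' instead of '&'. Recent Python
    # versions only split on '&', so the separators are normalized first.
    query = urlparse(url).query.replace(";", "&")
    ids = parse_qs(query).get("id")
    if not ids:
        raise ValueError(f"no 'id' parameter found in URL: {url!r}")
    return ids[0]


if __name__ == "__main__":
    # Assumed example URL; the ID matches the file name example in this document.
    print(extract_book_id("https://babel.hathitrust.org/cgi/pt?id=njp.32101076400420"))
```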
In the 'Download settings' panel the user can choose between two file formats:
pdfs : select this option to download the book pages as single searchable pdf documents generated by Hathitrust.org.
images: select this option to download the book pages as image files (jpeg, png).
The image quality can be adjusted by selecting a zoom factor.
To generate 'searchable' pdfs, Hathi Download Helper can download the ocr text in addition to the image files.
The ocr text files will be stored as html documents.
Using the 'pages' input field the user can decide either to download a whole book or only certain pages.
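The exact syntax accepted by the 'pages' input field is not described here. Purely as an illustration of the idea (whole book versus selected pages), the sketch below assumes a comma-separated list of page numbers and ranges such as "1-3,7", with an empty input meaning the whole book. Both the syntax and the function name parse_pages are assumptions, not the application's documented behaviour.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand a page specification such as "1-3,7" into sorted page numbers.

    Assumed syntax: comma-separated numbers and 'first-last' ranges.
    An empty specification selects the whole book (pages 1..page_count).
    """
    spec = spec.strip()
    if not spec:
        return list(range(1, page_count + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
        else:
            first = last = int(part)
        if first < 1 or last > page_count or first > last:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(first, last + 1))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_pages("1-3, 7", page_count=10))
    print(parse_pages("", page_count=4))
```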
Hathi Download Helper creates the following sub-folder structure for downloaded data:
'pdfs' : For pdf files
'images' : For image files
'ocr' : For ocr text (*.html)
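The folder layout listed above can be reproduced with a few lines of code. The following is a minimal sketch, not the application's own implementation; the function name create_download_folders and the idea of passing a base directory are assumptions for this example.

```python
import tempfile
from pathlib import Path

# Sub-folder names as listed in this document.
SUBFOLDERS = ("pdfs", "images", "ocr")


def create_download_folders(base_dir) -> dict:
    """Create the 'pdfs', 'images' and 'ocr' sub-folders below base_dir.

    Existing folders are left untouched. Returns a mapping name -> Path.
    """
    base = Path(base_dir)
    created = {}
    for name in SUBFOLDERS:
        folder = base / name
        folder.mkdir(parents=True, exist_ok=True)
        created[name] = folder
    return created


if __name__ == "__main__":
    # Demonstration in a temporary directory that is removed afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        for name, folder in create_download_folders(tmp).items():
            print(name, "->", folder)
```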
In the 'PDF merge & conversion' panel the user can choose between the following options:
'merge pdfs' : Merge single pdf files using the free tool 'pdftk' (http://www.pdflabs.com)
'convert & merge images to pdf book': Convert and merge images to a pdf book. Page size and page margins are editable via 'Options' -> 'Page setup'
'convert images to single pdf files' : Create single pdf files for each page.
A further setting defines the output resolution for pdf files generated by Hathi Download Helper from images/ocr text files.
Hathi Download Helper creates the following sub-folder structure for converted data:
'pdfs' : For generated pdf files. Existing files will be overwritten.
Hathi Download Helper uses a fixed naming scheme for downloaded data: the document ID with reserved characters removed (e.g. 32101076400420) + "_page_" + page number + file type extension, e.g. njp.32101076400420_page_001.pdf
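The naming scheme can be sketched as a small function. This is an illustration only: which characters count as 'reserved' is not stated in this document, so the sketch assumes the characters that Windows forbids in file names, and the three-digit zero padding is inferred from the single example above. The name build_page_filename is hypothetical.

```python
import re

# Assumption: 'reserved characters' means the characters that are not
# allowed in Windows file names.
RESERVED_CHARS = re.compile(r'[<>:"/\\|?*]')


def build_page_filename(book_id: str, page: int, extension: str) -> str:
    """Build '<id>_page_<number>.<extension>' for one downloaded page."""
    safe_id = RESERVED_CHARS.sub("", book_id)
    # Assumption: page numbers are zero-padded to three digits, as in the
    # example file name given in this document.
    return f"{safe_id}_page_{page:03d}.{extension}"


if __name__ == "__main__":
    print(build_page_filename("njp.32101076400420", 1, "pdf"))
```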
Hathi Download Helper can merge any pdf files using the 'pdftk' application. For this purpose the radio button "merge pdfs" has to be selected. If the selected folder contains no content downloaded by Hathi Download Helper (files/folders), a file dialog for file selection will appear. If you are running a Linux or Mac OS system you have to install the 'pdftk' tool (http://www.pdflabs.com).
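For readers who want to merge pdf files by hand, the sketch below builds and runs a pdftk command line of the form "pdftk in1.pdf in2.pdf cat output book.pdf", which is pdftk's 'cat' operation. It is not necessarily how Hathi Download Helper invokes pdftk internally; the function names and the alphabetical ordering of the input files are assumptions for this example.

```python
import subprocess
from pathlib import Path


def build_merge_command(pdf_files, output_file, pdftk="pdftk"):
    """Return the pdftk argument list that concatenates pdf_files."""
    inputs = [str(p) for p in pdf_files]
    if not inputs:
        raise ValueError("no pdf files to merge")
    return [pdftk, *inputs, "cat", "output", str(output_file)]


def merge_pdfs(folder, output_file, pdftk="pdftk"):
    """Merge all pdf files in folder, in alphabetical order.

    Alphabetical order matches page order when page numbers are
    zero-padded. Write output_file outside of folder so that it is not
    picked up as an input on a later run.
    """
    pdf_files = sorted(Path(folder).glob("*.pdf"))
    subprocess.run(build_merge_command(pdf_files, output_file, pdftk), check=True)


if __name__ == "__main__":
    # Only prints the command; merge_pdfs() requires an installed pdftk.
    print(build_merge_command(["page_001.pdf", "page_002.pdf"], "book.pdf"))
```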
Hathi Download Helper can convert a number of different image formats into pdf files. For this purpose the radio button "convert & merge images to pdf book" or "convert images to single pdf files" has to be selected. If the selected folder contains no content downloaded by Hathi Download Helper (files/folders), a file dialog for file selection will appear.
Hathi (pronounced hah-tee) is the Hindi word for elephant, an animal highly regarded for its capability to suck a huge amount of water into its trunk and blow it into its mouth. In computer networks, to download means to receive data to a local system from a remote system, or to initiate such a data transfer. Helper refers to a device that helps. In combination, the words convey the key benefit users can expect from this application: downloading pages or complete books in an easy way.
For merging existing pdf files Hathi Download Helper uses the 'pdftk' application. The error may occur because of missing permissions for the pdftk file or because of missing files. To fix this error, take the following actions depending on your OS:
Windows
1. Download and install 'pdftk' from http://www.pdflabs.com
2. Open the pdftk program folder and copy the files pdftk.exe and libiconv2.dll
3. Open the Hathi Download Helper folder containing the hathidownloadhelper.exe file and create a new folder named pdftk
4. Paste the files copied in step 2 into the pdftk subfolder.
Hint: If you have compiled Hathi Download Helper on your own, you have to place the pdftk subfolder in your Debug/Release target folder containing the HathiDownloadHelper.exe file.
Linux/Mac
Download and install 'pdftk' from http://www.pdflabs.com or use the pdftk file placed in the pdftk subfolder attached to this project.
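To check whether pdftk can be found after following the steps above, a lookup along these lines may help. It is a sketch under assumptions: the search order (the pdftk subfolder next to the application first, then the system PATH) and the name find_pdftk are chosen for illustration and do not necessarily match what Hathi Download Helper does.

```python
import shutil
import tempfile
from pathlib import Path


def find_pdftk(app_dir):
    """Return the path of a pdftk executable, or None if none is found.

    Looks in the 'pdftk' subfolder of app_dir first (assumed layout, as
    described in the steps above), then on the system PATH.
    """
    for name in ("pdftk.exe", "pdftk"):
        candidate = Path(app_dir) / "pdftk" / name
        if candidate.is_file():
            return str(candidate)
    return shutil.which("pdftk")


if __name__ == "__main__":
    # Demonstration with an empty placeholder file in a temporary folder.
    with tempfile.TemporaryDirectory() as tmp:
        subfolder = Path(tmp) / "pdftk"
        subfolder.mkdir()
        (subfolder / "pdftk").write_text("")
        print(find_pdftk(tmp))
```

If the function returns None, pdftk is neither in the subfolder nor installed system-wide, and merging is likely to fail.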