Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Image File format: TIFF / 8-bit grayscale / uncompressed / embedded sGray ICC profile / 300 -- 400 ppi relative to the size of the analog content represented on film.

  • OCR File formats:

    • ALTO, UTF-8 encoded
    • Plain text, , UTF-8 encoded
  • Directory- and file-naming

    • directories: [document id]

    • image files: [document id]_####.tif
    • ocr files:
      • ALTO [document id]_####.xml
      • Plain text [document id]_####.txt
      directories
    • pdf file: [document id].pdf
  • Directory / file structure

	Example:
	└── 990009660700203941
	    ├── 990009660700203941_0001.tif
	    ├── 990009660700203941_0001.xml
	    ├── 990009660700203941_0002.tif
	    ├── 990009660700203941_0002.xml
	    ├── 990009660700203941_0003.tif
	    ├── 990009660700203941_0003.xml
	    ├── 990009660700203941_0004.tif
	    ├── 990009660700203941_0004.xml
	    ├── 990009660700203941_0005.tif
	    ├── 990009660700203941_0005.xml
	    ├── [...]
	    ├── 990009660700203941_9999.tif
	    ├── 990009660700203941_9999.xml
|__ 990009660700203941.pdf

Returning data and films to Harvard

...