BIT
The fundamental unit of digital information storage, which can have a binary value of either 1 or 0.
BIT ROT
BITSTREAM
A sequence of bits that has meaningful common properties for the purposes of preservation. A bitstream may be a file or a component of a file.
BORN-DIGITAL assets are those that originated in digital form, such as Web sites, wikis, e-books, digital sound recordings, and email.
BYTE
A unit of digital information and measure of data volume, normally equivalent to eight bits.
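As a small illustration of the relationship between bits and bytes, the sketch below assumes the usual eight-bit byte; the variable names are illustrative only.

```python
# One byte written out as its eight individual bits (binary literal).
byte_value = 0b01000001

# Render the value back as a zero-padded, eight-character string of bits.
bits = format(byte_value, "08b")

# Eight bits, each either 0 or 1, allow 2**8 distinct values per byte.
distinct_values = 2 ** 8

# The same byte interpreted as a single ASCII character.
as_bytes = bytes([byte_value])

print(bits, distinct_values, as_bytes)
```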
CHECKSUM is a value used for validating data integrity. A hash algorithm such as MD5 (Message-Digest algorithm 5) is applied to the source (typically a file and its content, such as the image of a scanned page from a book) in order to generate a fixed-length hash value (128 bits in the case of MD5), often called a checksum. In digital preservation processes, the checksum generated when the content was created is compared with a checksum generated after the content has been received or stored over a period of time. If the two values match, this indicates that the data (e.g. the scanned page image) is intact and has not been altered.
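The comparison described above can be sketched in a few lines using Python's standard hashlib module. This is a minimal illustration, not a description of any particular repository's workflow: the function name md5_checksum and the sample byte strings are assumptions standing in for real file content.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for the bytes of a scanned page image.
original = b"scanned page image bytes"

# Checksum recorded when the content was created.
stored_checksum = md5_checksum(original)

# Content as later received or retrieved from storage: one copy intact,
# one copy with a single altered byte.
received_intact = b"scanned page image bytes"
received_damaged = b"scanned page image bytez"

# Matching checksums indicate the data has not been altered.
intact_ok = md5_checksum(received_intact) == stored_checksum
damaged_ok = md5_checksum(received_damaged) == stored_checksum

print(intact_ok, damaged_ok)
```

For large files, the content would normally be read and hashed in chunks rather than loaded into memory at once.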
DIGITAL ART may be as simple as digital photography or it may be much more complex in that it could be mixed media, dynamic, or could require recreation of an entire installation to render it effectively. More complex forms of digital art will likely require one-off solutions.
FEDORA (http://www.fedora-commons.org/) (Flexible Extensible Digital Object Repository Architecture) is a software framework to construct and maintain repositories of digital objects.
HYDRA (http://projecthydra.org/) is an open source repository software solution.
JPEG (Joint Photographic Experts Group) is the name of the group that developed the standard and of the standard itself, a compression method for images. JPG (.jpg) is a common filename extension for JPEG files.
JPEG 2000 is a wavelet-based image compression standard. It was created by the Joint Photographic Experts Group committee in the year 2000 with the intention of superseding their original discrete cosine transform-based JPEG standard (created about 1991). The standardized filename extension is JP2.
MASTER (copy) is also known as Preservation Master or Archival Master.
OCR (Optical Character Recognition) is computer software designed to convert images of text (usually captured by a scanner) into machine-editable text.
PDF (Portable Document Format) is a file format, created by Adobe Systems, for document exchange in a manner independent of the application software, hardware, and operating system.
TIFF (Tagged Image File Format) is an image file format widely regarded as well suited to preservation and technical longevity.
TXT (.txt) is a file format used for textual documents usually containing very little formatting.
UTF-8 (Unicode Transformation Format, 8-bit) is a character encoding for Unicode that is backwards compatible with ASCII. It can represent, in email and in Internet browsers, the standard 128 ASCII characters for English as well as Latin alphabet characters with diacritics, Greek, Cyrillic, Coptic, Armenian, Hebrew, and Arabic characters.
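The ASCII compatibility mentioned above can be checked with a short sketch using Python's built-in string encoding. The sample strings and variable names are illustrative assumptions; non-ASCII characters are written as escapes so the example does not depend on the encoding of this document.

```python
# Plain English text: every character is in the 128-character ASCII range.
ascii_text = "Archive"

# UTF-8 and ASCII produce identical bytes for ASCII-only text.
same_as_ascii = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Sample characters from several scripts named in the entry above.
samples = {
    "latin_capital_a": "A",           # ASCII letter
    "latin_e_acute": "\u00e9",        # Latin letter with diacritic
    "greek_omega": "\u03a9",          # Greek
    "cyrillic_zhe": "\u0416",         # Cyrillic
    "hebrew_shin": "\u05e9",          # Hebrew
}

# Number of bytes UTF-8 uses for each sample character.
byte_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# Encoding and then decoding returns the original text unchanged.
mixed_text = "".join(samples.values())
round_trip = mixed_text.encode("utf-8").decode("utf-8") == mixed_text

print(same_as_ascii, byte_lengths, round_trip)
```

ASCII characters occupy one byte each, which is what makes existing ASCII files valid UTF-8, while the other sample characters here take two bytes each.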
WEB HARVESTING is performed using open-source tools developed by the Internet Archive to crawl and provide access to the content. The data can be kept in the ISO standard WARC (WebARChive) file format. The approach to Web harvesting is fairly stable.