Digital Preservation Levels Based On Format
Binghamton University Libraries evaluates the level of digital preservation that can be provided on a collection-level basis. However, to help the University community plan, we are in the process of identifying the level of preservation we can provide based on a digital object's file format. Below are charts that show the three proposed levels and the level of preservation we can provide for some common formats.*
- Full Support: The Libraries will take all reasonable actions to maintain usability including migration, emulation, or normalization. The Libraries will ensure access and data fixity.
- Limited Support: The Libraries will take limited steps to maintain usability. The Libraries may actively transform a file from one format to another to mitigate format obsolescence. The Libraries will ensure access and data fixity.
- Basic Support: The Libraries will provide for access to the item in its submission file format only. The Libraries will work to ensure data fixity.
Levels of Digital Preservation Support
|Preservation action||Full Support||Limited Support||Basic Support|
|Create a permanent URL that will point to the object and/or its metadata||•||•||•|
|Generate preservation metadata to support accessibility, provenance, and management over time||•||•||•|
|Provide discovery of objects via the Libraries' Find It! discovery layer||•||•||•|
|Periodically refresh storage media||•||•||•|
|Perform fixity checks using proven checksum methods on a regular basis||•||•||•|
|Undertake strategic monitoring of format||•||•|| |
|Offer long-term storage in a trusted preservation-worthy format||•|| || |
|Plan and perform migration to succeeding format upon obsolescence||•|| || |
|Plan and perform normalization if necessary||•|| || |
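The fixity checks listed in the table can be illustrated with a short sketch. This is not the Libraries' actual tooling: the choice of SHA-256 and the helper names `sha256_of` and `fixity_ok` are assumptions made for illustration. The idea is to record a checksum when an object is deposited and recompute it later; any mismatch means the bits have changed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the digest recorded at deposit."""
    return sha256_of(path) == recorded_digest.lower()
```

In practice the recorded digest would be stored with the object's preservation metadata and `fixity_ok` would be run on a schedule.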
File Format Support
The following table details preservation support levels for commonly used file formats. Unless otherwise noted, the latest version of a file format should be used.
|Format||File Extension||Support Level||Notes|
|Text and other document formats|
|Comma Separated Values||.csv||Full Support|
|Microsoft .doc||.doc||Basic Support||Microsoft Word switched to the .docx format with the introduction of MS Word 2007. The .doc format is therefore on track to become obsolete, and some earlier versions of .doc are already unreadable by current versions of MS Word. MS Word .doc files should be converted to a more preservation-friendly format such as PDF/A before submission.|
|Microsoft .docx||.docx||Limited Support||Microsoft Word switched to the .docx format with the introduction of MS Word 2007. The .docx format is XML-based and should be extensible, but it is still recommended to convert files to PDF/A when possible.|
|PDF/A||.pdf||Full Support||PDF/A is the preferred version of PDF for archival preservation. PDF/A-1 (ISO 19005-1:2005) and PDF/A-2 (ISO 19005-2:2011) are both supported. Full support is only provided for PDF/A files and PDF files with embedded fonts. Limited support is offered for other valid PDF files.|
|PDF (with embedded fonts)||.pdf||Full Support||PDF/A is the preferred version of PDF for archival preservation. Full support is only provided for PDF/A files and PDF files with embedded fonts. Limited support is offered for other valid PDF files.|
|PDF (other)||.pdf||Limited Support||PDF/A is the preferred version of PDF for archival preservation. Full support is only provided for PDF/A files and PDF files with embedded fonts. Limited support is offered for other valid PDF files.|
|Plain text||.txt||Full Support||Plain text using charset encoding UTF-8, US-ASCII, or UTF-16 with Byte Order Mark.|
|WordPerfect||.wpd||Basic Support||Convert files to PDF/A or PDF with embedded fonts when possible.|
|Image formats|
|JPEG 2000||.jp2||Full Support||The use of lossless (reversible) compression is recommended, although "visually lossless" compression, which is actually lossy (irreversible), is also acceptable for most images. For more details, read the report "JPEG 2000 as a Preservation and Access Format for the Wellcome Trust Digital Library" by Robert Buckley.|
|TIFF||.tif, .tiff||Full Support||TIFF 6.0 is considered the best format for storing your master images. Best practice is to save these files with no compression.|
|Audio formats|
|MP3||.mp3||Full Support||The general preference for preservation-oriented recorded sound is uncompressed Wave. MP3 uses lossy (irreversible) compression and there are possible patent issues (all US patents appear to expire on or before December 30, 2017). However, MP3 is an open ISO standard in wide use. Thus, for compressed sound, MP3 is acceptable, especially at data rates of 128 kb/s (mono) or 256 kb/s (stereo) or higher. The patent-free, open FLAC standard, which uses lossless (reversible) compression, may be a good alternative to MP3 and Wave for some.|
|FLAC||.flac||Full Support||FLAC is a patent-free, open standard that utilizes lossless (reversible) compression. It may be a good alternative for some who are concerned about file size.|
|Wave||.wav||Full Support||This file format can store all the data in an uncompressed format and its wide use suggests long-term community support.|
|Video formats|
|Note: Video files usually contain multiple formats within a wrapper. These formats may include audio formats, text formats (for closed captions), graphical formats, and others. Typically they also use some form of compression. The examples below should therefore be considered general guidelines only, and digital video objects will need to be evaluated on a case-by-case basis. For more information, see the Whither Digital Video Preservation? post on the Library of Congress's digital preservation blog.|
|Motion JPEG 2000||.mj2, .mjp2||Limited Support||Files should be losslessly compressed JPEG 2000 video wrapped in Material Exchange Format (MXF). Motion JPEG 2000 is under consideration as a digital archival format by the Library of Congress. It is an open ISO standard and an advanced update to MJPEG (or MJ). Details about the Material Exchange Format (MXF) are available from the Library of Congress Sustainability of Digital Formats website.|
|MPEG-2||.mp2||Limited Support||Details about MPEG-2 and digital preservation are available from the Library of Congress Sustainability of Digital Formats website. Note that according to the Library of Congress digital preservation guidelines for audio streams in MPEG-2 formats, AAC is preferred to other audio encodings.|
|MPEG-4 (file format version #2)||.mp4||Limited Support||Details about MPEG-4 and digital preservation are available from the Library of Congress Sustainability of Digital Formats website. Note that according to the Library of Congress digital preservation guidelines for audio streams in MPEG-4 formats, AAC is preferred to other audio encodings.|
Any file format not listed will default to Basic Support until it is evaluated by the Libraries for its preservation qualities. If you have a specific file format you would like evaluated, please contact us.
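The table and its default rule can be read as a simple lookup, sketched below. The dictionary is transcribed from the table above, while the function name `support_level` is hypothetical. One assumption is labeled in the code: a `.pdf` file is mapped to Limited Support, because telling a PDF/A or font-embedded PDF apart from other valid PDFs requires inspecting the file, not just its extension.

```python
from pathlib import PurePath

# Support levels transcribed from the table above.
SUPPORT_BY_EXTENSION = {
    ".csv": "Full Support",
    ".doc": "Basic Support",
    ".docx": "Limited Support",
    # Assumption: extension alone cannot prove PDF/A or embedded fonts,
    # so a .pdf is treated as "PDF (other)" until it is inspected.
    ".pdf": "Limited Support",
    ".txt": "Full Support",
    ".wpd": "Basic Support",
    ".jp2": "Full Support",
    ".tif": "Full Support",
    ".tiff": "Full Support",
    ".mp3": "Full Support",
    ".flac": "Full Support",
    ".wav": "Full Support",
    ".mj2": "Limited Support",
    ".mjp2": "Limited Support",
    ".mp2": "Limited Support",
    ".mp4": "Limited Support",
}


def support_level(filename: str) -> str:
    """Return the listed support level, or Basic Support for unlisted formats."""
    extension = PurePath(filename).suffix.lower()
    return SUPPORT_BY_EXTENSION.get(extension, "Basic Support")
```

A lookup like this is only a first pass: video wrappers and compressed files still need the case-by-case evaluation described in the notes.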
Note on compression: When a non-open compression algorithm is used, the Libraries may only be able to guarantee the Basic level of digital preservation support. Also, when using compression, it is important to remember that lossy compression is irreversible: some level of detail is lost and cannot be recovered.
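The difference between lossless and lossy compression can be shown with a toy sketch. It is only an illustration: `zlib` stands in for a lossless codec, and the hypothetical `quantize` function (which discards the low-order bits of each sample) stands in for the detail-discarding step of a lossy codec. Real codecs such as MP3 or lossy JPEG 2000 are far more sophisticated.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the result equals the input."""
    return zlib.decompress(zlib.compress(data))


def quantize(data: bytes, drop_bits: int = 4) -> bytes:
    """Toy lossy step: zero the low-order bits of every 8-bit sample."""
    mask = 0xFF & ~((1 << drop_bits) - 1)
    return bytes(sample & mask for sample in data)


# Stand-in for 8-bit audio samples or pixel values.
samples = bytes(range(0, 256, 3))

restored = lossless_round_trip(samples)  # identical to samples
degraded = quantize(samples)             # detail removed for good
```

Once the low-order bits are discarded, nothing in `degraded` records what they were, so no later processing can restore the original samples.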