The libtiff library is an open-source cross-platform library which enables applications to read and write images stored in TIFF (Tagged Image File Format) files. The TIFF file format is a widely accepted standard, supported by many applications on a wide range of platforms due to its capabilities and flexibility. Please see the libtiff home page for more information about libtiff and to download source code, executable libraries, and documentation.
Until recently TIFF files were limited in size to 4GB. A new version of libtiff has been created which supports BigTIFF files - TIFF files which are larger than 4GB. The new version is backward-compatible with previous versions and in many cases applications will not have to change at all in order to read or write BigTIFF files. In other cases the modifications required will be minor.
Please note this is not an official release of libtiff. This version has been submitted for a future standard update. It was named version 4.0 since the most recent official version is 3.8.2 (on which these changes were based). A subsequent update to the library to reduce the memory footprint for processing large TIFF files has been named version 4.1. The actual version when officially released may be different.
There are more technical details below, but at the highest level the BigTIFF changes made to libtiff were quite simple. The TIFF file format internally uses 32-bit byte offsets. A 32-bit offset can address at most 2^32 bytes = 4GB, making that the upper limit of the file size which could be supported by the design. The BigTIFF modifications to libtiff consisted primarily of changing all internal byte offsets to 64 bits. A key goal was to maintain backward compatibility with existing applications and files to the largest extent possible.
BigTIFF files have a ".tif" or ".tiff" file extension just like ordinary TIFF files. A new version in the file header prevents programs which have not linked with the BigTIFF version of libtiff from processing BigTIFF files. The library has been modified so that programs writing TIFF files do not need to know or care when they reach a 4GB data size. The file format will smoothly change from a standard TIFF file (compatible with all existing programs) to a BigTIFF file (compatible with all programs linked with a new version of the libtiff library). Similarly, programs reading TIFF files do not need to know or care whether a file is a standard TIFF file or a BigTIFF file.
The basic BigTIFF design was first proposed in 2004 and refined in this discussion on the Aware Systems mailing list. Contributors to the design discussion included Lynn Quam, Frank Warmerdam, Chris Cox, Rob Tillaart, Dan Smith, Bob Freisenhahn, Andrey Kiselev, Phillip Crews, and Gerben Vos. We thank all those who came before us for creating libtiff and designing the BigTIFF enhancements.
These changes were made by Ole Eichhorn (ole.eichhorn@gmail.com) while at Aperio (www.aperio.com) and are donated to the public domain, in gratitude to Sam Leffler, Silicon Graphics, Joris Van Damme, Aware Systems, Frank Warmerdam, Andrey Kiselev, Mike Welles, and all who have worked on libtiff over the years to provide such a great tool. These changes are published on an "as is" basis and neither Ole Eichhorn nor Aperio makes any warranty as to their fitness for any intended use.
A number of BigTIFF sample images have been made by stitching together copies of a digitized microscope slide. These images may be viewed in your web browser by clicking on the thumbnails below:
Each of these files is a pyramid TIFF file, that is, there are multiple images in each file stored at different resolutions to facilitate rapid zooming. The images have a tiled organization to facilitate rapid panning. Image data have been compressed with JPEG at quality 30 for an average compression ratio of 20:1. Please click on the "more details" links to view the pyramid dimensions and other image parameters.
Aperio is digitizing pathology. We provide systems and services for digital pathology, a digital environment for the management and interpretation of pathology information that originates with the digitization of a glass slide.
Aperio's ScanScope slide scanning systems and Spectrum digital pathology information management software improve the efficiency and quality of pathology services for pathologists and other professionals. ScanScope scanners create a digital image of an entire microscope slide at giga-pixel resolution in minutes, with inherently superior image quality. The Spectrum software provides a consolidated view of relevant case information - anywhere, anytime - and with clinical and workflow tools to improve the quality and efficiency of pathology services.
Scanned microscope slide images are very large - with dimensions that routinely exceed 100,000 x 100,000 pixels. The TIFF standard is perfect for storing digital slides - it is an open standard supported by a large number of applications on a wide variety of platforms. Lossy compression technologies such as JPEG2000 and JPEG are needed to keep file sizes manageable, but even so the 4GB file size limit had been a problem. With BigTIFF support it is now possible to store very large digital slides as TIFF files. Aperio is a strong believer in open standards and we hope that with these enhancements TIFF will continue to be the standard for storing and managing very large images such as digital slides.
The library retains the capability of writing standard TIFF files, compatible with previous versions of the library and other existing software. Standard TIFF files contain 32-bit offsets to directories and image data. This limits the total size of a standard TIFF file to 4GB (2^32 bytes).
The library now has the capability of writing BigTIFF files, which contain 64-bit offsets to directories and image data. For a BigTIFF file there is no practical limit to the size of a file. The BigTIFF file format is described in detail here: http://www.awaresystems.be/imaging/tiff/bigtiff.html.
There are several places where the format of a BigTIFF file differs from a standard TIFF file:
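Standard TIFF header:

Offset | Size | Contents |
---|---|---|
0 | 16-bit | 0x4D4D constant |
2 | 16-bit | 0x002A version = standard TIFF |
4 | 32-bit | offset to first directory |

BigTIFF header:

Offset | Size | Contents |
---|---|---|
0 | 16-bit | 0x4D4D constant |
2 | 16-bit | 0x002B version = BigTIFF |
4 | 16-bit | 0x0008 bytesize of offsets |
6 | 16-bit | 0x0000 constant |
8 | 64-bit | offset to first directory |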
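The header layouts listed above can be told apart by the 16-bit version field alone. The following is a minimal sketch of that check in plain C, not code taken from libtiff: the names `tiff_header_info`, `read_be`, and `parse_tiff_header` are hypothetical, and the sketch assumes the big-endian byte order implied by the 0x4D4D value shown above (little-endian TIFF files begin with 0x4949 and are not handled here).

```c
#include <stddef.h>
#include <stdint.h>

/* Result of inspecting the first bytes of a TIFF file (hypothetical type). */
typedef struct {
    int      is_bigtiff;        /* 1 if version is 0x002B, 0 if 0x002A */
    uint64_t first_dir_offset;  /* widened to 64 bits in both cases */
} tiff_header_info;

/* Read an unsigned integer of `len` bytes stored in big-endian order. */
static uint64_t read_be(const unsigned char *p, size_t len)
{
    uint64_t v = 0;
    for (size_t i = 0; i < len; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Parse a big-endian (0x4D4D) header.
 * Returns 0 on success, -1 if the bytes are not a recognised header. */
int parse_tiff_header(const unsigned char *buf, size_t len,
                      tiff_header_info *out)
{
    if (len < 8 || read_be(buf, 2) != 0x4D4D)
        return -1;

    uint64_t version = read_be(buf + 2, 2);
    if (version == 0x002A) {            /* standard TIFF: 32-bit offset at 4 */
        out->is_bigtiff = 0;
        out->first_dir_offset = read_be(buf + 4, 4);
        return 0;
    }
    if (version == 0x002B) {            /* BigTIFF: 64-bit offset at 8 */
        if (len < 16 || read_be(buf + 4, 2) != 8 || read_be(buf + 6, 2) != 0)
            return -1;
        out->is_bigtiff = 1;
        out->first_dir_offset = read_be(buf + 8, 8);
        return 0;
    }
    return -1;
}
```

Storing the first directory offset in a 64-bit field for both formats mirrors the approach described in this document, where the version only affects how many bytes are read from the file.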
Standard TIFF directory:

Offset | Size | Contents |
---|---|---|
0 | 16-bit | number of directory entries |
- | - | for each entry (offsets relative to the start of the entry): |
0 | 16-bit | tag identifying information for entry |
2 | 16-bit | data type of entry (for standard TIFF 14 types are defined, 0-13) |
4 | 32-bit | count of elements for entry |
8 | 32-bit | data itself (if <= 32 bits) or offset to data for entry |
- | - | after all entries: |
2+n*12 | 32-bit | offset to next directory or zero |

BigTIFF directory:

Offset | Size | Contents |
---|---|---|
0 | 64-bit | number of directory entries |
- | - | for each entry (offsets relative to the start of the entry): |
0 | 16-bit | tag identifying information for entry |
2 | 16-bit | data type of entry (for BigTIFF 17 types are defined, 0-13, 16-18) |
4 | 64-bit | count of elements for entry |
12 | 64-bit | data itself (if <= 64 bits) or offset to data for entry |
- | - | after all entries: |
8+n*20 | 64-bit | offset to next directory or zero |
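The directory layouts above imply simple size formulas, which are handy when checking a file by hand. The helpers below are an illustrative sketch only (the function names are hypothetical and do not exist in libtiff); they restate the `2+n*12` and `8+n*20` expressions from the layout and add the width of the trailing next-directory offset.

```c
#include <stdint.h>

/* Position of the "offset to next directory" field, relative to the start
 * of a directory holding n entries. */
uint64_t next_dir_field_pos(int is_bigtiff, uint64_t n)
{
    return is_bigtiff ? 8 + n * 20   /* 64-bit count, 20-byte entries */
                      : 2 + n * 12;  /* 16-bit count, 12-byte entries */
}

/* Total bytes occupied by a directory with n entries, including the
 * trailing next-directory offset (8 bytes for BigTIFF, 4 otherwise). */
uint64_t dir_size(int is_bigtiff, uint64_t n)
{
    return next_dir_field_pos(is_bigtiff, n) + (is_bigtiff ? 8 : 4);
}

/* Largest value, in bytes, that fits directly in an entry before the
 * field has to hold an offset to the data instead. */
unsigned inline_value_bytes(int is_bigtiff)
{
    return is_bigtiff ? 8 : 4;
}
```

For a directory with 10 entries this gives 126 bytes in a standard TIFF file and 216 bytes in a BigTIFF file.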
When a file is opened for reading the TIFF version in the file header is used to determine whether the file is a standard TIFF (uses 32-bit offsets) or BigTIFF (uses 64-bit offsets). In either case 64-bit offsets are used in internal data structures.
When a file is written it is initially unknown whether it will have to be a BigTIFF (if the total size is less than 2^32 bytes, there is no reason to make it a BigTIFF file). The library begins by writing a standard TIFF file with 32-bit offsets. It leaves 8 extra bytes of space after the file header in case the file will need to become a BigTIFF file, but otherwise the processing is identical to before. Internally the library will use 64-bit offsets but externally the file will contain 32-bit offsets.
Image data are written to the file as before, with no change. All internal data offsets are 64-bit and all I/O routines have been modified to use 64-bit file offsets; files larger than 2^32 bytes are supported seamlessly.
When a directory is written, if the file is not already in BigTIFF format, note is made whether the directory will be located beyond 2^32 bytes from the start of the file. If so, the file is converted into a BigTIFF file.
The following things are done to convert from standard TIFF to BigTIFF at this point:
After this point, all directories are written with the BigTIFF format, with 64-bit data offsets, 64-bit strip/tile offsets, etc. Subdirectory offsets are written as 64-bit values, and the pointers linking each directory to the next use 64-bit offsets.
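The trigger for the conversion described above can be sketched as a single comparison on 64-bit values. This is an illustration under stated assumptions, not the library's code: `directory_needs_bigtiff` is a hypothetical name, and the sketch assumes that a directory counts as "beyond 2^32" when any of its bytes would lie past that boundary.

```c
#include <stdint.h>

/* First byte position that a 32-bit offset cannot address. */
#define CLASSIC_OFFSET_LIMIT (UINT64_C(1) << 32)

/* Returns 1 if a directory of dir_size bytes written at dir_offset would
 * extend past the 2^32-byte boundary, 0 otherwise.
 * Assumption: "located beyond 2^32" means any byte crosses the boundary.
 * The comparison is arranged so the addition cannot overflow. */
int directory_needs_bigtiff(uint64_t dir_offset, uint64_t dir_size)
{
    if (dir_size > CLASSIC_OFFSET_LIMIT)
        return 1;
    return dir_offset > CLASSIC_OFFSET_LIMIT - dir_size;
}
```

Because the check is made in 64-bit arithmetic, it only works if offsets are already tracked as 64-bit values while the file is still a standard TIFF, which is the design this document describes.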
The general technique was to leave all data structures and types alone with the exception of offsets within the file. All file offsets are stored internally and processed in the TIFF objects as toff_t (unsigned 64-bit) values.
The library was originally designed to keep tables with file block offsets and lengths in memory. A subsequent update to the BigTIFF version of the library modified this logic to manage block tables in segments. Now only fixed portions of the block tables are kept in memory, paged in and out from the TIFF file on disk; this enables files of any size whatsoever to be processed.
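The idea of keeping only a fixed portion of a block table in memory can be illustrated with a small read-only cache. Everything in this sketch is hypothetical (the `segmented_table` type, the `load_fn` callback, and the tiny segment size are chosen for illustration); it does not reproduce the library's implementation, and it omits the write-back half of the paging that the library performs.

```c
#include <stdint.h>
#include <string.h>

#define SEGMENT_ENTRIES 4   /* deliberately tiny, for illustration only */

/* Hypothetical callback: copy `count` table entries starting at index
 * `first` from the backing store (the file, in a real reader) into dst. */
typedef void (*load_fn)(void *ctx, uint64_t first, uint64_t count,
                        uint64_t *dst);

typedef struct {
    load_fn  load;
    void    *ctx;
    uint64_t total;                 /* entries in the full table */
    uint64_t seg_first;             /* index of first resident entry */
    uint64_t seg_count;             /* 0 means nothing is resident yet */
    uint64_t seg[SEGMENT_ENTRIES];  /* the only entries held in memory */
    unsigned loads;                 /* number of segment loads so far */
} segmented_table;

void segtab_init(segmented_table *t, load_fn load, void *ctx, uint64_t total)
{
    memset(t, 0, sizeof *t);
    t->load = load;
    t->ctx = ctx;
    t->total = total;
}

/* Return entry `index`, paging in its segment if it is not resident.
 * The caller must ensure index < total. */
uint64_t segtab_get(segmented_table *t, uint64_t index)
{
    if (t->seg_count == 0 || index < t->seg_first ||
        index >= t->seg_first + t->seg_count) {
        uint64_t first = index - index % SEGMENT_ENTRIES;
        uint64_t count = t->total - first;
        if (count > SEGMENT_ENTRIES)
            count = SEGMENT_ENTRIES;
        t->load(t->ctx, first, count, t->seg);
        t->seg_first = first;
        t->seg_count = count;
        t->loads++;
    }
    return t->seg[index - t->seg_first];
}

/* Example backing store: an in-memory array standing in for the file. */
void load_from_array(void *ctx, uint64_t first, uint64_t count, uint64_t *dst)
{
    memcpy(dst, (const uint64_t *)ctx + first, (size_t)count * sizeof *dst);
}
```

Memory use here depends on `SEGMENT_ENTRIES` rather than on the number of strips or tiles in the file, which is the property the segmented design is after.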
The majority of the API to the TIFF object has remained unchanged. The main change is that the toff_t data type used by some API functions for file offsets was a 32-bit integer, and it has been changed to a 64-bit integer. The API functions which use file offsets are infrequently used so many applications will not require any changes. Applications which use the API functions involving file offsets will require recompilation and may require minor changes to handle 64-bit values for file offsets.
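The "minor changes" typically involve places where an application stored or printed an offset assuming 32 bits. The sketch below shows two such cases in plain C. It does not include any libtiff header: `file_offset_t` is a hypothetical stand-in for the library's 64-bit `toff_t`, and both helper names are invented for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the library's 64-bit toff_t. */
typedef uint64_t file_offset_t;

/* Narrow an offset to 32 bits only when that is lossless, for example
 * when filling a legacy structure. Returns 0 on success, -1 if the
 * offset does not fit. */
int offset_to_u32(file_offset_t off, uint32_t *out)
{
    if (off > UINT32_MAX)
        return -1;
    *out = (uint32_t)off;
    return 0;
}

/* Format an offset with a 64-bit conversion specifier. A "%u" or "%lu"
 * left over from 32-bit code would be wrong on some platforms. */
int format_offset(char *buf, size_t len, file_offset_t off)
{
    return snprintf(buf, len, "%" PRIu64, off);
}
```

Code that only passes offsets back to the library unchanged generally needs a recompile and nothing more; explicit checks like `offset_to_u32` matter where an offset leaves the library's types.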
It is possible for a program to use the TIFFSetField and TIFFGetField API functions to access subdirectory offsets. The TIFFTAG_SUBIFD tag formerly used 32-bit values, and now uses 64-bit values for subdirectory offsets. Applications which use TIFFTAG_SUBIFD will require recompilation and may require minor changes to handle 64-bit offsets.
The TIFFOpen function supports two new *optional* modes:
A new TIFFIsBigTIFF function has been added which returns whether the current file is a BigTIFF.
The source code has been modified and tested on Win32 (MS and Intel compiler), on Linux (RedHat), and on Mac OS X. Other platform-specific code has not been modified or tested.
The tiffinfo tool which is provided with libtiff has been updated to display file information for both standard TIFF and BigTIFF files. The other tools which are provided with libtiff have not been tested or updated.
Date | Version | Description |
---|---|---|
4/01/2007 | 4.0 | Initial support for BigTIFF, based on official 3.8.2 release. |
4/24/2007 | 4.0.1 | Updates to error handling; return GetLastError (Windows) or errno (Linux) in I/O error messages. |
3/21/2008 | 4.0.2 | Unix compatibility updates; support Windows 64 under MS Visual Studio 2005. |
12/18/2011 | 4.1 | Reduce memory footprint by implementing segmented access to block tables; 64-bit support; compatibility with MS Visual Studio 2010. |