Welcome to the BigTIFF version of the libtiff library


Introduction

The libtiff library is an open-source cross-platform library which enables applications to read and write images stored in TIFF files.  (TIFF = Tagged Image File Format.)  The TIFF file format is a widely accepted standard, supported by many applications on a wide range of platforms due to its capabilities and flexibility.  Please see the libtiff home page for more information about libtiff and to download source code, executable libraries, and documentation.

Until recently TIFF files were limited in size to 4GB.  A new version of libtiff has been created which supports BigTIFF files - TIFF files which are larger than 4GB.  The new version is backward-compatible with previous versions and in many cases applications will not have to change at all in order to read or write BigTIFF files.  In other cases the modifications required will be minor.

Please note this is not an official release of libtiff.  This version has been submitted for a future standard update.  It was named version 4.0 since the most recent official version is 3.8.2 (on which these changes were based).  A subsequent update to the library to reduce the memory footprint for processing large TIFF files has been named version 4.1. The actual version when officially released may be different.



Overview

There are more technical details below, but at the highest level the BigTIFF changes made to libtiff were quite simple.  The TIFF file format internally uses 32-bit byte offsets.  A 32-bit offset can address at most 2^32 bytes = 4GB, making that the upper limit of the file size which could be supported by the design.  The BigTIFF modifications to libtiff consisted primarily of changing all internal byte offsets to 64 bits.  A key goal was to maintain backward compatibility with existing applications and files to the largest extent possible.

BigTIFF files have a ".tif" or ".tiff" file extension just like ordinary TIFF files.  A new version in the file header prevents programs which have not linked with the BigTIFF version of libtiff from processing BigTIFF files.  The library has been modified so that programs writing TIFF files do not need to know or care when they reach a 4GB data size.  The file format will smoothly change from a standard TIFF file (compatible with all existing programs) to a BigTIFF file (compatible with all programs linked with a new version of the libtiff library).  Similarly, programs reading TIFF files do not need to know or care whether a file is a standard TIFF file or a BigTIFF file.

The basic BigTIFF design was first proposed in 2004 and refined in this discussion on the Aware Systems mailing list.  Contributors to the design discussion included Lynn Quam, Frank Warmerdam, Chris Cox, Rob Tillaart, Dan Smith, Bob Freisenhahn, Andrey Kiselev, Phillip Crews, and Gerben Vos.  We thank all those who came before us for creating libtiff and designing the BigTIFF enhancements.

These changes were made by Ole Eichhorn (ole.eichhorn@gmail.com) while at Aperio (www.aperio.com) and are donated to the public domain, in gratitude to Sam Leffler, Silicon Graphics, Joris Van Damme, Aware Systems, Frank Warmerdam, Andrey Kiselev, Mike Welles, and all who have worked on libtiff over the years to provide such a great tool.  These changes are published on an "as is" basis and neither Ole Eichhorn nor Aperio makes any warranty as to their fitness for any intended use.



Sample BigTIFF images

A number of BigTIFF sample images have been made by stitching together copies of a digitized microscope slide.  These images may be viewed in your web browser by clicking on the thumbnails below:


Original image
73,042 x 62,633 pixels
Image data = 4.57Gp
File size = 575MB
(more details)


25 x original
365,210 x 313,165 pixels
Image data = 114.4Gp
File size = 14GB
(more details)


100 x original
730,420 x 626,330 pixels
Image data = 457.5Gp
File size = 56GB
(more details)


225 x original
1,095,630 x 939,495 pixels
Image data = 1.0Tp
File size = 126GB
(more details)

Each of these files is a pyramid TIFF file, that is, there are multiple images in each file stored at different resolutions to facilitate rapid zooming.  The images have a tiled organization to facilitate rapid panning.  Image data have been compressed with JPEG at quality 30 for an average compression ratio of 20:1.  Please click on the "more details" links to view the pyramid dimensions and other image parameters.



About Aperio

Aperio is digitizing pathology.  We provide systems and services for digital pathology, a digital environment for the management and interpretation of pathology information that originates with the digitization of a glass slide.

Aperio's ScanScope slide scanning systems and Spectrum digital pathology information management software improve the efficiency and quality of pathology services for pathologists and other professionals.  ScanScope scanners create a digital image of an entire microscope slide at giga-pixel resolution in minutes, with inherently superior image quality.  The Spectrum software provides a consolidated view of relevant case information - anywhere, anytime - and with clinical and workflow tools to improve the quality and efficiency of pathology services.

Scanned microscope slide images are very large - with dimensions that routinely exceed 100,000 x 100,000 pixels.  The TIFF standard is perfect for storing digital slides - it is an open standard supported by a large number of applications on a wide variety of platforms.  Lossy compression technologies such as JPEG2000 and JPEG are needed to keep file sizes manageable, but even so the 4GB file size limit had been a problem.  With BigTIFF support it is now possible to store very large digital slides as TIFF files.  Aperio is a strong believer in open standards and we hope that with these enhancements TIFF will continue to be the standard for storing and managing very large images such as digital slides.



Technical Details


FILE FORMAT

The library retains the capability of writing standard TIFF files, compatible with previous versions of the library and other existing software.  Standard TIFF files contain 32-bit offsets to directories and image data.  This limits the total size of a standard TIFF file to 4GB (2^32).

The library now has the capability of writing BigTIFF files, which contain 64-bit offsets to directories and image data.  For a BigTIFF file there is no practical limit to the size of a file.  The BigTIFF file format is described in detail here: http://www.awaresystems.be/imaging/tiff/bigtiff.html.

There are several places where the format of a BigTIFF file differs from a standard TIFF file:


FILE PROCESSING

When a file is opened for reading the TIFF version in the file header is used to determine whether the file is a standard TIFF (uses 32-bit offsets) or BigTIFF (uses 64-bit offsets).  In either case 64-bit offsets are used in internal data structures.
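As a rough illustration of the header check described above, the sketch below classifies a file from its first four bytes. It is not code from the library: the names tiff_kind and classify_header are hypothetical, and the version values (42 for standard TIFF, 43 for BigTIFF) are taken from the published BigTIFF design rather than from this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch (not libtiff names). */
enum tiff_kind { KIND_NOT_TIFF = 0, KIND_CLASSIC = 1, KIND_BIGTIFF = 2 };

/* Classify a file from its first bytes.
 * Bytes 0-1: "II" (little-endian) or "MM" (big-endian).
 * Bytes 2-3: version number in that byte order.
 * Assumption: 42 marks standard TIFF and 43 marks BigTIFF. */
enum tiff_kind classify_header(const unsigned char *buf, size_t len)
{
    uint16_t version;

    assert(buf != NULL);
    if (len < 4)
        return KIND_NOT_TIFF;

    if (buf[0] == 'I' && buf[1] == 'I')
        version = (uint16_t)(buf[2] | (buf[3] << 8));
    else if (buf[0] == 'M' && buf[1] == 'M')
        version = (uint16_t)((buf[2] << 8) | buf[3]);
    else
        return KIND_NOT_TIFF;

    if (version == 42)
        return KIND_CLASSIC;   /* 32-bit offsets on disk */
    if (version == 43)
        return KIND_BIGTIFF;   /* 64-bit offsets on disk */
    return KIND_NOT_TIFF;
}
```

A reader built this way can choose the on-disk offset width once, at open time, and use 64-bit values everywhere else.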

When a file is written it is initially unknown whether it will have to be a BigTIFF (if the total size is less than 2^32, there is no reason to make it a BigTIFF file).  The library begins by writing a standard TIFF file with 32-bit offsets.  It leaves 8 extra bytes of space after the file header in case the file will need to become a BigTIFF file, but otherwise the processing is identical to before.  Internally the library will use 64-bit offsets but externally the file will contain 32-bit offsets.
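One plausible reading of the 8 reserved bytes is that they make up the difference between the two header sizes. The sketch below works through that arithmetic. The field layouts are assumptions taken from the published BigTIFF design, and the names are hypothetical, not library identifiers.

```c
#include <assert.h>
#include <stdint.h>

enum {
    /* byte order (2) + version (2) + 32-bit first-directory offset (4) */
    CLASSIC_HEADER_SIZE = 2 + 2 + 4,
    /* byte order (2) + version (2) + offset size (2) + reserved (2)
     * + 64-bit first-directory offset (8) */
    BIGTIFF_HEADER_SIZE = 2 + 2 + 2 + 2 + 8
};

/* First byte at which data may be placed.  When room is reserved for a
 * later in-place upgrade of the header, data starts after the larger
 * BigTIFF header even though a standard header is written first. */
uint64_t first_data_offset(int reserve_for_bigtiff)
{
    assert(reserve_for_bigtiff == 0 || reserve_for_bigtiff == 1);
    return reserve_for_bigtiff ? BIGTIFF_HEADER_SIZE : CLASSIC_HEADER_SIZE;
}
```

Under these assumptions the gap is exactly 8 bytes, so the header can be rewritten in BigTIFF form later without moving any image data.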

Image data are written to the file as before, with no change.  All internal data offsets are 64-bit and all I/O routines have been modified to use 64-bit file offsets; files larger than 2^32 bytes are supported seamlessly.

When a directory is written, if the file is not already in BigTIFF format, note is made whether the directory will be located beyond 2^32 from the start of the file.  If so, the file is converted into a BigTIFF file.
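The decision described above can be sketched as a single comparison. This is an illustration only: directory_needs_bigtiff and CLASSIC_OFFSET_LIMIT are hypothetical names, and the library's exact test (for example whether it also accounts for the size of the directory itself) is not shown on this page.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32: the first offset that no longer fits in 32 bits. */
#define CLASSIC_OFFSET_LIMIT ((uint64_t)1 << 32)

/* Returns 1 if a directory starting at dir_offset cannot be referenced
 * with a 32-bit offset and the file is still in standard TIFF form. */
int directory_needs_bigtiff(uint64_t dir_offset, int already_bigtiff)
{
    assert(already_bigtiff == 0 || already_bigtiff == 1);
    if (already_bigtiff)
        return 0;               /* nothing left to convert */
    /* The largest value a 32-bit offset can hold is 2^32 - 1. */
    return dir_offset >= CLASSIC_OFFSET_LIMIT;
}
```

Because image data is normally written before the directory that describes it, the check fires only once the data written so far has pushed the directory past the 32-bit range.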

The following things are done to convert from standard TIFF to BigTIFF at this point:

After this point, all directories are written with the BigTIFF format, with 64-bit data offsets, 64-bit strip/tile offsets, etc.  Subdirectory offsets are written as 64-bit values, and the pointers linking each directory to the next use 64-bit offsets.
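To give a sense of how the wider fields change a directory on disk, here is a small sketch that computes the size of one directory in each format. The field widths are assumptions taken from the published BigTIFF design (entry count 2 or 8 bytes, entry 12 or 20 bytes, next-directory pointer 4 or 8 bytes), and directory_size is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one directory with the given number of entries. */
uint64_t directory_size(uint64_t entries, int bigtiff)
{
    assert(bigtiff == 0 || bigtiff == 1);

    const uint64_t count_len = bigtiff ? 8 : 2;    /* entry count field */
    const uint64_t entry_len = bigtiff ? 20 : 12;  /* tag, type, count, value/offset */
    const uint64_t next_len  = bigtiff ? 8 : 4;    /* link to the next directory */

    return count_len + entries * entry_len + next_len;
}
```

For a directory with 10 entries this gives 126 bytes in standard form and 216 bytes in BigTIFF form, which is why directories have to be rewritten rather than patched when a file is converted.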


INTERNAL OBJECT CHANGES

The general technique was to leave all data structures and types alone with the exception of offsets within the file.  All file offsets are stored internally and processed in the TIFF objects as toff_t (unsigned 64-bit) values.

The library was originally designed to keep tables with file block offsets and lengths in memory.  A subsequent update to the BigTIFF version of the library modified this logic to manage block tables in segments.  Now only fixed portions of the block tables are kept in memory, paged in and out from the TIFF file on disk; this enables files of any size whatsoever to be processed.
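The segmented approach can be illustrated with a small window over a larger table. The sketch below is not the library's implementation: seg_table, SEG_ENTRIES and the seg_* functions are hypothetical, and an in-memory array stands in for the block table stored in the file on disk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEG_ENTRIES 4u   /* hypothetical window size, kept tiny for the example */

typedef struct {
    uint64_t *backing;    /* stands in for the full table stored in the file */
    uint32_t  count;      /* total number of entries in the table */
    uint32_t  seg_start;  /* index of the first entry held in the window */
    int       loaded;     /* nonzero once the window holds a segment */
    int       dirty;      /* nonzero if the window has unsaved changes */
    unsigned  loads;      /* number of segments paged in so far */
    uint64_t  window[SEG_ENTRIES];
} seg_table;

void seg_init(seg_table *t, uint64_t *backing, uint32_t count)
{
    memset(t, 0, sizeof *t);
    t->backing = backing;
    t->count = count;
}

/* Number of valid entries in the current window (the last segment may be short). */
static uint32_t seg_len(const seg_table *t)
{
    uint32_t remaining = t->count - t->seg_start;
    return remaining < SEG_ENTRIES ? remaining : SEG_ENTRIES;
}

/* Write the window back to the backing store if it was modified. */
void seg_flush(seg_table *t)
{
    if (t->loaded && t->dirty) {
        memcpy(t->backing + t->seg_start, t->window, seg_len(t) * sizeof(uint64_t));
        t->dirty = 0;
    }
}

/* Make sure the segment containing index is resident. */
static void seg_page_in(seg_table *t, uint32_t index)
{
    uint32_t start = index - index % SEG_ENTRIES;
    if (t->loaded && start == t->seg_start)
        return;
    seg_flush(t);                       /* page out the old segment first */
    t->seg_start = start;
    memcpy(t->window, t->backing + start, seg_len(t) * sizeof(uint64_t));
    t->loaded = 1;
    t->loads++;
}

uint64_t seg_get(seg_table *t, uint32_t index)
{
    assert(index < t->count);
    seg_page_in(t, index);
    return t->window[index - t->seg_start];
}

void seg_set(seg_table *t, uint32_t index, uint64_t value)
{
    assert(index < t->count);
    seg_page_in(t, index);
    t->window[index - t->seg_start] = value;
    t->dirty = 1;
}
```

Memory use is bounded by the window size regardless of how many strips or tiles the image has. The cost is extra I/O when accesses jump between distant segments.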


API CHANGES

The majority of the API to the TIFF object has remained unchanged.  The main change is that the toff_t data type used by some API functions for file offsets was a 32-bit integer, and it has been changed to a 64-bit integer.  The API functions which use file offsets are infrequently used so many applications will not require any changes.  Applications which use the API functions involving file offsets will require recompilation and may require minor changes to handle 64-bit values for file offsets.
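The "minor changes" usually amount to not assuming that an offset fits in 32 bits. The sketch below shows two common spots, formatting and narrowing. It uses a stand-in typedef, offset64, in place of the widened toff_t so that it compiles without the library; format_offset and fits_in_32_bits are hypothetical helpers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t offset64;   /* stand-in for the 64-bit toff_t */

/* Format an offset portably.  A "%lu" or "%u" conversion would be wrong
 * on platforms where long or int is 32 bits wide. */
int format_offset(char *out, size_t out_size, offset64 off)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "%" PRIu64, off);
}

/* Returns 1 if off can be stored in a 32-bit variable without loss.
 * Code that still keeps offsets in 32-bit fields should check this
 * instead of silently truncating. */
int fits_in_32_bits(offset64 off)
{
    return off <= UINT32_MAX;
}
```

Applications that only pass offsets through, without storing or printing them, typically need nothing beyond recompilation.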

It is possible for a program to use the TIFFSetField and TIFFGetField API functions to access subdirectory offsets.  The TIFFTAG_SUBIFD tag formerly used 32-bit values, and now uses 64-bit values for subdirectory offsets.  Applications which use TIFFTAG_SUBIFD will require recompilation and may require minor changes to handle 64-bit offsets.

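The TIFFOpen function supports two new *optional* mode characters: '4', which explicitly prevents reading and writing BigTIFF files, and '8', which explicitly forces reading and writing BigTIFF files.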

A new TIFFIsBigTIFF function has been added which returns whether the current file is a BigTIFF.


SCOPE OF CHANGES

The source code has been modified and tested on Win32 (MS and Intel compiler), on Linux (RedHat), and on Mac OS X.  Other platform-specific code has not been modified or tested.

The tiffinfo tool which is provided with libtiff has been updated to display file information for both standard TIFF and BigTIFF files.  The other tools which are provided with libtiff have not been tested or updated.


VERSION HISTORY

Date        Version  Description
4/01/2007   4.0      Initial support for BigTIFF, based on official 3.8.2 release.
4/24/2007   4.0.1    Updates to error handling; return GetLastError (Windows) or errno (Linux) in I/O error messages.
3/21/2008   4.0.2    Unix compatibility updates; support Windows 64 under MS Visual Studio 2005.
12/18/2011  4.1      Reduce memory footprint by implementing segmented access to block tables; 64-bit support; compatibility with MS Visual Studio 2010.


SOURCE CODE CHANGES

HEADERS

---- 4.0 ----

tiff.h
 - define int64 and uint64 data type
 - redefine TIFFheader as union of standard and BigTIFF formats
 - redefine TIFFDirEntry as union of standard and BigTIFF forms
 - define TIFF_LONG8, TIFF_SLONG8, TIFF_IFD8 64-bit data types

tiffconf.h
 - HAVE_INT64 macro (use "__int64" if set, else "long long")

tiffio.h
 - redefine toff_t data type
 - define new API functions: TIFFIsBigTIFF, TIFFSwabLongLong, TIFFSwabArrayOfLongLong

tiffiop.h
 - eliminate typeshift and typemask
 - define new tif_flags:
     TIFF_ISBIGTIFF   this is a BigTIFF file
     TIFF_NOBIGTIFF   do not process BigTIFF files (error instead)
 - define new macros:
     isBigTIFF   whether this is a BigTIFF file
     noBigTIFF   whether BigTIFF files are prevented
     isBigOff    whether offset is > 2^32
 - define macros to access directory offset in header:
     TIFFGetHdrDirOff, TIFFSetHdrDirOff
 - define macros to access directory counts in IFDs:
     TIFFGetDirCnt, TIFFSetDirCnt, TIFFSwabDirCnt, TIFFDirCntLen
 - define macros to access directory offsets for IFDs:
     TIFFGetDirOff, TIFFSetDirOff, TIFFSwabDirOff, TIFFDirOffLen
 - define macros to access directory entry data:
     TDIREntryLen, TDIREntryNext,
     TDIRGetEntryCount, TDIRSetEntryCount, TDIRSwabEntryCount,
     TDIRAddrEntryOff, TDIRGetEntryOff, TDIRSetEntryOff, TDIRSwabEntryOff, TDIREntryOffLen
 - add prototype for _TIFFsetLong8Array

tiffvers.h
 - update version to 4.0.0 (20070401)

tif_dir.h
 - remove TIFFInsertData and TIFFExtractData macros

---- 4.1 ----

tiffiop.h
 - add prototypes for _TIFFGetOffset, _TIFFGetByteCount, _TIFFSetOffset, _TIFFFlushOffsets(TIFF*), _TIFFSetByteCount, _TIFFFlushByteCounts

tiffvers.h
 - update version to 4.1.0 (20111201)

tif_dir.h
 - define STRIPBUFMAX macro for size of offsets and byte counts arrays
 - define new td_strip... data in TIFFDirectory for offsets and byte counts arrays


CODE

---- 4.0 ----

tif_dir.c
 - implement _TIFFsetLong8Array
 - add 32/64-bit logic for TIFFTAG_SUBIFD
 - update TIFFAdvanceDirectory, TIFFNumberOfDirectories, TIFFSetDirectory, and TIFFUnlinkDirectory to use macros to access directory chain
 - implement TIFFSetSubDirectoryB and TIFFCurrentDirOffsetB functions

tif_dirinfo.c
 - add TIFF_LONG8 variation for TIFFTAG_STRIPOFFSETS
 - add TIFF_LONG8 variation for TIFFTAG_STRIPBYTECOUNTS
 - add TIFF_LONG8 and TIFF_IFD8 variations for TIFFTAG_SUBIFD
 - update TIFFDataWidth and _TIFFDataSize to support 64-bit types

tif_dirread.c
 - update TIFFReadDirectory to use macros to access header fields and directory entries; also support more than two types for tag
 - add 32-bit/64-bit logic for TIFFTAG_SUBIFD
 - replace TIFFExtractData with TIFFFetch<x>Array calls
 - update TIFFReadCustomDirectory to use macros to access header fields and directory entries
 - replace TIFFFetchStripThing with TIFFFetchByteCounts and TIFFetchOffsets
 - implement TIFFFetchLong8Array
 - support 64-bit data types in TIFFFetchData
 - update TIFFFetch<x>Array functions to support 64-bit data in BigTIFF entries

tif_dirwrite.c
 - update _TIFFWriteDirectory to use macros to access header fields and directory entries; also support 32-bit or 64-bit strip/tile offsets
 - replace TIFFInsertData with TIFFWrite<x>Array calls
 - implement TIFFWriteLong8Array
 - support 64-bit data types in TIFFWriteData
 - update TIFFWrite<x>Array functions to support 64-bit data in BigTIFF entries
 - update TIFFRewriteDirectory to use macros to access directory chain
 - update TIFFLinkDirectory to use macros to access directory chain
 - implement local TIFFMakeBigTIFF function (convert directories, etc)

tif_ojpeg.c
 - use macro for directory entry length (ugly code in here!)

tif_open.c
 - eliminate use of typemask and typeshift tables
 - update TIFFClientOpen to support BigTIFF header
 - support '4' mode to explicitly prevent reading and writing BigTIFF files
 - support '8' mode to explicitly force reading and writing BigTIFF files
 - read/write standard or BigTIFF header as required
 - implement TIFFIsBigTIFF

tif_print.c
 - replace cascading if with switch (!)
 - support 64-bit data types
 - support 64-bit directory and file offsets
 - format strip/tile offset array as 64-bit toff_t
 - format SUBIFD array as 64-bit toff_t

tif_swab.c
 - add TIFFSwabLong8 routine
 - add TIFFSwabArrayOfLong8 routine

tif_unix.c
 - modify Unix API calls to use 64-bit offsets everywhere

tif_win32.c
 - modify Windows API calls to use 64-bit offsets everywhere

tif_write.c
 - modify TIFFSetupStrips and TIFFGrowStrips for 64-bit offsets

---- 4.1 ----

tif_dir.c
 - initialize new td_strip... data in TIFFDirectory from strip/tile tags

tif_dirread.c
 - setup td_strip... data during tag processing
 - only check for sorted offsets and byte counts if not read only
 - use _TIFFGetOffset and _TIFFGetByteCount instead of accessing arrays
 - change EstimateStripByteCounts for new td_strip... data
 - implement _TIFFGetOffset and _TIFFSetOffset

tif_dirwrite.c
 - call _TIFFFlushOffsets and _TIFFFlushByteCounts prior to writing directory
 - set strip/tile tags from new td_strip... data
 - implement _TIFFSetOffset, _TIFFFlushOffsets, _TIFFSetByteCount, _TIFFFlushByteCounts

tif_ojpeg.c
 - (deprecated) use _TIFFGetOffset, _TIFFGetByteCount

tif_print.c
 - use _TIFFGetOffset, _TIFFGetByteCount

tif_read.c
 - use _TIFFGetOffset, _TIFFGetByteCount

tif_strip.c
 - use _TIFFGetOffset, _TIFFGetByteCount

tif_write.c
 - use _TIFFGetOffset, _TIFFGetByteCount, _TIFFSetOffset, _TIFFSetByteCount
 - modify TIFFSetupStrips to defer initialization to _TIFFGetOffset and _TIFFGetByteCount
 - check offsets and byte counts array locations when growing strip/tile