Leptonica 1.68
C Image Processing Library

readfile.c File Reference

Pix from memory and disk; detecting format.

#include <string.h>
#include "allheaders.h"

Enumerations

enum  { READ_24_BIT_COLOR = 0, CONVERT_TO_PALETTE = 1, READ_GRAY = 2 }

Functions

PIXA * pixaReadFiles (const char *dirname, const char *substr)
PIXA * pixaReadFilesSA (SARRAY *sa)
PIX * pixRead (const char *filename)
PIX * pixReadWithHint (const char *filename, l_int32 hint)
PIX * pixReadIndexed (SARRAY *sa, l_int32 index)
PIX * pixReadStream (FILE *fp, l_int32 hint)
l_int32 pixReadHeader (const char *filename, l_int32 *pformat, l_int32 *pw, l_int32 *ph, l_int32 *pbps, l_int32 *pspp, l_int32 *piscmap)
l_int32 findFileFormat (const char *filename, l_int32 *pformat)
l_int32 findFileFormatStream (FILE *fp, l_int32 *pformat)
l_int32 findFileFormatBuffer (const l_uint8 *buf, l_int32 *pformat)
l_int32 fileFormatIsTiff (FILE *fp)
PIX * pixReadMem (const l_uint8 *data, size_t size)
l_int32 pixReadHeaderMem (const l_uint8 *data, size_t size, l_int32 *pformat, l_int32 *pw, l_int32 *ph, l_int32 *pbps, l_int32 *pspp, l_int32 *piscmap)
l_int32 ioFormatTest (const char *filename)

Variables

static const char * FILE_BMP = "/tmp/junkout.bmp"
static const char * FILE_PNG = "/tmp/junkout.png"
static const char * FILE_PNM = "/tmp/junkout.pnm"
static const char * FILE_G3 = "/tmp/junkout_g3.tif"
static const char * FILE_G4 = "/tmp/junkout_g4.tif"
static const char * FILE_RLE = "/tmp/junkout_rle.tif"
static const char * FILE_PB = "/tmp/junkout_packbits.tif"
static const char * FILE_LZW = "/tmp/junkout_lzw.tif"
static const char * FILE_ZIP = "/tmp/junkout_zip.tif"
static const char * FILE_TIFF = "/tmp/junkout.tif"
static const char * FILE_JPG = "/tmp/junkout.jpg"
static const char JP2K_CODESTREAM [4] = { 0xff, 0x4f, 0xff, 0x51 }
static const char JP2K_IMAGE_DATA [12]

Detailed Description

Pix from memory and disk; detecting format.

readfile.c: reads image on file into memory

    Top-level functions for reading images from file
         PIXA      *pixaReadFiles()
         PIXA      *pixaReadFilesSA()
         PIX       *pixRead()
         PIX       *pixReadWithHint()
         PIX       *pixReadIndexed()
         PIX       *pixReadStream()

    Read header information from file
         l_int32    pixReadHeader()

    Format finders
         l_int32    findFileFormat()
         l_int32    findFileFormatStream()
         l_int32    findFileFormatBuffer()
         l_int32    fileFormatIsTiff()

    Read from memory
         PIX       *pixReadMem()
         l_int32    pixReadHeaderMem()

    Test function for I/O with different formats 
         l_int32    ioFormatTest()

Definition in file readfile.c.


Enumeration Type Documentation

anonymous enum
Enumerator:
READ_24_BIT_COLOR 
CONVERT_TO_PALETTE 
READ_GRAY 

Definition at line 49 of file readfile.c.


Function Documentation

PIXA* pixaReadFiles ( const char *  dirname,
const char *  substr 
)

pixaReadFiles()

Input: dirname
       substr (<optional> substring filter on filenames; can be null)
Return: pixa, or null on error

Notes:
    (1) dirname is the full path for the directory.
    (2) substr is the part of the file name (excluding the directory) that is to be matched. All matching filenames are read into the Pixa. If substr is NULL, all filenames are read into the Pixa.
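The filter rule in the notes above can be sketched in plain C. This is a minimal illustration only, not the Leptonica implementation: name_matches() is a hypothetical helper, and it assumes the match is a simple substring search on the file name without its directory part.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Leptonica: returns 1 if the file
 * name (without its directory part) passes the substring filter.
 * A NULL filter accepts every name; a NULL name is rejected. */
int name_matches(const char *name, const char *substr)
{
    if (name == NULL)
        return 0;
    if (substr == NULL)
        return 1;
    return strstr(name, substr) != NULL;
}
```

With this rule, a filter of ".png" would keep "page-001.png" and drop "page-001.tif", while a NULL filter keeps both.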

Definition at line 94 of file readfile.c.

References ERROR_PTR, getSortedPathnamesInDirectory(), NULL, pixaReadFilesSA(), PROCNAME, and sarrayDestroy().

Referenced by DoPageSegmentation(), and main().

PIXA* pixaReadFilesSA ( SARRAY *  sa)

pixaReadFilesSA()

Input: sarray (full pathnames for all files)
Return: pixa, or null on error

Definition at line 121 of file readfile.c.

References ERROR_PTR, L_INSERT, L_NOCOPY, L_WARNING_STRING, NULL, pixaAddPix(), pixaCreate(), pixRead(), PROCNAME, sarrayGetCount(), and sarrayGetString().

Referenced by main(), and pixaReadFiles().

PIX* pixReadWithHint ( const char *  filename,
l_int32  hint 
)

pixReadWithHint()

Input: filename (with full pathname or in local directory)
       hint (bitwise OR of L_HINT_* values for jpeg; use 0 for no hint)
Return: pix if OK; null on error

Notes: (1) The hint is not binding, but may be used to optimize jpeg decoding. Use 0 for no hinting.

Definition at line 197 of file readfile.c.

References ERROR_PTR, fopenReadStream(), NULL, pixReadStream(), and PROCNAME.

PIX* pixReadIndexed ( SARRAY *  sa,
l_int32  index 
)

pixReadIndexed()

Input: sarray (of full pathnames)
       index (into pathname array)
Return: pix if OK; null if not found

Notes:
    (1) This function is useful for selecting image files from a directory, where the integer is embedded into the file name.
    (2) This is typically done by generating the sarray using getNumberedPathnamesInDirectory(), so that the pathname would have the number in it. The size of the sarray should be the largest number (plus 1) appearing in the file names, respecting the constraints in the call to getNumberedPathnamesInDirectory().
    (3) Consequently, for some indices into the sarray, there may be no pathnames in the directory containing that number. By convention, we place empty C strings ("") in those locations in the sarray, and it is not an error if such a string is encountered and no pix is returned. Therefore, the caller must verify that a pix is returned.
    (4) See convertSegmentedPagesToPS() in src/psio1.c for an example of usage.
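The empty-string convention described in the notes can be illustrated with a small standalone sketch. It uses a plain C array of strings rather than an SARRAY, and path_at_index() is a hypothetical helper, not a Leptonica function.

```c
#include <stddef.h>

/* Hypothetical helper, not Leptonica code: look up the pathname stored
 * at `index` in an array that uses "" for numbers with no matching file.
 * Returns NULL when the index is out of range or the slot is empty. */
const char *path_at_index(const char *const *paths, int n, int index)
{
    if (paths == NULL || index < 0 || index >= n)
        return NULL;
    if (paths[index] == NULL || paths[index][0] == '\0')
        return NULL;   /* empty slot: not an error, just nothing to read */
    return paths[index];
}
```

A caller iterating over page numbers would check the result for NULL on every index and skip the numbers that have no file, which is the same check the notes require on the returned pix.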

Definition at line 246 of file readfile.c.

References ERROR_PTR, L_ERROR_STRING, L_NOCOPY, NULL, pixRead(), PROCNAME, sarrayGetCount(), and sarrayGetString().

Referenced by convertSegmentedPagesToPS().

PIX* pixReadStream ( FILE *  fp,
l_int32  hint 
)

pixReadStream()

Input: fp (file stream)
       hint (bitwise OR of L_HINT_* values for jpeg; use 0 for no hint)
Return: pix if OK; null on error

Notes: (1) The hint only applies to jpeg.

Definition at line 285 of file readfile.c.

References ERROR_PTR, findFileFormatStream(), IFF_BMP, IFF_GIF, IFF_JFIF_JPEG, IFF_JP2, IFF_PNG, IFF_PNM, IFF_SPIX, IFF_TIFF, IFF_TIFF_G3, IFF_TIFF_G4, IFF_TIFF_LZW, IFF_TIFF_PACKBITS, IFF_TIFF_RLE, IFF_TIFF_ZIP, IFF_UNKNOWN, IFF_WEBP, NULL, pixReadStreamBmp(), pixReadStreamGif(), pixReadStreamJpeg(), pixReadStreamPng(), pixReadStreamPnm(), pixReadStreamSpix(), pixReadStreamTiff(), pixReadStreamWebP(), pixSetInputFormat(), PROCNAME, and READ_24_BIT_COLOR.

Referenced by pixRead(), and pixReadWithHint().

l_int32 pixReadHeader ( const char *  filename,
l_int32 *  pformat,
l_int32 *  pw,
l_int32 *  ph,
l_int32 *  pbps,
l_int32 *  pspp,
l_int32 *  piscmap 
)

pixReadHeader()

Input: filename (with full pathname or in local directory)
       &format (<optional return> file format)
       &w, &h (<optional returns> width and height)
       &bps (<optional return> bits/sample)
       &spp (<optional return> samples/pixel (1, 3 or 4))
       &iscmap (<optional return> 1 if cmap exists; 0 otherwise)
Return: 0 if OK, 1 on error

Notes: (1) This reads the actual headers for jpeg, png, tiff and pnm. For bmp and gif, we cheat and read the entire file into a pix, from which we extract the "header" information.

Definition at line 384 of file readfile.c.

References ERROR_INT, findFileFormatStream(), fopenReadStream(), IFF_BMP, IFF_GIF, IFF_JFIF_JPEG, IFF_JP2, IFF_PNG, IFF_PNM, IFF_SPIX, IFF_TIFF, IFF_TIFF_G3, IFF_TIFF_G4, IFF_TIFF_LZW, IFF_TIFF_PACKBITS, IFF_TIFF_RLE, IFF_TIFF_ZIP, IFF_UNKNOWN, IFF_WEBP, L_ERROR_STRING, NULL, pixDestroy(), pixGetDimensions(), pixRead(), PROCNAME, readHeaderJpeg(), readHeaderPng(), readHeaderPnm(), readHeaderSpix(), readHeaderTiff(), and readHeaderWebP().

Referenced by get_header_data(), main(), sarrayConvertFilesFittedToPS(), and sarrayConvertFilesToPS().

l_int32 findFileFormat ( const char *  filename,
l_int32 *  pformat 
)

findFileFormat()

Input: filename
       &format (<return>)
Return: 0 if OK, 1 on error or if format is not recognized

Definition at line 513 of file readfile.c.

References ERROR_INT, findFileFormatStream(), fopenReadStream(), IFF_UNKNOWN, NULL, and PROCNAME.

Referenced by convertToPSEmbed(), extractJpegDataFromFile(), pixcompCreateFromFile(), pixReadRGBAPng(), writeImageCompressedToPSFile(), and writeMultipageTiffSA().

l_int32 findFileFormatStream ( FILE *  fp,
l_int32 *  pformat 
)

findFileFormatStream()

Input: fp (file stream)
       &format (<return>)
Return: 0 if OK, 1 on error or if format is not recognized

Notes: (1) Important: Side effect -- this resets fp to BOF.
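The side effect matters to callers that hand the same stream to a decoder afterwards. The sketch below shows the general peek-then-rewind pattern in standard C. It is only an illustration of the idea: peek_header() is a hypothetical helper, and the real function may inspect the stream differently.

```c
#include <stdio.h>

/* Hypothetical helper, not Leptonica code: copy up to `n` leading bytes
 * of the stream into `buf`, then rewind so the caller can decode from
 * byte 0. Returns the number of bytes actually read. */
size_t peek_header(FILE *fp, unsigned char *buf, size_t n)
{
    size_t nread;

    if (fp == NULL || buf == NULL)
        return 0;
    rewind(fp);                    /* always inspect from the beginning */
    nread = fread(buf, 1, n, fp);
    rewind(fp);                    /* side effect: position is BOF again */
    return nread;
}
```

Any read position the caller had before the call is lost, so a caller that needs it would have to save it with ftell() first.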

Definition at line 546 of file readfile.c.

References ERROR_INT, findFileFormatBuffer(), findTiffCompression(), fnbytesInFile(), IFF_TIFF, IFF_UNKNOWN, and PROCNAME.

Referenced by fileFormatIsTiff(), findFileFormat(), freadHeaderTiff(), pixReadHeader(), pixReadStream(), and testcomp().

l_int32 findFileFormatBuffer ( const l_uint8 *  buf,
l_int32 *  pformat 
)

findFileFormatBuffer()

Input: byte buffer (at least 12 bytes in size; we can't check)
       &format (<return>)
Return: 0 if OK, 1 on error or if format is not recognized

Notes:
    (1) This determines the file format from the first 12 bytes in the compressed data stream, which are stored in memory.
    (2) For tiff files, this returns IFF_TIFF. The specific tiff compression is then determined using findTiffCompression().
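Signature-based detection of this kind can be sketched as follows. This is a standalone approximation, not the Leptonica source: the enum and sketch_find_format() are hypothetical names, the JPEG 2000 byte sequences are the ones listed under Variables on this page, and the remaining signatures are the commonly published ones for each format.

```c
#include <string.h>

/* Hypothetical format codes for this sketch; Leptonica uses IFF_* values. */
enum sketch_format {
    FMT_UNKNOWN = 0, FMT_BMP, FMT_JPEG, FMT_PNG, FMT_TIFF,
    FMT_PNM, FMT_GIF, FMT_JP2, FMT_WEBP
};

/* Classify a buffer of at least 12 bytes by its leading signature.
 * The buffer size cannot be checked here; the caller must guarantee it. */
enum sketch_format sketch_find_format(const unsigned char *buf)
{
    static const unsigned char png_sig[8] =
        { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
    static const unsigned char jp2_codestream[4] =
        { 0xFF, 0x4F, 0xFF, 0x51 };
    static const unsigned char jp2_image_data[12] =
        { 0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20,
          0x0D, 0x0A, 0x87, 0x0A };

    if (buf == NULL)
        return FMT_UNKNOWN;
    if (buf[0] == 'B' && buf[1] == 'M')
        return FMT_BMP;
    /* TIFF: byte-order mark followed by the number 42 in that order */
    if ((buf[0] == 'I' && buf[1] == 'I' && buf[2] == 42 && buf[3] == 0) ||
        (buf[0] == 'M' && buf[1] == 'M' && buf[2] == 0 && buf[3] == 42))
        return FMT_TIFF;
    if (buf[0] == 'P' && buf[1] >= '1' && buf[1] <= '6')
        return FMT_PNM;
    if (buf[0] == 0xFF && buf[1] == 0xD8)
        return FMT_JPEG;
    if (memcmp(buf, png_sig, sizeof png_sig) == 0)
        return FMT_PNG;
    if (memcmp(buf, "GIF87a", 6) == 0 || memcmp(buf, "GIF89a", 6) == 0)
        return FMT_GIF;
    if (memcmp(buf, jp2_codestream, 4) == 0 ||
        memcmp(buf, jp2_image_data, 12) == 0)
        return FMT_JP2;
    if (memcmp(buf, "RIFF", 4) == 0 && memcmp(buf + 8, "WEBP", 4) == 0)
        return FMT_WEBP;
    return FMT_UNKNOWN;
}
```

As in note (2), the sketch reports only a generic TIFF code; distinguishing the compression scheme needs a look inside the TIFF directory, which is outside the first 12 bytes.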

Definition at line 595 of file readfile.c.

References BMP_ID, buf, convertOnBigEnd16(), ERROR_INT, IFF_BMP, IFF_GIF, IFF_JFIF_JPEG, IFF_JP2, IFF_PNG, IFF_PNM, IFF_SPIX, IFF_TIFF, IFF_UNKNOWN, IFF_WEBP, JP2K_CODESTREAM, JP2K_IMAGE_DATA, PROCNAME, TIFF_BIGEND_ID, and TIFF_LITTLEEND_ID.

Referenced by findFileFormatStream(), pixReadHeaderMem(), and pixReadMem().

l_int32 fileFormatIsTiff ( FILE *  fp)

fileFormatIsTiff()

Input: fp (file stream)
Return: 1 if file is tiff; 0 otherwise or on error

Definition at line 704 of file readfile.c.

References ERROR_INT, findFileFormatStream(), IFF_TIFF, IFF_TIFF_G3, IFF_TIFF_G4, IFF_TIFF_LZW, IFF_TIFF_PACKBITS, IFF_TIFF_RLE, IFF_TIFF_ZIP, and PROCNAME.

Referenced by convertTiffMultipageToPS(), extractG4DataFromFile(), main(), and pixaReadMultipageTiff().

PIX* pixReadMem ( const l_uint8 *  data,
size_t  size 
)

pixReadMem()

Input: data (const; encoded)
       datasize (size of data)
Return: pix, or null on error

Notes:
    (1) This is a variation of pixReadStream(), where the data is read from a memory buffer rather than a file.
    (2) On Windows, this will only read tiff formatted files from memory. For other formats, it requires fmemopen(3). Attempts to read those formats will fail at runtime.
    (3) findFileFormatBuffer() requires up to 8 bytes to decide on the format. That determines the constraint here.

Definition at line 744 of file readfile.c.

References ERROR_PTR, findFileFormatBuffer(), IFF_BMP, IFF_GIF, IFF_JFIF_JPEG, IFF_JP2, IFF_PNG, IFF_PNM, IFF_SPIX, IFF_TIFF, IFF_TIFF_G3, IFF_TIFF_G4, IFF_TIFF_LZW, IFF_TIFF_PACKBITS, IFF_TIFF_RLE, IFF_TIFF_ZIP, IFF_UNKNOWN, NULL, pixGetDepth(), pixReadMemBmp(), pixReadMemGif(), pixReadMemJpeg(), pixReadMemPng(), pixReadMemPnm(), pixReadMemSpix(), pixReadMemTiff(), pixSetInputFormat(), PROCNAME, and READ_24_BIT_COLOR.

Referenced by convertImageDataToPdf(), convertImageDataToPdfData(), pixCreateFromPixcomp(), test_mem_gif(), test_mem_png(), and test_writemem().

l_int32 pixReadHeaderMem ( const l_uint8 *  data,
size_t  size,
l_int32 *  pformat,
l_int32 *  pw,
l_int32 *  ph,
l_int32 *  pbps,
l_int32 *  pspp,
l_int32 *  piscmap 
)

pixReadHeaderMem()

Input: data (const; encoded)
       datasize (size of data)
       &format (<optional return> image format)
       &w, &h (<optional returns> width and height)
       &bps (<optional return> bits/sample)
       &spp (<optional return> samples/pixel (1, 3 or 4))
       &iscmap (<optional return> 1 if cmap exists; 0 otherwise)
Return: 0 if OK, 1 on error

Notes:
    (1) This reads the actual headers for jpeg, png, tiff and pnm. For bmp and gif, we cheat and read all the data into a pix, from which we extract the "header" information.
    (2) On Windows, this will only read tiff formatted files from memory. For other formats, it requires fmemopen(3). Attempts to read those formats will fail at runtime.
    (3) findFileFormatBuffer() requires up to 8 bytes to decide on the format. That determines the constraint here.
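Reading a header without decoding pixel data, with every output optional, might look like the sketch below for the png case. It is a hedged illustration and not the Leptonica parser: sketch_png_header_mem() and be32() are hypothetical names, and the offsets assume the standard PNG layout in which the IHDR chunk directly follows the 8-byte signature. Sample counts follow the PNG color type, so gray plus alpha is reported as 2 here.

```c
#include <stddef.h>
#include <string.h>

/* Read a big-endian 32-bit value from four bytes. */
static unsigned long be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8) | (unsigned long)p[3];
}

/* Hypothetical sketch, not Leptonica's parser: extract width, height,
 * bits/sample, samples/pixel and the colormap flag from a PNG IHDR chunk.
 * Every output pointer is optional and may be NULL.
 * Returns 0 if OK, 1 on error. */
int sketch_png_header_mem(const unsigned char *data, size_t size,
                          int *pw, int *ph, int *pbps, int *pspp,
                          int *piscmap)
{
    static const unsigned char sig[8] =
        { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
    int spp, ctype;

    /* Signature (8) + chunk length and type (8) + IHDR fields up to the
     * color type at offset 25, so at least 26 bytes are needed. */
    if (data == NULL || size < 26 || memcmp(data, sig, 8) != 0)
        return 1;
    if (memcmp(data + 12, "IHDR", 4) != 0)
        return 1;

    ctype = data[25];
    switch (ctype) {
    case 0: spp = 1; break;   /* grayscale */
    case 2: spp = 3; break;   /* RGB */
    case 3: spp = 1; break;   /* palette index; a colormap exists */
    case 4: spp = 2; break;   /* grayscale + alpha */
    case 6: spp = 4; break;   /* RGB + alpha */
    default: return 1;
    }

    if (pw) *pw = (int)be32(data + 16);
    if (ph) *ph = (int)be32(data + 20);
    if (pbps) *pbps = data[24];
    if (pspp) *pspp = spp;
    if (piscmap) *piscmap = (ctype == 3);
    return 0;
}
```

A caller interested only in the dimensions would pass NULL for the other three pointers.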

Definition at line 849 of file readfile.c.

References ERROR_INT, findFileFormatBuffer(), IFF_BMP, IFF_GIF, IFF_JFIF_JPEG, IFF_JP2, IFF_PNG, IFF_PNM, IFF_SPIX, IFF_TIFF, IFF_TIFF_G3, IFF_TIFF_G4, IFF_TIFF_LZW, IFF_TIFF_PACKBITS, IFF_TIFF_RLE, IFF_TIFF_ZIP, IFF_UNKNOWN, NULL, pixDestroy(), pixGetDimensions(), pixReadMemBmp(), pixReadMemGif(), PROCNAME, readHeaderMemJpeg(), readHeaderMemTiff(), sreadHeaderPng(), sreadHeaderPnm(), and sreadHeaderSpix().

Referenced by get_format_data(), get_header_data(), main(), and pixcompCreateFromString().

l_int32 ioFormatTest ( const char *  filename)

ioFormatTest()

Input: filename (input file)
Return: 0 if OK; 1 on error or if the test fails

Notes:
    (1) This writes and reads a set of output files losslessly in different formats to /tmp, and tests that the result before and after is unchanged.
    (2) This should work properly on input images of any depth, with and without colormaps.
    (3) All supported formats are tested for bmp, png, tiff and non-ascii pnm. Ascii pnm also works (but who'd ever want to use it?) We allow 2 bpp bmp, although it's not supported elsewhere. And we don't support reading 16 bpp png, although this can be turned on in pngio.c.
    (4) This silently skips png or tiff testing if HAVE_LIBPNG or HAVE_LIBTIFF is 0, respectively.
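The write, read back, and compare idea behind this test can be shown with a self-contained sketch for a single lossless format. It does not use Leptonica: sketch_pgm_roundtrip() is a hypothetical function that handles only 8 bpp grayscale data as binary (non-ascii) pnm, and it writes to an anonymous temporary file from tmpfile() instead of a named file in /tmp.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not Leptonica code: write an 8 bpp grayscale
 * image as binary PGM to a temporary file, read it back, and compare.
 * Returns 0 if the data survives unchanged, 1 on error or mismatch. */
int sketch_pgm_roundtrip(const unsigned char *pix, int w, int h)
{
    FILE *fp;
    unsigned char *back;
    size_t n;
    int rw, rh, maxval, fail = 1;

    if (pix == NULL || w <= 0 || h <= 0)
        return 1;
    n = (size_t)w * (size_t)h;
    if ((back = malloc(n)) == NULL)
        return 1;
    if ((fp = tmpfile()) == NULL) {
        free(back);
        return 1;
    }

    /* Write: text header, then raw samples. */
    fprintf(fp, "P5\n%d %d\n255\n", w, h);
    fwrite(pix, 1, n, fp);
    rewind(fp);

    /* Read: parse the header, skip the single separator byte that
     * follows maxval, load the samples, and compare with the input. */
    if (fscanf(fp, "P5 %d %d %d", &rw, &rh, &maxval) == 3 &&
        fgetc(fp) != EOF &&
        rw == w && rh == h && maxval == 255 &&
        fread(back, 1, n, fp) == n &&
        memcmp(pix, back, n) == 0)
        fail = 0;

    fclose(fp);
    free(back);
    return fail;
}
```

A full test in the spirit of the notes would repeat this for each lossless format and for images of different depths, failing as soon as one comparison differs.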

Definition at line 986 of file readfile.c.

References ERROR_INT, FALSE, FILE_BMP, FILE_G3, FILE_G4, FILE_LZW, FILE_PB, FILE_PNG, FILE_PNM, FILE_RLE, FILE_TIFF, FILE_ZIP, IFF_BMP, IFF_PNG, IFF_PNM, IFF_TIFF, IFF_TIFF_G3, IFF_TIFF_G4, IFF_TIFF_LZW, IFF_TIFF_PACKBITS, IFF_TIFF_RLE, IFF_TIFF_ZIP, L_INFO, NULL, pixClone(), pixDestroy(), pixEqual(), pixGetColormap(), pixGetDepth(), pixRead(), pixRemoveColormap(), pixWrite(), PROCNAME, REMOVE_CMAP_BASED_ON_SRC, and TRUE.

Referenced by main().


Variable Documentation

const char* FILE_BMP = "/tmp/junkout.bmp" [static]

Definition at line 57 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_PNG = "/tmp/junkout.png" [static]

Definition at line 58 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_PNM = "/tmp/junkout.pnm" [static]

Definition at line 59 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_G3 = "/tmp/junkout_g3.tif" [static]

Definition at line 60 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_G4 = "/tmp/junkout_g4.tif" [static]

Definition at line 61 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_RLE = "/tmp/junkout_rle.tif" [static]

Definition at line 62 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_PB = "/tmp/junkout_packbits.tif" [static]

Definition at line 63 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_LZW = "/tmp/junkout_lzw.tif" [static]

Definition at line 64 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_ZIP = "/tmp/junkout_zip.tif" [static]

Definition at line 65 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_TIFF = "/tmp/junkout.tif" [static]

Definition at line 66 of file readfile.c.

Referenced by ioFormatTest().

const char* FILE_JPG = "/tmp/junkout.jpg" [static]

Definition at line 67 of file readfile.c.

const char JP2K_CODESTREAM[4] = { 0xff, 0x4f, 0xff, 0x51 } [static]

Definition at line 71 of file readfile.c.

Referenced by findFileFormatBuffer().

const char JP2K_IMAGE_DATA[12] [static]
Initial value:
 { 0x00, 0x00, 0x00, 0x0C,
                                          0x6A, 0x50, 0x20, 0x20,
                                          0x0D, 0x0A, 0x87, 0x0A }

Definition at line 72 of file readfile.c.

Referenced by findFileFormatBuffer().
