Leptonica 1.68
C Image Processing Library
Find and remove text page skew.
Top-level deskew interfaces
    PIX       *pixDeskew()
    PIX       *pixFindSkewAndDeskew()
    PIX       *pixDeskewGeneral()

Top-level angle-finding interface
    l_int32    pixFindSkew()

Basic angle-finding functions
    l_int32    pixFindSkewSweep()
    l_int32    pixFindSkewSweepAndSearch()
    l_int32    pixFindSkewSweepAndSearchScore()
    l_int32    pixFindSkewSweepAndSearchScorePivot()

Search over arbitrary range of angles in orthogonal directions
    l_int32    pixFindSkewOrthogonalRange()

Differential square sum function for scoring
    l_int32    pixFindDifferentialSquareSum()

Measures of variance of row sums
    l_int32    pixFindNormalizedSquareSum()

==============================================================
Page skew detection

Skew is determined by pixel profiles, which are computed as pixel sums along the raster line for each line in the image. By vertically shearing the image by a given angle, the sums can be computed quickly along the raster lines rather than along lines at that angle. The score is computed from these line sums by taking the square of the DIFFERENCE between adjacent line sums, summed over all lines. The skew angle is then found as the angle that maximizes the score. The actual computation for any sheared image is done in the function pixFindDifferentialSquareSum().

The search for the angle that maximizes this score is most efficiently performed in two stages. First, sweep coarsely over angles, using a significantly reduced image (say, 4x reduction), to find the approximate maximum within a half degree or so. Then do an interval-halving binary search at higher resolution to get the skew angle to within 1/20 degree or better.

The differential signal is used (rather than just the variance of line sums) because it rejects the background noise due to the total number of black pixels, and has maximum contributions from the baselines and x-height lines of text when the textlines are aligned with the raster lines. It also works well in multicolumn pages where the textlines do not line up across columns.

The method is fast, accurate to within an angle (in radians) of approximately the inverse width in pixels of the image, and will work on a surprisingly small amount of text data (just a couple of text lines). Consequently, it can also be used to find local skew if the skew were to vary significantly over the page. Local skew determination is not very important except for locating lines of handwritten text that may be mixed with printed text.
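As a rough worked example of the accuracy estimate above, the following sketch converts "about 1/width radians" into degrees. The function name skew_resolution_deg() is hypothetical and is not part of Leptonica; it only performs the unit conversion.

```c
/* Hypothetical helper, not a Leptonica function.
 * Approximate angular resolution in degrees for an image of the given
 * width in pixels: about 1/width radians, converted to degrees.
 * Returns a negative value if the width is not positive. */
double skew_resolution_deg(int width_px)
{
    const double deg_per_rad = 180.0 / 3.14159265358979323846;

    if (width_px <= 0)
        return -1.0;
    return deg_per_rad / (double)width_px;
}
```

For a 2550-pixel-wide page (8.5 inches at 300 ppi) this gives roughly 0.022 degree, which is consistent with the "1/20 degree or better" target of the binary search.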
Definition in file skew.c.
PIX* pixDeskew(PIX *pixs, l_int32 redsearch)

Input:  pixs (any depth)
        redsearch (for binary search: reduction factor = 1, 2 or 4; use 0 for default)
Return: pixd (deskewed pix), or null on error

Notes:
    (1) This binarizes if necessary and finds the skew angle. If the angle is large enough and there is sufficient confidence, it returns a deskewed image; otherwise, it returns a clone.
Definition at line 146 of file skew.c.
References DEFAULT_BS_REDUCTION, ERROR_PTR, NULL, pixDeskewGeneral(), and PROCNAME.
Referenced by main().
PIX* pixFindSkewAndDeskew(PIX *pixs, l_int32 redsearch, l_float32 *pangle, l_float32 *pconf)

Input:  pixs (any depth)
        redsearch (for binary search: reduction factor = 1, 2 or 4; use 0 for default)
        &angle (<optional return> angle required to deskew, in degrees; use NULL to skip)
        &conf (<optional return> conf value is ratio of max/min scores; use NULL to skip)
Return: pixd (deskewed pix), or null on error

Notes:
    (1) This binarizes if necessary and finds the skew angle. If the angle is large enough and there is sufficient confidence, it returns a deskewed image; otherwise, it returns a clone.
Definition at line 180 of file skew.c.
References DEFAULT_BS_REDUCTION, ERROR_PTR, NULL, pixDeskewGeneral(), and PROCNAME.
Referenced by main().
PIX* pixDeskewGeneral(PIX        *pixs,
                      l_int32     redsweep,
                      l_float32   sweeprange,
                      l_float32   sweepdelta,
                      l_int32     redsearch,
                      l_int32     thresh,
                      l_float32  *pangle,
                      l_float32  *pconf)

Input:  pixs (any depth)
        redsweep (for linear search: reduction factor = 1, 2 or 4; use 0 for default)
        sweeprange (in degrees in each direction from 0; use 0.0 for default)
        sweepdelta (in degrees; use 0.0 for default)
        redsearch (for binary search: reduction factor = 1, 2 or 4; use 0 for default)
        thresh (for binarizing the image; use 0 for default)
        &angle (<optional return> angle required to deskew, in degrees; use NULL to skip)
        &conf (<optional return> conf value is ratio of max/min scores; use NULL to skip)
Return: pixd (deskewed pix), or null on error

Notes:
    (1) This binarizes if necessary and finds the skew angle. If the angle is large enough and there is sufficient confidence, it returns a deskewed image; otherwise, it returns a clone.
Definition at line 222 of file skew.c.
References DEFAULT_BINARY_THRESHOLD, DEFAULT_BS_REDUCTION, DEFAULT_MINBS_DELTA, DEFAULT_SWEEP_DELTA, DEFAULT_SWEEP_RANGE, DEFAULT_SWEEP_REDUCTION, ERROR_PTR, L_ABS, L_BRING_IN_WHITE, L_ROTATE_AREA_MAP, MIN_ALLOWED_CONFIDENCE, MIN_DESKEW_ANGLE, NULL, pixClone(), pixConvertTo1(), pixDestroy(), pixFindSkewSweepAndSearch(), pixGetDepth(), pixRotate(), and PROCNAME.
Referenced by pixDeskew(), and pixFindSkewAndDeskew().
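The note for pixDeskewGeneral() says that rotation happens only when the angle is large enough and the confidence is sufficient. A minimal sketch of that decision follows, assuming the thresholds are the MIN_DESKEW_ANGLE and MIN_ALLOWED_CONFIDENCE constants documented on this page and that the comparison is a simple threshold test. The function names should_deskew() and effective_redsearch() are hypothetical; the exact comparisons in skew.c may differ.

```c
/* Values copied from the static constants documented for skew.c. */
static const float MIN_DESKEW_ANGLE = 0.1f;        /* assumed to be in degrees */
static const float MIN_ALLOWED_CONFIDENCE = 3.0f;  /* ratio of max/min scores */
static const int   DEFAULT_BS_REDUCTION = 2;

/* Hypothetical helper: substitutes the default when the caller passes 0,
 * as described by "use 0 for default". */
int effective_redsearch(int redsearch)
{
    return (redsearch == 0) ? DEFAULT_BS_REDUCTION : redsearch;
}

/* Hypothetical helper: returns 1 if the measured skew justifies a
 * rotation, 0 if a clone of the input would be returned instead. */
int should_deskew(float angle_deg, float conf)
{
    float mag = (angle_deg < 0.0f) ? -angle_deg : angle_deg;

    return (mag >= MIN_DESKEW_ANGLE && conf >= MIN_ALLOWED_CONFIDENCE) ? 1 : 0;
}
```

With these assumed thresholds, a measured angle of 0.05 degree is left alone regardless of confidence, and a 2 degree skew with a confidence of 2.0 is also left alone.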
l_int32 pixFindSkew(PIX *pixs, l_float32 *pangle, l_float32 *pconf)

Input:  pixs (1 bpp)
        &angle (<return> angle required to deskew, in degrees)
        &conf (<return> confidence value is ratio of max/min scores)
Return: 0 if OK, 1 on error or if angle measurement is not valid

Notes:
    (1) This is a simple high-level interface that uses default values of the parameters for reasonable speed and accuracy.
    (2) The angle returned is the negative of the skew angle of the image. It is the angle required for deskew. Clockwise rotations are positive angles.
Definition at line 305 of file skew.c.
References DEFAULT_BS_REDUCTION, DEFAULT_MINBS_DELTA, DEFAULT_SWEEP_DELTA, DEFAULT_SWEEP_RANGE, DEFAULT_SWEEP_REDUCTION, ERROR_INT, pixFindSkewSweepAndSearch(), pixGetDepth(), and PROCNAME.
Referenced by main().
l_int32 pixFindSkewSweep(PIX        *pixs,
                         l_float32  *pangle,
                         l_int32     reduction,
                         l_float32   sweeprange,
                         l_float32   sweepdelta)

Input:  pixs (1 bpp)
        &angle (<return> angle required to deskew, in degrees)
        reduction (factor = 1, 2, 4 or 8)
        sweeprange (half the full range; assumed about 0; in degrees)
        sweepdelta (angle increment of sweep; in degrees)
Return: 0 if OK, 1 on error or if angle measurement is not valid

Notes:
    (1) This examines the 'score' for skew angles with equal intervals.
    (2) Caller must check the return value for validity of the result.
Definition at line 347 of file skew.c.
References ERROR_INT, GPLOT_LINES, GPLOT_PNG, GPLOT_POINTS, gplotAddPlot(), gplotCreate(), gplotDestroy(), gplotMakeOutput(), L_BRING_IN_WHITE, L_INFO_FLOAT2, numaAddNumber(), numaCreate(), numaDestroy(), numaFitMax(), pixClone(), pixCreateTemplate(), pixDestroy(), pixFindDifferentialSquareSum(), pixGetDepth(), pixReduceRankBinaryCascade(), pixVShearCorner(), pixZero(), and PROCNAME.
l_int32 pixFindSkewSweepAndSearch(PIX        *pixs,
                                  l_float32  *pangle,
                                  l_float32  *pconf,
                                  l_int32     redsweep,
                                  l_int32     redsearch,
                                  l_float32   sweeprange,
                                  l_float32   sweepdelta,
                                  l_float32   minbsdelta)

Input:  pixs (1 bpp)
        &angle (<return> angle required to deskew; in degrees)
        &conf (<return> confidence given by ratio of max/min score)
        redsweep (sweep reduction factor = 1, 2, 4 or 8)
        redsearch (binary search reduction factor = 1, 2, 4 or 8; and must not exceed redsweep)
        sweeprange (half the full range, assumed about 0; in degrees)
        sweepdelta (angle increment of sweep; in degrees)
        minbsdelta (min binary search increment angle; in degrees)
Return: 0 if OK, 1 on error or if angle measurement is not valid

Notes:
    (1) This finds the skew angle, doing first a sweep through a set of equal angles, and then doing a binary search until convergence.
    (2) Caller must check the return value for validity of the result.
    (3) In computing the differential line sum variance score, we sum the result over scanlines, but we always skip:
Definition at line 489 of file skew.c.
References NULL, and pixFindSkewSweepAndSearchScore().
Referenced by main(), pixDeskewGeneral(), pixFindSkew(), and pixGetLocalSkewAngles().
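The two-stage strategy in note (1), a sweep over equally spaced angles followed by interval halving, can be sketched independently of any image code. This is an illustrative sketch, not the code in skew.c: the names score_fn, find_peak_angle() and demo_score() are hypothetical, the score is supplied by the caller as a callback, and the sketch assumes the score has a single peak inside the sweep range. No confidence value is computed here.

```c
#include <stddef.h>

/* Hypothetical callback type: returns the score for a trial angle. */
typedef double (*score_fn)(double angle_deg, void *ctx);

/* Synthetic score for demonstration only: a single peak at the angle
 * stored in ctx. A real score would come from the sheared image. */
double demo_score(double angle_deg, void *ctx)
{
    double d = angle_deg - *(const double *)ctx;
    return -d * d;
}

/* Coarse sweep over [-sweeprange, +sweeprange] in steps of sweepdelta,
 * then interval halving around the best sweep angle until the step
 * falls below minbsdelta. Returns 0 if OK, 1 on bad arguments. */
int find_peak_angle(score_fn score, void *ctx, double sweeprange,
                    double sweepdelta, double minbsdelta, double *pangle)
{
    if (score == NULL || pangle == NULL || sweeprange <= 0.0 ||
        sweepdelta <= 0.0 || minbsdelta <= 0.0)
        return 1;

    /* Stage 1: coarse sweep, keeping the best-scoring angle. */
    int nsteps = (int)(2.0 * sweeprange / sweepdelta + 0.5);
    double best_angle = -sweeprange;
    double best_score = score(best_angle, ctx);
    for (int i = 1; i <= nsteps; i++) {
        double a = -sweeprange + i * sweepdelta;
        double s = score(a, ctx);
        if (s > best_score) {
            best_score = s;
            best_angle = a;
        }
    }

    /* Stage 2: test one point on each side of the current best,
     * move to the better one if it improves, then halve the step. */
    double delta = 0.5 * sweepdelta;
    while (delta >= minbsdelta) {
        double left = score(best_angle - delta, ctx);
        double right = score(best_angle + delta, ctx);
        if (left > best_score && left >= right) {
            best_score = left;
            best_angle -= delta;
        } else if (right > best_score) {
            best_score = right;
            best_angle += delta;
        }
        delta *= 0.5;
    }

    *pangle = best_angle;
    return 0;
}
```

Calling find_peak_angle(demo_score, &peak, 7.0, 1.0, 0.01, &angle) with peak set to 1.37 uses the default sweep range, sweep delta and minimum search delta listed at the end of this page, and returns an angle within 0.01 degree of the peak.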
l_int32 pixFindSkewSweepAndSearchScore(PIX        *pixs,
                                       l_float32  *pangle,
                                       l_float32  *pconf,
                                       l_float32  *pendscore,
                                       l_int32     redsweep,
                                       l_int32     redsearch,
                                       l_float32   sweepcenter,
                                       l_float32   sweeprange,
                                       l_float32   sweepdelta,
                                       l_float32   minbsdelta)

pixFindSkewSweepAndSearchScore()

Input:  pixs (1 bpp)
        &angle (<return> angle required to deskew; in degrees)
        &conf (<return> confidence given by ratio of max/min score)
        &endscore (<optional return> max score; use NULL to ignore)
        redsweep (sweep reduction factor = 1, 2, 4 or 8)
        redsearch (binary search reduction factor = 1, 2, 4 or 8; and must not exceed redsweep)
        sweepcenter (angle about which sweep is performed; in degrees)
        sweeprange (half the full range, taken about sweepcenter; in degrees)
        sweepdelta (angle increment of sweep; in degrees)
        minbsdelta (min binary search increment angle; in degrees)
Return: 0 if OK, 1 on error or if angle measurement is not valid

Notes:
    (1) This finds the skew angle, doing first a sweep through a set of equal angles, and then doing a binary search until convergence.
    (2) There are two built-in constants that determine if the returned confidence is nonzero:
Definition at line 541 of file skew.c.
References L_SHEAR_ABOUT_CORNER, and pixFindSkewSweepAndSearchScorePivot().
Referenced by pixDeskewBarcode(), and pixFindSkewSweepAndSearch().
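The constraints on the two reduction factors stated in the Input list (each one of 1, 2, 4 or 8, with redsearch not exceeding redsweep) can be checked before calling the search functions. The following is a sketch of such a check; is_valid_reduction() and check_reductions() are hypothetical names, and the library performs its own validation internally.

```c
/* Hypothetical helper: returns 1 if f is one of the documented
 * reduction factors 1, 2, 4 or 8. */
int is_valid_reduction(int f)
{
    return f == 1 || f == 2 || f == 4 || f == 8;
}

/* Hypothetical helper: returns 0 if the pair of reduction factors
 * satisfies the documented constraints, 1 otherwise. */
int check_reductions(int redsweep, int redsearch)
{
    if (!is_valid_reduction(redsweep) || !is_valid_reduction(redsearch))
        return 1;
    if (redsearch > redsweep)  /* search must not be coarser than sweep */
        return 1;
    return 0;
}
```

The default pair documented on this page, DEFAULT_SWEEP_REDUCTION = 4 and DEFAULT_BS_REDUCTION = 2, passes this check.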
l_int32 pixFindSkewSweepAndSearchScorePivot(PIX        *pixs,
                                            l_float32  *pangle,
                                            l_float32  *pconf,
                                            l_float32  *pendscore,
                                            l_int32     redsweep,
                                            l_int32     redsearch,
                                            l_float32   sweepcenter,
                                            l_float32   sweeprange,
                                            l_float32   sweepdelta,
                                            l_float32   minbsdelta,
                                            l_int32     pivot)

pixFindSkewSweepAndSearchScorePivot()

Input:  pixs (1 bpp)
        &angle (<return> angle required to deskew; in degrees)
        &conf (<return> confidence given by ratio of max/min score)
        &endscore (<optional return> max score; use NULL to ignore)
        redsweep (sweep reduction factor = 1, 2, 4 or 8)
        redsearch (binary search reduction factor = 1, 2, 4 or 8; and must not exceed redsweep)
        sweepcenter (angle about which sweep is performed; in degrees)
        sweeprange (half the full range, taken about sweepcenter; in degrees)
        sweepdelta (angle increment of sweep; in degrees)
        minbsdelta (min binary search increment angle; in degrees)
        pivot (L_SHEAR_ABOUT_CORNER, L_SHEAR_ABOUT_CENTER)
Return: 0 if OK, 1 on error or if angle measurement is not valid

Notes:
    (1) See notes in pixFindSkewSweepAndSearchScore().
    (2) This allows choice of shear pivoting from either the UL corner or the center. For small angles, the ability to discriminate angles is better with shearing from the UL corner. However, for large angles (say, greater than 20 degrees), it is better to shear about the center because a shear from the UL corner loses too much of the image.
Definition at line 588 of file skew.c.
References ERROR_INT, GPLOT_LINES, GPLOT_PNG, GPLOT_POINTS, gplotAddPlot(), gplotCreate(), gplotDestroy(), gplotMakeOutput(), L_BRING_IN_WHITE, L_INFO_FLOAT, L_INFO_FLOAT2, L_SHEAR_ABOUT_CENTER, L_SHEAR_ABOUT_CORNER, L_WARNING, MIN_VALID_MAXSCORE, MINSCORE_THRESHOLD_CONSTANT, numaAddNumber(), numaCreate(), numaDestroy(), numaEmpty(), numaGetCount(), numaGetFValue(), numaGetMax(), numaGetMin(), pixClone(), pixCreateTemplate(), pixDestroy(), pixFindDifferentialSquareSum(), pixGetDepth(), pixGetHeight(), pixGetWidth(), pixReduceRankBinaryCascade(), pixVShearCenter(), pixVShearCorner(), pixZero(), and PROCNAME.
Referenced by main(), pixFindSkewOrthogonalRange(), and pixFindSkewSweepAndSearchScore().
l_int32 pixFindSkewOrthogonalRange(PIX        *pixs,
                                   l_float32  *pangle,
                                   l_float32  *pconf,
                                   l_int32     redsweep,
                                   l_int32     redsearch,
                                   l_float32   sweeprange,
                                   l_float32   sweepdelta,
                                   l_float32   minbsdelta,
                                   l_float32   confprior)
Definition at line 963 of file skew.c.
References ERROR_INT, L_SHEAR_ABOUT_CORNER, pixDestroy(), pixFindSkewSweepAndSearchScorePivot(), pixGetDepth(), pixRotateOrth(), and PROCNAME.
Referenced by main().
pixFindDifferentialSquareSum()

Input:  pixs
        &sum (<return> result)
Return: 0 if OK, 1 on error

Notes:
    (1) At the top and bottom, we skip:
Definition at line 1033 of file skew.c.
References ERROR_INT, L_MAX, L_MIN, NULL, numaDestroy(), numaGetCount(), numaGetFValue(), pixCountPixelsByRow(), pixGetHeight(), pixGetWidth(), and PROCNAME.
Referenced by pixFindSkewSweep(), and pixFindSkewSweepAndSearchScorePivot().
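The score described in the file overview (the square of the difference between adjacent line sums, summed over lines) can be sketched on a plain array of row sums. This is an illustration under stated assumptions and not the code in skew.c: diff_square_sum() is a hypothetical name, the row sums are assumed to be already computed, and the number of rows skipped at the top and bottom is left as a caller-supplied parameter because the rule used by the library is not reproduced on this page.

```c
#include <stddef.h>

/* Hypothetical sketch: sum of squared differences between adjacent row
 * sums, ignoring `skip` rows at the top and `skip` rows at the bottom.
 * The skip count is an assumption supplied by the caller.
 * Returns 0 if OK, 1 on error. */
int diff_square_sum(const int *rowsums, int nrows, int skip, double *psum)
{
    if (rowsums == NULL || psum == NULL || nrows < 2 || skip < 0 ||
        2 * skip >= nrows - 1)
        return 1;  /* need at least one adjacent pair after skipping */

    double sum = 0.0;
    for (int i = skip + 1; i < nrows - skip; i++) {
        double d = (double)rowsums[i] - (double)rowsums[i - 1];
        sum += d * d;
    }
    *psum = sum;
    return 0;
}
```

For example, the row sums {0, 0, 100, 100, 0, 0, 100, 100, 0, 0} (text lines aligned with the raster) and {40, 40, 40, 40, 40, 40, 40, 40, 40, 40} (the same 400 pixels smeared evenly across rows) give scores of 40000 and 0 with skip = 1. This shows why the differential score is insensitive to the total number of black pixels.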
l_int32 pixFindNormalizedSquareSum(PIX        *pixs,
                                   l_float32  *phratio,
                                   l_float32  *pvratio,
                                   l_float32  *pfract)

Input:  pixs
        &hratio (<optional return> ratio of normalized horiz square sum to result if the pixel distribution were uniform)
        &vratio (<optional return> ratio of normalized vert square sum to result if the pixel distribution were uniform)
        &fract (<optional return> ratio of fg pixels to total pixels)
Return: 0 if OK, 1 on error or if there are no fg pixels

Notes:
    (1) Let the image have h scanlines and N fg pixels. If the pixels were uniformly distributed on scanlines, the sum of squares of fg pixels on each scanline would be h * (N / h)^2. However, if the pixels are not uniformly distributed (e.g., for text), the sum of squares of fg pixels will be larger. We return in hratio and vratio the ratio of these two values.
    (2) If there are no fg pixels, hratio and vratio are returned as 0.0.
Definition at line 1101 of file skew.c.
References ERROR_INT, NULL, numaDestroy(), numaGetFValue(), numaGetSum(), pixCountPixelsByRow(), pixDestroy(), pixGetDepth(), pixGetDimensions(), pixRotateOrth(), and PROCNAME.
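The ratio in note (1) can be worked through on a plain array of row sums. In this sketch the observed sum of squares is divided by the uniform value h * (N / h)^2, which simplifies to h * (sum of squares) / N^2. The name normalized_square_ratio() is hypothetical and the code only illustrates the formula for one direction; it is not the implementation in skew.c.

```c
#include <stddef.h>

/* Hypothetical sketch: ratio of the sum of squared row sums to the value
 * h * (N / h)^2 expected for a uniform distribution of N fg pixels over
 * h scanlines. Returns 0 if OK; returns 1 and sets the ratio to 0.0 if
 * there are no fg pixels; returns 1 on bad arguments. */
int normalized_square_ratio(const int *rowsums, int h, double *pratio)
{
    if (rowsums == NULL || pratio == NULL || h <= 0)
        return 1;

    double n = 0.0;      /* total fg pixels, N */
    double sumsq = 0.0;  /* sum of squared row sums */
    for (int i = 0; i < h; i++) {
        n += (double)rowsums[i];
        sumsq += (double)rowsums[i] * (double)rowsums[i];
    }
    if (n == 0.0) {
        *pratio = 0.0;
        return 1;
    }

    *pratio = sumsq * (double)h / (n * n);
    return 0;
}
```

Uniform row sums such as {5, 5, 5, 5} give a ratio of 1.0, while {10, 0, 10, 0}, with the same 20 pixels concentrated on alternate rows, gives 2.0.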
const l_float32 DEFAULT_SWEEP_RANGE = 7. [static]
Definition at line 88 of file skew.c.
Referenced by pixDeskewGeneral(), and pixFindSkew().
const l_float32 DEFAULT_SWEEP_DELTA = 1. [static]
Definition at line 89 of file skew.c.
Referenced by pixDeskewGeneral(), and pixFindSkew().
const l_float32 DEFAULT_MINBS_DELTA = 0.01 [static]
Definition at line 95 of file skew.c.
Referenced by pixDeskewGeneral(), and pixFindSkew().
const l_int32 DEFAULT_SWEEP_REDUCTION = 4 [static]
Definition at line 98 of file skew.c.
Referenced by pixDeskewGeneral(), and pixFindSkew().
const l_int32 DEFAULT_BS_REDUCTION = 2 [static]
Definition at line 99 of file skew.c.
Referenced by pixDeskew(), pixDeskewGeneral(), pixFindSkew(), and pixFindSkewAndDeskew().
const l_float32 MIN_DESKEW_ANGLE = 0.1 [static]
Definition at line 102 of file skew.c.
Referenced by pixDeskewGeneral().
const l_float32 MIN_ALLOWED_CONFIDENCE = 3.0 [static]
Definition at line 105 of file skew.c.
Referenced by pixDeskewGeneral().
const l_int32 MIN_VALID_MAXSCORE = 10000 [static]
Definition at line 108 of file skew.c.
Referenced by pixFindSkewSweepAndSearchScorePivot().
const l_float32 MINSCORE_THRESHOLD_CONSTANT = 0.000002 [static]
Definition at line 113 of file skew.c.
Referenced by pixFindSkewSweepAndSearchScorePivot().
const l_int32 DEFAULT_BINARY_THRESHOLD = 130 [static]
Definition at line 116 of file skew.c.
Referenced by pixDeskewGeneral().