
cuneiform team mailing list archive

[Bug 344790] Re: OCR quality drops (mising point)

 

Second conclusion:
cuneiform (I tried the refactoring branch) produces an error message like
"Unknown DIB format
CTIImageList::AddImage: invalid image info" if the page contains an image object.

Example

cuneiform -v 1871_016.3B 
1871_016.3B=> DIB 2544x3300 2544x3300+0+0 1-bit Bilevel DirectClass 24.02MiB 0.210u 0:00.209
############################                                                                
CuneiForm Recognize options:                                                                
  Language:      0                                                                          
  Fax:           false                                                                      
  Use speller:   false                                                                      
Layout options:                                                                             
  One Column:    false                                                                      
  Dot Matrix:    false                                                                      
  Auto Rotate:   false                                                                      
  Tables number: 0                                                                          
  Geometry:      Rect(Point(0,0), Point(2544,3300)) width:2544; height:3300                 
FormatOptions:                                                                              
  SerifName:         Times New Roman                                                        
  SansSerifName:     Arial                                                                  
  Monospace Name:    Courier New                                                            
  Use bold:          false                                                                  
  Use Italic:        false                                                                  
  Use font size:     false                                                                  
  Unrecognized char: '~'                                                                    
  Line breaks:       false                                                                  
############################                                                                
The image depth is 24 at this point.                                                        
falseWarning: RSL said that the lines don't need to be erased from the picture.             
VSL: before table search - 0, after -13                                                     
VSL: No required changes found
Container CPAGE contains:                                                                   
 name : size                                                                                
TYPE_IMAGE : 12032                                                                          
TYPE_IMAGE : 12032                                                                          
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
TYPE_TEXT : 12032                                                                           
Fragment 1 Line  1: <1000>                                                                  
Fragment 2 Line  2: <J. E. COSTA>                                                           
Fragment 3 Line  3: <reconstruction from rainfall data. Reconstructed flood peaks based>    
Fragment 3 Line  4: <on rainfall analyses of' the 1976 Big Thompson storm for the two>      
Fragment 3 Line  5: <basins are 153 and 110 m~/s (Miller and others, 1978) (Table 6).>      
Fragment 3 Line  6: <Th>                                                                    
Fragment 3 Line  7: <ese values are also closer to the paleohydraulic values than are>      
Fragment 3 Line  8: <the slope-area estimates. Thus, two independent indirect discharge>    
Fragment 3 Line  9: <estimates give peak discharge estimates similar to those calculated>   
Fragment 3 Line 10: <by the paleohydraulic technique developed here, but significantl>      
Fragment 3 Line 11: <icny>                                                                  
Fragment 3 Line 12: <less than the published conventional slope-area discharge estimates>   
Fragment 3 Line 13: <(Table 6). These results support the suggestion that excessive chan->  
Fragment 3 Line 14: <nel scour in small mountain tributaries could cause slope-area dis->   
Fragment 3 Line 15: <charge estimates to be too large.>                                     
Fragment 3 Line 16: <A second explanation for why most paleohydraulic recon->               
Fragment 3 Line 17: <structed discharges on small streams are lower than those estimated>   
Fragment 3 Line 18: <by slope-area techniques may be that slope-area inethods require>      
Fragment 3 Line 19: <the estimation of a roughness coefficient (n). Typical values consis-> 
Fragment 3 Line 20: <tently selected for large floods in small mountain channels are n =>   
Fragment 3 Line 21: <0.035 to 0.06. Research on verification of roughness coefficients for> 
Fragment 3 Line 22: <steep mountain channels recently completed (R. D. Jarrett, unpub.>     
Fragment 3 Line 23: <data) indicates that these n values may be too low by factors of 1.5>  
Fragment 3 Line 24: <to 2.0. Higher values would reduce velocities and result in lower>     
Fragment 3 Line 25: <slope-area discharge estimates.>                                       
Fragment 3 Line 26: <Finally, the fundamental assumptions that particles of all sizes>      
Fragment 3 Line 27: <are available for transport in small, steep mountain valleys and that> 
Fragment 3 Line 28: <flood velocity and depth (actually depth-slope product) are>           
Fragment 3 Line 29: <reflected in the size of boulders in flood deposits must be examined.> 
Fragment 3 Line 30: <Large floods may have been able to move boulders larger than>          
Fragment 3 Line 31: <those that were available. This may be the case in Sawmill Gulch>      
Fragment 3 Line 32: <(Table 6, site 9), which follows a major shear zone along which>       
Fragment 3 Line 33: <uranium enrichrnent occurs (Sims and Sheridan, 1964). Fault>           
Fragment 3 Line 34: <movements have crushed and broken the bedrock along closely>           
Fragment 3 Line 35: <spaced joints and fractures, and consequently there are no very>       
Fragment 3 Line 36: <large boulders available to be moved during a large flood. The>        
Fragment 3 Line 37: <second assumption, that average velocity and depth can be recon->      
Fragment 3 Line 38: <structed from particle size with reasonable accuracy, requires that>   
Fragment 3 Line 39: <the methods selected, premises, and numerical values estimated rea->   
Fragment 3 Line 40: <sonably approximate processes and conditions in the field. Unfor->     
Fragment 3 Line 41: <tunately, the hazards entailed in compiling actual measurements>       
Fragment 3 Line 42: <and observations during catastrophic floods preclude any substan->     
Fragment 3 Line 43: <tial direct empirical corroboration.>                                  
Fragment 4 Line 44: <macroturbulent effects in large rivers during flash fl>                
Fragment 4 Line 45: <'�rna>                                                                 
Fragment 4 Line 46: <explanation. Lifting forces induced by macrot>                         
Fragment 4 Line 47: <o ur ulent ">                                                          
Fragment 4 Line 48: <play an essential role in the entrainment of coarse>                   
Fragment 4 Line 49: <rse Particles ii>                                                      
Fragment 4 Line 50: <deep flows (Matthes, 1947; Baker, 1973; Jackson, 1976>                 
Fragment 4 Line 51: <upward forces promote the entrainment of parti.>                       
Fragment 4 Line 52: <ac son, 19761>                                                         
Fragment 4 Line 53: <coarsr>                                                                
Fragment 4 Line 54: <those that tractive force and velocity alone can .:>                   
Fragment 4 Line 55: <iphsh>                                                                 
Fragment 4 Line 56: <particles coarser than about 2 m may be moved;>                        
Fragment 4 Line 57: <iws le,>                                                               
Fragment 4 Line 58: <and more shallow than would be predicted by exi;>                      
Fragment 4 Line 59: <exu ~polatrni>                                                         
Fragment 4 Line 60: <incipient motion velocity and depth values for small>                  
Fragment 4 Line 61: <e" pariie>                                                             
Fragment 5 Line 62: <APPLICATION OF PALEOHYDRAULIC>                                         
Fragment 5 Line 63: <DISCHARGE COMPUTATIONS>                                                
Fragment 6 Line 64: <The application of the paleohydraulic flood dis h>                     
Fragment 6 Line 65: <isc arge (>                                                            
Fragment 6 Line 66: <struction technique developed here can be demonstr t d>                
Fragment 6 Line 67: <s rate usin>                                                           
Fragment 6 Line 68: <streams in the Colorado Front Range with sediment 1>                   
Fragment 6 Line 69: <mento ogrea>                                                           
Fragment 6 Line 70: <dence of large flash floods, but without conventional indiree>         
Fragment 6 Line 71: <charge estimates. The two examples are a small tributa>                
Fragment 6 Line 72: <u arytog>                                                              
Fragment 6 Line 73: <Gulch in the Big Thompson River basin, and a I:irge,t,>                
Fragment 6 Line 74: <Boulder Creek at Boulder, Colorado, where the s, -entolo>              
Fragment 6 Line 75: <flood record previously has been investigated by B. and q>             
Fragment 6 Line 76: <(1980).>                                                               
Fragment 7 Line 77: <Rabbit Gulch Tributary>                                                
Fragment 8 Line 78: <Figure 1 I shows a pile of large boulders deposited at the>            
Fragment 8 Line 79: <2>                                                                     
Fragment 8 Line 80: <of a small (1.8 km ) tributary to Rabbit Gulch from a catastrol>       
Fragment 8 Line 81: <flash flood in 1976 in the Big Thompson River basin. The averag>       
Fragment 8 Line 82: <the 5 largest boulders is 1,150 mm, and the channel slope measo>       
Fragment 8 Line 83: <from 1:24,000 scale topographic maps is 0.091. Using equatioo>         
Fragment 8 Line 84: <the estimated average flood velocity is 5.57 m/s, and from Figor>      
Fragment 8 Line 85: <the estimated average flood depth is 1.35 m. Two valley cros~ s>       
Fragment 8 Line 86: <tions are shown in Figure 12, along with the appropriate top flo>      
Fragment 8 Line 87: <width (dashed lines) for the estimated average depth., he averi>       
Fragment 8 Line 88: <discharge for the two cross sections is 57 ms/s. This ts to lx>
Fragment 8 Line 89: <reasonable value for the flood peak for two reasons thc or>
Fragment 8 Line 90: <discharge is approximately 32 m-/s/km2, which r.~ . rnilar>
Fragment 8 Line 91: <unit discharges for other small tributaries in the bra,ompsr>
Fragment 9 Line 92: <I.arge Streams>
Fragment10 Line 93: <When dealing with particles coarser than about 2 m, paleohy->
Fragment10 Line 94: <draulic reconstructions of' average velocity and depth are less accu->
Fragment10 Line 95: <rate than f' or smaller boulders. For the Big Thompson River, the>
Fragment10 Line 96: <only large stream in the Colorado Front Range used to verify>
Fragment10 Line 97: <paleohydraulic reconstructions, postflood slope-area surveys (Groz->
Fragment10 Line 98: <ier and others, 1976) indicated the average flood depth for two cross>
Fragment10 Line 99: <sections was 3.23 m. The reconstructed depth from Figure 7 using>
Fragment10 Line100: <the average of the five largest boulders moved through the cross>
Fragment10 Line101: <section and deposited below the mouth of the canyon (2.76 m)>
Fragment10 Line102: <( .. radley, 1982, personal commun.) is 4.80 m. The calculated>
Fragment10 Line103: <(W. C. Br>
Fragment10 Line104: <average velocity from slope-area measurements (rated "poor") is>
Fragment10 Line105: <7.92 m/s compared to 8.53 m/s computed from equation 10. The>
Fragment10 Line106: <paleohydraulic overestimation of depth and velocity results in the>
Fragment10 Line107: <reconstructed peak discharge at the mouth of the canyon exceedin>
Fragment10 Line108: <g>
Fragment10 Line109: <the slope-area estimate by 76eZ<.>
Fragment10 Line110: <This is a greater diff'erence than occurs in any of the smaller>
Fragment10 Line111: <streams with smaller-sized flood boulders. Possibly, additional>
Fragment11 Line112: <Figure II. Photograph of large boulders deposited at '"'>
Fragment11 Line113: <mouth of an unnamed tributary to Rabbit Gulch from a farg�'>
Fragment11 Line114: <flood in 1976. No previous discharge estimates e ist for this��>
Unknown DIB format
CTIImageList::AddImage: invalid image info

The text is recognized correctly, but no output file is produced.
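
To check other pages for the same pattern, a verbose log like the one above can be summarized with a small script. This is only a rough sketch: the function name summarize_log is my own, and it assumes the log keeps the "TYPE_xxx : size" lines and the two error lines exactly as they appear in the output above.

```python
import re
from collections import Counter

# Error lines copied from the log above; other failure messages are not covered.
ERROR_MARKERS = (
    "Unknown DIB format",
    "CTIImageList::AddImage: invalid image info",
)


def summarize_log(log_text):
    """Count CPAGE block types and collect the known error lines.

    Returns (blocks, errors): a Counter keyed by block type such as
    TYPE_IMAGE or TYPE_TEXT, and a list of matching error lines.
    """
    blocks = Counter()
    errors = []
    for raw in log_text.splitlines():
        line = raw.strip()  # the log pads lines with trailing spaces
        match = re.fullmatch(r"(TYPE_[A-Z_]+)\s*:\s*\d+", line)
        if match:
            blocks[match.group(1)] += 1
        elif any(marker in line for marker in ERROR_MARKERS):
            errors.append(line)
    return blocks, errors


if __name__ == "__main__":
    import sys

    # Read a saved log from standard input.
    blocks, errors = summarize_log(sys.stdin.read())
    print("blocks:", dict(blocks))
    print("errors:", errors)
```

A page that reports at least one TYPE_IMAGE block together with a non-empty error list would match the behaviour described here.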

-- 
OCR quality drops (mising point)
https://bugs.launchpad.net/bugs/344790

Status in Linux port of Cuneiform: New

Bug description:
OCR quality drops during porting.
Look at the recognition result for stdj4.tif, line 7, smart text format:

was (stdj4.txt.initial)
mli i f r nin. Ithas

is (stdj4.txt.puma)
m li i f r nin Ithas
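
The difference between the two results can also be listed character by character. A minimal sketch using Python's standard difflib module; the helper name describe_changes is only illustrative, and the two strings are copied from the "was" and "is" lines above.

```python
import difflib


def describe_changes(before, after):
    """Return the non-equal edit operations that turn `before` into `after`.

    Each entry is (tag, old_text, new_text), where tag is one of
    'insert', 'delete' or 'replace' as reported by SequenceMatcher.
    """
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    return [
        (tag, before[i1:i2], after[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    initial = "mli i f r nin. Ithas"  # stdj4.txt.initial, line 7
    puma = "m li i f r nin Ithas"     # stdj4.txt.puma, line 7
    for tag, old, new in describe_changes(initial, puma):
        print(tag, repr(old), "->", repr(new))
```

For this pair the listing should show the dropped point after "nin" and an extra space after the leading "m".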