Raster Text Quality Checklist

 

Scan2CAD’s OCR recognizes raster text only where all of the following conditions are met:

 

The raster text is easily legible.
The raster text characters do not touch each other.
The raster text characters do not touch other drawing elements.
The raster text characters are all at the same orientation.
The raster text characters are in a font that Scan2CAD can recognize.

 

To ensure that your text meets these conditions, work through the following Raster Text Quality Checklist.

 

First, place your cursor over a piece of text on your image and press M to magnify. Keep pressing M until the image is highly magnified. Alternatively, zoom in by scrolling your mouse wheel forward. (To zoom out again, press the Home key.)

 

1. Is the text easily legible?

If you cannot read the text easily, as in the examples below, Scan2CAD won’t be able to read it either.

 

               

 

If the text is not easily legible, the only remedy is to start with a better-quality raster image.

If this is not possible, you will have to retype the text manually. You can do this either in Scan2CAD or in your CAD program after importing the converted file.

You may want to erase areas of very poor quality text from the raster image so that these areas are not vectorized to lines and arcs.

 

2. Are the characters touching?

Scan2CAD cannot recognize characters that touch other characters, even if the characters are only connected by a few pixels:

 

 

If the characters touch, try selecting OCR Menu > Settings > Split before running OCR. When this option is selected, Scan2CAD will attempt to split and identify touching characters.

This improves text recognition on some raster images. On others, however, it may cause many “junk characters” to be recognized, because characters that touch are often of very poor quality and remain unrecognizable even after splitting.

For instance, the characters in the example above have “bled”. Not only has this caused them to touch each other, it has also filled in the “A”. The “A” no longer has the typical shape of an “A”, so Scan2CAD may have difficulty recognizing it even if it is not touching other characters.
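To illustrate why a bridge only a few pixels wide matters, here is a minimal sketch that counts connected groups of ink pixels in a tiny bitmap. It assumes a recognizer that isolates characters as connected pixel groups, which is a common OCR approach but is not confirmed here as Scan2CAD’s own method. The function name `count_components` and the sample bitmaps are hypothetical.

```python
from collections import deque

def count_components(bitmap):
    """Count 8-connected groups of ink pixels ('#') in a list of equal-length strings."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or seen[r][c]:
                continue
            # Found an unvisited ink pixel: flood-fill everything joined to it.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] == "#" and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# Two hypothetical character shapes with a clear gap between them.
separate = [
    "###..###",
    "#.#..#..",
    "###..###",
]
# The same shapes joined by a two-pixel bridge in the middle row.
touching = [
    "###..###",
    "#.####..",
    "###..###",
]

print(count_components(separate))  # two separate shapes
print(count_components(touching))  # the bridge merges them into one shape
```

Under this assumption, the bridged pair is seen as a single unfamiliar shape rather than two characters, which is why splitting has to be attempted before recognition.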

You can often improve the quality of an image that has bled by rescanning it in grayscale and then thresholding it to black and white.
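As a rough illustration of what thresholding does, the sketch below converts a grayscale image to black and white with a single cutoff value. It is a generic global threshold written for this example, not Scan2CAD’s implementation. The function name `threshold`, the sample pixel values, and the cutoffs are all hypothetical.

```python
def threshold(gray, cutoff):
    """Return a 1-bit image: 1 (ink) where a pixel is darker than cutoff, else 0."""
    return [[1 if value < cutoff else 0 for value in row] for row in gray]

# Hypothetical grayscale scan (0 = black, 255 = white): two dark strokes
# joined by a band of mid-gray "bleed".
scan = [
    [20, 30, 140, 150, 25, 15],
    [10, 25, 160, 145, 30, 20],
]

loose = threshold(scan, 170)   # high cutoff: the bleed counts as ink, strokes merge
strict = threshold(scan, 100)  # lower cutoff: the bleed drops out, strokes separate

for row in strict:
    print("".join("#" if pixel else "." for pixel in row))
```

A grayscale scan keeps the difference between solid strokes and lighter bleed, so the cutoff can be chosen to discard the bleed. A scan made directly in black and white has already lost that information.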

 

3. Is the text written over other drawing elements?

If text is written over drawing elements, or is attached to underlining or boxes as in the examples below, Scan2CAD won’t be able to recognize it.

 

 

4. Is the text at more than one orientation?

Scan2CAD can only reliably recognize text at one orientation at a time.

Text at different orientations can sometimes be recognized if you go to OCR > Settings and set Character Rotation to Auto.

However, where text at many different orientations is intermingled, it is virtually impossible to recognize. This typically affects text on contour maps and site plan drawings such as the one below.

 

 

5. Can Scan2CAD recognize the font?

By default, Scan2CAD can only recognize text that has been written using a standard font such as the font in the example below.

 

 

It may not recognize other fonts well. It may also fail to recognize standard fonts that are narrower or wider than normal or that are italicized.

 

If Scan2CAD’s default text recognition cannot recognize a font well and you have a lot of images containing that font, you can train Scan2CAD to recognize the font. You can do this if the font characters are consistent and do not touch. For example:

 

Scan2CAD’s default text recognition will recognize this font, but not optimally, because the font is narrower than normal. You could train Scan2CAD to recognize this font well.

 

Scan2CAD’s default text recognition will recognize this font very poorly because it is italicized and handwritten. However, because the characters are clear and do not touch, you could train Scan2CAD to recognize it.

 

Scan2CAD’s default text recognition will recognize this font very poorly because it is handwritten and because the characters touch each other. You could not train Scan2CAD to recognize this font because the characters touch.

 

Although the quality of this text is poor, you could train Scan2CAD to recognize it because the characters are consistent and do not touch each other.

 

It takes a few hours to train Scan2CAD to recognize a font, but doing so can significantly improve text recognition.

 
