Configure text recognition settings

Relevant for: GUI tests and components

Prerequisites

In your application, display the text you want to capture.

Analyze the characteristics of the text

Determine whether you can capture the text using a text (or text-like) property instead of using a text recognition mechanism.

Decide which OCR engine to use

You may need to experiment with your application to find out which OCR settings achieve the best results. Once you determine which OCR engine works best with your tests, we recommend using that engine consistently. Using different engines for different runs may produce different results.
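One way to make this experiment repeatable is to score each engine's captured text against the text you expect. The following is a minimal Python sketch that runs outside UFT. The function names and sample strings are hypothetical, and it assumes you have already copied the captured text from separate runs, one per engine.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(expected: str, captured: str) -> float:
    """Return a score between 0 and 1 for how closely captured matches expected."""
    # Collapse whitespace runs so layout differences do not dominate the score.
    norm_expected = " ".join(expected.split())
    norm_captured = " ".join(captured.split())
    return SequenceMatcher(None, norm_expected, norm_captured).ratio()


def rank_engines(expected: str, captures: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank engines from most to least similar to the expected text."""
    scores = [(engine, similarity(expected, text)) for engine, text in captures.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    expected = "Total amount: 1,250.00 USD"
    # Hypothetical captures, pasted from separate runs with each engine selected.
    captures = {
        "Abbyy": "Total amount: 1,250.00 USD",
        "Tesseract": "Tota1 amount: 1,25O.OO USD",
    }
    for engine, score in rank_engines(expected, captures):
        print(f"{engine}: {score:.2f}")
```

A single sample is rarely representative, so a comparison like this is more useful when repeated over several screens of your application.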

Note: Cloud OCR engines are supported only in UFT versions 15.0.1 or later.

When choosing an OCR engine, consider the following:

The following considerations compare the cloud engines (Google and Baidu) with the non-cloud engines (Abbyy and Tesseract).

Accuracy

Choose the OCR engine that proves most accurate for your applications.

Cloud vendors offer different customer plans that provide different levels of accuracy.

Availability

  • Cloud engines: Require a cloud OCR account and an available Internet connection.
  • Non-cloud engines: Do not require an account or an Internet connection.

Language support

  • Google Cloud OCR detects languages automatically and supports mixed-language text. You do not have to specify which languages to expect in the application. See also Known issues - Multilingual applications.
  • Baidu OCR supports fewer languages than Abbyy but provides greater accuracy and better recognition in hieroglyphic languages, such as Chinese, Japanese, or Korean. Baidu supports various languages separately, or text that contains Chinese and English. See also Known issues - Multilingual applications.
  • Abbyy OCR supports many languages, can be configured to support multiple languages, and can recognize mixed-language text.
  • Tesseract OCR can use only one language pack at a time. For details on language packs, see Languages, below.

Check the list of available languages for Baidu and Abbyy in the Text Recognition pane.

Performance

  • Cloud engines: Affected by the quality of the Internet connection and by your cloud platform plan, not by the computer's configuration.
  • Non-cloud engines: Require a strong processor. On older computers, they may take a long time to return results for complicated images or multilingual text. The Tesseract OCR engine is slower than the Abbyy OCR engine. If your test makes significant use of text recognition steps (such as GetVisibleText), expect a longer total run time with Tesseract.

Set the appropriate options

In the Text Recognition pane of the Options dialog box (Tools > Options > GUI Testing tab > Text Recognition node), set the following options:

OCR engine type

Select one of the following text recognition mechanisms:

  • The Abbyy OCR (the default option)
  • The Tesseract OCR engine
  • The Google Cloud OCR engine
  • The Baidu Cloud OCR engine

Note: Cloud OCR engines are supported only in UFT versions 15.0.1 or later.

Configure the connection to the Cloud OCR service

Supported on UFT versions 15.0.1 and later

Using a cloud OCR engine requires setting up an account with the relevant vendor and obtaining an access token or key used to connect to the cloud service.

  • Enter your access token or key.
  • If your Internet connection requires a proxy, specify the proxy server address and authentication details.
  • Press Test Connection to test your connection details and make sure UFT can connect to the cloud OCR service.

Text Recognition mode

(Abbyy and Tesseract OCR engines only)

  • Single text block mode: Focuses on the area and treats it as a single text block. This is especially useful when trying to capture text on small objects or in a small text area. Select this option if the text on the object is uniform in font, size, color, and background.

  • Multiple text block mode: Instructs the OCR mechanism to handle separately each text area in the object that has a different background, font, or size. The OCR mechanism decides where to divide the text blocks according to an internal algorithm. Select this option only if the text on the object comprises different fonts, font sizes, colors, or backgrounds.

Languages

Available languages and supported languages (Abbyy OCR engine only)

From the list of available languages, select the supported languages for text recognition.

UFT 15.0 or earlier: You can select multiple non-hieroglyphic languages, or one of the hieroglyphic languages, such as Chinese, Japanese, or Korean.

UFT 15.0.1 or later: You can select multiple languages to support.
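The version-dependent selection rule above can be expressed as a small check. This is an illustrative Python sketch, not UFT code. It assumes that in UFT 15.0 or earlier a hieroglyphic language must be selected on its own, that an empty selection is invalid, and that the hieroglyphic set contains only the three languages named as examples above (the list in the Text Recognition pane may differ).

```python
# Assumption: only the three languages named as examples in the text.
HIEROGLYPHIC = {"Chinese", "Japanese", "Korean"}


def is_valid_selection(languages, uft_15_0_1_or_later):
    """Check a set of Abbyy language choices against the rule described above."""
    selected = set(languages)
    if not selected:
        # Assumption: at least one language must be selected.
        return False
    if uft_15_0_1_or_later:
        # UFT 15.0.1 or later: any combination of languages is allowed.
        return True
    if not (selected & HIEROGLYPHIC):
        # UFT 15.0 or earlier: any number of non-hieroglyphic languages.
        return True
    # UFT 15.0 or earlier: a hieroglyphic language is allowed only on its own.
    return len(selected) == 1


if __name__ == "__main__":
    print(is_valid_selection(["English", "German"], False))
    print(is_valid_selection(["Chinese", "English"], False))
    print(is_valid_selection(["Chinese", "English"], True))
```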

Language type (Baidu OCR engine only)

From the list of languages, select a single language to support for text recognition, or select Chinese and English.

Current language pack (Tesseract OCR engine only)

The current language pack to use in text recognition. When using the Tesseract engine, it is possible to use only one language pack at a time.

You can download additional language packs from the Tesseract OCR engine download site: https://sourceforge.net/projects/tesseract-ocr-alt/files/?source=navbar. After downloading, add the files from the language packs to the <UFT installation directory>/dat/tessdata folder.
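If you add language packs on several machines, the copy step could be scripted roughly as follows. This is a minimal Python sketch, not part of UFT. install_language_pack is a hypothetical helper, and it assumes the pack is already downloaded and extracted so that its files sit directly in pack_dir, and that you have write access to the UFT installation folder.

```python
import shutil
from pathlib import Path


def install_language_pack(pack_dir, uft_install_dir):
    """Copy every file in pack_dir into <uft_install_dir>/dat/tessdata.

    Returns the names of the copied files, sorted alphabetically.
    """
    target = Path(uft_install_dir) / "dat" / "tessdata"
    if not target.is_dir():
        # Fail early instead of creating a folder that UFT does not read from.
        raise FileNotFoundError(f"tessdata folder not found: {target}")
    copied = []
    for item in sorted(Path(pack_dir).iterdir()):
        if item.is_file():
            # copy2 also preserves file timestamps.
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied
```

For example, install_language_pack(r"C:\Temp\deu_pack", uft_dir) copies the extracted files into the tessdata folder under uft_dir. Both paths here are placeholders.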

Symbols for text recognition

(Tesseract OCR engine only)

Enter the list of characters you want UFT to recognize. When UFT runs the test, it performs text recognition only on the specified characters and ignores all others.

Fast mode

(Tesseract OCR engine only)

Select whether you want UFT to perform with greater text recognition accuracy or better test run performance. Clear the Fast mode checkbox to run with greater accuracy.

Use configuration from a file

(Tesseract OCR engine only)

Instructs UFT to load the text recognition configuration from an externally created file.

For details on creating a file, see http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version.

Preprocess the image before using text recognition

Instructs UFT to process the background image before performing text recognition. This enables UFT to identify the image elements before using text recognition.

Check the text recognition settings

  1. Create or open a test or component.
  2. Do any of the following:

    • Insert a text checkpoint or output value step (tests and scripted components only)

    • Insert a step that uses one of the following test object methods:

      • testobject.GetVisibleText

      • testobject.GetTextLocation

      • testobject.GetText (for Terminal Emulator objects)

    • Insert a step that uses one of the following reserved object methods (tests and scripted components only):

      • TextUtil.GetText

      • TextUtil.GetTextLocation

Adjust the settings as necessary

If the captured text is not as expected, analyze the problems and adjust the Text Recognition options to fine-tune the way UFT captures your text.
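When analyzing the problems, it can help to see exactly which characters were misread. The following minimal Python sketch runs outside UFT and lists the differences between the expected and captured text. list_mismatches is a hypothetical helper, and the sample strings are invented for illustration.

```python
from difflib import SequenceMatcher


def list_mismatches(expected, captured):
    """Return (operation, expected_part, captured_part) for each difference."""
    matcher = SequenceMatcher(None, expected, captured, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is "replace", "delete", or "insert".
            mismatches.append((tag, expected[i1:i2], captured[j1:j2]))
    return mismatches


if __name__ == "__main__":
    # Hypothetical expected text and OCR capture.
    for tag, want, got in list_mismatches("Total: 1,250.00", "Tota1: 1,25O.OO"):
        print(f"{tag}: expected {want!r}, captured {got!r}")
```

Recurring substitutions, such as a letter read as a digit, may point to a setting worth changing, for example the selected language or the text recognition mode.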
