Application Kata "BankOCR"

Write a program that recognizes account numbers in ASCII files.

OCR stands for Optical Character Recognition. Of course, it would be asking a lot to write OCR software as part of an exercise. But even if you drastically reduce the task, it remains interesting.

The ASCII files to be read contain sequences of digits, each encoded across three lines. The following illustration shows how the digits are displayed.

[Illustration: the digits as they appear in the three-line ASCII format]

Each digit is three characters wide and three lines high. Consecutive digits are separated by a space. Consecutive lines with sequences of digits are separated by a blank line. The digits consist of the characters "_" (underscore) and "I" (capital i).
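
The layout rule above (three columns per digit, one separating space) can be sketched in Python as follows. The glyph table is an assumption: the illustration is not reproduced in this text, so the shapes below are guessed seven-segment patterns built from "_" and "I" and should be checked against the real test data. The names GLYPHS, split_blocks and recognize are hypothetical.

```python
# ASSUMED glyphs: seven-segment style shapes built from "_" and "I".
# The kata's illustration is not available here; adjust to the real test data.
GLYPHS = {
    (" _ ", "I I", "I_I"): "0",
    ("   ", "  I", "  I"): "1",
    (" _ ", " _I", "I_ "): "2",
    (" _ ", " _I", " _I"): "3",
    ("   ", "I_I", "  I"): "4",
    (" _ ", "I_ ", " _I"): "5",
    (" _ ", "I_ ", "I_I"): "6",
    (" _ ", "  I", "  I"): "7",
    (" _ ", "I_I", "I_I"): "8",
    (" _ ", "I_I", " _I"): "9",
}

DIGIT_WIDTH = 3
STRIDE = DIGIT_WIDTH + 1  # three glyph columns plus one separating space


def split_blocks(rows):
    """Cut three text rows into a list of 3x3 glyph tuples."""
    width = max(len(row) for row in rows)
    count = (width + STRIDE - 1) // STRIDE
    # Pad the rows so that trailing spaces stripped by an editor
    # do not shorten the last glyph.
    padded = [row.ljust(count * STRIDE) for row in rows]
    return [
        tuple(row[i * STRIDE : i * STRIDE + DIGIT_WIDTH] for row in padded)
        for i in range(count)
    ]


def recognize(rows):
    """Translate three rows into a digit string (KeyError on an unknown glyph)."""
    return "".join(GLYPHS[block] for block in split_blocks(rows))
```

Using whole 3x3 blocks as dictionary keys keeps the recognition step a plain lookup. Raising KeyError on an unknown block is acceptable here only because the base task guarantees correctly structured files.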

The program is called with a file name as a parameter and outputs the recognized digits as the result. The structure of the files is always correct.

C:> bankocr file1.txt
1234567890
815
42
07
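
The file handling described above can be sketched as below. This is a minimal sketch, not a reference solution: group_entries and run are hypothetical names, and the recognize argument stands for whatever function turns three rows into a digit string. The sketch steps through the file in fixed groups of four lines instead of splitting on blank lines, because the top row of a number may itself be blank (for instance, if the digit 1 has no top bar).

```python
ROWS_PER_ENTRY = 3
ENTRY_STRIDE = ROWS_PER_ENTRY + 1  # three digit rows plus one blank separator


def group_entries(text):
    """Split file content into lists of three rows, one list per number."""
    lines = text.splitlines()
    entries = []
    for start in range(0, len(lines), ENTRY_STRIDE):
        rows = lines[start:start + ROWS_PER_ENTRY]
        if len(rows) < ROWS_PER_ENTRY:
            break  # only trailing blank lines are left
        entries.append(rows)
    return entries


def run(path, recognize):
    """Print one recognized number per entry of the file at `path`.

    `recognize` is a caller-supplied function (hypothetical here) that
    maps a list of three rows to the recognized digit string.
    """
    with open(path, encoding="ascii") as handle:
        for rows in group_entries(handle.read()):
            print(recognize(rows))
```

A command-line entry point would then call run(sys.argv[1], recognize) with the file name passed as the parameter.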

Test data can be found here.

Variation #1

Files can also be structured incorrectly. The line structure is still correct, i.e. every three lines combine to form digits and the blank separator lines are present. Within those lines, however, other characters may appear or the characters may be arranged incorrectly. Lines with errors should be output as "Faulty line":

C:> bankocr file1.txt
1234567890
Faulty line
42
07
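
One way to handle this variation is to treat every block that is not in the glyph table, and every non-space character in a separator column, as an error. The sketch below makes those assumptions explicit; recognize_checked is a hypothetical name, and the glyph table is passed in by the caller because the real digit shapes are not reproduced in this text. The message is the one used in the sample output.

```python
FAULTY = "Faulty line"  # message as it appears in the sample output


def recognize_checked(rows, glyphs, width=3):
    """Return the digits for three rows, or FAULTY if anything is off.

    `glyphs` maps a tuple of three strings (one per row) to a digit
    character; its contents are supplied by the caller.
    """
    stride = width + 1  # glyph columns plus one separating space
    longest = max(len(row) for row in rows)
    count = (longest + stride - 1) // stride
    padded = [row.ljust(count * stride) for row in rows]
    digits = []
    for i in range(count):
        start = i * stride
        block = tuple(row[start:start + width] for row in padded)
        # Assumption: the column after each glyph must contain only spaces.
        separator = "".join(row[start + width] for row in padded)
        if block not in glyphs or separator.strip():
            return FAULTY
        digits.append(glyphs[block])
    return "".join(digits)
```

Returning a marker string instead of raising an exception lets the calling loop keep printing the remaining lines of the file, which matches the sample output where valid numbers follow the faulty one.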