Using Zebra for batch processing
As mentioned in my previous post, Zebra (more specifically zebraimg, its barcode recognition component) is fast and rather accurate. However, the program isn’t really made for batch processing of barcodes. Yes, you can execute zebraimg *.jpg, but it is then difficult to tell which barcodes were successfully recognised, which were ambiguous, and which failed. I initially ran this command to process my barcodes, but ended up having to cross-check the results by hand, which wasted quite a bit of time. I wanted the results formatted so that problematic barcodes are easy to identify.
A simple script would be able to format the recognition results. Because I am not familiar with shell scripting, I wrote one in PHP instead.
The gist of the script is that it scans the working directory for all JPEGs, runs zebraimg on each one and writes the result to a file. The result is saved in CSV format: each line holds the filename followed by zero or more recognised barcodes.
First, we define the location of the zebraimg executable, and the allowed file extensions. Note that zebraimg is not limited to JPEGs. However, since my barcodes are all JPEGs, I’m restricting the script to JPEGs.
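The definitions for this step might look something like the sketch below. The variable name $zebraimg and the path are assumptions for illustration; only $allowed_ext appears in the rest of the script.

```php
<?php
// Assumed location of the executable; adjust to wherever zebraimg is installed.
$zebraimg = '/usr/bin/zebraimg';

// Extensions to process, in lower case because the later check
// lowercases each file's extension before comparing.
$allowed_ext = array('jpg', 'jpeg');
```

Adding other image types later is then just a matter of extending the array.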
Next, we get the file listing in the current directory, and open result.txt for writing.
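A minimal sketch of this step, assuming scandir() for the listing and a hypothetical $files variable to hold it ($fp is the name used later in the script):

```php
<?php
// List everything in the current working directory.
// The listing also contains "." and "..", which the is_file() check filters out.
$files = scandir('.');

// Open the result file for writing, truncating the output of any previous run.
$fp = fopen('result.txt', 'w');
if ($fp === false) {
    exit("Could not open result.txt for writing\n");
}
```

The checks that follow would then run once per entry, inside a loop such as foreach ($files as $file).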
We then do some basic validity checks:
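if (!is_file($file)) {
    //echo $file . " is not a file\n";
    continue;
}
$fileinfo = pathinfo($file);
// Files without an extension have no 'extension' key, so default to ''.
$ext = isset($fileinfo['extension']) ? strtolower($fileinfo['extension']) : '';
if (!in_array($ext, $allowed_ext, true)) {
    continue;
}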
We are now ready to execute zebraimg on the current file:
I used the -q option so that zebraimg only returns the barcodes found, if any (and no other messages).
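The call itself could be sketched as follows, assuming PHP's exec() is used to capture one output line per barcode. The $zebraimg and $status names and the sample filename are hypothetical, and the echo line is inferred from the sample output further down.

```php
<?php
// Assumed values; in the full script these come from the earlier steps.
$zebraimg = 'zebraimg';       // hypothetical: executable name or full path
$file     = 'IMAGE_354.JPG';  // hypothetical: the current file in the loop

// Quote the filename so spaces or shell metacharacters cannot break the command.
$command = $zebraimg . ' -q ' . escapeshellarg($file);

// exec() appends to an existing array, so reset $results for every file.
$results = array();
exec($command, $results, $status);

echo 'Processing ' . $file . '... ' . count($results) . " barcodes found.\n";

// Start the CSV line with the filename; the barcodes are appended afterwards.
$towrite = $file;
```

Resetting $results matters inside a loop: without it, each file would inherit the barcodes of all the files before it.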
Next, we do some simple text processing to get the CSV format that we want, and write out the result.
foreach ($results as $result) {
    $towrite .= ',' . $result;
}
$towrite .= "\r\n";
fwrite($fp, $towrite);
And of course, close the foreach loop and the file pointer:
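This step presumably amounts to a closing brace and an fclose() call. A self-contained skeleton, with the loop body elided and $files as an assumed variable name, shows where they sit:

```php
<?php
// Hypothetical skeleton: only the opening and closing steps are shown.
$files = scandir('.');
$fp = fopen('result.txt', 'w');

foreach ($files as $file) {
    // ... validity checks, the zebraimg call and fwrite() go here ...
}   // close the foreach loop

// Flush any buffered output and release result.txt.
fclose($fp);
```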
Sample output from running the script:
Processing IMAGE_353.JPG... 0 barcodes found.
Processing IMAGE_354.JPG... 1 barcodes found.
Processing IMAGE_355.JPG... 1 barcodes found.
Processing IMAGE_356.JPG... 1 barcodes found.
Processing IMAGE_357.JPG... 1 barcodes found.
Processing IMAGE_358.JPG... 1 barcodes found.
Processing IMAGE_359.JPG... 2 barcodes found.
And the corresponding entries in the result file:
IMAGE_353.JPG
IMAGE_354.JPG,EAN-13:9789812046260
IMAGE_355.JPG,EAN-13:0071152006998
IMAGE_356.JPG,EAN-13:0070993007997
IMAGE_357.JPG,EAN-13:9780349112923
IMAGE_358.JPG,EAN-13:9780747236818
IMAGE_359.JPG,EAN-13:9781876095024,EAN-13:0633365095024
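With the results in this shape, picking out the problematic images becomes a matter of counting fields. The helper below is a hypothetical addition, not part of the script described here; it assumes filenames and barcodes contain no commas, and it treats a line with more than one barcode as ambiguous.

```php
<?php
// Classify one line of the result file by how many barcodes it carries.
function classify_result_line($line)
{
    $fields = explode(',', trim($line));
    $barcodes = count($fields) - 1;   // the first field is the filename
    if ($barcodes === 0) {
        return 'failed';
    }
    return $barcodes === 1 ? 'ok' : 'ambiguous';
}

// Three of the entries shown above.
$lines = array(
    'IMAGE_353.JPG',
    'IMAGE_354.JPG,EAN-13:9789812046260',
    'IMAGE_359.JPG,EAN-13:9781876095024,EAN-13:0633365095024',
);
foreach ($lines as $line) {
    echo classify_result_line($line) . ': ' . $line . "\n";
}
```

To check a whole run, the same function could be applied to each line returned by file('result.txt').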
There is, of course, a slight performance overhead incurred, but this is nothing compared to the effort of cross-checking the recognised barcodes manually.
What performance overhead are we talking about here? Running zebraimg *.JPG took around 10 seconds for 74 images. Running the script took around 13 seconds, so about 3 seconds extra. In absolute terms, this overhead is minimal, thanks to the speed at which zebraimg processes the images.