Get the bounding box of text when using OCR

    Question

  • I'm playing around with the U-SQL cognitive extensions and trying to use OCR to get text from images. The result returns all the text in an image concatenated into a single line. Is there a way to extract the text together with its bounding boxes, like the OCR in the Azure Computer Vision API does?

    Here is my U-SQL code.

    // Reference the cognitive extension assemblies.
    REFERENCE ASSEMBLY ImageCommon;
    REFERENCE ASSEMBLY FaceSdk;
    REFERENCE ASSEMBLY ImageEmotion;
    REFERENCE ASSEMBLY ImageTagging;
    REFERENCE ASSEMBLY ImageOcr;

    // Read every .TIF file under /images as a (FileName, ImgData) row.
    @imgs =
        EXTRACT FileName string, ImgData byte[]
        FROM @"/images/{FileName:*}.TIF"
        USING new Cognition.Vision.ImageExtractor();

    // Run OCR on each image; Text holds the recognized text for that file.
    @ocrs =
        PROCESS @imgs
        PRODUCE FileName,
                Text string
        READONLY FileName
        USING new Cognition.Vision.OcrExtractor();

    // Write the results to a text file.
    OUTPUT @ocrs
    TO @"/images_output/test.txt"
    USING Outputters.Text();
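
    To make clear what I am after: as far as I know, the Computer Vision OCR endpoint returns JSON nested as regions, lines, and words, each with a boundingBox string in "left,top,width,height" form. The Python sketch below flattens such a response into word/box pairs. The sample JSON is made up for illustration, and the helper names parse_box and words_with_boxes are hypothetical, not part of any SDK. I would like something equivalent from the U-SQL OcrExtractor.

```python
import json

# Made-up sample shaped like a Computer Vision OCR response (not real output).
SAMPLE_RESPONSE = """
{
  "language": "en",
  "regions": [
    {
      "boundingBox": "21,16,304,90",
      "lines": [
        {
          "boundingBox": "28,16,288,41",
          "words": [
            {"boundingBox": "28,16,120,41", "text": "HELLO"},
            {"boundingBox": "160,16,156,41", "text": "WORLD"}
          ]
        }
      ]
    }
  ]
}
"""


def parse_box(box):
    """Turn an "x,y,width,height" string into a tuple of four ints."""
    x, y, width, height = (int(part) for part in box.split(","))
    return x, y, width, height


def words_with_boxes(response):
    """Flatten regions -> lines -> words into (text, (x, y, w, h)) pairs."""
    results = []
    for region in response.get("regions", []):
        for line in region.get("lines", []):
            for word in line.get("words", []):
                results.append((word["text"], parse_box(word["boundingBox"])))
    return results


if __name__ == "__main__":
    # Print one word per line together with its bounding box.
    for text, box in words_with_boxes(json.loads(SAMPLE_RESPONSE)):
        print(text, box)
```

    Each row of that output (word plus x, y, width, height) is the shape I would like to get per image, instead of a single concatenated Text column.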

    Thank you.


    • Edited by Art Bkk Wednesday, July 5, 2017 5:01 AM
    Tuesday, July 4, 2017 5:48 AM