Testing Project Oxford OCR: How to use a local file? (in base64, for example)

    Question

  • Hello,

    I'm beginning to use the Project Oxford API, specifically to assist in OCR of scanned images.

    My starting point is an example that I found at:

    https://social.msdn.microsoft.com/Forums/en-US/52ce5aef-b3ac-4ba7-ba64-da539d700001/mvp-how-to-using-project-oxford-vision-api-with-javascript?forum=mlapi

    This example uses:

          $.ajax({
             url: 'https://api.projectoxford.ai/vision/v1/analyses?' + $.param(params),
             type: 'POST',
             contentType: 'application/json',
             data: '{ "Url": "http://images.takungpao.com/2012/1115/20121115073901672.jpg" }',
          })

    What I'd like to do is test with a local file. I understand that, for security reasons, I cannot reference a local file directly, so I'd like to include the base64 encoding of the image in the data parameter.

    I've tried

             data: ['..........gAooooAKKKKACiiigD//Z'],

    but I'm getting the response

    {"code":"BadArgument","requestId":"eb0a85e4-aa9e-4a66-865e-231b2746c84d","message":"JSON format error."}

    Does anyone have any pointers as to how I can do this?

    I suppose I could crank up a local server and test that way, but I'm curious about how to test directly with the base64 encoding.

    Best regards,

    Colm O'G

    Wednesday, December 02, 2015 4:08 PM

Answers

  • If you set the data to a Blob type, jQuery will send the raw binary as the body, which is what you want.  Here's some code you can play with:

    <html>
    <head>
        <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
        <script type="text/javascript">
            // makeblob cribbed from https://github.com/ebidel/filer.js/blob/master/src/filer.js#L137
            var makeblob = function (dataURL) {
                var BASE64_MARKER = ';base64,';
                var parts, contentType, raw;
                if (dataURL.indexOf(BASE64_MARKER) == -1) {
                    // Not base64: the payload after the first comma is URL-encoded text
                    parts = dataURL.split(',');
                    contentType = parts[0].split(':')[1];
                    raw = decodeURIComponent(parts[1]);
                    return new Blob([raw], { type: contentType });
                }
                // Base64: decode to a binary string, then copy each byte into a typed array
                parts = dataURL.split(BASE64_MARKER);
                contentType = parts[0].split(':')[1];
                raw = window.atob(parts[1]);
                var rawLength = raw.length;
    
                var uInt8Array = new Uint8Array(rawLength);
    
                for (var i = 0; i < rawLength; ++i) {
                    uInt8Array[i] = raw.charCodeAt(i);
                }
    
                return new Blob([uInt8Array], { type: contentType });
            };
            var ocr = function () {
                $.ajax({
                    url: "https://api.projectoxford.ai/vision/v1/ocr",
                    beforeSend: function (xhrObj) {
                        // Request headers
                        xhrObj.setRequestHeader("Content-Type", "application/octet-stream");
                        xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key", "YOUR_API_KEY");
                    },
                    type: "POST",
                    // The DataURL will be something like "data:image/png;base64,{image-data-in-base64}"
                    data: makeblob(document.getElementById("c").toDataURL()),
                    // Send the Blob as-is instead of letting jQuery turn it into a query string
                    processData: false
                })
                .done(function (data, status) {
                    // Display the OCR result
                    console.log(JSON.stringify(data));
                })
                .fail(function (xhr, status, err) {
                    // Log the failure, including the error body returned by the service, if any
                    console.error(status, err, xhr.responseText);
                });
            };
        </script>
    </head>
    <body>
        <canvas id="c" width="100" height="50"></canvas>
        <script type="text/javascript">
            var ctx = document.getElementById("c").getContext("2d");
            ctx.fillStyle = "black";
            ctx.fillRect(0, 0, 100, 50);
            ctx.font = "20px serif";
            ctx.fillStyle = "white";
            ctx.fillText("Hello!", 35, 45);
        </script>
        <button onclick="ocr();">ocr!</button>
    </body>
    </html>
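
For the case in the question, where you already hold a bare base64 string rather than a canvas, here is a minimal sketch of two ways to get from that string to raw bytes. The helper names `base64ToBytes` and `wrapAsDataURL` are made up for illustration and are not part of jQuery or the Project Oxford API; the content type is whatever your image actually is.

```javascript
// Hypothetical helper: decode a bare base64 string (no "data:...;base64," prefix)
// into a Uint8Array of raw bytes.
function base64ToBytes(base64) {
    var raw = atob(base64); // binary string, one character per byte
    var bytes = new Uint8Array(raw.length);
    for (var i = 0; i < raw.length; ++i) {
        bytes[i] = raw.charCodeAt(i);
    }
    return bytes;
}

// Hypothetical helper: prefix a bare base64 string so it has the data URL
// shape that makeblob expects.
function wrapAsDataURL(base64, contentType) {
    return "data:" + contentType + ";base64," + base64;
}
```

Assuming the image is a JPEG, either `makeblob(wrapAsDataURL(b64, "image/jpeg"))` or `new Blob([base64ToBytes(b64)], { type: "image/jpeg" })` should give you a Blob that can be passed as `data` in the `$.ajax` call above.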

    Wednesday, December 02, 2015 10:44 PM
    Moderator

All replies

  • Also, if you just want to specify a local file (using <input type=file>, for example), look at the example here:

    Unable to use Image data as a parameter to the OCR API
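
As a rough sketch of the file-input route (not the code from the linked thread): a `File` picked through `<input type="file">` is already a Blob, so it can be passed as `data` without any base64 step. The helper `buildOcrSettings` and the element id `f` are assumptions for illustration; the endpoint and header names are the ones used earlier in this thread.

```javascript
// Hypothetical helper: build the $.ajax settings for posting raw image bytes.
function buildOcrSettings(imageBody, apiKey) {
    return {
        url: "https://api.projectoxford.ai/vision/v1/ocr",
        type: "POST",
        contentType: "application/octet-stream",
        headers: { "Ocp-Apim-Subscription-Key": apiKey },
        data: imageBody,
        // Send the body as-is instead of serializing it to a query string
        processData: false
    };
}

// Browser-only wiring; skipped when there is no DOM or jQuery available.
// Assumes the page contains <input type="file" id="f"> (hypothetical id).
if (typeof document !== "undefined" && typeof $ !== "undefined") {
    var input = document.getElementById("f");
    if (input) {
        input.addEventListener("change", function () {
            if (!input.files.length) { return; }
            $.ajax(buildOcrSettings(input.files[0], "YOUR_API_KEY"))
                .done(function (data) { console.log(JSON.stringify(data)); })
                .fail(function (xhr, status, err) { console.error(status, err); });
        });
    }
}
```

The settings are built in a separate function only so that the request shape can be inspected without a browser; inlining the object into `$.ajax` works the same way.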

    Wednesday, December 02, 2015 11:46 PM
    Moderator
  • makeblob worked like a charm! I used it with the Emotion API. Can my webcam quality affect the result?
    Wednesday, December 30, 2015 2:34 PM
  • Speaking generally, having higher quality means better recognition.  So both pixel density and frame rate will potentially have an impact on accuracy.
    Monday, January 04, 2016 5:11 PM
    Moderator
  • Thanks @cthrash99. So it's not actually my concern. :D (jk)
    Saturday, January 09, 2016 8:07 AM