Skip to content

Latest commit

 

History

History
167 lines (148 loc) · 10.5 KB

README.md

File metadata and controls

167 lines (148 loc) · 10.5 KB

Handwriting Recognition in VR

video_screenshot.mp4

Description / Rationale

This repository contains various implementations of handwritten text recognition for web virtual reality. The repository was created to show the possibility of doing handwritten text recognition in web virtual reality environment based on API and without API or any server (only front end). I believe the use of handwritten text recognition can greatly enhance the experience and create some unique features. I hope this type of feature will become widely available in the near future.

There are two types of handwritten text recognition:

  • Handwritten text recognition based on stroke related data, which uses simple API access to the incredible handwriting recognition of Google IME and generates the results (i.e. API based).
  • Handwritten text recognition based on image analysis, which uses ML image classification model powered by Tensorflow.js (i.e. serverless and without any API).

The first type of handwritten text recognition allows to do the recognition of text, using Google IME API, of the majority of languages of the world. Notably, this API is used for doing handwritten text conversion on Android Devices. It is also used as part of INTERPLAY MODE created by Google Creative Lab, which demonstrates this API usage combined with video.

The second type of handwritten text recognition combines machine learning, computer vision and NLP and only recognizes English letters (A-z) and digits (0-9). Here is briefly how everything works in it:

  1. Segmentation is done using OpenCV.js, i.e. bounding box of each element based on contour in an image is calculated, then segmented and placed based on distance between bounding box x position (top left) and left corner of image. It results in several segmentations based on the total number of characters.
  2. Segmentation is then passed over to Tensorflow ML model (image segmentation task), imported and adapted from Keras model, which identifies to which class each segmented image corresponds.
  3. The text string is generated and passed over to words base, which analyzes it for correspondence and divides into meaningful words.
  4. At the end the text is displayed.

The second type of handwritten text recognition also has the following Tensorflow.js models, which are tiny and robust enough to be run on mobile devices (and therefore very suitable for web experiences):

  • Alphanumeric model (used in all examples).
  • Only letters models (16-bit and 32-bit floating-point types; see: "serverless" > "misc").

Instructions

The repository contains the following:

  • A-Frame based implementation (see: "serverless" > "a-frame-implementation" folder). It contains the last natural language processing (NLP) step (dividing into meaningful words).
  • Component for A-Frame (see: "serverless" > "a-frame-component" folder). It does not contain the last natural language processing (NLP) step (dividing into meaningful words).
  • Simple html implementation (see "serverless" > "simple-implementation" folder). It contains the last natural language processing (NLP) step (dividing into meaningful words).
  • A-Frame component with Google IME API (see "api" > "a-frame-component" folder).

To use A-Frame component (serveless one), please make sure to attach the following to element: handwriting-recognition texture-painter id="drawingArea" class="clickable". Below sample code is provided:

<html>
    <head>
    <title>Handwriting Recognition in VR: A-Frame Demo</title>
    <script src='https://aframe.io/releases/1.4.2/aframe.min.js'></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest/dist/tf.min.js"></script>
</head>
    <body>
        <a-scene>
           <a-plane handwriting-recognition texture-painter id="drawingArea" class="clickable" position="0 1.5 -4" rotation="0 0 0" width="5" height="4"></a-plane>
           <a-entity cursor="rayOrigin: mouse" raycaster="objects: .clickable;"></a-entity>
           <a-entity  button-listener class="controller" laser-controls="hand: left" raycaster="objects: .clickable;" line="color: #000000"> 
           <a-sky color='#ECECEC'></a-sky>
        </a-scene>
        <script src="handwriting-recognition.js"></script>
    </body>
</html>

Please note: This A-Frame component is attached after a-scene element. It does not have recognize and clear buttons for mouse clicks. It only supports VR mode with controllers. A-Frame implementation and component also support Quest 2 buttons: button X - recognize, button Y - clear.

Sample usage of A-Frame component (with API) is provided below:

<!DOCTYPE html>
<html>
<head>
    <title>Handwriting Recognition in VR: A-Frame Component with API</title>
     <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <script src='https://aframe.io/releases/1.4.0/aframe.min.js'></script>
    <script src="https://unpkg.com/aframe-troika-text/dist/aframe-troika-text.min.js"></script>
</head>
<body>
    <a-scene>
        <a-plane id="drawingArea" class="clickable" handwriting-recognition-api="handwritingLanguage: en; size: 20" position="0 1.5 -5" rotation="0 0 0" width="5" height="4"></a-plane>
        
        <a-entity id="outputText" position="0 0.2 -4" 
        troika-text="value: Console; color: black"></a-entity>
        <a-entity id="send" text="value: Send; align: center; width: 3;" position="-2.5 0.2 -4" class="clickable" geometry="primitive: plane; height: 0.3" material="color: black">
        </a-entity>
        <a-entity id="clear" text="value: Clear; align: center; width: 3;" position="2.5 0.2 -4" class="clickable" geometry="primitive: plane; height: 0.3" material="color: black">
        </a-entity>
       
        <a-entity cursor="rayOrigin: mouse" raycaster="objects: .clickable;"></a-entity>
        <a-entity  class="controller" laser-controls="hand: left" raycaster="objects: .clickable;" line="color: #000000"></a-entity> 
        <a-sky color="#ECECEC" rotation="0 -90 0"></a-sky>
    </a-scene>
    <script src='handwriting-recognition-api.js'></script>

</body>

</html>

It has the following attributes/schemas:

  • color: { type: "color", default: "black" } - Color of stroke.
  • size: { type: "int", default: 20 } - Size of stroke
  • background: { type: "color", default: "white" } - Plane background color.
  • clearAll: { type: "boolean", default: false } - Whether clearAll is enabled.
  • language: { type: "string", default: "en"} = Language in which handwrtitten text should be recognized.

Please note: In this example we are using troika text component, which allows to show text in other languages.

Language Codes

The following is a list of language codes, which can be used with A-Frame component using Google IME API:

Language code
Afrikaans af
Albanian sq
Basque eu
Belarusian be
Bulgarian bg
Catalan ca
Chinese (Simplified) zh_CN
Chinese (Traditional) zh_TW
Croatian hr
Czech cs
Danish da
Dutch nl
English en
Estonian et
Filipino fil
Finnish fi
French fr
Galician gl
German de
Greek el
Haitian ht
Hindi hi
Hungarian hu
Icelandic is
Indonesian id
Irish ga
Italian it
Japanese ja
Korean ko
Latin la
Latvian lv
Lithuanian lt
Macedonian mk
Malay ms
Norwegian no
Polish pl
Portuguese (Brazil) pt_BR
Portuguese (Portugal) pt_PT
Romanian ro
Russian ru
Serbian sr
Slovak sk
Slovenian sl
Spanish es
Swahili sw
Swedish sv
Thai th
Turkish tr
Ukranian yk
Vietnamese vi
Welsh cy

Updates

It is definitely possible to add other ML language models and therefore do handwriting recognition in that language. Soon will add new language model. In addition, will be providing small tutorial on how to train own model.

Tech Stack

Handwritten text recognition is powered by AFrame, Three.js and OpenCV.js and Tensorflow.js. It uses updated/modified texture painter component, which is part of Whiteboard VR by Marlon Lückert. The code related to API was developed based on the example provided in Chen Yu Ho's Handwriting.js repository, and Amit Agarwal's blog post "Google Handwriting IME API Request". It should be noted though there is very little information on the use of this IME API!

To learn more about OpenCV.js and its various uses, please refer to: https://github.com/akbartus/OpenCV-Examples-in-JavaScript. To see another creative use of drawing in web VR, please refer to: https://github.com/akbartus/VR-Doodle-Painter.
To see handsfree handwriting recognition, using similar functionality refer to: https://github.com/akbartus/Web-Based-Touchfree-Handwriting-Recognition

Demo

The repository contains the following implementations/demos: