This repository contains several implementations of handwritten text recognition for web virtual reality. It was created to show that handwritten text recognition in a web VR environment is possible both with an API and without any API or server (front end only). I believe handwritten text recognition can greatly enhance the VR experience and enable some unique features, and I hope this type of feature will become widely available in the near future.
There are two types of handwritten text recognition:
- Handwritten text recognition based on stroke data, which sends the strokes to the Google IME handwriting recognition API and returns the results (i.e. API based).
- Handwritten text recognition based on image analysis, which uses an ML image classification model powered by TensorFlow.js (i.e. serverless and without any API).
The first type uses the Google IME API to recognize text in most of the world's languages. Notably, this API is used for handwritten text conversion on Android devices. It is also used in INTERPLAY MODE, created by Google Creative Lab, which demonstrates the API combined with video.
The second type combines machine learning, computer vision and NLP, and only recognizes English letters (A-z) and digits (0-9). Briefly, it works as follows:
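To make "stroke data" more concrete, here is a minimal sketch of how drawn strokes could be packed into a request body. The field names (`options`, `writing_guide`, `ink`, `language`) and the endpoint follow the request format used in community examples such as Handwriting.js, as far as I understand it; the API is not officially documented, so treat all of them as assumptions. The helper names (`strokesToInk`, `buildImeRequest`, `recognize`) and the `{x, y, t}` point format are hypothetical.

```javascript
// Hypothetical helper: convert strokes (arrays of {x, y, t} points) into the
// "ink" layout assumed here: one [xs, ys, ts] triple per stroke.
function strokesToInk(strokes) {
  return strokes.map((stroke) => [
    stroke.map((p) => p.x),
    stroke.map((p) => p.y),
    stroke.map((p) => p.t),
  ]);
}

// Build the JSON body for one recognition request (assumed field names).
function buildImeRequest(strokes, width, height, language = "en") {
  return {
    options: "enable_pre_space",
    requests: [
      {
        writing_guide: {
          writing_area_width: width,
          writing_area_height: height,
        },
        ink: strokesToInk(strokes),
        language,
      },
    ],
  };
}

// Assumed endpoint, taken from community examples rather than official docs.
const IME_URL =
  "https://www.google.com/inputtools/request?ime=handwriting&app=mobilesearch&cs=1&oe=UTF-8";

// Send the request. Not called here because it needs network access.
async function recognize(strokes, width, height, language) {
  const response = await fetch(IME_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildImeRequest(strokes, width, height, language)),
  });
  return response.json();
}
```

Keeping the payload builder separate from the network call makes it easy to inspect what would be sent before relying on an undocumented service.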
- Segmentation is done using OpenCV.js: the bounding box of each contour in the image is calculated, and each box is cropped out and ordered by the distance between its top-left x position and the left edge of the image. This yields one segment per character.
- Each segment is then passed to a TensorFlow.js ML model (image classification task), imported and adapted from a Keras model, which identifies the class each segmented image belongs to.
- The resulting text string is passed to a word base, which checks it for matches and divides it into meaningful words.
- Finally, the text is displayed.
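The ordering and labelling steps above can be sketched in plain JavaScript. This is a minimal sketch, not the repository's code: it assumes the bounding boxes have already been extracted (for example with OpenCV.js) as `{x, y, width, height}` objects and that the model returns one array of scores per segment. The `CLASS_LABELS` order and all function names are hypothetical; the real label order depends on how the model was trained.

```javascript
// Hypothetical label order: digits, then uppercase, then lowercase letters.
const CLASS_LABELS =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".split("");

// Order character boxes left to right by the x position of their top-left corner.
function sortBoxesLeftToRight(boxes) {
  return [...boxes].sort((a, b) => a.x - b.x);
}

// Return the index of the largest score (the predicted class).
function argmax(scores) {
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}

// Turn per-segment score arrays (already in left-to-right order) into a string.
function decodePredictions(scoreRows, labels = CLASS_LABELS) {
  return scoreRows.map((scores) => labels[argmax(scores)]).join("");
}
```

Sorting a copy of the boxes rather than the original array keeps the contour order available in case it is needed elsewhere.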
The second type comes with the following TensorFlow.js models, which are small and robust enough to run on mobile devices (and are therefore well suited to web experiences):
- Alphanumeric model (used in all examples).
- Letters-only models (16-bit and 32-bit floating-point types; see: "serverless" > "misc").
The repository contains the following:
- A-Frame based implementation (see: "serverless" > "a-frame-implementation" folder). It contains the last natural language processing (NLP) step (dividing into meaningful words).
- Component for A-Frame (see: "serverless" > "a-frame-component" folder). It does not contain the last natural language processing (NLP) step (dividing into meaningful words).
- Simple HTML implementation (see: "serverless" > "simple-implementation" folder). It contains the last natural language processing (NLP) step (dividing into meaningful words).
- A-Frame component with Google IME API (see "api" > "a-frame-component" folder).
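The last NLP step mentioned above, dividing the recognized string into meaningful words, can be illustrated with a small dictionary-based word-break routine. This is a sketch of one common technique (dynamic programming over a word list) and not necessarily what the repository does; `WORDS` is a tiny hypothetical stand-in for the word base, and `splitIntoWords` is a hypothetical name.

```javascript
// Tiny hypothetical word list; a real word base would be much larger.
const WORDS = new Set(["hello", "world", "hand", "writing", "in", "vr"]);

// Split text into dictionary words; return null if no full split exists.
function splitIntoWords(text, words = WORDS) {
  const lower = text.toLowerCase();
  // best[i] holds one valid split of lower.slice(0, i), or null if none exists.
  const best = new Array(lower.length + 1).fill(null);
  best[0] = [];
  for (let end = 1; end <= lower.length; end++) {
    for (let start = 0; start < end; start++) {
      const piece = lower.slice(start, end);
      if (best[start] !== null && words.has(piece)) {
        best[end] = [...best[start], piece];
        break; // keep the first split found, which has the longest final word
      }
    }
  }
  return best[lower.length];
}
```

For example, `splitIntoWords("HelloWorld")` yields `["hello", "world"]` with the word list above, while a string that cannot be covered by dictionary words yields `null`, so the caller can fall back to showing the raw text.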
To use the A-Frame component (the serverless one), attach the following to the element: handwriting-recognition texture-painter id="drawingArea" class="clickable". Sample code is provided below:
<html>
<head>
<title>Handwriting Recognition in VR: A-Frame Demo</title>
<script src='https://aframe.io/releases/1.4.2/aframe.min.js'></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest/dist/tf.min.js"></script>
</head>
<body>
<a-scene>
<a-plane handwriting-recognition texture-painter id="drawingArea" class="clickable" position="0 1.5 -4" rotation="0 0 0" width="5" height="4"></a-plane>
<a-entity cursor="rayOrigin: mouse" raycaster="objects: .clickable;"></a-entity>
<a-entity button-listener class="controller" laser-controls="hand: left" raycaster="objects: .clickable;" line="color: #000000"></a-entity>
<a-sky color='#ECECEC'></a-sky>
</a-scene>
<script src="handwriting-recognition.js"></script>
</body>
</html>
Please note: The component script is loaded after the a-scene element. The component does not have recognize and clear buttons for mouse clicks; it only supports VR mode with controllers. The A-Frame implementation and component also support Quest 2 buttons: button X to recognize, button Y to clear.
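The controller mapping described above could be wired up roughly as follows. A-Frame's Oculus Touch controls emit `xbuttondown` and `ybuttondown` events on the controller entity; `bindControllerButtons` and the two callbacks are hypothetical names for this sketch, and the repository's own button-listener component may be organized differently.

```javascript
// Attach recognize/clear handlers to a controller entity (any EventTarget).
function bindControllerButtons(controllerEl, { onRecognize, onClear }) {
  const handleX = () => onRecognize();
  const handleY = () => onClear();
  controllerEl.addEventListener("xbuttondown", handleX); // X: recognize
  controllerEl.addEventListener("ybuttondown", handleY); // Y: clear
  // Return a cleanup function so the handlers can be removed later.
  return () => {
    controllerEl.removeEventListener("xbuttondown", handleX);
    controllerEl.removeEventListener("ybuttondown", handleY);
  };
}
```

In a page, this could be called with `document.querySelector(".controller")` as the first argument once the scene has loaded.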
Sample usage of the A-Frame component (with API) is provided below:
<!DOCTYPE html>
<html>
<head>
<title>Handwriting Recognition in VR: A-Frame Component with API</title>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<script src='https://aframe.io/releases/1.4.0/aframe.min.js'></script>
<script src="https://unpkg.com/aframe-troika-text/dist/aframe-troika-text.min.js"></script>
</head>
<body>
<a-scene>
<a-plane id="drawingArea" class="clickable" handwriting-recognition-api="handwritingLanguage: en; size: 20" position="0 1.5 -5" rotation="0 0 0" width="5" height="4"></a-plane>
<a-entity id="outputText" position="0 0.2 -4"
troika-text="value: Console; color: black"></a-entity>
<a-entity id="send" text="value: Send; align: center; width: 3;" position="-2.5 0.2 -4" class="clickable" geometry="primitive: plane; height: 0.3" material="color: black">
</a-entity>
<a-entity id="clear" text="value: Clear; align: center; width: 3;" position="2.5 0.2 -4" class="clickable" geometry="primitive: plane; height: 0.3" material="color: black">
</a-entity>
<a-entity cursor="rayOrigin: mouse" raycaster="objects: .clickable;"></a-entity>
<a-entity class="controller" laser-controls="hand: left" raycaster="objects: .clickable;" line="color: #000000"></a-entity>
<a-sky color="#ECECEC" rotation="0 -90 0"></a-sky>
</a-scene>
<script src='handwriting-recognition-api.js'></script>
</body>
</html>
The component has the following schema properties:
- color: { type: "color", default: "black" } - Color of stroke.
- size: { type: "int", default: 20 } - Size of stroke.
- background: { type: "color", default: "white" } - Plane background color.
- clearAll: { type: "boolean", default: false } - Whether clearAll is enabled.
- language: { type: "string", default: "en" } - Language in which handwritten text should be recognized.
Please note: This example uses the troika-text component, which makes it possible to display text in other languages.
The following language codes can be used with the A-Frame component that uses the Google IME API:
Language | code |
---|---|
Afrikaans | af |
Albanian | sq |
Basque | eu |
Belarusian | be |
Bulgarian | bg |
Catalan | ca |
Chinese (Simplified) | zh_CN |
Chinese (Traditional) | zh_TW |
Croatian | hr |
Czech | cs |
Danish | da |
Dutch | nl |
English | en |
Estonian | et |
Filipino | fil |
Finnish | fi |
French | fr |
Galician | gl |
German | de |
Greek | el |
Haitian | ht |
Hindi | hi |
Hungarian | hu |
Icelandic | is |
Indonesian | id |
Irish | ga |
Italian | it |
Japanese | ja |
Korean | ko |
Latin | la |
Latvian | lv |
Lithuanian | lt |
Macedonian | mk |
Malay | ms |
Norwegian | no |
Polish | pl |
Portuguese (Brazil) | pt_BR |
Portuguese (Portugal) | pt_PT |
Romanian | ro |
Russian | ru |
Serbian | sr |
Slovak | sk |
Slovenian | sl |
Spanish | es |
Swahili | sw |
Swedish | sv |
Thai | th |
Turkish | tr |
Ukrainian | uk |
Vietnamese | vi |
Welsh | cy |
It is definitely possible to add other ML language models and thereby do handwriting recognition in those languages. A new language model will be added soon, along with a short tutorial on how to train your own model.
Handwritten text recognition is powered by A-Frame, Three.js, OpenCV.js and TensorFlow.js. It uses an updated/modified texture painter component, which is part of Whiteboard VR by Marlon Lückert. The code related to the API was developed based on the example provided in Chen Yu Ho's Handwriting.js repository and Amit Agarwal's blog post "Google Handwriting IME API Request". It should be noted, though, that there is very little information on the use of this IME API!
To learn more about OpenCV.js and its various uses, please refer to: https://github.com/akbartus/OpenCV-Examples-in-JavaScript.
To see another creative use of drawing in web VR, please refer to: https://github.com/akbartus/VR-Doodle-Painter.
To see hands-free handwriting recognition using similar functionality, please refer to: https://github.com/akbartus/Web-Based-Touchfree-Handwriting-Recognition.
The repository contains the following implementations/demos:
- Serverless:
- API based: