How to Train a Simple Recognition and Extraction Model with Voyance Vision.

VoyanceHQ, Aug 09, 2021


Voyance Vision is an exciting technology that allows users to automatically detect objects (regions of interest) in images and extract text from those regions.

With Vision, you can capture data from documents (such as invoices and ID cards) and export it to CSV or JSON via an API call.
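
As a rough illustration of the two export formats, the sketch below converts extraction results from JSON into CSV using only the Python standard library. The JSON shape (a list of flat objects with a `file` key plus one key per extracted field), the sample values and the function name `results_to_csv` are all assumptions made for this example; they are not Voyance Vision's documented response format.

```python
import csv
import io
import json


def results_to_csv(results_json: str) -> str:
    """Convert a JSON array of flat records into CSV text.

    Assumed (hypothetical) input shape: a list of objects that all
    share the same keys, e.g. {"file": ..., "company_name": ...}.
    """
    records = json.loads(results_json)
    if not records:
        return ""
    # Use the keys of the first record as the CSV header.
    fieldnames = list(records[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample data, for illustration only.
sample = json.dumps([
    {"file": "invoice_001.png", "company_name": "Acme Ltd",
     "invoice_number": "INV-1001", "total_amount": "250.00"},
])
print(results_to_csv(sample))
```

Keeping one row per document and one column per field means the CSV opens directly in a spreadsheet.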

The platform lets you define the regions of interest in your dataset (images) and uses state-of-the-art computer vision algorithms to encode that information and make accurate predictions.

Sign up for Voyance Vision here.

Now that you have signed up, sign in to your account and let's get started!

We are going to create a demo model that extracts information from invoices using Voyance Vision.

Requirements.

  1. Minimum of 40 images. For good results, it is crucial that the uploaded images are consistent with one another and representative of the images you expect to use the model on.
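
If your images sit in a local folder, a quick check like the sketch below can confirm that you meet the 40-image minimum before you upload. The set of file extensions and the function names are assumptions for this example; the article does not list the image formats that Voyance Vision accepts.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MIN_IMAGES = 40  # minimum stated in the requirement above


def count_images(folder: str) -> int:
    """Count files directly inside `folder` that have an image extension."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def has_enough_images(folder: str, minimum: int = MIN_IMAGES) -> bool:
    """Return True if the folder holds at least `minimum` images."""
    return count_images(folder) >= minimum
```

For example, `has_enough_images("invoices")` returns `True` only when the (hypothetical) `invoices` folder contains 40 or more image files. The check counts files only; it cannot tell you whether the images are consistent or representative.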

Steps

1. First, we upload and annotate our images. Annotating means drawing bounding boxes around the regions of each image that we want to extract text from. Click the “create model” icon at the top right of the screen to get started.

On the next page, you can either use a pre-trained model or train your own model from scratch. Pre-trained models have already been trained and optimised and are ready for use. Currently, we provide pre-trained models for extracting ID information from international passports, and we will release support for more document types in due course.

For this tutorial, we will train a model from scratch, so select “Create your Model”.

2. Next, we create labels for our model. Labels are the names of the fields in the document that we want to extract data from. They vary from document to document and depend entirely on the use case. For this tutorial, we will extract text from three fields (company name, invoice number and total amount) on invoices that all follow the same template. Our labels will therefore be company_name, invoice_number and total_amount.
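
The label names above follow a lowercase, underscore-separated convention. As a small sketch of that convention, the helper below derives such a label from a human-readable field name. The function `to_label` is hypothetical and is not part of Voyance Vision; in the app you simply type the label names yourself.

```python
import re


def to_label(field_name: str) -> str:
    """Turn a human-readable field name into a snake_case label."""
    # Replace every run of non-alphanumeric characters with one underscore.
    slug = re.sub(r"[^0-9a-zA-Z]+", "_", field_name.strip())
    return slug.strip("_").lower()


fields = ["Company Name", "Invoice Number", "Total Amount"]
labels = [to_label(name) for name in fields]
print(labels)  # ['company_name', 'invoice_number', 'total_amount']
```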


3. Next, we upload the images that will serve as the training dataset. Voyance Vision supports uploading images from three data sources: Amazon S3 buckets, Google Drive and your local machine. To train a model, you need at least 40 images, and they should be a good representation of the images the model will be used on.


4. After uploading, it’s time to annotate your images. Annotation means labelling the images by drawing a bounding box around each area that we want to extract text from. These bounding boxes tell the model which regions of an image to read. Do this for every uploaded image.
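
To make the idea of an annotation concrete, here is a minimal sketch of how a labelled bounding box could be represented and checked. The pixel-coordinate layout, the `BoundingBox` class and the helper `missing_labels` are assumptions for illustration only; Voyance Vision's annotation tool stores annotations for you, and its internal format is not described in this article.

```python
from dataclasses import dataclass

# Labels defined in step 2 of this tutorial.
LABELS = ("company_name", "invoice_number", "total_amount")


@dataclass(frozen=True)
class BoundingBox:
    """A labelled rectangle in pixel coordinates (origin at top-left)."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def is_valid(self, image_width: int, image_height: int) -> bool:
        """True if the label is known, the box has positive area,
        and the box lies entirely inside the image."""
        return (
            self.label in LABELS
            and 0 <= self.x_min < self.x_max <= image_width
            and 0 <= self.y_min < self.y_max <= image_height
        )


def missing_labels(boxes: list[BoundingBox]) -> set[str]:
    """Return the labels that have no bounding box in this image."""
    return set(LABELS) - {box.label for box in boxes}


# Made-up annotation for one 600x800 invoice image.
boxes = [
    BoundingBox("company_name", 40, 30, 320, 70),
    BoundingBox("total_amount", 400, 700, 560, 740),
]
print(missing_labels(boxes))  # {'invoice_number'}
```

A check like `missing_labels` is one way to spot images where a field was skipped during annotation.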

Now it's time to train! At this point, you should have annotated all your images and be ready to train.

We are here to help. Your questions and concerns mean a lot to us.

Please contact us at jen@voyancehq.com.


ABOUT THE AUTHOR

VoyanceHQ

Voyance is leading a revolution in how business is done, driven by the goal of democratizing access to data for all businesses. Every article signed with ‘Voyance’ is written by a member of the Voyance team.
