Nsight VLM-OCR | Edge VLM-OCR That Reads the Unreadable

Why does on-site OCR stall?

Conventional OCR that depends on templates and dictionaries only works on "clean, expected text." On the floor, the input is always unexpected.

Broken handwriting

Addresses and names on delivery slips, where everyone's habits and pen pressure differ. The character shapes don't match the dictionary, so rule-based OCR breaks down early.

Glare, fading and dirt

Glare from laminate packaging, faded print, smudging. The moment the input strays from the training images, CNN-OCR's confidence collapses.

Layout and format variation

Every time a label's font or field placement changes from one SKU to the next, the template has to be reconfigured. Across many SKUs and many sites, operations can't keep up.

Reading examples

Below are reading examples from Nsight VLM-OCR for the kind of field images conventional OCR has struggled with — broken handwriting, reflective labels, format variation. Actual accuracy and output vary with the target image and imaging conditions.

A broken-script "千々田区" is read as "千代田区" (Chiyoda-ku) from the context of the address.
Even with faded characters nearly the same color as the paper background, it distinguishes the role of each line (address / building name) — with no template registered.
Even under laminate glare, it correctly pairs field name and value, with a confidence per item.

Why Nsight VLM-OCR reads on the floor

Unlike vendors that sell only an algorithm, Nsight designs the training platform, the edge deployment, the optical hardware and the operation end to end, drawing on development know-how in industrial image processing.

In-house & on-site

Trained in-house, runs self-contained on the floor

We train and optimize the model in-house, and inference runs self-contained on on-site edge devices. With no dependence on the cloud or the network, it reads without sending images outside. It can be deployed as-is in a closed network, even on manufacturing and logistics floors with strict security requirements.

Input configuration

Input you choose by use: from 2D/3D cameras to smartphones

For lines needing high accuracy and stable continuous operation, industrial 2D/3D line cameras; for spot checks and inspections on the move, a smartphone. We choose the input configuration to fit the use case.

Optics × inspection

Design strength that doesn't stall on the floor

Lighting, camera, lens and conveyance designed as one. With a team that includes developers from Keyence's image-processing division, image-quality problems are solved first at the optical level.

No master registration

Reads by meaning

The VLM understands the position and meaning of text, so it can read new formats with no master registration. It is robust to layout changes, multiple languages and handwriting.

From one image to structured data

This is the flow by which an on-site image becomes structured data, with each field given meaning. The image stays on-site and the whole process completes on the floor.

Step 01: Image input. Industrial 2D/3D line cameras, industrial cameras or smartphones; ingest from any source.

Step 02: Semantic understanding. Not just "reading" characters but "understanding them as fields." Handwriting and breaks are corrected by context.

Step 03: Structuring. Output as data split by field (address, model number, quantity), with confidence noted.

Step 04: Business-system integration. Automatic integration with WMS or core systems. Only values needing confirmation are routed to human review.
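Steps 03 and 04 can be sketched in a few lines of Python. This is a minimal illustration, not Nsight's implementation: the `Field` record, the `route` and `to_payload` helpers, the field names, the sample values and the 0.90 review threshold are all assumptions made for the example. The source states only that output is split by field with a confidence, and that values needing confirmation go to human review.

```python
from dataclasses import dataclass

# Hypothetical threshold: the source does not say how "needing confirmation"
# is decided, so a single confidence cutoff is assumed here.
REVIEW_THRESHOLD = 0.90


@dataclass
class Field:
    """One structured field as it might come out of Step 03 (assumed shape)."""
    name: str
    value: str
    confidence: float  # 0.0 to 1.0


def route(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into (auto_accept, needs_review) by confidence."""
    auto, review = [], []
    for f in fields:
        (auto if f.confidence >= threshold else review).append(f)
    return auto, review


def to_payload(fields):
    """Flatten accepted fields into a dict a WMS connector might consume."""
    return {f.name: f.value for f in fields}


# Made-up sample values, for illustration only.
sample = [
    Field("address", "千代田区", 0.97),
    Field("model_number", "AB-1234", 0.95),
    Field("quantity", "12", 0.62),
]
auto, review = route(sample)
payload = to_payload(auto)               # would go to the business system
review_queue = [f.name for f in review]  # would go to a human reviewer
```

In practice the threshold would likely be tuned per field, since a misread quantity and a misread building name carry different costs.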

Not the VLM alone: three techniques blended per project

The recognition engine blends VLM, CNN-OCR and rule-based verification per project.

VLM (Vision-Language Model): Reads hard-to-standardize targets such as handwriting, glare, fading and format variation by context.

CNN-OCR (CNN-based OCR): The foundation of character recognition, stably processing high-volume reading on standardized, high-speed lines.

Rule-based (rule-based verification): Finalizes results to business requirements via digit counts, check digits and format validation.
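As an illustration of the rule-based stage, the sketch below applies the three checks named above (format, digit count, check digit) to a recognized code. The source does not say which check-digit scheme is used, so the GS1 mod-10 scheme found on EAN-13 barcodes is assumed here; `verify_code` and its return shape are hypothetical.

```python
import re


def gs1_check_digit(data_digits: str) -> int:
    """GS1 mod-10 check digit for the digits that precede it.

    Weights alternate 3, 1, 3, 1, ... starting from the rightmost data digit.
    """
    total = 0
    for i, ch in enumerate(reversed(data_digits)):
        weight = 3 if i % 2 == 0 else 1
        total += int(ch) * weight
    return (10 - total % 10) % 10


def verify_code(text: str, length: int = 13):
    """Return (ok, reason) after format, digit-count and check-digit checks."""
    if not re.fullmatch(r"[0-9]+", text):
        return False, "format: non-digit characters"
    if len(text) != length:
        return False, f"digit count: expected {length}, got {len(text)}"
    if gs1_check_digit(text[:-1]) != int(text[-1]):
        return False, "check digit mismatch"
    return True, "ok"


good = verify_code("4006381333931")      # a valid EAN-13
misread = verify_code("4006381333937")   # last digit misread as 7
letter_o = verify_code("40O6381333931")  # letter O in place of a zero
```

A result that fails any of these checks is a natural candidate for the human-review route rather than for automatic integration.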

Spec summary

Edge

On-site, closed-network operation

Custom

In-house training platform

Multi

Handwriting / multilingual / reflection

Zero