Voice Dream Scanner: A New Kind of OCR by Bill Holton, AccessWorld


There is a new player in the optical character recognition (OCR) space, and it comes from an old friend: Winston Chen, the developer of Voice Dream Reader and Voice Dream Writer, both of which we’ve reviewed in past issues of AccessWorld. In this article we’ll start out with a brief conversation with Chen. Then we’ll take a look at the developer’s latest offering: Voice Dream Scanner. Spoiler alert—it will probably be the best $5.99 you’ll ever spend on a text recognition app!
AccessWorld readers who use their phones to audibly read e-Pub books, PDFs or Bookshare titles are likely already familiar with Voice Dream Reader. It works so well with VoiceOver and TalkBack, it’s hard to believe it wasn’t developed specifically for the access market. But according to Chen, “I just wanted to build a pocket reader I could use to store all my books and files so I could listen to them on the go. No one was more surprised than me when I began receiving feedback from dyslexic and blind users describing how helpful Voice Dream Reader was for their needs and making some simple suggestions to improve the app’s accessibility.”
Chen’s second offering, Voice Dream Writer, was also directed at the mainstream market. “Sometimes it’s easier to proofread your document by listening to it instead of simply rereading the text,” says Chen. At the time, Apple’s VoiceOver cut and paste features and other block text manipulation capabilities were, shall we say, not quite what they are today. The innovative way Chen handled these functions made Voice Dream Writer equally useful to users with visual impairments.
Reinventing the OCR Engine
“I’ve been wanting to add OCR to Voice Dream Reader for a few years now,” says Chen. “It would be useful for reading protected PDFs and handouts and memos from school and work.”
The hurdle Chen kept encountering was finding a useable OCR engine. “There are some free, open source engines, but they don’t work well enough for my purposes,” he says. “The ones that do work well are quite expensive, either as a one-time license purchase with each app sold or with ongoing pay-by-the-use options. Either of these would have raised the price I have to charge too much for my value proposition.”
Last year, however, Chen began experimenting with Apple’s artificial intelligence (AI), called Vision Framework, that’s built into the latest iOS versions, along with Google’s Tesseract, TensorFlow Lite, and ML Kit.
“Instead of using a single standard OCR engine, I combined the best aspects of each of these freely available tools, and I was pleasantly surprised by the results.”
Instead of making OCR a Voice Dream Reader feature, Chen decided to incorporate his discovery into a separate app called Voice Dream Scanner. “I considered turning it into an in-app purchase, only there are a lot of schools that use Reader and they aren’t allowed to make in-app purchases,” he says. As to why he didn’t simply make it a new Reader feature, he smiles, “I do have a family to feed.”
Chen has been careful to integrate the new Voice Dream Scanner functionality into VD Reader, however. For example, if you load a protected PDF file into the app and open it, the Documents tab now offers a recognition feature. You can now also add to your Voice Dream Reader Library not only from Dropbox, Google Drive, and other sources, including Bookshare, but using your device’s camera as well.
To take advantage of this integration you’ll need both Voice Dream Reader and Voice Dream Scanner. Both can be purchased from the iOS App Store. VD Reader is also available for Android, but currently VD Scanner is iOS only.
Of course you don’t have to have VD Reader to enjoy the benefits of the new Voice Dream Scanner.
A Voice Dream Scanner Snapshot
The app installs quickly and easily, and displays with the icon name “Scanner” on your iOS device. Aim the camera toward a page of text. The app displays a real-time video image preview, which is also the “Capture Image” button. Double tap this button, the camera clicks, and the image is converted to text almost immediately. You are placed on the “Play” button; give it a quick double tap and the text is spoken using either a purchased VD Reader voice or your chosen iOS voice. Note: You can instruct Scanner to speak recognized text automatically in the Settings Menu.
From the very first beta version of this app I tested, I was amazed by the speed and accuracy of the recognition. The app is amazingly forgiving as far as camera position and lighting. On envelopes, it read the return addresses, postmarks, and addresses. Entire pages of text were voiced without a single mistake. Scanner even did an excellent job with a bag of potato chips, even after it was crumpled and uncrumpled several times. There is no OCR engine to download, and because the recognition is done locally, a network connection is not required. I used the app with equal success even with Airplane mode turned on.
After each scan you are offered the choice to swipe left once to reach the Discard button, twice to reach the Save button. Note: the VoiceOver two-finger scrub gesture also deletes the current text.
Scanner does not save your work automatically. You have the choice to save it as a text file, a PDF, or to send it directly to Voice Dream Reader. You probably wouldn’t send a single page to Reader, but the app comes with a batch mode. Use this mode to scan several pages at once and then save them together: perfect for that 10-page print report your boss dropped on your desk, or maybe the short story a creative writing classmate passed out for review.
Other Scanner features of interest to those with visual impairments are edge detection and a beta version of auto capture.
Edge detection plays a tone that grows increasingly steady until all four edges are visible, at which time it becomes a solid tone. Auto-capture does just that, but since the AI currently detects any number of squares where there is no text, this feature is only available in beta. However, if you’re using a scanner stand it will move along quite nicely, nearly as fast as you can rearrange the pages.
You can also import an image to be recognized. Unfortunately, as of now, this feature is limited to pictures in your photo library. There is currently no way to send an e-mail or file image to Scanner. Look for this to change in an upcoming version.
The benefits of Voice Dream Scanner are by no means limited to the blindness community. Chen developed the app to be used as a pocket player for documents and other printed material he wishes to scan and keep. Low vision users can do the same, then use either iOS magnification or another text-magnification app to review documents. It doesn’t matter in which direction the material is scanned. Even upside-down documents are saved right-side up. Performance is improved by the “Image Enhancement” feature, which attempts to locate the edges of scanned documents and save them more or less as pages.
The Bottom Line
I never thought I’d see the day when I would move KNFB-Reader off my iPhone’s Home screen. Microsoft’s Seeing AI gave it a good run for its money and until now I kept them both on my Home screen. But I have now moved KNFB-Reader to a back screen and given that honored spot to Voice Dream Scanner.
Most of my phone scanning is done when I sort through the mail. Seeing AI’s “Short Text” feature does a decent job helping me sort out which envelopes to keep and which to toss into my hardware recycle bin. But Scanner is just as accurate as any OCR-engine based app, and so quick, the confirmation announcement of the Play button often voices after the scanned document has begun to read.
This is the initial release. Chen himself says there is still work to be done. “Column recognition is not yet what I hope it will be,” he says. “I’d also like to improve auto-capture and maybe offer users the choice to use the volume buttons to initiate a scan.”
Stay tuned.
This article is made possible in part by generous funding from the James H. and Alice Teubert Charitable Trust, Huntington, West Virginia.

Getting the Job Done with Assistive Technology: It May Be Easier Than You Think, AccessWorld

Author Jamie Pauls


I remember getting my first computer back in the early 90s almost like it was yesterday. A friend of mine was receiving regular treatments from a massage
therapist who happened to be blind. My friend mentioned that this gentleman used a computer with a screen reader. I was vaguely aware that this technology
existed, but I never really considered using a computer myself until that first conversation I had with my friend. I began doing some research, and eventually
purchased my first computer with a screen reader and one program included. I’m sure there were a few other programs on that computer, but WordPerfect is
the only one I recall today. The vendor from whom I purchased the computer came to my home, helped me get the computer up and running, and gave me about
a half-hour of training on how to use the thing. A few books from what is now Learning Ally as well as the National Library Service for the Blind and
Physically Handicapped
along with some really late nights were what truly started me on my journey. I sought guidance from a few sighted friends who were more than willing to
help, but didn’t have any knowledge about assistive technology. There were times when I thought I had wasted a lot of money and time, but I eventually
grew to truly enjoy using my computer.

I eventually became aware of a whole community of blind people who used assistive technology. They all had their preferred screen reader, and most people
used only one. Screen readers cost a lot of money and hardware-based speech synthesizers increased the cost of owning assistive tech. Unless the user was
willing to learn how to write configuration files that made their screen reader work with specific programs they wanted or needed to use, it was important
to find out what computer software worked best with one’s chosen screen reader. I eventually outgrew that first screen reader, and spent money to switch
to others as I learned about them. I have no idea how much money I spent on technology in those early years, and that is probably for the best!

Fast forward 25 years or so, and the landscape is totally different. I have a primary desktop PC and a couple of laptop computers all running Windows 10.
I have one paid screen reader, JAWS for Windows from Vispero, and I use two free screen-reading solutions: NVDA, from NVAccess, and Microsoft’s built-in
screen reader, Narrator.

I also have a MacBook Pro running the latest version of Apple’s Mac operating system that comes with the free VoiceOver screen reader built in. I have
access to my wife’s iPad if I need to use it, and I own an iPhone 8 Plus. These devices also run VoiceOver. Finally, I own a BrailleNote Touch Plus,
HumanWare’s Android-based notetaker designed especially for the blind.

Gone are the days when I must limit myself to only one screen reader and one program to get a task accomplished. If a website isn’t behaving well using
JAWS and Google’s Chrome browser, I might try the same site using the Firefox browser. If I don’t like the way JAWS is presenting text to me on that website,
maybe I’ll switch to NVDA. If the desktop version of a website is too cluttered for my liking, I’ll often try the mobile version using either Safari on
my iPhone, or Chrome on my BrailleNote Touch.

The lines between desktop application and Internet site have blurred to the point that I honestly don’t think about it much anymore. It is often possible
to use either a computer or a mobile device to conduct banking and purchase goods.

So what makes all this added flexibility and increased choice possible, anyway? In many cases, the actual hardware in use is less expensive than it used
to be, although admittedly products such as the BrailleNote Touch are still on the high end of the price spectrum. Along with the availability of more
screen readers and magnification solutions than ever before, the cost of most of these solutions has come down greatly. Even companies like Vispero that
still sell a screen reader that can cost over a thousand dollars if purchased outright are now offering software-as-a-service options that allow you to
pay a yearly fee that provides the latest version of their software complete with updates for as long as you keep your subscription active.

While some may not consider free options such as NVDA or Narrator to be as powerful and flexible as JAWS, they will be perfectly adequate for other people
who aren’t using a computer on the job complete with specialized software that requires customized screen reader applications to make it work properly.
There are those who will rightly point out that free isn’t really free. You are in fact purchasing the screen reader when you buy a new computer as is
the case with VoiceOver on the Mac. While this may be true, the shock to the pocketbook may not be as noticeable as it would be if you had to plunk down
another thousand bucks or so for assistive technology after you had just purchased a new computer.

In addition to the advancements in screen reading technology along with the reduced cost of these products, app and website developers are becoming increasingly
educated about the needs of the blind community. I once spoke with a game developer who told me that he played one of his games using VoiceOver on the
iPhone for six weeks so he could really get a feel for how the game behaved when played by a blind person. Rather than throwing up their hands in frustration
and venting on social media about how sighted developers don’t care about the needs of blind people, many in the blind community are respectfully reaching
out to developers, educating them about the needs of those who use assistive technology, and giving them well-deserved recognition on social media when
they produce a product that is usable by blind and sighted people alike. Also, companies like Microsoft and Apple work to ensure that their screen readers
work with each company’s own browsers, including Safari and Microsoft Edge. Google and Amazon continue to make strides in the area of accessibility as well. Better
design and standards make it more likely that multiple screen readers will work well in an increasing number of online and offline scenarios.

You may be someone who is currently comfortable using only one screen reader with one web browser and just a few recommended programs on your computer.
You may be thinking that everything you have just read in this article sounds great, but you may be wondering how to actually apply any of it in your life.
First, I would say that if you are happy with your current technology then don’t feel intimidated by someone else who uses other solutions. That said,
I would urge you to keep your screen reading technology up to date as far as is possible. Also, make sure that you are using an Internet browser that is
fully supported by the websites you frequently visit. This will ensure that your experience is as fulfilling as it should be. For example, though Microsoft
Internet Explorer has been a recommended browser for many years for those using screen access technology due to its accessibility, it is no longer receiving
feature updates from Microsoft, and therefore many modern websites will not display properly when viewed using it.

If you think you would like to try new applications and possibly different assistive technology solutions but you don’t know where to start, keep reading.

Back when I first started using a computer, I knew of very few resources to which I could turn in order to gain skills in using assistive technology. Today,
there are many ebooks, tutorials, webinars, podcasts, and even paid individual training services available for anyone who wishes to expand their knowledge
of computers and the like. One excellent resource that has been referenced many times in past issues of AccessWorld is Mystic Access,
where you can obtain almost every kind of training mentioned in the previous sentences. Another resource you may recognize is the National Braille Press,
which has published many books that provide guidance on using various types of technology. Books from National Braille Press can generally be purchased
in either braille or electronic formats.

There are also many online communities of people with vision loss who use a specific technology. Two of the most well-known are AppleVis
for users of iOS devices and the Eyes-Free Google Group for users of the Android platform. Both communities are places where new and longtime
users of these platforms can go to find assistance getting started with the technology or for help troubleshooting issues they may encounter.

While I vividly recall my first experiences as a novice computer user, it is almost impossible for me to imagine actually going back to those days. Today,
the landscape is rich and the possibilities are endless for anyone who wishes to join their sighted counterparts in using today’s technology. While there
are still many hurdles to jump, I am confident that things will only continue to improve as we move forward.

So fear not, intrepid adventurer. Let’s explore this exciting world together. In the meantime, happy computing!

This article is made possible in part by generous funding from the James H. and Alice Teubert Charitable Trust, Huntington, West Virginia.
