Multimodal Natural Language Processing Proof-of-Concept

The Atlanta FoxPro Users Group - August 19, 2004

We will demonstrate and break down a live web-based application that combines several interesting technologies working together. The application is built around a VFP COM+ object running on a web server that manages a typical data-driven (SQL Server) web site, with the added ability for a user to query information using free-form English sentences. Language parsing is handled by Microsoft English Query, which is bundled with SQL Server. In addition, Full-Text Indexing, also bundled with SQL Server, extends English Query's ability to handle requests against unstructured text.
Introduction
The application we are demonstrating is intended to be used in an art gallery or museum. A set of data exists describing a number of artists, their works, their exhibition history and other pertinent data.

We will demonstrate a web-based application that processes English-language questions and returns relevant answers.

Here's a sample of the questions we can ask:
Who made a presentation in Cape Town?
Which artist made a presentation in London? [Cape Town]
Which artists have a collection in Johannesburg? [Germany, New York, Pretoria, South Africa]
Who has a collection in New York?
Find all artists with a collection in New York?
Which artists had an exhibition in Germany? [Cape Town, Hong Kong, London, Atlanta]
Which artists received an award in 1996? [1993-2002]
Who received an award in 1993?
Which artist did a residency in Switzerland? [Atlanta, Montana, London, Switzerland]
Who [did | completed | performed] a residency in Montana?
List all artists who did a residency in Switzerland?
Please show all artists who did a residency in Switzerland?
Who received an award?

A Scenario
An art patron strolls into a new gallery or museum. She is informed that this location supports a wireless network that will enhance her visit, so she pulls out her PDA and browses to the given web site. As she works her way from gallery to gallery, and, indeed, from work to work, her PDA immediately senses the closest work, retrieves any relevant information, and speaks it to her through headphones. She is prompted to ask questions regarding this particular work, so she speaks a question to the PDA, which answers out loud. The system keeps track of where she paused the longest during her visit and each kind of interaction she had, and it tunes her experience accordingly.

The goal of this project is to create a proof-of-concept that demonstrates a stable architecture capable of supporting the aforementioned patron's visit.

Accomplishing this requires a complex interaction of off-the-shelf and custom components, at low cost and with high performance. Fortunately, the state of the art in the necessary disciplines has reached the broader marketplace, and is only waiting for smart, forward-thinking people to make it happen.

Overview
How cool would it be if you could just TELL an application the data you needed, and it actually went out and did it?

Building machines that communicate using human languages has proven tricky, but decades of R&D are finally paying off, delivering a suite of developer-ready tools. All the components needed to create this kind of application seem to be maturing at the same time.

These off-the-shelf tools fall into three categories, mirroring the research fields from which they've sprung: speech processing, speech synthesis and natural-language processing (NLP). Speech processing converts speech into text. Speech synthesis converts text into speech. NLP understands grammar: how words connect and how their definitions relate to one another. This last field stands on its own, but also contributes to the other two, because computers listen, speak, and interpret more accurately when they have guidelines for what words can mean.

Speech Processing
Speech processing enables computers to recognize, and to some extent understand, spoken language. This technology has produced two types of software products: continuous-speech recognition (dictation) and command and control. For a broad-use application, command and control built on context-free grammars is the most reliable approach. Because a context-free grammar lets a speech recognition engine restrict the set of recognized words to a predefined list, high recognition rates can be achieved in a speaker-independent environment. Context-free grammars work well with no voice training, cheap microphones, and average CPUs.
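To illustrate why a predefined phrase list makes recognition more forgiving, here is a minimal Python sketch. It is not how a speech engine works internally: the slot-based grammar (the simplest, finite case of a context-free grammar), the names GRAMMAR, expand and recognize, and the use of string similarity as a stand-in for acoustic matching are all assumptions made for illustration.

```python
import difflib
import itertools

# Hypothetical command-and-control grammar: each slot lists its alternatives.
# The phrases echo the sample residency questions; the layout is an assumption.
GRAMMAR = [
    ["who"],
    ["did", "completed", "performed"],
    ["a residency in"],
    ["atlanta", "montana", "london", "switzerland"],
]

def expand(grammar):
    """Enumerate every utterance the grammar accepts."""
    return [" ".join(parts) for parts in itertools.product(*grammar)]

def recognize(heard, phrases, cutoff=0.6):
    """Return the closest in-grammar phrase to a noisy transcript, or None.

    String similarity stands in for the acoustic scoring a real engine uses.
    """
    matches = difflib.get_close_matches(heard.lower(), phrases, n=1, cutoff=cutoff)
    return matches[0] if matches else None

PHRASES = expand(GRAMMAR)
```

Because only a dozen phrases are possible here, a slightly garbled input such as "who did a residensy in london" still resolves to the intended phrase, while an unrelated utterance is rejected rather than guessed at.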

Speech Synthesis
The ability to synthesize the sound of speech is useful for applications that require spontaneous interaction, or in situations where reading isn't practical (giving instructions to a driver, for example). In products aimed at the general public, it's critical that the output sound pleasant and human enough to encourage regular use.

Natural Language Processing
NLP systems interpret written rather than spoken language. In fact, NLP modules can be found in speech-processing systems that start by converting spoken input into text. Using lexicons and grammar rules, NLP parses sentences, determines underlying meanings, and retrieves or constructs responses. This technology's main use is to enable databases to answer queries entered in the form of a question. A newer application is handling high-volume email. NLP performance can be improved by incorporating a common-sense knowledge base, that is, an encyclopedia of real-world rules.

NLP with Microsoft English Query
Almost all database query languages tend to be rigid and difficult to learn, and it is often difficult even for experienced users to get the desired information out of a database. A natural language interface to SQL removes the need for users to master the complexities of the language.

English Query is a component of SQL Server 2000 that lets users query databases using plain English. The EQ engine creates and executes a database query and formats the answer. The development process is at a higher level than traditional programming, and can be mastered by non-programmers with some database background. To implement natural language searching, you first use the authoring tool to provide domain knowledge to the engine. The authoring tool is used to relate database entities to objects in the domain. For example, the author creates a verb relationship between a salespeople table and a products table by indicating that "salespeople sell products." The English Query engine then uses these relationships to perform natural language parsing of users' questions, which provides better search results than you would get using keyword-based technology.

Although your initial goal in an English Query project might be to answer the most common questions your users will ask, the ultimate goal is to identify and model all the relationships between entities in your database. You want to have a semantic model that represents your application.
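The "salespeople sell products" relationship described in this section can be pictured with a toy translator. This Python sketch is not the English Query engine or its API; the RELATIONSHIPS structure, the table and column names, and the single question pattern it understands are all hypothetical, and SQLite stands in for SQL Server so the example is self-contained.

```python
import re
import sqlite3

# Hypothetical semantic model holding one verb relationship.
# Table and column names are invented for illustration.
RELATIONSHIPS = {
    "sell": {
        "subject": ("salespeople", "name"),
        "object": ("products", "name"),
        "link": ("sales", "salesperson_id", "product_id"),
    }
}

def question_to_sql(question):
    """Translate 'Which <subject> <verb> <object value>?' into parameterized SQL."""
    m = re.match(r"(?:which|what) (\w+) (\w+) (.+?)\??$", question.strip().lower())
    if not m:
        return None
    subject, verb, obj_value = m.groups()
    rel = RELATIONSHIPS.get(verb)
    if rel is None or rel["subject"][0] != subject:
        return None
    (s_table, s_col), (o_table, o_col) = rel["subject"], rel["object"]
    link, s_fk, o_fk = rel["link"]
    sql = (
        f"SELECT DISTINCT s.{s_col} FROM {s_table} s "
        f"JOIN {link} l ON l.{s_fk} = s.id "
        f"JOIN {o_table} o ON o.id = l.{o_fk} "
        f"WHERE lower(o.{o_col}) = ? ORDER BY s.{s_col}"
    )
    return sql, (obj_value,)

def demo_db():
    """Build a small in-memory database matching the hypothetical model."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE salespeople (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales (salesperson_id INTEGER, product_id INTEGER);
        INSERT INTO salespeople VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO products VALUES (1, 'Widgets'), (2, 'Gadgets');
        INSERT INTO sales VALUES (1, 1), (2, 2), (2, 1);
    """)
    return conn

def ask(conn, question):
    """Answer a question, or return None when the model cannot translate it."""
    translated = question_to_sql(question)
    if translated is None:
        return None
    sql, params = translated
    return [row[0] for row in conn.execute(sql, params)]
```

With the demo data, ask(demo_db(), "Which salespeople sell gadgets?") returns the one matching name, and a question outside the model returns None. The point of the sketch is that the join logic lives in the relationship definition, so covering more questions is a matter of modeling more relationships rather than writing more SQL by hand.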

Input Devices
If you add speech-recognition capability and a microphone to your English Query application, you can type or speak your question to the application. Without stretching much further, you could put that speech interface on a smart phone or handheld Personal Digital Assistant (PDA) with wireless Internet capability.

The combination of speech recognition and English Query gives a user a powerful way to access information in a SQL Server database very quickly. For users who work in an environment where speed and ease of access are critical, it holds enormous promise for future applications. As hardware continues to become more powerful and cheaper, speech recognition should continue to become more accurate and useful to an increasingly wide audience.
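The flow from spoken question to spoken answer can be sketched as a chain of four stages. The Python function below is only an outline of that architecture under stated assumptions: answer_spoken_question and its four callable parameters are hypothetical names, and each stage would be backed by a real component (speech recognizer, English Query, SQL Server, speech synthesizer) in an actual system.

```python
from typing import Callable, List, Optional

def answer_spoken_question(
    audio: bytes,
    recognize: Callable[[bytes], Optional[str]],  # speech -> text
    to_sql: Callable[[str], Optional[str]],       # English -> SQL
    run_query: Callable[[str], List[str]],        # SQL -> result values
    speak: Callable[[str], bytes],                # text -> speech
) -> bytes:
    """Run one question through the pipeline, falling back to a spoken apology."""
    text = recognize(audio)
    if text is None:
        return speak("Sorry, I did not catch that.")
    sql = to_sql(text)
    if sql is None:
        return speak("Sorry, I cannot answer that question.")
    rows = run_query(sql)
    reply = ", ".join(rows) if rows else "No matching records were found."
    return speak(reply)
```

Passing the stages in as callables keeps each one replaceable, so typed input simply skips the recognize stage and a text-only client skips speak.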

Technology

System Components:

  • Web/Speech/Data server running Microsoft Windows Server 2003
  • Web server: IIS 6.0
  • Database: Microsoft SQL Server 2000 SP4
  • SQL Server English Query
  • SQL Server Full-Text Indexing
  • Microsoft Speech Server 1.0
  • Microsoft Speech Application SDK Version 1.0
  • HP iPAQ h4150 Pocket PC running Microsoft Windows CE 4.20 on ROM version 1.10.03
  • Speech Add-in for Microsoft Pocket Internet Explorer
  • D-Link DI-614+ Wireless Broadband Router
  • Speech Application Language Tags (SALT) protocol
  • DHTML, JavaScript, VBScript (ASP), CSS
  • Microsoft Visual FoxPro 8.0 SP1
  • Microsoft Component Services
  • Visual Studio .NET 2003
Known Problems
  • SALT Requires Internet Access
    This effectively prevents speech recognition and text-to-speech from being used in a pure intranet environment. I think this is primarily because the client must resolve the SALT namespace.
  • IE Speech Plug-In Only Available for PDA
    The IE Speech Plug-In for the desktop requires that the full SASDK be installed, unlike the PDA version. It appears that the client was targeted solely at PDAs, with the desktop seen as a development platform rather than a deployment platform.
Code Samples
Accessing Microsoft English Query from Visual FoxPro:
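* Create an English Query session and load the compiled domain file
loEQ = CREATEOBJECT("MSEQ.Session")
loEQ.InitDomain("EQDemo.eqd")

* Parse the English question into a response object
loEQResponse = loEQ.ParseRequest("Which artist has a " +;
                                 "collection in New York?")
* Take the first generated command, whose SQL property holds the query text
loEQCmd      = loEQResponse.Commands(0)

* lnConn is assumed to be an open connection handle to the SQL Server database
=SQLEXEC(lnConn, loEQCmd.SQL)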
© The Intellection Group, Inc.  All Rights Reserved