Sunday, December 9, 2007

Herot

GRAPHICAL INPUT THROUGH MACHINE RECOGNITION OF SKETCHES - Christopher F. Herot

Herot describes his HUNCH system, an early sketching system with several levels of interpretation that the user can manipulate or convert between. He begins by describing the hardware and how the user provides input to the system. Users sketch on paper covering an input tablet. This input is fed into the program STRAIT, a precursor of HUNCH. Seemingly an early Sezgin, STRAIT determines corners based on speed, with slower-speed points taken as line endpoints. Next, HUNCH uses the CURVIT program to fit curves in the sketch to B-splines. Where the curvature changes gradually or the pen moves at a "careful" speed, STRAIT flags the point for fitting by CURVIT.
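To make the speed idea concrete, here is a minimal sketch of speed-based corner finding. It is not Herot's STRAIT code: the function names, the (x, y, t) sample format, and the `slow_fraction` threshold are all assumptions of mine for illustration.

```python
import math

def speeds(points):
    """Pen speed at each (x, y, t) sample, measured over the segment leading into it."""
    result = [0.0]  # placeholder for the first sample, filled in below
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        result.append(math.hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0)
    if len(result) > 1:
        result[0] = result[1]  # the first sample borrows the speed of the first segment
    return result

def find_corners(points, slow_fraction=0.5):
    """Indices of candidate corners: the stroke ends plus slow local speed minima.

    slow_fraction is a hypothetical tuning knob: a point only counts as
    "slow" if its speed is below slow_fraction times the mean speed.
    """
    if len(points) < 3:
        return list(range(len(points)))
    v = speeds(points)
    threshold = slow_fraction * (sum(v) / len(v))
    corners = [0]
    for i in range(1, len(points) - 1):
        # strict on the left, non-strict on the right, so a flat slow
        # stretch yields one corner rather than several
        if v[i] < threshold and v[i] < v[i - 1] and v[i] <= v[i + 1]:
            corners.append(i)
    corners.append(len(points) - 1)
    return corners
```

Given an L-shaped stroke sampled at a steady rate, where the samples bunch up as the pen slows into the bend, `find_corners` should return the two stroke ends plus the sample at the bend. The fixed threshold is exactly the kind of parameter that would need tuning per user.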

Next, Herot describes various programs used to infer the meaning of a sketch. First is an early version of STRAIT that used latching, which joined together nearby points, for example reducing the number of endpoints for a sketched square from 5 to 4. Due to thresholding problems, STRAIT was converted to STRAIN, which used a speed-determined latching mechanism. However, this method did not produce ideal results either. Similar problems arose with overtraced lines. Additionally, HUNCH possesses a limited inferencing mechanism that can produce 3D representations from 2D perspective sketches. Finally, a "room finding" program can produce a graph representation of a floor plan.
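The square example and the thresholding problem are easy to see in a few lines of code. This is only a sketch of fixed-radius latching under my own assumptions (the `latch` function and its `radius` parameter are hypothetical), not the mechanism from the paper.

```python
import math

def latch(endpoints, radius):
    """Snap each endpoint onto the first earlier endpoint within radius.

    Returns (snapped, representatives): the endpoints after snapping, and
    the distinct points that survive. radius is a hypothetical fixed
    distance threshold.
    """
    representatives = []
    snapped = []
    for x, y in endpoints:
        for rx, ry in representatives:
            if math.hypot(x - rx, y - ry) <= radius:
                snapped.append((rx, ry))  # close enough: reuse the earlier point
                break
        else:
            representatives.append((x, y))  # nothing nearby: keep as a new point
            snapped.append((x, y))
    return snapped, representatives

# A square drawn in one stroke: the pen ends near, but not on, its start.
square = [(0, 0), (10, 0), (10, 10), (0, 10), (0.4, 0.3)]
```

With `radius=1.0` the five endpoints of `square` reduce to four, as in the paper's example. With `radius=15.0` the whole square collapses to a single point, and with `radius=0.1` nothing latches at all, which is one way a fixed threshold can go wrong.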

Herot next discusses how context can be used to help in the inference process. By defining the context, a sketcher can supply missing information needed to resolve ambiguities and guide low-level recognition. The user defines a network of case descriptions that are matched to the context-free interpretations made by the lower levels. The context is matched top-down. Partial matches may cause the low-level recognizers to be re-invoked to search for missing components. However, even narrowed by context, the search space is large, and creating an adequate matching system would require solutions to several hard AI problems.

Therefore, Herot proposes a user-interactive system, prompting the user to resolve ambiguities that the system cannot. A database of interconnected interpretations is created and updated as new levels of interpretation are determined. Thus the overall system consists of interpretation, display, and manipulation programs, which determine potential interpretations, display them, and allow the user either to pick the correct interpretation or to make the modifications that correct the interpretations directly.

Moving to a more detailed discussion of the line and corner finder, Herot next discusses how lines and curves can be determined on the fly. Speed and "bentness" are determined over intervals of various sizes, depending on whether local accuracy or smoothing is desired. This system is tunable to the user, either through user modification of the interval or through automated increase/decrease of the interval as the user adds/deletes points. Herot's new system automatically calls the various components that HUNCH ran only when prompted by the user, and reruns those components as their input is changed by user modification of the database.
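The trade-off between local accuracy and smoothing can be illustrated with a toy "bentness" measure. The summary above does not say how bentness is computed, so the definition here (the turning angle between the chord reaching k samples back and the chord reaching k samples forward) is purely my assumption, as are the names `bentness` and `k`.

```python
import math

def bentness(points, i, k):
    """Absolute turning angle (radians) at points[i] over an interval of k samples.

    Compares the heading of the chord from k samples back with the heading
    of the chord to k samples ahead, clamping at the stroke ends. Meant for
    interior points; at a stroke end one chord has zero length.
    """
    a = points[max(i - k, 0)]
    b = points[i]
    c = points[min(i + k, len(points) - 1)]
    heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
    heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
    turn = heading_out - heading_in
    # wrap into [-pi, pi) so a small turn across the +/-pi seam stays small
    turn = (turn + math.pi) % (2 * math.pi) - math.pi
    return abs(turn)

# A nominally straight line drawn with a slightly shaky hand.
shaky = [(0, 0), (1, 0.2), (2, 0), (3, 0.2), (4, 0), (5, 0.2), (6, 0)]
```

On `shaky`, `bentness(shaky, 3, 1)` reacts to the jitter, while `bentness(shaky, 3, 3)` is noticeably smaller because the longer interval averages the wobble away. The cost of the larger interval is that a genuine sharp corner gets smeared over more samples, which is presumably why the interval needs to be tunable.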

Lastly, Herot poses the latching problem as a key area for future research and presents several criteria for latching.

Discussion - The use of paper over the input tablet seems unusual at first. It would appear to produce an extraneous paper record of the user's sketches. However, these records could be useful as notes of the user's thought process and intentions, helping another person determine how the sketches were generated or serving as a memory aid for the user. It's also interesting to note how much earlier this paper is than Sezgin, who later reintroduced the ideas of speed and curvature as novel. However, these metrics are somewhat buried in the paper and are far from its main focus, which seems more directed at the interactivity of the system and the use of context for resolving sketch ambiguity.

2 comments:

Grandmaster Mash said...

I think the use of paper was a decision at the time based on the technology available and the target audience. The authors focused on architecture and floor plans, and people working in these fields are used to sketching on paper.

- D said...

And to elaborate on Aaron's comment, what did you have to draw on back in the day? Certainly not a Cintiq. Sutherland had a light pen, but who knows how ghastly expensive that thing was. Herot posted on my blog for this article explaining some of the tech they used. Check it out.