New Developments in PC-based Vision for Locating and Inspecting Parts
by Bill Silver, Chief Technology Officer - Cognex Corporation Posted 03/15/2000
Overview and Objectives
The price/performance ratio of general-purpose computers has in the last few years reached a point where image analysis algorithms can be implemented that would previously have been impractical: too computationally expensive for a software implementation, yet too complex for custom silicon. At the same time, the limitations of traditional pattern locating and inspecting tools such as blob analysis and normalized correlation have become increasingly apparent, with many potential applications for machine vision unsolved, and with barriers to improvement even in applications in which the traditional tools have been used successfully.
Inspired by the new computational capabilities and responding to these limitations, we have developed a set of novel image analysis algorithms that, working in concert, provide a practical, truly general-purpose tool for locating and inspecting objects. Broadly speaking, the new technology is a form of geometric pattern matching. By avoiding the pixel grid of brightness values traditionally used to represent and match patterns, the new technology can handle almost arbitrary orientation, size, and shading variations, delivers vastly higher accuracy, and does not sacrifice robustness, ease of training, or speed.
Such statements, however, do little to help the potential user understand the new technology. What are the limitations of the traditional methods? What makes a pattern matcher truly general purpose? How does geometric matching work, if not by matching the grid of brightness values produced by a video camera? How can one evaluate a geometric pattern matching tool to see if it delivers on the promise? To what real applications has the new technology been applied? In this paper we will answer these and other questions.
Traditional Limitations Overcome
In order to understand the new technology, it is useful to compare it to traditional methods in widespread use, which may be more familiar to the reader. The following brief summary will serve to place geometric pattern matching in the appropriate context.
The earliest vision tool widely used for object location and inspection in industry was blob analysis. The premise was simple: classify image pixels as object or background by some means, join the classified pixels to make discrete objects using neighborhood connectivity rules, and compute various moments of the connected objects to determine object position, size, and orientation. The advantages of blob analysis include high speed, sub-pixel accuracy (in cases where the image is not subject to degradation), easy training, and the ability to tolerate and measure variations in orientation and size. Disadvantages include inability to tolerate touching or overlapping objects, poor performance in the presence of various forms of image degradation, inability to determine the orientation of certain shapes, and poor ability to discriminate objects for inspection and identification purposes.
Perhaps the most serious problem, however, was that in practice the only generally reliable method ever found for separating object from background was to arrange for the objects to be entirely brighter or entirely darker than the background. This requirement so severely limits the range of potential applications that before long other methods for object location and inspection were developed.
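The classic pipeline described above, thresholding, connectivity, and moments, can be sketched in a few dozen lines. This is an illustrative stand-alone sketch, not Cognex code; the function name, the bright-object threshold convention, and the 4-connectivity choice are all assumptions made for the example.

```python
import math
from collections import deque

def blob_analysis(image, threshold):
    """Classify pixels brighter than 'threshold' as object, join them with
    4-connectivity, and compute moments of each connected blob.
    A minimal sketch of the classic method; real systems add run-length
    encoding and many refinements."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for sy in range(h):
        for sx in range(w):
            if image[sy][sx] > threshold and not seen[sy][sx]:
                # Flood-fill one connected component.
                pixels, queue = [], deque([(sy, sx)])
                seen[sy][sx] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w and \
                           image[ny][nx] > threshold and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                # Zeroth moment gives area; first moments give the centroid.
                area = len(pixels)
                cy = sum(p[0] for p in pixels) / area
                cx = sum(p[1] for p in pixels) / area
                # Second central moments give the major-axis orientation.
                myy = sum((p[0] - cy) ** 2 for p in pixels)
                mxx = sum((p[1] - cx) ** 2 for p in pixels)
                mxy = sum((p[0] - cy) * (p[1] - cx) for p in pixels)
                angle = 0.5 * math.atan2(2 * mxy, mxx - myy)
                blobs.append({"area": area, "centroid": (cx, cy), "angle": angle})
    return blobs
```

Note that the orientation from second moments is undefined for rotationally symmetric shapes, which is exactly the "certain shapes" limitation mentioned above.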
Template Matching and Normalized Correlation
A template matching method starts with a training step, wherein a picture of an object to be located (the template) is stored. At run-time the template is compared to like-sized subsets of the image over a range of positions, with the position of greatest match taken to be the position of the object. The degree of match (a numerical value) can be used for inspection, as can comparisons of individual pixels between the template and image at the position of best match.
The earliest template matchers used a brightness threshold to reduce the pixels of the template and image to two states, "bright" and "dark", so that the comparison operation was practical on hardware that was available at the time. Unfortunately, the thresholding step made sub-pixel accuracy impractical, required machine operators to choose threshold values, and made the results highly susceptible to variations in illumination and object reflectivity.
In 1987 these limitations were overcome by the advent of a practical method for using normalized correlation (NC) for the template comparison operation. The considerable computational cost of NC was managed partly by sophisticated search strategies and partly by the invention of a simple device for computing correlation without multiplies, which were very expensive at the time.
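The heart of the method is the normalized correlation coefficient itself, which can be computed from a handful of sums in a single pass, the property that makes multiply-saving and incremental implementations possible. A minimal sketch follows (the function name is ours; real implementations evaluate this over every candidate template position using sophisticated search strategies):

```python
import math

def normalized_correlation(template, window):
    """Normalized correlation between a template and a like-sized image
    window, both given as flat pixel lists. The score is invariant to
    linear changes in brightness (gain and offset), which is what freed
    template matching from per-application thresholds."""
    n = len(template)
    sx = sum(template)
    sy = sum(window)
    sxx = sum(a * a for a in template)
    syy = sum(b * b for b in window)
    sxy = sum(a * b for a, b in zip(template, window))
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    return num / den if den else 0.0
```

Doubling the contrast of a window and adding a brightness offset leaves the score unchanged, so a match value near 1.0 means the shading pattern agrees regardless of illumination intensity.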
NC template matching also overcomes many of the limitations of blob analysis-it can tolerate touching or overlapping objects, performs well in the presence of various forms of image degradation, and the NC match value is useful in some inspection applications. Most significantly, perhaps, objects need not be separated from background by brightness, enabling a much wider range of applications. Within a few years NC became the dominant method for pattern matching used in industry, particularly in semiconductor manufacturing and electronics assembly.
Unfortunately, NC gives up some of the significant advantages of blob analysis, particularly the ability to tolerate and measure variations in orientation and size. NC will tolerate small variations, typically a few degrees and a few percent (depending on the specific template), but even within this small range of orientation and size the accuracy of the results falls off rapidly.
These limitations were partly overcome in 1991 with the advent of a method for extending NC by rotating and scaling the templates so as to measure orientation and size. The original method and its descendants are expensive, however, and have not been widely used in practice. Furthermore, accuracy is limited to about a degree of orientation and a couple percent of size, with position accuracy somewhat lower in general than when there is no variation in orientation or size.
The accuracy and cost problems are due to the nature of the pixel grid used to represent the templates. A grid can be translated, rotated, and scaled only approximately (except for integer translations and scale factors); these operations not only introduce distortions but are also rather expensive.
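The distortion introduced by grid resampling is easy to demonstrate: rotate a grid point by a non-integral angle, snap the result back to the integer grid (as any grid-based representation ultimately must), and rotate back. The residual is the error a pixel-grid template suffers. This toy illustration uses nearest-neighbor snapping and names of our own choosing:

```python
import math

def rotate(p, deg):
    """Rotate a point about the origin by deg degrees."""
    r = math.radians(deg)
    c, s = math.cos(r), math.sin(r)
    return (p[0] * c - p[1] * s, p[0] * s + p[1] * c)

def grid_round_trip_error(p, deg):
    """Rotate a grid point, snap it to the integer grid, rotate back,
    and report the distance from where it started. Nonzero error is the
    distortion inherent in rotating a pixel grid."""
    q = rotate(p, deg)
    snapped = (round(q[0]), round(q[1]))
    back = rotate(snapped, -deg)
    return math.hypot(back[0] - p[0], back[1] - p[1])
```

A 90° rotation maps the grid onto itself and costs nothing, while a 10° rotation of the same point leaves a residual of a few tenths of a pixel, before any brightness interpolation error is even considered.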
NC template matching locates objects based on their shading (i.e. pattern of brightness). For essentially 2D objects such as printed paper or silicon wafers, shading is generally determined by illumination intensity and surface reflectance and tends to vary in ways that NC can handle quite well. For 3D objects common in robotic pick-and-place and assembly applications, shading depends on many more factors, including illumination direction, surface orientation at each point, shadows, and mutual illumination. In these applications the shading of an object is quite unreliable, even though the shape of the object is consistent, making NC unsuitable.
Specialized Pattern Finders
Some of the limitations of blob and NC can be overcome for certain specific shapes, such as lines and arcs, by using methods such as the Hough transform. Objects that can be described in terms of such basic shapes can often be located at any orientation, and this approach has been used with some success in certain robotics applications. These methods, however, have shown their own limitations, such as lack of generality, complex training, difficulty with image degradation, and limited speed.
A General Solution Is Needed
Each of the above methods has limitations that prevent its use in many applications where vision would be desirable. In robotics applications, for example, the inability of NC to handle shading variations, and the cost and accuracy limitations in handling orientation and size variations, have almost entirely prevented its use. In some cases the objects can be backlit so that blob can be used, and in other cases specialized pattern finding methods have been successful, but for the most part robot manufacturers are forced to resort to mechanical fixturing and dead reckoning.
Even in applications such as semiconductors where NC has been used with considerable success, its limitations are still serious. Some process steps, such as CMP, cause shading variations well beyond what NC can tolerate. Almost all such applications involve some variation in orientation, and must therefore accept substantially reduced accuracy and complex, time-consuming procedures to measure orientation. And although equipment in modern factories is generally networked, the desire to teach a template on one machine and then download it to every other one is thwarted by the loss of accuracy resulting from unavoidable variations in optical magnification.
Geometric Pattern Matching
As we have seen, NC has many desirable properties and has been quite successful, but suffers from the limitations of the pixel grid of brightness values on which it is based. Shading is among the least reliable of an object's properties, varying widely even though the object's shape is consistent, and a grid has trouble handling non-integral transformations.
Geometric pattern matching avoids these limitations by representing an object as a geometric shape, independent of shading and not tied to a discrete grid. The promise, and the challenge, of geometric pattern matching is to reclaim the ability to tolerate and measure variations in orientation and size that was lost in moving from blob to NC; preserve the speed, robustness, ease of training, and generality of NC, qualities generally missing in special-purpose pattern finders; add the ability to tolerate severe shading variation; deliver substantially higher accuracy, particularly when orientation and size vary; and provide detailed defect data for inspection, independent of orientation, size, and shading variations.
After 4 years of intensive development the challenge has been met, resulting in truly general purpose object location and inspection. Barriers to advancement in traditional applications have been removed, and entirely new applications for machine vision have been created.
How It Works
The new geometric pattern matching technology consists of 5 new image analysis algorithms: feature detection, training, affine searching, pose refinement, and inspection. These modules define the technology's properties and behavior, and each deserves some explanation.
Feature detection is the backbone of PatMax geometric pattern matching. It defines what the system sees, and if you can't see it you can't find it. Feature detection produces a geometric description of object boundaries: an ordered list of boundary points that lie along contours separating dissimilar regions in the image. Unlike traditional edge detection, which simply marks certain pixels as edges, each boundary point specifies position and orientation as real-valued coordinates (to some appropriate precision).
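The sub-pixel character of a boundary point can be illustrated in one dimension: find the peak of the brightness gradient, then interpolate a parabola through the peak and its neighbors. This is a toy stand-in of our own devising, not PatMax's detector, which works in two dimensions and also derives each point's orientation from the gradient direction:

```python
def subpixel_edge(profile):
    """Locate an edge in a 1-D brightness profile to sub-pixel precision.
    Take central differences, find the gradient-magnitude peak, and fit
    a parabola through the peak and its two neighbors; the parabola's
    vertex gives the real-valued edge position."""
    grad = [abs(profile[i + 1] - profile[i - 1]) / 2.0
            for i in range(1, len(profile) - 1)]
    k = max(range(len(grad)), key=grad.__getitem__)
    if 0 < k < len(grad) - 1:
        gm, g0, gp = grad[k - 1], grad[k], grad[k + 1]
        denom = gm - 2 * g0 + gp
        # Vertex of the parabola through (-1, gm), (0, g0), (1, gp).
        offset = 0.5 * (gm - gp) / denom if denom else 0.0
    else:
        offset = 0.0
    return k + 1 + offset  # +1 converts gradient index to pixel index
```

An edge midway between two pixels comes back at a half-pixel coordinate, which no detector that merely marks pixels as edges can express.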
The existence and location of boundaries depends on the granularity with which one examines an image. For example, figure 2 shows lead tips from a quad flat pack (QFP), a device common in electronics assembly applications. In figure 2a, where feature detection operates at a relatively coarse granularity, a single boundary is found for the leads as a group. In figure 2b (shown at 2x magnification), where feature detection operates at a fine granularity, a boundary is found for each lead individually. Both results are equally valid, and PatMax will use a range of granularity as appropriate for a given image.
The importance of granularity is even more obvious in figure 3, which shows an example of feature detection for a portion of a highly textured metal part. Here the boundary points can be seen as short segments showing their position and orientation. A geometric description of this object only makes sense at relatively coarse granularity; at finer granularity, the only thing visible is random texture. The use of a traditional edge detector would make finding such an object impossible.
The purpose of training is to choose the range of granularity that makes sense for the objects to be found, and then to select features at each of the various chosen granularities to represent the object. This is in many respects the most challenging task for geometric pattern matching, because it requires considerable judgement rather than simple measurement. Of course one can always let humans make these choices, but experience with NC template matching has shown that pattern training cannot be made any more complex for machine operators than simply drawing a box around the object. Thus, the training choices must be fully automated, with human overrides provided only as a last resort for unusual cases.
The geometric description produced by feature detection, which is a set of conceptually real-valued positions and orientations, can be translated, rotated, scaled, stretched, and otherwise transformed essentially arbitrarily with no loss in fidelity. In practice we use a 6 degree-of-freedom affine transform, which includes two degrees of freedom of translation and one each of rotation, scale, aspect ratio, and skew. In practice, skew is rarely used, and for convenience we replace aspect ratio with horizontal and vertical scale as separate degrees of freedom.
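The six degrees of freedom compose into an ordinary 2x3 affine matrix. The composition order and parameter names below are one plausible choice made for illustration, not necessarily the parameterization used internally:

```python
import math

def affine_from_dofs(tx, ty, angle_deg, scale_x, scale_y, skew_deg=0.0):
    """Build a 2x3 affine matrix from the six degrees of freedom named
    in the text: translation (tx, ty), rotation, horizontal and vertical
    scale, and skew. Composed here as rotation * skew * scale."""
    r = math.radians(angle_deg)
    c, s = math.cos(r), math.sin(r)
    k = math.tan(math.radians(skew_deg))
    return [[c * scale_x, (c * k - s) * scale_y, tx],
            [s * scale_x, (s * k + c) * scale_y, ty]]

def apply_affine(m, pt):
    """Transform a real-valued boundary-point position by the matrix."""
    x, y = pt
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

With the scale degrees of freedom fixed at, say, 2 and 1, a trained unit square maps onto a 2:1 rectangle with no loss in fidelity, which is the essence of finding rectangular objects from a single square pattern.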
Each non-translation degree of freedom is specified as either a fixed value or a range of values to be searched. A fixed value for a degree of freedom allows a transformed version of the trained pattern to be found with no time penalty and with no need to train extra patterns. For example, a single square pattern can be trained and used to find rectangular objects at a variety of known sizes and shapes by setting appropriate fixed values for the scale degrees of freedom.
If the orientation, size, and/or aspect ratio of an object can vary, a search range is specified for the appropriate degrees of freedom. Search time is generally proportional to the volume of the search space, and of course the space can vary between 2 and 6 dimensions depending on the application. The cases of practical interest include 3 and 4 dimensions, for example (512 pixels) x (512 pixels) x (360 degrees), and (512 pixels) x (512 pixels) x (360 degrees) x (2:1 scale).
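The proportionality of search time to search-space volume can be made concrete by counting coarse-search cells. The step sizes in the example below are illustrative only, not PatMax's actual quantization:

```python
def search_cells(dofs):
    """Count coarse-search cells for a set of degrees of freedom, each
    given as a (range, step) pair. Search time grows roughly with this
    product, which is why adding an orientation or scale range costs so
    much more than translation alone."""
    cells = 1
    for rng, step in dofs:
        cells *= max(1, round(rng / step))
    return cells
```

At 1-pixel and 1-degree steps, a (512 pixels) x (512 pixels) x (360 degrees) search covers 360 times as many cells as the translation-only case, which is why raw speed dominates the engineering of affine searching.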
Clearly with such large search spaces the primary challenge is speed, and we chose to address this with meticulous MMX assembly language programming. We rejected custom silicon as too costly, unable to keep pace with Intel's relentless performance improvements, and unable to cope with the processing complexity required by PatMax's 4 run-time image analysis algorithms.
Generally algorithms that can cover a large search space quickly do not produce accurate results. To avoid compromise, we designed affine searching to run as fast as possible and created an entirely new geometric matching algorithm, pose refinement, whose sole purpose is high accuracy. Here is where the fidelity of the geometric representation under rigid transformation is pushed to the limit.
Accuracy, of course, depends critically on application conditions such as pattern size and image degradation. In carefully controlled but realistic experiments on silicon wafer patterns we have measured 3σ absolute accuracy of 0.025 pixel in translation, 0.02° in orientation, and 0.05% in size. For these experiments, we deliberately test translation accuracy with the run-time image at a different orientation and size (both simultaneously) from the trained pattern. Accuracy is unaffected by orientation and size variation, as it should be for geometric matching. No other technology has ever come close to these results under these conditions.
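For reference, a statistic of this kind can be computed from measurement residuals against ground truth. The sketch below reflects one reasonable reading of "3σ absolute accuracy," namely three times the RMS deviation from truth; the function and variable names are ours:

```python
import math

def three_sigma(measured, truth):
    """3-sigma absolute accuracy: three times the RMS deviation of a
    set of measurements from their ground-truth values."""
    n = len(measured)
    rms = math.sqrt(sum((m - t) ** 2 for m, t in zip(measured, truth)) / n)
    return 3.0 * rms
```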
The inspection algorithm establishes a correspondence between boundary points in the run-time image and in the trained pattern, and then uses the correspondence to rate each boundary point. Unlike grid-based template comparison algorithms, there is no loss in fidelity under orientation and size variation, no loss of fidelity along edges, and no outright failure under shading variations.
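A simplified version of the correspondence idea: match each trained boundary point to its nearest run-time point, and flag points whose best match exceeds a tolerance as defects. This nearest-neighbor sketch is hypothetical; the production algorithm builds correspondences far more carefully:

```python
import math

def inspect(trained_pts, runtime_pts, tolerance):
    """Rate each trained boundary point by its distance to the nearest
    run-time boundary point; points with no run-time match within
    'tolerance' are flagged as defects (e.g. missing features).
    Points are (x, y) tuples in a common coordinate frame, i.e. after
    the run-time pose has already been found and applied."""
    defects = []
    for p in trained_pts:
        d = min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in runtime_pts)
        if d > tolerance:
            defects.append((p, d))
    return defects
```

Because the points are real-valued and pose-corrected before comparison, the rating degrades neither along edges nor under orientation, size, or shading variation, unlike grid-based template comparison.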
How To Evaluate Geometric Matching
As the concept of geometric matching becomes accepted and widespread in industry, it will be important for potential users to be able to evaluate such systems independently of manufacturers' claims. Here are some things one should look for in evaluating or comparing this technology.
Measure Orientation and Size
A true geometric pattern matching tool should allow 360° variation in orientation; there is no good reason to impose any limits. Size variation is of course limited by camera field of view and resolution, and may have further practical limits, but should nevertheless cover a very wide range. There is no reason not to support aspect ratio variation either. Even if your objects don't exhibit such variations, it can often simplify training to be able to use a standard synthetic pattern and then adjust the geometry at run time, without retraining.
Accuracy Unaffected by Orientation and Size
While many tools offer high accuracy when there is no variation in orientation and size, another key test of a geometric pattern matcher is that accuracy be unaffected by these variations. This matters because almost all applications exhibit some variation, and it defeats the purpose of geometric matching if accuracy falls off beyond a few degrees or a few percent of size, as NC's does. It matters most in networked factories, where patterns may be distributed to machines with varying optical magnification and camera angle.
Measure Orientation and Size Quickly
The ability to measure orientation and size is of little use if it's too slow. Modern PCs have little trouble with a 2 degree of freedom, translation-only search, so a meaningful evaluation should use a 3 or 4 degree of freedom case.
Tolerate Shading Variation
A geometric pattern matcher should be able to tolerate wide variations in shading, such as might occur with unoriented metal parts, or due to the CMP step in wafer processing.
Tolerate Missing and Extra Features, Touching and Overlapping Objects
Template matching methods such as NC are good at tolerating missing and extra features, and touching and overlapping objects; in switching to geometric matching one should sacrifice none of this.
Tolerate Image Degradation
Noise, severe defocus, and very low contrast really stress feature detection. Even if your application doesn't exhibit these problems, it's a good idea to check them as a test of general robustness.
Tolerate Patterns with Ill-Defined Edges
Patterns with ill-defined edges, such as shown in figure 3, are the ultimate stress test for feature detection. Since the edge quality of objects in a given application may be poor or not known in advance, these are good test cases. A feature detector that can be tuned to operate over a range of granularity should be able to handle them.
Train by Showing
One of the benefits of NC template matching was that a pattern could be trained by simply drawing a box around the desired portion of an image. Training a geometric pattern matcher should be just as simple. Beware of fancy wizards and pattern editing tools; although they can be useful in unusual circumstances, they shouldn't be necessary in most cases, and most machine operators have enough challenge already without having to learn to use these training tools.
Generate Defect Data for Inspection
The ability to find defects when orientation, size, and shading vary, and along edges, is essential for many inspection applications.
Runs on PCs
If the pattern matching tool runs on an ordinary PC without special hardware, you can be sure that you'll keep up with Intel's aggressive performance improvements. Beware of custom accelerators from small companies that can't keep up with Intel.
In the end, the marketplace is the best judge of quality. Be sure to investigate actual installations of geometric pattern matching.
Proven Applications
Geometric pattern matching has been proven in hundreds of applications worldwide, including semiconductors, electronics, automotive, and food packaging. Here are some examples. More information can be found by consulting the indicated references.
KLA/Tencor is locating wafer patterns in thin film measurement machines. The wafers undergo processing steps that radically alter the pattern's shading, making NC template matching unsuitable.
Ganton Technologies (Sturtevant, WI) is locating randomly oriented engine parts in a robotic automation cell. Traditional vision techniques were unsuitable due to the variation in shading and orientation of the parts, and so the cell was originally constructed with 12 separate mechanical fixtures, which was inflexible and costly.
Modgal Corporation (Golan Heights, Israel) is doing robotic pick-and-place of metal plumbing joints. An older vision system was replaced with PatMax because the older system had trouble with shading variations caused by varying sunlight, and had limited ability to find the parts at varying orientation.
IBM Microelectronics (Burlington, VT) is verifying the alignment of semiconductor dies bonded to lead frames. In this application the die pattern and lead frame pattern are superimposed, but each has to be located separately to high precision. The dies and lead frames are trained separately, and then both patterns are located in the same run-time image.
ABB (Vasteras, Sweden) has introduced the new FlexPicker™ robot for single picking of randomly oriented items in food packaging and pharmaceutical applications. The product concept requires machine vision guidance, but no traditional pattern finding method was suitable.
Ford Motor Company (Windsor Aluminum Plant, Canada) is guiding the robotic loading of 27 kg aluminum engine blocks from an unfixtured bin to a fixtured conveyor. The application requires robotic automation to achieve the cost, quality, and human safety objectives, and the robot requires machine vision guidance. Traditional machine vision tools were unsuitable due to shading and orientation variation.
References
Horn, B.K.P. Robot Vision. MIT Press, 1986, chapter 4.
Silver, W.M. "Normalized Correlation Search in Alignment, Gauging, and Inspection." Proceedings of SPIE, Vol. 755, January 1987, pp. 23-34.
Silver, W.M., R.A. Wolff, and R.E. Dynneson. "Digital Image Processing System." U.S. Patent 4,972,359, November 20, 1990.
McGarry, J. News release, Acumen 900 series, October 22, 1991.
Hough, P.V.C. "Method and Means for Recognizing Complex Patterns." U.S. Patent 3,069,654, 1962.
Ballard, D.H., and C.M. Brown. Computer Vision. Prentice-Hall, 1982, pp. 75-81.
Robotics World, March/April 1999, pp. 20-25.
Manufacturing Engineering, March 1999, pp. 154-156.
Machine Design, March 11, 1999, p. 176.
ABB Flexible Automation. FlexPicker IRB 340 Industrial Robot product brochure, 1998.
Automotive Engineering International, February 1999.
Quality, April 1999, pp. 120-121.
© Copyright 2000, Cognex Corporation. All rights reserved.