University Research Related To Machine Vision
by Nello Zeuch, Contributing Editor - AIA Posted 02/02/2004
For the most part, university research in the U.S. related to machine vision is applied research, while research related to computer vision is generally basic research. Today most U.S. university research, in virtually all fields, is government supported. This was not always the case. Before WWII, virtually all university research was supported by industry, and hence it tended to have a more applied orientation. During WWII the Armed Forces established several agencies to support research at universities. While this research had a military orientation, for the most part it was basic research, one obvious exception being the applied research and advanced development that led to the atomic bomb.
Since universities found it much easier to obtain research funding from the government than to seek grants from industry, as they had before the war, after WWII they pressed Congress to establish agencies to support research for the common good of society, as opposed to research with military objectives. In the late 1940s the National Science Foundation and the National Institutes of Health were established, and the charter of what was then the National Bureau of Standards, a department within the Department of Commerce, was expanded. Suddenly universities found it even easier to get government money for research, and so pursued industry support even less.
Government bureaucracies avoid conflict at all costs. When an agency claims to be supporting "applied" research, there is always the potential of a member of Congress asking where the research is being applied, when in fact even applied research is a long way from being a solution ready for application. Consequently, government agencies supporting research generally find it more comfortable to support basic research. Hence, even today one will find more references to computer vision research than to machine vision research at universities.
One other personal note: back in the mid-70s, the National Science Foundation was charged by Congress to support applied research. Hence, it established the Research Applied to National Needs (RANN) Department. Probably the foremost applied research it supported was in solar cells, an activity ultimately transferred to the Department of Energy. In any event, within RANN there was a section called Advanced Productivity Research & Technology. Within this section was Dr. Bernard Chern, an office mate of mine, who supported the machine vision research at what was then SRI (Charles Rosen's work) and at the University of Rhode Island (Dr. John Birk's work). The former led to the development of the SRI machine vision algorithms, or global feature analysis algorithms, and the latter led to the development of a machine vision system for bin-picking. There may have been others that he supported.
While machine vision was not specifically my charter at NSF, I did have responsibility for one of the first, if not the first, university-industry consortia NSF supported – a Manufacturing Technology Center under Dr. Milt Shaw at Carnegie Mellon University. Within that center a number of projects involved robots and vision-guided robots.
The Carnegie Mellon site - http://www-2.cs.cmu.edu/afs/cs/project/cil/www/v-groups.html - lists many of the research groups around the world engaged in computer vision related research. The list includes about 55 groups in North America.
The Robotics Research Group at Carnegie Mellon continues to be active in computer vision related research. What follows is a summary of some of their relatively recent projects taken from their website that appear to have relevance to machine vision.
Under Dr. Vladimir Brajovic - Reflectance Perception is image processing software that intelligently compensates for illumination problems in digital pictures. Reflectance Perception has been developed to enable machines to approach the visual capabilities of the human eye. Users will find that its results resemble what their eyes would see if they viewed the scene directly instead of through a camera.
From the original image the software estimates the illumination field that illuminated the scene when the picture was taken. Then it corrects every pixel to produce a result that would have been seen if the scene were uniformly illuminated. The algorithm is intelligent in that it automatically "finds" where shadows start and stop just by "looking" at the original picture. By knowing where the shadow boundaries are, Reflectance Perception produces results free of a highly objectionable artifact known as "halo".
Commonly used tools to compensate for shadows in originals include brightness/contrast adjustment, gamma adjustment, and histogram equalization. Advanced users of Photoshop or similar photo editing packages could go through a tedious sequence of steps to "mask" objects from the shadows and then selectively change the brightness of one or the other. The difference is that Reflectance Perception (a) is an automatic, "one-click" process; (b) produces results superior to anything currently available on the market; (c) illuminates only what needs to be illuminated, where other tools "illuminate" the whole image; and (d) avoids the color shifts and "halos" or "auras" around the object/shadow boundary that other methods create.
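The core idea, estimating a smooth illumination field and dividing it out, can be sketched in a few lines. The following is a generic, single-scale Retinex-style approximation, not the CMU algorithm; its simple blur-based illumination estimate smears across shadow boundaries and therefore exhibits the very "halo" artifact that Reflectance Perception's boundary-aware estimate eliminates:

```python
import numpy as np

def box_blur(img, radius):
    """Cheap separable box blur (a crude stand-in for a Gaussian)."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, out)

def naive_reflectance(image, radius=15, eps=1e-6):
    """Retinex-style illumination correction: estimate the slowly varying
    illumination field in log space with a large blur, subtract it out,
    and rescale.  This naive blurred estimate crosses shadow edges and
    so produces the "halo" artifact the CMU work is designed to avoid."""
    log_img = np.log(image.astype(np.float64) + eps)
    illumination = box_blur(log_img, radius)   # smooth field = illumination estimate
    reflectance = log_img - illumination       # equivalent to division in linear space
    out = reflectance - reflectance.min()      # rescale to [0, 1] for display
    return out / (out.max() + eps)
```

In practice, one-click tools of this family differ mainly in how the illumination field is estimated; the CMU contribution described above is precisely a smarter, shadow-boundary-aware estimator.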
Several projects under Dr. Simon Baker include:
Template tracking is a well-studied problem in computer vision which dates back to the Lucas-Kanade algorithm of 1981. Since then the paradigm has been extended in a variety of ways, including arbitrary parametric transformations of the template and linear appearance variation. These extensions have been combined, culminating in non-rigid appearance models such as Active Appearance Models (AAMs) and Active Blobs. One question that has received very little attention is how to update the template over time so that it remains a good model of the object being tracked. This research proposes an algorithm to update the template that avoids the "drifting" problem of the naive update algorithm. Their algorithm can be interpreted as a heuristic to avoid local minima. It can also be extended to templates with linear appearance variation. This extension can be used to convert (update) a generic, person-independent AAM into a person-specific AAM.
Since the Lucas-Kanade algorithm was proposed in 1981, image alignment has become one of the most widely used techniques in computer vision. Applications range from optical flow, tracking, and layered motion to mosaic construction, medical image registration, and face coding. Numerous algorithms have been proposed and a wide variety of extensions have been made to the original formulation. They present an overview of image alignment, describing most of the algorithms and their extensions in a consistent framework. They concentrate on the inverse compositional algorithm, an efficient algorithm that they recently proposed. They examine which of the extensions to the Lucas-Kanade algorithm can be used with the inverse compositional algorithm without any significant loss of efficiency, and which require extra computation.
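For readers unfamiliar with the technique, the original 1981 formulation can be sketched compactly: linearize the sum-of-squared-differences error around the current warp and iterate a Gauss-Newton update. The translation-only NumPy sketch below is illustrative only; real trackers add image pyramids, general warps, sub-pixel interpolation, and reformulations such as the inverse compositional algorithm discussed above:

```python
import numpy as np

def lucas_kanade_translation(template, image, n_iters=50):
    """Estimate the pure translation (dx, dy) aligning `template` to
    `image` in the spirit of the 1981 Lucas-Kanade scheme: linearize
    the SSD error around the current warp and solve for an additive
    Gauss-Newton update.  (Illustrative sketch with nearest-neighbor
    sampling; not an efficient or general implementation.)"""
    h, w = template.shape
    p = np.zeros(2)                       # current (dx, dy) estimate
    ys, xs = np.mgrid[0:h, 0:w]
    for _ in range(n_iters):
        # Warp: sample the image at the shifted coordinates (nearest pixel).
        yy = np.clip(np.round(ys + p[1]).astype(int), 0, image.shape[0] - 1)
        xx = np.clip(np.round(xs + p[0]).astype(int), 0, image.shape[1] - 1)
        warped = image[yy, xx]
        error = template - warped
        gy, gx = np.gradient(warped)      # image gradients at the current warp
        # Gauss-Newton step: solve (J^T J) dp = J^T error.
        J = np.stack([gx.ravel(), gy.ravel()], axis=1)
        H = J.T @ J
        dp = np.linalg.solve(H + 1e-9 * np.eye(2), J.T @ error.ravel())
        p += dp
        if np.linalg.norm(dp) < 1e-3:     # converged
            break
    return p
```

The inverse compositional trick the CMU group analyzes moves the gradient and Hessian computation out of this loop by linearizing around the template instead of the warped image, which is what makes it so efficient.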
Although the concept of the "light-field" of a scene dates back far earlier, Levoy and Hanrahan's seminal 1996 SIGGRAPH "Light-Field Rendering" paper created a large amount of interest in the graphics community in the capture and use of the light-field for rendering. The light-field is also of great interest in the computer vision community, where it has a variety of other applications besides rendering. They have explored a couple of such applications, as outlined below:
When is stereo unique and when is it inherently ambiguous? They have investigated this theoretical question and derived a concise characterization of when the stereo problem (given the entire light-field) has a unique solution and when there are multiple scenes that could have generated the same set of photometric measurements (the light-field).
They have developed an appearance-based face recognition algorithm that can operate given any subset of the light-field of the face. They call this algorithm "eigen light-fields" because it is a generalization of "eigen faces." The training and testing subsets of the light-field do not need to overlap. Hence, their algorithm can perform face recognition across pose; for example, it can recognize a person from a profile view even though the algorithm has only seen that person from the front in the training data.
Under Dr. Illah Nourbakhsh, CMUcam has been developed as a new low-cost, low-power sensor for mobile robots. CMUcam can be used to do many different kinds of on-board, real-time vision processing. Because CMUcam uses a serial port, it can be directly interfaced to other low-power processors such as PIC chips.
At 17 frames per second, CMUcam can do the following:
- track the position and size of a colorful or bright object
- measure the RGB or YUV statistics of an image region
- automatically acquire and track the first object it sees
- physically track using a directly connected servo
- dump a complete image over the serial port
- dump a bitmap showing the shape of the tracked object
Using CMUcam, it is easy to make a robot head that swivels around to track an object. One can also build a wheeled robot that chases a ball around, or even chases you around. In the Gallery, you can see pictures and videos of some of the robots they and others have built with CMUcam. The following have been licensed to sell CMUcams: Acroname – USA, Seattle Robotics – USA, Lextronic – France, Roboter-teile.de – Germany.
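As a rough software analogue of the camera's color-tracking primitive, the sketch below thresholds a frame against an RGB window and reports the blob's centroid, bounding box, and pixel count, roughly the summary statistics a sensor like CMUcam streams over its serial port. This is a hypothetical host-side illustration, not the CMUcam firmware or serial protocol:

```python
import numpy as np

def track_color(frame, lower, upper):
    """Mark the pixels of an (H, W, 3) RGB frame whose values fall
    inside the window [lower, upper], then summarize the blob by its
    centroid, bounding box, and pixel count.  (Illustrative host-side
    analogue of a CMUcam-style "track color" primitive.)"""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    mask = np.all((frame >= lower) & (frame <= upper), axis=-1)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None                       # nothing inside the color window
    return {
        "centroid": (float(xs.mean()), float(ys.mean())),
        "bbox": (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())),
        "pixels": int(len(xs)),
    }
```

Feeding the centroid into a servo control loop is exactly how the ball-chasing and head-tracking demos described above are built: steer so the centroid stays near the image center.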
The Georgia Tech Research Institute under Wayne Daley continues to do applied research specifically in machine vision for various food products. Their pioneering work in applying machine vision to chicken inspection continues as they migrate to Firewire and USB camera-based approaches. They have licensed two of their chicken inspection/grading developments: one to Gainco, a Georgia company that plans to augment its back-end chicken weighing and sorting technology with machine vision to look for defective birds on the kill line, and the other to Spectral Fusion, a UK company that offers a family of x-ray-based machine vision products.
Their current research activities include adapting 3D as well as 2D machine vision for cutting and deboning operations; the combination of 3D and 2D in color provides both guidance and quality checking. They are also working with another Georgia-based company to commercialize a system to inspect, grade, and sort citrus products. In addition, they are doing work for the baked goods industry, comparing internal cooking-temperature images with known-good thermal models to estimate core temperatures and tie back into oven control based on both thermal and color image data analysis.
The Computer Vision and Image Processing Lab (CVIP) under Dr. Aly A. Farag was established in 1994 at the University of Louisville and is committed to excellence in research and teaching. CVIP has two broad focus areas: computer vision and medical imaging. Among the active research projects at the laboratory are the following:
Trinocular active vision, which aims at creating an accurate 3-D model of indoor environments. This research is leading to the creation of the U of L CardEye active vision system, which is their research platform in advanced manufacturing and robotics. Their Card Robot project's main goal is to develop an autonomous robot that uses the CardEye as its vision head. Card Robot is to navigate various urban terrains and will be the basis for human-friendly applications such as providing tours, care of the elderly, and surveillance. Other versions of Card Robot will take on tasks such as park service and law enforcement.
In addition to the above machine vision related research, they are working on multimodality image fusion, which aims at creating robust target models using multisensory information, and on building a functional model of the human brain based on the integration of structural information (from CT and MRI) and functional information (from EEG signals and functional-MRI scans). The functional brain model is their platform for brain research in learning, aging, and dysfunctions. Another project is image-guided minimally invasive endoscopic surgery, which aims at creating a system to assist surgeons in locating and visualizing, in real time, the endoscope's tip and field of view during surgery. They are also building a computer vision-based system for reconstruction of the human jaw from intra-oral video images. This research will create the U of L Dental Station, which will have various capabilities for dental research and practice.
At the University of Colorado under Dr. Ross Beveridge, most of the emphasis has been on face recognition. They have implemented open-source versions of four baseline face recognition algorithms, selected based upon their performance in the FERET 1996/97 tests (a comparison of algorithms sponsored by NIST and several other federal agencies). These are now available through their website, and their system has been downloaded by over 2,500 sites to date. They have also been doing advanced work on methodologies for comparing and studying human identification algorithms, including studying what factors make subjects harder or easier to recognize by face using a standard PCA (Eigenfaces) algorithm.
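The PCA (Eigenfaces) baseline mentioned above is simple enough to sketch: project faces onto the top principal components of the training set and match by nearest neighbor in that subspace. The following is a generic textbook version, not their open-source baseline implementation:

```python
import numpy as np

def train_eigenfaces(faces, k):
    """Standard PCA "Eigenfaces" baseline.  `faces` is an
    (n_images, n_pixels) array; returns the mean face and the top-k
    principal components of the training set.  (Generic textbook
    sketch, not a specific research implementation.)"""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data yields the principal components directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]                   # (mean face, k eigenfaces)

def project(face, mean, eigenfaces):
    """Project one face into the k-dimensional eigenface subspace."""
    return eigenfaces @ (face - mean)

def recognize(probe, gallery, mean, eigenfaces):
    """Nearest-neighbor match in eigenface space; returns the index of
    the closest gallery face."""
    p = project(probe, mean, eigenfaces)
    dists = [np.linalg.norm(p - project(g, mean, eigenfaces)) for g in gallery]
    return int(np.argmin(dists))
```

The methodological work described above asks, in effect, which probe/gallery pairs an algorithm like this finds hard, and why.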
This is the first in what I hope will be a series of articles to introduce you to some of the activities related to computer vision and machine vision going on at our universities. Perhaps some of the projects described will trigger thoughts of commercialization that will be mutually beneficial.