Camera and Image Sensor Technology Fundamentals - Part 2
Part of the AIA Certified Vision Professional-Basic program, Steve Kinney, Director of Technical Pre-Sales and Support at JAI, Inc., teaches the fundamentals of camera and image sensor technology. You'll gain an understanding of camera design including CCD and CMOS sensor technology.
This is section two of the CVP Basic camera course. We are now going to talk about the concepts of the digital camera, beginning with analog and digital concepts in imaging. Charges from the pixels must first be converted to a voltage, and this is done with a capacitive circuit. The voltage levels must then be measured and converted to a number, which is done with an analog-to-digital converter, and along the way the gain and offset can be adjusted before the conversion. The analog-to-digital converter represents the voltage levels, or the gray scale levels of the camera, as binary numbers. As humans we tend to use base-ten math: we think in decimal and count 0, 1, 2, 3, 4. But computers and digital devices like cameras can’t think like this; it would take more lines. Computers think in binary, which means each digit is represented only as a zero or a one, and they count in binary; mathematically this is base-two math. You can imagine this is like the odometer in your car, but with just a 0 and a 1 on each wheel. The first digit rolls from 0 to 1; when the one rolls over, it goes back to zero and rolls the next digit to one, which stays at one until the first digit saturates again. When they are both at one, they both go to zero and the next digit goes to one. So just think of it like your odometer, but with nothing but a 0 or a 1 on there, and that is how we count in binary. When the camera makes a digital signal, it simply has to make a bit high or low and string the binary number out in a single serial string. Why is this important for imaging? We tend to think of the gray scale values in the camera. The camera is converting light into a value, and the more bits we have in that value, the more gray scales we can have and the more accurately we can measure that light.
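To make the odometer picture concrete, here is a minimal Python sketch. It is not from the talk, and the function names are invented for illustration. It lists every state a small binary counter rolls through and shows how the number of available gray levels grows with the number of bits.

```python
def count_in_binary(n_bits):
    """Return every state of an n_bits-wide binary 'odometer', in order."""
    return [format(value, f"0{n_bits}b") for value in range(2 ** n_bits)]


def gray_levels(bit_depth):
    """Number of distinct gray values a pixel with this many bits can hold."""
    return 2 ** bit_depth


if __name__ == "__main__":
    # A 3-digit binary odometer rolls through 000, 001, 010, ... 111.
    print(count_in_binary(3))
    for bits in (1, 4, 8, 10, 12):
        print(f"{bits:2d} bits per pixel -> {gray_levels(bits)} gray levels")
```

Each extra bit doubles the number of gray levels, which is why a few more bits per pixel make such a visible difference in a shaded area.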
The depth of the camera is then called the bit depth, and that’s how many bits per pixel are being represented in the image. We start at the bottom. This is what we would call a one-bit image. That means every pixel has one bit representing it, which can only be a zero or a one, black or white. There are only those two values available, so if I look at a shaded area then I see only black and white. If we increase to four bits per pixel, that means instead of one I now have four binary digits, and two to the fourth power means I have 16 gray values as I go from dark to light. It is a little more accurately represented, but it still seems somewhat cartoony. An 8-bit image is the most common in machine vision. Two to the eighth power is 256, so I have 256 gray values, and you can see that I have a fairly smooth transition from white to black. We have left this blown up and a little pixelated to make the point for you, but essentially I have enough values to make a transition that is smooth to your eyes. So you would say, “Well, why don’t I just want more, and why don’t I have 12, 14, or 20 bits all the time?” Well, there are some high-bit-depth considerations. With more bits you get more accurate measurements of light. But with more bits you also have more data to transfer, process, and store. The data from the image is going across my interface (we will talk about interfaces near the end), so if I transmit 12 bits instead of 8 bits per pixel then I have 50% more data per pixel. When we talk about megapixel cameras, one, two, four, eight, all the way up to 20 and 30 megapixels nowadays, at higher frame rates, this becomes a lot of data.
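The data-volume argument is easy to put numbers on. The sketch below is illustrative only: the function name and the example camera (20 megapixels at 30 frames per second) are assumptions, and it assumes pixels are packed with exactly the stated number of bits, whereas many real interfaces pad 10- or 12-bit data out to 16 bits.

```python
def data_rate_mbytes_per_s(megapixels, bits_per_pixel, frames_per_s):
    """Raw image data rate in megabytes per second (1 MB = 1e6 bytes).

    Assumes tightly packed pixels with no padding or protocol overhead.
    """
    bits_per_second = megapixels * 1e6 * bits_per_pixel * frames_per_s
    return bits_per_second / 8 / 1e6


if __name__ == "__main__":
    # Hypothetical 20-megapixel camera running at 30 frames per second.
    for bits in (8, 10, 12):
        rate = data_rate_mbytes_per_s(20, bits, 30)
        print(f"{bits:2d}-bit pixels: {rate:.0f} MB/s")
```

Going from 8 to 12 bits multiplies every figure by 1.5, which is the 50% increase mentioned above, and that extra data has to be carried by the interface and handled by the PC for every frame.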
So you want to manage your bit depth for what you need in the application: get enough bit depth and resolution, but keep your data minimal to manage the data transfer and not create a lot of extra work in the PC for properties you don’t necessarily need. The other consideration is the accuracy of the image. With more bits, if I have a 10- or 12-bit image, then I have roughly 1000 or 4000 gray scales per pixel. That means I can more accurately sample the light, but it also means I’ve more accurately sampled the noise. It doesn’t necessarily mean there’s more noise, but it means that I can see the noise in those last bits. This is good in some applications. Some high-end medical professionals actually use that noise to do noise analysis and remove noise from sequenced images. But in general, over-digitizing and digitizing the noise portion of the camera’s signal just creates extra data with no useful bits. We see this a lot too. It doesn’t make sense to put a 12-bit output on a camera that uses a small-pixel CMOS imager. We talked about small pixels in section 1, and if you only have a 4000-electron pixel it doesn’t make sense to have 4000 gray scales, because in the charge conversion you are not going to see a single photon in the well. The signal-to-noise ratio of a small-pixel, 4000-electron imager is fairly low. In fact it’s probably going to be 30 or 40 dB. To get 8 bits you need at least 48 dB of dynamic range or signal-to-noise ratio, and to get 12 bits you need about 72 dB. So it does not make sense to put a high-bit-depth, 72 dB-type output on an imager that can only make 30 or 40 dB. What are the benefits of digital cameras? Why do we see the whole world going from analog to digital, whether it be consumer camcorders or machine vision cameras? What’s gone on here? In section 1 we talked about the TV formats from the early days.
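Looking back at the decibel figures for bit depth, the relationship can be checked with a few lines of Python. This is a sketch rather than anything shown in the talk: the function names are made up, it uses the usual 20·log10 convention (about 6.02 dB per bit), and the last function assumes the sensor is limited only by shot noise, so that the noise at full well is the square root of the electron count.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # roughly 6.02 dB for each extra bit


def db_needed_for_bits(bit_depth):
    """Dynamic range (dB) needed for every one of 2**bit_depth levels to be meaningful."""
    return 20 * math.log10(2 ** bit_depth)


def useful_bits_for_db(dynamic_range_db):
    """Whole number of bits that a given dynamic range (dB) can actually fill."""
    return math.floor(dynamic_range_db / DB_PER_BIT)


def shot_limited_snr_db(full_well_electrons):
    """Best-case SNR at full well, ASSUMING shot noise only (noise = sqrt(N))."""
    return 20 * math.log10(math.sqrt(full_well_electrons))


if __name__ == "__main__":
    for bits in (8, 10, 12):
        print(f"{bits} bits needs about {db_needed_for_bits(bits):.1f} dB")
    print(f"4000 e- well, shot-noise limit: {shot_limited_snr_db(4000):.1f} dB")
    print(f"40 dB supports about {useful_bits_for_db(40)} useful bits")
```

Under these assumptions 8 bits needs about 48 dB and 12 bits about 72 dB, while a 4000-electron well tops out near 36 dB, so the extra bits of a 12-bit output on such a sensor would mostly be digitizing noise.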
Everything camera-wise was analog in the early days, because I expected to go home and plug it into the TV or into my VCR. It was a standard format, it was transferable, everything worked together, and the whole world was analog. But as imagers became higher resolution, and as we moved to progressive scan and non-interlaced scanning, we left some of the standard formats, and in fact in machine vision we want nonstandard formats. One of the drivers here is just getting our arms back around these nonstandard formats in a way we can handle. But there are also advantages to digital cameras that you don’t see in an analog camera. They start with configurability. Analog cameras tended to be configured by switches. As the world moved on, some high-end analog cameras did have RS-232 serial control from a PC, but by and large analog cameras went back to TV days and probably had buttons and switches on them. Digital cameras came along later, when product development already had a little brain in there to drive the camera anyway. That brain tends to be digital, so we can talk directly to it, and not only that, in a lot of cases there is extra processing and there are features we can do inside the camera. With the analog camera we tended to have this: here is a representation of the analog signal, here’s the sync, and here are the gray scale values of one horizontal line of the image, and it would go into a PC for machine vision. In the consumer world you would connect it to your TV or VCR, but of course we’re concerned about the machine vision world. In a few applications you might just be monitoring, but in most cases users then have to convert to digital to get the image into the PC, do some analysis of it, and create the output or action that we expect in machine vision.
By doing this, we took the nice flexible analog output from the camera and ran it into a digitizer. That can create some problems. One is pixel jitter between the two devices: the digitizer has to sample precisely at the clock frequency of the camera. It uses a phase-locked loop to lock onto it, but there’s always a little bit of jitter. Noise and EMI susceptibility go hand-in-hand with an analog signal. By the standard, this is a one volt peak-to-peak signal with approximately 270 mV of sync. That means that only just over 700 mV represents the entire analog portion of the signal. As we saw in section 1, the pixel intensity is converted to an analog blip that is represented somewhere on the horizontal line. What that means is, “Hey, I’m running down a single cable with this analog signal, and I’ve only got 700 mV representing the full scale of my pixel and only a millivolt or so representing the bottom end of that pixel!” Actually, in our analog signal there is a 50 mV offset to compensate for some of this, but essentially the entire signal is 700 mV. That means that if I pick up as little as 10 mV of noise coming down my cable, those 10 mV are a substantial portion of that 700 mV signal, and that’s where you see noise or interference. If you’ve ever watched TV with a bad antenna signal, or someone started a motor or something noisy nearby, you see snow on the screen. That’s because the noise is making spikes in the signal, and those spikes are interpreted by the receiving device as a bright pixel, or a dark pixel if it’s a dip. That noise is inherent in an analog signal. A digital camera makes a chain of zeros and ones, as we described in the binary counting, and transmits them over the interface.
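To see why 10 mV matters on an analog line, it helps to express the pickup in gray levels. This is a rough sketch with an invented function name; it assumes the 700 mV video span is digitized linearly to 8 bits and ignores the 50 mV offset.

```python
def noise_in_gray_levels(noise_mv, video_span_mv=700.0, bit_depth=8):
    """Convert millivolts of pickup on an analog video line into gray levels.

    Assumes the full video span maps linearly onto 0 .. 2**bit_depth - 1.
    """
    top_level = 2 ** bit_depth - 1
    return noise_mv / video_span_mv * top_level


if __name__ == "__main__":
    for pickup_mv in (1, 10, 50):
        levels = noise_in_gray_levels(pickup_mv)
        print(f"{pickup_mv:3d} mV of pickup is about {levels:.1f} gray levels (8-bit)")
```

With these assumptions, 10 mV of pickup moves an 8-bit pixel value by between three and four gray levels, whereas the same 10 mV on a digital line stays far below the level needed to flip a 0 into a 1.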
This is inherently more immune to noise because the noise must upset one of these two levels. Depending on the method of transfer, either these are higher voltages, so the noise has to be several volts before the line toggles from a 0 to a 1, or the line is current-driven, where the voltage levels are low but it takes a lot of current. So even if noise makes a little bit of voltage on the line, it can’t really build any charge because there is no current behind the noise being picked up. I’m transferring zeros and ones, and the receiver only has to tell whether the line toggled way up here to a one or way down here to a zero to get it right. For noise to cause an error it has to be so great that it offsets those levels, and if you have that much noise in your environment it’s probably not only the camera’s fault. So digital is inherently more immune to EMI and other noise sources. This is very important in factory settings. As a camera manufacturer we have definitely experienced cases where people have done installations and run cables too close to hundred-horsepower electric motors. I saw a case at an automotive plant where they literally ran a Gigabit Ethernet cable over a 100 hp electric motor and were inducing currents in it. Digital is not entirely immune, but it is more immune than analog. When we talk about digital camera benefits versus analog, as I said, analog has standard formats: EIA and NTSC in the United States, CCIR and PAL in Europe. These are standard formats. I can plug anything together and it works, but it is in fact analog and has the disadvantages we talked about, including interlaced scanning. In digital we have different formats and we might have different resolutions, VGA, SVGA, XGA, and as we leave these and go on we see resolutions all the way up through about 30 megapixels nowadays.
The frame rate might be different as well, not only the resolution, and so might the aspect ratio, whether it’s one to one, four to three, or wide HDTV, which happens to be a 16 to 9 aspect ratio. Those are all varying things that we have to deal with on the capture end, but in the digital world we just deal with them in the interface standards. We make a standard that takes care of all of these, and in most cases (we will talk about the standards) either the format is predefined or it’s flexible and the camera says, “I have this many pixels horizontally, I have this many vertically,” and the receiving device configures itself. The point is that in the digital world we have the communication and the tools to set up for these nonstandard formats, whereas in the analog world, with your VCR and other things, either the signal fits the format or it doesn’t, and if it doesn’t then you are going to have to get an analog capture card and configure it. So digital gives us the chance to standardize the output and make plug-and-play type interfaces. Analog does have more than one format. I’ve described mostly what we call the composite format, which is the single cable that you’re accustomed to, either a coaxial cable bringing the raw signal from the antenna or the RCA type that goes on the front of the VCR and carries the analog signal. In analog there are other cases. There is S-VHS, also called Y/C video, which breaks that analog signal into two components, one for intensity and one for chroma. Or there is RGB, which means there are now three or four cables: three if the sync is on the green, four if the sync is brought out separately. What RGB means is that there’s a separate channel for red, green, and blue, and each channel carries only intensity.
So RGB is actually the highest-level analog format, because the color components are each separated onto their own plane, and that gives less crosstalk between the colors. In composite there is a chroma burst for the entire signal, at four times the speed of the rest of the information, which means there is a little jitter and crosstalk among those colors. S-VHS or Y/C, with its two cables, is somewhere in between. But in digital there is no crosstalk, because we represent each pixel as a number. Each pixel goes out discretely as a number, but there are different formats. There are ways I can tell the host, “This pixel I just sent you was an RGB pixel and has everything in one pixel (red, green, and blue); it’s a 24-bit value with eight bits of each.” Or it tells the host, “I’m a YUV pixel.” I am not going to go into the coding here, but the coding just says, “This is how I encoded it, in the digital standard YUV 4:2:2.” It might be monochrome, saying, “That is a single 8-bit or a single 16-bit value for this pixel I sent you.” Or it might be raw Bayer, which means I’m just sending you the raw pixel: “It’s a red pixel and it has an 8- or 10-bit value, and once I’ve told you about all the red, green, and blue pixels, you use your own interpolation to get your color out of that raw data.” The point is that the digital standard has to know the type of pixel, not just the pixel and its value. Digital cameras can also provide advanced features. Again, because they are driven essentially by a small microprocessor inside, whether it is physically a microprocessor or another type of logic device like an FPGA or an ASIC, there is a device inside driving all the digital electronics and clocking the imager, and correspondingly we can talk to that device and build features in. Some of the most commonly used basic features are things like test patterns.
So in a digital system I can turn the test pattern on in the camera and I should see it on the other end, and if I don’t, I can start troubleshooting where my signal went. But it also includes things like timestamps, frame counters, settings for my I/O ports if I have some input and output connected to the camera, and things like error checking. When I send a frame across, at the end of receiving it I can have a checksum or some other kind of error check to make sure that I didn’t introduce errors in the middle of that frame or miss a pixel. All types of things come with digital benefits. Digital cameras can do onboard processing as well. Again, there’s a brain in there. Often, to get all the I/O and the things I need to make a good camera, I had to use a bigger processor inside the camera just to get all the structure to interface it to the outside world, but then I have processing left over inside. What you see is manufacturers trying to make unique things to separate themselves from other manufacturers and attack various applications with the features we can build into these smart digital devices. What you’re seeing here is actually a two-CCD camera, with two CCDs aligned pixel for pixel on a prism and capturing two simultaneous images. One image is taken at a high gain and slower exposure so I can see inside the car and see the faces, while the other image is turned down so that I can see the headlights and presumably the license plate if I wanted, but it’s too dark to see inside the car. With the range of a single sensor I can’t see both. If I turn it up to see the faces then I wash out around the headlights, and if I were trying to read the license plate I couldn’t see it. Yet if I turn it down to see those features, I can’t see the faces at all.
This camera, because it has a digital brain and processor, is able to take the two images and fuse them into one high dynamic range image, where I can see everything: the lights and the plate, but also inside the car. Because this is done in the camera, it is served up to the user as a high dynamic range image, and all of a sudden the user has a camera that attacks an application that was previously a problem for a single camera. In the analog realm we simply couldn’t do this kind of processing. This is another example of what’s going on in that camera, looking at a light bulb. If we want to see the filament, we have to use a very fast shutter and very low gain to see the filament without it glowing, but then if I need to read the text there is no chance. This camera can align the two sensors on the prism, fuse the images, and read it all out as one image in which the user sees everything he needs, including the glass around the light bulb. You can see the fine detail, the printing, everything. This again is made possible because we have a smart processor in the middle. I can do the image processing and image fusion, feed it through a memory circuit, and then the rest of this would be the typical output for the camera: the physical interface and a format, in this case GigE Vision.
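The fusion step can be pictured with a very small Python sketch. This is not the algorithm used in the camera described here, which the talk does not spell out; it is one simple, commonly used approach, and the function name, the saturation value, and the gain ratio are assumptions. Wherever the high-gain image has not saturated it is trusted and scaled back down by the gain ratio; wherever it has saturated, the low-gain image is used instead.

```python
def fuse_two_exposures(high_gain, low_gain, gain_ratio, saturation=255):
    """Fuse two pixel-aligned images into one high dynamic range image.

    high_gain, low_gain: lists of rows of pixel values from the two sensors.
    gain_ratio: how many times brighter the high-gain image is than the low-gain one.
    Returns values in the units of the low-gain image, as floats.
    """
    fused = []
    for high_row, low_row in zip(high_gain, low_gain):
        fused_row = []
        for high, low in zip(high_row, low_row):
            if high < saturation:
                # Dark region: the high-gain pixel holds the cleaner measurement.
                fused_row.append(high / gain_ratio)
            else:
                # Bright region: the high-gain pixel is washed out, use the other one.
                fused_row.append(float(low))
        fused.append(fused_row)
    return fused


if __name__ == "__main__":
    # One row: a dark face inside the car, then a headlight.
    high = [[64, 255]]   # face is visible, headlight is saturated
    low = [[4, 200]]     # face is nearly black, headlight is measurable
    print(fuse_two_exposures(high, low, gain_ratio=16))
```

The fused row keeps a usable value for both the face and the headlight, which neither input image manages on its own. A real implementation would also blend smoothly near the saturation point and then compress the result to the output bit depth.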
Okay, so now we want to talk about image quality basics. Those include temporal noise. Temporal noise is anything besides light that can change a pixel’s value over time: things like temperature, random noise, and thermal noise. As cameras get hot out in the sun, we see the noise level in the sensor come up. This shows up like snow on a TV screen. That is temporal noise: the pixel value is not steady, it’s changing with time, temperature, something. Spatial noise is often called fixed pattern noise. It’s fixed. It’s still a noise source, and the pixel’s value is off by some amount, but that amount is constant. It doesn’t change, so although there is still an error there, we can correct things like pattern noise out. This is why CMOS sensors typically have what is called fixed pattern noise correction. They sometimes have more advanced corrections as well, and what they are going after is the spatial noise, the part of the signal they know is not real. They have to get that out to get a higher quality image, and because they know their sensor and its parameters and the noise is fixed, they can apply a correction. Temporal noise cannot be mathematically corrected out. It might be averaged out over a number of frames or reduced by some other correction, but in a single frame you cannot mathematically correct temporal noise because of its random nature. Some sources of temporal noise: first, shot noise or photon noise. A lot of people hear this term and see its effect, but they don’t recognize that it is due to the nature of light. Shot noise has nothing to do with the camera. What it amounts to is that light is a wave, as we described at the very beginning, and waves can cancel out. So if I have a bunch of rays coming at my camera at a certain light level, then a certain number of those rays get out of phase with each other before they reach the camera and actually cancel each other out. Those don’t register on the camera.
So in bright light, if I have pixels filled up with 40,000 electrons and I have 40,000 rays of light coming at me, you don’t see this random cancellation, and you’re not very affected by shot noise. But then people turn the gain up. Say I’m in a lowlight situation: maybe I have a 40,000-electron well, but I’m only going to use five or six thousand electrons in that well, and I turn the gain up at the end to get my image brightness back. Then all of a sudden I have low photon counts. You can see that some of them canceled, and you see this as a random noise source because in this pixel a few canceled, while in that one not as many canceled, so it is brighter. So you see random variation from pixel to pixel that is just based on the nature of light. Again, this is called shot noise or photon noise. Dark current noise varies by sensor. In the example here it says that every 8 °C the dark current noise doubles; that’s an average value that varies by sensor, but it’s usually in that range. What’s happening is that the sensor is collecting photons and converting them to a voltage or current for the device. No matter whether it’s a CCD or a CMOS imager, it needs to collect that light and read out a value for it, but during the time that it is holding the charge and getting ready to read the value out, there’s always some of what we call leakage current leaking in. That’s called dark current noise, because even if I cap the camera and output a black image, when there shouldn’t be any signal at all, we can measure electrons flowing into the well. This is essentially the fact that my bucket is not really a bucket; it’s a little bit leaky.
You can imagine a well built with bricks or something, where a little bit leaks in through the sides. So if I hold the charge for a long time, say I do a four-second integration to get a real lowlight image in a scientific-type application, you can actually measure those electrons creeping into the well. They never really were photons, but they creep into the well anyway and get measured, and that’s called dark current noise. That creates a fixed offset in the camera, but because it varies from pixel to pixel it’s still a temporal noise source, since it’s not perfectly uniform from pixel to pixel. Then there is also quantization noise, which is errors coming from the A to D conversion process. Whether it is a CCD or a CMOS imager, there’s still an analog-to-digital converter somewhere. In the CMOS imager the analog-to-digital conversion takes place on the chip, whereas in a CCD camera the charge is read out of the chip and the conversion takes place inside the camera. Both have an A to D converter, and an A to D converter always has a little bit of what we call quantization noise. It is affected by some of these thermal effects as well, causing a little bit of noise.
Tip: use the better A to D converter to get less quantization noise.
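For a feel of how large quantization error is in the best case, the sketch below models an ideal converter, which is a textbook simplification and not a description of any real A to D part. Rounding a smooth ramp to the nearest code leaves an error spread evenly over plus or minus half a step, and its RMS value works out to 1/√12, about 0.29 of one step (LSB). The function name and sample count are arbitrary.

```python
import math


def quantization_rms_lsb(bit_depth, samples=100_001):
    """RMS rounding error of an IDEAL converter, in units of one LSB.

    Feeds a linear ramp spanning the full code range through round-to-nearest.
    """
    top_code = 2 ** bit_depth - 1
    total = 0.0
    for i in range(samples):
        exact = i / (samples - 1) * top_code   # ideal analog value, in LSB units
        error = exact - round(exact)           # what rounding to a code loses
        total += error * error
    return math.sqrt(total / samples)


if __name__ == "__main__":
    print(f"measured  : {quantization_rms_lsb(8):.4f} LSB")
    print(f"1/sqrt(12): {1 / math.sqrt(12):.4f} LSB")
```

In this ideal model the error is always about 0.29 LSB, so more bits (a smaller LSB) means less quantization noise in absolute terms. Real converters add their own noise and nonlinearity on top of this floor, which is where converter quality comes in.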
This is fundamentally important because, as I just told you, in a CMOS camera the A to D converter is built on the chip. Not only that, there are massively parallel converters on the chip reading out all these pixels, which is fundamentally why CMOS imagers are faster. However, by the nature of that, they have to keep the structure very simple and very small to fit at the pixel level or the column level, and therefore the on-chip A to D converters in a CMOS imager are nowhere near on par with the off-chip, discrete A to D converter that we can build into a CCD camera. Plus, in a CCD camera there’s only one stream running through one A to D converter, so there is no variation from converter to converter. So quantization noise is a big difference between CCD and CMOS cameras. This is also why CMOS cameras tend to be a little noisier than CCDs. This is exaggerated here; it is obviously at high gain, and the compression of the slide is adding a little to it. But this is showing spatial noise from a CMOS imager, and what you literally see is all these vertical stripes, or column noise, from the A to Ds and the converter variations I just talked about. You can see all of that in there, and that’s the primary noise source. Sensor design also has some effect. I don’t know that “bad sensor design” is the right phrase, but sensor design affects whether you get more or less noise, and it depends on the level of sensor you buy. Certainly CMOS cameras can be cheaper, and in a less expensive imager you would expect more pattern noise and fewer refinements, whereas in the higher-quality CMOS imagers that you see nowadays in machine vision you can expect a better image, even if there is still pattern noise in the background. There are also trade-offs, so it was not always bad design; sometimes it was the price of a feature I wanted. For example, we’re talking about global shutter.
If I implement a global shutter on a CMOS imager, that means I have to add an extra transistor to each pixel, because, unlike a CCD, there was no ground plane. That transistor now takes up space on the pixel’s surface, which means the pixel has a smaller photosensitive area. There are always trade-offs, some of which can result in spatial noise. I’ve mentioned signal-to-noise ratio a couple of times. The signal-to-noise ratio is the ratio of good signal caused by light to the unwanted noise, and it is the most important measurement of image quality for digital cameras. Again, in section one we described well capacity and having more electrons in the well. This is simulating the well and the readout structure. The blue electrons are the good signal I want: light is coming in and registering in the well. The red ones are the bad ones. Some of these are random and come from thermal fluctuations. Some are still random but came from dark current leakage, so some of these might have been leakage during the readout. The rest might be other random noise, or it might come from external noise sources. Whatever it is, there is some amount of noise, and the good signal can be divided by that noise to give you a signal-to-noise ratio. The higher the signal-to-noise ratio, the better the signal you have from the camera, the better we can digitize it and create more gray scale values, and the better the camera is for you. This also affects sensitivity. Good camera designs require less light to overcome the noise floor. If you have a bad camera design with high noise, you can’t have good sensitivity, because sensitivity means I can measure only one or two photons in there. If I have a whole bunch of electrons of noise, it’s harder to measure those few, so it comes back to camera design. Dynamic range, then, is the range from the brightest pixels the camera can measure to the darkest pixels the camera can measure.
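Both definitions reduce to a ratio of electron counts, usually quoted in decibels. The sketch below is illustrative only: the function names are made up, the 20-electron noise floor is a hypothetical figure and not a number from the talk, and it treats the noise as a single fixed value, ignoring shot noise.

```python
import math


def snr_db(signal_electrons, noise_electrons):
    """Signal-to-noise ratio in dB: good signal divided by unwanted noise."""
    return 20 * math.log10(signal_electrons / noise_electrons)


def dynamic_range_db(full_well_electrons, noise_floor_electrons):
    """Dynamic range in dB: largest measurable signal over the smallest."""
    return 20 * math.log10(full_well_electrons / noise_floor_electrons)


if __name__ == "__main__":
    FULL_WELL = 40_000   # electrons, the well size used as an example in the talk
    NOISE_FLOOR = 20     # electrons, HYPOTHETICAL value chosen for illustration

    print(f"dynamic range: {dynamic_range_db(FULL_WELL, NOISE_FLOOR):.1f} dB")
    # With a fixed noise floor, the SNR rises as the well fills up.
    for fill in (0.01, 0.125, 0.25, 1.0):
        signal = FULL_WELL * fill
        print(f"well {fill:6.1%} full: SNR = {snr_db(signal, NOISE_FLOOR):.1f} dB")
```

With a fixed noise floor, the signal-to-noise ratio at full well equals the dynamic range, and it drops steadily as less of the well is used, which is why the two figures are so closely tied together.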
Again I’m coming back to the light bulb example that I gave earlier, because it serves us in this case as well. What we can see is the dynamic range of the imager: its ability to measure from a certain average black level, while still measuring that black level, up toward the bright direction. We can see that the camera cannot measure everything: it cannot see the glass and the filament and still read the text, and if we turn it up we can read the text but we can’t see this. So we can fuse the images and create a higher dynamic range image, the point being that any CCD camera still has a dynamic range associated with it. The dynamic range is very closely tied to the signal-to-noise ratio, and generally they don’t exceed each other by too far. I am not going to discuss the actual measurement method, but they are related. This again is showing the signal-to-noise ratio and how it affects the camera’s output. This is the point at which the camera has no photons in the well, the black level of the camera. That output is driven to zero. Cameras are not necessarily single-photon devices, so it doesn’t mean there might not be light getting in there, but it means this is the minimum light this camera can detect. This camera is set in more or less a linear mode, so as light doubles the output doubles. I have some output, and then this is saturation, the point at which the well is full. At this point more photons may continue to hit, but the well can’t hold any more, so I can’t read out any more. The distance between the minimum and maximum points is then the dynamic range. As we go low in the dynamic range, there’s a point where I’m getting only a very small signal out of the camera. The noise floor is fixed in the camera, so for a given noise and a small signal, the signal-to-noise ratio is very low down here. As I have more light, I’m getting more signal in that well. The noise level is constant.
The signal-to-noise ratio is very high at the end of the curve. If they are available from the manufacturer, you can get signal-to-noise ratio measurements. Typically these are only good within one manufacturer, because depending on the test conditions it is not usually valid to compare one manufacturer’s curve to another’s and say, “Oh, all things were equal here.” But if you can get them, either in absolute terms where you can compare one camera to another, or for several cameras from one manufacturer, you can tell things about the cameras. A camera that is good in low light is not always best in bright light. So you want to consider what dynamic range you need and how you are using the camera. In one case it is machine vision where I always have fixed lighting, and therefore I can count on one operating point. In another it’s still machine vision, but maybe it’s on an autonomous robot that is driving around outside, so the light is changing and I’m concerned about the entire range. You want to consider that when you’re buying cameras. Again, you can tell that each of these cameras has a different point where it comes in on the light scale. This one is coming down to 1, this one is at 100, and the cameras with the highest dynamic range are usually the ones with the poorest sensitivity, because if I have good sensitivity and high gain, that means that unless I have a really big well, my well saturates faster. Typically these things are a little bit at odds with each other, and you want to pay attention to the cameras’ data, where it is available anyway, and to where you are going to use them. This is also the point where we want to mention the EMVA 1288 standard. This is an industry standard for measuring the image quality of digital cameras. To date this is the only industry standard available that really is applicable for comparing one camera to another. There are some standards out there that measure light sensitivity, but they don’t necessarily take noise into account, and of course noise can be traded for sensitivity.
If I have half the noise in a camera, I can then double the gain to get the same noise as another camera and have twice the sensitivity. So noise is a factor in sensitivity. EMVA 1288 is one of the few standards out there that takes everything into account, the sensitivity, the noise level, everything in the camera, and gives you metrics based on that. It tests using a known set of conditions, lights, lenses, targets, and so on, so manufacturers can’t fool it by cheating on the lens or some other parameter. There are specs out there, and manufacturers always put a signal-to-noise ratio on the back of a datasheet, but unless it’s an EMVA 1288 figure, the signal-to-noise ratio you read on a camera datasheet is actually meaningless for comparing one camera to another. One set of test conditions wasn’t exactly equal to the other, and even when they give you the basics and those look equal, sometimes in the background it’s still not an apples-to-apples comparison. EMVA 1288 standardizes all this in a way that leaves no fudge factor, so you can compare one camera to another. Also, to get away from unit-to-unit variation, results from multiple cameras must be published to show the level of consistency. It is the same thing we see in the security market with lowlight cameras: some of the highest specs are out there because the manufacturer tested a golden camera, but the average of a batch might not match that golden camera. With EMVA testing, they have to publish results for a batch of cameras and look at the averages, so that you can know how manufacturing variations affect the camera’s response under the standard and use that to your benefit. Again, this allows customers to compare one camera to another, apples to apples, and be sure the comparison is consistent. This is very good. My only concern about the standard is that it takes everything into account: all the noise sources, both spatial and temporal, camera sensitivity, everything.
Some applications care less about some things; maybe they are more concerned about the temporal noise than the spatial noise, even though the spatial noise is being held against the camera in this. So in some very narrow cases you may want to read between the lines and not just say, “Well, these cameras are equal because they are rated the same.” How were they equal? What is this one good at? But again, the advantage of EMVA 1288 is that if you have a full test report on the cameras, you can see all the parameters and you have the data to make those judgments.
So the last thing I want to talk about, real briefly, is some basic camera controls, and I want to emphasize the word basic. We are going to talk just about gain, exposure, shutter and things like that. If we open a digital camera today there are probably 50 different user parameters, and there are a lot of things that customers can control. There are a lot of special features, like I’ve shown, that manufacturers build in. We are not going to talk at that level today. This is the basic course, so we are going to talk about the basics affecting the image that are most useful to the user setting up for the first time. The first thing we want to talk about is gain, and most people understand this concept. Gain is akin to the volume control on your stereo. The camera is getting some light under some conditions and it’s outputting an image; if the image is too dark you simply turn the gain up, just like on the stereo, where if it’s too quiet I am going to turn the volume up. The gain is simply an amplifier, so it’s amplifying the video signal, but it is also amplifying the noise and everything else. Just like on the stereo, if you turn it up way loud and the song ends, you can hear static in between songs; the same is true in cameras. If I turn it to high gain I am going to see that fixed noise, and this comes back to what we’ve been saying all along about the signal-to-noise ratio and the well output. So we look at the output signal of the sensor, in percent, and then we look at the output value of the camera, either in gray scales or we could call it percent as well. What is basically happening is that, in a low light situation, I’m only filling that well up to 25% of what it would be in a bright light situation. In sunlight I get enough photons that the well, that 40,000 electron well, is completely full, and then I can run a minimum amount of gain, zero gain, and get a full output out of that and get the image. Great! Great imaging. The truth is, very seldom are we in those cases.
So often in machine vision we want high-speed electronic shutters to freeze the motion, but that also means we blocked out light. If I use a 1/10,000 shutter to freeze the motion, like I showed in that motorcycle picture earlier, then that also means I only collected light for 1/10,000 of a second, and in 1/10,000 of a second, probably 40,000 electrons didn’t register in my pixel well. So that means, “hey, I used a high shutter speed, now I only got 10,000 electrons in my 40,000 electron well, so I am only at 25% of that well.” So how am I going to get my full output back? I am going to apply 12 dB of gain, multiply that up, and get a full-scale count out. But again that means I multiplied the noise up as well. So I want to pay attention to the gain. One of the principles in imaging is to keep the gain low and apply more light in any way I can. If I can live with a longer shutter time, or better yet, if I can just use twice as much light, then I may get a better quality image. It is always better to increase the light, open the f-stop on the lens, or apply twice as many lights than it is to apply gain. However, there are downsides to some of those. It’s not always possible, and not always practical, to apply more light. You may need the shutter to freeze the motion, and then it is not practical to slow the shutter down. So at some point you turn the gain up in the camera and recognize that it affects your signal-to-noise ratio. Notice that it also affects my dynamic range. Again, the dynamic range is the ratio between the light level that reads as black and how much light it takes to saturate the imager. So at full well capacity I have a very high dynamic range if I measure these ratios, but if I’ve only used one quarter of the well and use four times the gain, then my curve is steeper and the light it takes to saturate is now only ¼, so the dynamic range is cut by a factor of four as well, since the noise floor didn’t necessarily move in the camera.
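The arithmetic in this passage can be sketched in a few lines. This is only an illustration: the 40,000 and 10,000 electron figures come from the talk, while the noise floor value and the function names are assumptions made up for the example, and gain is expressed with the usual 20·log10 convention for video signals.

```python
import math

def gain_to_fill_output(collected_e, full_well_e):
    """Return (linear gain, gain in dB) needed to scale a partly filled well to full output."""
    linear = full_well_e / collected_e
    return linear, 20.0 * math.log10(linear)

def dynamic_range_db(saturation_e, noise_floor_e):
    """Dynamic range as the ratio of the saturating signal to the noise floor, in dB."""
    return 20.0 * math.log10(saturation_e / noise_floor_e)

FULL_WELL_E = 40_000   # well capacity used in the talk
COLLECTED_E = 10_000   # electrons collected with the fast shutter
NOISE_FLOOR_E = 20     # hypothetical noise floor in electrons (not from the talk)

linear_gain, gain_db = gain_to_fill_output(COLLECTED_E, FULL_WELL_E)
dr_full_well = dynamic_range_db(FULL_WELL_E, NOISE_FLOOR_E)
# With 4x gain the output saturates at one quarter of the light, while the
# noise floor (in electrons) is assumed not to move.
dr_with_gain = dynamic_range_db(COLLECTED_E, NOISE_FLOOR_E)

print(f"gain: {linear_gain:.0f}x = {gain_db:.1f} dB")
print(f"dynamic range at full well: {dr_full_well:.1f} dB")
print(f"dynamic range with gain:    {dr_with_gain:.1f} dB")
```

Under these assumptions the dynamic range drops by the same 12 dB that was applied as gain, which is the factor of four described above.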
It is important to recognize this, because a lot of our customers don’t recognize that turning up the gain affects the dynamic range of the camera. They might think in terms of, “well, I knew the signal-to-noise ratio fell,” but they didn’t necessarily understand that the dynamic range fell. For gain, then: increasing gain will increase the visibility of both signal and noise. It does not increase image quality. In fact it decreases it, because it amplifies the noise. It is always a last resort, as I described. Turn up the light, do anything else you can, if you can do it. And gain may be limited at higher bit depths. It typically doesn’t make sense to have 14 bits on a high gain camera, so manufacturers typically don’t put as much gain in a high bit depth camera. So what’s the practical effect of this? Again, this is an amplified image as an example here. But this image was taken with the same camera in low light. If I look inside this lens here, I can in fact see the center of the lens, I can see all the gray scale variations in the ring out here, and the shiny ring. This was taken using a longer shutter and lower gain. But now if I turn the gain up and compensate with the shutter, then this is the same image. Notice the exposure is the same: I can still see the ring and the gray scale values are approximately the same, but you can see how much more noise is in this image, and that comes directly from the gain amplifying the noise in the camera and from using a smaller percentage of the well capacity that is available in this particular camera.
Exposure time, then, is the length of time that the sensor is open for collecting light, also known as the shutter speed or integration time. Exposure time considerations: frame rate may be reduced as exposure increases, so if the camera has a certain frame rate, there is only a certain amount of time it can expose and still have time to read out.
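As a rough sketch of the exposure trade-offs, the two relationships below put numbers on them: a frame cannot be shorter than its exposure, and an object moving at constant speed smears across more pixels the longer the sensor is open. The 30 frames per second rating, the object speed, the exposure values, and the function names are all hypothetical, and readout time is ignored for simplicity.

```python
def max_frame_rate_hz(exposure_s):
    """Upper bound on frame rate: one frame takes at least its exposure time (readout ignored)."""
    return 1.0 / exposure_s

def motion_blur_px(speed_px_per_s, exposure_s):
    """Distance in pixels that an object at constant speed travels during one exposure."""
    return speed_px_per_s * exposure_s

RATED_FPS = 30.0                  # hypothetical rated frame rate of the camera
OBJECT_SPEED_PX_PER_S = 2000.0    # hypothetical object speed across the image

for exposure_s in (1 / 10_000, 1 / 100, 1 / 10):
    achievable_fps = min(RATED_FPS, max_frame_rate_hz(exposure_s))
    blur = motion_blur_px(OBJECT_SPEED_PX_PER_S, exposure_s)
    print(f"exposure {exposure_s:.4f} s: up to {achievable_fps:.0f} fps, blur {blur:.1f} px")
```

With these made-up numbers, the fast shutter keeps the blur well under a pixel, while the 1/10 second exposure both smears the object over hundreds of pixels and pulls the achievable frame rate below the rated 30.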
Obviously if you’re looking at a camera that runs 30 frames a second and you want to do a four second exposure, you’re probably not doing a four second exposure at 30 frames a second. Motion blur is greater with an increase in exposure time. Again, we’re not freezing the motion, so if I take a longer exposure time, that means an object moving at a fixed velocity through the frame will move further during the longer exposure, thus causing more blurring. Signal-to-noise ratio increases with exposure, just as I said. So the advantage of exposure is that I can keep the gain low and get more light in there, which gives me a higher signal-to-noise ratio, provided I can live with the other two effects above.
Now, paying attention to exposure itself. You will notice the note on here that says good and better are always a matter of opinion. So somewhere in here you want to look. If I overexpose the image, I’ve lost so much information that I probably don’t have enough to really make a judgment. Of course if I underexpose it, I lose information in the black area. There’s no recovering that; whether I clip it black or clip it white, that information can’t be gained back. If I have it somewhere in between, then I have information, and if it’s not quite there I can actually apply digital methods to maybe stretch it and get it back; at least it isn’t clipped away like it is at black or white. However, there is an optimum, and that’s why we are saying that this is really just a matter of opinion. Probably this image gives me the best all-round image here as far as seeing the lens, seeing the rings, seeing the light. But guess what? If there were really some kind of variation in the lens, and you can see there actually is, there are a couple of little stripes inside this lens, and I were actually measuring those, I might want to purposely overexpose the image if I didn’t care about inspecting anything in the overexposed area.
This might be the kind of exposure I wanted in order to see the variation in the black ring inside the lens. In fact I can see it best here, even though I said this is probably the worst image here. If I were really concerned with only measuring this variation, I might only use this image. So be aware that exposure affects the quality of your image and the dynamic range. Certainly this is the shortest dynamic range, but it also depends on what the user wants to see, so make sure that you’re getting the right exposure, the one that gives you the most gray scale values over the area of the image that you want to use.
Black level, sometimes called brightness, almost incorrectly so, adds an offset to the pixel value. Manufacturers call this different things: it can be called black level, and in some cases it’s simply called offset. This is just a constant that we are applying to the image. It goes back to how black is black. Say I set a camera up in this room, turn it around and start looking at things like the black frame around this TV. Well, there’s light coming off this black frame. It is black to our eyes, but a certain number of photons are still bouncing off it and reaching the camera. So the camera will also see that, but in some cases maybe I wanted that black frame to be black, so the offset gives the user the ability to adjust and say, “ignore those few photons that are coming off, this really should be black,” and set the right starting point in the image. It does affect the brightness. I don’t like the term because in the analog TV days brightness actually affected the white level at the top end of the signal, the same way offset moves the bottom. Brightness technically is moving the top level. In the digital camera world the term gets used because moving the offset affects both. If you’re talking to older people who use analog terminology, they might not like it if you mix up black level and brightness.
Black level considerations: proper use ensures the camera accurately measures light when the scene is darker. Again, how black is black sets the correct starting point for what I’m measuring. A side effect is that it can make the image brighter or darker, but not by much. You’re mostly affecting the black levels. If I add five counts of offset and I’m looking at this frame that only had five or ten counts, then I may as much as double the output of this frame. Yet when I look at the middle of this image that was starting to saturate, in areas where I had 200 counts of light, adding five counts to 200 doesn’t affect it much. So we will gain overall brightness from that offset, but it is going to affect the blacks much more than it is going to affect the saturated part of the image. You can see this in this image again; what are we inspecting here? If I turn the black level up, I change the offset, and in this black area I can see detail start coming out. Of course, if I change the offset you can see the center is a little bit brighter here, but just as I was saying, the bright pixels in the center are affected much less than the black pixels in the ring. Again, if I were inspecting this and I were concerned about that ring, setting the black level may be as important as, if not more important than, setting the gain and the exposure.
Image format controls the type of image sent from the camera. It’s usually specified by color or monochrome and then by bit depth. As we were saying earlier, if we choose a higher bit depth we have more data to transmit and process. Correspondingly that gives us more detail than a lower bit depth, where we have limited detail, so there’s a balancing act between how much transfer and processing I want to use versus the amount of detail, or bit depth, in my image. Be wary of anyone wishing to view a 12-bit image on a computer monitor. Most monitors can display only 8 bits or less.
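To put rough numbers on the bit-depth balancing act, the sketch below counts the gray levels each bit depth can represent and the raw data per frame and per second. The 352 x 256 frame size, the 30 frames per second rate, and the function names are hypothetical, and the pixels are assumed to be tightly packed with no padding or file header.

```python
import math

def gray_levels(bit_depth):
    """Number of distinct gray values a pixel of this bit depth can hold."""
    return 2 ** bit_depth

def frame_bytes(width, height, bit_depth):
    """Raw bytes per frame, assuming tightly packed pixels and no header."""
    return math.ceil(width * height * bit_depth / 8)

def data_rate_bytes_per_s(width, height, bit_depth, fps):
    """Raw bytes per second at a given frame rate."""
    return frame_bytes(width, height, bit_depth) * fps

WIDTH, HEIGHT, FPS = 352, 256, 30   # hypothetical frame size and frame rate

for bits in (1, 4, 8, 12):
    size_kib = frame_bytes(WIDTH, HEIGHT, bits) / 1024
    rate_kib = data_rate_bytes_per_s(WIDTH, HEIGHT, bits, FPS) / 1024
    print(f"{bits:2d}-bit: {gray_levels(bits):5d} levels, "
          f"{size_kib:6.1f} KiB/frame, {rate_kib:7.1f} KiB/s")
```

The data volume grows linearly with bit depth while the number of gray levels grows exponentially, which is why the extra bits are only worth transferring when the sensor and the application can actually use them.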
Despite the fact that Sharp tells me the contrast ratio is probably about a million to one on this monitor, it comes back to your standards and how they measure that. You’re not going to realize that. The same is true on your PC: you can get 12- and 14-bit images onto your PC, but Microsoft and most programs are locked in at around 8 bits. You would probably need a special program if you were going to view anything more than eight bits, even on your computer monitor. Even then, the dynamic range of your eyes and your monitor comes into play, and you have to pay attention to what you’re actually seeing versus what the camera and the computer vision system can see. Certainly a computer vision system, at the computer level, if you are doing math and analysis on an image, can see one gray level out of 12 bits; it will do the math and figure it out for you. Your eyes certainly can’t see one gray level out of 12 bits. Many people think they need 12 bits but don’t. Again, I see this more and more: some of the cheapest cameras in the market have the highest bit depth even when the sensors don’t warrant it, and you get people in the consumer market who shop by resolution and want the highest resolution. They go, “oh, I want the most bits off the pixel,” and most of the time they’re over-specifying the system, so you want to listen carefully to what they need to do, look at what they need to do, and strike a balance. More is not better in these cases; it’s a balancing act. It can be better if you need it, but if you don’t need it, it’s damaging to your application. So this again is more or less showing the images, but notice it is also showing the file size associated with each image. These are all monochrome, and you notice this one is the one-bit image, so it is all black and white, often called thresholding. For finding things, if I want to measure this ring, this is maybe not bad, because I can measure the edge very sharply despite the glare.
A one-bit image is only 11 KB per image that I am transferring, so if you multiply this by the video rate, 30 frames per second or something, it is not very much data. If I take a four-bit image, now it is going to multiply. I have four bits per pixel, and I see more gray scale and start making out things on the lens more. I make the ring out in much greater detail, but I now have four times the data. It would be 44 KB per image, again at 30 frames per second or something, so that would multiply up. And if I make the image 8-bit, I have very good definition on all of these, but at the largest I am at 88 KB per image times the frame rate of the camera. So these things start affecting the bandwidth and the transfer. Again, you want to get to good enough and balance this out with your customer. There are certainly shades in between.
Color format considerations: color cameras used to cost a lot more, especially in the analog days, when you had to process the signal in the analog realm. There were a lot more components. Today with digital cameras, especially in the machine vision world, where the digital cameras output raw Bayer data, they are as little as $50 more. There is really only a difference in the sensor; the camera electronics are basically the same, even though the manufacturer may have to do a little more testing on the color camera. But color is not always something you want. Color images are nice, but they can reduce the resolution. The most common color imagers are Bayer color imagers, which means that each pixel is red, green or blue and I borrow from my neighbors to get my full color. But that inherently means that I borrow from neighboring pixels, and that math, called interpolation, then causes some errors, which limit my image. In this example, if I go read the small black-and-white print and blow the picture up with my color camera, I have some color aliasing around the black-and-white edges. In fact, one of the points of the slide is to show you that this is a function of color.
The amount of aliasing you get, and how bad it is, is actually worse on black-and-white than it is on, say, yellow or white. This goes back to the idea that monochrome is 2-D color and color is 3-D color, and the amount of error you get from Bayer interpolation is a function of the difference between two 3-D vectors. So this means, again, that depending on the contrast in color I am more or less affected by the Bayer interpolation.
Resolution also affects it. Much like what we were saying about signal, noise and bandwidth, more pixels achieve higher detail, but more is not always better. Number one, more pixels mean the pixels are smaller for a given image format. So if I have a four megapixel imager in the half-inch format compared to a one megapixel imager in the half-inch format, the pixels have to be smaller, which affects everything: signal and dynamic range. It also means it’s harder on the lens. The lens has to resolve smaller pixels, which means I need a higher-end lens, and maybe I didn’t need all these pixels for my application.
Maybe I could get it done with fewer, and again, if I don’t need them, then you should make this a balancing act. More resolution means more data to transfer and more data to process, and higher resolution means a higher price too, because you bought a higher-end camera and a bigger imager from somebody. So this comes down to a practical example. What if the customer needs to inspect this box right here? He is doing a color inspection to check that his print and everything is right, he wants to make sure all the logos are correct, he wants to make sure that the colors fit the standard, but he also must read this barcode. Well, you have to look at the limiting resolution. If we look at just the barcode, this is the barcode at 250 pixels square and this is the barcode at 25, so obviously I have no chance to read that barcode at 25 pixels. I have to have something more like 250 pixels in this area of the barcode. But now let’s say this barcode is a factor of six to the total width of this image. Well, if I had 250 pixels across here to cover this much of the image, then that says I probably had to have 3000 to cover this whole image in the camera, so the two go hand-in-hand. You have to look at what the minimum object you want to inspect is and what resolution it needs, and then carry that resolution over your entire field of view.
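The rule of carrying the smallest feature's resolution across the whole field of view can be written as a one-line calculation. This is only a sketch with a hypothetical function name; the ratio of image width to feature width is an input, so a ratio of 6 with 250 pixels on the barcode gives 1,500 pixels across, and the 3,000 pixel figure corresponds to a ratio of 12.

```python
import math

def pixels_across_field_of_view(pixels_on_feature, fov_to_feature_ratio):
    """Sensor pixels needed across the field of view so that the smallest
    feature still gets `pixels_on_feature` pixels across it.

    fov_to_feature_ratio is the field-of-view width divided by the feature width.
    """
    return math.ceil(pixels_on_feature * fov_to_feature_ratio)

BARCODE_PIXELS = 250   # pixels needed across the barcode, from the talk

for ratio in (6, 12):  # illustrative ratios of image width to barcode width
    needed = pixels_across_field_of_view(BARCODE_PIXELS, ratio)
    print(f"image is {ratio}x the barcode width: {needed} pixels across")
```

The same calculation applies in the vertical direction with the feature's height and the field-of-view height.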
In summary, you want to accurately identify whether your application is low-end or high-end, and that means everything: signal, noise, resolution, dynamic range. Low-end applications may only need a minimum frame rate and resolution, and just enough of an image to get the job done. Higher-end applications will need deeper data. In the higher-end applications, pay attention to your EMVA reports and so on for the camera. Select your camera appropriately. Become more familiar with the most common sensors; these are typical Kodak and Sony sensors that are available in the market. The ICX285 is probably one of the most sensitive in the market, but the Kodak sensors tend to have larger pixels and higher dynamic range as well. So there are trade-offs out there for you and your customers. Talk to the camera suppliers as well and get recommendations; don’t just buy off the data sheets. Digital cameras are packed with features, many that customers never think they need. Again, talk to the manufacturer to see what he has to get your job done, and don’t let bad settings misrepresent the image quality.
That wraps up section two.