# A Multi-Aperture Image Sensor with $0.7\mu m$ Pixels in $0.11\mu m$ CMOS Technology

Keith Fife, Student Member, IEEE, Abbas El Gamal, Fellow, IEEE and H.-S. Philip Wong, Fellow, IEEE

Abstract: The first integrated multi-aperture image sensor is reported. It comprises a $166\times76$ array of $16\times16$, $0.7\mu m$ pixel FT-CCD subarrays with local readout circuits, per-column 10-bit ADCs, and control circuits. The image sensor is fabricated in a $0.11\mu m$ CMOS process modified for buried channel charge transfer. Global snap shot image acquisition with CDS is performed at up to 15 fps with 0.15 V/lux-s responsivity, 3500e- well capacity, 5e- read noise, 33e-/sec dark signal, 57 dB dynamic range, and 35 dB peak SNR. When coupled with local optics, the multi-aperture image sensor captures overlapping views of the scene, which can be post-processed to obtain both a high resolution 2D image and a depth map. Other benefits include the ability to image objects at close proximity to the sensor without the need for objective optics, achieve nearly complete color separation through a per-aperture color filter array, relax the requirements on the camera objective optics, and increase the tolerance to defective pixels. The multi-aperture architecture is also highly scalable, making it possible to increase pixel counts well beyond current levels.

Index Terms: CMOS image sensor, Multi-Aperture image sensor, Charge coupled devices (CCD), Charge transfer, Image sensors, Active pixel sensor (APS), Analog-digital conversion

#### I. Introduction

A multi-aperture (MA) image sensor [1] consists of an array of apertures, each with its own subarray of pixels and integrated image-forming optics (see Fig. 1). The sensor readout is performed hierarchically, first at the aperture level and then globally across apertures. The operation of a multi-aperture system is different from that of a conventional camera.
Whereas in a conventional camera the objective lens focuses an image at the image sensor in the focal plane such that the pixels capture the real image, in a multi-aperture configuration the objective lens focuses a virtual image a certain distance above the sensor such that the pixel subarrays capture overlapping subimages of the virtual image. A high resolution 2D image can be reconstructed from the subimages via post-processing. The overlap between the subimages provides several benefits that can be achieved through more sophisticated post-processing. These include (i) obtaining a depth map of the scene, (ii) eliminating color crosstalk between different color channels by using a per-aperture color filter array, (iii) relaxing the requirements on the objective lens, and (iv) increasing the tolerance to defective pixels. In addition to these benefits, the hierarchical readout architecture of the multi-aperture image sensor makes it feasible to scale the pixel count well beyond that of conventional image sensors.

Fig. 1. A conceptual view of a multi-aperture image sensor. Each aperture comprises a small pixel subarray and integrated imaging optics.

As discussed in [1], depth resolution continues to improve with pixel scaling below the limit set by the spot size of conventional optics. The reason is that the depth of a feature is measured by the displacement between the locations of its images in different apertures, and the accuracy of localization continues to improve with pixel scaling. The trade-off between achievable spatial and depth resolutions in a multi-aperture image sensor was discussed in [1].

Deeply scaled pixels also enable high resolution imaging at close proximity to the sensor, which can be useful in such applications as microscopy and in vivo imaging [2]. In this configuration, the objective optics are completely eliminated, resulting in an imaging platform with very small working distance.
Designing scalable arrays of submicron CMOS pixels with acceptable imaging performance is challenging, however, because of the high dielectric stack height and optical occlusions resulting from the use of metal layers in the pixel. As such, we proposed in [1] to use a frame-transfer charge-coupled device (FT-CCD) subarray to achieve both high optical coverage and large well capacity. We further proposed implementing the image sensor in CMOS technology to enable fast multi-aperture readout and the integration of analog and digital circuits.

A concern about this approach was the feasibility of implementing a CCD with acceptable charge transfer efficiency and imaging performance in a single polysilicon CMOS technology. Multiple polysilicon layers are typically used in a CCD process to create small polysilicon gap spacings, which reduce the potential barriers or pockets between electrodes [3]. As CMOS technology scales, however, the spacing between polysilicon lines becomes small enough to facilitate charge transfer between electrodes with voltage levels compatible with other CMOS circuits. Furthermore, narrow polysilicon electrodes create large fringing fields that further improve charge transfer time and efficiency. With additional implants, improvements can be made to reduce dark current and surface traps.

The feasibility of implementing a CCD in a scaled CMOS technology that is suitable for realizing a subarray in a multi-aperture image sensor with acceptable performance was demonstrated in [4]. We reported on a $16\times16$, $0.5\mu m$ pixel-pitch surface channel FT-CCD in $0.11\mu m$ CMOS technology with 99.9% charge transfer efficiency, well capacity of 3550e-, dark current of 50e-/sec, peak SNR of 28 dB and dynamic range of 60 dB.
In a subsequent paper [5], we presented the first complete multi-aperture image sensor chip comprising a $166\times76$ array of $16\times16$, $0.7\mu m$ pixel FT-CCD subarrays, per-column ADCs, control logic and chip readout circuits fabricated in the same $0.11\mu m$ CMOS technology.

This paper provides a more complete presentation of the multi-aperture image sensor reported in [5] and its characterization results. To be self-contained, we begin with a detailed discussion of the multi-aperture concept and some of its benefits and implementation challenges. In Section III, we describe the design and operation of the sensor. In Section IV, we provide detailed electrical and optical characterization results.

# II. BACKGROUND

A conventional CCD or CMOS image sensor comprises a contiguous array of pixels. The objective lens focuses the image onto the image sensor (focal) plane, creating a one-to-one correspondence between points in the object space and points in the image as illustrated in Fig. 2(a). In contrast, a multi-aperture image sensor comprises an array of pixel subarrays with gaps in between them and each pixel subarray has its own image-forming optics. The objective lens creates a focused virtual image a certain distance above the multi-aperture image sensor plane and the local optics form secondary subimages of the virtual image as illustrated in Fig. 2(b). By setting the magnification of the local optics to less than one, the subimages captured by the apertures overlap. As such, each point in the object space is mapped into several points in a group of neighboring apertures. The captured subimages can be combined to form a 2D image and a depth map of the scene.

Note that the objective lens in our system has no aperture from the perspective of the aperture array. This allows for a relatively complete description of the wavefront in the focal plane.
The amount of depth information that can be extracted depends on the total area of the objective lens that is covered by the aperture array and ultimately on the accuracy in the localization of features.

While our system is similar in structure to the plenoptic system described in [6], which employs a separate microlens array on top of an image sensor, there is a key difference between the way these two systems operate. In the plenoptic system, the objective lens is focused onto the microlens array and the microlens array is focused onto the system aperture. Each microlens spreads out the incident rays to the pixels behind it, which provides information about their direction as well as intensity. While the spatial resolution of the plenoptic system is limited to the microlens count, the information about the directions of the rays can be used for a number of applications such as range finding and perspective. Note that the plenoptic system contains only one aperture (that of the objective lens), which is imaged by each of the microlenses. The useful size of each pixel in this configuration is limited by microlens aberrations and fundamentally by diffraction.

Fig. 2. Comparison of conventional versus multi-aperture imaging.

In contrast, our system captures less information about the wavefront but achieves higher spatial resolution than the aperture count because each aperture captures a focused subimage with magnification greater than the reciprocal of the number of pixels in either the horizontal or vertical directions. Depth is extracted by sampling each point in the focal plane from multiple perspectives. In this configuration, pixels smaller than the spot size of the local optics are useful because they increase the localization accuracy of features and thus the accuracy of the measured displacement between the locations of the same feature in different subimages.
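The localization argument above can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a one-dimensional, noiseless Gaussian spot, a plain centroid estimator, and hypothetical values for the spot width and feature position. Photon and read noise, which the paper identifies as limiting factors, are deliberately ignored.

```python
import math

def pixel_signals(x0, sigma, pitch, half_width):
    """Integrate a unit-energy 1D Gaussian spot centered at x0 over each pixel.

    Pixel k spans [k*pitch, (k+1)*pitch). Returns (pixel centers, pixel signals).
    """
    n = int(math.ceil(half_width / pitch))
    centers, signals = [], []
    for k in range(-n, n):
        a, b = k * pitch, (k + 1) * pitch
        # Closed-form integral of the Gaussian over the pixel, via erf.
        s = 0.5 * (math.erf((b - x0) / (sigma * math.sqrt(2)))
                   - math.erf((a - x0) / (sigma * math.sqrt(2))))
        centers.append(0.5 * (a + b))
        signals.append(s)
    return centers, signals

def centroid(centers, signals):
    """Signal-weighted mean of the pixel centers."""
    return sum(c * s for c, s in zip(centers, signals)) / sum(signals)

def localization_error(x0, sigma, pitch, half_width=40.0):
    """Absolute error of the centroid estimate of the spot position."""
    centers, signals = pixel_signals(x0, sigma, pitch, half_width)
    return abs(centroid(centers, signals) - x0)

SPOT_SIGMA_UM = 1.0  # assumed spot width (Gaussian sigma), hypothetical
TRUE_POS_UM = 1.0    # assumed feature position, hypothetical

err_small = localization_error(TRUE_POS_UM, SPOT_SIGMA_UM, pitch=0.7)
err_large = localization_error(TRUE_POS_UM, SPOT_SIGMA_UM, pitch=4.0)
print(f"pitch 0.7 um: centroid error {err_small:.2e} um")
print(f"pitch 4.0 um: centroid error {err_large:.2e} um")
```

Under these assumptions, the pixels that oversample the spot recover its position with an error far below the spot width, while the coarse pixels bias the estimate toward the center of the pixel that collects most of the light.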
Our image sensor architecture is also similar to that of the TOMBO compound-eye [7] whose purpose is to realize a compact, thin camera with a total resolution exceeding that of an individual aperture. The TOMBO's spatial resolution is largely dependent on object distance, while our multi-aperture image sensor confines the imaging to a tight region behind the objective lens to enable high spatial resolution in both 2D and 3D imaging.

Fig. 3 illustrates the process of image formation and the reconstruction of a 2D image in a multi-aperture imaging system. The figure shows the virtual image formed at the focal plane and the resulting subimages captured by the apertures. Note that a feature, such as the bird's eye, appears in different locations within 16 different apertures. By appropriately shifting the subimages and combining them, a 2D image can be reconstructed.

Fig. 3. Reconstruction of a 2D image from multi-aperture subimages.

In addition to the ability to form a complete 2D image of the scene, the overlap between the subimages captured by the apertures provides other benefits. In the following sections, we discuss two of these benefits in some detail.

# A. Depth

An example of how depth information is obtained using a multi-aperture imaging system is depicted in Fig. 4. The sphere represents an object in close proximity to the sensor. This object could be a real object or a virtual object as it appears in the focal plane of the objective lens. A point on the sphere appears at different locations in several aperture subimages. For example, the point shown in the figure appears at the locations labeled 1, 2, and 3 in the figure. Note that if the object is close to the image sensor, the displacement between the three locations is large, while if the object is farther away, the displacement is small. Thus in general, features close to the image sensor correspond to large displacements while objects farther from the sensor correspond to small displacements.
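The inverse relation between distance and displacement can be sketched with an idealized pinhole model. This is an assumption made for illustration, not the paper's reconstruction method: each aperture is treated as a pinhole at a hypothetical height `focal` above its pixel subarray, and the aperture pitch is set to the 14.4 µm subarray interval quoted later in the paper.

```python
def image_position(point_x, depth, aperture_x, focal):
    """Image position, relative to the aperture axis, of a point at lateral
    position point_x and distance depth in front of a pinhole at aperture_x."""
    return -focal * (point_x - aperture_x) / depth

def displacement(depth, baseline, focal):
    """Shift between the images of one point in two apertures spaced by baseline."""
    x_first = image_position(0.0, depth, 0.0, focal)
    x_second = image_position(0.0, depth, baseline, focal)
    return x_second - x_first  # equals focal * baseline / depth

def depth_from_displacement(disp, baseline, focal):
    """Invert the pinhole relation to recover depth from a measured shift."""
    return focal * baseline / disp

APERTURE_PITCH_UM = 14.4  # assumed equal to the subarray interval
FOCAL_UM = 10.0           # hypothetical pinhole-to-pixel distance

near = displacement(20.0, APERTURE_PITCH_UM, FOCAL_UM)
far = displacement(80.0, APERTURE_PITCH_UM, FOCAL_UM)
print(f"object at 20 um: displacement {near:.2f} um")
print(f"object at 80 um: displacement {far:.2f} um")
print(f"recovered depth: {depth_from_displacement(far, APERTURE_PITCH_UM, FOCAL_UM):.1f} um")
```

In this model the displacement scales as the reciprocal of depth, so a fixed error in locating a feature translates into a depth error that grows with distance.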
A depth map can be obtained by judiciously combining this information for each feature of interest.

From the above discussion, it is clear that to accurately estimate the depth of features within a scene, it is necessary to accurately measure the locations of each feature within the subimages. We now explain why depth resolution improves with pixel scaling beyond the spot size limit of the optics. First, note that spatial resolution is limited by the degree to which features within the same aperture can be distinguished from each other. As the distance between features in the focal plane becomes smaller than the spot size as illustrated in Fig. 5(a), they can no longer be resolved. A spot size is typically limited to a few microns in a conventional camera. On the other hand, depth resolution is determined by the displacement between the locations of the same feature in *different* subimages. These locations can be resolved to dimensions smaller than the spot size of the optics (see Fig. 5(b)). Such precise localization was demonstrated to nanometer scale accuracy using large pixels under magnification in [8], [9].

Fig. 4. Example of how depth information is obtained from displacement of feature locations in separate aperture subimages. (a) Object close to sensor corresponds to large displacement. (b) Object at greater distance from sensor corresponds to smaller displacement.

Fig. 5. Comparison of spatial resolution and depth resolution.

The accuracy of localization depends on the number of photons captured for a given feature and on the accuracy of estimating the shape of the feature itself. When the pixel size is large, it is difficult to determine the shape of the feature. When the pixel size is smaller than the feature size, the shape of the feature and its location can be better determined as long as we collect enough photons.
Since the localization of features is dependent on the signal to noise ratio, it is necessary to keep the read noise of the pixel low so that the read noise accumulated from several pixels sampling a single feature does not limit the location accuracy.

#### B. Color

One of the most important benefits of the multi-aperture image sensor is the ability to perform high fidelity color imaging. In a conventional image sensor, color information is obtained using a color filter array (CFA) deposited on top of the pixel array [10]. The most commonly used CFA is the RGB Bayer pattern, where a $2\times2$ pixel pattern consisting of one red, one blue and two green filters arranged diagonally is repeated over the pixel array. Such a CFA configuration results in severe color crosstalk as pixel size is scaled due to scattering from the relatively thick dielectric stack and diffusion of carriers in the substrate. This crosstalk results in additional noise in the final image obtained after performing color demosaicing and correction. Furthermore, pixels far off-axis may produce color gradients for several reasons including pixel layout asymmetry. This color crosstalk problem makes further pixel size scaling without a corresponding reduction in dielectric stack height problematic. This imposes severe constraints on the efficient implementation of imaging systems-on-chip.

Fig. 6. Chief ray diagram for color imaging with the multi-aperture configuration showing object at A, primary virtual image at B, and secondary images split into color channels at $C_R$, $C_G$, and $C_B$.

The multi-aperture configuration offers an alternative solution to color imaging whereby pixel size can continue to scale independent of the height of the dielectric stack. This makes it possible to simultaneously scale pixel size and increase the number of metal layers to attain high logic integration density.
To achieve these benefits, a per-aperture (instead of a per-pixel) color filter array such as an RGB Bayer pattern is employed. In this configuration, crosstalk between pixels is restricted to be of the same color. The magnification of the local optics is set so that each point in the scene is imaged by apertures with all three separate color filters. Some of the loss in spatial resolution due to this overlap is compensated for by imaging all three colors at each effective pixel location instead of performing color demosaicing. In our implementation, we require a local magnification of at least 1/4, which reduces the effective spatial resolution by a factor of 16. Fig. 6(a) illustrates this new color imaging idea. The object A is focused to virtual image B. The local optics magnification is set to capture subimages at $C_R$, $C_G$, and $C_B$, each having a different color.

# C. Implementation Challenges

To realize the benefits of the multi-aperture image sensor we have described, it is necessary to use pixels smaller than the spot size limit of the optics and yet maintain reasonable well capacity with low read noise. Current CMOS pixels, even as small as the shared 4T APS structure [11], cannot meet these stringent constraints largely due to optical considerations. A CMOS pixel contains a buried diode that is read out via a circuit consisting of several transistors. At least two layers of metal are typically used in the pixel array to distribute readout signals as well as power to the pixels. In addition, one or more layers of metal are needed for global routing and to implement analog and digital circuitry in the periphery of the array. This results in a relatively high dielectric stack that does not scale with technology in the same aggressive way as pixel pitch, resulting in significant degradation in pixel optical efficiency.

Fig. 7. Ray diagrams for typical CMOS pixel with microlens. (a) Pinhole lens. (b) Lens NA matched to pixel NA. (c) Lens NA larger than pixel NA. (d) Pinhole lens off-axis. (e) Lens NA matched to pixel NA off-axis. (f) Lens NA larger than pixel NA off-axis.

A cross-section of a typical CMOS image sensor with microlens array is shown in Fig. 7 together with scenarios for on and off optical-axis pixels using objective lenses with different numerical apertures (NA). Fig. 7(a) shows the ray diagram for an on-axis pixel using a pinhole objective lens. In this case, the microlens is able to focus effectively onto the photodiode area. In Fig. 7(b), the NA of the objective lens is matched to the NA of the pixel, which is limited by the metal stack. The microlens can still effectively concentrate the light onto the photodiode, albeit with a larger spot size. In Fig. 7(c), the NA of the objective lens is larger than the NA of the pixel such that the microlens is no longer effective at concentrating light onto just the photodiode area. Thus the height of the dielectric stack limits the NA of the objective lens. As pixel size is scaled, it is beneficial to increase the NA of the objective lens to gain more signal while maintaining depth of field. CMOS pixel scaling actually works against this need for a larger NA.

Fig. 7(d) shows the ray diagram for an off-axis pixel using a pinhole lens. The microlens and color filter are shifted to compensate for the larger chief ray angle. In Fig. 7(e), the NA of the objective lens is matched to the NA of the pixel and significant vignetting begins to occur. This results in roll-off in signal intensity, increased crosstalk, and color gradients. In Fig. 7(f), the situation becomes even worse as rays from any particular color filter enter the wrong pixel.

In the multi-aperture image sensor, a small section of the microlens array is replaced by a single microlens as shown in Fig. 8(a). If CMOS pixels are used, the NA of the microlens would be limited to the small NA of the CMOS pixels.
Furthermore, off-axis pixels would see severe roll-off and extreme optical scattering. In our proposed FT-CCD approach, the pixel array uses no metal layers. This allows for a large local NA, which increases sensitivity and resolution (see Fig. 8(b)). The height of the optical stack in this case can be large, allowing for several layers of metal to be used outside the pixel array.

Fig. 8. Aperture using (a) conventional CMOS pixel subarray, versus (b) proposed FT-CCD.

A major challenge in implementing an FT-CCD in a single polysilicon CMOS process is achieving high enough charge transfer efficiency and good imaging performance. As briefly discussed in the introduction and as demonstrated in [4], these challenges can be overcome in scaled CMOS technology. However, one of the obstacles to reducing pixel size in a conventional CCD is the use of multi-phase charge transfer. This requires incorporating additional electrodes in the pixel, which reduce optical efficiency and increase pixel size. To address this problem, our FT-CCD uses ripple charge transfer operation, which results in a pixel with only a single electrode. Although ripple charge transfer requires a separate driver for each electrode and is slower than multi-phase charge transfer, it becomes practical for the small subarrays used in the multi-aperture image sensor.

#### III. MULTI-APERTURE IMAGE SENSOR CHIP

A block diagram of the multi-aperture image sensor is shown in Fig. 9. The chip comprises a $166\times76$ aperture array, each with a $16\times16$ FT-CCD. The aperture control buses, V[35:0] and H[15:0], are globally connected to the FT-CCD array. To select a row of FT-CCD readout circuits, the RS signal is applied through the row decoder addressed by the ROW signal. The MUX blocks contain column control circuits, bias circuits, inputs for external testing of each column analog chain, and support for serial analog pixel readout through AOUT.
The per-column ADCs share an output bus, which is controlled by the signal COL and buffered for digital readout through DOUT[10:0]. After every conversion cycle, the digitized values are read out from the ADC buffers one column at a time. An ESD clamp to the lowest chip potential, typically used on IO pins, is not used on the CCD control lines so that negative voltages can be applied during testing and characterization.

Fig. 9. Block diagram of multi-aperture image sensor chip.

In the following section, we describe the design, fabrication and operation of the FT-CCD. In Section III-B, we describe the chip operation, and in Section III-C, we describe the circuit and operation of the per-column ADC.

#### A. FT-CCD

The schematic and device cross-sections of the $16 \times 16$ pixel FT-CCD are shown in Fig. 10. It consists of a pixel array, a frame buffer, a horizontal CCD (H-CCD) with floating diffusion (FD), and follower readout. The main process difference between this FT-CCD and the one in [4] is that the CCDs are formed using p+ polysilicon electrodes, n-type CCD channel and p-type channel stop implants. A total of 4 customized implants are needed to implement this structure in a standard process. The purpose of the additional implants is to improve charge transfer efficiency and dark current by using a buried channel instead of a surface channel and to use a channel stop implant instead of a shallow trench. The inputs to the channels at the top of the array are connected to V0 through an n-well implant, which also allows the p-type channel stops to remain isolated from the channels while connecting to ground. Two sides of the H-CCD connect to VP, which is used for fill-and-spill operation, reset of the FD node, or as source follower drain supply. The image is captured in interlaced fields. Charge is collected under every other electrode, which allows large potential barriers between them, leading to high well capacity.
An STI region is used to create isolation between arrays and to serve as the area for contacts to the non-silicided electrodes. The gap width between the polysilicon electrodes is 180 nm, which is small enough to implement a single electrode CCD at CMOS compatible voltages.

Fig. 10. FT-CCD schematic and device cross sections.

The layout of the FT-CCD from Diffusion up to Metal 1 is shown in Fig. 11. A mask is used to block silicide on the polysilicon in order to improve the transmissivity. Metal 1 is used to both cover the frame buffer region and to provide the signal routing to the H-CCD electrodes. The electrodes in the vertical CCD (V-CCD) are constructed from long continuous lines of polysilicon. Since the active area has no silicide, the sheet resistance of the polysilicon is very high. In order to achieve fast transfer times, the time constants for driving the long lines of polysilicon should be minimized. All of the V-CCD electrodes are connected globally, so it is possible to route them either horizontally or vertically. In order to reduce the time constants required to drive the polysilicon lines, vertical connections are made to the polysilicon between subarrays. However, since the space is very tight between neighboring subarrays, we can only connect one polysilicon line to one Metal 1 line at each subarray interval. Therefore, we step the connections to the electrodes at each subarray interval such that after 35 steps all electrodes have been connected. The step interval is $14.4\mu m$, which leads to a contact spacing on each electrode of $35 \times 14.4\mu m = 504\mu m$.

Fig. 11. CAD layout of two $16\times16$ FT-CCD subarrays showing diffusion, polysilicon, contacts and Metal 1.

Fig. 12. CAD layout of CCD readout circuit.

The layout of the FT-CCD readout circuit from Diffusion up to Metal 1 is shown in Fig. 12.
The circuit is situated in between subarrays such that the fill-and-spill input diffusion node VP is shared with the neighboring drain diffusion of the reset device. Since the fill-and-spill operation is applied only during non-readout times, there is no conflict in using VP to supply the current for the source follower during readout.

FT-CCD Operation: The FT-CCD performs snap shot imaging using a global electronic shutter. The capture of a frame can occur simultaneously with the readout of a previous frame. To minimize pixel pitch, ripple transfer operation (as opposed to the more common multi-phase transfer) is used. An image can be captured by integrating photocharge at each electrode or at every other electrode for interlaced operation. In interlaced operation, the pixel is effectively twice as long as it is wide during each field.

The basic phases of the FT-CCD operation are described with the help of the timing diagram in Fig. 13. During FLUSH, the CCD pixel arrays are depleted of charge through V[0] by sequencing V[17:1]. During INTEGRATE, the pixel array electrodes are held at an intermediate voltage. At the end of integration, the accumulated charge in the CCD pixel arrays is transferred one row at a time to the frame buffers using ripple charge transfer (TRANSFER). After transferring all of the integrated charge to the frame buffer, FRAME BUFFER READOUT is performed while a new FLUSH cycle is initiated to set the next integration time period or to prevent new charge from flowing into the frame buffer. DIGITAL READOUT at the IO pins is synchronized to FRAME BUFFER READOUT through the ADC interface. A small VBLANK period, where no data is read, is required during the TRANSFER period.

The FRAME BUFFER READOUT sequence consists of a Vertical-to-Horizontal (V-to-H) transfer sequence followed by a Horizontal transfer to the floating diffusion node as shown in Fig. 14. The goal of the V-to-H transfer is to move just one charge packet at a time into the H-CCD.
When the charge packet is shifted along the H-CCD, it is essential to preserve the charge in the other columns. In [4], we described a scheme where the even columns are first transferred to the H-CCD, the H-CCD is drained, and then the odd columns are transferred into the H-CCD. This method has the disadvantage of keeping the charge packets held in the H-CCD for a longer period of time. We find that the defect density in the H-CCD is larger than that of the V-CCD. For this reason, we see higher dark current correlated by columns when this even/odd pixel readout approach is used.

| Sequence | | | | | | |
|---|---|---|---|---|---|---|
| Capture | FLUSH | INTEGRATE $(n + 1)$ | TRANSFER | FLUSH | INTEGRATE $(n + 2)$ | TRANSFER |
| Readout (frame buffer) | | FRAME BUFFER READOUT $(n)$ | WAIT | | FRAME BUFFER READOUT $(n + 1)$ | WAIT |
| Readout (digital) | | DIGITAL READOUT $(n)$ | VBLANK | | DIGITAL READOUT $(n + 1)$ | VBLANK |

Fig. 13. Timing diagram for operation of the FT-CCD showing simultaneous capture and readout sequences.

Fig. 14. FT-CCD readout sequence from the frame buffer.

FT-CCD Simulation: The main reason we are able to achieve very small pixel size is that we use ripple charge transfer instead of the more conventional multi-phase transfer used in larger pixel count CCDs. A ripple charge transfer requires independent access to each of the electrodes to allow for charge confinement at each electrode. As such, each of the 35 vertical and 16 horizontal electrodes used in our FT-CCD is driven separately.

Process and device simulations were performed with Sentaurus TCAD. A full simulation of the single electrode charge confinement and ripple charge transfer is shown in Fig. 15. Initially, all electrodes are held at -0.5V, which creates potential wells between every electrode as shown in Fig. 15(a).
In this example, we show how charge packets can be placed at V1, V2, V3, and V4. To the best of our knowledge, this is the first CCD to use charge confinement between electrodes, as opposed to under the electrodes. To move one charge packet forward, 3.0V is applied to V4 while 2.0V is applied to the rest of the electrodes in the direction of the charge packet's movement as shown in Fig. 15(b). In Fig. 15(c), the process continues with a ripple transfer, where V5 is brought to 3.0V while V4 is driven back to -0.5V. The charge packet reaches the end of the CCD line as shown in Fig. 15(d). Next, another charge packet is brought forward as shown in Fig. 15(e). Note that it is essential to have an empty well between the packet already transferred and the new one. In a typical CCD this is achieved by having additional electrodes in the pixel. Here the empty wells are created at the end of each charge transfer. After reaching the state shown in Fig. 15(f), we have to be careful that charge packets do not mix on the next transfer. With the right choice of voltages, the scenario in Fig. 15(g) is created, where the left and right sides of electrode V6 are at different potentials, but with a large enough barrier under V6 to prevent charge mixing. Finally the state shown in Fig. 15(h) is reached, where two charge packets have moved from one side of the CCD to the other. This is the same way charge is moved during Frame Transfer and along the H-CCD.

To increase well capacity, only the even electrodes are set to store charge while the odd electrodes are set to form barriers. A simulation of this setup is shown in Fig. 16(a). This *interlaced* imaging mode of operation results in a significantly larger well capacity. The potential profile for the odd field is shown in Fig. 16(b).

Fig. 16. Electrostatic potential diagrams under interlaced frame operation showing increased size of the potential wells.

A simulation of the V-to-H-CCD transfer sequence is shown in Fig. 17.
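Stepping back briefly to the ripple transfer of Fig. 15, the ordering of the moves can be summarized with a toy model. This is only a sketch under strong assumptions: each potential well is reduced to an integer electron count, transfer is taken to be lossless, and the electrode voltages and barrier conditions from the TCAD simulation are not modeled. The function name and packet sizes are hypothetical.

```python
def ripple_transfer(wells):
    """Shift every charge packet toward the last well, one packet at a time.

    wells: electron counts per well, 0 meaning empty. The packet nearest the
    output moves first, one well per step, until it reaches the end of the
    line or the well ahead of it is occupied. Returns every intermediate state.
    """
    wells = list(wells)
    states = [tuple(wells)]
    for start in range(len(wells) - 1, -1, -1):
        if wells[start] == 0:
            continue  # nothing stored here
        pos = start
        # A step is only taken into an empty well, so packets never mix.
        while pos + 1 < len(wells) and wells[pos + 1] == 0:
            wells[pos + 1], wells[pos] = wells[pos], 0
            pos += 1
            states.append(tuple(wells))
    return states

# Two hypothetical packets at the input side of a six-well line.
states = ripple_transfer([5, 7, 0, 0, 0, 0])
for state in states:
    print(state)
```

Only one packet moves per step, which is why each electrode needs its own driver and why the scheme is slower than multi-phase transfer, as noted earlier in the text.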
The charge packets in the columns under V34 are shifted to V35 as shown in Fig. 17(b). The horizontal electrodes are initially held at -0.5V to keep all charge under V35. The targeted horizontal electrode is then brought to 3.0V, which forces the targeted column charge to drain into the H-CCD as shown in Fig. 17(d). Potential barriers around the target horizontal electrode are enforced by holding the other electrodes at -0.5V, while the p+ region under the horizontal electrode (shown in the figure) provides isolation along the horizontal axis. Next, V34 is brought to 3.0V and a partial transfer occurs at the other columns from V35 to V34 as shown in Fig. 17(e). A full charge transfer is now achieved in both directions by slowly dropping V35 to -0.5V as shown in Fig. 17(f). This transfer mechanism relies on the condition that the fringing field from horizontal electrodes remains larger than that from V10, and that V11 provides a sufficient barrier. Once charge is completely transferred to the horizontal CCD, V34 is set close to 2.0V to ensure that all the non-targeted column charge is efficiently passed backwards. Next, the charge in the H-CCD is ripple shifted to the floating diffusion node where it is buffered and double sampled using a source follower circuit. This procedure is repeated for the remaining charge in the other columns confined by the V34 electrode.

Fig. 15. Electrostatic potential diagrams from the simulation of CCDs with single electrode charge confinement and ripple charge transfer.

FT-CCD Readout Circuit: The local readout circuit for the FT-CCD is shown in Fig. 18(a). A long TX electrode provides sufficient isolation from the H-CCD to allow the FD node to be driven into reset while the other charge packets remain undisturbed in the H-CCD register. Note that the TX electrode can be eliminated from the design if we choose to move just one packet at a time during the V-to-H transfer.
The readout circuit consists of a floating diffusion with reset and row select. To achieve low noise, we perform Correlated Double Sampling (CDS [12]) on the floating diffusion node. First, FD is reset by bringing RT high, which turns on the transistor $M_1$ . The voltage on VP is sufficiently low with respect to RT such that FD receives a hard reset. RT is then driven low followed by RS driven high. Drain current is now provided through M<sub>2</sub> for the source follower M<sub>3</sub>, which is biased at COLB by I<sub>0</sub>. The value that settles on COLB represents the reset level in addition to the sampled noise. Next, TX is brought high and the charge packet at H0 is shifted to the FD node, which decreases its voltage. The value that settles on COLB is the signal subtracted from the reset and sampled noise. Taking the difference between the 2 samples leaves just the signal without the reset and noise. To deplete the CCD channel and achieve a full charge transfer, we must keep the potential at FD higher than the channel potential under the depletion condition. For this reason, we boost the FD node potential during the transfer operation. To describe the readout operation in detail, consider the circuit with the most significant parasitic capacitors shown in Fig. 18(a) and the corresponding timing diagram in Fig. 18(b). The operation shows the 4 main phases previously described but now with more detail including the effect of the parasitic capacitors. For example, the VP line is switched between a high potential and a lower potential before the RT gate is driven high. Coupling through C3 causes the FD node to dip down slightly as shown. Driving the RT gate high resets FD to VP but the final reset value becomes lower after the RT gate is released low due to the feedthrough from the gate-source overlap capacitance C1. 
When FD is sampled during the Sample R phase, driving RS high causes the FD node to boost higher due to coupling from C4, which sees a nearly full range terminal switching. In addition, the gate-source overlap capacitor represented by C<sub>5</sub> discharges into FD causing it to rise even higher. We can take advantage of this effect by driving COLB high during the transfer phase. We create significant FD boosting during the Transfer operation as shown. The final boosted value is controlled by the selection of the switched VP voltage during Reset so it can be finely tuned. Under the case of significant boosting, it is better to drive TX high before boosting in order to incur less stress on the gate oxide. (The timing diagram shows boosting occurring first so that the charge transfer is easier to see.) Finally, the TX gate is released and RS is driven high for the second sample. The diagram shows the signal level as it appears at FD and COLB.

Fig. 17. Details of the vertical to horizontal transfer. (a) Charge is initially confined under V34 for both columns. (b) Charge packets under V34 are shifted to V35. The even horizontal electrode is brought high, which forces the even column charge to drain into the H-CCD. (c) V34 is brought low to complete the transfer to V35. (d) V35 is brought to a low enough potential that charge is shifted under H0. (e) V34 is brought high and a partial transfer occurs at the odd columns from V35 to V34. (f) Final potential profile for complete charge transfer.

# B. Chip Operation

The basic timing diagram for 2 rows of the chip is shown in Fig. 19. The diagram is divided into 6 macro operations for simplicity. A representation of the chip with a $2\times2$ array of apertures, each with a $2\times2$ pixel FT-CCD is shown in Fig. 20. The readout sequence begins after the integration period shown in Fig. 20(a). The active CCD area contains 4 charge packets, while the frame buffer (shown in gray) has been fully depleted of charge.
The FT-CCDs have their own local readout circuits, which are configured by columns each connected to an ADC. All of the CCD operations are global. Therefore, the start and end of the integration period is the same for all subarrays. At the end of the integration period, the charge is transferred to the frame buffer (see Fig. 20(b)) as described in Section III-A. To begin readout, the reset value for the first pixel of every subarray is first acquired by bringing RT high for every row (see Fig. 20(c)). Next, the row select signal is applied to RS[0] via the row decoder and the value at each column ADC is sampled (see Fig. 20(d)). Next, row select is applied to RS[1] and the value at each column ADC is sampled so that reset values from all subarrays are read out (see Fig. 20(e)). After a charge packet is transferred from the V-CCD to the H-CCD, TX is applied globally to all rows and the charge is transferred to the floating diffusion (see Fig. 20(f)). Next, row select is applied to RS[0] and the values are sampled at the ADCs (see Fig. 20(g)). Next, row select is applied to RS[1] and the pixel values are sampled at the ADCs (see Fig. 20(h)). After all reset values and signal values are read out, CDS is performed off-chip in software by subtracting each reset value from its corresponding pixel value. The ADCs are double buffered so that the pixels in a row of subarrays can be read out while the previously converted pixel data is read out from the ADCs. After the last pixel is read out, the floating diffusion is reset by bringing RT high for all rows (see Fig. 20(i)) and the readout sequence is repeated until all pixels are converted. This operation allows us to keep all signals in the image sensor global except for RS. This greatly simplifies the row decoder operation and allows for global shutter operation, which is preferred to the more common rolling shutter. 
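The off-chip digital CDS step of this readout order can be sketched as follows. This is a minimal illustration rather than the authors' acquisition software: the function name `digital_cds`, the stream layout, and the sample codes are assumptions. The stream is assumed to deliver, for each pixel position, one row of reset codes per row of apertures followed by one row of signal codes per row of apertures, and the sign convention follows the text (reset value subtracted from pixel value).

```python
def digital_cds(stream, n_rows):
    """Pair reset and signal samples from the column ADCs.

    stream: iterable of lists of column ADC codes, ordered as n_rows reset
            rows followed by n_rows signal rows, repeated for each pixel.
    n_rows: number of aperture rows (one row buffer is kept per row).
    Yields (pixel_index, aperture_row, cds_codes).
    """
    row_buffers = [None] * n_rows  # holds the latest reset codes per row
    for i, samples in enumerate(stream):
        pixel, phase = divmod(i, 2 * n_rows)
        if phase < n_rows:
            # Reset phase: store the reset codes until the signal arrives.
            row_buffers[phase] = list(samples)
        else:
            # Signal phase: subtract the buffered reset codes column by column.
            row = phase - n_rows
            reset = row_buffers[row]
            yield pixel, row, [s - r for s, r in zip(samples, reset)]


# Hypothetical codes for one pixel position, 2 aperture rows, 2 columns.
stream = [
    [100, 102],  # reset codes, aperture row 0
    [98, 101],   # reset codes, aperture row 1
    [160, 150],  # signal codes, aperture row 0
    [98, 131],   # signal codes, aperture row 1
]
for pixel, row, values in digital_cds(stream, n_rows=2):
    print(pixel, row, values)
```

Because the reset codes of every aperture row must be held until the matching signal codes arrive, the sketch keeps one buffer per row of apertures.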
The cost of implementing the image sensor in this way is that row buffers are required to realign the pixel data in order to perform digital CDS. The number of row buffers required is equal to the number of aperture rows in the sensor. Furthermore, one pixel from each aperture is read out before the next pixel is available. A frame buffer is required in order to obtain all the pixels from one aperture. There are a number of other approaches to the global readout architecture. For example, adding the TX signal to the row decoder would allow sequential sampling of the reset level and the signal level such that the digital CDS can be performed immediately at each row, thereby eliminating the need for row buffers. This approach requires storing charge in the H-CCD at electrode H0 for a period of time that depends on its row position in the pixel array. This requires the dark current generated in the H-CCD to be kept low. Another option is to include all of the H-CCD signals and half of the V-CCD signals into the row decoder. This way all pixels for all apertures in a row can be read out at once, which may be preferable. We chose not to implement this readout architecture because we anticipated the need for using complex waveforms, including negative voltages, to experiment with the CCD operation, and there was no advantage to dealing with raster scan data in our system.

Fig. 18. (a) Local CCD readout circuit including the most significant parasitic capacitors, and (b) the corresponding timing diagram.

Fig. 19. Basic chip timing diagram for 2 rows.

# C. ADC Design and Operation

The column ADC circuit and timing diagram are shown in Fig. 21. A 10-bit single-slope architecture with off-chip ramp (via 14-bit DAC) is used for flexible operation. Conversion begins by resetting the keeper cell. As the COLUMN value settles, SAMPLE is strobed while the bus signal C cycles through the Gray code values, which are non-uniformly spaced to account for shot noise level.
Once RAMP exceeds COLUMN, the code on C is latched into the A buffer. The keeper stays latched until the beginning of the next conversion where buffer B is used to store C while buffer A is read out. The comparator is capable of operating at 200MSPS, which is required for 15fps operation. It consists of a diff-pair followed by a regenerative latch. Column gain is achieved by reducing the voltage range of the ramp. A resolution of 10 bits at 1V (double sampled pixel) is achieved with $1\mu A$ bias current. The diff-pair transistors, with a W/L ratio of 6, are kept in weak inversion. The comparator bias current can be increased for higher bandwidth. The regenerative portion of the comparator and the memory buffers are implemented with 1V transistors, while the diff-pair is implemented with 3V transistors, allowing for simple translation between power domains.

#### IV. CHARACTERIZATION RESULTS

The sensor was fabricated in a $0.11\mu m$ CMOS process with modifications to implement the buried channel CCD. A photomicrograph of the fabricated $3.0\times2.9~mm^2$ chip is shown in Fig. 22. Local optics were not integrated on this chip so all image testing is performed with standard focal plane imaging. The die was packaged in a ceramic 120 pin grid array. A test board, controlled by an FPGA, was built with a DAC for each CCD signal to provide flexibility in the generation of the CCD waveforms. The data is read into a DRAM and sent to a PC via a USB interface. The chip is fully functional and has been electrically and optically characterized.

# A. Electrical Characterization

The CCDs are electrically tested with the fill-and-spill circuit to verify charge confinement at the $0.7\mu m$ pixel pitch. Charge transfer efficiency (CTE) of 99.9% is measured by moving charge packets through all the columns of the vertical array via the H-CCD. Noise from the fill-and-spill operation is removed by averaging samples and the total transfer times are kept short to minimize corruption from dark current.

Fig. 20. Chip operation showing one complete cycle for reading one pixel from every subarray. The boxes under the ADC represent the digital values for the reset levels (gray) and the signal values (black).

The ADC is tested and characterized independent of the pixel array by configuring the MUX for external input via AIN (see Fig. 9). Two samples are taken for each effective ADC cycle. The first sample is held at 2V, while the second sample is ramped from 1V to 2V to test the ADC linearity and noise over the 1V range. The measured ADC linearity for one typical column is shown in Fig. 23(a). Both the peak-to-peak DNL and INL are kept to about 1/2 LSB on a 10-bit scale. In imaging applications, it is essential to keep column non-uniformity very low. Therefore, the ADCs need to maintain low INL and DNL across all columns in the array. In Fig. 23(b), the peak-to-peak INL and DNL are plotted across all columns showing variation of less than 0.6 LSB among them. Both the fixed pattern noise (FPN) and the temporal noise (TN) of all ADCs in the column array are shown in Fig. 23(c). Due to the digital CDS used in this design, we have an extremely low column FPN of 0.044 LSB RMS and 0.06 LSB max. The temporal noise is also acceptable at 0.26 LSB (254 $\mu$V) RMS and 0.6 LSB (586 $\mu$V) max, considering the noise floor of the pixel is 825 $\mu$V RMS. Table I provides a summary of the chip-level performance, which is largely determined by the ADC.

Fig. 21. (a) Ramp-based column ADC design. (b) Schematic showing transistor level design of comparator and keeper cell. (c) Timing diagram for 2 conversion cycles.

Fig. 22. (a) Chip micrograph and (b) magnified view of aperture array.

# B. Optical Characterization

To obtain acceptable results for well capacity, peak SNR, and dynamic range, all measurements are performed using the interlaced mode described in Section III-A.
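As a rough consistency check, these quantities can be related using the standard definitions, which are assumed here rather than stated in the text: dynamic range is taken as the ratio of well capacity $N_{well}$ to read noise $\sigma_{read}$, and peak SNR is taken as shot-noise limited at full well. The symbols are introduced only for this illustration, and the inputs are the 3500e- well capacity and 5e- read noise reported for this sensor.

$$
\mathrm{DR} = 20\log_{10}\!\left(\frac{N_{well}}{\sigma_{read}}\right) = 20\log_{10}\!\left(\frac{3500}{5}\right) \approx 56.9~\mathrm{dB}
$$

$$
\mathrm{SNR}_{peak} = 20\log_{10}\!\left(\frac{N_{well}}{\sqrt{N_{well}}}\right) = 10\log_{10}(3500) \approx 35.4~\mathrm{dB}
$$

Under these assumptions the values agree with the reported 57 dB dynamic range and 35 dB peak SNR to within rounding.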
Many key CCD performance parameters can be measured from the photon transfer curve, which is a plot of the measured RMS noise versus the mean of the signal in electrons. The measured photon transfer curve for the FT-CCD subarray is shown in Fig. 24. From the curve, we find that the noise floor, i.e., the noise at very low signal, is around 5e-. The sensor conversion gain, which is derived from the shot noise limited regime, is $165\mu\text{V/e-}$. The noise near the top of the curve begins to decrease, signifying a full well of 3500e-. The expected decrease in noise is due to charge mixing. The variation in the gain of each pixel is measured as Photo Response Non-Uniformity (PRNU) of 0.02 RMS. The dark current is measured over several integration times to remove any offsets. The histogram for the dark current over all pixels in the image sensor is shown in Fig. 25(a). The mean dark current of 33e-/sec with Dark Signal Non-Uniformity (DSNU) of 0.35 (sigma/mean) is within the expected range for this device. The DSNU is also calculated independently for each $16\times16$ subarray in the image sensor and shown in Fig. 25(b). Using additional implants to increase the hole concentration near the surface may help reduce the dark current and non-uniformity. The measured quantum efficiency is shown in Fig. 26. Despite the use of polysilicon electrodes, the blue response is acceptable. This is due to the thin (130nm) polysilicon layer and the open space in between each electrode. Table II provides a summary of the measured sensor imaging characteristics.

Fig. 23. Measured ADC performance. (a) Linearity for one column. (b) Minimum and maximum DNL and INL for all 166 columns. (c) Fixed pattern noise (FPN) and temporal noise (TN) for all 166 columns.

Fig. 24. Photon transfer curve at 1/30 second exposure showing read noise of 5e- and PRNU limitation at high illumination.

Fig. 25. (a) Dark current histogram for all pixels in the image sensor, and (b) histogram for aperture-level DSNU.

Fig. 26. Measured QE at 5nm intervals showing strongest response at 650nm. The response in the blue region is reasonable due to the 65% polysilicon coverage of the active pixel area. The dashed line is an estimate of the QE if there were no absorption in the electrodes.

A sample image acquired in a standard focal plane configuration using a fixed focus F/2.4 lens is shown in Fig. 27. Since local optics is not implemented on this chip, the images are directly projected to the subarrays. A black grid is inserted into the image data to show what the image looks like at the sensor. The image in the upper right shows the average value at each aperture. The image in the lower right shows a magnified portion of the image to show detail within each subarray.

Table I. Summary of chip-level performance.

| Parameter | Value |
|--------------------|---------------------------|
| Aperture count | 166×76 |
| Aperture format | 16×16 |
| Die size | 3.0×2.9 mm<sup>2</sup> |
| Maximum frame rate | 15 fps |
| ADC resolution | 10b |
| ADC FPN | 0.044 LSB (43μV) |
| ADC temporal noise | 0.26 LSB (254μV) |
| ADC INL/DNL | (0.46,-0.60)/(0.39,-0.57) |
| Chip power | 10.45 mW |

Table II. Summary of pixel-level performance at room temperature.

| Parameter | Value |
|---------------------------|--------------------|
| Pixel size | $0.7\mu\mathrm{m}$ |
| Well capacity | 3500e- |
| Conversion Gain | 165μV/e- |
| Responsivity at 550nm | 930e-/lux-s |
| QE at 550nm | 48% |
| CTE | 99.9% |
| Read Noise (RMS) | 5e- |
| Dark Signal at RT | 33e-/sec |
| DSNU (sigma/mean) | 0.35 |
| PRNU at sat. (sigma/mean) | 0.02 |
| Peak SNR | 35 dB |
| Dynamic Range | 57 dB |

Fig. 27. Sample image acquired with F/2.4 lens. The black grid is inserted by software to show the occluded area.

Sample images from a single subarray are shown in Fig. 28. The electrical image in Fig.
28(a) is generated using the fill-and-spill circuit to input the charge pattern into the array. This demonstrates the charge confinement at pixel pitch. The optical image in Fig. 28(b) demonstrates the imaging performance, which is limited by the aperture of the camera lens at F/2.4. The spot size limit of this lens is between 2 and $3\mu m$. However, with added contrast in Fig. 28(c) we can clearly see the pattern.

Fig. 28. Sample images from a single $16 \times 16$ subarray. (a) Electrical image from fill-and-spill circuit. (b) Optical image projected from an LCD screen. (c) Optical image processed with added contrast enhancement.

To demonstrate the feasibility of color image reconstruction, subimages as they would appear through local optics are projected onto a subarray from an LCD display. The raw data is obtained by cycling through all of the subimages for a given scene. Through simple reconstruction we produce the image in Fig. 29. A global parameter is used to shift each subarray and to reconstruct the image by summing the overlapping views. This image shows that the pixels in the subarrays are adequate for image reconstruction and that excellent color separation between apertures can be achieved.

Fig. 29. Color image captured by chip using projected subimages from LCD.

#### V. CONCLUSION

Design and characterization results of the first integrated multi-aperture image sensor have been reported. The results show good imaging performance with $0.7\mu m$ pixels using an FT-CCD subarray designed in deep submicron CMOS. The results suggest that further pixel size scaling is possible while maintaining acceptable performance. Improvements in pixel performance are expected with further modifications to the CMOS process to reduce dark current, improve charge transfer efficiency, and decrease PRNU.
The multi-aperture architecture provides a path for continued scaling of image sensor pixel count within the limits of practical optical formats, which may be needed to meet the demands of future imaging platforms. This is achieved in several ways:

- *Hierarchical readout*: Scaling pixel count presents a major challenge to readout speed and signal fidelity. A similar problem is solved in digital memories by breaking the memory array into smaller blocks and using hierarchical readout. The multi-aperture architecture applies this approach to image sensors. The gaps between the subarrays resulting from this hierarchical architecture are compensated for by using local optics with large enough magnification.
- *CCD subarray design*: CMOS pixels are becoming increasingly difficult to scale mainly due to optical considerations. As such, we proposed using an FT-CCD subarray design with no metal occlusions in the pixel. Pixel size is minimized by using ripple charge transfer to allow for charge confinement between electrodes.
- *Per-Aperture CFA*: As discussed in Section II-B, color cross-talk represents a major impediment to CMOS pixel scaling. The multi-aperture architecture solves this problem by using CCD pixel subarrays with no metal occlusions and a per-aperture color filter array.
- *Redundancy*: As pixel counts scale, it becomes increasingly difficult to achieve acceptable manufacturing yield without employing mechanisms for defect tolerance. This is achieved in the multi-aperture image sensor through the redundancy in the subimages.

In addition to enabling continued scaling of pixel count, the multi-aperture image sensor provides new important capabilities, including capturing depth information, relaxing the requirements on the objective lens, and imaging objects at close proximity without the need for objective optics. Realizing the benefits of the multi-aperture image sensor requires significant post-processing of the subimage data.
This should not be the limiting factor to its adoption, however, because of the ever decreasing cost of computational resources.

#### VI. ACKNOWLEDGMENT

The authors thank C. H. Tseng, David Yen, C. Y. Ko, J. C. Liu, Ming Li, and S. G. Wuu from TSMC for process customization and fabrication of the multi-aperture image sensor and Lane Brooks at MIT for collaboration on the python-based data acquisition software. Keith was supported by a Hertz Foundation Fellowship, which allowed for the flexibility that led to this research topic.

# REFERENCES

- [1] K. Fife, A. El Gamal, and H.-S. P. Wong, "A 3D multi-aperture image sensor architecture," *Custom Integrated Circuits Conference*, pp. 281–284, Sep. 2006.
- [2] T. Massoud and S. Gambhir, "Molecular imaging in living subjects: Seeing fundamental biological processes in a new light," *Genes & Dev*, pp. 545–580, 2003.
- [3] A. Theuwissen, *Solid-State Imaging with Charge-Coupled Devices*. Dordrecht: Kluwer, 1995.
- [4] K. Fife, A. El Gamal, and H. Wong, "A 0.5μm pixel frame-transfer CCD image sensor in 110nm CMOS," *IEDM*, pp. 1003–1006, Dec. 2007.
- [5] K. Fife, A. El Gamal, and H. Wong, "A 3MPixel multi-aperture image sensor with 0.7μm pixels in 0.11μm CMOS," *ISSCC Technical Digest*, pp. 48–49, Feb. 2008.
- [6] E. Adelson and J. Wang, "Single lens stereo with a plenoptic camera," *IEEE Trans. Pattern Anal. Mach. Intell.*, vol. 14, no. 2, pp. 99–106, Feb. 1992.
- [7] J. Tanida et al., "Thin observation module by bound optics (TOMBO): Concept and experimental verification," *Applied Optics*, vol. 40, pp. 1806–1813, Apr. 2001.
- [8] R. Thompson, D. Larson, and W. Webb, "Precise nanometer localization analysis for individual fluorescent probes," *Biophysical*, pp. 2775–2783, May 2002.
- [9] A. Yildiz et al., "Fluorophore imaging with 1.5-nm localization," *Science*, pp. 2061–2065, Jun. 2003.
- [10] H. Rhodes et al., "CMOS imager technology shrinks and image performance," *IEEE Workshop on Microelectronics and Electron Devices*, pp. 7–18, 2004.
- [11] M. Mori, M. Katsuno, S. Kasuga, T. Murata, and T. Yamaguchi, "A 1/4in 2M pixel CMOS image sensor with 1.75Transistor/pixel," in *ISSCC Technical Digest*, vol. 47, Feb. 2004, pp. 110–111.
- [12] W. H. White, D. R. Lampe, F. C. Blaha, and I. A. Mack, "Characterization of surface channel CCD image arrays at low light levels," *IEEE Journal of Solid-State Circuits*, vol. SC-9, pp. 1–14, Feb. 1974.

Keith Fife (S'02) received his B.S. and M.Eng. degrees in Electrical Engineering from Massachusetts Institute of Technology in 1999. He won the MIT 6.270 robot competition and an EE departmental award for his master's thesis. His work and research has led to several patents in imaging devices, circuits and systems. After finishing at MIT, he co-founded an image sensor company to develop solutions for consumer and automotive imaging markets. One product was recognized as "Best of CES" in 2001 and as "World's Thinnest Camera" by Guinness World Records in 2002. In 2003, he returned to graduate school at Stanford University to work on devices and architectures for new imaging systems. He is currently completing the Ph.D. degree at Stanford.

Abbas El Gamal (S'71-M'73-SM'83-F'00) received his B.Sc. degree in Electrical Engineering from Cairo University in 1972, the M.S. in Statistics and the Ph.D. in Electrical Engineering from Stanford in 1977 and 1978, respectively. From 1978 to 1980 he was an Assistant Professor of Electrical Engineering at USC. He has been on the Stanford faculty since 1981, where he is currently Professor of Electrical Engineering and the Director of the Information Systems Laboratory. He was on leave from Stanford from 1984 to 1988 first as Director of LSI Logic Research Lab, then as co-founder and Chief Scientist of Actel Corporation. In 1990, he co-founded Silicon Architects, which was later acquired by Synopsys. From 1997 to 2003, he was the principal investigator on the Stanford Programmable Digital Camera project.
His research has spanned several areas, including information theory, digital imaging, and integrated circuit design and design automation. He has authored or coauthored over 180 papers and 30 patents in these areas.

H.-S. Philip Wong (S'81-M'82-SM'95-F'01) received the B.Sc. (Hons.) in 1982 from the University of Hong Kong, the M.S. in 1983 from the State University of New York at Stony Brook, and the Ph.D. in 1988 from Lehigh University, all in electrical engineering. He joined the IBM T. J. Watson Research Center, Yorktown Heights, New York, in 1988. In September, 2004, he joined Stanford University as Professor of Electrical Engineering. While at IBM, he worked on CCD and CMOS image sensors, double-gate/multi-gate MOSFET, device simulations for advanced/novel MOSFET, strained silicon, wafer bonding, ultra-thin body SOI, extremely short gate FET, germanium MOSFET, carbon nanotube FET, and phase change memory. While he was Senior Manager, he had the responsibility of shaping and executing IBM's strategy on nanoscale science and technology as well as exploratory silicon devices and semiconductor technology. His research interests are in nanoscale science and technology, semiconductor technology, solid state devices, and electronic imaging. He is interested in exploring new materials, novel fabrication techniques, and novel device concepts for future nanoelectronics systems. His research also includes explorations into circuits and systems that are device-driven. His present research covers a broad range of topics including carbon nanotubes, semiconductor nanowires, self-assembly, exploratory logic devices, and novel memory devices. He is a Fellow of the IEEE and served on the IEEE Electron Devices Society (EDS) as elected AdCom member from 2001-2006. He served on the IEDM committee from 1998-2007 and was the Technical Program Chair in 2006 and General Chair in 2007.
He served on the ISSCC program committee from 1998-2004, and was the Chair of the Image Sensors, Displays, and MEMS subcommittee from 2003-2004. He serves on the Executive Committee of the Symposia of VLSI Circuits and Technology. He was the Editor-in-Chief of the IEEE Transactions on Nanotechnology in 2005-2006. He is a Distinguished Lecturer of the IEEE EDS and Solid-State Circuit Society. He is a member of the Emerging Research Devices Working Group of the International Technology Roadmap for Semiconductors (ITRS).