Statistical Image Quality Assessment: Digital Vs. Microfilm

One of the benefits associated with the deployment of digital image capture technologies is the elimination of the need to photographically record images on microfilm.

There is a general perception that the quality of microfilm images is inferior to that of images produced by recently deployed digital imaging equipment. Despite this perception, National City Corp. wanted to demonstrate statistically that the quality of its digital images would in fact meet or exceed customer requirements. The corporation established a Six Sigma project to objectively assess the quality of digital images as compared to the corresponding microfilm images of the same items. The scope of the project was limited to image capture on BTM 7780 sorters.

If the quality of electronically captured images were proven to meet or exceed customer requirements, the redundant microfilm image capture could potentially be eliminated.

Microfilm has traditionally been used primarily in an historical research capacity. The National City photocopy retrieval (PCR) department is a centralized operation located at the company’s Cleveland Operations Center. PCR has the responsibility of maintaining a seven-year library of microfilm. Also, PCR fulfills all requests for copies of items captured throughout National City, one of the nation’s largest financial holding companies. Photocopies are generally used to provide proof of payment, or to support internal research work. Many times, photocopies are requested by legal entities via summons and subpoena. Users (requestors) of photocopy images include National City customers – individuals and businesses, internal departments, legal courts and attorneys.

Metrics

Any Six Sigma initiative is based on the achievement of a goal through a calculated, incremental improvement of measurable requirements. Generally speaking, the primary metric driving this project could be stated as “image quality.” However, this needed to be defined more clearly in the context of the product’s customers and its intended use. Certainly, the inherently subjective nature of the metrics associated with this project presented a challenge.

One must first consider the current product (microfilm-based photocopies), its customers and the product’s intended use(s). While under no circumstance is a photocopy intended to suffice as a negotiable legal instrument, most uses aim to provide some level of proof or validation of past settlement and/or payment. Although a microfilm photocopy and a digital image differ in more obvious ways, e.g., brightness, contrast and resolution, it was decided to base an objective analysis on functional attributes in the context of the most likely customer requirements. These requirements, also known as critical-to-quality characteristics (CTQs), are based on the generally accepted seven points of negotiability of a check. The bank of first deposit (BOFD) endorsement on the back of the check also was included as a CTQ (Figure 1).

Measurement System: Statistical Population and Sample

At the time of this project, production BTM 7780 electronic image capture had been operational for only a short period of time at two sites, Royal Oak and Columbus. Therefore, the population set from which to take a statistical sample was defined as items captured at those two sites and between the dates of Sept. 1 and Sept. 30, 2004.

A sample size calculation indicated that approximately 800 items would be required to detect a statistical difference between the two sources – image and microfilm. The sample set was randomly selected based on date and sequence number, the minimum criteria required to uniquely identify an item. An item sequence number consists of the following components:

Sequence number format: AABBCCCCCC
Where:
AA is the two-digit capture location identifier
BB is the two-digit sorter identifier
CCCCCC is the six-digit item identifier
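As an illustration only, the format above could be split into its three components with a few lines of Python. The function and type names here are hypothetical, and the example value is made up; the project itself did not necessarily use any such script.

```python
from typing import NamedTuple


class SequenceNumber(NamedTuple):
    location: str  # AA: two-digit capture location identifier
    sorter: str    # BB: two-digit sorter identifier
    item: str      # CCCCCC: six-digit item identifier


def parse_sequence_number(seq: str) -> SequenceNumber:
    """Split a 10-digit AABBCCCCCC sequence number into its components."""
    if len(seq) != 10 or not (seq.isascii() and seq.isdigit()):
        raise ValueError(f"expected exactly 10 digits, got {seq!r}")
    return SequenceNumber(seq[0:2], seq[2:4], seq[4:10])


# Made-up example value, not taken from the study.
print(parse_sequence_number("0103004217"))
```

Keeping the components as strings preserves leading zeros, which matter when the identifiers are matched against archive records.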

Prior to the generation of the sample set, the item count of the total population set was calculated. The population item count was then broken down according to capture location, sorter and date. The beginning and ending sequence numbers were obtained for each segment. A proportionate number of random, uniformly distributed values were generated from each individual sequence number interval. The purpose of this approach was to ensure that each sorter at each site was given proportionately equal representation within the sample set. The fact that all items had been captured both electronically and on microfilm provided the advantage of being able to perform a direct comparison between identical sample sets from each source.
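The proportionate sampling described above can be sketched as follows. This is a minimal illustration, not the project's actual procedure: the segment keys and sequence ranges are invented, and drawing without replacement (so that no item is picked twice) is an assumption.

```python
import random


def proportional_sample(segments, total_sample, seed=None):
    """Draw a sample from each segment in proportion to its item count.

    segments maps a (site, sorter, date) key to an inclusive
    (first_sequence, last_sequence) range of integer item identifiers.
    """
    rng = random.Random(seed)
    sizes = {key: last - first + 1 for key, (first, last) in segments.items()}
    population = sum(sizes.values())
    sample = {}
    for key, (first, last) in segments.items():
        # Quota proportional to the segment's share of the population.
        quota = min(round(total_sample * sizes[key] / population), sizes[key])
        # Uniform draw without replacement from the sequence interval.
        sample[key] = sorted(rng.sample(range(first, last + 1), quota))
    return sample


# Hypothetical segments: (site, sorter, date) -> (first, last) sequence number.
segments = {
    ("Royal Oak", "01", "2004-09-01"): (1, 6000),
    ("Royal Oak", "02", "2004-09-01"): (1, 3000),
    ("Columbus", "01", "2004-09-01"): (1, 1000),
}
picked = proportional_sample(segments, total_sample=100, seed=42)
print({key: len(values) for key, values in picked.items()})
```

Because each quota is rounded independently, the total can differ from the requested sample size by an item or two; a production version would need to reconcile that.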

Sample items were retrieved from both the image archive and the microfilm library. Microfilm retrieval requests were submitted and processed along with production requests to ensure handling consistent with that of production requests. Electronic images were retrieved individually by way of the Checkworks image archive access tool, and printed on standard black and white laser printers. Prior to inspection and evaluation, printed images from both sources were matched in pairs and labeled for tracking.

All non-check items were excluded from the study. The final size of the sample set after adjustments and the removal of all invalid items was 812 unique items. Since two images (one from each source) were inspected per item, the total number of unique images was 1,624.

Attribute Agreement Analysis

The purpose of an attribute agreement analysis (AAA) is to assess the reliability of the measurement system. Similar to a gage R&R study, the AAA is specifically designed for measurement systems that rely on human inspection, and therefore are potentially subject to variability and bias. The goal of this AAA was to provide a level of confidence in the repeatability of the inspectors’ measurements. To this end, the inspection of all 1,624 unique images was performed twice, by two entirely separate groups of volunteers. The repeatability of measurement results was determined by tallying responses for all items and calculating the percentage of agreement versus disagreement between the two sessions. Generally, a consensus level of at least 75 percent is expected. The two right-hand columns of Table 1 show a complete list of AAA results for each CTQ.
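The percentage of agreement between sessions amounts to a simple tally. The sketch below assumes that the two sessions' responses for one CTQ are held in two lists in the same item order, and that an exact match of the recorded response counts as agreement; the data shown is invented.

```python
def percent_agreement(session_a, session_b):
    """Percentage of items on which both sessions recorded the same response."""
    if len(session_a) != len(session_b):
        raise ValueError("both sessions must cover the same items")
    if not session_a:
        raise ValueError("no responses to compare")
    matches = sum(a == b for a, b in zip(session_a, session_b))
    return 100.0 * matches / len(session_a)


# Invented responses for five items on a single CTQ.
session_a = ["1", "1", "0", "X", "1"]
session_b = ["1", "0", "0", "X", "1"]
print(percent_agreement(session_a, session_b))  # 80.0
```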

Data Collection Plan

In an effort to obtain a diverse and unbiased assessment, 64 volunteers were solicited from the information systems organization. Volunteers included National City employees and contractors. Participants were not expected to have any prior knowledge of electronic image capture, nor were they chosen on such criteria. A total of 3,248 individual inspections were required, because two copies of each item (one from each source) had to be evaluated twice for the purpose of the attribute agreement analysis. Individual hardcopies of images were randomly mixed and split into 32 packets.

Given the subjective nature of the quality of printed images, it was decided to pursue an evaluation based simply on legibility of the individual CTQs of the sample items. Prior to the inspection of images, participants were asked not to show bias in their decisions based on subjective factors such as brightness, contrast and overall clarity. Instead, participants were asked only to attempt to read each of the fields of the item, and to judge them on basic legibility. Each field was to be assessed as either a pass or a fail. In the event that a given field could not be located on an item, the inspector had the option of indicating that no information was present. The choices presented to inspectors were:

1 = Pass
0 = Fail
X = No data present

Each inspector was given a packet containing an assessment sheet and a set of approximately 51 printed images, both from the image archive and from the microfilm library. The images from the two sources were randomly intermixed to avoid any overt indication as to the source of each. As the inspectors worked through their set of items, they recorded their decisions on the inspection record provided. The average time taken for one inspector to complete 51 images was approximately 45 minutes, although no time limit was imposed for the exercise.
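The random mixing and splitting into packets could be reproduced along these lines. This is only a sketch: the tuple representation of an image and the round-robin dealing are assumptions chosen for illustration.

```python
import random


def make_packets(image_ids, n_packets, seed=None):
    """Shuffle the images and deal them into packets of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    # Round-robin dealing keeps packet sizes within one image of each other.
    return [shuffled[i::n_packets] for i in range(n_packets)]


# 812 items, each printed once from each source (1,624 images in total).
images = [(item, source) for item in range(812) for source in ("image", "microfilm")]
packets = make_packets(images, 32, seed=1)
print(sorted({len(packet) for packet in packets}))  # [50, 51]
```

With 1,624 images and 32 packets, every packet holds 50 or 51 images, which is consistent with the figure of approximately 51 quoted above.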

Data Analysis

Once data collection was complete, the results were keyed into a master Excel spreadsheet. In the process, various steps were taken to ensure accuracy. Non-check items were removed, and a final balance was performed (by session and source) to ensure that each item underwent exactly four inspections – session A, session B, source: image, source: microfilm.
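A balance check of this kind can be expressed as a short script. The sketch below assumes each keyed record is reduced to an (item, session, source) tuple; the record layout and function name are hypothetical and the sample records are invented.

```python
from collections import Counter, defaultdict

# The four inspections every item should have received.
EXPECTED = {(session, source)
            for session in ("A", "B")
            for source in ("image", "microfilm")}


def unbalanced_items(records):
    """Return items that lack one of the four inspections or repeat one."""
    counts = defaultdict(Counter)
    for item_id, session, source in records:
        counts[item_id][(session, source)] += 1
    problems = {}
    for item_id, combos in counts.items():
        missing = EXPECTED - set(combos)
        duplicated = {combo for combo, n in combos.items() if n > 1}
        unexpected = set(combos) - EXPECTED
        if missing or duplicated or unexpected:
            problems[item_id] = {"missing": missing,
                                 "duplicated": duplicated,
                                 "unexpected": unexpected}
    return problems


# Invented records: item 1 is balanced, item 2 is not.
records = [
    (1, "A", "image"), (1, "A", "microfilm"), (1, "B", "image"), (1, "B", "microfilm"),
    (2, "A", "image"), (2, "A", "image"), (2, "B", "microfilm"),
]
print(unbalanced_items(records))
```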

Hypothesis Test

Data was imported into Minitab 14 statistical software for final analysis. Given the attribute nature of the data, a two-sample test of proportions was chosen to test for a statistical difference between sources (image versus microfilm). Testing was based on a confidence level of 95 percent, as was the initial sample size calculation. An alpha risk of 5 percent and a beta risk of 10 percent were used. The null hypothesis (Ho) assumed there was no statistical difference between the process capabilities of microfilm image capture and electronic image capture. The alternate hypothesis (Ha) asserted that there was in fact a statistical difference (either greater or lesser) between the two process capabilities. The resulting P-value was extremely low (in most cases 0) on all tests for the individual CTQs. A P-value of less than .05 indicates a statistically significant difference between the two sources at the 95 percent confidence level.
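For readers without access to Minitab, a two-sided two-sample test of proportions can be sketched in plain Python. This version uses the pooled normal approximation, which may not match the exact options used in the project. The counts in the example are back-calculated from the combined Payor percentages in Table 1 on the assumption that all 1,624 inspections per source were counted; they are not stated in the study.

```python
import math


def two_proportion_z_test(pass1, n1, pass2, n2):
    """Two-sided z-test for a difference between two proportions (pooled)."""
    p1, p2 = pass1 / n1, pass2 / n2
    pooled = (pass1 + pass2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("test is undefined when every result is identical")
    z = (p1 - p2) / se
    # Two-sided P-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts: about 99.5% of 1,624 (image) versus 60.2% of 1,624 (microfilm).
z, p_value = two_proportion_z_test(1616, 1624, 978, 1624)
print(f"z = {z:.1f}, reject Ho at alpha 0.05: {p_value < 0.05}")
```

With a gap this large and samples this size, the P-value is far below any display precision, which is consistent with the zeros reported for the individual CTQs.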

The actual “pass” percentage values that each test produced may in some cases provide more insight than the P-value. These can be found in columns 2 through 7 of Table 1.

Table 1: Percent Non-Defective & Attribute Agreement Analysis
Percent Non-Defective “Pass” columns: 2-Sample Test of Proportions Based on 95% Confidence Interval, 10% Beta. Attribute Agreement Analysis (AAA) columns: Percent Consensus Between Session A and B.
CTQ	Session A Image	Session A Microfilm	Session B Image	Session B Microfilm	Combined A:B Image	Combined A:B Microfilm	P-Value	AAA Image	AAA Microfilm
Payor	99.9	58.2	99.1	62.1	99.5	60.2	0.0	97.9	68.4
Date	99.6	81.0	99.8	79.0	99.7	80.0	0.0	99.1	82.4
Payor Bank	98.6	23.7	97.9	32.3	98.2	28.0	0.0	90.4	62.2
Amount	99.1	74.3	99.8	75.0	99.4	74.7	0.0	98.3	78.5
Signature	97.5	71.6	97.9	77.0	97.7	74.3	0.0	96.1	74.3
MICR	97.7	52.3	98.4	56.3	98.0	54.3	0.0	96.6	63.9
Payee	99.1	63.1	99.5	70.0	99.3	66.5	0.0	97.8	72.9
BOFD	95.2	35.0	95.7	47.7	95.4	41.3	0.0	92.5	65.5

DPU, DPMO and Sigma Level

Probably the most telling indicators of the difference in quality performance between the two image capture processes can be found in Table 2.

Table 2: DPU, DPMO and Sigma Level
Process	DPU	DPMO	Percent Yield	Sigma Level (Short-Term)
Electronic Image Capture	.1318	16,472	98.35	3.6
Microfilm Image Capture	3.3510	418,873	58.11	1.7

DPU, or defects per unit, is a statistic defined as the total number of defects divided by the total number of units produced. For this study, the total number of defects found on electronically captured items was 107 across a sample of 812 units, thus the result of .1318 defects per unit.

DPMO, or defects per million opportunities, is a statistic defined as the total number of defects divided by total opportunities for defects to occur, multiplied by 1 million. This statistic is more commonly used as a gauge of product quality across disparate products or processes. For this study, the total number of defects found on electronically captured items was 107 across a sample of 812 units. Each unit provides eight distinct opportunities for defects to occur (eight CTQs). Therefore, across the entire sample, there were 6,496 opportunities for defects. The total of 107 defects is divided by 6,496 opportunities, thus equaling .016472. Finally, .016472 is multiplied by 1,000,000, which results in 16,472 DPMO.
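The DPU and DPMO arithmetic above can be checked with a few lines of Python. The function names are illustrative; the inputs (107 defects, 812 units, eight opportunities per unit) are the figures quoted in the text.

```python
def dpu(defects, units):
    """Defects per unit."""
    return defects / units


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def percent_yield(defects, units, opportunities_per_unit):
    """Share of opportunities that were defect-free, as a percentage."""
    return 100.0 * (1 - defects / (units * opportunities_per_unit))


# Electronic image capture: 107 defects, 812 units, 8 CTQs per unit.
print(round(dpu(107, 812), 4))               # 0.1318
print(round(dpmo(107, 812, 8)))              # 16472
print(round(percent_yield(107, 812, 8), 2))  # 98.35
```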

Sigma level is a measure of process capability with respect to defect-free production. Sigma level is derived directly from DPMO: the lower the DPMO, the higher the sigma level. In the case of electronic image capture, the measured DPMO was 16,472, which approximately equates to a process capability of 3.6 sigma (short-term).
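The conversion from DPMO to a short-term sigma level can be sketched as below. The study does not state its conversion method; this example assumes the common Six Sigma convention of taking the normal quantile of the yield and adding a 1.5-sigma shift, which does reproduce the values in Table 2.

```python
from statistics import NormalDist


def sigma_level(dpmo, shift=1.5):
    """Short-term sigma level from DPMO, assuming the conventional 1.5 shift."""
    if not 0 < dpmo < 1_000_000:
        raise ValueError("DPMO must be strictly between 0 and 1,000,000")
    yield_fraction = 1 - dpmo / 1_000_000
    # Normal quantile of the defect-free fraction, plus the assumed shift.
    return NormalDist().inv_cdf(yield_fraction) + shift


print(round(sigma_level(16_472), 1))   # 3.6 (electronic image capture)
print(round(sigma_level(418_873), 1))  # 1.7 (microfilm image capture)
```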

Statistical Image Quality Assessment: Digital Vs. Microfilm – An iSixSigma Case Study