
The Universal Product Code (UPC or UPC code) is a barcode symbology that is widely used worldwide for tracking trade items in stores. UPC (which technically refers to UPC-A) consists of 12 digits that are uniquely assigned to each trade item. Along with the related International Article Number (EAN) barcode, the UPC is the barcode mainly used for scanning trade items at the point of sale, per the specifications of the international GS1 organisation. UPC data structures are a component of Global Trade Item Numbers (GTINs) and follow the global GS1 specification, which is based on international standards. However, some retailers (for example in clothing and furniture) do not use the GS1 system and rely instead on other barcode symbologies or article number systems. Conversely, some retailers use the EAN/UPC barcode symbology without a GTIN, for products sold only in their own stores. Research indicates that the adoption and diffusion of the UPC stimulated innovation and contributed to the growth of international retail supply chains.


History

Wallace Flint proposed an automated checkout system in 1932 using punched cards. Bernard Silver and Norman Joseph Woodland, a graduate student from Drexel Institute of Technology, developed a bull's-eye-style code and applied for a patent in 1949. In the 1960s and early 1970s, railroads in North America experimented with multicolor bar codes for tracking railcars, but this system was eventually abandoned and replaced with a radio-based system called Automatic Equipment Identification (AEI).

In 1973, a group of trade associations from the grocery industry formed the Uniform Product Code Council (UPCC) which, with the help of consultants Larry Russell and Tom Wilson of McKinsey & Company, defined the numerical format that formed the basis of the Uniform Product Code. Technology firms including Charegon, IBM, Litton-Zellweger, Pitney Bowes-Alpex, Plessey-Anker, RCA, Scanner Inc., Singer, and Dymo Industries/Data General put forward alternative proposals for symbol representations to the council. The Symbol Selection Committee finally chose to implement the IBM proposal designed by George J. Laurer, but with a slight modification to the font in the human-readable area.

The first UPC-marked item ever to be scanned at a retail checkout was a 10-pack (50 sticks) of Wrigley's Juicy Fruit chewing gum, purchased at the Marsh supermarket in Troy, Ohio, at 8:01 a.m. on 26 June 1974. The NCR cash register rang up 67 cents. The shopping cart also contained other barcoded items, but the gum was the first one picked up at the checkout. A facsimile of the gum packet went on display at the Smithsonian Institution's American history museum in Washington, D.C.

Murray Eden was a consultant on the team that created the Universal Product Code barcode. As chairman of a committee of scientists at the Massachusetts Institute of Technology, he helped "select a symbol that would endure the inevitable rush of technology that lay ahead." He chose the font, and he came up with the idea of adding numbers to the bottom as a fail-safe in case the barcode reader is not working correctly.


IBM proposal

Around late 1969, IBM at
Research Triangle Park
(RTP) in North Carolina assigned George Laurer to determine how to make a supermarket scanner and label. In late 1970, Heard Baumeister provided equations to calculate characters-per-inch achievable by two IBM bar codes, Delta A and Delta B. In February 1971, Baumeister joined Laurer. Delta B compared bar widths to space width to code bits. This was extremely sensitive to ink spread, where improper levels of ink or pressure would cause both edges of a bar to spread outward or shrink in. In mid 1971, William "Bill" Crouse invented a new bar code called Delta C. It achieved four times the characters per inch as Delta B. Delta C achieved its higher performance by only using leading to leading or trailing to trailing edges which was unaffected by uniform ink spread. The code provided best performance when it had a defined character set with a fixed reference distance that spanned most or preferably all the character. In August 1971, Crouse joined the scanner effort. After several months they had made no progress. They were aware of the RCA bull's eye label that could be scanned with a simple straight line laser scanner, but a readable label was far too large. Although Litton Industries proposed a bull's eye symbol cut in half to reduce the area, it was still too large and presented the same ink smear printing problems as the RCA symbol. The redundancy and checking ability were removed completely. They were also aware of the many proposals from around the world, none of which were feasible. In the spring of 1972, Baumeister announced a breakthrough. He proposed a label with bars that were slightly longer than the distance across all bars that needed to be read in a single pass. This label could be scanned with a simple "X" scanner only slightly more complex than the straight line laser scanner. The next day Baumeister suggested if the label were split into two halves the bar lengths could be cut nearly in half. 
These two proposals reduced the area from the bull's eye by one third and then one sixth. The image to the right shows the label proposed by Baumeister. He did not specify any specific bar code as that was well understood. Except for the bar coding and ten digits the UPC label today is his proposal. Shortly after that Baumeister transferred to another area of RTP. Laurer proceeded to define the details of the label and write a proposal. N.J. Woodland was assigned as planner for the project and aided Laurer with writing his proposal. Laurer's first attempt with a bar code used Delta B. The resulting label size was about six inches by three inches which was too large. Crouse suggested that Laurer use his Delta C bar code and provided a copy of his patent that had a sample alphanumeric character set and rules to generate other size alphabets. This reduced the label size to about 1.5” x 0.9”. Later Laurer asked Crouse for assistance in how the scanner could detect a label. Together they defined guard bars and a definition of how to detect the label. The guard bars also provided identification for half label discrimination and training bars for the scanner threshold circuits. Laurer had a complete label definition and proceeded to write his proposal. Previously Crouse had an idea for a simple wand worn like a ring and bracelet. He decided to develop that wand to provide a demonstration of the label. On 1 December 1972, IBM presented Laurer's proposal to the Super Market Committee in
Rochester, Minnesota
, the location where IBM would develop the scanner. During the presentation, Crouse gave a lab demonstration where he read UPC-like labels with his ring wand. In addition to reading regular labels, he read the large two-page centerfold label in the proposal booklet. He then turned to a page showing a photo of labeled items sitting on a table. The labels were small and flawed due to the resolution of the printed photo but the wand read many of them. This demonstration showed the robustness of the pure Delta C code. The proposal was accepted. One month later, 1 January 1973 Crouse transferred back to IBM's Advanced Technology group, and Laurer remained with the full responsibility for the label. Dymo Industries, makers of handheld printing devices insisted that the code be character independent, so that handheld printing devices could produce the bar code in store if the items were not bar-coded by the manufacturers. Dymo's proposal was accepted by IBM and incorporated in IBM's latest proposal. It was decided that the two halves of the label should have a different set of numeric characters. The character set Laurer derived from the Delta C patent used seven printable increments or units where two bars and two spaces would be printed. This yielded twenty combinations of characters, but there were two pairs that when read by Delta C rules yielded the same code for the pair. Since eighteen characters were not enough Laurer tried adding one unit to the character set. This yielded twenty-six Delta C characters which could provide the two sets of decimal characters but it also added fourteen percent to the width of the label and thereby the height. This would be a thirty percent increase in area or a label of 1.7”x1.03”. Laurer felt this was not acceptable. Laurer returned to the original character set with twenty characters but four of those were two pairs with the same Delta C reading. He decided to use them all. 
To distinguish between the pairs he would measure one bar width in each of the pairs to distinguish them from each other. For each pair those bars would be one or two units wide. Laurer didn't apply Baumeister's equations to this set. He felt just one bar width measurement would not be too serious. As it turned out it would have required over fifty percent increase in width and height for an area increase of more than double. Laurer later admitted these four characters in each set were responsible for most of the scanner read errors. David Savir, a mathematician, was given the task of proving the symbol could be printed and would meet the reliability requirements, and was most likely unaware of Baumeister's equations. He and Laurer added two more digits to the ten for error correction and detection. Then they decided to add odd/even parity to the number of units filled with bars in each side. Odd/even parity is a technique used to detect any odd number of bit errors in a bit stream. They decided to use odd on one half and even on the other. This would provide additional indication of which half ticket was being read. This meant that every bar width had to be read accurately to provide a good reading. It also meant every space would also be known. Requiring every bit width to be read precisely basically nullified the Delta C advantage except for the Delta C reference measurement. Only the strange character set and the size of the label remains as a shadow of the Delta C code. The size was still that calculated for pure Delta C. If the label size had been properly recalculated, taking into account the required bar width measurements the label would have been far too large to be acceptable. Mechanical engineering and electronic circuit design commonly require worst case designs using known tolerances. Many engineers working with bar codes had little experience with such things and used somewhat intuitive methods. 
This was the cause of the poor performance of the Delta B code and quite likely the failure of RCA's bull's eye scanner. The following table shows the workable labels, available in the early 1970s, with their sizes. This is assuming a bull's eye with the same information and reliable readability.


Composition

Each UPC-A barcode consists of a scannable strip of black bars and white spaces above a sequence of 12 numerical digits. No letters, characters or other content of any kind may appear on a UPC-A barcode. There is a one-to-one correspondence between a 12-digit number and a strip of black bars and white spaces, i.e. there is only one way to represent each 12-digit number visually and there is only one way to represent each strip of black bars and white spaces numerically.

The scannable area of every UPC-A barcode follows the pattern SLLLLLLMRRRRRRE, where the S (start), M (middle), and E (end) guard patterns are represented the same way on every UPC-A barcode and the L (left) and R (right) sections collectively represent the 12 numerical digits that make each UPC-A unique. The first digit L indicates a particular number system to be used by the following digits. The last digit R is an error-detecting check digit that allows some errors to be detected in scanning or manual entry. The guard patterns separate the two groups of six numerical digits and establish the timing.

UPC-A 042100005264 is equivalent to UPC-E 425261 with the "EOEEOO" parity pattern, which is defined by UPC-A number system 0 and UPC-A check digit 4.


Formatting

UPC-A barcodes can be printed at various densities to accommodate a variety of printing and scanning processes. The significant dimensional parameter is called the x-dimension (the width of a single module element). The width of each bar (space) is determined by multiplying the x-dimension by the module width (1, 2, 3, or 4 units) of that bar (space). Since the guard patterns each include two bars, and each of the 12 digits of the UPC-A barcode consists of two bars and two spaces, all UPC-A barcodes consist of exactly (3 × 2) + (12 × 2) = 30 bars, of which 6 represent guard patterns and 24 represent numerical digits.

The x-dimension for the UPC-A at the nominal size is 0.33 mm (0.013"). The nominal symbol height for UPC-A is 25.9 mm (1.02"). The bars forming the S (start), M (middle), and E (end) guard patterns are extended downwards by 5 times the x-dimension, with a resulting nominal symbol height of 27.55 mm (1.08"). This also applies to the bars of the first and last numerical digits of a UPC-A barcode. UPC-A can be reduced or magnified anywhere from 80% to 200%.

A quiet zone, with a width of at least 9 times the x-dimension, must be present on each side of the scannable area of the UPC-A barcode. For a GTIN-12 number encoded in a UPC-A barcode, the first and last digits of the human-readable interpretation are always placed outside the symbol in order to indicate the quiet zones that are necessary for UPC barcode scanners to work properly.
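The dimensional rules above reduce to simple arithmetic, which can be sketched as follows. This is a minimal illustration, not a GS1 reference implementation: the function name `upca_dimensions` and the dictionary keys are hypothetical, the 95-module symbol width is taken from the encoding rules of UPC-A, and the bar height is assumed to scale linearly with the magnification factor.

```python
# Nominal values quoted in the text above.
NOMINAL_X_DIMENSION_MM = 0.33   # width of one module at 100% size
NOMINAL_BAR_HEIGHT_MM = 25.9    # nominal symbol height
SYMBOL_MODULES = 95             # guard patterns plus 12 digits of 7 modules each
QUIET_ZONE_MODULES = 9          # minimum quiet zone on each side
GUARD_EXTENSION_MODULES = 5     # guard bars extend down by 5 x-dimensions


def upca_dimensions(magnification: float = 1.0) -> dict:
    """Return approximate UPC-A dimensions in millimetres (hypothetical helper)."""
    if not 0.8 <= magnification <= 2.0:
        raise ValueError("UPC-A may only be scaled between 80% and 200%")
    x = NOMINAL_X_DIMENSION_MM * magnification
    # Assumption: the bar height scales with the same factor as the x-dimension.
    bar_height = NOMINAL_BAR_HEIGHT_MM * magnification
    return {
        "x_dimension_mm": x,
        # Width of a bar or space that is 1, 2, 3, or 4 modules wide.
        "bar_widths_mm": {m: m * x for m in (1, 2, 3, 4)},
        "symbol_width_mm": SYMBOL_MODULES * x,
        "total_width_mm": (SYMBOL_MODULES + 2 * QUIET_ZONE_MODULES) * x,
        "bar_height_mm": bar_height,
        "guard_bar_height_mm": bar_height + GUARD_EXTENSION_MODULES * x,
    }


if __name__ == "__main__":
    for name, value in upca_dimensions(1.0).items():
        print(name, value)
```

At nominal size the guard bar height comes out at about 27.55 mm, matching the figure given in the text.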


Encoding

The UPC-A barcode is visually represented by strips of bars and spaces that encode the UPC-A 12-digit number. Each digit is represented by a unique pattern of 2 bars and 2 spaces. The bars and spaces are of variable width, i.e. 1, 2, 3, or 4 modules wide. The total width for a digit is always 7 modules; consequently, a UPC-A 12-digit number requires a total of 7 × 12 = 84 modules.

A complete UPC-A is 95 modules wide: 84 modules for the digits (L and R sections) combined with 11 modules for the S (start), M (middle), and E (end) guard patterns. The S (start) and E (end) guard patterns are 3 modules wide and use the pattern bar-space-bar, where each bar and space is one module wide. The M (middle) guard pattern is 5 modules wide and uses the pattern space-bar-space-bar-space, where each bar and space is also one module wide. In addition, a UPC-A symbol requires a quiet zone (extra space 9 modules wide) before the S (start) and after the E (end) guard patterns.

The UPC-A's left-hand side digits (the digits to the left of the M (middle) guard pattern) have odd parity, which means the total width of the black bars is an odd number of modules. In contrast, the right-hand side digits have even parity. Consequently, a UPC scanner can determine whether it is scanning a symbol from left to right or from right to left (the symbol is upside-down). After seeing an S (start) or E (end) guard pattern (they are the same, bar-space-bar, whichever direction they are read), the scanner will first see odd-parity digits if scanning left to right, or even-parity digits if scanning right to left. With the parity/direction information, an upside-down symbol will not confuse the scanner. When confronted with an upside-down symbol, the scanner may simply ignore it (many scanners alternate left-to-right and right-to-left scans, so they will read the symbol on a subsequent pass) or recognize the digits and put them in the right order.

There is another property in the digit encoding. The right-hand side digits are the optical inverse of the left-hand side digits, i.e. black bars are turned into white spaces and vice versa. For example, the left-hand side "4" is space×1, bar×1, space×3, bar×2, whereas the right-hand side "4" is bar×1, space×1, bar×3, space×2.
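The module layout described above can be sketched in a few lines of code. The following is an illustrative sketch only: the function names are hypothetical, and the left-hand digit patterns in `L_CODES` are the commonly published UPC-A left-hand patterns (1 = bar module, 0 = space module), which this section itself does not list. The right-hand patterns are derived by optical inversion, as the text describes.

```python
# Left-hand (odd parity) 7-module patterns for digits 0-9; 1 = bar, 0 = space.
# These values are the commonly published UPC-A patterns, not taken from this text.
L_CODES = [
    "0001101", "0011001", "0010011", "0111101", "0100011",
    "0110001", "0101111", "0111011", "0110111", "0001011",
]

START_END_GUARD = "101"    # bar-space-bar
MIDDLE_GUARD = "01010"     # space-bar-space-bar-space


def invert(pattern: str) -> str:
    """Swap bars and spaces to obtain the right-hand pattern."""
    return "".join("1" if c == "0" else "0" for c in pattern)


def encode_upca(digits: str) -> str:
    """Return the 95-module string for a 12-digit UPC-A number (hypothetical helper)."""
    if len(digits) != 12 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("UPC-A requires exactly 12 decimal digits")
    left = "".join(L_CODES[int(d)] for d in digits[:6])
    right = "".join(invert(L_CODES[int(d)]) for d in digits[6:])
    return START_END_GUARD + left + MIDDLE_GUARD + right + START_END_GUARD


if __name__ == "__main__":
    modules = encode_upca("036000291452")
    print(len(modules))   # total symbol width in modules
    print(modules)
```

Because every left-hand pattern has an odd number of bar modules and every inverted pattern an even number, a decoder built on this table could infer the scan direction from the parity of the first digit it sees.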


Numbering

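The number of possible UPC-A and UPC-E barcodes is limited by the standards used to create them.
; UPC-A:
: \begin{align} (\text{possible values per left digit})^{\text{6 left digits}} &\times (\text{possible values per right digit})^{\text{5 right digits}}\\ = 10^6 &\times 10^5 = 100{,}000{,}000{,}000. \end{align}
; UPC-E:
: \begin{align} (\text{possible values per digit})^{\text{6 digits}} &\times (\text{2 possible parity patterns per UPC-E number})\\ = 10^6 &\times 2 = 2{,}000{,}000. \end{align}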


Number system digit

Below is a description of all possible number systems with the corresponding 12-digit UPC-A numbering schema LLLLLLRRRRRR, where the first L denotes the number system digit and the last R the check digit.
; 0–1, 6–9 : For most products. The LLLLL digits are the manufacturer code (assigned by the local GS1 organization), and the RRRRR digits are the product code.
; 2 : Reserved for local use (store/warehouse), for items sold by variable weight. Variable-weight items, such as meats, fresh fruits, or vegetables, are assigned an item number by the store if they are packaged there. In this case, the LLLLL is the item number, and the RRRRR is either the weight or the price, with the first R determining which (0 for weight).
; 3 : Drugs by National Drug Code (NDC) number. Pharmaceuticals in the U.S. use the middle 10 digits of the UPC as their NDC number. Though usually only over-the-counter drugs are scanned at point of sale, NDC-based UPCs are used on prescription drug packages and surgical products and, in this case, are commonly called UPN Codes.
; 4 : Reserved for local use (store/warehouse), often for loyalty cards or store coupons.
; 5 : Coupons. The LLLLL digits are digits 2-6 of the product's UPC prefix, the next three RRR are a family code (set by the manufacturer or supplied by the coupon clearing house), and the next two RR are a value code (according to the GS1 value code table), which determines the amount of the discount. These coupons can be doubled or tripled.


Check digit calculation

The UPC includes a check digit to detect common data entry errors. For example, UPC-A codes choose the check digit x_{12} to satisfy the check digit equation:
:(3x_1 + x_2 + 3x_3 + x_4 + 3x_5 + x_6 + 3x_7 + x_8 + 3x_9 + x_{10} + 3x_{11} + x_{12}) \equiv 0 \pmod{10}.
If an entered code does not satisfy the equation, then it is not a valid UPC-A.

The UPC-A check digit may be calculated as follows:
# Sum the digits at odd-numbered positions (first, third, fifth, ..., eleventh).
# Multiply the result by 3.
# Add the digit sum at even-numbered positions (second, fourth, sixth, ..., tenth) to the result.
# Find the result modulo 10 (i.e. the remainder when divided by 10) and call it M.
# If M is zero, then the check digit is 0; otherwise the check digit is 10 − M.

For example, in a UPC-A barcode "03600029145x_{12}", where x_{12} is the unknown check digit, x_{12} may be calculated by:
# Sum the odd-numbered digits (0 + 6 + 0 + 2 + 1 + 5 = 14).
# Multiply the result by 3 (14 × 3 = 42).
# Add the even-numbered digits (42 + (3 + 0 + 0 + 9 + 4) = 58).
# Find the result modulo 10 (58 mod 10 = 8 = M).
# If M is not 0, subtract M from 10 (10 − M = 10 − 8 = 2).
Thus, the check digit x_{12} is 2.

The check digit equation is selected to have reasonable error detection properties (see Luhn algorithm).
* UPC-A can detect 100% of single-digit errors.
*: A single-digit error means exactly one digit is wrong. Let the difference modulo 10 of the erroneous digit and the correct digit be d. The value of d cannot be zero because that would mean the digits are the same, but d can be any other value in {1, 2, ..., 9}. If the erroneous digit is in an even position (weight 1), the left-hand side of the check digit equation changes by d and is no longer congruent to zero. If the erroneous digit is in an odd position (weight 3), the left-hand side changes by 3d, but that change is also nonzero modulo 10 because 3 is coprime to 10, so the check digit equation is not satisfied.
* UPC-A can detect about 89% of transposition errors. Specifically, the UPC-A cannot detect the transposition of two adjacent digits if and only if the difference between them is 5.
*# If two neighboring digits are transposed, then one of the digits, a, will be weighted by 1, and the other digit, b = a + d, will be weighted by 3, where d is the difference between the two digits. If the digits were in their correct order, they would contribute
*#::1a + 3b = 1a + 3(a + d) = 4a + 3d
*#:to the left-hand side of the check digit equation. In the transposed order, they contribute
*#::1b + 3a = 3a + 1(a + d) = 4a + d
*#:to the left-hand side. Subtracting the two contributions gives how much they change the left-hand side:
*#::(4a + 3d) − (4a + d) = 2d
*#:An error will be detected as long as the modular change is nonzero; if 2d ≡ 0 modulo 10, then the change will not be detected. Consequently, only when the character difference d ≡ 5 will an error be undetected (when d ≡ 0 the degenerate "transposition" is not an error).
*# Next consider how often a transposition has a distance d of 5. In the Table of d-transpositions for UPC-A barcodes, where d ∈ {0, 1, 2, ..., 9}, the Row Sum contains the number of d-transpositions: 10 ordered digit pairs for d = 0 and 2(10 − d) ordered pairs for each d from 1 to 9. The proportion of non-detectable transposition errors is therefore (ignoring the transpositions where d = 0):
*#::\frac{10}{18 + 16 + 14 + 12 + 10 + 8 + 6 + 4 + 2} = \frac{10}{90} = 11.111\ldots\%.
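The calculation steps and the transposition argument above can be checked with a short sketch. The function names below are hypothetical and chosen for illustration; the weights (3 for odd positions, 1 for even positions) follow the check digit equation given in this section.

```python
def upca_check_digit(first11: str) -> int:
    """Compute the UPC-A check digit from the first 11 digits."""
    if len(first11) != 11 or not (first11.isascii() and first11.isdigit()):
        raise ValueError("expected exactly 11 decimal digits")
    odd_sum = sum(int(c) for c in first11[0::2])    # positions 1, 3, ..., 11
    even_sum = sum(int(c) for c in first11[1::2])   # positions 2, 4, ..., 10
    m = (3 * odd_sum + even_sum) % 10
    return 0 if m == 0 else 10 - m


def is_valid_upca(code: str) -> bool:
    """Return True if a 12-digit code satisfies the check digit equation."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    return upca_check_digit(code[:11]) == int(code[11])


def undetected_transpositions() -> tuple:
    """Count ordered pairs of distinct adjacent digits whose swap goes unnoticed."""
    pairs = [(a, b) for a in range(10) for b in range(10) if a != b]
    # Swapping a (weight 3) and b (weight 1) changes the weighted sum by 2(a - b).
    missed = [(a, b) for a, b in pairs if (2 * (a - b)) % 10 == 0]
    return len(missed), len(pairs)


if __name__ == "__main__":
    print(upca_check_digit("03600029145"))
    print(is_valid_upca("036000291452"))
    print(undetected_transpositions())
```

The brute-force count in `undetected_transpositions` is an independent way to confirm the 10-out-of-90 proportion derived above, since it enumerates digit pairs directly instead of relying on the algebra.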


Variations

UPC in its most common usage technically refers to UPC-A. Other variants of the UPC exist:
* UPC-B is a 12-digit version of UPC with no check digit, developed for the National Drug Code (NDC) and National Health Related Items Code. It has 11 digits plus a 1-digit product code, and is not in common use.
* UPC-C is a 12-digit code with a product code and a check digit; not in common use.
* UPC-D is a variable-length code (12 digits or more) with the 12th digit being the check digit. These versions are not in common use.
* UPC-E is a 6-digit code that has its equivalent in a UPC-A 12-digit code with number system 0 or 1.
* UPC-2 is a 2-digit supplement to the UPC used to indicate the edition of a magazine or periodical.
* UPC-5 is a 5-digit supplement to the UPC used to indicate the suggested retail price for books.


UPC-E

To allow the use of UPC barcodes on smaller packages, where a full 12-digit barcode may not fit, a zero-suppressed version of UPC was developed, called UPC-E, in which the number system digit, all trailing zeros in the manufacturer code, and all leading zeros in the product code are suppressed (omitted). This symbology differs from UPC-A in that it uses only a 6-digit code, does not use the M (middle) guard pattern, and forms the E (end) guard pattern as space-bar-space-bar-space-bar, i.e. a UPC-E barcode follows the pattern SDDDDDDE.

The way in which a 6-digit UPC-E relates to a 12-digit UPC-A is determined by the UPC-E numerical pattern and the UPC-E parity pattern. A UPC-E can only correspond to UPC-A number system 0 or 1, the value of which, along with the UPC-A check digit, determines the UPC-E parity pattern of the encoding. The numerical pattern, with the manufacturer code digits represented by M's and the product code digits by P's, determines which zeros are restored.

For example, a UPC-E 654321 may correspond to the UPC-A 065100004327 or 165100004324, depending on the UPC-E parity pattern of the encoded digits. UPC-E 654321 with the "EOEOEO" parity pattern (UPC-A 065100004327) would be encoded as
:1-1-1  4-1-1-1  1-2-3-1  2-3-1-1  1-4-1-1  2-2-1-2  2-2-2-1  1-1-1-1-1-1.
[Figure: UPC-E barcode for 654321 with the "EOEOEO" parity pattern]
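The zero-suppression can be reversed mechanically. The sketch below expands a 6-digit UPC-E into its 12-digit UPC-A equivalent under the commonly published expansion rules, which this section does not spell out and which are therefore stated here as an assumption: a last digit of 0, 1, or 2 gives MMPPPx → MMx00 00PPP, 3 gives MMMPP3 → MMM00 000PP, 4 gives MMMMP4 → MMMM0 0000P, and 5 through 9 give MMMMMx → MMMMM 0000x. The function names are hypothetical.

```python
def upca_check_digit(first11: str) -> int:
    """UPC-A check digit: weight 3 on odd positions, weight 1 on even positions."""
    total = 3 * sum(int(c) for c in first11[0::2]) + sum(int(c) for c in first11[1::2])
    return (10 - total % 10) % 10


def upce_to_upca(upce: str, number_system: int = 0) -> str:
    """Expand a 6-digit UPC-E to the equivalent 12-digit UPC-A (hypothetical helper)."""
    if len(upce) != 6 or not (upce.isascii() and upce.isdigit()):
        raise ValueError("UPC-E requires exactly 6 decimal digits")
    if number_system not in (0, 1):
        raise ValueError("UPC-E only corresponds to number system 0 or 1")
    last = upce[5]
    if last in "012":
        manufacturer = upce[0:2] + last + "00"   # MMx00
        product = "00" + upce[2:5]               # 00PPP
    elif last == "3":
        manufacturer = upce[0:3] + "00"          # MMM00
        product = "000" + upce[3:5]              # 000PP
    elif last == "4":
        manufacturer = upce[0:4] + "0"           # MMMM0
        product = "0000" + upce[4]               # 0000P
    else:
        manufacturer = upce[0:5]                 # MMMMM
        product = "0000" + last                  # 0000x
    first11 = str(number_system) + manufacturer + product
    return first11 + str(upca_check_digit(first11))


if __name__ == "__main__":
    print(upce_to_upca("654321", 0))
    print(upce_to_upca("654321", 1))
```

With these rules, the two calls reproduce the UPC-A numbers 065100004327 and 165100004324 quoted in the example above. Selecting the parity pattern from the number system and check digit would need an additional lookup table that is not covered here.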


EAN-13

The EAN-13 was developed as a superset of UPC-A, adding an extra digit to the beginning of every UPC-A number. This expanded the number of unique values theoretically possible by ten times to 1 trillion. EAN-13 barcodes also indicate the country in which the company that sells the product is based (which may or may not be the same as the country in which the good is manufactured). The three leading digits of the code determine this, according to the GS1 country codes. Every UPC-A code can be easily converted to the equivalent EAN-13 code by prepending a 0 digit to the UPC-A code. This does not change the check digit. All point-of-sale systems can now understand both equally.

EAN-8 is an 8-digit variation of the EAN barcode.

UPC usage notes:
* All products marked with an EAN will be accepted in North America currently, in addition to those already marked with a UPC.
* Products with an existing UPC do not have to be re-marked with an EAN.
* In North America, the EAN adds 30% more codes, mainly by adding digits 10 through 12 to the UPC digits 00 through 09. This is a powerful incentive to phase out the UPC.
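The claim that prepending a 0 leaves the check digit unchanged can be illustrated with a small sketch. It assumes the usual EAN-13 weighting, which is not stated in this section: counting from the left, the first twelve digits are weighted 1, 3, 1, 3, and so on. The function names are hypothetical.

```python
def ean13_check_digit(first12: str) -> int:
    """EAN-13 check digit, assuming weights 1, 3, 1, 3, ... from the left."""
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(first12))
    return (10 - total % 10) % 10


def upca_to_ean13(upca: str) -> str:
    """Convert a 12-digit UPC-A to EAN-13 by prepending a 0 (hypothetical helper)."""
    if len(upca) != 12 or not (upca.isascii() and upca.isdigit()):
        raise ValueError("UPC-A requires exactly 12 decimal digits")
    return "0" + upca


if __name__ == "__main__":
    ean = upca_to_ean13("036000291452")
    print(ean)
    # The check digit recomputed under EAN-13 rules equals the original UPC-A one.
    print(ean13_check_digit(ean[:12]) == int(ean[12]))
```

The leading 0 takes a weight-1 position and shifts every UPC-A digit onto the weight it already had, so the weighted sum, and with it the check digit, stays the same.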


See also

* GS1 – International supply-chain standards organization
* Proof of purchase



External links

* UPC-Search.org – Free online database of UPCs
* GS1 company prefix – GS1 three-digit prefix list indicating country of origin of company in UPC code (not necessarily country of manufacture)