Terminology
QR Code
A QR code (quick-response code) is a type of two-dimensional matrix barcode, invented in 1994 by the Japanese company Denso Wave for labelling automobile parts. The QR labelling system was applied beyond the automobile industry due to its fast readability and greater storage capacity compared to standard UPC barcodes. QR Codes, more specifically the popular Model 2, are internationally standardized in ISO/IEC 18004.
Matrix
A QR symbol is arranged in a matrix consisting of an array of nominally square modules arranged in an overall square pattern.
For ease of reference, module positions are defined by their row and column coordinates in the symbol, in the form `(x, y)` where `x` designates the column (counting from left to right) and `y` the row (counting from the top downwards) in which the module is located, with counting commencing at 0. Module `(0, 0)` is therefore located in the upper left corner of the symbol.
Module
A module represents a single square “pixel” in the matrix (not to be confused with pixels in a raster image or on a screen). A dark module represents a binary one and a light module represents a binary zero.
Version
The version of a QR symbol determines the side length of the matrix (and therefore the maximum capacity of code words), ranging from 21×21 modules (441 total) at version 1 to 177×177 modules (31329 total) at version 40.
The side length increases in steps of 4 modules per version and can be calculated by `4 * version + 17`.
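The formula can be checked with a few lines of Python. This is only an illustrative sketch; the function name `module_count` is made up for this example and does not refer to any library API.

```python
def module_count(version: int) -> int:
    """Side length of the matrix, in modules, for a version between 1 and 40."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return 4 * version + 17


for version in (1, 2, 40):
    side = module_count(version)
    # Print the version, the side length and the total number of modules.
    print(version, side, side * side)
```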
The maximum capacity for each version, mode and ECC level can be found in this table (qrcode.com).
Function Patterns
Finder Pattern
The Finder Pattern shall consist of three identical Position Detection Patterns located at the upper left, upper right and lower left corners of the symbol.
Each Position Detection Pattern may be viewed as three superimposed concentric squares and is constructed of dark 7×7 modules, light 5×5 modules and dark 3×3 modules.
The symbol is preferentially encoded so that similar patterns have a low probability of being encountered elsewhere in the symbol, enabling rapid identification of a possible QR Code symbol in the field of view. Identification of the three Position Detection Patterns comprising the finder pattern then unambiguously defines the location and orientation of the symbol in the field of view.
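The three concentric squares can be sketched as a small matrix generator. The representation (a list of rows, with 1 for dark and 0 for light) and the name `finder_pattern` are assumptions made for this example only.

```python
def finder_pattern() -> list[list[int]]:
    """Build one 7x7 Position Detection Pattern (1 = dark, 0 = light)."""
    pattern = []
    for y in range(7):
        row = []
        for x in range(7):
            # "Ring" distance of the module from the centre module (3, 3).
            ring = max(abs(x - 3), abs(y - 3))
            # Ring 3 is the dark 7x7 border, ring 2 the light 5x5 ring,
            # rings 0 and 1 together form the dark 3x3 core.
            row.append(0 if ring == 2 else 1)
        pattern.append(row)
    return pattern


for row in finder_pattern():
    print("".join("#" if module else "." for module in row))
```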
Alignment Pattern
The Alignment Pattern is a fixed reference pattern in defined positions, which enables the decode software to resynchronise the coordinate mapping of the modules in the event of moderate amounts of distortion of the image.
Each Alignment Pattern may be viewed as three superimposed concentric squares and is constructed of dark 5×5 modules, light 3×3 modules and a single central dark module.
The number of Alignment Patterns depends on the symbol version, and they shall be placed in all Model 2 symbols of version 2 or larger in positions defined in the specification.
Timing Pattern
The horizontal and vertical Timing Patterns respectively consist of a one module wide row or column of alternating dark and light modules, commencing and ending with a dark module. The horizontal Timing Pattern runs across row 6 of the symbol between the separators for the upper Position Detection Patterns; the vertical Timing Pattern similarly runs down column 6 of the symbol between the separators for the left-hand Position Detection Patterns. They enable the symbol density and version to be determined and provide datum positions for determining module coordinates.
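As a rough sketch, the modules of both Timing Patterns can be enumerated as follows. It assumes the `(x, y)` convention from the Matrix section and that the patterns span indices 8 to `size - 9`, which is the gap between the separators; `timing_modules` is a hypothetical helper name.

```python
def timing_modules(version: int) -> list[tuple[int, int, int]]:
    """Return (x, y, value) for every Timing Pattern module (1 = dark)."""
    size = 4 * version + 17
    modules = []
    # Finder patterns cover indices 0-6, separators index 7 (and the mirrored
    # positions on the far side), so the timing run is 8 .. size - 9.
    for i in range(8, size - 8):
        value = 1 if i % 2 == 0 else 0  # even indices are dark
        modules.append((i, 6, value))   # horizontal pattern, row 6
        modules.append((6, i, value))   # vertical pattern, column 6
    return modules
```

Because `size - 9` is always even, the run starts and ends with a dark module for every version.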
Separators
A pattern of all light modules, one module wide, separating the Position Detection Patterns from the rest of the symbol.
Quiet Zone
This is a region 4 modules wide which shall be free of all other markings, surrounding the symbol on all four sides. Its nominal reflectance value shall be equal to that of the light modules.
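A minimal sketch of adding the quiet zone to a finished symbol, assuming the matrix is held as a list of rows with 1 for dark and 0 for light. The function name `add_quiet_zone` is hypothetical.

```python
def add_quiet_zone(matrix: list[list[int]], width: int = 4) -> list[list[int]]:
    """Surround a square matrix with `width` light modules on all four sides."""
    size = len(matrix) + 2 * width
    padded = [[0] * size for _ in range(size)]  # start with all light modules
    for y, row in enumerate(matrix):
        for x, module in enumerate(row):
            padded[y + width][x + width] = module
    return padded
```

A version 1 symbol (21×21 modules) therefore occupies 29×29 modules once the quiet zone is included.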
Encoding Region
This region shall contain the symbol characters representing data, those representing error correction codewords, the Version Information and Format Information.
Data
This region contains the encoded data and error correction code blocks. Data bits are placed starting at the bottom-right of the matrix and proceeding upward in a column that is 2 modules wide. When the column reaches the top, the next 2-module column starts immediately to the left of the previous column and continues downward. Whenever the current column reaches the edge of the matrix, placement moves on to the next 2-module column and changes direction. If a function pattern or reserved area is encountered, the data bit is placed in the next unused module. (See Wikipedia: QR code - Encoding and thonky.com - QR Code Tutorial.)
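The zigzag order described above can be sketched as a generator of module coordinates. The predicate `is_reserved` is a hypothetical callback that reports function patterns and reserved areas; the example also skips column 6 entirely, because it holds the vertical Timing Pattern. That detail comes from ISO/IEC 18004 rather than from the paragraph above.

```python
from typing import Callable, Iterator


def placement_order(
    size: int, is_reserved: Callable[[int, int], bool]
) -> Iterator[tuple[int, int]]:
    """Yield (x, y) positions in the order in which data bits are placed."""
    x = size - 1          # right-hand module of the current 2-module column
    upward = True
    while x > 0:
        if x == 6:        # column 6 is the vertical Timing Pattern
            x -= 1
        rows = range(size - 1, -1, -1) if upward else range(size)
        for y in rows:
            for cx in (x, x - 1):  # right module first, then the left one
                if not is_reserved(cx, y):
                    yield (cx, y)
        upward = not upward
        x -= 2


# Toy usage: a 21x21 matrix in which nothing is reserved.
order = list(placement_order(21, lambda x, y: False))
print(order[:4])
```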
Version Information
The Version Information is an 18 bit sequence that encodes the version number in 6 data bits, followed by 12 error correction bits calculated using the (18, 6) BCH code.
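The BCH calculation can be sketched as a polynomial division over GF(2). The generator polynomial `0x1F25` and the restriction to versions 7 to 40 are taken from ISO/IEC 18004 and are not stated in the text above; the function name `version_information` is hypothetical.

```python
def version_information(version: int) -> int:
    """Return the 18 bit Version Information: 6 data bits + 12 BCH bits."""
    if not 7 <= version <= 40:
        raise ValueError("Version Information exists for versions 7 to 40 only")
    generator = 0x1F25  # x^12 + x^11 + x^10 + x^9 + x^8 + x^5 + x^2 + 1
    remainder = version << 12
    # Long division: cancel every set bit above bit 11, from the top down.
    for shift in range(5, -1, -1):
        if remainder & (1 << (shift + 12)):
            remainder ^= generator << shift
    return (version << 12) | remainder


print(format(version_information(7), "018b"))
```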
Format Information
The Format Information is a 15 bit sequence containing 5 data bits, with 10 error correction bits calculated using the (15, 5) BCH code. It contains information on the error correction level applied to the symbol and on the masking pattern used, essential to enable the remainder of the encoding region to be decoded.
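A sketch of the same kind of calculation for the Format Information. The generator polynomial `0x537`, the two-bit ECC indicators and the final XOR with `101010000010010` are taken from ISO/IEC 18004 and are assumptions with respect to the text above; `format_information` is a hypothetical name.

```python
ECC_INDICATOR = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}


def format_information(ecc_level: str, mask_pattern: int) -> int:
    """Return the 15 bit Format Information for an ECC level and mask 0-7."""
    if not 0 <= mask_pattern <= 7:
        raise ValueError("mask pattern must be between 0 and 7")
    data = (ECC_INDICATOR[ecc_level] << 3) | mask_pattern  # 5 data bits
    generator = 0x537  # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
    remainder = data << 10
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= generator << shift
    # The standard XORs the result with a fixed mask so that it is never all zero.
    return ((data << 10) | remainder) ^ 0x5412


print(format(format_information("M", 5), "015b"))
```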
Dark Module
The module in column 8 and row `4 * version + 9`, that is position `(8, 4 * version + 9)` in the (x, y) notation defined above, shall always be dark and does not form part of the Format Information.
Mode
The mode is the method of representing a defined character set as a bit string. Each mode is identified by a mode indicator, a four-bit identifier that states in which mode the next data sequence is encoded.
| Mode | Indicator | Description |
|---|---|---|
| Numeric | `0001` | Numeric encoding, 10 bits per 3 digits |
| Alphanumeric | `0010` | Alphanumeric encoding, 11 bits per 2 characters |
| Byte | `0100` | Byte encoding, 8 bits per character |
| Kanji | `1000` | Kanji encoding (Japanese, Shift JIS), 13 bits per character |
| Hanzi* | `1101` | Hanzi encoding (simplified Chinese, GB2312/GB18030), 13 bits per character |
| Structured append | `0011` | used to split a message across multiple (up to 16) QR symbols |
| ECI | `0111` | Extended Channel Interpretation (select alternate character set or encoding) |
| FNC1 in first position | `0101` | see Code 128, also zxing/issues/1373 |
| FNC1 in second position | `1001` | |
| Terminator | `0000` | End of message |

* Hanzi mode is not part of the ISO specification, but of the Chinese standard GB/T 18284
Segment
Each segment consists of the 4 bit mode indicator followed by the data bit stream, where the content of the bit stream can vary depending on the mode:
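| Mode | Bit stream contents |
|---|---|
| Numeric | [ `0001` ][ Character Count Indicator ][ Data ] |
| Alphanumeric | [ `0010` ][ Character Count Indicator ][ Data ] |
| Byte | [ `0100` ][ Character Count Indicator ][ Data ] |
| Kanji | [ `1000` ][ Character Count Indicator ][ Data ] |
| Hanzi | [ `1101` ][ Subset Indicator (4 bits) ][ Character Count Indicator ][ Data ] |
| Structured append | [ `0011` ][ Symbol Sequence Indicator (8 bits) ][ Parity Data (8 bits) ] |
| ECI | [ `0111` ][ ECI Assignment number ] |
| FNC1 in first position | [ `0101` ] |
| FNC1 in second position | [ `1001` ][ Application Indicator (8 bits) ] |
| Terminator | [ `0000` ] |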
The length (in bits) of the Character Count Indicator for Numeric/Alphanumeric/Byte/Kanji/Hanzi varies, depending on the version:

| Mode | Version 1-9 | Version 10-26 | Version 27-40 |
|---|---|---|---|
| Numeric | 10 | 12 | 14 |
| Alphanumeric | 9 | 11 | 13 |
| Byte | 8 | 16 | 16 |
| Kanji/Hanzi | 8 | 10 | 12 |
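The lookup can be sketched as follows. The table values are the ones listed above; the mode indicators for Numeric, Alphanumeric and Kanji are taken from ISO/IEC 18004, and the names `cci_length` and `segment_header` are hypothetical.

```python
MODE_INDICATOR = {
    "numeric": "0001",
    "alphanumeric": "0010",
    "byte": "0100",
    "kanji": "1000",
}

# Character Count Indicator length for versions 1-9, 10-26 and 27-40.
CCI_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def cci_length(mode: str, version: int) -> int:
    """Length in bits of the Character Count Indicator."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    column = 0 if version <= 9 else 1 if version <= 26 else 2
    return CCI_BITS[mode][column]


def segment_header(mode: str, char_count: int, version: int) -> str:
    """Mode indicator followed by the Character Count Indicator, as a bit string."""
    width = cci_length(mode, version)
    if char_count >= 1 << width:
        raise ValueError("character count does not fit into the indicator")
    return MODE_INDICATOR[mode] + format(char_count, f"0{width}b")
```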
Extended Channel Interpretation (ECI)
Extended Channel Interpretation can be used to indicate an alternate character encoding for the following Byte segment (by default, ISO-8859-1 “Latin-1”).
An ECI segment starts with the 4 bit indicator `0111` followed by the ECI Assignment number (8, 16 or 24 bits), followed by a Byte segment (`0100` …) where the contents are encoded according to the preceding ECI ID.
The length of the ECI Assignment number depends on the given encoding ID:
| ID | length (bits) |
|---|---|
| 0 - 127 | 8 |
| 128 - 16383 | 16 |
| 16384 - 999999 | 24 |
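A sketch of how the ECI Assignment number could be written as a bit string. The length prefixes `0`, `10` and `110` that distinguish the three sizes come from ISO/IEC 18004 and are not spelled out above; `eci_header` is a hypothetical name.

```python
def eci_header(eci_id: int) -> str:
    """ECI mode indicator followed by the ECI Assignment number, as bits."""
    if 0 <= eci_id <= 127:
        assignment = "0" + format(eci_id, "07b")      # 8 bits in total
    elif 128 <= eci_id <= 16383:
        assignment = "10" + format(eci_id, "014b")    # 16 bits in total
    elif 16384 <= eci_id <= 999999:
        assignment = "110" + format(eci_id, "021b")   # 24 bits in total
    else:
        raise ValueError("ECI ID must be between 0 and 999999")
    return "0111" + assignment


# ECI 26 designates UTF-8; a Byte segment would follow this header.
print(eci_header(26))
```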
Mixed Mode
Encoding modes can be mixed as needed within a QR symbol in order to optimize data usage. Each segment of data is encoded in the appropriate mode, with the basic structure Mode Indicator / Character Count Indicator / Data and followed immediately by the Mode Indicator commencing the next segment.
[ Mode Indicator 1 ][ Mode bitstream 1 ]
…
[ Mode Indicator n ][ Mode bitstream n ]
…
[ `0000` End of message (Terminator) ]
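The structure above can be illustrated with two simplified segment encoders. This is a sketch under stated assumptions: the Character Count Indicator lengths default to the values for versions 1 to 9, the 7 and 4 bit groups for trailing digits follow ISO/IEC 18004, and all function names are hypothetical.

```python
def numeric_segment(digits: str, cci_bits: int = 10) -> str:
    """Numeric mode: indicator 0001, count, then 10 bits per 3 digits."""
    bits = "0001" + format(len(digits), f"0{cci_bits}b")
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        # A trailing group of 2 digits takes 7 bits, a single digit 4 bits.
        width = {3: 10, 2: 7, 1: 4}[len(group)]
        bits += format(int(group), f"0{width}b")
    return bits


def byte_segment(data: bytes, cci_bits: int = 8) -> str:
    """Byte mode: indicator 0100, count, then 8 bits per byte."""
    bits = "0100" + format(len(data), f"0{cci_bits}b")
    return bits + "".join(format(byte, "08b") for byte in data)


def mixed_bitstream(segments: list[str]) -> str:
    """Concatenate the segments and append the Terminator 0000."""
    return "".join(segments) + "0000"


stream = mixed_bitstream([byte_segment(b"A"), numeric_segment("12")])
print(stream)
```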
ECC (Error Correction Coding)
QR codes use Reed–Solomon error correction, which allows QR code readers to detect and correct errors. A detailed breakdown of the process can be found at thonky.com - QR Code Tutorial.
ECC Level
The number of data versus error correction bytes within each block depends on the version of the QR symbol and the error correction level. The higher the error correction level, the less storage capacity. The following table lists the approximate error correction capability at each of the four levels:
| Level | Short | Capability | Indicator |
|---|---|---|---|
| Low | L | 7% | `01` |
| Medium | M | 15% | `00` |
| Quartile | Q | 25% | `11` |
| High | H | 30% | `10` |
Maximum data capacity
The maximum data capacity of a QR Code at version 40 for each ECC level and mode is shown in the following table:
| ECC | max. bits | Numeric | Alphanumeric | Binary | Kanji/Hanzi * |
|---|---|---|---|---|---|
| L | 23648 | 7089 | 4296 | 2953 | 1817 |
| M | 18672 | 5596 | 3391 | 2331 | 1435 |
| Q | 13328 | 3993 | 2420 | 1663 | 1024 |
| H | 10208 | 3057 | 1852 | 1273 | 784 |
* Hanzi mode may store one character less than Kanji, as it uses an additional subset indicator of 4 bits length.
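The character counts in the table follow from the maximum bit count once the 4 bit mode indicator and the Character Count Indicator for versions 27 to 40 are subtracted. The sketch below reproduces that arithmetic for a single segment; the function name and the handling of leftover bits (7 bits for two trailing digits, 4 for one, 6 for a single alphanumeric character) are assumptions based on ISO/IEC 18004.

```python
MAX_BITS_V40 = {"L": 23648, "M": 18672, "Q": 13328, "H": 10208}


def max_characters(total_bits: int) -> dict[str, int]:
    """Characters that fit into one segment of a version 40 symbol."""
    numeric_bits = total_bits - 4 - 14
    numeric = 3 * (numeric_bits // 10)
    rest = numeric_bits % 10
    numeric += 2 if rest >= 7 else 1 if rest >= 4 else 0

    alnum_bits = total_bits - 4 - 13
    alphanumeric = 2 * (alnum_bits // 11) + (1 if alnum_bits % 11 >= 6 else 0)

    return {
        "numeric": numeric,
        "alphanumeric": alphanumeric,
        "byte": (total_bits - 4 - 16) // 8,
        "kanji": (total_bits - 4 - 12) // 13,
    }


for level, bits in MAX_BITS_V40.items():
    print(level, max_characters(bits))
```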
Data masking
Masking is the process of XORing the bit pattern in the encoding region with a masking pattern to provide a symbol with more evenly balanced numbers of dark and light modules and reduced occurrence of patterns which would interfere with fast processing of the image.
Evaluation
The mask pattern evaluation is done for each of the 8 mask patterns; the pattern with the lowest penalty score shall be used for the final output. During the evaluation, 4 rules are applied to get the penalty score:

1. find runs of 5 or more consecutive cells with the same color, for example 00000 or 11111 (horizontal and vertical)
2. find 2×2 blocks with the same color
3. find consecutive runs of 1:1:3:1:1:4 starting with black, or 4:1:1:3:1:1 starting with white
4. calculate the ratio of dark cells and give increasing penalty if the ratio is far from 50%
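Three of the four rules (runs, 2×2 blocks and dark/light balance) can be sketched in a few lines; the finder-like 1:1:3:1:1 rule is left out to keep the example short. The penalty weights (3 points plus 1 per extra module for runs, 3 points per block, 10 points per 5% deviation) are the ones commonly quoted from ISO/IEC 18004 and are not given in the text above, so treat them as assumptions. The matrix is assumed to be a list of rows with 1 for dark and 0 for light.

```python
def penalty_runs(matrix: list[list[int]]) -> int:
    """Runs of 5 or more same-coloured modules in rows and columns."""
    score = 0
    lines = [list(row) for row in matrix] + [list(col) for col in zip(*matrix)]
    for line in lines:
        run = 1
        for previous, current in zip(line, line[1:]):
            if current == previous:
                run += 1
                continue
            if run >= 5:
                score += 3 + (run - 5)
            run = 1
        if run >= 5:  # a run that reaches the end of the line
            score += 3 + (run - 5)
    return score


def penalty_blocks(matrix: list[list[int]]) -> int:
    """Every 2x2 block of one colour (overlapping blocks count separately)."""
    size = len(matrix)
    score = 0
    for y in range(size - 1):
        for x in range(size - 1):
            if matrix[y][x] == matrix[y][x + 1] == matrix[y + 1][x] == matrix[y + 1][x + 1]:
                score += 3
    return score


def penalty_balance(matrix: list[list[int]]) -> int:
    """10 points for every full 5% that the dark ratio deviates from 50%."""
    total = len(matrix) ** 2
    dark = sum(map(sum, matrix))
    return 10 * (abs(100 * dark - 50 * total) // (5 * total))
```

An encoder would sum these scores (plus the omitted rule) for each of the 8 masked candidates and keep the candidate with the lowest total.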
Mask pattern
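| Pattern | Mask* |
|---|---|
| `000` | `(x + y) % 2 = 0` |
| `001` | `y % 2 = 0` |
| `010` | `x % 3 = 0` |
| `011` | `(x + y) % 3 = 0` |
| `100` | `(floor(y / 2) + floor(x / 3)) % 2 = 0` |
| `101` | `(x * y) % 2 + (x * y) % 3 = 0` |
| `110` | `((x * y) % 2 + (x * y) % 3) % 2 = 0` |
| `111` | `((x + y) % 2 + (x * y) % 3) % 2 = 0` |

* where `x` = column (width) and `y` = row (height), with `x,y = 0,0` for the top left module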
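The eight mask conditions defined in ISO/IEC 18004 can be written as small functions of `x` and `y` and applied with XOR. In this sketch the matrix is assumed to be a list of rows with 1 for dark and 0 for light, and `is_function_module` is a hypothetical predicate that protects function patterns and format/version areas, which are never masked.

```python
from typing import Callable

MASKS: list[Callable[[int, int], bool]] = [
    lambda x, y: (x + y) % 2 == 0,                            # 000
    lambda x, y: y % 2 == 0,                                  # 001
    lambda x, y: x % 3 == 0,                                  # 010
    lambda x, y: (x + y) % 3 == 0,                            # 011
    lambda x, y: (y // 2 + x // 3) % 2 == 0,                  # 100
    lambda x, y: (x * y) % 2 + (x * y) % 3 == 0,              # 101
    lambda x, y: ((x * y) % 2 + (x * y) % 3) % 2 == 0,        # 110
    lambda x, y: ((x + y) % 2 + (x * y) % 3) % 2 == 0,        # 111
]


def apply_mask(
    matrix: list[list[int]],
    pattern: int,
    is_function_module: Callable[[int, int], bool],
) -> list[list[int]]:
    """Return a copy of the matrix with the given mask pattern XORed in."""
    mask = MASKS[pattern]
    return [
        [
            bit ^ int(mask(x, y) and not is_function_module(x, y))
            for x, bit in enumerate(row)
        ]
        for y, row in enumerate(matrix)
    ]
```

Since masking is an XOR, applying the same pattern a second time restores the original matrix, which is how a decoder removes the mask.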
Reflectance
Symbols are intended to be read when either dark on light or light on dark. The International Standard (ISO/IEC 18004) is based on dark images on a light background (example on the left); reflectance reversal therefore means a light image on a dark background (example on the right).