Terminology

QR Code

A QR code (quick-response code) is a type of two-dimensional matrix barcode, invented in 1994 by the Japanese company Denso Wave for labelling automobile parts. The QR labelling system spread beyond the automobile industry because of its fast readability and greater storage capacity compared to standard UPC barcodes. QR Codes, more specifically the popular Model 2, are internationally standardized in ISO/IEC 18004.

Matrix

A QR symbol is arranged in a matrix consisting of an array of nominally square modules arranged in an overall square pattern.

For ease of reference, module positions are defined by their row and column coordinates in the symbol, in the form (x, y) where x designates the column (counting from left to right) and y the row (counting from the top downwards) in which the module is located, with counting commencing at 0. Module (0, 0) is therefore located in the upper left corner of the symbol.

Module

A module represents a single square “pixel” in the matrix (not to be confused with pixels in a raster image or on a screen). A dark module represents a binary one and a light module represents a binary zero.

Version

The version of a QR symbol determines the side length of the matrix (and therefore the maximum capacity of code words), ranging from 21×21 modules (441 total) at version 1 to 177×177 modules (31329 total) at version 40. The module count increases in steps of 4 and can be calculated by 4 * version + 17.
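
As a quick illustration of the formula above, the following sketch computes the side length for a given version. The function name `matrix_size` is chosen here for illustration only.

```python
def matrix_size(version: int) -> int:
    """Return the side length (in modules) of a QR symbol of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("version must be in the range 1-40")
    # the side length grows by 4 modules per version, starting at 21 for version 1
    return 4 * version + 17


if __name__ == "__main__":
    for version in (1, 2, 40):
        size = matrix_size(version)
        print(f"version {version}: {size}x{size} modules, {size * size} total")
```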

The maximum capacity for each version, mode and ECC level can be found in this table (qrcode.com).

Function Patterns

Finder Pattern

The Finder Pattern shall consist of three identical Position Detection Patterns located at the upper left, upper right and lower left corners of the symbol.

Each Position Detection Pattern may be viewed as three superimposed concentric squares and is constructed of dark 7×7 modules, light 5×5 modules and dark 3×3 modules.
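
The concentric-square structure can be sketched by looking at how far each module is from the centre of the pattern. This is a minimal illustration, not taken from any particular library; the function name is hypothetical, and 1 stands for a dark module, 0 for a light one.

```python
def position_detection_pattern() -> list:
    """Build the 7x7 Position Detection Pattern as rows of 0 (light) / 1 (dark)."""
    pattern = []
    for y in range(7):
        row = []
        for x in range(7):
            # ring 3 = outer 7x7 border, ring 2 = 5x5 border, rings 0-1 = 3x3 core
            ring = max(abs(x - 3), abs(y - 3))
            row.append(0 if ring == 2 else 1)
        pattern.append(row)
    return pattern


if __name__ == "__main__":
    for row in position_detection_pattern():
        print("".join("#" if bit else "." for bit in row))
```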

The symbol is preferentially encoded so that similar patterns have a low probability of being encountered elsewhere in the symbol, enabling rapid identification of a possible QR Code symbol in the field of view. Identification of the three Position Detection Patterns comprising the finder pattern then unambiguously defines the location and orientation of the symbol in the field of view.

Alignment Pattern

The Alignment Pattern is a fixed reference pattern in defined positions, which enables the decode software to resynchronise the coordinate mapping of the modules in the event of moderate amounts of distortion of the image.

Each Alignment Pattern may be viewed as three superimposed concentric squares and is constructed of dark 5×5 modules, light 3×3 modules and a single central dark module.

The number of Alignment Patterns depends on the symbol version, and they shall be placed in all Model 2 symbols of version 2 or larger in positions defined in the specification.

Timing Pattern

The horizontal and vertical Timing Patterns respectively consist of a one module wide row or column of alternating dark and light modules, commencing and ending with a dark module. The horizontal Timing Pattern runs across row 6 of the symbol between the separators for the upper Position Detection Patterns; the vertical Timing Pattern similarly runs down column 6 of the symbol between the separators for the left-hand Position Detection Patterns. They enable the symbol density and version to be determined and provide datum positions for determining module coordinates.
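
A small sketch of how the two Timing Patterns could be drawn into an otherwise empty matrix. It assumes that the Position Detection Patterns occupy 7 modules and the separators 1 module at each end, so the timing modules run from index 8 to size - 9; the matrix layout `matrix[y][x]` and the function name are illustrative choices.

```python
def draw_timing_patterns(version: int) -> list:
    """Return a matrix (None = unset) with both Timing Patterns filled in."""
    size = 4 * version + 17
    matrix = [[None] * size for _ in range(size)]
    # indexes 0-6 hold a Position Detection Pattern, index 7 its separator
    for i in range(8, size - 8):
        bit = 1 if i % 2 == 0 else 0  # dark on even coordinates
        matrix[6][i] = bit  # horizontal Timing Pattern: row 6
        matrix[i][6] = bit  # vertical Timing Pattern: column 6
    return matrix


if __name__ == "__main__":
    m = draw_timing_patterns(1)
    print([m[6][x] for x in range(8, 13)])
```

Because the side length is always odd, the run between the separators starts and ends on an even index, which gives the dark module at both ends.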

Separators

A pattern of all light modules, one module wide, separating the Position Detection Patterns from the rest of the symbol.

Quiet Zone

This is a region 4 modules wide which shall be free of all other markings, surrounding the symbol on all four sides. Its nominal reflectance value shall be equal to that of the light modules.

Encoding Region

This region shall contain the symbol characters representing data, those representing error correction codewords, the Version Information and Format Information.

Data

This region contains the encoded data and error correction code blocks. Data bits are placed starting at the bottom-right of the matrix and proceeding upward in a column that is 2 modules wide. When the column reaches the top, the next 2-module column starts immediately to the left of the previous one and continues downward. Whenever a column reaches the top or bottom edge of the matrix, placement moves on to the next 2-module column to the left and reverses direction. If a function pattern or reserved area is encountered, the data bit is placed in the next unused module (see Wikipedia: QR code - Encoding and thonky.com - QR Code Tutorial).
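
The zigzag traversal described above can be sketched as a generator that yields module coordinates in placement order. This is a simplified illustration: `is_reserved` is a hypothetical callback that reports function patterns and reserved areas, and the sketch assumes (as in common implementations) that the column pair skips over column 6, which holds the vertical Timing Pattern.

```python
def placement_order(size: int, is_reserved):
    """Yield (x, y) coordinates in the order in which data bits are placed."""
    right = size - 1  # right-hand column of the current 2-module column pair
    upward = True
    while right >= 1:
        if right == 6:
            right = 5  # assumption: step over the vertical Timing Pattern column
        rows = range(size - 1, -1, -1) if upward else range(size)
        for y in rows:
            for x in (right, right - 1):  # right module first, then left
                if not is_reserved(x, y):
                    yield (x, y)
        upward = not upward  # change direction at the edge of the matrix
        right -= 2


if __name__ == "__main__":
    order = list(placement_order(21, lambda x, y: False))
    print(order[:6])
```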

Version Information

The Version Information is an 18 bit sequence that encodes the version number: 6 data bits followed by 12 error correction bits calculated using the (18, 6) BCH code. It is only present in symbols of version 7 and larger.
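
The following sketch shows how the 18 bit sequence could be computed with a plain polynomial division. The generator polynomial 0x1F25 is the one given in ISO/IEC 18004 and is not stated in this document, so treat it as an assumption of the example; the function name is illustrative.

```python
VERSION_GENERATOR = 0x1F25  # x^12 + x^11 + x^10 + x^9 + x^8 + x^5 + x^2 + 1


def version_information(version: int) -> int:
    """Return the 18 bit Version Information: 6 data bits + 12 BCH bits."""
    if not 7 <= version <= 40:
        raise ValueError("Version Information exists for versions 7-40 only")
    remainder = version << 12  # make room for the 12 error correction bits
    for shift in range(5, -1, -1):
        # cancel the highest remaining bit with a shifted copy of the generator
        if remainder & (1 << (shift + 12)):
            remainder ^= VERSION_GENERATOR << shift
    return (version << 12) | remainder


if __name__ == "__main__":
    print(format(version_information(7), "018b"))
```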

Format Information

The Format Information is a 15 bit sequence containing 5 data bits, with 10 error correction bits calculated using the (15, 5) BCH code. It contains information on the error correction level applied to the symbol and on the masking pattern used, essential to enable the remainder of the encoding region to be decoded.
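
A minimal sketch of the Format Information calculation, assuming the 5 data bits are the 2 bit ECC level indicator followed by the 3 bit mask pattern. The generator polynomial 0x537 and the final XOR mask 0x5412 come from ISO/IEC 18004 and are not stated in this document; names are illustrative.

```python
ECC_INDICATOR = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
FORMAT_GENERATOR = 0x537  # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
FORMAT_XOR_MASK = 0x5412  # 101010000010010, avoids an all-zero sequence


def format_information(ecc_level: str, mask_pattern: int) -> int:
    """Return the 15 bit Format Information for an ECC level and mask pattern."""
    if not 0 <= mask_pattern <= 7:
        raise ValueError("mask pattern must be in the range 0-7")
    data = (ECC_INDICATOR[ecc_level] << 3) | mask_pattern
    remainder = data << 10  # make room for the 10 error correction bits
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= FORMAT_GENERATOR << shift
    return ((data << 10) | remainder) ^ FORMAT_XOR_MASK


if __name__ == "__main__":
    print(format(format_information("L", 0), "015b"))
```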

Dark Module

The module in column 8, row 4 * version + 9, i.e. position (8, 4 * version + 9) in the (x, y) notation defined above, shall always be dark and does not form part of the Format Information.

Mode

The mode is the method of representing a defined character set as a bit string. Each mode is identified by a mode indicator, a four-bit identifier that indicates the mode in which the following data sequence is encoded.

| Mode | Indicator | Description |
|------|-----------|-------------|
| Numeric | 0001 | Numeric encoding, 10 bits per 3 digits |
| Alphanumeric | 0010 | Alphanumeric encoding, 11 bits per 2 characters |
| Byte | 0100 | Byte encoding, 8 bits per character |
| Kanji | 1000 | Kanji encoding (Japanese, Shift-JIS), 13 bits per character |
| Hanzi* | 1101 | Hanzi encoding (simplified Chinese, GB2312/GB18030), 13 bits per character |
| Structured append | 0011 | Used to split a message across multiple (up to 16) QR symbols |
| ECI | 0111 | Extended Channel Interpretation (select alternate character set or encoding) |
| FNC1 in first position | 0101 | see Code 128, also zxing/issues/1373 |
| FNC1 in second position | 1001 | |
| Terminator | 0000 | End of message |

* Hanzi mode is not part of the ISO specification, but of the Chinese standard GB/T 18284

Segment

Each segment consists of the 4 bit mode indicator followed by the data bit stream, where the content of the bit stream can vary depending on the mode:

| Mode | Bit stream contents |
|------|---------------------|
| Numeric | [ 0001 : 4 ] [ Character Count Indicator : variable ] [ Data Bit Stream : 3⅓ × charcount ] |
| Alphanumeric | [ 0010 : 4 ] [ Character Count Indicator : variable ] [ Data Bit Stream : 5½ × charcount ] |
| Byte | [ 0100 : 4 ] [ Character Count Indicator : variable ] [ Data Bit Stream : 8 × charcount ] |
| Kanji | [ 1000 : 4 ] [ Character Count Indicator : variable ] [ Data Bit Stream : 13 × charcount ] |
| Hanzi | [ 1101 : 4 ] [ Subset Indicator : 4 ] [ Character Count Indicator : variable ] [ Data Bit Stream : 13 × charcount ] |
| Structured append | [ 0011 : 4 ] [ Symbol Position : 4 ] [ Total Symbols : 4 ] [ Parity : 8 ] |
| ECI | [ 0111 : 4 ] [ ECI Assignment number : variable ] |
| FNC1 in first position | [ 0101 : 4 ] [ Numeric/Alphanumeric/Byte/Kanji/Hanzi payload : variable ] |
| FNC1 in second position | [ 1001 : 4 ] [ Application Indicator : 8 ] [ Numeric/Alphanumeric/Byte/Kanji/Hanzi payload : variable ] |
| Terminator | [ 0000 : 4 ] |
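
The fractional factors for the Numeric and Alphanumeric data bit streams are averages: digits are packed in groups of three and alphanumeric characters in pairs. A sketch of the exact length calculation, assuming the usual handling of leftovers (4 or 7 bits for one or two remaining digits, 6 bits for a single remaining alphanumeric character); the function name is illustrative.

```python
def data_bit_length(mode: str, char_count: int) -> int:
    """Return the exact length of the data bit stream for a segment."""
    if mode == "numeric":
        # 10 bits per full group of 3 digits, 4 or 7 bits for 1 or 2 leftovers
        return 10 * (char_count // 3) + (0, 4, 7)[char_count % 3]
    if mode == "alphanumeric":
        # 11 bits per pair of characters, 6 bits for a single leftover
        return 11 * (char_count // 2) + 6 * (char_count % 2)
    if mode == "byte":
        return 8 * char_count
    if mode in ("kanji", "hanzi"):
        return 13 * char_count
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(data_bit_length("numeric", 8))
```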

The length of the Character Count Indicator for the Numeric, Alphanumeric, Byte, Kanji and Hanzi modes depends on the version:

| Mode | Version 1-9 | Version 10-26 | Version 27-40 |
|------|-------------|---------------|---------------|
| Numeric | 10 | 12 | 14 |
| Alphanumeric | 9 | 11 | 13 |
| Byte | 8 | 16 | 16 |
| Kanji/Hanzi | 8 | 10 | 12 |
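
To show how the mode indicator, the Character Count Indicator and the data bit stream fit together, here is a sketch that encodes a Numeric segment as a string of "0"/"1" characters. Function and constant names are illustrative; the handling of the last one or two digits (4 or 7 bits) follows the usual convention and is an assumption here.

```python
COUNT_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def character_count_bits(mode: str, version: int) -> int:
    """Look up the length of the Character Count Indicator."""
    if not 1 <= version <= 40:
        raise ValueError("version must be in the range 1-40")
    column = 0 if version <= 9 else 1 if version <= 26 else 2
    return COUNT_BITS[mode][column]


def encode_numeric_segment(digits: str, version: int) -> str:
    """Encode a digit string as mode indicator + count + data bits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode accepts the digits 0-9 only")
    count_width = character_count_bits("numeric", version)
    bits = "0001" + format(len(digits), f"0{count_width}b")
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        width = (4, 7, 10)[len(group) - 1]  # 1, 2 or 3 digits in the group
        bits += format(int(group), f"0{width}b")
    return bits


if __name__ == "__main__":
    print(encode_numeric_segment("01234567", 1))
```

For the input "01234567" at version 1, the digits are split into the groups 012, 345 and 67, which are encoded with 10, 10 and 7 bits respectively.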

Extended Channel Interpretation (ECI)

Extended Channel Interpretation can be used to indicate an alternate character encoding for the following Byte segment (by default, ISO-8859-1 “Latin-1”).

An ECI segment starts with the 4 bit indicator 0111 followed by the ECI Assignment number (8, 16 or 24 bits), followed by a Byte segment (0100 …) where the contents are encoded according to the preceding ECI ID.

The length of the ECI Assignment number depends on the given encoding ID:

| ID | Length (bits) |
|----|---------------|
| 0 - 127 | 8 |
| 128 - 16383 | 16 |
| 16384 - 999999 | 24 |
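
A sketch of how the ECI Assignment number could be serialized. The table above only gives the lengths; the leading prefix bits used here (0, 10 and 110 for the 8, 16 and 24 bit forms) are taken from ISO/IEC 18004 and should be read as an assumption of this example. The function name is illustrative.

```python
def eci_assignment_bits(eci_id: int) -> str:
    """Return the ECI Assignment number as a string of '0'/'1' characters."""
    if 0 <= eci_id <= 127:
        return format(eci_id, "08b")  # 0bbbbbbb
    if 128 <= eci_id <= 16383:
        return "10" + format(eci_id, "014b")  # 10bbbbbb bbbbbbbb
    if 16384 <= eci_id <= 999999:
        return "110" + format(eci_id, "021b")  # 110bbbbb bbbbbbbb bbbbbbbb
    raise ValueError("ECI ID must be in the range 0-999999")


if __name__ == "__main__":
    # ECI mode indicator followed by the assignment number for ID 26
    print("0111" + eci_assignment_bits(26))
```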

Mixed Mode

Encoding modes can be mixed as needed within a QR symbol in order to optimize data usage. Each segment of data is encoded in the appropriate mode, with the basic structure Mode Indicator / Character Count Indicator / Data and followed immediately by the Mode Indicator commencing the next segment.

[ Mode Indicator 1 ][ Mode bitstream 1 ]

[ Mode Indicator n ][ Mode bitstream n ]

[ 0000 End of message (Terminator) ]

ECC (Error Correction Coding)

QR codes use Reed–Solomon error correction, which allows QR code readers to detect and correct errors. A detailed breakdown of the process can be found at thonky.com - QR Code Tutorial.

ECC Level

The number of data versus error correction bytes within each block depends on the version of the QR symbol and the error correction level. The higher the error correction level, the lower the storage capacity. The following table lists the approximate error correction capability at each of the four levels:

| Level | Short | Recovery capacity (approx.) | Indicator |
|-------|-------|-----------------------------|-----------|
| Low | L | 7% | 01 |
| Medium | M | 15% | 00 |
| Quartile | Q | 25% | 11 |
| High | H | 30% | 10 |

Maximum data capacity

The maximum data capacity of a QR Code at version 40 for each ECC level and mode is shown in the following table:

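| ECC | Max. bits | Numeric | Alphanumeric | Binary | Kanji/Hanzi * |
|-----|-----------|---------|--------------|--------|---------------|
| L | 23648 | 7089 | 4296 | 2953 | 1817 |
| M | 18672 | 5596 | 3391 | 2331 | 1435 |
| Q | 13328 | 3993 | 2420 | 1663 | 1024 |
| H | 10208 | 3057 | 1852 | 1273 | 784 |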

* Hanzi mode stores one character less than Kanji as it uses an additional subset indicator of 4 bits length.
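
The character counts in the table can be derived from the "max. bits" column: subtract the 4 bit mode indicator and the Character Count Indicator for versions 27-40, then divide the rest by the bits per character. The sketch below does this for a single segment; it assumes the usual leftover rules (4 or 7 bits for one or two extra digits, 6 bits for one extra alphanumeric character), and all names are illustrative.

```python
MAX_BITS_V40 = {"L": 23648, "M": 18672, "Q": 13328, "H": 10208}
COUNT_BITS_V40 = {"numeric": 14, "alphanumeric": 13, "byte": 16, "kanji": 12}


def max_characters(mode: str, ecc_level: str) -> int:
    """Maximum characters of one segment in a version 40 symbol."""
    available = MAX_BITS_V40[ecc_level] - 4 - COUNT_BITS_V40[mode]
    if mode == "numeric":
        groups, rest = divmod(available, 10)
        return 3 * groups + (2 if rest >= 7 else 1 if rest >= 4 else 0)
    if mode == "alphanumeric":
        pairs, rest = divmod(available, 11)
        return 2 * pairs + (1 if rest >= 6 else 0)
    if mode == "byte":
        return available // 8
    return available // 13  # kanji


if __name__ == "__main__":
    for level in "LMQH":
        print(level, [max_characters(m, level) for m in COUNT_BITS_V40])
```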

Data masking

Masking is the process of XORing the bit pattern in the encoding region with a masking pattern to provide a symbol with more evenly balanced numbers of dark and light modules and reduced occurrence of patterns which would interfere with fast processing of the image.

Evaluation

The mask pattern evaluation is done for each of the 8 mask patterns; the pattern with the lowest penalty score shall be used for the final output. During the evaluation, 4 rules are applied to get the penalty score:

  • find runs of 5 or more consecutive cells with the same color in a row or column, for example 00000 or 11111

  • find 2×2 blocks with the same color

  • find consecutive runs with the ratio 1:1:3:1:1:4 starting with a dark cell, or 4:1:1:3:1:1 starting with a light cell

  • calculate the ratio of dark cells and give an increasing penalty the further the ratio is from 50%
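
A simplified sketch of the scoring, covering the first, second and fourth rule only (the finder-like pattern rule is left out for brevity). The penalty weights used here (3 points for a run of five plus 1 per additional cell, 3 points per 2×2 block, 10 points per 5% step away from 50%) are the ones commonly cited from ISO/IEC 18004 and are not stated in this document; the exact rounding at the 5% boundaries may differ between implementations. The matrix is assumed to be a list of rows of 0/1 values.

```python
def run_penalty(line: list) -> int:
    """Rule 1: penalize runs of 5 or more same-colored cells in one line."""
    penalty, run = 0, 1
    for previous, current in zip(line, line[1:]):
        if current == previous:
            run += 1
            if run == 5:
                penalty += 3  # assumed weight for the first 5 cells
            elif run > 5:
                penalty += 1  # plus 1 for every further cell
        else:
            run = 1
    return penalty


def partial_penalty_score(matrix: list) -> int:
    """Sum of the penalties for rules 1, 2 and 4 (rule 3 is not implemented)."""
    size = len(matrix)
    columns = [list(column) for column in zip(*matrix)]
    score = sum(run_penalty(line) for line in matrix + columns)
    # rule 2: every 2x2 block of identical cells
    for y in range(size - 1):
        for x in range(size - 1):
            if matrix[y][x] == matrix[y][x + 1] == matrix[y + 1][x] == matrix[y + 1][x + 1]:
                score += 3
    # rule 4: deviation of the dark ratio from 50%, in steps of 5%
    total = size * size
    dark = sum(map(sum, matrix))
    score += 10 * (abs(100 * dark - 50 * total) // (5 * total))
    return score


if __name__ == "__main__":
    checkerboard = [[(x + y) % 2 for x in range(6)] for y in range(6)]
    print(partial_penalty_score(checkerboard))
```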

Mask pattern

| Pattern | Mask* |
|---------|-------|
| 000 | (x + y) mod 2 = 0 |
| 001 | y mod 2 = 0 |
| 010 | x mod 3 = 0 |
| 011 | (x + y) mod 3 = 0 |
| 100 | ((y intdiv 2) + (x intdiv 3)) mod 2 = 0 |
| 101 | (x × y) mod 2 + (x × y) mod 3 = 0, or: (x × y) mod 6 = 0 |
| 110 | ((x × y) mod 2 + (x × y) mod 3) mod 2 = 0, or: (x × y) mod 6 < 3 |
| 111 | ((x × y) mod 3 + (x + y) mod 2) mod 2 = 0, or: (x + y + (x × y) mod 3) mod 2 = 0 |

* where x = column (width) and y = row (height), with x,y = 0,0 for the top left module
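
The mask conditions translate directly into small predicate functions. The sketch below lists all eight and applies one of them by XORing the selected modules; `is_data` is a hypothetical callback that tells whether a module belongs to the encoding region (function patterns must not be masked), and the matrix is assumed to be a list of rows of 0/1 values indexed as `matrix[y][x]`.

```python
MASKS = {
    0b000: lambda x, y: (x + y) % 2 == 0,
    0b001: lambda x, y: y % 2 == 0,
    0b010: lambda x, y: x % 3 == 0,
    0b011: lambda x, y: (x + y) % 3 == 0,
    0b100: lambda x, y: (y // 2 + x // 3) % 2 == 0,
    0b101: lambda x, y: (x * y) % 2 + (x * y) % 3 == 0,
    0b110: lambda x, y: ((x * y) % 2 + (x * y) % 3) % 2 == 0,
    0b111: lambda x, y: ((x * y) % 3 + (x + y) % 2) % 2 == 0,
}


def apply_mask(matrix: list, pattern: int, is_data) -> list:
    """Return a copy of the matrix with the mask XORed onto the data modules."""
    condition = MASKS[pattern]
    return [
        [bit ^ 1 if is_data(x, y) and condition(x, y) else bit
         for x, bit in enumerate(row)]
        for y, row in enumerate(matrix)
    ]


if __name__ == "__main__":
    blank = [[0] * 6 for _ in range(6)]
    for row in apply_mask(blank, 0b000, lambda x, y: True):
        print("".join("#" if bit else "." for bit in row))
```

Since masking is an XOR, applying the same pattern a second time restores the original matrix, which is how a reader removes the mask.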

Reflectance

Symbols are intended to be read when either dark on light or light on dark. The International Standard (ISO/IEC 18004) is based on dark images on a light background (example on the left); reflectance reversal therefore means a light image on a dark background (example on the right).

[Figures: normal reflectance (left), reversed reflectance (right)]