
The Pixel Visual Core (PVC) is a series of ARM-based system in package (SiP) image processors designed by Google. The PVC is a fully programmable image, vision and AI multi-core domain-specific architecture (DSA) for mobile devices and, in the future, for IoT. It first appeared in the Google Pixel 2 and 2 XL, which were introduced on October 19, 2017. It has also appeared in the Google Pixel 3 and 3 XL. Starting with the Pixel 4, this chip was replaced with the Pixel Neural Core.


History

Google previously used the Qualcomm Snapdragon's CPU, GPU, IPU, and DSP to handle image processing for its Google Nexus and Google Pixel devices. With the increasing importance of computational photography techniques, Google developed the Pixel Visual Core (PVC). Google claims the PVC uses less power than a CPU and GPU while still being fully programmable, unlike its tensor processing unit (TPU) application-specific integrated circuit (ASIC). Classical mobile devices are equipped with an image signal processor (ISP), which is a fixed-functionality image processing pipeline. In contrast, the PVC offers flexible, programmable functionality that is not limited to image processing. The PVC in the Google Pixel 2 and 2 XL is labeled SR3HX X726C502. The PVC in the Google Pixel 3 and 3 XL is labeled SR3HX X739F030. Thanks to the PVC, the Pixel 2 and Pixel 3 obtained mobile DxOMark scores of 98 and 101, respectively. The latter was the top-ranked single-lens mobile DxOMark score, tied with the iPhone XR.


Pixel Visual Core software

A typical image-processing program for the PVC is written in Halide. Currently, the PVC supports only a subset of the Halide programming language, without floating-point operations and with limited memory access patterns. Halide is a domain-specific language that lets the user decouple the algorithm from the scheduling of its execution. In this way, the developer can write a program that is optimized for the target hardware architecture.
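The separation of algorithm and schedule can be illustrated with a small sketch. The following is plain Python, not Halide, and every name in it (`blur_at`, `schedule_row_major`, `schedule_tiled`, `realize`) is hypothetical. It assumes an integer-only 3x3 box blur as the algorithm and two interchangeable traversal orders as schedules; changing the schedule changes the order of evaluation but not the result.

```python
# Toy illustration of algorithm/schedule separation (this is not Halide).

def blur_at(img, x, y):
    """Algorithm: integer 3x3 box blur at (x, y), clamping at the borders."""
    h, w = len(img), len(img[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            total += img[yy][xx]
    return total // 9  # integer division only, no floating point


def schedule_row_major(w, h):
    """Schedule 1: visit pixels row by row."""
    for y in range(h):
        for x in range(w):
            yield x, y


def schedule_tiled(w, h, tile=2):
    """Schedule 2: visit pixels tile by tile (the tile size is arbitrary)."""
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            for y in range(ty, min(ty + tile, h)):
                for x in range(tx, min(tx + tile, w)):
                    yield x, y


def realize(img, schedule):
    """Evaluate the same algorithm under the given traversal order."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for x, y in schedule(w, h):
        out[y][x] = blur_at(img, x, y)
    return out


image = [[(3 * x + 5 * y) % 17 for x in range(5)] for y in range(4)]
result_a = realize(image, schedule_row_major)
result_b = realize(image, schedule_tiled)
```

Both calls to `realize` produce the same output image. In Halide the schedule additionally controls tiling, vectorization and storage, which this sketch does not attempt to model.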


Pixel Visual Core ISA

The PVC has two types of instruction set architecture (ISA): a virtual one and a physical one. First, a high-level language program is compiled into a ''virtual ISA (vISA)'', inspired by the RISC-V ISA, which abstracts completely from the target hardware generation. Then, the vISA program is compiled into the so-called ''physical ISA (pISA)'', which is a VLIW ISA. This compilation step takes into account the target hardware parameters (e.g. the size of the PE array, the STP size, etc.) and explicitly specifies memory movements. Decoupling ''vISA'' and ''pISA'' lets the former be cross-architecture and generation-independent, while ''pISA'' can be compiled offline or through JIT compilation.
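The two-stage idea can be sketched as follows. This is a toy model only: it does not reproduce the real vISA or pISA encodings, which are not described here, and the names `HardwareParams`, `lower_to_physical`, `LOAD_TILE`, `EXEC` and `STORE_TILE` are all hypothetical. It assumes that the hardware-independent program is a list of operation names, and that lowering splits the image into tiles matching the PE array and makes every memory movement explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareParams:
    """Hypothetical target description: dimensions of the PE array."""
    pe_rows: int
    pe_cols: int


def lower_to_physical(virtual_ops, image_w, image_h, hw):
    """Expand hardware-agnostic ops into per-tile steps with explicit loads and stores."""
    physical = []
    for y0 in range(0, image_h, hw.pe_rows):
        for x0 in range(0, image_w, hw.pe_cols):
            physical.append(("LOAD_TILE", x0, y0))   # explicit memory movement in
            for op in virtual_ops:
                physical.append(("EXEC", op))        # same op list for every tile
            physical.append(("STORE_TILE", x0, y0))  # explicit memory movement out
    return physical


# One hardware-independent program, lowered for two hypothetical array sizes.
virtual_program = ["add", "shift_right"]
small = lower_to_physical(virtual_program, 32, 32, HardwareParams(16, 16))
large = lower_to_physical(virtual_program, 32, 32, HardwareParams(32, 32))
```

The same `virtual_program` yields 16 physical steps for the 16x16 array (four tiles) and 4 steps for the 32x32 array (one tile), which is the sense in which only the second stage depends on the hardware generation.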


Pixel Visual Core architecture

The Pixel Visual Core is designed to be a scalable, multi-core, energy-efficient architecture, with designs ranging from 2 to 16 cores in even numbers. The core of a PVC is the image processing unit (IPU), a programmable unit tailored for image processing. The Pixel Visual Core architecture was also designed to be either its own chip, like the SR3HX, or an IP block for a system on a chip (SoC).


Image Processing Unit (IPU)

The IPU core has a stencil processor (STP), a line buffer pool (LBP) and a network on chip (NoC). The STP mainly provides a 2-D SIMD array of processing elements (PEs) able to perform stencil computations, that is, computations over a small neighborhood of pixels. Though it seems similar to systolic array and wavefront computations, the STP has explicit, software-controlled data movement. Each PE features 2x 16-bit arithmetic logic units (ALUs), 1x 16-bit multiplier-accumulator unit (MAC), 10x 16-bit registers, and 10x 1-bit predicate registers.
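A stencil computation with explicit data movement can be sketched in a few lines. The following Python model is an assumption-laden illustration, not the PVC's actual instruction sequence: the helper names `shift` and `stencil_sum_3x3` are hypothetical, each list element stands for the value held by one PE, and results are masked to 16 bits to echo the 16-bit ALUs.

```python
MASK16 = 0xFFFF  # keep results within 16 bits, like the 16-bit ALUs


def shift(grid, dx, dy, fill=0):
    """Explicit data movement: every position receives the value held at
    offset (dx, dy); positions outside the array read `fill`."""
    h, w = len(grid), len(grid[0])
    return [
        [grid[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else fill
         for x in range(w)]
        for y in range(h)
    ]


def stencil_sum_3x3(grid):
    """Sum each pixel's 3x3 neighborhood with one shift-and-add per offset."""
    h, w = len(grid), len(grid[0])
    acc = [[0] * w for _ in range(h)]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            moved = shift(grid, dx, dy)
            # The same add is applied at every position, as in SIMD execution.
            acc = [[(a + m) & MASK16 for a, m in zip(row_a, row_m)]
                   for row_a, row_m in zip(acc, moved)]
    return acc


ones = [[1] * 4 for _ in range(4)]
sums = stencil_sum_3x3(ones)
```

For the all-ones input, interior positions sum nine neighbors, edge positions six and corner positions four, because out-of-range neighbors read the fill value.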


Line Buffer Pool (LBP)

Considering that one of the most energy-costly operations is DRAM access, each STP has temporary buffers, the LBP, to increase data locality. The LBP is a 2-D FIFO that accommodates different read and write sizes. The LBP uses a single-producer, multi-consumer behavioral model. Each LBP can have eight logical LB memories and one for DMA input-output operations. Because of the high complexity of the memory system, the PVC designers describe the LBP controller as one of the most challenging components. The NoC is a ring network on chip that communicates only with neighboring cores, which saves energy and preserves the pipelined computational pattern.
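The single-producer, multi-consumer behavior can be modeled with a short sketch. The class below is hypothetical and heavily simplified: it buffers whole lines rather than 2-D blocks, ignores differing read and write sizes, and assumes that a line is released only once every consumer has read it.

```python
from collections import deque


class LineBuffer:
    """Toy single-producer, multi-consumer FIFO of image lines."""

    def __init__(self, capacity, num_consumers):
        self.capacity = capacity
        self.lines = deque()                  # buffered lines, oldest first
        self.base = 0                         # index of the oldest buffered line
        self.read_pos = [0] * num_consumers   # next line index for each consumer

    def push(self, line):
        """Producer side: returns False when the buffer is full."""
        if len(self.lines) >= self.capacity:
            return False
        self.lines.append(line)
        return True

    def pop(self, consumer):
        """Consumer side: returns the next unread line, or None if caught up."""
        idx = self.read_pos[consumer] - self.base
        if idx >= len(self.lines):
            return None
        line = self.lines[idx]
        self.read_pos[consumer] += 1
        # Release lines that every consumer has already read.
        while self.lines and min(self.read_pos) > self.base:
            self.lines.popleft()
            self.base += 1
        return line


lb = LineBuffer(capacity=2, num_consumers=2)
accepted = [lb.push("row0"), lb.push("row1"), lb.push("row2")]
```

Here the third `push` is rejected because the slowest consumer has not yet read `row0`; the producer can continue only after both consumers have consumed the oldest line.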


Stencil Processor (STP)

The STP has a 2-D array of PEs: for example, a 16x16 array of full PEs and four lanes of simplified PEs called the ''halo''. The STP has a scalar processor, called the scalar lane (SCL), that adds control instructions with a small instruction memory. The last component of an STP is a load-store unit called the sheet generator (SHG), where the sheet is the PVC memory access unit.


SR3HX design summary

The SR3HX PVC features a 64-bit ARMv8-A ARM Cortex-A53 CPU, 8x image processing unit (IPU) cores, 512 MB LPDDR4, MIPI, and PCIe. The IPU cores each have 512 arithmetic logic units (ALUs), provided by 256 processing elements (PEs) arranged as a 16 x 16 two-dimensional array. Those cores execute a custom VLIW ISA. There are two 16-bit ALUs per processing element, and they can operate in three distinct ways: independent, joined, and fused. The SR3HX PVC is manufactured as a SiP by TSMC using their 28HPM HKMG process. It was designed over 4 years in partnership with Intel (codename: Monette Hill). Google claims that the SR3HX PVC is 7-16x more energy-efficient than the Snapdragon 835, that it can perform 3 trillion operations per second, and that HDR+ can run 5x faster and at less than one-tenth the energy than on the Snapdragon 835. [Google, "Pixel Visual Core: image processing and machine learning on Pixel 2", 2017-10-17, https://www.blog.google/products/pixel/pixel-visual-core-image-processing-and-machine-learning-pixel-2/, accessed 2019-02-02] It supports Halide for image processing and TensorFlow for machine learning. The current chip runs at 426 MHz and the single IPU is able to perform more than 1 TeraOPS.
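The ALU counts above follow directly from the array dimensions, as the short calculation below shows. The final figure is only a back-of-envelope upper bound under the assumption of one operation per ALU per clock cycle; it is not meant to reproduce the quoted throughput figures, whose counting method is not stated here.

```python
PE_ROWS, PE_COLS = 16, 16   # PE array per IPU core
ALUS_PER_PE = 2             # two 16-bit ALUs per processing element
IPU_CORES = 8               # IPU cores in the SR3HX
CLOCK_HZ = 426_000_000      # 426 MHz

pes_per_core = PE_ROWS * PE_COLS             # 256 PEs
alus_per_core = pes_per_core * ALUS_PER_PE   # 512 ALUs
alus_total = alus_per_core * IPU_CORES       # 4096 ALUs

# Assumption: every ALU completes one operation per clock cycle.
peak_alu_ops_per_second = alus_total * CLOCK_HZ
```

Under that assumption the eight cores together reach about 1.74 trillion ALU operations per second; counting MAC operations or fused ALU modes would give a different total.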

