In computer programming, a p-code machine (portable code machine) is a virtual machine designed to execute ''p-code'' (the assembly language or machine code of a hypothetical central processing unit (CPU)). This term is applied both generically to all such machines (such as the Java virtual machine (JVM) and MATLAB precompiled code), and to specific implementations, the most famous being the p-Machine of the Pascal-P system, particularly the UCSD Pascal implementation, among whose developers the ''p'' in ''p-code'' was construed to mean ''pseudo'' more often than ''portable'', thus ''pseudo-code'' meaning instructions for a pseudo-machine.

Although the concept was first implemented circa 1966 (as O-code for the Basic Combined Programming Language (BCPL) and P code for the language Euler), the ''term'' p-code first appeared in the early 1970s. Two early compilers generating p-code were the Pascal-P compiler in 1973, by Kesav V. Nori, Urs Ammann, Kathleen Jensen, Hans-Heinrich Nägeli, and Christian Jacobi, and the Pascal-S compiler in 1975, by Niklaus Wirth.

Programs that have been translated to p-code can either be interpreted by a software program that emulates the behavior of the hypothetical CPU, or translated into the machine code of the CPU on which the program is to run and then executed. If there is sufficient commercial interest, a hardware implementation of the CPU specification may be built (e.g., the Pascal MicroEngine or a version of a Java processor).


Benefits and weaknesses of implementing p-code

Compared to direct translation into native machine code, a two-stage approach involving translation into p-code and execution by interpreting or just-in-time compilation (JIT) offers several advantages.
* It is much easier to write a small p-code interpreter for a new machine than it is to modify a compiler to generate native code for the same machine.
* Generating machine code is one of the more complicated parts of writing a compiler. By comparison, generating p-code is much easier because no machine-dependent behavior must be considered in generating the bytecode. This makes it useful for getting a compiler up and running quickly.
* Since p-code is based on an ideal virtual machine, a p-code program is often much smaller than the same program translated to machine code.
* When the p-code is interpreted, the interpreter can apply additional run-time checks that are difficult to implement with native code.

One of the significant disadvantages of p-code is execution speed, which can sometimes be remedied via JIT compiling. P-code is often also easier to reverse-engineer than native code.

In the early 1980s, at least two operating systems achieved machine independence through extensive use of p-code. The Business Operating System (BOS) was a cross-platform operating system designed to run p-code programs exclusively. The UCSD p-System, developed at the University of California, San Diego, was a self-compiling and self-hosting operating system based on p-code optimized for generation by the Pascal language.

In the 1990s, translation into p-code became a popular strategy for language implementations, for example in Python, in Visual Basic (Microsoft P-Code), and in Java (Java bytecode).

The language Go uses a generic, portable assembly as a form of p-code, implemented by Ken Thompson as an extension of the work on Plan 9 from Bell Labs. Unlike Common Language Runtime (CLR) bytecode or JVM bytecode, there is no stable specification, and the Go build tools do not emit a bytecode format to be used at a later time. The Go assembler uses the generic assembly language as an intermediate representation, and Go executables are machine-specific statically linked binaries.
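The run-time checking advantage listed above can be made concrete with a very small interpreter. The following is a minimal sketch in Python, not any historical p-code format: the opcode names (`push`, `add`, `div`), the `PCodeError` class, and the `max_stack` limit are all assumptions made up for illustration. It shows how an interpreter can check stack bounds and division by zero on every instruction, which a native code generator would have to emit explicitly.

```python
class PCodeError(Exception):
    """Raised when a run-time check in this toy interpreter fails."""


def run(program, max_stack=16):
    """Interpret a list of (opcode, argument) pairs on a checked stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            # Check: never let the evaluation stack grow past its limit.
            if len(stack) >= max_stack:
                raise PCodeError("stack overflow")
            stack.append(arg)
        elif op in ("add", "div"):
            # Check: both operands must actually be present.
            if len(stack) < 2:
                raise PCodeError("stack underflow")
            right = stack.pop()
            left = stack.pop()
            if op == "add":
                stack.append(left + right)
            else:
                # Check: trap division by zero before it happens.
                if right == 0:
                    raise PCodeError("division by zero")
                stack.append(left // right)
        else:
            raise PCodeError(f"unknown opcode {op!r}")
    return stack
```

For instance, `run([("push", 6), ("push", 3), ("div", None)])` returns `[2]`, while replacing the divisor with `0` raises `PCodeError` instead of crashing the host program.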


UCSD p-Machine


Architecture

Like many other p-code machines, the UCSD p-Machine is a stack machine, which means that most instructions take their operands from a stack, and place results back on the stack. Thus, the add instruction replaces the two topmost elements of the stack with their sum. A few instructions take an immediate argument. Like Pascal, the p-code is strongly typed, supporting boolean (b), character (c), integer (i), real (r), set (s), and pointer (a) data types natively.

Some simple instructions:

 Insn.   Stack    Stack    Description
         before   after
 adi     i1 i2    i1+i2    add two integers
 adr     r1 r2    r1+r2    add two reals
 inn     i1 s1    b1       set membership; b1 = whether i1 is a member of s1
 ldi     i1 i1    i1       load integer constant
 mov     a1 a2    a2       move
 not     b1 b1    -b1      boolean negation
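The stack discipline described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names mirror the mnemonics `adi`, `adr`, and `inn`, a Python list stands in for the machine stack (top of stack at the end), and a `frozenset` stands in for a Pascal set value. The real p-Machine encodes all of these differently.

```python
def adi(stack):
    """adi: replace the two topmost integers with their sum."""
    i2 = stack.pop()
    i1 = stack.pop()
    stack.append(i1 + i2)


def adr(stack):
    """adr: replace the two topmost reals with their sum."""
    r2 = stack.pop()
    r1 = stack.pop()
    stack.append(float(r1) + float(r2))


def inn(stack):
    """inn: pop a set (on top) and an integer, push the membership result."""
    s1 = stack.pop()
    i1 = stack.pop()
    stack.append(i1 in s1)


stack = [2, 3]
adi(stack)                      # stack now holds the single value 5
stack.append(frozenset({1, 5}))
inn(stack)                      # stack now holds the single value True
```

Each typed instruction only pops the operands it expects and pushes one result, which is what lets the compiler track operand types statically.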


Environment

Unlike other stack-based environments (such as Forth and the Java virtual machine) but very similar to a real target CPU, the p-System has only one stack shared by procedure stack frames (providing return address, etc.) and the arguments to local instructions. Three of the machine's registers point into the stack (which grows upwards):
* SP points to the top of the stack (the stack pointer).
* MP marks the beginning of the active stack frame (the mark pointer).
* EP points to the highest stack location used in the current procedure (the extreme pointer).

Also present is a constant area, and, below that, the heap growing down towards the stack. The NP (the new pointer) register points to the top (lowest used address) of the heap. When EP gets greater than NP, the machine's memory is exhausted. The fifth register, PC, points at the current instruction in the code area.
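The relationship between the five registers can be modelled as follows. This is a hedged sketch rather than the p-System's actual implementation: the `Registers` dataclass, the `memory_exhausted` helper, and in particular the `allocate` function (a made-up heap allocator that moves NP downwards) are assumptions introduced only to show the EP versus NP comparison.

```python
from dataclasses import dataclass


@dataclass
class Registers:
    sp: int  # stack pointer: top of the stack
    mp: int  # mark pointer: beginning of the active stack frame
    ep: int  # extreme pointer: highest stack cell the current procedure uses
    np: int  # new pointer: lowest used heap address
    pc: int  # program counter: current instruction in the code area


def memory_exhausted(regs):
    """The stack grows up and the heap grows down; EP passing NP means collision."""
    return regs.ep > regs.np


def allocate(regs, size):
    """Hypothetical heap allocation: move NP down by `size` cells if that is safe."""
    new_np = regs.np - size
    if regs.ep > new_np:
        raise MemoryError("heap would collide with the stack")
    regs.np = new_np
    return new_np


regs = Registers(sp=10, mp=5, ep=20, np=100, pc=0)
block = allocate(regs, 50)  # NP moves from 100 down to 50
```

Comparing against EP rather than SP means the check only has to run when a procedure is entered or the heap grows, not on every push.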


Calling conventions

Stack frames look like this:

 EP ->
       local stack
 SP -> ...
       locals
       ...
       parameters
       ...
       return address (previous PC)
       previous EP
       dynamic link (previous MP)
       static link (MP of surrounding procedure)
 MP -> function return value

The procedure calling sequence works as follows. The call is introduced with

 mst n

where n specifies the difference in nesting levels (remember that Pascal supports nested procedures). This instruction will ''mark'' the stack, i.e. reserve the first five cells of the above stack frame, and initialise previous EP, dynamic, and static link. The caller then computes and pushes any parameters for the procedure, and then issues

 cup n, p

to call a user procedure (n being the number of parameters, p the procedure's address). This will save the PC in the return address cell, and set the procedure's address as the new PC.

User procedures begin with the two instructions

 ent 1, i
 ent 2, j

The first sets SP to MP + i, the second sets EP to SP + j. So i essentially specifies the space reserved for locals (plus the number of parameters plus 5), and j gives the number of entries needed locally for the stack. Memory exhaustion is checked at this point.

Returning to the caller is accomplished via

 retC

with C giving the return type (i, r, c, b, a as above, and p for no return value). The return value has to be stored in the appropriate cell previously. On all types except p, returning will leave this value on the stack.

Instead of calling a user procedure (cup), standard procedure q can be called with

 csp q

These standard procedures are Pascal procedures like readln() (csp rln), sin() (csp sin), etc. Peculiarly, eof() is a p-code instruction instead.
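The calling sequence described above (mst, parameter pushes, cup, ent, ret) can be traced with a toy model. The sketch below is an assumption-laden simplification, not the UCSD implementation: the class name `PMachine`, the choice that SP indexes the topmost used cell, the empty heap with NP just past the last cell, and the exact cell offsets (taken from the frame picture, counting upwards from MP) are all illustrative choices.

```python
class PMachine:
    """Toy model of the frame registers; offsets follow the frame picture."""

    def __init__(self, size=64):
        self.store = [0] * size
        self.sp = -1       # assumption: SP indexes the topmost used cell
        self.mp = 0
        self.ep = 0
        self.np = size     # assumption: empty heap, NP just past the last cell
        self.pc = 0

    def base(self, levels):
        """Follow the static link `levels` times, starting at the active frame."""
        frame = self.mp
        for _ in range(levels):
            frame = self.store[frame + 1]
        return frame

    def push(self, value):
        self.sp += 1
        self.store[self.sp] = value

    def mst(self, n):
        """Mark the stack: reserve five cells and fill in the three links."""
        mark = self.sp + 1                    # becomes MP: the return value cell
        self.store[mark + 1] = self.base(n)   # static link
        self.store[mark + 2] = self.mp        # dynamic link
        self.store[mark + 3] = self.ep        # previous EP
        self.sp = mark + 4                    # return address cell is set by cup

    def cup(self, n, p):
        """Call a user procedure at address p with n parameter cells pushed."""
        self.mp = self.sp - (n + 4)
        self.store[self.mp + 4] = self.pc     # return address
        self.pc = p

    def ent(self, which, amount):
        """ent 1 sets SP to MP + amount; ent 2 sets EP to SP + amount."""
        if which == 1:
            self.sp = self.mp + amount
        else:
            self.ep = self.sp + amount
            if self.ep > self.np:
                raise MemoryError("stack would run into the heap")

    def ret(self, kind):
        """Return; every kind except 'p' leaves the return value on the stack."""
        frame = self.mp
        self.sp = frame - 1 if kind == "p" else frame
        self.pc = self.store[frame + 4]
        self.ep = self.store[frame + 3]
        self.mp = self.store[frame + 2]
```

A caller would run `mst(0)`, push its arguments, and then `cup(n, p)`; the callee starts with `ent(1, i)` and `ent(2, j)`, stores its result in the cell at MP, and finishes with `ret("i")`, after which the result is the topmost stack cell.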


Example machine

Niklaus Wirth specified a simple p-code machine in the 1976 book ''Algorithms + Data Structures = Programs''. The machine had three registers: a program counter ''p'', a base register ''b'', and a top-of-stack register ''t''. There were eight instructions:
# lit 0, ''a'' : load constant ''a''
# opr 0, ''a'' : execute operation ''a'' (13 operations: RETURN, 5 math functions, and 7 comparison functions)
# lod ''l'', ''a'' : load variable ''l,a''
# sto ''l'', ''a'' : store variable ''l,a''
# cal ''l'', ''a'' : call procedure ''a'' at level ''l''
# int 0, ''a'' : increment t-register by ''a''
# jmp 0, ''a'' : jump to ''a''
# jpc 0, ''a'' : jump conditional to ''a''

This is the code for the machine, written in Pascal:

 const
   amax = 2047;    {maximum address}
   levmax = 3;     {maximum depth of block nesting}
   cxmax = 200;    {size of code array}

 type
   fct = (lit, opr, lod, sto, cal, int, jmp, jpc);
   instruction = packed record
     f: fct;
     l: 0..levmax;
     a: 0..amax;
   end;

 var
   code: array [0..cxmax] of instruction;

 procedure interpret;

   const stacksize = 500;

   var
     p, b, t: integer;  {program, base, and top-of-stack registers}
     i: instruction;    {instruction register}
     s: array [1..stacksize] of integer;  {data store}

   function base(l: integer): integer;
     var b1: integer;
   begin
     b1 := b;  {find base l levels down}
     while l > 0 do begin
       b1 := s[b1];
       l := l - 1
     end;
     base := b1
   end {base};

 begin
   writeln(' start pl/0');
   t := 0; b := 1; p := 0;
   s[1] := 0; s[2] := 0; s[3] := 0;
   repeat
     i := code[p]; p := p + 1;
     with i do
       case f of
         lit: begin t := t + 1; s[t] := a end;
         opr:
           case a of  {operator}
             0: begin  {return}
                  t := b - 1; p := s[t + 3]; b := s[t + 2]
                end;
             1: s[t] := -s[t];
             2: begin t := t - 1; s[t] := s[t] + s[t + 1] end;
             3: begin t := t - 1; s[t] := s[t] - s[t + 1] end;
             4: begin t := t - 1; s[t] := s[t] * s[t + 1] end;
             5: begin t := t - 1; s[t] := s[t] div s[t + 1] end;
             6: s[t] := ord(odd(s[t]));
             8: begin t := t - 1; s[t] := ord(s[t] = s[t + 1]) end;
             9: begin t := t - 1; s[t] := ord(s[t] <> s[t + 1]) end;
             10: begin t := t - 1; s[t] := ord(s[t] < s[t + 1]) end;
             11: begin t := t - 1; s[t] := ord(s[t] >= s[t + 1]) end;
             12: begin t := t - 1; s[t] := ord(s[t] > s[t + 1]) end;
             13: begin t := t - 1; s[t] := ord(s[t] <= s[t + 1]) end;
           end;
         lod: begin t := t + 1; s[t] := s[base(l) + a] end;
         sto: begin s[base(l) + a] := s[t]; writeln(s[t]); t := t - 1 end;
         cal: begin  {generate new block mark}
                s[t + 1] := base(l); s[t + 2] := b; s[t + 3] := p;
                b := t + 1; p := a
              end;
         int: t := t + a;
         jmp: p := a;
         jpc: begin if s[t] = 0 then p := a; t := t - 1 end
       end {with, case}
   until p = 0;
   writeln(' end pl/0');
 end {interpret};

This machine was used to run Wirth's PL/0, a Pascal subset compiler used to teach compiler development.
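To see the eight instructions in action, the interpreter can be ported almost line for line to Python. The port below is a sketch under a few assumptions: instructions are plain `(f, l, a)` tuples with string mnemonics, `interpret` returns the values written by `sto` in a list instead of printing them, `trunc_div` is a helper added to mimic Pascal's truncating `div`, and the sample program is hand-written for illustration rather than produced by a PL/0 compiler.

```python
def trunc_div(x, y):
    """Integer division truncating toward zero, like Pascal's div."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


BINARY = {
    2: lambda x, y: x + y,
    3: lambda x, y: x - y,
    4: lambda x, y: x * y,
    5: trunc_div,
    8: lambda x, y: int(x == y),
    9: lambda x, y: int(x != y),
    10: lambda x, y: int(x < y),
    11: lambda x, y: int(x >= y),
    12: lambda x, y: int(x > y),
    13: lambda x, y: int(x <= y),
}


def interpret(code, stacksize=500):
    """Run (f, l, a) instructions and return the values stored by sto."""
    s = [0] * (stacksize + 1)  # index 0 is unused so indices match s[1..stacksize]
    p, b, t = 0, 1, 0          # program counter, base register, top of stack
    output = []

    def base(level):
        """Follow the static link `level` times, starting from b."""
        b1 = b
        while level > 0:
            b1 = s[b1]
            level -= 1
        return b1

    while True:
        f, l, a = code[p]
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "opr":
            if a == 0:                      # return from procedure
                t = b - 1
                p = s[t + 3]
                b = s[t + 2]
            elif a == 1:                    # negate
                s[t] = -s[t]
            elif a == 6:                    # odd
                s[t] = s[t] % 2
            else:                           # binary arithmetic or comparison
                t -= 1
                s[t] = BINARY[a](s[t], s[t + 1])
        elif f == "lod":
            t += 1
            s[t] = s[base(l) + a]
        elif f == "sto":
            s[base(l) + a] = s[t]
            output.append(s[t])
            t -= 1
        elif f == "cal":                    # build a new block mark
            s[t + 1], s[t + 2], s[t + 3] = base(l), b, p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "jpc":
            if s[t] == 0:
                p = a
            t -= 1
        if p == 0:                          # same exit test as "until p = 0"
            return output


program = [
    ("jmp", 0, 1),  # skip to the main block
    ("int", 0, 4),  # reserve 3 block-mark cells plus 1 variable
    ("lit", 0, 2),
    ("lit", 0, 3),
    ("opr", 0, 2),  # add
    ("sto", 0, 3),  # store into the variable at offset 3
    ("opr", 0, 0),  # return; the saved return address 0 ends the run
]
print(interpret(program))  # prints [5]
```

The run ends because the outermost block mark holds a return address of 0, so the final `opr 0, 0` sets p to 0 and the loop exits.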


Microsoft P-Code

P-Code is a name for several of Microsoft's proprietary intermediate languages. They provided an alternative binary format to machine code. At various times, Microsoft has said p-code is an abbreviation for either ''packed code'' or ''pseudo code''. Microsoft p-code was used in Visual C++ and Visual Basic. Like other p-code implementations, Microsoft p-code enabled a more compact executable at the expense of slower execution.


See also

* Bytecode
* Intermediate representation
* Joel McCormack, designer of the NCR Corporation version of the p-code machine
* Runtime system
* Token threading


