Marshalling (computer science)
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, marshalling or marshaling ( US spelling) is the process of transforming the
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...
representation of an object into a data format suitable for storage or transmission, especially between different runtimes. It is typically used when data must be moved between different parts of a
computer program A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...
or from one program to another. Marshalling simplifies complex communications, because it allows using '' composite objects'' instead of being restricted to '' primitive objects''.


Comparison with serialization

Marshalling is similar to or synonymous with
serialization In computing, serialization (or serialisation, also referred to as pickling in Python (programming language), Python) is the process of translating a data structure or object (computer science), object state into a format that can be stored (e. ...
, although technically serialization is one step in the process of marshalling an object. * Marshalling is describing the overall intent or process to transfer some ''live'' object from a client to a server (with ''client'' and ''server'' taken as abstract, mirrored concepts mapping to any matching ends of an arbitrary communication link ie. sockets). The point with marshalling an object is to have that object that is present in one ''running'' program be present in another ''running'' program; that is, an object on the ''client'' should be transferred to the ''server,'' which is a form of reification allowing the object’s structure, data and state to transit from a runtime to another, leveraging an intermediate, serialized, "dry" representation (which is of second importance) circulating onto the communication socket. * Serialization does not necessarily have this same intent, since it is only concerned about transforming data to generate that intermediate, "dry" representation of the object (for example, into a stream of bytes) which could then be either reified in a different runtime, or simply stored in a database, a file or in memory. Marshalling and serialization might thus be done differently, although some form of serialization is usually used to do marshalling. The term ''deserialization'' is somewhat similar to ''un''-marshalling a dry object "on the server side", i.e., demarshalling (or unmarshalling) to get a live object back: the serialized object is transformed into an internal data structure, i.e., a live object within the target runtime. It usually corresponds to the exact inverse process of marshalling, although sometimes both ends of the process trigger specific business logic. The accurate definition of marshalling differs across programming languages such as Python,
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, and
.NET The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
, and in some contexts, is used interchangeably with serialization.


Marshalling in different programming languages

To " serialize" an object means to convert its state into a byte stream in such a way that the byte stream may be converted back into a copy of the object, which is unmarshalling in essence. Different programming languages either make or don’t make the distinction between the two concepts. A few examples: In Python, the term "marshal" is used for a specific type of "serialization" in the Python standard library – storing internal python objects: In the
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
-related , marshalling is used when serializing objects for remote invocation. An object that is marshalled records the state of the original object and it contains the codebase (''codebase'' here refers to a list of URLs where the object code can be loaded from, and not source code). Hence, in order to convert the object state and codebase(s), unmarshalling must be done. The unmarshaller interface automatically converts the marshalled data containing codebase(s) into an executable Java object in JAXB. Any object that can be deserialized can be unmarshalled. However, the converse need not be true. In
.NET The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
, marshalling is also used to refer to serialization when using remote calls:


Usage and examples

Marshalling is used within implementations of different
remote procedure call In distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in a different address space (commonly on another computer on a shared computer network), which is written as if it were a ...
(RPC) mechanisms, where it is necessary to transport data between
process A process is a series or set of activities that interact to produce a result; it may occur once-only or be recurrent or periodic. Things called a process include: Business and management * Business process, activities that produce a specific s ...
es and/or between threads. In Microsoft's
Component Object Model Component Object Model (COM) is a binary-interface technology for software components from Microsoft that enables using objects in a language-neutral way between different programming languages, programming contexts, processes and machines ...
(COM), interface pointers must be marshalled when crossing COM apartment boundaries. In the .NET Framework, the conversion between an unmanaged type and a CLR type, as in the P/Invoke process, is also an example of an action that requires marshalling to take place. Additionally, marshalling is used extensively within scripts and applications that use the
XPCOM Cross Platform Component Object Model (XPCOM) is a cross-platform component model from Mozilla. It is similar to Component Object Model (COM), Common Object Request Broker Architecture (CORBA) and system object model (SOM). It features multiple ...
technologies provided within the Mozilla application framework. The Mozilla Firefox browser is a popular application built with this framework, that additionally allows
scripting language In computing, a script is a relatively short and simple set of instructions that typically automation, automate an otherwise manual process. The act of writing a script is called scripting. A scripting language or script language is a programming ...
s to use XPCOM through XPConnect (Cross-Platform Connect).


Example

In the
Microsoft Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
family of operating systems the entire set of device drivers for
Direct3D Direct3D is a graphics application programming interface (API) for Microsoft Windows. Part of DirectX, Direct3D is used to render three-dimensional graphics in applications where performance is important, such as games. Direct3D uses hardware ...
are kernel-mode drivers. The user-mode portion of the
API An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...
is handled by the DirectX runtime provided by Microsoft. This is an issue because calling kernel-mode operations from user-mode requires performing a
system call In computing, a system call (syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive ...
, and this inevitably forces the CPU to switch to "kernel mode". This is a slow operation, taking on the order of
microsecond A microsecond is a unit of time in the International System of Units (SI) equal to one millionth (0.000001 or 10−6 or ) of a second. Its symbol is μs, sometimes simplified to us when Unicode is not available. A microsecond is to one second, ...
s to complete. During this time, the CPU is unable to perform any operations. As such, minimizing the number of times this switching operation must be performed would optimize performance to a substantive degree. Linux OpenGL drivers are split in two: a kernel-driver and a user-space driver. The user-space driver does all the translation of
OpenGL OpenGL (Open Graphics Library) is a Language-independent specification, cross-language, cross-platform application programming interface (API) for rendering 2D computer graphics, 2D and 3D computer graphics, 3D vector graphics. The API is typic ...
commands into machine code to be submitted to the GPU. To reduce the number of system calls, the user-space driver implements marshalling. If the GPU's command buffer is full of rendering data, the API could simply store the requested rendering call in a temporary buffer and, when the command buffer is close to being empty, it can perform a switch to kernel-mode and add a number of stored commands all at once.


Formats

Marshalling data requires some kind of data transfer, which leverages a specific data format to be chosen as the serialization target.


XML vs JSON vs…

XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing data. It defines a set of rules for encoding electronic document, documents in a format that is both human-readable and Machine-r ...
is one such format and means of transferring data between systems. Microsoft, for example, uses it as the basis of the file formats of the various components (Word, Excel, Access, PowerPoint, etc.) of the Microsoft Office suite (see
Office Open XML Office Open XML (also informally known as OOXML) is a zipped, XML-based file format developed by Microsoft for representing spreadsheets, charts, presentations and word processing documents. Ecma International standardized the initial version ...
). While this typically results in a verbose wire format, XML's fully-bracketed "start-tag", "end-tag" syntax allows provision of more accurate diagnostics and easier recovery from transmission or disk errors. In addition, because the tags occur repeatedly, one can use standard compression methods to shrink the content—all the Office file formats are created by zipping the raw XML. Alternative formats such as
JSON JSON (JavaScript Object Notation, pronounced or ) is an open standard file format and electronic data interchange, data interchange format that uses Human-readable medium and data, human-readable text to store and transmit data objects consi ...
(JavaScript Object Notation) are more concise, but correspondingly less robust for error recovery. Once the data is transferred to a program or an application, it needs to be converted back to an object for usage. Hence, unmarshalling is generally used in the receiver end of the implementations of Remote Method Invocation (RMI) and
Remote procedure call In distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in a different address space (commonly on another computer on a shared computer network), which is written as if it were a ...
(RPC) mechanisms to unmarshal transmitted objects in an executable form.


JAXB

JAXB or
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
Architecture for XML Binding is the most common framework used by developers to marshal and unmarshal Java objects. JAXB provides for the interconversion between fundamental data types supported by Java and standard
XML schema An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. These constrai ...
data types.


XmlSerializer

XmlSerializer is the framework used by C# developers to marshal and unmarshal C# objects. One of the advantages of C# over Java is that C# natively supports marshalling due to the inclusion of XmlSerializer class. Java, on the other hand requires a non-native glue code in the form of JAXB to support marshalling.


From XML to an executable representation

An example of unmarshalling is the conversion of an XML representation of an object to the default representation of the object in any programming language. Consider the following class: public class Student * XML representation of a specific ''Student'' object'':'' Jayaraman Shyam * Executable representation of that ''Student'' object: // Code Snippet 2 Student s1 = new Student(); s1.setID(11235813); s1.setName("Jayaraman"); Student s2 = new Student(); s2.setID(21345589); s2.setName("Shyam"); Unmarshalling is the process of converting the XML representation of Code Snippet 1 to the default executable Java representation of Code Snippet 2, and running that very code to get a consistent, live object back. Had a different format been chosen, the unmarshalling process would have been different, but the end result in the target runtime would be the same.


Unmarshalling in Java


Unmarshaller in JAXB

The process of unmarshalling XML data into an executable Java object is taken care of by the in-built Unmarshaller class. The unmarshal methods defined in the Unmarshaller class are overloaded to accept XML from different types of input such as a File, FileInputStream, or URL. For example: JAXBContext jcon = JAXBContext.newInstance("com.acme.foo"); Unmarshaller umar = jcon.createUnmarshaller(); Object obj = umar.unmarshal(new File("input.xml"));


Unmarshalling XML Data

Unmarshal methods can deserialize an entire XML document or a small part of it. When the XML root element is globally declared, these methods utilize the JAXBContext's mapping of XML root elements to JAXB mapped classes to initiate the unmarshalling. If the mappings are not sufficient and the root elements are declared locally, the unmarshal methods use declaredType methods for the unmarshalling process. These two approaches can be understood below.


Unmarshal a global XML root element

The unmarshal method uses JAXBContext to unmarshal the XML data, when the root element is globally declared. The JAXBContext object always maintains a mapping of the globally declared XML element and its name to a JAXB mapped class. If the XML element name or its @xsi:type attribute matches the JAXB mapped class, the unmarshal method transforms the XML data using the appropriate JAXB mapped class. However, if the XML element name has no match, the unmarshal process will abort and throw an ''UnmarshalException''. This can be avoided by using the unmarshal by declaredType methods.


Unmarshal a local XML root element

When the root element is not declared globally, the application assists the unmarshaller by application-provided mapping using declaredType parameters. By an order of precedence, even if the root name has a mapping to an appropriate JAXB class, the declaredType overrides the mapping. However, if the @xsi:type attribute of the XML data has a mapping to an appropriate JAXB class, then this takes precedence over declaredType parameter. The unmarshal methods by declaredType parameters always return a JAXBElement instance. The properties of this JAXBElement instance are set as follows:


See also

* Free and open-source graphics device driver#Software architecture *
Component Object Model Component Object Model (COM) is a binary-interface technology for software components from Microsoft that enables using objects in a language-neutral way between different programming languages, programming contexts, processes and machines ...
* CORBA * Pickle (Python) * Protocol Buffers * Java Architecture for XML Binding * Calling convention


References

{{reflist Persistence Remote procedure call