The Challenge of Java Virtualization
June 25th, 2013 by John Matthew Holt
The Java Platform, as defined by the Java Virtual Machine Specification and the Java Language Specification, poses particular difficulties for virtualization. Full virtualization (presenting the illusion of a completely isolated abstraction) of the host JVM involves solving unique challenges arising from the design of the Java Platform, and adds significant complexity to the design and implementation of a virtual machine monitor (VMM) for the JVM.
The challenge of Java virtualization is very different from the challenge of virtualizing other machine specifications, such as Intel x86. A VMM for an x86 machine is based upon a series of trap-and-emulate routines for sensitive instructions that, for security and isolation reasons, must not be executed directly by guest VMs. For example, when a guest VM of an x86 VMM attempts to read or modify privileged state of the host CPU by means of a sensitive instruction, the VMM must trap this action before it occurs, take control so that it can emulate the sensitive instruction in a safe manner, and then transparently resume direct execution of the guest VM at the next guest instruction.
The design of VMMs for the x86 instruction set architecture (ISA) is therefore directed at classifying every instruction of the x86 ISA as either non-sensitive (meaning the instruction will not interfere with any other guest software or the host software) and safe to be executed directly by guests, or sensitive (meaning the instruction may interfere with other guests or the host software) and not safe to be executed directly by guests. The x86 VMM provides an emulation function for every sensitive instruction, and uses a variety of techniques to trap the attempted execution of a sensitive instruction by a guest VM and redirect it to the ‘safe’ emulation function that the VMM provides for that instruction.
Virtualizing the Java Platform, however, is entirely different. The principal reason for this difference is that the Java Platform ISA (the ISA of the JVM) does not contain any sensitive instructions according to the classification criteria that are the basis of x86 virtualization. In other words, according to the criteria used to classify the x86 ISA for x86 VMMs, all of the instructions in the Java Platform ISA are non-sensitive.
This presents something of a paradox. On the one hand, the entire Java Platform ISA is considered non-sensitive (i.e., no instruction will interfere with another guest) according to the classification criteria used for x86 virtualization. On the other hand, attempting to co-locate independent Java applications within the same JVM quickly fails because of incompatible class library dependencies and unsafe Java APIs. Thus, a new virtualization technique, completely different from that employed for x86 virtualization, is required in order to successfully virtualize the Java Platform.
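The classify-then-trap loop described above can be sketched as a toy model in Java. This is only an illustration: the opcode names, the contents of the sensitive set, and the ToyVmm class are all assumptions made for the example, not the real x86 ISA or any actual VMM.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Toy model only: a handful of made-up opcodes, not the real x86 ISA.
enum Op { ADD, MOV, LOAD_CR3, HLT }

final class ToyVmm {
    // Hypothetical classification: opcodes a guest may not execute directly.
    static final Set<Op> SENSITIVE = EnumSet.of(Op.LOAD_CR3, Op.HLT);

    int directCount = 0;   // instructions that ran "natively"
    int trappedCount = 0;  // instructions intercepted and emulated

    void run(List<Op> guestCode) {
        for (Op op : guestCode) {
            if (SENSITIVE.contains(op)) {
                trappedCount++;
                emulate(op);   // control transfers to the VMM
            } else {
                directCount++; // direct execution, no VMM involvement
            }
            // Either way, execution resumes at the next guest instruction.
        }
    }

    private void emulate(Op op) {
        // A real VMM would update per-guest virtual CPU state here;
        // this sketch deliberately does nothing.
    }
}
```

In this sketch only the two opcodes in SENSITIVE ever reach emulate(); everything else takes the direct path, which is the property that x86 VMMs rely on for performance.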
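One illustration of the kind of JVM-wide API that makes co-location unsafe is the system property table, which every class in a JVM shares. The sketch below assumes two hypothetical "applications" (plain static methods here) and a made-up property key; no bytecode instruction is sensitive in the x86 sense, yet one application silently overwrites the other's setting.

```java
final class CoLocationDemo {
    // Made-up property key, used only for this illustration.
    static final String KEY = "demo.app.mode";

    // Hypothetical application A configures itself through a JVM-wide property.
    static void appA() { System.setProperty(KEY, "A-production"); }

    // Hypothetical application B does the same and overwrites A's value.
    static void appB() { System.setProperty(KEY, "B-testing"); }

    // Returns the value application A would read back after B has also started.
    static String whatAppASees() {
        appA();
        appB();
        return System.getProperty(KEY);
    }
}
```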
Java Virtualization is Type Virtualization
The basis of the J-VMM is an innovative virtualization technique designed exclusively for the challenges of virtualizing the Java Platform and JVM, which we call type virtualization. Unlike x86 virtualization, which presents a logically isolated view of an x86 ISA to each guest (called instruction-set virtualization), type virtualization presents a logically isolated view of the Java type system to each guest.
To understand the challenge that type virtualization solves, consider the arrangement of two Java SE applications running in two separate JVMs today. Two applications in two JVMs will obviously be isolated – i.e. isolated from interference by each other – but what is it precisely that achieves this isolation? The answer is they are independent type systems.
Isolation through independent type systems, however, comes at a cost. Those two Java applications operating in two independent JVM instances will have independent java.lang.Object types, one for each JVM instance, meaning an instance of java.lang.Object in one JVM will not be an instanceof the other JVM’s java.lang.Object type.
If two applications do not share at least a common java.lang.Object supertype, then no object of one application will be an instanceof any type of the other application. Direct method invocation between two such applications, and passing of object references between them, will not be possible. In such situations, the only means of data exchange between applications with independent java.lang.Object types is via remote method invocation (RMI) schemes over network sockets or other bitstream equivalents, with all of the poor performance and marshaling overheads common to remote procedure call schemes.
In Java EE and other Java application frameworks, co-located application modules (.WAR webapps, .EAR EJB applications, and so on) and the application server platforms upon which they operate are required to share the same type system. This is an explicit design requirement, and a practical necessity, of the Java EE specifications. As a consequence, isolation by independent type systems cannot offer an isolation solution for Java EE (and similar) application frameworks.
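A loose, single-JVM analogy for this effect can be built with class loaders. The sketch below is an assumption-laden illustration, not a model of two real JVMs: the Payload, IsolatingLoader and TypeIsolationDemo names are hypothetical, and the loader reads the compiled Payload.class from the classpath, so it assumes the code was compiled with javac to class files on disk. Each loader defines its own copy of Payload, so objects of one copy are not instances of the other, while both copies still share the single java.lang.Object of the JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Constructor;

// A trivial application type; the name Payload is illustrative only.
class Payload { }

// Hypothetical loader that defines its own copy of Payload rather than
// delegating to its parent, so every loader instance yields a distinct type.
final class IsolatingLoader extends ClassLoader {
    private static final String TARGET = Payload.class.getName();

    IsolatingLoader(ClassLoader parent) { super(parent); }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(TARGET)) {
            return super.loadClass(name, resolve); // java.lang.Object etc. stay shared
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] bytes = readClassBytes(name);
                c = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }

    // Reads the compiled class file through the parent loader's resources.
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(path)) {
            if (in == null) throw new ClassNotFoundException(name);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

final class TypeIsolationDemo {
    // Loads Payload through a brand-new loader, producing a brand-new type.
    private static Class<?> freshPayloadType() {
        try {
            ClassLoader app = TypeIsolationDemo.class.getClassLoader();
            return new IsolatingLoader(app).loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if an object of one Payload copy is an instance of another copy.
    static boolean crossInstance() {
        try {
            Class<?> a = freshPayloadType();
            Class<?> b = freshPayloadType();
            Constructor<?> ctor = a.getDeclaredConstructor();
            ctor.setAccessible(true); // the copies live in different runtime packages
            Object fromA = ctor.newInstance();
            return b.isInstance(fromA);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if both Payload copies extend the very same java.lang.Object class.
    static boolean sharedObjectSupertype() {
        return freshPayloadType().getSuperclass() == freshPayloadType().getSuperclass();
    }

    public static void main(String[] args) {
        System.out.println("instance of the other Payload? " + crossInstance());
        System.out.println("same java.lang.Object?         " + sharedObjectSupertype());
    }
}
```

Under those assumptions, crossInstance() is expected to report false and sharedObjectSupertype() true. Two separate JVMs go one step further than this analogy: there, even java.lang.Object itself is not shared.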
A new technical solution that allows for direct method invocation between co-located application units and the application server platforms on which they run, while providing full-scale isolation across all compute resource vectors, is required. Such a solution can only be achieved if the type systems of the co-located application units and the hosting application server platform are not wholly independent – i.e. they must at a very minimum share the same java.lang.Object supertype!
Type virtualization by the Java VMM balances these conflicting requirements by virtualizing a JVM’s types in much the same way that an x86 VMM virtualizes a CPU’s instructions. The Java VMM uses type virtualization, rather than the instruction virtualization used by x86 VMMs, to create multiple isolated type system views (called guest typespaces or virtual typespaces) for Java Virtual Containers (JVCs), much as instruction-set virtualization creates multiple isolated instruction-set views for guest VMs. Each guest typespace is a logically isolated view of the host JVM’s Java type system (called the host type system); in other words, there is still only one java.lang.Object type. In this way, the ‘real’ Java type system of the host JVM can be securely shared between multiple JVCs, with each JVC behaving as a logically isolated guest typespace.
Further discussion of the operation of guest typespaces will follow in my next blog post.
The challenge of virtualizing the Java Platform is creating the abstraction of logically isolated guest typespaces called Java Virtual Containers (JVCs) – or, restated in more technical terms, supporting plural independent views of singleton types (e.g. java.lang.Object, java.lang.Class, etc.). Independence of guest typespaces is critical: it ensures that what happens in one guest stays in that guest and affects neither any other guest typespace nor the host type system as a whole.
The goal of virtualizing the host type system of a JVM into JVCs is for each guest typespace to appear to its operator and application as a logically isolated Java type system, when in reality it is not. This is similar to the goal of x86 VMMs, which seek to present the view of a logically isolated x86 CPU to each guest VM, when in fact the CPU is not isolated and is shared.
x86 VMMs achieve this feat by intercepting the sensitive instructions of guest applications, but letting all non-sensitive instructions operate ‘natively’ and ‘directly’ on the real CPU (called direct execution). The advantage of this is very considerable: non-sensitive instructions constitute the vast majority of all instructions executed by guest applications, so the majority of guest code runs with no performance loss.
Java type virtualization borrows the notion of sensitive and non-sensitive actions from x86 VMMs and applies it to the Java Platform’s notion of types. So-called non-sensitive types can then be accessed by a guest application ‘natively’ and ‘directly’ (as with direct execution in the x86 VMM) with no performance penalty, while accesses to sensitive types are trapped and emulated by the Java Virtual Machine Monitor. More explanation of sensitive and non-sensitive types, and how they interact with guest typespaces and host type systems, will follow in the next blog post.
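As a rough sketch of the idea, and not a description of how the Java VMM is actually implemented, the toy monitor below classifies a couple of types as sensitive and gives each guest its own emulated copy of the state behind one of them. The ToyTypeMonitor class, the choice of java.lang.System and java.lang.Runtime as sensitive, and the guest names are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model only; none of these names come from the Java VMM itself.
final class ToyTypeMonitor {
    // Hypothetical classification: types whose state is JVM-wide.
    static final Set<String> SENSITIVE_TYPES =
            Set.of("java.lang.System", "java.lang.Runtime");

    static boolean isSensitive(Class<?> type) {
        return SENSITIVE_TYPES.contains(type.getName());
    }

    // One emulated property table per guest, standing in for the
    // JVM-wide table behind System.setProperty / System.getProperty.
    private final Map<String, Map<String, String>> guestProps = new HashMap<>();

    // Emulated counterpart of System.setProperty for one guest;
    // returns the guest's previous value, or null if there was none.
    String setProperty(String guest, String key, String value) {
        return guestProps.computeIfAbsent(guest, g -> new HashMap<>()).put(key, value);
    }

    // Emulated counterpart of System.getProperty for one guest.
    String getProperty(String guest, String key) {
        Map<String, String> props = guestProps.get(guest);
        return props == null ? null : props.get(key);
    }
}
```

With this sketch, a value that one guest stores under a key is invisible to another guest using the same key, while a type that is not in SENSITIVE_TYPES (java.util.ArrayList, say) would simply be used directly with no monitor involvement.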
To assist in understanding the design and operation of the Java Virtualization Interface (JVI), it is helpful to be familiar with the theory and practice of hardware virtualization generally, and virtual machine monitors specifically. Readers not familiar with virtual machine monitors and hardware virtualization are encouraged to read some or all of the following articles before proceeding with the JVI User Guide series:
John Matthew Holt is the Founder and Chief Technology Officer of Waratek. He is the inventive inspiration and technical driving force behind Waratek’s groundbreaking research and development into distributed computing and virtualization technologies, which has led to the granting of over 50 patents to date with many more pending.
As CTO, John Matthew leads a multinational team of expert computer engineers on a journey that has resulted in the creation of a disruptive new approach to web security that allows organisations to protect their Java applications and data from SQL Injection, targeted attacks and unpatched vulnerabilities at runtime, without making any code changes or deploying any hardware.
John Matthew has a long-standing passion for exploring contrarian frontiers of metacircular abstract machine interpreters and dynamic recompilation frameworks.