JVM Architecture

Hello everyone, this is Shikhar. I hope you are all doing well. Today we will be discussing the Java Virtual Machine. The JVM is the cornerstone of Java's platform independence, enabling developers to write code once and run it anywhere. Beneath this abstraction lies a sophisticated architecture designed to manage memory, execute bytecode, and provide a robust runtime environment for Java applications.

In this blog, we'll unravel the layers of JVM architecture, exploring its components, execution model, and the magic that makes Java a leading programming language in software development. Let's start by discussing virtual machines.

Virtual Machines :

Virtual machines are like having a computer within a computer! This "virtual" machine can run its own operating system and software, just like a real computer does. However, instead of physical hardware, it uses resources from your actual computer, like memory and processing power.

Virtual machines (VMs) come in various types, each suited for different purposes and environments. Here are some common types of virtual machines:

  • Process Virtual Machines: Process virtual machines execute individual computer programs in a platform-independent manner. Examples include the Java Virtual Machine (JVM) and the Common Language Runtime (CLR) in the .NET framework.

  • System Virtual Machines: System virtual machines provide a complete operating system environment that runs as a software layer on top of the host operating system. Examples include VMware Workstation and Oracle VirtualBox, which allow you to run multiple operating systems concurrently on a single physical machine.

JVM :

In languages like C and C++, the source code is compiled ahead of time into platform-specific machine code. These are called compiled languages. In languages like JavaScript and Python, on the other hand, an interpreter executes the program at run time, with no separate compilation step for the developer. These are called interpreted languages.

Java uses a combination of both techniques. Java source code is first compiled into bytecode, which is stored in a class file. The Java Virtual Machine for the underlying platform then executes this class file by interpreting it, compiling it to native code at run time, or both. The same class file can run on any platform and operating system, provided the JVM is recent enough to support the class file's format version.

Similar to virtual machines, the JVM creates an isolated environment on a host machine. This can be used to execute Java programs irrespective of the platform or operating system of the machine.
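To make the idea of a portable class file more concrete, here is a minimal sketch that reads the first eight bytes of a class file. Every class file starts with the magic number 0xCAFEBABE followed by a minor and a major format version (for example, major version 52 corresponds to Java 8). The class name `ClassFileHeader` and its `read` method are made up for this illustration; the sketch assumes the class file of `java.lang.String` can be read as a resource, which is the case on a standard JDK.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of any JDK API.
class ClassFileHeader {

    /** Returns { magic, major, minor } read from the class file of the given type. */
    static long[] read(Class<?> type) {
        // Resource name relative to the class's own package, e.g. "String.class".
        String resource = type.getSimpleName() + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            long magic = in.readInt() & 0xFFFFFFFFL; // first 4 bytes of every class file
            int minor = in.readUnsignedShort();      // next 2 bytes
            int major = in.readUnsignedShort();      // next 2 bytes
            return new long[] { magic, major, minor };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] header = read(String.class);
        System.out.printf("magic=0x%X major=%d minor=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the JDK you run this with. A JVM refuses to load a class file whose major version is newer than the versions it supports.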

As of January 2022, Oracle primarily maintains two official Java Virtual Machine (JVM) implementations:

  1. Oracle HotSpot JVM: This is the most widely used JVM implementation. It's optimized for performance and is the default JVM for Oracle's JDK distribution.

  2. Oracle GraalVM: GraalVM is a high-performance JVM and polyglot runtime that Oracle Labs developed. It supports running applications written in multiple languages, including Java, JavaScript, Python, Ruby, and more. GraalVM includes the HotSpot VM, but it also provides the GraalVM compiler, which offers enhanced performance for certain workloads.

These are the primary JVM implementations maintained by Oracle. However, there are also other JVM implementations maintained by other organizations, such as Eclipse OpenJ9 (formerly IBM J9).

JVM Components :

The JVM's architecture is divided into three components: the Class Loader, the Runtime Data Area, and the Execution Engine. Let us discuss them one by one.

Image: Michelle Ridomi

  • Class Loader:

    In the Java Virtual Machine (JVM), the Java Class Loader is a crucial component responsible for loading Java classes into memory as a running Java program references them. It plays a fundamental role in the Java runtime environment, enabling dynamic class loading and providing mechanisms for class loading delegation and visibility.

    Here's an overview of the Java Class Loader in the JVM:

    1. Class Loading: When a Java program references a class for the first time, the Java Class Loader loads the class file from the file system or another source into memory. The loading process involves locating the bytecode for the class, reading it into memory, and creating a representation of the class within the JVM.

    2. Delegation Model: The Java Class Loader follows a delegation model, where each class loader delegates the class loading request to its parent class loader before attempting to load the class itself. This hierarchical delegation ensures that classes are loaded in a controlled and predictable manner. If a class loader's parent can successfully load the class, it is not loaded again by the child class loader.

    3. Visibility: Classes loaded by a parent class loader are visible to its child class loaders, but not vice versa. This visibility hierarchy ensures that classes loaded at higher levels (e.g., system classes) are accessible to classes loaded at lower levels (e.g., application classes).

    4. Custom Class Loaders: Java provides the ability to create custom class loaders by extending the ClassLoader class. Custom class loaders allow developers to implement specialized class loading behaviour, such as loading classes from non-standard sources (e.g., databases, network), modifying class bytecode before loading, or implementing custom class loading policies.

    5. Class Loading Phases: The class loading process in the JVM consists of three phases: loading, linking, and initialization. Linking itself has three steps: verification, preparation (allocating static fields and setting them to default values), and resolution (replacing symbolic references with direct ones). Finally, the class is initialized, which means executing its static variable initializers and static initializer blocks.

      1. Loading:

        • Locating and Loading Class File: The loading phase starts when a class is first referenced by a Java program. The JVM searches for the corresponding class file in the classpath or other defined locations. If the class file is found, it is loaded into memory.

        • Creation of Class Object: Once the class file is located and loaded, the JVM creates an instance of the java.lang.Class object representing that class. This Class object contains metadata about the class, such as its methods, fields, and superclass information.

      2. Linking:

        • Verification: In this step, the JVM verifies the correctness of the loaded class file to ensure it adheres to the JVM specification. Verification checks include ensuring the validity of the bytecode, detecting illegal operations, and verifying that the class file adheres to access control rules.

        • Preparation: After verification, the JVM allocates memory for static variables and initializes them with default values (zero for numeric types, false for boolean, null for reference types). No user-written initializer runs at this point. This step prepares the memory layout for the class.

        • Resolution: During resolution, symbolic references in the class file, such as references to other classes, methods, or fields, are replaced with direct references. This links the class to the classes and methods it depends on. Resolution may trigger loading of referenced classes that have not been loaded yet, and a JVM is free to perform it lazily, when a reference is first used.

      3. Initialization:

        • Static Initialization: In this final phase, static variables receive the values assigned in the source code; variables without an explicit initializer keep the default values set during preparation. Static variable initializers and static initialization blocks are executed in the order in which they appear in the source file.

        • Execution of <clinit> method: If the class contains a static initialization block (static { ... }) or declares static variables with initial values (other than static final compile-time constants), the Java compiler generates a special <clinit> method containing these initialization tasks. The JVM invokes this method automatically, once, when the class is initialized.

    6. Class Unloading: The developer does not trigger class unloading explicitly. The garbage collector handles it as part of memory management, and a class can be unloaded only once its defining class loader is no longer reachable. Classes loaded by the bootstrap class loader are therefore never unloaded.
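The initialization phase described in the list above can be observed from plain Java code. The following is a small sketch, with made-up class names (`InitOrderDemo`, `Lazy`), that records when the static initializers of a nested class run. It relies on the rule that a class is initialized on its first active use, here the first read of a non-constant static field.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative names only; nothing here is a JDK API.
class InitOrderDemo {
    // Shared log so the order of events can be inspected afterwards.
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static int first = record("first initializer", 1);

        static {
            EVENTS.add("static block");
        }

        static int second = record("second initializer", 2);

        // Logs the event and hands back the value being assigned.
        static int record(String event, int value) {
            EVENTS.add(event);
            return value;
        }
    }

    /** Touches Lazy and returns the log of everything recorded so far. */
    static List<String> run() {
        EVENTS.add("before first use");
        int value = Lazy.second; // the first read of this field initializes Lazy
        EVENTS.add("after first use, second=" + value);
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Nothing from `Lazy` is logged before the "before first use" entry, and the two initializers and the static block are logged in the order they are written in the source.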

Overall, the Java Class Loader is a critical component of the JVM, providing dynamic class loading capabilities and supporting the execution of Java programs flexibly and modularly.
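Here is a minimal sketch of a custom class loader that shows the delegation model in action. The class `RecordingLoader` is a hypothetical example, not a standard class: it only notes which class names it was asked for and then falls back to the default parent-first behaviour of `ClassLoader.loadClass`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical custom loader for illustration.
class RecordingLoader extends ClassLoader {
    final List<String> requested = new ArrayList<>();

    RecordingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        requested.add(name);                   // note the request...
        return super.loadClass(name, resolve); // ...then use the default parent-first lookup
    }

    /** Convenience wrapper that turns the checked exception into an unchecked one. */
    Class<?> load(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("no such class: " + name, e);
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        Class<?> type = loader.load("java.util.ArrayList");
        System.out.println("requested: " + loader.requested);
        System.out.println("same class as the parent's: " + (type == ArrayList.class));
        System.out.println("defining loader: " + type.getClassLoader());
    }
}
```

Because the request is delegated upwards, the class that comes back is the very same `ArrayList` class the rest of the program uses, and its defining loader is reported as `null`, which is how the bootstrap class loader shows up in the API. A loader that really loads classes from a custom source would override `findClass` and call `defineClass` with the bytes it obtained.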

Note: Class loading is performed on demand as the program needs classes, and classes are typically loaded by the JVM's built-in class loader hierarchy: the bootstrap, platform (called the extension class loader before Java 9), and application class loaders. A class is loaded, linked, and initialized at most once per class loader. However, the same class file can be loaded again by a different class loader (for example, when an application server redeploys an application), and the JVM then treats the result as a distinct class.
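To see the built-in hierarchy on your own machine, you can walk up the parent chain of a class's loader. This is a small sketch with an illustrative class name (`LoaderChain`). The concrete loader class names it prints are internal to the JDK and vary between versions, so treat them as informational only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; the name is made up for this example.
class LoaderChain {

    /** Lists the loader of the given class and all its parents, ending at the bootstrap loader. */
    static List<String> chainOf(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader is represented by null in the API, so it gets a label here.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("LoaderChain: " + chainOf(LoaderChain.class));
        System.out.println("String:      " + chainOf(String.class));
    }
}
```

Core classes such as `String` report only the bootstrap entry, while an application class reports a longer chain that ends at the bootstrap loader.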

  • Runtime Data Area:

    In the Java Virtual Machine (JVM), the runtime data area refers to the memory organization used by the JVM during the execution of a Java program.

    1. Method Area (implemented as the Permanent Generation in HotSpot before Java 8, and as Metaspace since):

      The Method Area stores per-class structures such as the runtime constant pool, field and method data, and the bytecode of methods and constructors. It is created when the JVM starts and is shared among all threads.

    2. Heap:

      • The Heap is the runtime data area where objects and arrays are allocated during program execution. It is shared among all threads. A garbage collector, such as the Garbage-First (G1) collector, reclaims heap memory that is no longer in use, aiming to keep pause times low while maintaining throughput.

    3. Java Stack (or Stack Frames):

      • Each thread in the JVM has its own Java Stack, which stores method invocation frames (stack frames).

        The Java Stack is used to manage method invocation and method execution. Each time a method is invoked, a new frame is pushed onto the stack. When a method completes execution, its frame is popped from the stack.

    4. PC Register (Program Counter Register):

      • The PC Register stores the address of the currently executing JVM instruction. Each thread in the JVM has its own PC Register, which keeps track of the execution flow within that thread.

    5. Native Method Stack:

      • Similar to the Java Stack, the Native Method Stack is used to store native method invocation frames for methods written in languages other than Java (e.g., C or C++). It is used when Java code invokes native methods through the Java Native Interface (JNI).
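Two of these areas are easy to poke at from Java code. The sketch below (the class name `StackAndHeap` is made up) counts how many stack frames a few nested calls add to the current thread's Java stack, and prints the heap sizes reported by `Runtime`. It assumes the JVM reports every frame in `getStackTrace()`, which HotSpot does by default for shallow stacks like this one.

```java
// Illustrative class; the name and methods are made up for this example.
class StackAndHeap {

    /** Recurses extraCalls times, then reports how many frames are on this thread's stack. */
    static int stackDepth(int extraCalls) {
        if (extraCalls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return stackDepth(extraCalls - 1);
    }

    /** Number of additional frames observed after the given number of nested calls. */
    static int framesAdded(int calls) {
        int shallow = stackDepth(0);
        int deep = stackDepth(calls);
        return deep - shallow;
    }

    public static void main(String[] args) {
        System.out.println("frames added by 5 nested calls: " + framesAdded(5));

        // Heap figures are in bytes and depend on JVM settings such as -Xmx.
        Runtime rt = Runtime.getRuntime();
        System.out.println("heap max bytes:   " + rt.maxMemory());
        System.out.println("heap total bytes: " + rt.totalMemory());
        System.out.println("heap free bytes:  " + rt.freeMemory());
    }
}
```

Each nested call pushes one more frame, and every frame is popped again as the calls return. The heap numbers will differ from run to run and machine to machine.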

  • Execution Engine:

    The execution engine in the Java Virtual Machine (JVM) is responsible for executing Java bytecode instructions. It consists of several components that work together to interpret or compile bytecode into native machine code and manage the execution of Java programs. Here's a detailed explanation of the components of the JVM execution engine:

    1. Interpreter:

      • The interpreter is one of the primary components of the JVM execution engine. It reads bytecode instructions one by one and executes them sequentially.

      • Each bytecode instruction is decoded, and the corresponding operation is performed by the interpreter.

    2. Just-In-Time (JIT) Compiler:

      • The JIT compiler is another key component of the JVM execution engine. It transforms bytecode into native machine code at runtime, optimizing the performance of Java applications.

      • The JIT compiler identifies frequently executed bytecode sequences, known as hot spots, and compiles them into highly optimized native code.

    3. HotSpot Compiler:

      • HotSpot is Oracle's JVM implementation, named after its focus on optimizing hot spots. Its execution engine includes two JIT compilers, C1 (client) and C2 (server). By dynamically optimizing code based on runtime characteristics, they improve the overall performance of Java applications.

    4. Garbage Collector (GC):

      • The GC is responsible for reclaiming memory occupied by objects that are no longer reachable, which removes the need for manual deallocation and keeps memory utilization efficient. Objects that are still referenced by mistake are not reclaimed, so memory leaks remain possible. Modern JVM implementations, such as HotSpot, incorporate advanced garbage collection algorithms, such as the Garbage-First (G1) Garbage Collector, to minimize pause times and optimize memory management.

    5. Execution Control:

      • The execution engine also includes components for managing program execution control, such as thread scheduling, synchronization, and exception handling.

      • It coordinates synchronization between threads running concurrently within a Java program.

      • Exception handling mechanisms handle runtime exceptions and errors, providing a mechanism for graceful error recovery and program termination.

Overall, the execution engine in the JVM combines interpretation and dynamic compilation techniques to execute Java bytecode efficiently. By leveraging JIT compilation and advanced optimization strategies, it maximizes the performance of Java applications while maintaining flexibility and adaptability to changing runtime conditions.
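If you are curious which JIT compiler and garbage collectors your own JVM is using, the standard `java.lang.management` API exposes them. The following is a minimal sketch; the class name `EngineInfo` and its helper methods are made up, and the names printed depend entirely on the JVM and the flags it was started with.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative class; only the java.lang.management calls are standard API.
class EngineInfo {

    /** Name of the JIT compiler, or "none" when the JVM runs without one. */
    static String jitName() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "none" : jit.getName();
    }

    /** Names of the garbage collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // A busy loop gives the JIT compiler a hot spot to work on.
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sum += i % 7;
        }
        System.out.println("loop result: " + sum);

        System.out.println("JIT compiler: " + jitName());
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("time spent compiling (ms): " + jit.getTotalCompilationTime());
        }
        System.out.println("garbage collectors: " + collectorNames());
    }
}
```

Running the same program with the interpreter only (for example with HotSpot's `-Xint` flag) is a simple way to compare interpretation against JIT compilation.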

Java Native Interface :

The Java Native Interface (JNI) is a framework that allows Java code running in the Java Virtual Machine (JVM) to call and be called by native applications and libraries written in other programming languages such as C, C++, and assembly. JNI provides a way for Java programs to interact with platform-specific functionality and leverage existing libraries written in native code.

Broadly, JNI enables seamless integration between Java and native code, allowing Java applications to leverage existing libraries and access platform-specific functionality when needed. However, it requires careful attention to memory management, error handling, and adherence to the JNI specification to ensure compatibility and stability across different platforms.
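On the Java side, JNI starts with a method declared `native`. The sketch below shows only that half: `addNatively` is a hypothetical method for which no native library exists, so calling it fails with `UnsatisfiedLinkError`. A real program would first load a library that implements the method, typically with `System.loadLibrary`.

```java
// Illustrative class; addNatively is a hypothetical native method with no implementation.
class NativeDemo {

    // Declared in Java, expected to be implemented in C or C++ through JNI.
    static native int addNatively(int a, int b);

    /** Returns true only if a native implementation of addNatively has been linked. */
    static boolean nativeCodeLinked() {
        try {
            addNatively(1, 2);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown because no loaded library provides the native implementation.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("native implementation linked: " + nativeCodeLinked());
    }
}
```

This is also a good reminder of the care JNI needs: the Java compiler accepts the declaration happily, and the missing native code is only discovered at run time.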

In conclusion, the JVM is the heart of the Java programming language. From its class loaders and runtime data areas, such as the heap and method area, to the execution engine, the JVM orchestrates the execution of Java programs with finesse and efficiency.