Collecting garbage is a messy business, both in the real world and in the programming world. Manual memory management is still a root cause of many bugs in low-level languages such as C. It is also one reason why modern languages such as Java and Rust manage memory automatically: Java with a garbage collector at runtime, Rust with ownership rules checked at compile time. Though modern languages do not expect every developer to track memory usage, understanding what garbage collection is and how it affects your application will help you optimize your code and tune the Java Virtual Machine. This article introduces garbage collection and explains why it is worth understanding.
What is Garbage?
In C, pointers are used to access objects (C doesn’t have the concept of objects, but I am using Java’s term here so that Java developers can follow). In theory, developers can walk through the entire allocated memory area using pointers and access anything stored in memory. This means that a developer can allocate memory in one function and access it in another. Therefore, the language cannot decide whether an object is still accessible to the user. In C, any memory allocation that has not been freed is considered garbage if the developer has no intention of using it later.
On the other hand, Java employs references to reach objects. Direct memory access is reserved for advanced use cases and is not something used for day-to-day tasks. Therefore, the Java runtime can determine whether an object is reachable by checking the active references to it. Objects that cannot be reached through any active reference are defined as garbage.
Why Does It Matter?
Since it is hard to demonstrate this behavior in Java, let’s use C for a little experiment. Inside a “for” loop, the following C code repeatedly allocates a memory block of 10,000,000,000 bytes and stores a string value in it. Save this code as application.c on your computer.
Compile the code using any C compiler. The GNU Compiler Collection command to compile in Linux is given below.
In Linux, it will produce an executable output. Run the following command to run the program.
Depending on your system configuration, you may get either an out-of-memory error or 10,000 “Hello World” messages. My computer (Linux Mint 64-bit, AMD Ryzen 7 5800X 8-Core Processor, and 32 GB memory) prints 10,000 messages. While the process is waiting for user input before it exits, you can check the memory used by the process in the system monitor. On my machine, the program takes 40.9 MB of memory. If you increase the number of iterations (in other words, the number of garbage memory allocations), you may get an out-of-memory error.
Next, modify the code as shown below. Note the free(name); call on line number 21. The free function releases the memory allocated by malloc.
Compile the source code and run it again.
This time our application consumed just 102.4 kB. This example clearly shows the impact of garbage objects on memory usage. In a language that does not provide automatic garbage collection, you must be aware of what you are doing. If not, your application may run out of memory and terminate.
In a low-level language like C, the developer is responsible for freeing memory after use. Failing to do so can crash your application. In Java, the Java Virtual Machine clears the garbage for you in the background. However, nothing comes for free! The more garbage you create, the more work Java has to do behind the scenes to sweep the memory. That work consumes computing resources and can have a severe impact on your software’s performance. Garbage collection algorithms will be covered in detail in another article.
From Object To Garbage
The above example demonstrated garbage in C. Let’s see how an object becomes garbage in Java. To understand that, we first need to understand how an object is created in memory. Like many programming languages, Java conceptually uses three memory regions to store runtime variables:
- Static – To store static primitives and references
- Stack – To store local primitives and references
- Heap – To store objects
Note that all objects are stored in the heap, no matter where they are created. For example, running the following code will create a Student object in the heap, but the stu reference remains on the stack. All the primitive fields and references that belong to the object (the Student class has only one integer primitive) are stored inside the object.
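As a minimal sketch of the kind of code this section describes (the class name `StudentDemo`, the field name `index`, and the helper `createStudent` are assumptions made for illustration):

```java
public class StudentDemo {

    // Hypothetical Student class with a single int field, as described in the text.
    static class Student {
        int index; // primitive field: stored inside the Student object on the heap

        Student(int index) {
            this.index = index;
        }
    }

    // Creates a Student object on the heap and returns a reference to it.
    static Student createStudent(int index) {
        return new Student(index);
    }

    public static void main(String[] args) {
        // `stu` is a local reference that lives on the stack;
        // the Student object it points to lives on the heap.
        Student stu = createStudent(10);
        System.out.println(stu.index); // prints 10
    }
}
```

When `main` returns, its stack frame (and with it the `stu` reference) disappears, while the object itself stays in the heap until the garbage collector reclaims it.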
Null Reference
If you reassign the stu reference to null after line number 3, the Student object created in the heap will become unreachable, aka garbage.
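A short sketch of this case, assuming the same hypothetical `Student` class with a single `index` field (the class name `NullReferenceDemo` is also an assumption); the comment marks the statement that turns the object into garbage:

```java
public class NullReferenceDemo {

    // Hypothetical Student class with a single int field.
    static class Student {
        int index;

        Student(int index) {
            this.index = index;
        }
    }

    public static void main(String[] args) {
        Student stu = new Student(10);
        System.out.println(stu.index); // prints 10

        // Drop the only reference to the object. The Student is still in the
        // heap for now, but nothing can reach it any more, so it is garbage.
        stu = null;
        System.out.println(stu == null); // prints true
    }
}
```

Setting the reference to null does not delete anything by itself; it only makes the object eligible for collection whenever the garbage collector next runs.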
Reassigning Reference
You don’t have to assign null to a reference to make an object garbage. You can create a new object and point the existing reference at it, which makes the old object garbage. Again, what matters is not the reference itself but whether the object in the heap is still reachable. The code given below creates a new object on line number 4 and assigns it to the existing reference. This makes the Student object with index 10 garbage.
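The reassignment case can be sketched as follows, again assuming a hypothetical `Student` class with a single `index` field (the class name `ReassignReferenceDemo` and the second index value 20 are assumptions):

```java
public class ReassignReferenceDemo {

    // Hypothetical Student class with a single int field.
    static class Student {
        int index;

        Student(int index) {
            this.index = index;
        }
    }

    public static void main(String[] args) {
        Student stu = new Student(10);
        System.out.println(stu.index); // prints 10

        // Point the existing reference at a new object. Nothing refers to
        // the Student with index 10 any more, so it becomes garbage.
        stu = new Student(20);
        System.out.println(stu.index); // prints 20
    }
}
```

The `stu` variable is never null here, yet the first object is just as unreachable as in the previous case.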
Island of Isolation
To reinforce the idea that garbage is about reachability, not about having references, let’s look at another example. In the following code, there are three Node objects referring to each other. However, after all the references a, b, and c are set to null on line numbers 14 to 16, all three Node objects become garbage: they still refer to each other, but nothing outside the group can reach them.
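One way to sketch such an island is shown below. The `Node` class with a single `next` field, the `buildCycle` helper, and the use of `WeakReference` to observe the outcome are assumptions added for illustration. `System.gc()` is only a request, so the final line may print either value depending on the JVM and its settings.

```java
import java.lang.ref.WeakReference;

public class IslandOfIsolationDemo {

    // Hypothetical Node class that refers to one other Node.
    static class Node {
        Node next;
    }

    // Builds a three-node cycle a -> b -> c -> a and returns node a.
    static Node buildCycle() {
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();
        a.next = b;
        b.next = c;
        c.next = a;
        return a;
    }

    public static void main(String[] args) {
        Node a = buildCycle();
        Node b = a.next;
        Node c = b.next;

        // A weak reference observes node a without keeping it reachable.
        WeakReference<Node> observer = new WeakReference<>(a);

        a = null;
        b = null;
        c = null;
        // The nodes still point at each other, but no live reference
        // outside the cycle leads to any of them: all three are garbage.

        System.gc(); // a request only; collection is not guaranteed to happen here
        System.out.println("Cycle collected: " + (observer.get() == null));
    }
}
```

Because the collector works from reachability rather than from counting references, the references inside the cycle do not keep the island alive.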
What is Garbage Collection?
Let’s end this article with a brief introduction to garbage collection in Java.
As mentioned earlier, Java developers do not have to worry about garbage collection to the same extent as C developers, because Java takes care of it. The Java Virtual Machine runs dedicated daemon threads that monitor heap usage and clean the heap when needed. The process of removing unreachable objects (garbage) is known as garbage collection. Though Java collects garbage automatically, creating too many garbage objects can cause serious performance issues in your application. This may not be noticeable in simple Java applications like a web service. However, the usability of complex solutions such as data processing, machine learning, or big data computations can depend on how well garbage collection performs.
The next article will cover the impact of garbage on the Java Virtual Machine and the best practices to reduce the amount of garbage. I will also write another detailed article on how to tune Java Virtual Machine garbage collection algorithms for the best performance.
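To get a rough feel for this background work, the sketch below creates a lot of short-lived garbage and then asks the JVM how many collections it has run, using the standard `java.lang.management` API. The class name `GcActivityDemo`, the helper names, and the 1 MiB buffer size are assumptions chosen for illustration; the reported count depends on the JVM, the collector, and the heap size.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcActivityDemo {

    // Sums the collection counts reported by all collectors in this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means the count is not available
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates `iterations` short-lived arrays. Each array becomes
    // garbage as soon as its loop iteration ends.
    static long createGarbage(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] buffer = new byte[1024 * 1024]; // 1 MiB, unreachable after this iteration
            buffer[0] = (byte) i;
            checksum += buffer[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        createGarbage(10_000); // about 10 GiB allocated in total, one buffer at a time
        long after = totalCollections();
        System.out.println("Collections during the loop: " + (after - before));
    }
}
```

Unlike the C experiment, this program never frees anything explicitly, yet it should normally finish without running out of memory because the collector keeps reclaiming the unreachable buffers.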
Have you found this article useful? Please let me know below in the comments. Knowing someone found my articles useful motivates me to write more. Also, comment below if you face any issues with following this article or getting it working. I will try my best to help you resolve the problem. The Java Helps community is also willing to help each other and grow together.