In this section, we are going to examine object references and a subtle security issue that can arise if references are not managed with due care. This security issue is called escaping references and we will explain when and how it occurs with the aid of an example. In addition, we will fix the issue in the example, demonstrating how to address this security concern.
Inspecting the escaping references issue
Before we can demonstrate the issue that occurs when passing (or returning) references, we need to understand Java’s call-by-value parameter passing mechanism. Let us start there, with an example.
Call-by-value
Java uses call-by-value when passing parameters to methods and returning results from methods. Put simply, this means that Java makes a copy of something. In other words, when you are passing an argument to a method, a copy is made of that argument, and when you are returning a result from a method, a copy is made of that result. Why do we care? Well, what you are copying – a primitive or a reference – can have major implications (especially for mutable types, such as StringBuilder and ArrayList). This is what we want to dig into further here. We will use a sample program and an associated diagram to help. Figure 2.5 shows the sample code:
Figure 2.5 – A call-by-value code sample
Figure 2.5 details a program where we have a simple Person class with two properties: a String name and an int (primitive) age. The constructor enables us to initialize the object state, and we have accessor/mutator methods for the instance variables.
The CallByValue class is the driver class. In main(), on line 27, a local primitive int variable, namely age, is declared and initialized to 20. On line 28, we create an object of type Person, passing in the String literal, John, and the primitive variable, age. Based on these arguments, we initialize the object state. The local variable john stores the reference to the Person object on the heap. Figure 2.6 shows the state of memory after line 28 has finished executing. For clarity, we have omitted the args array object.
Figure 2.6 – The initial state of the stack and the heap
As Figure 2.6 shows, the frame for the main() method is the current frame on the stack. It contains two local variables: the int primitive age with its value of 20 and the Person reference, john, referring to the Person object on the heap. The Person object has its two instance variables initialized: the age primitive variable is set to 20 and the name String instance variable is referring to the John String object in the String Pool (as John is a String literal, Java stores it there).
Now, we execute line 29, change(john, age);, in Figure 2.5. This is where it gets interesting. We call the change() method, passing down the john reference and the age primitive. As Java is call-by-value, a copy is made of each of the arguments. Figure 2.7 shows the stack and the heap just as we enter the change() method and are about to execute its first instruction on line 34:
Figure 2.7 – The stack and heap as the change() method is entered
In the preceding figure, we can see that a frame has been pushed onto the stack for the change() method. As Java is call-by-value, a copy is made of both arguments into local variables in the method, namely age and adult. The difference between copying a primitive and copying a reference is crucial, so we look at each in turn.
Copying a primitive
Copying a primitive is similar to photocopying a sheet of paper. If you hand the photocopy to someone else, they can do whatever they want to that sheet; you still have the original. This is what is going to happen in this program: the called change() method will alter its own copy of the primitive age variable, but the original age variable back in main() will be untouched.
Copying a reference
Copying a reference is similar to copying a remote control for a television. If you hand the second remote to someone else, they can change the channel on the television that you are watching. This is what is going to happen in this program: the called change() method will, using the adult reference, alter the name instance variable in the Person object, and the john reference back in main() will see that change.
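The two analogies can be tried out in a few lines. The following is a minimal sketch, not part of the chapter's example: the class name CopyDemo, the method names, and the variable names are all hypothetical. It also illustrates a consequence of the remote control analogy: pointing the copied reference at a different object inside the method has no effect on the caller's reference.

```java
// Hypothetical demo class; all names here are illustrative assumptions.
class CopyDemo {
    public static void main(String[] args) {
        int pages = 1;
        StringBuilder channel = new StringBuilder("News");

        scribble(pages);         // the method works on a copy of the value
        changeChannel(channel);  // the method works on the same object via a copied reference
        replaceRemote(channel);  // the method only repoints its own copy of the reference

        System.out.println(pages + " " + channel); // prints "1 Sports"
    }

    // Changes the local copy of the primitive only.
    static void scribble(int pages) {
        pages = 99;
    }

    // Mutates the object that both the caller and this method refer to.
    static void changeChannel(StringBuilder remote) {
        remote.setLength(0);
        remote.append("Sports");
    }

    // Reassigns the local copy of the reference; the caller's reference is unaffected.
    static void replaceRemote(StringBuilder remote) {
        remote = new StringBuilder("Movies");
    }
}
```

The primitive is unchanged, the mutation made through the copied reference is visible to the caller, and the reassignment is not.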
Going back to the code example from Figure 2.5, Figure 2.8 shows the stack and heap after lines 34 and 35 have finished executing but before the change() method returns to main():
Figure 2.8 – The stack and heap as the change() method is exiting
As can be seen, the age primitive in the method frame for change() has been changed to 90. In addition, a new String literal object is created for Michael in the String Pool and the name instance variable in the Person object is referring to it. This is because String objects are immutable; that is, once initialized, you cannot change the contents of String objects. Note that the Person object no longer refers to the John String object. As a String literal, however, John remains reachable through the String Pool for as long as the class that declares it is loaded, so it is not yet eligible for garbage collection.
Figure 2.9 shows the state of the stack and heap after the change() method has finished executing and control has returned to the main() method:
Figure 2.9 – The stack and heap after the change() method has finished
In Figure 2.9, the frame on the stack for the change() method has been popped. The frame for the main() method is now, once again, the current frame. You can see that the age primitive is unchanged, that is, it is still 20. The john reference is also the same. However, the change() method was able to change the name instance variable of the object that john refers to. Line 30, System.out.println(john.getName() + " " + age);, proves what has occurred by outputting Michael 20.
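The walkthrough above can be condensed into the following sketch. It is reconstructed from the description of Figure 2.5 and is not the original listing: the mutator names setName() and setAge() and the accessor getAge() are assumptions, the order of the two statements in change() is a guess, and the line numbers quoted in the text refer to the figure, not to this sketch.

```java
// Sketch of the call-by-value example; not the original Figure 2.5 listing.
class CallByValue {
    public static void main(String[] args) {
        int age = 20;                          // primitive local variable
        Person john = new Person("John", age); // reference to a Person object on the heap
        change(john, age);                     // copies of both arguments are passed
        System.out.println(john.getName() + " " + age); // prints "Michael 20"
    }

    // 'adult' is a copy of the john reference; 'age' is a copy of the int value.
    static void change(Person adult, int age) {
        age = 90;                 // alters only the local copy of the primitive
        adult.setName("Michael"); // alters the Person object shared with main()
    }
}

class Person {
    private String name;
    private int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    void setName(String name) { this.name = name; } // assumed mutator name
    int getAge() { return age; }                    // assumed accessor name
    void setAge(int age) { this.age = age; }        // assumed mutator name
}
```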
Now that we understand Java’s call-by-value mechanism, we will discuss escaping references with the aid of an example.
The problem
The principle of encapsulation in OOP is that a class’s data is private and accessible to external classes only via its public API. However, in certain situations, this is not enough to protect your private data due to escaping references. Figure 2.10 is an example of a class that suffers from escaping references:
Figure 2.10 – Code with escaping references
The preceding figure contains a Person class with one private instance variable, a StringBuilder called name. The Person constructor initializes the instance variable based on the argument passed in. The class also provides a public getName() accessor method to enable external classes to retrieve the private instance variable.
The driver class here is EscapingReferences. In main(), on line 16, a local StringBuilder object is created containing the String Dan; sb is the name of the local reference. On line 17, this reference is passed into the Person constructor in order to initialize the name instance variable in the Person object. Figure 2.11 shows the stack and heap at this point, that is, just after line 17 has finished executing. The String Pool is omitted, in the interests of clarity.
Figure 2.11 – Escaping references on the way in
At this point, the issue of escaping references is emerging. Upon executing the Person constructor, a copy of the sb reference is passed in, where it is stored in the name instance variable. Now, as Figure 2.11 shows, both the name instance variable and the local main() variable, sb, refer to the same StringBuilder object!
Now, when line 18 executes in main(), that is, sb.append("Dan");, the object is changed to DanDan for both the local sb reference and the name instance variable. When we output the instance variable on line 19, it outputs DanDan, reflecting the change.
So, that is one issue on the way in: initializing our instance variables to (copies of) the references passed in. We will address how to fix that shortly. On the way out, however, we also have an issue. Figure 2.12 demonstrates this issue:
Figure 2.12 – Escaping references on the way out
Figure 2.12 shows the stack and heap after line 21, StringBuilder sb2 = p.getName();, executes. Again, we have a local reference, this time called sb2, which refers to the same object that the name instance variable in the Person object on the heap is referring to. Thus, when we use the sb2 reference to append Dan to the StringBuilder object and then output the instance variable, we get DanDanDan.
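Both escapes can be reproduced with the following sketch. It is reconstructed from the description of Figure 2.10 rather than copied from it, so the exact statements and layout are assumptions, and the line numbers quoted in the text refer to the figure, not to this sketch.

```java
// Sketch of the escaping references problem; not the original Figure 2.10 listing.
class EscapingReferences {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("Dan");
        Person p = new Person(sb);         // the sb reference escapes on the way in
        sb.append("Dan");                  // changes the object that name refers to
        System.out.println(p.getName());   // prints "DanDan"

        StringBuilder sb2 = p.getName();   // the name reference escapes on the way out
        sb2.append("Dan");                 // changes the private data again
        System.out.println(p.getName());   // prints "DanDanDan"
    }
}

class Person {
    private StringBuilder name;

    Person(StringBuilder name) {
        this.name = name;   // stores a copy of the reference, not a copy of the object
    }

    public StringBuilder getName() {
        return name;        // hands out a copy of the reference to the private object
    }
}
```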
At this point, it is clear that just having your data private is not enough. The problem arises because StringBuilder is a mutable type, which means, at any time, you can change the (original) object. Contrast this with String objects, which are immutable (as are the wrapper types, for example, Double, Integer, Float, and Character).
Immutability
Java protects String objects because any change to a String object results in the creation of a completely new object (with the changes reflected). Thus, the code requesting a change will see the requested change (it’s just that it is a completely new object). The original String object that others may have been looking at is still untouched.
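This behavior is easy to observe. The following is a minimal sketch with hypothetical class and variable names; it uses String.concat() to request a change and then checks that the original object is untouched.

```java
// Hypothetical demo class; the names are illustrative assumptions.
class ImmutabilityDemo {
    public static void main(String[] args) {
        String original = "Dan";
        String alias = original;                  // a second reference to the same String
        String changed = original.concat("Dan");  // the "change" produces a new object

        System.out.println(original);             // prints "Dan"
        System.out.println(alias);                // prints "Dan"
        System.out.println(changed);              // prints "DanDan"
        System.out.println(changed == original);  // prints "false": two distinct objects
    }
}
```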
Now that we have discussed the issues with escaping references, let us examine how to solve them.
Finding a solution
Essentially, the solution revolves around a practice known as defensive copying. In this scenario, we do not want to store a copy of the reference to any mutable object. The same holds for returning references to our private mutable data in our accessor methods: we do not want to return a copy of the reference to the calling code.
Therefore, we need to be careful both on the way in and on the way out. The solution is to copy the object contents completely in both scenarios. This is known as a deep copy (whereas copying the references only is known as a shallow copy). Thus, on the way in, we copy the contents of the object into a new object and store the reference to the new object. On the way out, we copy the contents again and return the reference to the new object. We have protected our code in both scenarios. Figure 2.13 shows the solution to the previous code from Figure 2.10:
Figure 2.13 – Escaping references code fixed
Line 7 shows the creation of the copy object on the way in (the constructor). Line 10 shows the creation of the copy object on the way out (the accessor method). Both lines 19 and 23 output Dan, as they should. Figure 2.14 represents the stack and heap as the program is about to exit:
Figure 2.14 – The stack and heap for escaping references code fix
For clarity, we omit the String Pool. We have numbered the StringBuilder objects 1 to 5. We can match the objects to the code as follows:
- Line 16 creates object 1.
- Line 17, which calls line 7, creates object 2. The Person instance variable name refers to this object.
- Line 18 modifies object 1, changing it to DanDan (note, however, that the object referred to by the name instance variable, that is, object 2, is untouched).
- Line 19 creates object 3. The reference is passed back to main() but never stored. As Dan is output, this proves that the defensive copying on the way in is working.
- Line 21 creates object 4. The local main() reference, sb2, refers to it.
- Line 22 amends object 4 to DanDan (leaving the object that the instance variable is referring to untouched).
- Line 23 creates object 5. As Dan is output, this proves that the defensive copying on the way out is working.
Figure 2.14 shows that the StringBuilder object referred to by the name instance variable never changes from Dan. This is exactly what we wanted.
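The defensive copying described above might look like the following sketch. It is reconstructed from the description of Figure 2.13, not copied from it: the driver class name EscapingReferencesFixed is hypothetical, copying via the StringBuilder(CharSequence) constructor is one possible way to duplicate the contents, and the line numbers quoted in the text refer to the figure, not to this sketch. The comments map each statement to the numbered objects in Figure 2.14.

```java
// Sketch of the defensive copying fix; not the original Figure 2.13 listing.
class EscapingReferencesFixed {   // hypothetical driver class name
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("Dan");  // object 1
        Person p = new Person(sb);                    // constructor creates object 2
        sb.append("Dan");                             // modifies object 1 only
        System.out.println(p.getName());              // object 3; prints "Dan"

        StringBuilder sb2 = p.getName();              // object 4
        sb2.append("Dan");                            // modifies object 4 only
        System.out.println(p.getName());              // object 5; prints "Dan"
    }
}

class Person {
    private StringBuilder name;

    Person(StringBuilder name) {
        // Defensive copy on the way in: store a reference to a new object.
        this.name = new StringBuilder(name);
    }

    public StringBuilder getName() {
        // Defensive copy on the way out: return a reference to a new object.
        return new StringBuilder(name);
    }
}
```

Because a StringBuilder holds only characters, copying its contents into a new StringBuilder is sufficient here; a mutable object that itself holds references to other mutable objects would need those copied as well.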
That wraps up this chapter. We have covered a lot, so let us recap the major points.