Understanding serialization and deserialization in Java

  • 2020-05-05 11:14:38
  • OfStack

This article mainly addresses the following questions:

How is Java serialization implemented? Why does a class have to implement the java.io.Serializable interface before it can be serialized, and what does that interface do?

I. Serialization of Java objects

The Java platform lets us create reusable Java objects in memory, but in general these objects exist only while the JVM is running; that is, their life cycle is no longer than that of the JVM. In a real-world application, however, you may need to save (persist) a given object after the JVM stops and read it back later. Java object serialization makes this possible.

With Java object serialization, an object's state is saved as a sequence of bytes, and those bytes can later be reassembled into an object. It is important to note that serialization saves the "state" of the object, that is, its member variables. Static variables belong to the class rather than to any instance, so object serialization does not save them.

Besides persisting objects, serialization is also used with RMI (remote method invocation) and whenever objects are passed across the network. The Java serialization API provides a standard, easy-to-use mechanism for handling object serialization, which the later sections of this article cover.
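To make "state saved as a sequence of bytes" concrete, here is a minimal sketch that serializes an object into an in-memory byte array and rebuilds it. The ByteRoundTrip and Point classes and the toBytes and fromBytes helpers are hypothetical names made up for this illustration; only the java.io stream classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ByteRoundTrip {

    // Hypothetical value class used only for this illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Turns an object's state into a byte array.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuilds an object from bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toBytes(new Point(3, 4));
        Point copy = (Point) fromBytes(bytes);
        System.out.println(bytes.length + " bytes, restored x=" + copy.x + ", y=" + copy.y);
    }
}
```

The same bytes could just as well be written to a file or a socket, which is what the persistence and network cases above amount to.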



II. How to serialize and deserialize objects

In Java, a class can be serialized as long as it implements the java.io.Serializable interface. Here is a piece of code:

code 1: a User class to serialize and deserialize


package com.hollis;
import java.io.Serializable;
import java.util.Date;

/**
 * Created by hollis on 16/2/2.
 */
public class User implements Serializable{
  private String name;
  private int age;
  private Date birthday;
  private transient String gender;
  private static final long serialVersionUID = -6849794470754667710L;

  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }

  public int getAge() {
    return age;
  }

  public void setAge(int age) {
    this.age = age;
  }

  public Date getBirthday() {
    return birthday;
  }

  public void setBirthday(Date birthday) {
    this.birthday = birthday;
  }

  public String getGender() {
    return gender;
  }

  public void setGender(String gender) {
    this.gender = gender;
  }

  @Override
  public String toString() {
    return "User{" +
        "name='" + name + '\'' +
        ", age=" + age +
        ", gender=" + gender +
        ", birthday=" + birthday +
        '}';
  }
}

code 2: a demo that serializes and deserializes User


package com.hollis;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import java.io.*;
import java.util.Date;

/**
 * Created by hollis on 16/2/2.
 */
public class SerializableDemo {

  public static void main(String[] args) {
    //Initializes The Object
    User user = new User();
    user.setName("hollis");
    user.setGender("male");
    user.setAge(23);
    user.setBirthday(new Date());
    System.out.println(user);

    //Write Obj to File
    ObjectOutputStream oos = null;
    try {
      oos = new ObjectOutputStream(new FileOutputStream("tempFile"));
      oos.writeObject(user);
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      IOUtils.closeQuietly(oos);
    }

    //Read Obj from File
    File file = new File("tempFile");
    ObjectInputStream ois = null;
    try {
      ois = new ObjectInputStream(new FileInputStream(file));
      User newUser = (User) ois.readObject();
      System.out.println(newUser);
    } catch (IOException e) {
      e.printStackTrace();
    } catch (ClassNotFoundException e) {
      e.printStackTrace();
    } finally {
      IOUtils.closeQuietly(ois);
      try {
        FileUtils.forceDelete(file);
      } catch (IOException e) {
        e.printStackTrace();
      }
    }

  }
}
//output 
//User{name='hollis', age=23, gender=male, birthday=Tue Feb 02 17:37:38 CST 2016}
//User{name='hollis', age=23, gender=null, birthday=Tue Feb 02 17:37:38 CST 2016}

III. Key points about serialization and deserialization

1. In Java, a class can be serialized as long as it implements the java.io.Serializable interface.

2. Objects are serialized and deserialized through ObjectOutputStream and ObjectInputStream.

3. Whether the virtual machine allows deserialization depends not only on whether the class path and the code are consistent, but also on whether the serialization IDs of the two classes match (that is, private static final long serialVersionUID).

4. Serialization does not save static variables.

5. To serialize the fields inherited from a parent class, the parent class must implement the Serializable interface as well.

6. The transient keyword controls which variables are serialized. Adding it before a variable declaration prevents that variable from being written to the serialized output; after deserialization the variable holds the default value for its type (null for references, 0 for int), as gender does in the output of code 2.

7. When a server sends serialized object data to a client, some of that data may be sensitive, such as a password string. The password field can be encrypted during serialization so that only a client holding the decryption key can read it after deserialization, which protects the serialized data to some extent.
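Points 4 to 6 can be checked with a small in-memory experiment. The sketch below is only an illustration: FieldRulesDemo, Base, Account and the two helper methods are hypothetical names, and the streams write to a byte array instead of a file to keep it self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class FieldRulesDemo {

    // Hypothetical parent that does NOT implement Serializable: its fields are
    // not written, and its no-arg constructor runs again on deserialization.
    static class Base {
        String label = "default";
    }

    // Hypothetical serializable subclass.
    static class Account extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instances = 0;    // static: belongs to the class, not the stream
        String owner;                // ordinary field: serialized
        transient String password;   // transient: skipped
    }

    static byte[] toBytes(Account account) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(account);
        }
        return buffer.toByteArray();
    }

    static Account fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account original = new Account();
        original.label = "custom";
        original.owner = "hollis";
        original.password = "secret";
        Account.instances = 1;

        byte[] bytes = toBytes(original);
        Account.instances = 42;                // changed after the bytes were written

        Account copy = fromBytes(bytes);
        System.out.println(copy.owner);        // hollis
        System.out.println(copy.password);     // null
        System.out.println(copy.label);        // default
        System.out.println(Account.instances); // 42
    }
}
```

The static field keeps whatever value the class currently has, the transient field comes back as null, and the field inherited from the non-serializable parent is reset by the parent's constructor.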

IV. ArrayList serialization

Before going into ArrayList serialization, let's consider one thing:

How can the serialization and deserialization policy be customized?

With this question in mind, let's look at the source of java.util.ArrayList:

code 3


public class ArrayList<E> extends AbstractList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
  private static final long serialVersionUID = 8683452581122892189L;
  transient Object[] elementData; // non-private to simplify nested class access
  private int size;
}

I have omitted the other member variables. From the code above we can see that ArrayList implements the java.io.Serializable interface, so it can be serialized and deserialized. Because elementData is transient, we would expect this member variable not to be serialized, so its contents would not survive a round trip. Let's write a demo to test that idea:

code 4


import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class ArrayListDemo {

  @SuppressWarnings("unchecked")
  public static void main(String[] args) throws IOException, ClassNotFoundException {
    List<String> stringList = new ArrayList<String>();
    stringList.add("hello");
    stringList.add("world");
    stringList.add("hollis");
    stringList.add("chuang");
    System.out.println("init StringList" + stringList);

    // Write the list to a file; try-with-resources closes the stream
    File file = new File("stringlist");
    try (ObjectOutputStream objectOutputStream = new ObjectOutputStream(new FileOutputStream(file))) {
      objectOutputStream.writeObject(stringList);
    }

    // Read the list back from the same file
    List<String> newStringList;
    try (ObjectInputStream objectInputStream = new ObjectInputStream(new FileInputStream(file))) {
      newStringList = (List<String>) objectInputStream.readObject();
    }
    if (file.exists()) {
      file.delete();
    }
    System.out.println("new StringList" + newStringList);
  }
}
//init StringList[hello, world, hollis, chuang]
//new StringList[hello, world, hollis, chuang]

Anyone familiar with ArrayList knows that it is backed by an array, so the elementData array is what actually holds the elements of the list. From the way this field is declared, we know that default serialization will not persist it. So why did code 4 preserve the elements of the List through serialization and deserialization?

V. writeObject and readObject methods

ArrayList defines two methods: writeObject and readObject.

Here is the conclusion first:

During serialization, if the class being serialized defines writeObject and readObject methods, the serialization mechanism calls those methods on the object to perform user-defined serialization and deserialization.

If the class has no such methods, the default behavior applies: that of ObjectOutputStream's defaultWriteObject and ObjectInputStream's defaultReadObject.

User-defined writeObject and readObject methods let the user control the serialization process, for example by changing the values that are written during serialization.
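As an illustration of changing a value during serialization, the sketch below picks up the password scenario from point 7 in section III. Everything in it is an assumption made for the example: the Credentials class, the SHARED_KEY field (generated once per JVM here, whereas a real client and server would have to share the key by some other means), and the choice of AES-GCM from javax.crypto. It is one possible shape, not a vetted security design.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class Credentials implements Serializable {
    private static final long serialVersionUID = 1L;

    // Assumption: sender and receiver already share this key. It is generated
    // once per JVM here only to keep the sketch self-contained.
    static final SecretKey SHARED_KEY = newKey();

    String user;
    transient String password; // transient, so default serialization never writes it

    Credentials(String user, String password) {
        this.user = user;
        this.password = password;
    }

    private static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds an AES-GCM cipher for the given mode and 12-byte nonce.
    private static Cipher newCipher(int mode, byte[] nonce) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, SHARED_KEY, new GCMParameterSpec(128, nonce));
        return cipher;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes the non-transient field: user
        try {
            byte[] nonce = new byte[12];
            new SecureRandom().nextBytes(nonce);
            byte[] plain = password.getBytes(StandardCharsets.UTF_8);
            out.writeObject(nonce);
            out.writeObject(newCipher(Cipher.ENCRYPT_MODE, nonce).doFinal(plain));
        } catch (GeneralSecurityException e) {
            throw new IOException("could not encrypt password", e);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores user
        try {
            byte[] nonce = (byte[]) in.readObject();
            byte[] encrypted = (byte[]) in.readObject();
            byte[] plain = newCipher(Cipher.DECRYPT_MODE, nonce).doFinal(encrypted);
            password = new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IOException("could not decrypt password", e);
        }
    }

    static byte[] toBytes(Credentials credentials) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(credentials);
        }
        return buffer.toByteArray();
    }

    static Credentials fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Credentials) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toBytes(new Credentials("hollis", "s3cret-pw"));
        Credentials copy = fromBytes(bytes);
        System.out.println(copy.user + " / " + copy.password);
    }
}
```

The sketch assumes password is never null. ArrayList's own writeObject and readObject, shown next, follow the same shape: call the default method first, then write or read the extra data by hand.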

Take a look at how ArrayList implements these two methods:

code 5


private void readObject(java.io.ObjectInputStream s)
    throws java.io.IOException, ClassNotFoundException {
    elementData = EMPTY_ELEMENTDATA;

    // Read in size, and any hidden stuff
    s.defaultReadObject();

    // Read in capacity
    s.readInt(); // ignored

    if (size > 0) {
      // be like clone(), allocate array based upon size not capacity
      ensureCapacityInternal(size);

      Object[] a = elementData;
      // Read in all elements in the proper order.
      for (int i=0; i<size; i++) {
        a[i] = s.readObject();
      }
    }
  }

code 6


private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);

    // Write out all elements in the proper order.
    for (int i=0; i<size; i++) {
      s.writeObject(elementData[i]);
    }

    if (modCount != expectedModCount) {
      throw new ConcurrentModificationException();
    }
  }

So why does ArrayList implement serialization this way?

Why transient

ArrayList is a dynamic array that automatically grows its backing array whenever it becomes full. If the backing array has grown to a length of 100 but holds only one element, default serialization would write 99 null elements as well. To avoid serializing all those nulls, ArrayList declares the element array transient.

Why writeObject and readObject

As mentioned earlier, ArrayList uses transient on elementData to keep an array with a large number of empty slots from being serialized, which saves storage.
However, as a collection it must still ensure that its elements are persisted during serialization, so it keeps them by defining the writeObject and readObject methods.

The writeObject method iterates over the elementData array and writes each of the first size elements to the output stream (ObjectOutputStream).

The readObject method reads the objects back from the input stream (ObjectInputStream) and assigns them to the elementData array.
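A class of your own can follow the same pattern. The sketch below is a hypothetical StringBag (not a JDK class) that, like ArrayList, keeps a transient backing array with spare capacity and writes only the slots that are in use; the toBytes and fromBytes helpers are likewise made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class StringBag implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient String[] slots = new String[100]; // mostly empty capacity
    private int size;                                   // written by defaultWriteObject

    public void add(String value) {
        if (size == slots.length) {
            slots = Arrays.copyOf(slots, size * 2);
        }
        slots[size++] = value;
    }

    public String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return slots[index];
    }

    public int size() {
        return size;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();            // writes size
        for (int i = 0; i < size; i++) {
            out.writeObject(slots[i]);       // only the used slots, no trailing nulls
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();              // restores size
        // Field initializers do not run on deserialization, so allocate here.
        slots = new String[Math.max(size, 10)];
        for (int i = 0; i < size; i++) {
            slots[i] = (String) in.readObject();
        }
    }

    static byte[] toBytes(StringBag bag) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(bag);
        }
        return buffer.toByteArray();
    }

    static StringBag fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (StringBag) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        StringBag bag = new StringBag();
        bag.add("hello");
        StringBag copy = fromBytes(toBytes(bag));
        System.out.println(copy.size() + " element(s), first = " + copy.get(0));
    }
}
```

Note that readObject has to allocate the array itself: no constructor or field initializer of the serializable class runs when the object is rebuilt.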

At this point, let's try to answer the question just posed:

1. How to customize the serialization and deserialization policy

Answer: you can add writeObject and readObject methods to the class being serialized.

2. That raises another question:

The writeObject and readObject methods are defined in ArrayList, but nothing in our code calls them explicitly.

So if a class contains writeObject and readObject methods, how are they called?

VI. ObjectOutputStream

From code 4 we can see that object serialization is carried out by ObjectOutputStream and ObjectInputStream. With the question above in mind, let's analyze how the writeObject and readObject methods in ArrayList get called.

To save space, the call stack of writeObject for ObjectOutputStream is given here:

writeObject -> writeObject0 -> writeOrdinaryObject -> writeSerialData -> invokeWriteObject

Here is invokeWriteObject, which is defined in java.io.ObjectStreamClass:


void invokeWriteObject(Object obj, ObjectOutputStream out)
    throws IOException, UnsupportedOperationException
  {
    if (writeObjectMethod != null) {
      try {
        writeObjectMethod.invoke(obj, new Object[]{ out });
      } catch (InvocationTargetException ex) {
        Throwable th = ex.getTargetException();
        if (th instanceof IOException) {
          throw (IOException) th;
        } else {
          throwMiscException(th);
        }
      } catch (IllegalAccessException ex) {
        // should not occur, as access checks have been suppressed
        throw new InternalError(ex);
      }
    } else {
      throw new UnsupportedOperationException();
    }
  }

The key is the line writeObjectMethod.invoke(obj, new Object[]{ out }): writeObjectMethod is invoked through reflection. Here is the comment on writeObjectMethod in the JDK source:

class-defined writeObject method, or null if none

In our example, this is the writeObject method defined in ArrayList, and it is called through reflection.

At this point, let's try to answer the question just posed:

If a class contains writeObject and readObject methods, how are they called?

Answer: when ObjectOutputStream's writeObject method and ObjectInputStream's readObject method are used, they are called by reflection.
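You can watch this happen from the inside. In the sketch below, a hypothetical Tracked class declares a private writeObject that our own code never calls; it records which java.io frames are on the call stack when it runs. HookDemo, Tracked, callers and serializeOnce are names invented for the example, and the exact list of frames printed depends on the JDK version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class HookDemo {

    // Hypothetical class whose private hook records who called it.
    static class Tracked implements Serializable {
        private static final long serialVersionUID = 1L;
        static final List<String> callers = new ArrayList<>();
        int value = 7;

        // Private, and never called from our own code.
        private void writeObject(ObjectOutputStream out) throws IOException {
            for (StackTraceElement frame : new Throwable().getStackTrace()) {
                if (frame.getClassName().startsWith("java.io.")) {
                    callers.add(frame.getClassName() + "." + frame.getMethodName());
                }
            }
            out.defaultWriteObject();
        }
    }

    // Serializes one Tracked instance into a throwaway in-memory buffer.
    static void serializeOnce() throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Tracked());
        }
    }

    public static void main(String[] args) throws Exception {
        // The hook can be looked up reflectively even though it is private.
        Method hook = Tracked.class.getDeclaredMethod("writeObject", ObjectOutputStream.class);
        System.out.println("hook is private: " + Modifier.isPrivate(hook.getModifiers()));

        serializeOnce();
        // Print the java.io frames that were on the stack inside the hook.
        for (String caller : Tracked.callers) {
            System.out.println(caller);
        }
    }
}
```

The recorded frames include ObjectOutputStream.writeObject, which shows that the private method was reached from inside the stream rather than from our code.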

So far we have covered the serialization of ArrayList. Some readers may now be wondering:

Serializable is clearly an empty interface. How does it ensure that only classes that implement it can be serialized and deserialized?

Definition of the Serializable interface:


public interface Serializable {
}

You can try removing implements Serializable from the User class in code 1 and then running code 2; it throws java.io.NotSerializableException.

The question is easy to answer. Let's go back to the call stack of ObjectOutputStream's writeObject:

writeObject -> writeObject0 -> writeOrdinaryObject -> writeSerialData -> invokeWriteObject

The writeObject0 method has this code:


if (obj instanceof String) {
        writeString((String) obj, unshared);
      } else if (cl.isArray()) {
        writeArray(obj, desc, unshared);
      } else if (obj instanceof Enum) {
        writeEnum((Enum<?>) obj, desc, unshared);
      } else if (obj instanceof Serializable) {
        writeOrdinaryObject(obj, desc, unshared);
      } else {
        if (extendedDebugInfo) {
          throw new NotSerializableException(
            cl.getName() + "\n" + debugInfoStack.toString());
        } else {
          throw new NotSerializableException(cl.getName());
        }
      }

During serialization, the object is checked to see whether it is a String, an array, an Enum, or an instance of Serializable; if it is none of these, NotSerializableException is thrown.
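The effect of that type check is easy to reproduce. The following sketch uses hypothetical names (TypeCheckDemo, Plain, Color, canSerialize) and simply tries to write one object of each kind to an in-memory stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class TypeCheckDemo {

    // Hypothetical class that does not implement Serializable.
    static class Plain {
        int value = 1;
    }

    // Hypothetical enum; every enum is serializable.
    enum Color { RED }

    // Returns true if ObjectOutputStream accepts the object, false if it is rejected.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize("text"));            // true
        System.out.println(canSerialize(new int[] {1, 2}));  // true
        System.out.println(canSerialize(Color.RED));         // true
        System.out.println(canSerialize(new Plain()));       // false
    }
}
```

Only the last call falls through to the final else branch, because Plain matches none of the instanceof and isArray checks.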

Conclusion

1. A class that is to be serialized must implement the Serializable interface. Otherwise a NotSerializableException is thrown, because the type is checked during serialization and the object must be a String, an array, an Enum, or Serializable.

2. Adding the transient keyword before a variable declaration prevents the variable from being serialized.

3. Adding writeObject and readObject methods to a class implements a custom serialization policy.

That is all for this article. I hope it helps with your studies.

