iOS Cache Design Details and Simple Examples

  • 2021-09-11 21:37:51
  • OfStack

iOS Cache Design

Cache design is a piece of basic computer theory, and it is also one of the important fundamental skills of a programmer. Cache is almost everywhere: the L1/L2 Cache of the CPU, the clean page and dirty page mechanism of the iOS system, the ETag mechanism of HTTP, and so on. Behind all of these lies the same Cache design idea.

Why do you need Cache

The purpose of a Cache is to pursue a faster experience, and the need for a Cache comes from the difference in cost and performance between different ways of reading the same data.

Before starting to design a Cache, it is necessary to be clear about where data can be stored. As client developers, we have many storage options:

  • The data originally lives on the Server and must be fetched through network requests. A request to the Server passes through various intermediate network nodes (such as proxies), which sometimes cache our data.
  • After downloading the data, we usually cache a copy on the local disk, so that we may not have to go back to the Server every time.
  • Once on disk, the storage format affects read speed: data stored in sqlite's B+Tree is much faster to query than an NSArray serialized directly into a file.
  • When the App starts, the system loads the data downloaded from the Server from disk into memory, and the read/write performance of memory is much better than that of disk.
  • In Memory, different data structures also differ in speed. Storing data in an NSDictionary (a hash table) gives better read/write performance than an Array, at a higher space cost.
  • Although memory is much faster than disk, it can still become a bottleneck when operating on large data sets. Registers and the L1/L2 caches are faster than Memory, but iOS App development rarely goes that deep into optimization.

Every link in the chain above differs in performance and cost, and the Server's data is naturally the most timely and accurate. However, for an App to turn Server data into, say, an NSArray in hand takes a "long" journey, and the design idea of Cache exists at every step along the way.

To understand and practice Cache well, the premise is a solid grasp of the storage media and of the differences between data structures.

Most App performance optimization that involves Cache is done in Memory. Data that would otherwise have to come from Disk, or from expensive CPU computation, can be kept in Memory in a suitable data structure; this covers most of the Cache needs in App development. Even this single level of Cache design can take different forms. Let's look at the simple, usable kind first.

Simple usable Cache

Thanks to the encapsulation of NSDictionary in Foundation, we can use the data structure of hash table to implement a simple and usable cache mechanism. Let's look at an example first:


- (NSString*)getFormattedPhoneNumber:(NSNumber*)phone
{
  if (phone == nil)
  {
    return nil;
  }

  return [PhoneFormatLib formatPhoneNumber:phone]; //CPU-intensive operation
}

This is a simple function that formats a phone number. The formatPhoneNumber call is CPU-intensive, and in our business scenario the formatted NSString for the same phone number is needed frequently. Recomputing it on every call obviously wastes CPU resources and performs poorly. We can add a simple Cache to optimize:


static NSMutableDictionary* gPhoneCache = nil;
static NSLock* gPhoneLock = nil; //protects gPhoneCache

- (NSString*)getFormattedPhoneNumber:(NSNumber*)phone
{
  if (phone == nil)
  {
    return nil;
  }

  static dispatch_once_t onceToken;
  dispatch_once(&onceToken, ^{
    gPhoneCache = @{}.mutableCopy;
    gPhoneLock = [NSLock new];
  });

  NSString* phoneNumberStr = nil;

  [gPhoneLock lock];
  phoneNumberStr = [gPhoneCache objectForKey:phone];
  if (phoneNumberStr == nil) {
    phoneNumberStr = [PhoneFormatLib formatPhoneNumber:phone];
    [gPhoneCache setObject:phoneNumberStr forKey:phone];
  }
  [gPhoneLock unlock];

  return phoneNumberStr;
}

By introducing an NSMutableDictionary, we avoid calling formatPhoneNumber repeatedly. Just like that we have finished a quick cache design that can be handed to QA immediately, with the optimization numbers presented proudly to the product manager, all thanks to the O(1) time complexity of the hash table. It does consume some extra memory, but for a small amount of data the impact is negligible; a modern hash table does not allocate a large chunk of space up front, it expands gradually as data grows.

The biggest problem with this simple, usable Cache design is that the code is too fragmented and uncontrollable. Small, scattered cache designs are practically traps waiting to be stepped in. When you write the cache, the amount of data may be small, but as the code is maintained and the business changes, no one can guarantee that the memory overhead will stay negligible. Worse, this kind of memory cost is hard to detect, cleverly hidden away in some .m file. When you later want to control the memory overhead of the whole App, you will find pitfalls everywhere and no obvious place to start. You may also have noticed that the Cache code above never releases the Cache.

All code that has side effects on the whole App needs to be managed centrally, so that it can be understood and located at the architectural level. How do we define a side effect? It can be abstracted as a kind of "write operation". Adding a new record to a Cache is a write operation, and its side effect is extra memory overhead. The essence of Cache is trading space for time, and that space cost is our side effect. One side effect tends to cause further side effects, and untangling them often requires reading a large amount of code over and over. A better approach is to centralize side-effecting code from the very beginning.

Elegant controllable Cache

The way to avoid scattered Cache code is to design an elegant and controllable Cache module. In an App, many kinds of data may need a Cache: phoneNumberCache, avatarCache, spaceshipCache, and so on. We need a single place from which to track these caches. The intuitive way is to create and hold them through a factory class:


//CacheFactory.h
@interface CacheFactory : NSObject
+ (instancetype)sharedInstance;
- (id<MyCacheProtocol>)getPhoneNumberCache;
- (void)clearPhoneNumberCache;
- (id<MyCacheProtocol>)getAvatarCache;
- (void)clearAvatarCache;
@end
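
Behind this header, the protocol and the factory might look something like the sketch below. This is only an illustration: the protocol's methods and the MyDictionaryCache class are assumptions, since the original only shows the header.

//MyCacheProtocol.h — hypothetical protocol; method names are assumptions
@protocol MyCacheProtocol <NSObject>
- (nullable id)objectForKey:(id)key;
- (void)setObject:(id)object forKey:(id)key;
- (void)removeAllObjects;
@end

//CacheFactory.m — minimal sketch: lazy creation and centralized release
@implementation CacheFactory
{
  id<MyCacheProtocol> _phoneNumberCache;
}

+ (instancetype)sharedInstance
{
  static CacheFactory* instance = nil;
  static dispatch_once_t onceToken;
  dispatch_once(&onceToken, ^{
    instance = [CacheFactory new];
  });
  return instance;
}

- (id<MyCacheProtocol>)getPhoneNumberCache
{
  @synchronized (self) {
    if (_phoneNumberCache == nil) {
      //MyDictionaryCache: any class adopting MyCacheProtocol
      _phoneNumberCache = [MyDictionaryCache new];
    }
    return _phoneNumberCache;
  }
}

- (void)clearPhoneNumberCache
{
  @synchronized (self) {
    [_phoneNumberCache removeAllObjects];
    _phoneNumberCache = nil;
  }
}
@end

The point of the sketch is that creation and release both pass through one place, so every live cache can be found, measured, and cleared from the factory.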

In this way, when we need to evaluate the memory impact of the various Caches on the whole App, we only need to start from the CacheFactory code. Debugging becomes traceable, and the engineers who take over your code will be grateful.

Separating the declaration and implementation of the cache behind a protocol is also a good habit. Another important topic is the cache eviction strategy. Different strategies perform differently: FIFO, LRU, 2Queues, and so on. There are many mature third-party cache frameworks available now, and the system provides NSCache, whose eviction strategy is not clearly documented. If you have never written a cache eviction strategy, I suggest you try building one yourself, or at least read the source code of an existing implementation. Understanding these strategies is necessary: when doing deep optimization, you need to choose one according to the circumstances.
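
As an illustration of one eviction strategy, here is a minimal LRU sketch. The class name and API are made up for this example, and it is deliberately not thread-safe or production-ready: on every access the key moves to the "most recently used" end, and when capacity is exceeded the least recently used key is evicted.

//SimpleLRUCache.h — hypothetical, for illustration only
@interface SimpleLRUCache : NSObject
- (instancetype)initWithCapacity:(NSUInteger)capacity;
- (nullable id)objectForKey:(id<NSCopying>)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
@end

@implementation SimpleLRUCache
{
  NSUInteger _capacity;
  NSMutableDictionary* _storage;
  NSMutableOrderedSet* _accessOrder; //front = least recently used
}

- (instancetype)initWithCapacity:(NSUInteger)capacity
{
  if (self = [super init]) {
    _capacity = capacity;
    _storage = [NSMutableDictionary dictionary];
    _accessOrder = [NSMutableOrderedSet orderedSet];
  }
  return self;
}

- (id)objectForKey:(id<NSCopying>)key
{
  id value = _storage[key];
  if (value != nil) {
    //move the key to the back: it is now the most recently used
    [_accessOrder removeObject:key];
    [_accessOrder addObject:key];
  }
  return value;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key
{
  _storage[key] = object;
  [_accessOrder removeObject:key];
  [_accessOrder addObject:key];
  if (_storage.count > _capacity) {
    //evict the least recently used entry
    id lruKey = _accessOrder.firstObject;
    [_accessOrder removeObjectAtIndex:0];
    [_storage removeObjectForKey:lruKey];
  }
}
@end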

A cache that is created should also be released; do not only create and never release. In fact, every operation on data should consider the data's life cycle. In business code, the Controller is mostly our basic unit, and in some scenarios the probability that a Controller is entered again after exiting is very low. Cleaning the cache promptly makes the App's overall performance better.
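
For example, assuming the CacheFactory above, a Controller that knows its data will not be needed again can release its cache when the page is destroyed (PhoneListViewController is a hypothetical class):

@implementation PhoneListViewController

- (void)dealloc
{
  //this page is the only consumer of the phone number cache,
  //so release it together with the Controller
  [[CacheFactory sharedInstance] clearPhoneNumberCache];
}
@end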

Immutable Cache

What is stored in a Cache? Data. And when it comes to Data, we have to mention Immutability. Immutability has a great bearing on the stability of our code; like the "elephant in the room", it is very important yet easily overlooked.

When practicing Immutability, classify the Data first, then decide how to achieve immutability for each kind. The most important classification is the distinction between value types and reference types. Passing a value passes a fresh memory copy, so value types are mostly safe. Passing a pointer passes the same shared memory, which is one of the main reasons pointers are dangerous. BOOL, int, long and the other primitive types are value types and can be passed around safely, while object types are usually passed as pointers and require special care: we generally pass them via copy (generating a new memory copy). This is also why Swift turned many of Objective-C's basic classes into value types and strengthened Immutability, making our code safer.

Let's look at the read and write operations of different types of data in Cache.

Value type-read

Value types can return with peace of mind:


- (int)spaceshipCount
{
  //...
  return _shipCount;
}

Value type-write

Value types can also be safely written:


- (void)setSpaceshipCount:(int)count
{
  _shipCount = count;
}

Object type-read

Object types need to return a new copy:


- (User*)luckyUser
{
  //...
  return [_luckyUser copy];
}

The copy method of an object class requires us to implement the NSCopying protocol by hand. It is tedious early in development, but the later payoff is great. Note that the copy here must be a deep copy: every property held by User needs to be copied recursively.
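
A hypothetical User class might implement such a deep copy like this (the properties shown are assumptions for illustration):

@interface User : NSObject <NSCopying>
@property (nonatomic, copy) NSString* name;
@property (nonatomic, strong) NSMutableArray* friends;
@end

@implementation User

- (id)copyWithZone:(NSZone*)zone
{
  User* copied = [[[self class] allocWithZone:zone] init];
  copied.name = self.name; //the copy attribute of the property already copies the string
  //deep copy: copyItems:YES sends copy to every element,
  //so the elements are duplicated too, not just the container
  copied.friends = [[NSMutableArray alloc] initWithArray:self.friends copyItems:YES];
  return copied;
}
@end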

Object type-write

The danger in writing an object type lies in the function's input parameter. If the parameter is an object type, what is passed in is a shared reference, so we store a copy of it:


- (void)setLuckyUser:(User*)user
{
  //...
  _luckyUser = [user copy]; 
}

Collection type-read

Collection classes also require copy; they are the hardest-hit area for bugs and crashes:


- (NSArray*)hotDishes
{
  //...
  return [_hotDishes copy];
}

Collection type-write


- (void)setHotDishes:(NSArray*)dishes
{
  //...
  _hotDishes = [dishes copy];
}

By now you may have noticed that the principle is fairly simple: as long as the data a business module gets from the Cache is an independent copy, all the hidden dangers of shared data are avoided. A Cache module is somewhat like a pure function in functional programming: it neither depends on external state nor modifies it, and it focuses only on the input (parameters) and output (return value) of each call.

Multithread safety

The focus of Cache thread safety is the handling of collection classes, since a Cache is, most of the time, managing collections of data. Note that NSString should also be treated as a collection class: from the perspective of reading, writing and thread safety, NSString and NSArray are alike in many ways. Some mature third-party Cache libraries have already handled thread safety for us. If you roll your own, pay special attention to making reads and writes atomic operations. As for how to use locks, many articles have covered that already, so it will not be repeated here.
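
If you do roll your own, one common pattern for atomic reads and writes is a concurrent GCD queue with barrier writes: reads run in parallel via dispatch_sync, while dispatch_barrier_async makes each write wait for in-flight reads and block new ones until it finishes. The class and queue label below are made up for this sketch:

@interface SafeCache : NSObject
- (nullable id)objectForKey:(id)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
@end

@implementation SafeCache
{
  dispatch_queue_t _queue;
  NSMutableDictionary* _storage;
}

- (instancetype)init
{
  if (self = [super init]) {
    //the label string is an arbitrary example
    _queue = dispatch_queue_create("com.example.safecache", DISPATCH_QUEUE_CONCURRENT);
    _storage = [NSMutableDictionary dictionary];
  }
  return self;
}

- (id)objectForKey:(id)key
{
  __block id value = nil;
  dispatch_sync(_queue, ^{ //concurrent reads may run in parallel
    value = self->_storage[key];
  });
  return value;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key
{
  dispatch_barrier_async(_queue, ^{ //exclusive access while writing
    //store an independent copy, per the Immutability section;
    //assumes the object supports copy
    self->_storage[key] = [object copy];
  });
}
@end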

Summarize

The key to understanding Cache is to understand the design ideas behind it, so as to grasp the App's behavior more comprehensively and see the data-processing bottlenecks behind each business flow. As we write more code and the business grows more complex, sooner or later we will run into the moment when a Cache design is needed.

Thank you for reading; I hope this helps everyone. Thanks for your support of this site!
