Using snappy in golang: a detailed example

  • 2020-06-12 09:18:25
  • OfStack

preface

Compression and decompression requirements come up in projects all the time: compressing data before transmission to reduce network latency, for example, or saving storage space. I ran into one of these while writing a crawler. Because it had not yet been decided what the crawling project would extract, I planned to compress each entire HTML page and store it in the database. After a fair amount of googling, I ended up with Google's own Snappy :-)

Google's snappy has the advantage of very high speed with a reasonable compression ratio. Its compression ratio is lower than gzip's, but it uses less CPU.

snappy in golang

Here are the sizes of a few simple strings and JPEG files before and after snappy compression:


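package main

import (
	"fmt"
	"io/ioutil"

	"github.com/golang/snappy"
)

var (
	// textMap holds the sample strings. "b" is "a" repeated four times
	// and "d" is roughly "c" repeated three times.
	textMap = map[string]string{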
 "a": `1234567890-=qwertyuiop[]\';lkjhgfdsazxcvbnm,./`,
 "b": `1234567890-=qwertyuiop[]\';lkjhgfdsazxcvbnm,./1234567890-=qwertyuiop[]\';lkjhgfdsazxcvbnm,./1234567890-=qwertyuiop[]\';lkjhgfdsazxcvbnm,./1234567890-=qwertyuiop[]\';lkjhgfdsazxcvbnm,./`,
 "c": ` � � � � � � � � � � � � hollow contact sent � Jia pouring Zhen turbidity measuring Hui dhi clear muddy xushuguan thick invertors � Er � � jie � � Hui � corner o9ccf � � � the her Ji � � spilled � � � � � wash our 洚 los hole � � � � �, tianjin arousing � � hong cars � isn � China continent Ru � alone but � � � huan � live saliva `,
 "d": ` � � � � � � � � � � � � hollow contact sent � Jia pouring Zhen turbidity measuring Hui dhi clear muddy xushuguan thick invertors � Er � � jie � � Hui � corner o9ccf � � � the her Ji � � spilled � � � � � wash our 洚 los hole � � � � �, tianjin arousing � � hong cars � isn � China continent Ru � alone but � � � huan live saliary � � � � � � � � � � � � � hollow contact sent � Jia pouring Zhen turbidity measuring Hui dhi clear muddy xushuguan thick invertors � Er � � jie � � Hui � corner o9ccf � � � the her Ji � � spilled � � � � � wash our 洚 los hole � � � � �, tianjin arousing � � hong cars � isn � China continent Ru � alone but � � � huan live saliary � � � � � � � � � � � � � hollow contact sent � Jia pouring Zhen turbidity measuring Hui dhi clear muddy xushuguan thick invertors � Er � � jie � � Hui The circulation of foreign readers and the washing machines can melt away the holes and enjoy the fun. The washing machine can sing and enjoy the fun `,
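	}
	// imgSrc lists JPEG files expected in the current working directory.
	imgSrc = []string{
		"1.jpg", "2.jpg", "3.jpg", "4.jpg",
	}
)

func main() {
	// Map iteration order is random in Go, so these lines may print in any order.
	for k, v := range textMap {
		got := snappy.Encode(nil, []byte(v))
		fmt.Println("k:", k, "len:", len(v), len(got))
	}
	fmt.Println("snappy jpg")
	for _, v := range imgSrc {
		buf, err := ioutil.ReadFile(v)
		if err != nil {
			// Report unreadable files instead of skipping them silently.
			fmt.Println("k:", v, "read error:", err)
			continue
		}
		got := snappy.Encode(nil, buf)
		fmt.Println("k:", v, "len:", len(buf), len(got))
	}
}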

Output (each line shows the original length followed by the compressed length, in bytes):


k: a len: 46 48
k: b len: 184 58
k: c len: 246 250
k: d len: 738 274
snappy jpg
k: 1.jpg len: 302829 282525
k: 2.jpg len: 89109 89051
k: 3.jpg len: 124463 123194
k: 4.jpg len: 420886 368608

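Compression only pays off when the input contains repeated sequences. The short strings a and c, which have no repetition, actually grow by a few bytes, while b and d, which repeat the same text several times, shrink to roughly a third of their original size. JPEG images are already compressed, so snappy gains little on them: between almost nothing and about 12% in this test.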

To compare the effect in actual use, I ran a test against a database holding 100,000 users and 100,000 articles, with relatively simple article content.

Without snappy compression:

 Elapsed time: 4m32.916312692s
 Database footprint: 176,209,920 bytes (172 MB on disk)

With snappy compression:

 Elapsed time: 4m6.750271414s
 Database footprint: 159,424,512 bytes (150.9 MB on disk)

In terms of elapsed time, the CPU time spent on compression in this example is less than the I/O time saved by writing less data, so the compressed run finished about 26 seconds sooner. Because the articles are short and their content is simple, the space saving is modest, at about 9.5%.

conclusion

