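Optimizing Redis large-object caching with Gzip compression
Problem
1. For business reasons, the cached data stored in Redis had grown too large and occupied 10+ GB of memory. Memory is a critical resource, so the large-object cache needed to be optimized.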
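Why GZIP
1. According to the comparison referenced below, gzip's compression ratio and compression speed are both above average. More importantly, once the data is gzip-compressed, the HTTP response can return it as gzip directly, which saves the decompression cost.
2. It reduces both Redis memory usage and network bandwidth.
The referenced article compares common compression methods on a 445 MB file.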
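Initial exploration
Related code
Approach 1: serialize, then Gzip-compress, then store in Redis; reverse the steps when reading.
1. The drawback is that every read has to decompress and deserialize, which adds overhead.
2. In our setup only Jedis could store raw byte[] data, but a Jedis instance is not thread-safe, and the project already uses Lettuce as its Redis client, so adding Jedis as well was undesirable.
3. Redis has to hold the gzip output as raw binary, otherwise it cannot be decoded later. The Lettuce API as used in the project has no methods for binary values, and converting the bytes to a String is lossy: the original bytes cannot be recovered from the String.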
import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.PropertyAccessor;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.serializer.Jackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.JdkSerializationRedisSerializer;
import org.springframework.data.redis.serializer.RedisSerializer;
import org.springframework.data.redis.serializer.SerializationException;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

@Slf4j
public class CompressRedis extends JdkSerializationRedisSerializer {

    public static final int BUFFER_SIZE = 4096;

    // Inner serializer: runs before compression and after decompression
    private final RedisSerializer<Object> innerSerializer;

    public CompressRedis() {
        this.innerSerializer = getValueSerializer();
    }

    @Override
    public byte[] serialize(Object graph) throws SerializationException {
        if (graph == null) {
            return new byte[0];
        }
        // try-with-resources closes both streams, even when an exception is thrown
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            // Serialize first
            byte[] bytes = innerSerializer.serialize(graph);
            // Then compress; finish() writes the gzip trailer so bos holds a complete stream
            gzip.write(bytes);
            gzip.finish();
            return bos.toByteArray();
        } catch (Exception e) {
            throw new SerializationException("Gzip serialization error", e);
        }
    }

    @Override
    public Object deserialize(byte[] bytes) throws SerializationException {
        if (bytes == null || bytes.length == 0) {
            return null;
        }
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes));
             ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
            byte[] buff = new byte[BUFFER_SIZE];
            int n;
            // Decompress first; read() returns -1 at end of stream
            while ((n = gzip.read(buff, 0, BUFFER_SIZE)) != -1) {
                bos.write(buff, 0, n);
            }
            // Then deserialize
            return innerSerializer.deserialize(bos.toByteArray());
        } catch (Exception e) {
            throw new SerializationException("Gzip deserialization error", e);
        }
    }

    private static RedisSerializer<Object> getValueSerializer() {
        Jackson2JsonRedisSerializer<Object> jackson2JsonRedisSerializer =
                new Jackson2JsonRedisSerializer<>(Object.class);
        ObjectMapper om = new ObjectMapper();
        om.setVisibility(PropertyAccessor.ALL, JsonAutoDetect.Visibility.ANY);
        // enableDefaultTyping is deprecated since Jackson 2.10 (activateDefaultTyping replaces it)
        om.enableDefaultTyping(ObjectMapper.DefaultTyping.NON_FINAL);
        jackson2JsonRedisSerializer.setObjectMapper(om);
        return jackson2JsonRedisSerializer;
    }
}
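The serialize-then-compress order used by `CompressRedis`, and the reverse order on read, can be tried without Spring or Redis. The following is a minimal sketch, assuming plain JDK serialization in place of the Jackson serializer; `GzipObjectCodec` and its method names are hypothetical and not part of the original project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration only. */
final class GzipObjectCodec {

    private GzipObjectCodec() {
    }

    /** Serializes the value and compresses the serialized bytes as they are written. */
    static byte[] encode(Serializable value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the outer stream also finishes the gzip stream underneath it
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bos))) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Reverse order: decompress first, then deserialize. */
    static Object decode(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(bytes)))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of cached object not found", e);
        }
    }
}
```

Every cache read pays for both `GZIPInputStream` and the deserializer, which is the overhead listed as drawback 1.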
2. Storing binary gzip data with Jedis
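public byte[] getCompressAndSave(String word) {
    // Encode the key explicitly as UTF-8 instead of relying on the platform default charset
    byte[] key = (SimilarFormResourceKeyCompress + "::" + word).getBytes(StandardCharsets.UTF_8);
    // Jedis is Closeable; try-with-resources releases the connection
    try (Jedis jedis = new Jedis()) {
        byte[] compress = jedis.get(key);
        if (compress == null) {
            // Cache miss: build the full response, convert it to JSON, gzip it, store the bytes
            SimilarForm similarForm = getNebulaSimilarForm(word);
            Result result = Result.success(similarForm);
            String content = JSONObject.toJSONString(result);
            compress = CompressUtil.compress(content);
            jedis.set(key, compress);
        }
        return compress;
    }
}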
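Solution: convert the complete response object (Result{code,msg,data}) to a JSON string, Gzip-compress it, and on read return the cached bytes directly as the gzip-encoded HTTP response body.
1. The advantage is that no extra decompression or deserialization is needed.
2. Returning the bytes directly as a gzip HTTP response reduces network bandwidth and improves efficiency.
Designing a gzip cache method on top of the Lettuce Redis client
1. Reason: Redis has to hold the gzip output as raw binary, otherwise it cannot be decoded later. The Lettuce API as used in the project has no methods for binary values, and converting the bytes to a String is lossy: the original bytes cannot be recovered from the String.
2. The fix is to add a RedisCompressObj class whose only job is to hold the byte[] data. This extra wrapper layer avoids handling the raw byte array directly.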
@Data
public class RedisCompressObj implements Serializable {
    private static final long serialVersionUID = 1849342735494672132L;
    private byte[] bytes;
}
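The claim that gzip bytes do not survive a conversion to String can be checked with the JDK alone. The sketch below is illustrative; `BinaryStringDemo` and its methods are hypothetical names. It also shows Base64 as one text-safe encoding. If the cache serializer is Jackson-based, a `byte[]` field is written as a Base64 string by default, which would explain why the wrapper object keeps the bytes intact.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo class for illustration only. */
final class BinaryStringDemo {

    private BinaryStringDemo() {
    }

    static byte[] gzip(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Decodes the bytes as UTF-8 text and encodes them again; invalid sequences get replaced. */
    static boolean survivesUtf8String(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return Arrays.equals(data, asText.getBytes(StandardCharsets.UTF_8));
    }

    /** Base64 maps every byte sequence to ASCII text and back without loss. */
    static boolean survivesBase64(byte[] data) {
        String asText = Base64.getEncoder().encodeToString(data);
        return Arrays.equals(data, Base64.getDecoder().decode(asText));
    }
}
```

The gzip header starts with the bytes 0x1f 0x8b, and 0x8b alone is not valid UTF-8, so the UTF-8 round trip already fails on the second byte. Base64 avoids the loss at the cost of roughly one third more space.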
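1. Convert the complete response object (Result{code,msg,data}) to a JSON string, Gzip-compress it, and store it in Redis as a RedisCompressObj. This keeps the gzip binary data intact and still lets everything go through @Cacheable.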
@Cacheable(value = SimilarFormResourceKeyCompress, key = "#word", unless = "#result == null")
public RedisCompressObj getResource(String word) {
    SimilarForm similarForm = getNebulaSimilarForm(word);
    // Wrap the data in the complete response object before converting it to JSON
    Result result = Result.success(similarForm);
    String content = JSONObject.toJSONString(result);
    byte[] compress = CompressUtil.compress(content);
    // Store the compressed bytes inside the wrapper so the cache layer never sees a raw byte[]
    RedisCompressObj redisCompressObj = new RedisCompressObj();
    redisCompressObj.setBytes(compress);
    return redisCompressObj;
}
2. Gzip-compress the JSON string directly
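import org.springframework.data.redis.serializer.SerializationException;

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressUtil {

    public static byte[] compress(String content) {
        if (content == null || content.isEmpty()) {
            return null;
        }
        // try-with-resources closes both streams, even when an exception is thrown
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            // Encode explicitly as UTF-8 so the result does not depend on the platform default charset
            gzip.write(content.getBytes(StandardCharsets.UTF_8));
            // finish() writes the gzip trailer so bos holds a complete stream
            gzip.finish();
            return bos.toByteArray();
        } catch (Exception e) {
            throw new SerializationException("Gzip compression error", e);
        }
    }
}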
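The original only needs the compress direction, because the HTTP client does the decompression. For tests or for inspecting a cached value, a matching decompress step is handy. The following is a minimal sketch using only the JDK; `GzipText` is a hypothetical stand-in for `CompressUtil`, and `decompress` is not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration only. */
final class GzipText {

    private GzipText() {
    }

    /** UTF-8 encodes the text and gzip-compresses the bytes. */
    static byte[] compress(String content) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Gunzips the bytes and decodes them as UTF-8 text. */
    static String decompress(byte[] gzipped) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buffer = new byte[4096];
            int n;
            // read() returns -1 once the compressed stream is exhausted
            while ((n = gzip.read(buffer)) != -1) {
                bos.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(bos.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Comparing `compress(json).length` with the UTF-8 length of `json` gives a quick estimate of the saving for a given payload before it goes into Redis.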
Returning the cached bytes directly as a gzip HTTP response to save bandwidth
public ResponseEntity<byte[]> getWordCompress(@RequestParam(value = "word") String word) {
    String uid = getCurrentUserId();
    // Note: this call passes extra arguments (uid, flag) that the getResource(word) listing above does not show
    RedisCompressObj redisCompressObj = similarFormHandle.getResource(word, uid, true);
    // The cached bytes are already gzip-compressed JSON
    byte[] json = redisCompressObj.getBytes();
    MultiValueMap<String, String> headers = new HttpHeaders();
    // Content-Encoding tells the client to gunzip the body before parsing it
    headers.add("Content-Encoding", "gzip");
    headers.add("Content-Type", "application/json");
    return new ResponseEntity<>(json, headers, HttpStatus.OK);
}
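The endpoint above always answers with `Content-Encoding: gzip`. That works for browsers and Postman, but a client that does not list gzip in its `Accept-Encoding` header would receive bytes it cannot read. The sketch below shows one possible fallback, assuming the cached bytes are gunzipped for such clients; `GzipNegotiation` and its methods are hypothetical, and the header parsing is deliberately simplified (it ignores the `*` wildcard).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration only. */
final class GzipNegotiation {

    private GzipNegotiation() {
    }

    /** True when the Accept-Encoding value lists gzip and does not refuse it with q=0. */
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String part : acceptEncoding.split(",")) {
            String[] pieces = part.trim().toLowerCase(Locale.ROOT).split(";");
            if (!pieces[0].trim().equals("gzip")) {
                continue;
            }
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=") && isZero(param.substring(2))) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    private static boolean isZero(String q) {
        try {
            return Double.parseDouble(q.trim()) == 0.0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Returns the cached bytes untouched for gzip-capable clients, gunzipped otherwise. */
    static byte[] bodyFor(String acceptEncoding, byte[] gzipped) {
        return acceptsGzip(acceptEncoding) ? gzipped : gunzip(gzipped);
    }

    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static byte[] gunzip(byte[] gzipped) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                bos.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

In a controller, the `Content-Encoding: gzip` header would be added only when `acceptsGzip` returns true. The fallback path pays the decompression cost, so it is meant for the rare client without gzip support.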
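Testing approach 2
1. Postman, to compare the size of the endpoint's response with and without gzip.
2. The Redis desktop tool Another-Redis-Desktop, to compare the memory used in Redis with and without gzip compression.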
Resource | HTTP response (uncompressed) | HTTP response (gzip) | Redis memory (uncompressed) | Redis memory (gzip) | Remarks |
---|---|---|---|---|---|
boot | 749KB | 29KB | 1.2MB | 39.5KB | HTTP response reduced by 96.2%, Redis memory reduced by 96.8% |
mistreat | 199.5KB | 13.9KB | 487.5KB | 14KB | HTTP response reduced by 93.1%, Redis memory reduced by 97.2% |
Monday | 55.7KB | 5.1KB | 134KB | 6.7KB | HTTP response reduced by 90.9%, Redis memory reduced by 95% |
allocation | 4.22MB | 252.6KB | 10.3MB | 336.6KB | HTTP response reduced by 94%, Redis memory reduced by 96.8% |
adoption | 659.5KB | 35.4KB | 1.59MB | 46.9KB | HTTP response reduced by 94.7%, Redis memory reduced by 97.2% |
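The percentages in the remarks column follow from (1 - compressed / uncompressed) × 100. A small sketch of that arithmetic, with a hypothetical `Savings` helper that is not part of the original project:

```java
/** Hypothetical helper for illustration only. */
final class Savings {

    private Savings() {
    }

    /** Percentage saved when a payload shrinks from originalKb to compressedKb. */
    static double percentSaved(double originalKb, double compressedKb) {
        return (1.0 - compressedKb / originalKb) * 100.0;
    }
}
```

For the Monday row, `Savings.percentSaved(134, 6.7)` gives about 95, matching the Redis memory figure in the table.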
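Conclusion
1. Among the common compression algorithms, gzip is above average in both compression ratio and compression/decompression speed, and it fits HTTP very well: the cached bytes can be returned as-is, which avoids decompression and deserialization on the server.
2. The Lettuce Redis client as used in the project cannot operate on binary data, and gzip binary data cannot be converted to a String because converting it back fails.
3. The caching mechanism that combines the Lettuce Redis client with @Cacheable serializes objects to JSON (the project configures jackson2JsonRedisSerializer) and also stores extra information about the referenced objects so that they can be deserialized back into objects, which uses additional memory.
4. Caching a RedisCompressObj (a byte[] wrapper) in Redis is the most suitable of the approaches tried; it avoids garbled data and format-conversion failures.
5. Compressing large objects with gzip reduced both Redis cache usage and endpoint traffic by more than 90%.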