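KV
Basic concepts
openYuanrong datasystem (referred to below as the data system) provides a near-compute KV cache. It reads and writes KV data through shared memory without copying, which gives high-performance data caching. The KV interface also integrates with external components to provide data reliability semantics.
Sample code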
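from datasystem.ds_client import DsClient

# Connect to the data system at the given host and port.
client = DsClient("127.0.0.1", 31501)
client.init()

key = "key"
expected_val = b"value"

# Write the value, read it back (get takes a list of keys), then delete it.
client.kv().set(key, expected_val)
val = client.kv().get([key])
assert val[0] == expected_val
client.kv().delete([key])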
#include "datasystem/datasystem.h"

// Fragment of a test body; ASSERT_TRUE is assumed to be the GoogleTest macro.
ConnectOptions connectOptions = { .host = "127.0.0.1", .port = 31501 };
auto client = std::make_shared<DsClient>(connectOptions);
ASSERT_TRUE(client->Init().IsOk());

std::string key = "testKey";
std::string value = "Hello kv client";
std::string value2 = "Hello modify";

// Write the key, then read it back and compare.
Status status = client->KV()->Set(key, value);
ASSERT_TRUE(status.IsOk());
std::string getValue;
status = client->KV()->Get(key, getValue);
ASSERT_TRUE(status.IsOk());
ASSERT_TRUE(getValue == value);

// Overwrite the same key and confirm that the new value is returned.
status = client->KV()->Set(key, value2);
ASSERT_TRUE(status.IsOk());
status = client->KV()->Get(key, getValue);
ASSERT_TRUE(status.IsOk());
ASSERT_TRUE(getValue == value2);

// Delete the key; a subsequent Get is expected to fail.
status = client->KV()->Del(key);
ASSERT_TRUE(status.IsOk());
status = client->KV()->Get(key, getValue);
ASSERT_TRUE(status.IsError());
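Usage limits
A key may contain only uppercase letters, lowercase letters, digits, and the following special characters:
-_!@#%^*()+=:;
The maximum key length is 255 bytes.
There is no limit on the length of a value, but it must not exceed the configured shared memory size.
Data that has not been written to the L2 cache has no reliability guarantee and may be lost when a failure occurs.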
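The key rules above can be checked on the client side before a write. The following is a minimal sketch, not part of the data system SDK: the function name `is_valid_key` is hypothetical, and rejecting an empty key is an assumption, since the limits above do not state a minimum length.

```python
import re

# Allowed characters as listed in the usage limits:
# letters, digits, and the special characters -_!@#%^*()+=:;
_KEY_PATTERN = re.compile(r"[A-Za-z0-9\-_!@#%^*()+=:;]+")
MAX_KEY_BYTES = 255


def is_valid_key(key: str) -> bool:
    """Return True if key uses only allowed characters and fits in 255 bytes."""
    if not key:
        # Assumption: an empty key is rejected; the documented limits do not say.
        return False
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        return False
    return _KEY_PATTERN.fullmatch(key) is not None
```

Because every allowed character is ASCII, the byte length and the character length of a valid key are the same; the byte check only matters for rejecting over-long input early.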
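More about KV
Data consistency
The KV interface supports Causal-level consistency for data reads and writes. For definitions of the consistency models, see Consistency Models.
Spilling data to disk
KV data is stored in the data system's shared memory. When memory runs low, the data system can automatically spill data to disk and remove it from memory. When the data is read again, it is automatically loaded from disk back into shared memory. If disk space also runs low and the data has already been written to the L2 cache, the data is automatically deleted from both the local disk and memory. When it is read again, it is automatically loaded from the L2 cache into shared memory. To use KV spilling, specify the relevant parameters at deployment time. Spilling is disabled by default.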
# The spill path. An empty value means local disk spill is disabled.
# A new subdirectory ("datasystem_spill_data") is created under SPILL_DIRECTORY to store the spill files.
# Example: if SPILL_DIRECTORY is "/home/spill", spill files are stored in "/home/spill/datasystem_spill_data".
spillDirectory: ""
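To make the memory, disk, and L2 cache flow easier to follow, here is a toy model in Python. It is only a sketch under stated assumptions and not how the data system is implemented: the class `ToyTieredStore` and all of its members are hypothetical, capacity is counted in entries rather than bytes, and the oldest entry in memory is evicted first. An entry that has no room on disk and no copy in the L2 cache is recorded in `lost`, mirroring the warning that such data may be lost when space runs out.

```python
from collections import OrderedDict


class ToyTieredStore:
    """Toy model of the memory -> disk -> L2 flow; not the datasystem code."""

    def __init__(self, memory_slots: int, disk_slots: int):
        if memory_slots < 1:
            raise ValueError("memory_slots must be at least 1")
        self.memory_slots = memory_slots
        self.disk_slots = disk_slots
        self.memory = OrderedDict()  # oldest entry first
        self.disk = {}               # stands in for the spill directory
        self.l2 = {}                 # stands in for the external L2 cache
        self.lost = []               # keys dropped with no remaining copy

    def set(self, key, value, write_to_l2=False):
        if write_to_l2:
            self.l2[key] = value
        else:
            self.l2.pop(key, None)   # avoid keeping a stale L2 copy
        self.disk.pop(key, None)     # drop any stale spilled copy
        self._put_in_memory(key, value)

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        if key in self.disk:
            value = self.disk.pop(key)   # load back from disk
        elif key in self.l2:
            value = self.l2[key]         # load back from the L2 cache
        else:
            raise KeyError(key)
        self._put_in_memory(key, value)
        return value

    def _put_in_memory(self, key, value):
        self.memory.pop(key, None)
        while len(self.memory) >= self.memory_slots:
            old_key, old_value = self.memory.popitem(last=False)
            self._evict(old_key, old_value)
        self.memory[key] = value

    def _evict(self, key, value):
        if len(self.disk) < self.disk_slots:
            self.disk[key] = value       # spill to disk
        elif key in self.l2:
            pass                         # disk full: drop local copy, L2 keeps it
        else:
            self.lost.append(key)        # assumption: no copy left, data is lost
```

With one memory slot and one disk slot, writing three keys in a row spills the first to disk and drops the local copy of the second if it was also written to the L2 cache; a later `get` for the second key reloads it from the L2 stand-in.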
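The following spill parameters set the disk space limit, the number of concurrent spill threads, the spill file size, and related values, and can be used for performance tuning.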
# Maximum amount of spilled data that can be stored in the spill directory. If spill is enabled and spillSizeLimit is 0, spillSizeLimit is set to 95% of the space of the spill directory.
# The unit of spillSizeLimit is bytes.
spillSizeLimit: "0"
# The maximum parallelism for writing spill files. More threads consume more CPU and I/O resources.
spillThreadNum: 8
# The size limit of a single spill file. Objects larger than this value are spilled with one object per file.
# If there are some big objects, you can increase this value to avoid running out of inodes quickly.
# The valid range is 200-10240.
spillFileMaxSizeMb: 200
# The maximum number of open file descriptors used for spill. If the number of open files exceeds this value,
# some files are temporarily closed to avoid exceeding the system limit. Reduce this value if your system resources are limited.
# The value must be greater than or equal to 8.
spillFileOpenLimit: 512
# Disabling readahead can mitigate the read amplification problem for offset reads. The default is true.
spillEnableReadahead: true
# Number of eviction threads for the object cache.
evictionThreadNum: 1
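The documented ranges can be checked before deployment. The sketch below assumes the parameters have already been loaded into a plain Python dict keyed by the parameter names shown above; the helper names `spill_enabled` and `validate_spill_config` are hypothetical and not part of the data system. Only the two ranges stated in the comments are checked.

```python
def spill_enabled(cfg: dict) -> bool:
    """Spill is disabled when spillDirectory is empty (the default)."""
    return bool(cfg.get("spillDirectory", ""))


def validate_spill_config(cfg: dict) -> list:
    """Return a list of problems found against the documented ranges."""
    problems = []

    # Documented valid range: 200-10240 (default 200).
    file_max = cfg.get("spillFileMaxSizeMb", 200)
    if not 200 <= file_max <= 10240:
        problems.append(f"spillFileMaxSizeMb={file_max} is outside 200-10240")

    # Documented valid range: greater than or equal to 8 (default 512).
    open_limit = cfg.get("spillFileOpenLimit", 512)
    if open_limit < 8:
        problems.append(f"spillFileOpenLimit={open_limit} is below 8")

    return problems
```

Returning a list of messages rather than raising on the first problem lets a deployment script report every out-of-range value in one pass.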
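Data reliability
The data system's KV interface provides reliability semantics. When writing data, the writeMode parameter sets the data reliability level. Data reliability is guaranteed only when writeMode is WRITE_THROUGH_L2_CACHE or WRITE_BACK_L2_CACHE. Otherwise, data may be lost when a failure occurs or when space runs out.
The data system achieves data reliability by using an external storage component as the L2 cache. The L2 cache components currently supported are OBS and SFS.
When deploying a cluster, configure the L2 cache parameters in the data system's deployment parameters. If they are not configured, a KV write whose writeMode is WRITE_THROUGH_L2_CACHE or WRITE_BACK_L2_CACHE fails.
When deploying a cluster, specify the L2 cache type with the following parameter.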
# The type of the L2 cache. Valid values: 'obs', 'sfs'.
# The default is 'none', which means the L2 cache is not supported.
l2CacheType: "none"
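The relationship between writeMode and l2CacheType described above can be summarized in a few lines. This is a sketch over plain strings, not the SDK's API: the function names are hypothetical, and any write mode other than the two named in the text is treated as "no reliability guarantee".

```python
# The two write modes that the text says guarantee data reliability.
L2_WRITE_MODES = {"WRITE_THROUGH_L2_CACHE", "WRITE_BACK_L2_CACHE"}
# The L2 cache types listed for l2CacheType; "none" is the default.
SUPPORTED_L2_TYPES = {"obs", "sfs"}


def is_reliable(write_mode: str) -> bool:
    """Only writes that go to the L2 cache are guaranteed to be reliable."""
    return write_mode in L2_WRITE_MODES


def write_would_fail(write_mode: str, l2_cache_type: str = "none") -> bool:
    """An L2 write mode fails when no L2 cache is configured at deployment."""
    return write_mode in L2_WRITE_MODES and l2_cache_type not in SUPPORTED_L2_TYPES
```

A deployment with l2CacheType left at "none" therefore rejects both L2 write modes, while writes in other modes succeed but carry no reliability guarantee.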
The configuration parameters for each external component are as follows:
obs:
  # The access key for OBS AK/SK authentication. If the value of encryptKit is not plaintext, the key must be encrypted.
  obsAccessKey: ""
  # The secret key for OBS AK/SK authentication. If the value of encryptKit is not plaintext, the key must be encrypted.
  obsSecretKey: ""
  # OBS endpoint. Example: "xxx.hwcloudtest.cn"
  obsEndpoint: ""
  # OBS bucket name.
  obsBucket: ""
  # Whether to enable HTTPS for OBS. false: use HTTP (default), true: use HTTPS
  obsHttpsEnabled: false
  # Use cloud service token rotation to connect to OBS.
  cloudServiceTokenRotation:
    # Whether to use ccms credential rotation mode to access OBS. The default is false. If enabled, you need to specify
    # at least iamHostName, identityProvider, projectId, and regionId.
    # In addition, obsEndpoint and obsBucket need to be specified.
    enable: false
    # Domain name of the IAM token to be obtained. Example: iam.example.com.
    iamHostName: ""
    # Provider that provides permissions for the ds-worker. Example: csms-datasystem.
    identityProvider: ""
    # Project id of the OBS to be accessed. Example: fb6a00ff7ae54a5fbb8ff855d0841d00.
    projectId: ""
    # Region id of the OBS to be accessed. Example: cn-beijing-4.
    regionId: ""
    # Whether to access the OBS of other accounts by agency. The default is false. If true, you need to specify tokenAgencyName
    # and tokenAgencyDomain.
    enableTokenByAgency: false
    # Agency name for proxy access to other accounts. Example: obs_access.
    tokenAgencyName: ""
    # Agency domain for proxy access to other accounts. Example: op_svc_cff.
    tokenAgencyDomain: ""
# The path to the mounted SFS.
sfsPath: ""
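The dependencies stated in the OBS comments (which fields become mandatory once token rotation or agency access is turned on) can be checked with a short script. The sketch below assumes the `obs` section has been loaded into a Python dict with `cloudServiceTokenRotation` nested inside it, and that `enableTokenByAgency` only matters when rotation is enabled; both are assumptions about the layout, and `missing_obs_fields` is a hypothetical helper name.

```python
def missing_obs_fields(obs: dict) -> list:
    """List required OBS fields that are empty, per the config comments."""
    missing = []
    rotation = obs.get("cloudServiceTokenRotation", {})

    if rotation.get("enable", False):
        # Rotation mode needs the endpoint and bucket in addition to the IAM fields.
        required_top = ["obsEndpoint", "obsBucket"]
        required_rotation = ["iamHostName", "identityProvider", "projectId", "regionId"]

        # Agency access needs the agency name and domain as well.
        if rotation.get("enableTokenByAgency", False):
            required_rotation += ["tokenAgencyName", "tokenAgencyDomain"]

        missing += [name for name in required_top if not obs.get(name)]
        missing += [
            f"cloudServiceTokenRotation.{name}"
            for name in required_rotation
            if not rotation.get(name)
        ]

    return missing
```

An empty result means every field that the comments mark as required for the chosen mode has a non-empty value; it does not verify that the values themselves are correct.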