Overview of Redis#
Basic Data Structures#
- string
- list
- set
- hash
- zset
string#
- Strings in Redis are mutable and exist in memory as byte arrays, with an internal structure similar to Java's ArrayList.
- Redis pre-allocates redundant space to reduce frequent reallocation, but it allocates no extra space initially, since append operations that modify a string are rare; this avoids paying for expansion capacity up front.
- During expansion, a string smaller than 1 MB doubles in size; above 1 MB, it grows by only 1 MB at a time. The maximum string length is 512 MB.
- There are two storage encodings: very short strings use the embstr format, while longer ones use the raw format.
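The encoding actually in use can be inspected with the object encoding command. An illustrative session (key name hypothetical; the embstr/raw cutoff is 44 bytes in Redis 3.2 and later, 39 before that):

> set msg "hello"
> object encoding msg # "embstr"
> set msg "this value is definitely longer than forty-four bytes"
> object encoding msg # "raw"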
list#
- Lists in Redis are a linked-list structure rather than an array, so insertion and deletion are fast, but indexing is slow, making lookup by position relatively expensive.
- In earlier versions, a list with few elements used a single contiguous block of memory, known as the ziplist structure, and switched to an ordinary doubly linked list once it held more elements. Because the extra pointers of a linked list waste space, this scheme was later replaced by the quicklist structure.
quicklist#
quicklist is a hybrid of ziplist and linkedlist: the linked list is split into segments, each segment stores its elements compactly in a ziplist, and the ziplists are chained together with bidirectional pointers.

- The default size of a single ziplist inside a quicklist is 8 KB, controlled by the configuration parameter list-max-ziplist-size; when a ziplist exceeds this size, a new one is started.
- To save further memory, Redis can also compress the ziplists with the LZF algorithm; the compression depth (how many nodes at each end stay uncompressed) is controlled by the configuration parameter list-compress-depth.
Common Operations:
The list structure in Redis is commonly used as an asynchronous queue: tasks that need to be processed later are serialized into strings and pushed into a Redis list, while another thread polls the list and processes the data (a sketch follows the commands below).
> rpush books python java golang
> lpop books
> rpop books
# The following operations are O(n); use with caution
> lindex books 1 # Get the value at the specified index
> lrange books 0 -1 # Traverse the values in the specified index range
> ltrim books 1 -1 # Keep only the values in the books list within the index range [1, -1]
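Below is a minimal sketch of that asynchronous-queue pattern, assuming the redis-py client and a hypothetical tasks list; blpop blocks until an element arrives, avoiding a busy poll:

```python
import json
import redis

r = redis.Redis()

def produce(task: dict) -> None:
    # Serialize the task and push it onto the right end of the list.
    r.rpush("tasks", json.dumps(task))

def handle(task: dict) -> None:
    print("processing", task)  # stand-in for real task processing

def consume_forever() -> None:
    while True:
        # blpop blocks until an element is available; timeout=0 waits forever.
        _key, payload = r.blpop("tasks", timeout=0)
        handle(json.loads(payload))

produce({"kind": "email", "to": "user@example.com"})
```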
hash#
The hash implementation in Redis is fundamentally similar to the HashMap implementation in Java, based on an array + linked list data structure.
The differences are:
- Values in Redis dictionaries can only be strings.
- The conditions that trigger expansion are different.
- The rehash operation in Java's HashMap is done all at once, whereas Redis, for high performance and to avoid blocking the service, uses a progressive rehash strategy.
Expansion Implementation#
Expansion Conditions:
Generally, when the number of elements reaches the length of the first-dimension array, the hash expands, with the new array twice the size of the old one. However, if Redis is performing a bgsave (the background asynchronous snapshot operation for RDB persistence), it tries to avoid expanding, to limit the extra memory-page copying caused by the copy-on-write mechanism (detailed reference: COW! Understanding the Copy On Write Mechanism). But if the hash table is extremely full, with the element count reaching five times the array length, it forces an expansion anyway.
Expansion Implementation:
Progressive rehash retains both old and new hash structures during rehashing. Queries will check both hash structures, and in subsequent scheduled tasks and hash operation commands, the contents of the old hash will be gradually migrated to the new hash structure. Once the migration is complete, the new hash structure will replace the old one.
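To make the idea concrete, here is a toy Python sketch of progressive rehashing (an illustration of the strategy, not Redis's actual C implementation): both tables coexist during a resize, each operation migrates at most one old bucket, and lookups consult both tables.

```python
class ProgressiveDict:
    """Toy chained-bucket dict that rehashes one bucket per operation."""

    def __init__(self):
        self.old = [[] for _ in range(4)]  # current bucket array
        self.new = None                    # non-None only while rehashing
        self.migrated = 0                  # index of the next old bucket to move

    def _rehash_step(self):
        # Move at most one non-empty old bucket into the new table.
        if self.new is None:
            return
        while self.migrated < len(self.old):
            bucket = self.old[self.migrated]
            self.old[self.migrated] = []
            self.migrated += 1
            for k, v in bucket:
                self.new[hash(k) % len(self.new)].append((k, v))
            if bucket:
                return
        # Migration finished: the new table replaces the old one.
        self.old, self.new, self.migrated = self.new, None, 0

    def set(self, k, v):
        self._rehash_step()
        table = self.new if self.new is not None else self.old
        table[hash(k) % len(table)].append((k, v))  # duplicate keys ignored for brevity
        # Start rehashing when the element count reaches the array length.
        if self.new is None and sum(len(b) for b in self.old) >= len(self.old):
            self.new = [[] for _ in range(len(self.old) * 2)]

    def get(self, k):
        self._rehash_step()
        # Queries check both the old and the new hash table.
        for table in (t for t in (self.old, self.new) if t is not None):
            for kk, vv in table[hash(k) % len(table)]:
                if kk == k:
                    return vv
        return None
```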
Common Operations:
> hset books java "think in java"
> hmset books java "think in java" golang "learning go" # Batch set (deprecated in favor of multi-field hset since Redis 4.0)
> hget books java
> hgetall books # Like entries() in Java: get all key-value pairs at once
set#
Sets in Redis are similar to Java's HashSet: internally they are implemented as a special dictionary in which every element is stored as a key with an empty value, so the key-value pairs are unordered and unique.
Common Operations: sadd, spop, smembers, sunion, etc.
Thanks to its deduplication property, the set structure can be used to store, for example, the IDs of users who won an event.
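For example, adding the same ID twice has no effect (key and IDs hypothetical):

> sadd winners 1001 # (integer) 1, added
> sadd winners 1001 # (integer) 0, duplicate ignored
> smembers winners # 1) "1001"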
zset#
Similar to a combination of Java's SortedSet and HashMap: on the one hand it is a set that guarantees the uniqueness of its values; on the other, each value can be assigned a score that determines its sorting weight.
Implementation Methods:
- ziplist: used when the number of elements is less than 128 and every element is shorter than 64 bytes.
- skiplist: used when the conditions above are not met. Concretely, it combines a dict with a skip list: the dict stores the member-to-score mapping, so the score of a given member can be found in O(1) time, while the skip list keeps elements sorted by score in ascending order, each node holding a [score, value] pair.
Common Operations: zadd, zrange, zrem, zcard, etc.
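A small illustration of scores driving the order (key and members hypothetical):

> zadd books 9 "think in java" # (integer) 1
> zadd books 8.5 "java concurrency" # (integer) 1
> zrange books 0 -1 withscores # "java concurrency" (8.5) first, then "think in java" (9)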
Skip List#
Linked list + multi-level index
Thought: Why use a skip list instead of a red-black tree (balanced binary tree)?

- Both have the same time complexity, but a skip list is simpler to implement.
- Under concurrent insertions or deletions, rebalancing a balanced binary tree may touch large parts of the tree, while skip-list operations are more localized, giving slightly better performance.
- For range queries such as the ZRANGE command, the doubly linked list at the bottom of the skip list is easy to traverse; combined with cache locality, efficiency is no worse than that of a balanced tree.
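A compact illustrative Python sketch of the structure (not Redis's C implementation): search starts at the top level, moves right while the next node's score is smaller, and drops a level when it cannot move right.

```python
import random

class Node:
    def __init__(self, score, value, level):
        self.score = score
        self.value = value
        self.forward = [None] * level  # one forward pointer per level

class SkipList:
    MAX_LEVEL = 16

    def __init__(self):
        self.head = Node(float("-inf"), None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        # Each extra level appears with probability 1/2.
        lvl = 1
        while random.random() < 0.5 and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, score, value):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].score < score:
                node = node.forward[i]
            update[i] = node  # rightmost node before the insert point, per level
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(score, value, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, score):
        node = self.head
        for i in range(self.level - 1, -1, -1):  # from the top level down
            while node.forward[i] and node.forward[i].score < score:
                node = node.forward[i]
        node = node.forward[0]
        return node.value if node and node.score == score else None
```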
Reference Links:
- Skip List - A Data Structure You May Not Have Heard Of But Is Quite Sharp
- Data Structures like Red-Black Trees, B(+) Trees, Skip Lists, AVL Trees: Application Scenarios and Analysis
Redis's Thread IO Model#
Redis is single-threaded. Although multi-threading was introduced in Redis 6.0, it is used only for network reads/writes and protocol parsing, to improve network IO efficiency; command execution itself remains single-threaded.
Why is Single-threaded Redis So Fast?#
- Most operations are in-memory, which is very fast.
- Being single-threaded avoids context switching and resource contention issues.
- It uses non-blocking IO with multiplexing.
Redis Persistence#
Redis provides two methods for persistence:
- RDB snapshots: Snapshot storage of data within specified time intervals, which is a full backup.
- AOF log files: Logs that record commands that modify data in memory, representing continuous incremental backups.
After Redis 4.0, a hybrid persistence feature was introduced.
RDB#
Related Configuration:
# Time policy: automatically trigger a snapshot if at least N keys changed within M seconds
# e.g. "save 900 1" means: after 900 seconds if at least 1 key changed
save 900 1
save 300 10
save 60 10000
# File name
dbfilename dump.rdb
# File save path
dir /home/work/app/redis/data/
# Whether to stop writing if persistence fails
stop-writes-on-bgsave-error yes
# Whether to compress
rdbcompression yes
# Whether to check during import
rdbchecksum yes
RDB persistence can be triggered in two ways: manually and automatically by Redis.
Manual Trigger#
- save: blocks the Redis server until persistence completes; not recommended for online use.
- bgsave: forks a child process that carries out the persistence, so blocking happens only during the fork itself.

The save command is rarely used; the execution flow of bgsave is as follows:

Principle of bgsave:

- fork: Redis creates a child process via fork to perform the bgsave operation.
- cow: copy-on-write. After the fork, the parent and child processes share the same data pages; the parent keeps serving reads and writes, and any page it modifies is copied first, so the child keeps working from a consistent snapshot while dirty pages gradually diverge.
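From redis-cli (replies shown as comments; lastsave reports the Unix timestamp of the last successful snapshot):

> bgsave # Background saving started
> lastsave # (integer) timestamp of the last successful save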
Automatic Trigger#
Automatic triggering scenarios:
- Automatically, according to the save m n rules configured in the configuration file.
- When a replica node performs a full copy: the master node triggers bgsave and sends the resulting RDB file to the replica to complete the copy operation.
- When debug reload is executed.
- When shutdown is executed, if AOF is not enabled.
AOF#
Related Configuration:
# Whether to enable AOF
appendonly yes
# File name
appendfilename "appendonly.aof"
# Synchronization method
appendfsync everysec
# Whether to synchronize during AOF rewriting
no-appendfsync-on-rewrite no
# Rewriting trigger configuration
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# How to handle errors when loading AOF
aof-load-truncated yes
# AOF rewriting strategy
aof-rewrite-incremental-fsync yes
The AOF process can be roughly divided into two steps: first, appending commands in real time (with appendfsync everysec configured, up to one second of writes can be lost); second, rewriting the AOF file.
After Redis receives a modification command from a client, it first validates the parameters and performs the logical processing; only if that succeeds does it append the command text to the AOF log. In other words, Redis executes a command before logging it to disk, unlike storage engines such as LevelDB and HBase, which write the log before applying the change.
AOF Rewriting#
As Redis runs for a long time, the AOF log grows longer. If the instance crashes and restarts, replaying the entire AOF log can be very time-consuming, causing Redis to be unable to provide services for an extended period. Therefore, AOF log slimming is necessary.
Redis provides the bgrewriteaof command to slim down the AOF log. It forks a child process that traverses memory and converts its contents into a series of Redis commands, serialized into a new AOF file. Incremental AOF entries produced while the rewrite is running are then appended to this new file, which replaces the old AOF file once appending is complete.
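The rewrite can also be triggered manually from redis-cli:

> bgrewriteaof # Background append only file rewriting started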
Log Loss#
AOF logs exist as files. When the program writes to the AOF log file, it actually writes the content to a memory cache allocated by the kernel for the file descriptor, and the kernel asynchronously flushes the data back to disk. If the machine suddenly crashes, the log content may not have been fully written to disk, resulting in log loss.
Linux's glibc provides the fsync(int fd) function to force a file's content from the kernel cache onto disk. On production servers, Redis typically performs an fsync roughly every second (the interval is configurable). This is a compromise between data safety and performance: throughput stays high while at most about one second of data can be lost.
Redis Synchronization Mechanism#
Redis supports master-slave and slave-slave synchronization. During the first synchronization, the master node performs a bgsave while recording subsequent modification commands in an in-memory buffer. When the bgsave completes, the RDB file is sent in full to the replica node, which loads the image into memory. Once loading finishes, it notifies the master node, which sends the buffered modification records to the replica for replay, completing the synchronization.
Redis Transactions#
Thanks to Redis's single-threaded model, its transaction model is very simple.
A typical transaction model has begin, commit, and rollback: begin marks the start of the transaction, commit submits it, and rollback undoes it.
In form, Redis looks similar with multi/exec/discard: multi marks the start of a transaction, exec executes it, and discard abandons it.
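An illustrative session (key hypothetical). Note that queued commands run in order when exec arrives, and Redis does not roll back commands that fail mid-transaction:

> multi # OK
> incr counter # QUEUED
> incr counter # QUEUED
> exec # 1) (integer) 1, 2) (integer) 2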
Redis Cache Expiration Mechanism#
As a caching system, Redis needs to clean up invalid data periodically, which requires a key expiration policy and an eviction strategy.
Expiration Deletion Strategy#
Redis places every key that has an expiration time into a separate dictionary and periodically traverses this dictionary to delete expired keys. In addition to periodic deletion, it uses a lazy strategy: when a client accesses a key, Redis checks its expiration time and deletes it immediately if it has expired. Periodic deletion is centralized, while lazy deletion is scattered across accesses.
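Expiration is set and inspected per key, e.g. (key hypothetical):

> set token abc123 # OK
> expire token 600 # (integer) 1
> ttl token # (integer) 600, seconds remaining; -1 means no expiration set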
Six Eviction Strategies#
When Redis memory exceeds the maxmemory limit, data in memory starts to swap with disk, which severely degrades Redis's performance. Redis therefore provides several eviction strategies:

- noeviction: stops serving write requests (DEL and a few other removal commands are still allowed); read requests continue. No data is lost, but write traffic is blocked. This is the default strategy.
- volatile-lru: evicts only keys that have an expiration time, least recently used first. Keys without an expiration time are never evicted, so persistent data is not suddenly lost.
- volatile-ttl: evicts only keys that have an expiration time, smallest remaining time to live (ttl) first.
- volatile-random: evicts only keys that have an expiration time, chosen at random from that set.
- allkeys-lru: unlike volatile-lru, evicts the least recently used keys from the entire key set, so keys without an expiration time can also be evicted.
- allkeys-random: evicts keys chosen at random from the entire key set.
Redis Hash Slots#
Redis clusters do not use consistent hashing but introduce the concept of hash slots. Redis clusters have 16,384 hash slots, and each key is hashed using CRC16 and then taken modulo 16,384 to determine which slot to place it in. Each node in the cluster is responsible for a portion of the hash slots.
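A sketch of the slot computation (Redis Cluster uses the CRC16/XMODEM variant; the {hash tag} rule, which hashes only the tagged substring, is omitted here for simplicity):

```python
def crc16_xmodem(data: bytes) -> int:
    # Bitwise CRC16, polynomial 0x1021, initial value 0 (the XMODEM variant).
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Each key hashes to one of 16384 slots.
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("user:1001"))  # prints this key's slot number, in [0, 16383]
```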
Redis Pipelining#
Sends multiple commands to the server without waiting for a reply, then reads the response in one step, improving IO efficiency.
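A sketch using the redis-py client (assumed); transaction=False asks for plain pipelining rather than wrapping the commands in MULTI/EXEC:

```python
import redis

r = redis.Redis()

# Queue several commands locally, send them in one round trip,
# then read all the replies at once.
pipe = r.pipeline(transaction=False)
for _ in range(3):
    pipe.incr("counter")
pipe.get("counter")
replies = pipe.execute()
print(replies)  # e.g. [1, 2, 3, b'3'] on a fresh key
```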
Redis Clustering#
- Redis Sentinel focuses on high availability: when the master fails, it automatically promotes a slave to master so the service keeps running.
- Redis Cluster focuses on scalability: when a single Redis instance runs out of memory, the cluster shards the data across multiple nodes.
Cache Scenario Issues#
Cache Penetration#
The queried data exists in neither the cache nor the database, so every request falls through to the DB; malicious high-concurrency requests for such keys can crash the backend DB.
Solution Ideas:
- Cache null values: after the first query, set the value of the non-existent key to null so that subsequent requests for it return directly; be sure to set an expiration time (see the sketch after this list).
- Bloom Filter: Use a Bloom filter to determine whether an element (a specific key) exists.
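A sketch of the cache-null-values idea (redis-py assumed; query_db is a hypothetical stand-in for the real database lookup). The sentinel gets a short TTL so a key that starts to exist later is picked up quickly:

```python
import redis

r = redis.Redis()
NULL = b"__null__"  # sentinel meaning "known not to exist"

def query_db(uid: int):
    return None  # stand-in for a real database query

def get_user(uid: int):
    key = f"user:{uid}"
    cached = r.get(key)
    if cached is not None:
        return None if cached == NULL else cached
    row = query_db(uid)
    if row is None:
        r.set(key, NULL, ex=60)   # cache the miss, short TTL
        return None
    r.set(key, row, ex=3600)      # cache the real value
    return row
```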
Cache Avalanche#
When setting cache, if the same expiration time is used, a large-scale cache invalidation may occur at a certain moment, leading to queries to the database, which may further cause excessive pressure on the database.
Solution Ideas:
- Stagger expiration times: for example, add a random value to the base expiration time so fewer keys expire at the same moment (see the sketch after this list).
- Rate limiting & degradation to avoid overwhelming the backend DB due to excessive request pressure.
- Use locks or queues to prevent too many requests from hitting the backend DB at the same time.
- Add a cache marker time to each cached data value, which is earlier than the actual set cache time. When the cache marker expires, it will notify other threads to update the cache, while still returning the old cached data until the cache is updated.
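Staggering expirations is just a matter of adding jitter when the cache is written, e.g. (a sketch, redis-py assumed):

```python
import random
import redis

r = redis.Redis()

def cache_set(key: str, value: str, base_ttl: int = 3600) -> None:
    # Up to 5 minutes of random jitter so keys don't all expire at once.
    r.set(key, value, ex=base_ttl + random.randint(0, 300))
```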
Cache Breakdown#
At a certain moment the cache for a key expires, and many concurrent queries for that key arrive at once; all of them find the cache empty, load the data from the backend DB, and update the cache. If the request volume is large enough, this can overload the DB and bring it down.
Note: cache breakdown differs from cache avalanche in that the former concerns a single key, while the latter involves many keys at once.
Solution Ideas:
Refer to the cache avalanche solutions above; they apply here as well. A per-key rebuild mutex is also common, as sketched below.
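A sketch of the rebuild mutex: only the request that wins a short-lived lock reloads the hot key from the DB, while the others wait briefly and retry (redis-py assumed; load_from_db is hypothetical):

```python
import time
import redis

r = redis.Redis()

def load_from_db(key: str) -> str:
    return "value"  # stand-in for a slow database load

def get_hot(key: str) -> bytes:
    val = r.get(key)
    if val is not None:
        return val
    lock = f"lock:{key}"
    # SET NX EX doubles as a mutex with an automatic timeout.
    if r.set(lock, "1", nx=True, ex=10):
        try:
            fresh = load_from_db(key)
            r.set(key, fresh, ex=3600)
            return fresh.encode()
        finally:
            r.delete(lock)
    time.sleep(0.05)  # lost the race: wait briefly, then retry
    return get_hot(key)
```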
Cache Warming#
After the system goes live, relevant cached data is preloaded into the cache to avoid the first request querying and loading from the DB to update the cache.
Cache Update#
Update the Database First, Then Update the Cache#
Not recommended, for the following reasons:
- Thread safety issues may cause dirty data. For example, if multiple write requests arrive, the last write request's data should be updated to the cache, but due to network fluctuations, the actual data updated to the cache may be from one of the earlier writes.
- Not suitable for write-heavy, read-light scenarios, as the cache is frequently updated, wasting performance.
Delete Cache First, Then Update the Database#
Problem Analysis:
If one request A performs an update operation while another request B performs a query operation, the following situation may occur:
- Request A performs a write operation and deletes the cache.
- Request B queries and finds that the cache does not exist.
- Request B queries the database and gets the old value.
- Request B writes the old value to the cache.
- Request A writes the new value to the database.
This situation can lead to inconsistencies. Moreover, if no expiration time strategy is set for the cache, the data will always be dirty.
Solution: Delayed Double Delete Strategy
- First, delete the cache.
- Then write to the database (these two steps are the same as before).
- Sleep for a short time, such as 1 second, and delete the cache again.
This way, dirty data generated within 1 second can be cleared, and the sleep time needs to be evaluated based on the actual business scenario.
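A sketch of the delayed double delete (redis-py assumed; db_update is hypothetical, and in production the second delete is often done asynchronously so the caller is not blocked by the sleep):

```python
import time
import redis

r = redis.Redis()

def db_update(key: str, value: str) -> None:
    pass  # stand-in for the real database write

def update(key: str, value: str) -> None:
    r.delete(key)          # 1. delete the cache
    db_update(key, value)  # 2. write the database
    time.sleep(1)          # 3. let in-flight stale reads finish
    r.delete(key)          # 4. delete again to clear any stale re-fill
```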
Update the Database First, Then Delete the Cache (Common Strategy)#
- Invalidation: the application first reads data from the cache; if it is not there, it reads from the database and, on success, puts the value into the cache.
- Hit: the application reads data from the cache and returns it.
- Update: the application first stores the data in the database and, on success, invalidates the cache entry by deleting it.
Problem Analysis:
Assuming there are two requests, one request A performs a query operation, and another request B performs an update operation, the following situation may arise:
- The cache just expires.
- Request A queries the database and gets an old value.
- Request B writes the new value to the database.
- Request B deletes the cache.
- Request A writes the old value to the cache.
The condition for data inconsistency is that the write operation to the database in step 3 is faster than the read operation in step 2. However, the speed of database read operations is generally much faster than write operations, making this scenario unlikely.