C语言跳表实现难点与优化方向？-C语言-盈思创科技

Skip List 是一种概率型数据结构，由 William Pugh 在 1990 年发明，它通过在多个层级上维护一个有序的链表，实现了类似平衡二叉搜索树（如 AVL 树、红黑树）的查找、插入、删除效率，但实现起来要简单得多。

（图片来源网络，侵删）

Skip List 的核心思想

想象一下你在一本很厚的书中查找一个单词。

第一层（顶层）：你先看目录，快速定位到大概的章节。
第二层：你翻到那个章节，看章节的子目录，定位到具体的页面范围。
第三层（底层）：你翻到那个页面，然后逐行查找，直到找到目标单词。

Skip List 的思想与此类似：

它是一个多层的有序链表。
最底层（第 0 层）：包含所有的元素，是一个完整的有序链表。
更高层：是第 0 层的一个“稀疏”子集，像高速公路一样，可以快速跨越大量元素。
查找：从最高层开始，像在高速公路上行驶一样，尽可能地向右移动，直到目标元素的前一个节点，然后下降一层，继续向右移动，直到到达第 0 层，这时就能精确找到目标元素。
插入/删除：首先执行查找过程，记录下每一层中目标元素的前驱节点，在决定是否要将新元素插入到更高一层时，通过一个“抛硬币”的随机过程来决定，正面”（rand() % 2 == 0），就再上升一层，继续抛，直到“反面”为止，这保证了跳表的层级是动态且平衡的。

Skip List 的 C 语言实现

下面是一个完整、可运行的 C 语言 Skip List 实现，我们将它拆解成几个部分：节点结构、跳表结构、核心操作函数和主函数。

1. 数据结构定义

我们需要定义两个结构体：Node（节点）和 SkipList（跳表）。

（图片来源网络，侵删）

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <stdbool.h>
// 跳表的最大层级
#define MAX_LEVEL 16
// 节点结构体
typedef struct Node {
    int key;            // 键
    int value;          // 值
    struct Node** forward; // 指向各层后继节点的指针数组
} Node;
// 跳表结构体
typedef struct SkipList {
    int level;          // 当前跳表的最大层级
    Node* header;       // 头节点
} SkipList;

Node:
- key: 用于排序和比较的键。
- value: 与键关联的数据。
- forward: 这是一个关键，它是一个指针数组，forward[i] 指向当前节点在第 i 层的后继节点，这使得一个节点可以同时存在于多个层级中。
SkipList:
- level: 记录当前跳表中实际存在的最大层级。
- header: 一个虚拟头节点，它的 forward 数组指向了每一层的第一个真实节点，这大大简化了插入和删除时的边界条件处理。

2. 辅助函数

在实现核心功能前,我们需要一些辅助函数。

创建新节点

// 创建一个新节点
Node* createNode(int level, int key, int value) {
    Node* newNode = (Node*)malloc(sizeof(Node));
    if (!newNode) {
        perror("Failed to allocate memory for new node");
        exit(EXIT_FAILURE);
    }
    newNode->key = key;
    newNode->value = value;
    // 为 forward 数组分配内存
    newNode->forward = (Node**)malloc(sizeof(Node*) * (level + 1));
    if (!newNode->forward) {
        perror("Failed to allocate memory for forward pointers");
        free(newNode);
        exit(EXIT_FAILURE);
    }
    // 初始化所有层级的后继指针为 NULL
    for (int i = 0; i <= level; i++) {
        newNode->forward[i] = NULL;
    }
    return newNode;
}

随机生成层级

这是跳表“概率”特性的核心，我们用一个简单的抛硬币模拟来决定新节点的最高层级。

（图片来源网络，侵删）

// 随机生成一个层级 (1 <= level <= MAX_LEVEL)
int randomLevel() {
    int level = 0;
    while (rand() % 2 && level < MAX_LEVEL) {
        level++;
    }
    return level;
}

3. 核心操作函数

初始化跳表

// 初始化跳表
SkipList* initSkipList() {
    SkipList* list = (SkipList*)malloc(sizeof(SkipList));
    if (!list) {
        perror("Failed to allocate memory for skip list");
        exit(EXIT_FAILURE);
    }
    list->level = 0;
    // 创建头节点，其层级为 MAX_LEVEL
    list->header = createNode(MAX_LEVEL, -1, -1); // 头节点的 key/value 无意义
    return list;
}

插入元素

插入是跳表最复杂的操作,但逻辑清晰。

// 向跳表中插入一个键值对
void insert(SkipList* list, int key, int value) {
    Node* update[MAX_LEVEL + 1]; // 保存每一层中，新节点的前驱节点
    Node* current = list->header;
    // 1. 查找插入位置，并记录每一层的前驱节点
    for (int i = list->level; i >= 0; i--) {
        while (current->forward[i] != NULL && current->forward[i]->key < key) {
            current = current->forward[i];
        }
        update[i] = current;
    }
    // current 现在在第 0 层，位于目标节点的前一个位置
    current = current->forward[0];
    // 2. key 已存在，则更新值
    if (current != NULL && current->key == key) {
        current->value = value;
        return;
    }
    // 3. key 不存在，则创建新节点
    int newLevel = randomLevel();
    // 4. 如果新节点的层级大于当前跳表的最大层级，需要更新跳表层级
    if (newLevel > list->level) {
        for (int i = list->level + 1; i <= newLevel; i++) {
            update[i] = list->header; // 新增的层，前驱都是头节点
        }
        list->level = newLevel;
    }
    // 5. 创建新节点
    Node* newNode = createNode(newLevel, key, value);
    // 6. 将新节点插入到各层中
    for (int i = 0; i <= newLevel; i++) {
        // 新节点的后继 = 前驱节点的后继
        newNode->forward[i] = update[i]->forward[i];
        // 前驱节点的后继 = 新节点
        update[i]->forward[i] = newNode;
    }
}

查找元素

查找相对简单,完美体现了跳表的快速查找思想。

// 在跳表中查找一个 key，返回对应的节点，如果没找到返回 NULL
Node* search(SkipList* list, int key) {
    Node* current = list->header;
    // 从最高层开始查找
    for (int i = list->level; i >= 0; i--) {
        // 在当前层向右移动，直到找到第一个大于 key 的节点
        while (current->forward[i] != NULL && current->forward[i]->key < key) {
            current = current->forward[i];
        }
    }
    // 下降到第 0 层，current 应该是目标节点的前驱
    current = current->forward[0];
    // 检查是否找到
    if (current != NULL && current->key == key) {
        return current;
    } else {
        return NULL; // 未找到
    }
}

删除元素

删除的逻辑与插入类似,也需要先找到前驱节点，然后在各层中“绕过”要删除的节点。

// 从跳表中删除一个 key
void delete(SkipList* list, int key) {
    Node* update[MAX_LEVEL + 1];
    Node* current = list->header;
    // 1. 查找要删除的节点，并记录每一层的前驱节点
    for (int i = list->level; i >= 0; i--) {
        while (current->forward[i] != NULL && current->forward[i]->key < key) {
            current = current->forward[i];
        }
        update[i] = current;
    }
    current = current->forward[0];
    // 2. 如果没找到，直接返回
    if (current == NULL || current->key != key) {
        printf("Key %d not found in the list.\n", key);
        return;
    }
    // 3. 从各层中移除节点
    for (int i = 0; i <= list->level; i++) {
        // 如果前驱节点的后继不是要删除的节点，说明更高层也没有了，可以 break
        if (update[i]->forward[i] != current) {
            break;
        }
        update[i]->forward[i] = current->forward[i];
    }
    // 4. 释放节点内存
    free(current->forward);
    free(current);
    // 5. 更新跳表的最大层级（如果顶层节点被删除）
    while (list->level > 0 && list->header->forward[list->level] == NULL) {
        list->level--;
    }
}

4. 打印和释放内存

为了方便调试和测试,我们还需要打印函数和释放内存的函数。

// 打印跳表
void printList(SkipList* list) {
    printf("\n*****Skip List*****\n");
    for (int i = 0; i <= list->level; i++) {
        Node* node = list->header->forward[i];
        printf("Level %d: ", i);
        while (node != NULL) {
            printf("%d:%d ", node->key, node->value);
            node = node->forward[i];
        }
        printf("\n");
    }
}
// 释放跳表内存
void freeSkipList(SkipList* list) {
    Node* current = list->header;
    Node* next;
    while (current != NULL) {
        next = current->forward[0];
        free(current->forward);
        free(current);
        current = next;
    }
    free(list);
}

5. 主函数（测试）

int main() {
    srand((unsigned)time(NULL)); // 设置随机种子
    SkipList* list = initSkipList();
    insert(list, 3, "A");
    insert(list, 6, "B");
    insert(list, 7, "C");
    insert(list, 9, "D");
    insert(list, 12, "E");
    insert(list, 19, "F");
    insert(list, 17, "G");
    insert(list, 26, "H");
    insert(list, 21, "I");
    insert(list, 25, "J");
    printf("After insertion:");
    printList(list);
    Node* node = search(list, 19);
    if (node) {
        printf("Found key 19, value: %s\n", node->value);
    } else {
        printf("Key 19 not found.\n");
    }
    delete(list, 19);
    printf("\nAfter deleting key 19:");
    printList(list);
    node = search(list, 19);
    if (node) {
        printf("Found key 19, value: %s\n", node->value);
    } else {
        printf("Key 19 not found.\n");
    }
    freeSkipList(list);
    return 0;
}

时间复杂度分析

Skip List 的性能非常出色，并且其分析相对简单。

查找:
- 在每一层,我们最多需要移动 O(1) 次（因为指针是随机的，期望跨度是常数）。
- 我们最多需要遍历 O(log n) 层。
- 平均时间复杂度: O(log n)
插入:
- 查找插入位置是 O(log n)。
- 随机生成新节点的层级是 O(1)（因为期望的层级是常数）。
- 更新各层指针是 O(log n)（最多更新 log n 层）。
- 平均时间复杂度: O(log n)
删除:
- 查找要删除的节点是 O(log n)。
- 更新各层指针是 O(log n)。
- 平均时间复杂度: O(log n)

最坏情况: 如果随机数生成器不幸，一直生成最高层级，那么跳表会退化成一个单层链表，此时所有操作的时间复杂度都是 O(n)，但这种概率极低。

Skip List 的优缺点

优点

实现简单: 相比平衡二叉搜索树，Skip List 的代码逻辑清晰，更容易实现和调试。
性能优异: 在平均情况下，其时间复杂度与平衡树相当（O(log n)）。
支持范围查询: 和链表一样，可以轻松地进行范围查找（查找大于 10 且小于 20 的所有元素）。
并发友好: 在并发环境下，Skip List 比平衡树更容易实现无锁操作，因为它的修改操作（插入、删除）只涉及局部的指针重定向，而不需要复杂的旋转操作。

缺点

空间开销: 为了维持多层级结构，Skip List 需要额外的 forward 指针数组，空间复杂度为 O(n)（虽然期望值是 O(n)，最坏情况是 O(n log n)），相比之下，平衡树的空间开销是 O(n)。
最坏性能: 如前所述，存在理论上的最坏情况 O(n)，尽管概率极低。

C 语言实现 Skip List 是一个非常好的数据结构练习，它完美地展示了如何用简单的组件（链表、随机数）构建一个高效、复杂的系统，通过这个实现，你可以深刻理解其“分层”和“概率平衡”的核心思想，并将其应用到需要高性能有序数据存储的场景中。

C语言跳表实现难点与优化方向？

Skip List 的核心思想

Skip List 的 C 语言实现

1. 数据结构定义

2. 辅助函数

3. 核心操作函数

4. 打印和释放内存

5. 主函数（测试）

时间复杂度分析

Skip List 的优缺点

优点

缺点

相关文章

目录[+]