Java哈希表在算法竞赛中的应用

一、哈希表基础概念

1. 什么是哈希表

哈希表（Hash Table）是一种基于哈希函数实现的数据结构，它提供了近乎O(1)时间复杂度的查询、插入和删除操作。在Java中，主要通过HashMap和HashSet两个类来实现哈希表的功能。

2. Java中的哈希表实现

HashMap：用于存储键值对
HashSet：用于存储不重复元素的集合
LinkedHashMap：保持插入顺序的HashMap
TreeMap：基于红黑树的有序映射

3. 基本操作时间复杂度

插入：O(1)平均
删除：O(1)平均
查找：O(1)平均
遍历：O(n)

二、HashMap的基本用法

1. 创建和基本操作

// 创建HashMap
HashMap<String, Integer> map = new HashMap<>();

// 插入元素
map.put("apple", 1);
map.put("banana", 2);

// 获取元素
int value = map.get("apple"); // 返回1
int defaultValue = map.getOrDefault("orange", 0); // 键不存在时返回默认值

// 检查键是否存在
boolean exists = map.containsKey("apple"); // true

// 删除元素
map.remove("apple");

// 获取大小
int size = map.size();

// 清空map
map.clear();

2. 遍历方式

// 遍历键值对
for (Map.Entry<String, Integer> entry : map.entrySet()) {
    String key = entry.getKey();
    Integer value = entry.getValue();
    System.out.println(key + ": " + value);
}

// 遍历键
for (String key : map.keySet()) {
    System.out.println(key);
}

// 遍历值
for (Integer value : map.values()) {
    System.out.println(value);
}

三、HashSet的基本用法

1. 创建和基本操作

// 创建HashSet
HashSet<String> set = new HashSet<>();

// 添加元素
set.add("apple");
set.add("banana");

// 检查元素是否存在
boolean exists = set.contains("apple"); // true

// 删除元素
set.remove("apple");

// 获取大小
int size = set.size();

// 清空set
set.clear();

2. 集合操作

HashSet<Integer> set1 = new HashSet<>(Arrays.asList(1, 2, 3));
HashSet<Integer> set2 = new HashSet<>(Arrays.asList(2, 3, 4));

// 并集
set1.addAll(set2);

// 交集
set1.retainAll(set2);

// 差集
set1.removeAll(set2);

四、竞赛中的典型应用场景

1. Two Sum问题

给定一个整数数组和目标值，找出数组中两个数之和等于目标值的下标。

public int[] twoSum(int[] nums, int target) {
    HashMap<Integer, Integer> map = new HashMap<>();
    for (int i = 0; i < nums.length; i++) {
        int complement = target - nums[i];
        if (map.containsKey(complement)) {
            return new int[] {map.get(complement), i};
        }
        map.put(nums[i], i);
    }
    return new int[0];
}

2. 字符串匹配

判断两个字符串是否是字母异位词（包含相同的字母，但排列不同）。

public boolean isAnagram(String s, String t) {
    if (s.length() != t.length()) return false;
    
    HashMap<Character, Integer> map = new HashMap<>();
    
    // 统计第一个字符串中字符出现次数
    for (char c : s.toCharArray()) {
        map.put(c, map.getOrDefault(c, 0) + 1);
    }
    
    // 检查第二个字符串
    for (char c : t.toCharArray()) {
        map.put(c, map.getOrDefault(c, 0) - 1);
        if (map.get(c) < 0) return false;
    }
    
    return true;
}

3. 频率统计

找出数组中出现次数最多的k个元素。

public List<Integer> topKFrequent(int[] nums, int k) {
    // 统计频率
    HashMap<Integer, Integer> frequency = new HashMap<>();
    for (int num : nums) {
        frequency.put(num, frequency.getOrDefault(num, 0) + 1);
    }
    
    // 使用优先队列找出前k个高频元素
    PriorityQueue<Map.Entry<Integer, Integer>> pq = 
        new PriorityQueue<>((a, b) -> b.getValue() - a.getValue());
    pq.addAll(frequency.entrySet());
    
    List<Integer> result = new ArrayList<>();
    for (int i = 0; i < k; i++) {
        result.add(pq.poll().getKey());
    }
    
    return result;
}

4. 子数组问题

求和为k的连续子数组的个数。

public int subarraySum(int[] nums, int k) {
    HashMap<Integer, Integer> map = new HashMap<>();
    map.put(0, 1);
    int sum = 0, count = 0;
    
    for (int num : nums) {
        sum += num;
        count += map.getOrDefault(sum - k, 0);
        map.put(sum, map.getOrDefault(sum, 0) + 1);
    }
    
    return count;
}

五、性能优化技巧

1. 初始容量设置

1
2
3

// 当知道大约需要存储的元素个数时，设置初始容量可以减少重哈希次数
int expectedSize = 1000;
HashMap<String, Integer> map = new HashMap<>(expectedSize);

2. 负载因子选择

1
2
3

// 默认负载因子是0.75，可以根据需求调整
float loadFactor = 0.8f;
HashMap<String, Integer> map = new HashMap<>(expectedSize, loadFactor);

3. 避免哈希冲突

// 自定义对象作为键时，确保正确实现hashCode()和equals()
class Point {
    int x, y;
    
    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
    
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
    
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Point)) return false;
        Point other = (Point) obj;
        return x == other.x && y == other.y;
    }
}

六、实战案例分析

1. LRU缓存实现

使用LinkedHashMap实现LRU（最近最少使用）缓存。

class LRUCache {
    private LinkedHashMap<Integer, Integer> cache;
    private int capacity;
    
    public LRUCache(int capacity) {
        this.capacity = capacity;
        this.cache = new LinkedHashMap<Integer, Integer>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > capacity;
            }
        };
    }
    
    public int get(int key) {
        return cache.getOrDefault(key, -1);
    }
    
    public void put(int key, int value) {
        cache.put(key, value);
    }
}

2. 单词模式匹配

判断字符串是否遵循给定的模式。

public boolean wordPattern(String pattern, String str) {
    String[] words = str.split(" ");
    if (words.length != pattern.length()) return false;
    
    HashMap<Character, String> charToWord = new HashMap<>();
    HashMap<String, Character> wordToChar = new HashMap<>();
    
    for (int i = 0; i < pattern.length(); i++) {
        char c = pattern.charAt(i);
        String word = words[i];
        
        if (!charToWord.containsKey(c)) {
            if (wordToChar.containsKey(word)) return false;
            charToWord.put(c, word);
            wordToChar.put(word, c);
        } else {
            if (!charToWord.get(c).equals(word)) return false;
        }
    }
    
    return true;
}

七、注意事项与最佳实践

选择合适的集合类型
- 需要键值对：使用HashMap
- 只需要去重：使用HashSet
- 需要保持插入顺序：使用LinkedHashMap
- 需要排序：使用TreeMap
避免常见错误
- 确保键类型的hashCode()和equals()正确实现
- 注意可变对象作为键的潜在问题
- 处理null值的情况
性能考虑
- 合理设置初始容量
- 避免频繁的扩容操作
- 注意内存使用