当前位置：首页 > news >正文

题解：AtCoder AT_awc0062_d Nearly Identical Signal Patterns

news 2026/6/21 11:46:22

【题目来源】

AtCoder：D - Nearly Identical Signal Patterns

【题目描述】

Takahashi is analyzing the log of a communication system. In this system, a bit string \(S\) of length \(N\) consisting only of 0 and 1 is recorded.
高桥正在分析一个通信系统的日志。在这个系统中，记录了一个长度为 \(N\) 的、仅由 0 和 1 组成的比特字符串 \(S\)。

Takahashi wants to find pairs of "nearly identical" intervals in this signal log.
高桥想在这个信号日志中找到"近乎相同"的区间对。

A contiguous substring of \(S\) is represented by a pair of starting position \(l\) and ending position \(r\), denoted \((l, r)\) (\(1 \leq l \leq r \leq N\)). This refers to the contiguous portion from the \(l\)-th character to the \(r\)-th character of \(S\), and its length is \(r - l + 1\).
\(S\) 的一个连续子串由其起始位置 \(l\) 和结束位置 \(r\) 表示，记为 \((l, r)\)（\(1 \leq l \leq r \leq N\)）。这指的是 \(S\) 中从第 \(l\) 个字符到第 \(r\) 个字符的连续部分，其长度为 \(r - l + 1\)。

A pair of two contiguous substrings \(((l_1, r_1), (l_2, r_2))\) is called a "nearly identical" pair if and only if all of the following conditions are satisfied:
两个连续子串 \(((l_1, r_1), (l_2, r_2))\) 被称为"近乎相同"对，当且仅当满足以下所有条件：

\((l_1, r_1) \neq (l_2, r_2)\). That is, \(l_1 \neq l_2\) or \(r_1 \neq r_2\).
\((l_1, r_1) \neq (l_2, r_2)\)。也就是说，\(l_1 \neq l_2\) 或 \(r_1 \neq r_2\)。
\(r_1 - l_1 = r_2 - l_2\). That is, the two contiguous substrings have equal length.
\(r_1 - l_1 = r_2 - l_2\)。也就是说，两个连续子串的长度相等。
Let \(L\) be the length of the two contiguous substrings. When comparing the \(i\)-th characters from the beginning (\(1 \leq i \leq L\)), there is exactly \(1\) position \(i\) where the characters differ (that is, the Hamming distance is exactly \(1\)).
设 \(L\) 为两个连续子串的长度。当比较它们的第 \(i\) 个字符（\(1 \leq i \leq L\)）时，恰好有 \(1\) 个位置 \(i\) 的字符不同（即汉明距离恰好为 \(1\)）。

Pairs are counted as unordered. That is, \(((l_1, r_1), (l_2, r_2))\) and \(((l_2, r_2), (l_1, r_1))\) are considered the same pair.
无序计数对。也就是说，\(((l_1, r_1), (l_2, r_2))\) 和 \(((l_2, r_2), (l_1, r_1))\) 被视为同一个对。

Note that substrings are distinguished by their position pair \((l, r)\). Even if the actual string contents are identical, if \((l_1, r_1) \neq (l_2, r_2)\), they are treated as different substrings.
注意，子串通过其位置对 \((l, r)\) 来区分。即使实际的字符串内容相同，如果 \((l_1, r_1) \neq (l_2, r_2)\)，它们也被视为不同的子串。

Find the number of pairs that satisfy the conditions.
求满足条件的对的数量。

【输入】

\(N\)
\(S\)

The first line contains an integer \(N\) representing the length of the bit string.
The second line contains a string \(S\) of length \(N\) consisting only of 0 and 1.

【输出】

Output the number of unordered pairs satisfying the conditions as a single integer on one line.

【输入样例】

3
101

【输出样例】

【核心思想】

问题分析：给定长度为 \(N\) 的二进制字符串 \(S\)，求所有无序子串对 \(((l_1, r_1), (l_2, r_2))\) 的数量，要求两子串长度相等且汉明距离恰好为 \(1\)（即恰好有一个位置字符不同）。子串通过位置 \((l, r)\) 区分，即使内容相同，位置不同也视为不同子串。这是一个字符串哈希问题，关键在于快速判断子串相等并高效统计满足条件的对数。
算法选择：
- 字符串哈希（Rolling Hash）：预处理前缀哈希数组，将任意子串的相等判断转化为 \(O(1)\) 的哈希值比较
- 二分求 LCP：利用哈希的 \(O(1)\) 比较，二分查找两个后缀的最长公共前缀长度
- 巧妙计数：对于固定的起始位置对 \((i, j)\)，通过两次 LCP 将"恰好一个不同字符"的约束分解，将方案数转化为第二次 LCP 的长度加 \(1\)
关键步骤：
- 预处理：
  - 计算前缀哈希数组 \(h\) 和幂次数组 \(p\)（基数通常取 \(131\) 或 \(13331\)）
  - \(h[i] = h[i-1] \times P + s[i]\)，\(p[i] = p[i-1] \times P\)
- 枚举起始位置对：遍历 \(1 \leq i < j \leq N\)
- 第一次 LCP：计算 \(p_1 = \text{LCP}(i, j)\)，即从位置 \(i\) 和 \(j\) 开始的最长公共前缀长度
- 跳过不同字符：\(ni = i + p_1 + 1\)，\(nj = j + p_1 + 1\)（跳过第一个不同字符）
- 边界判断：若 \(ni > N\) 或 \(nj > N\)，说明一个子串是另一个的前缀，贡献为 \(1\)（仅长度 \(p_1\) 的子串）或直接跳过
- 第二次 LCP：计算 \(p_2 = \text{LCP}(ni, nj)\)，即跳过不同字符后，后面部分的最长公共前缀长度
- 累加贡献：\(ans \mathrel{+}= p_2 + 1\)。含义：两子串前面 \(p_1\) 个字符相同，中间 \(1\) 个字符不同，后面可以延伸 \(0\) 到 \(p_2\) 个相同字符，共 \(p_2 + 1\) 种长度选择
- 输出答案 \(ans\)
时间/空间复杂度：
- 时间复杂度：\(O(N^2 \log N)\)。枚举起始位置对 \(O(N^2)\)，每次两次 LCP 二分查找 \(O(\log N)\)
- 空间复杂度：\(O(N)\)，前缀哈希数组和幂次数组
字符串哈希的核心思想：
- 滚动哈希：将字符串看作 \(P\) 进制数，前缀哈希使任意子串的哈希值可在 \(O(1)\) 内计算：\(\text{get}(l, r) = h[r] - h[l-1] \times p[r-l+1]\)
- LCP 二分：利用哈希的 \(O(1)\) 比较能力，二分查找两个后缀首次不同的位置，从而得到最长公共前缀长度
- 组合计数转化：将"恰好一个不同"的约束拆解为"相同前缀 + 一个不同 + 相同后缀"，利用 LCP 快速计算相同前缀和相同后缀的长度，从而将方案数转化为后缀 LCP 长度加 \(1\)
- 冲突处理：使用双哈希或 \(64\) 位自然溢出（unsigned long long）降低哈希冲突概率
- 适用于子串相等判断、最长公共前缀、回文检测、字符串匹配等场景

【算法标签】

字符串哈希

【代码详解】

#include <bits/stdc++.h>
using namespace std;typedef unsigned long long ULL;  // 定义无符号长整型别名
#define int long long  // 将int定义为long long类型
const int P = 131, N = 5005;  // P:哈希基数(131), N:最大长度(5005)
int n;  // 字符串长度
int ans;  // 最终答案
string s;  // 输入字符串
ULL h[N];  // 哈希前缀数组
ULL p[N];  // 哈希基数幂次数组// 初始化哈希数组
void init()
{p[0] = 1;  // P^0 = 1h[0] = 0;  // 空串的哈希值为0for (int i = 1; i <= n; i++){p[i] = p[i - 1] * P;  // 计算P^ih[i] = h[i - 1] * P + s[i];  // 计算前缀哈希值}
}// 获取子串s[l..r]的哈希值
ULL get(int l, int r)
{return h[r] - h[l - 1] * p[r - l + 1];
}// 判断两个子串是否相等
bool substr(int l1, int r1, int l2, int r2)
{return get(l1, r1) == get(l2, r2);
}// 计算从位置a和b开始的最长公共前缀长度
int lcp(int a, int b)
{int l = 0, r = n - max(a, b) + 1;  // 二分查找范围while (l < r){int mid = (l + r + 1) >> 1;  // 向上取整if (get(a, a + mid - 1) == get(b, b + mid - 1))  // 如果前mid个字符相等{l = mid;  // 增加下界}else{r = mid - 1;  // 减少上界}}return l;  // 返回最长公共前缀长度
}signed main()  // 因为使用了#define int long long，所以用signed
{cin >> n >> s;  // 输入字符串长度和字符串s = " " + s;  // 在字符串前添加空格，使下标从1开始init();  // 初始化哈希数组// 遍历所有位置对(i,j)，其中i<jfor (int i = 1; i <= n; i++){for (int j = i + 1; j <= n; j++){int p1 = lcp(i, j);  // 计算从i和j开始的最长公共前缀长度int ni = i + p1;  // 跳过公共前缀后的位置iint nj = j + p1;  // 跳过公共前缀后的位置jif (ni > n || nj > n) continue;  // 如果跳过公共前缀后超出范围，跳过ni++;  // 移到下一个字符nj++;  // 移到下一个字符if (ni > n || nj > n)  // 如果移动后超出范围{ans += 1;  // 特殊情况：贡献为1continue;}int p2 = lcp(ni, nj);  // 计算新位置的最长公共前缀长度ans += (p2 + 1);  // 贡献为p2+1}}cout << ans << endl;  // 输出最终答案return 0;  // 程序正常结束
}

【运行结果】