Loi_Vampire's Blog

自己选择的路,就算跪着也要走完

02/7
13:30
KMP NOI 字符串

BZOJ 3670 [Noi2014]动物园

Description

近日,园长发现动物园中好吃懒做的动物越来越多了。例如企鹅,只会卖萌向游客要吃的。为了整治动物园的不良风气,让动物们凭自己的真才实学向游客要吃的,园长决定开设算法班,让动物们学习算法。

某天,园长给动物们讲解KMP算法。

园长:“对于一个字符串S,它的长度为L。我们可以在O(L)的时间内,求出一个名为next的数组。有谁预习了next数组的含义吗?”

熊猫:“对于字符串S的前i个字符构成的子串,既是它的后缀又是它的前缀的字符串中(它本身除外),最长的长度记作next[i]。”

园长:“非常好!那你能举个例子吗?”

熊猫:“例S为abcababc,则next[5]=2。因为S的前5个字符为abcab,ab既是它的后缀又是它的前缀,并且找不到一个更长的字符串满足这个性质。同理,还可得出next[1] = next[2] = next[3] = 0,next[4] = next[6] = 1,next[7] = 2,next[8] = 3。”

园长表扬了认真预习的熊猫同学。随后,他详细讲解了如何在O(L)的时间内求出next数组。

下课前,园长提出了一个问题:“KMP算法只能求出next数组。我现在希望求出一个更强大num数组一一对于字符串S的前i个字符构成的子串,既是它的后缀同时又是它的前缀,并且该后缀与该前缀不重叠,将这种字符串的数量记作num[i]。例如S为aaaaa,则num[4] = 2。这是因为S的前4个字符为aaaa,其中a和aa都满足性质‘既是后缀又是前缀’,同时保证这个后缀与这个前缀不重叠。而aaa虽然满足性质‘既是后缀又是前缀’,但遗憾的是这个后缀与这个前缀重叠了,所以不能计算在内。同理,num[1] = 0,num[2] = num[3] = 1,num[5] = 2。”

最后,园长给出了奖励条件,第一个做对的同学奖励巧克力一盒。听了这句话,睡了一节课的企鹅立刻就醒过来了!但企鹅并不会做这道题,于是向参观动物园的你寻求帮助。你能否帮助企鹅写一个程序求出num数组呢?

特别地,为了避免大量的输出,你不需要输出num[i]分别是多少,你只需要输出对1,000,000,007取模的结果即可。

Input

第1行仅包含一个正整数n ,表示测试数据的组数。随后n行,每行描述一组测试数据。每组测试数据仅含有一个字符串S,S的定义详见题目描述。数据保证S 中仅含小写字母。输入文件中不会包含多余的空行,行末不会存在多余的空格。

Output

包含 n 行,每行描述一组测试数据的答案,答案的顺序应与输入数据的顺序保持一致。对于每组测试数据,仅需要输出一个整数,表示这组测试数据的答案对 1,000,000,007 取模的结果。输出文件中不应包含多余的空行。

Sample Input

3

aaaaa

ab

abcababc

Sample Output

36

1

32

HINT

n≤5,L≤1,000,000

Source

 

Solution

还是有点糊涂……毕竟人太弱 沿着fail数组统计答案,时间复杂度做到 O(n)

 

01/11
11:08
AC自动机 字符串

BZOJ 3172 [Tjoi2013]单词 AC自动机-fail树

Description

某人读论文,一篇论文是由许多单词组成。但他发现一个单词会在论文中出现很多次,现在想知道每个单词分别在论文中出现多少次。

Input

第一个一个整数N,表示有多少个单词,接下来N行每行一个单词。每个单词由小写字母组成,N<=200,单词长度不超过10^6

Output

输出N个整数,第i行的数字表示第i个单词在文章中出现了多少次。

Sample Input

3
a
aa
aaa

Sample Output

6
3
1

HINT

 

Source

Solution

朴素的做法是建好AC自动机之后,拿每一个串匹配,把路径上的点都cnt++,统计答案,然而会被卡死。 正解:将fail反向,成为一棵以0为根节点的fail树,然后向树根扫一遍,O(n)统计答案

 

01/10
21:37
KMP 字符串

BZOJ 1355 [Baltic2009]Radio Transmission KMP

Description

给你一个字符串,它是由某个字符串不断自我连接形成的。 但是这个字符串是不确定的,现在只想知道它的最短长度是多少.

Input

第一行给出字符串的长度,1 < L ≤ 1,000,000. 第二行给出一个字符串,全由小写字母组成.

Output

输出最短的长度

Sample Input

8

cabcabca

Sample Output

3

HINT

对于样例,我们可以利用"abc"不断自我连接得到"abcabcabc",读入的cabcabca,是它的子串

Source

Solution

用KMP来做……算是一个结论题吧……答案是n-fail[n]

 

01/10
21:17
AC自动机 字符串

HDU 2222 Keywords Search AC自动机

In the modern time, Search engine came into the life of everybody like Google, Baidu, etc. Wiskey also wants to bring this feature to his image retrieval system. Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched. To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match.

Input

First line will contain one integer means how many cases will follow by. Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000) Each keyword will only contains characters 'a'-'z', and the length will be not longer than 50. The last line is the description, and the length will be not longer than 1000000.

Output

Print how many keywords are contained in the description.

Sample Input

1
5
she
he
say
shr
her
yasherhs

Sample Output

3

Solution

题目大意:先给你n个串,再给你另外一个串,问在这n个串中有多少个在最后一个串中出现过。

AC自动机……第一天学,感觉理解的还不是很透彻……

AC自动机简而言之就是在tire树上跑KMP,能在O((N+M)L)的时间内完成一个串与多个串的匹配。AC自动机中的fail数组的获得和KMP算法中的fail数组基本类似。 ps: 关于时间复杂里的 N M L分别为 N个串,长度为L 和 一个长度为L的串匹配

 

01/10
16:43
KMP 字符串

POJ 3461 Oulipo KMP

Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait Pair normal, mais tout s’affirmait faux. Tout avait Fair normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : stir son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W). One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000. Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3 BAPC BAPC AZA AZAZAZA VERDI AVERDXIVYERDIAN

Sample Output

1 3 0

Source

BAPC 2006 Qualification

题目大意:每次给你两个串 s1 s2,问s1在s2中出现了多少次。

KMP做……就当先存个板子吧。