《周字符串处理》PPT课件.ppt

资源描述

第六讲字符串处理 ACM算法与程序设计 2/30 一个简单的字符串操作的例子 #include #include char strl = “The quick brown dog jumps over the lazy fox”; char str250 =“The QUICK brown dog Jumps over the lazy fox”； char str340 =“The QUICK brown dog Jumps over the lazy fox”； /错误：字符串共有 43个字符，需要一个长度至少为 44的字符串变量存储。 /易忽略在字符串的末尾要添加表示结束的额外标志字符 /0。 char str450； void main(void) int result; str4 =“The QUICK brown DOG jumps over the lazy fox”; /错误：不能将一个字符串常量赋值给另一个字符串变量。 str4=str2; /错误：不能将一个字符串变量赋值绘另一个字符串变量 str4=str1; /错误：不能将一个字符串变量赋值给另一个字符串变量 printf(“Compare strings:nt%snt%snn”,strl， str2); 3/30 Look and Say 1、链接地址 Code=2886 2、题目内容 The look and say sequence is defined as follows. Start with any string of digits as the first element in the sequence. Each subsequent element is defined from the previous one by verbally describing the previous element. For example, the string 122344111 can be described as one 1, two 2s, one 3, two 4s, three 1s. Therefore, the element that comes after 122344111 in the sequence is 1122132431. Similarly, the string 101 comes after 1111111111. 4/30 Input The input consists of a number of cases. The first line gives the number of cases to follow. Each case consists of a line of up to 1000 digits. Output For each test case, print the string that follows the given string. Sample Input 3 122344111 1111111111 12345 Sample Output 1122132431 101 1112131415 5/30 3、解题思路本题是处理重复子串的问题。虽然输入的都是数字，但应当把它们当成字符串处理。由于本题时限一秒，每个字符串的长度多达 1000位，所以，不好的算法容易超时。 scanf和 printf所用的时间大大少于 cin和 cout所消耗的时间。由于本题需要频繁输出，采用 cout则会超过一秒；但使用 printf则不会超过一秒。这点是 ACM竞赛中节约时间的赏识。一般地，由于 cin和 cout调试很方便，所以调度期间使用它们，但是提交判题系统后，如果发现超时，可尝试将 cin和 cout改为 scanf和 printf，看看是不是由于输入输出过于频繁而导致的。 6/30 4、参考程序 #include #include int main() char s1001,t; int c,i,j,n,len,temp; scanf(%d, for(i=0;in;i+) scanf(%s,s); c=0; t=s0; temp=0; len=strlen(s); 7/30 4、参考程序 for(j=0;jlen;j+) if(sj=t) temp+; if(j=len-1) printf(%d%c,temp,t); else printf(%d%c,temp,t); t=sj; temp=1; if(j=len-1) printf(%d%c,temp,t); printf(n); return 0; 8/30 Abbreviation 1、链接地址 Code=2947 2、题目内容 When a Little White meets another Little White: Little White A: (Surprised) ! Little White B: ? Little White A: You Little White know SHDC? So unbelievable! Little White B: You are little white! Little white is you! What is SHDC you are talking about? Little White A: Wait. I mean Super Hard-disc Drive Cooler. Little White B: I mean Spade Heart Diamond Club. Duck talks with chicken -_-/ Little White A: Duck. chicken. faint! 9/30 -quote from qmd of Spade6 in CC98 forum. Sometimes, we write the abbreviation of a name. For example IBM is the abbreviation for International Business Machines. A name usually consists of one or more words. A word begins with a capital letter (A - Z) and followed by zero or more lower-case letters (a - z). The abbreviation for a name is the word that consists of all the first letters of the words. Now, you are given two names and asked to decide whether their abbreviations are the same. 10/30 Input Standard input will contain multiple test cases. The first line of the input is a single integer T which is the number of test cases. And it will be followed by T consecutive test cases. There are four lines for each case. The first line contains an integer N (1 = N = 5), indicating the number of words in the first name. The second line shows the first name. The third line contains an integer M (1 = M = 5), indicating the number of words in the second name. The fourth line shows the second name. Each name consists of several words separated by space. Length for every word is less than 10. The first letter for each word is always capital and the rest ones are lower-case. 11/30 Output Results should be directed to standard output. The output of each test case should be a single line. If two names abbreviations are the same, output SAME, otherwise output DIFFERENT. 12/30 Sample Input 3 4 Super Harddisc Drive Cooler 4 Spade Heart Diamond Club 3 Shen Guang Hao 3 Shuai Ge Hao 3 Cai Piao Ge 4 C P C S Sample Output SAME SAME DIFFERENT 13/30 3、解题思路本题是比较两个缩写词是否相同，而缩写词又是从一个包含多个单词的名字中合成的。每次读入一个单词，然后取出它的第一个单词，连接在字符串上，就组成一个缩写词。本题的难点在单词的读取控制上，对于输入数据的控制，是 ACM竞赛中考查的一个重要方面。另外，大家可以试试，使用 printf输出比使用 cout输出快很多。 14/30 4、参考程序 #include #include int main() char s11,ssa6,ssb6; int i,j,t,n; scanf(%d, for(i=0;it;i+) scanf(%d, for(j=0;jn;j+) scanf(%s,s); ssaj=s0;/只取首字母 ssaj=0;/作字符串处理 scanf(%d, for(j=0;jn;j+) scanf(%s,s); ssbj=s0; ssbj=0; if(strcmp(ssa,ssb)=0) printf(SAMEn); else printf(DIFFERENTn); return 0; 15/30 Caesar 密码 South Central USA 2002 Description Julius Caesar 生活在充满危险和阴谋的年代。为了生存，他首次发明了密码，用于军队的消息传递。假设你是 Caesar 军团中的一名军官，需要把 Caesar 发送的消息破译出来、并提供给你的将军。消息加密的办法是：对消息原文中的每个字母，分别用该字母之后的第 5个字母替换（例如：消息原文中的每个字母 A都分别替换成字母 F），其他字符不变，并且消息原文的所有字母都是大写的。密码字母： A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 原文字母： V W X Y Z A B C D E F G H I J K L M N O P Q R S T U 16/30 Input 最多不超过 100个数据集组成。每个数据集由 3 部分组成：起始行： START 密码消息：由 1到 200个字符组成一行，表示 Caesar发出的一条消息结束行： END 在最后一个数据集之后，是另一行： ENDOFINPUT Output 每个数据集对应一行，是 Caesar 的原始消息。 17/30 Sample Input START NS BFW, JAJSYX TK NRUTWYFSHJ FWJ YMJ WJXZQY TK YWNANFQ HFZXJX END START N BTZQI WFYMJW GJ KNWXY NS F QNYYQJ NGJWNFS ANQQFLJ YMFS XJHTSI NS WTRJ END START IFSLJW PSTBX KZQQ BJQQ YMFY HFJXFW NX RTWJ IFSLJWTZX YMFS MJ END ENDOFINPUT Sample Output IN WAR, EVENTS OF IMPORTANCE ARE THE RESULT OF TRIVIAL CAUSES I WOULD RATHER BE FIRST IN A LITTLE IBERIAN VILLAGE THAN SECOND IN ROME DANGER KNOWS FULL WELL THAT CAESAR IS MORE DANGEROUS THAN HE 18/30 问题分析问题需要将密码消息中的每个字母分别进行相应的变换。关键之处是识别输入数据中的消息行、读入消息行的数据。 scanf函数使用输入字符串时，每个字符串中不能有空格。 gets函数一次可读入一行，但有可能会导致 warning message fgets(str, sizeof str, stdin) = gets(str) 19/30 解密时，需将消息中单词的字符串作为普通数组，一次变换其中每个字母。需用以下几个字符串处理函数： gets：读入一行字符串，允许包含空格 strcmp：识别消息行的 start和 end strlen：计算加密消息中的每个单词的长度 20/30 程序 #include #include void decipher (char message); void main() char message201; gets(message); while(strcmp(message, “START” = 0) decipher(message); printf(“%sn”, message); gets(message); return; 21/30 void decipher(char message) char plain27=“VWXYZABCDEFGHIJKLMNOPQRSTU”; char cipherEnd201; int i， cipherLen; gets(message); cipherLen = strlen(message); for(i = 0; i=A gets(cipherEnd); return; 22/30 487-3279 East Central North America 1999 Description 企业喜欢用容易被记住的电话号码。让电话号码容易被记住的一个办法是将它写成一个容易记住的单词或者短语。例如，你需要给滑铁卢大学打电话时，可以拨打 TUT-GLOP。有时，只将电话号码中部分数字拼写成单词。当你晚上回到酒店，可以通过拨打 310- GINO来向 Ginos订一份 pizza。让电话号码容易被记住的另一个办法是以一种好记的方式对号码的数字进行分组。通过拨打必胜客的“三个十”号码 3-10-10-10，你可以从他们那里订 pizza。 23/30 电话号码的标准格式是七位十进制数，并在第三、第四位数字之间有一个连接符。电话拨号盘提供了从字母到数字的映射，映射关系如下： A, B, 和 C 映射到 2 D, E, 和 F 映射到 3 G, H, 和 I 映射到 4 J, K, 和 L 映射到 5 M, N, 和 O 映射到 6 P, R, 和 S 映射到 7 T, U, 和 V 映射到 8 W, X, 和 Y 映射到 9 24/30 Q和 Z没有映射到任何数字，连字符不需要拨号，可以任意添加和删除。 TUT-GLOP的标准格式是 888-4567， 310-GINO的标准格式是 310-4466， 3-10-10-10的标准格式是 310-1010。如果两个号码有相同的标准格式，那么他们就是等同的（相同的拨号）你的公司正在为本地的公司编写一个电话号码薄。作为质量控制的一部分，你想要检查是否有两个和多个公司拥有相同的电话号码。 25/30 Input 输入的格式是，第一行是一个正整数，指定电话号码薄中号码的数量（最多 100000）。余下的每行是一个电话号码。每个电话号码由数字，大写字母（除了 Q和 Z）以及连接符组成。每个电话号码中只会刚好有 7个数字或者字母。 Output 对于每个出现重复的号码产生一行输出，输出是号码的标准格式紧跟一个空格然后是它的重复次数。如果存在多个重复的号码，则按照号码的字典升序输出。如果输入数据中没有重复的号码，输出一行： No duplicates. 26/30 Sample Input 12 4873279 ITS-EASY 888-4567 3-10-10-10 888-GLOP TUT-GLOP 967-11-11 310-GINO F101010 888-1200 -4-8-7-3-2-7-9- 487-3279 Sample Output 310-1010 2 487-3279 4 888-4567 3 27/30 问题分析关键是判断输入的电话号码中是否有重复号码解决方法： (1)将各种电话号码表示转换成标准表示一个长度为 8的字符串，前 3个字符是数字、第 4 个字符是一、后 4个字符是数字。 (2)根据电话号码的标准表示，搜索重复的电话号码。办法是对全部的电话号码进行排序，这样相同的电话号码就排在相邻的位置。输出重复的电话号码时，按照号码的字典升序进行输出。 28/30 解决方案用一个二维数组 telNumbers1000009来存储全部的电话号码。每一行存储一个电话号码的标准表示。每读入一个电话号码首先将其转换成标准表示然后存储到二维数组 telNumbers中。全部电话号码都输入完毕后将数组 telNumbers作为一个一维数组。其中每个元素是一个字符串用 C C+提供的函数模板 sort 对其进行排序。用字符串比较函数 strcmp比较 telNumbers中相邻的电话号码判断是否有重复的电话号码，并计算重复的次数。 29/30 #include #include #include char map =“22233344455566677778889999”; /map表示从电话拨号盘的字母到数字的映射关系： mapj表示字母 j+A映射成的数字。 char str80, telNumbers1000009; int compare(const void *eleml, const void *elem2) /为函数模扳 sort定义数组元素的比较函数 strcmp查找重复的电话号码 return(strcmp(char * )eleml, (char * )elem2); ; 30/30 void standardizeTel(int n) Int j, k; j=k=-1; while(k= A continue; ; telNumbersnk = strj; telNumbersnk = 0; Return; 31/30 void main() int n,i,j; bool noduplicate; scanf(“%d”, for (i=0; in; i+)/输入电话号码，存储刭数组 telNumbers中 scanf(“%s”, str); standardizeTel(i); /将 str中的电话号码转换成标准形式，存储在 telNumbers的第 i行 qsort(telNumbers, n, 9, compare);/对输入的电话号码进行排序 32/30 noduplicate = true; i = 0; while(i n) /搜索重复的电话号码并进行输出 j = I; i+; while(i 1) printf(“%s%dn”, telNumbersj, i - j); noduplicate = false; if (noduplicate) printf (“No duplicates.n”); 33/30 总结：习惯用数据结构来表示问题中的事实和关系。例如字符数组 map。而用一组条件判断语句来实现这个功能。这样虽然也能够实现但程序代码看起来不简洁，也容易出错。尽量使用 C C+自带的函数来完成程序的功能简化程序代码的实现。例如函数模板 sort，字符串比较函数 strcmp。对程序进行模块化把一个独立的功能作为一个函数。并用一个单词、短语对函数进行命名。在上面的程序中，对电话号码标准化是一个独立的功能，定义一个函数 standardizeTel，使得整个程序的结构清晰、简洁、易读。不同的程序模块需要共同访问的数据作为全局变量。既可简化函数的参数接口，又可以降低函数调用的参数传递开销。例如，数组 map和 telNumbers都作为全局变量。

展开阅读全文

《周字符串处理》PPT课件.ppt

最新文档