C/C++ String Splitting: The strtok and strsep Functions

Reposted article. 2017-01-27 17:06, C/C++

Function prototypes:

char *strtok(char *s, const char *delim);

char *strsep(char **s, const char *delim);

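Purpose: strtok and strsep both split a string into a series of substrings. s is the string to split, and delim is a string listing the delimiter characters.

Return value: successive substrings, starting from the beginning of s. NULL is returned once no substrings remain.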

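Common ground: both functions modify the source string. To avoid that, tokenize a copy made with strdupa (a GNU extension that allocates on the stack via alloca) or strdup (which allocates with malloc, so the copy must be freed).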
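As a rough illustration of the copy-first approach, the sketch below tokenizes a strdup copy so that the caller's string stays intact. The helper name count_tokens is hypothetical, and the _DEFAULT_SOURCE define assumes a glibc-style toolchain where strdup may otherwise be hidden under strict -std modes.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes strdup under strict -std modes */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `text` without modifying it.
 * Returns -1 if the copy cannot be allocated. */
int count_tokens(const char *text, const char *delim)
{
    char *copy = strdup(text);   /* heap copy; strtok writes into this, not into text */
    if (copy == NULL)
        return -1;

    int count = 0;
    for (char *tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim))
        count++;

    free(copy);                  /* memory from strdup is released with free */
    return count;
}
```

Calling count_tokens on the sample string used later in this article, with the delimiters " ,\t\n", should report 8 tokens and leave the input unchanged.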

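On each call, strtok skips any leading delimiter characters, scans to the end of the next token, and overwrites the delimiter that ends that token with the null character '\0' (a character, not the NULL pointer). Pass s on the first call, then call strtok(NULL, delim) repeatedly to obtain the remaining substrings.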

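Purpose:
   Splits a string into a series of substrings. s is the string to split; delim is a string listing the delimiter characters.
Description:
   strtok() cuts a string into pieces. The parameter s points to the string to be split, and delim lists the delimiter characters. When strtok() finds a character from delim that ends a token in s, it overwrites that character with '\0'. The first call must pass the string in s; later calls pass NULL for s. Each successful call returns a pointer to the next token.
Return value:
   Successive tokens, starting from the beginning of s. NULL is returned when no tokens remain.
   Every character listed in delim is dropped from the output and acts as a split point. (Try changing the contents of seps in the example below and watch how the output changes.)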

#include <string.h>
#include <stdio.h>

char string[] = "A string\tof ,,tokens\nand some  more tokens";
char seps[]   = " ,\t\n";
char *token;

int main( void )
{
   printf( "%s\n\nTokens:\n", string );
   /* Establish string and get the first token: */
   token = strtok( string, seps );
   while( token != NULL )
   {
      /* While there are tokens in "string" */
      printf( " %s\n", token );
      /* Get next token: */
      token = strtok( NULL, seps );
   }
   return 0;
}
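To make the in-place modification visible, the following sketch runs strtok to completion and then counts how many bytes of the buffer have become '\0'. The helper name nul_bytes_after_strtok is hypothetical; the point is that only the delimiter ending each token is overwritten, while delimiters that are merely skipped stay in the buffer.

```c
#include <string.h>

/* Hypothetical helper: runs strtok over buf until it returns NULL, then
 * counts how many of the first `len` bytes of buf are now '\0'. */
size_t nul_bytes_after_strtok(char *buf, size_t len, const char *delim)
{
    for (char *tok = strtok(buf, delim); tok != NULL; tok = strtok(NULL, delim))
        ;   /* tokens are ignored; only the side effect on buf matters here */

    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\0')
            n++;
    return n;
}
```

With the input "a,b,,c" and the delimiter ",", the commas at indices 1 and 3 become '\0', while the comma at index 4 is only skipped and remains a comma.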

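Summary:

strtok keeps the position reached by the previous call in internal static state, so it is neither reentrant nor thread-safe. The reentrant version is strtok_r, which is worth a look. strtok suits input where delimiters appear between tokens either singly or in consecutive runs.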
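As a small sketch of why the reentrant variant matters, the code below nests two tokenizing loops, one over lines and one over the words of each line. This cannot be done with plain strtok, because the inner loop would overwrite the single saved position. The helper name count_words is hypothetical, and strtok_r is assumed to be available (POSIX systems).

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes strtok_r under strict -std modes */
#include <string.h>

/* Hypothetical helper: splits buf into lines on '\n', splits each line into
 * words on ' ', and returns the total number of words. buf is modified. */
int count_words(char *buf)
{
    int words = 0;
    char *line_state;   /* saved position of the outer (line) scan */
    char *word_state;   /* saved position of the inner (word) scan */

    for (char *line = strtok_r(buf, "\n", &line_state); line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *word = strtok_r(line, " ", &word_state); word != NULL;
             word = strtok_r(NULL, " ", &word_state))
            words++;
    }
    return words;
}
```

Each scan keeps its position in its own caller-supplied pointer, so the two loops do not interfere with each other.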

strsep:

#include <string.h>
#include <stdio.h>

char string[] = "A string\tof ,,tokens\nand some  more tokens";
char seps[]   = " ,\t\n";
char *token, *s;

int main( void )
{
   printf( "%s\n\nTokens:\n", string );
   /* Establish string and get the first token: */
   s = string;
   token = strsep( &s, seps );
   while( token != NULL )
   {
      /* While there are tokens in "string" */
      printf( " %s\n", token );
      /* Get next token: */
      token = strsep( &s, seps );
   }
   return 0;
}

Why does the strtok output contain no blank lines between substrings, while the strsep output contains several? The documentation explains:

One difference between strsep and strtok_r is that if the input string contains more
than one character from delimiter in a row strsep returns an empty string for each
pair of characters from delimiter. This means that a program normally should test
for strsep returning an empty string before processing it.

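In short: when the input contains several consecutive characters that belong to delim (in this example, the space and two commas between "of" and "tokens", and the two spaces between "some" and "more"), strtok treats the whole run as a single separator and skips it, whereas strsep returns an empty string "" for each adjacent pair of delimiters. Code that splits with strsep should therefore check for an empty token before processing it. This explains the extra blank lines in the output of the strsep example.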

Improved code:
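A minimal sketch of such a fix, assuming the improvement is simply to skip empty tokens. The helper name print_nonempty_tokens is hypothetical and not taken from the original listing; the _DEFAULT_SOURCE define assumes a glibc-style toolchain, since strsep is a BSD extension rather than standard C.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; strsep is a BSD extension */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every non-empty token of buf, one per line,
 * and returns how many were printed. buf is modified in place. */
int print_nonempty_tokens(char *buf, const char *delim)
{
    int count = 0;
    char *rest = buf;   /* strsep advances this pointer past each token */
    char *token;

    while ((token = strsep(&rest, delim)) != NULL) {
        if (*token == '\0')
            continue;   /* empty token produced by adjacent delimiters */
        printf(" %s\n", token);
        count++;
    }
    return count;
}
```

For the sample string and delimiters used above, this should print the eight tokens A, string, of, tokens, and, some, more, tokens with no blank lines in between.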

Result: with empty tokens skipped, strsep prints the same token list as the strtok example.

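Note: the character '\0' has the decimal value 0. It is the null character that terminates a string, and it should not be confused with the macro NULL, which is a null pointer constant.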

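The following comment is taken from lib/string.c in Linux kernel 2.6.29 (the latest release at the time of writing). It records that strtok() is no longer used in the kernel and has been replaced by strsep(), which the comment describes as reentrant and likely faster.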

/*
 *  linux/lib/string.c
 *
 *  Copyright (C) 1991, 1992  Linus Torvalds
 */

/*
 * stupid library routines.. The optimized versions should generally be found
 * as inline code in <asm-xx/string.h>
 *
 * These are buggy as well..
 *
 * * Fri Jun 25 1999, Ingo Oeser <ioe@informatik.tu-chemnitz.de>
 * -  Added strsep() which will replace strtok() soon (because strsep() is
 *    reentrant and should be faster). Use only strsep() in new code, please.
 *
 * * Sat Feb 09 2002, Jason Thomas <jason@topic.com.au>,
 *                    Matthew Hawkins <matt@mh.dropbear.id.au>
 * -  Kissed strtok() goodbye
 */

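Summary:

strsep returns a pointer to the start of the token it has just split off, and updates the pointer passed as its first argument so that it points at the remainder of the string. It is best suited to input where exactly one delimiter appears between two tokens.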
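The sketch below illustrates this behaviour on a record with one delimiter per field boundary, where an empty field is meaningful and should be kept. The helper name split_fields and the sample record are hypothetical; the _DEFAULT_SOURCE define assumes a glibc-style toolchain.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; strsep is a BSD extension */
#include <string.h>

/* Hypothetical helper: splits record in place and stores up to max field
 * pointers in fields. Empty fields are kept. Returns the number stored. */
int split_fields(char *record, const char *delim, char **fields, int max)
{
    int n = 0;
    char *rest = record;   /* strsep sets this to NULL after the last field */

    while (rest != NULL && n < max)
        fields[n++] = strsep(&rest, delim);
    return n;
}
```

For the record "alice,,42" split on ",", this yields three fields: "alice", an empty string, and "42". strtok would report only two tokens and lose the position of the empty field.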

PS:

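Because both functions modify the string they are given, the argument must not be an immutable string (that is, a string literal stored in read-only memory).

For example:

char *tokenremain = "abcdefghij"; /* points to a string literal, which must not be modified */

strtok(tokenremain, "cde");

strsep(&tokenremain, "cde");

This compiles, but modifying a string literal is undefined behavior, and at run time it typically fails with a segmentation fault.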
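One common way around this, sketched below, is to declare a character array instead of a pointer, so that the literal is copied into writable storage. The function name count_pieces is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: tokenizes a writable array initialised from the same
 * literal, prints each token, and returns the number of tokens found. */
int count_pieces(void)
{
    char tokenremain[] = "abcdefghij";   /* array: the literal is copied into writable storage */
    int n = 0;

    for (char *tok = strtok(tokenremain, "cde"); tok != NULL;
         tok = strtok(NULL, "cde")) {
        printf("%s\n", tok);             /* prints "ab", then "fghij" */
        n++;
    }
    return n;
}
```

The run "cde" is treated as one separator, so the call returns 2.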
