String Concatenation
String
In Java, String is an immutable class: once a String object has been created on the heap, its contents cannot be modified.
package java.lang;
//import ...
public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
}
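A Java string is backed by a character array. The field is declared final, so it can never be pointed at a different array once assigned, and because String exposes no method that writes to the array, its contents never change either.
If strings are immutable, then what is string concatenation actually doing?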
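String Immutability and String Concatenation
So-called string concatenation always produces a new string. (Since JDK 7, substring() also copies its characters into a new backing array rather than sharing the original one.) Consider the following concatenation code: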
String s = "hello ";
s = s.concat("world!");
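This actually creates a new string: s ends up holding a reference to the new object, while the original "hello " object is left unchanged.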
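The effect can be checked directly. The following is a minimal sketch; the class and method names (ImmutabilityDemo, concatCreatesNewObject) are made up for illustration and are not part of the JDK.

```java
final class ImmutabilityDemo {
    /**
     * Returns true when concat leaves the receiver untouched
     * and hands back a different String object.
     */
    static boolean concatCreatesNewObject() {
        String original = "hello ";
        String s = original;          // both variables refer to the same object
        s = s.concat("world!");       // s now refers to a newly created String

        return original.equals("hello ")     // the first object is unchanged
                && s.equals("hello world!")  // the result holds the joined text
                && s != original;            // and it is a distinct object
    }

    public static void main(String[] args) {
        System.out.println(concatCreatesNewObject()); // prints "true"
    }
}
```

Only the reference stored in s changes; any other variable still pointing at the first object keeps seeing "hello ".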

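Ways to Concatenate Strings in Java
The + syntactic sugar
In Java, the simplest way to concatenate strings is to use the + symbol directly, for example: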
public class Main2 {
    public static void main(String[] args) {
        String s1 = "hello " + "world " + "!";
        String s2 = "xzy ";
        String s3 = s2 + s1;
    }

    private void concat(String s1) {
        String s2 = "xzy" + s1;
    }
}
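One point deserves special mention: some people take Java's use of + for string concatenation to be operator overloading. It is not. Java does not support user-defined operator overloading; this is simply syntactic sugar provided by the language.
Compile the code above and then decompile it: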
public class Main2 {
    public Main2() {
    }

    public static void main(String[] var0) {
        String var1 = "hello world !";
        String var2 = "xzy ";
        (new StringBuilder()).append(var2).append(var1).toString();
    }

    private void concat(String var1) {
        (new StringBuilder()).append("xzy").append(var1).toString();
    }
}
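The decompiled code shows two things. The literal-only expression "hello " + "world " + "!" is folded into a single constant at compile time, and a + that involves a variable is compiled into StringBuilder calls that end by creating a new String object. (The listing reflects a JDK 8-era compiler, consistent with the char[]-based String source shown earlier.)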
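The difference between the folded constant (var1 in the decompiled listing) and the run-time concatenation can be observed with reference comparison. This is a small sketch; ConstantFoldingDemo and its method names are hypothetical, and == is used here on purpose to compare object identity, not text.

```java
final class ConstantFoldingDemo {
    /** A literal-only concatenation is a compile-time constant, so it is the same interned object as the equal literal. */
    static boolean foldedAtCompileTime() {
        String s1 = "hello " + "world " + "!";
        return s1 == "hello world !";
    }

    /** A concatenation involving a non-constant variable is evaluated at run time and yields a new object. */
    static boolean runtimeConcatIsNewObject() {
        String s1 = "hello world !";
        String s2 = "xzy ";
        String s3 = s2 + s1;   // s2 is not final, so this is not a constant expression
        return s3.equals("xzy hello world !") && s3 != "xzy hello world !";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());      // prints "true"
        System.out.println(runtimeConcatIsNewObject()); // prints "true"
    }
}
```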
concat
Besides +, you can also concatenate strings with the concat method of the String class, for example:
public static void main(String[] args) {
    String s1 = "hello " + "world " + "!";
    String s2 = "xzy ";
    String s3 = s2.concat(s1);
}
The source of the concat method is as follows:
public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /**
     * Concatenates the specified string to the end of this string.
     * <p>
     * If the length of the argument string is {@code 0}, then this
     * {@code String} object is returned. Otherwise, a
     * {@code String} object is returned that represents a character
     * sequence that is the concatenation of the character sequence
     * represented by this {@code String} object and the character
     * sequence represented by the argument string.<p>
     * Examples:
     * <blockquote><pre>
     * "cares".concat("s") returns "caress"
     * "to".concat("get").concat("her") returns "together"
     * </pre></blockquote>
     *
     * @param str the {@code String} that is concatenated to the end
     *            of this {@code String}.
     * @return a string that represents the concatenation of this object's
     *         characters followed by the string argument's characters.
     */
    public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
    }

    /**
     * Copy characters from this string into dst starting at dstBegin.
     * This method doesn't perform any range checking.
     */
    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }
}
Source of the Arrays.copyOf() method:
// Creates a char array of length newLength, then copies the characters of original into it.
public static char[] copyOf(char[] original, int newLength) {
    char[] copy = new char[newLength];
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}
The source shows that a.concat(b) creates a character array of length a.length() + b.length(), copies a and then b into it, and finally uses that array to create a new String object.
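The same steps can be reproduced with public API only. The sketch below is an illustration, not the JDK implementation: ConcatSketch is a made-up name, and because the package-private constructor String(char[], boolean) is not accessible from user code, it falls back to new String(char[]), which copies the buffer once more.

```java
final class ConcatSketch {
    /** Joins a and b the way the JDK 8 concat source does: one buffer, two copies. */
    static String concat(String a, String b) {
        if (b.isEmpty()) {
            return a;                          // nothing to append: return the same object
        }
        int len = a.length();
        char[] buf = new char[len + b.length()];
        a.getChars(0, len, buf, 0);            // copy a to the front of the buffer
        b.getChars(0, b.length(), buf, len);   // copy b right after it
        return new String(buf);                // public constructor copies buf again
    }

    public static void main(String[] args) {
        System.out.println(concat(concat("to", "get"), "her")); // prints "together"
    }
}
```

The real String.concat has the same early return: "abc".concat("") returns the receiver itself rather than a copy.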
StringBuffer and StringBuilder
Besides String, whose instances are immutable, Java also provides StringBuffer and StringBuilder for strings that change; their objects can be extended and modified, for example:
public static void main(String[] args) {
    StringBuffer stringBuffer = new StringBuffer();
    String s1 = "hello " + "world " + "!";
    String s2 = "xzy ";
    String s3;
    stringBuffer.append(s1).append(s2);
    s3 = stringBuffer.toString();
}

public static void main(String[] args) {
    StringBuilder stringBuilder = new StringBuilder();
    String s1 = "hello " + "world " + "!";
    String s2 = "xzy ";
    String s3;
    stringBuilder.append(s1).append(s2);
    s3 = stringBuilder.toString();
}
Next, let's look at how StringBuffer and StringBuilder are implemented.
Both StringBuffer and StringBuilder extend AbstractStringBuilder. Here is part of the AbstractStringBuilder source:
abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;
    /**
     * The count is the number of characters used.
     */
    int count;
}
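Like String, AbstractStringBuilder wraps a character array. The difference is that this array field is not declared final, so it can be replaced by a larger array when more room is needed, and its contents can be modified. Another difference is that the array does not have to be completely filled: AbstractStringBuilder keeps a count field that records how many characters in the array are actually in use.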
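The gap between the array size and the number of characters in use is visible through the public length() and capacity() methods. A minimal sketch, with CapacityDemo and afterAppend as hypothetical names; the only number relied on is the documented default capacity of 16 for new StringBuilder().

```java
final class CapacityDemo {
    /** Appends text to a fresh StringBuilder and returns {length, capacity}. */
    static int[] afterAppend(String text) {
        StringBuilder sb = new StringBuilder();   // empty builder, initial capacity 16
        sb.append(text);
        return new int[] { sb.length(), sb.capacity() };
    }

    public static void main(String[] args) {
        int[] small = afterAppend("hello");
        // 5 characters in use, the backing storage still has room for 16
        System.out.println(small[0] + " of " + small[1]);

        int[] large = afterAppend("a string longer than sixteen characters");
        // the storage was swapped for a larger one, so capacity >= length still holds
        System.out.println(large[1] >= large[0]);
    }
}
```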
Now look at the source of the append method in StringBuffer, StringBuilder, and AbstractStringBuilder:
public final class StringBuffer
        extends AbstractStringBuilder
        implements java.io.Serializable, CharSequence {
    /**
     * A cache of the last value returned by toString. Cleared
     * whenever the StringBuffer is modified.
     */
    private transient char[] toStringCache;

    @Override
    public synchronized StringBuffer append(String str) {
        toStringCache = null;
        super.append(str);
        return this;
    }
}

public final class StringBuilder
        extends AbstractStringBuilder
        implements java.io.Serializable, CharSequence {
    @Override
    public StringBuilder append(String str) {
        super.append(str);
        return this;
    }
}
abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;
    /**
     * The count is the number of characters used.
     */
    int count;

    /**
     * Appends the specified string to this character sequence.
     * <p>
     * The characters of the {@code String} argument are appended, in
     * order, increasing the length of this sequence by the length of the
     * argument. If {@code str} is {@code null}, then the four
     * characters {@code "null"} are appended.
     * <p>
     * Let <i>n</i> be the length of this character sequence just prior to
     * execution of the {@code append} method. Then the character at
     * index <i>k</i> in the new character sequence is equal to the character
     * at index <i>k</i> in the old character sequence, if <i>k</i> is less
     * than <i>n</i>; otherwise, it is equal to the character at index
     * <i>k-n</i> in the argument {@code str}.
     *
     * @param str a string.
     * @return a reference to this object.
     */
    public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        // Grow the internal char array first if it is too short.
        ensureCapacityInternal(count + len);
        // Copy the characters into the internal char array.
        str.getChars(0, len, value, count);
        count += len;
        return this;
    }
}
One obvious difference stands out: StringBuffer's append method is declared synchronized, which makes append thread-safe. StringBuffer gives up some performance to achieve that thread safety.
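What the synchronized keyword buys can be seen when several threads append to one shared buffer. The sketch below uses hypothetical names (SharedBufferDemo, appendFromThreads) and only checks the StringBuffer side: because every append runs under the buffer's lock, no append is lost. Swapping in StringBuilder would remove that guarantee; the final length could then come up short, or an exception could be thrown.

```java
final class SharedBufferDemo {
    /** Starts the given number of threads, each appending one char at a time, and returns the final length. */
    static int appendFromThreads(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append('x');    // synchronized: one thread at a time
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();             // wait until every thread has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 10000)); // prints "40000"
    }
}
```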
Efficiency Comparison
With so many ways to concatenate strings, which one is the most efficient? Let's run a simple comparison.
long t1 = System.currentTimeMillis();
// the initial string definition goes here
for (int i = 0; i < 50000; i++) {
    // the string concatenation code goes here
}
long t2 = System.currentTimeMillis();
System.out.println("cost:" + (t2 - t1));
public class Main2 {
    public static void main(String[] args) {
        test1();
        test2();
        test3();
        test4();
    }

    public static void test1() {
        long t1 = System.currentTimeMillis();
        String str = "";
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            str += s;
        }
        long t2 = System.currentTimeMillis();
        System.out.println("+ cost:" + (t2 - t1));
    }

    public static void test2() {
        long t1 = System.currentTimeMillis();
        String str = "";
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            str = str.concat("hello");
        }
        long t2 = System.currentTimeMillis();
        System.out.println("concat cost:" + (t2 - t1));
    }

    public static void test3() {
        long t1 = System.currentTimeMillis();
        String str;
        StringBuffer stringBuffer = new StringBuffer();
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            stringBuffer.append(s);
        }
        str = stringBuffer.toString();
        long t2 = System.currentTimeMillis();
        System.out.println("stringBuffer cost:" + (t2 - t1));
    }

    public static void test4() {
        long t1 = System.currentTimeMillis();
        String str;
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            stringBuilder.append(s);
        }
        str = stringBuilder.toString();
        long t2 = System.currentTimeMillis();
        System.out.println("stringBuilder cost:" + (t2 - t1));
    }
}
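Using code of the form above, we measured the running time of the four concatenation approaches. (Note that test2 appends the fixed string "hello" rather than s, so its workload differs slightly from the other three.) The results, in milliseconds: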
+ cost:7135
concat cost:1759
stringBuffer cost:5
stringBuilder cost:5
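Ordered from fastest to slowest, the results are as follows (StringBuilder and StringBuffer tie in this single-threaded run):
StringBuilder ≈ StringBuffer < concat < +
This raises a question: as analyzed earlier, + is also implemented with StringBuilder, so why are the results more than 1000 times apart?
Decompile the code above: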
/*
 * Decompiled with CFR 0.149.
 */
package com.learn.java;

public class Main2 {
    public static void main(String[] arrstring) {
        Main2.test1();
        Main2.test2();
        Main2.test3();
        Main2.test4();
    }

    public static void test1() {
        long l = System.currentTimeMillis();
        String string = "";
        for (int i = 0; i < 50000; ++i) {
            String string2 = String.valueOf(i);
            string = new StringBuilder().append(string).append(string2).toString();
        }
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("+ cost:").append(l2 - l).toString());
    }

    public static void test2() {
        long l = System.currentTimeMillis();
        String string = "";
        for (int i = 0; i < 50000; ++i) {
            String string2 = String.valueOf(i);
            string = string.concat("hello");
        }
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("concat cost:").append(l2 - l).toString());
    }

    public static void test3() {
        long l = System.currentTimeMillis();
        StringBuffer stringBuffer = new StringBuffer();
        for (int i = 0; i < 50000; ++i) {
            String string = String.valueOf(i);
            stringBuffer.append(string);
        }
        String string = stringBuffer.toString();
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("stringBuffer cost:").append(l2 - l).toString());
    }

    public static void test4() {
        long l = System.currentTimeMillis();
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < 50000; ++i) {
            String string = String.valueOf(i);
            stringBuilder.append(string);
        }
        String string = stringBuilder.toString();
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("stringBuilder cost:").append(l2 - l).toString());
    }
}
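The decompiled code shows that inside the for loop every iteration creates a new StringBuilder, copies the entire accumulated String into it, appends the new piece, and then calls toString() to copy everything back into a new String.
Creating objects this frequently, and recopying an ever-growing string each time, takes a lot of time and also wastes memory.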
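A rough count of copied characters makes the gap concrete. The sketch below is a simplified model, not a measurement: it assumes each loop iteration of test1 copies the whole accumulated text twice (once into the new StringBuilder, once in toString()) and ignores any extra copying from internal array growth. CopyCostModel and its method names are hypothetical.

```java
final class CopyCostModel {
    /** Length of the text built by appending String.valueOf(i) for i in [0, n). */
    static long finalLength(int n) {
        long len = 0;
        for (int i = 0; i < n; i++) {
            len += String.valueOf(i).length();
        }
        return len;
    }

    /** Characters copied by the "str += s" loop under the two-copies-per-iteration model. */
    static long copiedByPlus(int n) {
        long len = 0;
        long copied = 0;
        for (int i = 0; i < n; i++) {
            len += String.valueOf(i).length();
            copied += 2 * len;   // into the builder, then back out through toString()
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(finalLength(50000));   // prints "238890"
        System.out.println(copiedByPlus(50000));  // grows roughly with n squared
    }
}
```

A single reused StringBuilder appends each character once, so its work grows with the final length, while the + loop's work grows with the sum of all intermediate lengths.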
This is why the Alibaba Java Development Manual recommends that, inside a loop body, strings be concatenated with StringBuilder's append method rather than with +.
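Applied to a typical loop, the recommendation looks roughly like this. The helper below is a sketch with made-up names (LoopConcat, join); the builder is created once outside the loop and converted to a String once at the end.

```java
import java.util.Arrays;
import java.util.List;

final class LoopConcat {
    /** Joins parts with the separator using one StringBuilder for the whole loop. */
    static String join(List<String> parts, String separator) {
        StringBuilder sb = new StringBuilder();   // created once, outside the loop
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();                     // a single conversion at the end
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c"), ",")); // prints "a,b,c"
    }
}
```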
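Summary
The common ways to concatenate strings are +, concat, StringBuilder, and StringBuffer.
Because concatenation creates new objects, concatenating inside a loop means you have to consider both memory use and efficiency.
Our comparison shows that using StringBuilder directly is the most efficient, because StringBuilder was designed from the start for mutable strings and the operations that change them.
Two further points are worth stressing:
1. If the concatenation is not inside a loop, just use +.
2. If the same buffer is appended to from multiple threads concurrently, use StringBuffer instead of StringBuilder.
Reference: https://hollischuang.github.io/toBeTopJavaer/#/basics/java-basic/string-concat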
