Python tips: reading files and the newline problem


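Problem: on Windows, the line terminator is '\r\n'. For cross-platform compatibility, Python by default treats '\r', '\n', and '\r\n' all as line endings when it reads a text file. A Windows file, however, may contain a stray '\n' or '\r' inside what is logically a single line terminated by '\r\n'. In that case Python's default behavior splits the one line into several, which breaks the expected result.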

The fix is to set the newline parameter of open(), which changes how Python handles line endings.

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

newline accepts five values: None, '', '\n', '\r', '\r\n'.
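Any other string is rejected when the file is opened. The sketch below probes this with a throwaway temporary file; accepts_newline is a hypothetical helper name introduced only for this illustration.

```python
import os
import tempfile


def accepts_newline(value):
    """Return True if open() accepts `value` as its newline argument."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # open() raises ValueError for an illegal newline string
        with open(path, "r", newline=value):
            return True
    except ValueError:
        return False
    finally:
        os.remove(path)


for candidate in (None, "", "\n", "\r", "\r\n", "\n\r", "x"):
    print(repr(candidate), accepts_newline(candidate))
```

Only the five values listed above should print True; '\n\r' and 'x' are rejected.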

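On input (file to program), newline defines which characters end a line:

1. If newline is None, '\r', '\n', and '\r\n' all end a line, and each of them is translated to '\n'.

2. If newline is '', '\r', '\n', and '\r\n' also all end a line, but they are returned untranslated.

3. If newline is '\r', '\n', or '\r\n', that string is explicitly the only line terminator, and the characters in the line are returned untranslated.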
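The three input rules can be compared side by side. This is a minimal sketch: the sample bytes are written in binary mode so that no translation happens on the way out, and read_lines is a hypothetical helper, not part of any library.

```python
import os
import tempfile

# One CRLF-terminated record that also contains a lone '\r' and a lone '\n'
RAW = b"I am a\r good\n boy.\r\n"


def read_lines(newline):
    """Write RAW byte-for-byte, then read it back as text with `newline`."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(RAW)
        with open(path, "r", newline=newline) as f:
            return list(f)
    finally:
        os.remove(path)


for mode in (None, "", "\r", "\n", "\r\n"):
    print(repr(mode), read_lines(mode))
```

With None the record comes back as three lines that all end in '\n'; with '\r\n' it comes back as one untouched line.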

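On output (program to file), newline specifies what each '\n' is translated to:

1. If newline is None, every '\n' is translated to the system line separator (os.linesep).

2. If newline is '' or '\n', no translation takes place.

3. If newline is '\r' or '\r\n', every '\n' is translated to '\r' or '\r\n' respectively.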
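To see what actually lands on disk, the file can be read back in binary mode. A small sketch, assuming a temporary file is acceptable; written_bytes is a hypothetical helper introduced here for illustration.

```python
import os
import tempfile


def written_bytes(text, newline):
    """Write `text` in text mode with `newline`, return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(text)
        # Binary mode bypasses all newline handling
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


sample = "a\nb\r\n"
for mode in (None, "", "\n", "\r", "\r\n"):
    print(repr(mode), written_bytes(sample, mode))
```

Note that the '\n' inside an existing '\r\n' is translated as well, so newline='\r\n' turns '\r\n' into '\r\r\n'. The None case depends on the platform, since it uses os.linesep.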

Example 1: writing without specifying newline. On Windows every '\n' is replaced with '\r\n', including the '\n' that is already part of a '\r\n'.

def file_seperator_test1():
    # Write with the default newline=None: each '\n' becomes os.linesep
    with open("medical.txt", "w") as f:
        f.write("I am a\r good\n boy.\r\n")
    # Read back, treating only '\r\n' as a line terminator
    with open("medical.txt", "r", newline="\r\n") as f:
        print(list(f))

if __name__ == "__main__":
    file_seperator_test1()

Output (on Windows):

['I am a\r good\r\n', ' boy.\r\r\n']

Example 2: writing with newline set to '' or '\n'. No translation takes place.

def file_seperator_test2():
    # Write with newline='' and newline='\n': '\n' is written unchanged
    with open("medical.txt", "w", newline="") as f:
        f.write("I am a\r good\n boy.\r\n")
    with open("medical2.txt", "w", newline="\n") as f:
        f.write("I am a\r good\n boy.\r\n")

    # Read both files back, treating only '\r\n' as a line terminator
    with open("medical.txt", "r", newline="\r\n") as f:
        print(list(f))
    with open("medical2.txt", "r", newline="\r\n") as f:
        print(list(f))

if __name__ == "__main__":
    file_seperator_test2()

Output:

['I am a\r good\n boy.\r\n']
['I am a\r good\n boy.\r\n']

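Example 3: writing with newline set to '\r' or '\r\n'. Every '\n' is replaced. When every '\n' becomes '\r', the file no longer contains any '\r\n', so on Windows the line breaks disappear and all the lines collapse into one.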

def file_seperator_test3():
    # Write with newline='\r': every '\n' becomes '\r'
    with open("medical.txt", "w", newline="\r") as f:
        f.write("I am a\r good\n boy.\r\n where should\r\n I change the line ?\r\n")
        f.write("I can't stop\r\n")
    # Write with newline='\r\n': every '\n' becomes '\r\n'
    with open("medical2.txt", "w", newline="\r\n") as f:
        f.write("I am a\r good\n boy.\r\n")

    # Read both files back, treating only '\r\n' as a line terminator
    with open("medical.txt", "r", newline="\r\n") as f:
        print(list(f))
    with open("medical2.txt", "r", newline="\r\n") as f:
        print(list(f))


if __name__ == "__main__":
    file_seperator_test3()

Output:

["I am a\r good\r boy.\r\r where should\r\r I change the line ?\r\rI can't stop\r\r"]
['I am a\r good\r\n', ' boy.\r\r\n']

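Example 4: reading without specifying newline. By default all three sequences are treated as line terminators, and all of them are translated to '\n'.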

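def file_seperator_test4():
    # Write with newline='' so the text reaches the file unchanged
    with open("medical.txt", "w", newline="") as f:
        f.write("I am a\r good\n boy.\r\n")
    # Read with the default newline=None (universal newlines, translated to '\n')
    with open("medical.txt", "r") as f:
        print(list(f))


if __name__ == "__main__":
    file_seperator_test4()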

Output:

['I am a\n', ' good\n', ' boy.\n']

Example 5: reading with newline set to ''. All three sequences are still treated as line terminators, but they are not translated.

def file_seperator_test5():
    # Write with newline='' so the text reaches the file unchanged
    with open("medical.txt", "w", newline="") as f:
        f.write("I am a\r good\n boy.\r\n")
    # Read with newline='': split on '\r', '\n', '\r\n' but keep them as-is
    with open("medical.txt", "r", newline="") as f:
        print(list(f))

if __name__ == "__main__":
    file_seperator_test5()

Output:

['I am a\r', ' good\n', ' boy.\r\n']

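Example 6: reading with newline set to '\r', '\n', or '\r\n'. The line terminator is specified explicitly, and a line ends only where that exact sequence occurs.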

def file_seperator_test6():
    # Write the same text, unchanged (newline=''), to three files
    with open("medical.txt", "w", newline="") as f:
        f.write("I am a\r good\n boy.\r\n where should\r\n I change the line ?\r\n")
        f.write("I can't stop\r\n")
    with open("medical2.txt", "w", newline="") as f:
        f.write("I am a\r good\n boy.\r\n where should\r\n I change the line ?\r\n")
        f.write("I can't stop\r\n")
    with open("medical3.txt", "w", newline="") as f:
        f.write("I am a\r good\n boy.\r\n where should\r\n I change the line ?\r\n")
        f.write("I can't stop\r\n")

    # Read each file back with a different explicit line terminator
    with open("medical.txt", "r", newline="\r") as f:
        print(list(f))
    with open("medical2.txt", "r", newline="\n") as f:
        print(list(f))
    with open("medical3.txt", "r", newline="\r\n") as f:
        print(list(f))

if __name__ == "__main__":
    file_seperator_test6()

Output:

['I am a\r', ' good\n boy.\r', '\n where should\r', '\n I change the line ?\r', "\nI can't stop\r", '\n']
['I am a\r good\n', ' boy.\r\n', ' where should\r\n', ' I change the line ?\r\n', "I can't stop\r\n"]
['I am a\r good\n boy.\r\n', ' where should\r\n', ' I change the line ?\r\n', "I can't stop\r\n"]

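Conclusions:

1. To write lines that contain '\n', set newline to '' or '\n' so that Python does not alter the '\n'.

2. To read lines that contain '\n', set newline to '\r\n' so that only '\r\n' counts as a line terminator.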
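Both conclusions can be combined into a small round-trip sketch. The function names write_exact and read_crlf_records are hypothetical, the UTF-8 default is an assumption, and stripping the trailing '\r\n' is an extra convenience that the examples above do not perform.

```python
import os
import tempfile


def write_exact(path, text, encoding="utf-8"):
    """Write `text` without any newline translation (newline='')."""
    with open(path, "w", newline="", encoding=encoding) as f:
        f.write(text)


def read_crlf_records(path, encoding="utf-8"):
    """Read records terminated by '\\r\\n' only; lone '\\r' or '\\n' stay inside a record."""
    with open(path, "r", newline="\r\n", encoding=encoding) as f:
        # Drop the terminator, keep everything else untouched
        return [line[:-2] if line.endswith("\r\n") else line for line in f]


def demo():
    """Round-trip a sample through a temporary file and return the records."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_exact(path, "I am a\r good\n boy.\r\nI can't stop\r\n")
        return read_crlf_records(path)
    finally:
        os.remove(path)


print(demo())
```

The first record keeps its embedded '\r' and '\n', and the file is split only at the two '\r\n' terminators.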

