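Deduplicate the contents of two text files and output only the lines that are not shared between them
The two test files contain the following
1.txt contains 1 2 3 4 5 6 7 8
2.txt contains 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Read the contents of both files
To read the contents of 1.txt: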
str1 = []
with open("1.txt", "r", encoding="utf-8") as file_1:  # the with block closes the file
    for line in file_1:
        str1.append(line.replace("\n", ""))  # strip the trailing newline
To read the contents of 2.txt:
str2 = []
with open("2.txt", "r", encoding="utf-8") as file_2:  # the with block closes the file
    for line in file_2:
        str2.append(line.replace("\n", ""))  # strip the trailing newline
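The two read loops differ only in the file name, so they could be folded into one helper. The following is a minimal sketch, assuming each value sits on its own line; `read_lines` is a hypothetical name and not part of the original code.

```python
def read_lines(path):
    """Return the lines of a UTF-8 text file without their trailing newlines."""
    # read_lines is a hypothetical helper name, not taken from the original post
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```

With this helper, the two reads become `str1 = read_lines("1.txt")` and `str2 = read_lines("2.txt")`.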
Extract the duplicated content
Create an empty list and collect the lines that appear in both files:
str_dump = []
for line in str1:
    if line in str2:
        str_dump.append(line)  # collect the lines that appear in both files
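Checking `line in str2` against a list scans the whole list on every iteration, which can become slow for large files. One possible variation, sketched here with the hypothetical name `common_lines` and with in-memory stand-ins for the two test files, does the lookup against a set instead:

```python
def common_lines(lines_a, lines_b):
    """Return the lines of lines_a that also occur in lines_b, in order."""
    lookup = set(lines_b)  # set membership tests are O(1) on average
    return [line for line in lines_a if line in lookup]


# Stand-ins for the test files, assuming one value per line
str1 = [str(n) for n in range(1, 9)]    # contents of 1.txt
str2 = [str(n) for n in range(1, 16)]   # contents of 2.txt

print(common_lines(str1, str2))
# ['1', '2', '3', '4', '5', '6', '7', '8']
```

The result matches what the loop above collects in `str_dump`; only the lookup structure changes.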
Remove the duplicated content
Merge the contents of both files into a set, then remove every line that appears in both files
str_all = set(str1 + str2)  # a set keeps one copy of each distinct line
for i in str_dump:
    if i in str_all:
        str_all.remove(i)  # drop the lines that appear in both files
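What remains in `str_all` is the set of lines found in exactly one of the two files. In set terms this is the symmetric difference, which Python exposes as the `^` operator on sets. A short sketch using in-memory stand-ins for the test files (assuming one value per line):

```python
# Stand-ins for the test files, assuming one value per line
str1 = [str(n) for n in range(1, 9)]    # contents of 1.txt
str2 = [str(n) for n in range(1, 16)]   # contents of 2.txt

# Lines present in exactly one of the two files
unique = set(str1) ^ set(str2)

# Sets are unordered, so sort numerically for a stable display
print(sorted(unique, key=int))
# ['9', '10', '11', '12', '13', '14', '15']
```

For the test data only 9 through 15 survive, since 1 through 8 appear in both files.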
The complete code is as follows
#!/usr/bin/env python
# -*- coding:utf-8 -*-
def file_qc():
    # Read 1.txt, one entry per line
    str1 = []
    with open("1.txt", "r", encoding="utf-8") as file_1:
        for line in file_1:
            str1.append(line.replace("\n", ""))

    # Read 2.txt, one entry per line
    str2 = []
    with open("2.txt", "r", encoding="utf-8") as file_2:
        for line in file_2:
            str2.append(line.replace("\n", ""))

    # Collect the lines that appear in both files
    str_dump = []
    for line in str1:
        if line in str2:
            str_dump.append(line)

    # Put both files into a set, then drop the lines that appear in both
    str_all = set(str1 + str2)
    for i in str_dump:
        if i in str_all:
            str_all.remove(i)

    # Append the result to qc_V.txt (set iteration order is arbitrary)
    with open("qc_V.txt", "a+", encoding="utf-8") as f:
        for item in str_all:
            print(item)
            f.write(item + "\n")


if __name__ == "__main__":
    file_qc()
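Because the result is held in a set, the order of the lines written to qc_V.txt is not guaranteed to follow the order in the input files. If a stable order matters, one option is to walk the original lists and keep the first occurrence of each surviving line. This is only a sketch; `unique_lines_in_order` is a hypothetical name that does not appear in the code above.

```python
def unique_lines_in_order(lines_a, lines_b):
    """Lines found in exactly one of the two lists, in first-seen order."""
    only_one = set(lines_a) ^ set(lines_b)  # symmetric difference
    seen = set()
    result = []
    for line in lines_a + lines_b:
        # Keep a line once, and only if it is not shared by both inputs
        if line in only_one and line not in seen:
            seen.add(line)
            result.append(line)
    return result
```

Writing the returned list to the output file line by line then gives the same content as the set-based version, in a reproducible order.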