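The difflib module provides tools for comparing two sequences and reporting the differences between them, much like the diff command on Linux. The most commonly used pieces are the Differ class, the ndiff, unified_diff, and context_diff functions, the HtmlDiff class, and the SequenceMatcher class.
Differ class and ndiff:
The Differ class and the ndiff function produce essentially the same output; they differ slightly in how they are called: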
import difflib

# text1_lines and text2_lines are lists of lines to compare

# Using Differ
d = difflib.Differ()
diff = d.compare(text1_lines, text2_lines)

# Using ndiff
diff = difflib.ndiff(text1_lines, text2_lines)
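To make the comparison concrete, here is a minimal runnable sketch of both calls. The two sample lists are made up for illustration; any lists of strings would do. For this input the two calls happen to yield the same lines.

```python
import difflib

# Hypothetical sample input: each text is a list of lines.
text1_lines = ["apple\n", "banana\n", "cherry\n"]
text2_lines = ["apple\n", "blueberry\n", "cherry\n"]

# Both calls return generators, so materialize them to inspect the result.
differ_result = list(difflib.Differ().compare(text1_lines, text2_lines))
ndiff_result = list(difflib.ndiff(text1_lines, text2_lines))

# Each output line starts with a two-character code:
#   "  " line is common to both texts
#   "- " line appears only in the first text
#   "+ " line appears only in the second text
#   "? " hint line marking differences inside similar lines
print("".join(differ_result), end="")
print("Same output:", differ_result == ndiff_result)
```

Because the results are generators, they can only be iterated once; wrapping them in list() as above is a simple way to reuse them.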
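unified_diff, context_diff, HtmlDiff:
These three control the output format of the comparison. For example, make_file in the HtmlDiff class returns the result as the source of a complete HTML page, unified_diff prints each change together with its unchanged context lines in a single block, and context_diff shows the before and after versions in separate blocks.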
import difflib

# Using context_diff
diff = difflib.context_diff(text1_lines, text2_lines)

# Using unified_diff
diff = difflib.unified_diff(text1_lines, text2_lines)

# Using HtmlDiff
diff = difflib.HtmlDiff()
html = diff.make_file(text1_lines, text2_lines)   # returns a full HTML page as a string
# Alternatively
html = diff.make_table(text1_lines, text2_lines)  # returns only the HTML table
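The following sketch runs all three formats on the same small input so the outputs can be compared side by side. The sample lines and the file names old.txt and new.txt are assumptions made for illustration; the names are only labels in the diff header and no files are read.

```python
import difflib

# Hypothetical sample input; the file names below are labels only.
text1_lines = ["one\n", "two\n", "three\n"]
text2_lines = ["one\n", "2\n", "three\n"]

unified = list(difflib.unified_diff(
    text1_lines, text2_lines, fromfile="old.txt", tofile="new.txt"))
context = list(difflib.context_diff(
    text1_lines, text2_lines, fromfile="old.txt", tofile="new.txt"))

print("".join(unified), end="")
print("".join(context), end="")

html_diff = difflib.HtmlDiff()
# make_file returns a complete HTML document as a string.
html_page = html_diff.make_file(text1_lines, text2_lines)
# make_table returns only the table markup, for embedding in another page.
html_table = html_diff.make_table(text1_lines, text2_lines)
print(len(html_page), len(html_table))
```

In the unified output, changed lines are prefixed with "-" and "+"; in the context output, changed lines are prefixed with "!" in both the before and the after block.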
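SequenceMatcher class:
Elements to ignore ("junk") can be specified manually through the first argument, and sequences of any type can be compared, provided that their elements are hashable.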
import difflib

s1 = [1, 2, 3, 5, 6, 4]
s2 = [2, 3, 5, 4, 6, 1]

print('Initial data:')
print('s1 =', s1)
print('s2 =', s2)
print('s1 == s2:', s1 == s2)
print()

# None as the first argument means no element is treated as junk
matcher = difflib.SequenceMatcher(None, s1, s2)

# Apply the opcodes in reverse order so that the indexes of the
# not-yet-processed parts of s1 stay valid while s1 is modified
for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):

    if tag == 'delete':
        print('Remove %s from positions [%d:%d]' % (s1[i1:i2], i1, i2))
        del s1[i1:i2]

    elif tag == 'equal':
        print('The sections [%d:%d] of s1 and [%d:%d] of s2 are the same' %
              (i1, i2, j1, j2))

    elif tag == 'insert':
        print('Insert %s from [%d:%d] of s2 into s1 at %d' %
              (s2[j1:j2], j1, j2, i1))
        s1[i1:i2] = s2[j1:j2]

    elif tag == 'replace':
        print('Replace %s from [%d:%d] of s1 with %s from [%d:%d] of s2' %
              (s1[i1:i2], i1, i2, s2[j1:j2], j1, j2))
        s1[i1:i2] = s2[j1:j2]

    # Show the state of both lists after each step
    print('s1 =', s1)
    print('s2 =', s2)
    print()

print('s1 == s2:', s1 == s2)
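The example above passes None as the first argument, so nothing is ignored. As a minimal sketch of the other two points, the snippet below first compares two lists of words (any hashable elements work, not only characters), then passes a junk function that treats blanks as ignorable. The sample data is chosen for illustration only.

```python
import difflib

# Sequences of arbitrary hashable elements: here, lists of words.
words1 = ["the", "quick", "brown", "fox"]
words2 = ["the", "quick", "red", "fox"]
word_matcher = difflib.SequenceMatcher(None, words1, words2)
word_opcodes = word_matcher.get_opcodes()
word_ratio = word_matcher.ratio()  # 2 * matches / total elements
print(word_opcodes)
print(word_ratio)

# A junk function takes one element and returns True if it should be
# ignored when looking for matching blocks; here blanks are junk.
a = "private Thread currentThread;"
b = "private volatile Thread currentThread;"
junk_matcher = difflib.SequenceMatcher(lambda ch: ch == " ", a, b)
junk_ratio = junk_matcher.ratio()
print(round(junk_ratio, 3))
```

ratio() returns a float between 0 and 1, where 1 means the sequences are identical. Elements that are not hashable, such as nested lists, would need to be converted first (for example to tuples).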