Converting between ARFF and TXT files in Python


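There are already Python libraries on GitHub for converting between these two formats, such as liac-arff: https://github.com/renatopp/liac-arff. But when a program is already fairly complex, pulling in that many external files feels cumbersome. These ARFF libraries also raise an error when the number of attributes and the number of values in a row do not match. So, with support from a senior labmate, I wrote two simple conversion functions with reference to Stack Overflow. (It took more than 5 hours... I need to be more efficient next time.)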

arff2txt():

Convert an ARFF file to TXT format:

def arff2txt(filename):
    """Write the data rows of an ARFF file to ./generatedtxt.txt as tab-separated text."""
    lines = []
    with open(filename) as arff_file:
        for line in arff_file:
            line = line.strip()
            # Skip blank lines, header declarations (@...) and comments (%...).
            if not line or line.startswith(('@', '%')):
                continue
            fields = line.split(',')
            # Drop the trailing field at index 10 (the "{...}" weight).
            del fields[10]
            # Map the class label in column 9 from True/False to 1/0.
            fields[9] = '1' if fields[9] == 'True' else '0'
            lines.append('\t'.join(fields))
    result = '\n'.join(lines)
    print(result)

    with open('./generatedtxt.txt', 'w') as txtfile:
        txtfile.write(result)
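
To try the conversion logic without touching the file system, the same steps can be sketched on a string. This is only an illustration: `arff_data_to_tsv`, `class_index` and `weight_index` are hypothetical names that do not appear in the original function, and the sample rows are made up.

```python
def arff_data_to_tsv(arff_text, class_index=9, weight_index=10):
    """Return the data rows of an ARFF string as tab-separated text.

    Hypothetical helper: class_index and weight_index are illustrative
    parameters that default to the column positions used in this post.
    """
    out = []
    for line in arff_text.splitlines():
        line = line.strip()
        # Skip blank lines, header declarations (@...) and comments (%...).
        if not line or line.startswith(('@', '%')):
            continue
        fields = line.split(',')
        # Drop the trailing weight field only if the row actually has one.
        if len(fields) > weight_index:
            del fields[weight_index]
        # Map the class label from True/False to 1/0.
        fields[class_index] = '1' if fields[class_index] == 'True' else '0'
        out.append('\t'.join(fields))
    return '\n'.join(out)


# Made-up sample input, shaped like the files handled in this post.
sample = """@relation ExceptionRelation
% a comment line

@attribute ID string
@data
a1,0,1,0,12,3,0,1,0,True,{5}
a2,1,0,1,40,7,1,0,1,False,{1}
"""
print(arff_data_to_tsv(sample))
```

Like the function above, this splits on every comma, so it assumes that no value is quoted or contains a comma itself.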

 

txt2arff():

Convert a TXT file to ARFF format:

def txt2arff(filename, value):
    """Convert a tab-separated text file to ./generatedarff.arff.

    value is written in braces at the end of every row whose class is True
    (the position Weka reads as an instance weight); other rows get {1}.
    """
    with open('./generatedarff.arff', 'w') as fp:
        fp.write('''@relation ExceptionRelation

@attribute ID string
@attribute Thrown numeric
@attribute SetLogicFlag numeric
@attribute Return numeric
@attribute LOC numeric
@attribute NumMethod numeric
@attribute EmptyBlock numeric
@attribute RecoverFlag numeric
@attribute OtherOperation numeric
@attribute class-att {True,False}

@data
''')
        with open(filename) as f:
            for content in f:
                fields = [field.strip() for field in content.split('\t')]
                # Skip blank lines.
                if fields == ['']:
                    continue
                # Column 9 holds the class label as 1/0.
                if fields[9] == '1':
                    fields[9] = "True"
                    fields.append('{' + str(value) + '}')
                else:
                    fields[9] = "False"
                    fields.append('{1}')
                fp.write(','.join(fields) + '\n')
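
The header in the function above is hard-coded for one data set. As a rough sketch of how the same output could be produced from a list of attributes, the two helpers below build the header and format a single row. `build_arff_header` and `format_row` are hypothetical names introduced only for this example, and `format_row` assumes that the class label is the last column.

```python
def build_arff_header(relation, attributes):
    """Build an ARFF header from (name, type) pairs (hypothetical helper)."""
    lines = ['@relation %s' % relation, '']
    for name, attr_type in attributes:
        lines.append('@attribute %s %s' % (name, attr_type))
    # The trailing empty string makes the header end with a newline after @data.
    lines += ['', '@data', '']
    return '\n'.join(lines)


def format_row(fields, positive_weight):
    """Format one row; assumes the last field is the class label as 1/0."""
    fields = [field.strip() for field in fields]
    is_positive = fields[-1] == '1'
    fields[-1] = 'True' if is_positive else 'False'
    # Append the brace-enclosed weight: positive_weight for True rows, 1 otherwise.
    fields.append('{%s}' % (positive_weight if is_positive else 1))
    return ','.join(fields)


# The attribute list used in this post.
attributes = [
    ('ID', 'string'),
    ('Thrown', 'numeric'),
    ('SetLogicFlag', 'numeric'),
    ('Return', 'numeric'),
    ('LOC', 'numeric'),
    ('NumMethod', 'numeric'),
    ('EmptyBlock', 'numeric'),
    ('RecoverFlag', 'numeric'),
    ('OtherOperation', 'numeric'),
    ('class-att', '{True,False}'),
]
header = build_arff_header('ExceptionRelation', attributes)
row = format_row('a1\t0\t1\t0\t12\t3\t0\t1\t0\t1\n'.split('\t'), 5)
print(header + row)
```

Keeping the attribute list in one place means the header and the expected number of columns can be changed together when the data set changes.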

 

