Converting between ARFF and TXT files in Python


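On GitHub, others have already published Python libraries for converting between these two formats, for example liac-arff: https://github.com/renatopp/liac-arff. But when a program is already fairly complex, pulling in that many external files feels cumbersome. These ARFF libraries also raise an error when the number of attributes and the number of values do not match. So, with support from a senior labmate, I wrote two simple conversion functions with reference to Stack Overflow. (It took more than five hours... I need to be more efficient next time.)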
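
To make the attribute/value mismatch concrete, here is a minimal sketch that counts the `@attribute` declarations in an ARFF text and lists the data rows whose number of comma-separated values differs. The function name `find_mismatched_rows` and the sample text are hypothetical, and the naive comma split assumes that no quoted value contains a comma.

```python
def find_mismatched_rows(arff_text):
    """Return (num_attributes, [(line_number, num_values), ...]).

    The list holds only the data rows whose value count differs
    from the number of declared attributes.
    """
    num_attributes = 0
    in_data = False
    mismatched = []
    for line_number, raw in enumerate(arff_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith('%'):
            continue  # skip blank lines and comments
        if not in_data:
            lowered = line.lower()
            if lowered.startswith('@attribute'):
                num_attributes += 1
            elif lowered.startswith('@data'):
                in_data = True
            continue
        num_values = len(line.split(','))
        if num_values != num_attributes:
            mismatched.append((line_number, num_values))
    return num_attributes, mismatched


# Hypothetical sample: the second data row carries an extra trailing {1}.
sample = """@relation demo
@attribute ID string
@attribute LOC numeric
@attribute class-att {True,False}
@data
a1,10,True
a2,20,False,{1}
"""

print(find_mismatched_rows(sample))  # (3, [(7, 4)])
```

A row with an extra trailing field, such as the `{1}` that the functions in this post read and write, is exactly the kind of row this check flags.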

arff2txt():

Convert an ARFF file to TXT format:

def arff2txt(filename):
    """Convert an ARFF file to a tab-separated text file (./generatedtxt.txt)."""
    lines = []
    with open(filename) as arff_file:
        for line in arff_file:
            line = line.strip()
            # Skip blank lines, header lines (@...) and comments (%...).
            if not line or line.startswith(("@", "%")):
                continue
            fields = line.split(',')
            del fields[10]  # drop the trailing field after the class label, e.g. {1}
            # Encode the class label as 1 (True) or 0 (False).
            fields[9] = '1' if fields[9] == "True" else '0'
            lines.append('\t'.join(fields))

    result = '\n'.join(lines)
    print(result)

    with open('./generatedtxt.txt', 'w') as txtfile:
        txtfile.write(result)
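
As a quick illustration of what happens to each data row, the following sketch applies the same per-row steps to a single string in memory. The helper name `arff_row_to_txt` and the sample row are made up for this example. The column positions (class label at index 9, extra trailing field at index 10) follow the function above.

```python
def arff_row_to_txt(row):
    """Turn one comma-separated ARFF data row into a tab-separated TXT row."""
    fields = row.strip().split(',')
    del fields[10]  # drop the trailing field, e.g. {5}
    fields[9] = '1' if fields[9] == 'True' else '0'  # class label -> 1/0
    return '\t'.join(fields)


# Hypothetical row with 10 attribute values plus a trailing {5}.
row = "m1,0,1,0,12,3,0,1,0,True,{5}"
print(repr(arff_row_to_txt(row)))  # 'm1\t0\t1\t0\t12\t3\t0\t1\t0\t1'
```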

 

txt2arff():

Convert a TXT file to ARFF format:

def txt2arff(filename, value):
    """Convert a tab-separated text file to ARFF (./generatedarff.arff).

    value is written in braces at the end of every row whose class is True.
    """
    with open('./generatedarff.arff', 'w') as fp:
        fp.write('''@relation ExceptionRelation

@attribute ID string
@attribute Thrown numeric
@attribute SetLogicFlag numeric
@attribute Return numeric
@attribute LOC numeric
@attribute NumMethod numeric
@attribute EmptyBlock numeric
@attribute RecoverFlag numeric
@attribute OtherOperation numeric
@attribute class-att {True,False}

@data
''')
        with open(filename) as f:
            contents = f.readlines()

        for content in contents:
            if not content.strip():
                continue  # skip blank lines
            fields = [field.strip() for field in content.split('\t')]
            if fields[9] == '1':
                fields[9] = "True"
                fields.append('{' + str(value) + '}')  # trailing {value} for True rows
            else:
                fields[9] = "False"
                fields.append('{1}')  # trailing {1} for False rows
            fp.write("%s\n" % ','.join(fields))
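
The header in txt2arff is hard-coded. If the attribute list changes often, one possible variation is to build the header from a list of (name, type) pairs. This is only a sketch: `build_arff_header` and `ATTRIBUTES` are hypothetical names, not part of the original functions.

```python
# Attribute names and types copied from the hard-coded header above.
ATTRIBUTES = [
    ("ID", "string"),
    ("Thrown", "numeric"),
    ("SetLogicFlag", "numeric"),
    ("Return", "numeric"),
    ("LOC", "numeric"),
    ("NumMethod", "numeric"),
    ("EmptyBlock", "numeric"),
    ("RecoverFlag", "numeric"),
    ("OtherOperation", "numeric"),
    ("class-att", "{True,False}"),
]


def build_arff_header(relation, attributes):
    """Build the ARFF header text, ending with the @data line."""
    lines = ["@relation %s" % relation, ""]
    lines.extend("@attribute %s %s" % (name, type_) for name, type_ in attributes)
    lines.extend(["", "@data", ""])  # the final "" yields a trailing newline
    return "\n".join(lines)


header = build_arff_header("ExceptionRelation", ATTRIBUTES)
print(header)
```

With this list the generated text is the same as the string literal in txt2arff, so it could be passed to `fp.write` in its place.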

 

