python - Convert an iteration over dictionaries into a pandas DataFrame
I have generated two dictionaries from two input files using conditional statements (not shown). I want to use these two dictionaries to identify overlapping values, and I would like to use the output of the iteration as a pandas DataFrame directly. To do that, I am currently writing the output of the iteration (a for loop) to a file (output.xls) and then reading that file into a pandas DataFrame. While this works well, I was wondering if there is a way to use 'newline' from the iteration below directly as input to an empty pandas DataFrame. I couldn't find an option for this in Python except DataFrame.from_dict. However, that takes in only one dictionary, and I have multiple dictionaries that I am joining together, as well as other variables that I am using.
exp1_dict.items() is:

    [('lnc3', ['spata1', 'ahnak', 'fgg', 'erap1', 'hz', 'saasdas', 'nlrc5', 'huwe1']),
     ('lnc2', ['spata1', 'fgg', 'tmem68', 'atp6ap', 'huwe1']),
     ('lnc1', ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'atp6ap', 'saasdas', 'rad17', 'huwe1'])]

exp2_dict.items() is:

    [('lnc3', ['spata1', 'ahnak', 'tmem68', 'erap1', 'hz', 'rad17', 'nlrc5', 'huwe1']),
     ('lnc2', ['spata1', 'fgg', 'erap1', 'hz']),
     ('lnc1', ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'hz', 'atp6ap', 'rad17']),
     ('lnc4', ['erap1', 'prss16', 'hz', 'nlrc5'])]
The code that iterates over the dictionaries and generates 'newline' is:

    out = open("output.xls", "w")  # creates an empty output file
    out.write('header1\theader2\theader3\theader4\theader5\theader6\theader7\theader8\theader9\theader10\theader11\n')  # adds the header to the output file
    intersection_dict = {}  # empty intersection dictionary
    for key, value1 in exp1_dict.items():  # iterates over the 2 dictionaries
        if key in exp2_dict.keys():
            intersection_dict[key] = list(set(value1).intersection(exp2_dict[key]))
            newline = (key, str(f_exp1_dict[key]), str(f_exp2_dict[key]),
                       str('|'.join(value1)), str(len(exp1_dict[key])), str(len(exp1_corr.index)),
                       str('|'.join(exp2_dict[key])), str(len(exp2_dict[key])), str(len(exp2_corr.index)),
                       str('|'.join(intersection_dict[key])), str(len(intersection_dict[key])))
            out.write('\t'.join(newline) + '\n')
I then read the output.xls file into a pandas DataFrame:

    out.close()
    new_input = pd.read_table("output.xls", index_col=0)
Instead of creating an output file and then reading it into a pandas DataFrame, I was wondering if there is a way to write "newline" above into an empty pandas DataFrame, with the headers above, directly.

The output.xls file looks like this:

    header1 header2 header3 header4 header5 header6 header7 header8 header9 header10 header11
    lnc3 4 4 spata1|ahnak|fgg|erap1|hz|saasdas|nlrc5|huwe1 8 12 spata1|ahnak|tmem68|erap1|hz|rad17|nlrc5|huwe1 8 12 hz|erap1|ahnak|huwe1|nlrc5|spata1 6
    lnc2 2 3 spata1|fgg|tmem68|atp6ap|huwe1 5 12 spata1|fgg|erap1|hz 4 12 spata1|fgg 2
    lnc1 1.5 2 spata1|ahnak|fgg|tmem68|erap1|atp6ap|saasdas|rad17|huwe1 9 12 spata1|ahnak|fgg|tmem68|erap1|hz|atp6ap|rad17 8 12 erap1|rad17|ahnak|tmem68|atp6ap|spata1|fgg 7
Create a list of lists and use it to create the DataFrame:

    import pandas as pd

    rows = []  # one inner list per output row
    for key, value1 in exp1_dict.items():  # use .iteritems() on Python 2
        if key in exp2_dict:
            intersection = list(set(value1).intersection(exp2_dict[key]))
            col1 = key
            col2 = str(f_exp1_dict[key])
            col3 = str(f_exp2_dict[key])
            col4 = '|'.join(value1)
            col5 = str(len(exp1_dict[key]))
            col6 = str(len(exp1_corr.index))
            col7 = '|'.join(exp2_dict[key])
            col8 = str(len(exp2_dict[key]))
            col9 = str(len(exp2_corr.index))
            col10 = '|'.join(intersection)
            col11 = str(len(intersection))
            rows.append([col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11])
    df = pd.DataFrame(rows)
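For a self-contained illustration of the same idea, here is a minimal sketch that uses only the two sample dictionaries from the question. The column names are made up for this example, and the columns that depend on f_exp1_dict, f_exp2_dict, exp1_corr and exp2_corr are left out because those objects are not shown in the question. Each row is collected as a dict so that pandas can take the column names from the keys, and the intersection is sorted only to make the order of the joined string deterministic.

```python
import pandas as pd

# Sample data copied from the question.
exp1_dict = {
    "lnc3": ["spata1", "ahnak", "fgg", "erap1", "hz", "saasdas", "nlrc5", "huwe1"],
    "lnc2": ["spata1", "fgg", "tmem68", "atp6ap", "huwe1"],
    "lnc1": ["spata1", "ahnak", "fgg", "tmem68", "erap1", "atp6ap", "saasdas", "rad17", "huwe1"],
}
exp2_dict = {
    "lnc3": ["spata1", "ahnak", "tmem68", "erap1", "hz", "rad17", "nlrc5", "huwe1"],
    "lnc2": ["spata1", "fgg", "erap1", "hz"],
    "lnc1": ["spata1", "ahnak", "fgg", "tmem68", "erap1", "hz", "atp6ap", "rad17"],
    "lnc4": ["erap1", "prss16", "hz", "nlrc5"],
}

rows = []
for key, value1 in exp1_dict.items():
    if key in exp2_dict:  # keep only keys present in both dictionaries
        value2 = exp2_dict[key]
        shared = sorted(set(value1).intersection(value2))
        # Column names below are hypothetical, chosen for readability.
        rows.append({
            "key": key,
            "exp1_genes": "|".join(value1),
            "exp1_count": len(value1),
            "exp2_genes": "|".join(value2),
            "exp2_count": len(value2),
            "shared_genes": "|".join(shared),
            "shared_count": len(shared),
        })

# Build the DataFrame in memory; no intermediate file is written.
result = pd.DataFrame(rows).set_index("key")
print(result[["exp1_count", "exp2_count", "shared_count"]])
```

Keeping the counts as integers rather than strings means they can be used in numeric operations straight away; wrap them in str() if the text form from the original loop is needed.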