python - Convert an iteration over dictionaries into a pandas dataframe


I have generated two dictionaries from two input files using conditional statements (not shown). My aim is to use these two dictionaries to identify overlapping values, and to use the output of the iteration as a pandas DataFrame directly. To do that, I am currently writing the output of the iteration/for loop to a file (output.xls) and then reading that file into a pandas DataFrame. While this works well, I was wondering if there is a way to use the 'newline' from the iteration below directly as input to an empty pandas DataFrame. I couldn't find an option for this in Python except DataFrame.from_dict. However, that takes in only one dictionary, whereas I have multiple dictionaries that I am joining together, along with other variables that I am utilizing.

exp1_dict.items() is:

    [('lnc3', ['spata1', 'ahnak', 'fgg', 'erap1', 'hz', 'saasdas', 'nlrc5', 'huwe1']),
     ('lnc2', ['spata1', 'fgg', 'tmem68', 'atp6ap', 'huwe1']),
     ('lnc1', ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'atp6ap', 'saasdas', 'rad17', 'huwe1'])]

exp2_dict.items() is:

    [('lnc3', ['spata1', 'ahnak', 'tmem68', 'erap1', 'hz', 'rad17', 'nlrc5', 'huwe1']),
     ('lnc2', ['spata1', 'fgg', 'erap1', 'hz']),
     ('lnc1', ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'hz', 'atp6ap', 'rad17']),
     ('lnc4', ['erap1', 'prss16', 'hz', 'nlrc5'])]
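For reference, the overlap step on its own can be sketched with the sample data above. This is a minimal illustration, not the asker's full loop: it only computes the shared values for keys present in both dictionaries, and it sorts each result (an assumption added here) so that the order is reproducible, since set ordering is arbitrary.

```python
# Sample data copied from the dictionaries shown above
exp1_dict = {
    'lnc3': ['spata1', 'ahnak', 'fgg', 'erap1', 'hz', 'saasdas', 'nlrc5', 'huwe1'],
    'lnc2': ['spata1', 'fgg', 'tmem68', 'atp6ap', 'huwe1'],
    'lnc1': ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'atp6ap', 'saasdas', 'rad17', 'huwe1'],
}
exp2_dict = {
    'lnc3': ['spata1', 'ahnak', 'tmem68', 'erap1', 'hz', 'rad17', 'nlrc5', 'huwe1'],
    'lnc2': ['spata1', 'fgg', 'erap1', 'hz'],
    'lnc1': ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'hz', 'atp6ap', 'rad17'],
    'lnc4': ['erap1', 'prss16', 'hz', 'nlrc5'],
}

# Keep only keys found in both dictionaries and intersect their value lists.
# sorted() is used here only to make the output order deterministic.
intersection_dict = {
    key: sorted(set(values) & set(exp2_dict[key]))
    for key, values in exp1_dict.items()
    if key in exp2_dict
}

for key in sorted(intersection_dict):
    print(key, len(intersection_dict[key]), '|'.join(intersection_dict[key]))
```

'lnc4' is dropped because it appears only in exp2_dict, and the counts for the remaining keys match the last column of the output file shown further down.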

The code that iterates over the dictionaries and generates 'newline' is:

    out = open("output.xls", "w")  # creates an empty output file
    # add the header row to the output file
    out.write('header1\theader2\theader3\theader4\theader5\theader6\theader7\theader8\theader9\theader10\theader11\n')

    intersection_dict = {}  # empty dictionary to hold the intersections
    for key, value1 in exp1_dict.items():  # iterate over the two dictionaries
        if key in exp2_dict:
            intersection_dict[key] = list(set(value1).intersection(exp2_dict[key]))
            newline = (key, str(f_exp1_dict[key]), str(f_exp2_dict[key]),
                       '|'.join(value1), str(len(exp1_dict[key])), str(len(exp1_corr.index)),
                       '|'.join(exp2_dict[key]), str(len(exp2_dict[key])), str(len(exp2_corr.index)),
                       '|'.join(intersection_dict[key]), str(len(intersection_dict[key])))
            out.write('\t'.join(newline) + '\n')

I then read the output.xls file into a pandas DataFrame:

    out.close()
    new_input = pd.read_table("output.xls", index_col=0)

Instead of creating an output file and then reading it into a pandas DataFrame, I was wondering if there is a way to write the "newline" above directly into an empty pandas DataFrame that has the headers above.

The output.xls file looks like this (tab-separated, one row per key):

    header1  header2  header3  header4  header5  header6  header7  header8  header9  header10  header11
    lnc3  4  4  spata1|ahnak|fgg|erap1|hz|saasdas|nlrc5|huwe1  8  12  spata1|ahnak|tmem68|erap1|hz|rad17|nlrc5|huwe1  8  12  hz|erap1|ahnak|huwe1|nlrc5|spata1  6
    lnc2  2  3  spata1|fgg|tmem68|atp6ap|huwe1  5  12  spata1|fgg|erap1|hz  4  12  spata1|fgg  2
    lnc1  1.5  2  spata1|ahnak|fgg|tmem68|erap1|atp6ap|saasdas|rad17|huwe1  9  12  spata1|ahnak|fgg|tmem68|erap1|hz|atp6ap|rad17  8  12  erap1|rad17|ahnak|tmem68|atp6ap|spata1|fgg  7

Create a list of lists and use it to create the DataFrame:

    import pandas as pd

    rows = []
    for key, value1 in exp1_dict.items():
        if key in exp2_dict:
            # values shared by both dictionaries for this key
            common = list(set(value1).intersection(exp2_dict[key]))
            rows.append([
                key,
                f_exp1_dict[key],
                f_exp2_dict[key],
                '|'.join(value1),
                len(exp1_dict[key]),
                len(exp1_corr.index),
                '|'.join(exp2_dict[key]),
                len(exp2_dict[key]),
                len(exp2_corr.index),
                '|'.join(common),
                len(common),  # count of shared values, as in the file's last column
            ])

    # same 11 headers as the file; header1 becomes the index, like index_col=0
    columns = ['header%d' % i for i in range(1, 12)]
    df = pd.DataFrame(rows, columns=columns).set_index('header1')
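To try the approach end to end, here is a self-contained sketch that uses the sample dictionaries from the question. The question does not show f_exp1_dict, f_exp2_dict, exp1_corr or exp2_corr, so the stand-in values below are assumptions read off the output.xls listing (header2, header3, and the constant 12 in header6 and header9). The shared values are sorted only to make the result reproducible.

```python
import pandas as pd

exp1_dict = {
    'lnc3': ['spata1', 'ahnak', 'fgg', 'erap1', 'hz', 'saasdas', 'nlrc5', 'huwe1'],
    'lnc2': ['spata1', 'fgg', 'tmem68', 'atp6ap', 'huwe1'],
    'lnc1': ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'atp6ap', 'saasdas', 'rad17', 'huwe1'],
}
exp2_dict = {
    'lnc3': ['spata1', 'ahnak', 'tmem68', 'erap1', 'hz', 'rad17', 'nlrc5', 'huwe1'],
    'lnc2': ['spata1', 'fgg', 'erap1', 'hz'],
    'lnc1': ['spata1', 'ahnak', 'fgg', 'tmem68', 'erap1', 'hz', 'atp6ap', 'rad17'],
    'lnc4': ['erap1', 'prss16', 'hz', 'nlrc5'],
}

# Assumed stand-ins for variables not shown in the question,
# with values taken from the output.xls listing.
f_exp1_dict = {'lnc3': 4, 'lnc2': 2, 'lnc1': 1.5}
f_exp2_dict = {'lnc3': 4, 'lnc2': 3, 'lnc1': 2}
n_exp1 = 12  # stands in for len(exp1_corr.index)
n_exp2 = 12  # stands in for len(exp2_corr.index)

columns = ['header%d' % i for i in range(1, 12)]

rows = []
for key, value1 in exp1_dict.items():
    if key in exp2_dict:
        # sorted only so the joined string is deterministic
        common = sorted(set(value1) & set(exp2_dict[key]))
        rows.append([
            key, f_exp1_dict[key], f_exp2_dict[key],
            '|'.join(value1), len(value1), n_exp1,
            '|'.join(exp2_dict[key]), len(exp2_dict[key]), n_exp2,
            '|'.join(common), len(common),
        ])

# header1 becomes the index, mirroring read_table(..., index_col=0)
new_input = pd.DataFrame(rows, columns=columns).set_index('header1')
print(new_input)
```

Because the counts are appended as numbers rather than strings, the numeric columns keep numeric dtypes, much as they would after being parsed back from the tab-separated file.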
