performance - Parallel processing of files in Java with ExecutorService does not use all of the CPU power -


I have a directory containing thousands of CSV files that need to be parsed. I have implemented a job using Java's ExecutorService, in which I assign each thread a CSV file to parse. My machine has 4 cores. The efficiency is better compared to a single-threaded application. However, when I look at the CPU utilization (using Task Manager), it does not seem to be using all of the CPU power: only 30%-40% of the CPU is used. I wanted to know if my approach is correct.

File dir = new File(file);
if (dir.isDirectory()) {
    File[] files = dir.listFiles();

    for (File f : files) {
        String file_abs_path = f.getAbsolutePath();
        int index = file_abs_path.lastIndexOf("/") + 1;
        file_name = file_abs_path.substring(index);
        futuresList.add(eService.submit(new MyParser(file_abs_path)));
    }

    for (Future<List<MyObj>> future : futuresList) {
        try {
            List<MyObj> docs = future.get();
            Iterator<MyObj> it = docs.iterator();
            while (it.hasNext()) {
                doc = createDocument(file_name, it.next());
                try {
                    // someFunction(doc);
                } catch (Exception e) {}
            }
        } catch (InterruptedException e) {
        } catch (ExecutionException e) {}
    }
}
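The post does not show how eService is constructed. If the pool has fewer threads than cores, or so many threads that they all contend for the disk, utilisation can suffer. Below is a minimal sketch, assuming a fixed pool sized to the core count is acceptable; the class name PoolSetup is only illustrative and not part of the original code.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSetup {
    public static void main(String[] args) {
        // Size the pool to the core count reported by the JVM (4 on the asker's machine).
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService eService = Executors.newFixedThreadPool(cores);

        // ... submit MyParser tasks and collect the Futures as in the loop above ...

        eService.shutdown(); // stop accepting new tasks once everything is submitted
    }
}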

I was wondering if this approach is correct? Any help would be appreciated.

Thanks

The code for the parser:

public List<MyObj> call() {
    ColumnPositionMappingStrategy<MyObj> strat = new ColumnPositionMappingStrategy<MyObj>();
    strat.setType(MyObj.class);
    String[] columns = new String[] { /* list of columns in the csv file */ };
    strat.setColumnMapping(columns);
    CsvToBean<MyObj> csv = new CsvToBean<MyObj>();
    BufferedReader reader = null;
    String doc_line = "";
    String[] docs;
    String doc = "";
    File dir = new File(file_path);
    try {
        int comma_count = 0; // (unused)
        reader = new BufferedReader(new FileReader(dir));
        while ((doc_line = reader.readLine()) != null) {
            docs = doc_line.split(",");
            doc += docs[i] + " "; // i = index of the column of interest
        }
        reader.close();
    } catch (IOException e) { /* e.printStackTrace(); */ }
    return csv.parse(strat, new StringReader(doc));
}

As commented, your task is I/O bound, as tasks involving hard-drive I/O generally are.

The best performance you can hope for comes from decoupling the reading from the processing threads. Most probably, a single reading thread, reading blocks of data as large as possible and feeding a queue for processing, will yield the best overall throughput. The number of processing threads should be whatever is necessary to keep up with the reading.
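A minimal sketch of that pattern, assuming a BlockingQueue between a single reading thread and a small pool of parsing workers; the names ReaderWorkerSketch and parseLines are placeholders, not taken from the original code.

import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReaderWorkerSketch {
    // Distinct sentinel object telling workers that no more files are coming.
    private static final List<String> POISON = new ArrayList<>();

    public static void main(String[] args) throws Exception {
        File[] files = new File(args[0]).listFiles();
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(16);

        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Processing threads: take raw lines off the queue and parse them (CPU work only).
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    List<String> lines;
                    while ((lines = queue.take()) != POISON) {
                        parseLines(lines); // placeholder for the CSV-to-object parsing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Single reading thread (here the main thread) does all the disk I/O sequentially.
        for (File f : files) {
            queue.put(Files.readAllLines(f.toPath()));
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON); // one sentinel per worker so every worker terminates
        }
        pool.shutdown();
    }

    private static void parseLines(List<String> lines) {
        // ... build MyObj instances from the CSV lines ...
    }
}

The bounded queue also provides back-pressure: if the parsers fall behind, the reader blocks on put() instead of loading the whole directory's contents into memory.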

