user3001566 - 3 months ago 63
Android Question

How to exit loop including Callable ExecutorService MultiThreading with Java (Android)

I have a large plist (XML) file that I parse with SAX (dd-plist library). Because the file is large, I have to use multithreading for performance reasons. My goal is to have exactly as many threads as there are keys in my plist file: for each key in the plist, a single thread looks up the value and compares the key with a URL. If the key and the URL are equal, the thread returns the value for that key; otherwise it returns null, and the thread is skipped and cancelled. (The value is the title of the HTML content, the key is the path stored in the plist, and the URL is whatever link the user clicked, captured in onPageFinished of the Android WebView.) I would appreciate it if someone could tell me what I have missed in the code for this goal.

In my WebFragment (android.support.v4.app.Fragment), in onPageFinished:

import com.dd.plist.NSDictionary;
import com.dd.plist.NSObject;
import com.dd.plist.PropertyListParser;

...
try {
    is = getResources().openRawResource(R.raw.title);
    rootDict = (NSDictionary) PropertyListParser.parse(is);
    dict = new LinkedHashMap<>();
    dict = rootDict.getHashMap();
    ExecutorService executor = Executors.newFixedThreadPool(rootDict.size());
    Future<String> future;
    String myStr = null;
    String key;
    NSObject value;

    for (Map.Entry<String, NSObject> entry : dict.entrySet()) {
        key = entry.getKey();
        value = entry.getValue();
        // The following line is WebFragment line 285, where the log reports the out-of-memory crash
        future = executor.submit(new ParsePlistThread(key, value, url.substring(32).toString()));
        myStr = future.get();
        if (myStr != null && !myStr.isEmpty()) {
            break;
        } else {
            //future.cancel(true);
        }
    }
    executor.shutdown();

    if (myStr != null) {

        if (numTab == 0) {
            titleTextView.setText(myStr);
        }
    }
} catch (Exception ex) {
    //Handle exceptions...
}


Here is the ParsePlistThread class:

import com.dd.plist.NSObject;

import java.util.concurrent.Callable;

/**
* Created by manager on 2016-08-18.
*/
public class ParsePlistThread implements Callable<String> {

    public String key;
    public NSObject valueObject;
    public String url;

    public ParsePlistThread(String key, NSObject valueObj, String url) {
        this.key = key;
        this.valueObject = valueObj;
        this.url = url;
    }

    @Override
    public String call() throws Exception {
        if (key.equals(url)) {
            return valueObject.toString();
        } else {
            return null;
        }
    }
}


Here is the log:

E/art: Throwing OutOfMemoryError "pthread_create (1040KB stack) failed: Try again"
08-19 09:52:50.328 28749-28749/ca.ccohs.oshanswers E/AndroidRuntime: FATAL EXCEPTION: main
Process: XXX, PID: 28749
java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
at java.lang.Thread.nativeCreate(Native Method)
at java.lang.Thread.start(Thread.java:1063)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:920)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1327)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:103)
at ca.ccohs.oshanswers.ui.WebFragment$3.onPageFinished(WebFragment.java:285)
at com.android.webview.chromium.WebViewContentsClientAdapter.onPageFinished(WebViewContentsClientAdapter.java:531)
at org.chromium.android_webview.AwContentsClientCallbackHelper$MyHandler.handleMessage(AwContentsClientCallbackHelper.java:188)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:145)
at android.app.ActivityThread.main(ActivityThread.java:6117)
at java.lang.reflect.Method.invoke(Native Method)
at java.lang.reflect.Method.invoke(Method.java:372)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1399)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1194)
08-19 09:52:50.343 2850-29945/? E/android.os.Debug: ro.product_ship = true
08-19 09:52:50.343 2850-29945/? E/android.os.Debug: ro.debug_level = 0x4f4c

Answer

Problems with your approach

Apart from the conceptual problem of putting extremely small tasks into other threads (you'll spend more time passing information around than actually computing), there are 2 main problems with this code:

1) The actual error you got. This error is due to you running out of memory for thread stacks. Java memory is divided into multiple regions, the largest and best known of which is "Heap Memory". This is where (almost) all of your objects live. A lesser-known region is "Stack Memory". This is where each thread stores its current state: call frames, local (method) variables, etc. When a Thread is created, it is allocated a fixed amount of memory for its stack (1040KB per thread, according to your log). If too many threads are created, that allocation fails, and an error such as the one you encountered is thrown.

Solution - Reuse your Threads!

Your executor has built-in capabilities to reuse threads when they are done with a task. More on this below. In general, for CPU-bound work, having more threads than logical cores in your CPU will not improve speed.

2) You're not actually doing anything concurrently. In your loop, you submit a task to the Executor (the executor.submit method), then wait for that task to complete (future.get), and only then move on to the next iteration. Hence, you are waiting for the current task to finish before making a new one! You will never have 2 tasks running in parallel with this arrangement.
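
To make point 2 concrete, here is a minimal sketch in plain Java 8 (no Android or dd-plist involved) that mimics the submit-then-get loop and records how many tasks were ever running at once. The class name SubmitThenGetDemo, the method peakConcurrency, and the 1 ms of pretend work are made up for illustration; they are not from your code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class SubmitThenGetDemo {

    // Mimics the loop from the question: submit one task, block on get(),
    // and only then submit the next. Returns the highest number of tasks
    // that were ever running at the same moment.
    static int peakConcurrency(int poolSize, int taskCount) throws Exception {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max); // remember the maximum seen
                    Thread.sleep(1);                        // pretend to do some work
                    running.decrementAndGet();
                    return null;
                }).get(); // blocks until this one task is done
            }
        } finally {
            executor.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("peak concurrency: " + peakConcurrency(8, 20));
    }
}
```

However large the pool is, the peak never exceeds 1 here, because each get() returns only after its task has finished and the next task is not submitted until then.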

One last point: you should not rely on multithreading to speed up file processing. The bottleneck is almost always reading the file. It's very likely that you are doing something silly with the file that makes it slow.
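
As an illustration of how small the per-key work is: each task only checks key.equals(url), and the parsed plist is already available as a map in the question's code. Assuming the keys are compared exactly like that, a single-threaded lookup may be all that is needed. This is only a sketch: Map<String, String> stands in for the NSDictionary contents, and the class name, method name, and sample entries are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DirectLookupDemo {

    // titlesByPath is a hypothetical stand-in for the parsed plist
    // (path -> page title); path is the already-trimmed URL.
    static String findTitle(Map<String, String> titlesByPath, String path) {
        return titlesByPath.get(path); // null when no key equals path
    }

    public static void main(String[] args) {
        // Made-up sample entries, for illustration only.
        Map<String, String> titlesByPath = new LinkedHashMap<>();
        titlesByPath.put("ladders.html", "Ladders");
        titlesByPath.put("noise.html", "Noise");
        System.out.println(findTitle(titlesByPath, "noise.html"));
    }
}
```

A hash lookup like this does the same comparison as the loop of tasks, without creating any threads.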


Multithreading done better

It has come up in the comments that it may be useful to see how these mistakes would be fixed in a case where multithreading was worthwhile. The following addresses that.

First things first - limit the number of threads. If you were running on a quad-core desktop with hyper-threading, I'd recommend 6 threads, or just a work-stealing pool. I'm not sure what's a good number for Android, but it's definitely lower than the number of lines in your "very large file".

ExecutorService executor = Executors.newFixedThreadPool(6);

or

ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

This way, when you have more tasks than there are available threads (and consequently than there are available processors, unless your threads are I/O bound), new tasks are queued up until an existing thread becomes available, instead of more threads being created.

The next problem is actually putting all tasks into the executor to execute ASAP, rather than sequentially. For this you have to keep track of the futures you've created. (Note: I've also removed the substring call from the loop, since the URL doesn't seem to change between invocations, so you can precalculate it. It's a general thing - don't redo work in loops that you can do once!)
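
A small sketch of that queueing behaviour, in plain Java 8 and independent of Android: it pushes many trivial tasks through a small fixed pool and counts how many distinct worker threads actually ran them. The names PoolReuseDemo and distinctWorkerThreads are invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolReuseDemo {

    // Runs taskCount trivial tasks on a pool of poolSize threads and
    // returns how many distinct worker threads actually executed them.
    static int distinctWorkerThreads(int poolSize, int taskCount) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        Set<String> workerNames = Collections.synchronizedSet(new HashSet<String>());
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                // Tasks beyond poolSize wait in the executor's queue.
                futures.add(executor.submit(() -> {
                    workerNames.add(Thread.currentThread().getName());
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } finally {
            executor.shutdown();
        }
        return workerNames.size();
    }

    public static void main(String[] args) throws Exception {
        int used = distinctWorkerThreads(4, 1000);
        System.out.println("1000 tasks ran on " + used + " thread(s)");
    }
}
```

With a pool size of 4, the count cannot exceed 4 no matter how many tasks are submitted, which is exactly what keeps the thread-stack memory bounded.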

// url is assumed to already hold the precalculated url.substring(32)
List<Future<String>> tasks = new ArrayList<>();
for (Map.Entry<String, NSObject> entry : dict.entrySet()) {
    key = entry.getKey();
    value = entry.getValue();
    tasks.add(executor.submit(new ParsePlistThread(key, value, url)));
}

Now that you have submitted all the tasks (I'll reiterate: using such small tasks in such large quantities is, in general, counter-productive), you need to collect the results. Doing this is rather simple, just iterate through your futures!

String result = null; // must be initialized, or reading it after the loop will not compile
for (Future<String> fut : tasks) {
    String taskResult = fut.get();
    if (taskResult != null && !taskResult.isEmpty()) {
        result = taskResult;
        break;
    }
}

There is one major difference between your approach and this one - yours does not proceed with parsing once it has found a result. You can achieve that in this particular case by simply using future.cancel on the futures you've not yet visited. I'll leave the code to you. In general, this is more difficult, as it involves inter-thread communication (you have to signal another thread to gracefully stop its execution, which may not be trivial).
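
For reference, one way the cancellation could look, as a self-contained sketch in plain Java 8 rather than a drop-in for your fragment. It submits every task first, takes the first non-empty result, and calls cancel on the futures it has not yet visited. String values stand in for NSObject, and the names FirstMatchDemo, matchTask and findTitle, the pool size of 4, and the sample entries are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FirstMatchDemo {

    // Stand-in for ParsePlistThread, using String values instead of NSObject.
    static Callable<String> matchTask(String key, String value, String path) {
        return () -> key.equals(path) ? value : null;
    }

    static String findTitle(Map<String, String> dict, String path) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<String>> tasks = new ArrayList<>();
        try {
            // Submit everything up front so tasks can run concurrently.
            for (Map.Entry<String, String> e : dict.entrySet()) {
                tasks.add(executor.submit(matchTask(e.getKey(), e.getValue(), path)));
            }
            String result = null;
            for (Future<String> fut : tasks) {
                if (result != null) {
                    // Already have an answer: cancel what has not been visited.
                    // cancel() on an already finished task is a harmless no-op.
                    fut.cancel(true);
                    continue;
                }
                String taskResult = fut.get();
                if (taskResult != null && !taskResult.isEmpty()) {
                    result = taskResult;
                }
            }
            return result;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample entries, for illustration only.
        Map<String, String> dict = new LinkedHashMap<>();
        dict.put("ladders.html", "Ladders");
        dict.put("noise.html", "Noise");
        System.out.println(findTitle(dict, "ladders.html"));
    }
}
```

Note that cancel(true) only prevents queued tasks from starting and interrupts running ones; a running task still has to react to the interrupt itself, which is the non-trivial part mentioned above.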


A word of advice - starting to learn about multithreading while trying to achieve more speed is not very productive IMHO. There are many subtleties around it (the above didn't even mention the 2 devils - statement reordering and memory visibility), and getting them right AND fast is quite a challenge! It's far better to first try to do something in parallel correctly, even if it is not necessarily quicker. When you're comfortable with making parallel processes correct, you can think about making them faster.