Thursday 5 January 2012

ANR keyDispatchingTimedOut and who's to blame

Huh, it took me a while to deal with this.

Imagine you have an app, which is generally accepted really well (although not insanely popular) and gets overwhelmingly positive reviews and 5-star ratings.

Then something strange happens: someone puts 1-star rating. 'OK' you think, 'probably this is just a disturbance in the force. Someone was pissed about something or is generally a nasty person, so let's live with it.'.

Then things go as usual for some time. You get some 5-stars juice and all is fine.

Then this happens again. One star. This time with a colorful review of how this crap doesn't work at all and force-closes constantly. Fortunately, the person was nice enough  to actually submit an error report so I could get a clue of what's going on.

I must admit I am a bit ashamed (professionally) because I spent a good one hour staring and scrolling through the code and looking for the cause. What makes me feel a bit better is that (unfortunately) the Android Market's Developer console was in one of its not-so-great periods again and all I could see was the exception message which was:

ANR keyDispatchingTimedOut in SearchResultActivity

OK. I know that ANR keyDispatching is caused by executing time-consuming operations, such as super-heavy CPU processing or -- more often -- blocking I/O calls (downloading from the network, accessing a database etc.) in the UI thread.  So at some point, the runtime will accept the activity has hanged and will force close it.

In the activity that creates the problem I have a ListView and its items are indeed loaded from the Internet (the app itself is here but it is in Bulgarian). Each item of the list may or may not have an image (again downloaded from the Internet). What puzzled me is that all downloads happened in their own threads and the UI was not blocked. The app worked fine in each and every occasion I have tested it and apparently it worked fine for its more than 1000 users (except two).

I was ready to curse Android, Java and every handset and ROM creator under the Sun for this apparent bug in the platform. Only, I am in the code writing business for long enough to know that 99.999% of the times it is my fault. If you're a programmer, you will know what I am talking about. How many times did you 'find' bugs in gcc, Java, .NET or python? And how many times it was actually your code to blame? You get the point.

So, as the stupid Universe works, again, I was the idiot here.

Roughly speaking, I had code like this:


private class ImageFetcher extends Thread {
Bitmap b;
ImageView iv;
public ImageFetcher(ImageView iv){
this.iv = iv;
String completeUrl = BASE_URL + recipy.getImageUrl();
b = ImageDownloader.DownloadImage(completeUrl);
handler.sendEmptyMessage(0);
}

private Handler handler = new Handler() {
                       @Override
                       public void handleMessage(Message msg) {
                            if (b != null){
                              iv.setImageBitmap(b);
                          } else {
                               iv.setVisibility(View.GONE);
                          }
                        }
        };


Did you spot it? I bet you did. If you haven't take another look. The problem is so stupid! And so apparent! This thread is actually not overriding the run() method. It does it's job in the constructor! It's synchronous!

I can't really tell how many times I looked at the code and didn't see anything suspicious. It's a thread, it has a handler, all seems fine! And during all the testing I have made, over WiFi and 3G the performance of the network (and the backend server) was apparently great because the UI was feeling fast and sleek. And this was the case for -- almost, but obviously not -- all of the app users.

But when the server is overloaded, or when the network is slow or unstable, you'll get what you coded. 1 star and a frustrated user.

So the moral. Two of them actually:
1/ The Universe's default mode is that it's your fault!
2/ Watch for the details.

=======================
Update: Here is the correct implementation of ImageFetcher class (I post it here just for completeness, not that it contains something surprising or interesting).


private class ImageFetcher extends Thread {  
  Bitmap b;
  ImageView iv;
  public ImageFetcher(ImageView iv){
   this.iv = iv;     
  }
  @Override 
  public void run(){
   String completeUrl = BASE_URL + recipy.getImageUrl();
   b = ImageDownloader.DownloadImage(completeUrl);
   handler.sendEmptyMessage(0);
  }
  
  private Handler handler = new Handler() {

            @Override
            public void handleMessage(Message msg) {
             if (b != null){    
                 iv.setImageBitmap(b);
             } else {
                 iv.setVisibility(View.GONE);
             }
            }
        };
 }