Hello Yonit,
Thanks for reading the article.
If you mean that end-to-end detection takes about 500 ms, that is, capturing the image -> resizing it to the input size the model requires -> passing it to the model -> fetching the results from the model -> drawing the results in the Android app, then I would say that looking into app-side optimization will help.
However, the MobileNet model itself takes approximately 203 ms (based on the runtime I measured from the command line while developing the model). If you want to optimize that part, the image input type and the inference step are the places to look.
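One way to see whether the time goes to the model or to the app side is to time each stage of the pipeline separately. Below is a rough Python sketch of that idea, not the app code: the stage functions, the 640x480 frame, the 224x224 input size, and the 10 ms sleep are all placeholders for illustration, so swap in your real steps.

```python
import time


def time_stages(stages, data):
    """Run each (name, fn) stage in order; return the final output and per-stage times in ms."""
    timings = {}
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return data, timings


# Hypothetical stand-ins for the real pipeline steps.
def capture(_):
    # Pretend camera frame: 480 rows x 640 columns of zeros (assumed size).
    return [[0] * 640 for _ in range(480)]


def resize(frame):
    # Crude nearest-neighbour subsample to 224x224 (assumed model input size).
    side = 224
    rows, cols = len(frame), len(frame[0])
    return [
        [frame[r * rows // side][c * cols // side] for c in range(side)]
        for r in range(side)
    ]


def run_model(_image):
    # Placeholder for the real inference call.
    time.sleep(0.01)
    return [("object", 0.9)]


def draw(results):
    # Placeholder for drawing boxes; just reports how many results there were.
    return len(results)


STAGES = [
    ("capture", capture),
    ("resize", resize),
    ("inference", run_model),
    ("draw", draw),
]

output, timings = time_stages(STAGES, None)
total = sum(timings.values())
for name, ms in timings.items():
    print(f"{name:10s} {ms:8.2f} ms  ({100 * ms / total:5.1f}% of total)")
```

If the inference stage accounts for most of the total, the model input and inference are worth tuning; if the other stages dominate, the app side is the better place to start.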
If you have any further insights, please share them in the comments.
Happy to help.