I think what you have in mind is very feasible. This is quite similar to what Google Assistant does when you give it an image or camera feed: it can detect common flowers and other everyday objects. I believe you're trying to apply a similar idea within DNR for a specific task.
Given your situation, this is closer to classification than detection: given an image, decide whether it shows a weed, rather than locating weeds within an image. Classification models also tend to be quite accurate.
My suggestion for the implementation: pick a model pretrained on a large dataset such as ImageNet (ResNet is a common choice), then perform transfer learning on your specific images and classes. This should give you a quite accurate model for downstream analysis.
You can indeed run a machine learning model on your phone, but the implementation might be challenging. One simpler idea would be to take video or photos with geotagging, then analyse them later on a powerful computer to detect weeds.
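One way the geotagging step could work, as a hedged sketch: if the photos themselves aren't geotagged, you can pair each photo's capture time with the nearest fix in a separate GPS track log, so photos later classified as weeds map back to field locations. The track format and filenames below are assumptions, not a specific tool's output:

```python
from bisect import bisect_left

def nearest_fix(track, t):
    """track: time-sorted list of (timestamp, lat, lon); return the fix closest to t."""
    times = [ts for ts, _, _ in track]
    i = bisect_left(times, t)
    candidates = track[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda fix: abs(fix[0] - t))

# Hypothetical GPS log (Unix timestamps) and photo capture times.
track = [(100, 52.01, -0.50), (160, 52.02, -0.51), (220, 52.03, -0.52)]
photos = [("IMG_0001.jpg", 158), ("IMG_0002.jpg", 215)]

for name, t in photos:
    ts, lat, lon = nearest_fix(track, t)
    print(name, lat, lon)
```

If the phone's camera app already writes GPS EXIF tags, you can skip this and read the coordinates straight from the image files instead.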
What I said might sound complicated (or maybe not), but there are plenty of resources if you simply google "transfer learning image classification". You should find plenty of properly implemented examples too.
I hope this helps you in due course.