Robust Reinforcement Learning for Customized Farming
Advancements in technology unlock an increasing number of ways to tailor farming practices to the characteristics of a particular farm. For example, data collected from various sensors can guide the customization of fertilizers and enable data-driven improvement over time. This is a natural Reinforcement Learning (RL) problem, as crop quality depends on a range of factors in a complex and often opaque way. For RL to be integrated into fertilizer customization, it is crucial that the techniques be robust. One aspect of robustness is the capability to detect and adjust for model errors, which entails both model selection and parameter estimation. How to achieve both objectives simultaneously while maintaining provably good performance is an open problem that we intend to address.
- Motivated by the fast-growing number of data sources available for data-driven decision making, we studied a simplified problem known as adversarial bandits with expert advice.
- We focused on the case where the number of experts is large, possibly infinite.
- Our goal was to design algorithms that are computationally feasible and provably performant: the algorithm's decisions should be almost as good as those of the best expert in hindsight.
- Inspired by the problem of finite-time model selection for RL, we considered experts structured in a novel way that captures the trade-off between approximation error and estimation error.
- We proposed an algorithm called Best Expert Search (BEES) and provided theoretical guarantees on its performance.
- Given a set of policies, BEES can learn in finite time and perform almost as well as the best policy in the set.
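BEES itself is not specified here, so as a minimal sketch of the adversarial-bandits-with-expert-advice setting, the following implements the classic Exp4 algorithm (exponential weights over experts, with importance-weighted reward estimates). The function name, learning-rate choice, and synthetic reward setup are illustrative assumptions, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(0)

def exp4(expert_advice, rewards, eta):
    """Exp4 for adversarial bandits with expert advice.

    expert_advice: array (T, N, K) -- each of N experts' probability
                   distribution over K arms at each round t.
    rewards:       array (T, K)    -- adversarial arm rewards in [0, 1];
                                      only the pulled arm's reward is used.
    eta:           learning rate.
    """
    T, N, K = expert_advice.shape
    log_w = np.zeros(N)               # log-weights over experts
    total_reward = 0.0
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        q = w / w.sum()               # distribution over experts
        p = q @ expert_advice[t]      # induced distribution over arms
        arm = rng.choice(K, p=p)
        r = rewards[t, arm]
        total_reward += r
        # importance-weighted estimate of the full reward vector
        r_hat = np.zeros(K)
        r_hat[arm] = r / p[arm]
        # credit each expert with its expected estimated reward
        log_w += eta * (expert_advice[t] @ r_hat)
    return total_reward

# Toy run: two experts, one always recommending the best arm.
T, K = 2000, 2
advice = np.zeros((T, 2, K))
advice[:, 0, 0] = 1.0                 # expert 0: always arm 0
advice[:, 1, 1] = 1.0                 # expert 1: always arm 1
rewards = np.zeros((T, K))
rewards[:, 0] = 1.0                   # arm 0 is always best
eta = np.sqrt(2 * np.log(2) / (T * K))
total = exp4(advice, rewards, eta)
```

In this toy run the learner's total reward approaches that of the best expert (close to T), consistent with Exp4's sublinear-regret guarantee; extending such guarantees to very large or infinite expert sets is the regime the work above targets.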
X. F. Meng, T. Sarkar, M. A. Dahleh. Adversarial bandits with infinitely many experts. Under Review.