As large language model (LLM) technology matures, its range of applications keeps expanding. From intelligent writing to search engines, the potential of LLMs is gradually being tapped.
Recently, Microsoft Research Asia proposed using LLMs for industrial control, reporting that with only a small number of samples they can achieve better results than traditional reinforcement learning methods. The study used GPT-4 to control heating, ventilation, and air conditioning (HVAC) systems, with quite positive results.
In the field of intelligent control, reinforcement learning (RL) is one of the most popular decision-making methods, but it suffers from sample inefficiency, and therefore high training cost, when an agent learns a task from scratch. The traditional RL paradigm is fundamentally ill-suited to solving these problems. After all, even humans typically need thousands of hours of learning to become domain experts, which presumably corresponds to millions of interactions.
However, for many control tasks in industrial scenarios, such as inventory management, quantitative trading, and HVAC control, practitioners want high-performance controllers that can handle different tasks at low cost, which poses a great challenge for traditional control methods.
For example, we might want to control the HVAC systems of different buildings with minimal fine-tuning and a limited number of reference demonstrations. The basic principles of HVAC control may be similar across tasks, but the dynamics may shift from one scenario to another, and even the state/action spaces may differ.