Reinforcement learning for inventory management

Inventory control is one of the most significant problems in a firm's supply chain management. Reducing stock costs improves both performance and competitiveness.


Inventory management covers controlling and overseeing purchases from suppliers as well as from customers, maintaining stock in storage, controlling the amount of product for sale, and fulfilling orders. A decision maker (the learning agent) observes the stochastic demand and local inventory information, such as inventory levels, as its inputs, and chooses the next order quantities as its actions. Because on-hand inventory (the stock currently available), unmet demand (backorders), and the act of ordering all incur costs, the optimization problem is to minimize the overall cumulative cost.

The objective is therefore to minimize the long-run cost (equivalently, to maximize the cumulative reward, with the reward defined as the negative cost), whose components are linear holding costs, linear backorder penalties, and fixed ordering costs. Most inventory management policies rely on basic heuristics that cannot always account for the complexity of the system and the stochasticity of demand.
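The cost structure described above can be sketched as a small function. This is a minimal illustration, not the project's specification: the function name `period_cost` and the numeric cost coefficients are assumptions chosen for the example, and a negative inventory level is assumed to represent backorders.

```python
def period_cost(inventory_level, order_quantity,
                holding_cost=1.0, backorder_cost=5.0, fixed_order_cost=10.0):
    """Cost of one period: linear holding + linear backorder penalty + fixed ordering cost.

    All coefficients are illustrative assumptions, not values from the project.
    A negative inventory_level is interpreted as unmet demand (backorders).
    """
    on_hand = max(inventory_level, 0)       # stock physically available
    backorders = max(-inventory_level, 0)   # unmet demand carried over
    ordering = fixed_order_cost if order_quantity > 0 else 0.0  # paid once per order
    return holding_cost * on_hand + backorder_cost * backorders + ordering


# The reward seen by a learning agent is the negative of this cost.
print(period_cost(4, 0))    # holding cost only
print(period_cost(-2, 3))   # backorder penalty plus fixed ordering cost
```

The fixed ordering cost is what makes the problem non-trivial: it rewards batching orders, which has to be traded off against the holding cost of the extra stock.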

Such heuristics can fail in two ways: ordering too much, which incurs unnecessary costs, or ordering too little, which leaves demand unsatisfied.
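As an illustration of what a basic heuristic can look like, the sketch below implements a reorder-point rule of the (s, S) type: order up to a target level whenever stock falls below a threshold. The text does not say which heuristics it has in mind, so the choice of this rule, the function name `reorder_point_policy`, and both threshold values are assumptions made for the example.

```python
def reorder_point_policy(inventory_level, reorder_point=2, order_up_to=8):
    """(s, S)-style heuristic: if stock is below s, order enough to reach S.

    reorder_point (s) and order_up_to (S) are hypothetical values. Fixed
    thresholds like these are what can lead to over- or under-ordering when
    demand is highly variable.
    """
    if inventory_level < reorder_point:
        return order_up_to - inventory_level
    return 0


for level in (-3, 1, 5):
    print(level, "->", reorder_point_policy(level))
```

With thresholds set too high, the rule orders more than demand requires; set too low, it lets backorders accumulate. A learned policy aims to adapt the order quantity to the observed state instead.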

A promising route to minimizing inventory management costs is reinforcement learning (RL). Stock management can be modeled as a sequential decision-making process under uncertainty, commonly formalized as a Markov decision process (MDP), and RL is well suited to this kind of task. Recent progress in machine learning, and in particular in RL with deep models (for example deep Q-learning) in other complex domains, suggests that high-quality results may well be achievable even in highly complex environments of this type. Such results could even be transferred from one inventory problem to another via transfer learning, minimizing overheads.
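To make the MDP formulation concrete, here is a minimal sketch of classical tabular Q-learning on a toy single-item inventory problem. It is not the deep or quantum method of the project: the state is just the inventory level, the action is the order quantity, and the reward is the negative period cost. The demand distribution, the cost coefficients, the inventory bounds, and all names (`step`, `train`, `greedy_action`) are assumptions made for illustration.

```python
import random

LEVELS = range(-5, 11)   # assumed inventory bounds; negative values are backorders
ACTIONS = range(0, 6)    # assumed possible order quantities


def step(level, order, demand, h=1.0, p=5.0, K=3.0):
    """One MDP transition: the order arrives, demand is served, costs are paid."""
    # Clipping to the bounds is a simplification that keeps the state space finite.
    next_level = max(LEVELS[0], min(LEVELS[-1], level + order - demand))
    cost = h * max(next_level, 0) + p * max(-next_level, 0) + (K if order > 0 else 0.0)
    return next_level, -cost  # reward is the negative cost


def train(episodes=2000, horizon=30, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in LEVELS for a in ACTIONS}
    for _ in range(episodes):
        level = 0
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))                   # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(level, a)])   # exploit
            demand = rng.randint(0, 4)  # assumed uniform demand on {0, ..., 4}
            next_level, reward = step(level, action, demand)
            best_next = max(q[(next_level, a)] for a in ACTIONS)
            # Standard Q-learning update toward the bootstrapped target.
            q[(level, action)] += alpha * (reward + gamma * best_next - q[(level, action)])
            level = next_level
    return q


def greedy_action(q, level):
    """Order quantity with the highest learned value in a given state."""
    return max(ACTIONS, key=lambda a: q[(level, a)])


if __name__ == "__main__":
    q_table = train()
    for level in (-2, 0, 3, 8):
        print(f"inventory {level:>2} -> order {greedy_action(q_table, level)}")
```

A table is feasible here only because the toy state space has 16 levels. With multiple items, lead times, or continuous quantities the table becomes intractable, which is the motivation for replacing it with a function approximator as in deep Q-learning.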

A quantum-enhanced method to generate optimal inventory management strategies would bring two main benefits:

Improved inventory management translates directly into reduced expenses;
Better performance ensures more timely delivery of items to customers and avoids delays.

Our publications and deliverables

Parametrized Quantum Policies for Reinforcement Learning (NeurIPS 2021)
Quantum agents in the Gym: a variational quantum algorithm for deep Q-learning (Quantum 6, 720, 2022)
D5.2 Specification of QRL algorithm for inventory management
Unsupervised strategies for identifying optimal parameters in Quantum Approximate Optimization Algorithm (EPJ Quantum Technology)
Equivariant quantum circuits for learning on weighted graphs (npj Quantum Information)
Quantum Machine Learning Beyond Kernel Methods (The Journal of Chemical Physics)
D5.5 Implementation of QRL algorithm on real architecture (report)
High Dimensional Quantum Machine Learning With Small Quantum Computers (Quantum)
Reinforcement learning assisted recursive QAOA (EPJ Quantum Technology)

Our webinar

04/06/2024 – Quantum machine learning and industrial applications


This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 951821.
