We consider a station-based bike sharing system (BSS) in which users rent and return bikes spontaneously. Rental and return requests are uncertain and subject to a spatio-temporal pattern. Service providers dynamically dispatch transport vehicles to relocate bikes between stations. The challenge is to balance the number of bikes and the number of free bike racks at every station so as to satisfy as many requests as possible.
The resulting problem is a stochastic-dynamic inventory routing problem (SDIRP), which we model as a Markov decision process. The objective is to identify an optimal policy minimizing the expected number of failed requests.
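To make the failure criterion concrete, the following is a heavily simplified sketch of the per-station inventory logic: a rental fails at an empty station, a return fails at a full one. All names here (`State`, `apply_request`) are illustrative inventions, not the paper's actual model, which also tracks vehicle routing and time-dependent demand.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    """Hypothetical, simplified SDIRP state: hour of day plus per-station bike counts."""
    hour: int
    bikes: list = field(default_factory=list)      # bikes currently at each station
    capacity: list = field(default_factory=list)   # total racks at each station

def apply_request(state, station, kind):
    """Process one rental ("rent") or return ("return") request.
    Returns 1 if the request fails (no bike / no free rack), else 0."""
    if kind == "rent":
        if state.bikes[station] == 0:
            return 1                      # failed rental: station is empty
        state.bikes[station] -= 1
        return 0
    if state.bikes[station] >= state.capacity[station]:
        return 1                          # failed return: station is full
    state.bikes[station] += 1
    return 0

# Tiny two-station example: station 0 is empty, so a rental there fails.
s = State(hour=8, bikes=[0, 3], capacity=[10, 10])
failures = apply_request(s, 0, "rent") + apply_request(s, 1, "rent")
print(failures)  # 1
```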
To solve the SDIRP, we draw on approximate dynamic programming. We present a dynamic look-ahead policy (DLA) that anticipates potentially failing requests via online simulations. The DLA simulates a limited portion of the overall time horizon to evaluate feasible inventory and routing decisions. Due to the spatio-temporal request pattern, the length of a suitable simulation horizon varies over the course of the day. To select a suitable horizon for every hour of the day, we apply value function approximation (VFA). The VFA carries out offline simulations and returns a sequence of suitable simulation horizons.
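The core idea of evaluating candidate decisions over a limited simulation horizon can be sketched as follows. This is an assumed, drastically simplified version: function names, the candidate set, and the demand pattern are invented for illustration, routing is ignored, and the horizon argument stands in for the per-hour value a VFA would select offline.

```python
import random

def simulate_failures(bikes, capacity, requests):
    """Replay one sampled request sequence and count failed rentals/returns."""
    bikes = bikes[:]                      # do not mutate the caller's state
    failed = 0
    for station, kind in requests:
        if kind == "rent":
            if bikes[station] == 0:
                failed += 1
            else:
                bikes[station] -= 1
        else:  # return
            if bikes[station] >= capacity[station]:
                failed += 1
            else:
                bikes[station] += 1
    return failed

def dla_decision(capacity, candidates, horizon_hours, sample_requests, n_samples=20):
    """Pick the relocation decision with the fewest simulated failed requests.
    `candidates` maps a decision label to the post-relocation bike counts;
    `sample_requests(h)` draws a random request sequence covering h hours."""
    best, best_cost = None, float("inf")
    for label, relocated in candidates.items():
        cost = sum(
            simulate_failures(relocated, capacity, sample_requests(horizon_hours))
            for _ in range(n_samples)
        )
        if cost < best_cost:
            best, best_cost = label, cost
    return best

# Toy example: in an assumed morning pattern, rentals concentrate at station 0,
# so relocating bikes there should win the simulated comparison.
random.seed(0)
capacity = [10, 10]
def sample_requests(h):
    return [(0 if random.random() < 0.8 else 1, "rent") for _ in range(5 * h)]

candidates = {"do_nothing": [1, 9], "move_4_to_station_0": [5, 5]}
print(dla_decision(capacity, candidates, horizon_hours=2, sample_requests=sample_requests))
```

A longer horizon catches failures that build up later in the day but costs more simulation time, which is exactly the trade-off the per-hour horizon selection addresses.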
Our computational studies on real-world data from the BSS of Minneapolis (Minnesota, USA) show that the VFA-based DLA outperforms look-ahead policies with static simulation horizons as well as conventional policies from the literature. Further, the resulting sequence of simulation horizons reflects the temporal structure of the request pattern.