Listing 1 - 3 of 3 |
This thesis studies a periodic-review, single-item lost-sales model with positive lead times. A deep Q-network (DQN), a deep reinforcement learning (DRL) algorithm, is constructed, and domain knowledge is added through potential-based reward shaping to boost its performance. The domain knowledge is provided by existing heuristics, namely the base-stock, restricted base-stock and constant-order policies. The performance of the DQN algorithm without domain knowledge is evaluated against DQN algorithms with added domain knowledge, the optimal policy and the three heuristics themselves. Compared with the DQN algorithm without added knowledge, reward shaping with the base-stock or restricted base-stock policy as a teacher improves performance in all six experiments. In one experiment, using a restricted base-stock policy as a teacher improves the optimality gap by as much as 21.37%. Comparing the DQN algorithm with reward shaping against the heuristic policies themselves, the DQN agent with a base-stock policy as a teacher outperforms the base-stock policy in four out of six experiments. In all experiments, using a constant-order policy as a teacher results in better performance than the constant-order policy itself. These results demonstrate the potential of reward shaping to boost the performance of DRL in a lost-sales inventory management environment.
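As a rough illustration of the two ingredients named in this abstract, the sketch below pairs a base-stock ordering rule with the standard potential-based shaping term, reward + gamma * Phi(s') - Phi(s). It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the state layout (on-hand stock plus a pipeline of outstanding orders) and in particular the potential function are hypothetical choices made for this example.

```python
from typing import Sequence


def inventory_position(on_hand: int, pipeline: Sequence[int]) -> int:
    """On-hand stock plus all orders still in transit (lost sales, so no backorders)."""
    return on_hand + sum(pipeline)


def base_stock_order(on_hand: int, pipeline: Sequence[int], base_stock_level: int) -> int:
    """Base-stock heuristic: order up to the base-stock level, never a negative amount."""
    return max(0, base_stock_level - inventory_position(on_hand, pipeline))


def potential(on_hand: int, pipeline: Sequence[int], base_stock_level: int) -> float:
    """Hypothetical potential (an assumption for this sketch): states whose
    inventory position is closer to the teacher's base-stock level score higher."""
    return -float(abs(base_stock_level - inventory_position(on_hand, pipeline)))


def shaped_reward(reward: float, phi_s: float, phi_next: float, gamma: float = 0.99) -> float:
    """Potential-based reward shaping: add gamma * Phi(s') - Phi(s) to the reward."""
    return reward + gamma * phi_next - phi_s


if __name__ == "__main__":
    # Example state: 2 units on hand, orders of 1 and 3 units in transit, level 10.
    print(base_stock_order(2, [1, 3], 10))
    phi_before = potential(2, [1, 3], 10)
    phi_after = potential(2, [1, 3, 4], 10)  # after placing the suggested order
    print(shaped_reward(-5.0, phi_before, phi_after, gamma=1.0))
```

Because the shaping term is a difference of potentials, it rewards moves toward states the teacher heuristic prefers without changing the underlying cost structure; how the thesis actually derives its potential from each heuristic is not stated in the abstract.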
This thesis discusses the performance of capacity allocation rules in collaborative shipping, where different companies collaborate to replenish their goods. Trucks on the road often have low utilisation, which imposing full truckloads improves. When full truckloads are implemented, a capacity allocation rule is needed on top of the replenishment policy. This research considers a stochastic inventory problem with an (S,T) replenishment policy. Seven capacity allocation rules are proposed and studied using a simulation model in Arena. The objective of each allocation rule is to minimise costs while taking service levels into account. Which capacity allocation rule performs best depends on the coalition partners, the costs and the demand structure. The fair share allocation rule and the consistent appropriate share allocation rule generally perform well in terms of total cost and service levels.
The continuous growth of road transportation is inherently linked to an increase in emitted CO2 levels, which raises the demand for greener solutions. Freight consolidation may address this, since it reduces the number of vehicles sent onto the road by increasing the fill rates of trucks. The bundling decision determines which freight orders are put onto a shared trailer at an intermediate facility and when, subject to the realized transit times of these shipments. In this thesis, a decision support model is developed for a third-party logistics (3PL) provider to determine the impact of real-time transit-time information on the consolidation decision, which is represented as a mixed-integer linear programming problem. Stochasticity emerges in the form of disturbances in transit times in a many-to-one setting, which influence the wait-or-go decision for freight orders. A numerical study with the 3PL's data is conducted to evaluate the consolidation decisions in terms of transportation costs, service quality and emitted CO2 levels.