Solving multi-echelon inventory problems with heuristic-guided deep reinforcement learning and centralized control