START Conference Manager
Our recent work has shown that modeling non-preemptive resource sharing between threads as a Markov Decision Process (MDP) produces (1) an analyzable utilization state space, and (2) a representation of a scheduling decision policy based on the MDP, even when task execution times are relaxed from exact values to known distributions. However, if the dependences among tasks or the distributions of their execution times are not known, then how to obtain an appropriate MDP remains an open problem.
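As a concrete illustration of the known-distribution case, the sketch below builds a toy MDP in which a state is the pending-job count for two tasks, an action non-preemptively schedules one task, and per-task completion probabilities stand in for execution-time distributions; value iteration then extracts a scheduling policy. All parameters and names here are illustrative assumptions, not the paper's actual model.

```python
import itertools

# Toy MDP (illustrative assumptions throughout): a state is the pending-job
# count per task, capped at CAP; an action schedules one task; the scheduled
# task's job completes this slot with a task-specific probability, standing
# in for an execution-time *distribution* rather than an exact value.
CAP = 2
STATES = list(itertools.product(range(CAP + 1), repeat=2))
ACTIONS = (0, 1)
P_DONE = (0.9, 0.5)   # hypothetical per-task completion probabilities
ARRIVAL = 0.3         # hypothetical per-task arrival probability per slot
GAMMA = 0.95

def step(state, action):
    """Enumerate (probability, next_state, reward) for scheduling `action`."""
    out = []
    for done in (True, False):
        p_done = P_DONE[action] if state[action] > 0 else 0.0
        p = p_done if done else 1.0 - p_done
        if p == 0.0:
            continue
        pending = list(state)
        if done:
            pending[action] -= 1
        # Enumerate arrival outcomes for both tasks.
        for a0 in (0, 1):
            for a1 in (0, 1):
                pa = ((ARRIVAL if a0 else 1 - ARRIVAL) *
                      (ARRIVAL if a1 else 1 - ARRIVAL))
                nxt = (min(CAP, pending[0] + a0), min(CAP, pending[1] + a1))
                out.append((p * pa, nxt, -sum(nxt)))  # reward: low backlog
    return out

def value_iteration(eps=1e-6):
    """Standard value iteration; returns state values and a greedy policy."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(sum(p * (r + GAMMA * V[s2]) for p, s2, r in step(s, a))
                       for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    policy = {s: max(ACTIONS, key=lambda a: sum(
        p * (r + GAMMA * V[s2]) for p, s2, r in step(s, a))) for s in STATES}
    return V, policy
```

Because the transition function is known in closed form here, the state space can be analyzed exactly; it is precisely this knowledge that the open problem above assumes away.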
In this paper, we posit that this problem formulation is likely well suited to reinforcement learning (RL) techniques. Our goal is to overcome the lack of knowledge about the tasks' internal behavior by recording their utilizations (states) and the scheduling actions taken (which tasks are scheduled), and by observing the transitions among states under different actions. We give an overview of our progress to date and of our planned research in this direction over the next year or so.
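The recording-and-observing loop described above can be sketched as tabular Q-learning, one standard model-free RL technique: the learner sees only (state, action, next state, reward) samples, never the underlying execution-time distributions. The environment simulator, its hidden parameters, and the reward choice below are all assumptions for illustration, not the paper's setup.

```python
import random

# Model-free sketch: tabular Q-learning over observed (state, action,
# next-state, reward) transitions. The "environment" is a black box whose
# execution-time and arrival distributions are hidden from the learner;
# every parameter here is an illustrative assumption.
random.seed(0)
CAP, ACTIONS = 2, (0, 1)
P_DONE, ARRIVAL = (0.9, 0.5), 0.3   # unknown to the learner

def env_step(state, action):
    """Simulate one scheduling slot; the learner only observes the outcome."""
    pending = list(state)
    if pending[action] > 0 and random.random() < P_DONE[action]:
        pending[action] -= 1                      # scheduled job finished
    nxt = tuple(min(CAP, p + (random.random() < ARRIVAL)) for p in pending)
    return nxt, -sum(nxt)                         # reward: keep backlog low

Q = {}                                            # (state, action) -> value
alpha, gamma, eps = 0.1, 0.95, 0.1
state = (0, 0)
for _ in range(50_000):
    if random.random() < eps:
        action = random.choice(ACTIONS)           # explore
    else:                                         # exploit current estimate
        action = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
    nxt, reward = env_step(state, action)
    best_next = max(Q.get((nxt, a), 0.0) for a in ACTIONS)
    q = Q.get((state, action), 0.0)
    Q[(state, action)] = q + alpha * (reward + gamma * best_next - q)
    state = nxt
```

The greedy policy implied by the learned Q-table plays the role the MDP-derived policy played when the distributions were known, which is the correspondence the proposed research aims to exploit.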