Optimal Scheduling Policy Determination in HSDPA Networks

Hussein Al-Zubaidy, Jerome Talim, Ioannis Lambadaris
SCE, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada
26 June 2007

Outline
1. Objective and Motivation
2. Methodology
3. Problem Definition and Model Description
4. Case Study and Results
5. Conclusion and Future Work

Objective
To devise a methodology for finding the optimal scheduling policy in HSDPA networks, i.e., the policy that controls the allocation of the time-code resources. The resulting optimal policy should have the following properties:
- Fairness: divide the resources fairly among all active users.
- Optimal transmission: maximize the overall cell throughput.
- Optimal resource utilization: provide channel-aware (diversity gain), high-speed resource allocation.

Motivation
3GPP only suggests guidelines for the HSDPA downlink scheduler and leaves the design specifics undefined. This has resulted in many different scheduling techniques and implementations, most of which are proprietary. Moreover, most of the available work on scheduler design is based on the intuition and creativity of the designers.

Methodology
This work presents a different, declarative approach to scheduling in HSDPA:
- Develop an analytic model for the HSDPA downlink scheduler. An MDP-based discrete stochastic dynamic programming model is used. The model is a simplifying abstraction of the real scheduler: it estimates system behavior under different conditions, describes the role of the various system components in that behavior, and must be solvable.
- Define an objective function.
- Use value iteration to solve for the optimal policy.
- Study the structure of the optimal policy and develop a near-optimal heuristic policy.

Problem Definition and Conceptualization
The HSDPA downlink channel uses a mix of TDMA and CDMA:
- Time is slotted into fixed-length 2 ms TTIs.
- During each TTI, there are 15 available codes that may be allocated to one or more users.

HSDPA Scheduler Model (Downlink)
[Figure: block diagram of the downlink path. SDUs from the RLC at the RNC are segmented into PDUs and queued per user (User 1 ... User L) at the Node-B; the scheduler, aided by a channel state monitor/predictor, drives the transceiver serving UE 1 ... UE L.]

FSMC Model for the HSDPA Downlink Channel
[Figure: M-state Markov chain over channel states 0, 1, ..., M-1 with transition probabilities P_00, P_01, P_10, etc.]

The Model
MDP-based model. The HSDPA downlink scheduler is modelled by the 5-tuple (T, S, A, P_ss'(a), R(s, a)), where T is the set of decision epochs, S and A are the state and action spaces, P_ss'(a) = Pr(s(t+1) = s' | s(t) = s, a(s) = a) is the state transition probability, and R(s, a) is the immediate reward when in state s and taking action a.

Basic Assumptions
- L active users in the cell.
- A finite buffer of size B per user, for each of the L users.
- Error-free transmission.
- SDUs are segmented by the RLC into a fixed number of PDUs (u_i) and delivered to the Node-B at the beginning of the next TTI.
- Independent Bernoulli arrivals with parameter q_i.
- The scheduler assigns codes in chunks of c codes at a time, where c ∈ {1, 3, 5, 15}.

Basic Assumptions: FSMC State Space
- The channel state of user i during slot t is denoted by γ_i(t).
- The channel state space is the set M = {0, 1, ..., M-1}.
- User i's channel can carry up to γ_i(t) PDUs per code (a small sampling sketch follows below).
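
As a small illustration of the channel model, the sketch below (Python) samples a two-state FSMC for one user; the transition probabilities are placeholder assumptions, not values taken from the talk.

```python
import numpy as np

# Hypothetical 2-state FSMC (M = 2) for one user's channel: row gamma gives the
# distribution of gamma(t+1).  The 0.5/0.5 entries are placeholders chosen so
# that P(gamma = 1) = 0.5, as in the symmetric case study later in the talk.
P_gamma = np.array([[0.5, 0.5],   # from state 0 ("bad": 0 PDUs per code)
                    [0.5, 0.5]])  # from state 1 ("good": 1 PDU per code)

def next_channel_state(gamma, rng):
    """Sample gamma(t+1) from P_gamma[gamma(t), :]."""
    return int(rng.choice(P_gamma.shape[0], p=P_gamma[gamma]))

rng = np.random.default_rng(1)
gamma, trace = 1, []
for _ in range(8):
    gamma = next_channel_state(gamma, rng)
    trace.append(gamma)
print(trace)  # a sample path of channel states, one per TTI
```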

State and Action Sets
The system state s(t) ∈ S is the vector

s(t) = (x_1(t), x_2(t), ..., x_L(t), γ_1(t), γ_2(t), ..., γ_L(t))    (1)

S = {X × M}^L is finite, due to the assumption of finite buffer sizes and a finite channel state space.
The action a(s) ∈ A taken when in state s is

a(s) = (a_1(s), a_2(s), ..., a_L(s))    (2)

subject to ∑_{i=1}^{L} a_i(s) ≤ 15/c and a_i(s) ≤ ⌈x_i(t) / (γ_i(t) c)⌉, where a_i(t) c is the number of codes allocated to user i at decision epoch t.
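
To make the action constraints concrete, the following sketch (Python) enumerates the feasible code-chunk allocations for a small system. The ceiling form of the per-user bound follows the reading of the constraint above and should be treated as an assumption.

```python
from itertools import product

def feasible_actions(x, gamma, c, num_codes=15):
    """Enumerate actions a = (a_1, ..., a_L), where a_i is the number of c-code
    chunks given to user i, subject to sum_i a_i <= 15/c and
    a_i <= ceil(x_i / (gamma_i * c)), i.e. never allocate more chunks than the
    queue can use."""
    chunks = num_codes // c
    per_user_max = []
    for xi, gi in zip(x, gamma):
        if gi == 0 or xi == 0:
            per_user_max.append(0)                                # user cannot be served
        else:
            per_user_max.append(min(chunks, -(-xi // (gi * c))))  # ceiling division
    return [a for a in product(*(range(m + 1) for m in per_user_max))
            if sum(a) <= chunks]

# Two users with 7 and 3 PDUs queued, both in channel state 1, chunk size c = 5.
print(feasible_actions(x=[7, 3], gamma=[1, 1], c=5))
```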

Reward Function
The reward must capture the objective; R(s, a) has two components corresponding to the two objectives:

R(s, a) = ∑_{i=1}^{L} a_i γ_i c - σ ∑_{i=1}^{L} (x_i - x̄) 1{x_i = B}    (3)

where the fairness factor σ reflects the significance of fairness in the optimal policy. The positive term of the reward maximizes the cell throughput; the second term enforces a level of fairness and reduces the dropping probability.
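
A minimal sketch of the reward in Eq. (3) (Python). Taking x̄ to be the mean queue length across users is an assumption about the undisplayed symbol in the penalty, and the example parameter values are illustrative only.

```python
import numpy as np

def reward(x, gamma, a, c, B, sigma):
    """Immediate reward R(s, a): throughput earned this TTI minus a fairness
    penalty charged only for users whose buffer is full (x_i == B)."""
    x, gamma, a = np.asarray(x), np.asarray(gamma), np.asarray(a)
    throughput = np.sum(a * gamma * c)                   # PDUs transmitted this TTI
    penalty = sigma * np.sum((x - x.mean()) * (x == B))  # assumed form of the second term
    return float(throughput - penalty)

# Two users, buffer size B = 25, chunk size c = 5, fairness factor sigma = 1.
print(reward(x=[25, 5], gamma=[1, 1], a=[2, 1], c=5, B=25, sigma=1.0))  # 15 - 10 = 5
```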

State Transition Probability
P_ss'(a) denotes the probability that choosing action a at time t when in state s leads to state s' at time t+1:

P_ss'(a) = Pr(s(t+1) = s' | s(t) = s, a(t) = a)
         = Pr(x'_1, ..., x'_L, γ'_1, ..., γ'_L | x_1, ..., x_L, γ_1, ..., γ_L, a_1, ..., a_L)

The evolution of the queue size x_i is given by

x'_i = min([x_i - y_i]^+ + z_i, B) = min([x_i - a_i γ_i c]^+ + z_i, B)    (4)

Using the independence of the channel states and the queue sizes,

P_ss'(a) = ∏_{i=1}^{L} P_{x_i x'_i}(γ_i, a_i) P_{γ_i γ'_i}    (5)

where P_{γ_i γ'_i} is the Markov transition probability of the FSMC.

State Transition Probability (cont.)

P_{x_i x'_i}(γ_i, a_i) =
  1        if x'_i = x_i = B and a_i γ_i = 0,
  q_i      if x'_i = x_i = B and 0 < a_i γ_i c ≤ u_i,
  q_i      if x'_i = B, x_i < B and W_1 ≥ B,
  q_i      if x'_i < B and x'_i = W_1,
  1 - q_i  if x'_i < B and x'_i = W_2,
  0        otherwise,    (6)

where W_1 = [x_i - a_i γ_i c]^+ + u_i and W_2 = [x_i - a_i γ_i c]^+.
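
The case analysis in Eq. (6) is the queue recursion of Eq. (4) with a two-point arrival distribution (z_i = u_i with probability q_i, otherwise 0). A minimal per-user sketch in Python that reproduces the cases by computing W_1 and W_2 directly:

```python
def queue_transition_prob(x, x_next, gamma, a, c, B, u, q):
    """Per-user P_{x x'}(gamma, a) of Eq. (6): Bernoulli arrival of a batch of
    u PDUs with probability q, buffer clipped at B."""
    w2 = max(x - a * gamma * c, 0)   # queue after service, no arrival (W_2)
    w1 = w2 + u                      # queue after service plus one arrival batch (W_1)
    prob = 0.0
    if x_next == min(w1, B):         # arrival branch (includes clipping at B)
        prob += q
    if x_next == min(w2, B):         # no-arrival branch
        prob += 1.0 - q
    return prob

# x = 23, B = 25, u = 10, q = 0.5, channel state 1, one chunk of c = 5 codes:
# service drains 5 PDUs (to 18); an arrival would push to 28, clipped at 25.
print(queue_transition_prob(23, 25, gamma=1, a=1, c=5, B=25, u=10, q=0.5))  # 0.5
print(queue_transition_prob(23, 18, gamma=1, a=1, c=5, B=25, u=10, q=0.5))  # 0.5
```

The full transition probability P_ss'(a) of Eq. (5) is then the product of these per-user queue terms with the FSMC channel transition probabilities.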

Value Function
- Infinite-horizon MDP.
- The total expected discounted reward optimality criterion is used, with discount factor λ, where 0 < λ < 1.
- The objective is to find the policy π*, among all policies, that maximizes the value function V^π(s).
- The optimal policy is characterized by

V*(s) = max_{a ∈ A} [ R(s, a) + λ ∑_{s' ∈ S} P_ss'(a) V*(s') ]    (7)

where V*(s) = sup_π V^π(s) is attained by applying the optimal policy π*.
- The model was solved numerically using value iteration, as sketched below.
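
A generic value iteration sketch for Eq. (7) (Python). The state list, feasible-action set, transition kernel and reward are the model pieces defined above and are passed in as callables; the discount factor and stopping tolerance here are illustrative.

```python
def value_iteration(states, actions, P, R, lam=0.9, tol=1e-6, max_iter=10_000):
    """Solve V(s) = max_a [ R(s, a) + lam * sum_{s'} P_{ss'}(a) V(s') ].
    `actions(s)` returns the feasible actions in state s, `P(s, a)` returns a
    dict {s_next: probability}, and `R(s, a)` is the immediate reward."""
    V = {s: 0.0 for s in states}

    def q_value(s, a):
        return R(s, a) + lam * sum(p * V[s2] for s2, p in P(s, a).items())

    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            best = max(q_value(s, a) for a in actions(s))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:                 # value function has converged
            break

    # Greedy policy with respect to the converged value function.
    policy = {s: max(actions(s), key=lambda a: q_value(s, a)) for s in states}
    return V, policy
```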

Case Study (Two Users, 2-State FSMC): The Optimal Policy for Two Symmetrical Users
[Figure: optimal code-chunk allocation (a_1, a_2) over the (x_1, x_2) queue-length plane, with switch-over regions labelled by (a_1, a_2) pairs.]
P(γ_i = 1) = 0.5 and P(z_i = 5) = 0.5 for all i ∈ {1, 2}; c = 5.

The Effect of Channel Quality on Policy Structure
[Figure: optimal allocation regions over the (x_1, x_2) queue-length plane.]
P(γ_1 = 1) = 0.8, P(γ_2 = 1) = 0.5 and P(z_i = 5) = 0.5.

The Effect of Arrival Probability on Policy Structure
[Figure: optimal allocation regions over the (x_1, x_2) queue-length plane.]
P(γ_1 = 1) = P(γ_2 = 1) = 0.5, P(z_1 = 5) = 0.8 and P(z_2 = 5) = 0.5.

Heuristic Policy
We studied the optimal policy structure over a wide range of scenarios and observed the following trends:
- The policy is a switch-over policy.
- The weight w_i is a function of the difference between the two channel qualities and the difference between the two arrival probabilities:

w_1 = f([ΔP_γ]^+, [ΔP_z]^+)    (8)
w_2 = f([-ΔP_γ]^+, [-ΔP_z]^+)    (9)

where ΔP_γ = P(γ_1 = 1) - P(γ_2 = 1) and ΔP_z = P(z_1 = u) - P(z_2 = u).
- The intermediate regions have an almost constant width equal to 2c.
- a_1 (respectively a_2) is increasing in x_1 (respectively x_2).
- f(·) is increasing in ΔP_γ and decreasing in ΔP_z.

Weight Function Approximation
Following these observations, we approximated w_1 and w_2 as follows:

ŵ_1 = [ΔP_γ]^+ - 0.7 [ΔP_z]^+    (10)
ŵ_2 = [-ΔP_γ]^+ - 0.7 [-ΔP_z]^+    (11)
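
A small sketch of the approximate weights for the two-user case (Python). The sign conventions follow the reconstruction of Eqs. (10) and (11) above and should be treated as an assumption.

```python
def heuristic_weights(p_gamma, p_z, beta=0.7):
    """Approximate switch-over weights (two users).
    p_gamma[i] = P(gamma_i = 1), p_z[i] = P(z_i = u).
    The sign conventions are assumed, per the reconstruction of Eqs. (10)-(11)."""
    d_gamma = p_gamma[0] - p_gamma[1]   # difference in channel quality
    d_z = p_z[0] - p_z[1]               # difference in arrival probability
    pos = lambda v: max(v, 0.0)         # the [.]^+ operator
    w1 = pos(d_gamma) - beta * pos(d_z)
    w2 = pos(-d_gamma) - beta * pos(-d_z)
    return w1, w2

# Scenario from the slides: P(gamma_1 = 1) = 0.8, P(gamma_2 = 1) = 0.5, equal arrivals.
print(heuristic_weights(p_gamma=(0.8, 0.5), p_z=(0.5, 0.5)))  # (0.3, 0.0)
```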

Heuristic Policy Structure
[Figures: heuristic policy (dotted line) overlaid on the optimal policy regions in the (x_1, x_2) queue-length plane, for two values of the code-chunk size c and the three scenarios above:
- P(γ_i = 1) = 0.5 and P(z_i = 5) = 0.5 for all i ∈ {1, 2};
- P(γ_1 = 1) = 0.8, P(γ_2 = 1) = 0.5 and P(z_i = 5) = 0.5;
- P(γ_1 = 1) = P(γ_2 = 1) = 0.5, P(z_1 = 5) = 0.8 and P(z_2 = 5) = 0.5.]

Performance Evaluation: The Effect of Policy Granularity
[Figure: (a) average queue length (PDUs) and (b) average drop probability versus offered load ρ, for User 1 and User 2 under policies with two actions (c = 15), four actions (c = 5), and six actions (c = 3).]
Here ρ = ∑_i P(z_i = u_i) u_i / r_π is the offered load and r_π is the measured system capacity under policy π. P(γ_1 = 1) = 0.8 and P(γ_2 = 1) = 0.5.

Heuristic Policy Evaluation
[Figure: (c) system throughput (PDUs/ms) versus time slots for Round Robin, the heuristic, and the optimal (MDP) policy, for ρ = 0.5, 0.8 and 1.2; alpha = 0.63, beta = 0.12, P(γ_1 = 1) = 0.8 and P(γ_2 = 1) = 0.5. (d) Queueing delay (ms) versus time slots for User 1 and User 2 under Round Robin, the heuristic, and the optimal (MDP, c = 5) policy; P(γ_2 = 1) = 0.5, q_1 = 0.8, q_2 = 0.5 and u = 10.]

Conclusion
- The optimal policy can be described as: share the codes in proportion to the weighted queue lengths of the connected users.
- A policy with finer granularity performs better under light to moderate load, while a coarser policy is more desirable under heavy load.
- However, the performance gain from using c < 5 is marginal and does not justify the added complexity.

Conclusion (cont.)
- The suggested heuristic policy has constant time complexity, O(1), compared with the exponential time complexity required to determine the optimal policy.
- The performance of the resulting heuristic policy matches that of the optimal policy very closely.
- The results also show that Round Robin is undesirable in an HSDPA system, due to its poor performance and lack of fairness.
- The suggested heuristic policy can be extended to more than two active users, and it can easily be adapted to accommodate more than one class of service.

Future Work
- Prove analytically some of the observed optimal policy and value function characteristics, such as monotonicity, multimodularity, and the switch-over behaviour.
- Relax the error-free transmission assumption and extend the model to account for retransmissions.
- Study the effect of different arrival process statistics, using simulation.

Thank You / Discussion

Acronyms
HSDPA: High Speed Downlink Packet Access
3GPP: Third Generation Partnership Project
MDP: Markov Decision Process
TDMA: Time Division Multiple Access
CDMA: Code Division Multiple Access
TTI: Transmission Time Interval (2 ms)
FSMC: Finite State Markov Channel
SDU: Service Data Unit
RLC: Radio Link Control protocol, located at the Radio Network Controller (RNC)
PDU: Protocol Data Unit
LQF: Longest Queue First


More information

Solving dynamic portfolio choice problems by recursing on optimized portfolio weights or on the value function?

Solving dynamic portfolio choice problems by recursing on optimized portfolio weights or on the value function? DOI 0.007/s064-006-9073-z ORIGINAL PAPER Solving dynamic portfolio choice problems by recursing on optimized portfolio weights or on the value function? Jules H. van Binsbergen Michael W. Brandt Received:

More information

Optimal Control of an Inventory System With Joint Production and Pricing Decisions

Optimal Control of an Inventory System With Joint Production and Pricing Decisions IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 6, NO. 2, DECEMBER 206 4235 Optimal Control of an Inventory System With Joint Production and Pricing Decisions Ping Cao and Jingui Xie Abstract In this study,

More information

Online Appendix: Extensions

Online Appendix: Extensions B Online Appendix: Extensions In this online appendix we demonstrate that many important variations of the exact cost-basis LUL framework remain tractable. In particular, dual problem instances corresponding

More information

16 MAKING SIMPLE DECISIONS

16 MAKING SIMPLE DECISIONS 247 16 MAKING SIMPLE DECISIONS Let us associate each state S with a numeric utility U(S), which expresses the desirability of the state A nondeterministic action A will have possible outcome states Result

More information

,,, be any other strategy for selling items. It yields no more revenue than, based on the

,,, be any other strategy for selling items. It yields no more revenue than, based on the ONLINE SUPPLEMENT Appendix 1: Proofs for all Propositions and Corollaries Proof of Proposition 1 Proposition 1: For all 1,2,,, if, is a non-increasing function with respect to (henceforth referred to as

More information

UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions.

UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions. UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions. Random Variables 2 A random variable X is a numerical (integer, real, complex, vector etc.) summary of the outcome of the random experiment.

More information

The investment game in incomplete markets

The investment game in incomplete markets The investment game in incomplete markets M. R. Grasselli Mathematics and Statistics McMaster University Pisa, May 23, 2008 Strategic decision making We are interested in assigning monetary values to strategic

More information

Modelling the Sharpe ratio for investment strategies

Modelling the Sharpe ratio for investment strategies Modelling the Sharpe ratio for investment strategies Group 6 Sako Arts 0776148 Rik Coenders 0777004 Stefan Luijten 0783116 Ivo van Heck 0775551 Rik Hagelaars 0789883 Stephan van Driel 0858182 Ellen Cardinaels

More information

Chapter 5 Univariate time-series analysis. () Chapter 5 Univariate time-series analysis 1 / 29

Chapter 5 Univariate time-series analysis. () Chapter 5 Univariate time-series analysis 1 / 29 Chapter 5 Univariate time-series analysis () Chapter 5 Univariate time-series analysis 1 / 29 Time-Series Time-series is a sequence fx 1, x 2,..., x T g or fx t g, t = 1,..., T, where t is an index denoting

More information

Dynamic Resource Allocation for Spot Markets in Cloud Computi

Dynamic Resource Allocation for Spot Markets in Cloud Computi Dynamic Resource Allocation for Spot Markets in Cloud Computing Environments Qi Zhang 1, Quanyan Zhu 2, Raouf Boutaba 1,3 1 David. R. Cheriton School of Computer Science University of Waterloo 2 Department

More information

Introduction to Real Options

Introduction to Real Options IEOR E4706: Foundations of Financial Engineering c 2016 by Martin Haugh Introduction to Real Options We introduce real options and discuss some of the issues and solution methods that arise when tackling

More information

Fairness concept in terms of the utilization Hakyong Kim, Yongtak Lee, and Kiseon Kim Indexing terms: fairness, fairness index In this Letter, we disc

Fairness concept in terms of the utilization Hakyong Kim, Yongtak Lee, and Kiseon Kim Indexing terms: fairness, fairness index In this Letter, we disc Fairness concept in terms of the utilization Hakyong Kim, Yongtak Lee, and Kiseon Kim Indexing terms: fairness, fairness index In this Letter, we discuss the concept of the fairness and dene a fairness

More information

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned

More information

For students electing Macro (8701/Prof. Roe) & Micro (8703/Prof. Glewwe) option

For students electing Macro (8701/Prof. Roe) & Micro (8703/Prof. Glewwe) option WRITTEN PRELIMINARY Ph.D EXAMINATION Department of Applied Economics Jan./Feb. - 2011 Trade, Development and Growth For students electing Macro (8701/Prof. Roe) & Micro (8703/Prof. Glewwe) option Instructions

More information

Optimal rebalancing of portfolios with transaction costs assuming constant risk aversion

Optimal rebalancing of portfolios with transaction costs assuming constant risk aversion Optimal rebalancing of portfolios with transaction costs assuming constant risk aversion Lars Holden PhD, Managing director t: +47 22852672 Norwegian Computing Center, P. O. Box 114 Blindern, NO 0314 Oslo,

More information

Temporal Abstraction in RL

Temporal Abstraction in RL Temporal Abstraction in RL How can an agent represent stochastic, closed-loop, temporally-extended courses of action? How can it act, learn, and plan using such representations? HAMs (Parr & Russell 1998;

More information

Framework and Methods for Infrastructure Management. Samer Madanat UC Berkeley NAS Infrastructure Management Conference, September 2005

Framework and Methods for Infrastructure Management. Samer Madanat UC Berkeley NAS Infrastructure Management Conference, September 2005 Framework and Methods for Infrastructure Management Samer Madanat UC Berkeley NAS Infrastructure Management Conference, September 2005 Outline 1. Background: Infrastructure Management 2. Flowchart for

More information

Reinforcement Learning and Simulation-Based Search

Reinforcement Learning and Simulation-Based Search Reinforcement Learning and Simulation-Based Search David Silver Outline 1 Reinforcement Learning 2 3 Planning Under Uncertainty Reinforcement Learning Markov Decision Process Definition A Markov Decision

More information

Performance Analysis of Cognitive Radio Spectrum Access with Prioritized Traffic

Performance Analysis of Cognitive Radio Spectrum Access with Prioritized Traffic Performance Analysis of Cognitive Radio Spectrum Access with Prioritized Traffic Vamsi Krishna Tumuluru, Ping Wang, and Dusit Niyato Center for Multimedia and Networ Technology (CeMNeT) School of Computer

More information

Dynamic Pricing with Varying Cost

Dynamic Pricing with Varying Cost Dynamic Pricing with Varying Cost L. Jeff Hong College of Business City University of Hong Kong Joint work with Ying Zhong and Guangwu Liu Outline 1 Introduction 2 Problem Formulation 3 Pricing Policy

More information

Utility Indifference Pricing and Dynamic Programming Algorithm

Utility Indifference Pricing and Dynamic Programming Algorithm Chapter 8 Utility Indifference ricing and Dynamic rogramming Algorithm In the Black-Scholes framework, we can perfectly replicate an option s payoff. However, it may not be true beyond the Black-Scholes

More information

MDP Algorithms. Thomas Keller. June 20, University of Basel

MDP Algorithms. Thomas Keller. June 20, University of Basel MDP Algorithms Thomas Keller University of Basel June 20, 208 Outline of this lecture Markov decision processes Planning via determinization Monte-Carlo methods Monte-Carlo Tree Search Heuristic Search

More information

MDPs and Value Iteration 2/20/17

MDPs and Value Iteration 2/20/17 MDPs and Value Iteration 2/20/17 Recall: State Space Search Problems A set of discrete states A distinguished start state A set of actions available to the agent in each state An action function that,

More information

Risk-Averse Anticipation for Dynamic Vehicle Routing

Risk-Averse Anticipation for Dynamic Vehicle Routing Risk-Averse Anticipation for Dynamic Vehicle Routing Marlin W. Ulmer 1 and Stefan Voß 2 1 Technische Universität Braunschweig, Mühlenpfordtstr. 23, 38106 Braunschweig, Germany, m.ulmer@tu-braunschweig.de

More information

CSE 473: Artificial Intelligence

CSE 473: Artificial Intelligence CSE 473: Artificial Intelligence Markov Decision Processes (MDPs) Luke Zettlemoyer Many slides over the course adapted from Dan Klein, Stuart Russell or Andrew Moore 1 Announcements PS2 online now Due

More information