Huawei Sweden Hackathon is back for the 5th edition with a brand new coding challenge! The challenge: Design and develop a smart data channel allocation algorithm for 5G & 6G. One of the most challenging problems in designing mobile networks is data channel allocation, where many mobile users connected to the baseband need to be served in real-time. -Huawei Sweden Hackathon
Stockholm in winter is gorgeous, especially during Nobel week: pure white snow falling peacefully from the sky, colorful lights on the City Hall, and an easy, tranquil mood when night comes…
Taking part in a team contest for the first time as a student abroad, I experienced unforgettable moments and met friends with completely different backgrounds. Even though our performance was not entirely satisfying and the process was quite torturous, we got through it and made real progress.
Parameters:
Each user is denoted by a tuple including: {initial speed, data size, factor}
For example, we can have three users as follows:
The number of instances of each user depends on both the size of its data and the speed. When a user’s data cannot be sent entirely in a single grid, it needs multiple grids.
Users placed in the same column negatively affect each other in terms of data speed, while users on the same row have no effect on each other.
The speed of a user that is not collocated with other users equals its initial speed; when it is collocated with other users in the same column, it is reduced to
\[\text{Speed}_i = \text{initial_Speed}_i \times \left( 1- \text{collocated_factor}_i \right)\]Where:
\[\text{collocated_factor}_i = \text{factor}_i \times \sum_{\forall j \neq i} \text{factor}_j\]where the sum runs over the users $j$ assigned to the same column.
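As a sketch, the speed reduction within one column follows directly from these two formulas; the user tuples below are made-up examples, not values from the problem set:

```python
def collocated_speeds(users):
    """Speeds of users sharing one column; each user is (initial_speed, factor)."""
    speeds = []
    for i, (v0_i, k_i) in enumerate(users):
        # collocated_factor_i = factor_i * sum of all other users' factors
        collocated_factor = k_i * sum(k_j for j, (_, k_j) in enumerate(users) if j != i)
        speeds.append(v0_i * (1 - collocated_factor))
    return speeds

# two hypothetical users in the same column: both speeds drop below the initial ones
print(collocated_speeds([(10, 0.2), (8, 0.5)]))
```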
There is a corresponding relationship between the speed and the amount of data that a user can send, shown by the speed to data map table. Here is an example of such a table.
For example, when the speed of a user is 8, 2158 bytes data can be sent.
Objective: Maximise the sum of the average speeds of all users ($ \text{Avg_Speed}_i $), formulated as
\[\text{Objective_function} = \frac{ \sum_{\forall \text{Users}} \text{Avg_Speed}_i}{\text{BestSpeedUsers}}\]where $ \text{BestSpeedUsers} $ is the sum of the maximum speeds of all users, i.e. $ \sum_{\forall \text{Users}} \text{Init_Speed}_i $. The objective function is thus a float number between 0 and 1.

Constraint: More than one instance of the same user cannot be placed in the same column.
Penalty term: As we would like to send all the data, we apply a penalty term that captures the portion of the total data of all users that cannot be sent (Data_Loss). The penalty term is a float number in the range 0 to 1, where 0 indicates no data loss. A solution with zero data loss is called a ‘feasible solution’.
\[\text{Penalty_term} =\frac{\sum_{\forall \text{Users}} \text{Data_Loss}_i}{\text{Total_Data_of_All_Users}}\]where $\text{Total_Data_of_All_Users}$ is the sum of the total data sizes of all users.
The score function combines the objective and the penalty term. It is formulated as:
\[\text{Score} = \text{Objective_function} - \alpha \times \text{Penalty_term}\]where $\alpha$ is the data-loss penalty coefficient, set by the system designer based on the importance of user data loss. The value of $\alpha$ is given in each test case.
The best score is achieved when the penalty term is zero, in which case the score equals the objective function. The worst score is achieved when all the data is lost, in which case the score equals $-\alpha$. The range of the score is therefore $[-\alpha, 1]$.
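A minimal sketch of the scoring, assuming the per-user average speeds and the data loss have already been computed (all numbers below are illustrative):

```python
def score(avg_speeds, init_speeds, data_loss, total_data, alpha):
    """Score = Objective_function - alpha * Penalty_term."""
    objective = sum(avg_speeds) / sum(init_speeds)  # BestSpeedUsers = sum of initial speeds
    penalty = data_loss / total_data                # fraction of data that was not sent
    return objective - alpha * penalty

# with zero data loss, the score equals the objective function
print(score([9.0, 7.2], [10, 8], data_loss=0, total_data=1000, alpha=1000))
```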
Note: At the end of the qualification phase, to validate the codes, we check the execution time of your codes on a VM with
If the execution time of your code on the VM is far more than 1 second, it is counted as an invalid solution. Since most of today’s laptops are no more powerful than this VM, if your code runs in a few milliseconds on your laptop, you can be confident it will not take longer on the VM.
The next part, called the COMPLEMENTARY PART, consists of the next four rows, which show the data loss, the user speeds, the goal-function value, and the execution time of the algorithm, respectively.
Notes:
Adding a user weight to represent the priority of each user.
Each user is denoted by a tuple including: {initial speed, data size, factor, weight}
where \(\text{BestSpeedUsers} = \sum_{\forall \text{Users}} \text{Init_Speed}_i\cdot \text{Weight}_i\)
Unlike traditional ACM problems, this problem is NP-hard, so it is infeasible to derive the optimal solution in polynomial time. A common method is to first obtain a legal solution within the constraints and then optimize it toward a certain goal, so that a feasible solution within a certain range of the optimum may arise.
Now, let us analyze the problem mathematically and formulate the optimization goal as follows:
Find a 0-1 decision matrix \(\mathbf{e}\) of size \(U\times N\), where \(e_{ij} \in \{0,1\}\). The problem is now represented as a search for a 0-1 matrix whose element $ e_{ij} $ indicates whether the $ i $-th user occupies the $ j $-th column of the channel.
Initial speed: a vector \(\mathbf{v_0}\), size \(1\times U\), given.
Weight: a vector \(\mathbf{w}\), size \(1\times U\), given.
Factor: a vector \(\mathbf{k}\), size \(1\times U\), given.
Data size: a vector \(\mathbf{d}\), size \(1\times U\), given.
Combined parameter: a vector \(\mathbf{B} = \mathbf{v_0} \odot \mathbf{w} \odot \mathbf{k}\), size \(1\times U\), i.e. \(\beta_i = {v_0}_i \cdot w_i \cdot k_i\), derived from the given parameters. (Note: $\odot$ denotes the Hadamard (element-wise) product.)
User speed: a matrix \(\mathbf{v}\), size \(U\times N\).
Objective function can be written as:
\[\begin{aligned} f(\mathbf{e}_{U\times N}) &=\sum_{i=1}^{U} \left( \frac{1}{\sum_{j=1}^{N} e_{ij}}\sum_{j=1}^{N} e_{ij}\cdot v_{ij} \right) \cdot w_i\\ &= \sum_{i=1}^{U}w_i \frac{\sum_{j=1}^{N} e_{ij}\cdot {v_0}_i \left[ 1-\left(\sum_{l=1}^{U} e_{lj}\cdot k_l -e_{ij}k_i \right) \cdot k_i \right] }{\sum_{j=1}^{N} e_{ij}} \\ &= \sum_{i=1}^{U} w_i {v_{0}}_i \frac{(1+k_{i}^2)\sum_{j=1}^{N} e_{ij} - k_i \sum_{j=1}^{N} \sum_{l=1}^{U} e_{ij}e_{lj}k_l}{\sum_{j=1}^{N} e_{ij}}\\ &= \sum_{i=1}^{U} w_i {v_{0}}_i \left[ (1+k_{i}^2)- \frac{k_i\sum_{j=1}^{N} \sum_{l=1}^{U} e_{ij}e_{lj}k_l}{\sum_{j=1}^{N} e_{ij}} \right]\\ &= \sum_{i=1}^{U} w_i {v_{0}}_i\cdot (1+k_{i}^2)- g(\mathbf{e}_{U\times N}) \end{aligned}\](using $e_{ij}^2 = e_{ij}$ for binary $e_{ij}$; note the double sum over $l$ includes $l=i$, which is why the constant term is $1+k_i^2$ rather than $1$), which can be organized in matrix form as:

\[\begin{aligned} f(\mathbf{e}_{U\times N}) = \mathbf{B}_{1\times U} \odot \left( \frac{1}{\mathbf{k}_{1\times U}}+ \mathbf{k}_{1\times U} \right) \cdot \mathbf{1}^\intercal_{U} - g(\mathbf{e}_{U\times N}), \end{aligned}\]where
\[\begin{aligned} g(\mathbf{e}_{U\times N}) &= \sum_{i=1}^{U} w_i {v_{0}}_i k_i \frac{\sum_{j=1}^{N} \sum_{l=1}^{U} e_{ij}e_{lj}k_l}{\sum_{j=1}^{N} e_{ij}}\\ &= \sum_{i=1}^{U} \beta_i \frac{\sum_{j=1}^{N} \sum_{l=1}^{U} e_{ij}e_{lj}k_l}{\sum_{j=1}^{N} e_{ij}} \end{aligned}\]which could be organized in matrix form as:
\[\begin{aligned} g(\mathbf{e}_{U\times N}) = \mathbf{B}_{1\times U} \cdot \left[ \mathbf{k}_{1\times U} \cdot \mathbf{e}_{U\times N} \cdot \mathbf{e}^\intercal_{U\times N} \oslash \left(\mathbf{e}_{U\times N} \cdot \mathbf{1}^\intercal_{N} \right) \right], \end{aligned}\](Note: $\oslash$ is denoted as the Hadamard division).
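As a sanity check, the double-sum form of $g$ and a vectorized equivalent can be compared on random data; the key observation is that $\sum_{j}\sum_{l} e_{ij}e_{lj}k_l = (\mathbf{e}\mathbf{e}^\intercal\mathbf{k})_i$. The sizes and values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
U, N = 5, 8
k = rng.uniform(0.05, 0.3, U)      # factors
v0 = rng.uniform(5.0, 10.0, U)     # initial speeds
w = np.ones(U)                     # weights
beta = w * v0 * k                  # beta_i = w_i * v0_i * k_i

# a random 0-1 assignment; force every user to appear at least once
e = (rng.random((U, N)) < 0.5).astype(float)
e[np.arange(U), rng.integers(0, N, U)] = 1.0

# double-sum form of g
g_loop = sum(
    beta[i]
    * sum(e[i, j] * e[l, j] * k[l] for j in range(N) for l in range(U))
    / e[i].sum()
    for i in range(U)
)

# vectorized form: sum_j sum_l e_ij e_lj k_l = (e @ e.T @ k)_i
g_vec = beta @ ((e @ e.T @ k) / e.sum(axis=1))
print(abs(g_loop - g_vec) < 1e-9)  # the two forms agree
```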
And
\[\begin{aligned} v_{ij} &= {v_{0}}_i \left[ 1-\left(\sum_{l=1, l \neq i}^{U} e_{lj}\cdot k_l \right) \cdot k_i \right] \\ &={v_{0}}_i \left[ 1-\left(\sum_{l=1}^{U} e_{lj}\cdot k_l -e_{ij}k_i \right) \cdot k_i \right] \end{aligned}\]which could be organized in matrix form as:
\[\begin{aligned} \mathbf{v} = \mathbf{v_{0}}^\intercal_{U\times 1} \cdot \mathbf{1}_{1\times N} \odot \left[ \mathbf{1}_{U\times N}-\mathbf{k}^\intercal_{U\times 1} \cdot \left( \mathbf{k}_{1\times U} \cdot \mathbf{e}_{U\times N}\right) + {\mathbf{k}^2_{1\times U}}^\intercal \cdot \mathbf{1}_{1\times N} \odot \mathbf{e}_{U\times N} \right], \end{aligned}\]subject to the data-size constraint

\[\sum_{j=1}^{N} e_{ij}\cdot d^{\prime}_{ij} \geq d_i, \quad \forall i,\]where \(d^{\prime}_{ij} = D(v_{ij})\) is a mapping function, \(\mathbb{R} \mapsto \mathbb{D}\).
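The per-column speeds can be evaluated directly from the scalar form of $v_{ij}$; the parameter values below are hypothetical:

```python
import numpy as np

# hypothetical parameters: 3 users, 4 columns
v0 = np.array([10.0, 8.0, 6.0])   # initial speeds
k = np.array([0.2, 0.5, 0.1])     # factors
e = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 0]], dtype=float)

# v_ij = v0_i * (1 - (sum_l e_lj k_l - e_ij k_i) * k_i)
col_factor = k @ e                # sum_l e_lj k_l, per column
v = v0[:, None] * (1 - (col_factor[None, :] - e * k[:, None]) * k[:, None])

# user 0 shares column 0 with user 1, so its speed there drops below 10
print(v[0, 0])
```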
In matrix representation:
\[\mathbf{1}_{N} \cdot \left( \mathbf{e}_{U\times N} \odot D(\mathbf{v}_{U\times N}) \right)^\intercal - \mathbf{d}_{1\times U} \geq \mathbf{0}_{U}.\]Each column can hold at most $M$ users. In matrix representation:
\[M - \mathbf{1}_{U} \cdot \mathbf{e}_{U\times N} \geq \mathbf{0}_{N}.\]For feasible solutions, find $\arg \max_{\mathbf{e}} f(\mathbf{e}) = \arg \min_{\mathbf{e}} g(\mathbf{e})$, since $f$ equals a constant minus $g$.
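Both constraints can be checked in a few lines; the mapping `D` passed in below is a hypothetical linear stand-in, not the real speed-to-data table:

```python
import numpy as np

def is_feasible(e, v, d, M, D):
    """True iff all data is sent and no column holds more than M users."""
    data_sent = (e * D(v)).sum(axis=1)          # bytes each user manages to send
    data_ok = bool(np.all(data_sent >= d))      # data-size constraint
    columns_ok = bool(np.all(e.sum(axis=0) <= M))  # column-capacity constraint
    return data_ok and columns_ok

# toy check: 2 users on 3 columns
e = np.ones((2, 3))
v = np.full((2, 3), 8.0)
d = np.array([6000.0, 5000.0])
D = lambda speeds: 270.0 * speeds   # hypothetical map, NOT the real table
print(is_feasible(e, v, d, M=2, D=D))
```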
In this case, consider gradient descent: we calculate the gradient of $g(\mathbf{e})$ w.r.t. $\mathbf{e}$ at each step,
\[\begin{aligned} \frac{\partial g}{\partial e_{ab}} = \frac{\partial}{\partial e_{ab}}\ \sum_{i=1}^{U} \beta_i \frac{\sum_{j=1}^{N} \sum_{l=1}^{U} e_{ij}e_{lj}k_l}{\sum_{j=1}^{N} e_{ij}} \end{aligned}\]which could be organized in matrix form as:
\[\begin{aligned} \frac{\partial g}{\partial \mathbf{e}_{U\times N}} &= \mathbf{k}^\intercal_{1\times U} \cdot \mathbf{1}_{U} \cdot \left[ \mathbf{e}_{U\times N} \oslash \left(\mathbf{e}_{U\times N} \cdot \mathbf{1}_{N\times N}\right)\right]\\ &- \mathbf{k}^\intercal_{1\times U} \cdot \mathbf{1}_{N} \odot \left[ \mathbf{e}_{U\times N} \oslash \left(\mathbf{e}_{U\times N} \cdot \mathbf{1}_{N\times N}\right) \right]. \end{aligned}\]If the data-size limit cannot be fulfilled, the solution is counted as infeasible.
Penalty term (related to data loss):
\[P(\mathbf{e}_{U\times N}) = -\alpha\cdot \frac{\sum_{i=1}^{U} \left[d_i - \sum_{j=1}^{N} e_{ij}\cdot D(v_{ij}) \right]}{\sum_{i=1}^{U} d_i}\]which could be organized in matrix form as:
\[P(\mathbf{e}_{U\times N}) = \alpha\cdot \frac{\left[ \mathbf{1}_{N} \cdot \left( \mathbf{e}_{U\times N} \odot D(\mathbf{v}_{U\times N}) \right)^\intercal - \mathbf{d}_{1\times U} \right] \cdot \mathbf{1}_{U}^\intercal}{ \mathbf{1}_{U} \cdot \mathbf{d}_{1\times U}^\intercal }\]Goal Function with Penalty term:
\[f_p(\mathbf{e}_{U\times N}) = f(\mathbf{e}_{U\times N}) + P(\mathbf{e}_{U\times N})\]

The test cases are parameterized as follows:

id | $M$ | $N$ | $U$ | $\alpha$ |
---|---|---|---|---|
1 | 16 | 25 | 80 | 10000 |
2 | 16 | 32 | 40 | 1000 |
3 | 16 | 100 | 80 | 10000 |
4 | 16 | 128 | 300 | 1000 |
5 | 16 | 250 | 400 | 1000 |
6 | 20 | 800 | 200 | 1000 |
7 | 20 | 1600 | 1000 | 1000 |
8 | 20 | 1600 | 50 | 1000 |
9 | 20 | 1600 | 2000 | 1000 |
10 | 16 | 275 | 440 | 100 |
The mapping function is a single value mapping:
\[D(v_{ij}) = \tilde{d}_{\left\lfloor \tilde{v}_{ij} \right\rfloor +1}\]where
\[\tilde{v}_{ij} = \min(\max(v_{ij},0), |\tilde{\mathbf{d}}|-1).\]The value of $\tilde{\mathbf{d}}$ is given as follows:
$\tilde{\mathbf{d}} = [$ 0, 290, 575, 813, 1082, 1351, 1620, 1889, 2158, 2427, 2696, 2965, 3234, 3503, 3772, 4041, 4310, 4579, 4848, 5117, 5386, 5655, 5924, 6093, 6360, 6611, 6800 $].$
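A direct implementation of this mapping; note the formula indexes $\tilde{\mathbf{d}}$ from 1 ($\tilde{d}_1 = 0$), so the 0-based lookup below drops the $+1$:

```python
import math

# the given speed-to-data table d~ (0-based list; d~_1 = 0 in the 1-based formula)
d_tilde = [0, 290, 575, 813, 1082, 1351, 1620, 1889, 2158, 2427, 2696,
           2965, 3234, 3503, 3772, 4041, 4310, 4579, 4848, 5117, 5386,
           5655, 5924, 6093, 6360, 6611, 6800]

def D(v):
    """Map a (possibly fractional or negative) speed to the data it can send."""
    v_clamped = min(max(v, 0.0), len(d_tilde) - 1)   # clamp into [0, |d~| - 1]
    return d_tilde[math.floor(v_clamped)]            # d~_{floor(v~)+1} in 1-based terms

print(D(8))   # → 2158, matching the example given earlier
```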
Clearly, after reformulating it as a constrained optimization problem, many methods can be used to find a feasible / locally optimal solution. The most straightforward is Stochastic Gradient Descent (SGD), since each step has reasonable computational complexity. However, the goal value is not guaranteed to converge.
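One cheap alternative to SGD on a 0-1 matrix is a randomized local search that flips single placements and keeps only improving moves. This is a generic sketch: the score function passed in is a toy placeholder, not the real $f_p$:

```python
import numpy as np

rng = np.random.default_rng(42)

def local_search(score_fn, e0, steps=500):
    """Greedy 0-1 local search: flip one random entry, keep it if the score improves."""
    e, best = e0.copy(), score_fn(e0)
    for _ in range(steps):
        i = rng.integers(e.shape[0])
        j = rng.integers(e.shape[1])
        e[i, j] = 1 - e[i, j]        # tentative flip of one placement
        s = score_fn(e)
        if s > best:
            best = s                 # keep the improving move
        else:
            e[i, j] = 1 - e[i, j]    # revert a non-improving flip
    return e, best

# toy score just to exercise the search; the real f_p(e) would go here
e, best = local_search(lambda e: -abs(e.sum() - 5.0), np.zeros((3, 4)))
```

Since only improving flips are kept, the score never drops below its starting value, but (as with SGD) there is no guarantee of reaching the global optimum.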
The winning team gave us a lot of interesting insights.
First of all, most of the feasible solutions for the main objective were obtained by randomized shuffle-and-relax algorithms combined with heuristic objectives (for example, sorting by factor).
In terms of optimizing the objective function:
The data structure greatly affects both code organization and runtime efficiency. During the design, an elegant and efficient structure should support operations such as searching by index (direct/inverse), adding, deleting, querying, and modifying.
All in all, it was still a rewarding day of hammering through. I will try to form a strong team next time and try again!
Below is a picture of Stockholm City Hall lit up and glowing just in time for Nobel Night:
Thanks for reading.