codecogs equations

fredag 8 februari 2019

Zen of Python: Flat is better than nested

Previous:
If this model is going to be useful, it needs to be able to read data in a realistic format. That includes at least being able to take in data on when students are available, as datetime strings. 

In the first model, I used a discretized time model, and the assignment variables were stored in a list of lists with equal length. One dimension for students, and one for time. This representation requires a lot of nested loops. Now, I set the challenge as being able to handle increased complexity with a more readable model. For this, I changed the representation from list of lists to a dict that maps student-time pairs to the assignment variables. 
# Old representation
# N is number of students
# T is number of time units
student2time2take = [[solver.IntVar(0, 1) for _ in range(T)]
for _ in range(N)]
# New representation
# available is a set((student, time))
student_time2take = dict([(item, solver.IntVar(0, 1))
for item in available])
view raw .py hosted with ❤ by GitHub

This changes the way groups of variables are retrieved. For example:
# Old representation
# take_times is list
def constrain_take_times(solver, student2time2take, take_times):
for student, time2take in enumerate(student2time2take):
solver.Add(solver.Sum(time2take) == num_take_times[student])
# New representation
# take_times is dict
# This code is one line longer, but is more explicit
def constrain_take_times(solver, students, student_time2take,take_times):
for student in students:
takes = [take for (st_id, _), take in student_time2take.items()
if st_id == student]
solver.Add(solver.Sum(takes) == take_times[student])
view raw .py hosted with ❤ by GitHub

The upside is particularly clear when we want to sum in the non-primary dimension, and won't have to do transposing. In this case, we also get out of retrieving by indexes.
# Old representation
# D is number of days
# T_D is time units per day
def get_total_workdays(solver, student2time2take, D, T_D):
works_days = []
time2student2take = list(transpose(student2time2take))
for d in range(D):
student2take = time2student2take[d*T_D:(d+1)*T_D]
takes = flatten_simple(student2take)
works_day = max_var(solver, takes, lb=0, ub=1)
works_days.append(works_day)
return works_days
# New representation
# times is set of times
# This code is unbothered with the order of the keys
def get_day2works(solver, times, student_time2take):
day2works = {}
for day in equivalence_partition(times, lambda x: x.date()):
date = next(iter(day)).date()
takes = [take for (_, t), take in student_time2take.items()
if t in day]
works = max_var(solver, takes, lb=0, ub=1)
day2works[date] = works
return day2works
view raw .py hosted with ❤ by GitHub

In summary I was happy with the outcome of this change. The flat representation both handles a new feature, and makes for clearer operations. New results, using this data:
student_id date time
1 2019-02-04 15:00
1 2019-02-05 15:00
1 2019-02-06 16:00
1 2019-02-07 16:00
2 2019-02-04 17:00
2 2019-02-05 17:00
2 2019-02-06 15:00
2 2019-02-08 14:00
3 2019-02-04 15:00
3 2019-02-05 16:00
3 2019-02-07 17:00
3 2019-02-08 14:00
4 2019-02-04 17:00
4 2019-02-05 15:00
4 2019-02-05 16:00
4 2019-02-05 17:00
5 2019-02-06 17:00
5 2019-02-06 15:00
5 2019-02-06 16:00
5 2019-02-07 17:00
6 2019-02-04 17:00
6 2019-02-05 16:00
6 2019-02-06 15:00
6 2019-02-08 14:00
7 2019-02-04 17:00
7 2019-02-05 17:00
7 2019-02-06 15:00
7 2019-02-07 14:00
view raw available.csv hosted with ❤ by GitHub

The lessons are 20 minutes.
Coloring: Dark blue is active. Light blue is available student. Red is teacher waiting time.

Inga kommentarer:

Skicka en kommentar