-
Notifications
You must be signed in to change notification settings - Fork 0
/
task2_design_doc.txt
412 lines (316 loc) · 18.7 KB
/
task2_design_doc.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
+-------------------------+
| OS 211 |
| TASK 2: USER PROGRAMS |
| DESIGN DOCUMENT |
+-------------------------+
---- GROUP ----
>> Fill in the names and email addresses of your group members.
FirstName LastName <[email protected]>
Ruoyu Hu <[email protected]>
Siyuan Shen <[email protected]>
Hantang Sun <[email protected]>
Yifei Zhang <[email protected]>
---- PRELIMINARIES ----
>> If you have any preliminary comments on your submission, or notes for the
>> markers, please give them here.
>> Please cite any offline or online sources you consulted while preparing your
>> submission, other than the Pintos documentation, course text, lecture notes
>> and course staff.
ARGUMENT PASSING
================
---- DATA STRUCTURES ----
>> A1: (1 mark)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
No structs or struct members were edited for this section.
---- ALGORITHMS ----
>> A2: (2 marks)
>> How does your argument parsing code avoid overflowing the user's stack page?
>> What are the efficiency considerations of your approach?
Every time the stack pointer (esp) is decremented, we check if it is less than
BASE_LINE, which is the minimum valid user address. If the stack pointer is now
less than BASE_LINE, the setup_stack() function immediately returns false, this
prevents something being pushed to the stack that would cause it to overflow the
user's stack page. The running thread then skips to the "done" tag inside the
load() function, causing it to also return false. As the load has failed, the
thread's memory is deallocated and the thread will not run.
This also reduces wasted computation as threads that will not run will have its
memory deallocated and recycled. Checking every stack decrement also reduces the
amount of unnecessary pushes to stack in the event of a failure to push, as it
returns immediately from the function.
---- RATIONALE ----
>> A3: (3 marks)
>> Why does Pintos implement strtok_r() but not strtok()?
strtok_r() is the reentrant version of strtok. As interrupts are not disabled
while setting up the stack, the current running thread can be preempted or
interrupted during the course of execution, strtok could return the incorrect
substring, as the interrupt could cause previous progress made on tokenisation
to be lost. strtok_r() is reentrant, such that it can return to its previous
state of execution once the interrupt behaviour has been appropriately dealt
with.
>> A4: (2 marks)
>> In Pintos, the kernel separates commands into an executable name and
>> arguments. In Unix-like systems, the shell does this separation. Identify two
>> advantages of the Unix approach.
One advantage of the Unix approach is that it simplifies the system kernel by
delegating the separation task to the shell, as such it does not require any
complex and defined tools within the kernel for such a task.
The other advantage is that it provides flexibility in the kernel, such that
there can be many different types of dedicated user shells with a set interface
to the kernel. And there would be no need to change the kernel if the user
is to use a new shell with a different encoding so long as it adheres to the
interface.
SYSTEM CALLS
============
---- DATA STRUCTURES ----
>> B1: (6 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
#define SYSCALL_ERROR -1
This is a new constant representing an erroneous status
struct file_fd
{
int fd;
struct file *file;
struct list_elem elem;
};
This is a new struct introduced for threads to keep track of files it has opened,
and their unique file descriptors this is added to a newly introduced list
member within the thread struct.
struct child_bookmark
{
pid_t child_pid;
int child_exit_status;
struct list_elem elem;
};
This is a new struct introduced for threads to keep a "bookmark" of its children
threads and their corresponding exit statuses. This is used in the event the
parent has to wait on a child to finish execution. A bookmark is created during
an execute system call.
#define CHILD_RUNNING -2
This is a new constant introduced to represent a default child_exit_status
within the child_bookmark struct when it is created. The SYSCALL_ERROR constant
defined above cannot be used as it is a potential exit status return from the
child.
struct list files;
This is a new list member added to the thread struct to keep track of all the
files this current thread has opened.
struct list child_waits;
This is a new list member added to the thread struct to keep track of the
bookmarks it has on the exit status of its child threads.
struct semaphore wait_for_child;
This is a new semaphore member added to the thread struct to allow for the
parent thread to wait until a child thread has finished performing some task,
and "wakes up" the parent thread.
tid_t child_waiting;
This is a new tid_t member added to the thread struct which keeps track of the
tid of the child thread the current thread is waiting for/last thread the parent
thread waited for. Used together with the wait_for_child semaphore above to
allow the child thread to determine if the parent is waiting for it.
struct thread *parent; /* Current thread's parent thread */
This is a new thread pointer member added to the thread struct which points the
child thread towards its parent thread.
int child_exit_status;
This is a new int member added to the thread struct which keeps track of the
exit status of the last child to exit. Primarily used for the first kernel
thread.
char executable_filename[MAX_FILENAME_LEN];
This is a string member added to the thread struct which keeps track of the
current thread's executable file, used to deny write to the file.
---- ALGORITHMS ----
>> B2: (2 marks)
>> Describe how your code ensures safe memory access of user provided data from
>> within the kernel.
The new function is_valid_pointer(const void *uaddr) is used to check if the
given pointer points towards an address within user space, in addition, it
checks if the content of the page table at the given address is not NULL.
The above function is used every time we interact with a user given pointer
within the system call context before dereferencing said pointer.
>> B3: (3 marks)
>> Suppose that we choose to verify user provided pointers by validating them
>> before use (i.e. using the first method described in the spec).
>> What is the least and the greatest possible number of inspections of the page
>> table (e.g. calls to pagedir_get_page()) that would need to be made in the
>> following cases?
>> a) A system call that passes the kernel a pointer to 10 bytes of user data.
>> b) A system call that passes the kernel a pointer to a full page
>> (4,096 bytes) of user data.
>> c) A system call that passes the kernel a pointer to 4 full pages
>> (16,384 bytes) of user data.
>> You must briefly explain the checking tactic you would use and how it applies
>> to each case to generate your answers.
a) least : 1 check, greatest : 2 checks (if all bytes are in the same page,
validate that page, otherwise validate both pages)
b) least : 1 check, greatest 2 checks
c) least : 4 checks, greatest 5 checks
The page of the first byte is first found and validated. Then, the address of
the first byte of next page is calculated. If it is part of the used data, that
page needs to be checked as well. Repeat the process until reaching a byte that
is not part of the pointed memory.
>> B4: (2 marks)
>> When an error is detected during a system call handler, how do you ensure
>> that all temporarily allocated resources (locks, buffers, etc.) are freed?
Whenever an error occurs in the system call handler, the program will invoke
exit() immediately and passes -1 as the parameter. Inside the function, all
the temporarily allocated resources, such as the struct file_fd and
child_bookmarks, will be freed. At the end of the function call, thread_exit()
is invoked, inside which the resources within struct thread will be freed.
All system calls end is terminated by the exit() function, as a result, it
is ensured that all resources will be de-allocated.
>> B5: (8 marks)
>> Describe your implementation of the "wait" system call and how it interacts
>> with process termination for both the parent and child.
When the wait() system call is called, it first checks through the list of child
bookmarks possessed by the current thread by calling thread_waiting_child().
The returned bookmark is stored into the variable child_exit. If child_exit is
NULL, the process with the given pid is no a child of the current process, at
which point the current process does not need to wait for the given process and
should return an SYSCALL_ERROR.
If the process with the given pid is a child of the current process, the current
process then checks whether the child process is still running. As specified in
section B1, the default value of the child_exit_status member of the
child_bookmark struct defaults to CHILD_RUNNING. If the child_exit status is not
this default value when the parent process (current process) checks, then the
child process must have finished execution prior to this point in time, as such
all the parent process has to do is return the value of the child_exit_status.
At this point, the parent process will also set the child_exit_status member of
the bookmark to SYSCALL_ERROR, such that attempting to wait on the same child
that has finished execution will only yield an error.
In the event that the child process is still executing, the parent process will
call sema_down on its wait_for_child member semaphore, so that it will not be
scheduled until its child process finishes execution. It also sets its
child_waiting member to the pid of the child process.
When the child process has finished execution, it exits via the exit() system
call. The exception handler has been changed such that a process will also exit
via the system call in the event of a page fault in user space.
The exit system call frees the memory allocated to all bookmarks held by the
calling process, then proceed to check whether its parent is waiting on a
process. The child process first checks if its parent is NULL, as it is
permitted for the parent process to terminate before the child. It then finds
its child_bookmark within the parent's child_waits list (list of children and
their bookmarks), the child process stores its own exit status within the
child_exit_status member of the child_bookmark. The child process then checks if
its parent process is waiting on it, by using the parent's wait_for_child
semaphore and the aforementioned child_waiting member that stores the required
pid. If the parent process is waiting on the child process, the child process
ups the parent's wait_for_child semaphore.
Once the child process has exited, the parent process will eventually be
scheduled again, at which point it resumes execution within the wait system
call, since there is no guarantee that the process the parent process waited on
was the last child process to terminate, it cannot use the child_exit_status
member within its thread, which stores the exit status of the last child thread
to exit and is used by process_wait(). Instead, the parent process gets the
child process's bookmark and gets the member child_exit_status from it, setting
the member value afterwards to SYSCALL_ERROR and returning the original value.
If the parent process exits at any point during its execution, it will
deallocate any memory allocated to its bookmarks, but allow its children
processes to continue running, as such, the execution of the children processes
are not affected by the parent terminating. Once the children processes
terminate, they will not attempt to set any bookmarked exit statuses, as they
no longer exist, and will exit normally.
---- SYNCHRONIZATION ----
>> B6: (2 marks)
>> The "exec" system call returns -1 if loading the new executable fails, so it
>> cannot return before the new executable has completed loading.
>> How does your code ensure this?
>> How is the load success/failure status passed back to the thread that calls
>> "exec"?
To ensure that the exec() function does not return before the executable has
finished loading, the thread sema downs wait_for_child immediately after
process_execute() has been invoked and the child_bookmark struct has been
created, so that it blocks itself to allow the new executable to be loaded.
Within the start_process() function in process.c, a conditional statement is
implemented after the load() function. This conditional statement checks a list
of conditions, if all of which are satisfied, the parent of the current thread
will be woken up by calling sema_up() on wait_for_child and the current thread
will yield in order to allow its parent to run and obtain its tid (pid).
If the condition fails, the child thread calls exit() with -1 as the parameter.
Inside exit(), it checks if the current thread's parent has a bookmark for it,
and sets the child_exit_status field in the bookmark to -1, so that when
its parent starts running again, it is able to acquire the error code, and
correspondingly return -1 in the exec() function.
>> B7: (5 marks)
>> Consider parent process P with child process C.
>> How do you ensure proper synchronization and avoid race conditions when:
>> i) P calls wait(C) before C exits?
>> ii) P calls wait(C) after C exits?
>> iii) P terminates, without waiting, before C exits?
>> iv) P terminates, without waiting, after C exits?
i) P fetches C's corresponding child_bookmark from its child_waits list, and
gets a child_exit_status of CHILD_RUNNING, as C has not exited. P set's its
child_waiting member to C's pid, and then calls sema_down() on P's
wait_for_child semaphore. At this point only C could call sema_up() on P's
wait_for_child semaphore, as such when P resumes execution, there is a
guarantee that C must have completed its execution and has exited. No race
conditions can occur as P does not resume execution until C has exited.
ii) C exits, fetching its corresponding bookmark from P, and sets the
child_exit_status member to its exit status. P calls wait() on process C, where
it then gets C's corresponding bookmark from its child_waits member list, P sees
that the bookmarked child_exit_status for C is not CHILD_RUNNING, meaning C must
have already exited. P returns the bookmarked child_exit_status as the exit
status of C. No race conditions can occur as P does not wait and C has already
exited.
iii) P terminates, during its exit, it frees all the memory it allocated,
including its list of child_bookmarks. C at some point executes, and exits,
where it checks if its parent is NULL, which in this case since P no longer
exists, is true. C will not attempt to set its corresponding bookmark in P as P
no longer exist. There should be no race conditions in this case as C checks for
P's existence with interrupts disabled, as such P cannot exit whilst C is in the
process of accessing its resources.
iv) P exits, after C has exited. P proceeds to deallocate any memory it
allocated i.e opened files, children bookmarks. There should be no race
conditions in this case as C's exit status no longer matters to P, and C no
longer exists.
>> Additionally, how do you ensure that all resources are freed regardless of
>> the above case?
All user processes must go through system call exit when it is terminating. When
system call to exit is made, the process frees the memory allocated to each of
the files it has open, in the files member of the thread struct. It then
iterates through the list of child_bookmarks, the child_waits member of the
thread struct, and frees the memory allocated to each bookmark entry.
At this point, the memory allocated during the execution of system calls for the
current process has been freed. The process then calls thread_exit(), which in
turn calls process_exit(). process_exit deallocates and destroys the current
process's page directory. Then thread_exit() then sets the thread's status to
THREAD_DYING, causing its resources to be deallocated at the schedule() which it
then calls.
After the above steps, all resources used by the current process has been freed,
and since user processes will always exit via a system call to exit, this will
always free all resources used by the process.
---- RATIONALE ----
>> B8: (2 marks)
>> Why did you choose to implement safe access of user memory from the kernel in
>> the way that you did?
Our method does have a higher overhead compared to the second method of editing
the exception handler, however by checking each of our given pointers prior to
dereferencing, in the event of an erroneous pointer, we exit the system call
immediately, as such we avoid the computation that would otherwise be performed
within the system call. This design choice was made under the assumption that
unnecessary system calls would be more expensive than the overhead from pointer
validation.
>> B9: (2 marks)
>> What advantages and disadvantages can you see to your design for file
>> descriptors?
The file descriptor is allocated using a static int "file_desc_count" that is
initialized to 2. Each time a new file is opened, a new struct file_fd is
created and the value of "file_desc_count" is assigned as the new file
descriptor then incremented. The struct file_fd contains a pointer to the opened
file and the allocated file descriptor. Next, the struct is pushed onto the list
of files opened by the current thread. When the file needs to be used. The file
pointer will be retrieved by the thread from its list of files through a linear
search by file descriptor.
The major advantage of this design is that the information of the file pointers
and descriptors are held by the thread rather than as a global data structure.
The number of files opened by a thread is much fewer than the total number of
opened files, as a result, it will take less time for the thread to retrieve the
file pointer. Additionally, as the access to the static file_desc_count variable
inside the open system call is done only with the file system access lock,
filesys_lock acquired, we can still guarantee that the file descriptor is unique
for each of the thread's opened files.
A disadvantage is that it still takes linear time to get back a file pointer
with a file descriptor, which, whilst not a significant issue with a small
number of files open, would be inefficient if a process would require a larger
number of files. This could be potentially improved by replacing the files list
with hashmap, which gives constant access time.