Added doc about inclusion of packages mentioned in the student's subm… #22

vkgurbani · 2024-10-30T13:56:39Z

…ission in the test environment

tbrown122387 · 2024-10-30T20:08:00Z

There's a small issue here. Technically "[t]herefore, any libraries included in the student submissions must also be included in the test environment" is not correct.

Strictly speaking, this would be unfortunate if it was true. In that case the instructor would have to match and use the exact same libraries that every student is using. In a big class, that would be especially difficult. So the modularity is a perk.

For instance, if students are using dplyr to create tibbles, I can check the shape and size and particular elements without needing to library(dplyr) in the test file.
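A minimal sketch of that kind of check, assuming the student script leaves an object named `tmp_df` in the global environment. Here `tmp_df` is a hypothetical stand-in built with base R so the snippet runs on its own; in a real submission it might be a tibble built with dplyr. The test file attaches only testthat.

```r
library(testthat)

# Stand-in for the object a student script would create (hypothetical name).
tmp_df <- iris[iris$Species == "setosa", ]

# Shape and element checks use base R accessors only, so the test file
# needs no library(dplyr) call even if the student used dplyr.
test_that("1(a) shape and contents", {
  expect_equal(nrow(tmp_df), 50)
  expect_equal(ncol(tmp_df), 5)
  expect_true(all(tmp_df$Species == "setosa"))
})
```

The same accessors (`nrow()`, `ncol()`, `$`) also work on tibbles, which is why the check does not depend on how the student built the object.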

However, extra library() calls become necessary when your test code, say, calls functions from a third-party library. Checking values created by students does not require third-party libraries, but recreating them in the test file sometimes does.

In the example code you emailed me, your test files ultimately required a few extra library() calls because your test code was calling functions that used third party libraries.

Also, it should be mentioned that instead of using library() you could also prefix certain function calls with the name of their library. This would be difficult in the case of the test code you sent me, I think, but it should still be mentioned if we were to change the vignette.
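As a rough illustration of the prefix approach, the sketch below recreates an expected value inside a test file by calling `dplyr::filter()` directly instead of attaching dplyr. The helper name `expected_setosa` is hypothetical, and the snippet assumes dplyr is installed on the grading machine.

```r
# Hypothetical test-file helper that recreates the expected answer.
# dplyr is never attached; the function is reached through its namespace.
expected_setosa <- function() {
  if (!requireNamespace("dplyr", quietly = TRUE)) {
    stop("dplyr must be installed on the grading machine")
  }
  dplyr::filter(iris, Species == "setosa")
}
```

The package still has to be installed for `::` to work; the prefix only avoids the library() call.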

tbrown122387 · 2024-11-03T15:22:10Z

Happy to pull after some modifications have been made

vkgurbani · 2024-11-04T18:41:07Z

On 11/3/24 9:22 AM, Taylor R. Brown wrote:
Happy to pull after some modifications have been made

The issue is a bit more complex, it seems. Let's work with a small example that uses the dplyr package.

Let's call this "Case 1". Assume a student submits the following code:

    library(dplyr)
    data("iris")
    df <- iris
    tmp_df <- filter(df, Species == "setosa")

Assume that the auto-grading program is checking this as follows:

    library(testthat)
    test_that("1(a)", {
      expect_equal(dim(tmp_df)[1], 50)
      expect_equal(dim(tmp_df)[2], 5)
    })

Then everything works as expected.

Let's call this "Case 2". Now, consider that the student modularizes his or her code, creates a function and filters in the function to return a data.frame object, as follows:

    library(dplyr)
    data("iris")
    q_1_a <- function() {
      df <- iris
      tmp_df <- filter(df, Species == "setosa")
      tmp_df
    }

Assume that the auto-grading program is now structured as follows:

    library(testthat)
    test_that("1(a)", {
      tmp_df <- q_1_a()
      expect_equal(dim(tmp_df)[1], 50)
      expect_equal(dim(tmp_df)[2], 5)
    })

Now suddenly, the test fails because dplyr is not available in the environment of the auto-grader.

There are two options to fix the above issue: (1) force the student to use namespaces, i.e., instead of filter(...), use dplyr::filter(...) in q_1_a(); or (2) import package dplyr in the auto-grading program.

Option 1 (using namespaces) is a non-starter for me for a variety of reasons, primary among them being that the concept of namespaces may not be familiar to all students. The course I teach is a cross-sectional course in data mining and machine learning that attracts students from other disciplines besides CS. So while students may be familiar with some block-structured programming language, functions, and variable scope, and can pick up R quickly, they may not be familiar with the more CS-oriented concepts as namespaces and OOP.

Option 2 works for me, but I understand that the instructor would have to match and use the exact same libraries that every student is using. To me, this is not a problem as I introduce the libraries to be used for their modeling and expect the students to only use those libraries.

In the end, it is a matter of style. Case 1 encourages free use of global variables with the stipulation that the names of the global variables are pre-determined and shared by the instructor with the students. Case 2 encourages modularization through functional scope so the student can write small functions that perform the task, and the instructor simply calls those functions.

Question is, where is the happy medium such that the gradeR package can support both cases? I do not know yet.

Apologies for a long response. Let me know if something workable strikes you; I will be doing the same.

Cheers,
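One way to see why Case 2 can fail is sketched below, assuming the grader ends up calling the student's function in a session where dplyr is not attached (how gradeR arranges this internally is not shown here). An unqualified name such as `filter` is looked up when the function runs, not when it was defined, so the result depends on what is on the search path at call time.

```r
# Stand-in for the student's function. The student's script called
# library(dplyr), but this session (like the assumed grader session) has not.
q_1_a <- function() {
  df <- iris
  tmp_df <- filter(df, Species == "setosa")
  tmp_df
}

# Report which package the bare name `filter` resolves to right now.
resolved_in <- environmentName(environment(filter))
print(resolved_in)

# Try the call and record whether it raised an error.
result <- try(q_1_a(), silent = TRUE)
print(inherits(result, "try-error"))
```

In a fresh session without dplyr attached, the name is expected to resolve to the stats package, whose `filter()` takes different arguments, so the call errors. Attaching dplyr in the test file (option 2) or writing `dplyr::filter()` in the function (option 1) both change what the name resolves to.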

-- - vijay --- Vijay K. Gurbani, Ph.D. Research Associate Professor of Computer Science, Illinois Institute of Technology Chicago, Illinois ***@***.*** http://www.cs.iit.edu/~vgurbani

tbrown122387 · 2024-11-04T21:14:41Z

Option 2 is better, but with regard to your example, the problem is that your test code is assuming things about the student implementation, which is not a great idea. The student can surprise you by either not defining it, or changing it to something unexpected.

At the very least, you should re-define this function you're calling in your own test code. Then, when you redefine it, you'll see you're clearly using third party libraries, and you can deal with them either by using library() or by using ::.

Even better, check the output without using the function at all.

"Option 2 works for me, but I understand that the instructor would have to match and use the exact same libraries that every student is using. "

Not so. The instructor can test student output without using all of those libraries. The student is limited to using libraries that are already installed on whatever machine the code is being run on. This is a feature as well. It forces students to minimize dependencies.

vkgurbani · 2024-11-04T21:40:43Z

Option 2 is better, but with regard to your example, the problem is that your test code is assuming things about the student implementation, which is not a great idea. The student can surprise you by either not defining it, or changing it to something unexpected.

The same problem --- i.e., the student surprising you --- exists even in the case when pre-determined names of global variables are used; I don't see much of a difference. When I give the homeworks, I am meticulous to state the expected return value, for example: "(v) [0.33 points] How many frequent itemsets are there with a support of 0.10? Your function should return an object of class itemset."

At the very least, you should re-define this function you're calling in your own test code. Then, when you redefine it, you'll see you're clearly using third party libraries, and you can deal with them either by using library() or by using ::.

Even better, check the output without using the function at all.

I am not sure what you mean by "re-define this function you're calling in your test code." Perhaps a quick example using the code I provided would explain your thought better.

Thanks,

vijay

tbrown122387 · 2024-11-04T21:46:17Z

Yes, at the very least, you need to request specific variable names. If students, say, spell them wrong, that's on them.

Requesting types is more strict. R is dynamically typed, so usually a few types can be returned and still pass my checks (e.g. tibbles and data frames are both fine).

If you asked them to define a function called q_1_a, and you're testing it, then yes, you need to expect at the very least a certain signature (i.e. what arguments it takes and in what order). In this case you would not redefine the function in your test file.

If q_1_a is some demo code that you allow them to make use of, but cannot force them to use, then I would redefine it again in your test file, just to be sure that it exists in the way you expect when you use it to test.

Anything you use in your test file, if it is dependent on a third party library, I don't see how you can expect to not have to use library() or something like this. If you let student code dictate what libraries are read in, it would be an absolute nightmare. Actually, the first time I wrote an autograder, I did this, and it was absolutely terrible.

vkgurbani · 2024-11-04T22:30:30Z

If you asked them to define a function called q_1_a, and you're testing it, then yes, you need to expect at the very least a certain signature (i.e. what arguments it takes and in what order). In this case you would not redefine the function in your test file.

If q_1_a is some demo code that you allow them to make use of, but cannot force them to use, then I would redefine it again in your test file, just to be sure that it exists in the way you expect when you use it to test.

Right, in my case, q_1_a() is NOT demo code; it is the name of a function that they are instructed to write as part of their homework. The homework writeup stipulates (1) the name of the function, (2) any parameters it takes, and (3) the return value.

Anything you use in your test file, if it is dependent on a third party library, I don't see how you can expect to not have to use library() or something like this. If you let student code dictate what libraries are read in, it would be an absolute nightmare. Actually, the first time I wrote an autograder, I did this, and it was absolutely terrible.

Agreed; the intent is not to have the student code dictate libraries to be used to reduce the complexity on the auto-grader. So, by explicitly importing the library in the auto-grader code, I can continue using it and allow the students to modularize their code.
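A minimal sketch of that arrangement, assuming dplyr is installed on the grading machine. The definition of `q_1_a()` below is only a stand-in so the snippet runs by itself; in practice the function would come from the student's submission, and the homework writeup would fix its name, parameters, and return value.

```r
library(testthat)
library(dplyr)  # attached by the grader because q_1_a() calls filter() unqualified

# Stand-in for the function the student was instructed to write.
q_1_a <- function() {
  df <- iris
  tmp_df <- filter(df, Species == "setosa")
  tmp_df
}

test_that("1(a)", {
  tmp_df <- q_1_a()
  expect_s3_class(tmp_df, "data.frame")
  expect_equal(dim(tmp_df), c(50L, 5L))
})
```

The library() call here is dictated by the instructor's list of allowed packages rather than by whatever each student happened to load.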

Question is, how should we update the gradeR document to support such a use case, assuming we want to support it? The only use case documented in the vignette right now is the use of pre-determined global variables. I am trying to stay away from such global variables due to unintended consequences and side effects that may creep into all but the most simple of programs.

Thanks,

vijay

vkgurbani · 2024-11-06T16:33:11Z

@tbrown122387, good morning. Any more thoughts on the above chain?

One way to proceed is to expand the gradeR vignette so it supports multiple ways to structure homeworks for auto-grading. Currently there is only one way to do so --- using pre-determined and agreed-upon names of variables available in the global namespace. This may not work for all cases, so we could present a second way to use the package as I have done.

WDYT?

Thanks,

vijay

tbrown122387 · 2024-11-11T15:10:07Z

I’m thinking more of a minimal edit. Maybe an additional sentence or two

vkgurbani · 2024-11-13T00:10:39Z

I think it will be hard to fit it in a sentence or two. The issue is complex enough. The way to use the auto-grader as currently shown is good and it works. However, there are other ways to approach the problem. If interested, I don't mind writing a vignette on my approach if you think that is warranted.

Cheers,

- vijay

tbrown122387 · 2024-11-14T03:40:35Z

A whole vignette? Is your approach that distinct from mine?

vkgurbani · 2024-11-14T15:04:00Z

A whole vignette? Is your approach that distinct from mine?

@tbrown122387, good morning. Well, the three sentences I suggested a couple of weeks ago in the pull request are:

"Note that the student submissions are run in a new, clean environment. The tests are run in a different environment than the student submissions. Therefore, any libraries included in the student submissions must also be included in the test environment."

However, these did garner some pushback [1], which is why we are having the current conversation. If you think we can wordsmith the above sentences, that'll be one way to proceed. Any suggestions on wordsmithing the above?

Thanks,

vijay

[1] #22 (comment)

Added doc about inclusion of packages mentioned in the student's subm…

e541fc5

…ission in the test environment
