Skip to main content
Teaching

Can ChatGPT do an exam?

In this post we have a look at ChatGPT's answers to the final exam of a course I teach. I leave it to the reader to decide whether the given answers are correct or not. Anyway, I am impressed, but not convinced.

Hans-Dieter Hiep

In this post, I'll publish the answers given by OpenAI's ChatGPT for the final exam of the course Concepts of Programming Languages, a course I gave in the fall semester of 2022. I leave it to the reader to decide whether the given answers are correct or not ;-)

ChatGPT is a research project developed by OpenAI, which gives the following charter on their website: "OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome." I have used version Februari 13 (2023) of ChatGPT.

After creating an account and logging in, the user interface for ChatGPT gives the following disclaimer: "While we have safeguards in place, the system may occasionally generate incorrect or misleading information and produce offensive or biased content. It is not intended to give advice." So, we're better beware of the answers...

I give it the following first input: Hi. I would like to ask you a number of exam questions about the course Concepts of Programming Languages.

ChatGPT: Of course, I'll do my best to help you with your exam questions about Concepts of Programming Languages. What would you like to ask?

So, here we go. All quotations are generated by ChatGPT.

Question 1.

What distinguishes a high-level programming language from a low-level programming language? Choose one or more correct answer from below.

  1. High-level programming languages are used by more programmers than low-level programming languages.
  2. High-level programming languages are always compiled into low-level programming languages, whereas low-level programming languages are directly executed by hardware processors.
  3. High-level programming languages allow (some) concepts to be expressed more directly, whereas low-level programming languages require (convoluted, error-prone) encodings to express the same concepts.
  4. A low-level programming language allows the programmer to describe more efficient computations, whereas a high-level programming language describes less efficient computations.
ChatGPT: The correct answer is 3. High-level programming languages allow (some) concepts to be expressed more directly, whereas low-level programming languages require (convoluted, error-prone) encodings to express the same concepts.

Explanation: High-level programming languages are designed to provide a more abstract view of programming, allowing the programmer to express concepts and algorithms more easily and directly. They typically use more human-readable syntax and provide built-in abstractions for common programming tasks, such as memory management, string manipulation, and I/O operations. In contrast, low-level programming languages require the programmer to work more directly with the underlying hardware, requiring more detailed knowledge of the system architecture and often relying on manual memory management and other low-level operations. While low-level programming languages may offer greater control over the hardware, they can be more error-prone and harder to use effectively than high-level languages.

Question 2.

Give a short motivation for each choice of the question above, why you made your selection (that is, why it is correct if you selected it, or why it is incorrect if you did not select it). Keep your motivation short: write at most two sentences per choice.

ChatGPT: Sure, here are the motivations for each choice:

1. Incorrect: The number of programmers using a language is not a factor in distinguishing between high-level and low-level programming languages.

2. Incorrect: While some high-level programming languages may be compiled into low-level machine code, this is not always the case, and low-level languages can also be compiled.

3. Correct: High-level programming languages provide built-in abstractions and a more abstract view of programming, making it easier to express concepts and algorithms more directly. In contrast, low-level languages require more direct manipulation of the hardware, making it necessary to use convoluted encodings to express the same concepts.

4. Incorrect: The efficiency of computations is not directly related to whether a language is high-level or low-level. Both types of languages can be used to write efficient code, depending on the specific task and the skill of the programmer.

Question 3.

What statements about pure and impure expressions are true? Choose one or more correct answer from below.

  1. There exists an expression that is considered impure if you look at its parts, but which has no effect on the program state after it is fully evaluated.
  2. An expression is only impure if its evaluation requires a large amount of time or space.
  3. Impure expressions must have no side-effect on the program state when they are evaluated.
  4. Assignment statements with impure expressions on the right-hand side can be eliminated, resulting in an equivalent program without any impure expressions.
ChatGPT: The correct statement is 1. There exists an expression that is considered impure if you look at its parts, but which has no effect on the program state after it is fully evaluated.

Explanation: Pure expressions are those that have no side effects and always return the same result given the same input. Impure expressions, on the other hand, have side effects or may return different results for the same input. However, it is possible for an expression to be impure by definition but have no effect on the program state when evaluated. For example, an impure expression might generate a warning or log a message, but if the program does not use or check for these outputs, the expression is effectively pure.

Statement 2 is incorrect because the time or space required for evaluation is not a factor in determining whether an expression is pure or impure.

Statement 3 is incorrect because impure expressions by definition have side effects, so they do affect the program state when evaluated.

Statement 4 is incorrect because assignment statements with impure expressions on the right-hand side cannot always be eliminated without changing the behavior of the program. For example, if the impure expression generates a warning or logs a message, eliminating the assignment would result in the loss of this information.

Question 4.

Give a short motivation for each choice of the previous question above, why you made your selection (that is, why it is true if you selected it, or why it is not true if you did not select it). Keep your motivation short: write at most two sentences per choice.

(ChatGPT already did that.)

Question 5.

What statements about exceptions are true? Choose one or more correct answer from below.

  1. Numerical errors such as division by zero must immediately cause a program crash.
  2. It is never possible to rethrow an exception that was caught before.
  3. Throwing an exception always requires unwinding the whole call stack to find an exception handler.
  4. An unhandled exception that was thrown could lead to a program crash.
ChatGPT: The correct statements are 3 and 4.

Explanation:

1. Incorrect: Numerical errors such as division by zero can be handled by throwing an exception, rather than causing a program crash.

2. Incorrect: It is possible to rethrow an exception that was caught before using the "throw" statement.

3. Correct: When an exception is thrown, the call stack is unwound until an appropriate exception handler is found. This means that the function that threw the exception and any functions that it called will be exited, but this does not necessarily mean that the whole call stack will be unwound.

4. Correct: If an exception is not caught by any handler, it will cause the program to terminate with an error message. This can lead to a program crash if the error is not handled gracefully.

Question 6.

Give a short motivation for each choice of the previous question above, why you made your selection (that is, why it is true if you selected it, or why it is not true if you did not select it). Keep your motivation short: write at most two sentences per choice.

(ChatGPT already did that.)

Question 7.

What statements about object-oriented programming are true? Choose one or more correct answer from below.

  1. Dynamic dispatch is a calling convention, where the method body that is actually invoked depends on the run-time type of the callee.
  2. Class inheritance may sometimes be incompatible with Liskov's substitution principle (where code that works on an object of type A also works for any object that is a subtype of A).
  3. Every class has an object as an instance.
  4. Objects live on a heap, and in garbage collected languages reachable objects may be removed in the background while the program is running.
ChatGPT: The correct statements are 1, 2, and 4.

Explanation:

1. Correct: Dynamic dispatch is a key feature of object-oriented programming, where the method body that is actually invoked depends on the run-time type of the object that the method is invoked on.

2. Correct: Class inheritance can lead to violations of Liskov's substitution principle if the subclass behavior is not compatible with the superclass behavior. For example, if a subclass violates the preconditions or postconditions of a superclass method, code that works on the superclass object may not work correctly on the subclass object.

3. Incorrect: While every object belongs to a class, not every class has an object as an instance. A class can be abstract or simply serve as a blueprint for creating objects.

4. Correct: Objects in object-oriented programming are typically allocated on a heap, and in garbage-collected languages, unreachable objects may be removed by the garbage collector while the program is running. This allows the programmer to focus on the program logic and not worry about memory management.

Question 8.

Give a short motivation for each choice of the previous question above, why you made your selection (that is, why it is true if you selected it, or why it is not true if you did not select it). Keep your motivation short: write at most two sentences per choice.

(ChatGPT already did that.)

Question 9.

Given the following context-free grammar in Backus-Naur Form (BNF):

<expr> ::= (<expr>) | <expr> + <expr> | <expr> * <expr> | <num> | <var>
<num> ::= <digit> | <digit><num>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<var> ::= <letter> | <letter><var>
<letter> ::= a | b | c | ...
<stmt> ::= {<stmt>} | <var> := <expr> | if <expr> = 0 then <stmt> else <stmt> | <stmt>; <stmt>

The starting symbol is <stmt>.

Which of the following strings are in the language defined by the grammar above? Choose one or more correct answer from below.

  1. abc := a+b;
  2. c := (((c)))
  3. a := b + a; b := a - b
  4. if (a + b) = 0 then a := b else b := a
ChatGPT: The correct strings are 1, 2, and 4.

Explanation:

1. Correct: The string abc := a+b; can be generated by the rule <var> := <expr>. The variable "abc" is a sequence of letters, which is generated by the rule <letter><var>, and the expression "a+b" is generated by the rule <expr> + <expr> with <expr> being the variable "a" and the variable "b", which are generated by the rule <var>.

2. Correct: The string (((c))) can be generated by the rule <expr> with <expr> being (<expr>) and <expr> being (<expr>) and <var> being "c".

3. Incorrect: The string a := b + a; b := a - b cannot be generated by the given grammar because it is not a valid <stmt>. The grammar allows only one assignment statement per <stmt>.

4. Correct: The string if (a + b) = 0 then a := b else b := a can be generated by the rule if <expr> = 0 then <stmt> else <stmt>, with <expr> being (a+b) generated by the rule <expr> + <expr>, and <stmt> being the assignment statements "a := b" and "b := a" generated by the rule <var> := <expr>.

Question 10.

Motivate, for each string in the previous question above, why you made your selection (i.e. why the string is in the language if you selected it, or why it is not if you did not select it).

(ChatGPT already did that.)

Question 11.

Is the grammar given above ambiguous? Give a reasoned answer (i.e. not only "yes" or "no", but also why and how you found out).

ChatGPT: Yes, the grammar given above is ambiguous.

Reasoning:

Ambiguity in a context-free grammar means that there exists at least one string in the language that can be generated by the grammar in more than one way. In the given grammar, the expression <expr> + <expr> can be associated with either left or right recursion, which leads to different interpretations of the same input. For example, the input string "1 + 2 * 3" can be parsed as either "(1 + 2) * 3" or "1 + (2 * 3)", which have different values. This indicates that the grammar is ambiguous.

One way to demonstrate the ambiguity is by constructing a parse tree for an ambiguous string. For instance, consider the string "1 + 2 * 3". It can be parsed in two different ways, resulting in two different parse trees:
   +                    *
  / \                  / \
 1   *                +   3
    / \              / \
   2   3            1   2
The left tree corresponds to the interpretation "(1 + 2) * 3", while the right tree corresponds to the interpretation "1 + (2 * 3)". Since the grammar allows both of these parse trees, it is ambiguous.

Question 12.

Given the following context-free grammar of the (untyped) lambda calculus:

<expr> ::= <var> | λ<var><expr> | <expr><expr>

We use only single letter variables. Use the standard way of resolving ambiguity: application associates to the left, abstraction precedes application (that is, abstraction binds more strongly than application, or, in other words, abstraction has a higher precedence than application). Expressions can use parentheses for disambiguation (not shown in the grammar).

Draw the abstract syntax tree of the following lambda expressions (in black). Circle the free variable occurrences (in red). Show the binding of variables (in green).

  1. xyz
  2. λxxx
  3. (λyxλx(yx))y
ChatGPT: 1.
     xyz
      |
    -----
   |     |
   x     y
        / \
       x   z
Variable occurrences: x, y, z. All three are free.

Binding of variables: none.

1.
      λxxx
        |
      -----
     |     |
     λ     x
          |
          x
Variable occurrences: none.

Binding of variables: x is bound by the abstraction λxxx.

1.
          λyx
            |
         --------
        |        |
        x     --------
             |        |
             y     -----
                   |     |
                   x     y
                   |
                   y
Variable occurrences: y appears once as a free variable.

Binding of variables: x is bound by the first abstraction λx(yx), and y is bound by the second abstraction λyx.

Question 13.

Explain the meaning (what kind of statement) and execution (what happens when statement is executed) of each statement of the following program with dynamic memory management:

int *x = new(5);
int *z = null;
{
  int *y = new(6);
  z := y;
  delete z;
  *x := *y;
}
ChatGPT: This is a program written in C++ with dynamic memory management. Here's what each statement does:

1. int *x = new(5);: This statement dynamically allocates memory for an integer array of size 5 and assigns the address of the first element to the variable x.

2. int *z = null;: This statement initializes a pointer variable z to null.

3. {: This is the beginning of a block statement.

4. int *y = new(6);: This statement dynamically allocates memory for an integer array of size 6 and assigns the address of the first element to the variable y.

5. z := y;: This statement assigns the value of y to z, so z now points to the same memory location as y.

6. delete z;: This statement deallocates the memory that z points to, which is the memory that y also points to. This means that the memory for the integer array of size 6 that was allocated on the heap is now freed.

7. *x := *y;: This statement copies the value of the first element of the integer array that y points to, which is also the value of the first element of the integer array that x points to. This means that the value of x is now equal to the value of y.

The execution of this program involves allocating and deallocating memory on the heap and copying values between memory locations. When the program runs, it first allocates memory for the integer array of size 5 and assigns its address to x. Then it initializes z to null. Inside the block statement, it allocates memory for the integer array of size 6 and assigns its address to y. It then assigns the value of y to z, so z points to the same memory location as y. It then deallocates the memory that z points to, which also frees the memory that y points to. Finally, it copies the value of the first element of the integer array that y points to, which is also the value of the first element of the integer array that x points to. After the block statement, the memory for the integer array of size 5 is still allocated and assigned to x, while the memory for the integer array of size 6 that was allocated inside the block statement has been deallocated.

Question 14.

Consider the following two programs (on the left and right) that are executed concurrently (x and y are shared integer variables):

x := 5;                       y := 3;
y := 2;                       x := 2;
if (x < 5) {                  x := x - y;
  y := 4                      y := y + 6;
}                             print x
x := x + y;

Give five different executions of the program, i.e. five executions that each prints a different value. The result (of the five executions) alone is not sufficient. Please also describe, for each execution, what happened/how you obtained the result.

ChatGPT: Here are five possible executions:

Execution 1:
Thread 1: x := 5;
Thread 2: y := 3;
Thread 2: x := 2;
Thread 1: y := 4;
Thread 1: if (x < 5) { // Condition is false, skip if statement
Thread 1: x := x + y;
Thread 2: x := x - y; // x = 2 - 3 = -1
Thread 2: y := y + 6; // y = 3 + 6 = 9
Thread 1: print x; // x = 9

Execution 2:
Thread 1: x := 5;
Thread 2: y := 3;
Thread 2: x := 2;
Thread 1: y := 4;
Thread 1: if (x < 5) { // Condition is false, skip if statement
Thread 2: y := y + 6; // y = 3 + 6 = 9
Thread 2: x := x - y; // x = 2 - 9 = -7
Thread 1: x := x + y; // x = 5 + 4 = 9
Thread 1: print x; // x = 9

Execution 3:
Thread 1: x := 5;
Thread 2: y := 3;
Thread 2: x := 2;
Thread 1: y := 4;
Thread 1: if (x < 5) { // Condition is false, skip if statement
Thread 2: y := y + 6; // y = 3 + 6 = 9
Thread 2: x := x - y; // x = 2 - 9 = -7
Thread 1: x := x + y; // x = 5 + 4 = 9
Thread 2: print x; // x = -7

Execution 4:
Thread 1: x := 5;
Thread 2: y := 3;
Thread 2: x := 2;
Thread 1: y := 4;
Thread 1: if (x < 5) { // Condition is false, skip if statement
Thread 2: y := y + 6; // y = 3 + 6 = 9
Thread 1: x := x + y; // x = 5 + 4 = 9
Thread 2: x := x - y; // x = 2 - 9 = -7
Thread 1: print x; // x = 9

Execution 5:
Thread 1: x := 5;
Thread 2: y := 3;
Thread 2: x := 2;
Thread 1: y := 4;
Thread 1: if (x < 5) { // Condition is false, skip if statement
Thread 2: y := y + 6; // y = 3 + 6 = 9
Thread 1: x := x + y; // x = 5 + 4 = 9
Thread 2: print x; // x = -7

In general, the value printed depends on the order in which the threads execute their statements and whether the if statement is executed or skipped. In each execution, we simulate the interleaving

Conclusion

(Note that I have removed some questions which are hard to give as input to ChatGPT textually.)

Decide for yourself what you think of the answers. I am impressed: the answers generated by ChatGPT seem closely related to the question, and make sense grammatically. However, when looking more closely into the answers, it is revealed that the chat bot is not able to give reasonably correct answers to all questions.