A few more examples:
Everything in green is generated by the user, and everything in orange by GPT-3:
Q: Who was the president of the United States in 1955?
A: Dwight D. Eisenhower was president of the United States in 1955.
Q: What party did he belong to?
A: He belonged to the Republican party.
Q: Who was president of the United States before George W. Bush?
A: Bill Clinton was president of the United States before George W. Bush.
Q: Who won the world series in 1995?
A: The Atlanta Braves won the world series in 1995.
Q: What are some of the persons mentioned in the last few sentences?
A: The last few sentences mention Bill Clinton, George W. Bush, Dwight D. Eisenhower, and the Atlanta Braves.
Q: What are some locations mentioned?
A: The locations mentioned are the United States, the World Series, and Atlanta.
Q: Any years?
A: The years mentioned are 1955, 1995, and 2000
(Note that the year 2000 wasn't mentioned, but merely implied, through the Clinton-Bush connection.)
The most interesting part about this, to me at least, is that it is able to process the phrase "the last few sentences". It's one thing to refer back to a previous fact: if I say "A is B", then ask "What is A?", and it answers "B", that requires remembering what was written previously, though not when it was written. It's another thing to say, "A is B. B is C. What did I say the sentence before last?" You see, "the sentence before last" doesn't explicitly refer to A or to B; it's kind of like a variable that means different things at different points in the text.
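To make the "variable" idea concrete, here is a minimal Python sketch. It is only an illustration of relative reference, not a claim about how GPT-3 works internally; the function name `sentence_before_last` and the toy sentences are my own assumptions.

```python
def sentence_before_last(history):
    """Resolve "the sentence before last" against the text so far.

    Hypothetical helper for illustration only: it returns the
    second-to-last sentence, or None if fewer than two exist.
    """
    return history[-2] if len(history) >= 2 else None


history = []   # everything said so far, in order
resolved = []  # what the phrase points at after each new sentence

for sentence in ["A is B.", "B is C.", "C is D."]:
    history.append(sentence)
    resolved.append(sentence_before_last(history))

# The same phrase resolves to a different sentence at each step.
print(resolved)
```

The phrase never names A or B; what it picks out depends entirely on where in the text it is evaluated.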
More probing would need to be done, but if it's able to make sense of things like "the sentence before last" and "3 paragraphs ago", then it means that GPT-3 has a kind of sense of time. At each moment in time, it has a way of referring back to what happened a specific number of time steps ago.
You might be able to ask it, for example, "How many minutes have we been conversing since we talked about X?" (whatever X is), and it might actually be able to cough up an estimate.
How would it do it? Well, it knows how to estimate the cognitive load of text being generated, including the user's text; and, in some of its training data, people might have written, "Shew! We've been talking about this for a good 10 minutes!" -- so it would know how cognitive load correlates with time, in minutes.
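As a rough sketch of that kind of estimate, the Python below fits a single minutes-per-word rate from a few (word count, minutes) pairs and applies it to a new stretch of text. Everything here is an assumption for illustration: word count is just a crude stand-in for "cognitive load", the calibration pairs are made up, and nothing suggests GPT-3 does this explicitly.

```python
def fit_minutes_per_word(samples):
    """Least-squares fit of minutes = rate * words (line through the origin).

    `samples` is a list of (word_count, minutes) pairs. In this sketch they
    stand in for remarks like "we've been talking for a good 10 minutes"
    found alongside text of a known length.
    """
    numerator = sum(words * minutes for words, minutes in samples)
    denominator = sum(words * words for words, _ in samples)
    return numerator / denominator


def estimate_minutes(text, rate):
    """Estimate elapsed minutes from the number of words in `text`."""
    return len(text.split()) * rate


# Made-up calibration data (assumption): 150 words per minute throughout.
calibration = [(300, 2.0), (750, 5.0), (1500, 10.0)]
rate = fit_minutes_per_word(calibration)

conversation = "word " * 450  # a 450-word stand-in for the chat so far
estimate = estimate_minutes(conversation, rate)
print(round(estimate, 2))
```

A real estimate would presumably weight dense passages more heavily than small talk, which is where a notion of load, rather than raw length, would come in.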
Note that in order to pull this off accurately, it would need to be able to estimate its own cognitive load. So, it would need a self-model.
It may not be something you pay close attention to, but you need at least a rudimentary self-model, in order to keep track of time -- e.g. if you focus on your own thoughts, and tune out the world, the amount of subjective time that has passed will be a function of your own self-model of how much change has taken place in your own brain.