Homework 8 Infinite Story

All parts of HW8 are due Wed Dec 3rd at 11:55 pm.

This project is a modification of Chris Piech's neat AI project. Chris teaches CS109.

The idea of this project is that an adventure story is stored in a dict. The story has different scenes, and the user can go from one scene to another. The trick is that AI is used to come up with new scenes as needed, so the story just goes on forever. In part, this project gives realistic practice with dictionaries and functions, and of course it's neat to see AI integrated to provide such an audacious feature.

Download infinite-story.zip to get started.

Not Open AI Key

Go to not open ai to get an API key string. Update, with our apologies: the URL we had here before the evening of Fri the 11th was not working right, so please use the new link above to make your key. One file in the notopenai code needs a small change to work with the new key. If you have already downloaded the project, edit the file notopenai/client.py, find the string "spr24", and change it to "spr25". That's it. If you download the current infinite-story.zip instead, it already has the change. Sorry about the mix-up with these details.

Once you have your key, paste your notopenai key into the line

CLIENT = NotOpenAI(api_key="your-key-here")

Aside: Why is the API called NotOpenAI? It's just a joke. OpenAI is the company that created GPT-3.5. To keep things free for you, we are routing all of your requests through our CS106A paid account. We call it NotOpenAI because we are not OpenAI 🙂. The API is identical to the OpenAI API, and if you want to switch over to your own paid OpenAI key, you can!

Pillow + requests

You also need to install "Pillow" and also the "requests" module. The following line will take care of both:

$ python3 -m pip install Pillow requests

Story Structure

First take a look at the "story" structure. The structure is complicated, but is a realistic example. The "story" is the outer dict, containing elements to build an adventure story. Within the story is a "scenes" key, which points to a dict of all the scenes. Each scene is a dict, known by its key in the scenes dict. In the example below, from the file tiny.json, the scenes are 'start' and 'scene_aaa'. Each scene contains a "choices" list, and each choice is a little dict holding a scene_key of a scene the user might go to. In the tiny.json example, from the "start" scene, it's possible to go to either 'scene_aaa' or 'scene_bbb'.

{
  "plot": "plot words blah blah blah",
  "scenes": {
    "start": {
      "text": "You are at the start.",
      "scene_summary": "Words about start.",
      "choices": [
        {
          "text": "blah blah",
          "scene_key": "scene_aaa"
        },
        {
          "text": "blah blah",
          "scene_key": "scene_bbb"
        }
      ]
    },
    "scene_aaa": {
      "text": "blah blah",
      "scene_summary": "blah",
      "choices": [
        {
          "text": "Go back to the start",
          "scene_key": "start"
        },
        {
          "text": "blah",
          "scene_key": "scene_ccc"
        }
      ]
    }
  }
}
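
To get a feel for the nesting, here is a small sketch that digs into a cut-down copy of the tiny.json structure, one level at a time. The variable names (scenes, start, first_choice, targets) are just illustrative choices, not names from the starter code.

```python
# A cut-down copy of the tiny.json structure, with only the 'start' scene.
story = {
    "plot": "plot words blah blah blah",
    "scenes": {
        "start": {
            "text": "You are at the start.",
            "scene_summary": "Words about start.",
            "choices": [
                {"text": "blah blah", "scene_key": "scene_aaa"},
                {"text": "blah blah", "scene_key": "scene_bbb"},
            ],
        },
    },
}

scenes = story["scenes"]            # dict: scene_key -> scene dict
start = scenes["start"]             # one scene dict
first_choice = start["choices"][0]  # "choices" is a list of little dicts

# Collect the scene_key out of each choice in the 'start' scene.
targets = [choice["scene_key"] for choice in start["choices"]]

print(start["text"])  # You are at the start.
print(targets)        # ['scene_aaa', 'scene_bbb']
```

Each pair of square brackets goes one level deeper: story, then the scenes dict, then one scene, then its list of choices.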

Milestone 1 - Unknown Scene

It will turn out to be interesting when a scene_key is referenced, but there is no such key in the story. We'll call this an "unknown" scene_key.

Complete the code for the is_unknown_scene() function, which checks whether a key is unknown. Doctests are provided. The story dict built for the Doctests is deliberately tiny: it has just a couple of scenes and leaves out everything not needed for the test.

def is_unknown_scene(story, scene_key):
    """
    Return True if this scene_key is not present in the story scenes,
    i.e. it is an unknown scene.
    >>> story = {'plot': 'xyz', 'scenes': {'start': {}, 'go_outside': {}}}
    >>> is_unknown_scene(story, 'go_outside')
    False
    >>> is_unknown_scene(story, 'selfie_with_celeb')
    True
    """
    pass
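
As a reminder of the dict technique involved, the "in" and "not in" operators test whether a key is present in a dict. The short sketch below shows them on the same toy scenes used by the Doctest. It is only an illustration of the operators, not the finished function.

```python
# The 'scenes' dict from the Doctest story: keys are scene_keys.
story = {'plot': 'xyz', 'scenes': {'start': {}, 'go_outside': {}}}
scenes = story['scenes']

# "in" checks the keys of a dict, not the values.
has_outside = 'go_outside' in scenes
has_selfie = 'selfie_with_celeb' in scenes
missing_selfie = 'selfie_with_celeb' not in scenes

print(has_outside)     # True
print(has_selfie)      # False
print(missing_selfie)  # True
```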

Milestone 2 - List Unknown Scenes

Complete code for the unknown_scenes() function. Given a story, look at all the scenes, and within them, all the choices. Return a list of all the unknown scene_keys from among the choices. Doctests are provided. One test uses the 'tiny.json' example shown above, which should return the list ['scene_bbb', 'scene_ccc']. The other test uses the 'original_small.json' story, and you can look at that file to see its scene details.

def unknown_scenes(story):
    """
    Look at all the scenes, and within them,
    all the choices. Return a list of all the scene_key
    strings which are unknown.
    >>> story = json.load(open(f'data/tiny.json'))
    >>> unknown_scenes(story)
    ['scene_bbb', 'scene_ccc']
    >>> story = json.load(open(f'data/original_small.json'))
    >>> unknown_scenes(story)
    ['next_to_gully', 'descend_into_valley', 'watching_sunset', 'continue_exploring_hilltop', 'return_to_small_brick_building']
    """
    pass
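
The "all the scenes, and within them, all the choices" shape suggests a loop inside a loop. Here is a sketch of that traversal on a cut-down copy of the tiny.json data. It gathers every scene_key mentioned by any choice, known or not, so picking out only the unknown ones is still up to you. The list name referenced is just for this example.

```python
# Cut-down tiny.json: only the parts needed to walk the choices.
story = {
    'scenes': {
        'start': {'choices': [{'scene_key': 'scene_aaa'},
                              {'scene_key': 'scene_bbb'}]},
        'scene_aaa': {'choices': [{'scene_key': 'start'},
                                  {'scene_key': 'scene_ccc'}]},
    }
}

referenced = []
for scene in story['scenes'].values():   # outer loop: each scene dict
    for choice in scene['choices']:      # inner loop: each choice in that scene
        referenced.append(choice['scene_key'])

print(referenced)  # ['scene_aaa', 'scene_bbb', 'start', 'scene_ccc']
```

Python dicts keep their keys in insertion order, so the scenes are visited in the same order they appear in the file.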

Milestone 3 - Create New Scene

This function calls the AI to create a new scene when the player goes to a scene that does not exist yet.

The boilerplate code to call the AI is provided, but your code needs to construct the prompt string.

The prompt string should have the following contents, with text you need to include shown in brackets. Get data you need from inside the story dict, e.g. the 'start' scene.

Return the next scene of a story for key [scene key]. An example scene should be formatted in json like this: [json form of start scene dict]. The main plot line of the story is [plot text from story].

The provided code sends this prompt to the AI which should compose a new scene dict, which the function returns.
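
One way to fill in the bracketed parts is an f-string, with json.dumps() producing the "json form" of a dict. The sketch below uses made-up toy values for the scene key, example scene, and plot. In your function these come from the parameters and the story dict, and the variable names here are assumptions for illustration only.

```python
import json

# Toy stand-ins; in the real function these come from the story dict.
scene_key = 'scene_bbb'
example_scene = {
    'text': 'You are at the start.',
    'scene_summary': 'Words about start.',
    'choices': [{'text': 'blah blah', 'scene_key': 'scene_aaa'}],
}
plot = 'plot words blah blah blah'

# json.dumps() turns a Python dict into a JSON-formatted string.
example_json = json.dumps(example_scene)

# Adjacent f-strings inside parentheses are joined into one long string.
prompt = (
    f'Return the next scene of a story for key {scene_key}. '
    f'An example scene should be formatted in json like this: {example_json}. '
    f'The main plot line of the story is {plot}.'
)

print(prompt)
```

Note that json.dumps() writes strings with double quotes, which is what JSON requires, whereas str() on a dict would give Python-style single quotes.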

Milestone 4 - Next Scene

This function handles the key logic for every step by the user through the story. Once this function is done, the program should be able to run.

Given a story and a scene_key, return the scene dict for that scene_key. There is a tricky case: the scene_key may be unknown. In that case, the function should create a new scene for this scene_key and insert the new scene dict into the story. Then in all cases, return the scene dict. So essentially, it always returns the scene dict for the given scene_key, it may just be that the dict was created just now. If you create a new scene, remember to insert it into the story dict under its scene_key.

def next_scene(story, scene_key):
    """
    Given a story and scene_key. If the scene_key is unknown,
    construct a new scene dict via the AI and insert the new scene
    dict into the story. In all cases, return the scene dict
    from the story for this scene_key.
    """
    pass
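
The "look it up, create it if missing, then return it" pattern is common with dicts. Below is a sketch of the pattern on its own, with no AI involved. Both get_or_create() and fake_new_scene() are hypothetical names made up for this example, and fake_new_scene() simply returns a canned dict where the real program would call the AI.

```python
def fake_new_scene(scene_key):
    """Hypothetical stand-in for the AI call: returns a canned scene dict."""
    return {
        'text': f'Placeholder text for {scene_key}.',
        'scene_summary': 'placeholder',
        'choices': [],
    }


def get_or_create(scenes, scene_key):
    """Return scenes[scene_key], creating and storing it first if missing."""
    if scene_key not in scenes:
        scenes[scene_key] = fake_new_scene(scene_key)  # insert under its key
    return scenes[scene_key]


scenes = {'start': {'text': 'You are at the start.', 'choices': []}}

first = get_or_create(scenes, 'scene_bbb')   # missing: created and stored
second = get_or_create(scenes, 'scene_bbb')  # present now: just looked up

print('scene_bbb' in scenes)  # True
print(first is second)        # True
```

Because the new dict is stored the first time, the second call returns the very same dict rather than making another one.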

Milestone 5 - Let's Go!

The other functions are provided. They call your next_scene() and the other functions to make the story work.

Run the program like this:

$ python3 infinite_story.py data/original_small.json

The stories available to run are original_small.json, original_big.json, and engineer_story.json. The syntax 'data/original_small.json' is the way to refer to the file 'original_small.json' inside the folder 'data'. Use the tab-key to autocomplete parts of the filenames on the command line.

When the program prompts the user to select their next scene, a '*' next to an option marks that it leads to an AI-generated scene (using your is_unknown_scene() function of course!).

A graphics window will pop up when the program starts, showing some DallE graphics for the scenes in the story files. Move the graphics window off to the side, as you type commands in the terminal window. There are only graphics pre-made for a few of the early scenes, and as AI starts making up new scenes, the graphics window will just be a blue rectangle. The graphics files are simply stored in the 'img' folder. Take a look in there if you are curious. It would be easy for you to add your own images if you want to create a story.

Explore around, see how the AI does. You can mess around with our stories or write your own.

As a last step, we have some ethics reflections.

Ethics Reflection

There's going to be AI in your future, so we would like to establish some truisms you can keep in mind. The AI training reads in vast quantities of text, and much of this has an English language and culture flavor. The patterns and biases in those sources are baked into the model the AI uses to produce results.

The following prompt, provided in the starter code, asks the AI to add to a story about a family that lives next to a woods, where the parents tell the children not to leave the house.

# The prompt for the --folk feature.
FOLK_PROMPT = '''Return this story with five lines, and give the children names.
  The result should be formatted in json like this:
 {"title": "story title",
  "lines": ["line 1 of story", "line 2 of story", "more lines"]}.
 The beginning of the story is:
 {"title": "Child story",
  "lines": ["Once there was a poor family with two children living next to a woods.",
  "The parents told the children to stay inside while the parents were out."]}
'''

Here is the command to run the AI to work on this story. Run this command three times (no coding is required). Look at the three stories by the AI.

$ python3 infinite_story.py -folk

What you will likely see is that the AI produces a story structure that echoes traditional European folk tales, such as Hansel and Gretel or Little Red Riding Hood.

The prompt does not mention folk tales, but the AI homes in on the pattern in its training data. The children always disobey, and encounter some fairy-tale situation in the woods. There are many possible stories one could tell about two children left at home, but the pattern in training data is irresistible to the AI, so this is what we get.

Remember this: the AI is first trained on a body of data. That data naturally has patterns and biases, and it's hard for the AI to avoid these in its output.

Open the file "infinite_ethics.txt". Look at your AI output, and pull out the names the AI uses for the children. For Q1 in the file, enter these names. Most of the world does not use names like this! The names are a shorthand reminder that the AI output is tailored by its input training data.

Here is Q2 to answer in the file: Q2 - What are some issues you can imagine occurring if, instead of asking AI to generate a story, you used AI to evaluate candidates for a job? We do not need a very long answer, just one showing a little thought about the dynamics of AI.

Well and Truly All Done!

Please turn in your infinite_story.py and infinite_ethics.txt files on Paperless as usual.

History and Background

This neat project was first built by Chris Piech and others, and then Nick Parlante modified it, adding in Doctests, adding the folk-tale example, and providing more of the boilerplate to make a smaller assignment. Here is the acknowledgement from Chris's version:

Assignment designed by Chris Piech, inspired by Eric Roberts. Handout written with Anjali Sreenivas, Yasmine Alonso, Katie Liu. Ethics by Javokhir Arifov and Dan Webber. Test scripts by Iddah Mlauzi and Tina Zheng. Advised by Mehran Sahami and Ngoc Nguyen, and more!