Writing Instruction and Generative AI

Author: Amanda Leary

Generative artificial intelligence tools, and especially large language models (LLMs) such as ChatGPT, have rattled the way we think about the purpose and value of writing in our classrooms. With the ability to generate decent (not perfect, not expert) essays, students and instructors alike are questioning the value of writing assignments at all. We as instructors know that writing is thinking, a process of becoming. LLMs don’t think; they predict what words should come next based on data in the textual corpus on which the model was trained. You can prompt an LLM to act within a certain persona, write for a particular audience, use a particular set of guidelines, and demonstrate certain features in the writing, but at the end of the day all generative AI is doing is guessing based on statistical patterns in its training data. The cognitive and affective labor that goes into writing is bypassed when a piece of writing is simply generated by an LLM.

But this is not the only way LLMs can be used in writing. While we may say in our syllabi that “AI-generated content cannot be submitted as your own,” these smaller ways—providing topic ideas and outlines, editing, and feedback—are murkier ground and can have significant benefits and ramifications for student learning. Ready or not, AI is here to stay; understanding how our students are using it—and what AI is actually doing—is key to discerning how and when students can use AI in their writing process.

Throughout this resource, we have provided examples of how the three main large language models tackle different writing tasks. It is important to keep in mind that the quality of the output depends on the quality of the input. Prompt engineering, the process of adding layers of context and guardrails to shape the generated text, is crucial to getting a more sophisticated and nuanced output. If you are allowing or requiring students to use AI, providing information on how to engineer prompts for specific use cases will help build AI literacy. Prompt engineering is also a productive way to manage expectations for appropriate use by limiting the context in which students can engage and what they can ask an LLM to do. You may even consider providing a list of allowed prompts and set the expectation that students will cite their chat threads.

As you explore these uses of AI for writing and determine what “acceptable use” looks like in your class, consider your learning objectives. To what degree is writing instruction, and learning specific disciplinary modes of communication, central to the course? What skills do you want students to demonstrate through their writing? If things like synthesis, analysis, argument, and understanding are most important, there are uses of AI that support—and don’t inhibit—writing as thinking. If writing instruction is part of your course goals, integrating AI should foster skill acquisition, not replace it. Build in critical, metacognitive reflections that encourage students to compare their own thinking and learning with the AI output. Critiquing, expanding, and revising outputs can help students develop their own writing style and voice, and help grow confidence in their knowledge and abilities as writers.¹

Expand any of the sections by clicking on the + sign.

Pre-Writing with AI: Brainstorming Topics and Outlines

Because of the sheer amount and variety of data LLMs are trained on, they have a little bit of knowledge about a lot of things. This makes them especially helpful for generating topic ideas and prompting directions for further research. An LLM’s shallow understanding of a topic can help students narrow down what actually interests them and can generate specific areas for them to research on their own.

A screenshot of the beginning of a chat interaction with ChatGPT 3.5. The prompt from an anonymous user is: "What are some big questions in cybersecurity I could write an 8-page researched argument essay on?" — A screenshot of the beginning of a chat interaction with ChatGPT 3.5

Let’s look at how brainstorming with ChatGPT 3.5 can work. In the adjacent screenshot, you can see the beginning of a chat about cybersecurity research essay topics; to view the whole interaction, go to https://chat.openai.com/share/5a660cdf-7102-4e51-ac42-1d5a14dbb25c (no ChatGPT account required).

Here, the LLM has helped narrow down a topic from “cybersecurity” to potential avenues of research, including regulating bodies, frameworks for evaluating threats, and educational initiatives. These ideas are just suggestions; students still need to research, analyze, and synthesize appropriate sources to write their paper. It’s likely that a student would have arrived at the same place through some preliminary research and a quick Google search, but working with ChatGPT to brainstorm allows students to explore ideas and relationships that they may not have thought of before. Particularly in classes where essay prompts are more open-ended, thinking with an LLM can save students time and instructors from reading papers on the same topic from semester to semester. Note that there is no usable prose in any of the responses, nor did ChatGPT provide an argument. Students still have the task of conducting research, analyzing and synthesizing their sources, and constructing their own argument.

If you consider this use of AI as a thought partner for topic ideation and refinement to be acceptable in your class, clearly communicate that to students in your syllabus and in class. Add a line in your AI policy and model for students the kinds of prompts they can use to narrow down their research topics—you could even provide them with prompt templates. Ask students to cite their use of AI and include the chat thread in their citation to mitigate any confusion about how students used the LLM and the output they were given.

LLMs can assist with other pre-writing tasks, such as outlining, with varying degrees of depth and success. The following links show how ChatGPT 3.5, Google Gemini, and Microsoft Copilot each approach an outline for an essay on the reception history of Arthurian legend.

Like the topic refinement, each outline suggests avenues students could research to address particular points in the essay. It does not, however, provide any content that students could copy and paste into their essay. For students unfamiliar or uncomfortable with the conventions of academic writing, ideating over an outline to refine their thesis statement, main points, and sources can be a helpful starting point. Generating an outline can also help guide research decisions and can be used to help students understand academic writing conventions such as formulating coherent paragraphs, introductions and conclusions, transitions, and connecting back to a thesis statement.

The prompt used above (“Give me an outline for …”) is not engineered to create expert-level results, such as giving the LLM a persona and audience for the outline, nor does it contain any information about the student’s actual topic. The more context given, the more detailed and specific the output will be. Here and in the adjacent screenshot, we have provided a specific work for the LLM to incorporate into the outline and given more context for the assignment.

A screenshot of the beginning of a chat interaction with Microsoft Copilot. The prompt from the user is: "Generate an outline for a paper on how the Disney film The Sword in the Stone reinterpreted Arthurian legend to appeal to its contemporary audience. The paper should be 10 pages and is for a class on medievalism in pop culture." — A screenshot of the beginning of a chat interaction with Microsoft Copilot

Note how the model has provided a thesis statement: “Disney’s ‘The Sword in the Stone’ reinterprets Arthurian legend to resonate with contemporary audiences through animation, modern storytelling techniques, and character development.” This thesis uses a three-part structure and offers a high-level summary of the information to be presented rather than something more concrete and arguable. A very common element of persuasive writing, the rule of three is not necessarily a sign that text has been AI-generated. However, because of its ubiquity, an LLM might struggle to compose theses without it. Encourage students to think beyond the generated outline and supplement the output with their own thoughts, research, and style—a starting point, not a finished product. A revised, more specific thesis might look like this: “The Sword in the Stone’s modernization of the traditional Arthurian narrative appeals to the sociopolitical reimagining of Camelot in the 1960s.” This thesis is animated by the author’s research and original thinking, and clearly states the “so what” of the argument without forecasting the main points—such as animation, storytelling, and character development.

Editing and Revising Writing

Though used interchangeably, there are differences between editing and revising that are important to keep in mind, especially when it comes to using AI to do either of those tasks.

Editing deals with lower-order concerns, such as grammar, structure, and sentence-level mechanics, considering writing on a more local scale.
Revising (re + visere, literally to re-see) is the process of addressing the big picture: argument, use of evidence, overall structure and development, audience, tone, etc.

Another way of thinking about editing and revising is when they tend to happen. Revision, as part of the writing process, occurs between drafts, as a movement from one version to the next, refining ideas and structure. Editing, on the other hand, is most effective once you have a final product to polish any errors or inconsistencies.

LLMs aren’t great at maintaining this distinction. ChatGPT 3.5 emphasizes lower-order concerns in its description of editing, yet in practice, when asked to edit a paper, ChatGPT performed revision tasks in much the same way as when asked to revise.

If ChatGPT is “rephrasing sentences, restructuring paragraphs,” and “clarifying ambiguous points” for students, where exactly are we seeing their original tone and intent? Maybe that “awkward phrasing” is a stylistic choice on behalf of the author; if it is removed because it isn’t aligned with what the model’s algorithm anticipates—and if students take these edits uncritically, without the skills, expertise, and confidence in their own writing that practice fosters—how are they developing their own unique style and voice? Using LLMs in this way—flattening and conforming prose to a particular statistical model—has potentially far-reaching implications for the future of writing.

Let’s look at an example:

Original Paragraph:
Upon conversion to Christianity, one must add a new dimension to their conception of love: divine love. C.S. Lewis explores that new dimension in his works The Four Loves and Till We Have Faces. The Four Loves explains love in theory: the three natural, human loves, affection, friendship, eros, divine love, charity. Till We Have Faces is an exposition of the theory Lewis develops in The Four Loves. Till We Have Faces shows the true nature of human love, and emphasizes the need for divine redemption. Ultimately, The Four Loves and Till We Have Faces show how human love will inevitably fall short and become corrupted, and divine love needs to intervene to transform it.

[All excerpts are taken from essays the author wrote in 2011 as a junior at Valparaiso University to demonstrate how LLMs respond to student writing. You can see the full ChatGPT 3.5 thread here.]

Edited Version:
Upon embracing Christianity, individuals must expand their understanding of love to include divine love. C.S. Lewis delves into this new dimension in his works "The Four Loves" and "Till We Have Faces." "The Four Loves" lays out the theoretical framework of love, delineating affection, friendship, eros, and divine love (charity). "Till We Have Faces" serves as a narrative exploration of the concepts Lewis develops in "The Four Loves," revealing the essence of human love and stressing the necessity of divine intervention for redemption. Ultimately, both works illustrate how human love inevitably falters and becomes corrupted, requiring divine love to intervene and effect transformation.

Revised Version:
Upon conversion to Christianity, individuals must incorporate a new dimension into their understanding of love: divine love. C.S. Lewis explores this dimension in his works "The Four Loves" and "Till We Have Faces." "The Four Loves" lays out a theoretical framework of love, delineating affection, friendship, eros, and divine love (charity). "Till We Have Faces" serves as a narrative exploration of the concepts Lewis develops in "The Four Loves," revealing the essence of human love and stressing the necessity of divine intervention for redemption. Ultimately, both works illustrate how human love inevitably falters and becomes corrupted, requiring divine love to intervene and effect transformation.

—

Both the edited and revised versions of the original paragraph are quite similar, and both with significant divergences from the original. The edited and revised versions maintain the essence of the original paragraph, but make changes to the language that affect the meaning—swapping embracing for conversion, for example, and offering a “theoretical framework” and “narrative exploration” in place of the original’s simpler “theory” and “exposition.” The AI versions are slightly shorter, cleaning up grammar and sentence structure to convey the central idea more succinctly. The difference between edited and revised in this case is limited to word choice; the two are substantially different from the original in style and clarity, but minimally different from one another, as the line between editing and revising gets blurred.

A screenshot of the beginning of a chat interaction with Google Gemini. The prompt from the user is: "Edit this paper:" after which they have pasted the full text of an essay. — A screenshot of the beginning of a chat interaction with Google Gemini

Likewise, Microsoft Copilot and Google Gemini confuse editing and revision. Copilot offered a “revised version” with “structural changes” and new information to clarify certain points when asked to edit. Gemini did not edit the whole essay (see adjacent screenshot), but instead offered suggestions for improvement that focused on content, argument, and evidence—not lower-order concerns typically considered as editing. Its suggested introduction sophisticates the language of the original, adds new information, and removes the original hook. It is, perhaps, a better introduction. Writing this kind of introduction comes from years of experience and practice cultivating style, voice, and understanding of academic writing. It is not an accurate representation of this undergraduate’s writing.

In conjunction with in-class exercises that help students evaluate the changes made to their writing and incorporate revisions on a case-by-case basis, AI-edited writing is a helpful learning tool. But without guidance, LLMs aren’t effective editors. Providing a more specific prompt—asking ChatGPT to focus on grammar and clarity, for example, or for copyediting—retains the features of the original paragraph while cleaning up the grammar and sentence structure.

The allowance or not of AI-edited writing will depend on your learning goals and what features of writing you are teaching and assessing. While each LLM did substantially alter the language and style of the original writing, the ideas and arguments themselves are retained. If that is your goal for students’ writing, consider your assignment in phases: a fully student-written rough draft, and an AI-edited final. Submitted as a portfolio, with appropriate citations of AI use, an instructor can trace the student’s thinking and identify their ideas and argument from rough to final draft—using AI doesn’t impede this process. You may also want to include a cover letter or reflective memo as part of the assignment for students to explain their choices.

Course goals that center the writing process, including feedback and revision, may take AI editing off the table, and that’s absolutely okay. Be transparent with students about the purpose of assignments and how using AI will inhibit their ability to meet the goals of the course successfully. There are tell-tale signs of AI’s intervention, certain words and structures that are indicative of AI’s writing style, so its use will still be noticeable; citing and verifying chat threads may help mitigate unauthorized use.

It must be said that LLMs are trained in high-resource languages and do not include much linguistic diversity in their training data. Because of the influence of “standard” English—which second-language users are often taught in the context of TOEFL tests—in that data, AI detectors are more likely to flag the writing of non-native English speakers. The results of such detection softwares should be evaluated with caution and in dialogue with students. Certain linguistic and stylistic markers are not necessarily a sign of academically dishonest uses of AI.

AI for Feedback

Much has been written on AI’s potential for personalized and immediate feedback—both positive and negative (see also Dai et. al., 2023 and Escalante, Pack, and Barrett, 2023). AI-generated feedback should never replace interactions between students and instructors—providing feedback is one of the most important tasks of teaching and can improve students’ confidence and transference—nor should AI be used unchecked in grading. But it’s now possible (and relatively easy) to create custom GPTs that are trained on your assignments and rubrics, or to craft hyper-specific prompts to guide the kind and quality of feedback given. Even simpler prompts can be helpful as a first line of feedback for students to help them consider their work from an outside perspective.

A screenshot of the beginning of a chat interaction with Microsoft Copilot, highlighting the specific prompting style where the user has given the AI highly detailed instructions of well over a paragraph in length on how to interact with a student — A screenshot of the beginning of a chat interaction with Microsoft Copilot, highlighting the specific prompting style

Let’s look at prompts for feedback using Microsoft Copilot:

Specific prompt (see also adjacent screenshot)
General prompt
Hybrid prompt

Asking the LLM to act as a mentor and highlight both strengths and opportunities yields actionable feedback without revealing how exactly to enact it. For example, the questions to prompt deeper critical analysis would be meaningful additions to the short essay provided. Without more context for the assignment, however, its other suggestions for improvement are less useful. The goal of the assignment was not to provide an argument but a summary of the relevant literature and its length is not conducive to section headings. This is certainly a limitation of this more complex prompting; the feedback’s effectiveness is limited if you do not provide all the information needed in the prompt.

The more general prompt explains what the essay does well, primarily focusing on ideas and content in a positive tone. It does not, however, provide any guidance on how to improve the inputted text. The feedback is still helpful as a check on the clarity and correctness of the discussion and can help students identify gaps in their understanding. In order to get more effective feedback, adding “include suggestions for improvement” to the prompt returns actionable feedback to consider. The feedback references specific moments in the passage without doing any re-writing for students. Experiment with the wording of a few prompts, or develop a more systematic prompt for students to use in your course.

Incorporating feedback from AI as part of the writing process can develop AI literacy through prompt engineering while helping students define their writing style and voices for themselves by rejecting or incorporating feedback as they find relevant. As with editing and revising, reflective assignments that encourage students to articulate their choices within and against AI-generated suggestions and consider it alongside other forms of feedback—whether instructor, self, or peer—works to resist the commodification of writing and the ubiquity of AI-trained styles.

What AI-generated feedback lacks is the personal knowledge of our students’ interests, progress, and goals. Some of that can be accomplished by providing more context when prompting, but AI’s knowledge of our students and their writing is not yet able to surpass our own. Nor does AI know more about teaching the specific skills we want students to learn and demonstrate, or the specific disciplinary language and concepts of our courses. Instructor feedback is more important than ever for guiding students in developing their writing confidence and abilities.

Conclusion

No matter how you choose to incorporate AI for writing into your courses, keep the following in mind:

Be transparent with students about what is and is not acceptable and why.
Experiment with LLMs and provide students prompts; model how to use them successfully to meet the learning objectives.
Build in opportunities for students to develop AI literacy with prompt engineering and evaluation.
Plan to include metacognitive reflection activities to reinforce student learning over AI outputs.

¹A note about LLMs: All of the examples on this page were generated using the free versions of ChatGPT, Google Gemini, and Microsoft Copilot. As of April 26, 2024, ChatGPT’s free model is GPT-3.5, whereas the subscription model is GPT-4; Copilot’s free version uses GPT-4.0 Turbo, a more advanced version of the model that powers the paid version of ChatGPT; and Google Gemini is proprietary to Google. Each of the free models have limits on how much information they will process and output per interaction. As of May 2024, OpenAI released its newest LLM, GPT-4o, to its Plus subscribers. GPT-4o is available on a limited basis for a certain number of responses to free users.