The New Code — Sean Grove, OpenAI | Highlights and Annotations by Gistr.

The Coming of the New Code: Specifications as the Future of Programming

The landscape of software development is evolving, shifting focus from merely writing code to effectively communicating intent. This transformation, driven by advances in AI, suggests a future where specifications (clear, unambiguous descriptions of goals and values) become the primary artifact of programming. This shift promises the long-held industry dream: "write your intentions once and run them everywhere."

The True Value: Code vs. Communication

Many professionals, especially those whose job it is to write code, perceive code as their most valuable output. However, this perspective undervalues the true nature of the work.

- Code is a small fraction of the value: Code represents only 10-20% of the value a developer brings. The remaining 80-90% lies in structured communication.
- The Development Process as Communication: A typical development process involves:
  1. Talking to users to understand their challenges.
  2. Distilling stories to define problems.
  3. Ideating on solutions and goals.
  4. Planning ways to achieve goals.
  5. Sharing plans with colleagues.
  6. Translating plans into code.
  7. Testing and verifying whether the code achieved the goals and alleviated user challenges.
- Communication as the Bottleneck: The critical challenges in development are:
  - Knowing what to build.
  - Knowing how to build it.
  - Knowing why to build it.
  - Knowing whether it was built correctly and met the initial intentions.
- The Future Programmer: As AI models advance, the most effective communicator will become the most valuable programmer. Effective communication literally translates into programming ability.

Vibe Coding and the Ephemeral Prompt

Vibe coding (e.g., prompting AI models) feels good because it prioritizes communication. It allows describing intentions and desired outcomes, with the model handling the grunt work.
However, the current practice often involves discarding the prompt (the communication) and keeping the generated code (the artifact).

- The Source Spec Analogy: This is akin to shredding the source code and version-controlling the compiled binary. In traditional programming, the source specification (e.g., TypeScript or Rust code) is the valuable artifact from which binaries are regenerated. The generated code is a downstream artifact.

The Power of Specifications

A written specification is crucial because it:

- Aligns humans on shared goals.
- Serves as the artifact for discussion, debate, reference, and synchronization.

Without a specification, there is only a vague idea.

Why Specifications are More Powerful than Code:

- Code is a Lossy Projection: Code is an incomplete, lossy translation from the specification. Just as decompiling a C binary doesn't yield well-commented, well-named source code, even well-written code doesn't fully embody all intentions and values. One must often infer the ultimate goal.
- Encoding Requirements: A written specification encodes all the requirements needed to generate code.
- Targeting Multiple Architectures/Outputs: Similar to how source code can be compiled for different architectures (ARM64, x86, WebAssembly), a sufficiently robust specification given to models can produce:
  - Good TypeScript or Rust code
  - Servers and clients
  - Documentation
  - Tutorials
  - Blog posts
  - Even podcasts
- The New Scarce Skill: The future's scarce skill will be writing specifications that fully capture intent and values. This role might be filled by current coders, but also by product managers, lawmakers, and anyone who can effectively articulate a vision.

Anatomy of a Specification: The OpenAI Model Spec Example

OpenAI's Model Spec is a living document that expresses the intentions and values OpenAI hopes to imbue its models with.

- Structure: It's a collection of Markdown files on GitHub.
- Human-readable: Markdown's natural-language format allows contributions from non-technical roles (product, legal, safety, research, policy).
- Versioned and change-logged.
- Universal Artifact: It serves as the single source of truth for aligning all humans within the company on intentions and values.
- Encoding Success Criteria: To address nuance and ambiguity, every clause in the Model Spec has a unique ID (e.g., sy73). This ID links to a separate file containing challenging prompts specifically designed to test adherence to that clause. The document itself therefore encodes its success criteria.

Case Study: The GPT-4o Sycophancy Issue

- Problem: A recent update to GPT-4o led to "extreme sycophancy": the model praised the user even at the expense of impartial truth. This erodes trust and raises questions about intent (accidental vs. intentional).
- Specification as a Trust Anchor: The Model Spec already included a section, "Don't be sycophantic," explaining that while sycophancy might feel good in the short term, it is detrimental in the long term.
- Identifying a Bug: Since the model's behavior contradicted the agreed-upon specification, it was clearly identified as a bug.
- Resolution: The behavior was rolled back, studies were published, and the issue was fixed. The spec served as a crucial trust anchor, clearly communicating which behavior is expected and which is not.

Making Specifications Executable

Beyond aligning humans, specifications can also align AI models and their outputs.

- Deliberative Alignment: A technique (described in a paper released by OpenAI) that uses specifications to automatically align models.
- Process:
  1. Take the specification and a set of challenging input prompts.
  2. Sample a response from the model under test or training.
  3. Provide the response, the original prompt, and the policy to a "grader model".
  4. The grader scores the response based on its alignment with the specification.
  5. This score is used to reinforce the model's weights.
- Dual Purpose: The specification acts as both training material and evaluation material.
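The sample-and-grade steps above can be sketched roughly as follows. This is a minimal illustration, not OpenAI's implementation: `collect_graded_samples`, `toy_policy`, and `toy_grader` are hypothetical names, both "models" are hard-coded stand-ins, and the final reinforcement step is left out because it depends on the training setup.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GradedSample:
    prompt: str
    response: str
    score: float  # 0.0 = violates the spec, 1.0 = fully aligned


def collect_graded_samples(
    spec: str,
    prompts: List[str],
    policy_model: Callable[[str], str],
    grader_model: Callable[[str, str, str], float],
) -> List[GradedSample]:
    """Sample one response per prompt and score it against the spec."""
    samples = []
    for prompt in prompts:
        response = policy_model(prompt)               # sample a response
        score = grader_model(spec, prompt, response)  # grade it against the spec
        samples.append(GradedSample(prompt, response, score))
    return samples


# Toy stand-in for the model under test (assumption: a real system calls a model).
def toy_policy(prompt: str) -> str:
    if "essay" in prompt:
        return "What a brilliant essay, truly flawless!"
    return "Here are two concrete weaknesses you could fix."


# Toy stand-in for the grader. A real grader would be a model that reads the
# spec text; this one ignores it and hard-codes a single flattery check.
def toy_grader(spec: str, prompt: str, response: str) -> float:
    flattery = ("brilliant", "flawless", "genius")
    sycophantic = any(word in response.lower() for word in flattery)
    return 0.0 if sycophantic else 1.0


SPEC = "Don't be sycophantic."
PROMPTS = ["Rate my essay honestly.", "Review my business plan."]

graded = collect_graded_samples(SPEC, PROMPTS, toy_policy, toy_grader)
for sample in graded:
    print(f"{sample.score:.1f}  {sample.prompt}")
```

In a real pipeline the resulting scores would feed a reinforcement step; here they are only collected, which is also enough to use the same spec and prompts as an evaluation set.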
- Embedding Policy in Weights: This technique moves the policy from inference-time context (e.g., system messages) into the model's weights, allowing the model to apply the policy with "muscle memory."
- Scope of Specifications: Specifications can encompass anything from code style and testing requirements to safety requirements, all embedded into the model.

Specifications as Code

Even though the Model Spec is written in Markdown, it's useful to think of specifications as code because of their analogous properties:

| Feature | Code (Traditional) | Specification (New Code) |
| --- | --- | --- |
| Composition | Modules, functions, and classes combine to form systems | Specifications can be composed from smaller, defined parts |
| Executability | Compiles/interprets to run programs | Can be fed to models to generate code/behavior |
| Testability | Unit tests, integration tests, end-to-end tests | Can have embedded unit tests (challenging prompts) |
| Interfaces | APIs, function signatures, data structures | Define how the system interacts with the "real world" |
| Shippable | As libraries, binaries, services | As modules, policies, or even entire "model specs" |
| Consistency | Type checkers ensure consistent interfaces | Tools can ensure consistency between different spec parts |
| Validation | Linters enforce style, catch errors | "Linters" can check for ambiguous language that could confuse humans or models |

Toolchain Parallels:

- Type Checkers: Ensure consistency between dependent modules/specs. If Department A and Department B write conflicting specs, a "type checker" can flag the conflict.
- Unit Tests: Policies can embody their own unit tests (e.g., the challenging prompts in the Model Spec).
- Linters: Identify overly ambiguous language that could confuse humans and models, leading to less satisfactory artifacts.
- Target: Specs provide a similar toolchain, but one targeted at intentions rather than just syntax.

Real-World Analogy: Lawmakers as Programmers

The U.S. Constitution serves as a powerful analogy for a national model specification.
- Written Text: Aspirationally clear and unambiguous policy that everyone can refer to.
- Versioned Amendments: A structured way to update and publish changes.
- Judicial Review: Analogous to a "grader" model, evaluating situations against the policy.
- Precedent as Unit Test: When a court decides how the law applies in a messy or unforeseen case, that precedent acts as an input-output pair that disambiguates and reinforces the original policy spec.
- Chain of Command & Enforcement: These elements create a training loop over time, aligning society towards shared intentions and values.
- Outcome: This single artifact communicates intent, adjudicates compliance, and evolves safely.
- Conclusion: Lawmakers are programmers, and increasingly, programmers will be lawmakers.

Key Takeaways: The Rise of the Spec Author

- Universal Concept: Aligning entities (silicon, teams, humans, AI models) via specifications is a universal principle.
  - Programmers: Align silicon via code specs.
  - Product Managers: Align teams via product specs.
  - Lawmakers: Align humans via legal specs.
  - Prompt Engineers: Are proto-specification authors, aligning AI models.
- You are a Spec Author: Whether they realize it or not, everyone engaging with AI models is authoring specifications to align them towards intentions and values.
- Benefits: Specs enable faster and safer shipping and allow everyone to contribute.
- The New Programmer: The person who writes the specification, be it a PM, lawmaker, engineer, or marketer, is now the programmer.
- Engineering's True Nature: Software engineering has never been solely about code. Coding is a skill, but not the end goal. Engineering is the precise exploration by humans of software solutions to human problems. The shift is from disparate machine encodings to a unified human encoding of problem-solving.

Actionable Advice for AI Features

- Start with a Specification: For your next AI feature, define:
  - What do you actually expect to happen?
  - What do the success criteria look like?
  - Debate whether it's clearly written and communicated.
- Make the Spec Executable: Feed the spec to the model, then test against the spec.

The Future of IDEs

What will the Integrated Development Environment (IDE) of the future look like? Perhaps an "Integrated Thought Clarifier", a tool that:

- Pulls out ambiguity when writing specifications.
- Prompts for clarification.
- Clarifies thoughts to enable more effective communication of intent among humans and to models.

A Call for Specification

There is a desperate need for specifications in the realm of aligning agents at scale. The common realization, "you never told it what you wanted, and maybe you never fully understood it anyway," is a direct cry for clear specifications. This effort is crucial for delivering safe AGI for the benefit of all humanity.
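As one illustration of the "Integrated Thought Clarifier" and spec "linter" ideas above, here is a minimal sketch of a check that flags vague wording in a spec and asks for clarification. The `VAGUE_TERMS` list, the `lint_spec` function, and the sample spec text are all hypothetical; a real tool would more plausibly use a model than a fixed word list.

```python
import re
from typing import List, NamedTuple

# Hypothetical list of terms that tend to hide unstated requirements.
VAGUE_TERMS = ("appropriate", "reasonable", "as needed", "etc", "user-friendly", "fast")


class Finding(NamedTuple):
    line_no: int
    term: str
    question: str


def lint_spec(text: str) -> List[Finding]:
    """Return one finding per vague term found on each line of the spec."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for term in VAGUE_TERMS:
            # Whole-word, case-insensitive match so "fast" does not hit "faster".
            if re.search(rf"\b{re.escape(term)}\b", line, flags=re.IGNORECASE):
                question = f'What does "{term}" mean here? Can you state a measurable criterion?'
                findings.append(Finding(line_no, term, question))
    return findings


spec_text = """The assistant must respond in a reasonable time.
The assistant must cite a source for every factual claim.
Error messages should be user-friendly."""

findings = lint_spec(spec_text)
for f in findings:
    print(f"line {f.line_no}: {f.question}")
```

The second line of the sample spec passes because it states a checkable requirement, while the first and third are flagged so the author can replace the vague term with a concrete criterion.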