---
marp: true
html: true
theme: legal-appellate
size: 16:9
paginate: true
title: "How Engineers Use AI"
description: "Draft OBA 2026 slides from notes."
author: "Sam Carlton"
footer: "[oba-2026.samcarlton.com](https://oba-2026.samcarlton.com)"
---

<!-- _class: title -->

# How Engineers Use AI

<!--
Open with the practical frame: this is not an AI hype talk, it is a workflow talk.
The question is how to use AI while keeping professional judgment, source review, and accountability intact.
-->

---

<!-- _class: section center-heading -->

# Intro

<!--
Use this as the reset slide before introducing yourself.
Give the audience the promise: a developer's view of how to use AI without surrendering verification.
-->

---

<!-- _class: center-heading -->

## Hey, I'm Sam. 

<!--
- Sam Carlton
- Tulsa native
- Been building and consulting on software for over 15 years
- VP of Techlahoma
- Built products used by NFL, Virgin Brands, Samsung, Aston Martin
- Maintain many open source projects that help thousands of developers
- Love grilling, Wagyu, and GrandMA lighting consoles
-->

---

<!-- _class: quote -->

## Disclaimer

> AI facts are perishable. Model behavior, vendor terms, court expectations, and bar guidance can change faster than a conference Wi-Fi password.

<!--
Open with: I heard a rumor there's a handful of lawyers somewhere in the room.
Use this as the currentness disclaimer.
The joke keeps the room relaxed, but the point is serious: today's AI facts are perishable, so the durable takeaway is the workflow.
-->

---

## Show of hands

- Who has used ChatGPT or Claude?
- Who has used Claude Code or Codex?
- Who has used the terminal or command line for AI?
- Who has used Cursor or VS Code for AI?

<!--

- AI is taking over Software
- Finance and legal are next
- Kate paid my speaking fee
- Finance will just have to wait a year, y'all

-->

---

## Agents

An Agent is any AI model that takes action rather than only reading and analyzing.

**Agents can**

- Review and send a contract
- Handle a client intake conversation

**Agents are not**

- Plain chatbots that answer legal questions
- Research mode on ChatGPT or Claude

<!--
- My best-effort definition
- The lines can be fuzzy
-->

---

## How Engineers Use AI

The basics

- Verifications and algebra
- Play dumb
- Code smells
- Prompt smells
- Lean into existing summarizing frameworks
- Smart Brevity Core 4

<!--
The basics 
-->

---

<!-- _class: center-heading -->

## Verifications and algebra

<!--
- Hold back things you already know
- Anchor with actual numbers, such as counts
- The AI 6 problem
-->

---

<!-- _class: center-heading -->

## Play dumb

<!--
There's so much power in playing dumb.
-->

---

<!-- _class: center-heading -->

## Code smells

<!--
Explain code smells
-->

---

<!-- _class: center-heading -->

## Prompt smells

<!--
The more you use these tools, the better your feel for when something is off.

Lean into that.

Interrogate the Agent until you feel confident.

Mark Manson's AI Brain video.
-->

---

## Summaries are your friend

- Smart Brevity Core 4
- BLUF - Bottom Line Up Front
- SCQA / Pyramid Principle

<!--
Lean into existing summarizing frameworks. 

- Summaries can give you AI smells early
- Worth reading or throw away
-->

---

<!-- _class: section center-heading -->

# How an Engineer mitigates Hallucination

<!--
- AI will always lie to you eventually
- There are tools
-->

---

## Mitigations

- Always check the source materials (RTFS)
- Give it the clear consequences of faulty data. The more serious, the better.

<!--
- Treat AI as a conduit and never an oracle
- Showing you what sources to read is its superpower
-->

---

<!-- _class: center-heading -->

## The burger example

<!--
- I will be poisoned
-->

---

## Consequences of faulty data

- I will lose my law license
- I could be disbarred

<!--
- There are real stakes; give them to the AI
-->

---

<!-- _class: claim center-heading -->

## AI will still lie to you

<!--
Do not let any technique lull you into skipping verification.
-->

---

<!-- _class: section center-heading -->

# Live demo: OSCN on request

<!--
- Explain Live Demos and that they always fail
- Get 3 OSCN case details from the audience to look up
- Review the accuracy with the lawyers
-->

---

<!-- _class: section center-heading -->

# How AI has changed Software work

<!--
- We can do way more, way faster
- It's easy to accidentally create useless extra work
- It's easy to take longer to get to the same result
- Spidering
- Lean into one thing at a time
-->

---

## Comprehension Debt

The gap between what AI can generate and what a human can explain, verify, and maintain.

<https://addyosmani.com/blog/comprehension-debt/>

<!--
- Lots of code (or docs) that no human has read, much less understood
-->

---

## Cognitive Surrender

Letting the Agent's judgment replace your own judgment, especially when the work still needs review.

<https://addyosmani.com/blog/cognitive-surrender/>

<!--
- The complete surrender to the Agent's judgment
- You stop verifying its outputs
- You stop challenging smells
-->

---

<!-- _class: section center-heading -->

# The Context Window

<!--
Show of hands
- AI has a senior moment
-->

---

<!-- _class: section center-heading left-bullets -->

# Context Window Habits

- Start with fresh threads as often as possible
- Keep files, documents, and context you need to reuse in a shared file or project space
- Don't trust Agents to remember and use what you talked about in that thread

<!--
Tie this back to the hard-drive point: reusable context belongs somewhere outside the chat transcript.
-->

---

<!-- _class: section center-heading -->

# Benchmark Rankings

<!--
Who's familiar with AI Benchmarks?

Who's familiar with Legal AI Benchmarks?
-->

---

<!-- _class: appendix -->

## Benchmark Rankings

- Don't use benchmarks as God's word.
- They're just a signal to help you use your own discernment.

<!--
- Don't use benchmarks as God's word.
- They're just a signal to help you use your own discernment.
- Balance against practitioners' opinions
- Sometimes it's better to keep using the old tool for a few months
-->

---

<!-- _class: appendix -->

## Model Benchmarks

- [Vals AI LegalBench leaderboard](https://www.vals.ai/benchmarks/legal_bench) - general LLM rankings on LegalBench; updated May 28, 2026. It lists 111 models and current top scores.
- [Vals AI benchmarks index](https://www.vals.ai/benchmarks) - broader benchmark directory; LegalBench is under the Legal section.
- [MLEB leaderboard](https://isaacus.com/mleb) - legal embedding / retrieval model rankings.
- [LawBench leaderboard](https://lawbench.opencompass.org.cn/leaderboard) - Chinese-law legal LLM rankings.

<!--
Group these as model or retrieval-model signals.
Vals LegalBench is useful for broad model signal.
MLEB matters when the workflow depends on retrieval, search, or embeddings rather than chat output.
LawBench is jurisdiction-specific, so do not overgeneralize it to U.S. legal work.
-->

---

<!-- _class: appendix -->

## Tool / Harness Benchmarks

- [LegalBenchmarks.ai leaderboard](https://www.legalbenchmarks.ai/leaderboard) - practical legal-task rankings, including contract drafting, info extraction, legal research analysis, contract review, and translation.
- [Harvey LAB initial results](https://www.harvey.ai/blog/legal-agent-benchmark-initial-results) - legal-agent benchmark results; not a neutral live leaderboard, but useful for agent/workflow rankings.
- [Harvey BigLaw Bench](https://www.harvey.ai/blog/introducing-biglaw-bench) - older vendor-published legal work-product rankings.
- [Vals VLAIR legal research report](https://www.vals.ai/industry-reports/vlair-10-14-25) - legal research product comparison against a lawyer baseline; report-style rankings, not a live leaderboard.

<!--
Group these as tools, harnesses, workflows, and product-style comparisons.
Use them as references, not as a shopping list.
Remind the audience to ask what task was measured, what data was used, and whether the benchmark resembles their real work.
-->

---

<!-- _class: title -->

# Questions

<p class="end-link"><a href="https://oba-2026.samcarlton.com">oba-2026.samcarlton.com</a></p>

<!--
Invite questions on workflow, verification, and risk boundaries.
If questions are slow, ask: "Where would AI output be hardest for you to verify in your current practice?"
Point everyone back to the deck URL for links.
-->

---

<!-- _class: resources -->

<div class="resources-layout">
  <div class="resources-qr-panel">
    <div class="resources-qr-frame">
      <img src="assets/ai-for-lawyers-article-qr.svg" alt="QR code linking to the AI for Lawyers article" />
    </div>
    <p class="resources-qr-label">Article + slides</p>
    <p class="resources-qr-url">samcarlton.com/ai-for-lawyers-engineering-world</p>
  </div>
  <div class="resources-link-panel">
    <h2>Resources</h2>
    <ul>
      <li><a href="https://www.skills.sh/audits">Skills.sh audits</a><span>Combined Gen Agent Trust Hub, Socket, and Snyk signals for public skills.</span></li>
      <li><a href="https://platform.claude.com/docs/en/about-claude/use-case-guides/legal-summarization">Anthropic legal summarization guide</a><span>Claude pattern for legal summaries, metadata extraction, citations, and review criteria.</span></li>
      <li><a href="https://openai.com/academy/skills/">OpenAI Academy: Using skills</a><span>Reusable workflow examples, including contract review, policy Q&amp;A, and legal memo skills.</span></li>
      <li><a href="https://www.harvey.ai/blog/5-workflows-ip-patent-litigation">Harvey IP litigation workflows</a><span>Five workflow patterns for patent and IP litigation teams.</span></li>
    </ul>
  </div>
</div>

<!--
Frame these as starting points, not guarantees.
The practical rule: discover through public directories, but inspect source, pin versions, and sandbox before trusting a skill in a real client workspace.
-->

---

<!-- _class: section center-heading left-bullets -->

# Extra

Bleeding Edge to watch
- Subagents
- Adversarial Harnesses
- Claude Dynamic Workflows
- AI Fatigue

<!--

-->

---

<!-- _class: section center-heading -->

# The power of a Hard Drive

Welcome to 1995

<!--
The joke is that local files still matter.
Having the right source materials organized and available can be more valuable than chasing a new model.
Tie this back to RTFS: AI gets safer when the source set is concrete, available, and reviewable.
-->
