Most developers try Claude Code, run a few prompts, and form an opinion. But the engineers getting the most out of it are doing something fundamentally different: they're treating it as a development environment, not just a chatbot. The gap between casual usage and expert-level productivity is enormous, and it comes down to a handful of techniques that change how the AI agent writes, verifies, and ships code.
We dug into workflows from Anthropic's own engineering team, power users, and our own experience building with Claude Code to distill the practices that actually matter. Whether you're a senior engineer evaluating AI tooling or a team lead looking to scale these workflows across your organization, here's what separates the advanced users from everyone else.
Mind Your Context Window
Claude Code's context window is its working memory. Every message you send, every file it reads, every command output it processes... all of it accumulates in a shared 200k token budget. And performance degrades as context fills up. A Claude session at 90% context usage isn't just slower. It's dumber. Important instructions get buried, earlier context gets lost in the middle, and the model starts making mistakes it wouldn't make with a clean window.
The /context command gives you visibility into where your tokens are going. Run it at any point in a session and you'll get a breakdown of usage by category.
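The exact layout varies by version, but the report looks roughly like this (an illustrative sketch, not verbatim output):

```
Context usage: 86k / 200k tokens (43%)

  System prompt     3.2k tokens
  System tools     12.4k tokens
  MCP tools        24.6k tokens
  Memory files      2.8k tokens   (CLAUDE.md)
  Messages         43.0k tokens
  Free space        114k tokens
```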
This matters more than most people realize. MCP servers consume tokens just by being available: their tool definitions are loaded into context on every request, whether you use them or not. A few MCP servers can eat 30% or more of your window before you've typed a single prompt. Use /context to audit this regularly, and disable any MCP servers you're not actively using in the current session.
The companion command is /clear, and it's one of the most important habits to build. When you run /clear, it wipes the conversational history from the context window entirely. Claude will re-read your CLAUDE.md and still have access to your files, but all the back-and-forth, the failed attempts, the dead-end debugging is gone.
The best pattern is to /clear between distinct tasks. Finished implementing a feature and ready to fix a bug? Clear. Just merged a PR and moving to a different part of the codebase? Clear. Got stuck in a loop where Claude keeps making the same mistake? Definitely clear! Stale context full of failed approaches actively hurts the next attempt. Think of it as the equivalent of closing a bunch of browser tabs before starting new work.
For long sessions where you want to preserve some context but not all of it, /compact offers a middle ground. It summarizes the conversation, reducing token count while preserving key decisions and code context. You can even guide what gets kept:
```
/compact Focus on the authentication implementation
```

Claude also auto-compacts when it hits roughly 95% of context capacity, but by that point you've already been operating in degraded territory. Manual compaction at 70-80% gives you better results.
The bottom line: treat context like a finite resource because it is one. Monitor it with /context, reset it aggressively with /clear, and you'll get consistently better output than someone who lets sessions run until the model auto-compacts.
Give Claude a Way to Verify Its Work
If there's one principle that defines advanced Claude Code usage, it's this: always give the agent a feedback loop.
Boris Cherny, the creator of Claude Code, has been vocal about this. He argues that giving Claude a way to verify its own output improves the quality of the final result by two to three times. The pattern is straightforward. Instead of asking Claude to write code and hoping it works, you tell it to write tests alongside the implementation, or you give it a way to check its work through browser automation, linting, or build commands.
This mirrors test-driven development, and it works particularly well with AI agents. Here's the workflow Anthropic recommends:
- Ask Claude to write tests from your input/output specifications (emphasize no mocking of code that doesn't exist yet)
- Confirm the tests fail without an implementation
- Commit the tests
- Ask Claude to write code that passes all tests without modifying the test suite
- Deploy a subagent to verify the implementation doesn't overfit to the tests
- Commit the working code
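Assuming a Python project driven from the command line, the loop above can be sketched with headless prompts (the spec path, prompts, and commit messages are illustrative):

```shell
# Steps 1-3: have Claude write tests from the spec, confirm they fail, commit
claude -p "Write pytest tests covering the input/output spec in docs/spec.md. \
Don't mock code that doesn't exist yet, and don't write any implementation."
pytest || git commit -am "test: add failing spec tests"  # commit only if they fail

# Steps 4-6: implement against the frozen suite, verify, commit
claude -p "Write code that makes every test under tests/ pass. \
Do not modify the test files."
pytest && git commit -am "feat: implementation passing spec tests"
```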
The verification mechanism depends on your domain. For backend code, it might be running pytest or bun test. For frontend work, Claude can use browser automation tools like Puppeteer, exposed through MCP servers, to take screenshots, navigate the UI, and iterate until things look right. In mobile development, iOS and Android simulator MCPs provide similar feedback loops.
Even if you don't write tests, you can embed verification into your workflow by adding a simple instruction to your CLAUDE.md file: "Before completing any task, describe how you would verify the work." This forces the agent to think about correctness before it stops.
Write a CLAUDE.md That Actually Works
Unless you're using the --continue or --resume flags, every Claude Code session starts from zero. The model has no memory of previous conversations, no context about your codebase, and no idea what conventions your team follows. The CLAUDE.md file is the single mechanism that bridges this gap, and most people are doing it wrong.
The most common mistake is bloating the file. Cherny's own CLAUDE.md is about 2,500 tokens. That's roughly one page of text. If yours is significantly longer, Claude will start ignoring instructions because important rules get buried in noise. (To automate token-bloat checks in CI, check out our GitHub Action, Token Guard.)
Here's what should be in it:
- Tech stack and project structure at a glance
- Build, test, and lint commands (the exact bash commands Claude should run)
- Code style conventions your team follows
- Things Claude should not do (this is the most important section)
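A minimal sketch of what that looks like in practice (the stack, commands, and rules here are placeholders for your own):

```markdown
# Project: acme-api (TypeScript, Express, Postgres)

## Commands
- Build: `npm run build`
- Test: `npm test` (Jest; single file: `npm test -- path/to/file`)
- Lint: `npm run lint`

## Conventions
- Use named exports; no default exports
- All DB access goes through `src/db/repository.ts`

## Do not
- Do not edit files in `src/generated/`
- Do not add new dependencies without asking
- Never commit directly to `main`
```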
For each line in the file, ask: "Would removing this cause Claude to make mistakes?" If the answer is no, cut it. If Claude already does something correctly without being told, the instruction is wasting space.
The real power move is treating CLAUDE.md as a living document that your entire team maintains. At Anthropic, whenever someone sees Claude make a mistake during a PR review, they don't just fix the code. They add a rule to CLAUDE.md so it never happens again. Every mistake becomes a rule. The longer a team works this way, the smarter the agent gets in that specific codebase.
For full-stack applications, consider using separate CLAUDE.md files for each service or directory. The frontend can have its own conventions, the backend its own. Claude loads files from the current working directory and its parents, so this hierarchy works naturally.
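For example, a full-stack repo might be laid out like this (a hypothetical structure):

```
repo/
├── CLAUDE.md          # shared conventions, top-level commands
├── frontend/
│   └── CLAUDE.md      # component patterns, styling rules
└── backend/
    └── CLAUDE.md      # API conventions, migration commands
```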
You can generate a starting point by running /init, which analyzes your codebase to detect build systems, test frameworks, and code patterns. Then refine it over time based on what Claude gets wrong.
Resume Work Without Losing Context
One of Claude Code's most underused features is its session management system. Every conversation is automatically saved locally with full message history, tool usage, and results. When you need to step away, you don't lose your work. Claude Code provides two flags for resuming previous sessions:
--continue for quick access to your most recent session:
```shell
# Resume the most recent conversation in the current directory
claude --continue

# Continue with a specific prompt
claude --continue "now add unit tests for the functions we just wrote"

# Continue in non-interactive mode (for scripts)
claude --continue --print "show me our progress"
```

Use --continue when you stepped away for lunch, your terminal crashed, or you just want to pick up where you left off. It automatically loads the most recent session from your current directory with no prompts or selection required.
--resume for browsing and selecting specific sessions:
```shell
# Open the interactive session picker
claude --resume

# Resume a specific session by ID
claude --resume abc123def456

# Resume with a prompt
claude --resume session-id "continue implementing the authentication flow"
```

When you run claude --resume without arguments, you get an interactive picker showing:
- Session summary (or initial prompt)
- Time elapsed and message count
- Git branch (if applicable)
The picker includes keyboard shortcuts that make navigation fast:
- P: Preview a session before resuming
- R: Rename a session
- /: Search across all sessions
Inside a session, you can also use /resume to switch to a different conversation without exiting Claude Code. The picker shows sessions from the same git repository, including worktrees.
Name your sessions for easier retrieval:
```
/rename payment-integration
```

This is a best practice when working on multiple tasks. Finding "payment-integration" later is much easier than scrolling through sessions named "explain this function."
Sessions are stored per project directory in ~/.claude/projects/. This means you can even run meta-analysis on session history if you want to identify common error patterns or improve your CLAUDE.md over time.
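A rough sketch of that kind of meta-analysis (the on-disk JSONL format is internal and undocumented, so treat the field names here as assumptions to verify against your own files):

```shell
# Count error results across all saved sessions, most frequent first
jq -r 'select(.type? == "tool_result" and .is_error? == true)
       | .tool_name? // "unknown"' \
  ~/.claude/projects/*/*.jsonl 2>/dev/null | sort | uniq -c | sort -rn
```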
One advanced tip: you can capture session IDs programmatically for scripted workflows:
```shell
# Capture session ID from JSON output
session_id=$(claude -p "Start code review" --output-format json | jq -r '.session_id')

# Resume that session later with new prompts
claude --resume "$session_id" -p "Check for security issues"
claude --resume "$session_id" -p "Generate summary"
```

The key insight is that --continue is for speed (no decisions required), while --resume is for precision (you choose exactly which session). Both preserve your full context, including messages, tool state, and configuration from the original session.
Automate Quality with Hooks
Hooks are one of Claude Code's most powerful and most underused features. They let you run custom commands at specific points in the agent's lifecycle: before a tool executes, after it completes, when the user submits a prompt, or when Claude finishes responding.
The most practical hook is the Stop hook, which runs whenever Claude finishes its response and is waiting for your next input. Here's an example that automatically checks for TypeScript errors:
```typescript
// .claude/hooks/on-stop.ts
import type { HookInput } from "@anthropic-ai/claude-code";

const input: HookInput = JSON.parse(await Bun.stdin.text());

// Check if files were changed
const diff = Bun.spawn(["git", "diff", "--name-only"]);
const changed = await new Response(diff.stdout).text();
if (!changed.trim()) process.exit(0); // No changes, nothing to do

// Run type check
const typeCheck = Bun.spawnSync(["bun", "type-check"]);
if (typeCheck.exitCode !== 0) {
  // Send errors back to Claude for fixing
  console.log(JSON.stringify({
    decision: "block",
    reason: `TypeScript errors found:\n${typeCheck.stderr.toString()}`,
  }));
  process.exit(0);
}

// No errors: let Claude stop normally
process.exit(0);
```

Configure it in .claude/settings.json:
```json
{
  "hooks": {
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bun run .claude/hooks/on-stop.ts"
          }
        ]
      }
    ]
  }
}
```

This creates a self-correcting loop: Claude writes code, the hook catches TypeScript errors and sends them back, Claude fixes them, and on the next stop the hook finds nothing to report and lets the turn end. Two passes, zero manual intervention.
Other high-value hooks include:
- PostToolUse hooks that auto-format files after edits (run Prettier on any modified .ts file)
- PreToolUse hooks that block writes to production config files or sensitive paths
- UserPromptSubmit hooks that scan for accidentally pasted API keys or credentials
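As one example, a PostToolUse formatter could look like this in .claude/settings.json (a sketch: the matcher syntax follows the hooks docs, and the stdin field `.tool_input.file_path` is an assumption to verify against your version):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path // empty' | xargs npx prettier --write 2>/dev/null || true"
          }
        ]
      }
    ]
  }
}
```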
The settings file can be checked into your repo as .claude/settings.json, giving your entire team the same automated quality gates. The local version (~/.claude/settings.json) stays personal.
Plan First, Execute After
Advanced users rarely jump straight into auto-accept mode. The workflow that consistently produces the best results is: plan first, iterate on the plan, then execute.
Claude Code's Plan mode is designed for exactly this. When you switch into it, Claude analyzes your request and proposes a detailed implementation approach without modifying any files. You go back and forth, refine the plan, and only start execution once you're confident the approach is sound.
Cherny describes this as his default workflow for any task beyond a small file change. He switches to Plan mode, describes what he wants, iterates on the plan until he's satisfied, and then kicks off the implementation in auto-accept edits mode. This front-loads the thinking and dramatically reduces the "drift" that happens when Claude starts implementing before it fully understands the task.
A practical tip: save your plans. If Claude generates a good implementation plan, ask it to write that plan as a GitHub issue or a markdown file. If the code iteration diverges from the plan later, you have a reference point to course-correct rather than starting over.
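For instance, after Plan mode produces something you like (the file path is illustrative, and gh issue create assumes an authenticated GitHub CLI):

```shell
# In the session: "Write the plan we agreed on to docs/plans/auth-rework.md"
# Then track it where the team can see it:
gh issue create --title "Plan: auth rework" --body-file docs/plans/auth-rework.md
```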
Use Docker Sandboxes
Docker Desktop 4.50 introduced Docker Sandboxes, an experimental feature designed specifically for running AI coding agents securely. When you run docker sandbox run claude in a project directory, Docker creates an isolated container with your workspace mounted at the same absolute path.
The setup handles several things automatically. Your git user.name and user.email are injected so commits made by the agent are attributed to you. Credentials persist in a Docker volume across sessions. And critically, Claude Code launches with --dangerously-skip-permissions enabled by default, but inside the sandbox, that's actually safe.
This matters because the container is isolated from your host system. Even if Claude runs a destructive command, the blast radius is contained to the sandbox. Your host filesystem, network, and Docker daemon remain untouched. The workspace directory syncs between host and sandbox, so file paths in error messages match your local environment and changes appear immediately on both sides.
Docker enforces one sandbox per workspace. Running docker sandbox run claude in the same directory reuses the existing container, which means installed packages and configuration persist between sessions. If you need to change a sandbox's configuration, you remove and recreate it.
This is particularly useful for:
- Autonomous workflows where you want Claude to work without permission prompts
- Parallel execution where multiple Claude instances work on different features simultaneously
- Low-risk tasks like linting fixes, boilerplate generation, or documentation updates
Tools like TSK take this further by letting you delegate development tasks to multiple AI agents running in sandboxed Docker environments in parallel. Each agent returns a git branch for human review. It's the "junior developer team" model, with guardrails.
Note that Docker Sandboxes requires Docker Desktop and is currently macOS and Windows only (Linux users can use legacy container-based sandboxes). Since it's experimental, features may change, so check the docs for the latest.
Manage Permissions Instead of Skipping Them
The --dangerously-skip-permissions flag is tempting. No more interruptions, no more clicking "allow" on every file edit or bash command. But there's a better approach that gives you the speed of auto-accept with actual guardrails: maintaining explicit permission lists.
Claude Code's permission system works at two levels. User-level settings (~/.claude/settings.json) apply across all projects. Local settings (.claude/settings.json in your repo) apply to a specific codebase and can be shared with your team. Both support allowedTools and deniedTools arrays that control what Claude can do without asking.
Approve the Commands You Actually Use
Most Claude Code sessions involve the same handful of operations: reading files, listing directories, running your test suite, checking types. Instead of approving these one at a time or skipping permissions entirely, add them to your allowed list like these examples:
```jsonc
"permissions": {
  "allow": [
    "Read(*)",
    "Search(*)",
    "Edit(*)",
    "Write(*)",
    "Bash(*)",
    // or, more granular:
    "Bash(ls:*)",
    "Bash(cat:*)",
    "Bash(grep:*)",
    // Node.js
    "Bash(npm run:*)",
    "Bash(npm test:*)",
    "Bash(npm install:*)",
    // Python
    "Bash(python:*)",
    "Bash(pip install:*)",
    "Bash(uv:*)",
    "Bash(pytest:*)",
    // Git
    "Bash(git status)",
    "Bash(git diff:*)",
    "Bash(git log:*)",
    "Bash(git branch:*)",
    // Linters / Formatters
    "Bash(eslint:*)",
    "Bash(ruff:*)"
  ]
}
```

The wildcard syntax (*) lets you approve command prefixes. Bash(npm run:*) allows any npm script without approving arbitrary npm commands. Bash(git diff:*) lets Claude inspect changes without giving it push access.
Deny Commands That Can Cost You
Here's where the denied list becomes essential. Even inside a Docker sandbox, some commands can reach outside your container and cause real damage. Cloud CLI tools pose the biggest risk:
```jsonc
"permissions": {
  "deny": [
    // Cloud CLIs
    "Bash(aws:*)",
    "Bash(gcloud:*)",
    "Bash(az:*)",
    "Bash(terraform:*)",
    "Bash(flyctl:*)",
    "Bash(heroku:*)",
    "Bash(vercel:*)",
    // Docker + Kubernetes
    "Bash(docker push:*)",
    "Bash(docker login:*)",
    "Bash(kubectl:*)",
    // HTTP destructive requests
    "Bash(curl -X DELETE:*)",
    // Git operations
    "Bash(gh release:*)",
    "Bash(gh workflow run:*)",
    // Package publishing
    "Bash(npm publish:*)",
    "Bash(pip upload:*)",
    // SSH + SCP + RSYNC
    "Bash(ssh:*)",
    "Bash(scp:*)",
    "Bash(rsync:*)"
  ]
}
```

Note: your settings.json must be valid JSON, so strip the comments before copying this block, or Claude Code will throw a parse error on startup.
This list blocks cloud provisioning (aws, gcloud, az, terraform), container orchestration (kubectl), remote execution (ssh, scp), deployment tools (flyctl, heroku, vercel), and package publishing (npm publish, pip upload). Each of these can spin up resources, modify infrastructure, or push code to production... all things you want a human to approve explicitly.
Note that Claude Code may still attempt one of these commands on its own, but you'll see a denial like this in the transcript, confirming the guardrail is working:

```
Bash(aws ec2 ls)
⎿ Error: Permission to use Bash with command aws ec2 ls has been denied.
```

Why This Beats Skipping Permissions
The combination of Docker sandboxes and explicit permission lists gives you something --dangerously-skip-permissions can't: predictable sessions. You know exactly what Claude can and can't do. Your team shares the same guardrails through the local settings file. And when Claude does try something outside the approved list, you get a chance to evaluate it rather than having it execute silently.
For teams, this pattern also creates an audit trail. The settings file is version-controlled, so you can see when someone added a new allowed command and why. It's the difference between "we trust the sandbox" and "we trust the sandbox, and also we've explicitly defined the blast radius."
Start with a minimal allowed list and expand it as you encounter friction. Every time Claude asks for permission on something routine, add it. Every time you think "that command could cause problems," add it to the denied list. Over time, you'll build a permission profile that matches exactly how your team uses Claude Code.
Use MCP to Turn Claude into an Orchestrator
Claude Code isn't just a code editor. Through the Model Context Protocol (MCP), it becomes an orchestrator that connects to your entire tool ecosystem.
MCP servers give Claude access to external tools through a standardized protocol. Connect it to Slack, and Claude can search channels, post updates, and respond to messages. Connect it to Notion, and it can create databases, populate records, and understand your project documentation. Connect it to Sentry, and it can investigate production errors. BigQuery, GitHub, Jira, your internal APIs; if there's an MCP server for it, Claude can use it.
Configuration lives in a .mcp.json file that you can check into your repo for team-wide access:
```json
{
  "mcpServers": {
    "slack": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-slack"],
      "env": {
        "SLACK_TOKEN": "${SLACK_TOKEN}"
      }
    },
    "notion": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-notion"],
      "env": {
        "NOTION_TOKEN": "${NOTION_TOKEN}"
      }
    }
  }
}
```

One thing to watch: MCP tools consume context window space just by being available, even when they're not used. Tool descriptions are loaded into the agent's context, and some tools can consume 8% to 30% of your available tokens. Use the /context command to audit how much space each tool occupies, and only enable the servers you actually need for a given session.
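You can also manage servers from the shell with the claude mcp subcommands instead of editing .mcp.json by hand (the flags shown here may differ by version; check claude mcp --help):

```shell
# See what's configured and where each server came from
claude mcp list

# Add a project-scoped server (written to .mcp.json for the team)
claude mcp add slack --scope project -e SLACK_TOKEN=... -- npx -y @anthropic/mcp-slack

# Remove a server you're not using to reclaim its context tokens
claude mcp remove slack
```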
Preload Context with Diagrams and Memory
In the most recent episode of the How I AI podcast, John Lindquist (co-founder of egghead.io) shared a great tip for injecting flow diagrams into the model's context. Every time you start a Claude Code session, the agent knows nothing about your application's architecture, data flow, or how components connect. Many developers try to solve this by writing extensive CLAUDE.md files, but there's a better approach: preloading compressed context through Mermaid diagrams.
Mermaid is a text-based diagramming language that renders flowcharts, sequence diagrams, and entity-relationship diagrams inside Markdown files. While these diagrams can be hard for humans to parse in text form, LLMs consume them efficiently. A few hundred tokens of Mermaid syntax can convey what would take thousands of tokens to describe in prose.

The technique is to maintain a set of diagram files in a docs/diagrams/ directory, then inject them at session start using the system prompt:

```shell
claude --append-system-prompt "$(cat docs/diagrams/*.md)"
```

With all the architectural context preloaded, Claude can answer questions about authentication flows, database operations, or service interactions without doing any file reads or codebase exploration. The trade-off is higher upfront token cost, but the time savings and output quality make it worthwhile for complex codebases.
To generate these diagrams, ask Claude to analyze your codebase and produce Mermaid diagrams organized by concern: authentication flows, database operations, API routes, user interactions. Update them after major features land, ideally as part of your PR workflow.
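For a sense of scale, a hypothetical login flow fits in a few hundred tokens of Mermaid:

```mermaid
sequenceDiagram
    participant U as Browser
    participant A as API (/auth)
    participant D as Postgres
    U->>A: POST /auth/login {email, password}
    A->>D: SELECT user, password_hash
    D-->>A: user row
    A->>A: verify hash, sign JWT
    A-->>U: 200 {token} + refresh cookie
    U->>A: GET /me (Authorization: Bearer)
    A-->>U: 200 {profile}
```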
Shell Aliases and Custom Commands
Power users minimize friction by creating shell aliases for their most common Claude Code configurations:
```shell
alias cc="claude"

# Fast, lightweight model
alias cch="claude --model haiku"

# With diagrams (single outer quotes so the cat runs at invocation, not definition)
alias ccd='claude --append-system-prompt "$(cat ./diagrams/*.md)"'

# For sandboxed environments only
alias ccx="claude --dangerously-skip-permissions"
```

For repeatable workflows, Claude Code supports slash commands. Drop a markdown prompt template into .claude/commands/, and it becomes available as /project:command-name inside any session. Use $ARGUMENTS for dynamic inputs:
```markdown
# .claude/commands/fix-issue.md
Review GitHub issue $ARGUMENTS.
Identify the root cause by examining relevant files and git history.
Create a plan for the fix, then implement it with appropriate tests.
```

Since these files live in your repo's .claude/ directory, the entire team benefits when someone creates a useful command. It's a shared library of AI-powered workflows.
Scale with Parallel Sessions and Background Agents
Advanced users run multiple Claude instances. Cherny describes numbering his terminal tabs so he can tell which session is sending a notification. Each session works on a different task, and he cycles between them based on which one needs attention.
Claude Code also supports background agents through web sessions. Connect the web version of Claude Code to your GitHub repo, kick off a long-running task, and it works in the cloud. When it's done, it creates a new branch and pushes the changes for review. You can even start a task from your phone and continue it on your desktop later using the /teleport command, which brings a cloud session back into your local terminal.
For tasks within a local session, the subagent model works well. Subagents are isolated Claude instances that handle specific subtasks with their own instructions and context. Common uses include:
- Architecture verification (does this design make sense?)
- Code refactoring (clean up what was just written)
- Build validation (does the final output actually work?)
For parallel feature work, git worktrees let you run multiple Claude instances on separate branches without conflicts:
```shell
git worktree add ../project-feature-a feature-a
cd ../project-feature-a && claude
```
Choose the Right Model for the Job
Not every task needs the most powerful model. Claude Code lets you switch between models based on the complexity of what you're doing.
Cherny uses Opus 4.5 with extended thinking enabled for virtually everything. His reasoning: even though it's slower and more expensive per token, it makes significantly fewer errors. The total time spent steering and correcting is lower than with faster models, so the net productivity is higher.
That said, Haiku is great for quick, low-stakes tasks like generating boilerplate, formatting files, or answering simple questions about your codebase. Setting up aliases (like cch for Haiku) makes switching effortless.
The key insight is that model selection isn't just about speed or cost per token. It's about the total cost of completing a task, including the time you spend reviewing, correcting, and re-prompting. For anything that touches production code or requires architectural reasoning, the more capable model almost always wins. (For a deeper look at how to evaluate model quality, throughput, and cost, see our guide to benchmarking AI and evaluating LLMs.)
Integrate Claude into Your CI/CD Pipeline
Claude Code's GitHub Action turns it into an automated teammate in your CI/CD pipeline. The most impactful use case is PR reviews. When a pull request is opened, Claude can review the changes, identify potential issues, suggest improvements, and even fix problems directly.
Cherny takes this a step further. When he finds mistakes during a PR review, he asks Claude to add those mistake patterns to the CLAUDE.md file. This means the review process itself makes the agent smarter for future work. It's a self-improving system where every code review tightens the quality bar.
You can also use the headless mode (claude -p "your prompt") for scripted automation: running Claude as part of a build pipeline, generating changelogs, updating documentation, or triaging incoming issues. The --output-format stream-json flag gives you structured output for downstream processing.
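As an illustration, a CI step could use headless mode to draft a changelog entry (the prompt, tool list, and jq extraction are a sketch; it assumes ANTHROPIC_API_KEY is set in the job's environment):

```shell
claude -p "Summarize the changes since the last tag as a Keep a Changelog entry" \
  --allowedTools "Read,Bash(git log:*),Bash(git diff:*),Bash(git describe:*)" \
  --output-format json | jq -r '.result' >> CHANGELOG.md
```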
Getting Started
You don't need to adopt all of these practices at once. Start with the two that deliver the most immediate value:
- Add verification to your workflow. Even a simple "run tests after every change" instruction in your CLAUDE.md will measurably improve output quality.
- Tighten your CLAUDE.md. Prune it to under 3,000 tokens. Focus on what Claude gets wrong, not what it already does right.
From there, add hooks for automated quality checks, experiment with Plan mode for complex features, and explore MCP connections to your existing tools. Each layer compounds on the last.
The engineers getting the most from AI coding tools aren't the ones with the cleverest prompts. They're the ones who've built systems around the tool: feedback loops, quality gates, shared configurations, and automated workflows. The tool is powerful on its own. The system around it is what makes it transformative.
If you're evaluating how AI development tools fit into your engineering workflow, or building products that integrate with AI agents, our AI/ML team can help you design the right approach for your team and codebase.


