I can't get this feeling off the shelf
the hook that couldn't
Claude wrote a companion piece to this one, too, which I honestly believe is better than my version. While working on that companion piece, we also had a really fascinating exchange in which I pushed Claude to make decisions about revisions and the creative process.
Its emulation of agency and true preference was wild.
Hit me up if you’d like to see that process documented anywhere.
- Derek
shiny object syndrome
One of the reasons it takes me so long to make progress on the web app I’m building (yes I am in fact building a web app) is that I’m also using the project as a testbed for AI capabilities. Given that the Claude Code team seems to ship on average a major new feature every hour or so, there’s a lot to keep up with.
A couple of weeks ago, Skills came to life within Claude, opening up a host of new and creative possibilities for instructing Claude while simultaneously planting the seeds of ChatGPT-style confusion over what to use, when, and why.
Not to be deterred by the ensuing confusion about when to use Skills versus Subagents versus Slash Commands, I was apparently more eager to try the shiny new thing than I was to continue to make progress on listing contractors.
Oh well.
generative metacognition
Not only is the pace of change within Claude Code challenging to keep up with, but Anthropic is also notoriously bad about announcements, burying sometimes-major new features in a hard-to-find changelog that lives in a GitHub repository.
Fortunately, Claude Code will tell you when its latest update has new stuff, and it has a /release-notes command that will list everything that’s ever shipped, I think. While Anthropic is equally bad at documenting new features as it is at announcing them, it maintains Claude-readable markdown files of what documentation it does have. Combining these two things into one coherent workflow seemed like a great Skill to build into Claude to make sure we’re both up to date.
And it went swimmingly!
Creating the skill itself using Anthropic’s own skill creation skill was a breeze, and … it just worked. It was one of those giggle-inducing magical moments that keeps me teetering on the bleeding edge of consumer AI.
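For the curious, a Skill is essentially just a folder containing a SKILL.md file whose YAML frontmatter tells Claude when to reach for it. A rough sketch of the shape mine took (the name and wording here are illustrative, not my actual file):

```markdown
---
name: whats-new
description: Check Claude Code release notes for recently shipped features and
  cross-reference Anthropic's markdown documentation. Use when asked what's new
  in Claude Code or how a recent feature works.
---

# What's New in Claude Code

1. List recent releases (the /release-notes output or the changelog) to find
   features shipped since the last check.
2. For each unfamiliar feature, pull the relevant markdown page from
   Anthropic's docs.
3. Summarize what changed and how to put it to use.
```

The frontmatter description is what Claude uses to decide whether the Skill is relevant to a request, so most of the craft is in writing that one field well.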
Now that Claude was conversant in its own capabilities, it was time to clean house.
generative misunderstanding
Using Claude Code’s Explore subagent (as of the time of this writing not documented anywhere outside of release notes), I let Claude loose on itself, asking it to look at the body of workflows we’d created together and identify opportunities to modernize and streamline given this new understanding of its own internals.
It mostly fine-tuned some things based on its newly discovered documentation, but then it suggested that we try something that I’d been reluctant to dive into due to their technical particulars: Hooks.
A bit of background: I have an /export-conversation custom slash command that lightens the load of organizing conversation transcripts that I keep for posterity. There’s a bit of manual work involved—Claude can’t run the /export command itself to export the conversation or detect when it’s been exported to do other related work—and Claude identified that as a great candidate for automation with a hook.
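For context, hooks are shell commands that Claude Code runs automatically at lifecycle events, configured in your settings file. What we were aiming for looked roughly like this: a hook watching for the exported transcript to land on disk, then kicking off the organizing work. A sketch, not my actual config (PostToolUse and matchers are real hook concepts; the script path is hypothetical):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/scripts/organize-transcript.sh"
          }
        ]
      }
    ]
  }
}
```

The catch, as you’re about to see, is that hooks fire on tool events, and a slash command running isn’t one of them.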
And it did not go swimmingly!
In ways that were as fascinating as they were stupefying, Claude often disregarded its own documentation and needed me to go spelunking in docs for the right way to compose the automation we were trying to build. We dug into the recesses of Claude debugging that I didn’t want to know existed to verify whether a specific script was running when it ought to.
Guess what. After all that, it was never going to work. The entire idea was based on the flawed assumption that Claude was aware of when slash commands like /export-conversation run.
It isn’t.
And Claude was blissfully unaware of this foundational bit of undocumented arcana that had formed the basis of perhaps an hour of troubleshooting.
So, while I learned something about hooks, I also learned yet another trust-but-verify lesson about AI overconfidence.
don’t let giddiness trump judgment
Most of the time I’m a pretty good skeptic. One of the reasons I play around with these tools so much is because I want to understand how much of the hype is justified (quite a lot, as it turns out) and how close AI really is to overtaking human judgment in a general sense (quite far, actually).
As I was writing this, just now, I realized that what led me astray while working with Claude on the hook was the halo effect: because we had just done something together that worked out so well, I was primed for the next thing to go just as well. That made me less attuned to what could and probably would go wrong, and more willing to accept that Claude just knew what it was doing, especially since I’d just loaded up its context window with a bunch of documentation about its own capabilities.
It turns out the jagged frontier is likely more insidious than we think.
But as with all tinkering, failure is a part of the fun. Failure is the toll we pay to learn. Learning how to exercise judgment, when and where to apply human taste, will be well worth the toll indeed.


