Why I'm Optimistic
Some of the biggest positive turning points in my life and career happened during the hardest times. Work isn’t life, but it provides part of the foundation, and I prefer my work-life balance to be a little blurry anyway.
Carbon-Based Content
One of the more painful recurring experiences I’ve had working with AI tools is when the model seems so close to being able to do something right that I spend way too much time on that last 10-20%, trying to get it to do the right thing through prompting, pleading, or giving it more information.
Sometimes, the skill formerly known as Prompt Engineering (which, spoiler alert, is just Good Communication) is enough to get the result you want. In those cases where ‘better instructions and context’ aren’t doing the trick, another technique to try is asking the model to create the tool you need instead of having it do the work itself.
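To make that concrete, here’s a rough sketch of the kind of thing I mean. Instead of asking the model to fix a pile of inconsistently named files one by one (and checking every answer), you ask it to write a small, boring script that does the transformation deterministically. The directory and naming scheme below are made up for illustration.

```python
# Hypothetical example: instead of pasting hundreds of filenames into the chat
# and asking the model to fix each one, ask it to write a small tool like this
# once, then run the tool yourself.
import re
from pathlib import Path

def normalize_dates(directory: str) -> None:
    """Rename files like 'notes 3-7-2024.md' to 'notes 2024-03-07.md'."""
    pattern = re.compile(r"(\d{1,2})-(\d{1,2})-(\d{4})")
    for path in Path(directory).iterdir():
        match = pattern.search(path.name)
        if not match:
            continue
        month, day, year = match.groups()
        new_name = pattern.sub(f"{year}-{int(month):02d}-{int(day):02d}", path.name)
        path.rename(path.with_name(new_name))

if __name__ == "__main__":
    normalize_dates("./notes")  # directory name is made up for illustration
```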
The Model Context Protocol, a.k.a. MCP, has been gaining traction as a standardized communication protocol for AI applications to connect to external resources. However, without extra context, Claude and other LLMs will come up with their own imaginative explanations of what MCP is. Some of my favorite guesses include:
Most of these are real things related to AI, but none of them are the MCP I am usually referring to these days.
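For anyone who wants to see the real thing in code, here’s a minimal sketch of an MCP server using the official Python SDK’s FastMCP helper. The server name and the tool are made-up examples.

```python
# A minimal MCP server sketch using the official Python SDK's FastMCP helper.
# The 'shout' tool is a made-up stand-in for a real capability.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def shout(text: str) -> str:
    """Return the input text in all caps."""
    return text.upper()

if __name__ == "__main__":
    # Serves over stdio so an MCP client (e.g. Claude Desktop) can connect to it.
    mcp.run()
```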
Using an AI-assisted tool like Cline to create code is becoming more common. But what about reviewing other people’s changes to existing code? Sometimes you want a second pair of eyes to help you review a pull request, and Cline can help you do that too. You don’t even need to install any special tools beyond the GitHub CLI, since Cline will help you run the gh commands in the terminal.
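If you’re curious what that looks like under the hood, here’s a rough Python sketch of the same kind of gh calls, pulling a pull request’s metadata and diff into one blob of review context. It assumes gh is installed and authenticated, and the PR number is a placeholder.

```python
# Sketch of the kind of GitHub CLI commands Cline runs to gather review context.
# Assumes 'gh' is installed and authenticated; the PR number is a placeholder.
import subprocess

def fetch_pr_context(pr_number: int) -> str:
    """Return a PR's title/body plus its full diff as one blob of review context."""
    view = subprocess.run(
        ["gh", "pr", "view", str(pr_number), "--json", "title,body,author"],
        capture_output=True, text=True, check=True,
    )
    diff = subprocess.run(
        ["gh", "pr", "diff", str(pr_number)],
        capture_output=True, text=True, check=True,
    )
    return view.stdout + "\n" + diff.stdout

if __name__ == "__main__":
    print(fetch_pr_context(123))  # 123 is a made-up PR number
```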
Even though it’s not an image generation model, Claude.ai can draw pictures and create simple animations in the form of Scalable Vector Graphics, a.k.a. SVGs. Lately I’ve been using this feature more for things like creating icons, making diagrams, and visualizing weird geometric entities. I wanted to create a post to recap some of the specific workflows I’ve used to create usable images and animations in case it helps anyone else, including future me.
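Since SVGs are just text, you can save, tweak, and version what the model gives you without any image tooling. Here’s a small sketch that writes a tiny animated SVG (a pulsing circle) to disk; the markup is my own illustration, not actual Claude output.

```python
# SVGs are just text, which is why a text model can produce and edit them.
# This writes a tiny animated SVG (a pulsing circle); the markup is illustrative.
from pathlib import Path

PULSING_CIRCLE = """\
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" viewBox="0 0 100 100">
  <circle cx="50" cy="50" r="20" fill="teal">
    <animate attributeName="r" values="20;35;20" dur="2s" repeatCount="indefinite"/>
  </circle>
</svg>
"""

if __name__ == "__main__":
    Path("pulse.svg").write_text(PULSING_CIRCLE)
    print("Wrote pulse.svg - open it in a browser to see the animation.")
```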
If you want consistent, deterministic outputs from an LLM, setting the temperature to 0.0 seems like the obvious choice. In my mind, ‘removing randomness equals better’ for most of the tasks I use these models for. This is especially tempting for tasks like coding or technical writing, where mistakes and reproducibility matter most. However, asking the LLMs themselves to suggest temperature settings for tasks has made me rethink this assumption. The lowest setting they suggest is usually 0.2, even for high-accuracy technical or analytical tasks.
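For reference, temperature is just a single request parameter. Here’s a sketch of where it lives, assuming the Anthropic Python SDK; the model name and prompt are placeholders.

```python
# Temperature is just a request parameter. Sketch using the Anthropic Python SDK;
# the model name and prompt are placeholders, not a recommendation.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=500,
    temperature=0.2,  # the floor the models themselves tend to suggest, not 0.0
    messages=[{"role": "user", "content": "Review this function for edge cases..."}],
)
print(message.content[0].text)
```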
Today I had a fun conversation with Claude about “Visualizing Generative AI Beyond Humanoid Robots.” Note that Claude’s responses have been reformatted and edited down a little, but I didn’t fundamentally change any of the content. All generated visuals are linked inline as publicly published artifacts.
Me: I am getting a little tired of humanoid robots that look like Apple products being used to represent AI. What are some alternative visual representations, specifically for generative AI?
Claude: I’ve created an abstract visual representation of generative AI that moves away from anthropomorphic designs.
This design emphasizes:
Me: This reminds me of the idea that typical representations of angels and cherubs are not actually ‘historically accurate’ but that the more ‘realistic’ representations would be so weird and terrifying that we need to simplify them and make them more human-friendly.
Claude: Yes, the biblical descriptions of angels, particularly in Ezekiel and Isaiah, describe beings that are far more abstract and otherworldly - multiple faces, wheels within wheels, countless wings, covered in eyes - which is actually a much better metaphor for AI than our current sleek, humanoid representations. Let me create a visualization inspired by this concept - something that captures the more eldritch and abstract nature of artificial intelligence.
I finally read the official doc on the OpenAI o1 beta this morning. I hadn’t read it very closely (I can be allergic to reading manuals) until a Slack conversation on the topic inspired me to look it up. I’ve been loosely thinking about o1 as “better GPT model with fewer options to tweak.” That’s not totally accurate and has probably been making my own results worse.
Here are a few counter-intuitive things about using o1 that jumped out to me, in case anyone else missed the memo:
Almost every standard software development best practice and trick in the book is applicable to AI-assisted coding. The one I was reminded of the hard way is using Minimal Reproducible Examples to help understand and fix a problem. I’ve noticed, especially while using more ‘proactive’ and automated coding tools like Cline to write and update unit tests, that test failures can lead to a long session of going in circles while the token count climbs.
Besides the time and cost concerns, the bigger problem is that the context window rapidly starts to fill up with all these fruitless iterations. The more that happens, the more easily confused your assistant will get. I can usually tell something is up when they start to say things like “Aha! I figured out the problem…” It could be a coincidence, but I feel like a bot growing a personality usually correlates with a higher chance of hallucination.
Today I had to stop Claude a few times, start a new session, and remind them to start with a minimal reproducible example. So far it seems like this has been a good remedy for the circular unit test ‘fixing’ spree we were having. It’s something that I should probably apply to my own human troubleshooting more often. For more on the how/what/why of this, I’ll defer to this existing help article on Stack Overflow.
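In the AI-coding context, the gist is easy to sketch: instead of letting the assistant keep re-running the whole suite, pull the failing behavior into a few self-contained lines it can reason about. The function and input below are made up for illustration.

```python
# What "start with a minimal reproducible example" looks like in practice:
# pull the failing behavior out of the test suite into a few self-contained lines.
# The parse_price function and its input are made up for illustration.

def parse_price(text: str) -> float:
    """Toy stand-in for the real function a failing test was exercising."""
    return float(text.replace("$", "").replace(",", ""))

if __name__ == "__main__":
    sample = "$1,299.00"          # the exact input the failing test was feeding in
    print(parse_price(sample))    # expected: 1299.0, run alone with no test harness
```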
According to some, there is a conspiracy brewing at Anthropic. Their models mysteriously got dumber lately, and they refuse to answer for it. All kinds of anecdotes and explanations are popping up on the ClaudeAI subreddit, including from people who appear to know a lot about how these models work. The drama intensified after someone from the company dropped in to assert that they hadn’t noticed any widespread issues that would result in a global degradation. Everyone loves a good conspiracy, so as the “X-Files” music started playing in my head, I reflected on my recent experiences with Claude to find some evidence that I was being lied to.
It feels like ages ago now - the pre-meeting-transcript-summary days. Being able to condense an hour of recorded conversation into an organized list of bullet points still feels like a magic trick.
One of the common pieces of feedback that stands out from earlier in my career is to speak up more in meetings. Sometimes people said it straight up. Other times, it was implied by more vaguely worded feedback about fuzzier concepts like scope of influence.
One thing I like about using tools like ChatGPT or Claude for guiding my personal research is the simplicity and mental quiet of the interface. They also have a sincerely helpful attitude that can be equally rare when you go to ask for help online. The feeling I get from using them brings me back to watching “Star Trek: The Next Generation” as a kid. That was a future I wanted to live in. Not only was there all that stuff about Earth becoming a utopia - you could just say “Computer” and a helpful voice would always respond, one that wasn’t just trying to get you to buy more stuff on Amazon.
The practice of adding disclaimers to AI-generated content is one way to make sure that people know it might be misleading, or totally wrong. In other words, the creator knows that a model’s effectiveness is expected to vary, so they encourage the human seeing it to be more skeptical than usual and maybe cut it some extra slack. However, since AI and misinformation are both increasingly embedded in our digital experiences, it could make more sense in the future to call out when content is 100% ‘organic’ or reviewed by a human expert.
It’s almost my 5-year anniversary with the Personalization team at Nordstrom. Throughout that time, one of my favorite things has been building, running, and thinking about experiments. Yesterday I got to share some of those thoughts on the Tech at Nordstrom blog.
I’m leaving this “hello world” default blog post here for motivational reasons.