Exploring the importance of thinking probabilistically when working with LLMs, this post highlights insights on effective eval methodologies, the quirks of model behavior, and practical tips for building robust evaluation processes that go beyond traditional testing.
Just found an incredible guide on building an LLM-as-a-judge by Hamel Husain! Super insightful, especially since we’re using a similar system at Ada to evaluate transcript resolutions. Excited about how well the approach avoids blind spots in test coverage!
Just tried out Anthropic's Computer Use demo in a Docker setup! It can control a virtual machine and run tasks like adding a knowledge base for our bots. Super impressive, but it did trip up on some commands and interactions. Excited to see where this tech goes!
I just got access to the new ChatGPT search feature on macOS! Excited to compare how it stacks up against my go-to tool, Perplexity, for research. Gave it a spin with a few examples and shared my thoughts on the strengths and weaknesses of both. Check it out!
Exploring the nature of reasoning in AI models and questioning whether making LLMs express their thoughts out loud limits their potential.
In this blog post, I share a recent enhancement to my website's intake endpoint that utilizes LLM technology to automatically generate short descriptions for my blog posts. By integrating OpenAI's API, I can now effortlessly create engaging summaries whenever I upload new content. I discuss the process behind this implementation, its effectiveness with past examples, and my plans to add features for generating tags based on existing ones. Dive in to learn how AI is transforming the way I present my ideas online!
Reflecting on the mind-boggling growth of #ChatGPT and how it shattered expectations.
Just checked out the new Perplexity macOS app and it’s a game changer! It's already my go-to research tool, especially for those tricky service-desk KPIs. The inline citations and easy-to-copy tables make my life so much easier!
Exploring my dad's deep conversations with ChatGPT about the Bahá'í faith and the rich religious texts that may influence its knowledge.
I stumbled upon a fun little experiment with ChatGPT tonight—someone mentioned it might have Easter eggs, so I asked for a random YouTube link. Turns out, it definitely trolled me! I love when tech has a sense of humor.
Exploring how different LLMs perform at chess: most fail, with turbo-instruct the exception. Discusses how tuning and training may influence the results.
A nod to Will Larson's post on using LLMs in production and some additional notes based on my own experience.
A humorous take on asking ChatGPT to illustrate one's life, featuring a mysterious 'Horchar Blend'.
Learn about artificial intelligence with Brit from 'Art of the Problem,' who excels at creating engaging and educational videos.
Some brief thoughts on the future of no-code in a world where code generation is ubiquitous.