<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>LLM on OranLooney.com</title>
    <link>https://www.oranlooney.com/tags/llm/</link>
    <description>Recent content in LLM on OranLooney.com</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <copyright>&amp;copy; Copyright {year} Oran Looney</copyright>
    <lastBuildDate>Wed, 05 Jun 2024 00:00:00 +0000</lastBuildDate>
    
	<atom:link href="https://www.oranlooney.com/tags/llm/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>A Picture is Worth 170 Tokens: How Does GPT-4o Encode Images?</title>
      <link>https://www.oranlooney.com/post/gpt-cnn/</link>
      <pubDate>Wed, 05 Jun 2024 00:00:00 +0000</pubDate>
      
      <guid>https://www.oranlooney.com/post/gpt-cnn/</guid>
      <description>Here&amp;rsquo;s a fact: GPT-4o charges 170 tokens to process each 512x512 tile used in high-res mode. At ~0.75 tokens/word, this suggests a picture is worth about 227 words&amp;mdash;only a factor of four off from the traditional saying.
(There&amp;rsquo;s also an 85-token charge for a low-res &amp;lsquo;master thumbnail&amp;rsquo; of each picture, and higher-resolution images are broken into many such 512x512 tiles, but let&amp;rsquo;s just focus on a single high-res tile.)</description>
    </item>
    
    <item>
      <title>Let&#39;s Play Jeopardy! with LLMs</title>
      <link>https://www.oranlooney.com/post/jeopardy/</link>
      <pubDate>Sun, 12 May 2024 00:00:00 +0000</pubDate>
      
      <guid>https://www.oranlooney.com/post/jeopardy/</guid>
      <description>How good are LLMs at trivia? I used the Jeopardy! dataset from Kaggle to benchmark ChatGPT and the new Llama 3 models. Here are the results:
There you go. You&amp;rsquo;ve already gotten 90% of what you&amp;rsquo;re going to get out of this article. Some guy on the internet ran a half-baked benchmark on a handful of LLM models, and the results were largely in line with popular benchmarks and received wisdom on fine-tuning and RAG.</description>
    </item>
    
    <item>
      <title>My Dinner with ChatGPT</title>
      <link>https://www.oranlooney.com/post/my-dinner-with-chatgpt/</link>
      <pubDate>Sat, 10 Dec 2022 00:00:00 +0000</pubDate>
      
      <guid>https://www.oranlooney.com/post/my-dinner-with-chatgpt/</guid>
      <description>It&#39;s hard to talk about ChatGPT without cherry-picking. It&#39;s too easy to try a dozen different prompts, refresh each a handful of times, and report the most interesting or impressive thing from those sixty trials. While this problem plagues a lot of the public discourse around generative models, cherry-picking is particularly problematic for ChatGPT because it&#39;s actively using the chat history as context. (It might be using an $\mathcal{O}(n \log{} n)$ attention model like Reformer or it might just be brute-forcing it, but either way it has an impressively long memory; about 2048 &#34;</description>
    </item>
    
  </channel>
</rss>