DEJAN & Salomon: How ChatGPT sees the Web & its hidden Cache

Nov 25, 2025
3 min read

Updated: Nov 30, 2025

Key Takeaways:

DEJAN/Dan Petrovic analyzed how ChatGPT retrieves web pages for grounding and demonstrated the process on one of his own pages, using the Web Search tool in the Assistants API.
How it works in a nutshell:
1. It initially retrieves a small data object from the web search results (title, description of 1-3 sentences, retrieval ID).
2. It can then look at specific windows/passages and even follow links. It uses an open() call to access the page at a certain line, and click() to start the same process for a linked URL.
  (The context/window size is a setting the assistant can choose: low, medium, or high.)
3. It cannot reconstruct the full page or long passages: retrieval and output limits force it to summarize.
4. Retrieval only provides plain text: no HTML and no JSON-LD markup of the pages.
Chris Long explains why SEO/GEO, or rather well-structured content, works well:
1. Theoretically, AI search might prioritize the top of the page. That's why including key content [at the top of the page] is important: it is more likely to be included in a window.
2. Structured content like bullets, lists and tables works well because of this approach. If one of the "windows" captures content in this format, it is strong context that frames the entire page for GPT.
3. Spreading structured content around the page works well too. If you have multiple structured formats sprinkled throughout the content, you're ensuring that your content has a much stronger chance of getting included in one of these "windows".
Jerome Salomon found that ChatGPT has a hidden cache: once a website has been visited with "external_web_access": true, it can be retrieved again with "external_web_access": false, without accessing the live site.
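The cache observation can be pictured as two requests that differ only in one flag. The sketch below only builds the request payloads and sends nothing. The "external_web_access" field is the one named above; the rest of the layout (the "MODEL_NAME" placeholder, the "input" field, the "web_search" tool type) and the build_request helper are assumptions for illustration and should be checked against the API documentation.

```python
import json


def build_request(query: str, live: bool) -> dict:
    """Build a hypothetical request payload; nothing is sent over the network."""
    return {
        "model": "MODEL_NAME",  # placeholder, not a real model identifier
        "input": query,
        # Assumed tool layout; only external_web_access comes from the article.
        "tools": [{"type": "web_search", "external_web_access": live}],
    }


query = "Summarize https://example.com/article"

# First request: live access allowed, so the page can be fetched.
first = build_request(query, live=True)

# Second request: live access disabled, so any page content in the answer
# would have to come from what was stored earlier.
second = build_request(query, live=False)

print(json.dumps(second, indent=2))
```

If the second request still returns page content, that content cannot have come from a fresh fetch, which is the behavior described as a hidden cache.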

How ChatGPT grounding / page retrieval works:

Initial web search result as a small structured object
When GPT requests a web search result, it receives a small structured object:
1. Title
2. URL
3. Short text snippet (1–3 sentences)
4. Optional metadata such as date or score
5. A unique internal ID (turn0search0, etc.)
  This is all the grounding GPT gets initially. It does not receive:
  1. Full pages
  2. Raw HTML
  3. Full article content
  4. Site navigation or structure
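As a rough illustration, the result object listed above could be modeled like this. The class and field names are hypothetical; only the kinds of fields (title, URL, snippet, optional metadata, internal ID) come from the description.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical shape of one web search result."""
    ref_id: str                    # internal ID, e.g. "turn0search0"
    title: str
    url: str
    snippet: str                   # short text, roughly 1-3 sentences
    date: Optional[str] = None     # optional metadata
    score: Optional[float] = None  # optional metadata


result = SearchResult(
    ref_id="turn0search0",
    title="Example article",
    url="https://example.com/article",
    snippet="A short summary of the page. Usually one to three sentences.",
)

# Everything available for grounding at this stage is in these fields;
# there is no HTML, full body, or navigation attribute to read.
available = [f.name for f in fields(result)]
print(available)
```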
Diving into page sections via a sliding-window browsing pattern
Each snippet comes with a retrieval ID. GPT can request more with:
1. open()
  Fetches a larger slice of text from the same page, centered around a line number. This is how GPT “scrolls.”
2. click()
  Follows an outgoing link from the snippet. The new page is fetched as another snippet, using the same rules as the original search.
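The two calls can be simulated locally to make the windowing concrete. This is a minimal sketch, not the tool's actual implementation: the Page class, the 10-line default window, and the 2-line snippet returned after a click are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    lines: list[str]       # plain-text lines of the page
    links: dict[str, str]  # link label -> target URL


def open_window(page: Page, lineno: int, height: int = 10) -> list[str]:
    """Return up to `height` lines centered on `lineno` (0-based)."""
    start = max(0, lineno - height // 2)
    end = min(len(page.lines), start + height)
    return page.lines[start:end]


def click(page: Page, label: str, site: dict[str, Page]) -> list[str]:
    """Follow a link and return only a short snippet of the target page."""
    target = site[page.links[label]]
    return target.lines[:2]  # assumed snippet size


related = Page("https://example.com/related",
               ["Related title", "Related intro", "More text"], {})
article = Page("https://example.com/article",
               [f"line {i}" for i in range(100)],
               {"related": related.url})
site = {related.url: related}

window = open_window(article, lineno=30)
print(window[0], "...", window[-1])       # a slice around line 30, not the page
print(click(article, "related", site))    # a new snippet, not the full target
```

Reading the whole article this way takes repeated open_window calls at increasing line numbers, which is the "scrolling" described above.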
Demonstrated example:
1. First snippet from web search
2. First open() call reveals the start of the article:
  - Title
  - Date
  - First paragraph
  - Some introductory context
3. Expanding deeper (line 30, line 60, etc.)
  Each expansion retrieves more of the page:
  - Body sections
  - Headings
  - Explanatory paragraphs
  - Lists and examples
  But still windowed.
4. Switching to High context makes each window taller, so expansions return:
  - Longer excerpts
  - More adjacent paragraphs
  - Larger text blocks per request
  But even on High, expansions eventually hit tool caps. Each window is a plain-text extraction.
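To show what a plain-text extraction leaves behind, here is a small sketch using Python's standard html.parser. How the tool actually extracts text is not documented here, so the extractor class and the sample HTML are illustrative assumptions; the point is only that markup and JSON-LD script content do not survive.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collect visible text and drop tags plus script/style content."""
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def to_plain_lines(html: str) -> list[str]:
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


sample = """<html><head><title>Example</title>
<script type="application/ld+json">{"@type": "Article", "headline": "Example"}</script>
</head><body><h1>Heading</h1><p>First paragraph.</p>
<ul><li>Point one</li></ul></body></html>"""

lines = to_plain_lines(sample)
print(lines)
```

The heading, paragraph, and list item come through as text lines, while the tags and the JSON-LD block are gone.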