
copilot: Correct o3-mini context length #24152

Merged (2 commits) on Feb 4, 2025

Conversation

@chapel (Contributor) commented Feb 3, 2025

It should be 200k (with 100k output). I can't find anything that puts it at 20k, and the changeover in 2f82374 only changed the name from o1-mini to o3-mini.

References:

Release Notes:

  • Corrected GitHub Copilot o3-mini context length
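
For illustration, the shape of the fix is roughly the following. This is a minimal sketch, not Zed's actual source; the type and method names are hypothetical stand-ins for wherever the Copilot model metadata lives:

    // Hypothetical sketch (not Zed's real code): per-model token limits in
    // the Copilot integration. The o3-mini entry had carried over o1-mini's
    // 20k limit when only the name was changed in 2f82374.
    enum Model {
        O3Mini,
    }

    impl Model {
        fn max_token_count(&self) -> usize {
            match self {
                // was 20_000; corrected to the 200k advertised for o3-mini
                Model::O3Mini => 200_000,
            }
        }
    }

    fn main() {
        assert_eq!(Model::O3Mini.max_token_count(), 200_000);
    }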

@maxdeviant changed the title from "Correcting o3-mini context length" to "Correct o3-mini context length" on Feb 3, 2025
@maxdeviant changed the title from "Correct o3-mini context length" to "copilot: Correct o3-mini context length" on Feb 3, 2025
@cla-bot added the cla-signed label ("The user has signed the Contributor License Agreement") on Feb 3, 2025

@maxdeviant (Member)

> I can't find anything that puts it at 20k, and the changeover in 2f82374 only changed the name from o1-mini to o3-mini.

Here's the context for where the 20k limit came from: #20362

@chapel (Contributor, Author) commented Feb 3, 2025

> Here's the context for where the 20k limit came from: #20362

Ah, I didn't find that, but I appreciate the context. I haven't tested the API; if they are limiting it to that, then obviously close the PR. I wish they publicly posted what their limits were, regardless.

@maxdeviant (Member)

> > Here's the context for where the 20k limit came from: #20362
>
> Ah, I didn't find that, but I appreciate the context. I haven't tested the API; if they are limiting it to that, then obviously close the PR. I wish they publicly posted what their limits were, regardless.

That was for o1-mini, so the question is whether o3-mini has a higher token count and, if so, what it is.

It looks like there is an API endpoint to retrieve the information: #20362 (comment)

It doesn't seem to work unauthenticated for me, so I'd need to figure out how to auth against it.
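
(For anyone who wants to poke at it: a minimal sketch of the auth dance, pieced together from what the editor plugins appear to do rather than from any official docs, so treat both endpoints as assumptions. The idea is to exchange a GitHub OAuth token for a short-lived Copilot bearer token, then hit the models endpoint with that:)

    // Sketch only; endpoints taken from editor-plugin reverse engineering,
    // not official documentation. Requires reqwest (blocking + json
    // features) and serde_json.
    use reqwest::blocking::Client;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let github_token = std::env::var("GITHUB_TOKEN")?;
        let client = Client::new();

        // 1. Exchange the GitHub OAuth token for a short-lived Copilot token.
        let resp: serde_json::Value = client
            .get("https://api.github.com/copilot_internal/v2/token")
            .header("Authorization", format!("token {github_token}"))
            .header("User-Agent", "token-probe")
            .send()?
            .json()?;
        let copilot_token = resp["token"].as_str().ok_or("no token in response")?;

        // 2. List the models (and their limits) visible to this account.
        let models: serde_json::Value = client
            .get("https://api.githubcopilot.com/models")
            .header("Authorization", format!("Bearer {copilot_token}"))
            .send()?
            .json()?;
        println!("{models:#}");
        Ok(())
    }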

@chapel (Contributor, Author) commented Feb 3, 2025

> It doesn't seem to work unauthenticated for me, so I'd need to figure out how to auth against it.

Sadly, my work account for Copilot doesn't give us access to the new models right now. Later tonight I can see if the free version of Copilot lets you use o3-mini and whether the context is higher.

@itsaphel commented Feb 4, 2025

Can confirm 200k input and 100k output tokens, according to the /models endpoint anyway:

    {
        "id": "azureml://registries/azure-openai/models/o3-mini/versions/2025-01-31",
        "registry": "azure-openai",
        "name": "o3-mini",
        "original_name": "o3-mini",
        "friendly_name": "OpenAI o3-mini",
        "task": "chat-completion",
        "publisher": "OpenAI",
        "license": "custom",
        "summary": "o3-mini includes the o1 features with significant cost-efficiencies for scenarios requiring high performance.",
        "model_family": "OpenAI",
        "model_version": "2025-01-31",
        "popularity": 55.01,
        "tags": [
            "reasoning",
            "multilingual",
            "coding"
        ],
        "rate_limit_tier": "custom",
        "supported_languages": [
            "en",
            "it",
            "af",
            "es",
            "de",
            "fr",
            "id",
            "ru",
            "pl",
            "uk",
            "el",
            "lv",
            "zh",
            "ar",
            "tr",
            "ja",
            "sw",
            "cy",
            "ko",
            "is",
            "bn",
            "ur",
            "ne",
            "th",
            "pa",
            "mr",
            "te"
        ],
        "max_output_tokens": 100000,
        "max_input_tokens": 200000,
        "training_data_date": null,
        "license_description": "Use of Azure OpenAI Service is subject to applicable Microsoft\nProduct Terms <https://www.microsoft.com/licensing/terms/welcome/welcomepage> including the Universal License Terms for Microsoft Generative AI Services and the service-specific terms for the Azure OpenAI product offering.",
        "static_model": false,
        "supported_input_modalities": [
            "text"
        ],
        "supported_output_modalities": [
            "text"
        ]
    },

I have not personally tried to input 200k tokens, however.
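
(Side note: if you'd rather read those limits programmatically than eyeball the JSON, a small sketch along these lines works; serde/serde_json are assumed as dependencies, and the entry is abridged to the fields that matter:)

    // Pull just the limit fields out of a catalog entry like the one above.
    // serde ignores the many extra fields in the real response by default.
    use serde::Deserialize;

    #[derive(Deserialize, Debug)]
    struct CatalogModel {
        name: String,
        max_input_tokens: u64,
        max_output_tokens: u64,
    }

    fn main() -> serde_json::Result<()> {
        let raw = r#"{
            "name": "o3-mini",
            "max_input_tokens": 200000,
            "max_output_tokens": 100000
        }"#;
        let m: CatalogModel = serde_json::from_str(raw)?;
        // Prints: o3-mini: 200000 in / 100000 out
        println!("{}: {} in / {} out", m.name, m.max_input_tokens, m.max_output_tokens);
        Ok(())
    }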

@notpeter merged commit 2853649 into zed-industries:main on Feb 4, 2025 (13 checks passed)
@notpeter (Member) commented Feb 4, 2025

Thanks!

@chapel deleted the patch-1 branch on February 4, 2025 at 18:04
@SirSilver (Contributor)

[screenshot of the error]
Got this error with 0.173.1-pre using Copilot o3-mini, so I believe the context length was correct before this PR, or the wrong model is being used under the o3-mini name.

@itsaphel commented Feb 5, 2025

Hm. I've just sniffed the request made by Copilot in VSCode, and I get this:

{
	"data": [{
		"capabilities": {
			"family": "gpt-3.5-turbo",
			"limits": {
				"max_context_window_tokens": 16384,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 12288
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "cl100k_base",
			"type": "chat"
		},
		"id": "gpt-3.5-turbo",
		"model_picker_enabled": false,
		"name": "GPT 3.5 Turbo",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-3.5-turbo-0613"
	}, {
		"capabilities": {
			"family": "gpt-3.5-turbo",
			"limits": {
				"max_context_window_tokens": 16384,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 12288
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "cl100k_base",
			"type": "chat"
		},
		"id": "gpt-3.5-turbo-0613",
		"model_picker_enabled": false,
		"name": "GPT 3.5 Turbo",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-3.5-turbo-0613"
	}, {
		"capabilities": {
			"family": "gpt-4",
			"limits": {
				"max_context_window_tokens": 32768,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 32768
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "cl100k_base",
			"type": "chat"
		},
		"id": "gpt-4",
		"model_picker_enabled": false,
		"name": "GPT 4",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4-0613"
	}, {
		"capabilities": {
			"family": "gpt-4",
			"limits": {
				"max_context_window_tokens": 32768,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 32768
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "cl100k_base",
			"type": "chat"
		},
		"id": "gpt-4-0613",
		"model_picker_enabled": false,
		"name": "GPT 4",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4-0613"
	}, {
		"capabilities": {
			"family": "gpt-4o",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 64000,
				"vision": {
					"max_prompt_image_size": 3145728,
					"max_prompt_images": 1
				}
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "gpt-4o",
		"model_picker_enabled": true,
		"name": "GPT 4o",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4o-2024-05-13"
	}, {
		"capabilities": {
			"family": "gpt-4o",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 64000,
				"vision": {
					"max_prompt_image_size": 3145728,
					"max_prompt_images": 1
				}
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "gpt-4o-2024-05-13",
		"model_picker_enabled": false,
		"name": "GPT 4o",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4o-2024-05-13"
	}, {
		"capabilities": {
			"family": "gpt-4o",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 64000
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "gpt-4-o-preview",
		"model_picker_enabled": false,
		"name": "GPT 4o",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4o-2024-05-13"
	}, {
		"capabilities": {
			"family": "gpt-4o",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 16384,
				"max_prompt_tokens": 64000
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "gpt-4o-2024-08-06",
		"model_picker_enabled": false,
		"name": "GPT 4o",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4o-2024-08-06"
	}, {
		"capabilities": {
			"family": "text-embedding-ada-002",
			"limits": {
				"max_inputs": 256
			},
			"object": "model_capabilities",
			"supports": {},
			"tokenizer": "cl100k_base",
			"type": "embeddings"
		},
		"id": "text-embedding-ada-002",
		"model_picker_enabled": false,
		"name": "Embedding V2 Ada",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "text-embedding-ada-002"
	}, {
		"capabilities": {
			"family": "text-embedding-3-small",
			"limits": {
				"max_inputs": 512
			},
			"object": "model_capabilities",
			"supports": {
				"dimensions": true
			},
			"tokenizer": "cl100k_base",
			"type": "embeddings"
		},
		"id": "text-embedding-3-small",
		"model_picker_enabled": false,
		"name": "Embedding V3 small",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "text-embedding-3-small"
	}, {
		"capabilities": {
			"family": "text-embedding-3-small",
			"object": "model_capabilities",
			"supports": {
				"dimensions": true
			},
			"tokenizer": "cl100k_base",
			"type": "embeddings"
		},
		"id": "text-embedding-3-small-inference",
		"model_picker_enabled": false,
		"name": "Embedding V3 small (Inference)",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "text-embedding-3-small"
	}, {
		"capabilities": {
			"family": "gpt-4o-mini",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 12288
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "gpt-4o-mini",
		"model_picker_enabled": false,
		"name": "GPT 4o Mini",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4o-mini-2024-07-18"
	}, {
		"capabilities": {
			"family": "gpt-4o-mini",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 12288
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "gpt-4o-mini-2024-07-18",
		"model_picker_enabled": false,
		"name": "GPT 4o Mini",
		"object": "model",
		"preview": false,
		"vendor": "Azure OpenAI",
		"version": "gpt-4o-mini-2024-07-18"
	}, {
		"capabilities": {
			"family": "o1-ga",
			"limits": {
				"max_context_window_tokens": 200000,
				"max_prompt_tokens": 20000
			},
			"object": "model_capabilities",
			"supports": {
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "o1",
		"model_picker_enabled": true,
		"name": "o1 (Preview)",
		"object": "model",
		"preview": true,
		"vendor": "Azure OpenAI",
		"version": "o1-2024-12-17"
	}, {
		"capabilities": {
			"family": "o1-ga",
			"limits": {
				"max_context_window_tokens": 200000,
				"max_prompt_tokens": 20000
			},
			"object": "model_capabilities",
			"supports": {
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "o1-2024-12-17",
		"model_picker_enabled": false,
		"name": "o1 (Preview)",
		"object": "model",
		"preview": true,
		"vendor": "Azure OpenAI",
		"version": "o1-2024-12-17"
	}, {
		"capabilities": {
			"family": "o3-mini",
			"limits": {
				"max_context_window_tokens": 200000,
				"max_output_tokens": 100000,
				"max_prompt_tokens": 20000
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "o3-mini",
		"model_picker_enabled": true,
		"name": "o3-mini (Preview)",
		"object": "model",
		"preview": true,
		"vendor": "Azure OpenAI",
		"version": "o3-mini-2025-01-31"
	}, {
		"capabilities": {
			"family": "o3-mini",
			"limits": {
				"max_context_window_tokens": 200000,
				"max_output_tokens": 100000,
				"max_prompt_tokens": 20000
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "o3-mini-2025-01-31",
		"model_picker_enabled": false,
		"name": "o3-mini (Preview)",
		"object": "model",
		"preview": true,
		"vendor": "Azure OpenAI",
		"version": "o3-mini-2025-01-31"
	}, {
		"capabilities": {
			"family": "o3-mini",
			"limits": {
				"max_context_window_tokens": 200000,
				"max_output_tokens": 100000,
				"max_prompt_tokens": 20000
			},
			"object": "model_capabilities",
			"supports": {
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "o3-mini-paygo",
		"model_picker_enabled": false,
		"name": "o3-mini (Preview)",
		"object": "model",
		"preview": true,
		"vendor": "Azure OpenAI",
		"version": "o3-mini-paygo"
	}, {
		"capabilities": {
			"family": "claude-3.5-sonnet",
			"limits": {
				"max_context_window_tokens": 128000,
				"max_output_tokens": 4096,
				"max_prompt_tokens": 128000,
				"vision": {
					"max_prompt_image_size": 3145728,
					"max_prompt_images": 1
				}
			},
			"object": "model_capabilities",
			"supports": {
				"parallel_tool_calls": true,
				"streaming": true,
				"tool_calls": true
			},
			"tokenizer": "o200k_base",
			"type": "chat"
		},
		"id": "claude-3.5-sonnet",
		"model_picker_enabled": true,
		"name": "Claude 3.5 Sonnet (Preview)",
		"object": "model",
		"policy": {
			"state": "enabled",
			"terms": "Enable access to the latest Claude 3.5 Sonnet model from Anthropic. [Learn more about how GitHub Copilot serves Claude 3.5 Sonnet](https://docs.github.com/copilot/using-github-copilot/using-claude-sonnet-in-github-copilot)."
		},
		"preview": true,
		"vendor": "Anthropic",
		"version": "claude-3.5-sonnet"
	}],
	"object": "list"
}

Namely, for o3-mini:

"limits": {
    "max_context_window_tokens": 200000,
    "max_output_tokens": 100000,
    "max_prompt_tokens": 20000
},

So indeed, it looks like only 20k tokens can be sent in the prompt. Assuming Copilot even allows a full 200k to be used (as in, assuming it's not restricting beyond the model's capabilities), I presume it's not directly via the chat context. I can see file contexts being chunked and sent to a separate API endpoint, which I presume is the 'correct' way to use the entire available token window. I'll try to look into it a bit more.
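
(To make the distinction concrete, here is a rough sketch of what honoring these limits client-side looks like: clamp outgoing prompts to max_prompt_tokens even though the context window is ten times larger. The token estimate is a crude placeholder for the o200k_base tokenizer the endpoint advertises:)

    // Illustrative only: enforce max_prompt_tokens on the client side,
    // independently of the much larger max_context_window_tokens.
    const MAX_PROMPT_TOKENS: usize = 20_000;

    /// Crude ~4-chars-per-token estimate; a real client would count with
    /// the o200k_base tokenizer.
    fn estimate_tokens(text: &str) -> usize {
        text.chars().count() / 4
    }

    fn fits_in_prompt(messages: &[&str]) -> bool {
        let total: usize = messages.iter().map(|m| estimate_tokens(m)).sum();
        total <= MAX_PROMPT_TOKENS
    }

    fn main() {
        // ~100k characters ≈ 25k tokens: over the 20k prompt cap even
        // though it is far under the 200k context window.
        let big = "x".repeat(100_000);
        assert!(!fits_in_prompt(&[big.as_str()]));
        println!("prompt cap enforced independently of context window");
    }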

I presume this change should be reverted, though. My apologies for the mistake the first time; it seems the different methods of calling Copilot give different results and limits.

notpeter added a commit that referenced this pull request on Feb 5, 2025:

    Reverts #24152
    See comment: #24152 (comment)
    Manually confirmed >20k generates error.
osiewicz pushed a commit to RemcoSmitsDev/zed that referenced this pull request Feb 5, 2025
@chapel (Contributor, Author) commented Feb 6, 2025

Thanks @SirSilver and @itsaphel for getting to the bottom of this; it is unfortunate that Copilot doesn't actually document this.

It is an interesting distinction, though: you can potentially use up to 200k tokens of context but only send up to 20k at a time. I wonder if it counts cached tokens (things it has seen before, as OpenAI does).
