Running OpenAI's advanced reasoning model locally is possible, but it requires a substantial computing setup, specifically an RTX graphics card.

AMD is also on board with the initiative.


OpenAI has announced two new models, gpt-oss-20b and gpt-oss-120b, developed in collaboration with Nvidia.

The smaller gpt-oss-20b, with 20 billion parameters, is designed to run on edge or consumer hardware. It requires just 16 GB of GPU VRAM or unified memory, which puts it within reach of high-end consumer GPUs such as the NVIDIA RTX 3090 (24 GB VRAM) and Apple Silicon Macs, and makes it well suited to on-device use and local inference [1][3][4][5].

The larger gpt-oss-120b, with 120 billion parameters, requires significantly more memory. It is designed to fit on a single GPU with 60–80 GB of VRAM or unified memory, such as the data center-class NVIDIA H100 or A100 80 GB. Systems without a single device of that size would need a multi-GPU or workstation setup [1][2][3][4][5].

Both models are open-weight, meaning their weights are accessible, which offers more insight into how the models work. They ship quantized in MXFP4 out of the box and currently do not support other quantization formats [3]. If VRAM is limited, some computation can be offloaded to the CPU, at the cost of slower inference [3].
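To see why 4-bit quantization matters for these memory figures, here is a rough back-of-the-envelope sketch in Python. It assumes the standard MXFP4 layout (4-bit values with one shared 8-bit scale per block of 32 values) and, as a simplification, that every parameter is stored in that format. The function names are made up for illustration, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Rough weight-memory estimate for an MXFP4-quantized model.
# Assumption (simplification): every parameter is stored in MXFP4.

MXFP4_ELEMENT_BITS = 4   # each value is a 4-bit float
MXFP4_SCALE_BITS = 8     # one shared 8-bit scale per block
MXFP4_BLOCK_SIZE = 32    # number of values sharing one scale


def bits_per_param() -> float:
    """Effective bits per parameter, including the shared block scale."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_gigabytes(num_params: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param() / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts are the article's rounded figures.
    for name, params in [("gpt-oss-20b", 20e9), ("gpt-oss-120b", 120e9)]:
        print(f"{name}: ~{weight_gigabytes(params):.1f} GB of weights")
```

Under these assumptions the weights alone come to roughly 10.6 GB for the 20b model and roughly 64 GB for the 120b model, which is in line with the 16 GB and 60–80 GB figures quoted above once some working memory is added.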

Both models use a mixture-of-experts architecture: gpt-oss-120b activates about 5.1B parameters per token and gpt-oss-20b about 3.6B, which keeps inference efficient despite the large overall parameter counts [4].
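The efficiency claim is easier to judge as a ratio. The short sketch below computes the share of parameters active per token, using only the rounded figures quoted in this article. The dictionary layout and function name are illustrative choices.

```python
# Share of parameters active per token in each mixture-of-experts model.
# All figures are the article's rounded numbers, in billions of parameters.

MODELS = {
    "gpt-oss-20b": {"total_b": 20.0, "active_b": 3.6},
    "gpt-oss-120b": {"total_b": 120.0, "active_b": 5.1},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of the model's parameters used for a single token."""
    return active_b / total_b


if __name__ == "__main__":
    for name, cfg in MODELS.items():
        share = active_fraction(cfg["total_b"], cfg["active_b"])
        print(f"{name}: {share:.1%} of parameters active per token")
```

On these numbers the 20b model touches about 18% of its parameters per token and the 120b model only about 4%. All of the weights must still be held in memory, though, so the memory requirement follows the total parameter count, not the active one.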

AMD CEO Lisa Su congratulated Sam Altman on the new models, stating that AMD is proud to be a Day 0 partner, enabling these models to run on its hardware. According to AMD, the Radeon RX 9070 XT and any AMD Ryzen AI processor with 32 GB of memory can run the 20b model, while the Ryzen AI Max+ 395 in its 128 GB RAM configuration can run the full-fat 120b model.
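The hardware claims in this article amount to a simple capacity check, sketched below in Python. The memory thresholds are the minimums quoted above (16 GB for the 20b model, and the lower end of the 60–80 GB range for the 120b model). The device list uses only the memory sizes the article states. The `Device` class and `runnable_models` helper are hypothetical names for this example. The check is deliberately naive: it treats all of a device's memory as available to the model, which is optimistic for unified-memory systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    memory_gb: int  # VRAM or unified memory, as quoted in the article


# Minimum memory per model, taken from the figures quoted in the article.
MIN_MEMORY_GB = {"gpt-oss-20b": 16, "gpt-oss-120b": 60}

DEVICES = [
    Device("NVIDIA RTX 3090", 24),
    Device("AMD AI CPU system (32 GB)", 32),
    Device("NVIDIA A100 80 GB", 80),
    Device("AMD AI Max+ 395 (128 GB)", 128),
]


def runnable_models(device: Device) -> list[str]:
    """Models whose quoted minimum memory fits on the device (naive check)."""
    return [m for m, need in MIN_MEMORY_GB.items() if device.memory_gb >= need]


if __name__ == "__main__":
    for dev in DEVICES:
        models = ", ".join(runnable_models(dev)) or "none"
        print(f"{dev.name}: {models}")
```

With these inputs the 24 GB and 32 GB devices qualify only for the 20b model, while the 80 GB and 128 GB devices qualify for both, which matches the pairing described by Nvidia and AMD.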

As the AI landscape continues to evolve, these new models from OpenAI and collaborations with tech giants like Nvidia and AMD are set to redefine the possibilities of artificial intelligence.

References:
[1] OpenAI Blog: https://openai.com/blog/gpt-4
[2] Nvidia Blog: https://blogs.nvidia.com/blog/2023/03/22/openais-new-models-gpt-oss-20b-and-gpt-oss-120b-powered-by-nvidia-ai-infrastructure/
[3] OpenAI API Documentation: https://beta.openai.com/docs/models/gpt-oss
[4] VentureBeat: https://venturebeat.com/2023/03/22/openais-new-gpt-oss-models-are-more-efficient-than-chatgpt-and-cost-less-to-run/
[5] TechCrunch: https://techcrunch.com/2023/03/22/openais-new-gpt-oss-models-are-more-efficient-than-chatgpt-and-cost-less-to-run/

  1. OpenAI's smaller model, gpt-oss-20b, is designed to run on consumer hardware such as the NVIDIA RTX 3090 or Apple Silicon Macs, making it suitable for on-device use and local inference.
  2. The larger gpt-oss-120b, with 120 billion parameters, requires significantly more memory and is designed for data center-class GPUs such as the NVIDIA H100 or A100 80 GB.
  3. Both gpt-oss-20b and gpt-oss-120b are open-weight, meaning the weights are accessible, which provides insight into how the models work.
  4. AMD congratulated Sam Altman on the new models and stated that its hardware, including the Radeon RX 9070 XT and Ryzen AI processors with 32 GB of memory, can run the 20b model.
  5. Smart-home devices and other gadgets might also benefit from these models, potentially through better built-in AI assistance.
  6. As the AI landscape continues to evolve, these new models from OpenAI, developed in collaboration with tech giants like Nvidia and AMD, could affect sectors ranging from gaming to smart homes and beyond.
