Release of GLM-4.6V with Native Function Calling Support

TG AI News · December 9, 2025, 5:12 AM
The flagship GLM-4.6V model with 106 billion parameters and the lightweight GLM-4.6V-Flash (9B) have been released. Both models feature a 128k-token context window and interleaved generation of text and images. The model can pass images and screenshots to external tools without first converting them to text, and can embed visual results back into its reasoning chain. Both models are already available on Hugging Face, via API, and in the web version.

A Six-Person Startup Outperformed Google Gemini 3 in the ARC-AGI Logic Test

The Poetiq team took first place on the semi-private ARC-AGI-2 benchmark with 54% correct answers, comfortably surpassing the industry giant: Google had previously reported 45% for Gemini 3 Deep Think. ARC-AGI, created by researcher François Chollet, is considered one of the hardest tests for AI: it measures not accumulated knowledge but the capacity for abstract reasoning and for solving genuinely novel tasks. Poetiq attributes its success not to training a new model but to effective orchestration of existing ones.

A Co-Author of the Transformer Architecture Released the Coding Model Rnj-1

The startup Essential AI, founded by Ashish Vaswani, presented the open-weight model Rnj-1. At only 8 billion parameters, it posts top results on SWE-bench Verified: Rnj-1 scored 20.8, while the similarly sized Qwen 3 (8B) reached only 4.5. The model is built on the Gemma 3 architecture. The developers deliberately avoided leaning on post-training and RL; instead, the team concentrated on high-quality pre-training using the Muon optimizer.

NVIDIA Introduced the Largest CUDA Update Since 2006

Alongside the release of CUDA 13.1, the company is launching a virtual instruction set for "tiled" parallel programming. The new paradigm abstracts away low-level hardware details, allowing algorithms to be written at a higher level.
CUDA Tile enables operations on whole data blocks, automatically optimizing execution for the specific tensor cores and memory architecture. The foundation of the technology is CUDA Tile IR, an intermediate representation similar to PTX but tailored to matrix operations. This provides portability: the same code will run efficiently on different generations of GPUs without deep refactoring.

Grok 4.20 Outperformed Top Models in Stock Trading
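The tile-level idea described above can be illustrated with a plain-Python sketch. This is purely conceptual, not NVIDIA's actual CUDA Tile API: the function names (`matmul_tile`, `tiled_matmul`) are ours. The point is the programming model: the main loop manipulates whole sub-blocks with a single high-level operation instead of indexing individual elements, which is the granularity a tile compiler can then map onto tensor cores and the memory hierarchy.

```python
# Conceptual sketch of tiled programming (assumes matrix sizes divisible
# by the tile size). Not the CUDA Tile API; an illustration of the style.

def matmul_tile(a_tile, b_tile):
    """One 'tile op': multiply two small dense tiles (lists of lists)."""
    n, k, m = len(a_tile), len(b_tile), len(b_tile[0])
    return [[sum(a_tile[i][p] * b_tile[p][j] for p in range(k))
             for j in range(m)] for i in range(n)]

def add_tile(x, y):
    """Elementwise tile addition."""
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]

def tiled_matmul(a, b, tile=2):
    """Blocked C = A @ B: the loop body only ever sees whole tiles."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            acc = [[0] * tile for _ in range(tile)]
            for p0 in range(0, k, tile):
                a_t = [row[p0:p0 + tile] for row in a[i0:i0 + tile]]
                b_t = [row[j0:j0 + tile] for row in b[p0:p0 + tile]]
                # The entire inner step is one tile-level operation.
                acc = add_tile(acc, matmul_tile(a_t, b_t))
            for i in range(tile):
                for j in range(tile):
                    c[i0 + i][j0 + j] = acc[i][j]
    return c
```

On a GPU, each tile op would be lowered by the compiler to tensor-core instructions sized for the target architecture, which is why the same block-level code can stay efficient across GPU generations.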