The secret sauce of this model is its ability to “read” images without translating them into text first, speeding up the process and reducing the amount of computing power required. “The model’s entire inference process is no longer limited to text. It can infer using images as well,” Dahua Lin, co-founder and chief scientist at SenseTime, said in an interview with WIRED.
Lin, who is also a professor of information engineering at the Chinese University of Hong Kong, says models capable of directly processing images will enable robots to better understand the physical world in the future.
Like DeepSeek’s latest flagship model, SenseTime says the U1 could be powered by a Chinese-made chipset. “Many Chinese domestic chipmakers have finished improving compatibility with our new model,” Lin says. On release day, 10 Chinese chip designers, including Cambricon and Biren Technology, announced their U1-enabled devices.
This flexibility is important because US export controls restrict Chinese companies from accessing the world’s most advanced AI chips, especially those used for training, which at this point are developed primarily by Western companies like Nvidia. “We will continue to push for training on more different chips,” Lin says. But he also acknowledges that SenseTime “may still need to use the best chips to ensure fast iteration.”
SenseTime has released the U1 model for free on Hugging Face and GitHub, another sign that Chinese companies are becoming some of the most active contributors to open source AI.
Founded in 2014, SenseTime has become a global leader in computer vision, which is used in applications such as facial recognition and autonomous driving. But as ChatGPT and other AI systems powered by natural language processing became the hottest thing in the tech industry, SenseTime began to struggle to turn a profit and fell behind newer Chinese startups like DeepSeek and MiniMax.
SenseTime says it hopes releasing the SenseNova-U1 publicly for anyone to use will help it catch up with domestic and Western AI players. Lin says the company finally decided last year to focus on open source because of the helpful feedback it receives from researchers, which enables the company to iterate faster. “In this day and age, being open source or closed source is not the winning factor, it is the speed of iteration,” Lin explains.
Switching to open source also helps SenseTime continue to collaborate with international researchers without the interference of geopolitics. The US government has repeatedly sanctioned the company in recent years over allegations that its facial recognition technology helped power surveillance systems used to monitor and detain Uighurs and other minority groups in China’s Xinjiang region. As a result, US companies are prohibited from investing in SenseTime and selling certain technologies to it without a license. (SenseTime has denied the allegations.)
See clearly
In an accompanying technical report, SenseTime claims that the SenseNova-U1 generates higher-quality images than all other open source models currently on the market. Its performance is comparable to leading Chinese closed-source models like Alibaba’s Qwen and ByteDance’s Seedream, but it still lags behind industry leaders like GPT-Image-2.0, which was released just a week ago.
But the model’s main selling point is that it generates images much faster than all of those rivals. It’s built on a novel model architecture called NEO-Unify that SenseTime previewed earlier this year.
