AI coding tools are shifting to a surprising place: the terminal

For years, code-editing tools such as Cursor, Windsurf and GitHub Copilot were the standard for AI-powered software development. But as agentic AI grows more powerful and vibe coding takes off, a subtle shift has changed how AI systems interact with software. Instead of working on code, they are increasingly interacting directly with the shell of whatever system they are installed in. It is a major change in how AI-powered software development works, and despite the low profile, it could have significant implications for where the field goes from here.

The terminal is best known as the black-and-white screen you remember from 1990s hacker movies: a very old-school way of running programs and manipulating data. It is not as visually impressive as contemporary code editors, but it is an extremely powerful interface if you know how to use it. And while code-based agents can write and debug code, terminal tools are often needed to get software from written code to something that can actually be used.

The clearest sign of the shift to the terminal has come from the major labs. Since February, Anthropic, DeepMind and OpenAI have all released command-line coding tools (Claude Code, Gemini CLI and Codex CLI, respectively), which are already among the companies' most popular products. The shift has been easy to miss, because the tools largely operate under the same branding as previous coding tools. But under the hood, there have been real changes in how agents interact with other computers, both online and offline. Some believe these changes are just getting started.

“Our big bet is that there is a future in which 95% of LLM-computer interaction is through a terminal-like interface,” says Mike Merrill, co-creator of the leading terminal-focused benchmark, Terminal-Bench.

Terminal-based tools are also coming into their own just as prominent code-based tools are starting to look shaky. The AI code editor Windsurf has been torn apart by dueling acquisitions, with senior executives hired away by Google and the remaining company acquired by Cognition, leaving the consumer product's long-term future uncertain.

At the same time, new research suggests that programmers may be overestimating the productivity gains from conventional tools. A METR study testing Cursor Pro, Windsurf's main competitor, found that although developers estimated they could complete tasks 20-30 percent faster, the observed process was almost 20 percent slower. In short, the code assistant was actually costing programmers time.

This has left an opening for companies like Warp, which currently holds the top spot on Terminal-Bench. Warp bills itself as an “agentic development environment,” a middle ground between IDE programs and command-line tools such as Claude Code. But Warp founder Zach Lloyd is still bullish on the terminal, seeing it as a way to tackle problems that would be out of scope for a code editor like Cursor.

“The terminal occupies a very low level in the developer stack, so it is the most versatile place to be running agents,” says Lloyd.

To understand how the new approach differs, it helps to look at the benchmarks used to measure it. The code-based generation of tools focused on solving GitHub issues, which is the basis of the SWE-Bench test. Each problem on SWE-Bench is an open issue from GitHub: essentially, a piece of code that does not work. Models iterate on the code until they find something that works, solving the problem. Integrated products such as Cursor have built more sophisticated approaches, but the GitHub/SWE-Bench model remains the core of how these tools approach the problem: starting with broken code and turning it into working code.

Terminal-based tools take a wider view, looking beyond the code to the whole environment a program runs in. That includes coding, but also more DevOps-oriented tasks such as configuring a Git server or troubleshooting why a script will not run. In one Terminal-Bench problem, the instructions provide a decompression program and a target text file, challenging the agent to reverse-engineer a matching compression algorithm. Another asks the agent to build the Linux kernel from source, without mentioning that the agent will have to download the source code itself. Solving these problems requires the kind of problem-solving ability that programmers need.

“What makes Terminal-Bench hard is not just the questions we are giving the agents. It is the environments we are placing them in,” says Alex Shaw.

Crucially, this new approach means tackling a problem step by step, the same skill that makes agentic AI so powerful. But even state-of-the-art agentic models cannot handle all of these environments. Warp earned its high score on Terminal-Bench by solving slightly more than half of the problems: a sign of how challenging the benchmark is, but also of how much work is still to be done to unlock the terminal's full potential.
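To make the step-by-step idea concrete, here is a minimal Python sketch of running a plan one command at a time and stopping at the first failure. It is an illustration only, not how Warp or any product named here works; the function names `run_step` and `run_plan` and the sample plan are hypothetical.

```python
import subprocess
import sys


def run_step(argv):
    """Run one command and return (exit_code, combined_output)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, (result.stdout + result.stderr).strip()


def run_plan(steps):
    """Run commands in order, stopping at the first failure.

    Returns a log of (argv, exit_code, output) tuples so the caller can
    see which step failed and why.
    """
    log = []
    for argv in steps:
        code, output = run_step(argv)
        log.append((argv, code, output))
        if code != 0:
            break  # a real agent would read the output and replan here
    return log


if __name__ == "__main__":
    # Hypothetical plan: each step is a Python one-liner so the sketch
    # runs anywhere Python does. The second step fails on purpose.
    plan = [
        [sys.executable, "-c", "print('dependencies ok')"],
        [sys.executable, "-c", "import sys; sys.exit('build failed')"],
        [sys.executable, "-c", "print('never reached')"],
    ]
    for argv, code, output in run_plan(plan):
        print(code, output)
```

The log keeps each command's exit code and output, which is the raw material an agent would need either to try a different next step or to explain why it stopped.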

Still, Lloyd believes we are already at a point where terminal-based tools can reliably handle much of a developer's non-coding work, a value proposition that is hard to ignore.

“If you think of the daily work of setting up a new project, figuring out the dependencies and getting it runnable, Warp can pretty much do that autonomously,” says Lloyd. “And if it can't do it, it will tell you why.”
