Addressing AI’s ‘Hand-Me-Down Infrastructure’ Issue

A stealthy Silicon Valley startup aims to solve the problem of AI software, once and for all.

“As an industry, we’re at this interesting point where everyone knows [AI’s] potential,” said Chris Lattner, co-founder and CEO of Modular AI, in an exclusive interview with EE Times. “Everyone has seen the research, but it doesn’t really go into products except by the biggest companies in the world. It shouldn’t be like this.”

Since AI and machine learning (ML) are still nascent fields, some of the technologies that today’s AI/ML software stacks depend on originated from research projects.

“It was pure research, and therefore it made sense for a research lab to build these kinds of tools,” Lattner said, referring to today’s widely used AI frameworks and compiler infrastructure. “Fast forward to today, this is no longer research.”

This, he added, is one of the main reasons why AI software and tools are unreliable, unpredictable, and offer little in the way of security. They were simply not designed to be production software.

Lattner points out that today’s AI frameworks, such as Google’s TensorFlow, Meta’s PyTorch, or Google’s JAX, “are not here to make ML great for the whole world; they’re there to solve the problems of the business that’s paying them,” and that if a business doesn’t have the same setup and use cases as a hyperscaler, then “it can work, but it’s not made for [their] work.”

Lattner calls this the “hand-me-down infrastructure” problem. Modular’s co-founder and chief product officer, Tim Davis, calls it “trickle-down infrastructure.”

Modular co-founders Chris Lattner (left) and Tim Davis (right) (Source: Modular AI)

The problem for chipmakers is that changes at the framework layer have repercussions.

“[Hardware companies] have to stick to that programming model, to lower it onto their hardware,” Davis said. “As these frameworks evolve, the stack must continue to evolve to meet the needs [of the hardware], to completely saturate and utilize the hardware. This means that they have to keep coming back to the framework level, to be able to support all the different frameworks. It turns out to be very difficult.”

Over the past few years, chipmakers have released dozens of different accelerators based on domain-specific architectures. Each requires a bespoke compiler, which in most cases must be built from scratch.

“The cool thing about machine learning tensors and graphs is that they have implicit parallelism as part of describing computation,” Lattner said. (Tensors are a commonly used data type in AI.) “That means you’re suddenly at a higher level of abstraction, which means compilers can do so much more. There are two sides to this coin: the first is that they can do so much more, but the other is that they have to do so much more.”
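To make “implicit parallelism” concrete, here is a minimal sketch in Python using NumPy (the example, names, and shapes are illustrative, not from the article or from Modular). A tensor expression states what to compute over whole arrays, without spelling out loop order or thread assignment, and that unspecified “how” is exactly the freedom a compiler can exploit.

import numpy as np

A = np.random.rand(1024, 1024).astype(np.float32)
B = np.random.rand(1024, 1024).astype(np.float32)

# These tensor expressions describe *what* to compute, not *how*:
# no loop order, tiling, or core assignment is specified, so a
# compiler or runtime is free to vectorize, tile, or parallelize.
C = A @ B             # matrix multiply over whole tensors at once
D = np.tanh(C) + 1.0  # elementwise ops are independent per element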

The state of AI software is also bad news for developers, as the same program may need to be deployed on multiple systems with vastly different constraints: everything from a server to a mobile phone to a web browser.

“If every system you want to deploy to has a different toolchain, a team building a product has to rewrite their code over and over again,” Lattner said. “It’s a huge challenge. Right now a hardware team will have to create their own stack because there is nothing they can connect to…. We need more of a standardization force, which can make it easier for the hardware people, but also solve the problem for the software developer, because the tools can be good.”

Lattner and Davis’ startup Modular intends to tackle some of these issues.

“We’re tackling all the familiar problems: how do you do hardware abstraction, how do you get compilers to communicate with a wide range of different hardware, and how do you create the points at which you can connect to many different kinds of hardware?” Lattner said. “Basically, what we’re building is a production-grade version of all the tools and technology the world already uses.”

Modular plans to tackle everything between the framework and the hardware, including some common issues faced by hardware manufacturers, while allowing them to build the accelerator-specific parts of their stack themselves.

“It is unlikely that we will be able to solve their unique problems,” he said. “But they also have common problems. For example, how do you load the data? How do you plug into PyTorch? We can offer value on this side of the issue.”

This would also include things like image decoding and embedding-table lookups, in other words, things that aren’t related to AI acceleration but are nonetheless expected by customers.
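As an illustration of what an embedding-table lookup involves, here is a hypothetical Python sketch; the table size, variable names, and data are invented for this example and are not from the article.

import numpy as np

# A feature/embedding table: one learned vector per token or feature ID.
vocab_size, embed_dim = 10_000, 64
table = np.random.rand(vocab_size, embed_dim).astype(np.float32)

# A lookup is a gather over rows of the table: plain data movement
# rather than AI "math," yet pipelines depend on it being fast.
token_ids = np.array([17, 42, 9_001])
vectors = table[token_ids]  # shape: (3, 64)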

“There’s a lot of really interesting hardware out there that’s really struggling to get adopted because it’s just trying to get the basics right,” Lattner said.

Davis added that hardware companies are struggling to keep up with the changing demands of the frameworks, combined with ever-changing algorithms.

“How can [evolving algorithms] be lowered to hardware without hardware companies having to rewrite half of their AI software stack just to make it work?” he said. “It’s a very concrete issue, and we think there’s a significant opportunity there.”

Why does it take a whole new company to solve these problems?

Lattner and Davis’ view is that most compiler engineers in industry work to get a given piece of hardware working, under tight time constraints. This means no one can look at the larger problem.

“It’s almost like a fragmentation problem,” Lattner said. “[Compiler engineering] talent is spread across all the different chips: there’s no center of gravity where you can have a team that can care about building things that not only solve the problem, but are also high quality.”

Modular is building such a team, starting with Lattner, co-inventor of LLVM. His résumé also includes Clang, MLIR, and Swift, with stints at SiFive, Google, Apple, and Tesla.

Davis previously worked on Google’s AI infrastructure, including TFLite and Android ML. Tatiana Shpeisman, head of compiler engineering at Modular, previously led the CPU and GPU compiler infrastructure for Google ML and also co-founded the MLIR project.

The other team members have backgrounds in XLA, TensorFlow, PyTorch, and ONNX. In total, Modular employs around 30 people.

Modular’s goal is a development platform where different slices of the company’s technology can be used in different products in different ways.

“What we’re trying to do is fundamentally help ML grow, help the infrastructure become something everyone can rely on, and empower people to build products on top of it instead of having to worry about all those things,” Lattner said. “There are really hard problems that can be solved using this technology, and people want to work on those problems, not on maintaining all the different things they want to take for granted.”

Modular is still in stealth mode, but plans to release its first products next year.


