Modéliser les relations pour résoudre efficacement des problèmes complexes | Actualités du MIT

Modéliser les relations pour résoudre efficacement des problèmes complexes | Actualités du MIT

Certainly! Here’s a rewritten version of the article:

German philosopher Friedrich Nietzsche once remarked that "invisible threads are the strongest ties." These "invisible threads" can be imagined as connections between related objects, such as homes on a delivery driver’s route, or more abstract entities like transactions within a financial network or users on a social media platform.

Computer scientist Julian Shun explores these complex yet often unseen connections using graphs, where objects are depicted as nodes or vertices, and the relationships between them are represented by lines or edges.

Shun, recently tenured as an associate professor in the Department of Electrical Engineering and Computer Science, designs graph algorithms that could be employed to determine the shortest path between homes on a delivery route or to identify fraudulent transactions by malicious actors in a financial network.

With the increasing volume of data, these networks have expanded to include billions of objects and connections. To find efficient solutions, Shun develops high-performance algorithms that leverage parallel computing to quickly analyze even the largest graphs. Given the notorious difficulty of parallel programming, he also creates user-friendly programming frameworks that make it easier for others to write their own efficient graph algorithms.

"When you search for something in a search engine or on a social network, you want results very quickly. If you’re trying to identify fraudulent financial transactions at a bank, you want to do it in real-time to minimize damage. Parallel algorithms can speed things up by using more computing resources," explains Shun, who is also a principal investigator at the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Such algorithms are frequently used in online recommendation systems. Search for a product on an e-commerce website, and you’re likely to quickly see a list of related items you might also want to add to your cart. This list is generated using graph algorithms that exploit parallelism to swiftly find related items across a vast network of users and available products.

Campus Connections

As a teenager, Shun’s only experience with computers was a high school course on website creation. More interested in math and natural sciences than technology, he intended to major in one of those subjects when he enrolled as an undergraduate at the University of California, Berkeley.

However, in his first year, a friend recommended he take an introductory computer science course. Despite not knowing what to expect, he decided to enroll.

"I fell in love with programming and algorithm design. I switched to computer science and never regretted my choice," he recalls.

This introductory computer science course was self-paced, so Shun taught himself most of the material. He enjoyed the logical aspects of algorithm development and the short feedback loop of computing problems. Shun could input his solutions into the computer and immediately see if he was right or wrong. Errors in incorrect solutions would guide him to the correct answer.

"I always thought it was fun to build things, and in programming, you create solutions that do something useful. That appealed to me," he adds.

After graduating, Shun spent some time in the industry but soon realized he wanted to pursue an academic career. In a university setting, he knew he would have the freedom to study the problems that interested him.

Diving into Graphs

He enrolled as a graduate student at Carnegie Mellon University, where he focused his research on applied algorithms and parallel computing.

As a student, Shun had taken theoretical algorithm courses and practical programming courses, but the two worlds were not connected. He wanted to conduct research that combined theory and application. Parallel algorithms were the perfect fit.

"In parallel computing, you have to care about practical applications. The goal of parallel computing is to speed things up in real life, so if your algorithms aren’t fast in practice, they’re not very useful," he says.

At Carnegie Mellon, he was introduced to graph datasets, where network objects are modeled as vertices connected by edges. He was drawn to the many applications of these types of datasets and the challenging problem of developing efficient algorithms to manage them.

After completing a postdoctoral fellowship at Berkeley, Shun sought a faculty position and decided to join MIT. He had collaborated with several MIT faculty members on parallel computing research and was excited to join an institute with such expertise.

In one of his first projects after joining MIT, Shun teamed up with Saman Amarasinghe, a professor in the Department of Electrical Engineering and Computer Science and a CSAIL member, who is an expert in programming languages and compilers, to develop a programming framework for graph processing known as Graphite. The user-friendly framework, which generates efficient code from high-level specifications, operated about five times faster than the next best approach.

"It was a very fruitful collaboration. I couldn’t have created such a powerful solution if I had worked alone," he says.

Shun has also expanded his research to include clustering algorithms, which aim to group related data points. He and his students are building parallel algorithms and frameworks to quickly solve complex clustering problems, which can be used for applications such as anomaly detection and community detection.

Dynamic Challenges

Recently, he and his collaborators have focused on dynamic problems where data in a graph network changes over time.

When a dataset contains billions of data points, running an algorithm from scratch to make a small change can be extremely costly from a computational perspective. He and his students design parallel algorithms that handle many updates simultaneously, improving efficiency while maintaining accuracy.

However, these dynamic problems also pose one of the biggest challenges that Shun and his team must overcome. Since there aren’t many dynamic datasets available to test algorithms, the team often has to generate synthetic data that may not be realistic and could hinder the performance of their algorithms in the real world.

Ultimately, his goal is to develop dynamic graph algorithms that work efficiently in practice while adhering to theoretical guarantees. This ensures they will be applicable in a wide range of contexts, he says.

Shun expects dynamic parallel algorithms to become an even more significant research focus in the future. As datasets grow larger, more complex, and evolve faster, researchers will need to create more efficient algorithms to keep up.

He also anticipates new challenges arising from advances in computing technology, as researchers will need to design new algorithms to leverage the properties of new hardware.

"That’s the beauty of research: I’m trying to solve problems that others haven’t solved before and bring something useful to society," he says.

Source