Systems and AI

This area focuses on two research directions: System Software for AI and Distributed Systems with ML.

System Software for AI

We design and develop algorithms and tools that improve the performance of deep learning and machine learning software on large-scale parallel and distributed architectures. Here are some of the projects we currently work on:

Parallelization of deep learning frameworks

Many state-of-the-art Deep Neural Networks have substantial memory and computation requirements. Limited device memory becomes a bottleneck when training large models. In our group, we seek novel approaches to scale DNNs on parallel and distributed large-scale systems.
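To give a rough sense of why device memory becomes the bottleneck, the sketch below estimates the memory needed just to hold model state during training. It is an illustration under stated assumptions, not a description of our methods: it assumes 32-bit weights and gradients and an Adam-style optimizer that keeps two 32-bit values per parameter, it ignores activations entirely, and it assumes model state could be split evenly across devices. The function names and default values are hypothetical.

```python
import math

def training_memory_gib(num_params, bytes_per_weight=4, bytes_per_grad=4,
                        optimizer_bytes_per_param=8):
    """Estimate memory (GiB) for weights, gradients, and optimizer state.

    Assumption: fp32 weights and gradients, plus two fp32 optimizer values
    per parameter (Adam-style). Activations are NOT included.
    """
    bytes_per_param = bytes_per_weight + bytes_per_grad + optimizer_bytes_per_param
    return num_params * bytes_per_param / 2**30

def min_devices(num_params, device_memory_gib):
    """Lower bound on devices needed if model state were split evenly.

    Assumption: an idealized even partition with no replication overhead.
    """
    return math.ceil(training_memory_gib(num_params) / device_memory_gib)

# A hypothetical 10-billion-parameter model on devices with 16 GiB each.
needed = training_memory_gib(10_000_000_000)
devices = min_devices(10_000_000_000, 16)
print(f"model state: {needed:.1f} GiB, at least {devices} devices")
```

Under these assumptions the model state alone exceeds the memory of a single device many times over, which is why the model has to be partitioned across parallel and distributed systems.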

Heterogeneous computing

Most DNN frameworks perform their compute-intensive operations on GPUs, IPUs, TPUs, or similar hardware accelerators. Our goal is to develop efficient custom kernels for the target architecture so that training and inference times are minimized.

Efficient graph analytics

The data that needs to be processed, managed, and analyzed is increasingly distributed and unstructured. A large fraction of this data is sparse. Such sparse data is typically modeled as a graph, a sparse matrix, or a sparse tensor. We accelerate graph and sparse computations on diverse hardware.
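As a small illustration of how a graph can be modeled as a sparse matrix, the sketch below stores a weighted directed graph in compressed sparse row (CSR) form and multiplies it by a dense vector. This is a minimal pure-Python example for exposition only, not one of our kernels; the helper names `edges_to_csr` and `spmv` and the sample edge list are hypothetical.

```python
def edges_to_csr(num_nodes, edges):
    """Build CSR arrays (row_ptr, col_idx, values) from (src, dst, weight) edges."""
    # Count the outgoing edges of each node.
    counts = [0] * num_nodes
    for src, _, _ in edges:
        counts[src] += 1

    # row_ptr[i] .. row_ptr[i + 1] delimits the entries of row i.
    row_ptr = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        row_ptr[i + 1] = row_ptr[i] + counts[i]

    # Place each edge in the next free slot of its source row.
    col_idx = [0] * len(edges)
    values = [0.0] * len(edges)
    next_slot = row_ptr[:-1].copy()
    for src, dst, weight in edges:
        pos = next_slot[src]
        col_idx[pos] = dst
        values[pos] = weight
        next_slot[src] += 1
    return row_ptr, col_idx, values

def spmv(row_ptr, col_idx, values, x):
    """Sparse matrix-vector product y = A @ x, touching only stored entries."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

# A 4-node graph with 4 edges; node 2 has no outgoing edges.
edges = [(0, 1, 1.0), (0, 2, 2.0), (1, 2, 3.0), (3, 0, 4.0)]
row_ptr, col_idx, values = edges_to_csr(4, edges)
y = spmv(row_ptr, col_idx, values, [1.0, 1.0, 1.0, 1.0])
print(row_ptr, col_idx, y)
```

Only the four stored entries are read during the product, rather than all sixteen entries of the dense 4-by-4 adjacency matrix. The irregular memory accesses through `col_idx` are typical of what makes such computations hard to accelerate.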

Distributed Systems with ML

The integration of machine learning (ML) techniques to enable intelligent decisions in distributed systems and reliable networks is a promising and active research area. Driven by developments in distributed system technologies such as edge computing, software-defined networks, blockchain, information-centric networks, and the Internet of Things, the field is being transformed by new solutions that are increasingly robust, efficient, scalable, reliable, and secure.

Sample projects we currently conduct in Distributed Systems with ML:

Machine Learning Framework for SDN

Software-defined networks (SDN), with their logically centralized controller, generate a massive amount of information through switch-controller communication. As the amount of network information keeps growing, machine learning techniques play a vital role in discovering knowledge from the stored network information. We propose a novel hybrid machine-learning-based framework that combines the capabilities of SDN and machine learning for energy-efficient routing.

Blockchain-assisted Intelligent P2P Services

Blockchain is a promising technology that offers a distributed, robust, and secure framework for P2P energy trading. Scalability and security problems with centralized architecture models have provided opportunities for blockchain-based distributed models. Our goal is to develop machine learning-based intelligent services for blockchain-assisted P2P energy trading using smart contracts. In particular, we utilize reinforcement learning to create a novel self-adaptive grouping technique.

Intelligent Edge Computing

Distributed edge intelligence is becoming increasingly significant because processing large amounts of data in the cloud is not always efficient, for reasons such as bandwidth and delay constraints, security, energy consumption, and fault tolerance. Supervised and unsupervised machine learning techniques are widely used to enable intelligent decisions at the network edge. We develop a novel edge computing architecture for the oil refinery industry to enable intelligent and sustainable services at the network edge.
