January 23, 2024

Öznur Taştan, Sabancı University

Machine Learning for Life Sciences

In this talk, I will give examples of methods developed in our group at the intersection of machine learning and computational biology. The main body of the talk will focus on our drug synergy prediction efforts. Combination drug therapies are effective treatments for cancer. However, the genetic heterogeneity of patients and the exponentially large space of drug pairings pose significant challenges for finding the right combination for a specific patient. Current in silico prediction methods promise to reduce the vast number of candidate drug combinations for further screening. However, existing powerful methods are trained with cancer cell line gene expression data, which limits their applicability in clinical settings. While synergy measurements on cell line models are available at large scale, patient-derived samples are too few to train a complex model. On the other hand, patient-specific single-drug response data are relatively more available. In this talk, I will first present our method trained on cell line gene expression data and then describe training strategies for customizing patient drug synergy predictions using single-drug response data.

January 16, 2024

João Henriques, University of Oxford

A Light Touch Approach to Teaching Transformers Multi-view Geometry

Transformers are powerful visual learners, in large part due to their conspicuous lack of manually-specified priors. This flexibility can be problematic in tasks that involve multiple-view geometry, due to the near-infinite possible variations in 3D shapes and viewpoints (requiring flexibility), and the precise nature of projective geometry (obeying rigid laws). To resolve this conundrum, we propose a “light touch” approach, guiding visual Transformers to learn multiple-view geometry but allowing them to break free when needed. We achieve this by using epipolar lines to guide the Transformer’s cross-attention maps, penalizing attention values outside the epipolar lines and encouraging higher attention along these lines since they contain geometrically plausible matches. Unlike previous methods, our proposal does not require any camera pose information at test-time. We focus on pose-invariant object instance retrieval, where standard Transformer networks struggle, due to the large differences in viewpoint between query and retrieved images. Experimentally, our method outperforms state-of-the-art approaches at object retrieval, without needing pose information at test-time.
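The attention-biasing idea can be sketched as follows. This is a toy illustration of soft epipolar guidance, not the paper's implementation: the function names, the additive-penalty scheme, and the tensor shapes are my assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def epipolar_guided_attention(logits, epipolar_mask, penalty=4.0):
    """Bias cross-attention toward epipolar lines.

    logits:        (Q, K) raw attention scores between tokens of two views
    epipolar_mask: (Q, K) True where the key token lies on/near the
                   epipolar line of the query token
    penalty:       subtracted from off-line scores; a soft penalty rather
                   than a hard mask, so the model can still "break free"
                   when the geometric prior is wrong
    """
    biased = np.where(epipolar_mask, logits, logits - penalty)
    return softmax(biased, axis=-1)

# Toy example: 2 query tokens, 4 key tokens, uniform raw scores.
logits = np.zeros((2, 4))
mask = np.array([[True, True, False, False],
                 [False, False, True, True]])
attn = epipolar_guided_attention(logits, mask)
```

With uniform logits, attention mass concentrates on the on-line keys while off-line keys keep a small nonzero weight, reflecting the "light touch" rather than a hard geometric constraint.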

January 9, 2024

Hatice Köse, İstanbul Technical University

Affective Social Robots and Interaction Studies for Children with Disabilities

I will summarize our projects on affective social humanoid robots for education and health applications for children with disabilities. We develop affective modules to recognize the behaviours, attention, emotion, and stress of children based on data (physiological, facial, audio, and gaze) collected during their interaction with the robots and the sensory setup. We also work on affect-based serious games and exergames for children.

December 26, 2023

Tolga Birdal, Imperial College London

Topological Deep Learning: A New Hope for AI4Science

Topological deep learning is a rapidly growing field that pertains to the development of deep learning models for data supported on topological domains such as simplicial complexes, cell complexes, and hypergraphs, which generalize many domains encountered in scientific computations. In this talk, Tolga will present a unifying deep learning framework built upon an even richer data structure that includes widely adopted topological domains. Specifically, he will begin by introducing combinatorial complexes, a novel type of topological domain. Combinatorial complexes can be seen as generalizations of graphs that maintain certain desirable properties. Similar to hypergraphs, combinatorial complexes impose no constraints on the set of relations. In addition, combinatorial complexes permit the construction of hierarchical higher-order relations, analogous to those found in simplicial and cell complexes. Thus, combinatorial complexes generalize and combine useful traits of both hypergraphs and cell complexes, which have emerged as two promising abstractions that facilitate the generalization of graph neural networks to topological spaces. Second, building upon combinatorial complexes and their rich combinatorial and algebraic structure, Tolga will develop a general class of message-passing combinatorial complex neural networks (CCNNs), focusing primarily on attention-based CCNNs. He will additionally characterize permutation and orientation equivariances of CCNNs, and discuss pooling and unpooling operations within CCNNs. The performance of CCNNs on tasks related to mesh shape analysis and graph learning will be presented. The experiments demonstrate that CCNNs achieve competitive performance compared to state-of-the-art deep learning models specifically tailored to the same tasks. These findings demonstrate the advantages of incorporating higher-order relations into deep learning models and show great promise for AI4Science.
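As a rough illustration of message passing over cells of mixed rank, consider the toy sketch below. The example complex, the features, and the mean-aggregation update are my simplifications for intuition only; they are not the CCNN formulation from the talk.

```python
import numpy as np

# A toy higher-order complex: cells are named by id, each has a rank,
# and an incidence relation records "cell u is contained in cell v".
# (Illustrative only; the structure is made up for this example.)
cells = {"a": 0, "b": 0, "c": 0, "ab": 1, "bc": 1, "abc": 2}
incidence = [("a", "ab"), ("b", "ab"), ("b", "bc"), ("c", "bc"),
             ("ab", "abc"), ("bc", "abc")]

# Deterministic features: cell i gets the constant vector [i, i, i, i].
features = {c: np.full(4, float(i)) for i, c in enumerate(cells)}

def message_pass(features, incidence):
    """One step of rank-crossing message passing: each cell averages its
    own feature with those of incident cells, both up and down the rank
    hierarchy."""
    new = {}
    for c, f in features.items():
        msgs = [f]
        for u, v in incidence:
            if u == c:
                msgs.append(features[v])  # message from a higher-rank cell
            elif v == c:
                msgs.append(features[u])  # message from a lower-rank cell
        new[c] = np.mean(msgs, axis=0)
    return new

updated = message_pass(features, incidence)
```

Real CCNNs replace the plain mean with learned, attention-weighted aggregations per neighborhood type, but the key point survives in the sketch: information flows between cells of different ranks, not only between vertices.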

December 21, 2023

Dilara Keküllüoğlu, University of Edinburgh

Analysing User Behaviour and Assisting Users in Ever Advancing Technology

The rapidly changing technological landscape makes it difficult for people to understand the impacts of these advances and manage their boundaries. One such aspect is protecting user privacy on online social media platforms. People share a wide variety of information on social media, including personal and sensitive information, without understanding the size of their audience, which may cause privacy complications. The networked nature of the platforms further exacerbates these complications, as information can be shared without the information owner's control. People struggle to reach their intended audience using the privacy settings provided by the platforms. Researching user behaviours in these situations is essential to understanding and helping people protect their privacy. Another domain where technological advances trouble people is the increasing use of artificial intelligence decisions in potentially harmful situations. The reasoning processes of automated systems often remain unclear to the people interacting with them, and such systems may also harm people by making unjust decisions. There are no efficient means for people to challenge automated decisions and obtain proper restitution if necessary. It is imperative that people are given tools to understand and contest these automated decisions easily.

December 12, 2023

Maciej Besta, ETH Zurich

Chains, Trees, and Graphs of Thoughts: Demystifying Structure-Enhanced Prompting

The field of natural language processing has witnessed significant progress in recent years, with a notable focus on improving language models’ performance through innovative prompting techniques. Among these, structure-enhanced prompting has emerged as a promising paradigm, with designs such as Chain-of-Thought (CoT) or Tree of Thoughts (ToT), in which the LLM’s reasoning is guided by a structure such as a tree. In the first part of the talk, we overview this recent field, focusing on fundamental classes of harnessed structures, the representations of these structures, algorithms executed with these structures, relationships to other parts of the generative AI pipeline such as knowledge bases, and more. Second, we introduce Graph of Thoughts (GoT): a framework that advances prompting capabilities in LLMs beyond those offered by CoT or ToT. The key idea and primary advantage of GoT is the ability to model the information generated by an LLM as an arbitrary graph, where units of information (“LLM thoughts”) are vertices, and edges correspond to dependencies between these vertices. This approach enables combining arbitrary LLM thoughts into synergistic outcomes, distilling the essence of whole networks of thoughts, or enhancing thoughts using feedback loops. We illustrate that GoT offers advantages over the state of the art on different tasks such as keyword counting, while simultaneously reducing costs. We conclude by outlining research challenges in this fast-growing field.
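The thoughts-as-graph idea can be sketched with a minimal data structure. The class and method names below are illustrative stand-ins, not the GoT framework's real API, and the "LLM" outputs are hard-coded strings.

```python
from dataclasses import dataclass, field

@dataclass
class Thought:
    content: str
    parents: list = field(default_factory=list)  # dependency edges

class GraphOfThoughts:
    """Minimal sketch: thoughts are vertices, edges are dependencies,
    and aggregation merges several thoughts into one."""

    def __init__(self):
        self.thoughts = []

    def generate(self, content, parents=()):
        t = Thought(content, list(parents))
        self.thoughts.append(t)
        return t

    def aggregate(self, thoughts, combine):
        # Combine arbitrary thoughts into a single synergistic outcome;
        # the new vertex depends on all of its inputs.
        merged = combine([t.content for t in thoughts])
        return self.generate(merged, parents=thoughts)

    def refine(self, thought, improve):
        # Feedback loop: a refined thought depends on its previous version.
        return self.generate(improve(thought.content), parents=[thought])

# Example in the spirit of the keyword-counting task: split, solve
# per chunk, then aggregate the partial counts.
got = GraphOfThoughts()
a = got.generate("chunk 1: 3")
b = got.generate("chunk 2: 5")
total = got.aggregate(
    [a, b],
    combine=lambda cs: f"total: {sum(int(c.rsplit(': ', 1)[1]) for c in cs)}",
)
```

A chain (CoT) is the special case where every thought has one parent, and a tree (ToT) where thoughts branch but never merge; the graph form additionally permits the merge performed by `aggregate` above.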

December 5, 2023

Noah Snavely, Cornell University and Google Research

Modeling 3D Shape and Motion from Video

Computer vision and machine learning methods are getting really good at 3D reconstruction from 2D images. What they are not as good at is understanding, reconstructing, and generating scenes that are in motion. I’ll talk about recent work on new methods and scene representations that reconstruct and generate scenes that unfold over time.

November 28, 2023

Georgia Chalvatzaki, TU Darmstadt

Interactive Robot Perception and Learning for Mobile Manipulation

The long-standing ambition for autonomous, intelligent service robots that are seamlessly integrated into our everyday environments is yet to become a reality. Humans develop comprehension of their embodiments by interpreting their actions within the world and acting reciprocally to perceive it: the environment affects our actions, and our actions simultaneously affect our environment. Despite great advances in robotics and Artificial Intelligence (AI), e.g., through better hardware designs or algorithms incorporating advances in Deep Learning, we are still far from achieving robotic embodied intelligence. Attaining artificial embodied intelligence (intelligence that originates and evolves through an agent's sensorimotor interaction with its environment) is a topic of substantial scientific investigation and remains an open challenge. In this talk, I will walk you through our recent research works for endowing robots with spatial intelligence through perception and interaction to coordinate and acquire skills that are necessary for their promising real-world applications. In particular, we will see how we can use robotic priors for learning to coordinate mobile manipulation robots, how neural representations can allow for learning policies and safe interactions, and, at the crux, how we can leverage those representations to allow the robot to understand and interact with a scene, or guide it to acquire more “information” while acting in a task-oriented manner.

November 14, 2023

Kyunghyun Cho, New York University

Beyond Test Accuracies for Studying Deep Neural Networks

Already in 2015, Léon Bottou discussed the prevalence and end of the training/test experimental paradigm in machine learning. The machine learning community has, however, continued to stick to this paradigm until now (2023), relying almost exclusively on test-set accuracy, which is a rough proxy for the true quality of the machine learning system we want to measure. There are, however, many aspects of building a machine learning system that require more attention. Specifically, I will discuss three such aspects in this talk: (1) model assumption and construction, (2) optimization, and (3) inference. For model assumption and construction, I will discuss our recent work on generative multitask learning and incidental correlation in multimodal learning. For optimization, I will talk about how we can systematically study and investigate learning trajectories. Finally, for inference, I will lay out two consistencies that must be satisfied by a large-scale language model and demonstrate that most language models do not fully satisfy such consistencies.

November 7, 2023

Berfin Şimşek, New York University

Finite-Width Neural Networks: A Landscape Complexity Analysis

In this talk, I will present an average-case analysis of finite-width neural networks through permutation symmetry. First, I will give a new scaling law for the critical manifolds of finite-width neural networks, derived by counting all partitions due to neuron splitting from an initial set of neurons. Considering the invariance of zero-neuron addition, we derive the scaling law of the zero-loss manifolds, which is exact for the population loss. In a simplified setting, a factor of 2 log 2 of overparameterization guarantees that the zero-loss manifolds are the most numerous. Our complexity calculations show that the loss landscape of neural networks exhibits extreme non-convexity at the onset of overparameterization, which is tamed gradually with further overparameterization and effectively vanishes for infinitely wide networks. Finally, based on the theory, we will propose an ‘Expand-Cluster’ algorithm for model identification in practice.
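As a hedged illustration of the permutation symmetry underlying this counting (the notation below is mine, not from the talk):

```latex
% One-hidden-layer network with m hidden neurons:
f_\theta(x) = \sum_{i=1}^{m} a_i\, \sigma(w_i^\top x),
\qquad \theta = (a_1, w_1, \dots, a_m, w_m).

% Relabeling the neurons leaves the function unchanged:
f_\theta = f_{\pi \cdot \theta} \quad \text{for every permutation } \pi \in S_m,

% so any solution is replicated across the landscape (up to m! copies),
% and counting the distinct critical manifolds that arise from splitting
% n^* \le m underlying neurons among m slots reduces to a partition-counting
% problem over the neuron assignments.
```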

October 31, 2023

Edward Johns, Robot Learning Lab at Imperial College London

Images, Language, and Actions: The Three Ingredients of Robot Learning

Most of the major recent breakthroughs in AI have relied on training huge neural networks on huge amounts of data. But what about a breakthrough in real-world robotics? One of the challenges is that physical robotics data is very scarce, and very expensive to collect. To address this, my team and I have been developing very data-efficient methods for robots to learn new tasks through human demonstrations. Using these methods, we are now able to quickly teach robots a range of everyday tasks, such as hammering in a nail, inserting a plug into a socket, and scooping up an object with a spatula. However, even with these efficient methods, providing human demonstrations can be laborious. Therefore, we have also been exploring the use of off-the-shelf neural networks trained on web-scale data, such as OpenAI’s DALL-E and GPT, to act as a robot’s “imagination” or its “internal monologue” when solving new tasks. Through this talk, we will explore the importance of image, language, and action data in robotics, as the three ingredients for scalable robot learning.

October 17, 2023

Petar Veličković, Google DeepMind

Decoupling The Input Graph and The Computational Graph: The Most Important Unsolved Problem in Graph Representation Learning

When deploying graph neural networks, we often make a seemingly innocent assumption: that the input graph we are given is the ground-truth. However, as my talk will unpack, this is often not the case: even when the graphs are perfectly correct, they may be severely suboptimal for completing the task at hand. This will introduce us to a rich and vibrant area of graph rewiring, which is experiencing a renaissance in recent times. I will discuss some of the most representative works, including two of our own contributions (https://arxiv.org/abs/2210.02997, https://arxiv.org/abs/2306.03589), one of which won the Best Paper Award at the Graph Learning Frontiers Workshop at NeurIPS’22.

October 10, 2023

Eunsol Choi, University of Texas at Austin

Knowledge Augmentation for Language Models

Modern language models have the capacity to store and use immense amounts of knowledge about the real world. Yet, their knowledge about the world is often incorrect or outdated, motivating ways to augment their knowledge. In this talk, I will present two complementary avenues for knowledge augmentation: (1) a modular, retrieval-based approach, which brings in new information at inference time, and (2) a parameter-updating approach, which aims to enable models to internalize new information and make inferences based on it.
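The first avenue, retrieval at inference time, can be sketched in a few lines. Everything here is a toy stand-in: the character-count "embedding" replaces a real learned encoder, and the corpus and prompt format are invented for illustration.

```python
import numpy as np

def embed(text):
    """Toy bag-of-letters embedding standing in for a learned encoder."""
    v = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - ord("a")] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def retrieve(query, corpus, k=1):
    """Modular retrieval: rank documents by cosine similarity to the
    query and return the top k, to be injected into the model's input."""
    q = embed(query)
    scored = sorted(corpus, key=lambda d: -float(embed(d) @ q))
    return scored[:k]

corpus = ["the capital of france is paris",
          "photosynthesis converts light to energy"]
docs = retrieve("what is the capital of france", corpus, k=1)
# New information reaches the model through its input, with no
# parameter updates:
prompt = f"Context: {docs[0]}\nQuestion: what is the capital of france"
```

The parameter-updating avenue would instead fine-tune the model so the new fact is answerable without any retrieved context; the trade-off is modularity and easy refresh (retrieval) versus internalized knowledge usable in downstream inferences (updating).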