April 30, 2024

Murat Erdogdu, University of Toronto

Feature Learning in Two-layer Neural Networks: The Effect of Data Covariance

We study the effect of gradient-based optimization on feature learning in two-layer neural networks. We consider a setting where the number of samples is of the same order as the input dimension and show that, when the input data is isotropic, gradient descent always improves upon the initial random features model in terms of prediction risk, for a certain class of targets. Further leveraging the practical observation that data often contains additional structure, i.e., the input covariance has non-trivial alignment with the target, we prove that the class of learnable targets can be significantly extended, demonstrating a clear separation between kernel methods and two-layer neural networks in this regime.
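The setup described above — a two-layer network where one gradient step on the first layer is compared against a fixed random-features model — can be sketched numerically. This is a minimal illustration, not the paper's exact setting: the single-index ReLU target, its direction `u`, the dimensions, the ridge penalty, and the step size are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 50, 500, 200                    # input dim, samples, hidden width
u = np.zeros(d); u[0] = 1.0               # hypothetical target direction
X = rng.standard_normal((n, d))           # isotropic inputs
y = np.maximum(X @ u, 0.0)                # target: ReLU single-index model
Xte = rng.standard_normal((2000, d))
yte = np.maximum(Xte @ u, 0.0)

relu = lambda z: np.maximum(z, 0.0)

def rf_risk(W, lam=1e-3):
    """Fit the second layer by ridge regression on features relu(X W^T),
    then report the test prediction risk (mean squared error)."""
    F = relu(X @ W.T)
    a = np.linalg.solve(F.T @ F + lam * np.eye(m), F.T @ y)
    return np.mean((relu(Xte @ W.T) @ a - yte) ** 2)

W0 = rng.standard_normal((m, d)) / np.sqrt(d)   # random first layer
risk_random = rf_risk(W0)                        # random-features baseline

# One gradient step on the first layer (second layer frozen at a random a0),
# after which the second layer is refit exactly as in the baseline.
a0 = rng.standard_normal(m) / np.sqrt(m)
resid = relu(X @ W0.T) @ a0 - y
grad = ((resid[:, None] * (X @ W0.T > 0)) * a0).T @ X / n
risk_trained = rf_risk(W0 - 5.0 * grad)

print(risk_random, risk_trained)
```

Both models share the same second-layer fitting procedure, so any difference in risk isolates the effect of the single first-layer update.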

March 26, 2024

Georg Martius, University of Tübingen

Machine Learning for Autonomously Learning Robots

I am driven by the question of how robots can autonomously develop skills to become versatile helpers for humans. Considering children, it seems natural that they have their own agenda. They playfully explore their environment, without the necessity for somebody to tell them exactly what to do next. Replicating such flexible learning in machines is highly challenging. I will present my research on different machine learning methods as steps towards solving this challenge. Part of my research is concerned with artificial intrinsic motivations — their mathematical formulation and embedding into learning systems. Equally important is learning the right representations and internal models, and I will show how powerful intrinsic motivations can be derived from learned models. With model-based reinforcement learning and planning methods, I show how we can achieve active exploration and playful robots, but also safety-aware behavior. A particularly fascinating feature is that these learning-by-playing systems are able to perform well on unseen tasks zero-shot.

March 19, 2024

Sebastian Risi, IT University of Copenhagen

Growing Adaptive and Self-Assembling Machines

Despite all their recent advances, current AI methods are often still brittle and fail when confronted with unexpected situations. By incorporating collective intelligence ideas, we have recently been able to create neural networks that self-organize their weights for fast adaptation, machines that can recognize their own shape, and machines that self-assemble through local interactions alone. Additionally, in this talk I will present initial results from my GROW-AI ERC project, where we are developing neural networks that grow through a developmental process that mirrors key properties of embryonic development in biological organisms. The talk concludes with future research opportunities and challenges that we need to address to best capitalize on the same ideas that allowed biological intelligence to thrive.

March 12, 2024

Ahmet Üstün, Cohere For AI

Aya: An Open Science Initiative to Accelerate Multilingual AI Progress

Access to cutting-edge breakthroughs in large language models (LLMs) has been limited to speakers of only a few, primarily English, languages. The Aya project aimed to change that by focusing on accelerating multilingual AI through an open-source initiative. This initiative resulted in a state-of-the-art multilingual instruction-tuned model and the largest multilingual instruction collection. Built by 3,000 independent researchers across 119 countries, the Aya collection is the largest of its kind, crafted through templating and translating existing NLP datasets across 114 languages. As part of this collection, the Aya dataset is the largest collection of original annotations from native speakers worldwide, covering 65 languages. Finally, trained on a diverse set of instruction mixtures, including the Aya collection and dataset, the Aya model is a multilingual language model that can follow instructions in 101 languages, achieving state-of-the-art performance in various multilingual benchmarks.

March 05, 2024

Preslav Nakov, Mohamed bin Zayed University of Artificial Intelligence

Factuality Challenges in the Era of Large Language Models

We will discuss the risks, the challenges, and the opportunities that Large Language Models (LLMs) bring regarding factuality. We will then delve into our recent work on using LLMs to assist fact-checking (e.g., claim normalization, stance detection, question-guided fact-checking, program-guided reasoning, and synthetic data generation for fake news and propaganda identification), on checking and correcting the output of LLMs, on detecting machine-generated text (black-box and white-box), and on fighting the ongoing misinformation pollution with LLMs. Finally, we will discuss work on safeguarding LLMs, and the safety mechanisms we incorporated in Jais-chat, the world's best open Arabic-centric foundation and instruction-tuned LLM.

February 27, 2024

Berk Canberk, Edinburgh Napier University

Real-Time Digital Twin System in 6G Era

With the tremendous advances in the upcoming 6G era, the world enters an age of connected intelligence. This will enable real-time Digital Twin deployments with specific integrated features such as seamless automation and control, augmented reality/virtual reality, visualization, and more connected devices per square kilometer. This new knowledge-based era of the 6G vision needs seamless control systems, fully automated management, AI-enabled communication, well-established computing methodologies, self-organizing behaviors, and high-end connectivity. Here, Digital Twin (DT)-based systems become vital. A DT is the virtual representation of a Cyber-Physical System’s network elements and dynamics. The use of DTs provides unique advantages such as resiliency, sustainability, real-time monitoring, control-tower-based management, thorough what-if analyses, and an extremely high-performance simulation model for research, testing, and optimization. With these in mind, in this talk, first, a short recap of the AI-enabled Digital Twin concept and its potential market size in Industry 4.0 will be introduced. The technology behind DTs, such as high-precision virtual network modeling and edge intelligence for ultra-low latency, will then be described. The reliability, latency, capacity, and connectivity issues in DTs will be discussed. Moreover, several application areas of DTs will also be underlined in terms of demand forecasting, warehouse automation, predictive maintenance, anomaly detection, risk assessment, intelligent scheduling, and control towers. Some important implementation areas of DTs such as Supply Chain Management, Smart Manufacturing, Sustainable Product Line Management, Healthcare, and Smart Cities will also be covered.