Nov 24, 2022
Utku Günay Acer, Nokia Bell Labs
This talk presents SensiX, a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices such as cameras, microphones, and IoT sensors. SensiX externalises data operations into highly modular components with clear abstractions and uses a document-centric manifest for system-wide orchestration. This design lets it serve multiple models efficiently, with fine-grained control on edge devices, while minimising redundant data operations, managing data and device heterogeneity, reducing resource contention, and removing manual MLOps.
A particular deployment of SensiX is Lingo, an urban conversational agent. Lingo is a hyper-local conversational agent embedded deeply into the urban infrastructure that provides rich, purposeful, detailed, and in some cases playful information relevant to a neighbourhood. Lingo serves hyper-local responses to user queries, with SensiX acting as the information source that computes them. Queries are exchanged through a covert communication mechanism over Wi-Fi management frames, enabling privacy-preserving proxemic interactions.
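The talk does not detail the frame format, but one plausible way to carry a short query covertly in Wi-Fi management frames is inside a vendor-specific information element (IE, element ID 221). The sketch below packs and unpacks such an IE; the OUI and framing are illustrative assumptions, not Lingo's actual protocol.

```python
import struct

# Hypothetical OUI for illustration; a real deployment would use a registered one.
OUI = bytes([0x00, 0x11, 0x22])
VENDOR_IE_ID = 221  # 0xDD: vendor-specific information element

def encode_query(query: str) -> bytes:
    """Pack a short query into a vendor-specific IE: id, length, OUI, payload."""
    payload = OUI + query.encode("utf-8")
    if len(payload) > 255:
        raise ValueError("IE payload is limited to 255 bytes")
    return struct.pack("BB", VENDOR_IE_ID, len(payload)) + payload

def decode_query(ie: bytes) -> str:
    """Inverse of encode_query: validate the IE header and recover the text."""
    ie_id, length = struct.unpack_from("BB", ie)
    assert ie_id == VENDOR_IE_ID
    body = ie[2:2 + length]
    assert body[:3] == OUI
    return body[3:].decode("utf-8")
```

Because stations broadcast and overhear management frames without associating to a network, such an IE can be read in passing, which is what makes proxemic, infrastructure-free interaction possible.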
Nov 8, 2022
Chi-Chun Lee (Jeremy), National Tsing Hua University
Speech technology has proliferated into our lives, and speech emotion recognition (SER) modules add a humane aspect to the widespread use of speech-based services. Deep learning techniques play a key role in making SER practical for real-life applications. In this talk, we will briefly cover three main components of using deep models for SER: robustness, generalization, and usability, and share several of our recent developments in each.
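One standard way to work on the robustness component is to corrupt training audio with noise at a controlled signal-to-noise ratio. The sketch below is a generic augmentation in that spirit (stdlib only, illustrative defaults), not the speaker's specific method.

```python
import math
import random

def add_noise_at_snr(signal, snr_db, rng=None):
    """Corrupt a waveform (list of floats) with white Gaussian noise at a
    target SNR in dB -- a common robustness augmentation for SER training."""
    rng = rng or random.Random(0)
    power = sum(x * x for x in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))  # SNR_dB = 10 * log10(P_s / P_n)
    scale = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, scale) for x in signal]
```

Training on clean speech plus copies corrupted at several SNR levels typically makes an SER model less sensitive to recording conditions at deployment time.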
Nov 1, 2022
Mehmet Esat Belviranli, Colorado School of Mines
Computing systems are becoming more complex by integrating specialized processing units, i.e., accelerators, that are optimized to perform a specific type of operation. This demand is fueled by the need to run distinct workloads in mobile and autonomous platforms. Such systems often embed diversely heterogeneous systems-on-chip (SoCs), where an operation can be executed by more than one type of accelerator, each with different performance, energy, and latency characteristics. A hybrid (i.e., multi-accelerator) execution of popular workloads, such as neural network (NN) inference, collaboratively and concurrently on different types of accelerators in a diversely heterogeneous SoC is a relatively new and unexplored scheme. Multi-accelerator execution has the potential to provide unique benefits for computing systems with limited resources. In this talk, we investigate a framework that enables resource-constraint-aware multi-accelerator execution for diversely heterogeneous SoCs. We achieve this by distributing the layers of an NN inference across different accelerators so that the trade-off between performance and energy satisfies system constraints. We further explore improving total throughput by concurrently using different types of accelerators to execute NNs in parallel. Our proposed methodology uniquely considers inter-accelerator transition costs, shared-memory contention, and accelerator architectures that embed internal hardware pipelines. We employ empirical performance models and constraint-based optimization problems to determine optimal multi-accelerator execution schedules.
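To make the optimization problem concrete, here is a toy version of the core idea: assign each NN layer to one of several accelerators so that total energy is minimized under a latency budget, charging a fixed cost whenever consecutive layers transition between accelerators. This brute-force sketch (with made-up per-layer numbers) only illustrates the trade-off; the talk's framework uses empirical performance models and proper constraint solvers.

```python
import itertools

def schedule(layers, latency_budget, transition_ms=1.0):
    """layers: list of dicts mapping accelerator name -> (latency_ms, energy_mj).
    Returns (total_energy, assignment) minimizing energy under the latency
    budget, or None if no assignment fits. Exhaustive search: toy sizes only."""
    accs = list(layers[0].keys())
    best = None
    for assign in itertools.product(accs, repeat=len(layers)):
        lat = sum(layers[i][a][0] for i, a in enumerate(assign))
        # Charge a transition cost whenever execution moves between accelerators.
        lat += transition_ms * sum(1 for a, b in zip(assign, assign[1:]) if a != b)
        if lat > latency_budget:
            continue
        energy = sum(layers[i][a][1] for i, a in enumerate(assign))
        if best is None or energy < best[0]:
            best = (energy, assign)
    return best
```

With a loose budget the schedule stays on the low-power accelerator; tightening the budget forces some layers onto the fast, power-hungry one, paying transition costs along the way, which is exactly the trade-off the framework navigates at scale.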
Oct 25, 2022
Erol Şahin, METU-ROMER (Center for Robotics and AI)
Industrial robots, a shining example of the success of robotics in the manufacturing domain, were developed as manipulators without any support for human-robot interaction (HRI). However, a new generation of manipulators, called collaborative robots (Cobots), designed with embedded safety features, are being deployed to operate alongside humans. These advances are pushing HRI research, most of which is conducted on “toy robots that do not do much work,” towards deployment on Cobots. In our two TUBITAK projects, called CIRAK and KALFA, we study how Cobots can be imbued with HRI capabilities in a collaborative assembly task. Based on the observation that the manipulation skills of Cobots are, and will remain in the near future, inferior to those of workers, we envision Cobots positioning themselves as unskilled coworkers (hence the names CIRAK and KALFA) that hand the proper tools and parts to the worker. In this talk, I will summarize our work towards imbuing Cobots with HRI skills through the use of animation principles, realized in behaviors such as “breathing” and “gazing”, as well as through automatic assembly learning. Finally, I will briefly share recent developments at METU-ROMER.
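As a flavour of how an animation principle becomes a robot behavior, a “breathing” effect can be as simple as a slow sinusoidal offset added to an idle joint. The parameters below are illustrative guesses, not the projects' actual values.

```python
import math

def breathing_offset(t, period_s=4.0, amplitude_rad=0.02):
    """Small sinusoidal joint offset (radians) at time t (seconds) that makes
    an idle cobot appear to 'breathe' -- one animation-principle behavior."""
    return amplitude_rad * math.sin(2 * math.pi * t / period_s)
```

Superimposing such an offset on the idle pose signals “aliveness” to the human coworker without interfering with the task-level motion.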
Oct 18, 2022
Mehmet Doğar, University of Leeds
I will give an overview of our work on robotic object manipulation. First, I will talk about physics-based planning. This refers to robot motion planners that use predictions about the motion of contacted objects. We have particularly been interested in developing such planners for cluttered scenes, where multiple objects might simultaneously move as a result of robot contact. Second, I will talk about a more conventional grasping-based problem, where a robot must manipulate an object for the application of external forceful operations on it. Imagine a robot holding and moving a wooden board for a human, while the human drills holes into the board and cuts parts of it. I will describe our efforts in developing a planner that addresses the geometric, force stability, and human-comfort constraints for such a system.
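The defining feature of physics-based planning is that the planner predicts the motion of contacted objects instead of treating them as obstacles. The 1-D, quasi-static toy below captures only that core idea (a robot sweep that can shove a chain of objects, so multiple objects move from one contact); real planners use full physics models, and this is not the speaker's actual system.

```python
def simulate_push(robot_x, objects, push_dist):
    """Predict object motion for a 1-D push. objects: list of (x, width),
    robot sweeps right from robot_x by push_dist. An object reached by the
    sweep front is pushed to it; its back edge may then push the next object."""
    result = []
    front = robot_x + push_dist
    for x, w in sorted(objects):
        if front > x:          # robot (or a previously pushed object) reaches it
            x = front
            front = x + w      # its back edge becomes the new push front
        else:
            front = x          # gap: nothing farther can be reached
        result.append((x, w))
    return result
```

A planner can roll such predictions forward for candidate motions and pick one whose predicted object displacements satisfy the task, which is what makes cluttered scenes tractable.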
Oct 4, 2022
Zeyu Wang, Hong Kong University of Science and Technology
Despite advances in computer-aided design (CAD) systems and video editing software, digital content creation for design, storytelling, and interactive experiences remains a challenging problem. This talk introduces a series of studies, techniques, and systems along three thrusts that engage creators more directly and enhance the user experience in authoring digital content. First, we present a drawing dataset and spatiotemporal analysis that provide insight into how people draw by comparing tracing, freehand drawing, and computer-generated approximations. We found a high degree of similarity in stroke placement and types of strokes used over time, which informs methods for customized stroke treatment and emulating drawing processes. We also propose a deep learning-based technique for line drawing synthesis from animated 3D models, where our learned style space and optimization-based embedding enable the generation of line drawing animations while allowing interactive user control across frames. Second, we demonstrate the importance of utilizing spatial context in the creative process in augmented reality (AR) through two tablet-based interfaces. DistanciAR enables designers to create site-specific AR experiences for remote environments using LiDAR capture and new authoring modes, such as Dollhouse and Peek. PointShopAR integrates point cloud capture and editing in a single AR workflow to help users quickly prototype design ideas in their spatial context. Our user studies show that LiDAR capture and the point cloud representation in these systems can make rapid AR prototyping more accessible and versatile. Last, we introduce two procedural methods to generate time-based media for visual communication and storytelling. AniCode supports authoring and on-the-fly consumption of personalized animations in a network-free environment via a printed code. CHER-Ob generates video flythroughs for storytelling from annotated heterogeneous 2D and 3D data for cultural heritage. Our user studies show that these methods can benefit the video-oriented digital prototyping experience and facilitate the dissemination of creative and cultural ideas.
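Comparing “similarity in stroke placement” between, say, a traced and a freehand stroke needs a distance between point sequences. A simple, generic choice is the symmetric Chamfer distance sketched below; this is an illustrative metric, not necessarily the one used in the dataset analysis.

```python
import math

def chamfer(a, b):
    """Symmetric Chamfer distance between two strokes, each a list of (x, y)
    points: average nearest-neighbor distance, taken in both directions."""
    def one_way(p, q):
        return sum(min(math.dist(u, v) for v in q) for u in p) / len(p)
    return 0.5 * (one_way(a, b) + one_way(b, a))
```

Aggregating such per-stroke distances over the course of a drawing gives a spatiotemporal similarity profile between tracing, freehand, and computer-generated strokes.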
In recent years, model sizes have increased substantially, and so has the cost of training them. This is problematic for two reasons: 1) it excludes organizations that do not have thousands of GPUs at hand for training such models, and 2) it is becoming apparent that hardware will not be able to scale along with the growth of the models. Both problems can be alleviated by improving the efficiency of NLP models. This talk will first provide an overview of where efficiency can be improved within a typical NLP pipeline. We will then take a closer look at methods that improve data efficiency. Finally, we will discuss how to quantify efficiency using different kinds of metrics.
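Because no single number captures efficiency, one common way to quantify it is to compare models on a cost/quality Pareto frontier: a model is efficient if no other model is both cheaper and at least as accurate. The sketch below computes that frontier for a toy set of models (the numbers are made up for illustration).

```python
def pareto_front(models):
    """models: dict name -> (cost, accuracy). Return (sorted) names of models
    on the Pareto frontier: no other model has cost <= and accuracy >= with
    at least one strict inequality."""
    front = []
    for name, (cost, acc) in models.items():
        dominated = any(c <= cost and a >= acc and (c < cost or a > acc)
                        for n, (c, a) in models.items() if n != name)
        if not dominated:
            front.append(name)
    return sorted(front)
```

Here “cost” can stand for FLOPs, GPU-hours, or energy; reporting the frontier rather than a single accuracy number makes the efficiency trade-off explicit.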