Funded by: Google
Dates: 2021-2022
Principal Investigators: A. Erdoğan and D. Yuret
Funded by: STINT
Dates: 2021-2022
Researchers: R. Lowe (PI), E. Erzin, M. Sezgin, and Y. Yemez
By combining Differential Outcome Training (DOT), i.e. the use of unique stimulus-response pairings, with feedback from a simulated humanoid robot (SDK), this study aims to improve subjects' learning performance on a gamified visuospatial memory task. The SDK serves as the interface for an algorithm that provides reinforcing audiovisual feedback, promoting engagement in a new dyadic setup for the gamified task. It additionally strengthens the differential outcome effect produced by the task's reward system, which in turn improves learning. Learning performance was measured as the percentage of correct responses on the gamified task, while iMotions software, eye tracking, and the Self-Assessment Manikin scale revealed affective cues and cognitive strategies. A total of 60 subjects participated in the experiment, and the results showed that subjects playing the gamified task in the DOT condition performed significantly better than those who did not. Qualitative analysis of the affective data revealed that participants assisted by the SDK reported a greater feeling of control and higher levels of valence when combined with DOT. The eye-tracking data revealed different cognitive strategies, with a strategy relying on peripheral vision showing the strongest correlation with higher memory performance. The results imply that combining DOT with social-robot feedback may be an effective way to improve learning, a finding of particular significance for potential future clinical applications of the setup in memory training for patients with dementia and mild cognitive impairment.
Funded by: Koç University Seed Fund
Dates: 2020-2022
Researchers: I. Uytterhoeven (PI), F. Güney
As a proof of concept, this project, proposed by Assoc. Prof. Dr. Inge Uytterhoeven (Department of Archaeology and History of Art) in collaboration with Assist. Prof. Dr. Fatma Güney (Department of Computer Science and Engineering), intends to model the collapse of ancient structures caused by earthquakes, combining research approaches from Archaeology, Architecture, Computer Engineering, Archaeoseismology, Conservation, and Cultural Heritage, and taking the archaeological site of Sagalassos (Ağlasun, Burdur) as a test case. The project aims to develop a large number of realistic simulations of the distortion, displacement, and toppling of building elements of a set of ancient structures with different architectural characteristics at Sagalassos. In this way, it intends to offer an innovative methodology for learning the physical dynamics that cause collapse during seismic events. Moreover, we hope to discriminate between various seismic events that may have followed each other through time, as well as to distinguish earthquake damage from other processes of structural decay that affected ancient structures, such as the salvaging of building materials for recycling or gradual natural decay. Furthermore, the simulations aim to contribute to the fields of conservation and anastylosis by giving insights into the position, orientation, and extent of collapsed building elements in relation to the structures they belonged to, and into the impact of future earthquakes on rebuilt structures. Finally, the project aims to contribute to the visualisation, for the broad public, of the effects of seismic activity on ancient urban societies.
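As an illustration of the kind of physics-based simulation the project envisions, the sketch below topples a simple stack of rigid blocks under a sinusoidal horizontal excitation using the open-source PyBullet engine. Everything here is an assumption for illustration only: the block dimensions, masses, and excitation parameters are arbitrary, and the project's actual simulations of Sagalassos structures are far more detailed.

```python
import math
import pybullet as p
import pybullet_data

# Minimal rigid-body sketch: a free-standing "column" of stacked blocks
# shaken by a sinusoidal horizontal pseudo-force (stand-in for ground motion).
p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.setTimeStep(1.0 / 240.0)
p.loadURDF("plane.urdf")

block_half = [0.4, 0.4, 0.3]                       # assumed block half-extents (m)
shape = p.createCollisionShape(p.GEOM_BOX, halfExtents=block_half)
blocks = []
for i in range(5):                                 # five drums stacked as a column
    z = block_half[2] * (2 * i + 1)
    blocks.append(p.createMultiBody(baseMass=400.0,
                                    baseCollisionShapeIndex=shape,
                                    basePosition=[0.0, 0.0, z]))

freq, peak_acc = 2.0, 4.0                          # assumed excitation: 2 Hz, 4 m/s^2
for step in range(240 * 10):                       # simulate 10 seconds
    t = step / 240.0
    acc = peak_acc * math.sin(2.0 * math.pi * freq * t)
    for b in blocks:                               # pseudo-force emulating base excitation
        p.applyExternalForce(b, -1, [400.0 * acc, 0.0, 0.0], [0.0, 0.0, 0.0], p.LINK_FRAME)
    p.stepSimulation()

# Record final resting poses: the raw material for comparing simulated and excavated debris.
for b in blocks:
    pos, orn = p.getBasePositionAndOrientation(b)
    print(b, [round(c, 2) for c in pos])
p.disconnect()
```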
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2020-2022
Researchers: M. Sezgin (PI), A. Sabuncuoğlu
Funded by: Google LLC.
Dates: 2020-2021
Principal Investigators: A. Erdoğan and D. Yuret
Funded by: European Research Council
Dates: 2017-2021
Principal Investigators: E. Yörük and D. Yuret
Can we say that emerging market economies are developing a new welfare regime? If so, what has caused this?
This project has two hypotheses:
Hypothesis 1: China, Brazil, India, Indonesia, Mexico, South Africa and Türkiye are forming a new welfare regime that differs from the liberal, corporatist and social democratic welfare regimes of the global north on the basis of expansive and decommodifying social assistance programmes for the poor.
Hypothesis 2: This new welfare regime is emerging principally as a response to the growing political power of the poor as a dual source of threat and support for governments.
The project challenges and expands the state of the art in three different literatures by developing novel concepts and approaches.
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2019-2021
Principal Investigator: D. Unat
The aim of the project is to enable effortless parallelization of deep neural networks. Hybrid-parallel approaches that blend data and model parallelism, especially the model-parallel approach, will be applied automatically to the model, and various program improvements will be performed. The project will develop a series of optimization techniques that allow the devices in the underlying hardware system to be used efficiently, without any code changes, according to the topology and the structure of the deep neural network being trained. The suggested improvements will be implemented on popular deep learning frameworks such as TensorFlow and MXNet, which represent deep neural network models as data-flow graphs.
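For context, the sketch below shows the data-parallel baseline that TensorFlow already exposes through tf.distribute.MirroredStrategy; the project goes beyond this by automating hybrid data/model parallelism without such explicit user code. The model and dataset here are placeholders chosen only to keep the example self-contained.

```python
import tensorflow as tf

# Data parallelism: replicate the model on every visible GPU and split each batch.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# The global batch is split across replicas automatically.
model.fit(x_train, y_train, batch_size=256, epochs=1)
```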
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2018-2021
Principal Investigator: Ç. Başdoğan
Capacitive touch screens are an indispensable part of today's smartphones, tablets, kiosks, and laptop computers. They detect our finger position and enable us to interact with the text, images, and data displayed by these devices. To further improve these interactions, there is growing interest in the research community in displaying active tactile feedback to users through capacitive screens. One approach followed for this purpose is to control the friction force between the user's finger pad and the screen via electrostatic actuation. If an alternating voltage is applied to the conductive layer of a touch screen, an attraction force is generated between the finger and its surface. This force modulates the friction between the surface and the skin of the finger moving on it. Hence, one can generate different haptic effects on a touch screen by controlling the amplitude, frequency, and waveform of this input voltage. These haptic effects could be used to develop new intelligent user interfaces for applications in education, data visualization, and digital games. However, this area of research is new, and we do not yet fully understand the electromechanical interactions between the human finger and a touch screen actuated by electrostatic forces, or the effect of these interactions on our haptic perception. Hence, the aim of this project is to investigate in depth the electromechanical interactions between the human finger and an electrostatically actuated touch screen. In particular, we will investigate the effect of the following factors on the frictional forces between the finger and the screen: a) the amplitude of the voltage applied to the conductive layer of the touch screen, b) the normal force applied by the finger on the touch screen, and c) the finger speed. The results of this study will not only enable us to better understand, from a scientific point of view, the physics of the interactions between the human finger and a touch screen actuated by electrostatic forces, but will also provide guidelines on how to program a touch screen to generate desired haptic effects for various applications.
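The friction-modulation mechanism described above can be sketched with a textbook parallel-plate approximation of the electrostatic attraction force, F_e ≈ ε0·εr·A·V²/(2d²), added to the normal load in a Coulomb friction model. The parameter values below are rough assumed orders of magnitude, not measurements from this project, and the real finger-screen interface is considerably more complex (skin impedance, moisture, multiple dielectric layers).

```python
import numpy as np

# Hypothetical, order-of-magnitude parameters (not measured values from this project).
EPS0 = 8.854e-12        # vacuum permittivity, F/m
eps_r = 3.0             # assumed relative permittivity of insulator + outer skin layer
area = 1.0e-4           # assumed finger-screen contact area, m^2
gap = 1.0e-5            # assumed effective dielectric gap, m
mu = 0.6                # assumed dry finger-glass friction coefficient
f_normal = 0.5          # normal force applied by the finger, N

def electroadhesion_force(voltage):
    """Parallel-plate approximation of the electrostatic attraction force."""
    return EPS0 * eps_r * area * voltage**2 / (2.0 * gap**2)

def friction_force(voltage, normal_force=f_normal):
    """Coulomb friction with the electrostatic force added to the normal load."""
    return mu * (normal_force + electroadhesion_force(voltage))

# Sinusoidal excitation: note that the V^2 term doubles the perceived modulation frequency.
t = np.linspace(0.0, 0.02, 1000)                 # 20 ms window
v = 200.0 * np.sin(2 * np.pi * 125.0 * t)        # 125 Hz, 200 V amplitude
print("Friction range: %.3f N to %.3f N" % (friction_force(v).min(), friction_force(v).max()))
```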
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2018-2021
Principal Investigator: B. Akgün
Robots and related component technologies are becoming more capable, affordable, and accessible. With the advent of safe collaborative robotic arms, the number of "cage-free" robots is increasing. However, as they become more ubiquitous, the range of tasks and environments they face grows more complex. Many of these environments, such as households, machine shops, hospitals, and schools, contain people with a wide range of preferences, expectations, assumptions, and levels of technological savviness. Future robot users will want to customize their robots' behavior and add new behaviors. Thus it is not practical to program robots for all the scenarios that they will face once deployed. The field of Learning from Demonstration (LfD) emerged as an answer to this challenge, with the vision of programming robots through demonstrations of the desired behavior instead of explicit programming. Most existing LfD approaches learn a new skill from scratch, but robots will inevitably be expected to perform many skills, and after a certain point teaching each skill this way becomes tedious. Instead, the robot should transfer knowledge from the skills it has already learned. The aim of this project is to learn robotic skills from non-robotics experts and to use previously learned skills either to speed up learning or to increase generalization. Towards this end, the project investigates three topics: (1) designing a joint action-goal model to facilitate transfer learning, (2) feature learning for skill transfer, and (3) improving existing interaction modes for LfD or developing new ones for transfer learning.
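As a toy illustration of learning a skill from a handful of kinesthetic demonstrations, the sketch below time-normalizes demonstrated joint trajectories, fits a per-timestep Gaussian, and then reuses the learned mean to warm-start a related skill. This is only a simple baseline under assumed synthetic data; it is not the joint action-goal model or the feature-transfer methods the project itself proposes.

```python
import numpy as np

def resample(traj, n=50):
    """Time-normalize a demonstration (T x D joint trajectory) to n samples."""
    t_old = np.linspace(0.0, 1.0, len(traj))
    t_new = np.linspace(0.0, 1.0, n)
    return np.column_stack([np.interp(t_new, t_old, traj[:, d]) for d in range(traj.shape[1])])

def learn_skill(demos, n=50):
    """Fit a per-timestep Gaussian over aligned demonstrations: (mean, std)."""
    aligned = np.stack([resample(d, n) for d in demos])   # (num_demos, n, D)
    return aligned.mean(axis=0), aligned.std(axis=0)

# Three noisy demonstrations of a 2-DOF reaching motion (synthetic stand-ins).
rng = np.random.default_rng(0)
base = np.column_stack([np.linspace(0, 1, 80), np.sin(np.linspace(0, np.pi, 80))])
demos = [base + 0.02 * rng.standard_normal(base.shape) for _ in range(3)]

mean_traj, std_traj = learn_skill(demos)

# A crude form of transfer: initialize a new, related skill from the learned mean
# instead of starting from scratch, then refine it with a handful of new demonstrations.
new_skill_init = mean_traj.copy()
print(new_skill_init.shape, std_traj.mean())
```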
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2018-2021
Principal investigator: Ç. Başdoğan
In the near future, humans and robots are expected to perform collaborative tasks involving physical interaction in environments as varied as homes, hospitals, and factories. One important research topic in physical Human-Robot Interaction (pHRI) is developing natural haptic communication between the partners. Although there is already a large body of work on human-robot interaction, the number of studies investigating the physical interaction between the partners, and in particular haptic communication, is limited, and the interaction in such systems is still artificial compared to natural human-human collaboration. Although collaborative tasks involving physical interaction, such as assembly/disassembly of parts and transportation of an object, can be planned and executed naturally and intuitively by two humans, there are unfortunately no robots on the market that can collaborate with us and perform the same tasks. In this project, we propose fractional order adaptive control for pHRI systems. The main goal of the project is to adapt the admittance parameters of the robot in real time during the task, based on changes in human and environment impedances, while balancing the trade-off between the stability and the transparency of the coupled system. To the best of our knowledge, there is no earlier study in the literature utilizing a fractional order admittance controller for pHRI. Compared to an integer order controller, a fractional order controller enables the use of fractional order derivatives and integrators, which brings flexibility in modeling and controlling the dynamics of the physical interaction between the human operator and the robot. Moreover, there is no study in the literature investigating the real-time adaptation of the control parameters of a fractional order admittance controller via machine learning algorithms. Machine learning algorithms will enable us to learn from data iteratively to estimate human intention during the task and then select control parameters accordingly to optimize task performance.
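A minimal sketch of the core idea, assuming a single degree of freedom, a fixed fractional order, and illustrative parameters (no machine-learned adaptation), is a fractional-order admittance law m·D^α v(t) + b·v(t) = F_h(t) discretized with the Grünwald-Letnikov approximation:

```python
import numpy as np

class FractionalAdmittance:
    """
    Minimal sketch of a fractional-order admittance law
        m * D^alpha v(t) + b * v(t) = F_h(t)
    discretized with the Grunwald-Letnikov approximation of D^alpha.
    All parameters are illustrative, not tuned values from this project.
    """

    def __init__(self, m=10.0, b=25.0, alpha=0.8, dt=0.001, memory=500):
        self.m, self.b, self.alpha, self.dt = m, b, alpha, dt
        # GL binomial weights: w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k)
        w = [1.0]
        for k in range(1, memory):
            w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
        self.w = np.array(w)
        self.v_hist = np.zeros(memory)   # v[n-1], v[n-2], ...

    def step(self, f_human):
        """Map the measured interaction force to a commanded velocity."""
        h_a = self.dt ** (-self.alpha)
        history = h_a * np.dot(self.w[1:], self.v_hist[:-1])
        v = (f_human - self.m * history) / (self.m * h_a + self.b)
        self.v_hist = np.roll(self.v_hist, 1)
        self.v_hist[0] = v
        return v

ctrl = FractionalAdmittance()
for f in [5.0] * 100:                 # constant 5 N push from the human partner
    v_cmd = ctrl.step(f)
print("commanded velocity after 0.1 s: %.4f m/s" % v_cmd)
```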
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2018-2021
Principal Investigator: M. A. Tekalp
The advent of deep learning is changing how we do 2D/3D image/video processing, including image/video restoration, interpolation, super-resolution, motion analysis/tracking, compression, and light-field and hologram processing. Various deep neural network (DNN) architectures, such as convolutional neural networks (CNN), auto-encoders, recurrent neural networks (RNN), and generative adversarial networks (GAN), have already been applied to different image/video processing problems. The question then arises whether data-driven deep networks and the associated learning algorithms have become the dominant solution to all image/video processing problems, in contrast to traditional human-engineered, hand-crafted algorithms using domain-specific signals-and-systems models. The answer to this question is almost surely affirmative, and deep image/video processing methods are poised to replace a large part of the traditional image/video processing pipeline.
Yet deep signal processing is a very young field; the science of DNNs and how they produce such impressive image/video processing results is not sufficiently well understood, and more research is needed for a clear theoretical understanding of which DNN architectures work best for which image/video processing problems and how we can obtain better and more stable results. The current successes of deep learning in image/video processing are experimentally driven, more or less by trial and error. There are several open challenges, e.g., ImageNet large-scale visual recognition, visual object tracking (VOT), large-scale activity recognition (ActivityNet), and single-image super-resolution (NTIRE), and a different network architecture wins each of these challenges each year. Few formal works exist that explain the mathematics behind this.
This project will explore the potential for a breakthrough in image and video processing using new deep learning algorithms guided by machine-learned signal models. We believe that the relatively less studied areas of residual learning, adversarial learning, and reinforcement learning hold high potential for image and video processing. The project will investigate some fundamental questions within a formal framework and explore the potential for further breakthroughs in image/video processing, including problems that have not yet been addressed with DNNs, such as motion-compensated video processing, video compression, and light-field and hologram processing/compression, using deep learning guided by big-data-driven learned signal models. The proposed research is groundbreaking because it brings in new ideas that can revolutionize the way we do image/video processing, rendering some of the traditional algorithms obsolete.
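As a small, generic illustration of the residual-learning idea mentioned above, the PyTorch sketch below builds a toy denoising network that predicts only the correction to its input. It is not one of the architectures developed in this project; the layer sizes and depth are arbitrary.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: the network learns only the correction to its input."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)          # identity skip connection

class DenoiserSketch(nn.Module):
    """Toy restoration network: predict the residual (noise) and subtract it."""
    def __init__(self, blocks=4, channels=64):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(blocks)])
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, noisy):
        residual = self.tail(self.blocks(self.head(noisy)))
        return noisy - residual          # global residual learning

net = DenoiserSketch()
out = net(torch.randn(1, 3, 64, 64))     # (batch, channels, height, width)
print(out.shape)
```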
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2017-2020
Principal Investigator: A. Erdem
The aim of the proposed project is to interpret big and noisy visual data recorded in diverse environments with no predefined constraints. To this end, the goal is to develop and apply original data mining methods for extracting important knowledge and increasing the accessibility of such archives. In particular, we focus on summarization approaches, so that big visual data can be more effectively structured and enriched with additional semantic information. The summarization approaches, which make use of the multi-modal nature of the data, will focus on three main problems: 1) learning semantic concepts and spatio-temporal attributes from big visual data; 2) organizing large photograph collections; and 3) summarizing videos in large web archives. In all of these problems, big visual data and the additional information referred to as metadata will be handled together.
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2018-2020
Principal Investigator: E. Erzin
We will soon see human-like systems in the form of companion robots, e-learning applications, and assisted living agents. The success of these systems to a large extent will depend on the level of realism, naturalness and engagement that they can elicit. The goal of this project is to produce a backchannel feedback modeling system that is capable of observing user behavior as well as reactions in order to produce appropriate and relevant feedback.
It is well known that frequently utilized verbal and/or non-verbal backchannel communication cues in the form of smiles, nods, and laughter are extremely important factors affecting the efficiency of face-to-face communication (Lambertz, 2011). Hence, the use of backchanneling to elicit user engagement in conversational robotic systems and virtual embodied conversational systems has been receiving increased attention in the research community (Clavel et al., 2016). However, most of the existing solutions to this problem use rule-based strategies. They all follow the common strategy of estimating the affective and/or cognitive state of the user based on their responses and behaviors, and then synthesizing backchannel cues according to predefined rules. There are only a few pieces of work that attempt to estimate backchannel cues from models learned on human-human interaction. These systems restrict their analysis to prosodic and verbal features only, and they generate simple verbal feedback or visual feedback through nodding.
In this project, we aim to learn non-verbal backchannel feedback mechanisms for HCI applications from examples of human-human interaction. The main novelty of the project will be to build methods in this direction. Unlike existing work, our models will incorporate the affective state of the user and user reactions into the learning process. In addition to frequently used cues such as nodding, we will treat smiles and laughter as backchannel cues and incorporate them into our modeling and synthesis framework.
The learning-based backchanneling model that we wish to develop requires estimating and tracking user interest and affect, as well as extracting the relevant low-level and/or high-level features, which can be auditory or visual. Using the pre-trained backchannel synthesis model, these features will then be processed to generate appropriate backchannel cues with proper timing through a human-like interface in the form of a robot or an embodied conversational agent. To address the multimodal modeling problem, we plan to use hidden Markov models and recurrent neural networks, which stand out in the literature as models for analyzing time-series data.
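A minimal sketch of the recurrent modeling direction, assuming hypothetical per-frame audio-visual features and an illustrative set of cue classes (none, nod, smile, laughter), could look like the PyTorch GRU below. The project's actual models, features, and labels are defined by the annotated JESTKOD data rather than this toy setup.

```python
import torch
import torch.nn as nn

# Cue classes are illustrative: 0 = none, 1 = nod, 2 = smile, 3 = laughter.
NUM_CUES = 4
FEATURE_DIM = 88       # hypothetical per-frame audio-visual feature vector size

class BackchannelGRU(nn.Module):
    """Per-frame backchannel cue prediction from a multimodal feature sequence."""
    def __init__(self, feature_dim=FEATURE_DIM, hidden=128, num_cues=NUM_CUES):
        super().__init__()
        self.gru = nn.GRU(feature_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, num_cues)

    def forward(self, x):                     # x: (batch, time, feature_dim)
        out, _ = self.gru(x)
        return self.head(out)                 # (batch, time, num_cues) logits

model = BackchannelGRU()
features = torch.randn(8, 300, FEATURE_DIM)   # 8 clips, 300 frames each
targets = torch.randint(0, NUM_CUES, (8, 300))
logits = model(features)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, NUM_CUES), targets.reshape(-1))
loss.backward()
print(float(loss))
```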
The proposed line of research addresses an important challenge that independently came up during the course of two TÜBİTAK projects undertaken by the same research team, JOKER (Devillers et al., 2015) and the completed JESTKOD (Bozkurt et al., 2016). The project is not a simple extension of these projects, as it tackles a problem they did not address, using a novel research methodology. However, the scientific and technical expertise acquired through these projects, as well as the large amounts of data collected in them, will be utilized. In particular, we plan to use the human-human interaction data collected in the JESTKOD project. We will carry out a substantial annotation effort to label backchannel-specific cues in this database and use it to train our probabilistic backchanneling model. The smile and laughter detection methods will be adopted from those developed in the context of the JOKER project. In this project, we do not plan to collect any new human-computer interaction data or to conduct any user experiments.
Finally, we propose to validate a user interface utilizing the backchanneling model developed in this project in a constrained interaction scenario. We propose to carry out this validation using a physical setup involving a realistic robotic head (FURHAT) acquired in the JOKER project (Moubayed et al., 2013a; Schröder, 2010). User interaction experiments in this setup will provide the data to be used in the evaluation of the overall system.
Funded by: Saudi Aramco
Dates: 2017-2020
Principal Investigator: D. Unat
Funded by: European Commission ERA-Net Program, CHIST-ERA Intelligent User Interfaces Call
Dates: 2013-2016
Principal investigator: T. M. Sezgin
This project will build and develop JOKER, a generic intelligent user interface providing a multimodal dialogue system with social communication skills including humor, empathy, compassion, charm, and other informal socially-oriented behavior.
Talk during social interactions naturally involves the exchange of propositional content, but also, and perhaps more importantly, the expression of interpersonal relationships, as well as displays of emotion, affect, and interest. This project will facilitate advanced dialogues employing complex social behaviors in order to provide a companion machine (robot or ECA) with the skills to create and maintain a long-term social relationship through verbal and non-verbal interaction. Such social interaction requires that the robot be able to represent and understand some complex human social behavior, and it is not straightforward to design a robot with such abilities. Social interactions require social intelligence and 'understanding' (for planning ahead and dealing with new circumstances) and employ theory of mind for inferring the cognitive states of another person.
JOKER will emphasize the fusion of verbal and non-verbal channels for emotional and social behavior perception, interaction, and generation capabilities. Our paradigm invokes two types of decision: intuitive (mainly based upon non-verbal multimodal cues) and cognitive (based upon the fusion of semantic and contextual information with non-verbal multimodal cues). The intuitive type will be used dynamically in the interaction at the non-verbal level (empathic behavior: synchrony of mimicry such as smiles and nods) but also at the verbal level for reflex small talk (politeness behavior: verbal synchrony with hello, how are you, thanks, etc.). Cognitive decisions will be used for reasoning about the dialogue strategy and deciding on more complex social behaviors (humor, compassion, white lies, etc.), taking into account the user profile and contextual information.
JOKER will react in real time with a robust perception module (sensing the user's facial expressions, gaze, voice, audio and speech style and content), a social interaction module modelling the user and context with long-term memories, and a generation and synthesis module for maintaining social engagement with the user.
The research will provide a generic intelligent user interface for use with various platforms such as robots or ECAs, a collection of multimodal data covering different socially-oriented behavior scenarios in two languages (French and English), and an evaluation protocol for such systems. Using the database collected in a human-machine context, cultural aspects of emotions and natural social interaction, including chat, jokes, and other informal socially-oriented behavior, will be incorporated.
Funded by: the CHIST-ERA project IMOTION, with contributions from the Belgian Fonds de la Recherche Scientifique (FNRS, contract no. R.50.02.14.F), the Scientific and Technological Research Council of Türkiye (TÜBİTAK, grant no. 113E325), and the Swiss National Science Foundation (SNSF, contract no. 20CH21 151571)
Dates: 2013-2016
Principal investigator: T. M. Sezgin
The IMOTION project develops and evaluates innovative multi-modal user interfaces for interacting with augmented videos. Starting from extensions of existing query paradigms, such as keyword search in manual annotations and image search (query by example in key frames), IMOTION considers novel sketch- and speech-based user interfaces.
Funded by: Scientific and Technological Research Council of Türkiye (TÜBİTAK)
Dates: 2013-2016
Principal investigator: T. M. Sezgin
The goal of this project is to build pen-based interfaces for the classroom of the future. Currently there is little interaction and personalized feedback between instructors and pupils. We use real-time processing of pen input to create consolidated representations of student interactions, allowing teachers to give timely and to-the-point feedback to students and enhance the learning experience.
Funded by: SANTEZ Programme, Ministry of Science, Industry, and Technology, Türkiye
Dates: 2012-2015
Principal investigator: T. M. Sezgin
TVs are slowly morphing into powerful set-top computers with internet connections. As such, they are taking over roles and functions traditionally associated with desktop computers. TV users, for example, can use their TV for browsing the internet. Unfortunately, the vast majority of content on the internet has been designed for desktop viewing and hence has to be adapted for viewing on a TV. In this project, we aim to develop a semi-automatic content retargeting system that is expected to work with minimal intervention from an expert.
Funded by: European Community's Seventh Framework Programme
Dates: 2011-2014
Principal investigator: T. M. Sezgin
The main goal of this project is to develop a software program that will assist children with Autism Spectrum Conditions (ASC) in understanding and expressing emotions through facial expressions, tone of voice, and body gestures. This software will help them understand and interact with other people and, as a result, will increase their inclusion in society.
Funded by: DARPA/BAE/SIFT (British Aerospace/Smart Information Flow Technologies)
Dates: 2008-2009
Principal investigator: T. M. Sezgin (co-PI for the sketch-to-plan module)
Deep Green is a project that ran under the Information Processing Technology Office of the Defense Advanced Research Projects Agency. The purpose of the project was to develop a decision-making support system for United States Army commanders. The systems developed feature advanced predictive capabilities that enable computers to efficiently and accurately predict possible future scenarios, based on an analysis of the current situation, in order to give army commanders a better view of the possible outcomes of their decisions [1][2][3]. Deep Green is composed of four major components:
Blitzkrieg – A battlefield model that analyzes the current situation and determines possible future outcomes for use in planning. When a plan is presented, Blitzkrieg analyzes it to point out possible results of that course of action to the commander. Blitzkrieg itself does not do planning; it merely determines the likely results of a plan formulated by a human commander.
Crystal Ball – Analyzes the possible futures generated by Blitzkrieg and determines the "best" choices by measuring the flexibility, usefulness, and likelihood of each. It picks the best of these choices and presents them to the commander. It also updates the model of the battlefield situation with information pulled from the field, which might include reports from soldiers, through a program similar to the Communicator program developed under the Information Awareness Office, or through automated RSTA systems such as HART.
Commander's Associate – The user interface and visualization component. It consists of "Sketch-to-Decide", which presents the commander with a list of options, and "Sketch-to-Plan", a screen on which the commander can draw up a plan, which Deep Green will interpret and put into action.