Can you briefly introduce yourself and share your background in AI research?
My name is Andrew, and I am a first-year PhD student in computer science and engineering. I am from the US, but I finished my bachelor’s degree at Koç. I have been involved in AI research for about 1.5 years. I first got involved when I took a deep unsupervised learning class taught by Aykut Erdem and wanted to go deeper into that topic. I have done research in both Natural Language Processing and Computer Vision, but my current research focus is Computer Vision.
What initially sparked your interest in the field of AI? Was there a particular moment or experience that inspired you to pursue this area of study?
I had been interested in Artificial Intelligence for quite some time, but what made me decide to pursue AI research was experimenting with deep learning models in class and reading research papers about all the new and exciting work being done in the field.
Could you tell us about your current research or thesis topic in AI? What motivated you to choose this specific area?
My current research topic is using audio to generate or edit videos. The idea is that videos greatly depend on the time component (a video is just images changing over time), and using just text to edit a video can fail to handle this time component correctly. Instead, since audio also has a time component, it could be much better to sync the audio and video together, and have the video edited based on what is currently happening in the audio.
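To illustrate the shared time axis mentioned above, here is a minimal sketch that maps each video frame to the slice of audio samples playing while that frame is on screen. It is not the method used in the research; the function name `audio_window_for_frame` and the frame rate and sample rate values are assumptions chosen only for illustration.

```python
def audio_window_for_frame(frame_index, fps, sample_rate):
    """Return the [start, end) audio sample range that plays during one video frame.

    Assumes integer fps and sample_rate, and that audio and video start together.
    """
    start = frame_index * sample_rate // fps
    end = (frame_index + 1) * sample_rate // fps
    return start, end


# Illustrative values only: 25 frames per second, 16 kHz audio.
for i in range(3):
    print(i, audio_window_for_frame(i, fps=25, sample_rate=16000))
```

With these assumed values, each frame lines up with 640 audio samples, so an edit driven by the audio can be tied to the exact frames it should affect.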
What are some of the key challenges you’ve encountered during your research? How have you been able to overcome them?
The biggest challenge was the stress of my first year. It was during the pandemic, so I did not have much of a social life that year. All of my focus was on my work and studies, which caused a lot of stress. I met with other fellows online during this period, and our discussions and mutual support helped me overcome these challenges. After the pandemic, we were able to live on campus, which is great. I am balancing work and social life much better now, and I feel like I have made lifelong friends.
How do you stay updated with the latest developments and research papers in the field of AI? Are there any particular resources or platforms you rely on?
I frequently skim recent publications on arXiv to stay up to date with my research interests. Additionally, I use Semantic Scholar to track the papers I am currently interested in, and it also alerts me to new related papers.
What advice would you give to aspiring students who are interested in pursuing a career in AI and considering graduate studies?
For students who want to pursue a career in AI, it is possible to get involved during your undergraduate degree. Push yourself to take relevant courses, and don’t be afraid to ask a professor whether you can get involved in their research somehow.
Reflecting on your recent AI research, could you describe a particular challenge or obstacle you encountered during the process?
One big challenge I have faced in some of my recent research comes from working with large amounts of code, where it is easy to miss small mistakes. In one case, two numbers had been swapped in a single line of code, and fixing this immediately made our results almost twice as good. Finding this issue took meticulous searching and testing over multiple days.
Description of my research paper: Videos consist of three main components: structure, style, and motion. The structure of a video is the part that stays consistent over time. For example, the facial shape of a person in a video always stays the same, because the person doesn’t change. The style of a video is the colors and textures present, such as those of a person’s shirt. The motion of a video is how the video changes over time. We designed a new model to decompose a given video into these three parts, which allows for careful editing of only specific parts of the video. Specifically, we can modify one of these three parts while fixing the other two, to preserve the parts we care about. For example, changing a person’s hair color only requires changing the style of the video, while changing how a person is walking only requires changing the motion of the video.
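The idea of changing one component while fixing the other two can be sketched as follows. This is a toy illustration, not the model from the paper: the `VideoComponents` class and `edit_component` function are hypothetical names, and plain strings stand in for what would be learned representations in a real model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VideoComponents:
    # Placeholder strings; a real model would hold learned representations here.
    structure: str  # what stays consistent over time, e.g. a person's face shape
    style: str      # colors and textures
    motion: str     # how the video changes over time


def edit_component(video, **changes):
    """Return a copy with only the named components changed; the rest stay fixed."""
    return replace(video, **changes)


original = VideoComponents(structure="person", style="brown hair", motion="walking")

# Changing hair color touches only the style component.
recolored = edit_component(original, style="blond hair")

# Changing how the person moves touches only the motion component.
running = edit_component(original, motion="running")
```

Because the dataclass is frozen, each edit produces a new object and the original components cannot be altered by accident.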
Can you recommend a book and a movie? (They don’t have to be related to AI.)
I would highly recommend the movie Interstellar. It is visually stunning, with a great story and soundtrack. For a book, I would suggest Fermat’s Enigma, which tells one of the most interesting stories in all of mathematics.