I hypothesized these newspaper “photos” of Genghis Khan via an IP-Adapter ComfyUI workflow that combines the structure of his portrait and the style of a real newspaper photo of my late grandfather.
(Their ancestral relationship could lend some credibility to this experiment.)
May 15, 2024
Newspaper photo of Genghis Khan
March 27, 2024
Hands on – quick short abstract
An intern collaborator thanked me for a quick edit of the abstract of an upcoming paper submission, which reminded me of the following story reflecting how our PhD advisers (or senior collaborators) could influence our work styles:
A few days before the SIGGRAPH 2002 paper deadline, while trying to submit the abstract from a paper draft, I received an error message saying that it was over the length limit (maximum 600 words, if I remember correctly). I could have done a quick trim of the abstract but worried that I might not be able to preserve the content in such a short form, and thus sent a message to the paper advisory board asking whether I could have a longer abstract in the paper file.
Several minutes later, my PhD adviser emailed me a shortened version of the abstract, with perfect content and length.
I thanked him for his (astonishing) quality and speed, and wondered if he could read my mind (or, more likely, network traffic). He said that he happened to be on the advisory board and thus saw my message.
A few hours later, I received a reply from the paper chair clarifying that the abstract for the submission form was mainly for the paper sorting process and could differ from the abstract in the submitted paper file.
March 2, 2024
Fun versus job
Academic research can be a day job for some people, including reading, writing, and reviewing papers, coding prototypes, conducting experiments, advising students, and interacting with collaborators.
But it is a leisure activity for me, more intellectually satisfying than my current day job of managing and communicating about products, which, in turn, might well be a fun activity for others.
February 26, 2024
Cherry picking batch-generated results
An interactive UI that lets users manually input data and parameters to produce results is more suitable for iterative exploration (if you can wait for a few minutes) than mass production (which can take hours if not days), so I built a scripting environment from the same codebase.
I set up an experiment to run over 20+ inputs with a few values for each of the 3 parameter dimensions, and ended up with more than 4000 results which (it dawned on me afterwards) are too many to examine one by one.
So I followed the same path of iterative exploration as with the UI, but over the already batched results, without knowing for sure whether I had missed any good ones.
Maybe we also need (semi)automatic tools to help us with cherry picking.
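As a minimal sketch of what such a semi-automatic helper might look like: the snippet below enumerates a batch over a parameter grid and keeps only the top few results per input for manual review. Everything in it is an assumption for illustration, not the actual setup: the input names, the 6 values per dimension (chosen only so that 20 inputs yield more than 4000 results), and especially `placeholder_score`, which stands in for whatever quality measure one can actually compute.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# A result is identified by its input name and a triple of parameter values.
Params = Tuple[float, float, float]
Result = Tuple[str, Params]


def build_grid(inputs: List[str], dim_a: List[float],
               dim_b: List[float], dim_c: List[float]) -> List[Result]:
    """Enumerate every (input, parameter triple) combination of the batch."""
    return [(name, params)
            for name in inputs
            for params in product(dim_a, dim_b, dim_c)]


def shortlist(results: List[Result],
              score: Callable[[Result], float],
              k: int = 3) -> Dict[str, List[Result]]:
    """Keep the k highest-scoring results per input for manual inspection."""
    by_input: Dict[str, List[Result]] = {}
    for result in results:
        by_input.setdefault(result[0], []).append(result)
    return {name: sorted(group, key=score, reverse=True)[:k]
            for name, group in by_input.items()}


def placeholder_score(result: Result) -> float:
    """Hypothetical score: prefers parameters near the middle of [0, 1].

    A real tool would score the generated output itself, not the parameters.
    """
    _, params = result
    return -sum((p - 0.5) ** 2 for p in params)


# Hypothetical batch: 20 inputs, 6 values along each of 3 parameter dimensions.
inputs = [f"input_{i:02d}" for i in range(20)]
values = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
grid = build_grid(inputs, values, values, values)
picks = shortlist(grid, placeholder_score, k=3)

print(len(grid), "results in total")
print(sum(len(group) for group in picks.values()), "results shortlisted")
```

Ranking per input rather than globally keeps every input represented in the shortlist, but a shortlist is only as good as the score, so good results that the score undervalues can still be missed.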
February 11, 2024
Murakami monster drawing experiment
Based on a (human-sized) monster sculpture by Takashi Murakami that I saw in a recent exhibition, I did a quick rough drawing via Fresco with outlines and color fills in separate layers, and then fed the outlines-only and color-fills versions to an image-to-image ComfyUI workflow. The results are very interesting and capture the style of the original artist quite well.
https://www.behance.net/gallery/191299777/TakashiMurakamiMonsterSculpture
January 7, 2024
Embodied computing
I spent the evening on some manual household installation/repair tasks and felt a visceral sense of satisfaction from using my body, which tends to be missing when sitting in front of a computer all day.
Given that our bodies are designed (or more precisely, evolved) to move around, I wonder if there are ways to structure our work routines so that we can use our bodies more.
The existing design for “spatial computing” is mostly for content consumption and authoring tasks that are inherently spatial (e.g., 3D modeling).
How to extend that to more abstract tasks (like programming) is an interesting question (personally I doubt that spatial manipulation of node graphs or the like is the right answer).
January 4, 2024
3-hour PhD defense
This is by far the longest PhD defense I have ever attended, and I have attended many.
The first half was spent on the presentation, with the second half on discussions (among thesis committee members and later the candidate) about how to deal with the lack of publications.
I enjoyed the first half much more than the second.
December 25, 2023
2023 academic volunteering hours
Since my company allows us to report academic services as part of our volunteering hours for matching donations*, I have been keeping track of mine in a spreadsheet (down to the level of the hours/minutes spent on each paper across different stages of the review processes).
According to the spreadsheet, so far this year I have spent 190+ (!) hours in total, including TVCG AE and paper committees in SIGGRAPH, SIGGRAPH Asia, EG, EGSR, HPG, etc., mostly during my “spare time” on evenings and weekends.
I won’t be able to serve for SIGGRAPH and SIGGRAPH Asia next year, so hopefully I can get back some time for other activities.
* For Adobe folks: remember to report your volunteering hours before the end of the year.
October 19, 2023
Caption video
(I need to write this down while I still feel the slight amount of excitement.)
Recently I have been involved in two projects for automatic video captioning, one for a research paper and the other for a product feature.
The research paper will be presented at UIST 2023; see this page for more details.
The product feature can be accessed via this page; if you have any feedback feel free to let me know.
The two projects share some high-level ideas (such as maintaining temporal coherence for the captions while optimizing their spatial parameters with respect to the video content), but the specific methods and implementations are quite different.
Shipping a product involves a lot of testing and tuning to ensure robust experiences for a wide range of users and use cases (often beyond what the creators can initially anticipate), while publishing a research paper often requires a lot of work in writing and presentation that can be a dominating factor in deciding its acceptance and dissemination.