
Week 7 Part 1 - AI-generated sound and AE animation production

Part 1:

This week I completed the production of the first part of the video where the AI imitates family members’ daily conversations.

The text for the first part: based on my earlier literature research and video review, combined with suggestions from AI, I wrote 13 sentences of everyday dialogue between family members and Alzheimer’s patients.

  1. Why do you keep asking the same questions? I’ve explained it many times.
  2. Mom, your memory is really getting worse.
  3. Where did you put the key? This is already the third time.
  4. Can you stay by yourself for a while?
  5. I am your daughter.
  6. Why don’t you remember me again?
  7. Dad, why did you put the medicine in the refrigerator again?
  8. Didn’t we make an agreement?
  9. I really can’t understand why you are always so stubborn.
  10. I’ve told you many times.
  11. I really don’t have time to help you find things every day.
  12. Why can’t you do simple things well?
  13. Why do you always forget to turn off the lights? I have to come here every time.

I tried four different ways of getting AI to generate speech and finally chose ElevenLabs. It offers a wide range of voice timbres, and the generated speech sounds genuinely human and emotional rather than robotic.
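For reference, here is a minimal sketch of how the 13 lines could be batch-generated over ElevenLabs’ HTTP text-to-speech endpoint. In practice I exported the clips from the web interface; the API key, voice ID and model name below are placeholders.

```python
# A minimal sketch of batch text-to-speech over the ElevenLabs HTTP API.
# API key and voice ID are placeholders - pick a voice from the voice library.
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"   # placeholder
VOICE_ID = "YOUR_VOICE_ID"            # placeholder
LINES = [
    "Why do you keep asking the same questions? I've explained it many times.",
    "Mom, your memory is really getting worse.",
    # ... the remaining sentences from the list above
]

for i, text in enumerate(LINES, start=1):
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
    )
    resp.raise_for_status()
    with open(f"line_{i:02d}.mp3", "wb") as f:
        f.write(resp.content)  # the endpoint returns the audio bytes directly
```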

Below are all the generated AI sounds.

I made the first part of the video using Adobe After Effects.

I will make detailed adjustments next week, and then match the audio with the animation and put them together.

Week 7 Part 3 - Modification for formal presentation

This week, I’ve enhanced the third part and filmed a video demonstrating how the final presentation will appear.

However, there are currently some key issues to address in this section:

  1. When capturing images of the audience, we need to ensure the background is uniformly black and there is ample lighting on their faces. This consideration is vital for the real installation setup, impacting camera device selection and the necessity of additional fill lighting.
  2. During the formal exhibition, the positioning of the microphone requires careful consideration. One option is a microphone stand, although its placement would need to be adjusted so it does not obstruct the screen. Alternatively, as demonstrated in the video, the microphone can be placed on the table and users can hold it when speaking.
  3. The choice of recording equipment is not yet settled. A Shure SM58 microphone is currently being used, connected to the computer via a sound card. However, if the exhibition environment is noisy, switching to a different microphone, such as the Sennheiser 416, may be necessary. In addition, if using the sound card is inconvenient, for example because of limited cable length, a device like the ZOOM F8 could replace it.
Another visual effect attempt

 

Week 7 Video progress for Parts 2 and 4

Video

Part 2: Blur [video]

Progress: Based on last week’s discussion and the feedback from Submission 1, this week in Part 2 (Blur) I used TouchDesigner to carry out a more random and bold experiment in visualising the memory of faces, using AI-generated face animations.

Issues:
1. For the video pacing, I am not sure whether I need to slow it down a bit more.
2. Is the cyberpunk-like, magical colouring that appears in the middle and later parts necessary?
3. How should the transition to the prologue be handled?

Part 4: Vanish [interactive]

Progress: Based on last week’s feedback, I connected a live webcam for the interactive part of the installation. In this part of the project, I want the participant to remain standing in front of the screen while a camera in front of the screen rotates 360 degrees, recording the space the participant is in and loading it onto the screen in real time in the form of particles. When the camera has completed its rotation, the whole screen goes black and the image disappears. A rough prototype of this idea is sketched below.
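This is not the TouchDesigner network itself, just a rough OpenCV/NumPy prototype of the idea: sample the live camera image as sparse “particles” and let them drop away over time until the screen is black. The grid spacing, fade rate and window name are arbitrary choices for the sketch.

```python
# Rough prototype: webcam image sampled as particles that gradually vanish.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)      # default webcam
STEP = 8                       # grid spacing between sampled particles
survival = 1.0                 # fraction of particles still visible

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    canvas = np.zeros_like(frame)

    # keep a random subset of the sampled pixels; the subset shrinks every frame
    ys, xs = np.mgrid[0:h:STEP, 0:w:STEP]
    keep = np.random.rand(*ys.shape) < survival
    canvas[ys[keep], xs[keep]] = frame[ys[keep], xs[keep]]

    cv2.imshow("vanish prototype", canvas)
    survival = max(0.0, survival - 0.002)  # everything is gone after ~500 frames
    if cv2.waitKey(1) & 0xFF == 27:        # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```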

Issue:
1. Is there any better suggestion for the interaction form or expression of this part?

Week 6 – The World Through the Eyes of a Moderate Alzheimer’s Patient

I hope to create a world as seen through the eyes of an Alzheimer’s patient. A person with moderate Alzheimer’s disease tends to forget the object they set out to look for, or to suddenly forget the task while searching for it, which in turn creates emotional and psychological problems in the form of anxiety and a sense of distrust in the world.

Therefore, through my design I hope to present a moderate Alzheimer’s patient’s condition from a first-person perspective and mimic the world as they see it, helping the experiencers feel the patient’s range of emotions first-hand.

Before this design, I did some research and found that patients’ condition is associated with significant declines in overall cognition, verbal memory, language, and executive function, and that these associations are amplified by increased anxiety symptoms (Pietrzak et al., 2015). At the same time, when I researched the world as seen from the patient’s first-person viewpoint, Steven R. Sabat, through his conversations with several men and women with Alzheimer’s, demonstrates how the powerlessness, embarrassment and stigmatisation patients endure leads to a loss of self-worth (Sabat, 2001).

In terms of technical realisation, I felt I needed to make the experience immersive by having the experiencer follow the rhythm of my design and the camera changes; my intention was to create the anxiety of not being able to find something within the allotted time. I achieved the visualisation by combining the first-person view of patients from my research with TouchDesigner effects to create a video. Secondly, through the feedback I realised that my earlier version still required manually clicking in the background to switch scenes and did not connect to the other stages of the visualisation as a whole. So I first used TouchDesigner to create the video and audio effects for the different scenes, and then put them into the same container to create a connection between each part, achieving scene switching without manual operation. A minimal sketch of this switching logic follows.
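As an illustration only, here is a minimal sketch of that switching logic in TouchDesigner’s Python, assuming a Switch TOP named `switch_scenes` fed by the scene containers and a Timer CHOP named `timer_scene` whose `done` channel is watched by a CHOP Execute DAT. The operator names and scene count are hypothetical for this sketch.

```python
# CHOP Execute DAT callback watching the 'done' channel of a Timer CHOP
# ('timer_scene'). Operator names are placeholders for this sketch.
SCENE_COUNT = 4  # Prologue, Blur, Fade, Vanish

def onValueChange(channel, sampleIndex, val, prev):
    # when the current scene's timer finishes, move the Switch TOP to the next
    # scene and restart the timer, so no manual clicking is needed
    if channel.name == 'done' and val == 1:
        switch = op('switch_scenes')
        switch.par.index = (int(switch.par.index) + 1) % SCENE_COUNT
        op('timer_scene').par.start.pulse()
    return
```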

Examples:

(From 1:42-2:23)

Reference List:

Pietrzak, R. H., Lim, Y. Y., Neumeister, A., et al. (2015). Amyloid-β, anxiety, and cognitive decline in preclinical Alzheimer disease: a multicenter, prospective cohort study. JAMA Psychiatry, 72(3), 284–291. doi: 10.1001/jamapsychiatry.2014.2476

Sabat, S. R. (2001). The Experience of Alzheimer’s Disease: Life Through a Tangled Veil. Oxford: Blackwell.

 

Week 6 – OSC Feasibility Test

In the midterm feedback, our tutor Philly mentioned that we should think more about the technology and how the four parts of our project should be combined. In the tutorial, he also mentioned OSC (Open Sound Control) as a way to achieve that.

As for the consistency of these four parts, I first tried to implement some Max/MSP functions in TouchDesigner. However, after some tests, I found that audio processing in TD is very limited. Similarly, some visual effects achieved in TD are difficult to implement in Max. Therefore, to ensure the coherence of the project display, not only between these four parts but also between audio and visuals, we need technologies like OSC to connect the different platforms.

The video below shows some simple OSC tests: a two-way connection between Max and TD, with simple sound and visual changes in TD driven by parameters set in Max.
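The patch itself lives in Max and TouchDesigner, but as a rough illustration of the kind of OSC traffic involved, here is a small Python sketch using the python-osc package. The ports and address patterns are placeholders and would need to match the OSC In CHOP in TD and the udpreceive object in Max.

```python
# pip install python-osc
# Placeholder ports/addresses: adjust to match the OSC In CHOP (TD) / udpreceive (Max).
import threading
import time
from pythonosc.udp_client import SimpleUDPClient
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

SEND_IP, SEND_PORT = "127.0.0.1", 7000   # where TouchDesigner listens
LISTEN_PORT = 7001                       # where replies (e.g. from Max) arrive

client = SimpleUDPClient(SEND_IP, SEND_PORT)

def on_message(address, *args):
    # print any OSC message that comes back, e.g. analysis values from Max
    print(f"received {address} {args}")

dispatcher = Dispatcher()
dispatcher.set_default_handler(on_message)
server = BlockingOSCUDPServer(("0.0.0.0", LISTEN_PORT), dispatcher)
threading.Thread(target=server.serve_forever, daemon=True).start()

# sweep a visual parameter in TD as a quick two-way test
for i in range(101):
    client.send_message("/blur/amount", i / 100)
    time.sleep(0.05)
```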


Week 6 Part 1 - Further research and understanding of Alzheimer’s disease

The first part has been changed. It was originally an AI-generated animation telling the daily life of an old woman suffering from Alzheimer’s disease. Now, users experience the daily life of Alzheimer’s disease from a first-person perspective. On a big black screen (see picture below), sentences about the daily life of Alzheimer’s patients float in from all directions (such as: “Mom, did you close the door?”, “You don’t recognize me?”, “Dad, where did you put your hat?”, etc.). These words gradually fill the screen and are accompanied by voices speaking in the experiencer’s ears, conveying the helplessness of living with Alzheimer’s disease.
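To make the idea concrete, here is a small pygame sketch of that floating-text effect (not the installation graphics themselves): sentences drift in from the edges of a black screen and gradually fill it. The sentences, speeds and resolution below are placeholders.

```python
# Small prototype: sentences drifting in from the edges of a black screen.
import random
import pygame

SENTENCES = [
    "Mom, did you close the door?",
    "You don't recognize me?",
    "Dad, where did you put your hat?",
    "I've told you many times.",
]

pygame.init()
screen = pygame.display.set_mode((1280, 720))
font = pygame.font.SysFont(None, 36)
clock = pygame.time.Clock()

floaters = []  # each entry: [rendered text surface, x, y, dx, dy]

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    # occasionally spawn a new sentence at the left or right edge, drifting inwards
    if random.random() < 0.03:
        surf = font.render(random.choice(SENTENCES), True, (230, 230, 230))
        x = random.choice([-surf.get_width(), 1280])
        y = random.randint(0, 720 - surf.get_height())
        dx = 1.5 if x < 0 else -1.5
        floaters.append([surf, float(x), float(y), dx, random.uniform(-0.3, 0.3)])

    screen.fill((0, 0, 0))  # the big black screen
    for f in floaters:
        f[1] += f[3]
        f[2] += f[4]
        screen.blit(f[0], (int(f[1]), int(f[2])))
    pygame.display.flip()
    clock.tick(60)

pygame.quit()
```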

This week I also did some literature reading and watched online video explanations about Alzheimer’s disease. I wanted to understand patients’ daily condition in order to complete the first part of the content.

1. “The classic clinical features of Alzheimer’s disease are an amnesic type of memory impairment, deterioration of language, and visuospatial deficits. Motor and sensory abnormalities, gait disturbances, and seizures are uncommon until the late phases of the disease.” (Cummings, 2004)

2. “We found that DAT patients produced significantly more turns and topic shifts but produced fewer total words, words per turn, unique words, narratives, direct quotes, and figures of speech than did their healthy spouses. Examination of DAT speech samples showed that they were characterized both by a lack of propositional content and illocutionary force, illustrating deterioration in communicative competence.” (Blonder et al., 1994)

3. Previous research by Boss (1977) indicates that a family may exclude a member in order to reduce the boundary ambiguity.

4. In the case of an ambiguous loss such as that found in Alzheimer’s disease, the family’s task is to clarify the family boundary by redefining family tasks and roles (Boss, 1977).

Reference list:

1. Cummings, J. L. (2004). Alzheimer’s disease. New England Journal of Medicine, 351(1), 56–67.

2. Blonder, L. X., Kort, E. D., & Schmitt, F. A. (1994). Conversational discourse in patients with Alzheimer’s disease. Journal of Linguistic Anthropology, 4(1), 50–71.

3. Boss, P. (1987). Family stress: Perception and context. In N. Sussman & S. Steinmetz (Eds.), Handbook on marriage and the family (pp. 695–723). New York: Plenum Press.

 

I searched for a lot of videos on the Internet to learn more about the daily lives of patients and their families.

https://vm.tiktok.com/ZGeUsGNcc/
https://vm.tiktok.com/ZGeUstXKG/

Week 6 Further attempts at AI

More reflections on AI and memory

Update: What are you doing?

Xinxuan

1) Case study: Memories of Passersby

Analysis: Memories of Passersby uses a complex neural network system to generate a never-ending stream of portraits: grotesque, machine-created representations of male and female faces. Each portrait on display is unique and created in real time as the machine interprets its own output. For the viewer, the experience is akin to watching the endless imaginative acts taking place in the mind of the machine, which uses its “memory” of the parts of a face to generate new portraits, sometimes encountering difficulties in computational interpretation and producing images reminiscent of André Breton’s idea of “convulsive beauty”: a shocking and disturbing aesthetic, a mixture of attraction and repulsion, whose main function is to present surprising new perspectives.

[Image: Memories of Passersby, created by Mario Klingemann]

Reflection: Inspired by this example, I explored AI more deeply this week, mainly in the form of a series of videos of virtual portraits superimposed and fused together. I generated these by downloading portrait images of people from different countries from the thispersondoesnotexist website recommended by my teacher and processing them with the open-source StyleGAN2. To obtain a more abstract and exaggerated effect for the memory visualisation, I then processed the generated videos in TouchDesigner.
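As a rough note on the collection step, here is a minimal sketch assuming thispersondoesnotexist.com still serves a fresh random portrait at its root URL (the site’s behaviour may change); the downloaded images then still need StyleGAN2’s own dataset preparation before any training or projection.

```python
# Collect a folder of randomly generated portraits for later StyleGAN2 processing.
import os
import time
import requests

os.makedirs("faces", exist_ok=True)
HEADERS = {"User-Agent": "Mozilla/5.0"}   # some servers reject requests without a UA

for i in range(200):                      # number of portraits to collect (arbitrary)
    resp = requests.get("https://thispersondoesnotexist.com",
                        headers=HEADERS, timeout=30)
    resp.raise_for_status()
    with open(os.path.join("faces", f"face_{i:04d}.jpg"), "wb") as f:
        f.write(resp.content)             # each request returns a newly generated face
    time.sleep(1.0)                       # be polite to the server
```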

AI face fusion video

2) Experiment: Different AI effects

I tried two effects in total this week.

Text Combination: The first effect was intended to echo the prologue of the immersive exhibition, so I also tried to incorporate text in the character section. I mainly mapped the question “WHO AM I” onto the video, which makes the originally figurative AI face video more abstract and makes the theme clearer.

The prologue format

* Detailed questions about the prologue part:

  1. Does the subject of the textual description need to focus on a specific virtual protagonist? Or should it be a broader sentence, e.g. using a generic term such as “mother”?
  2. Does this component need to be multilingual?

Attempt about text combination 

Particle Dissipation: Meanwhile, this week I also continued to delve deeper into particle-related techniques in TouchDesigner and further experimented with the music example from Submission 1.

What I’d like to express with this effect is captured by one comment: the hazy output feels like revisiting a memory that can’t quite be remembered. At the same time, I remain critical about whether AI acts as a memory enhancer or a memory reconstructor for people with Alzheimer’s.

Attempt about particle

The Vanish section immerses the audience in the experience of forgetting in Alzheimer’s, and the form we discussed earlier still retains the particle interaction. During the week, I tested capturing the computer camera in real time and producing the effect.

Real-time particles attempt from the computer camera

* Question about the Vanish part: what needs to be discussed is whether we use real-time particle disappearance, or have the user press a button to take a static photo which then dissolves into particles.

Questions about the other parts:

* Question about the Object-blur part [details in Yixuan Yang’s blog]

  1. How does this blur effect need to be further developed?
  2. Do we need to reflect national differences in the scenario as well?

* Question about fade part [Detail in Han Zou’s blog]

  1. Do the faded parts need to interact with the user?

The solution I have in mind is this: not only does it interact with the sound, but the user presses a button to take a photo in real time and the photo then fades. Real-time particles are then used for the disappearance in the Vanish part.

Answers about basic info:

Where will it be shown/experienced: Q25 in ECA. For the exhibition, we will create a dark immersive space.

When: We expect to have all the music and visuals produced around the end of March and the exhibition around the beginning of April 🙂

What do you need:

Drawn by Jiayi Sun

More Answers about comments:

Project Reflections on AI and Memory:
  • Project on AI and Memory
  • Our project uses sound and images to represent and zoom in on the process of memory loss in Alzheimer’s disease patients, and at the end allows participants to experience the dissipation of memory for themselves. The project makes one reflect on whether the role of AI in a philosophical sense is to enhance or restructure individual subjectivity.
Technical details:
  • How long is this piece expected to last overall?
  • About 20min [including Prologue, Blur, Fade and Vanish]
  • How long is each part of the piece?
  • Each part about 5min
  • Will it be projected, or on a screen? 
  • On a big screen
  • What kind of space do you envision this to take place in? 
  • Almost dark immersive space
Creative details:
  • How do you intend to connect each of the 4 parts of the piece?
  • In the form of a film, each part is played and interacted with in sequence
  • As different people are responsible for each section, how do you plan to make the technology required for each compatible? It feels like this piece would benefit from seamlessly transitioning between segments; how could this be achieved?
  • Although the results are different, we all use TouchDesigner as the output port. We could use TouchDesigner to integrate the different effects in one container.

 

Reference:

Artwork link: https://www.sothebys.com/en/articles/artificial-intelligence-and-the-art-of-mario-klingemann

StyleGAN2 Open Source Website: https://github.com/NVlabs/stylegan?tab=readme-ov-file

 

 

Week 6 Diary – An AI Audio Plugin, “Neutone”

https://neutone.ai/fx

Neutone is an AI audio plugin that brings deep learning DSP models to real-time music production. It allows artists to integrate AI-powered tools into their creative process easily. The plugin supports VST3 and AU formats. Various timbre transfer models and deep learning-powered effects are available for download on this plugin, including transforming sounds into drum kits, voices, violins, and more. Developers can also contribute their own models to the Neutone platform using the NeutoneSDK.

Regarding our Process project, a frequently discussed issue is that, since the theme revolves around memory, the content should be presented more abstractly. This approach helps to avoid touching on sensitive personal information while still providing a unique style. Neutone seems to be an excellent tool for achieving this effect. It is a timbre-transfer plugin capable of reshaping sound with different timbres. For instance, it can transfer instrumental timbres onto voices and vice versa. Because such transformations are based on audio, the resulting outcomes are more abstract than having different virtual instruments play the same MIDI clips.

In the project, for voice elements that need to narrate the story, one can first generate some plain spoken material using text-to-speech tools and then process it with Neutone to obtain instrument-like timbres that keep the speaking rhythm, or transform it through other voice models to create more ambiguous speech material.

For the musical part of the project, the thematic motifs of the music can be transformed into different timbres appearing in different sections. This approach results in audio with a blurry, even low-definition feel, which aligns with the project’s need for ambiguity while allowing each section to differ yet maintain overall coherence.

