Video is the medium of the age and AI is the technology of the age. Combine the two and you have a potent mixture. I’ve been involved with both, working in a video production company, using video on all sorts of media, from interactive videotape machines, laserdiscs, compact discs and CD-i to streaming, even making a feature film called The Killer Tongue (you really don’t want to know). Believe me, that last one lost me a ton of money. I now run an AI for learning company, WildFire, and am writing a book, AI for Learning. I know these two worlds well. But how do they interact?
There are tools that allow you to edit video much faster and to higher quality. Different cameras shoot different colour balances – that can be fixed with AI. Same actor in different scenes with different skin tone, that can be fixed with facial recognition and skin tone matching – using AI. Need your music mixed down behind dialogue – use AI. AI is increasingly used to fix, augment and enhance moving images.
Of course, easy editing with AI also means easy fakes. AI-generated avatars have appeared as TV presenters, reading the news using text-to-speech software. One can have Obama saying whatever you like, using a voiceover artist mimicking his voice. Similarly, with a fake teacher, one can deliver talking head content. Even more worrying is fake porn. Many famous actresses and actors have had their faces transposed to create ‘deepfake’ porn scenes. This mirrors what is possible with fake homework, essays and text output using OpenAI’s GPT-2 software, which they considered so dangerous that they initially decided not to release the full version. Just feed it a question and it produces an essay.
Beyond fakery lies the world of complete video creation. Alibaba’s Aliwood software uses AI to create 20-second product videos from a company’s existing stills, identifying and selecting key images, close-ups and so on. The selected images are then edited together with AI, with cuts that even follow shifts in the music. They say it increases online purchases by 2.6%. Some video creation software goes further and also adds a text-to-speech narration, with edits at appropriate points. Many pop videos and films have been made with AI tools such as Deep Dream for image creation, along with style capture and flow tools. There are even complete films made from AI-created scripts. We already see services that create learning videos quickly and cheaply using the same methods.
Once you have created a video, AI can also add captions. This type of software can even pick up on dog barks and other sounds and is now standard on TV, YouTube, Facebook, even Android phones, increasing accessibility. It is also useful in noisy environments. Language learners also commonly report captioning as having benefits in self-directed language learning. One must be careful here, though, as Mayer’s research shows that narration and text together have an inhibitory effect on learning.
Speech to text is also useful in transcription, where a learner may want the actual transcription of a video as notes. Some tools, such as WildFire, take these transcriptions and use them to create online learning to supplement the video with effortful learning. The learner watches the video, which is good for attitudinal and affective learning, even process and procedures, but poor on semantic knowledge and detail. Adding an online treatment of the transcript, created and assessed by AI, can provide that extra dimension to the learning experience.
Once you have the transcribed text, translation is also possible. This has improved enormously, with reduced latency, from Google Translate to more sophisticated services. Google’s Translatotron promises to deliver speech-to-speech translation with an end-to-end translation model that can deliver accurate results with low latency. Advances like these will allow any video to be translated into multiple languages, allowing low-cost and quick global distribution of learning videos.
Ever wondered how YouTube and other video services prevent porn and other undesirable material from appearing? AI filters that use image recognition to search and delete. Facebook claims that AI now identifies 96.8% of prohibited content. It is not that AI does the whole job here. Removing dick pics and beheadings relies on algorithms and image recognition, but there’s also community flagging and real people sitting watching this stuff. AI is increasingly used to protect us from undesirable content.
Want to know something or do something? Searching YouTube is increasingly the first option chosen by learners. YouTube is probably the most used learning platform on the planet. Yet we tend to forget that it is only functional with good search. AI search techniques are what give YouTube its power. Note that YouTube search is different from Google search. Google uses authority, relevancy, site structure and organization, whereas YouTube, being in control of all its content, uses growth in viewing, patterns in viewing, view time, peak view times, and social media features such as shares, comments, likes and repeat views. Search is what makes YouTube such a convenient learning tool.
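To make the idea of ranking on engagement signals concrete, here is a minimal sketch in Python. It is emphatically not YouTube's algorithm, which is not public. The `VideoStats` fields, the `WEIGHTS` values and the simple title-matching rule are all assumptions invented for illustration. It only shows the general shape: filter videos that match the query, then order them by a weighted mix of view time, growth and social signals.

```python
from dataclasses import dataclass


@dataclass
class VideoStats:
    title: str
    watch_minutes: float  # total view time
    view_growth: float    # fractional growth in views, e.g. 0.25 = +25%
    shares: int
    comments: int
    likes: int
    repeat_views: int


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "watch_minutes": 0.5,
    "view_growth": 100.0,
    "shares": 2.0,
    "comments": 1.0,
    "likes": 0.5,
    "repeat_views": 1.5,
}


def engagement_score(video: VideoStats) -> float:
    """Weighted sum of the engagement signals for one video."""
    return sum(weight * getattr(video, name) for name, weight in WEIGHTS.items())


def rank(videos, query):
    """Keep videos whose title contains every query word, best score first."""
    words = query.lower().split()
    matches = [v for v in videos if all(w in v.title.lower() for w in words)]
    return sorted(matches, key=engagement_score, reverse=True)


videos = [
    VideoStats("Bowline knot in 60 seconds", 300, 0.80, 10, 5, 150, 20),
    VideoStats("How to tie a bowline knot", 1200, 0.10, 30, 12, 400, 50),
    VideoStats("Sourdough for beginners", 5000, 0.30, 100, 80, 900, 200),
]

for video in rank(videos, "bowline knot"):
    print(video.title, engagement_score(video))
```

A real system would learn such weights from data rather than fix them by hand, but the principle of ranking on behaviour rather than on links is the same.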
Video services such as YouTube, Vimeo and Netflix use AI to algorithmically present content. AI is the new UI and most video content is served up in this personalised fashion. Netflix famously handed out a $1 million prize for an algorithm and has since refined its approach. This is exactly what is happening in adaptive learning systems, where individual and aggregated data is used to personalise and optimise the learning experience for each individual, so that everyone is educated uniquely.
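One simple way to see how individual and aggregated data can combine into a personal recommendation is user-based collaborative filtering. The sketch below is an assumption-laden toy, not the method used by Netflix or any named adaptive learning system: the learner names, video names and ratings are made up, and `cosine` and `recommend` are hypothetical helper functions. It finds learners with similar ratings and suggests what they rated highly that the target learner has not yet seen.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=3):
    """Suggest unseen items, weighted by how similar each other user is."""
    mine = ratings[target]
    scores = {}
    for other, theirs in ratings.items():
        if other == target:
            continue
        similarity = cosine(mine, theirs)
        if similarity <= 0:
            continue
        for item, rating in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Made-up ratings (1 to 5) from three hypothetical learners.
ratings = {
    "ann": {"intro_video": 5, "safety_video": 4},
    "bob": {"intro_video": 5, "safety_video": 5, "advanced_video": 4},
    "cat": {"intro_video": 1, "advanced_video": 1, "cooking_video": 5},
}

print(recommend("ann", ratings))
```

Here Ann's tastes are close to Bob's, so the video Bob rated highly comes out ahead of the one favoured by the less similar Cat.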
Talking of Netflix, there is now a huge amount of data collected on global services that can inform future decision making. This can be data about when people cut out of a show, literally showing favourite characters and sub-plots, which can be used to inform future script writing. Data on stars and genres can also be used to guide scripting and spend on original content. Similarly in learning, analytics around usage, cut-outs and so on can inform decisions about the efficacy of the learning.
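The cut-out analysis is easy to sketch. Assuming, purely for illustration, that all we log is the second at which each viewer stopped watching, a few lines of Python can show where a learning video loses its audience. The function names, the 30-second segment size and the sample data are all hypothetical.

```python
from collections import Counter


def drop_off_by_segment(exit_times, segment_seconds=30):
    """Count how many viewers stopped watching within each segment."""
    counts = Counter(int(t // segment_seconds) for t in exit_times)
    return {seg * segment_seconds: n for seg, n in sorted(counts.items())}


def retention_curve(exit_times, duration, segment_seconds=30):
    """Fraction of viewers still watching at the start of each segment."""
    total = len(exit_times)
    if total == 0:
        return []
    curve = []
    for start in range(0, duration, segment_seconds):
        still_watching = sum(1 for t in exit_times if t >= start)
        curve.append((start, still_watching / total))
    return curve


# Made-up data: the second at which each of seven viewers stopped a 120-second video.
exit_times = [10, 35, 40, 45, 95, 120, 120]

drops = drop_off_by_segment(exit_times)
worst_segment = max(drops, key=drops.get)
print("Exits per segment:", drops)
print("Biggest drop-off starts at second", worst_segment)
print("Retention:", retention_curve(exit_times, duration=120))
```

In a learning context, a sharp drop in one segment is a prompt to look at what happens there: a dull passage, a confusing explanation or simply the point at which learners have got what they came for.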
All of the above are affecting, and will continue to affect, the delivery of video in learning. Several are already de facto techniques. We can expect them all to develop in line with advances in AI, as well as learner demand. This is clearly an example of where the learning world has lots to learn and lots to gain from consumer services. Most of the above techniques are being built, honed and delivered on consumer platforms first, then being used in a learning context.