mltk: machine listening toolkit

Michael Simpson

Sound is a multiplicity of qualities and features. What if a tool allowed us to easily extract and use more of these data points in real time?

The Machine Listening Toolkit is a project aimed at easing the use of computational audition for creative purposes.

http://mltk.mgs.nyc/

Description

The Machine Listening Toolkit (MLTK) is a software toolkit for streamlining the use of machine listening algorithms. It is designed for real-time applications and provides utilities and data structures that ease the burdens that come with them. MLTK currently includes an openFrameworks add-on, a collection of learning resources, and a standalone dashboard for visually exploring representations of the data and data flows.

The dashboard allows users to select, configure, and explore a wide array of relevant algorithms and have their output rendered in real time as a graphic visualization. It can display a matrix of real-time visualizations that the user can arrange freely. Viewing the graphs side by side makes the differences between them apparent. This helps users better understand the algorithms and their relationships to each other, and it also gives a real-time indication of how each algorithm performs and which seems most appropriate for a given task and sound. Each graph can be explored individually in 3D to help reveal historical trends in the data. Visualizations can be selected on the fly to make their data streams available to external applications using OSC. Data can also be exported to flat files in several common data file formats.
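As a rough illustration of what streaming a feature value over OSC involves, the sketch below encodes a single OSC 1.0 message by hand using only the Python standard library. The address `/mltk/rms` and the value are hypothetical, chosen for illustration; the addresses MLTK actually uses are not stated here.

```python
import struct


def osc_string(text: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_message(address: str, *values: float) -> bytes:
    """Build an OSC message carrying only float32 arguments."""
    type_tags = "," + "f" * len(values)  # e.g. ",f" for one float
    payload = b"".join(struct.pack(">f", v) for v in values)  # big-endian float32
    return osc_string(address) + osc_string(type_tags) + payload


# Hypothetical address and value, for illustration only.
packet = osc_message("/mltk/rms", 0.5)
```

The resulting bytes could be sent to a listening application as a UDP datagram, for example with `socket.sendto`; the host and port depend on the receiving application.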

Classes

Thesis

Thesis Presentation Video

http://mltk.mgs.nyc/

Mtindo

Koji Kanao

'Mtindo' is a desktop style transfer application that gives non-technical people a chance to experience machine learning in action, and that explores some of the practical potential of style transfer beyond the game-like applications currently available. 'Mtindo' means 'style' in Swahili.

I've often wondered what a self-portrait drawn by Van Gogh in the style of Picasso might look like and how I'd feel about it. Obviously, this never happened: Van Gogh died in 1890, when Picasso was just eight years old, so Van Gogh never had a chance to see Picasso's art and we will never know for sure. Today, however, with the texture synthesis techniques developed by Gatys et al., we can come close to simulating this possibility. We can combine an image created in one style with an image created in a different style to generate a new image with the content of the first image but the style of the second. In this way, with style-transfer techniques, we can make a self-portrait painted by Van Gogh Picasso-ish.
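To make the idea of "content of one image, style of another" a little more concrete, here is a minimal sketch of the two losses used in Gatys-style transfer, written in plain Python. It assumes feature maps have already been extracted by a pre-trained network and are given as lists of channels; the normalization and the default weights are simplifications chosen for illustration, not the values from the original method or from Mtindo.

```python
def gram_matrix(features):
    """Channel-by-channel correlations; features is a list of C channels,
    each a flat list of N activations. The Gram matrix captures style."""
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g, s = gram_matrix(generated), gram_matrix(style)
    c = len(g)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(c) for j in range(c)) / (c * c)


def content_loss(generated, content):
    """Mean squared difference between the raw feature maps."""
    diffs = [
        (x - y) ** 2
        for fg, fc in zip(generated, content)
        for x, y in zip(fg, fc)
    ]
    return sum(diffs) / len(diffs)


def total_loss(generated, content, style, content_weight=1.0, style_weight=1000.0):
    """Weighted sum that an optimizer would minimize over the generated image.
    The weights here are illustrative assumptions."""
    return (content_weight * content_loss(generated, content)
            + style_weight * style_loss(generated, style))
```

In a full pipeline, the generated image would be adjusted step by step to reduce `total_loss`, so that its features stay close to the content image while its Gram matrices move toward those of the style image.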

https://mtindo.ml

Description

What kinds of practical style transfer applications can be developed for non-technical people, such as creative professionals, that are easy to use, efficient, and push beyond the limits of what is currently possible with existing graphics tools? Many people think that machine learning requires a Ph.D. or a CS degree and a background in mathematics and statistics. It all sounds very complicated and difficult.

Unfortunately, machine learning isn’t easy to learn, but using applications designed with non-technical people in mind isn’t so hard and can be a lot of fun. Style transfer, which is the technique of recomposing images in the style of other images, is one form of machine learning that has started to be made available online to the general public through simple games.

Mtindo is a desktop style transfer application that gives non-technical people a chance to experience machine learning in action, and that explores some of the practical potential of style transfer beyond the game-like applications currently available. Mtindo means 'style' in Swahili.

Classes

Thesis

Runway

Cristobal Valenzuela

Constant increases in computing power and the proliferation of, and access to, large databases have allowed machine learning to rapidly impact our society. While commercial applications can be found everywhere, experimentation with new prototypes and techniques remains, for the most part, confined to computer scientists and engineers and has limited reach into other fields. My thesis will explore how a digital tool can make machine learning models more approachable and easier to explore in other disciplines. I will specifically focus my research on giving artists and designers better access to state-of-the-art machine intelligence techniques.

Given this framework, I would like to explore questions like:

– How can modern machine learning models be made accessible to people outside the realm of computer science?

– Can we have machines and humans collaborate in a generative process using tools that simplify how the two work together?

https://runwayml.com/

Description

Building on previous and current work I have been developing at ITP, I would like to produce a system that allows artists and designers to experiment with state-of-the-art ML models. My thesis project, called Runway, will take the form of a desktop application for all major operating systems: macOS, Windows, and Linux. It will wrap the typical, common practices involved in handling and using machine intelligence architectures: input manipulation, processing, modification, evaluation, and output will be managed and automated. This will enable creators to focus on handing the model the right inputs and on using the outputs the model produces in any way they choose.

Just as Computer Vision (CV), another branch of artificial intelligence, gained popularity among artists in the early 2000s with tools like OpenCV, ML has the opportunity to reach even more artists thanks to its general set of applications in image processing, text analysis, and content generation.

One main feature of Runway is that it will not focus on data pre-processing, training, transfer learning, architecture optimization, or any other aspect of creating custom machine learning models. Instead, it will center on pre-trained, proven, working models that produce consistent outputs for given inputs. For example, these models might include image recognition, pose estimation, feature detection, image segmentation, text and sound generation, and general classifiers.

I envision the working prototype as a GUI composed of a series of variable options that allow artists and creators to experiment with and explore ML models. As part of this process, I would like to create a series of collaborations with featured artists whose practice has not been influenced by this technology, and have them propose and execute small works with Runway.

Classes

Thesis