tagged: release

ORBIT Dataset

Novel smartphone apps using Artificial Intelligence (A.I.) are really useful in making visual information accessible to people who are blind or low vision. For instance, Seeing A.I. or TapTapSee allow you to take a picture of your surroundings, and then they tell you what things are recognised, for example, “a person sitting on a sofa”. While A.I. can recognise objects in a scene if they are common, at the moment these apps can’t tell you which of the things they recognise is yours, and they don’t know about things that are particularly important to users who are blind or low vision.

While using A.I. techniques in computer vision to recognise objects has made great strides, it does not work so well for personalised object recognition. Previous research has started to make some advances towards solving the problem by looking at how people who are blind or low vision take pictures, what algorithms could be used to personalise object recognition, and the kinds of data that are best suited for enabling personalised object recognition. However, research is currently held back by the lack of available data, particularly from people who are blind or low vision, to use for training and then evaluating A.I. algorithms for personalised object recognition.

This project, funded by Microsoft A.I. for Accessibility, aims to construct a large dataset by involving blind people.

– The ORBIT (Object Recognition for Blind Image Training) Dataset project page

How do you construct a “large dataset”? And do so with the accessibility credo “with us, not for us”? Well, you need a camera app that people will want to use, which then talks to a dataset-collating back-end. In particular: the user-experience challenge of a camera for the blind, and research-ethics-grade data infrastructure. That’s what I was brought on to make.

project | 2020

Choose Your Own Adventuresque

A hyperlinked slide deck system built with no constraints on look or animation – or, as the work to make its one-off commercial forebear was sold to me: ‘choose your own adventure’. And who doesn’t like that?

Choose Your Own Adventuresque is a touchscreen app demo. Replace the placeholder ‘adventure’ in content.js. Adapt the look or functionality as you like, or better still commission me for lots more whizz-bangery.

As released here, each story node is written as below. But it’s all configurable, a matter of building rendering templates using PixiJS that consume the data provided (see app.js).

storyA1: {
    template: 'choice',
    text: 'This is placeholder content to demonstrate the codebase.',
    choices: [
        { text: 'Choose this', to: 'storyA2A', image: 'assets/cyoa001.jpg' },
        { text: 'Or this', to: 'storyA2B', image: 'assets/cyoa003.jpg' },
        { text: 'Perhaps this', to: 'storyA2C', image: 'assets/cyoa006.jpg' },
        { text: 'Or realise the futility...', to: 'storyA2D', image: 'assets/cyoa022.jpg' },
    ],
},

project | 2019

Cardboard Wipeout

A gaming workshop for Studio Digital. Ten hours, ten teenagers. Go!

I developed and led a workshop on digital creativity. The brief was gaming. I wanted some kind of physical computing spectacle to expand everyone’s idea of what ‘Studio Digital’ could mean.

And so: Cardboard Wipeout, an immersive racing game that riffs off the seminal (to me) futuristic video game wipE’out”. We bought two super-cheap radio-control cars, asked the town for all their post-Christmas cardboard, and over two sessions improvised our way to what you can see in the video.

In doing so, the participants:

  • Adapted a video game to a new medium
  • Created an immersive, interactive experience
  • Worked through marking-out and fitting-up a three-dimensional figure-of-eight track out of flat cardboard sheets
  • Made vinyl graphics to brand the track and space
  • Learnt how a micro:bit controller can animate special LED strips
  • Learnt how micro:bit controllers can talk to each other and ‘run’ a game
  • Experimented with whether first-person video could work for driving the cars
  • An’ stuff…

Most of all, I think the workshop provided two insights:

  • The world is malleable. If all you’ve ever known is passive consumption of media, games, etc., it’s quite a leap to realise you don’t have to accept things as they are: you can break “warranty void if broken” seals and make your own culture.
  • You can make something that looks amazing… even if it’s mega-scrappy when you turn the lights back on. So: you got there, you’re as good as anyone else actually is, now go iterate and make it truly amazing.

Thanks to Jon, Sophie, Naomi and everyone at Contains Art and their Studio Digital programme.

Implementation

The game needs:

  • A micro:bit controller to run the game logic, using the A button to start the countdown and the B button to manually record a race finish. This micro:bit can also be used as the finish-line sensor and a/v controller (see below). A minimal sketch of this game loop follows the list.
  • A micro:bit controller per strip of individually-addressable LEDs (WS2812B, aka NeoPixel). I used two 5m, 150-LED strips.
  • A radio-control car and track.
  • Cardboard saws. Yes, they’re a thing: MakeDo, Canary.
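
For flavour, here is a minimal MicroPython sketch of what that game-logic loop could look like. The radio message strings and the countdown timing are illustrative assumptions, not the workshop’s actual code.

from microbit import button_a, button_b, display, sleep, running_time
import radio

radio.on()
race_start = None

while True:
    # A button: start the countdown and tell the other micro:bits over radio
    if button_a.was_pressed() and race_start is None:
        radio.send("countdown")            # LED-strip micro:bits animate the count
        for n in (3, 2, 1):
            display.show(str(n))
            sleep(1000)
        radio.send("go")
        race_start = running_time()
    # B button: manually record a race finish
    if button_b.was_pressed() and race_start is not None:
        race_ms = running_time() - race_start
        radio.send("finish:" + str(race_ms))   # a/v kit can display the time
        display.scroll(str(race_ms // 1000))   # rough race time in seconds
        race_start = None
    sleep(50)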

Bonuses

  • A micro:bit controller to message audio-visual kit capable of running music, countdowns, lap timers and so on, e.g. a PC listening to a serial port over the micro:bit’s USB connection. In the repo linked below there is a Mac OS X Quartz Composer patch that will do this; a Python sketch of such a listener follows this list.
  • A micro:bit in each car can run the game logic, which opens up much more game-play and a/v potential. See this diary post.
  • A micro:bit with a sensor to detect cars passing the finish line. I had a break-beam sensor to do this; it didn’t seem too reliable, and then a wire broke during the workshop.
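
On the PC side, here is a hypothetical listener using pyserial rather than the Quartz Composer patch; the serial port path is machine-specific, and the message strings match the sketch above rather than any shipped code.

import serial  # pip install pyserial

# The micro:bit appears as a USB serial device at 115200 baud; the port
# path below is an example, it will differ per machine (e.g. COM3 on Windows).
with serial.Serial("/dev/tty.usbmodem14102", 115200, timeout=1) as port:
    while True:
        line = port.readline().decode("utf-8", errors="ignore").strip()
        if line:
            # e.g. "countdown", "go", "finish:12345" printed by the messaging micro:bit
            print(line)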

Code

Notes

This was the first time I’d used micro:bit controller boards. They’re great. A really practical feature set – that 5x5 LED grid and peer-to-peer radio in particular – backed by a wealth of teaching materials and stand-alone projects. Drag-and-drop blocks in a webpage for those new to programming, and Python and the Mu editor for those needing something more. And they’re cheap.

Two things caught me out. There is no breakout board for the micro:bit that interfaces 3.3V and 5V. NeoPixel-like LED strips, for instance, need 5V, so a NeoPixel-controller micro:bit needs both a power regulator in and a line driver back out. I hacked these onto a bread:bit breakout board, but to be robust and repeatable this really should be built into a PCB like that. (The controller board I made for Tekton is that on steroids, for the Raspberry Pi; it’s thanks to making those that I had the line driver chips lying around.)

The other thing is that sometimes the race would finish straight after starting, or after the finish state it would pass straight through waiting-for-player into countdown. The problem is that the radio module has a queue of incoming messages, which meant stale or even dropped messages. So I wrote an only-care-about-the-latest wrapper class: radiolatest.py on GitHub.
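
The gist of it, as a minimal sketch (the real class is radiolatest.py, linked above; names here are illustrative): drain the incoming queue every time you ask for a message, and keep only the newest.

import radio

def receive_latest():
    """Drain the incoming radio queue and return only the newest message."""
    latest = None
    while True:
        msg = radio.receive()
        if msg is None:      # queue is empty
            return latest    # None if nothing was waiting
        latest = msg         # anything read earlier is now stale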

project | 2019

Drawing Interactions

Techniques for the transcription, analysis and presentation of social interaction, in a week-long hack session, with funding.

That was the pitch Saul Albert made to me. Tasked with coming up with prototypes, I happily accepted. After all, these themes and the practical work around them became a large part of my PhD. I forked one app and wrote another from scratch, and did so alongside the others involved – Pat Healey, Claude Heath and Sophie Skatch.

A day exploring, a day presenting, and for me the time in between deep in code with an exacting challenge: it was fun, rewarding work, and everything we did has gone down well. I’m certainly happy with the app I made. Watch the demo above, and read more in the diary posts below.

project | 2018

Machine Folk

A post-doc position helping Bob Sturm and Oded Ben-Tal with ‘machine-folk’. The approach is to bring machine learning into the domain of musical practice – performance, composition, improvisation – to test how the models can fuel musical activities.

This meant engaging potential users of folkrnn who weren’t comfortable with a command line, which for me mostly meant building an online implementation of the tool and a community site to go with it.

The diary posts tell the story…

project | 2017 | downloads: folkrnn-abstract-ismir-2018.pdf · folkrnn-poster-ismir-2018.pdf

Visualising Performer–Audience Dynamics

Live performances involve complex interactions between a large number of co-present people. Performance has been defined in terms of these performer–audience dynamics (Fischer-Lichte 2014), but little is known about how they manifest. One reason for this is the empirical challenge of capturing the behaviour of performers and massed audiences. Video-based approaches typical of human interaction research elsewhere do not scale, and interest in audience response has led to diverse techniques of instrumentation being explored (e.g. physiological in Silva et al. 2013, continuous report in Stevens et al. 2014). Another reason is the difficulty of interpreting the resulting data. Again, inductive discovery of phenomena as successfully practised with video data (e.g. Bavelas 2016) becomes problematic when starting with numerical data sets – you cannot watch a spreadsheet, after all…

A spoken paper presented at the International Symposium on Performance Science, Reykjavík 2017. The talk is a good way to see what I got up to during my PhD… and hey, there’s no stats and lots of pretty pictures.

project | 2017 | downloads: ComedyLabDatasetViewer-HitTest.png · ComedyLabDatasetViewer-Promo-5up.png · ComedyLabDatasetViewer-Promo-Aud11.png · ComedyLabDatasetViewer-Promo-BlurZoom.png · ComedyLabDatasetViewer-Promo-HideTop.png · ComedyLabDatasetViewer-Promo-Perf.png · ISPS2017-tobyspark-abstracts2.jpg · ISPS2017-tobyspark-abstracts3.jpg

ORBIT open-sourced

To complement the release of the dataset, here’s the infrastructure used to create it – my code open sourced for future dataset collection projects to build on our work. And as our work shows, machine learning needs more inclusive datasets.

https://github.com/orbit-a11y/ORBIT-Camera
https://github.com/orbit-a11y/orbit_data

diary | 31 mar 2021 | tagged: orbit · research · release · code

ORBIT Camera v2 → App Store

ORBIT Camera is available on the App Store, worldwide.

We’re worldwide! We are now in Phase Two of the ORBIT Dataset project, with an improved app, easier instructions, and most importantly: working around the world.

diary | 04 nov 2020 | tagged: orbit · release

ORBIT Camera → App Store

ORBIT Camera is available on the UK App Store, for iPhone and iPad. And with that, phase one data collection starts. Huzzah!

The app used by blind and low-vision people to collect videos for the ORBIT dataset project – Object Recognition for Blind Image Training.

If you are blind or have low vision, read on! We are collecting a dataset to help develop AI recognition apps that will use your mobile phone’s camera to find and identify the things that are important to you. For example, has someone moved your house keys? Do you regularly need to identify specific items while out shopping? What about your guide cane – have you forgotten where you put it, or gotten it confused with someone else’s? Maybe you want to recognise the door of a friend’s house? Imagine if you did not have to know exactly where your things were in order to find or identify them again.

To build these recognition apps, a large dataset of videos taken by blind and visually impaired users is needed. As part of the ORBIT dataset project, you will be asked to take multiple videos of at least ten things that are meaningful to you or that you regularly need to find or identify. We will combine these videos with submissions from other ORBIT contributors to form a large dataset of different objects. This dataset can then be used to develop new AI algorithms to help build apps that will work for blind and visually impaired users all over the world.

Not that it was without drama…

Thank you for contacting App Store Review to request an expedited review. We have made a one-time exception and will proceed with an expedited review of ORBIT Camera.

diary | 07 may 2020 | tagged: orbit · release · code

dvi mixer - code released

things are building to a crescendo. as promised, the software that runs the controller will be open-source, and so here it is being released.

http://mbed.org/users/tobyspark/code/SPK-DVIMXR/

notably -

i’ve also been corralling the OSC code available for mbed into a library: http://mbed.org/users/tobyspark/code/OSC/

for history’s sake, and perhaps it will help any hackers, attached is a zip of the arduino code i had before the leap to mbed was made, v07 through to today’s v18. none of the interface goodness, but it has the fast serial communication technique i came to, along with keying etc.

diary | 02 aug 2012 | tagged: code · release · dvi-mixer · vj | downloads: spk_dvimxr_v07_arduino_final.zip

SPK-Calligraphy v1.2

…and now, having used it in anger, here we have:

  • Added a bounds feature, to give you all the sizing information you need to block out your calligraphy renderers.
  • Fixed a crashing bug triggered by sending a ‘clear all lines’ signal mid-stroke.
  • Added an advanced example derived from KineTXT development. Use space to send chunks of calligraphy to the screen, as if you were writing on a horizontal scroll.

diary | 11 may 2009 | tagged: quartz composer · live illustration · vj · code · mac os · kinetxt · release | downloads: SPK-Calligraphy-v1.2.zip

SPK-Calligraphy v1.1

…and here is the bugfix release.

  • fixed purge last object exception
  • removed unused boilerplate methods
  • added zPos to animator
  • ordered ports (arrange in @dynamic line)
  • fixed x,y mis-patch in sample qtz file

diary | 07 may 2009 | tagged: *spark · quartz composer · live illustration · vj · code · mac os · release | downloads: SPK-Calligraphy-v1.1.zip

SPK-Calligraphy v1.0

KineTXT has spurred many custom plug-ins, generally either esoteric or usurped by kineme or the next major release of QC. the latest, however, probably deserves to see the wider light of day, and so here is a snap-shot of it having just passed a notional ‘v1.0’. it’s two patches designed to capture and render handwriting and doodles from a tablet, but they should be pretty useful to anyone who wishes for some form of digital graffiti in their QC compositions.

if you want anti-aliasing, you’ll unfortunately need to leave the QC app behind, but if you can run without the patching editor window it’s just three lines of code to add to the qc player sample application and voila: this plugin and all 3D geometry become anti-aliased. vade worked it all out and outlines the territory here: http://abstrakt.vade.info/?p=186.

if you want different nibs, pen-on-paper-like textures or suchlike… well i have my needs and ideas, but the source is there. share and share alike!

the plug-in is released under gplv3, and is attached below.

diary | 28 apr 2009 | tagged: *spark · quartz composer · mac os · vj · code · release | downloads: SPK-Calligraphy-v1.0.zip

SPK-StringToImageStructure

having worked through the hillegass cocoa book, it’s time to start putting that to good use. and project number one was always going to be one of the big glaring omissions in quartz composer to my mind: a means of animating a string on a per-character basis.

if you want to compete with after-effects, then you need to be able to produce the various type animations directors are used to, and you need to do so at a decent framerate. to animate, say, the entry of a character onto the screen, you would create the animation for one character and then iterate that operation along the string. the problem is, rendering each glyph inside the iterator is both massively expensive and massively redundant, but that’s the only approach qc allows, hacks on the back of hacks apart. a much better approach would be to have a single patch that takes a string and produces a data glob of rendered characters and their sizing and spacing information, firing off just once at the beginning and feeding the result to the animation iterator: at which point you’re just moving sprites around and the gpu barely notices.
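
the same render-once-then-animate split translates outside qc too; here’s an illustrative python sketch using pillow. the font path and sizes are my assumptions, nothing from the plug-in itself.

from PIL import Image, ImageDraw, ImageFont

def string_to_image_structure(text, font_path="DejaVuSans.ttf", size=48):
    """render each glyph once, returning its image plus sizing/spacing info."""
    font = ImageFont.truetype(font_path, size)
    structure, x_advance = [], 0
    for ch in text:
        w = max(int(font.getlength(ch)), 1)   # horizontal advance of this glyph
        glyph = Image.new("RGBA", (w, size * 2), (0, 0, 0, 0))
        ImageDraw.Draw(glyph).text((0, 0), ch, font=font, fill="white")
        structure.append({"char": ch, "image": glyph, "x": x_advance, "width": w})
        x_advance += w
    return structure

# the per-frame animation loop then only repositions these cached glyph
# images (sprites); the text itself is never re-rendered.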

the patch is released under gplv3, and is attached below.

a massive shout to the kineme duo for leading the way with their custom plug-ins and general all-round heroic qualities. in particular their ‘structure tools’ patches were the enabler for those early text sequencing experiments.

diary | 15 feb 2008 | tagged: *spark · quartz composer · vj · code · mac os · release | downloads: SPK-StringToImageStructure-1.0.zip

*spark titler redux and release

as shown in the ‘pun me this’ entry, the *spark titler was used in nascent form at sheep music, and the promise to tidy up and release it as open-source software has been followed through. so, please find attached: sparktitler-v1.1.zip.

the titler’s interface allows you to take between two sets of title/subtitle, with the choice of four backgrounds: black / green / a quicktime movie or a folder of images. the output window will automatically go full-screen on the second monitor if it detects one is available at launch, otherwise it will remain a resizable conventional window.

it is released with the intention that it can be reused for other events without changing a single line of code: you can design the animation and incorporate quicktime movies in the design by editing the ‘GFX’ macro in the quartz composer patch, and it’s a matter of drag-and-drop to replace the logo in the interface.

for those who wish to dig deeper and improve the whole package, the source is released under GPL. the xcode project provides an adequate shell for the patch, implemented with just two cocoa classes and a nib file complete with bindings between the qc patch and the interface window. the classes are required to tell the quartz composer patch where to find the resource directory of the application’s bundle (necessary for any ‘image with movie’ nodes), and to subclass the output window so it is sent borderless to the second display if appropriate. features apart, there is certainly room for improvement: an ‘open file’ dialog instead of the raw text fields would be good, likewise solving the text field update issue.

if you do use it, let us know: operator@tobyz.net

diary | 30 jul 2007 | tagged: titler · release · *spark · vj · code · mac os · quartz composer | downloads: sparktitler-v1.1.zip

vdmx plug-in: dj style mixer

here is a prototype/demonstration of using your own image kernel in vdmx. rather than being an effect, this is an A/B mixer, which means you can use vdmx in the ‘old skool’ way, by mixing together two video streams rather than rendering the whole stack of layers. it also has controls like a DJ scratch mixer, so as well as a crossfader, you’ve got a fader for each channel, and a fader curve control.

to use, make a layer or group for the A channel and another for the B channel, and a layer at the top of the stack for your output. trigger the qc patch in the output layer, and assign the A and B layers/groups to its video input drop downs.

if you open the qc patch, you’ll see the video inputs get resized to the output res, as image kernels don’t handle different-sized inputs too well, and then all the inputs are fed into an image kernel, i.e. a little filter written specially for the graphics card. in that there is some basic maths for applying a variable crossfade curve, and a line that adds the two inputs together. take a look, it’s not so hard; i have far more trouble with doing things like translating the crossfader curves into a mathematical expression than with the code itself.
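
for the curious, the per-pixel maths boils down to something like the below, written here in plain python rather than the kernel language; the power-law curve is a stand-in for whatever mapping the patch actually uses.

def mix_pixel(a, b, crossfade, fade_a=1.0, fade_b=1.0, curve=1.0):
    """a, b: pixel values 0..1; crossfade: 0 = all A, 1 = all B."""
    gain_a = ((1.0 - crossfade) ** curve) * fade_a  # curve < 1 gives a sharper, scratch-style cut
    gain_b = (crossfade ** curve) * fade_b
    return a * gain_a + b * gain_b                  # additive mix of the two channels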

so take this as a starter for ten if you’re interested. attached below.

i have my own mixer now that does three-channel mixing just how i want, and it’s really cleaned up my vdmx interface[1], let alone the directness of the processing. sweet.


  1. no more layer masks or fade-to-blacks getting in the way in the different layers, and the preview windows now show just post-fx and not post-masking/mixing fx as well. ↩︎

diary | 08 jun 2007 | tagged: *spark · quartz composer · vj · vdmx · release | downloads: spark-DJmixer-v3.qtz