Abram Hindle's Blog

Software is Hard

Software Science and Empirical Software Engineering

An introduction via my MSR 2018 Most Impactful Paper Talk

I recently gave a talk at MSR 2018 about a paper I wrote 10 years ago that was not immediately actionable to practitioners but potentially actionable to researchers.

Here’s a practice video of the talk, but I think I was tired and you can audibly hear yawning.

https://www.youtube.com/watch?v=XMEIJTPf_zo

My concerns were that we as a community:

  • were rejecting papers that could be used to build a body of knowledge because they weren’t immediately actionable;
  • were writing reviews saying “isn’t this obvious?” when the claim had only ever been anecdotal;
  • were providing lip service to replication yet not accepting actual replication papers;
  • were grappling with the idea that engineering has to solve a problem;
  • were facing randomness in the review process because of “so what”, “not useful”, “irrelevant to practitioners” reviews that might not be accurate;
  • were not sure how to deal with papers that would add to our body of knowledge yet would not be applied immediately.

In the talk I was concerned that as a community we are too quick to reject work that is not immediately actionable to the stakeholders: developers, managers, and software companies. We use our value judgment to evaluate the motivation of the authors, and if we don’t agree we argue that the paper should be rejected because it isn’t useful or actionable, or is poorly motivated. The point is that a technically sound paper can easily be rejected because people do not like its argument. It’s not that you’re wrong, technically incorrect, or inaccurate (though you could be); it’s that you might not have solved a problem right now, and you might not have convinced reviewers of the utility of the work. My focus in the talk was on rejecting work that wasn’t immediately actionable to software engineers and related stakeholders. So I brought up a term that Jim Cordy raised with me in our discussions: “software science”.

Software Science

In the talk I referenced “software science” and I tried to make a distinction between it and Empirical Software Engineering. In my view engineering is solving a problem under economic constraints. I think engineering is usually distinguished from science by application and practicality, whereas science is far more general. My view of science is that it is meant to systematically study something. Some people draw the line at studying the natural world; some argue science is the process, others the knowledge produced by it. In general I think what we’re supposed to get from science is more knowledge, and that knowledge might not be immediately usable.

Software science already exists in many different ways. Much of empirical software engineering is not actionable: pure measurement and statistical tests that try to reflect what is shown in the data we mine. Halstead wrote a book, “Software Science”, in which he argued for measurement and proposed his own measures. The validity and usefulness of Halstead’s metrics are argued to this day, but people in 2018 are still using them; I’ve implemented and used Halstead metrics myself. There are many departments and divisions of Software Science at various universities that study software and publish empirical software engineering works. I’m not trying to claim the term or overshadow the other uses of “software science”; I’m just asking for an understanding of research that has value but is not immediately actionable.
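As a concrete example of the kind of measurement Halstead argued for, here is a minimal sketch of his classic formulas in Python. The token counts fed into it are hypothetical, and in practice the contentious part is deciding what counts as an operator or an operand.

    import math

    def halstead_metrics(n1, n2, N1, N2):
        """Classic Halstead measures.
        n1: distinct operators, n2: distinct operands,
        N1: total operator occurrences, N2: total operand occurrences."""
        vocabulary = n1 + n2                                    # n = n1 + n2
        length = N1 + N2                                        # N = N1 + N2
        volume = length * math.log2(vocabulary) if vocabulary else 0.0
        difficulty = (n1 / 2.0) * (N2 / n2) if n2 else 0.0
        effort = difficulty * volume
        return {"vocabulary": vocabulary, "length": length, "volume": volume,
                "difficulty": difficulty, "effort": effort}

    # Hypothetical counts for a small function:
    print(halstead_metrics(n1=8, n2=12, N1=25, N2=30))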

Software science in this context is a broad term meant to encompass research that seeks to apply scientific principles to the study of software and software development, not necessarily within the constraints of an engineering discipline. Is software science a superset or subset of empirical software engineering? No, not to me at least. Many works belong in both categories, but there are works that fit software science more than empirical software engineering. For instance, consider Israel Herraiz’s work on statistically characterizing the distributions of version control data seen in repositories. It might be useful to a tool builder, but its direct application is not apparent; yet that body of knowledge, of functions that can generate these distributions, is important for theory building and understanding.

Regardless, until we mature as a discipline and find a way to aggregate knowledge we will be harming ourselves because we are actively preventing the build up of knowledge in our quest for novelty and immediate actionability. These are important, and we all recognize them as important. Yet as a community we cannot mature until we:

  • solidify knowledge with empirical studies;
  • accept that “this is obvious” is not a valid reply from reviewers;
  • accept that collecting and building knowledge is important for science;

– paraphrased points from Daniel German

Thus I argue we should be able to signal that our work is not necessarily engineering but more on the scientific side. That doesn’t mean we give a free pass to anyone who does something: if you claim your work is “software science” then you’d better back that claim up.

What should be in a software science paper?

So if one is to submit a software science paper where is the bar? What is required?

In my opinion one big problem in our field is overclaim. I’m guilty of it. You’re probably guilty of it. Overclaim in a paper can kill follow-up papers, because the onus of replication (the difficulty of it) is high and acceptance of straight replication is low. Furthermore, during review your paper will always be in the shadow of the original that laid claim to the space. We need to allow others into the same space, multiple times. We’ll never get meta-analysis like medicine has without it.

Another sin is poor future work. How many times have you read a future work section that says: we plan to apply this to more projects and different programming languages? Is it not obvious enough that your work had limitations? Please can we have better future work sections, where the authors (myself included) use their imaginations and think about where this will go. If you cannot envision the future use of your labour, the reviewers are going to have a hard time too. Because future work is literally science fiction (I stole that joke) you can always make something up. It just has to be plausible.

Furthermore, in the interests of science and of not claiming too much space, you should make your assumptions, and what you did not investigate, clear to the reader. They might not be apparent from your claims alone.

Finally, do you want to have an impact? Do you want people to cite you? Well, let them follow up on your work by using your data and analysis scripts. At least share the analysis scripts; there is, unfortunately, a whole raft of good excuses for not sharing data.

You should tell the reader:

  • Where this work could be used. What do you envision this work could be aggregated with?
  • What is the scope and context of your work? Don’t overclaim or over generalize. There will be need for other niches to be studied. If you investigate TDD projects written in Java on Github, then you shouldn’t generalize to all TDD projects in all languages in all repositories. If you investigate the performance of an Android software development pattern, should you claim all performance for all mobile platforms?
  • What isn’t covered and what assumptions you made.
  • Where your data is, and preferably where your analysis scripts are.

Thus you should have:

  • Future work sections that are more detailed than “we want to test on more projects”; how about the hard stuff you did not get to, or papers you envision that could build on this work?
  • The scope of the work clearly documented in the title, abstract, intro, and conclusions.
  • A description of what you did not explore and what assumptions you had to make.
  • Hyperlinks to your data and analysis scripts (please!)

How should I review a software science paper?

Review it like you did before. But in addition:

  • Make sure the authors aren’t overclaiming in the title, abstract, introduction, or conclusions.
  • Make sure the paper is properly scoped.
  • Make sure the paper is replicable.
  • Check if the future work is good enough.
  • Check if the authors have come up with a rationale for why this might be used in the future.
  • If you’re going to reject based on motivation, please consider how the authors have argued for the future use of this work: could a measurement from this paper be used as a parameter or as advice for another developer?
  • Avoid saying “isn’t it obvious” unless you have prior work to back it up; anecdotes are not enough.
  • If you are reviewing a paper in a context you’re not familiar with, try to brush up on that area. Otherwise we do a grave disservice to younger researchers, who are often closer to the cutting edge of development and to practical SE trends that you might be unfamiliar with.
  • Check if data or analysis scripts are available.

What should relevant conferences and journals do?

Conferences and journals that wish to be open to this kind of work need to help authors and reviewers understand where they stand and what the bar is.

  • Allow lightweight shepherding so that overclaim can be dealt with safely in the camera-ready version.
  • Provide lip service to accepting replications and provide clear guidelines of what is expected of authors and reviewers.
  • Provide lip service to software science and provide clear guidelines of what is expected from authors and reviewers.
  • Allow shorter papers and emphasize to reviewers you cannot do everything under the sun in a shorter paper.
  • Provide signals of where software science and empirical software engineering works should go.
  • Provide examples of “the bar” and guidelines for meeting the bar. I like checklists.

What I’m not saying

I’m not saying accept everything; my goal is more consistent reviewing and less work getting rejected because of reviewers’ lack of experience or field knowledge. My goal is also to improve on the empirical software engineering dilemma whereby much is expected of the work yet its actionability is out of reach, even though it could later be built upon by other authors.

I’m not saying we have to let boring things into conferences. Although we do need to have a discussion about where we send dry results, negative results, or replications.

In conclusion

In other sciences it took the slow build-up and acquisition of knowledge before many leaps in understanding and practicality could be made. Software should be no different; we’re too young to be discriminating so harshly in an area we so poorly understand. How many times have you heard people argue for the abolishment of the term software engineering because “we can’t even approximate engineering”? This is a signal that our field is in its infancy and we should use this chance to build up our body of knowledge.

If we focus so heavily on the immediately actionable we will hobble our future.

To push the idea of “software science” or science in software engineering we need to have buy in from many stakeholders:

  • Authors need to provide a clear signal in their papers about the limits of their work, the scope of their work, and where their work can be used in the future.
  • Reviewers need to understand that we can’t do everything in 1 paper; that the obvious is not obvious yet until we can cite it as obvious; that there is a value in building up knowledge from results that might not be immediately actionable.
  • Journals and Conferences, as the gatekeepers of SE knowledge, need to make it clear how to build articles that can be accepted. Make it clear to reviewers and authors what you seek and if it is software science what you expect from it.

Thus I argued that we need to consider that we have different kinds of research, and that we might have to more clearly identify works that fall under the software science umbrella of not-immediately-actionable results so that we can evaluate them in a fair manner. If we don’t, we risk slowing down our progress with a bias towards novelty and immediate application rather than truly improving the field from a knowledge-oriented perspective.

If you do wish to engage with me on this

Please do it via email and/or blog. abram.hindle@ this domain will work fine.

Most of what is written here is not mine. It’s a mix of Jim Cordy, Daniel German, Ric Holt, Michael Godfrey, Ahmed Hassan, Prem Devanbu, Wikipedia, the SE community, and western philosophy.

More Sabbatical Thoughts

More sabbatical tips that I learned. Perhaps the hard way:

  • Bring your spouse! Everything can be a lot more lonely or less fun than you imagined. Also distance-based relationships are very stressful.
  • Airbnb is potentially your friend and enemy. If you need a furnished apartment it’s a good option. It could also be terrible.
  • Ear plugs are your friend for travel and visits.
  • If you are going to a furnished apartment don’t pack much.
  • Canadians should check with NSERC if they want to apply for JSPS scholarships.

Analyzing the Effects of Test Driven Development in GitHub

So I’m pretty happy about this paper. Neil Borle and Meysam Feghhi worked very hard in my grad class on this project. Their work was to investigate whether TDD (test driven development, or more specifically test-first development) existed in GitHub Java projects and whether it could be shown statistically to improve software quality over projects that were not test-first followers of TDD.

They worked hard and came up with a systematic definition of how to identify TDD test-first projects, with a threshold for how “test-first” a project was. They showed that, for a variety of thresholds, TDD on GitHub Java projects was rare and that there was not a clear signal of quality improvement.
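The paper’s actual operationalization is more careful than this, but as a rough illustration of the idea (and only that; the naive file-naming convention below is an assumption of mine, not the paper’s), a test-first score for a repository can be sketched like so:

    # Hypothetical sketch, not the paper's exact definition: score how
    # "test-first" a repository is by checking, for each source file,
    # whether a matching test file first appeared in the same commit or earlier.

    def first_seen(commits):
        """commits: list of (timestamp, [paths]) in chronological order.
        Returns {path: index of the commit where that path first appears}."""
        seen = {}
        for i, (_, paths) in enumerate(commits):
            for p in paths:
                seen.setdefault(p, i)
        return seen

    def test_first_ratio(commits):
        seen = first_seen(commits)
        sources = [p for p in seen if p.endswith(".java") and "Test" not in p]
        if not sources:
            return 0.0
        test_first = 0
        for src in sources:
            test = src.replace(".java", "Test.java")  # naive Foo.java <-> FooTest.java pairing
            if test in seen and seen[test] <= seen[src]:
                test_first += 1
        return test_first / len(sources)

    def is_tdd_like(commits, threshold=0.5):
        """A project counts as TDD-like at a given test-first threshold."""
        return test_first_ratio(commits) >= threshold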

So is TDD more a suggestion than a process actually followed? I know personally from my experience with undergraduates that testing is hard, and testing first is even harder.

Anyways, on to the paper; see for yourself:

Paper Title: Analyzing The Effects of Test Driven Development In GitHub

Journal: EMSE

Journal-ID: EMSE-D-17-00057R2

Authors: Neil C. Borle, Meysam Feghhi, Eleni Stroulia, Russell Greiner, Abram Hindle from the Department of Computing Science, University of Alberta, Edmonton, Canada

EMSE URL: http://rdcu.be/zqhX

Self-archived PDF URL/Pointer: http://softwareprocess.es/pubs/borle2017EMSE-TDD.pdf

Here’s our abstract:

Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. To that end, we have conducted a comparative analysis of GitHub repositories that adopts TDD to a lesser or greater extent, in order to determine how TDD affects software development productivity and software quality. We classified GitHub repositories archived in 2015 in terms of how rigorously they practiced TDD, thus creating a TDD spectrum. We then matched and compared various subsets of these repositories on this TDD spectrum with control sets of equal size. The control sets were samples from all GitHub repositories that matched certain characteristics, and that contained at least one test file. We compared how the TDD sets differed from the control sets on the following characteristics: number of test files, average commit velocity, number of bug-referencing commits, number of issues recorded, usage of continuous integration, number of pull requests, and distribution of commits per author. We found that Java TDD projects were relatively rare. In addition, there were very few significant differences in any of the metrics we used to compare TDD-like and non-TDD projects; therefore, our results do not reveal any observable benefits from using TDD.

Here’s the data and scripts if you need them: https://archive.org/details/tdd_project

@Article{Borle2017,
    author="Borle, Neil C.
        and Feghhi, Meysam
        and Stroulia, Eleni
        and Greiner, Russell
        and Hindle, Abram",
    title="Analyzing the effects of test driven development in GitHub",
    journal="Empirical Software Engineering",
    year="2017",
    month="Nov",
    day="25",
    abstract="Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. To that end, we have conducted a comparative analysis of GitHub repositories that adopts TDD to a lesser or greater extent, in order to determine how TDD affects software development productivity and software quality. We classified GitHub repositories archived in 2015 in terms of how rigorously they practiced TDD, thus creating a TDD spectrum. We then matched and compared various subsets of these repositories on this TDD spectrum with control sets of equal size. The control sets were samples from all GitHub repositories that matched certain characteristics, and that contained at least one test file. We compared how the TDD sets differed from the control sets on the following characteristics: number of test files, average commit velocity, number of bug-referencing commits, number of issues recorded, usage of continuous integration, number of pull requests, and distribution of commits per author. We found that Java TDD projects were relatively rare. In addition, there were very few significant differences in any of the metrics we used to compare TDD-like and non-TDD projects; therefore, our results do not reveal any observable benefits from using TDD.",
    issn="1573-7616",
    doi="10.1007/s10664-017-9576-3",
    url="https://doi.org/10.1007/s10664-017-9576-3"
}

They Gave Me Tenure. Or You're Stuck With Me Now.

July 1, 2017 was the day that the University of Alberta awarded me Tenure. Thus I was promoted to Associate Professor. You’re stuck with me now. Or I’m stuck with you.

I thought tenure would solve all of life’s problems, such as caring about publications. No, it did not. I still care about publications because I am attached to the success of my students. Publications at top-tier venues are effectively king-makers for students; they can make or break a career. Thus tenure did not reduce my attachment to publications; mostly it reminded me of a reason why some of these publications are very important: they present my students to the research world and can affect their job prospects.

This attachment to students means I still feel pain even if some of the publication attachment has been diminished.

More importantly my sabbatical has started. I have learned a few things so far in the first 3 months:

  • Drop as much responsibility as you can. Really do it. Reviews, committees, obligations. Use your strength to say no, or to put things off until you are back.
  • Travel as early as possible. I made the mistake of not travelling immediately when I started and it just got me into a bunch of service that I didn’t need or want.
  • Plan for sabbatical a year ahead of time at the latest. And read the fine print: if you’re applying for a scholarship, be sure you can apply while still an assistant professor.

It’s not all bad. Ahmed Hassan of the SAIL lab invited me to come to Kingston to work with his students. We’re collaborating on some interesting software energy consumption topics regarding the app store.

Kingston is a nice city to visit if you live near the downtown. The downtown is actually useful: it has hardware stores, computer stores, actual grocery stores, office supplies, banks, video stores (still), an excellent independent cinema (The Screening Room), a walk-in clinic, a gym, a pool, a library, a hospital, lots of restaurants, and most importantly Ahmed’s lab! So it’s very convenient if you’re close to the downtown. Not a lot of small towns of 125-160k people have a coherent and useful downtown; many focus too much on high fashion and stores you’d find in a mall. Alternatively, some downtowns are just wastelands of ever-failing small businesses. Kingston’s downtown is coherent and not just a mall slammed into the main street. So for a few months it is definitely pleasant.

For experimental music, Kingston has Queen’s University, which has a nice music department, and in the late fall they also have Tone Deaf (@tonedeafyg), an experimental music festival. Kingston is pretty arty, with a variety of civic arts groups as well as Arts at QueensU. To the south-west of QueensU is the beautiful Isabel Bader Centre for the Performing Arts, which often hosts more of the academic fare. It is obvious that the citizens of Kingston and QueensU try very hard to keep arts and arts discourse relevant and prominent in Kingston, especially since their youth can escape to the deceptively greener pastures of Toronto or Montreal on a 2.5 hr bus at any time.

I have other sabbatical plans and my time at Kingston will soon end.

Deep Learning Bitmaps to PCM

Deep Learning Bitmaps to PCM, Audio fun with deep belief networks

Can we learn from video frames to produce audio? Our training set can be synchronized audio and video, whereby we train a deep belief network to convert a bitmap of a video frame into PCM audio.

My former master’s student Gregory Burlet wrote a master’s thesis on guitar transcription using deep learning. I thought I’d join the fray and try an idea I had with deep learning. Prior authors had relied on relatively simple features or reduced representations of data, such as a re-sized bitmap or down-sampled audio, and used that raw data as features instead of more complicated summaries. Gregory used short-time Fourier transforms (STFTs) to describe the input audio. I decided not to use audio as input; I wanted to associate video frames with audio.

Deep Learning Setup

Thus I set up a DBN like so:

Input: 64x64 gray scaled pixels -> 
             deep belief network -> 
               PCM audio (floating point samples)

The training data / validation data is whatever video I feel like. Different videos have different results. The output is the PCM audio of that frame. I thought, wow, gee, if the DBN could produce PCM audio that would be pretty interesting; there are a lot of complicated things going on in audio signals, and if a DBN can do it well that will be really impressive.

Input frames were scaled down to 64x64 gray-scaled bitmaps, with each pixel represented as a value within [0,1]. Audio was monaural and resampled to 22050 Hz PCM floats.
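A rough sketch of that preprocessing (assuming OpenCV for frame handling and a waveform already resampled to mono 22050 Hz floats; this is an illustration, not the repository’s actual pickler.py). At 30 fps that works out to 735 PCM samples per frame:

    import cv2                     # assumed available for decoding and resizing frames
    import numpy as np

    FPS = 30
    SR = 22050
    SAMPLES_PER_FRAME = SR // FPS  # 735 samples of PCM per video frame

    def frames_and_targets(video_path, pcm):
        """Yield (4096-dim grayscale frame vector, 735-sample PCM target) pairs.
        pcm: float32 mono waveform already resampled to 22050 Hz."""
        cap = cv2.VideoCapture(video_path)
        i = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            small = cv2.resize(gray, (64, 64)).astype(np.float32) / 255.0
            target = pcm[i * SAMPLES_PER_FRAME:(i + 1) * SAMPLES_PER_FRAME]
            if len(target) == SAMPLES_PER_FRAME:
                yield small.reshape(-1), target
            i += 1
        cap.release()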

Training took between 2000 and 7000 minutes per brain. Some brains were simple: 4096 inputs -> 1000 units -> 735 outputs. Some were more complicated, such as 4096 -> 1000 -> 1000 -> 1000 -> 735 or 4096 -> 2048 -> 1000 -> 1000 -> 735.
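The actual networks were Theano-based deep belief networks; purely for illustration, a rough modern equivalent of the shallow 4096 -> 1000 -> 735 shape, sketched in Keras, would look something like this:

    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(4096,)),             # 64x64 grayscale pixels, flattened
        keras.layers.Dense(1000, activation="sigmoid"),
        keras.layers.Dense(735, activation="tanh"),    # one frame of PCM in [-1, 1]
    ])
    model.compile(optimizer="adam", loss="mse")

    # X: (n_frames, 4096) frame vectors, Y: (n_frames, 735) PCM targets
    # model.fit(X, Y, epochs=50, batch_size=64)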

Training Data

In this repository I have provided numerous video examples and brains that you are free to play with.

  • armstrong-basic – a brain trained off of a video of John Armstrong et al. playing rock music with a theremin. See armstrong-basic/armstrong-basic.avi.webm or youtube. Network 4096 -> 1000 -> 735.
  • lines-small – lines for clarinet by John Osborne, a local Edmonton animator. See vimeo. Network 4096 -> 1000 -> 735.
  • osborne-combined-big – trained off of a larger compilation of John Osborne videos.
  • seeing-a-sound-shallow – trained from Seeing a sound quickly.
  • KUNGFURY – trained on KUNG FURY the movie. See kungfury.com and youtube. Network 4096 -> 2000 -> 1500 -> 1000 -> 735.
  • lines – lines for clarinet by John Osborne. See vimeo. Network 4096 -> 1000 -> 1000 -> 1000 -> 735.
  • ramshackletyping – trained on a video I shot of the Olm? Typing. Network 4096 -> 1000 -> 735.
  • seeing-a-sound-deeper – from Seeing a sound quickly (vimeo) by John Osborne. Network 4096 -> 1000 -> 1000 -> 1000 -> 735.

Observations

It produces audio! The audio isn’t great. The audio often responds to action on the screen. The audio doesn’t respond to theme or content. There is no memory. There are often repeating, annoying noises.

It took between 2000 and 7000 minutes to train each brain on a CPU. Kung Fury wasn’t finished training by the time this was written.

The audio is awful; there are often 30 Hz harmonics throughout the audio due to the cutting off of each frame’s sound and the lack of windowing. Windowing can improve the situation but still induces 30 Hz noise.
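A minimal sketch of what such windowing could look like (an illustration, not necessarily the exact code in the repository): fade each predicted frame in and out with a Hann window before concatenating, which softens the clicks at frame boundaries but, as noted, still colours the signal at the frame rate.

    import numpy as np

    def window_frames(frames_pcm):
        """Apply a Hann window to each 735-sample predicted frame before
        concatenating them, softening the discontinuities at frame boundaries.
        Pure per-frame windowing still modulates the amplitude at ~30 Hz."""
        window = np.hanning(len(frames_pcm[0]))
        return np.concatenate([f * window for f in frames_pcm]).astype(np.float32)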

I used CSound to reinterpret the sound as granular synthesis; that worked better but lost its on-time edge. Granular synthesis smears events.

See rendered examples section at the end of this document to see all rendered examples.

Armstrong-basic

Trained on armstrong-basic/armstrong-basic.avi.webm or youtube. This is a complicated scene filmed with a camera, without a lot of visual difference between frames. I think this leads to really blaring output for unseen animations.

For Alphabet Conspiracy the raw output sounds awful, but the granular synthesis seems to work with the talking X-ray.

Osborne’s Etudes come out very loud but interesting:

I like the on-time response seen in the hand animation Ode to Jimi:

Kung Fury

See kungfury.com and youtube .

A large dataset seems to produce more pleasant PCM output.

Some of the granular synthesis seems quite appropriate:

1392099724.mkv

20 second borys did not work so well: Raw

Human figures seem to have more effect on the sound

Fire sounds pretty good.

Kung Fury seems like a better sounding dataset / brain than others. Perhaps more data and deeper networks are much better?

Lines and lines-small

Lines for clarinet by John Osborne

Both do quite well trained on themselves:

But the smaller network seems to produce more interesting sound with Osborne’s seeing sound:

Perhaps I need to ensure that I’m properly training my network given the performance of the shallower network.

Osborne-combined-big

This dataset was a 15 minute long concatenation of some of the works of John Osborne. The results tend to sound a lot like the other networks.

Fire sounds pretty good.

For granular synthesis Etude 2 stands out:

ramshackletyping

This one illustrates what a lack of variation in training data can do. Just brutal noise.

Here’s some of the better tracks (less noise, still bad):

Essentially if you want really aggressive sounds, maybe train on less and overfit to the input?

Here’s it overfitting to itself:

Seeing a sound quickly

One problem with training on this video is there isn’t a lot of variation. It is very binary, on or off.

There seems to be little differentiation between deep and shallow in this case.

The lines for clarinet video is similar to the seeing a sound quickly video and works quite well:

Discussion

Activity on black is a natural choice; scratched film seems like a good input.

A wider range of training inputs leads to a more robust output, but a tighter, higher-accuracy brain seems to produce more sonically interesting results.

In general everything sounds pretty similar so I am not impressed by the results of this experiment.

The difference between shallow and deep networks is not really that sonically evident.

A common interpretation seems to be that white is loud and black is not. This could be a problem.

Suggestions

This experiment sounds interesting and horrible at the same time! What can be done to improve the sound?

  • Every training set should include 30 seconds or so of black screen and white screen with silent audio. That way the system would keep black screens quiet, the way we expect them to be.

  • Use history; this is a very stateless approach. An RNN might be a great idea.

  • Is PCM the most efficient representation? If I want to produce sonically interesting output, perhaps I might do better in a frequency space (STFT) or a vocoder space.

  • Color and past frames were not included. Furthermore, no analysis of the frames was used either. Perhaps an Eigenfaces style of operation would work, whereby the bitmaps’ eigenvectors / PCA components are used.

Conclusions

Briefly, I’ll conclude: without the context of prior frames or the sound that was already output, the quality of the audio output is pretty low. Either we need way more data for training, which I don’t want to spend time on, or we need to add more context to the frame. There’s an inherent independence assumption: 1 frame of video induces 1 frame of audio. But consider that 1 guitar pluck induces an audible signal for a lot longer than the pluck itself, so there’s a slight problem.

Yet what this shows is that you can produce associations, even if they are slightly overfit, and they can have some musical value.

We do not recommend generating raw PCM data, intermediate representations might be more appropriate.

Attribution

John Osborne is a local animator who I have been working with. His animation is great, but I’m not sure he likes any of the sounds I put to them :(

These videos are © John Osborne – assume similar rules to CC-BY-NC-ND

Public domain images from Archive.org

  • 015-loud_barking_and_guitar.1397370485.10527-out.15-loud_barking_and_guitar.wav.audio.mkv
  • 114-tones.1397368837.20976-out.114-tones.wav.audio.mkv
  • 1408297309.27876-out.caffeine.wav.audio.mkv
  • 1408304868.8993-out.caffeine.wav.audio.mkv

Assume Public domain

Abram’s photos and images and video

  • 20secondBorys.mp4
  • belch-kitchen-sample.mp4
  • drone-sample.mp4 – video of the Olm
  • govid3-oldsketch.mkv
  • MVI_9117.mov
  • osborne-seeing-sound.mp4
  • spikey-mouth-loop.mkv
  • VID_20130404_003435.mp4.1384674117.corpus.mkv
  • VID_20130531_132327.mp4.1384676233.corpus.mkv

Assume CC-BY 4.0 Abram Hindle

Public domain from Archive.org

  • alphabet-conspiracy.mp4 – Alphabet Conspiracy
  • Bimbo’s_Initiation_1931.mp4 – Max Fleischer’s Bimbo’s Initiation 1931

I think these might have some images from Evelyn Berg in it:

  • 1392098818.mkv
  • 1392098671.mkv
  • 1392099724.mkv

Assume CC-BY-NC.

Many ideas and inspiration are from Gregory Burlet:

https://peerj.com/preprints/1193/

Burlet G, Hindle A. (2015) Isolated instrument transcription using a deep belief network. PeerJ PrePrints 3:e1455 https://dx.doi.org/10.7287/peerj.preprints.1193v1

How to use this stuff

This repository is for support files and examples of applying mostly deep multilayered perceptrons (deep belief networks) to the task of converting video frames to PCM.

Training is simple: run pickler.py on a video to generate video.pkl and audio.pkl. Then run theanet.py to learn a brain between the two. This can take more than a week for 30 minutes of video. Once a theanet.py.net.pkl is produced you can run render.sh and produce a rendered version of a video.

There are 2 render modes: raw and granular synthesis. Raw has issues with 30 Hz harmonics (30 fps) and granular synthesis isn’t always on time.

Current observations: the audio produced is high frequency, but the output length is not enough to produce continuous low-frequency tones anyway. A lot of the output is noise.

Latest source code should be here:

Assume GPL3.0 license on all source code.

Assume GPL3.0 on all DBN pickles.

Rendered Examples

Youtube and Content ID

I’ve been having a lot of issues dealing with erroneous and egregious copyright claims against my own videos that I upload to youtube!

Public Domain

From Archive.org I got a public domain copy of Battleship Potemkin. It is a terrible rip of the movie.

Bronenosets Potyomkin (Battleship Potemkin) (1925) https://archive.org/details/PhantasmagoriaTheater-BattleshipPotemkin1925396

Regardless, Mosfilm and the creators of Potemkin did not renew the movie’s copyright in the US. This means that even though it was released in 1925, it is in the public domain due to inaction on the part of Mosfilm et al. Furthermore, relations between Russia and the United States at the time were questionable.

In Canada this movie is far past its PUBLIC DOMAIN due date and it is now in the public domain. Sergei Mikhailovich Eisenstein died in 1948, so 1998 was 50 years after his death, meaning that even by the strictest standard of public domain in Canada (50 years after death) the work is public domain. If it is treated as a performance, it was performed in 1925, so 50 years after the performance is even earlier.

See: http://en.wikipedia.org/wiki/Copyright_law_of_Canada#Public_domain

But what has happened? Many content holders on Youtube have laid claim to my posting of a soundtracked version of Battleship Potemkin.

The soundtrack was automatically generated by software, but the video was provided by http://archive.org. It is a public domain copy of the public domain movie. It isn’t a copyrighted copy like those produced by Kino films et al. who remastered the images effectively making a new work.

Nonetheless, YouTube’s Content ID has no taste for subtlety, and various organizations have made claims against these videos. This affects my YouTube account because it puts me into the proverbial dog house, so to speak, where I cannot upload longer videos, and it limits my account in other ways. This stain on my account also threatens the content under claim.

VTR claims they own the Odessa Steps sequence in the movie; they don’t:

https://www.youtube.com/watch?v=1_Li7KLVhDE&t=54m52s

I am having a hard time figuring out who VTR is, but I think they have a music video in their collection that has the Odessa Steps scene in it. Otherwise they would’ve claimed the whole movie and not just that sequence. For that reason alone it is quite apparent that VTR has no claim to my version of the movie.

So, just to emphasize the large number of claims I have been dealing with, here is an HTML “screenshot” of 2 of my Potemkin videos:

Battleship Potemkin Soundtracked Video Texture 2: Strings

Your video may include content that is owned by a third party.

To watch the matched content please play the video on the right. The video will play from the point where the matched content was identified.

Your video is available and playable.

Here are the details:

  • Visual content administered by Mosfilm (matched at 3:00). Claim released.

  • Visual content administered by egeda (matched at 1:04). Claim released.

  • “Час истины - Первая русская революция” (“The Hour of Truth - The First Russian Revolution”), visual content administered by Mediagates TV (matched at 48:50). Claim released.

  • “Zoom - Start: Encouraçado Potemkin” (“Battleship Potemkin”), visual content administered by Fundação Padre Anchieta (TV Cultura) (matched at 54:28). Claim released.

  • Visual content administered by VTR Broadcast (matched at 54:52). Your dispute is awaiting response by 5/14/14.

To learn more about how claims impact your videos click here.

https://www.youtube.com/watch?v=VcGSCmp8-N4

Battleship Potemkin Metropolis Automatically Soundtracked

Here are the details:

  • Visual content administered by Mosfilm (matched at 3:00). Claim released.

  • “Час истины - Первая русская революция” (“The Hour of Truth - The First Russian Revolution”), visual content administered by Mediagates TV (matched at 48:50). Claim released.

  • Visual content administered by VTR Broadcast (matched at 54:52). Your appeal is awaiting response by 5/14/14.

To learn more about how claims impact your videos click here.

Another Fight Over Public Domain Music

http://www.opengoldbergvariations.org/ The Open Goldberg Variations are a very nicely produced and 100% public domain rendition of Bach’s Goldberg Variations. It is beautiful and I enjoy using this music as source material; I could never play anything like it, so I am grateful for its availability.

https://www.youtube.com/watch?v=Ua-PcbC5xMI This exceptionally sorry demo video used a sample of the music at https://www.youtube.com/watch?v=Ua-PcbC5xMI&t=3m00s. I also use the music as input to a granular synthesis engine. I had numerous groups claiming they owned my music. Why? Because Bach is played a lot, and a lot of copyright holders will have a rendition of the Open Goldberg Variations in their catalogue.

What was even stranger was that the granular synthesis parts were being picked up by copyright holders such as CD Baby, the popular indie net label that lets you sell your music on CD online. I guess the granular synthesis output of my instrument sounded like someone else’s granular synthesis (very likely). Regardless, they did not own my work.

This potentially jeopardized my acceptance into NIME 2014 (yeah!) as Youtube could’ve taken down my demo video.

Public Domain Summary

Thus you can see that many different rights holders are claiming this film, but often not the whole film, only segments that matched their catalogue. I wish YouTube’s Content ID could understand that just because we use the same public domain work does not mean that either party has ownership over the content.

This is not a new problem

The gaming community has a beef with YouTube Content ID, but I think their claims are different than mine. My claims concern MY content or PUBLIC DOMAIN content, and my ability to freely publish MY content and PUBLIC DOMAIN materials. The gamers are in a different world where they do not own the game assets of the games they are recording:

Sony took down the Blender Community’s Sintel from youtube http://boingboing.net/2014/04/06/sony-issues-fraudulent-takedow.html#more-296460

http://Waxy.org ’s Andy Baio or Nicole Wilke discusses experiences people have had with illegitimate claims: http://waxy.org/2012/03/youtube_bypasses_the_dmca/

So it is concerning and it definitely harms my youtube experience. I’m especially disturbed by the claims against my own music that aren’t sampling from copyrighted sources.

You’re a Computer Scientist What Do You Think?

I think what this kind of interaction highlights is that machine learning and media fingerprinting is not enough. We need Youtube and future crowd content providers to develop systems on top of these systems.

We need to recognize that the law is not the same.

We need to recognize that the public domain does not exist everywhere, but that should not stop those who do have the rights to the commons. If per-country censorship is necessary to protect me from litigation as a user then so be it. Perhaps some of the claims against my work are legitimate in some venues; I’d rather my work be censored in those venues than face litigation. This is an ethical software issue: the user is not against you, and you should help protect the user. Do not assume malice in the face of international venues and laws.

We need systems to learn the cases where Content ID matches but the uses are legitimate.

In the case of public domain works both derivative makers have a right to the work. The provenance of the work really matters, and thus our content ID systems need to be aware of these scenarios. In the case of youtube it could be as simple as indexing all of the public domain movies in the USA to avoid false claims on public domain sources. Rules could be learnt depending on the provenance and context.

We need provenance aware contentID.

The world is a messy place and the legal rules are messy as well; Content ID needs to know about the history of content, and we cannot just trust a single rightsholder. They have been shown to be wrong many times in the past. Furthermore, with public domain works, a large site like YouTube should be able to determine the flow or the provenance of some of the material and aid in determining what is fair reuse and what is not.

Media needs more metadata about its provenance

Sadly, our media files tend to lose metadata like the Heartbleed bug makes SSL services lose secrets. We need to enable better tracking and encoding of the provenance of entities for all those involved. Imagine if you made a remix and a system could tell you all the sources you used and manage that for you. That’s especially important in the Open Source world, where the main currency is attribution.

Conclusions

I could go on, but just because you own a music video that uses the Odessa Steps doesn’t mean you own my content. Furthermore, YouTube should be far more careful with Content ID and directly address the public domain. We should be free to use the commons as we please, unmolested by Content ID claims. YouTube Content ID should know better, and Google has enough engineers to make it better.

The future of IP and provenance in media is interesting and I think there are a lot of avenues for researchers willing to try the legal side, but alas we also need the conferences to recognize the importance of IP and licensing.

Updates -- 2013 in Review

How are you doing? Long time no see. I have been as busy as a bee.

What have I been up to?

An Attempt at a Video Lecture

Next week I’ll be at ICSM presenting a neat paper I worked on with Christian Bird, Thomas Zimmermann, and Nachiappan Nagappan: Relating Requirements to Implementation via Topic Analysis, in Proceedings of the 2012 International Conference on Software Maintenance (ICSM 2012), IEEE, 25 September 2012.

But that means I’ll be away! I’m currently teaching CMPUT301: Intro to Software Engineering. So what shall I do? I have guest speakers coming in but neither has significant experience with programming for Android, so I felt I had to prep the students before the assignment was due. The assignment was an Android app.

I decided to record a few lectures to cover the missing material that I could not get to yet.

I used:

To record audio and video I used a desktop recorder; this means the audience will see whatever is on my screen. This means close your email and clean up your desktop!

What I found was that recording my desktop was very finicky. There were only a few settings I could use to ensure proper synchronization of frames: I had to use 22050 Hz audio, I had to not take a screenshot per frame, and I had to set it to encode on the fly. If I didn’t use these settings I tended to produce videos with wildly out-of-sync audio and video.

Once that was solved, I hunkered down, opened my class notes, and started talking. Ubuntu has an accessibility option to highlight your cursor when you press Control; I enabled that and found it useful for drawing attention to the cursor.

Sound quality is a big issue. If you want to do this seriously you want a good headset in a quiet room, and preferably something to filter the noise. One might consider manually noise-filtering the signal later and compressing it (EQ + amplification). I filmed these videos outside, so you can hear cars and trucks and airplanes.

Finding a space was surprisingly difficult on the go; you need a spot where you can use your projection voice, because you’re presenting and you need to project. Furthermore, the audience can’t see you, so if you do need to make gestures I recommend a webcam program like Cheese, which lets you see yourself; the desktop recorder can take care of recording your webcam program.

In the end I produced some videos about Android and Sequence Diagrams:

Whether or not this is the future, I’m not sure, but because of the knowledge that lectures can be replaced by video I’ve been trying to make CMPUT301 more interactive by adding quizzes, in-class group work, discussions, and other exercises. My wife suggests that learning isn’t just passive and sometimes we have to get up and do something.

SWARMED

I recently gave 2 performances of an interactive music instrument: SWARMED.

This instrument allows the audience to play it with their own cellphones. They just need to connect to my wifi network and they get redirected to a web instrument that plays out on the PA!

I will be improving the instrument greatly in preparation for my big noise set at the Victoria Noise Festival on August 23-24, 2012!

The Works Performance

Example instrument architecture

SkruntSkrunt.ca