Abram Hindle's Blog

Software is Hard

Youtube and Content ID

I’ve been having a lot of issues dealing with erroneous and egregious copyright claims against my own videos that I upload to youtube!

Public Domain

From Archive.org I got a public domain copy of Battleship Potemkin. It is a terrible rip of the movie.

Bronenosets Potyomkin (Battleship Potemkin) (1925) https://archive.org/details/PhantasmagoriaTheater-BattleshipPotemkin1925396

Regardless, Mosfilm and the creators of Potemkin did not renew the movie’s copyright in the US. That means that even though it was released in 1925, it is in the public domain due to inaction on the part of Mosfilm et al. Furthermore, relations between Russia and the United States at the time were questionable.

In Canada this movie is far past its PUBLIC DOMAIN due date. Sergei Mikhailovich Eisenstein died in 1948, so 1998 was 50 years after his death. That means that even by the strictest standard of public domain in Canada (50 years after death) the work is public domain. If it counts as a performance, it was performed in 1925, so the 50-year term for performances ran out even earlier.

See: http://en.wikipedia.org/wiki/Copyright_law_of_Canada#Public_domain

But what has happened? Many content holders on Youtube have laid claim to my posting of a soundtracked version of Battleship Potemkin.

The soundtrack was automatically generated by software, but the video was provided by http://archive.org. It is a public domain copy of the public domain movie. It isn’t a copyrighted copy like those produced by Kino films et al. who remastered the images effectively making a new work.

Nonetheless the Youtube content ID has no taste for subtlety and various organizations have made claims against these videos. This affects my youtube account because it puts me into the proverbial dog house, where I cannot upload longer videos and my account is limited in other ways. This stain on my account also threatens the content under claim.

VTR claims they own the Odessa Steps sequence in the movie. They don’t:

https://www.youtube.com/watch?v=1_Li7KLVhDE&t=54m52s

I am having a hard time figuring out who VTR is, but I think they have a music video in their collection that has the Odessa Steps scene in it. Otherwise they would’ve claimed the whole movie and not just that sequence. For that reason alone it is quite apparent that VTR has no claim to my version of the movie.

So just to emphasize the large number of claims I have been dealing with, here is an HTML “screenshot” of 2 of my Potemkin videos:

Battleship Potemkin Soundtracked Video Texture 2: Strings

Your video may include content that is owned by a third party.

To watch the matched content please play the video on the right. The video will play from the point where the matched content was identified.

Your video is available and playable.

Here are the details:

  • Visual content administered by: Mosfilm (matched at 3:00). Claim released.

  • Visual content administered by: egeda (matched at 1:04). Claim released.

  • “Час истины-Час истины - Первая русская революция”, visual content administered by: Mediagates TV (matched at 48:50). Claim released.

  • “Zoom- Start: Encouraçado Potemkin”, visual content administered by: Fundação Padre Anchieta (TV Cultura) (matched at 54:28). Claim released.

  • Visual content administered by: VTR Broadcast (matched at 54:52). Your dispute is awaiting response by 5/14/14.

To learn more about how claims impact your videos click here.

https://www.youtube.com/watch?v=VcGSCmp8-N4

Battleship Potemkin Metropolis Automatically Soundtracked

Here are the details:

  • Visual content administered by: Mosfilm (matched at 3:00). Claim released.

  • “Час истины-Час истины - Первая русская революция”, visual content administered by: Mediagates TV (matched at 48:50). Claim released.

  • Visual content administered by: VTR Broadcast (matched at 54:52). Your appeal is awaiting response by 5/14/14.

To learn more about how claims impact your videos click here.

Another Fight Over Public Domain Music

http://www.opengoldbergvariations.org/ The Open Goldberg Variations are a very nicely produced and 100% public domain rendition of Bach’s Goldberg Variations. It is beautiful and I enjoy using this music as source material. I could never play anything like it, so I am grateful for its availability.

https://www.youtube.com/watch?v=Ua-PcbC5xMI This exceptionally sorry demo video used a sample of the music at https://www.youtube.com/watch?v=Ua-PcbC5xMI&t=3m00s . I also use the music as input to a granular synthesis engine. I had numerous groups claiming they owned my music. Why? Because Bach is played a lot, and a lot of copyright holders will have a rendition of the Open Goldberg Variations in their catalogue.

What was even stranger was that the granular synthesis parts were being picked up by copyright holders such as CD-Baby, the popular indie netlabel that lets you sell your music on CD online. I guess the granular synthesis output of my instrument sounded like someone else’s granular synthesis (very likely). Regardless, they did not own my work.

This potentially jeopardized my acceptance into NIME 2014 (yeah!) as Youtube could’ve taken down my demo video.

Public Domain Summary

Thus you can see that many different rights holders are claiming this film, but often not the whole film, just the segments that matched their catalogue. I wish Youtube’s content ID could understand that just because we use the same public domain work does not mean that either party has ownership over the content.

This is not a new problem

The gaming community has a beef with youtube ContentID, but I think their claims are different from mine. My claims concern MY content or PUBLIC DOMAIN content, and my ability to freely publish MY content and PUBLIC DOMAIN materials. The gamers are in a different world where they do not own the game assets of the games they are recording.

Sony took down the Blender Community’s Sintel from youtube http://boingboing.net/2014/04/06/sony-issues-fraudulent-takedow.html#more-296460

Andy Baio or Nicole Wilke of http://Waxy.org discusses the experiences people have had with illegitimate claims: http://waxy.org/2012/03/youtube_bypasses_the_dmca/

So it is concerning and it definitely harms my youtube experience. I’m especially disturbed by the claims against my own music that aren’t sampling from copyrighted sources.

You’re a Computer Scientist What Do You Think?

I think what this kind of interaction highlights is that machine learning and media fingerprinting are not enough. We need Youtube and future crowd content providers to build further systems on top of the fingerprinting.

We need to recognize that the law is not the same.

We need to recognize that the public domain does not exist everywhere, but that should not penalize those who have the rights to the commons. If per-country censorship is necessary to protect me from litigation as a user, then so be it. Perhaps some of the claims against my work are legitimate in some venues. I’d rather my work be censored in those venues than face litigation. This is an ethical software issue: the user is not against you, and you should help protect the user. Do not assume malice in the face of international venues and laws.
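Here is a rough sketch of what per-country handling could look like, in Python. Everything in it is made up for illustration: the `Claim` record, the claimant names, the country codes, and the rule that a claim only counts in territories where the work is not already public domain. It is not how Youtube actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    claimant: str           # hypothetical rights holder name
    territories: frozenset  # country codes where the claimant asserts rights


def blocked_territories(claims, public_domain_in):
    """Return the set of countries where the video would be withheld.

    Assumed rule for this sketch: a claim is ignored in any country
    where the underlying work is already in the public domain.
    """
    blocked = set()
    for claim in claims:
        blocked |= claim.territories - public_domain_in
    return blocked


# Illustrative data only.
claims = [
    Claim("Hypothetical Rights Co", frozenset({"US", "CA", "DE"})),
    Claim("Another Claimant", frozenset({"DE", "FR"})),
]
public_domain_in = frozenset({"US", "CA"})

print(sorted(blocked_territories(claims, public_domain_in)))  # ['DE', 'FR']
```

The point of the sketch is that the video stays up wherever the work is in the commons, instead of one claim putting the whole account in the dog house.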

We need systems to learn cases where contentID matches but the uses are legitimate.

In the case of public domain works both derivative makers have a right to the work. The provenance of the work really matters, and thus our content ID systems need to be aware of these scenarios. In the case of youtube it could be as simple as indexing all of the public domain movies in the USA to avoid false claims on public domain sources. Rules could be learnt depending on the provenance and context.
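To make the indexing idea concrete, here is a small Python sketch. It assumes a hypothetical index that maps a segment fingerprint to a known public domain source and the countries where that source is public domain. The fingerprint strings, the `Match` record, the claimant name, and the triage outcomes are all invented for this example; a real content ID system would be far more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    claimant: str     # who is asserting ownership (hypothetical)
    fingerprint: str  # hypothetical ID of the matched audio/video segment


# Hypothetical index: fingerprint -> (source title, countries where it is PD).
PUBLIC_DOMAIN_INDEX = {
    "fp-odessa-steps": ("Battleship Potemkin (1925)", {"US", "CA"}),
}


def triage(match, country):
    """Decide what to do with a content ID match in a given country."""
    entry = PUBLIC_DOMAIN_INDEX.get(match.fingerprint)
    if entry is not None and country in entry[1]:
        source, _ = entry
        # The matched segment traces back to a public domain source here,
        # so the claim is not applied automatically.
        return "rejected: segment matches public domain source " + source
    return "forward to uploader"


match = Match("Some Broadcaster", "fp-odessa-steps")
print(triage(match, "CA"))
print(triage(match, "DE"))
```

With something like this in front of the claims pipeline, a claimant whose music video merely contains a public domain scene would not automatically get a claim on everyone else who uses that scene.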

We need provenance aware contentID.

The world is a messy place and the legal rules are messy as well. ContentID needs to know about the history of content; we cannot just trust a single rightsholder. They have been shown wrong many times in the past. Furthermore, with public domain material, a large site like Youtube should be able to determine the flow or the provenance of some of the material and aid in determining what is fair reuse and what is not.

Media needs more metadata about its provenance

Sadly our media files tend to lose metadata like the Heartbleed bug makes SSL services lose secrets. We need to enable better tracking and encoding of the provenance of entities for all those involved. Imagine if you make a remix and a system can tell you all the sources you used and manage that for you? That’s especially important in the Open Source world where the main currency is attribution.
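As a toy illustration of the remix case, here is a Python sketch where each work carries a list of the works it was derived from, and a walk over that graph yields the attribution list. The `Work` record, its fields, and the example titles are assumptions made up for this sketch, not an existing metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class Work:
    title: str
    license: str
    sources: list = field(default_factory=list)  # works this one derives from


def all_sources(work, seen=None):
    """Depth-first walk that returns every upstream work exactly once."""
    if seen is None:
        seen = {}
    for src in work.sources:
        if src.title not in seen:
            seen[src.title] = src
            all_sources(src, seen)
    return list(seen.values())


# Illustrative provenance graph for a hypothetical remix.
film = Work("Battleship Potemkin (1925)", "public domain")
goldberg = Work("Open Goldberg Variations", "public domain")
soundtrack = Work("Generated soundtrack", "my own work", [goldberg])
remix = Work("Hypothetical remix", "my own work", [film, soundtrack])

for src in all_sources(remix):
    print(src.title, "-", src.license)
```

If this kind of record travelled with the media file, both the uploader and the host could see at a glance which parts came from the commons.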

Conclusions

I could go on, but just because you own a music video that uses the Odessa Steps doesn’t mean you own my content. Furthermore, Youtube should be far more careful with ContentID and directly address the Public Domain. We should be free to use the commons as we please, unmolested by content ID claims. Youtube contentID should know better, and Google has enough engineers to make it better.

The future of IP and provenance in media is interesting and I think there are a lot of avenues for researchers willing to try the legal side, but alas we also need the conferences to recognize the importance of IP and licensing.