Stanford HAI 2019 Fall Conference – Owning AI: Intellectual Property for Artificial Intelligence

– [Mark] Okay, welcome everybody. I’m Mark Lemley. I am a professor at Stanford Law School and Director of the Program in
Law, Science, and Technology. I am also a partner at the
Durie Tangri law firm and I do both intellectual property work and, increasingly, scholarship
on artificial intelligence. Let me briefly introduce our panelists and then I wanna hear a little bit from the audience about
sort of background level of knowledge, interests,
and that sort of thing. So to my left is Dr. Ryan Abbott, who is a professor at
the University of Surrey in the UK, at the law school. Also, a professor in the
School of Medicine at UCLA. And, chiropractor? – [Ryan] Acupuncturist.
– [Mark] Acupuncturist. (audience laughing) Has written quite a bit in intellectual property and AI, as well as other things like, should robots pay taxes? And has a forthcoming book on the subject. Chris, you might have
to bring your own chair, or maybe there’s one over there. To my right is Lisa Ouellette, my colleague at the law school who
specializes in patent law, and also innovation law
policy more broadly. And finally, Jim Pooley, who is an attorney in practice here in Silicon Valley, and also the former
Deputy Director General of the World Intellectual Property Organization. So we’ve got a bunch of things
we’re gonna talk about, but this is gonna be a
conversation both among the four panelists but also hopefully among the room at large. So it would help to know a
little bit about background. How many people here are lawyers? How many people here are engineers? Okay, public policy, social science? Okay, so pretty good mix. So we’re going to talk
about a variety of intellectual property issues and doctrines, but we
will not sort of presume a lot of knowledge. If we are slipping too far
into legal jargon, stop us and say I have no idea
what you’re talking about. And please
interrupt with questions as we start out. I want to start out with patent law, the sort of fundamental,
sort of first fundamental question is, is AI patentable? And I think that raises the question, what do we mean by that? So Lisa, you want to lead us off? – [Lisa] Sure. So just as a kind of background for those of you who don’t
know anything about patent law, I mean, I think my view
is that AI doesn’t require a fundamental rethinking of patent law, that there are basic patent
criteria, and now we have to think about exactly how these
apply in the AI context, including which aspects
of AI we’re talking about. But the basic requirements
for getting a patent are that the invention
has to be new compared to everything that has happened before. And not just new, but non-obvious. So a leap from what has
happened in the past, looking at all prior
publications and things that were being used in the world. It also has to be what’s called
patentable subject matter. So that means that you can’t get a patent on something like the
abstract idea of using AI to categorize skin lesions, but you could get a patent, perhaps, on a particular algorithm for doing that. So we can’t patent the abstract idea or a law of nature, but can
patent more specific things. This is an area of law
that’s changing a lot right now so patent lawyers are fighting on exactly what that rule means. And then you also have to
disclose enough about your invention so that other
people who are in this field are able to make and use the invention without undue experimentation. But that doesn’t mean
that you have to know how your invention works,
or give a specific recipe for doing it, including
giving the source code for an algorithm. So, undue experimentation can be quite a bit of experimentation. Figuring out how these apply
to AI can be challenging and the U.S. Patent and
Trademark Office is wrestling with this right now. They put out a request
for comments in August with a list of questions
related to AI inventions, including asking about
like what are the different aspects of AI that could be patented? They listed some specific
aspects like the structure of the database on which
the AI will be trained and will act, the training
of the algorithm on the data, the algorithm itself, the results of the AI invention through an automated process,
the policies or weights to be applied to the data
that affect the outcome of the results. But they also asked for
other ideas about different aspects of AI that could be
protected through patents. They’re also interested
in issues of inventorship, who owns these, that we’ll get to later. The questions that
actually interested me most that they asked and that
are things I’ve thought about in the broader
aspect of my scholarship are the questions related to disclosure, like what you have to
disclose to get a patent on an AI invention. So they asked about whether the degree of unpredictability of certain
AI systems and things like deep learning
systems with hidden layers that evolve during the
learning and training process change the requirements
for how much disclosure is necessary to get a
patent in these areas. And I think that these
disclosure problems are not that different from problems
that the patent system has wrestled with before, including in context that might seem very
different like biotechnology, where in the 1970s there
were a lot of biotech inventions involving new microorganisms and things where it wasn’t clear that just a written disclosure would be enough to allow other people to
make it, that you needed to have access to the
materials in that context. And then, the international
patent system dealt with this by saying that
you could get a patent by making a deposit of
biological materials, and there was a treaty in 1977 to set out the requirements for an internationally recognized depository. These deposited materials have to be made available to the public once the patent issues, and
it’s not that expensive to deposit materials or
to access them as a member of the public. The disclosure rules
also changed in 1990 for patents on genetic
sequences where you have to, if you’re getting a patent
related to the nucleic acid or protein sequences, you
had to disclose the sequence in a standardized electronic
format to allow people to find that more easily. – [Mark] So do you think we
should deposit algorithms? – [Lisa] I think currently there’s a consensus that these
disclosure rules are not working as well in
the software context as they are in the biochemical context, either in terms of limiting
patent scope or in terms of conveying technical
knowledge to people in those fields. A number of people have
suggested that software patents should be
required to disclose more including disclosing
the source code itself. I think it probably is a
non-starter politically, but might be a good
idea as a policy matter to require disclosure and perhaps deposit of
the digital materials in a standardized
depository in the same way it is for biological ones. – [Audience Member] But why would someone want the patent on something like a deep learning
algorithm when treating it more like a trade secret
would be the way to go? I mean, it’s a great technical
question, but it seems to me the trade-off of giving away
some of the secret sauce, in this case you’re like great, we’ll just keep it
behind the scenes anyway. – [Mark] So, I mean, I think this is gonna be a substantial, a
significant issue, right? And I think we’ll talk
about, we do want to talk about trade secret questions. I mean, the kind of
lay-person answer to why you would choose patent over
trade secret, right, is the mix of rights you get is different. And so what a trade secret
will give you is the right to prevent someone else
from copying your algorithm. What a patent will give
you is the right to prevent someone else from producing
an algorithm like this, even if it is independently generated. And indeed, in software today,
the overwhelming majority of the patent lawsuits are
filed not against people who copied the technology
from the plaintiff, they’re filed against people
who independently came up with their own algorithm,
but the plaintiff says, well it looks too much like mine. So the trade-off for that
is, I think, as you suggest, that you’ve got
to sort of tell people something, you gotta give
them theoretically enough information to enable them to
make and use the algorithm. And my sense is, and
Lisa may differ on this, that software companies
have been pretty good at writing down enough information
in vague generalities and the courts have been
lax enough in letting them get away with it, that
they don’t convey much in the way of useful information. So, if you’re a, if what
you wanted was kind of broader protection, but
not to have to give away the crown jewels, right,
you might get away with, in this AI context, describing
the process of your AI at a high enough level
that you satisfy the courts but don’t actually give people
the information you’d want. – [Audience Member] Question,
A bit of background (mumbles) in the software services company and they have a lot of patents. (Mumbles) patents. (Mumbles) and generally
I have seen that people are talking about (mumbles)
in the context of (mumbles). Which is really not I believe the right thing technically, right? So because it’s a very
fundamental difference, right? (Mumbles) that you’re
writing, when you’re writing a deep learning system
(mumbles) or anything. It is (mumbles) and then the
network lands on its own. And what is important
(mumbles) is you’re feeding all the data, right? As an example, (mumbles)
only 250 lines of code, but how good the system is really depends on the millions of images
you’re feeding to make the system learn, you know,
what the human face looks like. They can then go from there. – [Mark] And so that I think
gets to the sort of question of what is it we would patent? Right. The kind of easy traditional
case would be I have written a new algorithm, whether
or not that’s patentable actually is debatable, right? But that’s something we’ve kind of, we’ve dealt with in software, right? A second thing you might
patent is the training process. People might want to patent,
although I don’t think they can, the sort of database of training information itself, or you
might want the output, right? So maybe what you
really might be interested in as a
business model is kind of the output state, after
having been trained. So the kind of neural
net once it is trained. – [Audience Member] Yeah,
the architects (mumbles) to those links. That is what does it
really, you know, (mumbles). – [Audience Member] But one assumes that that’s static, right? It doesn’t assume that actually
that changes over time. – [Audience Member] (mumbles)
training (mumbles) also. – [Mark] Right, and this is
I think one of the issues that is most problematic for patent law. I think there are questions
as to whether or not some of those things
are patentable at all. So if what I really
care about is the training weights I put on something, that’s looking pretty close to math of
the sort that we have said is unpatentable subject matter. But even if you get over that hurdle, the real problem might be that if I want to patent what
I do, I write down an application, I send it
to the patent office. They then do nothing
with it for 18 months, then they pick it up and
look at it and then we have a back and forth
for a while and it’s two and a half to three years,
which is faster than it used to be a couple
decades ago, but it’s still two and a half to three
years before I get a patent. Now, the problem is,
right, if what I want is kind of the end state after
I have trained the algorithm, the likelihood that that
end state is gonna be very much like what it was three years ago when I filed the patent
application seems to me pretty low. So the thing I might get
is a kind of snapshot of what my algorithm used to do in the past as a patent and that seems less valuable. – [Lisa] That’s not so different. – [Audience Member]
(mumbles) patent been granted (mumbles) so far? – [Mark] Yes, oh no, there are
a bunch of patents out there. I mean, so. – [Lisa] Yeah, I mean, there
are patents on all of these different aspects of AI
and that doesn’t mean that all of these patents would
be upheld if challenged robustly in litigation
for various reasons. I forgot what I was gonna say. – [Mark] You were gonna
push back on the question of whether you can get a useful
patent in the United States. – [Lisa] It’s not so
fundamentally different from other technologies where
at the time you’re filing your patent application,
often you don’t have your final commercial
product and it gets refined a lot in the company as a result of that. You have to give enough
such that people can make and use your invention
to the extent you have it at that point and that
you’re claiming in your claims and that might require
for certain kinds of things. Disclosing the training data where that, if that is some unique
thing where it’s really necessary for someone
else to make and use your invention I think there’s a good argument that under patent rules you
should have to disclose that. – [Mark] Yeah, so I mean
I think that, I think it’s right to say this is not a problem that we have never encountered before. I think one of the things
that most patent scholars and probably most patent lawyers would say is that the patent system works
better in the life sciences, in the pharmaceutical,
in the biotech industry. One of the reasons for that
is that the time periods are longer, right? So the thing I’m getting
a patent on is actually the chemical that the FDA
will ultimately approve several years later. The other reason I think
that’s important is we actually have a good
standardized language for describing what it is that I’ve invented, so that you can read a
patent on a pharmaceutical or a biotechnological
invention and have a pretty good idea of what’s covered. That has not, neither of those things has traditionally been true
in software patenting. And so software patents
have been more worrisome, I think, in various respects, right,
because they’re vaguer because they tend to be written
with old technology in mind by the time they are
issued and certainly by the time they are litigated. And I guess I worry that sort of, AI makes that problem worse, precisely because
precisely because the, it’s not just sort of we’re
generating a new version of our product and releasing
it every couple of years or even every six months or so, right? It’s that we’re actually
training and modifying the algorithm in real time. So you’ve either got a
patent on this particular set of weights as a result
of this particular algorithm, and its training at this time, which is a snapshot that almost
instantly becomes obsolete, or you try to solve that
problem by writing very broad, vague claims to
the whole concept, right, of training in a particular
way or the idea of an algorithm trained on
this database at all. And those seem to me,
they might be vulnerable to attack, they might be
invalid, but if they’re not vulnerable to attack then
they might kind of interfere with a lot of the industry,
because they prevent a bunch of different people
from trying different methods. – [Audience Member] (mumbles)
explainability, right? So (mumbles) neural network
models the biggest challenge is explaining how it works. You know it is, it has
not been solved, right? And I’m sure that one of
the key tenets of patenting is you have to explain how
the whole thing works, right? – [Mark] And that was,
Lisa, why you were suggesting this idea that maybe we think
we could have deposit, right? I can’t tell you why it produces what it does, but I can tell you how I made it,
tell you what I started with and I can tell you what I trained it on. – [Audience Member] But why
should that meet the standard? Why should that meet the
enablement requirement just to have, if you have
a, let’s say it’s on a CD, to go really old school, what
does that get the public? – [Lisa] If they have access to it. – [Audience Member] No, I know,
but in this context, right, like you have something that can’t be reverse-engineered that way if
you’re looking at the result, how does that teach, how does that meet the public policy requirement
of enablement, right? How does that teach the
public how to practice the invention if you have a starting point and you have an ending point, but there’s nothing in between? – [Lisa] I mean it’s
beneficial to them the same way it is to the inventor
who now has this system that is, like, for
whatever application that system was designed for they can also, they have access to whatever. – [Audience Member] But
if it comes with the data that argument is true, but if you don’t have the training data that comes with it, which I think in your proposal, your explanation of commercially it’s not practicable to give the
training data as well, right? If you’re missing that piece, how does it meet the public policy
goal of basic patents? – [Audience Member] (mumbles)
data is impossible to share. – [Audience Member] Physical media. – [Mark] Physical media or other
legal concerns, which we’ll talk about in a minute, right, yeah. – [Lisa] I mean if you can’t
give enough such that other people can make and use the
invention, I don’t think it should meet the patent (mumbles). – [Mark] Well, so I mean, the closest analogy of
how we’ve dealt with this in patent law, are circumstances in which I can tell you how I
did it in enough detail that you can understand how I did it even though you won’t yourself be able to replicate it because you don’t have access to the information. And that, I think, we have
generally said is permissible. So if I, if I said okay,
here’s the base algorithm I started with, and here is
exactly how I trained it, here is the, I have
identified the data sets, I can tell you where they came from. Turns out all those data
sets are proprietary. Google owns them all, right,
and if you don’t have a license from Google you’ll
never be able to replicate this. I think the courts would
still say you’ve taught people at least sort of
theoretically how to do it. They know how to do it, they
might not be able to do it, but that’s not because they
lack the technical ability, it’s because they don’t actually have the rights to this information. I mean, yeah. – [Lisa] Has that been
challenged? I think that’s true where you have a case
where the public, in theory, could develop the data that they need, say through
web crawling, but when it’s something they,
even in theory, with work, could not get access to,
then it seems more like the microorganism deposit
requirement where you like, we require deposit of material in order to satisfy (mumbles). – [Mark] Yeah, no, I see
the argument for deposit although the problem is what
you really want to deposit, or maybe you’d say, well if it’s, as long as I deposit the end state, the post trained algorithm. – [Audience Member] It’s different than depositing the cell line though in the biotech context because
you can work from that. – [Mark] And now one thing you can do, I think, is you can do
some black box sort of reverse-engineering and
testing that you wouldn’t necessarily be able to do
without access to that thing. So if I could, if I have access to, here’s your end state,
post-trained algorithm, even if I can’t sort of,
kind of understand it, read it, right, because of
the explainability problem, I can at least sort of do
some testing to try to, to see what might work,
what might not work. – [Audience Member] So the
version would have to be (mumbles), if you wanted to play with it, maybe the deposit has to be enough that you can run stuff on it, even though you don’t have the original data to figure out how it worked. – [Audience Member] Performance would be very different, right? – [Audience Member] Yeah, that may be, but that’s the legal trick, right? – [Audience Member] But I am (mumbles) the public welfare aspect of the whole patent system, right? (mumbles) I don’t think it got answered correctly. (mumbles) It all depends on the
data and those big players like the Googles and
Facebooks of the world are such a massive motivator, which, in one way, I think, this is the reason why there is no next Google,
next Facebook, (mumbles) because the current big players have a monopoly on the data, right, simply because, unlike, you know, petroleum companies going out and occupying a space and drilling, now we are going to Google and Facebook and giving all our
assets, the data assets, and we don’t go to anybody else. And that’s why these players have such a strong, strong hold on the data that in turn leads to these newer models, more efficient (mumbles). – [Mark] So I think that’s right. One question is kind of
what kind of business consequences flow from it? Another question is what, if anything, the law should do about it. In a world in which
Google just happens to be well-positioned to have
access to all of the data, because of their existing business model, and I think for a lot of
things, that’s true, right? So if I wanna train my
self-driving car to
recognize stop signs, right? Well, who’s got the largest database of photographs of stop signs, right? It’s Google Images. If I wanna train on a corpus of text, who’s got all of the books scanned, all of these things. They didn’t do this for AI, but they happened to have built
these sort of enormous set of databases. In a world in which they were not also in the AI development business, right, then they might actually sort of be perfectly happy to license access to those databases to AI developers who wanted to train on a
non-exclusive basis, maybe not. But they wouldn’t have some sort of interest in making sure that their cars and not competing cars had
access to them, for instance. But if they’re vertically
integrated, right, if they’re gonna use the, if they own both the training data
set and they are the ones developing the algorithm,
then I think it’s right that they do have a substantial structural advantage. Now the thing I’ll note about that is that’s not mostly, it’s not mostly IP law that is giving them that
structural advantage, except to the extent that
the, that it prevents other people from just copying
the database altogether. – [Audience Member] So
(mumbles) to the question and combine the disclosure
proposal that you have with the essential facilities
it becomes (mumbles). – [Mark] Yes, if you’re
willing to say, basically, you have to share this
because the world can’t, you can’t compete without it. – [Audience Member] But
the argument that we heard just now is so critical that, you know, it’s Google that basically
always has the structural competitive advantage
that you mentioned, right? – [Panelist] Yeah I was just gonna say to that point that the law is
not always coherent, right? If you have two actually opposing forces between IP and antitrust law, right, IP law, I think Lisa can
tell us how, you know, patents are a government-granted monopoly, by definition, right? And then you have antitrust,
how when you aggregate, so like, the question
becomes let’s imagine you can patent elements of
AI, although I don’t know that algorithms are the
right part of (mumbles), maybe more systemic, but
let’s say you can patent parts of AI and you amass this
giant pool of patents, right, then, at what point does
the interest of IP law, which is, definitionally,
I share with the public, open kimono, everything I discover in exchange for a 20-year
government-mandated monopoly, at what point
does that, while legal, then brush up against
this other body of law which is competing which is antitrust, and then maybe, I don’t
know like FRAND, you know, now is the interest to your question about why don’t we just give,
why don’t we have a regime that forces people to
license their data pools and data lakes and access to
this massive amount of data. That regime exists with
FRAND wanting Europe does a much better job. – [Mark] If you commit to
it voluntarily is the issue with the FRAND rule. So there’s a bunch here, so let me try to sort out a couple of things. I think antitrust law, there is a doctrine in antitrust essential facilities– (audience member mumbles) Yeah, we’ll get there. There’s a doctrine in antitrust law called the essential facilities
doctrine that says some things are so critical to competition that you have to share
them with your competitors, at least if you’re vertically integrated, so the classic example is
I own all the power lines in the state of Wisconsin, and I refuse to allow my power-producing competitors to interconnect with them. The court says no, you’ve
actually gotta interconnect and allow them to sell
power that goes across your lines because nobody can, we don’t want everyone
to build three or four or five different competing
power grids, right? We want them to share the information. That doctrine in the United States is very, very rare in application. There are some people who think it’s actually affirmatively dead. I don’t think it’s affirmatively dead, but it’s, it’s sort of the last refuge of somebody who’s otherwise
got a good argument. It is much stronger in Europe. I think one thing that’s notable here is we might well have a different legal regime in a world in which the European courts said hey, competing AI developers
have to have access to Google’s training
data set, and the U.S. courts say no we wouldn’t do that. – [Audience Member] I think
that gets a little bit toward what you were
saying earlier, right, which is, currently if
I understand correctly what you’re saying is,
I can tell you the how, I don’t have to give
you the goods to do it, is one way to think about this. – [Mark] Right, and then
essential facilities would be an argument that says yes, you do have to give me the goods to do it. – [Audience Member] We
talked about what is possible potentially (mumbles), what is actually the current field in being
able to uphold those, because I think that’s
almost more important rather than what a body decides that hasn’t been tested, right? – [Mark] So this is, part of the answer is we don’t know because the patents are relatively new. It takes quite a while for
them to make it into court to get answers in a specific case. Part of the answer is we don’t know because a lot depends on the scope of this doctrine of
patentable subject matter. So what we say is you can’t
patent an abstract idea. You can patent sort of
specific applications of an abstract idea. Courts are all over the map on what constitutes an abstract idea. We go back and forth. We don’t have a lot of guidance. So it’s really hard to tell. If I had to hazard a
guess, I would say that both algorithms that one starts out with and kind of things like weights that you have done for training, and the training data sets themselves I think are hard, those are gonna be hard to patent in a way that
survives challenge. The patent office may
well be granting them. The patent office has sent
some signals in the last couple of years in the new administration that it wants to patent
more of these things, but it is not the party
that gets the final say in whether or not
they’re actually valid. So I think a lot of those
things are problematic. I think if you’ve got a method, if I’ve got a kind of new way of actually sort of, training, right, a new way of actually generating results, so, you know, the first
person who comes up with a generative adversarial
network, for instance, right? That seems like a sort of thing that actually improves the operation of the computer itself in a way that probably the courts
would say that’s okay. And I think implementations especially if they’re in physical hardware are probably gonna be okay. So if what I have is,
if what I claim is a car that actually recognizes
and stops at stop signs, that I got to through AI, I could probably patent that because it doesn’t feel like an abstract idea, that feels
like a concrete implementation. That said, these are guesses, right? We don’t know for sure. I’ll make one other point
though which is it may matter less than you think it does whether or not these patents are valid. I think a lot of the
value and a lot of the use especially in a new technology that people get out of patents is not
I will sue my competitor and keep them off the market or make them change their products. That sometimes happens. It happens in pharmaceuticals,
but it doesn’t actually happen all
that often in software. And even in the rare cases it does like Apple versus Samsung, right, it’s not obvious that
Apple winning that case actually changed very
much in the marketplace. The values of patents might be anything from startups being able to kind of stake out a place and signal
to venture capitalists where they are, facilitating
kind of business deals and trades that
really are about know-how, or licensing of kind of intangibles but now I’ve got some actual things that I can attach to them. And those values might,
you might get that value whether or not the
patents would ultimately, five years from now, turn out to be valid. – [Lisa] I think that’s right. And one good example of that is the, one of the main lawsuits we’ve seen of people arguing over this technology is the Waymo Uber case where there were patent claims, but they
were not that strong, like, it was really
about the trade secrets. – [Mark] Right. So maybe actually we can
use that as a fulcrum. I think one of the, in
part, because it takes three years, in part,
because of these definition and disclosure problems, I
think what you might say, look, patents just aren’t
the best fit for AI, and if what you’ve got is a really good set of weights that, as a result
of your training data set, the thing you should do is keep it secret. Jim, you wanna– – [Jim] Yeah. (laughs) Secrecy is the oldest form of
intellectual property, right? It’s been used for millennia,
and, as Mark points out, unlike patents, it doesn’t
give you the ability to exclude others and
you always have the risk that someone else is gonna
come up with something. Software structures,
software itself has been protected for a long time through secrecy, which can be very, very powerful because, particularly, if you don’t
have to trust anybody with any instantiation of the code, right, as we have now with the cloud. – [Mark] You have to try
to stay on the (mumbles) (laughs) – [Jim] Right, but you can deal with that in contract,
right, and figure it out. The point is all your customers don’t get anywhere near the code, they just query the tool
and they get an answer. That model is very robust for protecting software innovations. My assumption is, in,
you know, trade secrets, the area that I’ve worked most in, my assumption is that most of what we see with AI will be protected through secrecy. Secrecy protects
information, not inventions, not little teeny things,
but information, data, in all its forms, extracted,
raw, what have you. All is protectable by trade secrets. The challenge, and I think the greatest challenge is gonna be, for AI applications that affect public health, there will be a demand for transparency because everybody wants to know
how that algorithm decides whether
the car is going to kill the baby or grandma when it has to go one way or another, or solve
other ethical problems. People are gonna want to
have that understanding. The problem is, of course,
if you have transparency, it’s completely incompatible with secrecy. And if you don’t provide secrecy as a payoff for investing
in the development of the tool, where are
you gonna get people who will put in all that
investment and make the tool? So this is a conundrum. We had this issue, as I've pointed out recently, around pharmaceuticals over a century ago,
and we created the FDA. So we had people who
could look at this stuff who were competent, most of the time, and could pass on whether
it was safe and effective. And you might imagine, in some world where you could do something like that to preserve the secrecy of the AI tools, but then, these are
things that are dynamic. They keep changing, you know? A molecule is a molecule
and it sits there. So how do you handle this? I don’t know. I’m interested in what other
people think about that. – [Mark] So, just to note, I think while this is a huge issue with sort of health and safety, right, and the medical regulatory folks are gonna be very nervous about the, well yeah, we have a robotic AI surgeon, but we won’t tell you how it works, or we’ll diagnose, but we won’t tell you how we make decisions. But it’s not limited to medical. – [Jim] Well, that’s right. And you have places, we
were talking at lunch, in France, there have been very, very strong arguments that any tools, any AI tools that possibly touch on human experience will need to be unshrouded. – [Mark] We want them to be explainable. – [Jim] What does explainable mean? – [Mark] My personal
feeling is we just need to get over that, to a large extent because it’s not, I don’t
know that it’s really feasible, at least not at cost of changing things substantially,
but for some categories of information, and I think these are some fairly significant categories, we’re gonna want some amount of that. So in addition to health and safety, I’ll say policing and
criminal justice, right? When I am sort of stopped
on the basis of an algorithm that sort of predicts my
likely criminal behavior, and I wanna defend
myself, one of the things I’m gonna wanna do is to be able to say well, hey, wait a minute, did that pick me because of my race, because of my gender, because of some other category? To do that I need, theoretically, both access to the algorithm,
which the companies that provide treat as a trade secret, and some ability to
understand or at least query the algorithm to run results that will allow me to test it. – [Lisa] That’s a live issue in today’s criminal justice system where, and Rebecca Wexler and Natalie Ram each have a great article
dealing with this problem where AI is being used in sentencing and forensic, like, various things where criminal defendants
are faced with these, and it’s being protected under
a trade secret privilege. – [Panelist] Although, it should be noted, it’s not a question of
using AI or not here, as much as, to me, it's a question of using AI versus using a person who has many of the same biases and is also equally unexplainable, or sometimes less. With the doctor looking at a chest x-ray and diagnosing pneumonia, you know, the doctor may not– – [Mark] I want a right to explainability, right. – [Panelist] Or a judge sentencing a defendant, right?
sentencing a defendant, right? If there's evidence of judges making pervasively biased decisions, if it's an unconscious bias,
they’re not aware of it, and if it’s a conscious bias, the judge is extremely unlikely to say I, you know, gave this person a longer sentence because of their race. So the human mind may be as black boxy as some of these neural networks, and, for me, as a patient,
not that explainability isn’t important, ’cause it is, but, as a patient, I’d
be much more interested in hearing well, what are the outcomes of the robot surgeon? And if they’re better than the human surgeon’s outcomes, but you have no idea how the robot does it, I’d probably pick the robot surgeon. – [Audience Member] But there’s a little game you’re playing here because look, in theory, judges have to articulate why. People get to ask that question. And even if it’s subconscious or not, over time, we might start
having more statistics on, you know, it turns
out you put brown people in jail, you do this to women,
you do this to whomever. It’s not perfect and I think the subtlety of what you just said
there is, a robot surgeon, we have product liability issues with doctors and malpractice,
there are experts that say this was good
standard of care or not. There’s some sense of, excuse me? – [Audience Member] You
can examine the doctor. – [Audience Member] You do all that and we have something
that feels like traction, albeit fuzzy, to your point. The difference with software is maybe, instead of explainability, Deirdre's been calling this contestability,
Josh Kroll and I have a paper where we talk about using, possibly, a CS hack called
a zero-knowledge proof. Presenting engineers with,
look, what we really want is how do we know that you obeyed a spec, if it was a legal rule, or that this thing is not crashing as
often as it seems to be. And that’s, I think, where
engineers can go, oh, well, if you’ve given up on
this crazy explainability, because most of the
engineers I talk to say forget it, but you say, build me something that you can show, with high confidence, this is how it works. Other people can test it. Now you’re in a world
that’s more plausible to the engineering community. – [Mark] So I agree with that, and I guess I’d say a couple things. One is, I don’t think that’s inconsistent with what Ryan is saying because while we can confront the doctor, we have experts who will come and testify, a lot of that is actually anecdote masquerading as expertise and knowledge, and that we may be a lot, the world of human decision may be a lot less transparent than we think it is. We may be less good at
telling when people are lying, less good at telling
whether they’re confident in their opinion because they’re right or just because they’re
confident and so forth. So the trade-offs are comparative. The other piece, I guess I’ll note, is I do think even if
it’s not explainable, the mere fact that we get large pattern outcomes may enable us to find things that trouble us in AI that don’t, that we wouldn’t find otherwise. There’s a great example
of an employment algorithm screening people for employment, and they are very careful, in designing the screening algorithm, not to take race or gender into account, but they’re also very careful to kind of learn from who gets hired, who gets promoted, various other things,
and then somebody sort of decided to run a predictive
testing on the algorithm, and it turns out that the two things that are best predictive of success are being named Jared and having played lacrosse in high school. And those are not because lacrosse is a great skill at
whatever job this is, right? Those are because we’ve
now, the computer’s done a good job of finding proxies for rich white men. And we said don’t take this into account, but it went out and found that hey, you know what? Actually, rich white men are doing better in the hiring and promotion decisions and we can replicate that. – [Audience Member] Because
it was built (mumbles). – [Mark] Yeah, right. And you can look at that and you can say that’s a problem? But I look at that and
say that’s a problem we had beforehand, and now we can see and document that it is a problem. So there may be information that we get access to, but,
this to bring us back to the trade secret point, right, that’s all available only to the extent that sort of some of this information is kinda turned over, either
to the public at large, or at least kinda to
a criminal defendant’s lawyers, to somebody, a social scientist who can test it. Now you might not need to sort of turn over the actual thing itself. You might, you know, the Jared and lacrosse stuff you get by kinda black box testing the end state. – [Panelist] Right, right, you could. You can imagine structures
in which you would take the thing and there would be certain queries you could use that would solve everybody’s problems. But the conversation
that’s going on right now is really about, I mean, I just keep hearing this, transparency. We want the whole thing laid open. We don’t want anything lurking inside that we can’t see. We’re gonna need to square that (mumbles). – [Audience Member] Can
I just ask something really about the transparency? So I’m from Europe and I’m a
European diplomat based here. And for me, for example,
this judicial decision predicting criminals who,
to me this is not really so much a problem about, you know, human bias versus machine bias, but about taking away
responsibility from humans. Because in the end, if I’m a, I mean, I am a bureaucrat, and
if there is an algorithm suggesting something,
I cannot really then, you know, it’s the safe thing to do, is to do what the algorithm suggests. So, in a sense, if you’re a judge, that’s the nature of the judge. He has to think for himself,
and even if he’s biased at least, it’s a human
being responsible for that which you can contest. If you start bureaucratizing
everything to machines, in a sense, you take human responsibility out of the equation. And I think that’s the bigger problem than just measuring success rates of predictability between
machines and human beings. Because what are the
judges then there for? – [Mark] I think that’s right, although it does raise to me the question of sort of why we want human decision-making in the loop. And so there are a couple of possible answers to that question. One possible answer is we think humans will make better decisions, right? So Europe now has the
right to a human decision in the GDPR– – [Audience Member] Not better decisions, but human decisions. – [Mark] Okay, and I want to distinguish these two things, right? So I think one argument would be humans would make better decisions. That’s a testable, empirical proposition. I think it will ultimately probably turn out to be wrong in
a lot of these cases. But what I hear you saying
is something different which is that there is some value in having a human make the decision over some things that control my fate. That might be true of criminal justice and criminal sentencing. – [Audience Member] And the president of the United States, would you like an algorithm to (audience
noise drowns out speaker) – [Mark] In a heartbeat, please. (laughs) – [Audience Member] Vote
for an algorithm (mumbles). – [Audience Member] We’re working on it. – [Mark] I think on the kind of do I want a human decision, you know, there may be a kind of sense in which we feel that humans are sort of happier if they feel that they have been heard by another human being, even though they get worse results. And there’s some psychological literature that kind of points in that direction. The question is how much should we be willing to trade off against sort of bad decisions. Even in sort of criminal law context, it might depend on how much
bias we think there is. It strikes me as not crazy if you were a black man in America to say that you would rather have an algorithm decide whether or not a cop is gonna pull his gun on you, than have a cop in the moment decide to do it. And, I could be wrong. – [Audience Member] At
first I was like no, but, at some point, the systems, I think, are diagnostic, right? So what you’re saying is
deference to the (mumbles). ‘Cause what I’m hearing you say is total deference when a judge just isn’t doing anything? That’s probably got some issues because someone, somewhere should be testing was this good software? Asking ex post, unfortunately, but most of law works that way, these rules, are they working
as they were implemented? But if, over time, something says always Jared who plays lacrosse, someone should go whoa. Or the somewhat infamous Boston pothole example that Kate Proctor talked about. What was good, if I remember, is the rest of her paper pointed out they worked with BU and very quickly
said that’s absurd. Just visualize this data. There’s no way it’s
only rich neighborhoods that have potholes. It’s being not too deferential. – [Mark] But someone has to look at it. – [Audience Member] And to
review as you point out. This point, data at
scale and good software might reveal whoa, we didn’t even know there’s a whole bunch of systemic bias– – [Mark] Although that someone could be another AI. – [Panelist] And indeed the first point you made about wanting to have human judges was a comment that said we could aggregate what
human judges are doing and see if they’ve got
bias over the long-term. Although France has recently implemented a law that prevents you from doing that on penalty of five years imprisonment for aggregating judicial data. – [Mark] I am a felon in
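The aggregate check described here, pooling many decisions and comparing outcomes across groups to surface long-term bias, can be sketched in a few lines; the record layout, field names, and numbers below are purely hypothetical:

```python
# Minimal sketch of an aggregate disparity check over decision records.
# All data here is hypothetical; a real audit would control for offense,
# history, and other confounders before drawing any conclusion.
from collections import defaultdict
from statistics import mean

def disparity_by_group(records, group_key="group", outcome_key="months"):
    """Average outcome per group, plus the largest gap between groups."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r[group_key]].append(r[outcome_key])
    averages = {g: mean(v) for g, v in buckets.items()}
    gap = max(averages.values()) - min(averages.values())
    return averages, gap

# Hypothetical sentencing records for the same offense.
records = [
    {"group": "A", "months": 12}, {"group": "A", "months": 18},
    {"group": "B", "months": 24}, {"group": "B", "months": 30},
]
averages, gap = disparity_by_group(records)
print(averages, gap)  # a 12-month average gap between groups A and B
```

A check like this is exactly what the French law mentioned next would prohibit when the records are judicial decisions tied to named judges.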
France for starting (mumbles). – [Panelist] But AI indeed, I think, is going to be a lot more
correctable than people because if you have a
judge tending to make racially disparate decisions, right, it’s always explainable,
and even if a judge agrees to say all right,
I’m not gonna do that, very hard to get someone to internalize a new rule like that. Whereas, with AI, you can just make a rule and say, now sometimes
you have to track data, it’s using Jared instead of male, but you can make a rule and say optimize this except you can’t take these things or these proxies into account and you can always improve it. – [Mark] So to bring this back to sort of IP and trade secrecy, the value, from an intellectual property perspective, I think trade secrets actually, as Jim suggests, probably are the right match in a kind of economic sense for AI. Partly for the reason he talked about but I’ll mention one
other, which is it’s cheap. You don’t pay anybody
to write your patents. – [Audience Member] Relatively. – [Mark] Well, no, no, no. – [Audience Member] But you
have to prove the (mumbles), you don’t get trade secret protection just because you call it. – [Mark] Well. If you go to court,
everything’s expensive. From the perspective of– – [Audience Member] Even
to manage, there is a cost. – [Mark] There is a cost, absolutely. But I think it’s substantially less than I wanna patent everything and certainly if it’s I want to patent snapshots of my, of my invention. – [Audience Member] Enlighten us that are not familiar what those costs would be, like what that looks like. – [Panelist] What the costs would be of– – [Mark] Of a trade secret management. – [Panelist] You have to figure out what the risks are of losing or getting contamination of your most valuable stuff, how likely those things are to happen, and then you’re presented, if these things are truly important
to your organization, then you look at well what am I willing to spend on mechanisms
that will reduce that risk? Are we gonna put in more guards? Are we gonna have fewer people that get access to it? We’re gonna be less
productive as a result. Every bit of secure,
every security mechanism has a trade-off in terms of money cost or inefficiency in
doing what you’re doing. So you aggregate all of that and pay attention to it and you can see that it’s not free, but, although I absolutely agree with Mark, it’s less expensive. – [Mark] And it’s also– (audience discussing) You have to show that you have taken reasonable efforts
to protect the secrets. – [Audience Member] You treat it like an actual secret. You can’t just say I would prefer other people not to find it. You can truly restrict
access many ways (mumbles). – [Mark] I guess what I would say is yeah, I do think it is cheaper
than patents in the long run. But I also think the marginal cost is close to zero. That is, these are things tech companies are already doing anyway. So most of the things you would do, badged access to buildings, password access to computers, are things tech companies are doing maybe ultimately for trade secret reasons generally, or for some general security reasons, but I don’t think the law is gonna require you to spend a lot more money doing things you wouldn’t otherwise do. – [Panelist] Whereas
each patent application will cost you thousands
and thousands of dollars. – [Audience Member]
Plus, privacy law, right? Big companies. Google was, when I was there, very serious about how many people had access to data, the rules were very secure. So any company doing
data work, interestingly, you get a benefit. Trade secret secure, yes. You’re also showing that you’re trying to be private. So you actually, you’re
gonna have two reasons where, as a cost question,
you’re going to be doing this. – [Panelist] You’re gonna
do access management. – [Audience Member] I think so, right? And you actually look good because by doing that, though data
breach is still a problem, you actually are complying
with another piece of what you should be doing as a company. – [Lisa] It seems worth
noting from an IP perspective that there’s two different conversations we’ve been having in
the past five minutes. One is about whether we should be using AI versus human decision-makers
in different contexts, like when are we gonna be using AI, which is an important law and policy question but not really an IP question. Then there’s this other question of given that AI is being used in various contexts, whether it’s criminal justice or medical or other things, when should society require disclosure, and what disclosure should be required about that versus allowing it to be protected
through trade secrets. And I hear Jim saying,
raising the concern, when you require disclosure,
that’s taking away some of the incentive to create it, which is the kind of problem that IP is concerned about, and
we should remember that secrecy and protection and exclusivity is not the only kind of incentive we have for producing these things. So when we’re talking
about something where like a medical technology
that’s being used where there might be
a huge public interest in other people being able
to test and build on it, there may be other ways to, through the regulatory
process, provide a reward to the people who develop it in exchange for the disclosure. Just by saying you can’t
keep it as a secret doesn’t mean that there
aren’t other things that the legal system can
do to provide incentives. – [Panelist] Yeah. (laughs) – [Audience Member] (mumbles)
I don’t know if this will help about the property over the data? – [Mark] Yes, actually, just
where I wanted to go now. One of the things we’ve established is the training data is one of those more critical pieces that differentiates this from other things. I think there are kind of two dimensions in which to think about this. One is as the user. And one is as the owner or possessor. From the user perspective,
AI isn’t gonna work unless I have very large, very good training data sets, right? And some of the issues
we’ve been talking about about transparency, about
bias, and various other things are gonna be influenced
by not only how big is the training data set,
but how is it constructed? So a lot of the bias in algorithm stuff has turned out to be
small number problems. We don’t have very many facial images of people from American Samoa, and so we’re actually less good at recognizing them. So if you don’t oversample, if you don’t build a training data set that’s sensitive to those problems, you end up in trouble. Everybody wants good training data sets. Training data sets are, tend to be facts about the world, fine. Pictures, images, videos,
text, various things. Almost all of those except
facts about the world are copyrighted. And the key thing about them is that those copyrights are
owned by a whole bunch of different people. So Google has collected a really large group of videos and it can process a lot of information and learn things about videos. It didn’t make any of those videos. And because of the way Google works, it also didn’t collect that database by going out and buying all of that video content from everybody, right? With YouTube, it set up a site where people could
upload their own content, sometimes the content of other people. YouTube has a legal right to use it, subject to certain conditions, for the purpose for
which they are using it, traditionally, which is make this video sharing site available to the public. Similarly, the search engine, right? We went and crawled all over the internet. All the text on the internet is, almost all of it is copyrighted, we collected all that stuff. We didn’t get licenses and permission from everyone, some caveats around what you think the robot
exclusion header does, but we, the courts said
that’s permissible, that’s a lawful use because you’re doing something valuable with it for the world. You are creating a search engine that allows people to find
what they’re looking for. It’s not clear that Google’s taking that database it has built and has a right to use to provide search results or to sort of share people, and saying I’m going to use this to train my self-driving car, or
I’m going to use this to train my personal assistant, will be a fair use of that material. And if it’s not, well there are a lot of different copyright owners out there, each of whom could have a lawsuit,
not just for whatever minuscule share of the
value of the training data set there is, but
we have in copyright law, unlike patent law,
statutory minimum damages that generally far exceed the value of the individual use you might make here. So I think one issue if you are a training database
user is, is it actually gonna be lawful for me to use a training data set for this purpose
when all the stuff in it is owned by millions
of different people? – [Lisa] Question. – [Audience Member] So
how would that argument apply in voice? – [Mark] So, interesting,
I think the same issue would apply. Voice is a little more complicated because, kind of truly
unplanned conversations, generally speaking, are not copyrighted. But they can be made copyrightable in relatively easy ways. So anything that is an
audio communication, that is a movie, a play, a concert, anything that’s scripted, or anything that is recorded with the permission of the person speaking. Those things get copyright protection. So if you wanted to run an AI in voice and you could somehow find only unscripted conversations out on the street between people, you’d probably be fine on the copyright side, you might actually now be guilty of
violating the surveillance and wiretap laws. (laughs) But that’s outside our scope. But once you’re in the world of well, where am I gonna find
good examples of voice, I wanna go find television performances, all of that stuff, all that stuff is gonna be copyrighted. – [Ryan] So some jurisdictions like Japan have text data mining and such exempt, but you were gonna
talk about that next. – [Mark] No, no, go for it. – [Ryan] I was gonna ask you what you thought about it, or what we should be doing in the States, or with the Google Books project, for example, where they went and digitized everyone’s copyrighted books and now they’re training their AI on it, and that was held to be a fair use, well, not the AI training yet, but the digitization.
digitize all the world’s books and make snippets available. But I don’t know that a court would necessarily conclude that because that was legal, that putting it into your training project
is going to be legal. And I think it may well be the case that even if a U.S.
court would conclude it’s fair use, a European court, which has a much more restrictive copyright regime right now, probably wouldn’t reach that conclusion. Interestingly, as Ryan points out, Japan has gone in the opposite direction. A new version of the
Japanese copyright act writes a specific exemption for training data sets from copyright law precisely in order to encourage people to do this. – [Ryan] Right, so the
European new copyright act also has one of those,
but it’s more restrictive than the Japanese one. – [Mark] Yep. – [Audience Member] What’s
the claim on the copyright? In terms of, let’s pretend we have the money to just buy it, first sale doctrine of sorts, build my set, and use it in the background to train my model, where is the copyright violation if I’m training?
to their specific use. And now, they’re making
a different use of it. That different use might also be, I think, should also be permissible. But I can imagine a court looking at that and saying wait a minute,
this isn’t, right, you’re not transforming
the work in the sense that you’re providing the consumer with the snippet that
allows them to find it. You’re just taking this stuff commercially and dumping it into your
self-driving car program. – [Audience Member] If
someone took the Google approach with cash and said
I’m just buying a corpus, not scanning, and then I scanned it, and then I trained on the back end, where’s the copyright claim– – [Mark] Wait, ’cause you made a copy. How did you scan it? You made a copy. This was the whole problem
with Google Book search. They had legal access to all the books they scanned, but
scanning it makes a copy. And the publishers said
hey, no, that’s unlawful. Now the court said no, that’s permissible because it’s valuable,
but it’s not obvious that they will come to that conclusion. I think they should, but, you know, reasonable people can differ, and there’s a lot of money if you’re wrong. – [Audience Member] I’m trying to find a solution to the problem (mumbles). In the situation (mumbles), In the situation like this where Google Maps allows (mumbles), like what is the right, what is my right now to use that? – [Mark] So I think if you’re, if you’re using it to try to catch and identify deepfakes, you probably, you’ll probably be, it’ll
probably be permissible. You’d still have to walk through the copyright fair use analysis. So copyright law, if you make a copy of something, in whole or in part, you’ve triggered copyright law. In the U.S. we have a fair use defense that’s like multifactor, case by case, we know it when we see it test for figuring out some
combination of is this good for the world, and are you taking money away from the copyright owner? – [Audience Member] But you’re
still infringing (mumbles). – [Mark] Yes. – [Audience Member] The point is that it’s still an infringement. – [Mark] Yes, right. – [Audience Member] So
if Google creates videos, like thousands of deepfake
videos of (mumbles) that were born in all
kinds of sexual positions doing all kinds of (mumbles) or whatever, that’s okay? – [Mark] Is it okay? – [Audience Member]
(mumbles) would consent? – [Mark] Well, so the answer is, I don’t think it should be okay, and I think it might not be okay, but it is probably not
copyright infringement– – [Lisa] That’s gonna be
infringing the copyright (mumbles), depending on the people, the videos that you’re starting with. – [Audience Member] So
you start with images of the real person. – [Mark] Right. – [Lisa] So all of those videos and images you’re starting with,
whoever owns the copyright in them, you are probably
infringing their copyright and there are– – [Mark] But the photographer might not, photographer is not necessarily the person who objects to this. The person who objects to this is Elizabeth Warren, she doesn’t own the copyrights because she
didn’t take the pictures. And so this is, this has actually come up in some of the sexual privacy cases involving not deepfakes, but revenge porn, where people are in a relationship, the one party takes a picture of the other unclothed, they break up, they then, the party who took the picture posts it to the internet. The party who’s injured, who’s the person depicted, doesn’t own the copyright in that image because they
didn’t take the picture. The copyright is in the
person who took the picture. – [Audience Member] I have a question. Let’s say I’m an artist and I write an algorithm and train the algorithm with all the texts of Franz Kafka. And I want the algorithm to produce a new Kafka text. So I understand that in the training set there are these copyright protections of whoever owns the
copyright to the Kafka text, but what about the product? – [Mark] Yeah. – [Audience Member] I find it fascinating because it deals with human creativity versus AI creativity. The same, of course, question about music that’s created. Is it the ones who write the algorithms– – [Mark] Make a song that
sounds like Taylor Swift, right? So I think this is a huge,
really interesting question and there are kind of two
different pieces to it, right? Has the algorithm that
wrote a new Franz Kafka story done something creative that warrants copyright protection and if so, who is the author who gets
the copyright protection? And there’s a second question of in the course of doing that, are you infringing Franz Kafka’s rights? Because even though he
never wrote this story, you have collected together his works and so the thing that you have made is presumably something which is not identical to anything he wrote, but it’s similar at kind of some
level to the things he wrote. – [Lisa] And for those who aren’t familiar with copyright law, you are liable for copyright infringement, not just for making a direct copy, but for making anything that is substantially similar, so if you are inspired by some other work writing your own sequel that involves some similar
themes and characters, that can also be copyright infringement. – [Mark] So Marvin Gaye, for instance, his estate has won copyright suits against people who wrote new
songs, different songs, but that sound quite a bit like his songs. – [Audience Member] So just wondered if you’d come back a
little bit on the YouTube example, and I don’t know how it fits in with patent law or something of copyright law, which probably is important for (mumbles). So I heard you say about the book example where it seems to be like another public good, therefore, it was judged to be useful,
therefore, permissible. How, I’m just thinking too, how does that work for using data sets for which not specific permission has been given? First of all, how would
that be judged eventually such as in the case of YouTube if it would help prevent, I dunno, people committing suicide? Giving another example, like being able to derive (mumbles), so that’s one. And then secondly, what’s the best way, thinking about creating
data sets for which, what’s the best way of getting permission for actually using them to train. – [Mark] Yeah. So it’s a good question. Let me start with the second one. So the problem is, and the reason why I think we ought to want this to be a fair use is, if the answer is I have to get every photographer’s permission who’s ever taken a photo that’s on the internet, that’s never gonna happen. And so what we’ll end up with instead are kind of small and non-representative sub-libraries for which
I can get permission. Well, Getty has a certain set of pictures and they are more likely to have this characteristic and so that’s what we’ll train on. That strikes me as bad from
a training perspective. We’d rather have access to
the best possible corpus. But it’s really hard to do if the answer is you have to go get permission from everybody separately. There’s a second layer
to the question though, I think, which is, okay, somebody has compiled all of this stuff. Google has this large database of photo images. One question is can they use it? That’s the fair use question
we’ve been discussing. A second question is, somebody else wants access to that stuff. I want my self-driving cars to stop at stop signs too. I want good training to identify stop signs that aren’t marked in some electronic map. Now, I can imagine Google saying well, yeah, we built this database, we put a lot of effort into it. It’s true that we built this database using the fair use doctrine, but you shouldn’t be able to just copy our database outright. You could come buy it from us or come license the right to us if we’re willing to sell it to you. But if you’re a competitor, maybe we’re not willing to sell it to you. And so I think this
goes back in some sense to this essential facilities idea, right? I don’t think that U.S. antitrust law will get there on essential facilities. European law might. But there may be other ways to get at it. So there’s an interesting ninth circuit decision this year called hiQ that suggests that it might actually be permissible to try to scrape somebody else’s public database. In the past, we’ve used
laws like the Computer Fraud and Abuse Act to restrict that. We’ve used efforts to sort of put contractual terms and conditions on a license to restrict that, and the law has generally
enforced those restrictions. But it may be that courts are more willing to say yeah, you know what, if what you’ve got is a bunch of stuff that does not itself belong to you, it’s in your database and I wanna come and scrape it from you, maybe
that shouldn’t be unlawful. And that would be a way of effectively getting the kind of essential
facilities competition. It’s not you must affirmatively share it with all your
competitors, but you can’t stop people– – [Panelist] As long as they
can access the database. – [Mark] As long as they can have access to the database. – [Panelist] And in Europe then you have also the database directive and copyright (mumbles) database, so Europe, well. – [Mark] No, no, no, go. – [Lisa] There’s a question
back in the corner also. – [Audience Member] I was wondering if we could return to the downstream question of machine creativity, whether machines can actually hold intellectual property rights because I think current understanding of U.S. patent law is that there has to be a human inventor. And so it creates perverse incentives to suggest that there was a human involved even if it was an autonomous machine that did the creation. And my understanding is that in the U.K. and Australia there’s been a little bit of a shift recently, and I was wondering what your
thoughts are around this. – [Mark] Yeah, and I think the answers might differ depending on whether the output of the AI is a patentable, potentially patentable invention or whether it is a novel, a copyrightable novel, of Franz Kafka. So Ryan, you wanna start on the– – [Audience Member] Can I
just add a little bit to that? – [Mark] Yeah. – [Audience Member] ‘Cause I was thinking about the Taylor Swift example is kind of interesting because I think there is the AI out there available that if you don’t really take Taylor Swift, but you take, you know, Selena Gomez and two
or three other similar artists, feed all of that into the AI so that none of them can say it’s significantly similar to them individually, then is it okay? So that’s kind of this
creativity (mumbles) around it. – [Mark] I think the answer is yes, it’s okay if it’s not too similar to any one of them. That’s basically, that’s our pop music, our non-AI version of pop
music right now, right? – [Audience Member] But you can say the same thing for television
shows, or, you know. – [Mark] No, I think that’s right. The answer is yes, as long as what you’re doing, even if it’s kind of derivative if it’s not
sort of too derivative of just one person, then it’s permissible. You can do it. – [Lisa] Though the question
is not a process one, it’s a, like, does the output, from the perspective of an ordinary listener, sound similar. – [Audience Member] Well the whole point would be that the machine can come up with a part of it that’s gonna please people’s tastes. – [Mark] Yeah. So Lisa’s point is an important one. Which is, maybe you wanna– – [Lisa] Yeah. The question is not did I feed in three different musicians and thus whether none of them would have a claim as a matter of the process that I used to create it, it is, given whatever the output is, would an ordinary listener, which is the standard we use in copyright law, think
it is substantially similar to any of these other songs? – [Mark] So if the answer is I fed in three different boy bands, and they all sound like *NSYNC because all three of the bands sound like *NSYNC and so the resulting output still sounds a lot like *NSYNC, I’m copyright infringement, right, I’m infringing. – [Audience Member] It wouldn’t be if the producer found a whole bunch of boy bands and said I’ve got your sound in reality. – [Mark] Sure, if they
made it too similar to one. – [Audience Member]
That’s exactly the edge that would happen, right? It’s back to once we put
the human in the loop, and the producer is like I produce boy bands, I produce young pop stars and they all use auto-tune. – [Mark] I disagree with that though. – [Lisa] It’s still copyright infringement under current law. – [Mark] This is Blurred Lines. – [Audience Member] (mumbles) they’re not suing each other only because they happen to be under the
same corporate umbrella. That’s your point then. – [Mark] No, no, no, no. This is Blurred Lines. – [Panelist] The point is whether or not you’re automating, you still
have the same task, right? – [Mark] Our point is it’s infringement even though humans do it. – [Lisa] Yeah. – [Audience Member] Yeah, but the irony is in practice. The machine that uses the music industry will produce people who absolutely are told sound like this, with a slight variance, but they’re hitting a genre– – [Panelist] (mumbles)
of variance, just enough. – [Mark] You have to vary
enough or they get sued. This is what happened in
the Blurred Lines case, and they lost to Marvin Gaye’s estate. So let’s go back to this question, right? Because I do think we’ve got both a, who, if anyone, owns the output of an AI if it’s a technical output, and who owns it if it’s
a copyrighted output. Ryan, why don’t you go with it. – [Ryan] Well, I’ll start
with the patentable output. So people have been claiming machines have been doing this since the ’80s, but it hadn’t, until very recently, come up as a case. But it may be a substantial part of what AI is doing now in things like the Watson Insights business model. So companies give big data to IBM which runs the data through the suite of algorithms that are Watson. It produces some output that outputs an insight. It belongs by contract to the client, and the client can file a patent on it. But it isn’t clear there you’ve really got a traditional human inventor, right? Because if they just handed their data over, that doesn’t do it. If they commissioned
research, that doesn’t do it. IBM’s not an inventor ’cause companies can’t be inventors. Maybe the people who programmed Watson or curated or trained it on data, but probably not, well, it could be hundreds or thousands of programmers, but if they didn’t have a specific expectation of the problem it was going to solve, that wouldn’t make them an inventor. – [Mark] Maybe. – [Ryan] Maybe, well, sometimes. – [Mark] Accidental
invention is a real thing. – [Ryan] Well, accidental invention is a real thing if you were the first person to recognize the patentable subject matter of the output. And if Watson says here’s 1000 ways to design a new aircraft wing and someone goes through them and says I think this one’s gonna work, right? That might be an inventor. But if Watson says here’s, you know, I ask Watson give me a
better aircraft wing, here’s all of Boeing’s data, and Watson says here’s your aircraft
wing, and you say great, I’m gonna file a patent on this. If people were doing that, if IBM was going to a team of human researchers, or Boeing was going to
a human team of researchers and saying I want a better aircraft wing, here’s my data, and they come back and say here’s your better
aircraft wing, and you say great, I’m gonna be rich. Let’s file a patent on this. That wouldn’t make you an inventor, right? In the same way, if I train my PhD student, or I have children, I’m not an inventor on their patents, at least without conceiving of the actual invention. So we filed a couple of patent test cases, a group I’m leading, in the U.S., the U.K., Europe, Taiwan, and now Israel, that got announced a couple of months ago, and the inventions were autonomously made by an AI. It was a neural network system. In its ’90s iteration, you had one neural network trained on data. It would perturb its own connection weights, corrupting its data and outputting novel data, and you
train a second network, a critic network, to evaluate the output data and say I’m looking for
something interesting that meets these criteria and
this new output does that. Some of the time, someone in there may be a human inventor,
and some of the time maybe not. If you’re just asking
it to solve a problem and giving it some very general stuff. Now, these machines have thousands or millions of neural
networks working together. Each network encodes a conceptual space, so there may be a network for academic and a network for boring
and a network for escape, and it gets trained to
combine them into simple ideas like escape an academic
lecture that’s boring. And if you combine it with– – [Mark] Too close to home here, Ryan. – [Ryan] Too close to home. I can jump out of a window but I’ll die, but flying devices may
let me jump out a window, and then it combines it into a new use for a flying device: to escape a boring academic lecture. That could be an idea. So it made a new beverage container and it made a flashing light. And we filed for these, and in almost every jurisdiction, but
not every jurisdiction, there’s a law that says a natural person has to be an
inventor and an inventor generally has first rights
to a patent application. And these laws date back decades, to times when people were concerned about companies owning patents, which they do, they own most patents, and not listing human inventors, so human inventors wouldn’t be acknowledged. It would violate their moral rights. And so a person has to be an inventor even though a company usually owns it. And there’s really no
law in any jurisdiction talking about whether you can protect something that has no human inventor. Who would be listed if you don’t have a traditional human inventor or who would own that thing? – [Audience Member] Does
Korea not speak on that? – [Ryan] We’ll come to Korea. Okay. So we filed these applications and say well, there’s a lot of
reasons we have patents, but the primary economic
one in the U.S., U.K. regime is that we want
to incentivize innovation and people have said well, machines don’t care about patents, but people like, or companies like IBM and Google that are investing in inventive
AI do care about patents. The people who build, use, and train AI care about patents. And so really what we want to incentivize is people to build inventive AI and this will get us more
innovation in the long run. Also, in twenty years from now, if machines are better than people at innovating, at least in certain areas, if only people could get patents but machines couldn’t, you’d have a real perverse incentive where you couldn’t use a machine
to invent something, even if it was better if you really cared about patent protection. So we think these things should get patent protection. We think a machine should be listed as the inventor when it has functioned as an inventor because even though machines don’t care about being acknowledged, if I could take credit for having Watson invent
1000 things for me, that cheapens the act of human invention. So it would allow a real human inventor and someone who just asked a machine to solve a problem to have
the same acknowledgment. – [Audience Member] No
it just really plays into a lot of the anti-AI fears, general AI being one, starting to get AI that could start improving itself. Like if you would be able to patent that, you’d get, like, basically an undesirable ethical outcome– – [Ryan] What would that be? – [Audience Member] (mumbles) catch up. – [Ryan] Well, right. So we’ll come to that next, right? And then who would own these patents? The owner of the AI
should own its patents. So we’re filing these on behalf of the AI’s owner with the AI listed as the inventor and– – [Mark] The AI. – [Ryan] The AI. All right, well let’s come to that. Let’s go back to that. Right now, inventive
AI is kind of a novelty, though some, even big, companies are telling me it’s happening, but kind of as a side show or only occasionally. But AI is getting a lot
better and people aren’t. (audience laughs) – [Audience Member]
(mumbles) that is exactly what you’re talking about (mumbles). – [Ryan] Right, well there’s a lot of different AI architectures that can generate new information. Genetic programming, neural networks, even some symbolic and logic and expert system has seemed to have done this, but, in any case, if in 20 years from now you can go to Watson and the life sciences or Pfizer can license
Watson and say, you know, what antigen should I be looking at? What antibodies will target these? Tell me how these will
behave in clinical trials. And really, you just have people asking machines for answers. Indeed, it would get harder
for people to compete. And indeed, the standard
we use in patent law of the person having
ordinary skill in the art, which we use to evaluate whether something is obvious or not, is going to change as people are increasingly
augmented by AI, making them more knowledgeable
and sophisticated. And eventually, as inventive AI becomes the standard means of inventing, the person of reasonable
skill would really be an inventive AI, which would make it very hard for people
to compete with these things. On the other hand, if what we care about is generating innovation primarily and inventive AI is one day, the best way to do that, then
maybe Watson and DeepMind will cure cancer, and maybe they will have a 20 year monopoly on that, no sort of anti-competitive problems– – [Mark] Why do we need to give ’em a 20 year monopoly? I see why we want to give a human being a 20 year monopoly, right? Why does the AI need a 20 year monopoly? – [Audience Member] Because the rules are clearly very different. – [Mark] Well, I don’t
know if I agree with that. – [Ryan] So, indeed, companies are the ones investing in this and hiring researchers and owning most patents anyway, right? It’s a choice for IBM or Pfizer to say do I have to ask a human research team what’s a new use of Viagra or
can I just ask Watson, right? And, ultimately for them, investment upstream in the inventive AI is what we wanna subsidize to get
this sort of innovation. – [Mark] So that’s right, at some level. But it seems to me it becomes less and less right as the AI improves itself. – [Ryan] As the AI improves itself, it will naturally raise the bar to inventive activity
because it will have raised the standard level of
the average researcher, and it will be harder and harder to get patents even for inventive AI. – [Audience Member] You make an assumption there that it raises the level of the average researcher. That’s not necessarily the case if not every researcher
has access to this thing. So here’s the point, like,
it won’t be averaging out. – [Audience Member] The mean
may increase without the median, actually. – [Ryan] Well, it is certainly going to be fact-dependent depending
on the area that you’re in. But, I mean, that is a
world we’re thinking about, where inventive AI is
outperforming people. – [Mark] But the point is
also distribution of the AI. And this goes back to a
question you raised earlier about why it is we
aren’t seeing disruptions of the Googles of the
world, and if, in fact, these are self-reinforcing, then you might be in a situation in
which the patent system actually contributes to
that self-reinforcing. Because only the top-level
AIs are producing things, including
improvements in themselves, that can be patented. No one else who’s trying to play catch up gets patent protection because they’re below the standard of (mumbles). – [Ryan] And so to the extent that what you’re really concerned about is an anti-competitive environment, it seems to me that we
do have tools for that, although, as you point out, they’re really not used at all. But yes, it would seem to exacerbate this concern we have about
market consolidation. Although it is not necessarily going to be the case that there will only be two super-powerful AIs in the world. I mean, there are lots of other people working on AI. The AI that invented these
things we’re filing on was from a small company
out in middle America, so it is not just DeepMind
and Watson making things. And if it does turn
out to just be DeepMind and Watson, there are things
we could do about that. – [Audience Member]
(mumbles) fundamental point as to whether actually patent law should be completely revised
in light of, kind of. – [Mark] Of course, if you’re starting from the position that really we don’t need patents at all to start with and they do more harm than good and we shouldn’t have
them in the first place, well, some people do start from that position. – [Audience Member]
Does that argument start to increase in weight as these things do start to occur where outputs of AI– – [Mark] Right. The economic theory of patents has to be that we won’t get innovation without it. And so the story for justification of this has to be, I think, we won’t get up front initial investment by the IBMs and Googles of the world that produces an AI that’s good enough to then start improving itself to the point where we don’t actually need
that investment anymore. – [Ryan] Right, and I want
to challenge that slightly. Because I think that the
economics of patent law are that we want to accelerate innovation that would probably be
happening anyway, right? And so eventually, if you have an infinite time course, people will invent everything there is to
invent, and there are other reasons to invent, but providing an initial financial impetus to do so accelerates investment. And, really quickly, well, it’s good you’re excited. What we really want to accelerate is development of superintelligent AI that is going to solve all our problems like climate change and the president. (audience laughs) And things like that. And if we can encourage that now, through the IP system, we’ll have a lot of social benefit. – [Audience Member] Let’s
make it very explicit. What would happen if a
generative adversarial network, GANs, like, if that had been a patented invention, do you think that would actually (mumbles) or would it have actually reduced innovation? – [Ryan] Well GANs may
have been a patented thing before they were called GANs, but that’s a whole separate thing. Patenting an AI is different from patenting the output of an AI, right? And so your question is kind of what we were talking about earlier, can you patent an AI, do you have to deposit an inventive AI, and this is, you know, that’d be like saying we’re gonna patent our researcher person,
everything, you know. – [Audience Member]
There’s actually another point, though, which is that you can patent things and people can
choose not to enforce them, right, for differing incentives. There’s plenty of things out there in the world that are sitting as unenforced patents and people continue to file on things without any intention of blocking other people from copying that exact (mumbles). – [Mark] I actually think a truly fascinating question under Ryan’s universe is what do we do with open-source AIs? But again, not the question
of patenting the AI, who, if anyone, should own the output of an open-source AI? – [Audience Member] If it’s out there, and everyone can do it,
you’re not gonna need– – [Mark] Then everything is obvious, anyone can use this AI. – [Lisa] It’s gonna depend on the facts. If the concept is what
you are giving to this, the question you are asking this open-source AI, I mean, I think there’s still, that person could plausibly be the inventor. – [Mark] Okay so, you’ve been
waiting patiently, sorry. – [Audience Member] So again, back to this creativity of the question whether the copyright owner is the AI over the output. I’ve come across recently about, on a European product, an
algorithm called (mumbles) by an Italian group that basically helps professional musicians by feeding the algorithm with their own music. And it helps them then to create. So, in a sense, and I think that’s what I hear from musicians, in reality, AI doesn’t really come up with terribly exciting new stuff; if musicians use it, it’s usually either for marketing reasons, so that they can say there’s AI in it, because it’s the new thing, or, as an augmenting tool, in order to advance and to, kind of, tweak and hasten their
own creative processes. So if you have an algorithm like that for a professional musician, that just helps, you know, comes
up with suggestions, new melodies and so on. Who owns that? – [Mark] There’s a line. There’s a spectrum here. So, on the one end of the spectrum, let’s put Microsoft Word. There’s no question that Microsoft Word helps me write things. It helps me write them more efficiently than it used to be. It’s also, I think, no question that we would say, it is not an AI creator of the thing, the creativity is in the human input
which is being augmented. And on the other end of the spectrum I would put the, we
fed all the Franz Kafka novels into a Kafka novel generator and the Kafka novel generator generates novels that are like Franz Kafka and let’s assume are good, or at least interesting enough that
they might be purchased. And there, I think, in that latter case, there’s a parallel
argument to the one Ryan’s making which says just as he thinks the AI in itself should be an inventor, the AI itself should be the author. The AI has generated something. And yes, it’s true that it generated it from a base set of knowledge, but that’s kind of like a
training data set, right? I think your case is in between the two. And realistically, that’s where we are and probably will be
for most types of things for a little while, although there are examples of painting
AIs which have generated and sold paintings in the style of the Dutch masters, for instance. The Dutch masters were not
an accidental choice. Copyrights last a long time, but they do eventually expire. So as long as your training data set has only things from before
1924, you’re in good shape. But of course that’s not really what popular music or most other things will wanna be trained on. So I think on this copyright question, the laws might actually turn out to be somewhat different. Maybe not. – [Panelist] You wanna talk
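Mark’s pre-1924 rule of thumb is straightforward to apply mechanically. A minimal, purely illustrative sketch in Python, assuming each corpus record carries a publication year (the titles and record format here are invented for the example; a real corpus would need per-work rights research, and the U.S. public-domain cutoff advances each year):

```python
# Purely illustrative: keep only works published before the U.S.
# public-domain cutoff Mark mentions (1924, as of this 2019 panel).
PUBLIC_DOMAIN_CUTOFF = 1924

corpus = [
    {"title": "The Trial", "year": 1925},          # still under copyright in 2019
    {"title": "The Metamorphosis", "year": 1915},  # public domain
    {"title": "Dracula", "year": 1897},            # public domain
]

training_set = [work for work in corpus if work["year"] < PUBLIC_DOMAIN_CUTOFF]
print([work["title"] for work in training_set])  # → ['The Metamorphosis', 'Dracula']
```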
about the chimpanzee (mumbles). – [Mark] Yes, exactly. So we do have a case that
presents this question not quite for AI. This is the monkey selfie. You might or might not have heard of the monkey selfie, but about four or five years ago, a photographer went down to a tribe of monkeys and set his camera down
and one of the monkeys picked up the camera and took a very iconic selfie photograph of itself. This photograph became very popular, very well-known. And the question was
who owned it, if anyone? That question was decided by the ninth circuit court of appeals
here in California last year. The People for the Ethical
Treatment of Animals intervened on behalf of the monkey saying the monkey is the copyright owner. The monkey is the one
who took the photograph. The monkey has created the creativity. The owner of the camera says no, I am the copyright owner because it’s my camera and there’s a photograph that came out of the camera. The ninth circuit said
last year in this case that a monkey cannot own a copyright. – [Panelist] Well, not quite. – [Mark] Well, okay, yes
they did, but all right. – [Panelist] They kicked
the case on standing. They said Congress did not authorize a monkey to sue under the copyright act, so we are not going to
entertain this case. – [Mark] Correct. – [Panelist] And the copyright office, there is no case on this, the copyright office has a human authorship requirement, which it cites to the 1884 case of Burrow-Giles v. Sarony in support of, saying that only– – [Mark] Earliest photography case. – [Panelist] Only the acts of human intellectual creativity can be protected. And that was a case where the defendant was saying because he used a camera it negates human authorship. This is just a recreation
of a natural phenomenon. But there’s never really been substantive consideration of the policy behind non-human authorship. – [Mark] To be fair,
it’s not obvious to me that it makes sense to say the monkey can’t have a copyright, but the one thing I think that does seem right to me is that the owner of the camera shouldn’t have the copyright. And so this, I think, has an interesting parallel to what Ryan’s
talking about, right? His claim is I own the camera. The camera, through some non-human agency, generated creativity. It created this photograph, therefore I should own this output. That seems to me kind of parallel to the argument that you’re
making for patents and AI. – [Ryan] So I think that it is a very parallel case but not here. The camera is just a tool the way almost every computer is a tool. The inventions we filed on and some leading AI systems are functionally mimicking human creative behavior. – [Mark] But so is the monkey. – [Ryan] Well first, David Slater who was the photographer for that case by the way like this
Napoleon Sarony case, claimed later that he
set the whole thing up very carefully and enticed
the monkey to come over– – [Mark] Not what he claimed at the, when the picture was first taken but by the time of litigation that is, in fact, what he claimed. – [Ryan] And I should be clear in our patent case, and in my scholarship, I’m not saying the AI
would own the patents. That’s just crazy. Because, in addition to not having legal personhood, they have no interests or moral agency, they wouldn’t be
incentivized by patents. It’s really the behavior of the people who make, sell, and use AI. And for the same reason
as copyrighted work. So AI has been able to write books or take photographs or make paintings for a long time. They’re just starting to have real financial value. And again, in 20 years, AI may be doing a better job of making boy bands than actual boy bands themselves, in which case this isn’t gonna be like a funny monkey novelty, it’s gonna be like wow, our
society is changing– – [Mark] We’re halfway there
with K-pop anyway, right? (audience discussing) – [Ryan] And so it’s even, I think that’s a little trickier in copyright where the bar to not being a tool for a person is so very low, that wasn’t quite how I meant to say it, but I think it works, right? I mean, with human
invention, we have a test. Did you conceive of or
devise an invention, right? And I can give you a factual scenario, hypothetically or not, where there’s no person who qualifies there and the act was done by a machine. So if you have a machine
that is, you know, training on every picture on the internet and then you push a button and say give me a new picture
and it makes a picture that is not substantially similar to anyone else’s picture, right? I’d say the machine there functionally made something creative. If you are just pushing a button and a camera is capturing light and imprinting it digitally or otherwise, I think that’s a tool. Even though the bar for originality and creativity is so low in copyright, there is still a bar for that, right? So I mean, I guess if you dropped your own camera and it takes a picture, that doesn’t make anyone an author of that picture. It hasn’t come up in a case yet. – [Mark] It’s unclear, I mean, there’s language, stray language that suggests it might. To me, where this goes in both of these contexts is, it doesn’t seem obvious to me that anyone should own the photo in the monkey selfie case. That is, if the point of copyright law is to encourage creativity and creativity requires human agency to be encouraged, the monkey doesn’t need the incentive, and the guy who owns the camera didn’t do anything
that involves the incentive. – [Audience Member] So the answer can be that nobody owns it, right? That’s another option? – [Mark] Exactly. – [Ryan] So that’s the
copyright office policy. The U.K. incidentally has a different rule that says in the case
of a computer-generated work the producer of the work is deemed the author of the work. And I think that you should have copyright for animal art. And I think that the owner of the monkey should own its copyright which would be the
Indonesian Nature Reserve. And even though this is a novelty, there’s not a huge market for animal art, but there is a market for animal art, and if people buy paintings by gorillas and elephants, if you allowed copyright on this, you would incentivize that. You would have zoos and nature reserves finding a new potential source of revenue. It’s a small thing. The AI is like that
but a big thing, right? – [Audience Member] So the U.K. works like that across every principle though, right? If any employee creates something, their employer basically owns that under U.K. IP regimes, right? – [Ryan] Well under the default one generally, but the U.S. has a similar sort of regime like that. – [Mark] All right, sir. – [Audience Member] So on this, this is not hypothetical, what we’re looking at in terms of biodiversity, the way we’re looking at AI is
to look at biomimicry, so the behavior of certain species and how that can be used, ’cause that’s got value to different robotics and can be a revenue stream
back to the countries and provinces (mumbles), and so in that case you just mentioned,
the chimpanzee case, the IP would not revert, you know, there being no human inventor. So in our application, of nature behaving a certain way, swarms of species of individuals, there is an AI which is improving at identifying those bodies, creating an algorithm. That algorithm is then sold on to robotics companies, who need them for various reasons, traffic flow and other areas. In that case, who would own that IP? Or are you saying that
there is no protection really for this sort of application? – [Ryan] Well, I think probably, under current legal regimes, there would be no protection, and the U.S. isn’t part of the Nagoya Protocol. But that is a very
interesting intersection and how computer-generated works and biogenetic resources and biodiversity have kind of this interaction like that. Or maybe data harvesting from companies using AI to train it. Those are kind of cutting edge, unclear questions to me. But I don’t think they’d have a case under current U.S. laws. – [Lisa] If I understand
what you’re describing, which maybe I don’t, you’re describing an algorithm that is inspired by what– – [Audience Member] Either
imitate or inspired by. – [Lisa] Right. That seems the creators of the algorithm would own any patents
that are related to that. – [Ryan] I think he was asking if it was inspired by a nation’s biological diversity, would they have some claim on the IP
that comes out of that. – [Lisa] There’s just lots of mechanical things that have been inspired by trying to mimic various things about the real world that’s the creator of that mechanical thing owns– – [Mark] And not just
mechanical things, right? Lots of pharmaceuticals. I think we’ve both, at least in the U.S., we’ve distinguished between you are the inventor of this new drug Taxol that you got from the bark of the Pacific Yew tree, even though it came out of nature, because you’ve changed it and modified it in
ways that are valuable, or you made a synthetic equivalent. You don’t get the bark
of the Pacific Yew tree. I think you can get– – [Audience Member] You can get the method of making a bark– – [Mark] You can get a method of making, you can get a synthetic, I think you can also get stuff taken and isolated and put in pill form, although that is– – [Audience Member] Depends
on method of treat– – [Mark] Well, no, no,
no, the pill itself. (audience talking) That is less clear now than it used to be under the law. But then there is a second level question about whether or not we want regimes or rules to give back to the countries in which these things were found. I would call that a
sort of quasi-IP regime. Intellectual property
doesn’t actually do it, but we’ve got various
efforts, and the Nagoya Protocol is an example here of efforts to try to make sure that if we got the basis of this valuable invention out of the rainforest,
out of nature somewhere, or out of a culture, that we give back some rights to that. I think it doesn’t fit well with the way we structured existing IP law, but we sort of layer it on top of it. – [Audience Member]
(mumbles) algorithm it’s something within (mumbles) – [Mark] I think you could patent the algorithm and you would own that algorithm free and clear, despite the fact that you were inspired by the, by the animals. You might feel like there’s some moral or maybe kind of legal obligation, depending on where you are, to say hey, we should compensate or give back to the society from which
we got this inspiration. – [Audience Member] Question behind you. – [Audience Member] You
go ahead, you go first. – [Audience Member] So, Ryan, you are essentially arguing that there needs to be some
agency, or it came up that there should be some agency behind an inventive act. – [Ryan] No, no, no, that was in terms of ownership of property. I think having AI owning property right now, both legally doesn’t fly, but also, you know, there are good reasons for that, including the fact that AI is property. It doesn’t have any moral interests. It would wreak havoc on our legal systems without a very good, narrow case for it. – [Audience Member] So artists often talk about human intentionality as a basic principle of art. Even animals can have intentionality, like the monkey case. How would you argue that an AI system operating completely autonomously has some intentionality or some agency that drove the creative act in the first place? Is it necessary or not? – [Ryan] I would tend to say throughout all of this that intentionality and awareness of one’s own activities are not really necessary for regulation of AI. It should be based on what an AI is functionally doing, not whether an AI can think in a philosophical sense. And so if an AI can autonomously generate a useful idea
that’s the foundation of a patent, it really
shouldn’t matter to us whether it’s thinking
about what it’s doing or just combining things in a
way that makes useful stuff. – [Audience Member] So
if there was a human that set up the problem, that said, have the AI system try to figure this out, and you mentioned the area of flashing lights. Is the setup of the problem a contribution to the invention or not? – [Ryan] Sometimes. It depends how specific the problem is. In some fields, figuring out the problem is really the inventive act, right? Formulating the problem precisely. If you say build a better aircraft wing or make a better battery or find a new use of this drug, right? That I do not think qualifies someone to be a patent inventor. But if you say I need to find a new use for this in this particular indication. I have some idea it’s gonna work here. I think we need to look at these data sources and I think we need to exclude this outlying data and then plug it into the algorithm and see if it works. I think that makes a person an inventor. And I think really the analogy is we have people that can help reduce things to practice, like laboratory assistants, or Microsoft Word, or tools. If the AI were a person, would what it is doing qualify it to be an inventor? That, I think, is the right way to frame it. – [Audience Member] Just
a question for (mumbles), I kind of really like
what you’re doing, Ryan, in trying to kind of bring some cases to get some jurisprudence out there. What do you all think would be interesting if you could make up a case or find one? Would that really help resolve some outstanding challenges (mumbles)? – [Mark] You know, it’s been my experience that you can’t make up cases as crazy as the ones we get in the real world. (audience laughs) – [Audience Member] (mumbles)
monkeys in pictures. – [Mark] Well, right, exactly, right. I would never have thought to come up with that as a copyright
exam hypothetical. That sort of monkey takes a selfie, right? I think a lot really depends on the ways in which technology develops in ways that I just don’t
fully understand yet, right? I think the legal system looks very different in a world in which what we’ve got is really big data and we just kind of test a bunch of hypotheses and do a bunch of statistics and because we have more data and we’re faster at it, we get better
at outcomes than we used to. With that, I tend to
lean toward Lisa’s view of the yeah, the patent
system’s got this, right? We’ve got our own quirks, right? But we kinda know how to
deal with those things. The world in which the AI actually starts to self-improve in truly unexpected ways, to define what is an improvement, right, at that point, I think
just is much harder, right? Because I do think it is, I think it’s more difficult for us to judge that by existing legal standards. So maybe the answer is we judge it by existing legal standards and we just end up throwing everything
out, per Ryan’s paper, everything is obvious because now the AIs can all figure it out. Maybe it’s, well, we adapt the standards to try to replicate what we’ve done in the human world, now that the AI’s the creator, the AI is the inventor. I’m inclined to think that if we’re in that world, you know, the AIs will sort of, kind of, be generating new stuff because they’re running, you know, we started them doing something, but now they’re kind of running themselves and they’re doing what they want. The idea that we need to kind of map that to an existing legal
system of intellectual property that wasn’t designed for it seems, to me, wrong. I think the IP is a
system that is designed to create artificial scarcity where we didn’t have it, because we need scarcity. That’s how our economics works, that’s how our business works, that’s how we know how to sell things. But it’s not obvious in this world, right, with that AI, that we need to sort of tie ourselves to a scarcity model in a post-scarcity landscape, in which the answer is inventions aren’t scarce and don’t require huge investments. They happen because AIs were asked to solve a problem and they solved it. That may be a world in which intellectual property is not actually the best way to think about this. Because it won’t end up accelerating the development of those new things, it’ll end up delaying it as we end up in legal fights over ownership. So my inclination– – [Lisa] Well, I think that answer is really the same in many ways as Ryan’s everything-is-obvious answer. The obviousness doctrine in patent law, a lot of people say the right way to think about it, and I think
there’s a lot to this, is through the question of, like, something is not obvious, such that we should give a patent, when you need the patent in order to induce its creation in a reasonable time compared to what would have happened without the patent. And when we move to a world where things are being created regularly without the incentive of a patent, then everything is obvious, you don’t need the patent system under
the patent system’s own logic. – [Mark] So what’s curious is that copyright and patent law may go in different directions in this regard. Because our standard for originality in copyright is so low, I think we would say the Franz Kafka novel is copyrightable subject matter, and there is no obviousness test as there is in patent law. I don’t know that we need to be granting copyrights to people who happen to own AIs who are generating Franz Kafka novels. If we do, and think about this from the patent side too, if we do, it’s not to encourage the AI, I think it’s actually to sort of, kind of create a welfare mechanism for artists, human artists and inventors. It’s perfectly possible for me to imagine 20 years from now, not just that an AI’s as good at making boy bands as the boy bands are, but that all of the music that people most want to listen to is actually generated by AIs, because AIs have
actually figured out what people wanna listen to and they have responded to that better than individual humans can. And maybe the answer is, great. We’ve got free music of a kind that sort of pleases us more, excellent. I feel sorry for the music people who used to make a living as musicians just as I feel sorry for the people who used to make a living as, you know, horse-drawn carriage drivers or truck drivers by this time, right? But too bad, that’s
one of those businesses that’s gone by the wayside. But we might actually
think, you know what, actually there’s value in human creativity that’s not just measured by what’s the, how much do people like it, how much are they willing to pay for it, but the value in the
act of creating, right, and in the act of producing the art, that maybe that’s constitutive of what makes people human. Maybe it, sort of, contributes to the culture in a way that is independent of the output. So then, what we end up with is IP as a kind of subsidy for human creativity in a world where AI could
do it without the IP. – [Audience Member]
(mumbles) oh, you know what? I’ll be quiet because you’ve been trying to get in for a while. – [Mark] Sorry we have our backs– – [Audience Member] That’s okay. Can we circle over, can Ryan,
that’s your name, right? I feel like the cases you mentioned as kind of bellwether test cases are very interesting, so I kinda wanna walk out with a correct takeaway. Is the takeaway, in your mind, on those cases a test of whether AI can be considered the human, the proxy human for innovation? Or what is the central thing
those cases are testing? – [Ryan] Well, they’re
testing three things. One, whether you can get a patent on something not made in a traditional way by a person, so where you don’t have a traditional human inventor. We could have a new rule for saying someone’s a human inventor there, but we don’t right now. Two, who would be listed as an inventor on those applications, or what? And then three, who or what
would own those applications? Our position is an invention’s an invention’s an invention. So there’s no reason why IBM shouldn’t be able to patent something made by Watson as well as one
of its research scientists. If a machine has functionally been an inventor it should be listed as one. And the rights to the invention would go to the AI’s owner. – [Audience Member] Okay. – [Ryan] So we announced this a couple months ago, patent offices have been very surprisingly receptive to it, not all of them, some of them not so much. – [Audience Member]
(mumbles) Korea if you could, that’s relevant to the answer. – [Ryan] Right. So in Israel they said
you don’t even have to disclose an inventor, so this really isn’t an issue for us. It only comes up during a dispute. Discussing with some others, the U.S. PTO is having a request for comment period and is interested in
the case and is trying to figure out what do we
want to do about this? – [Audience Member] It’s
clearly cutting-edge, so. – [Ryan] Right. I think even five years ago, this would have been viewed less favorably, but it seems to be, you know, people are interested in this right now. – [Audience Member] And then my second question was, do you think either in law schools or with patent
or copyright lawyers around the country, how many of them are thinking about the AI issues, or it’s really not (mumbles). – [Mark] Oh my God. Like the difference between now and two years ago is unbelievable. Our students, this is kind of the hottest thing for our students right now. There is the Stanford AI and Law Society, which the students founded two years ago, with kind of 60 or 70 members in the new class. Classes are springing up everywhere. People are really starting
to think about it. I started teaching 25 years ago, when the law of the internet was a kind of new baby thing that, sort of, looked a little odd, and this feels a lot like that to me, for what it’s worth. Right, that there’s a whole bunch of kind of really fascinating questions that have both practical import, but also kind of end up driving our
philosophical, you know, why do we have this? What is it that makes the
creativity worth protecting and that kind of thing. – [Audience Member] In addition to the, we talked a lot about the (mumbles) of property rights and the
copyrights and so on, right? Is there a third front of the data rights? Who really owns the data, right? – [Mark] Yeah. So we ended up kinda
detouring off of this, right? So I talked about the training data set from the user’s perspective: am I gonna get in trouble for using copyrighted material in training data? But the people who happen to have, or have collected, sets of data have an interest in protecting that themselves. Now, if it is their own creative material, if you took a bunch of
photographs, no problem, copyright law will protect
those individual bits. But usually, it is either
stuff you don’t yourself own the underlying
copyright to, or it’s facts. It’s information about the world, which is not copyrightable. So the question is: is there some right in the aggregate
whole of the data set, even though I don’t have rights in any of the individual parts to own it? – [Audience Member] (mumbles)
where does it fit, right? It is not, and may be (mumbles). – [Mark] Ah, okay. That’s a different question. So I’m envisioning, sort of, you go and search the Google database. Google would like to be able to say we have ownership rights
over our database, our universe of things, and so we can control what you do with it. In Europe, there is a database directive. So there is at least limited protection that gives control to people who have built and compiled databases. In the U.S., we have traditionally said no protection for the database, for the individual facts in the database, but you can protect the selection or arrangement of the data. Now– – [Audience Member] This is
for published information. – [Mark] Yeah, under
copyright law, right, yes. Trade secret is another. Good point, we will come
back to trade secrets. But published or not, it is, copyright law will protect the selection, a creative selection or arrangement of a data set. So if I wrote a, sort of, my 100 favorite restaurants in the bay area, even though they are individual facts, my selection of which ones go in the
100 is idiosyncratic and creative and I get to protect that against someone who copies it. The problem is arrangement has essentially gone by the
wayside as databases have become computerized. We used to think about
how is this organized? So if I rank them from best restaurant to 100th best restaurant, I might protect that creative arrangement. Databases are not really arranging things in a creative way. They’re arranging things in
a functionally-driven way. And selection, for some databases, that will matter. So if you’ve curated the
best photos of stop signs or the best indicators
of kind of exemplars of voice for each different accent, that could get you protection. But if what I want is a training data set that is as big as possible,
has all of the things I can collect that might
be relevant to this, just gathering everything
is not itself creative. So I think we will
ultimately see a kind of push to get some sort of legal protection for the training data set itself, at least against wholesale copying. At least against somebody who comes in and sort of scrapes the whole thing. So we might see that in Congress. We might see it through trade secret law, if it is a secret, or through some effort to use contract. – [Audience Member] (mumbles)
talk about the other (mumbles) ’cause I think
the answer to your question is your inputs into, let’s say, the search engine or Amazon, by
contract, you’ve given up; in your interaction, they are deemed the owners of your interaction. – [Mark] And there are proposals
out there to create sort of, property-like rights in your individual personal data, right? To say, you know, my data
in the Facebook graph, or my sort of clickstream choices are things that I should own. I don’t know that those are gonna go anywhere. I kinda hope they don’t
go anywhere because, while I think there’s a real benefit to greater privacy protection, propertizing each
individual’s privacy data strikes me as a bad way to do it. In part, because now you’ve got 500 million people– – [Audience Member] Talk
about rights clearance. – [Mark] Exactly. And the only way we will clear, we either won’t use any products that use the data or we’ll clear it in this kind of, well, by coming on my webpage ever, you have agreed to the following
terms and conditions which I will change at my convenience whenever I want, which is kinda how we do it now. – [Audience Member] What
about derivative data? For instance, for machine learning, often it isn’t actually the
video that’s important, it’s actually the data you
derive from it, such as, distance between people, or. So it’s actually the
machine learning algorithms that can train on the derivative of the actual copyrighted (mumbles). – [Mark] This is a great question. I have a draft paper now talking about this sort of copyright liability: will I be in trouble for copyright for using this training data set? What Bryan Casey and I argue is that where the thing that makes this valuable to me, to my AI, is not the creative part, not the reason we are protecting it, we shouldn’t treat the AI’s use of it as copyright infringement. So if all I’m interested in is the facts. All I’m interested in is
the stop signs and not the artistic lighting of the picture you took that caught them, the reason we give copyright to those things just doesn’t map to the reason that I’m making use of those things. – [Audience Member] Is that the functional, non-functional– – [Mark] It is, right. – [Audience Member] That
might be worth explaining because it’s an exception to the whole– – [Mark] Right. Copyright law protects the creative aspects of things but not the underlying idea or the functional elements of things. And that can look a little weird. At a basic level it means I can’t copy your novel, but I can write a novel that has a number of the same elements. It has a love triangle and a sort of murder in a locked room and whatever else. – [Audience Member] The exception also means there’s a large body of folks, and I’ll say I’m in that camp, who think most of software is probably not copyrightable. – [Mark] Yeah, so right. We copyright computer source code, but one of the oddities is, right, in theory we are protecting only the creative and not the functional aspects of that code– – [Audience Member] Seems like front-end interface, not the stuff that we actually– – [Mark] Wait. It may also mean that the worse you are at writing code, the
more protection you get. If you got from A to B by going in a straight line, that’s functional. If you kinda meandered all over the place, you got a lot of, exactly. I mean, so what we’ve really done is basically used that to prevent primarily exact copying of code. But there are cases, including the Oracle versus Google case going on right now, right, in which we’ve broadened that universe pretty broadly to effectively get some control over functional stuff. I think the answer is there oughta be a way for an AI to
use the unprotectable parts of a copyrighted
work, and the problem is that everything involves making a copy of the whole thing to ingest it, and that’s how you get in trouble. But we oughta– – [Audience Member] Unless
it’s an ephemeral copy. – [Panelist] You could make, right, you could make millisecond-long copies, right, the courts have tested millisecond-long copies; at some point in the spectrum, it’s not deemed a copy even though logically– – [Mark] The law there is a bit of a mess. In the Second Circuit you seem to have at least a second; a copy that only lasts for a second is of transitory duration. Here in the Ninth Circuit, the courts very early on held that turning on your computer is an act
of copyright infringement. Because you’re loading
data into RAM every time you turn on a computer, and so you’re making a copy. I think that’s silly, but it’s out there. – [Lisa] New question in the back. – [Audience Member] Just kind of a wild card question: to the extent there is a copyright in the database, so apart from the trade secrets issue, (mumbles) such as a Google controls a large database and won’t license it. Could there be a scenario where a government could impose
eminent domain on that? For the public good? – [Mark] Yeah. The problem is if you
impose eminent domain, you gotta pay for it, right? – [Audience Member] Unless it’s Bayh-Dole. Unless the government is funded– – [Mark] No, on the patent side, on the patent side, right. But we don’t have a copyright equivalent of Bayh-Dole, that’s right. And so I will say, in the trade secret context, Jim mentioned earlier the sort of FDA and other things: when we decided food and insecticides and various other chemicals had to list all of the ingredients on labels, we took a bunch of people’s trade
secrets from them, right? And we had to pay for it. – [Panelist] It depends
on how much, right? In a lot of those things, the courts held that the requirement to list your ingredients did not reveal the whole
process and formula. And if you were to take the whole process and formula,
that would be a taking. But there is this line between regulation and taking. And just listing ingredients, if it doesn’t reveal how to make it, doesn’t pass the essence of the secret, then the government hasn’t taken it. – [Mark] So for food it hasn’t taken it. For insecticides, where we actually had to give the percentages, the court, Monsanto versus (mumbles), says
it is a taking, right, so. – [Panelist] Yeah, well
there they gave it up. What had been filed got released. So the property such as it
was, was gone through that. So yes, eminent domain would apply. Trade secrets are treated as property. So it could be done but there is a very important health and safety regulatory framework to all this, right? I mean because you can’t say, if you have hazardous chemicals,
right, the fire department will not have to pay you to make you tell them what you have. – [Audience Member] Right,
but I’m just thinking the use of a data set for machine learning purposes for a social benefit. – [Mark] I think there’s gonna be substantial pressure, whether it takes the form of eminent domain or not, I think there’s gonna be substantial pressure in a number of fronts to make sure that you know what, if this
is actually determining how safe your car is, we want not only Google, not only the company, to have access to that data. If it’s medical surgery or medical diagnostics, if it’s criminal sentencing. But probably that will sort of spread to a bunch of other things
like bank loans. Maybe not weather prediction, although I can see the argument for that, right? But I think a number of kind of databases that are integral to AI that has some public safety or health effect or some kind of fundamental
life activity effect, there’s gonna be strong pressure to get access to that information and that’s gonna push against the trade secrets. – [Audience Member]
Every time Tesla updates over the air, they’re begging for this kind of regulation, in my opinion, right? Because it’s the combination of: we just changed the software on everything on the road. You don’t really know what that was or the data behind it. I’m sorry, I know you’re a fan, but.
drivetrain software from the computer control software. But, that said– – [Audience Member] So
the autopilot’s not– – [Mark] No, you’re right. I think that’s right. There’s also the question of in what form is that information useful? So it’s not obvious that just that information is valuable to anyone who doesn’t already have a Tesla algorithm, and this goes back to some combination of the explainability and what do I get. Is the data actually helpful? All right, so we are out of time. Thank you, guys. I walked in here thinking I’m not sure we have two hours to fill, but you guys helped us fill it. So thank you. (audience clapping)
