Today's AI is trained on the work of artists and writers without attribution, its core values decided by a privileged few. What if the future of AI were more open and democratic? Researcher Percy Liang offers a vision of a transparent, participatory future for emerging technology, one that credits contributors and gives everyone a voice.
Link: https://www.ted.com/talks/percy_liang_a_new_way_to_build_ai_openly?subtitle=en
WEBVTT
00:00:03.917 --> 00:00:05.252
2004.
00:00:05.585 --> 00:00:07.671
I was a young master's student
00:00:07.712 --> 00:00:10.840
about to start my first
NLP research project,
00:00:10.882 --> 00:00:14.010
and my task was to train a language model.
00:00:14.678 --> 00:00:18.974
Now that language model was a little bit
smaller than the ones we have today.
00:00:19.015 --> 00:00:21.851
It was trained on millions
rather than trillions of words.
00:00:21.893 --> 00:00:25.021
I used a hidden Markov model
as opposed to a transformer,
00:00:25.063 --> 00:00:27.232
but that little language model I trained
00:00:27.232 --> 00:00:29.734
did something I thought was amazing.
00:00:30.235 --> 00:00:33.071
It took all this raw text
00:00:33.113 --> 00:00:36.992
and somehow it organized it into concepts.
00:00:37.450 --> 00:00:39.953
A concept for months,
00:00:39.953 --> 00:00:42.330
male first names,
00:00:42.372 --> 00:00:45.166
words related to the law,
00:00:45.208 --> 00:00:47.586
countries and continents and so on.
00:00:47.919 --> 00:00:51.214
But no one taught
these concepts to this model.
00:00:51.256 --> 00:00:56.011
It discovered them all by itself,
just by analyzing the raw text.
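NOTE
An aside for the curious: the structure described here falls out of fitting a hidden Markov model to raw token sequences and then reading off each hidden state's emission probabilities. The sketch below is a toy illustration, not the speaker's actual 2004 code; it assumes the third-party hmmlearn library, and the mini-corpus and state count are invented for the example.

import numpy as np
from hmmlearn.hmm import CategoricalHMM  # assumed dependency: hmmlearn >= 0.2.8

sentences = [
    "john went to france in january".split(),
    "paul studied law in spain in march".split(),
    "the court met in february".split(),
]
vocab = sorted({w for s in sentences for w in s})
idx = {w: i for i, w in enumerate(vocab)}
# hmmlearn expects one concatenated integer sequence plus per-sentence lengths.
X = np.concatenate([[idx[w] for w in s] for s in sentences]).reshape(-1, 1)
lengths = [len(s) for s in sentences]
model = CategoricalHMM(n_components=4, n_iter=100, random_state=0)
model.fit(X, lengths)
# Each hidden state's emission distribution acts like a soft "concept" cluster:
# print the words each state is most likely to emit.
for state in range(model.n_components):
    top = np.argsort(model.emissionprob_[state])[::-1][:3]
    print(f"state {state}: {[vocab[i] for i in top]}")

With a realistically sized corpus of millions of words and more hidden states, the induced clusters resemble the concepts described above: months, first names, legal vocabulary.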
00:00:56.886 --> 00:00:58.013
But how?
00:00:58.346 --> 00:01:01.224
I was intrigued,
I wanted to understand it,
00:01:01.224 --> 00:01:04.019
I wanted to see how far
we could go with this.
00:01:04.060 --> 00:01:05.979
So I became an AI researcher.
00:01:07.188 --> 00:01:08.857
In the last 19 years,
00:01:08.857 --> 00:01:12.819
we have come a long way
as a research community.
00:01:12.861 --> 00:01:16.364
Language models and, more generally,
foundation models have taken off
00:01:16.406 --> 00:01:17.907
and entered the mainstream.
00:01:18.617 --> 00:01:23.496
But it is important to realize
that all of these achievements
00:01:23.496 --> 00:01:25.832
are based on decades of research.
00:01:25.832 --> 00:01:27.667
Research on model architectures,
00:01:27.709 --> 00:01:32.005
research on optimization algorithms,
training objectives, data sets.
00:01:32.714 --> 00:01:34.049
For a while,
00:01:34.090 --> 00:01:37.052
we had an incredible free culture,
00:01:37.093 --> 00:01:39.304
a culture of open innovation,
00:01:39.304 --> 00:01:41.222
a culture where researchers published,
00:01:41.264 --> 00:01:43.224
researchers released data sets, code,
00:01:43.266 --> 00:01:45.685
so that others could go further.
00:01:45.685 --> 00:01:49.397
It was like a jazz ensemble where everyone
was riffing off of each other,
00:01:49.397 --> 00:01:52.317
developing the technology
that we have today.
00:01:53.652 --> 00:01:55.779
But then in 2020,
00:01:55.820 --> 00:01:57.238
things started changing.
00:01:58.031 --> 00:02:00.367
Innovation became less open.
00:02:00.408 --> 00:02:04.913
And then today, the most advanced
foundation models in the world
00:02:04.913 --> 00:02:06.498
are not released openly.
00:02:06.539 --> 00:02:10.335
They are instead guarded closely
behind black box APIs
00:02:10.335 --> 00:02:13.296
with little to no information
about how they're built.
00:02:14.214 --> 00:02:16.299
So it's like we have these castles
00:02:16.341 --> 00:02:18.718
which house the world's most advanced AIs
00:02:18.718 --> 00:02:21.137
and the secret recipes for creating them.
00:02:21.596 --> 00:02:24.724
Meanwhile, the open community
continues to innovate,
00:02:24.766 --> 00:02:29.145
but the resource and information
asymmetry is stark.
00:02:30.021 --> 00:02:35.193
This opacity and centralization
of power is concerning.
00:02:35.485 --> 00:02:37.529
Let me give you three reasons why.
00:02:37.529 --> 00:02:39.155
First, transparency.
00:02:39.906 --> 00:02:44.452
With closed foundation models,
we lose the ability to see,
00:02:44.452 --> 00:02:46.830
to evaluate, to audit these models
00:02:46.871 --> 00:02:49.708
which are going to impact
billions of people.
00:02:49.749 --> 00:02:54.003
Say we evaluate a model through an API
on medical question answering
00:02:54.045 --> 00:02:56.297
and it gets 95 percent accuracy.
00:02:56.715 --> 00:02:58.591
What does that 95 percent mean?
00:02:59.050 --> 00:03:01.052
The most basic tenet of machine learning
00:03:01.052 --> 00:03:04.097
is that the training data
and the test data
00:03:04.139 --> 00:03:07.559
have to be independent
for evaluation to be meaningful.
00:03:08.101 --> 00:03:10.437
So if we don't know
what's in the training data,
00:03:10.437 --> 00:03:13.273
then that 95 percent
number is meaningless.
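NOTE
To make the contamination point concrete: with access to the training corpus, anyone could run even a crude overlap check like the sketch below before trusting a benchmark number. This is a minimal illustration, not any lab's actual audit protocol; the file names and the 8-word window are invented for the example.

def ngrams(text, n=8):
    # Return the set of word n-grams occurring in a text.
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(train_text, test_item, n=8):
    # Flag a test item if any of its n-grams also appears in the training data.
    return bool(ngrams(test_item, n) & ngrams(train_text, n))

train_text = open("training_corpus.txt").read()        # hypothetical file
test_items = open("medical_qa_test.txt").readlines()   # hypothetical file
flagged = sum(is_contaminated(train_text, q) for q in test_items)
print(f"{flagged} of {len(test_items)} test items overlap the training data")

Without the training data, no such check is possible, which is why the 95 percent figure cannot be taken at face value.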
00:03:14.274 --> 00:03:17.485
And with all the enthusiasm
for deploying these models
00:03:17.527 --> 00:03:20.864
in the real world
without meaningful evaluation,
00:03:20.864 --> 00:03:22.615
we are flying blind.
00:03:23.700 --> 00:03:26.953
And transparency isn't just
about the training data or evaluation.
00:03:26.995 --> 00:03:29.664
It's also about environmental impact,
00:03:29.706 --> 00:03:32.792
labor practices, release processes,
00:03:32.792 --> 00:03:34.878
risk mitigation strategies.
00:03:35.253 --> 00:03:38.506
Without transparency,
we lose accountability.
00:03:39.048 --> 00:03:42.385
It's like not having nutrition labels
on the food you eat,
00:03:42.385 --> 00:03:46.181
or not having safety ratings
on the cars you drive.
00:03:46.222 --> 00:03:51.060
Fortunately, the food and auto industries
have matured over time,
00:03:51.102 --> 00:03:53.146
but AI still has a long way to go.
00:03:54.481 --> 00:03:56.107
Second, values.
00:03:56.983 --> 00:04:01.237
So model developers like to talk
about aligning foundation models
00:04:01.279 --> 00:04:04.199
to human values,
which sounds wonderful.
00:04:04.783 --> 00:04:07.577
But whose values
are we talking about here?
00:04:08.495 --> 00:04:11.247
If we were just building a model
to answer math questions,
00:04:11.247 --> 00:04:12.540
maybe we wouldn't care,
00:04:12.582 --> 00:04:15.251
because as long as the model
produces the right answer,
00:04:15.293 --> 00:04:18.546
we would be happy,
just as we're happy with calculators.
00:04:19.005 --> 00:04:21.216
But these models are not calculators.
00:04:21.216 --> 00:04:24.177
These models will attempt to answer
any question you throw at them.
00:04:24.219 --> 00:04:26.721
Who is the best basketball
player of all time?
00:04:27.180 --> 00:04:29.766
Should we build nuclear reactors?
00:04:30.099 --> 00:04:32.310
What do you think of affirmative action?
00:04:32.727 --> 00:04:37.065
These are highly subjective,
controversial, contested questions,
00:04:37.106 --> 00:04:42.862
and any decision on how to answer them
is necessarily value laden.
00:04:42.862 --> 00:04:46.115
And currently, these values
are unilaterally decided
00:04:46.115 --> 00:04:48.326
by the rulers of the castles.
00:04:48.952 --> 00:04:52.288
So can we imagine
a more democratic process
00:04:52.288 --> 00:04:57.126
for determining these values
based on input from everybody?
00:04:57.168 --> 00:05:03.091
So foundation models will be the primary
way that we interact with information.
00:05:03.424 --> 00:05:07.011
And so determining these values
and how we set them
00:05:07.053 --> 00:05:08.596
will have a sweeping impact
00:05:08.638 --> 00:05:12.183
on how we see the world and how we think.
00:05:13.643 --> 00:05:15.395
Third, attribution.
00:05:16.020 --> 00:05:18.773
So why are these foundation
models so powerful?
00:05:19.107 --> 00:05:22.861
It's because they're trained
on massive amounts of data.
00:05:23.361 --> 00:05:27.448
See, what machine-learning
researchers call data
00:05:27.448 --> 00:05:30.326
is what artists call art
00:05:30.368 --> 00:05:33.204
or writers call books
00:05:33.246 --> 00:05:35.206
or programmers call software.
00:05:35.790 --> 00:05:40.169
The data here is a result of human labor,
00:05:40.211 --> 00:05:42.463
and currently this data is being scraped,
00:05:42.463 --> 00:05:44.924
often without attribution or consent.
00:05:45.216 --> 00:05:48.261
So understandably, some people are upset,
00:05:48.261 --> 00:05:50.597
filing lawsuits, going on strike.
00:05:50.597 --> 00:05:55.810
But this is just an indication
that the incentive system is broken.
00:05:56.227 --> 00:05:59.606
And in order to fix it,
we need to center the creators.
00:05:59.606 --> 00:06:01.983
We need to figure out
how to compensate them
00:06:01.983 --> 00:06:04.485
for the value of the content
they produced,
00:06:04.527 --> 00:06:07.947
and how to incentivize them
to continue innovating.
00:06:08.823 --> 00:06:11.868
Figuring this out
will be critical to sustaining
00:06:11.910 --> 00:06:14.162
the long term development of AI.
00:06:15.622 --> 00:06:16.831
So here we are.
00:06:17.373 --> 00:06:21.461
We don't have transparency
about how the models are being built.
00:06:22.003 --> 00:06:25.965
We have to live with the fixed values
set by the rulers of the castles,
00:06:26.007 --> 00:06:28.384
and we have no means of crediting
00:06:28.426 --> 00:06:31.471
the creators who make
foundation models possible.
00:06:33.139 --> 00:06:35.183
So how can we change the status quo?
00:06:35.767 --> 00:06:37.268
With these castles,
00:06:37.268 --> 00:06:39.646
the situation might seem pretty bleak.
00:06:40.355 --> 00:06:42.523
But let me try to give you some hope.
00:06:43.232 --> 00:06:44.734
In 2001,
00:06:44.776 --> 00:06:47.779
Encyclopedia Britannica was a castle.
00:06:48.112 --> 00:06:51.449
Wikipedia was an open experiment.
00:06:51.449 --> 00:06:55.119
It was a website
that anyone could edit,
00:06:55.161 --> 00:06:59.666
and all the resulting knowledge
would be made freely available
00:06:59.707 --> 00:07:01.751
to everyone on the planet.
00:07:02.085 --> 00:07:03.795
It was a radical idea.
00:07:04.170 --> 00:07:06.422
In fact, it was a ridiculous idea.
00:07:07.131 --> 00:07:10.468
But against all odds, Wikipedia prevailed.
00:07:11.928 --> 00:07:15.098
In the '90s, Microsoft
Windows was a castle.
00:07:15.848 --> 00:07:17.642
Linux was an open experiment.
00:07:17.684 --> 00:07:20.728
Anyone could read its source code,
anyone could contribute.
00:07:20.728 --> 00:07:23.064
And over the last two decades,
00:07:23.064 --> 00:07:25.441
Linux went from being a hobbyist toy
00:07:25.483 --> 00:07:30.321
to the dominant operating system
on mobile and in the data center.
00:07:31.155 --> 00:07:35.493
So let us not underestimate
the power of open source
00:07:35.535 --> 00:07:37.036
and peer production.
00:07:37.620 --> 00:07:42.041
These examples show us a different way
that the world could work.
00:07:42.041 --> 00:07:44.877
A world in which everyone can participate
00:07:44.919 --> 00:07:47.296
and development is transparent.
00:07:48.297 --> 00:07:50.466
So how can we do the same for AI?
00:07:51.592 --> 00:07:53.803
Let me end with a picture.
00:07:54.637 --> 00:07:57.765
The world is filled
with incredible people:
00:07:57.807 --> 00:08:00.893
artists, musicians, writers, scientists.
00:08:00.893 --> 00:08:06.065
Each person has unique skills,
knowledge and values.
00:08:06.065 --> 00:08:11.029
Collectively, this defines
the culture of our civilization.
00:08:11.029 --> 00:08:13.531
And the purpose of AI, as I see it,
00:08:13.531 --> 00:08:16.701
should be to organize
and augment this culture.
00:08:16.743 --> 00:08:20.913
So we need to enable people to create,
to invent, to discover.
00:08:20.955 --> 00:08:23.624
And we want everyone to have a voice.
00:08:23.958 --> 00:08:28.004
The research community has focused
so much on the technical progress
00:08:28.046 --> 00:08:30.214
that is necessary to build these models,
00:08:30.214 --> 00:08:33.426
because for so long,
that was the bottleneck.
00:08:33.801 --> 00:08:37.263
But now we need to consider
the social context
00:08:37.305 --> 00:08:38.890
in which these models are built.
00:08:39.265 --> 00:08:40.683
Instead of castles,
00:08:40.725 --> 00:08:46.814
let us imagine a more transparent
and participatory process for building AI.
00:08:47.356 --> 00:08:49.984
I feel the same excitement
about this vision
00:08:50.026 --> 00:08:52.987
as I did 19 years ago
as that master's student,
00:08:52.987 --> 00:08:55.782
embarking on his first
NLP research project.
00:08:56.783 --> 00:08:59.577
But realizing this vision will be hard.
00:08:59.911 --> 00:09:01.954
It will require innovation.
00:09:01.996 --> 00:09:08.169
It will require participation
of researchers, companies, policymakers,
00:09:08.169 --> 00:09:10.046
and all of you
00:09:10.046 --> 00:09:13.424
to not accept the status quo as inevitable
00:09:13.424 --> 00:09:18.721
and demand a more participatory
and transparent future for AI.
00:09:19.055 --> 00:09:20.223
Thank you.
00:09:20.264 --> 00:09:23.226
(Applause)