<h1 id="systemic-factors">4.6 Systemic Factors</h1>
<p>As discussed above, if we want to improve the safety of a complex,
sociotechnical system, it might be most effective to address the blunt
end, or the broad systemic factors that can diffusely influence
operations. Some of the most important systemic factors include
regulations, social pressure, technosolutionism, competitive pressure,
safety costs, and safety culture. We will now discuss each of these in
more detail.</p>
<p><strong>Safety regulations can be imposed by government or internal
policies.</strong> Safety regulations can require an organization to
adhere to various safety standards, such as conducting regular staff
training and equipment maintenance. These stipulations can be defined
and enforced by a government or by an organization’s own internal
policies. The more stringent and effectively targeted these requirements
are, the safer a system is likely to be.</p>
<p><strong>Social pressure can encourage organizations to improve
safety.</strong> Public attitudes towards a particular technology can
affect an organization’s attitude to safety. Significant social pressure
about risks can mean that organizations are subject to more scrutiny,
while little public awareness can allow organizations to take a more
relaxed attitude toward safety.</p>
<p><strong>Technosolutionism should be discouraged.</strong> Attempting
to fix problems simply by introducing a piece of technology is called
technosolutionism. It does not always work, especially in complex and
sociotechnical systems. Although technology can certainly be helpful in
solving problems, relying on it can lead organizations to neglect the
broader system. They should consider how the proposed technological
solution will actually function in the context of the whole system, and
how it might affect the behavior of other components and human
operators.<p>
Multiple geoengineering technologies have been proposed as solutions to
climate change, such as spraying particles high in the atmosphere to
reflect sunlight. However, there are concerns that attempting this could
have unexpected side effects. Even if spraying particles in the
atmosphere did reverse global heating, it might also interfere with
other components of the atmosphere in ways that we fail to predict,
potentially with harmful consequences for life. Instead, we could focus
on non-technical interventions like preserving forested areas that are
more robustly likely to work without significant unforeseen negative
side-effects.</p>
<p><strong>Competitive pressures can lead to compromise on
safety.</strong> If multiple organizations or countries are pursuing the
same goal, they will be incentivized to get an edge over one another.
They might try to do this by reaching the goal more quickly or by making
the end product more valuable to customers in terms of the
functionality it offers. These competitive pressures can compel
employees and decision-makers to cut corners and pay less attention to
safety.<p>
On a larger scale, competitive pressures might put organizations or
countries in an arms race, wherein safety standards slip because of the
urgency of the situation. This will be especially true if one of the
organizations or countries has lower safety standards and consequently
moves more quickly; others might feel the need to lower their standards as
well, in order to keep up. The risks this process presents are
encapsulated by Publilius Syrus’s aphorism: “Nothing can be done at once
hastily and prudently.” We consider this further in the Collective Action Problems chapter.</p>
<p><strong>Various safety costs can discourage the adoption of safety
measures.</strong> There are often multiple costs of increasing safety,
not only financial costs but also slowdowns and reduced product
performance. Adopting safety measures might therefore decrease
productivity and slow progress toward a goal, reducing profits. The
higher the costs of safety measures, the more reluctant an organization
might be to adopt them.<p>
Developers of AI systems may want to put more effort into transparency
and interpretability. However, investigating these areas is costly: at
the very least, there will be personnel and compute costs that could
otherwise have been used to directly create more capable systems.
Additionally, it may delay the completion of the finished product. Making
a system more interpretable might also carry costs in terms of product
performance. Creating more transparent models might require AIs
to select only those actions which are clearly explainable. In general,
safety features can reduce model capabilities, which organizations might
prefer to avoid.</p>
<p><strong>The general safety culture of an organization is an important
systemic factor.</strong> A final systemic factor that will broadly
influence a system’s safety can simply be referred to as its “safety
culture.” This captures the general attitude that the people in an
organization have toward safety—how seriously they take it, and how that
translates into their actions. We will discuss some specific features of
a diligent safety culture in the next section.</p>
<p><strong>Summary.</strong> We have seen that component failure
accident models have some significant limitations, since they do not
usually capture diffuse sources of risk that can shape a system’s
dynamics and indirectly affect the likelihood of accidents. These
include important systemic factors such as competitive pressures, safety
costs, and safety culture. We will now turn to systemic accident models
that acknowledge these ideas and attempt to account for them in risk
analysis.</p>
<h2 id="systemic-accident-models">4.6.1 Systemic Accident Models</h2>
<p>We have explored how component failure accident models are
insufficient for properly understanding accidents in complex systems.
When it comes to AIs, we must understand what sort of system we are
dealing with. Comparing AI safety to ensuring the safety of specific
systems like rockets, power plants, or computer programs can be
misleading. The reality of today’s world is that many hazardous
technologies are operated by a variety of human organizations: together,
these form complex sociotechnical systems that we need to make safer.
While there may be some similarities between different hazardous
technologies, there are also significant differences in their properties,
which mean that safety strategies cannot simply be lifted from one system
and mapped directly onto another. We should not anchor to individual
safety approaches used in rockets or power plants.<p>
Instead, it is more beneficial to approach AI safety from a broader
perspective of making complex, sociotechnical systems safer. To this
end, we can draw on the theory of sociotechnical systems, which offers
“a method of viewing organizations which emphasizes the interrelatedness
of the functioning of the social and technological subsystems of the
organization and the relation of the organization as a whole to the
environment in which it operates.”<p>
We can also use the complex systems literature more generally, which is
largely about the shared structure of many different complex systems.
Accidents in complex systems can often be better understood by looking
at the system as a whole, rather than focusing solely on individual
components. Therefore, we will now consider systemic accident models,
which aim to provide insights into why accidents occur in systems by
analyzing the overall structure and interactions within the system,
including human factors that are not usually captured well by component
failure models.</p>
<p><strong>Normal Accident Theory (NAT).</strong> Normal Accident Theory
(NAT) is one approach to understanding accidents in complex systems. It
suggests that accidents are inevitable in systems that exhibit the
following two properties:</p>
<ol>
<li><p>Complexity: a large number of interactions between components in
the system such as feedback loops, discussed in the Complex Systems
chapter. Complexity can make it infeasible to thoroughly understand a
system or exhaustively predict all its potential failure modes.</p></li>
<li><p>Tight coupling: one component in a system can rapidly affect
others so that one relatively small event can rapidly escalate to become
a larger accident.</p></li>
</ol>
<p>NAT concludes that, if a system is both highly complex and tightly
coupled, then accidents are inevitable—or “normal”—regardless of how
well the system is managed <span class="citation"
data-cites="perrow1999normal">[1]</span>.</p>
<p><strong>NAT focuses on systemic factors.</strong> According to NAT,
accidents are not caused by a single component failure or human error,
but rather by the interactions and interdependencies between multiple
components and subsystems. NAT argues that accidents are a normal part
of complex systems and cannot be completely eliminated. Instead, the
focus should be on managing and mitigating the risks associated with
these systems to minimize the severity and frequency of accidents. NAT
emphasizes the importance of systemic factors, such as system design,
human factors such as organizational culture, and operational
procedures, in influencing accident outcomes. By understanding and
addressing these systemic factors, it is possible to improve the safety
and resilience of complex systems.</p>
<p><strong>Some safety features create additional complexity.</strong>
Although we can try to incorporate safety features, NAT argues that many
attempts to prevent accidents in these kinds of systems can sometimes be
counterproductive, as they might just add another layer of complexity.
As we explore in the Complex Systems chapter, systems often respond to
interventions in unexpected ways. Interventions can cause negative side
effects or even inadvertently exacerbate the problems they were
introduced to solve.<p>
Redundancy, which was listed earlier as a safe design principle, is
supposed to increase safety by providing a backup for critical
components, in case one of them fails. However, redundancy also
increases complexity, which increases the risks of unforeseen and
unintended interactions that can make it impossible for operators to
predict all potential issues <span class="citation"
data-cites="Leveson2009MovingBN">[2]</span>. Having redundant components
can also cause confusion; for example, people might receive
contradictory instructions from multiple redundant monitoring systems
and not know which one to believe.</p>
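<p>As a rough illustration of this trade-off (the probabilities below are
invented for the example and assume the channels behave independently),
suppose each redundant monitoring channel misses a real hazard with some
probability, but each added channel also brings a small chance of emitting a
spurious or contradictory signal that operators must resolve. Redundancy
drives the missed-detection probability down quickly, while the chance of at
least one confusing signal grows with every channel added.</p>
<pre><code class="language-python">
def redundancy_tradeoff(miss_prob=0.05, confusion_prob=0.02, max_channels=6):
    """Illustrative only: the probability that every redundant monitor misses
    a hazard versus the probability that at least one monitor produces a
    confusing (spurious or contradictory) signal, assuming independence."""
    for k in range(1, max_channels + 1):
        p_all_miss = miss_prob ** k                    # benefit of redundancy
        p_confusion = 1 - (1 - confusion_prob) ** k    # cost of added complexity
        print(f"{k} channels: P(all miss) = {p_all_miss:.2e}, "
              f"P(some confusing signal) = {p_confusion:.3f}")

redundancy_tradeoff()
</code></pre>
<p>The particular figures do not matter; what matters is that the benefit and
the cost both scale with the number of redundant components, and the
confusion cost is easy to overlook when redundancy is added piecemeal.</p>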
<p><strong>Reducing complexity can be a safety feature.</strong> We may
not be able to completely avoid complexity and tight coupling in all
systems, but there are many cases where we can reduce one or both of
them and thus meaningfully reduce risk. One example of this is reducing
the potential for human error by making systems more intuitive, such as
by using color coding and male/female adapters in electrical
applications to reduce the incidence of wiring errors. Such initiatives
do not eliminate risks, and accidents are still normal in these systems,
but they can help reduce the frequency of errors.</p>
<p><strong>The performance of some organizations suggests serious
accidents might be avoidable.</strong> The main assertion of NAT is that
accidents are inevitable in complex, tightly coupled systems. In
response to this conclusion, which might be perceived as pessimistic,
other academics developed a more optimistic theory that points to “high
reliability organizations” (HROs) that consistently operate hazardous
technologies with low accident rates. These precedents include air
traffic control, aircraft carriers, and nuclear power plants.<p>
HRO theory emphasizes the importance of human factors, arguing that it
must be possible to manage even complex, tightly coupled systems in a
way that reliably avoids accidents. It identifies five key features of
HROs’ management culture that can significantly lower the risk of
accidents <span class="citation"
data-cites="Dietterich2017Steps">[3]</span>. We will now discuss these
five features and how AIs might help improve them.</p>
<ol>
<li><p><strong>Preoccupation with failure means reporting and studying
mistakes and near misses.</strong> HROs encourage the reporting of all
anomalies, known failures, and near misses. They study these events
carefully and learn from them. HROs also keep in mind potential failure
modes that have not occurred yet and which have not been predicted. The
possibility of unanticipated failure modes constitutes a risk of black
swan events, which will be discussed in detail later in this chapter.
HROs are therefore vigilant about looking out for emerging hazards. AI
systems tend to be good at detecting anomalies, but not near misses; a
simple anomaly-flagging sketch appears after this list.</p></li>
<li><p><strong>Reluctance to simplify interpretations means looking at
the bigger picture.</strong> HROs understand that reducing accidents to
chains of events often oversimplifies the situation, and is not
necessarily helpful for learning from mistakes and improving safety.
They develop a wide range of expertise so that they can come up with
multiple different interpretations of any incident. This can help with
understanding the broader context surrounding an event, and systemic
factors that might have been at play. HROs also implement many checks
and balances, invest in hiring staff with diverse perspectives, and
regularly retrain everyone. AIs could be used to generate explanations
for hazardous events or conduct adversarial reviews of explanations of
system failures.</p></li>
<li><p><strong>Sensitivity to operations means maintaining awareness of
how a system is operating.</strong> HROs invest in the close monitoring
of systems to maintain a continual, comprehensive understanding of how
they are behaving, whether through excellent monitoring tools or hiring
operators with deep situational awareness. This helps ensure that
operations are going as planned and that anything unexpected is noticed
early, so corrective action can be taken before the situation escalates.
AI systems that dynamically aggregate
information in real-time can help improve situational
awareness.</p></li>
<li><p><strong>Commitment to resilience means actively preparing to
tackle unexpected problems.</strong> HROs train their teams in
adaptability and improvising solutions when confronted with novel
circumstances. By practicing dealing with issues they have not seen
before, employees develop problem-solving skills that will help them
cope if anything new and unexpected arises in reality. AIs have the
potential to enhance teams’ on-the-spot problem-solving, such as by
creating surprising situations for testing organizational
efficiency.</p></li>
<li><p><strong>Under-specification of structures means information can
flow rapidly in a system.</strong> Instead of having rigid chains of
communication that employees must follow, HROs have communication
throughout the whole system. All employees are allowed to raise an
alarm, regardless of their level of seniority. This increases the
likelihood that problems will be flagged early, and also allows
information to travel rapidly throughout the organization. This
under-specification of structures is also sometimes referred to as
“deference to expertise,” because it means that all employees are
empowered to make decisions relating to their expertise, regardless of
their place in the hierarchy.</p></li>
</ol>
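<p>As a minimal sketch of the anomaly flagging mentioned under the first
feature (the sensor history, readings, and threshold below are hypothetical),
an organization could automatically flag any operating record that deviates
strongly from historical behavior and review it alongside reported near
misses.</p>
<pre><code class="language-python">
from statistics import mean, stdev

def flag_anomalies(history, new_readings, z_threshold=3.0):
    """Flag readings that deviate strongly from historical behavior so they
    can be logged and reviewed, much as near misses are. Purely illustrative;
    real monitoring pipelines are far richer than a single z-score check."""
    mu, sigma = mean(history), stdev(history)
    flagged = []
    for index, value in enumerate(new_readings):
        z = abs(value - mu) / sigma if sigma > 0 else 0.0
        if z > z_threshold:
            flagged.append({"index": index, "value": value, "z_score": round(z, 2)})
    return flagged

# Hypothetical sensor history and a new batch containing one outlier.
history = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]
print(flag_anomalies(history, [10.0, 10.4, 14.9, 9.8]))
</code></pre>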
<p>High-reliability organizations (HROs) provide valuable insights into
the development and application of AI technology. By emulating the
characteristics of HROs, we can create combined human-machine systems
that prioritize safety and mitigate risks. These sociotechnical systems
should continuously monitor their own behavior and the environment for
anomalies and unanticipated side effects. These systems should also
support combined human-machine situational awareness and improvisational
planning, allowing for real-time adaptation and flexibility. Lastly, AIs
should have models of their own expertise and the expertise of human
operators to ensure effective problem routing. By adhering to these
principles, we can develop AI systems that function like HROs, ensuring
high reliability and minimizing the potential risks associated with
their deployment and use.<p>
<strong>Doubts have been raised about how widely HRO theory can be
applied.</strong> Although the practices listed above can improve
safety, a main criticism of HRO theory is that these practices cannot be applied to
all systems and technologies <span class="citation"
data-cites="Leveson2009MovingBN">[2]</span>. This is because the theory
was developed from a relatively small group of example systems, and
certain features of them cannot be replicated in all systems.</p>
<p><strong>It is difficult to understand systems sufficiently
well.</strong> First, in the examples of HROs identified (such as air traffic control
or nuclear power plants), operators usually have near-complete knowledge
of the technical processes involved. These organizations’ processes have
also remained largely unchanged for decades, allowing for lessons to be
learned from errors and for safety systems to become more refined.
However, according to NAT, the main reason that complexity contributes
to accidents is that it <em>precludes</em> a thorough understanding of
all processes, and anticipation of all potential failure modes. HROs
with near-complete knowledge of technical processes might be considered
rare cases. These conditions cannot be replicated in all systems,
especially not in those operating new technologies.</p>
<p><strong>HROs prioritize safety, but other organizations might
not.</strong> The second reason why HRO theory might not be broadly
applicable is that its suggestions generally focus on prioritizing
safety as a goal. This might make sense for several of the example HROs,
where safety is an intrinsic part of the mission. Airlines, for
instance, would not be viable businesses if they did not have a strong
track record of transporting passengers safely. However, this is less
feasible in organizations where safety is not so pertinent to the
mission. In many other profit-maximizing organizations, safety can
conflict with the main mission, as safety measures may be costly and
reduce productivity.</p>
<p><strong>Not all HROs are tightly coupled.</strong> Another criticism
of HRO theory is that several of the example systems might actually be
considered loosely coupled. For instance, in air traffic control, extra
time is scheduled in between landings on the same runway, to allow for
delays, errors, and corrections. However, NAT claims that tight coupling
is the second system feature that makes accidents inevitable. Loosely
coupled systems may therefore not be good counterexamples to NAT’s claim.</p>
<p><strong>Deference to expertise might not always be
realistic.</strong> A final reservation about HRO theory is that the
fifth characteristic (deference to expertise) assumes that employees
will have the necessary knowledge to make the best decisions at the
local level. However, information on the system as a whole is sometimes
required in order to make the best decisions, as actions in one
subsystem may have knock-on effects for other subsystems. Employees
might not always have enough information about the rest of the system to
be able to take this big-picture view.</p>
<p><strong>The debate over whether accidents are inevitable or avoidable
remains unsettled.</strong> A particular sticking point is that, despite
having low accident rates, some of the HRO examples have experienced
multiple near misses. This could be interpreted as evidence that NAT is
correct. We could view it as a matter of luck that these near misses did
not become anything larger. This would indicate that organizations
presented as HROs are in fact vulnerable to accidents. On the other
hand, near misses could instead be interpreted as supporting HRO theory;
the fact that they did not turn into anything larger could be considered
evidence that HROs have the appropriate measures in place to prevent
accidents. It is not clear which of these interpretations is correct
<span class="citation"
data-cites="Leveson2009MovingBN">[2]</span>.<p>
Nevertheless, both NAT and HRO theory have contributed important
concepts to safety engineering. NAT has identified complexity and tight
coupling as key risk factors, while HRO theory has developed important
principles for a good organizational safety culture. Both schools of
thought acknowledge that complex systems must be treated differently
from simpler systems, requiring consideration of all the components,
their interactions, and human factors. We will now explore some
alternative approaches that view system safety more holistically, rather
than considering it a product of reliable components, interactions, and
operators.</p>
<p><strong>System control with safety boundaries.</strong> Rasmussen’s
Risk Management Framework (RMF) is an accident model that recognizes
that accidents are usually the culmination of many different factors,
rather than a single root cause <span class="citation"
data-cites="rasmussen1997">[4]</span>. This model frames risk management
as a question of control, emphasizing the need for clearly defined
safety boundaries that a system’s operations must stay within.</p>
<p><strong>Levels of organization and AcciMap.</strong> The RMF
considers six hierarchical levels of organization within a system, each
of which can affect its safety: government, regulators, the company,
management, frontline workers, and the work itself. By drawing out an
“AcciMap” with this hierarchy, we can identify actors at different
levels who share responsibility for safety, as well as conditions that
may influence the risk of an accident. This analysis makes it explicit
that accidents cannot be solely explained by an action at the sharp
end.</p>
<figure id="fig:Rasmussen">
<img src="https://raw.githubusercontent.com/WilliamHodgkins/AISES/main/images/Rasmussen.png" class="tb-img-full" style="width: 70%"/>
<p class="tb-caption">Figure 4.9: Rasmussen’s risk management framework lays out six levels of organization and their
interactions, aiming to mark consistent safety boundaries by identifying hazards and those responsible
for them.</p>
<!--<figcaption>Rasmussen’s Risk Management Framework</figcaption>-->
</figure>
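<p>A simple way to sketch an AcciMap in code (the six levels come from the
framework itself; every entry below is a hypothetical example) is to record,
for each level, the actors and conditions that contributed to an incident.
Even this bare-bones structure makes explicit that responsibility is shared
across levels rather than resting on the sharp end alone.</p>
<pre><code class="language-python">
# Hypothetical AcciMap for an illustrative incident; the six levels follow
# Rasmussen's hierarchy, while the entries are invented examples.
accimap = {
    "government":        ["no reporting requirements for near misses"],
    "regulators":        ["audit deferred due to limited inspection staff"],
    "company":           ["quarterly targets rewarded shipping over review"],
    "management":        ["approved skipping a pre-release safety check"],
    "frontline workers": ["operator overrode an ambiguous warning"],
    "work":              ["system deployed in an untested configuration"],
}

for level, factors in accimap.items():
    for factor in factors:
        print(f"{level}: {factor}")
</code></pre>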
<p><strong>Systems can gradually migrate into unsafe states.</strong>
The RMF also asserts that behaviors and conditions can gradually
“migrate” over time, due to environmental pressures. If this migration
leads to unsafe systemic conditions, this creates the potential for an
event at the sharp end to trigger an accident. This is why it is
essential to continually enforce safety boundaries and avoid the system
migrating into unsafe states.</p>
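<p>The toy model below illustrates this idea of migration (the parameters are
invented and are not part of Rasmussen’s framework): environmental pressure
pushes the operating point toward a safety boundary, while periodic
enforcement pushes it back. When enforcement is weak relative to the
pressure, operations eventually cross the boundary, after which a small event
at the sharp end is enough to trigger an accident.</p>
<pre><code class="language-python">
import random

def simulate_migration(steps=300, pressure=0.8, enforcement_interval=20,
                       enforcement_strength=10.0, boundary=100.0, seed=0):
    """Toy model of Rasmussen-style migration: pressure steadily pushes the
    operating point toward the safety boundary, and periodic enforcement
    pushes it back. Returns the first step at which the boundary is crossed,
    or None if operations stayed within it."""
    random.seed(seed)
    position = 0.0  # distance migrated toward the boundary
    for step in range(1, steps + 1):
        position += random.uniform(0, 2 * pressure)   # drift under pressure
        if step % enforcement_interval == 0:
            position = max(0.0, position - enforcement_strength)
        if position >= boundary:
            return step
    return None

for label, strength in [("weak enforcement", 5.0), ("strong enforcement", 20.0)]:
    crossed = simulate_migration(enforcement_strength=strength)
    print(f"{label}: boundary crossed at step {crossed}" if crossed
          else f"{label}: operations stayed within the boundary")
</code></pre>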
<p><strong>STAMP is based on insights from the study of complex
systems.</strong> According to the systems-theory paradigm, safety is an
emergent property that is unlikely to be sufficiently understood just by
looking at individual components in isolation. This is the view taken by
System-Theoretic Accident Model and Processes (STAMP). STAMP identifies
multiple levels of organization within a system, where each level is of
higher complexity than the one below. Each level has novel emergent
properties that cannot be practically understood through a reductive
analysis of the level below. STAMP also recognizes that a system can be
highly reliable but still be unsafe, and therefore puts the emphasis on
safety rather than just on the reliability of components.</p>
<p><strong>STAMP frames safety as a question of top-down
control.</strong> STAMP proposes that safety can be enforced by each
level effectively placing safety constraints on the one below to keep
operations from migrating into unsafe states <span class="citation"
data-cites="Leveson2009MovingBN">[2]</span>. Performing STAMP-based risk
analysis and management involves creating models of four aspects of a
system: the organizational safety structure, the dynamics that can cause
this structure to deteriorate, the models of system processes that
operators must have, and the surrounding context. We will now discuss
each of these in more detail.</p>
<p><strong>The organizational safety structure.</strong> The first
aspect is the set of safety constraints: the unsafe conditions that must
be avoided. This model specifies which components and operators are in
place to keep each of those unsafe conditions from arising. This can help to
prevent accidents from component failures, design errors, and
interactions between components that could produce unsafe states.</p>
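<p>A minimal sketch of what such a safety structure could look like in code
(the hazards, constraints, and controller names below are hypothetical
examples, not prescribed by STAMP): each unsafe condition is mapped to the
constraint that rules it out and to the controller responsible for enforcing
it, which makes it easy to spot constraints that nobody is enforcing.</p>
<pre><code class="language-python">
from dataclasses import dataclass
from typing import Optional

@dataclass
class SafetyConstraint:
    """One entry in a hypothetical STAMP-style safety structure: an unsafe
    condition to avoid, the constraint that rules it out, and the controller
    responsible for enforcing that constraint."""
    unsafe_condition: str
    constraint: str
    enforced_by: Optional[str]  # None marks a gap in the control structure

safety_structure = [
    SafetyConstraint(
        unsafe_condition="model deployed without a completed evaluation",
        constraint="every release passes a documented evaluation suite",
        enforced_by="release review board",
    ),
    SafetyConstraint(
        unsafe_condition="monitoring alerts ignored under deadline pressure",
        constraint="alerts must be acknowledged before deployment proceeds",
        enforced_by=None,  # no controller assigned: a hole in the structure
    ),
]

for entry in safety_structure:
    if entry.enforced_by is None:
        print("Unenforced constraint:", entry.constraint)
</code></pre>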
<p><strong>Dynamic deterioration of the safety structure.</strong> The
second aspect is about how the safety structure can deteriorate over
time, leading to safety constraints being enforced less stringently.
Systems can “migrate” toward failure when many small events escalate
into a larger accident. Since complex and sociotechnical systems involve
large numbers of interactions, we cannot methodically compute the
effects of every event within the system and exhaustively identify all
the pathways to an accident. We cannot always reduce an accident to a
neat chain of events or find a root cause: such instincts are often
based on the desire to have a feeling of control by assigning blame.
Instead, it might make sense to describe a system as migrating toward
failure, due to the accumulation of many seemingly insignificant
events.<p>
This might include natural processes, such as wear and tear of
equipment. It can also include systemic factors, such as competitive
pressures, that might compel employees to omit safety checks. If being
less safety-conscious does not quickly lead to an accident, developers
might start to think that safety-consciousness is unnecessary. Having a
model of these processes can increase awareness and vigilance around
what needs to be done to maintain an effective safety structure.</p>
<p><strong>Knowledge and communication about process models.</strong>
The third aspect is the knowledge that operators must have about how the
system functions in order to make safe decisions. Operators may be
humans or automated systems that have to monitor feedback from the
system and respond to keep it on track.<p>
The process model that these operators should have includes the
assumptions about operating conditions that were made during the design
stage, so that they are aware of conditions in which the system might not
function properly, such as temperatures outside the normal operating range.
It might also include information about how the specific subsystem that
the operator is concerned with interacts with other parts of the system.
The communication required for operators to maintain an accurate process
model over time should also be specified. This can help to avoid
accidents resulting from operators or software making decisions based on
inaccurate beliefs about how the system is functioning.</p>
<p><strong>The cultural and political context of the decision-making
processes.</strong> The fourth aspect is the systemic factors that could
influence the safety structure. It might include information about who
the stakeholders are and what their primary concerns are. For example,
governments may impose stringent regulations, or they may put pressure
on an organization to reach its goals quickly, depending on what is most
important to them at the time. Similarly, social pressures and public
attitudes may push organizations to improve safety or to achieve their
goals more quickly.<p>
Table "Assumptions” summarizes how
the STAMP perspective contrasts with those of traditional component
failure models.<p>
<strong>STAMP-based analysis techniques include System-Theoretic Process
Analysis (STPA).</strong> On a practical level, there are methods of
analyzing systems that take the holistic approach outlined by STAMP.
These include System-Theoretic Process Analysis (STPA), which can be
used at the design stage, including steps such as identifying hazards
and constructing a control structure to mitigate their effects and
improve system safety.</p>
<p><strong>Decrementalism is the deterioration of system processes
through a series of small changes.</strong> A third accident model based
on systems theory is Dekker’s Drift into Failure (DIF) model <span
class="citation" data-cites="dekker2011Drift">[5]</span>. DIF focuses on
the migration toward failure that the RMF and STAMP also acknowledge,
describing how this can lead to a “drift into failure.” Since an
individual decision to change processes may be relatively minor, it can
seem that it will not make any difference to a system’s operations or
safety. For this reason, systems are often subject to decrementalism: a
gradual erosion of the safety of a system’s operations through a series
of small decisions, each made one at a time.</p>
<p><strong>Many relatively minor decisions can combine to lead to a
major difference in risk.</strong> Within complex systems, it is
difficult to know all the potential consequences of a change in the
system, or how it might interact with other changes. Many alterations to
processes within a system, each of which might not make a difference by
itself, can interact in complex and unforeseen ways to result in a much
higher state of risk. This is often only realized when an accident
happens, at which point it is too late.</p>
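<p>A back-of-the-envelope illustration with invented numbers: if each
“minor” process change raises the chance of an eventual accident by only
five percent relative to the previous state, a few dozen such changes
together multiply the risk roughly tenfold.</p>
<pre><code class="language-python">
def drift_into_failure(baseline_accident_prob=0.001,
                       relative_increase_per_change=0.05,
                       n_changes=50):
    """Illustrative compounding: each small change multiplies the accident
    probability by (1 + relative_increase_per_change); no single change
    looks significant, but the cumulative effect is."""
    p = baseline_accident_prob
    for i in range(1, n_changes + 1):
        p *= 1 + relative_increase_per_change
        if i % 10 == 0:
            print(f"after {i} small changes: accident probability ~ {p:.4f}")

drift_into_failure()
</code></pre>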
<p><strong>Summary.</strong> Normal accident theory argues that
accidents are inevitable in systems with a high degree of complexity and
tight coupling, no matter how well they are organized. On the other
hand, it has been argued that HROs with consistently low accident rates
demonstrate that it is possible to avoid accidents. HRO theory
identifies five key characteristics that contribute to a good safety
culture and reduce the likelihood of accidents. However, it might not be
feasible to replicate these across all organizations.<p>
Systemic models like Rasmussen’s RMF, STAMP, and Dekker’s DIF model are
grounded in an understanding of complex systems, viewing safety as an
emergent property. The RMF and STAMP both view safety as an issue of
control and enforcing safety constraints on operations. They both
identify a hierarchy of levels of organization within a system, showing
how accidents are caused by multiple factors, rather than just by one
event at the sharp end. DIF describes how systems are often subject to
decrementalism, whereby the safety of processes is gradually degraded
through a series of changes, each of which seems minor on its own.<p>
In general, component failure models focus on identifying specific
components or factors that can go wrong in a system and finding ways to
improve those components. These models are effective at pinpointing
direct causes of failure and proposing targeted interventions. However,
they have a limitation in that they tend to overlook other risk sources
and potential interventions that may not be directly related to the
identified components. On the other hand, systemic accident models take
a broader approach by considering the interactions and interdependencies
between various components in a system, such as feedback loops and human
factors, and by using more diffuse models of causality. This allows them to capture a
wider range of risk sources and potential interventions, making them
more comprehensive in addressing system failures.</p>
<h2 id="drift-into-failure-and-existential-risks">4.6.2 Drift into Failure and
Existential Risks</h2>
<p>This book presents multiple ways in which the development and
deployment of AIs could entail risks, some of which could be
catastrophic or even existential. However, the systemic accident models
discussed above highlight that events in the real world often unfold in
a much more complex manner than the hypothetical scenarios we use to
illustrate risks. It is possible that many relatively minor events could
accumulate, leading us to drift toward an existential risk. We are
unlikely to be able to predict and address every potential combination
of events that could pave the route to a catastrophe.<p>
For this reason, although it can be useful to study the different risks
associated with AI separately when initially learning about them, we
should be aware that hypothetical example scenarios are simplified, and
that the different risks coexist. We will now discuss what we can learn
from our study of complex systems and systemic accident models when
developing an AI safety strategy.</p>
<p><strong>Risks that do not initially appear catastrophic might
escalate.</strong> Risks tend to exist on a spectrum. Power inequality,
disinformation, and automation, for example, are prevalent issues within
society and are already causing harm. Though serious, they are not
usually thought of as posing existential risks. However, if pushed to an
extreme degree by AIs, they could result in totalitarian governments or
enfeeblement. Both of these scenarios could represent a catastrophe from
which humanity may not recover. In general, if we encounter harm from a
risk on a moderate scale, we should be careful to not dismiss it as
non-existential without serious consideration.</p>
<p><strong>Multiple lower-level risks can combine to produce a
catastrophe.</strong> Another reason for thinking more comprehensively
about safety is that, even if a risk is not individually extreme, it
might interact with other risks to bring about catastrophic outcomes
<span class="citation" data-cites="hendrycks2023overview">[6]</span>.
Imagine, for instance, a scenario in which competitive pressures fuel an
AI race between developers. This may lead a company to reduce its costs
by putting less money into maintaining robust information security
systems, with the result that a powerful AI is leaked. This would
increase the likelihood that someone with malicious intent successfully
uses the AI to pursue a harmful outcome, such as the release of a deadly
pathogen.<p>
In this case, the AI race has not directly led to an existential risk by
causing companies to, for example, bring AIs with insufficient safety
measures to market. Nevertheless, it has indirectly contributed to the
existential threat of a pandemic by amplifying the risk of malicious
use.<p>
This echoes our earlier discussion of catastrophes in complex systems,
where we discussed how it is often impractical and infeasible to
attribute blame to one major “root cause” of failure. Instead, systems
often “drift into failure” through an accumulation and combination of
many seemingly minor events, none of which would be catastrophic alone.
Just as we cannot take steps to prevent every possible mistake or
malfunction within a large, complex system, we cannot predict or control
every single way that various risks might interact to result in
disaster.</p>
<p><strong>Conflict and global turbulence could make society more likely
to drift into failure.</strong> Although we have some degree of choice
in how we implement AI within society, we cannot control the wider
environment. There are several reasons why events like wars that create
societal turbulence could increase the risk of human civilization
drifting into failure. Faced with urgent, short-term threats, people
might deprioritize AI safety to focus instead on the most immediate
concerns. If AIs can be useful in tackling those concerns, it might also
incentivize people to rush into giving them greater power, without
thinking about the long-term consequences. More generally, a more
chaotic environment might also present novel conditions for an AI that
cause it to behave unpredictably. Even if conditions like war do not
directly cause existential risks, they make them more likely to
happen.</p>
<p><strong>Broad interventions may be more effective than narrowly
targeted ones.</strong> Previous attempts to manage existential risks
have focused narrowly on avoiding risks directly from AIs, and mainly
addressed this goal through technical AI research. Given the complexity
of AIs themselves and the systems they exist within, it makes sense to
adopt a more comprehensive approach, taking into account the whole risk
landscape, including threats that may not immediately seem catastrophic.
Instead of attempting to target just existential risks precisely, it may
be more effective to implement broad interventions, including
sociotechnical measures.</p>
<p><strong>Summary.</strong> As we might expect from our study of
complex systems, different types of risks are inextricably related and
can combine in unexpected ways to amplify one another. While some risks
may be generally more concerning than others, we cannot neatly isolate
those that could contribute to an existential threat from those that
could not, and then only focus on the former while ignoring the latter.
In addressing existential threats, it is therefore reasonable to view
systems holistically and consider a wide range of issues, besides the
most obvious catastrophic risks. Due to system complexity, broad
interventions are likely to be required as well as narrowly targeted
ones.</p>
<br>
<br>
<h3>References</h3>
<div id="refs" class="references csl-bib-body" data-entry-spacing="0"
role="list">
<div id="ref-perrow1999normal" class="csl-entry" role="listitem">
<div class="csl-left-margin">[1] C.
Perrow, <em>Normal accidents: Living with high risk technologies -
updated edition</em>, REV - Revised. Princeton University Press, 1999.
Accessed: Oct. 14, 2023. [Online]. Available: <a
href="http://www.jstor.org/stable/j.ctt7srgf">http://www.jstor.org/stable/j.ctt7srgf</a></div>
</div>
<div id="ref-Leveson2009MovingBN" class="csl-entry" role="listitem">
<div class="csl-left-margin">[2] N.
G. Leveson, N. Dulac, K. Marais, and J. S. Carroll, <span>“Moving beyond
normal accidents and high reliability organizations: A systems approach
to safety in complex systems,”</span> <em>Organization Studies</em>,
2009.</div>
</div>
<div id="ref-Dietterich2017Steps" class="csl-entry" role="listitem">
<div class="csl-left-margin">[3] T.
G. Dietterich, <span>“Steps toward robust artificial
intelligence,”</span> <em>AI Magazine</em>, vol. 38, no. 3, pp. 3–24,
2017, doi: <a
href="https://doi.org/10.1609/aimag.v38i3.2756">https://doi.org/10.1609/aimag.v38i3.2756</a>.</div>
</div>
<div id="ref-rasmussen1997" class="csl-entry" role="listitem">
<div class="csl-left-margin">[4] J.
Rasmussen, <span>“Risk management in a dynamic society: A modelling
problem,”</span> <em>Safety Science</em>, vol. 27, no. 2, pp. 183–213,
1997, doi: <a
href="https://doi.org/10.1016/S0925-7535(97)00052-0">https://doi.org/10.1016/S0925-7535(97)00052-0</a>.</div>
</div>
<div id="ref-dekker2011Drift" class="csl-entry" role="listitem">
<div class="csl-left-margin">[5] S.
Dekker, <em>Drift into failure: From hunting broken components to
understanding complex systems</em>. 2011, pp. 1–220. doi: <a
href="https://doi.org/10.1201/9781315257396">https://doi.org/10.1201/9781315257396</a>.</div>
</div>
<div id="ref-hendrycks2023overview" class="csl-entry" role="listitem">
<div class="csl-left-margin">[6] D.
Hendrycks, M. Mazeika, and T. Woodside, <span>“An overview of
catastrophic AI risks.”</span> 2023. Available: <a
href="https://arxiv.org/abs/2306.12001">https://arxiv.org/abs/2306.12001</a></div>
</div>
</div>