dan on Mon, 28 Apr 2014 05:48:57 +0200 (CEST)


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: <nettime> Philosophy of the Internet of Things


> It would be good to get some nettime views.
> 
> The Internet of Things (IoT) is an umbrella term used...

For me, the rising interdependence of all players on the Internet
is a setting for common-mode failure both of technology and of
governance, and the IoT is central to that rising interdependence.
I applaud you for having the gumption to stand athwart these
developments yelling "Wait a minute" at a time when few are inclined
to do so, or to have much patience with those who so urge.

I gave this speech at NSA on 26 March.  As the nettime moderators
prefer we do here, http://geer.tinho.net/geer.nsa.26iii14.txt follows
below in full.  I bring to your attention the parts about embedded
systems in particular, about which there is much other material.

--dan

-----------------8<------------cut-here------------8<-----------------

APT in a World of Rising Interdependence
Dan Geer, 26 March 14, NSA

Thank you for the invitation and to the preceding speakers for their
viewpoints and for the shared experience.  With respect to this
elephant, each of us is one of those twelve blind men.

We are at the knee of the curve for deployment of a different model
of computation.  We've had two decades where, in round numbers,
laboratories gave us twice the computing for constant dollars every
18 months, twice the disk drive storage capacity for constant dollars
every 12 months, and twice the network speed for constant dollars
every 9 months.  That is two orders of magnitude in computes per
decade, three for storage, and four for transmission.  In constant
dollar terms, we have massively enlarged the stored data available
per compute cycle, yet that data is more mobile in the aggregate
than when there was less of it.

It is thus no wonder that cybercrime is data crime.  It is thus no
wonder that the advanced persistent threat is the targeted effort
to obtain, change, or deny information by means that are "difficult
to discover, difficult to remove, and difficult to attribute."[DG]

Yet, as we all know, laboratory results filter out into commercial
off the shelf products at rates controlled by the market power of
existing players -- just because it can be done in the laboratory
doesn't mean that you can buy it today.  So it has been with that
triad of computation, storage, and transmission capacities.  As
Martin Hilbert's studies describe, in 1986 you could fill the world's
total storage using the world's total bandwidth in two days.  Today,
it would take 150 days of the world's total bandwidth to fill the
world's total storage, and the measured curve between 1986 and today
is all but perfectly exponential.[MH]

Meanwhile, Moore's Law has begun slowing.  There are two reasons
for this.  Reason number one is physics: We can't cool chips at
clock rates much beyond what we have now.  Reason number two is
economics: The cost of new fabrication facilities doubles every two
years, which is Moore's lesser-known Second Law.  Intel canceled
its Fab42 in January of this year because the capital cost per gate
is now rising.  By 2018 one new fab will be just as expensive in
inflation adjusted terms as was the entire Manhattan Project.[GN]
The big players will have to get bigger still, or Moore's First Law
is over because of Moore's Second Law.

And hardware replacement cycles are no longer driven by customer
upgrade lust -- by which I mean the need to buy new hardware just
because you need new hardware to run new software.  "Good enough
for everything I need to do" now dominates computing excepting,
perhaps, in mobile, but that, too, is a curve that will soon flatten.
Only graphics cards are not yet "good enough for everything I need
to do", but every curve has its asymptote.  In sum, the commercial
off the shelf market is not going to keep allowing us to dream big
without regard to the underlying performance costs.  We are not
going to grow ourselves out of performance troubles of our own
making.  We were able to do that for a good long run, but that party
is over.

We can see that now in cryptography.  I will certainly not lecture
this audience on that subject.  What I can do is to bring word from
the commercial world that cryptographic performance is now a
front-and-center topic of discussion both in individual firms,
amongst expert discussion groups, and within standards bodies.  The
commercial world has evidently decided that the time has come to
add cryptographic protections to an expanded range of products and
services.  The question being unevenly debated is whether, on the
one hand, to achieve cryptographic performance with ever more adroit
algorithm design, especially design that can make full use of
parallelization, or to trend more towards hardware implementations.
As you well know, going to hardware yields really substantial gains
in performance not otherwise possible, but at the cost of zero post
installation flexibility.  This is not hypothetical; AES performance
improvements have of late been because software has been put aside
in favor of hardware.  At least in the views of some of us, hardware
embodiments make the very idea of so-called "algorithm agility"
operationally irrelevant because recapitalizing one's data center
so as to get a new hardware-based crypto algorithm spliced in is
just not going to happen, nor is turning off some optimized, not
to mention amortized, hardware just to be able to use some new
software that is consequentially 10X slower.  One is reminded of
Donald Knuth's comment that "Premature optimization is the root of
all evil."

This brings us to the hardware question in general terms.  The
embedded systems space, already bigger than what is normally thought
of as "a computer," makes the attack surface of the non-embedded
space trivial if not irrelevant.  Perhaps I overstate.  Perhaps
that isn't true today, but by tomorrow it will be true.  Quoting
an authoritative colleague[PG], "[In] the embedded world (which
makes the PC and phone and whatnot market seem trivial by comparison),
[...] performance stays constant and cost goes down.  Ten years ago
your code had to run on a Cortex-M.  Ten years from now your code
will need to run on more or less the same Cortex-M, only it'll be
cheaper and have more integrated peripherals."

Let me pause to ask a teaser question; if those embedded devices
are immortal, are they angelic?  Let me first talk, though, about
the wider world.

Beginning with Stephanie Forrest in 1997,[SF] regular attention has
been paid to the questions of monoculture in the networked environment.
There is no point belaboring the fundamental observation, but let
me state it clearly for the record: cascade failure is so very much
easier to detonate in a monoculture -- so very much easier when the
attacker has only to weaponize one bit of malware, not ten million.
The idea is obvious; believing in it is easy; acting on its
implications is, evidently, rather hard.

Despite what you may think, I am entirely sympathetic to the actual
reason we continue to deploy computing monocultures -- making
everything almost entirely alike is, and remains, our only hope for
being able to centrally manage it all in a consistent manner.  Put
differently, when you deploy a computing monoculture you are making
a fundamental risk management decision: That the downside risk of
a black swan[NT] event is more tolerable than the downside risk of
perpetual inconsistency.  This is a hard question, as all risk
management is about changing the future, not explaining the past.
So let me repeat, which would you rather have, the inordinately
unlikely event of an inordinately severe impact, or the day-to-day
burden of perpetual inconsistency?

When we opt for monocultures by choice we had better opt for tight
central control.  This, of course, supposes that we are willing to
face the risks that come with tight central control including the
paramount risk of any and all auto-update schemes -- namely the
hostile takeover of the auto-update mechanism itself.  But amongst
deployed monocultures, computer desktops are not the point; embedded
systems are.  The trendline in the count of critical monocultures
seems to be rising and most of these are embedded systems both
without a remote management interface and long lived.  That combination
-- long lived and not reachable -- is the trend that must be dealt
with, possibly even reversed.  Whether to insist that embedded
devices self destruct by some predictable age or that remote
management of them be a condition of deployment is the question,
dare I say the policy question, on the table.  In either case, the
Internet of Things, which is to say the appearance of network
connected microcontrollers in seemingly every device, should raise
hackles on every neck.  Look at Dan Farmer's work on IPMI, the
so-called Intelligent Platform Management Interface, if you need
convincing.[DF]

This is one of my key points for today -- that an advanced persistent
threat, one that is difficult to discover, difficult to remove, and
difficult to attribute, is easier in a low-end monoculture, easier
in an environment where much of the computing is done by devices
that are deaf and mute once installed, where those devices bring
no relevant societal risk by their onesies and twosies, but do bring
relevant societal risk at today's extant scales much less the scales
coming soon.  As Dave Aitel, one of your alumni, has put it many
times, for the exploit writer the hardest part by far is test, not
coding.[DA]  Put differently, over the years I've modified my
thinking on monoculture such that I now view monoculture not as an
initiator of attack but as a potentiator, not as an oncogene but
as angiogenesis.

Fifteen years ago, Lazslo Barabasi showed why it is not possible
to design a network that is at once proof against both random faults
and targeted faults.[LB]  Assuming that his conception of a scale-free
network is good enough for our planning purposes, we see that today
we have a network that is pretty well immune to failure from random
faults but which is hardly immune to targeted faults.  Ten years
ago, Sean Gorman's simulations showed a sharp increase in network-wide
susceptibility to cascade failure when a single exploitable flaw
reached 43% prevalence.[SG]  We are way above that 43% threshold
in many, many areas, most of them built-in, unseen, silent.  Five
years ago, Kelly Ziegler calculated that patching a fully deployed
Smart Grid would take an entire year to complete, largely because
of the size of the per-node firmware relative to the available
powerline bandwidth.[KZ]  How might we extrapolate from these various
researcher's findings?

The root source of risk is dependence, especially dependence on the
expectation of stable system state.  Dependence is not only individual
but mutual, not only am I dependent or not but rather a continuous
scale asking whether we are dependent or not; we are, and it is
called interdependence.  Interdependence is transitive, hence the
risk that flows from interdependence is transitive, i.e., if you
depend on the digital world and I depend on you, then I, too, am
at risk from failures in the digital world.  If individual dependencies
were only static, they would be evaluable, but we regularly and
quickly expand our dependence on new things, and that added dependence
matters because we each and severally add risk to our portfolio by
way of dependence on things for which their very newness makes risk
estimation, and thus risk management, neither predictable nor perhaps
even estimable.  Interdependence within society is today absolutely
centered on the Internet beyond all other dependencies excepting
climate, and the Internet has a time constant five orders of magnitude
smaller.

The Gordian Knot of such tradeoffs -- our tradeoffs -- is this: As
society becomes more technologic, even the mundane comes to depend
on distant digital perfection.  Our food pipeline contains less
than a week's supply, just to take one example, and that pipeline
depends on digital services for everything from GPS driven tractors
to drone-surveilled irrigators to robot vegetable sorting machinery
to coast-to-coast logistics to RFID-tagged livestock.  Is all the
technologic dependency, and the data that fuels it, making us more
resilient or more fragile?

Mitja Kolsek suggests that the way to think about the execution
space on the web today is that the client has become the server's
server.[MK]  You are expected to intake what amount to Remote
Procedure Calls (RPCs) from everywhere and everyone.  You are
supposed to believe that trust is transitive but risk is not.  That
is what Javascript does.  That is what Flash does.  That is what
HTML5 does.  That is what every embedded Browser Help Object (BHO)
does.  How do you think that embedded devices work?  As someone who
refuses Javascript, I can tell you that the World Wide Web is rapidly
shrinking because I choose to not be the server's server.

As they say on Marketwatch, let's do the numbers: The HTTP Archive
says that the average web page today makes out-references to 16
different domains as well as making 17 Javascript requests per page,
and the Javascript byte count is five times the HTML byte count.[HT]
A lot of that Javascript is about analytics which is to say
surveillance of the user "experience" (and we're not even talking
about getting your visitors to unknowingly mine Bitcoin for you by
adding that Javascript to your website.[BJ])

To return to the question of whether immortal embedded systems are
angelic or demonic, I ask you the most fundamental design question:
So should or should not an embedded system have a remote management
interface?  If it does not, then a late discovered flaw cannot be
fixed without visiting all the embedded systems -- which is likely
to be infeasible because some you will be unable to find, some will
be where you cannot again go, and there will be too many of them
in any case.  If it does have a remote management interface, the
opponent of skill will focus on that and, once a break is achieved,
will use those self-same management functions to ensure that not
only does he retain control over the long interval but, as well,
you will be unlikely to know that he is there.

Perhaps what is needed is for embedded systems to be more like
humans, and I most assuredly do not mean artificially intelligent.
By "more like humans" I mean this: Embedded systems, if having no
remote management interface and thus out of reach, are a life form
and as the purpose of life is to end, an embedded system without a
remote management interface must be so designed as to be certain
to die no later than some fixed time.  Conversely, an embedded
system with a remote management interface must be sufficiently
self-protecting that it is capable of refusing a command.  Inevitable
death and purposive resistance are two aspects of the human condition
we need to replicate, not somehow imagine that to overcome them is
to improve the future.

This is perhaps the core of my thesis, that when sentience is
available, automation will increase risk whereas when sentience is
not available, automation can reduce risk.  Note the parsing here,
that replacing available sentience with something that is not
sentient *will* increase risk but that substituting automation for
whatever you have absent sentience *can* make things better.  It
won't do so necessarily, but it can.  This devolves to a question
of what do I mean when I say "sentience is available" and that
devolves to some combination of finance and public policy, which
is to say the art of the possible both economically and politically.

Lest some of you think this is all so much picayune, tendentious,
academic perfectionist posturing, here is how to deny the Internet
to a large fraction of its users.  There are better methods, there
are more insidious methods, there are darker paths.  My apologies
to those of you who are aware of what I am about to describe, but
this one example of many is known to several of us, known in the
here and now:  Home routers have drivers and operating systems that
are binary blobs amounting to snapshots of the state of Linux plus
the lowest end commodity chips that were extant at the time of the
router's design.  Linux has moved on.  Device drivers have moved
on.  Samba has moved on.  Chipsets have moved on.  But what is sold
at Best Buy or the like is remarkably cheap and remarkably old.  At
the chip level, there are only three major manufacturers, so Gorman's
43% threshold is surpassed.  With certainty born of long engineering
experience, I assert that those manufacturers can no longer build
their deployed software blobs from source.  If, as my colleague Jim
Gettys has laboriously measured, the average age of the code base
on those ubiquitous low-end routers is 4-5 years,[JG] then you can
be assured that the CVE catalog lists numerous methods of attacking
those operating systems and device drivers remotely.[CV]  If I can
commandeer them remotely, then I can build a botnet that is on the
*outside* of the home network.  It need not ever put a single packet
through the firewall, it need never be detectible by any means
whatsoever from the interior of the network it serves, but it is
most assuredly a latent weapon, one that can be staged to whatever
level of prevalence I desire before I ask it to do more.  All I
need is to include in my exploit a way to signal that device to do
three things: stop processing anything it henceforth receives, start
flooding the network with a broadcast signal that causes other peers
to do the same, and zero the on-board firmware thus preventing
reboot for all time.  Now the only way to recover is to unplug all
the devices, throw them in the dumpster, and install new ones --
but aren't the new ones likely to have the same kind of vulnerability
spectrum in CVE that made this possible in the first place?  Of
course they do, so this is not a quick trip to the big box store
but rather flushing the entire design space and pipeline inventory
of every maker of home routers.

About now you may ask if it isn't a contradiction to imagine embedded
devices that have no management interface for you but are somehow
something that can be managed by various clowns.  The answer is
"No, it is not a contradiction."  As everyone here knows, an essential
part of software analysis is fuzzing, piping unusual input to the
program for the purpose of testing.[UW]  But that is only testing;
I refer you instead to the very important work now appearing under
the title "language-theoretic security."[LS]  Let me quote just two
paragraphs:

   The Language-theoretic approach (LANGSEC) regards the Internet
   insecurity epidemic as a consequence of ad hoc programming of
   input handling at all layers of network stacks, and in other
   kinds of software stacks.  LANGSEC posits that the only path to
   trustworthy software that takes untrusted inputs is treating all
   valid or expected inputs as a formal language, and the respective
   input-handling routines as a recognizer for that language.  The
   recognition must be feasible, and the recognizer must match the
   language in required computation power.

   When input handling is done in ad hoc way, the de facto recognizer,
   i.e., the input recognition and validation code ends up scattered
   throughout the program, does not match the programmers' assumptions
   about safety and validity of data, and thus provides ample
   opportunities for exploitation.  Moreover, for complex input
   languages the problem of full recognition of valid or expected
   inputs may be [formally] UNDECIDABLE, in which case no amount
   of input-checking code or testing will suffice to secure the
   program.  Many popular protocols and formats fell into this trap,
   the empirical fact with which security practitioners are all too
   familiar.

And that is really and truly the point.  The so-called "weird
machines" that result from maliciously well chosen input are the
machines, where regardless of whether there is a management interface
as such, that allow the target to be controlled by the attacker.
The Dartmouth group has now shown numerous examples of such weird
machines in practice, including a 2013 USENIX paper[JB] which begins:

   We demonstrate a Turing-complete execution environment driven
   solely by the IA32 architecture's interrupt handling and memory
   translation tables, in which the processor is trapped in a series
   of page faults and double faults, without ever successfully
   dispatching any instructions.  The "hard-wired" logic of handling
   these faults is used to perform arithmetic and logic primitives,
   as well as memory reads and writes. This mechanism can also
   perform branches and loops...

Therefore, we now see that devices that have no management interface
cannot be repaired by their makers but they can be commandeered by
others if enough skill is brought to bear.  Devices that do have a
management interface are better off, but only if they protect that
interface at all costs.  Because the near entirety of commercial
Internet usage beyond HTML v4 relies upon Turing-complete languages,
the security of these services cannot be proven because to do so
would be to solve the halting problem.  When weird machine style
attacks begin to involve devices that do not have a human user who
might be coherent enough to notice that something is amiss, they
will proceed in stealth.

To be brusquely clear, while I was writing this talk about the
future, the future may have appeared.  We do not know, but the worm
called TheMoon that is now working its way through the world's
Linksys routers may be precisely what I have described.[TM]  It may
be that.  It may be not that the forest could burn, but that it is
already afire.  It may be that we are one event away from not being
able to disambiguate hostile action from an industrial accident.
It is certainly total proof that Sandy Clark's work needs wide
recognition cum action plan more or less yesterday.[SC]

I don't expect any of my analysis to change the course of the world,
the market, or Capitol Hill.  Therefore, let me give my core
prediction for advanced persistent threat: In a world of rising
interdependence, APT will not be about the big-ass machines; it
will about the little.  It will not go against devices with a
hostname and a console; it will go against the ones you didn't even
know about.  It will not be something you can fix for any of the
usual senses of the English word "fix;" it will be avoidable only
by damping dependence.  It cannot and will not be damped by a laying
on of supply chain regulations.  You are Gulliver; they are the
Lilliputians.

My personal definition of a state of security is "The absence of
unmitigatable surprise."  My personal choice for the pinnacle goal
of security engineering is "No silent failure."  You, for all values
of "you," need not adopt those, but I rather imagine you will find
that in an Internet of More Things Than You Can Imagine an ounce
of prevention will be worth way, way more than a pound of cure.  We
have very little time left -- the low-end machines of four years
from now are already being deployed.  As Omar Khayam put it a
thousand years ago,

.  The Moving Finger writes: and, having writ,
.  Moves on: nor all thy Piety nor Wit
.  Shall lure it back to cancel half a Line,
.  Nor all thy Tears wash out a Word of it.


There is never enough time.  Thank you for yours.



/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\

[DG] Geer D, "Advanced Persistent Threat," Computerworld, April
2010; www.computerworld.com/s/article/9175363/Advanced_persistent_threat

[MH] www.martinhilbert.net/WorldInfoCapacityPPT.html (reflecting
Hilbert & Lopez, Science:v332/n6025/p60-65) extrapolated to 2014
with concurrence of its author

[PG] Gutmann P, U Auckland, personal communication

[GN] "Slowing Moore's Law;" www.gwern.net/Slowing%20Moore's%20Law

[SF] Forrest S, Somayaji, & Ackley, "Building Diverse Computer
Systems," HotOS-VI, 1997; www.cs.unm.edu/~immsec/publications/hotos-97.pdf

[NT] Taleb NN, _Fooled By Randomness_, Random House, 2001

[DF] Farmer D, "IPMI: Freight Train to Hell v2.01," 2013;
fish2.com/ipmi/itrain.pdf

[DA] Aitel D, CTO, Immunity, Miami, personal communication

[LB] Barabasi L & Albert R, "Emergence of scaling in random networks,"
Science, v286 p509-512, October 1999

[SG] Gorman S, et al., "The Effect of Technology Monocultures on
Critical Infrastructure," 2004;
policy.gmu.edu/imp/research/Microsoft_Threat.pdf

[KZ] Ziegler K, "The Future of Keeping the Lights On," USENIX, 2010;
static.usenix.org/events/sec10/tech/slides/ziegler.pdf

[MK] Kolsek M, ACROS, Slovenia, personal communication

[HT] Trends, HTTP Archive; www.httparchive.org/trends.php

[BJ] Bitcoin Miner for Websites; www.bitcoinplus.com/miner/embeddable

[JG] Gettys J, former VP Software, One Laptop Per Child, personal
communication

[CV] Common Vulnerabilities and Exposures, cve.mitre.org/cve

[UW] source concepts at U Wisconsin, pages.cs.wisc.edu/~bart/fuzz

[LS] The View from the Tower of Babel, langsec.org/

[JB] Bangert J, et al., "The Page-Fault Weird Machine: Lessons in
Instruction-less Computation," USENIX, 2013;
www.usenix.org/conference/woot13/workshop-program/presentation/bangert

[TM] "Linksys Worm "TheMoon" Summary: What we know so far," 27 Mar 14,
isc.sans.edu/forums/diary/A+few+updates+on+The+Moon+worm/17855

[SC] Clark S, et al., "The Honeymoon Effect and the Role of Legacy
Code in Zero-Day Vulnerabilities," ACSAC, 2010;
www.acsac.org/2010/openconf/modules/request.php?module=oc_program&action=view.php&a=&id=69&type=2

=====
this and other material on file under geer.tinho.net/pubs


#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mx.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org