This is the best foreword I've read to date. It alone has changed how I think about metrics that measure security. If you never own this book or read it to completion, read the foreword. At only 4 pages, it is a concise and fundamental articulation of how to think about quantitatively measuring security. If you haven't read it, stop by a bookstore and check it out when you can spare 5 minutes. You'll be happy you did.
[Lines of code] is a very costly measuring unit because it encourages the writing of insipid code, but today I am less interested in how foolish a unit it is from even a pure business point of view. My point today is that, if we wish to count lines of code, we should not regard them as "lines produced" but as "lines spent": the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger.

Could it be that our software development process is fundamentally flawed? That vulnerabilities are merely an artifact, or symptom, of a problem that transcends all software engineering? In his essay, Dijkstra insists upon building code guided by formal mathematical proof, as such code is correct by design. Does this sound familiar? Perhaps like "secure by design"? This is a grave and pessimistic evaluation of the state of software development that still holds a great deal of merit two decades after it was written. Today, we see Dijkstra's diagnosis painfully manifested as viruses, worms, hackers, computer network exploitation, and the resultant loss of intellectual property.
Later, Dijkstra enumerates opposition to his proposed approach to development paired with formal mathematical proof. Again intersecting the security discipline, he writes:
the business community, which, having been sold to the idea that computers would make life easier, is mentally unprepared to accept that they only solve the easier problems at the price of creating much harder ones.

And thus, on December 2, 1988 - almost exactly twenty years ago to the day as I write this - Edsger W. Dijkstra defined the source of computer security problems by reiterating the "law" of unintended consequences. Accepting this axiom, security practitioners focus on identifying the harder problems that result from the "easy," mathematically imprecise, logically dubious solutions upon which the bulk of our computing infrastructure operates. I feel very strongly that this one statement scopes our discipline better than any other that has yet been made - so strongly that it is worth re-evaluating what information security is.
Security is the identification and mitigation of the unintended consequences of computer system use that result in the compromise of the confidentiality, integrity, or availability of said system or its constituent data.
For thousands of years, mankind has relied on oral history to pass along anecdotes, stories of our history, lessons learned, and any other bit of collective knowledge that societies felt necessary to preserve in order to facilitate the survival of the species - explicitly or otherwise. It is the recognition of this benefit that has largely enabled humans to thrive in societies which wisely chose the knowledge to pass along, and has led to the creation of such constructs as "conventional wisdom," "old wives' tales," fables, stories, and even religion. While it was initially feared as a challenge to this status quo of knowledge transfer, Gutenberg's invention of the printing press around 1439 was an amplification of these constructs, an argument reinforced by the first book to be printed - the Bible - and proven correct over time. This invention was the mother of all evolutionary inventions in man's history at that time.
While the pairing of the printing press and widespread literacy opened the door of knowledge to many more of our species, the spread of and access to this information was still spotty and slow. It had been, and still was, necessary for mankind to keep much of the knowledge needed to process information and analyze various aspects of one's own life, surroundings, and society in our collective heads for daily use. This was the driving need for the continuity of our legacy constructs: while we could gain knowledge and share it far more easily, to leverage it in a practical sense we had to be able to keep that information in our heads. We had evolved to easily store knowledge in terms of these constructs through natural selection, and thus our conventional mechanisms for knowledge transfer between generations survived, and even thrived, under this new regime of recordation.
Computer systems also have a problem of information access, which various components have been developed to address. ENIAC, and early computers like it, had to be able to store information that would be processed in the "processor" itself. It only had one type of memory - essentially, a flip-flop. It was this single mechanism that was available for all types of data. This limited the computation to that which could be crammed into this expensive memory. Later, the concept of a slower "core" memory unit was developed. Data that had to be immediately operated upon had to exist in registers (memory) on the processor itself. That which did not need to be operated on immediately could be swapped out to the slower, larger "core" memory. Modern computers have many levels of memory, from registers that operate at the speed of the processor, to multi-layer on-chip cache from which the registers are populated, to RAM which holds necessary but less-immediately accessed data, to disk which holds infrequently accessed data. Along with evolutions in mechanisms for storing data have come evolutions in how to most effectively leverage them, including predictive algorithms for caching data and swapping it from slower to faster storage devices to minimize execution delays due to memory access.
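To make the swapping idea concrete, here is a minimal sketch of a two-tier memory, assuming a least-recently-used (LRU) replacement policy. LRU is only one simple choice and is not the predictive kind of algorithm mentioned above; the class name, the capacity, and the sample addresses are all hypothetical, chosen purely for illustration.

```python
from collections import OrderedDict


class TwoTierMemory:
    """Toy model: a small fast tier backed by a large slow tier (hypothetical)."""

    def __init__(self, fast_capacity, backing_store):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()        # least recently used entry comes first
        self.slow = dict(backing_store)  # stands in for core memory or disk
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.fast:
            # Fast path: the value is already in the small, quick tier.
            self.hits += 1
            self.fast.move_to_end(address)  # mark as most recently used
            return self.fast[address]
        # Slow path: fetch from the large tier and keep a copy close by.
        self.misses += 1
        value = self.slow[address]
        self.fast[address] = value
        if len(self.fast) > self.fast_capacity:
            # "Swap out" whatever has gone unused the longest.
            evicted_address, evicted_value = self.fast.popitem(last=False)
            self.slow[evicted_address] = evicted_value
        return value


# Assumed sample data: address n holds the value n * n.
memory = TwoTierMemory(fast_capacity=2, backing_store={n: n * n for n in range(8)})
values = [memory.read(address) for address in (1, 2, 1, 3, 2)]
print(values, memory.hits, memory.misses)
```

With a fast tier of only two slots, the second read of address 1 is a hit, while address 2 has already been swapped out by the time it is requested again and must be fetched from the slow tier a second time.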
Like the development of slower, larger memory to support data computation in our modern computers, we have collectively invented this revolutionary tool known as the internet. As the ready availability of data to mankind increases, we are forced to rely less and less on our conventional (less accurate) mental constructs, just as computers needed to store smaller and smaller portions of the data and instructions that could be processed at ready access to the CPU. As a result of all of this, in the case of computers as well as mankind, the set of information available increased exponentially. When performing tasks, we now have a wealth of available information that doesn't have to be at the tip of our fingers, or on the top of our brain, in order to be processed in a reasonable period of time. We read things on the internet, perform research in a few minutes, and - if necessary - remember it to perform a task more quickly the next time. We may "swap out", or forget, something that we previously needed on a regular basis with confidence that if we need it again later, we will be able to find it. This is a rudimentary memory management algorithm, adapted to human nature.
All of this raises some important questions that mankind needs to reckon with in the not-too-distant future. How might this revolution in the very essence of our thinking change our constructs? In what ways will fictional literature be impacted? Will we still tell our children stories? How will religion survive? Can computer memory management techniques be adapted by psychologists to train humans to more effectively leverage our new tools like the internet? Is this evolution leaving us vulnerable should we somehow "lose" this tool through war or regression in civilization like that which happened after the first Roman empire? These questions will be answered, implicitly or explicitly, in coming generations. How we answer these questions and resolve the inevitable conflict in between the question and answer will shape no less than the future of our species. It is essential that we recognize the existence and significance of these questions now, if we have hope of answering them as a civilized society, rather than through war or deterioration of our hard-won civilization.
Research that recognizes the issue of technology fundamentally changing ourselves and society is now being highlighted by mainstream media outlets. Recently, USA Today published an article that discusses technology's impact on our social interactions. Closer to the point I make above is this article discussing how surfing the internet alters how one thinks. The latter seems to imply that this model of cognition will be more efficient than our legacy constructs by suggesting those who are able to leverage it will be ahead of others intellectually and socially in future generations.
Over the past few days, a discussion has been forming on the GCFA mailing list regarding the use of the word evidence. Specifically, how appropriate is it to call a hard drive (or more logical construct such as a file) "evidence" when it may turn out that the object will serve no purpose in conclusively resolving an investigation? Is it evidence, or is another word more apropos?
Reading the dialogue reminded me once again of the importance of vocabulary, particularly in technical fields where clear, precise communication is an operational imperative rather than merely a creative expression or embellishment. While it may seem academic, mutual agreement on the use of these critical terms serves as the basis for communication in computer forensics. The more clearly defined our language is, the more effective and efficient our communications will be. Even in the first person, definitions carry great significance, influencing no less than the very way that we think. As George Orwell said, if thought corrupts language, language can also corrupt thought. The importance of this feedback loop cannot be overstated - clarity in language will influence a deeper clarity of thought.
Insofar as our fields of study are concerned, largely in their infancy with respect to other scientific fields, disambiguation of terminology is a significant challenge. Various leading texts provide differing and sometimes conflicting word definitions and usage - even with basics such as what an 'incident' is. Media coverage of security compromises often overlooks the significant differences between CNA ("taking out the DNS infrastructure") and CNE ("industrial espionage"). Our vendors are not exactly helping the situation either - as a high-profile example, see Microsoft's Threat Modeling, which is really risk modeling. It is easy to see that we, as professionals in our young field, wield great power in shaping the future through contributions to our common language where it is still unclear or improperly used. I encourage readers to participate in these discussions whenever they arise. Diversity in opinion and vigorous dialogue are necessary to solve these foundational problems and mature our industry.
As to the definition of the word evidence, I'll leave that to a better discussion forum than a blog.