Can Security be Built into Pure Data?

by Wyatt Lee

This is a question that might not make much sense when you first read it.

In fact, it's probably gibberish.

For this reason, this article is not going to follow the traditional problem, solution, conclusion format.  Instead of starting with the problem and giving a solution, I'm going to start with a question, explain why the question makes sense, and end with an interesting challenge to every ethical hacker.

Imagine you're a developer commissioned to build a web app of some kind.

No matter what architecture you choose or what that app does, it is probably going to involve parsing some XML and rendering it in a human-readable form.  One alternative, of course, is storing the data in a relational database, but that is a field in itself, and much is already known about best practices for interacting with those data stores, such as avoiding SQL injection.  So we will restrict ourselves to considering the following:

1.)  Some web application which may be, for example, a Node.js app that runs completely in the back-end or a JavaScript front-end app that writes data to the browser's cache or your hard drive.

2.)  User data is stored as some kind of XML on disk or, less securely, in the browser's cache.

The first example could be a web browser which reads in HTML, CSS, and JavaScript and renders it in the browser's GUI.  This architecture, which has been used since the 1990s, presupposes that security should be the responsibility of the browser as a second line of defense after the router and any packet filtering firewall or intrusion detection system you have installed.

The question I would like to put forth is, is this necessary?  Why not just have a secure XML document so that additional security is built into the data itself?

Why would this be beneficial?  Because it would greatly reduce the need for browsers to be secure.  Now, assuming you're with me so far, how do we do such a thing?

What I am proposing is a secure XML standard that adds a layer of security between the browser (or a mobile app, or apps running in the cloud or fog) and possibly malicious packets of data, whether or not they can be decoded to XML.  An extra layer of security for virtually no overhead is always a good idea.

Consider the following piece of XML-like code:
<data format="JSON" key="Your Key Here">
  <code>
  /* JSON file here */
  </code>
</data>
A simple parser could easily be written for documents of this form: anything sent over the web in this format would be handed to a separate validation server and parsed there in a secure way.  Because such a parser is far simpler than most general-purpose parsers, it has fewer places for bugs to hide and should therefore be more secure.

Such a parser could be formally verified quite easily.
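
To make the idea concrete, here is a minimal Python sketch of what such a parser might look like.  It is only an illustration under stated assumptions: the function name parse_envelope, the 1 MiB size limit, and the exact grammar (one <data> element with format and key attributes wrapping one <code> element, with or without a comma between the attributes) are hypothetical choices, not part of any existing standard.

```python
import re

# Hypothetical envelope grammar: exactly one <data> element carrying "format"
# and "key" attributes, wrapping exactly one <code> element. Nothing else is
# accepted. The payload may not contain the literal string "</code>".
ENVELOPE = re.compile(
    r'\s*<data\s+format\s*=\s*"([^"<>]*)"(?:\s*,\s*|\s+)key\s*=\s*"([^"<>]*)"\s*>'
    r'\s*<code>((?:(?!</code>).)*)</code>\s*</data>\s*',
    re.DOTALL,
)

MAX_LEN = 1 << 20  # assumed upper bound on envelope size (1 MiB)


def parse_envelope(text):
    """Return (format, key, payload) or raise ValueError for anything else."""
    if len(text) > MAX_LEN:
        raise ValueError("envelope too large")
    match = ENVELOPE.fullmatch(text)
    if match is None:
        raise ValueError("not a well-formed envelope")
    fmt, key, payload = match.groups()
    return fmt, key, payload.strip()


if __name__ == "__main__":
    doc = '<data format="JSON" key="abc">\n  <code>\n  {"a": 1}\n  </code>\n</data>'
    print(parse_envelope(doc))  # ('JSON', 'abc', '{"a": 1}')
```

The whole grammar fits in one pattern and the parser never interprets the payload, which is what keeps it small enough to reason about.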

The string between <code> and </code> would ideally be separated physically in memory, preferably on a cluster where nodes are physically separated in space.

A MapReduce-style validation algorithm can then be used to decide whether the data is suspicious.  Because each worker node's memory is physically isolated and each node receives only a random portion of the string, the risk of buffer overflow attacks is minimized.
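
The validation step could be sketched roughly as follows, in Python.  Everything here is an assumption made for illustration: a local thread pool stands in for physically separate worker nodes, and the per-chunk rule (no control characters other than tab, line feed, and carriage return) is a placeholder for whatever a real standard would define as suspicious.  The names split_randomly, chunk_is_clean, and validate are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def split_randomly(payload, parts, rng):
    """Cut payload into contiguous pieces at randomly chosen offsets."""
    if not payload:
        return []
    parts = max(1, min(parts, len(payload)))
    cuts = sorted(rng.sample(range(1, len(payload)), parts - 1))
    bounds = [0] + cuts + [len(payload)]
    return [payload[a:b] for a, b in zip(bounds, bounds[1:])]


def chunk_is_clean(chunk):
    """Map step (placeholder rule): reject control characters except tab, LF, CR."""
    return all(ch in "\t\n\r" or ord(ch) >= 0x20 for ch in chunk)


def validate(payload, workers=4, seed=None):
    """Return True only if every worker reports its chunk as clean."""
    rng = random.Random(seed)
    chunks = split_randomly(payload, workers, rng)
    # The thread pool is a stand-in for isolated nodes; a real deployment
    # would ship each chunk to a different machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(chunk_is_clean, chunks))  # map
    return all(verdicts)                                   # reduce


if __name__ == "__main__":
    print(validate('{"a": 1}'))       # True
    print(validate('{"a": "\x00"}'))  # False
```

One design consequence worth noting: a check that runs on each chunk in isolation cannot see a pattern that straddles a cut, so rules of that kind would need overlapping chunks or a second pass.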

Once this validation phase is over, the data can be verified against the key, which is a checksum of the validated data.
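
The checksum comparison might look like the Python sketch below.  The article does not say which checksum to use, so SHA-256 over the UTF-8 bytes of the payload, hex-encoded, is an assumption, as are the function names checksum and verify.

```python
import hashlib
import hmac


def checksum(payload):
    """Hex SHA-256 digest of the payload (assumed checksum algorithm)."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify(payload, key):
    """Return True if key matches the checksum of payload."""
    expected = checksum(payload).encode("ascii")
    supplied = key.strip().lower().encode("utf-8")
    # compare_digest compares in constant time to avoid timing leaks.
    return hmac.compare_digest(expected, supplied)


if __name__ == "__main__":
    payload = '{"a": 1}'
    key = checksum(payload)
    print(verify(payload, key))        # True
    print(verify(payload + " ", key))  # False
```

A plain checksum like this detects corruption, but anyone who alters the payload can recompute it.  Turning the key into real tamper protection would likely mean a keyed construction such as an HMAC with a secret shared between sender and validation server.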

If anybody is interested in meeting and turning this into a cryptographic protocol for very sensitive applications on the web such as apps running in the cloud, or wants to propose some open standards for a secure XML for crypto applications, send me an email.
