Parsing JavaScript in JavaScript

Closures are a language feature that lets a function capture variables from the scope in which it was defined, without passing them in as parameters. E.g.:

a = "b";
function c() { return a; }
c()
"b"

A common pattern is to return a second function, using arguments from the first:

function c(a) { 
  return function f(b) { 
    return a + b; 
  } 
}
c(5)(8)
13

This is a powerful construct, as it essentially lets you create custom functions. The downside is that closed-over variables are difficult to discover in the browser without the debugger. If a library is built entirely from closures, it will hand you objects that appear to have no state, since the state is held inside functions. This is how D3.js works (to D3’s credit, it offers plenty of options to expose the data). A worse scenario is a mix, where a few items of state are hidden from the developer but most are visible.
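To make the "no visible state" point concrete, here is a minimal sketch. The name `makeCounter` is hypothetical and not taken from any library; it only illustrates an object whose state lives entirely in a closure.

```javascript
// Hypothetical example: the count lives only in the closure.
function makeCounter() {
  var count = 0; // closed-over state, not a property of the returned object
  return {
    increment: function () {
      count += 1;
      return count;
    }
  };
}

var counter = makeCounter();
counter.increment();
counter.increment();

// The object exposes only its method; the count is not reachable as a property.
console.log(Object.keys(counter)); // only "increment" is listed
console.log(counter.count);        // undefined
```

Inspecting `counter` in the console shows a single method and nothing else, even though it has been incremented twice.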

Since it’s trivial to convert a function back to a string, I thought a first step to fixing this might be parsing the JavaScript to determine which variables are closed over. Once you have this information, you are at least aware of the issue, and if you could run code within that function’s context, you could log the values.
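As a small illustration of the string conversion, the sketch below (with the hypothetical names `adder` and `addFive`) shows that `toString()` recovers the source text of a function, including the name of the closed-over variable, but not its value.

```javascript
// Hypothetical example: a function that closes over `a`.
function adder(a) {
  return function add(b) { return a + b; };
}

var addFive = adder(5);

// toString() gives back the source text of `add` only.
var source = addFive.toString();
console.log(source); // the text mentions `a`, but the value 5 appears nowhere
```

This is why parsing the text can tell you which names to look for, but not what they currently hold.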

There are a few parsers available already, because any library that minifies or analyzes code needs one. JSLint already parses JavaScript, and exposes the results:

JSLINT.jslint("var a = 1")
JSLINT.data()

Here is an example from the parse tree, representing a single “var” keyword:

1: Object
dead: false
from: 10
function: Object
identifier: true
init: true
kind: "var"
line: 1
master: undefined
statement: true
string: "c"
thru: 11
used: 0
writeable: false

If you only inspect a single function’s source, there is no guarantee this will find a closure, since we’re parsing isolated fragments of JavaScript. Consequently, what I’m demonstrating here returns a superset of the closed-over variables, but practically speaking it’s pretty good.
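One way to narrow that superset, sketched here under my own assumptions rather than as anything JSLint provides, is to drop names that already resolve on the global object. The `candidates` list and the `dropKnownGlobals` helper are hypothetical.

```javascript
// Hypothetical analysis result: names the parser could not resolve.
var candidates = ["hiddenState", "Math", "console"];

// Drop names that already exist on the global object; what is left is
// more likely to be closed-over state (or a plain typo).
function dropKnownGlobals(names) {
  return names.filter(function (name) {
    return !(name in globalThis);
  });
}

var remaining = dropKnownGlobals(candidates);
console.log(remaining); // only "hiddenState" is left
```

The filter is deliberately crude: a closed-over variable that happens to share a name with a global would be discarded too.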

Consider this example:

a = "b";
"b"
function c() { return a; }
c()
"b"
d = c.toString()
"function c() { return a; }"

JSLINT.jslint(d)
JSLINT.data()

If we just run this on a function on its own (which is what I want to do), we have a problem: as far as the parser can tell, this isn’t a closure at all, but a function with an undefined variable. I’m going to assume the library I’m analyzing works, though, and count this as good enough.

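var text = "function c(){ return a; } ";
JSLINT.jslint(text);

// `raw` holds the message template with a literal {a} placeholder;
// `a` holds the variable name, so compare against the fixed template.
var template = "'{a}' was used before it was defined.";

$.each(JSLINT.data().errors, function f(i, obj) {
  // Skip any empty entries in the error list.
  if (obj && obj.raw === template) {
    console.log("Possible Closure: " + obj.a);
  }
});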

Possible Closure: a

And it works!
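The same check can be wrapped in a small helper that returns the names instead of logging them. This is a sketch under assumptions: `possibleClosures` is a hypothetical name, the error objects are assumed to carry the `a` and `raw` fields used above, and `sampleErrors` is hand-written stand-in data so the snippet runs without JSLint loaded.

```javascript
// Template text as it appears in the `raw` field (assumed from the output above).
var UNDEFINED_TEMPLATE = "'{a}' was used before it was defined.";

// Collect each undefined-variable name once, in order of first appearance.
function possibleClosures(errors) {
  var seen = {};
  var names = [];
  errors.forEach(function (err) {
    // Skip empty entries and unrelated warnings.
    if (!err || err.raw !== UNDEFINED_TEMPLATE) { return; }
    if (!Object.prototype.hasOwnProperty.call(seen, err.a)) {
      seen[err.a] = true;
      names.push(err.a);
    }
  });
  return names;
}

// Hand-written stand-in for JSLINT.data().errors; the second message is made up.
var sampleErrors = [
  { a: "a", raw: UNDEFINED_TEMPLATE },
  { a: "a", raw: UNDEFINED_TEMPLATE },
  { a: "x", raw: "Some other made-up message about '{a}'." },
  null
];

console.log(possibleClosures(sampleErrors)); // only "a" is reported
```

With JSLint loaded, the call would presumably be `possibleClosures(JSLINT.data().errors)` after running `JSLINT.jslint(text)`.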

In the second example, we’ll define the variable. Strictly speaking, “a” is a global, but if this had been extracted from deep within a library, it might be a closure. You can inspect the JSLint results to find this first class of closures:

var text = "var a = 'b'; function c(){ return a; } ";
JSLINT.jslint(text);

$.each(JSLINT.data().tokens, function f(i, obj) {
  if ('function' === obj.kind) {
    console.log(obj.function.global);
  }
});

["a"]

Now, just for completeness, consider a real closure, which JSLint can find on its own:

var text = "function f() { var a = 'b'; function c(){ return a; } }";
JSLINT.jslint(text);

$.each(JSLINT.data().tokens, function f(i, obj) {
  if ('function' === obj.kind && obj.function.closure && obj.function.closure.length > 0) {
    console.log(obj.function.closure);
  }
});
["a"]

At this point, I’m not sure what you can do with this information, but at least it demonstrates some interesting issues.
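One possible use ties back to the earlier idea of running code within the function’s context. If you can edit the library source, a direct `eval` placed inside the closure can read the names the analysis reports. This is only a debugging sketch; `makeAdder`, `plusFive` and `peek` are hypothetical names.

```javascript
// Hypothetical example: expose a debugging hook from inside the closure.
function makeAdder(a) {
  function add(b) { return a + b; }
  // A direct eval runs in this scope, so it can see the closed-over `a`.
  add.peek = function (name) { return eval(name); };
  return add;
}

var plusFive = makeAdder(5);
console.log(plusFive(8));        // 13
console.log(plusFive.peek("a")); // 5
```

Because `eval` executes whatever string it is given, a hook like this belongs in a debugging build only.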