#+TITLE: UglifyJS -- a JavaScript parser/compressor/beautifier
#+KEYWORDS: javascript, js, parser, compiler, compressor, mangle, minify, minifier
#+DESCRIPTION: a JavaScript parser/compressor/beautifier in JavaScript
#+STYLE:
#+AUTHOR: Mihai Bazon
#+EMAIL: mihai.bazon@gmail.com

* UglifyJS --- a JavaScript parser/compressor/beautifier

This package implements a general-purpose JavaScript
parser/compressor/beautifier toolkit. It is developed on
[[http://nodejs.org/][NodeJS]], but it should work on any JavaScript
platform supporting the CommonJS module system (and if your platform of
choice doesn't support CommonJS, you can easily implement it, or discard
the =exports.*= lines from UglifyJS sources).

The tokenizer/parser generates an abstract syntax tree from JS code. You
can then traverse the AST to learn more about the code, or do various
manipulations on it. This part is implemented in
[[../lib/parse-js.js][parse-js.js]] and it's a port to JavaScript of the
excellent [[http://marijn.haverbeke.nl/parse-js/][parse-js]] Common Lisp
library from [[http://marijn.haverbeke.nl/][Marijn Haverbeke]].

(See [[http://github.com/mishoo/cl-uglify-js][cl-uglify-js]] if you're
looking for the Common Lisp version of UglifyJS.)

The second part of this package, implemented in
[[../lib/process.js][process.js]], inspects and manipulates the AST
generated by the parser to provide the following:

- ability to re-generate JavaScript code from the AST. Optionally
  indented---you can use this if you want to “beautify” a program that has
  been compressed, so that you can inspect the source. But you can also run
  our code generator to print out an AST without any whitespace, so you
  achieve compression as well.

- shorten variable names (usually to single characters). Our mangler will
  analyze the code and generate proper variable names, depending on scope
  and usage, and is smart enough to deal with globals defined elsewhere, or
  with =eval()= calls or =with{}= statements.
  In short, if =eval()= or =with{}= are used in some scope, then all
  variables in that scope and any variables in the parent scopes will
  remain unmangled, and any references to such variables remain unmangled
  as well.

- various small optimizations that may lead to faster code but certainly
  lead to smaller code. Where possible, we do the following:

  - foo["bar"] ==> foo.bar

  - remove block brackets ={}=

  - join consecutive var declarations:
    var a = 10; var b = 20; ==> var a=10,b=20;

  - resolve simple constant expressions: 1 + 2 * 3 ==> 7. We only do the
    replacement if the result occupies fewer bytes; for example 1/3 would
    translate to 0.333333333333, so in this case we don't replace it.

  - consecutive statements in blocks are merged into a sequence; in many
    cases, this leaves blocks with a single statement, so then we can
    remove the block brackets.

  - various optimizations for IF statements:

    - if (foo) bar(); else baz(); ==> foo?bar():baz();
    - if (!foo) bar(); else baz(); ==> foo?baz():bar();
    - if (foo) bar(); ==> foo&&bar();
    - if (!foo) bar(); ==> foo||bar();
    - if (foo) return bar(); else return baz(); ==> return foo?bar():baz();
    - if (foo) return bar(); else something(); ==> {if(foo)return bar();something()}

  - remove some unreachable code and warn about it (code that follows a
    =return=, =throw=, =break= or =continue= statement, except
    function/variable declarations).

** Unsafe transformations

The following transformations can in theory break code, although they're
probably safe in most practical cases. To enable them you need to pass the
=--unsafe= flag.

*** Calls involving the global Array constructor

The following transformations occur:

#+BEGIN_SRC js
new Array(1, 2, 3, 4)  => [1,2,3,4]
Array(a, b, c)         => [a,b,c]
new Array(5)           => Array(5)
new Array(a)           => Array(a)
#+END_SRC

These are all safe if the Array name isn't redefined. JavaScript does allow
one to globally redefine Array (and pretty much everything, in fact) but I
personally don't see why anyone would do that.

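To make the caveat concrete, here is a minimal sketch that is not part of
UglifyJS. The helper names =withBuiltinArray= and =withShadowedArray= are
made up for illustration, and a local function declaration stands in for a
redefined Array. It suggests that the original and rewritten forms agree
while Array is the built-in constructor, and stop agreeing once the name
refers to something else.

#+BEGIN_SRC js
// Hypothetical helpers for illustration only; they are not UglifyJS APIs.

// With the built-in Array, each original form matches its rewritten form.
function withBuiltinArray() {
    return {
        original: new Array(1, 2, 3, 4),       // before the rewrite
        rewritten: [1, 2, 3, 4],               // after the rewrite
        sized: new Array(5).length,            // before: new Array(5)
        sizedRewritten: Array(5).length        // after:  Array(5)
    };
}

// Here Array is shadowed by a local function (an assumption standing in
// for any redefinition), so blindly rewriting the call would change the
// result.
function withShadowedArray() {
    function Array() {
        return { custom: true, argCount: arguments.length };
    }
    return {
        original: new Array(1, 2, 3, 4),       // calls the local function
        rewritten: [1, 2, 3, 4]                // always a real array
    };
}

var builtin = withBuiltinArray();
var shadowed = withShadowedArray();

console.log(JSON.stringify(builtin.original) === JSON.stringify(builtin.rewritten));
console.log(builtin.sized === builtin.sizedRewritten);
console.log(shadowed.original.custom === true);
console.log(Array.isArray(shadowed.rewritten));
#+END_SRC

All four checks should print =true=: the first two show the forms agreeing,
the last two show the shadowed call returning a custom object where the
literal still yields a real array.
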
UglifyJS does handle the case where Array is redefined locally, or even
globally but with a =function= or =var= declaration. Therefore, in the
following cases UglifyJS *doesn't touch* calls or instantiations of Array:

#+BEGIN_SRC js
// case 1.  globally declared variable
  var Array;
  new Array(1, 2, 3);
  Array(a, b);

  // or (can be declared later)
  new Array(1, 2, 3);
  var Array;

  // or (can be a function)
  new Array(1, 2, 3);
  function Array() { ... }

// case 2.  declared in a function
  (function(){
    a = new Array(1, 2, 3);
    b = Array(5, 6);
    var Array;
  })();

  // or
  (function(Array){
    return Array(5, 6, 7);
  })();

  // or
  (function(){
    return new Array(1, 2, 3, 4);
    function Array() { ... }
  })();

  // etc.
#+END_SRC

*** =obj.toString()= ==> =obj+""=

** Install (NPM)

UglifyJS is now available through NPM --- =npm install uglify-js= should do
the job.

** Install latest code from GitHub

#+BEGIN_SRC sh
## clone the repository
mkdir -p /where/you/wanna/put/it
cd /where/you/wanna/put/it
git clone git://github.com/mishoo/UglifyJS.git

## make the module available to Node
mkdir -p ~/.node_libraries/
cd ~/.node_libraries/
ln -s /where/you/wanna/put/it/UglifyJS/uglify-js.js

## and if you want the CLI script too:
mkdir -p ~/bin
cd ~/bin
ln -s /where/you/wanna/put/it/UglifyJS/bin/uglifyjs
  # (then add ~/bin to your $PATH if it's not there already)
#+END_SRC

** Usage

There is a command-line tool that exposes the functionality of this library
for your shell-scripting needs:

#+BEGIN_SRC sh
uglifyjs [ options... ] [ filename ]
#+END_SRC

=filename= should be the last argument and should name the file from which
to read the JavaScript code. If you don't specify it, it will read code
from STDIN.

Supported options:

- =-b= or =--beautify= --- output indented code; when passed, additional
  options control the beautifier:

  - =-i N= or =--indent N= --- indentation level (number of spaces)

  - =-q= or =--quote-keys= --- quote keys in literal objects (by default,
    only keys that cannot be identifier names will be quoted).

- =--ascii= --- pass this argument to encode non-ASCII characters as
  =\uXXXX= sequences. By default UglifyJS won't bother to do it and will
  output Unicode characters instead. (The output is always encoded in
  UTF-8, but if you pass this option you'll only get ASCII.)

- =-nm= or =--no-mangle= --- don't mangle variable names

- =-ns= or =--no-squeeze= --- don't call =ast_squeeze()= (which does
  various optimizations that result in smaller, less readable code).

- =-mt= or =--mangle-toplevel= --- mangle names in the toplevel scope too
  (by default we don't do this).

- =--no-seqs= --- when =ast_squeeze()= is called (thus, unless you pass
  =--no-squeeze=) it will reduce consecutive statements in blocks into a
  sequence. For example, "a = 10; b = 20; foo();" will be written as
  "a=10,b=20,foo();". On various occasions, this allows us to discard the
  block brackets (since the block becomes a single statement). This is ON
  by default because it seems safe and saves a few hundred bytes on some
  libs that I tested it on, but pass =--no-seqs= to disable it.

- =--no-dead-code= --- by default, UglifyJS will remove code that is
  obviously unreachable (code that follows a =return=, =throw=, =break= or
  =continue= statement and is not a function/variable declaration). Pass
  this option to disable this optimization.

- =-nc= or =--no-copyright= --- by default, =uglifyjs= will keep the
  initial comment tokens in the generated code (assumed to be copyright
  information etc.). If you pass this option, it will discard them.

- =-o filename= or =--output filename= --- put the result in =filename=. If
  this isn't given, the result goes to standard output (or see next one).

- =--overwrite= --- if the code is read from a file (not from STDIN) and
  you pass =--overwrite= then the output will be written to the same file.

- =--ast= --- pass this if you want to get the Abstract Syntax Tree instead
  of JavaScript as output. Useful for debugging or learning more about the
  internals.

- =-v= or =--verbose= --- output some notes on STDERR (for now just how
  long each operation takes).

- =--unsafe= --- enable additional optimizations that are known to be
  unsafe in some contrived situations, but could still be generally useful.
  For now only this:

  - foo.toString() ==> foo+""

- =--max-line-len= (default 32K characters) --- add a newline after around
  32K characters. I've seen both FF and Chrome croak when all the code was
  on a single line of around 670K. Pass =--max-line-len 0= to disable this
  safety feature.

- =--reserved-names= --- some libraries rely on certain names to be used,
  as pointed out in issues #92 and #81, so this option allows you to
  exclude such names from the mangler. For example, to keep names =require=
  and =$super= intact you'd specify =--reserved-names "require,$super"=.

- =--inline-script= -- when you want to include the output literally in an HTML =