README in metasm-1.0.1 vs README in metasm-1.0.2
- old
+ new
@@ -19,11 +19,15 @@
Ready-to-use scripts can be found in the samples/ subdirectory, check the
comments in the scripts headers. You can also try the --help argument if
you're feeling lucky.
+For more information, check the doc/ subdirectory. The text files can be
+compiled to html using the misc/txt2html.rb script.
+
+
Here is a short overview of the Metasm internals.
Assembly:
@@ -165,12 +169,12 @@
You can encode/decode an ExeFormat (ie decode sections, imports, headers etc)
Constructor: ExeFormat.decode_file(str), ExeFormat.decode_file_header(str)
Methods: ExeFormat#encode_file(filename), ExeFormat#encode_string
-PE and ELF files have a LoadedPE/LoadedELF counterpart, that is able to work
-with memory-mmaped versions of those formats (e.g. to debugging running
+PE and ELF files have LoadedPE/LoadedELF counterparts, which are able to work
+with memory-mapped versions of those formats (e.g. to debug running
processes)
VirtualString:
@@ -196,33 +200,37 @@
disassembly/patching easily (using LoadedPE/LoadedELF as ExeFormat)
Debugging:
-Metasm includes a few interfaces to allow live debugging.
+Metasm includes a few interfaces to handle debugging.
The WinOS and LinOS classes offer access to the underlying OS processes (e.g.
OS.current.find_process('foobar') will retrieve a running process with foobar
in its filename ; then process.mem can be used to access its memory.)
-The Windows and Linux debugging APIs (x86 only) have a basic ruby interface
-(PTrace32, extended in samples/rubstop.rb ; and WinDBG, a simple mapping of the
-windows debugging API) ; those will be more worked on/integrated in the future.
+The Windows and Linux low-level debugging APIs have a basic ruby interface
+(PTrace and WinAPI) ; which are used by the unified high-end Debugger class.
+Remote debugging is supported through the GDB server wire protocol.
-A linux console debugging interface is available in samples/lindebug.rb ; it
-uses a SoftICE-like look and feel.
-This interface can talk to a gdb-server through samples/gdbclient.rb ; use
-[udp:]<host:port> as target.
+High-level debuggers can be created with the following ruby line:
+Metasm::OS.current.create_debugger('foo')
-The disassembler scripts allow live process interaction by using as target
-'live:<pid or part of filename>'.
+Only one kind of host debugger class can exist at a time ; to debug multiple
+processes, attach to other processes using the existing class. This is due
+to the way the OS debugging API works on Windows and Linux.
-A generic debugging interface is available, it is defined in metasm/os/main.rb
-It may be accessed using the Metasm::OS.current.create_debugger('foo')
+The low-level backends are defined in the os/ subdirectory, the front-end is
+defined in debug.rb.
-It can be viewed in action using the GUI and 'open live' target.
+A linux console debugging interface is available in samples/lindebug.rb ; it
+uses a (simplified) SoftICE-like look and feel.
+It can talk to a gdb-server socket ; use a [udp:]<host:port> target.
+The disassembler-gui sample allows live process interaction when using
+'live:<pid or part of program name>' as target.
+
C Parser:
Metasm includes a hand-written C Parser.
It handles all the constructs I am aware of, except hex floats:
- static const L"bla"
@@ -234,12 +242,16 @@
- C99 declarators - type bla = { [ 2 ... 14 ].toto = 28 };
- Nested functions
- __int8 etc native types
- Label addresses (&&label)
Also note that all those things are parsed, but most of them will fail to
-compile on the Ia32 backend (the only one implemented so far.)
+compile on the Ia32/X64 backends (the only ones implemented so far.)
+Parsing C files should be done using an existing ExeFormat, with the
+parse_c_file method. This ensures that format-specific macros/ABI are correctly
+defined (e.g. size of the 'long' type, ABI to pass parameters to functions, etc.)
+
When you parse a C String using C::Parser.parse(text), you receive a Parser
object. It holds a #toplevel field, which is a C::Block, which holds #structs,
#symbols and #statements. The top-level functions are found in the #symbol hash
whose keys are the symbol names, associated to a C::Variable object holding
the functions. The function parameter/attributes are accessible through
@@ -247,18 +259,14 @@
Under it you'll find a tree-like structure of C::Statements (If, While, Asm,
CExpressions...)
A C::Parser may be #precompiled to transform it into a simplified version that
is easier to compile: typedefs are removed, control sequences are transformed
-in if () goto ; etc.
+into 'if (XX) goto YY;' etc.
To compile a C program, use PE/ELF.compile_c, that will create a C::Parser with
exe-specific macros defined (eg __PE__ or __ELF__).
-The prefered way to create a C::Parser is to initialize it with a CPU and the
-desired ExeFormat, so that it is
-correctly initialized (eg type sizes: is long 4 or 8 bytes? etc) ; and
-may define preprocessor macros needed to correctly parse standard headers.
Vendor-specific headers may need to use either #pragma prepare_visualstudio
(to parse the Microsoft Visual Studio headers) or prepare_gcc (for gcc), the
latter may be auto-detected (or may not).
Vendor headers tested are VS2003 (incl. DDK) and gcc4 ; ymmv.