
        *****************************************************
        *****************************************************
        *****                                           *****
        *****   THE DATA-RACE ANALYZER FOR C            *****
        *****                                           *****
        *****   Version: 1.0            Aug. 13, 2002   *****
        *****                                           *****
        *****   Authors:                Helmut Seidl    *****
        *****                           Varmo Vene      *****
        *****                                           *****
        *****                           Uni Trier       *****
        *****                                           *****
        *****************************************************
        *****************************************************
 
 
INTRODUCTION:
=============
The goal of our tool is to enhance reliability of multi-threaded C code 
by using program analyzer technology for obtaining sanity checks or even 
certificates for absence of certain programming errors.  This type of 
application does not demand analyzers which run in a few seconds --- but 
flag thousands of unnecessary warnings which later-on must be checked manually 
by highly paid software engineers.  We are clearly willing to spend some minutes
analysis time on larger programs --- given only that the number of spurious errors 
is dramatically decreased.  Therefore, we aim at a good balance between precision 
and analysis time.
 
The analysis of multi-threaded programs is considered notoriously difficult 
and expensive.  In fact, precise analyses are known for some restricted classes
of parallel programs but for very simple program properties only.  In order to arrive 
at the necessary precision for a non-trivial fragment of C, however, we have, e.g.,
to resolve function pointers and integrate some form of **points-to** analysis.
Also, we have to take into account the possible interference between the execution of 
different threads.

The key observation for our inter-procedural C analyzer is that Posix threads communicate 
through global variables. In order to separate the analysis of the different threads,
we attempt to infer for each global variable one **single** value which safely approximates 
**all** possible states of the global variable. This single invariant for the globals then 
is used for analyzing each thread individually.  The separation is particularly successful 
in applications where the threads are only loosely coupled, i.e., where the control-flow
mainly depends on the values of locals.  

We are left with the task of inferring an invariant that is as tight as possible. The basic idea 
for this approximation is to refine a **partial invariant** during the fixpoint iteration 
by collecting **side-effects** of constraint evaluation. For this, we use a suitably
enhanced application-independent local fixpoint engine. The application-independence 
is particularly important for the development of a high-quality analyzer, as it allows us 
to separate tracking of programming errors in the solving machinery from tracking 
specification errors in the analysis itself.

The implementation was done in Standard ML using the ckit as our C frontend.  We used the 
framework to implement various analyses for the detection of data-races.  For efficiency
reasons we organized the analysis as a multi-stage procedure.  In the first stage, we determine 
approximate data values for all globals. This first invariant is then used in the following 
stages which additionally track acquired mutex locks.  The implemented analyses handle most of 
the Posix threads library interface. The implemented analyzers were tested on preliminary versions 
of a large (non-safety critical) on-board program provided by AIRBUS FRANCE. The whole system 
consists of seven components ranging in size from 23,000 to 80,000 LOC (before pre-processing and 
excluding header files).  The analysis was performed on a 1 Gigahertz Athlon with 1 GB memory using 
SUSE Linux and the SML compiler smlnj-110.0.7.

The analysis times for these components varied from a few minutes to less than half an hour.  
The numbers of flagged potential data-race errors were small enough to be manually inspected by 
humans.  Since our analyses compute **safe supersets** of data-race errors, these experiments 
clearly indicate the high quality of the analyzed software.  They also demonstrate that the 
global invariant approach is sufficiently efficient to deal with real-world software components 
and still precise enough to flag relatively few alarms.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

CONTENTS:
=========

A.	Installing the Analyzer

B.	Using the Analyzer

C. 	Possible Outputs

	(0)	The Intermediate Messages
	(1)	The Different View Items
	(2)	The Result Files

D.	Understanding the Results

E.	Limitations

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


A. Installing the Analyzer:
===========================

(0)	Install smlnj, e.g., version 110.0.7
	Install the gtk library, version 1.2

(1)	Go into the main directory of our analyzer.
	Set the environment variable MUTEX to this directory.
	If, e.g., this directory is: 	
	
			/home/dummy/Mutex
	you should set:
			setenv MUTEX /home/dummy/Mutex

	In the main directory, you should find the following files:


	prepro.sh 		-- a shell script which you probably will 
				-- use to pre-process your C sources.

	analyzerGui		-- the executable starting the (simple) Graphical
				-- user interface for our tool. 

	mutexrc			-- a text file by which you can customize the
				-- visual appearance of the GUI.

	Furthermore, you will find there the sub-directories:

	bin			-- contains the pre-generated heap image for SML.
	aux			-- receives the result files produced by the analyzer.
	examples		-- contains some example C files for testing the analyzer.

B. Using the Analyzer:
======================

In order to start the GUI for our analyzer, you should run the executable:

	analyzerGui

If everything works fine, a window should pop up providing you with a menu
bar together with three further buttons. 

(1)	The Menu Bar:
	--------------
	The menu "File" provides you with possibilities to choose the
	input system to be analyzed or quit. Currently, we support 
	three forms of systems corresponding to the following three
	items:

	**	File:		The whole program resides inside one file;
	**	Directory:	The whole program resides within one directory
				(possibly distributed over several files);
	**	Project:	The system is described by a project file
				which contains the (relative) file descriptors
				of all C programs of the project.

	If you select one of these items in the menu, a file selector pops up
	which allows you to perform the necessary selection. The result of this
	selection will be displayed in the bottom-line of the main window.

	The menu "View" allows you to inspect one of the produced outputs of the
	analysis. We will return to this item in C.1 below.

	The menus "Options" and "Help" are still empty :-)

(2)	The Buttons:
	------------
	The two buttons to the left allow you to adjust the analysis:

	**	Failing Locks:	By default, the analyzer assumes that mutex locks
				always succeed. If this assumption is not guaranteed
				in your application, you should press this button.
				The analyzer then only concludes a mutex to be locked
				if the return value of the corresponding call is 0.

	**	Powersets:	In some cases, the set of locked mutexes does not only
				depend on the reached program point but also on the values
				of certain locals. In order to increase the precision of
				the analysis, you may in this case press this button.
				WARNING: the resulting analysis probably is (slightly)
				slower and more space consuming.

	The button "Analyze" to the right then starts the analysis.
	The intermediate messages on the current stage of the analysis are displayed
	inside the (scrollable) main window of the GUI.

	NOTE:	the GUI is blocked until the analysis has terminated.

C. Possible Outputs:
====================

The (currently implemented) mutex analysis dumps some information to the main GUI window.
Besides this, it produces two result files and four files with extra information to the user. 
The extra information can also be inspected by means of the menu "View". 

We start the explanation with the intermediate messages inside the main GUI window.

(0) 	The Intermediate Messages:
	--------------------------

	Information about the current stage of the analysis is displayed inside the 
	(scrollable) main window of the GUI. Such information may look like:

   +------------------------------------------------------------------------------------+
   |	Preprocessing ...								|
   |	Iterating ...									|
   |											|
   |	Message: pthread_create at: "/home/bilbo/seidl/DAEDALUS/tmp/99.c":50.42-44	|
   |											|
   |	Warning: unknown ref par. in pthread_create ...					|
   |											|
   |	Cumulated number of evaluated constraints: 	29				|
   |											|
   |											|
   |	Elapsed time: 0s								|
   |											|
   |											|
   |	Preprocessing for main analysis ...						|
   |	Iterating main analysis ...							|
   |											|
   |	Message: pthread_create at: "/home/bilbo/seidl/DAEDALUS/tmp/99.c":50.42-44	|
   |											|
   |	Warning: unknown ref par. in pthread_create ...					|
   |											|
   |	Cumulated number of evaluated constraints: 	29				|
   |											|
   |											|
   |	Elapsed time: 0s								|
   |											|	
   +------------------------------------------------------------------------------------+

	You see status messages (like "Preprocessing ...").

	Furthermore, some statistical information is dumped such as the elapsed time
	(in full seconds only) and the number of evaluated constraints.

	Interspersed with status messages, you may find printouts of found calls to 
	"pthread_create" together with more serious "Warnings" like:
	"unknown ref parameters". Valuable data possibly can be overwritten
	in these places. Such warnings should therefore be taken seriously. 
	The analyzer here assumes that nothing harmful will happen!

	Here is a list of situations where (currently) warnings are issued:
 
	** 	if a created thread receives a reference to local variables
   		of the calling thread (occurs in one of your benchmarks :-);
 	
	** 	if an assignment has to be processed where the lval (address)
   		of the left-hand side cannot be determined by the analyzer;
 	
	** 	a similar warning in case of reference parameters of
   		unknown functions;
 	
	** 	a similar warning in case of calls to unknown functions.
 	
	These warnings refer to potential imprecision/incompleteness
	of the analyzer. So, in a reliable analyzer run, such warnings
	should NOT pop up.
 	
	The interesting warnings are those which refer to potential
	mal-function of the program. In the present analysis, there is
	currently only one:
 	
	** 	if a mutex is unlocked -- although it is (possibly) not owned by 
		the current thread.
	
(1)	The Different View Items:
	--------------------------
	**	Warnings:	This is, perhaps, the most important item here.
				It displays all global variables where mutual
				exclusion possibly is violated. One such example
				message could read:

		[RIP_Ri_EchecAudit] = 1: RIP_Ri_CanalSorties.Controle, 1: , 
			main <1, 2, 3, 4, 5, 6, 7>: RIP_Ri_CanalSorties.Controle
		
		This means that the global "RIP_Ri_EchecAudit" is accessed by thread 1
		once with locked mutex "RIP_Ri_CanalSorties.Controle" and once without.
		Furthermore it is accessed by the main thread - after creation of the
		threads 1, 2, 3, 4, 5, 6, 7.
	
	**	Uncalled functions:
				This item lists all functions which are defined within 
				the currently selected system but never called.
	**	External functions:
				This item lists all functions which are called within
                                the currently selected system but not defined.
	**	Nonterminating calls:
				This item lists all function calls which are definitely
				found not to terminate regularly. Example:

			RNO_Se_StartMgr ({argv%65326: [[<- ->]], argn%65325: <- ->})

		This means that a non-terminating call to the function "RNO_Se_StartMgr"
		has been found where the values of the formal parameters "argv" and "argn"
		are given by an array of arrays of unknown int's and an unknown int,
		respectively. NOTE: suffixes like "%65326" are used to disambiguate
		names of locals.
	
(2)	The Result Files:
	-----------------
	The analyzer dumps all information about globals and analyzed function calls.
	This information is split into two files.

	**	The File "base.txt"

		This file contains the computed safe approximation to the values
		of globals as well as for every analyzed function call, the values
		of its locals at return. This information may, e.g., look as follows:
	
		[#112073] = "    REQUEST"	-- an abstract heap location holding a
						-- string constant

		[RGD_Ri_SwitchYesNo] = <- ->	-- a global holding an unknown int value

		[RGD_Ri_RouterDebugPhysique] = struct { Lignes = &(RGD_Ri_LignesRouterDebug[], null); 
							Index = &(null, RGD_Ri_RouterDebugIndex[]); }

						-- a global holding a struct consisting of two potential
						-- pointers to array elements

		RSH_Se_SetValidite({Rf_Validite%56430: <+ 1 +>, Rf_Param%56429: <+ 3 +>}) = 
			{Rf_Validite%56430: <+ 1 +>, Rf_Param%56429: <+ 3 +>, ValiditePrecedente%56431: ?}

						-- a function call:
						--
						-- 	RSH_Se_SetValidite(Rf_Validite,Rf_Param)
						-- 
						-- where the formals are bound to 1 and 3 where the extra
						-- local "ValiditePrecedente" receives an unknown value

	**	The File "result.txt"

		This file contains the computed information about the definitely locked mutexes
		at a (possibly concurrent) write access to a global and the information about 
		locked mutexes at function calls. In case of possibly failing locks, the
		determined information consists of a set of definitely held mutexes together
		with a set of additionally possibly held ones.

D. 	Understanding the Results:
==================================

Globals are identified through square brackets. So, 

	[x]    represents the global variable x. 
	[#123] represents a heap object with (abstract)
	       heap address 123.

NOTE: there is one specific "global variable",
namely: 
	"$" (our representation of unknown shared memory).

The information computed for global variables in file 
ThreadMayMutex.a lists all pairs of "abstract thread id"s and
sets of mutexes held when writing to this variable. Thus,

	[x] = (main: HMA_Ri_MutexMcdu)

... means that variable x is written to by the main thread 
while the mutex "HMA_Ri_MutexMcdu" is held.

If the result is:

	[x] = (main <1, 2>: HMA_Ri_MutexMcdu), (2: ) 

... then the access of main occurs **after creation** of threads 1 and 2,
while the write access by thread 2 is not protected by a mutex 
(-> potential data-race!)

NOTE: Thread id's are invented by the analyzer according to the creation point 
and local state of the thread. In particular, for accesses of the main thread, 
we additionally record which threads already have been created by main.
Thus, we do not flag a warning if the main thread performs an unprotected access to
a variable before a competing thread has been created.

The information computed for global variables in file base.txt describes everything 
the analyzer could find out about the potential values of variables. So, we have 
representations for int's, references, arrays and structs/unions.  Perhaps this 
part of the information is not of much value to you - except for checking how 
smart/dull our analyzer is or, even, where it is WRONG ...

In particular, our information about int values takes one of the following forms:

	<+ 42 +>  		// definite value 42
	<- 0 5 9 ->		// excluded list of values


In principle, many more warnings besides these could pop up. 
Here is a list of situations where (currently) warnings 
are issued:

** if a created thread receives a reference to local variables
   of the calling thread (occurs in one of your benchmarks :-);

** if an assignment has to be processed where the lval (address)
   of the left-hand side cannot be determined by the analyzer;

** a similar warning in case of reference parameters of
   unknown functions;

** a similar warning in case of calls to unknown functions.

These warnings refer to potential imprecision/incompleteness 
of the analyzer. So, in a reliable analyzer run, such warnings
should NOT pop up. 

The interesting warnings are those which refer to potential
mal-function of the program. In the present analysis, these are
currently the following:

** if a mutex is unlocked -- although it is not owned by the
   current thread;
** if a global is accessed by threads having different thread id's
   and non-intersecting sets of mutexes.

E.	Limitations:
====================

The present version of our data-race analyzer is a research prototype. 
We cannot give any warranty that all data-race errors are detected
or that nothing harmful will happen if no warning is issued. Such potential
failures may be due to programming errors within the tool, but also
due to various assumptions about C programs which we made during
construction and whose violation may invalidate the reported results of the 
analyzer. So, our analyzer takes for granted that

**	array accesses are never out of bound;
**	null pointers are never dereferenced;
**	unions can be treated like structs;
**	unknown functions with reference parameters only modify 
	the storage cells pointed at. In particular they do not
	modify any global not explicitly pointed at;

[Comment: this is clearly a very strong assumption. On the other hand,
no valuable information can be inferred otherwise.]
	
**	string manipulation functions are used "modestly".
	We only track "strcpy" and "strcmp". For the first one,
	we assume that the target string is completely overwritten.
	
[Comment: If this is violated, the analyzer possibly might infer wrong
strings. If such computed strings later-on affect the control-flow, e.g.,
through queries of "strcmp", wrong results may follow.]

**	mutexes can be identified by abstract addresses; 

[Comment: data-races are possibly missed if arrays of mutexes are used
or heap-allocated data-structures containing mutexes.]

**	threads can be identified by program point and local state; 

[Comment: data-races are possibly missed if threads are created
iteratively in loops or by recursive function calls.]

