Skip to content

Commit

Permalink
Make small proofreading edits to README
Browse files Browse the repository at this point in the history
  • Loading branch information
amyjzhu committed May 1, 2018
1 parent 7be61dc commit 4f68cf4
Showing 1 changed file with 31 additions and 31 deletions.
62 changes: 31 additions & 31 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[![Screenshot](https://bitbucket.org/bestchai/dinv/raw/default/figs/play-demo-video.jpg)](https://www.youtube.com/watch?v=n9fH9ABJ6S4)

A demo video explaining Dinv.
A demo video explaining DInv.


# DInv is a distributed system data invariant detector
Expand Down Expand Up @@ -46,11 +46,11 @@ More concretely, DInv analyzes Go programs and can:

DInv is written in [go lang](http://golang.org/) and requires a working installation of [Daikon](http://plse.cs.washington.edu/daikon/). The following instructions are for Ubuntu 14.04.

The version of Go available through the apt package manager is out of date. Please [download go from the offical website](https://golang.org/dl/) and follow the [installation instructions](https://golang.org/doc/install#install) to setup a go workspace. The latest version of Go that DInv is confirmed to work with is `1.6.3`.
The version of Go available through the apt package manager is out of date. Please [download go from the offical website](https://golang.org/dl/) and follow the [installation instructions](https://golang.org/doc/install#install) to set up a go workspace. The latest version of Go that DInv is confirmed to work with is `1.6.3`.

(Note that DInv cannot be compiled using gccgo. Running DInv with gccgo will cause spurious errors.)

DInv is built to run within a standard go workspace environment. Go workspaces provide a mechanism for importing, and running the source code of bitbucket and github repositories. This interface is used throughout the installation, and is necessary for the instrumentation process. For detailed instructions on how to configure a go workspace see [how to write go code](http://golang.org/doc/code.html#Organization).
DInv is built to run within a standard go workspace environment. Go workspaces provide a mechanism for importing and running the source code of bitbucket and github repositories. This interface is used throughout the installation, and is necessary for the instrumentation process. For detailed instructions on how to configure a go workspace see [how to write go code](http://golang.org/doc/code.html#Organization).

The `GOPATH` environment variable is a reference to the root of your go workspace. This variable must be set to use both the `go get` instruction, as well as our example programs.

Expand All @@ -64,11 +64,11 @@ To clone the repository run the following commands

`hg clone https://bitbucket.org/bestchai/dinv`

For a quick start to using Dinv check out out [tutorial](./TUTORIAL.md)
For a quick start to using DInv check out the [tutorial](./TUTORIAL.md)

## Dependencies

Dinv depends on a number of remote repositories, you can install these by running `dependencies.sh`:
DInv depends on a number of remote repositories, you can install these by running `dependencies.sh`:

* github.com/godoctor/godoctor/analysis/cfg
* github.com/arcaneiceman/GoVector/govec/vclock
Expand All @@ -87,7 +87,7 @@ To infer invariants on the trace files produced by DInv
After you install the above dependencies, install DInv by running:
`go install bitbucket.org/bestchai/dinv`

The dinv binary will be generated at `$GOPATH/bin/dinv`.
The DInv binary will be generated at `$GOPATH/bin/dinv`.



Expand Down Expand Up @@ -127,10 +127,10 @@ code can be processed by the instrumenter. Instrumentation occurs at
the directory level. All source files within the directory will be
analyzed for annotations. The result of instrumentation is two
directories. If `foo` was the directory given to the instrumenter, the
result would be `foo` and `foo_orig`. `foo_orig` Contains the original
result would be `foo` and `foo_orig`. `foo_orig` contains the original
files, and `foo` contains instrumented code.

Running instrumented code produces the logs dinv uses to detect
Running instrumented code produces the logs DInv uses to detect
invariants. Instrumented directories retain the name of the original,
preserving external dependencies. You should be able to run
instrumented code, using identical methods for uninstrumented code.
Expand Down Expand Up @@ -166,7 +166,7 @@ The variables logged by dump code are only those which are in scope, and have th
10 sendMsg(Pack(response))
```

In this example the variables collected by the dump annotation include {`buf`,`msg`,`response`}. The variable `foo` is not collected because it's value is not affected by the sending function, and its value does not directly affect the value of `response`. For more information checkout the [dump wiki entry](https://bitbucket.org/bestchai/dinv/wiki/Dump%20Annotations).
In this example the variables collected by the dump annotation include {`buf`,`msg`,`response`}. The variable `foo` is not collected because its value is not affected by the sending function, and its value does not directly affect the value of `response`. For more information check out the [dump wiki entry](https://bitbucket.org/bestchai/dinv/wiki/Dump%20Annotations).


### Wrap messages with vector timestamps (Step 2)
Expand Down Expand Up @@ -240,7 +240,7 @@ server.go
1 message := instrumenter.Unpack ( buffer )
```

For more information on the runtime api checkout `/instrumenter/api`
For more information on the runtime api, check out `/instrumenter/api`

### Run the instrumenter (Step 3)
--------------------------------------
Expand All @@ -255,7 +255,7 @@ Instrumentation runs at the directory level. Running `dinv -i foo` produces:
## Run system and analyze its logs ( Part 2 )
---------------

Dinv mines invariants by analyzing the logs generated by instrumented code. Instrumented code retains external dependences and can be run like the original code. The product of running instrumented code is a set of log files for each host in the system. The logs contain vector clocks maintained over the course of the execution, along with variables, and their values at the `//@dump` annotations. The vector clocks are used to model the history of the execution. By examining the execution total orderings between host communication can be determined. Program points logged in these total orderings are merged together. The merged points are grouped together by both host and line number the points were collected on. Variable values from these groups are transcribed into Daikon readable trace files. The last step of the execution is to run Daikon on the generated trace files, to infer invariants on their values.
DInv mines invariants by analyzing the logs generated by instrumented code. Instrumented code retains external dependences and can be run like the original code. The product of running instrumented code is a set of log files for each host in the system. The logs contain vector clocks maintained over the course of the execution, along with variables, and their values at the `//@dump` annotations. The vector clocks are used to model the history of the execution. By examining the execution, total orderings between host communication can be determined. Program points logged in these total orderings are merged together. The merged points are grouped together by both host and line number the points were collected on. Variable values from these groups are transcribed into Daikon readable trace files. The last step of the execution is to run Daikon on the generated trace files to infer invariants on their values.

### Run instrumented code ( Step 4 )
---------------------------------------------
Expand All @@ -273,10 +273,10 @@ code. The difference is insturmented code generates logs. The logs are written t
####GoVector logs

GoVector logs are generated with the format
`packagename_timestampId_log.Log.txt` the logs contain a
`packagename_timestampId_log.Log.txt` and the logs contain a
[ShiViz](http://bestchai.bitbucket.org/shiviz/) log generated on the host with a corresponding timestamp.
These entries in these logs are generated by calls to the `Pack()` and `Unpack()`
methods. Log entries are of the form.
methods. Log entries are of the form

SelfHostID { "hostID:time", ..., "hostID:time" }
Event Message - filename:lineOfCode SelfHostID
Expand All @@ -293,7 +293,7 @@ Two logs are generated for each host of the form
* PackageName-HostID-Encoded.txt
* PackageName-HostID-Readable.txt

The readable log is a debugging tool and is not used again by Dinv. The log entries have the format
The readable log is a debugging tool and is not used again by DInv. The log entries have the format

hostId_package_filename_lineNumber
{"hostid":clockvalue, ... }
Expand All @@ -305,7 +305,7 @@ The encoded log used by the log merger in the next step. Conceptually the encode
### Merge logs ( Step 5 )
---------------------------

Post execution independent logging files have been written by each
Post-execution independent logging files have been written by each
instrumented host. In order to detect data invariants on the logged
variables, all the logs must be analysed together. The log merging
process tracks vector clocks logged along with variables. The
Expand All @@ -317,22 +317,22 @@ for invariant detection. The output of the logmerger is a set of
trace files uniquely identified by the consistent distributed point
they represent.

Merging logs is hands off when compared with instrumenting. In order
to merge the logs accurately all of the encoded program point logs, and
Merging logs is hands-off when compared with instrumenting. In order
to merge the logs accurately, all of the encoded program point logs and
GoVector logs generated during the execution of the instrumented code
must be given to the logmerger.

the logmerger expects as a argument the list of all GoVec logs and encoded point logs written during execution. The order of the arguments does not matter, however the inclusion of each file is necessary. In the case of a missing log an error will be thrown and the merging process terminated.
The logmerger expects as a argument the list of all GoVec logs and encoded point logs written during execution. The order of the arguments does not matter; however, the inclusion of each file is necessary. In the case of a missing log an error will be thrown and the merging process terminated.


as an example consider the merging of logs collected from two host
As an example consider the merging of logs collected from two host
client and server

`dinv -logmerger client-clientidEncoded.txt server-serveridEncoded.txt clientid.log-Log.txt serverid.log-Log.txt`

This example is verbose, for the sake of explanation.

The simplest way to execute the logmerger, is to move all generated
The simplest way to execute the logmerger is to move all generated
logs to a single directory and run

`dinv -l *Encoded.txt *Log.txt`
Expand All @@ -342,8 +342,8 @@ logs to a single directory and run


The output of the logmerger is a set of dtrace files. The dtrace files
have names corresponding to the program points where the values they track have been
extracted from. An individual point has a name of the form
have names corresponding to the program points from where the values they track have been
extracted. An individual point has a name of the form

`point` = `_hostid_packageName_FileName_lineNumber`

Expand Down Expand Up @@ -381,7 +381,7 @@ a directory.

The examples directory contains a number of small distributed systems
that have been instrumented with DInv's interface. Using DInv is a
multi step procedure. Each of the example programs is paired with a
multi-step procedure. Each of the example programs is paired with a
script `run.sh`. The script oversees the execution of each stage, and
pipes each steps output into its successor. The scripts have a
standard interface
Expand All @@ -392,7 +392,7 @@ standard interface
+ Options :
* -d : dirty run, the generated files at each stage are
not remove after execution
* -c : cleanup generated files
* -c : clean up generated files

### Hello DInv
Hello DInv is our introductory example program, in which a client
Expand Down Expand Up @@ -422,24 +422,24 @@ server which adds them, and sends back the results as the variable

### Linear

The linear example program is located in `exampels/linear`. The
The linear example program is located in `examples/linear`. The
execution of the program is scripted in `run.sh`. In order to run the
script, use the following command.

DinvDirectory/examples/linear/run.sh

Linear is a three host system. The hosts are `client`, `coeff`, and
`linn`. The hosts only pass messages with their neighbours IE `client
Linear is a three host-system. The hosts are `client`, `coeff`, and
`linn`. The hosts only pass messages with their neighbours, i.e. `client
<--> coeff <--> linn`.

Similar to the sum system the `client` randomly generates two terms,
Similar to the sum system, the `client` randomly generates two terms,
packages them, and sends them to its neighbour `coeff`. `coeff`
generates a coefficient for the first term, then packages it along
with the first two terms and sends to `linn`. `linn` calculates the
linear equation `linn = coeff * term1 + term2`. The variables `sum` is
propagated back through `coeff` to the `client` host.

Detected invariant include:
Detected invariants include

linn > coeff
linn > term1
Expand All @@ -455,7 +455,7 @@ Dinv has a set of command line arguments that can be broken into 3 categories

## Instrumenter dinv - i

The `-i` argument specifies the instrumenter, valid proceding arguments are as follows
The `-i` argument specifies the instrumenter; valid proceding arguments are as follows

* `-file` specifies a file for insturmentation `dinv -i -file=example.go`
* `-dir` specifies a directory for instrumentation `dinv -i -dir=exampleDir`
Expand All @@ -464,7 +464,7 @@ The `-i` argument specifies the instrumenter, valid proceding arguments are as f

## LogMerger dinv -l

The `-l` argument specifies the log merger, the following arguments should be followed with a complete list of shiviz, and encoded log files from an execution.
The `-l` argument specifies the log merger; the following arguments should be followed with a complete list of shiviz, and encoded log files from an execution.

* `-plan` speficies the plan to use while merging logs (default is totally ordered cuts) `dinv -l -plan="TOLN" files`
* `-sample` the sample rate of consistat cuts to be analyzed (default 100%) `dinv -l -sample=50 files`
Expand Down

0 comments on commit 4f68cf4

Please sign in to comment.