MineruleOptions
For MINERULE to work, the user must provide an options file placed in the build directory. This file is created with the following command:
mr defaults > options.txt
Below is an explanation of each block contained in the options file. Each block is composed of at least one main option and, optionally, some sub-options. Rows starting with "+" contain the current value of an option attribute.
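The block syntax described above can be read with a few lines of Python. This is only a minimal sketch, not the parser MINERULE itself uses: the function name parse_options is hypothetical, and it assumes that tokens are separated by whitespace, that values contain no spaces, and that the file contains no commented-out lines.

```python
def parse_options(text):
    """Parse blocks of the form  name::{ +key=value ... }  into nested dicts.

    Assumptions (not confirmed by the MINERULE sources): tokens are
    whitespace-separated, values contain no spaces, no comment lines.
    """
    root = {}
    stack = [root]  # stack[-1] is the block currently being filled
    for token in text.split():
        if token.endswith("::{"):
            # Opening of a (possibly nested) block, e.g. "mrdb::{"
            block = {}
            stack[-1][token[:-3]] = block
            stack.append(block)
        elif token == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        elif token.startswith("+"):
            # Attribute row, e.g. "+dbms=postgres" or "+name=" (empty value)
            key, sep, value = token[1:].partition("=")
            if not sep:
                raise ValueError(f"missing '=' in {token!r}")
            stack[-1][key] = value
        else:
            raise ValueError(f"unexpected token: {token!r}")
    if len(stack) != 1:
        raise ValueError("unclosed block")
    return root


if __name__ == "__main__":
    sample = "safety::{ +overwriteHomonymMinerules=False +allowCascadeDeletes=False }"
    print(parse_options(sample))
```

All values are kept as strings; converting 'True'/'False' or numbers is left to the caller.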
mrdb::{ +name= +username= +password= +cacheWrites=False +dbms=postgres }
The user must specify the postgres settings that allow MINERULE to connect to the DATABASE. At the moment only the postgres DBMS is supported; other choices will be available in the future.
safety::{ +overwriteHomonymMinerules=False +allowCascadeDeletes=False }
These options are related to data SAFETY issues:
- overwriteHomonymMinerules: if this option is set to 'True', the system deletes the old results whenever a new minerule with the same name as an old one is inserted. Otherwise the system reports an error message and exits.
- allowCascadeDeletes: this option decides whether the system should also delete the minerules whose results depend on a deleted one. If it is set to 'True', those minerules are deleted as well; otherwise the system halts and reports an error.
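The interplay of the two safety options can be sketched as follows. This is an illustration only, not MINERULE's implementation: the function name, the existing set and the dependents mapping are hypothetical, and the sketch assumes that a cascade follows dependencies transitively.

```python
def minerules_to_delete(new_name, existing, dependents,
                        overwrite_homonym, allow_cascade):
    """Return the set of minerule names that would be deleted on insert.

    existing   -- names of the minerules already stored (hypothetical model)
    dependents -- maps a name to the minerules whose results depend on it
    Raises RuntimeError where the text above says the system reports an error.
    """
    if new_name not in existing:
        return set()  # no homonym, nothing to delete
    if not overwrite_homonym:
        raise RuntimeError(f"a minerule named {new_name!r} already exists")

    to_delete = {new_name}
    pending = [new_name]
    while pending:
        current = pending.pop()
        for dep in dependents.get(current, ()):
            if dep in to_delete:
                continue
            if not allow_cascade:
                raise RuntimeError(
                    f"{dep!r} depends on {current!r} and cascade deletes are disabled")
            to_delete.add(dep)
            pending.append(dep)  # assumed: dependencies are followed transitively
    return to_delete
```

With both options left at their default value of False, inserting a homonym minerule always ends in an error, which is the conservative behaviour.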
miningalgorithms::{ rulesmining::{ +preferredAlgorithm=None partitionbase::{ +rowsPerPartition=300000 } partitionwithclusters::{ +rowsPerPartition=300000 } fpgrowth::{ +algoType=Original } } itemsetsmining::{ +preferredAlgorithm=None } }
Options related to mining algorithms:
- rulesmining: contains a set of options for configuring rule mining algorithms (if used):
- preferredAlgorithm is the algorithm used by MINE RULE for rule mining. If it is set to 'None', the choice is left to the system.
- partitionbase contains settings for the PartitionBase algorithm.
- partitionwithclusters contains settings for the PartitionWithClusters algorithm.
- fpgrowth contains settings for the FPGrowth algorithm.
- itemsetsmining contains the attribute that sets the preferred algorithm for itemset mining.
optimizations::{ +enableOptimizations=False +avoidDominanceDetection=False +avoidEquivalenceDetection=False +avoidCombinationDetection=False +incrementalAlgorithm=Auto combinator::{ +timeOut=4 +maxDisjuncts=3 +maxQueries=5 +maxDistinctPredicates=10 } }
Options related to the optimizations used by the OPTIMIZER module (if enabled):
- avoidDominanceDetection: if set to True, this option disables the detection of dominant queries (this also implies that the system will not try to find equivalent queries, since they are a particular case of dominance).
- avoidEquivalenceDetection: if set to True, this option makes the optimizer treat equivalent queries as if they were dominant ones (i.e., it will call an incremental algorithm instead of dealing with the equivalence).
- avoidCombinationDetection: if set to True, the optimizer will not try to find a combination of previous queries equivalent to the current one. Note that the search for a combination may be a slow process.
- incrementalAlgorithm: this option lets the user specify how the incremental algorithm is chosen. The allowed values are {Constructive, Destructive, Auto}. Constructive and Destructive force the corresponding algorithm to be chosen; Auto leaves the choice to the optimizer.
- combinator: Options related to the query combinator algorithm:
- timeOut: amount of time the search for a combination is allowed to run
- maxDisjuncts: Max number of disjuncts. It is the number of disjuncts that is considered during the search. Notice that increasing this number has a strong impact on the dimension of the search space.
- maxQueries: Max number of queries. Max number of distinct queries the user allows to be combined in the result. Formulae with a larger number of queries are penalized in the evaluation function.
- maxDistinctPredicates: Max distinct predicates. Max number of distinct predicates that the user allows. This affects the response time: the time spent in assessing each formula grows exponentially with the number of predicates.
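The way the detection flags interact can be summarised in a short Python sketch. It is a reading of the descriptions above rather than the optimizer's actual code: the function name and the dictionary layout are hypothetical, and it assumes that enableOptimizations=False switches every detection off.

```python
ALLOWED_INCREMENTAL = {"Constructive", "Destructive", "Auto"}


def active_detections(opts):
    """Return which optimizer detections are active for the given flags.

    opts is a hypothetical dict with boolean flags and the
    incrementalAlgorithm string, mirroring the optimizations block.
    """
    if opts["incrementalAlgorithm"] not in ALLOWED_INCREMENTAL:
        raise ValueError(
            f"incrementalAlgorithm must be one of {sorted(ALLOWED_INCREMENTAL)}")

    enabled = opts["enableOptimizations"]  # assumed to gate everything
    dominance = enabled and not opts["avoidDominanceDetection"]
    # Equivalence is a particular case of dominance, so it needs dominance too.
    equivalence = dominance and not opts["avoidEquivalenceDetection"]
    combination = enabled and not opts["avoidCombinationDetection"]
    return {
        "dominance": dominance,
        "equivalence": equivalence,
        "combination": combination,
    }
```

The combinator sub-options only matter when the combination detection is active.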
parsers::{ +logfile=/dev/null +minBodyElems=1 +maxBodyElems=1000 +minHeadElems=1 +maxHeadElems=1000 }
Options related to the PARSING algorithms:
- logfile: Parsers log stream, valid names are {, and any writeable file}.
- The other four options set constraints on the cardinalities of the elements that appear in the body/head part of rules. The constraints set here 'win' over the ones in minerules (i.e., if you say '1..n' as BODY in your minerule but set it to 1..5 here, then 1..5 will be used instead).
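One possible reading of how the option file 'wins' over a minerule's cardinalities is sketched below. The source does not say whether the options replace the minerule's range outright or only restrict it; this sketch assumes the second interpretation (clamping), which reproduces the '1..n' versus 1..5 example. The function name is hypothetical.

```python
def effective_cardinality(rule_min, rule_max, opt_min, opt_max):
    """Clamp a minerule cardinality range to the range from the option file.

    rule_max may be None, standing for the unbounded 'n' of '1..n'.
    Assumption: the option values restrict, rather than replace, the range.
    """
    low = max(rule_min, opt_min)
    high = opt_max if rule_max is None else min(rule_max, opt_max)
    if low > high:
        raise ValueError(f"empty cardinality range {low}..{high}")
    return low, high


if __name__ == "__main__":
    # BODY declared as 1..n in the minerule, options set to 1..5
    print(effective_cardinality(1, None, 1, 5))
```

Under this reading the default option values (1 and 1000) leave any narrower range declared in a minerule unchanged.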
logstream::{ +stream= +loglevel=100 } errstream::{ +stream= +loglevel=100 } warnstream::{ +stream= +loglevel=100 } debugstream::{ +stream= +loglevel=100 }
Options related to STREAMS. By default they are commented out, because the values shown correspond to the default settings rather than the actual ones. When specifying the stream parameter, you can:
- Specify a file path
- Specify , in order to specify the standard output and the standard error respectively
- Specify a file path including the %m and %i symbols