Commit
Merge pull request #152 from idealista/develop
Develop
Showing 104 changed files with 16,464 additions and 2 deletions.
1 change: 1 addition & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/_rest_managed.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"initArgs":{},"managedList":[]} |
38 changes: 38 additions & 0 deletions
...t/files/collections/sample_techproducts_configs_2/_schema_analysis_stopwords_english.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
{ | ||
"initArgs":{"ignoreCase":true}, | ||
"managedList":[ | ||
"a", | ||
"an", | ||
"and", | ||
"are", | ||
"as", | ||
"at", | ||
"be", | ||
"but", | ||
"by", | ||
"for", | ||
"if", | ||
"in", | ||
"into", | ||
"is", | ||
"it", | ||
"no", | ||
"not", | ||
"of", | ||
"on", | ||
"or", | ||
"stopworda", | ||
"stopwordb", | ||
"such", | ||
"that", | ||
"the", | ||
"their", | ||
"then", | ||
"there", | ||
"these", | ||
"they", | ||
"this", | ||
"to", | ||
"was", | ||
"will", | ||
"with"]} |
11 changes: 11 additions & 0 deletions
...lt/files/collections/sample_techproducts_configs_2/_schema_analysis_synonyms_english.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
{ | ||
"initArgs":{ | ||
"ignoreCase":true, | ||
"format":"solr" | ||
}, | ||
"managedMap":{ | ||
"GB":["GiB","Gigabyte"], | ||
"happy":["glad","joyful"], | ||
"TV":["Television"] | ||
} | ||
} |
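The synonym resource above stores a `managedMap` from a term to its list of synonyms, again with `ignoreCase` set. The sketch below shows one way to read that structure and expand a token. Keeping the original token in front of its synonyms is an assumption of this example, and `load_synonyms` and `expand` are hypothetical helper names, not Solr APIs.

```python
import json

# Same content as the managed synonym resource in the diff.
MANAGED_SYNONYMS = """
{"initArgs": {"ignoreCase": true, "format": "solr"},
 "managedMap": {"GB": ["GiB", "Gigabyte"],
                "happy": ["glad", "joyful"],
                "TV": ["Television"]}}
"""


def load_synonyms(raw):
    """Return (term -> synonyms mapping, ignore_case flag)."""
    data = json.loads(raw)
    ignore_case = bool(data.get("initArgs", {}).get("ignoreCase", False))
    mapping = {}
    for term, synonyms in data.get("managedMap", {}).items():
        key = term.lower() if ignore_case else term
        mapping[key] = list(synonyms)
    return mapping, ignore_case


def expand(token, mapping, ignore_case):
    """Return the token followed by any synonyms listed for it (sketch assumption)."""
    key = token.lower() if ignore_case else token
    return [token] + mapping.get(key, [])


mapping, ignore_case = load_synonyms(MANAGED_SYNONYMS)
print(expand("tv", mapping, ignore_case))      # ['tv', 'Television']
print(expand("laptop", mapping, ignore_case))  # ['laptop']
```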
11 changes: 11 additions & 0 deletions
...ule/default/files/collections/sample_techproducts_configs_2/clustering/carrot2/README.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
An override location of the clustering algorithm's resources | ||
attribute definitions and lexical resources. | ||
|
||
A directory from which to load algorithm-specific stop words, | ||
stop labels and attribute definition XMLs. | ||
|
||
For an overview of Carrot2 lexical resources, see: | ||
http://download.carrot2.org/head/manual/#chapter.lexical-resources | ||
|
||
For an overview of Lingo3G lexical resources, see: | ||
http://download.carrotsearch.com/lingo3g/manual/#chapter.lexical-resources |
19 changes: 19 additions & 0 deletions
.../files/collections/sample_techproducts_configs_2/clustering/carrot2/kmeans-attributes.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
<!-- | ||
Default configuration for the bisecting k-means clustering algorithm. | ||
This file can be loaded (and saved) by Carrot2 Workbench. | ||
http://project.carrot2.org/download.html | ||
--> | ||
<attribute-sets default="attributes"> | ||
<attribute-set id="attributes"> | ||
<value-set> | ||
<label>attributes</label> | ||
<attribute key="MultilingualClustering.defaultLanguage"> | ||
<value type="org.carrot2.core.LanguageCode" value="ENGLISH"/> | ||
</attribute> | ||
<attribute key="MultilingualClustering.languageAggregationStrategy"> | ||
<value type="org.carrot2.text.clustering.MultilingualClustering$LanguageAggregationStrategy" value="FLATTEN_MAJOR_LANGUAGE"/> | ||
</attribute> | ||
</value-set> | ||
</attribute-set> | ||
</attribute-sets> |
24 changes: 24 additions & 0 deletions
...t/files/collections/sample_techproducts_configs_2/clustering/carrot2/lingo-attributes.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
<!-- | ||
Default configuration for the Lingo clustering algorithm. | ||
This file can be loaded (and saved) by Carrot2 Workbench. | ||
http://project.carrot2.org/download.html | ||
--> | ||
<attribute-sets default="attributes"> | ||
<attribute-set id="attributes"> | ||
<value-set> | ||
<label>attributes</label> | ||
<!-- | ||
The language to assume for clustered documents. | ||
For a list of allowed values, see: | ||
http://download.carrot2.org/stable/manual/#section.attribute.lingo.MultilingualClustering.defaultLanguage | ||
--> | ||
<attribute key="MultilingualClustering.defaultLanguage"> | ||
<value type="org.carrot2.core.LanguageCode" value="ENGLISH"/> | ||
</attribute> | ||
<attribute key="LingoClusteringAlgorithm.desiredClusterCountBase"> | ||
<value type="java.lang.Integer" value="20"/> | ||
</attribute> | ||
</value-set> | ||
</attribute-set> | ||
</attribute-sets> |
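The attribute file above nests each setting as an `<attribute key="...">` holding a typed `<value>`. For readers who want to inspect such a file outside Carrot2 Workbench, the sketch below flattens the default attribute set into a dictionary. It is not how Carrot2 itself loads the file; `read_default_attributes` is a hypothetical helper, and converting only `java.lang.Integer` values to `int` is a simplification made for this example.

```python
import xml.etree.ElementTree as ET

# Content of lingo-attributes.xml from the diff, with comments omitted.
LINGO_ATTRIBUTES = """
<attribute-sets default="attributes">
  <attribute-set id="attributes">
    <value-set>
      <label>attributes</label>
      <attribute key="MultilingualClustering.defaultLanguage">
        <value type="org.carrot2.core.LanguageCode" value="ENGLISH"/>
      </attribute>
      <attribute key="LingoClusteringAlgorithm.desiredClusterCountBase">
        <value type="java.lang.Integer" value="20"/>
      </attribute>
    </value-set>
  </attribute-set>
</attribute-sets>
"""


def read_default_attributes(xml_text):
    """Return {attribute key: value} for the set named by the root's 'default'."""
    root = ET.fromstring(xml_text.strip())
    default_id = root.get("default")
    result = {}
    for attribute_set in root.iter("attribute-set"):
        if attribute_set.get("id") != default_id:
            continue
        for attribute in attribute_set.iter("attribute"):
            value = attribute.find("value")
            raw = value.get("value")
            # Only integers are converted; every other type stays a string.
            is_int = value.get("type") == "java.lang.Integer"
            result[attribute.get("key")] = int(raw) if is_int else raw
    return result


print(read_default_attributes(LINGO_ATTRIBUTES))
```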
19 changes: 19 additions & 0 deletions
...ult/files/collections/sample_techproducts_configs_2/clustering/carrot2/stc-attributes.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
<!-- | ||
Default configuration for the STC clustering algorithm. | ||
This file can be loaded (and saved) by Carrot2 Workbench. | ||
http://project.carrot2.org/download.html | ||
--> | ||
<attribute-sets default="attributes"> | ||
<attribute-set id="attributes"> | ||
<value-set> | ||
<label>attributes</label> | ||
<attribute key="MultilingualClustering.defaultLanguage"> | ||
<value type="org.carrot2.core.LanguageCode" value="ENGLISH"/> | ||
</attribute> | ||
<attribute key="MultilingualClustering.languageAggregationStrategy"> | ||
<value type="org.carrot2.text.clustering.MultilingualClustering$LanguageAggregationStrategy" value="FLATTEN_MAJOR_LANGUAGE"/> | ||
</attribute> | ||
</value-set> | ||
</attribute-set> | ||
</attribute-sets> |
67 changes: 67 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/currency.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
<?xml version="1.0" ?> | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one or more | ||
contributor license agreements. See the NOTICE file distributed with | ||
this work for additional information regarding copyright ownership. | ||
The ASF licenses this file to You under the Apache License, Version 2.0 | ||
(the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
<!-- Example exchange rates file for CurrencyField type named "currency" in example schema --> | ||
|
||
<currencyConfig version="1.0"> | ||
<rates> | ||
<!-- Updated from http://www.exchangerate.com/ at 2011-09-27 --> | ||
<rate from="USD" to="ARS" rate="4.333871" comment="ARGENTINA Peso" /> | ||
<rate from="USD" to="AUD" rate="1.025768" comment="AUSTRALIA Dollar" /> | ||
<rate from="USD" to="EUR" rate="0.743676" comment="European Euro" /> | ||
<rate from="USD" to="BRL" rate="1.881093" comment="BRAZIL Real" /> | ||
<rate from="USD" to="CAD" rate="1.030815" comment="CANADA Dollar" /> | ||
<rate from="USD" to="CLP" rate="519.0996" comment="CHILE Peso" /> | ||
<rate from="USD" to="CNY" rate="6.387310" comment="CHINA Yuan" /> | ||
<rate from="USD" to="CZK" rate="18.47134" comment="CZECH REP. Koruna" /> | ||
<rate from="USD" to="DKK" rate="5.515436" comment="DENMARK Krone" /> | ||
<rate from="USD" to="HKD" rate="7.801922" comment="HONG KONG Dollar" /> | ||
<rate from="USD" to="HUF" rate="215.6169" comment="HUNGARY Forint" /> | ||
<rate from="USD" to="ISK" rate="118.1280" comment="ICELAND Krona" /> | ||
<rate from="USD" to="INR" rate="49.49088" comment="INDIA Rupee" /> | ||
<rate from="USD" to="XDR" rate="0.641358" comment="INTNL MON. FUND SDR" /> | ||
<rate from="USD" to="ILS" rate="3.709739" comment="ISRAEL Sheqel" /> | ||
<rate from="USD" to="JPY" rate="76.32419" comment="JAPAN Yen" /> | ||
<rate from="USD" to="KRW" rate="1169.173" comment="KOREA (SOUTH) Won" /> | ||
<rate from="USD" to="KWD" rate="0.275142" comment="KUWAIT Dinar" /> | ||
<rate from="USD" to="MXN" rate="13.85895" comment="MEXICO Peso" /> | ||
<rate from="USD" to="NZD" rate="1.285159" comment="NEW ZEALAND Dollar" /> | ||
<rate from="USD" to="NOK" rate="5.859035" comment="NORWAY Krone" /> | ||
<rate from="USD" to="PKR" rate="87.57007" comment="PAKISTAN Rupee" /> | ||
<rate from="USD" to="PEN" rate="2.730683" comment="PERU Sol" /> | ||
<rate from="USD" to="PHP" rate="43.62039" comment="PHILIPPINES Peso" /> | ||
<rate from="USD" to="PLN" rate="3.310139" comment="POLAND Zloty" /> | ||
<rate from="USD" to="RON" rate="3.100932" comment="ROMANIA Leu" /> | ||
<rate from="USD" to="RUB" rate="32.14663" comment="RUSSIA Ruble" /> | ||
<rate from="USD" to="SAR" rate="3.750465" comment="SAUDI ARABIA Riyal" /> | ||
<rate from="USD" to="SGD" rate="1.299352" comment="SINGAPORE Dollar" /> | ||
<rate from="USD" to="ZAR" rate="8.329761" comment="SOUTH AFRICA Rand" /> | ||
<rate from="USD" to="SEK" rate="6.883442" comment="SWEDEN Krona" /> | ||
<rate from="USD" to="CHF" rate="0.906035" comment="SWITZERLAND Franc" /> | ||
<rate from="USD" to="TWD" rate="30.40283" comment="TAIWAN Dollar" /> | ||
<rate from="USD" to="THB" rate="30.89487" comment="THAILAND Baht" /> | ||
<rate from="USD" to="AED" rate="3.672955" comment="U.A.E. Dirham" /> | ||
<rate from="USD" to="UAH" rate="7.988582" comment="UKRAINE Hryvnia" /> | ||
<rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" /> | ||
|
||
<!-- Cross-rates for some common currencies --> | ||
<rate from="EUR" to="GBP" rate="0.869914" /> | ||
<rate from="EUR" to="NOK" rate="7.800095" /> | ||
<rate from="GBP" to="NOK" rate="8.966508" /> | ||
</rates> | ||
</currencyConfig> |
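The exchange-rate file above lists directed `from`/`to` pairs, mostly from USD, plus a few cross-rates. A minimal sketch of converting an amount with such a table follows. It uses three of the rates from the file; falling back to the inverse of a listed rate is an assumption of this example rather than documented Solr behavior, and `load_rates` and `convert` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Three entries copied from currency.xml in the diff.
CURRENCY_XML = """
<currencyConfig version="1.0">
  <rates>
    <rate from="USD" to="EUR" rate="0.743676" comment="European Euro" />
    <rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" />
    <rate from="EUR" to="GBP" rate="0.869914" />
  </rates>
</currencyConfig>
"""


def load_rates(xml_text):
    """Return {(from_code, to_code): rate} for every <rate> element."""
    root = ET.fromstring(xml_text.strip())
    rates = {}
    for node in root.iter("rate"):
        rates[(node.get("from"), node.get("to"))] = float(node.get("rate"))
    return rates


def convert(amount, source, target, rates):
    """Convert using a listed rate, or its inverse (sketch assumption)."""
    if source == target:
        return amount
    if (source, target) in rates:
        return amount * rates[(source, target)]
    if (target, source) in rates:
        return amount / rates[(target, source)]
    raise KeyError(f"no rate listed for {source} -> {target}")


rates = load_rates(CURRENCY_XML)
print(round(convert(100, "USD", "EUR", rates), 4))  # 74.3676
```

Pairs with neither a direct nor an inverse entry raise `KeyError` here; chaining through a third currency is deliberately left out of the sketch.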
42 changes: 42 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/elevate.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
<?xml version="1.0" encoding="UTF-8" ?> | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one or more | ||
contributor license agreements. See the NOTICE file distributed with | ||
this work for additional information regarding copyright ownership. | ||
The ASF licenses this file to You under the Apache License, Version 2.0 | ||
(the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
<!-- If this file is found in the config directory, it will only be | ||
loaded once at startup. If it is found in Solr's data | ||
directory, it will be re-loaded every commit. | ||
See http://wiki.apache.org/solr/QueryElevationComponent for more info | ||
--> | ||
<elevate> | ||
<!-- Query elevation examples | ||
<query text="foo bar"> | ||
<doc id="1" /> | ||
<doc id="2" /> | ||
<doc id="3" /> | ||
</query> | ||
for use with techproducts example | ||
<query text="ipod"> | ||
<doc id="MA147LL/A" /> put the actual ipod at the top | ||
<doc id="IW-02" exclude="true" /> exclude this cable | ||
</query> | ||
--> | ||
|
||
</elevate> |
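The commented-out example in `elevate.xml` pins one document to the top for the query `ipod` and excludes another. The sketch below applies that rule to an already ranked list of ids to make the intended effect concrete. It matches the query text exactly and works on plain id lists, both simplifications; the id `OTHER-1` is a hypothetical placeholder, and the helper names are not part of Solr.

```python
import xml.etree.ElementTree as ET

# The "ipod" rule from the commented example in elevate.xml, uncommented.
ELEVATE_XML = """
<elevate>
  <query text="ipod">
    <doc id="MA147LL/A" />
    <doc id="IW-02" exclude="true" />
  </query>
</elevate>
"""


def load_rules(xml_text):
    """Return {query text: (elevated ids in order, excluded id set)}."""
    rules = {}
    for query in ET.fromstring(xml_text.strip()).iter("query"):
        elevated, excluded = [], set()
        for doc in query.iter("doc"):
            if doc.get("exclude") == "true":
                excluded.add(doc.get("id"))
            else:
                elevated.append(doc.get("id"))
        rules[query.get("text")] = (elevated, excluded)
    return rules


def apply_rules(query_text, ranked_ids, rules):
    """Put elevated ids first and drop excluded ids from the ranked list."""
    elevated, excluded = rules.get(query_text, ([], set()))
    rest = [d for d in ranked_ids if d not in excluded and d not in elevated]
    return elevated + rest


rules = load_rules(ELEVATE_XML)
# "OTHER-1" is a hypothetical document id used only for illustration.
print(apply_rules("ipod", ["IW-02", "OTHER-1", "MA147LL/A"], rules))
# ['MA147LL/A', 'OTHER-1']
```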
8 changes: 8 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/lang/contractions_ca.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Set of Catalan contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
d | ||
l | ||
m | ||
n | ||
s | ||
t |
15 changes: 15 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/lang/contractions_fr.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Set of French contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
l | ||
m | ||
t | ||
qu | ||
n | ||
s | ||
j | ||
d | ||
c | ||
jusqu | ||
quoiqu | ||
lorsqu | ||
puisqu |
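The contraction lists above feed an elision filter, which strips a leading article or pronoun that is joined to a word by an apostrophe. A rough sketch of that idea, using the French list from this file, is shown below. Lower-casing the prefix before the lookup and accepting both the straight and the typographic apostrophe are choices made for this example; `strip_elision` is a hypothetical function, not the Lucene class.

```python
# Entries copied from contractions_fr.txt in the diff.
FRENCH_CONTRACTIONS = {
    "l", "m", "t", "qu", "n", "s", "j", "d", "c",
    "jusqu", "quoiqu", "lorsqu", "puisqu",
}

# Straight apostrophe and right single quotation mark (U+2019).
APOSTROPHES = ("'", "\u2019")


def strip_elision(token, contractions=FRENCH_CONTRACTIONS):
    """Remove a leading contraction plus apostrophe when the prefix is listed."""
    for index, char in enumerate(token):
        if char in APOSTROPHES:
            # Only the text before the first apostrophe is considered.
            if token[:index].lower() in contractions:
                return token[index + 1:]
            break
    return token


for word in ["l'avion", "Qu\u2019il", "aujourd'hui", "avion"]:
    print(word, "->", strip_elision(word))
```

`aujourd'hui` is left untouched because `aujourd` is not in the contraction list.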
5 changes: 5 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/lang/contractions_ga.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Set of Irish contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
d | ||
m | ||
b |
23 changes: 23 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/lang/contractions_it.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Set of Italian contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
c | ||
l | ||
all | ||
dall | ||
dell | ||
nell | ||
sull | ||
coll | ||
pell | ||
gl | ||
agl | ||
dagl | ||
degl | ||
negl | ||
sugl | ||
un | ||
m | ||
t | ||
s | ||
v | ||
d |
5 changes: 5 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/lang/hyphenations_ga.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Set of Irish hyphenations for StopFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
h | ||
n | ||
t |
6 changes: 6 additions & 0 deletions
molecule/default/files/collections/sample_techproducts_configs_2/lang/stemdict_nl.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Set of overrides for the dutch stemmer | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
fiets fiets | ||
bromfiets bromfiets | ||
ei eier | ||
kind kinder |
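Each non-comment line of the stemmer override file above maps a word to the stem that should be used for it in place of the stemmer's own output. The sketch below parses that two-column format and consults the overrides before falling back to a stemmer. `naive_stem` is a deliberately crude, hypothetical stand-in for a real Dutch stemmer, and splitting on any whitespace is an assumption about the column separator.

```python
# Content of stemdict_nl.txt from the diff (TODO comment line omitted).
STEMDICT_NL = """\
# Set of overrides for the dutch stemmer
fiets fiets
bromfiets bromfiets
ei eier
kind kinder
"""


def load_overrides(text):
    """Parse 'word stem' lines, skipping blank lines and '#' comments."""
    overrides = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        word, stem = line.split()
        overrides[word] = stem
    return overrides


def naive_stem(word):
    """Hypothetical fallback: strip a trailing 'en' from longer words."""
    return word[:-2] if word.endswith("en") and len(word) > 4 else word


def stem(word, overrides, fallback=naive_stem):
    """Use the override when one exists, otherwise the fallback stemmer."""
    if word in overrides:
        return overrides[word]
    return fallback(word)


overrides = load_overrides(STEMDICT_NL)
print(stem("ei", overrides))       # eier
print(stem("fietsen", overrides))  # fiets
```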