Solr config folder added. README.md updated with Solr Cloud deploy usage. Minor README.md improvements.
Thomas Egense committed Mar 26, 2024
1 parent 5038524, commit 5b629a2
Showing 50 changed files with 8,776 additions and 113 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
A non-redirect response should not set headers? | ||
A redirect should add headers, not overwrite existing ones. | ||
|
||
Example 1: | ||
https://solrwb-test.kb.dk:4000/solrwayback/services/memento/http://www.twenty-fourflowers.com/ | ||
Example 2: | ||
https://solrwb-test.kb.dk:4000/solrwayback/services/memento/http://prak10k.dk/?page_id=13 | ||
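Both example URLs have the shape `<service base>/services/memento/<original URL>`. A minimal sketch of splitting such a URL, assuming that everything after the `/services/memento/` marker is the original URL verbatim, query string included. `split_memento_url` and `MEMENTO_MARKER` are hypothetical names for illustration, not part of SolrWayback.

```python
from typing import Tuple

# Marker taken from the example URLs; assumed to be fixed for this service.
MEMENTO_MARKER = "/services/memento/"


def split_memento_url(url: str) -> Tuple[str, str]:
    """Split a memento request URL into (service prefix, original URL)."""
    # partition() splits at the FIRST marker, so a marker-like string
    # inside the original URL stays part of the original URL.
    prefix, marker, original = url.partition(MEMENTO_MARKER)
    if not marker or not original:
        raise ValueError(f"not a memento URL: {url!r}")
    return prefix + marker, original
```

Applied to example 2, the second element of the result would be `http://prak10k.dk/?page_id=13`.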
|
||
|
||
|
||
|
||
ref: | ||
http://timetravel.mementoweb.org/api/json/2013/http://cnn.com | ||
|
||
|
||
|
||
Review: | ||
Bug: fixed. A redirect must not have a payload. | ||
Redirect support only: playback cannot also run under the /memento URL. (Complicated explanation.) | ||
Host -> localhost | ||
todo comment in DatetimeNegotiationTest | ||
Good unittests + solr unittest |
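The two rules from these notes (a redirect adds headers instead of overwriting them, and a redirect carries no payload) can be sketched as follows. This is an illustration only, not the SolrWayback implementation: `build_redirect` is a hypothetical helper, headers are modelled as a list of `(name, value)` pairs so that repeated names are possible, and the 302 status code is an assumption.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def build_redirect(
    existing: List[Header], location: str, extra: List[Header]
) -> Tuple[int, List[Header], bytes]:
    """Return (status, headers, body) for a redirect response."""
    headers = list(existing)                 # keep every header already set
    headers.append(("Location", location))   # where the client should go
    headers.extend(extra)                    # append; never replace a value
    return 302, headers, b""                 # assumed status; body always empty
```

Appending to a list rather than assigning into a dict is the design choice that matters here: a header name that is already present, such as `Vary`, keeps its earlier value and gains a second entry.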
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,104 +1,4 @@ | ||
# SolrWayback bundle | ||
|
||
Resources used when building the SolrWayback bundle. | ||
|
||
- `install SolrWayback bundle`: See install guide [SolrWayback README](https://github.com/netarchivesuite/solrwayback/blob/master/README.md/) | ||
- `indexing`: Scripts for indexing WARC files using [webarchive-discovery](https://github.com/ukwa/webarchive-discovery/) | ||
- `Changes.md`: See version history [SolrWayback](https://github.com/netarchivesuite/solrwayback/blob/master/CHANGES.md/) | ||
|
||
- solrwaybackproxy | ||
- Solr 9 config files | ||
- Tomcat 9 | ||
- Solr 9 | ||
|
||
## How-to for package managers | ||
|
||
### Build WARs and JAR | ||
|
||
Create the SolrWayback WAR | ||
``` | ||
mvn clean package | ||
``` | ||
|
||
Build a `warc-indexer-0.3.2-SNAPSHOT-jar-with-dependencies.jar` from [webarchive-discovery](https://github.com/ukwa/webarchive-discovery/). | ||
|
||
Build a `solrwaybackrootproxy-4.3.1.war` from [solrwaybackrootproxy](https://github.com/netarchivesuite/solrwaybackrootproxy). | ||
|
||
### Folder structure | ||
|
||
``` | ||
mkdir solrwayback_package_4.5 | ||
cd solrwayback_package_4.5/ | ||
cp -r ../src/bundle/indexing/ . | ||
cp | ||
cp -r ../src/test/resources/solr_9/ solr_9_files | ||
cp ../README.md ../CHANGES.md . | ||
mkdir properties | ||
cp ../src/test/resources/properties/solrwayback.properties properties/ | ||
cp ../src/test/resources/properties/solrwaybackweb.properties properties/ | ||
``` | ||
|
||
Copy the previously generated `warc-indexer-XXX-jar-with-dependencies.jar` to the `indexing/` folder. | ||
|
||
### Tomcat 9 | ||
|
||
Download and unpack Tomcat 9 (in current folder `solrwayback_package_4.5`) | ||
``` | ||
wget 'https://dlcdn.apache.org/tomcat/tomcat-9/v9.0.84/bin/apache-tomcat-9.0.84.tar.gz' | ||
tar -xzovf apache-tomcat-9.0.84.tar.gz | ||
mv apache-tomcat-9.0.84 tomcat-9 | ||
rm apache-tomcat-9.0.84.tar.gz | ||
``` | ||
|
||
Copy WAR and context: | ||
``` | ||
cp ../target/solrwayback-*.war tomcat-9/webapps/solrwayback.war | ||
mkdir -p tomcat-9/conf/Catalina/localhost/ | ||
cp ../src/main/webapp/META-INF/context.xml tomcat-9/conf/Catalina/localhost/solrwayback.xml | ||
``` | ||
|
||
Edit `tomcat-9/conf/Catalina/localhost/solrwayback.xml` and set | ||
* `solrwayback-config` to `properties/solrwayback.properties` | ||
* `solrwaybackweb-config` to `properties/solrwaybackweb.properties` | ||
|
||
Copy and rename the previously generated `solrwaybackrootproxy-4.3.1.war` to `tomcat-9/webapps/ROOT.war`. | ||
|
||
### Solr 9 | ||
|
||
Download and unpack Solr 9 (in current folder `solrwayback_package_4.5`) | ||
``` | ||
wget 'https://www.apache.org/dyn/closer.lua/solr/solr/9.4.0/solr-9.4.0.tgz?action=download' -O solr-9.4.0.tgz | ||
tar -xovf solr-9.4.0.tgz | ||
mv solr-9.4.0 solr-9 | ||
rm solr-9.4.0.tgz | ||
``` | ||
|
||
*Optional, but makes debugging easier:* open Solr to the world instead of just localhost | ||
``` | ||
sed -i 's/#SOLR_JETTY_HOST="127.0.0.1"/SOLR_JETTY_HOST="0.0.0.0"/' solr-9/bin/solr.in.sh | ||
sed -i 's/REM set SOLR_JETTY_HOST=127.0.0.1/set SOLR_JETTY_HOST=0.0.0.0/' solr-9/bin/solr.in.cmd | ||
``` | ||
|
||
Start Solr in cloud mode, create a single-shard `netarchivebuilder` collection, and shut Solr down | ||
``` | ||
solr-9/bin/solr start -c -m 1g | ||
solr-9/bin/solr create_collection -c netarchivebuilder -d solr_9_files/netarchivebuilder/conf/ -n sw_conf_1 -shards 1 | ||
solr-9/bin/solr stop | ||
``` | ||
|
||
### Finishing and packing (in current folder `solrwayback_package_4.5`) | ||
|
||
Remove Emacs backup files (if any) | ||
``` | ||
find . -iname "*~" -delete | ||
``` | ||
|
||
Create the bundle | ||
``` | ||
cd .. | ||
zip -r solrwayback_package_4.5.zip solrwayback_package_4.5/ | ||
``` | ||
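For packagers who prefer a script over the shell commands, the cleanup and zip steps above could be sketched with the Python standard library. The function names `remove_backup_files` and `create_bundle` are hypothetical; the sketch assumes the package folder already exists and that the archive should be written next to it.

```python
import shutil
from pathlib import Path


def remove_backup_files(root: Path) -> int:
    """Delete files whose name ends with '~' below root; return the count."""
    removed = 0
    # Materialise the matches first so deletion does not disturb the walk.
    for path in list(root.rglob("*~")):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed


def create_bundle(package_dir: Path) -> Path:
    """Zip package_dir, keeping its folder name as the top-level entry."""
    archive = shutil.make_archive(
        base_name=str(package_dir),          # ".zip" is appended automatically
        format="zip",
        root_dir=str(package_dir.parent),    # paths in the zip are relative to this
        base_dir=package_dir.name,           # only this folder is archived
    )
    return Path(archive)
```

Calling `remove_backup_files(Path("solrwayback_package_4.5"))` followed by `create_bundle(Path("solrwayback_package_4.5"))` from the parent folder mirrors the two shell steps.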
|
||
- `properties`: Default properties for the SolrWayback Bundle | ||
# Solr configuration | ||
|
||
This folder contains a copy of the Solr configuration and can be used to upload a new configuration to Solr. It is intended only for experienced Solr users who know what they are doing. | ||
See the 'Update Solr cloud configuration' section in the project README.md. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
<?xml version="1.0" encoding="UTF-8" ?> | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one or more | ||
contributor license agreements. See the NOTICE file distributed with | ||
this work for additional information regarding copyright ownership. | ||
The ASF licenses this file to You under the Apache License, Version 2.0 | ||
(the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
<!-- If this file is found in the config directory, it will only be | ||
loaded once at startup. If it is found in Solr's data | ||
directory, it will be re-loaded every commit. | ||
See http://wiki.apache.org/solr/QueryElevationComponent for more info | ||
--> | ||
<elevate> | ||
<query text="foo bar"> | ||
<doc id="1" /> | ||
<doc id="2" /> | ||
<doc id="3" /> | ||
</query> | ||
|
||
<query text="ipod"> | ||
<doc id="MA147LL/A" /> <!-- put the actual ipod at the top --> | ||
<doc id="IW-02" exclude="true" /> <!-- exclude this cable --> | ||
</query> | ||
|
||
</elevate> |
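To make the structure of this file concrete, here is a small sketch that reads an elevation file into a mapping from query text to elevated and excluded document ids. It only illustrates the file layout shown above; it is not how Solr itself loads the file, and `parse_elevate` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def parse_elevate(xml_text: str) -> Dict[str, Tuple[List[str], List[str]]]:
    """Map each query text to (elevated ids in file order, excluded ids)."""
    rules: Dict[str, Tuple[List[str], List[str]]] = {}
    for query in ET.fromstring(xml_text).findall("query"):
        elevated: List[str] = []
        excluded: List[str] = []
        for doc in query.findall("doc"):
            # exclude="true" marks a document to hide rather than promote.
            target = excluded if doc.get("exclude") == "true" else elevated
            target.append(doc.get("id"))
        rules[query.get("text")] = (elevated, excluded)
    return rules
```

For the `ipod` query above, the result would hold `MA147LL/A` as the only elevated id and `IW-02` as the only excluded id.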
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Set of Catalan contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
d | ||
l | ||
m | ||
n | ||
s | ||
t |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Set of French contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
l | ||
m | ||
t | ||
qu | ||
n | ||
s | ||
j |
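A contraction list like this one tells an elision filter which prefixes to strip when they appear before an apostrophe. The sketch below is a simplified illustration of that idea, not Lucene's actual `ElisionFilter` code: `strip_elision` is a hypothetical function, and case-insensitive matching is an assumption.

```python
# The seven entries from the French list above.
FRENCH_ARTICLES = {"l", "m", "t", "qu", "n", "s", "j"}
# ASCII apostrophe and the typographic right single quotation mark.
APOSTROPHES = ("'", "\u2019")


def strip_elision(token: str, articles=FRENCH_ARTICLES) -> str:
    """Drop a leading contraction such as l' or qu' when it is in the list."""
    for apostrophe in APOSTROPHES:
        head, sep, tail = token.partition(apostrophe)
        # Strip only when an apostrophe was found, something follows it,
        # and the part before it is a listed contraction.
        if sep and tail and head.lower() in articles:
            return tail
    return token
```

With this list, `l'avion` becomes `avion`, while `aujourd'hui` is left alone because `aujourd` is not a listed contraction.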
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Set of Irish contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
d | ||
m | ||
b |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Set of Italian contractions for ElisionFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
c | ||
l | ||
all | ||
dall | ||
dell | ||
nell | ||
sull | ||
coll | ||
pell | ||
gl | ||
agl | ||
dagl | ||
degl | ||
negl | ||
sugl | ||
un | ||
m | ||
t | ||
s | ||
v | ||
d |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Set of Irish hyphenations for StopFilter | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
h | ||
n | ||
t |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Set of overrides for the dutch stemmer | ||
# TODO: load this as a resource from the analyzer and sync it in build.xml | ||
fiets fiets | ||
bromfiets bromfiets | ||
ei eier | ||
kind kinder |
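Each line of this override file pairs a word with the stem it should receive, taking precedence over the regular stemmer. A minimal sketch of that lookup, assuming whitespace-separated pairs as rendered above; `load_overrides`, `stem`, and the toy fallback used in the usage note are hypothetical and stand in for the real Dutch stemmer.

```python
from typing import Callable, Dict


def load_overrides(text: str) -> Dict[str, str]:
    """Parse 'word stem' lines, skipping comments and blank lines."""
    overrides: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        word, stem_value = line.split()      # assumes exactly two fields
        overrides[word] = stem_value
    return overrides


def stem(token: str, overrides: Dict[str, str], fallback: Callable[[str], str]) -> str:
    """Use the override when one exists, otherwise the fallback stemmer."""
    if token in overrides:
        return overrides[token]
    return fallback(token)
```

With the entries above loaded, `stem("ei", overrides, fallback)` returns `eier` regardless of what the fallback would have produced, and `fiets` maps to itself, which protects it from being altered by the stemmer.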