Search server prototype using data from LTI's official repository.
- Elasticsearch 5.4
- Perl 5.24.1
- openjdk 1.8.0
wget -qO - | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list
sudo apt-get update && sudo apt-get install elasticsearch
sudo apt-get install curl
cd /usr/share/elasticsearch
sudo apt install git
git clone git://
sudo git clone git://
cd elasticsearch-head
sudo apt-get install docker
sudo docker install
sudo apt install
sudo docker run -p 9100:9100 mobz/elasticsearch-head:5 &
sudo ufw allow 9100
sudo apt-get update && sudo apt-get install kibana
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-attachment
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service
sudo systemctl status elasticsearch.service
sudo ufw allow 9200
sudo vi /etc/elasticsearch/elasticsearch.yml
and add the following:
http.cors.enabled: true
http.cors.allow-origin: /.*/
http.cors.allow-credentials: true
sudo systemctl start elasticsearch.service
sudo systemctl status elasticsearch.service
tail -f /var/log/elasticsearch/elasticsearch.log
curl -XGET localhost:9200/
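Elasticsearch can take a little while after `systemctl start` before it answers on port 9200, so the first curl may be refused. A minimal sketch of a retry helper follows; the function name `retry` and its arguments are made up for illustration and are not part of the original setup.

```shell
# Retry a command until it succeeds or the attempts run out.
# $1 = maximum number of attempts, $2 = delay in seconds between attempts,
# remaining arguments = the command to run.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while ! "$@"; do
    # Give up once the last allowed attempt has failed.
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For example, `retry 30 2 curl -sf localhost:9200/` would poll for up to about a minute before giving up.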
sudo /bin/systemctl enable kibana.service
sudo vi /etc/kibana/kibana.yml
and add the following: ""
sudo systemctl status elasticsearch.service
sudo systemctl start kibana.service
sudo systemctl status kibana.service
sudo ufw allow 5601
sudo ufw status
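On the local Windows machine:
- Navigate to http://elasticsearchIP:9100/ to browse the contents of Elasticsearch with elasticsearch-head (Elasticsearch 5.x no longer serves site plugins under /_plugin/, so head runs standalone on port 9100)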
- Navigate to to issue commands against Elasticsearch
- Change the IP address to that of the Elasticsearch server
- Issue the following to delete and create the ingest attachment pipeline
DELETE _ingest/pipeline/attachment
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "file",
        "indexed_chars": -1
      }
    }
  ]
}
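If the console is not reachable, the same pipeline can probably be created from the server's shell with curl. This is a minimal sketch under assumptions: the function names `pipeline_body` and `create_pipeline` are made up for illustration, and the host is assumed to default to localhost:9200.

```shell
# Print the ingest pipeline definition as JSON (same body as the console request).
pipeline_body() {
  cat <<'EOF'
{
  "description": "Extract attachment information",
  "processors": [
    { "attachment": { "field": "file", "indexed_chars": -1 } }
  ]
}
EOF
}

# Create (or overwrite) the pipeline.
# $1 = host:port, assumed to default to localhost:9200.
create_pipeline() {
  pipeline_body | curl -s -XPUT "http://${1:-localhost:9200}/_ingest/pipeline/attachment" \
    -H 'Content-Type: application/json' --data-binary @-
}
```

Running `create_pipeline localhost:9200` should report that the request was acknowledged.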
- Issue the following to delete and create an index with the relevant fields
PUT /lti
{
  "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 }},
  "mappings" : {
    "_default_" : {
      "properties" : {
        "date" : {"type": "string", "index" : "not_analyzed" },
        "title" : {"type": "string", "index" : "not_analyzed" },
        "url" : {"type": "string", "index" : "not_analyzed" },
        "description" : { "type" : "string" },
        "duration" : { "type" : "string" }
      }
    },
    "document": {
      "properties": {
        "file": {
          "type": "text"
        }
      }
    }
  }
}
Issue the following to check that a document can be added to the index (the "file" field holds the Base64-encoded attachment):
PUT /lti/en/1?pipeline=attachment
{
  "date": "2013/05/23",
  "title": "What is The Venus Project?",
  "url": "",
  "description": "A dynamic text video explaining TVP in 83 seconds",
  "duration": "1:23",
  "file" : "77u/MQowMDowMDowMCwwOTMgLS0+IDAwOjAwOjA0LDAzOApXaGF0IGlzIFRoZSBWZW51cyBQcm9qZWN0PwoKMgowMDowMDowNCwxNzggLS0+IDAwOjAwOjA3LDUzMgpUaGUgVmVudXMgUHJvamVjdCBvZmZlcnMgYSBuZXcgc29jaW8tZWNvbm9taWMgc3lzdGVtIHRoYXQgaXNuJ3Q6CgozCjAwOjAwOjA3"
}
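The attachment processor expects the "file" field to contain Base64 text. A minimal sketch of producing such a value from a local file follows; the function names `encode_attachment` and `document_body` are made up for illustration, and only two of the fields are filled in to keep it short.

```shell
# Base64-encode a file onto a single line
# (newlines are stripped so the result fits inside a JSON string).
encode_attachment() {
  base64 < "$1" | tr -d '\n'
}

# Print a minimal document body for the attachment pipeline.
# $1 = title, $2 = path of the file to attach.
# Assumption: the title contains no double quotes or backslashes,
# because no JSON escaping is done here.
document_body() {
  printf '{ "title": "%s", "file": "%s" }\n' "$1" "$(encode_attachment "$2")"
}
```

The output can be piped to curl, for example `document_body "Test" subtitles.srt | curl -s -XPUT 'http://localhost:9200/lti/en/1?pipeline=attachment' -H 'Content-Type: application/json' --data-binary @-`, where subtitles.srt is a hypothetical file name.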
Issue the following to see what is in Elasticsearch:
POST _search
{
  "query": {
    "match_all": {}
  }
}
Issue the following to search for the word "socio" (the word will be found in the attachment content):
POST /_search?pretty=true
{
  "query" : {
    "query_string" : {
      "query" : "socio"
    }
  },
  "highlight" : {
    "fields" : {
      "attachment.content": {
        "fragment_size": 150,
        "number_of_fragments": 3,
        "no_match_size": 150
      }
    }
  }
}
Issue the following to search for the phrase "socio-economic system":
POST /lti/en/_search?pretty=true
{
  "_source": {
    "includes": [ "title" ],
    "excludes": [ "file" ]
  },
  "query" : {
    "match_phrase": {
      "attachment.content": "socio-economic system"
    }
  },
  "highlight" : {
    "fields" : {
      "file" : {}
    }
  }
}
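The phrase search can also be run from a shell with curl, which is handy for scripting. This is a sketch under assumptions: the function names `phrase_query` and `search_phrase` are made up for illustration, the host is assumed to default to localhost:9200, and the highlight section is left out for brevity.

```shell
# Print a match_phrase query against the extracted attachment text.
# $1 = phrase. Assumption: it contains no double quotes or backslashes,
# because no JSON escaping is done here.
phrase_query() {
  printf '{ "_source": { "excludes": [ "file" ] }, "query": { "match_phrase": { "attachment.content": "%s" } } }\n' "$1"
}

# Run the search against the lti index.
# $1 = phrase, $2 = host:port, assumed to default to localhost:9200.
search_phrase() {
  phrase_query "$1" | curl -s -XPOST "http://${2:-localhost:9200}/lti/en/_search?pretty=true" \
    -H 'Content-Type: application/json' --data-binary @-
}
```

For example: `search_phrase "socio-economic system"`.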
Issue the following to delete the lti index and test document:
DELETE /lti
Re-create the lti index and mapping (as before).
Prerequisites for the main script on Ubuntu:
sudo apt-get install libhtml-tableextract-perl
sudo apt-get install libwww-perl
sudo apt-get install libjson-perl
sudo apt-get install libhtml-tokeparser-simple-perl
sudo apt-get install libfile-slurp-perl
sudo apt-get install cpanminus
sudo cpanm WWW::JSON
sudo cpanm String::Util
Copy the script over to the server and give it 755 permissions.
Then run by - perl elasticsearchIP workingPath - e.g. sudo perl localhost /tmp
This should import the records from the official repository. An index (equivalent to a database) containing data for 304 videos will take up about 40MB.
Re-issue the keyword searches and you should get a few more results.
To check from Windows whether the ports are reachable, start PowerShell and issue the following commands, modifying the port as appropriate:
- $t = New-Object Net.Sockets.TcpClient "", 9200
- $t.Connected
If you want to make the health of the cluster green, update the number of replicas to 0 for the kibana index:
PUT /.kibana/_settings
{ "index" : { "number_of_replicas" : 0 } }
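To confirm the result from a shell, the cluster health API can be queried with curl. A minimal sketch follows; the function names `health_status` and `cluster_status` are made up for illustration, and the host is assumed to default to localhost:9200.

```shell
# Extract the "status" value (green, yellow or red)
# from a _cluster/health response read on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Query the cluster health and print only the status.
# $1 = host:port, assumed to default to localhost:9200.
cluster_status() {
  curl -s "http://${1:-localhost:9200}/_cluster/health" | health_status
}
```

`cluster_status localhost:9200` should print the current status, which is expected to turn green once no index is waiting for replicas that a single node cannot allocate.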