User defined archive access url (local branch PR) (#78)
* added custom access url input and conditional

* Custom access url + url validation + regex fix

* enhance preview deployment

* deploy preview create missing dirs

* explicitly create preview tool dir

* renaming cosmetics

* linting

* fixed linting (unreachable url)

* update planemo version

* Updated tool version number

* WIP added validator

* Update tools/archives/pyvo_integration/astronomical_archives.xml

* tests + regex

* WIP regex + timeout bypass + check on archive tables initialization

* fix for file writing bug

* fix for regex bug in python

* Delete .github/workflows/lint-and-test.yml

* remove unreachable url for linting

* Update tools/archives/pyvo_integration/astronomical_archives.py

f string instead of concat

Co-authored-by: Denys Savchenko <[email protected]>

* Apply suggestions from code review

* fix regex

---------

Co-authored-by: Denys SAVCHENKO <[email protected]>
Co-authored-by: Volodymyr <[email protected]>
Co-authored-by: Denys Savchenko <[email protected]>
4 people authored Mar 26, 2024
1 parent 472dee3 commit b77ceb5
Showing 3 changed files with 99 additions and 61 deletions.
37 changes: 0 additions & 37 deletions .github/workflows/lint-and-test.yml

This file was deleted.

75 changes: 55 additions & 20 deletions tools/archives/pyvo_integration/astronomical_archives.py
@@ -2,6 +2,7 @@
 import functools
 import json
 import os
+import re
 import signal
 import sys
 import urllib
@@ -17,6 +18,10 @@
 MAX_ALLOWED_ENTRIES = 100
 MAX_REGISTRIES_TO_SEARCH = 100

+ARCHIVES_TIMEOUT_BYPASS = [
+    "https://datalab.noirlab.edu/tap"
+]
+

 class TimeoutException(Exception):
     pass
@@ -217,28 +222,34 @@ def _set_archive_tables(self):

         self.tables = []

-        for table in self.archive_service.tables:
-            archive_table = {
-                'name': table.name,
-                'type': table.type,
-                'fields': None
-            }
-
-            fields = []
-
-            for table_field in table.columns:
-                field = {
-                    'name': table_field.name,
-                    'description': table_field.description,
-                    'unit': table_field.unit,
-                    'datatype': table_field.datatype.content
+        try:
+            for table in self.archive_service.tables:
+                archive_table = {
+                    'name': table.name,
+                    'type': table.type,
+                    'fields': None
                 }

-                fields.append(field)
+                fields = []
+
+                for table_field in table.columns:
+                    field = {
+                        'name': table_field.name,
+                        'description': table_field.description,
+                        'unit': table_field.unit,
+                        'datatype': table_field.datatype.content
+                    }
+
+                    fields.append(field)

-            archive_table['fields'] = fields
+                archive_table['fields'] = fields

-            self.tables.append(archive_table)
+                self.tables.append(archive_table)
+
+        # Exception is raised when a table schema is missing
+        # Missing table will be omitted so no action needed
+        except DALServiceError:
+            pass

     def _is_query_valid(self, query) -> bool:
         is_valid = True
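
Note on the hunk above: the `try` encloses the whole `for` loop, so the first `DALServiceError` raised during iteration ends the loop, and tables that would have come later are not collected either. A minimal sketch of that control flow follows. `SchemaError` and `iter_tables` are hypothetical stand-ins for pyvo's `DALServiceError` and `archive_service.tables`; where pyvo actually raises is not shown in this diff.

```python
class SchemaError(Exception):
    """Hypothetical stand-in for pyvo's DALServiceError."""


def iter_tables():
    """Hypothetical table source that fails after two tables."""
    yield "tap_schema.tables"
    yield "ivoa.obscore"
    raise SchemaError("table schema is missing")


def collect_tables():
    tables = []
    try:
        for name in iter_tables():
            tables.append(name)
    except SchemaError:
        # Same shape as the patched method: swallow the error and keep
        # whatever was collected before it was raised.
        pass
    return tables


collected = collect_tables()
print(collected)  # ['tap_schema.tables', 'ivoa.obscore']
```

If skipping only the failing table were the goal, the `try` would need to sit inside the loop body, which only helps when the error is raised per table rather than by the iterator itself.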
@@ -507,6 +518,20 @@ def _set_archive(self):
             self._archives.append(
                 TapArchive(access_url=self._service_access_url))

+        elif self._archive_type == 'custom':
+            self._service_access_url = \
+                self._json_parameters['archive_selection']['access_url']
+
+            if Utils.is_valid_url(self._service_access_url):
+                self._archives.append(
+                    TapArchive(access_url=self._service_access_url))
+            else:
+                error_message = "archive access url is not a valid url"
+                Logger.create_action_log(
+                    Logger.ACTION_ERROR,
+                    Logger.ACTION_TYPE_ARCHIVE_CONNECTION,
+                    error_message)
+
         else:
             keyword = \
                 self._json_parameters['archive_selection']['keyword']
@@ -752,6 +777,11 @@ def run(self):

             for archive in self._archives:
                 try:
+
+                    if archive.access_url in ARCHIVES_TIMEOUT_BYPASS:
+                        archive.get_resources = \
+                            timeout(40)(TapArchive.get_resources.__get__(archive))  # noqa: E501
+
                     _file_url, error_message = archive.get_resources(
                         self._adql_query,
                         self._number_of_files,
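
Note on the hunk above: `TapArchive.get_resources.__get__(archive)` produces a method bound to that one instance, and assigning the wrapped result to `archive.get_resources` shadows the class method for that object only. The sketch below illustrates the mechanism under stated assumptions: `Archive` is a hypothetical stand-in for `TapArchive`, and `tagged` is a hypothetical decorator factory used in place of the file's `timeout(seconds)`, whose definition is not part of this diff.

```python
import functools


def tagged(label):
    """Hypothetical stand-in for the timeout(seconds) decorator factory."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return f"{label}:{func(*args, **kwargs)}"
        return wrapper
    return decorator


class Archive:
    """Hypothetical stand-in for TapArchive."""

    def __init__(self, access_url):
        self.access_url = access_url

    def get_resources(self, query):
        return f"{self.access_url}?{query}"


BYPASS = ["https://datalab.noirlab.edu/tap"]

archives = [
    Archive("https://datalab.noirlab.edu/tap"),
    Archive("https://example.org/tap"),
]

for archive in archives:
    if archive.access_url in BYPASS:
        # function.__get__(instance) returns a bound method; storing the
        # wrapped bound method on the instance overrides it for this object
        # only and leaves the class attribute untouched.
        archive.get_resources = tagged("long")(
            Archive.get_resources.__get__(archive))

outputs = [a.get_resources("q") for a in archives]
print(outputs)
```

Only the first archive gets the wrapped method; the second still resolves `get_resources` through the class.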
@@ -1250,9 +1280,9 @@ def write_urls_to_output(urls: [], output, access_url="access_url"):
         with open(output, "w") as file_output:
             for url in urls:
                 try:
-                    file_output.write(url[access_url] + ',')
+                    file_output.write(str(url[access_url]) + ',')
                 except Exception:
-                    error_message = "url field not found for url"
+                    error_message = f"url field {access_url} not found for url"
                     Logger.create_action_log(
                         Logger.ACTION_ERROR,
                         Logger.ACTION_TYPE_WRITE_URL,
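
Note on the hunk above: without `str(...)`, a non-string field value (for example a numeric column such as the `random_id` used in the new test) makes the `+ ','` concatenation raise `TypeError`, which the broad `except Exception` would then report as a missing field. A simplified sketch, assuming a hypothetical `write_urls` helper that writes to any file-like object and counts skipped rows instead of logging:

```python
import io


def write_urls(urls, output, access_url="access_url"):
    """Hypothetical, simplified variant of write_urls_to_output."""
    skipped = 0
    for url in urls:
        try:
            # str() makes numeric values safe to concatenate.
            output.write(str(url[access_url]) + ',')
        except KeyError:
            skipped += 1
    return skipped


rows = [{"random_id": 12.5}, {"random_id": 7}, {"other": "x"}]

buffer = io.StringIO()
skipped = write_urls(rows, buffer, access_url="random_id")
print(buffer.getvalue())  # 12.5,7,
print(skipped)            # 1

# The pre-fix expression fails for numeric values:
try:
    rows[0]["random_id"] + ','
except TypeError as exc:
    concat_error = type(exc).__name__
print(concat_error)  # TypeError
```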
@@ -1305,6 +1335,11 @@ def collect_resource_keys(urls_data: list) -> list:
                     resource_keys.append(key)
         return resource_keys

+    @staticmethod
+    def is_valid_url(url: str) -> bool:
+        regex_url = re.compile(r'^https?://(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,6}(?::\d+)?(?:/[^\s]*)?$')  # noqa: E501
+        return re.match(regex_url, url) is not None
+

 class Logger:
     _logs = []
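
Note on `is_valid_url`: the pattern requires an `http` or `https` scheme, one or more dot-terminated host labels, and a purely alphabetic top-level domain of 2 to 6 letters, so single-label hosts and IP addresses are rejected. The sketch below copies the pattern from the hunk above and runs it on a few inputs; apart from the two URLs used in the commit's tests, the sample URLs are illustrative assumptions.

```python
import re

# Pattern copied from Utils.is_valid_url in this commit.
URL_PATTERN = re.compile(
    r'^https?://(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,6}(?::\d+)?(?:/[^\s]*)?$'
)


def is_valid_url(url: str) -> bool:
    return URL_PATTERN.match(url) is not None


samples = [
    "https://datalab.noirlab.edu/tap",     # used in the commit's tests
    "http://voparis-tap-he.obspm.fr/tap",  # used in the commit's tests
    "https://example.org:8443/tap",        # port is allowed
    "ftp://example.org/tap",               # wrong scheme
    "https://localhost:8080/tap",          # no dot-separated TLD
    "http://192.168.1.10:8080/tap",        # numeric TLD is rejected
    "https://example.org/tap with space",  # whitespace in path
]

results = {url: is_valid_url(url) for url in samples}
for url, ok in results.items():
    print(f"{ok!s:5} {url}")
```

The first three inputs are accepted and the last four are rejected.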
48 changes: 44 additions & 4 deletions tools/archives/pyvo_integration/astronomical_archives.xml
@@ -1,4 +1,4 @@
-<tool id="astronomical_archives" name="Astronomical Archives (IVOA)" version="0.9.1">
+<tool id="astronomical_archives" name="Astronomical Archives (IVOA)" version="0.10.0">
     <description>queries astronomical archives through Virtual Observatory protocols</description>
     <edam_operations>
         <edam_operation>operation_0224</edam_operation>
@@ -24,6 +24,7 @@
             <param name="archive_type" type="select" label="Archive Selection">
                 <option value="archive">Query specific IVOA archive</option>
                 <option value="registry">Query all matching IVOA archives</option>
+                <option value="custom">Query custom TAP archive</option>
             </param>
             <when value="registry">
                 <param name="keyword" type="text" label="Keyword" />
@@ -47,6 +48,11 @@
                     <options from_data_table="astronomical_archives" />
                 </param>
             </when>
+            <when value="custom">
+                <param name="access_url" type="text" label="TAP archive access url">
+                    <validator type="regex" message="URL seems invalid">^https?://[A-Za-z0-9]([A-Za-z0-9-\.]{0,61}[A-Za-z0-9])?\.[A-Za-z]{2,6}(:\d+)?(/[^\s]*)?$</validator>
+                </param>
+            </when>
         </conditional>
         <section name="query_section" title="Query selection" expanded="true">
             <conditional name="query_selection">
@@ -255,6 +261,40 @@
                 </assert_contents>
             </output>
         </test>
+        <test expect_num_outputs="2">
+            <param name="output_selection" value="c"/>
+            <param name="number_of_files" value="1"/>
+            <conditional name="archive_selection">
+                <param name="archive_type" value="custom"/>
+                <param name="access_url" value="https://datalab.noirlab.edu/tap"/>
+            </conditional>
+            <conditional name="query_selection">
+                <param name="query_type" value="raw_query" />
+                <param name="table" value="allwise.source" />
+                <param name="url_field" value="random_id" />
+            </conditional>
+            <output name="output_csv" count="1">
+                <assert_contents>
+                    <has_text_matching expression=".*\S+.*" />
+                </assert_contents>
+            </output>
+        </test>
+        <test expect_num_outputs="2">
+            <param name="output_selection" value="c"/>
+            <param name="number_of_files" value="1"/>
+            <conditional name="archive_selection">
+                <param name="archive_type" value="custom"/>
+                <param name="access_url" value="http://voparis-tap-he.obspm.fr/tap"/>
+            </conditional>
+            <conditional name="query_selection">
+                <param name="query_type" value="obscore_query" />
+            </conditional>
+            <output name="output_csv" count="1">
+                <assert_contents>
+                    <has_text_matching expression=".*\S+.*" />
+                </assert_contents>
+            </output>
+        </test>
     </tests>
     <help>

@@ -460,7 +500,7 @@ The Table Access Protocol (TAP) lets you execute queries against our database ta

 -----

-The MAST Archive at STScI TAP end point for the TESS Input Catalog.
-
-The TIC is used to help identify two-minute cadence target selection for the TESS mission, and to calculate physical and observational properties of planet candidates. It is for use by both the TESS science team and the public, and it is periodically updated – the current version is TIC-8. TIC-8 uses the GAIA DR2 catalog as a base and merges a large number of other photometric catalogs, including 2MASS, UCAC4, APASS, SDSS, WISE, etc. There are roughly 1.5 billion stellar and extended sources in TIC-8, containing compiled magnitudes including B, V, u, g, r, i, z, J, H, K, W1-W4, and G.
+The MAST Archive at STScI TAP end point for the TESS Input Catalog.The TIC is used to help identify two-minute cadence target selection for the TESS mission, and to calculate physical and observational properties of planet candidates. It is for use by both the TESS science team and the public, and it is periodically updated – the current version is TIC-8. TIC-8 uses the GAIA DR2 catalog as a base and merges a large number of other photometric catalogs, including 2MASS, UCAC4, APASS, SDSS, WISE, etc. There are roughly 1.5 billion stellar and extended sources in TIC-8, containing compiled magnitudes including B, V, u, g, r, i, z, J, H, K, W1-W4, and G.
 The TIC can be directly accessed through the Mikulski Archive for Space Telescopes (MAST), using either queries or bulk download.

 The Table Access Protocol (TAP) lets you execute queries against our database tables, and inspect various metadata. Upload is not currently supported.
@@ -543,7 +583,7 @@ Tables exposed through this endpoint include: epn_core from the gem_mars schema,

 -----

-**ArVO Byu TAP** http://arvo-registry.sci.am/tap ArVO Byurakan TAP service
+**ArVO Byu TAP** arvo-registry.sci.am/tap ArVO Byurakan TAP service

 -----

@@ -612,7 +652,7 @@ Illumination by the Sun of each face of the comet 67P/Churyumov-Gerasimenko base
 CSHP_DV_130_01_LORES_OBJ.OBJ. The service provides the cosine between the normal of each face (in the same order as the faces defined in the shape model) and the Sun direction; both
 numerical values and images of the illumination are available. Each map is defined for a given position of the Sun
 in the frame of 67P (67P/C-G_CK). Longitude 0 is at the center of each map. The code is developed by A. Beth,
-Imperial College London, UK and the service is provided by CDPP (http://cdpp.eu). Acknowlegment: The illumination models
+Imperial College London, UK and the service is provided by CDPP (cdpp.eu). Acknowlegment: The illumination models
 have been developed at the Department of Physics at Imperial College London (UK) under the financial support of STFC
 grant of UK ST/N000692/1 and ESA contract 4000119035/16/ES/JD (Rosetta RPC-PIU). We would also like to warmly
 thank Bernhard Geiger (ESA) for his support in validating the 2D-illumination maps.
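
Note on the two URL patterns in this commit: the `<validator>` regex in `astronomical_archives.xml` and the one in `Utils.is_valid_url` are not identical, so a few inputs pass one check and fail the other. The sketch below compares them with Python's `re` module. Evaluating the XML pattern this way is an assumption about how the form validator behaves, and every sample URL except the first is illustrative.

```python
import re

# Copied from Utils.is_valid_url (astronomical_archives.py).
PY_PATTERN = re.compile(
    r'^https?://(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,6}(?::\d+)?(?:/[^\s]*)?$'
)
# Copied from the <validator> element (astronomical_archives.xml).
XML_PATTERN = re.compile(
    r'^https?://[A-Za-z0-9]([A-Za-z0-9-\.]{0,61}[A-Za-z0-9])?'
    r'\.[A-Za-z]{2,6}(:\d+)?(/[^\s]*)?$'
)

samples = [
    "https://datalab.noirlab.edu/tap",      # accepted by both
    "https://a..b.org/tap",                 # empty host label
    "https://-bad.org/tap",                 # host starts with a hyphen
    "https://" + "a" * 70 + ".org/tap",     # 70-character host label
]

verdicts = {
    url: (PY_PATTERN.match(url) is not None,
          XML_PATTERN.match(url) is not None)
    for url in samples
}
for url, (py_ok, xml_ok) in verdicts.items():
    print(f"python={py_ok!s:5} xml={xml_ok!s:5} {url[:40]}")
```

In this sketch the XML pattern accepts the empty host label that the Python pattern rejects, while the Python pattern accepts the leading hyphen and the over-long label that the XML pattern rejects.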