Skip to content

Commit

Permalink
Reordeno ejemplos en json en una carpeta "examples" aparte, con el no…
Browse files Browse the repository at this point in the history
…mbre que tendrían en la realidad. Dejo sólo la lista de themes del superThemeTaxonomy, sin la key superThemeTaxonomy (la definición de la taxonomía es independiente del nombre de la variable).
  • Loading branch information
abenassi committed Oct 29, 2016
1 parent e9d6f57 commit dea10ed
Show file tree
Hide file tree
Showing 10 changed files with 599 additions and 130 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
.ipynb_checkpoints
309 changes: 309 additions & 0 deletions Generación de ejemplos.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Generación de ejemplos"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Genero ejemplos de metadatos a partir del ejemplo de `data.json`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import json\n",
"from pprint import pprint"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"with open(\"data.json\", \"rb\") as f:\n",
" data = json.load(f)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploración del data.json"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"13\n"
]
},
{
"data": {
"text/plain": [
"[u'rights',\n",
" u'publisher',\n",
" u'license',\n",
" u'description',\n",
" u'language',\n",
" u'title',\n",
" u'issued',\n",
" u'superThemeTaxonomy',\n",
" u'modified',\n",
" u'themeTaxonomy',\n",
" u'spatial',\n",
" u'dataset',\n",
" u'homepage']"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# campos de metadata de un catalog\n",
"print len(data)\n",
"data.keys()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"17\n"
]
},
{
"data": {
"text/plain": [
"[u'publisher',\n",
" u'landingPage',\n",
" u'description',\n",
" u'superTheme',\n",
" u'title',\n",
" u'issued',\n",
" u'temporal',\n",
" u'modified',\n",
" u'language',\n",
" u'theme',\n",
" u'keyword',\n",
" u'accrualPeriodicity',\n",
" u'spatial',\n",
" u'license',\n",
" u'contactPoint',\n",
" u'identifier',\n",
" u'distribution']"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# campos de metadata de un dataset\n",
"print len(data[\"dataset\"][0].keys())\n",
"data[\"dataset\"][0].keys()"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"12\n"
]
},
{
"data": {
"text/plain": [
"[u'accessURL',\n",
" u'rights',\n",
" u'description',\n",
" u'license',\n",
" u'title',\n",
" u'byteSize',\n",
" u'format',\n",
" u'mediaType',\n",
" u'modified',\n",
" u'downloadURL',\n",
" u'field',\n",
" u'issued']"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# campos de metadata de un distribucion\n",
"print len(data[\"dataset\"][0][\"distribution\"][0].keys())\n",
"data[\"dataset\"][0][\"distribution\"][0].keys()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3\n"
]
},
{
"data": {
"text/plain": [
"[u'type', u'description', u'title']"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# campos de metadata de un field\n",
"print len(data[\"dataset\"][0][\"distribution\"][0][\"field\"][0].keys())\n",
"data[\"dataset\"][0][\"distribution\"][0][\"field\"][0].keys()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3\n"
]
},
{
"data": {
"text/plain": [
"[u'description', u'id', u'label']"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# campos de metadata de un theme\n",
"print len(data[\"themeTaxonomy\"][0].keys())\n",
"data[\"themeTaxonomy\"][0].keys()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Separación de ejemplos en JSON"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def write_to_json(example_name, example):\n",
" with open(\"examples/{}.json\".format(example_name), \"wb\") as f:\n",
" f.write(json.dumps(example, f, indent=4, ensure_ascii=False).encode(\"utf8\"))"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"write_to_json(\"dataset\", data[\"dataset\"][0])\n",
"write_to_json(\"distribution\", data[\"dataset\"][0][\"distribution\"][0])\n",
"write_to_json(\"field\", data[\"dataset\"][0][\"distribution\"][0][\"field\"][0])\n",
"write_to_json(\"theme\", data[\"themeTaxonomy\"][0])\n",
"write_to_json(\"themeTaxonomy\", data[\"themeTaxonomy\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [Root]",
"language": "python",
"name": "Python [Root]"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
59 changes: 0 additions & 59 deletions ejemplo-field.json

This file was deleted.

4 changes: 2 additions & 2 deletions ejemplo-completo.json → examples/data.json
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@
{
"title": "organismo_unidad_operativa_contrataciones_id",
"type": "integer",
"description": "Identificador único del organismo que realiza la convocatoría. Organismo de máximo nivel jerárquico al que pertenece la unidad operativa de contrataciones."
"description": "Identificador único del organismo que realiza la convocatoria. Organismo de máximo nivel jerárquico al que pertenece la unidad operativa de contrataciones."
},
{
"title": "unidad_operativa_contrataciones_id",
Expand All @@ -115,7 +115,7 @@
{
"title": "organismo_unidad_operativa_contrataciones_desc",
"type": "string",
"description": "Organismo que realiza la convocatoría. Organismo de máximo nivel jerárquico al que pertenece la unidad operativa de contrataciones."
"description": "Organismo que realiza la convocatoria. Organismo de máximo nivel jerárquico al que pertenece la unidad operativa de contrataciones."
},
{
"title": "unidad_operativa_contrataciones_desc",
Expand Down
Loading

0 comments on commit dea10ed

Please sign in to comment.