datacopy question (and documentation) #198
Comments
The API Reference section is indeed mostly for developers, and it is so stated in the opening paragraph. However, there are some sections, e.g., Table, that have sample YAML. |
I like that structure. I'd also add a top-level "Getting Started" page which runs through installing via pip and running yamltodb with a sample yaml file. |
If we move the pages around, we'll probably want to redirect the old URLs to new ones. This page describes how to do it. From what I can tell, we'd want "Page Redirects", not "Exact Redirects". |
I've got a start on it here: I'd like to rename the index.rst page to be "Overview", and the existing overview page renamed to features. There's more I'd like to change, but I don't want to jump too far ahead. What do you think so far? |
IIRC index.rst is so-named because it becomes the home page of the documentation, i.e., so that you can visit I like the idea of a "Getting Started" and I think it could replace or be based on the current install.rst. Side note: I'd remove the "For development" from the Summary of that page, but still include GitHub installations instructions for users (somewhere), since sometimes they have problems doing that when trying out a fix (see issue #195, last comment). I'd probably put the Known Issues after the User Docs or as one of the closing sections in the latter. I'm not too concerned about the redirects. We can probably do this work in a |
Looks like ReadTheDocs doesn't allow the index page to reference itself, so I'm thinking about duplicating some of the content into the Overview/Features page. Not completely sure about this though. Getting Started is done. I'll move/remove the install page at some point. I'll include the docs for running pyrseas from git, but somewhere in an "advanced" section. I'm thinking it's not common to do, so let's not overload users with details they don't need to worry about. Agree about Known Issues. Sounds good about redirects. Definitely easier. |
I've created the |
Do you prefer multiple PRs? I feel there's more to do. I think a command prompt makes it clearer what users should do. Would a Linux prompt be better, for example: |
Looks like I've updated the PR unintentionally. I'm not a regular PR user, so this is just due to ignorance. I'd still like to add a Reference section under User Docs, that lists every table, sequence, type, extension, and their attributes. |
I prefer a plain dollar sign prompt because it's used by most Unix and Linux shells (Cygwin too IIRC), and also by VMS :-). |
I think the Getting Started needs to go under User, and Known Issues too. In outline, it would look something like this:
I'm not entirely happy with "For Users". I also considered "How to Use", "User Instructions", "User Information" but none of them struck a chord either. The Facilities section can be created from the "Description" subsections of dbtoyaml.rst, yamltodb.rst and dbaugment.rst, i.e., it's intermediate between the Getting Started and the Command Line Reference but hopefully avoiding much repetition. I would also change the Getting Started to introduce dbtoyaml first. Running it against an empty database will output the standard public schema comment and the plpgsql extension, so the user can then either create a table with SQL or edit the YAML output and run yamltodb. I'm not sure what you ultimately want to achieve with the Schema Reference section, so please elaborate. I think that instead of repeating much of the PG CREATE statements documentation, it may be preferable to include the sample YAMLs that are already in the API ref section, except that instead of using a JSON/Python dict format, it would be formatted like standard indented YAML and either annotated or described in subsequent paragraphs. |
Agreed, plain $ is better -- for me it removes the text clutter. I haven't completely decided how to document the yaml, but definitely will have lots of yaml examples. Some properties, like identity (on a column), probably need some text describing the two allowed strings. How about this: a datatype for each attribute, example yaml with the equivalent CREATE TABLE (or SEQUENCE, etc) for the yaml.
There's a good chance I'm going to get things wrong with all the possible yaml attributes. Unfortunately I don't know Python, so I'm doing my best to read through it. Facilities and Getting Started have some overlap in answering, "what's this all about?". I want to point out that given a yaml file, the yamltodb program will alter the database. Maybe I could change the description sections to be more explicit (show the yaml file, run yamltodb, and then show the output). Then make Getting Started more a Tutorial area (or How To). 1) Install components, 2) reverse engineer db, 3) add a column and deploy. dbaugment is an example where I don't know what it does. Does it change the yaml files? Does it touch the database directly? I think an example walk through would clear this up. Similarly, a walk through with yamltodb (showing the yaml and the command you run) makes things much clearer. |
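To make the "example yaml with the equivalent CREATE TABLE" idea concrete, here is a minimal sketch in Python. It assumes the table spec has the dict shape that dbtoyaml output would load into (columns as a list of single-key mappings); the `film` table and the `create_table_sql` helper are made up for illustration and this is not how yamltodb itself generates SQL.

```python
# Hypothetical spec, written as the Python dict a YAML file would load into.
# Assumption: each column is a single-key mapping of name -> attributes.
spec = {
    "schema public": {
        "table film": {
            "columns": [
                {"id": {"type": "integer", "not_null": True}},
                {"title": {"type": "text"}},
            ]
        }
    }
}


def create_table_sql(schema_key, table_key, table):
    """Render a simplified CREATE TABLE statement for one table mapping."""
    schema = schema_key.split(" ", 1)[1]  # "schema public" -> "public"
    name = table_key.split(" ", 1)[1]     # "table film" -> "film"
    lines = []
    for entry in table["columns"]:
        (col, attrs), = entry.items()     # unpack the single name/attrs pair
        line = f"    {col} {attrs['type']}"
        if attrs.get("not_null"):
            line += " NOT NULL"
        lines.append(line)
    return f"CREATE TABLE {schema}.{name} (\n" + ",\n".join(lines) + "\n);"


table = spec["schema public"]["table film"]
# Step 3 of the tutorial outline: add a column by editing the spec.
table["columns"].append({"rating": {"type": "text"}})
print(create_table_sql("schema public", "table film", table))
```

A docs page per object type could pair each YAML sample with output like this, so readers see the attribute, its datatype and the SQL it corresponds to side by side.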
I will respond later to the above but I have a quick question: I'm wondering to what extent have you used or are you using |
It’s all about the yaml ;) I can right-click on a yaml file (a table for example) and see the git history of all the changes I’ve hand modified from the beginning. I think this explains things. I’ve struggled to add new things to my yaml like triggers bc I couldn’t see how to write it. I feel like I’m reverse engineering Pyrseas by using dbtoyaml to find out what the yaml would be. It’s a way of working that I don’t see anywhere. I don’t know why bc I think everyone should do this. So for the documentation, I’d love to promote this way of working, but I don’t know how you feel about that. Maybe at a minimum we have pages for the different ways we envision users using Pyrseas. |
Microsoft has a product similar to yours, SSDT, and they also don't have pre & post SQL scripts. They went down a route that they called refactor log, which is where I think you might be headed. However, I'd discourage it, b/c it can't handle everything very well, https://stackoverflow.com/questions/23768919/with-ssdt-how-do-i-create-a-column-with-a-unique-constraint, and it's in a language that is custom to SSDT. Pre & Post SQL scripts are easy to understand and can handle any scenario. |
Regarding overlap between Getting Started and Facilities, I see the former as a quick start guide/how to (maybe it can be called Quick Start). For example, no need for PG install instructions, except for a short "see here". Why? Because Pyrseas only runs under Postgres, so I would expect people who find it, already have some version of it installed. Python may not be installed, so a brief instruction on installing Py 3 should suffice. Then create database, SQL create table with one or two columns, run dbtoyaml saving output, edit YAML to do something simple, like add a NOT NULL or make a PRIMARY KEY, and then run yamltodb to see the generated output (perhaps with a comment that adding Facilities OTOH is a much more detailed intro, discussing each utility and how they can be used (and perhaps including the more detailed Installation steps first). That's why I suggested taking the descriptive sections of the Command Reference pages. If you prefer, it can be titled Features, but maybe we should put the text together and then come up with a suitable title. This is going to be the meat of the user manual, the rest being more like a reference manual. |
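The quick-start sequence described above can be sketched as a short list of shell commands, built here in Python so the database name and file path are quoted safely. This is only an outline under assumptions: the table definition, the `demo` names and the `quick_start_commands` helper are hypothetical, and only the basic positional forms of dbtoyaml and yamltodb are used.

```python
from shlex import quote


def quick_start_commands(dbname, spec_path):
    """Return the quick-start shell commands, in the order a user runs them."""
    db, spec = quote(dbname), quote(spec_path)
    return [
        f"createdb {db}",
        # Hypothetical starter table with two columns.
        f"psql -d {db} -c 'CREATE TABLE t1 (c1 integer, c2 text)'",
        # Save the YAML description of the database.
        f"dbtoyaml {db} > {spec}",
        # (edit the YAML file here, e.g. add a NOT NULL to a column)
        # Show the SQL needed to make the database match the edited YAML.
        f"yamltodb {db} {spec}",
    ]


for command in quick_start_commands("demo", "demo.yaml"):
    print("$", command)
```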
Regarding |
Regarding promoting using I think YAML examples are good, and if you want to add more, there's a (near) perfect place for you to look: the unit tests under |
I see why you want to talk about dbtoyaml first - it's what you envision users to do. dbaugment: "you specify them in a separate file that can be merged with the existing database to generate YAML that is then fed into yamltodb to actually modify it" It sounds like you use it in place of dbtoyaml. It works the same, but takes another input file with the augment information. You make some valid points about maintaining the documentation -- it'll be extra work to keep them up to date. I had a look at JSON schemas as an alternative to documentation, but intellisense in vscode isn't quite there with yaml. Works great with JSON files (*.json). So to get this to work well, Pyrseas would have to read *.json files in --multiple-files mode. I'm not sure what you think about handling json schemas and *.json files. |
YAML is compatible with JSON and although As I see it, the user YAML reference documentation should (a) describe a particular PG object, say |
I prefer walk-through documentation (create this file, create that file, run dbaugment, see you have extra columns/triggers/etc in the newly created file) more than descriptions. It's more concrete I guess. But you could have both. Yeah, I like writing/reading yaml more than json as well. Fewer symbols make it easier to digest. Documenting the yaml isn't interesting to someone who uses dbtoyaml rather than editing yaml by hand. If you know what you created (a table for instance), then the generated yaml is familiar. However, I still want to edit the yaml by hand. If I had to choose, I'd prefer to use a json schema to give me intellisense in vscode for my yaml files, rather than documentation. I know what a foreign key is. I just want to know how to write it in yaml. Ctrl-Space -> oh, it's called "foreign_keys". What if I were to create a json schema file, would you include it in the project? In particular, update the tests to validate the schema whenever they output yaml? In practical terms, it'd mean having a separate dependency to handle the schema, https://stackoverflow.com/questions/3262569/validating-a-yaml-document-in-python |
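As a rough stand-in for what schema-driven completion gives you, the sketch below flags unrecognized table-level keys and suggests the closest known spelling using Python's standard `difflib`. It is not JSON Schema validation, and the `TABLE_KEYS` list is an assumed subset: only "foreign_keys" is named in this thread.

```python
from difflib import get_close_matches

# Assumed subset of table-level keys, for illustration only.
TABLE_KEYS = [
    "columns",
    "primary_key",
    "foreign_keys",
    "unique_constraints",
    "check_constraints",
    "indexes",
    "triggers",
]


def unknown_keys(table):
    """Map each unrecognized key in a table mapping to close known spellings."""
    return {
        key: get_close_matches(key, TABLE_KEYS)
        for key in table
        if key not in TABLE_KEYS
    }


# A hand-edited table with a misspelled key.
table = {"columns": [], "foreign_key": {}}
for key, suggestions in unknown_keys(table).items():
    print(f"unknown key {key!r}, did you mean {suggestions}?")
```

A real JSON schema would also cover nesting and value types; this only shows the kind of feedback a hand editor is after.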
I've been working on a JSON schema for the yaml (attached). pyrseas.schema.schema.json.txt It makes it much easier to write the yaml. If you want to have a play:
To try it out, create a file called table.test.yaml and hit Ctrl-Space whenever you want a suggestion. The attached schema isn't exhaustive, but for a play, it'll do most things you'll be interested in. It covers everything in my personal project and the pagila schema in the tests. |
I'm sorry, Clay, but let's just say I'm too set in my ways. I do have VS on a Windows laptop but it lays mostly dormant. So I think you'll need to explain to me how do you envision those two files being distributed and used with the existing Pyrseas code and how will it affect the documentation. From a distribution standpoint, I presume it would fit into what Python calls As far as documentation, it looks like you would not have what is now the Lastly is the issue of tests. Currently, unit tests use Python dicts which are compatible with JSON and therefore with YAML. The only tests that generate YAML or take YAML as input are the functional ones, and they all use the same approach: create/alter a "source" database, run |
Just to avoid a potential headache: VS Code is completely different from Visual Studio. I think VS Code is written in JavaScript, so it is probably totally different from what you have installed. The idea with the schema files would be to publish them to a website, so there is no need to distribute them to users. Hopefully there is a free site for schemas or static file web hosting. Worst case scenario would be Amazon S3 hosting from my account. In theory other text editors could implement this, but with a quick Google search, I didn't see anything. It seems that JSON files are getting the json-schema support first (json-schema is still very young, just a draft).
It's a good point about Python coding. I know I won't be able to do this, having had a look around the code. So what I'll ask is if you could update the tests to validate any generated yaml (the purpose being to validate the schema rather than the yaml). If you don't have time (I know you've already spent lots of time on this project), then I'd just publish the schemas and add docs if people want to configure this. And just for VSCode to start. If someone wants to figure out another IDE, then I think it's reasonable that they can help with the docs. The downside would be that the code and schema could become out of sync. Worst case is there will be a GitHub issue to fix it, which wouldn't take me long to do. The schema wouldn't (and I think shouldn't) affect the running of the app. So no need for spec-reference section. :) But I agree samples are good. |
If you're going to the trouble of creating the files, we might as well distribute them with the Pyrseas package. It's not a big deal, like I said, they will be marked as Did a quick look for an Atom package and found json-schema among others (there's also YAML Atom-IDE support and others). Sublime Text has Schema Validator, Vim has vison. For Emacs there's flymake-json. I may install the latter, but no promises. As I wrote before, none of the Pyrseas tests currently generate YAML, only Python dicts. The functional tests use |
It'll be easier to describe the file path using a URL, so that it is the same for everybody. Also, if the settings file is committed to source control, the path to the schema may not be the same for everyone. Lastly, some tools allow the path to the schema to be in the yaml file itself ($schema property), which ideally would be a URL. OK, let's skip modifying the tests. |
I'm not sure to what "file path" you're referring. Is that a path to a validation file, analogous to XML uses for validation? |
Yes that’s what I mean. |
I've pushed up the complete schema for the yaml files in my pull request. In terms of hosting, you may want to consider http://schemastore.org/json/, which will host the schema files for us. From what I can tell, you do a pull request to a github repo to add/update a json schema. |
Although I merged your latest pull request, I haven't yet had the time to look at it, but I just ran (again) into something that makes me (again) doubt the advisability of simply editing YAML files and not using |
Here are other much simpler cases. If you have a CHAR(n) column and a VARCHAR(n) column and want to specify the default for both as (in SQL) |
I had a similar situation happen to me last week. I think I had to write a check constraint as something non-obvious like ‘-1’::integer, to your point. |
I'm struggling to get datacopy to work. Nothing seems to happen. Can you provide an example?
In general, the documentation seems to be targeted to developers of Pyrseas, rather than users of Pyrseas.
Would you be interested in a pull request to move these docs to a Developers sub area and have *.yaml examples for users (tables, foreign keys, columns, sequences, etc) with all the supported attributes?
As a concrete example, this page, https://pyrseas.readthedocs.io/en/latest/column.html, doesn't say that not_null is a boolean and "name" isn't a yaml property (like the way not_null is). There isn't a way someone can write the yaml based on the docs.
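To illustrate the point about the column page, here is a small Python sketch of the shape I'd expect the docs to spell out: the column name is the mapping key rather than a "name" property, and not_null is a boolean. The `column_problems` helper is hypothetical, and the requirement that "type" is a string is my assumption, not something the docs state.

```python
def column_problems(entry):
    """Return a list of problems found in one column entry (empty if fine)."""
    if not isinstance(entry, dict) or len(entry) != 1:
        return ["a column must be a mapping with exactly one key (its name)"]
    (name, attrs), = entry.items()
    if not isinstance(attrs, dict):
        return [f"{name}: attributes must be a mapping"]
    problems = []
    # Assumption: every column carries a string 'type'.
    if not isinstance(attrs.get("type"), str):
        problems.append(f"{name}: 'type' must be a string")
    # not_null is optional, but must be true/false when present.
    if not isinstance(attrs.get("not_null", False), bool):
        problems.append(f"{name}: 'not_null' must be a boolean")
    return problems


good = {"c1": {"type": "integer", "not_null": True}}
wrong_shape = {"name": "c1", "type": "integer"}  # "name" used as a property
wrong_type = {"c1": {"type": "integer", "not_null": "yes"}}

for entry in (good, wrong_shape, wrong_type):
    print(column_problems(entry))
```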