Initial commit
ByteYJ authored Jun 4, 2024
0 parents commit 04d818c
Showing 38 changed files with 1,735 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .dvc/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
/config.local
/tmp
/cache
Empty file added .dvc/config
3 changes: 3 additions & 0 deletions .dvcignore
@@ -0,0 +1,3 @@
# Add patterns of files dvc should ignore, which could improve
# the performance. Learn more at
# https://dvc.org/doc/user-guide/dvcignore
35 changes: 35 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,35 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Data Resources**
Please provide any data resources, such as model files, input data or configuration files associated with this issue.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Python version [e.g. 3.8]
- Version [e.g. 22]
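
The environment details requested above can be collected with a short script. This is a minimal sketch using only the Python standard library; the function name `environment_summary` is hypothetical and not part of this project.

```python
import platform
import sys


def environment_summary():
    """Collect the OS and Python details requested in the bug report."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "executable": sys.executable,
    }


# Print the details as a Markdown list that can be pasted into the issue.
for key, value in environment_summary().items():
    print(f"- {key}: {value}")
```

The printed lines can be pasted directly under this heading when filing a report.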

**Additional context**
Add any other context about the problem here.
25 changes: 25 additions & 0 deletions .github/ISSUE_TEMPLATE/user-story.md
@@ -0,0 +1,25 @@
---
name: User Story
about: Describe context and goals for providing value to users in this project.
title: ''
labels: ''
assignees: ''

---

As a [user concerned]
I want [goal]
so that [reason]

### Timebox [optional]
What's the maximum amount of time that should be spent working on this?

### Definition of Done
A checklist of things that must happen for this story to be considered complete:
- [ ] Implement and check-in code changes
- [ ] Test code changes
- [ ] Documentation updated (if needed)
- [ ] Code review

### Subtasks
You can add additional sub-tasks if you'd like to break down the work further.
36 changes: 36 additions & 0 deletions .github/workflows/python_package.yml
@@ -0,0 +1,36 @@
name: Python package

on:
  pull_request:
  push:
    branches: [ main ]


jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v3
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install flake8
        pip install .[test]
    - name: Lint with flake8
      run: |
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
    - name: Test with pytest
      run: |
        pytest
8 changes: 8 additions & 0 deletions .github/workflows/ruff.yml
@@ -0,0 +1,8 @@
name: Ruff
on: [push, pull_request]
jobs:
  ruff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: chartboost/ruff-action@v1
129 changes: 129 additions & 0 deletions .gitignore
@@ -0,0 +1,129 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,33 @@
# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

You should also add project tags for each release on GitHub; see [Managing releases in a repository](https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository).
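
As a rough illustration of this convention, the sketch below parses release headings of the form `## [X.Y.Z] - YYYY-MM-DD` and checks that they are listed newest first. The names `parse_releases` and `is_newest_first` and the regular expression are assumptions made for illustration; they are not part of this project, and the pattern ignores Semantic Versioning pre-release and build suffixes.

```python
import re
from datetime import date

# Matches release headings such as "## [2.0.0] - 2024-05-29".
HEADING = re.compile(r"^## \[(\d+)\.(\d+)\.(\d+)\] - (\d{4})-(\d{2})-(\d{2})$")


def parse_releases(changelog_text):
    """Return ((major, minor, patch), release_date) tuples in file order."""
    releases = []
    for line in changelog_text.splitlines():
        match = HEADING.match(line.strip())
        if match:
            major, minor, patch, year, month, day = map(int, match.groups())
            releases.append(((major, minor, patch), date(year, month, day)))
    return releases


def is_newest_first(releases):
    """True when versions strictly decrease from top to bottom."""
    versions = [version for version, _ in releases]
    return all(a > b for a, b in zip(versions, versions[1:]))


sample = """\
## [2.0.0] - 2024-05-29
### Added
## [1.0.0] - 2022-05-23
"""
releases = parse_releases(sample)
print(releases[0][0], is_newest_first(releases))
```

Comparing versions as integer tuples avoids the string-ordering pitfall where `"10.0.0"` would sort before `"2.0.0"`.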

## [2.0.0] - 2024-05-29
### Added
- Added example auto-built Sphinx documentation in the `docs` folder
- GitHub workflow for running ruff linter
- A note about conda dependencies to README
- A note about using docker containers to README
- Ruff as a linter for development
### Changed
- All build and packaging switched to use only pyproject.toml
- Minimum Python version changed to 3.10
- GitHub workflow checks Python versions 3.10, 3.11, 3.12
- Updated DVC version to avoid `ImportError: cannot import name 'fsspec_loop'` in older versions
### Removed
- Removed setup.cfg

## [1.0.0] - 2022-05-23
### Added
- README and CHANGELOG
- cdstemplate packages for computing word count from input text
- corpus_counter_script.py as a user-facing script with argparse examples
- Tests of cdstemplate packages
- A GitHub workflow to trigger tests on pull requests to the main branch
- Sample text data from Project Gutenberg
- Data Version Control stage for the corpus_counter_script.py
- A sample Jupyter notebook that plots the most frequent words in the Gutenberg data
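
To give a sense of what a word-count package with an argparse front end might look like, here is a minimal sketch. It is not the actual `cdstemplate` or `corpus_counter_script.py` code, which is not shown in this part of the commit; the function names, the `--top` option, and the tokenization rule are all assumptions for illustration.

```python
import argparse
import re
from collections import Counter


def count_words(text):
    """Count case-insensitive word occurrences in a string."""
    # Assumed tokenization: runs of letters, optionally with one apostrophe part.
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def main(argv=None):
    """Hypothetical command-line entry point that totals counts across files."""
    parser = argparse.ArgumentParser(description="Count words in text files.")
    parser.add_argument("documents", nargs="+", help="paths to plain-text files")
    parser.add_argument("--top", type=int, default=10, help="number of words to print")
    args = parser.parse_args(argv)

    totals = Counter()
    for path in args.documents:
        with open(path, encoding="utf-8") as handle:
            totals.update(count_words(handle.read()))
    for word, count in totals.most_common(args.top):
        print(f"{word}\t{count}")


# Quick check of the counting function on an in-memory string.
print(count_words("The cat and the hat").most_common(1))
```

In a real script, `main()` would typically be called under an `if __name__ == "__main__":` guard; passing an explicit list such as `main(["book.txt", "--top", "5"])` makes the parser easy to exercise from tests.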
38 changes: 38 additions & 0 deletions CONTRIBUTIONS.md
@@ -0,0 +1,38 @@
# Contribution Guidelines
This is a community-driven, open source project that welcomes all contributions. Whether you're a seasoned contributor or new to the project, we're grateful for your help.

## Community standards

We are an inclusive community that values open dialogue, mutual respect, and fair treatment. Every submission will be treated equally and we encourage those with diverse backgrounds and perspectives to contribute.

We are part of the University of Massachusetts Amherst, so we adhere to the [UMass Code of Student Conduct](https://www.umass.edu/dean_students/codeofconduct).

## Getting started
Before contributing to the project, take a look at the README file, which contains information about system requirements, environment setup steps, and a project summary.

Further documentation for this project is found in the docs folder.

## Selecting an issue
Issues that are open for contribution are given the following labels:
- good-first-issue
  - Issues with this label are suited for those who do not have previous experience with the project.
- help-wanted
  - Issues with this label are open for contribution and are suited for those with prior experience contributing.

## Submitting contributions

To contribute to the project, do the following:
- [Fork and clone](https://docs.github.com/en/get-started/quickstart/fork-a-repo) the repository
- Create a [branch](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-and-deleting-branches-within-your-repository) for your issue
- Make a [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) to the main branch of the upstream repository
  - Title your pull request with the issue you fixed
    - For example, "Fixed upload error to resolve Issue #987"
  - Include a short description of the changes you made

## Issue reporting and help
Report bugs, issues, or suggested features to *insert email*.

Direct all questions to *insert email*, but keep in mind that we are a small team and may take a while to respond.



21 changes: 21 additions & 0 deletions LICENSE.md
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 University of Massachusetts Amherst, Center for Data Science

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.