Docker Installation
- [Required] Docker
- [Required] Notion Token
- [Optional] OpenAI API KEY
- [Optional] Google API KEY
- [Optional] Ollama service (Open source models)
- [Optional] Notion Web Clipper (highly recommended!)
- [Optional] Reddit Tokens
- [Optional] Twitter Developer Tokens (paid account only)
After creating the Notion token, go to Notion, create a page as the main entry (for example, a Readings page), and enable the Notion Integration for this page.
Check out the repo and copy .env.template to build/.env, then fill in the environment variables:
- NOTION_TOKEN
- NOTION_ENTRY_PAGE_ID
- OPENAI_API_KEY
- [Optional] REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET
- [Optional] Vars with the TWITTER_ prefix
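For illustration, a minimal build/.env might look like the fragment below. All values are placeholders; copy your own token and page ID from Notion (the variable names come from the template, the values here are made up):

```shell
# build/.env (illustrative placeholder values only)
NOTION_TOKEN=secret_xxx
NOTION_ENTRY_PAGE_ID=xxx
OPENAI_API_KEY=sk-xxx

# Optional: only needed if you pull from Reddit
REDDIT_CLIENT_ID=xxx
REDDIT_CLIENT_SECRET=xxx
```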
Double-check LLM_PROVIDER=xxx; the default is openai. You can switch to Google Gemini or Ollama, filling in the values accordingly, e.g.:
- LLM_PROVIDER=openai:
  - OPENAI_API_KEY=sk-xxx
  - OPENAI_MODEL=gpt-3.5-turbo-0125
- LLM_PROVIDER=google:
  - GOOGLE_MODEL=gemini-1.5-flash-latest
  - GOOGLE_API_KEY=xxx
- LLM_PROVIDER=ollama:
  - OLLAMA_MODEL=llama3
  - OLLAMA_URL=http://<ollama_hostname>:11434
Check EMBEDDING_PROVIDER=xxx; the default is openai. You can switch to HuggingFace or Ollama as the provider, modifying EMBEDDING_MODEL accordingly. (Tip: for HuggingFace and Ollama, make sure the embedding models are pre-downloaded and ready to use.) E.g.:
- EMBEDDING_PROVIDER=openai:
  - EMBEDDING_MODEL=text-embedding-ada-002
- EMBEDDING_PROVIDER=hf:
  - EMBEDDING_MODEL=all-MiniLM-L6-v2
- EMBEDDING_PROVIDER=ollama:
  - EMBEDDING_MODEL=nomic-embed-text
Note: Replace <ollama_hostname> with the actual Ollama service hostname, and make sure the node is accessible.
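Before deploying, the provider settings above can be sanity-checked. The helper below is a hypothetical snippet (not shipped with auto-news) that verifies the variables required by the chosen LLM_PROVIDER, plus the embedding settings, are present in the environment:

```shell
# Hypothetical helper (not part of auto-news): check that the env vars
# required by the configured LLM_PROVIDER and embedding provider are set.
check_providers_env() {
  case "$LLM_PROVIDER" in
    openai) [ -n "$OPENAI_API_KEY" ] && [ -n "$OPENAI_MODEL" ] ;;
    google) [ -n "$GOOGLE_API_KEY" ] && [ -n "$GOOGLE_MODEL" ] ;;
    ollama) [ -n "$OLLAMA_MODEL" ] && [ -n "$OLLAMA_URL" ] ;;
    *) echo "unknown LLM_PROVIDER: $LLM_PROVIDER" >&2; return 1 ;;
  esac || return 1
  # Both embedding vars must be set regardless of provider
  [ -n "$EMBEDDING_PROVIDER" ] && [ -n "$EMBEDDING_MODEL" ]
}
```

A possible way to run it is after exporting the file contents, e.g. `set -a; . build/.env; set +a; check_providers_env && echo "env looks good"`.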
Note: All make commands below are executed at the root of the auto-news source code folder.
make deps && make deploy
make start
Now that the services are running, they will pull from the sources every hour.
Go to the Notion entry page we created before, and we will see the following folder structure has been created automatically:
Readings
├── Inbox
│   ├── Inbox - Article
│   ├── Inbox - YouTube
│   └── Inbox - Journal
├── Index
│   ├── Index - Inbox
│   ├── Index - ToRead
│   ├── RSS_List
│   ├── Tweet_List
│   └── Reddit_List
└── ToRead
    └── ToRead
- Go to the RSS_List page, and fill in the RSS names and URLs
- Go to the Reddit_List page, and fill in the subreddit names
- Go to the Tweet_List page, and fill in the Tweet screen names (Tip: paid account only)
Go to the Notion ToRead database page; all the data will flow into this database later on. Create database views for the different sources to help organize the flows more easily, e.g. Tweets, Articles, YouTube, RSS, etc. You may want to watch this video to get an initial idea of how to define and customize your own database views.
Now, enjoy and have fun.
For troubleshooting, we can use the URLs below to access the services and check the logs and data.
| Service | Role            | Panel URL             |
|---------|-----------------|-----------------------|
| Airflow | Orchestration   | http://localhost:8080 |
| Milvus  | Vector Database | http://localhost:9100 |
| Adminer | DB accessor     | http://localhost:8070 |
Go to http://localhost:8080 and use the default Airflow account and password airflow to log in. (To change them, modify _AIRFLOW_WWW_USER_USERNAME and _AIRFLOW_WWW_USER_PASSWORD in the docker/docker-compose.yaml file.)
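For reference, the relevant keys in docker/docker-compose.yaml look something like the fragment below (illustrative only; the surrounding structure and default values may differ in your checkout):

```yaml
# docker/docker-compose.yaml (illustrative fragment, assumed layout)
environment:
  _AIRFLOW_WWW_USER_USERNAME: airflow
  _AIRFLOW_WWW_USER_PASSWORD: airflow
```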
Go to http://localhost:8070 and use the default account below to sign in to the database accessor. (To change it, modify the MYSQL__ env vars in the build/.env file.)
Go to http://localhost:9100 and click the Connect button to log in to the Milvus vector database management UI.
# Stop all services
make stop
# Restart all services
make stop && make start
Modify build/.env, then:
make stop && make deploy && make start
# Upgrade to the latest code and redeploy
git pull && make stop && make deploy && make start

# Rebuild the images and restart
make stop && make build && make start