Skip to content

Latest commit

 

History

History
193 lines (116 loc) · 7.06 KB

README.md

File metadata and controls

193 lines (116 loc) · 7.06 KB

git-header

SurfSense

Well when I’m browsing the internet, I tend to save a ton of content—but remembering when and what you saved? Total brain freeze! That’s where SurfSense comes in. SurfSense is a Personal AI Assistant for anything you see (Social Media Chats, Calender Invites, Important Mails, Tutorials, Recipies and anything ) on the World Wide Web. Now, you’ll never forget any browsing session. Easily capture your web browsing session and desired webpage content using an easy-to-use cross browser extension. Then, ask your personal knowledge base anything about your saved content, and voilà—instant recall!

Video

surfv3.mp4

Key Features

  • 💡 Idea: Save any content you see on the internet in your own personal knowledge base.
  • ⚙️ Cross Browser Extension: Save content from your favourite browser.
  • 🔍 Powerful Search: Quickly find anything in your Web Browsing Sessions.
  • 💬 Chat with your Web History: Interact in Natural Language with your saved Web Browsing Sessions and get cited answers.
  • 🔔 Local LLM Support: Works Flawlessly with Ollama local LLMs.
  • 🏠 Self Hostable: Open source and easy to deploy locally.
  • 📊 Advanced RAG Techniques: Utilize the power of Advanced RAG Techniques.
  • 🔟% Cheap On Wallet: Works Flawlessly with OpenAI gpt-4o-mini model and Ollama local LLMs.
  • 🕸️ No WebScraping: Extension directly reads the data from DOM to get accurate data.

How to get started?


UPDATE 25 SEPTEMBER 2024:

  • Thanks @hnico21 for adding Docker Support

Docker Support

  1. Setup SurfSense-Frontend/.env and backend/.env
  2. Run docker-compose build --no-cache.
  3. After building image run docker-compose up -d
  4. Now connect the extension with docker live backend url by updating ss-cross-browser-extension/.env and building it.

UPDATE 20 SEPTEMBER 2024:

  • SurfSense now works on Hierarchical Indices.
  • Knowledge Graph dependency is removed for now until I find some better Graph RAG solutions.
  • Added support for Local LLMs

Until I find a good host for my backend you need to setup SurfSense locally for now.


Backend

For authentication purposes, you’ll also need a PostgreSQL instance running on your machine.

Now lets setup the SurfSense BackEnd

  1. Clone this repo.
  2. Go to ./backend subdirectory.
  3. Setup Python Virtual Environment
  4. Run pip install -r requirements.txt to install all required dependencies.
  5. Update/Make the required Environment variables in .env
ENV VARIABLE Description
IS_LOCAL_SETUP 'true' for Ollama mistral-nemo or 'false' from OpenAI gpt-4o-mini
POSTGRES_DATABASE_URL postgresql+psycopg2://user:pass@host:5432/database
API_SECRET_KEY Can be any Random String value. Make Sure to remember it for as you need to send it in user registration request to Backend for security purposes.
  1. Backend is a FastAPI Backend so now just run the server on unicorn using command uvicorn server:app --host 0.0.0.0 --port 8000
  2. If everything worked fine you should see screen like this.

backend


FrontEnd

For local frontend setup just fill out the .env file of frontend.

ENV VARIABLE DESCRIPTION
NEXT_PUBLIC_API_SECRET_KEY Same String value your set for Backend
NEXT_PUBLIC_BACKEND_URL Give hosted backend url here. Eg. http://127.0.0.1:8000
NEXT_PUBLIC_RECAPTCHA_SITE_KEY Google Recaptcha v2 Client Key
RECAPTCHA_SECRET_KEY Google Recaptcha v2 Server Key

and run it using pnpm run dev

You should see your Next.js frontend running at localhost:3000

Make sure to register an account from frontend so you can login to extension.


Extension

Extension is in plasmo framework which is a cross browser extension framework.

For building extension just fill out the .env file of frontend.

ENV VARIABLE DESCRIPTION
PLASMO_PUBLIC_BACKEND_URL SurfSense Backend URL eg. "http://127.0.0.1:8000"

Build the extension for your favorite browser using this guide: https://docs.plasmo.com/framework/workflows/build#with-a-specific-target

When you load and start the extension you should see a Login page like this

extension login

After logging in you will need to fill your OpenAPI Key. Fill random value if you are using Ollama.

ext-settings

After Saving you should be able to use extension now.

ext-home

Options Explanations
Search Space Think of it like a category tag for the webpages you want to save.
Clear Inactive History Sessions It clears the saved content for Inactive Tab Sessions.
Save Current Webpage Snapshot Stores the current webpage session info into SurfSense history store
Save to SurfSense Processes the SurfSense History Store & Initiates a Save Job
  1. Now just start browsing the Internet. Whatever you want to save any content take its Snapshot and save it to SurfSense. After Save Job is completed you are ready to ask anything about it to SurfSense 🧠.

  2. Now go to SurfSense Dashboard After Logging in.

DASHBOARD OPTIONS DESCRIPTION
Playground See saved documents and can have chat with multiple docs.
Search Space Chat Used for questions about your content in particular search space.
Saved Chats All your saved chats.
Settings If you want to update your Open API key.

Screenshots

Playground

front-dash

Search Spaces Chat (Ollama LLM)

citations

front-spacechat

Multiple Document Chat (Ollama LLM)

multidocs-localllm

Saved Chats

front-savedchat

Tech Stack

  • Extenstion : Manifest v3 on Plasmo
  • BackEnd : FastAPI with LangChain
  • FrontEnd: Next.js with Aceternity.

Architecture:

In Progress...........

Future Work

  • Implement Canvas.
  • Add support for file uploads QA.
  • Shift to WebSockets for Streaming responses.
  • Based on feedback, I will work on making it compatible with local models. [Done]
  • Cross Browser Extension [Done]
  • Critical Notifications [Done | PAUSED]
  • Saving Chats [Done]
  • Basic keyword search page for saved sessions [Done]
  • Multi & Single Document Chat [Done]

Contribute

Contributions are very welcome! A contribution can be as small as a ⭐ or even finding and creating issues. Fine-tuning the Backend is always desired.