An advanced AI-powered video surveillance and analysis system that provides automated monitoring, visual Q&A, and real-time video captioning capabilities for CCTV cameras and video feeds.
- Support for multiple video input sources:
  - Local video files
  - Webcam feeds
  - RTSP streams from IP cameras
- Real-time video captioning using the Salesforce BLIP model
- Natural language visual Q&A using the ViLT model
- Interactive Streamlit interface
- Continuous monitoring and analysis
- Clone this repository:
    git clone https://github.com/yourusername/securade-sentinel.git
    cd securade-sentinel
- Create a virtual environment and activate it:
    python -m venv venv
    source venv/bin/activate # On Windows, use: venv\Scripts\activate
- Install the required dependencies:
    pip install -r requirements.txt
- Python 3.8+
- streamlit
- torch
- transformers
- opencv-python
- pillow
- numpy
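
A quick way to confirm the environment matches the list above is to check the interpreter version and look up each module without importing it. This is a minimal sketch, not part of the repository; `REQUIRED_MODULES`, `missing_packages`, and `python_ok` are hypothetical names. Note that `opencv-python` and `pillow` install modules named `cv2` and `PIL`.

```python
import sys
from importlib.util import find_spec

# Map each pip package from the list above to the module name it installs.
# opencv-python and pillow import under different names (cv2 and PIL).
REQUIRED_MODULES = {
    "streamlit": "streamlit",
    "torch": "torch",
    "transformers": "transformers",
    "opencv-python": "cv2",
    "pillow": "PIL",
    "numpy": "numpy",
}


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names whose import module cannot be located."""
    return [pip_name for pip_name, module in modules.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8 or newer is required")
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found")
```

If anything is reported missing, rerunning `pip install -r requirements.txt` inside the activated virtual environment is the usual fix.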
- Start the application:
    streamlit run app.py
- Select your video source:
  - Upload a local video file
  - Use your webcam
  - Enter an RTSP stream URL
- The application will begin processing the video feed and displaying:
  - Live video stream
  - Real-time captions
  - Q&A interface for querying the video content
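
The three source types above can all be opened through OpenCV's `cv2.VideoCapture`, which accepts a device index, a file path, or a stream URL. The following is a sketch of how a selection might be translated into that argument; it is an assumption about the approach, not the code in `app.py`, and `resolve_source` and `open_capture` are hypothetical helper names.

```python
def resolve_source(kind, value=None):
    """Translate a source selection into the argument cv2.VideoCapture expects.

    kind is one of "webcam", "file", or "rtsp" (labels assumed for this sketch).
    """
    if kind == "webcam":
        # Webcams are addressed by integer device index; 0 is the default camera.
        return int(value) if value is not None else 0
    if kind == "file":
        if not value:
            raise ValueError("a file path is required")
        return str(value)
    if kind == "rtsp":
        url = str(value or "")
        if not url.lower().startswith(("rtsp://", "rtsps://")):
            raise ValueError("expected an rtsp:// URL")
        return url
    raise ValueError(f"unknown source kind: {kind!r}")


def open_capture(kind, value=None):
    """Open the selected source and fail early if it cannot be read."""
    import cv2  # imported lazily so resolve_source works without OpenCV

    capture = cv2.VideoCapture(resolve_source(kind, value))
    if not capture.isOpened():
        capture.release()
        raise RuntimeError(f"could not open {kind} source")
    return capture
```

A file uploaded through Streamlit arrives as an in-memory object, so it would typically be written to a temporary file first and that path passed as `value`.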
The application uses two main AI models:
- Video Captioning: Salesforce/blip-image-captioning-large
  - Generates natural language descriptions of video scenes
- Visual Q&A: dandelin/vilt-b32-finetuned-vqa
  - Answers questions about the video content in natural language
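
Both models can be loaded through the Hugging Face `transformers` library. The sketch below follows the usage shown on the two model cards and runs on a single PIL image; how `app.py` actually wires the models into the video loop is not shown in this README, so `load_models`, `caption_image`, and `answer_question` are hypothetical helpers.

```python
CAPTION_MODEL_ID = "Salesforce/blip-image-captioning-large"
VQA_MODEL_ID = "dandelin/vilt-b32-finetuned-vqa"


def load_models():
    """Download (on first use) and load both models with their processors."""
    from transformers import (
        BlipForConditionalGeneration,
        BlipProcessor,
        ViltForQuestionAnswering,
        ViltProcessor,
    )

    return {
        "caption_processor": BlipProcessor.from_pretrained(CAPTION_MODEL_ID),
        "caption_model": BlipForConditionalGeneration.from_pretrained(CAPTION_MODEL_ID),
        "vqa_processor": ViltProcessor.from_pretrained(VQA_MODEL_ID),
        "vqa_model": ViltForQuestionAnswering.from_pretrained(VQA_MODEL_ID),
    }


def caption_image(image, models, max_new_tokens=30):
    """Return a one-sentence caption for an RGB PIL image."""
    import torch

    inputs = models["caption_processor"](images=image, return_tensors="pt")
    with torch.no_grad():
        output_ids = models["caption_model"].generate(
            **inputs, max_new_tokens=max_new_tokens
        )
    return models["caption_processor"].decode(output_ids[0], skip_special_tokens=True)


def answer_question(image, question, models):
    """Return the highest-scoring answer label for a question about the image."""
    import torch

    inputs = models["vqa_processor"](image, question, return_tensors="pt")
    with torch.no_grad():
        logits = models["vqa_model"](**inputs).logits
    best_index = logits.argmax(-1).item()
    return models["vqa_model"].config.id2label[best_index]
```

OpenCV delivers frames in BGR order, so a frame would need converting first, for example with `Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))`. Loading the models once and reusing the returned dictionary avoids reloading the weights for every frame.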
Contributions are welcome! Please feel free to submit a Pull Request.