This system is designed to efficiently ingest, store, and query vast volumes of log data. It comprises a Log Ingestor responsible for accepting log data over HTTP and a Query Interface that enables users to perform full-text searches and apply filters on various log attributes.
- Programming Language: Python
- Database: MySQL
- Technologies: Kafka, Kafka REST Proxy, Kafka Schema Registry
- Frontend: HTML, CSS, JavaScript
- Backend: Flask
- Log Ingestor
- Ingests logs in the provided JSON format via HTTP on port 3000.
- Ensures scalability to handle high log volumes.
- Optimizes I/O operations and database write speeds.
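The ingestion endpoint can be sketched as a small Flask handler; this is a minimal illustration, and the route and response shape are assumptions rather than the repository's actual code:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/", methods=["POST"])
def ingest_log():
    # Accept the log entry as JSON; reject non-JSON bodies early.
    log = request.get_json(silent=True)
    if log is None:
        return jsonify({"error": "expected JSON body"}), 400
    # In the full system, the parsed log would be published to Kafka here.
    return jsonify({"status": "accepted"}), 200

# To serve on the ingestor's port: app.run(host="0.0.0.0", port=3000)
```

Returning `400` for malformed bodies keeps bad payloads out of the pipeline before they reach Kafka or the database.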
- Query Interface
- Offers a user-friendly interface (Web UI/CLI) for full-text search.
- Includes filters for:
- level
- message
- resourceId
- timestamp
- traceId
- spanId
- commit
- metadata.parentResourceId
- Implements efficient search algorithms for quick results.
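One way to turn the filter set above into a database query is to build a parameterized WHERE clause. The sketch below assumes the column names mirror the filter names (with `metadata.parentResourceId` stored in its own column); both assumptions are illustrative:

```python
ALLOWED = {"level", "message", "resourceId", "timestamp",
           "traceId", "spanId", "commit", "metadata.parentResourceId"}

# Assumed mapping from filter names to columns in a `logs` table.
COLUMNS = {name: name.replace("metadata.parentResourceId",
                              "parent_resource_id") for name in ALLOWED}

def build_query(filters):
    """Translate {filter_name: value} into parameterized SQL (MySQL style)."""
    clauses, params = [], []
    for name, value in filters.items():
        if name not in ALLOWED:
            raise ValueError(f"unsupported filter: {name}")
        if name == "message":
            # Substring match on the message; true full-text search is a
            # later step (e.g. via Elasticsearch).
            clauses.append("message LIKE %s")
            params.append(f"%{value}%")
        else:
            clauses.append(f"{COLUMNS[name]} = %s")
            params.append(value)
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT * FROM logs WHERE {where}", params
```

Whitelisting filter names and passing values as parameters (never interpolating them into the SQL string) guards against SQL injection.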
- Advanced Features (To be Implemented...)
- Search within specific date ranges.
- Utilization of regular expressions for search.
- Combining multiple filters for precise queries.
- Real-time log ingestion and searching capabilities.
- Role-based access control to the query interface.
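The date-range and regex features could be combined roughly as follows; this is a sketch over in-memory log dicts (the function names are illustrative), whereas a production version would push these predicates down into the database:

```python
import re
from datetime import datetime, timezone

def in_range(log, start, end):
    """True if the log's ISO-8601 timestamp falls within [start, end]."""
    ts = datetime.fromisoformat(log["timestamp"].replace("Z", "+00:00"))
    return start <= ts <= end

def matches(log, pattern):
    """True if the regex pattern matches anywhere in the message."""
    return re.search(pattern, log["message"]) is not None

def advanced_search(logs, pattern, start, end):
    """Combine a regex filter on `message` with a timestamp-range filter."""
    return [log for log in logs
            if in_range(log, start, end) and matches(log, pattern)]
```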
- Utilizes an HTTP server to receive logs.
- Parses incoming JSON logs and publishes them to a Kafka topic.
- Subscribes to the Kafka topic and consumes logs from it.
- Stores consumed logs in the primary read database instance.
- Provides a user interface for search and filtering.
- Processes user queries and translates them into database queries.
- Utilizes optimized indexing for faster search results.
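The publish/consume flow above could be sketched with the `kafka-python` client. The topic name and broker address below are assumptions, and `store` stands in for whatever writes to the database:

```python
import json

TOPIC = "logs"               # assumed topic name
BROKER = "localhost:9092"    # assumed broker address

def serialize(log: dict) -> bytes:
    """Encode a parsed log entry for publishing to the topic."""
    return json.dumps(log).encode("utf-8")

def deserialize(raw: bytes) -> dict:
    """Decode a consumed record back into a log entry."""
    return json.loads(raw.decode("utf-8"))

def publish(log: dict) -> None:
    """Publish one parsed log to the Kafka topic."""
    from kafka import KafkaProducer  # pip install kafka-python
    producer = KafkaProducer(bootstrap_servers=BROKER,
                             value_serializer=serialize)
    producer.send(TOPIC, log)
    producer.flush()

def consume_and_store(store) -> None:
    """Subscribe to the topic and hand each log to `store` (a DB writer)."""
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(TOPIC, bootstrap_servers=BROKER,
                             value_deserializer=deserialize)
    for record in consumer:
        store(record.value)
```

Decoupling ingestion from storage through the topic lets the HTTP server acknowledge logs quickly while the consumer batches database writes at its own pace.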
- MySQL - Relational Database: Stores structured log data, optimized for structured queries and joins.
- NoSQL Database (e.g., Elasticsearch): Facilitates full-text search and complex queries efficiently. (To be implemented...)
- Scalability: Implements database sharding for distributing load.
- Caching Mechanism: Utilizes caching strategies for frequently accessed data.
- Load Balancing: Distributes incoming requests across multiple servers for enhanced performance.
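The caching mechanism can be illustrated with a small in-process TTL cache keyed on the query; the class, the 30-second TTL, and the `run_query` callback are illustrative choices, not the repository's implementation:

```python
import time

class TTLCache:
    """Tiny time-based cache for frequently repeated query results."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires, value = entry
        if time.monotonic() > expires:
            del self._store[key]  # stale entry: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=30.0)

def cached_search(sql, params, run_query):
    """Serve repeated identical queries from the cache; else hit the DB."""
    key = (sql, tuple(params))
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = run_query(sql, params)
    cache.put(key, result)
    return result
```

A short TTL keeps results fresh enough for log search while absorbing bursts of identical queries; a shared cache (e.g. Redis) would be the natural next step once there are multiple servers behind the load balancer.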
- Docker
- Clone the repository: `git clone https://github.com/MasterZesty/log-ingestor-with-query-interface.git`
- Navigate to the project directory: `cd log-ingestor-with-query-interface`
- Run `docker-compose up -d`
- Wait 1-2 minutes for Kafka and MySQL to finish creating their resources.
- To ingest logs via the web UI, open `http://localhost:3000/` in a browser.
- To start the consumer service, visit `http://localhost:3000/consumer`.
- To search logs, open `http://localhost:3000/search` in a browser.
- You can also send JSON data to the HTTP endpoint with a POST request:

      curl --location 'localhost:3000' \
        --header 'Content-Type: application/json' \
        --data '{
          "level": "error",
          "message": "Failed to connect to DB",
          "resourceId": "server-1234",
          "timestamp": "2023-09-15T08:00:00Z",
          "traceId": "abc-xyz-123",
          "spanId": "span-456",
          "commit": "5e5342f",
          "metadata": { "parentResourceId": "server-0987" }
        }'
- Real-time Capabilities: Enhance real-time log ingestion and search.
- Enhanced Security: Strengthen security measures, especially for user access and data integrity.
- Optimization: Continuously optimize database queries and indexing strategies for better performance.
- Volume: Handles massive log volumes efficiently.
- Speed: Provides quick search results.
- Scalability: Adaptable to increasing log volumes and queries.
- Usability: Offers an intuitive interface for users.
- Advanced Features: Implements bonus functionalities.
- Readability: Maintains a clean and structured codebase.
This system effectively manages log data ingestion and provides a seamless query interface for users to retrieve specific logs based on various attributes. Continuous improvements can enhance its performance and capabilities.