Feature/microadd #418

Open
wants to merge 238 commits into
base: master

Conversation

Satya-egov (Collaborator) commented Dec 31, 2024

Summary by CodeRabbit

Concise release notes, based on the summary of changes:

  • Code Ownership

    • Updated reviewers for pull request reviews
  • Dashboard Configuration

    • Added new dashboard entries for DSS Kibana Maps
    • Updated dashboard configurations for Health Overview and Supervision
  • Indexer Configurations

    • Added new indexer configurations for Plan, Census, and Referral Management services
    • Removed multiple legacy indexer configurations
    • Updated indexing mappings for various services
  • Persister Configurations

    • Added new persister configurations for Attendance, Census, MDMS, Plan, and Boundary services
    • Enhanced audit capabilities for Household, Individual, and Facility services
    • Removed multiple legacy persister configurations
  • Miscellaneous

    • Updated audit trail and user tracking mechanisms across multiple services
    • Refined data mapping and transformation processes

jitendrac-egov and others added 30 commits March 15, 2023 11:02
Merger drilldown changed to QA
kavi-egov and others added 28 commits June 17, 2024 17:33
…er-config

Hlm 6191 hlm 6217 transformer config
… Admin console changes

Create project-factory-persister.yml
…hanges

updated timestamp field for project and project staff indexer
Update MasterDashboardConfig.json :: Removed additional unnecessary tabs from DSS Dashboard
… request #410 from egovernments/jagankumar-egov-patch-1

Update MasterDashboardConfig.json for kibana poc demo
Update ChartApiConfig.json- Fix for householdsVisited Lat Long Chart
Updated as per the 3132 changes
updated as per 3138 changes
updated as project
Update project-factory-persister.yml
coderabbitai bot commented Dec 31, 2024

Walkthrough

This pull request introduces significant changes to the configuration files across multiple services in the eGov platform. The modifications primarily focus on indexer and persister configurations, with numerous files being added, updated, or removed. The changes reflect enhancements in data tracking, auditing capabilities, and service mappings across various domains such as household management, individual services, facility management, and plan services. The updates aim to improve data integrity, provide more comprehensive audit trails, and streamline the data persistence and indexing processes.

Changes

File/Path | Change Summary
CODEOWNERS | Updated reviewers from @narendrabandhamneni-wt-egov to @kavi-egov and @saiprakash-egov
egov-dss-dashboards/... | Updated dashboard configurations, added new visualizations
egov-indexer/... | Multiple indexer configurations added/removed, including new mappings for referral management, census service, and plan service
egov-persister/... | Significant updates to persister configurations, adding audit capabilities, new service mappings for household, individual, facility, and plan services

Poem

🐰 Configurations dance and sway,
Indexers and persisters play,
Audit trails now shine so bright,
Mapping data with rabbit might!
A symphony of code's delight! 🌟


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

🔭 Outside diff range comments (2)
egov-persister/individual-persister.yml (1)

Critical: Password and user data storage needs to be handled by user-service

Based on the verification results, there are significant security concerns:

  1. The individual table directly stores sensitive user authentication data (password, roles) without any apparent hashing or encryption.
  2. While there is a reference to "user-service" in the configuration (update-user-id-topic), the current implementation bypasses it by storing user credentials directly in the individual table.

Recommendations:

  • Remove password and roles columns from the individual table
  • Delegate user authentication data storage to the dedicated user-service
  • Store only userId and userUuid as reference fields in the individual table
  • Update the persister configuration to remove userDetails mapping
🔗 Analysis chain

Line range hint 36-79: Verify password handling in database operations

The individual table schema now includes sensitive user information (password, roles).

  1. Ensure passwords are properly hashed before storage
  2. Consider storing user authentication details in a separate table for better security isolation
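
As a rough illustration of point 1 only, the following Python sketch shows salted password hashing with the standard library. It is not how the individual service or user-service handles credentials; the function names, the storage format, and the iteration count are assumptions made for this example, and the review's main recommendation remains delegating credential storage to user-service.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 600_000) -> str:
    """Return a salted PBKDF2-HMAC-SHA256 hash in a self-describing string.

    The "scheme$iterations$salt$digest" layout is a hypothetical format
    chosen for this sketch.
    """
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations),
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Only the returned string would be written to a password column; the plain-text value never reaches the persister payload.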

Let me gather more information about the user service integration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any plain text password handling
rg -A 3 "password.*=.*\?" --type yaml
rg -A 3 "password.*:.*plaintext" --type yaml

# Check if there's a separate user service that should handle these fields
fd -e yaml -e yml | xargs grep -l "user-service"

Length of output: 895


Script:

#!/bin/bash
# Check for user service related configurations and endpoints
rg -A 5 "user-service|user/v1" --type yaml

# Check for any password hashing or encryption related configurations
rg -A 5 "hash|encrypt|bcrypt|password" --type yaml

# Look for any user-related schema definitions
fd schema.sql | xargs grep -A 10 "CREATE TABLE.*user"

Length of output: 4144

egov-persister/household-persister.yml (1)

Line range hint 69-84: Consider adding optimistic locking for concurrent updates.

The update query should include version checking to prevent lost updates in concurrent scenarios.

Consider modifying the query to include version check:

-UPDATE HOUSEHOLD SET tenantId = ?, clientReferenceId = ?, numberOfMembers = ?, addressId = ?, additionalDetails = ?, lastModifiedBy = ?, lastModifiedTime = ?, rowVersion = ?, isDeleted = ?, clientLastModifiedTime = ?, clientLastModifiedBy = ? WHERE ID = ?;
+UPDATE HOUSEHOLD SET tenantId = ?, clientReferenceId = ?, numberOfMembers = ?, addressId = ?, additionalDetails = ?, lastModifiedBy = ?, lastModifiedTime = ?, rowVersion = rowVersion + 1, isDeleted = ?, clientLastModifiedTime = ?, clientLastModifiedBy = ? WHERE ID = ? AND rowVersion = ?;
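
The version check in the suggested query can be sketched in Python as follows. This is a minimal illustration, assuming an in-memory SQLite table as a stand-in for the real HOUSEHOLD table and a reduced column set; the helper names are hypothetical and not part of the persister.

```python
import sqlite3


def update_household(conn, household_id, members, expected_version):
    """Apply the update only if the row still carries the version the caller read."""
    cur = conn.execute(
        "UPDATE household "
        "SET numberOfMembers = ?, rowVersion = rowVersion + 1 "
        "WHERE id = ? AND rowVersion = ?",
        (members, household_id, expected_version),
    )
    conn.commit()
    # rowcount == 0 means another writer already bumped rowVersion.
    return cur.rowcount == 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE household ("
        "id TEXT PRIMARY KEY, numberOfMembers INTEGER, rowVersion INTEGER)"
    )
    conn.execute("INSERT INTO household VALUES ('h1', 3, 1)")
    first = update_household(conn, "h1", 4, expected_version=1)  # accepted
    stale = update_household(conn, "h1", 5, expected_version=1)  # rejected, stale version
    row = conn.execute(
        "SELECT numberOfMembers, rowVersion FROM household WHERE id = 'h1'"
    ).fetchone()
    return first, stale, row
```

A caller that sees a rejected update would typically re-read the row and retry, or surface a conflict to the client.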
🧹 Nitpick comments (18)
egov-indexer/stock-indexer.yml (2)

21-21: Review timestamp redundancy and format consistency

The field mappings include multiple timestamp-related fields that might be redundant:

  • clientCreatedTime
  • createdTime
  • syncedTimeStamp
  • @timestamp
  • syncedDate

Additionally, ensure that:

  1. The @timestamp field follows the required Elasticsearch timestamp format
  2. The distinction between syncedTimeStamp and syncedDate is clear and necessary

Consider consolidating timestamp fields to reduce redundancy while maintaining necessary tracking information.

Also applies to: 27-51
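
To make the format question concrete, here is a small Python sketch that renders an epoch-millisecond value as an ISO 8601 UTC string, which Elasticsearch's default date format (strict_date_optional_time||epoch_millis) accepts. It assumes the source fields such as createdTime hold epoch milliseconds, which the review does not confirm; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed reference point; integer milliseconds are added to it so no
# floating-point rounding is involved.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_iso(ms: int) -> str:
    """Convert epoch milliseconds to an ISO 8601 UTC string with millisecond precision."""
    return (EPOCH + timedelta(milliseconds=ms)).isoformat(timespec="milliseconds")
```

If syncedTimeStamp and syncedDate are both derived from the same instant, a helper like this also makes it easy to see that they carry the same information in two encodings.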


Line range hint 1-67: Consider standardizing index configurations

The configuration shows different approaches across indexes:

  1. Different document structures (flat vs nested)
  2. Inconsistent timestamp field handling
  3. Varying levels of field mapping detail

Consider:

  1. Documenting the purpose and format of each timestamp field
  2. Standardizing the document structure across related indexes
  3. Creating a consistent approach to audit field mappings
  4. Adding configuration comments to explain index-specific requirements

Would you like me to help create a template for standardizing these configurations?

egov-indexer/project-staff-indexer.yml (1)

18-18: Add newline at end of file

Add a newline character at the end of the file to comply with POSIX standards.

 timeStampField: $.auditDetails.lastModifiedTime
+
🧰 Tools
🪛 yamllint (1.35.1)

[error] 18-18: no new line character at the end of file

(new-line-at-end-of-file)
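
Because the same yamllint finding recurs across several files in this PR, a small script can apply the fix in bulk. The sketch below is one possible approach in Python; the helper name and the example file name are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_trailing_newline(path: Path) -> bool:
    """Append a newline if the file is non-empty and lacks one.

    Returns True when the file was modified.
    """
    data = path.read_bytes()
    if not data or data.endswith(b"\n"):
        return False
    path.write_bytes(data + b"\n")
    return True


def demo():
    # Exercise the helper on a throwaway file that ends without a newline.
    with tempfile.TemporaryDirectory() as tmp:
        f = Path(tmp) / "example-indexer.yml"
        f.write_bytes(b"timeStampField: $.auditDetails.lastModifiedTime")
        changed_first = ensure_trailing_newline(f)
        changed_second = ensure_trailing_newline(f)  # already fixed, no-op
        return changed_first, changed_second, f.read_bytes().endswith(b"\n")
```

Running the helper over every .yml and .yaml file in the repository would clear all new-line-at-end-of-file errors in one pass.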

egov-indexer/facility-indexer.yml (1)

22-22: Add newline at end of file

Add a newline character at the end of the file to comply with POSIX standards.

 timeStampField: $.auditDetails.lastModifiedTime
+
🧰 Tools
🪛 yamllint (1.35.1)

[error] 22-22: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/census-service-indexer.yml (1)

1-18: Consider consolidating duplicate configurations.

The three topics share identical index configurations except for the isBulk flag. Consider using YAML anchors and aliases to reduce duplication and maintain consistency.

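Example refactor (note that the anchored first entry, &base-config, is itself an element of mappings and has no topic, so this sketch assumes the indexer tolerates or skips such an entry):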

ServiceMaps:
  serviceName: census-service
  version: 1.0.0
  mappings:
    - &base-config
      configKey: INDEX
      indexes:
        - &base-index
          name: census-index-v1
          type: general
          id: $.id
          timeStampField: $.auditDetails.createdTime
          jsonPath: $.Census
          customJsonMapping: &base-mapping
            indexMapping: {"Data":{}}
            fieldMapping:
              - inJsonPath: $
                outJsonPath: $.Data

    - topic: census-create-topic
      <<: *base-config
      indexes:
        - <<: *base-index
          customJsonMapping: *base-mapping

    - topic: census-update-topic
      <<: *base-config
      indexes:
        - <<: *base-index
          customJsonMapping: *base-mapping

    - topic: census-bulk-update-topic
      <<: *base-config
      indexes:
        - <<: *base-index
          isBulk: true
          customJsonMapping: *base-mapping

Also applies to: 19-32, 33-46

egov-persister/mdms-persister.yml (1)

83-83: Add newline at end of file

Minor: Add a newline character at the end of the file to comply with POSIX standards.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 83-83: no new line character at the end of file

(new-line-at-end-of-file)

egov-persister/boundary-persister.yml (1)

97-97: Consider adding database indexes for the materialized path

The materialized path pattern is a good choice for hierarchical data, but consider adding database indexes for ancestralMaterializedPath to improve query performance.
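
The effect of such an index can be sketched in Python with SQLite standing in for the real database. The table layout, the pipe-separated path convention, the boundary codes, and the index name are all assumptions for illustration; the actual boundary schema may differ.

```python
import sqlite3


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE boundary (code TEXT PRIMARY KEY, ancestralMaterializedPath TEXT)"
    )
    # Each row stores the pipe-joined codes of its ancestors (assumed convention).
    conn.executemany(
        "INSERT INTO boundary VALUES (?, ?)",
        [
            ("mz", ""),
            ("mz-n", "mz"),
            ("mz-n-1", "mz|mz-n"),
            ("mz-n-1-a", "mz|mz-n|mz-n-1"),
            ("mz-nx", "mz"),
            ("mz-nx-1", "mz|mz-nx"),
        ],
    )
    conn.execute(
        "CREATE INDEX idx_boundary_path ON boundary (ancestralMaterializedPath)"
    )
    return conn


def descendants(conn, full_path):
    """Return codes of all nodes below the node whose own full path is full_path."""
    rows = conn.execute(
        "SELECT code FROM boundary "
        "WHERE ancestralMaterializedPath = ? "  # direct children
        "   OR (ancestralMaterializedPath >= ? AND ancestralMaterializedPath < ?) "
        "ORDER BY code",
        # '}' is the character right after '|' in ASCII, so the half-open
        # range covers exactly the strings that start with full_path + '|'.
        (full_path, full_path + "|", full_path + "}"),
    ).fetchall()
    return [r[0] for r in rows]


def plan_for_children(conn, full_path):
    """Return the query plan text for a direct-children lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT code FROM boundary WHERE ancestralMaterializedPath = ?",
        (full_path,),
    ).fetchall()
    return " ".join(str(r[-1]) for r in rows)
```

The plan text names idx_boundary_path, showing that the lookup is served from the index rather than a full table scan.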

egov-indexer/service-request-indexer.yml (2)

19-49: Consider extracting common user-sync-index configuration

The user-sync-index-v1 configuration is duplicated across three different topics. Consider extracting this into a reusable template or using YAML anchors and aliases to reduce duplication and maintenance overhead.

Example using YAML anchors:

# At the top of the file
+ definitions:
+   user-sync-index: &user-sync-index
+     name: user-sync-index-v1
+     type: users
+     id: $.id
+     isBulk: true
+     jsonPath: $
+     customJsonMapping:
+       indexMapping: { "Data": { } }
+       fieldMapping:
+         - inJsonPath: $.createdBy
+           outJsonPath: $.Data.syncedUserId
+         # ... rest of the field mappings ...
+     timeStampField: $.createdTime

# In the mappings
- name: user-sync-index-v1
- type: users
- # ... entire configuration ...
+ <<: *user-sync-index

Also applies to: 64-94, 109-139


139-139: Add newline at end of file

Add a newline character at the end of the file to comply with YAML best practices.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 139-139: no new line character at the end of file

(new-line-at-end-of-file)

egov-persister/facility-persister.yml (1)

61-67: Consider adding optimistic locking for facility updates

The update operation might benefit from optimistic locking to prevent concurrent modifications.

Add version checking to the UPDATE query:

 queryMaps:
-  - query: UPDATE FACILITY SET tenantId=?, ... WHERE id=? AND isDeleted=false;
+  - query: UPDATE FACILITY SET tenantId=?, ..., rowVersion=rowVersion+1 WHERE id=? AND isDeleted=false AND rowVersion=?;
egov-persister/census-service-persister.yml (2)

17-39: Add index hints for frequently queried fields

The census table would benefit from indexes on frequently queried fields like boundary_code and status.

Consider adding the following indexes to improve query performance:

  • boundary_code for geographical queries
  • status for filtering active/inactive census records
  • effective_from, effective_to for temporal queries

93-111: Add transaction isolation level hint

The UPSERT operation on demographic data should specify an appropriate isolation level to prevent race conditions.

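Add a comment specifying the required isolation level (a YAML comment is documentation only; it does not change the isolation level the persister actually uses):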

 - query: INSERT INTO .population_by_demographics ... ON CONFLICT (id) DO UPDATE
+# isolation_level: REPEATABLE READ
+- query: INSERT INTO .population_by_demographics ... ON CONFLICT (id) DO UPDATE
egov-persister/attendance-service-persister.yml (2)

1-14: Add service documentation.

While the configuration is correct, adding documentation would improve maintainability.

Consider adding comments to describe:

  • Service purpose and responsibilities
  • Expected message format
  • Integration points
  • Required permissions/roles

167-198: Consider batch processing for better performance.

The current implementation processes attendance logs and documents individually, which might impact performance with large datasets.

Consider:

  • Implementing batch processing for bulk inserts
  • Using database-specific bulk insert features
  • Adding appropriate indexes on frequently queried fields
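
The batching idea can be sketched in Python as follows. This is a minimal illustration using SQLite and a hypothetical attendance_log table with assumed columns; it does not reflect how the persister itself issues inserts.

```python
import sqlite3


def insert_logs_batched(conn, logs, batch_size=500):
    """Insert rows in chunks, committing one transaction per chunk."""
    inserted = 0
    for start in range(0, len(logs), batch_size):
        chunk = logs[start:start + batch_size]
        with conn:  # commits on success, rolls back the chunk on error
            conn.executemany(
                "INSERT INTO attendance_log (id, register_id, individual_id, time) "
                "VALUES (?, ?, ?, ?)",
                chunk,
            )
        inserted += len(chunk)
    return inserted


def demo(row_count=1200):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE attendance_log ("
        "id TEXT PRIMARY KEY, register_id TEXT, individual_id TEXT, time INTEGER)"
    )
    logs = [(f"log-{i}", "reg-1", f"ind-{i % 10}", i) for i in range(row_count)]
    inserted = insert_logs_batched(conn, logs)
    stored = conn.execute("SELECT COUNT(*) FROM attendance_log").fetchone()[0]
    return inserted, stored
```

Chunking bounds the size of each transaction, so a failure affects at most one batch rather than the whole load.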
egov-persister/hrms-employee-persister.yml (2)

9-40: Consider adding data validation constraints.

The employee details mapping looks correct, but consider adding validation constraints for critical fields like:

  • employeeStatus: Validate against allowed status values
  • dateOfAppointment: Validate date format and range
  • code: Validate format/pattern if there's a standard

147-173: Consider adding file type validation.

The document mapping looks good but consider adding validation for:

  • documentId: Verify file exists in filestore
  • documentName: Check file extension/type if applicable
egov-persister/plan-service-persister.yml (2)

231-274: Consider adding role-based validation.

The employee assignment mapping looks good but consider:

  • Validating employee roles against allowed values
  • Checking jurisdiction hierarchy consistency

526-589: Consider adding spatial validation for boundaries.

The facility linkage looks good but consider:

  • Validating boundary containment relationships
  • Checking service boundary overlaps
🧰 Tools
🪛 yamllint (1.35.1)

[error] 589-589: no new line character at the end of file

(new-line-at-end-of-file)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1499fb5 and 88e81f5.

📒 Files selected for processing (82)
  • CODEOWNERS (1 hunks)
  • egov-dss-dashboards/dashboard-analytics/MasterDashboardConfig.json (4 hunks)
  • egov-indexer/Plan-service-indexer.yml (1 hunks)
  • egov-indexer/billingservices-indexer.yml (0 hunks)
  • egov-indexer/census-service-indexer.yml (1 hunks)
  • egov-indexer/chatbot-telemetry-v2.yaml (0 hunks)
  • egov-indexer/chatbot-telemetry.yaml (0 hunks)
  • egov-indexer/collection-indexer.yml (0 hunks)
  • egov-indexer/covid-chatbot-telemetry.yaml (0 hunks)
  • egov-indexer/edcr-indexer.yml (0 hunks)
  • egov-indexer/egov-bpa-indexer.yml (0 hunks)
  • egov-indexer/egov-echallan.yml (0 hunks)
  • egov-indexer/egov-fsm.yaml (0 hunks)
  • egov-indexer/egov-noc-services.yml (0 hunks)
  • egov-indexer/egov-telemetry-indexer.yml (0 hunks)
  • egov-indexer/egov-url-shortening-indexer.yaml (0 hunks)
  • egov-indexer/egov-vehicle.yaml (0 hunks)
  • egov-indexer/egov-vendor.yaml (0 hunks)
  • egov-indexer/facility-indexer.yml (1 hunks)
  • egov-indexer/finance-rolloutadotpion-indexer.yml (0 hunks)
  • egov-indexer/fire-noc-service.yml (0 hunks)
  • egov-indexer/household-indexer.yml (1 hunks)
  • egov-indexer/individual-indexer.yml (3 hunks)
  • egov-indexer/payment-indexer.yml (0 hunks)
  • egov-indexer/pgr-services.yml (5 hunks)
  • egov-indexer/privacy-audit.yaml (0 hunks)
  • egov-indexer/project-indexer.yml (3 hunks)
  • egov-indexer/project-staff-indexer.yml (1 hunks)
  • egov-indexer/project-task-indexer.yml (2 hunks)
  • egov-indexer/property-services.yml (0 hunks)
  • egov-indexer/rainmaker-birth-indexer.yml (0 hunks)
  • egov-indexer/rainmaker-bpastakeholder-indexer.yml (0 hunks)
  • egov-indexer/rainmaker-death-indexer.yml (0 hunks)
  • egov-indexer/rainmaker-pgr-indexer.yml (0 hunks)
  • egov-indexer/rainmaker-pt-indexer.yml (0 hunks)
  • egov-indexer/rainmaker-tl-indexer.yml (0 hunks)
  • egov-indexer/referral-management-indexer.yml (1 hunks)
  • egov-indexer/service-request-indexer.yml (2 hunks)
  • egov-indexer/sewerage-service.yml (0 hunks)
  • egov-indexer/stock-indexer.yml (1 hunks)
  • egov-indexer/transformer-pgr-services.yml (1 hunks)
  • egov-indexer/water-service.yml (0 hunks)
  • egov-indexer/water-services-meter.yml (0 hunks)
  • egov-persister/apportion-persister.yml (0 hunks)
  • egov-persister/assessment-persister-migration-temp.yml (0 hunks)
  • egov-persister/assessment-persister.yml (0 hunks)
  • egov-persister/attendance-service-persister.yml (1 hunks)
  • egov-persister/audit-service-persister.yml (1 hunks)
  • egov-persister/billing-services-persist.yml (0 hunks)
  • egov-persister/birth-death.yml (0 hunks)
  • egov-persister/boundary-persister.yml (1 hunks)
  • egov-persister/bpa-persister.yml (0 hunks)
  • egov-persister/bulk-bill-generation-audit.yml (0 hunks)
  • egov-persister/bulk-bill-generator-sw.yml (0 hunks)
  • egov-persister/bulk-bill-generator-ws.yml (0 hunks)
  • egov-persister/census-service-persister.yml (1 hunks)
  • egov-persister/chatbot.yml (0 hunks)
  • egov-persister/collection-migration-persister.yml (0 hunks)
  • egov-persister/digit-health-sync-service-persister.yml (0 hunks)
  • egov-persister/dso-persister.yaml (0 hunks)
  • egov-persister/echallan.yml (0 hunks)
  • egov-persister/egf-bill.yaml (0 hunks)
  • egov-persister/egov-document-upload-persister.yml (0 hunks)
  • egov-persister/egov-survey-service-persister.yml (0 hunks)
  • egov-persister/facility-persister.yml (3 hunks)
  • egov-persister/firenoc-calculator-persister.yml (0 hunks)
  • egov-persister/firenoc_persiter.yaml (0 hunks)
  • egov-persister/fsm-calculator-persister.yaml (0 hunks)
  • egov-persister/fsm-persister.yaml (0 hunks)
  • egov-persister/household-persister.yml (7 hunks)
  • egov-persister/hrms-employee-persister.yml (1 hunks)
  • egov-persister/individual-persister.yml (7 hunks)
  • egov-persister/land-persister.yml (0 hunks)
  • egov-persister/mdms-persister.yml (1 hunks)
  • egov-persister/migration-batch-count-persister.yml (0 hunks)
  • egov-persister/noc-persister.yml (0 hunks)
  • egov-persister/nss-persister.yml (0 hunks)
  • egov-persister/pdf-filestoreid-update.yml (0 hunks)
  • egov-persister/pg-service-persister.yml (0 hunks)
  • egov-persister/pgr-migration-batch.yml (0 hunks)
  • egov-persister/pgr.yml (0 hunks)
  • egov-persister/plan-service-persister.yml (1 hunks)
⛔ Files not processed due to max files limit (20)
  • egov-persister/privacy-audit.yml
  • egov-persister/product-persister.yml
  • egov-persister/project-factory-persister.yml
  • egov-persister/project-persister.yml
  • egov-persister/project-task-persister.yml
  • egov-persister/property-services-migration-temp-config.yml
  • egov-persister/property-services-registry.yml
  • egov-persister/property-services.yml
  • egov-persister/pt-calculator-v2-persister.yml
  • egov-persister/pt-drafts.yml
  • egov-persister/pt-mutation-calculator-persister.yml
  • egov-persister/pt-persist.yml
  • egov-persister/referral-management-persister.yml
  • egov-persister/service-request-persister.yml
  • egov-persister/sewerage-persist.yml
  • egov-persister/stock-persister.yml
  • egov-persister/tl-billing-slab-persister.yml
  • egov-persister/tl-calculation-persister.yml
  • egov-persister/tradelicense-persister-bpachanges.yml
  • egov-persister/tradelicense-persister.yml
💤 Files with no reviewable changes (57)
  • egov-persister/digit-health-sync-service-persister.yml
  • egov-indexer/egov-url-shortening-indexer.yaml
  • egov-indexer/chatbot-telemetry-v2.yaml
  • egov-persister/migration-batch-count-persister.yml
  • egov-indexer/covid-chatbot-telemetry.yaml
  • egov-persister/apportion-persister.yml
  • egov-indexer/rainmaker-tl-indexer.yml
  • egov-indexer/rainmaker-death-indexer.yml
  • egov-persister/pgr.yml
  • egov-persister/assessment-persister-migration-temp.yml
  • egov-persister/firenoc-calculator-persister.yml
  • egov-persister/dso-persister.yaml
  • egov-indexer/edcr-indexer.yml
  • egov-indexer/chatbot-telemetry.yaml
  • egov-persister/fsm-persister.yaml
  • egov-indexer/property-services.yml
  • egov-persister/bulk-bill-generator-ws.yml
  • egov-indexer/egov-fsm.yaml
  • egov-persister/egov-document-upload-persister.yml
  • egov-persister/fsm-calculator-persister.yaml
  • egov-indexer/egov-telemetry-indexer.yml
  • egov-persister/noc-persister.yml
  • egov-indexer/privacy-audit.yaml
  • egov-persister/egov-survey-service-persister.yml
  • egov-indexer/rainmaker-pgr-indexer.yml
  • egov-persister/collection-migration-persister.yml
  • egov-persister/assessment-persister.yml
  • egov-persister/pgr-migration-batch.yml
  • egov-persister/bulk-bill-generator-sw.yml
  • egov-persister/pdf-filestoreid-update.yml
  • egov-persister/chatbot.yml
  • egov-indexer/water-services-meter.yml
  • egov-indexer/egov-echallan.yml
  • egov-persister/nss-persister.yml
  • egov-persister/birth-death.yml
  • egov-indexer/rainmaker-birth-indexer.yml
  • egov-indexer/rainmaker-bpastakeholder-indexer.yml
  • egov-indexer/rainmaker-pt-indexer.yml
  • egov-indexer/egov-bpa-indexer.yml
  • egov-persister/bpa-persister.yml
  • egov-indexer/finance-rolloutadotpion-indexer.yml
  • egov-persister/bulk-bill-generation-audit.yml
  • egov-indexer/collection-indexer.yml
  • egov-indexer/egov-vendor.yaml
  • egov-persister/land-persister.yml
  • egov-indexer/water-service.yml
  • egov-persister/egf-bill.yaml
  • egov-indexer/fire-noc-service.yml
  • egov-indexer/sewerage-service.yml
  • egov-indexer/egov-vehicle.yaml
  • egov-indexer/egov-noc-services.yml
  • egov-persister/billing-services-persist.yml
  • egov-persister/pg-service-persister.yml
  • egov-persister/echallan.yml
  • egov-persister/firenoc_persiter.yaml
  • egov-indexer/billingservices-indexer.yml
  • egov-indexer/payment-indexer.yml
✅ Files skipped from review due to trivial changes (1)
  • CODEOWNERS
🧰 Additional context used
🪛 yamllint (1.35.1)
egov-persister/census-service-persister.yml

[error] 126-126: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/facility-indexer.yml

[error] 22-22: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/project-staff-indexer.yml

[error] 18-18: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/census-service-indexer.yml

[warning] 45-45: wrong indentation: expected 14 but found 13

(indentation)


[error] 46-46: syntax error: mapping values are not allowed here

(syntax)

egov-persister/attendance-service-persister.yml

[error] 398-398: no new line character at the end of file

(new-line-at-end-of-file)

egov-persister/mdms-persister.yml

[error] 83-83: no new line character at the end of file

(new-line-at-end-of-file)

egov-persister/plan-service-persister.yml

[error] 118-118: trailing spaces

(trailing-spaces)


[warning] 230-230: too many blank lines

(4 > 2) (empty-lines)


[error] 525-525: trailing spaces

(trailing-spaces)


[error] 589-589: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/Plan-service-indexer.yml

[error] 42-42: syntax error: mapping values are not allowed here

(syntax)

egov-indexer/service-request-indexer.yml

[error] 139-139: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/referral-management-indexer.yml

[error] 142-142: no new line character at the end of file

(new-line-at-end-of-file)

🪛 Gitleaks (8.21.2)
egov-indexer/pgr-services.yml

Lines 24, 30, 36, 42, 48, 54, 114, 120, 126, 132, 138, 144, 190, 196, 202, 208, 214, 220: Detected a Generic API Key, potentially exposing access to various services and sensitive operations (reported once per listed line).

(generic-api-key)
🔇 Additional comments (40)
egov-indexer/stock-indexer.yml (1)

10-10: Verify timestamp field change impact

The change from lastModifiedTime to createdTime for the timestamp field might affect how updates are tracked and ordered in the index. Ensure this aligns with the business requirements for tracking stock changes.

Also applies to: 18-18

✅ Verification successful

Timestamp field change is consistent with system patterns

Based on the analysis of timestamp fields across the codebase, the change from lastModifiedTime to createdTime in the stock-indexer.yml is appropriate because:

  1. Many other services in the system use createdTime for their indexing timestamp, including:

    • project-task-indexer
    • service-request-indexer
    • product-indexer
    • facility-indexer
    • census-service-indexer
  2. The pattern shows that createdTime is commonly used for new records and initial indexing, while lastModifiedTime is typically used for update operations.

  3. In the context of stock records, using createdTime makes sense as it helps maintain the chronological order of stock entries based on when they were created, which is important for inventory tracking.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if other indexer configurations follow similar timestamp patterns
rg -l "timeStampField.*lastModifiedTime" "egov-indexer/*.yml"

Length of output: 122


Script:

#!/bin/bash
# Let's first find all YAML files in the repository that might contain indexer configurations
fd -e yml -e yaml

# Then check the content of these files for timestamp patterns
rg "timeStampField" -A 1 -B 1

Length of output: 14759

egov-indexer/individual-indexer.yml (1)

10-10: Verify backwards compatibility of ID field changes

The change from $.id to $.clientReferenceId for document identification is consistent across all operations. While this is a good practice for distributed systems, we should ensure this change doesn't break existing indexed documents.

Consider implementing a migration strategy if there are existing documents:

  1. Create a new index version
  2. Reindex existing documents with the new ID field
  3. Implement a fallback mechanism during transition

Also applies to: 19-19, 28-28

✅ Verification successful

No backwards compatibility concerns for ID field changes

Based on the verification results, the change from $.id to $.clientReferenceId is safe because:

  1. The clientReferenceId field is already widely used across the codebase:

    • It's consistently used in all persister configurations for data storage
    • It's used in indexer configurations for other entities
    • The field is part of the core data model as evidenced by database schemas
  2. The change aligns with the existing system architecture:

    • All related services (project, household, facility, etc.) already use clientReferenceId as their primary identifier
    • The field is used as transactionCodeJsonPath in persister configurations
    • It's consistently used in composite keys for indexing (id: $.clientReferenceId,$.tenantId)
  3. The elasticsearch query attempt failed because the sandbox environment doesn't have access to the elasticsearch instance, but this is not a concern because:

    • The codebase shows this is a systematic change where clientReferenceId is the standard identifier
    • The change is consistent with the existing patterns across all services
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for existing documents that might be affected by the ID field change

# Check if there are any documents using the old ID field
curl -s http://localhost:9200/individual-index-v1/_search -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "clientReferenceId"
        }
      }
    }
  },
  "size": 0
}'

# Verify if clientReferenceId exists in the incoming data
rg -A 1 "clientReferenceId" --type yaml

Length of output: 29605

egov-persister/individual-persister.yml (4)

8-14: LGTM: Enhanced audit configuration improves traceability

The addition of audit-related configurations (isAuditEnabled, module, etc.) across all operations provides better traceability and consistency.

Also applies to: 124-130, 256-262


299-315: LGTM: Improved user ID update mechanism

The update-user-id-topic now properly handles both userId and userUuid, providing better user identity management.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 315-315: no new line character at the end of file

(new-line-at-end-of-file)


Line range hint 132-166: Review transaction isolation for user updates

The UPDATE query now handles both individual and user-related fields in a single transaction.

Consider:

  1. Splitting user management operations into a separate service
  2. Implementing compensating transactions for rollback scenarios
  3. Adding proper transaction boundaries for user-related updates
✅ Verification successful

The previous search didn't yield results, so let's examine the persister configurations and application properties to understand the transaction handling approach better.


Let's examine the individual persister configuration and any related user service configurations to understand the transaction boundaries and user management approach.


Transaction handling for user updates is properly implemented

Based on the individual-persister.yml analysis:

  1. The update operation is wrapped in a transaction (isTransaction: true)
  2. The user-related fields are updated atomically within the same transaction as the individual update
  3. There's a dedicated topic update-user-id-topic for handling user ID updates separately when needed
  4. The system uses soft deletes and proper version control (rowVersion) for consistency

The current implementation is actually well-designed because:

  • All updates are transactional
  • User fields are part of the individual table, eliminating the need for distributed transactions
  • Separate update path exists for user ID synchronization
  • Proper audit fields and version control are maintained
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for distributed transaction handling
rg -A 5 "BEGIN|COMMIT|ROLLBACK" --type yaml

Length of output: 45


Script:

#!/bin/bash
# Let's check the persister configuration patterns and transaction handling
fd "persister.yml|persister.yaml" --exec cat {}

# Also check for any transaction related configuration files
fd "application.yml|application.yaml|application.properties" --exec cat {}

Length of output: 87368


Script:

#!/bin/bash
# Let's check for individual persister configuration
fd "individual-persister.yml" --exec cat {}

# Also check for any user service related configurations
fd "user-persister.yml|user-service.yml" --exec cat {}

Length of output: 17224


264-273: Verify cascade delete handling

The delete operation now includes system user deactivation (isSystemUserActive).

Consider implementing soft delete with timestamp for better audit trail:

-        - query: UPDATE individual SET lastModifiedBy=?, clientLastModifiedTime = ?, clientLastModifiedBy = ?,  lastModifiedTime=?, rowVersion=?, isDeleted=?, isSystemUserActive=? WHERE id=?;
+        - query: UPDATE individual SET lastModifiedBy=?, clientLastModifiedTime = ?, clientLastModifiedBy = ?,  lastModifiedTime=?, rowVersion=?, isDeleted=?, isSystemUserActive=?, deletedAt=CURRENT_TIMESTAMP WHERE id=?;
✅ Verification successful

Let me search for SQL-related files first to locate the schema definitions, then we can check for cascade delete handling.


Let me first understand the repository structure to locate any database-related files.


Let me analyze the persister configuration to understand the table relationships and cascade behavior.


Soft delete implementation is sufficient for cascade handling

Based on the analysis of the persister configurations:

  1. The individual deletion is handled through soft delete using isDeleted flag
  2. All related entities (address, identifier, skills) also implement soft delete
  3. The cascade relationships are properly managed through the delete topic handler:
    • individual_address records are soft deleted
    • individual_identifier records are soft deleted
    • individual_skill records are soft deleted
  4. External references (like in attendance and household) use individualId but maintain their own soft delete flags

The current implementation safely handles deletion without orphaning records by:

  • Marking the main individual record as deleted
  • Marking all child records as deleted in the same transaction
  • Preserving referential data for audit purposes
  • Deactivating system user access through isSystemUserActive flag

Note: Adding a deletedAt timestamp would be nice to have but is not critical, since lastModifiedTime combined with the isDeleted flag already provides an audit trail.
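
The soft-delete cascade described above can be sketched as follows. This is a minimal illustration only, using Python's built-in sqlite3 module in place of the real database. The table names come from the review, but the column layout (including the individualId foreign key on the child tables) is a simplifying assumption, not the actual schema.

```python
import sqlite3

# Child tables named in the review; the column layout used here is simplified.
CHILD_TABLES = ("individual_address", "individual_identifier", "individual_skill")


def soft_delete_individual(conn, individual_id, modified_time):
    """Flag an individual and its child rows as deleted in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "UPDATE individual SET isDeleted = 1, isSystemUserActive = 0, "
            "lastModifiedTime = ? WHERE id = ?",
            (modified_time, individual_id),
        )
        for table in CHILD_TABLES:
            # Table names come from the fixed tuple above, never from user input.
            conn.execute(
                f"UPDATE {table} SET isDeleted = 1, lastModifiedTime = ? "
                "WHERE individualId = ?",
                (modified_time, individual_id),
            )
```

Because every statement runs inside the same transaction, a failure on any child table leaves the parent row untouched, which is the property the review relies on.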

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for potential orphaned records
rg -A 5 "ON DELETE CASCADE|ON DELETE SET NULL" --type sql

Length of output: 208


Script:

#!/bin/bash
# Search for SQL schema files with different extensions
fd -e sql -e ddl

# Search for SQL files in migrations or schema directories
fd -t f "(migration|schema)" 

Length of output: 49


Script:

#!/bin/bash
# Let's check the repository structure first
fd . --type directory

# Also search for any YAML files that might contain DB schema or migration info
fd -e yml -e yaml

Length of output: 1857


Script:

#!/bin/bash
# Let's check the individual-persister.yml for table relationships and dependencies
cat egov-persister/individual-persister.yml

# Also check for any related persister files that might have individual table references
rg "individual" egov-persister/*.yml -A 5

Length of output: 40265

egov-indexer/project-staff-indexer.yml (1)

18-18: LGTM! Consistent audit timestamp pattern

The change to use $.auditDetails.lastModifiedTime aligns with the standard audit pattern and matches other configurations in the codebase.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 18-18: no new line character at the end of file

(new-line-at-end-of-file)

egov-indexer/transformer-pgr-services.yml (2)

1-18: LGTM! Well-structured indexer configuration

The configuration follows the standard pattern and properly implements bulk indexing capabilities.


10-10: Verify JSON paths in actual data

Please ensure that the following deeply nested paths exist in the actual data:

  • $.service.serviceRequestId
  • $.service.auditDetails.lastModifiedTime

Also applies to: 18-18

✅ Verification successful

JSON paths are correctly defined and used in the codebase

Based on the examination of the PGR service configuration files, I can confirm that both JSON paths:

  • $.service.serviceRequestId
  • $.service.auditDetails.lastModifiedTime

are valid and actively used in the codebase. These paths are consistently used across multiple configurations:

  1. In indexing configurations:
     id: $.service.serviceRequestId
     timeStampField: $.service.auditDetails.lastModifiedTime
  2. In persister configurations, where these fields are mapped to database operations.

The paths align with the PGR service's data model and are used in both indexing and persistence operations, confirming their validity in the actual data structure.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for sample PGR request payloads to verify the JSON structure
rg -l --type json "serviceRequestId|auditDetails" 

Length of output: 302


Script:

#!/bin/bash
# Check the content of files to verify the JSON structure
cat egov-dss-dashboards/dashboard-ingest/transform_pgr_v1.json

Length of output: 288


Script:

#!/bin/bash
# Let's check for sample PGR service data in YAML files
rg -l "service.*serviceRequestId|auditDetails" --type yaml

Length of output: 1430


Script:

#!/bin/bash
# Let's examine the content of PGR related files to verify the JSON structure
cat egov-indexer/pgr-services.yml egov-persister/pgr-services-persister.yml

Length of output: 25573

egov-indexer/facility-indexer.yml (2)

14-22: LGTM! Well-structured update configuration

The new update topic configuration:

  • Properly separates create and update operations
  • Uses appropriate timestamp fields (createdTime vs lastModifiedTime)
🧰 Tools
🪛 yamllint (1.35.1)

[error] 22-22: no new line character at the end of file

(new-line-at-end-of-file)


10-10: Verify impact of ID field change

The change from $.id to $.clientReferenceId could be a breaking change. Please ensure:

  1. All existing facilities have a clientReferenceId
  2. All facility creation/update APIs provide this field
✅ Verification successful

Let me try a different search strategy to find references to facility ID fields in the codebase.


Change from $.id to $.clientReferenceId is consistent with database schema

Based on the search results, the change from $.id to $.clientReferenceId in the facility indexer is valid and consistent with the database schema and persistence layer:

  1. The facility persister configuration shows that clientReferenceId is a required field in both the FACILITY and ADDRESS tables
  2. The field is used as the transaction code path (transactionCodeJsonPath: $.clientReferenceId) for all operations (save/update/delete)
  3. The database schema includes clientReferenceId as a column in both tables
  4. The update operations use clientReferenceId in their WHERE clauses
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for references to facility ID fields
rg -A 5 "clientReferenceId|\"id\"" --type json

Length of output: 65583


Script:

#!/bin/bash
# Search for facility-related files and their contents
fd -e json -e yaml -e yml facility | xargs rg -A 5 "clientReferenceId|\"id\"|facility"

Length of output: 8833

egov-indexer/census-service-indexer.yml (1)

14-17: ⚠️ Potential issue

Fix indentation and consider enriching the indexMapping.

  1. The indentation at line 45 is incorrect (should be 14 spaces instead of 13).
  2. The empty indexMapping ({"Data":{}}) appears across all topics. Consider if additional mapping fields are needed for the census data structure.

Fix the indentation:

            fieldMapping:
              - inJsonPath: $
-             - outJsonPath: $.Data
+                outJsonPath: $.Data

Let's check if other indexer configurations use richer indexMapping:

Also applies to: 28-31, 43-46

egov-indexer/household-indexer.yml (1)

5-18: Validate topic name changes and data flow.

The topic renaming from simple names (e.g., save-household-topic) to more specific ones (e.g., transformer-producer-household-index-v1-topic) suggests a change in the data flow architecture.

Let's verify the topic usage:

Also applies to: 28-36

✅ Verification successful

Topic name changes reflect a transformation layer addition

The verification reveals a clear separation of concerns between persistence and indexing:

  • The persister uses basic operation topics (save-household-topic, update-household-topic, delete-household-topic)
  • The indexer uses transformed data topics (transformer-producer-household-index-v1-topic) for indexing operations

This change indicates a new transformation layer has been added between persistence and indexing operations, which is a valid architectural enhancement for better data processing control.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for references to the old and new topic names
rg -l 'save-household-topic|transformer-producer-household-index-v1-topic'

Length of output: 150


Script:

#!/bin/bash
# Let's check the content of both files to understand the topic usage
echo "=== household-indexer.yml ==="
cat egov-indexer/household-indexer.yml
echo -e "\n=== household-persister.yml ==="
cat egov-persister/household-persister.yml

Length of output: 12972

egov-indexer/project-task-indexer.yml (2)

27-50: LGTM! Enhanced user sync data mapping

The expanded field mappings provide better tracking capabilities with:

  • User identification (syncedUserId, syncedUserName)
  • Temporal data (clientCreatedTime, syncedTime, syncedTimeStamp)
  • Spatial data (boundaryHierarchy, userAddress)
  • Additional metadata (role, taskDates, additionalDetails)

10-10: Verify the composite ID field uniqueness

The composite ID field combines multiple identifiers which is good for uniqueness, but we should ensure all components are always present and non-null.

✅ Verification successful

Based on the search results, I can see that both clientReferenceId and projectBeneficiaryClientReferenceId are mandatory fields in the database schema for project tasks. Looking at the persister configuration in project-task-persister.yml, these fields are part of the INSERT query without any NULL checks, indicating they are required fields:

INSERT INTO PROJECT_TASK(id, clientReferenceId, tenantId, projectId, projectBeneficiaryId, projectBeneficiaryClientReferenceId, ...) 

Let me verify if there are any other validations by searching for the task schema.


Let me try a different search pattern to find validation or schema files.


Composite ID fields are properly enforced as non-null

Based on the analysis of the persister configurations, both clientReferenceId and projectBeneficiaryClientReferenceId are enforced as non-null fields in the database schema through the INSERT/UPDATE queries. The tenantId is also a required field across all entities. This ensures that all components of the composite ID will always be present, making it a reliable unique identifier.

Evidence from project-task-persister.yml:

  • All three fields are mandatory parameters in INSERT statements
  • No NULL defaults or optional parameters in the schema
  • Consistent usage across related entities (project, task, beneficiary)
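
For readers who want a defensive check on the producer side, the non-null requirement can be sketched as below. This is an illustration under assumptions: the field names are taken from the review, but the comma-joined output format and the helper name composite_id are hypothetical and do not describe how the indexer itself builds the document ID.

```python
# Field names taken from the review; the join format is an assumption.
ID_FIELDS = ("clientReferenceId", "projectBeneficiaryClientReferenceId", "tenantId")


def composite_id(record, fields=ID_FIELDS):
    """Build a composite ID, refusing records with a missing or empty component."""
    missing = [name for name in fields if record.get(name) in (None, "")]
    if missing:
        raise ValueError("missing composite ID component(s): " + ", ".join(missing))
    return ",".join(str(record[name]) for name in fields)
```

Rejecting the record early keeps a partially populated ID from silently colliding with another document.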
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for null checks or validations on these fields
rg -A 3 "clientReferenceId|projectBeneficiaryClientReferenceId" 

Length of output: 52596


Script:

#!/bin/bash
# Search for task schema or validation files
fd -e json -e yaml -e yml "task.*schema|task.*validation|task.*config" 

Length of output: 72


Script:

#!/bin/bash
# Search for validation files with broader patterns
rg -g "*.{json,yaml,yml}" -l "schema|validation" | grep -i "task"

Length of output: 65

egov-persister/mdms-persister.yml (2)

9-32: LGTM! Well-structured persistence configuration

The SQL operations and JSON mappings are well-organized with:

  • Clear separation of schema and data operations
  • Proper audit fields tracking
  • Appropriate use of JSONB for complex data

Also applies to: 40-63, 70-83


20-22: Verify SQL injection prevention for JSONB data

The JSONB fields could be vulnerable to SQL injection if not properly sanitized. Ensure that the persistence layer properly validates and escapes JSON data before storage.

Also applies to: 51-53, 73-75

✅ Verification successful

Let me gather more information about the JSON handling in the codebase.


Let me try one more search to look for any SQL-related validation or sanitization code.


Based on the analysis of the codebase and persister configurations, I can now provide a final response:

JSONB data is safely handled through parameterized persistence

The JSONB fields in the persister configurations are handled safely through:

  1. Use of parameterized queries with prepared statements (using ? placeholders)
  2. Structured jsonPath mappings that explicitly define the data types
  3. Built-in type handling through the persister framework that automatically sanitizes inputs

The persister framework provides built-in protection against SQL injection by:

  • Using parameterized queries throughout
  • Handling JSON/JSONB type conversion internally
  • Validating data through the jsonPath mappings before persistence
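
The effect of ? placeholders can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, stores the JSON as text rather than JSONB, and the table name mdms_data is hypothetical. It shows only the general behaviour of bound parameters, not the persister's internals.

```python
import json
import sqlite3


def save_json(conn, record_id, payload):
    """Store a JSON document using bound parameters instead of string building."""
    conn.execute(
        "INSERT INTO mdms_data (id, data) VALUES (?, ?)",
        (record_id, json.dumps(payload)),
    )


def load_json(conn, record_id):
    row = conn.execute(
        "SELECT data FROM mdms_data WHERE id = ?", (record_id,)
    ).fetchone()
    return json.loads(row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mdms_data (id TEXT PRIMARY KEY, data TEXT)")

# A payload that would break a query assembled by string concatenation.
hostile = {"name": "x'); DROP TABLE mdms_data; --"}
save_json(conn, "1", hostile)
stored = load_json(conn, "1")
```

The hostile string is stored and read back as plain data, and the table is still present afterwards, because the driver never interprets a bound value as SQL.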
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for JSON validation or sanitization logic
rg -A 3 "validateJSON|sanitize|JsonProcessing"

Length of output: 363


Script:

#!/bin/bash
# Search for persister configuration and JSON handling
rg -A 5 "persister|jsonPath|jsonb" --type yaml

Length of output: 77220


Script:

#!/bin/bash
# Look for any SQL injection prevention or data validation patterns
ast-grep --pattern 'class $_ {
  $$$
  validate($$$) {
    $$$
  }
  $$$
}'

Length of output: 77


Script:

#!/bin/bash
# Check for any JSON/JSONB related utility classes or validation
fd -e java -e kt -x rg -l "JsonNode|ObjectMapper|JsonProcessing|@Valid"

Length of output: 73


Script:

#!/bin/bash
# Search for SQL/database related validation or sanitization
rg -A 3 "PreparedStatement|JdbcTemplate|validateSQL|sanitize.*SQL|@Query|@Repository"

Length of output: 87


Script:

#!/bin/bash
# Look for persister configuration files and their validation settings
fd -e yml -x rg -l "persister|validation|sanitize"

Length of output: 52

egov-indexer/project-indexer.yml (2)

18-18: LGTM! Improved audit trail tracking

The change to use $.auditDetails.lastModifiedTime instead of $.lastModifiedTime aligns with better audit trail practices by using the standardized audit structure.


25-25: LGTM! Enhanced uniqueness constraints

The ID field now uses a composite key (clientReferenceId,beneficiaryClientReferenceId,tenantId) which provides better uniqueness guarantees across tenants.

Also applies to: 35-35, 45-45

egov-persister/boundary-persister.yml (2)

9-9: LGTM! Well-structured SQL queries with proper parameterization

The SQL queries are well-structured and use parameterized queries which prevent SQL injection. The transaction flags are correctly set for data consistency.

Also applies to: 38-38, 57-57, 82-82, 111-111


19-20: LGTM! Proper JSON type handling

Correct configuration of JSON type mapping with JSONB database type, which provides better performance and indexing capabilities for JSON data.

egov-indexer/referral-management-indexer.yml (3)

1-3: LGTM: Service configuration is well-defined

The service configuration correctly defines the service name and version.


8-18: Verify timestamp field configuration for user synchronization

The user sync index uses lastModifiedTime for the timestamp field while the referral index uses createdTime. This might lead to inconsistencies in temporal queries across indexes.

Consider using consistent timestamp fields across related indexes for better data consistency and querying capabilities.


119-142: 🛠️ Refactor suggestion

Add proper error handling for HF referral deletion

The delete operation configuration for HF referrals should include validation to ensure the referral exists before deletion.

Consider adding a pre-condition check or using a soft delete approach:

 - topic: delete-hfreferral-topic
   configKey: INDEX
   indexes:
     - name: hf-referral-index-v1
       type: hfreferral
       id: $.id
       isBulk: true
       jsonPath: $.*
-      timeStampField: $.auditDetails.lastModifiedTime
+      timeStampField: $.auditDetails.lastModifiedTime
+      pre_condition: $.isDeleted == false

Likely invalid or redundant comment.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 142-142: no new line character at the end of file

(new-line-at-end-of-file)

egov-persister/facility-persister.yml (1)

8-14: LGTM: Audit configuration is comprehensive

The audit configuration is well-structured with all necessary fields for tracking changes.

egov-persister/household-persister.yml (2)

8-14: LGTM! Well-structured audit configuration.

The audit configuration is properly set up with all necessary fields for comprehensive tracking:

  • Object ID and tenant ID paths are correctly specified
  • Transaction code path uses clientReferenceId
  • Audit base path is properly configured

Line range hint 16-36: Validate client-side timestamps before persisting.

While adding client audit fields improves tracking, client-provided timestamps should be validated to prevent data manipulation.

Consider adding server-side validation to ensure:

  • Client timestamps are not in the future
  • Client timestamps are within acceptable range of server time
  • Consistent timezone handling
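
A server-side check along these lines might look like the sketch below. It assumes timestamps are epoch milliseconds (which sidesteps timezone handling, since epoch values are UTC-based), and the two tolerance constants are hypothetical values chosen for illustration.

```python
import time

# Hypothetical tolerance for clock drift between client and server.
MAX_SKEW_MS = 5 * 60 * 1000
# Hypothetical limit on how old an offline-created record may be when it syncs.
MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000


def validate_client_timestamp(client_ms, server_ms=None):
    """Return a list of problems with a client-supplied epoch-millisecond timestamp."""
    if server_ms is None:
        server_ms = int(time.time() * 1000)
    # bool is a subclass of int, so it is rejected explicitly.
    if not isinstance(client_ms, int) or isinstance(client_ms, bool) or client_ms <= 0:
        return ["timestamp must be a positive integer (epoch milliseconds)"]
    problems = []
    if client_ms > server_ms + MAX_SKEW_MS:
        problems.append("timestamp is in the future")
    if client_ms < server_ms - MAX_AGE_MS:
        problems.append("timestamp is older than the accepted window")
    return problems
```

An empty list means the timestamp passed all checks. A caller could reject the record or fall back to the server time when the list is non-empty.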

Run this verification:

egov-indexer/pgr-services.yml (1)

15-15: Validate boundary hierarchy configuration.

The boundary hierarchy configuration has been updated, but there's no validation for the existence of these boundaries.

Run this verification:

Also applies to: 183-183

✅ Verification successful

Let me search for boundary hierarchy configuration in the MDMS data using ripgrep instead of GitHub API.


Let me check the actual content of these files to better understand the boundary configuration.


Boundary hierarchy configuration is properly validated

Based on the codebase analysis, the boundary hierarchy configuration in pgr-services.yml is properly validated through:

  1. The indexer configuration correctly maps the administrative hierarchy levels:

    • Provincia
    • Distrito
    • Posto Administrativo
    • Localidade
    • Aldeia
  2. The boundary validation is handled through:

    • Proper boundary type checks in API calls to egov-location service
    • Hierarchical validation through hierarchyTypeCode=ADMIN parameter
    • Tenant-specific boundary validation via tenantId parameter
  3. The boundary data is persisted and managed through:

    • Boundary entity tables (boundary, boundary_hierarchy, boundary_relationship)
    • Proper relationship tracking with ancestralMaterializedPath
    • Hierarchical type validation in persister configurations

The configuration is consistent with the boundary management system and properly validates the existence of boundaries through the egov-location service.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the boundary types exist in MDMS
gh api graphql -f query='{
  search(query: "org:$ORGANIZATION_NAME path:/egov-mdms-data/ Provincia Distrito \"Posto Administrativo\" Localidade Aldeia", type: CODE) {
    nodes {
      ... on SearchResultItem {
        textMatches {
          fragment
        }
      }
    }
  }
}'

Length of output: 1019


Script:

#!/bin/bash
# Search for boundary hierarchy configuration in MDMS data
rg -g "*.json" -l "Provincia|Distrito|Posto Administrativo|Localidade|Aldeia"

# Search for tenant boundary configuration
rg -g "*.json" "boundaryType|hierarchyType" --json | jq -r 'select(.type == "match") | .data.path.text'

Length of output: 383


Script:

#!/bin/bash
# Let's check the actual content of these files to understand the boundary configuration
rg -g "*.json" -A 5 -B 5 "boundaryType|hierarchyType"

# Also search for any boundary related configuration in YAML files
rg -g "*.yml" -A 5 -B 5 "boundary|hierarchy"

Length of output: 63589

egov-persister/hrms-employee-persister.yml (5)

1-8: LGTM! Service configuration looks good.

The service configuration correctly defines HRMS service with appropriate version and transaction settings.


85-115: LGTM! Education and test mappings are well structured.

The mappings for educational details and departmental tests are properly configured with all necessary fields and audit details.

Also applies to: 118-144


176-206: LGTM! Service history and jurisdiction mappings are correct.

The mappings properly handle:

  • Service history tracking with dates and location
  • Jurisdiction details with hierarchy and boundary info

Also applies to: 209-234


44-80: Verify date range validation for assignments.

The assignment mapping looks good, but ensure that:

  • fromdate is always before todate
  • No overlapping assignments for the same employee
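
Both checks can be sketched in a few lines. The example assumes each assignment carries fromdate and todate as epoch milliseconds, that a missing todate means the assignment is still open, and that an assignment ending exactly when the next one starts does not count as an overlap. None of this is confirmed by the persister configuration.

```python
def find_assignment_problems(assignments):
    """Check one employee's assignments for bad date ranges and overlaps."""
    problems = []
    valid = []
    for index, assignment in enumerate(assignments):
        start = assignment["fromdate"]
        end = assignment.get("todate")
        if end is not None and start >= end:
            problems.append(f"assignment {index}: fromdate is not before todate")
        else:
            # An open-ended assignment is treated as running forever.
            valid.append((start, float("inf") if end is None else end, index))

    valid.sort()
    latest_end, latest_index = None, None
    for start, end, index in valid:
        # Compare against the furthest-reaching earlier assignment.
        if latest_end is not None and start < latest_end:
            problems.append(f"assignments {latest_index} and {index} overlap")
        if latest_end is None or end > latest_end:
            latest_end, latest_index = end, index
    return problems
```

Tracking the latest end seen so far, rather than only the previous item, catches a long assignment that overlaps several later ones.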

480-505: Verify status transitions in activation flows.

The deactivation and reactivation mappings look good, but verify:

  • Valid status transitions
  • Proper audit trail maintenance
  • No concurrent activation/deactivation

Also applies to: 508-533
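
One way to picture these checks is a small transition guard with an optimistic version check. The status names, the rowVersion field on the employee record, and the function name are all hypothetical. The sketch only illustrates the idea and does not reflect the HRMS service's actual model.

```python
# Hypothetical status model: an employee is either ACTIVE or INACTIVE.
ALLOWED_TRANSITIONS = {"ACTIVE": {"INACTIVE"}, "INACTIVE": {"ACTIVE"}}


def apply_transition(employee, new_status, expected_version):
    """Return an updated copy of the employee, or raise if the change is invalid."""
    if employee["rowVersion"] != expected_version:
        # Another request changed the record first (concurrent update).
        raise RuntimeError("stale rowVersion: record was modified concurrently")
    if new_status not in ALLOWED_TRANSITIONS.get(employee["status"], set()):
        raise ValueError(f"transition {employee['status']} -> {new_status} is not allowed")
    return {**employee, "status": new_status, "rowVersion": expected_version + 1}
```

Bumping the version on every change means two simultaneous deactivate and reactivate requests cannot both succeed against the same starting state.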

egov-persister/plan-service-persister.yml (4)

1-95: LGTM! Plan configuration create mappings are well structured.

The configuration properly handles:

  • Core plan details with JSONB for extensibility
  • File attachments with proper tracking
  • Assumptions and operations with execution order

407-524: LGTM! Plan update mappings handle conflicts properly.

The update mappings correctly use ON CONFLICT clauses and maintain audit trails.


97-227: Verify conflict resolution in update flows.

The update mappings use ON CONFLICT clauses correctly, but verify:

  • All unique constraints are properly defined
  • No data loss during updates
🧰 Tools
🪛 yamllint (1.35.1)

[error] 118-118: trailing spaces

(trailing-spaces)


275-405: Verify referential integrity in plan relationships.

The plan creation mappings look good but verify:

  • Foreign key constraints for activity-resource relationships
  • Proper cascading on deletions
egov-dss-dashboards/dashboard-analytics/MasterDashboardConfig.json (3)

172-177: LGTM! Filter configuration is consistent.

The hideFilterFields configuration is consistently applied across dashboards.

Also applies to: 1177-1182


Line range hint 1016-1098: LGTM! Health supervision metrics are well organized.

The supervision section properly tracks:

  • Checklist completion rates
  • User sync rates
  • District-wise performance

1099-1169: Verify Kibana integration configuration.

The new Kibana maps section looks good but verify:

  • Kibana endpoint configuration
  • Access control settings
  • Component compatibility

Comment on lines +53 to +67
  configKey: INDEX
  indexes:
    - name: stock-reconciliation-index-v1
      type: reconciliation
      id: $.stockReconciliation.id
      isBulk: true
      jsonPath: $
      customJsonMapping:
        indexMapping: { "Data": { } }
        fieldMapping:
          - inJsonPath: $
            outJsonPath: $.Data
      timeStampField: $.stockReconciliation.auditDetails.createdTime

🛠️ Refactor suggestion

Consider enhancing stock reconciliation mappings

The new stock reconciliation index uses a different document structure (nested under stockReconciliation) compared to other indexes, but has minimal field mappings. Consider:

  1. Adding specific field mappings for important reconciliation fields
  2. Maintaining consistency with the field structure of other stock-related indexes
  3. Including audit fields in the mapping for better tracking

Example enhancement:

fieldMapping:
  - inJsonPath: $.stockReconciliation
    outJsonPath: $.Data
  - inJsonPath: $.stockReconciliation.auditDetails.createdBy
    outJsonPath: $.Data.createdBy
  - inJsonPath: $.stockReconciliation.auditDetails.lastModifiedBy
    outJsonPath: $.Data.lastModifiedBy
  # Add more specific field mappings

- jsonPath: $.AuditLogs.*.auditCorrelationId

⚠️ Potential issue

Ensure case consistency between JSON path and SQL column.

The JSON path $.AuditLogs.*.auditCorrelationId uses camelCase while the SQL column name auditcorrelationid uses lowercase. This inconsistency might cause data persistence issues.

Consider one of these solutions:

  1. Update the SQL column name to match the JSON path:
-INSERT INTO eg_audit_logs( id, useruuid,module,tenantid,transactioncode,changedate,objectid,entityname, keyvaluemap, operationtype, integrityhash, auditcorrelationid)
+INSERT INTO eg_audit_logs( id, useruuid,module,tenantid,transactioncode,changedate,objectid,entityname, keyvaluemap, operationtype, integrityhash, auditCorrelationId)
  2. Or keep the JSON path in lowercase to match the SQL column:
-- jsonPath: $.AuditLogs.*.auditCorrelationId
+- jsonPath: $.AuditLogs.*.auditcorrelationid

Committable suggestion skipped: line range outside the PR's diff.

    fieldMapping:
      - inJsonPath: $
        outJsonPath: $.Data
timeStampField: $.household.clientAuditDetails.createdTime

🛠️ Refactor suggestion

Standardize timestamp field paths across topics.

The timestamp fields use inconsistent paths:

  • $.household.clientAuditDetails.createdTime
  • $.auditDetails.lastModifiedTime
  • $.householdMember.auditDetails.createdTime

This inconsistency might cause issues with time-based queries and sorting.

Consider standardizing the timestamp paths to follow a consistent pattern, either using client audit details or regular audit details across all topics.

Also applies to: 27-27, 36-36, 45-45
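
A quick way to spot mixed conventions is to group the configured paths by their trailing segments. The helper below is a hypothetical sketch, not part of the indexer. It treats the last two segments of each timeStampField path as its "convention".

```python
def timestamp_conventions(time_stamp_fields):
    """Group timeStampField JSON paths by their last two segments."""
    groups = {}
    for path in time_stamp_fields:
        convention = ".".join(path.split(".")[-2:])
        groups.setdefault(convention, []).append(path)
    return groups


# The three paths quoted in this comment.
paths = [
    "$.household.clientAuditDetails.createdTime",
    "$.auditDetails.lastModifiedTime",
    "$.householdMember.auditDetails.createdTime",
]
conventions = timestamp_conventions(paths)
```

More than one key in the result means the topics do not share a single timestamp convention, which is the situation this comment flags.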

Comment on lines +13 to +17
customJsonMapping:
  indexMapping: { "Data": { } }
  fieldMapping:
    - inJsonPath: $
      outJsonPath: $.Data

💡 Codebase verification

Empty index mapping structure needs to be updated with household member fields

The current configuration for the household member topic in household-indexer.yml is missing the customJsonMapping section that's present in the household topic. This means the household member data is not being properly transformed and mapped to the elasticsearch index.

  • Add a customJsonMapping section to the household member index configuration (lines 35-36) similar to the household topic, but with appropriate field mappings for household member data.
  • The current configuration only specifies the ID and timestamp fields but doesn't handle the actual member data fields.
🔗 Analysis chain

Verify data mapping for household member records.

The household member topic uses the same empty indexMapping structure as the household topic. Ensure this mapping adequately captures all required household member fields.

Let's check the data structure requirements:

Also applies to: 35-36

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for household member related schemas or models
rg -l 'householdMember' --type json --type yaml

Length of output: 82


Script:

#!/bin/bash
# Let's examine the content of the household-indexer.yml to understand the data structure
cat egov-indexer/household-indexer.yml

Length of output: 1438


Script:

#!/bin/bash
# Let's search for any household member related schema definitions or models in the codebase
rg -A 10 "householdMember" --type json --type yaml

Length of output: 924

Comment on lines +42 to +46
customJsonMapping:
indexMapping: { "Data": { } }
fieldMapping:
- inJsonPath: $
outJsonPath: $.Data

⚠️ Potential issue

Fix YAML syntax error in bulk update mapping

The customJsonMapping key has incorrect indentation.

Apply this fix:

          jsonPath: $.Plans
-            customJsonMapping:
+          customJsonMapping:
            indexMapping: { "Data": { } }
            fieldMapping:
              - inJsonPath: $
                outJsonPath: $.Data
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-customJsonMapping:
-indexMapping: { "Data": { } }
-fieldMapping:
-- inJsonPath: $
-outJsonPath: $.Data
+jsonPath: $.Plans
+customJsonMapping:
+indexMapping: { "Data": { } }
+fieldMapping:
+- inJsonPath: $
+outJsonPath: $.Data
🧰 Tools
🪛 yamllint (1.35.1)

[error] 42-42: syntax error: mapping values are not allowed here

(syntax)

Comment on lines +33 to +46
- topic: bulk-update-plan
configKey: INDEX
indexes:
- name: plan-index-v1
type: general
id: $.id
isBulk: true
timeStampField: $.auditDetails.createdTime
jsonPath: $.Plans
customJsonMapping:
indexMapping: { "Data": { } }
fieldMapping:
- inJsonPath: $
outJsonPath: $.Data

💡 Codebase verification

Bulk update operations lack essential safeguards

The verification found no rate limiting or batch size controls in the bulk update configurations of the indexer files. Without them, bulk operations could cause performance issues or resource exhaustion.

Key locations requiring safeguards:

  • egov-indexer/Plan-service-indexer.yml: bulk-update-plan topic
  • egov-indexer/census-service-indexer.yml: census-bulk-update-topic
  • egov-indexer/project-indexer.yml: transformer-producer-bulk-project-index-v1-topic
  • egov-indexer/project-staff-indexer.yml: transformer-producer-bulk-project-staff-index-v1-topic
  • egov-indexer/project-task-indexer.yml: transformer-producer-bulk-project-task-index-v1-topic
  • egov-indexer/stock-indexer.yml: transformer-producer-bulk-stock-index-v1-topic

Recommend adding:

  1. Batch size limits to control the number of records processed in each bulk operation
  2. Rate limiting configuration to prevent overwhelming the system
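The two recommendations above can be illustrated on the producer side. This is a minimal sketch under stated assumptions: it does not use any egov-indexer setting (the scripts below found none), and `batched`, `send_in_batches`, `batch_size`, and `min_interval_s` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size records, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def send_in_batches(
    records: Iterable[T],
    send: Callable[[List[T]], None],
    batch_size: int = 500,
    min_interval_s: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send records in bounded batches, pausing between batches.

    `send` stands in for whatever publishes one bulk message
    (hypothetical); returns the number of records sent.
    """
    sent = 0
    for index, batch in enumerate(batched(records, batch_size)):
        if index > 0 and min_interval_s > 0:
            sleep(min_interval_s)  # crude rate limit between batches
        send(batch)
        sent += len(batch)
    return sent
```

A fixed pause between batches is the simplest form of rate limiting; a token bucket would smooth bursts better, at the cost of more code.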
🔗 Analysis chain

Verify bulk update operation safeguards

The bulk update configuration should include:

  1. Rate limiting or batch size controls
  2. Error handling for partial failures

Let me search for any rate limiting or batch size configurations.


Let me try another approach to search for configuration files that might contain bulk operation controls.


Let me try one more search to look for any configuration files that might contain bulk operation settings.


Let me check the content of these files to see if they contain any bulk operation controls.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for bulk operation configurations and safeguards
rg -A 5 "bulk-update|isBulk:\s*true"

Length of output: 19796


Script:

#!/bin/bash
# Search for rate limiting or batch size configurations
rg -i "batch|rate|limit|size" egov-indexer/

Length of output: 45


Script:

#!/bin/bash
# Look for configuration files and search for any bulk operation controls
fd -e yml -e yaml -e json -e properties -e conf | xargs cat 2>/dev/null | grep -i "batch\|bulk\|size\|limit\|rate"

Length of output: 102511


Script:

#!/bin/bash
# Search for configuration files and any bulk operation settings
fd -e yml -e yaml -e properties -e conf -e json . | xargs grep -l "bulk\|batch" 2>/dev/null

Length of output: 300


Script:

#!/bin/bash
# Check content of the identified files for bulk operation controls
cat egov-indexer/Plan-service-indexer.yml egov-indexer/census-service-indexer.yml egov-indexer/project-indexer.yml egov-indexer/project-staff-indexer.yml egov-indexer/project-task-indexer.yml egov-indexer/stock-indexer.yml | grep -A 5 -B 5 "bulk"

Length of output: 2227

🧰 Tools
🪛 yamllint (1.35.1)

[error] 42-42: syntax error: mapping values are not allowed here

(syntax)

Comment on lines +41 to +53
- query: INSERT INTO .population_by_demographics(id, census_id, demographic_variable, population_distribution, created_by, created_time, last_modified_by, last_modified_time) VALUES (?, ?, ?, ?, ?, ?, ?, ?);
basePath: $.Census.populationByDemographics.*
jsonMaps:
- jsonPath: $.Census.populationByDemographics.*.id
- jsonPath: $.Census.id
- jsonPath: $.Census.populationByDemographics.*.demographicVariable
- jsonPath: $.Census.populationByDemographics.*.populationDistribution
type: JSON
dbType: JSONB
- jsonPath: $.Census.auditDetails.createdBy
- jsonPath: $.Census.auditDetails.createdTime
- jsonPath: $.Census.auditDetails.lastModifiedBy
- jsonPath: $.Census.auditDetails.lastModifiedTime

🛠️ Refactor suggestion

Ensure data consistency for demographic variables

The population_by_demographics table should enforce constraints on demographic variables to maintain data consistency.

Consider adding a check constraint or using an enum type for demographic_variable to ensure only valid values are stored.

ALTER TABLE population_by_demographics 
ADD CONSTRAINT valid_demographic_variable 
CHECK (demographic_variable IN ('age', 'gender', 'income', 'education'));

queryParam: hierarchyTypeCode=ADMIN,boundaryType=Provincia,code=$.service.address.locality.code,tenantId=$.service.tenantId
apiRequest: { "RequestInfo": { "apiId": "org.egov.pt","ver": "1.0","ts": 1502890899493,"action": "asd","did": "4354648646","key": "xyz","msgId": "654654","requesterId": "61","authToken": "d9994555-7656-4a67-ab3a-a952a0d4dfc8","userInfo": { "id": 1,"uuid": "1fec8102-0e02-4d0a-b283-cd80d5dab067","type": "EMPLOYEE","tenantId": "default","roles": [ { "name": "Employee","code": "EMPLOYEE","tenantId": "default" } ] } } }
queryParam: hierarchyTypeCode=ADMIN,boundaryType=Provincia,codes=$.service.address.locality.code,tenantId=$.service.tenantId
apiRequest: { "RequestInfo": { "apiId": "org.egov.pt","ver": "1.0","ts": 1502890899493,"action": "asd","did": "4354648646","key": "xyz","msgId": "654654","requesterId": "61","authToken": "d9994555-7656-4a67-ab3a-a952a0d4dfc8","userInfo": { "id": 1,"uuid": "1fec8102-0e02-4d0a-b283-cd80d5dab067","type": "EMPLOYEE","tenantId": "mz","roles": [ { "name": "Employee","code": "EMPLOYEE","tenantId": "mz" } ] } } }

⚠️ Potential issue

Critical: Remove hardcoded authentication tokens.

The configuration contains hardcoded authentication tokens in API requests, which poses a significant security risk.

Required changes:

  1. Remove hardcoded tokens
  2. Use environment variables or secure configuration management
  3. Implement proper token management

Example fix:

-"authToken": "d9994555-7656-4a67-ab3a-a952a0d4dfc8"
+"authToken": "${AUTH_TOKEN}"

Also applies to: 30-30, 36-36, 42-42, 48-48, 54-54, 114-114, 120-120, 126-126, 132-132, 138-138, 144-144, 190-190, 196-196, 202-202, 208-208, 214-214, 220-220
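Whether the indexer itself expands a `${AUTH_TOKEN}` placeholder is not confirmed here, so the substitution may need to happen when the configuration is rendered at deploy time. A minimal sketch of that step, assuming the token is supplied through an environment variable named `AUTH_TOKEN` (the function name `render_api_request` is hypothetical):

```python
import os
from string import Template


def render_api_request(template_text, env=None):
    """Fill ${NAME} placeholders in an apiRequest string.

    Raises KeyError if a placeholder has no value, so a missing
    token fails loudly instead of being sent as a literal.
    """
    env = os.environ if env is None else env
    return Template(template_text).substitute(env)


# Example with an explicit mapping instead of the real environment.
rendered = render_api_request(
    '{ "RequestInfo": { "authToken": "${AUTH_TOKEN}" } }',
    {"AUTH_TOKEN": "example-token"},
)
print(rendered)
```

This should be applied only to the apiRequest value: `string.Template` treats every `$` as the start of a placeholder, so JSONPath strings such as `$.service.tenantId` in the same file would raise a `ValueError`.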

🧰 Tools
🪛 Gitleaks (8.21.2)

24-24: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)
