Security Scanning Setup Overview

Summary

This document outlines the security scanning infrastructure created for the RAGFlow project using four security tools: Semgrep, OSV Scanner, TruffleHog, and ClamAV.

Actions Taken

1. Created Security Scanning Script

Created /usr/src/ragflow/security_scan.sh, a Bash script that runs the scanners described below and saves their results to disk.
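The contents of security_scan.sh are not reproduced in this document, so the following is only a minimal sketch of how such a script might be organized. The helper name run_if_available, the function run_scans, the RESULTS_DIR variable, and the output file names are assumptions for illustration. The scanner arguments are common invocations of each CLI and may need adjusting for the installed versions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the actual security_scan.sh.

RESULTS_DIR="${RESULTS_DIR:-security_scan_results}"

# Run a scanner only if its binary is on PATH and write its stdout to a file.
# Usage: run_if_available <tool> <output-file> <args...>
run_if_available() {
    local tool="$1"
    local outfile="$2"
    shift 2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not available"
        return 0
    fi
    # Some scanners exit non-zero when they report findings, so the exit
    # status is recorded instead of aborting the whole run.
    local status=0
    "$tool" "$@" >"$outfile" || status=$?
    echo "$tool: finished (exit $status), output in $outfile"
}

# Run all four scanners against a target directory (default: current dir).
run_scans() {
    local target="${1:-.}"
    mkdir -p "$RESULTS_DIR"
    run_if_available semgrep     "$RESULTS_DIR/semgrep_results.json"    scan --config auto --json "$target"
    run_if_available osv-scanner "$RESULTS_DIR/osv_results.json"        --format json -r "$target"
    run_if_available trufflehog  "$RESULTS_DIR/trufflehog_results.json" filesystem "$target" --json
    run_if_available clamscan    "$RESULTS_DIR/clamav_results.txt"      -r --infected "$target"
}
```

Calling run_scans /usr/src/ragflow would then run each installed tool in turn and print "not available" for any tool that is missing, rather than stopping at the first failure.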

2. Tool Installation

Semgrep (SAST - Static Application Security Testing)

OSV Scanner (SCA - Software Composition Analysis)

TruffleHog (Secrets Detection)

ClamAV (Malware Detection)

3. Scan Results

Tool          Type      Findings
Semgrep       SAST      74 code issues detected
OSV Scanner   SCA       43 vulnerabilities in dependencies
TruffleHog    Secrets   2 unverified secrets found
ClamAV        Malware   Not available

4. Output Files

All results are saved to /usr/src/ragflow/security_scan_results/:

security_scan_results/
    semgrep_results.json       # Static analysis results
    osv_results.json           # Dependency vulnerabilities
    trufflehog_results.json    # Secrets detection results

Usage

Run Full Security Scan

./security_scan.sh

Prerequisites

The script automatically handles:

Manual Tool Paths

If tools are not found, add to PATH:

export PATH="$HOME/.local/bin:$PATH"
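If the export above is placed in a shell profile, it is re-evaluated on every new shell and can add the same directory repeatedly. A minimal sketch of a guard follows. The function name path_prepend is an assumption for illustration, not part of the script.

```shell
# Prepend a directory to PATH only if it is not already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

path_prepend "$HOME/.local/bin"
export PATH
```

Wrapping PATH in colons before matching ensures the check compares whole entries, so a directory such as /usr/local/bin does not count as a match for /usr/local.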

Tool Descriptions

Semgrep

A fast static analysis engine that finds bugs and security vulnerabilities using pattern matching. It supports 30+ languages and has extensive rule sets for common vulnerabilities.

OSV Scanner

Google’s OSV (Open Source Vulnerabilities) scanner that checks dependencies against the OSV database, supporting multiple ecosystems including PyPI, npm, Go, etc.

TruffleHog

Scans repositories for leaked credentials, API keys, passwords, and other sensitive information. Supports filesystem and git scanning.

ClamAV

Open-source antivirus engine for detecting malware, trojans, and viruses. Requires system-level installation.

Recommendations

  1. For ClamAV: Install with root access using sudo apt-get install clamav clamav-daemon

  2. Periodic Scanning: Consider adding to CI/CD pipeline:

    ./security_scan.sh && echo "Security scan completed"

  3. Review Results: Manually review the JSON output files for:

    • False positives in Semgrep
    • Vulnerable packages in OSV results
    • Actual secrets vs. test data in TruffleHog results

  4. Fix Vulnerabilities:

    • Update vulnerable dependencies identified by OSV Scanner
    • Remove or rotate exposed secrets found by TruffleHog
    • Address code issues flagged by Semgrep
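As a starting point for reviewing the TruffleHog results, the unverified findings can be counted before opening the file by hand. This is a minimal sketch that assumes the results were written with TruffleHog's --json option, which emits one JSON object per line with a "Verified" boolean. The function name count_unverified and the file path in the usage note are assumptions for illustration.

```shell
# Count TruffleHog findings marked as unverified in a JSON-lines file.
# Assumes one JSON object per line containing a "Verified" boolean.
count_unverified() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function from reporting failure.
    grep -cE '"Verified"[[:space:]]*:[[:space:]]*false' "$1" || true
}
```

For example, count_unverified security_scan_results/trufflehog_results.json prints the number of unverified findings. Each one still needs manual review to tell real secrets from test data.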