Fix issue #172: sensitive-word masking fails in some cases #173

Open: kenan131 wants to merge 1 commit into base: main
@kenan131 kenan131 commented May 14, 2024

DFAFilter class, filter method: simplify the code and save one loop iteration.

public String filter(String text) {
    StringBuilder result = new StringBuilder(text);
    int index = 0;
    while (index < result.length()) {
        char c = result.charAt(index);
        if (skip(c)) {
            index++;
            continue;
        }
        Word word = root;
        int start = index;
        boolean found = false;
        for (int i = index; i < result.length(); i++) {
            c = result.charAt(i);
            if (skip(c)) {
                continue;
            }
            if (c >= 'A' && c <= 'Z') {
                c += 32;
            }
            word = word.next.get(c);
            if (word == null) {
                break;
            }
            if (word.end) {
                found = true;
                for (int j = start; j <= i; j++) {
                    result.setCharAt(j, replace);
                }
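                // When found is true, index is set to the last index of the matched word, but the outer
                // if check then skips the +1, so the next pass always hits skip(...) == true and spends
                // one extra continue.
                // Fix: remove the found variable and the if check, so index is always incremented.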
                index = i;
            }
        }
        if (!found) {
            index++;
        }
    }
    return result.toString();
}
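
The change described in the comments above can be sketched as follows. This is a minimal, self-contained illustration rather than the code in the commit: the `Word` node, the `add` helper, the `'*'` replacement character, and the `skip` rule (ignore anything that is not a letter or digit) are assumptions made so the example compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

final class DfaFilterSketch {
    /** Hypothetical trie node; field names mirror the excerpt above. */
    static final class Word {
        final Map<Character, Word> next = new HashMap<>();
        boolean end;
    }

    private final Word root = new Word();
    private final char replace = '*'; // assumed mask character

    /** Assumed skip rule: ignore anything that is not a letter or digit. */
    private static boolean skip(char c) {
        return !Character.isLetterOrDigit(c);
    }

    /** Hypothetical helper that registers a sensitive word in lower case. */
    void add(String word) {
        Word node = root;
        for (char c : word.toLowerCase().toCharArray()) {
            node = node.next.computeIfAbsent(c, k -> new Word());
        }
        node.end = true;
    }

    String filter(String text) {
        StringBuilder result = new StringBuilder(text);
        int index = 0;
        while (index < result.length()) {
            if (skip(result.charAt(index))) {
                index++;
                continue;
            }
            Word word = root;
            int start = index;
            for (int i = index; i < result.length(); i++) {
                char c = result.charAt(i);
                if (skip(c)) {
                    continue;
                }
                if (c >= 'A' && c <= 'Z') {
                    c += 32; // fold ASCII upper case to lower case
                }
                word = word.next.get(c);
                if (word == null) {
                    break;
                }
                if (word.end) {
                    for (int j = start; j <= i; j++) {
                        result.setCharAt(j, replace);
                    }
                    index = i; // last masked position
                }
            }
            // No found flag: after a match this steps past the last masked
            // character, and after a miss it moves to the next start position.
            index++;
        }
        return result.toString();
    }
}
```

With the word `bad` registered, `filter("a bad day")` returns `a *** day`, and `filter("b-a-d")` returns `*****` because skipped characters inside a match are masked along with it.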

ACProTrie class: bug fix, code cleanup, and formatting

// Original code: excerpt of the match method
public String match(String matchWord) {
    Word walkNode = root;
    char[] wordArray = matchWord.toCharArray();
    for (int i = 0; i < wordArray.length; i++) {
        // On a mismatch, follow the failure links ("backtrack")
        while (!walkNode.hasChild(wordArray[i]) && walkNode.failOver != null) {
            walkNode = walkNode.failOver;
        }
        if (walkNode.hasChild(wordArray[i])) {
            walkNode = walkNode.next.get(wordArray[i]);
            if (walkNode.end) {
                Word sentinelA = walkNode; 
                Word sentinelB = walkNode; 
                int k = i + 1;
                boolean flag = false;
                while (k < wordArray.length && sentinelA.hasChild(wordArray[k])) {
                    sentinelA = sentinelA.next.get(wordArray[k]);
                    k++;
                    if (sentinelA.end) {
                        sentinelB = sentinelA;
                        flag = true;
                    }
                }
                int len = flag ? sentinelB.depth : walkNode.depth;
                while (len > 0) {
                    len--;
                    int index = flag ? i - walkNode.depth + 1 + len : i - len;
                    wordArray[index] = MASK;
                }
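                // Source of the bug.
                // At this point i is the index of the last character of the first (shorter) sensitive word,
                // so after adding the total length, the length of the first word must be subtracted again.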
                i += flag ? sentinelB.depth : 0;
                walkNode = flag ? sentinelB.failOver : walkNode.failOver;
            }
        }
    }
    return new String(wordArray);
}
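
The index arithmetic behind that comment can be checked with a small worked example. The sketch below is only an illustration of the offsets, not the patched `match` method: the helper names, the dictionary {"ab", "abcd"}, and the text "abcdab" are all hypothetical.

```java
final class MatchAdvanceSketch {
    /** Value of i after the original update: i += longDepth. */
    static int originalResume(int i, int longDepth) {
        return i + longDepth;
    }

    /** Value of i when the shorter word's length is subtracted again. */
    static int correctedResume(int i, int shortDepth, int longDepth) {
        return i + longDepth - shortDepth;
    }

    public static void main(String[] args) {
        // Hypothetical dictionary {"ab", "abcd"} scanned over the text "abcdab".
        // The shorter word "ab" ends at i = 1; the longer word "abcd" has depth 4
        // and therefore ends at index 3.
        int i = 1;
        int shortDepth = 2;
        int longDepth = 4;

        // Original update: i becomes 5, the for loop's i++ makes it 6, which is
        // the text length, so the second "ab" at indices 4..5 is never examined.
        System.out.println(originalResume(i, longDepth));

        // Corrected update: i becomes 3, the last masked index, and the loop's
        // i++ resumes scanning at index 4.
        System.out.println(correctedResume(i, shortDepth, longDepth));
    }
}
```

Under these assumptions the correction corresponds to replacing the update with `i += flag ? sentinelB.depth - walkNode.depth : 0;`.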
// Original code
public   void createACTrie(List<String> list){
    Word currentNode = new Word();
    root=currentNode;
    for(String key : list)
    {
        currentNode=root;
        for(int j=0;j<key.length();j++)
        {
            if(currentNode.next!=null&&currentNode.next.containsKey(key.charAt(j))){
                currentNode= currentNode.next.get(key.charAt(j));
                // This if check is redundant: the end flag can be set in the outer loop once the inner loop has finished (j == key.length()).
                if(j==key.length()-1){
                    currentNode.end=true;
                }
            }else {
                Word map = new Word();
                // This if check is redundant: the end flag can be set in the outer loop once the inner loop has finished (j == key.length()).
                if(j==key.length()-1){
                    map.end=true;
                }
                currentNode.next.put(key.charAt(j), map);
                // map is assigned to the outer currentNode variable here, so setting the end flag in the outer loop is safe.
                currentNode=map;
            }
            currentNode.depth = j+1;
        }
    }
    initFailOver();
}
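
One way to apply the comments above is to set the end flag once per key, after the inner loop. The following is a sketch under assumptions, not the committed code: the `Word` node is a stand-in with only the fields used here, `build` is a hypothetical name, failure-link construction (`initFailOver()`) is omitted, and the empty-key guard is an addition so that the root is never marked as a word end.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TrieBuildSketch {
    /** Hypothetical trie node; field names mirror the excerpt above. */
    static final class Word {
        final Map<Character, Word> next = new HashMap<>();
        boolean end;
        int depth;
    }

    /** Builds the trie only; failure links are out of scope for this sketch. */
    static Word build(List<String> keys) {
        Word root = new Word();
        for (String key : keys) {
            if (key.isEmpty()) {
                continue; // assumed guard: an empty key must not mark the root
            }
            Word current = root;
            for (int j = 0; j < key.length(); j++) {
                // Reuse the child if it exists, otherwise create it.
                current = current.next.computeIfAbsent(key.charAt(j), c -> new Word());
                current.depth = j + 1;
            }
            // current now points at the node for the last character of key,
            // so the end flag is set once here instead of in both branches.
            current.end = true;
        }
        return root;
    }
}
```

Using `computeIfAbsent` also removes the duplicated if/else branches, since looking up and creating a child become a single call.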

SensitiveTest class: add one more category of test data
