首页 文章

检查字符串是否是另一个字符串的子字符串

提问于
浏览
4

我读了很好的练习关于检查字符串是另一个的子字符串 .

练习的内容是:

编写一个从命令行获取2个字符串参数的程序 . 程序必须验证第二个字符串是否是第一个字符串的子字符串(您不能使用substr,substring或任何其他标准函数,包括正则表达式库) . 第二个子字符串中每次出现表示它可以匹配第一个字符串的零个或多个字符 . 考虑以下示例:输入字符串1:abcd输入字符串2:a * c程序应该评估字符串2是字符串1的子字符串 . 此外,星号()可以被视为常规字符,如果它前面是反斜杠(\) . 反斜杠(\)在除星号(*)之前的所有情况下都被视为常规字符 .

我写了一个简单的应用程序,它首先检查第二个字符串是否不长于第一个(但有一个问题,当测试(“ab”,“a * b”)这是正确的测试,但方法失败):

public static boolean checkCharactersCount(String firstString, String secondString) {
    return (firstString.length() > 0 && secondString.length() > 0) &&
            (firstString.length() > secondString.length());

...而下一次验证是一个子字符串:

public static boolean checkSubstring(String firstString, String secondString) {
    int correctCharCounter = 0;
    int lastCorrectCharAtIndex = -1;

    for (int i = 0; i < secondString.length(); i++) {
        for (int j = 0; j < firstString.length(); j++) {
            if (j > lastCorrectCharAtIndex) {

                if ((secondString.charAt(i) == firstString.charAt(j)) || secondString.charAt(i) == '*') {
                    correctCharCounter++;
                    lastCorrectCharAtIndex = j;
                }

                if (correctCharCounter >= secondString.length())
                    return true;
            }
        }
    }

    return false;
}

但是有两个问题:

  • 我上面的代码不支持字符连续性(例如test:checkSubstring("abacd","bcd")返回true,但是它错了! - 应该返回false)

  • 任何想法如何验证特殊符号为"*"?要测试的样本(checkSubstring(“abc ", " \ b”))

您对解决方案的看法如何? :)

4 回答

  • 1

    我发现这是一个很好的挑战 . 这个练习真的迫使我们在一般的语言和算法上进行思考 . 没有lambda,没有溪流,没有正则表达式,没有找到,没有子串,没有 . 只是旧的CharAt,一些fors,什么不是 . 基本上我做了一个查找方法,查找要找到的字符串的第一个字符,然后另一个查找从该点开始考虑您的规则 . 如果失败,则返回到找到的第一个索引,添加一个,它会执行多少次迭代,直到字符串结束 . 如果没有找到匹配,则应该返回false . 如果只找到一个,则足以将其视为子串 . 在微积分的开始处考虑最重要的角点情况,以便如果检测到错误则确定它不会进一步发展 . 因此'*'单独表示任何字符匹配,我们可以用\来逃避它 . 我试图包括大多数角落案件,这真的是一个挑战 . 我不完全确定我的代码是否涵盖了你的所有案例,但它应该涵盖了不少 . 我真的很想帮助你,所以这是我的方法,这是我的代码:

    package com.jesperancinha.string;
    
    public class StringExercise {
    
        private static final char ASTERISK = '*';
        private static final char BACKSLASH = '\\';
    
        public boolean checkIsSubString(String mainString, String checkString) {
            int nextIndex = getNextIndex(0, checkString.charAt(0), mainString);
            if (nextIndex == -1) {
                return false;
            }
            boolean result = checkFromIndex(nextIndex, mainString, checkString);
            while (nextIndex < mainString.length() - 1 && nextIndex > -1) {
                if (!result) {
                    nextIndex = getNextIndex(nextIndex + 1, checkString.charAt(0), mainString);
                    if (nextIndex > -1) {
                        result = checkFromIndex(nextIndex, mainString, checkString);
                    }
                } else {
                    return result;
                }
            }
            return result;
        }
    
        private int getNextIndex(int start, char charAt, String mainString) {
            if (charAt == ASTERISK || charAt == BACKSLASH) {
                return start;
            }
            for (int i = start; i < mainString.length(); i++) {
                if (mainString.charAt(i) == charAt) {
                    return i;
                }
            }
            return -1;
        }
    
        private boolean checkFromIndex(int nextIndex, String mainString, String checkString) {
            for (int i = 0, j = 0; i < checkString.length(); i++, j++) {
                if (i < (checkString.length() - 2) && checkString.charAt(i) == BACKSLASH
                        && checkString.charAt(i + 1) == ASTERISK) {
                    i++;
                    if (mainString.charAt(j + nextIndex) == BACKSLASH) {
                        j++;
                    }
                    if (checkString.charAt(i) != mainString.charAt(j + nextIndex)) {
                        return false;
                    }
                }
                if (i > 0 && checkString.charAt(i - 1) != BACKSLASH
                        && checkString.charAt(i) == ASTERISK) {
                    if (i < checkString.length() - 1 && (j + nextIndex) < (mainString.length() - 1)
                            && checkString.charAt(i + 1) !=
                            mainString.charAt(j + nextIndex + 1)) {
                        i--;
                    } else {
                        if (j + nextIndex == mainString.length() - 1
                                && checkString.charAt(checkString.length() - 1) != ASTERISK
                                && checkString.charAt(checkString.length() - 2) != BACKSLASH) {
                            return false;
                        }
                    }
                } else {
                    if ((j + nextIndex) < (mainString.length() - 2) &&
                            mainString.charAt(j + nextIndex)
                                    != checkString.charAt(i)) {
                        return false;
                    }
                }
            }
            return true;
        }
    
    }
    

    我已经进行了一系列的单元测试,但是如果我把整个课程放在这里,那就太长了,我唯一要告诉你的是我在单元测试中实现的测试用例 . 以下是此案例的单元测试的精简版本:

    package com.jesperancinha.string;
    
    import static org.assertj.core.api.Assertions.assertThat;
    
    import org.junit.jupiter.api.Test;
    
    class StringExerciseMegaTest {
    
        @Test
        void checkIsSubString() {
            StringExercise stringExercise = new StringExercise();
            boolean test = stringExercise.checkIsSubString("abcd", "a*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("abcd", "a\\*c");
            assertThat(test).isFalse();
            test = stringExercise.checkIsSubString("a*c", "a\\*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdsadasa*c", "a\\*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdsadasa*csdfdsfdsfdsf", "a\\*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdsadasa**csdfdsfdsfdsf", "a\\*c");
            assertThat(test).isFalse();
            test = stringExercise.checkIsSubString("aasdsadasa**csdfdsfdsfdsf", "a*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdsadasa*csdfdsfdsfdsf", "a*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdweriouiauoisdf9977675tyhfgh", "a*c");
            assertThat(test).isFalse();
            test = stringExercise.checkIsSubString("aasdweriouiauoisdf9977675tyhfgh", "dwer");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdweriouiauoisdf9977675tyhfgh", "75tyhfgh");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdweriou\\iauoisdf9977675tyhfgh", "riou\\iauois");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdweriou\\*iauoisdf9977675tyhfgh", "riou\\\\*iauois");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdweriou\\*iauoisdf9\\*977675tyhfgh", "\\\\*977675tyhfgh");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("aasdweriou\\*iauoisdf9\\*977675tyhfgh", "\\*977675tyhfgh");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("\\*aasdweriou\\*iauoisdf9\\*977675tyhfgh", "\\*aasdwer");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("*aasdweriou\\*iauoisdf9\\*977675tyhfgh", "*aasdwer");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("abcd", "bc");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("abcd", "zbc");
            assertThat(test).isFalse();
            test = stringExercise.checkIsSubString("abcd", "*bc*");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("*bcd", "\\*bc*");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("abcd", "a*c");
            assertThat(test).isTrue();
            test = stringExercise.checkIsSubString("abcd", "az*bc");
            assertThat(test).isFalse();
        }
    }
    
  • 1

    我的解决方案看起来像这样,我评论了一切,所以希望你能理解这一点 .

    public static void main(String [] args) throws Exception {
            System.err.println(contains("bruderMusssLos".toCharArray(),"Mu*L*".toCharArray()));
    }
    
    public static boolean contains(char [] a, char [] b) {
    
        int counterB = 0; // correct characters
        char lastChar = '-'; //last Character encountered in B
    
        for(int i = 0; i < a.length; i++) {
    
            //if last character * it can be 0 to infinite characters
            if(lastChar == '*') {
    
                //if next characters in a is next in b reset last char
                // this will be true as long the next a is not the next b
                if(a[i] == b[counterB]) {
                    lastChar = b[counterB];
                    counterB++;
    
                }else {
                    counterB++;
                }
    
            }else {
    
                //if next char is * and lastchar is not \ count infinite to next hit
                //otherwise * is normal character
                if(b[counterB] == '*' && lastChar != '\\') {
                    lastChar = '*';
                    counterB++;
                }else {
                    //if next a is next b count
                    if(a[i] == b[counterB]) {
                        lastChar = b[counterB];
                        counterB++;
                    }else {
                        //otherwise set counter to 0
                        counterB = 0;
                    }                   
                }
    
            }
    
            //if counterB == length a contains b
            if(counterB == b.length)
                return true;
    
        }
    
    
        return false;
    }
    

    当前测试返回true,例如:)

  • 1

    试试这个:(评论添加了解释)

    // only for non empty Strings
    public boolean isSubString(String string1,String string2)
    {
        // step 1: split by *, but not by \*
        List<String>list1 = new ArrayList<String>();
        char[]cs = string2.toCharArray();
        int lastIndex = 0 ;
        char lastChar = 0 ;
        int i = 0 ;
        for(; i < cs.length ; ++i)
        {
            if(cs[i]=='*' && lastChar!='\\')
            {
                list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
                //earlier buggy line:
                //list1.add(new String(cs,lastIndex,i-lastIndex));
                lastIndex = i + 1 ;
            }
            lastChar = cs[i];
        }
        if(lastIndex < i )
        {
            list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
        }
        // step 2: check indices of each string in the list
        // Note: all indices should be in proper order.
        lastIndex = 0;
        for(String str : list1)
        {
            int newIndex = string1.indexOf(str,lastIndex);
            if(newIndex < 0)
            {
                return false;
            }
            lastIndex = newIndex+str.length();
        }
        return true;
    }
    

    如果您不允许使用 String.indexOf() ,则编写函数 public int indexOf(String string1,String string2, int index2) 并替换此语句

    int newIndex = string1.indexOf(str,lastInxdex);
    

    声明:

    int newIndex = indexOf(string1, str,lastInxdex);
    

    ================================================== ======

    附录A:我测试的代码:

    package jdk.conf;
    
    import java.util.ArrayList;
    import java.util.List;
    
    public class Test01 {
        public static void main(String[] args)
        {
            Test01 test01 = new Test01();
            System.out.println(test01.isSubString("abcd", "a*c"));
            System.out.println(test01.isSubString("abcd", "bcd"));
            System.out.println(test01.isSubString("abcd", "*b"));
            System.out.println(test01.isSubString("abcd", "ac"));
            System.out.println(test01.isSubString("abcd", "bd"));
            System.out.println(test01.isSubString("abcd", "b*d"));
            System.out.println(test01.isSubString("abcd", "b\\*d"));
            System.out.println(test01.isSubString("abcd", "\\*d"));
            System.out.println(test01.isSubString("abcd", "b\\*"));
    
            System.out.println(test01.isSubString("a*cd", "\\*b"));
            System.out.println(test01.isSubString("", "b\\*"));
            System.out.println(test01.isSubString("abcd", ""));
    
            System.out.println(test01.isSubString("a*bd", "\\*b"));
        }
        // only for non empty Strings
        public boolean isSubString(String string1,String string2)
        {
            // step 1: split by *, but not by \*
            List<String>list1 = new ArrayList<String>();
            char[]cs = string2.toCharArray();
            int lastIndex = 0 ;
            char lastChar = 0 ;
            int i = 0 ;
            for(; i < cs.length ; ++i)
            {
                if(cs[i]=='*' && lastChar!='\\')
                {
                    list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
                    lastIndex = i + 1 ;
                }
                lastChar = cs[i];
            }
            if(lastIndex < i )
            {
                list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
            }
            // step 2: check indices of each string in the list
            // Note: all indices should be in proper order.
            lastIndex = 0;
            for(String str : list1)
            {
                int newIndex = string1.indexOf(str,lastIndex);
                if(newIndex < 0)
                {
                    return false;
                }
                lastIndex = newIndex+str.length();
            }
            return true;
        }
    }
    

    输出:

    true
    true
    true
    false
    false
    true
    false
    false
    false
    false
    false
    true
    true
    
  • 0

    我会在几个阶段做到这一点 .

    让我们调用潜在的子字符串p和我们正在测试包含子字符串s的字符串 .

    将“包含”部分分解为“从第n个位置开始匹配?”的一系列问题;显然你从第一个位置开始经过s看看p是否匹配s的任何位置 .

    在匹配中,我们有可能遇到“”;在这种情况下,我们想要知道之后的p部分是否是在匹配p的部分直到之后的s部分的子字符串 . 这表明了一个递归例程,将要匹配的部分和要匹配的字符串匹配并返回true / false . 遇到时,请形成两个新字符串并自行调用 .

    如果您遇到\,那么您只需继续与下一个字符进行常规匹配,而不是进行递归调用 . 鉴于您需要这样做,我想如果您从原始p构建pPrime可能是最简单的,这样您就可以在遇到它们时删除反斜杠,而不是从通配符中删除星号匹配 .

    我实际上没有写过任何代码,你只是要求方法 .

相关问题