首页 文章

使用递归和DP的最长公共子串

提问于
浏览
0

我正在尝试使用递归和DP找到两个字符串的最长公共子串 . 请注意,我不是指最长的连续子序列 . 所以,如果这两个字符串是

String s1 = "abcdf"; String s2 = "bzcdf" 
Longest Common Substring == "cdf" (not "bcdf").
Basically they have to be continuous elements

我试图使用递归和回溯来做到这一点 . 但问题是,如果我使用如下所示的递归, +1 会在一个帧中预先添加,这在调用堆栈中更高,并且不知道要出现的字符是否确实是连续元素或不是 . 因此,按照上面的例子,"bcdf"将是答案 .

public class ThisIsLongestCommonSubsequence_NotSubstring {
public static void main(String[] args) {

    String s1 = "abcdgh";
    String s2 = "abefgh";
    System.out.println(fun(s1, s1.length()-1, s2, s2.length()-1));
}

static int fun(String s1, int i, String s2, int j)
{
    if(i == -1 || j == -1)
        return 0;

    int ret = 0;
    if(s1.charAt(i) == s2.charAt(j))
        ret = fun(s1, i-1, s2, j-1) + 1;
    else
        ret = max(fun(s1, i-1, s2, j), fun(s1, i, s2, j-1));

    return ret;
}

static int max(int a, int b)
{
    return a>b?a:b;
}
}

至于现在,下面的代码是我提出的 . 请注意,每当我发现不匹配时,我将计数重置为0 . 并使用名为 int count 的变量跟踪匹配字符的数量,并使用名为 int maxcount 的变量在程序中的任何位置记录最高值 . 我的代码如下 .

public class LongestContinuousSubstringGlobalvariable {

static int maxcount = 0;

public static void main(String[] args) {
    String s1 = "abcdghijl";
    String s2 = "abefghijk";

    fun(s1, s2, s1.length()-1, s2.length()-1, 0);
    System.out.println("maxcount == "+maxcount);
}

static void fun(String s1, String s2, int i, int j, int count)
{
    if(i == -1 || j==-1)
        return;

    if(s1.charAt(i) == s2.charAt(j))
    {
        if(count+1 >  maxcount)
            maxcount = count+1;
        fun(s1, s2, i-1, j-1, count+1); 
    }
    else
    {
        fun(s1, s2, i-1, j, 0);
        fun(s1, s2, i, j-1, 0);
    }
}
}

这很好用 . 但是,我不喜欢我的代码

  • 使用全局变量(static int maxcount)来跨帧进行比较

  • 我不认为这是真正的动态编程或回溯,因为较低的帧不是 returning 它输出到更高的帧,然后决定如何处理它 .

请告诉我如何在不使用全局变量和使用回溯的情况下实现此目的 .

PS:我知道这个问题的其他方法,比如保持矩阵,做类似的事情

M [i] [j] = M [i-1] [j-1] 1 if(str [i] == str [j])

目标不是解决问题,而是寻找优雅的递归/回溯解决方案 .

1 回答

  • -1

    它可能在Prolog中完成 . 以下是我可以在这篇文章的帮助下放下的代码:Foreach not working in Prologhttp://obvcode.blogspot.in/2008/11/working-with-strings-in-prolog.htmlHow do I find the longest list in a list of lists?

    myrun(S1, S2):-
        writeln("-------- codes of first string ---------"),
        string_codes(S1, C1list),
        writeln(C1list),
    
        writeln("-------- codes of second string ---------"),
        string_codes(S2, C2list),
        writeln(C2list),
    
        writeln("--------- substrings of first --------"),
        findall(X, sublist(X, C1list), L),   
        writeln(L),
    
        writeln("--------- substrings of second --------"),
        findall(X, sublist(X, C2list), M),
        writeln(M),
    
        writeln("------ codes of common substrings -------"),
        intersection(L,M, Outl),
        writeln(Outl), 
    
        writeln("--------- common strings in one line -------"),
        maplist(string_codes, Sl, Outl), 
        writeln(Sl),
        writeln("------ common strings one by one -------"),
        maplist(writeln, Sl),
    
        writeln("------ find longest -------"),
        longest(Outl, LongestL),
        writeln(LongestL),
        string_codes(LongestS, LongestL),
        writeln(LongestS).
    
    sublist(S, L) :-
      append(_, L2, L),
      append(S, _, L2).
    
    longest([L], L) :-
       !.
    longest([H|T], H) :- 
       length(H, N),
       longest(T, X),
       length(X, M),
       N > M,
       !.
    longest([H|T], X) :-
       longest(T, X),
       !.
    

    它运行显示所有步骤:它将字符串转换为代码,然后从两者中创建所有可能的子字符串,然后找到常见的并列出它们:

    ?- myrun("abcdf", "bzcdf").
    -------- codes of first string ---------
    [97,98,99,100,102]
    -------- codes of second string ---------
    [98,122,99,100,102]
    --------- substrings of first --------
    [[],[97],[97,98],[97,98,99],[97,98,99,100],[97,98,99,100,102],[],[98],[98,99],[98,99,100],[98,99,100,102],[],[99],[99,100],[99,100,102],[],[100],[100,102],[],[102],[]]
    --------- substrings of second --------
    [[],[98],[98,122],[98,122,99],[98,122,99,100],[98,122,99,100,102],[],[122],[122,99],[122,99,100],[122,99,100,102],[],[99],[99,100],[99,100,102],[],[100],[100,102],[],[102],[]]
    ------ codes of common substrings -------
    [[],[],[98],[],[99],[99,100],[99,100,102],[],[100],[100,102],[],[102],[]]
    --------- common strings in one line -------
    [,,b,,c,cd,cdf,,d,df,,f,]
    ------ common strings one by one -------
    
    
    b
    
    c
    cd
    cdf
    
    d
    df
    
    f
    
    ------ find longest -------
    [99,100,102]
    cdf
    true.
    

    最后忽略'true' .

    如果删除解释性部分,程序会更短:

    myrun(S1, S2):-
        string_codes(S1, C1list),
        string_codes(S2, C2list),
        findall(X, sublist(X, C1list), L),    
        findall(X, sublist(X, C2list), M),
        intersection(L,M, Outl),
        longest(Outl, LongestL),
        string_codes(LongestS, LongestL),
        writeln(LongestS).
    
    sublist(S, L) :-
      append(_, L2, L),
      append(S, _, L2).
    
    longest([L], L) :-
       !.
    longest([H|T], H) :- 
       length(H, N),
       longest(T, X),
       length(X, M),
       N > M,
       !.
    longest([H|T], X) :-
       longest(T, X),
       !.
    
    
    ?- myrun("abcdf", "bzcdf").
    cdf
    true.
    

相关问题