
VSTO 2007: How do I determine the page and paragraph number of a range?

2

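I am building an MS Word add-in that has to collect all comment balloons from a document and summarize them in a list. My result will be a list of ReviewItem objects, each containing the comment itself, the paragraph number, and the page number on which the commented text is located.

Part of my code looks like this: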

private static List<ReviewItem> FindComments()
    {
        List<ReviewItem> result = new List<ReviewItem>();
        foreach (Comment c in WorkingDoc.Comments)
        {
            ReviewItem item = new ReviewItem()
            {
                Remark = c.Reference.Text,
                Paragraph = c.Scope. ???, // How to determine the paragraph number?
                Page = c.Scope. ??? // How to determine the page number?
            };
            result.Add(item);
        }
        return result;
   }

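The Scope property of the Comment class points to the actual text in the document that the comment is about, and its type is Microsoft.Office.Interop.Word.Range. I cannot work out how to determine which page and which paragraph that range is located in.

By paragraph number I actually mean the "numbered list" number of the paragraph, such as "2.3" or "1.3.2".

Any suggestions? Thanks!

3 Answers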

  • 11

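    With the help of the answer Mike Regan gave me (thanks again, Mike), I managed to work out a solution that I want to share here. Maybe it also clarifies my goal. In terms of performance, this may not be the fastest or most efficient solution. Feel free to suggest improvements.

    The result of my code is a list of ReviewItem objects that is processed elsewhere. Without further ado, here is the code: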

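    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;
    using System.Windows.Forms;
    using Word = Microsoft.Office.Interop.Word;

    /// <summary>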
    /// Worker class that collects comments from a Word document and exports them as ReviewItems
    /// </summary>
    internal class ReviewItemCollector
    {
        /// <summary>
        /// Working document
        /// </summary>
        private Word.Document WorkingDoc; // assigned when the file is opened in GetReviewResults
    
        /// <summary>
        /// Extracts the review results from a Word document
        /// </summary>
        /// <param name="fileName">Fully qualified path of the file to be evaluated</param>
        /// <returns></returns>
        public ReviewResult GetReviewResults(string fileName)
        {
            Word.Application wordApp = null;
            List<ReviewItem> reviewItems = new List<ReviewItem>();
    
            object missing = System.Reflection.Missing.Value;
    
            try
            {
                // Fire up Word
                wordApp = new Word.ApplicationClass();
    
                // Some object variables because the Word API requires this
                object fileNameForWord = fileName;
                object readOnly = true;
    
                WorkingDoc = wordApp.Documents.Open(ref fileNameForWord,
                    ref missing, ref readOnly,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing);
    
                // Gather all paragraphs that are chapter headers, sorted by their start position
                var headers = (from Word.Paragraph p in WorkingDoc.Paragraphs
                               where IsHeading(p)
                               select new Heading()
                               {
                                   Text = GetHeading(p),
                                   Start = p.Range.Start
                               }).ToList().OrderBy(h => h.Start);
    
                reviewItems.AddRange(FindComments(headers));
    
                // I will be doing similar things with Revisions in the document
            }
            catch (Exception x)
            {
                MessageBox.Show(x.ToString(), 
                    "Error while collecting review items", 
                    MessageBoxButtons.OK, 
                    MessageBoxIcon.Error);
            }
            finally
            {
                if (wordApp != null)
                {
                    object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;
                    wordApp.Quit(ref doNotSave, ref missing, ref missing);
                }
            }
            ReviewResult result = new ReviewResult();
            result.Items = reviewItems.OrderBy(i => i.Position);
            return result;
        }
    
        /// <summary>
        /// Finds all comments in the document and converts them to review items
        /// </summary>
        /// <returns>List of ReviewItems generated from comments</returns>
        private List<ReviewItem> FindComments(IOrderedEnumerable<Heading> headers)
        {
            List<ReviewItem> result = new List<ReviewItem>();
    
            // Generate ReviewItems from the comments in the documents
            var reviewItems = from Word.Comment c in WorkingDoc.Comments
                              select new ReviewItem()
                              {
                                  Position = c.Scope.Start,
                                  Page = GetPageNumberOfRange(c.Scope),
                                  Paragraph = GetHeaderForRange(headers, c.Scope),
                                  Description = c.Range.Text,
                                  ItemType = DetermineCommentType(c)
                              };
    
            return reviewItems.ToList();
        }
    
        /// <summary>
        /// Brute force translation of comment type based on the contents...
        /// </summary>
        /// <param name="c"></param>
        /// <returns></returns>
        private static string DetermineCommentType(Word.Comment c)
        {
            // This code is very specific to my solution, might be made more flexible/configurable
            // For now, this works :-)
    
            string text = c.Range.Text.ToLower();
    
            if (text.EndsWith("?"))
            {
                return "Vraag";
            }
            if (text.Contains("spelling") || text.Contains("spelfout"))
            {
                return "Spelling";
            }
            if (text.Contains("typfout") || text.Contains("typefout"))
            {
                return "Typefout";
            }
            if (text.Contains("omissie")) // text is already lower-cased above
            {
                return "Omissie";
            }
    
            return "Opmerking";
        }
    
        /// <summary>
        /// Determine the last header before the given range's start position. That would be the chapter the range is part of.
        /// </summary>
        /// <param name="headings">List of headings as identified in the document.</param>
        /// <param name="range">The current range</param>
        /// <returns></returns>
        private static string GetHeaderForRange(IEnumerable<Heading> headings, Word.Range range)
        {
            var found = (from h in headings
                         where h.Start <= range.Start
                         select h).LastOrDefault();
    
            if (found != null)
            {
                return found.Text;
            }
            return "Unknown";
        }
    
        /// <summary>
        /// Identifies whether a paragraph is a heading, based on its styling.
        /// Note: the documents we're reviewing are always in a certain format, we can assume that headers
        /// have a style named "Heading..." or "Kop..."
        /// </summary>
        /// <param name="paragraph">The paragraph to be evaluated.</param>
        /// <returns></returns>
        private static bool IsHeading(Word.Paragraph paragraph)
        {
            Word.Style style = paragraph.get_Style() as Word.Style;
            // Parenthesize the name checks so a null style cannot reach NameLocal
            return style != null && (style.NameLocal.StartsWith("Heading") || style.NameLocal.StartsWith("Kop"));
        }
    
        /// <summary>
        /// Translates a paragraph into the form we want to see: preferably the chapter/paragraph number, otherwise the
        /// title itself will do.
        /// </summary>
        /// <param name="paragraph">The paragraph to be translated</param>
        /// <returns></returns>
        private static string GetHeading(Word.Paragraph paragraph)
        {
            string heading = "";
    
            // Try to get the list number, otherwise just take the entire heading text
            heading = paragraph.Range.ListFormat.ListString;
            if (string.IsNullOrEmpty(heading))
            {
                heading = paragraph.Range.Text;
                heading = Regex.Replace(heading, "\\s+$", "");
            }
            return heading;
        }
    
        /// <summary>
        /// Determines the pagenumber of a range.
        /// </summary>
        /// <param name="range">The range to be located.</param>
        /// <returns></returns>
        private static int GetPageNumberOfRange(Word.Range range)
        {
            return (int)range.get_Information(Word.WdInformation.wdActiveEndPageNumber);
        }
    }
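
    Since performance is mentioned above as a possible concern: GetHeaderForRange scans the whole heading list once per comment. A minimal sketch of one alternative follows, assuming the heading start positions and texts have first been copied into two parallel lists sorted by start position. The names HeadingLookup and FindHeading are hypothetical and not part of the code above; whether this is worth doing depends on how many headings and comments a document has.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical helper, not part of the original answer.
    static class HeadingLookup
    {
        // starts: heading start positions, sorted ascending.
        // texts:  heading texts, in the same order as starts.
        // Returns the text of the last heading that starts at or before position.
        public static string FindHeading(List<int> starts, List<string> texts, int position)
        {
            int index = starts.BinarySearch(position);
            if (index < 0)
            {
                // No exact match: ~index is the index of the first start position
                // greater than position, so the heading we want sits just before it.
                index = ~index - 1;
            }
            // index is -1 when position lies before the first heading.
            return index >= 0 ? texts[index] : "Unknown";
        }
    }
    ```

    This keeps the same "last heading at or before the range start" rule as GetHeaderForRange, and returns "Unknown" in the same case, but finds the heading with a binary search instead of a full scan.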
    
  • 1

    Try this for the page number:

    Page = (int)c.Scope.get_Information(Word.WdInformation.wdActiveEndPageNumber);
    

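    This should give you the page number at the end of the range. If you want the page at the start instead, collapse a copy of the range to its start first (Collapse changes the range in place and returns nothing):

    Word.Range rng = c.Scope.Duplicate;
    object direction = Word.WdCollapseDirection.wdCollapseStart;
    rng.Collapse(ref direction);
    Page = (int)rng.get_Information(Word.WdInformation.wdActiveEndPageNumber);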
    

    For the paragraph number, look at the following:

    c.Scope.Paragraphs; //Returns a paragraphs collection
    

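    My guess is to take the first Paragraph object in the collection returned above, get a new range from the end of that paragraph back to the start of the document, and read this integer value: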

    [range].Paragraphs.Count; //Returns int
    

    That should give the exact paragraph number of the start of the comment's range.
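
    The counting idea above could be sketched roughly as follows. This is untested against a real document; the class and method names (ParagraphLocator, GetParagraphIndex, GetListNumber) are hypothetical, and it assumes the range lies in the main text of the document. The second helper reads the numbered-list string (such as "2.3") of the first paragraph in the range, which may be closer to what the question asks for; it will be empty for paragraphs that are not numbered.

    ```csharp
    using Word = Microsoft.Office.Interop.Word;

    // Hypothetical helpers, shown only to illustrate the idea described above.
    static class ParagraphLocator
    {
        // 1-based index of the paragraph in which the given range starts,
        // counted from the beginning of the document.
        public static int GetParagraphIndex(Word.Range scope)
        {
            object start = 0;
            object end = scope.Paragraphs[1].Range.End;

            // Range from the start of the document to the end of the first
            // paragraph touched by the scope.
            Word.Range fromDocStart = scope.Document.Range(ref start, ref end);
            return fromDocStart.Paragraphs.Count;
        }

        // Numbered-list string of the first paragraph in the range,
        // or an empty string if that paragraph has no list numbering.
        public static string GetListNumber(Word.Range scope)
        {
            return scope.Paragraphs[1].Range.ListFormat.ListString;
        }
    }
    ```

    Counting paragraphs this way gives a position in the document, not a heading number, so the two helpers answer slightly different questions.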

  • 7

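    I think there is an easier way. You can get it from the Range object itself. Range.get_Information gives you the page number, line number, and so on, except that you need to know how many pages or lines the range spans. That is the catch: a range does not have to fit on a single page.

    So you can take the start and end points of the range and compute the page number, line number, and so on for each. This should do: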

    public static void GetStartAndEndPageNumbers(Word.Range range, out int startPageNo,
                                                 out int endPageNo)
    {
        Word.Range rngStart;
        Word.Range rngEnd;
        GetStartAndEndRange(range, out rngStart, out rngEnd);
    
        startPageNo = GetPageNumber(rngStart);
        endPageNo = rngEnd != null ? GetPageNumber(rngEnd) : startPageNo;
    }
    
    static void GetStartAndEndRange(Word.Range range, out Word.Range rngStart,
                                    out Word.Range rngEnd)
    {
        object posStart = range.Start, posEnd = range.End;
    
        rngStart = range.Document.Range(ref posStart, ref posStart);
    
        try
        {
            rngEnd = range.Document.Range(ref posEnd, ref posEnd);
        }
        catch
        {
            rngEnd = null;
        }
    }
    
    static int GetPageNumber(Word.Range range)
    {
        return (int)range.get_Information(Word.WdInformation.wdActiveEndPageNumber);
    }
    

    You can do the same for line numbers:

    public static void GetStartAndEndLineNumbers(Word.Range range, out int startLineNo,
                                                  out int endLineNo)
    {
        Word.Range rngStart;
        Word.Range rngEnd;
        GetStartAndEndRange(range, out rngStart, out rngEnd);
    
        startLineNo = GetLineNumber(rngStart);
        endLineNo = rngEnd != null ? GetLineNumber(rngEnd) : startLineNo;
    }
    
    static int GetLineNumber(Word.Range range)
    {
        return (int)range.get_Information(Word.WdInformation.wdFirstCharacterLineNumber);
    }
    
