首页 文章

使用基于幻灯片 Headers 的布尔值从多个powerpoint文件中提取文本

提问于
浏览
1

我试图从多个PowerPoint文件中提取文本到txt文件或excel文件 .

The problem is ,我试图仅提取具有特定文本 Headers 的幻灯片 . 由于我有几个PowerPoint文件,我希望我的导出也可以创建为几个单独的文件 .

我相信可以运行一个可以做到这一点但我不完全确定的宏 . 这实际上是否可行,如果是,那么编码它的最佳方法是什么?

从理论上讲,这应该是一个简单的“IF”声明,但我对VBA并不是很熟悉或熟悉 .

我一直在使用以下代码:(第二组代码)http://www.pptfaq.com/FAQ00274_Export_Text_to_a_text_file-extract_text_from_PowerPoint-Mac_or_PC-.htm

以及这个链接:(也是第二组代码)Extracting all text from a powerpoint file in VBA

第一个链接将文本提取到txt文件中,但不允许我提取具有特定 Headers 幻灯片的文本 . 它似乎也有一行代码标识 Headers 幻灯片,这似乎是有帮助的 . 第二个链接可以导出多个txt文件,但我的输出txt文件是空白的,我不能让它工作 .

我相信这样的东西对于试图在几个PowerPoint中对大量数据进行排序的人来说会很有用 .

如果有人有任何想法会很棒!

FOLLOWUP

根据我们在下面的评论中的讨论,我添加了一个LIKE和一个通配符("*")函数,以便代码返回所有带有幻灯片 Headers 的文本 "Walkthrough:" + (Anything after the word walkthrough). 当我尝试添加like函数时,它只会导致.txt文件显示第一行文字 .

至于超链接 . 它们仍然显示为显示文本 .

Siddharth Rout,感谢您的帮助以及您对我的耐心 .

FOLLOWUP(2)

'~~> Change Slide Title here
Const ppSTitle As String = "Walkthrough"
'~~> Change PPT Source Directory Here
Const sDir As String = "C:\Documents and Settings\r126162\Desktop\test\"

Sub Sample()
    Dim ppPrsn As Presentation
    Dim ppSlide As Slide
    Dim filesize As Integer
    Dim shp As Shape
    Dim vFile
    Dim No As Long

    vFile = Dir(sDir & "*.ppt*")

    No = 1

    Do While vFile <> ""
        Set ppPrsn = Presentations.Open(FileName:=sDir & vFile)

        For Each ppSlide In ppPrsn.Slides
            If InStr(1, ppSlide.Shapes.Title.TextFrame.TextRange.Text, ppSTitle, vbTextCompare) Then
                '~~> Get a free file handle
                filesize = FreeFile()

                '~~> Open your file
                Open vFile & ".txt" For Output As #filesize

                For Each shp In ppSlide.Shapes
                    If shp.HasTextFrame Then
                        If shp.TextFrame.HasText Then
                            '~~> Export Text
                            Print #filesize, shp.TextFrame.TextRange.Text & " " & shp.TextFrame.TextRange.Characters.ActionSettings(ppMouseClick).Hyperlink.Address
                        End If
                    End If
                Next

                Close #filesize

                No = No + 1
                Exit For
            End If
        Next

        ppPrsn.Close
        vFile = Dir
    Loop
    Set ppPrsn = Nothing
End Sub

2 回答

  • 1

    如果定义的字符串可用,则此代码将查看每个形状 .
    如果可用,它会使用Filesystem对象将形状包含的文本写入文本文件 .
    要使用它,您需要引用MS Scripting Runtime库 .
    此外,我还提供了一种循环指定文件夹并检索可用PowerPoint演示文稿的方法 .

    Option Explicit
    
    Sub Get_PPT()
    
    Dim oApp                As PowerPoint.Application
    Dim oPres               As PowerPoint.Presentation
    Dim oSlides             As PowerPoint.Slides
    Dim oSlide              As PowerPoint.Slide
    Dim oShapes             As PowerPoint.Shapes
    Dim oShape              As PowerPoint.Shape
    Dim sFolder             As String
    Dim sFile               As String
    Dim sPath               As String
    Dim sSearch             As String
    Dim sTitle              As String
    Dim iCnt                As Integer
    
    Dim FSO_Ext             As FileSystemObject
    Dim FSO                 As FileSystemObject
    Dim FSOFile             As TextStream
    Dim sFilePath           As String
    Dim iNoOfLoop           As Integer
    Dim sExtension          As String
    
    
    Set oApp = CreateObject("Powerpoint.Application")
    
    sFolder = "U:"
    If sFolder <> "" Then
        If Right(sFolder, 1) <> "\" Then
            sFolder = sFolder & "\"
        End If
        sFile = Dir(sFolder, vbNormal)
        Do While sFile <> ""
            sPath = sFolder & sFile
            Set FSO_Ext = New FileSystemObject
            sExtension = FSO_Ext.GetExtensionName(sPath)
            If sExtension = "ppt" Or sExtension = "pptx" Then
                Set oPres = oApp.Presentations.Open(sPath)
                sSearch = "partner"
                For Each oSlide In oPres.Slides
                    Set oShapes = oSlide.Shapes
                    For Each oShape In oShapes
                        If oShape.HasTextFrame Then
                            Debug.Print sTitle
                            sTitle = oShape.TextFrame.TextRange.Text
    
                            If InStr(UCase(Trim(sTitle)), UCase(Trim(sSearch))) <> 0 Then
                                iCnt = iCnt + 1
                                sFilePath = sPath & iCnt & ".txt"
                                Set FSO = New FileSystemObject
                                Set FSOFile = FSO.OpenTextFile(sFilePath, 2, True)
                                FSOFile.writeline (sTitle)
                                FSOFile.Close
                            End If
                        End If
                    Next oShape
                Next oSlide
                Set oSlides = Nothing
                Set oShapes = Nothing
                oPres.Close
            End If
            Set FSO_Ext = Nothing
        sFile = Dir
        Loop
    End If
    
    oApp.Quit
    
    End Sub
    

    请注意,没有什么能阻止您自定义此代码 .
    想象一下,例如,您想在文本文件中添加更多行(由同一幻灯片中的其他形状包含),您可以通过将'Writeline'放在循环中来使用FSO编写多行:

    For iCnt = 1 To 5 
        FSOFile.WriteLine ("Text at line" & iCnt) 
    Next iCnt
    
  • 1

    第一个链接将文本提取到txt文件中,但不允许我提取具有特定 Headers 幻灯片的文本 .

    这对我有用

    '~~> Change Title here
    Const ppSTitle As String = "Title1"
    '~~> Change File Name here
    Const FlName = "C:\Sample.Txt"
    
    Sub Sample()
        Dim ppPrsn As Presentation
        Dim ppSlide As Slide
        Dim filesize As Integer
        Dim shp As Shape
    
        Set ppPrsn = ActivePresentation
    
        For Each ppSlide In ppPrsn.Slides
            If ppSlide.Shapes.Title.TextFrame.TextRange.Text = ppSTitle Then
    
                '~~> Get a free file handle
                filesize = FreeFile()
    
                '~~> Open your file
                Open FlName For Output As #filesize
    
                For Each shp In ppSlide.Shapes
                    If shp.HasTextFrame Then
                        If shp.TextFrame.HasText Then
                            '~~> Export Text
                            Print #filesize, shp.TextFrame.TextRange.Text
                            Debug.Print
                        End If
                    End If
                Next
    
                Close #filesize
    
                Exit For
            End If
        Next
    End Sub
    

    FOLLOWUP

    这将创建类似 Sample_1.txtSample_2.txtSample_3.txt 等文件 . 根据您的要求进行修改

    '~~> Change Title here
    Const ppSTitle As String = "Title1"
    '~~> Change File Name here
    Const FlName As String = "C:\Sample"
    '~~> Change Directory Here
    Const sDir As String = "C:\Temp\"
    
    Sub Sample()
        Dim ppPrsn As Presentation
        Dim ppSlide As Slide
        Dim filesize As Integer
        Dim shp As Shape
        Dim vFile
        Dim No As Long
    
        vFile = Dir(sDir & "*.ppt*")
    
        No = 1
    
        Do While vFile <> ""
            Set ppPrsn = Presentations.Open(FileName:=sDir & vFile)
    
            For Each ppSlide In ppPrsn.Slides
                If ppSlide.Shapes.Title.TextFrame.TextRange.Text = ppSTitle Then
                    '~~> Get a free file handle
                    filesize = FreeFile()
    
                    '~~> Open your file
                    Open FlName & "_" & No & ".txt" For Output As #filesize
    
                    For Each shp In ppSlide.Shapes
                        If shp.HasTextFrame Then
                            If shp.TextFrame.HasText Then
                                '~~> Export Text
                                Print #filesize, shp.TextFrame.TextRange.Text
                            End If
                        End If
                    Next
    
                    Close #filesize
    
                    No = No + 1
                    Exit For
                End If
            Next
    
            ppPrsn.Close
            vFile = Dir
        Loop
        Set ppPrsn = Nothing
    End Sub
    

相关问题