2016年8月9日星期二

Automation of Adding Bookmark to PDF via Excel VBA


In pharmaceutical industry, SAS programmers are often required to combine all reports into a PDF file which includes bookmark to facilitate review. Most common approach is to create individual PDF files and then bundle them. But there are several limitations. The first one is that RTF code cannot be applied and therefore flexibility of customizing TFL is decreased. The second one is that there are multiple bookmarks for each individual PDF file and this is obviously not what we want.

A better solution – adding bookmarks into PDF file converted from bundled RTF file via Excel VBA – can solve that problem under premise that TOC exists in bundled RTF file and Adobe Acrobat is installed.  The task of adding bookmarks into PDF file can be completed by following 4 steps:  1) go to first page of a specific TFL. 2) Create a new bookmark. 3) Set the title of bookmark as the title of that specific TFL. 4) Repeating above process until all bookmarks are added.  

You can see that the above implies that TFL title and location of first page of each TFL is required.  And that’s the reason why we need the bundled RTF file containing TOC.

At first, I’ll introduce how to retrieve title and page number from TOC via VBA. Below is the whole program and explanations in green. Please note that there are several ways to create TOC and therefore the code (in red) may vary from case to case when splitting each TOC entry. Here is only an example and for your own case, you might have to tune the code.



Sub ExtractTOC_Click()

'Clear content

ThisWorkbook.Worksheets(1).Activate

For i = 1 To ActiveSheet.UsedRange.Rows.Count

   For j = 1 To ActiveSheet.UsedRange.Columns.Count

     Cells(i, j).ClearContents

   Next j

Next i


'Create Object and Open RTF file

Dim wrdApp As Object  

Dim wrdDoc As Object

Dim wrdNam As String    

Set wrdApp = CreateObject("Word.Application")

wrdNam = "D:\Combined RTF.rtf"


If Dir(wrdNam) > " " Then

    Set wrdDoc = GetObject(wrdNam)

    If wrdDoc.TablesOfContents.Count > 0 Then

    'Retrieve TOC

    Dim lastrow As Integer, nRow As Integer, StrTOC As String

    nRow = 0

    With wrdDoc

        For Each TOC In wrdDoc.TablesOfContents

          StrTOC = TOC.Range.Text

          For i = 0 To UBound(Split(StrTOC, vbCr))

            nRow = nRow + 1

            For j = 0 To UBound(Split(Split(StrTOC, vbCr)(i), vbTab))

              n = j + 1

             Cells(nRow, n).Value = Split(Split(StrTOC, vbCr)(i), vbTab)(j)

            Next

          Next

        Next

    End With

    'Manipulate to keep only title and page number

    lastrow = ActiveSheet.UsedRange.Rows.Count

    For i = 1 To lastrow

       Cells(i, 1) = Cells(i, 1) & " " & Cells(i, 2)

       Cells(i, 2) = Cells(i, 3)

       Cells(i, 3) = ""

    Next i

   End If

    wrdDoc.Close

End If

wrdApp.Quit

End Sub


Before giving the code related adding bookmarks, I’d like to let you know that we need to make sure that VBA knows about the Acrobat objects. First of all, Select the “Tools > Reference” menu item on the VBA dialog. And then select Acrobat as below shows on the dialog that pops up.

 

And finally come the last part – full program for how to add bookmarks.



Sub OpenAcrobat()

   'Create Object

    Dim AcroApp As Acrobat.CAcroApp

    Dim PDoc As Acrobat.CAcroPDDoc

    Dim ADoc As AcroAVDoc

    Dim PDBookmark As AcroPDBookmark

    Dim PDFPageView As AcroAVPageView

  

    Set AcroApp = CreateObject("AcroExch.App")

    Set PDoc = CreateObject("AcroExch.PDDoc")

    Set ADoc = CreateObject("AcroExch.AVDoc")

    Set PDBookmark = CreateObject("AcroExch.PDBookmark", "")

 

    PDoc.Open ("D:\Combined RTF.pdf")

    Set ADoc = PDoc.OpenAVDoc("D:\Combined RTF.pdf")

    Set PDFPageView = ADoc.GetAVPageView()


    'Loop through all TFL titles and page numbers that we just retrieved  

For i = 1 To ThisWorkbook.Sheets(2).UsedRange.Rows.Count

       'Create a variable to contain page number

        pg = ThisWorkbook.Sheets(2).Cells(i, 2)

       'Create a variable to contain TFL title

        bm = ThisWorkbook.Sheets(2).Cells(i, 1)

   

       'Set the focus to adobe acrobat pro

        AppActivate "Adobe Acrobat Pro"

     

       'Go to specific page – location of first page of a specific TFL  

        Call PDFPageView.GoTo(pg)

       'Create a new bookmark

        AcroApp.MenuItemExecute ("NewBookmark")

        'Give new bookmark a title as “Untitled”

        btitle = PDBookmark.GetByTitle(PDoc, "Untitled")

        'Set title of bookmark as the title of TFL

        btitle = PDBookmark.SetTitle(bm)


       'Set the focus to Microsoft Excel to continue next iteration in the loop

       AppActivate "Microsoft Excel"

    Next i

    

    'Save PDF

    n = PDoc.Save(PDSaveFull, "D:\Combined RTF.pdf")

    PDoc.Close


    AcroApp.Exit

    Set AcroApp = Nothing

    Set PDoc = Nothing

    Set ADoc = Nothing

End Sub

4 条评论: